How Smartsheet reduced latency and optimized costs in their serverless architecture

Post Syndicated from Anton Aleksandrov original https://aws.amazon.com/blogs/architecture/how-smartsheet-reduced-latency-and-optimized-costs-in-their-serverless-architecture/

Cloud software as a service (SaaS) companies are often looking for ways to enhance their architectures for performance and cost-efficiency. Serverless technologies offload infrastructure management, allowing development teams to focus on innovation and delivering business value. As application architectures grow and face more demanding requirements, continued optimization helps maximize both the technical and financial advantages of the serverless approach.

In this post, we discuss Smartsheet’s journey optimizing its serverless architecture. We explore the solution, the stringent requirements Smartsheet faced, and how they achieved a latency reduction of over 80%. This technical journey offers valuable insights for organizations looking to enhance their serverless architectures with proven enterprise-grade optimization techniques.

Solution overview

Smartsheet is a leading cloud-based enterprise work management platform, enabling millions of users worldwide to plan, manage, track, automate, and report on work at scale. At the core of the platform lies an event-driven architecture that processes real-time user activity across various document types. Given the collaborative nature of the platform, multiple users can work on these documents concurrently. Every document interaction triggers a series of events that must be processed with minimal latency to maintain data consistency and provide immediate feedback. Processing delays can impact user experience and productivity, making consistently low latency a fundamental business requirement.

Smartsheet’s traffic pattern is spiky during business hours and mostly dormant during nights and weekends. Within peak periods, traffic can fluctuate as users collaborate in real time. To efficiently manage dynamic workloads, which can surge from hundreds to tens of thousands of events per second within minutes, Smartsheet implements a serverless event processing architecture using services such as Amazon Simple Queue Service (Amazon SQS) and AWS Lambda. This architecture relies on the elasticity of serverless services, which scale automatically with traffic volume. As a result, Smartsheet can efficiently handle sudden traffic surges while automatically scaling down during off-peak hours, optimizing for both performance and cost-efficiency.

The following diagram illustrates the high-level architecture of the Smartsheet event processing pipeline.

high-level architecture of the Smartsheet event processing pipeline

Optimization opportunity

Smartsheet uses Lambda functions to serve both batch jobs and API requests. The primary runtime used for building those functions is Java. Lambda automatically scales the number of execution environments allocated to your function on demand to accommodate traffic volume. When Lambda receives an incoming request, it attempts to serve it with an existing execution environment first. If no execution environments are available, the service initializes a new one. During initialization, Smartsheet’s function code commonly sends several requests to external dependencies, such as databases and REST APIs, which might take time to reply.

The following diagram illustrates how Lambda functions reach out to external dependencies during initialization.

Lambda functions reach out to external dependencies during initialization

These tasks introduced execution environment initialization latency, commonly referred to as a cold start. Although cold starts typically affect less than 1% of requests, Smartsheet had stringent low latency requirements for their architecture to further prioritize the best possible end-user experience.

“To reduce customer request latency while keeping costs low, our engineering team utilized Lambda provisioned concurrency with auto scaling and Graviton, which resulted in an 83% reduction in P95 latency while providing a high quality of service as we continue to scale our platform and its limits,” says Abhishek Gurunathan, Sr Director of Engineering at Smartsheet.

Addressing the cold start with provisioned concurrency

To reduce cold start latency, the Smartsheet team adopted provisioned concurrency in their architecture, a capability that allows developers to specify the number of execution environments that Lambda should keep warm to instantly handle invocations. The following diagram illustrates the difference. Without provisioned concurrency, execution environments are created on demand, which means some invocations (typically less than 1%) need to wait for the execution environment to be created and initialization code to be run. With provisioned concurrency, Lambda creates execution environments and runs initialization code preemptively, making sure invocations are served by warm execution environments.

invocations are served by warm execution environments

Provisioned concurrency includes a dynamic spillover mechanism, making your serverless architecture highly resilient to traffic spikes. When incoming traffic exceeds the preconfigured provisioned concurrency, additional requests are automatically served by on-demand concurrency rather than being throttled. This provides seamless scalability and maintains service availability even during traffic surges, while still providing the performance benefits of pre-warmed execution environments for the majority of requests.

The Smartsheet team configured provisioned concurrency to match their historical P95 concurrency needs. This resulted in immediate improvements—the number of cold starts dropped dramatically and P95 invocation latency dropped by 83%. As the team monitored system performance, they quickly identified another architecture optimization opportunity—the Lambda functions were heavily used during work hours but had significantly fewer invocations at night and on weekends, as illustrated in the following graph.

Lambda functions were heavily used during work hours but had significantly fewer invocations at night and on weekends

Setting a static provisioned concurrency configuration worked great for busy periods, but was underutilized during off-times. The Smartsheet team wanted to further fine-tune their architecture and increase provisioned concurrency utilization rates to achieve higher cost-efficiency. This led them to look into provisioned concurrency auto scaling to match traffic patterns as well as adopting an AWS Graviton architecture.
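A static configuration of this kind can be sketched in Terraform roughly as follows. This is a minimal illustration, not Smartsheet’s actual code: the function name, the alias, and the value of 50 are hypothetical placeholders, and in practice the number would come from the function’s historical P95 concurrency.

```hcl
# Minimal sketch, assuming a function "example-function" with an alias "live"
# already exists (both names are hypothetical).
# Provisioned concurrency is configured on a published version or an alias,
# not on $LATEST.
resource "aws_lambda_provisioned_concurrency_config" "example" {
  function_name = "example-function"
  qualifier     = "live"

  # Hypothetical value standing in for the historical P95 concurrency.
  provisioned_concurrent_executions = 50
}
```

Because the value is fixed, the same number of execution environments stays warm at night and on weekends, which is the underutilization described above.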

Auto scaling provisioned concurrency and Graviton architecture

Two common approaches to enable provisioned concurrency are setting a static value and using auto scaling. With static configuration, you specify a fixed number of pre-initialized execution environments that remain continuously warm to serve invocations. This approach is highly effective for architectures that handle predictable traffic patterns. Unpredictable traffic patterns, however, can lead to under-provisioning during peak periods (with spillover to on-demand concurrency resulting in more cold starts) or underutilization during low-usage periods. To address that, provisioned concurrency with auto scaling dynamically adjusts the configuration based on utilization metrics, automatically scaling the number of execution environments up or down to match the actual demand. This dynamic approach optimizes for cost-efficiency and is particularly recommended for architectures with fluctuating traffic patterns.

The following figure compares static and dynamic provisioned concurrency.

static and dynamic provisioned concurrency

To further optimize the architecture for cost-efficiency, the Smartsheet team has implemented provisioned concurrency auto scaling based on utilization metrics. Smartsheet used an infrastructure as code (IaC) approach with Terraform to define auto scaling policies for maximum reusability across hundreds of functions. The policies track the LambdaProvisionedConcurrencyUtilization metric and define the scaling threshold according to the function purpose. For functions implementing interactive APIs, the auto scale threshold is 60% utilization to pre-provision execution environments early, keeping latency extra-low, and making functions more resilient towards traffic surges. For functions that implement asynchronous data processing, Smartsheet’s goal was to achieve the highest utilization rate and cost-efficiency, so they’ve defined the auto scale threshold at 90%.

The following diagram illustrates the architecture of auto scaling policies based on provisioned concurrency utilization rate and workload type.

auto scaling policies based on provisioned concurrency utilization rate and workload type
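The policies described above could be expressed in Terraform along the following lines. This is a hedged sketch rather than Smartsheet’s actual module: the function names, aliases, and capacity bounds are hypothetical, and only the LambdaProvisionedConcurrencyUtilization metric and the 60% and 90% targets come from the text. The metric is reported as a fraction, so 60% is written as 0.6.

```hcl
locals {
  # Hypothetical functions. Targets follow the thresholds described above:
  # 0.6 for interactive APIs, 0.9 for asynchronous data processing.
  functions = {
    "example-api"    = { alias = "live", target = 0.6, min = 5, max = 100 }
    "example-worker" = { alias = "live", target = 0.9, min = 1, max = 50 }
  }
}

# Register each function alias as a scalable target with Application Auto Scaling.
resource "aws_appautoscaling_target" "pc" {
  for_each           = local.functions
  service_namespace  = "lambda"
  resource_id        = "function:${each.key}:${each.value.alias}"
  scalable_dimension = "lambda:function:ProvisionedConcurrency"
  min_capacity       = each.value.min
  max_capacity       = each.value.max
}

# Target tracking policy: keep provisioned concurrency utilization near the target.
resource "aws_appautoscaling_policy" "pc" {
  for_each           = aws_appautoscaling_target.pc
  name               = "${each.key}-pc-utilization"
  policy_type        = "TargetTrackingScaling"
  service_namespace  = each.value.service_namespace
  resource_id        = each.value.resource_id
  scalable_dimension = each.value.scalable_dimension

  target_tracking_scaling_policy_configuration {
    target_value = local.functions[each.key].target

    predefined_metric_specification {
      predefined_metric_type = "LambdaProvisionedConcurrencyUtilization"
    }
  }
}
```

Driving both resources from one map is one way to reuse the same policy definition across many functions; adding a function then means adding one entry with the target that fits its workload type.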

Another optimization technique Smartsheet employed was switching the CPU architecture used by their Lambda functions from x86_64 to arm64 Graviton. To achieve this, Smartsheet adopted the ARM versions of the Lambda layers they used, such as the Datadog and Lambda Insights extensions. This was required because binaries built for one architecture might be incompatible with a different one. Because Smartsheet functions were implemented with Java and packaged as JAR files, they didn’t have any compatibility issues when moving to Graviton. With Terraform used for codifying the infrastructure, this architecture switch was a simple property change in aws_lambda_function resources, as illustrated in the following code:

property change in aws_lambda_function resources
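A minimal sketch of that property change might look as follows. The function name, role ARN, handler, runtime version, and JAR path are hypothetical placeholders; the architectures argument is the only part that matters here.

```hcl
# Minimal sketch with hypothetical names, assuming a Java function packaged as a JAR.
resource "aws_lambda_function" "example" {
  function_name = "example-function"
  role          = "arn:aws:iam::123456789012:role/example-lambda-role"
  runtime       = "java21"
  handler       = "com.example.Handler::handleRequest"
  filename      = "build/libs/example.jar"
  memory_size   = 1024

  # Previously ["x86_64"], which is also the default when the argument is omitted.
  # Setting "arm64" runs the function on Graviton.
  architectures = ["arm64"]
}
```

Any layers attached to the function would need to reference their arm64 builds as well, for the compatibility reason given above.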

By switching to a Graviton architecture, Smartsheet saved 20% on function GB-second costs. See AWS Lambda pricing for details.

Best practices

Use the following techniques and best practices to optimize your serverless architectures, reduce cold starts, and increase cost-efficiency:

  • Fine-tune your Lambda functions to find the optimal balance between cost and performance. Increasing memory allocation also adds CPU capacity, which often means faster execution and can lead to reduced overall costs.
  • Use a Graviton2 architecture for compatible workloads to benefit from a better price-performance ratio. Depending on the workload type, switching to Graviton can yield up to 34% better price performance.
  • Use provisioned concurrency and Lambda SnapStart to reduce cold starts in your serverless architectures. Start with static provisioned concurrency based on your historical concurrency requirements, monitor utilization, and introduce auto scaling into your architecture to achieve the optimal cost-performance profile.

Conclusion

Serverless architectures using services like Lambda and Amazon SQS offload the infrastructure management and scaling concerns to AWS, allowing teams to focus on innovation and delivering business value. As Smartsheet’s journey demonstrates, using provisioned concurrency and Graviton in your architectures can help significantly improve user experience by reducing latencies while also achieving better cost-efficiency, providing a practical blueprint for optimization across the organization. Whether you’re running large-scale enterprise applications or building new cloud solutions, these proven techniques can help you unlock similar performance gains and cost-efficiencies in your serverless architectures.

To learn more about serverless architectures, see Serverless Land.



EU OS: A European Proposal for a Public Sector Linux Desktop (The New Stack)

Post Syndicated from corbet original https://lwn.net/Articles/1018058/

The New Stack looks at EU OS, an attempt to create a desktop system for the
European public sector.

EU OS is not a brand-new Linux distribution in the traditional
sense. Instead, it is a proof-of-concept built atop Fedora’s
immutable KDE Plasma spin (Kinoite). EU OS takes a layered approach
to customization. The project’s vision is to provide a standard,
adaptable Linux base that can be extended with national, regional
or sector-specific customizations, making it suitable for a wide
range of European public sector needs.

Metasploit Wrap-Up 04/18/2025

Post Syndicated from Christophe De La Fuente original https://blog.rapid7.com/2025/04/18/metasploit-wrap-up-04-18-2025/

Smaller Fetch Payloads


This week, a significant enhancement was made to the already awesome fetch payload feature by our very own bwatters-r7. The improvement introduces a new option, PIPE_FETCH, which optimizes the process by serving both the payload and the command to be executed simultaneously.

This enhancement directly addresses the challenge of limited space by significantly reducing the size of the command that needs to be run. The PIPE_FETCH option works by initially generating a small command. When this compact command is executed, it fetches the actual, larger command that needs to be run. The fetched command is then directly piped into the shell, streamlining the execution process and making it feasible to use fetch payloads in scenarios where space constraints were previously a limitation.

New module content (2)

BentoML RCE

Authors: Takahiro Yokoyama and c2an1
Type: Exploit
Pull request: #20041 contributed by Takahiro-Yoko
Path: linux/http/bentoml_rce_cve_2025_27520
AttackerKB reference: CVE-2025-27520

Description: This adds a module for an unauthenticated remote code execution in BentoML (CVE-2025-27520).

Langflow AI RCE

Authors: Naveen Sunkavally (Horizon3.ai) and Takahiro Yokoyama
Type: Exploit
Pull request: #20022 contributed by Takahiro-Yoko
Path: multi/http/langflow_unauth_rce_cve_2025_3248
AttackerKB reference: CVE-2025-3248

Description: This adds a module for CVE-2025-3248, an unauthenticated RCE vulnerability that affects Langflow versions prior to 1.3.0.

Enhancements and features (4)

  • #19982 from jvoisin – Updates the Linux enum_protections module to use proper names instead of executable names and add a file-based detection method.
  • #20031 from bcoles – Adds metadata and improves the code quality of multiple FreeBSD exploit modules.
  • #20032 from bcoles – Improves the code quality of multiple nops modules.
  • #20035 from bcoles – Enhances the code quality of multiple encoder modules.

Bugs fixed (3)

  • #20005 from fabpiaf – Fixes a LoadError when loading sqlite3 modules in Metasploit’s Docker support.
  • #20036 from bcoles – Fixes an issue with the exploit/windows/local/unquoted_service_path module, which previously reported a successful file upload regardless of whether the upload had actually succeeded.
  • #20043 from adfoster-r7 – Updates the error handling of the Open WAN-to-LAN proxy on AT&T routers module when an older Python version is detected.

Documentation

You can find the latest Metasploit documentation on our docsite at docs.metasploit.com.

Get it

As always, you can update to the latest Metasploit Framework with msfupdate, and you can get more details on the changes since the last blog post from GitHub.

If you are a git user, you can clone the Metasploit Framework repo (master branch) for the latest. To install fresh without using git, you can use the open-source-only Nightly Installers or the commercial edition Metasploit Pro.

[$] The problem of unnecessary readahead

Post Syndicated from corbet original https://lwn.net/Articles/1016860/

The final session in the memory-management track of the 2025 Linux Storage,
Filesystem, Memory-Management, and BPF Summit was a brief, last-minute
addition run by Kalesh Singh. The kernel’s readahead mechanism is
generally good for performance; it ensures that data is present by the time
an application gets around to asking for it. Sometimes, though, readahead
can go a little too far.

[$] Tracepoints for the VFS?

Post Syndicated from jake original https://lwn.net/Articles/1017573/

Adding tracepoints to some kernel subsystems has been controversial (or
disallowed) due to concerns about the user-space ABI that they might
create. The virtual filesystem (VFS) layer has
long been one of the subsystems that has not allowed any tracepoints, but
that may be changing. At the 2025 Linux Storage, Filesystem, Memory
Management, and BPF Summit (LSFMM+BPF), Ted Ts’o led a discussion about
whether the ABI concerns are outweighed by the utility of tracepoints for
the VFS.

Security updates for Friday

Post Syndicated from daroc original https://lwn.net/Articles/1018020/

Security updates have been issued by Debian (graphicsmagick and libapache2-mod-auth-openidc), Fedora (giflib, mod_auth_openidc, mysql8.0, perl, perl-Devel-Cover, perl-PAR-Packer, perl-String-Compare-ConstantTime, rust-openssl, rust-openssl-sys, trunk, and workrave), Mageia (chromium-browser-stable and rust), Oracle (java-1.8.0-openjdk, java-17-openjdk, java-21-openjdk, kernel, libreoffice, and webkit2gtk3), Red Hat (gvisor-tap-vsock), SUSE (containerd, docker, docker-stable, forgejo, GraphicsMagick, libmozjs-115-0, perl-32bit, poppler, subfinder, and thunderbird), and Ubuntu (erlang and ruby2.3, ruby2.5).

Tu-tuuu! And Trump Appeared to Them

Post Syndicated from Emilia Milcheva original https://www.toest.bg/tu-tuuu-i-trump-im-se-pokaza/


On Holy Wednesday, when Jesus Christ places the forgiveness of sinners above that of the righteous, GERB leader Boyko Borisov vowed that Peevski and Goranov would come off the list of persons sanctioned under the Global Magnitsky Act. Does he have some information, or is he taking part in the process of whitewashing his political ally and his former minister? In any case, they look clean in the Bulgarian way: overgrown with suspicions and incriminating facts, but without charges or convictions.

As of November 2024, the United States had imposed sanctions under the Global Magnitsky Act on 245 foreign individuals and 310 legal entities from more than 50 countries around the world. There are eight Bulgarians on that list, an impressive number for a country of 6 million, all of them emblematic figures of the kind of system the law itself targets.

They are:

  • oligarch Delyan Peevski;
  • former finance minister Vladislav Goranov;
  • gambling boss Vasil Bozhkov;
  • Ilko Zhelyazkov, known as Dogan’s Lieutenant;
  • Nikolay Malinov of the Russophiles National Movement;
  • former Kozloduy Nuclear Power Plant chiefs Aleksandar Nikolov and Ivan Genov;
  • former energy minister Rumen Ovcharov.

None of them has been convicted in Bulgaria for the corrupt conduct established and described in the reasoning Washington gave when imposing the sanctions (under President Joe Biden’s administration).

When Goranov was added to the list in 2023, Borisov hastened to declare that GERB was distancing itself from him; that the party had not received “a single stotinka” from gambling boss Vasil Bozhkov (who stated that he had paid regularly every month, while the “loophole” overlooked by the Ministry of Finance cost the budget 600 million leva; author’s note); and that he believed Goranov was not guilty.

Now Borisov has begun speaking openly about “forgiveness,” in tune with Hungarian Prime Minister Viktor Orbán, known as one of the few European leaders close to US President Donald Trump. Not moral forgiveness but political forgiveness: through personal contacts (especially with Trump!), lobbying, and pressure on external actors. The process did not start today. It is known that Peevski has hired law firms and lobbyists to do everything possible to get off the list and to be able to operate with his assets. So has Goranov, who after a period of hiding from public view is now a frequent commentator on various topics in the mainstream media.

The sanctions hit their money and their business opportunities, since access to the global financial system is nearly impossible. Even if they use offshore companies or trusts, those too fall under sanctions once it is established that they are controlled by persons on the Magnitsky list or act on their behalf.

All their assets on US territory or under the control of US persons (including banks) are frozen, and they cannot dispose of them: they cannot transfer, sell, or use them in any way.

US citizens, companies, and financial institutions are not allowed to enter into financial or commercial relations with those sanctioned under Magnitsky or with companies connected to them. Any attempts to circumvent the sanctions, through intermediaries or concealed structures, are also punishable.

Many international banks and companies, including ones outside the US, avoid doing business with sanctioned persons for fear of violating US law or falling under secondary sanctions.

A renaissance for kleptocratic systems

And so, from an instrument for fighting corruption and for naming it, the Magnitsky Act is turning into a bargaining chip and a tool that strengthens the power of the sanctioned. If it comes to that at all, Trump, who does politics the way he does business, would take a pragmatic approach, seeking strategic concessions from Bulgaria. Involvement in the deal for the sale of Lukoil Neftohim, more arms deals, a data center, ending dependence on Russian energy sources, and support for American energy projects in the region…

The US Treasury Department’s Office of Foreign Assets Control (OFAC) has the authority to add persons to or remove them from the sanctions list. But the political decision to impose or lift sanctions always comes from the highest level: the president.

The recently lifted sanctions against Hungarian politician Antal Rogán, who is close to Viktor Orbán and heads his cabinet office, are undoubtedly the visible part of a much broader behind-the-scenes deal between Washington and Budapest. When the Biden administration blacklisted him, it described him as a key figure in a “kleptocratic ecosystem” that undermines democratic institutions in Hungary. The US Treasury Department accused Rogán of using his official position to enrich himself and his political allies by steering public contracts to favored businessmen.

Now US Secretary of State Marco Rubio has stated that keeping Antal Rogán on the list would be inconsistent with US foreign policy interests.

The GERB leader, too, is inspired by these “pardons” for kleptocratic systems. It went so far that he even denied what is written in black and white in the sanctions designation, claiming that Peevski had not been sanctioned for corruption.

He [Delyan Peevski; author’s note] was not sanctioned for corruption. The time will come for the truth to come out about him and about Goranov, as it came out about Hungary.

Does he see Peevski and DPS-New Beginning as a long-term partner? Getting used to Peevski in power is becoming a fact, helped along by all the tales of his omnipotence and influence in every sphere: the judiciary, the security services, construction, sports. He looks like Bulgaria’s uncrowned Orbán.

Who is following in Orbán’s footsteps?

And Borisov is not merely trying to launder Peevski and Goranov; he is rewriting the narrative about corruption, from the state’s failure to punish it to a deal between power and influence. Orbán advertised his own deal with a video from Mar-a-Lago, where he met Trump eight months before the presidential election and was complimented as a “fantastic leader.” Probably because he has turned Hungary into a party-state subordinated to his autocratic regime.

With or without sanctions on Rogán, no one is deceived about the truth regarding Hungary. Thousands of Hungarians protested, demanding the resignations of Orbán and Chief Prosecutor Péter Polt over the Schadl-Völner scandal. It erupted after a former party colleague of Orbán’s, Péter Magyar, published a recording of a conversation with Justice Minister Judit Varga (from the time when the two were married). In the conversation, Varga describes how aides to the head of Orbán’s cabinet office (Antal Rogán) wanted to remove certain parts of documents in a bribery case.

In an attempt to draw closer to Trump’s European friend Viktor Orbán, the DPS-New Beginning party, which until last year was part of the family of the Alliance of Liberals and Democrats for Europe (ALDE), is now in the Group of Conservatives and Patriots in the Parliamentary Assembly of the Council of Europe (PACE). The alliance was sealed at a meeting between the chair of the Group of Conservatives and Patriots, the Hungarian Zsolt Németh of Orbán’s party Fidesz, and Atidzhe Veli of Peevski’s formation.

In Bulgaria there are two politicians who enjoy Orbán’s publicly expressed sympathies: Borisov (they call each other “old friend”) and President Radev, who is likewise generous with compliments for the leader isolated within the EU.

Now another strongman in Bulgaria is hurrying to make his way to Orbán, but the future does not look rosy for the Hungarian prime minister. Fidesz, which has governed Hungary for the past 18 years, could lose the April 2026 election because of inflation, rising debt, a weak national currency, and pressure on media freedom. These problems are dragging down the country that carried out the most successful market-economy reforms in Eastern Europe. Founded by the aforementioned Magyar, currently a member of the European Parliament, the Tisza party, now part of the EPP, is gaining popularity with a focus on fighting corruption, restoring democratic institutions, and deeper integration with the EU. Magyar’s formation is establishing itself as a serious contender for power: in some opinion polls it is first, in others second, trailing Fidesz by an insignificant margin. The question is how different Tisza is from Fidesz.

Tu-tuuu

How could Borisov plead for “forgiveness” for Peevski and his removal from the Magnitsky list? Only through the government formed on GERB’s mandate, which would respond to initiatives and proposals from the new US administration.

Borisov even announced that Donald Trump, too, was interested in the Bulgarian reactors and that in his first term he had sent experts to see them.

I want, with President Trump, with Musk, with Microsoft, Bill Gates, they have an interest, if we can pull this thing off; I don’t believe there is a Bulgarian who doesn’t want the most powerful artificial intelligence to be here. I am thinking of the state, if you understand me.

His statement was garnished with a typical Boyko Borisov rendition of the Resurrection, which provoked laughter and sarcasm on social media.

They tortured the Lord in the most disgusting way, they drove thorns into his head, nails into his hands, they tied him to the tree, to the cross. And when everyone had renounced him, he suddenly appeared from above and said: “Tu-tuu.” He even made Doubting Thomas, since he had many holes from the nails, and told him: “Pick one, stick your finger in, but only in one, you’re not going to poke me all over my body”…

To Borisov and Peevski, it seems, Trump has appeared.

Meanwhile, new protests under the slogan “Peevski out of power” are taking to the streets. But Boyko Borisov must go along with him.

A History of Women’s Health(care). The Sexual Revolution

Post Syndicated from Nadezhda Tsekulova original https://www.toest.bg/istoria-na-zhenskoto-zdraveopazvane-seksualnata-revolyutsia/



In 20th century a new kind of man was invented: a woman. Social scientists realized she could do any job a man could, but without the need to talk about it. She has even been given a right to vote so she could choose herself which man would tell them what to do.1

In the first text of the series “Anatomy of Sex: Woman,” we moved quickly along the long road of attitudes toward women’s health from antiquity to the beginning of the 20th century. And without much surprise we found that, although the social role of women differed significantly across eras and cultures, from demigoddess to half-slave, the focus was always on her reproductive capacities and organs.

In the 20th century, economic and social changes in industrializing societies inspired new directions not only in the development of women’s health(care) but in the history of women in general. This is the century in which the first modern scientific studies would be carried out comparing, for example, how a given health technology affects women and men, and in which research would be done into why bodies often react differently. It would be too optimistic, however, to assume that the century will go down in history for that, given that it was also when the sexual revolution broke out.


Have more children, but only of the right race

The First World War shook the foundations of European societies. While men were dying in the trenches, women took over their jobs in factories, farms, post offices, transport, and hospitals, entering en masse professions that had until then been denied to them. Yet despite the acute need for their work and skills, the male world still received them with uncertainty. Amid turbulent change, the foundations were laid for a new view of women and, accordingly, of their health. But a long and painful road lay ahead before this new view would be transformed into real approaches.

After the end of the First World War, the attention of European states turned to restoring their populations and infrastructure. In this context, women’s healthcare became a leading concern, but again with the focus on women being healthy so that they could give birth to more healthy children. States began to introduce programs for prenatal care, breastfeeding support, and the prevention of childhood diseases as part of efforts to improve public health and demographic prospects. In parallel, however, the still very fragile conversation about birth control and about contraception as a woman’s right to choose froze.

In France, for example, the postwar loss of population led to the expansion of the so-called pro-natalist policies that had emerged at the beginning of the century, aimed at raising the birth rate and comprising both incentives (medical care for expectant mothers, financial incentives for large families) and restrictions (strict limits on contraception and abortion).

In Weimar Germany in the 1920s, the health system laid the foundations of public preventive care, including maternal and child consultations. In this period, however, the “racial hygiene” (Rassenhygiene) movement also emerged. When the Nazi regime came to power, the Law for the Prevention of Hereditarily Diseased Offspring (Gesetz zur Verhütung erbkranken Nachwuchses) was adopted. The law entered into force on 1 January 1934 and defined categories of persons subject to forced sterilization; alongside various types of disabilities, women with “asocial behavior” also fell under the procedure. It is believed that by 1945 at least 200,000 women had been sterilized, and that among them, besides the categories listed, were thousands of Roma and Sinti, selected on ethnic grounds.

Докато между двете световни войни европейските правителства търсят пътища за демографско възстановяване, женските движения се опитват да поставят други въпроси. Колко деца иска да има една жена? Кога и при какви условия? И има ли тя правото да не иска изобщо? Периодът е белязан от икономически кризи и нарастващ национализъм, но жените продължават да се организират и да настояват за равенство, здравни реформи и социална справедливост.

Тези дебати често се водят встрани от официалния здравен дискурс и далеч не са еднозначни. В Нидерландия например Алета Якобс – първата жена лекар в страната и дългогодишна застъпничка за женски права – поддържа неотстъпчиво идеята, че здравето и правото на избор не са част от „медицински“ или „демографски“ дебат – а въпрос на човешко достойнство. Якобс е авторка на богато илюстрована книга за анатомията на женското тяло, която е уникална за своето време. Якобс обаче остава непопулярна и сравнително изолирана в разбиранията си. Паралелно в Британия се появява книгата Married Love на Мари Стопс, в която авторката изказва скандалното за времето си твърдение, че жената има право на сексуално удоволствие в брака. Идеите на Стопс обаче са крайно противоречиви, тъй като наред с правото на жената на избор тя поддържа и популяризира расова чистота и осигуряване на „качество“ на потомството чрез принудителна стерилизация на определени категории хора.

Although women in the interwar era did not manage to steer the conversation about women's health beyond the traditional scrutiny of their reproductive capacities, they created the foundation on which, a few decades later, the sexual revolution would blossom.

The postwar “domestic ideal” and the first cracks

After the end of the Second World War, millions of women in Europe and the United States found themselves in a paradoxical position. During the war they had learned new professions, taken part in auxiliary medical missions, and carried a large part of the economy on their backs. Yet peace invited them to return “where they belonged”: shut in at home.

In the United States, the idealized image of the housewife, whose primary duty was to be a good wife and mother, was actively promoted. Women's healthcare focused almost exclusively on pregnancy and childbirth, while conversation about sexuality, pleasure, or control over fertility remained marginalized, confined to small communities and informal channels. In Western Europe, the postwar years were also a period of demographic fears: societies were traumatized by their losses, and encouraging the birth rate remained a central political goal. The health systems in countries such as France, Italy, and West Germany provided accessible maternal and child care, but still with a pronounced interest solely in supporting the woman's reproductive role. Contraception was restricted or illegal almost everywhere, and sex education, menstruation, and abortion were treated not as part of health culture but as moral questions, often under the strong influence of the Church.

It was precisely in this environment that one of the most important projects in the history of women's health began: the creation of an oral contraceptive that would be reliable, discreet, and in the hands of the woman herself.

The initiative started with Margaret Sanger, a longtime birth control activist, who received support and funds from Katharine McCormick, a wealthy widow motivated to invest in the search for a means of preventing pregnancy. The scientist who turned their ideas into reality was Dr. Gregory Pincus, a specialist in hormonal biology, who began clinical trials of a new synthetic progestin. After years of trials, the first oral contraceptive pill was approved in the United States in 1960. It gave women the ability to control their bodies independently, effectively, and over the long term, without depending on their partner or on a doctor.

The sexual revolution: not a new conversation about sex, but a new language for the body

In the 1960s a new cultural wave began to unfold in Europe, calling the old dogmas into question. What we today call the sexual revolution was not simply a change in behavior but a transformation of the attitude toward the body, especially the female body. Women began to speak openly about menstruation, abortion, pleasure, and sexual violence, topics that until then had been either taboo or medically oversimplified and reduced to reproduction as the lowest common denominator.

Under the pressure of social change and new understandings of sexual health, many European states began, slowly and painfully, to liberalize their legislation toward permitting contraception and abortion. In parallel, a new type of health education unfolded, including sex education in schools, the creation of family planning clinics, and the possibility of speaking about the female body as something that goes beyond its reproductive function.

The sexual revolution did not eliminate stereotypes, but it succeeded in changing the perspective. The woman was no longer simply a mother or a wife; she began to assert herself as a subject in the care of her own body, which opened the door to a much broader and more holistic view of medicine for women.

Thus the need emerged for a new attitude toward the female body: not as a temporary carrier of new life, but as a complete biological, emotional, and social organism with its own rhythms, vulnerabilities, and needs. People began to talk about menopause, osteoporosis, cardiovascular disease, and cancers in women, as well as about how the symptoms, treatment, and response of the female body differ from those of the male body.

And yet, even when women began to be recognized as full participants in the care of their own health, medical science long continued to speak about them but not with them, and most often on the basis of the male body. Despite the progress achieved, women remain underrepresented in clinical trials, their symptoms are often ignored or misinterpreted, and diseases that mainly affect women receive less attention and funding. Thus the development of women's health (care) enters a new phase, in which science must learn to be more accurate, more representative, and more just.

If I were cynical, I would note that things had just begun to move slowly in the right direction when an elderly, rich, and influential white president ordered the word “woman” to be erased from the scientific datasets of the American government, and thousands of people around the world rushed to erase all the progress in attitudes toward women. It is a good thing I am not that kind of person, so I will not say it.

How far medicine has come in the study of women's health, and whether that progress is currently on pause: read about it next month in “Тоест”.

1 In the 20th century a new kind of human was discovered: the woman. Researchers realized that she could do any job a man could, but without needing to talk about it. She was even given the right to vote, so that she could choose exactly which man would tell her what to do. (Cunk on Earth, a British comedy documentary series)


“Anatomy of Sex: Woman” examines women's health as an inseparable part of society, history, and culture. In the series we explore how attitudes toward women's health have changed, how medicine has perceived women's specific needs, and what processes have influenced their access to quality healthcare. We look closely at scientific discoveries but also at cultural myths; at official policies but also at the personal stories of women fighting for their right to health and dignity.

Т.Е. от Е.Т., episode 11

Post Syndicated from Тоест original https://www.toest.bg/t-e-ot-e-t-epizod-11/

I am tu-tuu.
You are tu-tuu.
He/she/it is tu-tuu.
We are tu-tuu.
You (plural) are tu-tuu.
They are tu-tuu.

In case you want to use it in a sentence.

Follow Elena Telbis's video column for “Тоест” on Instagram and TikTok as well.

Announcing the AWS Well-Architected Generative AI Lens

Post Syndicated from Dan Ferguson original https://aws.amazon.com/blogs/architecture/announcing-the-aws-well-architected-generative-ai-lens/

We are delighted to introduce the new AWS Well-Architected Generative AI Lens. The AWS Well-Architected Framework provides architectural best practices for designing and operating generative AI workloads on AWS. The Generative AI Lens uses the Well-Architected Framework to outline the steps for performing a Well-Architected Framework Review for your generative AI workloads.

The Generative AI Lens provides a consistent approach for customers to evaluate architectures that use large language models (LLMs) to achieve their business goals. This lens addresses common considerations relevant to model selection, prompt engineering, model customization, workload integration, and continuous improvement. Specifically excluded from this lens are best practices associated with model training and advanced model customization techniques. We identify best practices that help you architect your cloud-based applications and workloads according to AWS Well-Architected design principles gathered from supporting thousands of customer implementations.

The Generative AI Lens joins a collection of Well-Architected lenses published under AWS Well-Architected Lenses.

What is the Generative AI Lens?

The Well-Architected Generative AI Lens focuses on the six pillars of the Well-Architected Framework across six phases of the generative AI lifecycle, as illustrated in the following figure.

The six phases are:

  1. Scoping the impact of generative AI in solving your problem.
  2. Selecting a model that sufficiently addresses the task.
  3. Customizing the model with prompts, data sources, or updated weights to improve performance.
  4. Integrating the model into your existing applications.
  5. Deploying the new generative AI capability into your environment.
  6. Iterating and improving on the generative AI capabilities you have released.

Unlike the traditional waterfall approach, an iterative approach is required to achieve a working prototype based on the six phases of the generative AI lifecycle. The lens provides you with a set of established cloud-agnostic best practices in the form of Well-Architected Framework pillars for each generative AI lifecycle phase.

You can also use the Well-Architected Generative AI Lens wherever you are on your cloud journey. You can choose to apply this guidance either during the design of your generative AI workloads or after your workloads have entered production as a part of the continuous improvement process.

What else is discussed in the Generative AI Lens?

The Generative AI Lens also discusses the following key topics:

  • Responsible AI – Responsible implementation of generative AI workloads is discussed in this paper. We describe some of the common considerations facing customers as they address the responsible implementation and deployment of generative AI.
  • Data architecture for generative AI – At the core of any AI workload is data. We feature a brief survey of the nuances of data architectures with regard to generative AI workloads.

Who should use the Generative AI Lens?

The Generative AI Lens is useful to many roles. Business leaders can use this lens to acquire a broader appreciation of the end-to-end implementation and benefits of generative AI. Data scientists and engineers can read this lens to understand how to use, secure, and gain insights from their data at scale. Risk and compliance leaders can understand how generative AI is implemented responsibly and in compliance with regulatory and governance requirements.

Generative AI Lens components

The lens includes four focus areas:

  • The Well-Architected Generative AI Lens design principles – Design principles are the guidelines and value statements that frame the presented best practices.
  • The Generative AI lifecycle and the Well-Architected Framework pillars – This considers all aspects of the generative AI lifecycle and reviews design strategies to align to the pillars of the overall Well-Architected Framework:
    • Operational excellence – Ability to support ongoing development, run operational workloads effectively, gain insight into your operations, and continuously improve supporting processes and procedures to deliver business value.
    • Security – Ability to protect data, systems, and assets, and to take advantage of cloud technologies to improve your security.
    • Reliability – Ability of a workload to perform its intended function correctly and consistently, and to automatically recover from failure situations.
    • Performance efficiency – Ability to use computing resources efficiently to meet system requirements, and to maintain that efficiency as system demand changes and technologies evolve.
    • Cost optimization – Ability to run systems to deliver business value at the lowest price point.
    • Sustainability – Addresses the long-term environmental, economic, and societal impact of your business activities.
  • Cloud-agnostic best practices – These are best practices for each generative AI lifecycle phase across the Well-Architected Framework pillars irrespective of your technology setting. The best practices are accompanied by:
    • Implementation guidance – The AWS implementation plans for each best practice with references to AWS technologies and resources.
    • Resources – A set of links to AWS documents, blogs, videos, and code examples as supporting resources to the best practices and their implementation plans.
  • Related generative AI architecture considerations – This includes discussions on the generative AI application lifecycle, and where the listed best practices in this lens could fit into the lifecycle. Additionally, we discuss elements of data architecture for generative AI workloads, and Well-Architected considerations for responsible AI.
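One way to picture how the lens is organized is as a grid of lifecycle phases against pillars, with one review cell per combination. The following is a minimal sketch of that idea only; the short phase labels, the status strings, and the helper names are assumptions made for illustration and do not come from the lens or the AWS Well-Architected Tool.

```python
from itertools import product

# Shorthand labels (assumed) for the six lifecycle phases listed in the post.
PHASES = ["Scoping", "Selecting", "Customizing",
          "Integrating", "Deploying", "Iterating"]

# The six Well-Architected pillars, as named in the post.
PILLARS = ["Operational excellence", "Security", "Reliability",
           "Performance efficiency", "Cost optimization", "Sustainability"]


def new_review():
    """Return a review grid with one cell per (phase, pillar) pair."""
    return {cell: "not reviewed" for cell in product(PHASES, PILLARS)}


def open_items(review):
    """List the cells that have not been marked as reviewed yet."""
    return [cell for cell, status in review.items() if status != "reviewed"]


review = new_review()
review[("Selecting", "Cost optimization")] = "reviewed"
print(len(review), "cells,", len(open_items(review)), "still open")
```

Because the lifecycle is iterative, a grid like this would typically be revisited on each pass rather than filled in once.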

What are the next steps?

The new Well-Architected Generative AI Lens is available now. Use the lens to make sure that your generative AI workloads are architected with operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability in mind.

If you require support on the implementation or assessment of your generative AI workloads, please contact your AWS Solutions Architect or Account Representative.

Special thanks to everyone across the AWS Solution Architecture, AWS Professional Services, and Machine Learning communities who contributed to the Generative AI Lens. These contributions encompassed diverse perspectives, expertise, backgrounds, and experiences in developing the new AWS Well-Architected Generative AI Lens.

For additional reading, refer to the AWS Well-Architected Framework and pillar whitepapers, or use the AWS Well-Architected Machine Learning Lens and its custom lens accessible from the AWS Well-Architected Tool.



Accelerate your analytics with Amazon S3 Tables and Amazon SageMaker Lakehouse

Post Syndicated from Sandeep Adwankar original https://aws.amazon.com/blogs/big-data/accelerate-your-analytics-with-amazon-s3-tables-and-amazon-sagemaker-lakehouse/

Amazon SageMaker Lakehouse is a unified, open, and secure data lakehouse that now seamlessly integrates with Amazon S3 Tables, the first cloud object store with built-in Apache Iceberg support. With this integration, SageMaker Lakehouse provides unified access to S3 Tables, general purpose Amazon S3 buckets, Amazon Redshift data warehouses, and data sources such as Amazon DynamoDB or PostgreSQL. You can then query, analyze, and join the data using Redshift, Amazon Athena, Amazon EMR, and AWS Glue. In addition to your familiar AWS services, you can access and query your data in place with your choice of Iceberg-compatible tools and engines, giving you the flexibility to use SQL or Spark-based tools and collaborate on this data the way you like. You can secure and centrally manage your data in the lakehouse by defining fine-grained permissions with AWS Lake Formation that are consistently applied across all analytics and machine learning (ML) tools and engines.

Organizations are becoming increasingly data driven, and as data becomes a differentiator in business, organizations need faster access to all their data in all locations, using preferred engines to support rapidly expanding analytics and AI/ML use cases. Let’s take an example of a retail company that started by storing their customer sales and churn data in their data warehouse for business intelligence reports. With massive growth in business, they need to manage a variety of data sources as well as exponential growth in data volume. The company builds a data lake using Apache Iceberg to store new data such as customer reviews and social media interactions.

This enables them to cater to their end customers with new personalized marketing campaigns and understand their impact on sales and churn. However, data distributed across data lakes and warehouses limits their ability to move quickly, because it may require them to set up specialized connectors, manage multiple access policies, and often resort to copying data, which increases the cost of both managing the separate datasets and storing redundant data. SageMaker Lakehouse addresses these challenges by providing secure and centralized management of data in data lakes, data warehouses, and data sources such as MySQL and SQL Server, with fine-grained permissions that are consistently applied across data in all analytics engines.

In this post, we show you how to use various analytics services through the integration of SageMaker Lakehouse with S3 Tables. We begin by enabling integration of S3 Tables with AWS analytics services. We create S3 Tables and Redshift tables and populate them with data. We then set up SageMaker Unified Studio by creating a company-specific domain, a new project with users, and fine-grained permissions. This lets us unify data lakes and data warehouses and use them with analytics services such as Athena, Redshift, AWS Glue, and Amazon EMR.

Solution overview

To illustrate the solution, we are going to consider a fictional company called Example Retail Corp. Example Retail’s leadership is interested in understanding customer and business insights across thousands of customer touchpoints for millions of their customers that will help them build sales, marketing, and investment plans. Leadership wants to conduct an analysis across all their data to identify at-risk customers, understand impact of personalized marketing campaigns on customer churn, and develop targeted retention and sales strategies.

Alice is a data administrator at Example Retail Corp who has embarked on an initiative to consolidate customer information from multiple touchpoints, including social media, sales, and support requests. She decides to use S3 Tables with Iceberg transactional capability to achieve scalability as updates are streamed across billions of customer interactions, while providing the same durability, availability, and performance characteristics that S3 is known for. Alice has already built a large warehouse with Redshift, which contains historical and current data about sales, customer prospects, and churn.

Alice supports an extended team of developers, engineers, and data scientists who require access to the data environment to develop business insights, dashboards, ML models, and knowledge bases. This team includes:

Bob, a data analyst who needs access to S3 Tables and warehouse data to automate reporting on customer interaction growth and churn across various customer touchpoints for daily reports sent to leadership.

Charlie, a business intelligence analyst who is tasked with building interactive dashboards for the funnel of customer prospects and their conversions across multiple touchpoints, and making those available to thousands of Sales team members.

Doug, a data engineer responsible for building ML forecasting models for sales growth using the pipeline and/or customer conversion across multiple touchpoints, and making those available to finance and planning teams.

Alice decides to use SageMaker Lakehouse to unify data across S3 Tables and Redshift data warehouse. Bob is excited about this decision as he can now build daily reports using his expertise with Athena. Charlie now knows that he can quickly build Amazon QuickSight dashboards with queries that are optimized using Redshift’s cost-based optimizer. Doug, being an open source Apache Spark contributor, is excited that he can build Spark based processing with AWS Glue or Amazon EMR to build ML forecasting models.

The following diagram illustrates the solution architecture.

Implementing this solution consists of the following high-level steps. For Example Retail, Alice, as the data administrator, performs these steps:

  1. Create a table bucket. S3 Tables stores Apache Iceberg tables as S3 resources, and customer details are managed in S3 Tables. You can then enable integration with AWS analytics services, which automatically sets up the SageMaker Lakehouse integration so that the table bucket is shown as a child catalog under the federated s3tablescatalog in the AWS Glue Data Catalog and is registered with AWS Lake Formation for access control. Next, you create a table namespace or database, which is a logical construct that you group tables under, and create a table using the Athena SQL CREATE TABLE statement.
  2. Publish your data warehouse to Glue Data Catalog. Churn data is managed in a Redshift data warehouse, which is published to the Data Catalog as a federated catalog and is available in SageMaker Lakehouse.
  3. Create a SageMaker Unified Studio project. SageMaker Unified Studio integrates with SageMaker Lakehouse and simplifies analytics and AI with a unified experience. Start by creating a domain and adding all users (Bob, Charlie, Doug). Then create a project in the domain, choosing a project profile that provisions various resources and the project AWS Identity and Access Management (IAM) role that manages resource access. Alice adds Bob, Charlie, and Doug to the project as members.
  4. Onboard S3 Tables and Redshift tables to SageMaker Unified Studio. To onboard the S3 Tables to the project, in Lake Formation, you grant permission on the resource to the SageMaker Unified Studio project role. This makes the catalog discoverable within the lakehouse data explorer so that users (Bob, Charlie, and Doug) can start querying tables. SageMaker Lakehouse resources can now be accessed from compute engines such as Athena and Redshift, and from Apache Spark-based engines such as AWS Glue, to derive churn analysis insights, with Lake Formation managing the data permissions.

Prerequisites

To follow the steps in this post, you must complete the following prerequisites:

  1. Have an AWS account with access to the following AWS services:
    • Amazon S3 including S3 Tables
    • Amazon Redshift
    • AWS Identity and Access Management (IAM)
    • Amazon SageMaker Unified Studio
    • AWS Lake Formation and AWS Glue Data Catalog
    • AWS Glue
  2. Create a user with administrative access.
  3. Have access to an IAM role that is a Lake Formation data lake administrator. For instructions, refer to Create a data lake administrator.
  4. Enable AWS IAM Identity Center in the same AWS Region where you want to create your SageMaker Unified Studio domain. Set up your identity provider (IdP) and synchronize identities and groups with AWS IAM Identity Center. For more information, refer to IAM Identity Center Identity source tutorials.
  5. Create a read-only administrator role to discover the Amazon Redshift federated catalogs in the Data Catalog. For instructions, refer to Prerequisites for managing Amazon Redshift namespaces in the AWS Glue Data Catalog.
  6. Create an IAM role named DataTransferRole. For instructions, refer to Prerequisites for managing Amazon Redshift namespaces in the AWS Glue Data Catalog.
  7. Create an Amazon Redshift Serverless namespace called churnwg. For more information, see Get started with Amazon Redshift Serverless data warehouses.

Create a table bucket and enable integration with analytics services

Alice completes the following steps to create the S3 table bucket for the new data she plans to add or import into S3 Tables.

Follow these steps to create a table bucket and enable integration with SageMaker Lakehouse:

  1. Sign in to the S3 console as the user created in prerequisite step 2.
  2. Choose Table buckets in the navigation pane and choose Enable integration.
  3. Choose Table buckets in the navigation pane and choose Create table bucket.
  4. For Table bucket name, enter a name such as blog-customer-bucket.
  5. Choose Create table bucket.
  6. Choose Create table with Athena.
  7. Select Create a namespace and provide a namespace (for example, customernamespace).
  8. Choose Create namespace.
  9. Choose Create table with Athena.
  10. On the Athena console, run the following SQL script to create a table:
    CREATE TABLE customer (
      `c_salutation` string, 
      `c_preferred_cust_flag` string, 
      `c_first_sales_date_sk` int, 
      `c_customer_sk` int, 
      `c_login` string, 
      `c_current_cdemo_sk` int, 
      `c_first_name` string, 
      `c_current_hdemo_sk` int, 
      `c_current_addr_sk` int, 
      `c_last_name` string, 
      `c_customer_id` string, 
      `c_last_review_date_sk` int, 
      `c_birth_month` int, 
      `c_birth_country` string, 
      `c_birth_year` int, 
      `c_birth_day` int, 
      `c_first_shipto_date_sk` int, 
      `c_email_address` string)
      TBLPROPERTIES ('table_type' = 'iceberg')
      
    
    INSERT INTO customer VALUES
    ('Dr.','N',2452077,13251813,'Y',1381546,'Joyce',2645,2255449,'Deaton','AAAAAAAAFOEDKMAA',2452543,1,'GREECE',1987,29,2250667,'[email protected]'),
    ('Dr.','N',2450637,12755125,'Y',1581546,'Daniel',9745,4922716,'Dow','AAAAAAAAFLAKCMAA',2432545,1,'INDIA',1952,3,2450667,'[email protected]'),
    ('Dr.','N',2452342,26009249,'Y',1581536,'Marie',8734,1331639,'Lange','AAAAAAAABKONMIBA',2455549,1,'CANADA',1934,5,2472372,'[email protected]'),
    ('Dr.','N',2452342,3270685,'Y',1827661,'Wesley',1548,11108235,'Harris','AAAAAAAANBIOBDAA',2452548,1,'ROME',1986,13,2450667,'[email protected]'),
    ('Dr.','N',2452342,29033279,'Y',1581536,'Alexandar',8262,8059919,'Salyer','AAAAAAAAPDDALLBA',2952543,1,'SWISS',1980,6,2650667,'[email protected]'),
    ('Miss','N',2452342,6520539,'Y',3581536,'Jerry',1874,36370,'Tracy','AAAAAAAALNOHDGAA',2452385,1,'ITALY',1957,8,2450667,'[email protected]')

This is just an example of adding a few rows to the table, but generally for production use cases, customers use engines such as Spark to add data to the table.

The S3 Tables table customer is now created, populated with data, and integrated with SageMaker Lakehouse.
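The bucket and namespace parts of the console steps above can also be scripted. The following is a minimal sketch, assuming the boto3 `s3tables` client and its `create_table_bucket` and `create_namespace` operations; the helper name `create_bucket_and_namespace` and the `main` function are hypothetical. It does not cover the Enable integration step, and the table itself is still created with the Athena statement shown earlier.

```python
def create_bucket_and_namespace(s3tables, bucket_name, namespace):
    """Create a table bucket, then a namespace inside it.

    `s3tables` is expected to behave like boto3.client("s3tables").
    Returns the ARN of the new table bucket.
    """
    bucket = s3tables.create_table_bucket(name=bucket_name)
    bucket_arn = bucket["arn"]
    # The namespace is passed as a single-element list.
    s3tables.create_namespace(tableBucketARN=bucket_arn, namespace=[namespace])
    return bucket_arn


def main():
    # Requires boto3 and AWS credentials; call main() yourself to run it.
    import boto3

    arn = create_bucket_and_namespace(
        boto3.client("s3tables"),
        "blog-customer-bucket",   # bucket name used in this post
        "customernamespace",      # namespace used in this post
    )
    print("Created table bucket:", arn)
```

Passing the client in as an argument keeps the helper easy to exercise with a stub before pointing it at a real account.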

Set up Redshift tables and publish to the Data Catalog

Alice completes the following steps to publish the data in Redshift to the Data Catalog. We also demonstrate how the Redshift table is created and populated, although in Alice’s case the Redshift table already exists with all the historical data on sales revenue.

  1. Sign in to the Redshift endpoint churnwg as an admin user.
  2. Run the following script to create a table under the dev database under the public schema:
    CREATE TABLE customer_churn (
    customer_id BIGINT,
    tenure INT,
    monthly_charges DECIMAL(5,1),
    total_charges DECIMAL(5,1),
    contract_type VARCHAR(100),
    payment_method VARCHAR(100),
    internet_service VARCHAR(100),
    has_phone_service BOOLEAN,
    is_churned BOOLEAN
    );
    
    INSERT INTO customer_churn VALUES
    (10251783, 12, 70.5, 850.0, 'Month-to-Month', 'Credit Card', 'Fiber Optic', true, true),
    (13251813, 36, 55.0, 1980.0, 'One Year', 'Bank Transfer', 'DSL', true, false),
    (12755125, 6, 90.0, 540.0, 'Month-to-Month', 'Mailed Check', 'Fiber Optic', false, true),
    (26009249, 12, 70.5, 850.0, 'One Year', 'Credit Card', 'DSL', true, false),
    (3270685, 36, 55.0, 1980.0, 'One Year', 'Bank Transfer', 'DSL', true, false),
    (29033279, 6, 90.0, 540.0, 'Month-to-Month', 'Mailed Check', 'Fiber Optic', false, true),
    (6520539, 24, 60.0, 1440.0, 'Two Year', 'Electronic Check', 'DSL', true, false);

    This is just an example of adding a few rows to the table, but generally for production use cases, customers use several ways to add data to the table as documented in Loading data in Amazon Redshift.

  3. On the Redshift Serverless console, navigate to the namespace.
  4. On the Action dropdown menu, choose Register with AWS Glue Data Catalog to integrate with SageMaker Lakehouse.
  5. Choose Register.
  6. Sign in to the Lake Formation console as the data lake administrator.
  7. Under Data Catalog in the navigation pane, choose Catalogs and Pending catalog invitations.
  8. Select the pending invitation and choose Approve and create catalog.
  9. Provide a name for the catalog (for example, churn_lakehouse).
  10. Under Access from engines, select Access this catalog from Iceberg-compatible engines and choose DataTransferRole for the IAM role.
  11. Choose Next.
  12. Choose Add permissions.
  13. Under Principals, choose the datalakeadmin role for IAM users and roles, Super user for Catalog permissions, and choose Add.
  14. Choose Create catalog.

The Redshift table customer_churn is now created, populated with data, and integrated with SageMaker Lakehouse.

Create a SageMaker Unified Studio domain and project

Alice now sets up a SageMaker Unified Studio domain and project so that she can bring the users (Bob, Charlie, and Doug) together in the new project.

Complete the following steps to create a SageMaker domain and project using SageMaker Unified Studio:

  1. On the SageMaker Unified Studio console, create a SageMaker Unified Studio domain and project using the All Capabilities profile template. For more details, refer to Setting up Amazon SageMaker Unified Studio. For this post, we create a project named churn_analysis.
  2. Set up IAM Identity Center with users Bob, Charlie, and Doug, and add them to the domain and project.
  3. From SageMaker Unified Studio, navigate to the project overview and on the Project details tab, note the project role Amazon Resource Name (ARN).
  4. Sign in to the IAM console as an admin user.
  5. In the navigation pane, choose Roles.
  6. Search for the project role and add AmazonS3TablesReadOnlyAccess by choosing Add permissions.

SageMaker Unified Studio is now set up with a domain, project, and users.

Onboard S3 Tables and Redshift tables to the SageMaker Unified Studio project

Alice now configures the SageMaker Unified Studio project role for fine-grained access control to determine who on her team can access which datasets.

Grant the project role full table access on the customer dataset by completing the following steps:

  1. Sign in to the Lake Formation console as the data lake administrator.
  2. In the navigation pane, choose Data lake permissions, then choose Grant.
  3. In the Principals section, for IAM users and roles, choose the project role ARN noted earlier.
  4. In the LF-Tags or catalog resources section, select Named Data Catalog resources:
    • Choose <account_id>:s3tablescatalog/blog-customer-bucket for Catalogs.
    • Choose customernamespace for Databases.
    • Choose customer for Tables.
  5. In the Table permissions section, select Select and Describe for permissions.
  6. Choose Grant.

Now grant the project role access to a subset of columns from the customer_churn dataset.

  1. In the navigation pane, choose Data lake permissions, then choose Grant.
  2. In the Principals section, for IAM users and roles, choose the project role ARN noted earlier.
  3. In the LF-Tags or catalog resources section, select Named Data Catalog resources:
    • Choose <account_id>:churn_lakehouse/dev for Catalogs.
    • Choose public for Databases.
    • Choose customer_churn for Tables.
  4. In the Table Permissions section, select Select.
  5. In the Data Permissions section, select Column-based access.
  6. For Choose permission filter, select Include columns and choose customer_id, internet_service, and is_churned.
  7. Choose Grant.

All users in the churn_analysis project in SageMaker Unified Studio are now set up. They have access to all columns in the customer table and fine-grained access to the Redshift table customer_churn, where they can see only three columns.
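The column-based grant above can also be expressed as a Lake Formation API request. The sketch below only builds the request body for the `GrantPermissions` operation; the helper name, the account ID `111122223333`, and the role ARN are placeholders, and the assumption that the console value `<account_id>:churn_lakehouse/dev` is passed as `CatalogId` should be verified against your own setup before use.

```python
def column_grant_request(principal_arn, catalog_id, database, table, columns):
    """Build keyword arguments for Lake Formation GrantPermissions (SELECT on columns)."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "CatalogId": catalog_id,
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


# Placeholder identifiers; replace with the project role ARN noted earlier.
request = column_grant_request(
    principal_arn="arn:aws:iam::111122223333:role/example-project-role",
    catalog_id="111122223333:churn_lakehouse/dev",
    database="public",
    table="customer_churn",
    columns=["customer_id", "internet_service", "is_churned"],
)
print(request["Resource"]["TableWithColumns"]["ColumnNames"])
```

With credentials in place, the request could be sent with `boto3.client("lakeformation").grant_permissions(**request)`.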

Verify data access in SageMaker Unified Studio

Alice can now do a final verification that the data is available and that each of her team members is set up to access the datasets.

Now you can verify data access for different users in SageMaker Unified Studio.

  1. Sign in to SageMaker Unified Studio as Bob and choose the churn_analysis project.
  2. Navigate to the Data explorer to view s3tablescatalog and churn_lakehouse under Lakehouse.

Data Analyst uses Athena for analyzing customer churn

Bob, the data analyst, can now log in to SageMaker Unified Studio, choose the churn_analysis project, navigate to the Build options, and choose Query Editor under Data Analysis & Integration.

Bob chooses the connection as Athena (Lakehouse), the catalog as s3tablescatalog/blog-customer-bucket, and the database as customernamespace. He then runs the following SQL to analyze the data for customer churn:

select * from "churn_lakehouse/dev"."public"."customer_churn" a, 
"s3tablescatalog/blog-customer-bucket"."customernamespace"."customer" b
where a.customer_id=b.c_customer_sk limit 10;

Bob can now join data across S3 Tables and Redshift in Athena and can proceed to build a full SQL analytics capability that automates daily customer growth and churn reports for leadership.

BI Analyst uses Redshift engine for analyzing customer data

Charlie, the BI analyst, can now log in to SageMaker Unified Studio and choose the churn_analysis project. He navigates to the Build options and chooses Query Editor under Data Analysis & Integration. He chooses the connection as Redshift (Lakehouse), Databases as dev, and Schemas as public.

He then runs the following SQL to perform his specific analysis.

select * from "dev@churn_lakehouse"."public"."customer_churn" a, 
"blog-customer-bucket@s3tablescatalog"."customernamespace"."customer" b
where a.customer_id=b.c_customer_sk limit 10;

Charlie can now further update the SQL query and use it to power QuickSight dashboards that can be shared with Sales team members.
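The Athena and Redshift queries above refer to the same tables with differently shaped catalog names. The following sketch captures the pattern as it appears in the two queries in this post only; the helper names athena_name and redshift_name are hypothetical, and the pattern is inferred from these examples rather than taken from a formal naming specification.

```python
def athena_name(catalog, subcatalog, schema, table):
    """Athena style seen above: "catalog/subcatalog"."schema"."table"."""
    return f'"{catalog}/{subcatalog}"."{schema}"."{table}"'


def redshift_name(catalog, subcatalog, schema, table):
    """Redshift style seen above: "subcatalog@catalog"."schema"."table"."""
    return f'"{subcatalog}@{catalog}"."{schema}"."{table}"'


churn = ("churn_lakehouse", "dev", "public", "customer_churn")
customer = ("s3tablescatalog", "blog-customer-bucket", "customernamespace", "customer")

for parts in (churn, customer):
    print("Athena:  ", athena_name(*parts))
    print("Redshift:", redshift_name(*parts))
```

Keeping the four name parts in one tuple makes it harder to mix up the two conventions when the same join is ported from one engine to the other.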

Data engineer uses AWS Glue Spark engine to process customer data

Finally, Doug logs in to SageMaker Unified Studio and chooses the churn_analysis project to perform his analysis. He navigates to the Build options and chooses JupyterLab under IDE & Applications. He downloads the churn_analysis.ipynb notebook and uploads it into the explorer. He then runs the cells by selecting project.spark.compatibility as the compute.

He runs the following SQL to analyze the data for customer churn:

Doug can now use Spark SQL to process data from both S3 Tables and Redshift tables and start building forecasting models for customer growth and churn.

Cleaning up

If you implemented the example and want to remove the resources, complete the following steps:

  1. Clean up S3 Tables resources:
    1. Delete the table.
    2. Delete the namespace in the table bucket.
    3. Delete the table bucket.
  2. Clean up the Redshift data resources:
    1. On the Lake Formation console, choose Catalogs in the navigation pane.
    2. Delete the churn_lakehouse catalog.
  3. Delete the SageMaker project, IAM roles, AWS Glue resources, Athena workgroup, and S3 buckets created for the domain.
  4. Delete the SageMaker domain and the VPC created for the setup.

Conclusion

In this post, we showed how you can use SageMaker Lakehouse to unify data across S3 Tables and Redshift data warehouses, which can help you build powerful analytics and AI/ML applications on a single copy of data. SageMaker Lakehouse gives you the flexibility to access and query your data in-place with Iceberg-compatible tools and engines. You can secure your data in the lakehouse by defining fine-grained permissions that are enforced across analytics and ML tools and engines.

For more information, refer to Tutorial: Getting started with S3 Tables, S3 Tables integration, and Connecting to the Data Catalog using AWS Glue Iceberg REST endpoint. We encourage you to try out the S3 Tables integration with SageMaker Lakehouse and share your feedback with us.


About the authors

Sandeep Adwankar is a Senior Technical Product Manager at AWS. Based in the California Bay Area, he works with customers around the globe to translate business and technical requirements into products that enable customers to improve how they manage, secure, and access data.

Srividya Parthasarathy is a Senior Big Data Architect on the AWS Lake Formation team. She works with the product team and customers to build robust features and solutions for their analytical data platform. She enjoys building data mesh solutions and sharing them with the community.

Aditya Kalyanakrishnan is a Senior Product Manager on the Amazon S3 team at AWS. He enjoys learning from customers about how they use Amazon S3 and helping them scale performance. Adi’s based in Seattle, and in his spare time enjoys hiking and occasionally brewing beer.

Announcing AWS Security Reference Architecture Code Examples for Generative AI

Post Syndicated from Ievgeniia Ieromenko original https://aws.amazon.com/blogs/security/announcing-aws-security-reference-architecture-code-examples-for-generative-ai/

Amazon Web Services (AWS) is pleased to announce the release of new Security Reference Architecture (SRA) code examples for securing generative AI workloads. The examples include two comprehensive capabilities focusing on secure model inference and RAG implementations, covering a wide range of security controls and best practices for AWS generative AI services.

These new code examples are available in the AWS SRA Examples Repository and include ready-to-deploy CloudFormation templates for implementing detective security controls such as network segmentation, identity management, encryption, prompt injection detection, and logging and monitoring. The solutions align with the AWS SRA Design Guidance page and demonstrate our commitment to helping customers secure their generative AI implementations.

Customers can get started with these examples by following the implementation instructions for each solution in the AWS SRA Examples Repository Solutions GenAI page. Additional documentation and implementation guidance is available in the AWS SRA Design Guidance Generative AI Architecture Deep Dive.

AWS strives to continuously provide security solutions that help customers meet their security architecture needs. Customers can reach out to the team by submitting an issue in the code repository.

If you have feedback about this post, submit comments in the Comments section below.

Ievgeniia Ieromenko

Ievgeniia is a Security Engineer at AWS, focusing on cloud security architecture and best practices. She is a key contributor to the AWS Security Reference Architecture GitHub repository, helping customers implement secure cloud environments.

Liam Schneider

Liam is a Sr. Security Engineer with deep experience in cloud and application security, focused on reducing risk, improving system resilience, and aligning security with business needs. Liam has a strong background in compliance, team leadership, and building secure, scalable solutions across complex environments. He is known for practical, effective approaches to modern security challenges in both enterprise and cloud-first organizations.

Justin Kontny

Justin is a Sr. Security Engineer at AWS who combines his passion for software development with expertise in cloud security. He focuses on transforming security from a barrier to a business enabler through innovative AI-driven automation. When not pushing the boundaries of cloud security, Justin enjoys time with his children and being active outdoors.

How to help prevent hotlinking using referer checking, AWS WAF, and Amazon CloudFront

Post Syndicated from Alex Smith original https://aws.amazon.com/blogs/security/how-to-prevent-hotlinking-by-using-aws-waf-amazon-cloudfront-and-referer-checking/

Note: This post was first published April 21, 2016. The updated version aligns with the latest version of AWS WAF (AWS WAF v2) and includes screenshots that reflect the changes in the AWS console experience.


AWS WAF Classic has been deprecated and will be end-of-life (EOL) in September 2025. This update describes how to use the latest version of AWS WAF (WAFv2) to help prevent hotlinking. Updates have been made to the screenshots to reflect the changes in the AWS Management Console for AWS WAF.

Hotlinking, also known as inline linking, is a form of content leeching where an unauthorized third-party website embeds links to resources originally referenced in a primary site’s HTML. The third-party website doesn’t incur the cost of hosting the content, which means that your website can be charged for the content other sites use. It can also result in slow loading times, lost revenue, and potential legal issues.

Now, you can use AWS WAF to help prevent hotlinking. AWS WAF is a web application firewall that’s closely integrated with Amazon CloudFront—a content delivery network (CDN)—and can help protect your web applications from common web exploits that could affect application availability, compromise security, and consume excessive resources. In this blog post, I show you how to help prevent hotlinking by using header inspection in AWS WAF, while still taking advantage of the improved user experience from a CDN such as CloudFront.

Solution overview

You can address hotlinking in various ways. For instance, you can validate the Referer header (sent by a browser to indicate to the server which page the visitor was referred from) at your web server (for example, by using the Apache module mod_rewrite), and either issue a redirect back to your site’s main page or return a 403 Forbidden error to the visitor’s browser.

If you’re using a CDN such as CloudFront to speed up your site’s delivery of content, validating the Referer header at the web server becomes less practical. The CDN stores a copy of your content at the edge of its network of servers, so even if your web server validates the original request’s headers (in this case, the Referer header), additional requests for that content must be validated by the CDN itself, because they are unlikely to reach the origin web server.

Figure 1 illustrates this process.

Figure 1: Request – response flow showing instances of a cache-miss and a cache-hit

The process shown in Figure 1 is as follows:

  1. A request is received from a user client (1) at a CloudFront edge location (2).
  2. The edge location attempts to return a cached copy of the file requested. This request, if fulfilled from the cache, is considered a cache hit.
    1. In the case of a cache miss—when the content is either not in the edge or is not valid (for example, if the content is out of date)—the request is forwarded to the origin (3) (such as an Amazon Simple Storage Service (Amazon S3) bucket) for a new copy of the object.
    2. In the case of a cache hit, the origin cannot apply validation logic to the user’s request, because the edge server doesn’t need to contact the origin to fulfil the user’s request.
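The cache-hit problem described above can be illustrated with a small simulation. This is a simplified sketch, not a model of CloudFront internals: the Origin and EdgeCache classes are hypothetical, and the cache is assumed to be keyed by path only.

```python
class Origin:
    """Origin server that validates the Referer header itself."""

    def __init__(self, allowed_referer):
        self.allowed_referer = allowed_referer
        self.requests_seen = 0

    def fetch(self, path, referer):
        self.requests_seen += 1
        if referer != self.allowed_referer:
            return 403, None
        return 200, f"content of {path}"


class EdgeCache:
    """Edge location that caches successful responses by path only (assumption)."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, path, referer):
        if path in self.cache:
            # Cache hit: the origin is never contacted, so its check is skipped.
            return 200, self.cache[path]
        # Cache miss: the request is forwarded and validated at the origin.
        status, body = self.origin.fetch(path, referer)
        if status == 200:
            self.cache[path] = body
        return status, body


origin = Origin(allowed_referer="example.com")
edge = EdgeCache(origin)
first = edge.get("/favicon.ico", "example.com")   # miss, validated at the origin
second = edge.get("/favicon.ico", "example.net")  # hit, origin validation bypassed
```

In this model the second request succeeds even though its Referer would have been rejected by the origin, which is why the check has to happen at the CDN.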

In the next section, I show you how to inspect the client-request headers using AWS WAF to allow or block requests at the CDN.

Solution implementation—two approaches

This post includes two ways to set up AWS WAF to help prevent hotlinking:

  • Using a separate subdomain: Static files (such as images or styling components) to be protected are moved to a separate subdomain such as static.example.com so that you only need to validate the Referer header.
  • Using the same domain: Static files are located under a directory on the same domain. This solution includes how to extend this example to check for an empty Referer header.

The choice of approach will depend on how your site is structured and the level of protection you want to implement. The first approach enables you to set up a Referer header check to make sure that requests for the images only come from an allowlisted sub-domain, while the second approach has an additional check for an empty Referer header. The second approach extends the first approach and allows for some flexibility for users to share direct links to the image while still preventing unaffiliated third-party sites from embedding the image links on their websites.

Terms

The following list includes key terms used in this post:

  • AWS WAF configurations consist of a web access control list (web ACL), associated with a given CloudFront distribution.
  • Each web ACL is a collection of one or more rules, and each rule can have one or more match conditions.
  • Match conditions are made up of one or more filters, which inspect components of the request (such as its headers or URI) to match for certain conditions.
  • Case-sensitivity: HTTP header names are case-insensitive. Referer and referer point to the same HTTP header. HTTP header values, however, are case-sensitive.

Prerequisites

You must have a CloudFront distribution set up before configuring an AWS WAF web ACL. For information about how to set up a CloudFront distribution with an S3 bucket as an origin, see Configure distributions.

Approach 1: A separate subdomain

In this example, you create an AWS WAF rule set that contains a single rule with a single match condition, which in turn has a single filter. The match condition checks the Referer header and verifies that it contains a given value. If the request matches the condition specified in the rule, the traffic is allowed. Otherwise, the AWS WAF rule blocks the traffic.

For this example, because all the static files are on a separate subdomain (static.example.com) accessed only from the site example.com, you will block hotlinking for any request that doesn’t have a Referer header value of example.com.

Use the following steps to set this up using the AWS WAF console.

Step 1: Create and name a new web ACL

  1. Sign in to the AWS WAF console.
  2. If you have not created a web ACL before, choose Create web ACL on the AWS WAF console landing page.
  3. Because you want to associate the web ACL with a CloudFront distribution, select Amazon CloudFront distributions as the Resource type.
    1. Enter a Name for the web ACL that you’re creating. For this example, I used the name sample-webacl. The page will automatically populate an associated Amazon CloudWatch metric name. CloudWatch is a monitoring service that allows you to gather and report on metrics of various services. This CloudWatch metric can be used later to report on how your newly created AWS WAF configuration is being used.
    2. After you have supplied the name of the web ACL, you can select the available AWS resources to be protected by this web ACL. In this example, you will fill that in later, so leave this field blank for now.
    3. By default, AWS WAF can inspect up to 16 KB of the web request body with additional values of 32, 48, and 64 KB for an additional cost. Leave the web request Body size limit at the default value of 16 KB.
    4. Choose Next.
    Figure 2: Describing the web ACL and associating it to resources

Step 2: Create a string match condition on Referer header

AWS WAF ACLs can use AWS managed rule groups, rule groups from AWS Marketplace providers, or you can write your own rules and rule groups. For this example, you will create your own rules and rule groups.

  1. In the AWS WAF console, choose Add rules, and select Add my own rules and rule groups to create the string match condition.
    Figure 3: Add rules and rule groups

  2. This will bring you to the Rule visual editor page. The default Rule type will be set to Rule builder which you can leave unchanged. In the Rule builder section, select Regular rule.
    Figure 4: Rule type and Rule builder

  3. The next step is to construct a string match condition to match on the Referer header. Under Name, enter a name for the rule, such as Referer-check. Make sure that If a request is set to doesn’t match the statement (NOT). The string match condition is a negative match, which means that if the Referer header field value does not match the value specified in the rule, the request is blocked. This makes sure that only requests for static.example.com that originate from example.com are allowed. In the Statement section, use the following settings:
    1. Inspect: Select Single header.
    2. Header field name: Enter referer as the value.
    3. Match type: Select Exactly matches string.
    4. String to match: Enter example.com as the value.
    5. Text transformation: Select Lowercase. This isn’t required for most modern browsers, but is a good practice because HTTP header field values are case sensitive.
    Figure 5: Rule name and statement

  4. In the Action section, select Block as the Action. Choose Add Rule.
    Figure 6: Rule Action

    In the preceding rule statement, you’re configuring AWS WAF to inspect a header with the name Referer and checking if the value of the header matches the static string example.com. If the value of the Referer header is not example.com, then the request is blocked.

  5. The next page is Add rules and rule groups. It shows the following attributes of the web ACL:
    1. AWS WAF rules that have been added to the web ACL.
    2. Web ACL capacity units (WCUs).
    3. Default web ACL action.
    4. Token domain list.
    5. Because you’re only adding one rule to this web ACL, choose Next.
      Figure 7: Rules and rule groups, WCUs, and default web ACL action

  6. On the next page, you will set the rule priority. Because you added only one rule, you will not need to adjust the rule priority. If there is more than one rule, you can select a particular rule and use the Move up or Move down options to organize the rule order. Choose Next.
    Figure 8: Set rule priority

  7. The Configure metrics page details can be left at the default values. Choose Next to proceed to the final step.
    Figure 9: Configure metrics

  8. The final step is to review the web ACL details. If you need to change one of the settings of the web ACL, you can choose Edit step for the corresponding step. Choose Create web ACL to finalize creating the AWS WAF web ACL.
    Figure 10: Review and create web ACL
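The Referer-check rule built in the preceding steps can be sketched in the JSON shape that AWS WAF (WAFv2) uses for rules, written here as a Python dictionary. The Priority and VisibilityConfig values are assumptions for illustration, and is_blocked is a hypothetical local approximation of how the rule behaves, not a call into AWS WAF.

```python
referer_check_rule = {
    "Name": "Referer-check",
    "Priority": 0,  # assumed: the only rule in the web ACL
    "Statement": {
        "NotStatement": {
            "Statement": {
                "ByteMatchStatement": {
                    "SearchString": "example.com",
                    "FieldToMatch": {"SingleHeader": {"Name": "referer"}},
                    "TextTransformations": [{"Priority": 0, "Type": "LOWERCASE"}],
                    "PositionalConstraint": "EXACTLY",
                }
            }
        }
    },
    "Action": {"Block": {}},
    "VisibilityConfig": {  # assumed values
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "Referer-check",
    },
}


def is_blocked(headers):
    """Approximate the rule locally: block unless Referer equals example.com."""
    inner = referer_check_rule["Statement"]["NotStatement"]["Statement"]
    byte_match = inner["ByteMatchStatement"]
    name = byte_match["FieldToMatch"]["SingleHeader"]["Name"]
    # Header names are case-insensitive, so normalize them before the lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    value = normalized.get(name, "")
    # The LOWERCASE transformation is applied to the value before comparing.
    return value.lower() != byte_match["SearchString"]
```

Under this approximation, a request with no Referer header is also blocked, which matches the note that Approach 1 requires an allowlisted Referer header.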

Step 3: Associate the new rule with the relevant CloudFront distribution

You can now associate AWS resources with the web ACL that you created in the previous steps. In this case, the AWS resource is a CloudFront distribution.

  1. In the AWS WAF console, choose Web ACLs in the navigation pane. Select the web ACL named sample-webacl that you created previously.
    Figure 11: Select a Web ACL to configure

  2. Choose Add AWS resources.
    Figure 12: Add AWS resources

  3. Eligible AWS resources will be displayed in the pop-up page. Select the CloudFront distribution from the Resources list. Choose Add to associate the ACL sample-webacl with the CloudFront distribution.
    Figure 13: Select CloudFront distribution to associate with sample-webacl

  4. The next page is the Web ACLs page, which will show the CloudFront distribution selected in the previous step in the Associated AWS resources section.
    Figure 14: Web ACLs and Associated AWS resources

Test the referer check rule

You’re ready to test the web ACL that you created by issuing a cURL command from the command line and confirming that the referer check is matched correctly. When you request files without the allowlisted Referer header, the requests are blocked at the CDN. However, valid requests still are allowed through.

When a third party embeds your content (request blocked at the CDN)

» curl -H "Referer: example.net" -I https://static.example.com/favicon.ico
« HTTP/1.1 403 Forbidden

When you embed your content (request allowed through the CDN)

» curl -H "Referer: example.com" -I https://static.example.com/favicon.ico
« HTTP/1.1 200 OK

Note: With Approach 1, you must make the request with an allowlisted Referer header. In this example, all paths are filtered.
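The same check can be scripted instead of run by hand. The following is a minimal sketch using Python's standard urllib.request module; build_head_request is a hypothetical helper name, and the request is only constructed here, not sent.

```python
import urllib.request


def build_head_request(url, referer=None):
    """Create (but do not send) a HEAD request, optionally with a Referer header."""
    headers = {"Referer": referer} if referer is not None else {}
    return urllib.request.Request(url, headers=headers, method="HEAD")


req = build_head_request("https://static.example.com/favicon.ico", "example.net")

# To send it against a real distribution (not run here):
#   urllib.request.urlopen(req)
# A blocked request raises urllib.error.HTTPError with code 403.
```

A HEAD request mirrors the -I option in the cURL commands above, so only the response status and headers are retrieved.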

Approach 2: All content under the same domain, with filtering by path

In the second approach, you allow a blank Referer header and filter by a given URL path. To do this, you will create an AWS WAF web ACL that contains multiple rules with additional match conditions, which in turn are composed of multiple filters. As with the first approach, the match condition looks at the Referer header; however, you will validate the header in two ways. First, you validate whether the request contains the expected header, and if it does not, you apply the second validation, which checks whether it has a URL-style Referer header. This enables you to access the assets directly in a browser when the assets aren’t embedded elsewhere in a website but still provides protection against hotlinking.

Accessing an image directly in the browser can be useful when users want to share the link to the image directly, which helps prevent a negative user experience when sharing the image link with other users. This makes the second approach an improvement over the first, where requests for the images must carry the allowlisted Referer header.

You will also validate the path used in the request (in this example /wp-content), which allows AWS WAF to protect individual folders under a single domain name.

Step 1: Decide what to protect

Unlike the first approach, which filters everything under a domain, you will filter based on the path, in this case /wp-content. This allows you to protect your uploaded content that sits under /wp-content without having to put the content into a separate subdomain.

Step 2: Create and name a new web ACL

You can use the web ACL that you created for Approach 1, or you can repeat Step 1 of Approach 1 to create a new web ACL.

Step 3: Create string match conditions on the referer

For Approach 2, the assumption is that everything exists under a single domain, so instead of using the catch-all example.com, use the more secure https://example.com/ and set the match type to Starts with string, with https://example.com as the string to match.

Because you’re explicitly filtering on one header, you must watch out for two things:

  • Switching between www.example.com and example.com in your application.
  • Switching between https:// and http:// in your application.

If either of these switches occurs, you will see a 403 Forbidden error returned instead of your embedded files. In this example, all content is delivered directly through https://example.com/.

For this example, you will construct two rules, each of which will contain multiple string match conditions. AWS WAF allows you to combine match conditions within a rule so that you can create nested logic statements. For example, an AND rule statement evaluates to true only if all the statements nested within it evaluate to true.

First rule: Validate a Referer header:

For this rule, you will set the following match conditions and AWS WAF actions:

Rule name: Validate-Referer-header

If Referer header value starts with https://example.com

AND

If URI path starts with /wp-content

THEN

ALLOW request

  1. Open the AWS Management Console for AWS WAF and navigate to WAF & Shield.
  2. Choose Web ACLs in the navigation pane and select Global (CloudFront) as the AWS Region.
    Figure 15: Web ACLs and AWS Regions

  3. The page will refresh to show the Web ACL sample-webacl that you created in the preceding Step 2. Select sample-webacl.
    Figure 16: Web ACLs list

  4. Select the Rules tab.
    Figure 17: Web ACL rules

  5. Choose Add rules and select Add my own rules and rule groups. If you’re reusing the web ACL created in Approach 1, delete the Referer-check rule before adding new rules.
    Figure 18: Add rules and rule groups

  6. For Rule type, select Rule builder.
    Figure 19: Rule type

  7. In the Rule section, use the following settings:
    1. Name: Enter Validate-Referer-header as the value.
    2. Type: Select Regular rule.
    3. If a request: Select matches all the statements (AND).
    Figure 20: Rule name and match condition

  8. In the Statement 1 section, use the following settings:
    1. Inspect: Select Single header.
    2. Header field name: Enter referer as the value.
    3. Match type: Select Starts with string.
    4. String to match: Enter https://example.com as the value.
    5. Text transformation: Select Lowercase.
    Figure 21: First string match condition

  9. Create the second string match condition (Statement 2). For the URL itself, you want to protect content under /wp-content, so you will create a string match to validate that case using the same steps as for the first string match condition, with two changes:
    1. For Inspect, select URI path.
    2. For String to match, enter /wp-content as the value.
    Figure 22: Second string match condition

  10. Change the Action to Allow and choose Add Rule at the bottom of the page.
    Figure 23: Set the Action to Allow

  11. In the Set rule priority page, choose Save.
    Figure 24: Save the rule
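The first rule configured in the steps above combines two string match conditions with AND. A sketch of that rule in the WAFv2 JSON rule shape, written as a Python dictionary, could look like the following; the starts_with helper is hypothetical, and the Priority and VisibilityConfig values are assumptions for illustration.

```python
def starts_with(field_to_match, search_string):
    """Build a STARTS_WITH byte match statement with a lowercase transformation."""
    return {
        "ByteMatchStatement": {
            "SearchString": search_string,
            "FieldToMatch": field_to_match,
            "TextTransformations": [{"Priority": 0, "Type": "LOWERCASE"}],
            "PositionalConstraint": "STARTS_WITH",
        }
    }


validate_referer_rule = {
    "Name": "Validate-Referer-header",
    "Priority": 0,  # assumed: evaluated before the second rule
    "Statement": {
        "AndStatement": {
            "Statements": [
                starts_with({"SingleHeader": {"Name": "referer"}}, "https://example.com"),
                starts_with({"UriPath": {}}, "/wp-content"),
            ]
        }
    },
    "Action": {"Allow": {}},
    "VisibilityConfig": {  # assumed values
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "Validate-Referer-header",
    },
}
```

Both nested statements share the same match type and text transformation, which reflects the console steps where the second condition reuses the first with only the inspected field and the string changed.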

Second rule: Validate without a Referer header

For the second rule, you will set the following match conditions and rule actions:

Rule name: Validate-with-no-Referer-header

If Referer header contains ://

AND

If URI path starts with /wp-content

THEN

BLOCK request

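The second rule is similar to the first rule, but it matches when the Referer header value includes ://. You use this match condition to check whether the Referer header has been set at all. If it has, you block the request. Requests that carry your own site’s Referer are allowed by the first rule, provided that it is evaluated first, so only requests with other Referer values reach this rule.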

  1. In the Web ACL page, choose Add rules and select Add my own rules and rule groups to be taken to the Rule type page.
    Figure 25: Create the second rule

  2. For Rule type and Rule builder, use the following settings:
    1. Rule type: Select Rule builder.
    2. Name: Enter Validate-with-no-Referer-header as the value.
    3. Type: Select Regular rule.
    4. If a request: Select matches all the statements (AND).
    Figure 26: Set the rule type and matching

  3. For Statement 1, use the following settings:
    1. Inspect: Select Single header.
    2. Header field name: Enter Referer as the value.
    3. Match type: Select Contains string.
    4. String to match: Enter ://
    Figure 27: Configure Statement 1

  4. For Statement 2, use the following settings:
    1. Inspect: Select URI path.
    2. Match type: Select Starts with string.
    3. String to match: Enter /wp-content as the value.
    Figure 28: Configure Statement 2

  5. For Action, keep the default setting of Block and choose Add Rule.
    Figure 29: Add rule

  6. The resulting Set rule priority page will list the rules in the sample-webacl web ACL and will look like the following figure. It shows the name of each rule, the rule priority, the web ACL capacity units (WCUs) used, and the AWS WAF action. Choose Save.
    Figure 30: Rule priority and web ACL capacity units used by the web ACL

The Rules tab will now show both of the rules that you added with their corresponding AWS WAF actions in addition to the default action of Allow for requests that don’t match one of the rules.

Figure 31: Rules tab of sample-webacl web ACL
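Taken together, the two rules and the default action behave roughly like the following function. This is a local approximation for reasoning about the logic, not AWS WAF itself: the function name evaluate is hypothetical, the Allow rule is assumed to be evaluated first, and the path and Referer are both lowercased for simplicity.

```python
def evaluate(path, referer=None):
    """Return the action a request would receive under the two rules above."""
    path = path.lower()
    referer = (referer or "").lower()
    in_scope = path.startswith("/wp-content")

    # Rule 1 (assumed to be evaluated first): allow embeds from the site itself.
    if referer.startswith("https://example.com") and in_scope:
        return "ALLOW"
    # Rule 2: block any other URL-style Referer for protected paths.
    if "://" in referer and in_scope:
        return "BLOCK"
    # Default web ACL action for requests that match neither rule.
    return "ALLOW"


image = "/wp-content/uploads/2013/03/shareable-image.jpg"
for referer in ("https://example.net/", None, "https://example.com/"):
    print(referer, "->", evaluate(image, referer))
```

One consequence of a prefix match on https://example.com without a trailing slash is that a Referer such as https://example.com.attacker.test/ would also satisfy the first rule in this model; matching on https://example.com/ is stricter.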

Step 4: Associate the new rules with the relevant CloudFront distribution

  1. Select the Associated AWS Resources tab and choose Add AWS resources.
    Figure 32: Add AWS resources

  2. Select the relevant CloudFront distribution and choose Add.
    Figure 33: Select the CloudFront distribution

  3. The web ACLs page will show the CloudFront distribution in the Associated AWS resources tab.
    Figure 34: Associated AWS resources

Test the rules

Similar to Approach 1, you have filtering at the CDN, but this time the filtering is based on the path and direct linking is allowed (without a Referer header).

You can use cURL to verify that the new AWS WAF web ACL correctly protects your content. Use the -H argument to send a different Referer header to the CloudFront distribution, which allows you to test as if you are embedding the website content in an unauthorized page.

When a third party embeds your content

» curl -H "Referer: https://example.net/" -I https://example.com/wp-content/uploads/2013/03/shareable-image.jpg
« HTTP/1.1 403 Forbidden

When your content is directly linked (with no Referer)

» curl -I https://example.com/wp-content/uploads/2013/03/shareable-image.jpg
« HTTP/1.1 200 OK

When you embed your content

» curl -H "Referer: https://example.com/" -I https://example.com/wp-content/uploads/2013/03/shareable-image.jpg
« HTTP/1.1 200 OK

Conclusion

AWS WAF is a web application firewall that lets you monitor and control the HTTP(S) requests that are forwarded to your protected web application resources. In this post, you saw how to use the AWS WAF custom rule builder feature to help prevent hotlinking of your website’s content hosted in an Amazon S3 bucket.

The two approaches demonstrated in this post provide you with ways to implement a robust referer check solution that helps prevent unauthorized third-party websites from linking back to static assets on your website, thus helping to prevent increased bandwidth costs, bad user experience, and degraded performance because of resource leeching. Following the concept of least privilege, you can further restrict the AWS WAF rules to apply only to certain image file extensions (such as .jpg or .png).

While referer checking helps prevent unaffiliated sites from backlinking to your site’s images and benefitting by using your site’s bandwidth, more sophisticated exploits can carefully craft a request to bypass the referer check. Other web request mechanisms, such as web browser plugins, server-to-server requests that forge referer header values, or privacy-based web browsers may also cause inconsistencies in accurately evaluating the referer header value. Be aware of such inconsistencies and consider using additional private content mechanisms such as signed URLs and token authentication.

Web browsers don’t have a mechanism to validate if a Referer header has been tampered with. Referer checking should be implemented as part of a broader web application security strategy by using AWS WAF application protection rules, Bot Control, Fraud Control, and Distributed Denial of Service (DDoS) protection. Effective web traffic monitoring using AWS WAF logs, Amazon CloudWatch metrics, and web ACL traffic dashboards will help ensure that bad actors aren’t bypassing the AWS WAF rules that you have set up to protect your web traffic.

You can use AWS WAF to build on top of the referer check to implement more advanced content protection solutions such as rate-limiting, bot mitigation, and DDoS mitigations to further secure your website against a wide range of exploits.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this solution or its implementation, start a new thread on the AWS WAF forum.

Alex Smith was the original author of this post in 2016.

Sanchith Kandaka

With over 15 years of experience in the Content Delivery and Application Security space, Sanchith is excited about all things edge related. He has worked as a Solutions Architect and a Solutions Engineer and is now a Specialist Solutions Architect at AWS focused on AWS Edge Services and Perimeter Protection services including Amazon CloudFront, AWS WAF, and AWS Shield.
