Tag Archives: Uncategorized

iPhone Malware that Operates Even When the Phone Is Turned Off

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/05/iphone-malware-that-operates-even-when-the-phone-is-turned-off.html

Researchers have demonstrated iPhone malware that works even when the phone is fully shut down.

It turns out that the iPhone’s Bluetooth chip, which is key to making features like Find My work, has no mechanism for digitally signing or even encrypting the firmware it runs. Academics at Germany’s Technical University of Darmstadt figured out how to exploit this lack of hardening to run malicious firmware that allows the attacker to track the phone’s location or run new features when the device is turned off.

[…]

The research is the first — or at least among the first — to study the risk posed by chips running in low-power mode. Not to be confused with iOS’s low-power mode for conserving battery life, the low-power mode (LPM) in this research allows chips responsible for near-field communication, ultra wideband, and Bluetooth to run in a special mode that can remain on for 24 hours after a device is turned off.

The research is fascinating, but the attack isn’t really feasible. It requires a jailbroken phone, which is hard to pull off in an adversarial setting.

Slashdot thread.

Attacks on Managed Service Providers Expected to Increase

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/05/attacks-on-managed-service-providers-expected-to-increase.html

CISA, NSA, FBI, and similar organizations in the other Five Eyes countries are warning that attacks on MSPs — as a vector to their customers — are likely to increase. No details about what this prediction is based on. Makes sense, though. The SolarWinds attack was incredibly successful for the Russian SVR, and a blueprint for future attacks.

News articles.

AWS Week in Review – May 16, 2022

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/aws-week-in-review-may-16-2022/

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

I have been on the road for the last five weeks and attended many of the AWS Summits in Europe. It was great to talk to so many of you in person. The Serverless Developer Advocates are touring many of the AWS Summits with the Serverlesspresso booth. If you attend an event that has the booth, say “Hi 👋” to my colleagues, and have a coffee while asking all your serverless questions. You can find all the upcoming AWS Summits in the events section at the end of this post.

Last week’s launches
Here are some launches that got my attention during the previous week.

AWS Step Functions announced a new console experience to debug your state machine executions – Now you can opt in to the new console experience of Step Functions, which makes it easier to analyze, debug, and optimize Standard Workflows. The new page lets you inspect executions using three different views (graph, table, and event view) and adds many new features that improve navigation and analysis of the executions. To learn about all the features and how to use them, read Ben’s blog post.

Example on how the Graph View looks

AWS Lambda now supports Node.js 16.x runtime – Now you can start using the Node.js 16 runtime when you create a new function or update your existing functions to use it. You can also use the new container image base that supports this runtime. To learn more about this launch, check Dan’s blog post.

AWS Amplify announces its Android library designed for Kotlin – The Amplify Android library has been rewritten for Kotlin and is now available in preview. The new library provides better debugging capabilities and visibility into underlying state management. It also uses the new AWS SDK for Kotlin, which was released in preview last year. Read the What’s New post for more information.

Three new APIs for batch data retrieval in AWS IoT SiteWise – With this new launch AWS IoT SiteWise now supports batch data retrieval from multiple asset properties. The new APIs allow you to retrieve current values, historical values, and aggregated values. Read the What’s New post to learn how you can start using the new APIs.

AWS Secrets Manager now publishes secret usage metrics to Amazon CloudWatch – This launch is very useful to see the number of secrets in your account and set alarms for any unexpected increase or decrease in the number of secrets. Read the documentation on Monitoring Secrets Manager with Amazon CloudWatch for more information.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Some other launches and news that you may have missed:

IBM signed a deal with AWS to offer its software portfolio as a service on AWS. This allows customers using AWS to access IBM software for automation, data and artificial intelligence, and security that is built on Red Hat OpenShift Service on AWS.

Podcast Charlas Técnicas de AWS – If you understand Spanish, this podcast is for you. Podcast Charlas Técnicas is one of the official AWS podcasts in Spanish. This week’s episode introduces you to Amazon DynamoDB and shares stories on how different customers use this database service. You can listen to all the episodes directly from your favorite podcast app or the podcast web page.

AWS Open Source News and Updates – Ricardo Sueiras, my colleague from the AWS Developer Relations team, runs this newsletter. It brings you all the latest open-source projects, posts, and more. Read edition #112 here.

Upcoming AWS Events
It’s AWS Summits season and here are some virtual and in-person events that might be close to you:

You can register for re:MARS to get fresh ideas on topics such as machine learning, automation, robotics, and space. The conference will be in person in Las Vegas, June 21–24.

That’s all for this week. Check back next Monday for another Week in Review!

— Marcia

The NSA Says that There are No Known Flaws in NIST’s Quantum-Resistant Algorithms

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/05/the-nsa-says-that-there-are-no-known-flaws-in-nists-quantum-resistant-algorithms.html

Rob Joyce, the director of cybersecurity at the NSA, said so in an interview:

The NSA already has classified quantum-resistant algorithms of its own that it developed over many years, said Joyce. But it didn’t enter any of its own in the contest. The agency’s mathematicians, however, worked with NIST to support the process, trying to crack the algorithms in order to test their merit.

“Those candidate algorithms that NIST is running the competitions on all appear strong, secure, and what we need for quantum resistance,” Joyce said. “We’ve worked against all of them to make sure they are solid.”

The purpose of the open, public international scrutiny of the separate NIST algorithms is “to build trust and confidence,” he said.

I believe him. This is what the NSA did with NIST’s candidate algorithms for AES and then for SHA-3. NIST’s Post-Quantum Cryptography Standardization Process looks good.

I still worry about the long-term security of the submissions, though. In 2018, in an essay titled “Cryptography After the Aliens Land,” I wrote:

…there is always the possibility that those algorithms will fall to aliens with better quantum techniques. I am less worried about symmetric cryptography, where Grover’s algorithm is basically an upper limit on quantum improvements, than I am about public-key algorithms based on number theory, which feel more fragile. It’s possible that quantum computers will someday break all of them, even those that today are quantum resistant.

It took us a couple of decades to fully understand von Neumann computer architecture. I’m sure it will take years of working with a functional quantum computer to fully understand the limits of that architecture. And some things that we think of as computationally hard today will turn out not to be.

On Russian and Ukrainian Artillery… and Not Only That

Post Syndicated from original http://www.gatchev.info/blog/?p=2443

I was told something that got me thinking.

In the war in Ukraine, Russian artillery operates in the classic World War II manner. The battery deploys, prepares to fire, and waits for orders from the battalion commander about what to shoot at. It has no say in where to deploy, whom to fire on, and so on: it does what it is ordered to do. Without an order it only defends itself if attacked. (If even that has not been forbidden.)

Ukrainian artillery is scattered into small units, often just a single gun. Each has a (fairly large) area in which to deploy, and the crews choose for themselves exactly where to set up, when and where to move, and so on. There are almost no central orders. Instead there is a smartphone app. Drone operators enter the coordinates of a spotted target, gun crews enter their own position, and the app instantly gives them the laying data needed to hit that target with high accuracy. Whichever guns have the target in range fire on it, according to its importance. Decentralized fire control.

As a result, Ukrainian artillery often even manages to surpass the Russian in effectiveness and results. This is despite its being several times smaller and, apart from the roughly 10% of its guns supplied by the West, older and feebler than the Russian artillery. The destruction of the Russian army during its attempts to establish a bridgehead across the Siverskyi Donets demonstrates this extremely convincingly… At the same time it is far less vulnerable than the Russian artillery: a single gun is much easier to move and hide, much harder to hit with standard artillery, and so on.

So are the Russians so backward that they cannot do the same? Don’t they have capable programmers?

Oh, they do, and how! 98% of botnets are built and controlled by Russians. True, not by random cybercrooks but by GRU Unit 26165, the Russian army’s hackers. But that makes them not less but more available for military purposes. And cybercrime earns and pays well and easily hires quality talent, so it takes exceptional skill to lead in that field. Evidently the GRU’s hackers do.

Then why don’t they do it?

Because such an app would violate principle number one of the Russian army: it exists to carry out orders. In it, personal initiative is encouraged only when it serves the execution of a specific order, and even then not always. In all other cases it is praised in words and punished in deeds.

And that stems from one fundamental fact. Under the “capitalist” paint, Russia remains a feudal state.

For example, profitable businesses there are legally the property of their respective oligarchs, but in reality they are fiefs granted for loyal service. The moment the “seigneur” decides that someone else would serve him better, the fief passes into that person’s “ownership” for an undisclosed sum (as a rule, zero). If the current “owner” does not understand the system, for example by imagining that the business is truly his property, whatever it takes to prove him wrong happens to him (Khodorkovsky, anyone?). The same goes for all other kinds of “holdings”: high posts, privileges, and so on.

The size and importance of a “holding” is determined by exactly one thing: the degree to which its “owner” is above the law. This does not always match the usual notions. For example, a typical army general commands at least a brigade, several thousand soldiers and a huge amount of military hardware, while a lieutenant from the “services” often has not a single subordinate, and his only weapon is a pistol. Yet if the lieutenant orders the general to guard his car, the general will comply on the spot, even though the lieutenant is officially far below him in the hierarchy. The “services,” being in reality a political police, are simply a noble caste unconditionally above all others, because the regime trusts them more than anyone else. Accordingly, the idea that you could stand up to them over something, or sue them, would make an ordinary Russian laugh.

That is why unquestioning obedience in the army is of supreme importance to Russia. Because of its heavy weaponry, the army is a more powerful force institution even than the “services,” yet it is far less trusted by the feudal elite. If a culture of initiative were allowed in it, the elite would face a threat more frightening than any external enemy. For every feudal elite, the most frightening enemy is the common people it tramples, and the army is recruited mostly from them… That is why this kind of flexibility based on initiative will not be permitted in the Russian army, even at the risk of losing the war.

In short, Russia is feudalism with a coat of capitalist paint on top, thin to the point of transparency. Ukraine, by contrast, is a rather imperfect democracy, even one with feudal remnants, but still more a democracy than not. That is why it can afford to give its fighters freedom of initiative: it can count on their loyalty. Russia cannot dare to count on it.

And freedom of initiative and flexibility in carrying out tasks are huge advantages in modern war. In theory, its most effective achievement, the integration of weapon types and service branches, is purely a command task. In practice, however, much like the economy, the modern military situation is too complex and dynamic to be adequately handled by a centralized command system. Delegating initiative and decentralizing action therefore bring a huge benefit. In 1982, Israel managed to rout Arab armies many times more numerous mainly because it allowed initiative and decentralized action. The fact that the Israeli army was armed with Western weapons and the Arab armies with Russian ones was also a factor, but a less significant one.

That is why Ukraine, despite having several times fewer human and military resources than Russia, has an excellent chance of winning this war. It is in essence a war of capitalism against feudalism. We have seen it more than once, on the military front, the economic front and the social front. We know how it ends.

It is also worth thinking about one more thing. In 2014, Ukraine managed to remove from power some of the Russian agents there. Few, but still. As a result, in just 8 years quite a few Ukrainians reached the point of driving cars that look luxurious by Bulgarian standards. And the envious Bay Ganyo types spit on them for it, instead of asking themselves why they do not drive such cars, and what it would take to get that chance one day.

Actually, not so that they get it, but so that all of us in Bulgaria get it… They will not grasp this. If they were capable of it, they would not be envious Bay Ganyo types. But perhaps we, who consider ourselves outside that group, would do well to act on the matter.

Upcoming Speaking Engagements

Post Syndicated from Schneier.com Webmaster original https://www.schneier.com/blog/archives/2022/05/upcoming-speaking-engagements-19.html

This is a current list of where and when I am scheduled to speak:

  • I’m speaking on “Securing a World of Physically Capable Computers” at OWASP Belgium’s chapter meeting in Antwerp, Belgium, on May 17, 2022.
  • I’m speaking at Future Summits in Antwerp, Belgium, on May 18, 2022.
  • I’m speaking at IT-S Now 2022 in Vienna, Austria, on June 2, 2022.
  • I’m speaking at the 14th International Conference on Cyber Conflict, CyCon 2022, in Tallinn, Estonia, on June 3, 2022.
  • I’m speaking at the RSA Conference 2022 in San Francisco, June 6-9, 2022.
  • I’m speaking at the Dublin Tech Summit in Dublin, Ireland, June 15-16, 2022.

The list is maintained on this page.

A new Spark plugin for CPU and memory profiling

Post Syndicated from Bo Xiong original https://aws.amazon.com/blogs/devops/a-new-spark-plugin-for-cpu-and-memory-profiling/

Introduction

Have you ever wondered if there are low-hanging optimization opportunities to improve the performance of a Spark app? Profiling gives you visibility into the runtime characteristics of the Spark app so that you can identify its bottlenecks and inefficiencies. We’re excited to announce the release of a new Spark plugin that enables profiling for JVM-based Spark apps via Amazon CodeGuru. The plugin is open sourced on GitHub and published to Maven.

Walkthrough

This post shows how you can onboard this plugin with two steps in under 10 minutes.

  • Step 1: Create a profiling group in Amazon CodeGuru Profiler and grant permission to your Amazon EMR on EC2 role, so that profiler agents can emit metrics to CodeGuru. Detailed instructions can be found here.
  • Step 2: Reference codeguru-profiler-for-spark when submitting your Spark job, along with PROFILING_CONTEXT and ENABLE_AMAZON_PROFILER defined.

Prerequisites

Your app is built against Spark 3 and run on Amazon EMR release 6.x or newer. It doesn’t matter if you’re using Amazon EMR on Amazon Elastic Compute Cloud (Amazon EC2) or on Amazon Elastic Kubernetes Service (Amazon EKS).

Illustrative Example

For the purposes of illustration, consider the following example where profiling results are collected by the plugin and emitted to the “CodeGuru-Spark-Demo” profiling group.

spark-submit \
--master yarn \
--deploy-mode cluster \
--class <your-main-class> \
--packages software.amazon.profiler:codeguru-profiler-for-spark:1.0 \
--conf spark.plugins=software.amazon.profiler.AmazonProfilerPlugin \
--conf spark.executorEnv.PROFILING_CONTEXT="{\\\"profilingGroupName\\\":\\\"CodeGuru-Spark-Demo\\\"}" \
--conf spark.executorEnv.ENABLE_AMAZON_PROFILER=true \
--conf spark.dynamicAllocation.enabled=false \
<path-to-your-spark-app-jar>

An alternative way to specify PROFILING_CONTEXT and ENABLE_AMAZON_PROFILER is under the yarn-env.export classification for instance groups in the Amazon EMR web console. Note that when PROFILING_CONTEXT is configured in the web console, every comma must be escaped as well, in addition to the quote escaping already used in the spark-submit command above.

[
  {
    "classification": "yarn-env",
    "properties": {},
    "configurations": [
      {
        "classification": "export",
        "properties": {
          "ENABLE_AMAZON_PROFILER": "true",
          "PROFILING_CONTEXT": "{\\\"profilingGroupName\\\":\\\"CodeGuru-Spark-Demo\\\"\\,\\\"driverEnabled\\\":\\\"true\\\"}"
        },
        "configurations": []
      }
    ]
  }
]
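
The escaping in the two snippets above is easy to get wrong by hand. The following Python sketch reproduces it mechanically; the function names are hypothetical helpers and not part of the plugin, and the escaping rules are inferred only from the literal strings shown in this post (three backslashes before each double quote, plus two backslashes before each comma for the web console).

```python
import json


def profiling_context(group_name, driver_enabled=False):
    """Return the compact JSON text for PROFILING_CONTEXT (hypothetical helper)."""
    context = {"profilingGroupName": group_name}
    if driver_enabled:
        # The post passes this flag as the string "true", not a JSON boolean.
        context["driverEnabled"] = "true"
    return json.dumps(context, separators=(",", ":"))


def escape_for_spark_submit(raw_json):
    """Prefix every double quote with three backslashes, as in the commands above."""
    return raw_json.replace('"', '\\\\\\"')


def escape_for_emr_console(raw_json):
    """Apply the spark-submit escaping, then prefix every comma with two backslashes."""
    return escape_for_spark_submit(raw_json).replace(",", "\\\\,")


if __name__ == "__main__":
    raw = profiling_context("CodeGuru-Spark-Demo", driver_enabled=True)
    print(escape_for_spark_submit(raw))  # value for --conf ...PROFILING_CONTEXT="..."
    print(escape_for_emr_console(raw))   # value for the yarn-env export property
```

Generating the value this way keeps the profiling group name in one place and avoids a mismatch between the executor and driver settings.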

Once the job above is launched on Amazon EMR, profiling results should show up in your CodeGuru web console in about 10 minutes, similar to the following screenshot. Internally, it has helped us identify issues such as thread contention (revealed by the BLOCKED state in the latency flame graph) and unnecessarily created AWS Java clients (revealed by the CPU Hotspots view).

Go to your profiling group under the Amazon CodeGuru web console. Click the “Visualize CPU” button to render a flame graph displaying CPU usage. Switch to the latency view to identify latency bottlenecks, and switch to the heap summary view to identify objects consuming most memory.

Troubleshooting

To help with troubleshooting, use a sample Spark app provided in the plugin to check if everything is set up correctly. Note that the profilingGroupName value specified in PROFILING_CONTEXT should match what’s created in CodeGuru.

spark-submit \
--master yarn \
--deploy-mode cluster \
--class software.amazon.profiler.SampleSparkApp \
--packages software.amazon.profiler:codeguru-profiler-for-spark:1.0 \
--conf spark.plugins=software.amazon.profiler.AmazonProfilerPlugin \
--conf spark.executorEnv.PROFILING_CONTEXT="{\\\"profilingGroupName\\\":\\\"CodeGuru-Spark-Demo\\\"}" \
--conf spark.executorEnv.ENABLE_AMAZON_PROFILER=true \
--conf spark.yarn.appMasterEnv.PROFILING_CONTEXT="{\\\"profilingGroupName\\\":\\\"CodeGuru-Spark-Demo\\\",\\\"driverEnabled\\\":\\\"true\\\"}" \
--conf spark.yarn.appMasterEnv.ENABLE_AMAZON_PROFILER=true \
--conf spark.dynamicAllocation.enabled=false \
/usr/lib/hadoop-yarn/hadoop-yarn-server-tests.jar

Running the command above from the master node of your EMR cluster should produce logs similar to the following:

21/11/21 21:27:21 INFO Profiler: Starting the profiler : ProfilerParameters{profilingGroupName='CodeGuru-Spark-Demo', threadSupport=BasicThreadSupport (default), excludedThreads=[Signal Dispatcher, Attach Listener], shouldProfile=true, integrationMode='', memoryUsageLimit=104857600, heapSummaryEnabled=true, stackDepthLimit=1000, samplingInterval=PT1S, reportingInterval=PT5M, addProfilerOverheadAsSamples=true, minimumTimeForReporting=PT1M, dontReportIfSampledLessThanTimes=1}
21/11/21 21:27:21 INFO ProfilingCommandExecutor: Profiling scheduled, sampling rate is PT1S
...
21/11/21 21:27:23 INFO ProfilingCommand: New agent configuration received : AgentConfiguration(AgentParameters={MaxStackDepth=1000, MinimumTimeForReportingInMilliseconds=60000, SamplingIntervalInMilliseconds=1000, MemoryUsageLimitPercent=10, ReportingIntervalInMilliseconds=300000}, PeriodInSeconds=300, ShouldProfile=true)
21/11/21 21:32:23 INFO ProfilingCommand: Attempting to report profile data: start=2021-11-21T21:27:23.227Z end=2021-11-21T21:32:22.765Z force=false memoryRefresh=false numberOfTimesSampled=300
21/11/21 21:32:23 INFO javaClass: [HeapSummary] Processed 20 events.
21/11/21 21:32:24 INFO ProfilingCommand: Successfully reported profile

Note that the CodeGuru Profiler agent uses a reporting interval of five minutes. Therefore, any executor process that runs for less than five minutes won’t be reflected in the profiling result. If the right profiling group is not specified, or it’s associated with the wrong EC2 role in CodeGuru, the log will show a message similar to “CodeGuruProfilerSDKClient: Exception while calling agent orchestration” along with a stack trace including a 403 status code. To rule out network issues (e.g., your EMR job running in a VPC without an outbound gateway, or a misconfigured outbound security group), you can remote into an EMR host and ping the CodeGuru endpoint in your Region (e.g., ping codeguru-profiler.us-east-1.amazonaws.com).
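
The checks described above can be scripted. The Python sketch below scans log lines for the two messages quoted in this post and, as an assumed alternative to ping (ICMP may be blocked even when HTTPS works), tries a TCP connection on port 443. The function names are hypothetical, and the endpoint pattern is generalized from the single us-east-1 example, so verify it for your Region.

```python
import socket

SUCCESS_MARKER = "Successfully reported profile"
FAILURE_MARKER = "Exception while calling agent orchestration"


def profiler_status(log_lines):
    """Classify a run as 'failed', 'reported', or 'pending' from its log lines."""
    status = "pending"
    for line in log_lines:
        if FAILURE_MARKER in line:
            return "failed"      # typically accompanied by a 403 in the stack trace
        if SUCCESS_MARKER in line:
            status = "reported"  # at least one profile reached CodeGuru
    return status


def profiler_endpoint(region):
    """Assumed endpoint pattern, based on the us-east-1 example in this post."""
    return f"codeguru-profiler.{region}.amazonaws.com"


def can_reach(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_reach(profiler_endpoint("us-east-1"))` run from an EMR host returns False when outbound traffic is blocked, and `profiler_status(open(path))` returns "pending" for a job that ended before the first five-minute report.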

Cleaning up

To avoid incurring future charges, you can delete the profiling group configured in CodeGuru and/or set the ENABLE_AMAZON_PROFILER environment variable to false.

Conclusion

In this post, we described how to onboard this plugin in two steps. Consider giving it a try for your Spark app. You can find the Maven artifacts here. If you have feature requests, bug reports, feedback of any kind, or would like to contribute, please head over to the GitHub repository.

Author:

Bo Xiong

Bo Xiong is a software engineer with Amazon Ads, leveraging big data technologies to process petabytes of data for billing and reporting. His main interests include performance tuning and optimization for Spark on Amazon EMR, and data mining for actionable business insights.

Surveillance by Driverless Car

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/05/surveillance-by-driverless-car.html

San Francisco police are using autonomous vehicles as mobile surveillance cameras.

Privacy advocates say the revelation that police are actively using AV footage is cause for alarm.

“This is very concerning,” Electronic Frontier Foundation (EFF) senior staff attorney Adam Schwartz told Motherboard. He said cars in general are troves of personal consumer data, but autonomous vehicles will have even more of that data from capturing the details of the world around them. “So when we see any police department identify AVs as a new source of evidence, that’s very concerning.”

Running hybrid Active Directory service with AWS Managed Microsoft Active Directory

Post Syndicated from Lewis Tang original https://aws.amazon.com/blogs/architecture/running-hybrid-active-directory-service-with-aws-managed-microsoft-active-directory/

Enterprise customers often need to architect a hybrid Active Directory solution to support running applications in both their existing on-premises corporate data centers and the AWS cloud. There are many reasons for this, such as maintaining integration with on-premises legacy applications, keeping control of infrastructure resources, and meeting specific industry compliance requirements.

To extend on-premises Active Directory environments to AWS, some customers choose to deploy Active Directory service on self-managed Amazon Elastic Compute Cloud (EC2) instances after setting up connectivity for both environments. This setup works fine, but it also presents management and operations challenges: you must operate the EC2 instances yourself and handle patching and backup for both the Windows operating system and the Active Directory service. This is where AWS Directory Service for Microsoft Active Directory (AWS Managed Microsoft AD) helps.

Benefits of using AWS Managed Microsoft AD

With AWS Managed Microsoft AD, you can launch an AWS-managed directory in the cloud, leveraging the scalability and high availability of an enterprise directory service while adding seamless integration into other AWS services.

In addition, you can still access AWS Managed Microsoft AD using existing administrative tools and techniques, such as delegating administrative permissions to select groups in your organization. The full list of permissions that can be delegated is described in the AWS Directory Service Administration Guide.

Active Directory service design consideration with a single AWS account

Single region

A single AWS account is where the journey begins: a simple use case might be when you need to deploy a new solution in the cloud from scratch (Figure 1).

Figure 1. A single AWS account and single-region model

In a single AWS account and single-region model, the on-premises Active Directory has the “company.com” domain configured in the on-premises data center. AWS Managed Microsoft AD is set up across two availability zones in the AWS region for high availability. It has a single domain, “na.company.com”, configured. The on-premises Active Directory is configured to trust the AWS Managed Microsoft AD, with network connectivity via AWS Direct Connect or VPN. Applications that are Active-Directory–aware and run on EC2 instances have joined the na.company.com domain, as have the selected AWS managed services (for example, Amazon Relational Database Service for SQL Server).

Multi-region

As your cloud footprint expands to more AWS regions, you also have two options for expanding AWS Managed Microsoft AD, depending on which edition of AWS Managed Microsoft AD is used (Figure 2):

  1. With AWS Managed Microsoft AD Enterprise Edition, you can turn on the multi-region replication feature to automatically configure inter-region network connectivity, deploy domain controllers, and replicate all the Active Directory data across multiple regions. This ensures that Active-Directory–aware workloads residing in those regions can connect to and use AWS Managed Microsoft AD with low latency and high performance.
  2. With AWS Managed Microsoft AD Standard Edition, you need to add a domain by creating an independent AWS Managed Microsoft AD directory in each region. In Figure 2, the “eu.company.com” domain is added, and AWS Transit Gateway routes traffic among Active-Directory–aware applications in the two AWS regions. The on-premises Active Directory is configured to trust the AWS Managed Microsoft AD over either Direct Connect or VPN.

Figure 2. A single AWS account and multi-region model

Active Directory Service Design consideration with multiple AWS accounts

Large organizations use multiple AWS accounts for administrative delegation and billing purposes. This is commonly implemented through AWS Control Tower service or AWS Control Tower landing zone solution.

Single region

You can share a single AWS Managed Microsoft AD with multiple AWS accounts within one AWS region. This capability makes it simpler and more cost-effective to manage Active-Directory–aware workloads from a single directory across accounts and Amazon Virtual Private Clouds (VPCs). This option also allows you to seamlessly join your EC2 instances for Windows to AWS Managed Microsoft AD.

As a best practice, place AWS Managed Microsoft AD in a separate AWS account with limited administrator access, and share the service with your other AWS accounts. After you share the service and configure routing, Active-Directory–aware applications, such as Microsoft SharePoint, can seamlessly join Active Directory Domain Services while you maintain control of all administrative tasks. Find more details on sharing AWS Managed Microsoft AD in the Share your AWS Managed AD directory tutorial.
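
As a rough illustration of the sharing step, the Python sketch below builds the arguments for the AWS Directory Service ShareDirectory API and sends one request per consumer account. It assumes a boto3 Directory Service client (`boto3.client("ds")`) is passed in; the function names and the directory and account IDs in the usage note are hypothetical, and the HANDSHAKE share method is one possible choice, not a recommendation from this post.

```python
def share_directory_request(directory_id, target_account_id, notes=""):
    """Build keyword arguments for the Directory Service share_directory call."""
    request = {
        "DirectoryId": directory_id,
        "ShareTarget": {"Id": target_account_id, "Type": "ACCOUNT"},
        # HANDSHAKE means the consumer account must accept the share request.
        "ShareMethod": "HANDSHAKE",
    }
    if notes:
        request["ShareNotes"] = notes
    return request


def share_with_accounts(ds_client, directory_id, account_ids):
    """Share one directory with several accounts; return the shared directory IDs."""
    shared_ids = []
    for account_id in account_ids:
        response = ds_client.share_directory(
            **share_directory_request(directory_id, account_id)
        )
        shared_ids.append(response["SharedDirectoryId"])
    return shared_ids
```

A call such as `share_with_accounts(boto3.client("ds"), "d-1234567890", ["111122223333"])` would be run from the account that owns the directory; routing between the VPCs still has to be configured separately.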

Multi-region

For the model with multiple AWS accounts and multiple AWS regions, we recommend using AWS Managed Microsoft AD Enterprise Edition. In Figure 3, AWS Managed Microsoft AD Enterprise Edition supports automating multi-region replication in all AWS regions where AWS Managed Microsoft AD is available. With AWS Managed Microsoft AD multi-region replication, Active-Directory–aware applications use the local directory for high performance, while the directory as a whole remains multi-region for high resiliency.

Figure 3. Multiple AWS accounts and multi-region model

Domain Name System resolution design

To enable Active-Directory–aware applications to communicate between your on-premises data centers and the AWS cloud, a reliable solution for Domain Name System (DNS) resolution is needed. You can point the Amazon VPC Dynamic Host Configuration Protocol (DHCP) option set at either AWS Managed Microsoft AD or on-premises Active Directory, and then assign it to each VPC in which the required Active-Directory–aware applications reside. The full list of options available in DHCP option sets is described in the Amazon Virtual Private Cloud User Guide.

The benefit of configuring DHCP option sets is that any EC2 instance in that VPC can resolve domain names by pointing to the specified domain and DNS servers. This removes the need to configure DNS manually on EC2 instances. However, because DHCP option sets cannot be shared across AWS accounts, a DHCP option set must also be created in each additional account.

Figure 4. DHCP option sets
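
A minimal Python sketch of this setup follows, assuming a boto3 EC2 client (`boto3.client("ec2")`) is passed in. The helper names, VPC ID, and IP addresses are hypothetical; the domain name reuses the “na.company.com” example from earlier in this post. Because option sets are not shared across accounts, the same call would be repeated with a client for each account.

```python
def dhcp_configurations(domain_name, dns_server_ips):
    """Build the DhcpConfigurations list for a domain and its DNS servers."""
    return [
        {"Key": "domain-name", "Values": [domain_name]},
        {"Key": "domain-name-servers", "Values": list(dns_server_ips)},
    ]


def attach_dhcp_options(ec2_client, vpc_id, domain_name, dns_server_ips):
    """Create a DHCP option set and associate it with one VPC; return its ID."""
    created = ec2_client.create_dhcp_options(
        DhcpConfigurations=dhcp_configurations(domain_name, dns_server_ips)
    )
    options_id = created["DhcpOptions"]["DhcpOptionsId"]
    # Instances pick up the new settings when they renew their DHCP lease.
    ec2_client.associate_dhcp_options(DhcpOptionsId=options_id, VpcId=vpc_id)
    return options_id
```

For example, `attach_dhcp_options(ec2, "vpc-0abc", "na.company.com", ["10.0.0.10", "10.0.1.10"])` would point every instance in that VPC at the two directory DNS addresses.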

An alternative option is creating an Amazon Route 53 Resolver. This allows customers to leverage Amazon-provided DNS and Route 53 Resolver endpoints to forward a DNS query to the on-premises Active Directory or AWS Managed Microsoft AD. This is ideal for multi-account setups and customers desiring hub/spoke DNS management.

This alternative removes the need to create and manage EC2 instances running as DNS forwarders, replacing them with a managed and scalable solution, and Route 53 Resolver forwarding rules can be shared with other AWS accounts. Figure 5 shows a Route 53 Resolver forwarding a DNS query to the on-premises Active Directory.

Figure 5. Route 53 Resolver
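
The forwarding rule in Figure 5 could be created along the following lines. This is a sketch that assumes a boto3 Route 53 Resolver client (`boto3.client("route53resolver")`) and an outbound resolver endpoint that already exists; the function names, rule name, endpoint ID, and IP addresses are hypothetical, and sharing the rule with other accounts is a separate step that is not shown.

```python
import uuid


def forward_rule_request(name, domain_name, target_ips, outbound_endpoint_id,
                         request_id=None):
    """Build keyword arguments for a FORWARD rule that sends queries to target_ips."""
    return {
        # CreatorRequestId makes retries idempotent; a random UUID is used by default.
        "CreatorRequestId": request_id or str(uuid.uuid4()),
        "Name": name,
        "RuleType": "FORWARD",
        "DomainName": domain_name,
        "TargetIps": [{"Ip": ip, "Port": 53} for ip in target_ips],
        "ResolverEndpointId": outbound_endpoint_id,
    }


def create_and_associate(resolver_client, vpc_id, **rule_args):
    """Create the forwarding rule and associate it with one VPC; return the rule ID."""
    response = resolver_client.create_resolver_rule(**forward_rule_request(**rule_args))
    rule_id = response["ResolverRule"]["Id"]
    resolver_client.associate_resolver_rule(ResolverRuleId=rule_id, VPCId=vpc_id)
    return rule_id
```

To forward queries for the on-premises “company.com” domain, `target_ips` would hold the addresses of the on-premises DNS servers, and the rule would be associated with each VPC that hosts Active-Directory–aware workloads.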

Conclusion

In this post, we described the benefits of using AWS Managed Microsoft AD to integrate with on-premises Active Directory. We also discussed a range of design considerations to explore when architecting a hybrid Active Directory service with AWS Managed Microsoft AD. We reviewed different design scenarios, from a single AWS account and region to multiple AWS accounts and multiple regions. Finally, we discussed choosing between Amazon VPC DHCP option sets and Route 53 Resolver for DNS resolution.

ICE Is a Domestic Surveillance Agency

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/05/ice-is-a-domestic-surveillance-agency.html

Georgetown has a new report on the highly secretive bulk surveillance activities of ICE in the US:

When you think about government surveillance in the United States, you likely think of the National Security Agency or the FBI. You might even think of a powerful police agency, such as the New York Police Department. But unless you or someone you love has been targeted for deportation, you probably don’t immediately think of Immigration and Customs Enforcement (ICE).

This report argues that you should. Our two-year investigation, including hundreds of Freedom of Information Act requests and a comprehensive review of ICE’s contracting and procurement records, reveals that ICE now operates as a domestic surveillance agency. Since its founding in 2003, ICE has not only been building its own capacity to use surveillance to carry out deportations but has also played a key role in the federal government’s larger push to amass as much information as possible about all of our lives. By reaching into the digital records of state and local governments and buying databases with billions of data points from private companies, ICE has created a surveillance infrastructure that enables it to pull detailed dossiers on nearly anyone, seemingly at any time. In its efforts to arrest and deport, ICE has — without any judicial, legislative or public oversight — reached into datasets containing personal information about the vast majority of people living in the U.S., whose records can end up in the hands of immigration enforcement simply because they apply for driver’s licenses; drive on the roads; or sign up with their local utilities to get access to heat, water and electricity.

ICE has built its dragnet surveillance system by crossing legal and ethical lines, leveraging the trust that people place in state agencies and essential service providers, and exploiting the vulnerability of people who volunteer their information to reunite with their families. Despite the incredible scope and evident civil rights implications of ICE’s surveillance practices, the agency has managed to shroud those practices in near-total secrecy, evading enforcement of even the handful of laws and policies that could be invoked to impose limitations. Federal and state lawmakers, for the most part, have yet to confront this reality.

EDITED TO ADD (5/13): A news article.

So it’s been a while…

Post Syndicated from Adam Bradley original https://ibms360.co.uk/?p=902

Dear Reader,

Wow, it’s been a while since our last post here, almost 2 years! Time has totally flown by. I checked the traffic this morning and was pleasantly surprised to see that we’re still getting 2,000+ hits per month which is just incredible given that we haven’t published any updates.

So, I’m guessing you probably want to know what’s going on and why we’re not posting here. Let me summarily answer your most important questions below:

  1. Are all of you okay? – Yes.
  2. Is the project dead? – No.
  3. Do you have any updates for us? – Unfortunately, not really.

So, the reason we haven’t been posting here is mainly because, well, nothing has changed. Chris & I have both been insanely busy with regular life, work is non-stop and personal commitments on top mean that currently we have little time to focus on the project. I’m additionally moving to Southampton for a new job which will put me further away from the project and will likely just add to the delay.

The small updates we do have for you are mainly administrative. Back in June 2020 Peter Vaughan purchased and donated some shelving units to the project to allow us to better store our parts, media, etc.

And in September of 2020 we had some members of the CCS (British Computer Conservation Society) visit us in a socially distanced fashion (remember that?!) to ask us some questions about the project following our application to join their projects register.

We have now successfully joined the CCS, and look forward to working with them in the future.

So, what of the project now? Well, for now we’ve basically decided to park the project for a while until one or both of us has more time to spend on it. It sucks because we really want to see the project move forward and succeed, but right now neither of us are particularly in a position to make that happen; and whilst we do have fantastic support from the rest of the team, realistically we need to be somewhat involved in order to be able to progress things in the direction we’d like them to go in.

So, basically we’re on pause for the moment. When will we be off pause? I don’t know. It depends on a lot of factors. Trust me though, if anything changes you will all be the first to hear about it!

All the best,

Adam

P.S. If you’ve sent us an email and we haven’t replied, I can only apologise. A lot of them seem to have disappeared into a black hole of our old email server, and so if you’d like to get in touch please send us another email and we’ll do our best to get back to you.

Apple Mail Now Blocks Email Trackers

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/05/apple-mail-now-blocks-email-trackers.html

Apple Mail now blocks email trackers by default.

Most email newsletters you get include an invisible “image,” typically a single white pixel, with a unique file name. The server keeps track of every time this “image” is opened and by which IP address. This quirk of internet history means that marketers can track exactly when you open an email and your IP address, which can be used to roughly work out your location.

So, how does Apple Mail stop this? By caching. Apple Mail downloads all images for all emails before you open them. Practically speaking, that means every message downloaded to Apple Mail is marked “read,” regardless of whether you open it. Apple also routes the download through two different proxies, meaning your precise location also can’t be tracked.

Crypto-Gram uses Mailchimp, which has these tracking pixels turned on by default. I turn them off. Normally, Mailchimp requires them to be left on for the first few mailings, presumably to prevent abuse. The company waived that requirement for me.

Friday Squid Blogging: Squid Filmed Changing Color for Camouflage Purposes

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/05/friday-squid-blogging-squid-filmed-changing-color-for-camouflage-purposes.html

Video of oval squid (Sepioteuthis lessoniana) changing color in reaction to their background. The research paper claims this is the first time this has been documented.

As usual, you can also use this squid post to talk about the security stories in the news that I haven’t covered.

Read my blog posting guidelines here.

Throttling a tiered, multi-tenant REST API at scale using API Gateway: Part 1

Post Syndicated from Nick Choi original https://aws.amazon.com/blogs/architecture/throttling-a-tiered-multi-tenant-rest-api-at-scale-using-api-gateway-part-1/

Many software-as-a-service (SaaS) providers adopt throttling as a common technique to protect a distributed system from spikes of inbound traffic that might compromise reliability, reduce throughput, or increase operational cost. Multi-tenant SaaS systems have an additional concern of fairness; excessive traffic from one tenant needs to be selectively throttled without impacting the experience of other tenants. This is also known as “the noisy neighbor” problem. AWS itself enforces some combination of throttling and quota limits on nearly all its own service APIs. SaaS providers building on AWS should design and implement throttling strategies in all of their APIs as well.

In this two-part blog series, we will explore tiering and throttling strategies for multi-tenant REST APIs and review tenant isolation models with hands-on sample code. In part 1, we will look at why a tiering and throttling strategy is needed and show how Amazon API Gateway can help by showing sample code. In part 2, we will dive deeper into tenant isolation models as well as considerations for production.

We selected Amazon API Gateway for this architecture since it is a fully managed service that helps developers create, publish, maintain, monitor, and secure APIs. First, let’s focus on how Amazon API Gateway can be used to throttle REST APIs with fine granularity using Usage Plans and API Keys. Usage Plans define the thresholds beyond which throttling should occur. They also enable quotas, which set a maximum usage per day, week, or month. API Keys are identifiers for distinguishing traffic and determining which Usage Plans to apply for each request. We limit the scope of our discussion to REST APIs because other protocols that API Gateway supports (WebSocket APIs and HTTP APIs) have different throttling mechanisms that do not employ Usage Plans or API Keys.
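To make the idea of a throttling threshold concrete, here is a minimal sketch of a token bucket, the algorithm that the API Gateway documentation describes for request throttling. It is not API Gateway's implementation: the class name, the constructor arguments, and the injected clock are all assumptions made for illustration. A steady rate refills the bucket, and the burst size caps how many requests can pass at once.

```typescript
// Illustrative token bucket: `ratePerSec` tokens are added per second,
// up to a maximum of `burst`. Each allowed request consumes one token.
// The class and its parameters are hypothetical, not an AWS API.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly ratePerSec: number;
  private readonly burst: number;

  constructor(ratePerSec: number, burst: number, nowMs: number) {
    this.ratePerSec = ratePerSec;
    this.burst = burst;
    this.tokens = burst; // start with a full bucket
    this.lastRefillMs = nowMs;
  }

  // Returns true if the request is allowed, false if it should be
  // throttled (the case a REST client would observe as HTTP 429).
  tryConsume(nowMs: number): boolean {
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.burst, this.tokens + elapsedSec * this.ratePerSec);
    this.lastRefillMs = Math.max(this.lastRefillMs, nowMs);
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Passing the current time in explicitly keeps the sketch deterministic and easy to test; a real limiter would read a clock instead.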

SaaS providers must balance minimizing cost to serve and providing consistent quality of service for all tenants. They also need to ensure one tenant’s activity does not affect the other tenants’ experience. Throttling and quotas are a key aspect of a tiering strategy and important for protecting your service at any scale. In practice, the impact of throttling policies and quota management is continuously monitored and evaluated as the tenant composition and behavior evolve over time.

Architecture Overview

Figure 1 – Architecture of the sample code

To get a firm foundation in the basics of throttling and quotas with API Gateway, we’ve provided sample code in AWS-Samples on GitHub. Not only does it provide a starting point to experiment with Usage Plans and API Keys in API Gateway, but we will also modify this code later to address complexity that arises at scale. The sample code has two main parts: 1) a web frontend, and 2) a serverless backend. The backend is a serverless architecture using Amazon API Gateway, AWS Lambda, Amazon DynamoDB, and Amazon Cognito. As Figure 1 illustrates, it implements one REST API endpoint, GET /api, that is protected with throttling and quotas. There are additional APIs under the /admin/* resource to provide read access to Usage Plans and CRUD operations on API Keys.

All these REST endpoints could be tested with developer tools such as curl or Postman, but we’ve also provided a web application to help you get started. The web application illustrates how tenants might interact with the SaaS application to browse different tiers of service, purchase API Keys, and test them. The web application is implemented in React and uses AWS Amplify CLI and SDKs.

Prerequisites

To deploy the sample code, you should have the following prerequisites:

For clarity, we’ll use the environment variable, ${TOP}, to indicate the top-most directory in the cloned source code or the top directory in the project when browsing through GitHub.

Detailed instructions on how to install the code are in the ${TOP}/INSTALL.md file. After installation, follow ${TOP}/WALKTHROUGH.md for step-by-step instructions to create a test key with a very small quota limit of 10 requests per day, and use the client to hit that limit. Look for HTTP 429: Too Many Requests as the signal that your client has been throttled.

Figure 2: The web application (with browser developer tools enabled) shows that a quick succession of API calls starts returning an HTTP 429 after the quota for the day is exceeded.

Responsibilities of the Client to support Throttling

The client must provide an API Key in the “X-Api-Key” header of the HTTP request. If a resource in API Gateway has throttling enabled and that header is missing or invalid in the request, then API Gateway will reject the request.

Important: API Keys are simple identifiers, not authorization tokens or cryptographic keys. API keys are for throttling and managing quotas for tenants only and not suitable as a security mechanism. There are many ways to properly control access to a REST API in API Gateway, and we refer you to the AWS documentation for more details as that topic is beyond the scope of this post.

Clients should always test for the response to any network call, and implement logic specific to an HTTP 429 response. The correct action is almost always “try again later.” Just how much later, and how many times before giving up, is application dependent. Common approaches include:

  • Retry – With simple retry, the client retries the request up to a configured maximum number of retries.
  • Exponential backoff – Exponential backoff uses progressively longer wait times between retries for consecutive errors. As the wait time can become very long quickly, a maximum delay and a maximum retry limit should be specified.
  • Jitter – Jitter adds a random amount of delay between retries to prevent large bursts by spreading out the request rate.

The AWS SDKs are an example of a client that implements these responsibilities. Each AWS SDK implements automatic retry logic that uses a combination of retry, exponential backoff, jitter, and a maximum retry limit.
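The three approaches above can be combined in a few lines. The following is a minimal sketch, not the AWS SDK's retry logic: the option names, the "full jitter" variant (a random delay between zero and a capped exponential ceiling), and the injectable `send`, `sleep`, and `random` functions are all assumptions chosen for illustration.

```typescript
// Hypothetical retry settings; the names and values are illustrative only.
interface RetryOptions {
  maxRetries: number;  // retries allowed after the first attempt
  baseDelayMs: number; // delay ceiling for the first retry
  maxDelayMs: number;  // upper bound on any single delay
}

// Delay before retry number `attempt` (0-based): a random value in
// [0, min(maxDelayMs, baseDelayMs * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  opts: RetryOptions,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Calls `send` until it returns a status other than 429 or the retry
// budget is exhausted. `send` is assumed to wrap the HTTP call,
// including the X-Api-Key header.
async function callWithRetry(
  send: () => Promise<{ status: number }>,
  opts: RetryOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<{ status: number }> {
  let response = await send();
  for (let attempt = 0; response.status === 429 && attempt < opts.maxRetries; attempt++) {
    await sleep(backoffDelayMs(attempt, opts));
    response = await send();
  }
  return response;
}
```

If the final response is still a 429, the caller receives it unchanged and decides whether to surface the error or queue the work for later.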

SaaS Considerations: Tenant Isolation Strategies at Scale

While the sample code is a good start, the design has an implicit assumption that API Gateway will support as many API Keys as we have tenants. In fact, API Gateway has a quota on the number of API Keys available per region per account. If the sample code’s requirements are to support more than 10,000 tenants (or if tenants are allowed multiple keys), then the sample implementation is not going to scale, and we need to consider more scalable implementation strategies.

This is one instance of a general challenge with SaaS called “tenant isolation strategies.” We highly recommend reviewing the white paper “SaaS Tenant Isolation Strategies.” A brief explanation here is that the one-resource-per-customer (or “siloed”) model is just one of many possible strategies to address tenant isolation. While the siloed model may be the easiest to implement and offers strong isolation, it offers no economy of scale, has high management complexity, and will quickly run into limits set by the underlying AWS services. Other models besides siloed include pooled and bridged models. Again, we recommend the white paper for more details.

Figure 3- Tiered multi-tenant architectures often employ different tenant isolation strategies at different tiers. Our example is specific to API Keys, but the technique generalizes to storage, compute, and other resources.

In this example, we implement a range of tenant isolation strategies at different tiers of service. This allows us to protect against “noisy-neighbors” at the highest tier, minimize outlay of limited resources (namely, API-Keys) at the lowest tier, and still provide an effective, bounded “blast radius” of noisy neighbors at the mid-tier.

A concrete development example helps illustrate how this can be implemented. Assume three tiers of service: Free, Basic, and Premium. One could create a single API Key that is a pooled resource among all tenants in the Free tier. At the other extreme, each Premium customer would get their own unique API Key. This would protect Premium tier tenants from the “noisy neighbor” effect. In the middle, the Basic tenants would be evenly distributed across a fixed set of keys. This is not complete isolation for each tenant, but the impact of any one tenant is contained within a defined “blast radius.”
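The three-tier assignment just described can be sketched as a small function. This is an illustration under stated assumptions, not the sample code's implementation: the key names, the pool size of 10 for the Basic tier, and the use of a simple string hash to spread Basic tenants across the pool are all hypothetical, and a hash only approximates an even distribution.

```typescript
type Tier = "free" | "basic" | "premium";

// Hypothetical values chosen for illustration.
const FREE_POOL_KEY = "free-shared-key";
const BASIC_KEY_COUNT = 10;

// Deterministic, non-cryptographic string hash (djb2 variant),
// used only to pick a bucket for a Basic tenant.
function hashString(s: string): number {
  let h = 5381;
  for (let i = 0; i < s.length; i++) {
    h = ((h * 33) ^ s.charCodeAt(i)) >>> 0;
  }
  return h;
}

// Maps a tenant to the name of the API Key it should use:
// Free is fully pooled, Basic is bucketed, Premium is siloed.
function apiKeyNameFor(tenantId: string, tier: Tier): string {
  switch (tier) {
    case "free":
      return FREE_POOL_KEY;
    case "basic":
      return `basic-key-${hashString(tenantId) % BASIC_KEY_COUNT}`;
    case "premium":
      return `premium-key-${tenantId}`;
  }
}
```

With this mapping, a noisy Basic tenant can exhaust only the key it shares with the other tenants in the same bucket, which is the bounded blast radius described above.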

In production, we recommend a more nuanced approach with additional considerations for monitoring and automation to continuously evaluate tiering strategy. We will revisit these topics in greater detail after considering the sample code.

Conclusion

In this post, we have reviewed how to effectively guard a tiered multi-tenant REST API hosted in Amazon API Gateway. We also explored how tiering and throttling strategies can influence tenant isolation models. In Part 2 of this blog series, we will dive deeper into tenant isolation models and gaining insights with metrics.

If you’d like to know more about the topic, the AWS Well-Architected SaaS Lens Performance Efficiency pillar dives deep on tenant tiers and providing differentiated levels of performance to each tier. It also provides best practices and resources to help you design for and reduce the impact of noisy neighbors on your SaaS solution.

To learn more about Serverless SaaS architectures in general, we recommend the AWS Serverless SaaS Workshop and the SaaS Factory Serverless SaaS reference solution that inspired it.

Corporate Involvement in International Cybersecurity Treaties

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/05/corporate-involvement-in-international-cybersecurity-treaties.html

The Paris Call for Trust and Stability in Cyberspace is an initiative launched by French President Emmanuel Macron during UNESCO’s 2018 Internet Governance Forum. It’s an attempt by the world’s governments to come together and create a set of international norms and standards for a reliable, trustworthy, safe, and secure Internet. It’s not an international treaty, but it does impose obligations on the signatories. It’s a major milestone for global Internet security and safety.

Corporate interests are all over this initiative, sponsoring and managing different parts of the process. As part of the Call, the French company Cigref and the Russian company Kaspersky chaired a working group on cybersecurity processes, along with French research center GEODE. Another working group on international norms was chaired by US company Microsoft and Finnish company F-Secure, along with a University of Florence research center. A third working group’s participant list includes more corporations than any other group.

As a result, this process has become very different than previous international negotiations. Instead of governments coming together to create standards, it is being driven by the very corporations that the new international regulatory climate is supposed to govern. This is wrong.

The companies making the tools and equipment being regulated shouldn’t be the ones negotiating the international regulatory climate, and their executives shouldn’t be named to key negotiation roles without appointment and confirmation. It’s an abdication of responsibility by the US government for something that is too important to be treated this cavalierly.

On the one hand, this is no surprise. The notions of trust and stability in cyberspace are about much more than international safety and security. They’re about market share and corporate profits. And corporations have long led policymakers in the fast-moving and highly technological battleground that is cyberspace.

The international Internet has always relied on what is known as a multistakeholder model, where those who show up and do the work can be more influential than those in charge of governments. The Internet Engineering Task Force, the group that agrees on the technical protocols that make the Internet work, is largely run by volunteer individuals. This worked best during the Internet’s era of benign neglect, where no one but the technologists cared. Today, it’s different. Corporate and government interests dominate, even if the individuals involved use the polite fiction of their own names and personal identities.

However, we are a far cry from decades past, where the Internet was something that governments didn’t understand and largely ignored. Today, the Internet is an essential infrastructure that underpins much of society, and its governance structure is something that nations care about deeply. Having for-profit tech companies run the Paris Call process on regulating tech is analogous to putting the defense contractors Northrop Grumman or Boeing in charge of the 1970s SALT nuclear agreements between the US and the Soviet Union.

This also isn’t the first time that US corporations have led what should be an international relations process regarding the Internet. Since he first gave a speech on the topic in 2017, Microsoft President Brad Smith has become almost synonymous with the term “Digital Geneva Convention.” It’s not just that corporations in the US and elsewhere are taking a lead on international diplomacy, they’re framing the debate down to the words and the concepts.

Why is this happening? Different countries have their own problems, but we can point to three that currently plague the US.

First and foremost, “cyber” still isn’t taken seriously by much of the government, specifically the State Department. It’s not real to the older military veterans, or to the even older politicians who confuse Facebook with TikTok and use the same password for everything. It’s not even a topic area for negotiations for the US Trade Representative. Nuclear disarmament is “real geopolitics,” while the Internet is still, even now, seen as vaguely magical, and something that can be “fixed” by having the nerds yank plugs out of a wall.

Second, the State Department was gutted during the Trump years. It lost many of the up-and-coming public servants who understood the way the world was changing. The work of previous diplomats to increase the visibility of the State Department’s cyber efforts was abandoned. There are few left on staff to do this work, and even fewer to decide if they’re any good. It’s hard to hire senior information security professionals in the best of circumstances; it’s why charlatans so easily flourish in the cybersecurity field. The built-up skill set of the people who poured their effort and time into this work during the Obama years is gone.

Third, there’s a power struggle at the heart of the US government involving cyber issues, between the White House, the Department of Homeland Security (represented by CISA), and the military (represented by US Cyber Command). Trying to create another cyber center of power within the State Department threatens those existing powers. It’s easier to leave it in the hands of private industry, which does not affect those government organizations’ budgets or turf.

We don’t want to go back to the era when only governments set technological standards. The governance model from the days of the telephone is another lesson in how not to do things. The International Telecommunications Union is an agency run out of the United Nations. It is moribund and ponderous precisely because it is run by national governments, with civil society and corporations largely alienated from the decision-making processes.

Today, the Internet is fundamental to global society. It’s part of everything. It affects national security and will be a theater in any future war. How individuals, corporations, and governments act in cyberspace is critical to our future. The Internet is critical infrastructure. It provides and controls access to healthcare, space, the military, water, energy, education, and nuclear weaponry. How it is regulated isn’t just something that will affect the future. It is the future.

Since the Paris Call was finalized in 2018, it has been signed by 81 countries — including the US in 2021 — 36 local governments and public authorities, 706 companies and private organizations, and 390 civil society groups. The Paris Call isn’t the first international agreement that puts companies on an equal signatory footing as governments. The Global Internet Forum to Combat Terrorism and the Christchurch Call to eliminate extremist content online do the same thing. But the Paris Call is different. It’s bigger. It’s more important. It’s something that should be the purview of governments and not a vehicle for corporate power and profit.

When something as important as the Paris Call comes along again, perhaps in UN negotiations for a cybercrime treaty, we call for actual State Department officials with technical expertise to be sitting at the table with the interests of the entire US in their pocket…not people with equity shares to protect.

This essay was written with Tarah Wheeler, and previously published on The Cipher Brief.

Orchestrating Amazon S3 Glacier Deep Archive object retrieval using AWS Step Functions

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/orchestrating-amazon-s3-glacier-deep-archive-object-retrieval-using-aws-step-functions/

This blog was written by Monica Cortes Sack, Solutions Architect, Oskar Neumann, Partner Solutions Architect, and Dhiraj Mahapatro, Principal Specialist SA, Serverless.

AWS Step Functions now support over 220 services and over 10,000 AWS API actions. This enables you to use the AWS SDK integration directly instead of writing an AWS Lambda function as a proxy.

One such service integration is with Amazon S3. Currently, you write scripts using AWS CLI S3 commands to automate S3 tasks. For example, such scripts might integrate S3 with AWS Transfer Family, build a custom security check, take action on S3 buckets when objects are created, or orchestrate a workflow around S3 Glacier Deep Archive object retrieval. These script executions do not provide an execution history or an easy way to validate the behavior.

Step Functions’ AWS SDK integration with S3 declaratively creates serverless workflows around S3 tasks without relying on those scripts. You can validate the execution history and behavior of a Step Functions workflow.

This blog highlights one of the S3 use cases. It shows how to orchestrate workflows around S3 Glacier Deep Archive object retrieval, cost estimation, and interaction with the requester using Step Functions. The demo application provides additional details on the entire architecture.

S3 Glacier Deep Archive is a storage class in S3 used for data that is rarely accessed. The service provides durable and secure long-term storage, trading immediate access for cost-effectiveness. You must restore archived objects before they are downloadable. It supports two options for object retrieval:

  1. Standard – Access objects within 12 hours of the start of the restoration process.
  2. Bulk – Access objects within 48 hours of the start of the restoration process.

Business use case

Consider a research institute that stores backups on S3 Glacier Deep Archive. The backups are maintained in S3 Glacier Deep Archive for redundancy. The institute has multiple researchers with one central IT team. When a researcher requests an object from S3 Glacier Deep Archive, the central IT team retrieves it and charges the corresponding research group for retrieval and data transfer costs.

Researchers are the end users and do not operate in the AWS Cloud. They run computing clusters on-premises and depend on the central IT team to provide them with the restored archive. A member of the research team requesting an object retrieval provides the following information to the central IT team:

  1. Object key to be restored.
  2. The number of days the researcher needs the object accessible for download.
  3. Researcher’s email address.
  4. Retrieval SLA of 12 or 48 hours. This determines whether “Standard” or “Bulk” retrieval is used, respectively.
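As a rough illustration of how these four inputs might translate into a restore call, the sketch below builds the parameters for the S3 RestoreObject API, whose request accepts a `Days` value and a `GlacierJobParameters.Tier`. The function name, the type names, and the validation rule are assumptions for illustration; the email address is used for notifications rather than for the restore call, so it is left out here.

```typescript
// The two SLAs offered to researchers, in hours.
type SlaHours = 12 | 48;

// Shape of the parameters passed to S3 RestoreObject in this sketch.
interface RestoreObjectParams {
  Bucket: string;
  Key: string;
  RestoreRequest: {
    Days: number;
    GlacierJobParameters: { Tier: "Standard" | "Bulk" };
  };
}

// Hypothetical helper: maps a researcher's request to RestoreObject
// parameters. A 12-hour SLA selects Standard; 48 hours selects Bulk.
function buildRestoreParams(
  bucket: string,
  key: string,
  days: number,
  slaHours: SlaHours
): RestoreObjectParams {
  if (!Number.isInteger(days) || days < 1) {
    throw new RangeError("days must be a positive integer");
  }
  return {
    Bucket: bucket,
    Key: key,
    RestoreRequest: {
      Days: days,
      GlacierJobParameters: { Tier: slaHours === 12 ? "Standard" : "Bulk" },
    },
  };
}
```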

The following overall architecture explains the setup on AWS and the interaction between a researcher and the central IT team’s architecture.

Architecture overview

Architecture diagram

  1. The researcher uses a front-end application to request object retrieval from S3 Glacier Deep Archive.
  2. Amazon API Gateway synchronously invokes AWS Step Functions Express Workflow.
  3. Step Functions initiates RestoreObject from S3 Glacier Deep Archive.
  4. Step Functions stores the metadata of this retrieval in an Amazon DynamoDB table.
  5. Step Functions uses Amazon SES to email the researcher about archive retrieval initiation.
  6. Upon completion, S3 sends the Object Restore Completed event to Amazon EventBridge.
  7. An EventBridge rule triggers another Step Functions workflow for post-processing after the restore is complete.
  8. A Lambda function inside the Step Functions workflow calculates the estimated cost (retrieval and data transfer out) and updates existing metadata in the DynamoDB table.
  9. Amazon Athena Federated Queries read data from the DynamoDB table to generate a reporting dashboard in Amazon QuickSight.
  10. Step Functions uses SES to email the researcher with cost details.
  11. Once the researcher receives an email, the researcher uses the front-end application to call the /download API endpoint.
  12. API Gateway invokes a Lambda function that generates a pre-signed S3 URL of the retrieved object and returns it in the response.

Setup

  1. To run the sample application, you must install CDK v2, Node.js, and npm.
  2. To clone the repository, run:
    git clone https://github.com/aws-samples/aws-stepfunctions-examples.git
    cd aws-stepfunctions-examples/cdk/app-glacier-deep-archive-retrieval
  3. To deploy the application, run:
    cdk deploy --all

Identifying workflow components

Starting the restore object workflow

The first component is accepting the researcher’s request to start the archive retrieval process. The sample application created from the demo provides a basic front-end app that shows the files from an S3 bucket that has objects stored in S3 Glacier Deep Archive. The researcher requests file retrieval from the front-end application, which is reached through the sample application’s Amazon CloudFront URL.

Glacier Deep Archive Retrieval menu

The front-end app asks the researcher for an email address, the number of days the researcher wants the object to be available for download, and their ETA on retrieval speed. Based on the retrieval speed, the researcher accepts either Standard or Bulk object retrieval. To test this, put objects in the data bucket under the S3 Glacier Deep Archive storage class and use the front-end application to retrieve them.

Item retrieval prompt

The researcher then chooses Retrieve file. This action invokes an API endpoint provided by API Gateway. The API Gateway endpoint synchronously invokes a Step Functions Express Workflow. This validates the restore object request, gets the object metadata, and starts to restore the object from S3 Glacier Deep Archive.

The state machine stores the metadata of the restore object AWS SDK call in a DynamoDB table for later use. You can use this metadata to build a dashboard in Amazon QuickSight for reporting and administration purposes. Finally, the state machine uses Amazon SES to email the researcher, notifying them about the restore object initiation process:

Restore object initiated

The following state machine shows the workflow:

Workflow diagram

The ability to use S3 APIs declaratively using AWS SDK from Step Functions makes it convenient to integrate with S3. This approach avoids writing a Lambda function to wrap the SDK calls. The following portion of the state machine definition shows the usage of S3 HeadObject and RestoreObject APIs:

"Get Object Metadata": {
    "Next": "Initiate Restore Object from Deep Archive",
    "Catch": [{
        "ErrorEquals": ["States.ALL"],
        "Next": "Bad Request"
    }],
    "Type": "Task",
    "ResultPath": "$.result.metadata",
    "Resource": "arn:aws:states:::aws-sdk:s3:headObject",
    "Parameters": {
        "Bucket": "glacierretrievalapp-databucket-abc123",
        "Key.$": "$.fileKey"
    }
}, 
"Initiate Restore Object from Deep Archive": {
    "Next": "Update restore operation metadata",
    "Type": "Task",
    "ResultPath": null,
    "Resource": "arn:aws:states:::aws-sdk:s3:restoreObject",
    "Parameters": {
        "Bucket": "glacierretrievalapp-databucket-abc123",
        "Key.$": "$.fileKey",
        "RestoreRequest": {
            "Days.$": "$.requestedForDays"
        }
    }
}

You can extend the previous workflow and build your own Step Functions workflows to orchestrate other S3 related workflows.

Processing after object restoration completion

S3 RestoreObject is a long-running process for S3 Glacier Deep Archive objects. When the restore completes, S3 emits an Object Restore Completed event to EventBridge. You set up an EventBridge rule to trigger another Step Functions workflow as a target for this event. This workflow takes care of the object restoration post-processing. EventBridge notifications must first be enabled on the data bucket:

cfnDataBucket.addPropertyOverride('NotificationConfiguration.EventBridgeConfiguration.EventBridgeEnabled', true);

An EventBridge rule triggers the following Step Functions workflow and passes the event payload as an input to the Step Functions execution:

new aws_events.Rule(this, 'invoke-post-processing-rule', {
  eventPattern: {
    source: ["aws.s3"],
    detailType: [
      "Object Restore Completed"
    ],
    detail: {
      bucket: {
        name: [props.dataBucket.bucketName]
      }
    }
  },
  targets: [new aws_events_targets.SfnStateMachine(this.stateMachine, {
    input: aws_events.RuleTargetInput.fromObject({
      's3Event': aws_events.EventField.fromPath('$')
    })
  })]
});
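To make the rule's behavior concrete, here is a small sketch of how such a pattern selects events. It is a deliberately simplified matcher (exact-value lists and nested objects only), not EventBridge's full matching logic. The event is trimmed to a few fields and the object key is a made-up example. In the JSON form of a pattern, the CDK `detailType` property corresponds to the `detail-type` key.

```python
def matches(pattern: dict, event: dict) -> bool:
    """Simplified event-pattern matching: every pattern key must be present,
    leaf values must appear in the pattern's list, nested dicts recurse."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


PATTERN = {
    "source": ["aws.s3"],
    "detail-type": ["Object Restore Completed"],
    "detail": {"bucket": {"name": ["glacierretrievalapp-databucket-abc123"]}},
}

# Illustrative event, reduced to the fields used here; the key is hypothetical.
event = {
    "source": "aws.s3",
    "detail-type": "Object Restore Completed",
    "detail": {
        "bucket": {"name": "glacierretrievalapp-databucket-abc123"},
        "object": {"key": "datasets/sample.tar"},
    },
}

if matches(PATTERN, event):
    # The rule target wraps the whole event under `s3Event`, as in the CDK code.
    execution_input = {"s3Event": event}
    print(execution_input["s3Event"]["detail"]["object"]["key"])
```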

The Step Functions workflow gets object metadata from the DynamoDB table and then invokes a Lambda function to calculate the estimated cost. The Lambda function calculates the estimated retrieval and the data transfer costs using the contentLength of the retrieved object and the Price List API for the unit cost. The workflow then updates the calculated cost in the DynamoDB table.

The retrieval cost and the data transfer out cost are proportional to the size of the retrieved object. The Step Functions workflow also invokes a Lambda function to create the download API URL for object retrieval. Finally, it emails the researcher with the estimated cost and the download URL as a restoration completion notification.
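The proportionality described above can be sketched as a small function. This is an illustration under stated assumptions, not the sample application's Lambda code: the function name, the returned field names, and the convention of 1 GB = 1024³ bytes are assumptions, and the unit prices in the example call are placeholders. In the sample application the unit costs come from the Price List API.

```python
BYTES_PER_GB = 1024 ** 3  # assumed billing unit for this sketch


def estimate_restore_cost(
    content_length: int,
    retrieval_usd_per_gb: float,
    transfer_usd_per_gb: float,
) -> dict:
    """Estimate retrieval and data transfer out cost for one restored object.

    Both components scale linearly with the object size in GB.
    """
    size_gb = content_length / BYTES_PER_GB
    retrieval = size_gb * retrieval_usd_per_gb
    transfer = size_gb * transfer_usd_per_gb
    return {
        "sizeGB": round(size_gb, 4),
        "retrievalCost": round(retrieval, 4),
        "dataTransferCost": round(transfer, 4),
        "totalCost": round(retrieval + transfer, 4),
    }


# Placeholder unit prices, for illustration only.
print(estimate_restore_cost(5 * BYTES_PER_GB, 0.02, 0.09))
```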

Workflow studio diagram

The email notification to the researcher looks like:

Email example

Downloading the restored object

Once the object restoration is complete, the researcher can download the object from the front-end application.

Front end retrieval menu

The researcher chooses the Download action, which invokes another API Gateway endpoint. The endpoint integrates with a Lambda function as a backend that creates a pre-signed S3 URL sent as a response to the browser.
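A Lambda handler for this endpoint could look roughly like the following sketch. It assumes an API Gateway proxy integration that passes the object key as a `fileKey` query string parameter; the parameter name, the factory function, and the 15-minute expiry are assumptions, not details of the sample application. The S3 client is injected so the handler can be exercised locally. In Lambda you would pass `boto3.client("s3")`, whose `generate_presigned_url` method is called here.

```python
import json


def make_download_handler(s3_client, bucket: str, expires_in: int = 900):
    """Return a Lambda-style handler that answers with a pre-signed GET URL."""

    def handler(event, context):
        key = event["queryStringParameters"]["fileKey"]
        url = s3_client.generate_presigned_url(
            "get_object",
            Params={"Bucket": bucket, "Key": key},
            ExpiresIn=expires_in,  # URL lifetime in seconds
        )
        return {"statusCode": 200, "body": json.dumps({"downloadUrl": url})}

    return handler


class FakeS3Client:
    """Stand-in for a boto3 S3 client, used only for a local dry run."""

    def generate_presigned_url(self, client_method, Params, ExpiresIn):
        return (
            f"https://{Params['Bucket']}.example/{Params['Key']}"
            f"?method={client_method}&expires={ExpiresIn}"
        )


handler = make_download_handler(FakeS3Client(), "glacierretrievalapp-databucket-abc123")
print(handler({"queryStringParameters": {"fileKey": "datasets/sample.tar"}}, None))
```

Keeping the expiry short limits how long a leaked URL remains usable.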

Administering object restoration usage

This architecture also provides a view for the central IT team to understand object restoration usage. You achieve this by creating reports and dashboards from the metadata stored in DynamoDB.

The sample application uses Amazon Athena Federated Queries and Amazon Athena DynamoDB Connector to generate a reports dashboard in Amazon QuickSight. You can also use Step Functions AWS SDK integration with Amazon Athena and visualize the workflows in the Athena console.

The following QuickSight visualization shows the count of restored S3 Glacier Deep Archive objects by their contentType:

QuickSight visualization

Considerations

With the preceding approach, you should consider that:

  1. You must start the object retrieval in the same Region as the Region of the archived object.
  2. S3 Glacier Deep Archive only supports standard and bulk retrievals.
  3. You must enable the “Object Restore Completed” event notification on the S3 bucket with the S3 Glacier Deep Archive object.
  4. The researcher’s email must be verified in SES.
  5. Use a Lambda function to call the Price List GetProducts API, because the Price List service endpoints are available only in specific Regions.

Cleanup

To clean up the infrastructure used in this sample application, run:

cdk destroy --all

Conclusion

Step Functions’ AWS SDK integration opens up different opportunities to orchestrate a workflow. Step Functions provides native support for retries and error handling, which offloads the heavy lifting of handling them manually in scripts.

This blog post shows one example use case with S3 Glacier Deep Archive. With AWS SDK integration in Step Functions, you can build any workflow orchestration using S3 or S3 control APIs. For example, you can build a workflow that enforces AWS Key Management Service encryption based on an S3 event, or one that creates static website hosting on demand in a few steps.

With different S3 API calls available in Step Functions’ Workflow Studio, you can declaratively build a Step Functions workflow instead of imperatively calling each S3 API from a shell script or command line. Refer to the demo application for more details.

For more serverless learning resources, visit Serverless Land.

Amazon EC2 DL1 instances Deep Dive

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/amazon-ec2-dl1-instances-deep-dive/

This post is written by Amr Ragab, Principal Solutions Architect, Amazon EC2.

AWS is excited to announce that the new Amazon Elastic Compute Cloud (Amazon EC2) DL1 instances are now generally available in US-East (N. Virginia) and US-West (Oregon). DL1 provides up to 40% better price performance for training deep learning models as compared to current generation GPU-based EC2 instances. The dl1.24xlarge instance type features eight Intel-Habana Gaudi accelerators, which are custom-built to train deep learning models. Each Gaudi accelerator has 32 GB of high bandwidth memory (HBM2) and a peer-to-peer bidirectional bandwidth of 100 Gbps RoCE, for a total bidirectional interconnect bandwidth of 700 Gbps per card. Further instance specifications are as follows:

| Instance Size | vCPU | Instance Memory (GiB) | Gaudi Accelerators | Network Bandwidth (Gbps) | Total Accelerator Interconnect (Gbps) | Local Instance Storage | EBS Bandwidth (Gbps) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| dl1.24xlarge | 96 | 768 | 8 | 4×100 | 700 | 4×1 TB NVMe | 19 |

Instance Architecture

System architecture of the Amazon EC2 DL1 instances.

As the preceding instance architecture indicates, pairs of Gaudi accelerators (e.g., Gaudi0 and Gaudi1) are attached directly through a PCIe Gen3x16 link. Additionally, peer-to-peer networking via 100 Gbps RoCEv2 links – with seven active links per card – provides a torus configuration with a total of 700 Gbps of interconnect bandwidth per card. This topology is a separate interconnect outside of the two NUMA domains. Furthermore, the instance supports four EFA ENIs and 4x1TB of local NVMe SSD storage. We will provide a peer-direct driver over EFA, which will let you utilize high throughput, low latency peer-direct networking between accelerators across multiple instances to efficiently scale multi-node distributed training workloads.

Quick Start

Quickly get started with DL1 and the SynapseAI SDK through the following options:

1) Habana Deep Learning AMIs provided by AWS.

2) AWS Marketplace AMIs provided by Habana.

3) A custom Amazon Machine Image (AMI) built with Packer, using the build scripts provided by this GitHub repo. This repo also provides build scripts to create Amazon Elastic Container Service (Amazon ECS) and Amazon Elastic Kubernetes Service (Amazon EKS) AMIs.

After selecting an AMI, launch a dl1.24xlarge instance in either us-east-1 or us-west-2. To help identify in which availability zone(s) dl1.24xlarge is available, run the following command:

aws ec2 describe-instance-type-offerings \
--location-type availability-zone \
--filters Name=instance-type,Values=dl1.24xlarge \
--region us-west-2 \
--output table
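The same lookup can be done from Python. The sketch below wraps the EC2 `describe_instance_type_offerings` call with the same location type and filter as the CLI command; the function name is an assumption, and pagination is omitted on the assumption that a single instance type returns only a few entries. The client is passed in, so the function can be tried with the stand-in shown here or with `boto3.client("ec2", region_name="us-west-2")`.

```python
def dl1_availability_zones(ec2_client, instance_type: str = "dl1.24xlarge") -> list:
    """Return the sorted Availability Zones that offer the instance type
    in the Region the client is configured for."""
    response = ec2_client.describe_instance_type_offerings(
        LocationType="availability-zone",
        Filters=[{"Name": "instance-type", "Values": [instance_type]}],
    )
    return sorted(o["Location"] for o in response["InstanceTypeOfferings"])


class FakeEC2Client:
    """Stand-in for a boto3 EC2 client; the zone names are made up."""

    def describe_instance_type_offerings(self, LocationType, Filters):
        wanted = Filters[0]["Values"][0]
        return {
            "InstanceTypeOfferings": [
                {"InstanceType": wanted, "LocationType": LocationType, "Location": "us-west-2d"},
                {"InstanceType": wanted, "LocationType": LocationType, "Location": "us-west-2b"},
            ]
        }


print(dl1_availability_zones(FakeEC2Client()))
```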

Once launched, you can connect to the instance over SSH (with the correct security group attached).

Habana Collectives Communication Library (HCL/HCCL)

As part of the Habana SynapseAI SDK, Habana Gaudi accelerators use the HCCL library to handle collective operations between HPUs. Get more information on HCCL here. On DL1, the HCL-tests confirm close to 700 Gbps (689 Gbps) per card for the collectives tested, as follows.

You can confirm these tests by cloning the GitHub repo here.

Habana DL1 HCCL tests.
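As a quick check of the numbers quoted in this post, the per-card peak follows from seven active 100 Gbps links, and the measured 689 Gbps can be expressed as a fraction of that peak. The variable names below are only illustrative.

```python
LINK_GBPS = 100            # per RoCEv2 link
ACTIVE_LINKS_PER_CARD = 7  # active links per Gaudi card
MEASURED_GBPS = 689        # figure reported for the HCL-tests

peak_gbps = LINK_GBPS * ACTIVE_LINKS_PER_CARD
efficiency = MEASURED_GBPS / peak_gbps

print(f"peak per card: {peak_gbps} Gbps")      # peak per card: 700 Gbps
print(f"measured / peak: {efficiency:.1%}")    # measured / peak: 98.4%
```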

Amazon EKS Quick Start

Support for DL1 on Amazon EKS is available today with Amazon EKS versions > 1.19. The following quick start gets you up and running with DL1.

The following dependencies will be needed:

eksctl – You need version 0.70.0+ of eksctl.
kubectl – You use Kubernetes version 1.20 in this post.

Create EKS cluster:

eksctl create cluster --region us-east-1 --without-nodegroup \
--vpc-public-subnets subnet-037d8e430963c2d3e,subnet-0abe898359a7d43e9

Nodegroup configuration – save the following code block to a file called dl1-managed-ng.yaml. Replace the AMI ID in the code block with the ID of the AMI created earlier.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: fabulous-rainbow-1635807811
  region: us-west-2

vpc:
  id: vpc-34f1894c
  subnets:
    public:
      endpoint-one:
        id: subnet-4532e73d
      endpoint-two:
        id: subnet-8f8b7dc5

managedNodeGroups:
  - name: dl1-ng-1d
    instanceType: dl1.24xlarge
    volumeSize: 200
    instancePrefix: dl1-ng-1d-worker
    ami: ami-072c632cbbc2255b3
    iam:
      withAddonPolicies:
        imageBuilder: true
        autoScaler: true
        ebs: true
        fsx: true
        cloudWatch: true
    ssh:
      allow: true
      publicKeyName: amrragab-aws
    subnets:
    - endpoint-one
    minSize: 1
    desiredCapacity: 1
    maxSize: 4
    overrideBootstrapCommand: |
      #!/bin/bash
      /etc/eks/bootstrap.sh fabulous-rainbow-1635807811

Create the managed nodegroup with the following command:

eksctl create nodegroup -f dl1-managed-ng.yaml

Once the nodegroup has been created, apply the habana-k8s-device-plugin:

kubectl create -f https://vault.habana.ai/artifactory/docker-k8s-device-plugin/habana-k8s-device-plugin.yaml

Once completed, you should see the Gaudi devices as an allocatable resource in your EKS cluster, presenting 8 Gaudi accelerators per DL1 node in the cluster.

Allocatable:

attachable-volumes-aws-ebs: 39
cpu:                        95690m
ephemeral-storage:          192188443124
habana.ai/gaudi:            8
hugepages-1Gi:              0
hugepages-2Mi:              30000Mi
memory:                     753055132Ki
pods:                       15
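With `habana.ai/gaudi` listed as allocatable, a pod can request accelerators through its resource limits. The sketch below builds such a manifest as JSON, which `kubectl apply -f` also accepts. The helper name, pod name, and image are placeholders rather than values from this post; only the resource name comes from the output above.

```python
import json


def gaudi_pod_manifest(name: str, image: str, gaudi_count: int = 8) -> dict:
    """Build a minimal Pod manifest that requests Gaudi accelerators.

    Extended resources such as habana.ai/gaudi are requested under `limits`.
    """
    if not 1 <= gaudi_count <= 8:
        raise ValueError("a dl1.24xlarge node exposes 8 Gaudi accelerators")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,  # placeholder image reference
                    "resources": {"limits": {"habana.ai/gaudi": gaudi_count}},
                }
            ],
        },
    }


manifest = gaudi_pod_manifest("gaudi-test", "example.com/my-training-image:latest")
print(json.dumps(manifest, indent=2))
```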

Example Distributed Machine Learning (ML) Workloads

The following tables are examples of Mixed Precision/FP32 training results comparing DL1 to the common GPU instances used for ML training.

Model: ResNet50
Framework: TensorFlow 2
Dataset: Imagenet2012
GitHub: https://github.com/HabanaAI/Model-References/tree/master/TensorFlow/computer_vision/Resnets/resnet_keras

| Instance Type | Batch Size | Mixed Precision Training Throughput (images/sec) |
| --- | --- | --- |
| 8x Gaudi – 32 GB (dl1.24xlarge) | 256 | 13036 |
| 8x A100 – 40 GB (p4d.24xlarge) | 256 | 17921 |
| 8x V100 – 32 GB (p3dn.24xlarge) | 256 | 9685 |
| 8x V100 – 16 GB (p3.16xlarge) | 256 | 8945 |

Model: BERT Large – Pretraining
Framework: PyTorch 1.9
Dataset: Wikipedia/BooksCorpus
GitHub: https://github.com/HabanaAI/Model-References/tree/master/PyTorch/nlp/bert

| Instance Type | Batch Size @128 Sequence Length | Mixed Precision Training Throughput (seq/sec) |
| --- | --- | --- |
| 8x Gaudi – 32 GB (dl1.24xlarge) | 256 | 1318 |
| 8x A100 – 40 GB (p4d.24xlarge) | 8192 | 2979 |
| 8x V100 – 32 GB (p3dn.24xlarge) | 8192 | 1458 |
| 8x V100 – 16 GB (p3.16xlarge) | 8192 | 1013 |
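To compare the instances, raw throughput can be normalized against a baseline, and price performance can be derived once an hourly price is supplied. The sketch below uses the ResNet50 figures from the table above. The function names are assumptions, and no prices are included because this post does not list them. Note that the BERT rows use different batch sizes, so that table is less directly comparable.

```python
# ResNet50 mixed precision training throughput (images/sec) from the table above.
RESNET50_IMAGES_PER_SEC = {
    "dl1.24xlarge": 13036,
    "p4d.24xlarge": 17921,
    "p3dn.24xlarge": 9685,
    "p3.16xlarge": 8945,
}


def relative_throughput(results: dict, baseline: str) -> dict:
    """Express every throughput as a multiple of the baseline instance."""
    base = results[baseline]
    return {instance: value / base for instance, value in results.items()}


def samples_per_dollar(throughput_per_sec: float, hourly_price_usd: float) -> float:
    """Price performance: samples processed per dollar of instance time.

    The hourly price must be supplied by the caller.
    """
    if hourly_price_usd <= 0:
        raise ValueError("hourly price must be positive")
    return throughput_per_sec * 3600 / hourly_price_usd


for instance, ratio in relative_throughput(RESNET50_IMAGES_PER_SEC, "dl1.24xlarge").items():
    print(f"{instance}: {ratio:.2f}x")
```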

You can find a more comprehensive list of supported ML models, with performance data, here. Support for containers with TensorFlow and PyTorch is also available. Furthermore, you can stay up to date with the operator support for TensorFlow and PyTorch.

Conclusion

We are excited to innovate on behalf of our customers and provide a diverse choice in ML accelerators with DL1 instances. The DL1 instances powered by Gaudi accelerators can provide up to 40% better price performance for training deep learning models as compared to current generation GPU-based EC2 instances. DL1 instances use the Habana SynapseAI SDK with framework support in PyTorch and TensorFlow. Support for EFA with peer-direct networking between HPUs across nodes is planned for the future. Now it’s time to go power up your ML workloads with Amazon EC2 DL1 instances.

15.3 Million Request-Per-Second DDoS Attack

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/05/15-3-million-request-per-second-ddos-attack.html

Cloudflare is reporting a large DDoS attack against an unnamed company “operating a crypto launchpad.”

While this isn’t the largest application-layer attack we’ve seen, it is the largest we’ve seen over HTTPS. HTTPS DDoS attacks are more expensive in terms of required computational resources because of the higher cost of establishing a secure TLS encrypted connection. Therefore it costs the attacker more to launch the attack, and for the victim to mitigate it. We’ve seen very large attacks in the past over (unencrypted) HTTP, but this attack stands out because of the resources it required at its scale.

The attack only lasted 15 seconds. No word on motive. Was this a test? Or was that 15-second delay critical for some other fraud?

News article.

Let’s Architect! Serverless architecture on AWS

Post Syndicated from Luca Mezzalira original https://aws.amazon.com/blogs/architecture/lets-architect-serverless-architecture-on-aws/

Serverless architecture and computing allow you and your teams to focus on delivering business value instead of investing time in tweaking infrastructure characteristics. AWS not only provides serverless computing as a service: half of the new applications built by Amazon use AWS Lambda, as noted by Andy Jassy in his 2020 re:Invent keynote.

In this post, we share insights into reimagining a serverless environment.

I Build Applications – Event-driven Architecture

Event-driven architecture is common in modern applications built with microservices, and it is the cornerstone for designing serverless workloads. It uses events to trigger and communicate between decoupled services.
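A minimal in-process sketch can illustrate the decoupling this describes. It is a toy stand-in for a managed event bus such as Amazon EventBridge, not an AWS API, and the `EventBus` class, event name, and services are hypothetical. The producer publishes an event without knowing which consumers exist, and each consumer reacts independently.

```python
from collections import defaultdict


class EventBus:
    """Toy event bus: routes events to subscribers by event type."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, detail):
        # The publisher does not know who, if anyone, is listening.
        for handler in self._handlers.get(event_type, []):
            handler(detail)


bus = EventBus()
log = []

# Two decoupled consumers of the same event.
bus.subscribe("OrderPlaced", lambda d: log.append(f"billing: charge order {d['orderId']}"))
bus.subscribe("OrderPlaced", lambda d: log.append(f"shipping: pack order {d['orderId']}"))

# The order service emits the event and moves on.
bus.publish("OrderPlaced", {"orderId": 42})
print("\n".join(log))
```

Adding a new consumer only takes another `subscribe` call; the publishing code stays unchanged.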

With this video, you can learn how to start with a prototype and then scale to mass adoption using decoupled systems that run in response to events, without needing to redesign. Danilo Poccia, Chief Evangelist at AWS, begins the session with the APIs, then gives an example of how to build an event-driven architecture using Amazon EventBridge. The session closes with how to understand what is happening in this exchange of events.

Event-driven communication with asynchronous invocation

Building modern cloud applications? Think integration

This re:Invent 2021 session explains modern cloud applications based on serverless or microservices, and how connections between components define important characteristics, like scalability, availability, and coupling.

How your systems are interconnected describes your system’s essential properties, such as resiliency and changeability. Gregor Hohpe, AWS Enterprise Strategist, shares tips on what to consider when integrating different services, such as lifecycle, level of control over the systems you are integrating, and how integration becomes an integral part of your software delivery cycle. The goal is to use the same method to integrate at the same speed as software deployment.

Integration approaches with Gregor Hohpe

Serverless architectural patterns and best practices

Serverless architectures require a mindset shift: existing patterns need to be revisited, and new patterns created using the new architecture style. For each pattern created by AWS, we provide operational, security, and reliability best practices and discuss potential challenges. We also demonstrate some patterns in reference architecture diagrams.

This session helps you identify services and applications to create serverless architectures and understand areas of potential savings, increased agility, and reliability in your organization. Heitor Lessa, Principal Solutions Architect at AWS, starts the session identifying the benefits of Lambda Power Tuning: he details setting up memory when there are hundreds of functions, then follows with best practices for the pattern created.

Best practices for serverless architecture

Best practices of advanced serverless developers

This session is an overview of architectural best practices, optimizations, and handy code snippets that can be used to build secure, scalable, and high-performance serverless applications.

Julian Wood, Senior Developer Advocate at AWS, shares recommended practices for implementing serverless applications inside your company: use Lambda to transform, not transport, data; avoid monolithic services and functions; orchestrate workflows with Step Functions; and choreograph events. Julian also covers the different ways you can invoke Lambda functions and what you should be aware of with each invocation model.

Three types of AWS Lambda invocation models

Building next-gen applications with event-driven architectures

Maintaining data consistency across multiple services can be challenging. It can also be difficult to work with large amounts of data in different data stores and locations. Teams building microservices architectures often find that integration with other applications and external services can make their workloads more monolithic and tightly coupled.

In this session, you can learn how to use event-based architectures to decouple and decentralize application components. Coupling is not one-dimensional, and it’s a trade-off to balance and optimize over time. This video demonstrates patterns based on message queues and events: for each pattern you can learn the advantages, the disadvantages, and the options for building it on AWS.

Sam Dengler, Principal Solutions Architect at AWS, explains the mental models to apply while designing choreography and orchestration in a scenario with microservices. The strategy adopted by Taco Bell for identifying their bounded contexts is also detailed, as well as the architecture built on Lambda for running the business logic and on AWS Step Functions for orchestration.

Choreography and orchestration are two modes of interaction in a microservices architecture

See you next time!

Thanks for joining our discussion on serverless architecting! If you want to deep dive into the topic, read all about Serverless on AWS!

See you in a couple of weeks when we discuss architecting for resilience!

Looking for more architecture content? AWS Architecture Center provides reference architecture diagrams, vetted architecture solutions, Well-Architected best practices, patterns, icons, and more!

Other posts in this series