Digital Imaging and Communications in Medicine (DICOM) is the international standard for the transmission, storage, retrieval, print, and display of medical images and related information. While DICOM has revolutionized the medical imaging industry, allowing for enhanced patient care through the easy exchange of imaging data, it also presents potential vulnerabilities when exposed to the open internet.
About five years ago, I was in the hospital while an ultrasound was taken of my pregnant wife. While the doctor captured the images, a small message on the screen caught my attention: “writing image to disk – transfer DICOM.” Digging into the DICOM standard at the time, I was able to discover exposed systems over the internet, retrieve medical images, use demo software, and 3D-print a pelvis. An example of that research is still available online here. It’s now five years later, so I was curious to see whether things had changed (and no worries, I will not 3D-print another body part 😉).
This article delves into the risks associated with the unintended exposure of DICOM data and the importance of safeguarding this data.
Understanding DICOM
DICOM is more than just an image format; it encompasses a suite of protocols that allow different medical imaging devices and systems, such as MRI machines, X-ray devices, and computer workstations, to communicate with each other. A typical DICOM file not only contains the image but also the associated metadata, which may have patient demographic information, clinical data, and sometimes even the patient’s full name, date of birth, and other personal identifiers.
What Are the Exposure Risks?
Breach of Patient Confidentiality: The most pressing concern is the breach of patient confidentiality. If DICOM data is exposed online, there’s a high risk of unauthorized access to sensitive patient information. Such breaches can result in legal consequences, financial penalties, and damage to the reputations of medical institutions.
Data Manipulation: An unprotected system might allow malicious entities not only to view but also to alter medical data. Such manipulations could lead to misdiagnoses, inappropriate treatments, or other medical errors.
Ransomware Attacks: In recent years, healthcare institutions have become prime targets for ransomware attacks. Exposing DICOM data could potentially provide a gateway for cybercriminals to encrypt vital medical information and demand a ransom for its release.
Data Loss: Without proper security measures, data could be accidentally or maliciously deleted, leading to loss of crucial medical records.
Service Interruptions: Unprotected DICOM servers could be vulnerable to denial-of-service (DoS) attacks, disrupting medical services and interfering with patient care.
Research
While previously I focused on the imaging part of the protocol, this time I looked into the possibility of retrieving PII data* from openly exposed DICOM servers.
Using Sonar, Rapid7’s proprietary internet scan engine, a study was conducted to scan for the DICOM port exposed to the internet. Using the output of the scan, a simple Python script was created that took the discovered IP addresses as input and queried a basic set of DICOM descriptors from the “PATIENT” root level. The standard itself is very extensive and contains many fields that can be retrieved, including PII-related data such as name, date of birth, comments on the treatment, and many more.
Unfortunately, we were able to quickly retrieve sensitive patient information. No need for authentication; we received the information simply by requesting it. The following screenshot is an example of what we retrieved, with the PII altered for privacy purposes.
In some cases, we were able to get more details on the study and status of the patient:
Importantly, our results not only discovered hospitals, but also private practices and veterinary clinics.
When scanning for systems connected to the internet, we focused on the two main TCP ports: TCP port 104 and TCP port 11112. We ignored TCP port 4242 since that is mostly used to send images. In total, we discovered more than 3,600 systems that replied on these two ports.
Although it might be interesting to geolocate where these systems are, we believe it is better to first identify which systems are actually candidates for data retrieval, and geolocate those.
TCP port 104 stats
After retrieving the list of IP addresses that responded to the open port and matched a DICOM reply, we scanned the list by using a custom script that would query if a connection could be established or not. The following diagram shows the results of this scan.
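The custom script itself is not shared, but the idea can be sketched with nothing more than the Python standard library: build a minimal A-ASSOCIATE-RQ PDU (the association request defined in DICOM PS3.8) offering only the Verification SOP class, send it to the target, and check whether the first byte of the reply is 0x02 (A-ASSOCIATE-AC, meaning the association was accepted). The AE titles and timeout below are illustrative assumptions, not the values used in the study.

```python
import socket
import struct

def build_associate_rq(calling_aet="PROBE", called_aet="ANY-SCP"):
    """Build a minimal DICOM A-ASSOCIATE-RQ PDU (DICOM PS3.8) that offers
    the Verification SOP class over Implicit VR Little Endian."""
    def item(item_type, payload):
        # Item header: type (1 byte), reserved (1 byte), length (2 bytes, big-endian)
        return struct.pack(">BBH", item_type, 0, len(payload)) + payload

    app_ctx = item(0x10, b"1.2.840.10008.3.1.1.1")      # DICOM application context
    abstract = item(0x30, b"1.2.840.10008.1.1")         # Verification SOP class (C-ECHO)
    transfer = item(0x40, b"1.2.840.10008.1.2")         # Implicit VR Little Endian
    pres_ctx = item(0x20, struct.pack(">BBBB", 1, 0, 0, 0) + abstract + transfer)
    user_info = item(0x50, item(0x51, struct.pack(">I", 16384)))  # max PDU length

    body = struct.pack(">HH", 1, 0)                     # protocol version 1, reserved
    body += called_aet.ljust(16).encode("ascii")
    body += calling_aet.ljust(16).encode("ascii")
    body += b"\x00" * 32                                # reserved
    body += app_ctx + pres_ctx + user_info
    # PDU header: type 0x01 = A-ASSOCIATE-RQ, reserved byte, 4-byte length
    return struct.pack(">BBI", 0x01, 0, len(body)) + body

def accepts_association(host, port=104, timeout=5):
    """Return True if the server replies with A-ASSOCIATE-AC (PDU type 0x02)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_associate_rq())
        reply = sock.recv(1)
        return reply == b"\x02"
```

Because many exposed servers accept any calling AE title, a single accepted association like this is enough to classify a host as a candidate for further querying.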
In 45% of cases, the remote server was accepting a connection that could be used for retrieving information.
TCP port 11112 stats
Next, we used the list of IP addresses that responded to a DICOM ping reply on TCP port 11112. Again we used our script to query whether a connection could be established or not. The diagram below shows the results of this particular scan.
Of the 1,921 discovered systems responding to our DICOM connection verification script, 43% were accepting a connection that could be used for retrieving data.
Now that we know how many systems are connected and accepting connections that allow information retrieval, let’s map those out on a global map, where each orange-colored country is one where systems were discovered:
Not much seems to have changed since my initial research in 2018; even searching for medical images using a fairly simple Google query results in the ability to download images from DICOM systems, including complete MRI sets. The image below showcases an innocent example from a veterinary clinic where an X-ray of an unfortunate pet was made.
Conclusion
While DICOM has proven invaluable in the world of medical imaging, its exposure to the internet poses significant risks. Healthcare institutions are prime targets for threat actors; these risks therefore have detrimental implications for patients’ healthcare services and consumer trust, and they cause legal and financial damage to healthcare providers.
It’s essential for healthcare institutions to recognize these risks and implement robust measures to protect both patient data and their reputations. As the cyber landscape continues to evolve, so too must the defenses that guard against potential threats. Healthcare organizations should make it a part of their business strategy to regularly scan their exposure to the internet and institute robust protections against potential risks.
*Note: Where possible, Rapid7 used its connections with national CERTs to inform them of our findings. All data that was discovered has been securely removed from the researcher’s system.
This week is the Virus Bulletin Conference in London. Part of the conference is the Cyber Threat Alliance summit, where CTA members like Rapid7 showcase their research into all kinds of cyber threats and techniques.
Traditionally, when we investigate a campaign, the focus is mostly on the code of the file, the inner workings of the malware, and communications towards threat actor-controlled infrastructure. Having a background in forensics, and in particular data forensics, I’m always interested in new ways of looking at and investigating data. New techniques can help proactively track, detect, and hunt for artifacts.
In this blog, which highlights my presentation at the conference, I will dive into the world of Shell Link files (LNK) and Virtual Hard Disk files (VHD). As part of this research, Rapid7 is releasing a new feature in Velociraptor that can parse LNK files, available with the posting of this blog.
VHD files
VHD and its successor VHDX are formats representing a virtual hard disk. They can contain contents usually found on a physical hard drive, such as disk partitions and files. They are typically used as the hard disk of a virtual machine, are built into modern versions of Windows, and are the native file format for Microsoft’s hypervisor, Hyper-V. The format was created by Connectix for their Virtual PC product, known as Microsoft Virtual PC since Microsoft acquired Connectix in 2003. As we will see later, the word “conectix” is still part of the footer of a VHD file.
Why would threat actors use VHD files in their campaigns? Microsoft has a security technology called “Mark of the Web” (MOTW). When files are downloaded from the internet using Windows, they are marked with a special Zone.Identifier NTFS Alternate Data Stream (ADS) whose value is known as the MOTW. MOTW-tagged files are restricted and unable to carry out specific operations. Windows Defender SmartScreen, which compares files with an allowlist of well-known executables, will process executables marked with the MOTW. SmartScreen will stop the execution of the file if it is unknown or untrusted and will alert the user not to run it. Since a VHD file is a virtual hard disk, it can contain files and folders. Files inside a VHD container do not receive the MOTW and therefore bypass these security restrictions.
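For readers who want to inspect the marking themselves: the MOTW lives in a small INI-style stream named Zone.Identifier attached to the downloaded file. A minimal sketch for reading it follows; the Windows path in the comment is a made-up example.

```python
import configparser

def zone_id(ads_text):
    """Parse the contents of a Zone.Identifier alternate data stream
    (an INI fragment) and return the ZoneId; 3 means Internet, i.e. MOTW."""
    cp = configparser.ConfigParser()
    cp.read_string(ads_text.replace("\r\n", "\n"))
    return cp.getint("ZoneTransfer", "ZoneId")

# On Windows, the ADS is opened like a regular file (hypothetical path):
#   with open(r"C:\Users\demo\Downloads\setup.exe:Zone.Identifier") as f:
#       print(zone_id(f.read()))

sample = "[ZoneTransfer]\r\nZoneId=3\r\nReferrerUrl=https://example.com/\r\n"
print(zone_id(sample))  # → 3
```

A file extracted from a mounted VHD simply has no such stream, which is exactly why the container defeats SmartScreen's check.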
Depending on the underlying operating system, the VHD file can be formatted as FAT or NTFS. The great thing about that is that traditional file system forensics can be applied. Think of Master File Table analysis, header/footer analysis, and data carving, to name a few.
Example case:
In the past, we investigated a case where a threat actor was using a VHD file as part of their campaign. The flow of the campaign demonstrates how this attack worked:
After sending a spear-phishing email with a VHD file, the victim would open the VHD file, which auto-mounts in Windows. Next, the MOTW is bypassed and a PDF file with a backdoor is opened to download either the Sednit or Zebrocy malware. The backdoor would then establish a connection with the command-and-control (C2) server controlled by the threat actor.
After retrieving the VHD file, we first mount it read-only so that nothing about the digital evidence can be changed. Next, the Master File Table (MFT) is retrieved and analyzed:
Besides valuable information like creation and last-modification times (always take into consideration that these can be altered on purpose), the MFT showed that two of the files were copied from a system into the VHD file. Another interesting discovery is that the VHD disk contained a RECYCLE.BIN folder with deleted files. That’s great, since depending on the file size of the VHD (the bigger it is, the higher the chance that files have not been overwritten), it is possible to retrieve these deleted files using a technique called “data carving.”
Using PhotoRec, one of the available data carving tools, the VHD file is again mounted read-only and the tool is pointed at this share to attempt to recover the deleted files.
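To illustrate what PhotoRec does under the hood, here is a deliberately simplified carving sketch that scans raw bytes for JPEG start/end markers. Real carvers also validate internal structure and handle fragmentation, which this toy version does not.

```python
def carve_jpegs(raw):
    """Minimal data-carving sketch: find JPEG start-of-image (FF D8 FF) and
    end-of-image (FF D9) markers in a raw disk image and return the byte
    ranges in between. Naive: FF D9 can also occur inside payload data."""
    carved = []
    pos = 0
    while True:
        start = raw.find(b"\xff\xd8\xff", pos)
        if start == -1:
            break
        end = raw.find(b"\xff\xd9", start + 3)
        if end == -1:
            break
        carved.append(raw[start:end + 2])
        pos = end + 2
    return carved
```

Running this over the raw bytes of a mounted image (`carve_jpegs(open("disk.img", "rb").read())`, with a hypothetical file name) recovers intact JPEGs whose clusters were not yet reused.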
After running for a short while, the deleted files could be retrieved and used as part of the investigation. Since this is not relevant to this blog, we continue with the footer analysis.
Footer analysis of a VHD file
The footer, which is often referred to as the trailer, is an addition to the original header that is appended to the end of a file. It is a data structure that resembles a header.
Because a footer by definition comes after the image data, which is typically of variable length, it is never located at a fixed offset from the beginning of a file (unless the data is always the same size); instead, it is situated at a fixed distance from the end of the file. Similar to headers, footers often have a defined size. A rendering application can use a footer’s identification field or magic number, like a header’s, to distinguish it from other data structures in the file.
When we look at the footer of the VHD file, certain interesting fields can be observed:
These values are some of the examples of the data structures that are specified for the footer of a VHD file, but there are also other values like “type of disk” that can be valuable during comparisons of multiple campaigns by an actor.
From the screenshot, we can see that “conectix” is the magic number value of the footer of a VHD file; you can compare it to a small fingerprint. From the other values, we can determine that the actor used a Windows operating system, and we can derive the creation time of the VHD file from the hex value.
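These footer fields are simple enough to pull out with a short script. The sketch below parses the 512-byte footer at the end of a VHD file according to Microsoft's published layout (magic at offset 0, timestamp at offset 24, creator application at 28, creator host OS at 36, disk type at 60); field names are mine, and any sample values are fabricated for illustration.

```python
import struct
from datetime import datetime, timedelta, timezone

# VHD timestamps count seconds since January 1, 2000 00:00:00 UTC
VHD_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)

def parse_vhd_footer(footer):
    """Parse interesting fields from the 512-byte footer of a VHD file."""
    if footer[0:8] != b"conectix":            # the magic number / "fingerprint"
        raise ValueError("magic 'conectix' missing: not a VHD footer")
    timestamp, = struct.unpack(">I", footer[24:28])
    return {
        "created": VHD_EPOCH + timedelta(seconds=timestamp),
        "creator_app": footer[28:32].decode("ascii", "replace"),
        "creator_host_os": footer[36:40].decode("ascii", "replace"),  # b"Wi2k" = Windows
        "disk_type": struct.unpack(">I", footer[60:64])[0],           # 2 = fixed, 3 = dynamic
    }
```

Running this over the last 512 bytes of a sample (`parse_vhd_footer(data[-512:])`) yields the creation time and host OS directly, ready to feed into hunting logic such as a Yara rule.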
From a threat hunting or tracking perspective, these values can be very useful. In the below example, a Yara rule was written to identify the file as a VHD file and secondly the serial number of the hard drive used by the actor:
Shell link files (LNK), aka Shortcut files
A Shell link, also known as a Shortcut, is a data object that houses information used to reach another data object. Windows files with the “LNK” extension are in a format known as the Shell Link Binary File Format. Shell links can also be used by programs that need the capacity to store a reference to a destination file. Shell links are frequently used to facilitate application launching and linking scenarios, such as Object Linking and Embedding (OLE).
LNK files are massively abused in multiple cybercrime campaigns to download next stage payloads or contain code hidden in certain data fields. The data structure specification of LNK files mentions that LNK files store various information, including “optional data” in the “extra data” sections. That is an interesting area to focus on.
Below is a summarized overview of the Extra Data structure:
The ‘Header’ LinkInfo part contains interesting data on the type of drive used, but more importantly it contains the SerialNumber of the hard drive used by the actor when creating the LNK file:
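Extracting that serial number can be automated. The following sketch walks the layout from the [MS-SHLLINK] specification: a 76-byte ShellLinkHeader, an optional LinkTargetIDList, then the LinkInfo structure whose VolumeID holds the DriveSerialNumber. It is a minimal reader for well-formed files, not a full parser.

```python
import struct

def drive_serial(lnk):
    """Return the DriveSerialNumber recorded in a shortcut's
    LinkInfo/VolumeID structure ([MS-SHLLINK]), or None if absent."""
    if struct.unpack("<I", lnk[0:4])[0] != 0x0000004C:    # HeaderSize magic
        raise ValueError("not a shell link file")
    flags, = struct.unpack("<I", lnk[20:24])              # LinkFlags
    pos = 76                                              # fixed ShellLinkHeader size
    if flags & 0x01:                                      # HasLinkTargetIDList
        idlist_size, = struct.unpack("<H", lnk[pos:pos + 2])
        pos += 2 + idlist_size
    if not flags & 0x02:                                  # HasLinkInfo
        return None
    info = pos
    info_flags, = struct.unpack("<I", lnk[info + 8:info + 12])
    if not info_flags & 0x01:                             # VolumeIDAndLocalBasePath
        return None
    vol_off, = struct.unpack("<I", lnk[info + 12:info + 16])
    vol = info + vol_off
    serial, = struct.unpack("<I", lnk[vol + 8:vol + 12])  # DriveSerialNumber
    return f"{serial:08X}"
```

The formatted hex string can then be dropped straight into a Yara rule like the one shown below.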
Other interesting information can be found as well; for example, a value related to the icon used, which in this file contains an interesting string.
Combining again that information, a simple Yara rule can be written for this particular LNK file which might have been used in multiple campaigns:
One last example is to look for the ‘Droid’ values in the Extra Data sections. Droid stands for Domain-Relative Object Identifier. There are two values present in the example file:
The value in these fields translates to the MAC address of the attacker’s system… yes, you read this correctly and may close your open mouth now…
This can also be used to build upon the previous LNK Yara rule, where you could replace the “.\\3.jpg” part with the MAC address value to hunt for LNK files that were created on the particular device with that MAC address.
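The reason the MAC address is recoverable is that these tracker droid values are version-1 UUIDs, whose node field is the MAC address of the machine that generated them. Python's uuid module exposes this directly; the GUID below is a fabricated example, not one from a real campaign.

```python
import uuid

def mac_from_droid(droid):
    """The droid GUIDs in an LNK TrackerDataBlock are UUIDs; for a
    version-1 UUID the node field is the generating machine's MAC address."""
    u = uuid.UUID(droid)
    if u.version != 1:
        return None                       # version-4 (random) droids carry no MAC
    return ":".join(f"{(u.node >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))

# Fabricated example value, not from a real campaign:
print(mac_from_droid("d697a0e8-65c6-11ee-8c99-0242ac120002"))  # → 02:42:ac:12:00:02
```

A version-1 UUID also embeds a 60-bit timestamp, so the same field can additionally date when the shortcut was created on the actor's machine.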
In a recent campaign called “Raspberry Robin”, LNK files were used to distribute the malware. Analyzing the LNK files and using the above investigation technique, the following Yara rule was created:
Velociraptor LNK parser
Based on our research into LNK files, an updated LNK parser was developed by Matt Green from Rapid7 for Velociraptor, our advanced open-source endpoint monitoring, digital forensics, and cyber response platform.
With the parser, multiple LNK files can be processed and information can be extracted to use as an input for Yara rules that can be pushed back into the platform to hunt.
Windows.Forensics.Lnk parses LNK shortcut files using Velociraptor’s built-in binary parser. The artifact outputs fields aligning to Microsoft’s MS-SHLLINK protocol specification and some analysis hints to assist review or detection use cases. Users have the option to search for specific indicators in key fields with regex, or control the definitions for suspicious items to bubble up during parsing.
Some of the default targeted suspicious attributes include:
Large size
Startup path location for auto execution
Environment variable script — environment variable with a common script configured to execute
No target with an environment variable only execution
Suspicious argument size — large sized arguments over 250 characters as default
Arguments have ticks — ticks are common in malicious LNK files
Arguments have environment variables — environment variables are common in malicious LNKs
Arguments have rare characters — look for specific rare characters that may indicate obfuscation
Arguments that have leading spaces — malicious LNK files may have many leading spaces to evade some tools
Arguments that have http strings — LNKs are regularly used as a download cradle
Suspicious arguments — some common malicious arguments observed in field
Suspicious trackerdata hostname
Hostname mismatch with trackerdata hostname
Due to the use of Velociraptor’s binary parser, the artifact is significantly faster than other analysis tools. It can be deployed as part of analysis or at scale as a hunting function using the IOCRegex and/or SuspiciousOnly flag.
Summary
It is worth investigating the characteristics of file types we tend to skip in threat actor campaigns. In this blog I provided a few examples of how artifacts can be retrieved from VHD and LNK files and then used for the creation of hunting logic. As a result of this research, Rapid7 is happy to release a new LNK parser feature in Velociraptor and we welcome any feedback.
This year, Cloudflare officially became a teenager, turning 13 years old. We celebrated this milestone with a series of announcements that benefit both our customers and the Internet community.
From developing applications in the age of AI to securing against the most advanced attacks that are yet to come, Cloudflare is proud to provide the tools that help our customers stay one step ahead.
We hope you’ve had a great time following along and for anyone looking for a recap of everything we launched this week, here it is:
Cloudflare Stream’s LL-HLS support is now in open beta. You can deliver video to your audience faster, reducing the latency a viewer may experience on their player to as little as 3 seconds.
Cloudflare account permissions are now available to all customers, not just Enterprise. In addition, we’ll show you how you can use them and best practices.
Now available: Workers AI, a serverless GPU cloud for AI, Vectorize so you can build your own vector databases, and AI Gateway to help manage costs and observability of your AI applications.
Cloudflare delivers the best infrastructure for next-gen AI applications, supported by partnerships with NVIDIA, Microsoft, Hugging Face, Databricks, and Meta.
Launching Workers AI — AI inference as a service platform, empowering developers to run AI models with just a few lines of code, all powered by our global network of GPUs.
Cloudflare’s vector database, designed to allow engineers to build full-stack, AI-powered applications entirely on Cloudflare's global network — available in Beta.
AI Gateway helps developers have greater control and visibility in their AI apps, so that you can focus on building without worrying about observability, reliability, and scaling. AI Gateway handles the things that nearly all AI applications need, saving you engineering time so you can focus on what you're building.
Developers can now use WebGPU in Cloudflare Workers. Learn more about why WebGPUs are important, why we’re offering them to customers, and what’s next.
Many AI companies are using Cloudflare to build next generation applications. Learn more about what they’re building and how Cloudflare is helping them on their journey.
Cloudflare launches a new product, Hyperdrive, that makes existing regional databases much faster by dramatically speeding up queries that are made from Cloudflare Workers.
D1 is now in open beta, and the theme is “scale”: with higher per-database storage limits and the ability to create more databases, we’re unlocking the ability for developers to build production-scale applications on D1.
Introducing the Browser Rendering API, which enables developers to utilize the Puppeteer browser automation library within Workers, eliminating the need to set up and maintain a serverless browser automation system.
We partnered with Microsoft Edge to provide a fast and secure VPN, right in the browser. Users don’t have to install anything new or understand complex concepts to get the latest in network-level privacy: Edge Secure Network VPN is available on the latest consumer version of Microsoft Edge in most markets, and automatically comes with 5GB of data.
We are revamping the playground that demonstrates the power of Workers, along with new development tooling and the ability to share your playground code and deploy instantly to Cloudflare’s global network.
Engineers from Cloudflare and Vercel have published a draft specification of the connect() sockets API for review by the community, along with a Node.js compatible polyfill for the connect() API that developers can start using.
Announcing new pricing for Cloudflare Workers, where you are billed based on CPU time, and never for the idle time that your Worker spends waiting on network requests and other I/O.
Cloudflare is rolling out post-quantum cryptography support to customers, services, and internal systems to proactively protect against advanced attacks.
Announcing a contribution that helps improve privacy for everyone on the Internet. Encrypted Client Hello, a new standard that prevents networks from snooping on which websites a user is visiting, is now available on all Cloudflare plans.
Cloudflare customers can now scan messages within their Office 365 Inboxes for threats. The Retro Scan will let you look back seven days to see what threats your current email security tool has missed.
Any Cloudflare user, on any plan, can choose specific categories of bots that they want to allow or block, including AI crawlers. We are also recommending a new standard to robots.txt that will make it easier for websites to clearly direct how AI bots can and can’t crawl.
Deep dive into Cloudflare’s approach and ongoing research into detecting novel web attack vectors in our WAF before they are seen by a security researcher.
Deep dive into the fundamental concepts behind the Distributed Aggregation Protocol (DAP) protocol with examples on how we’ve implemented it into Daphne, our open source aggregator server.
We are rolling out post-quantum cryptography support for outbound connections to origins and Cloudflare Workers fetch() calls. Learn more about what we enabled, how we rolled it out in a safe manner, and how you can add support to your origin server today.
Cloudflare’s updated benchmark results regarding network performance plus a dive into the tools and processes that we use to monitor and improve our network performance.
One More Thing
When Cloudflare turned 12 last year, we announced the Workers Launchpad Funding Program – you can think of it like a startup accelerator program for companies building on Cloudflare’s Developer Platform, with no restrictions on your size, stage, or geography.
A refresher on how the Launchpad works: Each quarter, we admit a group of startups who then get access to a wide range of technical advice, mentorship, and fundraising opportunities. That includes our Founders Bootcamp, Open Office Hours with our Solution Architects, and Demo Day. Those who are ready to fundraise will also be connected to our community of 40+ leading global Venture Capital firms.
In exchange, we just ask for your honest feedback. We want to know what works, what doesn’t and what you need us to build for you. We don’t ask for a stake in your company, and we don’t ask you to pay to be a part of the program.
Over the past year, we’ve received applications from nearly 60 different countries. We’ve had a chance to work closely with 50 amazing early and growth-stage startups admitted into the first two cohorts, and have grown our VC partner community to 40+ firms and more than $2 billion in potential investments in startups building on Cloudflare.
Next up: Cohort #3! Between recently wrapping up Cohort #2 (check out their Demo Day!), celebrating the Launchpad’s 1st birthday, and the heaps of announcements we made last week, we thought that everyone could use a little extra time to catch up on all the news – which is why we are extending the deadline for Cohort #3 a few weeks to October 13, 2023. AND we’re reserving 5 spots in the class for those who are already using any of last Wednesday’s AI announcements. Just be sure to mention what you’re using in your application.
So once you’ve had a chance to check out the announcements and pour yourself a cup of coffee, check out the Workers Launchpad. Applying is a breeze — you’ll be done long before your coffee gets cold.
Until next time
That’s all for Birthday Week 2023. We hope you enjoyed the ride, and we’ll see you at our next innovation week!
Over the last twelve months, we have been talking about the new baseline of encryption on the Internet: post-quantum cryptography. During Birthday Week last year we announced that our beta of Kyber was available for testing, and that Cloudflare Tunnel could be enabled with post-quantum cryptography. Earlier this year, we made our stance clear that this foundational technology should be available to everyone for free, forever.
Today, we have hit a milestone after six years and 31 blog posts in the making: we’re starting to roll out General Availability1 of post-quantum cryptography support to our customers, services, and internal systems as described more fully below. This includes products like Pingora for origin connectivity, 1.1.1.1, R2, Argo Smart Routing, Snippets, and so many more.
This is a milestone for the Internet. We don't yet know when quantum computers will have enough scale to break today's cryptography, but the benefits of upgrading to post-quantum cryptography now are clear. Fast connections and future-proofed security are all possible today because of the advances made by Cloudflare, Google, Mozilla, the National Institute of Standards and Technology in the United States, the Internet Engineering Task Force, and numerous academic institutions.
What does General Availability mean? In October 2022 we enabled X25519+Kyber as a beta for all websites and APIs served through Cloudflare. However, it takes two to tango: the connection is only secured if the browser also supports post-quantum cryptography. Starting August 2023, Chrome is slowly enabling X25519+Kyber by default.
The user’s request is routed through Cloudflare’s network (2). We have upgraded many of these internal connections to use post-quantum cryptography, and expect to be done upgrading all of our internal connections by the end of 2024. That leaves as the final link the connection (3) between us and the origin server.
We are happy to announce that we are rolling out support for X25519+Kyber for most inbound and outbound connections as Generally Available, including connections to origin servers and Cloudflare Workers fetch() calls.
The roll-out of support for post-quantum outbound connections, by plan:
Free: Roll-out started. Aiming for 100% by the end of October.
Pro and Business: Aiming for 100% by the end of the year.
Enterprise: Roll-out begins February 2024; 100% by March 2024.
For our Enterprise customers, we will be sending out additional information regularly over the course of the next six months to help prepare you for the roll-out. Pro, Business, and Enterprise customers can skip the roll-out and opt in within your zone today, or opt out ahead of time using an API described in our companion blog post. Before rolling out for Enterprise in February 2024, we will add a toggle on the dashboard to opt out.
With an upgrade of this magnitude, we wanted to focus on our most used products first and then expand outward to cover our edge cases. This process has led us to include the following products and systems in this roll out:
1.1.1.1
AMP
API Gateway
Argo Smart Routing
Auto Minify
Automatic Platform Optimization
Automatic Signed Exchange
Cloudflare Egress
Cloudflare Images
Cloudflare Rulesets
Cloudflare Snippets
Cloudflare Tunnel
Custom Error Pages
Flow Based Monitoring
Health checks
Hermes
Host Head Checker
Magic Firewall
Magic Network Monitoring
Network Error Logging
Project Flame
Quicksilver
R2 Storage
Request Tracer
Rocket Loader
Speed on Cloudflare Dash
SSL/TLS
Traffic Manager
WAF, Managed Rules
Waiting Room
Web Analytics
If a product or service you use is not listed here, we have not started rolling out post-quantum cryptography to it yet. We are actively working on rolling out post-quantum cryptography to all products and services including our Zero Trust products. Until we have achieved post-quantum cryptography support in all of our systems, we will publish an update blog in every Innovation Week that covers which products we have rolled out post-quantum cryptography to, the products that will be getting it next, and what is still on the horizon.
Products we are working on bringing post-quantum cryptography support to soon:
Cloudflare Gateway
Cloudflare DNS
Cloudflare Load Balancer
Cloudflare Access
Always Online
Zaraz
Logging
D1
Cloudflare Workers
Cloudflare WARP
Bot Management
Why now?
As we announced earlier this year, post-quantum cryptography will be included for free in all Cloudflare products and services that can support it. The best encryption technology should be accessible to everyone – free of charge – to help support privacy and human rights globally.
“What was once an experimental frontier has turned into the underlying fabric of modern society. It runs in our most critical infrastructure like power systems, hospitals, airports, and banks. We trust it with our most precious memories. We trust it with our secrets. That’s why the Internet needs to be private by default. It needs to be secure by default.”
Our work on post-quantum cryptography is driven by the thesis that quantum computers that can break conventional cryptography create a problem similar to the Year 2000 bug. We know there is going to be a problem in the future that could have catastrophic consequences for users, businesses, and even nation states. The difference this time is that we don’t know the date and time that this break in the computational paradigm will occur. Worse, any traffic captured today could be decrypted in the future. We need to prepare today to be ready for this threat.
We are excited for everyone to adopt post-quantum cryptography into their systems. To follow the latest developments of our deployment of post-quantum cryptography and third-party client/server support, check out pq.cloudflareresearch.com and keep an eye on this blog.
***
1. We are using a preliminary version of Kyber, NIST’s pick for post-quantum key agreement. Kyber has not been finalized. We expect a final standard to be published in 2024 under the name ML-KEM, which we will then adopt promptly while deprecating support for X25519Kyber768Draft00.
Recently, Rapid7 observed the Fake Browser Update lure tricking users into executing malicious binaries. While analyzing the dropped binaries, Rapid7 determined that a new loader is utilized in order to execute infostealers, including StealC and Lumma, on compromised systems.
The IDAT loader is a new, sophisticated loader that Rapid7 first spotted in July 2023. In earlier versions of the loader, it was disguised as a 7-zip installer that delivered the SecTop RAT. Rapid7 has now observed the loader used to deliver infostealers like StealC, Lumma, and Amadey. It implements several evasion techniques, including Process Doppelgänging, DLL Search Order Hijacking, and Heaven’s Gate. The IDAT loader got its name because the threat actor stores the malicious payload in the IDAT chunk of the PNG file format.
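Since PNG is a simple chunked format, triaging a suspicious image for oversized or trailing IDAT chunks takes only a few lines of stdlib Python. This sketch lists each chunk's type and data; the payload bytes in the test values are stand-ins, not actual loader content.

```python
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def png_chunks(data):
    """Yield (chunk_type, chunk_data) pairs from a PNG file. Each chunk is
    a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC;
    unusual IDAT sizes or counts are a quick hint of an embedded payload."""
    if not data.startswith(PNG_SIG):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIG)
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype.decode("ascii"), data[pos + 8:pos + 8 + length]
        pos += 12 + length            # length + type + data + CRC
```

Summing the IDAT sizes from this iterator and comparing them against what the declared image dimensions could plausibly compress to is a cheap first check before deeper analysis.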
Prior to this technique, Rapid7 observed the threat actors behind the lure utilizing malicious JavaScript files to either reach out to command-and-control (C2) servers or drop the NetSupport Remote Access Trojan (RAT).
The following analysis covers the entire attack flow, which starts from the SocGholish malware and ends with the stolen information in threat actors’ hands.
Technical Analysis
Threat actors (TAs) often stage their attacks in ways that security tools will not detect and that security researchers will have a hard time investigating.
Stage 1 – SocGholish
First observed in the wild as early as 2018, SocGholish was attributed to TA569. Mainly recognized for its initial infection method characterized as “drive-by” downloads, this attack technique involves the injection of malicious JavaScript into compromised yet otherwise legitimate websites. When an unsuspecting individual receives an email with a link to a compromised website and clicks on it, the injected JavaScript will activate as soon as the browser loads the page.
The injected JavaScript investigated by Rapid7 loads an additional JavaScript that will access the final URL when all the following browser conditions are met:
The access originated from the Windows OS
The access originated from an external source
Cookie checks are passed
This prompt falsely presents itself as a browser update, with the added layer of credibility coming from the fact that it appears to originate from the intended domain.
Once the user interacts with the “Update Chrome” button, the browser is redirected to another URL where a binary automatically downloads to the user’s default download folder. After the user double-clicks the fake update binary, it will proceed to download the next stage payload. In this investigation, Rapid7 identified a binary called ChromeSetup.exe, a file name widely used in previous SocGholish attacks.
Stage 2 – MSI Downloader
ChromeSetup.exe downloads and executes the Microsoft Software Installer (MSI) package from: hxxps://ocmtancmi2c5t[.]xyz/82z2fn2afo/b3/update[.]msi.
In similar investigations, Rapid7 observed that the initial dropper executable appearance and file name may vary depending on the user’s browser when visiting the compromised web page. In all instances, the executables contained invalid signatures and attempted to download and install an MSI package.
Rapid7 determined that the MSI package executed with several switches intended to avoid detection:
/qn to avoid an installation UI
/quiet to prevent user interaction
/norestart to prevent the system from restarting during the infection process
When executed, the MSI dropper will write a legitimate VMwareHostOpen.exe executable, multiple legitimate dependencies, and the malicious Dynamic-Link Library (DLL) file vmtools.dll. It will also drop an encrypted vmo.log file, which has a PNG file structure and is later decrypted by the malicious DLL. Rapid7 spotted an additional version of the attack where the MSI dropped a legitimate pythonw.exe, legitimate dependencies, and the malicious DLL file python311.dll. In that case, the encrypted file was named pz.log, though the execution flow remains the same.
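The PNG disguise of these dropped files is straightforward to verify. The sketch below (illustrative, not taken from the sample) checks the 8-byte PNG signature and walks the length-prefixed chunk list to locate an IDAT chunk of the kind the loader abuses as a payload container:

```python
# Illustrative check that a file carries a PNG wrapper: verify the 8-byte
# PNG signature, then walk the length-prefixed chunks to find the first
# IDAT chunk (each chunk is: 4-byte length, 4-byte type, data, 4-byte CRC).
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def find_idat(data):
    """Return the raw bytes of the first IDAT chunk, or None."""
    if not data.startswith(PNG_SIG):
        return None
    pos = 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        if ctype == b"IDAT":
            return data[pos + 8:pos + 8 + length]
        pos += 12 + length  # skip length + type + data + CRC
    return None
```

A file that passes this check can still be a perfectly ordinary image; what made vmo.log suspicious was the encrypted payload hidden behind the marker described below.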
Stage 3 – Decryptor
When executed, the legitimate VMWareHostOpen.exe loads the malicious vmtools.dll from the same directory from which VMWareHostOpen.exe is executed. This technique is known as DLL Search Order Hijacking.
During the execution of vmtools.dll, Rapid7 observed that the DLL loads API libraries from kernel32.dll and ntdll.dll using API hashing and maps them to memory. After the API functions are mapped to memory, the DLL reads the hex string 83 59 EB ED 50 60 E8 and decrypts it using a bitwise XOR operation with the key F5 34 84 C3 3C 0F 8F, revealing the string vmo.log. The file name mimics the Vmo\log directory, where VMware logs are stored.
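This string decryption is easy to reproduce. XORing the 7-byte ciphertext with the equal-length key from the sample recovers the file name:

```python
# Reproducing the string-decryption step described above: XOR the 7-byte
# ciphertext with the equal-length key to recover the target file name.
cipher = bytes.fromhex("8359EBED5060E8")
key = bytes.fromhex("F53484C33C0F8F")

plain = bytes(c ^ k for c, k in zip(cipher, key))
print(plain.decode("ascii"))  # prints "vmo.log"
```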
The DLL then reads the contents from vmo.log into memory and searches for the string …IDAT. The DLL takes 4 bytes following …IDAT and compares them to the hex values of C6 A5 79 EA. If the 4 bytes following …IDAT are equal to the hex values C6 A5 79 EA, the DLL proceeds to copy all the contents following …IDAT into memory.
Once all the data is copied into memory, the DLL attempts to decrypt the copied data using the bitwise XOR operation with key F4 B4 07 9A. Upon additional analysis of other samples, Rapid7 determined that the XOR keys were always stored as 4 bytes following the hex string C6 A5 79 EA.
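Putting the marker check, key extraction, and XOR decryption together, the logic can be sketched as follows. This is a hedged reconstruction: the function and variable names are ours, and the exact offsets in the loader may differ.

```python
# Hedged reconstruction of the payload-location logic: find "IDAT", verify
# the C6 A5 79 EA magic that follows it, read the 4-byte XOR key stored
# after the magic, and decrypt the remaining bytes with a repeating
# 4-byte XOR.
MAGIC = bytes.fromhex("C6A579EA")

def extract_payload(data):
    idx = data.find(b"IDAT")
    while idx != -1:
        rest = data[idx + 4:]
        if rest[:4] == MAGIC:
            xor_key = rest[4:8]  # per the analysis, the key follows the magic
            enc = rest[8:]
            return bytes(b ^ xor_key[i % 4] for i, b in enumerate(enc))
        idx = data.find(b"IDAT", idx + 4)
    return None
```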
Once the DLL decrypts the data in memory, it is decompressed using the RtlDecompressBuffer function. The parameters passed to the function include:
Compression format
Size of compressed data
Size of compressed buffer
Size of uncompressed data
Size of uncompressed buffer
The vmtools.dll DLL utilizes the compression algorithm LZNT1 in order to decompress the decrypted data from the vmo.log file.
After the data is decompressed, the DLL loads mshtml.dll into memory and overwrites its .text section with the decompressed code. After the overwrite, vmtools.dll calls the decompressed code.
Stage 4 – IDAT Injector
Similarly to vmtools.dll, the IDAT loader uses dynamic imports. The IDAT injector then expands the %APPDATA% environment variable by using the ExpandEnvironmentStringsW API call. It creates a new folder under %APPDATA%, naming it with a randomized value derived from the output of the QueryPerformanceCounter API call.
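The naming scheme might look like the following sketch. This is an assumption on our part: the loader’s exact derivation from the counter value is not documented in the analysis, and time.perf_counter_ns merely stands in for QueryPerformanceCounter.

```python
# Hedged sketch of the staging-folder naming: seed a PRNG from a
# high-resolution counter (time.perf_counter_ns standing in for the
# QueryPerformanceCounter output) and emit a random numeric name for
# the new %APPDATA% subfolder. The real derivation may differ.
import random
import time

def staging_dir_name():
    rng = random.Random(time.perf_counter_ns())
    return str(rng.randrange(10_000_000, 100_000_000))  # 8 random digits

print(staging_dir_name())
```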
All the files dropped by the MSI are copied to the newly created folder. The IDAT injector then creates a new instance of VMWareHostOpen.exe from %APPDATA% by using CreateProcessW and exits.
The second instance of VMWareHostOpen.exe behaves the same up until the stage where the IDAT injector code is called from mshtml.dll memory space. The injector immediately starts using the Heaven’s Gate evasion technique, which it relies on for most API calls until the infostealer is fully loaded.
Heaven’s Gate is widely used by threat actors to evade security tools. It refers to a method for executing 64-bit code within a 32-bit process (or vice versa), allowing a 32-bit process to run 64-bit code. In our case, the key steps are switching the process from 32-bit to 64-bit mode by specifying the reserved selector 0x0033 and then executing a far call or far jump, as shown in Figure 8.
Figure 8 – Heaven’s Gate technique implementation
The IDAT injector then expands the %TEMP% environment variable by using the ExpandEnvironmentStringsW API call. It creates a string based on the QueryPerformanceCounter API call output and randomizes its value.
Next, the IDAT loader gets the computer name by calling the GetComputerNameW API, and the output is randomized by using the rand and srand API calls. It uses that randomized value to set a new environment variable by using SetEnvironmentVariableW. This variable is set to a combination of the %TEMP% path and the randomized string created previously.
Now, the new cmd.exe process is executed by the loader. The loader then creates and writes to the %TEMP%\89680228 file, and injects code into the cmd.exe process as follows:
Creates a new memory section inside the remote process by using the NtCreateSection API call
Maps a view of the newly created section to the local malicious process with RW protection by using NtMapViewOfSection API call
Maps a view of the previously created section to a remote target process with RX protection by using NtMapViewOfSection API call
Fills the view mapped in the local process with shellcode by using NtWriteVirtualMemory API call
In our case, the IDAT loader suspends the main thread of the cmd.exe process by using the NtSuspendThread API call and then resumes the thread by using the NtResumeThread API call. After completing the injection, the second instance of VMWareHostOpen.exe exits.
Stage 5 – IDAT Loader
The injected loader code implements the Heaven’s Gate evasion technique in exactly the same way the IDAT injector did. It retrieves the TCBEDOPKVDTUFUSOCPTRQFD environment variable and reads the %TEMP%\89680228 file data into memory. The data is then decrypted by repeatedly XORing it with the 3D ED C0 D3 key.
The decrypted data appears to contain configuration data, including which process the infostealer should be loaded into, which API calls should be dynamically retrieved, additional code, and more. The loader then deletes the initial malicious DLL (vmtools.dll) by using DeleteFileW. Finally, the loader injects the infostealer code into the explorer.exe process by using the Process Doppelgänging injection technique.
The Process Doppelgänging method utilizes the Transactional NTFS feature of the Windows operating system. This feature is designed to ensure data integrity in the event of unexpected errors. For instance, when an application needs to write or modify a file, there’s a risk of data corruption if an error occurs during the write process. To prevent such issues, an application can open the file in transactional mode, perform the modification, and then commit it, thereby preventing any potential corruption. The modification either succeeds entirely or does not commence.
Process Doppelgänging exploits this feature to replace a legitimate file with a malicious one, leading to process injection. The malicious file is created within a transaction, then committed to the legitimate file, and subsequently executed. Process Doppelgänging in our sample was performed by:
Initiating a transaction by using the NtCreateTransaction API call
Creating a new file by using the NtCreateFile API call
Writing to the new file by using the NtWriteFile API call
Writing malicious code into a section of the local process by using the NtCreateSection API call
Discarding the transaction by using the NtRollbackTransaction API call
Running a new instance of the explorer.exe process by using the NtCreateProcessEx API call
Running the malicious code inside the explorer.exe process by using the NtCreateThreadEx API call
If the file created within a transaction is rolled back (instead of committed), but the file section was already mapped into the process memory, the process injection will still be performed.
The final payload injected into the explorer.exe process was identified by Rapid7 as Lumma Stealer.
Throughout the whole attack flow, the malware delays execution by using NtDelayExecution, a technique that is usually used to escape sandboxes.
As previously mentioned, Rapid7 has investigated several IDAT loader samples. The main differences were:
The legitimate software that loads the malicious DLL.
The name of the staging directory created within %APPDATA%.
The process the IDAT injector injects the Loader code to.
The process into which the infostealer/RAT is loaded.
Rapid7 has observed the IDAT loader being used to load the Stealc, Lumma, and Amadey infostealers, as well as the SecTop RAT.
Conclusion
IDAT Loader is a new, sophisticated loader that utilizes multiple evasion techniques in order to execute various commodity malware, including infostealers and RATs. The threat actors behind the Fake Update campaign have been packaging the IDAT Loader into DLLs that are loaded by legitimate programs such as VMwareHost, Python, and Windows Defender.
Rapid7 Customers
For Rapid7 MDR and InsightIDR customers, the following Attacker Behavior Analytics (ABA) rules are currently deployed and alerting on the activity described in this blog:
Attacker Technique – MSIExec loading object via HTTP
Suspicious Process – FSUtil Zeroing Out a File
Suspicious Process – Users Script Spawns Cmd And Redirects Output To Temp File
Suspicious Process – Possible Dropper Script Executed From Users Downloads Directory
Suspicious Process – WScript Runs JavaScript File from Temp Or Download Directory
MITRE ATT&CK Techniques:
Initial Access
Drive-by Compromise (T1189)
SocGholish uses the Drive-by Compromise technique to target users’ web browsers
Defense Evasion
System Binary Proxy Execution: Msiexec (T1218.007)
The ChromeSetup.exe downloader (C9094685AE4851FD5A5B886B73C7B07EFD9B47EA0BDAE3F823D035CF1B3B9E48) downloads and executes an .msi file
Execution
User Execution: Malicious File (T1204.002)
Update.msi (53C3982F452E570DB6599E004D196A8A3B8399C9D484F78CDB481C2703138D47) drops and executes VMWareHostOpen.exe
Defense Evasion
Hijack Execution Flow: DLL Search Order Hijacking (T1574.001)
VMWareHostOpen.exe loads a malicious vmtools.dll (931D78C733C6287CEC991659ED16513862BFC6F5E42B74A8A82E4FA6C8A3FE06)
In a post-pandemic landscape, the interconnectedness of cybersecurity is front and center. Few could say that they were not at least aware of, if not directly affected by, the downstream effects of major breaches that cause impacts felt across economies. One should look at disruptions in the global supply chain as case in point.
So the concept of security that goes from the cradle to the grave is more than just an industry buzz phrase; it is a critical component of securing networks, applications, and devices.
Sadly, in too many cases, cradle to grave security was either not considered at conception or outright ignored. And as a new report released today by Rapid7 principal researcher Deral Heiland points out, even when organizations are able to take steps to mitigate concerns at the grave portion of the life cycle, they don’t.
In Security Implications from Improper De-acquisition of Medical Infusion Pumps, Heiland performs a physical and technical teardown of more than a dozen medical infusion pumps — devices used to deliver and control fluids directly into a patient’s body. Each of these devices was available for purchase on the secondary market, and each one had issues that could compromise their previous organization’s networks.
The reason these devices pose such a risk is a lack of (or lax) process for de-acquisitioning them before they are sold on sites like eBay. In at least eight of the 13 devices used in the study, WiFi PSK access credentials were discovered, offering attackers potential access to health organization networks.
In the report, Heiland calls for systemic changes to policies and procedures for both the acquisition and de-acquisition of these devices. The policies must define ownership and governance of these devices from the moment they enter the building to the moment they are sold on the secondary market. The processes should detail how data should be purged from these devices (and by extension, many others). In the cases of medical devices that are leased, contractual agreements on the purging process and expectations should be made before acquisition.
The ultimate finding is that properly disposing of sensitive information on these devices should be a priority. Purging them of data should not (and in many cases is not) terribly difficult. The issue lies with process and responsibility for the protection of information stored in those devices. And that is a major component of the cradle to grave security concept.
If you would like to read the report it is available here.
Rapid7 is tracking a new, more sophisticated and staged campaign using the Blackmoon trojan, which appears to have originated in November 2022. The campaign is actively targeting various businesses, primarily in the USA and Canada. However, it is not used to steal credentials; instead, it implements different evasion and persistence techniques to drop several unwanted programs and stay in victims’ environments for as long as possible.
Blackmoon, also known as KRBanker, is a banking trojan first spotted in late September 2015 targeting banks in the Republic of Korea. Back in 2015, it was using a “pharming” technique to steal credentials from targeted victims. This technique involves redirecting traffic to a forged website when a user attempts to access one of the banking sites targeted by the cyber criminals. The fake server masquerades as the original site and urges visitors to submit their information and credentials.
Stage 1 – Blackmoon
The Blackmoon trojan was named after the debug string “blackmoon” present in its code:
Blackmoon drops a DLL named RunDllExe.dll into the C:\Windows\Logs folder and implements the Port Monitors persistence technique. Port Monitors is related to the Windows Print Spooler service, spoolsv.exe. When adding a printer port monitor, a user (or, in our case, the attacker) can register an arbitrary DLL that acts as the monitor. There are two ways to add a port monitor: via the registry for persistence, or via the AddMonitor API call for immediate DLL execution.
Our sample implements both: it calls the AddMonitor API to immediately execute RunDllExe.dll:
It also sets a Driver value in HKLM\SYSTEM\CurrentControlSet\Control\Print\Monitors\RunDllExe registry key to the malicious dll path.
Next, the malware adds a shutdown system privilege to the Spooler service by adding SeShutdownPrivilege to the RequiredPrivileges value of HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Spooler registry key.
The malware disables Windows Defender by setting HKLM\SOFTWARE\Policies\Microsoft\Windows Defender\DisableAntiSpyware value to “1”.
It also stops and disables the “Lanman” service (the service that allows a computer to share files and printers with other devices on the network).
To block all incoming RPC and SMB communication, the malware executes the following set of commands:
The malware sets two additional values under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\aspnet_staters: Work and Mining, both set to “1”.
Next, the malware checks if one of the following services exists on the victim computer:
clr_optimization_v3.0.50727_32
clr_optimization_v3.0.50727_64
WinHelpsvcs
Services
Help Service
KuGouMusic
WinDefender
Msubridge
ChromeUpdater
MicrosoftMysql
MicrosoftMssql
Conhost
MicrosotMaims
MicrosotMais
If the service is found, it is disabled (by setting the “Start” value under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\servicename to “4”) or deleted by using the DeleteService API call.
The malware enumerates running processes by using a combination of the CreateToolhelp32Snapshot, Process32First, and Process32Next API calls in order to terminate the service’s process if one is running.
Finally, a PowerShell command is executed to delete the running process’s file, and the malware exits.
Stage 2 – RunDllExe.dll – injector
RunDllExe.dll is executed by the Spooler service and is responsible for injecting the next-stage payload into a newly executed svchost.exe process. The malware implements the Process Hollowing injection technique. The injected code is a C++ file downloader.
Stage 3 – File Downloader
The downloader first checks whether the ‘Work’ and ‘Mining’ values exist and are set under the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\aspnet_staters registry key; if the values do not exist, it will create them and set both to “1”.
Acting as a downloader, this stage of the attack flow checks whether all of the files to be downloaded are already present on the PC (by using the PathFileExistsA API call). If not, the malware sleeps for two minutes before each download and then uses the URLDownloadToFileA API call to download the following files:
After the download, all files except MpMgSvc.dll are executed by the downloader:
Stage 4 – Hook.exe – dropper
Hook.exe drops an additional DLL into the user’s roaming folder (C:\Users\Username\AppData\Roaming\GraphicsPerfSvcs.dll) and creates a new service named GraphicsPerfSvcs, which will be executed automatically at system startup. The service name is almost identical to that of the legitimate GraphicsPerfSvc service, which belongs to the Graphics performance monitor service. Naming services and files similarly to those belonging to the OS is an evasion technique widely used by threat actors.
The dropper starts the created service. It then creates and executes a .vbs script responsible for deleting Hook.exe and the .vbs file itself:
Stage 4.1 – MpMgSvc.exe – spreader
MpMgSvc.exe first creates a new \BaseNamedObjects\Brute_2022 mutex. Being responsible for spreading the malware, it drops Doublepulsar-1.3.1.exe, Eternalblue-2.2.0.exe, Eternalromance-1.4.0.exe, and all the libraries these files require into the C:\Windows\Temp folder.
Then it scans the network for PCs with open ports 3306, 445, and 1433. If any open ports are found, the spreader attempts to install a backdoor using EternalBlue and sends shellcode to inject a DLL with DoublePulsar, as implemented in the Eternal-Pulsar GitHub project.
There are two DLLs dropped, one for the x64 architecture and one for x86. When injected by DoublePulsar, the DLL will download the first-stage Blackmoon malware and follow the same execution stages described in this analysis.
Stage 4.2 – WmiPrvSER.exe – XMRig miner
WmiPrvSER.exe is a classic XMRig Monero miner. Our sample is XMRig version 6.18, and it creates a \BaseNamedObjects\Win__Host mutex on the victim’s host. You can find a full report on XMRig here.
Stage 5 – GraphicsPerfSvcs service – dropper
As mentioned in the previous stage, the GraphicsPerfSvcs service will be started automatically at system startup. Every time it runs, it will check if two of the following files exist:
The service stays up and constantly attempts to read from the URL: hxxp://down.ftp21[.]cc/Update.txt. At the time of the analysis, this URL was down so we were not able to observe its content. However, following the service code, it seems to read the URL content and check if it contains one of the following commands:
[Delete File], [Kill Proccess], or [Delete Service], which will delete a file, kill a process, or delete a service accordingly.
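Based on those strings, the dispatch logic presumably looks something like the sketch below. This is a hypothetical reconstruction — the live Update.txt content was never observed, so the parsing and any arguments to the commands are our guesses. Note that the matching preserves the malware’s own “Proccess” misspelling.

```python
# Hypothetical reconstruction of the C2 command check: scan the downloaded
# Update.txt text for the bracketed command tags observed in the service
# code. "[Kill Proccess]" deliberately keeps the malware's misspelling.
COMMANDS = ("[Delete File]", "[Kill Proccess]", "[Delete Service]")

def pending_commands(update_text):
    """Return the recognized command tags present in the C2 response."""
    return [cmd for cmd in COMMANDS if cmd in update_text]
```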
Stage 6 – Ctfmoon.exe and Traffmonetizer.exe – Traffic Stealers
The GraphicsPerfSvcs service executes two dropped files, Ctfmoon.exe and Traffmonetizer.exe, both of which appear to be potentially unwanted programs (PUPs) in the form of traffic stealers. Both use the “network bandwidth sharing” monetization scheme to make “passive income”.
Ctfmoon.exe is a CLI version of the IPRoyal Pawns application. It receives the user’s email address and password as execution parameters, so that the activity is associated with, and the money credited to, the supplied account. GraphicsPerfSvcs executes the following command line to start IPRoyal Pawns: ctfmoon.exe [email protected] -password=123456Aa. -device-name=Win32 -accept-tos
We can see that the user mentioned in our execution parameters has already made $169:
Traffmonetizer.exe is similar to Ctfmoon.exe and is created by Traffmonetizer. It reads the user account data from a settings.json file dropped in the user’s roaming directory. Our .json file contains the following content:
The analysis in this blog reveals the effort threat actors put into the attack flow: several evasion and persistence techniques, and different approaches to increase their income at the expense of victims’ resources.
MITRE ATT&CK Techniques:
Persistence
Boot or Logon Autostart Execution: Port Monitors (T1547.010)
The Blackmoon trojan (a95737adb2cd7b1af2291d143200a82d8d32a868c64fb4acc542608f56a0aeda) is using the Port Monitors technique to establish persistence on the target host.
Persistence
Create or Modify System Process: Windows Service (T1543.003)
The Hook.exe dropper (1A7A4B5E7C645316A6AD59E26054A95654615219CC03657D6834C9DA7219E99F) creates a new service to establish persistence on the target host.
Defense Evasion
Process Injection: Process Hollowing (T1055.012)
The DLL dropped by Blackmoon (F5D508C816E485E05DF5F58450D623DC6BFA35A2A0682C238286D82B4B476FBB) is using the process hollowing technique to evade endpoint security detection.
Defense Evasion
Impair Defenses: Disable or Modify Tools (T1562.001)
The Blackmoon trojan (a95737adb2cd7b1af2291d143200a82d8d32a868c64fb4acc542608f56a0aeda) disables Windows Defender to evade endpoint security detection.
Lateral Movement
Exploitation of Remote Services (T1210)
The MpMgSvc.exe spreader (72B0DA797EA4FC76BA4DB6AD131056257965DF9B2BCF26CE2189AF3DBEC5B1FC) uses EternalBlue and DoublePulsar to spread in the organization’s environment.
Discovery
Network Share Discovery (T1135)
The MpMgSvc.exe spreader (72B0DA797EA4FC76BA4DB6AD131056257965DF9B2BCF26CE2189AF3DBEC5B1FC) scans the network to discover open SMB ports.
Impact
Resource Hijacking (T1496)
The XMRig miner (ECC5A64D97D4ADB41ED9332E4C0F5DC7DC02A64A77817438D27FC31C69F7C1D3), IPRoyal Pawns traffic stealer (FDD762192D351CEA051C0170840F1D8D171F334F06313A17EBA97CACB5F1E6E1), and Traffmonetizer traffic stealer (2923EACD0C99A2D385F7C989882B7CCA83BFF133ECF176FDB411F8D17E7EF265) are executed to use the victim’s resources.
Impact
Service Stop (T1489)
The Blackmoon trojan (a95737adb2cd7b1af2291d143200a82d8d32a868c64fb4acc542608f56a0aeda) stops update and security product services.
Command and Control
Application Layer Protocol: Web Protocols (T1071.001)
The downloader (E9A83C8811E7D7A6BF7EA7A656041BCD689687F8B23FA7655B28A8053F67BE99) downloads next-stage payloads over the HTTP protocol. The GraphicsPerfSvcs service (5AF88DBDC7F53BA359DDC47C3BCAF3F5FE9BDE83211A6FF98556AF7E38CDA72B) uses the HTTP protocol to get commands from the C&C server.
From 27 to 29 September 2023, we and the University of Cambridge are hosting the WiPSCE International Workshop on Primary and Secondary Computing Education Research for educators and researchers. This year, this annual conference will take place at Robinson College in Cambridge. We’re inviting all UK-based teachers of computing subjects to apply for one of five ‘all expenses paid’ places at this well-regarded annual event.
You could attend WiPSCE with all expenses paid
WiPSCE is where teachers and researchers discuss research that’s relevant to teaching and learning in primary and secondary computing education, to teacher training, and to related topics. You can find more information about the conference, including the preliminary programme, at wipsce.org.
As a teacher at the conference, you will:
Engage with high-quality international research in the field where you teach
Learn ways to use that research to develop your own classroom practice
Find out how to become an advocate in your professional community for research-informed approaches to the teaching of computing.
We are delighted that, thanks to generous funding from a funder, we can offer five free places to UK computing teachers, covering:
The registration fee
Two nights’ accommodation at Robinson College
Up to £500 supply costs paid to your school to cover your teaching
You need to be a currently practising, UK-based teacher of Computing (England), Computing Science (Scotland), ICT or Digital Technologies (N. Ireland), or Computer Science (Wales)
Your headteacher needs to be able to provide written confirmation that they are happy for you to attend WiPSCE
You need to be available to attend the whole conference from Wednesday lunchtime to Friday afternoon
You need to be willing to share what you learn from the conference with your colleagues at school and with your broader teaching community, including through writing an article about your experience and its relevance to your teaching for this blog or Hello World magazine
The application form will ask you for:
Your name and contact details
Demographic and school information
Your teaching experience
A statement of up to 500 words on why you’re applying and how you think your teaching practice, your school and your colleagues will benefit from your attendance at WiPSCE (500 words is the maximum; feel free to be concise)
After the 19 July deadline, we’re aiming to inform you of the outcome of your application on Friday 21 July.
To choose the five successful applicants, we will:
Use the information you share in your form, particularly in your statement
Select applicants from a mix of primary and secondary schools, with a mix of years of computing teaching experience, and from a mix of geographic areas
Join us in strengthening research-informed computing classroom practice
We’d be delighted to receive your application. Being able to facilitate teachers’ attendance at the conference is very much aligned with our approach to research. Both at the Foundation and the Raspberry Pi Computing Education Research Centre, we’re committed to conducting research that’s directly relevant to schools and teachers, and to working in close collaboration with teachers.
We hope you are interested in attending WiPSCE and becoming an advocate for research-informed computing education practice. If your application is unsuccessful, we hope you consider coming along anyway. We’re looking forward to meeting you there. In the meantime, you can keep up with WiPSCE news on Twitter.
Learning materials, teaching approaches, and the curriculum as a whole are three areas where cultural relevance is important.
In our latest research study, funded by Cognizant, we worked with 13 primary school teachers in England on adapting computing lessons to incorporate culturally relevant and responsive principles and practices. Here’s an insight into the workshop we ran with them, and what the teachers and we have taken away from it.
Adapting lesson materials based on culturally relevant pedagogy
In the group of 13 England-based primary school Computing teachers we worked with for this study:
One third were specialist primary Computing teachers, and the other two thirds were class teachers who taught a range of subjects
Some acted as Computing subject lead or coordinator at their school
Most had taught Computing for between three and five years
The majority worked in urban areas of England, at schools with culturally diverse catchment areas
In November 2022, we held a one-day workshop with the teachers to introduce culturally relevant pedagogy and explore how to adapt two six-week units of computing resources.
An example of a collaborative activity from the workshop
The first part of the workshop was a collaborative, discussion-based professional development session exploring what culturally relevant pedagogy is. This type of pedagogy uses equitable teaching practices to:
Draw on the breadth of learners’ experiences and cultural knowledge
Facilitate projects that have personal meaning for learners
Develop learners’ critical consciousness
The rest of the workshop day was spent putting this learning into practice while planning how to adapt two units of computing lessons to make them culturally relevant for the teachers’ particular settings. We used a design-based approach for this part of the workshop, meaning researchers and teachers worked collaboratively as equal stakeholders to decide on plans for how to alter the units.
We worked in four groups, each with three or four teachers and one or two researchers, focusing on one of two units of work from The Computing Curriculum for teaching digital skills: a unit on photo editing for Year 4 (ages 8–9), and a unit about vector graphics for Year 5 (ages 9–10).
We based the workshop around two Computing Curriculum units that cover digital literacy skills.
In order to plan how the resources in these units of work could be made culturally relevant for the participating teachers’ contexts, the groups used a checklist of ten areas of opportunity. This checklist is a result of one of our previous research projects on culturally relevant pedagogy. Each group used the list to identify a variety of ways in which the units’ learning objectives, activities, learning materials, and slides could be adapted. Teachers noted down their ideas and then discussed them with their group to jointly agree a plan for adapting the unit.
By the end of the day, the groups had designed four really creative plans for:
A Year 4 unit on photo editing that included creating an animal to represent cultural identity
A Year 4 unit on photo editing that included creating a collage all about yourself
A Year 5 unit on vector graphics that guided learners to create their own metaverse and then add it to the class multiverse
A Year 5 unit on vector graphics that contextualised the digital skills by using them in online activities and in video games
Outcomes from the workshop
Before and after the workshop, we asked the teachers to fill in a survey about themselves, their experiences of creating computing resources, and their views about culturally relevant resources. We then compared the two sets of data to see whether anything had changed over the course of the workshop.
The workshop was a positive experience for the teachers.
After teachers had attended the workshop, they reported a statistically significant increase in their confidence levels to adapt resources to be culturally relevant for both themselves and others.
Teachers explained that the workshop had increased their understanding of culturally relevant pedagogy and of how it could impact on learners. For example, one teacher said:
“The workshop has developed my understanding of how culturally adapted resources can support pupil progress and engagement. It has also highlighted how contextual appropriateness of resources can help children to access resources.” – Participating teacher
Some teachers also highlighted how important it had been to talk to teachers from other schools during the workshop, and how they could put their new knowledge into practice in the classroom:
“The dedicated time and value added from peer discourse helped make this authentic and not just token activities to check a box.” – Participating teacher
“I can’t wait to take some of the work back and apply it to other areas and subjects I teach.” – Participating teacher
What you can expect to see next from this project
After our research team made the adaptations to the units set out in the four plans made during the workshop, the teachers delivered the adapted units to more than 500 Year 4 and 5 pupils. We visited some of the teachers’ schools to see the units being taught, and we have interviewed all the teachers about their experience of delivering the adapted materials. We will analyse this observational and interview data, together with additional survey responses, and share the results over the coming months.
As part of the project, we observed teachers delivering the adapted units to their learners.
In our next blog post about this work, we will delve into the fascinating realm of parental attitudes to culturally relevant computing, and we’ll explore how embracing diversity in the digital landscape is shaping the future for both children and their families.
We’ve also written about this professional development activity in more detail in a paper to be published at the UKICER conference in September, and we’ll share the paper once it’s available.
Finally, we are grateful to Cognizant for funding this academic research, and to our cohort of primary computing teachers for their enthusiasm, energy, and creativity, and their commitment to this project.
Developers today do more than just write and ship code—they’re expected to navigate a number of tools, environments, and technologies, including the new frontier of generative artificial intelligence (AI) coding tools. But the most important thing for developers isn’t story points or the speed of deployments. It’s the developer experience, which determines how efficiently and productively developers can exceed standards, enter a flow state, and drive impact.
I say this not only as GitHub’s chief product officer, but as a long-time developer who has worked across every part of the stack. Decades ago, when I earned my master’s in mechanical engineering, I became one of the first technologists to apply AI in the lab. Back then, it would take our models five days to process our larger datasets—which is striking considering the speed of today’s AI models. I yearned for tools that would make me more efficient and shorten my time to production. That’s why I’m passionate about developer experience (DevEx) and why I’ve made it my focus.
Amid the rapid advancements in generative AI, we wanted to get a better understanding from developers about how new tools—and current workflows—are impacting the overall developer experience. As a starting point, we focused on some of the biggest components of the developer experience: developer productivity, team collaboration, AI, and how developers think they can best drive impact in enterprise environments.
To do so, we partnered with Wakefield Research to survey 500 U.S.-based developers at enterprise companies. In the following report, we’ll show how organizations can remove barriers to help enterprise engineering teams drive innovation and impact in this new age of software development. Ultimately, the way to innovate at scale is to empower developers by improving their productivity, increasing their satisfaction, and enabling them to do their best work—every day. After all, there can be no progress without developers who are empowered to drive impact.
Inbal Shani, Chief Product Officer // GitHub
Learn how generative AI is changing the developer experience
Discover how generative AI is changing software development in a pre-recorded session from GitHub.
With this survey, we wanted to better understand the typical experience for developers—and identify key ways companies can empower their developers and achieve greater success.
One big takeaway: It starts with investing in a great developer experience. And collaboration, as we learned from our research, is at the core of how developers want to work and what makes them most productive, satisfied, and impactful.
DevEx is a formula that takes into account:
How simple and fast it is for a developer to implement a change on a codebase—or be productive.
How frictionless it is to move from idea through production to impact.
How positively or negatively the work environment, workflows, and tools affect developer satisfaction.
Collaboration (C) is the multiplier across all three, and across the entire developer experience.
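As a rough illustration only (this is not an official GitHub formula; the scoring scale, weights, and multiplier here are invented assumptions), the three inputs with collaboration as a multiplier could be sketched like this:

```python
from dataclasses import dataclass

@dataclass
class DevEx:
    """Toy model of the developer experience 'formula' described above.

    All inputs are illustrative 0-10 self-reported scores; the equal
    weighting and the 0-1 collaboration multiplier are assumptions,
    not survey findings.
    """
    productivity: float   # how simple/fast it is to implement a change
    impact: float         # how frictionless idea -> production is
    satisfaction: float   # how the environment, workflows, and tools feel
    collaboration: float  # the multiplier across all three (0-1 scale)

    def score(self) -> float:
        base = (self.productivity + self.impact + self.satisfaction) / 3
        return base * self.collaboration

# A team with strong fundamentals but poor collaboration scores lower
# than a slightly weaker team that collaborates well.
siloed = DevEx(productivity=9, impact=9, satisfaction=9, collaboration=0.5)
collaborative = DevEx(productivity=7, impact=7, satisfaction=7, collaboration=1.0)
print(siloed.score(), collaborative.score())  # 4.5 7.0
```

The point of the sketch is the multiplier: because collaboration scales everything else, improving it lifts productivity, impact, and satisfaction together.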
For leaders, developer experience is about creating a collaborative environment where developers can be their most productive, impactful, and satisfied at work. For developers, collaboration is one of the most important parts of the equation.
Current performance metrics fall short of developer expectations
Developers say performance metrics don’t meet expectations
The way developers are currently evaluated doesn’t align with how they think their performance should be measured.
For instance, the developers we surveyed say they’re currently measured by the number of incidents they resolve. But developers believe that how they handle those bugs and issues is more important to performance. This aligns with their belief that code quality, not code quantity, should remain a top performance metric.
Developers also believe collaboration and communication should be just as important as code quality in terms of performance measures. Their ability to collaborate and communicate with others is essential to their job, but only 33% of developers report that their companies use it as a performance metric.
Metrics currently used to measure performance, compared with metrics developers think should be used to measure their performance.
More than output quantity and efficiency, code quality and collaboration are the most important performance metrics, according to the developers we surveyed.
The top-ranked responses for what developers say their teams are working on the most include writing code and finding and fixing security vulnerabilities.
Developers want more opportunities to upskill and drive impact
When developers are asked what makes a positive impact on their workday, they rank learning new skills (43%), getting feedback from end users (39%), automated tests (38%), and designing solutions to novel problems (36%) as top contenders.
The top tasks developers say positively impact their workdays.
But developers say they’re spending most of their time writing code and tests, then waiting for that code to be reviewed or builds and tests to be executed.
On a typical day, the enterprise developers we surveyed report their teams are busy with a variety of tasks, including writing code, fixing security vulnerabilities, and getting feedback from end users, among other things. Developers also report that they spend a similar amount of time across these tasks, indicating that they’re stretched thin throughout the day.
The tasks developers say they spend the most time working on each day.
Notably, developers say they spend the same amount of time waiting for builds and tests as they do writing new code.
This suggests that wait times for builds and tests are still a persistent problem despite investments in DevOps tools over the past decade.
Developers also continue to face obstacles, such as waiting on code reviews, builds, and test runs, which can hinder their ability to learn new skills and design solutions to novel problems. Our research suggests that these factors can have the biggest impact on their overall satisfaction.
Developers want feedback from end users, but face challenges
Developers say getting feedback from end users (39%) is the second-most important thing that positively impacts their workdays—but it’s often challenging for development teams to get that feedback directly.
Product managers and marketing teams often act as intermediaries, making it difficult for developers to directly receive end-user feedback.
Developers would ideally receive feedback from automated and validation tests to improve their work, but sometimes these tests are sent to other teams before being handed off to engineering teams.
The top two daily tasks for development teams include writing code (32%) and finding and fixing security vulnerabilities (31%).
This shows the increased importance developers have placed on security and underscores how companies are prioritizing security.
It also demonstrates the critical role that enterprise development teams play in meeting policy and board edicts around security.
The bottom line
Developers want to upskill, design solutions, get feedback from end users, and be evaluated on their communication skills. However, wait times on builds and tests, as well as the current performance metrics they’re evaluated on, are getting in the way.
Collaboration is the cornerstone of the developer experience
Developers thrive in collaborative environments
In our survey of enterprise engineers, developers say they work with an average of 21 other developers on a typical project—and 52% report working with other teams daily or weekly. Notably, they rank regular touchpoints as the most important factor for effective collaboration.
Enterprise developers work with an average of 21 other developers, often collaborating with other teams on a daily or weekly cadence.
But developers also have a holistic view of collaboration—it’s defined not only by talking and meeting with others, but also by uninterrupted work time, access to fully configured developer environments, and formal mentor-mentee relationships.
Designated blocks of time with no team communication give developers the time and space to write code and work towards team goals.
Access to fully configured developer environments promotes consistency throughout the development process. It also helps developers collaborate faster and avoid hearing the infamous line, “But it worked on my machine.”
Mentorships can help developers upskill and build interpersonal skills that are essential in a collaborative work environment.
It’s important to note these factors can also negatively impact a developer’s work day—which suggests that ineffective meetings can serve to distract rather than help developers (something we’ve found in previous research).
Our survey indicates the factors most important to effective collaboration are so critical that when they’re not done effectively, they have a noticeable, negative impact on a developer’s work.
The tasks developers say most often have a negative impact on their workday experience.
Developers work with an average of 21 people on any given project. They need the time and tools for success—including regular touchpoints, heads-down time, access to fully-configured dev environments, and formal mentor-mentee relationships.
We wanted to learn more about how developers collaborate
As developer experience continues to be defined, so, too, will successful developer collaboration. Too many pings and messages can affect flow, but there’s still a need to stay in touch. In our survey, developers say effective collaboration results in improved test coverage and faster, cleaner, more secure code writing—which are best practices for any development team. This shows that when developers work effectively with others, they believe they build better and more secure software.
Developers widely view effective collaboration as helping to improve what they ship and how often they ship it.
Developers we surveyed believe collaboration and communication—along with code quality—should be the top priority for evaluation.
From DevOps to agile methodologies, developers and the greater business world have been talking about the importance of collaboration for a long time.
But developers are still not being measured on it.
The metrics that developers think their managers should use to evaluate their performance and productivity.
The takeaway: Companies and engineering managers should encourage regular team communication and set time to check in (especially in remote environments), but respect developers’ need for focused work time.
Developers believe that effective and regular touchpoints with their colleagues are critical for effective team collaboration.
4 tips for engineering managers to improve collaboration
At GitHub, our researchers, developers, product teams, and analysts are dedicated to studying and improving developer productivity and satisfaction. Here are their tips for engineering leaders who want to improve collaboration among developers:
Make collaboration a goal in performance objectives. This builds the space and expectation that people will collaborate. This could be in the form of lunch and learns, joint projects, etc.
Define and scope what collaboration looks like in your organization. Let people know when they’re being informed about something vs. being consulted about something. A matrix outlining roles and responsibilities helps define each person’s role and is something GitHub teams have implemented.
Give developers time to converse and get to know one another. In particular, remote or hybrid organizations need to dedicate a portion of a developer’s time and virtual space to building relationships. Check out the GitHub guides to remote work.
Identify principal and distinguished engineers. Academic research supports the positive impact of change agents in organizations—and how they should be the people who are exceptionally great at collaboration. It’s a matter of identifying your distinguished engineers and elevating them to a place where they can model desired behaviors.
The bottom line
Effective developer collaboration improves code quality and should be a performance measure. Regular touchpoints, heads-down time, access to fully configured dev environments, and formal mentor-mentee relationships result in improved test coverage and faster, cleaner, more secure code writing.
AI improves individual performance and team collaboration
Developers are already using AI coding tools at work
A staggering 92% of U.S.-based developers working in large companies report using an AI coding tool either at work or in their personal time—and 70% say they see significant benefits to using these tools.
AI is here to stay—and it’s already transforming how developers approach their day-to-day work. That makes it critical for businesses and engineering leaders to adopt enterprise-grade AI tools to avoid their developers using non-approved applications. Companies should also establish governance standards for using AI tools to ensure that they are used ethically and effectively.
Almost all developers are already using AI coding tools at and outside of work.
Almost all (92%) developers use AI coding tools at work—and a majority (67%) have used these tools in both a work setting and during their personal time. Curiously, only 6% of developers in our survey say they solely use these tools outside of work.
Developers believe AI coding tools will enhance their performance
With most developers experimenting with AI tools in the workplace, our survey results suggest it’s not just idle interest leading developers to use AI. Rather, it’s a recognition that AI coding tools will help them meet performance standards.
In our survey, developers say AI coding tools can help them meet existing performance standards with improved code quality, faster outputs, and fewer production-level incidents. They also believe that these metrics should be used to measure their performance beyond code quantity.
Developers widely think that AI coding tools will layer into their existing workflows and bring greater efficiencies—but they do not think AI will change how software is made.
Around one-third of developers report that their managers currently assess their performance based on the volume of code they produce—and an equal number anticipate that this will persist when they start using AI-based coding tools.
Notably, the quantity of code a developer produces may not necessarily correspond to its business value.
Stay smart. With the increase of AI tooling being used in software development—which often contributes to code volume—engineering leaders will need to ask whether measuring code volume is still the best way to measure productivity and output.
Developers think AI coding tools will lead to greater team collaboration
Beyond improving individual performance, more than 4 in 5 developers surveyed (81%) say AI coding tools will help increase collaboration within their teams and organizations.
In fact, developers identify security reviews, planning, and pair programming as the most significant points of collaboration, and as the tasks development teams are expected to, and should, work on with the help of AI coding tools. This also indicates that code and security reviews will remain important as developers increase their use of AI coding tools in the workplace.
Developers think their teams will need to become more collaborative as they start using AI coding tools.
Sometimes, developers can do the same thing with one line or multiple lines of code. Even still, one-third of developers in our survey say their managers measure their performance based on how much code they produce.
Notably, developers believe AI coding tools will give them more time to focus on solution design. This has direct organizational benefits and means developers believe they’ll spend more time designing new features and products with AI instead of writing boilerplate code.
Developers are already using generative AI coding tools to automate parts of their workflow, which frees up time for more collaborative projects like security reviews, planning, and pair programming.
Developers believe that AI coding tools will help them focus on higher-value problem solving.
Developers think AI increases productivity and prevents burnout
Not only can AI coding tools help improve overall productivity, but they can also provide upskilling opportunities to help create a smarter workforce according to the developers we surveyed.
57% of developers believe AI coding tools help them improve their coding language skills—the top benefit they see. Beyond acting as an upskilling aid, developers say AI coding tools can also help reduce cognitive effort, and since mental capacity and time are both finite resources, 41% of developers believe that AI coding tools can help prevent burnout.
In previous research we conducted, 87% of developers reported that the AI coding tool GitHub Copilot helped them preserve mental effort while completing more repetitive tasks. This shows that AI coding tools allow developers to preserve cognitive effort and focus on more challenging and innovative aspects of software development or research and development.
AI coding tools help developers upskill while they work. Across our survey, developers consistently rank learning new skills as the number one contributor to a positive workday. But 30% also say learning and development can have a negative impact on their overall workday, which suggests some developers view learning and development as adding more work to their workdays. Notably, developers say the top benefit of AI coding tools is learning new skills—and these tools can help developers learn while they work, instead of making learning and development an additional task.
AI is improving the developer experience across the board
Developers in our survey suggest they can better meet standards around code quality, completion time, and the number of incidents when using AI coding tools—all of which are measures developers believe are key areas for evaluating their performance.
AI coding tools can also help reduce the likelihood of coding errors and improve the accuracy of code—which ultimately leads to more reliable software, increased application performance, and better performance numbers for developers. As AI technology continues to advance, it is likely that these coding tools will have an even greater impact on developer performance and upskilling.
AI coding tools are layering into existing developer workflows and creating greater efficiencies
Developers believe that AI coding tools will increase their productivity—but our survey suggests that developers don’t think these tools are fundamentally altering the software development lifecycle. Instead, developers suggest they’re bringing greater efficiencies to it.
The use of automation and AI has been a part of the developer workflow for a considerable amount of time, with developers already utilizing a range of automated and AI-powered tools, such as machine learning-based security checks and CI/CD pipelines.
Rather than completely overhauling operations, these tools create greater efficiencies within existing workflows, and that frees up more time for developers to concentrate on developing solutions.
The bottom line
Almost all developers (92%) are using AI coding tools at work—and they say these tools not only improve day-to-day tasks but enable upskilling opportunities, too. Developers see material benefits to using AI tools, including improved performance and coding skills, as well as increased team collaboration.
The path forward
Developer satisfaction, productivity, and organizational impact are all positioned to get a boost from AI coding tools—and that will have a material impact on the overall developer experience.
With 92% of developers already saying they use AI coding tools at work and in their personal time, it’s clear AI is here to stay. 70% of the developers we surveyed say they already see significant benefits when using AI coding tools, and 81% expect AI coding tools to make their teams more collaborative—a net benefit for companies looking to improve both developer velocity and the developer experience.
Notably, 57% of developers believe that AI could help them upskill—and hold the potential to build learning and development into their daily workflow. With all of this in mind, technical leaders should start exploring AI as a solution to improve satisfaction, productivity, and the overall developer experience.
In addition to exploring AI tools, here are three takeaways engineering and business leaders should consider to improve the developer experience:
Help your developers enter a flow state with tools, processes, and practices that help them be productive, drive impact, and do creative and meaningful work.
Empower collaboration by breaking down organizational silos and providing developers with the opportunity to communicate efficiently.
Make room for upskilling within developer workflows through key investments in AI to help your organization experiment and innovate for the future.
Methodology
This report draws on a survey conducted online by Wakefield Research on behalf of GitHub from March 14, 2023 through March 29, 2023 among 500 non-student, U.S.-based developers who are not managers and work at companies with 1,000-plus employees. For a complete survey methodology, please contact [email protected].
Every day, most of us both consume and create data. For example, we interpret data from weather forecasts to predict our chances of good weather for a special occasion, and we create data as our carbon footprint leaves a trail of energy consumption information behind us. Data is important in our lives, and countries around the world are expanding their school curricula to teach the knowledge and skills required to work with data, including at primary (K–5) level.
Kate Farrell and Prof. Judy Robertson
In our most recent research seminar, attendees heard about a research-based initiative called Data Education in Schools. The speakers, Kate Farrell and Professor Judy Robertson from the University of Edinburgh, Scotland, shared how this project aims to empower learners to develop data literacy skills and succeed in a data-driven world.
“Data literacy is the ability to ask questions, collect, analyse, interpret and communicate stories about data.”
– Kate Farrell & Prof. Judy Robertson
Being a data citizen
Scotland’s national curriculum does not explicitly mention data literacy, but the topic is embedded in many subjects such as Maths, English, Technologies, and Social Studies. Teachers in Scotland, particularly in primary schools, have the flexibility to deliver learning in an interdisciplinary way through project-based learning. Therefore, the team behind Data Education in Schools developed a set of cross-curricular data literacy projects. Educators and education policy makers in other countries who are looking to integrate computing topics with other subjects may also be interested in this approach.
Data citizens have skills they need to thrive in a world shaped by digital technology.
The Data Education in Schools projects are aimed not just at giving learners skills they may need for future jobs, but also at equipping them as data citizens in today’s world. A data citizen can think critically, interpret data, and share insights with others to effect change.
Kate and Judy shared an example of data citizenship from a project they had worked on with a primary school. The learners gathered data about how much plastic waste was being generated in their canteen. They created a data visualisation in the form of a giant graph of types of rubbish on the canteen floor and presented this to their local council.
Sorting food waste from lunch by type of material
As a result, the council made changes that reduced the amount of plastic used in the canteen. This shows how data citizens are able to communicate insights from data to influence decisions.
A cycle for data literacy projects
Across its projects, the Data Education in Schools initiative uses a problem-solving cycle called the PPDAC cycle. This cycle is a useful tool for creating educational resources and for teaching, as you can use it to structure resources and to focus on the specific skills learners need to develop.
The PPDAC data problem-solving cycle
The five stages of the cycle are:
Problem: Identifying the problem or question to be answered
Plan: Deciding what data to collect or use to answer the question
Data: Collecting, cleaning, and organising the data
Analysis: Preparing, modelling, and visualising the data, e.g. in a graph or pictogram
Conclusion: Reviewing what has been learned about the problem and communicating this with others
Smaller data literacy projects may focus on one or two stages within the cycle so learners can develop specific skills or build on previous learning. A large project usually includes all five stages, and sometimes involves moving backwards — for example, to refine the problem — as well as forwards.
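As a sketch of how the cycle might look in code, here is a hypothetical walk through the canteen-waste project described above, with each PPDAC stage marked in comments (the data values are invented for illustration):

```python
from collections import Counter

# Problem: how much of the canteen's lunchtime waste is plastic?
# Plan: record the material of every item thrown away during one lunch.
# Data: the (invented) items collected, one label per piece of rubbish.
rubbish = ["plastic", "paper", "plastic", "food", "plastic",
           "metal", "plastic", "paper", "food", "plastic"]

# Analysis: tally the materials and work out plastic's share.
counts = Counter(rubbish)
plastic_share = counts["plastic"] / len(rubbish)

# Conclusion: communicate the insight to others (here, just print it).
print(f"Plastic made up {plastic_share:.0%} of {len(rubbish)} items")
```

In the classroom the “visualisation” was a giant graph on the canteen floor rather than a chart on a screen, but the stages map the same way.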
Data literacy for primary school learners
At primary school, the aim of data literacy projects is to give learners an intuitive grasp of what data looks like and how to make sense of graphs and tables. Our speakers gave some great examples of playful approaches to data. This can be helpful because younger learners may benefit from working with tangible objects, e.g. LEGO bricks, which can be sorted by their characteristics. Kate and Judy told us about one learner who collected data about their clothes and drew the results in the form of clothes on a washing line — a great example of how tangible objects also inspire young people’s creativity.
As learners get older, they can begin to work with digital data, including data they collect themselves using physical computing devices such as BBC micro:bit microcontrollers or Raspberry Pi computers.
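For example, temperature readings logged by a device such as a BBC micro:bit can be exported as a CSV file and then summarised on any computer. This is a hypothetical sketch (the column names and readings are invented), assuming one timestamped reading per row:

```python
import csv
import io

# Hypothetical CSV export from a classroom data-logging project:
# one timestamped temperature reading per row.
csv_text = """time_s,temp_c
0,19
60,20
120,22
180,21
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))
temps = [float(row["temp_c"]) for row in rows]
print(f"{len(temps)} readings, average {sum(temps) / len(temps):.1f} C")
```

A summary like this could feed the Analysis stage of the PPDAC cycle before learners draw or build their own visualisation of the results.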
Free resources for primary (and secondary) schools
For many attendees, one of the highlights of the seminar was seeing the range of high-quality teaching resources for learners aged 3–18 that are part of the Data Education in Schools project. These include:
Data 101 videos: A set of 11 videos to help primary and secondary teachers understand data literacy better.
Lesson resources: Lots of projects to develop learners’ data literacy skills. These are mapped to the Scottish primary and secondary curriculum, but can be adapted for use in other countries too.
More resources are due to be published later in 2023, including a set of prompt cards to guide learners through the PPDAC cycle, a handbook for teachers to support the teaching of data literacy, and a set of virtual data-themed escape rooms.
You may also be interested in the units of work on data literacy skills that are part of The Computing Curriculum, our complete set of classroom resources to teach computing to 5- to 16-year-olds.
Join our next seminar on primary computing education
At our next seminar, we welcome Aim Unahalekhaka from Tufts University, USA, who will share research about a rubric to evaluate young learners’ ScratchJr projects. If you have a tablet with ScratchJr installed, make sure to have it available to try out some activities. The seminar will take place online on Tuesday 6 June at 17.00 UK time; sign up now so you don’t miss out.
To find out more about connecting research to practice for primary computing education, you can see a list of our upcoming monthly seminars on primary (K–5) teaching and learning and watch the recordings of previous seminars in this series.
Broadening participation and finding new entry points for young people to engage with computing is part of how we pursue our mission here at the Raspberry Pi Foundation. It was also the focus of our March online seminar, led by our own Dr Bobby Whyte. In this third seminar of our series on computing education for primary-aged children, Bobby presented his work on ‘designing multimodal composition activities for integrated K-5 programming and storytelling’. In this research he explored the integration of computing and literacy education, and the implications and limitations for classroom practice.
Motivated by challenges Bobby experienced first-hand as a primary school teacher, his two studies on the topic contribute to the body of research aiming to make computing less narrow and difficult. In this work, Bobby integrated programming and storytelling as a way of making the computing curriculum more applicable, relevant, and contextualised.
Critically for computing educators and researchers in the area, Bobby explored how theories related to ‘programming as writing’ translate into practice, and what the implications of designing and delivering integrated lessons in classrooms are. While the two studies described here took place in the context of UK schooling, we can learn universal lessons from this work.
What is multimodal composition?
In the seminar Bobby made a distinction between applying computing to literacy (or vice versa) and true integration of programming and storytelling. To achieve true integration in the two studies he conducted, Bobby used the idea of ‘multimodal composition’ (MMC). A multimodal composition is defined as “a composition that employs a variety of modes, including sound, writing, image, and gesture/movement [… with] a communicative function”.
Storytelling comes together with programming in a multimodal composition as learners create a program to tell a story where they:
Decide on content and representation (the characters, the setting, the backdrop)
Structure text they’ve written
Use technical aspects (e.g. motion blocks) to achieve effects (e.g. tension) for narrative purposes
Defining multimodal composition (MMC) for a visual programming context
Multimodality for programming and storytelling in the classroom
To investigate the use of MMC in the classroom, Bobby started by designing a curriculum unit of lessons. He mapped the unit’s MMC activities to specific storytelling and programming learning objectives. The MMC activities were designed using design-based research, an approach in which something is designed and tested iteratively in real-world contexts. In practice that means Bobby collaborated with teachers and students to analyse, evaluate, and adapt the unit’s activities.
Mapping of the MMC activities to storytelling and programming learning objectives
The first of two studies to explore the design and implementation of MMC activities was conducted with 10 K-5 students (age 9 to 11) and showed promising results. All students approached the composition task multimodally, using multiple representations for specific purposes. In other words, they conveyed different parts of their stories using either text, sound, or images.
Bobby found that broadcast messages and loops were the least used blocks among the group. As a consequence, he modified the curriculum unit to include additional scaffolding and instructional support on how and why the students might embed these elements.
Bobby modified the classroom unit based on findings from his first study
In the second study, the MMC activities were evaluated in a classroom of 28 K-5 students led by one teacher over two weeks. Findings indicated that students appreciated the longer multi-session project. The teacher reported being satisfied with the project work the learners completed and the skills they practised. The teacher also further integrated and adapted the unit into their classroom practice after the research project had been completed.
How might you use these research findings?
Factors that impacted the integration of storytelling and programming included the teacher’s confidence to teach programming as well as the teacher’s ability to differentiate between students and what kind of support they needed depending on their previous programming experience.
In addition, there are considerations regarding the curriculum. The school where the second study took place considered the activities in the unit to be literacy-light, as the English literacy curriculum is ‘text-heavy’ and the addition of multimodal elements ‘wastes’ opportunities to produce stories that are more text-based.
Bobby’s research indicates that MMC provides useful opportunities for learners to simultaneously pursue storytelling and programming goals, and the curriculum unit designed in the research proved adaptable for the teacher to integrate into their classroom practice. However, Bobby cautioned that there’s a need to carefully consider both the benefits and trade-offs when designing cross-curricular integration projects in order to ensure a fair representation of both subjects.
Can you see an opportunity for integrating programming and storytelling in your classroom? Let us know your thoughts or questions in the comments below.
Join our next seminar on primary computing education
At our next seminar, we welcome Kate Farrell and Professor Judy Robertson (University of Edinburgh). This session will introduce you to how data literacy can be taught in primary and early-years education across different curricular areas. It will take place online on Tuesday 9 May at 17.00 UK time. Don’t miss out: sign up now.
To find out more about connecting research to practice for primary computing education, you can join our upcoming monthly seminars on primary (K–5) teaching and learning and watch the recordings of previous seminars in this series.
It would have been really cool to combine those two words to make “inundata,” but it would have been disastrous for SEO purposes. It’s all meant to kick off a conversation about the state of security organizations with regard to threat intelligence. There are several key challenges to overcome on the road to clarity in threat intelligence operations and enabling actionable data.
For the commissioned study, Forrester conducted interviews with four Rapid7 customers and collated their responses into the form of one representative organization and its experiences after implementing Rapid7’s threat intelligence solution, Threat Command. Interviewees noted that prior to utilizing Threat Command, lack of visibility and unactionable data across legacy systems were hampering efforts to innovate in threat detection. The study stated:
“Interviewees noted that there was an immense amount of data to examine with their previous solutions and systems. This resulted in limited visibility into the potential security threats to both interviewees’ organizations and their customers. The data the legacy solutions provided was confusing to navigate. There was no singular accounting of assets or solution to provide curated customizable information.”
A key part of that finding is that limited visibility can turn into potential liabilities for an organization’s customers – like the SonicWall attack a couple of years ago. These kinds of incidents can cause immediate pandemonium within the organizations of their downstream customers.
In this same scenario, lack of visibility can also be disastrous for the supply chain. Instead of affecting end-users of one product, now there’s a whole network of vendors and their end-users who could be adversely affected by lack of visibility into threat intelligence originating from just one organization. With greater data visibility through a single pane of glass and consolidating information into a centralized asset list, security teams can begin to mitigate visibility concerns.
Time-consuming processes for investigation and analysis
Rapid7 customers interviewed for the study also felt that their legacy threat intelligence solutions forced teams to “spend hours manually searching through different platforms, such as a web-based Git repository or the dark web, to investigate all potential threat alerts, many of which were irrelevant.”
Because of these inefficiencies, additional and unforeseen work was created on the backend, along with what we can assume were many overstretched analysts. How can organizations, then, gain back – and create new – efficiencies? First, alert context is a must. With Threat Command, security organizations can:
Receive actionable alerts categorized by severity, type (phishing, data leakage), and source (social media, black markets).
Implement alert automation rules based on your specific criteria so you can precisely customize and fine-tune alert management.
Accelerate alert triage and shorten investigation time by leveraging Threat Command Extend™ (browser extension) as well as 24x7x365 availability of Rapid7 expert analysts.
By leveraging these features, the study’s composite organization was able to surface far more actionable alerts and see faster remediation processes. It saved $302,000 over three years by avoiding the cost of hiring an additional security analyst.
Pivoting away from a constant reactive approach to cyber incidents
When it comes to security, no one ever aims for an after-the-fact approach. But sometimes a SOC may realize that’s the position it’s been in for quite some time. Nothing good will come from that for the business or anyone on the security team. Indeed, interviewees in the study supported this perspective:
“Legacy systems and internal processes led to a reactive approach for their threat intelligence investigations and security responses. Security team members would get alerts from systems or other teams with limited context, which led to inefficient triage or larger issues. As a result, the teams sacrificed quality for speed.”
The study notes how interviewees’ organizations were then motivated to look for a solution with improved search capabilities across locations such as social media and the dark web. After implementing Threat Command, it was possible for those organizations to receive early warning of potential attacks and automated intelligence of vulnerabilities targeting their networks and customers.
By creating processes that are centered around early-warning methodologies and a more proactive approach to security, the composite organization was able to reduce the likelihood of a breach by up to 70% over the course of three years.
Security is about the solutions
Challenges in a SOC don’t have to mean stopping everything and preparing for a years-long audit of all processes and solutions. It is possible to overcome challenges relatively quickly with a solution like Threat Command that can show immediate value after an accelerated onboarding process. And it is possible to vastly improve security posture in the face of an increasing volume of global threats.
For a deeper-dive into The Total Economic Impact™ of Rapid7 Threat Command For Digital Risk Protection and Threat Intelligence, download the study now. You can also read the previous blog entry in this series here.
The attack surface of the United Kingdom’s 350 largest publicly traded companies has—drum roll, please—improved. But it could be better. Those are the high-level findings of the latest in Rapid7’s series of looks at the cybersecurity health of companies tied to some of the globe’s largest stock indices. This is the second time in more than two years that we have looked at the FTSE 350 to gauge how well the UK’s business arena as a whole is faring against cyber threats. Turns out, they’ve improved in that time and are on par with the other big indices we’ve looked at, though in some specific places there is definitely room for improvement.
We chose the FTSE 350 as a benchmark in determining the cyber health of UK businesses because they are by and large some of the largest companies in the country and are not as resource constrained as some other, smaller, companies might be. This gives us a pretty even playing field on which to analyze their health and extrapolate out to the overall health of the region. We’ve done this with several other indices (most recently the ASX 200) and find it works well to provide a snapshot of what’s going on in the region.
In this report, we looked first at the overall attack surface of the FTSE 350 companies, broken down by industry. We also looked at the overall health of their email and web server security. All three areas showed improvement, as well as points for concern.
Attack Surface
By and large, the attack surfaces of the companies that make up the FTSE 350 were quite limited and in line with those of other major indices around the world. But when you look at the individual industries that make up the FTSE 350, you start to see some red flags.
For instance, financial and technology companies have by far the largest exposure through high-risk ports open to the internet. Technology companies averaged well over 1,000 internet-exposed ports and financial companies averaged nearly 800, four and five times the next-highest industry, respectively. When it comes to particularly high-risk ports, the financial sector is the biggest offender with an average of 12 high-risk ports. For comparison, the technology sector had three.
Email Security
Email security is one area where we’ve seen some laudable improvement over the last time we looked at the FTSE 350. For instance, use of Domain-based Message Authentication, Reporting & Conformance (DMARC) policy is up 29%. However, the implementation of Domain Name System Security Extensions (DNSSEC) is at just 4% of the 350 companies that make up the index. Sadly, this too is on par with other indices. They should all seek improvements (alright, we’ll get off our soapbox).
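As a concrete illustration, a domain’s DMARC policy lives in a DNS TXT record at `_dmarc.<domain>`, and checking it amounts to fetching and parsing that record. Here is a minimal Python sketch; the record string is a made-up example rather than a live DNS lookup:

```python
def parse_dmarc(record: str) -> dict:
    """Split a DMARC record like 'v=DMARC1; p=reject; ...' into its tags."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if "=" in part:
            key, _, value = part.partition("=")
            tags[key.strip()] = value.strip()
    return tags

# Hypothetical record, as a DNS query for _dmarc.example.com might return it.
record = "v=DMARC1; p=reject; rua=mailto:dmarc-reports@example.com"
tags = parse_dmarc(record)
assert tags["v"] == "DMARC1"
print("policy:", tags["p"])  # 'reject' tells receivers to refuse spoofed mail
```

A `p=none` policy only monitors, while `quarantine` and `reject` actually enforce, which is the distinction that matters when measuring adoption.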
Web Server Security
Going after vulnerable web servers is a favorite vector for attackers. Looking at the status of FTSE 350 company web servers, we found that of the three most common types (Nginx, Apache, and IIS), not all were running supported or fully patched versions at comparable rates. For instance, only some 40% of Nginx servers were supported or fully patched, whereas 89% of Apache and 80% of IIS servers were. That’s a pretty big discrepancy. Thankfully, Apache and IIS are the dominant servers in this region, minimizing the overall risk.
If you want to take a look at our report you can read it here. If you’d like to check out the report we conducted for Australia’s ASX 200 it is available here.
At Cloudflare, helping to build a better Internet is not just a catchy saying. We are committed to the long-term process of standards development. We love the work of pushing the fundamental technology of the Internet forward in ways that are accessible to everyone. Today we are adding even more substance to that commitment. One of our core beliefs is that privacy is a human right. We believe that to achieve that right the most advanced cryptography needs to be available to everyone, free of charge, forever. Today, we are announcing that our implementations of post-quantum cryptography will meet that standard: available to everyone, and included free of charge, forever.
We have a proud history of taking paid encryption products and launching them to the Internet at scale for free, even at the cost of short- and long-term revenue, because it’s the right thing to do. In 2014, we made SSL free for every Cloudflare customer with Universal SSL. As we make our implementations of post-quantum cryptography free forever today, we do it in the spirit of that first major announcement:
“Having cutting-edge encryption may not seem important to a small blog, but it is critical to advancing the encrypted-by-default future of the Internet. Every byte, however seemingly mundane, that flows encrypted across the Internet makes it more difficult for those who wish to intercept, throttle, or censor the web. In other words, ensuring your personal blog is available over HTTPS makes it more likely that a human rights organization or social media service or independent journalist will be accessible around the world. Together we can do great things.”
We hope that others will follow us in making their implementations of PQC free as well so that we can create a secure and private Internet without a “quantum” up-charge.
The Internet has matured since the 1990s and the launch of SSL. What was once an experimental frontier has turned into the underlying fabric of modern society. It runs in our most critical infrastructure: power systems, hospitals, airports, and banks. We trust it with our most precious memories. We trust it with our secrets. That’s why the Internet needs to be private by default. It needs to be secure by default. It’s why we’re committed to ensuring that anyone and everyone can achieve post-quantum security for free, and can start deploying it at scale today.
Our work on post-quantum crypto is driven by the thesis that quantum computers that can break conventional cryptography create a similar problem to the Year 2000 bug. We know there is going to be a problem in the future that could have catastrophic consequences for users, businesses, and even nation states. The difference this time is we don’t know the date and time that this break in the paradigm of how computers operate will occur. We need to prepare today to be ready for this threat.
To that end we have been preparing for this transition since 2018. At that time we were concerned about the implementation problems other large protocol transitions, like the move to TLS 1.3, had caused our customers and wanted to get ahead of it. Cloudflare Research over the last few years has become a leader and champion of the idea that PQC security wasn’t an afterthought for tomorrow but a real problem that needed to be solved today. We have collaborated with industry partners like Google and Mozilla, contributed to development through participation in the IETF, and even launched an open source experimental cryptography suite to help move the needle. We have tried hard to work with everyone that wanted to be a part of the process and show our work along the way.
As we have worked with our partners in both industry and academia to help prepare us and the Internet for a post-quantum future, we have become dismayed by an emerging trend. There are a growing number of vendors out there that want to cash in on the legitimate fear that nervous executives, privacy advocates, and government leaders have about quantum computing breaking traditional encryption. These vendors offer vague solutions based on unproven technologies like “Quantum Key Distribution” or “Post Quantum Security” libraries that package non-standard algorithms that haven’t been through public review with exorbitant price tags like RSA did in the 1990s. They often love to throw around phrases like “AI” and “Post Quantum” without really showing their work on how any of their systems actually function. Security and privacy are table stakes in the modern Internet, and no one should be charged just to get the baseline protection needed in our contemporary world.
Launch your PQC transition today
Testing and adopting post-quantum cryptography in modern networks doesn’t have to be hard! In fact, Cloudflare customers can test PQC in their systems today, as we describe later in this post.
After testing how Cloudflare can help you implement PQC in your networks, it’s time to prepare for the transition to PQC across all of your systems. This first step of inventorying and identifying is critical to a smooth rollout. We know firsthand, since we undertook an extensive evaluation of all of our systems to earn our FedRAMP Authorization, and we are doing a similar evaluation again to transition all of our internal systems to PQC.
How we are setting ourselves up for the future of quantum computing
Here’s a sneak preview of the path that we are developing right now to fully secure Cloudflare itself against the cryptographic threat of quantum computers. We can break that path down into three parts: internal systems, zero trust, and open source contributions.
The first part of our path to full PQC adoption at Cloudflare is around all of our connections. The connection between yourself and Cloudflare is just one part of the larger path of the connection. Inside our internal systems we are implementing two significant upgrades in 2023 to ensure that they are PQC secure as well.
The first is that we use BoringSSL for a substantial number of connections. We currently use our fork, and we are excited that upstream support for Kyber is underway. Any additional internal connections that use a different cryptographic system are being upgraded as well. The second major upgrade is shifting the remaining internal connections that use TLS 1.2 to TLS 1.3. This combination of Kyber and TLS 1.3 will make our internal connections faster and more secure, even though we use a hybrid of classical and post-quantum cryptography. It’s a speed and security win-win. And we proved this powerhouse combination would provide that speed and security over three and a half years ago, thanks to the groundbreaking work of Cloudflare Research and Google.
The next part of that path is all about using PQC and zero trust as allies together. As we think about the security posture of tomorrow being based around post-quantum cryptography, we have to look at the other critical security paradigm being implemented today: zero trust. Today, the zero trust vendor landscape is littered with products that fail to support common protocols like IPv6 and TLS 1.2, let alone the next generation of protocols like TLS 1.3 and QUIC that enable PQC. So many middleboxes struggle under the load of today’s modern protocols. They artificially downgrade connections and break end user security all in the name of inspecting traffic because they don’t have a better solution. Organizations big and small struggled to support customers that wanted the highest possible performance and security, while also keeping their businesses safe, because of the resistance of these vendors to adapt to modern standards. We do not want to repeat the mistakes of the past. We are planning and evaluating the needed upgrades to all of our zero trust products to support PQC out of the box. We believe that zero trust and post-quantum cryptography are not at odds with one another, but rather together are the future standard of security.
Finally, it’s not enough for us to do this for ourselves and for our customers. The Internet is only as strong as its weakest links in the connection chains that network us all together. Every connection on the Internet needs the strongest possible encryption so that businesses can be secure, and everyday users can be ensured of their privacy. We believe that this core technology should be vendor agnostic and open to everyone. To help make that happen, our final part of the path is all about contributing to open source projects. We have already been focused on releases to CIRCL. CIRCL (Cloudflare Interoperable, Reusable Cryptographic Library) is a collection of cryptographic primitives written in Go. The goal of this library is to be used as a tool for experimental deployment of post-quantum cryptographic algorithms.
Later on this year we will be publishing as open source a set of easy to adopt, vendor-neutral roadmaps to help you upgrade your own systems to be secure against the future today. We want the security and privacy created by post quantum crypto to be accessible and free for everyone. We will also keep writing extensively about our post quantum journey. To learn more about how you can turn on PQC today, and how we have been building post-quantum cryptography at Cloudflare, please check out these resources:
News coverage of a recent paper caused a bit of a stir with this headline: “AI Helps Crack NIST-Recommended Post-Quantum Encryption Algorithm”. The news article claimed that Kyber, the encryption algorithm in question, which we have deployed worldwide, had been “broken.” Even more dramatically, the news article claimed that “the revolutionary aspect of the research was to apply deep learning analysis to side-channel differential analysis”, which seems aimed at scaring the reader into wondering what Artificial Intelligence (AI) will break next.
Reporting on the paper has been wildly inaccurate: Kyber is not broken and AI has been used for more than a decade now to aid side-channel attacks. To be crystal clear: our concern is with the news reporting around the paper, not the quality of the paper itself. In this blog post, we will explain how AI is actually helpful in cryptanalysis and dive into the paper by Dubrova, Ngo, and Gärtner (DNG), that has been misrepresented by the news coverage. We’re honored to have Prof. Dr. Lejla Batina and Dr. Stjepan Picek, world-renowned experts in the field of applying AI to side-channel attacks, join us on this blog.
We start with some background, first on side-channel attacks and then on Kyber, before we dive into the paper.
Breaking cryptography
When one thinks of breaking cryptography, one imagines a room full of mathematicians puzzling over minute patterns in intercepted messages, aided by giant computers, until they figure out the key. Famously in World War II, the Nazis’ Enigma cipher machine code was completely broken in this way, allowing the Allied forces to read along with their communications.
It’s exceedingly rare for modern established cryptography to get broken head-on in this way. The last catastrophically broken cipher was RC4, designed in 1987, while AES, designed in 1998, stands proud with barely a scratch. The last big break of a cryptographic hash was on SHA-1, designed in 1995, while SHA-2, published in 2001, remains untouched in practice.
So what to do if you can’t break the cryptography head-on? Well, you get clever.
Side-channel attacks
Can you guess the pin code for this gate?
You can clearly see that some of the keys are more worn than the others, suggesting heavy use. This observation gives us some insight into the correct pin, namely the digits it contains. But the correct order is not immediately clear. It might be 1580, 8510, or even 115085, but narrowing it down this way is a lot easier than trying every possible pin code. This is an example of a side-channel attack: using the security feature (entering the PIN) had an unintended consequence (abrading the paint), which leaks information.
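The arithmetic behind “a lot easier” is simple. If the wear reveals the four digits in use (say 0, 1, 5, and 8) and the PIN uses each digit exactly once, the attacker only has to try their orderings. A quick Python check of the reduced search space:

```python
from itertools import permutations

# Hypothetical worn digits; assume a 4-digit PIN using each exactly once.
digits = "0158"
candidates = {"".join(p) for p in permutations(digits)}

# 4! = 24 orderings to try, versus 10**4 = 10,000 unrestricted PINs.
print(len(candidates))  # 24
```

The side channel didn’t reveal the PIN, but it shrank the search by more than 400x, which is what leaked information typically buys an attacker.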
There are many different types of side channels, and which one you should worry about depends on the context. For instance, the sounds your keyboard makes as you type leaks what you write, but you should not worry about that if no one is listening in.
Remote timing side channel
When writing cryptography in software, one of the best-known side channels is the time it takes for an algorithm to run. For example, let’s take the classic example of creating an RSA signature. Grossly simplified, to sign a message m with private key d, we compute the signature s as m^d (mod n). Naively computing that huge exponentiation is infeasible, but luckily, because we’re doing modular arithmetic, there is the square-and-multiply trick. Here is a naive implementation in pseudocode:
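A minimal Python stand-in for that pseudocode (identifiers are ours; the secret-dependent branch is marked):

```python
def naive_modexp(m: int, d: int, n: int) -> int:
    """Compute m**d mod n with naive right-to-left square-and-multiply."""
    s = 1
    power_of_m = m % n
    for i in range(d.bit_length()):
        if (d >> i) & 1:                     # branch on a SECRET key bit:
            s = (s * power_of_m) % n         # the extra multiply leaks timing
        power_of_m = (power_of_m * power_of_m) % n
    return s

# Sanity check against Python's built-in modular exponentiation.
assert naive_modexp(42, 65537, 99991) == pow(42, 65537, 99991)
```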
The algorithm loops over the bits of the secret key and does a multiply step if the current bit is a 1. Clearly, the runtime depends on the secret key. Not great, but if the attacker can only time the full run, then they only learn the number of 1s in the secret key. The typical catastrophic timing attack against RSA instead hides behind the “mod n”. In a naive implementation, this modular reduction is slower if the number being reduced is larger than or equal to n. This allows an attacker to send specially crafted messages to tease out the secret key bit by bit, and similar attacks are surprisingly practical.
Because of this, the mantra is: cryptography should run in “constant time”. This means that the runtime does not depend on any secret information. In our example, to remove the first timing issue, one would replace the if-statement with something equivalent to:
s = ((s * powerOfM) mod n) * bit(d, i) + s * (1 - bit(d, i))
This ensures that the multiplication is always done. Similar countermeasures prevent practically all remote timing attacks.
Power side-channel
The story is quite different for power side-channel attacks. Again, the classic example is RSA signatures. If we hook up an oscilloscope to a smartcard that uses the naive algorithm from before, and measure the power usage while it signs, we can read off the private key by eye:
Even if we use a constant-time implementation, there are still minute changes in power usage that can be detected. The underlying issue is that hardware gates that switch use more power than those that don’t. For instance, computing 127 + 64 takes more energy than 64 + 64.
127+64 and 64+64 in binary. There are more switched bits in the first.
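Under the commonly used Hamming-weight power model, the energy cost of handling a value tracks the number of set bits in it. A small Python illustration of why the first sum is “hotter”:

```python
def hamming_weight(x: int) -> int:
    """Count set bits — a standard first-order proxy for dynamic power draw."""
    return bin(x).count("1")

# 127 + 64 = 191 = 0b10111111 (seven set bits)
#  64 + 64 = 128 = 0b10000000 (one set bit)
print(hamming_weight(127 + 64), hamming_weight(64 + 64))  # 7 1
```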
Masking
A common countermeasure against power side-channel leakage is masking. This means that before using the secret information, it is split randomly into shares. Then, the brunt of the computation is done on the shares, which are finally recombined.
In the case of RSA, before creating a new signature, one can generate a random r and compute m^(d+r) (mod n) and m^r (mod n) separately. From these, the final signature m^d (mod n) can be computed with some extra care.
Masking is not a perfect defense. The parts where shares are created or recombined into the final value are especially vulnerable. It does make it harder for the attacker: they will need to collect more power traces to cut through the noise. In our example we used two shares, but we could bump that up even higher. There is a trade-off between power side-channel resistance and implementation cost.
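To make the idea concrete, here is a toy additive-masking sketch in Python. We use Kyber’s q purely as an example modulus; real masked implementations operate on hardware words with far more care:

```python
import secrets

Q = 3329  # example modulus (Kyber's q), used here only for illustration

def mask(secret: int, n_shares: int) -> list[int]:
    """Split `secret` into n random additive shares mod Q."""
    shares = [secrets.randbelow(Q) for _ in range(n_shares - 1)]
    shares.append((secret - sum(shares)) % Q)  # last share fixes the sum
    return shares

def unmask(shares: list[int]) -> int:
    """Recombine: only here does the secret exist as a single value again."""
    return sum(shares) % Q

secret = 1234
shares = mask(secret, n_shares=6)   # the DNG target uses six shares
assert unmask(shares) == secret

# Any computation linear in the secret can be applied share-by-share,
# so no single intermediate value handled by the hardware is the secret.
```

Each extra share forces the attacker to combine more leakage points, which is the trade-off between side-channel resistance and implementation cost mentioned above.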
One of the challenging parts in the field is to estimate how much secret information is actually leaked through the traces, and how to extract it. Here machine learning enters the picture.
Machine learning: extracting the key from the traces
Machine learning, of which deep learning is a part, represents the capability of a system to acquire its knowledge by extracting patterns from data — in this case, the secrets from the power traces. Machine learning algorithms can be divided into several categories based on their learning style. The most popular machine learning algorithms in side-channel attacks follow the supervised learning approach. In supervised learning, there are two phases: 1) training, where a machine learning model is trained based on known labeled examples (e.g., side-channel measurements where we know the key) and 2) testing, where, based on the trained model and additional side-channel measurements (now, with an unknown key), the attacker guesses the secret key. A common depiction of such attacks is given in the figure below.
While the threat model may sound counterintuitive, it is actually not difficult to imagine that the attacker will have access to (and control of) a device similar to the one being attacked.
In side-channel analysis, the attacks following those two phases (training and testing) are called profiling attacks.
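A toy version of those two phases, sketched with NumPy. The “traces” are synthetic, with exaggerated single-point leakage, and the nearest-template classifier stands in for the far more capable models used in practice:

```python
import numpy as np

rng = np.random.default_rng(0)

def capture_trace(bit: int, n_samples: int = 50) -> np.ndarray:
    """Synthetic power trace: Gaussian noise plus a bump at one leakage
    point whose height depends on the secret bit (exaggerated for the demo)."""
    t = rng.normal(0.0, 1.0, n_samples)
    t[20] += 2.0 * bit
    return t

# Phase 1 (training/profiling): traces from a device we control, KNOWN bits.
train_bits = rng.integers(0, 2, 1000)
train = np.stack([capture_trace(b) for b in train_bits])
templates = [train[train_bits == b].mean(axis=0) for b in (0, 1)]

# Phase 2 (attack): classify averaged traces of an UNKNOWN bit by
# picking the nearest template.
def guess_bit(avg_trace: np.ndarray) -> int:
    return int(np.argmin([np.linalg.norm(avg_trace - t) for t in templates]))

# Averaging many traces of the same secret cuts through the noise.
victim = np.stack([capture_trace(1) for _ in range(100)]).mean(axis=0)
print(guess_bit(victim))
```

This is essentially the 2002 template attack; the deep learning approaches discussed below replace the mean-trace templates with learned models that extract far more from fewer traces.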
Profiling attacks are not new. The first such attack, called the template attack, appeared in 2002. Diverse machine learning techniques have been used since around 2010, all reporting good results and the ability to break various targets. The big breakthrough came in 2016, when the side-channel community started using deep learning. It greatly increased the effectiveness of power side-channel attacks both against symmetric-key and public-key cryptography, even if the targets were protected with, for instance, masking or some other countermeasures. To be clear: it doesn’t magically figure out the key, but it gets much better at extracting the leaked bits from a smaller number of power traces.
While machine learning-based side-channel attacks are powerful, they have limitations. Carefully implemented countermeasures make the attacks more difficult to conduct. Finding a good machine learning model that can break a target can be far from trivial: this phase, commonly called tuning, can last weeks on powerful clusters.
What will the future bring for machine learning/AI in side-channel analysis? Counterintuitively, we would like to see more powerful and easier-to-use attacks. You’d think that would make us worse off, but on the contrary, it will allow us to better estimate how much information is actually leaked by a device. We also hope that we will be able to better understand why certain attacks work (or not), so that more cost-effective countermeasures can be developed. As such, the future for AI in side-channel analysis is bright, especially for security evaluators, but we are still far from being able to break most targets in real-world applications.
Kyber
Kyber is a post-quantum (PQ) key encapsulation method (KEM). After a six-year worldwide competition, the National Institute of Standards and Technology (NIST) selected Kyber as the post-quantum key agreement it will standardize. The goal of a key agreement is for two parties that haven’t talked to each other before to agree securely on a shared key they can use for symmetric encryption (such as ChaCha20-Poly1305). As a KEM, it works slightly differently, and with different terminology, than a traditional Diffie–Hellman key agreement (such as X25519):
When connecting to a website, the client first generates a new ephemeral keypair that consists of a private and public key. It sends the public key to the server. The server then encapsulates a shared key with that public key, which gives it a random shared key (which it keeps) and a ciphertext in which the shared key is hidden (which the server returns to the client). The client then uses its private key to decapsulate the shared key from the ciphertext. Now the server and client can communicate with each other using the shared key.
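That flow is easier to see in code. Below is a toy Diffie–Hellman-based KEM in Python that mirrors the keygen, encapsulate, and decapsulate steps described above. It is emphatically not Kyber, and the small Mersenne-prime group is insecure; it only illustrates the API shape:

```python
import hashlib
import secrets

# Toy, INSECURE parameters, chosen only so the example is readable.
P = 2**127 - 1   # a Mersenne prime
G = 3

def keygen():
    """Client: generate an ephemeral keypair."""
    sk = secrets.randbelow(P - 2) + 1
    return sk, pow(G, sk, P)

def encapsulate(pk: int):
    """Server: derive a fresh shared key plus a ciphertext hiding it."""
    r = secrets.randbelow(P - 2) + 1
    ciphertext = pow(G, r, P)
    shared = hashlib.sha256(pow(pk, r, P).to_bytes(16, "big")).digest()
    return ciphertext, shared

def decapsulate(sk: int, ciphertext: int) -> bytes:
    """Client: recover the shared key from the ciphertext."""
    return hashlib.sha256(pow(ciphertext, sk, P).to_bytes(16, "big")).digest()

sk, pk = keygen()                  # client, who sends pk to the server
ct, key_server = encapsulate(pk)   # server, who returns ct to the client
key_client = decapsulate(sk, ct)
assert key_client == key_server    # both sides now hold the same key
```

Kyber replaces the Diffie–Hellman group with lattice arithmetic that a quantum computer cannot unwind, but the three-function shape of the exchange is the same.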
It is particularly important to secure key agreement against attacks by quantum computers. The reason is that an attacker can store traffic today and crack the key agreement in the future, revealing the shared key and all communication encrypted with it. That is why we have already deployed support for Kyber across our network.
The DNG paper
With all the background under our belt, we’re ready to take a look at the DNG paper. The authors perform a power side-channel attack on their own masked implementation of Kyber with six shares.
Point of attack
They attack the decapsulation step. In the decapsulation step, after the shared key is extracted, it’s encapsulated again and compared against the original ciphertext to detect tampering. For this re-encryption step, the precursor of the shared key—let’s call it the secret—is encoded bit-by-bit into a polynomial. To be precise, the 256-bit secret needs to be converted to a polynomial with 256 coefficients modulo q=3329, where the ith coefficient is (q+1)/2 if the ith bit is 1 and zero otherwise.
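A minimal sketch of this (unmasked) encoding step, assuming the usual least-significant-bit-first ordering within each byte:

```python
# Sketch of the unmasked encoding described above: each bit of the
# 256-bit secret becomes one coefficient of a polynomial mod q = 3329,
# either (q+1)/2 = 1665 for a 1 bit or 0 for a 0 bit.
Q = 3329

def encode_secret(secret: bytes) -> list[int]:
    assert len(secret) == 32  # 256 bits
    coeffs = []
    for byte in secret:
        for i in range(8):          # least-significant bit first (assumed)
            bit = (byte >> i) & 1
            coeffs.append((Q + 1) // 2 if bit else 0)
    return coeffs

poly = encode_secret(bytes(range(32)))
assert len(poly) == 256
assert set(poly) <= {0, (Q + 1) // 2}  # every coefficient is 0 or 1665
```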
This function sounds simple enough, but creating a masked version is tricky. The rub is that the natural way to create shares of the secret is to have shares that xor together to the secret, while the natural way to share polynomials is to have shares that add together to the intended polynomial.
This is the two-shares implementation of the conversion that the DNG paper attacks:
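The original implementation is C; the following Python sketch mirrors the same per-bit logic (the variable names and the exact compensation applied to the second share are our reconstruction from the description):

```python
Q = 3329
H = (Q + 1) // 2  # 1665, the coefficient that encodes a 1 bit

def masked_frommsg(msg_share1: bytes, msg_share2: bytes):
    """Turn two boolean (xor) shares of the secret into two arithmetic
    shares r1, r2 whose coefficients add up (mod q) to the encoding of
    the secret. The full-width mask (0xffff when a bit is 1, 0 when it
    is 0) is the power-hungry per-bit pattern the attack exploits."""
    r1, r2 = [], []
    for b1, b2 in zip(msg_share1, msg_share2):
        for j in range(8):
            mask1 = -((b1 >> j) & 1) & 0xFFFF  # 0xffff if bit set, else 0
            mask2 = -((b2 >> j) & 1) & 0xFFFF
            r1.append(mask1 & H)
            # The second share compensates, so r1 + r2 = (bit1 ^ bit2) * H mod q.
            r2.append((((mask1 ^ mask2) & H) - (mask1 & H)) % Q)
    return r1, r2

# The shares reconstruct the encoding of the xor of the message shares.
r1, r2 = masked_frommsg(b"\x01", b"\x00")
assert (r1[0] + r2[0]) % Q == H
```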
The code loops over the bits of the two shares. For each bit, it creates a mask that’s 0xffff if the bit was 1 and 0 otherwise. Then this mask is used to add (q+1)/2 to the polynomial share if appropriate. Processing a 1 will use a bit more power. It doesn’t take an AI to figure out that this will be a leaky function. In fact, this pattern was pointed out to be weak back in 2016, and explicitly mentioned as a risk for masked Kyber in 2020. Apropos, one way to mitigate this is to process multiple bits at once — for the state of the art, tune into April 2023’s NIST PQC seminar. For the moment, let’s allow the paper its weak target.
The authors do not claim any fundamentally new attack here. Instead, they improve the effectiveness of the attack in two ways: the way they train the neural network, and how to use multiple traces more effectively by changing the ciphertext sent. So, what did they achieve?
Effectiveness
To test the attack, they use a ChipWhisperer-Lite board, which has a Cortex-M4 CPU that they downclock to 24 MHz. Power usage is sampled at 24 MHz with 10-bit precision.
To train the neural networks, 150,000 power traces are collected for decapsulations of different ciphertexts (with known shared key) for the same KEM keypair. This is already a somewhat unusual situation for a real-world attack: for key agreement, KEM keypairs are ephemeral, generated and used only once. Still, there are certainly legitimate use cases for long-term KEM keypairs, such as for authentication, HPKE, and in particular ECH.
The training is a key step: devices, even of the same model from the same manufacturer, can have wildly different power traces running the same code.
The main contribution highlighted by the authors is that they train their neural networks to attack an implementation with 6 shares, by starting with a neural network trained to attack an implementation with 5 shares. That one can be trained from a model to attack 4 shares, and so on. Thus to apply their method, of these 150,000 power traces, one-fifth must be from an implementation with 6 shares, another one-fifth from one with 5 shares, et cetera. It seems unlikely that anyone will deploy a device where an attacker can switch between the number of shares used in the masking on demand.
Given these affordances, the attack proper can commence. The authors report that, from a single power trace of a two-share decapsulation, they could recover the shared key under these ideal circumstances with probability… 0.12%. They do not report the numbers for single trace attacks on more than two shares.
When we’re allowed multiple traces of the same decapsulation, side-channel attacks become much more effective. The second trick is a clever twist on this: instead of collecting traces of decapsulations of exactly the same message, the authors rotate the ciphertext to move bits of the shared key into more favorable positions. With 4 traces that are rotations of the same message, the success probability against the two-shares implementation goes up to 78%. The six-share implementation stands firm at 0.5%. When allowing 20 traces from the six-share implementation, the shared key can be recovered with an 87% chance.
In practice
The hardware used in the demonstration might be somewhat comparable to a smart card, but it is very different from high-end devices such as smartphones, desktop computers and servers. Simple power analysis side-channel attacks on even just embedded 1GHz processors are much more challenging, requiring tens of thousands of traces using a high-end oscilloscope connected close to the processor. There are much better avenues for attack with this kind of physical access to a server: just connect the oscilloscope to the memory bus.
Except for especially vulnerable applications, such as smart cards and HSMs, power-side channel attacks are widely considered infeasible. Although sometimes, when the planets align, an especially potent power side-channel attack can be turned into a remote timing attack due to throttling, as demonstrated by Hertzbleed. To be clear: the present attack does not even come close.
And even for these vulnerable applications, such as smart cards, this attack is not particularly potent or surprising. In the field, it is not a question of whether a masked implementation leaks its secrets, because it always does. It’s a question of how hard it is to actually pull off. Papers such as the DNG paper contribute by helping manufacturers estimate how many countermeasures to put in place, to make attacks too costly. It is not the first paper studying power side-channel attacks on Kyber and it will not be the last.
Wrapping up
AI did not completely undermine a new wave of cryptography; instead, it is a helpful tool for dealing with noisy data and discovering the vulnerabilities within it. There is a big difference between a direct break of cryptography and a power side-channel attack. Kyber is not broken, and the presented power side-channel attack is no cause for alarm.
Today we are announcing the general availability of Amazon Lightsail for Research, a new offering that makes it easy for researchers and students to create and manage a high-performance CPU or GPU research computer in the cloud in just a few clicks. You can use your preferred integrated development environment (IDE), such as preinstalled Jupyter, RStudio, Scilab, or VSCodium, or the native Ubuntu operating system on your research computer.
You no longer need to use your own research laptop or shared school computers for analyzing larger datasets or running complex simulations. You can create your own research environments and directly access the application running on the research computer remotely via a web browser. Also, you can easily upload data to and download from your research computer via a simple web interface.
You pay only for the duration the computers are in use and can delete them at any time. You can also use budgeting controls that can automatically stop your computer when it’s not in use. Lightsail for Research also includes all-inclusive prices of compute, storage, and data transfer, so you know exactly how much you will pay for the duration you use the research computer.
Get Started with Amazon Lightsail for Research
To get started, navigate to the Lightsail for Research console, and choose Virtual computers in the left menu. You can see my research computers named “channy-jupyter” and “channy-rstudio” already created.
Choose Create virtual computer to create a new research computer, and select which software you’d like preinstalled on your computer and what type of research computer you’d like to create.
In the first step, choose the application you want installed on your computer and the AWS Region to be located in. We support Jupyter, RStudio, Scilab, and VSCodium. You can install additional packages and extensions through the interface of these IDE applications.
Next, choose the desired virtual hardware type, including a fixed amount of compute (vCPUs or GPUs), memory (RAM), SSD-based storage volume (disk) space, and a monthly data transfer allowance. Bundles are charged on an hourly and on-demand basis.
Standard types are compute-optimized and ideal for compute-bound applications that benefit from high-performance processors.
| Name | vCPUs | Memory | Storage | Monthly data transfer allowance* |
| --- | --- | --- | --- | --- |
| Standard XL | 4 | 8 GB | 50 GB | 0.5 TB |
| Standard 2XL | 8 | 16 GB | 50 GB | 0.5 TB |
| Standard 4XL | 16 | 32 GB | 50 GB | 0.5 TB |
GPU types provide a high-performance platform for general-purpose GPU computing. You can use these bundles to accelerate scientific, engineering, and rendering applications and workloads.
| Name | GPU | vCPUs | Memory | Storage | Monthly data transfer allowance* |
| --- | --- | --- | --- | --- | --- |
| GPU XL | 1 | 4 | 16 GB | 50 GB | 1 TB |
| GPU 2XL | 1 | 8 | 32 GB | 50 GB | 1 TB |
| GPU 4XL | 1 | 16 | 64 GB | 50 GB | 1 TB |
* AWS created the Global Data Egress Waiver (GDEW) program to help eligible researchers and academic institutions use AWS services by waiving data egress fees. To learn more, see the blog post.
After making your selections, name your computer and choose Create virtual computer to create your research computer. Once your computer is created and running, choose the Launch application button to open a new window that will display the preinstalled application you selected.
Lightsail for Research Features
As with existing Lightsail instances, you can create additional block-level storage volumes (disks) that you can attach to a running Lightsail for Research virtual computer. You can use a disk as a primary storage device for data that requires frequent and granular updates. To create your own storage, choose Storage and Create disk.
You can also create Snapshots, a point-in-time copy of your data. You can create a snapshot of your Lightsail for Research virtual computers and use it as baselines to create new computers or for data backup. A snapshot contains all of the data that is needed to restore your computer from the moment when the snapshot was taken.
When you restore a computer by creating it from a snapshot, you can easily create a new one or upgrade your computer to a larger size using a snapshot backup. Create snapshots frequently to protect your data from corrupt applications or user errors.
You can use Cost control rules that you define to help manage the usage and cost of your Lightsail for Research virtual computers. You can create rules that stop running computers when average CPU utilization over a selected time period falls below a prescribed level.
For example, you can configure a rule that automatically stops a specific computer when its CPU utilization is equal to or less than 1 percent for a 30-minute period. Lightsail for Research will then automatically stop the computer so that you don’t incur charges for running computers.
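The decision such a rule implements can be sketched in a few lines of Python. This is illustrative logic only, not Lightsail’s actual API; `should_stop` and its inputs are hypothetical names:

```python
# Illustrative logic only: the decision a cost-control rule implements,
# not Lightsail's actual API. `cpu_samples` stands in for CPU-utilization
# percentages collected over the evaluation window (e.g. 30 minutes).

def should_stop(cpu_samples: list[float], threshold: float = 1.0) -> bool:
    """Stop the computer if average CPU utilization over the window is at
    or below the threshold (e.g. <= 1% for a 30-minute period)."""
    if not cpu_samples:
        return False  # no data yet: don't stop the computer
    return sum(cpu_samples) / len(cpu_samples) <= threshold

# A mostly idle half hour trips the rule; a busy one does not.
assert should_stop([0.4, 0.9, 0.7, 1.0]) is True
assert should_stop([12.0, 35.5, 8.2]) is False
```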
In the Usage menu, you can view the cost estimate and usage hours for your resources during a specified time period.
Now Available
Amazon Lightsail for Research is now available in the US East (Ohio), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), and Europe (Stockholm) Regions.
Each year, the research team at Rapid7 analyzes thousands of vulnerabilities in order to identify their root causes, broaden understanding of attacker behavior, and provide actionable intelligence that guides security professionals at critical moments. Our annual Vulnerability Intelligence Report examines notable vulnerabilities and high-impact attacks from 2022 to highlight trends that drive significant risk for organizations of all sizes.
Today, we’re excited to release Rapid7’s 2022 Vulnerability Intelligence Report—a deep dive into 50 of the most notable vulnerabilities our research team investigated throughout the year. The report offers insight into critical vulnerabilities, widespread threats, prominent attack surface area, and changing exploitation trends.
2022 attack trends
The threat landscape today is radically different than it was even a few years ago. Over the past three years, we’ve seen zero-day exploits and widespread attacks chart a meteoric rise that’s strained security teams to their breaking point and beyond. While 2022 saw a modest decline in zero-day and widespread exploitation from 2021’s record highs, the multi-year trend of rising attack speed and scale remains strikingly consistent overall.
Report findings include:
Widespread exploitation of new vulnerabilities decreased 15% year over year in 2022, but mass exploitation events were still the norm. Our 2022 vulnerability intelligence dataset tracks 28 net-new widespread threats, many of which were used to deploy webshells, cryptocurrency miners, botnet malware, and/or ransomware on target systems.
Zero-day exploitation remained a significant challenge for security teams, with 43% of widespread threats arising from a zero-day exploit.
Attackers are still developing and deploying exploits faster than ever before. More than half of the vulnerabilities in our report dataset were exploited within seven days of public disclosure—a 12% increase from 2021 and an 87% increase over 2020.
Vulnerabilities mapped definitively to ransomware operations dropped 33% year over year—a troubling trend that speaks more to evolving attacker behavior and lower industry visibility than to any actual reprieve for security practitioners. This year’s report explores the growing complexity of the cybercrime ecosystem, the rise of initial access brokers, and industry-wide ransomware reporting trends.
How to manage risk from critical vulnerabilities
In today’s threat landscape, security teams are frequently forced into reactive positions, lowering security program efficacy and sustainability. Strong foundational security program components, including vulnerability and asset management processes, are essential to building resilience in a persistently elevated threat climate.
Have emergency patching procedures and incident response playbooks in place so that in the event of a widespread threat or breach, your team has a well-understood mechanism to drive immediate action.
Have a defined, regular patch cycle that includes prioritization of actively exploited CVEs, as well as network edge technologies like VPNs and firewalls. These network edge devices continue to be popular attack vectors and should adhere to a zero-day patch cycle wherever possible, meaning that updates and/or downtime should be scheduled as soon as new critical advisories are released.
Keep up with operating system-level and cumulative updates. Falling behind on these regular updates can make it difficult to install out-of-band security patches at critical moments.
Limit and monitor internet exposure of critical infrastructure and services, including domain controllers and management or administrative interfaces. The exploitation of many of the CVEs in this year’s report could be slowed down or prevented by taking management interfaces off the public internet.
2022 Vulnerability Intelligence Report
Read the report to see our full list of high-priority CVEs and learn more about attack trends from 2022.
Every young learner needs a successful start to their learning journey in the primary computing classroom. One aspect of this for teachers is to introduce programming to their learners in a structured way. As computing education is introduced in more schools, the need for research-informed strategies and approaches to support beginner programmers is growing. Over recent years, researchers have proposed various strategies to guide teachers and students, such as the block model, PRIMM, and, in the case of this month’s seminar, TIPP&SEE.
We need to give all learners a successful start in the primary computing classroom.
We are committed to making computing and creating with digital technologies accessible to all young people, including through our work with educators and researchers. In our current online research seminar series, we focus on computing education for primary-aged children (K–5, ages 5 to 11). In the series’ second seminar, we were delighted to welcome Dr Jean Salac, researcher in the Code & Cognition Lab at the University of Washington.
Dr Jean Salac
Jean’s work sits across computing education and human-computer interaction, with an emphasis on justice-focused computing for youth. She talked to the seminar attendees about her work on developing strategies to support primary school students learning to program in Scratch. Specifically, Jean described an approach called TIPP&SEE and how teachers can use it to guide their learners through programming activities.
What is TIPP&SEE?
TIPP&SEE is a metacognitive approach for programming in Scratch. The purpose of metacognitive strategies is to help students become more aware of their own learning processes.
The stages of the TIPP&SEE approach
TIPP&SEE scaffolds students as they learn from example Scratch projects: TIPP (Title, Instructions, Purpose, Play) is a scaffold to read and run a Scratch project, while SEE (Sprites, Events, Explore) is a scaffold to examine projects more deeply and begin to adapt them.
Using, modifying and creating
TIPP&SEE is inspired by the work of Irene Lee and colleagues who proposed a progressive three-stage approach called Use-Modify-Create. Following that approach, learners move from reading pre-existing programs (“not mine”) to adapting and creating their own programs (“mine”) and gradually increase ownership of their learning.
TIPP&SEE builds on the Use-Modify-Create progression.
Proponents of scaffolded approaches like Use-Modify-Create argue that engaging learners in cycles of using existing programs (e.g. worked examples) before they move to adapting and creating new programs encourages ownership and agency in learning. TIPP&SEE builds on this model by providing additional scaffolding measures to support learners.
Impact of TIPP&SEE
Jean presented some promising results from her research on the use of TIPP&SEE in classrooms. In one study, fourth-grade learners (age 9 to 10) were randomly assigned to one of two groups: (i) Use-Modify-Create only (the control group) or (ii) Use-Modify-Create with TIPP&SEE. Jean found that, compared to learners in the control group, learners in the TIPP&SEE group:
Were more thorough and completed more tasks
Wrote longer scripts during open-ended tasks
Used more learned blocks during open-ended tasks
Performed better in assessments
In another study, Jean compared how learners in the TIPP&SEE and control groups performed on several cognitive tests. She found that, in the TIPP&SEE group, students with learning difficulties performed as well as students without learning difficulties. In other words, in the TIPP&SEE group the performance gap was much narrower than in the control group. In our seminar, Jean argued that this indicates the TIPP&SEE scaffolding provides much-needed support to diverse groups of students.
Using TIPP&SEE in the classroom
TIPP&SEE is a multi-step strategy where learners start by looking at the surface elements of a program, and then move on to examining the underlying code. In the TIPP phase, learners first read the title and instructions of a Scratch project, identify its purpose, and then play the project to see what it does.
In the second phase, SEE, learners look inside the Scratch project to click on sprites and predict what each script is doing. They then make changes to the Scratch code and see how the project’s output changes. By changing parameters, learners can observe which part of the output changes as a result and then reason how each block functions. This practice is called deliberate tinkering because it encourages learners to observe changes while executing programs multiple times with different parameters.
Jean’s talk highlighted the need for computing to be inclusive and to give equitable access to all learners. The field of computing education is still in its infancy, though our understanding of how young people learn about computing is growing. We ourselves work to deepen our understanding of how young people learn through computing and digital making experiences.
In our own research, we have been investigating similar teaching approaches for programming, including the use of the PRIMM approach in the UK, so we were very interested to learn about different approaches and country contexts. We are grateful to Dr Jean Salac for sharing her work with researchers and teachers alike. Watch the recording of Jean’s seminar to hear more:
Free support for teaching programming and more to primary school learners
If you are looking for more free resources to help you structure your computing lessons:
Download our comprehensive classroom resources in The Computing Curriculum, which includes units of programming lessons for age 5 and upwards.
In the next seminar of our online series on primary computing, I will be presenting my research on integrated computing and literacy activities. Sign up now to join us for this session on Tues 7 March:
Industrial Control System (ICS) networking stacks are often the go-to bogeyman for infosec and cybersecurity professionals, and doubly so for offensive, red-team style security folks. How often have you been new on site, all ready to run a bog-standard nmap scan across the internal address space, only to be stopped by a frantic senior manager, “No, you can’t scan 192.168.69.0/24, that’s where the factory floor operates!”
“Why not?” you might ask—after all, isn’t it important to scan your IP-connected assets regularly to make sure they’re all accounted for and patched? Isn’t that kind of the one thing we tell literally anyone who asks, right after making sure your passwords are nice and long and random?
“Oh no,” this manager might plead, “if you scan them, they fall over, and it kills production. Minutes of downtime costs millions!”
Well, I’m happy to report that today, Rapid7’s Andreas Galauner has produced a technical deep dive whitepaper into the mysterious and opaque world of PLC protocols, and specifically, how you, intrepid IT explorer, can safely and securely scan around your CODESYS-based ICS footprint.
CODESYS is a protocol suite that runs a whole lot of industrial equipment. Sometimes it’s labeled clearly as such, and sometimes it’s not mentioned at all in the docs. While it is IP-based, it also uses some funky features of UDP multicast, which is one reason why scanning (or worse, fuzzing) these things blindly can cause a lot of trouble in the equipment that depends on it.
No spoilers, but if you’re the sort who always wondered why, exactly, flinging packets at the ICS network can lead to heartache and lost productivity, this is the paper for you. This goes double if you’re already a bit of a networking nerd.
If you’re not sure, here’s an easy test. Go and read this Errata Security blog about the infamous Hacker Jeopardy telnet question real quick. If you have any emotional response at all (hilarity, enlightenment, outrage, or a mix of all three), you’re definitely in the audience for this paper.
Best of all, this paper comes with some tooling; Andy has graciously open sourced a Wireshark plugin for CODESYS analysis, and an Nmap NSE script for safer scanning. You can grab those, right now, at our GitHub repo. Cower in the dark about ICS networks no more!