Every day, most of us both consume and create data. For example, we interpret data from weather forecasts to predict our chances of good weather for a special occasion, and we create data as our carbon footprint leaves a trail of energy consumption information behind us. Data is important in our lives, and countries around the world are expanding their school curricula to teach the knowledge and skills required to work with data, including at primary (K–5) level.
Kate Farrell and Prof. Judy Robertson
In our most recent research seminar, attendees heard about a research-based initiative called Data Education in Schools. The speakers, Kate Farrell and Professor Judy Robertson from the University of Edinburgh, Scotland, shared how this project aims to empower learners to develop data literacy skills and succeed in a data-driven world.
“Data literacy is the ability to ask questions, collect, analyse, interpret and communicate stories about data.”
– Kate Farrell & Prof. Judy Robertson
Being a data citizen
Scotland’s national curriculum does not explicitly mention data literacy, but the topic is embedded in many subjects such as Maths, English, Technologies, and Social Studies. Teachers in Scotland, particularly in primary schools, have the flexibility to deliver learning in an interdisciplinary way through project-based learning. Therefore, the team behind Data Education in Schools developed a set of cross-curricular data literacy projects. Educators and education policy makers in other countries who are looking to integrate computing topics with other subjects may also be interested in this approach.
Data citizens have the skills they need to thrive in a world shaped by digital technology.
The Data Education in Schools projects are aimed not just at giving learners skills they may need for future jobs, but also at equipping them as data citizens in today’s world. A data citizen can think critically, interpret data, and share insights with others to effect change.
Kate and Judy shared an example of data citizenship from a project they had worked on with a primary school. The learners gathered data about how much plastic waste was being generated in their canteen. They created a data visualisation in the form of a giant graph of types of rubbish on the canteen floor and presented this to their local council.
Sorting food waste from lunch by type of material
As a result, the council made changes that reduced the amount of plastic used in the canteen. This shows how data citizens are able to communicate insights from data to influence decisions.
A cycle for data literacy projects
Across its projects, the Data Education in Schools initiative uses a problem-solving cycle called the PPDAC cycle. This cycle is a useful tool for both creating educational resources and teaching: you can use it to structure resources and to focus on the specific skills learners need to develop.
The PPDAC data problem-solving cycle
The five stages of the cycle are:
Problem: Identifying the problem or question to be answered
Plan: Deciding what data to collect or use to answer the question
Data: Collecting, cleaning, and organising the data
Analysis: Preparing, modelling, and visualising the data, e.g. in a graph or pictogram
Conclusion: Reviewing what has been learned about the problem and communicating this with others
Smaller data literacy projects may focus on one or two stages within the cycle so learners can develop specific skills or build on previous learning. A large project usually includes all five stages, and sometimes involves moving backwards — for example, to refine the problem — as well as forwards.
Data literacy for primary school learners
At primary school, the aim of data literacy projects is to give learners an intuitive grasp of what data looks like and how to make sense of graphs and tables. Our speakers gave some great examples of playful approaches to data. Such approaches can be helpful because younger learners may benefit from working with tangible objects, e.g. LEGO bricks, which can be sorted by their characteristics. Kate and Judy told us about one learner who collected data about their clothes and drew the results in the form of clothes on a washing line — a great example of how tangible objects also inspire young people’s creativity.
As learners get older, they can begin to work with digital data, including data they collect themselves using physical computing devices such as BBC micro:bit microcontrollers or Raspberry Pi computers.
Free resources for primary (and secondary) schools
For many attendees, one of the highlights of the seminar was seeing the range of high-quality teaching resources for learners aged 3–18 that are part of the Data Education in Schools project. These include:
Data 101 videos: A set of 11 videos to help primary and secondary teachers understand data literacy better.
Lesson resources: Lots of projects to develop learners’ data literacy skills. These are mapped to the Scottish primary and secondary curriculum, but can be adapted for use in other countries too.
More resources are due to be published later in 2023, including a set of prompt cards to guide learners through the PPDAC cycle, a handbook for teachers to support the teaching of data literacy, and a set of virtual data-themed escape rooms.
You may also be interested in the units of work on data literacy skills that are part of The Computing Curriculum, our complete set of classroom resources to teach computing to 5- to 16-year-olds.
Join our next seminar on primary computing education
At our next seminar we welcome Aim Unahalekhaka from Tufts University, USA, who will share research about a rubric to evaluate young learners’ ScratchJr projects. If you have a tablet with ScratchJr installed, make sure to have it available to try out some activities. The seminar will take place online on Tuesday 6 June at 17.00 UK time; sign up now so you don’t miss out.
To find out more about connecting research to practice for primary computing education, you can see a list of our upcoming monthly seminars on primary (K–5) teaching and learning and watch the recordings of previous seminars in this series.
Broadening participation and finding new entry points for young people to engage with computing is part of how we pursue our mission here at the Raspberry Pi Foundation. It was also the focus of our March online seminar, led by our own Dr Bobby Whyte. In this third seminar of our series on computing education for primary-aged children, Bobby presented his work on ‘designing multimodal composition activities for integrated K-5 programming and storytelling’. In this research he explored the integration of computing and literacy education, and the implications and limitations for classroom practice.
Motivated by challenges Bobby experienced first-hand as a primary school teacher, his two studies on the topic contribute to the body of research aiming to make computing less narrow and difficult. In this work, Bobby integrated programming and storytelling as a way of making the computing curriculum more applicable, relevant, and contextualised.
Critically for computing educators and researchers in the area, Bobby explored how theories related to ‘programming as writing’ translate into practice, and what the implications of designing and delivering integrated lessons in classrooms are. While the two studies described here took place in the context of UK schooling, we can learn universal lessons from this work.
What is multimodal composition?
In the seminar Bobby made a distinction between applying computing to literacy (or vice versa) and true integration of programming and storytelling. To achieve true integration in the two studies he conducted, Bobby used the idea of ‘multimodal composition’ (MMC). A multimodal composition is defined as “a composition that employs a variety of modes, including sound, writing, image, and gesture/movement [… with] a communicative function”.
Storytelling comes together with programming in a multimodal composition as learners create a program to tell a story where they:
Decide on content and representation (the characters, the setting, the backdrop)
Structure text they’ve written
Use technical aspects (e.g. motion blocks, tension) to achieve effects for narrative purposes
Defining multimodal composition (MMC) for a visual programming context
Multimodality for programming and storytelling in the classroom
To investigate the use of MMC in the classroom, Bobby started by designing a curriculum unit of lessons. He mapped the unit’s MMC activities to specific storytelling and programming learning objectives. The MMC activities were designed using design-based research, an approach in which something is designed and tested iteratively in real-world contexts. In practice that means Bobby collaborated with teachers and students to analyse, evaluate, and adapt the unit’s activities.
Mapping of the MMC activities to storytelling and programming learning objectives
The first of two studies to explore the design and implementation of MMC activities was conducted with 10 K-5 students (age 9 to 11) and showed promising results. All students approached the composition task multimodally, using multiple representations for specific purposes. In other words, they conveyed different parts of their stories using either text, sound, or images.
Bobby found that broadcast messages and loops were the least used blocks among the group. As a consequence, he modified the curriculum unit to include additional scaffolding and instructional support on how and why the students might embed these elements.
Bobby modified the classroom unit based on findings from his first study
In the second study, the MMC activities were evaluated in a classroom of 28 K-5 students led by one teacher over two weeks. Findings indicated that students appreciated the longer multi-session project. The teacher reported being satisfied with the project work the learners completed and the skills they practised. The teacher also further integrated and adapted the unit into their classroom practice after the research project had been completed.
How might you use these research findings?
Factors that impacted the integration of storytelling and programming included the teacher’s confidence in teaching programming, as well as the teacher’s ability to differentiate the kind of support each student needed depending on their previous programming experience.
In addition, there are considerations regarding the curriculum. The school where the second study took place considered the activities in the unit to be literacy-light, as the English literacy curriculum is ‘text-heavy’ and the addition of multimodal elements ‘wastes’ opportunities to produce stories that are more text-based.
Bobby’s research indicates that MMC provides useful opportunities for learners to simultaneously pursue storytelling and programming goals, and the curriculum unit designed in the research proved adaptable for the teacher to integrate into their classroom practice. However, Bobby cautioned that there’s a need to carefully consider both the benefits and trade-offs when designing cross-curricular integration projects in order to ensure a fair representation of both subjects.
Can you see an opportunity for integrating programming and storytelling in your classroom? Let us know your thoughts or questions in the comments below.
Join our next seminar on primary computing education
At our next seminar, we welcome Kate Farrell and Professor Judy Robertson (University of Edinburgh). This session will introduce you to how data literacy can be taught in primary and early-years education across different curricular areas. It will take place online on Tuesday 9 May at 17.00 UK time; don’t miss out, and sign up now.
To find out more about connecting research to practice for primary computing education, you can see a list of our upcoming monthly seminars on primary (K–5) teaching and learning and watch the recordings of previous seminars in this series.
It would have been really cool to combine those two words to make “inundata,” but it would have been disastrous for SEO purposes. It’s all meant to kick off a conversation about the state of security organizations with regard to threat intelligence. There are several key challenges to overcome on the road to clarity in threat intelligence operations and enabling actionable data.
For the commissioned study, Forrester conducted interviews with four Rapid7 customers and collated their responses into the form of one representative organization and its experiences after implementing Rapid7’s threat intelligence solution, Threat Command. Interviewees noted that prior to utilizing Threat Command, lack of visibility and unactionable data across legacy systems were hampering efforts to innovate in threat detection. The study stated:
“Interviewees noted that there was an immense amount of data to examine with their previous solutions and systems. This resulted in limited visibility into the potential security threats to both interviewees’ organizations and their customers. The data the legacy solutions provided was confusing to navigate. There was no singular accounting of assets or solution to provide curated customizable information.”
A key part of that finding is that limited visibility can turn into potential liabilities for an organization’s customers – like the SonicWall attack a couple of years ago. These kinds of incidents can cause immediate pandemonium within the organizations of their downstream customers.
In this same scenario, lack of visibility can also be disastrous for the supply chain. Instead of affecting the end users of one product, now there’s a whole network of vendors and their end users who could be adversely affected by lack of visibility into threat intelligence originating from just one organization. By consolidating information into a centralized asset list and gaining greater data visibility through a single pane of glass, security teams can begin to mitigate these concerns.
Time-consuming processes for investigation and analysis
Rapid7 customers interviewed for the study also felt that their legacy threat intelligence solutions forced teams to “spend hours manually searching through different platforms, such as a web-based Git repository or the dark web, to investigate all potential threat alerts, many of which were irrelevant.”
Because of these inefficiencies, additional and unforeseen work was created on the backend, along with what we can assume were many overstretched analysts. How can organizations, then, gain back – and create new – efficiencies? First, alert context is a must. With Threat Command, security organizations can:
Receive actionable alerts categorized by severity, type (phishing, data leakage), and source (social media, black markets).
Implement alert automation rules based on your specific criteria so you can precisely customize and fine-tune alert management.
Accelerate alert triage and shorten investigation time by leveraging Threat Command Extend™ (browser extension) as well as 24x7x365 availability of Rapid7 expert analysts.
By leveraging these features, the study’s composite organization was able to surface far more actionable alerts and see faster remediation processes. It saved $302,000 over three years by avoiding the cost of hiring an additional security analyst.
Pivoting away from a constant reactive approach to cyber incidents
When it comes to security, no one ever aims for an after-the-fact approach. But sometimes a SOC may realize that’s the position it’s been in for quite some time. Nothing good will come from that for the business or anyone on the security team. Indeed, interviewees in the study supported this perspective:
“Legacy systems and internal processes led to a reactive approach for their threat intelligence investigations and security responses. Security team members would get alerts from systems or other teams with limited context, which led to inefficient triage or larger issues. As a result, the teams sacrificed quality for speed.”
The study notes how interviewees’ organizations were then motivated to look for a solution with improved search capabilities across locations such as social media and the dark web. After implementing Threat Command, it was possible for those organizations to receive early warning of potential attacks and automated intelligence of vulnerabilities targeting their networks and customers.
By creating processes that are centered around early-warning methodologies and a more proactive approach to security, the composite organization was able to reduce the likelihood of a breach by up to 70% over the course of three years.
Security is about the solutions
Challenges in a SOC don’t have to mean stopping everything and preparing for a years-long audit of all processes and solutions. It is possible to overcome challenges relatively quickly with a solution like Threat Command that can show immediate value after an accelerated onboarding process. And it is possible to vastly improve security posture in the face of an increasing volume of global threats.
For a deeper-dive into The Total Economic Impact™ of Rapid7 Threat Command For Digital Risk Protection and Threat Intelligence, download the study now. You can also read the previous blog entry in this series here.
The attack surface of the United Kingdom’s 350 largest publicly traded companies has—drum roll, please—improved. But it could be better. Those are the high-level findings of the latest in Rapid7’s looks at the cybersecurity health of companies tied to some of the globe’s largest stock indices. This is the second time in more than two years that we have looked at the FTSE 350 to gauge how well the UK’s entire business arena is faring against cyber threats. It turns out they’ve improved in that time and are on par with the other big indices we’ve looked at, though in some specific places there is definitely room for improvement.
We chose the FTSE 350 as a benchmark in determining the cyber health of UK businesses because they are by and large some of the largest companies in the country and are not as resource-constrained as some other, smaller companies might be. This gives us a pretty even playing field on which to analyze their health and extrapolate to the overall health of the region. We’ve done this with several other indices (most recently the ASX 200) and find it works well to provide a snapshot of what’s going on in the region.
In this report, we looked first at the overall attack surface of the FTSE 350 companies, broken down by industry. We also looked at the overall health of their email and web server security. All three areas showed improvement, as well as points for concern.
Attack Surface
By and large, the attack surfaces of the companies that make up the FTSE 350 were quite limited and in line with other major indices around the world. But when you look at the individual industries that make up the FTSE 350, you start to see some red flags.
For instance, financial and technology companies have by far the largest exposure through ports open to the internet. Technology companies averaged well over 1,000 ports with internet exposure and financial companies averaged nearly 800, which is five and four times the next-highest industry, respectively. When it comes to particularly high-risk ports, the financial sector is the biggest offender with an average of 12; for comparison, the technology sector had three.
Email Security
Email security is one area where we’ve seen some laudable improvement over the last time we looked at the FTSE 350. For instance, use of Domain-based Message Authentication, Reporting & Conformance (DMARC) policy is up 29%. However, the implementation of Domain Name System Security Extensions (DNSSEC) is at just 4% of the 350 companies that make up the index. Sadly, this too is on par with other indices. They should all seek improvements (alright, we’ll get off our soapbox).
Web Server Security
Going after vulnerable web servers is a favorite vector for attackers. When looking at the status of FTSE 350 company web servers, we found that of the three most common types (NGINX, Apache, and IIS), not all were running supported or fully patched versions at high enough rates. For instance, only about 40% of NGINX servers were supported or fully patched, whereas 89% of Apache and 80% of IIS servers were. That’s a pretty big discrepancy. Thankfully, Apache and IIS are the dominant servers in this region, minimizing the overall risk.
If you want to take a look at our report you can read it here. If you’d like to check out the report we conducted for Australia’s ASX 200 it is available here.
At Cloudflare, helping to build a better Internet is not just a catchy saying. We are committed to the long-term process of standards development. We love the work of pushing the fundamental technology of the Internet forward in ways that are accessible to everyone. Today we are adding even more substance to that commitment. One of our core beliefs is that privacy is a human right. We believe that to achieve that right the most advanced cryptography needs to be available to everyone, free of charge, forever. Today, we are announcing that our implementations of post-quantum cryptography will meet that standard: available to everyone, and included free of charge, forever.
We have a proud history of taking paid encryption products and launching them to the Internet at scale for free, even at the cost of short- and long-term revenue, because it’s the right thing to do. In 2014, we made SSL free for every Cloudflare customer with Universal SSL. As we make our implementations of post-quantum cryptography free forever today, we do it in the spirit of that first major announcement:
“Having cutting-edge encryption may not seem important to a small blog, but it is critical to advancing the encrypted-by-default future of the Internet. Every byte, however seemingly mundane, that flows encrypted across the Internet makes it more difficult for those who wish to intercept, throttle, or censor the web. In other words, ensuring your personal blog is available over HTTPS makes it more likely that a human rights organization or social media service or independent journalist will be accessible around the world. Together we can do great things.”
We hope that others will follow us in making their implementations of PQC free as well so that we can create a secure and private Internet without a “quantum” up-charge.
The Internet has matured since the 1990s and the launch of SSL. What was once an experimental frontier has turned into the underlying fabric of modern society. It runs in our most critical infrastructure like power systems, hospitals, airports, and banks. We trust it with our most precious memories. We trust it with our secrets. That’s why the Internet needs to be private by default. It needs to be secure by default. It’s why we’re committed to ensuring that anyone and everyone can achieve post-quantum security for free, and can start deploying it at scale today.
Our work on post-quantum crypto is driven by the thesis that quantum computers that can break conventional cryptography create a similar problem to the Year 2000 bug. We know there is going to be a problem in the future that could have catastrophic consequences for users, businesses, and even nation states. The difference this time is we don’t know the date and time that this break in the paradigm of how computers operate will occur. We need to prepare today to be ready for this threat.
To that end we have been preparing for this transition since 2018. At that time we were concerned about the implementation problems other large protocol transitions, like the move to TLS 1.3, had caused our customers and wanted to get ahead of it. Cloudflare Research over the last few years has become a leader and champion of the idea that PQC security wasn’t an afterthought for tomorrow but a real problem that needed to be solved today. We have collaborated with industry partners like Google and Mozilla, contributed to development through participation in the IETF, and even launched an open source experimental cryptography suite to help move the needle. We have tried hard to work with everyone that wanted to be a part of the process and show our work along the way.
As we have worked with our partners in both industry and academia to help prepare us and the Internet for a post-quantum future, we have become dismayed by an emerging trend. There are a growing number of vendors out there that want to cash in on the legitimate fear that nervous executives, privacy advocates, and government leaders have about quantum computing breaking traditional encryption. These vendors offer vague solutions based on unproven technologies like “Quantum Key Distribution” or “Post Quantum Security” libraries that package non-standard algorithms that haven’t been through public review with exorbitant price tags like RSA did in the 1990s. They often love to throw around phrases like “AI” and “Post Quantum” without really showing their work on how any of their systems actually function. Security and privacy are table stakes in the modern Internet, and no one should be charged just to get the baseline protection needed in our contemporary world.
Launch your PQC transition today
Testing and adopting post-quantum cryptography in modern networks doesn’t have to be hard! In fact, Cloudflare customers can test PQC in their systems today, as we describe later in this post.
After testing out how Cloudflare can help you implement PQC in your networks, it’s time to start preparing for the transition to PQC in all of your systems. This first step of inventorying and identifying is critical to a smooth rollout. We know this firsthand: we undertook an extensive evaluation of all of our systems to earn our FedRAMP Authorization certifications, and we are doing a similar evaluation again to transition all of our internal systems to PQC.
How we are setting ourselves up for the future of quantum computing
Here’s a sneak preview of the path that we are developing right now to fully secure Cloudflare itself against the cryptographic threat of quantum computers. We can break that path down into three parts: internal systems, zero trust, and open source contributions.
The first part of our path to full PQC adoption at Cloudflare covers all of our connections. The connection between you and Cloudflare is just one part of the larger connection path. Inside our internal systems, we are implementing two significant upgrades in 2023 to ensure that they are PQC-secure as well.
The first is that we use BoringSSL for a substantial number of connections. We currently use our own fork, and we are excited that upstream support for Kyber is underway. Any additional internal connections that use a different cryptographic system are being upgraded as well. The second major upgrade is to shift the remaining internal connections that use TLS 1.2 to TLS 1.3. This combination of Kyber and TLS 1.3 will make our internal connections faster and more secure, even though we use a hybrid of classical and post-quantum cryptography. It’s a speed and security win-win. And we proved this powerhouse combination would provide that speed and security over three and a half years ago, thanks to the groundbreaking work of Cloudflare Research and Google.
The next part of that path is all about using PQC and zero trust as allies together. As we think about the security posture of tomorrow being based around post-quantum cryptography, we have to look at the other critical security paradigm being implemented today: zero trust. Today, the zero trust vendor landscape is littered with products that fail to support common protocols like IPv6 and TLS 1.2, let alone the next generation of protocols like TLS 1.3 and QUIC that enable PQC. So many middleboxes struggle under the load of today’s modern protocols. They artificially downgrade connections and break end user security all in the name of inspecting traffic because they don’t have a better solution. Organizations big and small struggled to support customers that wanted the highest possible performance and security, while also keeping their businesses safe, because of the resistance of these vendors to adapt to modern standards. We do not want to repeat the mistakes of the past. We are planning and evaluating the needed upgrades to all of our zero trust products to support PQC out of the box. We believe that zero trust and post-quantum cryptography are not at odds with one another, but rather together are the future standard of security.
Finally, it’s not enough for us to do this for ourselves and for our customers. The Internet is only as strong as the weakest links in the connection chains that network us all together. Every connection on the Internet needs the strongest possible encryption so that businesses can be secure and everyday users can be assured of their privacy. We believe that this core technology should be vendor-agnostic and open to everyone. To help make that happen, the final part of the path is all about contributing to open source projects. We have already been focused on releases to CIRCL. CIRCL (Cloudflare Interoperable, Reusable Cryptographic Library) is a collection of cryptographic primitives written in Go. The goal of this library is to be used as a tool for experimental deployment of post-quantum cryptographic algorithms.
Later this year we will be publishing, as open source, a set of easy-to-adopt, vendor-neutral roadmaps to help you upgrade your own systems to be secure against the future today. We want the security and privacy created by post-quantum cryptography to be accessible and free for everyone. We will also keep writing extensively about our post-quantum journey. To learn more about how you can turn on PQC today, and how we have been building post-quantum cryptography at Cloudflare, please check out these resources:
News coverage of a recent paper caused a bit of a stir with this headline: “AI Helps Crack NIST-Recommended Post-Quantum Encryption Algorithm”. The news article claimed that Kyber, the encryption algorithm in question, which we have deployed world-wide, had been “broken.” Even more dramatically, the news article claimed that “the revolutionary aspect of the research was to apply deep learning analysis to side-channel differential analysis”, which seems aimed at scaring the reader into wondering what Artificial Intelligence (AI) will break next.
Reporting on the paper has been wildly inaccurate: Kyber is not broken and AI has been used for more than a decade now to aid side-channel attacks. To be crystal clear: our concern is with the news reporting around the paper, not the quality of the paper itself. In this blog post, we will explain how AI is actually helpful in cryptanalysis and dive into the paper by Dubrova, Ngo, and Gärtner (DNG), that has been misrepresented by the news coverage. We’re honored to have Prof. Dr. Lejla Batina and Dr. Stjepan Picek, world-renowned experts in the field of applying AI to side-channel attacks, join us on this blog.
We start with some background, first on side-channel attacks and then on Kyber, before we dive into the paper.
Breaking cryptography
When one thinks of breaking cryptography, one imagines a room full of mathematicians puzzling over minute patterns in intercepted messages, aided by giant computers, until they figure out the key. Famously, in World War II the Nazis’ Enigma cipher machine was completely broken in this way, allowing the Allied forces to read along with their communications.
It’s exceedingly rare for modern established cryptography to get broken head-on in this way. The last catastrophically broken cipher was RC4, designed in 1987, while AES, designed in 1998, stands proud with barely a scratch. The last big break of a cryptographic hash was on SHA-1, designed in 1995, while SHA-2, published in 2001, remains untouched in practice.
So what to do if you can’t break the cryptography head-on? Well, you get clever.
Side-channel attacks
Can you guess the pin code for this gate?
You can clearly see that some of the keys are more worn than the others, suggesting heavy use. This observation gives us some insight into the correct pin, namely which digits it contains. But the correct order is not immediately clear. It might be 1580, 8510, or even 115085, but guessing is now a lot easier than trying every possible pin code. This is an example of a side-channel attack. Using the security feature (entering the PIN) had an unintended consequence (abrading the paint), which leaks information.
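To get a sense of how much easier, here is a quick back-of-the-envelope check in Python, assuming the four worn keys are 1, 5, 8, and 0 and the code uses each of them exactly once:

```python
from itertools import permutations

worn_keys = "1580"
candidates = ["".join(p) for p in permutations(worn_keys)]
print(len(candidates))  # 24 orderings to try, versus 10,000 possible 4-digit codes
```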
There are many different types of side channels, and which one you should worry about depends on the context. For instance, the sounds your keyboard makes as you type leaks what you write, but you should not worry about that if no one is listening in.
Remote timing side channel
When writing cryptography in software, one of the best known side channels is the time it takes for an algorithm to run. For example, let’s take the classic example of creating an RSA signature. Grossly simplified, to sign a message m with private key d, we compute the signature s as m^d (mod n). Computing the exponent of a big number is hard, but luckily, because we’re doing modular arithmetic, there is the square-and-multiply trick. Here is a naive implementation in pseudocode:
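A minimal sketch in Python; the helper names bit and powerOfM are assumptions, chosen to match the constant-time fix shown below:

```python
def bit(x, i):
    # Return the i-th bit of x (0 or 1).
    return (x >> i) & 1

def sign(m, d, n, key_bits):
    # Naive square-and-multiply: computes the signature s = m^d mod n.
    s = 1
    powerOfM = m % n
    for i in range(key_bits):                    # walk the secret exponent bit by bit
        if bit(d, i) == 1:
            s = (s * powerOfM) % n               # multiply step: only runs for 1 bits
        powerOfM = (powerOfM * powerOfM) % n     # square step: runs every iteration
    return s
```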
The algorithm loops over the bits of the secret key and does a multiply step if the current bit is a 1. Clearly, the runtime depends on the secret key. Not great, but if the attacker can only time the full run, then they only learn the number of 1s in the secret key. The typical catastrophic timing attack against RSA instead hides behind the “mod n”. In a naive implementation, this modular reduction is slower if the number being reduced is larger than or equal to n. This allows an attacker to send specially crafted messages to tease out the secret key bit by bit, and similar attacks are surprisingly practical.
Because of this, the mantra is: cryptography should run in “constant time”. This means that the runtime does not depend on any secret information. In our example, to remove the first timing issue, one would replace the if-statement with something equivalent to:
s = ((s * powerOfM) mod n) * bit(d, i) + s * (1 - bit(d, i))
This ensures that the multiplication is always done. Similar countermeasures prevent practically all remote timing attacks.
Power side-channel
The story is quite different for power side-channel attacks. Again, the classic example is RSA signatures. If we hook up an oscilloscope to a smartcard that uses the naive algorithm from before, and measure the power usage while it signs, we can read off the private key by eye.
Even if we use a constant-time implementation, there are still minute changes in power usage that can be detected. The underlying issue is that hardware gates that switch use more power than those that don’t. For instance, computing 127 + 64 takes more energy than 64 + 64.
127+64 and 64+64 in binary. There are more switched bits in the first.
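As a toy illustration of that intuition, the following Python snippet counts how many bits flip when a register goes from holding one operand to holding the sum; this is a crude proxy for switching activity, not a real power model:

```python
def flipped_bits(before, after):
    # Count the bits that change state between two register values.
    return bin(before ^ after).count("1")

print(flipped_bits(64, 127 + 64))  # 64 -> 191: 8 bits flip
print(flipped_bits(64, 64 + 64))   # 64 -> 128: 2 bits flip
```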
Masking
A common countermeasure against power side-channel leakage is masking. This means that before using the secret information, it is split randomly into shares. Then, the brunt of the computation is done on the shares, which are finally recombined.
In the case of RSA, before creating a new signature, one can generate a random r and compute m^(d+r) (mod n) and m^(−r) (mod n) separately. From these, the final signature m^d (mod n) can be computed with some extra care, since m^(d+r) · m^(−r) = m^d.
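A minimal sketch of that idea in Python, assuming we know φ(n) and that gcd(m, n) = 1, so exponents can be reduced modulo φ(n); this reduction is the “extra care” mentioned above:

```python
import secrets

def masked_sign(m, d, n, phi_n):
    # Split the secret exponent into two random shares: d = (d + r) + (-r).
    r = secrets.randbelow(phi_n)
    share1 = pow(m, (d + r) % phi_n, n)  # m^(d+r) mod n
    share2 = pow(m, phi_n - r, n)        # m^(-r) mod n, written as a positive exponent
    return (share1 * share2) % n         # recombine into the signature m^d mod n

# Toy check with textbook RSA parameters n = 3233, phi(n) = 3120, d = 2753:
assert masked_sign(65, 2753, 3233, 3120) == pow(65, 2753, 3233)
```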
Masking is not a perfect defense. The parts where shares are created or recombined into the final value are especially vulnerable. It does make it harder for the attacker: they will need to collect more power traces to cut through the noise. In our example we used two shares, but we could bump that up even higher. There is a trade-off between power side-channel resistance and implementation cost.
One of the challenging parts in the field is to estimate how much secret information is actually leaked through the traces, and how to extract it. Here machine learning enters the picture.
Machine learning: extracting the key from the traces
Machine learning, of which deep learning is a part, represents the capability of a system to acquire its knowledge by extracting patterns from data — in this case, the secrets from the power traces. Machine learning algorithms can be divided into several categories based on their learning style. The most popular machine learning algorithms in side-channel attacks follow the supervised learning approach. In supervised learning, there are two phases: 1) training, where a machine learning model is trained based on known labeled examples (e.g., side-channel measurements where we know the key) and 2) testing, where, based on the trained model and additional side-channel measurements (now, with an unknown key), the attacker guesses the secret key. A common depiction of such attacks is given in the figure below.
While the threat model may sound counterintuitive, it is actually not difficult to imagine that the attacker will have access (and control) of a device similar to the one being attacked.
In side-channel analysis, the attacks following those two phases (training and testing) are called profiling attacks.
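Here is a rough sketch of those two phases in Python; the scikit-learn classifier and the synthetic “leak” are illustrative stand-ins for real traces and a real model:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-ins for measurements: 1,000 profiling traces of 50 samples,
# each labeled with the secret value (0-15) it leaks at sample index 10.
profiling_traces = rng.normal(size=(1000, 50))
known_secrets = rng.integers(0, 16, size=1000)
profiling_traces[:, 10] += known_secrets

# Training phase: learn to map a power trace to the secret value it leaks.
model = RandomForestClassifier(n_estimators=100).fit(profiling_traces, known_secrets)

# Testing phase: traces from the attacked device, secret unknown to the attacker.
true_secret = 7
attack_traces = rng.normal(size=(20, 50))
attack_traces[:, 10] += true_secret

probs = model.predict_proba(attack_traces)   # class probabilities per trace
scores = np.log(probs + 1e-12).sum(axis=0)   # combine evidence across traces
print(model.classes_[np.argmax(scores)])     # the attacker's best guess: likely 7
```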
Profiling attacks are not new. The first such attack, called the template attack, appeared in 2002. Diverse machine learning techniques have been used since around 2010, all reporting good results and the ability to break various targets. The big breakthrough came in 2016, when the side-channel community started using deep learning. It greatly increased the effectiveness of power side-channel attacks against both symmetric-key and public-key cryptography, even if the targets were protected with, for instance, masking or other countermeasures. To be clear: it doesn’t magically figure out the key, but it gets much better at extracting the leaked bits from a smaller number of power traces.
While machine learning-based side-channel attacks are powerful, they have limitations. Carefully implemented countermeasures make the attacks more difficult to conduct. Finding a good machine learning model that can break a target can be far from trivial: this phase, commonly called tuning, can last weeks on powerful clusters.
What will the future bring for machine learning/AI in side-channel analysis? Counterintuitively, we would like to see more powerful and easier-to-use attacks. You’d think that would make us worse off, but on the contrary, it will allow us to better estimate how much information is actually leaked by a device. We also hope that we will be able to better understand why certain attacks work (or not), so that more cost-effective countermeasures can be developed. As such, the future for AI in side-channel analysis is bright, especially for security evaluators, but we are still far from being able to break most targets in real-world applications.
Kyber
Kyber is a post-quantum (PQ) key encapsulation method (KEM). After a six-year worldwide competition, the National Institute of Standards and Technology (NIST) selected Kyber as the post-quantum key agreement it will standardize. The goal of a key agreement is for two parties that haven’t talked to each other before to agree securely on a shared key they can use for symmetric encryption (such as ChaCha20-Poly1305). As a KEM, it works slightly differently, and with different terminology, than a traditional Diffie–Hellman key agreement (such as X25519):
When connecting to a website, the client first generates a new ephemeral keypair that consists of a private and a public key. It sends the public key to the server. The server then encapsulates a shared key with that public key, which gives it a random shared key (which it keeps) and a ciphertext in which the shared key is hidden (which it returns to the client). The client can then use its private key to decapsulate the shared key from the ciphertext. Now the server and client can communicate with each other using the shared key.
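The same flow as a Python sketch; kem_keygen, kem_encap, and kem_decap are hypothetical stand-ins for a real Kyber implementation, not an actual API:

```python
# Client: generate an ephemeral keypair and send the public key to the server.
public_key, private_key = kem_keygen()                   # hypothetical helper

# Server: encapsulate against the client's public key; keep the shared key and
# return the ciphertext that hides it.
shared_key_server, ciphertext = kem_encap(public_key)    # hypothetical helper

# Client: decapsulate the ciphertext with the private key.
shared_key_client = kem_decap(private_key, ciphertext)   # hypothetical helper

assert shared_key_client == shared_key_server  # both sides now hold the same key
```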
Key agreement is particularly important to make secure against attacks of quantum computers. The reason is that an attacker can store traffic today, and crack the key agreement in the future, revealing the shared key and all communication encrypted with it afterwards. That is why we have already deployed support for Kyber across our network.
The DNG paper
With all the background under our belt, we’re ready to take a look at the DNG paper. The authors perform a power side-channel attack on their own masked implementation of Kyber with six shares.
Point of attack
They attack the decapsulation step. In the decapsulation step, after the shared key is extracted, it’s encapsulated again and compared against the original ciphertext to detect tampering. For this re-encryption step, the precursor of the shared key—let’s call it the secret—is encoded bit by bit into a polynomial. To be precise, the 256-bit secret needs to be converted to a polynomial with 256 coefficients modulo q = 3329, where the ith coefficient is (q+1)/2 if the ith bit is 1 and zero otherwise.
This function sounds simple enough, but creating a masked version is tricky. The rub is that the natural way to create shares of the secret is to have shares that xor together to be the secret, and that the natural way to share polynomials is to have shares that add together to get to the intended polynomial.
This is the two-shares implementation of the conversion that the DNG paper attacks:
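The paper’s C listing can be summarized by the following Python sketch of the same pattern; the function and variable names are assumptions, and the real code operates on packed 16-bit words:

```python
Q = 3329

def shares_to_poly_shares(share1, share2, bits=256):
    # share1 ^ share2 is the secret; the two returned polynomial shares sum
    # (coefficient-wise, mod q) to the encoding of that secret.
    poly1, poly2 = [0] * bits, [0] * bits
    for i in range(bits):
        for share, poly in ((share1, poly1), (share2, poly2)):
            b = (share >> i) & 1
            mask = -b & 0xFFFF                       # 0xffff if the bit is 1, else 0
            poly[i] = (poly[i] + (mask & ((Q + 1) // 2))) % Q
    return poly1, poly2                              # handling a 1 switches more gates
```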
The code loops over the bits of the two shares. For each bit, it creates a mask that’s 0xffff if the bit was 1 and 0 otherwise. Then this mask is used to add (q+1)/2 to the polynomial share where appropriate. Processing a 1 will use a bit more power. It doesn’t take an AI to figure out that this will be a leaky function. In fact, this pattern was pointed out to be weak back in 2016, and explicitly mentioned to be a risk for masked Kyber in 2020. Apropos, one way to mitigate this is to process multiple bits at once — for the state of the art, tune into April 2023’s NIST PQC seminar. For the moment, let’s allow the paper its weak target.
The authors do not claim any fundamentally new attack here. Instead, they improve the effectiveness of the attack in two ways: the way they train the neural network, and how to use multiple traces more effectively by changing the ciphertext sent. So, what did they achieve?
Effectiveness
To test the attack, they use a ChipWhisperer-Lite board, which has a Cortex-M4 CPU that they downclock to 24 MHz. Power usage is sampled at 24 MHz with 10-bit precision.
To train the neural networks, 150,000 power traces are collected for decapsulation of different ciphertexts (with known shared key) for the same KEM keypair. This is already a somewhat unusual situation for a real-world attack: for key agreement KEM keypairs are ephemeral; generated and used only once. Still, there are certainly legitimate use cases for long-term KEM keypairs, such as for authentication, HPKE, and in particular ECH.
The training is a key step: different devices even from the same manufacturer can have wildly different power traces running the same code. Even if two devices are of the same model, their power traces might still differ significantly.
The main contribution highlighted by the authors is that they train their neural networks to attack an implementation with 6 shares, by starting with a neural network trained to attack an implementation with 5 shares. That one can be trained from a model to attack 4 shares, and so on. Thus to apply their method, of these 150,000 power traces, one-fifth must be from an implementation with 6 shares, another one-fifth from one with 5 shares, et cetera. It seems unlikely that anyone will deploy a device where an attacker can switch between the number of shares used in the masking on demand.
Given these affordances, the attack proper can commence. The authors report that, from a single power trace of a two-share decapsulation, they could recover the shared key under these ideal circumstances with probability… 0.12%. They do not report the numbers for single trace attacks on more than two shares.
When we’re allowed multiple traces of the same decapsulation, side-channel attacks become much more effective. The second trick is a clever twist on this: instead of creating a trace of decapsulation of exactly the same message, the authors rotate the ciphertext to move bits of the shared key in more favorable positions. With 4 traces that are rotations of the same message, the success probability against the two-shares implementation goes up to 78%. The six-share implementation stands firm at 0.5%. When allowing 20 traces from the six-share implementation, the shared key can be recovered with an 87% chance.
In practice
The hardware used in the demonstration might be somewhat comparable to a smart card, but it is very different from high-end devices such as smartphones, desktop computers, and servers. Simple power analysis side-channel attacks on even just embedded 1 GHz processors are much more challenging, requiring tens of thousands of traces using a high-end oscilloscope connected close to the processor. There are much better avenues for attack with this kind of physical access to a server: just connect the oscilloscope to the memory bus.
Except for especially vulnerable applications, such as smart cards and HSMs, power side-channel attacks are widely considered infeasible. Although sometimes, when the planets align, an especially potent power side-channel attack can be turned into a remote timing attack due to throttling, as demonstrated by Hertzbleed. To be clear: the present attack does not even come close.
And even for these vulnerable applications, such as smart cards, this attack is not particularly potent or surprising. In the field, it is not a question of whether a masked implementation leaks its secrets, because it always does. It’s a question of how hard it is to actually pull off. Papers such as the DNG paper contribute by helping manufacturers estimate how many countermeasures to put in place, to make attacks too costly. It is not the first paper studying power side-channel attacks on Kyber and it will not be the last.
Wrapping up
AI did not completely undermine a new wave of cryptography, but instead is a helpful tool to deal with noisy data and discover the vulnerabilities within it. There is a big difference between a direct break of cryptography and a power side-channel attack. Kyber is not broken, and the presented power side-channel attack is not cause for alarm.
Today we are announcing the general availability of Amazon Lightsail for Research, a new offering that makes it easy for researchers and students to create and manage high-performance CPU or GPU research computers in the cloud in just a few clicks. You can use your preferred integrated development environments (IDEs), such as the preinstalled Jupyter, RStudio, Scilab, or VSCodium, or the native Ubuntu operating system on your research computer.
You no longer need to use your own research laptop or shared school computers for analyzing larger datasets or running complex simulations. You can create your own research environments and directly access the applications running on your research computer remotely via a web browser. You can also easily upload data to and download data from your research computer via a simple web interface.
You pay only for the duration the computers are in use and can delete them at any time. You can also use budgeting controls that automatically stop your computer when it’s not in use. Lightsail for Research offers all-inclusive pricing for compute, storage, and data transfer, so you know exactly how much you will pay for the duration you use the research computer.
Get Started with Amazon Lightsail for Research
To get started, navigate to the Lightsail for Research console, and choose Virtual computers in the left menu. You can see that my research computers, named “channy-jupyter” and “channy-rstudio”, have already been created.
Choose Create virtual computer to create a new research computer, and select which software you’d like preinstalled on your computer and what type of research computer you’d like to create.
In the first step, choose the application you want installed on your computer and the AWS Region it will be located in. We support Jupyter, RStudio, Scilab, and VSCodium. You can install additional packages and extensions through the interface of these IDE applications.
Next, choose the desired virtual hardware type, including a fixed amount of compute (vCPUs or GPUs), memory (RAM), SSD-based storage volume (disk) space, and a monthly data transfer allowance. Bundles are charged on an hourly and on-demand basis.
Standard types are compute-optimized and ideal for compute-bound applications that benefit from high-performance processors.
Name           vCPUs   Memory   Storage   Monthly data transfer allowance*
Standard XL    4       8 GB     50 GB     0.5 TB
Standard 2XL   8       16 GB    50 GB     0.5 TB
Standard 4XL   16      32 GB    50 GB     0.5 TB
GPU types provide a high-performance platform for general-purpose GPU computing. You can use these bundles to accelerate scientific, engineering, and rendering applications and workloads.
Name      GPU   vCPUs   Memory   Storage   Monthly data transfer allowance*
GPU XL    1     4       16 GB    50 GB     1 TB
GPU 2XL   1     8       32 GB    50 GB     1 TB
GPU 4XL   1     16      64 GB    50 GB     1 TB
* AWS created the Global Data Egress Waiver (GDEW) program to help eligible researchers and academic institutions use AWS services by waiving data egress fees. To learn more, see the blog post.
After making your selections, name your computer and choose Create virtual computer to create your research computer. Once your computer is created and running, choose the Launch application button to open a new window that will display the preinstalled application you selected.
Lightsail for Research Features
As with existing Lightsail instances, you can create additional block-level storage volumes (disks) that you can attach to a running Lightsail for Research virtual computer. You can use a disk as a primary storage device for data that requires frequent and granular updates. To create your own storage, choose Storage and Create disk.
You can also create Snapshots, point-in-time copies of your data. You can create a snapshot of your Lightsail for Research virtual computers and use it as a baseline to create new computers or for data backup. A snapshot contains all of the data that is needed to restore your computer from the moment when the snapshot was taken.
When you restore a computer by creating it from a snapshot, you can easily create a new one or upgrade your computer to a larger size using a snapshot backup. Create snapshots frequently to protect your data from corrupt applications or user errors.
You can use Cost control rules that you define to help manage the usage and cost of your Lightsail for Research virtual computers. You can create rules that stop running computers when average CPU utilization over a selected time period falls below a prescribed level.
For example, you can configure a rule that automatically stops a specific computer when its CPU utilization is equal to or less than 1 percent for a 30-minute period. Lightsail for Research will then automatically stop the computer so that you don’t incur charges for running computers.
In the Usage menu, you can view the cost estimate and usage hours for your resources during a specified time period.
Now Available
Amazon Lightsail for Research is now available in the US East (Ohio), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), and Europe (Stockholm) Regions.
Each year, the research team at Rapid7 analyzes thousands of vulnerabilities in order to identify their root causes, broaden understanding of attacker behavior, and provide actionable intelligence that guides security professionals at critical moments. Our annual Vulnerability Intelligence Report examines notable vulnerabilities and high-impact attacks from 2022 to highlight trends that drive significant risk for organizations of all sizes.
Today, we’re excited to release Rapid7’s 2022 Vulnerability Intelligence Report—a deep dive into 50 of the most notable vulnerabilities our research team investigated throughout the year. The report offers insight into critical vulnerabilities, widespread threats, prominent attack surface area, and changing exploitation trends.
2022 attack trends
The threat landscape today is radically different than it was even a few years ago. Over the past three years, we’ve seen zero-day exploits and widespread attacks chart a meteoric rise that’s strained security teams to their breaking point and beyond. While 2022 saw a modest decline in zero-day and widespread exploitation from 2021’s record highs, the multi-year trend of rising attack speed and scale remains strikingly consistent overall.
Report findings include:
Widespread exploitation of new vulnerabilities decreased 15% year over year in 2022, but mass exploitation events were still the norm. Our 2022 vulnerability intelligence dataset tracks 28 net-new widespread threats, many of which were used to deploy web shells, cryptocurrency miners, botnet malware, and/or ransomware on target systems.
Zero-day exploitation remained a significant challenge for security teams, with 43% of widespread threats arising from a zero-day exploit.
Attackers are still developing and deploying exploits faster than ever before. More than half of the vulnerabilities in our report dataset were exploited within seven days of public disclosure—a 12% increase from 2021 and an 87% increase over 2020.
Vulnerabilities mapped definitively to ransomware operations dropped 33% year over year—a troubling trend that speaks more to evolving attacker behavior and lower industry visibility than to any actual reprieve for security practitioners. This year’s report explores the growing complexity of the cybercrime ecosystem, the rise of initial access brokers, and industry-wide ransomware reporting trends.
How to manage risk from critical vulnerabilities
In today’s threat landscape, security teams are frequently forced into reactive positions, lowering security program efficacy and sustainability. Strong foundational security program components, including vulnerability and asset management processes, are essential to building resilience in a persistently elevated threat climate.
Have emergency patching procedures and incident response playbooks in place so that in the event of a widespread threat or breach, your team has a well-understood mechanism to drive immediate action.
Have a defined, regular patch cycle that includes prioritization of actively exploited CVEs, as well as network edge technologies like VPNs and firewalls. These network edge devices continue to be popular attack vectors and should adhere to a zero-day patch cycle wherever possible, meaning that updates and/or downtime should be scheduled as soon as new critical advisories are released.
Keep up with operating system-level and cumulative updates. Falling behind on these regular updates can make it difficult to install out-of-band security patches at critical moments.
Limit and monitor internet exposure of critical infrastructure and services, including domain controllers and management or administrative interfaces. The exploitation of many of the CVEs in this year’s report could be slowed down or prevented by taking management interfaces off the public internet.
2022 Vulnerability Intelligence Report
Read the report to see our full list of high-priority CVEs and learn more about attack trends from 2022.
Every young learner needs a successful start to their learning journey in the primary computing classroom. One aspect of this for teachers is to introduce programming to their learners in a structured way. As computing education is introduced in more schools, the need for research-informed strategies and approaches to support beginner programmers is growing. Over recent years, researchers have proposed various strategies to guide teachers and students, such as the block model, PRIMM, and, in the case of this month’s seminar, TIPP&SEE.
We need to give all learners a successful start in the primary computing classroom.
We are committed to making computing and creating with digital technologies accessible to all young people, including through our work with educators and researchers. In our current online research seminar series, we focus on computing education for primary-aged children (K–5, ages 5 to 11). In the series’ second seminar, we were delighted to welcome Dr Jean Salac, researcher in the Code & Cognition Lab at the University of Washington.
Dr Jean Salac
Jean’s work sits across computing education and human-computer interaction, with an emphasis on justice-focused computing for youth. She talked to the seminar attendees about her work on developing strategies to support primary school students learning to program in Scratch. Specifically, Jean described an approach called TIPP&SEE and how teachers can use it to guide their learners through programming activities.
What is TIPP&SEE?
TIPP&SEE is a metacognitive approach for programming in Scratch. The purpose of metacognitive strategies is to help students become more aware of their own learning processes.
The stages of the TIPP&SEE approach
TIPP&SEE scaffolds students as they learn from example Scratch projects: TIPP (Title, Instructions, Purpose, Play) is a scaffold to read and run a Scratch project, while SEE (Sprites, Events, Explore) is a scaffold to examine projects more deeply and begin to adapt them.
Using, modifying and creating
TIPP&SEE is inspired by the work of Irene Lee and colleagues who proposed a progressive three-stage approach called Use-Modify-Create. Following that approach, learners move from reading pre-existing programs (“not mine”) to adapting and creating their own programs (“mine”) and gradually increase ownership of their learning.
TIPP&SEE builds on the Use-Modify-Create progression.
Proponents of scaffolded approaches like Use-Modify-Create argue that engaging learners in cycles of using existing programs (e.g. worked examples) before they move to adapting and creating new programs encourages ownership and agency in learning. TIPP&SEE builds on this model by providing additional scaffolding measures to support learners.
Impact of TIPP&SEE
Jean presented some promising results from her research on the use of TIPP&SEE in classrooms. In one study, fourth-grade learners (age 9 to 10) were randomly assigned to one of two groups: (i) Use-Modify-Create only (the control group) or (ii) Use-Modify-Create with TIPP&SEE. Jean found that, compared to learners in the control group, learners in the TIPP&SEE group:
Were more thorough and completed more tasks
Wrote longer scripts during open-ended tasks
Used more learned blocks during open-ended tasks
Performed better than the control group in assessments
In another study, Jean compared how learners in the TIPP&SEE and control groups performed on several cognitive tests. She found that, in the TIPP&SEE group, students with learning difficulties performed as well as students without learning difficulties. In other words, in the TIPP&SEE group the performance gap was much narrower than in the control group. In our seminar, Jean argued that this indicates the TIPP&SEE scaffolding provides much-needed support to diverse groups of students.
Using TIPP&SEE in the classroom
TIPP&SEE is a multi-step strategy where learners start by looking at the surface elements of a program, and then move on to examining the underlying code. In the TIPP phase, learners first read the title and instructions of a Scratch project, identify its purpose, and then play the project to see what it does.
In the second phase, SEE, learners look inside the Scratch project, clicking on sprites to predict what each script does. They then make changes to the Scratch code and observe how the project’s output changes. By changing parameters, learners can see which part of the output changes as a result, and can reason about how each block functions. This practice is called deliberate tinkering because it encourages learners to observe changes while executing programs multiple times with different parameters.
Jean’s talk highlighted the need for computing to be inclusive and to give equitable access to all learners. The field of computing education is still in its infancy, though our understanding of how young people learn about computing is growing. We ourselves work to deepen our understanding of how young people learn through computing and digital making experiences.
In our own research, we have been investigating similar teaching approaches for programming, including the use of the PRIMM approach in the UK, so we were very interested to learn about different approaches and country contexts. We are grateful to Dr Jean Salac for sharing her work with researchers and teachers alike. Watch the recording of Jean’s seminar to hear more:
Free support for teaching programming and more to primary school learners
If you are looking for more free resources to help you structure your computing lessons:
Download our comprehensive classroom resources in The Computing Curriculum, which includes units of programming lessons for age 5 and upwards.
In the next seminar of our online series on primary computing, I will be presenting my research on integrated computing and literacy activities. Sign up now to join us for this session on Tues 7 March:
Industrial Control System (ICS) networking stacks are often the go-to bogeyman for infosec and cybersecurity professionals, and doubly so for offensive, red-team style security folks. How often have you been new on site, all ready to run a bog-standard nmap scan across the internal address space, only to be stopped by a frantic senior manager, “No, you can’t scan 192.168.69.0/24, that’s where the factory floor operates!”
“Why not?” you might ask—after all, isn’t it important to scan your IP-connected assets regularly to make sure they’re all accounted for and patched? Isn’t that kind of the one thing we tell literally anyone who asks, right after making sure your passwords are nice and long and random?
“Oh no,” this manager might plead, “if you scan them, they fall over, and it kills production. Minutes of downtime costs millions!”
Well, I’m happy to report that today, Rapid7’s Andreas Galauner has produced a technical deep dive whitepaper into the mysterious and opaque world of PLC protocols, and specifically, how you, intrepid IT explorer, can safely and securely scan around your CODESYS-based ICS footprint.
CODESYS is a protocol suite that runs a whole lot of industrial equipment. Sometimes it’s labeled clearly as such, and sometimes it’s not mentioned at all in the docs. While it is IP-based, it also uses some funky features of UDP multicast, which is one reason why scanning (or worse, fuzzing) these things blindly can cause a lot of trouble in the equipment that depends on it.
No spoilers, but if you’re the sort who always wondered why, exactly, flinging packets at the ICS network can lead to heartache and lost productivity, this is the paper for you. This goes double if you’re already a bit of a networking nerd.
If you’re not sure, here’s an easy test. Go and read this Errata Security blog about the infamous Hacker Jeopardy telnet question real quick. If you have any emotional response at all (hilarity, enlightenment, outrage, or a mix of all three), you’re definitely in the audience for this paper.
Best of all, this paper comes with some tooling: Andy has graciously open-sourced a Wireshark plugin for CODESYS analysis, and an Nmap NSE script for safer scanning. You can grab those, right now, at our GitHub repo. Cower in the dark about ICS networks no more!
Today marks an important day for Rapid7, for the state of Florida, and if we may be so bold, for the future of our industry. The announcement of a joint research lab between Rapid7 and the University of South Florida (USF) reaffirms our commitment to driving a deeper understanding of the challenges we face in protecting our shared digital space, while ushering in new talent to ensure that the cyber workforce of tomorrow is as diverse as the individuals who create the shared digital space we set out to protect.
With the Rapid7 Cybersecurity Foundation, we are proud to announce the opening of the Rapid7 Cyber Threat Intelligence Lab in Tampa, at USF. We intend for the lab to be an integral component in real-time threat tracking: it will leverage our extensive network of sensors and incorporate the resulting intelligence not only into our products for our customers, but also into actionable indicators made available to the wider community. This project also reaffirms our commitment to making cybersecurity more accessible to everyone through our support of research, disclosure, and open source, including projects such as Metasploit, Recog, and Velociraptor, to name a few.
We believe that providing USF faculty and students this breadth of intelligence will not only support their learning journey, but also provide a clearer path to determining areas of focus for their careers. We are hopeful that working side by side with Rapid7 analysts will help propel this journey and enhance the meaningful research developed by the university.
As part of our commitment to this investment—and consistent with the guiding principles of the Rapid7 Cybersecurity Foundation—we intend to promote diversity within the cybersecurity workforce. In particular, we plan to open doors to individuals from historically underrepresented groups. By ensuring that research projects are inclusive of people from all backgrounds, we are optimistic that this will not only introduce hands-on technical content to those who may not otherwise have such opportunities, but also, in the longer term, encourage greater diversity within the cybersecurity industry as a whole. We remain steadfast in our commitment to broadening opportunities within cybersecurity to all those with a passion for creating a more secure and prosperous digital future.
We are deeply thankful to USF for their shared vision, and look forward to a partnership that benefits all students and faculty while producing actionable intelligence that can support the entire internet and the broader industry. Ultimately, the threatscape is such that we recognize no one organization can stop attackers on its own. This partnership is part of our commitment to establishing relationships between private industry and partners, including academia.
Last week, multiple organizations issued warnings that a ransomware campaign dubbed “ESXiArgs” was targeting VMware ESXi servers by leveraging CVE-2021-21974—a nearly two-year-old heap overflow vulnerability. Two years. And yet, Rapid7 research has found that a significant number of ESXi servers likely remain vulnerable. We believe, with high confidence, that there are at least 18,581 vulnerable internet-facing ESXi servers at the time of this writing.
That 18,581 number is based on Project Sonar telemetry. We leveraged the TLS certificate Recog signature to determine that a particular server is a legitimate ESXi server. Then, after removing likely honeypots from the results, we checked the build IDs of the scanned servers against a list of vulnerable build IDs.
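As a rough sketch of that last step, the check amounts to extracting the build number from a server banner and testing membership in a known-vulnerable set. The build numbers below are placeholders, not the actual list used in the research:

```python
# Illustrative sketch only: flag an ESXi banner whose build number appears in
# a known-vulnerable set. The build numbers here are placeholder values.
import re

VULNERABLE_BUILDS = {"17325551", "17867351"}  # hypothetical example values

def is_vulnerable(banner: str) -> bool:
    # e.g. banner = "VMware ESXi 7.0.1 build-17325551"
    match = re.search(r"build-(\d+)", banner)
    return bool(match) and match.group(1) in VULNERABLE_BUILDS

print(is_vulnerable("VMware ESXi 7.0.1 build-17325551"))  # True
```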
Project Sonar is a Rapid7 research effort aimed at improving security through the active analysis of public networks. As part of the project, we conduct internet-wide surveys across more than 70 different services and protocols to gain insights into global exposure to common vulnerabilities.
We have also observed additional incidents targeting ESXi servers, unrelated to the ESXiArgs campaign, that may also leverage CVE-2021-21974. RansomExx2, a relatively new strain of ransomware written in Rust and targeting Linux, has been observed exploiting vulnerable ESXi servers. According to a recent IBM Security X-Force report, ransomware written in Rust has lower antivirus detection rates compared to ransomware written in more common languages.
CISA issues fix, sort of
The U.S. Cybersecurity and Infrastructure Security Agency (CISA) on Wednesday released a ransomware decryptor to help victims recover from ESXiArgs attacks. However, it’s important to note the script is not a cure-all and requires additional tools for a full recovery. Moreover, reporting suggests that the threat actor behind the campaign has modified their attack to mitigate the decryptor.
The script works by allowing users to unregister virtual machines that have been encrypted by the ransomware and re-register them with a new configuration file. However, you still need to have a backup of the encrypted parts of the VM to make a full restore.
The main benefit of the decryptor script is that it enables users to bring virtual machines back to a working state while data restore from backup occurs in the background. This is particularly useful for users of traditional backup tools without virtualization-based disaster recovery capabilities.
Rapid7 recommends
Deny access to servers. Unless a service absolutely needs to be on the internet, do not expose it to the internet. Some victims of these attacks had these servers exposed to the open internet, but could have gotten just as much business value out of them by restricting access to allowlisted IP addresses. If you are running an ESXi server, or any server, default to denying access to that server except from trusted IP space.
Patch vulnerable ESXi Servers. VMware issued a patch for CVE-2021-21974 nearly two years ago. If you have unpatched ESXi servers in your environment, click on that link and patch them now.
Develop and adhere to a patching strategy. Patching undoubtedly has challenges. However, this event illustrates perfectly why it’s essential to have a patching strategy in place and stick to it.
Back up virtual machines. Make sure you have a backup solution in place, even for virtual machines. As noted above, the decryptor script issued by CISA is only a partial fix. The only way to completely recover from attacks associated with CVE-2021-21974 is via operational backups. There are a wide variety of backup solutions available to protect virtual machines today.
By Christiaan Beek, with special thanks to Matt Green
DLL search order hijacking is a technique attackers use to elevate privileges, evade restrictions, and/or establish persistence on a compromised system. The Windows operating system uses a well-defined order to search for the dynamic link libraries (DLLs) it needs to load into a program. Attackers can hijack this search order to get their malicious payload executed.
DLL sideloading is similar to the technique described above; however, instead of manipulating the search order, attackers place their payload alongside the victim’s application or a trusted third-party application. Abusing trusted applications to load a payload can bypass restrictions and evade endpoint security detections, since the payload is loaded into a trusted process.
Attribution remains a topic of significant subjectivity, especially when attempting to connect an attack to a nation state. A common approach in determining the source has been to evaluate the techniques used by the perpetrator(s). DLL search order hijacking (T1574.001) or DLL sideloading (T1574.002) are common approaches used by nation state sponsored attackers.
PlugX
The PlugX malware family, which has been around for more than a decade, is famous for using both techniques to bypass endpoint security and inject itself into trusted third-party applications. PlugX is a remote access trojan with modular plugins. It is frequently updated with new functionality and plugins.
Example of PlugX builder
Example of modules in the code
In recent years, MITRE ATT&CK, CISA, and others have associated the PlugX family with various Chinese actors. Builders of the PlugX malware have been leaked to the public, so the family can also be used by any other actors with access to the builders.
In January 2023, we observed activity from a China-based group called Mustang Panda using PlugX in one of their campaigns. In this particular case, they used a virtual hard disk (VHD) file to hide the malicious files from antivirus detection. The VHD, which mounted automatically when opened, contained a single archive file (RAR) that extracted the typical three files associated with PlugX:
Trusted binary (executable .exe)
Hijacked driver (DLL file)
Encrypted payload file (often a DAT file)
The trusted binaries ranged from compromised AV vendor files to operating system files to third-party vendor files. These files are signed and are therefore trusted by endpoint technology most of the time.
This approach is known as a Mark-of-the-Web (MOTW) bypass (T1553.005). In short, container files that are downloaded from the internet are marked with the MOTW, but the files within them do not inherit the mark once the container files are extracted and/or mounted. Because files carrying the MOTW are treated as untrusted and subjected to additional security checks, stripping the mark this way allows the extracted payload to execute without that scrutiny.
While we observed Mustang Panda using a VHD file to hide malicious files, it is worth noting that ISO files may also be used, as they are also automatically mounted.
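To make the MOTW mechanics concrete, here is a minimal sketch (Windows-only; the file path is hypothetical) that reads a file’s Zone.Identifier alternate data stream, where the mark is stored. A file downloaded directly from the internet typically carries ZoneId=3, while a file extracted from a mounted VHD or ISO often has no stream at all:

```python
# Minimal sketch: read a file's Mark-of-the-Web from its Zone.Identifier
# alternate data stream (Windows-only; the path below is hypothetical).
def read_motw(path: str):
    try:
        with open(path + ":Zone.Identifier") as ads:
            return ads.read()  # e.g. "[ZoneTransfer]\nZoneId=3\n..."
    except OSError:
        return None  # no MOTW, as with files extracted from a mounted VHD/ISO

print(read_motw(r"C:\Users\analyst\Downloads\invoice.vhd"))
```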
Hunting with Velociraptor
Since PlugX injects itself into a trusted process by abusing a trusted executable, this threat is often only detected when the outgoing command-and-control (C2) traffic is discovered (usually by accident, or because someone flagged the IP address as malicious). One classic mistake I’ve observed over the years: when companies see in their AV logs that malware has been removed, they often don’t look further into what type of malware it is, what its capabilities are, and whether it is nation-state related or cybercrime related. Yet the appropriate incident response handling differs for each.
Many nation-state actors want long-term persistence in a network and establish multiple ways of staying inside, even if a few of their open doors are closed (think of added valid accounts, webshells, other backdoors, and so on). A dead C2 server can still indicate this, as the actor may have used it only as a first entry to the network.
For example, we recently observed what appeared to be an incident where some suspicious password dumping tools were discovered. Although the security team removed the tools, they seemed to come back into the network.
After meeting with the team and reviewing some of the incident logs, it was time to grab one of my favorite (and free) tools: Velociraptor. Velociraptor is Rapid7’s advanced open-source endpoint monitoring, digital forensics, and incident response platform. It enables users to respond effectively to a wide range of digital forensic and cyber incident response investigations and data breaches.
Velociraptor offers a ton of forensic options and hunting possibilities, so the first step was to acquire live collections of data to investigate.
Investigating the initial memory dumps turned up remnants of a process talking to an outside IP address. The process itself was using a DLL that was not located in a standard location on disk. After retrieving the folder from the victim’s machine and reversing the binary, it became clear: we had found PlugX.
There are several ways Velociraptor can be used to hunt for DLL search order hijacking or sideloading. In this particular case, we’ll discuss the approach for PlugX malware.
We could hunt for:
Process / Mutex
Lnk Files
Disk
Memory
Network traffic / C2 URL/IP-address
Using the YARA toolset, we created rules for malicious or suspicious binaries and/or memory patterns. Velociraptor can use these rules to scan bulk data, process memory, or raw memory via its ‘yara()’ or ‘proc_yara()’ options.
Based on recent PlugX samples (from late 2022 and early 2023), we created the following rule, which can be downloaded from my GitHub page:
Using this rule, which is based on code patterns from the DLL component used in PlugX, Velociraptor will hunt for these DLL files and detect them. Once a file is detected, you can look at the systems impacted, capture memory and process dumps, and investigate the system for suspicious activity. The directory where the DLL is stored will most likely also contain the payload and the trusted binary, all written to disk at the same time.
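The rule itself lives on the GitHub page mentioned above rather than being reproduced here. Purely as an illustration of the shape of such a hunt, the skeleton below compiles a placeholder rule with the yara-python package and scans a single file; the string patterns are invented and are not real PlugX signatures:

```python
# Illustrative skeleton only: compile and run a YARA rule with yara-python.
# The patterns below are placeholders, not actual PlugX signatures.
import yara

RULE_SOURCE = r'''
rule sideloaded_dll_skeleton
{
    meta:
        description = "Placeholder rule illustrating the structure of a DLL hunt"
    strings:
        $export = "DllGetClassObject" ascii   // example export name
        $code   = { 6A 40 68 00 30 00 00 }    // example byte pattern
    condition:
        uint16(0) == 0x5A4D and all of them   // MZ header plus both patterns
}
'''

rules = yara.compile(source=RULE_SOURCE)
print(rules.match(r"C:\Suspect\version.dll"))  # hypothetical file path
```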
Recently, my colleague Matt Green released a repository on GitHub called DetectRaptor to share publicly available Velociraptor detection content. It provides you with easy-to-consume detection content to hunt for suspicious activity. One of the libraries Matt is importing is from https://hijacklibs.net/, a list of files and locations that are indicators of DLL hijacking (including PlugX). If you look at the non-Microsoft entries in ‘hijacklibs.csv’, several instances are related to PlugX incidents reported by multiple vendors.
After importing the content, Velociraptor can start hunting and detecting possible signs of DLL hijacking and, for example, PlugX.
Author: Thomas Elkins. Contributors: Andrew Iwamaye, Matt Green, James Dunne, and Hernan Diaz.
Rapid7 routinely conducts research into the wide range of techniques that threat actors use to conduct malicious activity. One objective of this research is to discover new techniques being used in the wild, so we can develop new detection and response capabilities.
Recently, we (Rapid7) observed malicious actors using OneNote files to deliver malicious code. We identified a specific technique in which a OneNote file contains a batch script that, upon execution, starts an instance of a renamed PowerShell process to decrypt and execute a base64-encoded binary. That binary subsequently decrypted a final payload, which we identified as either Redline Infostealer or AsyncRAT.
This blog post walks through analysis of a OneNote file that delivered a Redline Infostealer payload.
Analysis of OneNote File
The attack vector began when a user was sent a OneNote file via a phishing email. Once the OneNote file was opened, the user was presented with the option to “Double Click to View File” as seen in Figure 1.
Figure 1 – OneNote file "Remittance" displaying the button “Double Click to View File”
We determined that the button “Double Click to View File” was moveable. Hidden underneath the button, we observed five shortcuts to a batch script, nudm1.bat. The hidden placement of the shortcuts ensured that the user double-clicked on one of the shortcuts when interacting with the “Double Click to View File” button.
Figure 2 – Copy of Batch script nudm1.bat revealed after moving “Double Click to View File” button
Once the user double clicked the button “Double Click to View File”, the batch script nudm1.bat executed in the background without the user’s knowledge.
Analysis of Batch Script
In a controlled environment, we analyzed the batch script nudm1.bat and observed a long series of declared variables, each storing a fragment of an obfuscated command.
Figure 3 – Beginning contents of nudm1.bat
Near the middle of the script, we observed a large section of base64-encoded data, suggesting that at some point the data would be decoded by the batch script.
Figure 4 – Base64 encoded data contained within nudm1.bat
At the bottom of the batch script, we observed the declared variables being concatenated. To easily determine what the script was doing, we placed echo commands in front of the concatenations, so that on execution the batch script printed its deobfuscated commands for us instead of running them.
Figure 5 – echo command placed in front of concatenated variables
We executed the batch file and piped the deobfuscated result to a text file. The text file contained a PowerShell script that was executed with a renamed PowerShell binary, nudm1.bat.exe.
Figure 6 – Output after using echo reveals PowerShell script
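The same unravelling can also be done statically. As a rough sketch (it handles only plain set assignments and %var% expansion, not delayed expansion or substring tricks), one can collect the variable assignments and expand the concatenation lines without ever running the script:

```python
# Rough static deobfuscator for simple batch-variable concatenation.
# Handles only `set name=value` / `set "name=value"` and %name% expansion.
import re

def deobfuscate_batch(text: str) -> list[str]:
    env = {}
    for m in re.finditer(r'(?im)^\s*set\s+"?([^=\s"]+)="?([^"\r\n]*)', text):
        env[m.group(1).lower()] = m.group(2)

    def expand(line: str) -> str:
        return re.sub(r"%([^%\s]+)%",
                      lambda m: env.get(m.group(1).lower(), m.group(0)),
                      line)

    return [expand(line) for line in text.splitlines()
            if "%" in line and not line.lstrip().lower().startswith("set")]

sample = 'set "a=Power"\nset "b=Shell"\n%a%%b% -nop -w hidden payload.ps1\n'
print(deobfuscate_batch(sample))  # ['PowerShell -nop -w hidden payload.ps1']
```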
We determined the script performed the following:
Base64 decoded the data stored after :: within nudm1.bat, shown in Figure 4
AES-decrypted the base64-decoded data using the base64-encoded key 4O2hMB9pMchU0WZqwOxI/4wg3/QsmYElktiAnwD4Lqw= and IV TFfxPAVmUJXw1j++dcSfsQ==
Decompressed the decrypted contents using gunzip
Reflectively loaded the decrypted and decompressed contents into memory
Using CyberChef, we replicated the identified decryption method to obtain a decrypted executable file.
Figure 7 – AES decryption via Cyberchef reveals MZ header
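The same chain can be reproduced outside CyberChef. A minimal sketch, assuming AES-CBC with PKCS#7 padding (consistent with the key/IV usage above) and the pycryptodome package:

```python
# Sketch of the decryption chain: base64 -> AES (CBC assumed) -> gunzip.
# The key and IV are the values recovered from the batch script above.
import base64, gzip
from Crypto.Cipher import AES

KEY = base64.b64decode("4O2hMB9pMchU0WZqwOxI/4wg3/QsmYElktiAnwD4Lqw=")
IV  = base64.b64decode("TFfxPAVmUJXw1j++dcSfsQ==")

def decode_payload(b64_blob: str) -> bytes:
    ciphertext = base64.b64decode(b64_blob)
    plaintext = AES.new(KEY, AES.MODE_CBC, iv=IV).decrypt(ciphertext)
    plaintext = plaintext[: -plaintext[-1]]  # strip PKCS#7 padding
    return gzip.decompress(plaintext)        # yields the embedded executable
```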
We determined the decrypted file was a 32-bit .NET executable and analyzed the executable using dnSpy.
Analysis of .NET 32-bit Executable
In dnSpy we observed the original file name was tmpFBF7. We also observed that the file contained a resource named payload.exe.
Figure 8 – dnSpy reveals name of original program tmpFBF7 and a payload.exe resource
We navigated to the entry point of the file and observed base64-encoded strings being passed through a function named SRwvjAcHapOsRJfNBFxi, which used AES decryption to decrypt the data passed to it as an argument.
Figure 9 – AES Decrypt Function SRwvjAcHapOsRJfNBFxi
As seen in Figure 9, the function SRwvjAcHapOsRJfNBFxi took in 3 arguments: input, key and iv.
We replicated the decryption process from the function SRwvjAcHapOsRJfNBFxi using CyberChef to decrypt the values of the base64-encoded strings. Figure 10 shows an example of this decryption process for the base64-encoded string vYhBhJfROLULmQk1P9jbiqyIcg6RWlONx2FLYpdRzZA= from line 30 of Figure 7, revealing the decoded and decrypted string CheckRemoteDebuggerPresent.
Figure 10 – Using Cyberchef to replicate decryption of function SRwvjAcHapOsRJfNBFxi
Repeating the decryption of the other base64 encoded strings revealed some anti-analysis and anti-AV checks performed by the executable:
EtwEventWrite
/c choice /c y /n /d y /t 1 & attrib -h -s
After passing the anti-analysis and anti-AV checks, the executable called upon the payload.exe resource in line 94 of the code. We determined that the payload.exe resource was saved into the variable @string.
Figure 11 – @string storing payload.exe
On line 113, the variable @string was passed into a new function, aBTlNnlczOuWxksGYYqb, as well as the AES decryption function SRwvjAcHapOsRJfNBFxi.
Figure 12 – @string being passed through function hDMeRrMMQVtybxerYkHW
The function aBTlNnlczOuWxksGYYqb decompressed content passed to it using Gunzip.
Figure 13 – Function aBTlNnlczOuWxksGYYqb decompresses content using Gzip
Using CyberChef, we decrypted and decompressed the payload.exe resource to obtain another 32-bit .NET executable, which we named payload2.bin. Using Yara, we scanned payload2.bin and determined it was related to the Redline Infostealer malware family.
Figure 14 – Yara Signature identifying payload2.bin as Redline Infostealer
We also analyzed payload2.bin in dnSpy.
Analysis of Redline Infostealer
We observed that the original file name of payload2.bin was Footstools, and that a class labeled Arguments contained the variables IP and Key. The variable IP stored a base64-encoded value GTwMCik+IV89NmBYISBRLSU7PlMZEiYJKwVVUg==.
Figure 15 – Global variable IP set as Base64 encoded string
The variable Key stored a UTF8 value of Those.
Figure 16 – Global variable Key set with value Those
We identified that the variable IP was used in a function, WriteLine(), which passed the variables IP and Key into a String.Decrypt function as arguments.
Figure 17 – String.Decrypt being passed arguments IP and Key
The function String.Decrypt was a simple function that XOR’ed input data with the value of Key.
Using CyberChef, we replicated the String.Decrypt function for the IP variable by XORing the base64 value shown in Figure 15 with the value of Key shown in Figure 16, obtaining the decrypted value of the IP variable: 172.245.45[.]213:3235.
Figure 19 – Using XOR in Cyberchef to reveal value of argument IP
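This step is easy to reproduce. Working through the values above, the XOR output is itself base64-encoded, so one further decode recovers the plaintext; a small sketch:

```python
# Replicate String.Decrypt: base64-decode, XOR with the repeating key,
# then base64-decode once more (the XOR output is itself base64).
import base64

def string_decrypt(encoded: str, key: str) -> str:
    data = base64.b64decode(encoded)
    k = key.encode()
    xored = bytes(b ^ k[i % len(k)] for i, b in enumerate(data))
    return base64.b64decode(xored).decode()

print(string_decrypt("GTwMCik+IV89NmBYISBRLSU7PlMZEiYJKwVVUg==", "Those"))
# prints the C2 address (defanged in the text above as 172.245.45[.]213:3235)
```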
Redline Infostealer has the capability to steal credentials related to cryptocurrency wallets, Discord data, and web browser data, including cached cookies. Figure 20 shows functionality in Redline Infostealer that searches for known cryptocurrency wallets.
Figure 20 – Redline Infostealer parsing for known Cryptocurrency wallet locations
Rapid7 Protection
Rapid7 has existing rules that detect the behavior observed within customers’ environments using our Insight Agent, including:
Suspicious Process – Renamed PowerShell
OneNote Embedded File Parser
Rapid7 has also developed a OneNote file parser and detection artifact for Velociraptor. This artifact can be used to detect or extract malicious payloads like the one discussed in this post. https://docs.velociraptor.app/exchange/artifacts/pages/onenote/
Block .one attachments at the network perimeter or with an antiphishing solution if .one files are not business-critical
User awareness training
If possible, implement signatures to search for PowerShell scripts containing reversed strings such as gnirtS46esaBmorF (see the sketch after this list)
Watch out for OneNote as the parent process of cmd.exe executing a .bat file
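As a rough sketch of the reversed-string hunt from the list above (the scan root and marker list are illustrative, not a production detection):

```python
# Flag PowerShell scripts containing reversed markers such as
# "gnirtS46esaBmorF" ("FromBase64String" reversed). Illustrative only.
from pathlib import Path

MARKERS = ["FromBase64String", "DownloadString"]  # extend as needed

def suspicious_scripts(root: str):
    for script in Path(root).rglob("*.ps1"):
        text = script.read_text(errors="ignore")
        for marker in MARKERS:
            if marker[::-1] in text:
                yield script, marker[::-1]

for path, marker in suspicious_scripts(r"C:\Users"):
    print(path, "contains", marker)
```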
In our first seminar of 2023, we were delighted to welcome Dr Katie Rich and Carla Strickland. They spoke to us about teaching the programming construct of variables in Grade 3 and 4 (age 8 to 10).
Dr Katie RichCarla Strickland
We are hearing from a diverse range of speakers in our current series of monthly online research seminars focused on primary (K-5) computing education. Many of them work closely with educators to translate research findings into classroom practice to make sure that all our younger learners have positive first experiences of learning computing. An important goal of their research is to impact the development of pedagogy, resources, and professional development to support educators to deliver computing concepts with confidence.
Variables in computing and mathematics
Dr Katie Rich (American Institutes for Research) and Carla Strickland (UChicago STEM Education) are both part of a team that worked on a research project called Everyday Computing, which aims to integrate computational thinking into primary mathematics lessons. A key part of the Everyday Computing project was to develop coherent learning resources across a number of school years. During the seminar, Katie and Carla presented on a study in the project that revolved around teaching variables in Grade 3 and 4 (age 8 to 10) by linking this computing concept to mathematical concepts such as area, perimeter, and fractions.
Variables are used in both mathematics and computing, but in significantly different ways. In mathematics, a variable, often represented by a single letter such as x or y, corresponds to a quantity that stays the same for a given problem. However, in computing, a variable is an identifier used to label data that may change as a computer program is executed. A variable is one of the programming constructs that can be used to generalise programs to make them work for a range of inputs. Katie highlighted that the research team was keen to explore the synergies and tensions that arise when curriculum subjects share terms, as is the case for ‘variable’.
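A few lines of Python make the contrast concrete: in the mathematical sense, x in ‘x + 2 = 5’ names one fixed unknown, whereas a program variable labels a value that may be replaced as the program runs:

```python
# A program variable labels data that can change during execution:
score = 0
for points in [10, 25, 5]:
    score = score + points  # the value labelled "score" is replaced each time
print(score)  # 40
```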
Defining a learning trajectory
At the start of the project, in order to be able to develop coherent learning resources across school years, the team reviewed research papers related to teaching the programming construct of variables. In the papers, they found a variety of learning goals that related to facts (what learners need to know) and skills (what learners need to be able to do). They grouped these learning goals and arranged the groups into ‘levels of thinking’, which were then mapped onto a learning trajectory to show progression pathways for learning.
Four of the five levels of thinking identified in the study: Data Storer, Data User, Variable User, Variable Creator.
Learning materials about variables
Carla then shared three practical examples of learning resources their research team created that integrated the programming construct of variables into a maths curriculum. The three activities, described below, form part of a series of lessons called Action Fractions. You can read more about the series of lessons in this research paper.
Robot Boxes is an unplugged activity that is positioned at the Data User level of thinking. It relates to creating instructions for a fictional robot. Learners have to pay attention to the different data the robot needs in order to draw a box, such as the length and width, and also to the value that the robot calculates as the area of the box. The lesson uses boxes on paper as concrete representations of variables to which learners can physically add values.
Ambling Animals is set at the ‘Data Storer’ and ‘Variable Interpreter’ levels of thinking. It includes a Scratch project to help students locate and compare fractions on number lines. During this lesson, learners find a variable that holds the value of the animal representing the larger of two fractions.
Adding Fractions draws on facts and skills from the ‘Variable Interpreter’ and ‘Variable Implementer’ levels of thinking and also includes a Scratch project. The Scratch project visualises adding fractions with the same denominator on a number line. The lesson starts to explain why variables are so important in computer programs by demonstrating how using a variable can make code more efficient.
Takeaways: Cross-curricular teaching, collaborative research
Teaching about the programming construct of variables can be challenging, as it requires young learners to understand abstract ideas. The research Katie and Carla presented shows how integrating these concepts into a mathematics curriculum is one way to highlight tangible uses of variables in everyday problems. The levels of thinking in the learning trajectory provide a structure that helps teachers support learners in developing their understanding and skills; the same levels of thinking could be used to introduce variables in other contexts and curricula.
Many primary teachers use cross-curricular learning to increase children’s engagement and highlight real-world examples. The seminar showed how important it is for teachers to pay attention to terms used across subjects, such as the word ‘variable’, and to explicitly explain a term’s different meanings. Katie and Carla shared a practical example of this when they suggested that computing teachers need to do more to stress the difference between equations such as xy = 45 in maths and assignment statements such as length = 45 in computing.
The Everyday Computing project resources were created by a team of researchers and educators who worked together to translate research findings into curriculum materials. This type of collaboration can be really valuable in driving a research agenda to directly improve learning outcomes for young people in classrooms.
How can this research influence your classroom practice or other activities as an educator? Let us know your thoughts in the comments. We’ll be continuing to reflect on this question throughout the seminar series.
You can watch Katie’s and Carla’s full presentation here:
Join our seminar series on primary computing education
We continue on Tuesday 7 February at 17.00 UK time, when we will hear from Dr Jean Salac, University of Washington. Jean will present her work in identifying inequities in elementary computing instruction and in developing a learning strategy, TIPP&SEE, to address these inequities. Sign up now, and we will send you a joining link for the session.
Recog Release v3.0.3, which is available now, includes updated fingerprints for Zoho ManageEngine PAM360, Password Manager Pro, and Access Manager Plus; Atlassian Bitbucket Server; and Supervisord Supervisor. It also includes new fingerprints and a number of bug fixes, all of which are detailed below.
Recog is an open source recognition framework used to identify products, operating systems, and hardware through matching network probe data against its extensive fingerprint collection. Support for Recog is part of Rapid7’s ongoing commitment to open source initiatives.
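As a rough illustration of the idea of fingerprint-style matching (this is not Recog’s actual fingerprint format; the pattern and fields below are invented), a banner can be mapped to product metadata via a regular expression:

```python
# Rough illustration of fingerprint matching (not Recog's real XML format):
# map a service banner to product fields via named regex groups.
import re

FINGERPRINTS = [
    (re.compile(r"^Supervisor/(?P<version>[\d.]+)$"),   # hypothetical pattern
     {"vendor": "Supervisord", "product": "Supervisor"}),
]

def identify(banner: str):
    for pattern, fields in FINGERPRINTS:
        match = pattern.match(banner)
        if match:
            return {**fields, **match.groupdict()}
    return None

print(identify("Supervisor/4.2.5"))
# -> {'vendor': 'Supervisord', 'product': 'Supervisor', 'version': '4.2.5'}
```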
Zoho ManageEngine PAM360, Password Manager Pro, and Access Manager Plus
Fingerprints for these three Zoho ManageEngine products were added shortly after the Cybersecurity and Infrastructure Security Agency (CISA) added CVE-2022-35405 to their Known Exploited Vulnerabilities (KEV) catalog on September 22nd, 2022. Favicon, HTML title, and HTTP server fingerprints were created for both PAM360 and Password Manager Pro, and favicon and HTML title fingerprints were created for Access Manager Plus. PAM360 version 5500 (and older) and Password Manager Pro version 12100 (and older) are both vulnerable to unauthenticated remote code execution (RCE), and Access Manager Plus version 4302 (and older) is vulnerable to authenticated RCE. In addition, Grant Willcox contributed the Metasploit Zoho Password Manager Pro XML-RPC Java Deserialization exploit module, which is capable of exploiting the unauthenticated vulnerability via the XML-RPC interface in Password Manager Pro and PAM360 and attaining RCE as the NT AUTHORITY\SYSTEM user.
More recently, on January 4th, 2023, Zoho released details of a SQL injection vulnerability (CVE-2022-47523) in PAM360 version 5800 (and older), Password Manager Pro version 12200 (and older), and Access Manager Plus version 4308 (and older). From a quick analysis of internet scan data, there appear to be only about 76 Password Manager Pro and 21 PAM360 instances on the internet.
Atlassian Bitbucket Server
Favicon, HTML title, and HTTP cookie fingerprints for Atlassian Bitbucket Server were added shortly after our Emergent Threat Response for CVE-2022-36804 was published on September 20th, 2022, in response to the command injection vulnerability in multiple API endpoints of both Bitbucket Server and Data Center. An adversary with access to either a public repository or read permissions on a private repository can perform remote code execution simply through a malicious HTTP request. Shelby Pace contributed the Metasploit Bitbucket Git Command Injection exploit module, which is capable of exploiting the unauthenticated command injection. Bitbucket Server and Data Center versions 7.6 prior to 7.6.17, 7.17 prior to 7.17.10, 7.21 prior to 7.21.4, 8.0 prior to 8.0.3, 8.1 prior to 8.1.3, 8.2 prior to 8.2.2, and 8.3 prior to 8.3.1 are vulnerable. From a quick analysis of internet scan data, there appear to be just under a thousand of these exposed on the internet.
Supervisord Supervisor
Favicon and HTML title fingerprints were added for anyone interested in locating unsupervised Supervisor instances on their networks. The web interface for the process control system allows users to restart or stop processes under the software’s control, and even tail the standard output and error streams. There might be some interesting information in those streams! From a quick analysis of internet scan data, there appear to be only about 165 instances on the internet.
Welcome to 2023. I hope that you had a fantastic 2022 and that you’re looking forward to an even better year ahead. To help get the year off to a great start, I thought it might be fun to share a few of the things that we’ve got planned for 2023.
Whether you’re a teacher, a mentor, or a young person, if it’s computer science, coding, or digital skills that you’re looking for, we’ve got you covered.
Your code in space
Through Astro Pi, our collaboration with the European Space Agency, young people can write computer programs that are guaranteed to run on the Raspberry Pi computers on the International Space Station (terms and conditions apply).
The Raspberry Pi computers on board the ISS (Image: ESA/NASA)
Astro Pi Mission Zero is open to participants until 17 March 2023 and is a perfect introduction to programming in Python for beginners. It takes about an hour to complete and we provide step-by-step guides for teachers, mentors, and young people.
Make a cool project and share it with the world
Kids all over the world are already working on their entries to Coolest Projects Global 2023, our international online showcase that will see thousands of young people share their brilliant tech creations with the world. Registration opens on 6 February and it’s super simple to get involved. If you’re looking for inspiration, why not explore the judges’ favourite projects from 2022?
While we all love the Coolest Projects online showcase, I’m also looking forward to attending more in-person Coolest Projects events in 2023. The word on the street is that members of the Raspberry Pi team have been spotted scouting venues in Ireland… Watch this space.
Experience AI
I am sure I wasn’t alone in disappearing down a ChatGPT rabbit hole at the end of last year after OpenAI made their latest AI chatbot available for free. The internet exploded with both incredible examples of what the chatbot can do and furious debates about the limitations and ethics of AI systems.
With the rapid advances being made in AI technology, it’s increasingly important that young people are able to understand how AI is affecting their lives now and the role that it can play in their future. This year we’ll be building on our research into the future of AI and data science education and launching Experience AI in partnership with leading AI company DeepMind. The first wave of resources and learning experiences will be available in March.
The big Code Club and CoderDojo meetup
With pandemic restrictions now almost completely unwound, we’ve seen a huge resurgence in Code Clubs and CoderDojos meeting all over the world. To build on this momentum, we are delighted to be welcoming Code Club and CoderDojo mentors and educators to a big Clubs Conference at Churchill College in Cambridge on 24 and 25 March.
This will be the first time we’re holding a community get-together since 2019 and a great opportunity to share learning and make new connections.
Building partnerships in India, Kenya, and South Africa
As part of our global mission to ensure that every young person is able to learn how to create with digital technologies, we have been focused on building partnerships in India, Kenya, and South Africa, and that work will be expanding in 2023.
In India we will significantly scale up our work with established partners Mo School and Pratham Education Foundation, training 2000 more teachers in government schools in Odisha, and running 2200 Code Clubs across four states. We will also be launching new partnerships with community-based organisations in Kenya and South Africa, helping them set up networks of Code Clubs and co-designing learning experiences that help them bring computing education to their communities of young people.
Exploring computing education for 5- to 11-year-olds
Over the past few years, our research seminar series has covered computing education topics from diversity and inclusion, to AI and data science. This year, we’re focusing on current questions and research in primary computing education for 5- to 11-year-olds.
As ever, we’re providing a platform for some of the world’s leading researchers to share their insights, and convening a community of educators, researchers, and policy makers to engage in the discussion. The first seminar takes place today (Tuesday 10 January) and it’s not too late to sign up.
And much, much more…
That’s just a few of the super cool things that we’ve got planned for 2023. I haven’t even mentioned the new online projects we’re developing with our friends at Unity, the fun we’ve got planned with our very own online text editor, or what’s next for our curriculum and professional development offer for computing teachers.
Welcome to 2023, a year that sounds so futuristic it is hard to believe it is real. But real it is, and make no mistake, threat actors are still out there, working hard to get into networks the world over. So, at the start of the new year, I am reminded of two particular phrases: “Those who do not learn from their past are doomed to repeat it” and “History doesn’t repeat itself, but it rhymes.”
With those cautionary words in mind, let’s take a brief look back at a smattering of the research we conducted in 2022. Hopefully, you will find some overlooked lessons from the past to keep your organizations safe here <cue excessive reverb> in the future.
Trends Gonna Trend
Some of Rapid7’s most important research is focused on the current state of cybersecurity and the threat landscape. This research is designed to glean critical insights into threat actor tactics and how the security industry works to combat them. Below are four reports based on this type of research.
One of our most pored-over reports, the Vulnerability Intelligence Report looks at threats that emerged in the previous year. This year, we identified many worrying (and some downright critical) trends in the vulnerability management space. For example, we found that widespread threats were up 130% from 2020 and roughly half of them were zero-day exploits. Additionally, the time to known exploitation of vulnerabilities shrunk to under a week. It’s a sobering report, without a doubt.
Securing cloud instances has become a major part of keeping organizations safe from risk. That’s not exactly an earth-shattering statement. We all know how important cloud security is. However, our Cloud Misconfiguration Report found that even some of the world’s largest (and resource-rich) organizations neglect to put some basic, common-sense protections in place. So, clearly, there is still work to be done.
Ransomware has been on the rise for several years and continues to evolve as quickly as cybersecurity professionals find ways to combat it. One way it has evolved over the last few years is with the rise in double extortion. In this type of attack, threat actors exfiltrate an organization’s data before encrypting it. Then, they threaten to leak or sell that data unless a ransom is paid.
In this first-of-its-kind report, we looked at data disclosures associated with double-extortion campaigns and extracted some interesting trends, including the industries most affected by these attacks, who is conducting them, and when they occur.
Passwords: we’ve all got them, but that doesn’t mean we are great at using them to their full potential. We cross-referenced well-known password repositories with credentials captured by our own SSH and RDP honeypots to determine how well organizations use secure credentials. The results were grim. This report details the results of our research and offers some tips on how to improve passwords (password managers to the front!).
All Cybersecurity is Local
Global trends are important, but keeping it local can help us understand the intricacies of security in our own neck of the woods. In this report, we took a deep dive into one geographical region to provide critical insights that improve security.
We took a look at the ASX200, the stock market index of companies listed on the Australian Securities Exchange. We found that, though there is always room for improvement, by and large the ASX200 companies are on an equal security footing with some of their larger counterparts around the globe. And to further dispel any lingering FUD, they’ve measurably improved since the last time we looked at this sector in 2021, so good on ya, Aussie infosec pros!
The Future of Cybersecurity
At Rapid7, we don’t just look at the current state of the cybersecurity industry, we actively strive to improve it. Often, that means deep research into the future of cybersecurity tools and practices. This year was no exception. We’re quite proud of these reports and the potential they have to make us all a little bit safer.
This may sound like the title to a formal academic paper (and vaguely British) and that’s because it is. Rapid7 was honored to have a research paper on machine learning techniques to improve false positives in DAST solutions accepted by a journal published by the Association for Computing Machinery.
Our researchers created a machine learning technique that can reduce false positives in DAST solutions by 96%, allowing security professionals more time to focus on triaging actual threats and remediating them, rather than heading off on wild goose chases caused by false positives.
Companies large and small struggle with securing their IoT infrastructure from attackers. So, when the opportunity came to observe (and dissect) one company that seems to be doing it right, we jumped at the chance. In this report we partnered with Domino’s Pizza to look at their IoT operation and they were gracious enough to allow our expert pentesters and code auditors to tear it apart and see how they do it. It’s an excellent read for anyone looking to see some of the best practices in IoT security in use today.
2023 Here We Come
These are just a few of the great research papers we released last year. It has long been our mission to not only provide the best security platform and services available, but to help the entire cybersecurity community close the security achievement gap.
Our research departments take that mission very seriously and you can bet that we will be entering 2023 with a big ol’ list of research papers looking at the latest in cybersecurity innovation and best practices. We are grateful that you joined us on our journey last year and hope that you’ll be along for the ride again this year.
Improving gender balance in computing is part of our work to ensure equitable learning opportunities for all young people. Our Gender Balance in Computing (GBIC) research programme has been the largest effort to date to explore ways to encourage more girls and young women to engage with Computing. The programme’s studies spanned themes including:
Storytelling: Connecting computing to storytelling with 6- to 7-year-olds
Belonging: Supporting learners to feel that they “belong” in computer science
Non-formal Learning: Establishing the connections between in-school and out-of-school computing
Relevance: Making computing relatable to everyday life
Subject Choice: How computer science is presented to young people as a subject choice
In December we published the last of seven reports describing the results of the programme. In this blog post I summarise our overall findings and reflect on what we’ve learned through doing this research.
Gender balance in computing is not a new problem
I was fascinated to read a paper by Deborah Butler from 2000 which starts by summarising themes from research into gender balance in computing from the 1980s and 1990s, for example that boys may have access to more role models in computing and may receive more encouragement to pursue the subject, and that software may be developed with a bias towards interests traditionally considered to be male. Butler’s paper summarises research from at least two decades ago — have we really made progress?
In England, it’s true that making Computing a mandatory subject from age 5 means we have taken great strides forward; the need for young people to make a choice about studying the subject only arises at age 14. However, statistics for England’s externally assessed high-stakes Computer Science courses taken at ages 14–16 (GCSE) and 16–18 (A level) clearly show that, although there is a small upwards trend in the proportion of female students, particularly for A level, gender balance among the students achieving GCSE/A level qualifications remains an issue:
Computer Science qualification (England) | 2018 | 2021 | 2022
GCSE (age 16) | 20.41% | 20.77% | 21.37%
A level (age 18) | 11.74% | 14.71% | 15.17%

Percentage of girls among the students achieving Computer Science qualifications in England’s secondary schools
What did we do in the Gender Balance in Computing programme?
In GBIC, we carried out a range of research studies involving more than 14,500 pupils and 725 teachers in England. Implementation teams came from the Foundation, Apps For Good, the WISE Campaign, and the Behavioural Insights Team (BIT). A separate team at BIT acted as the independent evaluators of all the studies.
In total we conducted the following studies:
Two feasibility studies: Storytelling; Relevance, which led to a full randomised controlled trial (RCT)
Five RCTs: Belonging; Peer Instruction; Pair Programming; Relevance, which was preceded by a feasibility study; Non-formal Learning (primary)
One quasi-experimental study: Non-formal Learning (secondary)
One exploratory research study: Subject Choice (Subject choice evenings and option booklets)
Each study (apart from the exploratory research study) involved a 12-week intervention in schools. Bespoke materials were developed for all the studies, and teachers received training on how to deliver the intervention they were a part of. For the RCTs, randomisation was done at school level: schools were randomly divided into treatment and control groups. The independent evaluators collected both quantitative and qualitative data to ensure that we gained comprehensive insights from the schools’ experiences of the interventions. The evaluators’ reports and our associated blog posts give full details of each study.
The impact of the pandemic
The research programme ran from 2019 to 2022, and as it was based in schools, we faced a lot of challenges due to the coronavirus pandemic. Many research programmes meant to take place in school were cancelled as soon as schools shut during the pandemic.
Although we were fortunate that GBIC was allowed to continue, we were not allowed to extend the end date of the programme. Thus our studies were compressed into the period after schools reopened and primarily delivered in the academic year 2021/2022. When schools were open again, the implementation of the studies was affected by teacher and pupil absences, and by schools necessarily focusing on making up some of the lost time for learning.
The overall results of Gender Balance in Computing
Quantitatively, none of the RCTs showed a statistically significant impact on the primary outcome measured, which was different in different trials but related to either learners’ attitudes to computer science or their intention to study computer science. Most of the RCTs showed a positive impact that fell just short of statistical significance. The evaluators went to great lengths to control for pandemic-related attrition, and the implementation teams worked hard to support teachers in still delivering the interventions as designed, but attrition and disruptions due to the pandemic may have played a part in the results.
The qualitative research results were more encouraging. Teachers were enthusiastic about the approaches we had chosen in order to address known barriers to gender balance, and the qualitative data indicated that pupils reacted positively to the interventions. One key theme across the Teaching Approach (and other) studies was that girls valued collaboration and teamwork. The data also offered insights that enable us to improve on the interventions.
We designed the studies so they could act as pilots that may be rolled out at a national scale. While we have gained sufficient understanding of what works to be able to run the interventions at a larger scale, two particular learnings shape our view of what a large-scale study should look like:
1. A single intervention may not be enough to have an impact
The GBIC results highlight that there is no quick fix and suggest that we should combine some of the approaches we’ve been trialling to provide a more holistic approach to teaching Computing in an equitable way. We would recommend that schools adopt several of the approaches we’ve tested; the materials associated with each intervention are freely available (see our blog posts for links).
2. Age matters
One of the very interesting overall findings from this research programme was the difference in intent to study Computing between primary school and secondary school learners; fewer secondary school learners reported intent to study the subject further. This difference was observed for both girls and boys, but was more marked for girls, as shown in the graph below. This suggests that we need to double down on supporting children, especially girls, to maintain their interest in Computing as they enter secondary school at age 11. It also points to a need for more longitudinal research to understand more about the transition period from primary to secondary school and how it impacts children’s engagement with computer science and technology in general.
Compared to primary school age girls, girls aged 12 to 13 show dramatically reduced intent to continue studying computing.
What’s next?
We think that more time (in excess of 12 weeks) is needed to both deliver the interventions and measure their outcome, as the change in learners’ attitudes may be slow to appear, and we’re hoping to engage in more longitudinal research moving forward.
In the final seminar in our series on cross-disciplinary computing, Dr Tracy Gardner and Rebecca Franks, who work here at the Foundation, described the framework underpinning the Foundation’s non-formal learning pathways. They also shared insights from our recently published literature review about the impact that non-formal computing education has on learners.
Dr Tracy GardnerRebecca Franks
Tracy and Rebecca both have extensive experience in teaching computing, and they are passionate about inspiring young learners and broadening access to computing education. In their work here, they create resources and content for learners in coding clubs and young people at home.
How non-formal learning creates opportunities for computing education
UNESCO defines non-formal learning as “institutionalised, intentional, and planned… an addition, alternative, and/or complement to formal education within the process of life-long learning of individuals”. In computing education, this kind of learning happens in after-school programmes or at home, as children engage with materials that have been carefully designed by education providers.
At the Raspberry Pi Foundation, we support two global networks of free, volunteer-led coding clubs where regular non-formal learning takes place: Code Club, teacher- and volunteer-led coding clubs for 9- to 13-year-olds taking place in schools in more than 160 countries; and CoderDojo, volunteer-led programming clubs for young people aged 7–17 taking place in community venues and offices in 100 countries. Through free learning resources and other support, we enable volunteers to run their club sessions, offering versatile opportunities and creative, inclusive spaces for young people to learn about computing outside of the school curriculum. Volunteers who run Code Clubs or CoderDojos report that participating in the club sessions positively impacts participants’ programming skills and confidence.
Rebecca and Tracy are part of the team here that writes the learning resources young people in Code Clubs and CoderDojos (and beyond) use to learn to code and create technology.
Helping learners make things that matter to them
Rebecca started the seminar by describing how the team reviewed existing computing pedagogy research into non-formal learning, as well as large amounts of website visitor data and feedback from volunteers, to establish a new framework for designing and creating coding resources in the form of learning paths.
What the Raspberry Pi Foundation takes into account when creating non-formal learning resources.
As Rebecca explained, non-formal learning paths should be designed to bridge the so-called ‘Turing tar-pit’: the gap between what learners want to do, and what they have the knowledge and resources to achieve.
To prevent learners from getting frustrated and ultimately losing interest in computing, learning paths need to:
Be beginner-friendly
Include scaffolding
Support learners’ design skills
Relate to things that matter to learners
When Rebecca and Tracy’s team create new learning paths, they first focus on the things that learners want to make. Then they work backwards to bridge the gap between learners’ big ideas and the knowledge and skills needed to create them. To do this, they use the 3…2…1…Make! framework they’ve developed.
An illustration of the 3…2…1…Make! structure of the new Raspberry Pi Foundation non-formal learning paths.
Learning paths designed according to the framework are made up of three different types of project in a 3-2-1 structure (see the sketch after this list):
Three Explore projects to introduce creators to a set of skills and provide step-by-step instructions to help them develop initial confidence
Two Design projects to allow creators to practise the skills they learned in the previous Explore projects, and to express themselves creatively while they grow in independence
One Invent project where creators use their skills to meet a project brief for a particular audience
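For readers who like to think in code, here is a minimal sketch of how a learning path built on the 3…2…1…Make! framework might be modelled as a data structure. This is purely our own illustration, not the Foundation’s tooling; the class names, field names, and project titles are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

class ProjectType(Enum):
    EXPLORE = "Explore"  # step-by-step instructions build initial confidence
    DESIGN = "Design"    # guided practice with room for creative expression
    INVENT = "Invent"    # an open brief for a particular audience

@dataclass
class Project:
    title: str
    project_type: ProjectType

@dataclass
class LearningPath:
    name: str
    projects: list[Project]

    def follows_321_structure(self) -> bool:
        """True if the path is three Explore, then two Design, then one Invent project."""
        expected = (
            [ProjectType.EXPLORE] * 3
            + [ProjectType.DESIGN] * 2
            + [ProjectType.INVENT]
        )
        return [p.project_type for p in self.projects] == expected

# A hypothetical path with invented project titles, for illustration only
path = LearningPath(
    name="Animations in Scratch",
    projects=[
        Project("Meet the stage", ProjectType.EXPLORE),
        Project("Make it move", ProjectType.EXPLORE),
        Project("Add some sound", ProjectType.EXPLORE),
        Project("Your own character", ProjectType.DESIGN),
        Project("A scene of your choice", ProjectType.DESIGN),
        Project("Animate a story for a friend", ProjectType.INVENT),
    ],
)
assert path.follows_321_structure()
```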
Rebecca and Tracy’s team have created several new learning paths based on the 3…2…1…Make! framework and have received a lot of positive feedback on them. They are now looking to develop more tools and libraries to support learners, to increase the accessibility of the paths, and to conduct research into the impact of the framework.
New literature review of non-formal computing education showcases its positive impact
In the second half of the seminar, Tracy shared what the research literature says about the impact of non-formal learning. She and researchers at the Foundation particularly wanted to find out what is known about computing education for K–12 learners in non-formal settings. They systematically reviewed 421 papers, identifying 88 papers from the last seven years that related to empirical research on non-formal computing education for young learners. Based on these 88 papers, they summarised the state of the field in a literature review.
So far, most studies of non-formal computing education have looked at knowledge and skill development in computing, as well as affective factors such as interest and perception. The cognitive impact of non-formal education has been generally positive. The papers Tracy and the researchers reviewed suggested that regular learning opportunities, such as weekly Code Club sessions, were beneficial for learners’ knowledge development, and that actively teaching problem-solving skills can foster learners’ independence.
Non-formal computing education also seems beneficial in terms of affective factors, although it is not yet clear whether the benefits last long-term, since most existing studies have been short-term. For example, out-of-school programmes can lead to a more positive perception and an increased awareness of computing among learners, and can boost the confidence and self-efficacy of learners who have had little prior experience of computing. The social aspects of participating in coding clubs should not be underestimated: learners can develop a sense of belonging and support as they work with their peers and mentors.
The literature review showed that non-formal computing education complements formal in-school education in many ways. Code Clubs and CoderDojos can be accessible and equitable spaces for all young people, because the people who run them can tailor learning to each individual. Coding clubs such as these also succeed in making computing fun and engaging, by enabling a community to form and by allowing learners to make things that are meaningful to them.
What existing studies in non-formal computing aren’t telling us
The literature review also made it obvious that there are big gaps in the existing understanding of non-formal computing education that need to be researched in more detail. For example, most of the studies described in the reviewed papers took place with female students in middle schools in the US.
That means the existing research tells us little about non-formal learning:
In other geographic locations
In other educational settings, such as primary schools or after-school programmes
For a wider spectrum of learners
We would also love to see studies that home in on:
The long-term impact of non-formal learning
Which specific factors contribute to positive outcomes
Non-formal learning about aspects of computing beyond programming
3…2…1…research!
We’re excited to continue collaborating within the Foundation so that our researchers and our team creating non-formal learning content can investigate the impact of the 3…2…1…Make! framework.
The aim of the 3…2…1…Make! framework is to enable young people to create things and solve problems that matter to them using technology.
This collaboration connects two of our long-term strategic goals: to engage millions of young people in learning about computing and how to create with digital technologies outside of school, and to deepen our understanding of how young people learn about computing and creating with digital technologies, using that knowledge to increase the impact of our work and advance the field of computing education. Based on our research, we will iterate and improve the framework in order to enable even more young people to realise their full potential through the power of computing and digital technologies.
Join our seminar series on primary computing education
From January, you can join our new monthly seminar series on primary (K–5) teaching and learning. In this series, we’ll hear insights into how our youngest learners develop their computing knowledge. Whether you’re a volunteer in a coding club, a teacher, a researcher, or simply interested in the topic, we’d love to see you at one of these monthly online sessions.
The first seminar, on Tuesday 10 January at 5pm UK time, will feature researchers and educators Dr Katie Rich and Carla Strickland. They will share findings on how to teach children about variables, one of the most difficult aspects of computing for young learners. Sign up now, and we will send you notifications and joining links for each seminar session.