Tag Archives: research

Universal design for learning in computing | Hello World #15

Post Syndicated from Hayley Leonard original https://www.raspberrypi.org/blog/universal-design-for-learning-in-computing-hello-world-15/

In our brand-new issue of Hello World magazine, Hayley Leonard from our team gives a primer on how computing educators can apply the Universal Design for Learning framework in their lessons.

Cover of issue 15 of Hello World magazine

Universal Design for Learning (UDL) is a framework for considering how tools and resources can be used to reduce barriers and support all learners. Based on findings from neuroscience, it has been developed over the last 30 years by the Center for Applied Special Technology (CAST), a nonprofit education research and development organisation based in the US. UDL is currently used across the globe, with research showing it can be an efficient approach for designing flexible learning environments and accessible content.

A computing classroom populated by students with diverse genders and ethnicities

Engaging a wider range of learners is an important issue in computer science, which is often not chosen as an optional subject by girls and those from some minority ethnic groups. Researchers at the Creative Technology Research Lab in the US have been investigating how UDL principles can be applied to computer science, to improve learning and engagement for all students. They have adapted the UDL guidelines to a computer science education context and begun to explore how teachers use the framework in their own practice. The hope is that understanding and adapting how the subject is taught could help to increase the representation of all groups in computing.

The UDL guidelines help educators anticipate barriers to learning and plan activities to overcome them.

A scientific approach

The UDL framework is based on neuroscientific evidence which highlights how different areas or networks in the brain work together to process information during learning. Importantly, there is variation across individuals in how each of these networks functions and how they interact with each other. This means that a traditional approach to teaching, in which a main task is differentiated for certain students with special educational needs, may miss out on the variation in learning between all students across different tasks.

A stylised representation of the human brain
The UDL framework is based on neuroscientific evidence

The UDL guidelines highlight different opportunities to take learner differences into account when planning lessons. The framework is structured according to three main principles, which are directly related to three networks in the brain that play a central role in learning. It encourages educators to plan multiple, flexible methods of engagement in learning (affective networks), representation of the teaching materials (recognition networks), and opportunities for action and expression of what has been learnt (strategic networks).

The three principles of UDL are each expanded into guidelines and checkpoints that allow educators to identify the different methods of engagement, representation, and expression to be used in a particular lesson. Each principle is also broken down into activities that allow learners to access the learning goals, remain engaged and build on their learning, and begin to internalise the approaches to learning so that they are empowered for the future.

Examples of UDL guidelines for computer science education from the Creative Technology Research Lab

Multiple means of engagement Multiple means of representation Multiple means of
action and expression
Provide options for recruiting interests
* Give students choice (software, project, topic)
* Allow students to make projects relevant to culture and age
Provide options for perception
* Model computing through physical representations as well as through interactive whiteboard/videos etc.
* Select coding apps and websites that allow adjustment of visual settings (e.g. font size/contrast) and that are compatible with screen readers
Provide options for physical action
* Include CS unplugged activities that show physical relationships of abstract computing concepts
* Use assistive technology, including a larger or smaller mouse or touchscreen devices
Provide options for sustaining effort and persistence
* Utilise pair programming and group work with clearly defined roles
* Discuss the integral role of perseverance and problem-solving in computer science
Provide options for language, mathematical expressions, and symbols
* Teach and review computing vocabulary (e.g. code, animations, algorithms)
* Provide reference sheets with images of blocks, or with common syntax when using text
Provide options for expression and communication
* Provide sentence starters or checklists for communicating in order to collaborate, give feedback, and explain work
* Provide options that include starter code
Provide options for self-regulation
* Break up coding activities with opportunities for reflection, such as ‘turn and talk’ or written questions
* Model different strategies for dealing with frustration appropriately
Provide options for comprehension
* Encourage students to ask questions as comprehension checkpoints
* Use relevant analogies and make cross-curricular connections explicit
Provide options for executive function
* Embed prompts to stop and plan, test, or debug throughout a lesson or project
* Demonstrate debugging with think-alouds

Each principle of the UDL framework is associated with three areas of activity which may be considered when planning lessons or units of work. It will not be the case that each area of activity should be covered in every lesson, and some may prove more important in particular contexts than others. The full table and explanation can be found on the Creative Technology Research Lab website at ctrl.education.ufl.edu/projects/tactic.

Applying UDL to computer science education

While an advantage of UDL is that the principles can be applied across different subjects, it is important to think carefully about what activities to address these principles could look like in the case of computer science.

Maya Israel
Researcher Maya Israel will speak at our April seminar

Researchers at the Creative Technology Research Lab, led by Maya Israel, have identified key activities, some of which are presented in the table on the previous page. These guidelines will help educators anticipate potential barriers to learning and plan activities that can overcome them, or adapt activities from those in existing schemes of work, to help engage the widest possible range of students in the lesson.

UDL in the classroom

As well as suggesting approaches to applying UDL to computer science education, the research team at the Creative Technology Research Lab has also investigated how teachers are using UDL in practice. Israel and colleagues worked with four novice computer science teachers in US elementary schools to train them in the use of UDL and understand how they applied the framework in their teaching.

Smiling learners in a computing classroom

The research found that the teachers were most likely to include in their teaching multiple means of engagement, followed by multiple methods of representation. For example, they all offered choice in their students’ activities and provided materials in different formats (such as oral and visual presentations and demonstrations). They were less likely to provide multiple means of action and expression, and mainly addressed this principle through supporting students in planning work and checking their progress against their goals.

Although the study included only four teachers, it highlighted the flexibility of the UDL approach in catering for different needs within variable teaching contexts. More research will be needed in future, with larger samples, to understand how successful the approach is in helping a wide range of students to achieve good learning outcomes.

Find out more about using UDL

There are numerous resources designed to help teachers learn more about the UDL framework and how to apply it to teaching computing. The CAST website (helloworld.cc/cast) includes an explainer video and the detailed UDL guidelines. The Creative Technology Research Lab website has computing-specific ideas and lesson plans using UDL (helloworld.cc/udl).

Maya Israel will be presenting her research at our computing education research seminar series, on 20 April 2021. Our seminars are free to attend and open to anyone from anywhere around the world. Find out more about the current seminar series, which focuses on diversity and inclusion, and sign up to attend for free.

Further reading

The post Universal design for learning in computing | Hello World #15 appeared first on Raspberry Pi.

What does equity-focused teaching mean in computer science education?

Post Syndicated from Sue Sentance original https://www.raspberrypi.org/blog/equity-focused-teaching-in-computer-science-education/

Today, I discuss the second research seminar in our series of six free online research seminars focused on diversity and inclusion in computing education, where we host researchers from the UK and USA together with the Royal Academy of Engineering. By diversity, we mean any dimension that can be used to differentiate groups and people from one another. This might be, for example, age, gender, socio-economic status, disability, ethnicity, religion, nationality, or sexuality. The aim of inclusion is to embrace all people irrespective of difference. 

In this seminar, we were delighted to hear from Prof Tia Madkins (University of Texas at Austin), Dr Nicol R. Howard (University of Redlands), and Shomari Jones (Bellevue School District) (find their bios here), who talked to us about culturally responsive pedagogy and equity-focused teaching in K-12 Computer Science.

Equity-focused computer science teaching

Tia began the seminar with an audience-engaging task: she asked all participants to share their own definition of equity in the seminar chat. Amongst their many suggestions were “giving everybody the same opportunity”, “equal opportunity to access high-quality education”, and “everyone has access to the same resources”. I found Shomari’s own definition of equity very powerful: 

“Equity is the fair treatment, access, opportunity, and advancement of all people, while at the same time striving to identify and eliminate barriers that have prevented the full participation of some groups. Improving equity involves increasing justice and fairness within the procedures and processes of institutions or systems, as well as the distribution of resources. Tackling equity requires an understanding of the root cause of outcome disparity within our society.”

Shomari Jones

This definition is drawn directly from the young people Shomari works with, and it goes beyond access and opportunity to the notion of increasing justice and fairness and addressing the causes of outcome disparity. Justice was a theme throughout the seminar, with all speakers referring to the way that their work looks at equity in computer science education through a justice-oriented lens.

Removing deficit thinking

Using a justice-oriented approach means that learners should be encouraged to use their computer science knowledge to make a difference in areas that are important to them. It means that just having access to a computer science education is not sufficient for equity.

Tia Madkins presents a slide: "A justice-oriented approach to computer science teaching empowers students to use CS knowledge for transformation, moves beyond access and achievement frames, and is an asset- or strengths-based approach centering students and families"

Tia spoke about the need to reject “deficit thinking” (i.e. focusing on what learners lack) and instead focus on learners’ strengths or assets and how they bring these to the school classroom. For researchers and teachers to do this, we need to be aware of our own mindset and perspective, to think about what we value about ethnic and racial identities, and to be willing to reflect and take feedback.

Activities to support computer science teaching

Nicol talked about some of the ways of designing computing lessons to be equity-focused. She highlighted the benefits of pair programming and other peer pedagogies, where students teach and learn from each other through feedback and sharing ideas/completed work. She suggested using a variety of different programs and environments, to ensure a range of different pathways to understanding. Teachers and schools can aim to base teaching around tools that are open and accessible and, where possible, available in many languages. If the software environment and tasks are accessible, they open the doors of opportunity to enable students to move on to more advanced materials. To demonstrate to learners that computer science is applicable across domains, the topic can also be introduced in the context of mathematics and other subjects.

Nicol Howard presents a slide: "Considerations for equity-focused computer science teaching include your beliefs (and your students' beliefs) and how they impact CS classrooms; tiered activities and pair programming; self-expressions versus CS preparation; equity-focused lens"

Learners can benefit from learning computer science regardless of whether they want to become a computer scientist. Computing offers them skills that they can use for self-expression or to be creative in other areas of their life. They can use their knowledge for a specific purpose and to become more autonomous, particularly if their teacher does not have any deficit thinking. In addition, culturally relevant teaching in the classroom demonstrates a teacher’s deliberate and explicit acknowledgment that they value all students in their classroom and expect students to excel.

Engaging family and community

Shomari talked about the importance of working with parents and families of ethnically diverse students in order to hear their voices and learn from their experiences.

Shomari Jones presents a slide: “Parents without backgrounds and insights into the changing landscape of technology struggle to negotiate what roles they can play, such as how to work together in computing activities or how to find learning opportunities for their children.”

He described how the absence of a background in technology of parents and carers can drastically impact the experiences of young people.

“Parents without backgrounds and insights into the changing landscape of technology struggle to negotiate what roles they can play, such as how to work together in computing activities or how to find learning opportunities for their children.”

Betsy DiSalvo, Cecili Reid, and Parisa Khanipour Roshan. 2014

Shomari drew on an example from the Pacific Northwest in the US, a region with many successful technology companies. In this location, young people from wealthy white and Asian communities can engage fully in informal learning of computer science and can have aspirations to enter technology-related fields, whereas amongst the Black and Latino communities, there are significant barriers to any form of engagement with technology. This already existent inequity has been enhanced by the coronavirus pandemic: once so much of education moved online, it became widely apparent that many families had never owned, or even used, a computer. Shomari highlighted the importance of working with pre-service teachers to support them in understanding the necessity of family and community engagement.

Building classroom communities

Building a classroom community starts by fostering and maintaining relationships with students, families, and their communities. Our speakers emphasised how important it is to understand the lives of learners and their situations. Through this understanding, learning experiences can be designed that connect with the learners’ lived experiences and cultural practices. In addition, by tapping into what matters most to learners, teachers can inspire them to be change agents in their communities. Tia gave the example of learning to code or learning to build an app, which provides learners with practical tools they can use for projects they care about, and with skills to create artefacts that challenge and document injustices they see happening in their communities.

Find out more

If you want to learn more about this topic, a great place to start is the recent paper Tia and Nicol have co-authored that lays out more detail on the work described in the seminar: Engaging Equity Pedagogies in Computer Science Learning Environments, by Tia C. Madkins, Nicol R. Howard and Natalie Freed, 2020.

You can access the presentation slides via our seminars page.

Join our next free seminar

In our next seminar on Tuesday 2 March at 17:00–18:30 BST / 12:00–13:30 EDT / 9:00–10:30 PDT / 18:00–19:30 CEST, we’ll welcome Jakita O. Thomas (Auburn University), who is going to talk to us about Designing STEM Learning Environments to Support Computational Algorithmic Thinking and Black Girls: A Possibility Model for Changing Hegemonic Narratives and Disrupting STEM Neoliberal Projects. To join this free online seminar, simply sign up with your name and email address.

Once you’ve signed up, we’ll email you the seminar meeting link and instructions for joining. If you attended Peter’s and Billy’s seminar, the link remains the same.

The post What does equity-focused teaching mean in computer science education? appeared first on Raspberry Pi.

NICER Protocol Deep Dive: Internet Exposure of HTTP and HTTPS

Post Syndicated from Tod Beardsley original https://blog.rapid7.com/2021/01/29/nicer-protocol-deep-dive-internet-exposure-of-http-and-https/

NICER Protocol Deep Dive: Internet Exposure of HTTP and HTTPS

Welcome to the NICER Protocol Deep Dive blog series! When we started researching what all was out on the internet way back in January, we had no idea we’d end up with a hefty, 137-page tome of a research report. The sheer length of such a thing might put off folks who might otherwise learn a thing or two about the nature of internet exposure, so we figured, why not break up all the protocol studies into their own reports?

So, here we are! What follows is taken directly from our National / Industry / Cloud Exposure Report (NICER), so if you don’t want to wait around for the next installment, you can cheat and read ahead!

[Research] Read the full NICER report today

Get Started

HTTP (TCP/80) & HTTPS (TCP/443)

One protocol to bring them all, and in the darkness, bind them.

TLDR

  • WHAT IT IS: HTTP: Pristine, plaintext Hypertext Transfer Protocol communications. HTTPS: Encrypted HTTP.
  • HOW MANY: 51,519,309 discovered HTTP nodes. 36,141,137 discovered HTTPS nodes. We’re going to be talking a bit differently about fingerprinting in this blog post, so raw, generic counts will have no context.
  • VULNERABILITIES: Hoo boy! Many! But, do you mean vulnerabilities in core web servers themselves? The add-ons folks build into them? The web applications they serve? As many users of Facebook might say, “it’s complicated.”
  • ADVICE: Go back to Gopher! Seriously, though, please continue to build awesome things using HTTPS. Just build them in such a way that folks who install and operate web servers can easily configure them securely, see patch status, and upgrade quickly and confidently.
  • ALTERNATIVES: QUIC, or “Quick UDP Internet Connection,” which is a “new multiplexed and secure transport atop UDP, designed from the ground up and optimized for HTTP/2 semantics.” While HTTP[S] will be with us for a Very Long Time, QUIC is its successor and will usher in whole new ways to deliver content securely and efficiently (and undoubtedly, exploit the same).

We’re going to talk about both HTTP and HTTPS combined (for the most part) as we identify what we found, some core areas of exposure, and opportunities for attackers. It’ll be a bit different than all the previous blogs, but that’s just part of the quirky nature of HTTP in general.

Discovery details

Way back in our Email blogs, we compared encrypted and unencrypted services. We’ll do the same here, but will be presenting a “top 12” for countries since that is the set combination between HTTP and HTTPS.

There are 30% more devices on the internet running plaintext HTTP versus encrypted HTTPS web services. The U.S. dwarfs all other countries in terms of discovered web service, very likely due to the presence of so many cloud services, hosting providers, and routers, switches, etc. in  IPv4 space allocated to the U.S.

Germany and Ireland each expose 9% more HTTPS nodes than HTTP, and both the Netherlands and U.K. are quickly closing their encryption disparity as well.

We’ll skip cloud counts since, well, everyone knows cloud servers are full of web servers and we’re not sure what good it will do letting you know that Amazon had ~640K Elastic Load Balancers (version 2.0!) running on the day our studies kicked off.

NICER Protocol Deep Dive: Internet Exposure of HTTP and HTTPS

Exposure information

To understand exposure, we need to see what is running on these web servers. That’s not as easy as you might think with just lightweight scans. For example, here are the top 20 HTTP servers by vendor/family and port:

Vendor Family HTTPS (80) % of HTTP HTTPS (443) % of HTTPS
Microsoft IIS 5,273,393 10.24% 2,096,655 5.80%
Apache Apache 4,873,517 9.46% 2,595,714 7.18%
nginx nginx 3,938,031 7.64% 2,495,667 6.91%
Amazon Elastic Load Balancing 644,862 1.25% 386,751 1.07%
Squid Cache Squid 381,224 0.74% 8,649 0.02%
ACME Laboratories mini_httpd 125,708 0.24% 82,427 0.23%
Oracle GoAhead Webserver 48,505 0.09% 40,501 0.11%
Apache Tomcat 40,702 0.08% 32,271 0.09%
Taobao Tengine 37,626 0.07% 14,130 0.04%
Eclipse Jetty 29,750 0.06% 50,763 0.14%
Mbedthis Software Appweb 23,463 0.05% 19,470 0.05%
Virata EmWeb 22,354 0.04% 7,179 0.02%
Embedthis Appweb 17,235 0.03% 32,629 0.09%
Microsoft Windows CE Web Server 14,012 0.03% 1,027 0.00%
TornadoWeb Tornado 13,637 0.03% 10,151 0.03%
Tridium Niagara 9,772 0.02% 564 0.00%
TwistedMatrix Twisted Web 7,481 0.01% 4,984 0.01%
Caucho Resin 5,168 0.01% 1,812 0.01%
Mort Bay Jetty 5,079 0.01% 2,033 0.01%
SolarWinds Serv-U 3,232 0.01% 6,421 0.02%

Remember, we’re just counting what comes back on a `GET` request to those two ports on each active IP address, and the counts come from Recog signatures (which are great, but far from comprehensive). For some servers, we can get down to the discrete version level, which lets us build a Common Platform Enumeration identifier. That identifier lets us see how many CVEs a given instance type has associated with it. We used this capability to compare each version of each service family against the number of CVEs it has. While we do not have complete coverage across the above list, we do have some of the heavy(ier) hitters:

NICER Protocol Deep Dive: Internet Exposure of HTTP and HTTPS

We limited the view to a service family having at least having 10 or more systems exposed and used color to encode the CVSS v2 scores.

The most prevalent CVE-enumerated vulnerabilities are listed in the table below. While it’s technically possible that these CVEs have been mitigated through some other software control, patching them out entirely is really the best and easiest way to avoid uncomfortable conversations with your vulnerability manager.

And, the top 30 most prevalent are:

CVE Number
CVE-2017-8361 336
CVE-2013-2275 202
CVE-2012-1452 186
CVE-2016-1000107 184
CVE-2016-6440 184
CVE-2012-0038 168
CVE-2012-1835 165
CVE-2016-8827 165
CVE-2011-3868 164
CVE-2011-0607 160
CVE-2007-6740 154
CVE-2013-4564 150
CVE-2016-0948 149
CVE-2016-0956 149
CVE-2009-2047 146
CVE-2015-5670 145
CVE-2017-8577 143
CVE-2014-0134 135
CVE-2015-5355 135
CVE-2012-5932 127
CVE-2014-8089 120
CVE-2015-5685 118
CVE-2016-1000109 118
CVE-2015-5672 114
CVE-2016-5596 112
CVE-2016-5600 112
CVE-2016-4261 111
CVE-2016-4263 111
CVE-2016-4264 111
CVE-2016-4268 111

While we expect to see traditional web servers, there are other devices connected to the internet that expose web services or administrative interfaces (which we’ve partially enumerated below):

Vendor Device HTTP (80) HTTPS (443)
Cisco Firewall 123 986,766
AVM WAP 1,942 604,890
Asus WAP 1 177,936
Synology NAS 61,796 50,531
Check Point Firewall 16,059 30,773
SonicWALL VPN 7,413 16,061
Ubiquiti WAP 0 11,813
HP Printer 16,247 9,178
MikroTik Router 289,026 8,056
Tivo DVR 6,400 6,779
Philips Light Bulb 4,785 3,349
Polycom VoIP 369 3,079
Ubiquiti Web cam 955 922
HP Lights Out Management 601 708
ARRIS Cable Modem 350 217
Fortinet Firewall 1,221 159
Xerox Printer 1,575 29
Canon Multifunction Device 124 14
Netwave Web cam 6,420 7
HeiTel DVR 2,734 2
Samsung DVR 53,053 2
Merit LILIN DVR 2,565 1
Fidelix Industrial Control 545 0
FUHO DVR 1,249 0
Shenzhen Reecam Tech. Ltd. Web cam 1,902 0
Ubiquiti DVR 675 0
Yamaha Router 9,675 0

For instance, we found nearly a million Cisco ASA firewalls. That fact is not necessarily “bad,” since they can be configured to provide remote access services (like VPN). Having 123 instances on port 80 is, however, not the best idea.

Unlike Cisco, most MikroTik routers seem to be exposed sans encryption, and over 75% of them are exposing the device’s admin interface. What could possibly go wrong?

Upward of 50,000 Synology network-attached storage devices show up as well, and the File Sharing blog posts talked at length about the sorry state of exposure in these types of devices. They’re on the internet to enable owners to play local media remotely and access other files remotely.

There are printers, and light bulbs; DVRs and home router admin interfaces; oh, and a few thousand entire building control systems.In short, you can find pretty much anything with a web interface hanging out on the internet.

Attacker’s view

There are so many layers in modern HTTP[S] services that attackers likely are often paralyzed by not knowing which ones to go after first. Attacking HTTP services on embedded systems is generally one of the safest paths to take, since they’re generally not monitored by the owner nor the network operator and can be used with almost guaranteed anonymity.

Formal web services—think Apache Struts, WebLogic, and the like—are also desirable targets, since they’re usually associated with enterprise deployments and, thus, have more potential for financial gain or access to confidential records. HTTP interfaces to firewalls and remote access systems (as we saw back in the Remote Access blog posts) have been a major focus for many attacker groups for the past 18–24 months since once compromised, they can drop an adversary right into the heart of the internal network where they can (usually) quickly establish a foothold and secondary access method.

NICER Protocol Deep Dive: Internet Exposure of HTTP and HTTPS
NICER Protocol Deep Dive: Internet Exposure of HTTP and HTTPS

You’re also more likely to see (at least for now) more initial probes on HTTP (80), as noted by both the unique source IPv4 and total interaction views (above). It’s hard to say “watch 80 closely, and especially 80→443 moves by clients,” since most services are still offered on both ports and good sites are configured to automatically redirect clients to HTTPS. Still, if you see clients focus more on 80, you may want to flag those for potential further investigation. And, definitely be more careful with your systems that only talk HTTP (80).

Our advice

IT and IT security teams should build awesome platforms and services and put them on the internet over HTTPS! Innovation drives change and progress—plus, the internet has likely done more good than harm since the first HTTP request was made. Do keep all this patched and ensure secure configuration and coding practices are part of the development and deployment lifecycles. Do not put administrative interfaces to anything on the internet if at all possible and ensure you know what services your network devices and “Internet of Things” devices are exposing. Finally, disable `Server:` banners on everything and examine other HTTP headers for what else they might leak and sanitize what you can. Attackers on the lookout for, say, nginx will often move on if they see Apache in the Server header. You’d be surprised just how effective this one change can be.

Cloud providers should continue to offer secure, scalable web technologies. At the same time, if pre-built disk images with common application stacks are offered, keep them patched and ensure you have the ability to inform users when things go out-of-date.

Government cybersecurity agencies should keep reminding us not to put digital detritus with embedded web servers on the internet and monitor for campaigns that are targeting these invisible services. When there are major issues with core technologies such as Microsoft IIS, Apache HTTP, or nginx, processes should be in place to notify the public and work with ISPs, hosting, and cloud providers to try to contain any possible widespread damage. There should be active programs in place to ensure no critical telecommunications infrastructure has dangerous ports or services exposed, especially router administrative interfaces over HTTP/HTTPS.

[Research] Read the full NICER report today

Get Started

NICER Protocol Deep Dive: Internet Exposure of NTP

Post Syndicated from Tod Beardsley original https://blog.rapid7.com/2021/01/22/nicer-protocol-deep-dive-internet-exposure-of-ntp/

NICER Protocol Deep Dive: Internet Exposure of NTP

Welcome to the NICER Protocol Deep Dive blog series! When we started researching what all was out on the internet way back in January, we had no idea we’d end up with a hefty, 137-page tome of a research report. The sheer length of such a thing might put off folks who might otherwise learn a thing or two about the nature of internet exposure, so we figured, why not break up all the protocol studies into their own reports?

So, here we are! What follows is taken directly from our National / Industry / Cloud Exposure Report (NICER), so if you don’t want to wait around for the next installment, you can cheat and read ahead!

[Research] Read the full NICER report today

Get Started

NTP (123)

In the immortal words of The Smiths, “How soon is now?”

TLDR

  • WHAT IT IS: Network Time Protocol—the service that keeps us all in sync.
  • HOW MANY: 1,638,577 discovered nodes. 1,638,495 (99.9%) gave up version and/or other fingerprintable information and (much) smaller subsets provided operating system information.
  • VULNERABILITIES: A few. Mostly denial-of-service and information disclosure, but there have been remote code execution ones from time to time.
  • ADVICE: Use it! Just not on the internet. And, configured properly. And, patched.
  • ALTERNATIVES: Nope. This is the de facto way to keep time on the internet.
  • GETTING: Stuck in time. There was literally no change from 2019.

The internet could not function the way it does without NTP. You’d think with that much power NTP would be all BPOC and act all smug and superior. Yet, it does it’s thing—keeping all computers that use it in sync, time-wise—with little fanfare, except when it’s being used in denial-of-service attacks. It has been around since around 1985, and while it is not the only network-based time synchronization protocol, it is The Standard.

NTP servers operate in a hierarchy with up to 15 levels dubbed stratum. There are authoritative, highly available NTP servers we all use every day (most of the time provided by operating system vendors and running on obviously named hosts such as time.apple.com and time.windows.com).

Virtually anything can be an NTP server, from a router, to your phone, to a RaspberryPi, so dedicated appliances that key off of GPS signals as a time-source. Now, just because something can be a time server does not mean it should be a time server.

Discovery details

Project Sonar found 1,638,577 NTP servers on the public internet, so one might say we have quite a bit of time on our hands. Our editors say otherwise, so let’s see what time looks like across countries and clouds.

NICER Protocol Deep Dive: Internet Exposure of NTP

The United States has many IPv4 blocks, many computers, and many major ISPs and IT companies that like to control things. It also has a decent number of businesses that run NTP for no good reason. All of this helps push it to the top spot. Russia finally shows up in second place, for similar reasons, though two of Russia’s major ISPs account for just over 40% of Russia’s exposure. China—with its vast IPv4 space and population—comes in at No. 3, which means businesses and ISPs have figured out needlessly exposing NTP can cause more problems than it’s worth.

NICER Protocol Deep Dive: Internet Exposure of NTP

Rapid7 Labs was glad to see cloud environments (both the runners and the customers) seem to take the dangers of running NTP seriously as well, with most having almost no exposure.

Exposure information

Now, you know NTP has to be a bit dangerous if the main support site for the protocol itself has a big, bad warning about the dangers of NTP right at the top of its page. The biggest danger comes from using it for amplification DDoS (it is a UDP-based protocol). While it is still used today, there are way better services, such as memcached, to use for such things.

NTP servers are just bits of software that have vulnerabilities like all other software. When you put anything on the internet, bad folks are going to try to gain control over it. If an organization needs—for some odd reason—to run its own NTP server, there’s no reason it has to be on the public internet. And, if there is some weird reason it does, there’s no reason it has to be configured to respond to requests from all subnets.

Why are we picking nits? Well, it’s one more thing you’re not going to patch. Then, there’s the problem of all the information you might be giving to attackers about your network setup. In our NTP corpus, 255,602 (15.5%) reveal the private IP address scheme on the internal network interface.

OS Count Percentage
UNIX (generic) 1,089,876 69.61%
Cisco device 294,330 18.80%
Linux+kernel version 99,032 6.32%
BSD+kernel version 38,798 2.48%
Juniper device+version 32,469 2.07%
VMware+version 8,597 0.55%
SunOS 948 0.06%
Other 657 0.04%
vxWorks 505 0.03%
Sidewinder+version 332 0.02%
QNX+version 186 0.01%
macOS+version 66 0.00%

Over 1.5 million NTP servers give hints about the operating system and version they run. In total, 180,410 (11%) give us precise NTP version and build information, with all but roughly 4,000 giving us the precise release date:

NICER Protocol Deep Dive: Internet Exposure of NTP

There’s an [un]healthy mix of remote code execution, information leakage, local service DoS, and amplification DDoS spread throughout that mix of NTP devices.

Hopefully we’ve managed to at least start to convince you otherwise if you were thinking, “Well, it’s just an NTP server” at the start of this section.

Attacker’s view

The Exposure Information section provided a great deal of information on the potential (and measured) weaknesses in NTP systems. Attackers will judge your potential as a victim (and cyber-insurers will likely up your premiums) from how your attack surface is configured. NTP can reveal all the cracks in your configuration and patch management processes, and even provide a means of entry.

And, attackers still use NTP in amplification attacks, so that NTP server you didn’t realize you had or really thought you needed will likely be used in attacks on other sites.

Our advice

IT and IT security teams should use NTP behind the firewall and keep it patched. If you do need to run NTP externally, only let it talk to specific hosts/networks.

Cloud providers should keep up the great work by only exposing as much NTP as they need to and offering guidance to customers for how to run NTP securely (off the internet).

Government cybersecurity agencies should provide timely notifications when new vulnerabilities in NTP surface or there are known, active NTP DoS campaigns. Educational materials should be made available on dangers of exposing NTP to the internet and on how to securely configure various NTP services.

[Research] Read the full NICER report today

Get Started

Computing education and underrepresentation: the data from England

Post Syndicated from Sue Sentance original https://www.raspberrypi.org/blog/computing-education-underrepresentation-data-england-schools/

In this blog post, I’ll discuss the first research seminar in our six-part series about diversity and inclusion. Let’s start by defining our terms. Diversity is any dimension that can be used to differentiate groups and people from one another. This might be, for example, age, gender, socio-economic status, disability, ethnicity, religion, nationality, or sexuality. The aim of inclusion is to embrace all people irrespective of difference.

It’s vital that we are inclusive in computing education, because we need to ensure that everyone can access and learn the empowering and enabling technical skills they need to support all aspects of their lives.

One male and two female teenagers at a computer

Between January and June of this year, we’re partnering with the Royal Academy of Engineering to host speakers from the UK and USA for a series of six research seminars focused on diversity and inclusion in computing education.

We kicked off the series with a seminar from Dr Peter Kemp and Dr Billy Wong focused on computing education in England’s schools post-14. Peter is a Lecturer in Computing Education at King’s College London, where he leads on initial teacher education in computing. His research areas are digital creativity and digital equity. Billy is an Associate Professor at the Institute of Education, University of Reading. His areas of research are educational identities and inequalities, especially in the context of higher education and STEM education.

Computing in England’s schools

Peter began the seminar with a comprehensive look at the history of curriculum change in Computing in England. This was very useful given our very international audience for these seminars, and I will summarise it below. (If you’d like more detail, you can look over the slides from the seminar. Note that these changes refer to England only, as education in the UK is devolved, and England, Northern Ireland, Scotland, and Wales each has a different education system.)

In 2014, England switched from mandatory ICT (Information and Communication Technology) to mandatory Computing (encompassing information technology, computer science, and digital literacy). This shift was complemented by a change in the qualifications for students aged 14–16 and 16–18, where the primary qualifications are GCSEs and A levels respectively:

  • At GCSE, there has been a transition from GCSE ICT to GCSE Computer Science over the last five years, with GCSE ICT being discontinued in 2017
  • At A level before 2014, ICT and Computing were on offer as two separate A levels; now there is only one, A level Computer Science

One of the issues is that in the English education system, there is a narrowing of the curriculum at age 14: students have to choose between Computer Science and other subjects such as Geography, History, Religious Studies, Drama, Music, etc. This means that those students that choose not to take a GCSE Computer Science (CS) may find that their digital education is thereby curtailed from then onwards. Peter’s and Billy’s view is that having a more specialist subject offer for age 14+ (Computer Science as opposed to ICT) means that fewer students take it, and they showed evidence of this from qualifications data. The number of students taking CS at GCSE has risen considerably since its introduction, but it’s not yet at the level of GCSE ICT uptake.

GCSE computer science and equity

Only 64% of schools in England offer GCSE Computer Science, meaning that just 81% of students have the opportunity to take the subject (some schools also add selection criteria). A higher percentage (90%) of selective grammar schools offer GCSE CS than do comprehensive schools (80%) or independent schools (39%). Peter suggested that this was making Computer Science a “little more elitist” as a subject.

Peter analysed data from England’s National Pupil Database (NPD) to thoroughly investigate the uptake of Computer Science post-14 with respect to the diversity of entrants.

He found that the gender gap for GCSE CS uptake is greater than it was for GCSE ICT. Now girls make up 22% of the cohort for GCSE CS (2020 data), whereas for the ICT qualification (2017 data), 43% of students were female.

Peter’s analysis showed that there is also a lower representation of black students and of students from socio-economically disadvantaged backgrounds in the cohort for GCSE CS. In contrast, students with Chinese ancestry are proportionally more highly represented in the cohort. 

Another part of Peter’s analysis related gender data to the Income Deprivation Affecting Children Index (IDACI), which is used as an indicator of the level of poverty in England’s local authority districts. In the graphs below, a higher IDACI decile means more deprivation in an area. Relating gender data of GCSE CS uptake against the IDACI shows that:

  • Girls from more deprived areas are more likely to take up GCSE CS than girls from less deprived areas are
  • The opposite is true for boys
Two bar charts relating gender data of GCSE uptake against the Income Deprivation Affecting Children Index. The graph plotting GCSE ICT data shows that students from areas with higher deprivation are slightly more likely to choose the GCSE, irrespective of gender. The graph plotting GCSE Computer Science data shows that girls from more deprived areas are more likely to take up GCSE CS than girls from less deprived areas, and the opposite is true for boys.

Peter covered much more data in the seminar, so do watch the video recording (below) if you want to learn more.

Peter’s analysis shows a lack of equity (i.e. equality of outcome in the form of proportional representation) in uptake of GCSE CS after age 14. It is also important to recognise, however, that England does mandate — not simply provide or offer — Computing for all pupils at both primary and secondary levels; making a subject mandatory is the only way to ensure that we do give access to all pupils.

What can we do about the lack of equity?

Billy presented some of the potential reasons for why some groups of young people are not fully represented in GCSE Computer Science:

  • There are many stereotypes surrounding the image of ‘the computer scientist’, and young people may not be able to identify with the perception they hold of ‘the computer scientist’
  • There is inequality in access to resources, as indicated by the research on science and STEM capital being carried out within the ASPIRES project

More research is needed to understand the subject choices young people make and their reasons for choosing as they do.

We also need to look at how the way we teach Computing to students aged 11 to 14 (and younger) affects whether they choose CS as a post-14 subject. Our next seminar revolves around equity-focused teaching practices, such as culturally relevant pedagogy or culturally responsive teaching, and how educators can use them in their CS learning environments. 

Meanwhile, our own research project at the Raspberry Pi Foundation, Gender Balance in Computing, investigates particular approaches in school and non-formal learning and how they can impact on gender balance in Computer Science. For an overview of recent research around barriers to gender balance in school computing, look back on the research seminar by Katharine Childs from our team.

Peter and Billy themselves have recently been successful in obtaining funding for a research project to explore female computing performance and subject choice in English schools, a project they will be starting soon!

If you missed the seminar, watch recording here. You can also find Peter and Billy’s presentation slides on our seminars page.

Next up in our seminar series

In our next research seminar on Tuesday 2 February at 17:00–18:30 BST / 12:00–13:30 EDT / 9:00–10:30 PDT / 18:00–19:30 CEST, we’ll welcome Prof Tia Madkins (University of Texas at Austin), Dr Nicol R. Howard (University of Redlands), and Shomari Jones (Bellevue School District), who are going to talk to us about culturally responsive pedagogy and equity-focused teaching in K-12 Computer Science. To join this free online seminar, simply sign up with your name and email address.

Once you’ve signed up, we’ll email you the seminar meeting link and instructions for joining. If you attended Peter’s and Billy’s seminar, the link remains the same.

The post Computing education and underrepresentation: the data from England appeared first on Raspberry Pi.

NICER Protocol Deep Dive: Internet Exposure of DNS-over-TLS

Post Syndicated from Tod Beardsley original https://blog.rapid7.com/2021/01/15/nicer-protocol-deep-dive-internet-exposure-of-dns-over-tls/

NICER Protocol Deep Dive: Internet Exposure of DNS-over-TLS

Welcome to the NICER Protocol Deep Dive blog series! When we started researching what all was out on the internet way back in January, we had no idea we’d end up with a hefty, 137-page tome of a research report. The sheer length of such a thing might put off folks who might otherwise learn a thing or two about the nature of internet exposure, so we figured, why not break up all the protocol studies into their own reports?

So, here we are! What follows is taken directly from our National / Industry / Cloud Exposure Report (NICER), so if you don’t want to wait around for the next installment, you can cheat and read ahead!

[Research] Read the full NICER report today

Get Started

DNS-over-TLS (DoT) (TCP/853)

Encrypting DNS is great! Unless it’s baddies doing the encrypting.

TLDR

  • WHAT IT IS: DNS over TLS is just what it says on the tin: the DNS protocol embedded in a TLS connection, ostensibly to make your DNS request more confidential.
  • HOW MANY: 3,237 discovered nodes. A hodgepodge mix of vendor/version information was discernible, but you’ll need to read the details to find out more.
  • VULNERABILITIES: Whatever is in the DNS that backs the service or in the code that presents TLS (more often than not, a plain, ol’ web server).
  • ADVICE: It’s complicated (read on to find out why!)
  • ALTERNATIVES: Plain, simple, uncomplicated, and woefully unconfidential UDP DNS; DNS over HTTPS (DoH); DNS over QUIC (DoQ); DNS over avian carriers (DoAC).
  • GETTING: Drunk with power. There are nearly two times as many as April 2019.

At face value, DNS over TLS (henceforth referred to as DoT) aims to be the confidentiality solution for a legacy cleartext protocol that has managed to resist numerous other confidentiality (and integrity) fixup attempts. It is one of a handful of modern efforts to help make DNS less susceptible to eavesdropping and person-in-the-middle attacks.

Discovery details

We chose to examine DoT because web browsers have become the new operating system of the internet, and DoT and cousins all allow browsers (or any app, really) to bypass your home, ISP, or organization’s choices of DNS resolution method and resolution provider. Since it’s presented over TLS, it can also be a great way for attackers to continue to use DNS as a command-and-control channel as well as an exfiltration channel.

We chose to examine DoT versus DoH because, well, it is far easier to enumerate DoT endpoints than it is DoH endpoints. It’s getting easier to enumerate DoH since there seems to be some agreement on the standard way to query it, so that will likely make it to a future report, but for now, let’s take a look at what DoT Project Sonar found:

NICER Protocol Deep Dive: Internet Exposure of DNS-over-TLS

Yes, you read that chart correctly! Ireland is No. 1 in terms of the number of nodes running a DoT service, and it’s all thanks to a chap named Daniel Cid, who co-runs CleanBrowsing, which is a “DNS-based content filtering service that offers a safe way to browse the web without surprises.” Daniel has his name on AS205157, which is allocated to Ireland, but the CleanBrowsing service itself is run out of California. In fact, CleanBrowsing comprises almost 50% of the DoT corpus (1,612 nodes), with 563 nodes attributed to the United States and a tiny number of servers attributed to a dozen or so other country network spaces.

Both the U.S. and Germany have a cornucopia of server types and autonomous systems presenting DoT services (none really stand out besides CleanBrowsing).

Since Bulgaria rarely makes it into top 10 exposure lists, we took a look at what was there and it’s a ton (relatively, anyway: 242) of DoT servers in Fiber Optics Bulgaria OOD, which is a kind of “meta” service provider for ISPs. Given the relative scarcity of IPv4 addresses, setting aside 242 of them just for DoT is a pretty major investment.

Even though the numbers are small, Japan’s presence is interesting, as it’s nearly all due to a single ISP: Internet Initiative Japan Inc.

NICER Protocol Deep Dive: Internet Exposure of DNS-over-TLS

In case you have been left unawares, Google is a big player] in the DoT space, but it tends to concentrate DNS exposure to a tiny handful of IP addresses (i.e., that bar is not Google-proper). When we filter out CleanBrowsing (yep, they’re everywhere), we’re left with the major exposure in Google being … a couple dozen servers running an instance of Pi-hole (dnsmasq-pi-hole-2.80, to be precise). Cut/paste that finding for OV and DigitalOcean and yep, that same Pi-hole setup is tops in those two clouds as well.

You don’t need to get all fancy and run a Pi-hole setup to host your own DoT server. Just fire up an nginx instance, create a basic configuration, set up your own DNS behind it, and now, you too can stop your ISP from snooping your DNS queries.

Exposure information

Here is where we’d normally talk about versions and CVEs, etc., but the DoT situation is complicated by a few things. First, we have big players in this space using proprietary solutions, so version fingerprints such as  “CleanBrowsing v1.6a” are not very useful information. Second, should we focus on the version of the web server or of the back-end DNS server (or, both)? The latter might not be useful, since you can configure an nginx DoT setup to proxy to a third party, and that’s what will get picked up in the response. Lastly, even if we focus on the second-tier “big guns,” such as PowerDNS, we end up with a situation like this:

NICER Protocol Deep Dive: Internet Exposure of DNS-over-TLS

Giving you that glimpse does help to show it’s utter chaos even in PowerDNS-land, but DNS and chaos seem to go hand in hand.

Attacker’s view

There are no DoT honeypots in project Heisenberg, but DoT is just a TLS wrapper over a traditional DNS binary-format query. When we looked for that in the TCP/853 full packet captures, we saw us (!) and a couple other researchers. Not very exciting, but with the goal of DoT being privacy, we really shouldn’t see random DoT requests.

Attackers are more likely to stand up their own DoT servers or reconfigure other DoT servers to use their DNS back-ends and then use those as covert channels once they gain a foothold after a successful phishing attack. This is a big reason we enumerate/catalog DoT, and we’re starting to see more DoT in residential ISP space and traditional hosting provider IP space. It looks like more folks are experimenting with DoT with each monthly study.

Our advice

IT and IT security teams should block TCP/853, lock down DoT and DoH browser settings as much as possible so there is no way to bypass organizational IT policies, and monitor for all attempts to use DoT or DoH services internally (or externally). In other words, unless you’re the ones setting them up, disallowing rogue, internal DoT is the safest course.

Cloud providers should consider offering managed DoT solutions and provide patched, secure disk images for folks who want to stand up their own. (This is one of the few cases where organizational advice and cloud advice are quite nearly opposite.)

Government cybersecurity agencies should monitor for malicious use of DoT and provide timely updates to the public. These centers should also be a source of unbiased, expert information on DoT, DoH, DoQ (et al).

[Research] Read the full NICER report today

Get Started

A Name Resolver for the Distributed Web

Post Syndicated from Thibault Meunier original https://blog.cloudflare.com/cloudflare-distributed-web-resolver/

A Name Resolver for the Distributed Web

A Name Resolver for the Distributed Web

The Domain Name System (DNS) matches names to resources. Instead of typing 104.18.26.46 to access the Cloudflare Blog, you type blog.cloudflare.com and, using DNS, the domain name resolves to 104.18.26.46, the Cloudflare Blog IP address.

Similarly, distributed systems such as Ethereum and IPFS rely on a naming system to be usable. DNS could be used, but its resolvers’ attributes run contrary to properties valued in distributed Web (dWeb) systems. Namely, dWeb resolvers ideally provide (i) locally verifiable data, (ii) built-in history, and (iii) have no single trust anchor.

At Cloudflare Research, we have been exploring alternative ways to resolve queries to responses that align with these attributes. We are proud to announce a new resolver for the Distributed Web, where IPFS content indexed by the Ethereum Name Service (ENS) can be accessed.

To discover how it has been built, and how you can use it today, read on.

Welcome to the Distributed Web

IPFS and its addressing system

The InterPlanetary FileSystem (IPFS) is a peer-to-peer network for storing content on a distributed file system. It is composed of a set of computers called nodes that store and relay content using a common addressing system.

This addressing system relies on the use of Content IDentifiers (CID). CIDs are self-describing identifiers, because the identifier is derived from the content itself. For example, QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco is the CID version 0 (CIDv0) of the wikipedia-on ipfs homepage.

To understand why a CID is defined as self-describing, we can look at its binary representation. For QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco, the CID looks like the following:

A Name Resolver for the Distributed Web

The first is the algorithm used to generate the CID (sha2-256 in this case); then comes the length of the encoded content (32 for a sha2-256 hash), and finally the content itself. When referring to the multicodec table, it is possible to understand how the content is encoded.

Name Code (in hexadecimal)
identity 0x00
sha1 0x11
sha2-256 0x12 = 00010010
keccak-256 0x1b

This encoding mechanism is useful, because it creates a unique and upgradable content-addressing system across multiple protocols.

If you want to learn more, have a look at ProtoSchool’s tutorial.

Ethereum and decentralised applications

Ethereum is an account-based blockchain with smart contract capabilities. Being account-based, each account is associated with addresses and these can be modified by operations grouped in blocks and sealed by Ethereum’s consensus algorithm, Proof-of-Work.

There are two categories of accounts: user accounts and contract accounts. User accounts are controlled by a private key, which is used to sign transactions from the account. Contract accounts hold bytecode, which is executed by the network when a transaction is sent to their account. A transaction can include both funds and data, allowing for rich interaction between accounts.

When a transaction is created, it gets verified by each node on the network. For a transaction between two user accounts, the verification consists of checking the origin account signature. When the transaction is between a user and a smart contract, every node runs the smart contract bytecode on the Ethereum Virtual Machine (EVM). Therefore, all nodes perform the same suite of operations and end up in the same state. If one actor is malicious, nodes will not add its contribution. Since nodes have diverse ownership, they have an incentive to not cheat.

How to access IPFS content

As you may have noticed, while a CID describes a piece of content, it doesn’t describe where to find it. In fact, the CID describes the content, but not its location on the network. The location of the file would be retrieved by a query made to an IPFS node.

An IPFS URL (Unified Resource Locator) looks like this: ipfs://QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco. Accessing this URL means retrieving QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco using the IPFS protocol, denoted by ipfs://. However, typing such a URL is quite error-prone. Also, these URLs are not very human-friendly, because there is no good way to remember such long strings. To get around this issue, you can use DNSLink. DNSLink is a way of specifying IPFS CIDs within a DNS TXT record. For instance, wikipedia on ipfs has the following TXT record

$ dig +short TXT _dnslink.en.wikipedia-on-ipfs.org

_dnslink=/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco

In addition, its A record points to an IPFS gateway. This means that, when you access en.wikipedia-on-ipfs.org, your request is directed to an IPFS HTTP Gateway, which then looks out for the CID using your domain TXT record, and returns the content associated to this CID using the IPFS network.

This is trading ease-of-access against security. The web browser of the user doesn’t verify the integrity of the content served. This could be because the browser does not implement IPFS or because it has no way of validating domain signature — DNSSEC. We wrote about this issue in our previous blog post on End-to-End Integrity.

Human readable identifiers

DNS simplifies referring to IP addresses, in the same way that postal addresses are a way of referring to geolocation data, and contacts in your mobile phone abstract phone numbers. All these systems provide a human-readable format and reduce the error rate of an operation.

To verify these data, the trusted anchors, or “sources of truth”, are:

  • Root DNS Keys for DNS.
  • The government registry for postal addresses. In the UK, addresses are handled by cities, boroughs and local councils.
  • When it comes to your contacts, you are the trust anchor.

Ethereum Name Service, an index for the Distributed Web

An account is identified by its address. An address starts with “0x” and is followed by 20 bytes (ref 4.1 Ethereum yellow paper), for example: 0xf10326c1c6884b094e03d616cc8c7b920e3f73e0. This is not very readable, and can be pretty scary when transactions are not reversible and one can easily mistype a single  character.

A first mitigation strategy was to introduce a new notation to capitalise some letters based on the hash of the address 0xF10326C1c6884b094E03d616Cc8c7b920E3F73E0. This can help detect mistype, but it is still not readable. If I have to send a transaction to a friend, I have no way of confirming she hasn’t mistyped the address.

The Ethereum Name Service (ENS) was created to tackle this issue. It is a system capable of turning human-readable names, referred to as domains, to blockchain addresses. For instance, the domain privacy-pass.eth points to the Ethereum address 0xF10326C1c6884b094E03d616Cc8c7b920E3F73E0.

To achieve this, the system is organised in two components, registries and resolvers.

A registry is a smart contract that maintains a list of domains and some information about each domain: the domain owner and the domain resolver. The owner is the account allowed to manage the domain. They can create subdomains and change ownership of their domain, as well as modify the resolver associated with their domain.

Resolvers are responsible for keeping records. For instance, Public Resolver is a smart contract capable of associating not only a name to blockchain addresses, but also a name to an IPFS content identifier. The resolver address is stored in a registry. Users then contact the registry to retrieve the resolver associated with the name.

Consider a user, Alice, who has direct access to the Ethereum state. The flow goes as follows: Alice would like to get Privacy Pass’s Ethereum address, for which the domain is privacy-pass.eth. She looks for privacy-pass.eth in the ENS Registry and figures out the resolver for privacy-pass.eth is at 0x1234… . She now looks for the address of privacy-pass.eth at the resolver address, which turns out to be 0xf10326c….

A Name Resolver for the Distributed Web

Accessing the IPFS content identifier for privacy-pass.eth works in a similar way. The resolver is the same, only the accessed data is different — Alice calls a different method from the smart contract.

A Name Resolver for the Distributed Web

Cloudflare Distributed Web Resolver

The goal was to be able to use this new way of indexing IPFS content directly from your web browser. However, accessing the ENS registry requires access to the Ethereum state. To get access to IPFS, you would also need to access the IPFS network.

To tackle this, we are going to use Cloudflare’s Distributed Web Gateway. Cloudflare operates both an Ethereum Gateway and an IPFS Gateway, respectively available at cloudflare-eth.com and cloudflare-ipfs.com.

The first version of EthLink was built by Jim McDonald and is operated by True Name LTD at eth.link. Starting from next week, eth.link will transition to use the Cloudflare Distributed Web Resolver. To that end, we have built EthLink on top of Cloudflare Workers. This is a proxy to IPFS. It proxies all ENS registered domains when .link is appended. For instance, privacy-pass.eth should render the Privacy Pass homepage. From your web browser, https://privacy-pass.eth.link does it.

The resolution is done at the Cloudflare edge using a Cloudflare Worker. Cloudflare Workers allows JavaScript code to be run on Cloudflare infrastructure, eliminating the need to maintain a server and increasing the reliability of the service. In addition, it follows Service Workers API, so results returned from the resolver can be checked by end users if needed.

To do this, we setup a wildcard DNS record for *.eth.link to be proxied through Cloudflare and handled by a Cloudflare Worker.  When a user Alice accesses privacy-pass.eth.link, the worker first gets the CID of the CID to be retrieved from Ethereum. Then, it requests the content matching this CID to IPFS, and returns it to Alice.

A Name Resolver for the Distributed Web

All parts can be run locally. The worker can be run in a service Worker, and the Ethereum Gateway can point to both a local Ethereum node and the IPFS gateway provided by IPFS Companion. It means that while Cloudflare provides resolution-as-a-service, none of the components has to be trusted.

Final notes

So are we distributed yet? No, but we are getting closer, building bridges between emerging technologies and current web infrastructure. By providing a gateway dedicated to the distributed web, we hope to make these services more accessible to everyone.

We thank the ENS team for their support of a new resolver on expanding the distributed web. The ENS team has been running a similar service at https://eth.link. On January 18th, they will switch https://eth.link to using our new service.

These services benefit from the added speed and security of the Cloudflare Worker platform, while paving the way to run distributed protocols in browsers.

NICER Protocol Deep Dive: Internet Exposure of DNS

Post Syndicated from Tod Beardsley original https://blog.rapid7.com/2021/01/05/nicer-protocol-deep-dive-internet-exposure-of-dns/

NICER Protocol Deep Dive: Internet Exposure of DNS

Welcome to the NICER Protocol Deep Dive blog series! When we started researching what all was out on the internet way back in January, we had no idea we’d end up with a hefty, 137-page tome of a research report. The sheer length of such a thing might put off folks who might otherwise learn a thing or two about the nature of internet exposure, so we figured, why not break up all the protocol studies into their own reports?

So, here we are! What follows is taken directly from our National / Industry / Cloud Exposure Report (NICER), so if you don’t want to wait around for the next installment, you can cheat and read ahead!

[Research] Read the full NICER report today

Get Started

Domain Name System (DNS) (UDP/53)

“The Achilles Heel of the Internet” – Sir Tim Berners-Lee

TLDR

  • WHAT IT IS: Domain Name System (DNS): The globally distributed address book of services on the internet.
  • HOW MANY: 4,717,658 discovered nodes. 3,498,439 (74.1%) have Recog fingerprints (15 total vendor+service families)
  • VULNERABILITIES: Around 200 across all service families with every CVSS score imaginable.
  • ADVICE: You kinda have no other choice but to use it.
  • ALTERNATIVES: DNS over TLS (DoH), DNS over HTTPS (DoH), DNS over QUIC (DoQ); downgrade to Novell Netware.
  • GETTING: Used about as much as last year, which kind of makes sense since DNS makes the internet work.

Nobody wants to memorize IP addresses in order to get to network resources, nor does anyone want to maintain a giant standalone list of hostname to IP address mappings. However, nobody also wants to wait forever to get a response to the request for the IP address of, say, example.com. Thus was the atmosphere that begat what we posit is the most ubiquitous user-facing but also most user-overlooked service on the internet: the Domain Name System  (DNS).

Discovery details

Project Sonar discovered nearly 5 million DNS servers via UDP requests on port 53. This is a far fewer number than the total sum of, say, web servers, but it is a non-trivial number of systems and the reasons for that make sense. ISPs provide DNS services to home and small-business users, organizations host their own DNS to maintain control of their brand namespace, and vendors provide customized DNS services in either an outsourcing capacity or to provide enhanced services such as malware and other types of content filtering. Finally, large technology companies such as Google, Cloudflare, IBM (via Quad9), and others also provide centralized DNS services for various good (?) reasons. This is all to say, outside of the giant centralized DNS providers, the global DNS footprint tends to track very closely with the allocated country IPv4 space; the more IP allocations a given country has, the more DNS servers are there to keep track of them all.

NICER Protocol Deep Dive: Internet Exposure of DNS
NICER Protocol Deep Dive: Internet Exposure of DNS

Conversely, it really doesn’t make much sense to waste precious (and costly) cloud resources by hosting DNS in there. However, it seems OVH users have plenty of cycles (and money) to burn. Yep, those aren’t just OVH’s DNS servers. We come to that conclusion based on the diversity of DNS vendor software and the version spread. Now, OVH does have the largest data center on the planet and is not just a cloud services provider, so it’s pretty reasonable to see that it can and should be in the top spot.

Given that most small orgs use their ISPs’ external DNS (directly or via recursive DNS) and that the vast majority of home users still use their ISP DNS, you can imagine that autonomous system DNS server distribution has a very long tail.

Exposure information

DNS has had … challenges … over the years. It is a binary protocol that receives quite a bit of attention paid to it by both researchers and attackers. Because of this, and the nature of the UDP service, it is possible to craft a binary DNS request that ends up being around 60 bytes that asks for a DNS response, which ends up potentially being near 4,000 bytes (~7:1 amplification), making it great for use in low-to-mid-level amplification DDoS attacks. It is also possible to compromise a DNS server via specially crafted binary messages, though that task gets more difficult with each passing year.

DNS Service Prevalence

Vendor Count Percentage
ISC BIND 2,007,593 57.39%
Thekelleys Dnsmasq 556,228 15.90%
NLnet Labs NSD 520,785 14.89%
PowerDNS PowerDNS 342,143 9.78%
Microsoft DNS 43,185 1.23%
NLnet Labs Unbound 14,158 0.40%
Nominum Vantio 7,596 0.22%
DrayTek DNS 2,674 0.08%
cz.nic Knot 1,897 0.05%
Michael Tokarev rbldnsd 898 0.03%
RIPE Atlas Anchor 614 0.02%
ALU DNS 539 0.02%
Incognito DNS 78 0.00%
D J Bernstein djbdns 45 0.00%
Check Point META IP 6 0.00%

BIND (now ISC BIND) was the first DNS server and is still the most prevalent one (of those we had Recog fingerprints for), which is likely why it has 119 CVEs (most all of them DoS-related). The picture really isn’t this clean, though. Within ISC BIND alone, we found 550 distinct version strings (most legit, too). We can look at version diversity by vendor across all autonomous systems with DNS servers to see just how crazy the situation really is:

NICER Protocol Deep Dive: Internet Exposure of DNS

If this were a social media service instead of a serious research paper, now’s about the time we’d post a “Do You Even…?” meme gif with the word “Patch” in it. So, not only do we forget about DNS when we’re using it, we also seem to forget about it when we run it, too. Denial-of-service flaws are found every year in these servers, but when DNS is running, it’s running, and you likely need it to keep running.

Attacker’s view

We’re not in the DDoS protection services racket market, nor do we have DDoS probes sitting in key locations to be able to detect when DDoS attacks are happening. We see both TCP- and UDP-based DNS traffic in Heisenberg, but they’re mostly inventory scans or misconfigurations.

This is not to say attackers care not about DNS anymore. Every DDoS mitigation vendor makes a point of reminding us about this a few times a year in their service reports, and Verizon noted a serious uptick in DoS in general in 2019 (in which DNS played a part). And, there are always new, crafty attack vectors being researched and developed.

But, attackers do more with DNS than just DoS. Organizations must register public, top-level domain names and set up various types of records for them so we can all buy things without leaving home. This exposes two potential avenues of for attack: first at the registrar level, which is why it is vital that you protect your domain registration account with multi-factor authentication (preferably app-based for this versus just SMS) and then do the same for your external DNS provider (if you’re using an external DNS provider). In May 2020, the Internet Systems Consortium hosted a webinar on this very topic that should help provide more background information, and SpamHaus estimates that GoDaddy has around 100 newly hijacked domains daily.

They who control DNS control who you are on the internet.

Our advice

IT and IT security teams should safeguard registrar and external DNS provider accounts with multi-factor authentication, keep internal and external DNS systems fully patched, relentlessly monitor DNS for signs of abuse and configuration changes, and consider treating DNS like a first-class application in your environment as opposed to the plumbing that sits hidden behind drywall.

Cloud providers that offer DNS registration and hosting services should mandate multi-factor authentication be used and have processes in place to detect potential malicious activity (i.e., takeover attempts). All machine images with DNS services installed by default should be updated immediately after new DNS server versions are released and then notify all existing users about the need to upgrade.

Government cybersecurity agencies should provide timely notifications regarding DNS attacks of all kinds and have resources available that document how to securely maintain DNS infrastructure.

[Research] Read the full NICER report today

Get Started

HaXmas Hardware Hacking

Post Syndicated from Tod Beardsley original https://blog.rapid7.com/2021/01/02/haxmas-hardware-hacking/

HaXmas Hardware Hacking

Usually, when you read an IoT hacking report or blog post, it ends with something along the lines of, “and that’s how I got root,” or “and there was a secret backdoor credential,” or “and every device in the field uses the same S3 bucket with no authentication.” You know, something bad, and the whole reason for publishing the research in the first place. While such research is of course interesting, important, and worth publishing, we pretty much never hear about the other outcome: the IoT hacking projects that didn’t uncover something awful, but instead ended up with, “and everything looked pretty much okay.”

So, this HaXmas, I decided to dig around a little in Rapid7’s library of IoT investigations that never really went anywhere, just to see which tools were used. The rest of this blog post is basically a book report of the tooling used in a recent engagement performed by our own Jonathan Stines, and can be used as a starting point if you’re interested in getting into some casual IoT hacking yourself. Even though this particular engagement didn’t go anywhere, I had a really good time reading along with Stines’ investigation on a smart doorbell camera.

Burp Suite

While Burp Suite might be a familiar mainstay for web app hackers, it has a pretty critical role in IoT investigations as well. The “I” in IoT is what makes these Things interesting, so checking out what and how those gadgets are chatting on the internet is pretty critical in figuring out the security posture of those devices. Burp Suite lets investigators capture, inspect, and replay conversations in a proxied context, and the community edition is a great way to get started with this kind of manual, dynamic analysis.

Frida

While Burp is great, if the IoT mobile app you’re looking at (rightly) uses certificate pinning in order to secure communications, you won’t get very far with its proxy capabilities. In order to deal with this, you’ll need some mechanism to bypass the application’s pinned cert, and that mechanism is Frida. While Frida might be daunting for the casual IoT hacker, there’s a great HOWTO by Vedant that provides some verbose instructions for setting up Frida, adb, and Burp Suite in order to inject a custom SSL certificate and bypass that pesky pinning. Personally, I had never heard of Frida or how to use it for this sort of thing, so it looks like I’m one of today’s lucky 10,000.

HaXmas Hardware Hacking

Binwalk

When mucking about with firmware (the packaged operating system and applications that makes IoT devices go), Binwalk from Refirm Labs is the standard for exploring those embedded filesystems. In nearly all cases, a “check for updates” button on a newly opened device will trigger some kind of firmware download—IoT devices nearly always update themselves by downloading and installing an entirely new firmware—so if you can capture that traffic with something like Wireshark (now that you’ve set up your proxied environment), you can extract those firmware updates and explore them with Binwalk.

Allsocket eMMC153 chip reader

Now, with the software above, you will go far in figuring out how an IoT device does its thing, but the actual hands-on-hardware experience in IoT hacking is kinda the fun part that differentiates it from regular old web app testing. So for this, you will want to get your hands on a chip reader for your desoldered components. Pictured below is an Allsocket device that can be used to read both 153-pin and 169-pin configurations of eMMC storage, both of which are very common formats for solid-state flash memory in IoT-land. Depending on where you get it, they can run about $130, so not cheap, but also not bank-breaking.

HaXmas Hardware Hacking

Thanks!

Thanks again to Jonathan Stines, who did all the work that led to this post. If you need some validation of your IoT product, consider hiring him for your next IoT engagement. Rapid7’s IoT assessment experts are all charming humans who are pretty great at not just IoT hacking, but explaining what they did and how they did it. And, if you like this kind of thing, drop a comment below and let me know—I’m always happy to learn and share something new (to me) when it comes to hardware hacking.

More HaXmas blogs

Rapid7 Labs’ 2020 Naughty List Summary Report to Santa

Post Syndicated from boB Rudis original https://blog.rapid7.com/2020/12/25/rapid7-labs-2020-naughty-list-summary-report-to-santa/

Rapid7 Labs’ 2020 Naughty List Summary Report to Santa

As requested, your dutiful elves here at Rapid7 Labs have compiled a list of the naughty country networks being used to launch cyberattacks across the globe. Needless to say, some source networks have been very naughty (dare we use the word “again,” since these all seem to be repeat offenders).

To make it easier to digest, we’ve broken the list out into three categories:

  • Naughty Microsoft SQL Server attacks
  • Naughty Microsoft Remote Desktop Protocol (RDP) attacks
  • Naughty Microsoft SMB attacks

These are focused on the top offenders for the last half of the year, and provide a smoothed trending view (vs. discrete daily counts) in each one to help you make your Naughty/Nice inclusion decisions.

Naughty Microsoft SQL server attacks

Hopefully you do not maintain your lists on publicly accessible Microsoft SQL servers, as they are regular targets for attackers who have their evil designs on them, with a major focus on using them for cryptocurrency mining this year.

Rapid7 Labs’ 2020 Naughty List Summary Report to Santa

Source Country Network Median Daily MSSQL Attack Interactions
China 147,677
United States 12,984
India 7,159
Brazil 8,984
Russia 7,031

A massive botnet operating from before the fall of 2019 and early 2020 abruptly stopped operations just before summer, and MS SQL server credential and query attack types have leveled off to previous baseline levels. The enduring lesson from measuring these interactions is for all the grown-up kids out there to never, ever put any database like MS SQL on the public internet. Unfortunately, you can read an excerpt from our other report that found nearly 100,000 of them earlier this year (perhaps the offenders on that list would be better placed on the naughty list?).

Naughty Microsoft Remote Desktop Protocol (RDP) attacks

We were sorry to hear that even your own factory had to observe remote-work protocols starting in March, but we hope your IT department did not have to resort to enabling direct Microsoft Remote Desktop Protocol (RDP) access, since it has been the target of a massive increase in discovery and credential stuffing attacks the last quarter of this year.

Rapid7 Labs’ 2020 Naughty List Summary Report to Santa

Source Country Network Median Daily RDP Attack Interactions
Russia 41,515
United Kingdom 23,337
United States 32,840
Germany 2,832
France 12,802

Naughty traffic levels started just before the presidential election in the United States and further increased in size toward the end of the year.

RDP-targeted ransomware has been a fairly huge problem this year, with many nefarious attackers setting their sights on overworked and under-resourced healthcare, education, and municipal targets.

It might be worth taking some time to remind your elves in IT that it’s not a good idea to put RDP services directly on the internet. While another of our reports earlier this year did not find any RDP nodes coming from the North Pole autonomous system, it is possible we didn’t inventory your network on a day they did. As you know, it’s best to put RDP servers behind a dedicated (and properly configured) Microsoft RDP Gateway server or—better yet—a multifactor virtual private network (VPN).

The majority of the malicious RDP traffic coming from the U.K., U.S., and Germany appear to be the work of one or two groups who should also be considered candidates for the naughty list.

Naughty Microsoft SMB attacks

Rapid7 Labs’ 2020 Naughty List Summary Report to Santa

Source Country Network Median Daily SMB Attack Interactions
Vietnam 4,206,475
India 2,137,146
Russia 2,055,072
Brazil 1,478,000
Indonesia 1,420,109

Last, we lament the need to report a renewed uptick in EternalBlue-infused attacks against internet-accessible Microsoft SMB servers. The vast majority of source nodes involved with these attacks are part of the various Mirai-like botnets that use both traditional compromised server hosts and (mostly) “internet of things” devices such as cameras, DVRs, and other business and home automation devices to coordinate and orchestrate attacks.

Might we be so bold as to suggest that you hold off—at least this year—distributing rebranded white-box electronics components to the folks on the Nice list? If you do insist on giving out home automation presents this year, please make sure the programmer elves follow the guidance in the IoT Security Foundation’s Security Compliance Framework to guard against adding more nodes to these naughty botnets.

If you’re wondering why attackers are still looking for SMB servers, you can see for yourself that there are still hundreds of thousands of them out on the internet to connect to. We’re just glad you switched to using a secure file transfer service to exchange documents (like this one!) with all your partner elves.

Glad tidings `til next year!

We hope you, Mrs. Claus, the elves and all the reindeer stay safe and socially distanced. We’ll make sure to leave the cookies and bourbon milk in the usual place.

Happy Holidays from all the Elves in Rapid7 Labs!

NEVER MISS A BLOG

Get the latest stories, expertise, and news about security today.

More HaXmas blogs

UPnP With a Holiday Cheer

Post Syndicated from Deral Heiland original https://blog.rapid7.com/2020/12/22/upnp-with-a-holiday-cheer/

UPnP With a Holiday Cheer

T’was the night before HaXmas,
when all through the house,
Not a creature was stirring, not even a mouse.
The stockings were hung by the chimney with care,
in hopes that St. Nicholas soon would be there.

This may be the way you start your holiday cheer,
but before you get started, let me make you aware.
I spend my holidays quite differently, I fear.
As a white-hat hacker with a UPnP cheer.

And since you may not be aware,
let me share what I learned with you,
so that you can also care,
how to port forward with UPnP holiday cheer.

Universal Plug and Play (UPnP) is a service that has been with us for many years and is used to automate discovery and setup of network and communication services between devices on your network. For today’s discussion, this blog post will only cover the port forwarding services and will also share a Python script you can use to start examining this service.

UPnP port forwarding services are typically enabled by default on most consumer internet-facing Network Address Translation (NAT) routers supplied by internet service providers (ISP) for supporting IPv4 networks. This is done so that devices on the internal network can automate their setup of needed TCP and UDP port forwarding functions on the internet-facing router, so devices on the internet can connect to services on your internal network.

So, the first thing I would like to say about this is that if you are not running applications or systems such as internet gaming systems that require this feature, I would recommend disabling this on your internet-facing router. Why? Because it has been used by malicious actors to further compromise a network by opening up port access into internal networks via malware. So, if you don’t need it, you can remove the risk by disabling it. This is the best option to help reduce any unnecessary exposure.

To make all this work, UPnP uses a discovery protocol known as Simple Service Discovery Protocol (SSDP). This SSDP discovery service for UPnP is a UDP service that responds on port 1900 and can be enumerated by broadcasting an M-SEARCH message via the multicast address 239.255.255.250. This M-SEARCH message will return device information, including the URL and port number for the device description file ‘rootDesc.xml’. Here is an example of a returned M-SEARCH response from a NETGEAR Wi-Fi router device on my network:

UPnP With a Holiday Cheer

To send a M-SEARCH multicast message, here is a simple Python script:

# simple script to enumerate UPNP devices
 
import socket
 
# M-Search message body
MS = \
    'M-SEARCH * HTTP/1.1\r\n' \
    'HOST:239.255.255.250:1900\r\n' \
    'ST:upnp:rootdevice\r\n' \
    'MX:2\r\n' \
    'MAN:"ssdp:discover"\r\n' \
    '\r\n'
 
# Set up a UDP socket for multicast
SOC = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
SOC.settimeout(2)
 
# Send M-Search message to multicast address for UPNP
SOC.sendto(MS.encode('utf-8'), ('239.255.255.250', 1900) )
 
#listen and capture returned responses
try:
    while True:
        data, addr = SOC.recvfrom(8192)
        print (addr, data)
except socket.timeout:
        pass

The next step is to access the rootDesc.xml file. In this case, this is accessible on my device via http://192.168.2.74:5555/rootDesc.xml. Looking at the M-SEARCH response above, we can see that the IP address for rootDesc.xml at 169.254.39.187.  169.254.*.* is known as an Automatic Private IP address. It is not uncommon to see an address in that range returned by an M-SEARCH request. Trying to access it will fail because it is incorrect. To actually access the rootDesc.xml file, you will need to use the device’s true IP address, which in my case was 192.168.2.74 and was shown in the header of the M-SEARCH message response.

Once the rootDesc.xml is returned, you will see some very interesting things listed, but in this case, we are only interested in port forwarding. If port forwarding service is available, it will be listed in the rootDesc.xml file as service type WANIPConnection, as shown below:

UPnP With a Holiday Cheer

You can open WANIPCn.xml on the same http service and TCP port location that you retrieved the rootDesc.xml file. The WANIPCn.xml file identifies various actions that are available, and this will often include the following example actions:

  • AddPortMapping
  • GetExternalIPAddress
  • DeletePortMapping
  • GetStatusInfo
  • GetGenericPortMappingEntry
  • GetSpecificPortMappingEntry

Under each of these actions will be an argument list. This argument list specifies the argument values that can be sent via Simple Object Access Protocol (SOAP) messages to the control URL at http://192.168.2.74:5555/ctl/IPConn, which is used to configure settings or retrieve status on the router device. SOAP is a messaging specification that uses a Extensible Markup Language (XML) format to exchange information.

Below are a couple captured SOAP messages, with the first one showing AddPortMapping. This will set up port mapping on the router at the IP address 192.168.1.1. The port being added in this case is TCP 1234 and it is set up to map the internet side of the router to the internal IP address of 192.168.1.241, so anyone connecting to TCP port 1234 on the external IP address of the router will be connected to port 1234 on internal host at 192.168.1.241.

UPnP With a Holiday Cheer

The following captured SOAP message shows the action DeletePortMapping being used to delete the port mapping that was created in the above SOAP message:

UPnP With a Holiday Cheer

To conclude this simple introduction to UPnP, SSDP, and port forwarding services, I highly recommend that you do not experiment on your personal internet-facing router or DSL modem where you could impact your home network’s security posture. But I do recommend that you set up a test environment. This can easily be done with any typical home router or Wi-Fi access point with router services. These can often be purchased used, or you may even have one laying around that you have upgraded from. It is amazing how simple it is to modify a router using these UPnP services by sending SOAP messages, and I hope you will take this introduction and play with these services to further expand your knowledge in this area. If you are looking for further tools for experimenting with port forwarding services, you can use the UPnP IGD SOAP Port Mapping Utility in  Metasploit to create and delete these port mappings.

But I heard him exclaim, ere he drove out of sight-
Happy HaXmas to all, and to all a good UPnP night

NEVER MISS A BLOG

Get the latest stories, expertise, and news about security today.

More HaXmas blogs

Block-based programming: does it help students learn?

Post Syndicated from Sue Sentance original https://www.raspberrypi.org/blog/block-based-programming-does-it-help-students-learn-research-seminar/

At the Raspberry Pi Foundation, we are continually inspired by young learners in our community: they embrace digital making and computing to build creative projects, supported by our resources, clubs, and volunteers. While creating their projects, they are learning the core programming skills that underlie digital making.

Over the years, many tools and environments have been developed to make programming more accessible to young people. Scratch is one example of a block-based programming environment for young learners, and it’s been shown to make programming more accessible to them; on our projects site we offer many step-by-step Scratch project resources.

Mark Scratch
A Scratch code-along, led by one of our educators on our weekly Digital Making at Home live stream

But does block-based programming actually help learning? Does it increase motivation and support students? Where is the hard evidence? In our latest research seminar, we were delighted to hear from Dr David Weintrop, an Assistant Professor at the University of Maryland who has done research in this area for several years and published widely on the differences between block-based and text-based programming environments.

David Weintrop

A variety of block-based programming environments

The first useful insight David shared was that we should avoid thinking about block-based programming as synonymous with the well-known Scratch environment. There are several other environments, with different affordances, that David referred to in his talk, such as Snap, Pencil Code, Blockly, and more.

Logos of block-based programming environments

Some of these, for example Pencil Code, offer a dual-modality (or hybrid) environment, where learners can write the same program in a text-based and a block-based programming environment side by side. Dual-modality environments provide this side-by-side approach based on the assumption that being able to match a text-based program to its block-based equivalent supports the development of understanding of program syntax in a text-based language.

Screenshot of the Pencil Code dual-modality programming environment

As a tool for transitioning to text-based programming

Another aspect of the research around block-based programming focuses on its usefulness as a transition to a text-based language. David described a 15-week study he conducted in high schools in the USA to investigate differences in student learning caused by use of block-based, text-based, and hybrid (a mixture of both using a dual-modality platform) programming tools.

Details of the study design: classroom-based, 3 conditions, 2 phases, quasi-experimental mixed method study

The 90 students in the study (14 to 16 years old) were divided into three groups, each with a different intervention but taught by the same teacher. In the first phase of the study (5 weeks), the groups were set the same tasks with the same learning objectives, but they used either block-based programming, text-based programming, or the hybrid environment.

After 5 weeks, students were given a test to assess learning outcomes, and they were asked questions about their attitudes to programming (specifically their perception of computing and their confidence). In the second phase (10 weeks), all the students were taught Java (a common language taught in the USA for end-of-school assessment), and then the test and attitudinal questions were repeated.

The results showed that at the 5-week point, the students who had used block-based programming scored higher in their learning outcome assessment, but at the final assessment after 15 weeks, all groups’ scores were roughly equivalent.  

A graph of assessment scores of the three groups in the study. The final scores are not significantly different.

In terms of students’ perception of computing and confidence, the responses of the Blocks group were very positive at the 5-week point, while at the 15-week point, the responses were less positive. The responses from the Text group showed a gradual increase in positivity between the 5- and 15-week points. The Hybrid group’s responses weren’t as negative as those of the Text group at the 5-week point, and their positivity didn’t decrease like the Blocks group’s did.

Taking both methods of assessment into account, the Hybrid group showed the best results in the study. The gains associated with the block-based introduction to programming did not translate to those students being further ahead when learning Java, but starting with block-based programming also did not hamper students’ transition to text-based programming.

David completed his talk by recommending dual-modality environments (such as Pencil Code) for teaching programming, as used by the Hybrid group in his study. 

More research is needed

The seminar audience raised many questions about David’s study, for example whether the actual teaching (pedagogy) may have differed for the three groups, and whether the results are not just due to the specific tools or environments that were used. This is definitely an area for further research. 

It seems that students may benefit from different tools at different times, which is why a dual-modality environment can be very useful. Of course, competence in programming takes a long time to develop, so there is room on the research agenda for longitudinal studies that monitor students’ progress over many months and even years. Such studies could take into account both the teaching approach and the programming environment in order to determine what factors impact a deep understanding of programming concepts, and students’ desire to carry on with their programming journey. 

Next up in our series

If you missed the seminar, you can find David’s presentation slides and a recording of his talk on our seminars page.

Our next free online seminar takes place on Tuesday 5 January at 17:00–18:00 BST / 12:00–13:00 EDT / 9:00–10:00 PDT / 18:00–19:00 CEST. We’ll welcome Peter Kemp and Billy Wong, who are going to share insights from their research on computing education for underrepresented groups. To join this free online seminar, simply sign up with your name and email address.

Once you’ve signed up, we’ll email you the seminar meeting link and instructions for joining. If you attended David’s seminar, the link remains the same.

The January seminar will be the first one in our series focusing on diversity and inclusion in computing education, which we’re co-hosting with the Royal Academy for Engineering. We hope to see you there!

The post Block-based programming: does it help students learn? appeared first on Raspberry Pi.

Good-bye ESNI, hello ECH!

Post Syndicated from Christopher Patton original https://blog.cloudflare.com/encrypted-client-hello/

Good-bye ESNI, hello ECH!

Good-bye ESNI, hello ECH!

Most communication on the modern Internet is encrypted to ensure that its content is intelligible only to the endpoints, i.e., client and server. Encryption, however, requires a key and so the endpoints must agree on an encryption key without revealing the key to would-be attackers. The most widely used cryptographic protocol for this task, called key exchange, is the Transport Layer Security (TLS) handshake.

In this post we’ll dive into Encrypted Client Hello (ECH), a new extension for TLS that promises to significantly enhance the privacy of this critical Internet protocol. Today, a number of privacy-sensitive parameters of the TLS connection are negotiated in the clear. This leaves a trove of metadata available to network observers, including the endpoints’ identities, how they use the connection, and so on.

ECH encrypts the full handshake so that this metadata is kept secret. Crucially, this closes a long-standing privacy leak by protecting the Server Name Indication (SNI) from eavesdroppers on the network. Encrypting the SNI secret is important because it is the clearest signal of which server a given client is communicating with. However, and perhaps more significantly, ECH also lays the groundwork for adding future security features and performance enhancements to TLS while minimizing their impact on the privacy of end users.

ECH is the product of close collaboration, facilitated by the IETF, between academics and the tech industry leaders, including Cloudflare, our friends at Fastly and Mozilla (both of whom are the affiliations of co-authors of the standard), and many others. This feature represents a significant upgrade to the TLS protocol, one that builds on bleeding edge technologies, like DNS-over-HTTPS, that are only now coming into their own. As such, the protocol is not yet ready for Internet-scale deployment. This article is intended as a sign post on the road to full handshake encryption.

Background

The story of TLS is the story of the Internet. As our reliance on the Internet has grown, so the protocol has evolved to address ever-changing operational requirements, use cases, and threat models. The client and server don’t just exchange a key: they negotiate a wide variety of features and parameters: the exact method of key exchange; the encryption algorithm; who is authenticated and how; which application layer protocol to use after the handshake; and much, much more. All of these parameters impact the security properties of the communication channel in one way or another.

SNI is a prime example of a parameter that impacts the channel’s security. The SNI extension is used by the client to indicate to the server the website it wants to reach. This is essential for the modern Internet, as it’s common nowadays for many origin servers to sit behind a single TLS operator. In this setting, the operator uses the SNI to determine who will authenticate the connection: without it, there would be no way of knowing which TLS certificate to present to the client. The problem is that SNI leaks to the network the identity of the origin server the client wants to connect to, potentially allowing eavesdroppers to infer a lot of information about their communication. (Of course, there are other ways for a network observer to identify the origin — the origin’s IP address, for example. But co-locating with other origins on the same IP address makes it much harder to use this metric to determine the origin than it is to simply inspect the SNI.)

Although protecting SNI is the impetus for ECH, it is by no means the only privacy-sensitive handshake parameter that the client and server negotiate. Another is the ALPN extension, which is used to decide which application-layer protocol to use once the TLS connection is established. The client sends the list of applications it supports — whether it’s HTTPS, email, instant messaging, or the myriad other applications that use TLS for transport security — and the server selects one from this list, and sends its selection to the client. By doing so, the client and server leak to the network a clear signal of their capabilities and what the connection might be used for.

Some features are so privacy-sensitive that their inclusion in the handshake is a non-starter. One idea that has been floated is to replace the key exchange at the heart of TLS with password-authenticated key-exchange (PAKE). This would allow password-based authentication to be used alongside (or in lieu of) certificate-based authentication, making TLS more robust and suitable for a wider range of applications. The privacy issue here is analogous to SNI: servers typically associate a unique identifier to each client (e.g., a username or email address) that is used to retrieve the client’s credentials; and the client must, somehow, convey this identity to the server during the course of the handshake. If sent in the clear, then this personally identifiable information would be easily accessible to any network observer.

A necessary ingredient for addressing all of these privacy leaks is handshake encryption, i.e., the encryption of handshake messages in addition to application data. Sounds simple enough, but this solution presents another problem: how do the client and server pick an encryption key if, after all, the handshake is itself a means of exchanging a key? Some parameters must be sent in the clear, of course, so the goal of ECH is to encrypt all handshake parameters except those that are essential to completing the key exchange.

In order to understand ECH and the design decisions underpinning it, it helps to understand a little bit about the history of handshake encryption in TLS.

Handshake encryption in TLS

TLS had no handshake encryption at all prior to the latest version, TLS 1.3. In the wake of the Snowden revelations in 2013, the IETF community began to consider ways of countering the threat that mass surveillance posed to the open Internet. When the process of standardizing TLS 1.3 began in 2014, one of its design goals was to encrypt as much of the handshake as possible. Unfortunately, the final standard falls short of full handshake encryption, and several parameters, including SNI, are still sent in the clear. Let’s take a closer look to see why.

The TLS 1.3 protocol flow is illustrated in Figure 1. Handshake encryption begins as soon as the client and server compute a fresh shared secret. To do this, the client sends a key share in its ClientHello message, and the server responds in its ServerHello with its own key share. Having exchanged these shares, the client and server can derive a shared secret. Each subsequent handshake message is encrypted using the handshake traffic key derived from the shared secret. Application data is encrypted using a different key, called the application traffic key, which is also derived from the shared secret. These derived keys have different security properties: to emphasize this, they are illustrated with different colors.

The first handshake message that is encrypted is the server’s EncryptedExtensions. The purpose of this message is to protect the server’s sensitive handshake parameters, including the server’s ALPN extension, which contains the application selected from the client’s ALPN list. Key-exchange parameters are sent unencrypted in the ClientHello and ServerHello.

Good-bye ESNI, hello ECH!
Figure 1: The TLS 1.3 handshake.

All of the client’s handshake parameters, sensitive or not, are sent in the ClientHello. Looking at Figure 1, you might be able to think of ways of reworking the handshake so that some of them can be encrypted, perhaps at the cost of additional latency (i.e., more round trips over the network). However, extensions like SNI create a kind of “chicken-and-egg” problem.

The client doesn’t encrypt anything until it has verified the server’s identity (this is the job of the Certificate and CertificateVerify messages) and the server has confirmed that it knows the shared secret (the job of the Finished message). These measures ensure the key exchange is authenticated, thereby preventing monster-in-the-middle (MITM) attacks in which the adversary impersonates the server to the client in a way that allows it to decrypt messages sent by the client.  Because SNI is needed by the server to select the certificate, it needs to be transmitted before the key exchange is authenticated.

In general, ensuring confidentiality of handshake parameters used for authentication is only possible if the client and server already share an encryption key. But where might this key come from?

Full handshake encryption in the early days of TLS 1.3. Interestingly, full handshake encryption was once proposed as a core feature of TLS 1.3. In early versions of the protocol (draft-10, circa 2015), the server would offer the client a long-lived public key during the handshake, which the client would use for encryption in subsequent handshakes. (This design came from a protocol called OPTLS, which in turn was borrowed from the original QUIC proposal.) Called “0-RTT”, the primary purpose of this mode was to allow the client to begin sending application data prior to completing a handshake. In addition, it would have allowed the client to encrypt its first flight of handshake messages following the ClientHello, including its own EncryptedExtensions, which might have been used to protect the client’s sensitive handshake parameters.

Ultimately this feature was not included in the final standard (RFC 8446, published in 2018), mainly because its usefulness was outweighed by its added complexity. In particular, it does nothing to protect the initial handshake in which the client learns the server’s public key. Parameters that are required for server authentication of the initial handshake, like SNI, would still be transmitted in the clear.

Nevertheless, this scheme is notable as the forerunner of other handshake encryption mechanisms, like ECH, that use public key encryption to protect sensitive ClientHello parameters. The main problem these mechanisms must solve is key distribution.

Before ECH there was (and is!) ESNI

The immediate predecessor of ECH was the Encrypted SNI (ESNI) extension. As its name implies, the goal of ESNI was to provide confidentiality of the SNI. To do so, the client would encrypt its SNI extension under the server’s public key and send the ciphertext to the server. The server would attempt to decrypt the ciphertext using the secret key corresponding to its public key. If decryption were to succeed, then the server would proceed with the connection using the decrypted SNI. Otherwise, it would simply abort the handshake. The high-level flow of this simple protocol is illustrated in Figure 2.

Good-bye ESNI, hello ECH!
Figure 2: The TLS 1.3 handshake with the ESNI extension. It is identical to the TLS 1.3 handshake, except the SNI extension has been replaced with ESNI.

For key distribution, ESNI relied on another critical protocol: Domain Name Service (DNS). In order to use ESNI to connect to a website, the client would piggy-back on its standard A/AAAA queries a request for a TXT record with the ESNI public key. For example, to get the key for crypto.dance, the client would request the TXT record of _esni.crypto.dance:

$ dig _esni.crypto.dance TXT +short
"/wGuNThxACQAHQAgXzyda0XSJRQWzDG7lk/r01r1ZQy+MdNxKg/mAqSnt0EAAhMBAQQAAAAAX67XsAAAAABftsCwAAA="

The base64-encoded blob contains an ESNI public key and related parameters such as the encryption algorithm.

But what’s the point of encrypting SNI if we’re just going to leak the server name to network observers via a plaintext DNS query? Deploying ESNI this way became feasible with the introduction of DNS-over-HTTPS (DoH), which enables encryption of DNS queries to resolvers that provide the DoH service (1.1.1.1 is an example of such a service.). Another crucial feature of DoH is that it provides an authenticated channel for transmitting the ESNI public key from the DoH server to the client. This prevents cache-poisoning attacks that originate from the client’s local network: in the absence of DoH, a local attacker could prevent the client from offering the ESNI extension by returning an empty TXT record, or coerce the client into using ESNI with a key it controls.

While ESNI took a significant step forward, it falls short of our goal of achieving full handshake encryption. Apart from being incomplete — it only protects SNI — it is vulnerable to a handful of sophisticated attacks, which, while hard to pull off, point to theoretical weaknesses in the protocol’s design that need to be addressed.

ESNI was deployed by Cloudflare and enabled by Firefox, on an opt-in basis, in 2018, an  experience that laid bare some of the challenges with relying on DNS for key distribution. Cloudflare rotates its ESNI key every hour in order to minimize the collateral damage in case a key ever gets compromised. DNS artifacts are sometimes cached for much longer, the result of which is that there is a decent chance of a client having a stale public key. While Cloudflare’s ESNI service tolerates this to a degree, every key must eventually expire. The question that the ESNI protocol left open is how the client should proceed if decryption fails and it can’t access the current public key, via DNS or otherwise.

Another problem with relying on DNS for key distribution is that several endpoints might be authoritative for the same origin server, but have different capabilities. For example, a request for the A record of “example.com” might return one of two different IP addresses, each operated by a different CDN. The TXT record for “_esni.example.com” would contain the public key for one of these CDNs, but certainly not both. The DNS protocol does not provide a way of atomically tying together resource records that correspond to the same endpoint. In particular, it’s possible for a client to inadvertently offer the ESNI extension to an endpoint that doesn’t support it, causing the handshake to fail. Fixing this problem requires changes to the DNS protocol. (More on this below.)

The future of ESNI. In the next section, we’ll describe the ECH specification and how it addresses the shortcomings of ESNI. Despite its limitations, however, the practical privacy benefit that ESNI provides is significant. Cloudflare intends to continue its support for ESNI until ECH is production-ready.

The ins and outs of ECH

The goal of ECH is to encrypt the entire ClientHello, thereby closing the gap left in TLS 1.3 and ESNI by protecting all privacy-sensitive handshake-parameters. Similar to ESNI, the protocol uses a public key, distributed via DNS and obtained using DoH, for encryption during the client’s first flight. But ECH has improvements to key distribution that make the protocol more robust to DNS cache inconsistencies. Whereas the ESNI server aborts the connection if decryption fails, the ECH server attempts to complete the handshake and supply the client with a public key it can use to retry the connection.

But how can the server complete the handshake if it’s unable to decrypt the ClientHello? As illustrated in Figure 3, the ECH protocol actually involves two ClientHello messages: the ClientHelloOuter, which is sent in the clear, as usual; and the ClientHelloInner, which is encrypted and sent as an extension of the ClientHelloOuter. The server completes the handshake with just one of these ClientHellos: if decryption succeeds, then it proceeds with the ClientHelloInner; otherwise, it proceeds with the ClientHelloOuter.

Good-bye ESNI, hello ECH!
Figure 3: The TLS 1.3 handshake with the ECH extension.

The ClientHelloInner is composed of the handshake parameters the client wants to use for the connection. This includes sensitive values, like the SNI of the origin server it wants to reach (called the backend server in ECH parlance), the ALPN list, and so on. The ClientHelloOuter, while also a fully-fledged ClientHello message, is not used for the intended connection. Instead, the handshake is completed by the ECH service provider itself (called the client-facing server), signaling to the client that its intended destination couldn’t be reached due to decryption failure. In this case, the service provider also sends along the correct ECH public key with which the client can retry handshake, thereby “correcting” the client’s configuration. (This mechanism is similar to how the server distributed its public key for 0-RTT mode in the early days of TLS 1.3.)

At a minimum, both ClientHellos must contain the handshake parameters that are required for a server-authenticated key-exchange. In particular, while the ClientHelloInner contains the real SNI, the ClientHelloOuter also contains an SNI value, which the client expects to verify in case of ECH decryption failure (i.e., the client-facing server). If the connection is established using the ClientHelloOuter, then the client is expected to immediately abort the connection and retry the handshake with the public key provided by the server. It’s not necessary that the client specify an ALPN list in the ClientHelloOuter, nor any other extension used to guide post-handshake behavior. All of these parameters are encapsulated by the encrypted ClientHelloInner.

This design resolves — quite elegantly, I think — most of the challenges for securely deploying handshake encryption encountered by earlier mechanisms. Importantly, the design of ECH was not conceived in a vacuum. The protocol reflects the diverse perspectives of the IETF community, and its development dovetails with other IETF standards that are crucial to the success of ECH.

The first is an important new DNS feature known as the HTTPS resource record type. At a high level, this record type is intended to allow multiple HTTPS endpoints that are authoritative for the same domain name to advertise different capabilities for TLS. This makes it possible to rely on DNS for key distribution, resolving one of the deployment challenges uncovered by the initial ESNI deployment. For a deep dive into this new record type and what it means for the Internet more broadly, check out Alessandro Ghedini’s recent blog post on the subject.

The second is the CFRG’s Hybrid Public Key Encryption (HPKE) standard, which specifies an extensible framework for building public key encryption schemes suitable for a wide variety of applications. In particular, ECH delegates all of the details of its handshake encryption mechanism to HPKE, resulting in a much simpler and easier-to-analyze specification. (Incidentally, HPKE is also one of the main ingredients of Oblivious DNS-over-HTTPS.

The road ahead

The current ECH specification is the culmination of a multi-year collaboration. At this point, the overall design of the protocol is fairly stable. In fact, the next draft of the specification will be the first to be targeted for interop testing among implementations. Still, there remain a number of details that need to be sorted out. Let’s end this post with a brief overview of the road ahead.

Resistance to traffic analysis

Ultimately, the goal of ECH is to ensure that TLS connections made to different origin servers behind the same ECH service provider are indistinguishable from one another. In other words, when you connect to an origin behind, say, Cloudflare, no one on the network between you and Cloudflare should be able to discern which origin you reached, or which privacy-sensitive handshake-parameters you and the origin negotiated. Apart from an immediate privacy boost, this property, if achieved, paves the way for the deployment of new features for TLS without compromising privacy.

Encrypting the ClientHello is an important step towards achieving this goal, but we need to do a bit more. An important attack vector we haven’t discussed yet is traffic analysis. This refers to the collection and analysis of properties of the communication channel that betray part of the ciphertext’s contents, but without cracking the underlying encryption scheme. For example, the length of the encrypted ClientHello might leak enough information about the SNI for the adversary to make an educated guess as to its value (this risk is especially high for domain names that are either particularly short or particularly long). It is therefore crucial that the length of each ciphertext is independent of the values of privacy-sensitive parameters. The current ECH specification provides some mitigations, but their coverage is incomplete. Thus, improving ECH’s resistance to traffic analysis is an important direction for future work.

The spectre of ossification

An important open question for ECH is the impact it will have on network operations.

One of the lessons learned from the deployment of TLS 1.3 is that upgrading a core Internet protocol can trigger unexpected network behavior. Cloudflare was one of the first major TLS operators to deploy TLS 1.3 at scale; when browsers like Firefox and Chrome began to enable it on an experimental basis, they observed a significantly higher rate of connection failures compared to TLS 1.2. The root cause of these failures was network ossification, i.e., the tendency of middleboxes — network appliances between clients and servers that monitor and sometimes intercept traffic — to write software that expects traffic to look and behave a certain way. Changing the protocol before middleboxes had the chance to update their software led to middleboxes trying to parse packets they didn’t recognize, triggering software bugs that, in some instances, caused connections to be dropped completely.

This problem was so widespread that, instead of waiting for network operators to update their software, the design of TLS 1.3 was altered in order to mitigate the impact of network ossification. The ingenious solution was to make TLS 1.3 “look like” another protocol that middleboxes are known to tolerate. Specifically, the wire format and even the contents of handshake messages were made to resemble TLS 1.2. These two protocols aren’t identical, of course — a curious network observer can still distinguish between them — but they look and behave similar enough to ensure that the majority of existing middleboxes don’t treat them differently. Empirically, it was found that this strategy significantly reduced the connection failure rate enough to make deployment of TLS 1.3 viable.

Once again, ECH represents a significant upgrade for TLS for which the spectre of network ossification looms large. The ClientHello contains parameters, like SNI, that have existed in the handshake for a long time, and we don’t yet know what the impact will be of encrypting them. In anticipation of the deployment issues ossification might cause, the ECH protocol has been designed to look as much like a standard TLS 1.3 handshake as possible. The most notable difference is the ECH extension itself: if middleboxes ignore it — as they should, if they are compliant with the TLS 1.3 standard — then the rest of the handshake will look and behave very much as usual.

It remains to be seen whether this strategy will be enough to ensure the wide-scale deployment of ECH. If so, it is notable that this new feature will help to mitigate the impact of future TLS upgrades on network operations. Encrypting the full handshake reduces the risk of ossification since it means that there are less visible protocol features for software to ossify on. We believe this will be good for the health of the Internet overall.

Conclusion

The old TLS handshake is (unintentionally) leaky. Operational requirements of both the client and server have led to privacy-sensitive parameters, like SNI, being negotiated completely in the clear and available to network observers. The ECH extension aims to close this gap by enabling encryption of the full handshake. This represents a significant upgrade to TLS, one that will help preserve end-user privacy as the protocol continues to evolve.

The ECH standard is a work-in-progress. As this work continues, Cloudflare is committed to doing its part to ensure this important upgrade for TLS reaches Internet-scale deployment.

OPAQUE: The Best Passwords Never Leave your Device

Post Syndicated from Tatiana Bradley original https://blog.cloudflare.com/opaque-oblivious-passwords/

OPAQUE: The Best Passwords Never Leave your Device

OPAQUE: The Best Passwords Never Leave your Device

Passwords are a problem. They are a problem for reasons that are familiar to most readers. For us at Cloudflare, the problem lies much deeper and broader. Most readers will immediately acknowledge that passwords are hard to remember and manage, especially as password requirements grow increasingly complex. Luckily there are great software packages and browser add-ons to help manage passwords. Unfortunately, the greater underlying problem is beyond the reaches of software to solve.

The fundamental password problem is simple to explain, but hard to solve: A password that leaves your possession is guaranteed to sacrifice security, no matter its complexity or how hard it may be to guess. Passwords are insecure by their very existence.

You might say, “but passwords are always stored in encrypted format!” That would be great. More accurately, they are likely stored as a salted hash, as explained below. Even worse is that there is no way to verify the way that passwords are stored, and so we can assume that on some servers passwords are stored in cleartext. The truth is that even responsibly stored passwords can be leaked and broken, albeit (and thankfully) with enormous effort. An increasingly pressing problem stems from the nature of passwords themselves: any direct use of a password, today, means that the password must be handled in the clear.

You say, “but my password is transmitted securely over HTTPS!” This is true.

You say, “but I know the server stores my password in hashed form, secure so no one can access it!” Well, this puts a lot of faith in the server. Even so, let’s just say that yes, this may be true, too.

There remains, however, an important caveat — a gap in the end-to-end use of passwords. Consider that once a server receives a password, between being securely transmitted and securely stored, the password has to be read and processed. Yes, as cleartext!

And it gets worse — because so many are used to thinking in software, it’s easy to forget about the vulnerability of hardware. This means that even if the software is somehow trusted, the password must at some point reside in memory. The password must at some point be transmitted over a shared bus to the CPU. These provide vectors of attack to on-lookers in many forms. Of course, these attack vectors are far less likely than those presented by transmission and permanent storage, but they are no less severe (recent CPU vulnerabilities such as Spectre and Meltdown should serve as a stark reminder.)

The only way to fix this problem is to get rid of passwords altogether. There is hope! Research and private sector communities are working hard to do just that. New standards are emerging and growing mature. Unfortunately, passwords are so ubiquitous that it will take a long time to agree on and supplant passwords with new standards and technology.

At Cloudflare, we’ve been asking if there is something that can be done now, imminently. Today’s deep-dive into OPAQUE is one possible answer. OPAQUE is one among many examples of systems that enable a password to be useful without it ever leaving your possession. No one likes passwords, but as long they’re in use, at least we can ensure they are never given away.

I’ll be the first to admit that password-based authentication is annoying. Passwords are hard to remember, tedious to type, and notoriously insecure. Initiatives to reduce or replace passwords are promising. For example, WebAuthn is a standard for web authentication based primarily on public key cryptography using hardware (or software) tokens. Even so, passwords are frustratingly persistent as an authentication mechanism. Whether their persistence is due to their ease of implementation, familiarity to users, or simple ubiquity on the web and elsewhere, we’d like to make password-based authentication as secure as possible while they persist.

My internship at Cloudflare focused on OPAQUE, a cryptographic protocol that solves one of the most glaring security issues with password-based authentication on the web: though passwords are typically protected in transit by HTTPS, servers handle them in plaintext to check their correctness. Handling plaintext passwords is dangerous, as accidentally logging or caching them could lead to a catastrophic breach. The goal of the project, rather than to advocate for adoption of any particular protocol, is to show that OPAQUE is a viable option among many for authentication. Because the web case is most familiar to me, and likely many readers, I will use the web as my main example.

Web Authentication 101: Password-over-TLS

When you type in a password on the web, what happens? The website must check that the password you typed is the same as the one you originally registered with the site. But how does this check work?

Usually, your username and password are sent to a server. The server then checks if the registered password associated with your username matches the password you provided. Of course, to prevent an attacker eavesdropping on your Internet traffic from stealing your password, your connection to the server should be encrypted via HTTPS (HTTP-over-TLS).

Despite use of HTTPS, there still remains a glaring problem in this flow: the server must store a representation of your password somewhere. Servers are hard to secure, and breaches are all too common. Leaking this representation can cause catastrophic security problems. (For records of the latest breaches, check out https://haveibeenpwned.com/).

To make these leaks less devastating, servers often apply a hash function to user passwords. A hash function maps each password to a unique, random-looking value. It’s easy to apply the hash to a password, but almost impossible to reverse the function and retrieve the password. (That said, anyone can guess a password, apply the hash function, and check if the result is the same.)

With password hashing, plaintext passwords are no longer stored on servers.  An attacker who steals a password database no longer has direct access to passwords. Instead, the attacker must apply the hash to many possible passwords and compare the results with the leaked hashes.

Unfortunately, if a server hashes only the passwords, attackers can download precomputed rainbow tables containing hashes of trillions of possible passwords and almost instantly retrieve the plaintext passwords. (See https://project-rainbowcrack.com/table.htm for a list of some rainbow tables).

With this in mind, a good defense-in-depth strategy is to use salted hashing, where the server hashes your password appended to a random, per-user value called a salt. The server also saves the salt alongside the username, so the user never sees or needs to submit it. When the user submits a password, the server re-computes this hash function using the salt. An attacker who steals password data, i.e., the password representations and salt values, must then guess common passwords one by one and apply the (salted) hash function to each guessed password. Existing rainbow tables won’t help because they don’t take the salts into account, so the attacker needs to make a new rainbow table for each user!

This (hopefully) slows down the attack enough for the service to inform users of a breach, so they can change their passwords. In addition, the salted hashes should be hardened by applying a hash many times to further slow attacks. (See https://blog.cloudflare.com/keeping-passwords-safe-by-staying-up-to-date/ for a more detailed discussion).

These two mitigation strategies — encrypting the password in transit and storing salted, hardened hashes — are the current best practices.

A large security hole remains open. Password-over-TLS (as we will call it) requires users to send plaintext passwords to servers during login, because servers must see these passwords to match against registered passwords on file. Even a well-meaning server could accidentally cache or log your password attempt(s), or become corrupted in the course of checking passwords. (For example, Facebook detected in 2019 that it had accidentally been storing hundreds of millions of plaintext user passwords). Ideally, servers should never see a plaintext password at all.

But that’s quite a conundrum: how can you check a password if you never see the password? Enter OPAQUE: a Password-Authenticated Key Exchange (PAKE) protocol that simultaneously proves knowledge of a password and derives a secret key. Before describing OPAQUE in detail, we’ll first summarize PAKE functionalities in general.

Password Proofs with Password-Authenticated Key Exchange

Password-Authenticated Key Exchange (PAKE) was proposed by Bellovin and Merrit in 1992, with an initial motivation of allowing password-authentication without the possibility of dictionary attacks based on data transmitted over an insecure channel.

Essentially, a plain, or symmetric, PAKE is a cryptographic protocol that allows two parties who share only a password to establish a strong shared secret key. The goals of PAKE are:

1) The secret keys will match if the passwords match, and appear random otherwise.

2) Participants do not need to trust third parties (in particular, no Public Key Infrastructure),

3) The resulting secret key is not learned by anyone not participating in the protocol – including those who know the password.

4) The protocol does not reveal either parties’ password to each other (unless the passwords match), or to eavesdroppers.

In sum, the only way to successfully attack the protocol is to guess the password correctly while participating in the protocol. (Luckily, such attacks can be mostly thwarted by rate-limiting, i.e, blocking a user from logging in after a certain number of incorrect password attempts).

Given these requirements, password-over-TLS is clearly not a PAKE, because:

  • It relies on WebPKI, which places trust in third-parties called Certificate Authorities (see https://blog.cloudflare.com/introducing-certificate-transparency-and-nimbus/ for an in-depth explanation of WebPKI and some of its shortcomings).
  • The user’s password is revealed to the server.
  • Password-over-TLS provides the user no assurance that the server knows their password or a derivative of it — a server could accept any input from the user with no checks whatsoever.

That said, plain PAKE is still worse than Password-over-TLS, simply because it requires the server to store plaintext passwords. We need a PAKE that lets the server store salted hashes if we want to beat the current practice.

An improvement over plain PAKE is what’s called an asymmetric PAKE (aPAKE), because only the client knows the password, and the server knows a hashed password. An aPAKE has the four properties of PAKE, plus one more:

5) An attacker who steals password data stored on the server must perform a dictionary attack to retrieve the password.

The issue with most existing aPAKE protocols, however, is that they do not allow for a salted hash (or if they do, they require that salt to be transmitted to the user, which means the attacker has access to the salt beforehand and can begin computing a rainbow table for the user before stealing any data). We’d like, therefore, to upgrade the security property as follows:

5*) An attacker who steals password data stored on the server must perform a per-user dictionary attack to retrieve the password after the data is compromised.

OPAQUE is the first aPAKE protocol with a formal security proof that has this property: it allows for a completely secret salt.

OPAQUE – Servers safeguard secrets without knowing them!

OPAQUE: The Best Passwords Never Leave your Device

OPAQUE is what’s referred to as a strong aPAKE, which simply means that it resists these pre-computation attacks by using a secretly salted hash on the server. OPAQUE was proposed and formally analyzed by Stanislaw Jarecki, Hugo Krawcyzk and Jiayu Xu in 2018 (full disclosure: Stanislaw Jarecki is my academic advisor). The name OPAQUE is a combination of the names of two cryptographic protocols: OPRF and PAKE. We already know PAKE, but what is an OPRF? OPRF stands for Oblivious Pseudo-Random Function, which is a protocol by which two parties compute a function F(key, x) that is deterministic but outputs random-looking values. One party inputs the value x, and another party inputs the key – the party who inputs x learns the result F(key, x) but not the key, and the party providing the key learns nothing.  (You can dive into the math of OPRFs here: https://blog.cloudflare.com/privacy-pass-the-math/).

The core of OPAQUE is a method to store user secrets for safekeeping on a server, without giving the server access to those secrets. Instead of storing a traditional salted password hash, the server stores a secret envelope for you that is “locked” by two pieces of information: your password known only by you, and a random secret key (like a salt) known only by the server. To log in, the client initiates a cryptographic exchange that reveals the envelope key to the client, but, importantly, not to the server.

The server then sends the envelope to the user, who now can retrieve the encrypted keys. (The keys included in the envelope are a private-public key pair for the user, and a public key for the server.) These keys, once unlocked, will be the inputs to an Authenticated Key Exchange (AKE) protocol, which allows the user and server to establish a secret key which can be used to encrypt their future communication.

OPAQUE consists of two phases, being credential registration and login via key exchange.

OPAQUE: Registration Phase

Before registration, the user first signs up for a service and picks a username and password. Registration begins with the OPRF flow we just described: Alice (the user) and Bob (the server) do an OPRF exchange. The result is that Alice has a random key rwd, derived from the OPRF output F(key, pwd), where key is a server-owned OPRF key specific to Alice and pwd is Alice’s password.

Within his OPRF message, Bob sends the public key for his OPAQUE identity. Alice then generates a new private/public key pair, which will be her persistent OPAQUE identity for Bob’s service, and encrypts her private key along with Bob’s public key with the rwd (we will call the result an encrypted envelope). She sends this encrypted envelope along with her public key (unencrypted) to Bob, who stores the data she provided, along with Alice’s specific OPRF keysecret, in a database indexed by her username.

OPAQUE: The Best Passwords Never Leave your Device

OPAQUE: Login Phase

The login phase is very similar. It starts the same way as registration — with an OPRF flow. However, on the server side, instead of generating a new OPRF key, Bob instead looks up the one he created during Alice’s registration. He does this by looking up Alice’s username (which she provides in the first message), and retrieving his record of her. This record contains her public key, her encrypted envelope, and Bob’s OPRF key for Alice.

He also sends over the encrypted envelope which Alice can decrypt with the output of the OPRF flow. (If decryption fails, she aborts the protocol — this likely indicates that she typed her password incorrectly, or Bob isn’t who he says he is). If decryption succeeds, she now has her own secret key and Bob’s public key. She inputs these into an AKE protocol with Bob, who, in turn, inputs his private key and her public key, which gives them both a fresh shared secret key.

OPAQUE: The Best Passwords Never Leave your Device

Integrating OPAQUE with an AKE

An important question to ask here is: what AKE is suitable for OPAQUE? The emerging CFRG specification outlines several options, including 3DH and SIGMA-I. However, on the web, we already have an AKE at our disposal: TLS!

Recall that TLS is an AKE because it provides unilateral (and mutual) authentication with shared secret derivation. The core of TLS is a Diffie-Hellman key exchange, which by itself is unauthenticated, meaning that the parties running it have no way to verify who they are running it with. (This is a problem because when you log into your bank, or any other website that stores your private data, you want to be sure that they are who they say they are). Authentication primarily uses certificates, which are issued by trusted entities through a system called Public Key Infrastructure (PKI). Each certificate is associated with a secret key. To prove its identity, the server presents its certificate to the client, and signs the TLS handshake with its secret key.

Modifying this ubiquitous certificate-based authentication on the web is perhaps not the best place to start. Instead, an improvement would be to authenticate the TLS shared secret, using OPAQUE, after the TLS handshake completes. In other words, once a server is authenticated with its typical WebPKI certificate, clients could subsequently authenticate to the server. This authentication could take place “post handshake” in the TLS connection using OPAQUE.

Exported Authenticators are one mechanism for “post-handshake” authentication in TLS. They allow a server or client to provide proof of an identity without setting up a new TLS connection. Recall that in the standard web case, the server establishes their identity with a certificate (proving, for example, that they are “cloudflare.com”). But if the same server also holds alternate identities, they must run TLS again to prove who they are.

The basic Exported Authenticator flow works resembles a classical challenge-response protocol, and works as follows. (We’ll consider the server authentication case only, as the client case is symmetric).

OPAQUE: The Best Passwords Never Leave your Device

At any point after a TLS connection is established, Alice (the client) sends an authenticator request to indicate that she would like Bob (the server) to prove an additional identity. This request includes a context (an unpredictable string — think of this as a challenge), and extensions which include information about what identity the client wants to be provided. For example, the client could include the SNI extension to ask the server for a certificate associated with a certain domain name other than the one initially used in the TLS connection.

On receipt of the client message, if the server has a valid certificate corresponding to the request, it sends back an exported authenticator which proves that it has the secret key for the certificate. (This message has the same format as an Auth message from the client in TLS 1.3 handshake – it contains a Certificate, a CertificateVerify and a Finished message). If the server cannot or does not wish to authenticate with the requested certificate, it replies with an empty authenticator which contains only a well formed Finished message.

The client then checks that the Exported Authenticator it receives is well-formed, and then verifies that the certificate presented is valid, and if so, accepts the new identity.

In sum, Exported Authenticators provide authentication in a higher layer (such as the application layer) safely by leveraging the well-vetted cryptography and message formats of TLS. Furthermore, it is tied to the TLS session so that authentication messages can’t be copied and pasted from one TLS connection into another. In other words, Exported Authenticators provide exactly the right hooks needed to add OPAQUE-based authentication into TLS.

OPAQUE with Exported Authenticators (OPAQUE-EA)

OPAQUE: The Best Passwords Never Leave your Device

OPAQUE-EA allows OPAQUE to run at any point after a TLS connection has already been set up. Recall that Bob (the server) will store his OPAQUE identity, in this case a signing key and verification key, and Alice will store her identity — encrypted — on Bob’s server. (The registration flow where Alice stores her encrypted keys is the same as in regular OPAQUE, except she stores a signing key, so we will skip straight to the login flow). Alice and Bob run two request-authenticate EA flows, one for each party, and OPAQUE protocol messages ride along in the extensions section of the EAs. Let’s look in detail how this works.

First, Alice generates her OPRF message based on her password. She creates an Authenticator Request asking for Bob’s OPAQUE identity, and includes (in the extensions field) her username and her OPRF message, and sends this to Bob over their established TLS connection.

Bob receives the message and looks up Alice’s username in his database. He retrieves her OPAQUE record containing her verification key and encrypted envelope, and his OPRF key. He uses the OPRF key on the OPRF message, and creates an Exported Authenticator proving ownership of his OPAQUE signing key, with an extension containing his OPRF message and the encrypted envelope. Additionally, he sends a new Authenticator Request asking Alice to prove ownership of her OPAQUE signing key.

Alice parses the message and completes the OPRF evaluation using Bob’s message to get output rwd, and uses rwd to decrypt the envelope. This reveals her signing key and Bob’s public key. She uses Bob’s public key to validate his Authenticator Response proof, and, if it checks out, she creates and sends an Exported Authenticator proving that she holds the newly decrypted signing key. Bob checks the validity of her Exported Authenticator, and if it checks out, he accepts her login.

My project: OPAQUE-EA over HTTPS

Everything described above is supported by lots and lots of theory that has yet to find its way into practice. My project was to turn the theory into reality. I started with written descriptions of Exported Authenticators, OPAQUE, and a preliminary draft of OPAQUE-in-TLS. My goal was to get from those to a working prototype.

My demo shows the feasibility of implementing OPAQUE-EA on the web, completely removing plaintext passwords from the wire, even encrypted. This provides a possible alternative to the current password-over-TLS flow with better security properties, but no visible change to the user.

A few of the implementation details are worth knowing. In computer science, abstraction is a powerful tool. It means that we can often rely on existing tools and APIs to avoid duplication of effort. In my project I relied heavily on mint, an open-source implementation of TLS 1.3 in Go that is great for prototyping. I also used CIRCL’s OPRF API. I built libraries for Exported Authenticators, the core of OPAQUE, and OPAQUE-EA (which ties together the two).

I made the web demo by wrapping the OPAQUE-EA functionality in a simple HTTP server and client that pass messages to each other over HTTPS. Since a browser can’t run Go, I compiled from Go to WebAssembly (WASM) to get the Go functionality in the browser, and wrote a simple script in JavaScript to call the WASM functions needed.

Since current browsers do not give access to the underlying TLS connection on the client side, I had to implement a work-around to allow the client to access the exporter keys, namely, that the server simply computes the keys and sends them to the client over HTTPS. This workaround reduces the security of the resulting demo — it means that trust is placed in the server to provide the right keys. Even so, the user’s password is still safe, even if a malicious server provided bad keys— they just don’t have assurance that they actually previously registered with that server. However, in the future, browsers could include a mechanism to support exported keys and allow OPAQUE-EA to run with its full security properties.

You can explore my implementation on Github, and even follow the instructions to spin up your own OPAQUE-EA test server and client. I’d like to stress, however, that the implementation is meant as a proof-of-concept only, and must not be used for production systems without significant further review.

OPAQUE-EA Limitations

Despite its great properties, there will definitely be some hurdles in bringing OPAQUE-EA from a proof-of-concept to a fully fledged authentication mechanism.

Browser support for TLS exporter keys. As mentioned briefly before, to run OPAQUE-EA in a browser, you need to access secrets from the TLS connection called exporter keys. There is no way to do this in the current most popular browsers, so support for this functionality will need to be added.

Overhauling password databases. To adopt OPAQUE-EA, servers need not only to update their password-checking logic, but also completely overhaul their password databases. Because OPAQUE relies on special password representations that can only be generated interactively, existing salted hashed passwords cannot be automatically updated to OPAQUE records. Servers will likely need to run a special OPAQUE registration flow on a user-by-user basis. Because OPAQUE relies on buy-in from both the client and the server, servers may need to support the old method for a while before all clients catch up.

Reliance on emerging standards. OPAQUE-EA relies on OPRFs, which is in the process of standardization, and Exported Authenticators, a proposed standard. This means that support for these dependencies is not yet available in most existing cryptographic libraries, so early adopters may need to implement these tools themselves.

Summary

As long as people still use passwords, we’d like to make the process as secure as possible. Current methods rely on the risky practice of handling plaintext passwords on the server side while checking their correctness. PAKEs, and (specifically aPAKEs) allow secure password login without ever letting the server see the passwords.

OPAQUE is also being explored within other companies. According to Kevin Lewi, a research scientist from the Novi Research team at Facebook, they are “excited by the strong cryptographic guarantees provided by OPAQUE and are actively exploring OPAQUE as a method for further safeguarding credential-protected fields that are stored server-side.”

OPAQUE is one of the best aPAKEs out there, and can be fully integrated into TLS. You can check out the core OPAQUE implementation here and the demo TLS integration here. A running version of the demo is also available here. A Typescript client implementation of OPAQUE is coming soon. If you’re interested in implementing the protocol, or encounter any bugs with the current implementation, please drop us a line at [email protected]! Consider also subscribing to the IRTF CFRG mailing list to track discussion about the OPAQUE specification and its standardization.

Improving DNS Privacy with Oblivious DoH in 1.1.1.1

Post Syndicated from Tanya Verma original https://blog.cloudflare.com/oblivious-dns/

Improving DNS Privacy with Oblivious DoH in 1.1.1.1

Improving DNS Privacy with Oblivious DoH in 1.1.1.1

Today we are announcing support for a new proposed DNS standard — co-authored by engineers from Cloudflare, Apple, and Fastly — that separates IP addresses from queries, so that no single entity can see both at the same time. Even better, we’ve made source code available, so anyone can try out ODoH, or run their own ODoH service!

But first, a bit of context. The Domain Name System (DNS) is the foundation of a human-usable Internet. It maps usable domain names, such as cloudflare.com, to IP addresses and other information needed to connect to that domain. A quick primer about the importance and issues with DNS can be read in a previous blog post. For this post, it’s enough to know that, in the initial design and still dominant usage of DNS, queries are sent in cleartext. This means anyone on the network path between your device and the DNS resolver can see both the query that contains the hostname (or website) you want, as well as the IP address that identifies your device.

To safeguard DNS from onlookers and third parties, the IETF standardized DNS encryption with DNS over HTTPS (DoH) and DNS over TLS (DoT). Both protocols prevent queries from being intercepted, redirected, or modified between the client and resolver. Client support for DoT and DoH is growing, having been implemented in recent versions of Firefox, iOS, and more. Even so, until there is wider deployment among Internet service providers, Cloudflare is one of only a few providers to offer a public DoH/DoT service. This has raised two main concerns. One concern is that the centralization of DNS introduces single points of failure (although, with data centers in more than 100 countries, Cloudflare is designed to always be reachable). The other concern is that the resolver can still link all queries to client IP addresses.

Cloudflare is committed to end-user privacy. Users of our public DNS resolver service are protected by a strong, audited privacy policy. However, for some, trusting Cloudflare with sensitive query information is a barrier to adoption, even with such a strong privacy policy. Instead of relying on privacy policies and audits, what if we could give users an option to remove that bar with technical guarantees?

Today, Cloudflare and partners are launching support for a protocol that does exactly that: Oblivious DNS over HTTPS, or ODoH for short.

ODoH Partners:

We’re excited to launch ODoH with several leading launch partners who are equally committed to privacy.

A key component of ODoH is a proxy that is disjoint from the target resolver. Today, we’re launching ODoH with several leading proxy partners, including: PCCW, SURF, and Equinix.

Improving DNS Privacy with Oblivious DoH in 1.1.1.1

“ODoH is a revolutionary new concept designed to keep users’ privacy at the center of everything. Our ODoH partnership with Cloudflare positions us well in the privacy and “Infrastructure of the Internet” space. As well as the enhanced security and performance of the underlying PCCW Global network, which can be accessed on-demand via Console Connect, the performance of the proxies on our network are now improved by Cloudflare’s 1.1.1.1 resolvers. This model for the first time completely decouples client proxy from the resolvers. This partnership strengthens our existing focus on privacy as the world moves to a more remote model and privacy becomes an even more critical feature.” — Michael Glynn, Vice President, Digital Automated Innovation, PCCW Global

Improving DNS Privacy with Oblivious DoH in 1.1.1.1

“We are partnering with Cloudflare to implement better user privacy via ODoH. The move to ODoH is a true paradigm shift, where the users’ privacy or the IP address is not exposed to any provider, resulting in true privacy. With the launch of ODoH-pilot, we’re joining the power of Cloudflare’s network to meet the challenges of any users around the globe. The move to ODoH is not only a paradigm shift but it emphasizes how privacy is important to any users than ever, especially during 2020. It resonates with our core focus and belief around Privacy.” — Joost van Dijk, Technical Product Manager, SURF

Improving DNS Privacy with Oblivious DoH in 1.1.1.1

How does Oblivious DNS over HTTPS (ODoH) work?

ODoH works by adding a layer of public key encryption, as well as a network proxy between clients and DoH servers such as 1.1.1.1. The combination of these two added elements guarantees that only the user has access to both the DNS messages and their own IP address at the same time.

Improving DNS Privacy with Oblivious DoH in 1.1.1.1

There are three players in the ODoH path. Looking at the figure above, let’s begin with the target. The target decrypts queries encrypted by the client, via a proxy. Similarly, the target encrypts responses and returns them to the proxy. The standard says that the target may or may not be the resolver (we’ll touch on this later). The proxy does as a proxy is supposed to do, in that it forwards messages between client and target. The client behaves as it does in DNS and DoH, but differs by encrypting queries for the target, and decrypting the target’s responses. Any client that chooses to do so can specify a proxy and target of choice.

Together, the added encryption and proxying provide the following guarantees:

  1. The target sees only the query and the proxy’s IP address.
  2. The proxy has no visibility into the DNS messages, with no ability to identify, read, or modify either the query being sent by the client or the answer being returned by the target.
  3. Only the intended target can read the content of the query and produce a response.

These three guarantees improve client privacy while maintaining the security and integrity of DNS queries. However, each of these guarantees relies on one fundamental property — that the proxy and the target servers do not collude. So long as there is no collusion, an attacker succeeds only if both the proxy and target are compromised.

One aspect of this system worth highlighting is that the target is separate from the upstream recursive resolver that performs DNS resolution. In practice, for performance, we expect the target to be the same. In fact, 1.1.1.1 is now both a recursive resolver and a target! There is no reason that a target needs to exist separately from any resolver. If they are separated then the target is free to choose resolvers, and just act as a go-between. The only real requirement, remember, is that the proxy and target never collude.

Also, importantly, clients are in complete control of proxy and target selection. Without any need for TRR-like programs, clients can have privacy for their queries, in addition to security. Since the target only knows about the proxy, the target and any upstream resolver are oblivious to the existence of any client IP addresses. Importantly, this puts clients in greater control over their queries and the ways they might be used. For example, clients could select and alter their proxies and targets any time, for any reason!

ODoH Message Flow

In ODoH, the ‘O’ stands for oblivious, and this property comes from the level of encryption of the DNS messages themselves. This added encryption is `end-to-end` between client and target, and independent from the connection-level encryption provided by TLS/HTTPS. One might ask why this additional encryption is required at all in the presence of a proxy. This is because two separate TLS connections are required to support proxy functionality. Specifically, the proxy terminates a TLS connection from the client, and initiates another TLS connection to the target. Between those two connections, the DNS message contexts would otherwise appear in plaintext! For this reason, ODoH additionally encrypts messages between client and target so the proxy has no access to the message contents.

The whole process begins with clients that encrypt their query for the target using HPKE. Clients obtain the target’s public key via DNS, where it is bundled into a HTTPS resource record and protected by DNSSEC. When the TTL for this key expires, clients request a new copy of the key as needed (just as they would for an A/AAAA record when that record’s TTL expires). The usage of a target’s DNSSEC-validated public key guarantees that only the intended target can decrypt the query and encrypt a response (answer).

Clients transmit these encrypted queries to a proxy over an HTTPS connection. Upon receipt, the proxy forwards the query to the designated target. The target then decrypts the query, produces a response by sending the query to a recursive resolver such as 1.1.1.1, and then encrypts the response to the client. The encrypted query from the client contains encapsulated keying material from which targets derive the response encryption symmetric key.

This response is then sent back to the proxy, and then subsequently forwarded to the client. All communication is authenticated and confidential since these DNS messages are end-to-end encrypted, despite being transmitted over two separate HTTPS connections (client-proxy and proxy-target). The message that otherwise appears to the proxy as plaintext is actually an encrypted garble.

What about Performance? Do I have to trade performance to get privacy?

We’ve been doing lots of measurements to find out, and will be doing more as ODoH deploys more widely. Our initial set of measurement configurations spanned cities in the USA, Canada, and Brazil. Importantly, our measurements include not just 1.1.1.1, but also 8.8.8.8 and 9.9.9.9. The full set of measurements, so far, is documented for open access.

In those measurements, it was important to isolate the cost of proxying and additional encryption from the cost of TCP and TLS connection setup. This is because the TLS and TCP costs are incurred by DoH, anyway. So, in our setup, we ‘primed’ measurements by establishing connections once and reusing that connection for all measurements. We did this for both DoH and for ODoH, since the same strategy could be used in either case.

The first thing that we can say with confidence is that the additional encryption is marginal. We know this because we randomly selected 10,000 domains from the Tranco million dataset and measured both encryption of the A record with a different public key, as well as its decryption. The additional cost between a proxied DoH query/response and its ODoH counterpart is consistently less than 1ms at the 99th percentile.

The ODoH request-response pipeline, however, is much more than just encryption. A very useful way of looking at measurements is by looking at the cumulative distribution chart — if you’re familiar with these kinds of charts, skip to the next paragraph. In contrast to most charts where we start along the x-axis, with cumulative distributions we often start with the y-axis.

The chart below shows the cumulative distributions for query/response times in DoH, ODoH, and DoH when transmitted over the Tor Network. The dashed horizontal line that starts on the left from 0.5 is the 50% mark. Along this horizontal line, for any plotted curve, the part of the curve below the dashed line is 50% of the data points. Now look at the x-axis, which is a measure of time. The lines that appear to the left are faster than lines to the right. One last important detail is that the x-axis is plotted on a logarithmic scale. What does this mean? Notice that the distance between the labeled markers (10x) is equal in cumulative distributions but the ‘x’ is an exponent, and represents orders of magnitude. So, while the time difference between the first two markers is 9ms, the difference between the 3rd and 4th markers is 900ms.

Improving DNS Privacy with Oblivious DoH in 1.1.1.1

In this chart, the middle curve represents ODoH measurements. We also measured the performance of privacy-preserving alternatives, for example, DoH queries transmitted over the Tor network as represented by the right curve in the chart. (Additional privacy-preserving alternatives are captured in the open access technical report.) Compared to other privacy-oriented DNS variants, ODoH cuts query time in half, or better. This point is important since privacy and performance rarely play nicely together, so seeing this kind of improvement is encouraging!

The chart above also tells us that 50% of the time ODoH queries are resolved in fewer than 228ms. Now compare the middle line to the left line that represents ‘straight-line’ (or normal) DoH without any modification. That left plotline says that 50% of the time, DoH queries are resolved in fewer than 146ms. Looking below the 50% mark, the curves also tell us that ½ the time that difference is never greater than 100ms. On the other side, looking at the curves above the 50% mark tells us that ½ ODoH queries are competitive with DoH.

Those curves also hide a lot of information, so it is important to delve further into the measurements. The chart below has three different cumulative distribution curves that describe ODoH performance if we select proxies and targets by their latency. This is also an example of the insights that measurements can reveal, some of which are counterintuitive. For example, looking above 0.5, these curves say that ½ of ODoH query/response times are virtually indistinguishable, no matter the choice of proxy and target. Now shift attention below 0.5 and compare the two solid curves against the dashed curve that represents overall average. This region suggests that selecting the lowest-latency proxy and target offers minimal improvement over the average but, most importantly, it shows that selecting the lowest-latency proxy leads to worse performance!

Improving DNS Privacy with Oblivious DoH in 1.1.1.1

Open questions remain, of course. This first set of measurements were executed largely in North America. Does performance change at a global level? How does this affect client performance, in practice? We’re working on finding out, and this release will help us to do that.

Interesting! Can I experiment with ODoH? Is there an open ODoH service?

Yes, and yes! We have open sourced our interoperable ODoH implementations in Rust, odoh-rs and Go, odoh-go, as well as integrated the target into the Cloudflare DNS Resolver. That’s right, 1.1.1.1 is ready to receive queries via ODoH.

We have also open sourced test clients in Rust, odoh-client-rs, and Go, odoh-client-go, to demo ODoH queries. You can also check out the HPKE configuration used by ODoH for message encryption to 1.1.1.1 by querying the service directly:

$ dig -t type65 +dnssec @ns1.cloudflare.com odoh.cloudflare-dns.com 

; <<>> DiG 9.10.6 <<>> -t type65 +dnssec @ns1.cloudflare.com odoh.cloudflare-dns.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 19923
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 1232
;; QUESTION SECTION:
;odoh.cloudflare-dns.com.	IN	TYPE65

;; ANSWER SECTION:
odoh.cloudflare-dns.com. 300	IN	TYPE65	\# 108 00010000010003026832000400086810F8F96810F9F9000600202606 470000000000000000006810F8F92606470000000000000000006810 F9F98001002E002CFF0200280020000100010020ED82DBE32CCDE189 BC6C643A80B5FAFF82548D21601C613408BACAAE6467B30A
odoh.cloudflare-dns.com. 300	IN	RRSIG	TYPE65 13 3 300 20201119163629 20201117143629 34505 odoh.cloudflare-dns.com. yny5+ApxPSO6Q4aegv09ZnBmPiXxDEnX5Xv21TAchxbxt1VhqlHpb5Oc 8yQPNGXb0fb+NyibmHlvTXjphYjcPA==

;; Query time: 21 msec
;; SERVER: 173.245.58.100#53(173.245.58.100)
;; WHEN: Wed Nov 18 07:36:29 PST 2020
;; MSG SIZE  rcvd: 291

We are working to add ODoH to existing stub resolvers such as cloudflared. If you’re interested in adding support to a client, or if you encounter bugs with the implementations, please drop us a line at [email protected]! Announcements about the ODoH specification and server will be sent to the IETF DPRIVE mailing list. You can subscribe and follow announcements and discussion about the specification here.

We are committed to moving it forward in the IETF and are already seeing interest from client vendors. Eric Rescorla, CTO of Firefox, says, “Oblivious DoH is a great addition to the secure DNS ecosystem. We’re excited to see it starting to take off and are looking forward to experimenting with it in Firefox.” We hope that more operators join us along the way and provide support for the protocol, by running either proxies or targets, and we hope client support will increase as the available infrastructure increases, too.

The ODoH protocol is a practical approach for improving privacy of users, and aims to improve the overall adoption of encrypted DNS protocols without compromising performance and user experience on the Internet.

Acknowledgements

Marek Vavruša and Anbang Wen were instrumental in getting the 1.1.1.1 resolver to support ODoH. Chris Wood and Peter Wu helped get the ODoH libraries ready and tested.

Diversity and inclusion in computing education — new research seminars

Post Syndicated from Sue Sentance original https://www.raspberrypi.org/blog/diversity-inclusion-computing-education-research-seminars/

At the Raspberry Pi Foundation, we host a free online research seminar once a month to explore a wide variety of topics in the area of digital and computing education. This year, we’ve hosted eleven seminars — you can (re)discover slides and recordings on our website.

A classroom of young learners and a teacher at laptops

Now we’re getting ready for new seminars in 2021! In the coming months, our seminars are going to focus on diversity and inclusion in computing education. This topic is extremely important, as we want to make sure that computing is accessible to all, that we understand how to actively remove barriers to participation for learners, and that we understand how to teach computing in an inclusive way. 

We are delighted to announce that these seminars focusing on diversity and inclusion will be co-hosted by the Royal Academy of Engineering. The Royal Academy of Engineering is harnessing the power of engineering to build a sustainable society and an inclusive economy that works for everyone.

Royal Academy of Engineering logo

We’re very excited to be partnering with the Academy because of our shared interest in ensuring that computing and engineering are inclusive and accessible to all.

Our upcoming seminars

The seminars take place on the first Tuesday of the month at 17:00–18:30 GMT / 12:00–13:30 EST / 9:00–10:30 PST / 18:00–19:30 CET.

  • 5 January 2021: Peter Kemp (King’s College London) and Billy Wong (University of Reading) will be looking at computing education in England, particularly GCSE computer science, and how it is accessed by groups typically underrepresented in computing.
  • 2 February 2021: Professor Tia Madkins (University of Texas at Austin), Nicol R. Howard (University of Redlands), and Shomari Jones (Bellevue School District) will be talking about equity-focused teaching in K–12 computer science. Find out more.
  • 2 March 2021: Dr Jakita O. Thomas (Auburn University, Alabama) will be talking about her research on supporting computational algorithmic thinking in the context of intersectional computing.
  • April 2021: event to be confirmed
  • 4 May 2021: Dr Cecily Morrison (Microsoft Research) will be speaking about her work on physical programming for people with visual impairments.

Join the seminars

We’d love to welcome you to these seminars so we can learn and discuss together. To get access, simply sign up with your name and email address.

Once you’ve signed up, we’ll email you the seminar meeting link and instructions for joining. If you attended our seminars in the past, the link remains the same.

The post Diversity and inclusion in computing education — new research seminars appeared first on Raspberry Pi.

PRIMM: encouraging talk in programming lessons

Post Syndicated from Oliver Quinlan original https://www.raspberrypi.org/blog/primm-talk-in-programming-lessons-research-seminar/

Whenever you learn a new subject or skill, at some point you need to pick up the particular language that goes with that domain. And the only way to really feel comfortable with this language is to practice using it. It’s exactly the same when learning programming.

A girl doing Scratch coding in a Code Club classroom

In our latest research seminar, we focused on how we educators and our students can talk about programming. The seminar presentation was given by our Chief Learning Officer, Dr Sue Sentance. She shared the work she and her collaborators have done to develop a research-based approach to teaching programming called PRIMM, and to work with teachers to investigate the effects of PRIMM on students.

Sue Sentance

As well as providing a structure for programming lessons, Sue’s research on PRIMM helps us think about ways in which learners can investigate programs, start to understand how they work, and then gradually develop the language to talk about them themselves.

Productive talk for education

Sue began by taking us through the rich history of educational research into language and dialogue. This work has been heavily developed in science and mathematics education, as well as language and literacy.

In particular the work of Neil Mercer and colleagues has shown that students need guidance to develop and practice using language to reason, and that developing high-quality language improves understanding. The role of the teacher in this language development is vital.

Sue’s work draws on these insights to consider how language can be used to develop understanding in programming.

Why is programming challenging for beginners?

Sue identified shortcomings of some teaching approaches that are common in the computing classroom but may not be suitable for all beginners.

  • ‘Copy code’ activities for learners take a long time, lead to dreaded syntax errors, and don’t necessarily build more understanding.
  • When teachers model the process of writing a program, this can be very helpful, but for beginners there may still be a huge jump from being able to follow the modeling to being able to write a program from scratch themselves.

PRIMM was designed by Sue and her collaborators as a language-first approach where students begin not by writing code, but by reading it.

What is PRIMM?

PRIMM stands for ‘Predict, Run, Investigate, Modify, Make’. In this approach, rather than copying code or writing programs from scratch, beginners instead start by focussing on reading working code.

In the Predict stage, the teacher provides learners with example code to read, discuss, and make output predictions about. Next, they run the code to see how the output compares to what they predicted. In the Investigate stage, the teacher sets activities for the learners to trace, annotate, explain, and talk about the code line by line, in order to help them understand what it does in detail.

In the seminar, Sue took us through a mini example of the stages of PRIMM where we predicted the output of Python Turtle code. You can follow along on the recording of the seminar to get the experience of what it feels like to work through this approach.

The impact of PRIMM on learning

The PRIMM approach is informed by research, and it is also the subject of research by Sue and her collaborators. They’ve conducted two studies to measure the effectiveness of PRIMM: an initial pilot, and a larger mixed-methods study with 13 teachers and 493 students with a control group.

The larger study used a pre and post test, and found that the group who experienced a PRIMM approach performed better on the tests than the control group. The researchers also collected a wealth of qualitative feedback from teachers. The feedback suggested that the approach can help students to develop a language to express their understanding of programming, and that there was much more productive peer conversation in the PRIMM lessons (sometimes this meant less talk, but at a more advanced level).

The PRIMM structure also gave some teachers a greater capacity to talk about the process of teaching programming. It facilitated the discussion of teaching ideas and learning approaches for the teachers, as well as developing language approaches that students used to learn programming concepts.

The research results suggest that learners taught using PRIMM appear to be developing the language skills to talk coherently about their programming. The effectiveness of PRIMM is also evidenced by the number of teachers who have taken up the approach, building in their own activities and in some cases remixing the PRIMM terminology to develop their own take on a language-first approach to teaching programming.

Future research will investigate in detail how PRIMM encourages productive talk in the classroom, and will link the approach to other work on semantic waves. (For more on semantic waves in computing education, see this seminar by Jane Waite and this symposium talk by Paul Curzon.)

Resources for educators who want to try PRIMM

If you would like to try out PRIMM with your learners, use our free support materials:

Join our next seminar

If you missed the seminar, you can find the presentation slides alongside the recording of Sue’s talk on our seminars page.

In our next seminar on Tuesday 1 December at 17:00–18:30 GMT / 12:00–13:30 EsT / 9:00–10:30 PT / 18:00–19:30 CEST. Dr David Weintrop from the University of Maryland will be presenting on the role of block-based programming in computer science education. To join, simply sign up with your name and email address.

Once you’ve signed up, we’ll email you the seminar meeting link and instructions for joining. If you attended this past seminar, the link remains the same.

The post PRIMM: encouraging talk in programming lessons appeared first on Raspberry Pi.

Formative assessment in the computer science classroom

Post Syndicated from Sue Sentance original https://www.raspberrypi.org/blog/research-seminar-formative-assessment-computer-science-classroom/

In computing education research, considerable focus has been put on the design of teaching materials and learning resources, and investigating how young people learn computing concepts. But there has been less focus on assessment, particularly assessment for learning, which is called formative assessment. As classroom teachers are engaged in assessment activities all the time, it’s pretty strange that researchers in the area of computing and computer science in school have not put a lot of focus on this.

Shuchi Grover

That’s why in our most recent seminar, we were delighted to hear about formative assessment — assessment for learning — from Dr Shuchi Grover, of Looking Glass Ventures and Stanford University in the USA. Shuchi has a long track record of work in the learning sciences (called education research in the UK), and her contribution in the area of computational thinking has been hugely influential and widely drawn on in subsequent research.

Two types of assessment

Assessment is typically divided into two types:

  1. Summative assessment (i.e. assessing what has been learned), which typically takes place through examinations, final coursework, projects, etcetera.
  2. Formative assessment (i.e. assessment for learning), which is not aimed at giving grades and typically takes place through questioning, observation, plenary classroom activities, and dialogue with students.

Through formative assessment, teachers seek to find out where students are at, in order to use that information both to direct their preparation for the next teaching activities and to give students useful feedback to help them progress. Formative assessment can be used to surface misconceptions (or alternate conceptions) and for diagnosis of student difficulties.

Venn diagram of how formative assessment practices intersect with teacher knowledge and skills
Click to enlarge

As Shuchi outlined in her talk, a variety of activities can be used for formative assessment, for example:

  • Self- and peer-assessment activities (commonly used in schools).
  • Different forms of questioning and quizzes to support learning (not graded tests).
  • Rubrics and self-explanations (for assessing projects).

A framework for formative assessment

Shuchi described her own research in this topic, including a framework she has developed for formative assessment. This comprises three pillars:

  1. Assessment design.
  2. Teacher or classroom practice.
  3. The role of the community in furthering assessment practice.
Shuchi Grover's framework for formative assessment
Click to enlarge

Shuchi’s presentation then focused on part of the first pillar in the framework: types of assessments, and particularly types of multiple-choice questions that can be automatically marked or graded using software tools. Tools obviously don’t replace teachers, but they can be really useful for providing timely and short-turnaround feedback for students.

As part of formative assessment, carefully chosen questions can also be used to reveal students’ misconceptions about the subject matter — these are called diagnostic questions. Shuchi discussed how in a classroom setting, teachers can employ this kind of question to help them decide what to focus on in future lessons, and to understand their students’ alternate or different conceptions of a topic. 

Formative assessment of programming skills

The remainder of the seminar focused on the formative assessment of programming skills. There are many ways of assessing developing programming skills (see Shuchi’s slides), including Parsons problems, microworlds, hotspot items, rubrics (for artifacts), and multiple-choice questions. As an MCQ example, in the figure below you can see some snippets of block-based code, which students need to read and work out what the outcome of running the snippets will be. 

Click to enlarge

Questions such as this highlight that it’s important for learners to engage in code comprehension and code reading activities when learning to program. This really underlines the fact that such assessment exercises can be used to support learning just as much as to monitor progress.

Formative assessment: our support for teachers

Interestingly, Shuchi commented that in her experience, teachers in the UK are more used to using code reading activities than US teachers. This may be because code comprehension activities are embedded into the curriculum materials and support for pedagogy, both of which the Raspberry Pi Foundation developed as part of the National Centre for Computing Education in England. We explicitly share approaches to teaching programming that incorporate code reading, for example the PRIMM approach. Moreover, our work in the Raspberry Pi Foundation includes the Isaac Computer Science online learning platform for A level computer science students and teachers, which is centered around different types of questions designed as tools for learning.

All these materials are freely available to teachers wherever they are based.

Further work on formative assessment

Based on her work in US classrooms researching this topic, Shuchi’s call to action for teachers was to pay attention to formative assessment in computer science classrooms and to investigate what useful tools can support them to give feedback to students about their learning. 

Advice from Shuchi Grover on how to embed formative assessment in classroom practice
Click to enlarge

Shuchi is currently involved in an NSF-funded research project called CS Assess to further develop formative assessment in computer science via a community of educators. For further reading, there are two chapters related to formative assessment in computer science classrooms in the recently published book Computer Science in K-12 edited by Shuchi.

There was much to take away from this seminar, and we are really grateful to Shuchi for her input and look forward to hearing more about her developing project.

Join our next seminar

If you missed the seminar, you can find the presentation slides and a recording of the Shuchi’s talk on our seminars page.

In our next seminar on Tuesday 3 November at 17:00–18:30 BST / 12:00–13:30 EDT / 9:00–10:30 PT / 18:00–19:30 CEST, I will be presenting my work on PRIMM, particularly focusing on language and talk in programming lessons. To join, simply sign up with your name and email address.

Once you’ve signed up, we’ll email you the seminar meeting link and instructions for joining. If you attended this past seminar, the link remains the same.

The post Formative assessment in the computer science classroom appeared first on Raspberry Pi.

Embedding computational thinking skills in our learning resources

Post Syndicated from Oliver Quinlan original https://www.raspberrypi.org/blog/computational-thinking-skills-in-our-free-learning-resources/

Learning computing is fun, creative, and exploratory. It also involves understanding some powerful ideas about how computers work and gaining key skills for solving problems using computers. These ideas and skills are collected under the umbrella term ‘computational thinking’.

When we create our online learning projects for young people, we think as much about how to get across these powerful computational thinking concepts as we do about making the projects fun and engaging. To help us do this, we have put together a computational thinking framework, which you can read right now.

What is computational thinking? A brief summary

Computational thinking is a set of ideas and skills that people can use to design systems that can be run on a computer. In our view, computational thinking comprises:

  • Decomposition
  • Algorithms
  • Patterns and generalisations
  • Abstraction
  • Evaluation
  • Data

All of these aspects are underpinned by logical thinking, the foundation of computational thinking.

What does computational thinking look like in practice?

In principle, the processes a computer performs can also be carried out by people. (To demonstrate this, computing educators have created a lot of ‘unplugged’ activities in which learners enact processes like computers do.) However, when we implement processes so that they can be run on a computer, we benefit from the huge processing power that computers can marshall to do certain types of activities.

A group of young people and educators smiling while engaging with a computer

Computers need instructions that are designed in very particular ways. Computational thinking includes the set of skills we use to design instructions computers can carry out. This skill set represents the ways we can logically approach problem solving; as computers can only solve problems using logical processes, to write programs that run on a computer, we need to use logical thinking approaches. For example, writing a computer program often requires the task the program revolves around to be broken down into smaller tasks that a computer can work through sequentially or in parallel. This approach, called decomposition, can also help people to think more clearly about computing problems: breaking down a problem into its constituent parts helps us understand the problem better.

Male teacher and male students at a computer

Understanding computational thinking supports people to take advantage of the way computers work to solve problems. Computers can run processes repeatedly and at amazing speeds. They can perform repetitive tasks that take a long time, or they can monitor states until conditions are met before performing a task. While computers sometimes appear to make decisions, they can only select from a range of pre-defined options. Designing systems that involve repetition and selection is another way of using computational thinking in practice.

Our computational thinking framework

Our team has been thinking about our approach to computational thinking for some time, and we have just published the framework we have developed to help us with this. It sets out the key areas of computational thinking, and then breaks these down into themes and learning objectives, which we build into our online projects and learning resources.

To develop this computational thinking framework, we worked with a group of academics and educators to make sure it is robust and useful for teaching and learning. The framework was also influenced by work from organisations such as Computing At School (CAS) in the UK, and the Computer Science Teachers’ Association (CSTA) in the USA.

We’ve been using the computational thinking framework to help us make sure we are building opportunities to learn about computational thinking into our learning resources. This framework is a first iteration, which we will review and revise based on experience and feedback.

We’re always keen to hear feedback from you in the community about how we shape our learning resources, so do let us know what you think about them and the framework in the comments.

The post Embedding computational thinking skills in our learning resources appeared first on Raspberry Pi.

How is computing taught in schools around the world?

Post Syndicated from Sue Sentance original https://www.raspberrypi.org/blog/international-computing-curriculum-metrecc-research-seminar/

Around the world, formal education systems are bringing computing knowledge to learners. But what exactly is set down in different countries’ computing curricula, and what are classroom educators teaching? This was the topic of the first in the autumn series of our Raspberry Pi research seminars on Tuesday 8 September.

A glowing globe floating above an open hand in the dark

We heard from an international team (Monica McGill , USA; Rebecca Vivian, Australia; Elizabeth Cole, Scotland) who represented a group of researchers also based in England, Malta, Ireland, and Italy. As a researcher working at the Raspberry Pi Foundation, I myself was part of this research group. The group developed METRECC, a comprehensive and validated survey tool that can be used to benchmark and measure developments of the teaching and learning of computing in formal education systems around the world. Monica, Rebecca, and Elizabeth presented how the research group developed and validated the METRECC tool, and shared some findings from their pilot study.

What’s in a curriculum? Developing a survey tool

Those of us who work or have worked in school education use the word ‘curriculum’ frequently, although it’s an example of education terminology that means different things in different contexts, and to different people. Following Porter and Smithson (2001)1, we can distinguish between the intended curriculum and the enacted curriculum:

  • Intended curriculum: Policy tools as curriculum standards, frameworks, or guidelines that outline the curriculum teachers are expected to deliver.
  • Enacted curriculum: Actual curricular content in which students engage in the classroom, and adopted pedagogical approaches; for computer science (CS) curricula, this also includes students’ use of technology, physical computing devices, and tools in CS lessons.

To compare the intended and enacted computing curriculum in as many countries as possible, at particular points in time, the research group Monica, Rebecca, Elizabeth, and I were part of developed the METRECC survey tool.

A classroom of students in North America

METRECC stands for MEasuring TeacheREnacted Computing Curriculum. The METRECC survey has 11 categories of questions and is designed to be completed by computing teachers within 35–40 minutes. Following best practice in research, which calls for standardised research instruments, the research group ensured that the survey produces valid, reliable results (meaning that it works as intended) before using it to gather data.

Using METRECC in a pilot study

In their pilot study, the research group gathered data from 7 countries. The intended curriculum for each country was determined by examining standards and policies in place for each country/state under consideration. Teachers’ answers in the METRECC survey provided the countries’ enacted curricula. (The complete dataset from the pilot study is publicly available at csedresearch.org, a very useful site for CS education researchers where many surveys are shared.)

Two girls coding at a computer under supervision of a female teacher

The researchers then mapped the intended to the enacted curricula to find out whether teachers were actually teaching the topics that were prescribed for them. Overall, the results of the mapping showed that there was a good match between intended and enacted curricula. Examples of mismatches include lower numbers of primary school teachers reporting that they taught visual or symbolic programming, even though the topic did appear on their curriculum.

A table listing computer science topics
This table shows computer science topic the METRECC tool asks teachers about, and what percentage of respondents in the pilot study stated that they teach these to their students.

Another aspect of the METRECC survey allows to measure teachers’ confidence, self-efficacy, and self-esteem. The results of the pilot study showed a relationship between years of experience and CS self-esteem; in particular, after four years of teaching, teachers started to report high self-esteem in relation to computer science. Moreover, primary teachers reported significantly lower self-esteem than secondary teachers did, and female teachers reported lower self-esteem than male teachers did.

Adapting the survey’s language

The METRECC survey has also been used in South Asia, namely Bangladesh, Nepal, Pakistan, and Sri Lanka (where computing is taught under ICT). Amongst other things, what the researchers learned from that study was that some of the survey questions needed to be adapted to be relevant to these countries. For example, while in the UK we use the word ‘gifted’ to mean ‘high-attaining’, in the South Asian countries involved in the study, to be ‘gifted’ means having special needs.

Two girls coding at a computer under supervision of a female teacher

The study highlighted how important it is to ensure that surveys intended for an international audience use terminology and references that are pertinent to many countries, or that the survey language is adapted in order to make sense in each context it is delivered. 

Let’s keep this monitoring of computing education moving forward!

The seminar presentation was well received, and because we now hold our seminars for 90 minutes instead of an hour, we had more time for questions and answers.

My three main take-aways from the seminar were:

1. International collaboration is key

It is very valuable to be able to form international working groups of researchers collaborating on a common project; we have so much to learn from each other. Our Raspberry Pi research seminars attract educators and researchers from many different parts of the world, and we can truly push the field’s understanding forward when we listen to experiences and lessons of people from diverse contexts and cultures.

2. Making research data publicly available

Increasingly, it is expected that research datasets are made available in publicly accessible repositories. While this is becoming the norm in healthcare and scientific, it’s not yet as prevalent in computing education research. It was great to be able to publicly share the dataset from the METRECC pilot study, and we encourage other researchers in this field to do the same. 

3. Extending the global scope of this research

Finally, this work is only just beginning. Over the last decade, there has been an increasing move towards teaching aspects of computer science in school in many countries around the world, and being able to measure change and progress is important. Only a handful of countries were involved in the pilot study, and it would be great to see this research extend to more countries, with larger numbers of teachers involved, so that we can really understand the global picture of formal computing education. Budding research students, take heed!

Next up in our seminar series

If you missed the seminar, you can find the presentation slides and a recording of the researchers’ talk on our seminars page.

In our next seminar on Tuesday 6 October at 17:00–18:30 BST / 12:00–13:30 EDT / 9:00–10:30 PT / 18:00–19:30 CEST, we’ll welcome Shuchi Grover, a prominent researcher in the area of computational thinking and formative assessment. The title of Shuchi’s seminar is Assessments to improve student learning in introductory CS classrooms. To join, simply sign up with your name and email address.

Once you’ve signed up, we’ll email you the seminar meeting link and instructions for joining. If you attended this past seminar, the link remains the same.


1. Andrew C. Porter and John L. Smithson. 2001. Defining, Developing and Using Curriculum Indicators. CPRE Research Reports, 12-2001. (2001)

The post How is computing taught in schools around the world? appeared first on Raspberry Pi.