All posts by Thibault Meunier

Serving Cloudflare Pages sites to the IPFS network

Post Syndicated from Thibault Meunier original https://blog.cloudflare.com/cloudflare-pages-on-ipfs/

Four years ago, Cloudflare went Interplanetary by offering a gateway to the IPFS network. This meant that if you hosted content on IPFS, we offered to make it available to every user of the Internet through HTTPS and with Cloudflare protection. IPFS allows you to choose a storage provider you are comfortable with, while providing a standard interface for Cloudflare to serve this data.

Since then, businesses have new tools to streamline web development. Cloudflare Workers, Pages, and R2 are enabling developers to bring services online in a matter of minutes, with built-in scaling, security, and analytics.

Today, we’re announcing we’re bridging the two. We will make it possible for our customers to serve their sites on the IPFS network.

In this post, we’ll learn how you will be able to build your website with Cloudflare Pages, and leverage the IPFS integration to make your content accessible and available across multiple providers.

A primer on IPFS

The InterPlanetary FileSystem (IPFS) is a peer-to-peer network for storing content on a distributed file system. It is composed of a set of computers called nodes that store and relay content using a common addressing system. In short, a set of participants agree to maintain a shared index of content the network can provide, and where to find it.

Let’s take two random participants in the network: Alice, a cat person, and Bob, a dog person.

As a participant in the network, Alice keeps connections with a subset of participants, referred to as her peers. When Alice makes her content 🐱 available on IPFS, she announces to her peers that she has 🐱, and these peers record in their routing tables that 🐱 is provided by Alice’s node.

Each node has a routing table, and a datastore. The routing table retains a mapping of content to peers, and the datastore retains the content a given node stores. In the above case, only Alice has content, a 🐱.

When Bob wants to retrieve 🐱, he tells his peers he wants 🐱. These peers point him to Alice. Alice then provides 🐱 to Bob. Bob can verify 🐱 is the content he was looking for, because the content identifier he requested is derived from the 🐱 content itself, using a secure, cryptographic hash function.
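
To make the exchange between Alice and Bob concrete, here is a minimal sketch of nodes with a routing table and a datastore. It is a toy model, not how IPFS is implemented: the `Node` class, the `content_id` helper, and the extra peer Carol are assumptions made for illustration, and a plain SHA-256 hex digest stands in for a real CID.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Toy content identifier: hex SHA-256 of the bytes (not a real CID)."""
    return hashlib.sha256(data).hexdigest()


class Node:
    """Hypothetical node holding a routing table and a datastore."""

    def __init__(self, name):
        self.name = name
        self.peers = []          # nodes this node keeps a connection with
        self.routing_table = {}  # content id -> node providing it
        self.datastore = {}      # content id -> content bytes

    def provide(self, data: bytes) -> str:
        """Store content locally and announce it to every peer."""
        cid = content_id(data)
        self.datastore[cid] = data
        for peer in self.peers:
            peer.routing_table[cid] = self
        return cid

    def retrieve(self, cid: str) -> bytes:
        """Ask peers who provides `cid`, fetch it, and verify it."""
        for peer in self.peers:
            provider = peer.routing_table.get(cid)
            if provider is None:
                continue
            data = provider.datastore[cid]
            # The identifier is derived from the content, so it can be re-checked.
            if content_id(data) != cid:
                raise ValueError("content does not match requested identifier")
            return data
        raise KeyError(cid)


alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
alice.peers = [carol]
bob.peers = [carol]

cat = b"cat picture bytes"
cat_id = alice.provide(cat)   # Carol now knows Alice provides the cat
fetched = bob.retrieve(cat_id)  # Bob asks Carol, then fetches from Alice
```

Because the identifier is recomputed on receipt, Bob does not need to trust Alice or Carol: content that was swapped along the way fails the check.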

Even better, if Bob becomes a cat person, he can announce to his peers that he is also providing a cat. Bob’s love for cats could be genuine, or he could have an interest in providing the content, such as a contract with Alice. IPFS provides a common ground to share content, without being opinionated on how this content has to be stored or what guarantees come with it.

How Pages websites are made available on IPFS

Content is made available as follows.

The components are:

  • Pages storage: Storage solution for Cloudflare Pages.
  • IPFS Index Proxy: Service maintaining a map between IPFS CID and location of the data. This is operated on Cloudflare Workers and using Workers KV to store the mapping.
  • IPFS node: Cloudflare-hosted IPFS node serving Pages content. It has an in-house datastore module, able to communicate with the IPFS Index Proxy.
  • IPFS network: The rest of the IPFS network.

When you opt in to serving your Cloudflare Pages site on IPFS, a call is made to the IPFS Index Proxy. This call first fetches your Pages content and computes its CID. It then populates IndexDB to associate the CID with the content, and notifies the Cloudflare IPFS nodes that they are able to provide the CID.

For example, imagine your website structure is as follows:

  • /
    • index.html
    • static/
      • cats.txt
      • beautiful_cats.txt

To provide this website on IPFS, Cloudflare has to compute a CID for /. CIDs are built recursively. To compute the CID for a given folder /, one needs to have the CID of index.html and static/. index.html CID is derived from its binary representation, and static/ from cats.txt and beautiful_cats.txt.

This works similarly to a Merkle tree, except nodes can reference each other as long as they still form a Directed Acyclic Graph (DAG). This structure is therefore referred to as a MerkleDAG.
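
The recursive construction can be sketched in a few lines. This is a simplified illustration under loud assumptions: real IPFS CIDs are built from UnixFS nodes and multihashes, whereas the hypothetical `file_id` and `dir_id` helpers below just hash bytes and sorted `name=identifier` pairs with SHA-256.

```python
import hashlib


def file_id(data: bytes) -> str:
    """Toy identifier of a file, derived from its binary representation."""
    return hashlib.sha256(b"file:" + data).hexdigest()


def dir_id(entries: dict) -> str:
    """Toy identifier of a directory, derived from its children's identifiers.

    `entries` maps a name to bytes (a file) or to a nested dict (a directory).
    """
    digest = hashlib.sha256(b"dir:")
    for name in sorted(entries):  # sort so insertion order does not matter
        child = entries[name]
        child_id = dir_id(child) if isinstance(child, dict) else file_id(child)
        digest.update(f"{name}={child_id};".encode("utf-8"))
    return digest.hexdigest()


site = {
    "index.html": b"<h1>Cats</h1>",
    "static": {
        "cats.txt": b"cats",
        "beautiful_cats.txt": b"beautiful cats",
    },
}
root_id = dir_id(site)

# Changing any leaf changes every identifier on the path up to the root.
edited = {**site, "static": {**site["static"], "cats.txt": b"dogs"}}
edited_root_id = dir_id(edited)
```

The root identifier therefore commits to the whole site: two deployments share a root only if every file and folder in them is identical.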

In our example, it’s not unlikely for cats.txt and beautiful_cats.txt to have data in common. In this case, Cloudflare can be smart in the way it builds the MerkleDAG.

This reduces the storage and bandwidth requirement when accessing the website on IPFS.
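
One way to picture this saving is chunk-level deduplication. The sketch below is an assumption for illustration only: the post does not say how Cloudflare splits files, so the fixed `CHUNK_SIZE`, the `add_file` and `read_file` helpers, and the sample file contents are all hypothetical.

```python
import hashlib

CHUNK_SIZE = 8  # deliberately tiny so the example is easy to follow


def add_file(store: dict, data: bytes) -> list:
    """Split data into fixed-size chunks, store each chunk once, return chunk ids."""
    ids = []
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        chunk_id = hashlib.sha256(chunk).hexdigest()
        store[chunk_id] = chunk  # identical chunks land on the same key
        ids.append(chunk_id)
    return ids


def read_file(store: dict, ids: list) -> bytes:
    """Reassemble a file from its ordered list of chunk ids."""
    return b"".join(store[chunk_id] for chunk_id in ids)


store = {}
cats = add_file(store, b"tabby...siamese.")                    # 2 chunks
beautiful_cats = add_file(store, b"tabby...siamese.sphynx..")  # 3 chunks
```

The two files reference five chunks in total, but the store only holds three, because the first two chunks are shared.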

If you want to learn more about how you can model a file system on IPFS, you can check the UnixFS specification.

Cloudflare stores every CID and the content it references in IndexDB. This allows Cloudflare IPFS nodes to serve Cloudflare Pages assets when requested.

Let’s put this together

Let’s take an example: pages-on-ipfs.com is hosted on IPFS. It is built using Hugo, a static site generator, and Cloudflare Pages with the public documentation template. Its source is available on GitHub. If you have an IPFS compatible client, you can access it at ipns://pages-on-ipfs.com as well.

1. Read Cloudflare Pages deployment documentation

For the purpose of this blog, I want to create a simple Cloudflare Pages website. I have experience with Hugo, so I choose it as my framework for the project.

I type “cloudflare pages” in the search bar of my web browser, then head to Read the docs > Framework Guide > Deploy a Hugo site.

2. Create a site

This is where I use Hugo, and your configuration might differ. In short, I use hugo new site pages-on-ipfs, create an index and two static resources, et voilà. The result is available on the source GitHub for this project.

3. Deploy using Cloudflare Pages

On the Cloudflare Dashboard, I go to Account Home > Pages > Create a project. I select the GitHub repository I created, and configure the build tool to build a Hugo website. Basically, I follow what’s written in the Cloudflare Pages documentation.

Upon clicking Save and Deploy, my website is deployed on pages-on-ipfs.pages.dev. I also configure it to be available at pages-on-ipfs.com.

4. Serve my content on IPFS

First, I opt my zone in to the Cloudflare Pages integration with IPFS. This feature is not yet available for everyone to try out.

This allows Cloudflare to index the content of my website. Once indexed, I get the CID for my deployment baf…1. I can check that my content is available at this hash on IPFS using an IPFS gateway https://baf…1.ipfs.cf-ipfs.com/.

5. Make my IPFS website available at pages-on-ipfs.com

Ideally, one domain name gives access to both the Cloudflare Pages version and the IPFS version of the site, depending on whether the client supports IPFS. Fortunately, the IPFS ecosystem supports such a feature via DNSLink. DNSLink is a way to specify the IPFS content a domain is associated with.

For pages-on-ipfs.com, I create a TXT record on _dnslink.pages-on-ipfs.com with value dnslink=/ipfs/baf…1. Et voilà. I can now access ipns://pages-on-ipfs.com via an IPFS client.
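
As a rough sketch of what a client does with such a record, the function below parses a DNSLink TXT value into a namespace and an identifier. It performs no DNS lookup, and `parse_dnslink` and `EXAMPLE_CID` are hypothetical names; the CID is a made-up placeholder, since the real deployment CID is elided above.

```python
EXAMPLE_CID = "bafyexampleplaceholder"  # placeholder, not a real CID


def parse_dnslink(txt_value: str) -> tuple:
    """Return (namespace, identifier) from a DNSLink TXT record value."""
    prefix = "dnslink="
    if not txt_value.startswith(prefix):
        raise ValueError("not a DNSLink record")
    # A valid path looks like "/ipfs/<cid>" or "/ipns/<name>",
    # optionally followed by a sub-path.
    parts = txt_value[len(prefix):].split("/")
    if len(parts) < 3 or parts[0] != "" or not parts[2]:
        raise ValueError("malformed DNSLink path")
    if parts[1] not in ("ipfs", "ipns"):
        raise ValueError("unsupported namespace: " + parts[1])
    return parts[1], parts[2]


namespace, identifier = parse_dnslink("dnslink=/ipfs/" + EXAMPLE_CID)
```

A domain can hold other TXT records, so a client would skip values that do not start with `dnslink=` instead of treating them as errors.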

6. (Optional) Replicate my website

The content of my website can now easily be replicated and pinned by other IPFS nodes. This can either be done at home via an IPFS client or using a pinning service such as Pinata.

What’s next

We’ll make this service available later this year as it is being refined. We are committed to making information move freely and helping build a better Internet. Cloudflare Pages continues its work of solving developer problems, as developers are now able to make their sites accessible to more users.

Over the years, IPFS has been used by more and more people. While Cloudflare started by making IPFS content available to web users through an HTTP interface, we now think it’s time to give back. Allowing Cloudflare assets to be served over a public distributed network extends developers’ and users’ capabilities on an open web.

Common questions

  • I am already hosting my website on IPFS. Can I pin it using Cloudflare?
    • No. This project aims at serving Cloudflare hosted content via IPFS. We are still investigating how to best replicate and re-provide already-existing IPFS content via Cloudflare infrastructure.
  • Does this make IPFS more centralized?
    • No. Cloudflare does not have the authority to decide who can be a node operator nor what content they provide.
  • Are there guarantees the content is going to be available?
    • Yes. As long as you choose Cloudflare to host your website on IPFS, it will be available on IPFS. Should you move to another provider, it would be your responsibility to make sure the content remains available. IPFS allows for this transition to be smooth using a pinning service.
  • Is IPFS private?
    • It depends. Generally, no. IPFS is a p2p protocol. The nodes you peer with and request content from would know the content you are looking for. The privacy depends on the trust you have in your peers to not snoop on the data you request.
  • Can users verify the integrity of my website?
    • Yes. They need to access your website through an IPFS compatible client. Ideally, IPFS content integrity would be turned into a web standard, similar to subresource integrity.

Web3 — A vision for a decentralized web

Post Syndicated from Thibault Meunier original https://blog.cloudflare.com/what-is-web3/

By reading this, you are a participant of the web. It’s amazing that we can write this blog and have it appear to you without operating a server or writing a line of code. In general, the web of today empowers us to participate more than we could at any point in the past.

Last year, we mentioned the next phase of the Internet would be always on, always secure, always private. Today, we dig into a similar trend for the web, referred to as Web3. In this blog we’ll start to explain Web3 in the context of the web’s evolution, and how Cloudflare might help to support it.

Going from Web 1.0 to Web 2.0

When Sir Tim Berners-Lee wrote his seminal 1989 document “Information Management: A Proposal”, he outlined a vision of the “web” as a network of information systems interconnected via hypertext links. The web is often conflated with the Internet, which is the computer network it operates on. Key practical requirements for this web included being able to access the network in a decentralized manner through remote machines and allowing systems to be linked together without requiring any central control or coordination.

The original proposal for what we know as the web, fitting in one diagram – Source: w3

This vision materialized into an initial version of the web that was composed of interconnected static resources delivered via a distributed network of servers and accessed primarily on a read-only basis from the client side — “Web 1.0”. Usage of the web soared with the number of websites growing well over 1,000% in the ~2 years following the introduction of the Mosaic graphical browser in 1993, based on data from the World Wide Web Wanderer.

The early 2000s marked an inflection point in the growth of the web and a key period of its development, as technology companies that survived the dot-com crash evolved to deliver value to customers in new ways amidst heightened skepticism around the web:

  • Desktop browsers like Netscape became commoditized and paved the way for native web services for discovering content like search engines.
  • Network effects that were initially driven by hyperlinks in web directories like Yahoo! were hyperscaled by platforms that enabled user engagement and harnessed collective intelligence like review sites.
  • The massive volume of data generated by Internet activity and the growing realization of its competitive value forced companies to become experts at database management.

O’Reilly Media coined the concept of Web 2.0 in an attempt to capture such shifts in design principles, which were transformative to the usability and interactiveness of the web and continue to be core building blocks for Internet companies nearly two decades later.

However, in the midst of the web 2.0 transformation, the web fell out of touch with one of its initial core tenets — decentralization.

Decentralization: No permission is needed from a central authority to post anything on the web, there is no central controlling node, and so no single point of failure … and no “kill switch”!
— History of the web by Web Foundation

A new paradigm for the Internet

This is where Web3 comes in. The last two decades have proven that building a scalable system that decentralizes content is a challenge. While the technology to build such systems exists, no content platform achieves decentralization at scale.

There is one notable exception: Bitcoin. Bitcoin was conceptualized in a 2008 whitepaper by Satoshi Nakamoto as a type of distributed ledger known as a blockchain designed so that a peer-to-peer (P2P) network could transact in a public, consistent, and tamper-proof manner.

That’s a lot said in one sentence. Let’s break it down by term:

  • A peer-to-peer network is a network architecture. It consists of a set of computers, called nodes, that store and relay information. Each node is equally privileged, preventing one node from becoming a single point of failure. In the Bitcoin case, nodes can send, receive, and process Bitcoin transactions.
  • A ledger is a collection of accounts in which transactions are recorded. For Bitcoin, the ledger records Bitcoin transactions.
  • A distributed ledger is a ledger that is shared and synchronized among multiple computers. This happens through a consensus, so each computer holds a similar replica of the ledger. With Bitcoin, the consensus process is performed over a P2P network, the Bitcoin network.
  • A blockchain is a type of distributed ledger that stores data in “blocks” that are cryptographically linked together into an immutable chain that preserves their chronological order. Bitcoin leverages blockchain technology to establish a shared, single source of truth of transactions and the sequence in which they occurred, thereby mitigating the double-spending problem.
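
The “cryptographically linked” part of the last definition can be sketched as a hash chain. This is only an illustration of the linking, with no consensus, mining, or signatures; `block_hash`, `append_block`, and `is_valid` are hypothetical helpers, not Bitcoin’s actual block format.

```python
import copy
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's full contents, including the link to its predecessor."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, transactions: list) -> None:
    """Add a block that commits to the hash of the previous block."""
    previous = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"previous": previous, "transactions": transactions})


def is_valid(chain: list) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    return all(
        chain[i]["previous"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
append_block(chain, ["alice pays bob 1"])
append_block(chain, ["bob pays carol 1"])
append_block(chain, ["carol pays dave 1"])

# Rewriting history in an early block breaks the link held by the next block.
tampered = copy.deepcopy(chain)
tampered[0]["transactions"] = ["alice pays mallory 1"]
```

Here `is_valid(chain)` holds while `is_valid(tampered)` does not, which is what makes the chronological order tamper-evident.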

Bitcoin — which currently has over 40,000 nodes in its network and processes over $30B in transactions each day — demonstrates that an application can be run in a distributed manner at scale, without compromising security. It inspired the development of other blockchain projects such as Ethereum which, in addition to transactions, allows participants to deploy code that can verifiably run on each of its nodes.

Today, these programmable blockchains are seen as ideal open and trustless platforms to serve as the infrastructure of a distributed Internet. They are home to a rich and growing ecosystem of nearly 7,000 decentralized applications (“Dapps”) that do not rely on any single entity to be available. This provides them with greater flexibility on how to best serve their users in all jurisdictions.

The web is for the end user

Distributed systems are inherently different from centralized systems. They should not be thought about in the same way. Distributed systems enable the data and its processing to not be held by a single party. This is useful for companies to provide resilience, but it’s also useful for P2P-based networks where data can stay in the hands of the participants.

For instance, if you were to host a blog the old-fashioned way, you would put up a server, expose it to the Internet (via Cloudflare 😀), et voilà. Nowadays, your blog would be hosted on a platform like WordPress, Ghost, Notion, or even Twitter. If one of these companies were to have an outage, it would affect a lot more people. In a distributed fashion, via IPFS for instance, your blog content can be hosted and served from multiple locations operated by different entities.

Figures: Web 1.0, Web 2.0, and Web3

Each participant in the network can choose what they host/provide and can be home to different content. Similar to your home network, you are in control of what you share, and you don’t share everything.

This is a core tenet of decentralized identity. The same cryptographic principles underpinning cryptocurrencies like Bitcoin and Ethereum are being leveraged by applications to provide secure, cross-platform identity services. This is fundamentally different from other authentication systems such as OAuth 2.0, where a trusted party has to be reached to assess one’s identity. This materializes in the form of “Login with <Big Cloud provider>” buttons. These cloud providers are the only ones with enough data, resources, and technical expertise.

In a decentralised web, each participant holds a secret key. They can then use it to identify each other. You can learn about this cryptographic system in a previous blog. In a Web3 setting where web participants own their data, they can selectively share these data with applications they interact with. Participants can also leverage this system to prove interactions they had with one another. For example, if a college issues you a Decentralized Identifier (DID), you can later prove you have been registered at this college without reaching out to the college again. Decentralized Identities can also serve as a placeholder for a public profile, where participants agree to use a blockchain as a source of trust. This is what projects such as ENS or Unlock aim to provide: a way to verify your identity online based on your control over a public key.

This trend of proving ownership via a shared source of trust is key to the NFT craze. We have discussed NFTs before on this blog. Blockchain-based NFTs are a medium of conveying ownership. Blockchain enables this information to be publicly verified and updated. If the blockchain states a public key I control is the owner of an NFT, I can refer to it on other platforms to prove ownership of it. For instance, if my profile picture on social media is a cat, I can prove the said cat is associated with my public key. What this means depends on what I want to prove, especially with the proliferation of NFT contracts. If you want to understand how an NFT contract works, you can build your own.

How does Cloudflare fit in Web3?

Decentralization and privacy are challenges we are tackling at Cloudflare as part of our mission to help build a better Internet.

In a previous post, Nick Sullivan described Cloudflare’s contributions to enabling privacy on the web. We launched initiatives to fix information leaks in HTTPS through Encrypted Client Hello (ECH), make DNS even more private by supporting Oblivious DNS-over-HTTPS (ODoH), and develop OPAQUE which makes password breaches less likely to occur. We have also released our data localization suite to help businesses navigate the ever evolving regulatory landscape by giving them control over where their data is stored without compromising performance and security. We’ve even built a privacy-preserving attestation that is based on the same zero-knowledge proof techniques that are core to distributed systems such as ZCash and Filecoin.

It’s exciting to think that there are already ways we can change the web to improve the experience for its users. However, there are some limitations to building on top of the existing infrastructure. This is why projects such as Ethereum and IPFS build on their own architecture. They still rely on the Internet but do not operate with the web as we know it. To ease the transition, Cloudflare operates distributed web gateways. These gateways provide an HTTP interface to Web3 protocols: Ethereum and IPFS. Since HTTP is core to the web we know today, distributed content can be accessed securely and easily without requiring the user to operate experimental software.

Where do we go next?

The journey to a different web is long but exciting. The infrastructure built over the last two decades is truly stunning. The Internet and the web are now part of 4.6 billion people’s lives. At the same time, the top 35 websites had more visits than all others (circa 2014). Users have less control over their data and are even more reliant on a few players.

The early Web was static. Then Web 2.0 came to provide interactiveness and service we use daily at the cost of centralisation. Web3 is a trend that tries to challenge this. With distributed networks built on open protocols, users of the web are empowered to participate.

At Cloudflare, we are embracing this distributed future. Applying the knowledge and experience we have gained from running one of the largest edge networks, we are making it easier for users and businesses to benefit from Web3. This includes operating a distributed web product suite, contributing to open standards, and moving privacy forward.

If you would like to help build a better web with us, we are hiring.

How Cloudflare provides tools to help keep IPFS users safe

Post Syndicated from Thibault Meunier original https://blog.cloudflare.com/cloudflare-ipfs-safe-mode/

Cloudflare’s journey with IPFS started in 2018 when we announced a public gateway for the distributed web. Since then, the number of infrastructure providers for the InterPlanetary FileSystem (IPFS) has grown and matured substantially. This is a huge benefit for users and application developers as they have the ability to choose their infrastructure providers.

Today, we’re excited to announce new secure filtering capabilities in IPFS. The Cloudflare IPFS module is a tool to protect users from threats like phishing and ransomware. We believe that other participants in the network should have the same ability. We are releasing that software as open source, for the benefit of the entire community.

Its code is available on github.com/cloudflare/go-ipfs. To understand how we built it and how to use it, read on.

A brief introduction on IPFS content retrieval

Before we get to understand how IPFS filtering works, we need to dive a little deeper into the operation of an IPFS node.

The InterPlanetary FileSystem (IPFS) is a peer-to-peer network for storing content on a distributed file system. It is composed of a set of computers called nodes that store and relay content using a common addressing system.

Nodes communicate with each other over the Internet using a Peer-to-Peer (P2P) architecture, preventing one node from becoming a single point of failure. This is even more true given that anyone can operate a node with limited resources. This can be light hardware such as a Raspberry Pi, a server at a cloud provider, or even your web browser.

This creates a challenge since not all nodes may support the same protocols, and networks may block some types of connections. For instance, your web browser does not expose a TCP API and your home router likely doesn’t allow inbound connections. This is where libp2p comes to help.

libp2p is a modular system of protocols, specifications, and libraries that enable the development of peer-to-peer network applications – libp2p documentation

That’s exactly what IPFS nodes need to connect to the IPFS network. From a node’s point of view, the architecture is the following:

Any node that we maintain a connection with is a peer. A peer that does not have 🐱 content can tell their peers, including you, that they WANT 🐱. If you do have it, you will provide the 🐱 to them. If you don’t have it, you can give them information about the network to help them find someone who might have it. As each node chooses the resources it stores, some content might be stored on a limited number of nodes.

For instance, everyone likes 🐱, so many nodes will dedicate resources to store it. However, 🐶 is less popular. Therefore, only a few nodes will provide it.

This assumption does not hold for public gateways like Cloudflare. A gateway is an HTTP interface to an IPFS node. On our gateway, we allow a user of the Internet to retrieve arbitrary content from IPFS. If a user asks for 🐱, we provide 🐱. If they ask for 🐶, we’ll find 🐶 for them.

Cloudflare’s IPFS gateway is simply a cache in front of IPFS. Cloudflare does not have the ability to modify or remove content from the IPFS network. However, IPFS is a decentralized and open network, so there is the possibility of users sharing threats like phishing or malware. This is content we do not want to provide to the P2P network or to our HTTP users.

In the next section, we describe how an IPFS node can protect its users from such threats.

If you would like to learn more about the inner workings of libp2p, you can go to ProtoSchool which has a great tutorial about it.

How IPFS filtering works

As we described earlier, an IPFS node provides content in two ways: to its peers through the IPFS P2P network and to its users via an HTTP gateway.

Filtering content of the HTTP interface is no different from the current protection Cloudflare already has in place. If 🐶 is considered malicious and is available at cloudflare-ipfs.com/ipfs/🐶, we can filter these requests, so the end user is kept safe.

The P2P layer is different. We cannot filter URLs because that’s not how the content is requested. IPFS is content-addressed. This means that instead of asking for a specific location such as cloudflare-ipfs.com/ipfs/🐶, peers request the content directly using its Content IDentifiers (CID), 🐶.

More precisely, 🐶 is an abstraction of the content address. A CID looks like QmXnnyufdzAWL5CqZ2RnSNgPbvCc1ALT73s6epPrRnZ1Xy (QmXnnyufdzAWL5CqZ2RnSNgPbvCc1ALT73s6epPrRnZ1Xy happens to be the hash of a .txt file containing the string “I’m trying out IPFS”). CID is a convenient way to refer to content in a cryptographically verifiable manner.

This is great, because it means that when peers ask for malicious 🐶 content, we can prevent our node from serving it. This includes both the P2P layer and the HTTP gateway.

In addition, the way IPFS works means content can easily be reused. For a directory, for instance, the address is a CID based on the CIDs of its files. This way, a file can be shared across multiple directories, and still be referred to by the same CID. It allows IPFS nodes to efficiently store content without duplicating it. This can be used to share Docker container layers, for example.

In the filtering use case, it means that if 🐶 content is included in other IPFS content, our node can also prevent content linking to malicious 🐶 content from being served. This results in 😿, a mix of valid and malicious content.
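
The idea of blocking both a CID and anything that links to it can be sketched as a walk over the DAG. This is not how IPFS Safemode is implemented; `is_blocked`, the `links` map, and the plain-text labels standing in for CIDs are assumptions made for illustration.

```python
def is_blocked(cid: str, links: dict, denylist: set) -> bool:
    """Return True if `cid` is denied or links, directly or not, to a denied CID.

    `links` maps a CID to the CIDs it references. The structure is a DAG,
    so the recursion always terminates.
    """
    if cid in denylist:
        return True
    return any(is_blocked(child, links, denylist) for child in links.get(cid, ()))


# "mixed" plays the role of the directory that contains both cat and dog content.
links = {
    "mixed": ["cat", "dog"],
    "cat": [],
    "dog": [],
}
denylist = {"dog"}

verdicts = {cid: is_blocked(cid, links, denylist) for cid in links}
```

Only `cat` remains servable: `dog` is denied outright and `mixed` is refused because it references `dog`. A node with many entries would likely cache these verdicts instead of re-walking the graph on each request.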

This cryptographic method of linking content together is known as MerkleDAG. You can learn more about it on ProtoSchool, and Consensys did an article explaining the basic cryptographic construction with bananas 🍌.

How to use IPFS secure filtering

By now, you should have an understanding of how an IPFS node retrieves and provides content, as well as how we can protect peers and users of shared nodes from accessing threats. Using this knowledge, Cloudflare went on to implement IPFS Safemode, a node protection layer on top of go-ipfs. It is up to every node operator to build their own list of threats to be blocked based on their policy.

To use it, we are going to follow the instructions available in the cloudflare/go-ipfs repository.

First, you need to clone the git repository:

git clone https://github.com/cloudflare/go-ipfs.git
cd go-ipfs/

Then, you have to check out the commit where IPFS safemode is implemented. This version is based on v0.9.1 of go-ipfs.

git checkout v0.9.1-safemode

Now that you have the source code on your machine, we need to build the IPFS client from source.

make build

Et voilà. You are ready to use your IPFS node, with safemode capabilities.

# alias ipfs command to make it easier to use
alias ipfs='./cmd/ipfs/ipfs'
# run an ipfs daemon
ipfs daemon &
# understand how to use IPFS safemode
ipfs safemode --help
USAGE
ipfs safemode - Interact with IPFS Safemode to prevent certain CIDs from being provided.
...

Going further

IPFS nodes run in a diverse set of environments and are operated by parties at various scales. The same software has to accommodate configurations in which it is accessed by a single user, and others where it is shared by thousands of participants.

At Cloudflare, we believe that decentralization is going to be the next major step for content networks, but there is still work to be done to get these technologies in the hands of everyone. Content filtering is part of this story. If the community aims at embedding a P2P node in every computer, there need to be ways to prevent nodes from serving harmful content. Users need to be able to give consent on the content they are willing to serve, and the content they aren’t.

By providing an IPFS safemode tool, we hope to make this protection more widely available.

A Name Resolver for the Distributed Web

Post Syndicated from Thibault Meunier original https://blog.cloudflare.com/cloudflare-distributed-web-resolver/

The Domain Name System (DNS) matches names to resources. Instead of typing 104.18.26.46 to access the Cloudflare Blog, you type blog.cloudflare.com and, using DNS, the domain name resolves to 104.18.26.46, the Cloudflare Blog IP address.

Similarly, distributed systems such as Ethereum and IPFS rely on a naming system to be usable. DNS could be used, but its resolvers’ attributes run contrary to properties valued in distributed Web (dWeb) systems. Namely, dWeb resolvers ideally (i) provide locally verifiable data, (ii) have built-in history, and (iii) have no single trust anchor.

At Cloudflare Research, we have been exploring alternative ways to resolve queries to responses that align with these attributes. We are proud to announce a new resolver for the Distributed Web, where IPFS content indexed by the Ethereum Name Service (ENS) can be accessed.

To discover how it has been built, and how you can use it today, read on.

Welcome to the Distributed Web

IPFS and its addressing system

The InterPlanetary FileSystem (IPFS) is a peer-to-peer network for storing content on a distributed file system. It is composed of a set of computers called nodes that store and relay content using a common addressing system.

This addressing system relies on the use of Content IDentifiers (CID). CIDs are self-describing identifiers, because the identifier is derived from the content itself. For example, QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco is the CID version 0 (CIDv0) of the Wikipedia-on-IPFS homepage.

To understand why a CID is defined as self-describing, we can look at its binary representation. For QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco, the binary representation is made of three fields.

The first is the algorithm used to generate the CID (sha2-256 in this case); then comes the length of the encoded digest (32 bytes for a sha2-256 hash), and finally the digest itself, that is, the hash of the content. When referring to the multicodec table, it is possible to understand how the content is encoded.

Name          Code (in hexadecimal)
identity      0x00
sha1          0x11
sha2-256      0x12 = 00010010
keccak-256    0x1b

This encoding mechanism is useful, because it creates a unique and upgradable content-addressing system across multiple protocols.
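To make this layout concrete, here is a minimal Python sketch that decodes the CIDv0 used in this section and splits it into its three fields. It relies on the fact that a CIDv0 is the base58btc encoding of the hash code, digest length, and digest; the helper names `b58decode` and `split_cidv0` are made up for this illustration and do not come from any IPFS library.

```python
# Base58 alphabet used by Bitcoin and by CIDv0 (no 0, O, I, or l).
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

# The subset of the multicodec table listed in this section.
HASH_NAMES = {0x00: "identity", 0x11: "sha1", 0x12: "sha2-256", 0x1B: "keccak-256"}


def b58decode(text: str) -> bytes:
    """Decode a base58btc string into raw bytes."""
    number = 0
    for char in text:
        number = number * 58 + B58_ALPHABET.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    # Each leading '1' character stands for one leading zero byte.
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body


def split_cidv0(cid: str):
    """Return (hash name, digest length, digest) for a CIDv0 string."""
    raw = b58decode(cid)
    code, length, digest = raw[0], raw[1], raw[2:]
    if len(digest) != length:
        raise ValueError("digest length does not match the length byte")
    return HASH_NAMES.get(code, hex(code)), length, digest


name, length, digest = split_cidv0("QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco")
print(name, length, digest.hex())
```

Because the hash code and digest length come first, a reader of the identifier knows how to verify the digest before looking at it, which is what allows new hash functions to be introduced later without breaking existing CIDs.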

If you want to learn more, have a look at ProtoSchool’s tutorial.

Ethereum and decentralised applications

Ethereum is an account-based blockchain with smart contract capabilities. Being account-based, each account is identified by an address, and the state of these accounts can be modified by operations grouped in blocks and sealed by Ethereum’s consensus algorithm, Proof-of-Work.

There are two categories of accounts: user accounts and contract accounts. User accounts are controlled by a private key, which is used to sign transactions from the account. Contract accounts hold bytecode, which is executed by the network when a transaction is sent to their account. A transaction can include both funds and data, allowing for rich interaction between accounts.

When a transaction is created, it gets verified by each node on the network. For a transaction between two user accounts, the verification consists of checking the origin account signature. When the transaction is between a user and a smart contract, every node runs the smart contract bytecode on the Ethereum Virtual Machine (EVM). Therefore, all nodes perform the same suite of operations and end up in the same state. If one actor is malicious, nodes will not add its contribution. Since nodes have diverse ownership, they have an incentive to not cheat.

How to access IPFS content

As you may have noticed, while a CID describes a piece of content, it doesn’t describe where to find it on the network. The location of the file is retrieved by a query made to an IPFS node.

An IPFS URL (Uniform Resource Locator) looks like this: ipfs://QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco. Accessing this URL means retrieving QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco using the IPFS protocol, denoted by ipfs://. However, typing such a URL is quite error-prone. These URLs are also not very human-friendly, because there is no good way to remember such long strings. To get around this issue, you can use DNSLink. DNSLink is a way of specifying IPFS CIDs within a DNS TXT record. For instance, Wikipedia on IPFS has the following TXT record:

$ dig +short TXT _dnslink.en.wikipedia-on-ipfs.org

"dnslink=/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco"

In addition, its A record points to an IPFS gateway. This means that, when you access en.wikipedia-on-ipfs.org, your request is directed to an IPFS HTTP Gateway, which then looks up the CID using the domain’s TXT record, and returns the content associated with this CID using the IPFS network.
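The gateway-side lookup can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code of any real gateway: the functions `parse_dnslink` and `resolve_host` are hypothetical, and a plain dictionary stands in for the DNS queries a gateway would actually perform.

```python
import re

# Matches the DNSLink value whether or not dig wrapped it in quotes.
DNSLINK_RE = re.compile(r"dnslink=/(ipfs|ipns)/([^\s\"]+)")


def parse_dnslink(txt_value: str):
    """Return (namespace, identifier) from a DNSLink TXT value, or None."""
    match = DNSLINK_RE.search(txt_value)
    return (match.group(1), match.group(2)) if match else None


def resolve_host(host: str, txt_records: dict):
    """Mimic the gateway: read the TXT records of _dnslink.<host>, extract the CID."""
    for value in txt_records.get("_dnslink." + host, []):
        parsed = parse_dnslink(value)
        if parsed is not None:
            return parsed
    return None


# Stand-in for DNS: the TXT record discussed in this section.
fake_dns = {
    "_dnslink.en.wikipedia-on-ipfs.org": [
        '"dnslink=/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco"'
    ]
}

print(resolve_host("en.wikipedia-on-ipfs.org", fake_dns))
```

Once the gateway holds the CID, it fetches the matching content from the IPFS network and returns it over HTTP.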

This trades security for ease of access. The user’s web browser doesn’t verify the integrity of the content served. This could be because the browser does not implement IPFS, or because it has no way of validating the domain’s DNSSEC signature. We wrote about this issue in our previous blog post on End-to-End Integrity.

Human readable identifiers

DNS simplifies referring to IP addresses, in the same way that postal addresses are a way of referring to geolocation data, and contacts in your mobile phone abstract phone numbers. All these systems provide a human-readable format and reduce the error rate of an operation.

To verify this data, the trust anchors, or “sources of truth”, are:

  • Root DNS Keys for DNS.
  • The government registry for postal addresses. In the UK, addresses are handled by cities, boroughs and local councils.
  • When it comes to your contacts, you are the trust anchor.

Ethereum Name Service, an index for the Distributed Web

An account is identified by its address. An address starts with “0x” and is followed by 20 bytes (see section 4.1 of the Ethereum yellow paper), for example: 0xf10326c1c6884b094e03d616cc8c7b920e3f73e0. This is not very readable, and can be pretty scary when transactions are not reversible and one can easily mistype a single character.

A first mitigation strategy was to introduce a new notation that capitalises some letters based on the hash of the address: 0xF10326C1c6884b094E03d616Cc8c7b920E3F73E0. This helps detect a mistyped character, but it is still not readable. If I have to send a transaction to a friend, I have no way of confirming she hasn’t mistyped the address.
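This mixed-case notation is specified in EIP-55. As a rough sketch of how it works: the lowercase hexadecimal address is hashed with Keccak-256, and each letter is capitalised when the hash nibble at the same position is 8 or more. Python's `hashlib` does not offer Keccak-256 (its `sha3_256` uses different padding), so the example below carries its own compact Keccak implementation; the function names are chosen for this illustration.

```python
MASK64 = (1 << 64) - 1


def _rol(value: int, shift: int) -> int:
    """Rotate a 64-bit lane to the left."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f(lanes):
    """Apply the 24 rounds of Keccak-f[1600] to a 5x5 matrix of lanes."""
    lfsr = 1
    for _ in range(24):
        # theta: mix each column's parity into its neighbours
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane and move it to a new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: add the round constant, generated by a small LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    """Keccak-256 as used by Ethereum (rate of 136 bytes, 0x01 padding byte)."""
    rate = 136
    padded = bytearray(data) + bytes(rate - len(data) % rate)
    padded[len(data)] ^= 0x01
    padded[-1] ^= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for offset in range(0, len(padded), rate):
        for i in range(rate // 8):
            chunk = padded[offset + 8 * i: offset + 8 * i + 8]
            lanes[i % 5][i // 5] ^= int.from_bytes(chunk, "little")
        lanes = _keccak_f(lanes)
    return b"".join(lanes[i % 5][i // 5].to_bytes(8, "little") for i in range(4))


def to_checksum_address(address: str) -> str:
    """Capitalise the letters of a hex address following EIP-55."""
    hex_part = address.lower()
    if hex_part.startswith("0x"):
        hex_part = hex_part[2:]
    digest = keccak256(hex_part.encode("ascii")).hex()
    chars = [ch.upper() if ch in "abcdef" and int(digest[i], 16) >= 8 else ch
             for i, ch in enumerate(hex_part)]
    return "0x" + "".join(chars)


print(to_checksum_address("0xf10326c1c6884b094e03d616cc8c7b920e3f73e0"))
```

A wallet can recompute the capitalisation of an address it receives and reject it when the result differs, which catches most single-character typos without changing the address format.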

The Ethereum Name Service (ENS) was created to tackle this issue. It is a system capable of turning human-readable names, referred to as domains, into blockchain addresses. For instance, the domain privacy-pass.eth points to the Ethereum address 0xF10326C1c6884b094E03d616Cc8c7b920E3F73E0.

To achieve this, the system is organised in two components, registries and resolvers.

A registry is a smart contract that maintains a list of domains and some information about each domain: the domain owner and the domain resolver. The owner is the account allowed to manage the domain. They can create subdomains and change ownership of their domain, as well as modify the resolver associated with their domain.

Resolvers are responsible for keeping records. For instance, Public Resolver is a smart contract capable of associating not only a name to blockchain addresses, but also a name to an IPFS content identifier. The resolver address is stored in a registry. Users then contact the registry to retrieve the resolver associated with the name.

Consider a user, Alice, who has direct access to the Ethereum state. The flow goes as follows: Alice would like to get Privacy Pass’s Ethereum address, for which the domain is privacy-pass.eth. She looks up privacy-pass.eth in the ENS Registry and finds that the resolver for privacy-pass.eth is at 0x1234…. She then asks the contract at that resolver address for the address of privacy-pass.eth, which turns out to be 0xf10326c….

Accessing the IPFS content identifier for privacy-pass.eth works in a similar way. The resolver is the same, only the accessed data is different — Alice calls a different method from the smart contract.
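The two-step lookup can be modelled with a toy example. The sketch below is a deliberately simplified, in-memory stand-in for the registry and resolver contracts: the classes are hypothetical, the records are keyed by the plain name (the real contracts key them by a hash of the name), and the owner and content identifier values are placeholders rather than real data.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Resolver:
    """Toy resolver: keeps the records for the names it is responsible for."""
    addresses: Dict[str, str] = field(default_factory=dict)
    content_ids: Dict[str, str] = field(default_factory=dict)

    def addr(self, name: str) -> Optional[str]:
        return self.addresses.get(name)

    def contenthash(self, name: str) -> Optional[str]:
        return self.content_ids.get(name)


@dataclass
class Registry:
    """Toy registry: maps each domain to its owner and to its resolver."""
    owners: Dict[str, str] = field(default_factory=dict)
    resolvers: Dict[str, Resolver] = field(default_factory=dict)

    def resolver(self, name: str) -> Optional[Resolver]:
        return self.resolvers.get(name)


def resolve(registry: Registry, name: str, record: str = "addr") -> Optional[str]:
    """Step 1: ask the registry for the resolver. Step 2: ask the resolver."""
    resolver = registry.resolver(name)
    if resolver is None:
        return None
    if record == "addr":
        return resolver.addr(name)
    return resolver.contenthash(name)


public_resolver = Resolver(
    addresses={"privacy-pass.eth": "0xF10326C1c6884b094E03d616Cc8c7b920E3F73E0"},
    content_ids={"privacy-pass.eth": "<CID of the Privacy Pass site>"},  # placeholder
)
registry = Registry(
    owners={"privacy-pass.eth": "<owner account>"},  # placeholder
    resolvers={"privacy-pass.eth": public_resolver},
)

print(resolve(registry, "privacy-pass.eth", "addr"))
print(resolve(registry, "privacy-pass.eth", "contenthash"))
```

Both lookups go through the same resolver; only the method called on it changes.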

Cloudflare Distributed Web Resolver

The goal was to be able to use this new way of indexing IPFS content directly from your web browser. However, accessing the ENS registry requires access to the Ethereum state. To get access to IPFS, you would also need to access the IPFS network.

To tackle this, we are going to use Cloudflare’s Distributed Web Gateway. Cloudflare operates both an Ethereum Gateway and an IPFS Gateway, respectively available at cloudflare-eth.com and cloudflare-ipfs.com.

The first version of EthLink was built by Jim McDonald and is operated by True Name LTD at eth.link. Starting from next week, eth.link will transition to use the Cloudflare Distributed Web Resolver. To that end, we have built EthLink on top of Cloudflare Workers. It is a proxy to IPFS that serves any ENS-registered domain when .link is appended to it. For instance, privacy-pass.eth should render the Privacy Pass homepage; from your web browser, https://privacy-pass.eth.link does exactly that.

The resolution is done at the Cloudflare edge using a Cloudflare Worker. Cloudflare Workers allows JavaScript code to be run on Cloudflare infrastructure, eliminating the need to maintain a server and increasing the reliability of the service. In addition, it follows the Service Workers API, so results returned from the resolver can be checked by end users if needed.

To do this, we set up a wildcard DNS record for *.eth.link to be proxied through Cloudflare and handled by a Cloudflare Worker. When a user Alice accesses privacy-pass.eth.link, the Worker first retrieves from Ethereum the CID of the content to be served. Then, it requests the content matching this CID from IPFS, and returns it to Alice.
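The request flow can be sketched as follows. This is not the Worker's actual code (which is JavaScript); it is a Python illustration in which `ens_name_from_host`, `ipfs_gateway_url`, and `handle_request` are hypothetical names, the Ethereum and IPFS lookups are passed in as functions, and the `/ipfs/<CID>` path is the usual gateway convention rather than something stated in this post.

```python
LINK_SUFFIX = ".link"


def ens_name_from_host(host: str) -> str:
    """Turn 'privacy-pass.eth.link' into the ENS name 'privacy-pass.eth'."""
    host = host.lower().rstrip(".")
    if not host.endswith(".eth" + LINK_SUFFIX):
        raise ValueError("not an eth.link host: " + host)
    return host[: -len(LINK_SUFFIX)]


def ipfs_gateway_url(cid: str, path: str = "/") -> str:
    """Build a gateway URL, assuming the common /ipfs/<CID> path layout."""
    return "https://cloudflare-ipfs.com/ipfs/" + cid + path


def handle_request(host: str, path: str, get_cid, fetch_ipfs):
    """Resolve the ENS name to a CID, then fetch the matching content."""
    name = ens_name_from_host(host)
    cid = get_cid(name)  # step 1: ask Ethereum (through ENS) for the CID
    if cid is None:
        return 404, b"no content record for " + name.encode()
    return 200, fetch_ipfs(cid, path)  # step 2: ask IPFS for the content


# Stand-ins for the two gateways, with made-up data.
def fake_get_cid(name):
    return {"privacy-pass.eth": "QmExampleCid"}.get(name)


def fake_fetch_ipfs(cid, path):
    return ("content of " + ipfs_gateway_url(cid, path)).encode()


print(handle_request("privacy-pass.eth.link", "/", fake_get_cid, fake_fetch_ipfs))
```

Passing the two lookups in as parameters mirrors the point made about trust: the same flow works whether they are served by Cloudflare's gateways or by nodes you run yourself.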

All parts can be run locally. The Worker code can run in a Service Worker, and it can be pointed at a local Ethereum node instead of the Ethereum Gateway, and at the IPFS gateway provided by IPFS Companion. This means that while Cloudflare provides resolution-as-a-service, none of the components has to be trusted.

Final notes

So are we distributed yet? No, but we are getting closer, building bridges between emerging technologies and current web infrastructure. By providing a gateway dedicated to the distributed web, we hope to make these services more accessible to everyone.

We thank the ENS team for supporting a new resolver and for their work on expanding the distributed web. The ENS team has been running a similar service at https://eth.link. On January 18th, they will switch https://eth.link to using our new service.

These services benefit from the added speed and security of the Cloudflare Worker platform, while paving the way to run distributed protocols in browsers.