Internship Experience: Research Engineer

Post Syndicated from Sudheesh Singanamalla original https://blog.cloudflare.com/internship-experience-research-engineer/

I spent my summer of 2020 as an intern at Cloudflare working with the incredible research team. I had recently started my time as a PhD student at the University of Washington’s Paul G Allen School of Computer Science and Engineering working on decentralizing and securing cellular network infrastructure, and measuring the adoption of HTTPS by government websites worldwide. Here’s the story of how I ended up on Cloudflare TV talking about my award-winning research on a project I wasn’t even aware of when the pandemic hit.

Prior to the Internship

It all started before the pandemic, when I came across a job posting on LinkedIn for an internship with the research team at Cloudflare. I had been a happy user of Cloudflare’s products and services, and this seemed like a very exciting opportunity to work with them towards their mission to help build a better Internet. While working on research at UW, I came across a lot of prior research work published by the researchers at Cloudflare, and was excited to possibly be a part of the research team and interact with them. Without second thoughts, I submitted an application through LinkedIn and waited to hear back from the team.

I received a first call from a recruiter a few months later, asking me if I was still interested in the internship position, and informing me that the internships would be remote due to the pandemic. I was told that the research team was interested in interviewing me for the internship, and during the call I was also informed about the process, which included a programming task working with an existing open source Cloudflare project, a pair programming interview with a member of the team, and phone calls with some research leads. I was extremely excited and said “Yes! I’d love to try out the interview process”.

Adding Certificate Transparency Log Scans to Scan families

Within the next few hours I received a task from Nick Sullivan with a clear problem statement to add support for producing a certificate transparency report in CFSSL, an open source project from Cloudflare which contained cfssl-scan, a tool that scanned hostnames for connectivity, TLS information, TLS session support, and PKI information (certificates). I was tasked with adding a new family of scanners to look into Certificate Transparency logs (CT Logs) and integrate the information from the CT logs into the output. After a few back and forth emails with Nick and other researchers CC’d on the email thread, I set out to work and submitted a draft detailing my design rationale, supported features and examples of how different error conditions were handled by the changes to the code.
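
The idea of a scanner family can be sketched in a few lines. This is a minimal illustration in Python, not the CFSSL code (which is written in Go): the names `FAMILIES`, `ct_log_scan`, `run_scans`, the grade strings and the `FAKE_CT_INDEX` lookup table are all hypothetical, and a real scanner would query actual CT log servers rather than a dictionary.

```python
from typing import Callable, Dict, List, Tuple

# A scanner inspects one hostname and returns (grade, details).
Scanner = Callable[[str], Tuple[str, dict]]

# Hypothetical stand-in for querying real CT logs: hostname -> logs that
# contain a certificate for it. A real scanner would contact log servers.
FAKE_CT_INDEX: Dict[str, List[str]] = {
    "example.com": ["log-a", "log-b"],
}


def ct_log_scan(hostname: str) -> Tuple[str, dict]:
    """Grade a host by whether any (fake) CT log lists a certificate for it."""
    if not hostname:
        raise ValueError("empty hostname")
    logs = FAKE_CT_INDEX.get(hostname, [])
    return ("Good" if logs else "Bad"), {"logs": logs}


# Families group related scanners under one name; "CT" is the added family.
FAMILIES: Dict[str, Dict[str, Scanner]] = {
    "CT": {"CTLogs": ct_log_scan},
}


def run_scans(hostname: str) -> dict:
    """Run every scanner in every family and collect a nested report."""
    report: dict = {}
    for family_name, scanners in FAMILIES.items():
        family_report = report.setdefault(family_name, {})
        for scanner_name, scanner in scanners.items():
            try:
                grade, details = scanner(hostname)
                family_report[scanner_name] = {"grade": grade, "output": details}
            except Exception as err:
                # One failing scanner is recorded and must not abort the run.
                family_report[scanner_name] = {"grade": "Skipped", "error": str(err)}
    return report
```

Calling `run_scans("example.com")` returns a nested report keyed by family and scanner name. Recording an error entry instead of raising is one possible way to handle the error conditions mentioned above.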

The task was very exciting because it allowed me to gain more familiarity with Go, a language I would use even more at Cloudflare during my internship. With the task complete, I was invited for a pair programming task with Watson Ladd. We discussed my current research work at the university and the areas of research which interested me, and I learnt about cool new projects that Cloudflare was working on and problems they were interested in solving to help make the Internet better. We then started working on a pair programming problem and discussed the design rationale for solving the problem, extensibility, code reuse and test coverage.

Soon after, I had a bunch of similar calls talking about my current research work, understanding potential research problems that Cloudflare was interested in solving before finally receiving an internship offer for the summer. Yay!

The Internship

My summer internship with Cloudflare was like none other. It all started with a seamless onboarding process with clear documentation and training. The access control for the account worked flawlessly from the first day, and I had all the tools, documentation and internal resources available to get started. However, this is where the first challenge started: there were too many interesting research problems to try and tackle! I felt like a kid at a carnival. I liked everything and wanted to try everything, but I knew, given the short duration of the internship, I had to pick one research problem which interested me. After a week of deliberation, long conversations with different researchers on the team and reading relevant prior research in the different areas, I decided to explore and work on Oblivious DNS over HTTPS (ODoH).

Initially, I was worried about not being able to make a decision regarding which project to pursue, because the interactions with other people in Cloudflare were remote, with no in-person conversations like I’d had at other companies. I also worried this setup made me overlook something that might have been easier to discuss in person. But the team was super supportive through it and ensured that I had all the relevant information before making my decision.

Oblivious DNS over HTTPS (ODoH) is a protocol proposed at the IETF with the goal of providing privacy to the clients making DNS requests using DNS over HTTPS (DoH). Cloudflare operates a popular public recursive DNS resolver to which clients can make DNS queries. However, DNS over HTTPS (DoH) requests made by clients to the resolver leak client IP addresses despite providing a secure encrypted communication channel. While DoH enhances the security of the DNS queries and responses when used instead of the default insecure UDP based DNS requests, the leakage of client IP information could be problematic. Cloudflare maintains users’ privacy through a rigorous privacy policy, audits, and purging client information.
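
The separation that ODoH aims for can be illustrated with a toy simulation: the client encrypts its query for the resolver (the target) and sends it through a proxy, so the proxy sees the client IP but not the query, while the target sees the query but only the proxy's IP. The sketch below makes several loud assumptions: the real protocol uses HPKE public-key encryption, whereas here a pre-shared key and a hash-based XOR stream (`toy_seal` and `toy_open`, both hypothetical and not secure) stand in for it, and the class names, IP addresses and answer format are invented for illustration.

```python
import hashlib
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a toy keystream from SHA-256 (illustrative only, not secure)."""
    blocks = []
    for counter in range((length + 31) // 32):
        blocks.append(hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest())
    return b"".join(blocks)[:length]


def toy_seal(key: bytes, plaintext: bytes) -> bytes:
    """Stand-in for HPKE encryption: random nonce followed by XOR-masked bytes."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def toy_open(key: bytes, blob: bytes) -> bytes:
    """Invert toy_seal using the same key."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


class Target:
    """The resolver: can read queries, but only ever talks to the proxy."""

    def __init__(self, key: bytes):
        self.key = key
        self.observed = []  # (peer IP, plaintext query)

    def handle(self, peer_ip: str, blob: bytes) -> bytes:
        query = toy_open(self.key, blob).decode()
        self.observed.append((peer_ip, query))
        return toy_seal(self.key, f"{query} -> 192.0.2.1".encode())


class Proxy:
    """Forwards opaque blobs: sees client IPs, never the query contents."""

    def __init__(self, ip: str, target: Target):
        self.ip = ip
        self.target = target
        self.observed = []  # (client IP, encrypted blob)

    def forward(self, client_ip: str, blob: bytes) -> bytes:
        self.observed.append((client_ip, blob))
        return self.target.handle(self.ip, blob)


def oblivious_query(client_ip: str, name: str, key: bytes, proxy: Proxy) -> str:
    """Client side: encrypt for the target, send via the proxy, decrypt the reply."""
    reply = proxy.forward(client_ip, toy_seal(key, name.encode()))
    return toy_open(key, reply).decode()
```

After one call to `oblivious_query`, the proxy's `observed` list pairs the client IP with an opaque blob, and the target's pairs the query name with the proxy's IP. Neither party sees both the client IP and the query, as long as the proxy and target do not collude.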

Along with my advisors, I spent time building interoperable versions of ODoH services, and the necessary components in Go and Rust which were experimentally deployed on cloud services for performing measurements of the protocol. Through frequent conversations, we identified interesting research questions, performed the necessary measurements, found both security and performance issues, improved our design and drove towards conclusions iteratively. Then, we worked with the help of the brilliant engineering and reliability engineering teams at Cloudflare to move the support for the protocol into production, to convince the community about the advantages and practicality of the ODoH protocol.

The interoperable implementations of the protocol were made open source. They served as a reference implementation presented during the standardization process and various presentations we made at IETF and OARC, through which we obtained valuable feedback. With all the experiments in place, we submitted our work to the Privacy Enhancing Technologies Symposium (PETS 2021), where it was accepted and awarded the Andreas Pfitzmann Best Student Paper Award.

Cloudflare strongly believes in transparency. The effects of this are visible within the company, from open and inclusive discussions about social and technological issues, to the way people across the company can collaborate and share information with the public. I was fortunate to present and share some work on ODoH on Cloudflare TV. I was definitely nervous about presenting the work on live Internet TV, but it became possible with the support and encouragement of the TV team and members of the research team.

Outside work

While the work that I did during my time as an intern at Cloudflare was exciting, it was not the only thing that kept me occupied. It was very easy to interact with engineers, designers, sales and marketing teams within the company, learn about their work and their experiences, and gain an understanding of all the amazing work happening throughout the company. The internship also provided me an opportunity to engage in random engineer chats, a program which randomly matched me with other engineers and researchers, allowing me to learn more about their work. The research team at Cloudflare operated very similarly to an academic research lab and frequently discussed papers during scheduled reading group meetings. The weekly intern hangouts allowed me to build friendships with the other interns in the team. However, not everything was rosy: it was hard to make it to all the intern hangouts, and time zone differences added to the challenge of scheduling time to get to know the other interns.

Takeaways

Cloudflare is an incredibly transparent company built for scale, and a brilliant place to work with a lot of interesting research and engineering work that could move from prototype to production. The transparent collaboration between different teams, academia, and the shared mission to help build a better Internet make it possible to leverage the strengths of various teams, and highly motivated people to contribute to a project. In retrospect, I strongly believe that I got lucky working on a problem which interested me, and had value for Cloudflare’s mission. And while I get to write this blog post about my experience, this experience and the work I was able to do during my time at Cloudflare wouldn’t have been possible without the hundreds of motivated and brilliant people in various teams (media, content, design, legal etc.) with whom I interacted, along with the direct involvement of the research, engineering and reliability teams. The internship experience was truly humbling!

If this sort of experience interests you, and you would love to join an innovative and collaborative environment, Cloudflare Research is currently accepting applications for 2022 internships!

Cloudflare invites visiting researchers

Post Syndicated from Vânia Gonçalves original https://blog.cloudflare.com/visiting-researcher-program/

As part of Cloudflare’s effort to build collaborations with academia, we host research focused internships all year long. Interns collaborate cross-functionally in research projects and are encouraged to ship code and write a blog post and a peer-reviewed publication at the end of their internship. Post-internship, many of our interns have joined Cloudflare to continue their work and often connect back with their alma mater strengthening idea sharing and collaborative initiatives.

Last year, we extended the intern experience by hosting Thomas Ristenpart, Associate Professor at Cornell Tech. Thomas collaborated for half a year on a project related to password breach alerting. Based on the success of this experience, we are taking a further step by creating a structured Visiting Researcher program, to broaden our capabilities and invest further in a shared motivation with academics.

Foster engagement and closer partnerships

Our current research focuses on applied cryptography, privacy, network protocols and architecture, measurement and performance evaluation, and, increasingly, distributed systems. With the Visiting Researcher program, Cloudflare aims to foster a shared motivation with academia and engage together in seeking innovative solutions to help build a better Internet in the mentioned domains.

We expect to support the operationalization of ideas that emerge in academia and put them to the test in deployable services that will be used worldwide, hence giving the opportunity to develop collaborative projects with real world applicability and also push industry forward.

About the Visiting Researcher Program

The Visiting Researcher Program is available to both postdocs and full-time faculty members who aim to collaborate primarily with Cloudflare Research for periods of three to 12 months. There are a few eligibility criteria to meet before expressing interest in the program:

  • Have a PhD and a well-established research track record demonstrated by peer-reviewed journal publications and conference papers.
  • Relevant research experience and interest in one of the research areas.
  • Ability to design and execute on a research agenda.

We know we will receive excellent proposals but we expect selected expressions of interest to have the potential to have a significant impact on one of the mentioned domains and reinforce the contribution to the Internet community at large. Proposals should aim at wide dissemination and have the potential to deliver value to both technical and academic communities.

You can explore more about the program on the Cloudflare Research website and learn more about Thomas Ristenpart’s experience in the next section.

The Visiting Researcher experience so far

There are a lot of potential reasons for a short-term visit in industry. For senior researchers it’s an opportunity to refresh one’s perspectives on problems observed in practice, and potentially transfer research ideas from “the lab” to products. Compared to some companies, Cloudflare’s research organization is smaller, has clear connections with product teams, and has an outsized portfolio of exciting, high-impact research projects.

As mentioned above, I joined Cloudflare in the summer of 2020, during my academic sabbatical. I worked three days a week (remotely, given the COVID-19 pandemic) and spent the rest of the work week advising my academic group at Cornell. A lot of my academic research over the past few years has focused on how to improve security for password-based authentication, including developing some of the first protocols for privacy-preserving password breach alerting. I knew Cloudflare well due to its ongoing engagement with the applied cryptography community, and it made sense to visit: Cloudflare’s focus on security and privacy, and its position as a first line of defense for millions of websites, made it a unique opportunity for working on improving authentication security.

I worked directly with research engineers in the team to implement a new type of password breach alerting service that we called Might I Get Pwned (MIGP). While it built off prior work done in academia, we encountered a number of fascinating challenges in architecting and implementing the system. We also found new opportunities for impact, realizing that the Web Application Firewall (WAF) team was simultaneously interested in breach alerting and could utilize the infrastructure we were building. Ultimately, my work contributed directly to the WAF’s breach alerting feature that launched in Spring 2021.

At the same time, being embedded at Cloudflare surfaced fascinating new research questions. At one point, the CEO asked the team about how we could handle the potential threat of hoarding attacks against Privacy Pass, a deployed cryptographic protocol that Cloudflare customers rely on to help prevent bots from mounting attacks. This led to a foundational cryptographic protocol question: can we build partially oblivious pseudorandom function protocols that match the efficiency of standard oblivious pseudorandom functions? I won’t unpack that jargon here, but for those who are curious you can check out the preprint. We ended up tackling this question as a collaboration between my academic research group, the University of Washington, and Cloudflare, culminating in a new protocol that is sure to get deployed quite widely in the years to come.
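
For readers who do want a feel for the jargon, a standard oblivious pseudorandom function exchange can be sketched in a few lines. This is not the partially oblivious protocol from the preprint, only the baseline idea it is compared against: the client blinds its input, the server applies its secret key to the blinded value, and the client removes the blinding, so the server never learns the input and the client never learns the key. The group parameters `P` and `Q`, the function names and the hashing choices are assumptions made for illustration, and the group is far too small to be secure.

```python
import hashlib
import secrets

# Toy group: the order-Q subgroup of squares modulo the safe prime P = 2*Q + 1.
# These parameters are far too small to be secure; they only keep the sketch short.
P = 1019
Q = 509


def hash_to_group(x: bytes) -> int:
    """Map an input to a square modulo P (an element of the order-Q subgroup)."""
    n = int.from_bytes(hashlib.sha256(x).digest(), "big") % (P - 1) + 1
    return pow(n, 2, P)


def finalize(x: bytes, element: int) -> bytes:
    """Hash the input together with the group element to get the PRF output."""
    return hashlib.sha256(x + element.to_bytes(2, "big")).digest()


def client_blind(x: bytes):
    """Client: hide the input behind a random exponent r in [1, Q-1]."""
    r = secrets.randbelow(Q - 1) + 1
    return r, pow(hash_to_group(x), r, P)


def server_evaluate(k: int, blinded: int) -> int:
    """Server: apply the secret key to a value it cannot interpret."""
    return pow(blinded, k, P)


def client_unblind(x: bytes, r: int, evaluated: int) -> bytes:
    """Client: strip the blinding exponent and finish the computation."""
    r_inv = pow(r, -1, Q)  # modular inverse of r modulo the group order
    return finalize(x, pow(evaluated, r_inv, P))


def prf_direct(k: int, x: bytes) -> bytes:
    """Reference computation by a party that knows both the key and the input."""
    return finalize(x, pow(hash_to_group(x), k, P))
```

The unblinded result equals `prf_direct(k, x)` because raising to `r` and then to the inverse of `r` modulo `Q` cancels out in a group of order `Q`.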

Overall, this was a hugely successful visit. I’m excited to see the Cloudflare visiting scholar program expand and develop, and would definitely recommend it to interested academics.

Express your interest

We’re very excited to have this program going forward and diversifying our collaborations with academia! You can read more about the Visiting Researcher program and send us your expression of interest through the Cloudflare Research website. We are expecting to host you in early 2022!

The Future of Work at Cloudflare

Post Syndicated from Matthew Prince original https://blog.cloudflare.com/the-future-of-work-at-cloudflare/

During Impact Week, we’ve shared how Cloudflare is providing tools for our customers to minimize their environmental impact as well as what we, as a company, are doing to help society at large. But some critical stakeholders we haven’t talked much about yet are Cloudflare’s more than 2,000 employees, who build our services, support and educate our customers, keep our finances in order, work through difficult policy issues, and empower us to accomplish everything we have.

Over the last year and a half, we’ve all challenged a lot of the assumptions about what it means to “work.” Prior to the start of the pandemic, Cloudflare was very much a work-from-office culture. And so when, on March 13, 2020, we closed all our offices and asked everyone to work from home, the two of us were extremely nervous.

And then something unexpected happened: a lot of things got better.

As a company, productivity increased — when measured by our success selling our products, our pace of shipping new products, and even things like the time it takes for our finance team to close our books.

Other day-to-day things got better, too. We noticed a marked increase in participation in meetings by women, team members for whom English wasn’t their first language, junior team members, and other traditionally underrepresented groups. It turns out, putting everyone in a Brady Bunch-like box on a screen smooths out some of the other social cues that, when in-person, make some people less comfortable, willing, or able to fully participate.

Virtually More Inclusive

It’s not unreasonable to speculate that the increase in productivity was driven, in no small part, by the increase in overall participation by people who previously felt reluctant to do so. And this further aligned with job surveys that we conducted over the last year and a half which showed that while the things people wanted us to improve remained the same, overall satisfaction with jobs increased.

We also noticed that the diversity of the candidates that were applying to work for us increased as we allowed people to work remotely. We were now an option for people who did not live in, or could not move to, the cities we had offices in. At Cloudflare, we’ve always believed in having a diverse team. Not to look good in a government report, but because it’s the right business strategy: more diverse teams win.

We all have different perspectives formed by our experiences that inherently give us insights and blind spots. If everyone on a team has the same insights and blind spots then there will be fewer unique and creative solutions proposed to whatever problems we face. Just as it’s important to have genetic diversity in a species, having diversity on every dimension in hiring makes us a stronger, more creative company. Prioritizing a diverse team is the right strategy if you’re optimizing for innovation, like we are at Cloudflare.

But not everything got better when we switched to remote; some things definitely got worse. We’re social creatures. We thrive through human interaction that is still difficult to replicate virtually. Even with improvements in video conferencing, online interactions still mute some of the social cues and make misunderstanding more likely. The osmosis for our team of learning by watching others is harder, especially for team members early in their career. And, unfortunately, for some the office is a refuge from difficult situations at home and so not having it as a place to get away can amplify those challenges.

What We’ve Learned… So Far

So we’ve been thinking a lot about what the future of work looks like at Cloudflare and wanted to share publicly what we’ve been talking about for some time internally. Here are some things we think we know.

First, we don’t know what the long term future of work will be like and so we’ve been hesitant to lay down broad proclamations. Instead, we expect that as we get past the pandemic and are able to work in-person safely again, we will do what Cloudflare has always done: run a number of experiments ourselves, watch what our peers are doing, and figure out what works for us. The one thing we feel pretty sure of is that wherever we start the experiment is highly unlikely to be exactly the place where we end up. The future of work won’t be set in stone sometime in the coming months, but evolve over the coming years.

Second, no matter what, the future of work will be more flexible. There’s no way we are putting the genie of remote work back in the bottle. Why would we want to if we’ve learned that our team has been more productive and more satisfied with their jobs while we’ve been remote? Flexibility is the number one requested work benefit, and one of the silver linings of the pandemic for us has been that we ran a forced experiment that proved we could make it work.

Third, we are incredibly reluctant to impose arbitrary rules. Requiring team members to come in every Monday, Tuesday, and Thursday raises the question: why those days? Saying you need to come in if you’re below a certain seniority level also seems weirdly arbitrary. Instead of rules, we’re much more likely to start with general standards outlining what success as a member of the team at Cloudflare looks like and giving guidelines. We may need rules at some point, but we want to develop those rules over time based on what we learn.

Fourth, just opening offices and hoping for the best doesn’t work. What we’ve seen ourselves, and confirmed with others, is that what makes working from an office great is getting to work side-by-side with your colleagues. But if Alyssa comes in on Monday, and Blake comes in on Tuesday, and Carlos comes in on Wednesday, and Deeksha comes in on Thursday, and Ellen comes in on Friday, and they all hoped that they would get to connect, then none of them has a good experience and none of them come in the following week. If in-person work is going to work, there needs to be some deliberate structure and planning.

Fifth, we believe more in carrots than sticks. We’d rather we create an environment where people want to come in than where they have to come in. Based on our internal surveys, about 10% of our team wants to come in every day. We want to make the environment such that 100% of our team wants to come in at least some days.

Sixth, a more flexible way of working will require a more flexible physical space. The base “lego brick” we used to design all our offices pre-pandemic was the 6-person conference room. And, while none of our offices started this way, they all evolved into a sea of white, adjustable desks in neat rows as we found spots for our growing team. That already feels anachronistic. We think we need to redesign spaces to accommodate teams coming together to collaborate as well as individuals looking for a quiet spot for heads-down work.

Seventh, mixed meetings suck. When some people are in-person and some people are virtual the experience is bad for everyone. Part of why we think the last year and a half has worked is because everyone is in the same boat. We believe part of the reason why hybrid work environments have traditionally not worked is because they, left to their own devices, will tend to devolve to an experience that’s bad for everyone. The future of flexible work needs to acknowledge that most hybrid work experiments in the past haven’t worked.

Eighth, we’re a very global company. We have team members in countries around the world and need to operate our business around the clock. One of the benefits of being fully remote over the last year and a half is that it made all our offices feel like they were on equal footing. That’s something we believe is important for us to maintain.

So what’s our plan? Again, we don’t pretend to have all the answers. Instead, we expect that we’ll start somewhere and experiment. So we’re starting by being more flexible about where we hire people. We still believe that people will tend to cluster in hubs around cities where we have physical offices, but we are now open to hiring for nearly all of our roles in any location where we have a legal entity setup that allows us to hire.

We are tearing apart our offices in San Francisco and London to remake them into flexible work spaces. We’re designing them to allow for teams of 10, 20, or 30 employees to get together and collaborate. We’re also creating “Zoom villages” with one-person spaces and high quality AV equipment to let people jump on conference calls.

One of the few rules that we plan on starting with is that in meetings if any person is remote then everyone in the meeting is remote. We know that will create some awkward situations where some of our team will literally be sitting next to each other at desks talking on a video conference call. But we believe this is a rule worth having, in spite of our hesitation to impose strict rules, to help keep the playing field level for all our colleagues, wherever they’re working.

We’re going to rethink the purpose of the offices as spaces where teams can come together to collaborate. Internally, we’re calling these “on-site off-sites” — though everyone agrees we need a better name. The idea being that teams can call an in-person meeting and reserve space in any of our offices to come together. We expect different teams will set different cadences of these meetings, but expect most people to have at least some time in an office at least once a quarter.

We’re planning for what we’ve termed a “Czar of Serendipity” who will coordinate cross-group lunches and other activities, giving teams who may not work directly together the opportunity to meet colleagues they may not otherwise know. They’ll also help arrange in-person speakers and other activities aligned with whatever teams or groups are physically in the office each week.

And we’re hunting for carrots to encourage our team, and especially members who are earlier in their career, to come in. One we’re working on is what we’re calling Orange Card. We hope to turn every team member’s ID into a charge card. The card will only activate after someone badges in for the day and will only work to purchase food at restaurants that are within a 10-minute walk from the office with pre-tax dollars.

It’s in Cloudflare’s interest to encourage people to come in physically to work. Across the industry, however, we think jobs that require in-person work will look increasingly anachronistic. We also believe that, rather than operating private cafeterias inside our own spaces, it’s important for us to support local businesses near our offices — especially as so many of them were hit hard during COVID. If with Orange Card we can do this and find a way to let employees pay for lunches when they’re in the office at an effective discount, then it will check both boxes: giving employees a reason to come in and also supporting the local community.

We don’t know how many of these things will work, but it’s a sense of the experiments we intend to run as we try and find the future of work that works for our team.

In many ways we were fortunate that Cloudflare’s product could be of specific help during an incredibly difficult time for the world. The superheroes of the last year and a half have been the medical professionals and scientists who have taken care of the sick and looked for cures for this disease. But the Internet has been the faithful sidekick that has helped many continue to work, stay connected with loved ones, and keep ourselves entertained through this trying time. As one of the defenders of the Internet, our work at Cloudflare has been incredibly rewarding. We hope we can create a future of work that remains incredibly rewarding even long past the pandemic.

The thoughts above are just a starting place. We expect that we’re going to learn a lot not only from our own experiments, but also from what we learn works (and doesn’t work) at peer companies. We would have never tried this experiment in remote work but for the pandemic. Now, having realized that we can continue to execute in a more flexible work environment, we don’t plan to forget the lessons we learned. We’re hopeful that we, along with our peer companies, will continue to run experiments and, over time, develop a new future of work that is more flexible, more inclusive, and more productive.

PS – We’re hiring.

Building a sustainable workforce, through communities

Post Syndicated from Janet Van Huysse original https://blog.cloudflare.com/building-a-sustainable-workforce-through-communities/

At Cloudflare, we have our eyes set on an ambitious goal: to help build a better Internet. Today the company runs one of the world’s largest networks that powers approximately 25 million Internet properties. This is made possible by our 1,900 team members around the world. We believe the key to achieving our potential is to build diverse teams and create an environment where everyone can do their best work.

That is why we place a lot of value on the importance of diversity, equity and inclusion. Diversity, equity, and inclusion lead to better outcomes through improved decision-making, more innovative teams, stronger financial returns and simply a better place to work for everyone.

To become more diverse, equitable, and inclusive, we believe it’s important to focus on communities within and around our company.

Building internal communities at Cloudflare

At Cloudflare, like most workplaces, there are built-in communities: your direct team, your cross-functional partners and (because we take onboarding very seriously) your new hire class. These communities, especially the first two, are important to help you get your job done. But we want more than that for our team at Cloudflare. We believe that community builds connection and fosters a sense of belonging.

Because of that, we have supported the growth of over 16 Employee Resource Groups (ERGs). We use the term ERG broadly at Cloudflare. We have many ERGs focused on traditionally under-represented groups in tech: Afroflare (Black, African diaspora), Latinflare, and Womenflare; groups that have been historically marginalized: Proudflare (LGBTQIA+), Cloudflarents (parents and caregivers); as well as interest and affinity groups like Mindflare and Soberflare. To read more about all of our ERGs, visit our diversity, equity, and inclusion webpage or read about them on our blog. In addition to creating a community of support and belonging, our ERGs also work to enhance career development of their members and contribute to the development of a more inclusive culture at Cloudflare.

Building the skills to build communities

We define an inclusive culture as one where everyone feels safe, welcome and respected with a sense of belonging. We do not leave this to chance. We make investments in training and programs to develop and deepen the skills needed to nurture and preserve inclusive communities at Cloudflare.

One of our earliest offerings was Ally Skills training. The aim of this workshop is to help build awareness of the types of behavior and language which can be harmful to inclusivity at Cloudflare, and teach simple, everyday ways to support people who are targets of systemic oppression. During the workshop, team members share strategies on how to act as allies and how to create a long-lasting, inclusive culture at Cloudflare. As the program was being rolled out, the management team did the workshop together and quickly realized these were not skills reserved for ‘allies’ but it was our expectation that this was how all of our team members treated each other. These were necessary skills to be successful at Cloudflare. As a result, we reworked some pieces of the workshop and renamed it: How We Work Together.

We have also partnered with Paradigm IQ and Included to create a three-part Unconscious Bias Education Program. These workshops are a mix of eLearning and facilitated workshops where we learn about how to help mitigate unconscious bias and make our company a more welcoming and inclusive place for everyone. tEQuitable is an additional comprehensive resource which helps us create a safe, inclusive, and equitable workplace. They provide an independent sounding board where our employees may confidentially raise a concern, access a just-in-time learning platform, and get advice from professional Ombuds. They also help us identify systemic workplace issues and provide us with actionable recommendations for how to improve our workplace culture. What we especially love about tEQuitable is that it’s all about empowering our employees with tools and resources to address issues that may be impacting them, or they may witness impacting others, so we all play an active role in maintaining and nurturing our culture.

One other program worth highlighting is our Week On: Learning and Inclusion. This program came as a response to the murder of George Floyd in the US at the end of May 2020. Our Afroflare global leaders suggested we use Juneteenth as a full day of deep learning from external experts on topics ranging from the history of race and racism to the psychological impact of racism on people of color. In 2021, we expanded it from a one-day program to a week full of programming, including antiracism keynotes, inclusive people management workshops, and sessions on inclusive recruiting practices.

Holding ourselves accountable to an inclusive culture

Increasing awareness and skill-building is valuable, but it is not enough. We also have to hold ourselves accountable by analyzing data, setting goals and measuring progress objectively. Each year we set company-wide goals around our diversity, and for the last few years we’ve added individual goals for managers — one focused on building a more diverse team, and one focused on building an inclusive team culture.

We also place a high value on behaviors at Cloudflare. This is imperative because we believe that culture is defined by the behaviors we reward. So in order to have a healthy and inclusive culture, we must reward the behaviors that promote and preserve that. We have defined these behaviors as our Cloudflare Capabilities.

We screen for these Capabilities during our interview process, and they are used in performance and promotion conversations. We hold ourselves accountable by using a very simple formula: Performance = results + behaviors. Equally weighted.

Our Recruiting Efforts

Speaking of interviewing, hiring is an important part of our diversity story. We believe that diverse teams win, and we put in a lot of effort to build diverse teams across the company. We have many team members who took unconventional paths into tech, and we believe that makes us stronger as a company. In fact, many of our job descriptions read: We realize people do not fit into neat boxes. We are looking for curious and empathetic individuals who are committed to developing themselves and learning new skills, and we are ready to help you do that. We cannot complete our mission without building a diverse and inclusive team.

In addition to an inclusive and expansive mindset around hiring, we also have interviews dedicated specifically to assessing fit against our Capabilities, and we leverage technology and tools to help identify great talent who help to increase the diversity of our teams.

We have also made investments in events and partnerships that help support our diversity recruiting efforts. In August 2016, Cloudflare was one of the first companies to partner with Path Forward when it first launched its program in California. [Fun fact: that’s how I learned about Cloudflare and became interested in working here]. In Singapore, we have a similar partnership with Mums@Work.

We also engage with organizations and participate in events that help us reach talent from underrepresented groups. We have sponsored and spoken on stage at events like Lesbians Who Tech and Grace Hopper, where our co-founder, President and COO, Michelle Zatlyn, delivered the keynote in 2020. We regularly attend events and conferences hosted by AfroTech, Women Who Code, Girls Who Code, TAPIA, NSN, and more.

Engaging with external communities

Our ethos is to support and connect with external communities as well. Prior to the pandemic, when our offices were fully open and social and professional events were a thing, we regularly invited external organizations to hold events in our communal spaces. One example of such an organization is Wu Yee Children’s Services, a San Francisco Chinatown-based nonprofit that connects parents and caregivers to affordable childcare options, offers payment assistance to low-income families, and provides other family and community services. We were honored to host their orientation session. Another organization we hosted was Women Who Code SF. We regularly hosted their “algorithm and interview prep” workshops, which helped women coders gain the skills they need to land good jobs in the tech industry. Unlike many of our tech company peers, we did not offer free lunch five days a week. It was important to us that our team members got out of the office and supported local businesses and restaurants. It is important that we do not isolate ourselves, but rather are part of a larger community.

We also believe in giving back to our local communities. Prior to COVID, Cloudflare dedicated one week every year to volunteer efforts. Coordinated across many of our large office locations, we would spend each day of a full week volunteering at employee-nominated, local non-profit organizations. Our participation pivoted to virtual during COVID, but we are eager to return to in-person giving when we can.

While we are proud of these efforts, it is using Cloudflare products and services for good that is truly special. Cloudflare’s mission to help build a better Internet means we are in a unique position to help vulnerable websites, applications and services be safer, faster and more reliable online.

A few to highlight:

Project Galileo

Organizations working in the arts, human rights, civil society, journalism, or democracy may apply for Project Galileo to get Cloudflare’s cybersecurity protection for free. Since 2014, we’ve been leveraging our services to support vulnerable public interest web properties including, but not limited to, minority rights organizations, human rights organizations, independent media outlets, arts groups, and democracy and voter protection programs.

Our support of one of these organizations has blossomed over the years. We are proud to announce our partnership with The Trevor Project. Founded in 1998 by the creators of the Academy Award®-winning short film TREVOR, The Trevor Project is the leading national organization providing crisis intervention and suicide prevention services to lesbian, gay, bisexual, transgender, queer & questioning (LGBTQ) young people under 25. We support the organization through monetary donations, a partnership with our LGBTQIA+ Employee Resource Group, Proudflare, and free Cloudflare services through our Project Galileo Program.

Since 2017, we have donated about $8 million in cybersecurity tools under Project Galileo.

Athenian Project

Cloudflare launched the Athenian Project in 2017 to provide our highest level of cybersecurity services for free to state and local governments in the United States that run elections. The project is designed to protect websites tied to elections, including those that provide information related to voting and polling places, voter registration sites, and sites that publish election results, as well as voter data, from cyberattack, and to keep them online. During the 2020 U.S. election, we worked closely with civil society and government agencies to share threat information that we saw targeted against these participants and protected more than 292 websites in 30 states, including the Missouri Secretary of State, Solano County in California and The Colorado Department of State.

In recognition that election security is a global issue, we recently announced our partnerships with the International Foundation for Electoral Systems, National Democratic Institute and International Republican Institute to extend our cybersecurity protections to election management bodies around the world, as well as organizations that support free and fair elections. We look forward to continuing our work to protect resources in the voting process and help build trust in democratic institutions around the world.

Project Fair Shot

Around the world, governments, hospitals, and pharmacies are struggling to distribute the COVID-19 vaccine. Technical limitations are causing vaccine registration sites to crash under the load of registrations. At Cloudflare, we want to help. Cloudflare’s Waiting Room feature allows organizations with more demand for a resource than they can serve at once (be it concert tickets, new edition sneakers, or vaccines) to let individuals queue and then allocate access. Waiting Rooms can be deployed in front of any existing registration website without requiring code changes. As we watched the world struggle to fairly and efficiently distribute the COVID-19 vaccine, we wanted to lend our technologies and expertise to help. Under Project Fair Shot, Cloudflare is providing Waiting Room for free to any government agency, hospital, pharmacy, or other organization facilitating the distribution of the COVID-19 vaccine, until anyone who wants to be vaccinated can be and until at least 31 December 2021.

We all need to work together to get past this incredibly difficult time worldwide, and we are humbled to have helped so many different organizations around the world, including the County of San Luis Obispo, Verto Health, and the Ministry of Health for the Republic of Latvia.

Why we are publishing our diversity data

At Cloudflare, we believe in being principled, curious and transparent. Publishing our diversity report is aligned with these values.

We are Principled: One of the Cloudflare Capabilities is “Do the Right Thing” — that includes long-term thinking about how we build an innovative and sustainable workforce. We have a fundamental belief that fairness is the right thing. We believe that equity is the right thing.

We are Curious: Creating a more diverse and sustainable workforce is hard work. We want to draw lessons from the things we try, and we want to learn from what others are trying. Building sustainable communities is not a zero-sum game, and we believe we can all benefit as an active part of the broader community.

We believe in Transparency: For many years, we have been transparent with our team about our diversity data and our goals, and we have measured our progress regularly. Now we are taking the step to share publicly because we believe in accountability and accept the responsibility to build a diverse and sustainable workforce.

You can check out our Diversity, Equity, and Inclusion webpage with our diversity report here.

While there is always more work to be done, we are grateful for the empathetic and curious team that makes Cloudflare what it is today. Together, we are optimistic we can build a better — and more inclusive — Internet.

Through the eyes of a Cloudflare Technical Support Engineer

Post Syndicated from Justina Wong original https://blog.cloudflare.com/through-the-eyes-tech-support-engineer/

This post originally appeared on Landing Jobs under the title Mission: Protect the Internet where you can find open positions at Cloudflare Lisbon.

Justina Wong, Technical Support Team Lead in Lisbon, talks about what it’s like working at Cloudflare, and everything you need to know if you want to join us.

Justina joined Cloudflare about three years ago in London as a Technical Support Engineer. Currently, she’s part of their Customer Support team working in Lisbon as a team lead.

I can’t speak for others, but I love the things you can learn from the others. There are so many talented individuals who are willing and ready to teach/share. They are my inspiration and I want to become them!

On a Mission to Protect the Internet

Justina’s favourite Cloudflare products are firewall-related ones. The company’s primary concern is its customers, and it wants to make attack mitigation as easy as possible. As she puts it, “the fact that these protections are on multiple layers, like L7, L3/4, is very important, and I’m proud to be someone who can help our customers when they face certain attacks.”

Cloudflare is constantly releasing new products to help build a better Internet, so product managers are always on top of tool updates to facilitate that. The company believes that it’s not only important to help customers from the product side, but it’s also as important to teach them how to help themselves so that they can fix their issues promptly without having to wait for an answer.

Company culture and Office vibes

According to Justina, one of the amazing things about Cloudflare is the unified company culture. As their SVP of Engineering, Usman, said in a recent meeting with the team, “Be helpful, look around for problems and help find solutions”.

Every Cloudflare office has its own little “flare”: London’s love of mince pies; Singapore’s super fun cultural richness in one location (they have four new years in one year, officially); and Lisbon’s forever love (and fight) for pastéis de nata.

Each office also has its own function or focus, so people working at Cloudflare get to meet very diverse individuals. For Justina, the things that she has loved the most are learning from all of the engineers in London, picking up new customer service skills in Singapore and helping to build the new Lisbon office. She says that every time she goes to a different office, it has grown at least 50% in headcount compared to when she was last there. Talk about growth!

As a hiring manager, she also says that the company is mindful of diversity.

Working remote

Like everywhere else, remote work has become the current normal at Cloudflare. As someone who enjoyed being in the office, Justina says “all the countless times I just walked over to someone to ask a question, now all turned into a chat message; or the random coffee chat when we waited for our coffee to be done.”

Funnily enough, the EMEA CSUP team is working more closely than before the pandemic. Previously, each office was somewhat in its own communication bubble; now it has turned into a collective conversation. This is great for getting to know colleagues during and beyond work hours.

What you need to know if you want to land a job at Cloudflare in Lisbon

For Cloudflare, growing the team is a continuous challenge, and Justina has never needed to do as many interviews as she has done in the Lisbon office. Although it’s a huge challenge for her, it’s also fun. Since the company is hiring aggressively despite the pandemic, their teams are eager to welcome anyone who’s ready to be part of Cloudflare Lisbon.

One of the things you can expect if you work at Cloudflare is for your manager to care and for your feedback to be heard. We know these are valuable things when considering where to work. So if you’re someone who’s willing to learn and is excited about their technologies, this call is for you. The company is expanding in different markets, so they’re looking for tech candidates who can speak multiple languages.

Currently, Cloudflare has over 25 open positions for their offices in Lisbon. Categories include Security Engineers, Full-Stack Developers, Data Scientists, and more.

Starting a new job in the middle of a pandemic

Post Syndicated from Daniela Rodrigues original https://blog.cloudflare.com/starting-a-new-job-in-the-middle-of-a-pandemic/

It has now been more than 90 days since I joined Cloudflare’s EMEA Recruiting Team as a Recruiting Coordinator based in Lisbon. In a year filled with hardships for so many people around the world, I wanted to share my journey. I hope people will relate and feel encouraged to pursue their dreams, even during these challenging times.

When 2020 started, it was not in my plans to change jobs and start working at a new company, completely remote, without ever meeting my colleagues in person or visiting the office. However, that is exactly what happened, and I am so glad it did.

Interviewing with Cloudflare

The number of interviews in the hiring process at Cloudflare may feel overwhelming for some; in my case, I met 11 people during this process. I was glad to have so many chances to get to know the people I would be working with. I believe I got as much out of the conversations as the interviewers did, which is great: a recruitment process should be as much about the company getting to know you as about you getting to know the company.

A great thing about interviewing remotely is that I got the chance to talk to people all around the globe, which enriched the process and my idea of Cloudflare as a company. I started to picture myself as an actual member of the team, definitely interested in working towards a better and safer Internet. Even though there were many interviews to get through, the constant communication with the team made me feel engaged and excited. In the end, the process went by quickly, even quicker than I expected.

The best thing was the outpouring of support I received from what would be my future teammates once I accepted the offer. I felt welcomed way before my actual start date!

Remote Onboarding: Adapting and Evolving

In all my previous companies, onboarding was done in person and small groups. I was not prepared for a fully remote experience with a class of more than 20 people, yet it was so smooth and well-coordinated that you wouldn’t believe it had been run virtually for only a few months!

My onboarding class included people from all over the world — Lisbon, Austin, Miami, Washington, London, Munich, Singapore… And not only that, but we were all starting different roles, from Customer Success to Engineering, and even Legal Counsel! This gave me the opportunity to know people I otherwise wouldn’t have had the chance to meet, and it allowed me to establish bonds early on with my colleagues. Given the current situation, knowing that people were in the same boat with me felt reassuring. I felt that we were in it together, in a way. Not only that, but I got everything I needed for work (and more — like a pair of Cloudflare socks!) delivered to my home, making the whole experience very comfortable for me.

Ramping up and aiming for the stars

Starting in a new role can be a daunting experience — it’s a new environment, a new team, a new project, and lots of things that could go sideways. However, there are also a lot of things that can go right!

At Cloudflare, I found an extremely welcoming, supportive team that helped me ramp up and take ownership of my work quickly and effectively. I felt so supported that I took ownership of a big project right away: Cloudflare Careers Day. Right from the start, it was clear to me that Cloudflare has ambitious goals for the growth of our Lisbon office. I thought about the ways I could help with that, and a virtual careers day seemed like a great first step to drive brand awareness and let people know that we are hiring! The Recruitment Team set in motion a plan to turn this idea into reality in less than three months, resulting in a successful and fun first edition of the Cloudflare Careers Day in November 2020.

Of course, there were times when I felt unsure of myself and my abilities. But this is why it is so important to be able to rely on your team. In the end, I feel I have grown a lot in just three months — not only professionally, but personally as well!

I look forward to working on more projects. I’m excited to share this blog post, which I hope will inspire more people to take a chance, believe in themselves and just go for it! Even in these strange, stressful times, good things can and do happen, especially when you are surrounded by talented, inspiring people.

What does the future hold?

Lisbon! I am excited to help grow our Lisbon office, recruiting talented people that feel as strongly as I do about helping build a better Internet. We have many different open roles at the moment so, if you see one that suits you, take a chance and reach out. Maybe you’ll embark on a new journey, just like me.

Our Lisbon story is just beginning. I can’t wait to see all the amazing things we will accomplish in 2021, both as a team and as a company.

A Thanksgiving 2020 Reading List

Post Syndicated from Val Vesa original https://blog.cloudflare.com/a-thanksgiving-2020-reading-list/

While our colleagues in the US are celebrating Thanksgiving this week and taking a long weekend off, there is a lot going on at Cloudflare. The EMEA team is having a full day on CloudflareTV with a series of live shows celebrating #CloudflareCareersDay.

So if you want to relax in an active and learning way this weekend, here are some of the topics we’ve covered on the Cloudflare blog this past week that you may find interesting.

Improving Performance and Search Rankings with Cloudflare for Fun and Profit

Making things fast is one of the things we do at Cloudflare. More responsive websites, apps, APIs, and networks directly translate into improved conversion and user experience. On November 10, Google announced that Google Search will directly take web performance and page experience data into account when ranking results on their search engine results pages (SERPs), beginning in May 2021.

Rustam Lalkaka and Rita Kozlov explain in this blog post how Google Search will prioritize results based on how pages score on Core Web Vitals, a measurement methodology that Cloudflare has worked closely with Google to establish and that we now support in our analytics tools. Read the full blog post.

Getting to the Core: Benchmarking Cloudflare’s Latest Server Hardware

At the Cloudflare Core, we process logs to analyze attacks and compute analytics. In 2020, our Core servers were in need of a refresh, so we decided to redesign the hardware to be more in line with our Gen X edge servers. We designed two major server variants for the core. The first is Core Compute 2020, an AMD-based server for analytics and general-purpose compute paired with solid-state storage drives. The second is Core Storage 2020, an Intel-based server with twelve spinning disks to run database workloads. This refresh of the hardware that Cloudflare uses to run analytics provided big efficiency improvements.

Read the full blog post by Brian Bassett

Moving Quicksilver into production

We previously explained how and why we built Quicksilver. Quicksilver is the data store responsible for storing and distributing the billions of KV pairs used to configure the millions of sites and Internet services which use Cloudflare. This second blog post is about the long journey to production, which culminates in the removal of Kyoto Tycoon from Cloudflare infrastructure and points to the first signs of obsolescence.

Geoffrey Plouviez takes you through the entire story of real-world engineering challenges and what it’s like to replace one of Cloudflare’s oldest critical components: read the full blog post here.

Building Black Friday e-commerce experiences with JAMstack and Cloudflare Workers

In this blog post, we explore how Cloudflare Workers continues to excel as a JAMstack deployment platform, and how it can be used to power e-commerce experiences, integrating with familiar tools like Stripe, as well as new technologies like Nuxt.js, and Sanity.io.

Read the full blog post and get all the details and open-source code from Kristian Freeman.

A Byzantine failure in the real world

When we review design documents at Cloudflare, we are always on the lookout for Single Points of Failure (SPOFs). In this post, we present a timeline of a real-world incident, and how an interesting failure mode known as a Byzantine fault played a role in a cascading series of events.

Tom Lianza and Chris Snook’s full blog post describes the consequences of a malfunctioning switch on a system built for reliability.

ASICs at the Edge

At Cloudflare, we pride ourselves in our global network that spans more than 200 cities in over 100 countries. To accelerate all that traffic through our network, there are multiple technologies at play. So let’s have a look at one of the cornerstones that makes all of this work.

Tom Strickx’ epic deep dive into ASICs is here.

Let us know your thoughts and comments below or feel free to also reach out to us via our social media channels. And because we talked about careers at the beginning of this blog post, check out our available jobs if you are interested in joining Cloudflare.

My internship: Brotli compression using a reduced dictionary

Post Syndicated from Felix Hanau original https://blog.cloudflare.com/brotli-compression-using-a-reduced-dictionary/

Brotli is a state of the art lossless compression format, supported by all major browsers. It is capable of achieving considerably better compression ratios than the ubiquitous gzip, and is rapidly gaining in popularity. Cloudflare uses the Google brotli library to dynamically compress web content whenever possible. In 2015, we took an in-depth look at how brotli works and its compression advantages.

One of the more interesting features of the brotli file format, in the context of textual web content compression, is the inclusion of a built-in static dictionary. The dictionary is quite large, and in addition to containing various strings in multiple languages, it also supports the option to apply multiple transformations to those words, increasing its versatility.

The open-source brotli library, which implements an encoder and decoder for brotli, has predefined quality levels from 0 to 11 for the encoder, with higher quality levels demanding more CPU in exchange for a better compression ratio. The static dictionary feature is used to a limited extent starting with level 5, and to the full extent only at levels 10 and 11, due to the high CPU cost of this feature.

We improve on the limited dictionary use approach and add optimizations to improve the compression at levels 5 through 9 at a negligible performance impact when compressing web content.

Brotli Static Dictionary

Brotli primarily uses the LZ77 algorithm to compress its data. Our previous blog post about brotli provides an introduction.

To improve compression on text files and web content, brotli also includes a static, predefined dictionary. If a byte sequence cannot be matched with an earlier sequence using LZ77, the encoder will try to match the sequence with a reference to the static dictionary, possibly using one of the multiple transforms. For example, every HTML file contains the opening <html> tag, which cannot be compressed with LZ77 because it appears only once, but it is contained in the brotli static dictionary and will be replaced by a reference to it. The reference generally takes less space than the sequence itself, which decreases the compressed file size.

The dictionary contains 13,504 words in six languages, with lengths from 4 to 24 characters. To improve the compression of real-world text and web data, some dictionary words are common phrases (“The current”) or strings common in web content (“type="text/javascript"”). Unlike usual LZ77 compression, a word from the dictionary can only be matched as a whole. Starting a match in the middle of a dictionary word, ending it before the end of a word or even extending into the next word is not supported by the brotli format.

Instead, the dictionary supports 121 transforms of dictionary words (counting the identity transform) to allow a larger number of matches and to find longer matches. The transforms include adding suffixes (“work” => “working”), adding prefixes (“book” => “ the book”), making the first character uppercase (“process” => “Process”), and converting the whole word to uppercase (“html” => “HTML”). In addition to transforms that make words longer or capitalize them, the cut transform allows a shortened match (“consistently” => “consistent”), which makes it possible to find even more matches.
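To make the transform kinds concrete, here is a minimal Python sketch that reproduces the examples from the paragraph above. The function names are hypothetical and this is not brotli's actual transform table, which is defined by the format specification; it only illustrates the five kinds of transform mentioned.

```python
# Illustrative only: the kinds of transforms described above,
# not brotli's real transform table. All names are hypothetical.

def add_suffix(word: str, suffix: str) -> str:
    """Append a fixed suffix to a dictionary word."""
    return word + suffix

def add_prefix(word: str, prefix: str) -> str:
    """Prepend a fixed prefix to a dictionary word."""
    return prefix + word

def uppercase_first(word: str) -> str:
    """Make only the first character uppercase."""
    return word[:1].upper() + word[1:]

def uppercase_all(word: str) -> str:
    """Convert the whole word to uppercase."""
    return word.upper()

def cut_last(word: str, n: int) -> str:
    """Drop the last n characters (the 'cut' transform)."""
    return word[:-n] if n > 0 else word

examples = [
    add_suffix("work", "ing"),
    add_prefix("book", " the "),
    uppercase_first("process"),
    uppercase_all("html"),
    cut_last("consistently", 2),
]
```

Each transformed word counts as a separate candidate match, which is why the effective dictionary is so much larger than the list of base words.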

Methods

With the transforms included, the static dictionary contains 1,633,984 different words – too many for exhaustive search, except when used with the slow brotli compression levels 10 and 11. When used at a lower compression level, brotli either disables the dictionary or only searches through a subset of roughly 5,500 words to find matches in an acceptable time frame. It also only considers matches at positions where no LZ77 match can be found and only uses the cut transform.
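The total above can be checked with a line of arithmetic. The sketch below assumes the transform count of 121 from the brotli format specification (RFC 7932), which includes the identity transform; the variable names are ours.

```python
# Assumption: 121 transforms, as listed in RFC 7932 (identity included).
BASE_WORDS = 13_504
TRANSFORMS = 121

# Every base word can be combined with every transform.
total_words = BASE_WORDS * TRANSFORMS
print(total_words)
```

The product is the 1,633,984 figure quoted above, roughly 300 times the subset of about 5,500 words that the lower quality levels search.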

Our approach to the brotli dictionary uses a larger but more specialized subset of the dictionary than the default, with more aggressive heuristics to improve the compression ratio at negligible cost to performance. To provide a more specialized dictionary, we give the compressor a content type hint from our servers, relying on the Content-Type header to tell the compressor whether it should use a dictionary for HTML, JavaScript or CSS. In the future, the dictionaries could be refined further by colocation language.
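As a rough illustration of the content type hint, the Python sketch below maps a Content-Type header value to one of three specialized dictionaries. The mapping table, the dictionary names and the `pick_dictionary` helper are all hypothetical; the post does not describe how the hint is actually passed to the compressor.

```python
from typing import Optional

# Hypothetical mapping from media type to a specialized dictionary name.
DICTIONARY_FOR_MEDIA_TYPE = {
    "text/html": "html",
    "text/css": "css",
    "text/javascript": "javascript",
    "application/javascript": "javascript",
}

def pick_dictionary(content_type: Optional[str]) -> Optional[str]:
    """Return the dictionary to use for a Content-Type header, or None
    to fall back to the compressor's default behaviour."""
    if not content_type:
        return None
    # Strip parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return DICTIONARY_FOR_MEDIA_TYPE.get(media_type)
```

For example, `pick_dictionary("text/html; charset=utf-8")` selects the HTML dictionary, while an image response gets no specialized dictionary at all.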

Fast dictionary lookup

To improve compression without sacrificing performance, we needed a fast way to find matches if we wanted to search the dictionary more thoroughly than brotli does by default. Our approach uses three data structures to find a matching word directly. The radix trie is responsible for finding the word, while the hash table and bloom filter are used to speed up the radix trie and quickly eliminate many words that can’t be matched using the dictionary.

Lookup for a position starting with “type”

The radix trie easily finds the longest matching word without having to try matching several words. To find the match, we traverse the graph based on the text at the current position and remember the last node with a matching word. The radix trie supports compressed nodes (having more than one character as an edge label), which greatly reduces the number of nodes that need to be traversed for typical dictionary words.
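The traversal can be sketched in a few lines of Python. This is a minimal radix trie written for illustration, not the data structure from the actual implementation: it works on Python strings rather than bytes, and the names `Node`, `insert` and `longest_match` are our own.

```python
class Node:
    def __init__(self):
        # First character of the edge label -> (edge label, child node).
        self.children = {}
        self.is_word = False

def insert(root: Node, word: str) -> None:
    node = root
    while word:
        entry = node.children.get(word[0])
        if entry is None:
            # No edge starts with this character: add the rest as one edge.
            leaf = Node()
            leaf.is_word = True
            node.children[word[0]] = (word, leaf)
            return
        label, child = entry
        # Length of the common prefix of the edge label and the word.
        k = 0
        while k < len(label) and k < len(word) and label[k] == word[k]:
            k += 1
        if k < len(label):
            # Split the edge at the point where label and word diverge.
            mid = Node()
            mid.children[label[k]] = (label[k:], child)
            node.children[word[0]] = (label[:k], mid)
            child = mid
        word = word[k:]
        node = child
    node.is_word = True

def longest_match(root: Node, text: str, pos: int = 0) -> int:
    """Length of the longest dictionary word starting at text[pos]."""
    node, length, best = root, 0, 0
    while pos + length < len(text):
        entry = node.children.get(text[pos + length])
        if entry is None:
            break
        label, child = entry
        if not text.startswith(label, pos + length):
            break
        length += len(label)
        node = child
        if node.is_word:
            best = length  # remember the last node that ends a word
    return best

root = Node()
for w in ["type", 'type="text/javascript"', "typical", "the"]:
    insert(root, w)
```

With these four words, a lookup at the start of `type="text/javascript" src` walks the edges `t`, `yp`, `e` and `="text/javascript"`, passing the shorter word “type” on the way and returning the length of the longer one.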

The radix trie is slowed down by the large number of positions where we can’t find a match. An important finding is that most mismatching strings have a mismatching character in the first four bytes. Even for positions where a match exists, a lot of time is spent traversing nodes for the first four bytes since the nodes close to the tree root usually have many children.

Luckily, we can use a hash table to look up the trie node corresponding to the first four bytes, if it exists, or to reject the possibility of a match. We thus look up the first four bytes of the string; if there is a matching node, we traverse the trie from there, which is fast because each four-byte prefix usually has only a few corresponding dictionary words. If there is no matching node, there will not be a matching word at this position and we do not need to consider it further.
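The shortcut can be sketched as follows. To keep the example short, a Python dict stands in for the hash table and a small list of candidate words stands in for the trie sub-tree below each four-byte prefix; both substitutions, and all names, are assumptions made for illustration.

```python
from collections import defaultdict

PREFIX_LEN = 4  # every brotli dictionary word is at least 4 characters long

def build_prefix_table(words):
    """Group dictionary words by their first four characters."""
    table = defaultdict(list)
    for w in words:
        if len(w) >= PREFIX_LEN:
            table[w[:PREFIX_LEN]].append(w)
    # Longest candidates first, so the first hit is the longest match.
    for candidates in table.values():
        candidates.sort(key=len, reverse=True)
    return dict(table)

def longest_match_with_prefix_table(table, text, pos=0):
    candidates = table.get(text[pos:pos + PREFIX_LEN])
    if candidates is None:
        return 0  # no word starts with these four characters: reject early
    for w in candidates:
        if text.startswith(w, pos):
            return len(w)
    return 0

prefix_table = build_prefix_table(
    ["type", 'type="text/javascript"', "typical", "process"]
)
```

A position that does not start with a known four-character prefix costs a single table lookup, and a position that does only has to examine the handful of words sharing that prefix.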

While the hash table is designed to reject mismatches quickly and avoid cache misses and high search costs in the trie, it still suffers from similar problems: We might search through several 4-byte prefixes with the hash value of the given position, only to learn that no match can be found. Additionally, hash lookups can be expensive due to cache misses.

To reject positions that do not match the dictionary before they can cause cache misses, we use a k=1 bloom filter to rule out most non-matching positions. In the k=1 case, the filter is simply a lookup table with one bit per hash value, indicating whether any matching 4-byte prefixes exist for that hash value. If the bit for the given hash value is 0, there won’t be a match. Since the bloom filter uses at most one bit for each four-byte prefix while the hash table requires 16 bytes, cache misses are much less likely. (The actual size of the structures is a bit different since there are many empty spaces in both structures and the bloom filter has twice as many elements to reject more non-matching positions.)

This is very useful for performance as a bloom filter lookup requires a single memory access. The bloom filter is designed to be fast and simple, but still rejects more than half of all non-matching positions and thus allows us to save a full hash lookup, which would often mean a cache miss.
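As a rough illustration of the k=1 case, the filter below is nothing more than a bit array indexed by a single hash. The class name is made up for this sketch, and `zlib.crc32` is only a stand-in, since the post does not say which hash function is used.

```python
import zlib


class OneHashBloom:
    """Bloom filter with a single hash function (k=1): one bit per slot."""

    def __init__(self, num_bits):
        self.num_bits = num_bits
        self.bits = bytearray((num_bits + 7) // 8)

    def _index(self, prefix):
        # Stand-in hash (an assumption): CRC-32 of the four-byte prefix.
        return zlib.crc32(prefix) % self.num_bits

    def add(self, prefix):
        i = self._index(prefix)
        self.bits[i >> 3] |= 1 << (i & 7)

    def might_contain(self, prefix):
        # A 0 bit means "definitely no match"; a 1 bit means "maybe".
        i = self._index(prefix)
        return bool(self.bits[i >> 3] & (1 << (i & 7)))


bloom = OneHashBloom(1024)
for prefix in (b"type", b"typi", b"time"):
    bloom.add(prefix)

# Only positions that pass the filter go on to the hash table lookup.
if bloom.might_contain(b"type"):
    print("check the hash table")
```

A filter like this never rejects a prefix that was added, but it can let through a prefix that merely shares a bit with one, which is why the hash table and trie still make the final decision.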

Heuristics

To improve the compression ratio without sacrificing performance, we employed a number of heuristics:

Only search the dictionary at some positions
This is also done using the stock dictionary, but we search more aggressively. While the stock dictionary only considers positions where the LZ77 match finder did not find a match, we also consider positions that have a bad match according to the brotli cost model: LZ77 matches that are short or have a long distance between the current position and the reference usually only offer a small compression improvement, so it is worth trying to find a better match in the static dictionary.

Only consider the longest match and then transform it
Instead of finding and transforming all matches at a position, the radix trie only gives us the longest match, which we then transform. This approach results in a vast performance improvement and, in most cases, still finds the best match.

Only include some transforms
While all transformations can improve the compression ratio, we only included those that work well with the data structures. The suffix transforms can easily be applied after finding a non-transformed match. For the upper case transforms, we include both the non-transformed and the upper case version of a word in the radix trie. The prefix and cut transforms do not play well with the radix trie, so prefix transforms and cuts of more than 1 byte are not supported.
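The way the supported transforms fit around the trie lookup can be sketched as follows. Everything here is illustrative: `ILLUSTRATIVE_SUFFIXES` is a small hand-picked list rather than brotli's actual transform table, the function names are hypothetical, and the upper case handling only covers ASCII words.

```python
# Hypothetical subset of suffixes, ordered longest first.
ILLUSTRATIVE_SUFFIXES = [" the ", " of ", ", ", " ", "."]


def trie_variants(word):
    """Forms of a word that would be inserted into the trie up front."""
    variants = {word}
    if word.isascii():
        variants.add(word[:1].upper() + word[1:])  # first letter upper case
        variants.add(word.upper())                 # all letters upper case
    return variants


def extend_with_suffix(text, pos, match_len):
    """After an untransformed match, try to absorb a known suffix as well."""
    end = pos + match_len
    for suffix in ILLUSTRATIVE_SUFFIXES:
        if text.startswith(suffix, end):
            return match_len + len(suffix), suffix
    return match_len, ""


print(sorted(trie_variants("time")))             # ['TIME', 'Time', 'time']
print(extend_with_suffix("time of day", 0, 4))   # (8, ' of ')
```

Upper case variants cost extra trie entries but need no work at lookup time, while suffixes cost nothing in the trie and only a few comparisons after a match has been found.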

Generating the reduced dictionary

At low compression levels, brotli searches a subset of ~5,500 out of 13,504 words of the dictionary, negatively impacting compression. To store the entire dictionary, we would need to store ~31,700 words in the trie considering the upper case transformed output of ASCII sequences and ~11,000 four-byte prefixes in the hash. This would slow down both the hash table and the radix trie, so we needed to find a different subset of the dictionary that works well for web content.

For this purpose, we used a large data set containing representative content. We made sure to use web content from several world regions to reflect language diversity and optimize compression. Based on this data set, we identified which words are most common and result in the largest compression improvement according to the brotli cost model. We only include the most useful words based on this calculation. Additionally, we remove some words if they slow down hash table lookups of other, more common words based on their hash value.
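A very crude sketch of this kind of selection is shown below. It is not the actual tooling: the function name is hypothetical, plain substring counting replaces real match statistics, and the `savings` parameter (word length by default) is only a placeholder for the brotli cost model.

```python
from collections import Counter


def select_words(dictionary, corpus, budget, savings=len):
    """Keep the `budget` words with the highest estimated total benefit."""
    usage = Counter()
    for doc in corpus:
        for word in dictionary:
            # Crude usage count: substring occurrences in the document.
            usage[word] += doc.count(word)
    ranked = sorted(dictionary,
                    key=lambda w: usage[w] * savings(w),
                    reverse=True)
    # Words that never occur are dropped even if the budget has room.
    return [w for w in ranked[:budget] if usage[w] > 0]


corpus = ["type type time", "type"]
print(select_words(["type", "time", "zebra"], corpus, budget=2))
# ['type', 'time']
```

Running the same selection on separate HTML, CSS and JavaScript corpora would yield one reduced word list per content type.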

We have generated separate dictionaries for HTML, CSS and JavaScript content and use the MIME type to identify the right dictionary to use. The dictionaries we currently use include about 15-35% of the entire dictionary including uppercase transforms. Depending on the type of data and the desired compression/speed tradeoff, different options for the size of the dictionary can be useful. We have also developed code that automatically gathers statistics about matches and generates a reduced dictionary from them. This makes it easy to extend the approach to other textual formats, such as XML or data that is mostly non-English, and to achieve better results for that type of data.

Results

We tested the reduced dictionary on a large data set of HTML, CSS and JavaScript files.

The improvement is especially big for small files as the LZ77 compression is less effective on them. Since the improvement on large files is a lot smaller, we only tested files up to 256 KB. We used compression level 5, the same compression level we currently use for dynamic compression on our edge, and tested on an Intel Core i7-7820HQ CPU.

Compression improvement is defined as 1 – (compressed size using the reduced dictionary / compressed size without dictionary). This ratio is then averaged for each input size range. We also provide an average value weighted by file size. Our data set mirrors typical web traffic, covering a wide range of file sizes with small files being more common, which explains the large difference between the weighted and unweighted average.
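To make the difference between the two averages concrete, here is a small sketch of the metric. The sample sizes are invented for illustration and are not measurements, the function names are hypothetical, and weighting by the uncompressed file size is an assumption about what "weighted by file size" means.

```python
def improvement(size_with_dict, size_without_dict):
    """1 - (compressed size with reduced dictionary / size without)."""
    return 1 - size_with_dict / size_without_dict


def summarize(samples):
    """samples: (original size, compressed without dict, compressed with dict)."""
    ratios = [improvement(w, wo) for _, wo, w in samples]
    unweighted = sum(ratios) / len(ratios)
    total = sum(orig for orig, _, _ in samples)
    weighted = sum(orig * r
                   for (orig, _, _), r in zip(samples, ratios)) / total
    return unweighted, weighted


# Invented numbers: one small file that gains 10%, one large file that gains 1%.
samples = [(1_000, 400, 360), (100_000, 20_000, 19_800)]
unweighted, weighted = summarize(samples)
print(f"unweighted {unweighted:.1%}, weighted {weighted:.1%}")
```

With these made-up inputs the unweighted average lands midway between the two files, while the weighted average is pulled almost entirely towards the large file, which is the effect described above.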

With the improved dictionary approach, we are now able to compress HTML, JavaScript and CSS files as well as, or sometimes even better than, a higher compression level would allow, all while using only 1% to 3% more CPU. For reference, using compression level 6 instead of 5 would increase CPU usage by up to 12%.