Addressing the Web’s Client-Side Security Challenge

Post Syndicated from Swapnil Bhalode (Guest Author) original https://blog.cloudflare.com/addressing-the-webs-client-side-security-challenge/

Modern web architecture relies heavily on JavaScript and on third-party code that makes client-side network requests. These applications are built on client-heavy frameworks such as Angular, Ember, React, and Backbone, which use the processing power of the browser to execute code directly on the client. Third-party integrations add richness (chat tools, images, fonts) or extract analytics (Google Analytics). Today, up to 70% of the code executing and rendering in your customer’s browser comes from these integrations, and each of them is a potential avenue for vulnerabilities.

Unfortunately, these unmanaged, unmonitored integrations operate without security consideration, providing an expansive attack surface that attackers have routinely exploited to compromise websites. Only 2% of the Alexa 1000 global websites were found to deploy client-side security measures that protect websites and web applications against attacks such as Magecart, XSS, credit card skimming, session redirects and website defacement.

Improving website security and ensuring performance with Cloudflare Workers

In this post, we focus on how Cloudflare Workers can be used to improve security and ensure the high performance of web applications. Tala has joined Cloudflare’s marketplace to further our common goals of ensuring website security, preserving data privacy and assuring the integrity of web commerce. Tala’s innovative and unobtrusive solution, coupled with Cloudflare’s global reach, offers a compelling, highly effective solution for combatting the acceleration of client-side website attacks.

About Cloudflare Workers

Cloudflare Workers is a globally distributed serverless compute platform that runs across Cloudflare’s network of 200+ locations worldwide. Workers is designed for flexibility, with multiple use cases ranging from customizing configuration of Cloudflare services and features to building full, independent applications.

Cloudflare & Tala

Tala has integrated its “web module” capabilities into Cloudflare’s service Worker platform to enable a serverless, instantaneous deployment. This allows customers to activate enterprise-grade website security quickly and efficiently from Cloudflare’s 200+ reliable and redundant edge locations around the world. Tala automates the activation of standards-based, browser-native security controls to deliver highly effective security, without impacting website performance or user experience.

About Tala

Tala secures millions of web sessions for large providers in verticals such as financial services, online retail, payment processing, tech, fintech and education. We secure websites and web applications by continuously interrogating application architecture to enable the automation and continuous deployment of precise, browser-native, standards-based policies & controls. Our technology allows organizations to deploy standards-based website security with near-zero impact to performance and without the operational burdens associated with the application and administration of these policies.

How Tala Works

Tala’s solution is powered by an analytics engine that evaluates over 150 unique indicators of a web page’s behavior and integrations. This engine scans continuously and works in conjunction with an AI-assisted automation engine that activates and tunes standards-based security capabilities such as Content Security Policy (CSP), Subresource Integrity (SRI), HTTP Strict Transport Security (HSTS), sandboxing (iframe rules), Referrer Policy, Trusted Types, Certificate Stapling, Clear Site Data and others.

The automation of browser-native security controls provides comprehensive security without requiring any changes to application code and has near-zero impact on website performance. Tala’s solution can be installed via the Cloudflare Workers Integration to deliver instantaneous client-side security.

With Tala, rich website analytics become available without the risk of client-side website attacks. Website performance is preserved, administration is accelerated, and the need for costly and continuous administration, remediation or incident response is minimized.

How Tala Integrates with Cloudflare Workers

Customers can deploy Tala-generated security policies (discussed in the section above) on their website using Cloudflare’s Service Workers. The customer installs the Tala Service Worker on their Cloudflare account using Tala’s installation scripts. These scripts invoke Cloudflare’s APIs to upload the Tala Service Worker to Cloudflare and enable it, as well as to upload the customized Tala security policies to Cloudflare’s KV store.

Once the installation is complete, the Tala Service Worker is invoked every time an end user requests the customer’s site. During the response from Cloudflare, the Tala Service Worker applies the appropriate Tala security policies. Here are the steps involved:

  • Tala Service Worker sees the HTML content coming from the origin web server
  • Tala Service Worker parses the HTML page
  • Based on the content of the page, the Tala Service Worker inserts the appropriate security controls (e.g., CSP, SRI) which could include a combination of HTTP security headers (e.g., referrer policy, CSP, HSTS) as well as page insertions (e.g., nonces, SRI hashes)
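
To give a feel for the last step, here is a minimal sketch of two of the controls mentioned: computing an SRI integrity value for a script and building a CSP header that carries a per-response nonce. This is not Tala's implementation. The function names, the sample script body, the /app.js path and the policy strings are all made up for illustration; only the SRI value format (a base64-encoded SHA-384 digest prefixed with "sha384-") and the 'nonce-...' CSP source syntax come from the respective web standards.

```python
import base64
import hashlib
import secrets


def sri_hash(resource: bytes) -> str:
    """Return an SRI integrity value: 'sha384-' + base64(SHA-384 digest)."""
    digest = hashlib.sha384(resource).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def new_nonce() -> str:
    """Return a fresh, unguessable nonce; one should be generated per response."""
    return secrets.token_urlsafe(16)


def csp_header(nonce: str) -> str:
    """Build an example policy (assumption) that only allows scripts carrying the nonce."""
    return f"script-src 'nonce-{nonce}'; object-src 'none'"


# Hypothetical script body, for illustration only.
script = b"console.log('hello');"
nonce = new_nonce()

# Header-based controls added to the response.
headers = {
    "Content-Security-Policy": csp_header(nonce),
    "Referrer-Policy": "strict-origin-when-cross-origin",
}

# Page insertion: the script tag gains a nonce and an integrity attribute.
script_tag = (
    f'<script nonce="{nonce}" integrity="{sri_hash(script)}" src="/app.js"></script>'
)

print(headers["Content-Security-Policy"])
print(script_tag)
```

A real deployment would derive the policy from the page's actual integrations rather than hard-coding it, which is the part the post says Tala automates.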

Periodically, the Tala Service Worker polls the Tala cloud service to check for security policy updates and, if required, pushes the latest policies. For more details on how to install Tala into Cloudflare’s Service Workers, please read the installation manual.

Deploy Client-Side Website Security

Client-side vulnerability is a significant and accelerating problem. Workers can provide the speed and capability to ensure your organization isn’t the next victim of the growing volume of successful attacks targeting widespread website and web application vulnerabilities. Standards-based security offers the most effective, comprehensive solution to safeguard against these attacks.

The combination of Cloudflare and Tala can help you expedite deployment. We’d love to hear from you and explore a Workers deployment!

The Tala solution is available today!

  • Cloudflare Enterprise Customers: Reach out to your dedicated Cloudflare account manager to learn more and start the process.
  • Tala Customers and Cloudflare Customers, reach out to Tala to learn more and start the process. You can sign up for and learn more about using Cloudflare Workers here!

Impact of Cache Locality

Post Syndicated from Sung Park original https://blog.cloudflare.com/impact-of-cache-locality/

In the past, we didn’t have the opportunity to evaluate as many CPUs as we do today. The hardware ecosystem was simple – Intel had consistently delivered industry leading processors. Other vendors could not compete with them on both performance and cost. Recently it all changed: AMD has been challenging the status quo with their 2nd Gen EPYC processors.

This is not the first time that Intel has been challenged. Previously there was Qualcomm, and we also worked with AMD and considered their 1st Gen EPYC processors, based on the original Zen architecture, but ultimately Intel prevailed. AMD did not give up and unveiled their 2nd Gen EPYC processors, codenamed Rome, based on the latest Zen 2 architecture.


Rome made many improvements over its predecessors, including a die shrink from 14nm to 7nm, a doubling of the top-end core count from 32 to 64, and a larger L3 cache. The size of that L3 cache deserves emphasis: it is 32 MiB per Core Complex Die (CCD).

This time around, we have taken steps to understand our workloads at the hardware level through the use of hardware performance counters and profiling tools. Using these specialized CPU registers and profilers, we collected data on the AMD 2nd Gen EPYC and Intel Skylake-based Xeon processors in a lab environment, then validated our observations in production against other generations of servers from the past.

Simulated Environment

CPU Specifications

We evaluated several Intel Cascade Lake and AMD 2nd Gen EPYC processors, weighing trade-offs between power and performance; the AMD EPYC 7642 CPU came out on top. The majority of Cascade Lake processors have 1.375 MiB of L3 cache per core, shared across all cores, a common theme that started with Skylake. The 2nd Gen EPYC processors, on the other hand, start at 4 MiB per core. The AMD EPYC 7642 is a unique SKU since it has 256 MiB of L3 cache shared across its 48 cores. With a cache this large, approximately 5.33 MiB per core, more data is readily available in L3, so a program spends fewer cycles fetching data from RAM.

Figure: Before (Intel)
Figure: After (AMD)

The traditional cache layout has also changed with the introduction of 2nd Gen EPYC, a byproduct of AMD using a multi-chip module (MCM) design. The 256 MiB L3 cache is spread across 8 individual dies, each called a Core Complex Die (CCD). Each CCD is formed by 2 Core Complexes (CCXs), and each CCX contains 16 MiB of L3 cache.
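
The cache figures quoted for the EPYC 7642 can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers from the text; the even split of 48 cores across all CCXs is our own assumption, since the source only says that a CCX holds up to four cores.

```python
# Figures quoted in the text for the AMD EPYC 7642.
CCDS_PER_SOCKET = 8    # Core Complex Dies per processor
CCXS_PER_CCD = 2       # Core Complexes per CCD
L3_PER_CCX_MIB = 16    # L3 cache per CCX, in MiB
CORES = 48

l3_per_ccd_mib = CCXS_PER_CCD * L3_PER_CCX_MIB     # 32 MiB per CCD
l3_total_mib = CCDS_PER_SOCKET * l3_per_ccd_mib    # 256 MiB per socket
l3_per_core_mib = l3_total_mib / CORES             # about 5.33 MiB per core

# Assumption: active cores are spread evenly over all CCXs.
cores_per_ccx = CORES // (CCDS_PER_SOCKET * CCXS_PER_CCD)

print(f"L3 per CCD:     {l3_per_ccd_mib} MiB")
print(f"L3 per socket:  {l3_total_mib} MiB")
print(f"L3 per core:    {l3_per_core_mib:.2f} MiB")
print(f"Cores per CCX:  {cores_per_ccx}")
```

Note that the L3 is physically partitioned per CCX, so the per-core average is a capacity figure and not a statement about which cache slices a given core can reach directly.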

Figure: Core Complex (CCX) – Up to four cores
Figure: Core Complex Die (CCD) – Created by combining two CCXs
Figure: AMD 2nd Gen EPYC 7642 – Created with 8x CCDs plus an I/O die in the center

Methodology

Our production traffic shares many characteristics of a sustained workload, which typically neither induces large variations in operating frequency nor enters periods of idle time. We picked a simulated traffic pattern that closely resembled our production traffic behavior: a cached 10 KiB PNG served via HTTPS. We were interested in assessing the CPU’s maximum throughput, or requests per second (RPS), one of our key metrics. For that reason, we neither disabled Intel Turbo Boost or AMD Precision Boost nor matched the frequencies clock-for-clock while measuring requests per second, instructions retired per second (IPS), L3 cache miss rate, and sustained operating frequency.

Results

The 1P AMD 2nd Gen EPYC 7642 powered server took the lead and processed 50% more requests per second compared to our Gen 9’s 2P Intel Xeon Platinum 6162 server.

We are running a sustained workload, so we should end up with a sustained operating frequency that is higher than base clock. The AMD EPYC 7642 operating frequency, or the number of cycles that the processor had at its disposal, was approximately 20% greater than that of the Intel Xeon Platinum 6162, so frequency alone was not enough to explain the 50% gain in requests per second.
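
One rough way to see how much of the gain frequency leaves unexplained is to divide the throughput ratio by the frequency ratio. The sketch below assumes a deliberately simple model in which requests per second scale as frequency times work done per cycle; that model and the variable names are our assumptions, and the inputs are the approximate figures quoted above.

```python
# Approximate ratios from the text (AMD EPYC 7642 vs. Gen 9 Intel server).
rps_ratio = 1.50    # 50% more requests per second
freq_ratio = 1.20   # roughly 20% higher sustained operating frequency

# Simplified model (assumption): RPS is proportional to frequency * work per cycle.
per_cycle_ratio = rps_ratio / freq_ratio

print(f"Gain not explained by frequency: {(per_cycle_ratio - 1) * 100:.0f}%")
```

Under this model roughly a quarter more work gets done per cycle, which is where effects such as the L3 cache miss rate come in.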

Taking a closer look, the number of instructions retired over time was far greater on the AMD 2nd Gen EPYC 7642 server, thanks to its low L3 cache miss rate.

Production Environment

CPU Specifications

Methodology

Our most predominant bottleneck appears to be cache memory: we saw significant improvement in requests per second, as well as in time to process a request, due to the low L3 cache miss rate. The data we present in this section was collected at a point of presence with servers spanning Gen 7 to Gen 9. We also collected data from a secondary region to gain additional confidence that the data we present here was not unique to one particular environment. Gen 9 is the baseline, just as in the previous section.

We put the 2nd Gen EPYC-based Gen X into production with hopes that the results would closely mirror what we had previously seen in the lab. We found that the requests per second did not quite match the results we had hoped for, but the AMD EPYC server still outperformed all previous generations, including outperforming the Intel Gen 9 server by 36%.

Sustained operating frequency was nearly identical to what we have seen back in the lab.

Alongside the lower-than-expected requests per second, we also saw fewer instructions retired over time and a higher L3 cache miss rate, but Gen X maintained a lead over Gen 9, with 29% better performance.

Conclusion

The single AMD EPYC 7642 performed very well during our lab testing, beating our Gen 9 server with dual Intel Xeon Platinum 6162 with the same total number of cores. Key factors we noticed were its large L3 cache, which led to a low L3 cache miss rate, as well as a higher sustained operating frequency. The AMD 2nd Gen EPYC 7642 did not have as big of an advantage in production, but nevertheless still outperformed all previous generations. The observation we made in production was based on a PoP that could have been influenced by a number of other factors such as but not limited to ambient temperature, timing, and other new products that will shape our traffic patterns in the future such as WebAssembly on Cloudflare Workers. The AMD EPYC 7642 opens up the possibility for our upcoming Gen X server to maintain the same core count while processing more requests per second than its predecessor.

Got a passion for hardware? I think we should get in touch. We are always looking for talented and curious individuals to join our team. The data presented here would not have been possible if it was not for the teamwork between many different individuals within Cloudflare. As a team, we strive to work together to create highly performant, reliable, and secure systems that will form the pillars of our rapidly growing network that spans 200 cities in more than 90 countries and we are just getting started.

An EPYC trip to Rome: AMD is Cloudflare’s 10th-generation Edge server CPU

Post Syndicated from Rob Dinh original https://blog.cloudflare.com/an-epyc-trip-to-rome-amd-is-cloudflares-10th-generation-edge-server-cpu/

More than 1 billion unique IP addresses pass through the Cloudflare Network each day, serving on average 11 million HTTP requests per second and operating within 100ms of 95% of the Internet-connected population globally. Our network spans 200 cities in more than 90 countries, and our engineering teams have built an extremely fast and reliable infrastructure.

We’re extremely proud of our work and are determined to help make the Internet a better and more secure place. Cloudflare engineers who are involved with hardware get down to servers and their components to understand and select the best hardware to maximize the performance of our stack.

Our software stack is compute intensive and is very much CPU bound, driving our engineers to work continuously at optimizing Cloudflare’s performance and reliability at all layers of our stack. With the server, a straightforward solution for increasing computing power is to have more CPU cores. The more cores we can include in a server, the more output we can expect. This is important for us since the diversity of our products and customers has grown over time with increasing demand that requires our servers to do more. To help us drive compute performance, we needed to increase core density and that’s what we did. Below is the processor detail for servers we’ve deployed since 2015, including the core counts:

Generation | Gen 6 | Gen 7 | Gen 8 | Gen 9
Start of service | 2015 | 2016 | 2017 | 2018
CPU | Intel Xeon E5-2630 v3 | Intel Xeon E5-2630 v4 | Intel Xeon Silver 4116 | Intel Xeon Platinum 6162
Physical Cores | 2 x 8 | 2 x 10 | 2 x 12 | 2 x 24
TDP | 2 x 85W | 2 x 85W | 2 x 85W | 2 x 150W
TDP per Core | 10.65W | 8.50W | 7.08W | 6.25W

In 2018, we made a big jump in total number of cores per server with Gen 9. Our physical footprint was reduced by 33% compared to Gen 8, giving us increased capacity and computing power per rack. Thermal Design Power (TDP, aka typical power usage) figures are listed above to highlight that we’ve also become more power efficient over time. Power efficiency is important to us: first, because we’d like to be as carbon friendly as we can; and second, so we can better utilize the provisioned power supplied by the data centers. But we know we can do better.

Our main defining metric is Requests per Watt. We can increase our Requests per Second number with more cores, but we have to stay within our power budget envelope. We are constrained by the data centers’ power infrastructure which, along with our selected power distribution units, leads us to a power cap for each server rack. Adding servers to a rack increases power consumption at the rack level. Our operational costs increase significantly if we go over a rack’s power cap and have to provision another rack. What we need is more compute power inside the same power envelope, which will drive a higher (better) Requests per Watt number – our key metric.

As you might imagine, we look at power consumption carefully in the design stage. From the above you can see that it’s not worth our time to deploy more power-hungry CPUs if their TDP per Core is higher than our current generation’s, since that would hurt our Requests per Watt metric. As we started looking at production-ready systems to power our Gen X solution, we took a long look at what is available in the market today, and we’ve made our decision. We’re moving on from Gen 9’s 48-core setup of dual socket Intel® Xeon® Platinum 6162s to a 48-core single socket AMD EPYC™ 7642.

Figure: Gen X server setup with single socket 48-core AMD EPYC 7642

Spec | Intel | AMD
CPU | Xeon Platinum 6162 | EPYC 7642
Microarchitecture | “Skylake” | “Zen 2”
Codename | “Skylake SP” | “Rome”
Process | 14nm | 7nm
Physical Cores | 2 x 24 | 48
Frequency | 1.9 GHz | 2.4 GHz
L3 Cache / socket | 24 x 1.375MiB | 16 x 16MiB
Memory / socket | 6 channels, up to DDR4-2400 | 8 channels, up to DDR4-3200
TDP | 2 x 150W | 225W
PCIe / socket | 48 lanes | 128 lanes
ISA | x86-64 | x86-64

From the specs, we see that with the AMD chip we get to keep the same number of cores at a lower TDP. Gen 9’s TDP per Core was 6.25W; Gen X’s will be 4.69W, a 25% decrease. With a higher frequency, and perhaps thanks to the simpler single socket setup, we can speculate that the AMD chip will perform better. In the rest of this blog we walk through a series of tests, simulations, and live production results to see how much better AMD performs.
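
The TDP per Core figures follow directly from the datasheet numbers quoted above. A minimal sketch, with a helper function name of our own choosing:

```python
def tdp_per_core(sockets: int, tdp_per_socket_w: float, cores_per_socket: int) -> float:
    """Total rated TDP of the server's CPUs divided by total physical cores."""
    return (sockets * tdp_per_socket_w) / (sockets * cores_per_socket)


gen9 = tdp_per_core(sockets=2, tdp_per_socket_w=150, cores_per_socket=24)  # 2x Xeon Platinum 6162
genx = tdp_per_core(sockets=1, tdp_per_socket_w=225, cores_per_socket=48)  # 1x EPYC 7642

decrease = (gen9 - genx) / gen9

print(f"Gen 9: {gen9:.2f} W per core")
print(f"Gen X: {genx:.2f} W per core")
print(f"Decrease: {decrease:.0%}")
```

This only compares rated TDP, which is a datasheet number and not measured power draw.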

As a side note before we go further, TDP is a simplified metric from the manufacturers’ datasheets that we use in the early stages of our server design and CPU selection process. A quick Google search suggests that AMD and Intel define TDP differently, which basically makes the spec unreliable for comparing vendors. Actual CPU power draw, and more importantly server system power draw, are what we really factor into our final decisions.

Ecosystem Readiness

At the beginning of our journey to choose our next CPU, we got a variety of processors from different vendors that could fit well with our software stack and services, which are written in C, LuaJIT, and Go. More details about benchmarking for our stack were explained when we benchmarked Qualcomm’s ARM® chip in the past. We’re going to go through the same suite of tests as Vlad’s blog this time around since it is a quick and easy “sniff test”. This allows us to test a bunch of CPUs within a manageable time period before we commit more engineering effort to applying our full software stack.

We tried a variety of CPUs with different number of cores, sockets, and frequencies. Since we’re explaining how we chose the AMD EPYC 7642, all of the graphs in this blog focus on how AMD compares with our Gen 9’s Intel Xeon Platinum 6162 CPU as a baseline.

Our results correspond to one server node for each CPU tested, meaning the numbers pertain to 2x 24-core processors for Intel and 1x 48-core processor for AMD: a two socket Intel based server and a one socket AMD EPYC powered server. Before we started our testing, we changed the Cloudflare lab test servers’ BIOS settings to match our production server settings. This yielded average CPU frequencies of 3.03 GHz for AMD and 2.50 GHz for Intel, with very little variation. As a gross simplification, we expect that with the same number of cores AMD would perform about 21% better than Intel. Let’s start with our crypto tests.
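
The 21% expectation is simply the ratio of the two measured frequencies. A minimal sketch, assuming (as the gross simplification does) that throughput scales linearly with clock speed at equal core counts; the variable names are ours:

```python
# Average measured frequencies from the lab servers, in GHz.
amd_ghz = 3.03
intel_ghz = 2.50

# Assumption: performance scales linearly with frequency when core counts match.
expected_gain = amd_ghz / intel_ghz - 1

print(f"Expected AMD advantage from frequency alone: {expected_gain:.0%}")
```

Any benchmark where AMD wins by clearly more or less than this points to factors other than clock speed, such as cache behavior or instruction-level differences.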

Cryptography

Looking promising for AMD. In public key cryptography, it does 18% better. Meanwhile for symmetric key, AMD loses on AES-128-GCM but it’s comparable overall.

Compression

We do a lot of compression at the edge to save bandwidth and help deliver content faster. We go through both zlib and brotli libraries written in C. All tests are done on blog.cloudflare.com HTML file in memory.

AMD wins by an average of 29% using gzip across all qualities. It does even better with brotli at qualities lower than 7, which we use for dynamic compression. There’s a throughput cliff starting at brotli-9, which Vlad explained by noting that Brotli consumes lots of memory and thrashes the cache. Nevertheless, AMD wins by a healthy margin.
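
For readers who want to try this kind of measurement themselves, here is a minimal sketch of an in-memory compression throughput test across quality levels. It is not the benchmark suite used for the graphs: it uses Python's standard zlib module (brotli is left out because it is not in the standard library), a synthetic stand-in payload in place of the blog.cloudflare.com HTML, and a function name and round count that are our own choices.

```python
import time
import zlib


def bench_zlib(data: bytes, level: int, rounds: int = 20) -> float:
    """Return compression throughput in MiB/s for one zlib level."""
    start = time.perf_counter()
    for _ in range(rounds):
        zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return (len(data) * rounds) / (1024 * 1024) / elapsed


# Stand-in payload held in memory (assumption: any HTML-like data will do here).
html = b"<html><body>" + b"<p>Hello from the edge.</p>" * 4000 + b"</body></html>"

for level in (1, 4, 6, 9):
    print(f"zlib level {level}: {bench_zlib(html, level):8.1f} MiB/s")
```

Running the same script on two machines and dividing the results gives the kind of per-quality ratio discussed above. Absolute numbers will differ from a C benchmark, and a highly repetitive payload like this one compresses unrealistically well.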

A lot of our services are written in Go. In the following graphs we’re redoing the crypto and compression tests in Go along with RegExp on 32KB strings and the strings library.

Go Cryptography

Go Compression

Go Regexp

Go Strings

AMD performs better in all of our Go benchmarks except ECDSA P256 Sign, where it loses by 38%. That is peculiar, since in the equivalent test in C it does 24% better, and it’s worth investigating what’s going on here. Other than that, AMD doesn’t win by as large a margin as in the C tests, but it still proves to be better.

LuaJIT

We rely a lot on LuaJIT in our stack. As Vlad said, it’s the glue that holds Cloudflare together. We’re glad to show that AMD wins here as well.

Overall our tests show a single EPYC 7642 to be more competitive than two Xeon Platinum 6162. While there are a couple of tests where AMD loses out such as OpenSSL AES-128-GCM and Go OpenSSL ECDSA-P256 Sign, AMD wins in all the others. By scanning quickly and treating all tests equally, AMD does on average 25% better than Intel.

Performance Simulations

After our ‘sniff’ tests, we put our servers through another series of emulations which apply synthetic workloads simulating our edge software stack. Here we simulate scenarios with the different types of requests we see in production. Requests vary by asset size, by whether they go through HTTP or HTTPS, WAF, or Workers, and by many additional variables. The graph below compares the throughput of the two CPUs for the types of requests we see most typically.

The results above are ratios using Gen 9’s Intel CPUs as the baseline, normalized at 1.0 on the X-axis. For example, looking at simple requests of 10KiB assets over HTTPS, we see that AMD does 1.50x better than Intel in Requests per Second. On average for the tests shown on the graph above, AMD performs 34% better than Intel. Considering that the TDP of the single AMD EPYC 7642 is 225W, compared with 300W for the two Intel CPUs, we’re looking at AMD delivering up to 2.0x better Requests per Watt vs. the Intel CPUs!
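
The "up to 2.0x" figure combines the best-case throughput ratio with the ratio of rated TDPs. A minimal sketch of that arithmetic, using only the numbers quoted above; it treats TDP as a stand-in for power draw, which is a simplification, and the function name is ours:

```python
def requests_per_watt_ratio(rps_ratio: float, amd_tdp_w: float, intel_tdp_w: float) -> float:
    """Relative Requests per Watt: throughput ratio divided by power ratio."""
    return rps_ratio / (amd_tdp_w / intel_tdp_w)


AMD_TDP_W = 225        # single EPYC 7642
INTEL_TDP_W = 2 * 150  # two Xeon Platinum 6162

best_case = requests_per_watt_ratio(1.50, AMD_TDP_W, INTEL_TDP_W)  # 10KiB over HTTPS
average = requests_per_watt_ratio(1.34, AMD_TDP_W, INTEL_TDP_W)    # mean of graphed tests

print(f"Best case: {best_case:.2f}x Requests per Watt")
print(f"Average:   {average:.2f}x Requests per Watt")
```

Because the denominator is rated TDP and not measured power, these ratios are an upper-level estimate that still needs to be checked against real power draw.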

By this time, we were already leaning heavily toward a single socket setup with AMD EPYC 7642 as our CPU for Gen X. We were excited to see exactly how well AMD EPYC servers would do in production, so we immediately shipped a number of the servers out to some of our data centers.

Live Production

Step one of course was to get all our test servers set up for a production environment. All of our machines in the fleet are loaded with the same processes and services which makes for a great apples-to-apples comparison.  Like data centers everywhere, we have multiple generations of servers deployed and we deploy our servers in clusters such that each cluster is pretty homogeneous by server generation. In some environments this can lead to varying utilization curves between clusters.  This is not the case for us. Our engineers have optimized CPU utilization across all server generations so that no matter if the machine’s CPU has 8 cores or 24 cores, CPU usage is generally the same.

As you can see above and to illustrate our ‘similar CPU utilization’ comment, there is no significant difference in CPU usage between Gen X AMD powered servers and Gen 9 Intel based servers. This means both test and baseline servers are equally loaded. Good. This is exactly what we want to see with our setup, to have a fair comparison. The 2 graphs below show the comparative number of requests processed at the CPU single core and all core (server) level.

We see that AMD does on average about 23% more requests. That’s really good! We talked a lot about bringing more muscle in the Gen 9 blog. We have the same number of cores, yet AMD does more work, and does it with less power. The specs for number of cores and TDP suggested as much at the beginning, so it’s really nice to see that AMD also delivers significantly more performance with better power efficiency in practice.

But as we mentioned earlier, TDP isn’t a standardized spec across manufacturers so let’s look at real power usage below. Measuring server power consumption along with requests per second (RPS) yields the graph below:

Observing our servers’ request rate over their power consumption, the AMD Gen X server performs 28% better. While we could have expected more out of AMD since its TDP is 25% lower, keep in mind that TDP is very ambiguous. In fact, we saw that AMD’s actual power draw ran nearly at spec TDP with its much higher than base frequency; Intel was far from it. That is another reason why TDP is becoming a less reliable estimate of power draw. Moreover, the CPU is just one component contributing to the overall power of the system. Recall that the Intel CPUs are integrated in a multi-node system as described in the Gen 9 blog, while AMD is in a regular 1U form-factor machine. That actually doesn’t favor AMD, since multi-node systems are designed for high density at lower power per node, yet the AMD system still outperformed the Intel system on a power per node basis.

Through the majority of comparisons from the datasheets, test simulations, and live production performance, the 1P AMD EPYC 7642 configuration performed significantly better than the 2P Intel Xeon 6162. We’ve seen in some environments that AMD can do up to 36% better in live production and we believe we can achieve that consistently with some optimization on both our hardware and software.

So that’s it. AMD wins.

The additional graphs below show the median and p99 NGINX processing latencies, which are mostly on-CPU time, for the two CPUs over 24 hours. On average, AMD processes requests about 25% faster. At p99, it is about 20-50% faster depending on the time of day.
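
For readers less familiar with these statistics, the sketch below shows how a median and a p99 can be computed from raw latency samples using the nearest-rank method. The sample values are invented for illustration and are not measurements from our servers; the function name is ours, and monitoring systems often use interpolating or approximate percentile estimators instead.

```python
def percentile(samples, p: int):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, clamped to at least rank 1.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical per-request processing times in milliseconds (made-up data).
intel_ms = [10, 12, 11, 13, 40, 12, 11, 10, 12, 11]
amd_ms = [8, 9, 8, 10, 24, 9, 8, 8, 9, 8]

for name, samples in (("Intel", intel_ms), ("AMD", amd_ms)):
    print(f"{name}: median={percentile(samples, 50)} ms, p99={percentile(samples, 99)} ms")
```

The median describes the typical request, while p99 captures the slow tail, which is why the two can improve by different amounts.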

Conclusion

Hardware and Performance engineers at Cloudflare do significant research and testing to figure out the best server configuration for our customers. Solving big problems like this is why we love working here, and we’re also helping solve yours with our services like serverless edge compute and the array of security solutions such as Magic Transit, Argo Tunnel, and DDoS protection. All of our servers on the Cloudflare Network are designed to make our products work reliably, and we strive to make each new generation of our server design better than its predecessor. We believe the AMD EPYC 7642 is the answer for our Gen X’s processor question.

With Cloudflare Workers, developers have enjoyed deploying their applications to our Network, which is ever expanding across the globe. We’ve been proud to empower our customers by letting them focus on writing their code while we are managing the security and reliability in the cloud. We are now even more excited to say that their work will be deployed on our Gen X servers powered by 2nd Gen AMD EPYC processors.

Expanding Rome to a data center near you

Thanks to AMD, using the EPYC 7642 allows us to increase our capacity and expand into more cities more easily. Rome wasn’t built in one day, but it will be very close to many of you.

In the last couple of years, we’ve been experimenting with many Intel and AMD x86 chips along with ARM CPUs. We look forward to having these CPU manufacturers partner with us for future generations so that together we can help build a better Internet.

Cloudflare’s Gen X: Servers for an Accelerated Future

Post Syndicated from Nitin Rao original https://blog.cloudflare.com/cloudflares-gen-x-servers-for-an-accelerated-future/

“Every server can run every service.”

We designed and built Cloudflare’s network to be able to grow capacity quickly and inexpensively; to allow every server, in every city, to run every service; and to allow us to shift customers and traffic across our network efficiently. We deploy standard, commodity hardware, and our product developers and customers do not need to worry about the underlying servers. Our software automatically manages the deployment and execution of our developers’ code and our customers’ code across our network. Since we manage the execution and prioritization of code running across our network, we are both able to optimize the performance of our highest tier customers and effectively leverage idle capacity across our network.

An alternative approach might have been to run several fragmented networks with specialized servers designed to run specific features, such as the Firewall, DDoS protection or Workers. However, we believe that approach would have resulted in wasted idle resources and given us less flexibility to build new software or adopt the newest available hardware. And a single optimization target means we can provide security and performance at the same time.

We use Anycast to route a web request to the nearest Cloudflare data center (from among 200 cities), improving performance and maximizing the surface area to fight attacks.

Once a data center is selected, we use Unimog, Cloudflare’s custom load balancing system, to dynamically balance requests across diverse generations of servers. We load balance at different layers: between cities, between physical deployments located across a city, between external Internet ports, between internal cables, between servers, and even between logical CPU threads within a server.

As demand grows, we can scale out by simply adding new servers, points of presence (PoPs), or cities to the global pool of available resources. If any server component has a hardware failure, it is gracefully de-prioritized or removed from the pool, to be batch repaired by our operations team. This architecture has enabled us to have no dedicated Cloudflare staff in any of the 200 cities, relying instead on help for infrequent physical tasks from the ISPs (or data centers) hosting our equipment.
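
To make the idea of balancing across server generations and draining failed hardware concrete, here is a minimal Python sketch of smooth weighted round-robin over a pool. It is only an illustration: Unimog's real mechanism is not described in this post, and the `Server` and `Pool` classes, the server names, and the weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    weight: int           # hypothetical relative capacity (newer generation = higher)
    healthy: bool = True
    current: int = 0      # running score used by smooth weighted round-robin


class Pool:
    def __init__(self, servers):
        self.servers = list(servers)

    def mark_failed(self, name):
        """Take a server out of rotation until it has been repaired."""
        for server in self.servers:
            if server.name == name:
                server.healthy = False

    def pick(self):
        """Return the name of the server that should take the next request."""
        live = [s for s in self.servers if s.healthy]
        if not live:
            raise RuntimeError("no healthy servers in pool")
        total = sum(s.weight for s in live)
        for s in live:
            s.current += s.weight
        best = max(live, key=lambda s: s.current)
        best.current -= total
        return best.name


# Hypothetical pool mixing an older and a newer server generation.
pool = Pool([Server("gen9-a", 2), Server("genx-a", 3)])
picks = [pool.pick() for _ in range(5)]
print(picks)
```

A server with a larger weight receives proportionally more picks, and a server marked as failed simply stops being selected.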

Gen X: Intel Not Inside

We recently turned up our tenth generation of servers, “Gen X”, which is already deployed across major US cities and is in the process of being shipped worldwide. Compared with our prior server (Gen 9), it processes as many as 36% more requests while costing substantially less. Additionally, it enables a ~50% decrease in L3 cache miss rate and up to 50% decrease in NGINX p99 latency, powered by a CPU rated at 25% lower TDP (thermal design power) per core.

Notably, for the first time, Intel is not inside. We are not using their hardware for any major server components such as the CPU, board, memory, storage, network interface card (or any type of accelerator). Given how critical Intel is to our industry, this would until recently have been unimaginable, and is in contrast with prior generations which made extensive use of their hardware.

[Image: Intel-based Gen 9 server]

This time, AMD is inside.

We were particularly impressed by the 2nd Gen AMD EPYC processors because they proved to be far more efficient for our customers’ workloads. Since the pendulum of technology leadership swings back and forth between providers, we wouldn’t be surprised if that changes over time. However, we were happy to adapt quickly to the components that made the most sense for us.

Compute

CPU efficiency is very important to our server design. Since we have a compute-heavy workload, our servers are typically limited by the CPU before other components. Cloudflare’s software stack scales quite well with additional cores. So, we care more about core-count and power-efficiency than dimensions such as clock speed.

We selected the AMD EPYC 7642 processor in a single-socket configuration for Gen X. This CPU has 48 cores (96 threads), a base clock speed of 2.4 GHz, and an L3 cache of 256 MB. While the rated power (225W) may seem high, it is lower than the combined TDP in our Gen 9 servers, and we preferred the performance of this CPU over lower-power variants. Although AMD offers a higher core count option with 64 cores, the performance gains for our software stack and usage weren’t compelling enough.
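
The per-core power figure follows from simple arithmetic on the numbers quoted in this post. The short Python sketch below works it out. The implied Gen 9 value is only what the "25% lower TDP per core" claim would suggest when worked backwards, not a published specification, and the variable names are illustrative.

```python
# Figures quoted in the post for the AMD EPYC 7642.
RATED_TDP_WATTS = 225
CORES = 48

watts_per_core = RATED_TDP_WATTS / CORES

# The post says the Gen X CPU is rated at 25% lower TDP per core than Gen 9.
# Working backwards gives an implied (not published) Gen 9 figure.
PER_CORE_REDUCTION = 0.25
implied_gen9_watts_per_core = watts_per_core / (1 - PER_CORE_REDUCTION)

print(f"Gen X: {watts_per_core:.2f} W per core")
print(f"Implied Gen 9: {implied_gen9_watts_per_core:.2f} W per core")
```

That is roughly 4.7 W per core for Gen X, against an implied 6.25 W per core for the previous generation.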

We have deployed the AMD EPYC 7642 in half a dozen Cloudflare data centers; it is considerably more powerful than the dual-socket pair of high-core-count Intel processors (Skylake as well as Cascade Lake) that we used in the last generation.

Readers of our blog might remember our excitement around ARM processors. We even ported the entirety of our software stack to run on ARM, just as it does with x86, and have been maintaining that ever since even though it calls for slightly more work for our software engineering teams. We did this leading up to the launch of Qualcomm’s Centriq server CPU, which was eventually shuttered. While none of the off-the-shelf ARM CPUs available at the moment are interesting to us, we remain optimistic about high core count offerings launching in 2020 and beyond, and look forward to a day when our servers are a mix of x86 (Intel and AMD) and ARM.

We aim to replace servers when the efficiency gains enabled by new equipment outweigh their cost.

The performance we’ve seen from the AMD EPYC 7642 processor has encouraged us to accelerate replacement of multiple generations of Intel-based servers.

Compute is our largest investment in a server. Our heaviest workloads, from the Firewall to Workers (our serverless offering), often require more compute than other server resources. Also, the average size in kilobytes of a web request across our network tends to be small, influenced in part by the relative popularity of APIs and mobile applications. Our approach to server design is very different from that of traditional content delivery networks engineered to deliver large-object video libraries, for which servers focused on storage might make more sense, and for which re-architecting to offer serverless is prohibitively capital intensive.

Our Gen X server is intentionally designed with an “empty” PCIe slot for a potential add on card, if it can perform some functions more efficiently than the primary CPU. Would that be a GPU, FPGA, SmartNIC, custom ASIC, TPU or something else? We’re intrigued to explore the possibilities.

In accompanying blog posts over the next few days, our hardware engineers will describe how the AMD EPYC 7642 performed against the benchmarks we care about. We are thankful for their hard work.

Memory, Storage & Network

Since we are typically limited by CPU, Gen X represented an opportunity to grow components such as RAM and SSD more slowly than compute.

For memory, we continued to use 256GB of RAM, as in our prior generation, but rated higher at 2933MHz. For storage, we continue to have ~3TB, but moved to a 3x1TB form factor using NVMe flash (instead of SATA) with increased available IOPS and higher endurance, which enables full disk encryption using LUKS without penalty. For the network card, we continue to use a Mellanox 2x25G NIC.

We moved from our multi-node chassis back to a simple 1U form factor, designed to be lighter and less error prone during operational work at the data center. We also added multiple new ODM partners to diversify how we manufacture our equipment and to take advantage of additional global warehousing.

Network Expansion

Our newest generation of servers gives us the flexibility to continue to build out our network even closer to every user on Earth. We’re proud of the hard work from across engineering teams on Gen X, and are grateful for the support of our partners. Be on the lookout for more blogs about these servers in the coming days.

Get Cloudflare insights in your preferred analytics provider

Post Syndicated from Simon Steiner original https://blog.cloudflare.com/cloudflare-partners-with-analytics-providers/

Today, we’re excited to announce our partnerships with Chronicle Security, Datadog, Elastic, Looker, Splunk, and Sumo Logic to make it easy for our customers to analyze Cloudflare logs and metrics using their analytics provider of choice. In a joint effort, we have developed pre-built dashboards that are available as a Cloudflare App in each partner’s platform. These dashboards help customers better understand events and trends from their websites and applications on our network.


Cloudflare insights in the tools you’re already using

Data analytics is a frequent theme in conversations with Cloudflare customers. Our customers want to understand how Cloudflare speeds up their websites and saves them bandwidth, to rank their fastest and slowest pages, and to be alerted if they are under attack. While providing insights is a core tenet of Cloudflare’s offering, the data analytics market has matured and many of our customers have started using third-party providers to analyze data, including Cloudflare logs and metrics. By aggregating data from multiple applications, infrastructure, and cloud platforms in one dedicated analytics platform, customers can create a single pane of glass and benefit from better end-to-end visibility over their entire stack.

While these analytics platforms provide great benefits in terms of functionality and flexibility, they can take significant time to configure: from ingesting logs, to specifying data models that make data searchable, all the way to building dashboards to get the right insights out of the raw data. We see this as an opportunity to partner with the companies our customers are already using to offer a better and more integrated solution.

Providing flexibility through easy-to-use integrations

To address these complexities of aggregating, managing, and displaying data, we have developed a number of product features and partnerships to make it easier to get insights out of Cloudflare logs and metrics. In February we announced Logpush, which allows customers to automatically push Cloudflare logs to Google Cloud Storage and Amazon S3. Both of these cloud storage solutions are supported by the major analytics providers as a source for collecting logs, making it possible to get Cloudflare logs into an analytics platform with just a few clicks. With today’s announcement of Cloudflare’s Analytics Partnerships, we’re releasing a Cloudflare App—a set of pre-built and fully customizable dashboards—in each partner’s app store or integrations catalogue to make the experience even more seamless.

By using these dashboards, customers can immediately analyze events and trends of their websites and applications without first needing to wade through individual log files and build custom searches. The dashboards feature all 55+ fields available in Cloudflare logs and include 90+ panels with information about the performance, security, and reliability of customers’ websites and applications.
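
For a sense of what a dashboard panel does behind the scenes, here is a minimal Python sketch that aggregates a few log records by response status and by host. It assumes the logs arrive as newline-delimited JSON and that the records carry `ClientRequestHost` and `EdgeResponseStatus` fields. Treat both as assumptions and check the field list for your own logs. The sample records and the `summarize` function are made up for illustration.

```python
import json
from collections import Counter

# Made-up sample records; real logs contain many more fields.
SAMPLE_LOGS = "\n".join([
    '{"ClientRequestHost": "example.com", "EdgeResponseStatus": 200}',
    '{"ClientRequestHost": "example.com", "EdgeResponseStatus": 503}',
    '{"ClientRequestHost": "api.example.com", "EdgeResponseStatus": 200}',
])


def summarize(ndjson_text):
    """Count requests per response status and per host."""
    by_status = Counter()
    by_host = Counter()
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue  # skip blank lines between records
        record = json.loads(line)
        by_status[record["EdgeResponseStatus"]] += 1
        by_host[record["ClientRequestHost"]] += 1
    return by_status, by_host


by_status, by_host = summarize(SAMPLE_LOGS)
print(by_status.most_common())
print(by_host.most_common())
```

The pre-built dashboards remove the need to write and maintain this kind of aggregation by hand for each field.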

Ultimately, we want to provide flexibility to our customers and make it easier to use Cloudflare with the analytics tools they already use. Improving our customers’ ability to get better data and insights continues to be a focus for us, so we’d love to hear about what tools you’re using—tell us via this brief survey. To learn more about each of our partnerships and how to get access to the dashboards, please visit our developer documentation or contact your Customer Success Manager. Similarly, if you’re an analytics provider who is interested in partnering with us, use the contact form on our analytics partnerships page to get in touch.

Why I’m helping Cloudflare build its partnerships worldwide

Post Syndicated from Matthew Harrell original https://blog.cloudflare.com/helping-cloudflare-build-its-partnerships-worldwide/

Cloudflare has always had an audacious mission: to help build a better Internet. From its inception, the company realized that a mission this big couldn’t be taken on alone. Such an undertaking would require the help of an extraordinary group of partners. Early in the company’s history, Cloudflare built strong relationships with many hosting providers to protect and accelerate Internet traffic. And through the years, Cloudflare has continued to build some amazing Enterprise partnerships and strategic alliances.

As we continue to grow and foster our partner ecosystem, we are excited to announce Cloudflare’s next iteration of its Partner Program—to engage and enable an equally audacious set of partners that want to help build a better Internet, together.

I recently joined Cloudflare to run Global Channel Sales & Partnerships after spending over nine years at Google Cloud in various indirect and direct leadership roles. At Google, I witnessed the powerful impact that a strong partner ecosystem could have on solving complex organizational and societal problems. By combining innovative technologies provided by the manufacturer, with deep domain expertise provided by the partner, we delivered valuable industry solutions to our customers. And through this process, we helped our partners build valuable businesses, accelerate growth, and bring new innovation economies to all parts of the globe.

I joined Cloudflare because I strongly believe in its mission to help build a better Internet, and believe this mission, paired with its massive global network, will enable the company to continue to deliver incredibly innovative solutions to customers of all segments. Cloudflare has strong brand recognition, a market leading product portfolio, an ambitious vision, and a leadership team that is 100% committed to building out the channel and partner program.

I’m excited to connect with Cloudflare partners, and my first priority as the global channel leader is to provide our partners with the tools and programs which allow them to build a compelling business around our products. I’m eager to continue developing a world class program and organization that is:

  • Focused on helping partners build compelling businesses: Cloudflare has a history of democratizing Internet technologies that were once difficult to access, or complicated to use and even understand, such as free SSL, unmetered DDoS, and wholesale Registrar. We plan to take a similar market-shifting approach with our partners. We are redesigning our partner program with a vision of developing best-in-class revenue share models and value added professional services and managed services that we scale through our partners.
  • Easy to do business with: Cloudflare has always prided itself on its ease of use, and we want the partner experience to be just as seamless. We have redesigned how our partners engage with us, from initial sign up to ongoing engagement, to make it even easier for partners to do business with us. This includes a simpler deal registration process, smooth product training for partner reps, straightforward tracking of deals, and an easier path overall to profiting from their relationship with Cloudflare.
  • Strategically focused: Cloudflare has always relied on valuable partnerships on its mission to help build a better Internet. We are expanding that commitment by diving deeper with those partners that are committed to building their businesses around Cloudflare. We plan to invest resources and design partner-first programs that reward partners for leaning in and investing in Cloudflare’s mission.

Today, you’ll see a few important announcements around the future of our program and how we continue to scale to support some of our most complex partnerships.

We look forward to helping you build your business with Cloudflare!

For those partners that will be in London, please join us at Cloudflare Connect // London, our second annual London gathering of distinguished businesses and technologists, including many Cloudflare customers, partners, and developers. This is Cloudflare’s marquee customer event, which means the content and experience are built for you. I plan to be there personally to formally announce our new partner program, and provide insights on what’s to come.

You can register here: CloudflareConnect.com


Cloudflare Partners: A New Program with New Partners

Post Syndicated from Dan Hollinger original https://blog.cloudflare.com/cloudflare-partners-a-new-program-with-new-partners/

Many overlook a critical portion of the language in Cloudflare’s mission: “to help build a better Internet.” From the beginning, we knew a mission this bold, an undertaking of this magnitude, couldn’t be done alone. We could only help. To ultimately build a better Internet, it would take a diverse and engaged ecosystem of technologies, customers, partners, and end-users. Fortunately, we’ve been able to work with amazing partners as we’ve grown, and we are eager to announce new, specific programs to grow our ecosystem with an increasingly diverse set of partners.

Today, we’re excited to announce the latest iteration of our partnership program for solutions partners. These categories encompass resellers and referral partners, OEM partners, and the new partner services program. Over the past few years, we’ve grown and learned from some amazing partnerships, and want to bring those best practices to our latest partners at scale—to help them grow their business with Cloudflare’s global network.

[Image: Cloudflare Partner Tiers]

Partner Program for Solution Partners

Every partner program out there has tiers, and Cloudflare’s program is no exception. However, our tiering was built to help our partners ramp up, accelerate and move fast. As Matt Harrell highlighted, we want the process to be as seamless as possible, to help partners find the level of engagement that works best for them, with world-class enablement paths and best-in-class revenue share models built in from the beginning.

World-Class Enablement

Cloudflare offers complimentary training and enablement to all partners. From self-serve paths to partner-focused webinars, instructor-based courses, and certification, we want to ensure our partners can learn and develop Cloudflare and product expertise, to make them as effective as possible when utilizing our massive global network.

Driving Business Value

We want our partners to grow and succeed. From self-serve resellers to our most custom integration, we want to make it frictionless to build a profitable business on Cloudflare. From our tenant API system to dedicated account teams, we’re ready to help you go-to-market with solutions that help end-customers. This includes opportunities to co-market, develop target accounts, and directly partner, to succeed with strategic accounts.

Cloudflare recognizes that, in order to help build a better Internet, we need successful partners—and our latest program is built to help partners build and grow profitable business models around Cloudflare’s solutions.

Partner Services Program – SIs, MSSPs, PSOs

For the first time, we are expanding our program to also include service providers that want to develop and grow profitable services practices around Cloudflare’s solutions.

Our customers face some of the most complex challenges on the Internet. From those challenges, we’ve already seen some amazing opportunities for service providers to create value, grow their business, and make an impact. From customers migrating to the public, hybrid or multi-cloud for the first time, to entirely re-writing applications using Cloudflare Workers®, the need for expertise has increased across our entire customer base. In early pilots, we’ve seen four major categories in building a successful service practice around Cloudflare:

  • Network Digital Transformations – Help customers migrate and modernize their network solution. Cloudflare is the only cloud-native network to give Enterprise-grade control and visibility into on-prem, hybrid, and multi-cloud architectures.
  • Serverless Architecture Development – Provide serverless application development services, thought leadership, and technical consulting around leveraging Cloudflare Workers and Apps.
  • Managed Security & Insights – Enable CISOs and IT leaders to obtain a single pane of glass for reporting and policy management across all Internet-facing applications with Cloudflare’s Security solutions.
  • Managed Performance & Reliability – Keep customers’ high-availability applications running quickly and efficiently with Cloudflare’s Global Load Balancing, Smart Routing, and Anycast DNS, which enable performance consulting, traffic analysis, and application monitoring.

As we expand this program, we are looking for audacious service providers and system integrators that want to help us build a better Internet for our mutual customers. Cloudflare can be an essential lynchpin to simplify and accelerate digital transformations. We imagine a future where massive applications run within 10ms of 90% of the global population, and where a single-pane solution provides security, performance, and reliability for mission-critical applications—running across multiple clouds. Cloudflare needs help from amazing global integrators and service partners to help realize this future.

If you are interested in learning more about becoming a service partner and growing your business with Cloudflare, please reach out to [email protected] or explore cloudflare.com/partners/services

Just Getting Started

Metcalfe’s law states that a network is only as powerful as the number of nodes within the network. And within the global Cloudflare network, we want as many partner nodes as possible, from agencies to systems integrators, managed security providers, Enterprise resellers, and OEMs. A diverse ecosystem of partners is essential to our mission of helping to build a better Internet, together. We are dedicated to the success of our partners, and we will continue to iterate and develop our programs to make sure our partners can grow and develop on Cloudflare’s global network. Our commitment moving forward is that Cloudflare will be the easiest and most valuable solution for channel partners to sell and support globally.


Announcing the New Cloudflare Partner Platform

Post Syndicated from Garrett Galow original https://blog.cloudflare.com/announcing-the-new-cloudflare-partner-platform/

When I first started at Cloudflare over two years ago, one of the first things I was tasked with was to help evolve our partner platform to support the changes in our service and the expanding needs of our partners and customers. Cloudflare’s existing partner platform was released in 2010. It is a testament to those who built it, that it was, and still is, in use today—but it was also clear that the landscape had substantially changed.

Since the launch of the existing partner platform, we had built and expanded multi-user access, and launched many new products: Argo, Load Balancing, and Cloudflare Workers, to name a few. Retrofitting the existing offering was not practical. Cloudflare needed a new partner platform that could meet the needs of partners and their customers.

As the team started to develop a new solution, we needed to find a partner who could keep us on the right path. The number of hypotheticals was infinite and we needed a first customer to ground ourselves. Lo and behold, not long after I had begun putting pen to paper, we found the perfect partner for the new platform.

The IBM Partnership

IBM was looking for a partner to bring various edge services to market quickly, and our suite of capabilities was what they were looking for. If you are not familiar with our partnership with IBM, you can learn a bit more about it in our blog post and on the IBM Cloud Internet Services landing page. We signed the contract in November 2017, and we had to be ready to launch by IBM Think the following February. Given that IBM’s engineering team needed time to integrate with us, we were on a tight timeline to deliver.

A number of team members and I jumped on a plane and flew to Austin, Texas (Hook ‘em!) to work with IBM and determine the minimum viable product (MVP). Over kolaches (for the Czech readers at home: Klobásník), IBM and Cloudflare nailed down the MVP requirements. Briefly, they were as follows:

  1. Full API integration to provision the building blocks of using Cloudflare.
    • This included:
      1. Accounts: The container of resources – typically zones
      2. Users: The way in which we partition access to accounts
  2. The ability to sell and provision Cloudflare’s paid services and package them in a way that made sense for IBM’s customers.
    • Our existing partner platform only supported zone plans and none of our newer offerings, such as Argo or load balancing.
    • IBM had specific requirements around how they could package and sell to customers, so our solution needed to be flexible enough to support that.
  3. Ensure that what we built was re-usable.
    • Cloudflare makes it a point to solve problems for scale. While we were focused on ensuring our first partner would be successful, we knew that long term we would need to be able to scale this solution to additional partners. Nothing we built could prevent us from doing that.

Over the next couple of months, many teams at Cloudflare came together to deliver this solution at breakneck speed. Given that the midpoint of this effort happened over the holiday season, I’m personally proud of our company not sacrificing employees’ time with their friends and families in order to deliver. Even when it feels like a sprint, it is still a marathon.

During this time, the engineering team we were working with at IBM felt like another team at Cloudflare. Their ability to move quickly, integrate, and validate our work was critical to the success of the project. At THINK in February 2018, we were able to announce the Beta of IBM CIS (Cloud Internet Services) powered by Cloudflare!

Following the initial release, we continued to add functionality to further enrich the IBM CIS offering, while behind the scenes we continued our work to redefine Cloudflare’s partner platform.

The New Partner Platform

Over the past year we have expanded the capabilities and completed the necessary work to enable more partners to be able to use what we initially built for the IBM partnership. Out of that comes our new partner platform we are announcing today. The new partner platform allows partners of Cloudflare to sell and provision Cloudflare for their customers in a scalable fashion.

Our new partner platform is the combination of two systems designed to fulfill specific needs:

1. Tenants: an abstraction on top of our existing accounts and users for easier management
2. Subscriptions: a new way of packaging and provisioning services

Tenants

An absolute necessity for partners is the ability to provision accounts for each of their customers. Normally the only way to get a Cloudflare account is to sign up on the dashboard. We needed a way for partners to be able to create end customer accounts at their discretion to support their specific onboarding needs. This also ensures proper separation of ownership between customers and allows end customers to access the Cloudflare dashboard directly.

With the introduction of tenants, our data model now looks like the following:

[Image: Cloudflare Resource Data Model]

Tenants provide partners the ability to create and manage the accounts for their customers. Each account created is a separate container of resources (zones, workers, etc.) for each customer. Users can be invited to each account as necessary for self service management, while the partner retains control of the capabilities enabled for each account. How a partner manages those capabilities brings us to the second major system that makes up the new partner platform.
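
The relationship between tenants, accounts, and users can be pictured with a small Python sketch. This is not Cloudflare's actual schema: the `Tenant` and `Account` classes, their fields, and the example names are hypothetical, chosen only to show that each customer gets an isolated container of resources under the partner's tenant.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """A container of resources belonging to one end customer."""
    name: str
    zones: list = field(default_factory=list)
    members: set = field(default_factory=set)  # users invited to this account


@dataclass
class Tenant:
    """A partner that creates and manages accounts for its customers."""
    name: str
    accounts: dict = field(default_factory=dict)

    def create_account(self, customer_name):
        if customer_name in self.accounts:
            raise ValueError(f"account already exists: {customer_name}")
        account = Account(name=customer_name)
        self.accounts[customer_name] = account
        return account


partner = Tenant("Example Partner")
acme = partner.create_account("Acme")
globex = partner.create_account("Globex")

# Resources and users added to one account do not appear in another.
acme.zones.append("acme.example")
acme.members.add("admin@acme.example")
```

Because every account owns its own lists of zones and members, changes made for one customer never leak into another customer's account.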

Subscriptions

While not as obvious as the need for account provisioning, the ability to package and provision services is critical to providing differentiated offerings for partners of Cloudflare. One drawback of our old partner platform was the difficulty in ensuring new products and services were available to those partners. As Cloudflare grew, it reached the point where new paid services could not be added into the existing partner platform.

With subscriptions, this is no longer the case. What started as just a way to provision services for IBM has now grown into the standard for how all customer services are provisioned at Cloudflare. Whether you purchase services through IBM CIS or buy Cloudflare Workers in our dashboard, behind the scenes, Subscriptions is what ensures you get exactly the right services enabled.

Enough talk, let’s show things in action!

The Partner Platform in Action

The full details of using the new partner platform can be found in our Provisioning API docs, but here we provide a walkthrough of a typical use case.

Using the new partner platform involves 4 steps:

  1. Provisioning Customer Accounts
  2. Granting Customer Access
  3. Enabling Services
  4. Service Configuration

1) Provisioning Customer Accounts

When onboarding customers, you want each to have their own Cloudflare account. This ensures one customer cannot affect any resources belonging to another. By making a `POST /accounts` request, you can create an account for an individual customer.

Request:

curl -X POST \
    https://api.cloudflare.com/client/v4/accounts \
    -H 'Content-Type: application/json' \
    -H 'x-auth-email: <x-auth-email>' \
    -H 'x-auth-key: <x-auth-key>' \
    -d '{ "name": "Customer Account", 
          "type": "standard" 
        }'

Response:

{
    "result": {
        "id": "2bab6ace8c72ed3f09b9eca6db1396bb",
        "name": "Customer Account",
        "type": "standard",
        "settings": {
            "enforce_twofactor": false
        }
    },
    "success": true,
    "errors": [],
    "messages": []
}

This new account is owned by the partner. It can be managed by API, or in the UI by the partner or any additional administrators that are invited.
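The same request can also be assembled from a script. The following is a minimal Python sketch using only the standard library. The helper name `build_create_account_request` and the placeholder credentials are assumptions for illustration; the endpoint, headers, and body fields are the ones from the curl call above. The sketch only builds the request, so nothing is sent to the API until it is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"


def build_create_account_request(name, auth_email, auth_key):
    """Build (but do not send) the POST /accounts request from step 1."""
    body = json.dumps({"name": name, "type": "standard"}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/accounts",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Auth-Email": auth_email,
            "X-Auth-Key": auth_key,
        },
    )


# Placeholder credentials, as in the curl example.
request = build_create_account_request(
    "Customer Account", "<x-auth-email>", "<x-auth-key>"
)
print(request.get_method(), request.full_url)
```

To send it, something like `json.load(urllib.request.urlopen(request))["result"]["id"]` would yield the new account ID, assuming the response has the shape shown above.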

2) Granting Customer Access

Now that the customer’s account is created, let’s give them access to it. This step uses existing APIs and if you have shared access to a Cloudflare account before, then you have already done this.

Request:

curl -X POST \
    'https://api.cloudflare.com/client/v4/accounts/2bab6ace8c72ed3f09b9eca6db1396bb/members' \
    -H 'Content-Type: application/json' \
    -H 'x-auth-email: <x-auth-email>' \
    -H 'x-auth-key: <x-auth-key>' \
    -d '{ "email": "[email protected]",
          "roles": ["05784afa30c1afe1440e79d9351c7430"],
          "status": "accepted" 
        }'

Response:

{
    "result": {
        "id": "47bd8083af8516a20c410090d2f53655",
        "user": {
            "id": "fccad3c46f26dc2d6ba47ad19f639707",
            "first_name": null,
            "last_name": null,
            "email": "[email protected]",
            "two_factor_authentication_enabled": false
        },
        "status": "pending",
        "roles": [
            {
                "id": "05784afa30c1afe1440e79d9351c7430",
                "name": "Administrator",
                "description": "Can access the full account, except for membership management and billing.",
                "permissions": {
                    "organization": {
                        "read": true,
                        "edit": true
                    },
                    "zone": {
                        "read": true,
                        "edit": true
                    },
                    truncated...
                }
            }
        ]
    },
    "success": true,
    "errors": [],
    "messages": []
}

Alternatively, you can do this in the UI, from the Members section for the newly created account.

3) Enabling Services

Now the fun part! With the ability to provision subscriptions, you can enable paid services for your customers. Before we do that though, we will create a zone so we can attach a zone subscription to it.

Adding a zone as a partner is no different than adding a zone as a regular customer. It can also be done by the customer.

Request:

curl -X POST \
    https://api.cloudflare.com/client/v4/zones \
    -H 'Content-Type: application/json' \
    -H 'x-auth-email: <x-auth-email>' \
    -H 'x-auth-key: <x-auth-key>' \
    -d '{ "name": "theircompany.com",
            "account": { "id": "2bab6ace8c72ed3f09b9eca6db1396bb" }
        }'

Response:

{
    "result": {
        "id": "cae181e41197e2eb875d9bcb9396abe7",
        "name": "theircompany.com",
        "status": "pending",
        "paused": false,
        "type": "full",
        "development_mode": 0,
        "name_servers": [
            "lana.ns.cloudflare.com",
            "lynn.ns.cloudflare.com"
        ],
        "original_name_servers": null,
        "original_registrar": "cloudflare, inc.",
        "original_dnshost": null,
        "modified_on": "2019-05-30T17:51:08.510558Z",
        "created_on": "2019-05-30T17:51:08.510558Z",
        "activated_on": null,
        "meta": {
            "step": 4,
            "wildcard_proxiable": false,
            "custom_certificate_quota": 0,
            "page_rule_quota": 3,
            "phishing_detected": false,
            "multiple_railguns_allowed": false
        },
        "owner": {
            "id": null,
            "type": "user",
            "email": null
        },
        "account": {
            "id": "2bab6ace8c72ed3f09b9eca6db1396bb",
            "name": "Customer Account"
        },
        "permissions": [
            "#access:edit",
            "#access:read",
            ...truncated
        ],
        "plan": {
            "id": "0feeeeeeeeeeeeeeeeeeeeeeeeeeeeee",
            "name": "Free Website",
            "price": 0,
            "currency": "USD",
            "frequency": "",
            "is_subscribed": true,
            "can_subscribe": false,
            "legacy_id": "free",
            "legacy_discount": false,
            "externally_managed": false
        }
    },
    "success": true,
    "errors": [],
    "messages": []
}
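A partner script will usually need only a few fields from this response. The sketch below is one possible way to extract them in Python; the function name `summarize_zone` and the choice of fields are assumptions for illustration, and the sample input is a trimmed copy of the response above.

```python
import json


def summarize_zone(envelope):
    """Return the zone fields a partner might relay to the customer."""
    if not envelope.get("success"):
        raise RuntimeError(f"zone creation failed: {envelope.get('errors')}")
    zone = envelope["result"]
    return {
        "zone_id": zone["id"],
        "name": zone["name"],
        "status": zone["status"],
        "name_servers": zone["name_servers"],
        "plan": zone["plan"]["legacy_id"],
    }


# Trimmed copy of the zone creation response shown above.
response_text = """
{
    "result": {
        "id": "cae181e41197e2eb875d9bcb9396abe7",
        "name": "theircompany.com",
        "status": "pending",
        "name_servers": ["lana.ns.cloudflare.com", "lynn.ns.cloudflare.com"],
        "plan": {"legacy_id": "free"}
    },
    "success": true,
    "errors": [],
    "messages": []
}
"""

summary = summarize_zone(json.loads(response_text))
print(summary["status"], summary["name_servers"])
```

The zone ID is what the subscription call in the next request needs, and the assigned name servers are what the customer has to configure at their registrar.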

For this customer we will provision a Pro plan for the newly created zone. If you are not familiar with our zone plans, then you can read about them here. For this, we make a call to the subscriptions service.

Request:

curl -X POST \
    https://api.cloudflare.com/client/v4/zones/cae181e41197e2eb875d9bcb9396abe7/subscription \
    -H 'Content-Type: application/json' \
    -H 'X-Auth-Email: <x-auth-email>' \
    -H 'X-Auth-Key: <x-auth-key>' \
    -d '{ "rate_plan": { "id": "PARTNERS_PRO" } }'

Response:

{
    "success": true,
    "result": {
        "id": "ff563a93e11c46e7b278be46f49cdd2f",
        "product": {
            "name": "partners_cloudflare_zones",
            "period": "",
            "billing": "",
            "public_name": "CloudFlare Services",
            "duration": 0
        },
        "rate_plan": {
            "id": "partners_pro",
            "public_name": "Partners Professional Plan",
            "currency": "USD",
            "scope": "zone",
            "externally_managed": false,
            "sets": [
                "zone",
                "partner"
            ],
            "is_contract": true
        },
        "component_values": [
            {
                "name": "dedicated_certificates",
                "value": 0,
                "price": 0
            },
            {
                "name": "dedicated_certificates_custom",
                "value": 0,
                "price": 0
            },
            {
                "name": "page_rules",
                "value": 20,
                "default": 20,
                "price": 0
            },
            {
                "name": "zones",
                "value": 1,
                "default": 1,
                "price": 0
            }
        ],
        "zone": {
            "id": "cae181e41197e2eb875d9bcb9396abe7",
            "name": "theircompany.com"
        },
        "frequency": "monthly",
        "currency": "USD",
        "app": {
            "install_id": null
        },
        "entitled": true
    },
    "messages": null,
    "api_version": "2.0.0"
}

Now that the customer is set up with an account, zone, and zone subscription, the only thing left is configuring the resources appropriately.
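Steps 1 through 3 chain together naturally: each call feeds an ID into the next. The following Python sketch shows one way to script that flow under stated assumptions. The names `make_sender` and `onboard_customer` are hypothetical helpers, not part of any Cloudflare SDK; the paths and payloads mirror the curl requests above. The transport is passed in as a function so the flow can be exercised without touching the live API.

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"


def make_sender(auth_email, auth_key):
    """Return a send(method, path, payload) function that calls the live API."""
    def send(method, path, payload):
        request = urllib.request.Request(
            API_BASE + path,
            data=json.dumps(payload).encode("utf-8"),
            method=method,
            headers={
                "Content-Type": "application/json",
                "X-Auth-Email": auth_email,
                "X-Auth-Key": auth_key,
            },
        )
        with urllib.request.urlopen(request) as response:
            envelope = json.load(response)
        if not envelope.get("success"):
            raise RuntimeError(f"{method} {path} failed: {envelope.get('errors')}")
        return envelope["result"]
    return send


def onboard_customer(send, account_name, member_email, role_id,
                     zone_name, rate_plan_id):
    """Create an account, invite a member, add a zone, attach a subscription."""
    account = send("POST", "/accounts",
                   {"name": account_name, "type": "standard"})
    member = send("POST", f"/accounts/{account['id']}/members",
                  {"email": member_email, "roles": [role_id],
                   "status": "accepted"})
    zone = send("POST", "/zones",
                {"name": zone_name, "account": {"id": account["id"]}})
    subscription = send("POST", f"/zones/{zone['id']}/subscription",
                        {"rate_plan": {"id": rate_plan_id}})
    return {
        "account_id": account["id"],
        "member_id": member["id"],
        "zone_id": zone["id"],
        "subscription_id": subscription["id"],
    }
```

A real run would look like `onboard_customer(make_sender(email, key), "Customer Account", customer_email, role_id, "theircompany.com", "PARTNERS_PRO")`. Passing a stub in place of `make_sender(...)` lets the call sequence be checked offline.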

4) Service Configuration

Service configuration can be done by either you, the partner, or the end customer. Most commonly, DNS records need to be added, security settings verified and updated, and customizations made. These can all be done either through our Client v4 APIs or the Cloudflare Dashboard.

Once that is done, the customer is all set!

This is just the beginning

With our announcement today, partners can protect and accelerate their customers’ internet services with Cloudflare’s partner platform. We have battle-tested the underlying systems over the last year and are excited to partner with others to help make a better internet. We are not done yet, though. We will continue investing in the tenant and subscription services to expand their capabilities and simplify usage.

Some of the latest partners using the new partner platform

If you are interested in partnering with Cloudflare, then reach out to [email protected]. If building the future of how Cloudflare’s partners and customers use our service sounds interesting, then take a look at our career page.



Join Cloudflare & Yandex at our Moscow meetup! Присоединяйтесь к митапу в Москве!

Post Syndicated from Andrew Fitch original https://blog.cloudflare.com/moscow-developers-join-cloudflare-yandex-at-our-meetup/

Photo by Serge Kutuzov / Unsplash


Are you based in Moscow? Cloudflare is partnering with Yandex to host a meetup this month at Yandex’s Moscow headquarters. We would love for you to join us to learn about the latest in the Internet industry. You’ll join Cloudflare’s users, stakeholders from the tech community, and engineers and product managers from both Cloudflare and Yandex.

Cloudflare Moscow Meetup

Tuesday, May 30, 2019: 18:00 – 22:00

Location: Yandex – Ulitsa L’va Tolstogo, 16, Moskva, Russia, 119021

Talks will include “Performance and scalability at Cloudflare”, “Security at Yandex Cloud”, and “Edge computing”.

Speakers will include Evgeny Sidorov, Information Security Engineer at Yandex, Ivan Babrou, Performance Engineer at Cloudflare, Alex Cruz Farmer, Product Manager for Firewall at Cloudflare, and Olga Skobeleva, Solutions Engineer at Cloudflare.

Agenda:

18:00 – 19:00 – Registration and welcome cocktail

19:00 – 19:10 – Cloudflare overview

19:10 – 19:40 – Performance and scalability at Cloudflare

19:40 – 20:10 – Security at Yandex Cloud

20:10 – 20:40 – Cloudflare security solutions and industry security trends

20:40 – 21:10 – Edge computing

Q&A

The talks will be followed by food, drinks, and networking.

View Event Details & Register Here »

We hope to meet you soon.

Developers, join Cloudflare and Yandex at our upcoming meetup in Moscow!

Cloudflare is partnering with Yandex to organize an event this month at Yandex’s headquarters. We invite you to join a meetup devoted to the latest developments in the Internet industry. The event will bring together Cloudflare customers, professionals from the tech community, and engineers from Cloudflare and Yandex.

Tuesday, May 30: 18:00 – 22:00

Location: Yandex, Ulitsa L’va Tolstogo, 16, Moscow, Russia, 119021

Talks will cover topics such as “Cloudflare security solutions and industry security trends”, “Security at Yandex Cloud”, “Performance and scalability at Cloudflare”, and “Edge computing”, given by speakers from Cloudflare and Yandex.

Speakers will include Evgeny Sidorov, Deputy Head of the Services Security Group at Yandex, Ivan Babrou, Performance Engineer at Cloudflare, Alex Cruz Farmer, Product Manager for Firewall at Cloudflare, and Olga Skobeleva, Solutions Engineer at Cloudflare.

Agenda:

18:00 – 19:00 – Registration, drinks, and networking

19:00 – 19:10 – Cloudflare overview

19:10 – 19:40 – Performance and scalability at Cloudflare

19:40 – 20:10 – Security solutions at Yandex

20:10 – 20:40 – Cloudflare security solutions and industry security trends

20:40 – 21:10 – Examples of serverless security solutions

Q&A

The presentations will be followed by networking, food, and drinks.

View event details and register here »

We look forward to meeting you!

How We Optimized Storage and Performance of Apache Cassandra at Backblaze

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/wide-partitions-in-apache-cassandra-3-11/

Guest post by Mick Semb Wever

Backblaze uses Apache Cassandra, a high-performance, scalable distributed database, to help manage hundreds of petabytes of data. We engaged the folks at The Last Pickle to use their extensive experience to optimize the capabilities and performance of our Cassandra 3.11 cluster, and now they want to share their experience with a wider audience to explain what they found. We agree; enjoy!

— Andy

Wide Partitions in Apache Cassandra 3.11

by Mick Semb Wever, Consultant, The Last Pickle

Wide partitions in Cassandra can put tremendous pressure on the Java heap and garbage collector, impact read latencies, and can cause issues ranging from load shedding and dropped messages to crashed and downed nodes.

While the theoretical limit on the number of cells per partition has always been two billion cells, the reality has been quite different, as the impacts of heap pressure show. To mitigate these problems, the community has offered a standard recommendation for Cassandra users to keep partitions under 400MB, and preferably under 100MB.

However, in version 3 many improvements were made that affected how Cassandra handles wide partitions. Memtables, caches, and SSTable components were moved off-heap, the storage engine was rewritten in CASSANDRA-8099, and Robert Stupp made a number of other improvements listed under CASSANDRA-11206.

While working with Backblaze and operating a Cassandra version 3.11 cluster, we had the opportunity to test and validate how Cassandra actually handles partitions with this latest version. We will demonstrate that well designed data models can go beyond the existing 400MB recommendation without nodes crashing through heap pressure.

Below, we walk through how Cassandra writes partitions to disk in 3.11, look at how wide partitions impact read latencies, and then present our testing and verification of wide partition impacts on the cluster using the work we did with Backblaze.

The Art and Science of Writing Wide Partitions to Disk

First we need to understand what a partition is and how Cassandra writes partitions to disk in version 3.11.

Each SSTable contains a set of files, and the -Data.db file contains numerous partitions.

The layout of a partition in the -Data.db file has three components: a header, followed by zero or one static rows, followed by zero or more ordered Clusterable objects. A Clusterable object in this file may be either a row or a RangeTombstone that deletes data, and a wide partition contains many Clusterable objects. For an excellent in-depth examination of this, see Aaron’s blog post Cassandra 3.x Storage Engine.

The -Index.db file stores offsets for the partitions, as well as the serialized IndexInfo objects for each partition. These indices facilitate locating the data on disk within the -Data.db file. Stored partition offsets are represented by RowIndexEntry or one of its subclasses. The class is chosen by the ColumnIndex and depends on the size of the partition:

  • RowIndexEntry is used when there are no Clusterable objects in the partition, such as when there is only a static row. In this case there are no IndexInfo objects to store and so the parent RowIndexEntry class is used rather than a subclass.
  • The IndexedEntry subclass holds the IndexInfo objects in memory until the partition has finished writing to disk. It is used in partitions where the total serialized size of the IndexInfo objects is less than the column_index_cache_size_in_kb configuration setting (which defaults to 2KB).
  • The ShallowIndexedEntry subclass serializes IndexInfo objects to disk as they are created and references these objects using only their position in the file. This is used in partitions where the total serialized size of the IndexInfo objects is more than the column_index_cache_size_in_kb configuration setting.
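The selection rules in the list above can be summarized in a few lines of Python. This is an illustrative model only, not Cassandra's implementation: the function name is hypothetical, the 40-byte IndexInfo size in the examples is made up, and how the exact boundary case (total size equal to the setting) is handled is an assumption, since the description only says "less than" and "more than".

```python
def choose_index_entry(index_info_sizes, column_index_cache_size_in_kb=2):
    """Return the name of the entry class the rules above would select.

    index_info_sizes: serialized size, in bytes, of each IndexInfo object
    that would be written for the partition.
    """
    if not index_info_sizes:
        # No Clusterable objects to index, e.g. only a static row.
        return "RowIndexEntry"
    total_bytes = sum(index_info_sizes)
    if total_bytes < column_index_cache_size_in_kb * 1024:
        # Small enough to keep the IndexInfo objects in memory.
        return "IndexedEntry"
    # Too large: IndexInfo objects are referenced by file position instead.
    # (Treating the equal case as "too large" is an assumption.)
    return "ShallowIndexedEntry"


print(choose_index_entry([]))            # static row only
print(choose_index_entry([40] * 10))     # 400 bytes of IndexInfo
print(choose_index_entry([40] * 1000))   # 40,000 bytes of IndexInfo
```

With the default 2KB setting, only fairly small partitions stay on the in-memory path, which is why wide partitions end up on the position-only variant.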

These IndexInfo objects provide a sampling of positional offsets for rows within a partition, creating an index. Each object specifies the offset the page starts at, the first row and the last row.

So, in general, the bigger the partition, the more IndexInfo objects need to be created when writing to disk — and if they are held in memory until the partition is fully written to disk they can cause memory pressure. This is why the column_index_cache_size_in_kb setting was added in Cassandra 3.6 and the objects are now serialized as they are created.

The relationship between partition size and the number of objects was quantified by Robert Stupp in his presentation, Myths of Big Partitions.

IndexInfo numbers from Robert Stupp

How Wide Partitions Impact Read Latencies

Cassandra’s key cache is an optimization that is enabled by default and helps to improve the speed and efficiency of the read path by reducing the amount of disk activity per read.

Each key cache entry is identified by a combination of the keyspace, table name, SSTable, and the partition key. The value of the key cache is a RowIndexEntry or one of its subclasses — either IndexedEntry or the new ShallowIndexedEntry. The size of the key cache is limited by the key_cache_size_in_mb configuration setting.

When a read operation in the storage engine gets a cache hit it avoids having to access the -Summary.db and -Index.db SSTable components, which reduces that read request’s latency. Wide partitions, however, can decrease the efficiency of this key cache optimization because fewer hot partitions will fit into the allocated cache size.

Indeed, before the ShallowIndexedEntry was added in Cassandra version 3.6, a single wide row could fill the key cache, reducing the hit rate efficiency. When applied to multiple rows, this will cause greater churn of additions and evictions of cache entries.

For example, if the IndexedEntry for a 512MB partition contains 100K+ IndexInfo objects and if these IndexInfo objects total 1.4MB, then the key cache would only be able to hold 140 entries.

The introduction of ShallowIndexedEntry objects changed how the key cache can hold data. The ShallowIndexedEntry contains a list of file pointers referencing the serialized IndexInfo objects and can binary search through this list, rather than having to deserialize the entire IndexInfo objects list. Thus when the ShallowIndexedEntry is used no IndexInfo objects exist within the key cache. This increases the storage efficiency of the key cache in storing more entries, but does still require that the IndexInfo objects are binary searched and deserialized from the -Index.db file on a cache hit.

In short, on wide partitions a key cache miss still results in two additional disk reads, as it did before Cassandra 3.6, but now a key cache hit incurs a disk read to the -Index.db file where it did not before Cassandra 3.6.
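The idea of binary searching a list of file pointers, deserializing only the entries that are probed, can be sketched in Python. This is a simplified model and not Cassandra's on-disk format: the fixed three-integer record layout, the integer row keys, and the function names are all assumptions made for illustration.

```python
import struct

# Hypothetical record: first row key, last row key, offset into the data file.
RECORD = struct.Struct(">QQQ")


def serialize_index(blocks):
    """Pack (first_row, last_row, data_offset) tuples into one buffer.

    Returns the buffer and the list of positions where each record starts,
    which plays the role of the file-pointer list described above.
    """
    buf = bytearray()
    offsets = []
    for block in blocks:
        offsets.append(len(buf))
        buf += RECORD.pack(*block)
    return bytes(buf), offsets


def find_block(buf, offsets, row_key):
    """Binary search the pointer list, unpacking only the probed records."""
    lo, hi = 0, len(offsets) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        first, last, data_offset = RECORD.unpack_from(buf, offsets[mid])
        probes += 1
        if row_key < first:
            hi = mid - 1
        elif row_key > last:
            lo = mid + 1
        else:
            return data_offset, probes
    return None, probes


# 1024 index blocks of 100 rows each, with data blocks 64KB apart.
blocks = [(i * 100, i * 100 + 99, i * 65536) for i in range(1024)]
buf, offsets = serialize_index(blocks)
data_offset, probes = find_block(buf, offsets, 51234)
print(data_offset, probes)
```

Only a logarithmic number of records is unpacked per lookup, and nothing stays resident afterwards, which matches the trade-off described above: less memory held, at the cost of reading the index on every hit.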

Object Creation and Heap Behavior with Wide Partitions in 2.2.13 vs 3.11.3

Introducing the ShallowIndexedEntry into Cassandra version 3.6 creates a measurable improvement in the performance of wide partitions. To test the effects of this and the other performance enhancement features introduced in version 3 we compared how Cassandra 2.2.13 and 3.11.3 performed when one hundred thousand, one million, or ten million rows were each written to a single partition.

The results and accompanying screenshots help illustrate the impact of object creation and heap behavior when inserting rows into wide partitions. While version 2.2.13 crashed repeatedly during this test, 3.11.3 was able to write over 30 million rows to a single partition before Cassandra crashed with an Out-of-Memory error. The test and results are reproduced below.

Both Cassandra versions were started as single-node clusters with default configurations, except for heap customization in cassandra-env.sh:

MAX_HEAP_SIZE="1G"
HEAP_NEWSIZE="600M"

In Cassandra, only the configured concurrency of memtable flushes and compactors determines how many partitions a node processes, and thus how much pressure is put on its heap, at any one time. Based on this known concurrency limitation, profiling can be done by inserting data into one partition against one Cassandra node with a small heap. These results extrapolate to production environments.

The tlp-stress tool inserted data in three separate profiling passes against both versions of Cassandra, creating wide partitions of one hundred thousand (100K), one million (1M), or ten million (10M) rows.

A tlp-stress profile for wide partitions was written, as no suitable profile existed. The read to write ratio used the default setting of 1:100.

The following command lines then ran the tlp-stress tool:

# To write 100000 rows into one partition
tlp-stress run Wide --replication "{'class':'SimpleStrategy','replication_factor': 1}" -n 100K

# To write 1M rows into one partition
tlp-stress run Wide --replication "{'class':'SimpleStrategy','replication_factor': 1}" -n 1M

# To write 10M rows into one partition
tlp-stress run Wide --replication "{'class':'SimpleStrategy','replication_factor': 1}" -n 10M

Each time tlp-stress executed it was immediately followed by a command to ensure the full count of specified rows passed through the memtable flush and were written to disk:

nodetool flush
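For repeated runs, a profiling pass (the stress run followed by the flush) could be wrapped in a small script. The Python sketch below is one way to do that and is not part of the original benchmark setup: the function names are hypothetical, and the arguments are the ones from the commands above, with the replication flag written as a double hyphen. Each pass is meant to be run on its own against a freshly started node.

```python
import subprocess

REPLICATION = "{'class':'SimpleStrategy','replication_factor': 1}"


def pass_commands(rows):
    """Return the two commands of one profiling pass as argv lists."""
    return [
        ["tlp-stress", "run", "Wide", "--replication", REPLICATION, "-n", rows],
        ["nodetool", "flush"],
    ]


def run_pass(rows):
    """Run one pass; raises CalledProcessError if either command fails."""
    for argv in pass_commands(rows):
        subprocess.run(argv, check=True)


# Inspect the commands without running them.
for argv in pass_commands("1M"):
    print(" ".join(argv))
```

Passing argv lists rather than a shell string avoids any quoting problems with the replication map.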

The graphs in the sections below, taken from the Apache NetBeans Profiler, illustrate how the ShallowIndexedEntry in Cassandra version 3.11 avoids keeping IndexInfo objects in memory.

Notably, the IndexInfo objects are instantiated far more often, but are referenced for much shorter periods of time. The Garbage Collector is more effective at removing short-lived objects, as illustrated by the GC pause times being barely present in the Cassandra 3.11 graphs compared to Cassandra 2.2 where GC pause times overwhelm the JVM.

Wide Partitions in Cassandra 2.2

Benchmarks were run against Cassandra 2.2.13.

One Partition with 100K Rows (2.2.13)

The following three screenshots show the number of IndexInfo objects instantiated during the write benchmark, during compaction, and a heap profile.

The partition grew to be ~40MB.

Objects created during tlp-stress

screenshot of Cassandra 2.2 objects created during tlp-stress

Objects created during subsequent major compaction

screenshot of Cassandra 2.2 objects created during subsequent major compaction

Heap profiled during tlp-stress and major compaction

screenshot of Cassandra 2.2 Heap profiled during tlp-stress and major compaction

The above diagrams do not have their x-axis expanded to the full width, but still encompass the startup, stress test, flush, and compaction periods of the benchmark.

When stress testing starts with tlp-stress, the CPU Time and Surviving Generations start to climb. During this time the heap also starts to increase and decrease more frequently as it fills up and then the Garbage Collector cleans it out. In these diagrams the garbage collection intervals are easy to identify and isolate from one another.

One Partition with 1M Rows (2.2.13)

Here, the first two screenshots show the number of IndexInfo objects instantiated during the write benchmark and during the subsequent compaction process. The third screenshot shows the CPU & GC Pause Times and the heap profile from the time writes started through when the compaction was completed.

The partition grew to be ~400MB.

Already at this size the Cassandra JVM is GC thrashing and has occasionally crashed with Out-of-Memory errors.

Objects created during tlp-stress

screenshot of Cassandra 2.2.13 Objects created during tlp-stress

Objects created during subsequent major compaction

screenshot of Cassandra 2.2.13 Objects created during subsequent major compaction

Heap profiled during tlp-stress and major compaction

screenshot of Cassandra 2.2.13 Heap profiled during tlp-stress and major compaction

The above diagrams display a longer running benchmark, with the quiet period during the startup barely noticeable on the very left-hand side of each diagram. The number of garbage collection intervals and the oscillations in heap size are far more frequent. The GC Pause Time during the stress testing period is now consistently higher and comparable to the CPU Time. It only dissipates when the benchmark performs the flush and compaction.

One Partition with 10M Rows (2.2.13)

In this final test of Cassandra version 2.2.13, the results were difficult to reproduce reliably, as more often than not this test crashed with Out-of-Memory errors caused by GC heap pressure.

The first two screenshots show the number of IndexInfo objects instantiated during the write benchmark and during the subsequent compaction process. The third screenshot shows the GC Pause Time and the heap profile from the time writes started until compaction was completed.

The partition grew to be ~4GB.

Objects created during tlp-stress

Objects created during subsequent major compaction

Heap profiled during tlp-stress and major compaction

screenshot of Cassandra Heap profiled during tlp-stress and major compaction

The above diagrams display consistently very high GC Pause Time compared to CPU Time. Any Cassandra node under this much duress from garbage collection is not healthy. It is suffering from high read latencies, could become blacklisted by other nodes due to its lack of responsiveness, and even crash altogether from Out-of-Memory errors (as it did often during this benchmark).

Wide Partitions in Cassandra 3.11.3

Benchmarks were run against Cassandra 3.11.3.

In this series, the graphs demonstrate how IndexInfo objects are created either from memtable flushes or from deserialization off disk. The ShallowIndexedEntry is used in Cassandra 3.11.3 when deserializing the IndexInfo objects from the -Index.db file.

Neither form of IndexInfo object resides long in the heap, and thus the GC Pause Time is barely visible in comparison to Cassandra 2.2.13, despite the additional numbers of IndexInfo objects created via deserialization.

One Partition with 100K Rows (3.11.3)

As with the test of this size on the earlier version, the first two screenshots show the number of IndexInfo objects instantiated during the write benchmark and during the subsequent compaction process. The third screenshot shows the CPU & GC Pause Time and the heap profile from the time writes started through when the compaction was completed.

The partition grew to be ~40MB, the same as with Cassandra 2.2.13.

Objects created during tlp-stress

screenshot of Cassandra 3.11.3 objects created during tlp-stress

Objects created during subsequent major compaction

screenshot of Cassandra 3.11.3 objects created during subsequent major compaction

Heap profiled during tlp-stress and major compaction

screenshot of Cassandra 3.11.3 Heap profiled during tlp-stress and major compaction

The diagrams above are roughly comparable to the first diagrams presented under Cassandra 2.2.13, except here the x-axis is expanded to full width. Note there are significantly more instantiated IndexInfo objects, but barely any noticeable GC Pause Time.

One Partition with 1M Rows (3.11.3)

Again, the first two screenshots show the number of IndexInfo objects instantiated during the write benchmark and during the subsequent compaction process. The third screenshot shows the CPU & GC Pause Time and the heap profile from the time writes started until the compaction was completed.

The partition grew to be ~400MB, the same as with Cassandra 2.2.13.

Objects created during tlp-stress

Objects created during subsequent major compaction

Heap profiled during tlp-stress and major compaction

The above diagrams show a wildly oscillating heap as many IndexInfo objects are created, along with many garbage collection intervals, yet the GC Pause Time remains low, if at all noticeable.

One Partition with 10M Rows (3.11.3)

Here again, the first two screenshots show the number of IndexInfo objects instantiated during the write benchmark and during the subsequent compaction process. The third screenshot shows the CPU & GC Pause Time and the heap profile from the time writes started until the compaction was completed.

The partition grew to be ~4GB, the same as with Cassandra 2.2.13.

Objects created during tlp-stress

Objects created during subsequent major compaction

Heap profiled during tlp-stress and major compaction

Unlike this profile in 2.2.13, the cluster remains stable as it was when running 1M rows per partition. The above diagrams display an oscillating heap when IndexInfo objects are created, and many garbage collection intervals, yet GC Pause Time remains low, if at all noticeable.

Maximum Rows in 1GB Heap (3.11.3)

In an attempt to push Cassandra 3.11.3 to the limit, we ran a test to see how much data could be written to a single partition before Cassandra crashed with an Out-of-Memory error.

The result was 30M+ rows, which is ~12GB of data on disk.

This is similar to the limit of 17GB of data written to a single partition that Robert Stupp found in CASSANDRA-9754 when using a 5GB Java heap.
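The partition sizes reported across these benchmarks imply a roughly constant on-disk cost per row, which makes it easy to sanity-check the 30M row result. The short Python calculation below uses only the figures quoted in this post. The sizes are approximate (they are given with a "~"), decimal units are assumed, and the resulting figure reflects the schema of the tlp-stress profile used here rather than any general Cassandra constant.

```python
# (rows written, approximate partition size in bytes) as reported in this post
REPORTED = [
    (100_000, 40 * 10**6),       # ~40MB
    (1_000_000, 400 * 10**6),    # ~400MB
    (10_000_000, 4 * 10**9),     # ~4GB
    (30_000_000, 12 * 10**9),    # ~12GB
]


def bytes_per_row(rows, size_bytes):
    """Average on-disk bytes per row for one benchmark result."""
    return size_bytes / rows


def estimate_partition_size(rows, per_row_bytes=400):
    """Estimate partition size in bytes, assuming a constant cost per row."""
    return rows * per_row_bytes


for rows, size in REPORTED:
    print(f"{rows:>11,} rows: {bytes_per_row(rows, size):.0f} bytes/row")
```

Every reported data point works out to about 400 bytes per row, so partition size in this benchmark scales linearly with row count.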

screenshot of Cassandra 3.11.3 memory usage

What about Reads

The following graph reruns the benchmark on Cassandra version 3.11.3 over a longer period of time with a read to write ratio of 10:1. It illustrates that reads of wide partitions do not create the heap pressure that writes do.

screenshot of Cassandra 3.11.3 read functions

Conclusion

While the 400MB community recommendation for partition size is clearly appropriate for version 2.2.13, version 3.11.3 shows that performance improvements have created a tremendous ability to handle wide partitions, which can easily be an order of magnitude larger than in earlier versions of Cassandra without nodes crashing through heap pressure.

The trade-off for better supporting wide partitions in Cassandra 3.11.3 is increased read latency, as row offsets now need to be read off disk. However, modern SSDs and kernel pagecaches take advantage of larger configurations of physical memory, providing enough IO improvements to compensate for the read latency trade-offs.

The improved stability, together with falling back on better hardware to deal with the read latency issue, allows Cassandra operators to worry less about how to store massive amounts of data in different schemas and about unexpected data growth patterns on those schemas.

Custom B+ tree structures from CASSANDRA-9754 will be used to look up the deserialized row offsets more effectively and to further avoid the deserialization and instantiation of short-lived, unused IndexInfo objects.


Mick Semb Wever designs, builds, and is an evangelist for distributed systems, from data-driven backends using Cassandra, Hadoop, and Spark to enterprise microservices platforms.

The post How We Optimized Storage and Performance of Apache Cassandra at Backblaze appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Legal Blackmail: Zero Cases Brought Against Alleged Pirates in Sweden

Post Syndicated from Andy original https://torrentfreak.com/legal-blackmail-zero-cases-brought-against-alleged-pirates-in-sweden-180525/

While several countries in Europe have wilted under sustained pressure from copyright trolls for more than ten years, Sweden managed to avoid their controversial attacks until fairly recently.

With Germany a decade-old pit of misery, with many hundreds of thousands of letters – by now probably millions – sent out to Internet users demanding cash, Sweden avoided the ranks of its European partners until two years ago.

In September 2016 it was revealed that an organization calling itself Spridningskollen (Distribution Check), headed up by law firm Gothia Law, would begin targeting the public.

Its spokesperson described its letters as “speeding tickets” for pirates, in that they would only target the guilty. But there was a huge backlash and just a couple of months later Spridningskollen headed for the hills, without a single collection letter being sent out.

That was the calm before the storm.

In February 2017, Danish law firm Njord Law was found to be at the center of a new troll operation targeting the subscribers of several ISPs, including Telia, Tele2 and Bredbandsbolaget. Court documents revealed that thousands of IP addresses had been harvested by the law firm’s partners who were determined to link them with real-life people.

Indeed, in a single batch, Njord Law was granted permission from the court to obtain the identities of citizens behind 25,000 IP addresses, from whom it hoped to obtain cash settlements of around US$550. But it didn’t stop there.

Time and again the trolls headed back to court in an effort to reach more people although until now the true scale of their operations has been open to question. However, a new investigation carried out by SVT has revealed that the promised copyright troll invasion of Sweden is well underway with a huge level of momentum.

Data collated by the publication reveals that since 2017, the personal details behind more than 50,000 IP addresses have been handed over by Swedish Internet service providers to law firms representing copyright trolls and their partners. By the end of this year, Njord Law alone will have sent out 35,000 letters to Swedes whose IP addresses have been flagged as allegedly infringing copyright.

Even if one is extremely conservative with the figures, the levels of cash involved are significant. Taking a settlement amount of just $300 per letter, very quickly the copyright trolls are looking at $15,000,000 in revenues. At the upper end, assuming $550 will make a supposed lawsuit go away, we’re looking at a potential $27,500,000 in takings.
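The two totals above follow from multiplying the 50,000 handed-over IP addresses by each settlement amount. As a rough sketch of that arithmetic, assuming one letter per IP address and that every letter results in a settlement (the function name and the `settle_percent` parameter are hypothetical, added only for illustration):

```python
# Rough sketch of the revenue arithmetic in the text.
# Assumption: one letter per IP address, and every letter is settled
# unless a lower settle_percent is supplied (a hypothetical knob).

LETTERS = 50_000  # IP addresses handed over, per the SVT figures


def projected_takings(letters: int, settlement_usd: int, settle_percent: int = 100) -> int:
    """Return total takings in whole US dollars using integer arithmetic."""
    return letters * settlement_usd * settle_percent // 100


if __name__ == "__main__":
    for amount in (300, 550):
        total = projected_takings(LETTERS, amount)
        print(f"${amount} per letter: ${total:,}")
```

Lowering `settle_percent` shows how quickly the totals shrink if only some recipients pay.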

But of course, this dragnet approach doesn’t have the desired effect on all recipients.

In 2017, Njord Law said that only 60% of its letters received any kind of response, meaning that even fewer would be settling with the company. So what happens when the public ignores the threatening letters?

“Yes, we will [go to court],” said lawyer Jeppe Brogaard Clausen last year.

“We wish to resolve matters as much as possible through education and dialogue without the assistance of the court though. It is very expensive both for the rights holders and for plaintiffs if we go to court.”

But despite the tough-talking, SVT’s investigation has turned up an interesting fact. The nuclear option, of taking people to court and winning a case when they refuse to pay, has never happened.

After trawling records held by the Patent and Market Court and all those held by the District Courts dating back five years, SVT did not find a single case of a troll taking a citizen to court and winning a case. Furthermore, no law firm contacted by the publication could show that such a thing had happened.

“In Sweden, we have not yet taken someone to court, but we are planning to file for the right in 2018,” Emelie Svensson, lawyer at Njord Law, told SVT.

While a case may yet reach the courts, when it does it is guaranteed to be a cut-and-dried one. Letter recipients can often say things to damage their case, even when they’re only getting a letter due to their name being on the Internet bill. These are the people who find themselves under the most pressure to pay, whether they’re guilty or not.

“There is a risk of what is known in English as ‘legal blackmailing’,” says Mårten Schultz, professor of civil law at Stockholm University.

“With [the copyright holders’] legal and economic muscles, small citizens are scared into paying claims that they do not legally have to pay.”

It’s a position shared by Marianne Levine, Professor of Intellectual Property Law at Stockholm University.

“One can only show that an IP address appears in some context, but there is no point in the evidence. Namely, that it is the subscriber who also downloaded illegitimate material,” she told SVT.

Njord Law, on the other hand, sees things differently.

“In Sweden, we have no legal case saying that you are not responsible for your IP address,” Emelie Svensson says.

Whether Njord Law will carry through with its threats remains to be seen, but there can be little doubt that while significant numbers of people keep paying up, this practice will continue and escalate. The trolls have come too far to give up now.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Fully-Loaded Kodi Box Sellers Receive Hefty Jail Sentences

Post Syndicated from Andy original https://torrentfreak.com/fully-loaded-kodi-box-sellers-receive-hefty-jail-sentences-180524/

While users of older peer-to-peer based file-sharing systems have to work relatively hard to obtain content, users of the Kodi media player have things an awful lot easier.

As standard, Kodi is perfectly legal. However, when augmented with third-party add-ons it becomes a media discovery powerhouse, providing most of the content anyone could desire. A system like this can be set up by the user but for many, buying a so-called “fully-loaded” box from a seller is the easier option.

As a result, hundreds – probably thousands – of cottage industries have sprung up to service this hungry market in the UK, with regular people making a business out of setting up and selling such devices. Until three years ago, that’s what Michael Jarman and Natalie Forber of Colwyn Bay, Wales, found themselves doing.

According to reports in local media, Jarman was arrested in January 2015 when police were called to a disturbance at Jarman and Forber’s home. A large number of devices were spotted and an investigation was launched by Trading Standards officers. The pair were later arrested and charged with fraud offenses.

While 37-year-old Jarman pleaded guilty, 36-year-old Forber initially denied the charges and was due to stand trial. However, she later changed her mind and like Jarman, pleaded guilty to participating in a fraudulent business. Forber also pleaded guilty to transferring criminal property by shifting cash from the scheme through various bank accounts.

The pair attended a sentencing hearing before Judge Niclas Parry at Caernarfon Crown Court yesterday. According to local reporter Eryl Crump, the Court heard that the couple had run their business for about two years, selling around 1,000 fully-loaded Kodi-enabled devices for £100 each via social media.

According to David Birrell for the prosecution, the operation wasn’t particularly sophisticated but it involved Forber programming the devices as well as handling customer service. Forber claimed she was forced into the scheme by Jarman but that claim was rejected by the prosecution.

Between February 2013 and January 2015 the pair banked £105,000 from the business, money that was transferred between bank accounts in an effort to launder the takings.

Reporting from Court via Twitter, Crump said that Jarman’s defense lawyer accepted that a prison sentence was inevitable for his client but asked for the most lenient sentence possible.

Forber’s lawyer pointed out she had no previous convictions. The mother-of-two broke up with Jarman following her arrest and is now back in work and studying at college.

Sentencing the pair, Judge Niclas Parry described the offenses as a “relatively sophisticated fraud” carried out over a significant period. He jailed Jarman for 21 months and handed Forber a 16-month sentence, suspended for two years. She must also carry out 200 hours of unpaid work.

The pair will also face a Proceeds of Crime investigation which could see them paying large sums to the state, should any assets be recoverable.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

AWS GDPR Data Processing Addendum – Now Part of Service Terms

Post Syndicated from Chad Woolf original https://aws.amazon.com/blogs/security/aws-gdpr-data-processing-addendum/

Today, we’re happy to announce that the AWS GDPR Data Processing Addendum (GDPR DPA) is now part of our online Service Terms. This means all AWS customers globally can rely on the terms of the AWS GDPR DPA which will apply automatically from May 25, 2018, whenever they use AWS services to process personal data under the GDPR. The AWS GDPR DPA also includes EU Model Clauses, which were approved by the European Union (EU) data protection authorities, known as the Article 29 Working Party. This means that AWS customers wishing to transfer personal data from the European Economic Area (EEA) to other countries can do so with the knowledge that their personal data on AWS will be given the same high level of protection it receives in the EEA.

As we approach the GDPR enforcement date this week, this announcement is an important GDPR compliance component for us, our customers, and our partners. All customers that are using cloud services to process personal data will need to have a data processing agreement in place between them and their cloud services provider if they are to comply with GDPR. As early as April 2017, AWS announced that it had a GDPR-ready DPA available for its customers. In this way, we started offering our GDPR DPA to customers over a year before the May 25, 2018 enforcement date. Now, with the DPA terms included in our online Service Terms, there is no extra engagement needed by our customers and partners to be compliant with the GDPR requirement for data processing terms.

The AWS GDPR DPA also provides our customers with a number of other important assurances, such as the following:

  • AWS will process customer data only in accordance with customer instructions.
  • AWS has implemented and will maintain robust technical and organizational measures for the AWS network.
  • AWS will notify its customers of a security incident without undue delay after becoming aware of the security incident.
  • AWS will make available certificates issued in relation to the ISO 27001 certification, the ISO 27017 certification, and the ISO 27018 certification to further help customers and partners in their own GDPR compliance activities.

Customers who have already signed an offline version of the AWS GDPR DPA can continue to rely on that GDPR DPA. By incorporating our GDPR DPA into the AWS Service Terms, we are simply extending the terms of our GDPR DPA to all customers globally who will require it under GDPR.

The AWS GDPR DPA is only part of the story, however. We are continuing to work alongside our customers and partners to help them on their journey towards GDPR compliance.

If you have any questions about the GDPR or the AWS GDPR DPA, please contact your account representative, or visit the AWS GDPR Center at: https://aws.amazon.com/compliance/gdpr-center/

-Chad

Interested in AWS Security news? Follow the AWS Security Blog on Twitter.

Working with the Scout Association on digital skills for life

Post Syndicated from Philip Colligan original https://www.raspberrypi.org/blog/working-with-scout-association-digital-skills-for-life/

Today we’re launching a new partnership between the Scouts and the Raspberry Pi Foundation that will help tens of thousands of young people learn crucial digital skills for life. In this blog post, I want to explain what we’ve got planned, why it matters, and how you can get involved.

This is personal

First, let me tell you why this partnership matters to me. As a child growing up in North Wales in the 1980s, Scouting changed my life. My time with 2nd Rhyl provided me with countless opportunities to grow and develop new skills. It taught me about teamwork and community in ways that continue to shape my decisions today.

As my own kids (now seven and ten) have joined Scouting, I’ve seen the same opportunities opening up for them, and like so many parents, I’ve come back to the movement as a volunteer to support their local section. So this is deeply personal for me, and the same is true for many of my colleagues at the Raspberry Pi Foundation who in different ways have been part of the Scouting movement.

That shouldn’t come as a surprise. Scouting and Raspberry Pi share many of the same values. We are both community-led movements that aim to help young people develop the skills they need for life. We are both powered by an amazing army of volunteers who give their time to support that mission. We both care about inclusiveness, and pride ourselves on combining fun with learning by doing.

Raspberry Pi

Raspberry Pi started life in 2008 as a response to the problem that too many young people were growing up without the skills to create with technology. Our goal is that everyone should be able to harness the power of computing and digital technologies, for work, to solve problems that matter to them, and to express themselves creatively.

In 2012 we launched our first product, the world’s first $35 computer. Just six years on, we have sold over 20 million Raspberry Pi computers and helped kickstart a global movement for digital skills.

The Raspberry Pi Foundation now runs the world’s largest network of volunteer-led computing clubs (Code Clubs and CoderDojos), and creates free educational resources that are used by millions of young people all over the world to learn how to create with digital technologies. And lots of what we are able to achieve is because of partnerships with fantastic organisations that share our goals. For example, through our partnership with the European Space Agency, thousands of young people have written code that has run on two Raspberry Pi computers that Tim Peake took to the International Space Station as part of his Mission Principia.

Digital makers

Today we’re launching the new Digital Maker Staged Activity Badge to help tens of thousands of young people learn how to create with technology through Scouting. Over the past few months, we’ve been working with the Scouts all over the UK to develop and test the new badge requirements, along with guidance, project ideas, and resources that really make them work for Scouting. We know that we need to get two things right: relevance and accessibility.

Relevance is all about making sure that the activities and resources we provide are a really good fit for Scouting and Scouting’s mission to equip young people with skills for life. From the digital compass to nature cameras and the reinvented wide game, we’ve had a lot of fun thinking about ways we can bring to life the crucial role that digital technologies can play in the outdoors and adventure.

Compass Coding with Raspberry Pi

We are beyond excited to be launching a new partnership with the Raspberry Pi Foundation, which will help tens of thousands of young people learn digital skills for life.

We also know that there are great opportunities for Scouts to use digital technologies to solve social problems in their communities, reflecting the movement’s commitment to social action. Today we’re launching the first set of project ideas and resources, with many more to follow over the coming weeks and months.

Accessibility is about providing every Scout leader with the confidence, support, and kit to enable them to offer the Digital Maker Staged Activity Badge to their young people. A lot of work and care has gone into designing activities that require very little equipment: for example, activities at Stages 1 and 2 can be completed with a laptop without access to the internet. For the activities that do require kit, we will be working with Scout Stores and districts to make low-cost kit available to buy or loan.

We’re producing accessible instructions, worksheets, and videos to help leaders run sessions with confidence, and we’ll also be planning training for leaders. We will work with our network of Code Clubs and CoderDojos to connect them with local sections to organise joint activities, bringing both kit and expertise along with them.




Get involved

Today’s launch is just the start. We’ll be developing our partnership over the next few years, and we can’t wait for you to join us in getting more young people making things with technology.

Take a look at the brand-new Raspberry Pi resources designed especially for Scouts, to get young people making and creating right away.

The post Working with the Scout Association on digital skills for life appeared first on Raspberry Pi.

Connect, collaborate, and learn at AWS Global Summits in 2018

Post Syndicated from Tina Kelleher original https://aws.amazon.com/blogs/big-data/connect-collaborate-and-learn-at-aws-global-summits-in-2018/

Regardless of your career path, there’s no denying that attending industry events can provide helpful career development opportunities, not only for improving and expanding your skill sets, but for networking as well. According to this article from PayScale.com, experts estimate that between 70% and 85% of new positions are landed through networking.

If you want to network with cloud computing professionals who’re tackling some of today’s most innovative and exciting big data problems, attending big data-focused sessions at an AWS Global Summit is a great place to start.

AWS Global Summits are free events that bring the cloud computing community together to connect, collaborate, and learn about AWS. As the name suggests, these summits are held in major cities around the world, and attract technologists from all industries and skill levels who’re interested in hearing from AWS leaders, experts, partners, and customers.

In addition to networking opportunities with top cloud technology providers, consultants and your peers in our Partner and Solutions Expo, you’ll also hone your AWS skills by attending and participating in a multitude of education and training opportunities.

Here’s a brief sampling of some of the upcoming sessions relevant to big data professionals:

May 31st: Big Data Architectural Patterns and Best Practices on AWS | AWS Summit – Mexico City

June 6th-7th: Various (click on the “Big Data & Analytics” header) | AWS Summit – Berlin

June 20th-21st: [email protected] | Public Sector Summit – Washington DC

June 21st: Enabling Self Service for Data Scientists with AWS Service Catalog | AWS Summit – Sao Paulo

Be sure to check out the main page for AWS Global Summits, where you can see which cities have AWS Summits planned for 2018, register to attend an upcoming event, or provide your information to be notified when registration opens for a future event.

Connect Veeam to the B2 Cloud: Episode 3 — Using OpenDedup

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/opendedup-for-cloud-storage/

Veeam backup to Backblaze B2 logo

In this, the third post in our series on connecting Veeam with Backblaze B2 Cloud Storage, we discuss how to back up your VMs to B2 using Veeam and OpenDedup. In our previous posts, we covered how to connect Veeam to the B2 cloud using Synology, and how to connect Veeam with B2 using StarWind VTL.

Deduplication and OpenDedup

Deduplication is simply the process of eliminating redundant data on disk. Deduplication reduces storage space requirements, improves backup speed, and lowers backup storage costs. The dedup field used to be dominated by a few big-name vendors who sold dedup systems that were too expensive for most of the SMB market. Then an open-source challenger came along in OpenDedup, a project that produced the Space Deduplication File System (SDFS). SDFS provides many of the features of commercial dedup products without their cost.

OpenDedup provides inline deduplication that can be used with applications such as Veeam, Veritas Backup Exec, and Veritas NetBackup.
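To make the idea of eliminating redundant data concrete, here is a minimal sketch of block-level deduplication. It is not OpenDedup's or SDFS's implementation: it uses fixed-size blocks and SHA-256 fingerprints for simplicity, whereas OpenDedup advertises variable block deduplication. The block size and the function names (`dedup_store`, `restore`) are assumptions made for illustration.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4096  # illustrative block size, not an OpenDedup setting


def dedup_store(data: bytes, store: Dict[str, bytes],
                block_size: int = BLOCK_SIZE) -> List[str]:
    """Split data into fixed-size blocks and keep one copy of each unique block.

    Returns the "recipe": the ordered list of block fingerprints needed
    to rebuild the original data from the store.
    """
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # only stored if not seen before
        recipe.append(digest)
    return recipe


def restore(recipe: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the original data by concatenating blocks in recipe order."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    store: Dict[str, bytes] = {}
    # Three identical blocks followed by one different block.
    backup = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
    recipe = dedup_store(backup, store)
    print(len(recipe), "blocks referenced,", len(store), "blocks stored")
    assert restore(recipe, store) == backup
```

In this sketch, four blocks are referenced but only two are stored, which is where the savings in space and upload bandwidth come from when backups contain a lot of repeated data.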

Features Supported by OpenDedup:

  • Variable Block Deduplication to cloud storage
  • Local Data Caching
  • Encryption
  • Bandwidth Throttling
  • Fast Cloud Recovery
  • Windows and Linux Support

Why use Veeam with OpenDedup to Backblaze B2?

With your VMs backed up to B2, you have a number of options to recover from a disaster. If the unexpected occurs, you can quickly restore your VMs from B2 to the location of your choosing. You also have the option to bring up cloud compute through B2’s compute partners, thereby minimizing any loss of service and ensuring business continuity.

Veeam logo + OpenDedup logo + Backblaze B2 logo

Backblaze’s B2 is an ideal solution for backing up Veeam’s backup repository due to B2’s combination of low cost and high availability. Users of B2 save up to 75% compared to other cloud solutions such as Microsoft Azure, Amazon AWS, or Google Cloud Storage. When combined with OpenDedup’s no-cost deduplication, you’ve got an efficient and economical solution for backing up VMs to the cloud.

How to Use OpenDedup with B2

For step-by-step instructions for how to set up OpenDedup for use with B2 on Windows or Linux, see Backblaze B2 Enabled on the OpenDedup website.

Are you backing up Veeam to B2 using one of the solutions we’ve written about in this series? If you have, we’d love to hear from you in the comments.

View all posts in the Veeam series.

The post Connect Veeam to the B2 Cloud: Episode 3 — Using OpenDedup appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Introducing the AWS Machine Learning Competency for Consulting Partners

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/introducing-the-aws-machine-learning-competency-for-consulting-partners/

Today I’m excited to announce a new Machine Learning Competency for Consulting Partners in the AWS Partner Network (APN). This AWS Competency program allows APN Consulting Partners to demonstrate deep expertise in machine learning on AWS by providing solutions that enable machine learning and data science workflows for their customers. This new AWS Competency is in addition to the Machine Learning Competency for our APN Technology Partners, which we launched at the re:Invent 2017 partner summit.

These APN Consulting Partners help organizations solve their machine learning and data challenges through:

  • Data services that help data scientists and machine learning practitioners prepare their enterprise data for training.
  • Platform solutions that provide data scientists and machine learning practitioners with tools to take their data, train models, and make predictions on new data.
  • SaaS and API solutions that enable predictive capabilities within customer applications.

Why work with an AWS Machine Learning Competency Partner?

The AWS Competency Program helps customers find the most qualified partners with deep expertise. AWS Machine Learning Competency Partners undergo a strict validation of their capabilities to demonstrate technical proficiency and proven customer success with AWS machine learning tools.

If you’re an AWS customer interested in machine learning workloads on AWS, check out our AWS Machine Learning launch partners below:

 

Interested in becoming an AWS Machine Learning Competency Partner?

APN Partners with experience in Machine Learning can learn more about becoming an AWS Machine Learning Competency Partner here. To learn more about the benefits of joining the AWS Partner Network, see our APN Partner website.

Thanks to the AWS Partner Team for their help with this post!
Randall