Tag Archives: AI

MLPerf Inference v3.1 Shows NVIDIA Grace Hopper and a Cool AMD TPU v5e Win

Post Syndicated from Cliff Robinson original https://www.servethehome.com/mlperf-inference-v3-1-shows-nvidia-grace-hopper-and-a-cool-amd-tpu-v5e-win/

MLPerf Inference v3.1 is out. Two standouts were NVIDIA setting the stage to jettison x86 and AMD having a big win at Google

The post MLPerf Inference v3.1 Shows NVIDIA Grace Hopper and a Cool AMD TPU v5e Win appeared first on ServeTheHome.

Cloudflare One for Data Protection

Post Syndicated from James Chang original http://blog.cloudflare.com/cloudflare-one-data-protection-announcement/

Data continues to explode in volume, variety, and velocity, and security teams at organizations of all sizes are challenged to keep up. Businesses face escalating risks from varied SaaS environments, the emergence of generative artificial intelligence (AI) tools, and the exposure and theft of valuable source code, all of which keep CISOs and Data Officers up at night.

Over the past few years, Cloudflare has launched capabilities to help organizations navigate these risks and gain visibility and controls over their data — including the launches of our data loss prevention (DLP) and cloud access security broker (CASB) services in the fall of 2022.

Announcing Cloudflare One’s data protection suite

Today, we are building on that momentum and announcing Cloudflare One for Data Protection — our unified suite to protect data everywhere across web, SaaS, and private applications. Built on and delivered across our entire global network, Cloudflare One’s data protection suite is architected for the risks of modern coding and increased usage of AI.

Specifically, this suite converges capabilities across Cloudflare’s DLP, CASB, Zero Trust network access (ZTNA), secure web gateway (SWG), remote browser isolation (RBI), and cloud email security services onto a single platform for simpler management. All these services are available and packaged now as part of Cloudflare One, our SASE platform that converges security and network connectivity services.

A separate blog post published today looks back on what technologies and features we delivered over the past year and previews new functionality that customers can look forward to.

In this blog, we focus more on what impact those technologies and features have for customers in addressing modern data risks — with examples of practical use cases. We believe that Cloudflare One is uniquely positioned to deliver better data protection that addresses modern data risks. And by “better,” we mean:

  • Helping security teams be more effective at protecting data by simplifying inline and API connectivity together with policy management
  • Helping employees be more productive by ensuring fast, reliable, and consistent user experiences
  • Helping organizations be more agile by innovating rapidly to meet evolving data security and privacy requirements

Harder than ever to secure data

Data spans more environments than most organizations can keep track of. In conversations with customers, three distinctly modern risks stick out:

  1. The growing diversity of cloud and SaaS environments: The apps where knowledge workers spend most of their time — like cloud email inboxes, shared cloud storage folders and documents, and SaaS productivity and collaboration suites like Microsoft 365 — are increasingly targeted by threat actors for data exfiltration.
  2. Emerging AI tools: Business leaders are concerned about users oversharing sensitive information with opaque large language model tools like ChatGPT, but at the same time, want to leverage the benefits of AI.
  3. Source code exposure or theft: Developer code fuels digital business, but that same high-value source code can be exposed or targeted for theft across many developer tools like GitHub, including in plain sight locations like public repositories.

These latter two risks, in particular, are already intersecting. Companies like Amazon, Apple, Verizon, Deutsche Bank, and more are blocking employees from using tools like ChatGPT for fear of losing confidential data, and Samsung recently had an engineer accidentally upload sensitive code to the tool. As organizations prioritize new digital services and experiences, developers face mounting pressure to work faster and smarter. AI tools can help unlock that productivity, but the long-term consequences of oversharing sensitive data with these tools are still unknown.

Altogether, data risks are only primed to escalate, particularly as organizations accelerate digital transformation initiatives, with hybrid work and development continuing to expand attack surfaces. At the same time, regulatory compliance will only become more demanding, as more countries and states adopt more stringent data privacy laws.

Traditional DLP services are not equipped to keep up with these modern risks. A combination of high setup and operational complexity plus negative user experiences means that, in practice, DLP controls are often underutilized or bypassed entirely. Whether deployed as a standalone platform or integrated into security products or SaaS applications, DLP products often become expensive shelfware. And backhauling traffic through on-premise data protection hardware — whether DLP, firewall, or SWG appliances, or otherwise — creates costs and slows user experiences, holding businesses back in the long run.

Figure 1: Modern data risks

How customers use Cloudflare for data protection

Today, customers are increasingly turning to Cloudflare to address these data risks, including a Fortune 500 natural gas company, a major US job site, a regional US airline, an Australian healthcare company and more. Across these customer engagements, three use cases are standing out as common focus areas when deploying Cloudflare One for data protection.

Use case #1: Securing AI tools and developer code (Applied Systems)

Applied Systems, an insurance technology & software company, recently deployed Cloudflare One to secure data in AI environments.

Specifically, the company runs the public instance of ChatGPT in an isolated browser, so that the security team can apply copy-paste blocks: preventing users from copying sensitive information (including developer code) from other apps into the AI tool. According to Chief Information Security Officer Tanner Randolph, “We wanted to let employees take advantage of AI while keeping it safe.”

This use case was just one of several Applied Systems tackled when migrating from Zscaler and Cisco to Cloudflare, but we see a growing interest in securing AI and developer code among our customers.

Use case #2: Data exposure visibility

Customers are leveraging Cloudflare One to regain visibility and controls over data exposure risks across their sprawling app environments. For many, the first step is analyzing unsanctioned app usage, and then taking steps to allow, block, isolate, or apply other controls to those resources. A second and increasingly popular step is scanning SaaS apps for misconfigurations and sensitive data via a CASB and DLP service, and then taking prescriptive steps to remediate via SWG policies.

A UK ecommerce giant with 7,500 employees turned to Cloudflare for this latter step. As part of a broader migration strategy from Zscaler to Cloudflare, this company quickly set up API integrations between its SaaS environments and Cloudflare’s CASB and began scanning for misconfigurations. Plus, during this integration process, the company was able to sync DLP policies with Microsoft Purview Information Protection sensitivity labels, so that it could use its existing framework to prioritize what data to protect. All in all, the company was able to begin identifying data exposure risks within a day.

Use case #3: Compliance with regulations

Comprehensive data regulations like GDPR, CCPA, HIPAA, and GLBA have been in our lives for some time now. But new laws are quickly emerging: for example, 11 U.S. states now have comprehensive privacy laws, up from just 3 in 2021. And updates to existing laws like PCI DSS now include stricter, more expansive requirements.

Customers are increasingly turning to Cloudflare One for compliance, in particular by ensuring they can monitor and protect regulated data (e.g. financial data, health data, PII, exact data matches, and more). Some common steps include first, detecting and applying controls to sensitive data via DLP, next, maintaining detailed audit trails via logs and further SIEM analysis, and finally, reducing overall risk with a comprehensive Zero Trust security posture.

Let’s look at a concrete example. One Zero Trust best practice that is increasingly required is multi-factor authentication (MFA). In the payment card industry, PCI DSS v4.0, which takes effect in 2025, requires that MFA be enforced for every access request to the cardholder data environment, for every user and every location – including cloud environments, on-prem apps, workstations, and more (requirement 8.4.2). Plus, those MFA systems must be configured to prevent misuse – including replay attacks and bypass attempts – and must require at least two different factors to succeed (requirement 8.5). To help organizations comply with both of these requirements, Cloudflare helps organizations enforce MFA across all apps and users – and in fact, we use our same services to enforce hard key authentication for our own employees.

Figure 2: Data protection use cases

The Cloudflare difference

Cloudflare One’s data protection suite is built to stay at the forefront of modern data risks to address these and other evolving use cases.

With Cloudflare, DLP is not just integrated with other typically distinct security services, like CASB, SWG, ZTNA, RBI, and email security, but converged onto a single platform with one control plane and one interface. Beyond the acronym soup, our network architecture is really what enables us to help organizations be more effective, more productive, and more agile with protecting data.

We simplify connectivity, with flexible options for you to send traffic to Cloudflare for enforcement. Those options include API-based scans of SaaS suites for misconfigurations and sensitive data. Unlike solutions that require security teams to get full app permissions from IT or business teams, Cloudflare can find risk exposure with read-only app permissions. Clientless deployments of ZTNA to secure application access and of browser isolation to control data within websites and apps are scalable for all users — employees and third-parties like contractors — for the largest enterprises. And when you do want to forward proxy traffic, Cloudflare offers one device client with self-enrollment permissions or wide area network on-ramps across security services. With so many practical ways to deploy, your data protection approach will be effective and functional — not shelfware.

Just like your data, our global network is everywhere, now spanning over 300 cities in over 100 countries. We have proven that we enforce controls faster than vendors like Zscaler, Netskope, and Palo Alto Networks — all with single-pass inspection. We ensure security is quick, reliable, and unintrusive, so you can layer on data controls without disrupting work productivity.

Our programmable network architecture enables us to build new capabilities quickly. And we rapidly adopt new security standards and protocols (like IPv6-only connections or HTTP/3 encryption) to ensure data protection remains effective. Altogether, this architecture equips us to evolve alongside changing data protection use cases, like protecting code in AI environments, and quickly deploy AI and machine learning models across our network locations to enforce higher precision, context-driven detections.

Figure 3: Unified data protection with Cloudflare

How to get started

Modern data risks demand modern security. We feel that Cloudflare One’s unified data protection suite is architected to help organizations navigate their priority risks today and in the future — whether that is securing developer code and AI tools, regaining visibility over SaaS apps, or staying compliant with evolving regulations.

If you’re ready to explore how Cloudflare can protect your data, request a workshop with our experts today.

Or to learn more about how Cloudflare One protects data, read today’s press release, visit our website, or dive deeper with our accompanying technical blog.

Google Details TPUv4 and its Crazy Optically Reconfigurable AI Network

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/google-details-tpuv4-and-its-crazy-optically-reconfigurable-ai-network/

Google detailed how its TPUv4 pods use optically reconfigurable networks to support efficient, large scale, AI workloads

The post Google Details TPUv4 and its Crazy Optically Reconfigurable AI Network appeared first on ServeTheHome.

NVIDIA Announces a New NVIDIA H100 144GB HBM3e Model and an Anticipated Business Model Shift

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/nvidia-announces-a-new-nvidia-h100-144gb-hbm3e-model-and-an-anticipated-business-model-shift/

NVIDIA announced a new H100 144GB HBM3e model this week while signaling a major business model shift to the industry

The post NVIDIA Announces a New NVIDIA H100 144GB HBM3e Model and an Anticipated Business Model Shift appeared first on ServeTheHome.

100M USD Cerebras AI Cluster Makes it the Post-Legacy Silicon AI Winner

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/100m-usd-cerebras-ai-cluster-makes-it-the-post-legacy-silicon-ai-winner/

Cerebras announced a $100M+ AI Cluster using its huge AI chips and plans for two more in 2023 making it a winner of the new AI silicon makers

The post 100M USD Cerebras AI Cluster Makes it the Post-Legacy Silicon AI Winner appeared first on ServeTheHome.

Wiwynn Liquid Cooling for 8kW AI Accelerator Trays Shown

Post Syndicated from Eric Smith original https://www.servethehome.com/wiwynn-liquid-cooling-for-8kw-of-ai-accelerators-shown-oam-oai-ocp-ubb/

We check out hyperscale server maker Wiwynn’s liquid cooling solution for 8kW GPU and AI accelerator trays at its headquarters

The post Wiwynn Liquid Cooling for 8kW AI Accelerator Trays Shown appeared first on ServeTheHome.

Lenovo ThinkEdge SE350 V2 SE360 V2 and SR675 V3 Launched for AI

Post Syndicated from Cliff Robinson original https://www.servethehome.com/lenovo-thinkedge-se350-v2-se360-v2-and-sr675-v3-launched-for-ai/

Lenovo has a trio of new AI servers: the ThinkEdge SE350 V2 and SE360 V2 are edge AI platforms, while the ThinkSystem SR675 V3 is for big GPUs

The post Lenovo ThinkEdge SE350 V2 SE360 V2 and SR675 V3 Launched for AI appeared first on ServeTheHome.

Globally distributed AI and a Constellation update

Post Syndicated from Rita Kozlov original http://blog.cloudflare.com/globally-distributed-ai-and-a-constellation-update/

During Cloudflare's 2023 Developer Week, we announced Constellation, a set of APIs that allow everyone to run fast, low-latency inference tasks using pre-trained machine learning/AI models, directly on Cloudflare’s network.

Constellation update

We now have a few thousand accounts onboarded in the Constellation private beta and have been listening to our customers’ feedback to evolve and improve the platform. Today, one month after the announcement, we are upgrading Constellation with three new features:

Bigger models
We are increasing the size limit of your models from 10 MB to 50 MB. While still somewhat conservative during the private beta, this new limit opens doors to more pre-trained and optimized models you can use with Constellation.

Tensor caching
When you run a Constellation inference task, you pass multiple tensor objects as inputs, sometimes creating big data payloads. These inputs travel over the wire back and forth when you repeat the same task, even when the changes in input between runs are minimal, creating unnecessary network and data parsing overhead.

The client API now supports caching input tensors, resulting in lower network latency and faster inference times.

XGBoost runtime
Constellation started with the ONNX runtime, but our vision is to support multiple runtimes under a common API. Today we're adding the XGBoost runtime to the list.

XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable, and it's known for its performance in structured and tabular data tasks.

You can start uploading and using XGBoost models today.

You can find the updated documentation with these new features and an example on how to use the XGBoost runtime with Constellation in our Developers Documentation.

An era of globally distributed AI

Since Cloudflare’s network is globally distributed, Constellation is our first public release of globally distributed machine learning.

But what does this mean? You may not think of a global network as the place to deploy your machine learning tasks, but machine learning has been a core part of what’s enabled much of Cloudflare’s core functionality for many years. And we run it across our global network in 300 cities.

Is this large spike in traffic an attack or a Black Friday sale? What’s going to be the best way to route this request based on current traffic patterns? Is this request coming from a human or a bot? Is this HTTP traffic a zero-day? Being able to answer these questions using automated machine learning and AI, rather than human intervention, is one of the things that’s enabled Cloudflare to scale.

But this is just a small sample of what globally distributed machine learning enables. The reason this was so helpful for us was because we were able to run this machine learning as an integrated part of our stack, which is why we’re now in the process of opening it up to more and more developers with Constellation.

As Michelle Zatlyn, our co-founder, likes to say, we’re just getting started (in this space) — every day we’re adding hundreds of new users to our Constellation beta, testing out and globally deploying new models, and beyond that, deploying new hardware to support the new types of workloads that AI will bring to our global network.

With that, we wanted to share a few announcements and some use cases that help illustrate why we’re so excited about globally distributed AI. And since it’s Speed Week, it should be no surprise that, well, speed is at the crux of it all.

Custom tailored web experiences, powered by AI

We’ve long known about the importance of performance when it comes to web experiences — in e-commerce, every second of page load time can have as much as a 7% drop off effect on conversion. But being fast is not enough. It’s necessary, but not sufficient. You also have to be accurate.

That is, rather than serving one-size-fits-all experiences, users have come to expect that you know what they want before they do.

So you have to serve personalized experiences, and you have to do it fast. That’s where Constellation can come into play. With Constellation, as a part of your e-commerce application that may already be served from Cloudflare’s network through Workers or Pages, or even store data in D1, you can now perform tasks such as categorization (what demographic is this customer most likely in?) and personalization (if you bought this, you may also like that).

Making devices smarter wherever they are

Another use case where performance is critical is in interacting with the real world. Imagine a face recognition system that detects whether you’re human or not every time you go into your house. Every second of latency makes a difference (especially if you’re holding heavy groceries).

Running inference on Cloudflare’s network means that for 95% of the world’s population, compute, and thus a decision, is never more than 50ms away. This is in huge contrast to centralized compute: if you live in Europe but bought a doorbell system from a US-based company, every decision may be hundreds of milliseconds of round trip away.

You may be thinking, why not just run the compute on the device then?

For starters, running inference on the device doesn’t guarantee fast performance. Most devices with built-in intelligence run on microcontrollers, often with limited computational abilities (not a high-end GPU or server-grade CPU). Milliseconds become seconds, and depending on the volume of workloads you need to process, local inference might not be suitable. The compute that can fit on devices is simply not powerful enough for high-volume, complex operations, certainly not at low latency.

But even user experience aside (some devices don’t interface with a user directly), there are other downsides to running compute directly on devices.

The first is battery life — the longer the compute, the shorter the battery life. There's always a power consumption hit, even if you have a custom ASIC chip or a Tensor Processing Unit (TPU), meaning shorter battery life if that's one of your constraints. For consumer products, this means having to switch out your doorbell battery (lest you get locked out). For fleets of devices operating at scale (imagine watering devices in a field), this means the cost of keeping up with, and swapping out, batteries.

Lastly, device hardware, and even software, is harder to update. As new technologies or more efficient chips become available, upgrading fleets of hundreds or thousands of devices is challenging. And while software updates may be easier to manage, they’ll never be as easy as updating on-cloud software, where you can effortlessly ship updates multiple times a day!

Speaking of shipping software…

AI applications, easier than ever with Constellation

Speed Week is not just about making your applications or devices faster, but also your team!

For the past six years, our developer platform has been making it easy for developers to ship new code with Cloudflare Workers. With Constellation, it’s now just as easy to add Machine Learning to your existing application, with just a few commands.

And if you don’t believe us, don’t just take our word for it. We’re now in the process of opening up the beta to more and more customers. To request access, head on over to the Cloudflare Dashboard where you’ll see a new tab for Constellation. We encourage you to check out our tutorial for getting started with Constellation — this AI thing may be even easier than you expected it to be!

We’re just getting started

This is just the beginning of our journey for helping developers build AI driven applications, and we’re already thinking about what’s next.

We look forward to seeing what you build, and hearing your feedback.

Every request, every microsecond: scalable machine learning at Cloudflare

Post Syndicated from Alex Bocharov original http://blog.cloudflare.com/scalable-machine-learning-at-cloudflare/

In this post, we will take you through the advancements we've made in our machine learning capabilities. We'll describe the technical strategies that have enabled us to expand the number of machine learning features and models, all while substantially reducing the processing time for each HTTP request on our network. Let's begin.

Background

For a comprehensive understanding of our evolved approach, it's important to grasp the context within which our machine learning detections operate. Cloudflare, on average, serves over 46 million HTTP requests per second, surging to more than 63 million requests per second during peak times.

Machine learning detection plays a crucial role in ensuring the security and integrity of this vast network. In fact, it classifies the largest volume of requests among all our detection mechanisms, providing the final Bot Score decision for over 72% of all HTTP requests. Going beyond, we run several machine learning models in shadow mode for every HTTP request.

At the heart of our machine learning infrastructure lies our reliable ally, CatBoost. It enables ultra-low-latency model inference and ensures high-quality predictions that detect novel threats, such as bots targeting our customers' mobile apps. However, it's worth noting that machine learning model inference is just one component of the overall latency equation. Other critical components include machine learning feature extraction and preparation. In our quest for optimal performance, we've continuously optimized each aspect contributing to the overall latency of our system.

Initially, our machine learning models relied on single-request features, such as presence or value of certain headers. However, given the ease of spoofing these attributes, we evolved our approach. We turned to inter-request features that leverage aggregated information across multiple dimensions of a request in a sliding time window. For example, we now consider factors like the number of unique user agents associated with certain request attributes.

The extraction and preparation of inter-request features were handled by Gagarin, a Go-based feature serving platform we developed. As a request arrived at Cloudflare, we extracted dimension keys from the request attributes. We then looked up the corresponding machine learning features in the multi-layered cache. If the desired machine learning features were not found in the cache, a memcached "get" request was made to Gagarin to fetch those. Then machine learning features were plugged into CatBoost models to produce detections, which were then surfaced to the customers via Firewall and Workers fields and internally through our logging pipeline to ClickHouse. This allowed our data scientists to run further experiments, producing more features and models.

Previous system design for serving machine learning features over Unix socket using Gagarin.

Initially, Gagarin exhibited decent latency, with a median latency around 200 microseconds to serve all machine learning features for given keys. However, as our system evolved and we introduced more features and dimension keys, coupled with increased traffic, the cache hit ratio began to wane. The median latency had increased to 500 microseconds and during peak times, the latency worsened significantly, with the p99 latency soaring to roughly 10 milliseconds. Gagarin underwent extensive low-level tuning, optimization, profiling, and benchmarking. Despite these efforts, we encountered the limits of inter-process communication (IPC) using Unix Domain Socket (UDS), among other challenges, explored below.

Problem definition

In summary, the previous solution had its drawbacks, including:

  • High tail latency: during peak times, a portion of requests experienced increased latency caused by CPU contention on the Unix socket and the Lua garbage collector.
  • Suboptimal resource utilization: CPU and RAM utilization was not optimized to its full potential, leaving fewer resources for other services running on the server.
  • Machine learning feature availability: decreased due to memcached timeouts, which resulted in a higher likelihood of false positives or false negatives for a subset of the requests.
  • Scalability constraints: as we added more machine learning features, we approached the scalability limit of our infrastructure.

Equipped with a comprehensive understanding of the challenges and armed with quantifiable metrics, we ventured into the next phase: seeking a more efficient way to fetch and serve machine learning features.

Exploring solutions

In our quest for more efficient methods of fetching and serving machine learning features, we evaluated several alternatives. The key approaches included:

Further optimizing Gagarin: as we pushed our Go-based memcached server to its limits, we encountered a lower bound on latency reductions. This arose from the synchronization overhead of IPC over UDS, multiple data copies, serialization/deserialization overheads, the inherent latency of the garbage collector, and the performance of hashmap lookups in Go.

Considering Quicksilver: we contemplated using Quicksilver, but the volume and update frequency of machine learning features posed capacity concerns and potential negative impacts on other use cases. Moreover, it uses a Unix socket with the memcached protocol, reproducing the same limitations previously encountered.

Increasing multi-layered cache size: we investigated expanding cache size to accommodate tens of millions of dimension keys. However, the associated memory consumption, due to duplication of these keys and their machine learning features across worker threads, rendered this approach untenable.

Sharding the Unix socket: we considered sharding the Unix socket to alleviate contention and improve performance. Despite showing potential, this approach only partially solved the problem and introduced more system complexity.

Switching to RPC: we explored the option of using RPC for communication between our front line server and Gagarin. However, since RPC still requires some form of communication bus (such as TCP, UDP, or UDS), it would not significantly change the performance compared to the memcached protocol over UDS, which was already simple and minimalistic.

After considering these approaches, we shifted our focus towards investigating alternative Inter-Process Communication (IPC) mechanisms.

IPC mechanisms

Adopting a first principles design approach, we questioned: "What is the most efficient low-level method for data transfer between two processes provided by the operating system?" Our goal was to find a solution that would enable the direct serving of machine learning features from memory for corresponding HTTP requests. By eliminating the need to traverse the Unix socket, we aimed to reduce CPU contention, improve latency, and minimize data copying.

To identify the most efficient IPC mechanism, we evaluated various options available within the Linux ecosystem. We used ipc-bench, an open-source benchmarking tool specifically designed for this purpose, to measure the latencies of different IPC methods in our test environment. The measurements were based on sending one million 1,024-byte messages back and forth (i.e., ping-pong) between two processes.

IPC method Avg duration, μs Avg throughput, msg/s
eventfd (bi-directional) 9.456 105,533
TCP sockets 8.74 114,143
Unix domain sockets 5.609 177,573
FIFOs (named pipes) 5.432 183,388
Pipe 4.733 210,369
Message Queue 4.396 226,421
Unix Signals 2.45 404,844
Shared Memory 0.598 1,616,014
Memory-Mapped Files 0.503 1,908,613

Based on our evaluation, we found that Unix sockets, while taking care of synchronization, were not the fastest IPC method available. The two fastest IPC mechanisms were shared memory and memory-mapped files. Both approaches offered similar performance, with the former using a specific tmpfs volume in /dev/shm and dedicated system calls, while the latter could be stored in any volume, including tmpfs or HDD/SSD.
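
For a feel of why the memory-based approaches win, here is a minimal sketch of sharing data between processes through a memory-mapped file, using the memmap2 crate. The crate choice, path, and payload are illustrative assumptions rather than what the production system uses: a writer maps a file backed by tmpfs and writes bytes straight into memory, and a reader maps the same file and reads them back with no socket round trip.

use std::fs::OpenOptions;

use memmap2::MmapMut;

fn main() -> std::io::Result<()> {
    // Create (or open) the file backing the shared mapping; placing it on
    // tmpfs (e.g. /dev/shm) keeps it entirely in memory.
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open("/dev/shm/ipc_demo")?;
    file.set_len(1024)?;

    // Writer side: map the file and write bytes directly into memory.
    let mut writer_map = unsafe { MmapMut::map_mut(&file)? };
    writer_map[..4].copy_from_slice(&42u32.to_le_bytes());
    writer_map.flush()?;

    // Reader side (normally a separate process): map the same file and read
    // the value back without any extra copies through a socket.
    let reader_map = unsafe { MmapMut::map_mut(&file)? };
    let value = u32::from_le_bytes(reader_map[..4].try_into().unwrap());
    println!("read {value} from the shared mapping");
    Ok(())
}

Unlike Unix sockets, nothing here coordinates concurrent writers and readers, which is exactly the gap the next sections address.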

Missing ingredients

In light of these findings, we decided to employ memory-mapped files as the IPC mechanism for serving machine learning features. This choice promised reduced latency, decreased CPU contention, and minimal data copying. However, it did not inherently offer data synchronization capabilities like Unix sockets. Unlike Unix sockets, memory-mapped files are simply files in a Linux volume that can be mapped into memory of the process. This sparked several critical questions:

  1. How could we efficiently fetch an array of hundreds of float features for given dimension keys when dealing with a file?
  2. How could we ensure safe, concurrent and frequent updates for tens of millions of keys?
  3. How could we avert the CPU contention previously encountered with Unix sockets?
  4. How could we effectively support the addition of more dimensions and features in the future?

To address these challenges we needed to further evolve this new approach by adding a few key ingredients to the recipe.

Augmenting the idea

To realize our vision of memory-mapped files as a method for serving machine learning features, we needed to employ several key strategies, touching upon aspects like data synchronization, data structure, and deserialization.

Wait-free synchronization

When dealing with concurrent data, ensuring safe, concurrent, and frequent updates is paramount. Traditional locks are often not the most efficient solution, especially when dealing with high concurrency environments. Here's a rundown on three different synchronization techniques:

With-lock synchronization: a common approach using mechanisms like mutexes or spinlocks. It ensures only one thread can access the resource at a given time, but it can suffer from contention, blocking, and priority inversion, as was evident with Unix sockets.

Lock-free synchronization: this non-blocking approach employs atomic operations to ensure at least one thread always progresses. It eliminates traditional locks but requires careful handling of edge cases and race conditions.

Wait-free synchronization: a more advanced technique that guarantees every thread makes progress and completes its operation without being blocked by other threads. It provides stronger progress guarantees compared to lock-free synchronization, ensuring that each thread completes its operation within a finite number of steps.

Of the three approaches, only wait-free synchronization provides disjoint access parallelism, starvation freedom, and a finite execution time guarantee all at once.

Our wait-free data access pattern draws inspiration from Linux kernel's Read-Copy-Update (RCU) pattern and the Left-Right concurrency control technique. In our solution, we maintain two copies of the data in separate memory-mapped files. Write access to this data is managed by a single writer, with multiple readers able to access the data concurrently.

We store the synchronization state, which coordinates access to these data copies, in a third memory-mapped file, referred to as "state". This file contains an atomic 64-bit integer, which represents an InstanceVersion and a pair of additional atomic 32-bit variables, tracking the number of active readers for each data copy. The InstanceVersion consists of the currently active data file index (1 bit), the data size (39 bits, accommodating data sizes up to 549 GB), and a data checksum (24 bits).
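
As a rough illustration of that layout, the sketch below packs and unpacks such a version word. The exact bit positions are our own assumption for illustration and may not match the actual implementation.

/// Pack the fields described above into one 64-bit word: 1 bit for the
/// active data-file index, 39 bits for the data size (up to ~549 GB),
/// and 24 bits for the checksum.
fn pack_instance_version(file_idx: u64, data_size: u64, checksum: u64) -> u64 {
    assert!(file_idx < 2);
    assert!(data_size < (1 << 39));
    assert!(checksum < (1 << 24));
    (file_idx << 63) | (data_size << 24) | checksum
}

/// Split a version word back into (file index, data size, checksum).
fn unpack_instance_version(version: u64) -> (u64, u64, u64) {
    let file_idx = version >> 63;
    let data_size = (version >> 24) & ((1 << 39) - 1);
    let checksum = version & ((1 << 24) - 1);
    (file_idx, data_size, checksum)
}

In practice the word lives in an AtomicU64 inside the "state" file, so a writer can publish a new version with a single atomic store and readers can load it without locking.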

Zero-copy deserialization

To efficiently store and fetch machine learning features, we needed to address the challenge of deserialization latency. Here, zero-copy deserialization provides an answer. This technique reduces the time and memory required to access and use data by directly referencing bytes in the serialized form.

We turned to rkyv, a zero-copy deserialization framework in Rust, to help us with this task. rkyv implements total zero-copy deserialization, meaning no data is copied during deserialization and no work is done to deserialize data. It achieves this by structuring its encoded representation to match the in-memory representation of the source type.

One of the key features of rkyv that our solution relies on is its ability to access HashMap data structures in a zero-copy fashion. This is a unique capability among Rust serialization libraries and one of the main reasons we chose rkyv for our implementation. It also has a vibrant Discord community, eager to offer best-practice advice and accommodate feature requests.

Feature comparison: rkyv vs FlatBuffers and Cap'n Proto
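
As a hedged sketch of that zero-copy access pattern (based on rkyv 0.7 with its validation feature enabled; the struct and values here are invented for illustration), a map of features can be serialized once and then queried in place, without rebuilding the HashMap:

use std::collections::HashMap;

use bytecheck::CheckBytes;
use rkyv::{Archive, Deserialize, Serialize};

#[derive(Archive, Deserialize, Serialize, Debug)]
#[archive_attr(derive(CheckBytes))]
struct Features {
    /// Dimension key -> machine learning feature values.
    map: HashMap<u64, Vec<f32>>,
}

fn main() {
    let mut map = HashMap::new();
    map.insert(42u64, vec![0.1f32, 0.2, 0.3]);
    let features = Features { map };

    // Serialize once, e.g. in the writer process.
    let bytes = rkyv::to_bytes::<_, 1024>(&features).expect("serialization failed");

    // Readers validate and access the archived form in place: no allocation,
    // no copy, and no deserialization work.
    let archived = rkyv::check_archived_root::<Features>(&bytes[..]).expect("invalid data");
    if let Some(values) = archived.map.get(&42u64) {
        println!("first feature value: {}", values[0]);
    }
}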

Enter mmap-sync crate

Leveraging the benefits of memory-mapped files, wait-free synchronization and zero-copy deserialization, we've crafted a unique and powerful tool for managing high-performance, concurrent data access between processes. We've packaged these concepts into a Rust crate named mmap-sync, which we're thrilled to open-source for the wider community.

At the core of the mmap-sync package is a structure named Synchronizer. It offers an avenue to read and write any data expressible as a Rust struct. Users simply have to implement or derive a specific Rust trait on the struct definition – a task requiring just a single line of code. The Synchronizer presents an elegantly simple interface, equipped with "write" and "read" methods.

impl Synchronizer {
    /// Write a given `entity` into the next available memory mapped file.
    pub fn write<T>(&mut self, entity: &T, grace_duration: Duration) -> Result<(usize, bool), SynchronizerError> {
        …
    }

    /// Reads and returns `entity` struct from mapped memory wrapped in `ReadResult`
    pub fn read<T>(&mut self) -> Result<ReadResult<T>, SynchronizerError> {
        …
    }
}

/// FeaturesMetadata stores features along with their metadata
#[derive(Archive, Deserialize, Serialize, Debug, PartialEq)]
#[archive_attr(derive(CheckBytes))]
pub struct FeaturesMetadata {
    /// Features version
    pub version: u32,
    /// Features creation Unix timestamp
    pub created_at: u32,
    /// Features represented by vector of hash maps
    pub features: Vec<HashMap<u64, Vec<f32>>>,
}

A read operation through the Synchronizer performs zero-copy deserialization and returns a "guarded" Result encapsulating a reference to the Rust struct using the RAII design pattern. This operation also increments the atomic counter of active readers using the struct. Once the Result goes out of scope, the Synchronizer decrements the number of readers.

The synchronization mechanism used in mmap-sync is not only "lock-free" but also "wait-free". This ensures an upper bound on the number of steps an operation will take before it completes, thus providing a performance guarantee.

The data is stored in shared mapped memory, which allows the Synchronizer to “write” to it and “read” from it concurrently. This design makes mmap-sync a highly efficient and flexible tool for managing shared, concurrent data access.
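
Putting the pieces together, a minimal usage sketch might look like the following. The constructor call, file path, and field values are assumptions for illustration; the write and read signatures follow the interface shown above and may differ in newer releases of the crate.

use std::collections::HashMap;
use std::ffi::OsStr;
use std::time::Duration;

use bytecheck::CheckBytes;
use mmap_sync::synchronizer::Synchronizer;
use rkyv::{Archive, Deserialize, Serialize};

#[derive(Archive, Deserialize, Serialize, Debug, PartialEq)]
#[archive_attr(derive(CheckBytes))]
pub struct FeaturesMetadata {
    pub version: u32,
    pub created_at: u32,
    pub features: Vec<HashMap<u64, Vec<f32>>>,
}

fn main() {
    // Path prefix for the backing data/state files (assumed to live on tmpfs).
    let mut synchronizer = Synchronizer::new(OsStr::new("/dev/shm/features"));

    // Writer process: publish a new snapshot. Readers still holding the other
    // data copy get a grace period before it is overwritten.
    let metadata = FeaturesMetadata {
        version: 1,
        created_at: 1_690_000_000,
        features: vec![HashMap::from([(42u64, vec![0.1f32, 0.2, 0.3])])],
    };
    synchronizer
        .write(&metadata, Duration::from_secs(1))
        .expect("write failed");

    // Reader process: a zero-copy, guarded view of the current snapshot; the
    // active-reader counter is decremented when the guard goes out of scope.
    let guard = synchronizer.read::<FeaturesMetadata>().expect("read failed");
    println!("features version: {}", guard.version);
}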

Now, with an understanding of the underlying mechanics of mmap-sync, let's explore how this package plays a key role in the broader context of our Bot Management platform, particularly within the newly developed components: the bliss service and library.

System design overhaul

Transitioning from a Lua-based module that made memcached requests over Unix socket to Gagarin in Go to fetch machine learning features, our new design represents a significant evolution. This change pivots around the introduction of mmap-sync, our newly developed Rust package, laying the groundwork for a substantial performance upgrade. This development led to a comprehensive system redesign and introduced two new components that form the backbone of our Bots Liquidation Intelligent Security System – or BLISS, in short: the bliss service and the bliss library.

Bliss service

The bliss service operates as a Rust-based, multi-threaded sidecar daemon. It has been designed for optimal batch processing of vast data quantities and extensive I/O operations. Among its key functions, it fetches, parses, and stores machine learning features and dimensions for effortless data access and manipulation. This has been made possible through the incorporation of the Tokio event-driven platform, which allows for efficient, non-blocking I/O operations.

Bliss library

Operating as a single-threaded dynamic library, the bliss library seamlessly integrates into each worker thread using the Foreign Function Interface (FFI) via a Lua module. Optimized for minimal resource usage and ultra-low latency, this lightweight library performs tasks without the need for heavy I/O operations. It efficiently serves machine learning features and generates corresponding detections.

In addition to leveraging the mmap-sync package for efficient machine learning feature access, our new design includes several other performance enhancements:

  • Allocation-free operation: the bliss library re-uses pre-allocated data structures and performs no heap allocations, only low-cost stack allocations. To enforce our zero-allocation policy, we run integration tests using the dhat heap profiler.
  • SIMD optimizations: wherever possible, the bliss library employs vectorized CPU instructions. For instance, AVX2 and SSE4 instruction sets are used to expedite hex-decoding of certain request attributes, improving speed tenfold.
  • Compiler tuning: We compile both the bliss service and library with the following flags for superior performance:

[profile.release]
codegen-units = 1
debug = true
lto = "fat"
opt-level = 3

  • Benchmarking & profiling: We use Criterion for benchmarking every major feature or component within bliss. Moreover, we are also able to use the Go pprof profiler on Criterion benchmarks to view flame graphs and more:

cargo bench -p integration -- --verbose --profile-time 100

go tool pprof -http=: ./target/criterion/process_benchmark/process/profile/profile.pb

This comprehensive overhaul of our system has not only streamlined our operations but also has been instrumental in enhancing the overall performance of our Bot Management platform. Stay tuned to witness the remarkable changes brought about by this new architecture in the next section.

Rollout results

Our system redesign has brought some truly "blissful" dividends. Above all, our commitment to a seamless user experience and the trust of our customers have guided our innovations. We ensured that the transition to the new design was seamless, maintaining full backward compatibility, with no customer-reported false positives or negatives encountered. This is a testament to the robustness of the new system.

As the old adage goes, the proof of the pudding is in the eating. This couldn't be truer when examining the dramatic latency improvements achieved by the redesign. Our overall processing latency for HTTP requests at Cloudflare improved by an average of 12.5% compared to the previous system.

This improvement is even more significant in the Bot Management module, where latency improved by an average of 55.93%.

Bot Management module latency, in microseconds.

More specifically, our machine learning features fetch latency has improved by several orders of magnitude:

Latency metric Before (μs) After (μs) Change
p50 532 9 -98.30% or x59
p99 9510 18 -99.81% or x528
p999 16000 29 -99.82% or x551

To truly grasp this impact, consider this: with Cloudflare’s average rate of 46 million requests per second, a saving of 523 microseconds per request equates to saving over 24,000 days or 65 years of processing time every single day!
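
For readers who want the back-of-the-envelope math behind that claim (523 μs is the p50 improvement: 532 μs before minus 9 μs after):

46,000,000 requests/s × 523 μs saved per request ≈ 24,058 CPU-seconds saved per second of wall-clock time
24,058 × 86,400 seconds/day ≈ 2.08 × 10⁹ seconds ≈ 24,058 days ≈ 65.9 years of processing time saved every day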

In addition to latency improvements, we also reaped other benefits from the rollout:

  • Enhanced feature availability: thanks to eliminating Unix socket timeouts, machine learning feature availability is now a robust 100%, resulting in fewer false positives and negatives in detections.
  • Improved resource utilization: our system overhaul liberated resources equivalent to thousands of CPU cores and hundreds of gigabytes of RAM – a substantial enhancement of our server fleet's efficiency.
  • Code cleanup: another positive spin-off has been in our Lua and Go code. Thousands of lines of less performant and less memory-safe code have been weeded out, reducing technical debt.
  • Upscaled machine learning capabilities: last but certainly not least, we've significantly expanded our machine learning features, dimensions, and models. This upgrade empowers our machine learning inference to handle hundreds of machine learning features and dozens of dimensions and models.

Conclusion

In the wake of our redesign, we've constructed a powerful and efficient system that truly embodies the essence of 'bliss'. Harnessing the advantages of memory-mapped files, wait-free synchronization, allocation-free operations, and zero-copy deserialization, we've established a robust infrastructure that maintains peak performance while achieving remarkable reductions in latency. As we navigate towards the future, we're committed to leveraging this platform to further improve our security machine learning products and cultivate innovative features. Additionally, we're excited to share parts of this technology through the open-sourced Rust package mmap-sync.

As we leap into the future, we are building upon our platform's impressive capabilities, exploring new avenues to amplify the power of machine learning. We are deploying a new machine learning model built on BLISS with select customers. If you are a Bot Management subscriber and want to test the new model, please reach out to your account team.

Separately, we are on the lookout for more Cloudflare customers who want to run their own machine learning models at the edge today. If you’re a developer considering making the switch to Workers for your application, sign up for our Constellation AI closed beta. If you’re a Bot Management customer and looking to run an already trained, lightweight model at the edge, we would love to hear from you. Let's embark on this path to bliss together.

AMD Instinct MI300 is THE Chance to Chip into NVIDIA AI Share

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/amd-instinct-mi300-is-the-chance-to-chip-into-nvidia-ai-share/

The AMD Instinct MI300 is a family of CPU and GPU offerings that embodies a next-generation approach for AI and high-performance computing

The post AMD Instinct MI300 is THE Chance to Chip into NVIDIA AI Share appeared first on ServeTheHome.

Survey reveals AI’s impact on the developer experience

Post Syndicated from Inbal Shani original https://github.blog/2023-06-13-survey-reveals-ais-impact-on-the-developer-experience/


Developers today do more than just write and ship code—they’re expected to navigate a number of tools, environments, and technologies, including the new frontier of generative artificial intelligence (AI) coding tools. But the most important thing for developers isn’t story points or the speed of deployments. It’s the developer experience, which determines how efficiently and productively developers can exceed standards, enter a flow state, and drive impact.

I say this not only as GitHub’s chief product officer, but as a long-time developer who has worked across every part of the stack. Decades ago, when I earned my master’s in mechanical engineering, I became one of the first technologists to apply AI in the lab. Back then, it would take our models five days to process our larger datasets—which is striking considering the speed of today’s AI models. I yearned for tools that would make me more efficient and shorten my time to production. This is why I’m passionate about developer experience (DevEx) and have made it my focus as GitHub’s chief product officer.

Amid the rapid advancements in generative AI, we wanted to get a better understanding from developers about how new tools—and current workflows—are impacting the overall developer experience. As a starting point, we focused on some of the biggest components of the developer experience: developer productivity, team collaboration, AI, and how developers think they can best drive impact in enterprise environments.

To do so, we partnered with Wakefield Research to survey 500 U.S.-based developers at enterprise companies. In the following report, we’ll show how organizations can remove barriers to help enterprise engineering teams drive innovation and impact in this new age of software development. Ultimately, the way to innovate at scale is to empower developers by improving their productivity, increasing their satisfaction, and enabling them to do their best work—every day. After all, there can be no progress without developers who are empowered to drive impact.

Inbal Shani
Chief Product Officer // GitHub

Why developer experience matters

At GitHub, we’re aware there’s often a significant gap between the day-to-day reality for most developers and “conversations about ‘what developers want.’”

With this survey, we wanted to better understand the typical experience for developers—and identify key ways companies can empower their developers and achieve greater success.

One big takeaway: It starts with investing in a great developer experience. And collaboration, as we learned from our research, is at the core of how developers want to work and what makes them most productive, satisfied, and impactful.

A diagram of a formula behind the developer experience that accounts for productivity, impact, satisfaction, and collaboration.
C = Collaboration, the multiplier across the entire developer experience.

DevEx is a formula that takes into account:

  • How simple and fast it is for a developer to implement a change on a codebase—or be productive.
  • How frictionless it is to move from idea through production to impact.
  • How positively or negatively the work environment, workflows, and tools affect developer satisfaction.

For leaders, developer experience is about creating a collaborative environment where developers can be their most productive, impactful, and satisfied at work. For developers, collaboration is one of the most important parts of the equation.

Current performance metrics fall short of developer expectations

Developers say performance metrics don’t meet expectations

The way developers are currently evaluated doesn’t align with how they think their performance should be measured.

  • For instance, the developers we surveyed say they’re currently measured by the number of incidents they resolve. But developers believe that how they handle those bugs and issues is more important to performance. This aligns with the belief that code quality, rather than code quantity, should remain a top performance metric.
  • Developers also believe collaboration and communication should be just as important as code quality in terms of performance measures. Their ability to collaborate and communicate with others is essential to their job, but only 33% of developers report that their companies use it as a performance metric.
Metrics currently used to measure performance, compared with metrics developers think should be used to measure their performance.
More than output quantity and efficiency, code quality and collaboration are the most important performance metrics, according to the developers we surveyed.
The top-ranked responses show that developers say their teams spend the most time writing code and finding and fixing security vulnerabilities.

Developers want more opportunities to upskill and drive impact

When developers are asked about what makes a positive impact on their workday, they rank learning new skills (43%), getting feedback from end users (39%), automated tests (38%), and designing solutions to novel problems (36%) as top contenders.

The top tasks developers say positively impact their workdays.

But developers say they’re spending most of their time writing code and tests, then waiting for that code to be reviewed or builds and tests to be executed.

On a typical day, the enterprise developers we surveyed report their teams are busy with a variety of tasks, including writing code, fixing security vulnerabilities, and getting feedback from end users, among other things. Developers also report that they spend a similar amount of time across these tasks, indicating that they’re stretched thin throughout the day.

The tasks developers say they spend the most time working on each day.

Notably, developers say they spend the same amount of time waiting for builds and tests as they do writing new code.

  • This suggests that wait times for builds and tests are still a persistent problem despite investments in DevOps tools over the past decade.
  • Developers also continue to face obstacles, such as waiting on code review, builds, and test runs, which can hinder their ability to learn new skills and design solutions to novel problems, and our research suggests that these factors can have the biggest impact on their overall satisfaction.

Developers want feedback from end users, but face challenges

Developers say getting feedback from end users (39%) is the second-most important thing that positively impacts their workdays—but it’s often challenging for development teams to get that feedback directly.

  • Product managers and marketing teams often act as intermediaries, making it difficult for developers to directly receive end-user feedback.
  • Developers would ideally receive feedback from automated and validation tests to improve their work, but sometimes these tests are sent to other teams before being handed off to engineering teams.

The top two daily tasks for development teams include writing code (32%) and finding and fixing security vulnerabilities (31%).

  • This shows the increased importance developers have placed on security and underscores how companies are prioritizing security.
  • It also demonstrates the critical role that enterprise development teams play in meeting policy and board edicts around security.

The bottom line
Developers want to upskill, design solutions, get feedback from end users, and be evaluated on their communication skills. However, wait times on builds and tests, as well as the current performance metrics they’re evaluated on, are getting in the way.

Collaboration is the cornerstone of the developer experience

Developers thrive in collaborative environments

In our survey of enterprise engineers, developers say they work with an average of 21 other developers on a typical project—and 52% report working with other teams daily or weekly. Notably, they rank regular touchpoints as the most important factor for effective collaboration.

Developers in enterprise settings work with an average of 21 other developers per project, often on a daily or weekly cadence.

But developers also have a holistic view of collaboration—it’s defined not only by talking and meeting with others, but also by uninterrupted work time, access to fully configured developer environments, and formal mentor-mentee relationships.

  • Specified blocks with no team communication give developers the time and space to write code and work towards team goals.
  • Access to fully configured developer environments promotes consistency throughout the development process. It also helps developers collaborate faster and avoid hearing the infamous line, “But it worked on my machine.”
  • Mentorships can help developers upskill and build interpersonal skills that are essential in a collaborative work environment.

It's important to note that these factors can also negatively impact a developer's workday, which suggests that ineffective meetings can distract rather than help developers (something we've found in previous research).

The key factors developers say contribute most to effective team collaboration, including meetings, dedicated time for individual work, and access to fully configured dev environments.

Our survey indicates the factors most important to effective collaboration are so critical that when they're handled poorly, they have a noticeable, negative impact on a developer's work.

The tasks developers say most often have a negative impact on their workday experience.
Developers work with an average of 21 people on any given project. They need the time and tools for success—including regular touchpoints, heads-down time, access to fully-configured dev environments, and formal mentor-mentee relationships.

We wanted to learn more about how developers collaborate

So, we sourced some answers from our followers on Twitter. We asked developers what tips they have for effective collaboration. Here’s what one developer had to say:

Twitter user Colby Ray shared multiple tips in response to our prompt.

We also asked what makes for a productive and valuable meeting:

Twitter user kettenaito shared several points in response to our prompt.

Twitter user Mateus Feira shared several points in response to our prompt.

Effective collaboration improves code quality

As developer experience continues to be defined, so, too, will successful developer collaboration. Too many pings and messages can affect flow, but there’s still a need to stay in touch. In our survey, developers say effective collaboration results in improved test coverage and faster, cleaner, more secure code writing—which are best practices for any development team. This shows that when developers work effectively with others, they believe they build better and more secure software.

Developers widely view effective collaboration as improving how they write code, how fast they ship it, and what they ship.

Developers we surveyed believe collaboration and communication—along with code quality—should be the top priority for evaluation.

  • From DevOps to agile methodologies, developers and the greater business world have been talking about the importance of collaboration for a long time.
  • But developers are still not being measured on it.
The metrics that developers think their managers should use to evaluate their performance and productivity.

We asked developers to share their ideas for measuring how well they collaborate. Here’s what one developer had to say:

Twitter user Andrew DiMola shared several points in response to our prompt.

  • The takeaway: Companies and engineering managers should encourage regular team communication and set aside time to check in, especially in remote environments, but respect developers' need to work and focus.
Developers believe regular touchpoints with their teams, including meetings, asynchronous communication, and innersource practices, help organizations collaborate at scale.

4 tips for engineering managers to improve collaboration

At GitHub, our researchers, developers, product teams, and analysts are dedicated to studying and improving developer productivity and satisfaction. Here are their tips for engineering leaders who want to improve collaboration among developers:

  1. Make collaboration a goal in performance objectives. This builds the space and expectation that people will collaborate. This could be in the form of lunch and learns, joint projects, etc.
  2. Define and scope what collaboration looks like in your organization. Let people know when they’re being informed about something vs. being consulted about something. A matrix outlining roles and responsibilities helps define each person’s role and is something GitHub teams have implemented.
  3. Give developers time to converse and get to know one another. In particular, remote or hybrid organizations need to dedicate a portion of a developer’s time and virtual space to building relationships. Check out the GitHub guides to remote work.
  4. Identify principal and distinguished engineers. Academic research supports the positive impact of change agents in organizations—and how they should be the people who are exceptionally great at collaboration. It’s a matter of identifying your distinguished engineers and elevating them to a place where they can model desired behaviors.

The bottom line
Effective developer collaboration improves code quality and should be a performance measure. Regular touchpoints, heads-down time, access to fully configured dev environments, and formal mentor-mentee relationships result in improved test coverage and faster, cleaner, more secure code writing.

AI improves individual performance and team collaboration

Developers are already using AI coding tools at work

A staggering 92% of U.S.-based developers working in large companies report using an AI coding tool either at work or in their personal time—and 70% say they see significant benefits to using these tools.

  • AI is here to stay—and it’s already transforming how developers approach their day-to-day work. That makes it critical for businesses and engineering leaders to adopt enterprise-grade AI tools to avoid their developers using non-approved applications. Companies should also establish governance standards for using AI tools to ensure that they are used ethically and effectively.
Almost all (92%) developers are already using AI coding tools at and outside of work.

70% of developers see a benefit to using AI coding tools at work.

Almost all (92%) developers use AI coding tools at work—and a majority (67%) have used these tools in both a work setting and during their personal time. Curiously, only 6% of developers in our survey say they solely use these tools outside of work.

Developers believe AI coding tools will enhance their performance

With most developers experimenting with AI tools in the workplace, our survey results suggest it’s not just idle interest leading developers to use AI. Rather, it’s a recognition that AI coding tools will help them meet performance standards.

  • In our survey, developers say AI coding tools can help them meet existing performance standards with improved code quality, faster outputs, and fewer production-level incidents. They also believe that these metrics should be used to measure their performance beyond code quantity.
The metrics developers say their managers use to measure productivity vs. the metrics they think their managers should use once they adopt AI coding tools. Developers widely expect AI coding tools to layer into their existing workflows and bring greater efficiencies rather than change how software is made.

Around one-third of developers report that their managers currently assess their performance based on the volume of code they produce—and an equal number anticipate that this will persist when they start using AI-based coding tools.

  • Notably, the quantity of code a developer produces may not necessarily correspond to its business value.
  • Stay smart. As AI tooling becomes more common in software development (and often adds to code volume), engineering leaders will need to ask whether measuring code volume is still the best way to gauge productivity and output; the short sketch after this list shows why raw line counts can mislead.
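
To make that concrete, here is a minimal, hypothetical Python sketch (not from the survey) showing how two equivalent implementations can differ wildly in line count. If managers count lines of code, the verbose version looks several times more "productive" even though it delivers exactly the same value.

```python
# Illustrative example only: the same behavior written two ways.
# A lines-of-code metric rewards the verbose version, even though
# both functions deliver identical value.

def squares_of_evens_verbose(numbers):
    """Collect the squares of even numbers, one statement per line."""
    result = []
    for n in numbers:
        if n % 2 == 0:
            result.append(n * n)
    return result

def squares_of_evens_concise(numbers):
    """Same result in a single expression."""
    return [n * n for n in numbers if n % 2 == 0]

assert squares_of_evens_verbose(range(10)) == squares_of_evens_concise(range(10))
```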

Developers think AI coding tools will lead to greater team collaboration

Beyond improving individual performance, more than 4 in 5 developers surveyed (81%) say AI coding tools will help increase collaboration within their teams and organizations.

  • In fact, developers identify security reviews, planning, and pair programming as the most significant points of collaboration, and as the tasks teams are expected to (and should) work on with the help of AI coding tools. This also indicates that code and security reviews will remain important as developers increase their use of AI coding tools in the workplace.
Developers think their teams will need to become more collaborative as they start using AI coding tools and the quality of the code they produce becomes ever more important.
Sometimes, developers can do the same thing with one line or multiple lines of code. Even so, one-third of developers in our survey say their managers measure their performance based on how much code they produce.

Notably, developers believe AI coding tools will give them more time to focus on solution design. This has direct organizational benefits and means developers believe they’ll spend more time designing new features and products with AI instead of writing boilerplate code.

  • Developers are already using generative AI coding tools to automate parts of their workflow, which frees up time for more collaborative projects like security reviews, planning, and pair programming.
Developers believe AI coding tools will help them upskill, become more productive, and focus on higher-value problem solving.

Developers think AI increases productivity and prevents burnout

Not only can AI coding tools help improve overall productivity, but they can also provide upskilling opportunities to help create a smarter workforce according to the developers we surveyed.

  • 57% of developers believe AI coding tools help them improve their coding language skills, which is the top benefit they see. Beyond acting as an upskilling aid, developers also say AI coding tools help reduce cognitive effort; and because mental capacity and time are both finite resources, 41% of developers believe AI coding tools can help prevent burnout.
  • In previous research we conducted, 87% of developers reported that the AI coding tool GitHub Copilot helped them preserve mental effort while completing more repetitive tasks. This shows that AI coding tools allow developers to preserve cognitive effort and focus on more challenging and innovative aspects of software development or research and development.
  • AI coding tools help developers upskill while they work. Across our survey, developers consistently rank learning new skills as the number one contributor to a positive workday. But 30% also say learning and development can have a negative impact on their overall workday, which suggests some developers view learning and development as adding more work to their workdays. Notably, developers say the top benefit of AI coding tools is learning new skills—and these tools can help developers learn while they work, instead of making learning and development an additional task.
Developers are already using generative AI coding tools to automate parts of their workflow, which frees up time for more collaborative projects like security reviews, planning, and pair programming.

AI is improving the developer experience across the board

Developers in our survey suggest they can better meet standards around code quality, completion time, and the number of incidents when using AI coding tools—all of which are measures developers believe are key areas for evaluating their performance.

AI coding tools can also help reduce the likelihood of coding errors and improve the accuracy of code—which ultimately leads to more reliable software, increased application performance, and better performance numbers for developers. As AI technology continues to advance, it is likely that these coding tools will have an even greater impact on developer performance and upskilling.

AI coding tools are layering into existing developer workflows and creating greater efficiencies

Developers believe that AI coding tools will increase their productivity—but our survey suggests that developers don’t think these tools are fundamentally altering the software development lifecycle. Instead, developers suggest they’re bringing greater efficiencies to it.

  • Automation and AI have been part of the developer workflow for a long time: developers already rely on a range of automated and AI-powered tools, such as machine learning-based security checks and CI/CD pipelines (a minimal sketch of such a pipeline step follows this list).
  • Rather than completely overhauling operations, these tools create greater efficiencies within existing workflows, and that frees up more time for developers to concentrate on developing solutions.
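
As a rough illustration of the kind of automation described above, here is a minimal, hypothetical Python sketch of a pipeline step that runs automated tests and a static security scan before code ships. The tool names (pytest, bandit) and the src/ path are assumptions chosen for the example, not tools or layouts the survey prescribes.

```python
# Illustrative sketch of a simple CI-style gate: run the test suite and a
# static security scan, and fail the pipeline if either step fails.
# pytest and bandit are stand-ins for whatever tools a team actually uses.
import subprocess
import sys

def run_step(name, command):
    """Run one pipeline step and return True if it succeeded."""
    print(f"--- {name} ---")
    return subprocess.run(command).returncode == 0

def main():
    steps = [
        ("Unit tests", ["pytest", "--quiet"]),
        ("Security scan", ["bandit", "-r", "src/"]),
    ]
    failed = [name for name, command in steps if not run_step(name, command)]
    if failed:
        print(f"Pipeline failed at: {', '.join(failed)}")
        sys.exit(1)
    print("All checks passed.")

if __name__ == "__main__":
    main()
```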

The bottom line
Almost all developers (92%) are using AI coding tools at work, and they say these tools not only improve day-to-day tasks but also enable upskilling opportunities. Developers see material benefits to using AI tools, including improved performance and coding skills, as well as increased team collaboration.

The path forward

Developer satisfaction, productivity, and organizational impact are all positioned to get a boost from AI coding tools—and that will have a material impact on the overall developer experience.

With 92% of developers already saying they use AI coding tools at work and in their personal time, it's clear AI is here to stay. 70% of the developers we surveyed say they already see significant benefits when using AI coding tools, and 81% expect AI coding tools to make their teams more collaborative, which is a net benefit for companies looking to improve both developer velocity and the developer experience.

Notably, 57% of developers believe that AI could help them upskill, and these tools hold the potential to build learning and development into their daily workflows. With all of this in mind, technical leaders should start exploring AI as a solution to improve satisfaction, productivity, and the overall developer experience.

In addition to exploring AI tools, here are three takeaways engineering and business leaders should consider to improve the developer experience:

  1. Help your developers enter a flow state with tools, processes, and practices that help them be productive, drive impact, and do creative and meaningful work.
  2. Empower collaboration by breaking down organizational silos and providing developers with the opportunity to communicate efficiently.
  3. Make room for upskilling within developer workflows through key investments in AI to help your organization experiment and innovate for the future.

Methodology

This report draws on a survey conducted online by Wakefield Research on behalf of GitHub from March 14, 2023 through March 29, 2023 among 500 non-student, U.S.-based developers who are not managers and work at companies with 1,000-plus employees. For a complete survey methodology, please contact [email protected].

Rumor on the Street Intel Gaudi2 Getting an AI Accelerator Super Cluster

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/rumor-on-the-street-intel-gaudi2-getting-an-ai-accelerator-super-cluster-nvidia/

Rumor has it there is an AI supercluster being built around the Intel Gaudi2 AI accelerator that is massive in scale

The post Rumor on the Street Intel Gaudi2 Getting an AI Accelerator Super Cluster appeared first on ServeTheHome.