Tag Archives: OCP

Massive Intel LGA7529 Socket for Sierra Forest at OCP Summit 2023

Post Syndicated from Eric Smith original https://www.servethehome.com/massive-intel-lga7529-socket-for-sierra-forest-at-ocp-summit-2023/

We get hands-on with the massive Intel LGA7529 socket at OCP Summit 2023. This is the 12-channel DDR5 socket for Intel’s 2024 generation CPUs

The post Massive Intel LGA7529 Socket for Sierra Forest at OCP Summit 2023 appeared first on ServeTheHome.

Intel Shows Granite Rapids and Sierra Forest Motherboards at OCP Summit 2023

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/intel-shows-granite-rapids-and-sierra-forest-motherboards-at-ocp-summit-2023-qct-wistron/

Intel showed off upcoming Granite Rapids and Sierra Forest Xeon motherboards at OCP Summit 2023; we show upcoming server trends

The post Intel Shows Granite Rapids and Sierra Forest Motherboards at OCP Summit 2023 appeared first on ServeTheHome.

Introducing the Project Argus Datacenter-ready Secure Control Module design specification

Post Syndicated from Xiaomin Shen original http://blog.cloudflare.com/introducing-the-project-argus-datacenter-ready-secure-control-module-design-specification/

Historically, data center servers have used motherboards that included all key components on a single circuit board. The DC-SCM (Datacenter-ready Secure Control Module) decouples server management and security functions from a traditional server motherboard, enabling development of server management and security solutions independent of server architecture. It also provides opportunities for reducing server printed circuit board (PCB) material cost, and allows unified firmware images to be developed.

Today, Cloudflare is announcing that it has partnered with Lenovo to design a DC-SCM for our next-generation servers. The design specification has been published to the OCP (Open Compute Project) contribution database under the name Project Argus.

A brief introduction to baseboard management controllers

A baseboard management controller (BMC) is a specialized processor that can be found in virtually every server product. It allows remote access to the server through a network connection, and provides a rich set of server management features. Some of the commonly used BMC features include server power management, device discovery, sensor monitoring, remote firmware update, system event logging, and error reporting.

In a typical server design, the BMC resides on the server motherboard, along with other key components such as the processor, memory, CPLD and so on. This was the norm for generations of server products, but that has changed in recent years as motherboards are increasingly optimized for high-speed signal bandwidth, and servers need to support specialized security requirements. This has made it necessary to decouple the BMC and its related components from the server motherboard, and move them to a smaller common form factor module known as the Datacenter-ready Secure Control Module (DC-SCM).

Figure 1 is a picture of a motherboard used on Cloudflare’s previous generation of edge servers. The BMC and its related circuit components are placed on the same printed circuit board as the host CPU.

Figure 1: Previous Generation Server Motherboard

For Cloudflare’s next generation of edge servers, we are partnering with Lenovo to create a DC-SCM based design. On the left-hand side of Figure 2 is the printed circuit board assembly (PCBA) for the Host Processor Module (HPM). It hosts the CPU, the memory slots, and other components required for the operation and features of the server design. But the BMC and its related circuits have been relocated to a separate PCBA, which is the DC-SCM.

Figure 2: Next Generation HPM and DC-SCM

Benefits of DC-SCM based server design

PCB cost reduction

As of today, DDR5 memory runs at 6400 MT/s (mega transfers per second). In the future, DDR5 speeds may increase to 7200 MT/s or 8800 MT/s. Meanwhile, PCIe Gen5 runs at 32 GT/s (giga transfers per second), double the rate of PCIe Gen4. Both DDR5 and PCIe Gen5 are key interfaces for the processors used on our next-generation servers.

The increasing rates of high-speed IO signals and memory buses are pushing the next generation of server motherboard designs to transition from low-loss to ultra-low loss dielectric printed circuit board (PCB) materials, and to higher layer counts in the PCB. At the same time, the speed of the BMC and its related circuitry is not progressing as quickly. For example, the physical layer interface of the ASPEED AST2600 BMC is only at PCIe Gen2 (5 GT/s).
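To put the transfer rates quoted above side by side, the following is a minimal sketch that converts them into peak theoretical bandwidth. It assumes a 64-bit data width per DDR5 channel (ECC bits excluded), 128b/130b line encoding for PCIe Gen4 and Gen5, and 8b/10b encoding for PCIe Gen2; the function names are illustrative, and real-world throughput is lower because of protocol overhead.

```python
def ddr5_channel_gb_per_s(mega_transfers_per_s: int, data_bytes: int = 8) -> float:
    """Peak bandwidth of one DDR5 channel in GB/s (64 data bits = 8 bytes per transfer)."""
    return mega_transfers_per_s * 1e6 * data_bytes / 1e9


def pcie_lane_gb_per_s(giga_transfers_per_s: float, encoding_efficiency: float) -> float:
    """Peak one-direction bandwidth of a single PCIe lane in GB/s.

    Each transfer carries one bit per lane; the encoding efficiency accounts
    for line-code overhead (8b/10b or 128b/130b).
    """
    return giga_transfers_per_s * encoding_efficiency / 8


ENC_8B10B = 8 / 10        # PCIe Gen1/Gen2 line encoding
ENC_128B130B = 128 / 130  # PCIe Gen3 and later line encoding

ddr5_6400 = ddr5_channel_gb_per_s(6400)
gen5_lane = pcie_lane_gb_per_s(32, ENC_128B130B)
gen4_lane = pcie_lane_gb_per_s(16, ENC_128B130B)
gen2_lane = pcie_lane_gb_per_s(5, ENC_8B10B)

print(f"DDR5-6400, one channel: {ddr5_6400:.1f} GB/s")
print(f"PCIe Gen5, one lane:    {gen5_lane:.2f} GB/s")
print(f"PCIe Gen4, one lane:    {gen4_lane:.2f} GB/s")
print(f"PCIe Gen2, one lane:    {gen2_lane:.2f} GB/s")
```

Under these assumptions a Gen5 lane carries roughly eight times the data of a Gen2 lane, which illustrates why the BMC's interfaces do not need the same PCB materials as the host CPU's.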

Ultra-low loss dielectric PCB material and higher PCB layer count are both driving factors for higher PCB cost. Another driving factor of PCB cost is the size of the PCB. In a traditional server motherboard design, the motherboard is larger than it would otherwise need to be, since the BMC and its related circuits are placed on the same PCB as the host CPU.

By decoupling the BMC and its related circuitry from the host processor module (HPM), we can reduce the size of the relatively more expensive PCB for the HPM. The BMC and its related circuitry can be placed on a relatively cheaper PCB, with reduced layer count and lossier PCB dielectric materials. For example, in the design of Cloudflare’s next generation of servers, the server motherboard PCB needs to be 14 or more layers, whereas the BMC and its related components can be easily routed with 8 or 10 layers of PCB. In addition, the dielectric material used on the DC-SCM PCB is low-loss dielectric, another cost saver compared to the ultra-low loss dielectric materials used on the HPM PCB.
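The cost argument above can be illustrated with a toy model. This is only a sketch: it assumes cost scales linearly with board area, layer count, and a material multiplier, which is a simplification. The layer counts (14 and 8) come from the text, while the board areas and material multipliers are hypothetical placeholders rather than measured values, and the model ignores the cost of the connector between the two boards.

```python
def relative_pcb_cost(area_cm2: float, layers: int, material_factor: float) -> float:
    """Toy model: cost proportional to area x layer count x material multiplier."""
    return area_cm2 * layers * material_factor


# Hypothetical material multipliers (relative to low-loss dielectric).
ULTRA_LOW_LOSS = 1.5
LOW_LOSS = 1.0

# Hypothetical board areas in cm^2.
HPM_AREA = 1500.0  # host processor module without the BMC circuitry
BMC_AREA = 100.0   # area occupied by the BMC and its related circuits

# Traditional design: everything on one 14-layer ultra-low loss board.
monolithic = relative_pcb_cost(HPM_AREA + BMC_AREA, 14, ULTRA_LOW_LOSS)

# DC-SCM design: smaller 14-layer HPM plus an 8-layer low-loss DC-SCM.
split = (relative_pcb_cost(HPM_AREA, 14, ULTRA_LOW_LOSS)
         + relative_pcb_cost(BMC_AREA, 8, LOW_LOSS))

saving = 1 - split / monolithic
print(f"Relative PCB cost saving in this toy model: {saving:.1%}")
```

The absolute figure depends entirely on the placeholder inputs; the point is only that moving the BMC area onto a board with fewer layers and cheaper material lowers the total in such a model.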

Modularized design enables flexibility

DC-SCM modularizes server management and security components into a common add-in card form factor, enabling developers to move customer-specific solutions from the more complex components, such as motherboards, to the DC-SCM. This provides flexibility for developers to offer multiple customer-specific solutions, without the need to redesign a motherboard for each solution.

Developers are able to reuse the DC-SCM from a previous generation of server design, if the management and security requirements remain the same. This reduces the overall cost of upgrading to a new generation of servers, and has the potential to reduce e-waste when a server is decommissioned.

Likewise, management and security solution upgrades within a server generation can be carried out separately by modifying or replacing the DC-SCM. The more complex components on the HPM do not need to be redesigned. From a data center perspective, it speeds up the upgrade of management and security hardware across multiple server platforms.

Unified interoperable OpenBMC firmware development

The Datacenter-ready Secure Control Interface (DC-SCI) is a standardized hardware interface between the DC-SCM and the Host Processor Module (HPM). It provides a basis for electrical interoperability between different DC-SCM and HPM designs.

This interoperability makes it possible to have a unified firmware image across multiple DC-SCM designs, concentrating development resources on a single firmware rather than an array of them. The publicly-accessible OpenBMC repository provides a perfect platform for firmware developers of different companies to collaborate and develop such unified OpenBMC images. Instead of maintaining a separate BMC firmware image for each platform, we now use a single image that can be applied across multiple server platforms. The device tree specific to each respective server is automatically loaded based on device product information.
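The idea of one image selecting a per-platform device tree can be sketched as a simple lookup. This is an illustration of the concept only, not how OpenBMC or the Project Argus firmware implements it: the product names, file names, and the `select_device_tree` helper are all hypothetical, and the fallback to a minimal device tree is an assumption made for the example.

```python
# Hypothetical mapping from product name to device tree blob, for illustration only.
DEVICE_TREES = {
    "platform-a": "aspeed-bmc-platform-a.dtb",
    "platform-b": "aspeed-bmc-platform-b.dtb",
}

# Hypothetical fallback used when the product is not recognized.
FALLBACK_DTB = "aspeed-bmc-minimal.dtb"


def select_device_tree(product_name: str) -> str:
    """Return the device tree file for a product, or the fallback if unknown.

    The product name is normalized (trimmed, lower-cased) before lookup so
    that minor formatting differences in the product information do not
    cause a mismatch.
    """
    key = product_name.strip().lower()
    return DEVICE_TREES.get(key, FALLBACK_DTB)


if __name__ == "__main__":
    for name in ("Platform-A", " platform-b ", "unknown-board"):
        print(f"{name!r} -> {select_device_tree(name)}")
```

Keeping the platform differences in data (the table) rather than in separate firmware builds is what lets a single release cover every supported platform.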

Using a unified OpenBMC image significantly simplifies the process of releasing BMC firmware to multiple server platforms. Firmware updates and changes are propagated to all supported platforms in a single firmware release.

Project Argus

The DC-SCM specifications have been driven by the Open Compute Project (OCP) Foundation hardware management workstream, as a way to standardize server management, security, and control features.

Cloudflare has partnered with Lenovo on what we call Project Argus, Cloudflare’s first DC-SCM implementation that fully adheres to the DC-SCM 2.0 specification. In the DC-SCM 2.0 specification, a few design items are left open for implementers to decide on the most suitable architectural choices. With the goal of improving interoperability of Cloudflare DC-SCM designs across server vendors and server designs, Project Argus includes documentation on implementation details and design decisions on form factor, mechanical locking mechanism, faceplate design, DC-SCI pinout, BMC chip, BMC pinout, Hardware Root of Trust (HWRoT), HWRoT pinout, and minimum bootable device tree.

Figure 3: Project Argus DC-SCM 2.0

At the heart of the Project Argus DC-SCM is the ASPEED AST2600 BMC System on Chip (SoC), which, when loaded with compatible OpenBMC firmware, provides a rich set of common features necessary for remote server management. The ASPEED AST1060 is used on the Project Argus DC-SCM as the HWRoT solution, providing secure firmware authentication, firmware recovery, and firmware update capability. Project Argus DC-SCM 2.0 uses a Lattice MachXO3D CPLD with secure boot and dual boot ability as the DC-SCM CPLD to support a variety of IO interfaces including LTPI, SGPIO, UART and GPIOs.

The mechanical form factor of Project Argus DC-SCM 2.0 is the horizontal External Form Factor (EFF).

Cloudflare and Lenovo have contributed the Project Argus design specification and reference design files to the OCP contribution database. Below is a detailed list of our contributions:

  • SPI, I2C/I3C, UART, LTPI/SGPIO block diagrams
  • DC-SCM PCB stackup
  • DC-SCM Board placements (TOP and BOTTOM layers)
  • DC-SCM schematic PDF file
  • DC-SCI pin definition PDF file
  • Power sequence PDF file
  • DC-SCM bill of materials Excel spreadsheet
  • Minimum bootable device tree requirements
  • Mechanical Drawings PDF files, including card assembly drawing and interlock rail drawing

The security foundation for our Gen 12 hardware

Cloudflare has been innovating around server design for many years, delivering increased performance per watt and reduced carbon footprints. We are excited to integrate Project Argus DC-SCM 2.0 into our next-generation, Cloudflare Gen 12 servers. Stay tuned for more exciting updates on Cloudflare Gen 12 hardware design!

Wiwynn Liquid Cooling for 8kW AI Accelerator Trays Shown

Post Syndicated from Eric Smith original https://www.servethehome.com/wiwynn-liquid-cooling-for-8kw-of-ai-accelerators-shown-oam-oai-ocp-ubb/

We check out the Hyper-scale server maker Wiwynn’s liquid cooling solution for 8kW GPU and AI accelerator trays at its headquarters

The post Wiwynn Liquid Cooling for 8kW AI Accelerator Trays Shown appeared first on ServeTheHome.

MiTAC Goldstone GS1D11 4th Gen Intel Xeon Scalable 2P at OCP Regional Summit 2023 Prague

Post Syndicated from Cliff Robinson original https://www.servethehome.com/mitac-goldstone-gs1d11-4th-gen-intel-xeon-scalable-2p-at-ocp-regional-summit-2023-prague/

We saw the new MiTAC Goldstone GS1D11 4th Gen Intel Xeon Scalable 2P at OCP Regional Summit 2023 in Prague

The post MiTAC Goldstone GS1D11 4th Gen Intel Xeon Scalable 2P at OCP Regional Summit 2023 Prague appeared first on ServeTheHome.

EdgeCore ECS4125-10P 2.5GbE PoE++ Switch at OCP Regional Summit 2023 Prague

Post Syndicated from Cliff Robinson original https://www.servethehome.com/edgecore-ecs4125-10p-2-5gbe-poe-switch-at-ocp-regional-summit-2023-prague/

The EdgeCore ECS4125-10P is an 8x 2.5GbE and PoE++ switch with 2x 10GbE uplinks that is nearly perfect for WiFi 6/ WiFi 6E AP deployments

The post EdgeCore ECS4125-10P 2.5GbE PoE++ Switch at OCP Regional Summit 2023 Prague appeared first on ServeTheHome.

MiTAC Capri2 CP2S11 AMD EPYC Genoa Server at OCP Regional Summit 2023 Prague

Post Syndicated from Cliff Robinson original https://www.servethehome.com/mitac-capri2-cp2s11-amd-epyc-genoa-server-at-ocp-regional-summit-2023-prague/

We saw the MiTAC Capri2 CP2S11, a hyper-scale oriented OCP platform built around the AMD EPYC 9004 “Genoa” processor line

The post MiTAC Capri2 CP2S11 AMD EPYC Genoa Server at OCP Regional Summit 2023 Prague appeared first on ServeTheHome.

Amazon Building for Retail Stores with 2000 Cameras and 100K Sensors OCP Regional Summit 2023

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/amazon-building-for-retail-stores-with-2000-cameras-and-100k-sensors-ocp-regional-summit-2023/

Amazon explained the challenge of building retail stores with over 2000 cameras as it unveiled its new Enterprise Edge Gateway OCP specs

The post Amazon Building for Retail Stores with 2000 Cameras and 100K Sensors OCP Regional Summit 2023 appeared first on ServeTheHome.