
After years of development and constant publicity, Intel’s fourth-generation Xeon Scalable processor, code-named Sapphire Rapids, is finally here. This new x86 server platform, which many expected to open a new chapter in server computing by the end of 2022, officially made its debut in mid-January.
Compared with AMD’s fourth-generation EPYC processor released last November, which offers up to 96 cores and supports DDR5 memory and the PCIe 5.0 interface, Intel’s fourth-generation Xeon Scalable also supports these two new hardware specifications and raises its maximum core count to 60.
Its most notable feature is the set of accelerator engines built in for a variety of workloads. The lineup offers 52 models covering 10 major application scenarios, and real-time AI inference and training performance can reach 10 times that of the previous generation.
Text / Li Zonghan | Published on 2023-02-27

Image source: Intel
Looking back at 2022, the two major x86 server processor makers clashed in November: on the eve of the SC22 supercomputing conference, each raced to announce its latest platform, and this battle of product launches continues this year.
On November 2, Intel’s Twitter account Intel News announced that a data center launch event would be held on January 10, covering the fourth-generation Xeon Scalable series processors.
A week later, on November 9, Intel preemptively introduced the Intel Max series product line, including the Xeon CPU Max series server processors (code-named Sapphire Rapids HBM), claiming that 12 manufacturers had responded with more than 30 system designs. By a loose standard, this allowed Intel to claim it had fulfilled its early-2022 promise to launch Sapphire Rapids within the year.
The awkward fact, however, is that during that period only a few server vendors (such as Supermicro) issued press releases saying they would eventually ship models based on these processors. Moreover, Sapphire Rapids HBM targets the high-performance computing market and does not cover general-purpose needs, so strictly speaking Intel had only shown part of its hand, and the overall picture of Sapphire Rapids remained unclear.
On November 10, AMD launched its fourth-generation EPYC server processor platform (it had announced on October 24 that its next-generation data center event would be held on November 10). Although this came a day after the Intel Max series announcement, in terms of launch momentum and the level of support expressed by partners, it was clearly superior.
Major cloud service providers and server manufacturers sent executives to offer congratulations, either through pre-recorded videos or in person on the AMD stage, and took the lead in announcing services or new server models built on this processor platform, boosting the processor maker’s reputation while taking the opportunity to promote their own products and services.
On December 22, Intel officially announced that it would hold a launch event on January 10, 2023, at which it would release the fourth-generation Xeon Scalable processor platform (code-named Sapphire Rapids) along with the Max series CPUs and GPUs.
However, with the 2022 Christmas and 2023 New Year holidays approaching, the IT industry’s attention was mostly on news surrounding the Consumer Electronics Show (CES), which centers on personal computing rather than servers, and few people expected a major announcement of a new server platform at that time.
Looking back at the release timing of Intel’s past server processor platforms, the first-generation Xeon Scalable debuted in July 2017, the second generation in April 2019, and the third generation in April 2021; the fourth generation is the first to be launched in the week right after CES.
Another reason the announcement felt unexpected was the repeated delays in this platform’s time to market.
Intel began promoting Sapphire Rapids long before the third-generation Xeon Scalable reached the market. At Architecture Day 2021, the company publicly introduced its technical architecture and competitive advantages in a complete, systematic way for the first time. At several large public events, Intel also announced that in 2022, 2023, and 2024 it would successively launch the next generations of Xeon processors, namely Sapphire Rapids, Emerald Rapids, and Granite Rapids plus Sierra Forest, confirming that Sapphire Rapids would be the fourth-generation Xeon Scalable; even so, it remained unclear when in 2022 Sapphire Rapids would actually launch.
Intel had offered various estimates of Sapphire Rapids’ launch timing, at one point leading everyone to believe it was well ahead of schedule, but in the end it slipped to January 10, 2023.
For example, at Architecture Day 2020 Intel said it expected production to start in the second half of 2021; at the company’s investor meeting in February 2022, Intel executives indicated that shipments would start in March; at subsequent events in 2022, including Innovation 2022 held in September, Intel continued to introduce Sapphire Rapids features but did not mention an official release date.
Later, at the Intel Sustainability Taiwan Day held in Taiwan in December, some server manufacturers were invited to demonstrate systems and peripheral solutions built on the fourth-generation Xeon Scalable, but the official launch date was still unknown; we could only vaguely sense that it was getting closer.
It was not until early January 2023, when a server manufacturer sent a press event notice to Taiwanese media, that we realized this server processor platform was finally about to be released.
Enlisting more partners to help out, showing the strength of long-term cultivation of the server market
When Intel released the Max series CPUs and GPUs with pre-recorded videos at SC22, it may have hoped to focus on high-performance computing and machine learning applications; it listed only 12 server manufacturers adopting the platform, and the sole user case presented was Argonne National Laboratory under the US Department of Energy. By the January 2023 launch of the fourth-generation Xeon Scalable and Max series, Intel had rallied a large number of partners and users to congratulate it on the launch, showing that its market support exceeds that of its competitors.
Among server manufacturers, Dell Technologies, Inspur, HPE, Lenovo, Cisco, Supermicro, and Fujitsu all sent senior executives at the vice-president or even CEO level. Nvidia, which competes with Intel in the data center GPU market, also appeared for the first time in a pre-recorded video at an Intel server launch, owing to its new generation of AI appliances built on these processors.
Among telecom network operators there are Ericsson and Telefónica; in high-performance computing, the representative is the Los Alamos National Laboratory of the US Department of Energy.
52 models launched, providing up to 60 cores, with performance per watt increased by up to 2.9 times
It is difficult to compare the launch strategies of the two manufacturers’ new-generation server processor platforms, but if one must criticize, the common point is that both are “squeezing the toothpaste,” releasing products in stages. AMD’s fourth-generation EPYC 9004 series launched last November is, strictly speaking, only the first wave of this generation of data center products: only 18 models are available for now, with three more waves of product announcements to come.
Intel, which has often faced the same criticism, had the opportunity to change that impression by releasing, in one go, a server platform with breakthrough performance and support for a range of the latest technologies. It did not quite manage that, but with the official January 2023 launch of the fourth-generation Xeon Scalable, most of the product line’s full shape and detailed specifications have surfaced; only the models with built-in vRAN Boost, a technology that further accelerates 5G vRAN workload processing, will be launched later.
In terms of options for different workload requirements, Intel provides 52 models of the fourth-generation Xeon Scalable (including the Xeon CPU Max series), roughly on par with the first and second generations. By application, they fall into 10 categories covering 7 fields: general purpose, long-life use/IoT, in-memory database/big data analytics/virtualization optimization, 5G/network optimization, cloud optimization, storage and hyper-converged infrastructure optimization, and high-performance computing optimization; the general-purpose category is further subdivided into 2-socket performance, 2-socket mainstream, liquid-cooled, and single-socket models.
It is worth noting that the four brand tiers of the past three Xeon Scalable generations, Xeon Platinum (9000 and 8000 series), Xeon Gold (6000 and 5000 series), Xeon Silver (4000 series), and Xeon Bronze (3000 series), change in the fourth generation: the 9000 series becomes an independent brand, Xeon Max, the supercomputing product that debuted in November 2022 and corresponds to the high-performance-computing-optimized category mentioned above.
According to the specifications released by Intel, the fourth-generation Xeon Scalable provides up to 60 cores, 120 threads, and a thermal design power (TDP) of up to 350 watts. By comparison, the third-generation Xeon Scalable offers up to 40 cores, 80 threads, and a maximum TDP of 270 watts.
In terms of internal configuration, fourth-generation Xeon Scalable processors come in general-purpose and high-performance computing versions (compared on the left of the figure); the difference is that the HPC version additionally integrates HBM2e memory. Four dies (shown on the far right of the figure) are connected on top of embedded multi-die interconnect bridges (EMIB) through a modular die fabric (MDF), and the lineup includes extreme core count (XCC) and medium core count (MCC) dies. Image source: Intel
Since rising performance and power consumption are an inevitable trend for new server processors, improving energy efficiency has become a focus of attention, and Intel accordingly emphasized this aspect when releasing the fourth-generation Xeon Scalable.
First, in terms of performance per watt, for workloads based on the key-value store RocksDB, a server with two 60-core fourth-generation Xeon Scalable processors can reach 2.9 times the performance per watt of a system with two 40-core third-generation Xeon Scalable processors.
Intel has also added an Optimized Power Mode to the fourth-generation Xeon Scalable. When enabled, specific workloads can run with minimal performance loss while each processor saves about 70 watts on average. For example, in SPECjbb and SPECint-class benchmarks and in NGINX key handshake (TLS) processing, enabling this mode affects performance by less than 5% while reducing system power consumption by up to 20%.
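To make the efficiency trade-off concrete, here is a minimal back-of-the-envelope sketch in Python. The absolute system power figure is a hypothetical assumption introduced only for illustration; only the roughly 5% performance impact and 20% power saving come from Intel’s stated example.

```python
# Back-of-the-envelope check of the Optimized Power Mode trade-off described
# above. The baseline power draw is hypothetical; only the <5% performance
# loss and ~20% power saving come from Intel's stated example.
baseline_perf = 1.00          # normalized throughput with the mode disabled
baseline_power_w = 700.0      # hypothetical 2-socket system power draw (assumption)

perf_loss = 0.05              # up to 5% performance impact (stated)
power_saving = 0.20           # up to 20% system power saved (stated)

optimized_perf = baseline_perf * (1 - perf_loss)
optimized_power_w = baseline_power_w * (1 - power_saving)

gain = (optimized_perf / optimized_power_w) / (baseline_perf / baseline_power_w)
print(f"Perf/W change with Optimized Power Mode: {gain:.2f}x")  # roughly 1.19x
```

In other words, accepting a small throughput loss in exchange for a larger power reduction nets a meaningful performance-per-watt gain for workloads that fit this mode.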
At the same time, improving a server’s power savings also depends on having more complete remote status monitoring, because you can’t manage what you don’t measure. For this reason, successive generations of Xeon processors have supported or built in telemetry mechanisms that supply the data and AI processing capability needed for smarter CPU resource monitoring and management, for building models that predict data center or network traffic peaks, and for automatically lowering CPU clock rates to save power when processing demand drops, thereby reducing the data center’s carbon footprint.
In the fourth-generation Xeon Scalable, Intel added a new framework for this purpose, called Platform Monitoring Technology (PMT), which exposes and manages collected processor execution status such as core temperature, power consumption, and operating-system-side conditions, providing better remote monitoring capability.
Zane A. Ball, Intel vice president and general manager of Data Center Platform Engineering and Architecture, said PMT has already been used in Intel’s internal large-scale test and validation facilities to collect gigabytes of telemetry data; it covers in-band and out-of-band management, and even allows firmware updates without restarting the operating system. Intel has also updated the related debugging tools, which can be used for remote diagnosis and for tracking down those rare or occasional conditions at the statistical fringes.
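For readers who want to see what PMT exposes on a running system, the following is a minimal sketch assuming a recent Linux kernel with the intel_pmt driver loaded. It only enumerates the raw telemetry regions the driver publishes through sysfs; decoding the binary blobs requires Intel’s per-GUID metadata, and this is not part of Intel’s own tooling.

```python
# Minimal sketch: enumerate PMT telemetry regions exposed by the Linux
# intel_pmt driver (recent kernels). Decoding the raw "telem" blob requires
# Intel's per-GUID metadata, so this only lists what the hardware exposes.
from pathlib import Path

PMT_CLASS = Path("/sys/class/intel_pmt")

def list_pmt_regions() -> None:
    if not PMT_CLASS.exists():
        print("intel_pmt driver not loaded or PMT not supported on this CPU")
        return
    for region in sorted(PMT_CLASS.glob("telem*")):
        guid = (region / "guid").read_text().strip()
        size = (region / "size").read_text().strip()
        print(f"{region.name}: guid={guid}, {size} bytes of raw telemetry")

if __name__ == "__main__":
    list_pmt_regions()
```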
In addition, Intel offers innovative air-cooling and liquid-cooling designs to further reduce the energy consumed by data centers. It also took the launch as an opportunity to note that the fabs manufacturing this generation of Xeon Scalable run on more than 90% renewable electricity and have advanced water recycling facilities.
In terms of total cost of ownership (TCO), Intel says a system with fourth-generation Xeon Scalable can reduce costs by 52% to 66% compared with a third-generation system: 52% for database workloads, 55% for real-time AI inference, and 66% for high-performance computing.
As for gains in computing performance, the most eye-catching part of the fourth-generation Xeon Scalable announcement is AI, thanks to the Advanced Matrix Extensions (AMX) accelerator built into the processor die, which can deliver up to a 10x improvement in PyTorch real-time inference and training. For large AI language-model workloads, the Xeon Max series can deliver up to 20x faster performance.
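As an illustration of how software typically reaches AMX, here is a minimal sketch assuming PyTorch and the Intel Extension for PyTorch (intel_extension_for_pytorch) are installed on a fourth-generation Xeon system. The toy model and tensor shapes are placeholders rather than Intel’s benchmark setup; AMX is exercised indirectly when the oneDNN backend dispatches bfloat16 matrix multiplications.

```python
# Minimal sketch: run inference in bfloat16 so PyTorch's oneDNN backend can
# dispatch matrix multiplications to AMX on a 4th-gen Xeon. The model and
# input are placeholders, not Intel's benchmark configuration.
import torch
import intel_extension_for_pytorch as ipex

model = torch.nn.Sequential(          # placeholder matmul-heavy network
    torch.nn.Linear(1024, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 1024),
).eval()

# Let IPEX repack the weights for bfloat16 execution.
model = ipex.optimize(model, dtype=torch.bfloat16)

x = torch.randn(32, 1024)
with torch.no_grad(), torch.cpu.amp.autocast(dtype=torch.bfloat16):
    y = model(x)
print(y.dtype, y.shape)               # torch.bfloat16, (32, 1024)
```

The same bfloat16/autocast pattern applies to larger models; the quoted 10x figure refers to Intel’s own comparisons against the previous generation, not to this toy example.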
Enhanced designs to improve performance for HPC and 5G network server needs
As mentioned earlier, the fourth-generation Xeon Scalable is currently divided into 10 usage categories, some of which specialize in specific application fields with specifications that differ significantly from the other models.
For example, the products aimed at the high-performance computing market are mainly the Xeon Max 9400 series; Intel announced five models and their detailed technical specifications this time. Besides the many features shared with the rest of the family, such as AMX, Xeon Max comes with 64 GB of HBM2e high-bandwidth memory. In terms of computing performance, Xeon Max can deliver up to 3.7 times the performance of the third-generation Xeon Scalable in practical application workloads such as energy and earth-system modeling.
Beyond AI and HPC, Intel also provides a range of fourth-generation Xeon Scalable products for the high-performance, low-latency workloads of network and edge computing, applicable to telecommunications, retail, manufacturing, smart cities, and other fields, forming an important cornerstone for the industry’s move to software-defined architectures.
For executing the core workloads of 5G network services, the acceleration technologies built into the fourth-generation Xeon Scalable improve data throughput, reduce access latency, and refine power management, enhancing the responsiveness and operational efficiency of the whole platform.
As for virtualized radio access network (vRAN) applications, Intel announced at MWC Barcelona 2022 in February 2022 that Sapphire Rapids would add 5G-specific signal processing instruction enhancements, expected to double capacity compared with a third-generation Xeon Scalable system, and that it would offer Sapphire Rapids processors integrating technology to accelerate vRAN workload execution.
Specifically, the fourth-generation Xeon Scalable’s new built-in AVX-512 extensions for vRAN complement the Xeon processor’s existing 32-bit and 64-bit floating-point instructions with support for 16-bit half-precision floating point (FP16), used for operations that tolerate reduced precision, such as communication signal and media processing.
Taking wireless communication signal processing as an example, operations such as beamforming, precoding, and minimum mean square error (MMSE) equalization can be accelerated with AVX-512 for vRAN, allowing general-purpose platforms (GPP) to rival dedicated equipment built around digital signal processors in these workloads.
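To give a concrete sense of the arithmetic that half precision covers here, below is a minimal NumPy sketch of an MMSE equalizer computed in FP16. It only illustrates the math: real-valued matrices stand in for complex baseband signals, the channel and noise values are made up, and NumPy emulates FP16 in software rather than using Intel’s AVX-512 FP16 instructions.

```python
# Minimal numerical sketch of MMSE equalization in half precision (FP16).
# Real-valued matrices stand in for complex baseband signals; this only
# illustrates the arithmetic that AVX-512 FP16 would execute natively.
import numpy as np

rng = np.random.default_rng(0)
n_tx, n_rx = 4, 8                                           # toy MIMO dimensions (made up)

H = rng.standard_normal((n_rx, n_tx)).astype(np.float16)   # channel matrix
x = rng.standard_normal(n_tx).astype(np.float16)           # transmitted symbols
noise = (0.05 * rng.standard_normal(n_rx)).astype(np.float16)
y = H @ x + noise                                           # received signal
sigma2 = np.float16(0.05 ** 2)                              # noise variance

# MMSE equalizer: W = (H^T H + sigma^2 I)^(-1) H^T
A = H.T @ H + sigma2 * np.eye(n_tx, dtype=np.float16)
A_inv = np.linalg.inv(A.astype(np.float32)).astype(np.float16)  # invert in FP32 for stability
x_hat = (A_inv @ H.T) @ y

print("max abs estimation error:", float(np.max(np.abs(x_hat - x))))
```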
According to Intel’s tests running FlexRAN software on two generations of processors with similar core counts, power consumption, and clock speeds, a fourth-generation Xeon Scalable system can handle twice the capacity of a third-generation system without increasing power consumption; in other words, communication service providers that adopt this computing platform can expect double the performance per watt, meeting their requirements for performance, scalability, and energy efficiency.
For users who need further power savings and stronger vRAN acceleration, the fourth-generation Xeon Scalable will offer models with built-in vRAN Boost technology (integrated vRAN acceleration circuitry).
On February 6, Intel confirmed the launch timing of these products. Sachin Katti, chief technology officer and chief strategy officer of the company’s Network and Edge Group, said the fourth-generation Xeon Scalable processors with built-in vRAN Boost would be officially launched at MWC Barcelona 2023 at the end of the month, appearing together with some of Intel’s largest customers.
According to Intel’s scenario-based design power analysis, at the same core count and clock speed a system with this special processor can save nearly 20% more power than a general fourth-generation Xeon Scalable, delivering better performance per watt, helping communication service providers save more power, eliminating the need for an external vRAN accelerator card, and reducing OEMs’ bill-of-materials management costs.
Six major built-in hardware accelerators, supporting AI, analytics, networking, security, and storage applications
During development of the fourth-generation Xeon Scalable, as Intel began actively promoting its new vertically integrated manufacturing model, the IDM 2.0 strategy, from 2021, the company spent the past two years putting more emphasis on showcasing its product design and manufacturing capabilities.
For server processor platforms, Intel has repeatedly said it would use Intel 7 process compute tiles and embedded multi-die interconnect bridge (EMIB) packaging to achieve different modular designs, while supporting advanced peripheral specifications such as DDR5 memory, PCIe 5.0 system I/O, and the CXL 1.1 interconnect.
Inside the new-generation Xeon Scalable, a variety of new hardware component designs have also been introduced.
For example, at Architecture Day 2021 Intel introduced several important features, such as the high-performance core (Performance-core, P-core), which reduces access latency and improves the performance of single-threaded applications on the server side.
The 12th-generation Core personal computer processors (code-named Alder Lake) released in October 2021 were the first to ship with the high-performance core code-named Golden Cove; the new generation of Xeon Scalable server processors uses the same Golden Cove core, but additionally provides a new matrix multiplication engine, the aforementioned AMX, one of the acceleration functions first announced at Architecture Day 2020.
Beyond the processor core, Intel also added several accelerator engines. At Architecture Day 2021 and the Hot Chips conference, it first detailed the Data Streaming Accelerator (DSA), which had also been disclosed at Architecture Day 2020. Another is Intel QuickAssist Technology (QAT), developed over many years to speed up data encryption and compression and previously offered in forms such as standalone accelerator cards, system chipsets, or built into Xeon D and Atom series processors. A third is the Dynamic Load Balancing (DLB) accelerator, which dynamically and evenly distributes workloads across cores to improve computing efficiency.
At the Intel Innovation conference held in September of the same year, Intel introduced and demonstrated yet another accelerator built into the fourth-generation Xeon Scalable, the In-Memory Analytics Accelerator (IAA), which is claimed to run analytics with lower memory usage and can be used with general-purpose database systems, in-memory databases (IMDB), big data analytics systems, and data warehouses.
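As a small operational illustration of these built-in engines, the sketch below assumes a Linux system where the idxd driver has enumerated DSA and IAA devices; it merely lists them through sysfs. Real applications normally reach these accelerators through libraries and tools such as accel-config, Intel DML/QPL, or DPDK rather than raw sysfs.

```python
# Minimal sketch: list DSA and IAA accelerator devices that the Linux idxd
# driver exposes on the "dsa" bus. This only checks that the engines are
# visible; actual offload goes through accel-config, Intel DML/QPL, or DPDK.
from pathlib import Path

DSA_BUS = Path("/sys/bus/dsa/devices")

def list_idxd_devices() -> None:
    if not DSA_BUS.exists():
        print("idxd driver not loaded or no DSA/IAA devices present")
        return
    for dev in sorted(DSA_BUS.iterdir()):
        # Device nodes are named dsa<N> (Data Streaming Accelerator) or
        # iax<N> (In-Memory Analytics Accelerator); skip work queues etc.
        if dev.name.startswith(("dsa", "iax")):
            state_file = dev / "state"
            state = state_file.read_text().strip() if state_file.exists() else "unknown"
            print(f"{dev.name}: state={state}")

if __name__ == "__main__":
    list_idxd_devices()
```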
Meanwhile, the vRAN acceleration technology built into the processor, first introduced in February, received its official name, vRAN Boost, in September, with Intel stating that fourth-generation Xeon Scalable processors containing this accelerator would launch in 2023. The technology actually originates from a dedicated interface card for vRAN forward error correction (FEC) acceleration, the Intel vRAN Dedicated Accelerator ACC100, now integrated into the Xeon Scalable processor as a hardware accelerator block.
When the fourth-generation Xeon Scalable officially debuted in January 2023, Intel announced another addition that strengthens server security: Trust Domain Extensions (TDX), a new virtual machine isolation technology that lets existing applications be moved into a confidential environment for execution. Several large public cloud operators are expected to demonstrate it first; Microsoft Azure and IBM Cloud have both published blog posts on this work, and Intel says Alibaba Cloud and Google Cloud will also support it.
Another long-awaited security improvement finally built into the fourth-generation Xeon Scalable is Control-flow Enforcement Technology (CET). The processor maintains a copy of return addresses on a shadow stack and compares it against the program call stack being executed to ensure there are no unexpected differences, thereby defeating return-oriented programming (ROP) attacks.
In fact, Intel first shipped CET in 2020 with the 11th-generation Core personal computer processors (code-named Tiger Lake), and now the Xeon Scalable server processors gain this protection as well.