Research on accelerating Ceph storage performance with SmartNICs
Application of Electronic Technique
Liu Baoqin, Luo Xiangzheng, Lin Mao, Wang Qinya, Lan Lisha
Maipu Communication Technology Co., Ltd.
Abstract: This paper addresses the limited multi-core parallel scalability caused by thread lock contention in the Object Storage Device (OSD) architecture of the Ceph storage system. It studies cooperative optimization between the next-generation Crimson-OSD architecture and SmartNICs, and proposes a hierarchical cooperative optimization framework. Related studies indicate that, with SmartNIC-based cooperative optimization, RDMA network offloading reduces CPU utilization by up to 70%, and hardware-accelerated erasure coding on heterogeneous computing engines speeds up data recovery by up to 4.84 times. The results provide a theoretical basis and key technical references for hardware acceleration in distributed storage systems, and offer guidance for optimizing storage systems in data-intensive scenarios such as high-performance computing and cloud-edge-end integration.
Key words: SmartNIC; Ceph storage system; performance optimization; hardware acceleration; distributed storage system
CLC number: TN915.05 Document code: A DOI: 10.16157/j.issn.0258-7998.256678
Citation format: Liu Baoqin, Luo Xiangzheng, Lin Mao, et al. Research on accelerating Ceph storage performance with SmartNICs[J]. Application of Electronic Technique, 2025, 51(12): 14-19.
Introduction
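Data-intensive applications such as AI training, HPC, and edge computing are growing explosively, placing unprecedented demands on the performance and elasticity of storage systems. Ceph is widely deployed in cloud data centers thanks to its high availability and scalability. In multi-core scenarios, however, its traditional OSD architecture suffers from thread lock contention and cross-core communication overhead, which leads to low processor (CPU) utilization and makes it hard to exploit high-performance hardware such as NVMe SSDs. In response, the Ceph community rebuilt the OSD as the Crimson-OSD architecture, which improves multi-core scalability through a Shared-Nothing design and an asynchronous pipeline model. Practical tests show that, in an 8-thread configuration, 4K random-read performance reaches 311k IOPS and improves further as the core count grows, confirming the effectiveness of the architectural redesign. Although the Crimson-OSD design has made considerable progress, research on cooperative optimization that exploits the programmable acceleration capabilities of SmartNICs remains insufficient.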
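To make the Shared-Nothing idea concrete, the following is a minimal illustrative sketch, not Crimson-OSD's actual implementation (Crimson itself is written in C++ on the Seastar framework). It assumes a hypothetical `ShardedStore` in which each object ID is deterministically routed to exactly one shard, so each shard's state has a single owner and needs no lock. The class names, the CRC32-based routing, and the in-memory dictionary are all assumptions made for illustration.

```python
import zlib
from typing import Dict, List, Optional


class Shard:
    """Hypothetical per-core partition: it alone owns its objects, so no lock is used."""

    def __init__(self, shard_id: int) -> None:
        self.shard_id = shard_id
        self.store: Dict[str, bytes] = {}

    def write(self, object_id: str, data: bytes) -> None:
        self.store[object_id] = data

    def read(self, object_id: str) -> Optional[bytes]:
        return self.store.get(object_id)


class ShardedStore:
    """Routes every request to the single shard that owns the object."""

    def __init__(self, num_shards: int) -> None:
        self.shards: List[Shard] = [Shard(i) for i in range(num_shards)]

    def shard_for(self, object_id: str) -> Shard:
        # CRC32 is stable across runs, unlike Python's built-in hash() for str.
        index = zlib.crc32(object_id.encode("utf-8")) % len(self.shards)
        return self.shards[index]

    def write(self, object_id: str, data: bytes) -> None:
        self.shard_for(object_id).write(object_id, data)

    def read(self, object_id: str) -> Optional[bytes]:
        return self.shard_for(object_id).read(object_id)


if __name__ == "__main__":
    store = ShardedStore(num_shards=8)
    store.write("obj-1", b"payload")
    print(store.shard_for("obj-1").shard_id, store.read("obj-1"))
```

In a real shared-nothing system each shard would be pinned to its own core and served by its own event loop; this sketch shows only the ownership and routing rule that removes the need for shared locks.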
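Building on an analysis of the characteristics and performance bottlenecks of the Crimson-OSD architecture, this paper proposes a SmartNIC-based hierarchical cooperative optimization framework. The work has two core parts. First, it establishes a performance sensitivity model for key parameters and quantitatively analyzes the multi-core scalability of Crimson-OSD. Second, it designs the hierarchical cooperative optimization framework to overcome the constraint that CPU compute capacity places on storage system performance. The paper also gives a preliminary discussion of two frontier directions: integrated storage-and-compute architectures and AI-enabled dynamic management.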
The full text of this paper is available for download at:
https://www.chinaaet.com/resource/share/2000006869
Author information:
Liu Baoqin, Luo Xiangzheng, Lin Mao, Wang Qinya, Lan Lisha
(Maipu Communication Technology Co., Ltd., Chengdu, Sichuan 610094)

