Application of Electronic Technique (《电子技术应用》)
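Fully power optimization flow from RTL to GDS
Application of Electronic Technique, 2022, Issue 8
Gu Donghua1, Zeng Zhiyong1, Yu Jinjin1, Huang Xuhui1, Zhu Jiajun2, He Xiangjun2, Chen Zefa2
1. Enflame Technology, Shanghai 200000, China; 2. Cadence Design Systems, Inc., Shanghai 200000, China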
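Abstract: Power, one of the three performance-power-area (PPA) elements of a large SoC, is becoming increasingly important, especially now that mainstream design platforms have moved below 7 nm. AI chips typically run high-performance computing tasks on many cores in parallel, which consumes a great deal of power, so power optimization is particularly important in AI chip design. Given a typical power-scenario waveform, or a group of waveforms, power optimization can begin at the RTL stage. The basic approach is to use Joules-replay to generate the netlist waveform that corresponds to the RTL waveform. Joules-replay can be added at each of the three Genus synthesis stages (syn-gen, syn-map and syn-opt) to produce a waveform matching the synthesized netlist, which is then used for further power optimization in the Innovus place-and-route (PR) stage. Placement and routing in Innovus is likewise divided into three stages: place_opt, cts_opt and route_opt. Joules-replay can again be introduced at each step to generate the netlist waveform needed for power optimization. Finally, the waveform is brought in once more for power optimization in the Tempus timing signoff environment. With this series of accurate power optimizations at each point in the flow, the design achieves a power saving of more than 10%. Combined with multi-bit techniques, the final power saving reaches 21%.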
CLC number: TN402
Document code: A
DOI:10.16157/j.issn.0258-7998.229807
Citation format (Chinese): 顾东华,曾智勇,余金金,等. 从RTL到GDS的功耗优化全流程[J].电子技术应用,2022,48(8):65-69.
Citation format (English): Gu Donghua, Zeng Zhiyong, Yu Jinjin, et al. Fully power optimization flow from RTL to GDS[J]. Application of Electronic Technique, 2022, 48(8): 65-69.
Fully power optimization flow from RTL to GDS
Gu Donghua1, Zeng Zhiyong1, Yu Jinjin1, Huang Xuhui1, Zhu Jiajun2, He Xiangjun2, Chen Zefa2
1. Enflame Technology, Shanghai 200000, China; 2. Cadence Design Systems, Inc., Shanghai 200000, China
Abstract: Power, as one part of PPA (performance, power and area), is becoming more and more important in large SoC chips, especially at technology nodes below 7 nm. AI chips schedule multiple cores in parallel for specific application scenarios, which leads to very large power consumption. Power optimization for each core is therefore the highest priority in AI chip design. With a typical power scenario, or multiple scenarios grouped together, power can be optimized from RTL synthesis to GDS. The basic flow uses Joules-replay to convert an RTL activity file (time-based formats: VCD/FSDB/SHM/PHY) into a gate-level activity file. Synthesis with Genus has three steps: syn-gen, syn-map and syn-opt. Joules-replay is added after each step, and the replayed activity file is used for power optimization in the next step, which increases power estimation accuracy. Innovus place and route also has three main steps: place-opt, CTS-opt and route-opt. The same Joules-replay flow can be applied after each step to generate stimulus activity for the next step. At the final timing signoff stage, post-simulation activity is used for power optimization in Tempus. With this full-flow power optimization, we achieve more than 10% power reduction; combined with MBFF (multi-bit flip-flop) optimization, the final power reduction is 21%.
Key words: power optimization; AI chip design; SoC physical design; Joules-replay; Genus; Innovus
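
The hand-off of activity files described in the abstract can be pictured with a short sketch. This is only an illustration of the data flow, not the authors' scripts and not any Cadence command syntax: the stage names come from the abstract, while the `Step` record, the `build_schedule` function and the file labels are hypothetical. It also assumes that the first synthesis stage is driven directly by the RTL activity, which the abstract does not state explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional

# Stage names are taken from the abstract; all other names are hypothetical.
FLOW = [
    ("Genus", ["syn-gen", "syn-map", "syn-opt"]),
    ("Innovus", ["place-opt", "CTS-opt", "route-opt"]),
]


@dataclass
class Step:
    tool: str
    stage: str
    activity_in: str             # activity used for power optimization in this stage
    activity_out: Optional[str]  # gate-level activity replayed after this stage


def build_schedule(rtl_activity: str) -> List[Step]:
    """Chain the stages so each one consumes the activity replayed after the previous one."""
    steps: List[Step] = []
    current = rtl_activity  # assumption: the first stage uses the RTL activity directly
    for tool, stages in FLOW:
        for stage in stages:
            replayed = f"replayed_after_{stage}"  # placeholder label, not a real file name
            steps.append(Step(tool, stage, current, replayed))
            current = replayed
    # Per the English abstract, signoff optimization in Tempus uses post-simulation activity.
    steps.append(Step("Tempus", "timing signoff", "post_sim_activity", None))
    return steps


if __name__ == "__main__":
    for s in build_schedule("rtl_activity"):
        print(f"{s.tool:8s} {s.stage:15s} in={s.activity_in} out={s.activity_out}")
```

The point of the chaining is that each optimization step sees switching activity mapped onto the netlist produced by the step before it, which is what the abstract credits for the improved power estimation accuracy.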

0 Introduction

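    Chip design has always pursued the best PPA. At technology nodes before 28 nm, performance and area were often given priority. As technology nodes advance toward 7 nm, standard-cell density keeps rising, and power density rises with it. Power, as one element of PPA, has therefore become especially important in design. Chip designers need to estimate power as accurately as possible, and optimize it, at every point in the flow; otherwise the final chip is likely to be unable to deliver its full performance because of excessive power consumption.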




For the full text of this article, download: http://www.chinaaet.com/resource/share/2000004653







