Abstract: The integration of multiple heterogeneous chiplets presents opportunities to improve both the cost-efficiency and the performance of ICs. However, this approach also brings challenges in performance optimization, energy efficiency, signal and power integrity, thermal management, and design for test (DFT). In this talk, we discuss the four critical dimensions of the solution space for implementing an IC with multiple heterogeneous chiplets and present a Chiplet Design Automation methodology that leverages EDA tools and IPs, enabling designers to explore a much broader solution space and thus find better solutions using diverse chiplets.
Array Partitioning Method for Streaming Dataflow Optimization in High-level Synthesis
Presenter: Renjing Hou, Beijing University of Posts and Telecommunications
Abstract: High-level synthesis (HLS) is a popular method that lets designers describe functionality at the behavioral level and automatically generates efficient register-transfer level (RTL) descriptions. In HLS, dataflow is the key micro-architecture for achieving high parallelism. However, streaming dataflow is often limited by strict conditions, such as the requirement for sequential access on the potential channels. To address this issue, this paper proposes an efficient array partitioning method for streaming dataflow inference. The key is to find an array partitioning mode that matches the sequential access requirements of the streaming channels. An experimental case study on convolutional neural network (CNN) inference indicates that the proposed method achieves about a 28.6% performance improvement over the default dataflow, at the cost of a 7.2% increase in power.
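The idea of matching a partitioning mode to sequential channel access can be illustrated with a small Python sketch. It is not the authors' method: the function names, the (bank, offset) model, and the strided access order are all assumptions chosen for illustration. The sketch maps array indices to banks under cyclic and block partitioning and checks whether each bank is then read in strictly sequential order, which is the condition a streaming (FIFO-like) channel would need.

```python
def cyclic_partition(n, factor):
    """Map index i to (bank, offset) with elements interleaved across banks."""
    return [(i % factor, i // factor) for i in range(n)]


def block_partition(n, factor):
    """Map index i to (bank, offset) with contiguous chunks per bank."""
    block = -(-n // factor)  # ceiling division
    return [(i // block, i % block) for i in range(n)]


def is_sequential_per_bank(mapping, access_order):
    """Return True if every bank sees offsets 0, 1, 2, ... in that order."""
    next_offset = {}
    for i in access_order:
        bank, offset = mapping[i]
        if offset != next_offset.get(bank, 0):
            return False
        next_offset[bank] = offset + 1
    return True


# Hypothetical consumer that reads even indices first, then odd indices.
strided_order = [0, 2, 4, 6, 1, 3, 5, 7]

unpartitioned = cyclic_partition(8, 1)  # a single bank, i.e. the original array
cyclic = cyclic_partition(8, 2)
block = block_partition(8, 2)

print(is_sequential_per_bank(unpartitioned, strided_order))  # False
print(is_sequential_per_bank(cyclic, strided_order))         # True
print(is_sequential_per_bank(block, strided_order))          # False
```

In this toy case only the cyclic mode with factor 2 turns the strided access into two sequential streams, so it is the only candidate that could be replaced by streaming channels.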
Automated Python-to-RTL Transformation and Optimization for Neural Network Acceleration
Presenter: Chen Yang, Beijing University of Posts and Telecommunications
Abstract: To optimize the acceleration of large-scale neural networks (NNs) on field-programmable gate arrays (FPGAs), this paper presents and optimizes an automatic flow based on HeteroCL and Xilinx Vitis HLS. The flow transforms a Python-based NN description into Verilog RTL that runs on an FPGA. To improve its quality of results (QoR), several key optimization methods are proposed for the high-level synthesis (HLS) input, including fixed-point quantization, loop pipelining, convolutional buffers, and others. To demonstrate the feasibility of the proposed optimization techniques, a convolutional NN (CNN) is selected as the experimental case study. The results show that the optimization techniques significantly reduce delay and power consumption.
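Of the optimizations listed, fixed-point quantization is the easiest to show in isolation. The following is a minimal plain-Python sketch and does not use the HeteroCL or Vitis HLS APIs; the function names and the 8-bit word with 4 fractional bits are assumptions for illustration only. It rounds a real value to a signed fixed-point integer code, saturates it to the representable range, and converts it back.

```python
def quantize_fixed(x, total_bits=8, frac_bits=4):
    """Round x to a signed fixed-point code, saturating at the word limits."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))      # most negative code
    hi = (1 << (total_bits - 1)) - 1   # most positive code
    code = int(round(x * scale))
    return max(lo, min(hi, code))


def dequantize_fixed(code, frac_bits=4):
    """Convert a fixed-point code back to a real value."""
    return code / (1 << frac_bits)


for value in (1.3, 100.0, -100.0):
    code = quantize_fixed(value)
    print(value, code, dequantize_fixed(code))
```

With these assumed widths, 1.3 is stored as code 21 (1.3125), while 100.0 and -100.0 saturate to 127 (7.9375) and -128 (-8.0). Narrower words reduce hardware cost at the price of this rounding and saturation error.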
An Equivalent Circuit Model for Elliptic Cylindrical TSV Considering the Temperature Influence
Presenter: Wenbo Guan, Xidian University
Abstract: Through-silicon via (TSV) technology is a key technology for realizing multi-layer chips, and its structure and equivalent circuit model have attracted much attention. As chip sizes continue to shrink, higher requirements are placed on the equivalent circuit model of the TSV. The elliptic cylindrical TSV has the advantages of a small occupied area and good transmission characteristics. The established TSV model takes into account factors such as capacitance changes caused by non-uniform diameters. An equivalent circuit model of the elliptic cylindrical TSV is established with the temperature effect taken into account, and its RLCG parasitic parameters are then extracted. The accuracy of the equivalent circuit is verified by comparing the S-parameter simulation results from HFSS and ADS. The results indicate that the circuit model of the novel elliptic cylindrical TSV predicts the S-parameters of the TSV well from 0 to 50 GHz.
Cross-die Optimization for Logic-on-Memory Face-to-Face Bonding 3-D IC Designs
Presenter: Siyuan Xu, Huawei Noah's Ark Lab
Abstract: Moore's law is pushing chips to become smaller and denser. At the same time, vertical chip stacking (i.e., three-dimensional integration or packaging) has attracted both academia and industry, as it can lead to shorter wire interconnections, lower power consumption, and higher efficiency. However, it poses a significant challenge for cross-die co-optimization. In this paper, we discuss the cross-die optimization of 3D logic-on-memory integration. The background issues are presented first, followed by some recent work. Finally, related difficulties are discussed, including netlist partitioning, netlist representation, cross-die optimization, and its estimation and simulation.
Chiplet Explorer: optimizing ICs with diverse chiplets through automated solution-space exploration
Invited Speaker: Pei-Hsin Ho, Shanghai UniVista Industrial Software Group Co., Ltd.