Research on Communication Topologies for Chiplet Architecture: Progress and Challenges
Affiliation:

1. Shenzhen Institute of Advanced Technology; 2. Shenzhen University of Advanced Technology

CLC Number:

TP303

    Abstract:

    Chiplet-based multi-chip integration provides a flexible and scalable solution that surpasses traditional monolithic System-on-Chip (SoC) integration. However, inter-chiplet communication has become a significant bottleneck for overall system performance. The Network on Interposer (NoI) plays a pivotal role in multi-chip systems, directly influencing both performance and development cost. In this paper, we review NoI communication topologies for heterogeneous chiplets. We examine current inter-chiplet communication architectures and discuss their design and implementation methods. The review covers the entire communication stack, from the protocol and interface layers to the application layer. It classifies interconnect topologies by their structural configuration and provides in-depth analysis and cross-comparison of each category. Furthermore, we investigate future directions in NoI communication technologies, identifying technical challenges and potential solutions. We also propose advanced evaluation methods and modeling techniques for reusable interposer layers and topologies. This review aims to give researchers a thorough understanding of the current landscape and future trends in NoI technology, emphasizing its crucial role in advancing next-generation semiconductor devices across a wide range of applications.

History
  • Received: September 14, 2024
  • Revised: November 18, 2024
  • Adopted: November 21, 2024
  • Online: November 21, 2024