An Open-source Infrastructure for Modeling Dataflows within Deep Learning Accelerators



Deep learning techniques, especially convolutional neural networks (CNNs), have pervaded vision applications such as image classification, face recognition, and video processing due to the high accuracy they provide. Both industry and academia are exploring specialized hardware accelerator ASICs as a solution to provide low latency and high throughput for CNN workloads.

The convolution operation is a deeply nested multiply-accumulate loop. For throughput and energy efficiency, each accelerator chooses a different strategy for the loop order/tiling of the convolution operation and the spatial/temporal mapping of data onto compute units, which we collectively refer to as dataflow. The throughput and energy efficiency of a dataflow change dramatically depending on both the DNN topology (i.e., layer shapes and sizes) and the accelerator hardware resources (buffer sizes and network-on-chip (NoC) bandwidth). This makes dataflow a first-order consideration for deep learning accelerator ASICs, both at design time, when hardware resources (buffers and interconnects) are allocated on-chip, and at compile time, when different layers must be mapped optimally for high utilization and energy efficiency.
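As a concrete reference, the convolution's loop nest can be written out directly; the choice of loop order, tiling, and spatial/temporal assignment of these loops is exactly what a dataflow specifies. The sketch below is purely illustrative (it is not MAESTRO code) and shows one possible loop order, often called output-stationary:

```python
# Illustrative sketch (not MAESTRO code): a convolution layer as a deeply
# nested multiply-accumulate loop. A "dataflow" chooses the order, tiling,
# and spatial/temporal mapping of these loops onto compute units.
def conv2d(inp, wgt, K, C, H, W, R, S):
    # Output-stationary loop order: each output element is fully
    # accumulated before moving on, maximizing partial-sum reuse.
    out = [[[0.0] * (W - S + 1) for _ in range(H - R + 1)] for _ in range(K)]
    for k in range(K):                      # output channels
        for y in range(H - R + 1):          # output rows
            for x in range(W - S + 1):      # output columns
                for c in range(C):          # input channels
                    for r in range(R):      # filter rows
                        for s in range(S):  # filter columns
                            out[k][y][x] += inp[c][y + r][x + s] * wgt[k][c][r][s]
    return out
```

Reordering or tiling these six loops (e.g., making weights or inputs "stationary" instead) changes which operands are reused from on-chip buffers, and therefore the throughput and energy of the accelerator.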

We present MAESTRO (Modeling Accelerator Efficiency via Spatio-Temporal Resource Occupancy), an open-source tool for modeling and evaluating the performance and energy-efficiency of different dataflows.


News

  • Oct 17, 2020: MAESTRO tutorial at MICRO 2020.
  • July 2020: GAMMA was accepted for presentation at ICCAD 2020.
  • July 2020: ConfuciuX was accepted for presentation at MICRO 2020.
  • Jan 2020: MAESTRO was selected for inclusion in IEEE Micro's Top Picks from Computer Architecture 2020!
  • Oct 15, 2019: We presented MAESTRO at MICRO 2019.
  • Feb 16, 2019: We ran a tutorial on MAESTRO at HPCA 2019.
  • Oct 23, 2018: MAESTRO was a finalist at the ACM Student Research Competition (SRC).
  • June 3, 2018: MAESTRO was released at our ISCA 2018 tutorial.

HW-SW Co-Design of DNN Accelerators

The MAESTRO cost model provides rapid estimation of performance and energy given a DNN model, a mapping, and a hardware configuration. It can be used for HW-SW co-design by porting it into tools that perform design-space exploration.


  • GAMMA: Mapping Space Exploration via Genetic Algorithm [ICCAD 2020]
  • ConfuciuX: Hardware Design-space Exploration via Reinforcement Learning [MICRO 2020]
  • Neural-Architecture Search of DNN Models [DAC 2020]
  • Design-space Exploration of Mappings [arXiv]
  • Design-space Exploration of Hardware Configurations [MICRO 2019]
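The tools above all follow the same pattern: a search strategy (genetic algorithm, reinforcement learning, etc.) repeatedly queries the cost model and keeps the best design points. A minimal, purely illustrative sketch of that loop is shown below; `estimate_cost` is a toy stand-in for the real cost model, and the PE/buffer design points and the area budget are made-up assumptions:

```python
# Hypothetical sketch of design-space exploration on top of a cost model
# such as MAESTRO. `estimate_cost` is a toy proxy, NOT MAESTRO's model:
# it scores a (PE count, buffer size) point, and a brute-force search
# picks the best point under a PE budget.
def estimate_cost(pes, buffer_kb, macs=10**6):
    latency = macs / pes                       # more PEs -> fewer cycles (toy model)
    energy = macs * (1.0 + 64.0 / buffer_kb)   # small buffers -> more off-chip traffic (toy model)
    return latency, energy

def explore(budget_pes=256):
    best = None
    for pes in (32, 64, 128, 256):
        for buffer_kb in (16, 32, 64):
            if pes > budget_pes:
                continue  # skip points that exceed the hardware budget
            latency, energy = estimate_cost(pes, buffer_kb)
            score = latency * energy           # energy-delay product
            if best is None or score < best[0]:
                best = (score, pes, buffer_kb)
    return best
```

Real explorers like GAMMA and ConfuciuX replace the brute-force loop with learned or evolutionary search, and replace the toy proxy with MAESTRO's detailed per-layer estimates.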

Result Analysis using Jupyter Notebook

MAESTRO provides a rich set of cost data. Explore the kinds of results MAESTRO produces by running it yourself! For more details, please visit the following page.

Watch Video
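To give a flavor of such post-processing, the sketch below aggregates per-layer cost output in plain Python. The column names (`layer`, `runtime_cycles`, `energy_nJ`) are illustrative assumptions, not MAESTRO's exact output schema; adapt them to the headers your MAESTRO build emits:

```python
# Minimal sketch of post-processing per-layer cost output in a notebook.
# The CSV schema below is an assumption for illustration only.
import csv
import io

SAMPLE = """layer,runtime_cycles,energy_nJ
conv1,12000,340.5
conv2,48000,1210.0
fc1,3000,95.2
"""

def summarize(csv_text):
    # Aggregate total runtime/energy and find the bottleneck layer.
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    total_cycles = sum(int(r["runtime_cycles"]) for r in rows)
    total_energy = sum(float(r["energy_nJ"]) for r in rows)
    slowest = max(rows, key=lambda r: int(r["runtime_cycles"]))["layer"]
    return total_cycles, total_energy, slowest
```

In a Jupyter notebook, the same aggregation is typically done with pandas, followed by per-layer bar charts of runtime and energy.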


Publications

  1. GAMMA: Automating the HW Mapping of DNN Models on Accelerators via Genetic Algorithm
    Sheng-Chun Kao and Tushar Krishna, ICCAD 2020 [PAPER]
  2. ConfuciuX: Autonomous Hardware Resource Assignment for DNN Accelerators using Reinforcement Learning
    Sheng-Chun Kao, Geonhwa Jeong, and Tushar Krishna, MICRO 2020 [PAPER]
  3. MAESTRO: A Data-Centric Approach to Understand Reuse, Performance, and Hardware Cost of DNN Mappings
    Hyoukjun Kwon, Prasanth Chatarasi, Michael Pellauer, Angshuman Parashar, Vivek Sarkar, and Tushar Krishna, IEEE Micro Top Picks 2020 [PAPER]
  4. Understanding Reuse, Performance, and Hardware Cost of DNN Dataflows: A Data-Centric Approach
    Hyoukjun Kwon, Prasanth Chatarasi, Michael Pellauer, Angshuman Parashar, Vivek Sarkar, and Tushar Krishna, MICRO 2019 [PDF] [Slides]


Tutorials

  1. Tutorial at MICRO 2020 [Link]
  2. Tutorial at HPCA 2019 [Link]
  3. Tutorial at ISCA 2018 [Link]