GainSight with SCALE-Sim-v2 Systolic Array Simulator Backend
This document describes how to use the SCALE-Sim backend within the GainSight profiling framework. This backend utilizes a fork of SCALE-Sim v2 to simulate deep learning accelerators and extract memory access patterns for lifetime analysis.
Description
The SCALE-Sim backend, located in backend/scalesim/, allows GainSight to analyze memory lifetimes based on simulations of systolic array accelerators. It processes the memory access traces generated by SCALE-Sim v2.
Key components:
- backend/scalesim/scale-sim-v2/: Contains the forked SCALE-Sim v2 simulator code.
- backend/scalesim/python/: Contains Python scripts (run.py, parse_lifetimes.py, etc.) to process SCALE-Sim output traces.
- frontend/scale_sim_frontend.py: Frontend script to perform final analysis and generate reports from the processed data.
Usage Workflow
The complete workflow for using the SCALE-Sim backend in GainSight involves three main phases:
1. Generate Memory Access Traces with SCALE-Sim
- Generate memory access traces using the SCALE-Sim simulator.
- You need a topology file describing the network layers (e.g., resnet-50.csv) and a configuration file defining hardware parameters (e.g., gainsight.cfg). Examples might be found in backend/scalesim/scale-sim-v2/configs and backend/scalesim/scale-sim-v2/topologies. Adjust the configuration file (.cfg) for your target hardware (PE array size, SRAM size, etc.).
- Execute the simulation:
```bash
# Navigate to the scale-sim-v2 directory if not already there
cd backend/scalesim/scale-sim-v2
# Run the simulation
python3 scalesim/scale.py -t <path_to_topology_file.csv> -c <path_to_config_file.cfg> -p <path_to_scalesim_output_directory/>
```
- This command generates detailed trace files in the specified output directory, including:
  - COMPUTE_REPORT.csv: Layer-by-layer computational statistics
  - BANDWIDTH_REPORT.csv: Memory bandwidth requirements for each layer
  - Layer-specific trace files showing cycle-accurate memory accesses
2. Process Memory Access Traces with run.py
The run.py script in backend/scalesim/python/ is the primary entry point for processing SCALE-Sim traces into GainSight-compatible data:
This script performs the following operations for each layer in the network:
- Parse Memory Lifetimes: Calls parse_lifetimes.py to analyze the trace files and calculate data lifetimes
  - Processes IFMAP, Filter, and OFMAP traces separately
  - Tracks both reads and writes to calculate memory lifetime (time between write and last read)
  - Outputs detailed CSV files with address-specific lifetime data
- Generate Aggregate Statistics: For each data type (IFMAP, OFMAP, Filter), calculates:
  - Average, median, 90th percentile, and maximum lifetimes
  - Read and write frequencies
  - Total read and write counts
  - Unique address count (memory footprint)
  - Outputs an _aggregate_data.csv file with these statistics
- Create Visualizations: Uses the create_graphs.py module to generate lifetime distribution plots for each layer and data type
The script organizes outputs in directories mirroring the structure of the SCALE-Sim results, with each network layer having its own subdirectory containing:
- <layer_name>_lifetime_data.csv: Raw lifetime data for each memory address
- <layer_name>_aggregate_data.csv: Aggregated memory statistics
- <layer_name>_graph.png: Visualization of lifetime distributions
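The core lifetime computation described above can be sketched as follows. This is a simplified illustration, not the actual code of parse_lifetimes.py: the function and parameter names are hypothetical, and it assumes the traces have already been parsed into (cycle, address) pairs.

```python
from statistics import mean, median

def compute_lifetimes(writes, reads):
    """Lifetime of an address = cycles between its (first) write and its
    last read. `writes` and `reads` are lists of (cycle, address) pairs.
    Addresses written but never read get no lifetime entry."""
    write_cycle = {}
    for cycle, addr in writes:
        write_cycle.setdefault(addr, cycle)  # keep the first write
    last_read = {}
    for cycle, addr in reads:
        last_read[addr] = max(last_read.get(addr, cycle), cycle)
    return {addr: last_read[addr] - write_cycle[addr]
            for addr in write_cycle if addr in last_read}

def aggregate(lifetimes):
    """Aggregate statistics akin to the _aggregate_data.csv columns."""
    values = sorted(lifetimes.values())
    p90 = values[int(0.9 * (len(values) - 1))]
    return {"avg": mean(values), "median": median(values),
            "p90": p90, "max": values[-1], "footprint": len(values)}
```

For example, an address written at cycle 0 and last read at cycle 9 has a lifetime of 9 cycles, and the footprint is simply the number of unique addresses with a recorded lifetime.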
3. Generate Final Analysis with scale_sim_frontend.py
After processing the raw data, run the frontend analysis script to calculate key memory technology metrics:
Details of what the frontend does can be found in the frontend documentation; in summary, it performs the following tasks:
- Import and Combine Layer Data: Concatenates data from all neural network layers
  - Aggregates statistics across all memory types or per memory type (IFMAP, OFMAP, Filter)
- Calculate Memory Technology Metrics: Performs analysis based on different memory cell technologies
  - Reads cell technology parameters from simple_gc_list.json (gain cell retention times)
  - Reads area and power parameters from area_power.json
  - Calculates required refresh rates based on data lifetimes and cell retention times
  - Determines area requirements based on memory footprint and cell size
  - Estimates energy consumption accounting for reads, writes, and refresh operations
- Output Results: Generates a comprehensive JSON report with:
  - Overall workload information (name, size, dataflow style)
  - Data lifetime statistics across all memory subdivisions
  - Memory technology comparisons (SRAM vs. different gain cell variants)
  - Area and energy estimates for different memory technologies
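The refresh, area, and energy calculations above can be illustrated with a simplified model. The formulas and names below are a sketch of the general approach, not the frontend's actual implementation or the real keys of simple_gc_list.json / area_power.json:

```python
import math

def refresh_count(lifetime_cycles, retention_cycles):
    """Refreshes needed to keep a value alive in a gain cell: data that
    outlives the cell's retention time must be refreshed once per
    retention window. SRAM-like cells (retention >= lifetime) need none."""
    if lifetime_cycles <= retention_cycles:
        return 0
    return math.ceil(lifetime_cycles / retention_cycles) - 1

def array_area(footprint_bits, cell_area_um2):
    """Area scales with the memory footprint times the per-cell area."""
    return footprint_bits * cell_area_um2

def total_energy(n_reads, n_writes, n_refreshes, e_read, e_write):
    """A refresh is modeled here as a read followed by a write-back."""
    return (n_reads * e_read + n_writes * e_write
            + n_refreshes * (e_read + e_write))
```

For example, a value with a 100-cycle lifetime in a cell with 40-cycle retention needs two refreshes (at cycles 40 and 80), while the same value in SRAM needs none; this is the kind of trade-off the final report quantifies.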
Output Files and Data Format
SCALE-Sim Raw Output
The SCALE-Sim simulator generates these files in the output directory:
- COMPUTE_REPORT.csv: Layer-wise computational statistics
- BANDWIDTH_REPORT.csv: Memory bandwidth requirements
- DETAILED_ACCESS_REPORT.csv: Access patterns summary
- Per-layer trace directories containing:
- IFMAP_SRAM_TRACE.csv, IFMAP_DRAM_TRACE.csv: Input feature map memory accesses
- FILTER_SRAM_TRACE.csv, FILTER_DRAM_TRACE.csv: Filter/weight memory accesses
- OFMAP_SRAM_TRACE.csv, OFMAP_DRAM_TRACE.csv: Output feature map memory accesses
These traces record cycle-accurate memory accesses when save_disk_space=False is configured in SCALE-Sim.
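A minimal reader for such a trace might look like the sketch below. It assumes each row holds a cycle count in the first column followed by the addresses accessed that cycle, with empty cells for idle lanes; verify this layout against your generated trace files before relying on it:

```python
import csv
import io

def read_trace(fileobj):
    """Yield (cycle, [addresses]) per row of a SCALE-Sim-style trace.

    Assumed row layout: cycle in column 0, one address per remaining
    column; empty cells mean no access on that lane this cycle.
    """
    for row in csv.reader(fileobj):
        cells = [c.strip() for c in row if c.strip()]
        if not cells:
            continue
        cycle = int(float(cells[0]))
        addrs = [int(float(c)) for c in cells[1:]]
        yield cycle, addrs

# Tiny synthetic trace standing in for e.g. IFMAP_SRAM_TRACE.csv
sample = io.StringIO("0,100,101\n1,102,\n")
parsed = list(read_trace(sample))  # [(0, [100, 101]), (1, [102])]
```

Feeding the read and write traces of one data type through a reader like this yields the (cycle, address) streams that the lifetime analysis consumes.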
GainSight Processed Data
The run.py script generates these files for each layer:
- <layer>_lifetime_data.csv: Raw lifetime data with columns:
  - On-chip memory subdivision (ifmap/filter/ofmap)
  - Memory address
  - Lifetime in cycles (time between write and last read)
- <layer>_aggregate_data.csv: Statistical summary with columns:
  - On-chip memory subdivision (ifmap/filter/ofmap)
  - Lifetime statistics (avg, median, 90th%, max)
  - Access frequencies (read/write)
  - Operation counts (reads/writes)
  - Memory footprint (unique addresses)
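These per-layer files are what the frontend later concatenates. Combining them is straightforward, as in this sketch (the column names here are illustrative, not necessarily the exact headers run.py emits):

```python
import csv
import io

def combine_aggregates(files):
    """Concatenate per-layer aggregate rows, tagging each with its layer."""
    combined = []
    for layer, fobj in files.items():
        for row in csv.DictReader(fobj):
            row["layer"] = layer
            combined.append(row)
    return combined

# Two tiny stand-ins for <layer>_aggregate_data.csv files
layer0 = io.StringIO("subdivision,avg_lifetime\nifmap,12.5\n")
layer1 = io.StringIO("subdivision,avg_lifetime\nifmap,20.0\n")
rows = combine_aggregates({"conv1": layer0, "conv2": layer1})
```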
GainSight Frontend Output
scale_sim_frontend.py produces a JSON file with:
- Workload identification (name, size, dataflow style)
- Write frequency analysis per data type
- Refresh count estimations for different memory technologies
- Area projections for different memory technologies
- Energy consumption estimates for different memory technologies
This final output enables architects to make technology-aware decisions about memory hierarchy design for specific AI workloads.