Publications

FAST: DNN Training Under Variable Precision Block Floating Point with Stochastic Rounding

S. Zhang, B. McDanel, H. T. Kung
28th IEEE International Symposium on High-Performance Computer Architecture (HPCA-28), 2022
preprint

Saturation RRAM Leveraging Bit-level Sparsity Resulting from Term Quantization

B. McDanel, H. T. Kung, S. Zhang
IEEE International Symposium on Circuits and Systems (ISCAS), 2021
paper

Field-Configurable Multi-resolution Inference: Rethinking Quantization

S. Zhang, B. McDanel, H. T. Kung, X. Dong
26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2021
preprint

Term Quantization: Furthering Quantization at Run Time

H. T. Kung, B. McDanel, S. Zhang
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2020
paper

Maestro: A Memory-on-Logic Architecture for Coordinated Parallel Use of Many Systolic Arrays

H. T. Kung, B. McDanel, S. Zhang, X. Dong, C. Chen
30th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), 2019
paper

Full-stack Optimization for Accelerating CNNs Using Powers-of-Two Weights with FPGA Validation

B. McDanel, S. Zhang, H. T. Kung, X. Dong
33rd ACM International Conference on Supercomputing (ICS), 2019
paper

Systolic Building Block for Logic-on-Logic 3D-IC Implementations of Convolutional Neural Networks

H. T. Kung, B. McDanel, S. Zhang, C. T. Wang, J. Cai, C. Y. Chen, V. Chang, M. F. Chen, J. Sun, D. Yu
IEEE International Symposium on Circuits and Systems (ISCAS), 2019
paper

Packing Sparse Convolutional Neural Networks for Efficient Systolic Array Implementations: Column Combining Under Joint Optimization

H. T. Kung, B. McDanel, S. Zhang
24th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2019
paper | code

Adaptive Tiling: Applying Fixed-size Systolic Arrays To Sparse Convolutional Neural Networks

H. T. Kung, B. McDanel, S. Zhang
International Conference on Pattern Recognition (ICPR), 2018
paper

Mapping Systolic Arrays Onto 3D Circuit Structures: Accelerating Convolutional Neural Network Inference

H. T. Kung, B. McDanel, S. Zhang
IEEE Workshop on Signal Processing Systems (SiPS), 2018
paper
