Brad McDanel

Speculative Decoding and Beyond: An In-Depth Review of Techniques

Yunhai Hu, Zining Liu, Zhenyuan Dong, Tianfan Peng, Bradley McDanel, and Sai Qian Zhang. EMNLP Findings, 2025. preprint


PipeSpec: Breaking Stage Dependencies in Hierarchical LLM Decoding

Bradley McDanel, Sai Qian Zhang, Yunhai Hu, Zining Liu. ACL Findings, 2025. preprint | code


AMUSD: Asynchronous Multi-Device Speculative Decoding for LLM Acceleration

Bradley McDanel. IEEE International Symposium on Circuits and Systems (ISCAS), 2025. preprint | paper | code


StitchNet: Composing Neural Networks from Pre-Trained Fragments

Surat Teerapittayanon, Marcus Comiter, Bradley McDanel, H. T. Kung. IEEE International Conference on Machine Learning and Applications (ICMLA), 2023. preprint


Accelerating Vision Transformer Training via a Patch Sampling Schedule

Bradley McDanel, Chi Phuong Huynh. IEEE International Conference on Machine Learning and Applications (ICMLA), 2023. preprint | code


FAST: DNN Training Under Variable Precision Block Floating Point with Stochastic Rounding

S. Zhang, B. McDanel, H. T. Kung. 28th IEEE International Symposium on High-Performance Computer Architecture (HPCA-28), 2022. preprint


Field-Configurable Multi-resolution Inference: Rethinking Quantization

S. Zhang, B. McDanel, H. T. Kung, X. Dong. 26th ACM International Conference on Architectural Support for Programming Languages and […]


Incomplete Dot Products for Dynamic Computation Scaling in Neural Network Inference

B. McDanel, S. Teerapittayanon, H. T. Kung. International Conference on Machine Learning and Applications (ICMLA), 2017. paper


Distributed Deep Neural Networks over the Cloud, the Edge and End Devices

S. Teerapittayanon, B. McDanel, H. T. Kung. International Conference on Distributed Computing Systems (ICDCS), 2017. paper | code


BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks

S. Teerapittayanon, B. McDanel, H. T. Kung. International Conference on Pattern Recognition (ICPR), 2016. paper