AMUSD: Asynchronous Multi-Device Speculative Decoding for LLM Acceleration

Posted on January 16, 2025

Bradley McDanel
IEEE International Symposium on Circuits and Systems (ISCAS), 2025.
preprint
code