Computer Science Education, with a focus on how language models are impacting learning and assessment.
Exploring the intersection of LLMs and Programming Languages.
Reducing the LLM autoregressive bottleneck by using a smaller draft model to propose tokens that the target model then verifies (speculative decoding).
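As a rough illustration, the sketch below mimics the draft-then-verify loop with toy stand-in models; `draft_model` and `target_model` are hypothetical functions, and the greedy acceptance test is a simplification of the usual rejection-sampling rule:

```python
import random

VOCAB = list(range(10))

def target_model(ctx):
    # Hypothetical stand-in for the large target model.
    random.seed(sum(ctx))
    return random.choice(VOCAB)

def draft_model(ctx):
    # Hypothetical stand-in for the cheap draft model; it agrees with
    # the target most of the time, but not always.
    random.seed(sum(ctx) + (1 if len(ctx) % 5 == 0 else 0))
    return random.choice(VOCAB)

def speculative_decode(prompt, k=4, max_len=20):
    out = list(prompt)
    while len(out) < max_len:
        # 1) The draft model autoregressively proposes k cheap tokens.
        proposal = []
        for _ in range(k):
            proposal.append(draft_model(out + proposal))
        # 2) The target model verifies them; in a real system this is
        #    one batched forward pass instead of k sequential ones.
        accepted = []
        for tok in proposal:
            target_tok = target_model(out + accepted)
            if tok == target_tok:
                accepted.append(tok)         # draft was right: keep it
            else:
                accepted.append(target_tok)  # first miss: take the target's token
                break
        out.extend(accepted)
    return out

print(speculative_decode([1, 2, 3]))
```

Each iteration commits at least one target-approved token, so the output matches what the target alone would produce, at a fraction of the sequential calls when the draft agrees often.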
Varying the amount of computation per sample based on its difficulty, e.g., letting easy inputs exit the model early.
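A minimal sketch of one such scheme, confidence-based early exit in a two-stage cascade; both stages are hypothetical stand-ins, and the 0.9 threshold is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(0)
W = np.ones(4)  # shared toy weight vector

def cheap_model(x):
    # Hypothetical stand-in: fast, less reliable probability estimate.
    p = 1.0 / (1.0 + np.exp(-(x @ W)))
    return np.array([1.0 - p, p])

def expensive_model(x):
    # Hypothetical stand-in: slower, sharper estimate.
    p = 1.0 / (1.0 + np.exp(-2.0 * (x @ W)))
    return np.array([1.0 - p, p])

def adaptive_predict(x, threshold=0.9):
    probs = cheap_model(x)
    if probs.max() >= threshold:          # confident enough: exit early
        return int(probs.argmax()), "cheap"
    return int(expensive_model(x).argmax()), "expensive"

for x in rng.normal(size=(5, 4)):
    print(adaptive_predict(x))
```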
Reducing the numerical precision of weights and activations to achieve better energy efficiency.
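A minimal sketch of symmetric per-tensor int8 quantization, one common recipe among many:

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor quantization: map floats to int8 using a
    # single scale factor derived from the largest magnitude.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original floats.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, s = quantize_int8(w)
print("max abs error:", np.abs(w - dequantize(q, s)).max())
```

The int8 tensor is 4x smaller than float32, and integer arithmetic is substantially cheaper in energy per operation; the price is the rounding error printed above.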
Enforcing structured sparsity in DNN weights, reducing computation in a pattern that hardware can exploit efficiently.
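A minimal sketch of one well-known pattern, 2:4 sparsity (keep the two largest-magnitude weights in every group of four); grouping along the last axis, which must be divisible by 4, is an assumption here:

```python
import numpy as np

def prune_2_of_4(w):
    # Enforce 2:4 structured sparsity: in every group of 4 consecutive
    # weights along the last axis, zero the 2 smallest magnitudes.
    w = w.copy()
    groups = w.reshape(-1, 4)
    drop = np.argsort(np.abs(groups), axis=1)[:, :2]  # 2 smallest per group
    np.put_along_axis(groups, drop, 0.0, axis=1)
    return groups.reshape(w.shape)

rng = np.random.default_rng(0)
w = rng.normal(size=(2, 8))
print(prune_2_of_4(w))
```

Because exactly half the weights in every group are zero, the nonzeros can be stored compactly with small per-group indices and skipped deterministically, unlike unstructured sparsity.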
Mainly using systolic arrays, with a focus on integrating sparsity, quantization, and adaptive inference into the accelerator design.
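To make the dataflow concrete, here is a cycle-level sketch of an output-stationary systolic array computing a matrix multiply; it is a simplified software simulation, not a hardware description, and the output-stationary choice is one of several common dataflows:

```python
import numpy as np

def systolic_matmul(A, B):
    # Output-stationary dataflow: PE (i, j) keeps a running sum for
    # C[i, j]; A streams in from the left and B from the top, each
    # skewed one cycle per row/column so matching operands meet
    # at the right PE on the right cycle.
    n, k = A.shape
    _, m = B.shape
    C = np.zeros((n, m))
    a_reg = np.zeros((n, m))  # operand moving rightward through the array
    b_reg = np.zeros((n, m))  # operand moving downward through the array
    for t in range(n + m + k - 2):         # cycles until the array drains
        a_reg = np.roll(a_reg, 1, axis=1)  # shift right by one PE
        b_reg = np.roll(b_reg, 1, axis=0)  # shift down by one PE
        for i in range(n):                 # inject skewed A at the left edge
            a_reg[i, 0] = A[i, t - i] if 0 <= t - i < k else 0.0
        for j in range(m):                 # inject skewed B at the top edge
            b_reg[0, j] = B[t - j, j] if 0 <= t - j < k else 0.0
        C += a_reg * b_reg                 # every PE multiply-accumulates
    return C

A = np.arange(6, dtype=float).reshape(2, 3)
B = np.arange(12, dtype=float).reshape(3, 4)
assert np.allclose(systolic_matmul(A, B), A @ B)
```

Because each operand is reused as it marches across the array, data moves only between neighboring PEs each cycle; this local communication is what makes systolic arrays a natural substrate for layering in sparsity, low-precision arithmetic, and per-sample adaptive execution.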