Varying the amount of computation based on sample difficulty.
Reducing the precision of weights and data to achieve better energy efficiency.
Enforcing structured sparsity in DNN weights so that the reduced computation can still be performed efficiently.
Designing hardware, mainly based on systolic arrays, with a focus on integrating sparsity, quantization, and adaptive inference into the design.
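
The three algorithmic ideas listed above (adaptive computation, reduced precision, and sparsity with a regular structure) can be illustrated with a minimal NumPy sketch. It is not taken from the source: the function names, the 8-bit width, the 2-of-4 sparsity pattern, the ReLU layers, and the confidence threshold are all assumptions chosen for illustration, and the systolic-array hardware itself is not modeled.

```python
import numpy as np


def quantize_symmetric(x, num_bits=8):
    """Uniform symmetric quantization (illustrative): integer codes plus a scale."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int32)
    return q, scale  # x is approximated by q * scale


def prune_n_of_m(w, n=2, m=4):
    """Keep the n largest-magnitude weights in every group of m along each row.

    This fixed n-of-m pattern is one assumed example of a regular sparsity structure.
    """
    rows, cols = w.shape
    assert cols % m == 0, "row length must be a multiple of m"
    groups = w.reshape(rows, cols // m, m)
    # Indices of the (m - n) smallest magnitudes in each group.
    drop = np.argsort(np.abs(groups), axis=-1)[..., : m - n]
    mask = np.ones(groups.shape, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=-1)
    return (groups * mask).reshape(rows, cols)


def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()


def early_exit_predict(x, layers, heads, threshold=0.9):
    """Run layers in order and stop at the first sufficiently confident exit head.

    Easy samples leave early; hard samples use the full depth.
    Returns (predicted class, number of layers actually executed).
    """
    h = x
    for depth, (w, head) in enumerate(zip(layers, heads), start=1):
        h = np.maximum(w @ h, 0.0)  # ReLU layer (assumed architecture)
        probs = softmax(head @ h)
        if probs.max() >= threshold or depth == len(layers):
            return int(np.argmax(probs)), depth
```

In this sketch, `early_exit_predict` returns both the prediction and the depth at which it stopped, so the saved computation per sample can be measured directly. `prune_n_of_m` and `quantize_symmetric` can be applied to each weight matrix beforehand.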