Nonlinear Computation and Learning over Communication Networks
The advent of large language models (LLMs), whose unprecedented scale necessitates distributed training and inference, has intensified the need for communication- and computation-efficient distributed systems. In particular, LLM training and inference rely on nonlinear transformations, most prominently activation functions and attention mechanisms.
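To make the nonlinearity concrete, the following is a minimal NumPy sketch of the two operations named above: softmax-based scaled dot-product attention and a GELU activation. The shapes and function names are illustrative only, not drawn from any particular system.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: the nonlinearity at the heart of attention.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def gelu(x):
    # Tanh approximation of the GELU activation used in many transformer FFNs.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ V

# Toy example: 4 tokens, model dimension 8.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
out = gelu(attention(Q, K, V))  # attention followed by an activation
print(out.shape)  # (4, 8)
```

Neither softmax nor GELU is linear in its inputs, which is why these operations cannot be computed by the standard linear-coding techniques that communication-efficient distributed systems typically exploit.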