ML Collective’s ICML Paper: A Probabilistic Interpretation of Transformers
In the new paper A Probabilistic Interpretation of Transformers, ML Collective researcher Alexander Shim provides a probabilistic explanation of transformers’ exponential dot product attention and contrastive learning based on distributions of the exponential family.