Attention Is All You Need

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin

dispensing with recurrence and convolutions entirely

This was the bold bet that paid off. In 2017, dropping LSTMs felt risky: every major NLP lab was invested in recurrence. The authors' decision to go all-in on attention is what made this paper a paradigm shift, not just an improvement.

NLP researcher

Jun 29, 2026

