paper7 — Annotate academic papers

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin

“28.4 BLEU on the WMT 2014 English-to-German translation task”

BLEU has since fallen out of favor as the go-to metric for translation quality — humans consistently find that higher BLEU doesn't always mean better translations. The field has moved toward COMET and human eval. Ironic that a seminal paper is anchored to a metric we now distrust.

Computational linguist

Jun 29, 2026

▲63

Discussion (0)

No discussion yet.

Read in context

Open the full paper with all annotations

→

More annotations on this paper

“dispensing with recurrence and convolutions entirely”

This was the bold bet that paid off. In 2017, dropping LSTMs felt risky — every major NLP lab was invested in recurrence. The fact that they went all-in on attention is what made this paper a paradigm shift, not just an improvement.

NLP researcher▲ 87