# Terms Used in "Attention is All You Need"

Below is a table of key terms used in the paper “Attention is All You Need,” along with their Chinese translations and brief explanations. Where applicable, links to external resources are listed for further reading, and two short code sketches of the attention computation and the positional encoding follow the table.

| English Term | Chinese Translation | Explanation | Link |
| --- | --- | --- | --- |
| Encoder | 编码器 | The component that processes the input sequence. | |
| Decoder | 解码器 | The component that generates the output sequence. | |
| Attention Mechanism | 注意力机制 | Measures relationships between sequence elements. | Attention Mechanism Explained |
| Self-Attention | 自注意力 | Focuses on dependencies within a single sequence. | |
| Masked Self-Attention | 掩码自注意力 | Prevents the decoder from seeing future tokens. | |
| Multi-Head Attention | 多头注意力 | Runs several attention heads in parallel so the model can attend to different representation subspaces. | |
| Positional Encoding | 位置编码 | Adds positional information to embeddings. | |
| Residual Connection | 残差连接 | Shortcut connections that improve gradient flow. | |
| Layer Normalization | 层归一化 | Stabilizes training by normalizing layer inputs. | Layer Normalization Details |
| Feed-Forward Neural Network (FFNN) | 前馈神经网络 | Processes each position independently of sequence order. | Feed-Forward Networks in NLP |
| Recurrent Neural Network (RNN) | 循环神经网络 | Processes sequences step by step, maintaining a hidden state. | RNN Basics |
| Convolutional Neural Network (CNN) | 卷积神经网络 | Uses convolutions to extract features from input data. | CNN Overview |
| Parallelization | 并行化 | Performing multiple computations simultaneously. | |
| BLEU (Bilingual Evaluation Understudy) | 双语评估替代 | A metric for evaluating the quality of machine translations. | Understanding BLEU |
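
To make a few of these terms concrete, here is a minimal NumPy sketch of scaled dot-product self-attention with an optional causal mask (the masked self-attention used in the decoder). The function names, array shapes, and toy inputs are assumptions for illustration, not code from the paper or any library.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v, causal=False):
    """Single-head self-attention over x of shape (seq_len, d_model)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v        # project into queries, keys, values
    d_k = k.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)            # scaled dot-product similarities
    if causal:
        # Masked self-attention: block each position from seeing future tokens.
        future = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(future, -1e9, scores)
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ v                         # weighted sum of value vectors

# Toy usage: sequence length 3, model width 4, random projection weights.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
w_q, w_k, w_v = rng.normal(size=(3, 4, 4))
print(self_attention(x, w_q, w_k, w_v, causal=True).shape)  # (3, 4)
```

Positional encoding is similarly compact. Below is a sketch of the paper’s sinusoidal formulation, PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)), again with assumed names and shapes.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encodings of shape (seq_len, d_model); d_model assumed even."""
    pos = np.arange(seq_len)[:, None]                   # positions 0..seq_len-1, as a column
    i = np.arange(d_model // 2)[None, :]                # dimension-pair index
    angles = pos / np.power(10000.0, 2 * i / d_model)   # pos / 10000^(2i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                        # even dimensions use sine
    pe[:, 1::2] = np.cos(angles)                        # odd dimensions use cosine
    return pe

print(positional_encoding(10, 16).shape)  # (10, 16)
```

In the full model, this encoding is simply added to the token embeddings before the first layer, which is how the otherwise order-agnostic attention layers receive position information.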

This table provides a solid foundation for understanding the technical terms used in the “Attention is All You Need” paper. If you have questions or want to dive deeper into any term, the linked resources are a great place to start!
