Tag: transformer architecture
How LLMs Use Probabilities to Pick the Next Word
Tamara Weed, Apr 23, 2026
Learn how Large Language Models use token prediction and probability distributions to generate text, from the softmax function to decoding strategies like Top-P and Temperature.
How Positional Information Enables Word Order Understanding in Large Language Models
Tamara Weed, Mar 26, 2026
Learn how positional encoding solves the word order problem in Transformers. We explore absolute, relative, and rotary methods, along with recent research findings and future trends.

