Cosine Similarity

A measure of how similar two vectors are in direction, regardless of their magnitude. Computed as the cosine of the angle between them:

$\mathrm{sim}(a, b) = \frac{a \cdot b}{\|a\| \, \|b\|}$

1.0 = vectors point in the same direction (identical meaning in embedding space)
0.0 = vectors are perpendicular (unrelated)
-1.0 = vectors point in opposite directions (rare in practice with embeddings: individual components can be negative, but similarity scores between texts from most embedding models tend to be positive)
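
The formula and the three reference values above can be checked with a minimal sketch in plain Python. The function name `cosine_similarity` and the example vectors are illustrative assumptions, not part of any particular library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Same direction, different magnitude: close to 1.0
same = cosine_similarity([1.0, 2.0], [2.0, 4.0])
# Perpendicular vectors: 0.0
perpendicular = cosine_similarity([1.0, 0.0], [0.0, 3.0])
# Opposite directions: close to -1.0
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])
```

Because of floating-point rounding, the first and third results may differ from 1.0 and -1.0 in the last digits, so comparisons are better done with a tolerance (for example `math.isclose`) than with `==`.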

In NLP and LLM evaluation, cosine similarity is used to compare Embeddings — if two texts have high cosine similarity between their embedding vectors, they are considered semantically similar.
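
As a rough sketch of how this comparison is typically used, the snippet below ranks candidate texts by their cosine similarity to a query. The three-dimensional vectors and the names `doc_a`, `doc_b`, `doc_c` are hand-written, hypothetical stand-ins. Real embeddings come from an embedding model and usually have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Hypothetical toy "embeddings", not produced by a real model.
query = [0.9, 0.1, 0.2]
candidates = {
    "doc_a": [0.8, 0.2, 0.1],  # points in nearly the same direction as the query
    "doc_b": [0.1, 0.9, 0.3],  # points in a mostly different direction
    "doc_c": [0.5, 0.5, 0.5],  # somewhere in between
}

# Sort candidate names from most to least similar to the query.
ranked = sorted(
    candidates,
    key=lambda name: cosine_similarity(query, candidates[name]),
    reverse=True,
)
```

With these toy vectors, `ranked` puts `doc_a` first and `doc_b` last. The ordering only reflects direction in the vector space, so it inherits whatever the embedding model does or does not capture.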

Limitations

Cannot distinguish between “similar topic” and “same meaning” — two sentences about the same subject but making opposite claims may score high
Sensitive to the embedding model used — different models produce different similarity scores for the same pair
Poor at detecting numerical differences (“$100” vs “$10,000” can score highly similar)
