Embeddings: Turning Meaning into Numbers

Learn how AI represents text and data in vector space for similarity, search, and reasoning.

1. What are Embeddings?

Embeddings are numerical vectors representing words, sentences, or documents. They allow AI to measure **semantic similarity**: how close in meaning two pieces of text are.

Example: "I love AI" and "Artificial intelligence is amazing" would have similar embeddings, even though the words differ.

2. Visualizing Embeddings

Imagine a 2D map where similar words are close together: "cat" and "dog" are nearby, while "cat" and "car" are far apart.

```
# Conceptually:
# cat -> [0.1, 0.9]
# dog -> [0.2, 0.85]
# car -> [0.9, 0.1]
# Smaller Euclidean distance (or higher cosine similarity) -> closer in meaning
```
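The toy vectors above are enough to compute both measures directly. A minimal sketch in plain Python (the 2D vectors are illustrative, not real model outputs):

```python
import math

# Toy 2D "embeddings" from the conceptual map above
vectors = {
    "cat": [0.1, 0.9],
    "dog": [0.2, 0.85],
    "car": [0.9, 0.1],
}

def euclidean(a, b):
    """Straight-line distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine(a, b):
    """Cosine similarity: closer to 1 means more similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(euclidean(vectors["cat"], vectors["dog"]))  # small: cat and dog are close
print(euclidean(vectors["cat"], vectors["car"]))  # large: cat and car are far apart
```

Either measure gives the same ranking here; cosine similarity is what most embedding libraries report, because it ignores vector length and compares only direction.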

3. Simple HuggingFace Example

A Python example using the sentence-transformers library to generate embeddings:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('all-MiniLM-L6-v2')

sentences = ["I love AI", "Artificial intelligence is amazing", "I enjoy hiking"]
embeddings = model.encode(sentences)

# Compare the first two sentences
similarity = util.cos_sim(embeddings[0], embeddings[1])
print(f"Similarity: {similarity.item():.2f}")
# The two AI sentences score much higher with each other than either does with "I enjoy hiking"
```
This shows how AI measures semantic closeness between texts. Higher cosine similarity means more similar meaning.

4. Applications

Embeddings power semantic search, document clustering, recommendation, deduplication, and retrieval-augmented generation, where relevant documents are fetched by similarity before a model answers.
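Semantic search is the most direct application: embed a query and every document, then rank documents by cosine similarity to the query. A minimal sketch with made-up toy vectors (a real system would use embeddings from a model like the one above):

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hypothetical precomputed document embeddings (toy 3-D vectors for illustration)
docs = {
    "Intro to machine learning": [0.9, 0.1, 0.2],
    "Hiking trails near Denver": [0.1, 0.9, 0.3],
    "Neural networks explained": [0.8, 0.2, 0.1],
}
query_embedding = [0.85, 0.15, 0.15]  # stand-in embedding for a query like "what is AI?"

# Rank documents by similarity to the query, most similar first
ranked = sorted(docs.items(), key=lambda kv: cos_sim(query_embedding, kv[1]), reverse=True)
for title, _ in ranked:
    print(title)
```

Note the ranking works even though the query shares no words with the documents; that is the whole point of searching in embedding space rather than by keyword.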

5. Try It Yourself

Pick a few sentences or short paragraphs. Generate embeddings and calculate similarity to see which are closest in meaning. Visualize with a 2D plot if you like!
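For the 2D plot, high-dimensional embeddings first need to be projected down to two components, e.g. with PCA. A minimal sketch using NumPy with random stand-in vectors (swap in real embeddings from `model.encode`):

```python
import numpy as np

def pca_2d(embeddings):
    """Project rows of `embeddings` onto their top two principal components."""
    X = np.asarray(embeddings, dtype=float)
    X = X - X.mean(axis=0)           # center the data
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:2].T              # (x, y) coordinates in the top-2 component plane

# Stand-in for real sentence embeddings (e.g. 384-dim vectors from all-MiniLM-L6-v2)
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(5, 384))

points = pca_2d(fake_embeddings)
print(points.shape)  # (5, 2): one point per sentence, ready for a scatter plot
```

From here, `matplotlib.pyplot.scatter(points[:, 0], points[:, 1])` with text labels gives the kind of 2D map sketched in section 2.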

6. Inspirational Quote

"Meaning is hidden in numbers; embeddings reveal it."