Vector Databases | AI Free Learning

1. What is a Vector Database?

A vector database stores **high-dimensional embeddings** and indexes them so that similar items can be retrieved quickly. Unlike traditional databases, which match queries on exact values or keywords, vector DBs retrieve items by **semantic similarity**: how close their embeddings are in vector space.

Example: Searching "AI ethics" in a vector DB might also return results like "responsible AI guidelines", because the two phrases are close in meaning even though their wording differs.
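
A minimal sketch of that difference, in plain Python. The three-dimensional vectors are hand-picked, hypothetical stand-ins for real embeddings (a real model would produce hundreds of dimensions), so only their relative directions matter here.

```python
import math

# Hypothetical toy embeddings; the values are made up for illustration.
documents = {
    "responsible AI guidelines": [0.9, 0.8, 0.1],
    "machine learning fairness": [0.8, 0.6, 0.2],
    "banana bread recipe": [0.1, 0.0, 0.9],
}
query_text = "AI ethics"
query_vector = [0.85, 0.75, 0.05]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Exact matching: no document title equals the query string.
exact_matches = [title for title in documents if title == query_text]

# Semantic matching: rank every document by similarity to the query vector.
ranked = sorted(
    documents,
    key=lambda title: cosine_similarity(query_vector, documents[title]),
    reverse=True,
)

print("Exact matches:", exact_matches)
print("Ranked by similarity:", ranked)
```

The exact-match lookup comes back empty, while the similarity ranking puts "responsible AI guidelines" first and the unrelated recipe last.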

2. How it Works

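Store embeddings (vectors) for text, images, or other data.
Convert each query into a vector with the same embedding model.
Compare the query vector against the stored vectors with a similarity metric (cosine similarity, dot product, or Euclidean distance) to find the closest matches.
Scale to millions of vectors by using approximate nearest-neighbor (ANN) indexes instead of comparing the query against every stored vector.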
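
The steps above can be sketched as a brute-force search in plain Python. The function names (`dot`, `cosine`, `top_k`) and the sample vectors are illustrative assumptions, not part of any library; real vector databases replace the linear scan with specialized indexes.

```python
import math


def dot(a, b):
    """Dot product: grows with both alignment and vector length."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: dot product of the length-normalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def top_k(query, stored, metric, k=2):
    """Score every stored vector against the query and return the k best ids."""
    scores = [(metric(query, vec), idx) for idx, vec in enumerate(stored)]
    scores.sort(reverse=True)  # higher score means more similar
    return [idx for _, idx in scores[:k]]


# Store vectors (made-up values for illustration).
stored = [
    [1.0, 0.0],  # id 0: same direction as the query, short
    [3.0, 3.0],  # id 1: 45 degrees off the query, long
    [0.0, 1.0],  # id 2: orthogonal to the query
]
query = [1.0, 0.0]

# Rank the stored vectors with each metric.
print("cosine:", top_k(query, stored, cosine))
print("dot:   ", top_k(query, stored, dot))
```

Cosine similarity ignores vector length and ranks id 0 first, while the dot product rewards the longer vector and ranks id 1 first. The two metrics produce the same ranking when all vectors are normalized to unit length.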

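Popular vector DBs: Pinecone (managed service), Milvus, Weaviate, and Qdrant (all three open-source). FAISS is an open-source similarity-search library rather than a full database, and it is often used as the indexing layer in small projects and prototypes.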

3. Simple Python Example using FAISS

```python
import numpy as np
import faiss

# Sample embeddings (3 vectors of dimension 5)
vectors = np.array([
    [0.1, 0.3, 0.2, 0.7, 0.5],
    [0.2, 0.1, 0.4, 0.6, 0.3],
    [0.9, 0.7, 0.8, 0.2, 0.1]
], dtype='float32')

# Build index
index = faiss.IndexFlatL2(5)  # L2 distance
index.add(vectors)

# Query vector
query = np.array([[0.15, 0.25, 0.3, 0.65, 0.4]], dtype='float32')
distances, indices = index.search(query, k=2)

print("Closest vectors:", indices)
print("Distances:", distances)
```

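`IndexFlatL2` performs an exact, brute-force search: it compares the query against every stored vector and returns the indices of the `k` nearest ones along with their squared L2 distances. Here the two closest are vectors 0 and 1. A full scan is fine for small collections; at millions of vectors, production systems typically switch to approximate nearest-neighbor indexes (such as IVF or HNSW) that trade a little accuracy for much faster queries.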
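
To make the numbers concrete, the following sketch recomputes the same search in plain Python, without FAISS. It assumes `IndexFlatL2` behaves as documented: an exhaustive scan that reports squared Euclidean distances. The helper name `squared_l2` is introduced here for illustration only.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


vectors = [
    [0.1, 0.3, 0.2, 0.7, 0.5],
    [0.2, 0.1, 0.4, 0.6, 0.3],
    [0.9, 0.7, 0.8, 0.2, 0.1],
]
query = [0.15, 0.25, 0.3, 0.65, 0.4]

# Compare the query against every stored vector (exhaustive scan),
# then keep the k smallest distances.
k = 2
scored = sorted((squared_l2(query, vec), idx) for idx, vec in enumerate(vectors))
nearest = scored[:k]

for distance, idx in nearest:
    print(f"vector {idx}: squared L2 distance = {distance:.4f}")
```

The squared distances come out to roughly 0.0275 for vector 0 and 0.0475 for vector 1, which should agree with the FAISS output up to float32 rounding.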

4. Applications

Semantic search engines
RAG systems (Retrieval-Augmented Generation)
Recommendation engines
Image or multimedia similarity search
AI knowledge retrieval

5. Try It Yourself

Generate embeddings for a few sentences or documents. Store them in a vector DB (like FAISS or Milvus) and perform similarity search. Visualize results to see how similar items cluster.
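
One possible starting point, kept dependency-free so it runs anywhere. The bag-of-words `embed` function is a deliberately crude stand-in for a real embedding model (it only captures shared words, not meaning), and the `TinyVectorStore` class is a hypothetical in-memory substitute for FAISS or Milvus.

```python
import math
from collections import Counter


def embed(text, vocabulary):
    """Toy embedding: word counts over a fixed vocabulary (not semantic)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store: add texts, then search by similarity."""

    def __init__(self, vocabulary):
        self.vocabulary = vocabulary
        self.items = []  # list of (text, vector) pairs

    def add(self, text):
        self.items.append((text, embed(text, self.vocabulary)))

    def search(self, query, k=2):
        query_vec = embed(query, self.vocabulary)
        ranked = sorted(
            self.items,
            key=lambda item: cosine(query_vec, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


sentences = [
    "vector databases store embeddings",
    "embeddings capture meaning of text",
    "bread needs flour and water",
]
vocabulary = sorted({word for s in sentences for word in s.split()})

store = TinyVectorStore(vocabulary)
for sentence in sentences:
    store.add(sentence)

results = store.search("how do databases store embeddings", k=2)
print(results)
```

Swapping `embed` for a real embedding model and `TinyVectorStore` for a FAISS index keeps the same add-then-search shape, while making the matches depend on meaning rather than on shared words.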

6. Inspirational Quote

"Vectors carry knowledge; databases let it speak."