Vector Databases: Efficient AI Retrieval

Store embeddings, search semantically, and retrieve knowledge at lightning speed.

1. What is a Vector Database?

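A vector database stores **high-dimensional embeddings** (numeric vectors that represent text, images, or other data) and indexes them so that similar items can be retrieved quickly. Unlike traditional databases, which rely on exact matches, vector DBs rank results by **semantic similarity**, typically measured as a distance between vectors.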

Example: Searching "AI ethics" in a vector DB might also return results like "responsible AI guidelines" based on meaning.
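The difference between exact matching and similarity search can be sketched in a few lines of plain Python. The three-dimensional vectors below are hand-picked, hypothetical values standing in for real embeddings, and `cosine_similarity` is an illustrative helper, not part of any vector DB's API.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-picked for illustration.
# A real embedding model would produce hundreds of dimensions.
documents = {
    "responsible AI guidelines": [0.9, 0.8, 0.1],
    "banana bread recipe": [0.1, 0.0, 0.9],
}
query_text = "AI ethics"
query_vector = [0.8, 0.9, 0.2]  # assumed embedding of the query text


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Exact matching finds nothing: the query string is not a stored title.
exact_matches = [title for title in documents if title == query_text]

# Similarity search ranks every document by closeness to the query vector.
ranked = sorted(
    documents,
    key=lambda title: cosine_similarity(query_vector, documents[title]),
    reverse=True,
)

print("Exact matches:", exact_matches)
print("Ranked by similarity:", ranked)
```

With these assumed vectors, the exact-match list is empty while the similarity ranking puts "responsible AI guidelines" first, because its vector points in nearly the same direction as the query's.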

2. How it Works

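Popular options: Pinecone (a managed service); Milvus, Weaviate, and Qdrant (open-source databases); and FAISS (an open-source similarity-search library rather than a full database).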

3. Simple Python Example using FAISS

import numpy as np
import faiss

# Sample embeddings (3 vectors of dimension 5)
vectors = np.array([
    [0.1, 0.3, 0.2, 0.7, 0.5],
    [0.2, 0.1, 0.4, 0.6, 0.3],
    [0.9, 0.7, 0.8, 0.2, 0.1]
], dtype='float32')

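# Build an exact (brute-force) index over 5-dimensional vectors
index = faiss.IndexFlatL2(5)  # ranks by squared L2 (Euclidean) distance
index.add(vectors)

# Query: a 2-D float32 array with one row per query vector
query = np.array([[0.15, 0.25, 0.3, 0.65, 0.4]], dtype='float32')
distances, indices = index.search(query, k=2)  # the 2 nearest neighbours

print("Closest vectors:", indices)  # row positions in `vectors`
print("Distances:", distances)      # squared L2 distances, smallest first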
      
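The search returns the positions of the k nearest stored vectors together with their squared L2 distances; here the query is closest to vectors 0 and 1. IndexFlatL2 compares the query against every stored vector, so it is exact but its cost grows linearly with the collection. Production systems that hold millions of vectors usually rely on approximate nearest-neighbor (ANN) indexes, such as IVF or HNSW, which trade a little accuracy for much faster search.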
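To make the search step concrete, here is a dependency-free sketch of the computation an exact L2 index performs, using the same sample vectors and query as the FAISS example. The helpers `squared_l2` and `flat_search` are hypothetical names chosen for illustration; FAISS itself does this in optimized native code.

```python
# Same sample data as the FAISS example, as plain Python lists.
vectors = [
    [0.1, 0.3, 0.2, 0.7, 0.5],
    [0.2, 0.1, 0.4, 0.6, 0.3],
    [0.9, 0.7, 0.8, 0.2, 0.1],
]
query = [0.15, 0.25, 0.3, 0.65, 0.4]


def squared_l2(a, b):
    """Sum of squared coordinate differences between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_search(query, vectors, k):
    """Compare the query with every vector and keep the k closest."""
    scored = sorted((squared_l2(query, v), i) for i, v in enumerate(vectors))
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


distances, indices = flat_search(query, vectors, k=2)
print("Closest vectors:", indices)
print("Distances:", distances)
```

Every stored vector is visited once per query, which is why this approach is exact but becomes slow as the collection grows.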

4. Applications

5. Try It Yourself

Generate embeddings for a few sentences or documents. Store them in a vector DB (like FAISS or Milvus) and perform similarity search. Visualize results to see how similar items cluster.
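As a starting point, the exercise can be sketched end to end without installing anything. This sketch makes two loud assumptions: `embed` is a toy word-count function standing in for a real embedding model, and `TinyVectorStore` is a hypothetical in-memory class standing in for FAISS or Milvus. Swap both for the real tools once the flow is clear.

```python
import math
from collections import Counter


def embed(text, vocabulary):
    """Toy stand-in for an embedding model: one word count per vocabulary entry."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store: keeps (text, vector) pairs, searches by scan."""

    def __init__(self):
        self.items = []

    def add(self, text, vector):
        self.items.append((text, vector))

    def search(self, query_vector, k=2):
        scored = [(cosine(query_vector, vec), text) for text, vec in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


sentences = [
    "the cat sat on the mat",
    "a cat chased the mouse",
    "stock markets fell sharply today",
]
# Vocabulary built from the stored sentences; unseen query words are ignored.
vocabulary = sorted({word for s in sentences for word in s.lower().split()})

store = TinyVectorStore()
for sentence in sentences:
    store.add(sentence, embed(sentence, vocabulary))

results = store.search(embed("where is the cat", vocabulary), k=2)
for score, text in results:
    print(f"{score:.2f}  {text}")
```

Word counts only capture shared vocabulary, not meaning, so a real embedding model is needed before the results reflect semantic similarity.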

6. Inspirational Quote

"Vectors carry knowledge; databases let it speak."