Pinecone: A Complete Guide to Vector Databases

What Is a Vector Database, and Why Pinecone?

A Vector Database is a database built specifically for storing and searching Embeddings: numeric representations of text, images, video, and audio. Unlike SQL, which matches exact values, a Vector DB searches by semantic proximity and returns the items whose meaning is closest to your query.

Pinecone is one of the most widely used Vector Databases: a Managed Service that needs no infrastructure of your own, with a Free Tier and a Serverless option that scales from small to very large workloads. It is a common foundation for RAG (Retrieval Augmented Generation) systems.

Tool	Pinecone	Chroma	Weaviate	FAISS
Type	Managed Cloud	Local / Cloud	Self-hosted	Local Library
Scale	Billions of vectors	Millions	Large	Limited by RAM
Free Tier	100K vectors	Fully open source	Self-hosted	Completely free
Production-ready	Yes	Partial	Yes	No
Hybrid Search	Yes	No	Yes	No

Core Concepts: Vectors, Embeddings, and Similarity

Before writing any code, it helps to understand the basic concepts:

What Is an Embedding?

An Embedding is a numeric representation of text: an array of floating-point numbers (for example, 1536 dimensions with OpenAI text-embedding-3-small). The embedding model maps each text to a point in a high-dimensional space, where texts with similar meaning lie close together.

# Simplified visualization of Embeddings:
# "dog"    → [0.23, -0.45, 0.12, ...]   # 1536 dimensions
# "puppy"  → [0.24, -0.44, 0.11, ...]   # very close!
# "cat"    → [0.20, -0.40, 0.15, ...]   # close (an animal)
# "bank"   → [-0.12, 0.89, -0.34, ...]  # far away

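Similarity Metrics

Cosine

Measures the angle between Vectors and ignores their length. Well suited to NLP and Semantic Search. A sensible default.

Dot Product

Slightly cheaper to compute than Cosine. On Vectors normalized to unit length it ranks results exactly as Cosine does; on unnormalized Vectors, magnitude also affects the score. Common in Recommendation Systems.

Euclidean

Straight-line geometric distance. Suited to geographic and numeric data. Less common in NLP.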
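
To make the three metrics concrete, here is a minimal pure-Python sketch that computes them on short toy vectors. The three-dimensional values are illustrative stand-ins taken from the simplified visualization earlier, not real embeddings, and the helper names are hypothetical.

```python
import math

def dot(a: list[float], b: list[float]) -> float:
    """Dot product: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))

def norm(a: list[float]) -> float:
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))

def normalize(a: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    n = norm(a)
    return [x / n for x in a]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by both lengths."""
    return dot(a, b) / (norm(a) * norm(b))

def euclidean(a: list[float], b: list[float]) -> float:
    """Straight-line distance (smaller means more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy 3-dimensional vectors (assumed values, for illustration only)
dog = [0.23, -0.45, 0.12]
puppy = [0.24, -0.44, 0.11]
bank = [-0.12, 0.89, -0.34]

print("cosine dog/puppy:", round(cosine(dog, puppy), 4))
print("cosine dog/bank: ", round(cosine(dog, bank), 4))
print("euclidean dog/puppy:", round(euclidean(dog, puppy), 4))

# On unit-length vectors, dot product and cosine give the same number
print("dot on normalized:", round(dot(normalize(dog), normalize(puppy)), 4))
```

The last line illustrates why the choice between Cosine and Dot Product mostly matters when vectors are not normalized.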

Index Types: Serverless vs Pod

Serverless: Pay-per-use, no infrastructure to manage, fits most projects. Recommended for getting started.
Pod-based: Reserved capacity and typically lower latency, suited to Production workloads with a strict SLA.

Setup and First Index

Getting started with Pinecone is simple: sign up at app.pinecone.io, get an API Key, and install the SDK.

pip install pinecone openai python-dotenv

In the console (app.pinecone.io, Create Index), the index is defined with these settings:

Index Name	automation4mi
Dimensions	1536
Metric	cosine
Spec	Serverless, AWS us-east-1

from pinecone import Pinecone, ServerlessSpec
import os
from dotenv import load_dotenv

load_dotenv()  # loads PINECONE_API_KEY from a local .env file
pc = Pinecone(api_key=os.getenv("PINECONE_API_KEY"))

# Create the Index once; create_index fails if the name already exists
if "automation4mi" not in pc.list_indexes().names():
    pc.create_index(
        name="automation4mi",
        dimension=1536,        # OpenAI text-embedding-3-small
        metric="cosine",       # cosine / euclidean / dotproduct
        spec=ServerlessSpec(
            cloud="aws",
            region="us-east-1"
        )
    )

# Connect to the Index
index = pc.Index("automation4mi")
print(index.describe_index_stats())

Upsert: Adding Vectors to the Index

Upsert = "Insert or Update": if a Vector with the same ID already exists it is overwritten; otherwise it is added. Send Vectors in batches rather than one at a time; about 100 per request is a reasonable starting point.

from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def embed(text: str) -> list[float]:
    return client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    ).data[0].embedding

# Prepare the Vectors
vectors = [
    {
        "id": "doc1",
        "values": embed("מדריך n8n לאוטומציה"),  # Hebrew for "n8n automation guide"
        "metadata": {
            "source": "guide-n8n",
            "title": "n8n Automation Guide",
            "category": "automation",
            "lang": "he"
        }
    },
    {
        "id": "doc2",
        "values": embed("Stable Diffusion ComfyUI workflows"),
        "metadata": {
            "source": "guide-sd",
            "title": "Stable Diffusion Guide",
            "category": "image-ai",
            "lang": "he"
        }
    },
    {
        "id": "doc3",
        "values": embed("Pinecone Vector Database RAG"),
        "metadata": {
            "source": "pinecone-guide",
            "title": "Pinecone Guide",
            "category": "vector-db",
            "lang": "he"
        }
    }
]

# Upsert into a specific namespace
index.upsert(vectors=vectors, namespace="guides")
print(f"Upserted {len(vectors)} vectors")
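
Because Upsert overwrites by ID, stable IDs make re-ingestion idempotent. The sketch below shows one possible scheme, assuming each vector comes from a known source and chunk position. The `make_vector_id` helper and the in-memory `fake_index` dict are hypothetical stand-ins, not part of the Pinecone SDK.

```python
import hashlib

def make_vector_id(source: str, chunk_index: int) -> str:
    """Derive a stable, ASCII-only ID from the source name and chunk position."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()[:12]
    return f"{digest}-{chunk_index}"

# A plain dict stands in for the Index: assigning to an existing key
# overwrites it, which mirrors the insert-or-update behaviour of Upsert.
fake_index: dict[str, dict] = {}

def fake_upsert(vector: dict) -> None:
    fake_index[vector["id"]] = vector

for attempt in range(2):  # ingest the same document twice
    fake_upsert({
        "id": make_vector_id("guide-n8n", 0),
        "values": [0.1, 0.2, 0.3],
        "metadata": {"source": "guide-n8n", "attempt": attempt},
    })

print(len(fake_index))  # prints 1: the second run replaced the first
```

With random IDs instead, every re-run of an ingestion job would add duplicate Vectors.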

Batch Upsert for a Large Corpus

import time

def batch_upsert(documents: list[dict], namespace: str = "default", batch_size: int = 100):
    """Upsert documents in batches for better throughput."""
    vectors = []
    for doc in documents:
        embedding = embed(doc["text"])
        vectors.append({
            "id": doc["id"],
            "values": embedding,
            "metadata": {
                "text": doc["text"][:1000],  # keep up to 1000 characters in metadata
                **doc.get("metadata", {})
            }
        })

    # Send in batches
    total_batches = (len(vectors) + batch_size - 1) // batch_size
    for i in range(0, len(vectors), batch_size):
        batch = vectors[i:i + batch_size]
        index.upsert(vectors=batch, namespace=namespace)
        print(f"Upserted batch {i // batch_size + 1}/{total_batches}")
        time.sleep(0.1)  # brief pause to stay clear of rate limits
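
For corpora too large to hold in memory, the batching step can be written as a generator so that only one batch exists at a time. This is a minimal sketch; `chunked` is a hypothetical helper name, and the documents here are placeholders.

```python
from itertools import islice
from typing import Iterable, Iterator

def chunked(items: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch

# Placeholder documents; in practice each dict would carry id, values, metadata
docs = [{"id": f"doc{i}"} for i in range(5)]
print([len(batch) for batch in chunked(docs, 2)])  # prints [2, 2, 1]
```

Each yielded batch could then be passed to `index.upsert(vectors=batch, namespace=...)` in place of the slicing loop.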

Store the Text in Metadata

Always store the original text in metadata["text"], so that at Query time you can return the actual passage. Also add source, date, and category fields for later metadata filtering.

Query: Semantic Search

At Query time, you take a question, turn it into an Embedding, and ask Pinecone for the nearest Vectors.

def semantic_search(query: str, top_k: int = 5, namespace: str = "guides") -> list[dict]:
    """Return the documents most similar to the query."""
    query_embedding = embed(query)

    results = index.query(
        vector=query_embedding,
        top_k=top_k,
        include_metadata=True,
        namespace=namespace
    )

    matches = []
    for match in results.matches:
        metadata = match.metadata or {}  # a match may carry no metadata
        matches.append({
            "text":   metadata.get("text", ""),
            "title":  metadata.get("title", ""),
            "source": metadata.get("source", ""),
            "score":  round(match.score, 3),
            "id":     match.id
        })
    return matches

# Usage:
results = semantic_search("How do I configure a Webhook in n8n?")
for r in results:
    print(f"[{r['score']}] {r['title']}: {r['text'][:80]}...")

Pinecone Query Results

Query: "How do I configure a Webhook in n8n?"

Score	Title	Snippet
0.94	n8n Automation Guide	Webhook triggers in n8n require...
0.81	Make.com vs n8n	Both platforms support webhooks...
0.62	Automation Guide	Less relevant...

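Rule of Thumb for Scores

Above 0.85 = highly relevant. 0.70–0.85 = relevant. 0.55–0.70 = marginal. Below 0.55 = not relevant. These cut-offs are approximate and shift with the embedding model and metric, so calibrate them on your own data. Always set a threshold, and do not pass low-scoring results to the LLM.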
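
The thresholding step can be sketched as a small filter over the list of match dicts, in the shape returned by semantic_search. The function name `filter_by_score` and the default of 0.70 are assumptions for illustration; the sample scores mirror the results shown earlier.

```python
def filter_by_score(matches: list[dict], threshold: float = 0.70) -> list[dict]:
    """Keep matches at or above the threshold, best score first."""
    kept = [m for m in matches if m["score"] >= threshold]
    return sorted(kept, key=lambda m: m["score"], reverse=True)

# Sample matches (illustrative values only)
sample = [
    {"title": "Automation Guide", "score": 0.62},
    {"title": "n8n Automation Guide", "score": 0.94},
    {"title": "Make.com vs n8n", "score": 0.81},
]

relevant = filter_by_score(sample)
print([m["title"] for m in relevant])
# prints ['n8n Automation Guide', 'Make.com vs n8n']
```

Only the filtered list would then be used to build the context for the LLM.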

Metadata Filtering

Combine vector search with Metadata filters to search only a defined subset of documents, which improves precision.

# Filter by Metadata
query_embedding = embed("automation tools")  # any query text
results = index.query(
    vector=query_embedding,
    top_k=10,
    filter={
        "lang":     {"$eq": "he"},
        "category": {"$in": ["automation", "ai"]},
    },
    include_metadata=True,
    namespace="guides"
)

# Available operators:
# $eq, $ne      - equal / not equal
# $gt, $gte     - greater than / greater than or equal
# $lt, $lte     - less than / less than or equal
# $in, $nin     - in list / not in list
# $exists       - field exists
# $and, $or     - logical combinators

Operator	Meaning	Example
$eq	Equal to	"lang": {"$eq": "he"}
$in	In list	"cat": {"$in": ["a","b"]}
$gte	Greater than or equal	"year": {"$gte": 2024}
$exists	Field exists	"tag": {"$exists": True}
$and	All conditions hold	{"$and": [{...},{...}]}
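
When filters are assembled from optional user input, it can help to build the dict programmatically rather than by hand. The sketch below is one way to do that using the operators listed above; `build_filter` and its parameters are hypothetical, and the `year` field is assumed to exist in your metadata.

```python
from typing import Optional

def build_filter(
    lang: Optional[str] = None,
    categories: Optional[list[str]] = None,
    min_year: Optional[int] = None,
) -> Optional[dict]:
    """Combine the provided conditions with $and; return None if there are none."""
    clauses = []
    if lang is not None:
        clauses.append({"lang": {"$eq": lang}})
    if categories:
        clauses.append({"category": {"$in": list(categories)}})
    if min_year is not None:
        clauses.append({"year": {"$gte": min_year}})

    if not clauses:
        return None          # caller can omit the filter entirely
    if len(clauses) == 1:
        return clauses[0]    # no need to wrap a single condition
    return {"$and": clauses}

print(build_filter(lang="he", categories=["automation", "ai"]))
```

The result can be passed as the `filter` argument of `index.query`.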

Namespaces: Multi-tenant Organization

A Namespace is a logical partition inside an Index. Each Namespace is independent: you can query, upsert, and delete within it without affecting other Namespaces. This makes Namespaces a natural fit for Multi-tenant applications.

# Multi-tenant: one Namespace per user
index.upsert(vectors=user_a_docs, namespace="user-alice-123")
index.upsert(vectors=user_b_docs, namespace="user-bob-456")

# Query a specific Namespace; other tenants' Vectors are not searched
results_alice = index.query(
    vector=query_embedding,
    namespace="user-alice-123",
    top_k=5
)

# Per-Namespace statistics
stats = index.describe_index_stats()
for ns, info in stats.namespaces.items():
    print(f"{ns}: {info.vector_count} vectors")

# Delete all data in a namespace
index.delete(delete_all=True, namespace="user-alice-123")

Namespace vs Index: When to Use Each

Namespace: for logical isolation (multi-tenant, environments, languages). Separate Index: when the Embedding model is different, or when you need a different dimension or metric. One Index with many Namespaces is usually far cheaper than many Indexes.

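Hybrid Search: Dense + Sparse

Hybrid Search combines Dense vectors (Embeddings, semantic) with Sparse vectors (BM25, keywords). Dense is strong on meaning; Sparse is strong on exact terms. Combining them gives the best of both. Note that a Pinecone Index storing sparse values alongside dense ones must use the dotproduct metric, so this section needs a separate Index from the cosine one created earlier.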

from pinecone_text.sparse import BM25Encoder  # pip install pinecone-text

# Fit BM25 on your own corpus (your_documents: a list of {"id", "text"} dicts)
bm25 = BM25Encoder()
bm25.fit([doc["text"] for doc in your_documents])

# Upsert with Sparse + Dense values (the Index must use the dotproduct metric)
def upsert_hybrid(doc: dict):
    dense = embed(doc["text"])
    sparse = bm25.encode_documents(doc["text"])

    index.upsert(vectors=[{
        "id": doc["id"],
        "values": dense,
        "sparse_values": sparse,
        "metadata": {"text": doc["text"]}
    }])

# Hybrid Query
def hybrid_query(query: str, alpha: float = 0.5):
    """alpha=1: dense only, alpha=0: sparse only, 0.5: balanced"""
    dense_vec  = embed(query)
    sparse_vec = bm25.encode_queries(query)

    # Manual weighting of the sparse part by (1 - alpha)
    def scale(sparse, alpha):
        return {"indices": sparse["indices"],
                "values":  [v * (1-alpha) for v in sparse["values"]]}

    return index.query(
        vector=[v * alpha for v in dense_vec],
        sparse_vector=scale(sparse_vec, alpha),
        top_k=5,
        include_metadata=True
    )

When Is Hybrid Search Worth It?

Code and technical terms: for a query like "TypeError Python", Dense struggles while Sparse does very well.
Distinctive proper nouns: product names, people's names, acronyms.
Mixed language: Hebrew text with English terms.
Search-as-you-type: searching while typing, with partial words.
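
The effect of alpha is easier to see on toy numbers. The sketch below assumes the hybrid score is the dense dot product plus the sparse dot product, with the query side scaled by alpha and (1 - alpha). All function names and values are illustrative, and nothing here calls Pinecone.

```python
def hybrid_scale(dense: list[float], sparse: dict, alpha: float) -> tuple[list[float], dict]:
    """Scale the dense query by alpha and the sparse query by (1 - alpha)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": list(sparse["indices"]),
        "values": [v * (1 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse

def sparse_dot(query: dict, doc: dict) -> float:
    """Dot product of two sparse vectors, matching entries by index."""
    doc_values = dict(zip(doc["indices"], doc["values"]))
    return sum(v * doc_values.get(i, 0.0)
               for i, v in zip(query["indices"], query["values"]))

def hybrid_score(q_dense, q_sparse, d_dense, d_sparse, alpha: float) -> float:
    """Assumed scoring: dense dot product plus sparse dot product."""
    scaled_dense, scaled_sparse = hybrid_scale(q_dense, q_sparse, alpha)
    dense_part = sum(a * b for a, b in zip(scaled_dense, d_dense))
    return dense_part + sparse_dot(scaled_sparse, d_sparse)

# Toy query and document (assumed values)
q_dense, d_dense = [1.0, 0.0], [0.5, 0.5]
q_sparse = {"indices": [3, 7], "values": [1.0, 2.0]}
d_sparse = {"indices": [7, 9], "values": [0.5, 1.0]}

for alpha in (0.0, 0.5, 1.0):
    print(alpha, hybrid_score(q_dense, q_sparse, d_dense, d_sparse, alpha))
```

At alpha=1 only the dense similarity contributes, and at alpha=0 only the keyword overlap does, which matches the docstring of hybrid_query.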

Full RAG Pipeline with LangChain

Here is a complete RAG Pipeline that connects Pinecone with LangChain and GPT-4o, structured as a starting point for production:

from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain.chains import RetrievalQA  # legacy chain in recent LangChain releases
from langchain_core.prompts import PromptTemplate
import os

# 1. Embeddings + VectorStore (API keys are read from the environment)
embeddings  = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = PineconeVectorStore(
    index_name="automation4mi",
    embedding=embeddings,
    namespace="guides"
)

# 2. Retriever
retriever = vectorstore.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={
        "k": 4,
        "score_threshold": 0.70,
        "namespace": "guides"
    }
)

# 3. Custom Prompt
prompt = PromptTemplate(
    input_variables=["context", "question"],
    template="""You are the AI assistant of Automation4MI.
Answer the question using only the context below.
If there is not enough information, say so explicitly.
Answer in Hebrew.

Context:
{context}

Question: {question}

Answer:"""
)

# 4. QA Chain
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-4o", temperature=0.1),
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": prompt}
)

# 5. Usage
result = qa.invoke({"query": "How do I use ControlNet?"})
print(result["result"])
print("\nSources:")
for doc in result["source_documents"]:
    # Source documents carry the stored metadata; the similarity score is not part of it
    print(f"  - {doc.metadata.get('title', 'untitled')} ({doc.metadata.get('source', '?')})")

Manual RAG (without LangChain)

from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment

def rag_query(question: str, namespace: str = "guides") -> str:
    # Step 1: embed the question
    q_embedding = embed(question)

    # Step 2: search Pinecone
    results = index.query(
        vector=q_embedding,
        top_k=4,
        include_metadata=True,
        namespace=namespace
    )

    # Step 3: build the context from matches above the score threshold
    context = "\n\n---\n\n".join([
        f"[{m.metadata.get('title','')}]\n{m.metadata.get('text','')}"
        for m in results.matches if m.score > 0.65
    ])

    if not context:
        return "No relevant information was found in the knowledge base."

    # Step 4: send to the LLM
    response = openai_client.chat.completions.create(
        model="gpt-4o",
        temperature=0.1,
        messages=[
            {"role": "system", "content": "Answer in Hebrew, using only the provided context."},
            {"role": "user",   "content": f"Context:\n{context}\n\nQuestion: {question}"}
        ]
    )
    return response.choices[0].message.content

print(rag_query("What is the difference between n8n and Make?"))
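
Step 3 can also be given a size budget so that a few long passages do not overflow the prompt. The sketch below works on plain dicts rather than Pinecone match objects; `build_context` and the 4000-character default are assumptions, not part of the pipeline above.

```python
def build_context(matches: list[dict], min_score: float = 0.65, max_chars: int = 4000) -> str:
    """Join the best-scoring passages until the score or size budget is exhausted."""
    parts: list[str] = []
    used = 0
    for m in sorted(matches, key=lambda m: m["score"], reverse=True):
        if m["score"] <= min_score:
            break  # remaining matches score even lower
        block = f"[{m.get('title', '')}]\n{m.get('text', '')}"
        if used + len(block) > max_chars:
            break  # adding this passage would exceed the budget
        parts.append(block)
        used += len(block)
    return "\n\n---\n\n".join(parts)

# Illustrative matches
matches = [
    {"title": "B", "text": "beta", "score": 0.60},
    {"title": "A", "text": "alpha", "score": 0.90},
]
print(build_context(matches))
```

An empty return value signals that nothing relevant was found, so the caller can skip the LLM call as rag_query does.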

5 Practical Projects

1. Site Search Engine

Embed every article; user queries go through semantic search and return relevant results. This typically handles paraphrased or loosely worded queries far better than plain keyword search.

Topics: Upsert, Query, Metadata

2. Q&A Bot over Documentation

Chunk all the documentation, Upsert it to Pinecone, and run a RAG Pipeline with GPT-4o. Users ask in natural language and get answers with citations.
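
The chunking step could look like the sketch below: fixed-size character windows with a small overlap so that sentences cut at a boundary still appear whole in one chunk. The function name, the 800-character size, and the 100-character overlap are assumptions; real pipelines often split on headings or sentences instead.

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():  # skip windows that contain only whitespace
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks

print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
# prints ['abcd', 'defg', 'ghij']
```

Each chunk would then get its own ID and Embedding before being upserted.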

Topics: RAG, LangChain, Citations

3. Recommendation System

Embed a user's reading or purchase history, search for similar items, and surface "you may also like" recommendations. This can work better than Collaborative Filtering on small corpora, where interaction data is sparse.

Topics: Embeddings, Namespaces

4. Duplicate Detection

Embed each new document and query Pinecone; a score > 0.95 suggests a probable duplicate. Useful for knowledge bases, support tickets, and product reviews.

Topics: Query, Score Threshold

5. Semantic CRM

Embed all meetings, emails, and calls with customers. Then ask: "Which customers mentioned similar problems?" or "Who is a good fit for a new offer?".

Topics: Namespaces, Metadata Filter

Cheat Sheet: Everything in One Place

Index Configuration

Embedding model	Dimensions	Recommended Metric
text-embedding-3-small	1536	cosine
text-embedding-3-large	3072	cosine
text-embedding-004 (Google)	768	cosine
nomic-embed-text	768	cosine

Main SDK Methods

Method	Use
pc.create_index()	Create a new Index
index.upsert()	Add or update Vectors
index.query()	Semantic search
index.delete()	Delete Vectors
index.fetch()	Fetch by ID
index.describe_index_stats()	Index statistics
index.update()	Update the values or Metadata of an existing Vector

Pricing (April 2026)

Plan	Indexes	Vectors	Price
Free	1	100K	$0
Serverless	Unlimited	Unlimited	Usage-based
Enterprise	Unlimited	Billions	Custom

Metadata Best Practices

Store the original text: always keep metadata["text"] so it can be returned at Query time.
Add a source: where the document came from, for attribution and filtering.
Timestamps: created_at and updated_at, for filtering by date.
Do not store large Vectors or other large payloads in Metadata: Metadata is limited to 40KB per Vector.
Normalize values: category in lowercase, lang as an ISO 639-1 code.
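
The practices above can be bundled into one helper that every ingestion path calls. This is a sketch under assumptions: `build_metadata` is a hypothetical name, the size check measures the JSON-encoded payload as an approximation of how the 40KB limit is counted, and created_at is stored as Unix seconds so that numeric range filters can compare it.

```python
import json
import time

MAX_METADATA_BYTES = 40 * 1024  # the 40KB per-Vector limit mentioned above

def build_metadata(text: str, source: str, category: str, lang: str,
                   max_text_chars: int = 1000) -> dict:
    """Build a normalized metadata dict and reject oversized payloads."""
    metadata = {
        "text": text[:max_text_chars],        # original text for retrieval
        "source": source,                     # attribution and filtering
        "category": category.strip().lower(), # normalized to lowercase
        "lang": lang.strip().lower(),         # ISO 639-1 code such as "he"
        "created_at": int(time.time()),       # numeric timestamp
    }
    size = len(json.dumps(metadata, ensure_ascii=False).encode("utf-8"))
    if size > MAX_METADATA_BYTES:
        raise ValueError(f"metadata is about {size} bytes, over the 40KB limit")
    return metadata

md = build_metadata("Webhook triggers in n8n require...", "guide-n8n", " Automation ", "HE")
print(md["category"], md["lang"])  # prints: automation he
```

Truncating the text field keeps typical passages well under the limit, while the explicit check catches unusually large extra fields.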

Useful Links

Pinecone documentation, Pinecone Console, Python SDK

Next Step

After Pinecone, continue to the full RAG Systems guide, connect Pinecone to n8n for automation, or read about Google AI Studio to use Gemini instead of OpenAI.
