Semantic Brain Setup

Build an AI-powered semantic memory system with Supabase + pgvector

Store thoughts as vector embeddings. Search by meaning, not keywords. This guide walks you through the complete setup from zero to semantic search.

⏱️ ~15 minutes

🏗️ Step 1: Create Supabase Project

Go to supabase.com and create a new free project. This will be your brain's backend.

  • Go to supabase.com
  • Click "New Project"
  • Choose a name (e.g., "ghost-brain")
  • Wait for the project to finish provisioning

🗄️ Step 2: Get Your Credentials

Navigate to your project's API settings to get the credentials you'll need.

  • In your Supabase dashboard, go to Settings → API
  • Copy Project URL (looks like: https://your-project.supabase.co)
  • Copy Service Role Key (starts with eyJ...)
  • For OpenAI, go to platform.openai.com → API Keys
  • Create a new API key or copy your existing one

🗃️ Step 3: Enable pgvector Extension

pgvector is a PostgreSQL extension that adds a vector column type and similarity search operators. Without it, your database can't store or compare embeddings. Run the statement below in your project's SQL Editor.

    CREATE EXTENSION IF NOT EXISTS vector;

What this does: Enables the pgvector extension in your database if it isn't already enabled. The IF NOT EXISTS clause makes the statement safe to run more than once.

📊 Step 4: Create the memories Table

This table stores your thoughts as both text and vector embeddings. Each column has a specific purpose.

    CREATE TABLE memories (
      id SERIAL PRIMARY KEY,
      content TEXT NOT NULL,
      embedding vector(1536),
      metadata JSONB DEFAULT '{}'::jsonb,
      created_at TIMESTAMPTZ DEFAULT NOW(),
      importance VARCHAR(20) CHECK (importance IN ('low', 'medium', 'high'))
    );
  • id SERIAL PRIMARY KEY — Auto-incrementing unique ID
  • content TEXT NOT NULL — The actual text content
  • embedding vector(1536) — 1536-dimensional vector from OpenAI (semantic fingerprint)
  • metadata JSONB — Flexible data storage (topics, tags, custom fields)
  • created_at TIMESTAMPTZ — Timestamp when memory was created
  • importance VARCHAR — Priority level with validation (low/medium/high)
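
The column rules above can be mirrored on the client so that bad rows are caught before they reach the database. The following is a minimal sketch, not part of the original setup: the helper name `build_memory_row` and the default importance of "medium" are assumptions made for illustration.

```python
EMBEDDING_DIM = 1536  # matches the vector(1536) column
ALLOWED_IMPORTANCE = {"low", "medium", "high"}  # mirrors the CHECK constraint


def build_memory_row(content, embedding, importance="medium", metadata=None):
    """Hypothetical helper: validate inputs and return one `memories` row as a dict."""
    # The column is NOT NULL; rejecting blank strings as well is a choice made here.
    if content is None or not content.strip():
        raise ValueError("content must be a non-empty string")
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(embedding)}")
    if importance not in ALLOWED_IMPORTANCE:
        raise ValueError(f"importance must be one of {sorted(ALLOWED_IMPORTANCE)}")
    return {
        "content": content,
        "embedding": [float(x) for x in embedding],
        "metadata": metadata or {},  # same default as the column: an empty object
        "importance": importance,
        # id and created_at are left out so the column defaults apply
    }


row = build_memory_row(
    "Set up PostgreSQL on the new server",
    [0.0] * EMBEDDING_DIM,  # placeholder vector, not a real embedding
    importance="high",
    metadata={"tags": ["database"]},
)
print(row["content"], row["importance"], len(row["embedding"]))
```

How the row is sent to Supabase (client library or REST call) is left open here.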

🔧 Step 5: Create the RPC Search Function

This Remote Procedure Call function handles vector similarity search. It takes a query embedding and finds the most similar memories using cosine similarity.

    CREATE OR REPLACE FUNCTION search_memories(query_embedding text)
    RETURNS TABLE (
      id INTEGER,
      content TEXT,
      metadata JSONB,
      created_at TIMESTAMPTZ,
      importance VARCHAR,
      similarity FLOAT
    )
    LANGUAGE plpgsql
    AS $$
    BEGIN
      RETURN QUERY
      SELECT
        m.id,
        m.content,
        m.metadata,
        m.created_at,
        m.importance,
        1 - (m.embedding::vector <=> query_embedding::vector) as similarity
      FROM memories m
      WHERE m.embedding IS NOT NULL
      ORDER BY m.embedding::vector <=> query_embedding::vector
      LIMIT 10;
    END;
    $$;

What this does: Takes your query embedding, compares it to all stored memories using cosine distance, and returns the top 10 matches sorted by similarity.
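
Because search_memories declares query_embedding as text, the caller has to pass the vector in pgvector's text form, such as [0.1,0.2,0.3]. The sketch below shows one way to build that string and the matching call to Supabase's REST endpoint for RPC functions. It uses only the Python standard library, the helper names are hypothetical, and the request is built but not sent.

```python
import json
import urllib.request


def to_pgvector_literal(embedding):
    """Format a list of numbers as pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_search_request(supabase_url, service_key, query_embedding):
    """Hypothetical helper: build (but do not send) the RPC call for search_memories."""
    body = json.dumps({"query_embedding": to_pgvector_literal(query_embedding)})
    return urllib.request.Request(
        url=f"{supabase_url.rstrip('/')}/rest/v1/rpc/search_memories",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "apikey": service_key,
            "Authorization": f"Bearer {service_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; a real query embedding would have 1536 entries.
req = build_search_request("https://your-project.supabase.co", "eyJ-placeholder", [0.1, 0.2, 0.3])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# Sending it would look like: json.load(urllib.request.urlopen(req))
```

The response would be a JSON array with the columns listed in RETURNS TABLE, including similarity.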

⚙️ Step 6: Configure Your Environment

Create a .env file in your brain skill directory with these credentials.

    SUPABASE_URL=https://your-project.supabase.co
    SUPABASE_SERVICE_KEY=eyJ...your-service-role-key...
    OPENAI_API_KEY=sk-proj-...your-openai-key...

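Security note: Never commit .env files to git. Add .env to your .gitignore file. The service role key bypasses Row Level Security, so keep it on the server and never ship it in client-side code.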
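
A quick startup check can catch a missing or misspelled variable before any API call fails. The sketch below parses simple KEY=VALUE text and verifies that the three settings are present. The function names are hypothetical, and in practice a library such as python-dotenv is a common way to do the loading.

```python
import os

REQUIRED_VARS = ("SUPABASE_URL", "SUPABASE_SERVICE_KEY", "OPENAI_API_KEY")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_config(env_text="", environ=None):
    """Merge .env text with the process environment and require all three keys."""
    environ = os.environ if environ is None else environ
    merged = parse_env_text(env_text)
    # Real environment variables take precedence over values from the file.
    merged.update({k: environ[k] for k in REQUIRED_VARS if k in environ})
    missing = [k for k in REQUIRED_VARS if not merged.get(k)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return {k: merged[k] for k in REQUIRED_VARS}


sample = """
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_KEY=eyJ-placeholder
OPENAI_API_KEY=sk-proj-placeholder
"""
config = load_config(sample, environ={})
print(sorted(config))
```

With a real file, the text would come from reading .env, for example `load_config(open(".env").read())`.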

    graph TD
      A["Create Supabase Project<br/>Get project ID + keys"]
      B["Enable pgvector Extension<br/>Run SQL to install extension"]
      C["Create memories Table<br/>id, content, embedding vector(1536), metadata"]
      D["Create RPC Function<br/>search_memories with similarity"]
      E["Configure .env File<br/>SUPABASE_URL + SUPABASE_SERVICE_KEY<br/>+ OPENAI_API_KEY"]
      F["Capture Memories<br/>Generate embeddings + store as vectors"]
      G["Search Semantically<br/>Find by meaning, not keywords"]
      A --> B --> C --> D --> E --> F --> G
      style A fill:#6366f1,stroke:#4f46e5,color:#ffffff
      style B fill:#1a1a1a,stroke:#6366f1,color:#ffffff
      style C fill:#1a1a1a,stroke:#6366f1,color:#ffffff
      style D fill:#1a1a1a,stroke:#6366f1,color:#ffffff
      style E fill:#1a1a1a,stroke:#6366f1,color:#ffffff
      style F fill:#10b981,stroke:#059669,color:#ffffff
      style G fill:#10b981,stroke:#059669,color:#ffffff
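
The "Capture Memories" step in the flow relies on turning text into an embedding first. The sketch below builds a request to OpenAI's embeddings endpoint and reads the vector out of a response body. The guide does not name a model, so text-embedding-3-small (which returns 1536 dimensions by default) is an assumption, as are the helper names. The request is built but not sent, and the response shown is a stand-in.

```python
import json
import urllib.request

# Assumption: the guide only says "1536-dimensional vector from OpenAI".
EMBEDDING_MODEL = "text-embedding-3-small"
EMBEDDING_DIM = 1536


def build_embedding_request(api_key, text):
    """Hypothetical helper: build (but do not send) an embeddings API request."""
    body = json.dumps({"model": EMBEDDING_MODEL, "input": text})
    return urllib.request.Request(
        url="https://api.openai.com/v1/embeddings",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def extract_embedding(response_json):
    """Return the first embedding from a parsed response, checking its length."""
    vector = response_json["data"][0]["embedding"]
    if len(vector) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(vector)}")
    return vector


req = build_embedding_request("sk-proj-placeholder", "Set up PostgreSQL on the new server")
print(req.full_url)

# Stand-in for a parsed API response, with a placeholder vector.
fake_response = {"data": [{"index": 0, "embedding": [0.0] * EMBEDDING_DIM}]}
vector = extract_embedding(fake_response)
print(len(vector))
```

The resulting vector is what would go into the embedding column alongside the original text.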
💡 How Semantic Search Works:

Each embedding is a list of 1,536 numbers that represents the meaning of a piece of text. When you search, the system compares your query's embedding to every stored memory using cosine similarity. This finds results even when the words don't match: "database config" finds memories about "setting up PostgreSQL" because the meanings are similar.
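
To make the comparison concrete, here is a small sketch that ranks a few memories by cosine similarity. The three-dimensional vectors are made up for illustration; real embeddings have 1,536 dimensions and come from the embedding model.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy "embeddings", chosen so related topics point in similar directions.
memories = {
    "setting up PostgreSQL": [0.9, 0.1, 0.0],
    "grocery list for Friday": [0.0, 0.2, 0.9],
    "tuning connection pool size": [0.7, 0.4, 0.1],
}
query = [0.8, 0.2, 0.0]  # stands in for the embedding of "database config"

# Highest similarity first, which is the same order as lowest cosine distance.
ranked = sorted(memories, key=lambda t: cosine_similarity(query, memories[t]), reverse=True)
for title in ranked:
    print(f"{cosine_similarity(query, memories[title]):.3f}  {title}")
```

The two database-related memories rank above the grocery list even though none of them shares a word with the query.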

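Similarity as percentage: The expression 1 - cosine_distance turns pgvector's cosine distance into cosine similarity, a score between -1 and 1; multiply by 100 to show it as a percentage. A score of 1 (100%) means the two embeddings point in the same direction, as they do for identical text, and scores near 0 mean the texts are unrelated. As a rough guide, 50% is somewhat similar and 10% is not very similar, though useful cutoffs depend on the embedding model.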
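
The conversion can be sketched in a few lines. pgvector's <=> operator returns cosine distance in the range 0 to 2, so the similarity can be negative. Clamping negative scores to 0% for display is a choice made in this sketch, and the function names are hypothetical.

```python
def similarity_from_distance(cosine_distance):
    """Cosine similarity is 1 minus the cosine distance returned by <=>."""
    return 1.0 - cosine_distance


def as_percent(similarity):
    """Clamp to the range 0..1, then scale to a whole-number percentage."""
    return round(max(0.0, min(1.0, similarity)) * 100)


for distance in (0.0, 0.5, 1.0, 1.6):
    sim = similarity_from_distance(distance)
    print(f"distance={distance:.1f}  similarity={sim:+.2f}  shown as {as_percent(sim)}%")
```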