Security Model

Authentication, authorization, data isolation, and consent architecture.

Authentication Overview

Epitome supports two authentication paths: OAuth sessions for human users (dashboard) and API keys for AI agents (MCP and REST). Both paths resolve to a user identity, but they have different authorization behaviors.

OAuth 2.0 (GitHub, Google)

The dashboard uses the standard OAuth 2.0 authorization code flow. Users sign in via GitHub or Google. On successful authentication, a secure, HTTP-only session cookie is set. The session maps to a user record in the shared.sessions table.

Session-authenticated requests bypass the consent system because the user is directly interacting with their own data. There is no need for an intermediary consent grant when the data owner is the one making the request.

API Keys (AI Agents)

Each AI agent is issued an API key (prefixed epi_live_ for production or epi_test_ for test environments). The key is passed in the Authorization: Bearer header. Keys are hashed with Argon2 before storage — the plaintext key is shown only once at creation time.
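As a minimal sketch of the key format described above, the helper below extracts the token from an Authorization: Bearer header and classifies it by prefix. The name parseApiKey and the returned shape are hypothetical and not part of Epitome's API. Hashing and lookup are not shown.

```typescript
// Hypothetical helper: names and return shape are illustrative only.
type KeyEnvironment = "production" | "test";

interface ParsedApiKey {
  environment: KeyEnvironment;
  key: string; // full plaintext key, to be hashed and compared server-side
}

function parseApiKey(authorizationHeader: string | undefined): ParsedApiKey | null {
  if (!authorizationHeader) return null;

  // Expect "Bearer <token>". HTTP auth scheme names are case-insensitive.
  const match = /^Bearer\s+(\S+)$/i.exec(authorizationHeader.trim());
  if (!match) return null;

  const key = match[1] ?? "";
  if (key.startsWith("epi_live_")) return { environment: "production", key };
  if (key.startsWith("epi_test_")) return { environment: "test", key };
  return null; // unknown prefix: treat the key as invalid
}
```

Rejecting unknown prefixes before any database lookup keeps malformed tokens off the hashing path, which matters because Argon2 verification is deliberately expensive.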

API key requests are subject to the consent system. Before an agent can access a resource, the user must grant explicit consent via the dashboard.

text

Authentication Flow:

Request arrives
  │
  ├── Has session cookie? → Validate session → Session auth (bypass consent)
  │
  └── Has Authorization header? → Validate API key → Agent auth (check consent)
        │
        ├── Key valid? → Resolve user_id + agent_id
        └── Key invalid? → 401 Unauthorized
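The flow above can be sketched as a single dispatch function. This is an illustration under stated assumptions: the IncomingRequest, AuthResult, and Validators types are hypothetical, the validators are injected rather than real, and the fall-through from an invalid session cookie to the API key path is a guess, since the diagram does not cover that case.

```typescript
// Illustrative only: the types and validator callbacks are assumptions.
interface IncomingRequest {
  sessionCookie?: string;
  authorizationHeader?: string;
}

type AuthResult =
  | { kind: "session"; userId: string; checkConsent: false }
  | { kind: "agent"; userId: string; agentId: string; checkConsent: true }
  | { kind: "unauthorized"; status: 401 };

interface Validators {
  validateSession(cookie: string): { userId: string } | null;
  validateApiKey(header: string): { userId: string; agentId: string } | null;
}

function resolveAuth(req: IncomingRequest, validators: Validators): AuthResult {
  // Session cookie first: the data owner bypasses the consent system.
  if (req.sessionCookie) {
    const session = validators.validateSession(req.sessionCookie);
    if (session) return { kind: "session", userId: session.userId, checkConsent: false };
  }

  // Otherwise try the API key path, which is subject to consent checks.
  if (req.authorizationHeader) {
    const agent = validators.validateApiKey(req.authorizationHeader);
    if (agent) {
      return { kind: "agent", userId: agent.userId, agentId: agent.agentId, checkConsent: true };
    }
  }

  return { kind: "unauthorized", status: 401 };
}
```

Encoding checkConsent in the result type means downstream handlers cannot forget which path a request took.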

Schema Isolation

Epitome uses PostgreSQL schemas to provide hard data isolation between users. Unlike row-level security (RLS), which relies on runtime policies to filter rows in shared tables, schema isolation gives each user a separate namespace with its own tables.

This means a bug in a query (a missing WHERE clause, an injected fragment, a malformed join) that references tables by unqualified name cannot accidentally return another user's data. The search path is locked to the authenticated user's schema within the transaction.

typescript

// How schema isolation is enforced in every database operation
export async function withUserSchema<T>(
  userId: string,
  fn: (tx: Transaction) => Promise<T>
): Promise<T> {
  const schemaName = `user_${userId}`;

  return sql.begin(async (tx) => {
    // Lock the search path to this user's schema for the transaction
    await tx`SET LOCAL search_path = ${tx(schemaName)}, public`;

    // All queries in fn() now resolve to user's schema
    return fn(tx);
  });
}

Key properties of schema isolation:

Each user's data lives in a separate PostgreSQL schema (e.g., user_abc123)
SET LOCAL search_path is transaction-scoped: it reverts at commit or rollback, so it cannot leak into later transactions on a pooled connection
No composite indexes with user_id needed — tables are inherently single-user
Per-user backups are trivial: pg_dump -n user_abc123
User deletion is a clean DROP SCHEMA user_abc123 CASCADE
Unqualified table names in dynamic queries resolve only within the user's schema, which limits cross-user exposure even if a query is injected
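Because the schema name is derived from the user ID, it is worth validating the ID before it reaches an identifier position. The sketch below is one way to do that. The helper names and the allowed-character rule are assumptions, not Epitome's actual implementation.

```typescript
// Hypothetical helpers; the allowed-character rule is an assumption.
function schemaNameFor(userId: string): string {
  // Accept only lowercase letters, digits, and underscores so the ID can
  // never carry quotes, dots, or whitespace into an identifier.
  if (!/^[a-z0-9_]{1,50}$/.test(userId)) {
    throw new Error(`Invalid user id: ${JSON.stringify(userId)}`);
  }
  return `user_${userId}`;
}

function quoteIdentifier(name: string): string {
  // SQL identifier quoting: wrap in double quotes and double any embedded quote.
  return `"${name.replace(/"/g, '""')}"`;
}

function searchPathStatement(userId: string): string {
  return `SET LOCAL search_path = ${quoteIdentifier(schemaNameFor(userId))}, public`;
}
```

The 50-character cap keeps the full schema name under PostgreSQL's 63-byte identifier limit. Validation and quoting are layered on purpose, so that either one alone would still block a hostile ID.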

Consent System

The consent system controls which resources each AI agent can access. This gives users fine-grained control over their data. An agent's first request to a new resource type will fail with a CONSENT_REQUIRED error until the user grants permission via the dashboard.

Resource Types

Resource	Read Tools	Write Tools
profile	read_profile	update_profile
tables	query_table	insert_record
vectors	search_memory	store_memory
graph	query_graph, get_entity_neighbors	(auto-managed by extraction pipeline)
activity	(logged automatically)	log_activity
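The table above can be read as a lookup from tool name to the consent it requires. A minimal sketch follows. The entries mirror the table, while the names TOOL_CONSENT and requiredConsent are hypothetical.

```typescript
type Resource = "profile" | "tables" | "vectors" | "graph" | "activity";
type Access = "read" | "write";

interface ConsentRequirement {
  resource: Resource;
  access: Access;
}

// Mirrors the resource table; the constant and function names are hypothetical.
const TOOL_CONSENT = new Map<string, ConsentRequirement>([
  ["read_profile", { resource: "profile", access: "read" }],
  ["update_profile", { resource: "profile", access: "write" }],
  ["query_table", { resource: "tables", access: "read" }],
  ["insert_record", { resource: "tables", access: "write" }],
  ["search_memory", { resource: "vectors", access: "read" }],
  ["store_memory", { resource: "vectors", access: "write" }],
  ["query_graph", { resource: "graph", access: "read" }],
  ["get_entity_neighbors", { resource: "graph", access: "read" }],
  ["log_activity", { resource: "activity", access: "write" }],
]);

// Returns the consent a tool needs, or null for a tool not in the table.
function requiredConsent(tool: string): ConsentRequirement | null {
  return TOOL_CONSENT.get(tool) ?? null;
}
```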

Hierarchical Matching

Consent uses hierarchical matching. A consent grant for a parent resource automatically covers all child resources. For example, granting consent for graph also grants access to graph/stats, graph/query, and graph/entities.

typescript

// Consent check pseudocode
function hasConsent(agentId: string, resource: string, permission: string): boolean {
  // Check exact match first
  if (findConsent(agentId, resource, permission)) return true;

  // Check parent resources (hierarchical matching)
  // "graph/stats" → check "graph" → check "*"
  const parts = resource.split('/');
  while (parts.length > 1) {
    parts.pop();
    const parent = parts.join('/');
    if (findConsent(agentId, parent, permission)) return true;
  }

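  // Fall back to a wildcard grant, which covers every resource
  return Boolean(findConsent(agentId, '*', permission));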
}
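For a version that runs on its own, the sketch below replaces the lookup with an in-memory list of grants. It assumes that a read_write grant satisfies both read and write requests, as the consent flow suggests, and that "*" acts as a wildcard. The Grant shape and function names are hypothetical.

```typescript
// Illustrative in-memory version; the grant store and names are assumptions.
type Permission = "read" | "write" | "read_write";

interface Grant {
  agentId: string;
  resource: string;
  permission: Permission;
}

// Lists the resource itself, then each parent, then the wildcard.
// "graph/stats" yields ["graph/stats", "graph", "*"].
function candidateResources(resource: string): string[] {
  const parts = resource.split("/");
  const candidates: string[] = [];
  while (parts.length > 0) {
    candidates.push(parts.join("/"));
    parts.pop();
  }
  candidates.push("*");
  return candidates;
}

function grantCovers(granted: Permission, requested: "read" | "write"): boolean {
  return granted === requested || granted === "read_write";
}

function hasConsentIn(
  grants: Grant[],
  agentId: string,
  resource: string,
  requested: "read" | "write",
): boolean {
  const candidates = candidateResources(resource);
  return grants.some(
    (g) =>
      g.agentId === agentId &&
      candidates.includes(g.resource) &&
      grantCovers(g.permission, requested),
  );
}
```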

Consent Flow

text

Agent calls store_memory (requires 'vectors' write consent)
  │
  ├── Check agent_consent table for (agent_id, 'vectors', 'write'|'read_write')
  │
  ├── Found? → Proceed with request
  │
  └── Not found? → Return 403 CONSENT_REQUIRED
        │
        └── Error includes: { resource: 'vectors', permission: 'write',
              message: "Grant consent in the Epitome dashboard" }
              │
              └── User opens dashboard → Agents page → Grant 'vectors' write to agent
                    │
                    └── Next request succeeds
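A sketch of both ends of this flow follows. The source names only the resource, permission, and message fields of the error, so the surrounding envelope here (an error object with a code, borrowed from the rate limit response shape) is an assumption, as are the function names.

```typescript
// Hypothetical shapes: only resource, permission, and message come from the docs.
interface ConsentRequiredBody {
  error: {
    code: "CONSENT_REQUIRED";
    resource: string;
    permission: "read" | "write";
    message: string;
  };
}

// Server side: build the 403 response for a missing grant.
function consentRequired(
  resource: string,
  permission: "read" | "write",
): { status: 403; body: ConsentRequiredBody } {
  return {
    status: 403,
    body: {
      error: {
        code: "CONSENT_REQUIRED",
        resource,
        permission,
        message: "Grant consent in the Epitome dashboard",
      },
    },
  };
}

// Agent side: decide whether a failed call should be surfaced to the user
// rather than retried automatically.
function needsUserAction(status: number, body: unknown): boolean {
  if (status !== 403 || typeof body !== "object" || body === null) return false;
  const error = (body as { error?: { code?: unknown } }).error;
  return error?.code === "CONSENT_REQUIRED";
}
```

Retrying a CONSENT_REQUIRED failure cannot succeed until the user acts, so an agent is better off relaying the message than looping.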

SQL Sandbox

Some features (like advanced table queries) allow parameterized SQL-like operations. These are sandboxed to prevent dangerous operations. The sandbox enforces:

Read-only by default: Only SELECT queries are allowed through query endpoints. Write operations go through dedicated insert/update/delete endpoints.
Schema-locked: Queries always run with the user's schema search path. There is no way to access another user's schema or the shared schema tables.
No DDL: CREATE, ALTER, DROP, TRUNCATE and other schema-modifying statements are blocked.
No system functions: References to pg_catalog and information_schema, and calls to system functions, are blocked.
Statement timeout: A per-statement timeout (default 5 seconds) prevents runaway queries.
Parameterized queries: All user-provided values are passed as parameters, never interpolated into SQL strings.
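To make the rules above concrete, here is a deliberately naive keyword screen. It is not how Epitome implements the sandbox, which the source does not describe. A production sandbox would parse the statement rather than pattern-match it, and this version will reject some harmless queries, such as one with a column literally named update. The names screenQuery and SandboxVerdict are hypothetical.

```typescript
// Naive illustration of the sandbox rules; a real implementation would
// use a SQL parser. All names here are hypothetical.
interface SandboxVerdict {
  allowed: boolean;
  reason?: string;
}

const BLOCKED_KEYWORDS =
  /\b(insert|update|delete|create|alter|drop|truncate|grant|revoke|copy)\b/i;
const BLOCKED_NAMESPACES = /\b(information_schema|pg_[a-z0-9_]+)\b/i;

function screenQuery(sqlText: string): SandboxVerdict {
  // Tolerate a single trailing semicolon, then refuse any other one.
  const trimmed = sqlText.trim().replace(/;\s*$/, "");
  if (trimmed.includes(";")) {
    return { allowed: false, reason: "multiple statements" };
  }
  if (!/^select\b/i.test(trimmed)) {
    return { allowed: false, reason: "only SELECT is allowed" };
  }
  if (BLOCKED_KEYWORDS.test(trimmed)) {
    return { allowed: false, reason: "write or DDL keyword" };
  }
  if (BLOCKED_NAMESPACES.test(trimmed)) {
    return { allowed: false, reason: "system catalog or function" };
  }
  return { allowed: true };
}
```

A screen like this is only a first line of defense. The schema-locked search path, the statement timeout, and database-level privileges still have to hold if it is bypassed.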

Rate Limiting

Rate limiting protects the service from abuse and ensures fair resource allocation. Limits are applied per user or per agent, depending on the operation.

Operation	Limit	Window	Scope
General API requests	100 requests	1 minute	per agent
Vector store (store_memory)	30 requests	1 minute	per agent
Vector search (search_memory)	60 requests	1 minute	per agent
Profile updates	10 requests	1 minute	per agent
Entity extraction (async)	20 requests	1 minute	per user

Rate limit responses include standard headers:

text

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1708185600
Retry-After: 45

{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Rate limit exceeded. Retry after 45 seconds."
  }
}
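The source does not say which algorithm enforces these limits. As one possibility, a fixed-window counter is enough to produce the headers shown above. The class below is a sketch under that assumption, with hypothetical names and an injected clock so its behavior is deterministic.

```typescript
// Fixed-window counter, chosen for brevity; Epitome's actual algorithm
// is not documented. All names are hypothetical.
interface RateDecision {
  allowed: boolean;
  headers: Record<string, string>;
}

class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowSeconds: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowSeconds: number) {
    this.limit = limit;
    this.windowSeconds = windowSeconds;
  }

  // nowSeconds is Unix time, passed in so callers control the clock.
  check(key: string, nowSeconds: number): RateDecision {
    const start = Math.floor(nowSeconds / this.windowSeconds) * this.windowSeconds;
    let entry = this.windows.get(key);
    if (!entry || entry.start !== start) {
      entry = { start, count: 0 }; // a new window resets the counter
      this.windows.set(key, entry);
    }

    const reset = start + this.windowSeconds;
    const allowed = entry.count < this.limit;
    if (allowed) entry.count += 1;

    const headers: Record<string, string> = {
      "X-RateLimit-Limit": String(this.limit),
      "X-RateLimit-Remaining": String(this.limit - entry.count),
      "X-RateLimit-Reset": String(reset),
    };
    if (!allowed) headers["Retry-After"] = String(reset - nowSeconds);
    return { allowed, headers };
  }
}
```

Keying the limiter by agent ID or user ID gives the per-agent and per-user limits in the table. A sliding window or token bucket would smooth out bursts at window boundaries, at the cost of more state.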

Audit Trail

Every significant action is logged to the activity_log table in the user's schema. This provides a complete audit trail of what agents and the user have done with their data.

What Gets Logged

MCP tool calls: Every tool invocation with tool name, parameters, and result status
REST API writes: Profile updates, record inserts/updates/deletes, vector stores
Dashboard actions: Profile edits, consent grants/revocations, entity merges, memory review resolutions
Agent events: Agent registration, key rotation, access revocation
System events: Entity extraction completions, contradiction detections

Log Entry Structure

json

{
  "id": "act_abc123",
  "agent_id": "agent_claude_desktop",
  "action": "store_memory",
  "resource": "vectors/facts",
  "details": {
    "content_preview": "My daughter Emma starts kindergarten...",
    "collection": "family",
    "entities_extracted": ["Emma", "Lincoln Elementary"],
    "confidence": 0.95
  },
  "created_at": "2026-02-17T14:30:00Z"
}

The agent_id field is null for dashboard and system actions.
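For code that reads or writes these entries, the structure can be expressed as a type. The field names below follow the JSON example. The builder function, the preview helper, and its 40-character limit are assumptions added for illustration.

```typescript
// Field names follow the JSON example; the helpers are hypothetical.
interface ActivityLogEntry {
  id: string;
  agent_id: string | null; // null for dashboard and system actions
  action: string;
  resource: string;
  details: Record<string, unknown>;
  created_at: string; // ISO 8601 timestamp in UTC
}

// Shortens content for the log so full memories are not duplicated there.
function previewOf(content: string, maxLength = 40): string {
  return content.length <= maxLength ? content : content.slice(0, maxLength) + "...";
}

function buildLogEntry(params: {
  id: string;
  agentId: string | null;
  action: string;
  resource: string;
  details: Record<string, unknown>;
  createdAt: Date;
}): ActivityLogEntry {
  return {
    id: params.id,
    agent_id: params.agentId,
    action: params.action,
    resource: params.resource,
    details: params.details,
    // toISOString always emits UTC and includes milliseconds.
    created_at: params.createdAt.toISOString(),
  };
}
```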

Retention: Activity logs are retained indefinitely for self-hosted instances. The hosted service retains logs for 90 days by default, with an option to export before deletion. The Activity page in the dashboard provides a filterable, searchable view of the complete audit trail.
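The retention rule can be sketched as a small predicate. The 90-day default comes from the paragraph above. The function name and the use of null to mean "retain indefinitely" are assumptions.

```typescript
const HOSTED_RETENTION_DAYS = 90; // hosted-service default stated above

// Returns true when an entry is older than the retention window.
// Pass retentionDays = null for self-hosted instances, which keep logs indefinitely.
function isPastRetention(
  createdAt: string,
  now: Date,
  retentionDays: number | null = HOSTED_RETENTION_DAYS,
): boolean {
  if (retentionDays === null) return false;

  const created = Date.parse(createdAt);
  if (Number.isNaN(created)) {
    throw new Error("Invalid timestamp: " + createdAt);
  }

  const ageMs = now.getTime() - created;
  return ageMs > retentionDays * 24 * 60 * 60 * 1000;
}
```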
