Sift Reference

Complete documentation

Data scientists spend a surprising amount of time wrangling text. Log files, CSV exports, JSON dumps, configuration files—before any modeling can begin, there's often a messy pile of text that needs cleaning, filtering, and reshaping. I built Sift to bring SQL to this problem, letting you query and transform text files with the same language you use for databases. This reference covers every flag and option, with examples I've drawn from real data analysis workflows.

Contents

  Command Modes
  Output Options
  File Reading
  Editing & Safety
  Advanced Features
  Quick Reference Table

Command Modes

Sift operates in several distinct modes, each designed for a specific workflow. The mode determines how input is processed and what tables are available in your SQL queries.

--for "QUERY"

Execute SQL on standard input. Each line becomes a row in the lines table with line_number and content columns.

This is the foundation of Sift. Pipe any text into it, and you can immediately query it with SQL. For data analysis, this is invaluable when you need to quickly explore, filter, or transform data exports.

Example: Filtering a CSV for outliers

cat sales.csv | sift --for "
  SELECT content FROM lines
  WHERE line_number > 1
  AND CAST(
    substr(content, instr(content, ',') + 1,
           instr(substr(content, instr(content, ',') + 1), ',') - 1
    ) AS REAL) > 10000
"

For cleaner CSV handling, use the regex functions:

cat sales.csv | sift --for "
  SELECT content FROM lines
  WHERE regex_match('[0-9]+\.[0-9]{2}', content)
  AND CAST(regex_extract('([0-9]+\.[0-9]{2})', content, 1) AS REAL) > 10000
"

Example: Counting log levels

cat application.log | sift --for "
  SELECT
    CASE
      WHEN content LIKE '%ERROR%' THEN 'ERROR'
      WHEN content LIKE '%WARN%' THEN 'WARN'
      WHEN content LIKE '%INFO%' THEN 'INFO'
      ELSE 'OTHER'
    END as level,
    COUNT(*) as count
  FROM lines
  GROUP BY level
  ORDER BY count DESC
"

--dig --for "QUERY"

Search across multiple files. Reads file paths from stdin, indexes their contents, and exposes files, lines, and search_fts tables.

When your data spans multiple files (monthly exports, partitioned datasets, distributed logs), --dig lets you query across all of them at once. The FTS5 full-text search index enables sophisticated pattern matching.

Example: Finding correlated events across log files

find /var/log -name "*.log" -mtime -7 | sift --dig --for "
  SELECT filepath, content
  FROM search_fts
  WHERE content MATCH 'timeout AND database'
"

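If Sift passes the MATCH expression straight through to SQLite's FTS5 engine (an assumption; this reference only demonstrates boolean operators), proximity and prefix searches should also work. Treat the following as a sketch rather than a confirmed feature.

Example: Proximity and prefix search (hypothetical)

# assumes MATCH accepts standard FTS5 syntax, including NEAR() and the * prefix operator
find /var/log -name "*.log" -mtime -7 | sift --dig --for "
  SELECT filepath, content
  FROM search_fts
  WHERE content MATCH 'NEAR(deadlock retry, 10) OR conn*'
"
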
Example: Aggregating metrics from multiple JSON exports

find ./exports -name "metrics_*.json" | sift --dig --for "
  SELECT
    filepath,
    COUNT(*) as line_count,
    SUM(CASE WHEN content LIKE '%error%' THEN 1 ELSE 0 END) as error_mentions
  FROM lines
  JOIN files USING (file_id)
  GROUP BY filepath
  ORDER BY error_mentions DESC
"

Example: Dataset inventory

find ./data -name "*.csv" | sift --dig --for "
  SELECT filepath, line_count as rows FROM files ORDER BY line_count DESC
"

--refine FILE --for "QUERY"

Transform an entire file using SQL. The query must return a content column, and the output replaces the file contents.

Use this for whole-file transformations: reformatting, sorting, deduplication. The query processes every line and the results become the new file.

Example: Sorting a dataset by a column

sift --refine data.csv --for "
  SELECT content FROM lines
  ORDER BY
    CASE WHEN line_number = 1 THEN 0 ELSE 1 END,
    CAST(regex_extract('^([^,]+)', content, 1) AS INTEGER)
"

This keeps the header row first, then sorts remaining rows by the first column numerically.

Example: Removing duplicate lines

sift --refine duplicates.txt --for "
  SELECT content FROM lines GROUP BY content ORDER BY MIN(line_number)
"

Example: Normalizing whitespace in a dataset

sift --refine messy.csv --for "
  SELECT trim(regex_replace('\s+', content, ' ')) as content
  FROM lines
  ORDER BY line_number
"

--pick FILE --for "QUERY"

Surgical line editing. The query must return line_number and content columns. Only the specified lines are modified; others remain unchanged.

When you need to fix specific rows without touching the rest of the file, --pick is precise. This is essential for cleaning data where only certain records have issues.

Example: Fixing malformed dates in specific rows

sift --pick transactions.csv --for "
  SELECT line_number,
         regex_replace('(\d{2})/(\d{2})/(\d{4})', content, '\$3-\$1-\$2') as content
  FROM lines
  WHERE content LIKE '%/__/____,%'
"

This converts MM/DD/YYYY to YYYY-MM-DD only on lines that have the old format.

Example: Anonymizing PII in flagged records

sift --pick users.csv --for "
  SELECT line_number,
         regex_replace('[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}', content, '[REDACTED]') as content
  FROM lines
  WHERE content LIKE '%@example.com%'
"

Example: Correcting a known bad value

sift --pick sensor_data.csv --for "
  SELECT line_number, replace(content, '-999.0', 'NULL') as content
  FROM lines
  WHERE content LIKE '%-999.0%'
"

--sweep --for "QUERY"

Batch editing across multiple files. Reads file paths from stdin. Query must return filepath, line_number, and content.

For applying the same transformation across an entire dataset directory, --sweep processes multiple files in one pass.

Example: Standardizing column names across all CSVs

find ./data -name "*.csv" | sift --sweep --for "
  SELECT f.filepath, l.line_number,
         replace(replace(l.content, 'user_id', 'customer_id'), 'User ID', 'customer_id') as content
  FROM lines l
  JOIN files f USING (file_id)
  WHERE l.line_number = 1
"

Example: Removing a deprecated field from all JSON files

find ./configs -name "*.json" | sift --sweep --for "
  SELECT f.filepath, l.line_number,
         regex_replace('\"legacy_flag\":\s*(true|false),?\s*', l.content, '') as content
  FROM lines l
  JOIN files f USING (file_id)
  WHERE l.content LIKE '%legacy_flag%'
"

--drop-after N FILE

Insert content after line N. Use --content "text" to specify what to insert, or pipe content via stdin.

Example: Piping inserted content via stdin

echo "# source: nightly sales export" | sift --drop-after 0 sales.csv

Inserting after line 0 places the piped text above the existing header, making it the new first line of the file.

Example: Inserting a data quality note

sift --drop-after 1 report.csv --content "# Data extracted on $(date), source: production DB"

--drop-before N FILE

Insert content before line N.

Example: Adding a schema header to a headerless CSV

sift --drop-before 1 raw_data.csv --content "timestamp,sensor_id,temperature,humidity,pressure"

Example: Prepending a license header to data files

sift --drop-before 1 dataset.csv --content "# Licensed under CC-BY-4.0. Attribution: Example Corp."

--peek FILE

Read and display a file with line numbers. Combine with --layer N-M to view specific ranges.

Before transforming data, you often need to inspect it. --peek shows line numbers, making it easy to identify which rows need attention.

Example: Inspecting the structure of a large file

sift --peek huge_dataset.csv --layer 1-5
     1  customer_id,order_date,amount,product_id,region
     2  1001,2024-01-15,250.00,SKU-001,WEST
     3  1002,2024-01-15,125.50,SKU-002,EAST
     4  1003,2024-01-16,89.99,SKU-001,WEST
     5  1004,2024-01-16,432.00,SKU-003,NORTH

Example: Checking the end of a log file

sift --peek application.log --layer 9995-10000

--quarry [ACTION]

Manage the persistent workspace index. Actions: init, status, refresh, rebuild.

For large codebases or datasets, Sift can maintain a persistent index that speeds up repeated queries. The index is stored in a .sift/ directory.

Example: Setting up an index for a data warehouse

cd /data/warehouse
sift --quarry init

Example: Checking index status

sift --quarry status
Workspace: /data/warehouse
Database:  .sift/workspace.db (45.2 MB)
Files:     1,247 indexed
Lines:     3,891,024 total
Indexed:   2 hours ago
Stale:     12 files changed since last index

Example: Updating after new data arrives

sift --quarry refresh
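
The rebuild action has no example above; presumably it discards .sift/workspace.db and re-indexes every file from scratch, where refresh only picks up changed files. That behavior is an assumption based on the action names, not something this reference confirms.

Example: Rebuilding the index after an interrupted run (hypothetical behavior)

# assumed: full re-index, unlike the incremental refresh
sift --quarry rebuild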

Output Options

--grain FORMAT

Set output format. Options: plain (default), tsv, csv, json, ndjson, grep.

Data pipelines often require specific formats. --grain makes Sift emit results directly in the format your downstream tools expect.

Example: Exporting query results as JSON for a dashboard

cat sales.csv | sift --grain json --for "
  SELECT
    substr(content, 1, instr(content, ',') - 1) as date,
    COUNT(*) as transactions
  FROM lines
  WHERE line_number > 1
  GROUP BY date
"
[
  {"date": "2024-01-15", "transactions": 42},
  {"date": "2024-01-16", "transactions": 38},
  {"date": "2024-01-17", "transactions": 55}
]

Example: Creating TSV for spreadsheet import

cat data.csv | sift --grain tsv --for "
  SELECT line_number, content FROM lines WHERE content LIKE '%ERROR%'
"

Example: NDJSON for streaming to Elasticsearch

cat events.log | sift --grain ndjson --for "
  SELECT
    regex_extract('^(\d{4}-\d{2}-\d{2})', content, 1) as date,
    regex_extract('\[(.*?)\]', content, 1) as level,
    content as raw
  FROM lines
"

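The grep format has no example above. Assuming it renders filepath:line:content output (an assumption based on the format name) when the query supplies those three columns, it pairs naturally with editors and other line-oriented tools.

Example: grep-style output for editor integration (hypothetical)

# assumes --grain grep renders filepath:line_number:content (unverified)
find ./src -name "*.py" | sift --dig --grain grep --for "
  SELECT filepath, line_number, content
  FROM lines
  JOIN files USING (file_id)
  WHERE content LIKE '%TODO%'
"
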
--count, -c

Output only the row count, not the data.

Quick sanity checks and data validation often just need counts, not full results.

Example: Counting records matching a condition

cat transactions.csv | sift -c --for "
  SELECT * FROM lines WHERE content LIKE '%FAILED%'
"
247

Example: Validating expected row count

EXPECTED=10000
ACTUAL=$(cat data.csv | sift -c --for "SELECT * FROM lines WHERE line_number > 1")
if [ "$ACTUAL" -ne "$EXPECTED" ]; then
  echo "Row count mismatch: expected $EXPECTED, got $ACTUAL"
fi

--head N

Limit output to the first N rows.

Example: Previewing query results

cat huge_file.csv | sift --head 10 --for "
  SELECT content FROM lines WHERE content LIKE '%anomaly%'
"

Example: Top N analysis

cat sales.csv | sift --head 5 --for "
  SELECT content FROM lines
  WHERE line_number > 1
  ORDER BY CAST(regex_extract(',([0-9.]+)', content, 1) AS REAL) DESC
"

--tail N

Limit output to the last N rows.

Example: Most recent log entries

cat application.log | sift --tail 20 --for "SELECT content FROM lines"

Example: Bottom performers in a ranking

cat performance.csv | sift --tail 10 --for "
  SELECT content FROM lines
  WHERE line_number > 1
  ORDER BY CAST(regex_extract(',([0-9.]+)', content, 1) AS REAL) DESC
"

File Reading

--layer N-M

Limit file reading to lines N through M. Use with --peek or when processing large files.

When files are enormous, reading the entire thing wastes resources. --layer lets you focus on specific sections.

Example: Sampling the middle of a sorted dataset

sift --peek sorted_by_date.csv --layer 5000-5010

Example: Extracting a date range from time-ordered data

# First, find where January data starts
cat timeseries.csv | sift --for "
  SELECT line_number FROM lines WHERE content LIKE '2024-01-%' LIMIT 1
"
# Then extract that range (find the end line the same way)
sift --peek timeseries.csv --layer 1042-2156

Editing & Safety

--shake

Preview changes without modifying files (dry run).

Before committing changes, especially on production data, always preview first.

Example: Previewing a batch transformation

find ./data -name "*.csv" | sift --sweep --shake --for "
  SELECT f.filepath, l.line_number,
         replace(l.content, 'N/A', '') as content
  FROM lines l JOIN files f USING (file_id)
  WHERE l.content LIKE '%N/A%'
"
[dry-run] Would modify 3 line(s) in ./data/q1.csv
[dry-run] Would modify 7 line(s) in ./data/q2.csv

[batch-edit] Would modify 2 file(s)
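
The example above pairs --shake with --sweep, but a dry run should compose with the single-file modes as well. Assuming --shake applies equally to --refine and --pick (this reference does not show that combination, so verify on a scratch file first), previewing a whole-file transformation might look like:

Example: Previewing a whole-file transformation (hypothetical)

# assumes --shake also applies to --refine (verify on a copy first)
sift --refine data.csv --shake --for "
  SELECT trim(content) as content FROM lines ORDER BY line_number
"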

--diff

Show a unified diff of changes.

Use --diff when you need to see exactly what will change, line by line.

Example: Reviewing a data cleaning operation

sift --pick data.csv --diff --for "
  SELECT line_number, trim(content) as content
  FROM lines WHERE content != trim(content)
"
Update(data.csv)
Added 2 lines, removed 2 lines
   1 -    customer_id,amount,date
   1 +    customer_id,amount,date
   15-    1042,  250.00, 2024-01-15
   15+    1042,  250.00, 2024-01-15

--no-backup

Skip creating .bak backup files when editing.

By default, Sift creates backups before modifying files. Use this flag in automated pipelines where you have other backup mechanisms.

Example: Pipeline with version control

# Files are in git, no need for .bak files
sift --refine data.csv --no-backup --for "
  SELECT upper(content) as content FROM lines ORDER BY line_number
"

Advanced Features

--weigh

Include query execution statistics.

When optimizing queries on large datasets, timing information helps identify bottlenecks.

Example: Profiling a complex aggregation

cat large_dataset.csv | sift --weigh --for "
  SELECT
    regex_extract('^([^,]+)', content, 1) as category,
    COUNT(*) as count,
    AVG(CAST(regex_extract(',([0-9.]+)', content, 1) AS REAL)) as avg_value
  FROM lines
  WHERE line_number > 1
  GROUP BY category
"
category    count    avg_value
ELECTRONICS 15234    127.45
CLOTHING    23421    45.67
FOOD        8932     12.34

-- Stats: 47587 rows scanned, 3 rows returned, 0.045s execution time

--sieve [N]

Watch mode—re-run the query when files change. Optional interval in seconds.

For monitoring live data feeds or log files, --sieve provides continuous updates. Note: Not available in MCP mode.

Example: Live error monitoring

sift --sieve 5 --peek /var/log/app.log --layer -50 | grep ERROR

Example: Watching a data ingestion directory

find ./incoming -name "*.csv" | sift --sieve 10 --dig --for "
  SELECT filepath, line_count FROM files ORDER BY filepath
"

--plugin PATH

Load additional SQL or C plugins for extended functionality.

Plugins add new SQL functions. Sift ships with several useful ones.

Example: Using the CSV parser plugin

cat data.csv | sift --plugin plugins/csv.so --for "
  SELECT csv_field(content, 3) as third_column FROM lines
"

Example: Base64 encoding for data transfer

cat sensitive.csv | sift --plugin plugins/encoding.so --for "
  SELECT base64_encode(content) as encoded FROM lines
"

--quiet, -q

Suppress non-essential output (banners, progress messages).

For scripting and automation, clean output matters.

Example: In a shell script

COUNT=$(cat data.csv | sift -q -c --for "SELECT * FROM lines WHERE line_number > 1")
echo "Processed $COUNT records"

--mcp

Run as an MCP (Model Context Protocol) server for AI agent integration.

This enables AI assistants like Claude to use Sift for code and data exploration.

Example: Registering with Claude Code

claude mcp add sift -- sift --mcp

--list-plugins

List available plugins and their functions.

sift --list-plugins
Loaded plugins:
  fields.sql    - Awk-style field extraction (f1, f2, ...)
  dedup.sql     - Deduplication views
  statistics.sql - Line length statistics
  encoding.so   - Base64, Hex, URL encoding functions
  csv.so        - RFC 4180 CSV parsing
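
If fields.sql exposes awk-style fields as SQL functions over the line text, splitting on whitespace (an assumption; the listing only hints at the names f1, f2, ...), field-based queries become much shorter than the substr/instr gymnastics shown earlier:

Example: Top client IPs from an access log (hypothetical f1() helper)

# assumes f1() is a SQL function taking the line text and returning the first whitespace-separated field
cat access.log | sift --plugin plugins/fields.sql --for "
  SELECT f1(content) as client_ip, COUNT(*) as hits
  FROM lines
  GROUP BY client_ip
  ORDER BY hits DESC
"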

Quick Reference Table

For quick lookup, here's every flag organized by category:

Category         Flag                           Description
Command Modes    --for "QUERY"                  SQL on stdin
                 --dig --for "QUERY"            Multi-file search
                 --refine FILE --for "QUERY"    Transform entire file
                 --pick FILE --for "QUERY"      Edit specific lines
                 --sweep --for "QUERY"          Batch edit multiple files
                 --drop-after N FILE            Insert after line N
                 --drop-before N FILE           Insert before line N
                 --peek FILE                    Read file with line numbers
                 --quarry [ACTION]              Manage workspace index
                 --mcp                          MCP server mode
Output           --grain FORMAT                 plain, tsv, csv, json, ndjson, grep
                 --count, -c                    Row count only
                 --head N                       First N rows
                 --tail N                       Last N rows
File Reading     --layer N-M                    Line range (use with --peek)
Safety           --shake                        Dry run (preview changes)
                 --diff                         Show unified diff
                 --no-backup                    Skip .bak files
Advanced         --weigh                        Query statistics
                 --sieve [N]                    Watch mode (re-run on changes)
                 --plugin PATH                  Load extension
                 --quiet, -q                    Suppress banners
Info             --help, -h                     Show help
                 --version, -V                  Show version
                 --list-plugins                 List loaded plugins