gotcontext
Description
Semantic compression gateway. 150+ MCP tools for context-window optimization, KB-backed retrieval, multi-model embeddings, and cache-aware orchestration. Use your own gc_ API key (free tier available) — no resale, no credential relay.
Skills
ace_curate
[CURATE] ACE CURATE: Integrate insights into playbook via delta updates. Applies incremental changes (add/update/remove bullets) with semantic deduplication. Prevents context collapse through grow-and-refine strategy. Use after reflecting to evolve the playbook.
ace_execute_cycle
[SYNC] ACE EXECUTE CYCLE: Execute complete ACE cycle (Generate -> Reflect -> Curate). Convenience tool that combines the three-step ACE process into one call. Generates trajectory, reflects on outcome, and curates insights automatically. Use for rapid iteration and continuous playbook improvement.
ace_generate
[AFM] ACE GENERATE: Generate reasoning trajectory for a task using ACE playbook. Produces step-by-step reasoning that applies relevant bullets from the playbook. Each step includes relevant guidelines, reasoning, and confidence scores. Use this to guide semantic node selection and compression decisions.
ace_get_playbook
[ACE] ACE GET PLAYBOOK: Retrieve current ACE playbook state. Returns all bullets with performance stats, versioning, and delta history. Supports filtering by confidence, bullet type, or custom criteria. Use to inspect the evolved playbook and understand learned patterns.
ace_grow_context
[ADD] ACE GROW: Manually add bullets to playbook (grow operation). Directly insert principles, strategies, tactics, constraints, or preferences. Use to seed domain-specific knowledge or codify team standards. Each bullet gets an embedding for semantic operations.
ace_refine_context
[ACE] ACE REFINE: Update bullet performance based on feedback (refine operation). Adjusts confidence scores for specific bullets based on success/failure. Use to reinforce successful patterns or penalize failed approaches. Enables continuous improvement of the playbook.
ace_reflect
[ANALYZE] ACE REFLECT: Extract insights from a reasoning trajectory. Analyzes what worked (successes) and what didn't (failures) to formulate new bullets. Returns insights with confidence scores and reasoning. Use after completing a task to learn and improve the playbook.
adapt_to_context_window
ADAPTIVE CONTEXT ALLOCATION (JSCCM-inspired): Dynamically adjust compression based on available context window. Low availability (like low SNR in wireless) -> More compression. High availability -> Less compression, more detail. Uses learned rate allocator to determine optimal skeleton ratio. Inspired by JSCCM paper's channel adaptation strategy.
add_memory
Store an explicit memory independently of document ingestion. Useful for user preferences, decisions, gotchas, and persistent workflow hints.
advise_cache_strategy
Get the optimal prompt caching strategy for your model. Each LLM provider handles caching differently -- Anthropic uses explicit markers, OpenAI is automatic, Gemini has implicit+explicit modes. Returns specific tips for maximizing cache hits and cost savings.
advise_context
Analyze all ingested documents and recommend optimal context strategy. Returns model recommendations, pruning priorities, and compression advice.
afm_add_message
ADAPTIVE FOCUS MEMORY: Add message to dialogue history. AFM (Adaptive Focus Memory, arXiv:2511.12712v1) manages multi-turn conversations by assigning adaptive fidelity to each message based on recency, semantic relevance, and importance. Messages are automatically classified as CRITICAL (safety-sensitive), RELEVANT, or TRIVIAL. Use this to build dialogue history before calling afm_build_context.
afm_build_context
[AFM] ADAPTIVE FOCUS MEMORY: Build optimized context for current query. Uses semantic similarity + recency weighting + importance classification to pack dialogue history under strict token budget. Achieves ~66% token reduction while preserving safety-critical information (e.g., allergies, constraints). Each message gets adaptive fidelity: FULL (verbatim), COMPRESSED (summary), or PLACEHOLDER. Messages packed chronologically to preserve conversation flow.
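The packing idea in the afm_build_context description can be sketched as a greedy upgrade loop. This is an illustration only, not gotcontext's implementation: the `Message` type, the word-count tokenizer, the priority order (importance class, then recency), and the `[omitted]` placeholder text are all hypothetical, and semantic similarity to the query is left out to keep the sketch stdlib-only.

```python
from dataclasses import dataclass

PLACEHOLDER = "[omitted]"  # hypothetical placeholder text


@dataclass
class Message:
    turn: int
    text: str
    importance: str  # "CRITICAL", "RELEVANT", or "TRIVIAL"
    summary: str     # assumed to be pre-computed elsewhere


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def build_context(history: list[Message], budget: int) -> list[tuple[int, str, str]]:
    """Return (turn, fidelity, text) triples in chronological order."""
    rank = {"CRITICAL": 0, "RELEVANT": 1, "TRIVIAL": 2}
    base = count_tokens(PLACEHOLDER)
    # Every message starts as a placeholder. If the placeholders alone exceed
    # the budget, this sketch does not shrink further.
    chosen = {m.turn: ("PLACEHOLDER", PLACEHOLDER) for m in history}
    used = base * len(history)
    # Upgrade in priority order: critical messages first, then the most recent.
    for m in sorted(history, key=lambda m: (rank[m.importance], -m.turn)):
        for fidelity, text in (("FULL", m.text), ("COMPRESSED", m.summary)):
            extra = count_tokens(text) - base
            if used + extra <= budget:
                chosen[m.turn] = (fidelity, text)
                used += extra
                break
    # Emit chronologically so the conversation flow is preserved.
    return [(m.turn, *chosen[m.turn]) for m in sorted(history, key=lambda m: m.turn)]
```

Fidelity is decided in priority order, but the result is emitted in turn order, which mirrors the "packed chronologically" note in the description.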
afm_clear_history
[DELETE] ADAPTIVE FOCUS MEMORY: Clear dialogue history. Removes all messages and resets turn counter. Use when starting a new conversation or when dialogue context is no longer relevant.
afm_export_history
[SAVE] ADAPTIVE FOCUS MEMORY: Export dialogue history to JSON. Saves current conversation state including all messages, turn counter, and metadata. Use this to preserve conversations for later resume. Returns JSON string that can be saved and imported later.
afm_get_stats
[STATS] ADAPTIVE FOCUS MEMORY: Get dialogue statistics. Returns total messages, current turn index, and importance breakdown (critical/relevant/trivial counts). Useful for monitoring dialogue state.
afm_import_history
[LOAD] ADAPTIVE FOCUS MEMORY: Import dialogue history from JSON. Restores a previously exported conversation state. This replaces the current dialogue history. Use this to resume saved conversations.
assess_cache_compatibility
Assess whether a provider and harness combination exposes enough telemetry to validate prompt cache behavior reliably.
audit_prompt_cacheability
Audit a composed prompt for cache-friendly section ordering and volatile metadata that can break provider prefix caching.
batch_compress_queue
Enqueue a batch compression job. Submits a list of documents for asynchronous compression and returns a job id. Poll GET /v1/batch-queue/{id} for status and results. Requires Team or Enterprise plan.
batch_ingest_documents
[BATCH] Batch ingest multiple documents concurrently for 4x faster throughput. Processes documents in parallel with bounded concurrency, progress tracking, and error isolation. One document failure won't block the entire batch. Returns detailed results for each document including success status and processing time. Ideal for enterprise-scale document ingestion.
calculate_reward
[EXPERIMENTAL] Calculate decomposed compression reward using ASG-SI system. Computes 5 reward components: Schema (validation), Semantic (meaning preservation), Fidelity (ratio adherence), Composition (graph integrity), Memory (efficiency). Based on arxiv.org/abs/2512.23760 Audited Skill-Graph Self-Improvement.
calculate_roi
Calculate ROI of using gotcontext compression vs raw token usage. Shows monthly cost comparison: without vs with compression, Pro plan cost, net savings, and ROI multiplier. Powers the website ROI calculator.
capture_cache_telemetry
Normalize provider-side prompt cache telemetry from a real model API response and warn on silent cache misses.
check_blind_spots
BLIND SPOT DETECTOR: Analyze if your response missed critical context. This tool embeds your response and compares it to ALL nodes in the document. If relevant content was not retrieved, it alerts you and suggests auto-injection. Use AFTER generating a response to ensure fidelity. This implements the 'Self-Correcting Context Loop'.
check_budget
Check token budget usage against configured limits. Supports per-session, daily, and monthly budgets. Returns usage status, alert level, and projected usage. Schema rejects unknown fields (e.g. legacy 'period' arg) so MCP agents get explicit validation errors instead of silent argument-drops.
check_context_budget
Check how much of your LLM context window is being used and get proactive compression recommendations. Returns usage percentage and suggests action at 40%/60%/75% thresholds.
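The threshold behaviour described for check_context_budget can be illustrated with a small function. Only the 40%/60%/75% thresholds come from the description; the function name, the returned field names, and the action labels are hypothetical.

```python
def context_budget_status(used_tokens: int, window_tokens: int) -> dict:
    """Map context-window usage to a suggested action."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    pct = 100.0 * used_tokens / window_tokens
    # Thresholds are from the tool description; the labels are made up for this sketch.
    if pct >= 75:
        action = "compress_now"
    elif pct >= 60:
        action = "compress_soon"
    elif pct >= 40:
        action = "consider_compression"
    else:
        action = "none"
    return {"usage_pct": round(pct, 1), "action": action}
```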
check_environment
[HEALTH] Check comprehensive environment health: models loaded, memory usage, cache hit ratio, stale documents, and disk space. Returns recommendations for optimization. Use this to understand system state before heavy operations.
check_resource_health
[SAVE] RESOURCE HEALTH: Check resource usage and system health. Returns storage, memory, and document count metrics with proactive warnings and recommendations. Use this to monitor resource usage before ingesting large documents or when experiencing slowdowns. Prevents hitting storage limits unexpectedly.
compare_experiment_runs
Compare two experiment runs and report deltas in pass counts, compression, verification, and reward quality.
compare_prompt_versions
Compare two versions of the same prompt template and return changed fields plus a unified diff.
compile_knowledge
Compile flat memories into cross-linked markdown concept articles with a navigable index. Deduplicates and groups by category.
compress_codebase
Compress a codebase directory into a semantic skeleton. Uses tensor-grep AST analysis when available for structure extraction. Falls back to directory scanning without tensor-grep. Optionally filter by query or file patterns.
compress_mcp_registry
Compress the tool descriptions of one or more MCP servers by fetching their tools/list and running token-saver compression on the descriptions. Returns compressed tool schemas and token savings per server. Use to reduce the context cost of multi-server MCP setups. Pro+ only.
compress_meta_tokens
[COMPRESS] Lossless meta-token compression (arXiv 2506.00307). Finds repeated token subsequences and replaces them with compact dictionary symbols (§1, §2, …). Fully reversible. Best for repetitive text with recurring phrases. Returns compressed_text, dictionary, and token savings.
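To make the dictionary-substitution idea in compress_meta_tokens concrete, here is a minimal sketch. It is not the algorithm from the cited paper or the tool's implementation: it assumes whitespace tokens, a fixed n-gram length, and a greedy most-frequent-first choice, and the function names are hypothetical.

```python
from collections import Counter


def compress(tokens: list[str], n: int = 3, max_symbols: int = 10):
    """Replace repeated n-grams with symbols §1, §2, ... and return (tokens, dictionary)."""
    if any(t.startswith("§") for t in tokens):
        raise ValueError("input already uses the reserved symbol prefix")
    dictionary: dict[str, list[str]] = {}
    for k in range(1, max_symbols + 1):
        counts = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
        if not counts:
            break
        gram, freq = counts.most_common(1)[0]
        if freq < 2:
            break
        symbol = f"§{k}"
        out, i, replaced = [], 0, 0
        while i < len(tokens):
            if tuple(tokens[i:i + n]) == gram:
                out.append(symbol)
                i += n
                replaced += 1
            else:
                out.append(tokens[i])
                i += 1
        if replaced < 2:
            break  # occurrences overlapped, so substitution would not help
        dictionary[symbol] = list(gram)
        tokens = out
    return tokens, dictionary


def decompress(tokens: list[str], dictionary: dict[str, list[str]]) -> list[str]:
    # Expand the newest symbols first: later entries may contain earlier symbols.
    for symbol in reversed(list(dictionary)):
        expanded: list[str] = []
        for t in tokens:
            expanded.extend(dictionary[symbol] if t == symbol else [t])
        tokens = expanded
    return tokens
```

Reversibility depends on the reserved `§` prefix never appearing in the input, which is why the sketch rejects such input up front. Note that a real savings figure would also have to count the tokens spent on the dictionary itself.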
configure_for_client
Configure compression parameters for a specific LLM client or model. Accepts a model identifier (e.g. claude-opus-4-6, gpt-4o) or explicit context window size. Auto-tunes skeleton ratio, chunk sizes, and fidelity defaults to maximize token efficiency for the target model.
create_connector_feed
Create a managed connector feed definition for exported web, GitHub, S3, or Slack payloads.
create_dataset
Create a reusable named benchmark/evaluation dataset from inline cases or a JSON corpus fixture.
create_handoff_bundle
Create a structured, auditable handoff bundle from a compressed document, including distilled skeleton context and optional focused evidence.
create_project
Create a new project for API key and usage attribution. Returns the new project id. Bind API keys to a project to track per-project token savings and costs. Requires Team or Enterprise plan.
create_prompt_template
Create a managed prompt template with version 1 and optional deployment label. Use this to make prompts first-class artifacts instead of hard-coded strings.
delete_document
DELETE DOCUMENT: Permanently delete an ingested document. Removes the document from memory and persistent storage. This operation cannot be undone. Use with caution. Useful for managing storage limits or removing outdated documents.
delete_memory
Delete a previously stored explicit memory by ID.
deploy_prompt_version
Assign or move a deployment label (production, staging, canary) to a specific prompt template version.
detect_dead_code
Detect Python files in a directory that are never imported by other files. Uses regex-based import graph analysis to identify unreachable modules. Entry points (main.py, server.py, __init__.py, test_*.py, etc.) are always considered live. Returns dead file list with estimated token savings -- useful for excluding dead code before compression to reduce noise.
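A regex-based import scan like the one detect_dead_code describes might look roughly like this. It is a sketch under stated assumptions: sources are passed in as a dict of module name to text rather than read from disk, modules are matched on the last dotted component, and the regex and function name are hypothetical, not the tool's own.

```python
import re

# Captures "from a.b import x" (group 1) or "import a.b" (group 2).
# Known gaps in this sketch: "import a, b" only records "a", and
# "from pkg import mod" records "pkg" rather than "mod".
IMPORT_RE = re.compile(r"^\s*(?:from\s+([\w.]+)\s+import|import\s+([\w.]+))", re.MULTILINE)

# Entry-point names taken from the tool description (file stems, without ".py").
ENTRY_POINTS = {"main", "server", "__init__"}


def find_dead_modules(sources: dict[str, str]) -> list[str]:
    """sources maps a module name (file stem) to its source text."""
    imported: set[str] = set()
    for text in sources.values():
        for from_mod, plain_mod in IMPORT_RE.findall(text):
            name = from_mod or plain_mod
            imported.add(name.split(".")[-1])  # match on the last path component
    dead = []
    for module in sources:
        is_entry = module in ENTRY_POINTS or module.startswith("test_")
        if not is_entry and module not in imported:
            dead.append(module)
    return sorted(dead)
```

Because the scan is textual, dynamic imports (`importlib`, plugin loaders) are invisible to it, so results are best treated as candidates for review rather than files that are safe to delete.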
detect_hallucination
HALLUCINATION DETECTOR: Check if a response is grounded in source material. Compares response embedding to document graph. Flags responses with low similarity to all nodes (possible fabrication). Use when uncertain about answer accuracy.
diagnose_cache_miss
Diagnose why an expected provider cache hit missed by comparing the recorded prompt expectation with the actual rendered prefix that reached the provider.
discover_savings
Discover missed token savings opportunities. Scans a directory or list of text items to estimate what could be compressed. Returns ranked opportunities with estimated savings per file. Use before ingesting content to prioritize which files benefit most.
estimate_model_cost
Estimate token cost savings for a model using original and compressed token counts.
estimate_tokens
Estimate token count for a given text using multiple methods. Returns accurate count (tiktoken), fast estimate (bytes/4), JSON-optimized estimate (bytes/2), and raw byte count. Useful for budgeting context window usage before ingestion.
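The two byte-based heuristics named in the estimate_tokens description are easy to reproduce. In this sketch the divisors come from the description, while UTF-8 encoding, rounding up, and the returned field names are assumptions; the tiktoken-based count is omitted so the example runs on the standard library alone.

```python
import math


def estimate_tokens(text: str) -> dict:
    """Cheap token estimates from byte length."""
    raw_bytes = len(text.encode("utf-8"))  # assumes UTF-8 byte length
    return {
        "bytes": raw_bytes,
        "fast": math.ceil(raw_bytes / 4),  # general-text heuristic: bytes/4
        "json": math.ceil(raw_bytes / 2),  # denser heuristic for JSON: bytes/2
    }
```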
explain_compression_decision
[ANALYZE] Explain why a specific node was kept or dropped during compression. Provides detailed analysis including importance score ranking, connectivity, key entities, and relationships with other nodes. Perfect for understanding and debugging compression decisions.
export_graph_graphml
[VIZ] Export semantic graph as GraphML for analysis tools. GraphML is a standard XML format supported by Gephi, Cytoscape, igraph, and NetworkX. Perfect for advanced network analysis, visualization, and research workflows.
export_graph_json
[STATS] Export semantic graph as JSON for programmatic access. Returns a structured JSON representation of the semantic graph with nodes, edges, importance scores, and statistics. Perfect for custom analysis or integration with other tools.
export_team_data
Export aggregated team savings data. Supports JSON, CSV, and Prometheus exposition formats. Use for team dashboards, monitoring, and cost reporting.
filter_cli_output
Filter CLI command output to reduce token usage. Auto-detects command type (git, pytest, npm, lint, etc.) and applies optimal filtering strategy. Strips ANSI codes, extracts stats, groups errors, removes progress bars.
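Two of the filtering steps listed for filter_cli_output, stripping ANSI codes and dealing with progress bars, can be sketched with the standard library. This is an illustration, not the tool's filter: the regex covers CSI escape sequences only, and progress bars are collapsed to their final redraw instead of being removed outright.

```python
import re

# Matches ANSI CSI sequences such as "\x1b[32m" (colour) or "\x1b[2K" (erase line).
ANSI_RE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")


def filter_output(raw: str) -> str:
    cleaned = []
    for line in ANSI_RE.sub("", raw).split("\n"):
        # Progress bars redraw one line using carriage returns; keep the last state.
        line = line.rstrip("\r").split("\r")[-1].rstrip()
        if line:  # drop lines that are empty after cleaning
            cleaned.append(line)
    return "\n".join(cleaned)
```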
find_duplicates
Detect near-duplicate content across different ingested documents. Uses cosine similarity on chunk embeddings to find redundant content that could be deduplicated to save tokens.
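The cosine-similarity comparison behind find_duplicates can be illustrated in a few lines. The chunk id convention (`doc#index`), the 0.95 threshold, and the function names are assumptions for this sketch, and the embeddings are plain float lists rather than output from a real embedding model.

```python
import math
from itertools import combinations


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_near_duplicates(chunks: dict[str, list[float]], threshold: float = 0.95):
    """chunks maps 'doc#index' ids to embeddings; returns cross-document pairs."""
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(chunks.items(), 2):
        if id_a.split("#")[0] == id_b.split("#")[0]:
            continue  # only compare chunks from different documents
        score = cosine(vec_a, vec_b)
        if score >= threshold:
            pairs.append((id_a, id_b, round(score, 3)))
    return sorted(pairs, key=lambda p: -p[2])
```

The pairwise loop is quadratic in the number of chunks, which is fine for a sketch; a larger corpus would normally use an approximate nearest-neighbour index instead.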
gc_agent_capsule
Returns an actionable context capsule for AI agents: primary targets, snippets, validation commands, rollback metadata, confidence. Use this BEFORE making any non-trivial code change; it tells you exactly what to look at first. Wraps `tg agent`. Pro+ only.
gc_blast_radius
Structural code-context compression. Submit a file bundle + REQUIRED focus symbol; server runs tensor-grep blast-radius + BM25 and returns a ranked context list. Aligns with POST /v1/compress-code/structural. Pro+ only. NOTE: focus_symbol is required by tensor-grep>=1.12.0; omitting it now returns a structured BlastRadiusInputError instead of a degraded result.
gc_callers
Find all call sites of a symbol + the likely impacted test files. Lower-cost than gc_blast_radius when you just need 'who calls this' rather than a full transitive blast radius. Wraps `tg callers`. Pro+ only.
gc_compress_manifest
Compress an MCP tools/list manifest. Shortens tool descriptions via token-saver compression while preserving inputSchema byte-for-byte. Returns compressed manifest + savings stats. Pro+ only.
gc_context_render
Returns a ranked, prompt-ready context bundle for a natural-language query, designed for direct injection into an LLM prompt. Use when you have a fuzzy 'find me code related to X' question. Wraps `tg context-render`. Pro+ only.
gc_edit_plan
Returns a machine-readable edit plan: which files to modify, what to add/remove, with validation_commands to verify the result. Use this BEFORE writing a patch; it produces a structured rewrite plan that beats free-form 'let me think about this' loops. Wraps `tg edit-plan`. Pro+ only.
gc_kb_delete
Soft-delete a KB item. The item is marked deleted and excluded from queries and listings, but its data is retained for audit purposes. Requires Pro plan or higher. PROJECT BINDING: This tool requires the calling gc_ API key to be bound to a Knowledge Hub project. Keys not bound to a KB project return {error: 'no_project_selected'}.
gc_kb_diff
Compute a unified text diff between two versions of a KB item. Returns both versions' full content plus a unified diff string. Requires Pro plan or higher. PROJECT BINDING: This tool requires the calling gc_ API key to be bound to a Knowledge Hub project. Keys not bound to a KB project return {error: 'no_project_selected'}.
gc_kb_edit
Replace the content of an existing KB item (creates a new version). Pass the current version_id to prevent lost-update races. Requires Pro plan or higher. PROJECT BINDING: This tool requires the calling gc_ API key to be bound to a Knowledge Hub project (the dispatch reads the key's project context, not a request parameter). Keys not bound to a KB project return {error: 'no_project_selected'}. Mint a project-scoped key from /dashboard/keys.
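The version_id check mentioned for gc_kb_edit is a form of optimistic concurrency control. The in-memory sketch below shows why passing the version you last read prevents lost updates; the class, exception, and method names are hypothetical and say nothing about how the Knowledge Hub is actually implemented.

```python
import uuid


class StaleVersionError(Exception):
    """Raised when an edit is based on a version that is no longer current."""


class KBItem:
    def __init__(self, content: str):
        self.versions: list[tuple[str, str]] = []  # (version_id, content), oldest first
        self._append(content)

    def _append(self, content: str) -> str:
        version_id = uuid.uuid4().hex
        self.versions.append((version_id, content))
        return version_id

    @property
    def current_version_id(self) -> str:
        return self.versions[-1][0]

    def edit(self, new_content: str, expected_version_id: str) -> str:
        # Reject the write if someone else has edited since the caller last read.
        if expected_version_id != self.current_version_id:
            raise StaleVersionError("item changed since it was read; re-fetch and retry")
        return self._append(new_content)
```

A caller that receives the stale-version error would typically re-read the item, reapply its change on top of the new content, and retry with the fresh version_id.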
gc_kb_get
Fetch a single KB item's metadata by its id. Requires Pro plan or higher. PROJECT BINDING: This tool requires the calling gc_ API key to be bound to a Knowledge Hub project. Keys not bound to a KB project return {error: 'no_project_selected'}.
gc_kb_ingest
Add a new item to the project Knowledge Hub. Supply either 'content' (raw text) or 'url' (fetched and extracted). Returns the new item_id and version_id. Requires Pro plan or higher. PROJECT BINDING: This tool requires the calling gc_ API key to be bound to a Knowledge Hub project. Keys not bound to a KB project return {error: 'no_project_selected'}.
gc_kb_list
List all non-deleted KB items in the current project. Returns id, name, type, current_version_id, and timestamps. Requires Pro plan or higher. PROJECT BINDING: This tool requires the calling gc_ API key to be bound to a Knowledge Hub project. Keys not bound to a KB project return {error: 'no_project_selected'}.
gc_kb_query
Vector-search the project Knowledge Hub. Returns the top-k semantically similar chunks with their source item ids and scores. Requires Pro plan or higher. PROJECT BINDING: This tool requires the calling gc_ API key to be bound to a Knowledge Hub project. Keys not bound to a KB project return {error: 'no_project_selected'}.
gc_lookup
Look up framework documentation across 9 indexed Context Hub frameworks (Next.js, FastAPI, LangChain, SQLAlchemy, Pydantic, Tailwind, Drizzle, FastMCP, React). Free for all tiers. Returns raw documentation chunks ranked by semantic similarity to the query.
gc_mint_api_key
Mint a new gc_ API key for the authenticated caller. The full key value is returned ONCE in the response; store it immediately. Subsequent calls cannot recover the value. Inherits the caller's plan + (optionally) project. Default expiry is 30 days; max 365. Pro+ only; Free-tier customers must mint via the dashboard.
gc_pre_flight
Pre-flight check before sending an expensive prompt to the LLM. Returns a structured verdict (send_as_is | send_compressed | warn_context_limit | clear_first) plus the compressed prompt body inline, cost preview against on-demand list pricing, and a cache hit likelihood score. One MCP call replaces context warnings, compression decisions, and cost previews. Available to all plans; volume governed by your monthly compression quota.
gc_read_doc
Read a gotcontext product/API documentation page as markdown. Pass a full URL (e.g. 'https://gotcontext.ai/docs#authentication') returned by gc_search_docs, or a known slug such as 'authentication' or 'mcp-server' (slug auto-resolves to the matching docs anchor). Returns JSON with markdown, source_url, and length_tokens.
gc_rebind_api_key
Rebind a gc_ API key to a different project (or unbind it to the user-scoped Default). Lets headless agents (CI runners, Claude Code sessions, automation) self-manage key→project attribution without a dashboard session. Only keys owned by the authenticated caller may be rebound; cross-user rebinds are rejected with 403. The plan-cache entry is explicitly evicted on success so traffic attributes to the new project immediately (no 5-min stale window). Pro+ only.
gc_revoke_api_key
Revoke a gc_ API key owned by the authenticated caller. Cross-user revokes are rejected with 403. The key is set to status='revoked' in Postgres + evicted from the Redis plan cache so subsequent requests with that key return 401 within ~5 seconds. Pro+ only.
gc_search_docs
Search gotcontext product/API documentation and return ranked docs URLs. When to use vs gc_lookup: gc_lookup is for third-party framework documentation (Next.js, FastAPI, React, and the other indexed Context Hub frameworks); gc_search_docs is for gotcontext product/API documentation queries. Returns JSON with results containing title, snippet, url, and score.
gc_session_summary
Compress a session's conversation history into a portable summary the agent can re-inject after /clear. The natural pair to gc_pre_flight: when pre_flight returns clear_first, call gc_session_summary, then /clear, then re-inject the returned restoration_instructions. Runs on gotcontext infra so it works even when Claude Code's auto-compact would fail. Available to all plans; volume governed by your monthly compression quota.
generate_rewrite_prompt
Generate a structured rewrite prompt for client-side LLM compression. Returns system instructions and user prompt optimized for generative compression.
generate_structural_summary
Generate a compact structural outline of a code file. Extracts imports, class definitions, and function signatures (with type hints). Replaces function bodies with `...`. Achieves ~80-90% token reduction while preserving the full API surface. Ideal for codebase exploration, API discovery, and context-window-efficient code review. Supports Python (AST-based) and other languages (regex fallback).
generate_synthetic_tests
[EXPERIMENTAL] Generate synthetic test cases for adversarial testing. Uses ASG-SI experience synthesis to create boundary cases, adversarial documents, and stress test scenarios for compression validation. Based on arxiv.org/abs/2512.23760 Audited Skill-Graph Self-Improvement.
get_compression_insights
Get insights from compression history: best ratios per content type, average fidelity scores, and data-driven strategy recommendations.
get_compression_presets
List available compression presets (code-review, chat, research, aggressive, balanced). Each preset maps to optimal skeleton_ratio and fidelity settings for common use cases.
get_compression_profile
Get the active compression profile for the session. Returns the profile name and its parameter values (skeleton_ratio, fidelity, chunk_size).
get_connector_feed
Fetch one managed connector feed definition.
get_context_block
Build a lifecycle-aware context block with active facts, recent events, and a cache-stable skeleton prefix.
get_evidence_stats
[EXPERIMENTAL] Get evidence store statistics for audit trail. The store maintains a tamper-evident blockchain-style chain of all compression operations with cryptographic integrity verification. Based on arxiv.org/abs/2512.23760 Audited Skill-Graph Self-Improvement.
get_experiment_run
Fetch a stored experiment run and its per-case evaluation details.
get_handoff_bundle
Fetch one structured handoff bundle including its distilled artifacts.
get_knowledge_index
Return the compiled knowledge index markdown for index-first retrieval. Useful for small knowledge bases (<500 entries) where a readable index beats embedding search.
get_multi_level_skeleton
Generate 3-tier skeleton output: headline (top 10%), summary (top 30%), and full (100%). Client picks the depth needed for their context budget.
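The three tiers described for get_multi_level_skeleton amount to taking nested prefixes of an importance-ranked node list. In this sketch the 10%/30%/100% cut-offs come from the description, while the input shape, rounding up, and the minimum of one node per tier are assumptions.

```python
import math


def multi_level_skeleton(nodes: list[tuple[str, float]]) -> dict[str, list[str]]:
    """nodes is a list of (name, importance); returns nested tiers of node names."""
    ranked = sorted(nodes, key=lambda n: n[1], reverse=True)

    def top(percent: int) -> list[str]:
        if not ranked:
            return []
        k = max(1, math.ceil(len(ranked) * percent / 100))
        return [name for name, _ in ranked[:k]]

    return {"headline": top(10), "summary": top(30), "full": top(100)}
```

Each tier is a superset of the one before it, so a client can start with the headline and widen to the summary or full tier without re-ranking.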
get_original_output
Retrieve the original (pre-compression) content for a tee entry. When compression is aggressive (>80%), the original is automatically saved. Use this to recover full output when the compressed version lost important details.
get_prompt_template
Get a prompt template and resolve a specific version or deployment label to the exact prompt content.
get_provider_profile
Get provider-aware pricing, cache telemetry fields, and prompt-shaping guidance for a model.
get_savings_inline
Get a compact one-line savings summary. Embed this in other tool responses to show real-time savings. Example: 'Saved 3,400 tokens ($0.051) | Session: $2.34 saved (8.1x ROI)'
get_savings_report
Get a detailed report of token savings for this session. Shows total tokens saved, dollars saved, compression ratios, per-tool breakdown, monthly projection, and ROI vs the Pro plan. Use this to justify the value of token compression to your team.
get_stats
Get statistics about ingested documents. Shows token counts, compression ratios, and graph structure. Useful for understanding the semantic compression efficiency.
get_user_profile
Build a deterministic user profile from explicit stored memories within the requested scope.
get_version_history
[HISTORY] VERSION HISTORY: Get version timeline for a document. Shows all cached versions with timestamps, checksums, file paths, and compression stats. Similar to 'git log' for cached documents. Use this to browse version history before using diff_cached_file to compare specific versions.
ingest_context
Ingest and compress a document into a semantic graph. This creates a fidelity-preserving encoding that reduces token usage by 80-95%. The document is analyzed for structure, relationships, and importance. Returns a compressed skeleton view. Provide document content via 'text' (inline) or 'file_url' (fetched via HTTPS). Optionally provide file_path to enable file sync tracking and version history.
ingest_multimodal
Production-grade multimodal ingestion for text, code, images, audio transcripts, and document-with-images bundles.
ingest_transcript
Extract decisions, lessons, patterns, and gotchas from a conversation transcript and store them as scoped memories automatically.
invalidate_fact
Invalidate a fact so temporal retrieval excludes it by default.
lint_knowledge
Run quality checks on stored memories: staleness, near-duplicates, contradictions, and orphan detection. Returns a structured lint report.
list_connector_feeds
List managed connector feeds and their last sync state.
list_connector_types
List available managed connector types and their purposes.
list_datasets
List named datasets available for experiment runs.
list_documents
[LIST] LIST DOCUMENTS: Get inventory of all ingested documents. Returns structured information about each document including file_id, metadata, node count, token counts, and ingestion time. Use this to discover what documents are available for querying.
list_fact_history
List temporal fact versions for a document or exact fact identifier.
list_handoff_bundles
List structured handoff bundles visible to the current scope.
list_memories
List explicit memories in the requested scope. Supports optional category filtering and result limits.
list_prefix_collisions
List rendered prompt prefixes that collide across templates so shared provider-cacheable prefixes are visible.
list_prompt_templates
List managed prompt templates with their latest version and deployment labels. Optionally include all versions.
list_tee_entries
List recent tee entries with metadata. Shows what original content has been preserved for recovery. Filter by source (cli_optimizer, proxy, compression).
modulate_region
Retrieve specific sections at a chosen fidelity level. Use this to 'zoom in' on relevant parts after reading the skeleton. 5 Fidelity levels (JSCCM-inspired adaptive modulation): 'ABSTRACT' (~10 tokens/node) - Quick summary only, 'OUTLINE' (~30 tokens/node) - Summary + section context, 'STRUCTURE' (~50 tokens/node) - Summary + entities + metadata, 'DETAILED' (~100 tokens/node) - Summary + entities + key excerpts, 'RAW' (variable tokens) - Full original content. This implements adaptive semantic
multilevel_encode
MULTI-LEVEL ENCODING (JSCCM-inspired): Generate skeleton with 3 priority levels: Main branch (top 15%, always included) for critical concepts; Auxiliary branch (next 25%, included if space allows) for important details; Detail branch (remaining, only if plenty of space) for supplementary content. Progressively adds levels based on available context window. Inspired by JSCCM's parallel encoder architecture.
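The progressive inclusion described for multilevel_encode can be sketched as three ranked slices that are added while the token budget allows. The 15%/25%/remainder split comes from the description; the input shape, the all-or-nothing inclusion of each level, and the function name are assumptions made for this illustration.

```python
import math


def multilevel_encode(nodes: list[tuple[str, float, int]], budget: int) -> list[str]:
    """nodes is a list of (name, importance, token_cost); returns the selected names."""
    ranked = sorted(nodes, key=lambda n: n[1], reverse=True)
    main_end = math.ceil(len(ranked) * 15 / 100)
    aux_end = main_end + math.ceil(len(ranked) * 25 / 100)
    levels = [ranked[:main_end], ranked[main_end:aux_end], ranked[aux_end:]]

    # The main branch is always included, even if it alone exceeds the budget.
    selected = [name for name, _, _ in levels[0]]
    used = sum(cost for _, _, cost in levels[0])

    # Auxiliary and detail branches are added whole, in order, while they fit.
    for level in levels[1:]:
        level_cost = sum(cost for _, _, cost in level)
        if used + level_cost > budget:
            break
        selected += [name for name, _, _ in level]
        used += level_cost
    return selected
```

Adding levels whole keeps the sketch short; a finer-grained variant could fill the remaining budget node by node within the first level that does not fit.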
multimodal_ingest
multimodal_ingest[EXPERIMENTAL] Ingest mixed content (text, code, images). Requires Pillow for image support. Image paths validated for security. NOT production-ready. Returns experimental flag.
optimize_for_model
optimize_for_modelGenerate provider-aware cost, fidelity, and prompt-shaping recommendations for a target model.
project_stats
project_statsFetch usage statistics for your projects. With no arguments, lists all your projects. With project_id, returns that project's compression count and creation date. Pro+ only.
proxy_mcp_server
proxy_mcp_serverProxy a JSON-RPC method call to another MCP server and optionally compress the tools/list response on the way back. Useful for reducing context overhead when chaining MCP servers. Pro+ only.
prune_by_relevance
prune_by_relevancePrune document nodes by query relevance using attention-guided scoring. Keeps only the most relevant nodes for a given query, achieving up to 6x compression with better quality than blind ratio-based pruning.
read_skeleton
read_skeletonRead the compressed skeleton view of a previously ingested document. Shows high-importance 'anchor' concepts with summaries, and lists other sections as expandable nodes. Achieves 80-95% token reduction. Use this FIRST before requesting specific details. Selection modes: auto (default, smart detection), baseline, query_guided, evidence_aware. The response includes 'selection_mode_resolved' indicating which mode was actually used.
recommend_compression
recommend_compression[ADVISE] Recommend the optimal compression profile for a document. Simulates each profile, predicts quality (entity retention + coverage), and returns the most compressed profile meeting your quality floor. Useful before ingesting to choose between minimal/summary/balanced/detailed/full.
recommend_fidelity
recommend_fidelity[TIP] Get intelligent recommendation for optimal fidelity level. Analyzes your use case, number of nodes, token budget, and query complexity to suggest the best fidelity level (ABSTRACT, OUTLINE, STRUCTURE, DETAILED, or RAW). Returns recommendation with reasoning, token estimate, and alternatives. Use this BEFORE modulate_region to make informed decisions.
recover_session
recover_sessionRecover session state after conversation compaction. Returns a compact summary of all prior ingestions, configurations, and tool calls for the given session.
refresh_document
refresh_document[REFRESH] REFRESH DOCUMENT: Re-ingest document from source file to update cache with latest changes. Stores old version in history (default: keeps last 10 versions). Use this when check_file_sync detects staleness or after external edits. Returns new compression stats and confirms version was saved.
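A bounded version history like the one described (keep the last 10 versions) can be sketched with a fixed-length deque. The DocumentHistory class is hypothetical and says nothing about how gotcontext stores versions internally.

```python
from collections import deque

class DocumentHistory:
    """Bounded version history (hypothetical class, for illustration)."""

    def __init__(self, max_versions=10):
        self.current = None
        # With maxlen set, appending past the limit drops the oldest entry.
        self.history = deque(maxlen=max_versions)

    def refresh(self, new_content):
        # Save the outgoing version before replacing it.
        if self.current is not None:
            self.history.append(self.current)
        self.current = new_content

doc = DocumentHistory()
for i in range(13):
    doc.refresh(f"v{i}")
```

After 13 refreshes the history holds only the 10 most recent prior versions; the two oldest have been discarded.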
render_prompt_template
render_prompt_templateResolve and render a prompt template into cache-friendly ordered sections for a provider call.
replay_handoff_bundle
replay_handoff_bundleReplay a structured handoff bundle as text plus token-efficient artifact payloads.
run_experiment
run_experimentRun a tracked benchmark/evaluation experiment over a named dataset and store benchmark, verifier, and reward outputs.
scar_compress
scar_compress[EXPERIMENTAL] Compress embeddings using SCAR (learnable compression). WARNING: Uses UNTRAINED random weights by default. Requires PyTorch. NOT production-ready without model training. Returns experimental flag.
scar_get_stats
scar_get_stats[EXPERIMENTAL] Get SCAR compressor statistics and model state. Shows PyTorch availability and model training status. NOT production-ready. Returns experimental flag.
search_code
search_codeFast regex or literal code search using tensor-grep trigram index. Returns file paths and matching lines. Chain with ingest_context for targeted compression of search results. Falls back gracefully if tensor-grep is not installed.
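For readers unfamiliar with trigram indexes, here is a generic sketch of the technique for literal search. It is not tensor-grep's implementation; all function names and the sample files are assumptions.

```python
from collections import defaultdict

def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}

def build_index(files):
    """Map each trigram to the set of file paths containing it."""
    index = defaultdict(set)
    for path, content in files.items():
        for gram in trigrams(content):
            index[gram].add(path)
    return index

def search_literal(index, files, needle):
    """Return (path, line) pairs whose line contains the literal needle."""
    grams = trigrams(needle)
    if grams:
        # A matching file must contain every trigram of the needle.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    else:
        candidates = set(files)  # needle too short to filter by trigram
    hits = []
    for path in sorted(candidates):
        for line in files[path].splitlines():
            if needle in line:
                hits.append((path, line))
    return hits

files = {
    "a.py": "def ingest_context(doc):\n    return doc\n",
    "b.py": "print('hello')\n",
}
index = build_index(files)
hits = search_literal(index, files, "ingest_context")
```

The index only narrows the candidate set; each candidate file is still scanned line by line to confirm the match.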
search_memory
search_memorySearch explicit memories within the requested scope using lexical overlap and similarity scoring.
search_memory_index
search_memory_indexIndex-first memory search: compiles an index from stored memories, then returns matching articles. Best for small corpora (<500 entries) where an LLM-readable index beats embedding search.
search_multimodal
search_multimodalSearch a production multimodal project using text, code, or image queries.
search_semantic
search_semanticSemantic search across ingested documents. Uses vector similarity to find relevant sections, even if exact keywords don't match. Returns ranked node IDs. Optionally enables evidence-aware insufficiency detection.
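Ranking by vector similarity can be sketched with plain cosine similarity. The two-dimensional vectors and node IDs below are made up; the embedding model and similarity metric the tool actually uses are not stated in this listing.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_nodes(query_vec, node_vecs):
    """Return node IDs ordered from most to least similar to the query."""
    return sorted(
        node_vecs,
        key=lambda node_id: cosine(query_vec, node_vecs[node_id]),
        reverse=True,
    )

# Hypothetical embeddings keyed by node ID.
node_vecs = {"n1": [1.0, 0.0], "n2": [0.0, 1.0], "n3": [0.7, 0.7]}
ranked = rank_nodes([1.0, 0.1], node_vecs)
```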
search_timeline
search_timelineSearch lifecycle events across ingests, reads, searches, and invalidations.
set_compression_profile
set_compression_profileSet a named compression profile for the session. Profiles bundle skeleton_ratio, fidelity, and chunk_size into presets: minimal (max compression), summary (quick overview), balanced (default), detailed (deep analysis), full (near-original). Explicit parameters in subsequent tool calls override profile defaults.
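The override rule (explicit parameters win over profile defaults) can be sketched as a dictionary merge. The preset values here are invented, since the listing does not give the real numbers; only the parameter names and fidelity labels come from the descriptions.

```python
# Hypothetical preset values; only the keys and profile names come from the listing.
PROFILES = {
    "minimal": {"skeleton_ratio": 0.05, "fidelity": "ABSTRACT", "chunk_size": 2000},
    "balanced": {"skeleton_ratio": 0.2, "fidelity": "STRUCTURE", "chunk_size": 1000},
    "full": {"skeleton_ratio": 0.9, "fidelity": "RAW", "chunk_size": 500},
}

def resolve_settings(profile, **explicit):
    """Start from the profile defaults, then apply explicit overrides."""
    settings = dict(PROFILES[profile])  # copy so the preset is not mutated
    settings.update({k: v for k, v in explicit.items() if v is not None})
    return settings

settings = resolve_settings("balanced", chunk_size=512)
```

Only chunk_size changes in this call; skeleton_ratio and fidelity still come from the balanced preset.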
should_compress
should_compress[PRE-CHECK] TOKEN-EFFICIENT PRE-CHECK: Estimate token count for a file WITHOUT reading content. Uses file size heuristics and binary content detection. CRITICAL: Call this BEFORE reading or ingesting any file. Detects binary files (PDF, DOCX, images) that need conversion before compression. Returns recommendation: SKIP (<100 tokens), DIRECT_READ (100-500), COMPRESS (>500), or CONVERT_THEN_COMPRESS (binary files with MarkItDown suggestion). Fields: needs_conversion, is_text_readable, conversion_t
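The recommendation thresholds above can be sketched as a small decision function. The 4-bytes-per-token estimate is a common rule of thumb assumed here, not the tool's documented heuristic, and binary detection is passed in as a flag rather than implemented.

```python
def recommend(size_bytes, is_binary=False, bytes_per_token=4):
    """Map a file size to a recommendation using the listed token thresholds.

    bytes_per_token=4 is an assumed heuristic, not taken from the tool.
    """
    if is_binary:
        return "CONVERT_THEN_COMPRESS"
    estimated_tokens = size_bytes // bytes_per_token
    if estimated_tokens < 100:
        return "SKIP"
    if estimated_tokens <= 500:
        return "DIRECT_READ"
    return "COMPRESS"

small = recommend(200)                   # about 50 tokens
medium = recommend(2000)                 # about 500 tokens
large = recommend(40000)                 # about 10,000 tokens
pdf = recommend(40000, is_binary=True)   # needs conversion first
```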
submit_benchmark
submit_benchmarkSubmit a benchmark result for an LLM inference run. Records model + quantization + hardware + settings + measured tokens/sec. Returns a permalink slug (https://gotcontext.ai/benchmarks/runs/<slug>) you can share. Auto-flags as AI-submitted. Project-allowlist-aware (Pro+).
summarize_user_memory
summarize_user_memorySummarize one user's explicit memories into preferences, topical signals, and category breakdowns.
sync_connector_feed
sync_connector_feedNormalize and ingest a managed connector feed through the standard compression pipeline.
tee_store_stats
tee_store_statsGet tee store statistics: entry count, total size, mode, thresholds. Use to monitor tee storage usage.
tool_help
tool_help[HELP] Get detailed help, examples, and tips for any Semantic Modulator tool. Returns structured help with parameter descriptions, usage examples, and related tools. Use without tool_name to see all available tools organized by category. Set verbose=true for comprehensive examples.
toon_decode
toon_decode[EXPERIMENTAL] Decode TOON format back to structured data. TOON is lossy - optimized for LLM consumption, not round-trip serialization. NOT production-ready. Returns experimental flag.
toon_encode
toon_encode[EXPERIMENTAL] Encode data to TOON format (~40% smaller than JSON). TOON = Token-Oriented Object Notation. Pure Python, always available. NOT production-ready. Returns experimental flag.
update_prompt_template
update_prompt_templateCreate a new version of an existing prompt template. Supports prompt edits, variable changes, metadata updates, and change notes.
verify_compression
verify_compression[EXPERIMENTAL] Verify compression operation using ASG-SI contracts. Checks preconditions (valid input, fidelity level) and postconditions (compression ratio, skeleton quality). Returns contract violations. Based on arxiv.org/abs/2512.23760 Audited Skill-Graph Self-Improvement.
visualize_graph_html
visualize_graph_html[VIZ] Generate interactive HTML visualization of the semantic graph. Creates a beautiful, interactive web page with draggable nodes, zoom/pan, color-coded importance, and edge weights. Great for exploring and presenting compression decisions. Requires pyvis library.
System Capabilities
Input Modes
Output Modes
Streaming
✓ Supported
Category
Business / Development