What is BioContext7
An overview of BioContext7 — the bioinformatics registry aggregator and pipeline generator
Overview
BioContext7 is a bioinformatics registry aggregator and self-healing pipeline generator designed for integration with Claude Code via the Model Context Protocol (MCP).
It solves a core problem in bioinformatics: discovering the right tools for a workflow and generating correct, executable pipeline code from natural language descriptions.
Architecture
BioContext7 is composed of four layers:
Registry Layer
Aggregates tool metadata from multiple sources into a unified, searchable index:
- bio.tools — 47,000+ curated bioinformatics tools with EDAM annotations
- BioContainers — Docker and Singularity container images for reproducible execution
- EDAM Ontology — Semantic terms for operations, topics, data types, and formats
- UniProt — Protein sequences, annotations, and ID mapping
- GA4GH Standards — Beacon (variant queries), VRS (variant representation), WES (workflow execution)
- Metabolomics — HMDB, Metabolomics Workbench, MassBank, LIPID MAPS
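As a rough illustration of what "unified, searchable index" can mean, the sketch below merges per-source records that share a tool name and runs a keyword search over the result. `ToolRecord`, `merge_records`, `search`, and all sample values are hypothetical names and data chosen for this example; they are not BioContext7's actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolRecord:
    """Hypothetical per-source record; field names are illustrative only."""
    name: str
    source: str                  # e.g. "bio.tools" or "BioContainers"
    description: str = ""
    edam_operations: tuple = ()  # EDAM operation labels, if the source has them
    containers: tuple = ()       # container image references, if any


def merge_records(records):
    """Merge records that share a case-insensitive tool name into one entry."""
    index = {}
    for rec in records:
        entry = index.setdefault(rec.name.lower(), {
            "name": rec.name, "sources": [], "description": "",
            "edam_operations": [], "containers": [],
        })
        entry["sources"].append(rec.source)
        # Keep the first non-empty description we see.
        entry["description"] = entry["description"] or rec.description
        for key in ("edam_operations", "containers"):
            for value in getattr(rec, key):
                if value not in entry[key]:
                    entry[key].append(value)
    return index


def search(index, keyword):
    """Return sorted tool names whose name or description contains keyword."""
    needle = keyword.lower()
    return sorted(
        entry["name"] for entry in index.values()
        if needle in entry["name"].lower()
        or needle in entry["description"].lower()
    )


# Illustrative sample data, not pulled from the real registries.
SAMPLE = [
    ToolRecord("samtools", "bio.tools",
               description="Utilities for SAM/BAM/CRAM alignment files",
               edam_operations=("Data handling",)),
    ToolRecord("samtools", "BioContainers",
               containers=("quay.io/biocontainers/samtools",)),
    ToolRecord("FastQC", "bio.tools",
               description="Quality control for sequencing reads"),
]

index = merge_records(SAMPLE)
print(search(index, "alignment"))  # ['samtools']
```

Keying on the lowercased name is the simplest possible join. A real aggregator would more likely join on stable identifiers, since tool names can collide across registries.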
Compiler Layer
Translates a language-agnostic intermediate representation (PipelineSpec) into target-specific pipeline code:
| Target | Output |
|---|---|
| Nextflow DSL2 | main.nf + nextflow.config |
| Snakemake | Snakefile + rule files |
| WDL | WDL task and workflow definitions |
| CWL | CWL workflow + tool definitions |
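To make the idea of one intermediate representation feeding several targets concrete, here is a minimal sketch. The `Step` and `PipelineSpec` fields and the two emitter functions are assumptions made for illustration; the real PipelineSpec schema is not described in this overview. The emitters also produce simplified fragments only (a Nextflow process and a Snakemake rule), not complete pipelines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """Hypothetical step in the intermediate representation."""
    name: str       # step identifier, e.g. "count_lines"
    container: str  # container image reference
    command: str    # shell command using {input} and {output} placeholders
    input: str      # input file name
    output: str     # output file name


@dataclass(frozen=True)
class PipelineSpec:
    """Hypothetical language-agnostic pipeline description."""
    name: str
    steps: tuple


def to_nextflow(spec):
    """Emit one simplified Nextflow DSL2 process per step."""
    blocks = []
    for s in spec.steps:
        # Nextflow refers to the declared input as ${infile} inside the script.
        cmd = s.command.format(input="${infile}", output=s.output)
        blocks.append(
            f"process {s.name.upper()} {{\n"
            f"    container '{s.container}'\n\n"
            f"    input:\n    path infile\n\n"
            f"    output:\n    path '{s.output}'\n\n"
            f"    script:\n"
            f'    """\n    {cmd}\n    """\n'
            f"}}\n"
        )
    return "\n".join(blocks)


def to_snakemake(spec):
    """Emit one simplified Snakemake rule per step."""
    rules = []
    for s in spec.steps:
        # Snakemake understands {input} and {output} natively, so the
        # command template is passed through unchanged.
        rules.append(
            f"rule {s.name}:\n"
            f'    input:\n        "{s.input}"\n'
            f'    output:\n        "{s.output}"\n'
            f'    container:\n        "docker://{s.container}"\n'
            f'    shell:\n        "{s.command}"\n'
        )
    return "\n".join(rules)


# Illustrative spec with a single step.
SPEC = PipelineSpec(
    name="demo",
    steps=(Step("count_lines", "ubuntu:22.04",
                "wc -l {input} > {output}", "reads.fastq", "counts.txt"),),
)

print(to_nextflow(SPEC))
print(to_snakemake(SPEC))
```

Because the spec carries no target-specific syntax, adding another backend in this sketch means writing another emitter function rather than changing the spec.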
Healing Layer
Validates generated pipelines using Language Server Protocol (LSP) integration and automatically fixes errors through iterative correction loops:
1. Generate pipeline code
2. Run LSP validation (syntax, type checking)
3. Collect diagnostics
4. Apply auto-fixes
5. Re-validate until clean or max iterations reached
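The loop above can be sketched in a few lines. This is a minimal illustration under stated assumptions: `heal`, `Diagnostic`, and the toy brace-balancing validator and fixer are hypothetical stand-ins. A real implementation would get its diagnostics from a language server rather than from a string count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnostic:
    """Hypothetical diagnostic, loosely modeled on what an LSP server reports."""
    line: int
    message: str


def heal(code, validate, fix, max_iterations=5):
    """Validate, fix, and re-validate until clean or the budget is spent.

    Returns (code, remaining_diagnostics, iterations_used).
    """
    diagnostics = validate(code)
    iterations = 0
    while diagnostics and iterations < max_iterations:
        code = fix(code, diagnostics)
        iterations += 1
        diagnostics = validate(code)
    return code, diagnostics, iterations


def toy_validate(code):
    """Toy validator: report one diagnostic if braces are unbalanced."""
    missing = code.count("{") - code.count("}")
    if missing > 0:
        return [Diagnostic(line=code.count("\n") + 1,
                           message="missing closing brace")]
    return []


def toy_fix(code, diagnostics):
    """Toy fixer: append a single closing brace per iteration."""
    return code + "}\n"


broken = "workflow {\n  main {\n"
fixed, remaining, used = heal(broken, toy_validate, toy_fix)
print(used, remaining)  # 2 []
```

The iteration cap matters: a fixer that cannot resolve a diagnostic would otherwise loop forever, so the function returns the remaining diagnostics for the caller to report.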
MCP Layer
Exposes BioContext7 capabilities as five MCP tools for Claude Code integration:
- resolve-library-id — Search 47K+ bioinformatics tools by name or keyword
- get-library-docs — Fetch versioned documentation for a specific tool (supports topic filtering, token budgets, and chunk size control)
- find-skills — Semantic skill search with quality scoring (bc7score), install commands, and health signals. Supports compact mode for LLM-optimized responses and platform filtering.
- recommend-tools — Get opinionated, ranked tool recommendations with benchmark references for 10 analysis patterns
- report-snippet-quality — Relevance feedback loop that penalizes unhelpful snippets in subsequent retrievals
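For orientation, MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for one of the five tool names listed above. The helper `build_tool_call` and the `query` argument are assumptions for illustration; the actual argument schema of each tool is not given in this overview, and a client would normally discover it through `tools/list`.

```python
import json

# Tool names as listed in this section.
BIOCONTEXT7_TOOLS = (
    "resolve-library-id",
    "get-library-docs",
    "find-skills",
    "recommend-tools",
    "report-snippet-quality",
)


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request envelope for an MCP tools/call."""
    if tool_name not in BIOCONTEXT7_TOOLS:
        raise ValueError(f"unknown tool: {tool_name}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "query" is a hypothetical argument name used only for this example.
request = build_tool_call(1, "resolve-library-id", {"query": "samtools"})
print(json.dumps(request, indent=2))
```

In practice Claude Code constructs these requests itself. The sketch is only meant to show where a tool name and its arguments sit in the message.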
Design Principles
- Deterministic core — Data and specs produce build artifacts without LLM involvement
- Provenance everywhere — Full tracking of inputs, tool versions, and outputs
- Grounded text — Every output references concrete artifacts
- Self-healing — LSP validation loops catch and fix errors automatically
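One way to picture the "deterministic core" and "provenance everywhere" principles together is a provenance record built only from content hashes and version strings, so identical inputs always yield an identical record. The function and field names below are hypothetical, and the sample versions are illustrative; this overview does not specify how BioContext7 actually stores provenance.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def provenance_record(inputs, tool_versions, outputs):
    """Build a deterministic provenance record.

    inputs and outputs map file names to bytes; tool_versions maps
    tool names to version strings. No timestamps or random IDs are
    included, so the same arguments always produce the same record.
    """
    record = {
        "inputs": {name: sha256_hex(data) for name, data in inputs.items()},
        "tool_versions": dict(tool_versions),
        "outputs": {name: sha256_hex(data) for name, data in outputs.items()},
    }
    # Canonical JSON (sorted keys, fixed separators) makes the ID
    # independent of dictionary insertion order.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    record["record_id"] = sha256_hex(canonical.encode("utf-8"))
    return record


# Illustrative values only.
first = provenance_record(
    inputs={"reads.fastq": b"@r1\nACGT\n+\nIIII\n"},
    tool_versions={"samtools": "1.19", "fastqc": "0.12.1"},
    outputs={"counts.txt": b"4 reads.fastq\n"},
)
second = provenance_record(
    inputs={"reads.fastq": b"@r1\nACGT\n+\nIIII\n"},
    tool_versions={"fastqc": "0.12.1", "samtools": "1.19"},  # reordered
    outputs={"counts.txt": b"4 reads.fastq\n"},
)
print(first["record_id"] == second["record_id"])  # True
```

Changing any input byte or tool version changes the record ID, which gives a cheap check that an output really was produced from the stated artifacts.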