I've spent 25 years building data infrastructure in financial services — Goldman Sachs, Bridgewater, Deutsche Bank, Freddie Mac. Today I'm open-sourcing Datris, a data platform built around MCP (Model Context Protocol) from day one.
The idea: as far as I know, this is the first data platform where MCP is the primary interface rather than an afterthought. We built the MCP server first and made everything accessible through it. The platform exposes 30+ MCP tools: any agent (Claude, Cursor, your own framework) can create pipelines, ingest data, validate with plain-English rules, transform, query databases, search vector stores, and monitor jobs. The API and UI use the same pipeline engine, but MCP is the native interface.
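Concretely, "MCP as the primary interface" means every operation is reachable as a standard MCP tool call. A request from an agent looks like the JSON-RPC message below; the tool name `create_pipeline` and its arguments are illustrative assumptions, not necessarily the exact names Datris uses:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "create_pipeline",
    "arguments": {
      "name": "sales",
      "sample": "id,amount\n1,9.99\n2,14.50"
    }
  }
}
```

Any MCP-capable client (Claude Desktop, Cursor, or a custom agent) can discover the available tools via `tools/list` and invoke them this way, which is what lets one server back 30+ operations without a bespoke SDK.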
What an agent can do:

- Create a complete pipeline from sample data in one call (schema auto-detected)
- Upload and process CSV, JSON, XML, Excel, PDFs, Word docs
- Validate and transform data using natural-language instructions
- Query PostgreSQL and MongoDB
- Semantic search across 5 vector databases (Qdrant, Weaviate, Milvus, Chroma, pgvector)
- Ask questions in natural language; SQL is generated and executed automatically
- Run a full RAG pipeline: extract, chunk, embed, and search documents
- Profile data quality, diagnose errors, explore metadata
There's also a CLI:

```bash
datris ingest sales.csv --dest postgres
datris query "SELECT * FROM public.sales"
datris query "top 5 stocks by volume" --table trades
datris search "return policy" --store pgvector --collection docs
```
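The second `datris query` form above takes a natural-language question and runs generated SQL against a table. Datris does this with an LLM; purely to illustrate the question-to-SQL mapping, here is a deliberately narrow rule-based sketch (all names here are illustrative):

```python
import re

def toy_nl_to_sql(question: str, table: str) -> str:
    """Map one question shape ('top N <things> by <metric>') to SQL.
    Toy illustration; the real system generates SQL with an LLM."""
    m = re.match(r"top (\d+) (\w+) by (\w+)$", question.strip().lower())
    if not m:
        raise ValueError("unsupported question shape")
    n, _entity, metric = m.groups()
    return f"SELECT * FROM {table} ORDER BY {metric} DESC LIMIT {n}"

print(toy_nl_to_sql("top 5 stocks by volume", "trades"))
# SELECT * FROM trades ORDER BY volume DESC LIMIT 5
```

The LLM-backed version handles arbitrary phrasing and joins, but the contract is the same: question in, executable SQL out, results returned to the agent.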
Install:

```bash
git clone https://github.com/datris/datris-platform-oss
cd datris-platform-oss
cp .env.example .env
docker compose up -d
```
CLI:

```bash
brew tap datris/tap && brew install datris
```

MCP server:

```bash
uvx datris-mcp-server
```
The stack is all open source: Spring Boot, Docker, MinIO, MongoDB, ActiveMQ, Vault, PostgreSQL. AI via Anthropic, OpenAI, or Ollama (fully local).
GitHub: https://github.com/datris/datris-platform-oss

Docs: https://docs.datris.ai
Happy to answer questions.