Xata Agent: AI agent expert in PostgreSQL

(github.com)

101 points | by todsacerdoti 8 days ago

9 comments

  • simonw 8 days ago
    The prompts that do most of the work are in this file: https://github.com/xataio/agent/blob/main/apps/dbagent/src/l... - strings stored in these variables:

      SLOW_QUERIES_PLAYBOOK
      GENERAL_MONITORING_PLAYBOOK
      TUNING_PLAYBOOK
      INVESTIGATE_HIGH_CPU_USAGE_PLAYBOOK
      INVESTIGATE_HIGH_CONNECTION_COUNT_PLAYBOOK
      INVESTIGATE_LOW_MEMORY_PLAYBOOK
    
    Looks like they are orchestrated by these system prompts: https://github.com/xataio/agent/blob/69329cede85d4bc920558c0...
  • jasonthorsness 8 days ago
    During incidents I've often found that the issue was obvious or had been brewing for some time - the idea of an LLM-driven "smart" monitoring system for key services that can recognize problems and often take action could hopefully make this less common. I'll be looking at how this works to try something similar for my own company's services.

    I think a key for this one is "I use preset SQL commands. I will never run destructive (even potentially destructive) commands against your database." If it's also locked down to only informational queries (and not leaking user tables to the LLM providers) I think why not try this?

    I do wonder about the cost of this at scale compared to the cost of the services being monitored. Hopefully an Agent tax doesn't become another Datadog tax.

    • tudorg 8 days ago
      > I do wonder about the cost of this at scale compared to the cost of the services being monitored. Hopefully an Agent tax doesn't become another Datadog tax.

      One idea that we want to experiment with is letting the model pick the next time that it runs (between bounds). So if the model has any reason for concern it runs more often; otherwise maybe once every couple of hours is enough.
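The adaptive-scheduling idea described above can be sketched roughly like this. This is a minimal illustration, not code from the Xata Agent repo; the function name and the bounds are made up for the example:

```python
# Sketch: the model proposes when it wants to run next, and the
# scheduler clamps that suggestion between operator-set bounds.
from datetime import timedelta

MIN_INTERVAL = timedelta(minutes=5)  # never poll more often than this
MAX_INTERVAL = timedelta(hours=2)    # never go quiet for longer than this

def next_run_delay(model_suggestion: timedelta) -> timedelta:
    """Clamp the model's suggested delay to the configured bounds."""
    return max(MIN_INTERVAL, min(model_suggestion, MAX_INTERVAL))

# A worried model asking to re-check in 1 minute gets clamped up to 5
# minutes; a relaxed model asking for 6 hours gets clamped down to 2.
print(next_run_delay(timedelta(minutes=1)))  # 0:05:00
print(next_run_delay(timedelta(hours=6)))    # 2:00:00
```

This keeps a panicking model from burning tokens in a tight loop, while guaranteeing a maximum time between health checks even when the model sees nothing interesting.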

  • lelandfe 8 days ago
    > I support multiple models from OpenAI, Anthropic, and Deepseek.

    Are there risks associated with sending DB info off to these third parties?

    • fforflo 8 days ago
      Of course there are, but I guess this relies on the various system views (pg_*), like most monitoring tools, and those allow fine-grained access control if you create appropriate roles.
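The real enforcement for this would live in database roles (e.g. Postgres's built-in pg_monitor role), but an application-side guard along these lines is a cheap defense-in-depth complement. This is an illustrative sketch, not the agent's actual validation logic:

```python
import re

# Accept only SELECT statements that read from pg_*-style system views
# (pg_stat_activity, pg_stat_statements, ...) and contain no write verbs.
# A regex check like this is a sketch only: real access control belongs
# in a locked-down, read-only database role.
SYSTEM_VIEW = re.compile(r"\bpg_[a-z_]+\b")
WRITE_VERB = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|grant)\b", re.IGNORECASE
)

def is_monitoring_query(sql: str) -> bool:
    s = sql.strip()
    return (
        s.lower().startswith("select")
        and not WRITE_VERB.search(s)
        and SYSTEM_VIEW.search(s) is not None
    )

print(is_monitoring_query("SELECT * FROM pg_stat_activity"))  # True
print(is_monitoring_query("DELETE FROM users"))               # False
print(is_monitoring_query("SELECT * FROM orders"))            # False
```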
    • vosper 8 days ago
      > Are there risks associated with sending DB info off to these third parties?

      You can use AWS Bedrock to get access to Claude, and then just have something that proxies from your tool over to AWS. It'll work provided you can set a provider URL for the LLM.

      So provided that you trust AWS, you can use at least Claude (the best LLM anyway, at least today) with confidence.

    • cship2 8 days ago
      You can also self-host via Ollama, assuming you have a proper GPU. Running it on a CPU can take minutes.
    • thinkingtoilet 8 days ago
      Of course!
  • femiagbabiaka 8 days ago
    Interesting, might try it out at home.

    Documentation asserts: > I use preset SQL commands. I will never run destructive (even potentially destructive) commands against your database.

    This is enforced by taking the responsibility for generating SQL in order to evaluate state out of the hands of the LLM. The LLM simply interprets results of predetermined commands based on a set of prompts/playbooks: https://github.com/xataio/agent/blob/69329cede85d4bc920558c0...
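The pattern described above can be reduced to a name-based allowlist: the LLM is only ever allowed to choose a command *name*, and the SQL text is fixed at build time, so the model never authors a query. The command names and SQL below are illustrative, not the agent's actual playbooks:

```python
# Preset, read-only commands; the LLM picks by name and never writes SQL.
PRESET_COMMANDS = {
    "active_connections": "SELECT count(*) FROM pg_stat_activity;",
    "slow_queries": (
        "SELECT query, mean_exec_time FROM pg_stat_statements "
        "ORDER BY mean_exec_time DESC LIMIT 10;"
    ),
}

def run_preset(name: str) -> str:
    """Look up a preset by name; reject anything outside the allowlist."""
    if name not in PRESET_COMMANDS:
        raise ValueError(f"unknown preset command: {name}")
    return PRESET_COMMANDS[name]  # a real agent would execute this read-only

print(run_preset("active_connections"))
# A hallucinated free-form query can't get through the allowlist:
try:
    run_preset("DROP TABLE users")
except ValueError as e:
    print(e)
```

Even if the model emits garbage, the worst case is a lookup failure, not an executed statement.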

    • iLoveOncall 8 days ago
      Until it hallucinates and doesn't run the command that you expect or doesn't run the command at all?

      This doesn't take anything out of the hands of the LLM.

      • femiagbabiaka 8 days ago
        A hallucination that skips the predefined commands and instead produces (a) arbitrary SQL queries that are also (b) destructive seems unlikely. But as a best practice, the database user that the agent connects with should of course be locked down.
      • tudorg 8 days ago
        There are things that can go wrong, but not so wrong as to delete your data or cause outages. So there is a level of safety that I think is important at this moment. We do want to also allow executing generated SQL, but with an approval workflow.
    • esafak 8 days ago
      I'd define the operations I allow in a composable language like Malloy and expose them to the LLM through MCP.
  • upghost 7 days ago
    Watched the video. That is an absolutely fantastic UI. Seriously, huge props. This makes the project actually useful (as opposed to "why not just code this myself?"). Well done Xataio!
  • a_tyler_ 7 days ago
    The Xata Agent sounds like a cool use of AI for PostgreSQL monitoring—having an LLM interpret logs and metrics could be super useful for catching issues early. I like that it sticks to preset SQL commands to avoid unintended actions. That said, I’d be curious about the privacy implications of sending DB insights to an AI and the cost of running LLMs at scale. Maybe a self-hosted option would make sense? Either way, it’s an interesting direction for database monitoring!
  • iLoveOncall 8 days ago
    The title should really include the fact that it's an expert at MONITORING PostgreSQL. It's not for writing queries from natural language.

    I'm extremely interested in the latter but not at all in the first.

  • arcticfox 8 days ago
    This is very cool! What is preventing the other cloud providers from being supported, I wonder? Is the integration not just a connection string?
    • tudorg 8 days ago
      It is a connection string for the direct queries, but the agent also needs access to the metrics/logs via cloud-specific tools. In the case of AWS, that's CloudWatch. We do plan to add support for all the popular Postgres services.
  • arjunlol 8 days ago
    Looks cool! This may actually save me a lot of manual DBA work