Xata Agent: AI agent expert in PostgreSQL

(github.com)

101 points | by todsacerdoti 8 days ago

9 comments

  • simonw 8 days ago
    The prompts that do most of the work are in this file: https://github.com/xataio/agent/blob/main/apps/dbagent/src/l... - strings stored in these variables:

      SLOW_QUERIES_PLAYBOOK
      GENERAL_MONITORING_PLAYBOOK
      TUNING_PLAYBOOK
      INVESTIGATE_HIGH_CPU_USAGE_PLAYBOOK
      INVESTIGATE_HIGH_CONNECTION_COUNT_PLAYBOOK
      INVESTIGATE_LOW_MEMORY_PLAYBOOK
    
    Looks like they are orchestrated by these system prompts: https://github.com/xataio/agent/blob/69329cede85d4bc920558c0...
  • jasonthorsness 8 days ago
    During incidents I've often found that the issue was obvious or had been brewing for some time - the idea of an LLM-driven "smart" monitoring system for key services that can recognize problems and often take action could hopefully make this less common. I'll be looking at how this works to try something similar for my own company's services.

    I think a key for this one is "I use preset SQL commands. I will never run destructive (even potentially destructive) commands against your database." If it's also locked down to only informational queries (and not leaking user tables to the LLM providers) I think why not try this?

    I do wonder about the cost of this at scale compared to the cost of the services being monitored. Hopefully an Agent tax doesn't become another Datadog tax.

    • tudorg 8 days ago
      > I do wonder about the cost of this at scale compared to the cost of the services being monitored. Hopefully an Agent tax doesn't become another Datadog tax.

      One idea that we want to experiment with is letting the model pick the next time that it runs (between bounds). So if the model has any reason for concern it runs more often; otherwise maybe once every couple of hours is enough.
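The adaptive-scheduling idea described above can be sketched roughly like this. This is a minimal illustration, not code from the Xata Agent repo; the function name and the bounds are made up for the example:

```python
# Sketch: the model proposes when it wants to run next, and the
# scheduler clamps that suggestion between operator-set bounds.
from datetime import timedelta

MIN_INTERVAL = timedelta(minutes=5)  # never poll more often than this
MAX_INTERVAL = timedelta(hours=2)    # never go quiet for longer than this

def next_run_delay(model_suggestion: timedelta) -> timedelta:
    """Clamp the model's suggested delay to the configured bounds."""
    return max(MIN_INTERVAL, min(model_suggestion, MAX_INTERVAL))

# A worried model asking to re-check in 1 minute gets clamped up to 5
# minutes; a relaxed model asking for 6 hours gets clamped down to 2.
print(next_run_delay(timedelta(minutes=1)))  # 0:05:00
print(next_run_delay(timedelta(hours=6)))    # 2:00:00
```

This keeps a panicking model from burning tokens in a tight loop, while guaranteeing a maximum time between health checks even when the model sees nothing interesting.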

  • lelandfe 8 days ago
    > I support multiple models from OpenAI, Anthropic, and Deepseek.

    Are there risks associated with sending DB info off to these third parties?

    • fforflo 8 days ago
      Of course there are, but I guess this relies on the various system views (pg_*), like most monitoring tools, and those allow fine-grained access control if you create appropriate roles.
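The real enforcement for this would live in database roles (e.g. Postgres's built-in pg_monitor role), but an application-side guard along these lines is a cheap defense-in-depth complement. This is an illustrative sketch, not the agent's actual validation logic:

```python
import re

# Accept only SELECT statements that read from pg_*-style system views
# (pg_stat_activity, pg_stat_statements, ...) and contain no write verbs.
# A regex check like this is a sketch only: real access control belongs
# in a locked-down, read-only database role.
SYSTEM_VIEW = re.compile(r"\bpg_[a-z_]+\b")
WRITE_VERB = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|grant)\b", re.IGNORECASE
)

def is_monitoring_query(sql: str) -> bool:
    s = sql.strip()
    return (
        s.lower().startswith("select")
        and not WRITE_VERB.search(s)
        and SYSTEM_VIEW.search(s) is not None
    )

print(is_monitoring_query("SELECT * FROM pg_stat_activity"))  # True
print(is_monitoring_query("DELETE FROM users"))               # False
print(is_monitoring_query("SELECT * FROM orders"))            # False
```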
    • vosper 8 days ago
      > Are there risks associated with sending DB info off to these third parties?

      You can use AWS Bedrock to get access to Claude, and then just have something that proxies from your tool over to AWS. It'll work provided you can set a provider URL for the LLM.

      So provided that you trust AWS, you can use at least Claude (the best LLM anyway, at least today) with confidence.

    • cship2 8 days ago
      You can also self-host via Ollama, assuming you have a proper GPU. Running it on a CPU can take minutes.
    • thinkingtoilet 8 days ago
      Of course!
  • femiagbabiaka 8 days ago
    Interesting, might try it out at home.

    Documentation asserts: > I use preset SQL commands. I will never run destructive (even potentially destructive) commands against your database.

    This is enforced by taking the responsibility for generating SQL in order to evaluate state out of the hands of the LLM. The LLM simply interprets results of predetermined commands based on a set of prompts/playbooks: https://github.com/xataio/agent/blob/69329cede85d4bc920558c0...
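The pattern described above can be reduced to a name-based allowlist: the LLM is only ever allowed to choose a command *name*, and the SQL text is fixed at build time, so the model never authors a query. The command names and SQL below are illustrative, not the agent's actual playbooks:

```python
# Preset, read-only commands; the LLM picks by name and never writes SQL.
PRESET_COMMANDS = {
    "active_connections": "SELECT count(*) FROM pg_stat_activity;",
    "slow_queries": (
        "SELECT query, mean_exec_time FROM pg_stat_statements "
        "ORDER BY mean_exec_time DESC LIMIT 10;"
    ),
}

def run_preset(name: str) -> str:
    """Look up a preset by name; reject anything outside the allowlist."""
    if name not in PRESET_COMMANDS:
        raise ValueError(f"unknown preset command: {name}")
    return PRESET_COMMANDS[name]  # a real agent would execute this read-only

print(run_preset("active_connections"))
# A hallucinated free-form query can't get through the allowlist:
try:
    run_preset("DROP TABLE users")
except ValueError as e:
    print(e)
```

Even if the model emits garbage, the worst case is a lookup failure, not an executed statement.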

    • iLoveOncall 8 days ago
      Until it hallucinates and doesn't run the command that you expect or doesn't run the command at all?

      This doesn't take anything out of the hands of the LLM.

      • femiagbabiaka 8 days ago
        A hallucination that skips the predefined commands and instead produces (a) arbitrary SQL queries that are also (b) destructive seems unlikely. But as a best practice, the database user that the agent connects with should of course be locked down.
      • tudorg 8 days ago
        There are things that can go wrong, but not so wrong as to delete your data or cause outages. So there is a level of safety that I think is important at this moment. We do want to also allow executing generated SQL, but with an approval workflow.
    • esafak 8 days ago
      I'd define the operations I allow in a composable language like Malloy and expose them to the LLM through MCP.
  • upghost 7 days ago
    Watched the video. That is an absolutely fantastic UI. Seriously, huge props. This makes the project actually useful (as opposed to "why not just code this myself?"). Well done Xataio!
  • a_tyler_ 7 days ago
    The Xata Agent sounds like a cool use of AI for PostgreSQL monitoring—having an LLM interpret logs and metrics could be super useful for catching issues early. I like that it sticks to preset SQL commands to avoid unintended actions. That said, I’d be curious about the privacy implications of sending DB insights to an AI and the cost of running LLMs at scale. Maybe a self-hosted option would make sense? Either way, it’s an interesting direction for database monitoring!
  • iLoveOncall 8 days ago
    The title should really include the fact that it's an expert at MONITORING PostgreSQL. It's not for writing queries from natural language.

    I'm extremely interested in the latter but not at all in the first.

  • arcticfox 8 days ago
    This is very cool! What is preventing the other cloud providers from being supported, I wonder? Is the integration not just a connection string?
    • tudorg 8 days ago
      It is a connection string for the direct queries, but the agent also needs access to the metrics/logs via cloud-specific tools. In the case of AWS, that's CloudWatch. We do plan to add support for all the popular Postgres services.
  • arjunlol 8 days ago
    Looks cool! This may actually save me a lot of manual DBA work