• pedalpete 12 days ago
    This is great! But it's also the 3rd platform I've seen in the research space in the last few weeks. I still haven't even gotten around to trying Elicit.com

    You say "try it now", but then link to a sign-up, and you don't have any social sign-in, so I can't just click a button and go.

    Look at elicit.com, at their branding and the quality of their design, then look at your competing site. You need to up your game to earn trust.

    I'm assuming the reason you don't want to just have an open search is due to the cost of running searches, but what's the cost of nobody using it? How can you provide examples at least that showcase what you can do?

    WRT your name, the first thing that came to mind is "undermine", which is not a positive association for research.

    I hope you can take this as constructive feedback. Like I said, I haven't even tried Elicit yet, and I can't remember what the other competitor in this space was.

    But also, here's a bonus. Emmett Shear just posted on Twitter looking for quality research on reaction time. I know of at least one paper on slow-wave enhancement for deep sleep (CLAS, PTAS) where a secondary finding was on reaction time. I said I'd get back to him with the link, but maybe you can do even better and show us what your product can do. What's the best research into reaction time? Is there something other than Clare Anderson's paper on slow-wave sleep and reaction time?

  • bglazer 11 days ago
    These are the best results that I've gotten from an AI research assistant.

    I really don't mind the long latency, in fact, I think it's a fundamentally better way of interacting with this kind of LLM based tool.

    Like the latency is necessary for the LLM to actually interact with the content, rather than just doing a Bing or Perplexity style RAG+summarization workflow that delivers very uneven results.

    I also really like the use of longer prompts, as it encourages a full description of your topic, rather than keyword fiddling trying to make the RAG system pick up the right signifiers.

    The "Discovery Progress and Exhaustiveness" section is a bit confusing as a user. Like, ok we have 23.6% of the relevant papers? Why not 100%? What am I supposed to do with that information? Can you give me any information about the missing papers?

    Overall, very nice work, I'll be using this in the future.

    • tomhartke 11 days ago
      On the discovery progress: we can't run the LLM over all 200M papers, so we prioritize the most promising ones (the first 100 or so) for deep analysis. Within those, we find a few that are relevant. The rate at which we discover them tells us roughly what will happen if we read the next 100 (if we're still finding new relevant papers, we'll likely continue to). We need a better explanation on the website, but we can model this statistically to predict how many papers an exhaustive search of the whole database would turn up.

      More info here: https://www.undermind.ai/static/Undermind_whitepaper.pdf
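      To make the extrapolation idea concrete, here's a toy sketch (my own simplification for illustration, not our production model): fit a saturation curve to the cumulative discovery counts and read off the asymptote.

```python
import math

# Cumulative relevant papers found after each batch read (made-up numbers).
papers_read = [20, 40, 60, 80, 100]
relevant_found = [3, 5, 6, 7, 8]

# Model: relevant(n) = R_total * (1 - exp(-k * n)).
# A crude grid search keeps this dependency-free; scipy.optimize.curve_fit
# would be the usual tool for a real fit.
best_err, R_total, k_best = float("inf"), None, None
for R in [r / 2 for r in range(16, 80)]:          # candidate totals 8.0..39.5
    for k in [i / 1000 for i in range(1, 100)]:   # candidate rates 0.001..0.099
        err = sum((R * (1 - math.exp(-k * n)) - f) ** 2
                  for n, f in zip(papers_read, relevant_found))
        if err < best_err:
            best_err, R_total, k_best = err, R, k

print(f"Estimated total relevant papers: {R_total:.1f}")
print(f"Exhaustiveness so far: {relevant_found[-1] / R_total:.1%}")
```

      In this toy model, the asymptote R_total is the predicted number of relevant papers in the whole database, and found-so-far divided by R_total is the exhaustiveness percentage.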

      • bglazer 11 days ago
        That makes sense, I figured it was related to some of the statistics work in ecology estimating species count from a limited sample.
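        For anyone curious, the classic tool there is the Chao1 estimator: it predicts total richness from how many items were seen exactly once vs. exactly twice in a sample. A quick sketch with made-up data, just to show the formula:

```python
from collections import Counter

# Toy sample: each entry is a "species" (or a relevant paper) observed.
observations = ["a", "a", "b", "c", "c", "c", "d", "e", "e", "f"]

counts = Counter(observations)
s_obs = len(counts)                             # distinct species seen: 6
f1 = sum(1 for c in counts.values() if c == 1)  # seen exactly once: 3
f2 = sum(1 for c in counts.values() if c == 2)  # seen exactly twice: 2

# Chao1 lower bound on total richness (bias-corrected form when f2 == 0).
if f2 > 0:
    s_est = s_obs + f1 * f1 / (2 * f2)
else:
    s_est = s_obs + f1 * (f1 - 1) / 2

print(s_obs, s_est)  # -> 6 8.25
```

        Lots of singletons relative to doubletons means many species are still unseen, which is the same intuition as the discovery-rate argument above.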
  • skeptrune 12 days ago
    Hmm, does this just use the traditional term-frequency search engines (Scholar, arXiv, etc.) under the hood, with query expansion for the preliminary search?

    Without chunking the papers I'm skeptical the prelim search would be all that useful.

    Also, using GPT4 as a cross encoder seems really wasteful both in terms of compute and latency.

    Might try it anyways, but damn, 3-6 mins is brutal. Research has traditionally shown that low latency matters more than relevance for search, because it lets users reformulate queries quickly.

    Maybe this approach is worth it though.

    • tomhartke 12 days ago
      While the time/cost of using GPT-4 is not ideal, GPT-4-level classification is absolutely crucial for the entire adaptation process to succeed. With GPT-3.5 guiding the adaptation, we find that errors quickly accumulate; it can't identify complex ideas correctly.

      3-6 minutes for results takes getting used to, but we've found most people don't complain if it solves a problem that is otherwise impossible without hours of digging, i.e. if you use it on something truly hard. Low latency is more crucial for public search engines like Google (a 0.5s delay -> 20% traffic loss) where there are convenient, fast alternatives.

      Preliminary search is a blend of semantic embeddings over 100M+ papers, keyword search, citation links, etc. It's reasonably accurate, but full of noise for complex queries.
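      For the curious, that kind of blend can be pictured as a weighted combination of scores per paper. A toy sketch (made-up weights, embeddings, and keyword scores, not our actual pipeline):

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy corpus: (title, embedding, keyword_score). In reality the embedding
# would come from a sentence encoder and keyword_score from BM25 or similar.
papers = [
    ("CLAS and reaction time", [0.9, 0.1, 0.2], 2.1),
    ("Protein inhibitors in cancer", [0.1, 0.8, 0.3], 0.0),
    ("Slow-wave sleep enhancement", [0.8, 0.2, 0.1], 1.4),
]
query_emb = [1.0, 0.0, 0.1]

def blended_score(emb, kw, w_sem=0.7, w_kw=0.3):
    # Weights are arbitrary, for illustration only.
    return w_sem * cosine(query_emb, emb) + w_kw * kw

ranked = sorted(papers, key=lambda p: blended_score(p[1], p[2]), reverse=True)
print([title for title, _, _ in ranked])
```

      The blended ranking surfaces candidates for the deeper GPT-4 pass, which is where the real filtering happens.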

  • blacksmith_tb 12 days ago
    Hmm, I assume the pun on "undermined" is intentional, though since that has somewhat negative connotations I am not sure it's entirely a good idea...
  • BioBen42 11 days ago
    I just tested the platform out. I am really impressed by the results of a search for papers on protein inhibitors for the treatment of cancer. I was able to find at least 4 new informative papers that gave good insights into my research.

    I am impressed because I have spent roughly 3 months researching this topic on Science Direct and PubMed, and I did not expect your engine to turn up anything new. To put that in perspective: in less than six minutes, your search engine gave me more relevant results than probably a week of searching Science Direct.

    Great work!

    (oh yeah, I know people are commenting about the interface, but I actually like the clean look of the search results. It is the home page that leaves something to be desired)