Identifying organic compounds with visible light


36 points | by wglb 399 days ago


  • cookieperson 398 days ago
    This paper is pretty weak. Sorry to bust the AI hype train on this, but collecting the refractive index of compounds at a bunch of wavelengths is completely possible and not revolutionary. I did it at a job, and so have my friends. In some ways RI-based measurements are harder to take than Raman measurements. Classification based on RI is also nothing new....

    Also... There are some pretty serious challenges with using other people's data rather than actual data collected from a single instrument. Yes, RI is a physical thing, but taking those measurements across many wavelengths involves error, sometimes instrument-dependent error.

    I won't bother looking at the code because neither the abstract nor the results particularly interest me as someone trained in this field (notably the author doesn't seem to be). Some part of me does wonder how they ran this experiment without testing on training data. It happens all the time with newcomers. Again, not that the claim is unbelievable, it isn't; it's just a worry given the premises posed in the paper itself. A symptom that this is probably low quality is that it's not published in a journal whose peers are interested in or trained to actually review this. Yes, believe it or not, there are 5 or so journals wayyy better suited for this kind of publication. I almost worry it was rejected elsewhere before landing in a p chem journal because... It doesn't seem very good, new, well written, or useful.

  • bildung 398 days ago
    Funnily enough, using the refractive index to discern compounds is the modus operandi of one of the very first spectrometers, made by William Hyde Wollaston in 1802: (disclaimer: I wrote that).

    Very cool approach, though it will most probably only work for classification (i.e. what is this sample), whereas NIR spectroscopy (most common when working with organic compounds) allows for quantification of the sample analytes (how much of each analyte is in this sample).

    • cookieperson 398 days ago
      Yeah, their goal seems to be classification. I'm completely unimpressed by this paper. I'm a former participant in this field, and I feel bad for the outsiders here thinking this is a breakthrough. Had I reviewed this it wouldn't have been accepted, and I'm a far more liberal reviewer than a lot of people in the field. It's unsurprising to me that this was published in a journal for which the topic is a bad fit...
  • sargun 398 days ago
    This is potentially super powerful for the idea of a pocket spectrophotometer for everyone. This idea has been in many startup pitches over the last 15 years -- the most common consumer use case is identifying allergens. I can see this being useful beyond that too, with things like reagent-free desktop drug testing systems that don't cost $25k.
    • cookieperson 398 days ago
      Not really. You can build a pocket Raman spectrometer for probably 1000 USD max.

      Refractive index based classification across wavelengths is probably not going to help you find allergens. Unless you isolate the allergen from the mixture, then take the measurement, and have it in your model...

  • kwhitefoot 398 days ago
    Sounds interesting but it's got a long way to go before it will be useful to anyone.
  • ginko 398 days ago
    Does the voting step with multiple classifiers make sense? My gut feeling would be that training a single larger network would be more effective.
    • gpcr1949 398 days ago
      They use a random forest classifier, which is an ensemble model that gives a consensus result of several decision trees. One way to achieve this consensus is voting. Random forest models are commonly used in building chemical models like this (and in QSAR), because they are quite robust. Due to the typically small size of chemical data sets (dozens to thousands of samples), more sophisticated methods are not usable and do not perform better.
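
      To make the voting idea concrete, here is a minimal sketch, not the paper's code. It assumes a made-up single-feature data set (one refractive index value per sample, labels "A" and "B") and uses bagged one-threshold stumps as a much simplified stand-in for a real random forest. The names fit_stump, fit_forest, predict and DATA are hypothetical.

```python
import random
from collections import Counter


def fit_stump(samples):
    """Fit a one-threshold rule (a depth-1 decision tree) to (value, label) pairs."""
    fallback = Counter(label for _, label in samples).most_common(1)[0][0]
    # best = (training errors, threshold, label below threshold, label above)
    best = (len(samples) + 1, None, fallback, fallback)
    values = sorted({value for value, _ in samples})
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = Counter(l for v, l in samples if v <= threshold).most_common(1)[0][0]
        right = Counter(l for v, l in samples if v > threshold).most_common(1)[0][0]
        errors = sum((left if v <= threshold else right) != l for v, l in samples)
        if errors < best[0]:
            best = (errors, threshold, left, right)
    _, threshold, left, right = best
    if threshold is None:
        # The bootstrap sample held a single distinct value: always predict its label.
        return lambda value: fallback
    return lambda value: left if value <= threshold else right


def fit_forest(samples, n_trees=25, seed=0):
    """Train each stump on a bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump(rng.choices(samples, k=len(samples))) for _ in range(n_trees)]


def predict(forest, value):
    """Consensus by majority vote: every stump casts one vote."""
    votes = Counter(tree(value) for tree in forest)
    return votes.most_common(1)[0][0]


# Hypothetical data: (refractive index, compound label). Values are invented.
DATA = [
    (1.33, "A"), (1.34, "A"), (1.36, "A"), (1.37, "A"),
    (1.47, "B"), (1.49, "B"), (1.50, "B"), (1.52, "B"),
]

forest = fit_forest(DATA)
print(predict(forest, 1.35), predict(forest, 1.51))
```

      A real random forest also subsamples features at each split and grows deeper trees; the part this sketch keeps is the one asked about, namely that the final answer is a vote over many weak models trained on resampled data.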
      • cookieperson 398 days ago
        Even then, random forest is the wrong choice for this type of data. It should be the thing you do in your first hour of having it, before choosing something more appropriate.
    • cookieperson 398 days ago
      My experience is that there are far simpler, faster, and well-known models to perform this type of classification, and that the premise of this study is flawed in so many ways that it could have absolutely no utility at all if someone attempted to implement it in real life.
  • jplona 398 days ago
    Just from the title, I'm imagining this is more or less
    • cookieperson 398 days ago
      You made my day lol this is what I got from it too.