Identifying organic compounds with visible light

(phys.org)

36 points | by wglb 1114 days ago

8 comments

cookieperson 1113 days ago
This paper is pretty weak. Sorry to bust the AI hype train on this, but collecting the refractive index of compounds at a bunch of wavelengths is completely possible, and not revolutionary. I did it at a job, and so have my friends. In some ways RI-based measurements are harder to take than Raman measurements. Classification based on RI is also nothing new...
Also, there are some pretty serious challenges with using other people's data rather than data collected from a single instrument. Yes, RI is a physical thing, but taking those measurements across many wavelengths involves error, sometimes instrument-dependent error.
I won't bother looking at the code because neither the abstract nor the results particularly interest me as someone trained in this field (notably, the author doesn't seem to be). Some part of me does wonder how they ran this experiment without testing on training data. It happens all the time with newcomers. Again, it's not that the claim is unbelievable; it's just a worry given the premises posed in the paper itself. A symptom that this is probably low quality is that it's not published in a journal whose peers are interested in, or trained to actually review, this kind of work. Yes, believe it or not, there are five or so journals way better suited for this kind of publication. I almost worry it was rejected elsewhere before landing in a p chem journal, because it doesn't seem very good, new, well written, or useful.
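The worry above about testing on training data, combined with the point about instrument-dependent error, can be illustrated with a group-aware hold-out split. This is a minimal sketch and not the paper's method: `group_split`, the `source` key, and the sample records are hypothetical names made up for illustration. The idea is that every measurement from one source (an instrument or a literature dataset, say) lands entirely in either the training set or the test set.

```python
import random


def group_split(samples, group_of, test_fraction=0.25, seed=0):
    """Split samples so that no group appears in both train and test.

    samples: list of records (hypothetical structure).
    group_of: function returning the group key of a record, e.g. its source.
    """
    groups = sorted({group_of(s) for s in samples})
    rng = random.Random(seed)
    rng.shuffle(groups)
    # Hold out at least one whole group.
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [s for s in samples if group_of(s) not in test_groups]
    test = [s for s in samples if group_of(s) in test_groups]
    return train, test


# Made-up records: two measurements from each of four hypothetical sources.
samples = [{"source": src, "id": i} for src in "ABCD" for i in range(2)]
train, test = group_split(samples, lambda s: s["source"])
```

Whether the right grouping key is the instrument, the source dataset, or the compound depends on the claim being tested; the sketch only shows the mechanics of keeping groups from leaking across the split.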
bildung 1113 days ago
Funnily enough, using the refractive index to discern compounds is the modus operandi of one of the very first spectrometers, made by William Hyde Wollaston in 1802: https://www.en.silicann.com/blog/post/history-of-spectroscop... (disclaimer: I wrote that).
Very cool approach, though it will most probably only work for classification (i.e. what is this sample), whereas NIR spectroscopy (most common when working with organic compounds) allows for quantification of the sample analytes (how much of each analyte is in this sample).
- cookieperson 1113 days ago
  Yeah, their goal seems to be classification. I'm completely unimpressed by this paper. I'm a former participant in this field, and I feel bad for the outsiders here thinking this is a breakthrough. Had I reviewed this, it wouldn't have been accepted, and I'm a far more liberal reviewer than a lot of people in the field. It's unsurprising to me that this was published in a journal for which the topic is a bad fit...
sargun 1113 days ago
This is potentially super powerful for the idea of a pocket spectrophotometer for everyone. This idea has been in many startup pitches over the last 15 years -- the most common consumer use case is identifying allergens. I can see this being useful beyond that too, with things like reagent-free desktop drug testing systems that don't cost $25k.
- cookieperson 1113 days ago
  Not really. You can build a pocket Raman spectrometer for probably 1000 USD max.
  Refractive-index-based classification across wavelengths is probably not going to help you find allergens, unless you isolate the allergen from the mixture, then take the measurement, and already have that allergen in your model...
kwhitefoot 1113 days ago
Sounds interesting, but it's got a long way to go before it will be useful to anyone.
ginko 1113 days ago
Does the voting step with multiple classifiers make sense? My gut feeling would be that training a single larger network would be more effective.
- gpcr1949 1112 days ago
  They use a random forest classifier, which is an ensemble model that gives a consensus result of several decision trees. One way to achieve this consensus is voting. Random forest models are commonly used in building chemical models like this (and in QSAR) because they are quite robust. Because chemical data sets are typically small (dozens to thousands of samples), more sophisticated methods are often not usable or do not perform better.
  - cookieperson 1112 days ago
    Even then, random forest is the wrong choice for this type of data. It should be the thing you try in your first hour of having the data, before choosing something more appropriate.
- cookieperson 1113 days ago
  My experience is that there are far simpler, faster, and well-known models for this type of classification, and that the premise of this study is flawed in so many ways that it could have absolutely no utility if someone attempted to implement it in real life.
jplona 1113 days ago
Just from the title, I'm imagining this is more or less https://images.app.goo.gl/jGsA1QGurgprJbw78
- cookieperson 1112 days ago
  You made my day, lol. This is what I got from it too.
Nullors 1113 days ago
[flagged]