Show HN: UpTrain – Open-source ML observability and refinement tool

(github.com)

88 points | by sourabh0394agr 548 days ago

10 comments

rcshubhadeep 548 days ago
I like your tool and have starred it. I wish you showed more examples, such as sentiment analysis, NER, POS tagging etc. I am going to follow the project, and if I see interesting examples I can use it in my work.
Also, do you support Hugging Face models? If so, how?
- sourabh0394agr 548 days ago
  Thanks for starring! We are already working on an example to fine-tune an LLM (using BERT from Hugging Face) by doing sentiment analysis on the model outputs (filtering the cases where user feedback has a negative sentiment and fine-tuning the model to improve its outputs in such cases). Also, thanks for suggesting NER and POS tagging as use-cases. We will work to add those in a couple of weeks to illustrate more use-cases for UpTrain.
villgax 548 days ago
There are 2 new commenting accounts, conveniently created 3 hrs ago, with Indian names. Dunno if it is a coincidence, given the constructed questions & answers...
- sourabh0394agr 548 days ago
  Really sorry about these comments. This is not something we intended. We did share our post with a few family members and friends, but we didn't realise it would turn out this way. We are building our project very passionately and want genuine feedback from the HN community. We don't support any of this and will be highly cautious from now on.
ironManRocks 548 days ago
What type of Machine Learning models do you support?
- sourabh0394agr 548 days ago
  Our tool supports a wide variety of ML models (both deep-learning-based and classical ones, except for video CNNs). Below are some sample use-cases:
  1. LLMs: UpTrain tracks unseen prompts, logs model performance, and detects problematic prompts by analysing user behaviour.
  2. Recommendation Systems: Use UpTrain to monitor popularity bias, recommendation quality across user groups etc.
  3. Prediction Systems: Use UpTrain to monitor feature drift and the effectiveness of your predictions.
  4. Computer Vision: Use UpTrain to measure drift in the properties of your input images (brightness, intensity, temperature etc.) and in your model outputs.
hsuyash 548 days ago
What do you mean by problematic data points, and how do you identify them?
- sourabh0394agr 548 days ago
  Great question - problematic data points are essentially the cases where your model is not performing well.
  We have three ways to find them:
  1. Statistical tools: We perform clustering on your training dataset and identify cases in production which are far away from all the training clusters. The idea is that if a given data point is out-of-distribution, the model may not perform well on it and may require retraining.
  2. User Feedback: Based on user behaviour, we infer the ground truth (GT). For example, in the case of recommendation systems, GT = whether the user likes the video. In the case of ChatGPT, GT = 0 if we see the user asking the same question in multiple ways. We use such signals to identify cases where the user is not satisfied with the model output.
  3. Rule-based Signals: Data scientists and ML engineers often have a good idea of where their models are not performing well. These insights can come from analysing user feedback or from manually testing their models. We allow them to define rule-based signals to filter out any interesting cases on which they would like to test or retrain their models.
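A minimal sketch of the clustering idea in point 1, not UpTrain's actual implementation. It assumes a small hand-rolled k-means and a distance threshold derived from the training data; the function names and the `slack` parameter are hypothetical and chosen only for illustration.

```python
import math

def nearest_distance(point, centroids):
    # Distance from a point to its closest cluster centre.
    return min(math.dist(point, c) for c in centroids)

def kmeans(points, k, iterations=20):
    # Plain Lloyd's algorithm with evenly spaced initial picks (assumption:
    # good enough for a sketch, not a robust initialisation).
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids

def fit_detector(train_points, k, slack=1.5):
    # Threshold = largest training distance to a centre, times a slack factor.
    centroids = kmeans(train_points, k)
    radius = max(nearest_distance(p, centroids) for p in train_points)
    return centroids, radius * slack

def is_out_of_distribution(point, centroids, threshold):
    # A production point far from every training cluster is flagged.
    return nearest_distance(point, centroids) > threshold

train = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centroids, threshold = fit_detector(train, k=2)
flagged = [p for p in [(0.6, 0.4), (5, 5)] if is_out_of_distribution(p, centroids, threshold)]
```

Points flagged this way would be candidates for inspection or retraining; how the threshold is chosen is a design decision this sketch does not settle.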
  - thelastbender12 548 days ago
    Online detection of problematic inputs seems plenty interesting! I am curious, does your framework run the detection logic in process, or as a daemon that the library ships data over to?
    - sourabh0394agr 548 days ago
      The former, i.e. it runs the detection logic in the background on the same machine where the model predictions happen. Currently we support running simple clustering algos, but we are working to enable running simple neural nets as part of the observability loop.
milind4 548 days ago
What kind of statistical tools do you support for measuring data drift?
- sourabh0394agr 548 days ago
  We support multiple statistical measures such as Earth mover's distance, KL divergence, Jensen-Shannon distance etc. and are continuously adding more. Each of these measures works well for different types of ML models. You can read more about it here: https://uptrain-dev.netlify.app/blog/5-great-statistical-met...
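For readers unfamiliar with the three measures named above, here is a minimal standard-library sketch of how they can be computed on two histograms that share the same bins. It is not UpTrain's code: the function names and sample counts are made up for illustration, and the Earth mover's distance shown is the 1-D case with unit bin width. SciPy provides library versions (`scipy.stats.wasserstein_distance`, `scipy.stats.entropy`, `scipy.spatial.distance.jensenshannon`).

```python
import math

def normalize(counts):
    # Turn raw bin counts into a probability distribution.
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) in nats; bins where p is zero contribute nothing,
    # and q is floored at eps to avoid division by zero.
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon_distance(p, q):
    # Square root of the Jensen-Shannon divergence (symmetric, bounded).
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    js_divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(js_divergence, 0.0))

def earth_movers_distance(p, q):
    # 1-D Earth mover's distance: sum of absolute differences of the CDFs.
    cumulative, total = 0.0, 0.0
    for pi, qi in zip(p, q):
        cumulative += pi - qi
        total += abs(cumulative)
    return total

# Hypothetical feature histograms: training (reference) vs. production.
reference = normalize([10, 20, 40, 20, 10])
production = normalize([5, 10, 20, 40, 25])

drift = {
    "kl": kl_divergence(production, reference),
    "js": jensen_shannon_distance(production, reference),
    "emd": earth_movers_distance(production, reference),
}
```

All three are zero when the two histograms are identical and grow as production data moves away from the reference; what counts as "too much" drift depends on the feature and the model.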
varunpantp 548 days ago
Nice toolkit! Curious - how does this differ from existing solutions out there which are used for the same intent?
- sourabh0394agr 548 days ago
  Great question - a couple of differences:
  1. We are open-source & self-hosted, while most of the existing solutions are closed SaaS tools, and many of them ask you to send your data to their servers for analysis.
  2. We focus a lot on customisation. We allow ML engineers to define custom metrics to monitor (say you are doing human pose estimation, we allow measuring drift on individual key points as well as complex metrics such as body length, torso ratio etc.). In our experience as ML practitioners, these custom metrics are much more insightful than just out-of-the-box statistical measures.
  3. We are completing the whole refinement loop. With each check, we define a rule to capture "interesting" or "problematic" data-points (essentially cases where model performance is down) and integrate seamlessly with your existing ML workflows to automatically retrain the model when required
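To make points 2 and 3 concrete, here is a rough sketch of a custom metric paired with a rule that captures data points for retraining. Everything in it is an assumption for illustration: the key-point names, the definition of "torso ratio", the acceptable range, and the function names are hypothetical and are not taken from UpTrain's API.

```python
import math

def torso_ratio(keypoints):
    # Hypothetical custom metric: torso length divided by overall body length.
    torso = math.dist(keypoints["neck"], keypoints["hip"])
    body = math.dist(keypoints["head"], keypoints["ankle"])
    return torso / body

def collect_for_retraining(predictions, metric, rule):
    # Keep the predictions whose metric value satisfies the rule.
    return [p for p in predictions if rule(metric(p))]

# Two made-up pose predictions as (x, y) key points.
plausible = {"head": (0, 0), "neck": (0, 1), "hip": (0, 4), "ankle": (0, 8)}
suspicious = {"head": (0, 0), "neck": (0, 1), "hip": (0, 7), "ankle": (0, 8)}

# Rule-based signal (assumed range): flag poses with an implausible torso ratio.
def implausible(ratio):
    return not 0.25 <= ratio <= 0.5

to_review = collect_for_retraining([plausible, suspicious], torso_ratio, implausible)
```

The captured cases could then be fed into whatever retraining workflow is already in place; that integration step is outside the scope of this sketch.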