Feedmaker: URL + CSS selectors = RSS feed

(feedmaker.fly.dev)

119 points | by mustaphah 10 hours ago

9 comments

  • mg 2 hours ago
    That is a good idea.

    59 requirements, including Django, seems pretty heavy though?

    For my own RSS feed, I use this 48 line Python file with no dependencies outside the standard library:

    https://github.com/no-gravity/atomfeed.py

    It takes an array with the entries as input, not a web page. But I guess the HTML parsing should take no more than another few lines? For HTML parsing, I have good experiences with the lxml module which is in the Debian repos. It is fast and works pretty well.
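
A minimal sketch of the "array of entries in, Atom feed out" approach described above, using only the Python standard library. This is not the code from atomfeed.py; the function name `build_atom_feed`, the helper `_sub`, and the entry dict keys (`id`, `title`, `link`, `updated`) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"


def _sub(parent, tag, text=None, **attrs):
    """Append a namespaced Atom child element and return it (hypothetical helper)."""
    child = ET.SubElement(parent, f"{{{ATOM_NS}}}{tag}", attrs)
    child.text = text
    return child


def build_atom_feed(feed_id, title, author, entries):
    """Serialize a list of entry dicts to an Atom feed string.

    Each entry needs 'id', 'title', 'link' and 'updated' (a timezone-aware
    datetime). These key names are assumptions of this sketch.
    """
    # Emit Atom as the default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    _sub(feed, "id", feed_id)
    _sub(feed, "title", title)
    _sub(_sub(feed, "author"), "name", author)
    # Atom wants a feed-level <updated>; use the newest entry's timestamp.
    newest = max((e["updated"] for e in entries),
                 default=datetime.now(timezone.utc))
    _sub(feed, "updated", newest.isoformat())
    for e in entries:
        entry = _sub(feed, "entry")
        _sub(entry, "id", e["id"])
        _sub(entry, "title", e["title"])
        _sub(entry, "updated", e["updated"].isoformat())
        _sub(entry, "link", href=e["link"])
    return ET.tostring(feed, encoding="unicode")
```

Turning a web page into the `entries` list is a separate step; an HTML parser such as lxml, as suggested above, would sit in front of this function.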

  • kschaul 9 hours ago
    Glad you’re finding the tool interesting! A short blog post behind it: https://kschaul.com/post/2023/04/16/feedmaker-quickly-genera...

    And the GitHub url (hopefully easy to host your own instance): https://github.com/kevinschaul/feedmaker

    • mustaphah 8 hours ago
      Looks like you're hosting this on fly.io - PAYG model. You could probably host this for free on Cloudflare Workers; 100k requests/day on the free tier; static content (the homepage) is free & unlimited.

      Edit: The catch is the 10ms CPU cap per request - you'd need a super lean implementation. Django's too heavy for that.

  • mustaphah 10 hours ago
    The good news: made it to the front page.

    The bad news: so did the 503 page.

    • benbristow 9 hours ago
      In some ways a good thing, no? Shows you've got work to do on optimisation for large audiences. A free stress test (unless you're on a host that charges per hit or bandwidth excess), if you will.

      It did load eventually for me. I thought it was broken because there were no styles, but it looks like that's intentional.

      • uyzstvqs 9 hours ago
        Seems to be hosted using fly.io
  • bradbeattie 9 hours ago
    https://github.com/RSS-Bridge/rss-bridge is what I've been using for the same purpose.
  • ZYbCRq22HbJ2y7 2 hours ago
    Should be able to achieve this without selectors with HTML to Markdownish (something like Firefox's Reader mode).
  • int0x29 9 hours ago
    I made a CGI program that ran CSS selectors against URLs and returned the output. I debated making it public and then realized I probably didn't want to run an open proxy. I'm curious how long this will last.
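
One way to narrow the open-proxy risk raised above is to refuse target URLs that are not plain HTTP(S) or that resolve to non-public addresses. A rough sketch using only the Python standard library follows; the function name `is_public_http_url` and its `resolve` parameter are hypothetical, and nothing in this thread says Feedmaker does this. It does not cover redirects or DNS rebinding, so treat it as a starting point rather than a complete defence.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_http_url(url, resolve=socket.getaddrinfo):
    """Return True only for http(s) URLs whose host resolves to public IPs.

    `resolve` is injectable so the check can be exercised without network
    access (an assumption of this sketch, not an established API).
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = resolve(parts.hostname, parts.port or 80,
                        proto=socket.IPPROTO_TCP)
    except OSError:
        return False
    for info in infos:
        # info[4][0] is the address string; drop any IPv6 zone suffix.
        addr = ipaddress.ip_address(info[4][0].split("%")[0])
        if not addr.is_global:
            # Loopback, private, link-local and similar ranges are rejected.
            return False
    return bool(infos)
```

A fetcher would call this before requesting the page and return an error for anything that fails the check.
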
  • crazygringo 8 hours ago
    I love this.

    Has anyone tested to see if it works with Blogtrottr which will email you whenever there's a new item in an RSS feed?

    Just since this doesn't seem like it even includes a date field in the RSS? And of course no guid. So I'm wondering how compatible it winds up being.

    • kevincox 8 hours ago
      Dates shouldn't matter. The feed has ID elements, which are what identify entries. Atom has no guid element. So I would expect this to work with any reader.
      • edoceo 3 hours ago
        I wish they had concrete, accurate id and created_at. IIRC these attributes are fixed in AT.
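
Since Atom readers key entries on the id element, a scraped feed needs ids that stay the same across fetches. A small sketch of one way to get that, assuming the item's link is a stable key (an assumption; how Feedmaker derives its ids is not stated in this thread). The names `entry_id` and `unseen_entries` are made up for illustration.

```python
import uuid


def entry_id(link):
    """Derive a deterministic Atom id (a urn:uuid URI) from an item's link."""
    return uuid.uuid5(uuid.NAMESPACE_URL, link).urn


def unseen_entries(entries, seen_ids):
    """Return entries whose id is not in seen_ids yet, recording them as seen.

    This mimics how a reader or notifier could decide which items are new
    without relying on dates.
    """
    fresh = []
    for e in entries:
        eid = entry_id(e["link"])
        if eid not in seen_ids:
            seen_ids.add(eid)
            fresh.append(e)
    return fresh
```

Because the id depends only on the link, re-scraping the same page yields the same ids, and only newly appearing links are reported as new.
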
  • zekenie 9 hours ago
    Not the same, but this gives me an idea… what if there were a map-reduce for DOMs as a web primitive? Imagine if I could make a DOM (or feed) that was some selection and transformation of another DOM.