Hmm. I have full-size BLOOM running on a server in my basement. It can be run naively with about 400GB of RAM. Using used hardware, you can get that for about $1200. Still, with CPU inference, you're looking at about 5 minutes per response.
With optimization, I have it down to 140GB of RAM. I'm trying to get it under 120GB without losing too much accuracy, so it can be run on standard consumer desktop hardware (whose limit is usually 128GB).
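For a rough sense of what those numbers imply, here is a back-of-the-envelope sketch. It assumes BLOOM has roughly 176 billion parameters and counts only the weights (no activations, cache, or runtime overhead), so real usage will be higher. The function names are made up for illustration, and nothing here reflects the actual optimization used above.

```python
# Back-of-the-envelope weight memory for a ~176B-parameter model.
# Assumption: ~176e9 parameters; only weights are counted, so real
# memory use (activations, caches, runtime overhead) will be higher.
PARAMS = 176e9


def weights_gb(bits_per_param: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return PARAMS * bits_per_param / 8 / 1e9


def bits_for_budget(gb: float) -> float:
    """Average bits per parameter that fit into a given weight budget."""
    return gb * 1e9 * 8 / PARAMS


# Common precisions and the weight memory each would need.
for label, bits in [("fp32", 32), ("fp16/bf16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>10}: {weights_gb(bits):6.0f} GB")

# The two memory targets mentioned in the comment.
for budget in (140, 120):
    print(f"{budget} GB budget -> {bits_for_budget(budget):.2f} bits/param")
```

Under these assumptions, 16-bit weights come to about 352GB, which is in the same ballpark as the 400GB figure once overhead is added. A 140GB budget works out to roughly 6.4 bits per parameter on average, and 120GB to roughly 5.5.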
Given the lack of resources I have found, I figured the general interest was low? Maybe I will open-source it and do a write-up.
I think that would be an incredibly interesting write-up. There are many applications where a 5-min response time would be more than adequate. It could slowly churn through the inbox, while I'm not looking. Or parse customer emails and suggest replies for a rep to potentially use.
Mmmm... and what about copyright? I mean: may I dump all of HN and then consider it a book to be sold for my own profit? And if I can't do it... what is the difference between this idea and using HN to train an LLM? And what if I don't want my comments to be part of this LLM? Or what about the "trash" accounts that don't want to be identified?
Don't get me wrong: the idea could be nice but... ain't it time to think twice about all this before applying the latest technological fad?
I'd argue there isn't an equivalence between an LLM and dumping the data straight into a book; there's a significant layer of abstraction there, from my understanding. It is entirely legal for you to read HN and paraphrase what you learned in a book, which I'd argue is a much fairer comparison.
I've thought about this many times. I regularly check what HN thinks about a specific book, what services HN recommends for a particular task, etc.
To the sibling comment that I asked about doing this locally: there’s really no need for an LLM, much less for GPT-3. All you need is, well, attention. Sentence-transformer embeddings. Perhaps even just fastText.
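A minimal sketch of the retrieval idea: embed every comment, embed the query, rank by cosine similarity. To stay dependency-free, `embed` here is a toy bag-of-words stand-in, not a sentence-transformer or fastText model. In practice you would swap in a real embedding model at that one spot. The sample comments are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a real embedding model: a sparse bag-of-words vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, comments: list[str], k: int = 3) -> list[tuple[float, str]]:
    """Return the k comments most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in comments]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Made-up comments, purely for illustration.
comments = [
    "I use Fastmail for email hosting and have been happy with it.",
    "The Pragmatic Programmer is still the book I recommend to juniors.",
    "For backups I just use restic to a cheap object store.",
]

for score, text in search("which book should a junior programmer read", comments, k=1):
    print(f"{score:.2f}  {text}")
```

Note that the toy version only matches exact words ("junior" does not match "juniors"). Closing that gap is what learned embeddings are for.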
AIUI sentence-transformer embeddings work for sentences or short paragraphs. But many comments only make sense in the context of parent/GP comments. This is especially true when a comment answers an earlier question.
I'm not sure how we'd pack enough context into a single 'sentence', to get a useful embedding for this purpose.
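One option, sketched under assumptions: prepend a bounded number of ancestor comments to each comment before embedding it, and cut from the front if the result gets too long. The `Comment` structure, the `---` separator, and the character budget are all hypothetical choices. Real embedding models cap input length in tokens rather than characters, and whether this actually improves retrieval is untested here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Comment:
    id: int
    text: str
    parent: Optional[int] = None  # id of the parent comment, None at top level


def with_context(
    comment_id: int,
    by_id: dict[int, Comment],
    max_ancestors: int = 2,
    max_chars: int = 1000,
) -> str:
    """Prefix a comment with up to `max_ancestors` ancestors, oldest first."""
    chain = []
    current = by_id[comment_id]
    # Walk up the tree: parent first, then grandparent, and so on.
    while current.parent is not None and len(chain) < max_ancestors:
        current = by_id[current.parent]
        chain.append(current.text)
    parts = list(reversed(chain)) + [by_id[comment_id].text]
    packed = "\n---\n".join(parts)
    # If over budget, keep the tail: the comment itself matters most.
    return packed[-max_chars:]


# Made-up thread, purely for illustration.
thread = [
    Comment(1, "What do you use for offsite backups?"),
    Comment(2, "restic, for about five years now.", parent=1),
    Comment(3, "Same here, never had a restore fail.", parent=2),
]
by_id = {c.id: c for c in thread}

print(with_context(3, by_id))
```

The packed string is what you would hand to the embedding model in place of the bare comment, so "Same here" carries the question it is answering.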