That's an impressive jump in performance from giving the agent access to relevant literature.
Is there a breakdown of which wins came from hyperparameter values (where BO would likely match this) vs. wins from techniques the agent wouldn’t have tried without the paper?
yes - the blog post has a figure showing all the improvements and how big they were.
also, sometimes the baseline agent tries the same idea but doesn't get as big a boost as the baseline + Paper Lantern agent. We studied it and found that the baseline tries changes in isolation, whereas the research-backed ideas account for the interactions between parameters and suggest multiple changes at the same time, a combination the baseline agent never discovers.
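To make the isolation-vs-interaction point concrete, here is a minimal toy sketch. It is not our actual setup: the `score` function and the `lr_scale` / `batch_scale` names are hypothetical, chosen only so that two settings pay off when moved together and hurt when moved alone.

```python
def score(lr_scale: float, batch_scale: float) -> float:
    """Hypothetical toy objective (higher is better), not a real benchmark.

    The two settings interact: the gain depends on the smaller of the two,
    and any mismatch between them is penalized.
    """
    gain = 0.5 * min(lr_scale, batch_scale)   # reward limited by the smaller setting
    mismatch = abs(lr_scale - batch_scale)    # penalty when the settings drift apart
    return gain - mismatch


# Starting configuration.
base = score(1.0, 1.0)

# One-at-a-time changes, as an agent testing ideas in isolation would try them.
lr_only = score(2.0, 1.0)
batch_only = score(1.0, 2.0)

# The same two changes applied together.
joint = score(2.0, 2.0)

print(f"base={base}, lr_only={lr_only}, batch_only={batch_only}, joint={joint}")
```

In this toy case each single change scores worse than the starting point, so an agent that only tests changes one at a time would reject both, while the joint change beats the baseline.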