not only does using CC make me want to use it more, it's making it very hard to go back to laboratory work in the new year. Six hours hard labor to disprove yet another hypothesis vs the same amount of time to build and test new ideas entirely from scratch, with ease. why fight the gradient?!
I've got food poisoning, so not much to do other than sit around and build with Claude and happily express my joy through the written word. Give it a shot. Claude Code Hits Different.
i find it very hard to calibrate LLMs when discussing new ideas or research directions. they can be unnecessarily positive or negative. we know LLMs exhibit human cultural bias in their outputs. privacy concerns are hard to quantify
it's not clear cut, so we should do science, learn more
January 9, 2026 at 11:38 AM
i'm still skeptical of AI reviews of grant proposals. there is evidence of some poor outcomes from AI paper review. there also are ways to ease into this and do it carefully. we should (surprise) study this, not vibe through it
The German research foundation now allows AI use for grants evaluation
January 9, 2026 at 11:38 AM
the more i learn about BioML the more i appreciate how ahead of its time AlphaFold2 was. so many concepts that apply across domains (confidence, recycling, triangle update, etc) can be traced back to AlphaFold
January 8, 2026 at 6:12 PM
ya. it's an interesting project. a big one. but they're not always comparing against challenging baselines or standard of care. some of their results aren't explained well, like prostate cancer prediction. overall it's more of an engineering effort
January 8, 2026 at 5:05 AM
looks cool but i've done a sleep study and it was not indicative of a normal night of sleep. closer to an 8hr torture session. a lot of their headline claims don't have to do with sleep. looks like collecting a lot of cardio data explains most of the results not the 'language of sleep'
Just saw this paper today (even though it was out yesterday😱) and it is super interesting and powerful that so much clinical data with predictive power can be extracted from one night of sleep data www.nature.com/articles/s41...
i think a lot of professors forget how physically demanding benchwork is. there's a lot of "o if i wasn't a professor i would apply for this postdoc!" that I really don't believe! that said, there are plenty of examples of people doing benchwork late into their careers, and many are outstanding
January 7, 2026 at 5:20 PM
thanks for sharing, this is cool. can see how it provides higher quality results. for new ideas, this kind of proposer-critic loop is exactly the algorithm we ourselves use. i'm sure it's happening in the handful of AI scientist labs. with coding agents testing ideas in real time, it could work
January 7, 2026 at 12:51 AM
I think we’ll soon see a “double-check” mode that lets you spend tokens for adversarial verification. this could be combined with “no really, do it all yourself” for more walk-away development
These things really need a better UI to let you use the downtime for learning though.
January 6, 2026 at 11:59 PM
Yup that rings true. Good use pattern. It’s also a surprisingly satisfying balance. A bit slow-takeoff. We have these powerful tools to speed things up, but it’s still our responsibility to be tool users and thinkers. Just what we’re best at
January 6, 2026 at 11:44 PM
Can you give a toy example? The problem I’m working on is legit hard. deeply studied and still open in important ways. I’m finding the answers either regress to the same (good, but standard and incomplete) ideas or add complexity without making the logical jumps to really solve the problem. Very LLM
January 6, 2026 at 8:39 PM
to answer your question: none. i've been using factory settings. with me as the reviewer, it's already hard enough to evaluate and verify potential ideas. but it points so hard at an agentic loop with coding and in-loop evaluation. maybe i should just work on this
January 6, 2026 at 7:57 PM
that's a great point! i think you are correct, but i'm very daunted by this. how do you find the right system design? that would require scalable evaluation/verification of new ideas! that's potentially feasible in the CS domain using coding agents! could be a way to develop automated discovery systems
January 6, 2026 at 7:57 PM