Lightnews — Scholar-powered news

Reposted by Manish Pandey

Maarten van Smeden

@maartenvsmeden.bsky.social

NEW PREPRINT

A detailed overview of 32 popular predictive performance metrics for prediction models

arxiv.org/abs/2412.10288

December 16, 2024 at 8:44 AM

Reposted by Manish Pandey

Andrew Gordon Wilson

@andrewgwils.bsky.social

Excited for the #NeurIPS2024 workshops today! I'll be speaking at:
(1) Science of DL (panel, 3:10-4:10, scienceofdlworkshop.github.io/schedule/)
(2) "Time Series in the Age of Large Models" (talk, 4:39-5:14, neurips-time-series-workshop.github.io).

Schedule | SciForDL'24

scienceofdlworkshop.github.io

December 15, 2024 at 5:51 PM

Reposted by Manish Pandey

Hamed Shirzad

@hamedshirzad.bsky.social

As a reminder, we will have our poster session tomorrow:

📍 East Exhibit Hall, Poster #3010
📄 arxiv.org/abs/2411.16278
💻 github.com/hamed1375/Sp...
To motivate you further, we have some insights gained from the attention score analysis of this work, which I'll share in this thread:

Hamed Shirzad @hamedshirzad.bsky.social · Dec 5

Graph Transformers (GTs) can handle long-range dependencies and resolve information bottlenecks, but they’re computationally expensive. Our new model, Spexphormer, helps scale them to much larger graphs – check it out at NeurIPS next week, or the preview here!
[1/13]
#NeurIPS2024

December 12, 2024 at 12:29 AM

Reposted by Manish Pandey

IAMJB

@iamjbd.bsky.social

Gemini 2.0 Flash outperforms 1.5 Pro on key benchmarks at 2X speed
Now available in anychat, try it out:

🤗 huggingface.co/spaces/akha...

Anychat - a Hugging Face Space by akhaliq

huggingface.co

December 12, 2024 at 1:01 AM

Reposted by Manish Pandey

Brent Weinberg

@brentweinberg.bsky.social

If you are a radiologist, imager, or otherwise just interested in imaging, I'm working on a starter pack of radiologists and imagers to follow to create a positive imaging community here.

If you want to be included or know other people to include, drop them in the replies

go.bsky.app/Gmkg4yX

November 11, 2024 at 2:35 AM

Reposted by Manish Pandey

Berk Ustun

@berkustun.bsky.social

Couldn't find a machine learning for health starter pack so I made one.

DM/Reply if you want to be added!

go.bsky.app/PJKJ8vK

November 17, 2024 at 6:34 AM

Reposted by Manish Pandey

Christian Wolf

@chriswolfvision.bsky.social

Updated: 6 benchmarks testing spatial and agent reasoning of LLM/VLMs
arxiv.org/abs/2410.06468 does spatial cognition
arxiv.org/abs/2307.06281 MMBench
arxiv.org/abs/2411.13543 BALROG
arxiv.org/abs/2410.07765 GameTraversalBenchmark
3dsrbench.github.io 3DSRBenchmark
open-eqa.github.io Open-EQA

November 26, 2024 at 8:25 AM

Reposted by Manish Pandey

Zachary Lipton

@zacharylipton.bsky.social

Medically adapted foundation models (think Med-*) turn out to be more hot air than hot stuff. Correcting for fatal flaws in evaluation, the current crop are no better on balance than generic foundation models, even on the very tasks for which benefits are claimed.
arxiv.org/abs/2411.04118

Medical Adaptation of Large Language and Vision-Language Models: Are We Making Progress?

Several recent works seek to develop foundation models specifically for medical applications, adapting general-purpose large language models (LLMs) and vision-language models (VLMs) via continued pret...

arxiv.org

November 26, 2024 at 6:12 PM

Reposted by Manish Pandey

Eric Topol

@erictopol.bsky.social

The opportunities, challenges and outlook for LLM-based agents in medicine and healthcare—our paper published today

nature.com/articles/s42...

LLM-based agentic systems in medicine and healthcare - Nature Machine Intelligence

Large language model-based agentic systems can process input information, plan and decide, recall and reflect, interact and collaborate, leverage various tools and act. This opens up a wealth of oppor...

nature.com

December 5, 2024 at 1:29 PM

Reposted by Manish Pandey

Sander Dieleman

@sedielem.bsky.social

In a gratuitous attempt to acquire more followers myself 😁, I've made a start on a "starter pack". Hopefully as more people from 🐦 make it over to 🦋, we can extend this a bit. Suggestions welcome!

I've noticed not all accounts seem to be eligible to be added, anyone know what's up with that? 🤔

November 15, 2024 at 8:04 PM

Reposted by Manish Pandey

Kevin K. Yang 楊凱筌

@kevinkaichuang.bsky.social

Three BioML starter packs now!

Pack 1: go.bsky.app/2VWBcCd
Pack 2: go.bsky.app/Bw84Hmc
Pack 3: go.bsky.app/NAKYUok

DM if you want to be included (or nominate people who should be!)

December 3, 2024 at 3:27 AM

Reposted by Manish Pandey

Jennifer Hu

@jennhu.bsky.social

Stop by our #NeurIPS tutorial on Experimental Design & Analysis for AI Researchers! 📊

neurips.cc/virtual/2024/tutorial/99528

Are you an AI researcher interested in comparing models/methods? Then your conclusions rely on well-designed experiments. We'll cover best practices + case studies. 👇

NeurIPS Tutorial: Experimental Design and Analysis for AI Researchers | NeurIPS 2024

neurips.cc

December 7, 2024 at 6:15 PM

Reposted by Manish Pandey

Justin Chih-Yao Chen

@cyjustinchen.bsky.social

🚨 Reverse Thinking Makes LLMs Stronger Reasoners

We can often reason from a problem to a solution and also in reverse to enhance our overall reasoning. RevThink shows that LLMs can also benefit from reverse thinking 👉 13.53% gains + sample efficiency + strong generalization (on 4 OOD datasets)!

December 2, 2024 at 7:29 PM

Reposted by Manish Pandey

Finbarr

@finbarr.bsky.social

This is one of my all time favorite papers:

openreview.net/forum?id=ByJ...

It shows that, under fair experimental evaluation, lstms do just as well as a bunch of “improvements”

On the State of the Art of Evaluation in Neural Language Models

Show that LSTMs are as good as or better than recent innovations for LM and that model evaluation is often unreliable.

openreview.net

December 7, 2024 at 3:51 PM

Reposted by Manish Pandey

Sylvain.

@sylvainviguier.com

Ahead of NeurIPS next week, we published our Papers of the month, a selection of fresh AI ideas worth knowing about.

graphcore-research.github.io/papers-of-th...

November Papers: An LLM Feast

This month we’ve got an all-LLM menu of papers for you, with summaries of four great works exploring many different aspects of crafting systems for LLM training and inference.

graphcore-research.github.io

December 6, 2024 at 11:59 PM

Reposted by Manish Pandey

Kai-Fu Lee

@kaifulee.bsky.social

I recently had a great interview with @PeterDiamandis. Don't miss it:
www.youtube.com/watch?v=n1BV...

Ex-Google China President on How China Is Shaping the Future of AI w/ Kai-Fu Lee | EP #134

YouTube video by Peter H. Diamandis

www.youtube.com

December 6, 2024 at 11:24 PM

Reposted by Manish Pandey

Dr Amine Korchi

@draminekorchi.bsky.social

« RSNA has been called radiology’s Super Bowl and this year didn’t disappoint. RSNA 2024 showed that radiology is prepared to fully embrace AI – and a future in which humans and machines collaborate to deliver better patient care » @brian-casey.bsky.social

theimagingwire.com/2024/12/04/a...

RSNA Goes All-In on AI - The Imaging Wire

It’s been AI all the time this week at RSNA 2024.

theimagingwire.com

December 6, 2024 at 9:23 PM

Reposted by Manish Pandey

Blake Richards

@tyrellturing.bsky.social

1/ I get the impulse to simplify things for a popular science article, but I don't think "fake and sucks" versus "real and dangerous" is an accurate reflection of the debates in #AI.

The real debates are of course more nuanced, and interesting...

Small 🧵

Casey Newton @caseynewton.bsky.social · Dec 6

One of the most important debates in tech right now is the group of folks who think AI is fake and sucks vs. the people who think AI is real and dangerous. I wrote about why I'm in the latter camp, and talked about my differences with Gary Marcus www.platformer.news/ai-skeptics-...

Most people know these systems are flawed, and adjust their expectations and usage accordingly. The “AI is fake and sucks” crowd is hyper-fixated on the things it can’t do (count the number of r’s in strawberry, figure out that the Onion was joking when it told us to eat rocks) and weirdly uninterested in the things it can.

And that’s a problem, because just as these systems are more honest and helpful than they have ever been, they are also causing greater harm. And to name a real harm, already happening today, I offer the chief security officer of Amazon, CJ Moses, who had this to say about how generative AI is being used in efforts to disrupt critical infrastructure in an interview with the Wall Street Journal last month:

We’re seeing billions of attempts coming our way. On average, we’re seeing 750 million attempts per day. Previously, we’d see about 100 million hits per day, and that number has grown to 750 million over six or seven months.

This is the ongoing blind spot of the “AI is fake and sucks” crowd. This is the problem with telling people over and over again that it’s all a big bubble about to pop. They’re staring at the floor of AI’s current abilities, while each day the actual practitioners are successfully raising the ceiling.

December 6, 2024 at 11:34 PM

Reposted by Manish Pandey

Marzieh Fadaee

@mziizm.bsky.social

🚀 Our mission to strengthen the multilingual open-source ecosystem continues!👇

Sara Hooker @sarahooker.bsky.social · Dec 5

Is MMLU Western-centric? 🤔

As part of a massive cross-institutional collaboration:
🗽Find MMLU is heavily overfit to western culture
🔍 Professional annotation of cultural sensitivity data
🌍 Release improved Global-MMLU 42 languages

📜 Paper: arxiv.org/pdf/2412.03304
📂 Data: hf.co/datasets/Coh...

December 6, 2024 at 11:04 PM

Reposted by Manish Pandey

Nathan Lambert

@natolambert.bsky.social

OpenAI announced a new RL finetuning API. You can do this on open models w the repo we used to train Tulu 3.

Expanding reinforcement learning with verifiable rewards to more domains, with better answer extraction, is on our near-term roadmap.

https://buff.ly/3V4JEIJ

December 6, 2024 at 6:27 PM

Reposted by Manish Pandey

Chris Offner

@chrisoffner3d.bsky.social

Glad to see synthetic data really coming to the fore in 3D Vision recently! Here's MegaSaM showing stunning results from synthetic data: mega-sam.github.io

December 6, 2024 at 6:47 PM

Reposted by Manish Pandey

Stella Li

@stellali.bsky.social

31% of US adults use generative AI for healthcare 🤯But most AI systems answer questions assertively—even when they don’t have the necessary context. Introducing #MediQ a framework that enables LLMs to recognize uncertainty🤔and ask the right questions❓when info is missing: 🧵

December 6, 2024 at 10:51 PM
