ein
@einschtein.bsky.social
2D crypto data-dog; IRL applied AI/ML/NLP in academic medicine; proud 1x winner of the Bloomberg Podcast What Goes Up's Craziest Things in Markets This Week
Reposted by ein
I totally think sbert is a great place to start: good embeddings out of the box without needing to fine-tune. Throw a linear classifier on top (something like sklearn logistic regression to keep it simple) and IMO you could get pretty far on something fairly lightweight and fast.
April 26, 2023 at 10:02 AM
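A rough sketch of the sbert-plus-logistic-regression idea from the post above. Everything named here is an assumption for illustration: the model name "all-MiniLM-L6-v2", the helper names, and the 0/1 label convention (1 = spam) are not from the original post. The embedder is passed in as a function so the classifier part can be tried without downloading any model weights.

```python
from typing import Callable, Sequence

import numpy as np
from sklearn.linear_model import LogisticRegression

# Any function that maps a list of texts to a 2D array of embeddings.
Embedder = Callable[[Sequence[str]], np.ndarray]


def sbert_embedder(model_name: str = "all-MiniLM-L6-v2") -> Embedder:
    """Build an embedder backed by sentence-transformers.

    The model name is an assumed default, not one named in the post.
    The import is lazy so the rest of the sketch runs without the package.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return lambda texts: model.encode(list(texts))


def train_spam_classifier(
    texts: Sequence[str], labels: Sequence[int], embed: Embedder
) -> LogisticRegression:
    """Fit a logistic regression on frozen embeddings (no fine-tuning)."""
    clf = LogisticRegression(max_iter=1000)
    clf.fit(embed(texts), labels)
    return clf


def spam_probability(
    clf: LogisticRegression, texts: Sequence[str], embed: Embedder
) -> np.ndarray:
    """Return the predicted probability of the positive (spam = 1) class."""
    return clf.predict_proba(embed(texts))[:, 1]
```

Usage would look something like `embed = sbert_embedder()`, then `clf = train_spam_classifier(train_texts, train_labels, embed)`, then `spam_probability(clf, new_posts, embed)`. The embeddings stay frozen, so only the small linear layer is trained, which is what keeps this lightweight.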
One of the best parts about chatgpt is its massive context window (easily in the 4000-8000 token range for gpt3.5). Many sbert models top out at a window of 512 tokens, or roughly 300 words on average. Not too bad if you're only classifying a post/reply on its own.
April 26, 2023 at 10:10 AM
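For text longer than a single post, one common workaround for the small window is to split it into overlapping chunks and embed each chunk separately. A minimal sketch, assuming word count is a good-enough stand-in for token count (the 300-word default comes from the rough figure in the post above; the function name and the overlap size are made up for illustration):

```python
from typing import List


def chunk_words(text: str, max_words: int = 300, overlap: int = 30) -> List[str]:
    """Split text into chunks of at most max_words words.

    Consecutive chunks share `overlap` words so a sentence cut at a
    boundary still appears whole in at least one chunk. Word counts only
    approximate token counts; a real tokenizer would be more exact.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in [0, max_words)")

    words = text.split()
    step = max_words - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once this chunk has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks
```

A short post comes back as a single chunk, so the same code path works for both cases.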
Hugging Face may have some decent models too, but they may be overly optimized for certain datasets (e.g. SMS spam)

https://huggingface.co/models?search=spam
April 26, 2023 at 10:07 AM
Def not going to be the best but should be an informative baseline. Starting out you could aim to filter out the most extreme stuff and focus on optimizing for high precision so that only the most obvious spam gets filtered out (e.g. fake crypto scam giveaways).

Assumes that you have training data tho
April 26, 2023 at 10:05 AM
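One way to read "optimize for high precision" in the post above: pick the score cutoff on held-out labeled data so that almost everything flagged really is spam, and accept that some spam slips through. A small sketch under those assumptions (the function name, the 0.95 target, and the 1 = spam label convention are all illustrative, not from the post):

```python
from typing import Optional, Sequence


def pick_threshold(
    scores: Sequence[float],
    labels: Sequence[int],
    target_precision: float = 0.95,
) -> Optional[float]:
    """Return the lowest cutoff whose precision meets the target.

    A post is flagged as spam when its score >= cutoff. Labels use
    1 for spam and 0 for not spam. Returns None if no cutoff works.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    best: Optional[float] = None
    true_pos = false_pos = 0
    i = 0
    while i < len(pairs):
        cutoff = pairs[i][0]
        # Count every example tied at this score before checking precision.
        while i < len(pairs) and pairs[i][0] == cutoff:
            if pairs[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        if true_pos / (true_pos + false_pos) >= target_precision:
            best = cutoff
    return best
```

Lowering the cutoff catches more spam at the cost of more false flags, so the lowest cutoff that still meets the target gives the most coverage for that precision level.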
Some of this volume is attributed to paper mills: groups that sell "ready to publish" research papers to academics who want to inflate their published-article count, a career-advancing metric. Unfortunately, many paper mill papers are falsified or use made-up data
April 25, 2023 at 3:54 AM
So real

What's really upsetting is seeing the book take off and lend them cred

I've seen this in medicine. Curious about what fields you're looking at -- would love to know if this is a universal academic thing
April 15, 2023 at 12:40 AM
Yeah, we are somehow descending / discovering the lowest common denominator version of many ideas / ideologies
April 14, 2023 at 12:59 PM
LA sure but I'm not confident about New England drivers haha
April 12, 2023 at 6:13 PM
👀
April 12, 2023 at 5:21 AM
* farcaster
April 12, 2023 at 5:20 AM