Eun Cheol Choi
@euncheolchoi.bsky.social
PhD student at USC Annenberg; LLMs / CSS / SNA; @ HUMANS Lab (http://www.emilio.ferrara.name/)
Pinned
Eun Cheol Choi
@euncheolchoi.bsky.social
· Jun 26
𝑾𝒊𝒍𝒍 𝑨𝑰 𝒕𝒂𝒌𝒆 𝒎𝒚 𝒋𝒐𝒃?
It’s a question that keeps many of us up at night—and for good reason.
Our new research maps labor market vulnerability in the age of AI with 100K job postings and 50K AI-related patents. (1/5)
June 26, 2025 at 4:30 PM
Keeping Track of AI Use Cases in the Newsroom generative-ai-newsroom.com/keeping-trac...
Keeping Track of AI Use Cases in the Newsroom
A practical guide to how journalists are using generative AI and how to keep up
generative-ai-newsroom.com
June 15, 2025 at 7:29 PM
Reposted by Eun Cheol Choi
"Limited effectiveness of LLM-based data augmentation for COVID-19 misinformation stance detection" by @euncheolchoi.bsky.social @emilioferrara.bsky.social et al, presented by the awesome Chur at The Web Conference 2025
arxiv.org/abs/2503.02328
May 1, 2025 at 5:06 AM
Reposted by Eun Cheol Choi
Hot takes:
- The benefits of easily accessible social media data usually outweigh the potential harms.
- Some uses of (public) social media data are unethical/should be illegal, and we should target that.
- We're better off having clear boundaries between public and private online spaces.
FYI, here's the entire code to create a dataset of every single bsky message in real time:
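```
from atproto import FirehoseSubscribeReposClient, parse_subscribe_repos_message

# Print each firehose message's header together with its parsed payload.
def f(m): print(m.header, parse_subscribe_repos_message(m))

FirehoseSubscribeReposClient().start(f)
```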
November 28, 2024 at 6:38 PM
Reposted by Eun Cheol Choi
An employee of Huggingface, a site of AI training datasets, made a dataset of a million Bluesky posts scraped simply because they could. It’s currently trending: www.404media.co/someone-made...
Someone Made a Dataset of One Million Bluesky Posts for 'Machine Learning Research'
A Hugging Face employee made a huge dataset of Bluesky posts, and it’s already very popular.
www.404media.co
November 27, 2024 at 12:09 AM
Reposted by Eun Cheol Choi
Bluesky's firehose is a treasure trove of public data for researchers and developers, and it's completely free. Check out our developer docs: docs.bsky.app
November 23, 2024 at 5:54 AM
Reposted by Eun Cheol Choi
Just realized BlueSky allows sharing valuable stuff cause it doesn't punish links. 🤩
Let's start with "What are embeddings" by @vickiboykis.com
The book is a great summary of embeddings, from history to modern approaches.
The best part: it's free.
Link: vickiboykis.com/what_are_emb...
November 22, 2024 at 11:13 AM
Reposted by Eun Cheol Choi
New paper on opportunities and risks of LLMs in survey research: papers.ssrn.com/sol3/papers.... "Backed by both practical examples & academic literature, we identify areas for research and development, distinguishing between challenges related to survey methods & the tools used to deploy surveys"
Opportunities and risks of LLMs in survey research
Recent advances in the development of large language models (LLMs) bring both disruptive opportunities and underlying risks to survey research. LLMs' capabiliti
papers.ssrn.com
November 22, 2024 at 3:07 PM
Reposted by Eun Cheol Choi
Interested in RLHF, DPO, LLM alignment?
I've just created this list featuring awesome people like @natolambert.bsky.social .
The list is the opposite of exhaustive; I've just joined some days ago 😅
go.bsky.app/MqRGAf2
November 21, 2024 at 1:26 PM
Reposted by Eun Cheol Choi
If you're keen to learn content verification for fact-checking and open source investigations, this is a step-by-step guide on how to verify images and videos that I posted a while ago.
Once you familiarise yourself with reverse search, you'll become much better at spotting online misinformation.
THREAD: How to verify online images
Social media is awash with false or misleading images, some of which get millions of engagements.
So here's a simple guide on ways you can quickly check the veracity of images you see on your social media feeds with major elections coming up this year.
November 14, 2024 at 12:56 AM
Reposted by Eun Cheol Choi
More #data2thepeople! #election2024
We just released a Twitter/X dataset of 22m+ tweets about the upcoming election! (Data will be routinely updated, keep an eye and spread the word!)
Paper arxiv.org/abs/2411.00376
Data github.com/sinking8/usc...
A Public Dataset Tracking Social Media Discourse about the 2024 U.S. Presidential Election on Twitter/X
In this paper, we introduce the first release of a large-scale dataset capturing discourse on $\mathbb{X}$ (a.k.a., Twitter) related to the upcoming 2024 U.S. Presidential Election. Our dataset compri...
arxiv.org
November 12, 2024 at 11:31 PM
Reposted by Eun Cheol Choi
#Data2thePeople! Pls reshare!
Do you want a billion telegram posts?!
You get data! You get data! You get data!
Paper arxiv.org/abs/2410.23638
Data github.com/leonardo-bla...
#election2024
Unearthing a Billion Telegram Posts about the 2024 U.S. Presidential Election: Development of a Public Dataset
With its lenient moderation policies and long-standing associations with potentially unlawful activities, Telegram has become an incubator for problematic content, frequently featuring conspiratorial,...
arxiv.org
November 12, 2024 at 11:35 PM
Reposted by Eun Cheol Choi
Sharing my first Computational Social Science starter pack! Will grow with time, feel free to nominate and self nominate!
go.bsky.app/CYmRvcK
November 13, 2024 at 2:05 AM
Reposted by Eun Cheol Choi
Here is a @github.com repo where I will share tutorial notebooks on how to retrieve and analyze data from @bsky.app @atproto.com
Feedback, suggestions, and contributions are welcome!
github.com/brianckeegan...
GitHub - brianckeegan/bluesky-datascience: Exploratory notebooks for data science using Bluesky data
Exploratory notebooks for data science using Bluesky data - brianckeegan/bluesky-datascience
github.com
November 14, 2024 at 5:05 AM
Reposted by Eun Cheol Choi
Ready for another Computational Social Science Starter Pack?
Here is number 2! More amazing folks to follow! Many students and the next gen represented!
go.bsky.app/GoEyD7d
November 14, 2024 at 11:42 PM