#topicmodeling
October 4, 2025 at 2:33 AM
Okay Bluesky, I need your help. I'm looking for some recommendations: What are your favourite tools/examples for #TopicModeling, #Stylometry, and #NetworkAnalysis?

Please comment below or DM me. And feel free to repost.

#DigitalHumanities #NLP #NLProc
July 23, 2025 at 11:24 AM
JMIR Formative Res: Improving Suicidal Ideation Detection in Social Media Posts: Topic Modeling and Synthetic Data Augmentation Approach #MentalHealth #SuicidePrevention #SocialMedia #PublicHealth #TopicModeling
Improving Suicidal Ideation Detection in Social Media Posts: Topic Modeling and Synthetic Data Augmentation Approach
Background: In an era dominated by social media conversations, it is pivotal to comprehend how suicide, a critical public health issue, is discussed online. Discussions around suicide often highlight a range of topics, such as mental health challenges, relationship conflicts, and financial distress. However, certain sensitive issues, like those affecting marginalized communities, may be underrepresented in these discussions. This underrepresentation is a critical issue to investigate because it is mainly associated with underserved demographics (eg, racial and sexual minorities), and models trained on such data will underperform on such topics.

Objective: The objective of this study was to bridge the gap between established psychology literature on suicidal ideation and social media data by analyzing the topics discussed online. Additionally, by generating synthetic data, we aimed to ensure that datasets used for training classifiers have high coverage of critical risk factors to address and adequately represent underrepresented or misrepresented topics. This approach enhances both the quality and diversity of the data used for detecting suicidal ideation in social media conversations.

Methods: We first performed unsupervised topic modeling to analyze suicide-related data from social media and identify the most frequently discussed topics within the dataset. Next, we conducted a scoping review of established psychology literature to identify core risk factors associated with suicide. Using these identified risk factors, we then performed guided topic modeling on the social media dataset to evaluate the presence and coverage of these factors. After identifying topic biases and gaps in the dataset, we explored the use of generative large language models to create topic-diverse synthetic data for augmentation. Finally, the synthetic dataset was evaluated for readability, complexity, topic diversity, and utility in training machine learning classifiers compared to real-world datasets.

Results: Our study found that several critical suicide-related topics, particularly those concerning marginalized communities and racism, were significantly underrepresented in the real-world social media data. The introduction of synthetic data, generated using GPT-3.5 Turbo, and the augmented dataset improved topic diversity. The synthetic dataset showed levels of readability and complexity comparable to those of real data. Furthermore, the incorporation of the augmented dataset in fine-tuning classifiers enhanced their ability to detect suicidal ideation, with the F1-score improving from 0.87 to 0.91 on the University of Maryland Reddit Suicidality Dataset test subset and from 0.70 to 0.90 on the synthetic test subset, demonstrating its utility in improving model accuracy for suicidal narrative detection.

Conclusions: Our results demonstrate that synthetic datasets can be useful to obtain an enriched understanding of online suicide discussions as well as build more accurate machine learning models for suicidal narrative detection on social media.
dlvr.it
June 11, 2025 at 4:43 PM
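For readers unfamiliar with the F1-score quoted in the JMIR abstract above, here is a minimal sketch of how such a score can be computed for a binary classifier. It assumes a single positive class and the standard harmonic mean of precision and recall; the abstract does not say how the authors averaged their scores, and the function name `f1_score` and the toy labels are made up purely for illustration, not taken from the paper.

```python
def f1_score(true_labels, predicted_labels, positive=1):
    """Return the F1-score for the `positive` class (binary setting assumed)."""
    pairs = list(zip(true_labels, predicted_labels))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (not data from the study).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]
print(f1_score(y_true, y_pred))
```

Because F1 combines precision and recall, a jump such as the one reported on the synthetic test subset can come from fewer missed cases, fewer false alarms, or both; the score alone does not say which.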
Based on an analysis of #DHd conference abstracts, I trace the evolution of #cls methods from 2014 to 2025: from omnipresent #networkanalysis and #annotation, to the first appearance of #topicmodeling and #sentimentanalysis, to #deeplearning and #generativeai.
May 13, 2025 at 3:46 PM
Based on an analysis of #DHd conference abstracts, I trace the evolution of #CLS methods from 2014 to 2025: from omnipresent #NetworkAnalysis and #Annotation, to the first appearance of #TopicModeling and #SentimentAnalysis, to #DeepLearning and #GenerativeAI.
May 13, 2025 at 3:20 PM
This week's Wednesday webinar (Feb. 26) at Vanderbilt Biostatistics is "Topic Models in Microbiome Analysis," at 1:30 pm CT, by Kris Sankaran @sankaranlab.bsky.social www.vumc.org/biostatistic... #RStats #GISky #GutSky #MedSky #TopicModeling
February 25, 2025 at 8:22 PM
Interested in #topicmodeling for historical scholarly journals? Eike Löhden & I analyzed 50 volumes of "Francia".

How have the focal points developed over the years? What differences emerge between German- and French-language […]

[Original post on fedihum.org]
February 14, 2025 at 9:19 AM
This paper uses topic modeling and bias measurement techniques to analyze and determine gender bias in English song lyrics. Our analysis shows the thematic shift in song lyrics over the years.

#mathsky #compsky #science #topicmodeling
Beats of Bias: Analyzing Lyrics with Topic Modeling and Gender Bias Measurements
arxiv.org
February 7, 2025 at 5:52 PM
Last update in 2024 🚀 for PsychTopics, our #Rstats #ShinyApp that automatically identifies research topics in psychology from DE, AT, CH & LUX

Data source: PSYNDEX database of @zpid.bsky.social
Method: RollingLDA #TopicModeling (aclanthology.org/2022.sdp-1.2)

👉 abitter.shinyapps.io/psychtopics
December 18, 2024 at 10:40 AM
Do you work with text data? Then our ✨topiclabels✨ #Rstats package may come in handy.

Using open #LLM, it automatically assigns a topic label to a bag of words.

It also works with all popular #TopicModeling packages!

👉 cran.r-project.org/package=topi...
👉 github.com/PetersFritz/...
December 10, 2024 at 10:23 AM
Happy to announce that our #Rstats package ✨topiclabels✨ has been updated on #CRAN 🎉

🤖Using open #LLMs, our package automatically assigns a topic label to a bag of words.
🤝It works with all popular #TopicModeling packages!

Find out more:
👉https://github.com/PetersFritz/topiclabels
October 30, 2024 at 2:15 PM
What is LDA and Why It’s Not Just for Text

Latent Dirichlet Allocation (LDA). Do you think about its roots in text analysis? #dataclustering #graphdata #imagedataanalysis #LDA #LDAapplications #LDAforimages #LDAinmusic #musicdata #nontextdata #topicmodeling
aicompetence.org/lda-beyond-t...
LDA Beyond Text: Applications In Image, Music, And Graph Data
LDA goes beyond text analysis, uncovering patterns in image, music, and graph data, driving innovative insights across diverse data types.
aicompetence.org
October 11, 2024 at 7:58 PM
📢 🧑‍🏫 The DigitalHistory course offerings (@humboldtuni.bsky.social) again feature a rich programme in the 2024 summer semester (SoSe24):
From DataLiteracy & Python through introductions to SentimentAnalysis & TopicModeling to #auxHist and libraries in relation to #digiGW

➡️ hu.berlin/DigHisLehreS...
February 22, 2024 at 8:32 AM
🚨Exciting research alert! 🚨 Explore the world of topic modeling in communication science with our latest paper. Discover how different validation methods impact model selection and its consequences for theory development. 📚 commsky #TopicModeling www.aup-online.com/content/jour...
Topic Model Validation Methods and their Impact on Model Selection and Evaluation | Amsterdam Univer...
Topic Modeling is currently one of the most widely employed unsupervised text-as-data techniques in the field of communication science. While researchers increasingly recognize the importance of valid...
www.aup-online.com
October 23, 2023 at 1:36 PM
#CUDAN #CulturalAnalytics research fellow option 2: #PalaeontologyOfMemes > Use #TopicModeling or bi-partite #NetSci or #TDA to analyze structure & evolution of large unstructured corpora or classifications > Please spread and/or apply: http://cudan.tlu.ee/positions/ 5/12
February 13, 2020 at 6:29 AM
Thrilled to have the chance to talk about my #topicmodeling #research at the coming #python and friends conference @PyGrunn !

And a bit intimidated though: based on the website, I think I'm the only #female speaker 😬

#phdlife #genderbalance #programming #womenintech
June 13, 2025 at 1:38 PM
How can topics be understood? @u_henny explains using a Spanish word cloud based on a corpus of Spanish American novels. #TopicModeling #dhd2019
December 10, 2024 at 3:05 PM
Definition of #TopicModeling by @JanHorstmannn: a method based on probability theory for exploring larger text collections. It makes it possible to explore text collections thematically. https://fortext.net/routinen/methoden/topic-modeling #dhd2019
December 10, 2024 at 3:05 PM
.#BSwallow & #EBayer @ #MDurrett et al @ #dayofdh18cc: #topicmodeling of Latin texts = took wks. So for comps / to fill time, they built AWESOME #FOSS app to visualize #topicmodels. http://cs.carleton.edu/cs_comps/1718/latin/final-results/the-app.html #loquela CC @dighall @nolauren...
December 6, 2024 at 9:12 AM