GioCirco
@giocirco.bsky.social
Data scientist, criminologist, lapsed academic. I study health outcomes and gun violence.
Reposted by GioCirco
New blog post, in which I discuss the error of "the difference between stat significant and not is not itself stat significant". It often causes people to post-hoc try to explain things that are easily just due to the standard error of estimates, andrewpwheeler.com/2025/07/28/t...
The difference between models, drive-time vs fatality edition
Easily one of the most common critiques I make when reviewing peer reviewed papers is the concept, the difference between statistically significant and not statistically significant is not itself s…
andrewpwheeler.com
July 28, 2025 at 11:19 AM
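A quick worked sketch of the point in the post above. The estimates and standard errors are made-up numbers, not taken from the linked blog post, and the two estimates are assumed to be independent. One coefficient clears the usual 1.96 cutoff and the other does not, yet the difference between them is well inside its own standard error.

```python
import math


def z_for_difference(b1, se1, b2, se2):
    """z statistic for (b1 - b2), assuming the two estimates are independent."""
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    return (b1 - b2) / se_diff


# Hypothetical estimates (illustration only)
b_a, se_a = 0.25, 0.10  # model A: z = 2.5, "significant"
b_b, se_b = 0.10, 0.10  # model B: z = 1.0, "not significant"

z_a = b_a / se_a
z_b = b_b / se_b
z_diff = z_for_difference(b_a, se_a, b_b, se_b)  # about 1.06

print(f"z_a={z_a:.2f}  z_b={z_b:.2f}  z_diff={z_diff:.2f}")
```

If the two estimates come from the same sample, the independence assumption fails and the standard error of the difference also needs the covariance term.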
Reposted by GioCirco
I recently received a question on Poisson vs OLS models for a dose response relationship, and posted the exchange to my blog. Long story short even with count data OLS models can make sense, it depends on the functional form.

andrewpwheeler.com/2025/05/28/a...
AMA OLS vs Poisson regression
Crazy busy with Crime De-Coder and day job, so this blog has gone by the wayside for a bit. I am doing more python training for crime analysts, most recently in Austin. If you want to get a flavor …
andrewpwheeler.com
June 2, 2025 at 12:24 PM
Reposted by GioCirco
Paper with Scott Jacques on stripping sensitive information from narratives is out, www.qualitativecriminology.com/pub/zhiuy6jg.... Ultimate goal is to make it easier for qual people to share their data for replication. Has links to python code and uses open source models.
A plea for open access to qualitative criminology: With a Python script for anonymizing data and illustrative analysis of error rates
This is the online version of the article. To access a print version with page numbers for citation and reference purposes, select
www.qualitativecriminology.com
May 16, 2025 at 2:35 PM
After a long, long wait (and a transition from academia to private sector), our paper is finally out! www.sciencedirect.com/science/arti...

We used a unique source of data tracking residents of Milwaukee from the 1980s through the 2020s to examine the role of alcohol abuse on gun violence.
Drunk and dangerous? Exploring the tenuous links among drunk driving, alcohol arrests, and firearm violence in an urban context
Recent research and policy discussions have focused on prohibiting individuals with repeat alcohol-related offenses from purchasing or possessing fire…
www.sciencedirect.com
April 10, 2025 at 2:51 PM
Got to love the speed of academic publishing. A paper that I helped collaborate on in 2021 is, just now, getting published. A full 4 years later, and also I am no longer in academia.

But tack on another +1 to the GoogleScholar page I guess!
April 4, 2025 at 8:06 PM
Ah yes, this drowning risk cluster is *checks notes* the entire nation of Iran.
January 29, 2025 at 6:53 PM
Remember the 2024 election? Are you NOT sick of hearing predictions about who will win? Want to revisit 11/4?!

Probably not. But I wrote up some thoughts on my novice approach to building a poll aggregation model during this last cycle. gmcirco.github.io/blog/posts/p...
January 9, 2025 at 8:40 PM
Reposted by GioCirco
Might regret, but genuinely curious:

It's my understanding that most/all "big" LLMs in genAI today are transformer-based. This gives speed/scaling advantages, but has some drawbacks. (Hallucinations, AI "sheen" on images, etc)

Other ML methods (GAN for images, for ex) balance the other way, right?
December 18, 2024 at 5:16 PM
I've been looking at some data about the flow of "crime guns" (seized guns that were used in a crime) across state borders using ATF data. What I find is pretty interesting, although it may not come as a surprise to many:

gmcirco.github.io/blog/posts/c...
December 5, 2024 at 2:32 PM
This article from @nytimes.com is making the rounds. The main premise: Is it still worth it to learn how to code in an increasingly AI-centric world? My personal opinion is "Yes, but..."

www.nytimes.com/2024/11/24/b...
Do Coding Boot Camps Make Sense in an A.I. World? (Gift Article)
Coding boot camps once looked like the golden ticket to an economically secure future. But as that promise fades, what should you do? Keep learning, until further notice.
www.nytimes.com
November 25, 2024 at 5:00 PM
Where do crime guns in your state come from? In many places, NOT the same state the gun was registered in! For example, more than 80% of seized crime guns in New York were bought or registered somewhere else:
August 27, 2024 at 12:34 PM
Bit of a preview of something I've been working on - revisiting Project Green Light Detroit. One thing I always wanted to delve a bit deeper into was the relationship between PGLD and the proactive police presence. Below is a plot showing the pretty stark change post-Green Light.
July 31, 2024 at 9:02 PM
Here's a blog post about the one important skill I *didn't* learn in grad school: SQL! Here I talk a bit about using SQL inside of R, and how to use DuckDB to speed up analysis of large datasets. Looking back, something I wish I had learned a lot earlier.

gmcirco.github.io/blog/posts/d...
July 23, 2024 at 12:54 PM
It's been a while - but a little blog post here about synthetic data simulation with spatial data. I detail a bit of the how and why of generating synthetic crime data using data from Hartford, CT on robberies and gas stations.

gmcirco.github.io/blog/posts/s...
June 15, 2024 at 8:51 PM
Reposted by GioCirco
After that little exchange about missing the Livejournal days, I decided to bring 'em back (but better, 'cause Dreamwidth). So here's a quick train post about steeking my career.

b-e-x.dreamwidth.org/641679.html
b-e-x.dreamwidth.org
April 25, 2024 at 2:35 PM
A slightly drier blog post here. In short, if you use SMOTE to address class imbalance (which, arguably, you should never do), be sure not to evaluate your model metrics on a hold-out set derived from the SMOTE dataset. Your predictions will be WILDLY optimistic.

gmcirco.github.io/blog/posts/t...
March 26, 2024 at 2:29 PM
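A minimal sketch of the safe ordering described in the post above: split first, then oversample only the training portion, so the hold-out set contains nothing but real rows. The data are simulated, and `smote_like` is a toy stand-in (linear interpolation between random pairs of minority points), not the actual SMOTE algorithm or the code from the linked blog post.

```python
import random


def split(rows, test_frac, rng):
    """Shuffle a copy of rows and return (train, test)."""
    rows = rows[:]
    rng.shuffle(rows)
    n_test = int(len(rows) * test_frac)
    return rows[n_test:], rows[:n_test]


def smote_like(minority, n_new, rng):
    """Toy stand-in for SMOTE: interpolate between random pairs of minority points."""
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)
        t = rng.random()
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic


rng = random.Random(0)

# Hypothetical imbalanced data: 200 majority points, 40 minority points
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
minority = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(40)]

# 1. Split BEFORE any resampling (done per class so both splits keep minority rows)
maj_train, maj_test = split(majority, 0.25, rng)
min_train, min_test = split(minority, 0.25, rng)

# 2. Oversample using the training minority rows only
synthetic = smote_like(min_train, len(maj_train) - len(min_train), rng)

train_X = maj_train + min_train + synthetic
train_y = [0] * len(maj_train) + [1] * (len(min_train) + len(synthetic))

# 3. The hold-out set stays untouched: real rows at the real class balance
test_X = maj_test + min_test
test_y = [0] * len(maj_test) + [1] * len(min_test)
```

Metrics computed on `test_X` then reflect the original class balance. Resampling before the split would put synthetic points, built from rows that end up in training, into the hold-out set.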
Reposted by GioCirco
A guide to BlueSky for Scientists
* Common questions
* Links to resources
* An explanation of feeds
* A directory of science feeds

Please share with scientists on BlueSky!

Written by me and @markrubin.bsky.social

🧪 #stats #PsychSciSky #neuroscience
BlueSky for Scientists
BlueSky for Scientists Authors: Steve Haroz and Mark Rubin URL: http://blueskyscience.steveharoz.com Features you may miss from Twitter or Mastodon As BlueSky is in beta, some features are not impleme...
blueskyscience.steveharoz.com
August 18, 2023 at 1:04 PM
Someone recently asked me how to use #rstats to map journeys of individuals across the United States. It's actually pretty easy, but requires a bit of work. Here's a quick blog post that walks through how to do this using the `sf` package in R.
January 16, 2024 at 10:13 PM
Here's me dipping my toes into language models and NLP. In this quickie blog post I write about using pre-built @huggingface.bsky.social sentence transformers to aid searching through medical narratives:

gmcirco.github.io/blog/posts/n...
January 9, 2024 at 8:02 PM
Reposted by GioCirco
How should we operationalize deployment time in police calls for service?
Read all about it in the latest publication from our Evidence-Based Policing #ebp research program at the NSCR.
crimesciencejournal.biomedcentral.com/articles/10....
December 19, 2023 at 5:03 PM
Very excited to announce that Andy Wheeler and I won first place in NIJ's "Innovations in Measuring Community Perceptions" challenge

nij.ojp.gov/funding/inno...
Innovations in Measuring Community Perceptions Challenge | National Institute of Justice
Webinar: NIJ hosted a webinar to discuss this challenge on June 6. Review the transcript and presentation slides.
nij.ojp.gov
December 13, 2023 at 3:30 AM
Reposted by GioCirco
My new favorite toy:

gam(y ~ unit + s(time) + s(time, by = unit))

Effectively "borrows strength" across units to inform predictions, while still allowing unit-specific deviations from the "usual" to come through.

Black lines = s(time)
Dashed = unit + s(time, by = unit)
Colored = combined
December 12, 2023 at 6:50 PM
I'm a data guy and also a coffee nerd. So when James Hoffmann released data from his Great American Coffee Taste Test I had to look at it:

gmcirco.github.io/blog/posts/c...

tl;dr: There are some small gender preferences for a funky coffee, but most of the effect is age or expertise-based!
December 3, 2023 at 4:47 PM
A brief blog post on synthetic controls. Specifically, what we mean when we talk about "micro" approaches. A walkthrough using microsynth and some data I generated on NYPD's 2014 Operation Impact.

gmcirco.github.io/blog/posts/s...
October 24, 2023 at 2:19 PM
Me and (social-less) Andy Wheeler released our approach for NIJ's Innovations in Measuring Community Perceptions Challenge. We propose a mail + web-based push survey to minimize costs, and use regression-based techniques to improve small area estimates. Check it out:

www.crimrxiv.com/pub/p2pxki1g...
October 11, 2023 at 12:56 PM