Jason Mercer
wetlandscapes.bsky.social

Wetlands and waters from the montane to marine, data science, reproducibility, isotopes, and climate change. Personal account; opinions my own.


Reposted by Jason J. Mercer

Ah, didn't realize pins worked with python, too. Was thinking something a bit more language agnostic, but good to know this works for the two most common languages I use

What are your data science strategies for managing medium to big data in the context of version-controlled code? Ideally in a way that is agnostic to OS and language, is fairly cheap, and can be hosted on-prem or in the cloud. DVC? Symbolic links? Something else? #rstats #python #julialang

For the project I'm currently working on, I'm just gonna go full {pak}. Definitely gonna check out {rix} and {rv} in the future, though

The community in this thread is one of the reasons I really like #rstats. So many genuine and helpful people.
Getting {renv} and {pak} to play nicely together in a docker container is a nightmare #rstats

Oh, so like older than what PPM archives? Yeah, that's rough because then it's just local caching, to the degree that's possible

Do you use the Posit Package Manager (PPM) repo? I find using their precompiled binaries along with caching from BuildKit drastically increases my iteration speed.

Yesh. That might even be part of the issue. No matter what I do, renv and pak generate a conflict during the build phase due to some mismatch in environment variables, library locations, etc. Sad trombone

That's great, because I'm losing my mind trying to figure this out. May have to use it on another project, but good to have it in the toolbox

Getting {renv} and {pak} to play nicely together in a docker container is a nightmare #rstats

This has me cackling. Thank you @hadley.nz
Do you teach #rstats? Do your students complain about how lame and old-fashioned dplyr is? Don't worry: I have the solution for you: github.com/hadley/genzp....

genzplyr is dplyr, but bussin fr fr no cap.
We are looking for #rstats community feedback on 3 new dplyr functions!

We're aiming to expand the `filter()` family:

- `filter()` to keep rows
- `filter_out()` to drop rows
- `when_any()` and `when_all()` as modifiers

Read more and leave feedback here:
github.com/tidyverse/ti...

What about great_tables, which is the Python version of gt: posit-dev.github.io/great-tables.... Full disclosure, never used tinytable, so I may be off base

Since you mentioned conda, curious about your opinions on pixi

I think that's all three in a single axiom

Reposted by Jason J. Mercer

Posit @posit.co · Oct 1
Looking to use Quarto inside Positron, our new IDE for data science? We’ve released a video and guide for getting started.

Positron makes Quarto easy: Quarto is pre-installed, you get dedicated buttons for tasks like "Preview," and full coding support.

Learn more: posit.co/blog/create-...

Are you?

Cline+OpenRouter in Positron is amazing

Reposted by Jason J. Mercer

📢 PSA: The R-Forge server is under attack from hackers and hence web access (e.g., for package installation) is currently down. #rstats

The team at WU Wien is working on it. I'll report here when it is back up again.

Git pull/merge requests intimidate me. What really helped you understand them? And how do you usually do them in a way that helps the person making the request get good feedback while also ensuring the code is in good shape (outside of tests -- code isn't at that stage, yet)? Web? Local?
New #QuartoPub extension! Quarto output styling adds CSS rules for computational output, warnings, and messages for #rstats, #python, #julialang, and #observablejs - use the default styles, minimal styles, or add your own custom CSS andrewheiss.github.io/quarto-outpu...

Where are the communities focused on data science, engineering, analytics, etc in the public utilities sector? Especially water. Trying to find my nerds

You should. You should make a package for that

Say you have some observations and a black box model (e.g., compiled and you can't change anything about it). You can programmatically manipulate the model parameters and compare results to observations. That's it. What tool do you use to tune the parameters? Bonus for any #bayesian tools

Given a focus on scientific computing I prefer #julialang for speed and readability, but I'm Rust-curious. Is there a Rust equivalent to Rcpp for #rstats?

I'm learning all kinds of new tricks. Thanks for the recs!

Cool package! It's a little awkward with the roxygen stuff inside the function, but I def get why they did that. Just means a lot of copy paste if/when converting to a package

Well that's a neat trick. Pretty heavy dependency, but for a dev environment wouldn't really matter