Ilya Schurov
ilyaschurov.bsky.social
Mathematics, AI, ML, education and applications. http://ilya.schurov.com/
Hi Jan-Willem, thanks, could you add me?
November 20, 2024 at 3:24 PM
Similarly, I can dump the results of an intermediate step to a file, with a helper function like this:

def dump(filename):
    def wrapper(df):
        df.to_parquet(filename)
        return df
    return wrapper

And so on! If you haven't used .pipe before, give it a try, it's nice!
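
The "and so on" can be read as one general pattern: run some side effect on the intermediate DataFrame, then hand it back unchanged. Here is a minimal sketch of that idea. The name `tap` is hypothetical (it is not part of pandas), and the example writes CSV to a temporary directory only so that it runs without a parquet engine installed.

```python
import tempfile
from pathlib import Path

import pandas as pd


def tap(side_effect):
    """Hypothetical helper: call side_effect(df), then return df unchanged."""
    def wrapper(df):
        side_effect(df)
        return df
    return wrapper


df = pd.DataFrame({"x": [1, 2, 3, 4]})
shapes = []  # collects the shape seen in the middle of the chain

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "intermediate.csv"
    result = (
        df
        .query("x > 1")
        # Save the filtered frame without breaking the chain.
        .pipe(tap(lambda d: d.to_csv(path, index=False)))
        # Record its shape for later inspection.
        .pipe(tap(lambda d: shapes.append(d.shape)))
        .assign(y=lambda d: d["x"] ** 2)
    )
    # Read the dump back while the temporary directory still exists.
    saved = pd.read_csv(path)
```

With a helper like this, the dump above would be roughly `tap(lambda d: d.to_parquet(filename))`, and logging or plotting an intermediate step fits the same shape.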
November 20, 2024 at 11:28 AM
Now I wrote a small helper function

def assrt(condition):
    def wrapper(df):
        assert condition(df)
        return df
    return wrapper

and use this function with a .pipe method:

df.assign(…).query(…).pipe(assrt(lambda _: not _['column'].isna().any())).groupby(…)…
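
For anyone who wants to try it, here is a small self-contained sketch of the chain on toy data. The DataFrame, the column names `group` and `doubled`, and the specific `assign`/`query`/`groupby` steps are made up for illustration; only `assrt` and the `.pipe` call follow the pattern above.

```python
import pandas as pd


def assrt(condition):
    """Return a function for .pipe that asserts condition(df) and passes df on."""
    def wrapper(df):
        assert condition(df)
        return df
    return wrapper


# Toy data (an assumption for this example): one NaN in group "b".
df = pd.DataFrame(
    {"group": ["a", "a", "b", "b"], "column": [1.0, 2.0, None, 4.0]}
)

result = (
    df
    .assign(doubled=lambda d: d["column"] * 2)
    .query("group == 'a'")  # drops the rows that contain the NaN
    .pipe(assrt(lambda d: not d["column"].isna().any()))
    .groupby("group")["doubled"]
    .sum()
)
```

If the `query` step is removed, the NaN reaches the check and the chain stops with an `AssertionError` at exactly that step. One caveat: `assert` statements are skipped when Python runs with `-O`, so a check that must always run could raise an exception explicitly instead.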
November 20, 2024 at 11:28 AM
Assume I want to make sure that at some intermediate step I do not have NaNs in a column, and raise an error otherwise. Previously, I would break the method chain, assign that intermediate result to a variable, add an assert on that variable, and then continue the chain. Not nice.
November 20, 2024 at 11:28 AM