Open
Labels: API Design, Groupby, Needs Discussion, Reduction Operations
Description
Say I want to do a `groupby` and perform various aggregations, e.g. I want to find the `mean` and `std` of `b`. Easy:
import pandas as pd
df = pd.DataFrame({'a': [1,1,2], 'b': [4,5,6]})
df.groupby('a').agg({'b': ['mean', 'std']})
What if I want to do the same with `ddof=0`? If I were computing a single aggregation, I could do:
print(df.groupby('a')['b'].std(ddof=0))
and that uses the Cythonized path.
However, I think the current pandas API doesn't offer a way of passing `ddof` to `'std'` when it is used in `.agg`. The workaround often suggested on StackOverflow is:
import numpy as np  # np.std defaults to ddof=0
print(df.groupby('a').agg({'b': ['mean', lambda x: np.std(x)]}))
but that bypasses the Cythonized path, which is a missed opportunity.
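For comparison, here is a minimal sketch of another workaround that avoids the lambda: call each built-in reduction separately and join the results. This is only an illustration, not an existing pandas feature for `.agg`. The names `grouped` and `result` are made up for the example, and it assumes one pass per aggregation is acceptable.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 2], 'b': [4, 5, 6]})
grouped = df.groupby('a')['b']

# Each call below is a built-in groupby reduction, so no Python-level
# lambda is applied per group. ddof is passed directly to std().
result = pd.concat(
    {'mean': grouped.mean(), 'std': grouped.std(ddof=0)},
    axis=1,
)
print(result)
```

This gives the same numbers as the lambda version, but it needs one call per aggregation and does not produce the nested column labels that `.agg({'b': [...]})` returns, so it is not a drop-in replacement.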