I would expect the following test to pass. Within concat_columns, when a DataFrame column containing a mix of nulls and integers is given the Integer logical type during inference, init fails. This is expected, and an MR was put up to make concat_columns resilient to it. When we extended the test to cover Boolean/BooleanNullable, we discovered that init imputes the missing boolean value instead of raising an error about the attempted coercion to a non-nullable type.
The test should also be extendable to Integer/IntegerNullable (and float64/Float64 when they're a thing).
import numpy as np
import pandas as pd
import pytest
import woodwork  # noqa: F401 (importing woodwork registers the df.ww accessor)
from woodwork.logical_types import Boolean, BooleanNullable


@pytest.mark.parametrize("none_type", [None, np.nan, pd.NA])
@pytest.mark.parametrize("pass_logical_types", [True, False])
def test_boolean_inference(none_type, pass_logical_types):
    df = pd.DataFrame({"boolean": [none_type, True, False, True]})
    if pass_logical_types:
        with pytest.raises(Exception):
            # Would expect init to fail, since this coerces a column with a
            # missing value to the non-nullable Boolean (bool) type.
            df.ww.init(logical_types={"boolean": Boolean})
    else:
        df.ww.init()
        assert isinstance(df.ww.logical_types["boolean"], BooleanNullable)
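The asymmetry between Integer and Boolean described above can be reproduced with plain pandas casts. This is only a sketch: it assumes that init ends up casting the column with astype to the logical type's primary dtype ("bool" / "int64"), which the thread does not confirm, and the variable names are illustrative.

```python
import pandas as pd

# Object-dtype columns with one missing value, mirroring the test data above.
bools = pd.Series([None, True, False, True], dtype="object")
ints = pd.Series([None, 1, 2, 3], dtype="object")

# Casting to the non-nullable bool dtype does not raise:
# None is silently converted to False, i.e. the missing value is imputed.
coerced = bools.astype("bool")
print(coerced.tolist())

# Casting to the non-nullable int64 dtype raises instead,
# because None cannot be converted to an integer.
try:
    ints.astype("int64")
    int_cast_failed = False
except (TypeError, ValueError):
    int_cast_failed = True
print(int_cast_failed)

# The nullable "boolean" extension dtype keeps the missing value as <NA>.
nullable = bools.astype("boolean")
print(nullable.isna().tolist())
```

If that assumption holds, a None in a Boolean column becomes False without any warning, which would explain why the Boolean case does not fail the way the Integer case does.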
@chukarsten @ParthivNaresh pandas added a method called convert_dtypes in version 1.0.0 that could provide better inference for nullable types. (docs)
from woodwork.logical_types import BooleanNullable
import pandas as pd
import numpy as np

for none_type in [None, np.nan, pd.NA]:
    # initial dtype is object
    series = pd.Series([none_type, True, True], dtype='object')
    # method infers dtype to boolean nullable
    inferred_dtype = series.convert_dtypes().dtype
    assert str(inferred_dtype) == BooleanNullable.primary_dtype
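Since the issue asks for the test to be extendable to Integer/IntegerNullable, the same check can be sketched for integers. This uses pandas only; comparing against the string "Int64" is an assumption about what IntegerNullable's primary dtype would be, and the `inferred` dict is just an illustrative helper.

```python
import numpy as np
import pandas as pd

# Collect the dtype convert_dtypes infers for each kind of missing value.
inferred = {}
for none_type in [None, np.nan, pd.NA]:
    # initial dtype is object, with one missing value and two integers
    series = pd.Series([none_type, 1, 2], dtype="object")
    inferred[repr(none_type)] = str(series.convert_dtypes().dtype)

print(inferred)
```

With recent pandas versions, each entry should come back as the nullable "Int64" dtype rather than float64 or object.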
@jeff-hernandez Wow, nice catch! We should definitely explore this and see where we can use it. I'm thinking that in EvalML, if we need quick high-level type inference, we might be able to use this. In Woodwork, we can use the extension concept they provide on top of the smarter inference we're now doing for nulls.