pandas deprecation warning in steps_pytest_harvest_utils.py #45
Nice catch @j-carson!
Unfortunately, I was not able to reproduce this with the above code. Could you please let me know how to reproduce it? (note: I tried to replace it with
My pandas version is 1.3.2 and my Python version is 3.8.
Alternatively, can you be more precise about the file/line where the warning happens?
Are you sure you can't reproduce it? I just created a new environment with miniconda and followed the instructions in the README. Running "nox", I definitely see two warnings...
I tried on two existing environments with the latest version of pandas and could not see this :(
I tried to paste all my nox output in here, but it was too big.
No worries. Note that I hacked nox for my projects so that you get a nice log for each job under
Also, I finally managed to reproduce it :D As you suggested, reusing an existing env was not sufficient, but creating a new one worked. This probably relates to a package version difference somewhere. I'll flatten as you suggest, hoping that this will not have any other side effects.
This is failing with current versions of pandas:
_____________________ ERROR at setup of test_synthesis_df ______________________
request = <SubRequest 'module_results_df_steps_pivoted' for <Function test_synthesis_df>>
module_results_df = pytest_obj ... accuracy
test_id s...05
score <function test_my_app_bench at 0x7f8eb490c5e0> ... NaN
[12 rows x 7 columns]
@pytest.fixture(scope='function')
def module_results_df_steps_pivoted(request, module_results_df):
"""
A pivoted version of fixture `module_results_df` from pytest_harvest.
In this version, there is one row per test with the results from all steps in columns.
"""
# Handle the steps
module_results_df = handle_steps_in_results_df(module_results_df, keep_orig_id=False)
# Pivot
> return pivot_steps_on_df(module_results_df, pytest_session=request.session)
pytest_steps/plugin.py:32:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pytest_steps/steps_harvest_df_utils.py:86: in pivot_steps_on_df
return remaining_df.join(one_per_step_df)
/usr/lib64/python3.12/site-packages/pandas/core/frame.py:10730: in join
return merge(
/usr/lib64/python3.12/site-packages/pandas/core/reshape/merge.py:170: in merge
op = _MergeOperation(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <pandas.core.reshape.merge._MergeOperation object at 0x7f8e9c342bd0>
left = pytest_obj ... dataset_param
test_id ... C
test_my_app_bench[C-2] <function test_my_app_bench at 0x7f8eb490c5e0> ... C
[6 rows x 3 columns]
right = step_id train ... score
status duration_ms ....116760 my dataset #C
test_my_app_bench[C-2] passed 0.132810 ... 0.108233 my dataset #C
[6 rows x 7 columns]
how = 'left', on = None, left_on = None, right_on = None, left_index = True
right_index = True, sort = False, suffixes = ('', ''), indicator = False
validate = None
def __init__(
self,
left: DataFrame | Series,
right: DataFrame | Series,
how: JoinHow | Literal["asof"] = "inner",
on: IndexLabel | AnyArrayLike | None = None,
left_on: IndexLabel | AnyArrayLike | None = None,
right_on: IndexLabel | AnyArrayLike | None = None,
left_index: bool = False,
right_index: bool = False,
sort: bool = True,
suffixes: Suffixes = ("_x", "_y"),
indicator: str | bool = False,
validate: str | None = None,
) -> None:
_left = _validate_operand(left)
_right = _validate_operand(right)
self.left = self.orig_left = _left
self.right = self.orig_right = _right
self.how = how
self.on = com.maybe_make_list(on)
self.suffixes = suffixes
self.sort = sort or how == "outer"
self.left_index = left_index
self.right_index = right_index
self.indicator = indicator
if not is_bool(left_index):
raise ValueError(
f"left_index parameter must be of type bool, not {type(left_index)}"
)
if not is_bool(right_index):
raise ValueError(
f"right_index parameter must be of type bool, not {type(right_index)}"
)
# GH 40993: raise when merging between different levels; enforced in 2.0
if _left.columns.nlevels != _right.columns.nlevels:
msg = (
"Not allowed to merge between different levels. "
f"({_left.columns.nlevels} levels on the left, "
f"{_right.columns.nlevels} on the right)"
)
> raise MergeError(msg)
E pandas.errors.MergeError: Not allowed to merge between different levels. (1 levels on the left, 2 on the right)
/usr/lib64/python3.12/site-packages/pandas/core/reshape/merge.py:784: MergeError
=================================== FAILURES ===================================
________________________________ test_synthesis ________________________________
request = <FixtureRequest for <Function test_synthesis>>
fixture_store = OrderedDict({'dataset': OrderedDict({'pytest_steps/tests/test_docs_example_with_harvest.py::test_my_app_bench[A-1-trai...cy': 0.46894857698850767}, 'pytest_steps/tests/test_steps_harvest.py::test_my_app_bench[C-2-score]': ResultsBag:
{}})})
def test_synthesis(request, fixture_store):
"""
Tests that users can create a pivoted syntesis table manually by combining pytest-harvest and pytest-steps.
Note: we could do this at many other places (hook, teardown of a session-scope fixture...)
"""
# Get session synthesis
# - filtered on the test function of interest
# - combined with default fixture store and results bag
results_dct = get_session_synthesis_dct(request, filter=test_synthesis.__module__,
durations_in_ms=True, test_id_format='function', status_details=False,
fixture_store=fixture_store, flatten=True, flatten_more='results_bag')
# We could use this function to perform the test id split here, but we will do it directly on the df
# results_dct = handle_steps_in_results_dct(results_dct, is_flat=True, keep_orig_id=False)
# convert to a pandas dataframe
results_df = pd.DataFrame.from_dict(results_dct, orient='index')
results_df = results_df.loc[list(results_dct.keys()), :] # fix rows order
results_df.index.name = 'test_id'
# results_df.index.names = ['test_id', 'step_id'] # set multiindex names
results_df.drop(['pytest_obj'], axis=1, inplace=True) # drop pytest object column
# extract the step id and replace the index by a multiindex
results_df = handle_steps_in_results_df(results_df, keep_orig_id=False)
# Pivot but do not raise an error if one of the above columns is not present - just in case.
> pivoted_df = pivot_steps_on_df(results_df, pytest_session=request.session)
pytest_steps/tests/test_steps_harvest.py:86:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pytest_steps/steps_harvest_df_utils.py:86: in pivot_steps_on_df
return remaining_df.join(one_per_step_df)
/usr/lib64/python3.12/site-packages/pandas/core/frame.py:10730: in join
return merge(
/usr/lib64/python3.12/site-packages/pandas/core/reshape/merge.py:170: in merge
op = _MergeOperation(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <pandas.core.reshape.merge._MergeOperation object at 0x7f8e9ce94bc0>
left = algo_param dataset_param
test_id
test_my_app_bench[A-... 1.0 C
test_my_app_bench[C-2] 2.0 C
test_basic NaN NaN
right = step_id train ... -
status duration_ms ... s...8083 ... NaN NaN
test_my_app_bench[C-2] passed 0.110367 ... NaN NaN
[7 rows x 9 columns]
how = 'left', on = None, left_on = None, right_on = None, left_index = True
right_index = True, sort = False, suffixes = ('', ''), indicator = False
validate = None
def __init__(
self,
left: DataFrame | Series,
right: DataFrame | Series,
how: JoinHow | Literal["asof"] = "inner",
on: IndexLabel | AnyArrayLike | None = None,
left_on: IndexLabel | AnyArrayLike | None = None,
right_on: IndexLabel | AnyArrayLike | None = None,
left_index: bool = False,
right_index: bool = False,
sort: bool = True,
suffixes: Suffixes = ("_x", "_y"),
indicator: str | bool = False,
validate: str | None = None,
) -> None:
_left = _validate_operand(left)
_right = _validate_operand(right)
self.left = self.orig_left = _left
self.right = self.orig_right = _right
self.how = how
self.on = com.maybe_make_list(on)
self.suffixes = suffixes
self.sort = sort or how == "outer"
self.left_index = left_index
self.right_index = right_index
self.indicator = indicator
if not is_bool(left_index):
raise ValueError(
f"left_index parameter must be of type bool, not {type(left_index)}"
)
if not is_bool(right_index):
raise ValueError(
f"right_index parameter must be of type bool, not {type(right_index)}"
)
# GH 40993: raise when merging between different levels; enforced in 2.0
if _left.columns.nlevels != _right.columns.nlevels:
msg = (
"Not allowed to merge between different levels. "
f"({_left.columns.nlevels} levels on the left, "
f"{_right.columns.nlevels} on the right)"
)
> raise MergeError(msg)
E pandas.errors.MergeError: Not allowed to merge between different levels. (1 levels on the left, 2 on the right)
/usr/lib64/python3.12/site-packages/pandas/core/reshape/merge.py:784: MergeError
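For anyone who wants to see the failure outside the test suite, here is a minimal sketch that should trigger the same check. The two frames are hypothetical stand-ins for `remaining_df` and `one_per_step_df` (their contents are invented for illustration), and `try_join` is a helper made up for this example. On pandas 2.x the join is expected to raise `MergeError`; on older 1.x versions it should only warn and return a frame.

```python
import pandas as pd

# Hypothetical stand-ins for the frames shown in the traceback:
# `left` has single-level columns, `right` has two-level (step_id, metric) columns.
index = pd.Index(["test_a", "test_b"], name="test_id")
left = pd.DataFrame({"dataset_param": ["A", "B"]}, index=index)
right = pd.DataFrame(
    [["passed", 0.1], ["passed", 0.2]],
    index=index,
    columns=pd.MultiIndex.from_tuples(
        [("train", "status"), ("train", "duration_ms")]
    ),
)


def try_join(left_df, right_df):
    """Return the joined frame, or the error message if pandas refuses the join."""
    try:
        return left_df.join(right_df)
    except pd.errors.MergeError as exc:
        return str(exc)


result = try_join(left, right)
print(left.columns.nlevels, right.columns.nlevels)
print(result)
```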
At line 86 of steps_pytest_harvest_utils.py, the columns of the left frame have a single-level index and the columns of the right frame have a two-level index. Joining the two causes a pandas deprecation warning.
Test case: insert the following into tests/test_steps_harvest.py at line 64 and run the library test suite.
You could perhaps fix the warning with the flatten_multilevel_columns function, but the column name change might affect existing tests.
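To illustrate the flattening idea, here is a rough sketch. It assumes the two-level columns can simply be joined into single strings before the join. `flatten_columns` and its `sep` parameter are hypothetical and are not the library's `flatten_multilevel_columns`, whose exact signature and naming scheme may differ. The frames are the same invented stand-ins as above.

```python
import pandas as pd


def flatten_columns(df, sep="/"):
    """Return a copy of `df` whose MultiIndex columns are joined into single strings.

    Hypothetical helper for illustration only; not part of pytest-steps.
    """
    out = df.copy()
    if isinstance(out.columns, pd.MultiIndex):
        out.columns = [sep.join(str(level) for level in col) for col in out.columns]
    return out


index = pd.Index(["test_a", "test_b"], name="test_id")
left = pd.DataFrame({"dataset_param": ["A", "B"]}, index=index)
right = pd.DataFrame(
    [["passed", 0.1], ["passed", 0.2]],
    index=index,
    columns=pd.MultiIndex.from_tuples(
        [("train", "status"), ("train", "duration_ms")]
    ),
)

# Both frames now have single-level columns, so the level check passes.
joined = left.join(flatten_columns(right))
print(list(joined.columns))  # ['dataset_param', 'train/status', 'train/duration_ms']
```

As noted above, the flattened names differ from the original tuples, so any test or user code that selects columns by `(step_id, metric)` would need to be updated.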