Decoupling Lorax from Triage Specific config #12
base: master
Conversation
# NOTE-KA: I feel like the method should be independent of these as these seem very triage specific.
# We can always have a script that bridges the triage data with the explain API
# Leaving this decoupling to another PR
self.id_col = id_col
self.date_col = date_col
self.outcome_col = outcome_col
That's fair, though probably most true for the date column. I can't remember if we use the date anywhere as such; otherwise, we could maybe simply change this to a list of columns (in addition to the id and outcome) to be excluded from analysis (but either way, agree we can worry about that in a future PR)...
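The suggestion above (a configurable list of non-feature columns instead of hard-coded triage columns) could look roughly like this. This is a minimal sketch with a hypothetical select_feature_columns helper, not Lorax's actual API:

```python
import pandas as pd

def select_feature_columns(df, id_col, outcome_col, exclude_cols=None):
    """Return feature columns, dropping id/outcome plus any extra exclusions.

    Hypothetical helper: `exclude_cols` generalizes the hard-coded
    triage-specific date column to an arbitrary exclusion list.
    """
    drop = {id_col, outcome_col} | set(exclude_cols or [])
    return [c for c in df.columns if c not in drop]

df = pd.DataFrame({'entity_id': [1], 'as_of_date': ['2020-01-01'],
                   'outcome': [0], 'f1': [0.5], 'f2': [1.2]})
features = select_feature_columns(df, 'entity_id', 'outcome',
                                  exclude_cols=['as_of_date'])
# features -> ['f1', 'f2']
```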
raise ValueError('Must specify name patterns to aggregate over. '
                 'Use TheLorax.set_name_patterns() first.')
elif how not in ['features', 'patterns']:
    # NOTE-KA: Minor, in this case, should we default to features and let the code run with a warning?
yeah, that seems reasonable to me
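A minimal sketch of that fallback, using a hypothetical normalize_how helper (the real code would presumably do this inline in the elif branch):

```python
import warnings

def normalize_how(how):
    """Fall back to 'features' with a warning instead of raising.

    Hypothetical sketch of the behavior suggested in the review.
    """
    if how not in ('features', 'patterns'):
        warnings.warn(f"Unknown how={how!r}; defaulting to 'features'.")
        return 'features'
    return how
```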
self.model_info['aggregated_dict'] = return_tuple[3]

elif isinstance(self.clf, LogisticRegression):
    # Getting values for Random Forest Classifier
The comment should read # Getting values for Logistic Regression.
Also, should add a TODO here to handle ScaledLogisticRegression (which is what we actually generally use in triage).
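The dispatch under discussion could be sketched like this; classify_model is a hypothetical stand-in for the real branching code, and ScaledLogisticRegression handling is left as the TODO suggested above:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

def classify_model(clf):
    """Hypothetical sketch of the isinstance dispatch in the real method."""
    if isinstance(clf, RandomForestClassifier):
        # Getting values for Random Forest Classifier
        return 'random_forest'
    elif isinstance(clf, LogisticRegression):
        # Getting values for Logistic Regression (corrected comment)
        # TODO: also handle triage's ScaledLogisticRegression
        return 'logistic_regression'
    raise TypeError(f"Unsupported model type: {type(clf).__name__}")
```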
if pred_class is None:
    # TODO: Multiclass adaptation
    # use np.argmax(), or clf.predict()
    pred_class = np.argmax(scores)
I would probably log a warning here, since in most real cases we would likely have some other threshold/top-k, so if we get here it's probably more likely a mistake than intentional.
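A sketch of the suggested warning, using a hypothetical resolve_pred_class helper in place of the inline code:

```python
import logging
import numpy as np

logger = logging.getLogger(__name__)

def resolve_pred_class(pred_class, scores):
    """Fall back to argmax, but warn: in ranked/top-k settings this is
    probably a caller mistake rather than intentional (hypothetical sketch)."""
    if pred_class is None:
        pred_class = int(np.argmax(scores))
        logger.warning(
            "pred_class not given; falling back to argmax=%d. This is "
            "usually unintended when a threshold/top-k is in use.",
            pred_class)
    return pred_class
```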
# TODO: If descriptive is set, the importance scores
# are supported with the context provided by a test dataset
# The code is available in the original constructor, move it here
I think the TODO comment might be out of date?
    test_mat = self.X_test
    fstats = self.feature_stats
else:
    fstats = self.populate_feature_stats(test_mat)
It does feel like it would be easy for users to fall into a pattern where they end up doing a lot of extra work if they repeatedly invoke explain_example() with the same test matrix (rather than registering it first). The only potentially better option I see is to compare the passed matrix with what's already registered and calculate the stats only if they differ (probably within populate_feature_stats() rather than here, actually). However, I'm not sure comparing pandas matrices is efficient enough to be worth the overhead of avoiding the other calculations. Maybe add it as an issue to look into sometime down the road?
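The caching idea could be sketched as follows. StatsCache is hypothetical, and describe() stands in for the real feature-stats computation; DataFrame.equals() does the full comparison whose cost the comment above worries about:

```python
import pandas as pd

class StatsCache:
    """Hypothetical sketch: recompute feature stats only when the passed
    matrix differs from the one already registered."""

    def __init__(self):
        self._registered = None
        self._stats = None

    def populate_feature_stats(self, test_mat):
        if self._registered is not None and self._registered.equals(test_mat):
            return self._stats  # cache hit: skip recomputation
        self._registered = test_mat
        self._stats = test_mat.describe()  # stand-in for the real stats
        return self._stats
```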
else:
    contrib_df = self._build_contrib_df_sample(contrib_list, how=how)

return contrib_df
Also not necessary for this PR, but I wonder if we should somehow try to return the matplotlib axis/figure object as well when graphing. Might make sense as an issue to look at in the future, though.
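A sketch of what returning the plot objects might look like; plot_contributions is a hypothetical stand-in, not the actual Lorax method:

```python
import matplotlib
matplotlib.use('Agg')  # headless backend for this sketch
import matplotlib.pyplot as plt
import pandas as pd

def plot_contributions(contrib_df, graph=True):
    """Return the figure/axes alongside the contribution data so callers
    can tweak or save the plot (hypothetical sketch)."""
    fig, ax = None, None
    if graph:
        fig, ax = plt.subplots()
        contrib_df['contribution'].plot.barh(ax=ax)
        ax.set_xlabel('contribution')
    return contrib_df, fig, ax
```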
date_col='as_of_date', outcome_col='outcome',
name_patterns=None):
# TODO: This method should be removed after verifying that the new init compares
Seems like we could go ahead and remove it at this point, right?
self._populate_feature_stats()
self.feature_stats = self.populate_feature_stats(test_mat=self.X_test)

# TODO: make protected again. Making public for testing
should probably go ahead and do this here, unless there's a reason to keep it public?
- idx: index for example
- sample: the row matrix of the sample. Either an idx or a sample should be provided
should include feature_stats in the input list
Hmmm... if I'm reading correctly, it seems like you'd have to explicitly pass None for either idx or sample here. Could either make it a single required parameter (where you check the type) or default both to None, so they're not required/positional arguments.
# for arbitrary feature groupings), but if just using patterns for categoricals/imputed flags
# we should still be able to show relevant distribution info...

def explain_example_old(self, idx, pred_class=None, num_features=10, graph=True, how='features'):
should be ok to remove now as well, right?
import os
import sys
project_path = os.path.join(os.path.dirname(__file__), '../')
sys.path.append(project_path)
Pretty sure you shouldn't have to do this... where are you invoking the tests from (e.g., this directory or its parent)? At least according to this Stack Overflow answer:
https://stackoverflow.com/questions/1896918/running-unittest-with-typical-test-directory-structure
""" | ||
pass | ||
|
||
def test_old_vs_new_lorax(self): |
probably ok to remove these along with the old methods themselves.
There are different methods to get a descriptive explanation.
This test asserts all those methods yield the same answer.
"""
pass
looks like this might still need to be filled in?
Overall, looks good and thanks for doing all of this!
See a few inline comments for some notes/suggestions. Also, might be good to add a few tests specifically around the different options in the new interface (e.g., where/when you can specify a test matrix, when it gets overwritten, options for specifying an index or sample, etc.).
Anyway, feel free to make changes or create issues for those as you see fit, but approving now so you can merge whenever.
Added the following modifications:
Changed the constructor so that it is not dependent on a test dataset or the triage-specific column information to initialize the object.
Added a new function to load a dataset into the object and precompute feature stats. A dataset can be provided when the object is created or at a later point.
Modified the explain_example method to take in one of the following:
Users can use explain_example to acquire simple feature attribution scores or descriptive explanations (with the input feature stats). For descriptive explanations, the preloaded dataset can be used, or the user can supply a different test set.
Fixed a plotting error that occurred when a generic dataset was used.
Decoupled the feature stats calculation method from the object's test dataset. Now any test matrix can be used.
Tested the functionality of the new explain_example function against the old one to confirm the consistency of outputs.
Now Lorax should be usable with any dataset without the need to add placeholder columns.