Commit

adding code for the modified chexpert-labeler and explanations of how it was used
ricbl committed Apr 6, 2022
1 parent ceb39c8 commit 9824bf9
Showing 4 changed files with 211 additions and 9 deletions.
1 change: 0 additions & 1 deletion .gitignore
@@ -7,7 +7,6 @@ data_viewer
*.EDF
*.mat
*.m~
-*.csv
mimic-sample
*.json
*.pkl
8 changes: 5 additions & 3 deletions README.md
@@ -6,7 +6,7 @@ This code was used to collect, process, and validate the REFLACX (Reports and Ey

The code is organized in 4 folders. `pre_processing_or_sampling_or_ibm_training`, `interface_src`, and `post_processing_and_dataset_generation` are provided to show how our dataset was collected, and their scripts might need changes to hard-coded code to adapt it to different needs of data collection.
`examples_and_paper_numbers` is provided to show how to get the numbers used to validate the publicly available dataset and a few examples of how to use it.
-All scripts have to be run from inside their respective folders.
+All scripts have to be run from inside their respective folders and, unless differently instructed, should be run using a Python environment satisfying the requirements listed below.

Below we provide a short description of each folder and the recommended order for running scripts. The provided paths are relative to each of the folders.

@@ -104,7 +104,10 @@ To calculate the agreement between radiologists in terms of manual labeling, run

To get the statistics for the `image_all_paths.txt` images, run `get_filtered_mimic_statistics.py` and `get_sex_statistics_datasets.py`

-To get the temporal correlation graph (Figure 4), run `generate_temporal_correlation_numbers.py`, followed by `draw_graph_temporal_correlation.py`
+To produce the temporal correlation graph (Figure 4), run `generate_temporal_correlation_numbers.py`, followed by `draw_graph_temporal_correlation.py`. We provide the file `manually_labeled_reports_3.csv` containing the manually-labeled abnormality mention locations in the reports to generate the graph. The modified chexpert-labeler, provided in the folder `chexpert-labeler`, was used for faster manual labeling following:
+- generate the labels of the reports from the REFLACX dataset using the modified chexpert-labeler, running `extract_report.py`;
+- follow the instruction in `chexpert-labeler/README.md` to create a new Python environment and run `python chexpert-labeler/label.py --reports_path=phase_3.csv --output_path=labeled_reports_3.csv`;
+- labels from 200 random cases in the generated `labeled_reports_3.csv` were manually corrected to match the contents of the reports.

The tables used to calculate some of the numbers shown in the paper are in `tables_calculations/`. These tables were modified from the csv files generated by other scripts.

@@ -121,7 +124,6 @@ The following scripts need additional data not provided with the public dataset:
- `create_calibration_table.py`, which depends on the output from `../post_processing_and_dataset_generation/ASC2MAT.py` and was used to calculate the average and maximum error values for the calibrations.
- `edit_video.py`, used to generate a video showing interface use through all screens of a case, with all portions recorded by the MATLAB interface.


## Requirements

### Python
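
Reviewer note on the README change above: the new text says that 200 random cases from `labeled_reports_3.csv` were manually corrected, but the commit does not show how those cases were drawn. The snippet below is only a sketch of one reproducible way to draw such a subset using the Python standard library. The function names, the seed, and the column names `report` and `label` are hypothetical placeholders, not the actual procedure or the actual columns of `labeled_reports_3.csv`.

```python
import csv
import io
import random
from typing import Dict, List


def read_rows(handle) -> List[Dict[str, str]]:
    """Read every row of a CSV file object into a list of dicts."""
    return list(csv.DictReader(handle))


def sample_cases(rows: List[Dict[str, str]], k: int = 200,
                 seed: int = 0) -> List[Dict[str, str]]:
    """Draw up to k distinct rows without replacement.

    A fixed seed makes the draw repeatable, so the same subset can be
    regenerated later. The seed value here is an arbitrary assumption.
    """
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))


if __name__ == '__main__':
    # Stand-in data; a real run would open the labeler output instead, e.g.
    # open('labeled_reports_3.csv', newline=''). Column names are hypothetical.
    fake_csv = io.StringIO(
        'report,label\n' + ''.join(f'report {i},0\n' for i in range(1000)))
    subset = sample_cases(read_rows(fake_csv), k=200, seed=0)
    print(len(subset))
```

Recording the seed alongside the corrected file would let a reader recover which rows were reviewed, if that matters for reproducing the figure.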
10 changes: 5 additions & 5 deletions examples_and_paper_numbers/chexpert-labeler/args/arg_parser.py
@@ -20,23 +20,23 @@ def __init__(self):

# Phrases
parser.add_argument('--mention_phrases_dir',
-default='./src/chexpert-labeler/phrases/mention',
+default='./chexpert-labeler/phrases/mention',
help='Directory containing mention phrases for ' +
'each observation.')
parser.add_argument('--unmention_phrases_dir',
-default='./src/chexpert-labeler/phrases/unmention',
+default='./chexpert-labeler/phrases/unmention',
help='Directory containing unmention phrases ' +
'for each observation.')

# Rules
parser.add_argument('--pre_negation_uncertainty_path',
-default='./src/chexpert-labeler/patterns/pre_negation_uncertainty.txt',
+default='./chexpert-labeler/patterns/pre_negation_uncertainty.txt',
help='Path to pre-negation uncertainty rules.')
parser.add_argument('--negation_path',
-default='./src/chexpert-labeler/patterns/negation.txt',
+default='./chexpert-labeler/patterns/negation.txt',
help='Path to negation rules.')
parser.add_argument('--post_negation_uncertainty_path',
-default='./src/chexpert-labeler/patterns/post_negation_uncertainty.txt',
+default='./chexpert-labeler/patterns/post_negation_uncertainty.txt',
help='Path to post-negation uncertainty rules.')

# Output parameters.
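
Reviewer note on the path changes above: the five defaults are relative paths, so they resolve against the directory the labeler is launched from. With the new values, `python chexpert-labeler/label.py ...` finds its phrase and pattern files only when run from the folder that contains `chexpert-labeler/`. The sketch below rebuilds just those five arguments (the real `arg_parser.py` defines more) and reports which defaults are missing from a given base directory. The helper names `build_parser` and `missing_defaults` are illustrative and not part of the repository.

```python
import argparse
from pathlib import Path
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Rebuild only the five path arguments touched by this diff."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--mention_phrases_dir',
                        default='./chexpert-labeler/phrases/mention')
    parser.add_argument('--unmention_phrases_dir',
                        default='./chexpert-labeler/phrases/unmention')
    parser.add_argument('--pre_negation_uncertainty_path',
                        default='./chexpert-labeler/patterns/pre_negation_uncertainty.txt')
    parser.add_argument('--negation_path',
                        default='./chexpert-labeler/patterns/negation.txt')
    parser.add_argument('--post_negation_uncertainty_path',
                        default='./chexpert-labeler/patterns/post_negation_uncertainty.txt')
    return parser


def missing_defaults(base_dir: Path) -> List[str]:
    """Return the default paths that do not exist when resolved against base_dir."""
    args = build_parser().parse_args([])
    return [p for p in vars(args).values() if not (base_dir / p).exists()]


if __name__ == '__main__':
    # Run this from the folder you intend to launch label.py from.
    for path in missing_defaults(Path.cwd()):
        print(f'missing relative to {Path.cwd()}: {path}')
```

An empty result from the intended launch folder suggests the defaults will resolve; any listed path would need to be passed explicitly on the command line.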