Commit 833d87e

Merge branch 'main' into docs/fix-typo-unet-metadata
2 parents 819954c + 1fcee23 commit 833d87e

4 files changed

+67
-48
lines changed

auto3dseg/README.md

Lines changed: 32 additions & 1 deletion
@@ -56,13 +56,44 @@ We provide [a two-minute example](notebooks/auto3dseg_hello_world.ipynb) for use
5656

5757
To further demonstrate the capabilities of **Auto3DSeg**, [here](./tasks/instance22/README.md) is the detailed performance of the algorithm in **Auto3DSeg**, which won 2nd place in the MICCAI 2022 challenge **[INSTANCE22: The 2022 Intracranial Hemorrhage Segmentation Challenge on Non-Contrast Head CT (NCCT)](https://instance.grand-challenge.org/)**
5858

59+
## Running With Your Own Data
60+
61+
To run Auto3DSeg on your own dataset, build a `datalist.json` file and pass it to the AutoRunner.
62+
63+
The datalist format is based on the datasets released by the [Medical Segmentation Decathlon](http://medicaldecathlon.com).
64+
See the function `load_decathlon_datalist` in `monai/data/decathlon_datalist.py` for a description of the format.
65+
66+
The AutoRunner only requires the `training` list in the JSON; no other field is mandatory.
67+
The `fold` key for each image is not required, as the AutoRunner will automatically create cross-validation folds (the number of folds is hard-coded to 5).
68+
If you do add the cross-validation folds beforehand, the AutoRunner will use these by default.
69+
You can also choose to include a `validation` list in the JSON file, in which case the AutoRunner will disable cross-validation and use the specified validation set.
70+
Other metadata, such as `modality`, `numTraining`, or `name`, is not used by the AutoRunner, but we recommend using metadata fields to keep track of the name and version of your dataset. For multi-modal scans, you can enter lists of image paths for both the `image` and `label` keys; MONAI will stack them into channels.
71+
In short, your `datalist.json` file should look like this:
72+
73+
```
74+
{
75+
"name": "Example datalist.json",
76+
"training":
77+
[
78+
{"image": "/path/to/image_1.nii.gz", "label": "/path/to/label_1.nii.gz"},
79+
{"image": "/path/to/image_2.nii.gz", "label": "/path/to/label_2.nii.gz"},
80+
...
81+
]
82+
}
83+
84+
```
85+
86+
The AutoRunner will create a `work_dir` folder in the directory from which it is run, which will contain the resulting models and the copied datalist file _with_ cross-validation folds. This allows you to keep track of which datalist file the models are trained on.
87+
88+
See the description below or the file [run_with_minimal_input.md](docs/run_with_minimal_input.md) to use your datalist with the AutoRunner.
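
As a minimal sketch of the format described above, the following Python builds the `training` list from image/label path pairs and serializes it. The helper names `build_datalist` and `write_datalist` are hypothetical and not part of MONAI, and the paths are placeholders.

```python
import json
from pathlib import Path


def build_datalist(pairs, name="Example datalist.json"):
    """Return a datalist dict with a `training` list of image/label entries.

    `pairs` is an iterable of (image_path, label_path) tuples.
    """
    training = [{"image": str(image), "label": str(label)} for image, label in pairs]
    return {"name": name, "training": training}


def write_datalist(datalist, path="datalist.json"):
    """Write the datalist as indented JSON and return the output path."""
    path = Path(path)
    path.write_text(json.dumps(datalist, indent=4))
    return path


# Placeholder paths; replace them with your own files.
pairs = [
    ("/path/to/image_1.nii.gz", "/path/to/label_1.nii.gz"),
    ("/path/to/image_2.nii.gz", "/path/to/label_2.nii.gz"),
]
datalist = build_datalist(pairs)
```

Calling `write_datalist(datalist)` would then save the file as `datalist.json` in the current directory. Generating the file with `json.dumps` also guards against hand-editing mistakes such as a missing comma.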
89+
5990
## Reference Python APIs for Auto3DSeg
6091

6192
**Auto3DSeg** offers users different levels of APIs to run pipelines that suit their needs.
6293

6394
### 1. Run with Minimal Input using ```AutoRunner```
6495

65-
The user needs to provide a data list (".json" file) for the new task and data root. A typical data list is as this [example](tasks/msd/Task05_Prostate/msd_task05_prostate_folds.json). A sample datalist for an existing MSD formatted dataset can be created using [this notebook](notebooks/msd_datalist_generator.ipynb). After creating the data list, the user can create a simple "task.yaml" file (shown below) as the minimum input for **Auto3DSeg**.
96+
The user needs to provide a data list (".json" file) for the new task and data root. A typical data list looks like this [example](tasks/msd/Task05_Prostate/msd_task05_prostate_folds.json). [This notebook](notebooks/msd_crossval_datalist_generator.ipynb) shows how to create a datalist with cross-validation folds from an existing MSD dataset. After creating the data list, the user can create a simple "task.yaml" file (shown below) as the minimum input for **Auto3DSeg**.
6697

6798
```
6899
modality: CT

auto3dseg/docs/run_with_minimal_input.md

Lines changed: 24 additions & 40 deletions
@@ -18,55 +18,39 @@ if os.path.exists(root):
1818
download_and_extract(resource, compressed_file, root)
1919
```
2020

21-
**Step 1.** Provide the following data list (a ".json" file) for a new task and the data root. The typical data list is shown as follows.
21+
**Step 1.** Provide a `datalist.json` file.
22+
See the documentation under the `load_decathlon_datalist` function in `monai.data.decathlon_datalist` for details on the file format.
2223

24+
For the AutoRunner, you only need the `training` field with its list of training files:
2325
```
2426
{
25-
"training": [
26-
{
27-
"fold": 0,
28-
"image": "image_001.nii.gz",
29-
"label": "label_001.nii.gz"
30-
},
31-
{
32-
"fold": 0,
33-
"image": "image_002.nii.gz",
34-
"label": "label_002.nii.gz"
35-
},
36-
{
37-
"fold": 1,
38-
"image": "image_003.nii.gz",
39-
"label": "label_001.nii.gz"
40-
},
41-
{
42-
"fold": 2,
43-
"image": "image_004.nii.gz",
44-
"label": "label_002.nii.gz"
45-
},
46-
{
47-
"fold": 3,
48-
"image": "image_005.nii.gz",
49-
"label": "label_003.nii.gz"
50-
},
51-
{
52-
"fold": 4,
53-
"image": "image_006.nii.gz",
54-
"label": "label_004.nii.gz"
55-
}
56-
],
57-
"testing": [
58-
{
59-
"image": "image_010.nii.gz"
60-
}
61-
]
27+
"training":
28+
[
29+
{"image": "/path/to/image_1.nii.gz", "label": "/path/to/label_1.nii.gz"},
30+
{"image": "/path/to/image_2.nii.gz", "label": "/path/to/label_2.nii.gz"},
31+
...
32+
],
33+
"testing":
34+
[
35+
"/path/to/test_image_1.nii.gz",
36+
"/path/to/test_image_2.nii.gz",
37+
...
38+
]
6239
}
40+
6341
```
42+
In each training item, you can add a `fold` field (an integer starting at 0) to pre-specify the cross-validation folds; otherwise the AutoRunner will generate its own folds (always 5). All trained algorithms use the same generated or pre-specified folds, which are saved in a file in the `work_dir` folder that the AutoRunner generates.
43+
If you have a validation set, you can include it under a `validation` key with the same format as the `training` list. This will disable cross-validation.
44+
A `testing` list can also be added; it only requires the image files, not the labels. If it is included, the AutoRunner will output predictions on the testing set after training.
45+
It is recommended to add a `name` field and any other metadata fields that allow you to track which version of your dataset the models are trained on.
46+
47+
Save the file to `./datalist.json`.
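
If you prefer to pre-specify the folds, the `fold` field can be added programmatically. The sketch below assumes a shuffled round-robin assignment over 5 folds; the helper `assign_folds` is hypothetical (not a MONAI function) and is not necessarily how the AutoRunner generates its own folds.

```python
import random


def assign_folds(training, num_folds=5, seed=0):
    """Return a copy of `training` with a `fold` index (0..num_folds-1) per item.

    Items are shuffled with a fixed seed and assigned round-robin, so fold
    sizes differ by at most one and the split is reproducible.
    """
    order = list(range(len(training)))
    random.Random(seed).shuffle(order)
    folded = [dict(item) for item in training]  # leave the input list untouched
    for position, index in enumerate(order):
        folded[index]["fold"] = position % num_folds
    return folded
```

Fixing the seed keeps the folds identical across repeated experiments, which is the main reason to create them beforehand.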
6448

6549
**Step 2.** Prepare "task.yaml" with the necessary information as follows.
6650

6751
```
68-
modality: CT
69-
datalist: "./task.json"
52+
modality: CT # or MRI
53+
datalist: "./datalist.json"
7054
dataroot: "/workspace/data/task"
7155
```
7256

auto3dseg/notebooks/auto_runner.ipynb

Lines changed: 2 additions & 6 deletions
@@ -273,13 +273,9 @@
273273
"\n",
274274
"`set_training_params` in `AutoRunner` provides an interface to change all algorithms' training parameters in one line. \n",
275275
"\n",
276-
"NOTE: \n",
277-
"**Auto3DSeg** uses MONAI bundle templates to perform training, validation, and inference.\n",
278-
"The number of epochs/iterations of training is specified by the config files in each template.\n",
279-
"Users can override these these values in the bundle templates.\n",
280-
"But users should consider that some bundle templates may use `num_iterations` and other may use `num_epochs` to iterate.\n",
276+
"The code block below, for example, sets the number of epochs used for training. Note that some algorithms may treat this as a maximum number of epochs.\n",
281277
"\n",
282-
"For demo purposes, below is a code block to convert num_epoch to iteration style and override all algorithms with the same training parameters.\n",
278+
"NOTE: \n",
283279
"The setup works for machines with 8 or fewer GPUs.\n",
284280
"The datalist in this example is only using a subset of the original dataset.\n",
285281
"Users need to ensure the number of GPUs is not greater than the number of partitions the training dataset can be split into.\n",

auto3dseg/notebooks/msd_datalist_generator.ipynb renamed to auto3dseg/notebooks/msd_crossval_datalist_generator.ipynb

Lines changed: 9 additions & 1 deletion
@@ -19,7 +19,15 @@
1919
"See the License for the specific language governing permissions and \n",
2020
"limitations under the License. \n",
2121
"\n",
22-
"# Datalist Generator"
22+
"# Datalist Cross-Validation Folds Generator"
23+
]
24+
},
25+
{
26+
"cell_type": "markdown",
27+
"metadata": {},
28+
"source": [
29+
"This notebook shows how to add cross-validation folds to an existing Medical Segmentation Decathlon datalist, in this case that of Task09_Spleen. \n",
30+
"When running repeated experiments, it can be beneficial to create cross-validation folds beforehand."
2331
]
2432
},
2533
{
