Releases: dssg/triage

Revolution noodle

21 Mar 14:56

This release splits the Postmodeling analysis into two phases. The first phase generates an experiment summary report that lets the user run a general sanity check of the experiment setup before moving on to model selection. The second phase handles the model analysis for a subset of models of interest, e.g., crosstabs and list analysis.

⚠️ Warning! We no longer support Python 3.8

New functionality

  • Subsets. Subsets are now generated by querying the relevant cohort rather than all existing entities. Querying all entities significantly slowed down the experiment when the space of entities (and the subset) was large, so subsets are now forced to be a subset of the relevant cohort. The cohort_name is now included in the subset hash.
  • Experiment summary report. After each experiment run, triage can generate a Jupyter notebook that summarizes the experiment outputs. This can be used to verify that the experiment generated the intended outputs and to identify any initial errors.
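
To illustrate why the cohort name belongs in the subset hash, here is a minimal sketch. The helper `subset_hash`, the config keys, and the choice of SHA-256 are assumptions made for illustration and are not triage's actual implementation. The point is only that the same subset definition under two different cohorts yields two distinct hashes.

```python
import hashlib
import json


def subset_hash(subset_config: dict, cohort_name: str) -> str:
    """Hypothetical helper: hash a subset definition together with its cohort name."""
    payload = {"subset": subset_config, "cohort_name": cohort_name}
    # sort_keys keeps the serialization (and so the hash) independent of key order
    serialized = json.dumps(payload, sort_keys=True)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


# Illustrative subset definition; the names and query are made up
subset = {"name": "example_subset", "query": "select entity_id from example_table"}
hash_a = subset_hash(subset, "cohort_a")
hash_b = subset_hash(subset, "cohort_b")
```

Under these assumptions, a subset computed for one cohort is never mistaken for the same subset computed for another.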

Bug Fixes

  • When predicting forward and not having labels in the matrix, we add a default 0 as the label
  • Package Dependencies. Upgraded scikit-learn version, and specified a compatible numpy version to ensure support for Python 3.9+
  • Temporary files created for generating the CSV matrices are now stored in /tmp instead of /tmp/triage_output/matrices
  • Fixed bugs in Colab tutorial
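
The default-label fix above can be pictured with a small sketch. The function `add_default_label`, the column name `label`, and the row layout are hypothetical and only illustrate the idea of filling in 0 when a forward-prediction matrix carries no labels.

```python
def add_default_label(rows, label_column="label", default=0):
    """Return copies of the rows, adding a default label where none is present."""
    return [{**row, label_column: row.get(label_column, default)} for row in rows]


# A matrix built for predicting forward: features only, no labels yet
rows = [
    {"entity_id": 1, "feature_a": 0.5},
    {"entity_id": 2, "feature_a": 1.2},
]
labeled = add_default_label(rows)
```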

Documentation

  • Added documentation in Postmodeling section related to the Experiment summary report
  • Updated the Colab tutorial to reflect the new Experiment summary report

White peaches

16 Oct 17:15

Bug Fixes

  • Fixes a bug related to the random_state variable in Triage's configuration file. The random seeds generated from the defined random state were not propagating correctly during the training of individual models.
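
As a rough sketch of what correct propagation looks like, the example below derives one seed per model from a single experiment-level random state. The function `model_seeds` and the seed range are assumptions for illustration, not triage's actual code.

```python
import random


def model_seeds(random_state: int, n_models: int) -> list[int]:
    """Derive one reproducible seed per model from a single experiment-level random state."""
    # A local generator keeps the global random state untouched
    rng = random.Random(random_state)
    return [rng.randint(0, 2**31 - 1) for _ in range(n_models)]


seeds_first_run = model_seeds(random_state=42, n_models=3)
seeds_second_run = model_seeds(random_state=42, n_models=3)
```

With the same random state, both runs produce the same per-model seeds, which is the behavior the fix is meant to restore.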

Documentation

  • Colab example has been updated with warnings when installing Triage.

Update on required packages

  • Some packages have been updated with newer versions.
  • Some packages have been removed as requirements since they are now included in other packages in the requirements file.

Pusadee's (Patch 2)

02 Feb 18:42

Bug Fixes

  • Saving csv.gz matrices on S3 was not working correctly. We now put the complete file to S3 instead of streaming chunks.

Pusadee's (Patch 1)

01 Feb 17:20

Bug Fixes

  • Fixes saving the csv.gz features matrices on S3.

Pusadee's

22 Jan 21:19

New Functionality

  • Changed how we create the features matrix: we now generate a CSV for each feature group and stitch them together on disk.
  • CSVs are now read and loaded with Polars instead of pandas DataFrames, which gives a 10x improvement in time.
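
The stitching step can be sketched as follows. This is a simplified illustration that uses Python's standard csv module rather than Polars, and it assumes every feature-group file lists the same rows in the same order, with the key in the first column. The function `stitch_csvs` and the file and column names are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def stitch_csvs(paths, out_path):
    """Join CSVs column-wise, assuming all files list the same rows in the same order."""
    handles = [open(p, newline="") for p in paths]
    try:
        readers = [csv.reader(h) for h in handles]
        with open(out_path, "w", newline="") as out:
            writer = csv.writer(out)
            for rows in zip(*readers):
                # Keep the key column (first column) from the first file only
                merged = list(rows[0])
                for row in rows[1:]:
                    merged.extend(row[1:])
                writer.writerow(merged)
    finally:
        for h in handles:
            h.close()


# Two made-up feature groups written to a temporary directory
workdir = Path(tempfile.mkdtemp())
group_a = workdir / "group_a.csv"
group_b = workdir / "group_b.csv"
group_a.write_text("entity_id,feature_a\n1,0.5\n2,1.2\n")
group_b.write_text("entity_id,feature_b\n1,7\n2,9\n")

stitched = workdir / "matrix.csv"
stitch_csvs([group_a, group_b], stitched)
with stitched.open(newline="") as f:
    stitched_rows = list(csv.reader(f))
```

Because the files are read row by row, only one row per feature group is held in memory at a time.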

Dried Peach (Patch 3)

19 Sep 21:19

New Functionality

  • Adds functionality to deal with the cohort being defined within the label query for predicting lists

Bug Fixes

  • Fixes a bug on the Baseline Multi-Feature Ranker on the normalization (#934)

Refactoring/Documentation

  • Updates different packages

Dried Peach (Patch 2)

08 Nov 16:28

Bug Fixes

  • Fixes a bug with logging existing cohort dates when using the labels query for the cohort and running with replace=False (#915)

Dried Peach (Patched)

25 Oct 22:00

Bug Fixes

  • Fixes a bug that was resulting in old default behavior being used when omitting the cohort section of the config to use the label config for both (#911)
  • Raise the intended deprecation warning (rather than a cryptic error) when specifying a groups key to feature aggregations (#907)

Refactoring/Documentation

  • Adds a colab-based tutorial for a quicker introduction to how to use triage (#878)
  • Remove a duplicated cohort section from the example experiment config (#902)
  • Various minor documentation updates

Dried Peach

20 May 22:45

New Functionality

  • Adds Python 3.10 support (NOTE: drops Python 3.7 support) (via #893)
  • Support (and prefer) specifying a SQL file path for cohort and label queries (#883)
  • Adds information about cohort, label, and bias hashes to the triage_runs table (#888)
  • Allow specifying a cohort to be optional, defaulting to the label query (#877)

Bug Fixes

  • Fixes an issue with installing on macOS with Python 3.9 (via #893)
  • Removes (buggy) support for groupings other than entity_id in feature generation (#887)

Refactoring/Documentation

  • Various dependency updates (#893)
  • Error on cohort or label duplicates (#889)
  • Various updates to documentation

Dried Mango (Patched)

08 Feb 22:19

New Functionality

  • Adds support for postgres 12 and 13 (#882)

Refactoring/Documentation

  • Fixes typos in documentation