Releases: EducationalTestingService/skll

Version 0.12.0

10 Sep 17:33
  • Fixed crash with kappa when given two sets of ratings that are both
    missing an intermediate value (e.g., [1, 2, 4]).
  • Added summarize_results script for creating a nice summary TSV file
    from a list of JSON results files.
  • Summary files for ablation studies now have an extra column that says
    which feature was removed.
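
The kappa fix above concerns rating sets that skip a value inside their range. As a minimal sketch (not SKLL's actual implementation), one way to stay robust is to build the category range from the minimum and maximum observed integer ratings, so unobserved intermediate values simply get zero counts. The function name `kappa_sketch` and its `weights` parameter are assumptions made for illustration.

```python
from collections import Counter


def kappa_sketch(y_true, y_pred, weights=None):
    """Cohen's kappa over the full integer range spanned by both rating lists.

    weights: None (unweighted), "linear", or "quadratic".
    Illustrative only; ratings are assumed to be integers.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("rating lists must have the same length")
    lo = min(min(y_true), min(y_pred))
    hi = max(max(y_true), max(y_pred))
    n_cat = hi - lo + 1  # counts unobserved intermediate values too
    n = len(y_true)

    # Shift ratings so the smallest one maps to index 0 (handles negatives).
    observed = Counter((a - lo, b - lo) for a, b in zip(y_true, y_pred))
    hist_true = Counter(a - lo for a in y_true)
    hist_pred = Counter(b - lo for b in y_pred)

    def weight(i, j):
        if weights is None:
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (n_cat - 1) if n_cat > 1 else 0.0
        return d if weights == "linear" else d * d

    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(n_cat):
        for j in range(n_cat):
            w = weight(i, j)
            num += w * observed[(i, j)] / n
            den += w * hist_true[i] * hist_pred[j] / (n * n)
    return 1.0 if den == 0 else 1.0 - num / den
```

Because missing keys in a `Counter` read as zero, a call such as `kappa_sketch([1, 2, 4], [1, 2, 4])` never indexes a category that does not exist.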

Version 0.11.0

10 Sep 17:32
  • Added initial version of skll_convert script for converting between
    .jsonlines, .megam, and .tsv data file formats.
  • Fixed bug in _megam_dict_iter where labels for instances with all zero
    features were being incorrectly set to None.
  • Fixed bug in _tsv_dict_iter where features with zero values were being
    retained with values set as '0' instead of being removed completely. This
    caused DictVectorizer to create extra features, so results may change
    a little bit if you were using .tsv files.
  • Fixed crash with predict and train_only modes when running on the grid.
  • No longer use process pools to load files if
    SKLL_MAX_CONCURRENT_PROCESSES is 1.
  • Added more informative error message when trying to load a file without
    any features.
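
To illustrate the _tsv_dict_iter fix, here is a minimal sketch of reading a .tsv file into sparse feature dictionaries while dropping zero-valued features. It is not SKLL's code: the function name `iter_tsv_examples` and the label column name `y` are hypothetical. The motivation is consistent with how scikit-learn's DictVectorizer one-hot encodes string values, so a retained '0' string would plausibly become a feature of its own.

```python
import csv
import io


def iter_tsv_examples(lines, label_col="y"):
    """Yield (label, features) pairs, omitting features whose value is zero.

    `label_col` is a hypothetical column name chosen for this example.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    for row in reader:
        label = row.pop(label_col)
        features = {}
        for name, raw in row.items():
            value = float(raw)  # parse to a number instead of keeping '0'
            if value != 0:
                features[name] = value
        yield label, features


# Two instances; each has one zero-valued feature that should be dropped.
sample = io.StringIO("y\tf1\tf2\nA\t1\t0\nB\t0\t2.5\n")
rows = list(iter_tsv_examples(sample))
```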

Version 0.10.1

10 Sep 17:32
  • Made processes non-daemonic to fix a pool.map issue when running
    multiple configuration files at the same time with run_experiment.

Version 0.10.0

05 Sep 19:33
  • run_experiment can now take multiple configuration files.
  • Fixed issue where model parameters and scores were missing in evaluate
    mode.

Version 0.9.17

05 Sep 19:32
  • Added skll.data.convert_examples function to convert a list of
    dictionaries to an ExamplesTuple.
  • Added a new optional field to configuration file, ids_to_floats, to
    help save memory if you have a massive number of instances with numeric
    IDs.
  • Replaced use_dense_features and scale_features options with
    feature_scaling. See the run_experiment documentation for details.

Version 0.9.16

03 Sep 20:19
  • Fixed summary output for ablation experiments. Previously summary files
    would not include all results.
  • Added ablation unit tests.
  • Fixed issue with generating PDF documentation.

Version 0.9.15

03 Sep 20:20
  • Added two new required fields to the configuration file format under the
    General heading: experiment_name and task. See the run_experiment
    documentation
    (http://skll.readthedocs.org/en/latest/run_experiment.html#creating-configuration-files)
    for details.
  • Fixed an issue where the "loading..." message was never being printed when
    loading data files.
  • Fixed a bug where keyword arguments were being ignored for metrics when
    calculating final scores for a tuned model. This means that previously
    reported results may be wrong for tuning metrics that use keyword
    arguments: f1_score_micro, f1_score_macro,
    linear_weighted_kappa, and quadratic_weighted_kappa.
  • IDs that look like numbers are now converted to floats to save
    memory for very large files.
  • kappa now supports negative ratings.
  • Fixed a crash when specifying grid_search_jobs and pre-specified folds.
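
The ID conversion mentioned in this release can be pictured with a small sketch: try to parse each ID as a float and fall back to the original string when that fails. The helper name `id_to_float_if_numeric` is hypothetical, and this is an assumption about the general approach, not a copy of SKLL's code.

```python
def id_to_float_if_numeric(raw_id):
    """Return float(raw_id) if it parses as a number, else raw_id unchanged.

    Hypothetical helper for illustration only.
    """
    try:
        return float(raw_id)
    except (TypeError, ValueError):
        return raw_id


ids = [id_to_float_if_numeric(x) for x in ["42", "1e3", "EXAMPLE_7"]]
```

Two caveats of this approach: strings such as "nan" and "inf" also parse as floats, and integer IDs larger than 2**53 cannot all be represented exactly as floats, so distinct IDs could collide.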

Version 0.9.14

03 Sep 20:20
  • Hotfix to fix issue where grid_search_jobs setting was being overridden
    by grid_search_folds.

Version 0.9.13

27 Aug 18:38
  • Added skll.data.write_feature_file (also available as
    skll.write_feature_file) to simplify outputting .jsonlines, .megam, and
    .tsv files.
  • Added more unit tests for handling .megam and .tsv files.
  • Fixed a bug that caused a crash when using gridmap.
  • grid_search_jobs now sets both n_jobs and pre_dispatch for
    GridSearchCV under the hood. This prevents a potential memory issue when
    dealing with large datasets and learners that cannot handle sparse data.
  • Changed logging format when using run_experiment to be a little more
    readable.
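
As a rough sketch of the grid_search_jobs change, the setting can be mapped onto the two scikit-learn GridSearchCV parameters named above. The helper `grid_search_parallel_kwargs` is hypothetical and only shows the mapping; how SKLL wires it internally is not stated in these notes.

```python
def grid_search_parallel_kwargs(grid_search_jobs):
    """Build keyword arguments for GridSearchCV from a single job count.

    Hypothetical helper: setting pre_dispatch equal to n_jobs means no more
    jobs are dispatched at once than there are workers, instead of
    GridSearchCV's default of '2*n_jobs'.
    """
    if grid_search_jobs < 1:
        raise ValueError("grid_search_jobs must be a positive integer")
    return {"n_jobs": grid_search_jobs, "pre_dispatch": grid_search_jobs}


kwargs = grid_search_parallel_kwargs(4)
# Intended use, assuming scikit-learn is installed:
#   GridSearchCV(estimator, param_grid, **kwargs)
```

Fewer pre-dispatched jobs means fewer copies of the data queued in memory at once, which matters most when the data has to be densified.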

Version 0.9.12

27 Aug 18:37
  • Fixed serious issue where merging feature sets was not working correctly.
    All experiments conducted using feature set merging (i.e., where you
    specified a list of feature files and had them merged into one set for
    training/testing) should be considered invalid. In general, your
    results should previously have been poor and now should be much better.
  • Added more verbose regression output including descriptive statistics
    and Pearson correlation.
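
For readers unfamiliar with feature set merging, the sketch below shows what the operation is meant to produce: the features from several files combined per instance ID. The notes do not say what the original bug was, so this is only an illustration of the intended behaviour; `merge_feature_sets` and the dictionary layout are assumptions.

```python
def merge_feature_sets(feature_sets):
    """Merge {example_id: {feature: value}} mappings into a single mapping.

    Hypothetical helper. Raises ValueError if two sets define the same
    feature for the same example, since silently overwriting would hide
    a data problem.
    """
    merged = {}
    for feature_set in feature_sets:
        for example_id, features in feature_set.items():
            target = merged.setdefault(example_id, {})
            overlap = target.keys() & features.keys()
            if overlap:
                raise ValueError(
                    f"duplicate features for {example_id!r}: {sorted(overlap)}"
                )
            target.update(features)
    return merged


lexical = {"ex1": {"num_words": 12.0}, "ex2": {"num_words": 7.0}}
syntactic = {"ex1": {"depth": 3.0}, "ex2": {"depth": 5.0}}
merged = merge_feature_sets([lexical, syntactic])
```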