Releases: EducationalTestingService/skll
Version 0.12.0
- Fixed crash with kappa when given two sets of ratings that are both missing an intermediate value (e.g., [1, 2, 4]).
- Added summarize_results script for creating a nice summary TSV file from a list of JSON results files.
- Summary files for ablation studies now have an extra column that says which feature was removed.
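To illustrate the kappa fix above, here is a minimal, generic sketch of weighted kappa that builds its rating range from the minimum to the maximum observed value, so ratings missing from the middle of the range (such as 3 in [1, 2, 4]) still get a row and column in the confusion matrix. This is not SKLL's implementation; the function name `weighted_kappa` and its `weights` parameter are assumptions made for the example.

```python
def weighted_kappa(y_true, y_pred, weights="quadratic"):
    """Weighted Cohen's kappa for two equal-length lists of integer ratings.

    Illustrative sketch only; names and signature are hypothetical.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("rating lists must be non-empty and of equal length")

    # Span the full range of ratings seen in either list, so unused
    # intermediate values and negative ratings are indexed correctly.
    low = min(min(y_true), min(y_pred))
    high = max(max(y_true), max(y_pred))
    n_labels = high - low + 1
    n = len(y_true)

    # Confusion matrix of observed rating pairs.
    observed = [[0] * n_labels for _ in range(n_labels)]
    for t, p in zip(y_true, y_pred):
        observed[t - low][p - low] += 1

    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(col) for col in zip(*observed)]

    power = 2 if weights == "quadratic" else 1
    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(n_labels):
        for j in range(n_labels):
            w = abs(i - j) ** power
            num += w * observed[i][j]
            den += w * hist_true[i] * hist_pred[j] / n

    if den == 0:
        # Every rating in both lists is the same single value.
        return 1.0
    return 1.0 - num / den
```

Calling `weighted_kappa([1, 2, 4], [1, 2, 4])` exercises the case from the release note: the label 3 never occurs, but the matrix still has four rows and columns.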
Version 0.11.0
- Added initial version of skll_convert script for converting between .jsonlines, .megam, and .tsv data file formats.
- Fixed bug in _megam_dict_iter where labels for instances with all zero features were being incorrectly set to None.
- Fixed bug in _tsv_dict_iter where features with zero values were being retained with values set as '0' instead of being removed completely. This caused DictVectorizer to create extra features, so results may change a little bit if you were using .tsv files.
- Fixed crash with predict and train_only modes when running on the grid.
- No longer use process pools to load files if SKLL_MAX_CONCURRENT_PROCESSES is 1.
- Added more informative error message when trying to load a file without any features.
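The .tsv fix above matters because a dictionary vectorizer such as scikit-learn's DictVectorizer treats string values as categorical, so a retained '0' string becomes its own one-hot feature. The following is a rough sketch of the intended behavior, not the actual _tsv_dict_iter code: the function name `tsv_rows_to_dicts` and the `id` and `y` column names are assumptions chosen for the example.

```python
import csv
import io


def tsv_rows_to_dicts(tsv_text, label_col="y", id_col="id"):
    """Yield (id, label, features) per row, dropping zero-valued features.

    Hypothetical helper for illustration; column names are assumptions.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        example_id = row.pop(id_col)
        label = row.pop(label_col)
        features = {}
        for name, raw in row.items():
            value = float(raw)
            # Omit zeros entirely so the sparse representation stays
            # sparse and no spurious "name=0" feature is created later.
            if value != 0.0:
                features[name] = value
        yield example_id, label, features
```

With input such as `"id\ty\tf1\tf2\nA\tdog\t0\t1.5\n"`, the feature dictionary for instance A contains only `f2`.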
Version 0.10.1
- Made processes non-daemonic to fix pool.map issue with running multiple configuration files at the same time with run_experiment.
Version 0.10.0
- run_experiment can now take multiple configuration files.
- Fixed issue where model parameters and scores were missing in evaluate mode.
Version 0.9.17
- Added skll.data.convert_examples function to convert a list of dictionaries to an ExamplesTuple.
- Added a new optional field to configuration file, ids_to_floats, to help save memory if you have a massive number of instances with numeric IDs.
- Replaced use_dense_features and scale_features options with feature_scaling. See the run_experiment documentation for details.
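As a rough illustration of the idea behind ids_to_floats, the sketch below converts an ID to a float when it parses as one and otherwise leaves it unchanged. It is not SKLL's code; the helper name `maybe_float_id` is hypothetical.

```python
def maybe_float_id(raw_id):
    """Return raw_id as a float if it parses as one, else unchanged.

    Hypothetical helper for illustration only.
    """
    try:
        return float(raw_id)
    except ValueError:
        # Non-numeric IDs such as "essay_17" stay as strings.
        return raw_id
```

One caveat of this approach is that distinct strings can map to the same float (for example, "7" and "007"), which is presumably one reason to keep such a conversion optional.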
Version 0.9.16
- Fixed summary output for ablation experiments. Previously summary files would not include all results.
- Added ablation unit tests.
- Fixed issue with generating PDF documentation.
Version 0.9.15
- Added two new required fields to the configuration file format under the General heading: experiment_name and task. See the run_experiment documentation <http://skll.readthedocs.org/en/latest/run_experiment.html#creating-configuration-files> for details.
- Fixed an issue where the "loading..." message was never being printed when loading data files.
- Fixed a bug where keyword arguments were being ignored for metrics when calculating final scores for a tuned model. This means that previously reported results may be wrong for tuning metrics that use keyword arguments: f1_score_micro, f1_score_macro, linear_weighted_kappa, and quadratic_weighted_kappa.
- Now try to convert IDs to floats if they look like them to save memory for very large files.
- kappa now supports negative ratings.
- Fixed a crash when specifying grid_search_jobs and pre-specified folds.
Version 0.9.14
- Hotfix to fix issue where grid_search_jobs setting was being overridden by grid_search_folds.
Version 0.9.13
- Added skll.data.write_feature_file (also available as skll.write_feature_file) to simplify outputting .jsonlines, .megam, and .tsv files.
- Added more unit tests for handling .megam and .tsv files.
- Fixed a bug that caused a crash when using gridmap.
- grid_search_jobs now sets both n_jobs and pre_dispatch for GridSearchCV under the hood. This prevents a potential memory issue when dealing with large datasets and learners that cannot handle sparse data.
- Changed logging format when using run_experiment to be a little more readable.
Version 0.9.12
- Fixed serious issue where merging feature sets was not working correctly. All experiments conducted using feature set merging (i.e., where you specified a list of feature files and had them merged into one set for training/testing) should be considered invalid. In general, your results should previously have been poor and now should be much better.
- Added more verbose regression output including descriptive statistics and Pearson correlation.