Tribuo v4.0.2 Release Notes

Release Date: 2020-11-05
  • This is the first Tribuo point release after the initial public announcement. It fixes many of the issues our early users have found, and improves the documentation in the areas flagged by those users. We also added a couple of small new methods as part of fixing the bugs, and added two new tutorials: one on columnar data loading and one on external model loading (i.e. XGBoost and ONNX models).

    Bugs fixed:

    • Fixed a locale issue in the evaluation tests.
    • Fixed issues with RowProcessor (expand regexes not being called, improper provenance capture).
    • IDXDataSource now throws FileNotFoundException rather than a mysterious NullPointerException when it can't find the file.
    • Fixed issues in JsonDataSource (consistent exceptions thrown, proper termination of reading in several cases).
    • Fixed an issue where regression models couldn't be serialized due to a non-serializable lambda.
    • Fixed UTF-8 BOM issues in CSV loading.
    • Fixed an issue where LibSVMTrainer didn't track state between repeated calls to train.
    • Fixed issues in the evaluators to ensure consistent exception throwing when discovering unlabelled or unknown ground truth outputs.
    • Fixed a bug in ONNX LabelTransformer where it wouldn't read PyTorch outputs properly.
    • Bumped to OLCUT 5.1.5 to fix a provenance -> configuration conversion issue.
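
The non-serializable lambda fix in the list above touches a general Java pitfall: a lambda stored in a field of a Serializable class is only serializable if its target type is. The following is a minimal sketch of that pitfall using only the JDK. It is not Tribuo's code, and the names SerializableLambdaSketch, Holder, plainHolder and serializableHolder are hypothetical, chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.DoubleUnaryOperator;

public class SerializableLambdaSketch {

    // Hypothetical stand-in for a model that keeps a function in a field.
    static final class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        final DoubleUnaryOperator op;

        Holder(DoubleUnaryOperator op) {
            this.op = op;
        }
    }

    // Returns true if Java serialization accepts the whole object graph.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            return false;
        }
    }

    // A plain lambda: DoubleUnaryOperator does not extend Serializable.
    static Holder plainHolder() {
        return new Holder(x -> x * 2.0);
    }

    // The intersection cast makes the compiler emit a serializable lambda.
    static Holder serializableHolder() {
        return new Holder((DoubleUnaryOperator & Serializable) x -> x * 2.0);
    }

    public static void main(String[] args) {
        System.out.println("plain lambda: " + canSerialize(plainHolder()));
        System.out.println("intersection cast: " + canSerialize(serializableHolder()));
    }
}
```

Under these assumptions the first holder is rejected and the second is accepted. Replacing the lambda with a named Serializable class is another common way to get the same effect.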

    New additions:

    • Added a method which converts a Jackson ObjectNode into a Map suitable for the RowProcessor.
    • Added missing serialization tests to all the models.
    • Added a getInnerModels method to LibSVMModel, LibLinearModel and XGBoostModel to allow users to access a copy of the internal models.
    • More documentation.
    • Columnar data loading tutorial.
    • External model (XGBoost & ONNX) tutorial.

    Dependency updates:

    • OLCUT 5.1.5 (brings in jline 3.16.0 and jackson 2.11.3).

Previous changes from v4.0.1

    • Fixes an issue where the CSVReader wouldn't read files with extraneous newlines at the end.
    • Adds an IDXDataSource so we can read IDX (i.e. MNIST) formatted datasets.
    • Updated the configuration tutorial to read MNIST from IDX files rather than libsvm files.
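
For readers unfamiliar with the IDX format mentioned above, the sketch below parses an IDX header with plain JDK classes. It is not the IDXDataSource implementation; the class and method names (IdxHeaderSketch, readShape, shapeOf, sampleHeader) are hypothetical. It assumes the standard IDX layout: two zero bytes, a one-byte type code, a one-byte dimension count, then one big-endian 32-bit size per dimension.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class IdxHeaderSketch {

    // Reads the IDX header and returns the size of each dimension.
    static int[] readShape(InputStream in) throws IOException {
        // DataInputStream reads big-endian values, matching the IDX layout.
        DataInputStream data = new DataInputStream(in);
        int zero = data.readUnsignedShort();     // first two bytes must be 0
        int typeCode = data.readUnsignedByte();  // e.g. 0x08 = unsigned byte
        int numDims = data.readUnsignedByte();   // number of dimensions
        if (zero != 0) {
            throw new IOException("Not an IDX stream: bad magic number");
        }
        if (typeCode < 0x08 || typeCode > 0x0E) {
            throw new IOException("Unknown IDX type code: " + typeCode);
        }
        int[] shape = new int[numDims];
        for (int i = 0; i < numDims; i++) {
            shape[i] = data.readInt();
        }
        return shape;
    }

    // Convenience wrapper for in-memory data; rethrows IO failures unchecked.
    static int[] shapeOf(byte[] bytes) {
        try {
            return readShape(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A hand-built header: unsigned-byte data with three dimensions 2 x 3 x 4.
    static byte[] sampleHeader() {
        return new byte[] {
            0, 0, 0x08, 0x03,
            0, 0, 0, 2,
            0, 0, 0, 3,
            0, 0, 0, 4
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(shapeOf(sampleHeader()))); // prints [2, 3, 4]
    }
}
```

The element data follows the header in row-major order, so the product of the dimension sizes gives the number of values left to read.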