Changelog#

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Unreleased#

v0.1.6 - 2022-07-31#

Added#

Changed#

  • Split Prism into reference-based Prism and reference-free PrismSrc. They now support multi-reference and multi-source via averaging over the references/sources.

  • Relaxed the dependency on pytest so it does not require a specific version
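
The multi-reference behavior noted above amounts to scoring the candidate against each reference and averaging the results. A minimal sketch of that idea follows. The `score_pair` callable is a hypothetical stand-in for the actual Prism model call, and `multi_reference_score` and `toy_overlap` are illustrative names, not part of the library's API.

```python
from statistics import mean
from typing import Callable, Sequence


def multi_reference_score(
    candidate: str,
    references: Sequence[str],
    score_pair: Callable[[str, str], float],
) -> float:
    """Average a pairwise score over all references (hypothetical helper)."""
    if not references:
        raise ValueError("at least one reference is required")
    return mean(score_pair(candidate, ref) for ref in references)


def toy_overlap(candidate: str, reference: str) -> float:
    """Toy stand-in scorer: fraction of candidate tokens found in the reference."""
    cand_tokens = candidate.split()
    ref_tokens = set(reference.split())
    if not cand_tokens:
        return 0.0
    return sum(tok in ref_tokens for tok in cand_tokens) / len(cand_tokens)
```

The same averaging would apply to the reference-free case by passing sources in place of references.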

v0.1.5 - 2022-03-21#

Added#

Changed#

  • Changed the backend implementation of MoverScore to use a non-IDF dict based version.

  • Changed the default BLEURT version to use "BLEURT-20" instead of "bleurt-base-128" and to use length-batched optimization.

v0.1.4 - 2022-01-29#

Changed#

  • Relaxed the datasets version requirement to match the GEM Metrics library

  • Moved some dependencies into dev-requirements.txt

Fixed#

  • Removed warnings that may happen if the Docker clients are not closed.

v0.1.3 - 2022-01-22#

Added#

Fixed#

  • Fixed an error in Lite3Pyramid by updating to a newer version of the code.

v0.1.2 - 2021-10-07#

Changed#

  • Changed backend of Lite3Pyramid to use our own fork of the official repo with some modifications.

v0.1.1 - 2021-10-05#

Added#

Changed#

  • Fixed silly variable name typo: DOCKERHUB_REPRO to DOCKERHUB_REPO

v0.1.0 - 2021-08-10#

Added#

  • Added DAE

  • Added FactCC and FactCCX

  • Added utilities to remove empty inputs and insert values at specific indices

  • Added automatic building and publishing of model images

  • Added a command to pull default Docker images for each model

  • Added SummaQA

  • Added NUBIA

  • Added Prism

Changed#

  • BERTScore now returns 0 for its metrics if the input is empty.

  • BLEURT now returns the mean and max scores over the references.

  • Changed Lewis et al. (2020) to download CNN/DM and XSum models by default

  • Changed Liu et al. (2019) to download all models by default
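
As an illustration of the BLEURT change listed above, reporting the mean and max over references can be sketched as below. This assumes the per-reference scores have already been computed. The function name and the `"mean"`/`"max"` keys are hypothetical and may not match the library's actual output format.

```python
from typing import Dict, Sequence


def aggregate_over_references(scores: Sequence[float]) -> Dict[str, float]:
    """Summarize per-reference scores as their mean and max (hypothetical helper)."""
    if not scores:
        raise ValueError("at least one per-reference score is required")
    return {
        "mean": sum(scores) / len(scores),
        "max": max(scores),
    }
```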

v0.0.3 - 2021-08-04#

Added#

Changed#

  • Changed the QAEval interface to match other text generation metrics. The backend was also changed to not rely on SacreROUGE.

v0.0.2 - 2021-07-30#

Added#

Changed#

  • Renamed the --model-args, --dataset-reader-args, and --output-write-args predict arguments to --model-kwargs, --dataset-reader-kwargs, and --output-write-kwargs.

  • Renamed the --output-file argument in predict to --output to allow for output files or directories.

v0.0.1 - 2021-07-22#

Added#