Kane et al. (2020)#
Publication#
NUBIA: NeUral Based Interchangeability Assessor for Text Generation
Repositories#
https://github.com/wl-research/nubia
Available Models#
Nubia
Description: A learned text generation evaluation metric
Name:
kane2020-nubia
Usage:
from repro.models.kane2020 import NUBIA

model = NUBIA()
inputs = [
    {"candidate": "The candidate text", "references": ["The reference text"]}
]
macro, micro = model.predict_batch(inputs)
macro is the Nubia score averaged over the inputs, and micro is the Nubia score per input.
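The relationship between the two return values can be illustrated with a minimal sketch. It assumes each per-input result is a dict mapping a metric name to a float; that shape, the `"nubia"` key, and the helper name `macro_from_micro` are illustrative assumptions, not something this page confirms about the library's output.

```python
from typing import Dict, List


def macro_from_micro(micro: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each metric over the per-input results.

    Assumes every per-input result is a dict with the same metric names.
    """
    if not micro:
        raise ValueError("need at least one per-input result")
    # Take the metric names from the first result and average each one.
    return {
        name: sum(result[name] for result in micro) / len(micro)
        for name in micro[0]
    }


# Hypothetical per-input scores, not real NUBIA output.
example_micro = [{"nubia": 0.25}, {"nubia": 0.75}]
example_macro = macro_from_micro(example_micro)
```

Under these assumptions, `example_macro` holds the mean of the two hypothetical per-input scores.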
Implementation Notes#
The implementation does not support using a GPU
The metric only supports a single reference, so the length of references must be 1.
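Because of the single-reference restriction, it may help to check inputs before calling the model. The following is a minimal sketch of such a check; `check_single_reference` is a hypothetical helper and is not part of the repro package.

```python
from typing import Dict, List, Union

# Input shape taken from the usage snippet: a candidate string plus a list of references.
NubiaInput = Dict[str, Union[str, List[str]]]


def check_single_reference(inputs: List[NubiaInput]) -> None:
    """Raise ValueError unless every input has exactly one reference."""
    for index, item in enumerate(inputs):
        references = item["references"]
        if not isinstance(references, list) or len(references) != 1:
            raise ValueError(
                f"input {index}: 'references' must be a list of length 1"
            )


inputs = [
    {"candidate": "The candidate text", "references": ["The reference text"]}
]
check_single_reference(inputs)  # Valid input, so no exception is raised.
```

Failing early with a clear message keeps a malformed input from reaching the model inside the Docker container, where the resulting error may be harder to trace.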
Docker Information#
Image name:
danieldeutsch/kane2020:1.0
Build command:
repro setup kane2020 [--silent]
Requires network: No
Testing#
repro setup kane2020
pytest models/kane2020/tests
Status#
[x] Regression unit tests pass
[x] Correctness unit tests pass
See here. We replicated the features shown in an example from the original repository. However, there are additional features now and the overall score has changed.
[ ] Model runs on full test dataset
Not tested
[ ] Predictions approximately replicate results reported in the paper
Not tested
[ ] Predictions exactly replicate results reported in the paper
Not tested