# Yuan et al. (2021)
## Publication
[BARTScore: Evaluating Generated Text as Text Generation](https://arxiv.org/abs/2106.11520)
## Repositories
https://github.com/neulab/BARTScore
## Available Models
### BARTScore
- Description: A text generation evaluation metric based on BART
- Name: `yuan2021-bartscore`
- Usage:
```python
from repro.models.yuan2021 import BARTScore

model = BARTScore(model="cnn")
inputs = [
    {"candidate": "The candidate text", "references": ["The references"]}
]
macro, micro = model.predict_batch(inputs)
```
`macro` and `micro` are the average and per-input BARTScores, respectively. There are three supported models:

- `"default"` uses the `facebook/bart-large` checkpoint.
- `"cnn"` uses the `facebook/bart-large-cnn` checkpoint.
- `"parabank"` uses the `facebook/bart-large-cnn` checkpoint and loads the weights trained on Parabank.
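To make the macro/micro distinction concrete, here is a minimal, self-contained sketch (not repro's implementation). BARTScore scores a candidate by the average token log-likelihood of generating the reference text with BART; the stub below replaces the BART model with a hypothetical unigram log-probability table, and the max-over-references aggregation and return format are illustrative assumptions.

```python
# Minimal, runnable sketch of the macro/micro aggregation described above.
# NOTE: toy_log_prob is a hypothetical stand-in for BART's conditional token
# log-probabilities; the real metric scores each target token with the
# facebook/bart-large(-cnn) checkpoints.
def toy_log_prob(token: str) -> float:
    table = {"the": -1.0, "references": -2.0, "candidate": -2.5, "text": -1.5}
    return table.get(token.lower(), -5.0)

def bartscore_like(candidate: str, reference: str) -> float:
    # Average log-likelihood of the reference tokens given the candidate.
    tokens = reference.split()
    return sum(toy_log_prob(t) for t in tokens) / len(tokens)

def predict_batch(inputs):
    # micro: one score per input (here, the best score over its references,
    # an illustrative simplification); macro: the mean of the micro scores.
    micro = []
    for inp in inputs:
        scores = [bartscore_like(inp["candidate"], ref) for ref in inp["references"]]
        micro.append(max(scores))
    macro = sum(micro) / len(micro)
    return macro, micro

inputs = [
    {"candidate": "The candidate text", "references": ["The references"]},
    {"candidate": "Another candidate", "references": ["Another reference text"]},
]
macro, micro = predict_batch(inputs)
print(macro, micro)
```

The toy scores are negative log-likelihoods averaged per token, so less-negative values indicate a better match, mirroring how the real metric is interpreted.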
## Implementation Notes
## Docker Information
- Image name: `danieldeutsch/yuan2021:1.0`
- Docker Hub:
- Build command:
  ```shell
  repro setup yuan2021 [--silent]
  ```
- Requires network: Yes. A network request is still sent even though the models are pre-cached.
## Testing
```shell
repro setup yuan2021
pytest models/yuan2021/tests
```
## Status
- [x] Regression unit tests pass
- [x] Correctness unit tests pass
    - We verify the outputs against those reported in their GitHub README.
- [ ] Model runs on full test dataset
    - Not tested
- [ ] Predictions approximately replicate results reported in the paper
    - Not tested
- [ ] Predictions exactly replicate results reported in the paper
    - Not tested