Kryściński et al. (2019)#
Publication#
Evaluating the Factual Consistency of Abstractive Text Summarization
Repositories#
https://github.com/salesforce/factCC
Available Models#
This implementation wraps the FactCC and FactCCX models. Both models return a score and a label. The task is binary classification: label 1 means “incorrect”, and the score is the probability of the returned label.
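Because the score is tied to whichever label was returned, comparing scores across inputs with different labels can be confusing. The following is a minimal sketch of one way to put every prediction on a single scale, the probability of "incorrect". It assumes the other label is 0 and that the two probabilities sum to 1, which follows from binary classification. The helper name `prob_incorrect` is hypothetical and not part of the library.

```python
def prob_incorrect(score: float, label: int) -> float:
    """Convert a (score, label) pair into the probability of label 1.

    Assumes `score` is the probability of the returned `label` and that
    the only labels are 0 and 1, with 1 meaning "incorrect".
    """
    if label not in (0, 1):
        raise ValueError(f"expected label 0 or 1, got {label!r}")
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"expected a probability in [0, 1], got {score!r}")
    # If the model returned label 1, the score already is P(incorrect);
    # otherwise it is P(label 0), so take the complement.
    return score if label == 1 else 1.0 - score
```

For example, a prediction with label 0 and score 0.9 corresponds to a probability of roughly 0.1 that the candidate is "incorrect".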
FactCC:
Description: A model to score the factual consistency of text
Name: kryscinski2019-factcc
Usage:
from repro.models.kryscinski2019 import FactCC

model = FactCC()
inputs = [
    {"candidate": "The candidate text", "sources": ["The source text"]}
]
macro, micro = model.predict_batch(inputs)
macro contains the scores averaged over the inputs, whereas micro contains the scores for each input.
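The relationship between the averaged and per-input results can be illustrated with a small standalone sketch. The exact structure returned by predict_batch is not specified here, so the per-input dictionaries and their keys ("score", "label") are assumptions made for illustration, and `average_scores` is a hypothetical helper rather than library code.

```python
from statistics import mean


def average_scores(per_input):
    """Average each numeric field across a list of per-input results.

    `per_input` is assumed to be a non-empty list of dicts that all share
    the same keys, e.g. {"score": 0.8, "label": 1}. This shape is an
    assumption, not the documented output of predict_batch.
    """
    if not per_input:
        raise ValueError("need at least one per-input result to average")
    keys = per_input[0].keys()
    # One mean per field, taken over all inputs.
    return {key: mean(item[key] for item in per_input) for key in keys}


# Hypothetical per-input results for two inputs.
example_micro = [
    {"score": 0.8, "label": 1},
    {"score": 0.6, "label": 0},
]
example_macro = average_scores(example_micro)
```

Under these assumptions, the averaged "label" field is the fraction of inputs that received label 1.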
FactCCX:
Description: A model to score the factual consistency of text
Name: kryscinski2019-factccx
Usage:
from repro.models.kryscinski2019 import FactCCX

model = FactCCX()
inputs = [
    {"candidate": "The candidate text", "sources": ["The source text"]}
]
macro, micro = model.predict_batch(inputs)
macro contains the scores averaged over the inputs, whereas micro contains the scores for each input.
Implementation Notes#
We modified the prediction script because the original saved only the overall labels, not the model's scores. The modified script can be found here.
Docker Information#
Image name: kryscinski2019
Build command: repro setup kryscinski2019 [--silent]
Requires network: No
Testing#
repro setup kryscinski2019
pytest models/kryscinski2019/tests
Status#
[x] Regression unit tests pass
See here
[ ] Correctness unit tests pass
No examples provided in the original repo
[x] Model runs on full test dataset
See our reproducibility experiment here
[x] Predictions approximately replicate results reported in the paper
See our reproducibility experiment here
[ ] Predictions exactly replicate results reported in the paper
Not tested