Chen et al. (2020)
Publication
MOCHA: A Dataset for Training and Evaluating Generative Reading Comprehension Metrics
Repositories
https://github.com/anthonywchen/MOCHA
Available Models
This implementation provides a wrapper around the LERC model trained on all of the constituent MOCHA datasets, together with the authors' evaluation script.
LERC
Description: The LERC model trained on all datasets
Name: chen2020-lerc
Usage:
from repro.models.chen2020 import LERC
model = LERC()
score = model.predict("context", "question", "reference", "candidate")
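When several candidate answers share one context, question, and reference, the `predict` call can simply be looped. The following is a minimal sketch and not part of the repro API: `score_candidates` and `StubScorer` are hypothetical names introduced for illustration, and the stub stands in for `LERC()` so the snippet runs without the Docker image. It assumes only the `predict(context, question, reference, candidate)` call shape shown in the usage; the return type of the real model is not documented here, so the stub's float scores are an assumption.

```python
from typing import List


def score_candidates(model, context: str, question: str, reference: str,
                     candidates: List[str]) -> list:
    """Score each candidate against the same reference (hypothetical helper).

    `model` is any object with the predict(context, question, reference,
    candidate) call shape used in the usage example.
    """
    return [model.predict(context, question, reference, c) for c in candidates]


class StubScorer:
    """Hypothetical stand-in for LERC(); its scores are arbitrary, not LERC output."""

    def predict(self, context: str, question: str, reference: str,
                candidate: str) -> float:
        # Assumption for illustration only: exact (case-insensitive) match
        # scores high, anything else scores low.
        same = candidate.strip().lower() == reference.strip().lower()
        return 5.0 if same else 1.0


if __name__ == "__main__":
    scores = score_candidates(
        StubScorer(), "context", "question", "reference",
        ["reference", "something else"],
    )
    print(scores)
```

With the Docker image built, passing `LERC()` in place of `StubScorer()` should work the same way, provided the real `predict` accepts these four positional arguments as the usage suggests.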
MOCHA Evaluation
Description: The MOCHA evaluation script that calculates the Pearson correlation between the ground-truth and predicted scores.
Name: chen2020-eval
Usage:
from repro.models.chen2020 import MOCHAEvaluationMetric
model = MOCHAEvaluationMetric()
# `inputs` should have the dataset, source, ground-truth score,
# and predictions
inputs = [
    {"dataset": dataset, "source": source, "score": score, "prediction": prediction},
    ...
]
metrics = model.predict_batch(inputs)
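To make the quantity being reported concrete, the sketch below computes a Pearson correlation between ground-truth scores and predictions over records shaped like `inputs`. It is an illustration only, not the MOCHA evaluation script: the function names `pearson` and `pearson_by_dataset` are hypothetical, and grouping by the `"dataset"` key alone is an assumption, since how the real script uses `"dataset"` and `"source"` and how it names its output metrics are not described here.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for constant input
    return cov / math.sqrt(var_x * var_y)


def pearson_by_dataset(inputs):
    """Correlate "score" with "prediction" separately for each "dataset".

    Assumption: records are grouped by dataset only; the real script may
    group or aggregate differently.
    """
    grouped = defaultdict(lambda: ([], []))
    for item in inputs:
        scores, predictions = grouped[item["dataset"]]
        scores.append(item["score"])
        predictions.append(item["prediction"])
    return {name: pearson(s, p) for name, (s, p) in grouped.items()}
```

A correlation near 1 for a dataset means the predicted scores rank and scale with the human judgments; a value near 0 means they carry little linear signal.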
Implementation Notes
Docker Information
Image name: chen2020
Build command: repro setup chen2020 [--silent]
Requires network: No
Testing
To run the unit tests for this model:
repro setup chen2020
pytest models/chen2020/tests
Status
[x] Regression unit tests pass
See here
[ ] Correctness unit tests pass
No expected outputs provided in the original repo
[x] Model runs on full test dataset
See here
[x] Predictions approximately replicate results reported in the paper
Yes, see here
[ ] Predictions exactly replicate results reported in the paper