# Models

The following papers have models implemented in Repro.
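
Every model runs inside its own Docker image, so the only host requirements are Docker and the `repro` Python package. The sketch below illustrates the intended workflow following the pattern in the Repro README; the module and class names (here `repro.models.liu2019.BertSumExtAbs`, for the Liu and Lapata BertSum model listed under Summarization) should be verified against each model's documentation page.

```python
# Minimal sketch of running a Repro model. Assumes Docker is installed and
# the `repro` package is available. Repro names its modules by first author
# and year (`repro.models.<author><year>`); check the model's docs page for
# the exact class name.
from repro.models.liu2019 import BertSumExtAbs

# Instantiating the model starts its Docker container as needed, so the
# first call may take a while to pull the image.
model = BertSumExtAbs()

document = (
    "Serena Williams beat Venus Williams in straight sets on Monday "
    "to reach the quarterfinals."
)

# Single-input prediction; models also expose `predict_batch` for lists
# of inputs.
summary = model.predict(document)
print(summary)
```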

## Summarization

| Paper | Authors | Docs |
| --- | --- | --- |
| BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov, and Luke Zettlemoyer | Link |
| Text Summarization with Pretrained Encoders | Yang Liu and Mirella Lapata | Link |
| GSum: A General Framework for Guided Neural Abstractive Summarization | Zi-Yi Dou, Pengfei Liu, Hiroaki Hayashi, Zhengbao Jiang, and Graham Neubig | Link |

## Text Generation Evaluation

| Paper | Authors | Docs |
| --- | --- | --- |
| ROUGE: A Package for Automatic Evaluation of Summaries | Chin-Yew Lin | Link |
| Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary | Daniel Deutsch, Tania Bedrax-Weiss, and Dan Roth | Link |
| BLEURT: Learning Robust Metrics for Text Generation | Thibault Sellam, Dipanjan Das, and Ankur P. Parikh | Link |
| BERTScore: Evaluating Text Generation with BERT | Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, and Yoav Artzi | Link |
| BLEU: A Method for Automatic Evaluation of Machine Translation | Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu | Link |
| QuestEval: Summarization Asks for Fact-based Evaluation | Thomas Scialom, Paul-Alexis Dray, Patrick Gallinari, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano, and Alex Wang | Link |
| MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance | Wei Zhao, Maxime Peyrard, Fei Liu, Yang Gao, Christian M. Meyer, and Steffen Eger | Link |
| FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization | Esin Durmus, He He, and Mona Diab | Link |
| Evaluating Factuality in Generation with Dependency-level Entailment | Tanya Goyal and Greg Durrett | Link |
| Evaluating the Factual Consistency of Abstractive Text Summarization | Wojciech Kryściński, Bryan McCann, Caiming Xiong, and Richard Socher | Link |
| Answers Unite! Unsupervised Metrics for Reinforced Summarization Models | Thomas Scialom, Sylvain Lamprier, Benjamin Piwowarski, and Jacopo Staiano | Link |
| NUBIA: NeUral Based Interchangeability Assessor for Text Generation | Hassan Kane, Muhammed Yusuf Kocyigit, Ali Abdalla, Pelkins Ajanoh, and Mohamed Coulibali | Link |
| Automatic Machine Translation Evaluation in Many Languages via Zero-Shot Paraphrasing | Brian Thompson and Matt Post | Link |
| Finding a Balanced Degree of Automation for Summary Evaluation | Shiyue Zhang and Mohit Bansal | Link |
| BARTScore: Evaluating Generated Text as Text Generation | Weizhe Yuan, Graham Neubig, and Pengfei Liu | Link |
| CLIPScore: A Reference-free Evaluation Metric for Image Captioning | Jack Hessel, Ari Holtzman, Maxwell Forbes, Ronan Le Bras, and Yejin Choi | Link |
| SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization | Yang Gao, Wei Zhao, and Steffen Eger | Link |
| Fill in the BLANC: Human-free quality estimation of document summaries | Oleg Vasilyev, Vedant Dharnidharka, and John Bohannon | Link |
| Meteor Universal: Language Specific Translation Evaluation for Any Target Language | Michael Denkowski and Alon Lavie | Link |
| COMET: A Neural Framework for MT Evaluation | Ricardo Rei, Craig Stewart, Ana C Farinha, and Alon Lavie | Link |
| Automatic Text Evaluation through the Lens of Wasserstein Barycenters | Pierre Colombo, Guillaume Staerman, Chloe Clavel, and Pablo Piantanida | Link |
| InfoLM: A New Metric to Evaluate Summarization & Data2Text Generation | Pierre Colombo, Chloe Clavel, and Pablo Piantanida | Link |
| A Pseudo-Metric between Probability Distributions based on Depth-Trimmed Regions | Guillaume Staerman, Pavlo Mozharovskyi, Pierre Colombo, Stéphan Clémençon, and Florence d’Alché-Buc | Link |
| Just Ask! Evaluating Machine Translation by Asking and Answering Questions | Mateusz Krubinski, Erfan Ghadery, Marie-Francine Moens, and Pavel Pecina | Link |
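
Evaluation metrics use the same Docker-backed interface as the models above. The following is an illustrative sketch only: the module path `repro.models.lin2004`, the input dictionary keys, and the `(macro, micro)` return convention are assumptions to check against the ROUGE documentation page.

```python
# Illustrative sketch of scoring with a Repro metric. The module path
# `repro.models.lin2004`, the `candidate`/`references` input keys, and the
# (macro, micro) return values are assumptions; verify them in the docs.
from repro.models.lin2004 import ROUGE

metric = ROUGE()
candidate = "Serena Williams beat Venus Williams to reach the quarterfinals."
references = ["Serena Williams defeated Venus Williams in straight sets."]

# Batched scoring: one dict per (candidate, references) pair.
macro, micro = metric.predict_batch(
    [{"candidate": candidate, "references": references}]
)
print(macro)  # corpus-level scores
print(micro)  # per-input scores
```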

## Question Answering

| Paper | Authors | Docs |
| --- | --- | --- |
| Neural Module Networks for Reasoning over Text | Nitish Gupta, Kevin Lin, Dan Roth, Sameer Singh, and Matt Gardner | Link |
| Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary | Daniel Deutsch, Tania Bedrax-Weiss, and Dan Roth | Link |

## Question Generation

| Paper | Authors | Docs |
| --- | --- | --- |
| Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary | Daniel Deutsch, Tania Bedrax-Weiss, and Dan Roth | Link |
| Asking It All: Generating Contextualized Questions for any Semantic Role | Valentina Pyatkin, Paul Roit, Julian Michael, Reut Tsarfaty, Yoav Goldberg, and Ido Dagan | Link |

## Parsing

| Paper | Authors | Docs |
| --- | --- | --- |
| Multilingual Constituency Parsing with Self-Attention and Pre-Training | Nikita Kitaev, Steven Cao, and Dan Klein | Link |

## Others

| Paper | Authors | Docs |
| --- | --- | --- |
| RoFT: A Tool for Evaluating Human Detection of Machine-Generated Text | Liam Dugan, Daphne Ippolito, Arun Kirubarajan, and Chris Callison-Burch | Link |
| Learning to Capitalize with Character-Level Recurrent Neural Networks: An Empirical Study | Raymond Hendy Susanto, Hai Leong Chieu, and Wei Lu | Link |
| MOCHA: A Dataset for Training and Evaluating Generative Reading Comprehension Metrics | Anthony Chen, Gabriel Stanovsky, Sameer Singh, and Matt Gardner | Link |
| Large-Scale QA-SRL Parsing | Nicholas FitzGerald, Julian Michael, Luheng He, and Luke Zettlemoyer | Link |