repro.data.dataset_readers.datasets

class repro.data.dataset_readers.datasets.HuggingfaceDatasetsDatasetReader(dataset_name, split)

repro.data.dataset_readers.datasets.hf_dataset_exists_locally(name, version=None)

Checks whether a Hugging Face datasets dataset exists in the local cache. The check only tests whether the directory where the data should be stored exists; it does not verify the directory's contents.

Parameters
  • name (str) – The name of the dataset, like “cnn_dailymail”

  • version (str, default=None) – The version of the dataset, like “3.0.0”. If None, the default version is used, if one exists.

Returns

True if the dataset exists, False otherwise.

Return type

bool
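
Example

The snippet below is a minimal sketch of a directory-existence check of the kind described above. It is not the library's implementation: the helper name dataset_dir_exists, the cache_root parameter, and the <cache_root>/<name>[/<version>] layout are all assumptions made for illustration, and the real cache layout may differ.

```python
from pathlib import Path
from typing import Optional


def dataset_dir_exists(
    cache_root: Path, name: str, version: Optional[str] = None
) -> bool:
    """Hypothetical stand-in: True if the expected cache directory exists."""
    # Assumed layout, for illustration only: <cache_root>/<name>[/<version>]
    target = Path(cache_root) / name
    if version is not None:
        target = target / version
    # Only the presence of the directory is tested; its contents are
    # never inspected, mirroring the "no further verification" caveat.
    return target.is_dir()
```

Because only the directory is tested, an empty or partially downloaded directory would still yield True under this sketch. Callers that need stronger guarantees would have to verify the data separately. The documented function itself takes only the dataset name and an optional version, e.g. hf_dataset_exists_locally("cnn_dailymail", version="3.0.0").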