repro.models.model#
- class repro.models.model.Model#
- predict(*args, **kwargs)#
Runs inference for a single input instance.
- predict_batch(inputs, *args, **kwargs)#
Runs inference for all of the instances in inputs. Each item in inputs should be a dictionary whose keys correspond to the parameters of the predict method. For instance, if the signature of predict were:

    def predict(input1, input2)

then each item in inputs should be a dictionary with keys “input1” and “input2” (see the sketch below).
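To make this concrete, here is a minimal sketch of a model with that signature and how its predict_batch would be called. AdditionModel is hypothetical and not part of repro; it is shown only to illustrate how the keys of each item in inputs map onto the parameters of predict:

    from typing import Dict, List

    from repro.models import Model

    # Hypothetical model, not part of repro; shown for illustration only.
    class AdditionModel(Model):
        def predict(self, input1: int, input2: int) -> int:
            return input1 + input2

        def predict_batch(self, inputs: List[Dict[str, int]]) -> List[int]:
            # Each item's keys ("input1", "input2") match predict's parameters
            return [self.predict(**inp) for inp in inputs]

    model = AdditionModel()
    model.predict_batch([{"input1": 1, "input2": 2}, {"input1": 3, "input2": 4}])
    # -> [3, 7]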
- class repro.models.model.ParallelModel(model_cls, model_kwargs_list=None, num_models=None)#
A ParallelModel is a simple abstraction around the joblib library for running models in parallel. It allows for parallel processing on multiple CPUs as well as GPUs.

To create a ParallelModel, you must specify the type of the model that is being run in parallel as well as a list of kwargs that will be passed to the model’s constructor to instantiate the parallel models. The number of parallel processes is equal to the length of the list of kwargs.

The ParallelModel’s predict_batch() method divides the inputs into batches and passes a kwargs object and a batch to each worker. The worker then instantiates the model, processes the data, and returns the result. Because the implementation relies on joblib, the kwargs and the output from the model must be serializable by joblib.

The output from the predict_batch() method will be the list of outputs returned by each of the individual processes.

If the input ordering matters or the model does some final aggregation over all of the items in the inputs passed to predict_batch(), the ParallelModel will not compute the right result.

If you are evaluating inputs with a metric, also see repro.common.util.aggregate_parallel_metrics().

Note: Please make sure you understand how ParallelModel is implemented before you use it to ensure the behavior is expected for your use case.
- Parameters
  - model_cls (Type) – The model class which will be run in parallel
  - model_kwargs_list (List) – A list of kwargs that will be used to create models of type model_cls. The length of the list is the number of parallel processes to use. If all of the kwargs are equal to {}, you may use the num_models parameter instead. Only one of model_kwargs_list and num_models may be set.
  - num_models (int) – The number of models to run in parallel. This is equivalent to passing a list of num_models empty kwargs (i.e., {}). Only one of model_kwargs_list and num_models may be set.
Examples
First, we define a simple model to run in parallel. This model simply multiplies all of its inputs by 10.
    from typing import Dict, List

    from repro.models import Model

    class TimesTen(Model):
        def predict_batch(self, inputs: List[Dict[str, int]]):
            return [inp["value"] * 10 for inp in inputs]
Then we specify some inputs:
    inputs = [{"value": 0}, {"value": 1}, {"value": 2}, {"value": 3}]
Now, create a ParallelModel with num_models=2. This will result in running two TimesTen models in parallel.

    from repro.models import ParallelModel

    parallel_model = ParallelModel(TimesTen, num_models=2)
Then the inputs can be processed in parallel:

    output_list = parallel_model.predict_batch(inputs)
The output_list will be equal to:

    [
        [0, 10],
        [20, 30],
    ]
where each of the elements corresponds to the output from each of the two models. This example is equivalent to the following serial execution:
    model = TimesTen()
    outputs1 = model.predict_batch([inputs[0], inputs[1]])
    outputs2 = model.predict_batch([inputs[2], inputs[3]])
    output_list = [outputs1, outputs2]
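If you want the results as one flat list, the per-worker outputs can be concatenated. This is a sketch, assuming (as in the serial execution above) that the batches are processed in input order:

    # Flatten the per-worker outputs back into a single list.
    # For this example the result is [0, 10, 20, 30].
    flat_outputs = [output for batch in output_list for output in batch]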
If you need to pass specific parameters to each of the model’s constructors, you may do so using model_kwargs_list. For example, if the model required a GPU ID, you could pass that information as such:

    class GPUModel(Model):
        def __init__(self, device: int):
            self.device = device

        def predict_batch(self, inputs: List[Dict[str, int]]):
            # Do some computation
            return result

    parallel_model = ParallelModel(GPUModel, [{"device": 0}, {"device": 2}])
This will run two instances of GPUModel in parallel. One process will use device=0 and the other device=2.

- predict_batch(inputs, **kwargs)#
Runs inference for all of the instances in inputs. Each item in inputs should be a dictionary whose keys correspond to the parameters of the predict method. For instance, if the signature of predict were:

    def predict(input1, input2)

then each item in inputs should be a dictionary with keys “input1” and “input2”.
- class repro.models.model.QuestionAnsweringModel#
- predict(context, question, *args, **kwargs)#
Runs inference for a single input instance.
- predict_batch(inputs, *args, **kwargs)#
Runs inference for all of the instances in inputs. Each item in inputs should be a dictionary whose keys correspond to the parameters of the predict method. For instance, if the signature of predict were:

    def predict(input1, input2)

then each item in inputs should be a dictionary with keys “input1” and “input2”.
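For this class, predict’s signature implies that each item in inputs takes the keys “context” and “question”. A sketch with illustrative values:

    inputs = [
        {
            "context": "Paris is the capital of France.",
            "question": "What is the capital of France?",
        }
    ]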
- class repro.models.model.QuestionGenerationModel#
- predict(context, start, end, **kwargs)#
Runs inference for a single input instance.
- predict_batch(inputs, **kwargs)#
Runs inference for all of the instances in inputs. Each item in inputs should be a dictionary whose keys correspond to the parameters of the predict method. For instance, if the signature of predict were:

    def predict(input1, input2)

then each item in inputs should be a dictionary with keys “input1” and “input2”.
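Given predict’s signature, each item in inputs for this class takes the keys “context”, “start”, and “end”. Presumably start and end index the answer span within context, although the class does not document their semantics here; a sketch with illustrative values:

    inputs = [{"context": "Paris is the capital of France.", "start": 0, "end": 5}]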
- class repro.models.model.RecipeGenerationModel#
- predict(name, ingredients, *args, **kwargs)#
Runs inference for a single input instance.
- predict_batch(inputs, *args, **kwargs)#
Runs inference for all of the instances in inputs. Each item in inputs should be a dictionary whose keys correspond to the parameters of the predict method. For instance, if the signature of predict were:

    def predict(input1, input2)

then each item in inputs should be a dictionary with keys “input1” and “input2”.
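Here each item in inputs takes the keys “name” and “ingredients”, matching predict’s signature. A sketch with illustrative values (representing ingredients as a list of strings is an assumption):

    inputs = [{"name": "Pancakes", "ingredients": ["flour", "milk", "eggs"]}]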
- class repro.models.model.SingleDocumentSummarizationModel#
- predict(document, *args, **kwargs)#
Runs inference for a single input instance.
- predict_batch(inputs, *args, **kwargs)#
Runs inference for all of the instances in inputs. Each item in inputs should be a dictionary whose keys correspond to the parameters of the predict method. For instance, if the signature of predict were:

    def predict(input1, input2)

then each item in inputs should be a dictionary with keys “input1” and “input2”.
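Here each item in inputs takes the key “document”, matching predict’s signature. A sketch with an illustrative value:

    inputs = [{"document": "The quick brown fox jumped over the lazy dog."}]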
- class repro.models.model.TruecasingModel#
- predict(text, *args, **kwargs)#
Runs inference for a single input instance.
- predict_batch(inputs, *args, **kwargs)#
Runs inference for all of the instances in inputs. Each item in inputs should be a dictionary whose keys correspond to the parameters of the predict method. For instance, if the signature of predict were:

    def predict(input1, input2)

then each item in inputs should be a dictionary with keys “input1” and “input2”.
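Here each item in inputs takes the key “text”, matching predict’s signature. A sketch with an illustrative value:

    inputs = [{"text": "the eiffel tower is in paris"}]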