bluepyparallel.evaluator

Module to evaluate generic functions on the rows of a dataframe.

Functions

evaluate(df, evaluation_function[, ...])

Evaluate and save results in a sqlite database on the fly and return dataframe.

bluepyparallel.evaluator.evaluate(df, evaluation_function, new_columns=None, resume=False, parallel_factory=None, db_url=None, func_args=None, func_kwargs=None, shuffle_rows=True, progress_bar=True, **mapper_kwargs)

Evaluate and save results in a sqlite database on the fly and return dataframe.

Parameters:

df (pandas.DataFrame) – each row contains information for the computation.
evaluation_function (callable) – function used to evaluate each row. It should take a single list-like argument containing the values of one row of df and return a dict whose keys correspond to the names in new_columns.
new_columns (list) – list of [name, empty value] pairs for the new columns that store the evaluation results, e.g. [['result', 0.0], ['valid', False]].
resume (bool) – if True and db_url is provided, only the rows missing from the database are computed.
parallel_factory (ParallelFactory or str) – parallel factory name or instance.
db_url (str) – a database URL that sqlalchemy.create_engine() can interpret, or a file path, which is interpreted as a SQLite database. If a URL is given, the SQL backend is enabled to store the results, which allows the computation to be resumed later. It should not be used when evaluations are numerous and fast, because communicating with the SQL database adds overhead.
func_args (list) – the arguments to pass to the evaluation_function.
func_kwargs (dict) – the keyword arguments to pass to the evaluation_function.
shuffle_rows (bool) – if True, it will shuffle the rows before computing the results.
progress_bar (bool) – if True, a progress bar will be displayed during computation.
**mapper_kwargs – the keyword arguments are passed to the get_mapper() method of the ParallelFactory instance.

Returns:

dataframe with new columns containing the computed results.

Return type:

pandas.DataFrame
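
Example:

The relationship between evaluation_function, new_columns and the returned dataframe can be illustrated with a minimal sketch. The function body, the column name x, the scale keyword and the helper evaluate_serially are hypothetical and chosen only for illustration. The helper is a plain serial stand-in written with pandas. It is not the library's implementation, and it assumes each row can be indexed by column name.

```python
import pandas as pd


def evaluation_function(row, scale=1.0):
    """Hypothetical per-row computation; `row` is indexed by column name."""
    value = row["x"] * scale
    # The returned keys match the names declared in `new_columns`.
    return {"result": value, "valid": value > 0}


# One [name, empty value] pair per new column.
new_columns = [["result", 0.0], ["valid", False]]


def evaluate_serially(df, func, new_columns, func_kwargs=None):
    """Serial stand-in for the documented contract (illustration only)."""
    func_kwargs = func_kwargs or {}
    out = df.copy()
    # Create the new columns, filled with their empty values.
    for name, empty_value in new_columns:
        out[name] = empty_value
    # Evaluate each row and store the values returned for the new columns.
    for idx, row in df.iterrows():
        res = func(row.to_dict(), **func_kwargs)
        for name, _ in new_columns:
            out.at[idx, name] = res[name]
    return out


df = pd.DataFrame({"x": [1.0, -2.0, 3.0]})
result_df = evaluate_serially(
    df, evaluation_function, new_columns, func_kwargs={"scale": 2.0}
)
print(result_df)
```

With bluepyparallel installed, the same inputs would presumably be passed as evaluate(df, evaluation_function, new_columns, func_kwargs={"scale": 2.0}), following the signature documented above. The optional parameters parallel_factory and db_url would then select the parallel backend and enable the SQL storage.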