ibllib.pipes.tasks

Functions

run_alyx_task

Runs a single Alyx job and registers output datasets

Classes

Pipeline

Pipeline class: collection of related and potentially interdependent tasks

Task

class Task(session_path, parents=None, taskid=None, one=None, machine=None, clobber=True, location='server', **kwargs)[source]

Bases: ABC

log = ''
cpu = 1
gpu = 0
io_charge = 5
priority = 30
ram = 4
outputs = None
time_elapsed_secs = None
time_out_secs = 7200
version = '2.19.0'
signature = {'input_files': [], 'output_files': []}
force = False
job_size = 'small'
one = None
level = 0
property name
run(**kwargs)[source]

Wraps the _run() method (do not overload run(); overload _run() instead). The wrapper adds:

  • error management

  • logging to a variable

  • writing a lock file if the GPU is used

  • labeling of the status property of the object

The status value is labeled as:

  • 0: Complete

  • -1: Errored

  • -2: Didn't run as a lock was encountered

  • -3: Incomplete
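
As an illustration, a concrete task only needs to implement _run(); run() then supplies the error handling, logging, GPU locking and status labeling described above. This is a sketch only: the ExampleTask name and the file it writes are hypothetical.

    from pathlib import Path

    from ibllib.pipes.tasks import Task


    class ExampleTask(Task):
        priority = 90
        job_size = 'small'

        def _run(self):
            # hypothetical body: write a single output file in the session folder
            # and return the list of output paths so they can be registered later
            out_file = Path(self.session_path).joinpath('alf', 'example.output.npy')
            out_file.parent.mkdir(parents=True, exist_ok=True)
            out_file.touch()
            return [out_file]


    task = ExampleTask(Path('/data/subject/2023-01-01/001'))
    task.run()
    print(task.status, task.outputs)  # status codes as listed above; outputs set by the wrapper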

register_datasets(one=None, **kwargs)[source]

Register output datasets from the task to Alyx

Parameters
  • one

  • jobid

  • kwargs – directly passed to the register_dataset function

Returns

register_images(**kwargs)[source]

Registers images to the Alyx database

rerun()[source]
get_signatures(**kwargs)[source]

This is the default but should be overwritten for each task

setUp(**kwargs)[source]

Setup method to get the data handler and ensure all data is available locally before running the task

Parameters

kwargs

Returns

tearDown()[source]

Function run after run(). Does not run if a lock was encountered by the task (status -2)
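
Outside of the Alyx job runner, these lifecycle methods are typically chained by hand, for example as in the sketch below (the ExampleTask class and session_path are hypothetical, and a configured ONE instance is assumed).

    from one.api import ONE

    one = ONE()
    task = ExampleTask(session_path, one=one)  # session_path: Path to the session folder
    task.setUp()                          # get the data handler and make input data available locally
    task.run()
    if task.status == 0:
        task.register_datasets(one=one)   # register the output datasets to Alyx
    if task.status != -2:                 # tearDown does not run after a lock was encountered
        task.tearDown()
    task.cleanUp()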

cleanUp()[source]

Function to optionally overload for clean-up after the run

assert_expected_outputs(raise_error=True)[source]

After a run, asserts that all signature files are present at least once in the output files. Mainly useful for integration tests.

assert_expected_inputs(raise_error=True)[source]

Before running a task, checks that all the files necessary to run the task have been downloaded or are already on the local file system.

assert_expected(expected_files, silent=False)[source]
get_data_handler(location=None)[source]

Gets the relevant data handler based on the location argument

static make_lock_file(taskname='', time_out_secs=7200)[source]

Creates a GPU lock file with a timeout of time_out_secs seconds

is_locked()[source]

Checks if there is a lock file for this task
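
The lock helpers can be used to avoid two GPU tasks running concurrently; a minimal sketch, building on the hypothetical ExampleTask above:

    task = ExampleTask(session_path, one=one)
    if not task.is_locked():
        task.run()  # run() itself writes a lock file when the task uses the GPU
    else:
        print('GPU lock encountered, task not started')  # run() would label status -2 here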

class Pipeline(session_path=None, one=None, eid=None)[source]

Bases: ABC

Pipeline class: collection of related and potentially interdependent tasks

tasks = {}
one = None
make_graph(out_dir=None, show=True)[source]
create_alyx_tasks(rerun__status__in=None, tasks_list=None)[source]

Instantiate the pipeline and create the tasks in Alyx, then create the jobs for the session. If the jobs already exist, they are left untouched. The re-run parameter re-initialises the jobs by emptying the log and setting the status to Waiting.

Parameters

rerun__status__in – by default no re-run. To re-run tasks that already exist, specify a list of status strings to be re-run among the possible choices: ['Waiting', 'Started', 'Errored', 'Empty', 'Complete']. To always patch, the string '__all__' can also be provided.

Returns

list of Alyx task dictionaries (existing and/or created)
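
A sketch of registering the session jobs on Alyx, assuming a hypothetical ExamplePipeline subclass whose constructor populates self.tasks and a configured ONE instance:

    from one.api import ONE

    one = ONE()
    pipe = ExamplePipeline(session_path=session_path, one=one)
    alyx_tasks = pipe.create_alyx_tasks()                                # existing jobs are left untouched
    alyx_tasks = pipe.create_alyx_tasks(rerun__status__in=['Errored'])   # re-init jobs that previously errored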

create_tasks_list_from_pipeline()[source]

From a pipeline with tasks, creates a list of dictionaries containing the task descriptions that can be used to create the Alyx tasks.

run(status__in=['Waiting'], machine=None, clobber=True, **kwargs)[source]

Get all the session-related jobs from Alyx and run them

Parameters

status__in – list of status strings to run, among ['Waiting', 'Started', 'Errored', 'Empty', 'Complete']

machine – string identifying the machine the task is run on, optional

clobber – bool, if True any existing logs are overwritten, default is True

kwargs – arguments passed downstream to run_alyx_task

Returns

jalyx: list of REST dictionaries of the job endpoints
job_deck: list of REST dictionaries of the jobs endpoints
all_datasets: list of REST dictionaries of the dataset endpoints
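
Running the session's jobs directly from the pipeline object might look like the sketch below (the status list and machine name are illustrative):

    # run all jobs currently in 'Waiting' or 'Errored' state for this session
    results = pipe.run(status__in=['Waiting', 'Errored'], machine='my-lab-server', clobber=True)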

rerun_failed(**kwargs)[source]
rerun(**kwargs)[source]
property name
run_alyx_task(tdict=None, session_path=None, one=None, job_deck=None, max_md5_size=None, machine=None, clobber=True, location='server', mode='log')[source]

Runs a single Alyx job and registers output datasets

Parameters
  • tdict

  • session_path

  • one

  • job_deck – optional list of job dictionaries belonging to the session. Needed to check the dependency status if tdict has a parent field. If tdict has a parent and job_deck is not provided, the database will be queried.

  • max_md5_size – in bytes; if specified, the md5 checksum will not be computed above the given file size, to save time

  • machine – string identifying the machine the task is run on, optional

  • clobber – bool, if True any existing logs are overwritten, default is True

  • location – where the task is running: 'server' - local lab server, 'remote' - any compute node/computer, 'SDSC' - Flatiron compute node, 'AWS' - using data from AWS S3

  • mode – str ('log' or 'raise'), behaviour to adopt if an error occurred. If 'raise', the error is raised at the very end of this function (i.e. after having labeled the tasks)

Returns
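
A sketch of running a single job pulled from Alyx; the REST query filters and the eid variable are assumptions for illustration:

    from one.api import ONE
    from ibllib.pipes.tasks import run_alyx_task

    one = ONE()
    session_path = one.eid2path(eid)  # eid: the Alyx session UUID, assumed defined
    job_deck = one.alyx.rest('tasks', 'list', session=eid)
    for tdict in [t for t in job_deck if t['status'] == 'Waiting']:
        run_alyx_task(tdict=tdict, session_path=session_path, one=one,
                      job_deck=job_deck, location='server', mode='log')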