Engine

class Engine.Engine(devices)[source]
analyze(data, statistics)[source]

Returns: nothing, will print everything to log.v1

check_last_epoch()[source]
classify(data, output_file)[source]
compute_priors(dataset, config)[source]
classmethod config_get_final_epoch(config)[source]
daemon(config)[source]
classmethod epoch_model_filename(model_filename, epoch, is_pretrain)[source]
Return type: str
eval_model()[source]
format_score(score)[source]
forward_to_hdf(data, output_file, combine_labels='', batch_size=0)[source]
classmethod get_epoch_model(config)[source]

Returns: (epoch, modelFilename)
Return type: (int|None, str|None)
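
For illustration, a minimal sketch of how the returned tuple might be inspected; describe_start_model and the already-loaded config are assumptions, only Engine.Engine.get_epoch_model itself comes from the API above.

    import Engine

    def describe_start_model(config):
        # get_epoch_model() returns (epoch, modelFilename); both may be None.
        epoch, model_filename = Engine.Engine.get_epoch_model(config)
        if model_filename is None:
            print("No existing model found; training would start from scratch.")
        else:
            print("Would load model %r for epoch %r." % (model_filename, epoch))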

get_epoch_model_filename()[source]
get_epoch_str()[source]
get_eval_datasets()[source]
classmethod get_existing_models(config)[source]
classmethod get_train_start_epoch_batch(config)[source]

We always automatically determine the best starting (epoch, batch) tuple based on the existing model files. This ensures that the files are present and that there are no old, outdated files which would have to be ignored. Note that epochs start at index 1 and batches at index 0.

Parameters:
  • config (Config.Config)
Returns: (epoch, batch)
Return type: (int, int)
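
A hedged sketch of how a training loop might consume this value; everything except Engine.Engine.get_train_start_epoch_batch and its 1-based epoch / 0-based batch convention is an illustrative assumption.

    import Engine

    def plan_training(config, final_epoch):
        # Determine where training should (re)start, based on existing model files.
        start_epoch, start_batch = Engine.Engine.get_train_start_epoch_batch(config)
        for epoch in range(start_epoch, final_epoch + 1):
            # Only the resumed epoch starts at a non-zero batch index.
            first_batch = start_batch if epoch == start_epoch else 0
            print("epoch %i: starting at batch %i" % (epoch, first_batch))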

init_network_from_config(config)[source]
init_train_epoch()[source]
init_train_from_config(config, train_data, dev_data=None, eval_data=None)[source]
is_first_epoch_after_pretrain()[source]
is_pretrain_epoch()[source]
classmethod model_filename_postfix()[source]
network_dump_json(json_filename)[source]
print_network_info()[source]
save_model(filename, epoch)[source]
Parameters:
  • filename (str) – full filename for model
  • epoch (int) – epoch index to save under
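
A hedged usage sketch; the engine object, the per-epoch training call, and the filename pattern are assumptions for illustration only.

    # Assumes `engine` is an initialized Engine.Engine whose network has already
    # been constructed (e.g. via init_network_from_config).
    def save_after_each_epoch(engine, num_epochs):
        for epoch in range(1, num_epochs + 1):  # epochs are 1-based
            # ... run the training for this epoch here ...
            engine.save_model("models/network.%03d" % epoch, epoch)  # illustrative path
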
train()[source]
train_epoch()[source]
class Engine.SeqTrainParallelControl(engine, config, **kwargs)[source]

Idea: Parallelize some parts of sequence training (e.g. the Sprint loss). Chunked training can be used. We have these steps:

  1. (forward:GPU) forward only, remember the output
  2. (calc_loss:CPU) calculate the loss and the error signal based on the data from (1); store hat_y = y - grad_L (for stability)
  3. (train:GPU) forward again, and backprop with the data from (2)

Steps (1) and (3) run on the same GPU and use the same shared params. Step (2) runs on the CPU. Step (3) is done via the usual loop in EngineTask.TrainTaskThread, which calls self.train_wait_for_seqs(). A minimal illustrative sketch of these three steps follows below.
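
For illustration only, a minimal single-process sketch of the three steps above; all names here (model.forward, model.backprop, loss_and_grad) are invented assumptions, while the real implementation runs steps (1) and (3) on the device and step (2) on the CPU via the classes described in this section.

    def forward_step(model, seq):
        # Step (1), GPU in the real setup: forward only, remember the output y.
        return model.forward(seq)

    def calc_loss_step(y, targets, loss_and_grad):
        # Step (2), CPU: compute the loss and the error signal from the output of (1).
        loss, grad_L = loss_and_grad(y, targets)
        hat_y = y - grad_L  # store hat_y = y - grad_L for numerical stability
        return loss, hat_y

    def train_step(model, seq, hat_y):
        # Step (3), GPU, same shared params as (1): forward again, then backprop
        # with the error signal recovered from (2): y_new - hat_y equals grad_L if
        # the output has not changed, and stays close to it otherwise.
        y_new = model.forward(seq)
        model.backprop(y_new - hat_y)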

This class, SeqTrainParallelControl, is instantiated by the Engine, and it has these callbacks, which are called by the engine (TrainTaskThread):

train_start_epoch(), train_finish_epoch(), train_wait_for_seqs()

Thus, this instance lives in the main process, and this code is executed there.

A counterpart of this code lives in the device process; we call it via Device.seq_train_parallel_control, which is an instance of SeqTrainParallelControlDevHost. Most of the work actually happens there.
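
A hedged sketch of the callback sequence as driven by the training loop; run_epoch, batch_sets, and the loop structure are illustrative assumptions, and only the three callback names come from the description above.

    def run_epoch(control, device, batch_sets):
        # `control` is the SeqTrainParallelControl instance living in the main process.
        control.train_start_epoch()
        for batches in batch_sets:
            # Tell the device which set of batches we want to train on next.
            control.train_wait_for_seqs(device, batches)
            # ... the usual forward + backprop for these batches happens here ...
        control.train_finish_epoch()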

forward_fill_queue()[source]

Full sequence forwarding, no chunking (at the moment).

train_finish_epoch()[source]

Called from TrainTaskThread at the end of an epoch.

train_start_epoch()[source]

Called from TrainTaskThread at the beginning of a new epoch.

train_wait_for_seqs(device, batches)[source]

Called from TrainTaskThread while doing training (forward + backprop). This tells the device which set of batches we want to train on next.

Parameters:
  • device (Device.Device)
  • batches (list[EngineBatch.Batch])