Training
See also Training for an overview of the relevant aspects.
- batch_size
An integer defining the batch size in data items (frames, words, subwords, etc.) per batch. A mini-batch has at least a time-dimension and a batch-dimension (or sequence-dimension), and, depending on whether the data is dense or sparse, also a feature-dimension. batch_size is the upper limit for time * sequences during creation of the mini-batches.
- batching
Defines the default value for seq_ordering across all datasets. It is recommended not to use this parameter, but rather to define seq_ordering explicitly in the datasets for better readability. Possible values are:
  - default: Keep the sequences as is
  - reverse: Use the default sequences in reversed order
  - random: Shuffle the data with a predefined fixed seed
  - random:<seed>: Shuffle the data with the given seed
  - sorted: Sort by length (only if available), beginning with the shortest sequences
  - sorted_reverse: Sort by length, beginning with the longest sequences
  - laplace:<n_buckets>: Sort by length with n Laplacian buckets (one bucket means going from shortest to longest and back with 1/n of the data)
  - laplace:.<n_sequences>: Sort by length with n sequences per Laplacian bucket
Note that not all sequence order modes are available for all datasets, and some datasets may provide additional modes.
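To make the bucketed ordering more concrete, here is a minimal sketch of a laplace-style ordering. It is not the library's implementation. It assumes that the indices are shuffled with a fixed seed, split into n roughly equal buckets, sorted by length inside each bucket, and that every second bucket is reversed. The function name laplace_order and its parameters are hypothetical.

```python
import random


def laplace_order(seq_lens, n_buckets, seed=1):
    """Return sequence indices in a simplified laplace-style order.

    Assumptions (for illustration only): shuffle with a fixed seed, split
    into n_buckets roughly equal buckets, sort each bucket by length, and
    reverse every second bucket so lengths go up, then down, then up again.
    """
    indices = list(range(len(seq_lens)))
    random.Random(seed).shuffle(indices)
    bucket_size = -(-len(indices) // n_buckets)  # ceiling division
    ordered = []
    for b in range(n_buckets):
        bucket = indices[b * bucket_size:(b + 1) * bucket_size]
        # Even buckets ascend in length, odd buckets descend.
        bucket.sort(key=lambda i: seq_lens[i], reverse=(b % 2 == 1))
        ordered.extend(bucket)
    return ordered


seq_lens = [12, 3, 45, 7, 30, 18, 5, 22]
order = laplace_order(seq_lens, n_buckets=2)
print([seq_lens[i] for i in order])
```

Within each bucket, neighbouring sequences have similar lengths, which keeps padding low, while the shuffle before bucketing keeps some randomness in which sequences end up together.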
- chunking
You can chunk the sequences of your data into parts, which greatly reduces the amount of zero-padding needed. This option is a string of two numbers separated by a colon, i.e. chunk_size:chunk_step, where chunk_size is the size of a chunk and chunk_step is the step after which the next chunk is created. That is, consecutive chunks overlap by chunk_size - chunk_step frames. Set this to 0 to disable it, or for example to 100:75 to enable it.
- cleanup_old_models
If set to True, checkpoints are removed based on their score on the dev set. By default, the 2 most recent checkpoints, the 4 best checkpoints, and the checkpoints 20, 40, 80, 160, 240 are kept. Can be set to a dictionary to specify additional options:
  - keep_last_n: integer defining how many recent checkpoints to keep
  - keep_best_n: integer defining how many best checkpoints to keep
  - keep: list or set of integers defining which checkpoints to keep
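As an illustration of the three options, the sketch below first writes the defaults mentioned above as a dictionary and then computes which checkpoints would survive. The helper epochs_to_keep is hypothetical and is not the library's code; it assumes that a lower dev score is better and that checkpoints are identified by epoch number.

```python
# The defaults described above, written out as a dictionary.
cleanup_old_models = {
    "keep_last_n": 2,
    "keep_best_n": 4,
    "keep": [20, 40, 80, 160, 240],
}


def epochs_to_keep(scores, keep_last_n, keep_best_n, keep):
    """Hypothetical helper: scores maps epoch -> dev score (lower is better)."""
    epochs = sorted(scores)
    last = set(epochs[-keep_last_n:]) if keep_last_n > 0 else set()
    best = set(sorted(epochs, key=lambda e: scores[e])[:keep_best_n])
    fixed = set(keep) & set(epochs)  # only epochs that actually exist
    return sorted(last | best | fixed)


# Made-up dev scores for 50 epochs, with the minimum near epoch 30.
scores = {epoch: abs(epoch - 30.4) for epoch in range(1, 51)}
kept = epochs_to_keep(scores, **cleanup_old_models)
print(kept)
```

Every checkpoint not in the returned list would be a candidate for removal under these assumptions.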
- max_seq_length
A dict with string:integer pairs. The string must be a valid data key, and the integer specifies the upper bound on the length of this data object. During batch construction, any sequence for which the specified data object exceeds the upper bound is discarded. Note that some datasets (e.g. OggZipDataset) load and process the data to determine the length, so data processing might be performed even for discarded sequences.
- max_seqs
An integer specifying the upper limit of sequences in a batch (can be used in addition to batch_size).
- num_epochs
An integer specifying the number of epochs to train.
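The interaction of batch_size, max_seqs and max_seq_length described above can be sketched for a single data stream as follows. This is a simplified illustration, not the library's batching code. It assumes that time * sequences means the longest sequence in the batch (the padded length) times the number of sequences, and it uses a plain integer for max_seq_length instead of a dict keyed by data key. The function make_batches is hypothetical.

```python
def make_batches(seq_lens, batch_size, max_seqs=None, max_seq_length=None):
    """Group sequence indices into batches (simplified, single data stream).

    Assumption: a batch is full when padded_length * num_seqs would exceed
    batch_size, or when it would contain more than max_seqs sequences.
    """
    batches, current, max_len = [], [], 0
    for idx, length in enumerate(seq_lens):
        if max_seq_length is not None and length > max_seq_length:
            continue  # sequence is too long and is discarded
        new_max = max(max_len, length)
        too_big = new_max * (len(current) + 1) > batch_size
        too_many = max_seqs is not None and len(current) + 1 > max_seqs
        if current and (too_big or too_many):
            batches.append(current)  # close the current batch
            current, new_max = [], length
        # A single sequence longer than batch_size still gets its own batch.
        current.append(idx)
        max_len = new_max
    if current:
        batches.append(current)
    return batches


seq_lens = [10, 20, 30, 200, 5, 5, 5]
batches = make_batches(seq_lens, batch_size=60, max_seqs=2, max_seq_length=100)
print(batches)
```

In this example the sequence of length 200 is dropped by max_seq_length, and the remaining sequences are grouped so that no batch exceeds 60 padded items or 2 sequences.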
- save_interval
An integer specifying after how many epochs the model is saved.
- start_epoch
An integer or string specifying the epoch to start the training at. The default is ‘auto’.
- stop_on_nonfinite_train_score
If set to False, the training will not be interrupted if a single update step has a loss of NaN or Inf.
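For orientation, the options in this section could appear together in a config file roughly as follows. All values are placeholders chosen for illustration, and the data key "classes" is an assumption. The small helper chunk_ranges is hypothetical and only illustrates what a chunking value of 100:75 means for one sequence; the library may treat the final chunks differently.

```python
# Placeholder values for illustration only.
batch_size = 5000
max_seqs = 40
max_seq_length = {"classes": 75}  # "classes" is an assumed data key
chunking = "100:75"               # chunk_size:chunk_step
cleanup_old_models = True
num_epochs = 100
save_interval = 1
start_epoch = "auto"
stop_on_nonfinite_train_score = False

# Consecutive chunks overlap by chunk_size - chunk_step frames.
chunk_size, chunk_step = (int(v) for v in chunking.split(":"))
overlap = chunk_size - chunk_step


def chunk_ranges(seq_len, chunk_size, chunk_step):
    """Hypothetical helper: (start, end) frame ranges of the chunks."""
    return [(start, min(start + chunk_size, seq_len))
            for start in range(0, seq_len, chunk_step)]


print(overlap)
print(chunk_ranges(250, chunk_size, chunk_step))
```

With these values a sequence of 250 frames is cut into chunks of at most 100 frames that start every 75 frames, so neighbouring chunks share 25 frames.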