Trainers

class tflon.train.trainer.Hook(frequency)

Base class for training hooks. New hooks should inherit this class and override Hook.call (and, optionally, Hook.finish).

Hook.call should return a TensorFlow op, usually constructed using tf.py_func. The op returned by Hook.call is executed at fixed intervals during training.

Hook.finish is called after a Trainer completes Trainer.train.

Parameters:
 frequency (int) – Number of iterations between invocations of this hook’s call method
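
As an illustration, a custom hook might look like the following sketch. The exact signature of Hook.call, the boolean early-termination convention, and the model.loss tensor are assumptions, not confirmed by this reference:

    import tensorflow as tf
    from tflon.train.trainer import Hook

    class LossPrinterHook(Hook):
        """Print the current loss every `frequency` iterations (sketch)."""
        def __init__(self, frequency=100):
            super(LossPrinterHook, self).__init__(frequency)

        def call(self, model):  # assumed signature: receives the tflon model
            def _report(loss):
                print('current loss:', loss)
                return False  # assumed convention: False means do not terminate early
            # Wrap the python callback as a graph op, as suggested above
            return tf.py_func(_report, [model.loss], tf.bool)

        def finish(self):
            print('training complete')
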
class tflon.train.trainer.LambdaHook(func)

A hook that wraps lambda functions passed as hooks to Trainer. This is applied automatically by Trainer.
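
For example (a sketch; the arguments passed to a hook lambda and the forwarding of hooks through TFTrainer's keyword arguments are assumptions):

    trainer = tflon.train.TFTrainer(
        tf.train.AdamOptimizer(1e-3), 1000,
        hooks=[lambda *args: print('hook fired:', args)])  # wrapped in a LambdaHook automatically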

class tflon.train.trainer.LearningRateDecayHook(initial_rate, decay, frequency)
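
This hook is otherwise undocumented here; judging from its parameters, it presumably decays the learning rate from initial_rate by a factor of decay every frequency iterations. A hypothetical construction under that assumption:

    # Assumed semantics: start at 1e-3, multiply by 0.96 every 1000 iterations
    hook = tflon.train.trainer.LearningRateDecayHook(
        initial_rate=1e-3, decay=0.96, frequency=1000)
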
class tflon.train.trainer.LogProgressHook(frequency, maxiter)

Hook to print progress to screen at specified intervals, automatically created by TFTrainer and OpenOptTrainer.

Parameters:
  • frequency (int) – The interval between logging output
  • maxiter (int) – The number of training iterations
class tflon.train.trainer.OpenOptTrainer(iterations=150, solver='scipy_lbfgsb', options={}, log_frequency=20, **kwargs)

Wrapper that applies an optimizer from OpenOpt (a wrapper around scipy.minimize) to train a tflon model.

Keyword Arguments:
 
  • iterations (int) – The number of training iterations (default=150)
  • solver (str) – The minimizer to use; valid values include any algorithm implemented by scipy.minimize (default='scipy_lbfgsb')
  • options (dict) – Options to pass to the openopt.NLP module (default={})
  • log_frequency (int) – Frequency to print progress to screen or log file (default=20)
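
A minimal usage sketch, where model and feed stand for a tflon Model and its input feed (both assumed to exist):

    trainer = tflon.train.OpenOptTrainer(iterations=300, log_frequency=10)
    model.fit(feed, trainer)
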
class tflon.train.trainer.SummaryHook(directory, frequency=10, summarize_trainables=False, summarize_gradients=False, summarize_activations=False, summarize_losses=True, summarize_resources=True)

Hook to write TensorBoard logs. By default, only loss values are logged.

Parameters:

directory (str) – The directory to create tensorboard event files

Keyword Arguments:
 
  • frequency (int) – The number of iterations between log entries (default=10)
  • summarize_trainables (bool) – Include histograms of all weight and bias variables (default=False)
  • summarize_gradients (bool) – Include plots of all gradient magnitudes (default=False)
  • summarize_activations (bool) – Include summaries of activations (default=False)
  • summarize_losses (bool) – Include plots of all loss magnitudes (default=True)
  • summarize_resources (bool) – Include summaries of resources (default=True)
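
A sketch of wiring the hook into training; passing hooks through TFTrainer's keyword arguments (forwarded to the Trainer base class) is an assumption:

    hook = tflon.train.SummaryHook('./tb_logs',  # hypothetical log directory
                                   frequency=50,
                                   summarize_trainables=True)
    trainer = tflon.train.TFTrainer(tf.train.AdamOptimizer(1e-3), 10000, hooks=[hook])
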
class tflon.train.trainer.TFTrainer(optimizer, iterations, checkpoint=None, resume=False, log_frequency=100, gpu_profile_file=None, **kwargs)

Wrapper that applies an optimizer inheriting from tf.train.Optimizer to train a tflon model.

Parameters:
  • optimizer (tf.train.Optimizer) – An instance of tensorflow Optimizer
  • iterations (int) – The number of training iterations
Keyword Arguments:
 
  • checkpoint (tuple) – None or pair of directory (str) and frequency (int). If not None, then write checkpoint files to the specified directory at specified intervals (default=None)
  • resume (bool) – Resume from checkpoint file, if available (default=False)
  • log_frequency (int) – Frequency to print progress to screen or log file (default=100)
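
For example, training with periodic checkpoints (a sketch; model and feed are assumed):

    trainer = tflon.train.TFTrainer(
        tf.train.AdamOptimizer(1e-3), 10000,
        checkpoint=('./checkpoints', 1000),  # write a checkpoint every 1000 iterations
        resume=True)                         # resume from an existing checkpoint, if available
    model.fit(feed, trainer)
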
class tflon.train.trainer.Trainer(hooks=[])

Trainer base class for use with tflon Model.fit.

Keyword Arguments:
 hooks (list) – Zero or more Hook objects, to be called during training (default=[])

Optimizers

class tflon.train.optimizer.GradientAccumulatorOptimizer(optimizer, gradient_steps, **kwargs)

Wrapper for tf.train.Optimizer instances that accumulates gradients over several minibatches before applying them.

Parameters:
  • optimizer (tf.train.Optimizer) – An optimizer instance to wrap and accumulate gradients
  • gradient_steps (int) – The number of steps over which to accumulate gradients
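
A usage sketch: with gradient_steps=4, the wrapped update is applied once per four minibatches, roughly emulating a fourfold larger batch. Passing the wrapper directly to TFTrainer is an assumption:

    base = tf.train.AdamOptimizer(1e-3)
    opt = tflon.train.optimizer.GradientAccumulatorOptimizer(base, gradient_steps=4)
    trainer = tflon.train.TFTrainer(opt, 10000)
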
apply_gradients(gradients)

Apply gradients to variables.

This is the second part of minimize(). It returns an Operation that applies gradients.

Parameters:
  • grads_and_vars – List of (gradient, variable) pairs as returned by compute_gradients().
  • global_step – Optional Variable to increment by one after the variables have been updated.
  • name – Optional name for the returned operation. Default to the name passed to the Optimizer constructor.
Returns:

An Operation that applies the specified gradients. If global_step was not None, that operation also increments global_step.

Raises:
  • TypeError – If grads_and_vars is malformed.
  • ValueError – If none of the variables have gradients.
  • RuntimeError – If you should use _distributed_apply() instead.
compute_gradients(*args, **kwargs)

Compute gradients of loss for the variables in var_list.

This is the first part of minimize(). It returns a list of (gradient, variable) pairs where “gradient” is the gradient for “variable”. Note that “gradient” can be a Tensor, an IndexedSlices, or None if there is no gradient for the given variable.

Parameters:
  • loss – A Tensor containing the value to minimize or a callable taking no arguments which returns the value to minimize. When eager execution is enabled it must be a callable.
  • var_list – Optional list or tuple of tf.Variable to update to minimize loss. Defaults to the list of variables collected in the graph under the key GraphKeys.TRAINABLE_VARIABLES.
  • gate_gradients – How to gate the computation of gradients. Can be GATE_NONE, GATE_OP, or GATE_GRAPH.
  • aggregation_method – Specifies the method used to combine gradient terms. Valid values are defined in the class AggregationMethod.
  • colocate_gradients_with_ops – If True, try colocating gradients with the corresponding op.
  • grad_loss – Optional. A Tensor holding the gradient computed for loss.
Returns:

A list of (gradient, variable) pairs. Variable is always present, but gradient can be None.

Raises:
  • TypeError – If var_list contains anything other than Variable objects.
  • ValueError – If some arguments are invalid.
  • RuntimeError – If called with eager execution enabled and loss is not callable.

Eager compatibility: when eager execution is enabled, gate_gradients, aggregation_method, and colocate_gradients_with_ops are ignored.

class tflon.train.optimizer.GradientClippingOptimizer(optimizer, clip_value, **kwargs)

Wrapper for tf.train.Optimizer instances which applies tf.clip_by_global_norm to the optimizer gradients.

Parameters:
  • optimizer (tf.train.Optimizer) – An optimizer instance to wrap and apply clipping
  • clip_value (float) – The global norm threshold for clipping
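
A usage sketch along the same lines; clip_value bounds the global norm of each gradient step:

    base = tf.train.AdamOptimizer(1e-3)
    opt = tflon.train.optimizer.GradientClippingOptimizer(base, clip_value=5.0)
    trainer = tflon.train.TFTrainer(opt, 10000)
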
compute_gradients(*args, **kwargs)

Compute gradients of loss for the variables in var_list.

This is the first part of minimize(). It returns a list of (gradient, variable) pairs where “gradient” is the gradient for “variable”. Note that “gradient” can be a Tensor, an IndexedSlices, or None if there is no gradient for the given variable.

Parameters:
  • loss – A Tensor containing the value to minimize or a callable taking no arguments which returns the value to minimize. When eager execution is enabled it must be a callable.
  • var_list – Optional list or tuple of tf.Variable to update to minimize loss. Defaults to the list of variables collected in the graph under the key GraphKeys.TRAINABLE_VARIABLES.
  • gate_gradients – How to gate the computation of gradients. Can be GATE_NONE, GATE_OP, or GATE_GRAPH.
  • aggregation_method – Specifies the method used to combine gradient terms. Valid values are defined in the class AggregationMethod.
  • colocate_gradients_with_ops – If True, try colocating gradients with the corresponding op.
  • grad_loss – Optional. A Tensor holding the gradient computed for loss.
Returns:

A list of (gradient, variable) pairs. Variable is always present, but gradient can be None.

Raises:
  • TypeError – If var_list contains anything other than Variable objects.
  • ValueError – If some arguments are invalid.
  • RuntimeError – If called with eager execution enabled and loss is not callable.

Eager compatibility: when eager execution is enabled, gate_gradients, aggregation_method, and colocate_gradients_with_ops are ignored.

class tflon.train.optimizer.WrappedOptimizer(optimizer, name=None, use_locking=False)

An optimizer that wraps another tf.train.Optimizer, used to modify gradient computations.

get_slot(var, sn)

Return the slot named sn created for var by the Optimizer.

Some Optimizer subclasses use additional variables. For example Momentum and Adagrad use variables to accumulate updates. This method gives access to these Variable objects if for some reason you need them.

Use get_slot_names() to get the list of slot names created by the Optimizer.

Parameters:
  • var – A variable passed to minimize() or apply_gradients().
  • sn – A string, the name of the slot.
Returns:

The Variable for the slot if it was created, None otherwise.

get_slot_names()

Return a list of the names of slots created by the Optimizer.

See get_slot().

Returns: A list of strings.
variables()

A list of variables which encode the current state of Optimizer.

Includes slot variables and additional global variables created by the optimizer in the current default graph.

Returns: A list of variables.

Curriculum

class tflon.train.sampling.Curriculum(levels, step_function=<function curriculum_step_distribute>, **kwargs)
Interface for curriculum orchestration. Curriculum strategies must implement:

    def evaluate(self, model, step, loss):
        return False

Usage Example:

    level1 = …  # Some function returning generators of batches
    level2 = …  # ditto
    trainer = tflon.train.TFTrainer(tf.train.AdamOptimizer(1e-3), 10000)  # iterations is required by the signature above
    curr = tflon.train.FixedIntervalCurriculum([level1, level2], frequency=1000)  # frequency is required
    model.fit(curr.iterate(), trainer)
Parameters:

levels (list) – List of functions returning generators, corresponding to the sorted order of occurrence of curriculum examples

Keyword Arguments:
 
  • frequency (int, required) – Number of steps between curriculum evaluations
  • step_function (function) – The type of curriculum update function to use (default=curriculum_step_distribute)
evaluate(step, loss)

Curriculum evaluation

Parameters:
  • step (numpy.int64) – The current global optimization step
  • loss (numpy.float32) – The current optimizer loss
Returns:

Indicator for early termination of training (see tflon.train.Hook)

Return type:

boolean

class tflon.train.sampling.FixedIntervalCurriculum(*args, **kwargs)
Parameters:
 levels (list) – List of functions returning generators, corresponding to the sorted order of occurrence of curriculum examples
Keyword Arguments:
 frequency (int, required) – The number of steps between curriculum changes
evaluate(step, loss)

Curriculum evaluation

Parameters:
  • step (numpy.int64) – The current global optimization step
  • loss (numpy.float32) – The current optimizer loss
Returns:

Indicator for early termination of training (see tflon.train.Hook)

Return type:

boolean

class tflon.train.sampling.MetricGatedCurriculum(criterion, *args, **kwargs)
Parameters:
  • criterion (lambda) – Lambda expression taking result of model.evaluate and returning whether the current level is passed
  • levels (list) – List of functions returning generators, corresponding to the sorted order of occurrence of curriculum examples
Keyword Arguments:
 frequency (int, required) – The number of steps between test evaluations

evaluate(step, loss)

Curriculum evaluation

Parameters:
  • step (numpy.int64) – The current global optimization step
  • loss (numpy.float32) – The current optimizer loss
Returns:

Indicator for early termination of training (see tflon.train.Hook)

Return type:

boolean
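
A sketch of gating level advancement on a validation metric; the shape of the model.evaluate result (here a dict with an 'auc' entry) is hypothetical:

    curr = tflon.train.MetricGatedCurriculum(
        lambda metrics: metrics['auc'] > 0.9,  # hypothetical criterion
        [level1, level2],                      # level generator functions, as above
        frequency=500)
    model.fit(curr.iterate(), trainer)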

tflon.train.sampling.merge_feeds(*iterators)

Merge multiple feed iterators into a single iterator.
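
A minimal sketch; feed_a and feed_b stand for any two feed iterators of the kind accepted by Model.fit:

    merged = tflon.train.sampling.merge_feeds(feed_a, feed_b)
    model.fit(merged, trainer)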