Trainers

class tflon.train.trainer.Hook(frequency)

Base class for training hooks. New hooks should inherit this class and override Hook.call (and, optionally, Hook.finish).

Hook.call should return a TensorFlow op, usually constructed using tf.py_func. The op returned by Hook.call is executed at fixed intervals during training.

Hook.finish is called after a Trainer completes Trainer.train.

Parameters:
 frequency (int) – Number of iterations between invocations of this hook’s call method
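
As an illustration, a custom hook might look like the following sketch. The exact signature of Hook.call, the boolean early-termination convention, and the model.loss tensor are assumptions, not confirmed by this reference:

    import tensorflow as tf
    from tflon.train.trainer import Hook

    class LossPrinterHook(Hook):
        """Print the current loss every `frequency` iterations (sketch)."""
        def __init__(self, frequency=100):
            super(LossPrinterHook, self).__init__(frequency)

        def call(self, model):  # assumed signature: receives the tflon model
            def _report(loss):
                print('current loss:', loss)
                return False  # assumed convention: False means do not terminate early
            # Wrap the python callback as a graph op, as suggested above
            return tf.py_func(_report, [model.loss], tf.bool)

        def finish(self):
            print('training complete')
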
class tflon.train.trainer.LambdaHook(func)

A hook that wraps lambda functions passed as hooks to Trainer. This is applied automatically by Trainer.
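
For example (a sketch; the arguments passed to a hook lambda and the forwarding of hooks through TFTrainer's keyword arguments are assumptions):

    trainer = tflon.train.TFTrainer(
        tf.train.AdamOptimizer(1e-3), 1000,
        hooks=[lambda *args: print('hook fired:', args)])  # wrapped in a LambdaHook automatically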

class tflon.train.trainer.LearningRateDecayHook(initial_rate, decay, frequency)
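
This hook is otherwise undocumented here; judging from its parameters, it presumably decays the learning rate from initial_rate by a factor of decay every frequency iterations. A hypothetical construction under that assumption:

    # Assumed semantics: start at 1e-3, multiply by 0.96 every 1000 iterations
    hook = tflon.train.trainer.LearningRateDecayHook(
        initial_rate=1e-3, decay=0.96, frequency=1000)
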
class tflon.train.trainer.LogProgressHook(frequency, maxiter)

Hook to print progress to screen at specified intervals, automatically created by TFTrainer and OpenOptTrainer.

Parameters:
  • frequency (int) – The interval between logging output
  • maxiter (int) – The number of training iterations
class tflon.train.trainer.OpenOptTrainer(iterations=150, solver='scipy_lbfgsb', options={}, log_frequency=20, **kwargs)

Wrapper that applies an optimizer from OpenOpt (a wrapper around scipy.minimize) to train a tflon model.

Keyword Arguments:
 
  • iterations (int) – The number of training iterations (default=150)
  • solver (str) – The minimizer to use; valid values include any algorithm implemented by scipy.minimize (default='scipy_lbfgsb')
  • options (dict) – Options to pass to the openopt.NLP module (default={})
  • log_frequency (int) – Frequency to print progress to screen or log file (default=20)
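
A minimal usage sketch, where model and feed stand for a tflon Model and its input feed (both assumed to exist):

    trainer = tflon.train.OpenOptTrainer(iterations=300, log_frequency=10)
    model.fit(feed, trainer)
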
class tflon.train.trainer.SummaryHook(directory, frequency=10, summarize_trainables=False, summarize_gradients=False, summarize_activations=False, summarize_losses=True, summarize_resources=True)

Hook to write TensorBoard logs. By default, only loss values are logged.

Parameters:

directory (str) – The directory to create tensorboard event files

Keyword Arguments:
 
  • frequency (int) – The number of iterations between log entries (default=10)
  • summarize_trainables (bool) – Include histograms of all weight and bias variables (default=False)
  • summarize_gradients (bool) – Include plots of all gradient magnitudes (default=False)
  • summarize_activations (bool) – Include summaries of activations (default=False)
  • summarize_losses (bool) – Include plots of all loss magnitudes (default=True)
  • summarize_resources (bool) – Include summaries of resources (default=True)
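
A sketch of wiring the hook into training; passing hooks through TFTrainer's keyword arguments (forwarded to the Trainer base class) is an assumption:

    hook = tflon.train.SummaryHook('./tb_logs',  # hypothetical log directory
                                   frequency=50,
                                   summarize_trainables=True)
    trainer = tflon.train.TFTrainer(tf.train.AdamOptimizer(1e-3), 10000, hooks=[hook])
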
class tflon.train.trainer.TFTrainer(optimizer, iterations, checkpoint=None, resume=False, log_frequency=100, gpu_profile_file=None, **kwargs)

Wrapper that applies an optimizer inheriting from tf.train.Optimizer to train a tflon model.

Parameters:
  • optimizer (tf.train.Optimizer) – An instance of tensorflow Optimizer
  • iterations (int) – The number of training iterations
Keyword Arguments:
 
  • checkpoint (tuple) – None or pair of directory (str) and frequency (int). If not None, then write checkpoint files to the specified directory at specified intervals (default=None)
  • resume (bool) – Resume from checkpoint file, if available (default=False)
  • log_frequency (int) – Frequency to print progress to screen or log file (default=100)
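
For example, training with periodic checkpoints (a sketch; model and feed are assumed):

    trainer = tflon.train.TFTrainer(
        tf.train.AdamOptimizer(1e-3), 10000,
        checkpoint=('./checkpoints', 1000),  # write a checkpoint every 1000 iterations
        resume=True)                         # resume from an existing checkpoint, if available
    model.fit(feed, trainer)
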
class tflon.train.trainer.Trainer(hooks=[])

Trainer base class for use with tflon Model.fit.

Keyword Arguments:
 hooks (list) – Zero or more Hook objects, to be called during training (default=[])

Optimizers

class tflon.train.optimizer.GradientAccumulatorOptimizer(optimizer, gradient_steps, **kwargs)

Wrapper for tf.train.Optimizer instances that accumulates gradients over several minibatches before applying them.

Parameters:
  • optimizer (tf.train.Optimizer) – An optimizer instance to wrap and accumulate gradients
  • gradient_steps (int) – The number of steps over which to accumulate gradients
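
A usage sketch: with gradient_steps=4, the wrapped update is applied once per four minibatches, roughly emulating a fourfold larger batch. Passing the wrapper directly to TFTrainer is an assumption:

    base = tf.train.AdamOptimizer(1e-3)
    opt = tflon.train.optimizer.GradientAccumulatorOptimizer(base, gradient_steps=4)
    trainer = tflon.train.TFTrainer(opt, 10000)
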
apply_gradients(gradients)

Apply gradients to variables.

This is the second part of minimize(). It returns an Operation that applies gradients.

Parameters:
  • grads_and_vars – List of (gradient, variable) pairs as returned by compute_gradients().
  • global_step – Optional Variable to increment by one after the variables have been updated.
  • name – Optional name for the returned operation. Default to the name passed to the Optimizer constructor.
Returns:

An Operation that applies the specified gradients. If global_step was not None, that operation also increments global_step.

Raises:
  • TypeError – If grads_and_vars is malformed.
  • ValueError – If none of the variables have gradients.
  • RuntimeError – If you should use _distributed_apply() instead.
compute_gradients(*args, **kwargs)

Compute gradients of loss for the variables in var_list.

This is the first part of minimize(). It returns a list of (gradient, variable) pairs where “gradient” is the gradient for “variable”. Note that “gradient” can be a Tensor, an IndexedSlices, or None if there is no gradient for the given variable.

Parameters:
  • loss – A Tensor containing the value to minimize or a callable taking no arguments which returns the value to minimize. When eager execution is enabled it must be a callable.
  • var_list – Optional list or tuple of tf.Variable to update to minimize loss. Defaults to the list of variables collected in the graph under the key GraphKeys.TRAINABLE_VARIABLES.
  • gate_gradients – How to gate the computation of gradients. Can be GATE_NONE, GATE_OP, or GATE_GRAPH.
  • aggregation_method – Specifies the method used to combine gradient terms. Valid values are defined in the class AggregationMethod.
  • colocate_gradients_with_ops – If True, try colocating gradients with the corresponding op.
  • grad_loss – Optional. A Tensor holding the gradient computed for loss.
Returns:

A list of (gradient, variable) pairs. Variable is always present, but gradient can be None.

Raises:
  • TypeError – If var_list contains anything other than Variable objects.
  • ValueError – If some arguments are invalid.
  • RuntimeError – If called with eager execution enabled and loss is not callable.

Eager compatibility: when eager execution is enabled, gate_gradients, aggregation_method, and colocate_gradients_with_ops are ignored.

class tflon.train.optimizer.GradientClippingOptimizer(optimizer, clip_value, **kwargs)

Wrapper for tf.train.Optimizer instances which applies tf.clip_by_global_norm to the optimizer gradients.

Parameters:
  • optimizer (tf.train.Optimizer) – An optimizer instance to wrap and apply clipping
  • clip_value (float) – The global norm threshold for clipping
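
A usage sketch along the same lines; clip_value bounds the global norm of each gradient step:

    base = tf.train.AdamOptimizer(1e-3)
    opt = tflon.train.optimizer.GradientClippingOptimizer(base, clip_value=5.0)
    trainer = tflon.train.TFTrainer(opt, 10000)
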
compute_gradients(*args, **kwargs)

Compute gradients of loss for the variables in var_list.

This is the first part of minimize(). It returns a list of (gradient, variable) pairs where “gradient” is the gradient for “variable”. Note that “gradient” can be a Tensor, an IndexedSlices, or None if there is no gradient for the given variable.

Parameters:
  • loss – A Tensor containing the value to minimize or a callable taking no arguments which returns the value to minimize. When eager execution is enabled it must be a callable.
  • var_list – Optional list or tuple of tf.Variable to update to minimize loss. Defaults to the list of variables collected in the graph under the key GraphKeys.TRAINABLE_VARIABLES.
  • gate_gradients – How to gate the computation of gradients. Can be GATE_NONE, GATE_OP, or GATE_GRAPH.
  • aggregation_method – Specifies the method used to combine gradient terms. Valid values are defined in the class AggregationMethod.
  • colocate_gradients_with_ops – If True, try colocating gradients with the corresponding op.
  • grad_loss – Optional. A Tensor holding the gradient computed for loss.
Returns:

A list of (gradient, variable) pairs. Variable is always present, but gradient can be None.

Raises:
  • TypeError – If var_list contains anything other than Variable objects.
  • ValueError – If some arguments are invalid.
  • RuntimeError – If called with eager execution enabled and loss is not callable.

Eager compatibility: when eager execution is enabled, gate_gradients, aggregation_method, and colocate_gradients_with_ops are ignored.

class tflon.train.optimizer.WrappedOptimizer(optimizer, name=None, use_locking=False)

An optimizer that wraps another tf.train.Optimizer, used to modify gradient computations.

get_slot(var, sn)

Return the slot named sn created for var by the Optimizer.

Some Optimizer subclasses use additional variables. For example Momentum and Adagrad use variables to accumulate updates. This method gives access to these Variable objects if for some reason you need them.

Use get_slot_names() to get the list of slot names created by the Optimizer.

Parameters:
  • var – A variable passed to minimize() or apply_gradients().
  • sn – A string, the name of the slot.
Returns:

The Variable for the slot if it was created, None otherwise.

get_slot_names()

Return a list of the names of slots created by the Optimizer.

See get_slot().

Returns: A list of strings.
variables()

A list of variables which encode the current state of Optimizer.

Includes slot variables and additional global variables created by the optimizer in the current default graph.

Returns: A list of variables.

Curriculum

class tflon.train.sampling.Curriculum(levels, step_function=<function curriculum_step_distribute>, **kwargs)
Interface for curriculum orchestration. Curriculum strategies must implement:

    def evaluate(self, model, step, loss):
        return False

Usage Example:

    level1 = …  # Some function returning generators of batches
    level2 = …  # ditto
    trainer = tflon.train.TFTrainer(tf.train.AdamOptimizer(1e-3), 10000)  # iterations is required by the signature above
    curr = tflon.train.FixedIntervalCurriculum([level1, level2], frequency=1000)  # frequency is required
    model.fit(curr.iterate(), trainer)
Parameters:

levels (list) – List of functions returning generators, corresponding to the sorted order of occurrence of curriculum examples

Keyword Arguments:
 
  • frequency (int, required) – Number of steps between curriculum evaluations
  • step_function (function) – The type of curriculum update function to use (default=curriculum_step_distribute)
evaluate(step, loss)

Curriculum evaluation

Parameters:
  • step (numpy.int64) – The current global optimization step
  • loss (numpy.float32) – The current optimizer loss
Returns:

Indicator for early termination of training (see tflon.train.Hook)

Return type:

boolean

class tflon.train.sampling.FixedIntervalCurriculum(*args, **kwargs)
Parameters:
 levels (list) – List of functions returning generators, corresponding to the sorted order of occurrence of curriculum examples
Keyword Arguments:
 frequency (int, required) – The number of steps between curriculum changes
evaluate(step, loss)

Curriculum evaluation

Parameters:
  • step (numpy.int64) – The current global optimization step
  • loss (numpy.float32) – The current optimizer loss
Returns:

Indicator for early termination of training (see tflon.train.Hook)

Return type:

boolean

class tflon.train.sampling.MetricGatedCurriculum(criterion, *args, **kwargs)
Parameters:
  • criterion (lambda) – Lambda expression taking result of model.evaluate and returning whether the current level is passed
  • levels (list) – List of functions returning generators, corresponding to the sorted order of occurrence of curriculum examples
Keyword Arguments:
 frequency (int, required) – The number of steps between test evaluations

evaluate(step, loss)

Curriculum evaluation

Parameters:
  • step (numpy.int64) – The current global optimization step
  • loss (numpy.float32) – The current optimizer loss
Returns:

Indicator for early termination of training (see tflon.train.Hook)

Return type:

boolean
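
A sketch of gating level advancement on a validation metric; the shape of the model.evaluate result (here a dict with an 'auc' entry) is hypothetical:

    curr = tflon.train.MetricGatedCurriculum(
        lambda metrics: metrics['auc'] > 0.9,  # hypothetical criterion
        [level1, level2],                      # level generator functions, as above
        frequency=500)
    model.fit(curr.iterate(), trainer)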

tflon.train.sampling.merge_feeds(*iterators)

Merge multiple feed iterators into a single iterator.
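
A minimal sketch; feed_a and feed_b stand for any two feed iterators of the kind accepted by Model.fit:

    merged = tflon.train.sampling.merge_feeds(feed_a, feed_b)
    model.fit(merged, trainer)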