Tutorial: Neural Network

Overview

In this section, we are building a neural network with minor tweaks to the ridge logistic regressor in the previous section.

Architecture

We’re going to add a simple fully connected (dense) layer between the input and the target neuron, which will allow our predictor to be more able to represent the data. Here’s a diagram of our desired model:

In tflon, it’s as simple as adding

tflon.toolkit.Dense(5, activation=tf.tanh)

Giving us:

class NNModel(tflon.model.Model):
    def _model(self):
        I = self.add_input('desc', shape=[None, 30])
        T = self.add_target('targ', shape=[None, 1])

        net = tflon.toolkit.WindowInput() |\
              tflon.toolkit.Dense(5, activation=tf.tanh) |\
              tflon.toolkit.Dense(1)
        out = net(I)

        self.add_output( 'pred', tf.nn.sigmoid(out) )
        self.add_loss( 'xent', tflon.toolkit.xent_uniform_sum(T, out) )
        self.add_loss( 'l2', tflon.toolkit.l2_penalty(self.weights) )
        self.add_metric( 'auc', tflon.toolkit.auc(T, out) )

Variants

Multiple output regression

For many Neural Networks we would like to predict multiple targets real valued targets. This can be done convienently using Dense and modifying add_target. Since it’s a real value regressor, we will used the common least mean squared error instead of crossentropy.

class MNNModel(tflon.model.Model):
    def _model(self):
        I = self.add_input('desc', shape=[None, 29])
        T = self.add_target('targ', shape=[None, 2])

        net = tflon.toolkit.WindowInput() |\
              tflon.toolkit.Dense(5, activation=tf.tanh) |\
              tflon.toolkit.Dense(2)
        out = net(I)

        self.add_output( 'pred', tf.nn.sigmoid(out) )
        self.add_loss('loss', tf.reduce_sum(tf.square(pred - out)))
        self.add_loss( 'l2', tflon.toolkit.l2_penalty(self.weights) )
        self.add_metric( 'auc', tflon.toolkit.auc(T, out) )

For use with the breast cancer data, the following line in the preprocessing will also need to be modified to set two columns for the target instead of one.

targ, desc = tflon.data.Table(df).split([2])

Generalization

The goal in building deep learning models is to train the network to best respond to unobserved data. Thus, it’s important to construct a model that doesn’t overfit or ‘memorizes’ the training data.

Validation and Test Set

Holdout: It’s best practice to reserve a holdout set partitioned before any parameter tuning. This ensures that at the end this dataset can be used as an unbiased estimate of the generalizaiton error. We refer to this set as the holdout/test set and the rest the training set.

As expected, this can easily be achieved by adding the following line after we define our feed. We holdout 10% for testing.

feed = tflon.data.TableFeed({'desc':desc, 'targ':targ})
IDs = np.random.randint(569 -1, shape=569//10)
testing =  feed.holdout(IDs)

Then, we can evaluate the result. test_result is an array of the model’s predictions while test_AUC provides the AUC.

with tf.Session():
    LR.fit( feed, trainer, restarts=2 )
    test_result = NN.infer(testing,'pred')
    test_AUC = NN.evaluate(testing)['AUC']

Crossvalidation: Simply put crossvalidation splits the data into fragments using each fragment as validation once. This allows us to use all the data in the training set for validation when evaluating the effect of different parameter settings. We will not show it explicitly, but effectively we implement this by applying holdout to different subsets.

AUC: 0.9991015

In [ ]:

import pandas as pd
import tensorflow as tf
import tflon
tflon.system.reset()
class LRModel(tflon.model.Model):
    def _model(self):
        I = self.add_input('desc', shape=[None, 29])
        T = self.add_target('targ', shape=[None, 1])

        net = tflon.toolkit.WindowInput() |\
                tflon.toolkit.Apply(lambda x: tf.nn.convolution())|\
        tflon.toolkit.Dense(5, activation=tf.tanh) | tflon.toolkit.Dense(1)
        L = net(I)

        self.add_output( 'pred', L )
        self.add_loss('loss', tf.reduce_sum(tf.square(pred - out)))
        self.add_loss( 'L2', tflon.toolkit.l2_penalty(self.weights))
        self.add_metric( 'auc', tflon.toolkit.auc(T, L) )

df = pd.read_csv("~/wdbc.data" , header=None)
df[1] = df[1].apply(lambda x : 1 if (x == 'M') else 0)
df = df.iloc[:,2:]
targ, desc = tflon.data.Table(df).split([1])
feed = tflon.data.TableFeed({'desc':desc, 'targ':targ})

LR = LRModel()

trainer = tflon.train.OpenOptTrainer( iterations=100)

with tf.Session():
    LR.fit( feed, trainer, restarts=2 )
    metrics = LR.evaluate(feed)
    print "AUC:", metrics['auc']