
Question: trainer network error

Open alexge233 opened this issue 9 years ago • 3 comments

Hi,

Thanks a ton for the fantastic library! I'm using a deep network for NLP, with the input size varying from 12,000 down to 4,000 nodes. I've made sure to give node 8 GB of RAM:

node --max-old-space-size=8192

My network looks like this (although I'd like to try different types and architectures):

var layers = [];
layers.push({type: 'input', out_sx: 1, out_sy: 1, out_depth: input_size});
layers.push({type: 'fc', num_neurons: 200, activation: 'relu'});
layers.push({type: 'fc', num_neurons: 100, activation: 'relu'});
layers.push({type: 'fc', num_neurons: 50, activation: 'relu'});
layers.push({type: 'fc', num_neurons: 25, activation: 'relu'});
layers.push({type: 'fc', num_neurons: 10, activation: 'relu'});
layers.push({type: 'softmax', num_classes: 2});

var net = new convnet.Net();
net.makeLayers(layers);

My data has been parsed into a JSON array of objects, which I then randomly shuffle and partition into a training set and a testing set.

Each JSON object has a vector (simply an array of floats) and a score, which is a single value.

My training loop is basically the following:

var trainer = new convnet.Trainer(net, {learning_rate: 0.1, l2_decay: 0.001});
var epochs = 1000;
for (var i = 0; i < epochs; i++)
{
    for (var index in dataset.training())
    {
        var input = new convnet.Vol(json[index].vector);
        var output = new convnet.Vol(json[index].score);
        trainer.train(input, output);
    }
}

It runs, but I have no way of validating it. Is there a mean squared error, average cross-entropy, or any other network-error measurement? AFAIK, the only way to test the network's accuracy is to cross-validate using my testing samples and see (a) whether they are classified correctly, or (b) how far the actual output is from my target/ideal output.

I took a peek into the convnetjs source file, but I don't see Trainer.train returning any kind of network error (unless I missed something, which is very possible!).
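For reference: from a reading of the convnetjs source, Trainer.train appears to return a stats object that includes a cost_loss field (the data loss for that example), so an average loss per epoch can be tracked without a second forward pass. A minimal sketch; the trainer below is a mock so the snippet is self-contained, and with convnetjs you would pass your real Trainer instance instead:

```javascript
// Mock trainer standing in for convnetjs's Trainer, which (as far as I can
// tell) returns stats such as { cost_loss, l2_decay_loss, ... } from train().
var trainer = {
  train: function (input, label) {
    // a real trainer does the forward/backward pass here
    return { cost_loss: 0.5, l2_decay_loss: 0.01 };
  }
};

// Accumulate the per-example data loss into an average for the epoch.
function epochLoss(trainer, dataset) {
  var sum = 0;
  for (var k = 0; k < dataset.length; k++) {
    var stats = trainer.train(dataset[k].input, dataset[k].label);
    sum += stats.cost_loss;
  }
  return sum / dataset.length;
}

var fakeData = [{ input: null, label: 0 }, { input: null, label: 1 }];
console.log(epochLoss(trainer, fakeData)); // 0.5 with the mock above
```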

Last but not least, referencing your library, do you have a citation you'd like me to use?

PS: is there a way to save a trained network?
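On the PS: convnetjs nets can, as far as I can tell from the source, be serialized with net.toJSON() and restored with net.fromJSON(); the resulting string can be written to disk. A sketch of save/load helpers; MockNet stands in for convnet.Net here so the snippet runs on its own:

```javascript
// Mock with the same toJSON()/fromJSON() shape convnetjs's Net appears to
// expose; substitute convnet.Net in real code.
function MockNet() { this.layers = []; }
MockNet.prototype.toJSON = function () { return { layers: this.layers }; };
MockNet.prototype.fromJSON = function (json) { this.layers = json.layers; };

// Serialize a trained net to a string (write this to a file to persist it).
function saveNet(net) {
  return JSON.stringify(net.toJSON());
}

// Rebuild a net of the given constructor from a saved string.
function loadNet(NetCtor, str) {
  var net = new NetCtor();
  net.fromJSON(JSON.parse(str));
  return net;
}

var net = new MockNet();
net.layers = [{ type: 'input' }, { type: 'fc' }];
var restored = loadNet(MockNet, saveNet(net));
console.log(restored.layers.length); // 2
```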

Best regards, Alex

alexge233 avatar Mar 02 '16 19:03 alexge233

Hello,

I'm back with more questions. I've implemented the average cross-entropy (log-loss) error function for my ideal and predicted values.

  1. When running more than one hidden layer, all my network values propagate NaN. Is this intended behavior, loss of precision, or a bug?
  2. When using only one hidden layer, I get convergence on absolute values, e.g. [1, 0] or [0, 1], but my ACE becomes NaN.

My implementation is quite simple:

function log_error(actual, ideal)
{
    var err_0 = ideal.w[0] * Math.log(actual.w[0]) + ((1 - ideal.w[0]) * Math.log(1 - actual.w[0]));
    var err_1 = ideal.w[1] * Math.log(actual.w[1]) + ((1 - ideal.w[1]) * Math.log(1 - actual.w[1]));
    return err_0 + err_1;
}
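Worth noting about the function above: Math.log(0) is -Infinity in JavaScript, and 0 * -Infinity is NaN, so the expression produces NaN as soon as a softmax output saturates at exactly 0 or 1 (which matches the symptom in point 2). Clamping the probabilities away from 0 and 1 avoids this; a sketch with an epsilon guard (EPS is an arbitrary small constant of my choosing, not anything convnetjs provides):

```javascript
// Log-loss with probabilities clamped so Math.log never sees 0.
var EPS = 1e-12;

function clamp(p) {
  return Math.min(1 - EPS, Math.max(EPS, p));
}

function log_error_safe(actual, ideal) {
  var err = 0;
  for (var i = 0; i < ideal.w.length; i++) {
    var p = clamp(actual.w[i]);
    err += ideal.w[i] * Math.log(p) + (1 - ideal.w[i]) * Math.log(1 - p);
  }
  return err;
}

// A fully saturated prediction no longer blows up:
var ideal = { w: [1, 0] };
var actual = { w: [1, 0] }; // exact 0/1 outputs would have produced NaN before
console.log(isFinite(log_error_safe(actual, ideal))); // true
```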

Please note that both ideal and actual are convnetjs.Vol, and I obtain them during training iteration:

var log_sum = 0;
for (var k = 0; k < data.length; k++)
{
    trainer.train(data[k][0], data[k][1]);
    var actual = network.forward(data[k][0]);
    log_sum += log_error(actual, data[k][1]);
}
var ace = -1*(log_sum / data.length);

The variable data is an array which holds a convnetjs.Vol as input data[k][0] and a convnetjs.Vol as output data[k][1] which have been empirically verified.

I understand that the learning_rate parameter plays a very important role (I've experimented with very small and very large values), as do the L1 and L2 decay values.

I haven't cross-validated accuracy yet, but why do I keep seeing those NaN values?
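Besides log(0) in the metric itself, a common cause of network-wide NaNs is exploding updates: with learning_rate 0.1 and several relu layers, a single large gradient step can push weights to Infinity, after which every forward pass yields NaN (lowering the learning rate, e.g. to 0.01 or 0.001, often helps). A small helper to find the first non-finite weight can narrow down when that happens. A sketch, assuming the convnetjs-style layout where a layer may have filters (an array of Vols) and biases (a Vol), each with a flat w array; the mock net here is for illustration:

```javascript
// Scan a net's weights and report the first non-finite value, or null if
// all weights are finite. Assumes layers shaped like convnetjs fc layers:
// { filters: [Vol, ...], biases: Vol }, each Vol carrying a flat `w` array.
function findBadWeight(net) {
  for (var l = 0; l < net.layers.length; l++) {
    var vols = (net.layers[l].filters || []).concat(
      net.layers[l].biases ? [net.layers[l].biases] : []);
    for (var v = 0; v < vols.length; v++) {
      for (var i = 0; i < vols[v].w.length; i++) {
        if (!isFinite(vols[v].w[i])) {
          return { layer: l, vol: v, index: i, value: vols[v].w[i] };
        }
      }
    }
  }
  return null;
}

// Mock net with one NaN weight, for illustration:
var net = { layers: [
  { filters: [{ w: [0.1, -0.2] }], biases: { w: [0.0] } },
  { filters: [{ w: [0.3, NaN] }], biases: { w: [0.0] } }
]};
console.log(findBadWeight(net)); // { layer: 1, vol: 0, index: 1, ... }
```

Calling this after each epoch (or each batch) pinpoints the first iteration at which the weights diverge.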

alexge233 avatar Mar 07 '16 18:03 alexge233

^ For a similar setup I have the same issue :+1:

franciscovargas avatar Apr 03 '16 04:04 franciscovargas

Yeah, I'm also having NaN issues with a 2-hidden-layer network running deep Q-learning. Everything is OK for a while, and then suddenly I find all my network weights have turned into NaNs.

djminkus avatar Dec 01 '20 19:12 djminkus