Chapter 4 - How is weight_delta computed?
Hello,
I have finished Chapter 4, but I have a question regarding weight_delta. It is computed as delta * input. The book says that weight_delta is the derivative of the error, right? On page 60, for example, the error is error = ((0.5 * weight) - 0.8) ** 2.
When I give this error function to Wolfram Alpha, it returns the derivative 0.5 * x - 0.8 (where x = weight). So, in general, the derivative of the error should be input * weight - goal_pred.
So why do they use delta * input for weight_delta if weight_delta is the derivative?
I think the derivative is 2 * ((0.5 * weight) - 0.8) * 0.5, that is, 2 * 0.5 * ((0.5 * weight) - 0.8), so the result is 0.5 * weight - 0.8.
The general formula for this is 2 * ((input * weight) - goal_pred) * input. In neural networks, people usually don't care about the exact constant coefficient of the derivative, so they simply omit the 2 and keep the key part of the derivative.
Hmm, if error = (input * weight - goal_pred) ** 2, should the derivative not be 2 * (input * weight - goal_pred)?
Since in this example input = 2, it's the same, but I'm also confused...
The derivative should be 2 * input * (input * weight - goal_pred); that's the chain rule. You also have to take the derivative of the inner expression input * weight with respect to weight, which contributes the factor of input.
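A quick finite-difference check confirms this (just a sketch, not code from the book; it uses the page-60 numbers input = 0.5 and goal_pred = 0.8, with weight = 0.5 as an assumed starting value):

input, goal_pred, weight = 0.5, 0.8, 0.5

def error(w):
    return ((input * w) - goal_pred) ** 2

pred = input * weight
analytic = 2 * input * (pred - goal_pred)   # chain rule: the inner derivative is input, not weight

h = 1e-6
numeric = (error(weight + h) - error(weight - h)) / (2 * h)   # central difference

print(analytic, numeric)   # both print approximately -0.55

Note that analytic is just 2 * delta * input, i.e. the book's weight_delta scaled by the constant 2.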

As far as I can tell, weight_delta in Chapter 4 is calculated via the delta rule.
Just to clarify: the delta rule is an update rule for single-layer neural networks that makes use of gradient descent. Backpropagation is an update rule for multi-layer neural networks, also based on gradient descent.
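For reference, here is a minimal single-weight gradient-descent loop in the spirit of the Chapter 4 listings (a sketch rather than the book's exact code; alpha = 0.1 is an assumed learning rate), where weight_delta = delta * input is exactly that delta-rule update:

weight, goal_pred, input = 0.0, 0.8, 0.5
alpha = 0.1   # assumed learning rate, for illustration

for iteration in range(20):
    pred = input * weight
    error = (pred - goal_pred) ** 2
    delta = pred - goal_pred          # how far off the prediction is, and in which direction
    weight_delta = delta * input      # delta rule: the derivative with the constant 2 omitted
    weight = weight - alpha * weight_delta
    print("Error: " + str(error) + " Prediction: " + str(pred))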
But if we use direction_and_amount = (pred - goal_pred) * input * 2 (i.e., not omitting the two), doesn't the model converge much faster?
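It does take bigger steps, but only because the 2 scales the step size: keeping the 2 is exactly equivalent to omitting it and doubling alpha, so it is a learning-rate change, not a more correct gradient (and it can overshoot just the same if alpha is already too large). A small sketch with assumed alpha values:

def run(alpha, keep_two, steps=10):
    weight, goal_pred, input = 0.0, 0.8, 0.5
    for _ in range(steps):
        pred = input * weight
        delta = pred - goal_pred
        grad = delta * input * (2 if keep_two else 1)
        weight = weight - alpha * grad
    return weight

print(run(alpha=0.1, keep_two=True))    # identical result to...
print(run(alpha=0.2, keep_two=False))   # ...omitting the 2 with doubled alpha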