MachineLearningWithPython

Incorrect data in training

Open javixeneize opened this issue 3 years ago • 2 comments

Hi

This code looks wrong:

print("Training True : {0} ({1:0.2f}%)".format(len(y_train[y_train[:] == 1]), (len(y_train[y_train[:] == 1])/len(df.index) * 100.0)))
print("Training False : {0} ({1:0.2f}%)".format(len(y_train[y_train[:] == 0]), (len(y_train[y_train[:] == 0])/len(df.index) * 100.0)))
print("Test True : {0} ({1:0.2f}%)".format(len(y_test[y_test[:] == 1]), (len(y_test[y_test[:] == 1])/len(df.index) * 100.0)))
print("Test False : {0} ({1:0.2f}%)".format(len(y_test[y_test[:] == 0]), (len(y_test[y_test[:] == 0])/len(df.index) * 100.0)))

Training True : 537 (69.92%)
Training False : 537 (69.92%)
Test True : 231 (30.08%)
Test False : 231 (30.08%)

When counting the occurrences of 1 with len(y_train[y_train[:] == 1]), the expression returns every row instead of only the rows that match, which is why the True and False counts above are identical. In fact, if you change the condition to == 5, it still returns the full length of the array.
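A minimal sketch of why this can happen, assuming y_train is a single-column pandas DataFrame rather than a Series (the thread does not show how it was built, and the tiny frame below is a hypothetical stand-in). Comparing the whole frame produces a boolean DataFrame, and indexing with a boolean DataFrame keeps every row and fills the non-matching cells with NaN, so len() never changes:

```python
import pandas as pd

# Hypothetical stand-in for the issue's y_train: a single-column DataFrame.
y_train = pd.DataFrame({"diabetes": [1, 0, 0, 1, 0]})

# Comparing the whole frame yields a boolean DataFrame, not a boolean Series.
mask = y_train[:] == 1

# Indexing with a boolean DataFrame masks cells instead of dropping rows:
# non-matching values become NaN and the row count stays the same.
masked = y_train[mask]

rows_for_1 = len(y_train[y_train[:] == 1])
rows_for_5 = len(y_train[y_train[:] == 5])

print(rows_for_1, rows_for_5)                # 5 5
print(int(masked["diabetes"].isna().sum()))  # 3 cells were masked to NaN
```

Under that assumption, the reported count is the number of rows in the split regardless of the value being compared.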

javixeneize, Dec 16 '22 12:12

I was able to get the code segment to work properly by changing [:] to ['diabetes'].
print("Training True : {0} ({1:0.2f}%)".format(len(y_train[y_train['diabetes'] == 1]), (len(y_train[y_train['diabetes'] == 1])/len(df.index) * 100.0))) etc...
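The same fix can be sketched more compactly by selecting the column first, so the comparison yields a boolean Series whose True values can be summed. This is only an illustration: the helper name class_counts and the small frames are hypothetical, and the percentages here are taken relative to each split's own length rather than len(df.index) as in the original code, which is an assumption about what the output is meant to show.

```python
import pandas as pd

# Hypothetical stand-ins for the issue's y_train and y_test.
y_train = pd.DataFrame({"diabetes": [1, 0, 0, 1, 0, 0]})
y_test = pd.DataFrame({"diabetes": [0, 1, 0]})


def class_counts(y, column="diabetes"):
    """Return (true_count, false_count) for a single-column label frame."""
    labels = y[column]  # a Series, so == gives a boolean Series
    return int((labels == 1).sum()), int((labels == 0).sum())


for name, y in (("Training", y_train), ("Test", y_test)):
    n_true, n_false = class_counts(y)
    total = len(y)  # assumption: percentage of this split, not of the full df
    print("{0} True : {1} ({2:0.2f}%)".format(name, n_true, n_true / total * 100.0))
    print("{0} False : {1} ({2:0.2f}%)".format(name, n_false, n_false / total * 100.0))
```

Summing a boolean Series counts the matches directly, which avoids building a filtered copy just to take its length.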

jdobrott, Jan 19 '23 16:01

Thanks guys! I fixed the qualifier error. Good catch!

JerryKurata, Jan 27 '23 01:01