The smaller the cross-entropy, the smaller the difference between the predicted probability distribution and the correct probability distribution. If images of cars often have a red first pixel, we want the score for car to increase. We achieve this by […]