First, we need to talk about data. So you
see here we have basically a CSV with a bunch of data right?
We have column 1 and column 2 and if you’re a bit perceptive you notice that
column 3 is a simple multiplication operation between column 1 and column 2.
Now what a machine learning algorithm would do, essentially, is if you would
tell it hey I have this data I want to figure out what’s the relation between
column 1 and column 2 that can generate column 3? A good machine learning
algorithm would basically be able to figure out “oh it’s multiplication” and
you see here we have a training data set and testing data set and that’s because
when we train a machine learning algorithm, we when we try to make it
figure out relations we do so on a training data set and then we want to
test that algorithm to make sure that it’s learned something meaningful on
some data that it’s never seen before and we call that a testing data set
where we don’t actually give the solution