Let’s try to predict our chances of getting admitted to MSc studies, based on our BSc degree GPA and years of experience.
Our CSV needs two “X” columns, BSc GPA and experience, and a “y” column with the admitted value (0 / 1).
We can simulate some data with spreadsheet functions (I’m using LibreOffice Calc on Ubuntu):
Use random values for BSc GPA and experience, and this formula for admittance, which takes both “X” values into account:
=IF((A2+0.02*B2)>0.84, 1, 0)
bsc-gpa, #experience, admitted
0.76, 6, 1
0.81, 1, 0
0.77, 2, 0
0.82, 5, 1
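If you’d rather skip the spreadsheet, here’s a minimal Python sketch that simulates the same CSV. The sample size, GPA range, and experience range are my own assumptions; the admittance rule is the same IF() formula from above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 100
gpa = rng.uniform(0.6, 0.95, n).round(2)   # BSc GPA on a 0-1 scale (assumed range)
experience = rng.integers(0, 8, n)         # years of experience (assumed range)
# Same rule as the spreadsheet formula: admitted if gpa + 0.02*experience > 0.84
admitted = ((gpa + 0.02 * experience) > 0.84).astype(int)

df = pd.DataFrame({"bsc-gpa": gpa, "#experience": experience, "admitted": admitted})
df.to_csv("admittance.csv", index=False)
print(df.head())
```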
Now let’s drag this CSV to MLPlayground.org:
Our plot makes visual sense given our admittance formula: a very good GPA will get you admitted even with no experience, and a very low one won’t, regardless of experience. In the middle we have all the candidates with a mixture of both.
Let’s hit logistic regression:
And after some more tweaks, with a better cost and, thus, a better training accuracy:
In this case a simple linear decision boundary was indeed sufficient. When you think about it, it actually reflects the “linear logic” we used in our admittance formula: there aren’t any “orange” points below ~0.7 GPA, and every point above ~0.83 GPA is orange. In between, the boundary follows some linear “y = mx + b”.
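The playground doesn’t expose its internals, but a rough scikit-learn equivalent (my assumption, not the site’s actual code) shows that “y = mx + b” line falling straight out of the fitted weights:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Re-simulate data with the same admittance rule (ranges assumed, as before)
rng = np.random.default_rng(0)
gpa = rng.uniform(0.6, 0.95, 200)
exp = rng.integers(0, 8, 200)
y = ((gpa + 0.02 * exp) > 0.84).astype(int)
X = np.column_stack([gpa, exp])

clf = LogisticRegression(C=10).fit(X, y)
print("training accuracy:", clf.score(X, y))

# The decision boundary is where w0*gpa + w1*exp + b = 0,
# i.e. gpa = -(w1/w0)*exp - b/w0 -- a straight line in the plot.
(w0, w1), b = clf.coef_[0], clf.intercept_[0]
print(f"boundary: gpa = {-w1 / w0:.3f} * exp + {-b / w0:.3f}")
```

Since the labels were generated by a linear rule, the model’s boundary slope should land close to the formula’s -0.02 GPA per year of experience.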
Finally, a prediction: for a candidate with a BSc GPA of 0.80 and more than 4 years of experience, we can predict admittance with high confidence 🙂
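The same prediction can be sketched in code. Again this uses scikit-learn on re-simulated data as a stand-in for the playground’s model, so the exact probability will differ from the site’s:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Re-simulate data with the same admittance rule (ranges assumed)
rng = np.random.default_rng(1)
gpa = rng.uniform(0.6, 0.95, 200)
exp = rng.integers(0, 8, 200)
y = ((gpa + 0.02 * exp) > 0.84).astype(int)

clf = LogisticRegression(C=10).fit(np.column_stack([gpa, exp]), y)

# The candidate from the post: GPA 0.80, 5 years of experience
candidate = [[0.80, 5]]
print("admitted?", clf.predict(candidate)[0])
print("probability:", clf.predict_proba(candidate)[0, 1])
```

By the original formula this candidate scores 0.80 + 0.02 × 5 = 0.90, comfortably above the 0.84 threshold, so the model should agree.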
In a more real-life example we’ll probably need non-linear decision boundaries, such as polynomial features or Gaussian kernels, which I’ll post about later.