Let’s test-drive the simdfied library with a linear regression example.

We’ll use MLplayground.org, which uses simdfied for machine learning and can read csv or mat files. For example, a csv file listing house prices by square footage and number of bedrooms:

square foot, #bedrooms, price

2461.68, 4, 467883

1872, 4, 385983

And so on…
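To make the file layout concrete, here is a minimal plain JavaScript sketch of turning such a csv into feature rows and a price column. The `parseHousingCsv` helper is hypothetical (it is not part of simdfied or MLplayground), and it assumes the first line is a header and the last column is the price.

```javascript
// Hypothetical helper, not part of simdfied: splits csv text into
// feature rows (X) and a target column (y).
// Assumes a header on the first line and the price in the last column.
function parseHousingCsv(text) {
  const lines = text
    .split(/\r?\n/)
    .map(line => line.trim())
    .filter(line => line.length > 0); // skip blank lines

  const header = lines[0].split(',').map(s => s.trim());
  const rows = lines
    .slice(1)
    .map(line => line.split(',').map(s => Number(s.trim())));

  return {
    header,
    X: rows.map(r => r.slice(0, -1)),   // every column except the last
    y: rows.map(r => r[r.length - 1]),  // last column: price
  };
}

// The two sample rows shown above, with their original spacing.
const sample = [
  'square foot, #bedrooms, price',
  '2461.68 , 4 , 467883',
  '1872, 4 , 385983',
].join('\n');

const parsed = parseHousingCsv(sample);
console.log(parsed.X); // rows [2461.68, 4] and [1872, 4]
console.log(parsed.y); // prices 467883 and 385983
```

Here each row of `X` is one house. Whether simdfied expects one row per house or one row per feature is not something this sketch assumes.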

At first, the plot of our file looks like this:

In our case, the y data is a continuous value, not a class label, so coloring by y makes no sense. Let’s change the y axis to the price and the color axis to the number of bedrooms. Now we get:

Which makes more visual sense. Now let’s run linear regression from the menu:

We can see that the cost is going down smoothly, but it can still go down further. The regression line itself is far from being a straight line, though.

The plot draws a line connecting all the hypothesis function results for the existing X data. Since our “theta” vector is not optimal yet, each price point is not yet optimally “centered” around the imaginary straight line, and since our data points are not ordered by our current x-axis, we get this “polygraph”-like line.
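A small plain JavaScript sketch of the ordering effect, under stated assumptions: the `theta` values below are made up for illustration (not a fitted result), and the three points reuse the two sample rows plus the 900 square foot, 3 bedroom house used later in the post. Connecting predictions in file order makes the x value jump back and forth, while sorting by the plotted axis first removes that part of the zigzag.

```javascript
// Linear regression hypothesis with two features:
// h(x) = theta0 + theta1 * squareFeet + theta2 * bedrooms
function hypothesis(theta, x) {
  return theta[0] + theta[1] * x[0] + theta[2] * x[1];
}

// Illustrative values only: this theta is an assumption, not a fitted model.
const theta = [50000, 150, 10000];

// [squareFeet, bedrooms], deliberately not sorted by square footage.
const points = [
  [2461.68, 4],
  [900, 3],
  [1872, 4],
];

// Connecting predictions in file order: x goes 2461.68 -> 900 -> 1872,
// so the drawn line doubles back on itself.
const inFileOrder = points.map(x => ({
  sqft: x[0],
  predicted: hypothesis(theta, x),
}));

// Sorting by the plotted x-axis first gives a left-to-right line.
const sortedBySqft = [...inFileOrder].sort((a, b) => a.sqft - b.sqft);

console.log(inFileOrder.map(p => p.sqft));  // file order
console.log(sortedBySqft.map(p => p.sqft)); // ascending order
```

Sorting only fixes the back-and-forth drawing order. With two features, the predictions plotted against square footage alone can still wiggle wherever the bedroom count changes.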

Now let’s do some iterative optimization. Performance-wise, I always prefer to tune the alpha learning rate before adding more iterations.

At alpha = 0.03 we get a line that is closer to straight, and our cost is getting close to its minimum, though the cost curve is starting to take on an “elbow” shape. The “elbow” shape tells us we have probably reached the highest, if not too high, alpha learning rate:

At the same alpha, adding 500 iterations gives us our first “straight line” of prediction, but the “elbow” shape of the cost curve got worse:

After playing with more variations, we can get the same linear result with alpha = 0.1 and 100 iterations. Performance-wise, this would be a nice choice.
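For readers who want to see the alpha versus iterations trade-off outside the playground, here is a minimal sketch of batch gradient descent in plain JavaScript. It is not simdfied’s implementation. It assumes the features are standardized before training (the post does not say whether MLplayground does this), and the data is synthetic, generated from a made-up formula rather than taken from the csv file.

```javascript
// Scale each feature column to mean 0 and standard deviation 1.
function standardize(X) {
  const m = X.length;
  const n = X[0].length;
  const mean = new Array(n).fill(0);
  const std = new Array(n).fill(0);
  for (const row of X) row.forEach((v, j) => { mean[j] += v / m; });
  for (const row of X) row.forEach((v, j) => { std[j] += (v - mean[j]) ** 2 / m; });
  for (let j = 0; j < n; j++) std[j] = Math.sqrt(std[j]);
  const scaled = X.map(row => row.map((v, j) => (v - mean[j]) / std[j]));
  return { scaled, mean, std };
}

// Batch gradient descent for linear regression.
// costHistory[k] is the cost measured just before update k.
function gradientDescent(X, y, alpha, iterations) {
  const m = X.length;
  const n = X[0].length;
  let theta = new Array(n + 1).fill(0); // theta[0] is the intercept
  const costHistory = [];
  for (let it = 0; it < iterations; it++) {
    const grad = new Array(n + 1).fill(0);
    let cost = 0;
    for (let i = 0; i < m; i++) {
      let h = theta[0];
      for (let j = 0; j < n; j++) h += theta[j + 1] * X[i][j];
      const err = h - y[i];
      cost += err * err;
      grad[0] += err;
      for (let j = 0; j < n; j++) grad[j + 1] += err * X[i][j];
    }
    costHistory.push(cost / (2 * m));
    theta = theta.map((t, j) => t - (alpha / m) * grad[j]);
  }
  return { theta, costHistory };
}

// Synthetic data, made up for illustration:
// price = 50000 + 100 * squareFeet + 20000 * bedrooms
const features = [[1000, 2], [1500, 3], [2000, 3], [2500, 4], [3000, 5], [1200, 3]];
const prices = features.map(([sqft, beds]) => 50000 + 100 * sqft + 20000 * beds);

const { scaled } = standardize(features);
const slow = gradientDescent(scaled, prices, 0.03, 100);
const fast = gradientDescent(scaled, prices, 0.1, 100);

console.log('last recorded cost, alpha = 0.03:', slow.costHistory[99]);
console.log('last recorded cost, alpha = 0.1: ', fast.costHistory[99]);
```

On this synthetic data the larger alpha should reach a lower cost within the same 100 iterations, which is the reason to try raising alpha before paying for more iterations. On real data, an alpha that is too large can make the cost grow instead, so it is worth watching the cost curve after every change.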

The linear plot gives us a sense of which direction house prices will go in relation to the square footage and number of bedrooms provided. If we’d like an actual prediction, we can use simdfied directly (MLplayground will soon have the ability to predict a “y” for a new x vector). Something like this:

//load our X matrix with 2 features: square footage and number of bedrooms

var X = simdfied.mat().from2dArray( [2461.68, 1872, …], [4, 4, …] );

//load our y vector with house prices

var y = simdfied.vec().fromArray( [467883, 385983, …] );

//run and predict a price for a 3-bedroom, 900-square-foot house:

var ml = simdfied.ml().algo("linReg").X(X).y(y).set("iter", 100).set("alpha", 0.1);

ml.run( function(ml){ ml.predOne( [900, 3] ); } );
