Mr X

KNN with PHP ML & Rubix ML

This post is for anybody who has tried to migrate to Rubix ML from PHP ML and more specifically anybody who is experimenting with K-Nearest Neighbors.

Like probably anybody who gets familiar with what regression can do, I decided to use it to try to win the lottery. The argument is that regression occasionally gets surprisingly close to generating winning number combinations.

You can check my blog and data tables on my lottery predictions website. I am currently mapping some of the data to graphs as part of an ongoing trend analysis.

Anyway, I discovered when revisiting Rubix ML after starting with PHP ML that there were large differences in how the libraries handle data and its labels.

The process is simple. I generate a multi-dimensional array in which each item holds an integer for the month and a lottery number, plus a second array of labels: 1 for a winning number or 0 for a losing number. After training with this data, I ask the program which numbers are likely to appear in a draw for every month of the year.
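To make the data shapes concrete, here is a minimal sketch in plain PHP of how the samples, labels, and per-month candidates described above could be built. The draw history, the 1 to 30 number range, and the variable names (`$winningByMonth`, `$samples`, `$labels`, `$candidates`) are all made up for illustration; they are not my real data or the actual code behind `$this->tensor`.

```php
<?php
// Hypothetical draw history: month (1-12) => numbers that won in that month.
$winningByMonth = [
    1 => [5, 17],
    2 => [23],
];

$samples = []; // each item is [month, number]
$labels  = []; // 1 = winning number, 0 = losing number

// Assumed number range of 1..30; a real lottery's range may differ.
foreach ($winningByMonth as $month => $winners) {
    for ($number = 1; $number <= 30; $number++) {
        $samples[] = [$month, $number];
        $labels[]  = in_array($number, $winners, true) ? 1 : 0;
    }
}

// Candidates to classify after training: every number for every month,
// grouped under a zero-based month index (hence `$key + 1` in knntest()).
$candidates = [];
for ($month = 1; $month <= 12; $month++) {
    for ($number = 1; $number <= 30; $number++) {
        $candidates[$month - 1][] = [$month, $number];
    }
}
```

In this sketch, `$samples` and `$labels` play the role of `$this->tensor[0]` and `$this->tensor[1]`, and `$candidates` is what gets passed in as `$array`.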

Anyway, to cut a long story short, to use PHP ML I coded the following:

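// Imports needed at the top of the class file:
use Phpml\Classification\KNearestNeighbors;
use Phpml\Math\Distance\Minkowski;

public function knntest( $array, $distance ) {
    // $distance is passed as k, the number of neighbors to consult.
    $classifier = new KNearestNeighbors( $distance, new Minkowski( 4 ) );
    // tensor[0] holds the [month, number] samples, tensor[1] the 0/1 labels.
    $classifier->train( $this->tensor[0], $this->tensor[1] );
    $predictions = array();
    foreach ( $array as $key => $check ) {
        foreach ( $check as $_check ) {
            // Predict a single [month, number] sample.
            $input  = array( $_check[0], $_check[1] );
            $nkpred = $classifier->predict( $input );
            if ( $nkpred == '1' ) {
                $month         = $key + 1;
                $predictions[] = array(
                    'Month'   => $month,
                    'Numbers' => $_check[1],
                );
            }
        }
    }
    return $predictions;
}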

To process the same data with Rubix ML, I had to write the following:

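// Imports needed at the top of the class file:
use Rubix\ML\Classifiers\KNearestNeighbors;
use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Datasets\Unlabeled;
use Rubix\ML\Kernels\Distance\Minkowski;
use Rubix\ML\Transformers\OneHotEncoder;

public function knntest( $array, $distance ) {
    // Arguments: k, distance-weighted voting, distance kernel.
    $classifier = new KNearestNeighbors( $distance, true, new Minkowski( 4.0 ) );
    // Rubix ML wraps the samples and labels in a dataset object.
    $dataset = new Labeled( $this->tensor[0], $this->tensor[1] );
    $dataset->apply( new OneHotEncoder() );
    $classifier->train( $dataset );
    $predictions = array();
    foreach ( $array as $key => $check ) {
        // Each month's candidates are predicted as one batch.
        $input = new Unlabeled( $check );
        $input->apply( new OneHotEncoder() );
        $nkpred = $classifier->predict( $input );
        // predict() returns one label per sample, in sample order.
        foreach ( $nkpred as $i => $label ) {
            if ( (string) $label == '1' ) {
                $month         = $key + 1;
                $predictions[] = array(
                    'Month'   => $month,
                    'Numbers' => $check[ $i ][1],
                );
            }
        }
    }
    return $predictions;
}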

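As you can see, the contrast is huge. PHP ML takes plain arrays and I predict one sample at a time, while Rubix ML wants the data wrapped in dataset objects and returns a whole batch of predictions. The approach above was the only way I could get Rubix to process the same data sets and labels.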

I figure this might help anybody who wants to start with KNN and Rubix after migrating from PHP ML.

Personally, for the way I am using KNN, I am staying with PHP ML.

Rubix seems to group similar datasets together, which is interesting in itself.
