4-2

代码实现KNN算法

import numpy as np
from math import sqrt
from collections import Counter

def kNN_classify(k, X_train, y_train, x):
    assert 1 <= k <= X_train.shape[0], "k must be valid"
    assert X_train.shape[0] == y_train.shape[0], "the size of X_train must equal to the size of y_train"
    assert X_train.shape[1] == x.shape[0], "the feature number of x must be equal to X_train"

    distances = [sqrt(np.sum((x_train-x)**2)) for x_train in X_train]
    nearst = np.argsort(distances)

    topK_y = [y_train[i] for i in nearst[:k]]
    votes = Counter(topK_y)

    return votes.most_common(1)[0][0]

准备数据

调用算法

运行结果:predict_y = 1

什么是机器学习

KNN是一个不需要训练的算法 KNN没有模型,或者说训练数据就是它的模型

使用scikit-learn中的kNN

错误写法

这样写会报错: 原因是,predict为了兼容多组测试数据的场景,要求参数是个矩阵

正确写法

运行结果:predict_y[0] = 1

重新整理我们的kNN的代码

封装成sklearn风格的类

使用kNNClassifier

运行结果:predict_y[0] = 1

Last updated