function rate = KNN(Train_data,Train_label,Test_data,Test_label,k,Distance_mark)

% K-Nearest-Neighbor classifier (K-NN classifier)
%Input:
%     Train_data, Test_data are the training data set and the test data
%     set, respectively. (Each row is a data point.)
%     Train_label, Test_label are column vectors. They are the labels of the
%     training data set and the test data set, respectively.
%     k is the number of nearest neighbors.
%     Distance_mark           :   ['Euclidean', 'L2' | 'L1' | 'Cos']
%     'Cos' represents the cosine distance.
%Output:
%     rate: Accuracy of the K-NN classifier
%
%    Examples:
%      
% %Classification problem with three classes
% A = rand(50,300);
% B = rand(50,300)+2;
% C = rand(50,300)+3;
% % label vector for the three classes
% gnd = [ones(300,1);2*ones(300,1);3*ones(300,1)];
% fea = [A B C]';
% trainIdx = [1:150,301:450,601:750]';
% testIdx = [151:300,451:600,751:900]';
% fea_Train = fea(trainIdx,:);
% gnd_Train = gnd(trainIdx);
% fea_Test = fea(testIdx,:);
% gnd_Test = gnd(testIdx);
% rate = KNN(fea_Train,gnd_Train,fea_Test,gnd_Test,1)
%
%
%
%Reference:
%
% If you use my MATLAB code, we would appreciate it very much if you could cite the following paper:
% Jie Gui, Tongliang Liu, Dacheng Tao, Zhenan Sun, Tieniu Tan, "Representative Vector Machines: A Unified
% Framework for Classical Classifiers", IEEE Transactions on Cybernetics.
    
%This code was written by Gui Jie on the evening of 2009/03/11.
%If you find any bugs in the code, feel free to contact me.
if nargin < 5
    error('Not enough arguments!');
elseif nargin < 6
    Distance_mark='L2';
end
 
[n, dim]   = size(Test_data);     % number of test samples and feature dimension
train_num  = size(Train_data, 1); % number of training samples
% Normalize each feature to have zero mean and unit variance.
% If you need the following four lines, you can uncomment them.
% M        = mean(Train_data); % mean & std of the training data set
% S        = std(Train_data);
% Train_data = (Train_data - ones(train_num, 1) * M)./(ones(train_num, 1) * S); % normalize training data set
% Test_data            = (Test_data-ones(n,1)*M)./(ones(n,1)*S); % normalize data
U        = unique(Train_label); % class labels
nclasses = length(U);%number of classes
Result  = zeros(n, 1);
Count   = zeros(nclasses, 1);
dist=zeros(train_num,1);
for i = 1:n
    % compute distances between test data and all training data and
    % sort them
    test=Test_data(i,:);
    for j=1:train_num
        train = Train_data(j,:);
        V = test - train;
        switch Distance_mark
            case {'Euclidean', 'L2'}
                dist(j,1) = norm(V,2); % Euclidean (L2) distance
            case 'L1'
                dist(j,1) = norm(V,1); % L1 distance
            case 'Cos'
                dist(j,1) = acos(test*train'/(norm(test,2)*norm(train,2))); % cosine distance (angle in [0, pi])
            otherwise
                dist(j,1) = norm(V,2); % default: Euclidean distance
        end
    end
    [Dummy Inds] = sort(dist);
    % compute the class labels of the k nearest samples
    Count(:) = 0;
    for j = 1:k
        ind        = find(Train_label(Inds(j)) == U); % index in U of the label of the j-th nearest neighbor
        Count(ind) = Count(ind) + 1;
    end % Count(c) is the number of the k nearest neighbors that belong to class U(c)
    
    % determine the class of the data sample
    [dummy ind] = max(Count);
    Result(i)   = U(ind);
end
correctnumbers=length(find(Result==Test_label));
rate=correctnumbers/n;






-------------------------------------------------- End of code --------------------------------------------------

The difference between cosine distance and cosine similarity

Cosine similarity is the basic measure most commonly used in traditional document classification to measure the distance between documents. It measures the distance between two d-dimensional vectors by the difference in their angles; the resulting value lies between 0 and 1: the closer the angle between the two vectors, the closer the cosine similarity is to 1, and otherwise it approaches 0. Suppose there are two points a = [a1, a2, ..., ad] and b = [b1, b2, ..., bd] in d-dimensional space; then their cosine similarity can be expressed as:
cosineSimilarity(a,b) = dot(a,b) / (norm(a)*norm(b))   [I think this should be called cosineSimilarity rather than cosineDistance: the larger the similarity, the smaller the distance should be. For example, when the angle between a and b is 0 they are most similar, the similarity is largest, and the distance is smallest.]
dot(a,b) denotes the inner product of a and b. Since the vector inner product is defined as a·b = |a| × |b| × cosθ (in general θ ∈ [0, π], http://baike.baidu.com/view/1485493.htm), the value so defined is not confined to 0 ~ 1 but lies between -1 and 1. There are two ways to turn it into a distance:
(1) My code is correct: it uses acos to convert this cosine into an angle in [0, π]. The value need not be restricted to 0 ~ 1; my code maps it to [0, π], where a larger value means a larger distance;
(2) cosineDistance(a,b) = 1 - cosineSimilarity(a,b) = 1 - dot(a,b) / (norm(a)*norm(b)). The range of cosineDistance is then [0, 2].
Example:
a=[1 1 1]; b=[1 0 0];
cosineDistance = dot(a,b) / (norm(a)*norm(b))
cosineDistance =
    0.5774   [http://neural.cs.nthu.edu.tw/jang/books/dcpr/doc/02%E8%B7%9D%E9%9B%A2%E8%88%87%E7%9B%B8%E4%BC%BC%E5%BA%A6.pdf , already saved locally as 距离与相似度.pdf (Distance and Similarity)]
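As a minimal MATLAB sketch of the two conversions in (1) and (2) above, applied to the same vectors a and b (the variable names here are only illustrative):

a = [1 1 1]; b = [1 0 0];
cosSim  = dot(a,b) / (norm(a)*norm(b));   % cosine similarity, about 0.5774
angDist = acos(cosSim);                   % way (1): angle in [0, pi], about 0.9553
subDist = 1 - cosSim;                     % way (2): 1 - similarity, in [0, 2], about 0.4226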
"KDD17_Linearized GMM Kernels"的公式1是cosine的等价表示形式;“Min-Max Kernels 1503.01737”的公式1是Min-Max Kernel的标准定义

Page 3 of "Minimal Local Reconstruction Error Measure Based Discriminant Feature Extraction and Classification":
For robustness, before classification, we generally need to normalize the feature vectors so that each feature vector has length 1, that is, x -> x / ||x||.
The normalized Euclidean distance is equivalent to the cosine distance.
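A quick numerical check of that equivalence (a minimal sketch, not from the paper): for unit-length vectors the squared Euclidean distance equals 2 - 2*cosineSimilarity, so the two distances order neighbors identically.

x = rand(1,5); y = rand(1,5);          % arbitrary feature vectors
xn = x / norm(x); yn = y / norm(y);    % normalize each to unit length
d_euc2 = norm(xn - yn)^2;              % squared Euclidean distance after normalization
cosSim = dot(xn, yn);                  % cosine similarity (the norms are now 1)
disp(d_euc2 - (2 - 2*cosSim))          % zero up to floating-point error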
(1) My junior colleague Lin Zhu pointed out that replacing the loop with a precomputed distance matrix saves time, because MATLAB loops are slow; however, for large sample sets the loop is still necessary, otherwise the computation runs out of memory (see the vectorized sketch below). I recall that Jinsong also provided a KNN code in his course, and his was likewise implemented with loops. MATLAB has a built-in function knnclassify; the code SPP_1NN.m from the paper "Sparsity preserving projections" uses that function. On ASLAN, my KNN and knnclassify give exactly the same recognition rate.
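A hedged sketch of the vectorized alternative, assuming the full test-by-train matrix fits in memory (pdist2 requires the Statistics Toolbox; the bsxfun line is a toolbox-free Euclidean equivalent; D, Inds, nearestLabels are illustrative names, the rest follow the code above):

% Squared distances ||x||^2 + ||y||^2 - 2*x*y', clipped at 0 for numerical safety
D = sqrt(max(bsxfun(@plus, sum(Test_data.^2, 2), sum(Train_data.^2, 2)') ...
             - 2 * (Test_data * Train_data'), 0));
% With the Statistics Toolbox the same matrix is simply: D = pdist2(Test_data, Train_data);
[~, Inds] = sort(D, 2);                                 % sort neighbors for every test point at once
nearestLabels = reshape(Train_label(Inds(:, 1:k)), [], k); % n-by-k matrix of neighbor labels
Result = mode(nearestLabels, 2);                        % majority vote per row (ties go to the smaller label)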
(2) An extremely important point: the fourth-to-last line of the program must not be written as Result(i) = ind; that happens to work for data sets such as Yale whose labels are simply 1, 2, 3, ..., but it breaks for binary classification with labels 1 and -1. SRC_QC and SRC_QC2 have the same issue: their third-to-last line must not use Result(i) = index but Result(i) = classLabel(index). Originally only that one place was fixed; in fact line 50 of SRC_QC2 and line 42 of SRC_QC also need ii changed to classLabel(ii). It was precisely this bug that produced the wrong result of a 50% error rate with zero variance for SRC on ASLAN. The corrected SRC_QC2 and SRC_QC programs are in the ASLAN directory.
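A minimal illustration of this pitfall with hypothetical labels; U, Count and ind follow the naming in the code above:

U = unique([1; -1; 1; -1]);    % U = [-1; 1] for a binary problem labeled 1 / -1
Count = [5; 3];                % suppose 5 of the k neighbors belong to class U(1) = -1
[~, ind] = max(Count);         % ind = 1 is an index into U, not a class label
% Wrong: Result(i) = ind;      stores 1, i.e. the *other* class
% Right: Result(i) = U(ind);   stores -1, the actual winning label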