The tutorial I used to learn libsvm
piaip's Using (lib)SVM Tutorial: a simple introduction to (lib)SVM
piaip at csie dot ntu dot edu dot tw,
Hung-Te Lin
Fri Apr 18 15:04:53 CST 2003
$Id: svm_tutorial.html,v 1.12 2005/10/26 06:12:40 piaip Exp piaip $ Original author: Hung-Te Lin. Please keep the original source when reposting.
Why this tutorial is here
I've always found SVM an interesting and useful tool, but I was never able to attend the "Data mining and SVM" course by Prof. Chih-Jen Lin (cjlin), mostly due to scheduling conflicts. After reading some material on the internet and hearing kcwu explain how to use libsvm, I wanted to organize my notes into a tutorial for people who want to use libsvm without needing to know the complete theory behind SVM. The original README and FAQ files that come with libsvm are good documents too, but you may need some basic knowledge of SVM and its workflow before you can follow them (that's how I felt when I was reading them). This tutorial is written specifically for those starting from zero.
Some people later provided feedback and helped me improve this tutorial, so I must thank them here:
kcwu, biboshen, puffer, somi
Remember that some of the statements below may not be strictly correct. But for those who just wish to USE SVM, I think explaining it this way is easier to understand.
This tutorial is basically for people who already know how to write simple programs. It's also a memo to myself. Neither much mathematics nor any prior SVM knowledge is required.
If you still can't understand this tutorial, there are three possibilities: 1. I didn't explain clearly enough, 2. You lack sufficient common knowledge, 3. You don't use your brain properly ^^;
I started writing this with no understanding of the subject myself, and this tutorial has been read by many people who also didn't understand SVM but gained a certain level of understanding after reading it, so let's assume possibility 1 doesn't happen. If you can't understand it, you must belong to one of the latter two categories :P So if you have questions after reading this, don't ask me.
SVM: What is it and what can it do for me?
SVM, Support Vector Machine, is in short something whose origins are somewhat similar to those of neural networks, but these days it is most commonly used for classification. That means: if I have a pile of things that have already been sorted into classes (but the rules used to classify them are unknown!), then when a new item arrives, SVM can predict which class the new data should go to.
It sounds marvelous (if you don't think so, consider again what this means: the rules used for classification are unknown! If it still doesn't sound marvelous, try writing a program to solve the problem yourself), and it seems to require advanced techniques like AI or some time-consuming, complex computation. But SVM, which is based on statistical learning theory, can solve this problem neatly in reasonable time.
To get more familiar with SVM, you may refer to the slides cjlin used in his Data Mining course: pdf or ps.
Below, I'm going to try to explain and use libsvm without requiring you to read those slides.
So we can treat SVM as a black box: throw the data in, let it do the processing, and then use the output.
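To make the black-box idea concrete, here is a minimal Python sketch of the "train on classified data, then predict new data" workflow. Be warned: this is NOT an SVM and NOT libsvm's API. The functions `train` and `predict`, the nearest-centroid rule inside them, and the sample data are all assumptions made up for illustration; the only point is the shape of the workflow.

```python
# Toy stand-in for the "black box": NOT an SVM and NOT libsvm's API.
# It only illustrates the train-then-predict workflow.

def train(samples, labels):
    """Build a 'model' from already-classified data: one mean point (centroid) per label."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(model, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda y: dist2(model[y]))


# Hypothetical data: two numeric features per item, labels -1 / +1.
samples = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9],
           [4.0, 4.2], [3.8, 4.1], [4.2, 3.9]]
labels = [-1, -1, -1, 1, 1, 1]

model = train(samples, labels)      # data goes into the box
print(predict(model, [1.0, 1.0]))   # new item near the first group
print(predict(model, [4.0, 4.0]))   # new item near the second group
```

A real SVM draws the boundary between classes in a much smarter way, but from the user's side the two steps look the same: first build a model from classified data, then ask the model about new data.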
How do I get SVM?
Chih-Jen Lin's (cjlin) libsvm is of course the best tool you can find.
Download libsvm
Download location:
libsvm.zip or libsvm.tar.gz
The contents of the .zip and .tar.gz are basically the same; which one to pick depends on your OS. Windows users usually find .zip more convenient (because they have WinZIP, though I always use WinRAR), while UNIX users mostly use .tar.gz.
Build libsvm
After extracting it, suppose it is
