Mr. Balaji Krishnapuram,
Hi. Could you mail the presentation file (.ppt?) to me? Thank you very much.
-------------------
>>When you use a kernel function to map training samples to
>>a feature space, how can you be sure that the samples are
>> being mapped to a higher dimensional space and not to a
>> lower dimensional space?
>
>It is not correct to say that every kernel (as in Mercer kernel) maps training samples to a higher dimensional space. For example, consider this case:
>Use a linear transformation of the original observation space, i.e. phi(x)=B*x, where x is the input feature vector and B is the transformation matrix. If B has fewer rows than the dimensionality of x, the feature space consists of linear combinations of the input dimensions and has lower dimensionality than the original vector x. Now the kernel is nothing but an inner product in the induced feature space:
>i.e. the kernel function here is K(x,y)=<phi(x),phi(y)>=transpose(x)*transpose(B)*B*y, using the inner product definition <a,b>=transpose(a)*b.
>
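>This is a valid kernel for an SVM in the sense that it is a Mercer kernel: its Gram matrix is positive semi-definite for any B chosen, because for any coefficients c_i the sum over i,j of c_i*c_j*K(x_i,x_j) equals the squared norm of B*(sum of c_i*x_i), which is never negative. Thus you have just constructed a kernel whose feature map takes data from a higher dimensional space to a lower dimensional space.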
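The construction above can be checked numerically. The following is only a minimal sketch in Python with NumPy: the sizes d, k and n, the random matrix B and the random samples X are all assumptions made for illustration and do not come from the original discussion.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, illustration only

d, k, n = 5, 2, 8                # input dim, feature dim (k < d), sample count (all assumed)
B = rng.standard_normal((k, d))  # hypothetical transformation matrix with fewer rows than d
X = rng.standard_normal((n, d))  # hypothetical samples, one per row


def phi(x):
    """Explicit feature map phi(x) = B*x, landing in a k-dimensional space."""
    return B @ x


def kernel(x, y):
    """K(x, y) = transpose(x)*transpose(B)*B*y, i.e. <phi(x), phi(y)>."""
    return x @ B.T @ B @ y


# Gram matrix of the kernel on the sample
gram = np.array([[kernel(xi, xj) for xj in X] for xi in X])

eigvals = np.linalg.eigvalsh(gram)        # real eigenvalues of the symmetric Gram matrix
feature_dim = phi(X[0]).shape[0]          # dimensionality of the feature space
gram_rank = np.linalg.matrix_rank(gram)   # cannot exceed the feature dimension

print("feature dimension:", feature_dim)
print("smallest Gram eigenvalue:", eigvals.min())
print("Gram rank:", gram_rank)
```

The smallest eigenvalue should be zero up to rounding error, and the rank of the Gram matrix cannot exceed the number of rows of B, which reflects the lower dimensional feature space.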
>
>The important point is:
>1. The kernel has to satisfy the Mercer conditions
>2. It maps into a feature space that you can find explicitly as the eigenfunction space of the kernel function (see, for example, the chapter in Cristianini & Shawe-Taylor, "An Introduction to Support Vector Machines and Other Kernel-based Learning Methods", or even the introduction in Burges' tutorial paper, available from www.kernel-machines.org).
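A finite-sample analogue of point 2 can be sketched by eigendecomposing the Gram matrix instead of the kernel function itself. This is an illustration under stated assumptions, not the construction from the cited texts: the RBF kernel, the value of gamma, and the random samples are all chosen here only for the example.

```python
import numpy as np

rng = np.random.default_rng(1)    # fixed seed, illustration only
X = rng.standard_normal((6, 3))   # hypothetical data: 6 samples in 3 dimensions


def rbf_gram(X, gamma=0.5):
    """Gram matrix of the RBF kernel exp(-gamma * ||x - y||^2); gamma is an assumed value."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))  # clip tiny negative distances from rounding


K = rbf_gram(X)

# K is symmetric positive semi-definite, so K = V * diag(w) * transpose(V) with w >= 0
w, V = np.linalg.eigh(K)
w = np.clip(w, 0.0, None)         # remove tiny negative eigenvalues caused by rounding

# Row i of Phi is an explicit feature vector for sample i
Phi = V * np.sqrt(w)

# Inner products of these feature vectors reproduce the kernel values
recon_err = np.max(np.abs(Phi @ Phi.T - K))
print("feature matrix shape:", Phi.shape)
print("max reconstruction error:", recon_err)
```

Here the three-dimensional inputs receive six-dimensional feature vectors, so this particular kernel does map the sample into a higher dimensional space, in contrast to the B*x example.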
>
>>Also, how can you be sure that the samples can be
>>separated by a linear surface in that higher dimensional space?
>
>Again, you cannot be sure of any such thing. It may be impossible to separate the data in any way under the sun, since their class-conditional PDFs may overlap. All that the kernel can do is map the data into the space you want in a computationally efficient way (which generally works well in many cases, even with standard kernels like the RBF, but not necessarily). I made a presentation on this topic and can mail you the file (PowerPoint) if you want further information.
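The remark about overlapping class-conditional PDFs can be made concrete with a small calculation. The sketch below assumes two one-dimensional Gaussian classes with equal priors, means -1 and +1, and unit variance; these numbers are invented for illustration. It computes the Bayes error, which is the lowest expected error any classifier can reach on such data, whatever kernel is used.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


# Assumed class-conditional densities: N(mu0, sigma^2) and N(mu1, sigma^2), equal priors
mu0, mu1, sigma = -1.0, 1.0, 1.0

# With equal priors and equal variances the optimal decision threshold is the midpoint
threshold = 0.5 * (mu0 + mu1)

# Probability that class 0 falls above the threshold plus class 1 falling below it
err_class0 = 1.0 - normal_cdf((threshold - mu0) / sigma)
err_class1 = normal_cdf((threshold - mu1) / sigma)
bayes_error = 0.5 * err_class0 + 0.5 * err_class1

print("Bayes error:", bayes_error)
```

Because the two densities overlap, this error is strictly positive. A kernel may still separate a particular finite training set, but the expected error on new samples cannot drop below this value.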
>
>Regards,
>Balaji Krishnapuram
longbinchen
------------
P.O.BOX 2728# ,NLPR,
BeiJing P.R.China,100080
TEL:+86-10-82613867
13661238601
ICQ:136386021
Institute of Automation,
Chinese Academy of Sciences (CASIA)
P.R.China