Sure. Please find it attached.

Balaji


----- Original Message -----
From: "Longbin Chen" <[log in to unmask]>
To: <[log in to unmask]>
Sent: Thursday, December 06, 2001 2:50 AM
Subject: Re: kernel functions


> Mr. Balaji Krishnapuram
>
> Hi. Can you mail the presentation file (.ppt?) to me? Thank you very much.
>
> -------------------
> >>When you use a kernel function to map training samples to
> >>a feature space, how can you be sure that the samples are
> >> being mapped to a higher dimensional space and not to a
> >> lower dimensional space?
> >
> >It is not correct to say that every kernel (as in Mercer kernel) maps
> >training samples to a higher dimensional space. For example, consider
> >this case:
> >Use a transformation matrix on the original space of observations, i.e.
> >phi(x)=B*x, where x is the input feature vector and B is the
> >transformation matrix. By choosing B with fewer rows than the
> >dimensionality of x, you have chosen a feature space whose coordinates
> >are linear combinations of the input vector dimensions, and which has
> >lower dimensionality than the original feature vector x (phi(x) has one
> >component per row of B). Now the kernel is nothing but an inner product
> >in the induced feature space, i.e. the kernel function here is
> >K(x,y)=<phi(x),phi(y)>=transpose(x)*transpose(B)*B*y, from the inner
> >product definition <a,b>=transpose(a)*b. (Since K(x,y) is a scalar, this
> >is equal to transpose(y)*transpose(B)*B*x.)
> >
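> >This is a valid kernel for an SVM in the sense that it is a Mercer
> >kernel: its Gram matrix G will be positive semi-definite for any B
> >(nonzero) chosen, because for any coefficient vector z,
> >transpose(z)*G*z is the squared norm of sum_i z_i*phi(x_i) and so cannot
> >be negative. Thus you have just constructed a kernel which maps data
> >from a higher dimensional space to a lower dimensional space.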
> >
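The construction above can be checked numerically. What follows is only a minimal sketch in plain Python, not something from the original message: the matrix B, the sample points, and the helper names (phi, kernel, gram, quadratic_form) are all made up for illustration. It builds K(x,y)=<B*x,B*y> for a 2x3 matrix B, so the feature space has lower dimension than the input space, and evaluates transpose(z)*G*z for random z to illustrate that the Gram matrix is positive semi-definite.

```python
import random

def phi(B, x):
    """Explicit feature map phi(x) = B*x (one output component per row of B)."""
    return [sum(b * xi for b, xi in zip(row, x)) for row in B]

def kernel(B, x, y):
    """K(x, y) = <phi(x), phi(y)> = transpose(x)*transpose(B)*B*y."""
    return sum(p * q for p, q in zip(phi(B, x), phi(B, y)))

def gram(B, X):
    """Gram matrix G[i][j] = K(X[i], X[j])."""
    return [[kernel(B, xi, xj) for xj in X] for xi in X]

def quadratic_form(G, z):
    """transpose(z)*G*z; non-negative for every z iff G is positive semi-definite."""
    n = len(z)
    return sum(z[i] * G[i][j] * z[j] for i in range(n) for j in range(n))

random.seed(0)
# Illustrative 2x3 matrix: maps 3-dimensional inputs to a 2-dimensional feature space.
B = [[1.0, 0.0, 2.0],
     [0.0, 1.0, -1.0]]
X = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(5)]
G = gram(B, X)

print("feature dimension:", len(phi(B, X[0])))
for _ in range(3):
    z = [random.uniform(-1, 1) for _ in range(len(X))]
    # Up to rounding, this is the squared norm of sum_i z_i * phi(X[i]).
    print("z^T G z =", quadratic_form(G, z))
```

Random probing of the quadratic form is only a spot check, not a proof; the proof is the squared-norm identity noted in the comment.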
> >The important points are:
> >1. The kernel has to satisfy the Mercer conditions.
> >2. It maps into a feature space which you can find explicitly as the
> >eigenfunction space of the kernel function (see, for example, the
> >chapter in Cristianini & Shawe-Taylor, "An Introduction to Support
> >Vector Machines and Other Kernel-based Learning Methods", or even the
> >introduction in Burges' tutorial paper, available from
> >www.kernel-machines.org).
> >
> >>Also, how can you be sure that the samples can be
> >>separated by a linear surface in that higher dimensional space?
> >
> >Again, you cannot be sure of any such thing. The point is that it may
> >be impossible to separate the data in any way at all, since their class
> >conditional PDFs may overlap. All that the kernel can do is map the data
> >into the space you want to map into in a computationally efficient way
> >(which generally works well in a lot of cases, even with standard
> >kernels like the RBF, but not necessarily). I made a presentation on
> >this topic and can mail you the file (PowerPoint) if you want further
> >information.
> >
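To make the overlap point concrete, here is a small hedged sketch in plain Python. It is not from the original message. It uses a kernel perceptron (swapped in for an SVM purely to keep the example short) with an RBF kernel, and all names and data are illustrative. The XOR points are not linearly separable in the input space but become separable under the RBF kernel. The second data set repeats the same point with both labels, an extreme stand-in for overlapping class-conditional PDFs, and no feature map can separate it because identical inputs always receive identical scores.

```python
import math

def rbf(x, y, gamma=1.0):
    """RBF kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel_perceptron_errors(X, labels, kernel, epochs=100):
    """Train a kernel perceptron and return its number of training errors."""
    n = len(X)
    alpha = [0] * n  # mistake counts, one per training point

    def score(x):
        return sum(alpha[j] * labels[j] * kernel(X[j], x) for j in range(n))

    for _ in range(epochs):
        mistakes = 0
        for i in range(n):
            if labels[i] * score(X[i]) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training point is on the correct side
            break
    return sum(1 for i in range(n) if labels[i] * score(X[i]) <= 0)

# XOR: distinct points, separable in the RBF feature space.
xor_X = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_labels = [-1, 1, 1, -1]
xor_errors = kernel_perceptron_errors(xor_X, xor_labels, rbf)

# Each location carries both labels, so the classes overlap completely.
overlap_X = [(0, 0), (0, 0), (1, 1), (1, 1)]
overlap_labels = [1, -1, 1, -1]
overlap_errors = kernel_perceptron_errors(overlap_X, overlap_labels, rbf)

print("XOR training errors:", xor_errors)
print("overlapping-class training errors:", overlap_errors)
```

In the second case at least one point of each duplicated pair is always misclassified, whatever kernel is passed in.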
> >Regards,
> >Balaji Krishnapuram
>
>
>             longbinchen
>             [log in to unmask]
> ------------
> P.O.BOX 2728# ,NLPR,
> BeiJing P.R.China,100080
>
> TEL:+86-10-82613867
>         13661238601
> ICQ:136386021
> Institute of Automation,
> Chinese Academy of Sciences (CASIA)
> P.R.China
>
>