[I heard that the tanh kernel sometimes violates the kernel condition.]
Jaepil,
Adapted from Vapnik (1998), p. 432.
Consider the sigmoid kernel
S(x,z) = 1/(1+exp(v*x.z-c))
where x,z are n-dimensional and ||x|| = ||z|| = 1
[this is equivalent to the tanh kernel
T(x,z) = tanh(v*x.z-c), up to a rescaling and a sign flip,
since 1-2*S(x,z) = tanh((v*x.z-c)/2)]
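A quick numerical check of the link between the two kernels, as a minimal sketch. The function names and the sample values of v and c are my own choices for illustration, not from Vapnik. It relies only on the identity 1 - 2/(1+exp(u)) = tanh(u/2).

```python
import math

def dot(x, z):
    """Plain inner product x.z of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, z))

def sigmoid_kernel(x, z, v, c):
    """S(x,z) = 1/(1+exp(v*x.z-c)), as defined above."""
    return 1.0 / (1.0 + math.exp(v * dot(x, z) - c))

def tanh_kernel(x, z, v, c):
    """T(x,z) = tanh(v*x.z-c)."""
    return math.tanh(v * dot(x, z) - c)

# Two unit-norm example points and hypothetical parameters.
x = [0.6, 0.8]
z = [1.0, 0.0]
v, c = 1.5, 2.0

# 1 - 2*S with parameters (v, c) matches T with parameters (v/2, c/2).
lhs = 1.0 - 2.0 * sigmoid_kernel(x, z, v, c)
rhs = tanh_kernel(x, z, v / 2.0, c / 2.0)
print(abs(lhs - rhs) < 1e-12)
```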
S(x,z) is actually an RBF kernel in (n+1)-dimensional space:
S(x,z) = 1/(1+exp(-0.5v||x*-z*||^2))
where x*, z* are (n+1)-dimensional vectors with first n dimensions
x,z respectively, and (n+1)st dimension equal to sqrt(2(c-v)).
Thus we must have c >= v.
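The algebra behind this step can be checked numerically. This is only a sketch, assuming nothing beyond ||x|| = ||z|| = 1, so that ||x-z||^2 = 2 - 2*x.z and the exponent v*x.z-c can be rewritten as -(c-v) - 0.5*v*||x-z||^2. The sample points and the values of v and c are made up for illustration.

```python
import math

def dot(x, z):
    """Plain inner product x.z."""
    return sum(a * b for a, b in zip(x, z))

def sq_dist(x, z):
    """Squared Euclidean distance ||x-z||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, z))

# Two unit-norm example points and hypothetical parameters with c >= v.
x = [0.6, 0.8]
z = [0.0, 1.0]
v, c = 1.5, 2.0

exponent = v * dot(x, z) - c
rewritten = -(c - v) - 0.5 * v * sq_dist(x, z)

# Both forms of the exponent agree for unit-norm inputs.
print(math.isclose(exponent, rewritten))
```

With v >= 0 and c >= v, both terms of the rewritten exponent are non-positive.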
The easiest thing, then, is to scale all data to have norm ||x|| = 1 and
pick c >= v.
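The recipe above might look like this in practice. It is a rough sketch in plain Python; the helper names, the toy data, and the values of v and c are assumptions for illustration only.

```python
import math

def normalize(x):
    """Scale x to unit Euclidean norm; a zero vector cannot be scaled."""
    norm = math.sqrt(sum(a * a for a in x))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [a / norm for a in x]

def sigmoid_kernel(x, z, v, c):
    """S(x,z) = 1/(1+exp(v*x.z-c)), as defined above."""
    return 1.0 / (1.0 + math.exp(v * sum(a * b for a, b in zip(x, z)) - c))

# Toy data (made up) and hypothetical parameters satisfying c >= v.
data = [[3.0, 4.0], [1.0, 0.0], [-2.0, 2.0]]
v, c = 1.0, 1.5
if c < v:
    raise ValueError("need c >= v")

unit = [normalize(x) for x in data]

# Kernel (Gram) matrix on the normalized points.
K = [[sigmoid_kernel(a, b, v, c) for b in unit] for a in unit]
for row in K:
    print(["%.4f" % k for k in row])
```

Any test points would need the same normalization before the kernel is evaluated on them.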
If your data are low-dimensional, the loss of information from
normalization may be significant. In that case you will need to
calculate x.z for all pairs of data points (including any potential
test points) and make sure (v,c) are such that
v*x.z-c <= 0
for all of them. For example, if the examples are 166-bit binary
strings then max(x.z) = 166, so (for v > 0) make sure 166v <= c.
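A small sketch of that check for unnormalized data. The function names, the three example strings, and the parameter values are hypothetical; in practice the list would have to contain every training and test point.

```python
from itertools import combinations_with_replacement

def dot(x, z):
    """Plain inner product x.z."""
    return sum(a * b for a, b in zip(x, z))

def max_dot(points):
    """Largest x.z over all pairs, including each point with itself."""
    return max(dot(x, z) for x, z in combinations_with_replacement(points, 2))

def params_ok(points, v, c):
    """True if v*x.z - c <= 0 for every pair of points."""
    return all(v * dot(x, z) - c <= 0
               for x, z in combinations_with_replacement(points, 2))

# Three 166-bit binary examples (made up); the all-ones string attains x.z = 166.
points = [[1] * 166, [1, 0] * 83, [0] * 166]

print(max_dot(points))                    # largest inner product in the set
print(params_ok(points, v=0.01, c=2.0))   # 166*0.01 is below c
print(params_ok(points, v=0.01, c=1.0))   # 166*0.01 exceeds c
```

Checking every pair, rather than only the maximum, keeps the test valid for negative v as well.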
Rgds
Robert