It is attached. Good luck.
Lijuan Cao wrote:
>
> Hi, Hubey:
> I am interested in your method.
>
> Lijuan
>
> "H.M. Hubey" wrote:
>
> > I have a method which, although a KDD method in its own right, can
> > also be used to create a nonlinear mapping into a feature space
> > custom-fitted to the data. After that, an SVM can be used.
> >
> > If anyone wants it I can send the pdf file. If there are too many
> > requests I will put it on my site.
> >
> > "Burbidge, Robert" wrote:
> > >
> > > [The presence of a large number of
> > > irrelevant features does degrade the performance, especially so if the
> > > kernel is not linear. ]
> > >
> > > Yes, this is true for both classification and regression. I should
> > > have mentioned that I would use the linear kernel with all features,
> > > to do the feature selection. And then investigate other kernels.
> > >
> > > The only systematic feature selection methods I've seen are for
> > > classification, based on gradient descent on an error estimate.
> > > The kernel is modified to K(x,z) = sum_p(x_p.z_p/sigma_p) and the
> > > optimal set of sigma_p's found; features whose effective kernel
> > > weight is driven to zero are flagged as irrelevant.
> > > I am not aware of similar work for the regression case.
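[The per-feature scaling idea above can be sketched as follows. This is a minimal illustration, not the exact parameterisation from the work Robert refers to: here each feature p is simply multiplied by a weight w_p, and a weight driven to zero removes that feature from the kernel.]

```python
import numpy as np

def scaled_linear_kernel(X, Z, w):
    """K(x, z) = sum_p w_p * x_p * z_p.

    One scale factor per feature; driving w_p to zero makes the
    kernel blind to feature p, so small learned scales flag
    irrelevant features.
    """
    return (X * w) @ Z.T

X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
w = np.array([1.0, 1.0, 0.0])  # third feature switched off
K = scaled_linear_kernel(X, X, w)
# identical to the plain linear kernel computed on the first two features
```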
> > >
> > > One way to avoid overfitting with a lot of features is to set C to
> > > be very small. The resulting solution will be oversmoothed, but it
> > > is found quickly, so a number of SVMs can be trained during the
> > > model/feature selection stage.
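[To illustrate the small-C point with a toy example: a primal linear SVR fitted by plain subgradient descent. This is my own sketch, not code from the thread; the step size and iteration count are arbitrary. A tiny C makes the ||w||^2 term dominate, shrinking the solution towards zero.]

```python
import numpy as np

def linear_svr(X, y, C, eps=0.1, lr=0.001, iters=5000):
    """Minimise 0.5*||w||^2 + C * sum_i max(0, |y_i - w.x_i - b| - eps)
    by subgradient descent (illustration only, not a production solver)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(iters):
        r = X @ w + b - y
        # subgradient of the eps-insensitive loss at each sample
        g = np.where(np.abs(r) > eps, np.sign(r), 0.0)
        w -= lr * (w + C * (X.T @ g))
        b -= lr * (C * g.sum())
    return w, b

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] + 0.05 * rng.normal(size=200)

w_small, _ = linear_svr(X, y, C=0.001)  # heavily regularised: oversmoothed
w_large, _ = linear_svr(X, y, C=1.0)
# small C shrinks the weight vector towards zero
```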
> > >
> > > Lastly, one could follow Mangasarian's classification work and use
> > > a linear programming SVM to directly minimize the number of nonzero
> > > weights.
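[The 1-norm idea can be written as a linear program. The sketch below is an illustrative formulation in the spirit of Mangasarian's LP-SVMs, not his exact one, using scipy.optimize.linprog; the function name, data, and C value are my own choices.]

```python
import numpy as np
from scipy.optimize import linprog

def l1_svm(X, y, C=1.0):
    """1-norm soft-margin SVM as a linear program:

        minimise  sum_p u_p + C * sum_i xi_i
        s.t.      y_i (w . x_i + b) >= 1 - xi_i,
                  -u <= w <= u,   u >= 0,   xi >= 0.

    The 1-norm penalty drives many w_p exactly to zero, so the
    surviving features are the selected ones.
    """
    n, d = X.shape
    # variable vector: [w (d), u (d), b (1), xi (n)]
    c = np.concatenate([np.zeros(d), np.ones(d), [0.0], C * np.ones(n)])
    # margin constraints: -y_i x_i . w - y_i b - xi_i <= -1
    A1 = np.hstack([-y[:, None] * X, np.zeros((n, d)), -y[:, None], -np.eye(n)])
    # |w_p| <= u_p, written as w - u <= 0 and -w - u <= 0
    A2 = np.hstack([np.eye(d), -np.eye(d), np.zeros((d, 1)), np.zeros((d, n))])
    A3 = np.hstack([-np.eye(d), -np.eye(d), np.zeros((d, 1)), np.zeros((d, n))])
    A = np.vstack([A1, A2, A3])
    rhs = np.concatenate([-np.ones(n), np.zeros(2 * d)])
    bounds = ([(None, None)] * d + [(0, None)] * d
              + [(None, None)] + [(0, None)] * n)
    res = linprog(c, A_ub=A, b_ub=rhs, bounds=bounds)
    return res.x[:d], res.x[2 * d]

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
y = np.sign(X[:, 0] + X[:, 1])  # only the first two features matter
w, b = l1_svm(X, y, C=10.0)
# the weights on the three irrelevant features should be (near-)zero
```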
> > >
> > > To come back to the original question: how many features are
> > > sensible? There is no general answer, but it is likely to be more
> > > than other techniques (apart from Bayesian treatments) can sensibly
> > > cope with. One could use PCA (or kernel PCA) to estimate the
> > > dimensionality of the data in the (kernel-induced) feature space
> > > and take that as a guide, but then one is still faced with the
> > > combinatorial feature selection problem.
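[A quick way to get that PCA-based guide is to count the components needed to explain most of the variance. Plain PCA is sketched below; for kernel PCA one would use the spectrum of the centred kernel matrix instead. The 0.99 variance threshold is an arbitrary choice of mine.]

```python
import numpy as np

def pca_dimensionality(X, var_threshold=0.99):
    """Estimate the effective dimensionality of X as the number of
    principal components needed to explain var_threshold of the total
    variance.  A rough guide only; the threshold is arbitrary."""
    Xc = X - X.mean(axis=0)                     # centre the data
    s = np.linalg.svd(Xc, compute_uv=False)     # singular values, descending
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)  # cumulative explained variance
    return int(np.searchsorted(ratio, var_threshold) + 1)

rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))                  # truly 3-dimensional data
A = rng.normal(size=(3, 10))
X = latent @ A + 0.01 * rng.normal(size=(500, 10))  # embedded in 10-D + noise
# pca_dimensionality(X) should recover roughly 3
```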
> > >
> > > Rgds
> > >
> > > Robert
> >
> > --
> >
> > Regards,
> >
> > Mark
> > Computer Science
> > [log in to unmask]
--
Regards,
Mark
Computer Science
[log in to unmask]