Hi everybody, nice list!
This is my first post. I have familiarized myself with the SVM method by
reading tutorials and papers, went through this list, and this is my first
attempt to apply SVM to a stochastic time series
regression/classification problem.

OK, here is the actual problem. I would like to get some pointers on
whether it can be solved with SVM or not.

A dynamical, stochastic, non-linear system moves through 9 possible states.
Each state is described by its probability at time t; for example, P9 = 0.67
means the probability that the system is in state number 9 is 0.67.
So at any t the system can be described by a 9-element state probability
vector
[P1,P2,P3,...P9] at t
The system is sampled at discrete time intervals, which gives a discrete
time series of 10K samples in total, covering t-10000 through t.

Next we compute the system state trajectory over, for example, the past 30
time periods, using a window of 30 time steps, i.e. lag = 30.
The inputs are thus 30 columns representing t-1, t-2, ... t-30,
i.e. column 1 is lag 1, column 2 is lag 2, and so on,
with a 9-d vector in each column, i.e. the 9-element state probability
vector at that lag.
So basically 9 x 30 = 270 features in all per sample row? (I am not sure
whether “feature” in the SVM literature means the same as “input”.)
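
To make the input layout concrete, here is a minimal sketch in plain Python of how such rows could be built. The function name make_lagged_rows and the synthetic series are made up for illustration; only the layout (lag 1 first, 9 values per lag, 30 lags) comes from the description above.

```python
def make_lagged_rows(series, window=30):
    """Flatten the previous `window` state vectors into one row per time t.

    series[t] is the state probability vector at time t.
    The row for time t is series[t-1], series[t-2], ..., series[t-window]
    concatenated, so lag 1 comes first. Returns (rows, times), where
    times[i] is the time index whose state rows[i] is meant to predict.
    """
    rows, times = [], []
    for t in range(window, len(series)):
        row = []
        for lag in range(1, window + 1):
            row.extend(series[t - lag])
        rows.append(row)
        times.append(t)
    return rows, times


# Synthetic stand-in data (an assumption, not the real system):
# 40 time steps, 9 states, 0.6 on one state and 0.05 on each of the others.
series = [[0.6 if s == t % 9 else 0.05 for s in range(9)] for t in range(40)]

rows, times = make_lagged_rows(series, window=30)
print(len(rows), len(rows[0]))  # number of sample rows, features per row
```

With 9 states and a window of 30, each row has 9 x 30 = 270 entries, and the first 30 time steps produce no row because they lack a full history.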

The output is one of the same 9 classes as in the 9-element vector.
The goal is to forecast the next system state, i.e. to select the
highest-ranked state, so this is a 9-class classifier.
Assume Sum(all 9 state prob) = 1.00
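
A minimal sketch of how the class label could be derived from the next state probability vector, assuming the label is simply the most probable state. The helper names and the example numbers are made up; ties go to the lowest state number, which is an arbitrary choice.

```python
def winning_state(prob_vector):
    """Return the 1-based number of the most probable state.

    Ties are resolved in favour of the lowest state number.
    """
    best = max(range(len(prob_vector)), key=lambda i: prob_vector[i])
    return best + 1


def make_labels(series, window=30):
    """Label for each time t >= window: the winning state at time t."""
    return [winning_state(series[t]) for t in range(window, len(series))]


# Made-up example vector with P9 = 0.67; the entries sum to 1.00.
p = [0.02, 0.03, 0.05, 0.01, 0.04, 0.08, 0.06, 0.04, 0.67]
assert abs(sum(p) - 1.0) < 1e-9
label = winning_state(p)  # state number 9 has the highest probability
```

The labels from make_labels line up one-to-one with lag-window rows built over the same window, since both start at t = window.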

Since the state probabilities (the inputs) overlap in their density
functions, the historical time trajectories will also overlap,
so the outputs are likely to overlap in their pdfs as well.

Note that the states are statistically “unbalanced”: some states are more
frequent than others, some persist in time
and some are anti-persistent, so the classes are going to be unbalanced too.
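
One common heuristic for unbalanced classes is to weight each class's misclassification cost inversely to its frequency. The sketch below assumes the weight n / (k * n_c) for a class with n_c samples out of n, with k observed classes; the function name and the toy labels are made up, and other weighting schemes are equally possible.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count of class).

    Only classes that actually occur in `labels` receive a weight.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in sorted(counts)}


# Toy labels (made up): state 1 is frequent, state 3 is rare.
labels = [1] * 6 + [2] * 3 + [3] * 1
weights = balanced_class_weights(labels)
print(weights)
```

Rare classes get weights above 1 and frequent classes get weights below 1, so errors on rare states cost more during training if the SVM software accepts per-class costs.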

We’re basically comparing 10K 9-d system trajectories, each over a window
of the past 30 time intervals...

The question: what are the chances of successfully separating the 9 classes
using SVM as the classifier? :)

And suppose that instead of a 9-element vector I use a 9x3 matrix,
i.e. in addition to using Pstate as a single feature I add 2 more features
per state. Now I have a 9x3 matrix per t and 9x3x30 = 810 total features
per row, again with 10K rows.
What are my chances then: "good", "bad" or "forget it"? :)

So I guess, to solve this I need SVM software that can handle:
- large data sets, > 10K samples
- large input feature sets, > 200 but < 1000
- multi-class problems, > 10 classes, with ranking or winner-takes-all
classification
- unbalanced classes, with cost handling
- and preferably at least a grid search to find the optimal parameters.
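
For the grid search item, here is a rough sketch in plain Python of what such a search could look like on time-ordered data. It assumes an RBF-kernel SVM with parameters C and gamma, a log-spaced grid (the exponent ranges are just a commonly used starting point), and expanding-window validation folds so that training always precedes validation in time. score_fn is a hypothetical placeholder for "train an SVM on the training indices and return its validation score"; the dummy scorer below exists only so the sketch runs.

```python
from itertools import product


def log_grid(low_exp, high_exp, step=2, base=2.0):
    """Log-spaced values base**low_exp, ..., up to base**high_exp."""
    return [base ** e for e in range(low_exp, high_exp + 1, step)]


def walk_forward_folds(n_samples, n_folds=3):
    """Expanding-window splits: train on [0, cut), validate on the next block."""
    size = n_samples // (n_folds + 1)
    folds = []
    for k in range(1, n_folds + 1):
        cut = k * size
        folds.append((range(0, cut), range(cut, min(cut + size, n_samples))))
    return folds


def grid_search(score_fn, c_values, gamma_values, folds):
    """Return (mean_score, C, gamma) with the highest mean validation score."""
    best = None
    for c, gamma in product(c_values, gamma_values):
        mean = sum(score_fn(c, gamma, tr, va) for tr, va in folds) / len(folds)
        if best is None or mean > best[0]:
            best = (mean, c, gamma)
    return best


# Hypothetical stand-in scorer; a real one would train and evaluate an SVM.
def dummy_score(c, gamma, train_idx, valid_idx):
    return -abs(c - 8.0) - abs(gamma - 0.125)


folds = walk_forward_folds(10000, n_folds=3)
best = grid_search(dummy_score, log_grid(-5, 15), log_grid(-15, 3), folds)
print(best)
```

The time-ordered folds matter here because neighbouring lag windows share most of their entries, so randomly shuffled cross-validation would tend to give overly optimistic scores.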

Is there SVM software that can handle all of that, in Matlab or C, or
ported to Windows? With so many SVM toolboxes and code around, I am kind of
lost as to which one I should try… it’ll take a while to get through them
all.

Maybe there is an alternative ML method that’s more appropriate?

Opinions, pointers?

Thank you kindly.
Josh.
P.S. This is basically a typical physics problem: we have an N-state
system with a state probability matrix of size Nx1, we compute its
trajectory over a window and try to forecast the next system state... and
we add other system descriptors besides the state probabilities (to add
more features for better separation?).