Thanks for your advice, but I think that 3% of my non-churn data is too little data for a formal study. That is my problem: I cannot take a random sample of non-churn cases equal in number to the churn cases, because only 3% of my clients are churners!
Is it possible to repeat observations? For example:
ID X1 Target
1 2 0
2 1 0
3 1 0
...........
99 1 0
100 4 1
101 3 1
102 2 1
Can I do this...?
ID X1 Target
1 2 0
2 1 0
3 1 0
...........
99 1 0
100 4 1
101 3 1
102 2 1
and repeat the churn cases....
103 4 1
104 3 1
105 2 1
106 4 1
107 3 1
108 2 1
109 4 1
110 3 1
.............
180 2 1
..........
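
A minimal Python sketch of the replication laid out above, offered only as an illustration. The function name replicate_minority and its parameters are hypothetical, and the X1 values for the rows elided in the table are placeholders rather than real data. It cycles through the three churn rows in order until there are 81 of them (IDs 100 to 180), which gives a churn share of 45%.

```python
from itertools import cycle, islice


def replicate_minority(rows, n_minority_total, target_key="Target", minority=1):
    """Repeat the minority rows in order until there are n_minority_total of them."""
    minority_rows = [r for r in rows if r[target_key] == minority]
    majority_rows = [r for r in rows if r[target_key] != minority]
    if not minority_rows:
        raise ValueError("no minority rows to replicate")
    # cycle() walks the minority rows over and over; islice() stops at the target count.
    repeated = islice(cycle(minority_rows), n_minority_total)
    # Copy every row so the input list is left untouched.
    combined = [dict(r) for r in majority_rows] + [dict(r) for r in repeated]
    # Renumber so every row, including the copies, has its own ID.
    for new_id, row in enumerate(combined, start=1):
        row["ID"] = new_id
    return combined


# 99 non-churn rows. Only IDs 1, 2, 3 and 99 appear in the message;
# the X1 value of 1 used for the elided rows is a placeholder.
rows = [{"ID": i, "X1": 2 if i == 1 else 1, "Target": 0} for i in range(1, 100)]
# The three churn rows from the message.
rows += [
    {"ID": 100, "X1": 4, "Target": 1},
    {"ID": 101, "X1": 3, "Target": 1},
    {"ID": 102, "X1": 2, "Target": 1},
]

balanced = replicate_minority(rows, n_minority_total=81)
churn_share = sum(r["Target"] for r in balanced) / len(balanced)
print(len(balanced), round(churn_share, 2))  # 180 0.45
```

Two caveats worth keeping in mind: the copies are not independent observations, so standard errors from a model fitted to this table will look smaller than they should, and the fitted intercept reflects the artificial 45% churn rate rather than the real 3%.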
Natalie Tarry <[log in to unmask]> wrote:
Hi
What I have typically done is take the smaller of the two groups of the binary
measure (churn, in your case) and note its number of cases.
Then take a random sample of the same number of cases from the non-churn
group.
That way you have an equally balanced sample of churn / non churn.
Obviously this is only suitable when you have a reasonable number of cases
of churn - if there are only 10 cases then it is not very suitable!
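
The procedure described above can be sketched in Python roughly as follows. This is only an illustration: the function name undersample_majority, the seed parameter, and the 1,000-client data set with 30 churners are all made up for the example and do not come from the thread.

```python
import random


def undersample_majority(rows, target_key="Target", minority=1, seed=0):
    """Keep all minority rows plus an equal-sized random sample of the rest."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    minority_rows = [r for r in rows if r[target_key] == minority]
    majority_rows = [r for r in rows if r[target_key] != minority]
    # Guard against the unusual case where the "minority" is actually larger.
    n = min(len(minority_rows), len(majority_rows))
    # random.sample draws without replacement, so no non-churn row is repeated.
    sampled = rng.sample(majority_rows, n)
    return minority_rows + sampled


# Hypothetical data: 1,000 clients, of which the first 30 (3%) churned.
clients = [{"ID": i, "Target": 1 if i <= 30 else 0} for i in range(1, 1001)]

balanced = undersample_majority(clients)
print(len(balanced), sum(r["Target"] for r in balanced))  # 60 30
```

With a 3% churn rate this keeps only a small fraction of the non-churn rows, so how well it works depends on how many churn cases there are in absolute terms.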
thanks
Natalie
-----Original Message-----
From: A UK-based worldwide e-mail broadcast system mailing list
[mailto:[log in to unmask]]On Behalf Of Jorge Caballero Rodríguez
Sent: 16 May 2005 12:34
To: [log in to unmask]
Subject: Logistic regression. How can I balance my data???
Hi all!
I have a problem with a logistic model. I want to score my data; I am
modelling churn in a bank. The problem is that churners make up
only 3% of all the data. I need to balance my data to obtain 30 or 50%. How
can I do it?
Thanks!
---------------------------------
Correo Yahoo!
See what's new here
http://correo.yahoo.es