//OFF
cls
cd "U:\Documents\STATA\StataHacks\docs\Coding Stata\Cox_regression"
//ON
/***
# How to hide small steps in a Kaplan-Meier plot
In Danish national registries it is forbidden to report groups smaller than 5.
Since steps in a Kaplan-Meier plot are often based on fewer than 5 persons,
reporting Kaplan-Meier plots for small datasets is a problem.
Two solutions are presented here: a lowess-smoothed version of the Kaplan-Meier
curve, and a Kaplan-Meier curve whose steps are each based on at least 5 persons.
## The example data
We use a classical Stata example dataset:
***/
/**/webuse drug2, clear
/**/stset, clear
/***
The variables are:
***/
describe
/***
The data look like this (each row is a person):
***/
list in 1/6, sepby(studytime) abbreviate(20)
/***
## Generating the data behind the Kaplan-Meier plots
First **sts generate** is used to estimate the survival probabilities, from
which the failure probabilities are computed as one minus survival.
***/
stset studytime, failure(died) noshow
/**/sts generate survival = s
/**/generate failure = 1 - survival
/**/label variable failure "KM failure"
/**/format failure %6.2f
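/***
As a sanity check on this step, the sketch below rebuilds the two variables and
verifies that the failure probabilities lie between 0 and 1 and never decrease
over time. It is a minimal illustration, not part of the workflow; it reuses the
variable names from above and reloads the data so that it can be run on its own.

```stata
webuse drug2, clear
stset studytime, failure(died) noshow
sts generate survival = s
generate failure = 1 - survival

* failure is a probability, so it must lie in [0, 1]
assert inrange(failure, 0, 1)

* failure is non-decreasing when the data are ordered by time
sort studytime
assert failure >= failure[_n-1] if _n > 1
```
***/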
/***
A lowess-smoothed **twoway** graph of failure vs. studytime is one way to report
the Kaplan-Meier plot.
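
If the smoothed values themselves are needed, for example for a table, they can
be stored in a variable with the **lowess** command. The sketch below is one
possible way to do this. The bandwidth of 0.5 and the variable name
failure_lowess are assumptions chosen for illustration, and the final clamping
step is added here because lowess does not guarantee values inside [0, 1].

```stata
webuse drug2, clear
stset studytime, failure(died) noshow
sts generate survival = s
generate failure = 1 - survival

* store the smoothed curve instead of drawing it
* bwidth(0.5) is an arbitrary choice; Stata's default is 0.8
lowess failure studytime, bwidth(0.5) generate(failure_lowess) nograph

* keep the smoothed values inside the range of a probability
replace failure_lowess = min(max(failure_lowess, 0), 1)
```

A wider bandwidth hides more of the individual steps, but it also moves the
curve further away from the Kaplan-Meier estimate.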
## Making steps of at least 5 persons
The variable n_prsns counts the number of persons at each time (variable
studytime). The count is saved only in the last row for each time; all other
rows are set to 0.
***/
/**/bysort studytime: generate n_prsns = cond(_n == _N, _N, 0)
/***
To get the accumulated number of persons over time, one can use relative
references and the function **cond**. The running count keeps growing until it
reaches at least 5 and then restarts on the next row:
***/
/**/generate acc_prsns = n_prsns if _n == 1
/**/replace acc_prsns = cond(acc_prsns[_n-1] < 5, n_prsns + acc_prsns[_n-1], n_prsns) if _n > 1
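/***
To see how the running count behaves, the same three commands can be tried on a
small made-up dataset. The nine studytime values below are hypothetical and
serve only to illustrate the restart of the count. The sketch relies on
**replace** working through the rows in order, so that acc_prsns[_n-1] already
holds the updated value.

```stata
clear
input studytime
1
1
2
3
3
3
4
5
5
end

* number of persons per time, stored in the last row of each time
bysort studytime: generate n_prsns = cond(_n == _N, _N, 0)

* running count that restarts after it has reached at least 5
generate acc_prsns = n_prsns if _n == 1
replace acc_prsns = cond(acc_prsns[_n-1] < 5, n_prsns + acc_prsns[_n-1], n_prsns) if _n > 1

list, sepby(studytime)
```

In this toy data acc_prsns becomes 0, 2, 3, 3, 3, 6, 1, 1, 3, so only the last
row of time 3 reaches the threshold, and the count starts over at time 4.
***/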
/***
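Only the failure values based on at least 5 persons are selected. The first and
last rows are then set to the minimum and maximum of the selected values, so
that the curve spans the whole time range: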
***/
/**/generate failure2 = failure if acc_prsns > 4
/**/quietly summarize failure2
/**/replace failure2 = `r(min)' if _n == 1
/**/replace failure2 = `r(max)' if _n == _N
/**/label variable failure2 "KM failure with steps of at least 5"
/**/format failure2 %6.2f
/***
## A graph comparison
Finally, a graphical comparison of the classical Kaplan-Meier curve, the
lowess-smoothed version, and the Kaplan-Meier curve based on steps of at least
5 persons is presented:
***/
twoway ///
(line failure studytime, lcolor(black) connect(stairstep)) ///
(lowess failure studytime, lcolor(blue) ) ///
(line failure2 studytime, lcolor(red) connect(stairstep)) ///
, legend(on position(5) ring(0) cols(1) ///
order(1 "Kaplan-Meier" 2 "Kaplan-Meier lowess" 3 "Kaplan-Meier steps of 5") ///
) ///
name(km, replace)
//OFF
graph export km.png, width(800) height(600) replace
//ON
/***
![](km.png)
***/