A CI plot from coefplot and a matrix
Introduction
This note shows how to create a CI plot of crude estimates using Mata matrices and coefplot. The example uses hazard ratio estimates.
It also shows how to add the estimates and CI limits as labels in the ciplot.
An alternative approach is described in the documentation for matrix2stata.
Yet another approach is to use the graphics capabilities of either the community-contributed -metan- or -meta forestplot-, the latter described in the Stata Meta-Analysis Reference Manual.
The command coefplot
Note that coefplot might have to be installed on your version of Stata.
If coefplot is not installed, run the command:
ssc install coefplot
To check whether coefplot is installed, see if the command below returns a help page:
help coefplot
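The check and the installation can also be combined so that coefplot is only downloaded when it is missing. This is a minimal sketch, assuming an internet connection and access to SSC; it relies on which returning a nonzero return code when the command is not found:

```stata
* Look for coefplot on the ado-path without stopping on error
capture which coefplot
if _rc {
    * Nonzero return code: coefplot was not found, so install it from SSC
    ssc install coefplot
}
```

Running these lines in a do-file leaves an existing installation untouched.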
The data
We use the Stata example dataset stan3 (Heart transplant survival data from Stanford).
webuse stan3, clear
label variable posttran "After transplantation"
label define posttran 0 "No" 1 "Yes"
label values posttran posttran
metadata id year age died stime surgery transplant wait posttran
-----------------------------------------------------------------------------------------------------------------
Name         Index  Label                      Value Label Name  Format  Value Label Values    n  unique  missing
-----------------------------------------------------------------------------------------------------------------
id               1  Patient Identifier                           %8.0g                       172     103        0
year             2  Year of Acceptance                           %8.0g                       172       8        0
age              3  Age                                          %8.0g                       172      35        0
died             4  Survival Status (1=dead)                     %8.0g                       172       2        0
stime            5  Survival Time (Days)                         %8.0g                       172      88        0
surgery          6  Surgery (e.g. CABG)                          %8.0g                       172       2        0
transplant       7  Heart Transplant                             %8.0g                       172       2        0
wait             8  Waiting Time                                 %8.0g                       172      41        0
posttran         9  After transplantation      posttran          %8.0g   0 "No" 1 "Yes"      172       2        0
-----------------------------------------------------------------------------------------------------------------
The dataset is already set for Cox regression:
stset
-> stset t1, id(id) failure(died)

Survival-time data settings

           ID variable: id
         Failure event: died!=0 & died<.
Observed time interval: (t1[_n-1], t1]
     Exit on or before: failure

--------------------------------------------------------------------------
        172  total observations
          0  exclusions
--------------------------------------------------------------------------
        172  observations remaining, representing
        103  subjects
         75  failures in single-failure-per-subject data
   31,938.1  total analysis time at risk and under observation
                                                At risk from t =         0
                                     Earliest observed entry t =         0
                                          Last observed exit t =     1,799
The dataset is used here without regard to its clinical background, so the results as such are pure rubbish.
A ciplot using coefplot in 5 steps
Before creating the ciplot we gather the (crude) estimates and CI limits in the Mata matrix estimates. We also gather the names of the variables for which estimates are found in the Stata local rownames.
coefplot needs a Stata matrix, so at the end the Mata matrix and the local rownames are converted into a Stata matrix.
Then the coefplot command is used to generate a ciplot.
Finally, it is shown how a label with the estimates can be added to the coefplot.
Step 1: Reset and start
Both the local and the Mata matrix have to be reset before starting:
local rownames ""
mata: estimates = J(0,3,.)
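The Mata function J(r, c, val) returns an r x c matrix filled with val, so J(0,3,.) is an empty matrix with 3 columns to which rows can be appended with the row-join operator \. A small sketch of the idea, where demo is a scratch name used only for this illustration and is not part of the example:

```stata
mata:
// "demo" is a hypothetical scratch matrix, not used elsewhere
demo = J(0, 3, .)          // empty matrix: 0 rows, 3 columns
rows(demo), cols(demo)     // show the dimensions before appending
demo = demo \ (1, 2, 3)    // append a first row
demo = demo \ (4, 5, 6)    // append a second row
demo                       // show the resulting 2 x 3 matrix
end
```

Starting from zero rows means that the loop in step 2 does not need a special case for the first variable.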
Step 2: Gather estimates in local "rownames" and Mata matrix "estimates"
A foreach loop is used to loop through the variables for which the estimates are needed.
Inside the loop the regression of choice is run. The regression is formulated with reference to the Stata local var.
Stata regression commands return a table of regression estimates in the Stata matrix r(table).
In r(table) the regression estimates are in row 1, and the lower and upper CI limits are in rows 5 and 6, respectively.
It is easier to extract data from Mata matrices, so the values of r(table) are copied into a Mata matrix (st_matrix("r(table)")) and transposed ('), and columns 1, 5 and 6 are appended to the matrix estimates.
Since the estimates from the regression come in the order of the independent variables, ending with the constant, only the first row of the transposed r(table) is needed.
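To see the layout of r(table) for a single regression, it can be copied into an ordinary Stata matrix and listed. This is a minimal sketch for one variable; the matrix name T is an assumption made for the illustration, and the rows are looked up by name (b, ll, ul) with rownumb() rather than by position:

```stata
webuse stan3, clear
quietly stcox year

* Copy r(table) right away; the next r-class command overwrites it
matrix T = r(table)
matrix list T

* Row "b" holds the estimate, rows "ll" and "ul" the CI limits
display "HR = " %6.3f T[rownumb(T, "b"), 1]
display "LL = " %6.3f T[rownumb(T, "ll"), 1]
display "UL = " %6.3f T[rownumb(T, "ul"), 1]
```

After stcox the listed values are on the hazard ratio scale, which is why they can be plotted directly against a reference line at 1.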
The variable names are gathered in the local rownames. They are needed as labels on the y-scale:
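foreach var of varlist year surgery transplant wait posttran {
    stcox `var'
    // Append estimate and CI limits (row 1 of the transposed r(table))
    mata: estimates = estimates \ st_matrix("r(table)")'[1, (1,5,6)]
    // Remember the variable name for the row label
    local rownames `rownames' `var'
}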
Step 3: Move estimates back into a Stata matrix "estimates"
The values from the Mata matrix are moved into the Stata matrix with the same name.
Rownames (from Stata local rownames) and column names (HZ, LL, and UL) are added to make the coefplot more readable.
mata: st_matrix("estimates", estimates)
matrix rownames estimates = `rownames'
matrix colnames estimates = HZ LL UL
The Stata matrix "estimates" now looks like this:
matprint estimates, d(3)
-------------------------------
               HZ     LL     UL
-------------------------------
year        0.840  0.736  0.960
surgery     0.349  0.151  0.810
transplant  0.267  0.166  0.431
wait        0.981  0.970  0.992
posttran    1.112  0.619  1.995
-------------------------------
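The same matrix can also be built without Mata, using only Stata matrix commands. This is an alternative sketch, not the approach used in this note; the names est2, T and rownames2 are assumptions chosen so that nothing from steps 1 to 3 is overwritten:

```stata
webuse stan3, clear
capture matrix drop est2
local rownames2 ""
foreach var of varlist year surgery transplant wait posttran {
    quietly stcox `var'
    matrix T = r(table)
    * nullmat() lets the row join work while est2 does not yet exist
    matrix est2 = nullmat(est2) \ (T[1,1], T[5,1], T[6,1])
    local rownames2 `rownames2' `var'
}
matrix rownames est2 = `rownames2'
matrix colnames est2 = HZ LL UL
matrix list est2, format(%6.3f)
```

The listing should show the same values as the matrix "estimates". The Mata version avoids the intermediate matrix T and makes it easy to pick several columns at once.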
Step 4: Doing the ciplot
To make a ciplot from the Stata matrix "estimates", coefplot must be told which column holds the estimates [matrix(estimates[,1])] and which columns hold the CI limits [ci((estimates[,2] estimates[,3]))].
coefplot matrix(estimates[,1]), ci((estimates[,2] estimates[,3])) xline(1)
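Since hazard ratios are ratios, it may be preferable to show them on a logarithmic x-axis. The sketch below assumes that the Stata matrix "estimates" from step 3 exists; the axis labels and the axis title are choices made for this illustration and are passed on to the graph as ordinary twoway options:

```stata
coefplot matrix(estimates[,1]), ci((estimates[,2] estimates[,3])) ///
    xline(1) xscale(log) xlabel(0.125 0.25 0.5 1 2) ///
    xtitle("Hazard ratio (95% CI)")
```

On the log scale, a halving and a doubling of the hazard lie equally far from the reference line at 1.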
Step 5: Adding estimates and ci limits as labels
This has become much easier from coefplot version 1.8.1. The option mlabel specifies the text of the label, built from temporary internal variables (here @b, @ll and @ul). Options such as mlabposition and mlabsize then control where and how the label appears:
coefplot matrix(estimates[,1]), ci((estimates[,2] estimates[,3])) xline(1, lcolor(red%40)) ///
    mlabel(string(@b, "%5.2f") + " (" + string(@ll, "%5.2f") + "; " + string(@ul, "%5.2f") + ")") ///
    mlabposition(12) mlabsize(vsmall)
Last update: 2022-04-22, Stata version 17