What is -sumat-

-sumat- is a flexible tool to generate survey tables. It differs from existing tools in that more statistics are possible to report and in that the summary table is saved for further data handling.

-sumat- is integrated with -matprint- such that styling in smcl, html, latex, csv and markdown is quite easy.

-sumat- is a part of the package matrixtools.

Syntax

The syntax is: sumat varlist [if] [in] [using] , **stat**istics(list of statistics) [options matprint_options]

Options

Required:

Optional:

Matprint options:

List of statistics

Any non lowercase version of statistics below will also do as an argument.

From summarize

Only for non string variables. In case of a string variable the value missing (.) is returned.

Derived values

Stored results

-sumat- stores the following in r():

Matrices

Versions

-sumat- is tested in version 12.1 ic, 13.1 ic, and 14.2 ic.

Installation

To install use the command: ssc install matrixtools

Examples

The example dataset


sysuse auto, clear
label define rep78 1 "1 repair" 2 "2 repairs" 3 "3 repairs" 4 "4 repairs" 5 "5 repairs"
label values rep78 rep78

To get a matrix with statistics n, missing, unique, mean, sd, and ci for the variables price, foreign and rep78:


sumat price foreign rep78 if rep78 > 2, statistics(n missing unique mean sd ci)p(75) d((0,2,0,0,2))

--------------------------------------------------------------------------
                     n  missing  unique  mean       sd  ci75% lb  ci75% ub
--------------------------------------------------------------------------
Price               64     0.00      64  6240  2925.84   5819.27   6660.70
Car type            64     0.00       2     0     0.48      0.27      0.41
Repair Record 1978  59     5.00       3     4     0.78      3.56      3.79
--------------------------------------------------------------------------

The result is stored in r(sumat) for further use:


return list

matrices:
              r(sumat) :  3 x 7

-sumat- is integrated with -matprint- and hence the result can be styled, eg as latex and decimals can be set:


sumat make foreign, statistics(n missing unique) style(tex) d(0)

\begin{table}[h]
\centering
\begin{tabular}{lrrr}
\hline
\hline
               &  n & missing & unique \\
\hline
Make and Model & 74 &       0 &     74 \\
Car type       & 74 &       0 &      2 \\
\hline
\hline
\end{tabular}
\end{table}

Note that -sumat- handles string variables when possible


sumat make price, stat(n missing unique mean ci) d((0,2)#(1,1,1))

----------------------------------------------------------------
                 n  missing  unique     mean  ci95% lb  ci95% ub
----------------------------------------------------------------
Make and Model  74        0      74        .         .         .
Price           74        0      74  6165.26   5493.24   6837.27
----------------------------------------------------------------

-sumat- can also summarize variables across a categorical variable and possibly add a total:


sumat price foreign rep78 if rep78 > 2, statistics(n fraction missing unique mean sd ci) total rowby(rep78) p(75) d((0,2,0,0,2))

---------------------------------------------------------------------------------------------------------
                                       n  fraction  missing  unique     mean       sd  ci75% lb  ci75% ub
---------------------------------------------------------------------------------------------------------
rep78(3 repairs)  Price               30     46.88        0      30  6429.23  3525.14   5688.87   7169.60
                  Car type            30     46.88        0       2     0.10     0.31      0.04      0.16
                  Repair Record 1978  30     46.88        0       1     3.00     0.00      3.00      3.00
rep78(4 repairs)  Price               18     28.13        0      18  6071.50  1709.61   5607.96   6535.04
                  Car type            18     28.13        0       2     0.50     0.51      0.36      0.64
                  Repair Record 1978  18     28.13        0       1     4.00     0.00      4.00      4.00
rep78(5 repairs)  Price               11     17.19        0      11  5913.00  2615.76   5005.74   6820.26
                  Car type            11     17.19        0       2     0.82     0.40      0.68      0.96
                  Repair Record 1978  11     17.19        0       1     5.00     0.00      5.00      5.00
Total             Price               64    100.00        0      64  6239.98  2925.84   5819.27   6660.70
                  Car type            64    100.00        0       2     0.34     0.48      0.27      0.41
                  Repair Record 1978  59     92.19        5       3     3.68     0.78      3.56      3.79
---------------------------------------------------------------------------------------------------------

Combine matrices

Matrices can be combined easily in Stata. Eg side by side:


sumat price length weight if foreign, statistics(n missing mean ci) coleq(Foreign)
matrix out = r(sumat)
sumat price length weight if !foreign, statistics(n missing mean ci) coleq(Domestic)
matrix out = out, r(sumat)
matprint out, decimals(J(1, 2, (0,0,2)))

------------------------------------------------------------------------------------------------------------
               Foreign                                        Domestic                                      
                     n  missing     mean  ci95% lb  ci95% ub         n  missing     mean  ci95% lb  ci95% ub
------------------------------------------------------------------------------------------------------------
Price               22        0  6384.68      5289      7480     52.00     0.00  6072.42   5230.64   6914.21
Length (in.)        22        0   168.55       163       174     52.00     0.00   196.13    190.69    201.58
Weight (lbs.)       22        0  2315.91      2135      2497     52.00     0.00  3317.12   3128.12   3506.11
------------------------------------------------------------------------------------------------------------

Or on top of one another:


sumat price length weight if foreign, statistics(n missing mean ci) roweq(Foreign) coleq(" ")
matrix out = r(sumat)
sumat price length weight if !foreign, statistics(n missing mean ci) roweq(Domestic)
matrix out = out \ r(sumat)
matprint out, decimals((0,0,2)) 

-----------------------------------------------------------------
                          n  missing     mean  ci95% lb  ci95% ub
-----------------------------------------------------------------
Foreign   Price          22        0  6384.68   5289.07   7480.29
          Length (in.)   22        0   168.55    162.83    174.26
          Weight (lbs.)  22        0  2315.91   2134.97   2496.85
Domestic  Price          52        0  6072.42   5230.64   6914.21
          Length (in.)   52        0   196.13    190.69    201.58
          Weight (lbs.)  52        0  3317.12   3128.12   3506.11
-----------------------------------------------------------------

Or the grand scale tabulation. Note the necessity of option full:


sumat price mpg weight if foreign, statistics(n missing unique mean sd) total rowby(rep78) coleq(Foreign) full
matrix out = r(sumat)
sumat price mpg weight if !foreign, statistics(n missing unique mean sd) total rowby(rep78) coleq(Domestic)
matrix out = out, r(sumat)
matprint out, d((1,1)#(0,0,2,2,2))

------------------------------------------------------------------------------------------------------------------------
                                 Foreign                                     Domestic                                   
                                       n  missing  unique     mean       sd         n  missing  unique     mean       sd
------------------------------------------------------------------------------------------------------------------------
rep78(1 repair)   Price                0        0    0.00        .        .         2        0    2.00  4564.50   522.55
                  Mileage (mpg)        0        0    0.00        .        .         2        0    2.00    21.00     4.24
                  Weight (lbs)         0        0    0.00        .        .         2        0    2.00  3100.00   523.26
rep78(2 repairs)  Price                0        0    0.00        .        .         8        0    8.00  5967.63  3579.36
                  Mileage (mpg)        0        0    0.00        .        .         8        0    6.00    19.13     3.76
                  Weight (lbs)         0        0    0.00        .        .         8        0    7.00  3353.75   446.00
rep78(3 repairs)  Price                3        0    3.00  4828.67  1285.61        27        0   27.00  6607.07  3661.27
                  Mileage (mpg)        3        0    3.00    23.33     2.52        27        0   13.00    19.00     4.09
                  Weight (lbs)         3        0    3.00  2010.00   158.75        27        0   26.00  3442.22   645.37
rep78(4 repairs)  Price                9        0    9.00  6261.44  1896.09         9        0    9.00  5881.56  1592.02
                  Mileage (mpg)        9        0    6.00    24.89     2.71         9        0    7.00    18.44     4.59
                  Weight (lbs)         9        0    9.00  2207.78   335.48         9        0    7.00  3532.22   806.21
rep78(5 repairs)  Price                9        0    9.00  6292.67  2765.63         2        0    2.00  4204.50   311.83
                  Mileage (mpg)        9        0    6.00    26.33     9.37         2        0    2.00    32.00     2.83
                  Weight (lbs)         9        0    9.00  2403.33   405.09         2        0    2.00  1960.00   226.27
Total             Price               22        0   22.00  6384.68  2621.92        52        0   52.00  6072.42  3097.10
                  Mileage (mpg)       22        0   13.00    24.77     6.61        52        0   17.00    19.83     4.74
                  Weight (lbs.)       22        0   22.00  2315.91   433.00        52        0   47.00  3317.12   695.36
------------------------------------------------------------------------------------------------------------------------

Export to excel

The matrix can easily be reused in Excel by using -putexcel-.

In Stata 13 it looks like:

putexcel a1=matrix(out, names) using "delete_me13.xlsx", sheet(nhb, replace) replace

In Stata 14 it looks like:

putexcel set delete_me14.xlsx, replace putexcel (a1) = matrix( out ), names


The do file for this document

Last update: 2017-03-31