Using -basetable- to summarise a set of items


When working with questionnaires, it is often necessary to summarise a battery of questions/items that all share the same response categories, e.g. a Likert scale. This can be done by selecting the questions to summarise, reshaping the dataset of these questions to long format and then reporting the result using -basetable-.

The data

The data used are from the 38th round of the State Survey conducted by Michigan State University’s Institute for Public Policy and Social Research. The survey was administered to 949 Michigan citizens from May 28 to July 18, 2005, by telephone.

The focus of the survey included charitable giving and volunteer activities of Michigan households. Five questions measured the public’s faith and trust in charity organizations.

Respondents were asked to indicate to what degree they agree with five statements. The questions have four response categories corresponding to "strongly agree", "somewhat agree", "somewhat disagree", and "strongly disagree".

The answers to the five questions are gathered in the dataset:

use "", clear

The variables are renamed and notes are added. Variable labels are limited to 80 characters, which is usually too short to hold the full wording of a question. One way of saving the full question is to use -notes-.

rename ta? answer?
notes answer1: "Charitable organizations are more effective now in providing services than they were 5 years ago"
notes answer2: "I place a low degree of trust in charitable organizations"
notes answer3: "Most charitable organizations are honest and ethical in their use of donated funds"
notes answer4: "Generally, charitable organizations play a major role in making our communities better places to live"
notes answer5: "On the whole, charitable organizations do not do a very good job in helping those who need help"
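
The stored wording can be read back later. The following is a minimal sketch on a small made-up dataset (not the survey data), with a placeholder note text, showing how a note is listed and how it can be reached as a variable characteristic:

```stata
* Toy data (hypothetical), used only to show how notes are stored and listed
clear
set obs 3
generate answer1 = _n
notes answer1: Placeholder for the full wording of question 1

* List every note attached to answer1
notes list answer1

* Notes are stored as characteristics: note0 holds the count, note1 the first note
display "`answer1[note0]'"
display `"`answer1[note1]'"'
```

Reading the note through the characteristic makes it possible to reuse the full question text in a macro, for example in a table caption.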


Making the dataset long

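The variable labels, which typically hold the questions, are lost when the answer variables are combined by the reshape. They must therefore first be converted to a value label that can be attached to the question variable afterwards.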

* Copy the variable label of each answer variable into the value label "question"
forvalues q = 1/5 {
  label define question `q' "`:variable label answer`q''", modify
}

A long dataset is created with one row for each id and each question, each row holding one (possibly missing) answer.

generate id = _n
reshape long answer, i(id) j(question)

(j = 1 2 3 4 5)

Data                               Wide   ->   Long
Number of observations              949   ->   4,745       
Number of variables                   6   ->   3           
j variable (5 values)                     ->   question
xij variables:
            answer1 answer2 ... answer5   ->   answer
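
The number of observations in the output above follows from 949 respondents times 5 questions, i.e. 4,745 rows. Checks of this kind can be scripted. The sketch below uses a small made-up wide dataset (4 respondents, 3 items, arbitrary seed), so the numbers are illustrative only; on the survey data the corresponding check would be that _N equals 949 * 5.

```stata
* Toy wide data (hypothetical): 4 respondents, 3 items, one missing answer
clear
set seed 12345
set obs 4
generate id = _n
forvalues q = 1/3 {
    generate answer`q' = runiformint(1, 4)
}
replace answer2 = . in 1

reshape long answer, i(id) j(question)

* One row per respondent and question: 4 x 3 = 12
assert _N == 4 * 3
* id and question together identify each row
isid id question
* Missing answers are kept as rows, not dropped
count if missing(answer)
assert r(N) == 1
```

Because -assert- and -isid- stop with an error when a check fails, they act as a guard in a do-file before any tables are produced.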

The long version of the data is relabelled.

label values question question
label variable answer "Answer"
label variable question "Public’s faith and trust in charity organizations"

Summarise questions/items using -basetable-

basetable answer question(r), notopcount

Columns by: Answer                                          strongly agree       agree    disagree  strongly disagree        Total  P-value
Public’s faith and trust in charity organizations, n (%)                                                                                   
  Charitable Organizations More Effective                       203 (22.9)  447 (50.5)  177 (20.0)           58 (6.6)  885 (100.0)         
  Degree of Trust                                               185 (20.3)  263 (28.8)  362 (39.7)         102 (11.2)  912 (100.0)         
  Charitable Organizations Honest/Ethical                       205 (22.1)  511 (55.0)  158 (17.0)           55 (5.9)  929 (100.0)         
  Role Improving Communities                                    372 (39.8)  438 (46.9)    88 (9.4)           36 (3.9)  934 (100.0)         
  Job Delivering Services                                       266 (28.8)  350 (37.9)  221 (23.9)           86 (9.3)  923 (100.0)     0.00
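
Each row of the table sums to 100.0, which suggests that the (r) suffix requests row percentages, and the row totals differ between questions because missing answers are left out. The counts and percentages can be cross-checked with the built-in -tabulate- command. The sketch below runs the whole sequence (value label from variable labels, reshape, relabel, tabulate) on a small made-up dataset with hypothetical item labels; it does not use -basetable-, which is a user-written command and has to be installed separately.

```stata
* Toy wide data (hypothetical): 6 respondents, 2 items on a 1-4 scale
clear
input id answer1 answer2
1 1 4
2 2 3
3 2 .
4 3 2
5 4 1
6 1 1
end
label variable answer1 "Item one"
label variable answer2 "Item two"

* Copy each variable label into the value label "question" before reshaping
forvalues q = 1/2 {
    label define question `q' "`:variable label answer`q''", modify
}
reshape long answer, i(id) j(question)
label values question question

* Row percentages per item; rows with a missing answer are left out
tabulate question answer, row
```

On the survey data, the same -tabulate- call on the long dataset should give the counts and row percentages shown in the -basetable- output.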

The do file for this document

Last update: 2022-04-21, Stata version 17