-metadata-
metadata is a tool to get metadata of datasets without necessarily having to load the datasets.
The using modifier takes either a dataset or a directory as its argument.
If a dataset is specified, the metadata of that dataset is presented in the Results window, optionally saved in a specified dataset, and optionally sent to the Data Editor window.
If a directory is specified, the metadata of all datasets in that directory is collected.
If the option searchsubdirs is also set, metadata is collected from that directory and all of its subdirectories.
Installation
To install use the command: ssc install matrixtools
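To check that the installation succeeded, one can ask Stata where the command file is located. This is a minimal sketch that assumes -metadata- is shipped as an ado-file named metadata.ado in the matrixtools package. -which- is a built-in Stata command and reports an error if the file is not found on the ado-path.

```stata
* Install (or update) the package from SSC
ssc install matrixtools, replace
* Assumption: the command is implemented in metadata.ado
which metadata
```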
Demonstration
-metadata- can be used to report on the current dataset:
sysuse auto, clear
metadata *
--------------------------------------------------------------------------------------------------------------------
Name          Index  Label                   Value Label Name  Format  Value Label Values         n  unique  missing
--------------------------------------------------------------------------------------------------------------------
make              1  Make and model                            %-18s                             74      74        0
price             2  Price                                     %8.0gc                            74      74        0
mpg               3  Mileage (mpg)                             %8.0g                             74      21        0
rep78             4  Repair record 1978                        %8.0g                             69       5        5
headroom          5  Headroom (in.)                            %6.1f                             74       8        0
trunk             6  Trunk space (cu. ft.)                     %8.0g                             74      18        0
weight            7  Weight (lbs.)                             %8.0gc                            74      64        0
length            8  Length (in.)                              %8.0g                             74      47        0
turn              9  Turn circle (ft.)                         %8.0g                             74      18        0
displacement     10  Displacement (cu. in.)                    %8.0g                             74      31        0
gear_ratio       11  Gear ratio                                %6.2f                             74      36        0
foreign          12  Car origin              origin            %8.0g   0 "Domestic" 1 "Foreign"  74       2        0
--------------------------------------------------------------------------------------------------------------------
One can get the metadata report on a subset of variables. For example, one could be interested in looking at all person id variables, where the only thing known is that the names contain the string "id" or "ID".
metadata weight make foreign rep78
-----------------------------------------------------------------------------------------------------------
Name     Index  Label               Value Label Name  Format  Value Label Values         n  unique  missing
-----------------------------------------------------------------------------------------------------------
weight       7  Weight (lbs.)                         %8.0gc                            74      64        0
make         1  Make and model                        %-18s                             74      74        0
foreign     12  Car origin          origin            %8.0g   0 "Domestic" 1 "Foreign"  74       2        0
rep78        4  Repair record 1978                    %8.0g                             69       5        5
-----------------------------------------------------------------------------------------------------------
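The search for id variables mentioned above could be sketched as follows. The auto dataset has no variable names containing "id", so the sketch builds a small illustrative dataset; the variable names person_id, motherID and age are hypothetical. It relies only on standard Stata varlist wildcards and on the fact that variable names are case-sensitive, so both patterns are needed.

```stata
* Hypothetical toy dataset, for illustration only
clear
set obs 5
generate person_id = _n
generate motherID  = _n + 100
generate age       = 30 + _n
* Report only the variables whose names contain "id" or "ID"
metadata *id* *ID*
```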
As can be seen, -metadata- is inspired by -describe- and -codebook-.
A warning is given when a nonexistent variable is included in the varlist:
capture noisily metadata weight make xxx foreign
variable xxx not found
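If the warning is unwanted, the varlist can be filtered before -metadata- is called. This is a sketch using the built-in -confirm variable- command; the local macro names wanted and found are illustrative and not part of -metadata-.

```stata
sysuse auto, clear
* Candidate names, one of which (xxx) does not exist
local wanted weight make xxx foreign
local found
foreach v of local wanted {
    * exact: do not accept abbreviations of variable names
    capture confirm variable `v', exact
    if _rc == 0 {
        local found `found' `v'
    }
    else {
        display as text "skipping `v': not in the dataset"
    }
}
* Only the existing variables are passed on
metadata `found'
```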
By specifying a path and filename with the using modifier, one gets the metadata of that file:
local pwd `c(pwd)'
cd "`c(sysdir_base)'"
metadata * using `"./c/census.dta"'
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Dataset_path  Dataset     Filesize kb  Name      Index  Label                          Value Label Name  Format   Value Label Values                      n  unique  missing
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
./c           census.dta        6.044  state         1  State                                            %-14s                                           50      50        0
./c           census.dta        6.044  state2        2  Two-letter state abbreviation                    %-2s                                            50      50        0
./c           census.dta        6.044  region        3  Census region                  cenreg            %-8.0g   1 "NE" 2 "N Cntrl" 3 "South" 4 "West"  50       4        0
./c           census.dta        6.044  pop           4  Population                                       %12.0gc                                         50      50        0
./c           census.dta        6.044  poplt5        5  Pop, < 5 year                                    %12.0gc                                         50      50        0
./c           census.dta        6.044  pop5_17       6  Pop, 5 to 17 years                               %12.0gc                                         50      50        0
./c           census.dta        6.044  pop18p        7  Pop, 18 and older                                %12.0gc                                         50      50        0
./c           census.dta        6.044  pop65p        8  Pop, 65 and older                                %12.0gc                                         50      50        0
./c           census.dta        6.044  popurban      9  Urban population                                 %12.0gc                                         50      50        0
./c           census.dta        6.044  medage       10  Median age                                       %9.2f                                           50      37        0
./c           census.dta        6.044  death        11  Number of deaths                                 %12.0gc                                         50      50        0
./c           census.dta        6.044  marriage     12  Number of marriages                              %12.0gc                                         50      50        0
./c           census.dta        6.044  divorce      13  Number of divorces                               %12.0gc                                         50      50        0
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
cd "`pwd'"
With the option keep, the metadata is kept in the Data Editor.
The option searchsubdirs makes -metadata- search all subdirectories for datasets, in addition to the specified directory.
Below, the loaded dataset auto is replaced by the search results from the directory "`c(sysdir_base)'/c" and all its subdirectories. The search results are also saved in the Stata dataset meta.dta:
local pwd `c(pwd)'
cd "`c(sysdir_base)'"
sysuse auto
metadata * using `"./c"', savein("`pwd'/meta.dta", replace) keep searchsubdirs
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Dataset_path Dataset Filesize kb Name Index Label Value Label Name Format Value Label Values n unique missing ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ./c cancer.dta 8.994 studytime 1 Months to death or end of exp. %8.0g 48 28 0 ./c cancer.dta 8.994 died 2 Patient died diedlbl %8.0g 0 "No" 1 "Yes" 48 2 0 ./c cancer.dta 8.994 drug 3 Drug type type %8.0g 1 "Placebo" 2 "Other" 3 "NA" 48 3 0 ./c cancer.dta 8.994 age 4 Patient's age at start of exp. %8.0g 48 18 0 ./c cancer.dta 8.994 _st 5 1 if record is to be used; 0 otherwise %8.0g 48 1 0 ./c cancer.dta 8.994 _d 6 1 if failure; 0 if censored %8.0g 48 2 0 ./c cancer.dta 8.994 _t 7 Analysis time when record ends %10.0g 48 28 0 ./c cancer.dta 8.994 _t0 8 Analysis time when record begins %10.0g 48 1 0 ./c census.dta 6.044 state 1 State %-14s 50 50 0 ./c census.dta 6.044 state2 2 Two-letter state abbreviation %-2s 50 50 0 ./c census.dta 6.044 region 3 Census region cenreg %-8.0g 1 "NE" 2 "N Cntrl" 3 "South" 4 "West" 50 4 0 ./c census.dta 6.044 pop 4 Population %12.0gc 50 50 0 ./c census.dta 6.044 poplt5 5 Pop, < 5 year %12.0gc 50 50 0 ./c census.dta 6.044 pop5_17 6 Pop, 5 to 17 years %12.0gc 50 50 0 ./c census.dta 6.044 pop18p 7 Pop, 18 and older %12.0gc 50 50 0 ./c census.dta 6.044 pop65p 8 Pop, 65 and older %12.0gc 50 50 0 ./c census.dta 6.044 popurban 9 Urban population %12.0gc 50 50 0 ./c census.dta 6.044 medage 10 Median age %9.2f 50 37 0 ./c census.dta 6.044 death 11 Number of deaths %12.0gc 50 50 0 ./c census.dta 6.044 marriage 12 Number of marriages %12.0gc 50 50 0 ./c census.dta 6.044 divorce 13 
Number of divorces %12.0gc 50 50 0 ./c citytemp.dta 20.269 division 1 Census division division %16.0g 1 "N Eng" 2 "Mid Atl" 3 "ENC" 4 "WNC" 5 "S Atl" 6 "ESC" 7 "WSC" 8 "Mtn" 9 "Pacific" 956 9 0 ./c citytemp.dta 20.269 region 2 Census region region %13.0g 1 "NE" 2 "N Cntrl" 3 "South" 4 "West" 956 4 0 ./c citytemp.dta 20.269 heatdd 3 Heating degree days %8.0g 953 471 3 ./c citytemp.dta 20.269 cooldd 4 Cooling degree days %8.0g 953 438 3 ./c citytemp.dta 20.269 tempjan 5 Average January temperature %9.0g 954 310 2 ./c citytemp.dta 20.269 tempjuly 6 Average July temperature %9.0g 954 196 2 ./c citytemp4.dta 20.269 division 1 Census division division %16.0g 1 "N Eng" 2 "Mid Atl" 3 "ENC" 4 "WNC" 5 "S Atl" 6 "ESC" 7 "WSC" 8 "Mtn" 9 "Pacific" 956 9 0 ./c citytemp4.dta 20.269 region 2 Census region region %13.0g 1 "NE" 2 "N Cntrl" 3 "South" 4 "West" 956 4 0 ./c citytemp4.dta 20.269 heatdd 3 Heating degree days %8.0g 954 472 2 ./c citytemp4.dta 20.269 cooldd 4 Cooling degree days %8.0g 954 439 2 ./c citytemp4.dta 20.269 tempjan 5 Average January temperature %9.0g 954 310 2 ./c citytemp4.dta 20.269 tempjuly 6 Average July temperature %9.0g 954 196 2 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
cd "`pwd'"
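Because the search results were saved in meta.dta, they can be reloaded later without repeating the search. This is a minimal sketch that assumes the working directory is the one holding meta.dta, as it is after the final cd above. It uses only the standard commands -use-, -describe- and -list-, and assumes nothing about the names of the saved variables.

```stata
* Reload the saved search results
use "meta.dta", clear
* Inspect the structure of the saved metadata
describe
* Show the first few rows
list in 1/5
```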
The command -metadata- can also save search results as HTML or LaTeX. Below, the search results are saved as HTML, and the option nolog suppresses logging of the results:
metadata m* using `"`c(sysdir_base)'"', savein(meta.html, replace) nolog searchsubdirs
Last update: 2022-04-19, Stata version 17