What is -metadata-

-metadata- is a tool for retrieving the metadata of datasets without necessarily having to load them.

The using modifier can take either a dataset or a directory as its argument.

If a dataset is specified, the metadata of that dataset is presented in the Results window. It can optionally be saved in a specified dataset and sent to the Data Editor window.

If a directory is specified then metadata of all datasets in that directory are collected.

If, in addition, the option searchsubdirs is set, metadata is collected from that directory as well as from all of its subdirectories.

-metadata- is a part of the package matrixtools.

Syntax

The syntax is: metadata varlist [using dataset or directory] [, options]

Options

Main:

matprint options:

Versions

-metadata- has been tested in Stata versions 12.1 IC, 13.1 IC, and 14.2 IC.

Installation

To install use the command: ssc install matrixtools

A demonstration of -metadata-

-metadata- can be used to report on the current dataset:


sysuse auto, clear
metadata *

--------------------------------------------------------------------------------------------------------------------
Name          Index  Label                   Value Label Name  Format  Value Label Values         n  unique  missing
--------------------------------------------------------------------------------------------------------------------
make              1  Make and Model                            %-18s                             74      74        0
price             2  Price                                     %8.0gc                            74      74        0
mpg               3  Mileage (mpg)                             %8.0g                             74      21        0
rep78             4  Repair Record 1978                        %8.0g                             69       5        5
headroom          5  Headroom (in.)                            %6.1f                             74       8        0
trunk             6  Trunk space (cu. ft.)                     %8.0g                             74      18        0
weight            7  Weight (lbs.)                             %8.0gc                            74      64        0
length            8  Length (in.)                              %8.0g                             74      47        0
turn              9  Turn Circle (ft.)                         %8.0g                             74      18        0
displacement     10  Displacement (cu. in.)                    %8.0g                             74      31        0
gear_ratio       11  Gear Ratio                                %6.2f                             74      36        0
foreign          12  Car type                origin            %8.0g   0 "Domestic" 1 "Foreign"  74       2        0
--------------------------------------------------------------------------------------------------------------------

The metadata report can be restricted to a subset of variables. One could, e.g., be interested in all person id variables, where the only thing known is that their names contain the string "id" or "ID". Here four variables from auto are selected:


metadata weight make foreign rep78

-----------------------------------------------------------------------------------------------------------
Name     Index  Label               Value Label Name  Format  Value Label Values         n  unique  missing
-----------------------------------------------------------------------------------------------------------
weight       7  Weight (lbs.)                         %8.0gc                            74      64        0
make         1  Make and Model                        %-18s                             74      74        0
foreign     12  Car type            origin            %8.0g   0 "Domestic" 1 "Foreign"  74       2        0
rep78        4  Repair Record 1978                    %8.0g                             69       5        5
-----------------------------------------------------------------------------------------------------------

As can be seen, -metadata- is inspired by -describe- and -codebook-.
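
The person id case mentioned above can be sketched with Stata's varlist wildcards. This is only an illustration: the toy dataset and the variable names person_id, householdID and age are hypothetical and do not come from the package documentation. It relies on -metadata- accepting wildcards in the varlist, as in the other examples in this document.

```stata
* Toy data with hypothetical variable names (assumption, for illustration only)
clear
set obs 6
generate person_id = _n
generate householdID = ceil(_n / 2)
generate age = 20 + _n

* Varlist wildcards are case sensitive, so both patterns are needed:
* *id* matches person_id and *ID* matches householdID; age is left out
metadata *id* *ID*
```

Note that a wildcard pattern that matches no variable at all makes Stata stop with a "not found" error, so each pattern should match at least one name.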

A warning is given when a nonexistent variable is added to the varlist:


capture noisily metadata weight make xxx foreign

variable xxx not found
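
If the varlist is built from names that may or may not exist, one way to avoid the message is to filter the names first. The following is a minimal sketch using only the standard Stata command -confirm variable-; the local macro names wanted and found are made up for this example.

```stata
sysuse auto, clear

* Candidate names; xxx does not exist in auto
local wanted weight make xxx foreign
local found

foreach v of local wanted {
    * confirm variable sets _rc to 0 only if the variable exists
    capture confirm variable `v'
    if _rc == 0 {
        local found `found' `v'
    }
    else {
        display as text "skipping `v': not in the dataset"
    }
}

* Report only on the variables that were found
metadata `found'
```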

By specifying a path and filename with the using modifier, one gets the metadata of that file:


local pwd `c(pwd)'
cd "`c(sysdir_base)'"
metadata * using `"./c/census.dta"'

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Dataset_path  Dataset     Filesize kb  Name      Index  Label                          Value Label Name  Format   Value Label Values                      n  unique  missing
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
./c           census.dta        6.044  state         1  State                                            %-14s                                           50      50        0
./c           census.dta        6.044  state2        2  Two-letter state abbreviation                    %-2s                                            50      50        0
./c           census.dta        6.044  region        3  Census region                  cenreg            %-8.0g   1 "NE" 2 "N Cntrl" 3 "South" 4 "West"  50       4        0
./c           census.dta        6.044  pop           4  Population                                       %12.0gc                                         50      50        0
./c           census.dta        6.044  poplt5        5  Pop, < 5 year                                    %12.0gc                                         50      50        0
./c           census.dta        6.044  pop5_17       6  Pop, 5 to 17 years                               %12.0gc                                         50      50        0
./c           census.dta        6.044  pop18p        7  Pop, 18 and older                                %12.0gc                                         50      50        0
./c           census.dta        6.044  pop65p        8  Pop, 65 and older                                %12.0gc                                         50      50        0
./c           census.dta        6.044  popurban      9  Urban population                                 %12.0gc                                         50      50        0
./c           census.dta        6.044  medage       10  Median age                                       %9.2f                                           50      37        0
./c           census.dta        6.044  death        11  Number of deaths                                 %12.0gc                                         50      50        0
./c           census.dta        6.044  marriage     12  Number of marriages                              %12.0gc                                         50      50        0
./c           census.dta        6.044  divorce      13  Number of divorces                               %12.0gc                                         50      50        0
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

cd "`pwd'"

With the option keep, the metadata is kept in memory and can be viewed in the Data Editor.

The option searchsubdirs makes -metadata- search all subdirectories for datasets, in addition to the specified directory.

Below, the loaded dataset auto is replaced by the search results from the directory "`c(sysdir_base)'/c" and all of its subdirectories. The search results are also saved in the Stata dataset meta.dta:


local pwd `c(pwd)'
cd "`c(sysdir_base)'"
sysuse auto
metadata * using `"./c"', savein("`pwd'/meta.dta", replace) keep searchsubdirs

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Dataset_path  Dataset        Filesize kb  Name       Index  Label                                   Value Label Name  Format   Value Label Values                                                                                          n  unique  missing
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
./c           cancer.dta           3.312  studytime      1  Months to death or end of exp.                            %8.0g                                                                                                               48      28        0
./c           cancer.dta           3.312  died           2  1 if patient died                                         %8.0g                                                                                                               48       2        0
./c           cancer.dta           3.312  drug           3  Drug type (1=placebo)                                     %8.0g                                                                                                               48       3        0
./c           cancer.dta           3.312  age            4  Patient's age at start of exp.                            %8.0g                                                                                                               48      18        0
./c           cancer.dta           3.312  _st            5  1 if record is to be used; 0 otherwise                    %8.0g                                                                                                               48       1        0
./c           cancer.dta           3.312  _d             6  1 if failure; 0 if censored                               %8.0g                                                                                                               48       2        0
./c           cancer.dta           3.312  _t             7  analysis time when record ends                            %10.0g                                                                                                              48      28        0
./c           cancer.dta           3.312  _t0            8  analysis time when record begins                          %10.0g                                                                                                              48       1        0
./c           census.dta           6.044  state          1  State                                                     %-14s                                                                                                               50      50        0
./c           census.dta           6.044  state2         2  Two-letter state abbreviation                             %-2s                                                                                                                50      50        0
./c           census.dta           6.044  region         3  Census region                           cenreg            %-8.0g   1 "NE" 2 "N Cntrl" 3 "South" 4 "West"                                                                      50       4        0
./c           census.dta           6.044  pop            4  Population                                                %12.0gc                                                                                                             50      50        0
./c           census.dta           6.044  poplt5         5  Pop, < 5 year                                             %12.0gc                                                                                                             50      50        0
./c           census.dta           6.044  pop5_17        6  Pop, 5 to 17 years                                        %12.0gc                                                                                                             50      50        0
./c           census.dta           6.044  pop18p         7  Pop, 18 and older                                         %12.0gc                                                                                                             50      50        0
./c           census.dta           6.044  pop65p         8  Pop, 65 and older                                         %12.0gc                                                                                                             50      50        0
./c           census.dta           6.044  popurban       9  Urban population                                          %12.0gc                                                                                                             50      50        0
./c           census.dta           6.044  medage        10  Median age                                                %9.2f                                                                                                               50      37        0
./c           census.dta           6.044  death         11  Number of deaths                                          %12.0gc                                                                                                             50      50        0
./c           census.dta           6.044  marriage      12  Number of marriages                                       %12.0gc                                                                                                             50      50        0
./c           census.dta           6.044  divorce       13  Number of divorces                                        %12.0gc                                                                                                             50      50        0
./c           citytemp.dta        16.974  division       1  Census Division                         division          %8.0g    1 "N. Eng." 2 "Mid Atl" 3 "E.N.C." 4 "W.N.C." 5 "S. Atl." 6 "E.S.C." 7 "W.S.C." 8 "Mountain" 9 "Pacific"  956       9        0
./c           citytemp.dta        16.974  region         2  Census Region                           region            %8.0g    1 "NE" 2 "N Cntrl" 3 "South" 4 "West"                                                                     956       4        0
./c           citytemp.dta        16.974  heatdd         3  Heating degree days                                       %8.0g                                                                                                              953     471        3
./c           citytemp.dta        16.974  cooldd         4  Cooling degree days                                       %8.0g                                                                                                              953     438        3
./c           citytemp.dta        16.974  tempjan        5  Average January temperature                               %9.0g                                                                                                              954     310        2
./c           citytemp.dta        16.974  tempjuly       6  Average July temperature                                  %9.0g                                                                                                              954     196        2
./c           citytemp4.dta       16.979  division       1  Census Division                         division          %8.0g    1 "N. Eng." 2 "Mid Atl" 3 "E.N.C." 4 "W.N.C." 5 "S. Atl." 6 "E.S.C." 7 "W.S.C." 8 "Mountain" 9 "Pacific"  956       9        0
./c           citytemp4.dta       16.979  region         2  Census Region                           region            %10.0g   1 "N.E." 2 "N. Central" 3 "South" 4 "West"                                                                956       4        0
./c           citytemp4.dta       16.979  heatdd         3  Heating degree days                                       %8.0g                                                                                                              954     472        2
./c           citytemp4.dta       16.979  cooldd         4  Cooling degree days                                       %8.0g                                                                                                              954     439        2
./c           citytemp4.dta       16.979  tempjan        5  Average January temperature                               %9.0g                                                                                                              954     310        2
./c           citytemp4.dta       16.979  tempjuly       6  Average July temperature                                  %9.0g                                                                                                              954     196        2
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

cd "`pwd'"
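
Because keep leaves the collected metadata in memory, ordinary Stata commands can be used to inspect it. The following is a minimal sketch, assuming the commands above have just been run in the same session, so that the working directory is the original one and meta.dta was written there. The variable names in the result are not documented here, so the sketch only uses commands that do not depend on them.

```stata
* Overview of the metadata now in memory (it replaced auto because of keep)
describe
list in 1/5

* The saved copy can be described without loading it
describe using "meta.dta"
```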

The command -metadata- can also save search results as HTML or LaTeX. Below, the search results are saved as HTML, and logging of the results is suppressed with the option nolog:


metadata m* using `"`c(sysdir_base)'"', savein(meta.html, replace) nolog searchsubdirs

The search results as html


The do file for this document

Last update: 2017-03-31