Encode all string variables in Stata

The encode command in Stata converts a string variable into a numeric variable with value labels. However, it cannot replace (overwrite) the existing variable; it always generates a new one. Here’s a quick snippet to encode all string variables in the data in place.

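foreach var of varlist _all {
  // confirm leaves _rc at 0 only when the variable is a string
  capture confirm string variable `var'
  if (_rc == 0) {
    // encode cannot overwrite, so build a copy and swap it into place
    encode `var', gen(`var'2)
    order `var'2, after(`var')
    drop `var'
    rename `var'2 `var'
  }
}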

You can use ds, has(type string) and loop over the result instead with foreach var in `r(varlist)', but that passes over the variables twice (once in ds, then again in the loop), whereas the code above does it in a single pass.
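For comparison, here is a minimal sketch of that ds-based alternative. The local macro name strvars is my own choice, not anything Stata requires. As with the snippet above, it assumes no existing variable is already named with a trailing 2 and that each name plus the suffix still fits Stata's 32-character limit.

```stata
* First pass: ds collects the names of all string variables in r(varlist)
ds, has(type string)
local strvars `r(varlist)'

* Second pass: encode each string variable and swap the result into place
foreach var of local strvars {
    encode `var', gen(`var'2)
    order `var'2, after(`var')
    drop `var'
    rename `var'2 `var'
}
```

Copying r(varlist) into a local before the loop keeps the list safe if a later command overwrites the r() results, and foreach ... of local simply does nothing when the list is empty.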


This work is licensed under CC BY-NC 4.0.