360Studies

Your Destination for Career Excellence in Bioscience, Statistics, and Data Science

How can you recode a variable in Stata

Before starting this exercise, you should have the “Digital.dta” file. (Download link is given below)

Click here to download “Digital.dta”

Importing the “Digital.dta” file.

set maxvar 10000
use "/Users/pankajchowdhury/Downloads/Digital.dta"

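Here the file is saved at "/Users/pankajchowdhury/Downloads/Digital.dta"; change the path to wherever you saved your copy. The set maxvar line raises the limit on the number of variables (it is available in Stata SE and MP only) and can be skipped for a file this small.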

Observations: 724115
Variables: 31

Variable name   Storage type   Display format   Value label   Variable label
v005            long           %12.0g           v005          women's individual sample weight (6 decimals)
v013            byte           %8.0g            V013          age in 5-year groups
v024            byte           %8.0g            V024          state
v025            byte           %8.0g            V025          type of place of residence
v130            byte           %8.0g            V130          religion
v133            byte           %8.0g            V133          education in single years
v151            byte           %8.0g            V151          sex of household head
v157            byte           %8.0g            V157          frequency of reading newspaper or magazine
v158            byte           %8.0g            V158          frequency of listening to radio
v159            byte           %8.0g            V159          frequency of watching television
v169a           byte           %8.0g            V169A         owns a mobile telephone
v190            byte           %8.0g            V190          wealth index combined
v217            byte           %8.0g            V217          knowledge of ovulatory cycle
v504            byte           %8.0g            V504          currently residing with husband/partner
v702            byte           %8.0g            V702          husband/partner's highest year of education (at level in v701)
v704            byte           %8.0g            V704          husband/partner's occupation
v715            byte           %8.0g            V715          husband/partner's total number of years of education
v730            byte           %8.0g            V730          husband/partner's age
v743f           byte           %8.0g            V743F         person who usually decides what to do with money husband earns
v746            byte           %8.0g            V746          respondent earns more than husband/partner
d005            long           %12.0g           d005          weight for domestic violence (6 decimals)
d102            byte           %8.0g            d012          number of control issues answered 'yes' (d101x = 1)
sweight         long           %12.0g           sweight       sample weight (6 decimals) (state level)
s116            byte           %8.0g            S116          belong to a scheduled caste, a scheduled tribe, other backward class
s303            int            %8.0g            S303          time period not living with husband
s311            byte           %8.0g            S311          type of relationship to current husband,prior to marriage
s931            byte           %8.0g            S931          do you have a bank or savings account that you yourself use
s932            byte           %8.0g            S932          do you have any mobile phone that you yourself use
s933            byte           %8.0g            S933          do you use your mobile phone for any financial transaction ?
s934            byte           %8.0g            S934          have you ever used the internet?
s1004p          byte           %8.0g            S1004P        source of information about aids: internet

Code 1:

numlabel, add

Explanation : 

In Stata, the numlabel command does not create value labels; it edits labels that already exist by putting the numeric code in front of the label text. With the add option, a label such as “poorest” attached to the value 1 becomes “1. poorest”. Because no label name is given here, the change applies to every value label in the dataset. This is helpful before recoding, because you can see which number stands behind each category.

You can now see the codes next to the labels by tabulating any labelled variable.
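To see what numlabel changes without touching Digital.dta, here is a minimal sketch on a three-row toy dataset. The variable wealth and the label name wealth_lbl are made up for this illustration.

```stata
* Toy data: a hypothetical 3-category variable, not taken from Digital.dta
clear
input byte wealth
1
2
3
end
label define wealth_lbl 1 "poor" 2 "middle" 3 "rich"
label values wealth wealth_lbl

* add puts the numeric code in front of each label text
numlabel wealth_lbl, add
label list wealth_lbl

* remove strips that prefix again
numlabel wealth_lbl, remove
label list wealth_lbl
```

Naming the label (wealth_lbl) limits the change to that one label; leaving the name out, as in Code 1, applies it to all value labels.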

Code 2:

ta v190

Result :  

wealth index combined       Freq.    Percent      Cum.
1. poorest               1,49,844      20.69     20.69
2. poorer                1,60,340      22.14     42.84
3. middle                1,51,505      20.92     63.76
4. richer                1,39,607      19.28     83.04
5. richest               1,22,819      16.96    100
Total                    7,24,115     100

Explanation : ta is short for tabulate. ta v190 prints a one-way frequency table for v190 with the count, the percentage and the cumulative percentage of each category.
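Two tabulate options are handy when preparing a recode. The sketch below uses auto.dta, which ships with Stata, purely as a stand-in dataset; it is not part of this exercise.

```stata
* auto.dta is installed with Stata and is used here only for illustration
sysuse auto, clear

* Default table: observations with a missing value are left out
tabulate rep78

* missing adds a row for the missing values
tabulate rep78, missing

* nolabel shows the numeric codes instead of the value labels
tabulate foreign, nolabel
```

Comparing the totals of the first two tables shows at a glance how many observations are missing.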

Now, I will present various recoding techniques to you, demonstrating different ways to transform and manipulate variables in Stata.

Code 3:

recode v190 (1 2 = 1 "Poor") (3 = 2 "Middle") (4/5 = 3 "Rich"), gen(Income) label(Income)

Explanation :

This command does two things: it collapses the five categories of v190 into three, and it stores the result in a new, labelled variable.

  1. Recoding:
    • recode v190: This names the variable that you want to recode.
    • (1 2 = 1 "Poor"): This maps the values 1 and 2 of v190 to the new value 1 and gives that value the label “Poor”.
    • (3 = 2 "Middle"): This maps the value 3 of v190 to the new value 2 and gives it the label “Middle”.
    • (4/5 = 3 "Rich"): This maps the range 4 to 5 of v190 to the new value 3 and gives it the label “Rich”.
  2. Generating and labelling:
    • gen(Income): This stores the recoded values in a new variable named “Income”. The original v190 is left unchanged.
    • label(Income): This is the name of the value label that holds “Poor”, “Middle” and “Rich”. It is not the variable label; Stata sets the variable label to “RECODE of v190 (wealth index combined)” on its own, as the table heading in the result shows.

To summarize, the command groups the five wealth quintiles into three categories, saves them in the new variable “Income”, and attaches a value label, also named “Income”, to it.
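A quick way to confirm that a recode did what you intended is to cross-tabulate the old variable against the new one. The sketch below repeats the same rules on five toy observations, so it runs without Digital.dta.

```stata
* Toy data: one observation per wealth quintile
clear
input byte v190
1
2
3
4
5
end
recode v190 (1 2 = 1 "Poor") (3 = 2 "Middle") (4/5 = 3 "Rich"), gen(Income) label(Income)

* Each row of v190 should have exactly one non-zero cell
tabulate v190 Income

* The three labels are stored under the name given in label()
label list Income
describe Income
```

On the real data, the same tabulate v190 Income line works as a check after Code 3.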

Result:

RECODE of v190 (wealth index combined)       Freq.    Percent      Cum.
Poor                                      3,10,184      42.84     42.84
Middle                                    1,51,505      20.92     63.76
Rich                                      2,62,426      36.24    100
Total                                     7,24,115     100

Code 4:

ta v024

Result :

state                                           Freq.    Percent      Cum.
1. jammu & kashmir                             23,037       3.18      3.18
2. himachal pradesh                            10,368       1.43      4.61
3. punjab                                      21,771       3.01      7.62
4. chandigarh                                     746       0.1       7.72
5. uttarakhand                                 13,280       1.83      9.56
6. haryana                                     21,909       3.03     12.58
7. nct of delhi                                11,159       1.54     14.12
8. rajasthan                                   42,990       5.94     20.06
9. uttar pradesh                               93,124      12.86     32.92
10. bihar                                      42,483       5.87     38.79
11. sikkim                                      3,271       0.45     39.24
12. arunachal pradesh                          19,765       2.73     41.97
13. nagaland                                    9,694       1.34     43.31
14. manipur                                     8,042       1.11     44.42
15. mizoram                                     7,279       1.01     45.42
16. tripura                                     7,314       1.01     46.43
17. meghalaya                                  13,089       1.81     48.24
18. assam                                      34,979       4.83     53.07
19. west bengal                                21,408       2.96     56.03
20. jharkhand                                  26,495       3.66     59.69
21. odisha                                     27,971       3.86     63.55
22. chhattisgarh                               28,468       3.93     67.48
23. madhya pradesh                             48,410       6.69     74.17
24. gujarat                                    33,343       4.6      78.77
25. dadra & nagar haveli and daman & diu        2,713       0.37     79.15
27. maharashtra                                33,755       4.66     83.81
28. andhra pradesh                             10,975       1.52     85.32
29. karnataka                                  30,455       4.21     89.53
30. goa                                         2,030       0.28     89.81
31. lakshadweep                                 1,234       0.17     89.98
32. kerala                                     10,969       1.51     91.49
33. tamil nadu                                 25,650       3.54     95.04
34. puducherry                                  3,669       0.51     95.54
35. andaman & nicobar islands                   2,397       0.33     95.87
36. telangana                                  27,518       3.8      99.67
37. ladakh                                      2,355       0.33    100
Total                                        7,24,115     100

The next command groups the states into six regions and tabulates the result:

recode v024 (1 2 3 4 6 7 8 37 = 1 "Northern Region") (5 9 23 22 = 2 "Central Region") (10 19 20 21 11 = 3 "Eastern Region") (12/18 = 4 "North Eastern Region") (24 25 27 = 5 "Western Region") (28/36 = 6 "Southern Region"), gen(Region1) label(Region1)
ta Region1

Explanation :

This command recodes the state variable v024 into six regions, stores them in a new variable, and attaches value labels to it.

Here’s a breakdown of the command:

  1. Recoding:
    • v024: This names the variable that you want to recode.
    • (1 2 3 4 6 7 8 37 = 1 "Northern Region"): This maps the values 1, 2, 3, 4, 6, 7, 8, and 37 of v024 to the new value 1 and gives it the label “Northern Region”.
    • (5 9 23 22 = 2 "Central Region"): This maps the values 5, 9, 23, and 22 to the new value 2, labelled “Central Region”.
    • (10 19 20 21 11 = 3 "Eastern Region"): This maps the values 10, 19, 20, 21, and 11 to the new value 3, labelled “Eastern Region”.
    • (12/18 = 4 "North Eastern Region"): This maps the range 12 to 18 to the new value 4, labelled “North Eastern Region”.
    • (24 25 27 = 5 "Western Region"): This maps the values 24, 25, and 27 to the new value 5, labelled “Western Region”. Each rule needs to appear only once.
    • (28/36 = 6 "Southern Region"): This maps the range 28 to 36 to the new value 6, labelled “Southern Region”.
  2. Generating and labeling:
    • gen(Region1): This stores the recoded values in a new variable named “Region1”; v024 is left unchanged.
    • label(Region1): This is the name of the value label that holds the six region names and is attached to “Region1”.
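A value that matches none of the rules is copied into the new variable unchanged, so a forgotten state code would show up as an extra, unlabelled category. The v024 table above has no code 26, so this does not happen here, but the toy sketch below (hypothetical variable names, invented rows) shows what to look for.

```stata
* Toy data: 26 is deliberately not covered by any rule
clear
input byte state
1
5
26
end
recode state (1 = 1 "Northern Region") (5 = 2 "Central Region"), gen(region) label(region_lbl)

* The unmatched value 26 is carried over as it is
list state region

* Count observations that fall outside the intended codes 1 and 2
count if !inrange(region, 1, 2) & !missing(region)
```

On the real data, the equivalent check after Code 4 would be count if !inrange(Region1, 1, 6) & !missing(Region1).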

Result :

RECODE of v024 (state)       Freq.    Percent      Cum.
Northern Region           1,34,335      18.55     18.55
Central Region            1,83,282      25.31     43.86
Eastern Region            1,21,628      16.8      60.66
North Eastern Region      1,00,162      13.83     74.49
Western Region              69,811       9.64     84.13
Southern Region           1,14,897      15.87    100
Total                     7,24,115     100

Code 5:

recode v024 (min/4 6/8 37 = 1 "Northern Region") (5 9 23 22 = 2 "Central Region") (10 19 20 21 11 = 3 "Eastern Region") (12/18 = 4 "North Eastern Region") (24 25 27 = 5 "Western Region") (28/max = 6 "Southern Region"), gen(Region2) label(Region2)

Explanation : 

This command builds the same six regions as Code 4, but uses the keywords min and max as range endpoints. Let’s break down the command:

  1. Recoding:
    • v024: This names the variable that you want to recode.
    • (min/4 6/8 37 = 1 "Northern Region"): min stands for the smallest value of v024 (here 1), so min/4 covers 1 to 4. Together with 6 to 8 and 37, these values are mapped to 1 and labelled “Northern Region”. Code 5 (Uttarakhand) is left out because it belongs to the Central Region; a range such as min/8 would pull it into the Northern Region, since recode uses the first rule that matches a value.
    • (5 9 23 22 = 2 "Central Region"): This maps the values 5, 9, 23, and 22 to the new value 2, labelled “Central Region”.
    • (10 19 20 21 11 = 3 "Eastern Region"): This maps the values 10, 19, 20, 21, and 11 to the new value 3, labelled “Eastern Region”.
    • (12/18 = 4 "North Eastern Region"): This maps the range 12 to 18 to the new value 4, labelled “North Eastern Region”.
    • (24 25 27 = 5 "Western Region"): This maps the values 24, 25, and 27 to the new value 5, labelled “Western Region”.
    • (28/max = 6 "Southern Region"): max stands for the largest value of v024 (here 37), so the range runs from 28 to 37. Code 37 (Ladakh) still ends up in the Northern Region because the first rule has already matched it.
  2. Generating and labeling:
    • gen(Region2): This stores the recoded values in a new variable named “Region2”; v024 is left unchanged.
    • label(Region2): This is the name of the value label that holds the six region names and is attached to “Region2”.
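Ranges built with min and max can easily overlap with other rules. As far as I understand recode, an overlapping value is handled by the first rule that matches it. The toy sketch below (invented variable and label names) lets you check this on your own installation.

```stata
* Toy data: the value 5 is covered by both rules
clear
input byte x
4
5
6
end
recode x (min/5 = 1 "low") (5 6 = 2 "high"), gen(y) label(y_lbl)

* Compare old and new codes side by side
list x y, nolabel
```

If 5 is listed with the new code 1, the first rule took precedence, which is why the order of the rules matters whenever ranges overlap.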

Results :

RECODE of v024 (state)       Freq.    Percent      Cum.
Northern Region           1,34,335      18.55     18.55
Central Region            1,83,282      25.31     43.86
Eastern Region            1,21,628      16.8      60.66
North Eastern Region      1,00,162      13.83     74.49
Western Region              69,811       9.64     84.13
Southern Region           1,14,897      15.87    100
Total                     7,24,115     100

Code 6:

recode v024 (1/4 6/8 37 = 1 "Northern Region") (5 9 23 22 = 2 "Central Region") (10 19 20 21 11 = 3 "Eastern Region") (12/18 = 4 "North Eastern Region") (24 25 27 = 5 "Western Region") (nonmiss = 6 "Southern Region"), gen(Region3) label(Region3)

Explanation : 

This command again builds the same six regions, but lets the keyword nonmiss (short for nonmissing) collect everything that the earlier rules did not match.

Let’s break down the command step by step:

  1. Recoding:
    • v024: This names the variable that you want to recode.
    • (1/4 6/8 37 = 1 "Northern Region"): This maps the ranges 1 to 4 and 6 to 8, plus the value 37, to the new value 1, labelled “Northern Region”. Code 5 (Uttarakhand) is left out of the range because it belongs to the Central Region.
    • (5 9 23 22 = 2 "Central Region"): This maps the values 5, 9, 23, and 22 to the new value 2, labelled “Central Region”.
    • (10 19 20 21 11 = 3 "Eastern Region"): This maps the values 10, 19, 20, 21, and 11 to the new value 3, labelled “Eastern Region”.
    • (12/18 = 4 "North Eastern Region"): This maps the range 12 to 18 to the new value 4, labelled “North Eastern Region”.
    • (24 25 27 = 5 "Western Region"): This maps the values 24, 25, and 27 to the new value 5, labelled “Western Region”.
    • (nonmiss = 6 "Southern Region"): This maps every remaining non-missing value, that is, every value not matched by an earlier rule (here 28 to 36), to the new value 6, labelled “Southern Region”. It has to come after the specific rules, and missing values stay missing.
  2. Generating and labeling:
    • gen(Region3): This stores the recoded values in a new variable named “Region3”; v024 is left unchanged.
    • label(Region3): This is the name of the value label that holds the six region names and is attached to “Region3”.
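The catch-all keywords are easiest to understand on a tiny example. The sketch below uses invented data and names; it pairs nonmissing with its counterpart missing, both placed after the specific rule.

```stata
* Toy data: two ordinary codes, one leftover code, one missing value
clear
input byte x
1
2
9
.
end
recode x (1 = 1 "first") (nonmissing = 2 "rest") (missing = 3 "no answer"), gen(y) label(y_lbl)

* 2 and 9 fall under nonmissing; the missing value gets its own code
list x y
```

Leaving out the (missing = 3 ...) rule would keep the missing value missing in y, which is what happens to any missing v024 in Code 6.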

Results : 

RECODE of v024 (state)       Freq.    Percent      Cum.
Northern Region           1,34,335      18.55     18.55
Central Region            1,83,282      25.31     43.86
Eastern Region            1,21,628      16.8      60.66
North Eastern Region      1,00,162      13.83     74.49
Western Region              69,811       9.64     84.13
Southern Region           1,14,897      15.87    100
Total                     7,24,115     100

Code 7:

ta s116

Results :

belong to a scheduled caste, a scheduled tribe, other backward class       Freq.    Percent      Cum.
1. schedule caste                                                       1,39,957      20.3      20.3
2. schedule tribe                                                       1,35,239      19.62     39.92
3. obc                                                                  2,76,881      40.16     80.07
4. none of them                                                         1,33,347      19.34     99.42
8. don't know                                                             4,030       0.58    100
Total                                                                   6,89,454     100

recode s116 (1 = 1 "Schedule Caste") (2 = 2 "Schedule Tribe") (3 = 3 "OBC") (else = 4 "Others"), gen(Caste) label(Caste2)

Explanation :

This command keeps the first three categories of s116 and puts everything else into one group:

  1. recode s116: This names the variable that you want to recode.
  2. (1 = 1 "Schedule Caste"): This keeps the value 1 as 1 and gives it the label “Schedule Caste”.
  3. (2 = 2 "Schedule Tribe"): This keeps the value 2 as 2 and gives it the label “Schedule Tribe”.
  4. (3 = 3 "OBC"): This keeps the value 3 as 3 and gives it the label “OBC”.
  5. (else = 4 "Others"): This maps every value not covered by the earlier rules to 4, labelled “Others”. That includes missing values, which is why the total rises from 6,89,454 in the table above to 7,24,115 after recoding: the 34,661 missing observations join codes 4 and 8 in “Others”.
  6. gen(Caste): This stores the recoded values in a new variable named “Caste”; s116 is left unchanged.
  7. label(Caste2): This is the name of the value label that holds the four category names and is attached to “Caste”. It does not have to match the variable name.
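Because else also sweeps up missing values, it is worth checking where they ended up. A minimal sketch on toy data (the names caste_demo and caste_demo_lbl are invented for this illustration):

```stata
* Toy data: one listed code, two leftover codes, one missing value
clear
input byte s116
1
4
8
.
end
recode s116 (1 = 1 "Schedule Caste") (else = 4 "Others"), gen(caste_demo) label(caste_demo_lbl)

* The missing option shows the row for missing s116 landing in "Others"
tabulate s116 caste_demo, missing

* Number of originally missing observations now coded as 4
count if missing(s116) & caste_demo == 4
```

If missing answers should stay missing, a rule such as (nonmissing = 4 "Others") in place of else would be one way to achieve that.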

Results:

RECODE of s116 (belong to a scheduled caste, a scheduled tribe, other backward class)       Freq.    Percent      Cum.
Schedule Caste                                                                           1,39,957      19.33     19.33
Schedule Tribe                                                                           1,35,239      18.68     38
OBC                                                                                      2,76,881      38.24     76.24
Others                                                                                   1,72,038      23.76    100
Total                                                                                    7,24,115     100

Code 8:

recode s116 (1 = 1 "Schedule Caste") (2 = 2 "Schedule Tribe") (3 = 3 "OBC") (4 8 . = 4 "Others"), gen(Caste) label(Caste1)

Explanation :

This command gives the same result as Code 7, but lists the values that go into “Others” explicitly instead of using else. If you have already run Code 7, the variable “Caste” exists and gen() will refuse to overwrite it; run drop Caste first, or pick a different name in gen().

  1. Recoding:
    • recode s116: This names the variable that you want to recode.
    • (1 = 1 "Schedule Caste"): This keeps the value 1 as 1 and gives it the label “Schedule Caste”.
    • (2 = 2 "Schedule Tribe"): This keeps the value 2 as 2 and gives it the label “Schedule Tribe”.
    • (3 = 3 "OBC"): This keeps the value 3 as 3 and gives it the label “OBC”.
    • (4 8 . = 4 "Others"): This maps the values 4 and 8 to 4, labelled “Others”. The . (period) stands for a missing value, so missing values in s116 are also recoded to 4.
  2. Generating and labeling:
    • gen(Caste): This stores the recoded values in a new variable named “Caste”; s116 is left unchanged.
    • label(Caste1): This is the name of the value label that holds the four category names and is attached to “Caste”.
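To check that the else version and the explicit-list version really agree, both can be generated under different names and compared with assert. The sketch below uses toy data and invented variable and label names, so it does not collide with “Caste”.

```stata
* Toy data: every code that occurs in s116, plus a missing value
clear
input byte s116
1
2
3
4
8
.
end

* Version with else
recode s116 (1 = 1 "Schedule Caste") (2 = 2 "Schedule Tribe") (3 = 3 "OBC") (else = 4 "Others"), gen(caste_else) label(caste_else_lbl)

* Version with the leftover values listed explicitly
recode s116 (1 = 1 "Schedule Caste") (2 = 2 "Schedule Tribe") (3 = 3 "OBC") (4 8 . = 4 "Others"), gen(caste_list) label(caste_list_lbl)

* assert stops with an error if the two variables differ anywhere
assert caste_else == caste_list
```

The explicit list is the safer choice when a new, unexpected code should stand out rather than silently fall into “Others”.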

Results:

RECODE of s116 (belong to a scheduled caste, a scheduled tribe, other backward class)       Freq.    Percent      Cum.
Schedule Caste                                                                           1,39,957      19.33     19.33
Schedule Tribe                                                                           1,35,239      18.68     38
OBC                                                                                      2,76,881      38.24     76.24
Others                                                                                   1,72,038      23.76    100
Total                                                                                    7,24,115     100
