Create age quintiles (weighted, in entire subpop2 sample). Report the range and median (midpoint) of each quintile (also weighted).
use "NHANES0708_all.dta"
svyset sdmvpsu [pw=wtmec2yr], strata(sdmvstra)
xtile age_q5=ridageyr [aweight=wtmec2yr] if subpop2==1,nq(5)
bysort age_q5: tabstat ridageyr [aweight=wtmec2yr],stat(n min max p50)
pweight: wtmec2yr
VCE: linearized
Single unit: missing
Strata 1: sdmvstra
SU 1: sdmvpsu
FPC 1: <zero>
-------------------------------------------------------------------------------
-> age_q5 = 1
variable | N min max p50
-------------+----------------------------------------
ridageyr | 342 4 6 5
------------------------------------------------------
-------------------------------------------------------------------------------
-> age_q5 = 2
variable | N min max p50
-------------+----------------------------------------
ridageyr | 337 7 9 8
------------------------------------------------------
-------------------------------------------------------------------------------
-> age_q5 = 3
variable | N min max p50
-------------+----------------------------------------
ridageyr | 213 10 11 11
------------------------------------------------------
-------------------------------------------------------------------------------
-> age_q5 = 4
variable | N min max p50
-------------+----------------------------------------
ridageyr | 237 12 14 13
------------------------------------------------------
-------------------------------------------------------------------------------
-> age_q5 = 5
variable | N min max p50
-------------+----------------------------------------
ridageyr | 192 15 17 16
------------------------------------------------------
-------------------------------------------------------------------------------
-> age_q5 = .
variable | N min max p50
-------------+----------------------------------------
ridageyr | 8441 0 80 39
------------------------------------------------------
use "NHANES0708_all.dta"
svyset sdmvpsu [pw=wtmec2yr], strata(sdmvstra)
xtile age_q5=ridageyr [aweight=wtmec2yr] if subpop2==1,nq(5)
xi: svy,subpop(if subpop2==1 & male==1 & foodinsec==0): logit wccata i.age_q5
pweight: wtmec2yr
VCE: linearized
Single unit: missing
Strata 1: sdmvstra
SU 1: sdmvpsu
FPC 1: <zero>
i.age_q5 _Iage_q5_1-5 (naturally coded; _Iage_q5_1 omitted)
(running logit on estimation sample)
Survey: Logistic regression
Number of strata = 16 Number of obs = 10,149
Number of PSUs = 32 Population size = 297,136,095
Subpop. no. obs = 439
Subpop. size = 8,131,107.05
Design df = 16
F( 4, 13) = 2.24
Prob > F = 0.1215
------------------------------------------------------------------------------
| Linearized
wccata | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Iage_q5_2 | .1842817 .3964377 0.46 0.648 -.6561288 1.024692
_Iage_q5_3 | -.1072657 .3823348 -0.28 0.783 -.9177793 .703248
_Iage_q5_4 | 1.236589 .467235 2.65 0.018 .2460951 2.227083
_Iage_q5_5 | .2704047 .5807907 0.47 0.648 -.9608166 1.501626
_cons | -1.92872 .3308428 -5.83 0.000 -2.630076 -1.227365
------------------------------------------------------------------------------
use "NHANES0708_all.dta"
svyset sdmvpsu [pw=wtmec2yr], strata(sdmvstra)
xtile age_q5=ridageyr [aweight=wtmec2yr] if subpop2==1,nq(5)
xi: svy,subpop(if subpop2==1 & male==1 & foodinsec==0): logit wccata ridageyr
pweight: wtmec2yr
VCE: linearized
Single unit: missing
Strata 1: sdmvstra
SU 1: sdmvpsu
FPC 1: <zero>
(running logit on estimation sample)
Survey: Logistic regression
Number of strata = 16 Number of obs = 10,149
Number of PSUs = 32 Population size = 297,136,095
Subpop. no. obs = 439
Subpop. size = 8,131,107.05
Design df = 16
F( 1, 16) = 4.81
Prob > F = 0.0434
------------------------------------------------------------------------------
| Linearized
wccata | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ridageyr | .0816693 .0372445 2.19 0.043 .0027145 .1606241
_cons | -2.323536 .3852478 -6.03 0.000 -3.140225 -1.506847
------------------------------------------------------------------------------
This first creates a centered age variable, and creates higher-order terms based on that variable.
use "C:\Users\Matt\Documents\EPI536\Assignments\NHANES0708_all.dta"
svyset sdmvpsu [pw=wtmec2yr], strata(sdmvstra)
# Calculates weighted mean for creating the centered age variable
svy,subpop(subpop2): mean ridageyr
# Creates weighted age variable
gen agec=ridageyr-10.2
# Creates higher-order age variables
gen agec_sq=agec^2
gen agec_cb=agec^3
# Verifying new variables
su ridageyr agec agec_sq agec_cb if subpop2==1
# Run Quadratic
xi: svy,subpop(if subpop2==1 & male==1 & foodinsec==0): logit wccata agec agec_sq
pweight: wtmec2yr
VCE: linearized
Single unit: missing
Strata 1: sdmvstra
SU 1: sdmvpsu
FPC 1: <zero>
Unknown #command
(running mean on estimation sample)
Survey: Mean estimation
Number of strata = 16 Number of obs = 10,149
Number of PSUs = 32 Population size = 297,136,095
Subpop. no. obs = 1,321
Subpop. size = 23,276,001.8
Design df = 16
--------------------------------------------------------------
| Linearized
| Mean Std. Err. [95% Conf. Interval]
-------------+------------------------------------------------
ridageyr | 10.16534 .2237876 9.690935 10.63975
--------------------------------------------------------------
Unknown #command
Unknown #command
Unknown #command
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
ridageyr | 1,321 9.663134 3.846791 4 17
agec | 1,321 -.536866 3.846791 -6.2 6.8
agec_sq | 1,321 15.07482 14.03692 .04 46.24
agec_cb | 1,321 -10.41995 118.9342 -238.328 314.432
Unknown #command
(running logit on estimation sample)
Survey: Logistic regression
Number of strata = 16 Number of obs = 10,149
Number of PSUs = 32 Population size = 297,136,095
Subpop. no. obs = 439
Subpop. size = 8,131,107.05
Design df = 16
F( 2, 15) = 3.00
Prob > F = 0.0800
------------------------------------------------------------------------------
| Linearized
wccata | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
agec | .0877135 .0455836 1.92 0.072 -.0089194 .1843464
agec_sq | -.0195448 .0089637 -2.18 0.045 -.0385469 -.0005427
_cons | -1.210634 .1982293 -6.11 0.000 -1.630861 -.7904065
------------------------------------------------------------------------------
This first creates a centered age variable, and creates higher-order terms based on that variable.
use "C:\Users\Matt\Documents\EPI536\Assignments\NHANES0708_all.dta"
svyset sdmvpsu [pw=wtmec2yr], strata(sdmvstra)
# Calculates weighted mean for creating the centered age variable
svy,subpop(subpop2): mean ridageyr
# Creates weighted age variable
gen agec=ridageyr-10.2
# Creates higher-order age variables
gen agec_sq=agec^2
gen agec_cb=agec^3
# Run Cubic
xi: svy,subpop(if subpop2==1 & male==1 & foodinsec==0): logit wccata agec agec_sq agec_cb
pweight: wtmec2yr
VCE: linearized
Single unit: missing
Strata 1: sdmvstra
SU 1: sdmvpsu
FPC 1: <zero>
Unknown #command
(running mean on estimation sample)
Survey: Mean estimation
Number of strata = 16 Number of obs = 10,149
Number of PSUs = 32 Population size = 297,136,095
Subpop. no. obs = 1,321
Subpop. size = 23,276,001.8
Design df = 16
--------------------------------------------------------------
| Linearized
| Mean Std. Err. [95% Conf. Interval]
-------------+------------------------------------------------
ridageyr | 10.16534 .2237876 9.690935 10.63975
--------------------------------------------------------------
Unknown #command
Unknown #command
Unknown #command
(running logit on estimation sample)
Survey: Logistic regression
Number of strata = 16 Number of obs = 10,149
Number of PSUs = 32 Population size = 297,136,095
Subpop. no. obs = 439
Subpop. size = 8,131,107.05
Design df = 16
F( 3, 14) = 2.28
Prob > F = 0.1245
------------------------------------------------------------------------------
| Linearized
wccata | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
agec | .166679 .0862877 1.93 0.071 -.0162427 .3496008
agec_sq | -.0173134 .0077224 -2.24 0.039 -.0336841 -.0009427
agec_cb | -.0029188 .0030952 -0.94 0.360 -.0094803 .0036427
_cons | -1.244106 .1796734 -6.92 0.000 -1.624996 -.8632152
------------------------------------------------------------------------------
Plot:
use "C:\Users\Matt\Documents\EPI536\Assignments\NHANES0708_all.dta"
svyset sdmvpsu [pw=wtmec2yr], strata(sdmvstra)
# generate quintiles
xtile age_q5=ridageyr [aweight=wtmec2yr] if subpop2==1,nq(5)
# Calculates weighted mean for creating the centered age variable
svy,subpop(subpop2): mean ridageyr
# Creates weighted age variable
gen agec=ridageyr-10.2
# Creates higher-order age variables
gen agec_sq=agec^2
gen agec_cb=agec^3
# Run Cubic
xi: svy,subpop(if subpop2==1 & male==1 & foodinsec==0): logit wccata agec agec_sq agec_cb
# Run model with age quintiles
xi: svy,subpop(if subpop2==1 & male==1 & foodinsec==0): logit wccata i.age_q5
predict pred_wc2_q5,xb
predict pred_wc2_se_q5,stdp
gen pred_wc2_ub_q5=pred_wc2_q5 + 1.96*pred_wc2_se_q5
gen pred_wc2_lb_q5=pred_wc2_q5 - 1.96*pred_wc2_se_q5
*Run model with linear age
xi: svy,subpop(if subpop2==1 & male==1 & foodinsec==0): logit wccata ridageyr
predict pred_wc2_x,xb
*Run model with quadratic age
xi: svy,subpop(if subpop2==1 & male==1 & foodinsec==0): logit wccata agec agec_sq
predict pred_wc2_sq,xb
twoway scatter pred_wc2_q5 ridageyr if subpop2==1 & male==1 & foodinsec==0, mcolor(blue) ///
|| scatter pred_wc2_lb_q5 ridageyr if subpop2==1 & male==1 & foodinsec==0, msymbol(smx) mcolor(blue) ///
|| scatter pred_wc2_ub_q5 ridageyr if subpop2==1 & male==1 & foodinsec==0, msymbol(smx) mcolor(blue) ///
|| scatter pred_wc2_x ridageyr if subpop2==1 & male==1 & foodinsec==0 ///
|| scatter pred_wc2_sq ridageyr if subpop2==1 & male==1 & foodinsec==0
graph export age_plot.svg, replace
pweight: wtmec2yr
VCE: linearized
Single unit: missing
Strata 1: sdmvstra
SU 1: sdmvpsu
FPC 1: <zero>
Unknown #command
Unknown #command
(running mean on estimation sample)
Survey: Mean estimation
Number of strata = 16 Number of obs = 10,149
Number of PSUs = 32 Population size = 297,136,095
Subpop. no. obs = 1,321
Subpop. size = 23,276,001.8
Design df = 16
--------------------------------------------------------------
| Linearized
| Mean Std. Err. [95% Conf. Interval]
-------------+------------------------------------------------
ridageyr | 10.16534 .2237876 9.690935 10.63975
--------------------------------------------------------------
Unknown #command
Unknown #command
Unknown #command
(running logit on estimation sample)
Survey: Logistic regression
Number of strata = 16 Number of obs = 10,149
Number of PSUs = 32 Population size = 297,136,095
Subpop. no. obs = 439
Subpop. size = 8,131,107.05
Design df = 16
F( 3, 14) = 2.28
Prob > F = 0.1245
------------------------------------------------------------------------------
| Linearized
wccata | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
agec | .166679 .0862877 1.93 0.071 -.0162427 .3496008
agec_sq | -.0173134 .0077224 -2.24 0.039 -.0336841 -.0009427
agec_cb | -.0029188 .0030952 -0.94 0.360 -.0094803 .0036427
_cons | -1.244106 .1796734 -6.92 0.000 -1.624996 -.8632152
------------------------------------------------------------------------------
Unknown #command
i.age_q5 _Iage_q5_1-5 (naturally coded; _Iage_q5_1 omitted)
(running logit on estimation sample)
Survey: Logistic regression
Number of strata = 16 Number of obs = 10,149
Number of PSUs = 32 Population size = 297,136,095
Subpop. no. obs = 439
Subpop. size = 8,131,107.05
Design df = 16
F( 4, 13) = 2.24
Prob > F = 0.1215
------------------------------------------------------------------------------
| Linearized
wccata | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Iage_q5_2 | .1842817 .3964377 0.46 0.648 -.6561288 1.024692
_Iage_q5_3 | -.1072657 .3823348 -0.28 0.783 -.9177793 .703248
_Iage_q5_4 | 1.236589 .467235 2.65 0.018 .2460951 2.227083
_Iage_q5_5 | .2704047 .5807907 0.47 0.648 -.9608166 1.501626
_cons | -1.92872 .3308428 -5.83 0.000 -2.630076 -1.227365
------------------------------------------------------------------------------
(8828 missing values generated)
(8828 missing values generated)
(8,828 missing values generated)
(8,828 missing values generated)
(running logit on estimation sample)
Survey: Logistic regression
Number of strata = 16 Number of obs = 10,149
Number of PSUs = 32 Population size = 297,136,095
Subpop. no. obs = 439
Subpop. size = 8,131,107.05
Design df = 16
F( 1, 16) = 4.81
Prob > F = 0.0434
------------------------------------------------------------------------------
| Linearized
wccata | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ridageyr | .0816693 .0372445 2.19 0.043 .0027145 .1606241
_cons | -2.323536 .3852478 -6.03 0.000 -3.140225 -1.506847
------------------------------------------------------------------------------
(running logit on estimation sample)
Survey: Logistic regression
Number of strata = 16 Number of obs = 10,149
Number of PSUs = 32 Population size = 297,136,095
Subpop. no. obs = 439
Subpop. size = 8,131,107.05
Design df = 16
F( 2, 15) = 3.00
Prob > F = 0.0800
------------------------------------------------------------------------------
| Linearized
wccata | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
agec | .0877135 .0455836 1.92 0.072 -.0089194 .1843464
agec_sq | -.0195448 .0089637 -2.18 0.045 -.0385469 -.0005427
_cons | -1.210634 .1982293 -6.11 0.000 -1.630861 -.7904065
------------------------------------------------------------------------------
(file age_plot.svg written in SVG format)
use "C:\Users\Matt\Documents\EPI536\Assignments\NHANES0708_all.dta"
lowess wccata ridageyr if subpop2==1 & male==1 & foodinsec==0,logit
graph export lowess.svg, replace
(file lowess.svg written in SVG format)
use "NHANES0708_all.dta"
xi: logit wccata ridageyr if subpop2==1 & male==1 & foodinsec==0
nlcheck ridageyr,graph
graph export nlcheck.svg, replace
Iteration 0: log likelihood = -202.29128
Iteration 1: log likelihood = -199.98977
Iteration 2: log likelihood = -199.97022
Iteration 3: log likelihood = -199.97022
Logistic regression Number of obs = 439
LR chi2(1) = 4.64
Prob > chi2 = 0.0312
Log likelihood = -199.97022 Pseudo R2 = 0.0115
------------------------------------------------------------------------------
wccata | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ridageyr | .0709313 .0328997 2.16 0.031 .0064491 .1354135
_cons | -2.237634 .3474705 -6.44 0.000 -2.918663 -1.556604
------------------------------------------------------------------------------
Nonlinearity test:
chi2( 9) = 11.24
Prob > chi2 = 0.2597
(file nlcheck.svg written in SVG format)
*Stata-generated plot
*Create quintile variable xtile prage_q5=dmdhrage [aweight=wtmec2yr] if subpop2==1,nq(5)
Create centered variable svy,subpop(subpop2): mean dmdhrage mean PR age: 39.0 gen pragec=dmdhrage-39 *Create higher order terms gen pragec_sq=pragec^2 gen pragec_cb=pragec^3
Run model with age quintiles xi: svy,subpop(if subpop2==1 & male==1 & foodinsec==0): logit wccata i.prage_q5 predict pred_wc2_q5,xb predict pred_wc2_se_q5,stdp gen pred_wc2_ub_q5=pred_wc2_q5 + 1.96pred_wc2_se_q5 gen pred_wc2_lb_q5=pred_wc2_q5 - 1.96*pred_wc2_se_q5
Run model with linear age xi: svy,subpop(if subpop2==1 & male==1 & foodinsec==0): logit wccata dmdhrage predict pred_wc2_x,xb Run model with quadratic age xi: svy,subpop(if subpop2==1 & male==1 & foodinsec==0): logit wccata pragec pragec_sq predict pred_wc2_sq,xb
twoway scatter pred_wc2_q5 dmdhrage if subpop2==1 & male==1 & foodinsec==0, mcolor(blue) /// || scatter pred_wc2_lb_q5 dmdhrage if subpop2==1 & male==1 & foodinsec==0, msymbol(smx) mcolor(blue) /// || scatter pred_wc2_ub_q5 dmdhrage if subpop2==1 & male==1 & foodinsec==0, msymbol(smx) mcolor(blue) /// || scatter pred_wc2_x dmdhrage if subpop2==1 & male==1 & foodinsec==0 /// || scatter pred_wc2_sq dmdhrage if subpop2==1 & male==1 & foodinsec==0