Chapter 20 - Stratified Sampling and Cluster Sampling
Examples
------------------------------------------------------------------------------------------
name: SN
log: \iiexample20.smcl
log type: smcl
opened on: 12 May 2020, 20:45:32
. **********************************************
. * Solomon Negash - Examples
. * Wooldridge (2010). Econometric Analysis of Cross Section and Panel Data. 2nd ed.
. * STATA Program, version 16.1.
. * Chapter 20 - Stratified Sampling and Cluster Sampling
. ***********************************************
. // Example 20.3 (Cluster Correlation in Teacher Compensation)
. u "Wooldridge_2E\benefits", clear
. eststo POLS: reg lavgsal bs lstaff lenroll lunch
Source | SS df MS Number of obs = 1,848
-------------+---------------------------------- F(4, 1843) = 429.78
Model | 48.3485452 4 12.0871363 Prob > F = 0.0000
Residual | 51.8328336 1,843 .028124164 R-squared = 0.4826
-------------+---------------------------------- Adj R-squared = 0.4815
Total | 100.181379 1,847 .054240054 Root MSE = .1677
------------------------------------------------------------------------------
lavgsal | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
bs | -.1774396 .1219691 -1.45 0.146 -.4166518 .0617725
lstaff | -.6907025 .0184598 -37.42 0.000 -.7269068 -.6544981
lenroll | -.0292406 .0084997 -3.44 0.001 -.0459107 -.0125705
lunch | -.0008471 .0001625 -5.21 0.000 -.0011658 -.0005284
_cons | 13.72361 .1121095 122.41 0.000 13.50374 13.94349
------------------------------------------------------------------------------
. eststo POLSr: reg lavgsal bs lstaff lenroll lunch, cluster(distid)
Linear regression Number of obs = 1,848
F(4, 536) = 134.77
Prob > F = 0.0000
R-squared = 0.4826
Root MSE = .1677
(Std. Err. adjusted for 537 clusters in distid)
------------------------------------------------------------------------------
| Robust
lavgsal | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
bs | -.1774396 .2596214 -0.68 0.495 -.6874398 .3325605
lstaff | -.6907025 .0352962 -19.57 0.000 -.7600383 -.6213666
lenroll | -.0292406 .0257414 -1.14 0.256 -.079807 .0213258
lunch | -.0008471 .0005709 -1.48 0.138 -.0019686 .0002744
_cons | 13.72361 .2562909 53.55 0.000 13.22016 14.22707
------------------------------------------------------------------------------
. eststo RE: xtreg lavgsal bs lstaff lenroll lunch, re
Random-effects GLS regression Number of obs = 1,848
Group variable: distid Number of groups = 537
R-sq: Obs per group:
within = 0.5453 min = 1
between = 0.3852 avg = 3.4
overall = 0.4671 max = 162
Wald chi2(4) = 1890.56
corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000
------------------------------------------------------------------------------
lavgsal | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
bs | -.3812698 .1118678 -3.41 0.001 -.6005267 -.162013
lstaff | -.6174177 .0153587 -40.20 0.000 -.6475202 -.5873151
lenroll | -.0249189 .0075532 -3.30 0.001 -.0397228 -.0101149
lunch | .0002995 .0001794 1.67 0.095 -.0000521 .0006511
_cons | 13.36682 .0975734 136.99 0.000 13.17558 13.55806
-------------+----------------------------------------------------------------
sigma_u | .12627558
sigma_e | .09996638
rho | .61473634 (fraction of variance due to u_i)
------------------------------------------------------------------------------
. eststo REr: xtreg lavgsal bs lstaff lenroll lunch, re cluster(distid)
Random-effects GLS regression Number of obs = 1,848
Group variable: distid Number of groups = 537
R-sq: Obs per group:
within = 0.5453 min = 1
between = 0.3852 avg = 3.4
overall = 0.4671 max = 162
Wald chi2(4) = 316.91
corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000
(Std. Err. adjusted for 537 clusters in distid)
------------------------------------------------------------------------------
| Robust
lavgsal | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
bs | -.3812698 .1504893 -2.53 0.011 -.6762235 -.0863162
lstaff | -.6174177 .0363789 -16.97 0.000 -.688719 -.5461163
lenroll | -.0249189 .0115371 -2.16 0.031 -.0475312 -.0023065
lunch | .0002995 .0001963 1.53 0.127 -.0000852 .0006841
_cons | 13.36682 .1968713 67.90 0.000 12.98096 13.75268
-------------+----------------------------------------------------------------
sigma_u | .12627558
sigma_e | .09996638
rho | .61473634 (fraction of variance due to u_i)
------------------------------------------------------------------------------
. eststo FEr: xtreg lavgsal bs lstaff lenroll lunch, fe cluster(distid)
Fixed-effects (within) regression Number of obs = 1,848
Group variable: distid Number of groups = 537
R-sq: Obs per group:
within = 0.5486 min = 1
between = 0.3544 avg = 3.4
overall = 0.4567 max = 162
F(4,536) = 57.84
corr(u_i, Xb) = 0.1433 Prob > F = 0.0000
(Std. Err. adjusted for 537 clusters in distid)
------------------------------------------------------------------------------
| Robust
lavgsal | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
bs | -.4948449 .1937316 -2.55 0.011 -.8754112 -.1142785
lstaff | -.6218901 .0431812 -14.40 0.000 -.7067152 -.5370649
lenroll | -.0515063 .0130887 -3.94 0.000 -.0772178 -.0257948
lunch | .0005138 .0002127 2.42 0.016 .0000959 .0009317
_cons | 13.61783 .2413169 56.43 0.000 13.14379 14.09187
-------------+----------------------------------------------------------------
sigma_u | .15491886
sigma_e | .09996638
rho | .70602068 (fraction of variance due to u_i)
------------------------------------------------------------------------------
. eststo FE: xtreg lavgsal bs lstaff lenroll lunch, fe
Fixed-effects (within) regression Number of obs = 1,848
Group variable: distid Number of groups = 537
R-sq: Obs per group:
within = 0.5486 min = 1
between = 0.3544 avg = 3.4
overall = 0.4567 max = 162
F(4,1307) = 397.05
corr(u_i, Xb) = 0.1433 Prob > F = 0.0000
------------------------------------------------------------------------------
lavgsal | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
bs | -.4948449 .133039 -3.72 0.000 -.7558382 -.2338515
lstaff | -.6218901 .0167565 -37.11 0.000 -.6547627 -.5890175
lenroll | -.0515063 .0094004 -5.48 0.000 -.0699478 -.0330648
lunch | .0005138 .0002088 2.46 0.014 .0001042 .0009234
_cons | 13.61783 .1133406 120.15 0.000 13.39548 13.84018
-------------+----------------------------------------------------------------
sigma_u | .15491886
sigma_e | .09996638
rho | .70602068 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(536, 1307) = 7.24 Prob > F = 0.0000
. estout POLS POLSr RE REr FEr FE, cells(b(nostar fmt(3)) se(par fmt(3))) /*
*/ ti("Table 20.1 Salary-Benefits Trade-off for Michigan Teachers")
Table 20.1 Salary-Benefits Trade-off for Michigan Teachers
------------------------------------------------------------------------------------------
POLS POLSr RE REr FEr FE
b/se b/se b/se b/se b/se b/se
------------------------------------------------------------------------------------------
bs -0.177 -0.177 -0.381 -0.381 -0.495 -0.495
(0.122) (0.260) (0.112) (0.150) (0.194) (0.133)
lstaff -0.691 -0.691 -0.617 -0.617 -0.622 -0.622
(0.018) (0.035) (0.015) (0.036) (0.043) (0.017)
lenroll -0.029 -0.029 -0.025 -0.025 -0.052 -0.052
(0.008) (0.026) (0.008) (0.012) (0.013) (0.009)
lunch -0.001 -0.001 0.000 0.000 0.001 0.001
(0.000) (0.001) (0.000) (0.000) (0.000) (0.000)
_cons 13.724 13.724 13.367 13.367 13.618 13.618
(0.112) (0.256) (0.098) (0.197) (0.241) (0.113)
------------------------------------------------------------------------------------------
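As a quick check on the random-effects output above, rho can be recomputed by hand as sigma_u^2 / (sigma_u^2 + sigma_e^2). The sketch below is not part of the original log; the scalar names s_u and s_e are illustrative, and the two values are copied from the sigma_u and sigma_e rows of the RE output.

```stata
* Not part of the original log: a hand check of rho for the RE model.
* Values are copied from the sigma_u and sigma_e rows of the RE output.
scalar s_u = .12627558
scalar s_e = .09996638
* Share of the composite error variance attributable to the district effect u_i
display "rho = " scalar(s_u)^2 / (scalar(s_u)^2 + scalar(s_e)^2)
```

The displayed value should agree with the reported rho row up to rounding. The same formula applies to the FE blocks, using their own sigma_u.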
. // Example 20.4 (Effects of Spending on School Performance)
. u "Wooldridge_2E\meap94_98", clear
. eststo FE: xtreg math4 lavgrexp lunch lenrol y95 y96 y97 y98, fe
Fixed-effects (within) regression Number of obs = 7,150
Group variable: schid Number of groups = 1,683
R-sq: Obs per group:
within = 0.3602 min = 3
between = 0.0292 avg = 4.2
overall = 0.1514 max = 5
F(7,5460) = 439.11
corr(u_i, Xb) = 0.0073 Prob > F = 0.0000
------------------------------------------------------------------------------
math4 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lavgrexp | 6.288376 2.098685 3.00 0.003 2.174117 10.40264
lunch | -.0215072 .0312185 -0.69 0.491 -.082708 .0396935
lenrol | -2.038461 1.791604 -1.14 0.255 -5.550718 1.473797
y95 | 11.6192 .5545233 20.95 0.000 10.53212 12.70629
y96 | 13.05561 .6630948 19.69 0.000 11.75568 14.35554
y97 | 10.14771 .7024067 14.45 0.000 8.770713 11.52471
y98 | 23.41404 .7187237 32.58 0.000 22.00506 24.82303
_cons | 11.84422 22.81097 0.52 0.604 -32.87436 56.5628
-------------+----------------------------------------------------------------
sigma_u | 15.84958
sigma_e | 11.325028
rho | .66200804 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(1682, 5460) = 4.82 Prob > F = 0.0000
. eststo FEr_sch: xtreg math4 lavgrexp lunch lenrol y95 y96 y97 y98, fe cluster(schid)
Fixed-effects (within) regression Number of obs = 7,150
Group variable: schid Number of groups = 1,683
R-sq: Obs per group:
within = 0.3602 min = 3
between = 0.0292 avg = 4.2
overall = 0.1514 max = 5
F(7,1682) = 431.08
corr(u_i, Xb) = 0.0073 Prob > F = 0.0000
(Std. Err. adjusted for 1,683 clusters in schid)
------------------------------------------------------------------------------
| Robust
math4 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lavgrexp | 6.288376 2.431317 2.59 0.010 1.519651 11.0571
lunch | -.0215072 .0390732 -0.55 0.582 -.0981445 .05513
lenrol | -2.038461 1.789094 -1.14 0.255 -5.547545 1.470623
y95 | 11.6192 .5358469 21.68 0.000 10.56821 12.6702
y96 | 13.05561 .6910815 18.89 0.000 11.70014 14.41108
y97 | 10.14771 .7326314 13.85 0.000 8.710745 11.58468
y98 | 23.41404 .7669553 30.53 0.000 21.90975 24.91833
_cons | 11.84422 25.16643 0.47 0.638 -37.51659 61.20503
-------------+----------------------------------------------------------------
sigma_u | 15.84958
sigma_e | 11.325028
rho | .66200804 (fraction of variance due to u_i)
------------------------------------------------------------------------------
. eststo FEr_dist: xtreg math4 lavgrexp lunch lenrol y95 y96 y97 y98, fe cluster(distid)
Fixed-effects (within) regression Number of obs = 7,150
Group variable: schid Number of groups = 1,683
R-sq: Obs per group:
within = 0.3602 min = 3
between = 0.0292 avg = 4.2
overall = 0.1514 max = 5
F(7,466) = 259.90
corr(u_i, Xb) = 0.0073 Prob > F = 0.0000
(Std. Err. adjusted for 467 clusters in distid)
------------------------------------------------------------------------------
| Robust
math4 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lavgrexp | 6.288376 3.132334 2.01 0.045 .1331271 12.44363
lunch | -.0215072 .0399206 -0.54 0.590 -.0999539 .0569395
lenrol | -2.038461 2.098607 -0.97 0.332 -6.162365 2.085443
y95 | 11.6192 .7210398 16.11 0.000 10.20231 13.0361
y96 | 13.05561 .9326851 14.00 0.000 11.22282 14.8884
y97 | 10.14771 .9576417 10.60 0.000 8.26588 12.02954
y98 | 23.41404 1.027313 22.79 0.000 21.3953 25.43278
_cons | 11.84422 32.68429 0.36 0.717 -52.38262 76.07107
-------------+----------------------------------------------------------------
sigma_u | 15.84958
sigma_e | 11.325028
rho | .66200804 (fraction of variance due to u_i)
------------------------------------------------------------------------------
. estout FE FEr_sch FEr_dist, cells(b(nostar fmt(2)) se(par fmt(2))) /*
*/ ti("Table 20.2 Fixed Effects Estimation of Spending on Test Pass Rates")
Table 20.2 Fixed Effects Estimation of Spending on Test Pass Rates
---------------------------------------------------
FE FEr_sch FEr_dist
b/se b/se b/se
---------------------------------------------------
lavgrexp 6.29 6.29 6.29
(2.10) (2.43) (3.13)
lunch -0.02 -0.02 -0.02
(0.03) (0.04) (0.04)
lenrol -2.04 -2.04 -2.04
(1.79) (1.79) (2.10)
y95 11.62 11.62 11.62
(0.55) (0.54) (0.72)
y96 13.06 13.06 13.06
(0.66) (0.69) (0.93)
y97 10.15 10.15 10.15
(0.70) (0.73) (0.96)
y98 23.41 23.41 23.41
(0.72) (0.77) (1.03)
_cons 11.84 11.84 11.84
(22.81) (25.17) (32.68)
---------------------------------------------------
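The district-level clustering in the last column relies on every school belonging to a single district, since xtreg, fe expects the panels (schid) to be nested within the clusters (distid). A minimal check, assuming meap94_98 is still in memory with schid and distid as used above (this check is not in the original log):

```stata
* Not part of the original log: confirm that distid is constant within schid.
* Sorting distid inside each school puts its smallest value first and its
* largest last, so equality of the two means a single district per school.
bysort schid (distid): assert distid[1] == distid[_N]
```

If the assertion fails, Stata stops with an error, which would point to schools whose district identifier changes over the sample period.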
. log close
name: SN
log: iiexample20.smcl
log type: smcl
closed on: 12 May 2020, 20:45:33
------------------------------------------------------------------------------------------