Chapter 03 - Multiple Regression Analysis

import stata_setup

stata_setup.config("C:/Program Files/Stata18/", "se", splash=False)

Problem 3.1.

$bwght = \beta_0 + \beta_1cigs + \beta_2faminc + u$

iii. Regress bwght on cigs, first alone and then with family income (faminc)

%%stata
use bwght.dta, clear 
corr(cigs faminc)
reg bwght cigs
display "bwght= " %5.3f _b[_cons] "+ " %5.3f _b[cigs] "cigs; N=" _N ", Rsq=" %5.4f e(r2) 
reg bwght cigs faminc
display "bwght= " %5.3f _b[_cons] "+ " %5.3f _b[cigs] "cigs + " %5.3f _b[faminc] "faminc; N=" _N ", Rsq=" %5.4f e(r2) 

. use bwght.dta, clear 

. corr(cigs faminc)
(obs=1,388)

             |     cigs   faminc
-------------+------------------
        cigs |   1.0000
      faminc |  -0.1730   1.0000


. reg bwght cigs

      Source |       SS           df       MS      Number of obs   =     1,388
-------------+----------------------------------   F(1, 1386)      =     32.24
       Model |  13060.4194         1  13060.4194   Prob > F        =    0.0000
    Residual |    561551.3     1,386  405.159668   R-squared       =    0.0227
-------------+----------------------------------   Adj R-squared   =    0.0220
       Total |   574611.72     1,387  414.283864   Root MSE        =    20.129

------------------------------------------------------------------------------
       bwght | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        cigs |  -.5137721   .0904909    -5.68   0.000    -.6912861   -.3362581
       _cons |   119.7719   .5723407   209.27   0.000     118.6492    120.8946
------------------------------------------------------------------------------

. display "bwght= " %5.3f _b[_cons] "+ " %5.3f _b[cigs] "cigs; N=" _N ", Rsq=" 
> %5.4f e(r2) 
bwght= 119.772+ -0.514cigs; N=1388, Rsq=0.0227

. reg bwght cigs faminc

      Source |       SS           df       MS      Number of obs   =     1,388
-------------+----------------------------------   F(2, 1385)      =     21.27
       Model |  17126.2088         2  8563.10442   Prob > F        =    0.0000
    Residual |  557485.511     1,385  402.516614   R-squared       =    0.0298
-------------+----------------------------------   Adj R-squared   =    0.0284
       Total |   574611.72     1,387  414.283864   Root MSE        =    20.063

------------------------------------------------------------------------------
       bwght | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        cigs |  -.4634075   .0915768    -5.06   0.000    -.6430518   -.2837633
      faminc |   .0927647   .0291879     3.18   0.002     .0355075    .1500219
       _cons |   116.9741   1.048984   111.51   0.000     114.9164    119.0319
------------------------------------------------------------------------------

. display "bwght= " %5.3f _b[_cons] "+ " %5.3f _b[cigs] "cigs + " %5.3f _b[fami
> nc] "faminc; N=" _N ", Rsq=" %5.4f e(r2) 
bwght= 116.974+ -0.463cigs + 0.093faminc; N=1388, Rsq=0.0298

. 
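Both R-squared values above are small. As a rough check of how Stata arrives at them, the sketch below recomputes R-squared and adjusted R-squared for the second regression from the sums of squares in its ANOVA table. The numbers are copied by hand from the output; the variable names are only illustrative.

```python
# Values copied from the ANOVA table of `reg bwght cigs faminc` above.
ssr = 557485.511   # residual sum of squares
sst = 574611.72    # total sum of squares
n, k = 1388, 2     # observations and number of slope coefficients

# R-squared: share of the total variation not left in the residuals.
r2 = 1 - ssr / sst

# Adjusted R-squared: same idea, with degrees-of-freedom corrections.
adj_r2 = 1 - (ssr / (n - k - 1)) / (sst / (n - 1))

print(f"R-squared     = {r2:.4f}")
print(f"Adj R-squared = {adj_r2:.4f}")
```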

Problem 3.2. House price:

$price = f(sqrft, bdrms)$

i. Regress house price on area (sqrft) and number of bedrooms (bdrms). Report the result in equation form

%%stata
u hprice1.dta, clear
reg price sqrft bdrms
display "price= " %5.3f _b[_cons] " + " %5.3f _b[sqrft] "sqrft + " %5.3f _b[bdrms] "bdrms; N=" _N ", Rsq=" %5.4f e(r2) 

. u hprice1.dta, clear

. reg price sqrft bdrms

      Source |       SS           df       MS      Number of obs   =        88
-------------+----------------------------------   F(2, 85)        =     72.96
       Model |  580009.152         2  290004.576   Prob > F        =    0.0000
    Residual |  337845.354        85  3974.65122   R-squared       =    0.6319
-------------+----------------------------------   Adj R-squared   =    0.6233
       Total |  917854.506        87  10550.0518   Root MSE        =    63.045

------------------------------------------------------------------------------
       price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       sqrft |   .1284362   .0138245     9.29   0.000     .1009495    .1559229
       bdrms |   15.19819   9.483517     1.60   0.113    -3.657582    34.05396
       _cons |    -19.315   31.04662    -0.62   0.536    -81.04399      42.414
------------------------------------------------------------------------------

. display "price= " %5.3f _b[_cons] " + " %5.3f _b[sqrft] "sqrft + " %5.3f _b[b
> drms] "bdrms; N=" _N ", Rsq=" %5.4f e(r2) 
price= -19.315 + 0.128sqrft + 15.198bdrms; N=88, Rsq=0.6319

. 

ii. The increase in price from one more bedroom, holding size constant, is

%%stata
display  _b[bdrms] " (thousand dollars)"

15.198191 (thousand dollars)

iii. The increase in price from an additional bedroom that adds 140 sqrft of area is

%%stata
display  _b[bdrms] + _b[sqrft]*140 " (thousand dollars)"

33.17926 (thousand dollars)

iv. The share of the variation in price explained by sqrft and bdrms is about

%%stata
display %5.4f e(r2)*100 "%"

63.1918%

v. The predicted selling price for a house with sqrft = 2438 and bdrms = 4 is

%%stata
display "price= " _b[_cons] + _b[sqrft]*2438 + _b[bdrms]*4 
margins, at(bdrms=4 sqrft=2438)

. display "price= " _b[_cons] + _b[sqrft]*2438 + _b[bdrms]*4 
price= 354.60525

. margins, at(bdrms=4 sqrft=2438)

Adjusted predictions                                        Number of obs = 88
Model VCE: OLS

Expression: Linear prediction, predict()
At: sqrft = 2438
    bdrms =    4

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |   354.6052    8.41493    42.14   0.000     337.8741    371.3364
------------------------------------------------------------------------------
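The answers to parts ii, iii and v are all linear combinations of the same three coefficients. A minimal sketch, assuming the coefficients are copied by hand from the regression table above (price is in thousand dollars; `predict_price` is a hypothetical helper, not part of Stata or pystata):

```python
# Coefficients copied from `reg price sqrft bdrms` above (price in $1000s).
b_cons, b_sqrft, b_bdrms = -19.315, 0.1284362, 15.19819

def predict_price(sqrft, bdrms):
    """Fitted price from the estimated equation."""
    return b_cons + b_sqrft * sqrft + b_bdrms * bdrms

# Part ii: one more bedroom, size held fixed.
extra_bedroom = b_bdrms

# Part iii: one more bedroom that also adds 140 square feet.
extra_bedroom_140 = b_bdrms + b_sqrft * 140

# Part v: fitted price at sqrft = 2438, bdrms = 4.
price_hat = predict_price(2438, 4)

print(f"ii : {extra_bedroom:.3f}")
print(f"iii: {extra_bedroom_140:.3f}")
print(f"v  : {price_hat:.3f}")
```

Because the table rounds the coefficients, the results agree with the Stata output only up to about three decimals.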

vi. Find the residual (uhat) for a house with price = 300 (thousand dollars)

%%stata
reg price sqrft bdrms
predict pricehat
predict uh, residual 
list price pricehat uh if price==300

. reg price sqrft bdrms

      Source |       SS           df       MS      Number of obs   =        88
-------------+----------------------------------   F(2, 85)        =     72.96
       Model |  580009.152         2  290004.576   Prob > F        =    0.0000
    Residual |  337845.354        85  3974.65122   R-squared       =    0.6319
-------------+----------------------------------   Adj R-squared   =    0.6233
       Total |  917854.506        87  10550.0518   Root MSE        =    63.045

------------------------------------------------------------------------------
       price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       sqrft |   .1284362   .0138245     9.29   0.000     .1009495    .1559229
       bdrms |   15.19819   9.483517     1.60   0.113    -3.657582    34.05396
       _cons |    -19.315   31.04662    -0.62   0.536    -81.04399      42.414
------------------------------------------------------------------------------

. predict pricehat
(option xb assumed; fitted values)

. predict uh, residual 

. list price pricehat uh if price==300

     +------------------------------+
     | price   pricehat          uh |
     |------------------------------|
  1. |   300   354.6053   -54.60525 |
 12. |   300   394.9769   -94.97694 |
     +------------------------------+

. 

Problem 3.3.

i. Regress salary on sales and market value in constant-elasticity (log-log) form

%%stata
use ceosal2.dta, clear
*g lsales = ln(sales)
*g lsalary = ln(salary)
*g lmktval = ln(mktval) 
reg lsalary lsales lmktval

. use ceosal2.dta, clear

. *g lsales = ln(sales)
. *g lsalary = ln(salary)
. *g lmktval = ln(mktval) 
. reg lsalary lsales lmktval

      Source |       SS           df       MS      Number of obs   =       177
-------------+----------------------------------   F(2, 174)       =     37.13
       Model |  19.3365617         2  9.66828083   Prob > F        =    0.0000
    Residual |  45.3096514       174  .260400295   R-squared       =    0.2991
-------------+----------------------------------   Adj R-squared   =    0.2911
       Total |  64.6462131       176  .367308029   Root MSE        =    .51029

------------------------------------------------------------------------------
     lsalary | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      lsales |   .1621283   .0396703     4.09   0.000     .0838315    .2404252
     lmktval |    .106708    .050124     2.13   0.035     .0077787    .2056372
       _cons |   4.620917   .2544083    18.16   0.000     4.118794    5.123041
------------------------------------------------------------------------------

ii. Add profits to the model

%%stata
reg lsalary lsales lmktval profits

      Source |       SS           df       MS      Number of obs   =       177
-------------+----------------------------------   F(3, 173)       =     24.64
       Model |  19.3509799         3  6.45032663   Prob > F        =    0.0000
    Residual |  45.2952332       173  .261822157   R-squared       =    0.2993
-------------+----------------------------------   Adj R-squared   =    0.2872
       Total |  64.6462131       176  .367308029   Root MSE        =    .51169

------------------------------------------------------------------------------
     lsalary | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      lsales |   .1613683   .0399101     4.04   0.000     .0825949    .2401416
     lmktval |   .0975286   .0636886     1.53   0.128    -.0281782    .2232354
     profits |   .0000357    .000152     0.23   0.815    -.0002643    .0003356
       _cons |   4.686924   .3797294    12.34   0.000     3.937425    5.436423
------------------------------------------------------------------------------

iii. Add CEO tenure (ceoten) to the model

%%stata
reg lsalary lsales lmktval profits ceoten

      Source |       SS           df       MS      Number of obs   =       177
-------------+----------------------------------   F(4, 172)       =     20.08
       Model |  20.5768102         4  5.14420254   Prob > F        =    0.0000
    Residual |  44.0694029       172  .256217459   R-squared       =    0.3183
-------------+----------------------------------   Adj R-squared   =    0.3024
       Total |  64.6462131       176  .367308029   Root MSE        =    .50618

------------------------------------------------------------------------------
     lsalary | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      lsales |   .1622339   .0394826     4.11   0.000     .0843012    .2401667
     lmktval |   .1017598    .063033     1.61   0.108     -.022658    .2261775
     profits |   .0000291   .0001504     0.19   0.847    -.0002677    .0003258
      ceoten |   .0116847    .005342     2.19   0.030     .0011403     .022229
       _cons |    4.55778   .3802548    11.99   0.000     3.807213    5.308347
------------------------------------------------------------------------------

iv. Correlation between lmktval and profits

%%stata
corr lmktval profits

(obs=177)

             |  lmktval  profits
-------------+------------------
     lmktval |   1.0000
     profits |   0.7769   1.0000

Problem 3.4

i. Find min, max & mean for atndrte, priGPA & ACT

%%stata
use attend.dta, clear
sum atndrte priGPA ACT

. use attend.dta, clear

. sum atndrte priGPA ACT

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     atndrte |        680    81.70956    17.04699       6.25        100
      priGPA |        680    2.586775    .5447141       .857       3.93
         ACT |        680    22.51029    3.490768         13         32

. 

ii. Estimate the model atndrte = f(priGPA, ACT)

%%stata
reg atndrte priGPA ACT
display "atndrte = " %5.2f _b[_cons] " + " %5.2f _b[priGPA] "priGPA + " %5.2f _b[ACT] "ACT; N=" _N ", Rsq=" %5.4f e(r2) 

. reg atndrte priGPA ACT

      Source |       SS           df       MS      Number of obs   =       680
-------------+----------------------------------   F(2, 677)       =    138.65
       Model |  57336.7612         2  28668.3806   Prob > F        =    0.0000
    Residual |  139980.564       677  206.765974   R-squared       =    0.2906
-------------+----------------------------------   Adj R-squared   =    0.2885
       Total |  197317.325       679   290.59989   Root MSE        =    14.379

------------------------------------------------------------------------------
     atndrte | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      priGPA |   17.26059   1.083103    15.94   0.000     15.13395    19.38724
         ACT |  -1.716553    .169012   -10.16   0.000    -2.048404   -1.384702
       _cons |    75.7004   3.884108    19.49   0.000     68.07406    83.32675
------------------------------------------------------------------------------

. display "atndrte = " %5.2f _b[_cons] " + " %5.2f _b[priGPA] "priGPA + " %5.2f
>  _b[ACT] "ACT; N=" _N ", Rsq=" %5.4f e(r2) 
atndrte = 75.70 + 17.26priGPA + -1.72ACT; N=680, Rsq=0.2906

. 

iii. Discuss the estimated slope coefficients.

iv. Predict atndrte at priGPA=3.65 & ACT=20. What do you make of the result?

%%stata
margins, at(priGPA=3.65 ACT=20) 
sum atndrte priGPA ACT if atndrte>= 104  

. margins, at(priGPA=3.65 ACT=20)

Adjusted predictions                                       Number of obs = 680
Model VCE: OLS

Expression: Linear prediction, predict()
At: priGPA = 3.65
    ACT    =   20

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |   104.3705     1.4683    71.08   0.000     101.4875    107.2535
------------------------------------------------------------------------------

. sum atndrte priGPA ACT if atndrte>= 104  

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     atndrte |          0
      priGPA |          0
         ACT |          0

. 
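The `margins` result can be reproduced by plugging the values into the estimated equation. The sketch below does that with coefficients copied by hand from the regression table, and flags that the prediction lies above 100, the sample maximum of atndrte reported in part i. That is why the `sum ... if atndrte>= 104` call finds no observations.

```python
# Coefficients copied from `reg atndrte priGPA ACT` above.
b_cons, b_priGPA, b_ACT = 75.7004, 17.26059, -1.716553

# Fitted attendance rate at priGPA = 3.65, ACT = 20.
atndrte_hat = b_cons + b_priGPA * 3.65 + b_ACT * 20

# The sample maximum of atndrte from part i is 100 (percent).
exceeds_max = atndrte_hat > 100

print(f"predicted atndrte = {atndrte_hat:.4f}")
print(f"above the sample maximum of 100: {exceeds_max}")
```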

v. Predict the difference in atndrte between student A (priGPA=3.1, ACT=21) and student B (priGPA=2.1, ACT=26)

%%stata
margins, at(priGPA=3.1  ACT=21) 
margins, at( priGPA=2.1  ACT=26) 
display "Predicted difference in atndrte is " 93.16 - 67.32

. margins, at(priGPA=3.1  ACT=21)

Adjusted predictions                                       Number of obs = 680
Model VCE: OLS

Expression: Linear prediction, predict()
At: priGPA = 3.1
    ACT    =  21

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |   93.16062   .8823921   105.58   0.000     91.42807    94.89318
------------------------------------------------------------------------------

. margins, at( priGPA=2.1  ACT=26) 

Adjusted predictions                                       Number of obs = 680
Model VCE: OLS

Expression: Linear prediction, predict()
At: priGPA = 2.1
    ACT    =  26

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |   67.31727   1.072343    62.78   0.000     65.21175    69.42279
------------------------------------------------------------------------------

. display "Predicted difference in atndrte is " 93.16 - 67.32
Predicted difference in atndrte is 25.84

. 
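The difference between two predictions does not depend on the intercept, so it can also be computed from the slopes alone, without typing in the rounded `margins` values. A minimal sketch with the slope estimates copied by hand from the regression table:

```python
# Slope estimates copied from `reg atndrte priGPA ACT` above.
b_priGPA, b_ACT = 17.26059, -1.716553

# Student A minus student B: the intercept cancels in the difference.
d_priGPA = 3.1 - 2.1   # A has a priGPA one point higher
d_ACT = 21 - 26        # A has an ACT score five points lower

diff = b_priGPA * d_priGPA + b_ACT * d_ACT

print(f"predicted difference in atndrte = {diff:.2f}")
```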

Problem 3.5: Example 3.2 Wage equation

%%stata
u wage1.dta, clear
reg educ exper tenure
predict r1, r
reg lwage r1
reg lwage educ exper tenure

. u wage1.dta, clear

. reg educ exper tenure

      Source |       SS           df       MS      Number of obs   =       526
-------------+----------------------------------   F(2, 523)       =     29.49
       Model |  407.946311         2  203.973156   Prob > F        =    0.0000
    Residual |  3617.48335       523  6.91679416   R-squared       =    0.1013
-------------+----------------------------------   Adj R-squared   =    0.0979
       Total |  4025.42966       525  7.66748506   Root MSE        =      2.63

------------------------------------------------------------------------------
        educ | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       exper |  -.0737851   .0097609    -7.56   0.000    -.0929604   -.0546098
      tenure |   .0476795   .0183371     2.60   0.010      .011656    .0837031
       _cons |   13.57496   .1843245    73.65   0.000     13.21286    13.93707
------------------------------------------------------------------------------

. predict r1, r

. reg lwage r1

      Source |       SS           df       MS      Number of obs   =       526
-------------+----------------------------------   F(1, 524)       =    136.41
       Model |  30.6376773         1  30.6376773   Prob > F        =    0.0000
    Residual |  117.692074       524  .224603195   R-squared       =    0.2066
-------------+----------------------------------   Adj R-squared   =    0.2050
       Total |  148.329751       525   .28253286   Root MSE        =    .47392

------------------------------------------------------------------------------
       lwage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
          r1 |    .092029   .0078796    11.68   0.000     .0765495    .1075085
       _cons |   1.623268    .020664    78.56   0.000     1.582674    1.663863
------------------------------------------------------------------------------

. reg lwage educ exper tenure

      Source |       SS           df       MS      Number of obs   =       526
-------------+----------------------------------   F(3, 522)       =     80.39
       Model |  46.8741776         3  15.6247259   Prob > F        =    0.0000
    Residual |  101.455574       522  .194359337   R-squared       =    0.3160
-------------+----------------------------------   Adj R-squared   =    0.3121
       Total |  148.329751       525   .28253286   Root MSE        =    .44086

------------------------------------------------------------------------------
       lwage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        educ |    .092029   .0073299    12.56   0.000     .0776292    .1064288
       exper |   .0041211   .0017233     2.39   0.017     .0007357    .0075065
      tenure |   .0220672   .0030936     7.13   0.000     .0159897    .0281448
       _cons |   .2843595   .1041904     2.73   0.007     .0796756    .4890435
------------------------------------------------------------------------------

. 
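The coefficient on r1 (.092029) matches the coefficient on educ in the full regression. This is the partialling-out result. The sketch below illustrates the same equality on simulated data, using one control instead of two to keep the algebra short; the data, seed, and helper names are all made up for illustration and have nothing to do with wage1.dta.

```python
import random

def mean(v):
    return sum(v) / len(v)

def simple_ols(y, x):
    """Return (intercept, slope) from regressing y on a single regressor x."""
    xb, yb = mean(x), mean(y)
    sxx = sum((xi - xb) ** 2 for xi in x)
    sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return yb - slope * xb, slope

def slope_on_x1(y, x1, x2):
    """Slope on x1 from OLS of y on x1 and x2 (closed form of the normal equations)."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    return (s22 * s1y - s12 * s2y) / (s11 * s22 - s12 ** 2)

# Simulated data (assumption: any correlated x1, x2 would do).
random.seed(0)
n = 500
x2 = [random.gauss(0, 1) for _ in range(n)]
x1 = [0.5 * b + random.gauss(0, 1) for b in x2]
y = [1 + 0.1 * a - 0.3 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

# Step 1: regress x1 on x2 and keep the residuals (the analogue of r1).
a0, a1 = simple_ols(x1, x2)
r1 = [a - (a0 + a1 * b) for a, b in zip(x1, x2)]

# Step 2: regress y on those residuals only.
_, slope_partial = simple_ols(y, r1)

# Compare with the slope on x1 from the regression on both variables.
slope_multiple = slope_on_x1(y, x1, x2)

print(f"partialled-out slope: {slope_partial:.6f}")
print(f"multiple-reg slope  : {slope_multiple:.6f}")
```

Only the slope carries over. As in the Stata output above, the intercept, standard error and R-squared of the regression on the residuals differ from those of the full regression.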

Problem 3.6

i. IQ on educ

%%stata
use wage2.dta, clear
reg IQ educ

. use wage2.dta, clear

. reg IQ educ

      Source |       SS           df       MS      Number of obs   =       935
-------------+----------------------------------   F(1, 933)       =    338.02
       Model |  56280.9277         1  56280.9277   Prob > F        =    0.0000
    Residual |  155346.531       933  166.502177   R-squared       =    0.2659
-------------+----------------------------------   Adj R-squared   =    0.2652
       Total |  211627.459       934  226.581862   Root MSE        =    12.904

------------------------------------------------------------------------------
          IQ | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        educ |   3.533829   .1922095    18.39   0.000     3.156616    3.911042
       _cons |   53.68715   2.622933    20.47   0.000     48.53962    58.83469
------------------------------------------------------------------------------

ii. lwage on educ

%%stata
reg lwage educ

      Source |       SS           df       MS      Number of obs   =       935
-------------+----------------------------------   F(1, 933)       =    100.70
       Model |  16.1377042         1  16.1377042   Prob > F        =    0.0000
    Residual |  149.518579       933  .160255712   R-squared       =    0.0974
-------------+----------------------------------   Adj R-squared   =    0.0964
       Total |  165.656283       934  .177362188   Root MSE        =    .40032

------------------------------------------------------------------------------
       lwage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        educ |   .0598392   .0059631    10.03   0.000     .0481366    .0715418
       _cons |   5.973063   .0813737    73.40   0.000     5.813366    6.132759
------------------------------------------------------------------------------

iii. lwage on educ and IQ

%%stata
reg lwage educ IQ

      Source |       SS           df       MS      Number of obs   =       935
-------------+----------------------------------   F(2, 932)       =     69.42
       Model |  21.4779447         2  10.7389723   Prob > F        =    0.0000
    Residual |  144.178339       932  .154697788   R-squared       =    0.1297
-------------+----------------------------------   Adj R-squared   =    0.1278
       Total |  165.656283       934  .177362188   Root MSE        =    .39332

------------------------------------------------------------------------------
       lwage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        educ |   .0391199   .0068382     5.72   0.000     .0256998      .05254
          IQ |   .0058631   .0009979     5.88   0.000     .0039047    .0078215
       _cons |   5.658288   .0962408    58.79   0.000     5.469414    5.847162
------------------------------------------------------------------------------

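iv. Verify that the simple-regression slope on educ (part ii) equals the multiple-regression slope on educ plus the slope on IQ times the slope from part i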

%%stata
di  _b[educ]+_b[IQ]*3.534

.05984021
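The same check can be written out with the unrounded slope from part i. A minimal sketch, with all four estimates copied by hand from the three regression tables above (the variable names are illustrative):

```python
# Estimates copied from the regression tables in parts i to iii.
delta_educ = 3.533829     # slope from IQ on educ (part i)
simple_educ = 0.0598392   # slope from lwage on educ (part ii)
beta_educ = 0.0391199     # slope on educ from lwage on educ and IQ (part iii)
beta_IQ = 0.0058631       # slope on IQ from lwage on educ and IQ (part iii)

# Simple slope = multiple slope on educ + (slope on IQ) * (slope of IQ on educ).
implied = beta_educ + beta_IQ * delta_educ
gap = abs(implied - simple_educ)

print(f"implied simple slope  = {implied:.7f}")
print(f"reported simple slope = {simple_educ:.7f}")
```

The small remaining gap comes from the rounding of the printed coefficients.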

Problem 3.7

i. Estimate the model math10 = f(lexpend, lnchprg)

%%stata
use meap93.dta, clear
reg math10 lexpend lnchprg 
display "math10 = " %5.2f _b[_cons] " + " %5.2f _b[lexpend] "lexpend + " %5.2f _b[lnchprg] "lnchprg; N=" _N ", Rsq=" %5.4f e(r2) 

. use meap93.dta, clear

. reg math10 lexpend lnchprg 

      Source |       SS           df       MS      Number of obs   =       408
-------------+----------------------------------   F(2, 405)       =     44.43
       Model |  8063.82429         2  4031.91215   Prob > F        =    0.0000
    Residual |  36753.3562       405  90.7490276   R-squared       =    0.1799
-------------+----------------------------------   Adj R-squared   =    0.1759
       Total |  44817.1805       407  110.115923   Root MSE        =    9.5262

------------------------------------------------------------------------------
      math10 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     lexpend |    6.22969   2.972634     2.10   0.037     .3859705    12.07341
     lnchprg |  -.3045853   .0353574    -8.61   0.000    -.3740923   -.2350783
       _cons |  -20.36075   25.07288    -0.81   0.417    -69.64998    28.92848
------------------------------------------------------------------------------

. display "math10 = " %5.2f _b[_cons] " + " %5.2f _b[lexpend] "lexpend + " %5.2
> f _b[lnchprg] "lnchprg; N=" _N ", Rsq=" %5.4f e(r2) 
math10 = -20.36 +  6.23lexpend + -0.30lnchprg; N=408, Rsq=0.1799

. 

ii. Discuss the intercept.

iii. Run the simple OLS regression of math10 on lexpend and compare the results

%%stata
reg math10 lexpend 
display "math10 = " %5.2f _b[_cons] " + " %5.2f _b[lexpend] "lexpend; N=" _N ", Rsq=" %5.4f e(r2) 

. reg math10 lexpend 

      Source |       SS           df       MS      Number of obs   =       408
-------------+----------------------------------   F(1, 406)       =     12.41
       Model |  1329.42517         1  1329.42517   Prob > F        =    0.0005
    Residual |  43487.7553       406  107.112698   R-squared       =    0.0297
-------------+----------------------------------   Adj R-squared   =    0.0273
       Total |  44817.1805       407  110.115923   Root MSE        =     10.35

------------------------------------------------------------------------------
      math10 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     lexpend |   11.16439   3.169011     3.52   0.000     4.934677    17.39411
       _cons |   -69.3411   26.53013    -2.61   0.009    -121.4947   -17.18753
------------------------------------------------------------------------------

. display "math10 = " %5.2f _b[_cons] " + " %5.2f _b[lexpend] "lexpend; N=" _N 
> ", Rsq=" %5.4f e(r2) 
math10 = -69.34 + 11.16lexpend; N=408, Rsq=0.0297

. 

iv. Correlation between lexpend and lnchprg

%%stata
corr lexpend lnchprg 

(obs=408)

             |  lexpend  lnchprg
-------------+------------------
     lexpend |   1.0000
     lnchprg |  -0.1927   1.0000

Problem 3.8

i. Means and variable descriptions of prpblck and income

%%stata
use discrim.dta, clear
mean prpblck income 
d prpblck income 

. use discrim.dta, clear

. mean prpblck income 

Mean estimation                            Number of obs = 409

--------------------------------------------------------------
             |       Mean   Std. err.     [95% conf. interval]
-------------+------------------------------------------------
     prpblck |   .1134864   .0090199      .0957551    .1312177
      income |   47053.78   651.6738      45772.73    48334.84
--------------------------------------------------------------

. d prpblck income 

Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
prpblck         float   %9.0g                 proportion black, zipcode
income          float   %9.0g                 median family income, zipcode

. 

ii. Estimate the effect of prpblck and income on the price of soda (psoda)

%%stata
reg psoda prpblck income
di "psoda = " %5.2f _b[_cons] " + " %5.2f _b[prpblck] "prpblck + " %9.2e _b[income] "income; N=" e(N) " Rsq=" %5.4f e(r2)

. reg psoda prpblck income

      Source |       SS           df       MS      Number of obs   =       401
-------------+----------------------------------   F(2, 398)       =     13.66
       Model |  .202552215         2  .101276107   Prob > F        =    0.0000
    Residual |  2.95146493       398  .007415741   R-squared       =    0.0642
-------------+----------------------------------   Adj R-squared   =    0.0595
       Total |  3.15401715       400  .007885043   Root MSE        =    .08611

------------------------------------------------------------------------------
       psoda | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     prpblck |   .1149882   .0260006     4.42   0.000     .0638724    .1661039
      income |   1.60e-06   3.62e-07     4.43   0.000     8.91e-07    2.31e-06
       _cons |   .9563196    .018992    50.35   0.000     .9189824    .9936568
------------------------------------------------------------------------------

. di "psoda = " %5.2f _b[_cons] " + " %5.2f _b[prpblck] "prpblck + " %9.2e _b[i
> ncome] "income; N=" e(N) " Rsq=" %5.4f e(r2)
psoda =  0.96 +  0.11prpblck +  1.60e-06income; N=401 Rsq=0.0642

. 

iii. Compare with the simple regression of psoda on prpblck.

%%stata
reg psoda prpblck
di "psoda = " %5.2f _b[_cons] " + " %5.2f _b[prpblck] "prpblck; N=" e(N) " Rsq=" %5.4f e(r2)
test prpblck = 0.115

. reg psoda prpblck

      Source |       SS           df       MS      Number of obs   =       401
-------------+----------------------------------   F(1, 399)       =      7.34
       Model |  .057010466         1  .057010466   Prob > F        =    0.0070
    Residual |  3.09700668       399  .007761922   R-squared       =    0.0181
-------------+----------------------------------   Adj R-squared   =    0.0156
       Total |  3.15401715       400  .007885043   Root MSE        =     .0881

------------------------------------------------------------------------------
       psoda | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     prpblck |   .0649269    .023957     2.71   0.007     .0178292    .1120245
       _cons |   1.037399   .0051905   199.87   0.000     1.027195    1.047603
------------------------------------------------------------------------------

. di "psoda = " %5.2f _b[_cons] " + " %5.2f _b[prpblck] "prpblck; N=" e(N) " Rs
> q=" %5.4f e(r2)
psoda =  1.04 +  0.06prpblck; N=401 Rsq=0.0181

. test prpblck = 0.115

 ( 1)  prpblck = .115

       F(  1,   399) =    4.37
            Prob > F =    0.0372

. 

iv. Model with a constant price elasticity with respect to income. Estimate the percentage change in psoda if prpblck increases by 0.2

%%stata
reg lpsoda prpblck lincome
di "lpsoda = " %5.2f _b[_cons] " + " %5.2f _b[prpblck] "prpblck + " %5.4f _b[lincome] "lincome; N=" e(N) " Rsq=" %5.4f e(r2)
di _b[prpblck] * .2 *100

. reg lpsoda prpblck lincome

      Source |       SS           df       MS      Number of obs   =       401
-------------+----------------------------------   F(2, 398)       =     14.54
       Model |  .196020672         2  .098010336   Prob > F        =    0.0000
    Residual |  2.68272938       398  .006740526   R-squared       =    0.0681
-------------+----------------------------------   Adj R-squared   =    0.0634
       Total |  2.87875005       400  .007196875   Root MSE        =     .0821

------------------------------------------------------------------------------
      lpsoda | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     prpblck |   .1215803   .0257457     4.72   0.000     .0709657    .1721948
     lincome |   .0765114   .0165969     4.61   0.000     .0438829    .1091399
       _cons |   -.793768   .1794337    -4.42   0.000    -1.146524   -.4410117
------------------------------------------------------------------------------

. di "lpsoda = " %5.2f _b[_cons] " + " %5.2f _b[prpblck] "prpblck + " %5.4f _b[
> lincome] "lincome; N=" e(N) " Rsq=" %5.4f e(r2)
lpsoda = -0.79 +  0.12prpblck + 0.0765lincome; N=401 Rsq=0.0681

. di _b[prpblck] * .2 *100
2.4316051

. 
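The 2.43 figure uses the usual log approximation, 100 times the coefficient times the change in prpblck. As a rough sketch, the exact percentage change implied by the log model can be computed alongside it; the coefficient is copied by hand from the regression table above.

```python
import math

# Coefficient on prpblck copied from `reg lpsoda prpblck lincome` above.
b_prpblck = 0.1215803
delta = 0.2  # increase in prpblck (20 percentage points)

# Approximation used in the Stata cell: 100 * beta * delta.
approx_pct = 100 * b_prpblck * delta

# Exact change implied by a log dependent variable: 100 * (exp(beta * delta) - 1).
exact_pct = 100 * (math.exp(b_prpblck * delta) - 1)

print(f"approximate change: {approx_pct:.2f}%")
print(f"exact change      : {exact_pct:.2f}%")
```

For a change this small the two figures are close, with the exact one slightly larger.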

v. Add the variable prppov. What happens to the coefficient on prpblck?

%%stata
reg lpsoda prpblck lincome prppov
di "lpsoda = " %5.2f _b[_cons] " + " %5.2f _b[prpblck] "prpblck + " %5.3f _b[lincome] "lincome + " %5.3f _b[prppov] "prppov; N=" e(N) " Rsq=" %5.4f e(r2)

. reg lpsoda prpblck lincome prppov

      Source |       SS           df       MS      Number of obs   =       401
-------------+----------------------------------   F(3, 397)       =     12.60
       Model |  .250340622         3  .083446874   Prob > F        =    0.0000
    Residual |  2.62840943       397  .006620679   R-squared       =    0.0870
-------------+----------------------------------   Adj R-squared   =    0.0801
       Total |  2.87875005       400  .007196875   Root MSE        =    .08137

------------------------------------------------------------------------------
      lpsoda | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     prpblck |   .0728072   .0306756     2.37   0.018     .0125003    .1331141
     lincome |   .1369553   .0267554     5.12   0.000     .0843552    .1895553
      prppov |     .38036   .1327903     2.86   0.004     .1192999    .6414201
       _cons |  -1.463333   .2937111    -4.98   0.000    -2.040756   -.8859092
------------------------------------------------------------------------------

. di "lpsoda = " %5.2f _b[_cons] " + " %5.2f _b[prpblck] "prpblck + " %5.3f _b[
> lincome] "lincome + " %5.3f _b[prppov] "prppov; N=" e(N) " Rsq=" %5.4f e(r2)
lpsoda = -1.46 +  0.07prpblck + 0.137lincome + 0.380prppov; N=401 Rsq=0.0870

. 
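The drop in the prpblck coefficient once prppov enters can be tied to the usual omitted-variable algebra. The sketch below is only an illustration: it backs out the slope that the auxiliary regression of prppov on prpblck and lincome would need to have, using the coefficients reported above. The name `delta_implied` is hypothetical, and the value is implied by the identity rather than estimated from the data.

```python
# Coefficients copied from the two regression tables above.
b_short = 0.1215803   # prpblck, model without prppov
b_long = 0.0728072    # prpblck, model with prppov
b_prppov = 0.38036    # prppov, model with prppov

# In-sample identity (both regressions use the same 401 observations):
#   b_short = b_long + b_prppov * delta
# where delta is the prpblck slope from regressing prppov on prpblck and lincome.
# delta is backed out here, not estimated.
delta_implied = (b_short - b_long) / b_prppov

print(f"implied delta: {delta_implied:.4f}")
```

A positive implied slope is consistent with prpblck and prppov moving together, so leaving prppov out pushes the prpblck coefficient upward.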

vi. Correlation between lincome and prppov

%%stata
corr lincome prppov

(obs=409)

             |  lincome   prppov
-------------+------------------
     lincome |   1.0000
      prppov |  -0.8385   1.0000

Problem 3.9#

i. Estimate the model by OLS, with and without giftlast and propresp.

%%stata
bcuse charity.dta, clear nodesc
reg gift mailsyear giftlast propresp
di "gift = " %5.2f _b[_cons] " + " %5.2f _b[mailsyear] "mailsyear + " %5.3f _b[giftlast] "giftlast + " %5.3f _b[propresp] "propresp ; N=" _N " Rsq=" %5.4f e(r2) 
reg gift mailsyear
di "gift = " %5.2f _b[_cons] " + " %5.2f _b[mailsyear] "mailsyear ; N=" _N " Rsq=" %5.4f e(r2)

. bcuse charity.dta, clear nodesc

. reg gift mailsyear giftlast propresp

      Source |       SS           df       MS      Number of obs   =     4,268
-------------+----------------------------------   F(3, 4264)      =    129.26
       Model |  80700.7052         3  26900.2351   Prob > F        =    0.0000
    Residual |  887399.134     4,264  208.114244   R-squared       =    0.0834
-------------+----------------------------------   Adj R-squared   =    0.0827
       Total |   968099.84     4,267  226.880675   Root MSE        =    14.426

------------------------------------------------------------------------------
        gift | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
   mailsyear |   2.166259   .3319271     6.53   0.000     1.515509    2.817009
    giftlast |   .0059265   .0014324     4.14   0.000     .0031184    .0087347
    propresp |   15.35861   .8745394    17.56   0.000     13.64405    17.07316
       _cons |  -4.551518   .8030336    -5.67   0.000    -6.125882   -2.977155
------------------------------------------------------------------------------

. di "gift = " %5.2f _b[_cons] " + " %5.2f _b[mailsyear] "mailsyear + " %5.3f _
> b[giftlast] "giftlast + " %5.3f _b[propresp] "propresp ; N=" _N " Rsq=" %5.4f
>  e(r2) 
gift = -4.55 +  2.17mailsyear + 0.006giftlast + 15.359propresp ; N=4268 Rsq=0.0
> 834

. reg gift mailsyear

      Source |       SS           df       MS      Number of obs   =     4,268
-------------+----------------------------------   F(1, 4266)      =     59.65
       Model |  13349.7251         1  13349.7251   Prob > F        =    0.0000
    Residual |  954750.114     4,266  223.804528   R-squared       =    0.0138
-------------+----------------------------------   Adj R-squared   =    0.0136
       Total |   968099.84     4,267  226.880675   Root MSE        =     14.96

------------------------------------------------------------------------------
        gift | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
   mailsyear |   2.649546   .3430598     7.72   0.000     1.976971    3.322122
       _cons |    2.01408   .7394696     2.72   0.006     .5643347    3.463825
------------------------------------------------------------------------------

. di "gift = " %5.2f _b[_cons] " + " %5.2f _b[mailsyear] "mailsyear ; N=" _N " 
> Rsq=" %5.4f e(r2)
gift =  2.01 +  2.65mailsyear ; N=4268 Rsq=0.0138

. 

iv. Add avggift to the model

%%stata
reg gift mailsyear giftlast propresp avggift
corr mailsyear giftlast avggift

. reg gift mailsyear giftlast propresp avggift

      Source |       SS           df       MS      Number of obs   =     4,268
-------------+----------------------------------   F(4, 4263)      =    267.33
       Model |  194137.386         4  48534.3466   Prob > F        =    0.0000
    Residual |  773962.453     4,263  181.553472   R-squared       =    0.2005
-------------+----------------------------------   Adj R-squared   =    0.1998
       Total |   968099.84     4,267  226.880675   Root MSE        =    13.474

------------------------------------------------------------------------------
        gift | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
   mailsyear |   1.201168    .312418     3.84   0.000     .5886665     1.81367
    giftlast |  -.2608573   .0107565   -24.25   0.000    -.2819456    -.239769
    propresp |   16.20464   .8175292    19.82   0.000     14.60186    17.80743
     avggift |   .5269471   .0210811    25.00   0.000     .4856172    .5682769
       _cons |  -7.327763     .75822    -9.66   0.000    -8.814269   -5.841257
------------------------------------------------------------------------------

. corr mailsyear giftlast avggift
(obs=4,268)

             | mailsy~r giftlast  avggift
-------------+---------------------------
   mailsyear |   1.0000
    giftlast |   0.0063   1.0000
     avggift |   0.0213   0.9921   1.0000


. 
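The correlation of 0.9921 between giftlast and avggift goes some way toward explaining why the giftlast coefficient changes sign once avggift is added. As a rough sketch (plain Python, using only the number in the correlation table), the pairwise correlation gives a lower bound on the variance inflation factor for either variable, because the R-squared from regressing giftlast on all the other regressors cannot be smaller than the squared pairwise correlation with avggift.

```python
r = 0.9921  # corr(giftlast, avggift), copied from the table above

# R-squared of giftlast on all other regressors is at least r**2,
# so 1 / (1 - r**2) is a lower bound on the VIF, not the VIF itself.
vif_lower_bound = 1.0 / (1.0 - r**2)

print(f"VIF lower bound for giftlast and avggift: {vif_lower_bound:.1f}")
```

The actual VIFs would come from running `vif` after the regression; this bound only shows that they must be large.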

Problem 3.10#

i. Range of educ

%%stata
use htv.dta, clear
sum edu motheduc fatheduc
sum edu motheduc fatheduc if educ==12
sum edu motheduc fatheduc if educ>=12

. use htv.dta, clear

. sum edu motheduc fatheduc

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        educ |      1,230     13.0374    2.354346          6         20
    motheduc |      1,230    12.17805    2.278067          0         20
    fatheduc |      1,230    12.44715    3.263835          0         20

. sum edu motheduc fatheduc if educ==12

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        educ |        512          12           0         12         12
    motheduc |        512    11.76367    1.697474          3         18
    fatheduc |        512    11.78516    2.626214          0         20

. sum edu motheduc fatheduc if educ>=12

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        educ |      1,044    13.62835    1.991304         12         20
    motheduc |      1,044    12.46264      2.1308          3         20
    fatheduc |      1,044    12.86782    3.121349          0         20

. 
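The summary tables above already contain what is needed for the share of men who stopped at exactly twelve years of schooling and for a comparison of average education levels. A small Python sketch, with the counts and means copied from those tables:

```python
n_total = 1230       # observations in the full sample
n_exactly_12 = 512   # observations with educ == 12

share_12 = 100 * n_exactly_12 / n_total

# Sample means copied from the first summary table.
mean_educ, mean_motheduc, mean_fatheduc = 13.0374, 12.17805, 12.44715

print(f"share with educ == 12: {share_12:.1f}%")
print("men have more schooling than either parent on average:",
      mean_educ > max(mean_motheduc, mean_fatheduc))
```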

ii. Regress educ on the parents' education

%%stata
reg educ motheduc fatheduc 

      Source |       SS           df       MS      Number of obs   =     1,230
-------------+----------------------------------   F(2, 1227)      =    203.68
       Model |   1697.9676         2    848.9838   Prob > F        =    0.0000
    Residual |  5114.31207     1,227   4.1681435   R-squared       =    0.2493
-------------+----------------------------------   Adj R-squared   =    0.2480
       Total |  6812.27967     1,229  5.54294522   Root MSE        =    2.0416

------------------------------------------------------------------------------
        educ | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    motheduc |   .3041971   .0319266     9.53   0.000     .2415603     .366834
    fatheduc |   .1902858   .0222839     8.54   0.000     .1465669    .2340046
       _cons |   6.964355   .3198205    21.78   0.000     6.336899     7.59181
------------------------------------------------------------------------------

iii. Add abil to the model

%%stata
reg educ motheduc fatheduc abil

      Source |       SS           df       MS      Number of obs   =     1,230
-------------+----------------------------------   F(3, 1226)      =    305.17
       Model |  2912.30705         3  970.769018   Prob > F        =    0.0000
    Residual |  3899.97262     1,226  3.18105434   R-squared       =    0.4275
-------------+----------------------------------   Adj R-squared   =    0.4261
       Total |  6812.27967     1,229  5.54294522   Root MSE        =    1.7836

------------------------------------------------------------------------------
        educ | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    motheduc |   .1891314   .0285062     6.63   0.000     .1332051    .2450578
    fatheduc |   .1110854   .0198849     5.59   0.000     .0720733    .1500976
        abil |   .5024829    .025718    19.54   0.000     .4520268     .552939
       _cons |    8.44869   .2895407    29.18   0.000      7.88064     9.01674
------------------------------------------------------------------------------

iv. Add abilsq to the model

%%stata
g abilsq= abil^2 
reg educ motheduc fatheduc abil abilsq
di %5.3f _b[abil] "+" %5.3f 2*_b[abilsq] "abil = 0" 
di "abil = "  -_b[abil] / (2*_b[abilsq])

. g abilsq= abil^2 

. reg educ motheduc fatheduc abil abilsq

      Source |       SS           df       MS      Number of obs   =     1,230
-------------+----------------------------------   F(4, 1225)      =    244.91
       Model |  3027.03706         4  756.759264   Prob > F        =    0.0000
    Residual |  3785.24262     1,225  3.08999397   R-squared       =    0.4444
-------------+----------------------------------   Adj R-squared   =    0.4425
       Total |  6812.27967     1,229  5.54294522   Root MSE        =    1.7578

------------------------------------------------------------------------------
        educ | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    motheduc |   .1901261   .0280957     6.77   0.000     .1350051    .2452472
    fatheduc |   .1089387   .0196014     5.56   0.000     .0704827    .1473946
        abil |   .4014624   .0302875    13.26   0.000     .3420413    .4608835
      abilsq |    .050599   .0083039     6.09   0.000     .0343076    .0668905
       _cons |   8.240226   .2874099    28.67   0.000     7.676356    8.804097
------------------------------------------------------------------------------

. di %5.3f _b[abil] "+" %5.3f 2*_b[abilsq] "abil = 0" 
0.401+0.101abil = 0

. di "abil = "  -_b[abil] / (2*_b[abilsq])
abil = -3.9670977

. 

v. How many men have abil below the turning point?

%%stata
count if abil < -3.967
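The turning point in part iv comes from setting the derivative of the fitted quadratic with respect to abil to zero, which gives abil* = -b_abil / (2 * b_abilsq). Because the coefficient on abilsq is positive, this point is a minimum, and it is negative. The Python sketch below simply redoes that arithmetic with the rounded coefficients from the regression table; the variable names are illustrative.

```python
# Coefficients copied from the regression with abilsq.
b_abil = 0.4014624
b_abilsq = 0.050599

# First-order condition: b_abil + 2 * b_abilsq * abil = 0
abil_star = -b_abil / (2 * b_abilsq)

# Second derivative 2 * b_abilsq > 0 means the quadratic has a minimum here.
is_minimum = 2 * b_abilsq > 0

print(f"abil* = {abil_star:.3f}, minimum: {is_minimum}")
```

The relevant comparison for part v is therefore with men whose abil lies below this negative value.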

Problem 3.11#

i. Regress math4 on pctsgle

%%stata
use meapsingle.dta, clear
reg math4 pctsgle

. use meapsingle.dta, clear
(Written by R.              )

. reg math4 pctsgle

      Source |       SS           df       MS      Number of obs   =       229
-------------+----------------------------------   F(1, 227)       =    138.85
       Model |  21625.7284         1  21625.7284   Prob > F        =    0.0000
    Residual |  35354.2892       227  155.745767   R-squared       =    0.3795
-------------+----------------------------------   Adj R-squared   =    0.3768
       Total |  56980.0176       228  249.912358   Root MSE        =     12.48

------------------------------------------------------------------------------
       math4 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     pctsgle |  -.8328814   .0706815   -11.78   0.000    -.9721572   -.6936056
       _cons |   96.77043   1.596802    60.60   0.000     93.62398    99.91688
------------------------------------------------------------------------------

. 

ii. Add lmedinc and free to the model

%%stata
reg math4 pctsgle lmedinc free

      Source |       SS           df       MS      Number of obs   =       229
-------------+----------------------------------   F(3, 225)       =     63.85
       Model |  26201.7562         3  8733.91873   Prob > F        =    0.0000
    Residual |  30778.2614       225  136.792273   R-squared       =    0.4598
-------------+----------------------------------   Adj R-squared   =    0.4526
       Total |  56980.0176       228  249.912358   Root MSE        =    11.696

------------------------------------------------------------------------------
       math4 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     pctsgle |  -.1996453   .1587163    -1.26   0.210    -.5124059    .1131153
     lmedinc |   3.560128   5.041704     0.71   0.481    -6.374869    13.49512
        free |  -.3964185    .070346    -5.64   0.000    -.5350398   -.2577973
       _cons |   51.72322   58.47814     0.88   0.377    -63.51166    166.9581
------------------------------------------------------------------------------

iii. Correlation between lmedinc and free

%%stata
corr lmedinc free 

(obs=229)

             |  lmedinc     free
-------------+------------------
     lmedinc |   1.0000
        free |  -0.7470   1.0000

v. Find the VIFs

%%stata
reg math4 pctsgle lmedinc free
vif
collin math4 pctsgle lmedinc free // help collin

. reg math4 pctsgle lmedinc free

      Source |       SS           df       MS      Number of obs   =       229
-------------+----------------------------------   F(3, 225)       =     63.85
       Model |  26201.7562         3  8733.91873   Prob > F        =    0.0000
    Residual |  30778.2614       225  136.792273   R-squared       =    0.4598
-------------+----------------------------------   Adj R-squared   =    0.4526
       Total |  56980.0176       228  249.912358   Root MSE        =    11.696

------------------------------------------------------------------------------
       math4 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     pctsgle |  -.1996453   .1587163    -1.26   0.210    -.5124059    .1131153
     lmedinc |   3.560128   5.041704     0.71   0.481    -6.374869    13.49512
        free |  -.3964185    .070346    -5.64   0.000    -.5350398   -.2577973
       _cons |   51.72322   58.47814     0.88   0.377    -63.51166    166.9581
------------------------------------------------------------------------------

. vif

    Variable |       VIF       1/VIF  
-------------+----------------------
     pctsgle |      5.74    0.174186
     lmedinc |      4.12    0.242788
        free |      3.19    0.313669
-------------+----------------------
    Mean VIF |      4.35

. collin math4 pctsgle lmedinc free // help collin
(obs=229)

  Collinearity Diagnostics

                        SQRT                   R-
  Variable      VIF     VIF    Tolerance    Squared
----------------------------------------------------
     math4      1.85    1.36    0.5402      0.4598
   pctsgle      5.78    2.40    0.1730      0.8270
   lmedinc      4.13    2.03    0.2423      0.7577
      free      3.64    1.91    0.2749      0.7251
----------------------------------------------------
  Mean VIF      3.85

                           Cond
        Eigenval          Index
---------------------------------
    1     4.3011          1.0000
    2     0.6266          2.6199
    3     0.0602          8.4558
    4     0.0120         18.9037
    5     0.0001        217.8816
---------------------------------
 Condition Number       217.8816 
 Eigenvalues & Cond Index computed from scaled raw sscp (w/ intercept)
 Det(correlation matrix)    0.0416

. 
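Each VIF is 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing that variable on the others. The Python sketch below reproduces the `collin` VIF column from its R-Squared column as a check. Note that the `collin` call above lists math4 along with the regressors, so its auxiliary regressions include the dependent variable; that is presumably why its VIFs differ somewhat from those reported by `vif`.

```python
# R-squared values copied from the `collin` output above.
aux_r2 = {"pctsgle": 0.8270, "lmedinc": 0.7577, "free": 0.7251}

# VIF_j = 1 / (1 - R_j^2); tolerance is the reciprocal, 1 - R_j^2.
vif = {name: 1.0 / (1.0 - r2) for name, r2 in aux_r2.items()}

for name, value in vif.items():
    print(f"{name:8s} VIF = {value:.2f}  tolerance = {1 / value:.4f}")
```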

Problem 3.12#

i. Perfect scores and summary statistics

%%stata
use econmath.dta, clear
sum score actmth acteng if score==100
sum score actmth acteng

. use econmath.dta, clear
(Written by R.              )

. sum score actmth acteng if score==100

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       score |          0
      actmth |          0
      acteng |          0

. sum score actmth acteng

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       score |        856    72.59981    13.40068      19.53      98.44
      actmth |        814     23.2113    3.773354         12         36
      acteng |        814    22.59459    3.788735         12         34

. 

ii. Regress score on colgpa, actmth, and acteng

%%stata
reg score colgpa actmth acteng

      Source |       SS           df       MS      Number of obs   =       814
-------------+----------------------------------   F(3, 810)       =    177.94
       Model |  57165.5682         3  19055.1894   Prob > F        =    0.0000
    Residual |  86743.1988       810  107.090369   R-squared       =    0.3972
-------------+----------------------------------   Adj R-squared   =    0.3950
       Total |  143908.767       813  177.009554   Root MSE        =    10.348

------------------------------------------------------------------------------
       score | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      colgpa |    12.3662   .7150624    17.29   0.000     10.96261    13.76979
      actmth |   .8833519   .1121984     7.87   0.000      .663118    1.103586
      acteng |    .051764   .1110631     0.47   0.641    -.1662415    .2697696
       _cons |   16.17402   2.800439     5.78   0.000     10.67704    21.67099
------------------------------------------------------------------------------

iii. Test actmth and acteng individually

%%stata
test actmth 
test acteng

. test actmth 

 ( 1)  actmth = 0

       F(  1,   810) =   61.99
            Prob > F =    0.0000

. test acteng

 ( 1)  acteng = 0

       F(  1,   810) =    0.22
            Prob > F =    0.6413

. 
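For a single exclusion restriction, the F statistic from `test` equals the square of the t statistic in the regression table. The sketch below checks this in plain Python using the coefficients and standard errors reported above; small differences from the printed values can arise from rounding.

```python
# (coefficient, standard error) copied from the regression table.
estimates = {
    "actmth": (0.8833519, 0.1121984),
    "acteng": (0.051764, 0.1110631),
}

# t = coefficient / standard error; F for one restriction is t squared.
f_stats = {name: (b / se) ** 2 for name, (b, se) in estimates.items()}

for name, f in f_stats.items():
    print(f"{name}: F = {f:.2f}")
```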

iv. R-squared

%%stata
di " R-squared=" %5.4f e(r2)

 R-squared=0.3972
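The R-squared returned by `e(r2)` can be recovered from the sums of squares in the regression table for score. A short Python sketch, with every number copied from that table:

```python
# Sums of squares and degrees of freedom from the ANOVA block of the regression.
ss_model = 57165.5682
ss_resid = 86743.1988
ss_total = 143908.767
df_resid, df_total = 810, 813

r2 = ss_model / ss_total

# Adjusted R-squared compares the residual and total mean squares.
adj_r2 = 1 - (ss_resid / df_resid) / (ss_total / df_total)

print(f"R-squared = {r2:.4f}, adjusted R-squared = {adj_r2:.4f}")
```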