
Analysis of Aggregate Means

As in the other forms of analysis described above, valid analysis of stratified means requires careful use of weights. Because appropriate weighting must also take the stratified variances into account, the details go beyond the scope of what can be discussed here.

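One well-known difficulty in the interpretation of aggregate data is the issue of ecological inflation of association: a relationship among aggregate means can appear much stronger than the same relationship among individuals. To provide a simple example, below I present regression analyses relating heparin sulfate levels to weight (untransformed), based on the data from Chapter 2. As the plots and the $R^2$ values show, the results based on the aggregate data are quite misleading: $R^2$ is 0.1519 for the 148 individual observations but 0.8446 for the 6 aggregate means.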

[Figure: parts/m29.hep1]
[Figure: parts/m29.hep2]

. regress hsulf weight

      Source |       SS       df       MS              Number of obs =     148
-------------+------------------------------           F(  1,   146) =   26.15
       Model |  1.75793155     1  1.75793155           Prob > F      =  0.0000
    Residual |  9.81634188   146  .067235218           R-squared     =  0.1519
-------------+------------------------------           Adj R-squared =  0.1461
       Total |  11.5742734   147  .078736554           Root MSE      =   .2593

. regress avhep avwt

      Source |       SS       df       MS              Number of obs =       6
-------------+------------------------------           F(  1,     4) =   21.74
       Model |  .063427068     1  .063427068           Prob > F      =  0.0096
    Residual |  .011669653     4  .002917413           R-squared     =  0.8446
-------------+------------------------------           Adj R-squared =  0.8058
       Total |  .075096721     5  .015019344           Root MSE      =  .05401
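
The same effect can be reproduced with a small simulation. The following is only a sketch using synthetic data, not the Chapter 2 heparin data: the number of groups, the group sizes, the slope, and the noise level are all arbitrary assumptions chosen for illustration, and the function names `r_squared` and `simulate` are hypothetical. For a regression on a single predictor, $R^2$ equals the squared correlation, which is the same quantity as Model SS divided by Total SS in the output above (for example, 1.75793155 / 11.5742734 is approximately 0.1519).

```python
import random


def r_squared(x, y):
    """R-squared of the least-squares line of y on x.

    With one predictor this is the squared Pearson correlation,
    i.e. Model SS / Total SS.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def simulate(seed=0, n_groups=6, n_per_group=25):
    """Return (individual-level R^2, group-mean R^2) for synthetic data.

    ASSUMPTIONS (illustrative only, not taken from the source data):
    group centres 50, 60, ... for x, within-group SD of 10,
    a true slope of 0.01, and individual noise with SD 0.25.
    """
    rng = random.Random(seed)
    xs, ys = [], []            # individual-level observations
    mean_xs, mean_ys = [], []  # one aggregate mean per group
    for g in range(n_groups):
        centre = 50 + 10 * g
        gx = [rng.gauss(centre, 10) for _ in range(n_per_group)]
        gy = [0.01 * xi + rng.gauss(0, 0.25) for xi in gx]
        xs.extend(gx)
        ys.extend(gy)
        mean_xs.append(sum(gx) / n_per_group)
        mean_ys.append(sum(gy) / n_per_group)
    return r_squared(xs, ys), r_squared(mean_xs, mean_ys)


r2_individual, r2_aggregate = simulate()
print(f"individual-level R^2: {r2_individual:.4f}")
print(f"aggregate-mean R^2:   {r2_aggregate:.4f}")
```

Averaging within groups removes most of the individual-level noise while leaving the between-group trend intact, so the aggregate $R^2$ is much larger even though the underlying individual relationship is unchanged.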



Rollin Brant 2004-03-24