Submitted by Andy Anderson on Friday, 3/4/2011, at 3:50 PM

By Tuesday, March 8, at 1 PM, please do the following exercises:

1. On whichever Windows computer you're using, set up the assignment:
1. Map the network drive \\storage\colq-32 (if necessary).
2. Inside your own folder, locate your copy of the folder named “Class Week 6” and drag a copy to your desktop (this will make your work faster). You don't need to do this if you're sitting at the same computer as on Thursday, and haven't made any changes to this folder on COLQ-32.
3. Rename the folder on your desktop “Assignment Week 7”.
2. You will be continuing your statistical analysis of the urban renewal areas in 1960 Cambridge, for which we designed a relatively poor model on Thursday afternoon.
1. The first task is to create a better set of explanatory variables:
1. With Excel, open the file MA_tract_1960_race_housing.xls (or MA_tract_1960_race_housing.csv), which you worked on last week.
2. If you don't already have a column for fraction of housing that is OwnerOccupied (independent of race), create one.
3. If you don't already have a column for fraction of population that is Negro, create one.
4. If you don't already have a column for fraction of population that is OtherRaces, create one.
5. Create a column for fraction of housing that was built Before1940.
6. Create a column for average owner-occupied HousingValue. This can be estimated with a weighted average:

= (V0040001 * (0 + 5000)/2 + V0040002 * (5000 + 7400)/2 +
V0040003 * (7500 + 9900)/2 + V0040004 * (10000 + 12400)/2 +
V0040005 * (12500 + 14900)/2 + V0040006 * (15000 + 17400)/2 +
V0040007 * (17500 + 19900)/2 + V0040008 * (20000 + 24900)/2 +
V0040009 * (25000 + 34900)/2 + V0040010 * (35000 + 44900)/2) /
SUM(V0040001:V0040010)

Copy the above expression, paste it into the first row of the new column, and replace the V004* names with Excel cell references for that row. Then drag the lower-right corner of that cell down the column to copy the formula so that its row references update.
7. Review the instructions handed out in class for Regression Analysis Basics. In the list of Common Regression Problems, note that a good model should avoid multicollinearity, i.e. don't use explanatory variables that are cross-correlated with each other and are therefore redundant. An initial test for cross-correlation can be performed in Excel using the CORREL function, for example:

= CORREL(M2:M31,R2:R31)

where the cell ranges refer to two of the possible explanatory variables. (Note that only the thirty Cambridge census tracts, MC001 through MC0030, should be used here, since that is the area of analysis.) A value near zero indicates no linear correlation between the two variables, while a value approaching 1 or -1 indicates a strong positive or negative correlation, respectively. Are any pairs of these variables significantly correlated? Why wouldn't we want to use the fraction of housing that is renter-occupied as an additional explanatory variable? Write your answers in a text document in your “Assignment Week 7” folder.
8. Save this document, in Excel format if it isn't already (to preserve the formulas).
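For anyone who wants to check the spreadsheet results outside Excel, the weighted-average estimate and the CORREL test above can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, the sample numbers are invented, and returning None for a tract with no reported units is an assumption (Excel would show a division error instead).

```python
# Bin limits copied from the spreadsheet formula above, one (low, high)
# pair per count column V0040001 ... V0040010.
BINS = [(0, 5000), (5000, 7400), (7500, 9900), (10000, 12400),
        (12500, 14900), (15000, 17400), (17500, 19900),
        (20000, 24900), (25000, 34900), (35000, 44900)]


def housing_value(counts):
    """Weighted average of bin midpoints; counts holds one entry per bin."""
    total = sum(counts)
    if total == 0:
        return None  # assumption: no owner-occupied units reported
    weighted = sum(n * (low + high) / 2 for n, (low, high) in zip(counts, BINS))
    return weighted / total


def correl(xs, ys):
    """Pearson correlation coefficient, the quantity CORREL reports."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Invented numbers, for illustration only.
print(housing_value([0, 0, 10, 10, 0, 0, 0, 0, 0, 0]))
print(correl([1, 2, 3, 4], [2, 4, 6, 8]))
```

The correlation function fails with a division by zero if either column is constant, which is the same situation in which Excel cannot compute a correlation.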
2. The next task is to repeat the Dissolve analysis that we performed on Thursday, since an actual dissolve didn't take place. We shouldn't have included any of the FID fields as Dissolve Fields: their values always differ between features, whereas features "dissolve" into one only when every Dissolve Field has identical values.
1. Open your map document from Thursday.
2. Open ArcToolbox and locate Data Management Tools and then Generalization, and then double-click on Dissolve.
3. Set the Input Features to be the layer MA_tract_1960_UR that we created on Thursday.
4. Verify that the Output Feature Class will be in your folder “Assignment Week 7”.
5. Set the Dissolve Fields to be only from NHGISST through SHAPE_LEN.
6. Set the Statistics Fields to just the URArea with a Statistic Type of SUM.
7. Click OK.
8. When the analysis is finished, close the dialog.
9. If the new dissolve layer, e.g. MA_tract_1960_UR_Dissolve, is not automatically added to your map, add it.
10. Open its attribute table and verify that each census tract in the Cambridge area now only shows up once, along with its total urban renewal area.
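To see why the FID fields prevented the dissolve, it may help to think of Dissolve as a group-and-sum over the attribute table. The sketch below is not the ArcGIS tool, only a plain-Python analogy; the field name TRACT, the tract labels, and the areas are hypothetical.

```python
from collections import defaultdict


def dissolve(rows, dissolve_fields, sum_field):
    """Group rows whose dissolve-field values all match, summing one field."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[field] for field in dissolve_fields)
        totals[key] += row[sum_field]
    return dict(totals)


# Hypothetical table: tract "A" is split into two urban renewal pieces.
rows = [
    {"FID": 0, "TRACT": "A", "URArea": 1.5},
    {"FID": 1, "TRACT": "A", "URArea": 2.0},
    {"FID": 2, "TRACT": "B", "URArea": 4.0},
]

merged = dissolve(rows, ["TRACT"], "URArea")           # one row per tract
unmerged = dissolve(rows, ["FID", "TRACT"], "URArea")  # FID keeps rows apart
print(merged)
print(len(unmerged))
```

Because every feature has its own FID, any grouping key that includes it is unique, so nothing is merged and no areas are summed.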
3. Add the file MA_tract_1960_race_housing.xls to your map (if it isn't already), and then join it to the layer MA_tract_1960_UR_Dissolve created in step (b), keeping only matching records.
4. A Geographically Weighted Regression analysis will only be useful if the explanatory variables show evidence of spatial autocorrelation. A good measure of this is the quantity called Moran’s I, described in the class handout.
1. Open ArcToolbox and locate Spatial Statistics Tools and then Analyzing Patterns, and then double-click on Spatial Autocorrelation (Morans I).
2. Set the Input Feature Class to be the layer MA_tract_1960_UR_Dissolve created in Step (b) and joined in Step (c).
3. Set the Input Field to be one of the explanatory variables, e.g. OwnerOccupied.
4. Set the Conceptualization of Spatial Relationships to Polygon Contiguity (First Order), so that it will relate immediately neighboring tracts. This is best when you have relatively few polygons or when the center-to-center distance from one polygon to its neighbors is highly variable.
5. Click OK.
6. When the analysis is finished, copy the Global Moran's I Summary data to your text document in your “Assignment Week 7” folder (you may need to expand the tool window to see it). Is this a significantly autocorrelated variable? Write your answer in the same text document.
7. Close the dialog.
8. Repeat steps i-vii for the other explanatory variables.
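As a rough picture of what the index measures, Moran's I with first-order contiguity can be written out directly. The sketch below assumes simple binary weights (1 for neighbors, 0 otherwise) with no row standardization, and it computes only the index, not the z-score or p-value the tool also reports; the four-polygon layout and the values are invented.

```python
def morans_i(values, neighbors):
    """Global Moran's I using binary contiguity weights.

    values:    one number per polygon, indexed 0..n-1
    neighbors: dict mapping each polygon index to its adjacent indices
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # S0 is the sum of all weights; each neighbor link counts in both directions.
    s0 = sum(len(adjacent) for adjacent in neighbors.values())
    cross = sum(dev[i] * dev[j]
                for i, adjacent in neighbors.items()
                for j in adjacent)
    return (n / s0) * cross / sum(d * d for d in dev)


# Invented layout: four polygons in a row, 0-1-2-3.
row = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

print(morans_i([1, 2, 3, 4], row))  # similar values sit next to each other
print(morans_i([1, 4, 1, 4], row))  # high and low values alternate
```

A smooth gradient gives a positive index and an alternating pattern gives a negative one, which is the clustered versus dispersed distinction described in the handout.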
5. Now perform a Geographically Weighted Regression analysis, as described in the class handout:
1. Open ArcToolbox and locate Spatial Statistics Tools and then Modeling Spatial Relationships, and then double-click on Geographically Weighted Regression.
2. Set the Input Feature Class to be the layer MA_tract_1960_UR_Dissolve created in Step (b) and joined in Step (c).
3. Set the Dependent Variable to be SUM_URArea.
4. Set the Explanatory Variables to be the ones chosen above, but exclude the one variable that was significantly cross-correlated (multicollinear).
5. Verify that the Output Feature Class will be in your folder “Assignment Week 7”.
6. Click OK.
7. When the analysis is finished, review the summary data (you may need to expand the tool window to see it), and copy it to your text document in your “Assignment Week 7” folder.
8. Close the dialog.
9. If the shapefile created here, e.g. GeographicallyWeightedRegression2.shp, is not automatically added to your map and symbolized for you, add it and symbolize it by the parameter StdResidual. Then export the map as an image: choose the menu File and then Export Map…, and in the resulting dialog locate the menu Save as Type: and choose PNG; name the file appropriately, navigate to your folder “Assignment Week 7”, and click the button Save.
10. Repeat Step (d) on the parameter StdResidual.
11. Open the attribute table of GeographicallyWeightedRegression2 and copy the “typical values” of the model coefficients and their errors to your text document in your “Assignment Week 7” folder. (If a model parameter’s values vary widely from tract to tract rather than staying close to one another, that’s a problem with the model.)
12. Review the discussion of Interpreting GWR Results in the handout. Is this a reasonably good model based on the global parameters and model coefficient errors? Write your answers in the same text document in your “Assignment Week 7” folder.
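One possible way to pick a “typical value” for a coefficient and judge how much it varies is to summarize its column from the attribute table. The sketch below is only a suggestion: the function name and the five sample coefficients are invented, and using the mean with the sample standard deviation is an assumption rather than a rule from the handout.

```python
from statistics import mean, stdev


def summarize(coefficients):
    """Return the mean, sample standard deviation, minimum, and maximum."""
    return (mean(coefficients), stdev(coefficients),
            min(coefficients), max(coefficients))


# Invented local coefficients for one explanatory variable.
local = [0.42, 0.40, 0.41, 0.43, 0.39]

typical, spread, lowest, highest = summarize(local)
print(typical, spread, lowest, highest)
```

A spread that is small compared with the typical value suggests the coefficient is stable across tracts; a change of sign between the minimum and maximum would be a stronger warning.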
6. Save your map document.
3. Drag the folder “Assignment Week 7” from your desktop to your folder in \\storage\colq-32.
Due Date:
Tue, 03/08/2011 - 13:00