How does the stochastic nature of the racial landscape affect the accuracy of segregation and diversity metrics?

The racial landscape (RL) should be constructed using the smallest available census subdivisions, as such units are relatively racially homogeneous. In that case, the resulting spatio-racial pattern and the values of the segregation and diversity metrics are only marginally affected by the stochastic nature of the calculations.

To illustrate how the segregation and diversity metrics change between realizations, we chose 51 counties with different spatio-racial patterns and, for each county, calculated 50 realizations.

Across these counties, the segregation metric \(MI\) ranges from 0.007 to 0.379 (with standard deviations between 0.0002 and 0.002), and the diversity metric \(E\) ranges from 0.91 to 2.13 (with standard deviations between 0.0003 and 0.005).
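The per-county summary described above (a mean and a standard deviation over 50 realizations) can be sketched as follows. This is a minimal illustration, not the code used to produce the reported numbers: the realization values are synthetic placeholders drawn around an arbitrary \(MI\) of 0.08, the helper name `summarize_realizations` is hypothetical, and the use of the sample standard deviation is an assumption.

```python
import random
import statistics


def summarize_realizations(values):
    """Return (mean, sample standard deviation) of one metric across realizations."""
    return statistics.mean(values), statistics.stdev(values)


# Synthetic stand-in for the MI values of 50 realizations of a single county.
# In practice these values would come from the metric calculated on each realization.
rng = random.Random(42)
mi_realizations = [0.08 + rng.gauss(0, 0.001) for _ in range(50)]

mean_mi, sd_mi = summarize_realizations(mi_realizations)
print(f"mean MI = {mean_mi:.4f}, SD = {sd_mi:.4f}")
```

Repeating this for each of the 51 counties gives one mean and one standard deviation per county, which is what the figures below summarize.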

Segregation metric \(MI\)

In the figure below, the black points show the segregation metric calculated for each realization; the red point is the mean of the 50 realizations. The spread of the values is small regardless of the magnitude of the segregation metric.

Figure 1. Deviation of the segregation metric in 51 counties based on 50 realizations.

Diversity metric \(E\)

We repeated the same calculation for the diversity metric, entropy \(E\). In the figure below, the black points show the diversity metric calculated for each realization; the red point is the mean of the 50 realizations. The spread of the values is small regardless of the magnitude of the diversity metric.

Figure 2. Deviation of the diversity metric in 51 counties based on 50 realizations.

How accurate are the metrics calculated using the NRGD2020 dataset?

The National Racial Geography 2020 dataset (NRGD2020; https://socscape.edu.pl/index.php?id=nrgd) provides RL racial ID and RL population density grids that can be used with the raceland package to calculate racial segregation and diversity metrics for any user-defined region. Both grids represent just one realization of the racial landscape. How accurate are the metrics calculated using the NRGD2020 dataset compared to calculations performed using a larger number of realizations?

To illustrate how accurate the segregation and diversity metrics are, we chose 51 counties with different spatio-racial patterns. For each county, we calculated the racial segregation and diversity metrics based on 5, 10, 20, 30, 40, and 50 realizations, and based on the RL racial ID and RL population density grids included in the NRGD2020 dataset. We then compared the segregation and diversity rankings of those 51 counties.

Conclusions

The Spearman correlation between the segregation metric calculated using NRGD and the mean values from 5, 10, 20, 30, 40, and 50 realizations ranges from 0.9997 to 0.9999. This indicates that the rankings based on the segregation metric are almost identical.
For example, when the segregation ranking based on 50 realizations is compared with the ranking based on the single realization from the NRGD2020 dataset, only 2 pairs of counties switch positions:
- Shelby, AL and San Francisco, CA switch between positions 15 and 16. (Shelby, AL: \(MI\) based on 50 realizations is 0.44835, and based on NRGD is 0.04534; San Francisco, CA: \(MI\) based on 50 realizations is 0.04490, and based on NRGD is 0.04512)
- Bronx, NY and Kent, MI switch between positions 22 and 23. (Kent, MI: \(MI\) based on 50 realizations is 0.08036, and based on NRGD is 0.0796; Bronx, NY: \(MI\) based on 50 realizations is 0.08013, and based on NRGD is 0.08039)
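To see why two swaps of neighboring positions barely change the Spearman correlation, the sketch below applies the no-ties formula \(\rho = 1 - 6 \sum d_i^2 / (n(n^2 - 1))\) to a ranking of 51 counties in which positions 15/16 and 22/23 trade places. It assumes no ties and that all other positions are identical; the helper name `spearman_from_ranks` is ours.

```python
def spearman_from_ranks(rank_a, rank_b):
    """Spearman correlation of two rankings without ties."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


n_counties = 51
rank_50 = list(range(1, n_counties + 1))  # ranking from 50 realizations
rank_nrgd = rank_50.copy()                # ranking from the single NRGD realization

# Positions 15/16 and 22/23 trade places (1-based positions, 0-based indices).
for lower in (15, 22):
    rank_nrgd[lower - 1], rank_nrgd[lower] = rank_nrgd[lower], rank_nrgd[lower - 1]

rho = spearman_from_ranks(rank_50, rank_nrgd)
print(round(rho, 4))  # 0.9998
```

Each swap contributes a squared rank difference of 1 for two counties, so the sum of squared differences is 4 and \(\rho = 1 - 24/132600 \approx 0.9998\), which falls within the range reported above.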
The Spearman correlation between the diversity metric calculated using NRGD and the mean values from 5, 10, 20, 30, 40, and 50 realizations ranges from 0.9996 to 0.9997. This indicates that the rankings based on the diversity metric are almost identical.
For example, when the diversity ranking based on 50 realizations is compared with the ranking based on the single realization from the NRGD2020 dataset, only 2 pairs of counties switch positions:
- Bronx, NY and Bernalillo, NM switch between positions 25 and 26. (Bronx, NY: \(E\) based on 50 realizations is 1.6878, and based on NRGD is 1.6838; Bernalillo, NM: \(E\) based on 50 realizations is 1.6849, and based on NRGD is 1.6843)
- Santa Clara, CA and San Diego, CA switch between positions 33 and 34. (Santa Clara, CA: \(E\) based on 50 realizations is 1.8795, and based on NRGD is 1.8776; San Diego, CA: \(E\) based on 50 realizations is 1.8791, and based on NRGD is 1.8784)
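The position switches listed above can be checked directly from the quoted \(E\) values. The sketch below is only an illustration on these four counties, not the full 51-county comparison; the helper name `switched_pairs` is hypothetical. It flags every pair of counties whose relative order differs between the two sets of values.

```python
from itertools import combinations

# E values quoted above: mean of 50 realizations vs. the single NRGD realization.
e_mean50 = {
    "Bronx, NY": 1.6878,
    "Bernalillo, NM": 1.6849,
    "Santa Clara, CA": 1.8795,
    "San Diego, CA": 1.8791,
}
e_nrgd = {
    "Bronx, NY": 1.6838,
    "Bernalillo, NM": 1.6843,
    "Santa Clara, CA": 1.8776,
    "San Diego, CA": 1.8784,
}


def switched_pairs(a, b):
    """Return pairs of keys whose relative order differs between a and b."""
    return [
        (x, y)
        for x, y in combinations(a, 2)
        if (a[x] - a[y]) * (b[x] - b[y]) < 0
    ]


for pair in switched_pairs(e_mean50, e_nrgd):
    print(pair)
```

A pair is flagged when the sign of the difference between the two counties flips, so the differences behind each switch are visible in the inputs: at most 0.0029 in \(E\) for these pairs.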