The RL should be constructed using the smallest available census subdivisions, as such units are relatively racially homogeneous. In such a case, the resultant spatio-racial pattern and the values of the segregation and diversity metrics are only marginally affected by the stochastic nature of the calculations.
To illustrate how the segregation and diversity metrics vary between realizations, we chose 51 counties with different spatio-racial patterns and, for each county, calculated 50 realizations.
Across these counties, the segregation metric \(MI\) ranges from 0.007 to 0.379 (with standard deviations between 0.0002 and 0.002), and the diversity metric \(E\) ranges from 0.91 to 2.13 (with standard deviations between 0.0003 and 0.005).
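The per-county summary described above (a mean and a standard deviation over realizations) can be sketched as follows. This is a minimal illustration that uses only the Python standard library. The function name `summarize_realizations` and the input values are hypothetical, and we assume the sample standard deviation, which the text does not specify.

```python
from statistics import mean, stdev

def summarize_realizations(values):
    """Return (mean, sample standard deviation) of one metric over realizations."""
    return mean(values), stdev(values)

# Hypothetical MI values for one county over five realizations
# (illustrative numbers only, not results from the study).
mi_values = [0.1502, 0.1498, 0.1505, 0.1499, 0.1501]

mi_mean, mi_sd = summarize_realizations(mi_values)
print(f"mean MI = {mi_mean:.4f}, sd = {mi_sd:.5f}")
```

A standard deviation that is small relative to the mean, as in this made-up example, is what the text refers to as a small deviation between realizations.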
In the figure below, the black points show the segregation metric calculated for each realization; the red point is the mean calculated from 50 realizations. The deviation of the metric values is small regardless of the value of the segregation metric.
We repeated the same calculation for the diversity metric, entropy \(E\). In the figure below, the black points show the diversity metric calculated for each realization; the red point is the mean calculated from 50 realizations. The deviation of the metric values is small regardless of the value of the diversity metric.
The National Racial Geography 2020 dataset (NRGD2020; https://socscape.edu.pl/index.php?id=nrgd) provides RL racial ID and RL population density grids that can be used with the raceland package to calculate racial segregation and diversity metrics for any arbitrary, user-defined region. Both grids represent just one realization of the racial landscape. How accurate are the metrics calculated using the NRGD2020 dataset compared with calculations performed using a larger number of realizations?
To illustrate how accurate the segregation and diversity metrics are, we chose 51 counties with different spatio-racial patterns. For each county, we calculated racial segregation and diversity metrics based on 5, 10, 20, 30, 40, and 50 realizations, and based on the RL racial ID and RL population density grids included in the NRGD2020 dataset. Then, we compared the segregation and diversity rankings for those 51 counties.
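The averaging over 5, 10, 20, 30, 40, and 50 realizations can be sketched as below. This is only an illustration under an assumption: we take the first k values from one set of 50 realizations, whereas the text does not say whether the smaller sets were subsets or generated independently. The function name `means_over_first_k` and the metric values are hypothetical.

```python
from statistics import mean

def means_over_first_k(values, ks=(5, 10, 20, 30, 40, 50)):
    """Mean of the first k realizations for each k (assumed subsetting scheme)."""
    return {k: mean(values[:k]) for k in ks if k <= len(values)}

# Hypothetical metric values for one county over 50 realizations:
# small, repeating fluctuations around 0.150 (illustrative only).
values = [0.150 + 0.0001 * ((i % 5) - 2) for i in range(50)]

means_by_k = means_over_first_k(values)
for k, m in means_by_k.items():
    print(f"{k:2d} realizations: mean = {m:.4f}")
```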
The Spearman correlation between the segregation metric calculated using the NRGD2020 dataset and the mean values from 5, 10, 20, 30, 40, and 50 realizations ranges from 0.9997 to 0.9999. This indicates that the rankings based on the segregation metric are almost identical. Further investigation shows that only 2 pairs of counties switch positions.
For example, when the segregation ranking based on 50 realizations is compared with the ranking based on the single realization from the NRGD2020 dataset, only 2 pairs of counties switch positions.
The Spearman correlation between the diversity metric calculated using the NRGD2020 dataset and the mean values from 5, 10, 20, 30, 40, and 50 realizations ranges from 0.9996 to 0.9997. This indicates that the rankings based on the diversity metric are almost identical. Further investigation shows that only 2 pairs of counties switch positions.
For example, when the diversity ranking based on 50 realizations is compared with the ranking based on the single realization from the NRGD2020 dataset, only 2 pairs of counties switch positions.
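The ranking comparisons above rest on two operations: a Spearman correlation between two sets of metric values, and a check of which counties change rank. Both can be sketched with the Python standard library. This is a sketch, not the code used to produce the reported numbers. The helper names are hypothetical, and the six input values are made up for illustration.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def changed_positions(x, y):
    """Indices of units whose rank differs between the two rankings."""
    rx, ry = average_ranks(x), average_ranks(y)
    return [i for i, (a, b) in enumerate(zip(rx, ry)) if a != b]

# Hypothetical metric values for six counties (illustrative only).
mean_of_realizations = [0.010, 0.052, 0.055, 0.120, 0.250, 0.379]
single_realization = [0.011, 0.056, 0.054, 0.121, 0.249, 0.378]

rho = spearman(mean_of_realizations, single_realization)
swapped = changed_positions(mean_of_realizations, single_realization)
print(f"Spearman rho = {rho:.4f}; counties that changed rank: {swapped}")
```

In this made-up example, the second and third counties have nearly equal values and exchange ranks, which is the kind of switch between a pair of counties that the text describes.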