A common question in demographic studies is how zones within a metropolitan statistical area (MSA) compare in racial diversity and segregation. For example, how do racial diversity and segregation differ between suburban areas and the principal city?

Here, we demonstrate how to conduct such an analysis using the NRGD2020 dataset and the R package raceland. Our case study is the Atlanta MSA in 2020.

1 Data

The code below can be replicated using data from the Atlanta dataset.

Here we also describe which NRGD2020 layers and preprocessing steps are required to perform the calculations.

1.1 Data preparation

Data preparation consists of cropping the RL grid layers (RL grid racial ID and RL grid population density) to the extent of the MSA, the principal city, and the suburban zones. Because the RL grids are large, the data should be prepared in GIS software rather than in R.

In our example, we use the typology of suburban areas introduced by Lichter et al. (2023). They divide each MSA into a principal city and three suburban zones. The suburban zones are classified by county.

  • principal cities - the area of the city based on US Place data;
  • inner-ring suburbs - the area outside the principal city but within the counties containing the principal city (core counties);
  • outlying suburbs - counties included in the MSA in 1993 or earlier;
  • fringe suburbs - counties reclassified from non-metro to metro after 1993. In 2020 they consist of a mix of rural and urban areas located within the MSA boundaries.

An example of such data for the Atlanta MSA can be downloaded from http://www.socscape.edu.pl/index.php?id=nrgd. The zip archive contains the RL grid racial ID and RL grid population density layers for the Atlanta MSA and for the four zones within the MSA.

2 Calculating racial diversity and segregation metrics using RL method

The code below shows how to calculate racial diversity and segregation metrics from the RL grid racial ID and RL grid population density layers using the raceland package. We use the Atlanta MSA as an example.

#load packages
library(raceland)
library(terra)

#set working directory to the folder containing data using function setwd("")
##setwd("")

#read RL grid racial ID to R using terra package
rl = rast("Atlanta_dataset/RL_grids/rl_msa.tif")
#read RL grid population density to R using terra package
rd = rast("Atlanta_dataset/RL_grids/rd_msa.tif")
#calculate diversity and segregation metrics using raceland package 
metr = calculate_metrics(rl, rd, fun = "mean", threshold = 1)
#return metrics in a user-friendly format
res = c(ENTROPY = metr$ent, HILL = 2^metr$ent, MI = metr$mutinf, NMI = metr$mutinf/metr$ent)
round(res, 4)
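
To make the derived quantities in res easier to read, the following base-R sketch illustrates only the arithmetic behind HILL = 2^ENTROPY and NMI = MI / ENTROPY. The share vector, the mutual information value, and the helper name entropy_bits are hypothetical and chosen for illustration; they are not taken from the Atlanta data, and the sketch does not reproduce how raceland derives its metrics from the grids.

```r
# Hypothetical shares of four racial groups (illustration only; they sum to 1)
shares <- c(0.5, 0.25, 0.125, 0.125)

# Shannon entropy in bits (log base 2); zero shares contribute nothing
entropy_bits <- function(p) {
  p <- p[p > 0]
  -sum(p * log2(p))
}

ent  <- entropy_bits(shares)  # 1.75 bits for these shares
hill <- 2^ent                 # effective number of equally common groups

# Hypothetical mutual information value, used only to show the normalization
mi  <- 0.35
nmi <- mi / ent

print(round(c(ENTROPY = ent, HILL = hill, MI = mi, NMI = nmi), 4))
```

Because the entropy is expressed in bits, raising 2 to that power converts it back to a count of groups, which is why the Hill number is often easier to interpret than the entropy itself.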

3 Atlanta example - how do racial segregation and diversity differ between the principal city and suburban zones?

The code below calculates racial segregation and diversity metrics for the metropolitan statistical area, principal city, and 3 suburban zones. The resultant metrics are returned as a table.

library(raceland)
library(terra)

#set working directory to the folder containing data using function setwd("")
##setwd("")

#list path to data files
data = list.files("Atlanta_dataset/RL_grids", full.names = TRUE)
#extract zone name (msa, city, fringe, outlying, inner) from file name
zones = unique(sapply(strsplit(basename(data), "[_. ]+"), function(x) x[2]))

metrics = data.frame()
for (zone in zones) {
  #read RL grids 
  rl = rast(file.path("Atlanta_dataset/RL_grids", paste("rl_", zone, ".tif", sep="")))
  rd = rast(file.path("Atlanta_dataset/RL_grids", paste("rd_", zone, ".tif", sep="")))
  
  #calculate metrics
  metr = calculate_metrics(rl, rd, fun = "mean", threshold = 1)
  #return metrics in a user-friendly format
  res = c(ENTROPY = metr$ent, HILL = 2^metr$ent, MI = metr$mutinf, NMI = metr$mutinf/metr$ent)
  #create output table with metrics
  metrics = rbind(metrics, round(res, 3))
  
}

rownames(metrics) = zones
colnames(metrics)  = c("E", "NH", "MI", "NMI")
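
The zone-name extraction used in the code above can be checked in isolation. The sketch below applies the same strsplit call to a few file names that are assumed to follow the rl_<zone>.tif / rd_<zone>.tif pattern; the vector files is a stand-in for the output of list.files and is not read from disk.

```r
# Assumed file names following the rl_<zone>.tif / rd_<zone>.tif pattern
files <- c("rd_city.tif", "rd_msa.tif", "rl_city.tif", "rl_msa.tif")

# Split each name on underscores, dots, or spaces and keep the second piece,
# which is the zone name; unique() removes the rl/rd duplicates
parts <- strsplit(basename(files), "[_. ]+")
zones <- unique(sapply(parts, function(x) x[2]))

print(zones)
```

Any extra file in the folder whose name does not follow this pattern would produce an unexpected zone name, so it may be worth inspecting zones before running the loop.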

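The table lists diversity metrics (E, entropy; NH, Hill number) and segregation metrics (MI, mutual information; NMI, normalized mutual information) for the MSA and the zones within it.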

library(kableExtra)
#print table
kable_classic(kbl(metrics), full_width = F)

Zone       E      NH     MI     NMI
city       1.694  3.235  0.258  0.152
fringe     1.090  2.128  0.123  0.113
inner      1.873  3.662  0.316  0.169
msa        1.880  3.681  0.193  0.103
outlying   1.888  3.702  0.134  0.071
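
As one possible way to read the table, the sketch below re-enters the reported values by hand, orders the zones by NMI, and checks that the columns are internally consistent (NH close to 2^E, NMI close to MI / E, up to rounding). The object names by_segregation, nh_gap, and nmi_gap are illustrative only.

```r
# Metrics copied from the table above
metrics <- data.frame(
  E   = c(1.694, 1.090, 1.873, 1.880, 1.888),
  NH  = c(3.235, 2.128, 3.662, 3.681, 3.702),
  MI  = c(0.258, 0.123, 0.316, 0.193, 0.134),
  NMI = c(0.152, 0.113, 0.169, 0.103, 0.071),
  row.names = c("city", "fringe", "inner", "msa", "outlying")
)

# Order zones from highest to lowest NMI
by_segregation <- metrics[order(-metrics$NMI), ]
print(by_segregation)

# Consistency checks: differences should be at the level of rounding error
nh_gap  <- max(abs(2^metrics$E - metrics$NH))
nmi_gap <- max(abs(metrics$MI / metrics$E - metrics$NMI))
print(c(nh_gap = nh_gap, nmi_gap = nmi_gap))
```

Sorting by a different column, for example E, gives the corresponding ordering by diversity.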