Solutions - ISAIR2025

Here is a possible solution for the practical.

0) Load required packages

# Load packages
library(dplyr)
library(readr)

1) Load the dataset haiti-healthsites.csv into R.

# Read in health centre data
hc <- read_csv('data/haiti/haiti-healthsites.csv')

2) Select the variables required for analysis (health facilities, admin areas, population).

hc <- select(hc, name, adm1_en, adm2_en, total)

3) Filter the data so that only the rows for the Northern departments remain.

adm2north <- filter(hc, adm1_en %in% c("North", "North-East", "North-West"))

4) Group the data by the smallest scale admin level.

5) Create summary variables of the number of health facilities and the population per commune.

adm2north <- adm2north %>%
  group_by(adm1_en, adm2_en) %>% 
  summarise(healthcentres = n(), pop = first(total))

## `summarise()` has grouped output by 'adm1_en'. You can override using the `.groups` argument.

The pipe operator (%>%) passes the object on its left to the function on the next line. The code above can be read as: take the data frame adm2north, then group it by adm1_en and adm2_en, then summarise each group by counting the health centres and taking the first value of the population.

Note: the population (total) is already aggregated at admin level 2, so it is the same on every row of a commune and the first value is taken.
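As a minimal sketch of how the pipe reads, the example below runs the same group-and-summarise step on a small made-up data frame, once with %>% and once as nested function calls. The object toy and its values are hypothetical and only illustrate the pattern; the `.groups = "drop"` argument is an addition here that returns an ungrouped result and avoids the message shown above.

```r
library(dplyr)

# Hypothetical toy data: one row per health facility, with the commune
# population (total) repeated on every row of the same commune
toy <- tibble(
  name    = c("HC 1", "HC 2", "HC 3", "HC 4"),
  adm1_en = c("North", "North", "North", "North-East"),
  adm2_en = c("Commune A", "Commune A", "Commune B", "Commune C"),
  total   = c(1000, 1000, 600, 900)
)

# Piped version: read top to bottom as "take toy, then group, then summarise"
piped <- toy %>%
  group_by(adm1_en, adm2_en) %>%
  summarise(healthcentres = n(), pop = first(total), .groups = "drop")

# Equivalent nested version: the same calls, read from the inside out
nested <- summarise(
  group_by(toy, adm1_en, adm2_en),
  healthcentres = n(), pop = first(total), .groups = "drop"
)

# Both versions should give the same table, one row per commune
all.equal(piped, nested)
```

Because total is constant within each commune, first(total) simply picks that shared value; it would not be an appropriate summary if the population differed between rows of the same commune.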

6) Calculate the number of people per health facility.

adm2north <- adm2north %>%
  mutate(pop_per_health = pop/healthcentres)

7) Sort the data to find the areas with the highest number of people per health facility.

arrange(adm2north, desc(pop_per_health))

Table 1: The 5 areas with the highest number of people per health facility in the Northern departments
adm1_en	adm2_en	pop_per_health	healthcentres
North-West	Chamsolme	30361.0	1
North	Limbe	28434.0	3
North	Pilate	27025.0	2
North	Borgne	22307.0	3
North	Saint-Raphael	17918.3	3
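arrange() returns every row, whereas Table 1 lists only the top five. One possible way to keep just the top rows is slice_max(), sketched below on hypothetical data (the object toy_summary and its values are made up for illustration). The sketch assumes the table is still grouped by adm1_en, as summarise() leaves it, so it calls ungroup() first; without that, slice_max() would return the top rows within each department.

```r
library(dplyr)

# Hypothetical summary table, still grouped by department (adm1_en)
toy_summary <- tibble(
  adm1_en        = c("North", "North", "North-East", "North-West"),
  adm2_en        = c("Commune A", "Commune B", "Commune C", "Commune D"),
  pop_per_health = c(500, 1200, 800, 300)
) %>%
  group_by(adm1_en)

# Drop the grouping, then keep the 2 rows with the largest pop_per_health
# (use n = 5 to reproduce a top-five table)
top2 <- toy_summary %>%
  ungroup() %>%
  slice_max(pop_per_health, n = 2)

top2
```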

In one block

Note that steps 2 to 6 can be chained together with the pipe, once the data has been read in.

# Load packages
library(dplyr)
library(readr)

# Read in health centre data
hc <- read_csv('data/haiti/haiti-healthsites.csv') 

adm2north <- hc %>% 
  select(name, adm1_en, adm2_en, total) %>% 
  filter(adm1_en %in% c("North", "North-East", "North-West")) %>% 
  group_by(adm1_en, adm2_en) %>% 
  summarise(healthcentres = n(), pop = first(total)) %>% 
  mutate(pop_per_health = pop/healthcentres) 

# Sort by population per health centre
arrange(adm2north, desc(pop_per_health))
