Solutions

Here is one possible solution to the practical.

Return to practical

0) Load required packages.

# Load packages
library(dplyr)
library(readr)

1) Load the dataset haiti-healthsites.csv into R.

# Read in health centre data
hc <- read_csv('data/haiti/haiti-healthsites.csv') 

2) Select the variables required for analysis (health facilities, admin areas, population).

hc <- select(hc, name, adm1_en, adm2_en, total)

3) Filter the data so that only the data for the Northern departments remains.

adm2north <-  filter(hc, adm1_en %in% c("North", "North-East", "North-West"))

4) Group the data by the smallest scale admin level.

5) Create summary variables of the number of health facilities and the population per commune.

adm2north <-  adm2north %>% 
  group_by(adm1_en, adm2_en) %>% 
  summarise(healthcentres = n(), pop = first(total))
## `summarise()` has grouped output by 'adm1_en'. You can override using the `.groups` argument.

The pipe operator (%>%) passes the object on its left to the function that follows as its first argument. The code above can be read as: take the data frame adm2north, then group it by adm1_en and adm2_en, then summarise each group by counting the health centres and taking the first value of the population.
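To make this reading of the pipe concrete, the sketch below runs the same grouping and summary twice on a small made-up data frame: once with nested function calls and once with the pipe. The object toy and its values are hypothetical and are not taken from the Haiti dataset.

```r
library(dplyr)

# Hypothetical toy data: one row per health facility, with the commune
# population repeated on every row (values are made up for illustration)
toy <- tibble(
  adm1_en = c("A", "A", "A", "B"),
  adm2_en = c("a1", "a1", "a2", "b1"),
  total   = c(1000, 1000, 500, 2000)
)

# Nested calls: read from the inside out
# (.groups = "drop" removes the grouping and silences the summarise() message)
nested <- summarise(group_by(toy, adm1_en, adm2_en),
                    healthcentres = n(), pop = first(total),
                    .groups = "drop")

# Piped calls: read from top to bottom
# ("take toy, then group, then summarise")
piped <- toy %>%
  group_by(adm1_en, adm2_en) %>%
  summarise(healthcentres = n(), pop = first(total), .groups = "drop")

# Compare the two results
all.equal(nested, piped)
```

Both versions perform the same operations; the piped version simply lists them in the order in which they are applied.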

Note: the population (total) is already aggregated at admin level 2, so every row within a commune carries the same value and the first one is taken.
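Taking the first value is only safe if the population really is constant within each commune. One way this could be checked is sketched below on a made-up data frame standing in for hc (the object toy, the name n_values and all values are hypothetical).

```r
library(dplyr)

# Hypothetical toy data standing in for hc (values are made up)
toy <- tibble(
  adm2_en = c("a1", "a1", "a2", "b1", "b1"),
  total   = c(1000, 1000, 500, 2000, 2000)
)

# Count the distinct population values within each commune
check <- toy %>%
  group_by(adm2_en) %>%
  summarise(n_values = n_distinct(total))

# Communes with more than one distinct value would need a closer look
inconsistent <- filter(check, n_values > 1)
inconsistent
```

If inconsistent has no rows, first(), min() and max() would all return the same population for every commune.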

6) Calculate the number of people per health facility.

adm2north <-  adm2north %>% 
  mutate(pop_per_health = pop/healthcentres) 

7) Sort the data to find the areas with the highest number of people per health facility.

arrange(adm2north, desc(pop_per_health))
Table 1: The 5 areas with the highest number of people per health facility in the Northern departments

adm1_en     adm2_en        pop_per_health  healthcentres
North-West  Chamsolme             30361.0              1
North       Limbe                 28434.0              3
North       Pilate                27025.0              2
North       Borgne                22307.0              3
North       Saint-Raphael         17918.3              3
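arrange() returns every row, so the table above shows only the first five. A minimal sketch of one way to keep just the top five rows is given below, using a made-up summary table shaped like adm2north (the objects toy and top5 and all values are hypothetical). Because summarise() leaves the result grouped by adm1_en, the sketch ungroups first; on grouped data, slice_max() would instead return the top rows within each department.

```r
library(dplyr)

# Hypothetical summary table shaped like adm2north (values are made up)
toy <- tibble(
  adm1_en = c("A", "A", "A", "B", "B", "B", "B"),
  adm2_en = c("a1", "a2", "a3", "b1", "b2", "b3", "b4"),
  pop_per_health = c(300, 100, 700, 200, 600, 500, 400)
) %>%
  group_by(adm1_en)  # mimic the grouping left over after summarise()

# Drop the grouping, then keep the 5 rows with the largest pop_per_health
top5 <- toy %>%
  ungroup() %>%
  slice_max(pop_per_health, n = 5)

top5
```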

In one block

Note that steps 2 to 6 can be chained together into a single pipeline once the data has been read in.

# Load packages
library(dplyr)
library(readr)

# Read in health centre data
hc <- read_csv('data/haiti/haiti-healthsites.csv') 

# Select, filter, group, summarise and calculate in one pipeline
adm2north <- hc %>% 
  select(name, adm1_en, adm2_en, total) %>% 
  filter(adm1_en %in% c("North", "North-East", "North-West")) %>% 
  group_by(adm1_en, adm2_en) %>% 
  summarise(healthcentres = n(), pop = first(total)) %>% 
  mutate(pop_per_health = pop/healthcentres) 

# Sort by population per health centre
arrange(adm2north, desc(pop_per_health))
