Hints
- Load the dplyr and readr packages.
- Use the read_csv function from the readr package to read in the haiti.csv file.
- Use the select verb from the dplyr package to select the following variables: name, adm1_en, adm2_en, total.
- Use the filter verb to keep the rows where adm1_en is North, North-East, or North-West. The %in% operator checks whether each value of a variable appears in a set of items.
- Use the group_by verb to group by adm1_en and adm2_en.
- Use summarise in combination with group_by to calculate the total number of health facilities (the function n() counts the number of rows in a group). To calculate the total population in each area, use the first() function on the total variable. This is a programming trick: the population count is already given at admin level 2, so every row in a group carries the same value and we can just take the first one.
- Use mutate to create a new variable pop_per_health by dividing the population by the number of health facilities.
- Use the arrange verb to sort by population per health facility and report the 5 areas with the worst ratio. (A higher number is worse, as it means there are fewer facilities per person, so sort in descending order.)
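
The hints above can be chained into a single pipeline. The following is only a minimal sketch, not the official solution: the helper name rank_areas, the column names n_health and population, and the toy rows are made up for illustration, and the region strings are assumed to match the spelling used in adm1_en in your copy of the data.

```r
library(dplyr)
library(readr)

# Hypothetical helper (not part of the practical): ranks admin level 2 areas
# by population per health facility, worst (highest) first.
rank_areas <- function(facilities, regions, n_worst = 5) {
  facilities %>%
    select(name, adm1_en, adm2_en, total) %>%
    filter(adm1_en %in% regions) %>%
    group_by(adm1_en, adm2_en) %>%
    summarise(
      n_health = n(),             # one row per facility, so n() counts facilities
      population = first(total),  # total repeats within a group, so take the first
      .groups = "drop"
    ) %>%
    mutate(pop_per_health = population / n_health) %>%
    arrange(desc(pop_per_health)) %>%
    head(n_worst)
}

# Assumed spelling of the three northern departments; check it against the data.
north <- c("North", "North-East", "North-West")

# Made-up toy rows, used only to show the shape of the result.
toy <- tibble(
  name    = c("A", "B", "C", "D", "E"),
  adm1_en = c("North", "North", "North-West", "West", "North"),
  adm2_en = c("X", "X", "Y", "Z", "W"),
  total   = c(1000, 1000, 900, 500, 300)
)
toy_result <- rank_areas(toy, north)
print(toy_result)

# With the real file, assuming haiti.csv sits in the working directory.
if (file.exists("haiti.csv")) {
  haiti <- read_csv("haiti.csv")
  print(rank_areas(haiti, north))
}
```

Wrapping the steps in a function is a design choice made here so the same pipeline can be tried on a few toy rows before running it on haiti.csv. In the practical itself, writing the pipeline directly on the data frame returned by read_csv works just as well.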
For further help see our solution.
Return to the practical.