Hints

  1. Load packages dplyr and readr.
  2. Use the read_csv function from the readr package to read in the haiti.csv file.
  3. Use the select verb from the dplyr package to select the following variables: name, adm1_en, adm2_en, total.
  4. Use the filter verb to keep the rows where adm1_en is North, North-East, or North-West. The %in% operator checks whether each value of a variable appears in a set of items.
  5. Use the group_by verb to group by adm1_en and adm2_en.
  6. Use summarise in combination with group_by to calculate the total number of health facilities (the function n() counts the number of rows in a group). To calculate the total population in each area, use the first() function on the total variable. (This is a programming trick: the population count is already given per admin level 2 area, so every row in a group carries the same value and we can just take the first one.)
  7. Use mutate to create a new variable pop_per_health by dividing the population by the number of health facilities.
  8. Use the arrange verb to sort by population per health facility and report the 5 areas with the worst ratio. (A higher number is worse, as it means more people per facility, so sort in descending order with desc().)
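
The steps above can be sketched on a small made-up data frame, so the verbs can be tried without the real file. This is only an illustration, not the solution: the rows in facilities are invented, the column names n_health and population are assumptions chosen for the example, and the read_csv line is commented out because the location of haiti.csv on your machine is not known here.

```r
library(dplyr)

# With the real data you would start from something like:
# facilities <- readr::read_csv("haiti.csv")  # path is an assumption

# Invented stand-in data: one row per health facility.
# `total` holds the admin level 2 population, repeated on each row of an area.
facilities <- tibble(
  name    = c("A", "B", "C", "D", "E", "F"),
  adm1_en = c("North", "North", "North", "North-East", "West", "West"),
  adm2_en = c("Area 1", "Area 1", "Area 2", "Area 3", "Area 4", "Area 4"),
  total   = c(1000, 1000, 3000, 2000, 5000, 5000)
)

result <- facilities %>%
  # keep only the variables needed
  select(name, adm1_en, adm2_en, total) %>%
  # keep rows whose adm1_en is one of the three northern departments
  filter(adm1_en %in% c("North", "North-East", "North-West")) %>%
  # one group per admin level 1 / admin level 2 combination
  group_by(adm1_en, adm2_en) %>%
  summarise(
    n_health   = n(),           # number of facilities (rows) in the group
    population = first(total),  # same value on every row, so take the first
    .groups    = "drop"
  ) %>%
  # people per health facility
  mutate(pop_per_health = population / n_health) %>%
  # worst ratio (most people per facility) first, then keep the top 5
  arrange(desc(pop_per_health)) %>%
  head(5)

print(result)
```

With this toy data only three areas remain after filtering, so head(5) returns all of them; on the real data it would cut the list down to the five worst areas.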

For further help see our solution.

Return to the practical.