Introduction to R and the tidyverse

Health coverage in Haiti

Objectives

The aim of this session is to introduce R and the tidyverse. More specifically, you will:

  1. Manipulate data using dplyr.
  2. Read and write data using readr.
  3. Create graphs using ggplot2.

Touchdown!

You’ve just arrived in Haiti to start a new job as a Junior Spatial Analyst. As you step off the plane, your boss calls and says that the three Northern departments have requested help to assess whether they have adequate numbers of health facilities in their areas.

Your boss tells you that a colleague has checked HDX (the Humanitarian Data Exchange) and found several relevant datasets. They have kindly pulled the data together this time, but in future you will have to combine the data yourself. Just as you put the phone down, your email pings: the data has arrived. You read through the email and see that your colleague is away for the next few days, so you’re on your own…

Task

Identify the areas with the highest numbers of people per health facility.

Before starting to code, write down the steps required for your analysis. Try to think in terms of the dplyr verbs. For instance, if you only need a few variables from the data, write “Select variables var1, var2, var3” rather than “Keep relevant variables”.
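As a rough illustration of how written steps map onto dplyr verbs, here is a minimal sketch on a made-up data frame. The columns var1 to var4 and the filter threshold are placeholders chosen for this example, not part of the Haiti dataset.

```r
library(dplyr)

# Toy data: var1..var4 are hypothetical placeholder columns
toy <- tibble(
  var1 = c("a", "b", "c"),
  var2 = c(10, 20, 30),
  var3 = c(TRUE, FALSE, TRUE),
  var4 = c(1.5, 2.5, 3.5)
)

# Step 1: Select variables var1, var2, var3
step1 <- select(toy, var1, var2, var3)

# Step 2: Filter rows where var2 is greater than 10
step2 <- filter(step1, var2 > 10)

# Step 3: Arrange rows by var2, largest first
step3 <- arrange(step2, desc(var2))

step3
```

Each written step starts with the verb you will call, so turning the plan into code is mostly a matter of filling in the arguments.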

To complete the task you need to follow these broad instructions.

  1. Transform the data to an appropriate form for analysis. You may need to remove variables and create new ones.
  2. Determine the 5 areas with the highest number of people per health facility.
  3. Save the data frame you have created for future analyses.
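If you want to see the general shape of such an analysis before attempting it, the sketch below runs the three steps on a tiny invented data frame. The rows and figures are made up, the output file name is hypothetical, and this is only one possible approach: the real data will need more preparation.

```r
library(dplyr)
library(readr)

# Invented rows that mimic the layout of one row per health facility
sites <- tibble(
  name    = c("Clinic A", "Clinic B", "Clinic C", "Clinic D"),
  adm2_en = c("Area 1", "Area 1", "Area 2", "Area 3"),
  total   = c(1000, 1000, 5000, 900)
)

coverage <- sites %>%
  # 1. Transform: count facilities per area, keeping the area's population
  count(adm2_en, total, name = "n_facilities") %>%
  mutate(people_per_facility = total / n_facilities) %>%
  # 2. Keep (up to) the 5 areas with the most people per facility
  slice_max(people_per_facility, n = 5)

# 3. Save the result; the file name here is an assumption for the example
out_file <- file.path(tempdir(), "coverage-example.csv")
write_csv(coverage, out_file)

coverage
```

With only three areas in the toy data, slice_max() returns all of them, ordered from the highest to the lowest number of people per facility.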

Details of the resources sent to you are given below. Take a look at them and start to identify which parts will be useful for completing the task.

Once you have written your analysis steps, open an R script and start writing code. If you get stuck, see here for more detailed hints on the commands to use.

Resources

In the email there is an interactive map of Haiti and the dataset that your colleague has put together.

Map

The map shows Haiti and highlights the three Northern departments. It also shows the locations of the health centres in those departments. You can click on the map to find out the names of the Northern departments. Hint: they are at admin level 1.

Data

The email also contains a dataset called haiti-healthsites.csv. It gives the location of each health centre, together with population figures at admin level 2. When an admin level 2 area has several health centres, the population variables are repeated on each of its rows. A codebook is provided below.
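Because the population figures repeat on every row of an area, summing them directly would double-count people. The sketch below shows the effect on a small stand-in file written to a temporary folder. The values are invented, and only a few of the codebook's columns are included.

```r
library(dplyr)
library(readr)

# Write a tiny stand-in CSV so the example runs anywhere (made-up values)
example_path <- file.path(tempdir(), "example-healthsites.csv")
write_csv(
  tibble(
    name    = c("Clinic A", "Clinic B", "Clinic C"),
    adm2_en = c("Area 1", "Area 1", "Area 2"),
    total   = c(1000, 1000, 5000)
  ),
  example_path
)

# Read it back, as you would with the real file
sites <- read_csv(example_path, show_col_types = FALSE)
glimpse(sites)

# One row per area: the repeated population values collapse to a single row
area_pop <- distinct(sites, adm2_en, total)

# Summing over the raw rows counts Area 1 twice
raw_sum  <- sum(sites$total)
area_sum <- sum(area_pop$total)
c(raw = raw_sum, per_area = area_sum)
```

For the real data you would pass the path to haiti-healthsites.csv to read_csv() instead of the temporary file.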

Table 1: Codebook

Variable    Description
x           Latitude
y           Longitude
name        Name of health centre
type        Type of centre
adm2code    Admin 2 level code
adm1code    Admin 1 level code
adm1_en     Admin 1 name (English)
adm2_en     Admin 2 name (English)
adm0code    Admin 0 level code
adm0_en     Admin 0 name (English)
total       Total population, 2013
male        Number of males, 2013
female      Number of females, 2013

Going Further

If you have completed the task, try these extra analyses:

  • Try turning the analysis into one piped sequence of code.
  • Compare health coverage for males and females in the Northern departments.
  • Compare the health coverage in another area of Haiti.
  • Try plotting a bar chart of the health coverage by area.
  • Combine multiple plots using patchwork.
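As a starting point for the bar chart, here is a minimal ggplot2 sketch. The data frame, its column names (area, people_per_facility) and its values are invented for illustration. Swap in the data frame you created for the task.

```r
library(ggplot2)

# Made-up summary table: one row per area
coverage <- data.frame(
  area = c("Area 1", "Area 2", "Area 3"),
  people_per_facility = c(500, 5000, 900)
)

p <- ggplot(
  coverage,
  # reorder() sorts the bars by value rather than alphabetically
  aes(x = reorder(area, people_per_facility), y = people_per_facility)
) +
  geom_col() +
  coord_flip() +  # horizontal bars keep long area names readable
  labs(x = "Area", y = "People per health facility")

print(p)
```

geom_col() is used rather than geom_bar() because the heights are already computed in the data. geom_bar() counts rows by default.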

Troubleshooting

  • Check that every “, (, { or [ is closed, and that the letter case is correct.
  • Check the examples associated with the packages you are using.
  • Search online (e.g., Stack Overflow).

Using AI to learn R

  1. Debug your R code by including the following in your prompts:
     • The purpose of your code (“I’m trying to do xyz…”)
     • The code you used (“This is the code I used…”)
     • The error you received (“This is the error I received…”)
  2. Ask what a function does, or work out what a piece of your R code does.
  3. Specify packages in your prompts (“I’m trying to do xyz using dplyr…”)

💡 Don’t rely on AI as your first step when your code doesn’t work. Try debugging on your own first; it’s an essential part of learning. ChatGPT is just one of many tools you can use to enhance your understanding.

⚠️ ChatGPT might confidently provide incorrect answers, so always double-check the code independently and test it thoroughly.