3 Plotting cars

3.1 Learn and code

First, let’s make a simple scatter plot. We’ll use mtcars, a famous dataset that is often used in R for educational purposes. The name stands for “Motor Trend Car Road Tests”. See the parameter overview and documentation for more information about the mtcars dataset. It is one of many datasets that ship with base R or the tidyverse packages, so we can always call it without having to load it first.

First, we will inspect the dataset. For this we will load the tidyverse:

library(tidyverse)

Once tidyverse is loaded via the library call, it is available in your current session in your browser, so you do not have to load it each time. Let’s have a look at the full dataset:

mtcars

# if you get an error here, first run the library(tidyverse) call above

mtcars %>% glimpse()

mtcars %>% head()

mtcars %>% tail()

Let’s select a small part of the data using select from the dplyr package:

mtcars %>%

select(mpg, disp)

Next, make a simple plot with the miles per gallon (mpg) and displacement parameters (disp) in the mtcars dataset.

mtcars %>%

select(mpg, disp) %>%

ggplot(aes(x = mpg, y = disp))+

geom_point(size = 4)

This is a very basic plot, without much formatting. Let’s make it prettier!

Add color and bring in a third parameter:

mtcars %>%

#added cyl to the selection here

select(mpg, disp, cyl) %>%

ggplot(aes(x = mpg,

y = disp,

color = cyl) #added color to the aesthetics here

) +

geom_point(size = 4)

Here we need to look at data types. The cyl parameter is numeric, so ggplot automatically assumes we want a continuous color scale. In practice cyl is categorical (each car has either 4, 6 or 8 cylinders), so we can explicitly convert it to a factor like this:

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl))) +

geom_point(size = 4)
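
To see why the conversion matters, it can help to inspect the column directly. The snippet below is a small sketch that uses only base R; the name cyl_factor is just an illustrative variable and not part of the original exercise.

```r
# cyl is stored as a plain numeric column
class(mtcars$cyl)

# it only takes three distinct values
sort(unique(mtcars$cyl))

# as.factor() turns those values into discrete levels,
# which ggplot then maps to a discrete color scale
cyl_factor <- as.factor(mtcars$cyl)
levels(cyl_factor)
```

Because the factor has exactly three levels, the legend shows three separate colors instead of a continuous gradient.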

If you want to have different colors you can use one of the many color palettes available:

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl)))+

geom_point(size = 4)+

#I also manually changed the name of the legend here

scale_color_brewer(name = "cylinders",

palette = "Set2")

Apart from color you can change the shape of the datapoints:

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl),

#please note to also add shape to the aesthetics here

shape = as.factor(cyl))) +

geom_point(size = 4)+

scale_color_brewer(name = "cylinders",

palette = "Set2")+

scale_shape(solid = TRUE,

name = "cylinders")

ggplot can use different themes for your plots, and there are many options to tweak a plot to your liking. You can see some examples below:

Let’s change titles:

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl),

shape = as.factor(cyl)))+

geom_point(size = 4)+

scale_color_brewer(name = "cylinders",

palette = "Set2")+

scale_shape(solid = TRUE,

name = "cylinders")+

labs(title = "My cool MTCARS plot",

x = "miles per gallon",

y = "displacement")

Change the plotting theme and base size of the elements:

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl),

shape = as.factor(cyl)))+

geom_point(size = 4)+

scale_color_brewer(name = "cylinders",

palette = "Set2")+

scale_shape(solid = TRUE,

name = "cylinders")+

labs(title = "My cool MTCARS plot",

x = "miles per gallon",

y = "displacement")+

theme_bw(base_size = 20)

Change the scaling of the axes. It is often good practice to start the axes at zero:

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl),

shape = as.factor(cyl)))+

geom_point(size = 4)+

scale_color_brewer(name = "cylinders",

palette = "Set2")+

scale_shape(solid = TRUE,

name = "cylinders")+

labs(title = "My cool MTCARS plot",

x = "miles per gallon",

y = "displacement")+

scale_x_continuous(limits = c(0, NA),

expand = c(0,NA))+

scale_y_continuous(limits = c(0, NA),

expand = c(0,NA))+

theme_bw(base_size = 20)

Now the data points at the maxima of the axes are not completely visible, so it would be nice to have some extra space there:

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl),

shape = as.factor(cyl)))+

geom_point(size = 4)+

scale_color_brewer(name = "cylinders",

palette = "Set2")+

scale_shape(solid = TRUE,

name = "cylinders")+

labs(title = "My cool MTCARS plot",

x = "miles per gallon",

y = "displacement")+

scale_x_continuous(

limits = c(0, NA),

expand = expansion(mult = c(0, 0.1)))+

scale_y_continuous(

limits = c(0, NA),

expand = expansion(mult = c(0, 0.1)))+

theme_bw(base_size = 20)
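
As a rough illustration of what mult = c(0, 0.1) does, the sketch below redoes the arithmetic for the x axis by hand. It assumes that the padding is a fraction of the axis range (here from 0 to the largest mpg value); the variable names are only illustrative.

```r
# lower limit is fixed at 0, upper limit follows the data
x_limits <- c(0, max(mtcars$mpg))
x_range <- diff(x_limits)

# mult = c(0, 0.1): no padding below, 10% of the range above
x_lower <- x_limits[1] - 0 * x_range
x_upper <- x_limits[2] + 0.1 * x_range

c(x_lower, x_upper)
```

The axis therefore still starts exactly at zero, while the upper end gets a little room so that the largest points are drawn completely.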

Now we have generated a nice visualisation of our data using ggplot. Please note that ggplot works with layers: at each step we added another layer of information to the plot. If you want, you can go wild with ggplot. Cedric Scherer has put together a nice overview of visualisations made with ggplot, the tidyverse and R, and the underlying R code for those plots is available as well.
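
The layered structure can be inspected on the plot object itself. The sketch below assumes that ggplot2 keeps the geoms and stats of a plot in a layers element; scales, labels and themes are stored separately, so they do not show up in this count. The object names p0, p1 and p2 are only illustrative.

```r
library(ggplot2)

# start with an empty plot: data and aesthetics, but no layers yet
p0 <- ggplot(mtcars, aes(x = mpg, y = disp))
length(p0$layers)

# adding a geom adds one layer
p1 <- p0 + geom_point(size = 4)
length(p1$layers)

# labels and themes change the plot, but are not counted as layers
p2 <- p1 + labs(x = "miles per gallon") + theme_bw(base_size = 20)
length(p2$layers)
```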

3.2 Exercises

3.2.1 Adding layers and changing the MTCARS plot

Exercise 1

Give the points in the ggplot some transparency (or opacity), so that overlapping points are easier to tell apart. TIP: use the alpha argument, which takes a number from 0 (fully transparent) to 1 (fully opaque).

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl),

shape = as.factor(cyl)))+

geom_point(size = 4,

#type your extra code here:

scale_color_brewer(name = "cylinders",

palette = "Set2")+

scale_shape(solid = TRUE,

name = "cylinders")+

labs(title = "My cool MTCARS plot",

x = "miles per gallon",

y = "displacement")+

scale_x_continuous(

limits = c(0, NA),

expand = expansion(mult = c(0, 0.1)))+

scale_y_continuous(

limits = c(0, NA),

expand = expansion(mult = c(0, 0.1)))+

theme_bw(base_size = 20)

Solution to Exercise 1

Please note that the alpha we added is not part of the aesthetics (aes), meaning that its value is fixed for all points and not linked to a parameter in our data.

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl),

shape = as.factor(cyl)))+

geom_point(size = 4,

#type your extra code here:

alpha = 0.7)+

scale_color_brewer(name = "cylinders",

palette = "Set2")+

scale_shape(solid = TRUE,

name = "cylinders")+

labs(title = "My cool MTCARS plot",

x = "miles per gallon",

y = "displacement")+

scale_x_continuous(

limits = c(0, NA),

expand = expansion(mult = c(0, 0.1)))+

scale_y_continuous(

limits = c(0, NA),

expand = expansion(mult = c(0, 0.1)))+

theme_bw(base_size = 20)
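
The difference between setting alpha and mapping it can be made visible with a small comparison. This is only a sketch: mapping alpha to hp is an arbitrary choice made for illustration, and layer_data() is used here to look at the values ggplot computed for each point.

```r
library(ggplot2)

# alpha set outside aes(): one fixed value for every point
p_fixed <- ggplot(mtcars, aes(x = mpg, y = disp)) +
  geom_point(size = 4, alpha = 0.7)

# alpha mapped inside aes(): the value depends on the data (here hp)
p_mapped <- ggplot(mtcars, aes(x = mpg, y = disp, alpha = hp)) +
  geom_point(size = 4)

# compare the alpha values that end up in each plot
unique(layer_data(p_fixed)$alpha)
length(unique(layer_data(p_mapped)$alpha))
```

In the first plot every point shares the same transparency and no legend is drawn for it; in the second plot the transparency varies with hp and ggplot adds a legend.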

Exercise 2

Add a layer that will generate a smooth linear regression line that shows the relation between mpg and disp. Use the stat_smooth command for this.

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp))+

geom_point(size = 4,

alpha = 0.7)+

# enter code here

labs(title = "My cool MTCARS plot",

x = "miles per gallon",

y = "displacement")+

scale_x_continuous(

limits = c(0, NA),

expand = expansion(mult = c(0, 0.1)))+

scale_y_continuous(

limits = c(0, NA),

expand = expansion(mult = c(0, 0.1)))+

theme_bw(base_size = 20)

Solution to Exercise 2

Please make sure that the ggplot is not separated into groups. If the data is grouped by color or shape a different regression line for each group will be generated.

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp))+

geom_point(size = 4,

alpha = 0.7)+

# enter code here

stat_smooth(geom = "line",

method = "lm",

formula = "y ~ x")+

labs(title = "My cool MTCARS plot",

x = "miles per gallon",

y = "displacement")+

scale_x_continuous(

limits = c(0, NA),

expand = expansion(mult = c(0, 0.1)))+

scale_y_continuous(

limits = c(0, NA),

expand = expansion(mult = c(0, 0.1)))+

theme_bw(base_size = 20)
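
The effect of grouping on the regression line can be checked without looking at the plot. The sketch below builds one ungrouped and one grouped version (the object names are illustrative) and counts how many separate lines stat_smooth fits in each.

```r
library(ggplot2)

# no grouping aesthetic: one regression line for all cars
p_single <- ggplot(mtcars, aes(x = mpg, y = disp)) +
  stat_smooth(geom = "line", method = "lm", formula = y ~ x)

# color mapped to a factor: the data is split into groups
p_grouped <- ggplot(mtcars, aes(x = mpg, y = disp,
                                color = as.factor(cyl))) +
  stat_smooth(geom = "line", method = "lm", formula = y ~ x)

# count the fitted lines in each plot
n_single <- length(unique(layer_data(p_single)$group))
n_grouped <- length(unique(layer_data(p_grouped)$group))
c(n_single, n_grouped)
```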

Exercise 3

Use the facet_wrap command to make a separate plot for each number of cylinders (three plots in total).

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp))+

geom_point(size = 4,

alpha = 0.7)+

labs(title = "My cool MTCARS plot",

x = "miles per gallon",

y = "displacement")+

scale_x_continuous(

limits = c(0, NA),

expand = expansion(mult = c(0, 0.1)))+

scale_y_continuous(

limits = c(0, NA),

expand = expansion(mult = c(0, 0.1)))+

theme_bw(base_size = 20)

# enter code here

#(and don't forget to add a plus to the end of the last line)

Solution to Exercise 3

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp))+

geom_point(size = 4,

alpha = 0.7)+

labs(title = "My cool MTCARS plot",

x = "miles per gallon",

y = "displacement")+

scale_x_continuous(

limits = c(0, NA),

expand = expansion(mult = c(0, 0.1)))+

scale_y_continuous(

limits = c(0, NA),

expand = expansion(mult = c(0, 0.1)))+

theme_bw(base_size = 20)+

facet_wrap(~cyl)
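
To check how facet_wrap distributes the cars over the panels, the computed plot data can be inspected. This is a minimal sketch, assuming that layer_data() reports the panel of each point in a PANEL column; p_facets and panel_counts are illustrative names.

```r
library(ggplot2)

p_facets <- ggplot(mtcars, aes(x = mpg, y = disp)) +
  geom_point(size = 4, alpha = 0.7) +
  facet_wrap(~cyl)

# each point is assigned to a panel; count the points per panel
panel_counts <- table(layer_data(p_facets)$PANEL)
panel_counts
```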

3.2.2 Fixing common errors

Below is some code that is not working properly because of syntax mistakes. Can you spot (and fix) the errors?

Fix error 1

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp

color = cyl))+

geom_point(size = 4)

Solution to Error 1

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp, #the error was here

color = cyl))+

geom_point(size = 4)

Commas are often forgotten, but easily fixed. Within parentheses, arguments are separated by commas. R also generates a helpful error message that can point you to the missing comma.
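
The error message for a missing comma can be reproduced safely by parsing the broken code as text instead of running it. The sketch below uses only base R; broken_code and parse_message are illustrative names.

```r
# the broken call from "Fix error 1", stored as text so it is only parsed
broken_code <- "ggplot(aes(x = mpg, y = disp
color = cyl))"

# parse() only checks the syntax; tryCatch() captures the error message
parse_message <- tryCatch(
  {
    parse(text = broken_code)
    "no syntax error"
  },
  error = function(e) conditionMessage(e)
)

parse_message
```

The message names the line and the symbol that R did not expect, which is usually the first thing that follows the missing comma.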

Fix error 2

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = cyl)) %>%

geom_point(size = 4)

Solution to Error 2

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = cyl))+ #the error was in this line

geom_point(size = 4)

ggplot layers are added with a +, not with the pipe operator (%>%).
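
Why the pipe fails here can be shown in a few lines. The pipe passes the plot as the first argument of geom_point, so the sketch below writes that call out explicitly; the exact wording of the error depends on the ggplot2 version, so only the fact that an error occurs is captured.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = mpg, y = disp, color = cyl))

# p %>% geom_point(size = 4) is the same as geom_point(p, size = 4):
# the plot ends up in the mapping argument, which only accepts aes()
pipe_result <- tryCatch(
  geom_point(p, size = 4),
  error = function(e) "error"
)

# adding the layer with + works as intended
plus_result <- p + geom_point(size = 4)

pipe_result
```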

Fix error 3

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = cyl)) +

geom_point(size = 4) +

Solution to Error 3

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = cyl))+

geom_point(size = 4) #the error was in this line

Make sure that the last layer or line of code does not end with a + or %>%.

Fix error 4

Although R doesn’t show you an error message, the code does not give you what you want. The plot should show the cyl parameter in different shapes, just like there are three different colors for each level of the cyl parameter.

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl)),

shape = as.factor(cyl))+

geom_point(size = 4)+

scale_color_brewer(name = "cylinders",

palette = "Set2")+

scale_shape(solid = TRUE,

name = "cylinders")

Solution to Error 4

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl), #parenthesis error

shape = as.factor(cyl)))+

geom_point(size = 4)+

scale_color_brewer(name = "cylinders",

palette = "Set2")+

scale_shape(solid = TRUE,

name = "cylinders")

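The shape argument should be included in the aesthetics (aes) part of the ggplot. In the broken code the closing parenthesis of aes() comes one line too early, so shape is passed to ggplot() itself and silently ignored.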
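
One way to confirm where an argument ended up is to list the aesthetics stored in the plot. The sketch below assumes that ggplot() accepts the stray shape argument without complaint, as described for Error 4, and that the mapping is available as the mapping element of the plot object; p_broken and p_fixed are illustrative names.

```r
library(ggplot2)

# parenthesis closed too early: shape lands outside aes()
p_broken <- ggplot(mtcars,
                   aes(x = mpg, y = disp, color = as.factor(cyl)),
                   shape = as.factor(cyl))

# parenthesis in the right place: shape is part of aes()
p_fixed <- ggplot(mtcars,
                  aes(x = mpg, y = disp, color = as.factor(cyl),
                      shape = as.factor(cyl)))

# list the aesthetics that each plot actually knows about
names(p_broken$mapping)
names(p_fixed$mapping)
```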

Solution to Error 4 using ChatGPT

Since ChatGPT can also solve basic programming problems (https://doi.org/10.1371/journal.pcbi.1011511), I also tried solving Error 4 with ChatGPT. In my ChatGPT session, the following solution was given:

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl),

shape = as.factor(cyl))) +

geom_point(size = 4) +

scale_color_brewer(name = "Cylinders", palette = "Set2") +

scale_shape_manual(

name = "Cylinders",

values = c(1, 2, 3), # Customize shape values as needed

labels = c("4", "6", "8")

)

ChatGPT indeed put the shape in the aes, but also started making other changes. The remark that ChatGPT made was: “you should correct the syntax error and use scale_shape_manual() instead of scale_shape().”… “to customize the shape values and labels.” I didn’t really ask for that, and from there it all went downhill. Let’s call that the “ChatGPT loop of death”. Let’s see what happened.

Running the “improved” code showed that there was still no fill, so it did not improve in a way that I wanted.

Next I said “but now the shapes don’t have a fill” and ChatGPT replied:

“I see your concern. If you want to have both color and fill for the shapes, you can achieve this by mapping fill within the aes() call and using scale_fill_brewer() for the fill scale. Here’s the modified code to achieve that:”

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl),

shape = as.factor(cyl),

fill = as.factor(cyl))) +

geom_point(size = 4) +

scale_color_brewer(name = "Cylinders", palette = "Set2") +

scale_fill_brewer(name = "Cylinders", palette = "Set2") +

scale_shape_manual(

name = "Cylinders",

values = c(1, 2, 3), # Customize shape values as needed

labels = c("4", "6", "8")

)

Unfortunately, this again did not give a fill. I asked for a different solution: “the fill is not there, do you have another solution?”

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl),

shape = as.factor(cyl))) +

geom_point(size = 4) +

scale_color_brewer(name = "Cylinders", palette = "Set2") +

scale_fill_manual(

name = "Cylinders",

values = c("4" = "red", "6" = "blue", "8" = "green")

) +

scale_shape_manual(

name = "Cylinders",

values = c(1, 2, 3), # Customize shape values as needed

labels = c("4", "6", "8")

)

This also did not give me a fill.

Another attempt by ChatGPT (still not working):

mtcars %>%

select(mpg, cyl, disp) %>%

mutate(cyl_shape_fill = as.factor(cyl)) %>%

ggplot(aes(x = mpg, y = disp,

shape = cyl_shape_fill,

color = cyl_shape_fill)) +

geom_point(size = 4) +

scale_shape_manual(

name = "Cylinders",

values = c("4" = 21, "6" = 22, "8" = 23),

labels = c("4", "6", "8")

) +

scale_color_manual(

name = "Cylinders",

values = c("4" = "red", "6" = "blue", "8" = "green")

)

Another attempt by ChatGPT (still not working):

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

shape = as.factor(cyl),

color = as.factor(cyl))) +

geom_point(size = 4) +

geom_point(aes(fill = as.factor(cyl)), size = 2,

show.legend = FALSE) +

scale_color_brewer(name = "Cylinders", palette = "Set2") +

scale_fill_brewer(name = "Cylinders", palette = "Set2") +

scale_shape_manual(

name = "Cylinders",

values = c(1, 2, 3), # Customize shape values as needed

labels = c("4", "6", "8")

)

Asking ChatGPT to use scale_shape and solid = TRUE gives:

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl),

shape = as.factor(cyl))) +

geom_point(size = 4) +

scale_color_brewer(name = "Cylinders", palette = "Set2") +

scale_shape(solid = TRUE,

name = "Cylinders",

breaks = c("4", "6", "8"),

labels = c("4", "6", "8"),

values = c(15, 16, 17),

guide = guide_legend(

override.aes =

list(fill = c("red", "blue", "green")))

)

… but apparently ChatGPT doesn’t know how to use scale_shape (in the first instance), because scale_shape cannot take values as an argument. So when I replied with “I get the following error: Error: unused argument (values = c(15, 16, 17))”, ChatGPT went back to using scale_shape_manual together with scale_fill_manual.
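
Whether a scale function has a values argument of its own can be checked before running the plot. The sketch below uses base R's formals() to list the arguments each function defines; it says nothing about arguments that are passed on through the dots, which is where the unused argument error comes from.

```r
library(ggplot2)

# argument names that each scale function defines itself
names(formals(scale_shape))
names(formals(scale_shape_manual))

# values is only an argument of the manual version
shape_has_values <- "values" %in% names(formals(scale_shape))
manual_has_values <- "values" %in% names(formals(scale_shape_manual))
c(shape_has_values, manual_has_values)
```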

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

color = as.factor(cyl),

shape = as.factor(cyl))) +

geom_point(size = 4) +

scale_color_brewer(name = "Cylinders", palette = "Set2") +

scale_shape_manual(

name = "Cylinders",

breaks = c("4", "6", "8"),

labels = c("4", "6", "8"),

values = c(15, 16, 17)

) +

scale_fill_manual(

name = "Cylinders",

breaks = c("4", "6", "8"),

labels = c("4", "6", "8"),

values = c("red", "blue", "green")

)

AND FINALLY we have a nice and simple fill with one legend… but also lots of unnecessary code. So after a few exchanges to make the code more concise, this is what came out of it:

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

shape = as.factor(cyl),

color = as.factor(cyl))) +

geom_point(size = 4) +

scale_color_brewer(palette = "Set2",

name = "Cylinders") +

scale_shape_manual(name = "Cylinders",

values = c("4" = 15,

"6" = 16,

"8" = 17)) +

scale_fill_brewer(palette = "Set2",

name = "Cylinders")

After asking to use scale_shape instead of scale_shape_manual, ChatGPT generates:

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

shape = as.factor(cyl),

color = as.factor(cyl))) +

geom_point(size = 4, stroke = 1) +

scale_color_brewer(palette = "Set2", name = "Cylinders") +

scale_shape(solid = TRUE, name = "Cylinders")

This works nicely, but ChatGPT introduces stroke = 1, which is not needed here, so again we have unnecessary code. After I asked “can I leave out the stroke argument?” we get the simplest solution, which is exactly the solution I came up with myself without ChatGPT.

mtcars %>%

select(mpg, cyl, disp) %>%

ggplot(aes(x = mpg, y = disp,

shape = as.factor(cyl),

color = as.factor(cyl))) +

geom_point(size = 4) +

scale_color_brewer(palette = "Set2", name = "Cylinders") +

scale_shape(solid = TRUE, name = "Cylinders")

Please note that when building the ggplot example, I did use Google (… of course) to find some solutions. I liked the scale_shape and solid = TRUE solution that I found, because it made the code so concise and I don’t like to type in values and breaks manually.

The use of ChatGPT in science and coding is just dipping a toe in the water. ChatGPT is probably better used not as a knowledge database but as a “reasoning or inferring agent” (https://www.nature.com/articles/s41591-023-02594-z). ChatGPT can produce false information, also described as “hallucinations” (https://www.nature.com/articles/d41586-023-00816-5), which makes it difficult to rely on for knowledge and facts. That said, it can be used to gain knowledge and learn better coding skills. Here is a nice quick tips paper from PLOS Computational Biology on how to “harness the power of chatGPT”: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011319.
