Ignore if you don't need this bit of support.
This is one in a series of tutorials in which we explore basic data import, exploration and much more using data from the Gapminder project. Now is the time to make sure you are working in the appropriate directory on your computer, perhaps through the use of an RStudio project. To ensure a clean slate, you may wish to clean out your workspace and restart R (both available from the RStudio Session menu, among other methods). Confirm that the new R process has the desired working directory, for example, with the getwd() command or by glancing at the top of RStudio's Console pane.
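A quick way to confirm where R is working, sketched under the assumption that you keep a local copy of the data file named as in the import step of this tutorial; `wd`, `dataFile` and `hasLocalCopy` are illustrative names only:

```r
## where is this R process working?
wd <- getwd()
wd

## is a local copy of the data file present in that directory?
dataFile <- "gapminderDataFiveYear.txt"
hasLocalCopy <- file.exists(file.path(wd, dataFile))
hasLocalCopy
```

If `hasLocalCopy` is FALSE, importing from the URL is the fallback.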
Open a new R script (in RStudio, File > New > R Script). Develop and run your code from there (recommended) or periodically copy "good" commands from the history. In due course, save this script with a name ending in .r or .R, containing no spaces or other funny stuff, and evoking "lattice" and "colors".
Assuming the data can be found in the current working directory, this works:
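## the rest of the tutorial needs factors; R >= 4.0.0 no longer creates them by default
gDat <- read.delim("gapminderDataFiveYear.txt", stringsAsFactors = TRUE)
Plan B (which I use here, because of where the source of this tutorial lives):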
## data import from URL
gdURL <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderDataFiveYear.txt"
gDat <- read.delim(file = gdURL, stringsAsFactors = TRUE) # factors, as the str() output expects
Basic sanity check that the import has gone well:
str(gDat)
## 'data.frame':    1704 obs. of  6 variables:
##  $ country  : Factor w/ 142 levels "Afghanistan",..: 1 1 1 1 1 1 1 1 1 1 ..
##  $ year     : int  1952 1957 1962 1967 1972 1977 1982 1987 1992 1997 ...
##  $ pop      : num  8425333 9240934 10267083 11537966 13079460 ...
##  $ continent: Factor w/ 5 levels "Africa","Americas",..: 3 3 3 3 3 3 3 3 ..
##  $ lifeExp  : num  28.8 30.3 32 34 36.1 ...
##  $ gdpPercap: num  779 821 853 836 740 ...
Drop Oceania, which has only two countries:
## drop Oceania
jDat <- droplevels(subset(gDat, continent != "Oceania"))
str(jDat)
## 'data.frame':    1680 obs. of  6 variables:
##  $ country  : Factor w/ 140 levels "Afghanistan",..: 1 1 1 1 1 1 1 1 1 1 ..
##  $ year     : int  1952 1957 1962 1967 1972 1977 1982 1987 1992 1997 ...
##  $ pop      : num  8425333 9240934 10267083 11537966 13079460 ...
##  $ continent: Factor w/ 4 levels "Africa","Americas",..: 3 3 3 3 3 3 3 3 ..
##  $ lifeExp  : num  28.8 30.3 32 34 36.1 ...
##  $ gdpPercap: num  779 821 853 836 740 ...
Load the lattice package:
library(lattice)
Here's a basic scatterplot of life expectancy against GDP per capita for 2007. How do we change the colors associated with the different continents?
xyplot(lifeExp ~ gdpPercap, jDat,
       subset = year == 2007,
       scales = list(x = list(log = 10, equispaced.log = FALSE)),
       group = continent, auto.key = TRUE)
Many aspects of a lattice graphic are determined by the current theme. To get a visual overview of yours, submit this:
show.settings()
To get the gory details of your current theme, use the trellis.par.get() function (I won't print the output here, but you should inspect it on your machine):
trellis.par.get()
What was all that?!? Let's get an overview.
str(trellis.par.get(), max.level = 1)
## List of 35
##  $ grid.pars        : list()
##  $ fontsize         :List of 2
##  $ background       :List of 2
##  $ panel.background :List of 1
##  $ clip             :List of 2
##  $ add.line         :List of 4
##  $ add.text         :List of 5
##  $ plot.polygon     :List of 5
##  $ box.dot          :List of 5
##  $ box.rectangle    :List of 5
##  $ box.umbrella     :List of 4
##  $ dot.line         :List of 4
##  $ dot.symbol       :List of 5
##  $ plot.line        :List of 4
##  $ plot.symbol      :List of 6
##  $ reference.line   :List of 4
##  $ strip.background :List of 2
##  $ strip.shingle    :List of 2
##  $ strip.border     :List of 4
##  $ superpose.line   :List of 4
##  $ superpose.symbol :List of 6
##  $ superpose.polygon:List of 5
##  $ regions          :List of 2
##  $ shade.colors     :List of 2
##  $ axis.line        :List of 4
##  $ axis.text        :List of 5
##  $ axis.components  :List of 4
##  $ layout.heights   :List of 19
##  $ layout.widths    :List of 15
##  $ box.3d           :List of 4
##  $ par.xlab.text    :List of 5
##  $ par.ylab.text    :List of 5
##  $ par.zlab.text    :List of 5
##  $ par.main.text    :List of 5
##  $ par.sub.text     :List of 5
The theme is a large list of graphical parameters that provide fine control of lattice graphics. Many of the names are fairly self-explanatory, especially when viewed alongside the output of show.settings().
Consider, for example, the list component superpose.symbol. Let's inspect it.
str(trellis.par.get("superpose.symbol"))
## List of 6
##  $ alpha: num [1:7] 1 1 1 1 1 1 1
##  $ cex  : num [1:7] 0.8 0.8 0.8 0.8 0.8 0.8 0.8
##  $ col  : chr [1:7] "#0080ff" "#ff00ff" "darkgreen" "#ff0000" ...
##  $ fill : chr [1:7] "#CCFFFF" "#FFCCFF" "#CCFFCC" "#FFE5CC" ...
##  $ font : num [1:7] 1 1 1 1 1 1 1
##  $ pch  : num [1:7] 1 1 1 1 1 1 1
It is itself a list with components controlling various properties of points when we use lattice's functionality for superposition via the group = argument. If we want to change the color of the points, this is where it needs to happen.
First, let's simply establish that superpose.symbol is in fact the set of graphical parameters that we need to modify. Graphical parameters can be set in an extremely limited way -- applying only to a single call -- by using the par.settings = argument to any high-level lattice call.
xyplot(lifeExp ~ gdpPercap | continent, jDat,
       group = country, subset = year == 2007,
       scales = list(x = list(log = 10, equispaced.log = FALSE)),
       par.settings = list(superpose.symbol = list(pch = 19, cex = 1.5,
                                                   col = c("orange", "blue"))))
Yes! We successfully modified the plot symbol, its size, and the colors. Granted, we used nonsensical colors, but that's often a good move at the very start. Before I go to the trouble of inserting a finely crafted color palette, I want to make sure I know where to put it.
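The call above handed lattice only two colors for many more than two countries. Here is a rough sketch of what becomes of them, assuming (as I understand lattice's superposition) that the i-th level of the grouping factor takes the i-th color and that a short color vector is recycled. The names `toyGroups`, `twoColors`, `recycled` and `colorByLevel` are made up for illustration:

```r
## toy grouping factor with five levels (hypothetical data)
toyGroups <- factor(c("a", "b", "c", "d", "e"))
twoColors <- c("orange", "blue")

## recycle the short color vector to one entry per factor level
recycled <- rep_len(twoColors, nlevels(toyGroups))

## pair each level with the color it would receive
colorByLevel <- setNames(recycled, levels(toyGroups))
colorByLevel
```

This is why a real palette needs one color per level, supplied in level order.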
Now we need that fancy color palette. The details of how to construct our country color palette are given elsewhere (future link), so we simply import and inspect it here.
gdURL <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderCountryColors.txt"
countryColors <- read.delim(file = gdURL, as.is = 3) # protect color
str(countryColors)
## 'data.frame':    142 obs. of  3 variables:
##  $ continent: Factor w/ 5 levels "Africa","Americas",..: 1 1 1 1 1 1 1 1 ..
##  $ country  : Factor w/ 142 levels "Afghanistan",..: 95 39 43 28 118 121 ..
##  $ color    : chr  "#7F3B08" "#833D07" "#873F07" "#8B4107" ...
head(countryColors)
##   continent          country   color
## 1    Africa          Nigeria #7F3B08
## 2    Africa            Egypt #833D07
## 3    Africa         Ethiopia #873F07
## 4    Africa Congo, Dem. Rep. #8B4107
## 5    Africa     South Africa #8F4407
## 6    Africa            Sudan #934607
countryColors is a data.frame, with one row per country, and with factors for country and continent. The crown jewel is the vector of country colors and that is what we need to insert into the superpose.symbol list.
From viewing the first few lines of countryColors we can see that the rows are not arranged in alphabetical order by country, which is the default level order for the country factor. So, before we can invoke our custom colors, we must make sure they are in the correct order, i.e. are harmonized to the levels of jDat$country. Note that the way I do this smoothly handles the additional wrinkle that we have dropped Oceania from jDat. By using match(), instead of merely sorting alphabetically and hoping for the best, we gain an extra level of protection from ourselves. We are now ready to use the colors, on the left in a scatterplot and on the right in a line plot. I use grouping and multi-panel conditioning redundantly, because I like the way it looks and I like the visual sanity check that I've applied my color scheme correctly.
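A toy illustration of the match() idiom, using made-up countries and colors (nothing here comes from the Gapminder files; `toyPalette`, `toyFactor`, `idx` and `aligned` are hypothetical names). The lookup table is deliberately out of order and contains one country that is absent from the factor:

```r
## hypothetical lookup table, not in alphabetical order
toyPalette <- data.frame(country = c("Chad", "Albania", "Brazil", "Fiji"),
                         color   = c("#111111", "#222222", "#333333", "#444444"),
                         stringsAsFactors = FALSE)

## factor with alphabetical levels; "Fiji" is not among them
toyFactor <- factor(c("Brazil", "Albania", "Chad", "Albania"))

## for each level, find its row in the lookup table
idx <- match(levels(toyFactor), toyPalette$country)
stopifnot(!anyNA(idx))   # fail loudly if any level has no color

## reorder (and trim) the table to follow the factor levels
aligned <- toyPalette[idx, ]
aligned
```

The rows of `aligned` follow the factor levels, and the unused "Fiji" row has dropped out on its own. The real data gets the same treatment next.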
countryColors <-
  countryColors[match(levels(jDat$country), countryColors$country), ]
str(countryColors) # see ... there are only 140 now, not the original 142
## 'data.frame':    140 obs. of  3 variables:
##  $ continent: Factor w/ 5 levels "Africa","Americas",..: 3 4 1 1 2 4 3 3 ..
##  $ country  : Factor w/ 142 levels "Afghanistan",..: 1 2 3 4 5 7 8 9 10 1..
##  $ color    : chr  "#874D96" "#D2ECB1" "#A34F06" "#C96C0C" ...
xyplot(lifeExp ~ gdpPercap | continent, jDat,
       group = country, subset = year == 2007,
       scales = list(x = list(log = 10, equispaced.log = FALSE)),
       par.settings = list(superpose.symbol = list(pch = 19, cex = 1,
                                                   col = countryColors$color)))
xyplot(lifeExp ~ year | continent, jDat,
       group = country, type = "l",
       par.settings = list(superpose.line = list(col = countryColors$color,
                                                 lwd = 2)))
If you want to change several graphical parameters or if you want to apply your changes to multiple plots, the above method gets a bit cumbersome. You can assign your changes to an object and then use that to set par.settings =. This has all the usual benefits of isolating the changes to one piece of code, such as ease of modification and reuse.
We demonstrate with a new example, where we draw on the custom continent color scheme that underpins the larger country color scheme above. These colors were chosen to anchor the selection of country colors, not to really stand alone, so apologies that they are rather dark. In the second plot, we verify that the custom color scheme works perfectly well when we use multi-panel conditioning on a completely different variable, year.
gdURL <- "http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderContinentColors.txt"
(continentColors <- read.delim(file = gdURL, as.is = 3)) # protect color
##   continent nCountries   color
## 1    Africa         52 #7F3B08
## 2  Americas         25 #A50026
## 3      Asia         33 #40004B
## 4    Europe         30 #276419
## 5   Oceania          2 #313695
(continentColors <-
  continentColors[match(levels(jDat$continent), continentColors$continent), ])
##   continent nCountries   color
## 1    Africa         52 #7F3B08
## 2  Americas         25 #A50026
## 3      Asia         33 #40004B
## 4    Europe         30 #276419
coolNewPars <-
  list(superpose.symbol = list(pch = 21, cex = 2, col = "gray20",
                               fill = continentColors$color))
xyplot(lifeExp ~ gdpPercap, jDat,
       subset = year == 2007,
       scales = list(x = list(log = 10, equispaced.log = FALSE)),
       group = continent, auto.key = list(columns = 4),
       par.settings = coolNewPars)
xyplot(lifeExp ~ gdpPercap | factor(year), jDat,
       subset = year %in% c(1952, 2007),
       scales = list(x = list(log = 10, equispaced.log = FALSE)),
       group = continent, auto.key = list(columns = 4),
       par.settings = coolNewPars)
Here we show how to change the theme itself via trellis.par.set() and verify its effect. As we did in base graphics, we also model best practice for modifying such "hidden" parameters: we store the original state and restore it when we're done. We're taking advantage of the fact that high-level lattice calls return actual objects. We make the figure once and store it as myPlot. We then print it three times, in a changing theme context: original theme, our custom theme, original theme.
tp <- trellis.par.get() # store the original theme
myPlot <- xyplot(lifeExp ~ gdpPercap | continent, jDat,
                 group = country, subset = year == 2007,
                 scales = list(x = list(log = 10, equispaced.log = FALSE)))
myPlot
trellis.par.set(superpose.symbol = list(pch = 19, cex = 1,
                                        col = countryColors$color))
myPlot
trellis.par.set(tp)
myPlot
It's worth pointing out a very nice feature of the theme modifications above. Whether our theme changes are temporary and limited to a single call or are more persistent and global, notice that we never had to specify the entire set of graphical parameters. You can specify only the things you want to change and everything else will remain at its current value.
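A small check of that claim, as a sketch rather than a definitive test. It changes only pch inside superpose.symbol and confirms that the sibling components are untouched. The names `before`, `after`, `untouched` and `restored` are illustrative. Note that trellis.par.get() will open the default graphics device if none is active.

```r
library(lattice)

tp <- trellis.par.get()                        # store the original theme
before <- trellis.par.get("superpose.symbol")

## change a single component of superpose.symbol
trellis.par.set(superpose.symbol = list(pch = 19))
after <- trellis.par.get("superpose.symbol")

## everything we did not mention keeps its previous value
untouched <- setdiff(names(before), "pch")
identical(after[untouched], before[untouched])
after$pch

## restore the original theme and confirm pch is back
trellis.par.set(tp)
restored <- trellis.par.get("superpose.symbol")
identical(restored$pch, before$pch)
```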
Go to the tutorial on technical details of lattice to review an alternative method for exerting graphical control, including over the colors, by modifying the panel function.
Lattice: Multivariate Data Visualization with R, by Deepayan Sarkar, Springer (2008). See Chapter 7, Graphical Parameters and Other Settings.