Introduction to R - Part III
Recap
library(tidyverse)
gapminder <- read_csv("raw_data/gapminder.csv")
Summarising and grouping with dplyr
The summarise
function can take any R function that takes a vector of values (i.e. a column from a data frame) and returns a single value. Some of the more useful functions include:
min
minimum value
max
maximum value
sum
sum of values
mean
mean value
sd
standard deviation
median
median value
IQR
the interquartile range
n_distinct
the number of distinct values
n
the number of observations (Note: this is a special function that doesn’t take a vector argument, i.e. column)
summarise(gapminder, min(lifeExp), max(gdpPercap), mean(pop))
It is also possible to summarise using a function that takes more than one value, i.e. from multiple columns. For example, we could compute the correlation between year and life expectancy. Here we also assign names to the table that is produced.
gapminder %>%
summarise(MinLifeExpectancy = min(lifeExp),
MaximumGDP = max(gdpPercap),
AveragePop = mean(pop),
Correlation = cor(year, lifeExp))
However, it is not particularly useful to calculate such values from the entire table as we have different continents and years. The group_by
function allows us to split the table into different categories, and compute summary statistics for each year (for example).
gapminder %>%
group_by(year) %>%
summarise(MinLifeExpectancy = min(lifeExp),
MaximumGDP = max(gdpPercap),
AveragePop = mean(pop))
Other summary statistics that can be useful include first
and last
which are used to report the first and last values for a particular group. For instance, we might want to look at the increase in wealth over time for each country by extracting their gdpPercap
in 1952
and 2007
.
gapminder %>%
group_by(country) %>%
arrange(year) %>%
summarise(StartGDP = first(gdpPercap), EndGDP = last(gdpPercap), GDPIncrease = EndGDP - StartGDP)
The nice thing about summarise
is that it can followed up by any of the other dplyr
verbs that we have met so far (select
, filter
, arrange
..etc).
Returning to the correlation between life expectancy and year, we can summarise as follows:-
gapminder %>%
group_by(country) %>%
summarise(Correlation = cor(year , lifeExp))
We can then arrange the table by the correlation to see which countries have the lowest correlation
gapminder %>%
group_by(country) %>%
summarise(Correlation = cor(year , lifeExp)) %>%
arrange(Correlation)
We can filter the results to find obsevations of interest
gapminder %>%
group_by(country) %>%
summarise(Correlation = cor(year , lifeExp)) %>%
filter(Correlation < 0)
The countries we identify could then be used as the basis for a plot.
filter(gapminder, country %in% c("Rwanda","Zambia","Zimbabwe")) %>%
ggplot(aes(x=year, y=lifeExp,col=country)) + geom_line()

Exercise
- Produce a plot to show the change in average
gdpPercap
for each continent over time.
- see below for a suggestion
- HINT: you will need to specifiy the
stat=identity
option when creating the bar plot
......+ geom_bar(stat="identity")

Joining
In many real life situations, data are spread across multiple tables or spreadsheets. Usually this occurs because different types of information about a subject, e.g. a patient, are collected from different sources. It may be desirable for some analyses to combine data from two or more tables into a single data frame based on a common column, for example, an attribute that uniquely identifies the subject.
dplyr
provides a set of join functions for combining two data frames based on matches within specified columns. For those familiar with such SQL, these operations are very similar to carrying out join operations between tables in a relational database.
As a toy example, lets consider two data frames that contain the names of various bands, and the instruments that they play:-
band_instruments
band_members
There are various ways in which we can join these two tables together. We will just consider the case of a “left join”.
Animated gif by Garrick Aden-Buie
left_join
returns all rows from the first data frame regardless of whether there is a match in the second data frame. Rows with no match are included in the resulting data frame but have NA values in the additional columns coming from the second data frame.
Animations to illustrate other types of join are available at https://github.com/gadenbuie/tidy-animated-verbs
left_join(band_members, band_instruments)
Joining, by = "name"
right_join
is similar but returns all rows from the second data frame that have a match with rows in the first data frame based on the specified column.
right_join(band_members, band_instruments)
Joining, by = "name"
inner_join
only returns those rows where matches could be made
inner_join(band_members, band_instruments)
Joining, by = "name"
Exercise (open-ended)
- The file
medal_table.csv
in the raw_data/
project sub-directory contains data about how many medals how been won by various countries at the Beijing summer olympics of 2008.
- Read this csv file into R and join with the
gapminder
data from 2007
- What interesting summaries / plots can you make from the data? For example…
- what countries have the greatest proportion of gold medals (ignore countries with too few medals)
- calculate the number of medals won per million people and re-arrange by this new measure. What countries perform best?
- how similar is the distribution of total medals between continents?
- do countries with a larger population tend to win more medals?
- do countries with larger GDP tend to win more medals?
- are these trends consistent among different continents?
Using R to analyse RNAi data
You should have access to an example RNAi screen dataset; rnai.csv
. In RStudio, create a new project in the directory that this csv
is saved.
You can create a new markdown file to document the analysis through the menus.
File -> New File -> R markdown >
The first code chunk should make sure we have the tidyverse
package loaded.
library(tidyverse)
As a quality control measure, we can look at the signal from each plate and make sure there are no systematic differences. This could be visualised as a boxplot with geom_boxplot
as we have seen before.
ggplot(rnai, aes(x=as.factor(Plate), y=`FL/RL_rep1`)) + geom_boxplot()

We also have some positional information at our disposal that could be of interest. This is encoded in the well name, but we can split the well name into a row and column ID using a mutate
command with substr
to extract the letter and number of the well.
rnai <- mutate(rnai,Row = substr(Well,1,1), Col=substr(Well,2,3))
rnai
This can be visualised using a type of geom
that we haven’t seen before called geom_tile
that produces a “heatmap” style of plot.
ggplot(rnai, aes(x=Row,y=Col,fill=`FL/RL_rep1`)) + geom_tile() + facet_wrap(~Plate)

Exercise
- Try and re-produce the
geom_boxplot
and geom_tile
above with the FL/RL_Av
column instead of FL/RL_Rep1
- R will struggle to produce the plot. Look at the contents of this column and think about why this might happen?
- Apply appropriate filtering to the data in order to produce a boxplot of the
FL/RL_Av
values for each Plate
- HINT: the R code
!is.na
can be used to check identify what rows do not contain an NA
value in a particular column
Before proceeding to further analysis we can clean the dataset so that we only have the gene identifiers and average FL
and RL
values as columns.
rnai_cleaned <- filter(rnai, !is.na(BKN)) %>%
mutate(FL=as.numeric(`FL_Av`), RL=as.numeric(`RL_Av`), Ratio = as.numeric(`FL/RL_Av`)) %>%
select(GeneSymbol, GeneID,FL,RL,Ratio)
rnai_cleaned
The Gene identifiers are not particularly useful by themselves but we can use resources such as Biomart to map these to more recognisable names, and also obtain homologous human genes. The biomaRt
package is able to perform the mapping, and without going into too much detail here is the code to do so.
library(biomaRt)
genes_of_interest <-rnai_cleaned$GeneID
ensembl <- useEnsembl(biomart = "ensembl",
dataset = "dmelanogaster_gene_ensembl")
anno <- getBM(c("ensembl_gene_id","external_gene_name", "hsapiens_homolog_ensembl_gene","hsapiens_homolog_associated_gene_name","hsapiens_homolog_orthology_confidence"), "external_gene_name", genes_of_interest, ensembl) %>%
rename(GeneID = external_gene_name)
This can now be joined to the cleaned dataset using the left_join
command from before.
rnai_cleaned <- left_join(rnai_cleaned, anno)
Joining, by = "GeneID"
rnai_cleaned
A basic scatter plot of FL
and RL
reveals some extreme values in the dataset.
ggplot(rnai_cleaned, aes(x = FL, y = RL,col=Ratio)) + geom_point()

Exercise
- Identify the names of the genes that have extreme
RL
or FL
values
- Produce a version of the plot that has the name of these genes labelled
- check the identity of these genes. Do they have any biological significance?

- Decide on suitable cut-offs on the ratio of
RL
to FL
to identify genes of interest
- Write the rows corresponding to these genes to a csv file for further processing
LS0tCnRpdGxlOiAiUiBDcmFzaCBDb3Vyc2UiCmF1dGhvcjogIk1hcmsgRHVubmluZyIKZGF0ZTogJ2ByIGZvcm1hdChTeXMudGltZSgpLCAiTGFzdCBtb2RpZmllZDogJWQgJWIgJVkiKWAnCm91dHB1dDogCiAgaHRtbF9ub3RlYm9vazogCiAgICB0b2M6IHllcwogICAgdG9jX2Zsb2F0OiB5ZXMKZWRpdG9yX29wdGlvbnM6IAogIGNodW5rX291dHB1dF90eXBlOiBpbmxpbmUKLS0tCgojIEludHJvZHVjdGlvbiB0byBSIC0gUGFydCBJSUkKCiMjIFJlY2FwCgpgYGB7ciBtZXNzYWdlPUZBTFNFfQpsaWJyYXJ5KHRpZHl2ZXJzZSkKZ2FwbWluZGVyIDwtIHJlYWRfY3N2KCJyYXdfZGF0YS9nYXBtaW5kZXIuY3N2IikKYGBgCgoKIyBTdW1tYXJpc2luZyBhbmQgZ3JvdXBpbmcgd2l0aCBkcGx5cgoKVGhlIGBzdW1tYXJpc2VgIGZ1bmN0aW9uIGNhbiB0YWtlIGFueSBSIGZ1bmN0aW9uIHRoYXQgdGFrZXMgYSB2ZWN0b3Igb2YgdmFsdWVzIChpLmUuIGEgY29sdW1uIGZyb20gYSBkYXRhIGZyYW1lKSBhbmQgcmV0dXJucyBhIHNpbmdsZSB2YWx1ZS4gU29tZSBvZiB0aGUgbW9yZSB1c2VmdWwgZnVuY3Rpb25zIGluY2x1ZGU6CgotIGBtaW5gIG1pbmltdW0gdmFsdWUKLSBgbWF4YCBtYXhpbXVtIHZhbHVlCi0gYHN1bWAgc3VtIG9mIHZhbHVlcwotIGBtZWFuYCBtZWFuIHZhbHVlCi0gYHNkYCBzdGFuZGFyZCBkZXZpYXRpb24KLSBgbWVkaWFuYCBtZWRpYW4gdmFsdWUKLSBgSVFSYCB0aGUgaW50ZXJxdWFydGlsZSByYW5nZQotIGBuX2Rpc3RpbmN0YCB0aGUgbnVtYmVyIG9mIGRpc3RpbmN0IHZhbHVlcwotIGBuYCB0aGUgbnVtYmVyIG9mIG9ic2VydmF0aW9ucyAoTm90ZTogdGhpcyBpcyBhIHNwZWNpYWwgZnVuY3Rpb24gdGhhdCBkb2VzbuKAmXQgdGFrZSBhIHZlY3RvciBhcmd1bWVudCwgaS5lLiBjb2x1bW4pCgoKYGBge3J9CnN1bW1hcmlzZShnYXBtaW5kZXIsIG1pbihsaWZlRXhwKSwgbWF4KGdkcFBlcmNhcCksIG1lYW4ocG9wKSkKYGBgCgpJdCBpcyBhbHNvIHBvc3NpYmxlIHRvIHN1bW1hcmlzZSB1c2luZyBhIGZ1bmN0aW9uIHRoYXQgdGFrZXMgbW9yZSB0aGFuIG9uZSB2YWx1ZSwgaS5lLiBmcm9tIG11bHRpcGxlIGNvbHVtbnMuIEZvciBleGFtcGxlLCB3ZSBjb3VsZCBjb21wdXRlIHRoZSBjb3JyZWxhdGlvbiBiZXR3ZWVuIHllYXIgYW5kIGxpZmUgZXhwZWN0YW5jeS4gSGVyZSB3ZSBhbHNvIGFzc2lnbiBuYW1lcyB0byB0aGUgdGFibGUgdGhhdCBpcyBwcm9kdWNlZC4KCmBgYHtyfQpnYXBtaW5kZXIgJT4lIApzdW1tYXJpc2UoTWluTGlmZUV4cGVjdGFuY3kgPSBtaW4obGlmZUV4cCksIAogICAgICAgICAgTWF4aW11bUdEUCA9IG1heChnZHBQZXJjYXApLCAKICAgICAgICAgIEF2ZXJhZ2VQb3AgPSBtZWFuKHBvcCksIAogICAgICAgICAgQ29ycmVsYXRpb24gPSBjb3IoeWVhciwgbGlmZUV4cCkpCmBgYAoKSG93ZXZlciwgaXQgaXMgbm90IHBhcnRpY3VsYXJseSB1c2VmdWwgdG8gY2FsY3VsYXRlIHN1Y2ggdmFsdWVzIGZyb20gdGhlIGVudGlyZSB0YWJsZSBhcyB3ZSBoYXZlIGRpZmZlcmVudCBjb250aW5lbnRzIGFuZCB5ZWFycy4gVGhlIGBncm91cF9ieWAgZnVuY3Rpb24gYWxsb3dzIHVzIHRvIHNwbGl0IHRoZSB0YWJsZSBpbnRvIGRpZmZlcmVudCBjYXRlZ29yaWVzLCBhbmQgY29tcHV0ZSBzdW1tYXJ5IHN0YXRpc3RpY3MgZm9yIGVhY2ggeWVhciAoZm9yIGV4YW1wbGUpLgoKYGBge3J9CmdhcG1pbmRlciAlPiUgCiAgICBncm91cF9ieSh5ZWFyKSAlPiUgCiAgICBzdW1tYXJpc2UoTWluTGlmZUV4cGVjdGFuY3kgPSBtaW4obGlmZUV4cCksIAogICAgICAgICAgICAgIE1heGltdW1HRFAgPSBtYXgoZ2RwUGVyY2FwKSwgCiAgICAgICAgICAgICAgQXZlcmFnZVBvcCA9IG1lYW4ocG9wKSkKYGBgCgpPdGhlciBzdW1tYXJ5IHN0YXRpc3RpY3MgdGhhdCBjYW4gYmUgdXNlZnVsIGluY2x1ZGUgYGZpcnN0YCBhbmQgYGxhc3RgIHdoaWNoIGFyZSB1c2VkIHRvIHJlcG9ydCB0aGUgZmlyc3QgYW5kIGxhc3QgdmFsdWVzIGZvciBhIHBhcnRpY3VsYXIgZ3JvdXAuIEZvciBpbnN0YW5jZSwgd2UgbWlnaHQgd2FudCB0byBsb29rIGF0IHRoZSBpbmNyZWFzZSBpbiB3ZWFsdGggb3ZlciB0aW1lIGZvciBlYWNoIGNvdW50cnkgYnkgZXh0cmFjdGluZyB0aGVpciBgZ2RwUGVyY2FwYCBpbiBgMTk1MmAgYW5kIGAyMDA3YC4KCmBgYHtyfQpnYXBtaW5kZXIgJT4lIAogIGdyb3VwX2J5KGNvdW50cnkpICU+JSAKICBhcnJhbmdlKHllYXIpICU+JSAKc3VtbWFyaXNlKFN0YXJ0R0RQID0gZmlyc3QoZ2RwUGVyY2FwKSwgRW5kR0RQID0gbGFzdChnZHBQZXJjYXApLCBHRFBJbmNyZWFzZSA9IEVuZEdEUCAtIFN0YXJ0R0RQKQoKYGBgCgoKVGhlIG5pY2UgdGhpbmcgYWJvdXQgYHN1bW1hcmlzZWAgaXMgdGhhdCBpdCBjYW4gZm9sbG93ZWQgdXAgYnkgYW55IG9mIHRoZSBvdGhlciBgZHBseXJgIHZlcmJzIHRoYXQgd2UgaGF2ZSBtZXQgc28gZmFyIChgc2VsZWN0YCwgYGZpbHRlcmAsIGBhcnJhbmdlYC4uZXRjKS4gCgpSZXR1cm5pbmcgdG8gdGhlIGNvcnJlbGF0aW9uIGJldHdlZW4gbGlmZSBleHBlY3RhbmN5IGFuZCB5ZWFyLCB3ZSBjYW4gc3VtbWFyaXNlIGFzIGZvbGxvd3M6LQoKYGBge3J9CmdhcG1pbmRlciAlPiUgICAgIAogICAgZ3JvdXBfYnkoY291bnRyeSkgJT4lIAogICAgc3VtbWFyaXNlKENvcnJlbGF0aW9uID0gY29yKHllYXIgLCBsaWZlRXhwKSkKYGBgCldlIGNhbiB0aGVuIGFycmFuZ2UgdGhlIHRhYmxlIGJ5IHRoZSBjb3JyZWxhdGlvbiB0byBzZWUgd2hpY2ggY291bnRyaWVzIGhhdmUgdGhlIGxvd2VzdCBjb3JyZWxhdGlvbgoKYGBge3J9CmdhcG1pbmRlciAlPiUgICAgICAKICAgIGdyb3VwX2J5KGNvdW50cnkpICU+JSAKICAgIHN1bW1hcmlzZShDb3JyZWxhdGlvbiA9IGNvcih5ZWFyICwgbGlmZUV4cCkpICU+JSAKICAgIGFycmFuZ2UoQ29ycmVsYXRpb24pCmBgYAoKV2UgY2FuIGZpbHRlciB0aGUgcmVzdWx0cyB0byBmaW5kIG9ic2V2YXRpb25zIG9mIGludGVyZXN0CgpgYGB7cn0KZ2FwbWluZGVyICU+JSAgICAgIAogICAgZ3JvdXBfYnkoY291bnRyeSkgJT4lIAogICAgc3VtbWFyaXNlKENvcnJlbGF0aW9uID0gY29yKHllYXIgLCBsaWZlRXhwKSkgJT4lIAogICAgZmlsdGVyKENvcnJlbGF0aW9uIDwgMCkKYGBgCgpUaGUgY291bnRyaWVzIHdlIGlkZW50aWZ5IGNvdWxkIHRoZW4gYmUgdXNlZCBhcyB0aGUgYmFzaXMgZm9yIGEgcGxvdC4KCmBgYHtyfQpmaWx0ZXIoZ2FwbWluZGVyLCBjb3VudHJ5ICVpbiUgYygiUndhbmRhIiwiWmFtYmlhIiwiWmltYmFid2UiKSkgJT4lIAogIGdncGxvdChhZXMoeD15ZWFyLCB5PWxpZmVFeHAsY29sPWNvdW50cnkpKSArIGdlb21fbGluZSgpCmBgYCAKCioqKioqKgoqKioqKioKKioqKioqCgojIyMgRXhlcmNpc2UgCgotIFByb2R1Y2UgYSBwbG90IHRvIHNob3cgdGhlIGNoYW5nZSBpbiBhdmVyYWdlIGBnZHBQZXJjYXBgIGZvciBlYWNoIGNvbnRpbmVudCBvdmVyIHRpbWUuCi0gc2VlIGJlbG93IGZvciBhIHN1Z2dlc3Rpb24KICAgICsgSElOVDogeW91IHdpbGwgbmVlZCB0byBzcGVjaWZpeSB0aGUgYHN0YXQ9aWRlbnRpdHlgIG9wdGlvbiB3aGVuIGNyZWF0aW5nIHRoZSBiYXIgcGxvdAogICAgCmBgYHtyIGV2YWw9RkFMU0V9Ci4uLi4uLisgZ2VvbV9iYXIoc3RhdD0iaWRlbnRpdHkiKQpgYGAKCioqKioqKgoqKioqKioKKioqKioqCgoKCgpgYGB7ciBlY2hvPUZBTFNFfQpnYXBtaW5kZXIgJT4lIGdyb3VwX2J5KHllYXIsY29udGluZW50KSAlPiUgCiAgc3VtbWFyaXNlKFdlYWx0aD1tZWFuKGdkcFBlcmNhcCkpICU+JSBnZ3Bsb3QoYWVzKHg9eWVhcix5PVdlYWx0aCxmaWxsPWNvbnRpbmVudCkpICsgZ2VvbV9iYXIoc3RhdD0iaWRlbnRpdHkiKSAgKyBmYWNldF93cmFwKH5jb250aW5lbnQpCmBgYAoKCiMgSm9pbmluZwoKSW4gbWFueSByZWFsIGxpZmUgc2l0dWF0aW9ucywgZGF0YSBhcmUgc3ByZWFkIGFjcm9zcyBtdWx0aXBsZSB0YWJsZXMgb3Igc3ByZWFkc2hlZXRzLiBVc3VhbGx5IHRoaXMgb2NjdXJzIGJlY2F1c2UgZGlmZmVyZW50IHR5cGVzIG9mIGluZm9ybWF0aW9uIGFib3V0IGEgc3ViamVjdCwgZS5nLiBhIHBhdGllbnQsIGFyZSBjb2xsZWN0ZWQgZnJvbSBkaWZmZXJlbnQgc291cmNlcy4gSXQgbWF5IGJlIGRlc2lyYWJsZSBmb3Igc29tZSBhbmFseXNlcyB0byBjb21iaW5lIGRhdGEgZnJvbSB0d28gb3IgbW9yZSB0YWJsZXMgaW50byBhIHNpbmdsZSBkYXRhIGZyYW1lIGJhc2VkIG9uIGEgY29tbW9uIGNvbHVtbiwgZm9yIGV4YW1wbGUsIGFuIGF0dHJpYnV0ZSB0aGF0IHVuaXF1ZWx5IGlkZW50aWZpZXMgdGhlIHN1YmplY3QuCgpgZHBseXJgIHByb3ZpZGVzIGEgc2V0IG9mIGpvaW4gZnVuY3Rpb25zIGZvciBjb21iaW5pbmcgdHdvIGRhdGEgZnJhbWVzIGJhc2VkIG9uIG1hdGNoZXMgd2l0aGluIHNwZWNpZmllZCBjb2x1bW5zLiBGb3IgdGhvc2UgZmFtaWxpYXIgd2l0aCBzdWNoIFNRTCwgdGhlc2Ugb3BlcmF0aW9ucyBhcmUgdmVyeSBzaW1pbGFyIHRvIGNhcnJ5aW5nIG91dCBqb2luIG9wZXJhdGlvbnMgYmV0d2VlbiB0YWJsZXMgaW4gYSByZWxhdGlvbmFsIGRhdGFiYXNlLgoKQXMgYSB0b3kgZXhhbXBsZSwgbGV0cyBjb25zaWRlciB0d28gZGF0YSBmcmFtZXMgdGhhdCBjb250YWluIHRoZSBuYW1lcyBvZiB2YXJpb3VzIGJhbmRzLCBhbmQgdGhlIGluc3RydW1lbnRzIHRoYXQgdGhleSBwbGF5Oi0KYGBge3J9CmJhbmRfaW5zdHJ1bWVudHMKYmFuZF9tZW1iZXJzCmBgYAoKVGhlcmUgYXJlIHZhcmlvdXMgd2F5cyBpbiB3aGljaCB3ZSBjYW4gam9pbiB0aGVzZSB0d28gdGFibGVzIHRvZ2V0aGVyLiBXZSB3aWxsIGp1c3QgY29uc2lkZXIgdGhlIGNhc2Ugb2YgYSAibGVmdCBqb2luIi4KCiFbXShpbWFnZXMvbGVmdC1qb2luLmdpZikKCipBbmltYXRlZCBnaWYgYnkgR2FycmljayBBZGVuLUJ1aWUqCgpgbGVmdF9qb2luYCByZXR1cm5zIGFsbCByb3dzIGZyb20gdGhlIGZpcnN0IGRhdGEgZnJhbWUgcmVnYXJkbGVzcyBvZiB3aGV0aGVyIHRoZXJlIGlzIGEgbWF0Y2ggaW4gdGhlIHNlY29uZCBkYXRhIGZyYW1lLiBSb3dzIHdpdGggbm8gbWF0Y2ggYXJlIGluY2x1ZGVkIGluIHRoZSByZXN1bHRpbmcgZGF0YSBmcmFtZSBidXQgaGF2ZSBOQSB2YWx1ZXMgaW4gdGhlIGFkZGl0aW9uYWwgY29sdW1ucyBjb21pbmcgZnJvbSB0aGUgc2Vjb25kIGRhdGEgZnJhbWUuCgpBbmltYXRpb25zIHRvIGlsbHVzdHJhdGUgb3RoZXIgdHlwZXMgb2Ygam9pbiBhcmUgYXZhaWxhYmxlIGF0IFtodHRwczovL2dpdGh1Yi5jb20vZ2FkZW5idWllL3RpZHktYW5pbWF0ZWQtdmVyYnNdKGh0dHBzOi8vZ2l0aHViLmNvbS9nYWRlbmJ1aWUvdGlkeS1hbmltYXRlZC12ZXJicykKCmBgYHtyfQpsZWZ0X2pvaW4oYmFuZF9tZW1iZXJzLCBiYW5kX2luc3RydW1lbnRzKQpgYGAKCmByaWdodF9qb2luYCBpcyBzaW1pbGFyIGJ1dCByZXR1cm5zIGFsbCByb3dzIGZyb20gdGhlIHNlY29uZCBkYXRhIGZyYW1lIHRoYXQgaGF2ZSBhIG1hdGNoIHdpdGggcm93cyBpbiB0aGUgZmlyc3QgZGF0YSBmcmFtZSBiYXNlZCBvbiB0aGUgc3BlY2lmaWVkIGNvbHVtbi4KCmBgYHtyfQpyaWdodF9qb2luKGJhbmRfbWVtYmVycywgYmFuZF9pbnN0cnVtZW50cykKYGBgCgpgaW5uZXJfam9pbmAgb25seSByZXR1cm5zIHRob3NlIHJvd3Mgd2hlcmUgbWF0Y2hlcyBjb3VsZCBiZSBtYWRlCgpgYGB7cn0KaW5uZXJfam9pbihiYW5kX21lbWJlcnMsIGJhbmRfaW5zdHJ1bWVudHMpCmBgYAoKCioqKioqKgoqKioqKioKKioqKioqCgoKCiMjIyBFeGVyY2lzZSAob3Blbi1lbmRlZCkKCi0gVGhlIGZpbGUgYG1lZGFsX3RhYmxlLmNzdmAgaW4gdGhlIGByYXdfZGF0YS9gIHByb2plY3Qgc3ViLWRpcmVjdG9yeSBjb250YWlucyBkYXRhIGFib3V0IGhvdyBtYW55IG1lZGFscyBob3cgYmVlbiB3b24gYnkgdmFyaW91cyBjb3VudHJpZXMgYXQgdGhlIEJlaWppbmcgc3VtbWVyIG9seW1waWNzIG9mIDIwMDguCi0gUmVhZCB0aGlzIGNzdiBmaWxlIGludG8gUiBhbmQgam9pbiB3aXRoIHRoZSBgZ2FwbWluZGVyYCBkYXRhIGZyb20gMjAwNwotIFdoYXQgaW50ZXJlc3Rpbmcgc3VtbWFyaWVzIC8gcGxvdHMgY2FuIHlvdSBtYWtlIGZyb20gdGhlIGRhdGE/IEZvciBleGFtcGxlLi4uCiAgKyB3aGF0IGNvdW50cmllcyBoYXZlIHRoZSBncmVhdGVzdCBwcm9wb3J0aW9uIG9mIGdvbGQgbWVkYWxzIChpZ25vcmUgY291bnRyaWVzIHdpdGggdG9vIGZldyBtZWRhbHMpCiAgKyBjYWxjdWxhdGUgdGhlIG51bWJlciBvZiBtZWRhbHMgd29uIHBlciBtaWxsaW9uIHBlb3BsZSBhbmQgcmUtYXJyYW5nZSBieSB0aGlzIG5ldyBtZWFzdXJlLiBXaGF0IGNvdW50cmllcyBwZXJmb3JtIGJlc3Q/CiAgKyBob3cgc2ltaWxhciBpcyB0aGUgZGlzdHJpYnV0aW9uIG9mIHRvdGFsIG1lZGFscyBiZXR3ZWVuIGNvbnRpbmVudHM/CiAgKyBkbyBjb3VudHJpZXMgd2l0aCBhIGxhcmdlciBwb3B1bGF0aW9uIHRlbmQgdG8gd2luIG1vcmUgbWVkYWxzPwogICsgZG8gY291bnRyaWVzIHdpdGggbGFyZ2VyIEdEUCB0ZW5kIHRvIHdpbiBtb3JlIG1lZGFscz8KICArIGFyZSB0aGVzZSB0cmVuZHMgY29uc2lzdGVudCBhbW9uZyBkaWZmZXJlbnQgY29udGluZW50cz8KICAKKioqKioqCioqKioqKgoqKioqKioKCiMgVXNpbmcgUiB0byBhbmFseXNlIFJOQWkgZGF0YQoKWW91IHNob3VsZCBoYXZlIGFjY2VzcyB0byBhbiBleGFtcGxlIFJOQWkgc2NyZWVuIGRhdGFzZXQ7IGBybmFpLmNzdmAuIEluIFJTdHVkaW8sIGNyZWF0ZSBhIG5ldyBwcm9qZWN0IGluIHRoZSBkaXJlY3RvcnkgdGhhdCB0aGlzIGBjc3ZgIGlzIHNhdmVkLiAKCllvdSBjYW4gY3JlYXRlIGEgbmV3IG1hcmtkb3duIGZpbGUgdG8gZG9jdW1lbnQgdGhlIGFuYWx5c2lzIHRocm91Z2ggdGhlIG1lbnVzLgoKPGRpdiBjbGFzcz0iYWxlcnQgYWxlcnQtaW5mbyI+CgojIyMjICoqRmlsZSAtPiBOZXcgRmlsZSAtPiBSIG1hcmtkb3duID4gKiogCgo8L2Rpdj4KClRoZSBmaXJzdCBjb2RlIGNodW5rIHNob3VsZCBtYWtlIHN1cmUgd2UgaGF2ZSB0aGUgYHRpZHl2ZXJzZWAgcGFja2FnZSBsb2FkZWQuCgpgYGB7ciBtZXNzYWdlPUZBTFNFfQpsaWJyYXJ5KHRpZHl2ZXJzZSkKYGBgCgoKYGBge3IgbWVzc2FnZT1GQUxTRSxlY2hvPUZBTFNFfQpybmFpIDwtIHJlYWRfY3N2KCJybmFpLmNzdiIpCnJuYWkKYGBgCgpBcyBhIHF1YWxpdHkgY29udHJvbCBtZWFzdXJlLCB3ZSBjYW4gbG9vayBhdCB0aGUgc2lnbmFsIGZyb20gZWFjaCBwbGF0ZSBhbmQgbWFrZSBzdXJlIHRoZXJlIGFyZSBubyBzeXN0ZW1hdGljIGRpZmZlcmVuY2VzLiBUaGlzIGNvdWxkIGJlIHZpc3VhbGlzZWQgYXMgYSBib3hwbG90IHdpdGggYGdlb21fYm94cGxvdGAgYXMgd2UgaGF2ZSBzZWVuIGJlZm9yZS4KCmBgYHtyfQpnZ3Bsb3Qocm5haSwgYWVzKHg9YXMuZmFjdG9yKFBsYXRlKSwgeT1gRkwvUkxfcmVwMWApKSArIGdlb21fYm94cGxvdCgpCmBgYAoKV2UgYWxzbyBoYXZlIHNvbWUgcG9zaXRpb25hbCBpbmZvcm1hdGlvbiBhdCBvdXIgZGlzcG9zYWwgdGhhdCBjb3VsZCBiZSBvZiBpbnRlcmVzdC4gVGhpcyBpcyBlbmNvZGVkIGluIHRoZSB3ZWxsIG5hbWUsIGJ1dCB3ZSBjYW4gc3BsaXQgdGhlIHdlbGwgbmFtZSBpbnRvIGEgcm93IGFuZCBjb2x1bW4gSUQgdXNpbmcgYSBgbXV0YXRlYCBjb21tYW5kIHdpdGggYHN1YnN0cmAgdG8gZXh0cmFjdCB0aGUgbGV0dGVyIGFuZCBudW1iZXIgb2YgdGhlIHdlbGwuCgpgYGB7cn0Kcm5haSA8LSBtdXRhdGUocm5haSxSb3cgPSBzdWJzdHIoV2VsbCwxLDEpLCBDb2w9c3Vic3RyKFdlbGwsMiwzKSkKcm5haQpgYGAKClRoaXMgY2FuIGJlIHZpc3VhbGlzZWQgdXNpbmcgYSB0eXBlIG9mIGBnZW9tYCB0aGF0IHdlIGhhdmVuJ3Qgc2VlbiBiZWZvcmUgY2FsbGVkIGBnZW9tX3RpbGVgIHRoYXQgcHJvZHVjZXMgYSAiaGVhdG1hcCIgc3R5bGUgb2YgcGxvdC4KCmBgYHtyIGZpZy53aWR0aD0xMn0KZ2dwbG90KHJuYWksIGFlcyh4PVJvdyx5PUNvbCxmaWxsPWBGTC9STF9yZXAxYCkpICsgZ2VvbV90aWxlKCkgKyBmYWNldF93cmFwKH5QbGF0ZSkKYGBgCgoqKioqKioKKioqKioqCioqKioqKgoKIyMjIEV4ZXJjaXNlCgotIFRyeSBhbmQgcmUtcHJvZHVjZSB0aGUgYGdlb21fYm94cGxvdGAgYW5kIGBnZW9tX3RpbGVgIGFib3ZlIHdpdGggdGhlIGBGTC9STF9BdmAgY29sdW1uIGluc3RlYWQgb2YgYEZML1JMX1JlcDFgCiAgICArIFIgd2lsbCBzdHJ1Z2dsZSB0byBwcm9kdWNlIHRoZSBwbG90LiBMb29rIGF0IHRoZSBjb250ZW50cyBvZiB0aGlzIGNvbHVtbiBhbmQgdGhpbmsgYWJvdXQgd2h5IHRoaXMgbWlnaHQgaGFwcGVuPwotIEFwcGx5IGFwcHJvcHJpYXRlIGZpbHRlcmluZyB0byB0aGUgZGF0YSBpbiBvcmRlciB0byBwcm9kdWNlIGEgYm94cGxvdCBvZiB0aGUgYEZML1JMX0F2YCB2YWx1ZXMgZm9yIGVhY2ggUGxhdGUKICAgICsgSElOVDogdGhlIFIgY29kZSBgIWlzLm5hYCBjYW4gYmUgdXNlZCB0byBjaGVjayBpZGVudGlmeSB3aGF0IHJvd3MgZG8gbm90IGNvbnRhaW4gYW4gYE5BYCB2YWx1ZSBpbiBhIHBhcnRpY3VsYXIgY29sdW1uCiAgICAKKioqKioqCioqKioqKgoqKioqKioKCkJlZm9yZSBwcm9jZWVkaW5nIHRvIGZ1cnRoZXIgYW5hbHlzaXMgd2UgY2FuIGNsZWFuIHRoZSBkYXRhc2V0IHNvIHRoYXQgd2Ugb25seSBoYXZlIHRoZSBnZW5lIGlkZW50aWZpZXJzIGFuZCBhdmVyYWdlIGBGTGAgYW5kIGBSTGAgdmFsdWVzIGFzIGNvbHVtbnMuCgpgYGB7cn0Kcm5haV9jbGVhbmVkIDwtIGZpbHRlcihybmFpLCAhaXMubmEoQktOKSkgJT4lIAogIG11dGF0ZShGTD1hcy5udW1lcmljKGBGTF9BdmApLCBSTD1hcy5udW1lcmljKGBSTF9BdmApLCBSYXRpbyA9IGFzLm51bWVyaWMoYEZML1JMX0F2YCkpICU+JSAKICBzZWxlY3QoR2VuZVN5bWJvbCwgR2VuZUlELEZMLFJMLFJhdGlvKQpybmFpX2NsZWFuZWQKYGBgCgpUaGUgR2VuZSBpZGVudGlmaWVycyBhcmUgbm90IHBhcnRpY3VsYXJseSB1c2VmdWwgYnkgdGhlbXNlbHZlcyBidXQgd2UgY2FuIHVzZSByZXNvdXJjZXMgc3VjaCBhcyBbQmlvbWFydF0oaHR0cHM6Ly93d3cuZW5zZW1ibC5vcmcvYmlvbWFydCkgdG8gbWFwIHRoZXNlIHRvIG1vcmUgcmVjb2duaXNhYmxlIG5hbWVzLCBhbmQgYWxzbyBvYnRhaW4gaG9tb2xvZ291cyBodW1hbiBnZW5lcy4gVGhlIGBiaW9tYVJ0YCBwYWNrYWdlIGlzIGFibGUgdG8gcGVyZm9ybSB0aGUgbWFwcGluZywgYW5kIHdpdGhvdXQgZ29pbmcgaW50byB0b28gbXVjaCBkZXRhaWwgaGVyZSBpcyB0aGUgY29kZSB0byBkbyBzby4KCmBgYHtyIG1lc3NhZ2U9RkFMU0V9CmxpYnJhcnkoYmlvbWFSdCkKZ2VuZXNfb2ZfaW50ZXJlc3QgPC1ybmFpX2NsZWFuZWQkR2VuZUlECmVuc2VtYmwgPC0gdXNlRW5zZW1ibChiaW9tYXJ0ID0gImVuc2VtYmwiLAogICAgICAgICAgICAgICAgICAgICAgZGF0YXNldCA9ICJkbWVsYW5vZ2FzdGVyX2dlbmVfZW5zZW1ibCIpCgphbm5vIDwtIGdldEJNKGMoImVuc2VtYmxfZ2VuZV9pZCIsImV4dGVybmFsX2dlbmVfbmFtZSIsICJoc2FwaWVuc19ob21vbG9nX2Vuc2VtYmxfZ2VuZSIsImhzYXBpZW5zX2hvbW9sb2dfYXNzb2NpYXRlZF9nZW5lX25hbWUiLCJoc2FwaWVuc19ob21vbG9nX29ydGhvbG9neV9jb25maWRlbmNlIiksICJleHRlcm5hbF9nZW5lX25hbWUiLCBnZW5lc19vZl9pbnRlcmVzdCwgZW5zZW1ibCkgJT4lIAogIHJlbmFtZShHZW5lSUQgPSBleHRlcm5hbF9nZW5lX25hbWUpCmBgYAoKVGhpcyBjYW4gbm93IGJlIGpvaW5lZCB0byB0aGUgY2xlYW5lZCBkYXRhc2V0IHVzaW5nIHRoZSBgbGVmdF9qb2luYCBjb21tYW5kIGZyb20gYmVmb3JlLgoKYGBge3J9CnJuYWlfY2xlYW5lZCA8LSBsZWZ0X2pvaW4ocm5haV9jbGVhbmVkLCBhbm5vKQpybmFpX2NsZWFuZWQKYGBgCgpBIGJhc2ljIHNjYXR0ZXIgcGxvdCBvZiBgRkxgIGFuZCBgUkxgIHJldmVhbHMgc29tZSBleHRyZW1lIHZhbHVlcyBpbiB0aGUgZGF0YXNldC4KCmBgYHtyfQpnZ3Bsb3Qocm5haV9jbGVhbmVkLCBhZXMoeCA9IEZMLCB5ID0gUkwsY29sPVJhdGlvKSkgKyBnZW9tX3BvaW50KCkKCmBgYAoKKioqKioqCioqKioqKgoqKioqKioKCiMjIyBFeGVyY2lzZQoKLSBJZGVudGlmeSB0aGUgbmFtZXMgb2YgdGhlIGdlbmVzIHRoYXQgaGF2ZSBleHRyZW1lIGBSTGAgb3IgYEZMYCB2YWx1ZXMKLSBQcm9kdWNlIGEgdmVyc2lvbiBvZiB0aGUgcGxvdCB0aGF0IGhhcyB0aGUgbmFtZSBvZiB0aGVzZSBnZW5lcyBsYWJlbGxlZAogICAgKyBjaGVjayB0aGUgaWRlbnRpdHkgb2YgdGhlc2UgZ2VuZXMuIERvIHRoZXkgaGF2ZSBhbnkgYmlvbG9naWNhbCBzaWduaWZpY2FuY2U/CgpgYGB7ciBlY2hvPUZBTFNFfQpnZ3Bsb3Qocm5haV9jbGVhbmVkLCBhZXMoeCA9IEZMLCB5ID0gUkwsY29sPVJhdGlvKSkgKyBnZW9tX3BvaW50KCkgKyBnZW9tX3RleHQoZGF0YT0gZmlsdGVyKHJuYWlfY2xlYW5lZCwgRkw+OCB8IFJMID4gMTAwKSxhZXMobGFiZWw9R2VuZUlEKSxjb2w9ImJsYWNrIikKYGBgCgotIERlY2lkZSBvbiBzdWl0YWJsZSBjdXQtb2ZmcyBvbiB0aGUgcmF0aW8gb2YgYFJMYCB0byBgRkxgIHRvIGlkZW50aWZ5IGdlbmVzIG9mIGludGVyZXN0Ci0gV3JpdGUgdGhlIHJvd3MgY29ycmVzcG9uZGluZyB0byB0aGVzZSBnZW5lcyB0byBhIGNzdiBmaWxlIGZvciBmdXJ0aGVyIHByb2Nlc3NpbmcKCgoKCg==