doing something for every element of an object

doing something until the processed data runs out

doing something for every file in a folder

doing something that can fail, until it succeeds

iterating a calculation until it converges

**Quebec Centre for Biodiversity Science**

R Workshop Series

**Workshop 5: Programming in R**

**Website**: http://qcbs.ca/wiki/r_workshop5

Reuse and share your code

Achieve greater consistency

Avoid copy/paste errors

Avoid reinventing the wheel

Redo your analysis quickly and easily

Use R to do repetitive tasks for you

Understand what R is doing to your data

Do analyses that nobody has prepackaged

It’s fun! (no, really!)

Why program in R?

Install R (or use it on the computers here):

http://cran.r-project.org/

Install an R environment, such as RStudio:

http://rstudio.org/

Download the slides and the .R script:

http://bit.ly/yRjShO

Twiddle your thumbs impatiently

Pre-workshop:

Iteration

if statement

Beware of R’s expression parsing!

Another example of a for loop.

Outline

Control Flow

Writing functions in R

Speeding up your code

Useful R packages for biologists

**1. Control Flow**

if(condition) {

expression

}

Use curly brackets { } so that R knows to expect more input.

Try:

if ((2+2) == 4) print("Arithmetic works.")

else print("Houston, we have a problem.")

This doesn't work because R evaluates the first line as soon as it is complete and doesn't know that you are going to use an `else` statement.

Instead use:

if ((2+2) == 4) {

print("Arithmetic works.")

} else {

print("Houston, we have a problem.")

}

When using brackets, R waits to evaluate the command until the brackets have been closed.

Syntax

The example above would cause R to evaluate the expression 5 times. In the first iteration, R would replace every instance of `i` with 1. In the second iteration, `i` would be replaced with 2, and so on.
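A small sketch of this behaviour, assuming a loop over 1:5 (the vector name `squares` is only illustrative): on each pass `i` holds the next value of the sequence, so the body stores 1 squared, then 2 squared, and so on.

```r
# Preallocate a vector with one slot per iteration.
squares <- numeric(5)

for (i in 1:5) {
  # i is 1 on the first pass, 2 on the second, ... up to 5
  squares[i] <- i^2
}

print(squares)
```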

for (m in 4:10) {

print(m*2)

}

Try

**Writing Functions**

- Perform a task repeatedly, but configurably

- Make your code more readable

- Make your code easier to modify and maintain

- Share code between different analyses

- Share code with other people

- Modify R’s built-in functionality

Why write functions?

What is a function?

Syntax

function_name <- function(argument1, argument2, ...) {

...

expression

# What we want the function to do

...

return(value) # Optional.

}

Arguments

function_name <- function(

argument1, argument2, ...

) {

expression

# What we want the function to do

return(value) # Optional.

}

The entry values of the function, the information required for the function to work.

They are variables available only in the function.

A function can have any number of arguments, including none.
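As a rough illustration, here is one function without arguments and one with two. The names `say_hello` and `rectangle_area` are made up for this sketch and are not part of the workshop script.

```r
# A function with no arguments: everything it needs is inside the body.
say_hello <- function() {
  "Hello!"
}

# A function with two arguments: width and height exist only inside the function.
rectangle_area <- function(width, height) {
  width * height
}

say_hello()                             # called with empty parentheses
rectangle_area(3, 4)                    # arguments matched by position
rectangle_area(height = 4, width = 3)   # or by name, in any order
```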

Arguments

print_number <- function(number) {

print(number)

}

> print_number(2)

> print_number(231)

Arguments

operations <- function(number1, number2, number3) {

result <- (number1 + number2) * number3

print(result)

}

> operations(1, 2, 3)

> operations(17, 23, 2)

With more than one argument :

Example

The expression part can contain virtually anything: statements, loops, conditional statements, even other functions.

Challenge 5

Using what you learned previously on flow control, create a function `print_animal` that takes an animal as argument and gives the following results:

> Scruffy <- "dog"

> Paws <- "cat"

> print_animal(Scruffy)

[1] "woof"

> print_animal(Paws)

[1] "meow"

Challenge 5

Solution

print_animal <- function(animal) {

if (animal == "dog") {

print("woof")

} else if (animal == "cat") {

print("meow")

}

}

Default values

operations <- function(number1, number2,

number3=3

) {

result <- (number1 + number2) * number3

print(result)

}

> operations(1, 2, 3) # becomes equivalent to

> operations(1, 2)

> operations(1, 2, 2) # number3 can still be changed

Default values save you from writing every argument each time you call the function, while keeping it flexible.

The ... argument

plot.CO2 <- function(CO2, ...) {

# We use ... to pass on arguments to plot()

plot(x=CO2$conc, y=CO2$uptake, type="n", ...)

for (i in 1:length(CO2[,1])){

if (CO2$Type[i] == "Quebec") {

# same for points

points(CO2$conc[i], CO2$uptake[i], col="red", type="p", ...)

} else if (CO2$Type[i] == "Mississippi") {

# same for points()

points(CO2$conc[i], CO2$uptake[i], col="blue", type="p", ...)

}

}

}

> plot.CO2(CO2, cex.lab=1.4, xlab="CO2 concentration", ylab="CO2 uptake")

> plot.CO2(CO2, cex.lab=1.4, xlab="CO2 concentration", ylab="CO2 uptake", pch=20)

- To pass on arguments to another function used inside your function

The ... argument

sum2 <- function(...){

args <- list(...)

result <- 0

for (i in args) {

result <- result + i

}

return (result)

}

> sum2(2, 3)

> sum2(2, 4, 5, 7688, 1)

- To allow the user to input an indefinite number of arguments

Return value

returntest <- function(a, b) {

return (a) # The function exits here

a <- a + b # Not interpreted

return (a) # Not interpreted

}

> returntest(2, 3) # Prints the return value of your function

> c <- returntest(2, 3) # assign it to another variable to save it

> c

Allows you to save the result of your function and use it later.

Only one return value can be provided by a function.

The function will exit once it reaches a `return()` call.
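Since a function returns a single object, a common way to hand back several results is to bundle them in a list. The sketch below assumes a hypothetical helper called `describe`; the element names are arbitrary.

```r
# Hypothetical helper: summarise a numeric vector.
# The three results are bundled into one list, which is the single return value.
describe <- function(x) {
  result <- list(minimum = min(x), maximum = max(x), average = mean(x))
  return(result)
}

stats <- describe(c(2, 4, 9))
stats$minimum   # access each element by name with $
stats$average
```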

Challenge 6

Using what you learned so far on functions and flow control, create a function `bigsum` that takes two arguments `a` and `b` and:

- returns 0 if the sum of a and b is strictly less than 50

- returns the sum of a and b otherwise

Challenge 6

Solution

bigsum <- function(a, b) {

result <- a + b

if (result < 50) {

return(0)

} else {

return (result)

}

}

Accessibility of variables

Always keep in mind where your variables are and if they are accessible.

Variables defined inside a function are not accessible outside

Variables defined outside a function are accessible inside. But it is NEVER a good idea!

rm(list=ls()) # remove everything to avoid any confusion

var1 <- 3 # var1 is defined outside our function

vartest <- function() {

a <- 4 # a is defined inside

print(a) # print a

print(var1) # print var1

}

a # print a. Error, a can be seen only inside the function

vartest() # calling vartest() will print a and var1

rm(var1) # remove var1

vartest() # calling the function again doesn't work anymore

Accessibility of variables

Instead, use arguments. Inside a function, argument names take precedence over other variables with the same name.

var1 <- 3 # var1 is defined outside our function

vartest <- function(var1) {

print(var1) # print var1

}

vartest(8) # Inside our function var1 is now our argument

var1 # var1 still has the same value

Be careful when creating variables inside a conditional statement.

a <- 3

if (a > 5) {

b <- 2

}

a + b # Error, b is not created because a < 5

Accessibility of variables

It is usually a good practice to define variables outside the conditions and then modify their value to avoid any problem

a <- 3

b <- 0

if (a > 5) {

b <- 2

}

a + b

**Good programming practices**

To make your life easier!!!

It helps achieve greater readability and makes sharing and reusing your code a lot less painful.

Having easy-to-read code reduces the time you spend understanding it, so the effort is never wasted.

Why?

Proper indentation and spacing is the first step to getting easy-to-read code. Here are some suggestions:

Use spaces before and after your operators

Use the same assignment operator consistently. `<-` is often preferred, `=` is ok, but don't switch back and forth between the two

Use brackets when using flow control statements

Inside brackets, indent by at least two spaces.

Put closing brackets on a separate line, except when preceding an else statement.

Define each variable on its own line

Keep a clean code

a<-4;b=3

if(a<b){

if(a==0)print("a zero") } else {

if(b==0){print("b zero")} else print(b)}

What not to do

Keep a clean code

a <- 4

b <- 3

if(a < b){

if(a == 0) {

print("a zero")

}

} else {

if(b == 0){

print("b zero")

} else {

print(b)

}

}

Takes more space but easier to read and understand

Use functions

for (i in 1:length(CO2[,1])) {

if(CO2$Type[i] == "Mississippi") {

CO2$conc[i] <- CO2$conc[i] - 20

}

}

for (i in 1:length(CO2[,1])) {

if(CO2$Type[i] == "Quebec") {

CO2$conc[i] <- CO2$conc[i] + 50

}

}

Helps reduce the number of errors made by copying/pasting similar chunks of code and reduces the time needed if we want to change them.

Let's modify the example from exercise 3 and suppose that all CO2 uptake values from Mississippi were overestimated by 20 and those from Quebec underestimated by 50.

We could write this :

Use functions

recalibrate <- function(CO2, type, bias) {

for (i in 1:nrow(CO2)) {

if(CO2$Type[i] == type) {

CO2$conc[i] <- CO2$conc[i] + bias

}

}

# return the new dataset

return (CO2)

}

# don't forget to save the results!

newCO2 <- recalibrate(CO2, "Mississippi", -20)

newCO2 <- recalibrate(newCO2, "Quebec", +50)

Or this :

Use functions

recalibrate <- function(CO2, type, bias) {

for (i in 1:nrow(CO2)) {

if(CO2$Type[i] == type) {

CO2$uptake[i] <- CO2$uptake[i] + bias

}

}

# return the new dataset

return (CO2)

}

Whoops, we made a mistake:

Yay, fewer changes to make. And it looks waaay cooler!!

Use meaningful names

rc <- function(c, t, b) {

for (i in 1:nrow(c)) {

if(c$Type[i] == t) {

c$uptake[i] <- c$uptake[i] + b

}

}

return (c)

}

Same function, stupid names, way harder to understand at first sight :

Comments

## recalibrates the CO2 dataset by modifying the CO2 uptake concentration

## by a fixed amount depending on the region of sampling

# Arguments

# CO2: the CO2 dataset

# type: the type that needs to be recalibrated. Values: "Mississippi" or "Quebec"

# bias: the amount to add to the uptake. Use negative values for overestimations

recalibrate <- function(CO2, type, bias) {

for (i in 1:nrow(CO2)) {

if(CO2$Type[i] == type) {

CO2$uptake[i] <- CO2$uptake[i] + bias

}

}

# we have to return our new dataset because the original is not modified

return (CO2)

}

To help the others and yourself!!

**Speeding up your code**

Because if we want to optimize, we will need to know how much time it takes!

Profiling

system.time({

a <- 0

for (i in 1:1000) {

a <- a + i

}

})

system.time(replicate(1000, {

a <- 0

for (i in 1:1000) {

a <- a + i

}

}))

Repeating our code might be necessary for time to be measurable

To have a more detailed output of the time spent in each function, you can use the function Rprof()

Profiling

Rprof("profile.txt") # Saves results in file profile.txt

a <- 0

for (i in 1:1000000) {

a <- a + i

}

Rprof(NULL) # Ends the profiling

summaryRprof("profile.txt") # Display the result of profiling

To compare the efficiency of several functions precisely, you can use the package microbenchmark.

Profiling

install.packages("microbenchmark")

library(microbenchmark)

f1 <- function() {

a <- 0

for (i in 1:1000) {

a <- a + i

}

}

# The argument times sets the number of iterations

microbenchmark(f1(), times=1000)

When we want to speed up our code, the first thing we should do is look at it and ask ourselves the following questions:

Is my code ok?

Is everything useful?

Do I repeat some tasks needlessly?

Are there other ways to do that?

To program efficiently, we have to think efficiently first and remove everything that can be removed.

This usually also makes the code simpler to read.

First step : thinking

Let's create a function that:

- Takes a number `a`

- Adds `a` to every number from 1 to 100

- If `a` is less than 5, adds `2*a` instead

- Sums all the elements of the modified sequence

First step : thinking

Here's a way to do it:

f2 <- function(a) {

# initialize our result

result <- 0

# iterate on the sequence from 1 to 100

for (i in 1:100) {

if (a < 5) {

# a is < 5: we add 2 * a to the sequence element and accumulate it in result

result <- result + i + (2 * a)

} else {

# a is >= 5: we add a itself to the sequence element

result <- result + i + a

}

}

return(result)

}

f2(4)

Our previous example works well. However, a is constant so it's useless to check if it is less than 5 in each iteration.

Here's another more efficient way:

First step : thinking

f3 <- function(a) {

# initialize our result

result <- 0

# Check if a < 5 and double it if true

if (a < 5) {

a <- 2 * a

}

# We don't even need an else here since a remains the same otherwise

# iterate on the sequence from 1 to 100

for (i in 1:100) {

result <- result + i + a

}

return(result)

}

f3(4)

microbenchmark(f2(4),

f3(4), times=1000)

Just by thinking a little bit, our code became faster and easier to understand.

Now we can do even better with the power of R

First step : thinking

f4 <- function(a) {

result <- 0

if (a < 5) {

a <- a * 2

}

result <- sum(1:100 + a)

return(result)

}

f4(4)

microbenchmark(f3(4), f4(4), times=10000)

Wow our code just got ten times faster and also smaller...

What just happened?

R is an interpreted language whose interpreter is itself written largely in C.

R code is slower because every expression has to be interpreted at run time before the underlying C routines are called.

Some R functions are direct links to C functions and are therefore way faster and optimized.

R is usually optimized for vectorization, i.e. operations on whole vectors.

So it is usually way faster to perform operations directly on vectors instead of looping over them.
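To make the contrast concrete, here is a minimal sketch that performs the same operation once with a loop and once on the whole vector. The variable names are illustrative only, and no timing is claimed here; the point is that both give the same values.

```r
x <- 1:1000

# Loop version: add 2 to each element, one at a time.
loop_result <- numeric(length(x))
for (i in seq_along(x)) {
  loop_result[i] <- x[i] + 2
}

# Vectorized version: the same operation applied to the whole vector at once.
vector_result <- x + 2

all.equal(loop_result, vector_result)   # both approaches give the same values
```

Wrapping each version in `system.time()` or `microbenchmark()` would show how they compare on your machine.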

Vectorization

v1 <- 1:5

v2 <- 2:6

v3 <- 1:3

v1 + 2 # Addition on a vector : adds 2 to all elements

v1 + v2 # Adds each element of v2 to the matching element of v1

v1 + v3 # v3 is recycled since it is shorter than v1 (with a warning, because 5 is not a multiple of 3)

sum(v1) # Adds all elements of v1 together

sum(v1, v2) # Sums all elements of v1 and v2

mean(v1) # Average of elements in v1

mean(c(v1, v2)) # Average of elements of v1 and v2.

Here are some examples of operations on vectors

Subsetting / logical indexing

Extracts data way faster than loops.

Is done with the `[ ]` operator by providing a set of indexes or a condition that selects the elements to keep.

v1 <- 1:10

v1[7] # Extracts the 7th value

v1[v1 > 5] # Extracts values > 5 only

v1[which(v1 > 5)] # same as before

In data frames, the `$` operator gives direct access to columns.

Remember that columns of a data frame are always vectors

data(CO2)

CO2$Type # Prints the column Type

CO2[, "Type"] # Same as above

CO2[CO2$Type == "Quebec", ] # Extracts all rows of the CO2 dataset where the Type is "Quebec"
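Conditions can also be combined, and rows and columns can be selected in one step. The following sketch uses the built-in CO2 dataset; the object names `quebec_chilled` and `high_uptake` and the threshold of 40 are arbitrary choices for illustration.

```r
data(CO2)

# Combine two conditions with & (element-wise AND) to select rows.
quebec_chilled <- CO2[CO2$Type == "Quebec" & CO2$Treatment == "chilled", ]
nrow(quebec_chilled)

# Select rows and columns at the same time.
high_uptake <- CO2[CO2$uptake > 40, c("Plant", "uptake")]
head(high_uptake)
```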

Challenge 7

Create a new function `recalibrate2()` that rewrites the function `recalibrate()` seen earlier using subsetting and vectorization techniques.

The new function should not be longer than 3 lines.

Reminder:

Challenge 7

Solution

recalibrate2 <- function(CO2, type, bias) {

# First get the indexes of the rows with the matching type

# Thinking tip: since we use the indexes twice below, instead of evaluating

# the condition twice, let's do it only once and save the result!

idx <- CO2$Type == type

# Modify only the data concerned using indexes.

CO2$uptake[idx] <- CO2$uptake[idx] + bias

return (CO2)

}

# Check the results are the same

all.equal(recalibrate(CO2, "Quebec", 20), recalibrate2(CO2, "Quebec", 20))

# Check that this is indeed way faster

microbenchmark(recalibrate(CO2, "Quebec", 20),

recalibrate2(CO2, "Quebec", 20))

recalibrate <- function(CO2, type, bias) {

for (i in 1:nrow(CO2)) {

if(CO2$Type[i] == type) {

CO2$uptake[i] <- CO2$uptake[i] + bias

}

}

return (CO2)

}

Growing objects

Sometimes loops can't be avoided. In these cases, pay extra attention to objects that grow with each iteration.

Take these two functions

growing <- function(n) {

# declare our result

result <- NULL

for (i in 1:n) {

# create our result by growing our object

result <- c(result, i)

}

return(result)

}

growing2 <- function(n) {

# declare our result : here we create a vector of length n with 0 in it

result <- numeric(n)

for (i in 1:n) {

# now we just modify our value instead of recreating the vector

result[i] <- i

}

return(result)

}

Growing objects

system.time({

growing(10000)

})

system.time({

growing2(10000)

})

system.time({

growing(50000)

})

system.time({

growing2(50000)

})

Let's compare their speeds

Performance dropped so much because each call to c(result, i) creates a new, longer vector and copies every existing element into it.

So as your object grows, the time needed to copy it when calling c() increases.

This problem is resolved by preallocating your result object and filling it.

Growing objects

The same problem appears with data frames and functions like cbind() or rbind().

However it is a bit more complex.

growingdf <- function(n, row) {

# preallocate our dataframe

df <- data.frame(numeric(n), character(n), stringsAsFactors=FALSE)

for (i in 1:n) {

# replace the ith row with row

df[i,] <- row

}

return(df)

}

growingdf2 <- function(n, row) {

# this is the way to allocate a list with n elements

df <- vector("list", n)

for (i in 1:n) {

# put row in the ith element

df[[i]] <- row

}

return(do.call(rbind, df))

}

# store our row in a list since we have different types

row <- list(1, "Hello World")

microbenchmark(growingdf(5000, row),

growingdf2(5000, row),

times=10)

The apply family

Help avoid the growing object problem.

Not always the best solution performance-wise (they sometimes hide a for loop).

Make it easy to apply a function to the rows or columns of a data frame.

df <- data.frame(1:100, 101:200)

# Sum on rows

apply(df, 1, sum)

# Mean on columns

apply(df, 2, mean)

# we can also supply additional arguments to the function

apply(df, 2, mean, na.rm=TRUE)

# we can also define a function directly. The first argument is always what

# we iterate on. Here each row is treated as a vector of numbers as we can

# see with the str() function

apply(df, 1, function(x){str(x)})

# We can also add other arguments

apply(df, 1, function(x, y){x[2] - x[1] + y}, y=5)

The apply family

When looking for speed, the most interesting apply functions are probably lapply(), sapply() and vapply(): lapply() and vapply() do their looping in C, and sapply() is a wrapper around lapply(). But they are more complex to use.

a <- list(1:100, 101:200)

# apply mean to each element of the list

lapply(a, mean) # we get a list as a result

unlist(lapply(a, mean)) # use unlist to get a vector instead

sapply(a,mean) #Same result

vapply(a, mean, 0) # the result of mean is a single number, we tell vapply our result will be a number
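One reason to prefer vapply() is that it checks each result against the template you declare. The sketch below spells the template out as `FUN.VALUE = numeric(1)` and uses tryCatch() only to show the failure without stopping the script; the message text is made up for the example.

```r
a <- list(1:100, 101:200)

# FUN.VALUE describes what each result must look like: here, one number.
means <- vapply(a, mean, FUN.VALUE = numeric(1))
means

# range() returns two numbers, so declaring numeric(1) is an error.
# tryCatch() catches that error and returns a message instead.
outcome <- tryCatch(
  vapply(a, range, FUN.VALUE = numeric(1)),
  error = function(e) "vapply stopped: result had the wrong length"
)
outcome
```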

But remember...

Before spending time speeding up your code, first ask yourselves :

Is it really worth it??

Because, sometimes, spending 1 hour optimizing your code to effectively save 15 seconds on computing time is just not really that good a deal...

if and if/else test a single condition

Use the `ifelse()` function to:

test a vector of conditions

apply a function only under certain conditions

What if you want to test more than one thing?

a <- 1:10

ifelse(a > 5, "yes", "no")

a <- (-4):5

sqrt(ifelse(a >= 0, a, NA))

== equal to

!= not equal to

!x not x

< less than

<= less than or equal to

> greater than

>= greater than or equal to

x&y x AND y

x|y x OR y

isTRUE(x) test if x is TRUE

Remember the logical operators
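A short sketch of how these operators behave on vectors versus single values. Note that `&&`, `||`, `any()` and `all()` are not in the list above; they are standard R and are included here because `if` expects a single TRUE or FALSE.

```r
x <- c(1, 6, 10)

# & and | work element by element and return one value per element.
x > 5 & x < 10     # FALSE TRUE FALSE
x < 5 | x > 8      # TRUE FALSE TRUE

# && and || expect single values, which is what if() needs.
value <- 7
in_range <- value > 5 && value < 10
if (in_range) {
  print("value is between 5 and 10")
}

# any() and all() collapse a logical vector to a single TRUE or FALSE.
any(x > 5)
all(x > 5)
```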

Exercise 1

Paws <- "cat"

Scruffy <- "dog"

Sassy <- "cat"

animals <- c(Paws, Scruffy, Sassy)

1. Use an if statement to print “meow” if Paws is a “cat”.

2. Use an if/else statement to print “woof” if you supply an object that is a “dog” and “meow” if it is not. Try it out with Paws and Scruffy.

3. Use the ifelse function to display “woof” for animals that are dogs and “meow” for animals that are cats.

The letter 'i' can be replaced with any variable name and the sequence can be almost anything, even a list of vectors.

for (a in c("Hello", "R", "Programmers")) {

print(a)

}

for (z in 1:30) {

a <- rnorm(n = 1, mean = 5, sd = 2)

print(a)

}

elements <- list(1:3, 4:10)

for (element in elements) {

print(element)

}

Loops are often used to loop over a dataset. We will use loops to perform functions on the CO2 dataset which is built in to R.

data(CO2) # This loads the built in dataset

for (i in 1:length(CO2[,1])) { # for each row in the CO2 dataset

print(CO2$conc[i]) #print the CO2 concentration

}

for (i in 1:length(CO2[,1])) { # for each row in the CO2 dataset

if(CO2$Type[i] == "Quebec") { # if the type is "Quebec"

print(CO2$conc[i]) #print the CO2 concentration

}

}

# Tip 1 : to get the number of rows of a data frame, we can also use the function nrow

for (i in 1:nrow(CO2)) { # for each row in the CO2 dataset

print(CO2$conc[i]) #print the CO2 concentration

}

# Tip 2 : If we want to perform operations on only the elements of one column, we can directly

# iterate over it.

for (i in CO2$conc) { # for every element of the concentration column of the CO2 dataset

print(i) # print the current element

}

The expression part of the loop can be almost anything and is usually a compound statement containing many commands.

for (i in 4:5) { # for i in 4 to 5

print(colnames(CO2)[i])

print(mean(CO2[,i]))

}

Note that this could be done more quickly using apply(), but that wouldn't teach you about loops. We will talk about it later.
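For reference, a loop-free version of the same calculation might look like the sketch below. It uses sapply(), a relative of apply() that works column by column on a data frame; the name `column_means` is an illustrative choice.

```r
data(CO2)

# Columns 4 and 5 of CO2 are conc and uptake; sapply() applies mean() to each.
column_means <- sapply(CO2[, 4:5], mean)
column_means
```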

Exercise 2

You have realized that your tool for measuring uptake was not calibrated properly at Quebec sites and all measurements are 2 units higher than they should be. Use a loop to correct these measurements for all Quebec sites.

Make sure you reload the data so that we are working with the raw data for the rest of the exercise:

data(CO2)

Modifying iterations

Normally, loops iterate over and over until they finish.

To change this behavior, you can use:

- `break` breaks out of the loop's execution entirely

- `next` stops executing the current iteration and jumps to the next iteration
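A tiny sketch of both statements in one loop, using made-up rules (skip even numbers, stop at the first odd number above 7). The vector `kept` is grown with c() only because the example is so small.

```r
kept <- c()
for (i in 1:10) {
  if (i %% 2 == 0) next   # even number: skip the rest of this iteration
  if (i > 7) break        # first odd number above 7: leave the loop
  kept <- c(kept, i)
}
kept
```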

count <- 0

for (i in 1:length(CO2[,1])) {

if (CO2$Treatment[i] == "nonchilled") next

#Skip to next iteration if treatment is nonchilled

count <- count + 1

print(CO2$conc[i])

}

print(count)

# The count and print command were performed 42 times.

count <- 0

i <- 0

repeat {

i <- i + 1

if (CO2$Treatment[i] == "nonchilled") next

# next tells R to skip the rest of this iteration

count <- count + 1

print(CO2$conc[i])

if (i == length(CO2[,1])) break # stop looping

}

print(count)

Example

Print the CO2 concentrations for "chilled" treatments and keep count of how many replications there were.

This could be equivalently written using a repeat loop:

Example (Continued)

This could be equivalently written using a while loop:

i <- 0

count <- 0

while (i < length(CO2[,1]))

{

i <- i + 1

if (CO2$Treatment[i] == "nonchilled") next # skip this iteration

count <- count + 1

print(CO2$conc[i])

}

print(count)

Exercise 3

Make sure you reload the data so that we are working with the raw data for the rest of the exercise:

data(CO2)

You have realized that your tool for measuring concentration didn't work properly. At Mississippi sites, concentrations less than 300 were measured correctly but concentrations >= 300 were overestimated by 20 units. Use a loop to correct these measurements for all Mississippi sites.

Using flow control to make a complex plot

Dataset

concentration

uptake

type (Quebec or Mississippi)

treatment (chilled or nonchilled)

How do we plot the points differently to show types and treatments?

plot(x=CO2$conc, y=CO2$uptake, type="n", cex.lab=1.4, xlab="CO2 concentration", ylab="CO2 uptake") # Type "n" tells R to not actually plot the points.

for (i in 1:length(CO2[,1])) {

if (CO2$Type[i] == "Quebec" & CO2$Treatment[i] == "nonchilled") {

points(CO2$conc[i], CO2$uptake[i], col="red",type="p")

}

if (CO2$Type[i] == "Quebec" & CO2$Treatment[i] == "chilled") {

points(CO2$conc[i], CO2$uptake[i], col="blue")

}

if (CO2$Type[i] == "Mississippi" & CO2$Treatment[i] == "nonchilled") {

points(CO2$conc[i], CO2$uptake[i], col="orange")

}

if (CO2$Type[i] == "Mississippi" & CO2$Treatment[i] == "chilled") {

points(CO2$conc[i], CO2$uptake[i], col="green")

}

}
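The four `if` blocks above can also be expressed as a lookup, which avoids the loop entirely. This is only a sketch of one alternative, assuming the same colour choices as above; `colour_key`, `group` and `point_cols` are illustrative names.

```r
data(CO2)

# Named lookup vector: one colour per Type/Treatment combination
# (same colours as in the loop version).
colour_key <- c("Quebec nonchilled"      = "red",
                "Quebec chilled"         = "blue",
                "Mississippi nonchilled" = "orange",
                "Mississippi chilled"    = "green")

# Build one label per row, then look up its colour.
group <- paste(CO2$Type, CO2$Treatment)
point_cols <- unname(colour_key[group])

table(point_cols)
```

Passing `point_cols` as the `col` argument of `plot()` would then colour every point in a single call.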

Using control flow to make a complex plot

head(CO2) # Look at the dataset

unique(CO2$Type)

unique(CO2$Treatment)

Generate a plot showing concentration versus uptake where each plant is shown using a different colour point.

Bonus points for doing it with nested loops!

Exercise 4

`while` loops and `repeat` loops operate similarly to `for` loops

Once you understand how for loops work, you should be able to use any type of loop.

You will see some examples of while loops and repeat loops in the next section.

while loops and repeat loops

**Other packages of interest**

knitr

Write R code in Markdown or in LaTeX

Compile or 'knit' code to HTML, PDF or Word.

Shiny: similar concept, but for interactive web documents

Note that there are other tools to create a complex plot (such as ggplot which was covered in workshop 3)

nested loops

In some cases, you may want to use nested loops to accomplish a task. When using nested loops, it is important to use different variables as counters for each of your loops (here we used i and n).

for (i in 1:5) {

for (n in 1:5) {

print (i*n)

}

}
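Building on the nested loop above, the sketch below stores each product instead of printing it, using `i` for rows and `n` for columns. The matrix name `products` is an assumption of this example, and outer() is shown only as a cross-check.

```r
# Fill a 5 x 5 multiplication table with two nested loops.
# i indexes rows, n indexes columns; each loop has its own counter.
products <- matrix(0, nrow = 5, ncol = 5)
for (i in 1:5) {
  for (n in 1:5) {
    products[i, n] <- i * n
  }
}
products

# outer() builds the same table without explicit loops.
all.equal(products, outer(1:5, 1:5))
```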

Program flow control can be simply defined as the order in which a program is executed

Control Flow

Flow charts can be used to plan programs, and represent structure.

Coded Solutions

It decreases the complexity and time of the task at hand.

This logical structure also means that the code has increased clarity.

It also means that many programmers can work on one program. This means increased productivity.

Why is it advantageous to have structured programs?

Start of a process and has only one output.

This is the operation carried out.

These blocks must have an input and output.

Boolean choice: it has one input and two outputs.

Opposite of the ‘Start’ symbol.

An example of a real life awkward situation

Representing structure

The two basic building blocks of code are the following:

- Selection: the program's execution is determined by statements (`if`, `if else`)

- Iteration: repetition, where the statement will loop until a criterion is met (`for`, `while`, `repeat`)

if(condition) {

expression 1

} else {

expression 2

}

if ... else statement

Decision making

Decision making is an important part of programming

if (test_expression1) {

statement1

} else if (test_expression2) {

statement2

} else if (test_expression3) {

statement3

} else

statement4
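A concrete instance of this pattern, as a sketch: the hypothetical function `describe_sign` tests its conditions from top to bottom and runs only the first branch whose condition is TRUE.

```r
# Hypothetical helper: classify a single number with an if ... else if chain.
describe_sign <- function(x) {
  if (x > 0) {
    label <- "positive"
  } else if (x < 0) {
    label <- "negative"
  } else {
    # reached only when neither condition above was TRUE
    label <- "zero"
  }
  return(label)
}

describe_sign(3)
describe_sign(-2)
describe_sign(0)
```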

nested if ... else statement

Not convinced that your life is a program?

Let us take a look at our graduate lives!

Every time some operation(s) has to be repeated, a loop may come in handy.

for (i in 1:5) {

expression

}

for statement

for (val in sequence) {

statement

}

x <- c(2,5,3,9,6)

count <- 0

for (val in x) {

if (val %% 2 == 0) {

count <- count + 1

}

}

print(count)

[1] 2

for statement

while statement

while (test_expression) {

statement

}

i <- 1

while (i < 6) {

print(i)

i = i+1

}

while statement

[1] 1

[1] 2

[1] 3

[1] 4

[1] 5
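A while loop is also a natural fit for iterating a calculation until it converges. The sketch below approximates the square root of 2 with Newton's method; the starting value and the tolerance are arbitrary choices for illustration.

```r
# Approximate sqrt(2) with Newton's method, looping until the estimate
# stops changing by more than a small tolerance.
target <- 2
estimate <- 1
tolerance <- 1e-8
change <- Inf
iterations <- 0

while (change > tolerance) {
  new_estimate <- (estimate + target / estimate) / 2
  change <- abs(new_estimate - estimate)
  estimate <- new_estimate
  iterations <- iterations + 1
}

estimate     # close to sqrt(2)
iterations   # number of passes the loop needed
```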

break statement

for (val in x) {

if (condition){

break

}

statement

}

for (val in x) {

if (condition){

next

}

statement

}

next statement

repeat statement

repeat {

statement

}
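Because repeat has no condition of its own, the body must contain a break. The sketch below simulates "doing something that can fail, until it succeeds" with a die roll; the seed, the success rule and the safety limit of 100 attempts are all assumptions of this example.

```r
set.seed(42)   # fixed seed so the sketch is reproducible

attempts <- 0
repeat {
  attempts <- attempts + 1
  draw <- sample(1:6, 1)        # simulated task: roll a die
  if (draw == 6) break          # success: leave the loop
  if (attempts >= 100) break    # safety limit so the loop always ends
}
attempts
```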

for (i in 1:length(CO2[,1])) {

if(CO2$Type[i] == "Mississippi") {

if(CO2$conc[i] < 300) next

CO2$conc[i] <- CO2$conc[i] - 20

}

}

# Note : We could also have written it that way, which is more concise and clear

for (i in 1:nrow(CO2)) {

if(CO2$Type[i] == "Mississippi" && CO2$conc[i] >= 300) {

CO2$conc[i] <- CO2$conc[i] - 20

}

}

for (i in 1:length(CO2[,1])) {

if(CO2$Type[i] == "Quebec") {

CO2$uptake[i] <- CO2$uptake[i] - 2

}

}

tapply(CO2$uptake,CO2$Type,mean)

plot(x=CO2$conc, y=CO2$uptake, type="n", cex.lab=1.4,xlab="CO2 concentration", ylab="CO2 uptake")

# Type "n" tells R to not actually plot the points.

plants <- unique(CO2$Plant)

for (i in 1:length(CO2[,1])){

for (p in 1:length(plants)) {

if (CO2$Plant[i] == plants[p]) {

points(CO2$conc[i], CO2$uptake[i], col=p, type="p")

}

}

}

library(RgoogleMaps)

myhome=getGeoCode('Stewart Biology Building, Montreal');

mymap<-GetMap(center=myhome, zoom=14)

PlotOnStaticMap(mymap,lat=myhome['lat'],lon=myhome['lon'],

cex=5,pch=10,lwd=3,col=c('red'));

RgoogleMaps

Thank you for coming!