R Programming/Advanced programming

Conditional execution
 * Help for programming:
> ?Control

if accepts only a single logical value as its condition, that is, a vector of length one:
if (condition){ statement } else { alternative }
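As a minimal sketch of this rule, the example below uses made-up values x and v (not from the original text). It shows a plain if/else on a length-one condition, then reduces a vector condition to a single value with any() before testing it. Since R 4.2.0, passing a condition of length greater than one to if is an error.

```r
# x and v are hypothetical values chosen for illustration
x <- 3
if (x > 2) {
  msg <- "x is greater than 2"
} else {
  msg <- "x is at most 2"
}
print(msg)

# A vector condition must first be reduced to a single TRUE/FALSE,
# for instance with any() or all()
v <- c(1, 5, 10)
if (any(v > 8)) {
  print("at least one value exceeds 8")
}
```

Note that at the top level else has to sit on the same line as the closing brace of the if block, otherwise R treats the if statement as complete before it reads the else.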

The ifelse function takes the condition as its first argument, the value to return where the condition is true as its second argument, and the value to return where the condition is false as its third argument. Unlike if, the condition can be a vector. For instance, we generate a sequence from 1 to 10 and keep the values which are lower than 5 or greater than 8, replacing the others with 0.
> x <- 1:10
> ifelse(x<5 | x>8, x, 0)
 [1]  1  2  3  4  0  0  0  0  9 10

Loops
R provides three ways to write loops: for, repeat and while. The for statement is the simplest. You define an index (here k) and a vector to iterate over (in the example below the vector is 1:5), and you specify the action you want between braces.
> for (k in 1:5){
+ 	print(k)
+ }
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
When a for statement is not suitable, for instance because the number of iterations is not known in advance, you can use repeat or while with a stopping rule. Be careful with this kind of loop: if the stopping rule is misspecified, the loop will never end. In the two examples below, values are drawn from the standard normal distribution as long as the drawn value is lower than 1. The cat function is used to display the current value on screen. Note that the while version also prints the final draw, the one that ends the loop.
> repeat {
+ 	g <- rnorm(1)
+ 	if (g > 1.0) break
+ 	cat(g,"\n")
+ 	}
-1.214395
0.6393124
0.05505484
-1.217408
> g <- 0
> while (g < 1){
+ 	g <- rnorm(1)
+ 	cat(g,"\n")
+ 	}
-0.08111594
0.1732847
-0.2428368
0.3359238
-0.2080000
0.05458533
0.2627001
1.009195
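One way to guard against a loop that never ends is to cap the number of iterations. The following is only a sketch: the limit max_iter, the counter n_iter and the seed are assumptions introduced here, not part of the original example.

```r
set.seed(1)        # arbitrary seed, only to make the run reproducible
max_iter <- 1000   # hypothetical upper bound on the number of draws
n_iter <- 0
g <- 0

# Stop either when a draw reaches 1 or when the cap is hit
while (g < 1 && n_iter < max_iter) {
  g <- rnorm(1)
  n_iter <- n_iter + 1
}
cat("stopped after", n_iter, "draws; last value:", g, "\n")
```

With this guard, a misspecified stopping rule costs at most max_iter iterations instead of hanging the session.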

Implicit loops
Explicit loops are often slow in R, so it is better to avoid them when a vectorized function or one of the following functions can do the job.
 * apply applies a function over the margins of a matrix or an array. For a matrix, this may be the rows (1) or the columns (2).
 * lapply applies a function to each element of a list, which for a dataframe means each column, and returns a list.
 * sapply is similar but the output is simplified. It may be a vector or a matrix depending on the function.
 * tapply applies the function for each level of a factor.

> N <- 10
> x1 <- rnorm(N)
> x2 <- rnorm(N) + x1 + 1
> male <- rbinom(N,1,.48)
> y <- 1 + x1 + x2 + male + rnorm(N)
> mydat <- data.frame(y,x1,x2,male)
> lapply(mydat,mean) # returns a list
$y
[1] 3.247

$x1
[1] 0.1415

$x2
[1] 1.29

$male
[1] 0.5

> sapply(mydat,mean) # returns a vector
     y     x1     x2   male
3.2468 0.1415 1.2900 0.5000
> apply(mydat,1,mean) # applies the function to each row
 [1]  1.1654  2.8347 -0.9728  0.6512 -0.0696  3.9206 -0.2492  3.1060  2.0478  0.5116
> apply(mydat,2,mean) # applies the function to each column
     y     x1     x2   male
3.2468 0.1415 1.2900 0.5000
> tapply(mydat$y,mydat$male,mean) # applies the function to each level of the factor
    0     1
1.040 5.454
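To make the link between explicit and implicit loops concrete, here is a small sketch that computes column means three ways. The matrix m and the result names are made up for illustration. colMeans is a dedicated base R function for this particular task.

```r
# Small made-up matrix: 3 rows, 2 columns
m <- matrix(1:6, nrow = 3)

# Explicit loop: preallocate the result, then fill it in
col_means_loop <- numeric(ncol(m))
for (j in seq_len(ncol(m))) {
  col_means_loop[j] <- mean(m[, j])
}

# Implicit loop with apply (2 = columns)
col_means_apply <- apply(m, 2, mean)

# Dedicated vectorized function
col_means_vec <- colMeans(m)

print(col_means_loop)
print(col_means_apply)
print(col_means_vec)
```

All three give the same values. The implicit forms are shorter and leave the iteration to R.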


 * See also aggregate, which is similar to tapply but is applied to a dataframe instead of a vector.
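The difference between tapply and aggregate can be sketched as follows, using a made-up dataframe d whose column names are illustrative only. The formula interface of aggregate is one of several ways to call it.

```r
# Made-up dataframe; the column names are illustrative only
d <- data.frame(
  y    = c(1, 2, 3, 4, 5, 6),
  x    = c(10, 20, 30, 40, 50, 60),
  male = c(0, 0, 0, 1, 1, 1)
)

# tapply: one vector, grouped by one factor
by_group_vec <- tapply(d$y, d$male, mean)

# aggregate: several columns at once, grouped by male,
# with the result returned as a dataframe
by_group_df <- aggregate(cbind(y, x) ~ male, data = d, FUN = mean)

print(by_group_vec)
print(by_group_df)
```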

Iterators

 * Loops in R are generally slow. The iterators package may be more efficient than loops. See this entry in the Revolution Computing Blogs.
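As a rough illustration of what an iterator looks like, the sketch below assumes the iterators package is installed and uses its iter and nextElem functions. It shows only the basic mechanics and makes no claim about speed. The variable names are made up.

```r
# Assumes the 'iterators' package is installed; otherwise nothing is run
if (requireNamespace("iterators", quietly = TRUE)) {
  it <- iterators::iter(1:3)          # iterator over the vector 1:3
  first  <- iterators::nextElem(it)   # first element of the vector
  second <- iterators::nextElem(it)   # second element of the vector
  print(c(first, second))
}
```

Each call to nextElem returns the next value and advances the iterator, so the values are produced one at a time on request.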