Everything in R is an object. There are five basic or “atomic” classes of objects. The most basic object is a vector, which can contain only objects of the same class. The one exception is a list, which is represented as a vector but can contain objects of different classes; indeed, that’s usually how lists are used.

Objects

  • character
  • numeric (real numbers)
  • integer
  • complex (e.g. 3 - 4i)
  • logical (TRUE/FALSE)

Numbers are treated as double-precision real numbers by default. To specify an integer explicitly, use the L suffix (for example, 1L). There are two special numbers: Inf, which stands for infinity, and NaN, which stands for "not a number".
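
As a brief illustration of the paragraph above, the following sketch compares a plain number with one written with the L suffix and shows how Inf and NaN can arise. The variable names are illustrative only.

```r
# A number typed without a suffix is stored as a double ("numeric")
a <- 1
class(a)              # "numeric"

# The L suffix requests an integer explicitly
b <- 1L
class(b)              # "integer"

# Inf can be used in ordinary arithmetic
pos_inf <- 1 / 0      # Inf
zero <- 1 / pos_inf   # 0

# NaN results from an undefined operation
undefined <- 0 / 0    # NaN
```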

Objects in R can have attributes

  • names, dimnames
  • dimensions (matrices, arrays)
  • class
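
A minimal sketch of how these attributes can be inspected with the attributes() function. The objects m and v are made up for the example.

```r
# A matrix carries a "dim" attribute
m <- matrix(1:6, nrow = 2, ncol = 3)
attributes(m)                  # a list containing $dim

# A plain vector has no attributes until some are set
v <- 1:3
before <- attributes(v)        # NULL
names(v) <- c("a", "b", "c")
after <- attributes(v)         # a list containing $names
```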

Vectors

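The c() function is used to create vectors. The vector() function creates a vector of a given class and length, filled with that class's default value.

x <- c(1, 3, 4, 7)
x <- vector("numeric", length = 2)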

When objects of different classes are mixed in a vector, coercion occurs so that every element in the vector is of the same class. Objects can be explicitly coerced using the as.* functions. If a nonsensical coercion is attempted, the result is NA.

x <- 0:6
class(x)
as.numeric(x)
as.logical(x)
as.character(x)
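
The two kinds of coercion described above can be sketched as follows. The example values are arbitrary, and suppressWarnings() is used only to keep the output quiet, since R normally warns when a coercion introduces NA.

```r
# Implicit coercion: c() converts every element to one common class
class(c(1.7, "a"))     # "character"
class(c(TRUE, 2))      # "numeric" (TRUE becomes 1)
class(c("a", TRUE))    # "character"

# Explicit coercion that makes no sense yields NA
bad <- suppressWarnings(as.numeric(c("a", "b", "c")))
bad                    # NA NA NA

# as.logical() on numbers: 0 is FALSE, any other number is TRUE
flags <- as.logical(0:2)
flags                  # FALSE TRUE TRUE
```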

Lists

Lists are very important in R. When a list is printed, the index of each element is shown in double square brackets, and elements can be accessed by their index using the same notation.

x <- list(1, "a", TRUE)
x[[2]]
# [1] "a"
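
A short sketch contrasting double and single square brackets on a list. The names second and wrapped are made up for the example.

```r
x <- list(1, "a", TRUE)

# Double brackets extract the element itself
second <- x[[2]]
class(second)       # "character"

# Single brackets return a sub-list that still wraps the element
wrapped <- x[2]
class(wrapped)      # "list"

# Each element keeps its own class
classes <- sapply(x, class)
classes             # "numeric" "character" "logical"
```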

Matrices

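Matrices are special vectors in R with a dimension attribute, which is itself an integer vector of length 2 (nrow, ncol). Matrices are constructed column-wise: entries fill the first column from top to bottom, then the second column, and so on.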

m <- matrix(nrow = 2, ncol = 3)
dim(m)
attributes(m)
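
To make the column-wise construction concrete, the sketch below fills a matrix from the sequence 1:6 and reads two entries back. The values are chosen only for illustration.

```r
# Values fill the first column top to bottom, then the second, and so on
m <- matrix(1:6, nrow = 2, ncol = 3)
m
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

m[2, 1]   # 2: second row, first column
m[1, 2]   # 3: first row, second column
```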

Matrices can also be created directly from vectors by adding a dimension attribute.

m <- 1:10
dim(m) <- c(2,5)

Matrices can also be created by column-binding or row-binding with the cbind() and rbind() functions.

x <- 1:3
y <- 10:12
cbind(x,y)
rbind(x,y)

Factors

Factors are used to represent categorical data and can be ordered or unordered. One can think of a factor as an integer vector where each integer has a label. Using factors with labels is better than using plain integers because factors are self-describing.

The order of the levels can be set using the levels argument of factor(). This can be important in linear modeling because, under the default treatment contrasts, the first level is used as the baseline level.

x <- factor(c("yes", "yes", "no", "yes", "no"), levels = c("yes", "no"))
table(x)
unclass(x)
attr(x, "levels")
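
The effect of the levels argument can be seen by comparing it with the default, where levels are sorted alphabetically. This is a small sketch; f_default and f_custom are names made up for the example.

```r
answers <- c("yes", "yes", "no", "yes", "no")

# Without the levels argument, levels are sorted alphabetically
f_default <- factor(answers)
levels(f_default)        # "no" "yes"

# With the levels argument, "yes" becomes the first level
f_custom <- factor(answers, levels = c("yes", "no"))
levels(f_custom)         # "yes" "no"

# The underlying integer codes follow the level order
as.integer(f_default)    # 2 2 1 2 1
as.integer(f_custom)     # 1 1 2 1 2
```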

Missing values

Missing values are denoted by NA, or by NaN for undefined mathematical operations. NA values also have a class, so there are integer NA, character NA, and so on. A NaN value is also an NA value, but the converse is not true.

x <- c(1, 2, NA, 10, 3)
is.na(x)
is.nan(x)
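
The vector above contains no NaN, so is.nan() returns FALSE everywhere. The sketch below uses a vector with both kinds of missing value to show the asymmetry, and lists a few of the typed NA constants. The vector y is made up for the example.

```r
y <- c(1, NA, NaN, 4)

# NaN counts as NA, but NA does not count as NaN
na_flags <- is.na(y)     # FALSE TRUE TRUE FALSE
nan_flags <- is.nan(y)   # FALSE FALSE TRUE FALSE

# NA comes in typed variants
class(NA)                # "logical"
class(NA_integer_)       # "integer"
class(NA_character_)     # "character"
```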

Data frames

Data frames are used to store tabular data. They are represented as a special type of list in which every element has to have the same length. Each element can be thought of as a column, and the length of each element is the number of rows. Unlike matrices, data frames can store a different class of object in each column. Data frames have a special attribute called row.names.

x <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))
x
nrow(x)
ncol(x)
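
The list-like nature of a data frame can be checked directly. The sketch below rebuilds the same data frame and inspects it as a list; col_classes is a name made up for the example.

```r
x <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))

# A data frame is a list whose elements are the columns
is.list(x)                        # TRUE
length(x)                         # 2, the number of columns

# Each column keeps its own class
col_classes <- sapply(x, class)
col_classes                       # foo: "integer", bar: "logical"

# Default row names are the row numbers, stored as character strings
row.names(x)                      # "1" "2" "3" "4"
```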

Names

R objects can have names, which is useful for writing readable code.

x <- 1:3
names(x)
x
names(x) <- c("foo", "bar", "norf")
x
x <- list(a = 1, b = 2, c = 3)
x
m <- matrix(1:4, nrow = 2, ncol = 2)
dimnames(m) <- list(c("a", "b"), c("c", "d"))
m
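
Once names are set, they can also be used to pick out elements, which is a large part of what makes the code readable. A minimal sketch, reusing the names from the example above with lst as a made-up list name:

```r
# Named vector: index by name
x <- c(foo = 1, bar = 2, norf = 3)
x[["bar"]]       # 2

# Named list: $ or [[ ]] with the name
lst <- list(a = 1, b = 2, c = 3)
lst$b            # 2

# Matrix with dimnames: index by row name and column name
m <- matrix(1:4, nrow = 2, ncol = 2)
dimnames(m) <- list(c("a", "b"), c("c", "d"))
m["a", "d"]      # 3
```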
