Multivariable Mathematics for Data Science
1 Preface
This book introduces students to the multivariable calculus and linear algebra methods and techniques needed to be successful in data science, statistics, computer science, and other data-driven, computational disciplines.
The motivation for this text is to provide both a theoretical understanding of important multivariable methods used in data science and hands-on experience using software. Throughout this text, we assume the reader has a solid foundation in univariate calculus (typically two semesters) as well as familiarity with a scripting language (e.g., R or Python).
1.1 Getting started in R
TBD
1.2 Some videos that explain useful concepts of linear algebra
1.3 Notation
For notation, we let lowercase Roman letters represent scalar numbers (e.g., \(n = 5\), \(d = 7\)) and lowercase bold letters represent vectors
\[ \begin{aligned} \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \end{aligned} \]
where the elements \(x_1, \ldots, x_n\) are scalars written in lowercase Roman. Note that vectors are assumed to follow a vertical notation where the elements of the vector (the \(x_i\)s) are stacked on top of one another and the order matters. For example, the vector
\[ \begin{aligned} \mathbf{x} & = \begin{pmatrix} 5 \\ 2 \\ 8 \end{pmatrix} \end{aligned} \]
has the first element \(x_1 = 5\), second element \(x_2 = 2\) and third element \(x_3 = 8\). Note that the vector \(\begin{pmatrix} 5 \\ 2 \\ 8 \end{pmatrix}\) is not the same as the vector \(\begin{pmatrix} 8 \\ 2 \\ 5 \end{pmatrix}\) because the order of the elements matters.
We can also write the vector as
\[ \begin{aligned} \mathbf{x} = \left( x_1, x_2, \ldots, x_n \right)', \end{aligned} \]
where the \('\) symbol represents the transpose function. For our example vector, we have \(\begin{pmatrix} 5 \\ 2 \\ 8 \end{pmatrix}' = \begin{pmatrix} 5 & 2 & 8 \end{pmatrix}\), which is the original vector arranged in a row rather than a column. Likewise, the transpose of a row vector, \(\begin{pmatrix} 5 & 2 & 8 \end{pmatrix}' = \begin{pmatrix} 5 \\ 2 \\ 8 \end{pmatrix}\), is a column vector. If \(\mathbf{x}\) is a column vector, then \(\mathbf{x}'\) is a row vector, and if \(\mathbf{x}\) is a row vector, then \(\mathbf{x}'\) is a column vector.
To create a vector in R, we can use the concatenate function c(). For example, the vector \(\mathbf{x} = \begin{pmatrix} 5 \\ 2 \\ 8 \end{pmatrix}\) can be created as the R object x using
x <- c(5, 2, 8)
where the <- operator assigns the values in the vector c(5, 2, 8) to the object named x. To print the values of x, we can use
x
[1] 5 2 8
which prints the elements of x. Notice that R prints the elements of \(\mathbf{x}\) in a row, even though we treat \(\mathbf{x}\) as a column vector. R prints vectors this way because the output is easier to read (more numbers fit on a row). If we put the column vector into a data.frame, then the vector will be presented as a column
data.frame(x)
x
1 5
2 2
3 8
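The mismatch between how R prints x and how we think of \(\mathbf{x}\) can be inspected directly. The following is a minimal sketch, not part of the original example; the object name x_col is hypothetical and introduced only for illustration. It shows that a vector created with c() carries no dimension information, and that one way to store it explicitly as a column is to convert it to a matrix with a single column.

```r
x <- c(5, 2, 8)

# A vector created with c() has a length but no dimensions,
# so R itself does not record whether it is a row or a column.
length(x)  # [1] 3
dim(x)     # NULL

# Assumption for illustration: store x explicitly as a 3 x 1 matrix.
# x_col is a hypothetical name, not used elsewhere in the text.
x_col <- matrix(x, ncol = 1)
dim(x_col) # [1] 3 1
x_col      # now printed as a column
```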
One can use the index operator \([\ ]\) to select specific elements of the vector \(\mathbf{x}\). For example, the first element of \(\mathbf{x}\), \(x_1\), is
x[1]
[1] 5
and the third element of \(\mathbf{x}\), \(x_3\), is
x[3]
[1] 8
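The index operator also accepts a vector of positions, which lets us select several elements at once. As a small sketch that goes slightly beyond the examples above (the particular index choices are ours, for illustration only):

```r
x <- c(5, 2, 8)

# Select the first and third elements with a vector of indices.
x[c(1, 3)]  # [1] 5 8

# A negative index drops that position instead of selecting it.
x[-2]       # [1] 5 8

# A range of consecutive positions can be written with the colon operator.
x[2:3]      # [1] 2 8
```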
The transpose function t() turns a column vector into a row vector (or a row vector into a column vector). For example, the transpose \(\mathbf{x}'\) of \(\mathbf{x}\) is
tx <- t(x)
tx
[,1] [,2] [,3]
[1,] 5 2 8
where tx is the R object storing the transpose of \(\mathbf{x}\) and is a row vector. Notice the indices on the output of the row vector tx. The index operator [1, ] selects the first row of tx and the index operator [, 1] selects the first column of tx. Taking the transpose again gives us back the original column vector
t(tx)
[,1]
[1,] 5
[2,] 2
[3,] 8
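To make the row and column indices concrete, the short sketch below checks the dimensions of tx and of its transpose. The object name ttx is hypothetical and used only here.

```r
x <- c(5, 2, 8)
tx <- t(x)

# tx has one row and three columns.
dim(tx)    # [1] 1 3

# [1, ] selects the first (and only) row; [, 1] selects the first column.
tx[1, ]    # [1] 5 2 8
tx[, 1]    # [1] 5

# Transposing again gives three rows and one column (ttx is a hypothetical name).
ttx <- t(tx)
dim(ttx)   # [1] 3 1

# The entries of the twice-transposed vector match the original x.
all(ttx[, 1] == x)  # [1] TRUE
```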
1.3.1 Matrices
We let uppercase bold letters \(\mathbf{A}\), \(\mathbf{B}\), etc., represent matrices. We define the matrix \(\mathbf{A}\) with \(m\) rows and \(n\) columns as
\[ \begin{aligned} \mathbf{A} & = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}, \end{aligned} \]
with \(a_{ij}\) being the value of the matrix \(\mathbf{A}\) in the \(i\)th row and the \(j\)th column.
For example, if the matrix is
\[ \begin{aligned} \mathbf{A} & = \begin{pmatrix} 5 & 7 & 1 \\ 5 & -22 & 2 \\ -14 & 5 & 99 \\ 42 & -3 & 0\end{pmatrix}, \end{aligned} \]
then the elements are \(a_{11} = 5\), \(a_{12} = 7\), \(a_{21} = 5\), \(a_{33} = 99\), etc.
In R, we can define the matrix A using the matrix() function
A <- matrix(
  data = c(5, 5, -14, 42, 7, -22, 5, -3, 1, 2, 99, 0),
  nrow = 4,
  ncol = 3
)
A
[,1] [,2] [,3]
[1,] 5 7 1
[2,] 5 -22 2
[3,] -14 5 99
[4,] 42 -3 0
Notice that in the above creation of \(\mathbf{A}\), we supplied the elements of \(\mathbf{A}\) column by column, with the columns stacked on top of one another; by default, matrix() fills the matrix one column at a time. If we want to fill in the elements of \(\mathbf{A}\) row by row instead, we can add the option byrow = TRUE to the matrix() function
A <- matrix(
  data = c(5, 7, 1, 5, -22, 2, -14, 5, 99, 42, -3, 0),
  nrow = 4,
  ncol = 3,
  byrow = TRUE
)
A
[,1] [,2] [,3]
[1,] 5 7 1
[2,] 5 -22 2
[3,] -14 5 99
[4,] 42 -3 0
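As a quick check that the two ways of filling the matrix agree, the sketch below builds both versions and compares them. The names A_col and A_row are hypothetical and are introduced only to keep the two objects apart.

```r
# Filled column by column (the default behavior of matrix()).
A_col <- matrix(
  data = c(5, 5, -14, 42, 7, -22, 5, -3, 1, 2, 99, 0),
  nrow = 4,
  ncol = 3
)

# Filled row by row using byrow = TRUE.
A_row <- matrix(
  data = c(5, 7, 1, 5, -22, 2, -14, 5, 99, 42, -3, 0),
  nrow = 4,
  ncol = 3,
  byrow = TRUE
)

# Every entry of the two matrices agrees.
all(A_col == A_row)  # [1] TRUE
```

Which form to use is a matter of convenience: byrow = TRUE lets the data be typed in the same order in which the matrix is read on the page.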
To select the \(ij\)th element of \(\mathbf{A}\), we use the subset operator [ with the row index followed by the column index. For example, to get the element \(a_{11} = 5\) in the first row and first column of \(\mathbf{A}\), we use
A[1, 1]
[1] 5
The element \(a_{33} = 99\) in the third row and third column can be selected using
A[3, 3]
[1] 99
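The number of rows \(m\) and columns \(n\) can also be recovered from the R object, which is useful for checking that an index is in range. A minimal sketch, using the same matrix as above:

```r
A <- matrix(
  data = c(5, 5, -14, 42, 7, -22, 5, -3, 1, 2, 99, 0),
  nrow = 4,
  ncol = 3
)

nrow(A)  # [1] 4   (m, the number of rows)
ncol(A)  # [1] 3   (n, the number of columns)
dim(A)   # [1] 4 3

# The row index always comes first, then the column index.
A[4, 2]  # [1] -3  (fourth row, second column)
A[2, 3]  # [1] 2   (second row, third column)
```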
The matrix \(\mathbf{A}\) can also be represented as a set of either column vectors \(\{\mathbf{c}_j \}_{j=1}^n\) or row vectors \(\{\mathbf{r}_i \}_{i=1}^m\). For example, the column vector representation is
\[ \begin{aligned} \mathbf{A} & = \left( \mathbf{c}_{1} \middle| \mathbf{c}_{2} \middle| \cdots \middle| \mathbf{c}_{n} \right), \end{aligned} \]
where the notation \(|\) is used to separate the vectors
\[ \begin{aligned} \mathbf{c}_1 & = \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix}, & \mathbf{c}_2 & = \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{pmatrix}, & \cdots, & & \mathbf{c}_n & = \begin{pmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{pmatrix} \end{aligned} \]
In R, you can extract the columns using the [ selection operator
c1 <- A[, 1] # first column
c2 <- A[, 2] # second column
c3 <- A[, 3] # third column
and you can give the column representation of the matrix A with the column bind function cbind()
cbind(c1, c2, c3)
c1 c2 c3
[1,] 5 7 1
[2,] 5 -22 2
[3,] -14 5 99
[4,] 42 -3 0
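Notice that the output of cbind() carries the column names c1, c2, and c3, while the original A does not. The sketch below confirms that the entries are nevertheless the same. It also illustrates the drop = FALSE argument of the [ operator, which keeps an extracted column as a matrix with one column; the object name A_cols is hypothetical.

```r
A <- matrix(
  data = c(5, 5, -14, 42, 7, -22, 5, -3, 1, 2, 99, 0),
  nrow = 4,
  ncol = 3
)
c1 <- A[, 1]
c2 <- A[, 2]
c3 <- A[, 3]

# By default an extracted column is returned as a plain vector.
is.null(dim(c1))           # [1] TRUE

# drop = FALSE keeps the result as a 4 x 1 matrix.
dim(A[, 1, drop = FALSE])  # [1] 4 1

# Binding the columns back together reproduces the entries of A
# (A_cols is a hypothetical name used only in this sketch).
A_cols <- cbind(c1, c2, c3)
all(A_cols == A)           # [1] TRUE
colnames(A_cols)           # [1] "c1" "c2" "c3"
```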
The row vector representation of \(\mathbf{A}\) is
\[ \begin{aligned} \mathbf{A} & = \begin{pmatrix} \mathbf{r}_{1} \\ \mathbf{r}_{2} \\ \vdots \\ \mathbf{r}_{m} \end{pmatrix}, \end{aligned} \]
where the row vectors \(\mathbf{r}_i\) are
\[ \begin{aligned} \mathbf{r}_1 & = \left( a_{11}, a_{12}, \ldots, a_{1n} \right) \\ \mathbf{r}_2 & = \left( a_{21}, a_{22}, \ldots, a_{2n} \right) \\ & \vdots \\ \mathbf{r}_m & = \left( a_{m1}, a_{m2}, \ldots, a_{mn} \right) \end{aligned} \]
In R, you can extract the rows using the [ selection operator
r1 <- A[1, ] # first row
r2 <- A[2, ] # second row
r3 <- A[3, ] # third row
r4 <- A[4, ] # fourth row
and you can give the row representation of the matrix A with the row bind function rbind()
rbind(r1, r2, r3, r4)
[,1] [,2] [,3]
r1 5 7 1
r2 5 -22 2
r3 -14 5 99
r4 42 -3 0
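As with the columns, the stacked rows reproduce the entries of \(\mathbf{A}\), with the row names r1 through r4 attached by rbind(). A minimal sketch of that check follows; the object name A_rows is hypothetical.

```r
A <- matrix(
  data = c(5, 5, -14, 42, 7, -22, 5, -3, 1, 2, 99, 0),
  nrow = 4,
  ncol = 3
)
r1 <- A[1, ]
r2 <- A[2, ]
r3 <- A[3, ]
r4 <- A[4, ]

# Each row vector has n = 3 elements.
length(r1)        # [1] 3

# Stacking the rows reproduces the entries of A
# (A_rows is a hypothetical name used only in this sketch).
A_rows <- rbind(r1, r2, r3, r4)
dim(A_rows)       # [1] 4 3
all(A_rows == A)  # [1] TRUE
rownames(A_rows)  # [1] "r1" "r2" "r3" "r4"
```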