Linear Algebra in Data Science


Data science is the field of study that combines domain expertise, programming skills, and knowledge of mathematics and statistics to extract meaningful insights from data. Data science practitioners apply machine learning algorithms to numbers, text, images, video, audio, and more to produce artificial intelligence systems that perform tasks ordinarily requiring human intelligence.

Linear algebra is the branch of mathematics concerning linear equations, linear functions, and their representations through matrices and vector spaces. It lets us reason about geometric objects in higher dimensions and perform mathematical operations on them.

Linear algebra is at the heart of almost all areas of mathematics, from geometry to functional analysis, and its concepts are a crucial prerequisite for understanding the theory behind data science. A data scientist doesn’t need to master linear algebra before getting started, but at some point it becomes necessary to understand how the different algorithms really work. The sections below review the linear algebra most commonly used in data science.

Scalars, Vectors, Matrices and Tensors

• A scalar is a single number.

• A vector is a 1-D array of numbers.

• A matrix is a 2-D array.

• A tensor is an n-dimensional array with n > 2.

These four objects are the basic ways of representing data in data science using linear algebra.
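As a quick illustration of these four objects, here is a sketch in NumPy (the library choice is an assumption of this example, not something the text prescribes):

```python
import numpy as np

# A scalar: a single number (0-dimensional)
scalar = 3.5

# A vector: a 1-D array of numbers
vector = np.array([1.0, 2.0, 3.0])

# A matrix: a 2-D array
matrix = np.array([[1, 2], [3, 4]])

# A tensor: an n-dimensional array with n > 2
# (here a 3-D tensor, e.g. a tiny 2 x 2 RGB image with 3 channels)
tensor = np.zeros((2, 2, 3))

print(vector.ndim, matrix.ndim, tensor.ndim)  # 1 2 3
```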


Basic Concepts

Linear algebra provides a way of compactly representing and operating on sets of linear equations. For example, consider the following system of equations:

4x1 − 5x2 = −13

−2x1 + 3x2 = 9

These are two equations in two variables, so as you know from high-school algebra, you can find a unique solution for x1 and x2 (unless the equations are somehow degenerate, for example if the second equation is simply a multiple of the first; in the case above, however, there is in fact a unique solution). In matrix notation, we can write the system more compactly as

Ax = b

where A = [[4, −5], [−2, 3]] is the matrix of coefficients and b = [−13, 9] is the vector of right-hand sides.
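The system above can be solved numerically; a minimal sketch using NumPy's np.linalg.solve:

```python
import numpy as np

# The system  4x1 - 5x2 = -13,  -2x1 + 3x2 = 9  in matrix form Ax = b
A = np.array([[4.0, -5.0],
              [-2.0, 3.0]])
b = np.array([-13.0, 9.0])

x = np.linalg.solve(A, b)
print(x)  # [3. 5.]
```

The solution x1 = 3, x2 = 5 can be checked by substituting back into both equations.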


Basic Notations:

By x ∈ Rn, we denote a vector with n entries.

By A ∈ Rm×n, we denote a matrix with m rows and n columns, where the entries of A are real numbers.

The Identity Matrix

The identity matrix, denoted I ∈ Rn×n, is a square matrix with ones on the diagonal and zeros everywhere else. That is:

Iij = 1 if i = j, and Iij = 0 if i ≠ j.

It has the property that for all A ∈ Rm×n:

AI = A and IA = A

(where the two identity matrices are of the appropriate sizes, n×n and m×m respectively).
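A small numerical check of this property (NumPy sketch; np.eye builds an identity matrix):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])  # A is 2 x 3

I3 = np.eye(3)  # 3 x 3 identity
I2 = np.eye(2)  # 2 x 2 identity

# Multiplying by the identity (of the right size) leaves A unchanged
assert np.allclose(A @ I3, A)  # AI = A
assert np.allclose(I2 @ A, A)  # IA = A
```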

Vector-Vector Product

Given two vectors x, y ∈ Rn, the inner product or dot product is the scalar

x^T y = x1y1 + x2y2 + ··· + xnyn.

Given x ∈ Rm and y ∈ Rn, the outer product is the matrix

xy^T ∈ Rm×n, whose (i, j) entry is xiyj.
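Both products can be computed with NumPy (a sketch; the example vectors are chosen here for illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# Inner (dot) product: x^T y = sum of x_i * y_i, a scalar
inner = np.dot(x, y)
print(inner)  # 32.0  (1*4 + 2*5 + 3*6)

# Outer product: x y^T, a matrix whose (i, j) entry is x_i * y_j
outer = np.outer(x, y)
print(outer.shape)  # (3, 3)
```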

Matrix-Vector Product

If we write A by rows, then we can express Ax entry by entry: the i-th entry of Ax is the inner product of the i-th row of A with x, that is, (Ax)i = ai^T x.

If we write A by columns, then Ax is a linear combination of the columns of A, with the entries of x as the coefficients: Ax = x1 a1 + x2 a2 + ··· + xn an.
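Both views give the same result, which a short NumPy sketch can confirm (the matrix and vector here are arbitrary examples):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
x = np.array([10.0, 1.0])

# Row view: the i-th entry of Ax is the inner product of row i with x
by_rows = np.array([A[i] @ x for i in range(A.shape[0])])

# Column view: Ax is a linear combination of the columns of A,
# weighted by the entries of x
by_cols = x[0] * A[:, 0] + x[1] * A[:, 1]

assert np.allclose(by_rows, A @ x)
assert np.allclose(by_cols, A @ x)
```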

Matrix-Matrix Multiplication

The product C = AB of two matrices A ∈ Rm×n and B ∈ Rn×p can be viewed in four equivalent ways:

1. As a set of vector-vector products: the (i, j) entry of C is the inner product of the i-th row of A with the j-th column of B.

2. As a sum of outer products of the columns of A with the rows of B.

3. As a set of matrix-vector products: the j-th column of C is A times the j-th column of B.

4. As a set of vector-matrix products: the i-th row of C is the i-th row of A times B.
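All four views can be checked against NumPy's built-in @ operator (a sketch with arbitrary small random matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 4))
B = rng.random((4, 2))
C = A @ B  # reference result, 3 x 2

# 1. Vector-vector products: C[i, j] is the inner product of
#    row i of A with column j of B
C1 = np.array([[A[i] @ B[:, j] for j in range(B.shape[1])]
               for i in range(A.shape[0])])

# 2. Sum of outer products of columns of A with rows of B
C2 = sum(np.outer(A[:, k], B[k]) for k in range(A.shape[1]))

# 3. Matrix-vector products: column j of C is A times column j of B
C3 = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

# 4. Vector-matrix products: row i of C is row i of A times B
C4 = np.vstack([A[i] @ B for i in range(A.shape[0])])

assert all(np.allclose(C, Ck) for Ck in (C1, C2, C3, C4))
```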



The Transpose

The transpose of a matrix results from “flipping” the rows and columns. Given a matrix A ∈ Rm×n, its transpose, written A^T ∈ Rn×m, is the n×m matrix whose entries are given by (A^T)ij = Aji.

The following properties of transposes are easily verified:

• (A^T)^T = A

• (AB)^T = B^T A^T

• (A + B)^T = A^T + B^T
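These properties are easy to verify numerically (a NumPy sketch with arbitrary random matrices):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((3, 4))
B = rng.random((4, 2))
C = rng.random((3, 4))  # same shape as A, for the sum property

assert np.allclose(A.T.T, A)              # (A^T)^T = A
assert np.allclose((A @ B).T, B.T @ A.T)  # (AB)^T = B^T A^T
assert np.allclose((A + C).T, A.T + C.T)  # (A + B)^T = A^T + B^T
```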

The Inverse Of Square Matrix

The inverse of a square matrix A ∈ Rn×n is denoted A^-1, and is the unique matrix such that:

A^-1 A = I = A A^-1

We say that A is invertible or non-singular if A−1 exists and non-invertible or singular otherwise.

In order for a square matrix A to have an inverse A^-1, A must be full rank. Properties (assuming A, B ∈ Rn×n are non-singular):

• (A^-1)^-1 = A

• (AB)^-1 = B^-1 A^-1

• (A^-1)^T = (A^T)^-1
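A numerical sanity check of these properties (NumPy sketch; the random matrices are nudged toward the identity so they stay non-singular):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 3)) + 3 * np.eye(3)  # diagonally dominant, hence invertible
B = rng.random((3, 3)) + 3 * np.eye(3)

A_inv = np.linalg.inv(A)

assert np.allclose(A_inv @ A, np.eye(3))                    # A^-1 A = I
assert np.allclose(np.linalg.inv(A_inv), A)                 # (A^-1)^-1 = A
assert np.allclose(np.linalg.inv(A @ B),
                   np.linalg.inv(B) @ np.linalg.inv(A))     # (AB)^-1 = B^-1 A^-1
assert np.allclose(np.linalg.inv(A).T, np.linalg.inv(A.T))  # (A^-1)^T = (A^T)^-1
```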

The Determinant

Algebraically, the determinant satisfies the following three properties:

1. The determinant of the identity is 1, |I| = 1. (Geometrically, the volume of a unit hypercube is 1.)

2. Given a matrix A ∈ Rn×n, if we multiply a single row of A by a scalar t ∈ R, then the determinant of the new matrix is t|A|. (Geometrically, multiplying one of the sides of the corresponding parallelotope by a factor t scales its volume by a factor t.)

3. If we exchange any two rows ai^T and aj^T of A, then the determinant of the new matrix is −|A|.

The general (recursive) formula for the determinant is the Laplace expansion, e.g. along the first row:

|A| = Σ (from j = 1 to n) (−1)^(1+j) a1j |M1j|,

where M1j is the (n−1)×(n−1) matrix obtained by deleting row 1 and column j of A,

with the initial case that |A| = a11 for A ∈R1×1. If we were to expand this formula completely for A ∈Rn×n, there would be a total of n! (n factorial) different terms. For this reason, we hardly ever explicitly write the complete equation of the determinant for matrices bigger than 3×3.

However, the equations for determinants of matrices up to size 3×3 are fairly common, and it is good to know them:

For A ∈ R1×1: |A| = a11

For A ∈ R2×2: |A| = a11 a22 − a12 a21

For A ∈ R3×3: |A| = a11 a22 a33 + a12 a23 a31 + a13 a21 a32 − a11 a23 a32 − a12 a21 a33 − a13 a22 a31
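The three algebraic properties and the 2×2 formula can all be verified numerically (a NumPy sketch with an arbitrary example matrix):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# 2 x 2 formula: |A| = a11*a22 - a12*a21
det_2x2 = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
assert np.isclose(np.linalg.det(A), det_2x2)  # both equal -2

# Property 1: |I| = 1
assert np.isclose(np.linalg.det(np.eye(3)), 1.0)

# Property 2: scaling one row by t scales the determinant by t
t = 5.0
A_scaled = A.copy()
A_scaled[0] *= t
assert np.isclose(np.linalg.det(A_scaled), t * np.linalg.det(A))

# Property 3: swapping two rows negates the determinant
A_swapped = A[[1, 0]]
assert np.isclose(np.linalg.det(A_swapped), -np.linalg.det(A))
```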

Matrix Inverse

Matrix algebra provides tools for manipulating matrices and creating various useful formulas in ways similar to doing ordinary algebra with real numbers. For example, the (multiplicative) inverse of a real number, say 3, is 3^-1, or 1/3. This inverse satisfies the following equations:

3^-1 · 3 = 1 and 3 · 3^-1 = 1

This concept can be generalized for square matrices. An n x n matrix A is said to be invertible if there is an n x n matrix C such that:

CA = I and AC = I

Where I is the n x n identity matrix. An identity matrix is a square matrix with 1’s on the diagonal and 0’s elsewhere. Below, the 5 x 5 identity matrix is shown:

[1 0 0 0 0]
[0 1 0 0 0]
[0 0 1 0 0]
[0 0 0 1 0]
[0 0 0 0 1]

Going back to the invertibility principle above, we call the matrix C an inverse of A. In fact, C is uniquely determined by A, because if B were another inverse of A, then B = BI = B(AC) = (BA)C = IC = C. This unique inverse is denoted by A^-1, so that:

A^-1 A = I and A A^-1 = I
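A small check that a candidate inverse C satisfies both defining equations, compared against the closed-form 2×2 inverse (the example matrix here is chosen arbitrarily):

```python
import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])

C = np.linalg.inv(A)
I = np.eye(2)

# C satisfies both defining equations, so C is the inverse of A
assert np.allclose(C @ A, I)
assert np.allclose(A @ C, I)

# Closed-form 2 x 2 inverse: (1/|A|) * [[d, -b], [-c, a]]
a, b = A[0]
c, d = A[1]
C_formula = np.array([[d, -b], [-c, a]]) / (a * d - b * c)
assert np.allclose(C, C_formula)
```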



