Data Science Prerequisites: Linear Algebra (3/5)

If Data Science were Sherlock Holmes, Linear Algebra would be Watson. This faithful sidekick is often overlooked by professionals, and it is the main topic of this post.

Linear Algebra is hands-down the most important discipline to know before doing any machine learning work. Before data is fed to an algorithm, it usually needs to be represented in a tabular format with numeric values, where rows represent instances of an entity (e.g. a customer) and columns represent specific features or attributes of that entity (e.g. age or salary for the customer entity). In this structured shape, rows can be viewed as vectors in a vector space, or as data points in a geometric space where each column or attribute is a dimension of that space. The collection of rows (i.e. the tabular dataset) forms a matrix. And it so happens that Linear Algebra is the branch of Mathematics that studies vectors and matrices, their properties, and the operations used to manipulate them.
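
To make this concrete, here is a minimal sketch in Python with NumPy. The customer table and its values are made up for illustration; only the idea (rows as vectors, columns as dimensions, the whole table as a matrix) comes from the paragraph above.

```python
import numpy as np

# Hypothetical customer table: one row per customer, columns are [age, salary].
X = np.array([
    [25.0, 32000.0],
    [41.0, 58000.0],
    [33.0, 45000.0],
])

n_rows, n_cols = X.shape   # 3 instances (rows), 2 features (columns)
first_customer = X[0]      # one row: a vector in a 2-dimensional space
ages = X[:, 0]             # one column: the "age" dimension for every customer

print(n_rows, n_cols)
print(first_customer)
print(ages)
```

The dataset as a whole is the matrix `X`, and every operation discussed in the rest of this post applies to it directly.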

Aside from the theoretical reasons, knowledge of linear algebra also enables us to code our algorithms from scratch and tweak their inner workings when necessary. As an added benefit, it allows us to vectorize the relevant parts of our code (performing matrix multiplications instead of iterating with for loops), which makes it run much faster because low-level libraries are heavily optimized for matrix operations.
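
As a rough illustration of vectorization, the sketch below computes the same weighted sum per row in two ways: with explicit Python loops and with a single matrix-vector product. The data, the weight vector `w`, and the function names are hypothetical and only serve the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))   # made-up data: 1000 rows, 5 features
w = rng.normal(size=5)           # hypothetical weight vector

def predict_loop(X, w):
    """Weighted sum of the features of each row, using explicit loops."""
    out = np.zeros(X.shape[0])
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            out[i] += X[i, j] * w[j]
    return out

def predict_vectorized(X, w):
    """The same computation written as one matrix-vector product."""
    return X @ w

# Both versions agree up to floating-point rounding.
print(np.allclose(predict_loop(X, w), predict_vectorized(X, w)))
```

Timing the two functions (for instance with `timeit`) on larger inputs is a simple way to see the speed difference on your own machine.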

How to define Linear Algebra?

An intermediate to advanced knowledge of Linear Algebra is desirable for data science work.

Linear Algebra Concepts

  1. Introduction to Vectors: Vectors, Linear Combinations, Independence, Dot Product, Norms, Matrices, Inverse, Transpose, Determinant, Rank …
  2. Vector Spaces and Subspaces: Vector Spaces, Basis, Dimension, Null Space of a Matrix …
  3. Orthogonality: Dot Product, Orthogonality, Projections, Orthonormal Bases and the Gram-Schmidt process.
  4. Eigendecomposition: Eigenvalues, Eigenvectors, Symmetric Matrices, Positive Definite Matrices, Diagonalizing a Matrix, Spectral Theorem.
  5. Other Decompositions: Singular Value Decomposition (SVD), LU Decomposition and Solving Linear Equations, Cholesky Decomposition, QR Decomposition …
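
Several of the concepts listed above map to one-line NumPy calls. The sketch below is only a quick tour under stated assumptions: the matrix `A` and the vectors `u`, `v`, `b` are arbitrary small examples chosen so the results are easy to check by hand. LU decomposition is left out because it requires SciPy rather than NumPy alone.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])           # small symmetric, positive definite matrix
u = np.array([3.0, 4.0])
v = np.array([4.0, -3.0])
b = np.array([1.0, 2.0])

# 1. Basics: dot product, norm, determinant, rank, inverse, transpose
dot_uv = u @ v                        # zero here, so u and v are orthogonal
norm_u = np.linalg.norm(u)            # Euclidean length of u
det_A = np.linalg.det(A)
rank_A = np.linalg.matrix_rank(A)
A_inv = np.linalg.inv(A)
A_T = A.T

# 3. Orthogonality: projection of b onto the line spanned by u
proj = (b @ u) / (u @ u) * u
residual = b - proj                   # orthogonal to u by construction

# 4. Eigendecomposition of a symmetric matrix: A = V diag(lambda) V^T
eigvals, eigvecs = np.linalg.eigh(A)
A_from_eig = eigvecs @ np.diag(eigvals) @ eigvecs.T

# 5. Other decompositions
U, s, Vt = np.linalg.svd(A)           # A = U diag(s) Vt
Q, R = np.linalg.qr(A)                # A = Q R, with orthonormal columns in Q
L = np.linalg.cholesky(A)             # A = L L^T, with L lower triangular
x = np.linalg.solve(A, b)             # solves the linear system A x = b
```

Each factorization can be checked by multiplying the factors back together and comparing with `A` using `np.allclose`.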

Going further

To improve your skills, or simply to start learning linear algebra, here are some resources, grouped by type of source.

YouTube Playlists

Channel: 3Blue1Brown

Playlist: Essence of Linear Algebra 

Description: This channel explains mathematics in an easy and digestible way using intuitive visualizations. It is one of the best YouTube channels out there for introducing and explaining difficult math concepts to initiated and non-initiated audiences alike. The playlist contains 15 videos of ~12 mins each and can give you an intuitive feel for linear algebra beyond the complex mathematical formulations. These videos can be useful even to math students who have previously studied Linear Algebra to an advanced level. The playlist focuses mainly on building a conceptual understanding of mathematical constructs such as Matrices, Linear Maps, Eigenvalues, Eigenvectors, Vector Spaces, Determinant, Dot Product, Cross Product etc. However, this playlist alone is not sufficient, as you also need exposure to formulas and equations because that’s how machine learning theory is written and communicated. Therefore, I suggest pairing this playlist with a book or an online course about Linear Algebra, and consolidating this knowledge through practical exercises.

Books

Authors: Marc Peter Deisenroth, A. Aldo Faisal, Cheng Soon Ong

Title: Mathematics for Machine Learning

Description: This is an amazing book that covers most of the mathematical concepts required for machine learning. It was specifically written for that purpose. In particular, it has detailed chapters about concepts related to linear algebra (Chapter 2 to Chapter 5). One of the things I like most about this book is how rich it is in examples and illustrative figures. Also, the authors have a very clear and concise writing style, which makes reading the book manageable even for non-mathematicians. Finally, it has an exercises section at the end of each chapter to test your understanding. This is a good resource for condensed knowledge and can be found online in PDF format.

Author: Gilbert Strang

Title: Introduction to Linear Algebra

Description: This is easily one of the best introductory books to linear algebra out there and is used as a textbook in many university graduate courses. Gilbert Strang is an MIT professor who teaches the Linear Algebra class. The mathematics in the book is not very rigorous, making it approachable for audiences with different levels of math. This book covers more than you will ever need to know about Linear Algebra to understand the mathematics behind machine learning (be it from books or research papers). Prof. Strang is also known for his teaching style: he explains concepts in a digestible way, focusing first on the intuition before providing the mathematical formulation and proof when applicable. This is a comprehensive book of about 550 pages, and will require time to cover in full. A PDF version of the book can be found online.

Author: Sheldon Axler

Title: Linear Algebra Done Right

Description: Yet another fantastic book that receives a lot of attention. It’s only 250 pages, but covers most of the concepts you’ll need. This is also the only introductory book I’ve come across that starts with Vector Spaces, which I personally think is the right way to teach Linear Algebra. As mentioned earlier, given a structured dataset, every record (or row) can be regarded as a vector in an M-dimensional vector space, where M is the number of columns in the dataset. This way of picturing data helps in understanding many ML algorithms such as PCA, through the concept of the intrinsic dimensionality of data, or equivalently, the rank of the data matrix.
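
The link between intrinsic dimensionality and rank can be sketched in a few lines of NumPy. The dataset below is synthetic and built for the purpose: it has M = 3 columns, but the third is an exact linear combination of the first two, so the rows really live in a 2-dimensional subspace. PCA is done here through an SVD of the centered data, which is one common way to compute it, not the only one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic dataset: 100 rows, M = 3 columns.
# The third column is 2 * (first column) - (second column).
Z = rng.normal(size=(100, 2))
X = np.column_stack([Z[:, 0], Z[:, 1], 2.0 * Z[:, 0] - Z[:, 1]])

# Rank of the data matrix: the number of linearly independent columns.
rank = np.linalg.matrix_rank(X)

# PCA via SVD: center the columns, then inspect the singular values.
Xc = X - X.mean(axis=0)
_, s, _ = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)   # share of variance per principal component

print(rank)
print(np.round(explained, 3))
```

The rank comes out as 2 and the third principal component carries essentially no variance, which is the algebraic way of saying the data is two-dimensional even though it is stored in three columns. Real datasets are noisy, so in practice the trailing components are small rather than exactly zero.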

MOOC

Institution: Massachusetts Institute of Technology (MIT)

Name: Linear Algebra

Description: As you may have guessed, this is the actual Linear Algebra course that Prof. Strang teaches at MIT. There are 34 lectures of ~40 mins each, which cover the same topics as the book. This is a long course and will require about 25 hours to finish the lectures alone. If you like Prof. Strang’s teaching style, I recommend going through the book, and referring back to the corresponding explanation in his MIT lectures when you find a concept difficult to understand.

In a future post, we will address the fourth prerequisite: calculus.


Written by Samy Tafasca
Curious, eager to learn and a good communicator, Samy is a PhD student in Deep Learning within Octopeek’s Innovation division. Before starting his career in research, he obtained an engineering degree from a French engineering school and a master’s degree in data science from a London university. His varied experience has allowed him to develop a solid technical profile with strong international exposure. He often uses his knowledge to write blog posts or to initiate knowledge-sharing efforts with the community. In addition, Samy is passionate about cooking, travel, technology and vector artwork!