Singular Value Decomposition (SVD) is a method of decomposing a matrix into several other matrices from which we can derive useful features. It is a key building block of principal component analysis (PCA). SVD is widely used to reduce the dimensionality of data before any analysis or modeling is done. It has many use cases in statistics, signal processing and natural language processing, among others. In this post we are going to focus on the singular value decomposition, its uses, strengths and weaknesses.

Singular Value Decomposition

Singular value decomposition is an important term in the data science and machine learning vocabulary. Its discovery dates back to 1873 and 1874, when Eugenio Beltrami and Camille Jordan independently studied whether a real bilinear form could be made equal to another by independent orthogonal transformations of the two spaces it acts on. Since then, many other mathematicians have independently contributed to the theory of the singular value decomposition. SVD is a wide area of study in mathematics and cannot be covered in a single post. In this post we are going to see a brief overview of how it works and apply it in machine learning using the scipy and scikit-learn libraries.

How Singular Value Decomposition Works

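Below is a formal definition of the singular value decomposition:

Given an m×n matrix M, there exists a singular value decomposition of M of the form shown below.

$M = U \Sigma V^{T}$

where:

  • $M$ is an m×n matrix.
  • $U$ is an m×m unitary matrix.
  • $\Sigma$ is an m×n rectangular diagonal matrix whose non-negative diagonal entries are the singular values of $M$.
  • $V^{T}$ is the conjugate transpose of an n×n unitary matrix $V$ (for a real matrix this is simply the transpose).

U and V have orthonormal columns, hence

$U^{T} U = I$
and
$V^{T} V = I$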

Singular Value Decomposition Example

Singular Value Decomposition is a simple yet very powerful method for decomposing a matrix. We are going to start with a simple example of how to perform SVD on a given matrix using the svd function that comes with the scipy package. Let’s look at the examples below.

Singular Value Decomposition Of Matrix (M) In $M = U \Sigma V^{T}$
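The original code for this step is not available, so here is a minimal sketch of the decomposition using scipy.linalg.svd. The 3×2 matrix is an arbitrary example chosen for illustration, not data from the post.

```python
import numpy as np
from scipy.linalg import svd

# Illustrative 3x2 matrix (values chosen arbitrarily for this sketch).
M = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# With the default full_matrices=True, U is 3x3 and Vt is 2x2.
# s holds the singular values as a 1-D array in descending order.
U, s, Vt = svd(M)

print(U.shape, s.shape, Vt.shape)  # (3, 3) (2,) (2, 2)

# U and V have orthonormal columns, so both products are identity matrices.
print(np.allclose(U.T @ U, np.eye(3)))    # True
print(np.allclose(Vt @ Vt.T, np.eye(2)))  # True
```

Note that scipy returns the singular values as a vector rather than as the full m×n diagonal matrix.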

Reconstructing Original Matrix From SVD
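As a sketch of the reconstruction step, the three factors can be multiplied back together. The matrix is again an arbitrary example, and scipy.linalg.diagsvd is used here to expand the vector of singular values into the m×n diagonal matrix.

```python
import numpy as np
from scipy.linalg import svd, diagsvd

# Same illustrative matrix as before (an assumption of this sketch).
M = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
U, s, Vt = svd(M)

# Build the 3x2 diagonal matrix Sigma from the 1-D array of singular values.
Sigma = diagsvd(s, *M.shape)

# Multiply the factors back together.
M_rebuilt = U @ Sigma @ Vt
print(np.allclose(M, M_rebuilt))  # True, up to floating-point error
```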

Sklearn’s SVD class

Scikit-learn comes with the sklearn.decomposition.TruncatedSVD(n_components=2, algorithm='randomized', n_iter=5, random_state=None, tol=0.0) class, which is primarily used for dimensionality reduction. TruncatedSVD can be applied to TF-IDF matrices for feature extraction, a technique known as latent semantic analysis (LSA). When using TruncatedSVD we need to specify the number of components, which must be less than the number of features. Let’s look at the following examples.
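The latent semantic analysis use case mentioned above could look roughly like this. The four sentences are a made-up toy corpus, and the choice of two components is only for illustration.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus, invented for this sketch.
docs = [
    "the cat sat on the mat",
    "a dog chased the cat",
    "stocks fell as markets closed",
    "investors sold stocks and bonds",
]

# TF-IDF produces a sparse document-term matrix.
tfidf = TfidfVectorizer().fit_transform(docs)

# TruncatedSVD accepts the sparse matrix directly.
lsa = TruncatedSVD(n_components=2, random_state=0)
doc_topics = lsa.fit_transform(tfidf)

print(tfidf.shape, doc_topics.shape)
```

Each row of doc_topics is a dense two-dimensional representation of one document.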

Dimensionality Reduction With SVD In Sklearn

Dimensionality Reduction
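The original example is not available, so the sketch below uses the digits data set bundled with scikit-learn as a stand-in. It reduces the 64 pixel features to 2 components.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import TruncatedSVD

# Digits data set: 1797 samples, 64 features (8x8 pixel images).
X, y = load_digits(return_X_y=True)

# Keep only 2 components; random_state makes the randomized solver repeatable.
svd = TruncatedSVD(n_components=2, random_state=42)
X_reduced = svd.fit_transform(X)

print(X.shape, X_reduced.shape)  # (1797, 64) (1797, 2)
```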

Explained Variance And Variance Ratio
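A fitted TruncatedSVD object exposes the variance captured by each component. The sketch below assumes the digits data set and 10 components, neither of which comes from the original post.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import TruncatedSVD

X, _ = load_digits(return_X_y=True)
svd = TruncatedSVD(n_components=10, random_state=42).fit(X)

# Variance of the data projected onto each component.
print(svd.explained_variance_)

# The same quantity as a fraction of the total variance of X.
print(svd.explained_variance_ratio_)

# Fraction of the total variance retained by all 10 components together.
print(svd.explained_variance_ratio_.sum())
```

The summed ratio is a quick way to judge how many components are worth keeping.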

[Figure: SVD explained variance]

TruncatedSVD Plot
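One possible way to plot the reduced data is a scatter plot of the first two components. This sketch assumes matplotlib and the digits data set; the file name is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.decomposition import TruncatedSVD

X, y = load_digits(return_X_y=True)
X_2d = TruncatedSVD(n_components=2, random_state=42).fit_transform(X)

# Scatter plot of the two components, colored by digit label.
fig, ax = plt.subplots(figsize=(6, 5))
points = ax.scatter(X_2d[:, 0], X_2d[:, 1], c=y, cmap="tab10", s=8)
ax.set_xlabel("Component 1")
ax.set_ylabel("Component 2")
ax.set_title("Digits projected onto 2 SVD components")
fig.colorbar(points, ax=ax, label="Digit")

# Hypothetical output file name.
fig.savefig("truncated_svd_digits.png")
```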

[Figure: TruncatedSVD plot output]

SVD Plots

[Figure: SVD plots output]

SVD With Iris Data Set
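As a sketch of this example, the four Iris features can be reduced to two components and plotted by species. The plotting details and the file name are assumptions of this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import TruncatedSVD

iris = load_iris()
X, y = iris.data, iris.target

# Reduce the 4 measurements per flower to 2 components.
svd = TruncatedSVD(n_components=2, random_state=42)
X_2d = svd.fit_transform(X)
print(X.shape, X_2d.shape)  # (150, 4) (150, 2)

# One scatter series per species so the legend shows the class names.
fig, ax = plt.subplots(figsize=(6, 5))
for label, name in enumerate(iris.target_names):
    mask = y == label
    ax.scatter(X_2d[mask, 0], X_2d[mask, 1], s=15, label=name)
ax.set_xlabel("Component 1")
ax.set_ylabel("Component 2")
ax.legend()

# Hypothetical output file name.
fig.savefig("svd_iris.png")
```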

[Figure: SVD on the Iris data set]

Pros

  • It works well even with big data sets.
  • It works on many kinds of data, including sparse matrices such as TF-IDF features.

Cons

  • It only captures linear relationships in the data.
  • Interpretation of the results can be difficult.
  • Its low-dimensional plots are often less informative than those of methods such as t-SNE.

Applications Of Singular Value Decomposition

The singular value decomposition has many use cases. Below is a short, non-exhaustive list of applications.

  • Dimensionality reduction.
  • Recommendation systems.
  • Data compression, e.g. image compression.
  • Latent semantic indexing in natural language processing.
  • Pseudoinverse of a matrix.
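Two of the applications listed above, the pseudoinverse and compression through a low-rank approximation, can be sketched with NumPy alone. The matrix and the tolerance below are illustrative assumptions.

```python
import numpy as np

# Illustrative matrix (values chosen arbitrarily for this sketch).
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Reduced SVD: U is 3x2, s has 2 entries, Vt is 2x2.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Pseudoinverse: invert only the singular values above a small tolerance.
tol = 1e-12  # assumed cutoff for treating a singular value as zero
s_inv = np.zeros_like(s)
nonzero = s > tol
s_inv[nonzero] = 1.0 / s[nonzero]
A_pinv = Vt.T @ np.diag(s_inv) @ U.T
print(np.allclose(A_pinv, np.linalg.pinv(A)))  # True

# Compression: keep only the k largest singular values.
k = 1
A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]

# The Frobenius error equals the norm of the discarded singular values.
print(np.linalg.norm(A - A_k), s[1])
```

The same rank-k truncation, applied to the pixel matrix of an image, is the idea behind SVD-based image compression.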

Conclusion

Singular Value Decomposition (SVD) is closely related to Principal Component Analysis: PCA can be computed through the SVD of the centered data, and because TruncatedSVD does not center the data it can work efficiently on sparse matrices. SVD is a method of decomposing a matrix into other matrices. It is used in many tasks, including dimensionality reduction, image processing and solving linear systems. SVD works well with different data sets; however, it does not produce visualizations as intuitive as those of other methods such as t-SNE.

What’s Next

In this post we have looked at SVD in a nutshell. In the next post we will look at the apriori algorithm.
