This cheat sheet is a quick reference for NumPy / SciPy beginners. It gives an overview of the most important NumPy and SciPy commands and functions that you might need when solving the exercise sheets on Linear Algebra in Information Retrieval. It does not claim to be complete and will be extended continuously. If you think something important is missing, or if you find any errors, please let us know.
SciPy Cheat Sheet
Contents
- NumPy/SciPy Cheat Sheet
- General
- Install
- Matrix construction
- Matrix operations
- Useful methods
- Matrix decomposition
General
What is NumPy?
A library for working with arrays and matrices in Python.
What is SciPy?
A library built on top of NumPy that provides advanced linear algebra routines.
Install
The routine to install NumPy and SciPy depends on your operating system.
Linux (Ubuntu, Debian)
Other systems (Windows, Mac, etc.)
For all other systems (Windows, Mac, etc.) see the instructions given on the official SciPy website.
Matrix construction
We distinguish between dense matrices and sparse matrices (Note: The color code will be used consistently throughout this cheat sheet).
Dense matrices store every entry in the matrix, while sparse matrices only store the non-zero entries (together with their row and column index). Dense matrices are more feature-rich, but may consume more memory space than sparse matrices (in particular if most of the entries in a matrix are zero).
Dense matrices
In NumPy, there are two concepts of dense matrices: matrices and arrays. Matrices are strictly 2-dimensional, while arrays are n-dimensional (the term array is a bit misleading here).
Construct a matrix:
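A minimal sketch, assuming numpy.matrix is the matrix type meant here (the values are made up for illustration):

```python
import numpy as np

# A matrix is always 2-dimensional, even when built from a flat list.
M = np.matrix([[1, 2], [3, 4]])
row_matrix = np.matrix([1, 2, 3])  # becomes a 1 x 3 matrix

print(M.shape)           # (2, 2)
print(row_matrix.shape)  # (1, 3)
```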
Construct an array:
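A minimal sketch using numpy.array (the values are made up for illustration):

```python
import numpy as np

# Arrays can have any number of dimensions.
vector = np.array([1, 2, 3])          # 1-dimensional
table = np.array([[1, 2], [3, 4]])    # 2-dimensional
cube = np.zeros((2, 2, 2))            # 3-dimensional

print(vector.ndim, table.ndim, cube.ndim)  # 1 2 3
```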
Sparse matrices
There are two principal concepts of sparse matrices:
Compressed Sparse Row matrix (CSR matrix): entries are stored row by row (sorted by row index first)
Compressed Sparse Column matrix (CSC matrix): entries are stored column by column (sorted by column index first)
Construct a CSR/CSC matrix:
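A minimal sketch, assuming the coordinate-style constructor of scipy.sparse (values plus their row and column indices). The names vals, rows and cols are chosen for illustration only:

```python
import numpy as np
from scipy.sparse import csr_matrix, csc_matrix

# Non-zero entries and their positions in a 3 x 3 matrix.
vals = np.array([4, 5, 6])
rows = np.array([0, 0, 2])
cols = np.array([0, 2, 1])

csr = csr_matrix((vals, (rows, cols)), shape=(3, 3))
csc = csc_matrix((vals, (rows, cols)), shape=(3, 3))

# Convert to a dense array to inspect the content.
print(csr.toarray())
```

Both objects hold the same matrix; only the internal storage order differs.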
Special matrices
There are some utility functions to create special matrices/arrays:
(1) Construct an empty array without initializing its entries (the array contains arbitrary values):
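A minimal sketch using numpy.empty (the shape is chosen for illustration):

```python
import numpy as np

# Only the memory is allocated; the values are whatever was there before.
E = np.empty((2, 3))

print(E.shape)  # (2, 3)
```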
(2) Construct an array filled with zeros:
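A minimal sketch using numpy.zeros (the shape is chosen for illustration):

```python
import numpy as np

Z = np.zeros((2, 3))

print(Z)
```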
(3) Construct an array filled with ones:
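A minimal sketch using numpy.ones (the shape is chosen for illustration):

```python
import numpy as np

O = np.ones((2, 3))

print(O)
```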
(4) Construct a diagonal array, a (usually square) array in which all entries are 0, except on the main diagonal:
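A minimal sketch using numpy.diag, which places the given values on the main diagonal (the values are made up for illustration):

```python
import numpy as np

D = np.diag([1, 2, 3])

print(D)
```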
(5) Construct an identity array, a square array in which all entries on the main diagonal are 1 and all other entries are 0:
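A minimal sketch using numpy.eye; numpy.identity is assumed to be an equally valid choice for the square case:

```python
import numpy as np

I = np.eye(3)
I_alt = np.identity(3)

print(I)
```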
(6) Construct a triangular array, a square array in which all entries below (upper triangle) or above (lower triangle) the main diagonal are zero:
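A minimal sketch, assuming numpy.triu and numpy.tril applied to an existing square array (the input values are made up for illustration):

```python
import numpy as np

T = np.arange(1, 10).reshape(3, 3)

upper = np.triu(T)  # zeros below the main diagonal
lower = np.tril(T)  # zeros above the main diagonal

print(upper)
print(lower)
```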
Accessing elements
TODO: crazy element access magic, single elements, entire rows, sub-matrices
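A minimal sketch of the access patterns named in the note above, assuming a dense NumPy array (the values are made up for illustration):

```python
import numpy as np

A = np.arange(12).reshape(3, 4)

element = A[1, 2]        # single element: row 1, column 2
row = A[1, :]            # entire row 1
column = A[:, 2]         # entire column 2
sub = A[0:2, 1:3]        # sub-matrix: rows 0-1, columns 1-2

print(element)
print(row)
print(column)
print(sub)
```

Indices start at 0, and the end of a slice is exclusive.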
Matrix operations
Adding a constant
The addition of a constant adds the constant to every element of a matrix (only available for dense matrices).
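A minimal sketch of the dense case only (the values are made up for illustration):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

# The constant is added to every entry.
A_plus = A + 10

print(A_plus)
```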
Multiplying by a constant
Multiplying by a constant multiplies every element of a matrix by that constant (both for sparse and dense matrices).
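A minimal sketch for both cases, assuming scipy.sparse.csr_matrix as the sparse type (the values are made up for illustration):

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse

A = np.array([[1, 0], [0, 4]])
S = csr_matrix(A)

A_scaled = A * 2   # dense result
S_scaled = S * 2   # result stays sparse

print(A_scaled)
print(S_scaled.toarray())
```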
Multiplying two matrices
There are two options for multiplying two matrices: the * operator and the dot() function. The behavior and result of both options depend on the type of the operands (matrices or arrays):
(1) The * operator computes
the normal matrix multiplication when sparse and/or dense matrices are used.
the element-wise matrix multiplication when dense arrays are used.
(2) The dot() function computes
the normal matrix multiplication when a dense matrix is multiplied with a dense matrix, or when a sparse matrix is multiplied with a sparse or a dense matrix;
a meaningless result (or an error) when a dense matrix is multiplied with a sparse matrix, because NumPy's dot() is not aware of sparse matrices.
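The rules above can be sketched as follows, assuming numpy.matrix and scipy.sparse.csr_matrix as the matrix types. The dense-times-sparse dot() case is left out because its result is not usable:

```python
import numpy as np
from scipy.sparse import csr_matrix

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

# Dense arrays: * is element-wise, dot() is the matrix product.
elementwise = A * B
array_product = A.dot(B)

# Dense matrices: * is the matrix product.
matrix_product = np.matrix(A) * np.matrix(B)

# Sparse matrices: * and dot() are both the matrix product.
sparse_product = csr_matrix(A) * csr_matrix(B)
sparse_dot_dense = csr_matrix(A).dot(B)

print(elementwise)
print(array_product)
```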
The result of a matrix multiplication between
a sparse matrix and a sparse matrix is a sparse matrix.
a sparse matrix and a dense matrix is a dense matrix.
a dense matrix and a dense matrix is a dense matrix.
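A minimal sketch that checks the result types listed above, assuming csr_matrix for the sparse operand and a plain NumPy array for the dense one:

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse

D = np.array([[1, 2], [3, 4]])
S = csr_matrix(np.array([[1, 0], [0, 2]]))

sparse_sparse = S.dot(S)  # sparse result
sparse_dense = S.dot(D)   # dense result
dense_dense = D.dot(D)    # dense result

print(issparse(sparse_sparse), issparse(sparse_dense), issparse(dense_dense))
```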
Useful methods
numpy.round()
Takes an array or matrix and rounds its values to the given number of decimals. Note that for values exactly halfway between rounded decimal values, numpy rounds to the nearest even value.
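A minimal sketch of both behaviors (the values are made up for illustration):

```python
import numpy as np

# Round to one decimal.
rounded = np.round(np.array([1.234, 5.678]), 1)

# Halfway values go to the nearest even value.
halfway = np.round(np.array([0.5, 1.5, 2.5]))

print(rounded)  # [1.2 5.7]
print(halfway)  # [0. 2. 2.]
```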
numpy.min()
Takes an array and returns its minimum value. If an axis is specified, returns the minima along the axis.
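A minimal sketch (the values are made up for illustration):

```python
import numpy as np

a = np.array([[3, 1, 2],
              [0, 5, 4]])

overall_min = np.min(a)       # minimum of all entries
col_min = np.min(a, axis=0)   # minimum of each column
row_min = np.min(a, axis=1)   # minimum of each row

print(overall_min)  # 0
print(col_min)      # [0 1 2]
print(row_min)      # [1 0]
```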
numpy.argmin()
Takes an array and returns the index of the minimum value of the flattened array. If an axis is specified, returns the indices of the minimum values along the axis.
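A minimal sketch (the values are made up for illustration):

```python
import numpy as np

a = np.array([[3, 1, 2],
              [0, 5, 4]])

flat_index = np.argmin(a)        # index into the flattened array
col_index = np.argmin(a, axis=0) # row index of the minimum in each column
row_index = np.argmin(a, axis=1) # column index of the minimum in each row

print(flat_index)  # 3
print(col_index)   # [1 0 0]
print(row_index)   # [1 0]
```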
numpy.argsort()
Takes an array A and returns an array of indices that would sort A. Optionally, you can specify the axis along which A is sorted.
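A minimal sketch (the values are made up for illustration):

```python
import numpy as np

a = np.array([30, 10, 20])

order = np.argsort(a)   # indices that sort a
sorted_a = a[order]     # applying them yields the sorted array

print(order)     # [1 2 0]
print(sorted_a)  # [10 20 30]
```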
numpy.where()
Takes a condition and optionally two array-like objects A and B. If A and B are specified, returns an array that contains elements from A where the condition is true and elements from B elsewhere. If only the condition is given, returns the indices at which the condition is true.
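A minimal sketch of both call forms (the values are made up for illustration):

```python
import numpy as np

a = np.array([1, 5, 2, 8])

# Three-argument form: keep values above 3, use 0 elsewhere.
clipped = np.where(a > 3, a, 0)

# One-argument form: a tuple with one index array per dimension.
indices = np.where(a > 3)

print(clipped)     # [0 5 0 8]
print(indices[0])  # [1 3]
```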
Matrix decomposition
Singular Value Decomposition (SVD)
Factorize a matrix A (m × n) into three matrices U (m × r), S (r × r) and V (r × n) such that A = U * S * V. Here r is the rank of A.
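A minimal sketch, assuming numpy.linalg.svd as the routine (other SVD routines may be the ones intended here). It returns the singular values as a vector and does not truncate to the rank, so the factors are cut down to r by hand. The example matrix is made up and has rank 1:

```python
import numpy as np

# 3 x 2 matrix whose second column is twice the first, so the rank is 1.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])

U_full, s, V_full = np.linalg.svd(A, full_matrices=False)

# Keep only the first r singular values and vectors.
r = np.linalg.matrix_rank(A)
U = U_full[:, :r]      # m x r
S = np.diag(s[:r])     # r x r
V = V_full[:r, :]      # r x n

A_rebuilt = U @ S @ V

print(U.shape, S.shape, V.shape)
print(np.allclose(A, A_rebuilt))
```

The third factor returned by numpy.linalg.svd already has shape r × n after truncation, which matches V as used in the formula above.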