LLE comes in different flavours
I haven't worked in the manifold module since last time, yet thanks to Jake VanderPlas there are some cool features I can talk about. First off, the ARPACK backend is finally working and gives factor...
Ridge regression path
Ridge coefficients for multiple values of the regularization parameter can be elegantly computed by updating the thin SVD decomposition of the design matrix: import numpy as np from scipy import linalg...
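As a rough illustration of the idea in this excerpt, here is a minimal sketch that computes the thin SVD once and reuses it for every regularization value. It is not necessarily the code from the post itself: the function name `ridge_path` and the random test data are assumptions made for the example.

```python
import numpy as np
from scipy import linalg


def ridge_path(X, y, alphas):
    """Ridge coefficients for each alpha, reusing one thin SVD of X.

    Hypothetical helper written for illustration only.
    """
    # Thin SVD: X = U @ diag(s) @ Vt
    U, s, Vt = linalg.svd(X, full_matrices=False)
    Uty = U.T @ y
    coefs = []
    for alpha in alphas:
        # Ridge solution: V @ diag(s / (s**2 + alpha)) @ U.T @ y
        d = s / (s ** 2 + alpha)
        coefs.append(Vt.T @ (d * Uty))
    return np.array(coefs)


# Made-up data, only to exercise the function
rng = np.random.RandomState(0)
X = rng.randn(50, 5)
y = rng.randn(50)
alphas = np.logspace(-3, 3, 7)
path = ridge_path(X, y, alphas)  # one row of coefficients per alpha
```

Only the diagonal scaling changes with the regularization parameter, so the cost of the SVD is paid once for the whole path.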
scikit-learn EuroScipy 2011 coding sprint -- day one
As a warm-up for the upcoming EuroScipy conference, some of the scikit-learn developers decided to gather and work together for a couple of days. Today was the first day and there was only a handful...
scikit-learn’s EuroScipy 2011 coding sprint -- day two
Today's coding sprint was a bit more crowded, with some notable scipy hackers such as Ralf Gommers, Stefan van der Walt, David Cournapeau or Fernando Perez from IPython joining in. On what got done:...
Reworked example gallery for scikit-learn
I've lately been working on improving the scikit-learn example gallery so that it also shows a small thumbnail of the plotted result. Here is what the gallery looks like now: And the real thing should be...
scikit-learn 0.9
Last week we released a new version of scikit-learn. The Changelog is particularly impressive, yet personally this release is important for other reasons. This will probably be my last release as a...
qr_multiply function in scipy.linalg
In scipy's development version there's a new function closely related to the QR-decomposition of a matrix and to the least-squares solution of a linear system. What this function does is to compute the...
Low rank approximation
A little experiment to see what low rank approximation looks like. These are the best rank-k approximations (in the Frobenius norm) to a natural image for increasing values of k and an original...
line-by-line memory usage of a Python program
My newest project is a Python library for monitoring memory consumption of arbitrary processes, and one of its most useful features is the line-by-line analysis of memory usage for Python code. I wrote a...
Learning to rank with scikit-learn: the pairwise transform
This tutorial introduces the concept of pairwise preference used in most ranking problems. I'll use scikit-learn for learning and matplotlib for visualization. In the ranking setting, training...
Singular Value Decomposition in SciPy
SciPy contains two methods to compute the singular value decomposition (SVD) of a matrix: scipy.linalg.svd and scipy.sparse.linalg.svds. In this post I'll compare both methods for the task of computing...
Memory plots with memory_profiler
Besides performing a line-by-line analysis of memory consumption, memory_profiler exposes some functions that retrieve the memory consumption of a function in real time, allowing e.g. to...
Loss Functions for Ordinal regression
Note: this post contains a fair amount of LaTeX; if the math does not render correctly, read it at its original location. In machine learning it is common to formulate the classification task as a...
Householder matrices
Householder matrices are square matrices of the form $$ P = I - \beta v v^T$$ where $\beta$ is a scalar and $v$ is a vector. They have the useful property that for suitably chosen $v$ and $\beta$ it...
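To make the definition concrete, here is a small sketch of one standard choice of $v$ and $\beta$, under which $P$ maps a given nonzero vector $x$ onto a multiple of the first basis vector. This is a well-known property of Householder matrices, though not necessarily the one the post goes on to describe. The helper name `householder` and the example vector are assumptions made for illustration.

```python
import numpy as np


def householder(x):
    """Return (v, beta) such that (I - beta * v v^T) x is a multiple of e_1.

    Hypothetical helper for illustration; assumes x is a nonzero 1-D array.
    """
    x = np.asarray(x, dtype=float)
    v = x.copy()
    # Add sign(x[0]) * ||x|| to the first entry; matching the sign
    # avoids cancellation when x is already close to a multiple of e_1.
    sign = 1.0 if x[0] >= 0 else -1.0
    v[0] += sign * np.linalg.norm(x)
    beta = 2.0 / (v @ v)
    return v, beta


x = np.array([3.0, 4.0, 0.0])
v, beta = householder(x)
P = np.eye(len(x)) - beta * np.outer(v, v)
Px = P @ x  # all entries except the first are zeroed out
```

With $\beta = 2 / (v^T v)$ the matrix $P$ is symmetric and orthogonal, so applying it preserves the Euclidean norm of $x$.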
Isotonic Regression
My latest contribution for scikit-learn is an implementation of the isotonic regression model that I coded with Nelle Varoquaux and Alexandre Gramfort. This model finds the best least squares fit to a...
Logistic Ordinal Regression
TL;DR: I've implemented a logistic ordinal regression or proportional odds model. Here is the Python code. The logistic ordinal regression model, also known as the proportional odds model, was introduced in...
Numerical optimizers for Logistic Regression
In this post I compare several implementations of Logistic Regression. The task was to implement a Logistic Regression model using standard optimization tools from scipy.optimize and compare them...
Different ways to get memory consumption or lessons learned from...
As part of the development of memory_profiler I've tried several ways to get memory usage of a program from within Python. In this post I'll describe the different alternatives I've tested. The psutil...
Surrogate Loss Functions in Machine Learning
TL;DR: These are some notes on calibration of surrogate loss functions in the context of machine learning. But mostly it is an excuse to post some images I made. In the binary-class classification...
Plot memory usage as a function of time
One of the lesser-known features of the memory_profiler package is its ability to plot memory consumption as a function of time. This was implemented by my friend Philippe Gervais, previously a...