Handwritten digits and Locally Linear Embedding
I decided to test my new Locally Linear Embedding (LLE) implementation against a real dataset. At first I didn't think this would turn out very well, since LLE seems to be somewhat fragile, yielding...
Manifold learning in scikit-learn
The manifold module in scikit-learn is slowly progressing: the locally linear embedding implementation was finally merged along with some documentation. At about the same time but in a different...
LLE comes in different flavours
I haven't worked on the manifold module since last time, yet thanks to Jake VanderPlas there are some cool features I can talk about. First off, the ARPACK backend is finally working and gives a factor...
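Purely as an illustration of those flavours, here is a minimal sketch using the current scikit-learn API (the method and eigen_solver parameters of LocallyLinearEmbedding; the interface at the time of the post may have differed):

    # Minimal sketch: the LLE variants and the ARPACK eigensolver are
    # exposed as parameters of scikit-learn's LocallyLinearEmbedding.
    from sklearn.datasets import load_digits
    from sklearn.manifold import LocallyLinearEmbedding

    X, _ = load_digits(return_X_y=True)
    for method in ("standard", "modified", "hessian", "ltsa"):
        lle = LocallyLinearEmbedding(n_neighbors=30, n_components=2,
                                     method=method, eigen_solver="arpack")
        X_2d = lle.fit_transform(X)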
Ridge regression path
Ridge coefficients for multiple values of the regularization parameter can be elegantly computed by updating the thin SVD decomposition of the design matrix: import numpy as np from scipy import linalg...
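A hedged sketch of the underlying idea, not necessarily the post's exact code: once a thin SVD of the design matrix is available, the ridge solution for every value of the regularization parameter follows from a cheap reweighting of the singular values.

    import numpy as np
    from scipy import linalg

    def ridge_path(X, y, alphas):
        # Thin SVD of the design matrix, computed once.
        U, s, Vt = linalg.svd(X, full_matrices=False)
        Uty = U.T @ y
        # Ridge solution for each alpha: w = V diag(s / (s**2 + alpha)) U^T y
        return np.array([Vt.T @ ((s / (s ** 2 + alpha)) * Uty)
                         for alpha in alphas])

    # usage sketch (hypothetical data):
    # coefs = ridge_path(X, y, alphas=np.logspace(-3, 3, 20))

The post goes further and updates the decomposition itself; the sketch above only reuses a single SVD across all values of the regularization parameter.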
scikit-learn EuroScipy 2011 coding sprint -- day one
As a warm-up for the upcoming EuroScipy conference, some of the scikit-learn developers decided to gather and work together for a couple of days. Today was the first day and there was only a handful...
scikit-learn’s EuroScipy 2011 coding sprint -- day two
Today's coding sprint was a bit more crowded, with some notable SciPy hackers such as Ralf Gommers, Stefan van der Walt, David Cournapeau and Fernando Perez from IPython joining in. On what got done:...
Reworked example gallery for scikit-learn
I've been working lately on improving the scikit-learn example gallery to also show a small thumbnail of the plotted result. Here is what the gallery looks like now: And the real thing should be...
scikit-learn 0.9
Last week we released a new version of scikit-learn. The Changelog is particularly impressive, yet personally this release is important for other reasons. This will probably be my last release as a...
qr_multiply function in scipy.linalg
In scipy's development version there's a new function closely related to the QR-decomposition of a matrix and to the least-squares solution of a linear system. What this function does is to compute the...
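A sketch of how scipy.linalg.qr_multiply is typically used; the example data is made up and the exact behaviour in the development version discussed here may have differed:

    import numpy as np
    from scipy import linalg

    rng = np.random.RandomState(0)
    A = rng.randn(6, 4)
    C = rng.randn(3, 6)

    # One call returns the product C @ Q together with R, without ever
    # forming Q explicitly.
    CQ, R = linalg.qr_multiply(A, C, mode="right")

    # Equivalent, but more memory-hungry, two-step version.
    Q, R2 = linalg.qr(A, mode="economic")
    np.allclose(CQ, C @ Q)  # True

In the least-squares setting the same trick applies to the right-hand side, since the solution only needs Q^T b and a triangular solve with R.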
Low rank approximation
A little experiment to see what low rank approximation looks like. These are the best rank-k approximations (in the Frobenius norm) to a natural image for increasing values of k and an original...
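For context, this is how such best rank-k approximations are computed (the Eckart-Young theorem); a generic sketch assuming a grayscale image stored as a 2-D array, not the post's code:

    import numpy as np

    def best_rank_k(img, k):
        # By the Eckart-Young theorem, truncating the SVD gives the best
        # rank-k approximation in the Frobenius (and spectral) norm.
        U, s, Vt = np.linalg.svd(img, full_matrices=False)
        return (U[:, :k] * s[:k]) @ Vt[:k]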
line-by-line memory usage of a Python program
My newest project is a Python library for monitoring the memory consumption of arbitrary processes, and one of its most useful features is the line-by-line analysis of memory usage for Python code. I wrote a...
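A minimal usage sketch of that line-by-line mode; the profile decorator and the -m memory_profiler entry point are part of memory_profiler's documented interface, while the toy function is made up:

    # example.py -- run with: python -m memory_profiler example.py
    from memory_profiler import profile

    @profile
    def allocate():
        a = [1] * (10 ** 6)        # roughly 8 MB of small ints
        b = [2] * (2 * 10 ** 7)    # roughly 160 MB
        del b                      # released here, visible in the report
        return a

    if __name__ == "__main__":
        allocate()

Running it prints a per-line table of memory usage and increment for the decorated function.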
Learning to rank with scikit-learn: the pairwise transform
This tutorial introduces the concept of pairwise preference used in most ranking problems. I'll use scikit-learn for learning and matplotlib for visualization. In the ranking setting, training...
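A hedged sketch of the pairwise transform itself: within each query, pairs of documents become difference vectors labeled by the sign of their preference, and any binary linear classifier can then be trained on them. Function and variable names are illustrative, not the tutorial's code:

    import itertools
    import numpy as np
    from sklearn.svm import LinearSVC

    def pairwise_transform(X, y, query_ids):
        # Difference vectors X[i] - X[j] for pairs within the same query,
        # labeled by the sign of the preference y[i] - y[j].
        X_pair, y_pair = [], []
        for i, j in itertools.combinations(range(len(y)), 2):
            if query_ids[i] != query_ids[j] or y[i] == y[j]:
                continue
            X_pair.append(X[i] - X[j])
            y_pair.append(np.sign(y[i] - y[j]))
        return np.asarray(X_pair), np.asarray(y_pair)

    # A linear SVM on the transformed pairs learns a weight vector w whose
    # scores w @ x induce the ranking:
    # X_pair, y_pair = pairwise_transform(X, y, query_ids)
    # ranker = LinearSVC().fit(X_pair, y_pair)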
Singular Value Decomposition in SciPy
SciPy contains two methods to compute the singular value decomposition (SVD) of a matrix: scipy.linalg.svd and scipy.sparse.linalg.svds. In this post I'll compare both methods for the task of computing...
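For reference, the two interfaces side by side (a small sketch, not the post's benchmark):

    import numpy as np
    from scipy import linalg
    from scipy.sparse.linalg import svds

    rng = np.random.RandomState(0)
    A = rng.randn(500, 100)

    # Dense LAPACK-based SVD: all singular triplets at once.
    U, s, Vt = linalg.svd(A, full_matrices=False)

    # Iterative ARPACK-based SVD: only the k leading triplets.
    Uk, sk, Vtk = svds(A, k=5)

    # Both agree on the 5 largest singular values (svds does not
    # guarantee the same ordering, hence the sort).
    np.allclose(np.sort(sk), s[:5][::-1])  # True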
Memory plots with memory_profiler
Besides performing a line-by-line analysis of memory consumption, memory_profiler exposes some functions that make it possible to retrieve the memory consumption of a function in real time, allowing e.g. to...
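A sketch of that real-time interface; memory_usage is part of memory_profiler's public API, and the monitored function here is a made-up example:

    import numpy as np
    from memory_profiler import memory_usage

    def work():
        # allocate and release ~80 MB to produce a visible bump
        x = np.ones((10 ** 7,))
        return x.sum()

    # Sample the memory of the process running `work` every 0.1 seconds.
    samples = memory_usage((work, (), {}), interval=0.1)
    print(samples)  # list of memory readings in MiB, one per sample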
Loss Functions for Ordinal regression
Note: this post contains a fair amount of LaTeX; if the math doesn't render correctly, read it at its original location. In machine learning it is common to formulate the classification task as a...
Householder matrices
Householder matrices are square matrices of the form $$ P = I - \beta v v^T$$ where $\beta$ is a scalar and $v$ is a vector. Such a matrix has the useful property that for suitably chosen $v$ and $\beta$ it...
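As an illustration of that property (a standard construction, not taken from the post): choosing v from a given vector x makes P map x onto a multiple of the first basis vector.

    import numpy as np

    def householder(x):
        # Choose v and beta so that (I - beta v v^T) x = -sign(x[0]) * ||x|| * e_1.
        # The sign is picked to avoid cancellation (assumes x[0] != 0).
        v = x.astype(float).copy()
        v[0] += np.sign(x[0]) * np.linalg.norm(x)
        beta = 2.0 / (v @ v)
        return beta, v

    x = np.array([3.0, 1.0, 2.0])
    beta, v = householder(x)
    P = np.eye(len(x)) - beta * np.outer(v, v)
    print(P @ x)  # approximately [-3.742, 0, 0], i.e. -||x|| e_1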
Isotonic Regression
My latest contribution to scikit-learn is an implementation of the isotonic regression model, which I coded with Nelle Varoquaux and Alexandre Gramfort. This model finds the best least squares fit to a...
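A minimal usage sketch of the resulting estimator with the current scikit-learn API, which may differ slightly from the version described in the post:

    import numpy as np
    from sklearn.isotonic import IsotonicRegression

    rng = np.random.RandomState(0)
    x = np.arange(50, dtype=float)
    y = np.log1p(x) + 0.3 * rng.randn(50)   # noisy, increasing trend

    # Best least-squares fit to y under a monotonicity constraint in x.
    y_iso = IsotonicRegression().fit_transform(x, y)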
Logistic Ordinal Regression
TL;DR: I've implemented a logistic ordinal regression, or proportional odds, model. Here is the Python code. The logistic ordinal regression model, also known as the proportional odds model, was introduced in...
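For reference, the core of the proportional odds model in its standard formulation: the cumulative probabilities are modeled as

$$ P(y \le j \mid x) = \sigma(\theta_j - w^T x), \qquad \theta_1 \le \theta_2 \le \dots \le \theta_{J-1}, $$

where $\sigma$ is the logistic function, $w$ is a weight vector shared across thresholds and the $\theta_j$ are ordered cutpoints.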
Numerical optimizers for Logistic Regression
In this post I compare several implementations of Logistic Regression. The task was to implement a Logistic Regression model using standard optimization tools from scipy.optimize and compare them...
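A hedged sketch of the kind of setup being compared: the L2-regularized logistic loss and its gradient, handed to a scipy.optimize solver. Data shapes and the choice of L-BFGS are illustrative, not the post's exact benchmark:

    import numpy as np
    from scipy import optimize
    from scipy.special import expit

    def loss_and_grad(w, X, y, alpha):
        # L2-regularized logistic loss; labels y are in {-1, +1}.
        yz = y * (X @ w)
        loss = np.sum(np.logaddexp(0, -yz)) + 0.5 * alpha * w @ w
        grad = -X.T @ (y * expit(-yz)) + alpha * w
        return loss, grad

    # Hypothetical data: X of shape (n_samples, n_features), y in {-1, +1}.
    # res = optimize.minimize(loss_and_grad, np.zeros(X.shape[1]), jac=True,
    #                         args=(X, y, 1.0), method="L-BFGS-B")
    # w_hat = res.x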
Different ways to get memory consumption or lessons learned from...
As part of the development of memory_profiler I've tried several ways to get memory usage of a program from within Python. In this post I'll describe the different alternatives I've tested. The psutil...
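One of those alternatives, sketched with the current psutil API (the helper name is made up):

    import os
    import psutil

    def memory_usage_psutil():
        # Resident set size (RSS) of the current process, in MiB.
        process = psutil.Process(os.getpid())
        return process.memory_info().rss / float(2 ** 20)

    print(memory_usage_psutil())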
Surrogate Loss Functions in Machine Learning
TL;DR: These are some notes on calibration of surrogate loss functions in the context of machine learning. But mostly it is an excuse to post some images I made. In the binary classification...
Plot memory usage as a function of time
One of the lesser known features of the memory_profiler package is its ability to plot memory consumption as a function of time. This was implemented by my friend Philippe Gervais, previously a...
Data-driven hemodynamic response function estimation
My latest research paper[1] deals with the estimation of the hemodynamic response function (HRF) from fMRI data. This is an important topic since the knowledge of a hemodynamic response function is...
PyData Paris - April 2015
Last Friday was PyData Paris, in the words of the organizers, "a gathering of users and developers of data analysis tools in Python". The organizers did a great job putting it together and the event...
IPython/Jupyter notebook gallery
Due to lack of time and interest, I'm no longer maintaining this project. Feel free to grab the sources from https://github.com/fabianp/nbgallery and fork the project. TL;DR I created a gallery for...
Holdout cross-validation generator
Cross-validation iterators in scikit-learn are simply generator objects, that is, Python objects that implement the __iter__ method and that for each call to this method return (or more precisely,...
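In that generator-style interface, a hold-out iterator fits in a few lines. This is a sketch of the idea, not necessarily the post's exact class:

    import numpy as np

    class HoldOut:
        """Yield a single random (train, test) split of indices,
        mimicking the generator-style cross-validation iterators
        described above."""

        def __init__(self, n, test_size=0.2, random_state=0):
            self.n = n
            self.test_size = test_size
            self.random_state = random_state

        def __iter__(self):
            rng = np.random.RandomState(self.random_state)
            idx = rng.permutation(self.n)
            n_test = int(self.test_size * self.n)
            yield idx[n_test:], idx[:n_test]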
On the consistency of ordinal regression methods
My latest work (with Francis Bach and Alexandre Gramfort) is on the consistency of ordinal regression methods. It has the wildly imaginative title of "On the Consistency of Ordinal Regression...
SAGA algorithm in the lightning library
Recently I've implemented, together with Arnaud Rachez, the SAGA[1] algorithm in the lightning machine learning library (which, by the way, has recently been moved to the new scikit-learn-contrib...
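A usage sketch assuming lightning's scikit-learn-style SAGAClassifier estimator; the estimator name and its defaults should be checked against the library's documentation:

    from sklearn.datasets import make_classification
    from lightning.classification import SAGAClassifier  # assumed estimator name

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    clf = SAGAClassifier()          # defaults; regularization is configurable
    clf.fit(X, y)
    print(clf.score(X, y))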
scikit-learn-contrib, an umbrella for scikit-learn related projects.
Together with other scikit-learn developers we've created an umbrella organization for scikit-learn-related projects named scikit-learn-contrib. The idea is for this organization to host projects that...
Lightning v0.1
Announcing the first public release of lightning, a library for large-scale linear classification, regression and ranking in Python. The library was started a couple of years ago by Mathieu Blondel, who...
Hyperparameter optimization with approximate gradient
TL;DR: I describe a method for hyperparameter optimization by gradient descent. Most machine learning models rely on at least one hyperparameter to control for model complexity. For example, logistic...
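The identity such gradient-based approaches build on is standard implicit differentiation; assuming the validation loss $f$ depends on $\lambda$ only through the inner solution $x^\star(\lambda) = \arg\min_x g(x, \lambda)$,

$$ \nabla_\lambda f\big(x^\star(\lambda)\big) = - \big(\nabla^2_{x\lambda} g\big)^T \big(\nabla^2_{xx} g\big)^{-1} \nabla_x f\big(x^\star(\lambda)\big), $$

which, as the title suggests, the method computes only approximately.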
A fully asynchronous variant of the SAGA algorithm
My friend Rémi Leblond has recently uploaded our preprint on an asynchronous version of the SAGA optimization algorithm to arXiv. The main contribution is to develop a parallel (fully asynchronous, no...
Optimization inequalities cheatsheet
Most proofs in optimization consist in using inequalities for a particular function class in some creative way. This is a cheatsheet with inequalities that I use most often. It considers the class of...
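One representative entry, for the class of $L$-smooth functions (a standard inequality, shown only as an example of what such a cheatsheet collects):

$$ f(y) \le f(x) + \langle \nabla f(x), y - x \rangle + \frac{L}{2}\|y - x\|^2 \quad \text{for all } x, y. $$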
Notes on the Frank-Wolfe Algorithm, Part I
This blog post is the first in a series discussing different theoretical and practical aspects of the Frank-Wolfe algorithm.
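To fix ideas, a minimal sketch of the classical algorithm over the probability simplex, in its textbook form with the 2/(t+2) step-size rather than anything specific to the post:

    import numpy as np

    def frank_wolfe_simplex(grad, x0, n_iter=100):
        # Frank-Wolfe over the probability simplex: the linear minimization
        # oracle simply selects the vertex with the most negative gradient entry.
        x = x0.copy()
        for t in range(n_iter):
            g = grad(x)
            s = np.zeros_like(x)
            s[np.argmin(g)] = 1.0            # LMO: argmin over the simplex of <g, s>
            gamma = 2.0 / (t + 2.0)          # classical open-loop step-size
            x = (1 - gamma) * x + gamma * s  # convex combination stays feasible
        return x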
Three Operator Splitting
Notes on the Frank-Wolfe Algorithm, Part II: A Primal-dual Analysis
This blog post extends the convergence theory from the first part of my notes on the Frank-Wolfe (FW) algorithm with convergence guarantees on the primal-dual gap which generalize and strengthen the...
How to Evaluate the Logistic Loss and not NaN trying
A naive implementation of the logistic regression loss can result in numerical indeterminacy even for moderate values. This post takes a closer look into the source of these instabilities and...
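A small illustration of the instability and the usual fix; a generic example, not the post's full analysis:

    import numpy as np

    z = np.array([-1000.0, 0.0, 1000.0])

    # Naive evaluation of log(1 + exp(-z)): exp overflows for large -z.
    with np.errstate(over="ignore"):
        naive = np.log1p(np.exp(-z))        # -> [inf, 0.693..., 0.]

    # Stable evaluation: logaddexp computes log(exp(0) + exp(-z)) safely.
    stable = np.logaddexp(0.0, -z)          # -> [1000., 0.693..., 0.]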
On the Link Between Polynomials and Optimization
There's a fascinating link between the minimization of quadratic functions and polynomials. A link that goes deep and allows one to phrase optimization problems in the language of polynomials, and vice versa...
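The link in one line (a standard fact, stated here only for context): for a quadratic objective with Hessian $H$ and minimizer $x^\star$, the iterates of first-order methods such as gradient descent or momentum satisfy

$$ x_t - x^\star = P_t(H)\,(x_0 - x^\star), $$

where $P_t$ is a polynomial of degree at most $t$ with $P_t(0) = 1$, so worst-case convergence rates become polynomial approximation problems.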
On the Link Between Optimization and Polynomials, Part 2
An analysis of momentum can be tightened using a combination of Chebyshev polynomials of the first and second kind. Through this connection we'll derive one of the most iconic methods in optimization:...
On the Link Between Optimization and Polynomials, Part 3
I've seen things you people wouldn't believe. Valleys sculpted by trigonometric functions. Rates on fire off the shoulder of divergence. Beams glitter in the dark near the Polyak gate. All those...
On the Link Between Optimization and Polynomials, Part 4
While the most common accelerated methods like Polyak's and Nesterov's incorporate a momentum term, a little-known fact is that simple gradient descent (no momentum) can achieve the same rate through only...
Optimization Nuggets: Exponential Convergence of SGD
This is the first of a series of blog posts on short and beautiful proofs in optimization (let me know what you think in the comments!). For this first post in the series I'll show that stochastic...
Optimization Nuggets: Implicit Bias of Gradient-based Methods
When an optimization problem has multiple global minima, different algorithms can find different solutions, a phenomenon often referred to as the implicit bias of optimization algorithms. In this post...
On the Link Between Optimization and Polynomials, Part 5
Six: All of this has happened before. Baltar: But the question remains, does all of this have to happen again? Six: This time I bet no. Baltar: You know, I've never known you to play the optimist. Why...
Notes on the Frank-Wolfe Algorithm, Part III: backtracking line-search
Backtracking step-size strategies (also known as adaptive step-size or approximate line-search) that set the step-size based on a sufficient decrease condition are the standard way to set the...
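The sufficient-decrease idea in its generic Armijo form, as a sketch; the post's variant is adapted to the Frank-Wolfe step and differs in the details:

    def backtracking_step(f, x, direction, directional_derivative,
                          step=1.0, shrink=0.5, c=1e-4):
        # Shrink the step until the sufficient-decrease (Armijo) condition holds:
        # f(x + step * d) <= f(x) + c * step * <grad f(x), d>.
        fx = f(x)
        while f(x + step * direction) > fx + c * step * directional_derivative:
            step *= shrink
        return step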
On the Convergence of the Unadjusted Langevin Algorithm
The Langevin algorithm is a simple and powerful method to sample from a probability distribution. It's a key ingredient of some machine learning methods such as diffusion models and differentially...
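The iteration in question, for sampling from a density proportional to $e^{-f}$ (standard form):

$$ x_{t+1} = x_t - \gamma \nabla f(x_t) + \sqrt{2\gamma}\,\varepsilon_t, \qquad \varepsilon_t \sim \mathcal{N}(0, I), $$

that is, gradient descent plus properly scaled Gaussian noise; "unadjusted" refers to the absence of a Metropolis-Hastings correction step.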
Optimization Nuggets: Stochastic Polyak Step-size
The stochastic Polyak step-size (SPS) is a practical variant of the Polyak step-size for stochastic optimization. In this blog post, we'll discuss the algorithm and provide a simple analysis for...
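For context, the step-size in question in its basic form (practical variants cap it at a maximal value):

$$ \gamma_t = \frac{f_{i_t}(x_t) - f_{i_t}^\star}{\|\nabla f_{i_t}(x_t)\|^2}, $$

where $i_t$ is the index sampled at iteration $t$ and $f_{i_t}^\star$ is the minimum of that individual loss.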
Optimization Nuggets: Stochastic Polyak Step-size, Part 2
This blog post discusses the convergence rate of the Stochastic Gradient Descent with Stochastic Polyak Step-size (SGD-SPS) algorithm for minimizing a finite sum objective. Building upon the proof of...
On the Link Between Optimization and Polynomials, Part 6.
Differentiating through optimization is a fundamental problem in hyperparameter optimization, dataset distillation, meta-learning and optimization as a layer, to name a few. In this blog post we'll...