Keep the gradient flowing

A leader's birthday

Even though we'll see each other at lunchtime, I'd like to take this chance to congratulate our great co-leader, who is omnipresent, omnipotent and omnivorous. And this reminds me that a month ago was the birthday...


On the Link Between Polynomials and Optimization

There's a fascinating link between the minimization of quadratic functions and polynomials, one that goes deep and allows us to phrase optimization problems in the language of polynomials and vice versa....
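
As a concrete anchor for that link, here is a small numpy sketch (my own illustration, not code from the post): after t steps of gradient descent on a quadratic, the error is exactly a degree-t polynomial of the Hessian, (I - step*H)^t, applied to the initial error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random strongly convex quadratic f(x) = 1/2 (x - x_star)^T H (x - x_star).
A = rng.standard_normal((5, 5))
H = A @ A.T + np.eye(5)          # symmetric positive definite Hessian
x_star = rng.standard_normal(5)  # minimizer
step = 1.0 / np.linalg.norm(H, 2)

x = np.zeros(5)
for t in range(10):
    x = x - step * H @ (x - x_star)   # gradient descent step

# The error after 10 steps is the residual polynomial P(H) = (I - step*H)^10
# applied to the initial error x0 - x_star.
P = np.linalg.matrix_power(np.eye(5) - step * H, 10)
print(np.allclose(x - x_star, P @ (np.zeros(5) - x_star)))   # True
```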


On the Link Between Optimization and Polynomials, Part 2

An analysis of momentum can be tightened using a combination of Chebyshev polynomials of the first and second kind. Through this connection we'll derive one of the most iconic methods in optimization:...
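
For reference, the momentum recursion under discussion has the general form x_{t+1} = x_t - step*grad(x_t) + momentum*(x_t - x_{t-1}). Below is a minimal sketch of it (my own, using the classical heavy-ball tuning for eigenvalues in [mu, L] as an illustrative choice, not necessarily the constants derived in the post).

```python
import numpy as np

def momentum_descent(grad, x0, step, momentum, n_iter=100):
    """Gradient descent with a heavy-ball style momentum term:
    x_{t+1} = x_t - step * grad(x_t) + momentum * (x_t - x_{t-1})."""
    x, x_prev = x0.copy(), x0.copy()
    for _ in range(n_iter):
        x, x_prev = x - step * grad(x) + momentum * (x - x_prev), x
    return x

# Example: a simple quadratic with eigenvalues in [mu, L].
mu, L = 0.1, 1.0
H = np.diag(np.linspace(mu, L, 10))
grad = lambda x: H @ x
# Illustrative tuning (assumption): the classical heavy-ball constants for [mu, L].
step = (2 / (np.sqrt(L) + np.sqrt(mu))) ** 2
momentum = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2
print(np.linalg.norm(momentum_descent(grad, np.ones(10), step, momentum)))
```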


On the Link Between Optimization and Polynomials, Part 3

I've seen things you people wouldn't believe. Valleys sculpted by trigonometric functions. Rates on fire off the shoulder of divergence. Beams glitter in the dark near the Polyak gate. All those...


On the Link Between Optimization and Polynomials, Part 4

While the most common accelerated methods like Polyak and Nesterov incorporate a momentum term, a little-known fact is that simple gradient descent –no momentum– can achieve the same rate through only...
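
One classical schedule with this flavor, shown here purely as an illustration (it may differ from the schedule the post derives), takes the step-sizes to be the inverses of the roots of a Chebyshev polynomial rescaled to the eigenvalue interval [mu, L]:

```python
import numpy as np

def chebyshev_steps(mu, L, n):
    """Step-sizes 1/lambda_i, where lambda_i are the roots of the degree-n
    Chebyshev polynomial rescaled from [-1, 1] to [mu, L]."""
    i = np.arange(n)
    roots = (L + mu) / 2 + (L - mu) / 2 * np.cos((2 * i + 1) * np.pi / (2 * n))
    return 1.0 / roots

# Plain gradient descent on a quadratic: no momentum, only varying step-sizes.
mu, L, n = 0.1, 1.0, 16
H = np.diag(np.linspace(mu, L, 20))
x = np.ones(20)
for step in chebyshev_steps(mu, L, n):
    x = x - step * H @ x
print(np.linalg.norm(x))  # much smaller than with a constant step 1/L
```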


Optimization Nuggets: Exponential Convergence of SGD

This is the first of a series of blog posts on short and beautiful proofs in optimization (let me know what you think in the comments!). In this post I'll show that stochastic...
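
As a companion to that claim, here is a tiny numpy experiment of mine, under an interpolation assumption (a least-squares problem whose minimizer fits every data point exactly), where constant step-size SGD converges at an exponential rate:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 10
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true                      # interpolation: zero loss at x_true

x = np.zeros(d)
step = 1.0 / np.max(np.sum(A**2, axis=1))    # 1 / max_i ||a_i||^2
for t in range(2000):
    i = rng.integers(n)                       # sample one data point
    x = x - step * (A[i] @ x - b[i]) * A[i]   # SGD step on f_i(x) = 1/2 (a_i^T x - b_i)^2
print(np.linalg.norm(x - x_true))             # exponentially small in the iteration count
```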


Optimization Nuggets: Implicit Bias of Gradient-based Methods

When an optimization problem has multiple global minima, different algorithms can find different solutions, a phenomenon often referred to as the implicit bias of optimization algorithms. In this post...
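
A standard example of this phenomenon, written here as my own sketch rather than taken from the post: on an underdetermined least-squares problem, gradient descent started at the origin converges to the minimum-Euclidean-norm solution among all global minima.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10, 50                      # fewer equations than unknowns: many global minima
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

x = np.zeros(d)                    # initialization at the origin
step = 1.0 / np.linalg.norm(A, 2) ** 2
for _ in range(1000):
    x = x - step * A.T @ (A @ x - b)   # gradient descent on 1/2 ||Ax - b||^2

x_min_norm = np.linalg.pinv(A) @ b     # minimum-norm solution among all minimizers
print(np.allclose(x, x_min_norm, atol=1e-6))   # True: GD selected this particular minimum
```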


On the Link Between Optimization and Polynomials, Part 5

Six: All of this has happened before. Baltar: But the question remains, does all of this have to happen again? Six: This time I bet no. Baltar: You know, I've never known you to play the optimist. Why...


Notes on the Frank-Wolfe Algorithm, Part III: backtracking line-search

Backtracking step-size strategies (also known as adaptive step-size or approximate line-search) that set the step-size based on a sufficient decrease condition are the standard way to set the...
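
As a rough sketch of the idea (mine, simplified, and not the post's exact algorithm), the backtracking loop below maintains an estimate L of the local smoothness constant and accepts a step-size once a quadratic sufficient-decrease condition holds:

```python
import numpy as np

def frank_wolfe_backtracking(f, grad, lmo, x0, n_iter=100, L0=1.0):
    """Frank-Wolfe with a backtracking estimate L of the local smoothness constant."""
    x, L = x0.copy(), L0
    for _ in range(n_iter):
        g = grad(x)
        d = lmo(g) - x                        # Frank-Wolfe direction
        gap = -g @ d                          # Frank-Wolfe gap (progress certificate)
        if gap <= 1e-12:
            break                             # already (near) optimal
        L = max(L / 2, 1e-10)                 # optimistic decrease of the estimate
        while True:
            step = min(gap / (L * (d @ d)), 1.0)
            if f(x + step * d) <= f(x) - step * gap + 0.5 * L * step**2 * (d @ d):
                break                         # sufficient decrease satisfied
            L *= 2                            # otherwise increase the smoothness estimate
        x = x + step * d
    return x

# Example: least squares over the probability simplex.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((30, 10)), rng.standard_normal(30)
f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
grad = lambda x: A.T @ (A @ x - b)
lmo = lambda g: np.eye(10)[np.argmin(g)]      # simplex vertex minimizing <g, s>
print(f(frank_wolfe_backtracking(f, grad, lmo, np.ones(10) / 10)))
```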


On the Convergence of the Unadjusted Langevin Algorithm

The Langevin algorithm is a simple and powerful method to sample from a probability distribution. It's a key ingredient of some machine learning methods such as diffusion models and differentially...
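
In its simplest form, the update analyzed here takes a gradient step on the negative log-density plus Gaussian noise, x_{k+1} = x_k - step*grad f(x_k) + sqrt(2*step)*xi_k. Below is a minimal sketch of mine targeting a standard Gaussian (the step-size 0.05 and the iteration counts are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Target density proportional to exp(-f), here a standard 2D Gaussian: f(x) = ||x||^2 / 2.
grad_f = lambda x: x
step = 0.05

x = np.zeros(2)
samples = []
for k in range(20000):
    noise = rng.standard_normal(2)
    x = x - step * grad_f(x) + np.sqrt(2 * step) * noise   # unadjusted Langevin step
    samples.append(x)

samples = np.array(samples[5000:])                  # discard burn-in
print(samples.mean(axis=0), samples.var(axis=0))    # roughly (0, 0) and (1, 1), up to step-size bias
```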


Optimization Nuggets: Stochastic Polyak Step-size

The stochastic Polyak step-size (SPS) is a practical variant of the Polyak step-size for stochastic optimization. In this blog post, we'll discuss the algorithm and provide a simple analysis for...
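
For concreteness, the step-size in question is gamma_i = (f_i(x) - f_i^*) / ||grad f_i(x)||^2. Here is a minimal sketch of mine on a least-squares problem where interpolation lets us take f_i^* = 0 (an assumption of this example, not a general requirement):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 20
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d)        # interpolating problem, so each f_i^* = 0

def f_i(x, i):                        # per-sample loss f_i(x) = 1/2 (a_i^T x - b_i)^2
    return 0.5 * (A[i] @ x - b[i]) ** 2

def grad_i(x, i):
    return (A[i] @ x - b[i]) * A[i]

x = np.zeros(d)
for t in range(5000):
    i = rng.integers(n)
    g = grad_i(x, i)
    sq_norm = g @ g
    if sq_norm == 0:
        continue
    step = (f_i(x, i) - 0.0) / sq_norm    # stochastic Polyak step-size with f_i^* = 0
    x = x - step * g

print(np.linalg.norm(A @ x - b))          # near zero under interpolation
```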


Optimization Nuggets: Stochastic Polyak Step-size, Part 2

This blog post discusses the convergence rate of the Stochastic Gradient Descent with Stochastic Polyak Step-size (SGD-SPS) algorithm for minimizing a finite sum objective. Building upon the proof of...


On the Link Between Optimization and Polynomials, Part 6

Differentiating through optimization is a fundamental problem in hyperparameter optimization, dataset distillation, meta-learning and optimization as a layer, to name a few. In this blog post we'll...
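
One standard tool for this, shown as my own sketch and not necessarily the route the post takes, is the implicit function theorem: at a minimizer x*(theta) of f(., theta), the Jacobian is dx*/dtheta = -[d^2f/dx^2]^{-1} [d^2f/(dx dtheta)]. The example below verifies this on a quadratic inner problem with a finite-difference check:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
A = rng.standard_normal((d, d))
H = A @ A.T + np.eye(d)              # Hessian of the inner problem (positive definite)

# Inner problem: x*(theta) = argmin_x 1/2 x^T H x - theta^T x, so x*(theta) = H^{-1} theta.
def solve_inner(theta, n_iter=500):
    x, step = np.zeros(d), 1.0 / np.linalg.norm(H, 2)
    for _ in range(n_iter):
        x = x - step * (H @ x - theta)     # gradient descent on the inner objective
    return x

# Implicit function theorem: dx*/dtheta = -H^{-1} (-I) = H^{-1}.
jacobian = -np.linalg.solve(H, -np.eye(d))

# Finite-difference check against the iterative solver.
theta, eps, k = rng.standard_normal(d), 1e-5, 0
fd_column = (solve_inner(theta + eps * np.eye(d)[k])
             - solve_inner(theta - eps * np.eye(d)[k])) / (2 * eps)
print(np.allclose(fd_column, jacobian[:, k], atol=1e-4))   # True
```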
