Gustavo Lacerda

+1 9I7 655 87O7

research scrapyard — hammers seeking nails


STORM imaging with ISTA, report slides

mixture of Kalman Filters
Demo: Mixture of Gaussians EM
Demo: Mixture of Kalman Filters EM


regularization / filtering / clustering with pairwise fusion penalties, report slides
thoroughly scooped by T. Hocking, A. Joulin, F. Bach and J.-P. Vert. Clusterpath: an Algorithm for Clustering using Convex Fusion Penalties, ICML 2011. [pdf]

I am a PhD student in Statistics at Columbia University, working with John Paisley.

Areas of interest: machine learning, graphical models, signal processing, information theory, compressed sensing, convex optimization; dynamical systems, functional data analysis / spatial statistics / shape analysis

Models of interest: Independent Component/Subspace Analysis, Mixture of Kalman Filters; network models; topic models; differential equation models; Non-Parametric Bayes; structured data.

Data of interest: spike trains, microscope images, networks, potentially anything

Neat ideas: geometry of exponential families, n-ary relational learning, automatic optimization, manifold learning, information bottleneck method, low-rank matrix completion, unfolding flower models, probabilistic programming languages

"Follow Occam street, but do not stop!" -here

academic history

Columbia University 2010-, PhD in Statistics

University of British Columbia 2008-2010, MSc in Computer Science

Carnegie Mellon University 2006-2008, programmer for HCII, researcher at Machine Learning Department

Universiteit van Amsterdam 2003-2005, MSc in Logic at ILLC

Bucknell University 1997-2001, B.S. in Mathematics and Computer Science

why so many places, so many degrees?

conferences and summer schools

IPAM GSS 2007       ICML/UAI 2008       SFI Summer School 2009
NIPS 2008, 2009       CogSci 2008, 2009

papers all publications Google Scholar

Identification of gene modules using a generative model for relational data (PDF) - UBC Master's thesis (2010), supervised by Jennifer Bryan.

Discovering Cyclic Causal Models by ICA (UAI 2008) (paper, video lecture with slides) - extends LiNGAM to discover cyclic models. The non-Gaussian model leads to a finer level of identifiability than can be achieved in the Gaussian case (e.g. by Richardson's CCD), and allows us to relax the faithfulness assumption.

(draft) Upper-Bounding Proof Length with the Busy Beaver (2008) (PDF) - This note presents a Chaitin-esque result. I derive an (uncomputable) upper bound on the length of the shortest proof of any given statement, as a function of the length of the statement; and briefly discuss implications. Mathematically trivial, but original (to the best of my knowledge). Could possibly be useful if we ever have good estimates of BB for n large enough to encode an interesting question.
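
A rough sketch of how a bound of this kind can arise. This is not taken from the note itself and rests on stated assumptions: a fixed formal system $F$ whose proofs can be checked mechanically, a statement $s$ of length $|s|$ that is provable in $F$, and hypothetical constants $a$ and $c$ (depending only on $F$ and the encoding) that cover the cost of hard-coding $s$ into a proof-searching Turing machine. Here $S(n)$ is the maximum number of steps taken by any halting $n$-state Turing machine, i.e. the step-counting variant of the Busy Beaver function, and $\ell(s)$ is the length of the shortest proof of $s$.

```latex
% Assumption: M_s enumerates candidate proofs in order of increasing length
% and halts on the first valid proof of s. The constants a and c are hypothetical.
\begin{align*}
  \#\mathrm{states}(M_s) &\le a\,|s| + c
    && \text{($s$ is hard-coded; the proof checker is fixed)} \\
  \mathrm{steps}(M_s) &\le S\bigl(a\,|s| + c\bigr)
    && \text{($M_s$ halts because $s$ is provable)} \\
  \ell(s) &\le \mathrm{steps}(M_s)
    && \text{($M_s$ must at least write out the proof it finds)} \\
  \therefore\quad \ell(s) &\le S\bigl(a\,|s| + c\bigr)
\end{align*}
```

The bound is uncomputable because $S$ is, which matches the "Chaitin-esque" flavour described above.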

see all papers


- Introduction to Kolmogorov Complexity (with Liliana Salvador) (slides), 45 minutes.

- Introduction to Machine Learning and Bayesian inference (slides), 45 minutes.

video demos

slice sampling

general-purpose code

R: R-helpers
Julia: B-Splines

getting help

Q&A sites for Math: mathoverflow, math.stackexchange
Q&A sites for Machine Learning / Stats: CrossValidated, MetaOptimize
For R, visit the #R channel on FreeNode (IRC). Emacs users can use IRC by doing "M-x erc".

If your problem is computationally intensive, consider learning distributed programming (GPU or cluster).

work tools

Ergonomics: standing desk, high chair, white boards

Programming languages: Julia, R, Matlab

Programming tools: emacs, automatic memoization

Writing: knitr, pandoc, LyX, ShareLatex

Online notebook: MediaWiki

Data collection: BeautifulSoup

Reference management: Zotero

misc tools

Beeminder, Boomerang, Dropbox, Google Calendar (social features, add events with one click), TotalFinder, SSHFS

language tools

"a unique translation tool combining an editorial dictionary and a search engine"

Google Translate tooltip

learn a new language / help translate documents; very clever crowdsourcing.

some things I like

argument mapping, bikes, bluegrass, contact improvisation, DreamWidth, functional programming, GiveWell, infoviz, musical instruments, open data, Quantified Self

food for thought

"You and Your Research", by Richard Hamming

"Why People Are Irrational about Politics", by Michael Huemer

"Why I defend scoundrels", by Yvain

Paul Graham: "How to do Philosophy", "Why nerds are unpopular"

LessWrong: Applause Lights
"Illusion of Transparency: Why No One Understands You"

Ribbonfarm: "A Big Little Idea Called Legibility"

Ben Goldacre: "The Information Architecture of Medicine is Broken"


Andrew Gelman - Statistical Modeling, Causal Inference, and Social Science

Cosma Shalizi - Three-Toed Sloth

Cathy O'Neil - mathbabe

Peter Gray - Freedom to Learn

neat toys

Algodoo: 2D physics engine

BeepBox: chiptune editor

This website is permanently under construction. You may notice that behind this frontpage is a MediaWiki site. Someday I'd like to have indexing. For now, keyword searches will have to do. RIP Xanadu