research scrapyard — hammers seeking nails
2012
STORM imaging with ISTA (report, slides)
mixture of Kalman Filters
Demo: Mixture of Gaussians EM
Demo: Mixture of Kalman Filters EM
2011
regularization / filtering / clustering with pairwise fusion penalties (report, slides)
thoroughly scooped by T. Hocking, A. Joulin, F. Bach and J.-P. Vert. Clusterpath: an Algorithm for Clustering using Convex Fusion Penalties, ICML 2011. [pdf]
I am a PhD student in Statistics at Columbia University, working with John Paisley.
Areas of interest: machine learning, graphical models, signal processing, information theory, compressed sensing, convex optimization; dynamical systems, functional data analysis / spatial statistics / shape analysis
Models of interest: Independent Component/Subspace Analysis, Mixture of Kalman Filters; network models; topic models; differential equation models; Non-Parametric Bayes; structured data.
Data of interest: spike trains, microscope images, networks, potentially anything
Neat ideas: geometry of exponential families, n-ary relational learning, automatic optimization, manifold learning, information bottleneck method, low-rank matrix completion, unfolding flower models, probabilistic programming languages
academic history
Columbia University 2010-, PhD in Statistics
University of British Columbia 2008-2010, MSc in Computer Science
Carnegie Mellon University 2006-2008, programmer for HCII, researcher at Machine Learning Department
Universiteit van Amsterdam 2003-2005, MSc in Logic at ILLC
Bucknell University 1997-2001, B.S. in Mathematics and Computer Science
why so many places, so many degrees?
conferences and summer schools
IPAM GSS 2007, ICML/UAI 2008, SFI Summer School 2009
NIPS 2008, 2009; CogSci 2008, 2009.
papers
all publications, Google Scholar
Identification of gene modules using a generative model for relational data (PDF, slides) - UBC Master's thesis (2010), supervised by Jennifer Bryan.
Discovering Cyclic Causal Models by ICA (UAI 2008) (paper, video lecture with slides) - Extends LiNGAM to discover cyclic models. The non-Gaussian model leads to a finer level of identifiability than can be achieved in the Gaussian case (e.g. by Richardson's CCD), and allows us to relax the faithfulness assumption. We prove theorems about identifiability, specifically about when a unique model can be identified.
(draft) Upper-Bounding Proof Length with the Busy Beaver (2008) (PDF) - This note presents a Chaitin-esque result. I derive an (uncomputable) upper bound on the length of the shortest proof of any given statement, as a function of the length of the statement; and briefly discuss implications. Mathematically trivial, but original (to the best of my knowledge). Could possibly be useful if we ever have good estimates of BB for n large enough to encode an interesting question.
see all papers
tutorials
- Introduction to Kolmogorov Complexity (with Liliana Salvador) (slides), 45 minutes.
- Introduction to Machine Learning and Bayesian inference (slides), 45 minutes.
video demos
slice sampling
general-purpose code
R: R-helpers
getting help
Q&A sites for Math: mathoverflow, math.stackexchange
Q&A sites for Machine Learning / Stats: CrossValidated, MetaOptimize
For R, visit the #R channel on FreeNode (IRC). Emacs users can use IRC by doing "M-x erc".
If your problem is computationally intensive, consider learning distributed programming (GPU or cluster).
work tools
Ergonomics: standing desk, high chair, white boards
Programming languages: Julia, R, Matlab
Programming tools: emacs, automatic memoization
Writing: knitr, pandoc, LyX, ShareLaTeX
Online notebook: MediaWiki
Data collection: BeautifulSoup
Reference management: Zotero
misc tools
Beeminder, Boomerang, Dropbox, Google Calendar (social features, add events with one click), TotalFinder, SSHFS
"a unique translation tool combining an editorial dictionary and a search engine"
Google Translate tooltip
learn a new language / help translate documents; very clever crowdsourcing.
some things I like
argument mapping, bikes, bluegrass, contact improvisation, DreamWidth, functional programming, GiveWell, infoviz, musical instruments, open data, Quantified Self
food for thought
"You and Your Research", by Richard Hamming
"Why People Are Irrational about Politics", by Michael Huemer
"Why I defend scoundrels", by Yvain
Paul Graham: "How to do Philosophy", "Why nerds are unpopular"
LessWrong: Applause Lights
"Illusion of Transparency: Why No One Understands You"
Ribbonfarm: "A Big Little Idea Called Legibility"
Ben Goldacre: "The Information Architecture of Medicine is Broken"
blogs
Andrew Gelman - Statistical Modeling, Causal Inference, and Social Science
Cosma Shalizi - Three-Toed Sloth
Cathy O'Neil - mathbabe
Peter Gray - Freedom to Learn
neat toys
Algodoo: 2D physics engine
BeepBox: chiptune editor