Publications
See also: Tutorials, Class projects
Note: I take full responsibility for the content on this page and any other pages that don't have an "edit" link, as they are only editable by me. This notice also appears on my homepage. -- Gustavo Lacerda
Statistics on Structured Data
Essentially a write-up of the first two months of my MSc research (minus the "learning R" part).
Using simulations, we find that, in a maximum likelihood setting, the true block structure is recovered most often when the clustering strength parameter is underestimated. This is perhaps not too surprising, considering that the size of the data is fixed.
Causality
- Gustavo Lacerda, Peter Spirtes, Joseph Ramsey, Patrik O. Hoyer - Discovering Cyclic Causal Models by Independent Components Analysis, Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence (UAI-2008) (plenary talk [video])
Generalizes the LiNGAM method to deal with cycles, and proposes stability as a partial solution to the underdetermination. Cyclic SEMs correspond to linear dynamical systems. The non-Gaussian model leads to a finer level of identifiability than what can be achieved in the Gaussian case (e.g. by Richardson's CCD), and allows us to relax the faithfulness assumption. We prove theorems about identifiability, specifically about when a unique model can be identified. Besides the new results, this paper also contains a novel presentation of the LiNGAM method.
- P. O. Hoyer, A. Hyvärinen, R. Scheines, P. Spirtes, J. Ramsey, G. Lacerda, and S. Shimizu - Causal discovery of linear acyclic models with arbitrary distributions, Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence (UAI-2008)
How to intelligently combine LiNGAM with methods based on conditional independence tests. This is useful when more than one, but not all, of the error terms may be Gaussian. (Future work: to make this smoother, use a Bayesian search that considers many equivalence classes.)
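To make the stability idea from the cyclic-model paper above concrete, here is a minimal sketch. It assumes the usual definition (a cyclic linear model x = Bx + e is stable when every eigenvalue of B has modulus below 1) and is restricted to two variables so it needs no libraries. The matrices, the noise vector and all function names are made up for illustration; this is not code from the paper.

```python
import cmath

def eigenvalues_2x2(B):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    (a, b), (c, d) = B
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + disc) / 2, (trace - disc) / 2)

def is_stable(B):
    """Assumed definition: all eigenvalues of B have modulus below 1."""
    return all(abs(lam) < 1 for lam in eigenvalues_2x2(B))

def equilibrium(B, e):
    """Solve x = Bx + e, i.e. x = (I - B)^{-1} e, in closed form for 2x2."""
    (a, b), (c, d) = B
    det = (1 - a) * (1 - d) - b * c
    x1 = ((1 - d) * e[0] + b * e[1]) / det
    x2 = (c * e[0] + (1 - a) * e[1]) / det
    return (x1, x2)

def iterate(B, e, steps=200):
    """Run the dynamical system x <- Bx + e from x = 0 with e held fixed."""
    x = (0.0, 0.0)
    for _ in range(steps):
        x = (B[0][0] * x[0] + B[0][1] * x[1] + e[0],
             B[1][0] * x[0] + B[1][1] * x[1] + e[1])
    return x

# Hypothetical two-variable cycles: x1 <- x2 and x2 <- x1.
B_stable = [[0.0, 0.5], [-0.8, 0.0]]   # eigenvalue modulus about 0.63
B_unstable = [[0.0, 1.5], [1.0, 0.0]]  # eigenvalue modulus about 1.22
e = (1.0, 2.0)                         # one fixed draw of the error terms

x_eq = equilibrium(B_stable, e)
x_it = iterate(B_stable, e)
```

For the stable matrix, iterating the system converges to the same point as the closed-form equilibrium, which is the sense in which a cyclic SEM can be read as a linear dynamical system. For the unstable matrix the iteration diverges, even though (I - B) is still invertible.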
NLP / Information Retrieval
- S. Fissaha Adafre, W.R. van Hage, J. Kamps, G. Lacerda de Melo, and M. de Rijke - The University of Amsterdam at CLEF 2004, In: C. Peters and F. Borri, editors, Working Notes for the CLEF 2004 Workshop, pages 91-98, 2004.
I think this paper was a blend of many people's independent projects. My part was building a bilingual Portuguese-English dictionary from a parallel corpus. This involved doing statistical word-alignment before I knew anything about machine learning. Since we had a very large corpus, it worked out OK. I created a score that used proximity in location, a cognate heuristic, word-length correlations and an assumption that synonyms do not appear in the same sentence. Finally, I bootstrapped from a hand-made dictionary of 100 word-pairs.
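As a rough illustration of what a score of this kind can look like, here is a small sketch combining three of the cues mentioned above (proximity in location, a cognate heuristic and word-length similarity). Everything in it is hypothetical: the function names, the use of `difflib.SequenceMatcher` as a cognate measure, the equal weighting and the example sentences are assumptions for illustration, not the original implementation. The synonym assumption and the bootstrapping from a seed dictionary are not modeled.

```python
from difflib import SequenceMatcher

def cognate_score(src, tgt):
    """String similarity in [0, 1], used here as a stand-in cognate heuristic."""
    return SequenceMatcher(None, src.lower(), tgt.lower()).ratio()

def position_score(i, src_len, j, tgt_len):
    """1.0 when both words sit at the same relative position in their sentence."""
    rel_i = i / max(src_len - 1, 1)
    rel_j = j / max(tgt_len - 1, 1)
    return 1.0 - abs(rel_i - rel_j)

def length_score(src, tgt):
    """Ratio of the shorter word length to the longer one."""
    return min(len(src), len(tgt)) / max(len(src), len(tgt))

def pair_score(src_sent, i, tgt_sent, j):
    """Unweighted sum of the three cues (the equal weighting is an assumption)."""
    src, tgt = src_sent[i], tgt_sent[j]
    return (cognate_score(src, tgt)
            + position_score(i, len(src_sent), j, len(tgt_sent))
            + length_score(src, tgt))

def best_alignment(src_sent, tgt_sent):
    """Map each source word to its highest-scoring target word."""
    alignment = {}
    for i, src in enumerate(src_sent):
        best_j = max(range(len(tgt_sent)),
                     key=lambda j: pair_score(src_sent, i, tgt_sent, j))
        alignment[src] = tgt_sent[best_j]
    return alignment

# Made-up sentence pair, not taken from the CLEF corpus.
pt = ["a", "informação", "é", "importante"]
en = ["the", "information", "is", "important"]
aligned = best_alignment(pt, en)
```

On a single sentence pair a score like this is noisy. With a large corpus, per-sentence scores can be accumulated over all sentence pairs so that consistent pairings dominate.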
Student modeling
- Noboru Matsuda, William W. Cohen, Jonathan Sewall, Gustavo Lacerda, and Kenneth R. Koedinger (2008) - Why tutored problem solving may be better than example study: Theoretical implications from a simulated-student study. In Proceedings of the International Conference on Intelligent Tutoring Systems.
- Noboru Matsuda, William W. Cohen, Jonathan Sewall, Gustavo Lacerda, and Kenneth R. Koedinger (2007) - Predicting students performance with SimStudent that learns cognitive skills from observation. In R. Luckin, K. R. Koedinger & J. Greer (Eds.), Proceedings of the international conference on Artificial Intelligence in Education (pp. 467-476). Amsterdam, Netherlands: IOS Press.
- Noboru Matsuda, William W. Cohen, Jonathan Sewall, Gustavo Lacerda, and Kenneth R. Koedinger - Evaluating a Simulated Student using Real Students Data for Training and Testing, In C. Conati, K. McCoy & G. Paliouras (Eds.), Proceedings of the international conference on User Modeling (LNAI 4511) (pp. 107-116). Berlin, Heidelberg: Springer.
Logic
- Gustavo Lacerda - An Information-Theoretic Upper Bound on the Length of the Shortest Proof (draft) (2008) - [Will put it on the arXiv when I have some time to clean it up]
subtitle: "Short theorems can't have arbitrarily long proofs as their shortest proof" (a.k.a. "Upper-Bounding Proof Length with the Busy Beaver")
This note presents a Chaitin-esque result. I derive an (uncomputable) upper bound on the length of the shortest proof of any given statement, as a function of the length of the statement, and briefly discuss implications. Mathematically trivial, but original (to the best of my knowledge). Could possibly be useful if we ever have good estimates of the Busy Beaver function BB(n) for n large enough to encode an interesting question. (Disclaimer: this seems VERY unlikely.)
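For readers who want the shape of such a bound, here is one standard way to obtain it. This is a sketch under stated assumptions, not necessarily the argument in the draft. Assume a formal system F with decidable proof checking, and take BB to be the maximum-running-time version of the Busy Beaver function over programs of a given description length. The constant c and the program P_s are introduced here only for illustration.

```latex
% P_s: a program that enumerates candidate proofs in F by increasing length
%      and halts on the first proof of the statement s.
% Its description length is at most |s| + c, where c depends only on F.
%
% If s is provable, P_s halts, so its running time is at most BB(|s| + c).
% P_s spends at least \ell(s) steps before it can finish checking a proof
% of length \ell(s), hence
\[
  \ell(s) \;\le\; \mathrm{BB}\bigl(|s| + c\bigr),
\]
% where \ell(s) is the length of the shortest proof of s in F.
```

The bound is uncomputable because BB is, which matches the "Chaitin-esque" flavour described above.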
Almost certainly my last excursion in mathematical logic.
