Workshop on Prior Knowledge for Text and Language
GOAL. The aim of the workshop is to present and discuss recent advances in machine learning approaches to text and natural language processing that capitalize on rich prior knowledge models in these domains.
TOPICS. The workshop aims at presenting a diversity of viewpoints on prior knowledge for language and text processing:
- Prior knowledge for language modeling and parsing,
- Topic modeling for document analysis and retrieval,
- Parametric and non-parametric Bayesian models in NLP,
- Graphical models embodying structural knowledge of texts,
- Complex features/kernels that incorporate linguistic knowledge; kernels built from generative models,
- Limitations of purely data-driven learning techniques for text and language applications; performance gains due to incorporation of prior knowledge,
- Typology of different forms of prior knowledge for NLP (knowledge embodied in generative Bayesian models, in MDL models, in ILP/logical models, in linguistic features, in representational frameworks, in grammatical rules…),
- Formal principles for combining rule-based and data-based approaches to NLP.
Instructions for presenters
Important information: the workshop will be recorded by a Pascal2 video team. If you do not want the video of your presentation to be made public, please tell the organizers at the beginning of the meeting or send mail.
- Invited talks: 40 min + 5 min for questions
- Contributed talks: 20 min + 5 min for questions
- Poster previews: poster presenters should send one or two slides in PDF format (for a 2 min presentation) to the organizers (Marc) before July 1st.
- Posters: see http://icml2008.cs.helsinki.fi/author_information.shtml for formatting instructions.
Programme (NEW: slides!)
9:00 Introduction (Organizers) dymetman_intro.pdf
Learning Rules: From PCFGs to Adaptor Grammars johnson_slides.pdf
10:30 Ming-Wei Chang, Lev Ratinov and Dan Roth.
Constraints as Prior Knowledge chang.pdf chang_slides.ppt
Some thoughts on prior knowledge, deep architectures and NLP weston_slides.pdf
- Songfang Huang and Steve Renals.
Using Participant Role in Multiparty Meetings as Prior Knowledge for Nonparametric Topic Modeling huang.pdf huang_preview.pdf
- Honglak Lee, Rajat Raina, Alex Teichman and Andrew Y. Ng.
Exponential family sparse coding with application to self-taught learning with text documents lee.pdf lee_preview.pdf
- Kinfe Tadesse Mengistu, Mirko Hannemann, Tobias Baum and Andreas Wendemuth.
Using Prior Domain Knowledge to Build HMM-Based Semantic Tagger Trained on Completely Unannotated Data mengistu.pdf mengistu_preview.pdf
- Thomas J. Murray, Panayiotis G. Georgiou and Shrikanth S. Narayanan.
Knowledge as a Constraint on Uncertainty for Unsupervised Classification: A Study in Part-of-Speech Tagging murray.pdf murray_preview.pdf
- Andreas Vlachos, Zoubin Ghahramani and Anna Korhonen.
Dirichlet Process Mixture Models for Verb Clustering vlachos.pdf vlachos_preview.pdf
Incorporating Prior Knowledge into NLP with Markov Logic domingos_slides.ppt
15:15 Mikaela Keller, John S. Brownstein and Clark C. Freifeld.
Expanding a Gazetteer-Based Approach for Geo-Parsing Disease Alerts keller.pdf keller_slides.pdf
15:40 Hanna M. Wallach, Charles Sutton and Andrew McCallum.
Bayesian Modeling of Dependency Trees Using Hierarchical Pitman-Yor Priors wallach.pdf wallach_slides.pdf
David Blei blei_panel.pdf
Fabrizio Costa costa_panel.pdf
Pedro Domingos domingos_panel.ppt
Peter Grünwald grunwald_panel.pdf
Mark Johnson johnson_panel.pdf
Jason Weston weston_panel.pdf
17:30 Partha Pratim Talukdar, Ted Sandler, Mark Dredze, Koby Crammer, John Blitzer and Fernando Pereira.
DRASO: Declaratively Regularized Alternating Structural Optimization talukdar.pdf talukdar_slides.pdf
- Guillaume Bouchard: guillaume (dot) bouchard (at) xrce (dot) xerox (dot) com
- Hal Daumé III: hal (at) cs (dot) utah (dot) edu
- Marc Dymetman (main contact): marc (dot) dymetman (at) xrce (dot) xerox (dot) com
- Yee Whye Teh: yeewhye (at) gmail (dot) com
- Guillaume Bouchard, Xerox Research Center Europe
- Nicola Cancedda, Xerox Research Center Europe
- Hal Daumé III, University of Utah
- Marc Dymetman, Xerox Research Center Europe
- Tom Griffiths, Stanford University
- Peter Grünwald, Centrum voor Wiskunde en Informatica
- Kevin Knight, University of Southern California
- Mark Johnson, Brown University
- Yee Whye Teh, University College London
- new: The Generative-Discriminative Learning Interface (NIPS Workshop 2009)