Workshop on Prior Knowledge for Text and Language
9 July 2008, Helsinki, in conjunction with the ICML/UAI/COLT conferences (see also Workshops).
The workshop is sponsored by the PASCAL-2 European Network of Excellence, and is part of PASCAL-2's Thematic Programme "Leveraging Complex Prior Knowledge for Learning".
GOAL. The aim of the workshop is to present and discuss recent advances in machine learning approaches to text and natural language processing that capitalize on rich prior knowledge models in these domains.
TOPICS. The workshop aims at presenting a diversity of viewpoints on prior knowledge for language and text processing:
- Prior knowledge for language modeling and parsing,
- Topic modeling for document analysis and retrieval,
- Parametric and non-parametric Bayesian models in NLP,
- Graphical models embodying structural knowledge of texts,
- Complex features/kernels that incorporate linguistic knowledge; kernels built from generative models,
- Limitations of purely data-driven learning techniques for text and language applications; performance gains due to incorporation of prior knowledge,
- Typology of different forms of prior knowledge for NLP (knowledge embodied in generative Bayesian models, in MDL models, in ILP/logical models, in linguistic features, in representational frameworks, in grammatical rules…),
- Formal principles for combining rule-based and data-based approaches to NLP.
Proceedings.pdf
Instructions for presenters
Important information: the workshop will be recorded by a PASCAL-2 video team. If you do not want the video of your presentation to be made public, please tell the organizers at the beginning of the meeting or send them an email.
- Invited talks: 40 min + 5 min for questions
- Contributed talks: 20 min + 5 min for questions
- Poster previews: poster presenters should send one or two slides in PDF format (for a 2 min presentation) to the organizers (Marc) before July 1st.
- Posters: see http://icml2008.cs.helsinki.fi/author_information.shtml for formatting instructions.
Programme (new: slides!)
9:00 | Introduction (Organizers) dymetman_intro.pdf |
9:05 | Invited talk: Mark Johnson. Learning Rules: From PCFGs to Adaptor Grammars johnson_slides.pdf |
9:50 | Poster preview |
10:00 | Break |
10:30 | Ming-Wei Chang, Lev Ratinov and Dan Roth. Constraints as Prior Knowledge chang.pdf chang_slides.ppt |
10:55 | Invited talk: Jason Weston. Some thoughts on prior knowledge, deep architectures and NLP weston_slides.pdf |
11:40 | Poster session |
- Songfang Huang and Steve Renals. Using Participant Role in Multiparty Meetings as Prior Knowledge for Nonparametric Topic Modeling huang.pdf huang_preview.pdf
- Honglak Lee, Rajat Raina, Alex Teichman and Andrew Y. Ng. Exponential family sparse coding with application to self-taught learning with text documents lee.pdf lee_preview.pdf
- Kinfe Tadesse Mengistu, Mirko Hanneman, Tobias Baum and Andreas Wendemuth. Using Prior Domain Knowledge to Build HMM-Based Semantic Tagger Trained on Completely Unannotated Data mengistu.pdf mengistu_preview.pdf
- Thomas J. Murray, Panayiotis G. Georgiou and Shrikanth S. Narayanan. Knowledge as a Constraint on Uncertainty for Unsupervised Classification: A Study in Part-of-Speech Tagging murray.pdf murray_preview.pdf
- Andreas Vlachos, Zoubin Ghahramani and Anna Korhonen. Dirichlet Process Mixture Models for Verb Clustering vlachos.pdf vlachos_preview.pdf
12:30 | Lunch Break |
14:30 | Invited talk: Pedro Domingos. Incorporating Prior Knowledge into NLP with Markov Logic domingos_slides.ppt |
15:15 | Mikaela Keller, John S. Brownstein and Clark C. Freifeld. Expanding a Gazetteer-Based Approach for Geo-Parsing Disease Alerts keller.pdf keller_slides.pdf |
15:40 | Hanna M. Wallach, Charles Sutton and Andrew McCallum. Bayesian Modeling of Dependency Trees Using Hierarchical Pitman-Yor Priors wallach.pdf wallach_slides.pdf |
16:05 | Break |
16:30 | Panel |
- David Blei blei_panel.pdf
- Fabrizio Costa costa_panel.pdf
- Pedro Domingos domingos_panel.ppt
- Peter Grünwald grunwald_panel.pdf
- Mark Johnson johnson_panel.pdf
- Jason Weston weston_panel.pdf
17:30 | Partha Pratim Talukdar, Ted Sandler, Mark Dredze, Koby Crammer, John Blitzer and Fernando Pereira. DRASO: Declaratively Regularized Alternating Structural Optimization talukdar.pdf talukdar_slides.pdf |
17:55 | Wrap up |
18:00 | End |
Organizers
- Guillaume Bouchard: guillaume (dot) bouchard (at) xrce (dot) xerox (dot) com
- Hal Daumé III: hal (at) cs (dot) utah (dot) edu
- Marc Dymetman (main contact): marc (dot) dymetman (at) xrce (dot) xerox (dot) com
- Yee Whye Teh: yeewhye (at) gmail (dot) com
Program committee
- Guillaume Bouchard, Xerox Research Center Europe
- Nicola Cancedda, Xerox Research Center Europe
- Hal Daumé III, University of Utah
- Marc Dymetman, Xerox Research Center Europe
- Tom Griffiths, Stanford University
- Peter Grünwald, Centrum voor Wiskunde en Informatica
- Kevin Knight, University of Southern California
- Mark Johnson, Brown University
- Yee Whye Teh, University College London
Links
- new: The Generative-Discriminative Learning Interface (NIPS Workshop 2009)