Workshop on Prior Knowledge for Text and Language

9 July 2008, Helsinki, in conjunction with the ICML/UAI/COLT conferences.


The workshop is sponsored by the PASCAL-2 European Network of Excellence, and is part of PASCAL-2's Thematic Programme "Leveraging Complex Prior Knowledge for Learning".

GOAL. The aim of the workshop is to present and discuss recent advances in machine learning approaches to text and natural language processing that capitalize on rich prior knowledge models in these domains.

TOPICS. The workshop aims at presenting a diversity of viewpoints on prior knowledge for language and text processing:

  • Prior knowledge for language modeling and parsing,
  • Topic modeling for document analysis and retrieval,
  • Parametric and non-parametric Bayesian models in NLP,
  • Graphical models embodying structural knowledge of texts,
  • Complex features/kernels that incorporate linguistic knowledge; kernels built from generative models,
  • Limitations of purely data-driven learning techniques for text and language applications; performance gains due to incorporation of prior knowledge,
  • Typology of different forms of prior knowledge for NLP (knowledge embodied in generative Bayesian models, in MDL models, in ILP/logical models, in linguistic features, in representational frameworks, in grammatical rules…),
  • Formal principles for combining rule-based and data-based approaches to NLP.


Instructions for presenters

Important information: the workshop will be recorded by a PASCAL-2 video team. If you do not want the video of your presentation to be made public, please tell the organizers at the beginning of the meeting or by e-mail.

  • Invited talks: 40 min + 5 min for questions
  • Contributed talks: 20 min + 5 min for questions
  • Poster previews: poster presenters should send one or two slides in PDF format (for a 2 min presentation) to the organizers (Marc) before July 1st.
  • Posters: see for formatting instructions.

Programme (NEW: slides!)

9:00 Introduction (Organizers) dymetman_intro.pdf
9:05 Invited talk
Mark Johnson
Learning Rules: From PCFGs to Adaptor Grammars johnson_slides.pdf
9:50 Poster preview
10:00 Break
10:30 Ming-Wei Chang, Lev Ratinov and Dan Roth.
Constraints as Prior Knowledge chang.pdf chang_slides.ppt
10:55 Invited talk
Jason Weston
Some thoughts on prior knowledge, deep architectures and NLP weston_slides.pdf
11:40 Poster session
- Songfang Huang and Steve Renals.
Using Participant Role in Multiparty Meetings as Prior Knowledge for Nonparametric Topic Modeling huang.pdf huang_preview.pdf
- Honglak Lee, Rajat Raina, Alex Teichman and Andrew Y. Ng.
Exponential family sparse coding with application to self-taught learning with text documents lee.pdf lee_preview.pdf
- Kinfe Tadesse Mengistu, Mirko Hanneman, Tobias Baum and Andreas Wendemuth.
Using Prior Domain Knowledge to Build HMM-Based Semantic Tagger Trained on Completely Unannotated Data mengistu.pdf mengistu_preview.pdf
- Thomas J. Murray, Panayiotis G. Georgiou and Shrikanth S. Narayanan.
Knowledge as a Constraint on Uncertainty for Unsupervised Classification: A Study in Part-of-Speech Tagging murray.pdf murray_preview.pdf
- Andreas Vlachos, Zoubin Ghahramani and Anna Korhonen.
Dirichlet Process Mixture Models for Verb Clustering vlachos.pdf vlachos_preview.pdf
12:30 Lunch Break
14:30 Invited talk
Pedro Domingos
Incorporating Prior Knowledge into NLP with Markov Logic domingos_slides.ppt
15:15 Mikaela Keller, John S. Brownstein and Clark C. Freifeld.
Expanding a Gazetteer-Based Approach for Geo-Parsing Disease Alerts keller.pdf keller_slides.pdf
15:40 Hanna M. Wallach, Charles Sutton and Andrew McCallum
Bayesian Modeling of Dependency Trees Using Hierarchical Pitman-Yor Priors wallach.pdf wallach_slides.pdf
16:05 Break
16:30 Panel
David Blei blei_panel.pdf
Fabrizio Costa costa_panel.pdf
Pedro Domingos domingos_panel.ppt
Peter Grünwald grunwald_panel.pdf
Mark Johnson johnson_panel.pdf
Jason Weston weston_panel.pdf
17:30 Partha Pratim Talukdar, Ted Sandler, Mark Dredze, Koby Crammer, John Blitzer and Fernando Pereira.
DRASO: Declaratively Regularized Alternating Structural Optimization talukdar.pdf talukdar_slides.pdf
17:55 Wrap up
18:00 End


Program committee

  • Guillaume Bouchard, Xerox Research Center Europe
  • Nicola Cancedda, Xerox Research Center Europe
  • Hal Daumé III, University of Utah
  • Marc Dymetman, Xerox Research Center Europe
  • Tom Griffiths, Stanford University
  • Peter Grünwald, Centrum voor Wiskunde en Informatica
  • Kevin Knight, University of Southern California
  • Mark Johnson, Brown University
  • Yee Whye Teh, University College London



Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License