Constraint-based Sentence Compression: An Integer Programming Approach
Mirella Lapata (joint work with James Clarke)

In this talk we introduce the sentence compression task, which can be viewed as producing a summary of a single sentence. An ideal compression algorithm should produce a shorter version of an original sentence that retains the most important information while remaining grammatical. The task has an immediate impact on several applications ranging from document summarisation to audio scanning devices for the blind and caption generation.

Previous approaches have primarily relied on parallel corpora to determine what is important in a sentence. These include data-intensive methods inspired by machine translation, which use the noisy-channel model, and methods inspired by parsing, which treat compression as a series of tree rewriting operations. Our work views sentence compression as an optimisation problem. We develop an integer programming formulation and infer globally optimal compressions in the face of linguistically motivated constraints. We show that such a formulation allows for relatively simple and knowledge-lean compression models that do not require parallel corpora or large-scale resources. The proposed approach yields results comparable, and in some cases superior, to the state of the art.
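
To make the optimisation view concrete, here is a minimal sketch of compression as a 0-1 programme. It is an illustration only, not the model presented in the talk: the function name `compress`, the word scores, the "word i may stay only if word j stays" constraints, and the length limit are all assumptions made for this example. The toy problem is solved by exhaustive search rather than by an integer programming solver, which is feasible only for very short sentences.

```python
from itertools import product


def compress(words, scores, requires, max_len):
    """Solve a tiny 0-1 programme by exhaustive search (hypothetical helper).

    x[i] = 1 keeps words[i]. Maximise sum(scores[i] * x[i]) subject to
      - sum(x) <= max_len                      (length constraint)
      - x[i] <= x[j] for each (i, j) in requires
        (word i may be kept only if word j is kept)
    """
    best, best_value = None, float("-inf")
    for x in product((0, 1), repeat=len(words)):
        if sum(x) > max_len:
            continue  # violates the length constraint
        if any(x[i] > x[j] for i, j in requires):
            continue  # a dependent word is kept without the word it needs
        value = sum(s * xi for s, xi in zip(scores, x))
        if value > best_value:
            best, best_value = x, value
    return [w for w, xi in zip(words, best) if xi]


# Made-up importance scores and constraints for a toy sentence.
words = ["the", "old", "dog", "barked", "loudly"]
scores = [0.1, 0.3, 1.0, 1.2, 0.4]
requires = [(0, 2), (1, 2), (4, 3)]  # the->dog, old->dog, loudly->barked
result = compress(words, scores, requires, max_len=3)
print(" ".join(result))
```

Because every assignment is checked against all constraints before it is scored, the returned compression is the global optimum for this toy objective. A real system would hand the same objective and constraints to an integer programming solver instead of enumerating assignments.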