Personal Name Resolution in Email: A Heuristic Approach

Tamer Elsayed, Galileo Namata, Lise Getoor and Douglas W. Oard


ABSTRACT

Much of the work to date on searching email has focused on personal information management. Archival access poses new challenges, including automatic association of references to unfamiliar individuals using whatever information is available about those people. This paper describes a computational approach to that task motivated by intuitions about the ways people might explore an email collection to find that information. The proposed approach makes use of context in a flexible and adaptive manner. Two techniques for context expansion are described: a mixture model that combines evidence from each context to rank candidates, and a cutoff model that ranks candidates based on the closest context in which any suitable evidence was found. Both models rely on mentions that could be resolved to a common identity as evidence of the resolution. Results on three relatively small collections indicate that the accuracy of our approach compares favorably with that of the best known technique, and results on the full CMU Enron collection indicate that the approach scales well to larger email collections.
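
The difference between the two context-expansion techniques can be illustrated with a minimal sketch. It is not the implementation from the report. The function names, the candidate identifiers, the evidence scores, and the context weights below are all hypothetical, and the sketch assumes that contexts are listed from closest to farthest and that each context maps candidate identities to a non-negative evidence score.

```python
from typing import Dict, List

# Hypothetical representation: one dict per context, mapping a candidate
# identity to an evidence score. Contexts are ordered closest first.
Evidence = Dict[str, float]


def rank_mixture(contexts: List[Evidence], weights: List[float]) -> List[str]:
    """Rank candidates by a weighted sum of evidence over all contexts."""
    totals: Dict[str, float] = {}
    for evidence, weight in zip(contexts, weights):
        for candidate, score in evidence.items():
            totals[candidate] = totals.get(candidate, 0.0) + weight * score
    # Highest combined score first; ties broken alphabetically.
    return sorted(totals, key=lambda c: (-totals[c], c))


def rank_cutoff(contexts: List[Evidence]) -> List[str]:
    """Rank candidates using only the closest context that has any evidence."""
    for evidence in contexts:
        positive = {c: s for c, s in evidence.items() if s > 0}
        if positive:
            return sorted(positive, key=lambda c: (-positive[c], c))
    return []  # no context offered any evidence


# Illustrative data only (assumed, not taken from the report).
contexts = [
    {},                                        # closest context: no evidence
    {"alice.smith": 2.0, "alice.jones": 1.0},  # next closest context
    {"alice.jones": 5.0},                      # farthest context
]
weights = [0.6, 0.3, 0.1]

mixture_ranking = rank_mixture(contexts, weights)
cutoff_ranking = rank_cutoff(contexts)
```

With this made-up data the two rankings disagree: the mixture ranking places `alice.jones` first because the farthest context still contributes to the total, while the cutoff ranking stops at the second context and places `alice.smith` first.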

Reference: Technical Report LAMP-TR-150, University of Maryland, College Park, March 2008.
