In this paper, the authors present a new algorithm for document summarization. Their prototype uses eye tracking to determine which words in a document a reader focuses on the longest. The algorithm then predicts how long the user would focus on other words based on semantic similarity. Basically, it predicts which sentences the user is most likely to focus on and ranks them in decreasing order, so that the "most important" information for that specific user comes first. What sets this apart from other summarization techniques is that documents are broken down word by word, and those word-level scores are then combined to rank sentences and paragraphs.
Eye tracking sample. This person looks at weird things.
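To make the ranking idea concrete, here is a rough sketch of how I picture it. This is my own guess at the shape of the approach, not the authors' code. The names `similarity`, `predict_attention`, and `rank_sentences` are made up for illustration, the fixation times are invented numbers, and the string-overlap `similarity` is only a toy stand-in for real semantic similarity.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    # Toy stand-in (assumption): plain string overlap, NOT semantic similarity.
    return SequenceMatcher(None, a, b).ratio()


def predict_attention(word, fixations):
    # Words the reader actually looked at keep their measured fixation time.
    if word in fixations:
        return fixations[word]
    # Unseen words get a similarity-weighted average of the measured times.
    weights = {seen: similarity(word, seen) for seen in fixations}
    total = sum(weights.values())
    if total == 0:
        return 0.0
    return sum(weights[seen] * fixations[seen] for seen in fixations) / total


def rank_sentences(sentences, fixations):
    # Score each sentence by the mean predicted attention of its words,
    # then return the sentences in decreasing order of score.
    scored = []
    for sentence in sentences:
        words = [w.strip(".,;:!?\"'()").lower() for w in sentence.split()]
        words = [w for w in words if w]
        if words:
            score = sum(predict_attention(w, fixations) for w in words) / len(words)
        else:
            score = 0.0
        scored.append((score, sentence))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored]


# Invented fixation times in milliseconds (hypothetical data).
fixations = {"eye": 900.0, "tracking": 800.0, "weather": 100.0, "fine": 50.0}
ranked = rank_sentences(["Weather fine.", "Eye tracking."], fixations)
print(ranked)
```

With these made-up numbers, the sentence built from the long-fixation words comes out first. Averaging word scores per sentence is just one simple way to combine them; the post above does not say which combination the paper actually uses.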
In testing, the authors' algorithm consistently beat MS Word AutoSummarize and MEAD (an open source summarization package) in both precision and recall. In the future, they hope to improve the algorithm so that it can summarize documents a user has never read (currently, it only works when the user is going back over something previously read).
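For anyone unfamiliar with the two metrics, here is a minimal sketch of how precision and recall could be computed for an extractive summary. It assumes the comparison is made at the sentence level against a reference summary, which is my assumption and not something the paper summary above confirms. The function name `precision_recall` and the sentence indices are hypothetical.

```python
def precision_recall(selected, reference):
    # selected: sentence indices chosen by the summarizer
    # reference: sentence indices in the reference ("ideal") summary
    selected, reference = set(selected), set(reference)
    overlap = len(selected & reference)
    # Precision: what fraction of the chosen sentences were correct choices.
    precision = overlap / len(selected) if selected else 0.0
    # Recall: what fraction of the reference sentences were found.
    recall = overlap / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical example: 3 sentences picked, 4 in the reference, 2 in common.
p, r = precision_recall([0, 2, 5], [0, 1, 2, 3])
print(p, r)
```

Here precision is 2/3 and recall is 1/2: two of the three picked sentences are in the reference, and two of the four reference sentences were picked.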
Discussion:
I didn't know that automatic summarization technology even existed. It seems to me that the authors are moving in the right direction by making summarization user-oriented instead of generic. Everyone reviews differently and focuses on different things; software that can learn your style would be very valuable. Their future work hints at a summarization tool for things the user hasn't even read before. I could use that on my next blog!