Wednesday, March 17, 2010

Video Object Annotation, Navigation, and Composition

*NOTE: I haven't been keeping track of whose blogs I leave comments on, but I do have comments out there. I'm writing the blogs, reading others, and attending lectures where I can ask questions. Comments seem a little redundant!

Authors:
Dan B. Goldman and David Salesin (Adobe Systems, Inc.); Chris Gonterman, Brian Curless, and Steven M. Seitz (University of Washington)

Summary:
Objects in a video are... objects in a video. Characters, props, cars, animals, and so on. Most video editing software is organized around timelines and frames, even though objects are what people actually care about. Being able to tag an object and have it tracked across frames would greatly speed up the video editing process (no more splicing together stills to get your point across), and that's just what the authors of this paper are working on. They focus on annotation, navigation, and composition of video in an object-centric way. To support these tasks, each video is preprocessed with low-level motion tracking to determine what objects it contains.

The colored dots show computed motion tracks, which are grouped into probable objects
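For the curious, here's roughly what that low-level tracking stage could look like. This is my own sketch using OpenCV's KLT tracker, not the authors' implementation, and it skips the grouping of tracks into objects, which is the genuinely hard part:

```python
import cv2
import numpy as np

def track_features(video_path, max_corners=500):
    """KLT-track feature points; returns {track_id: [(frame, x, y), ...]}."""
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    if not ok:
        return {}
    prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_corners,
                                  qualityLevel=0.01, minDistance=8)
    if pts is None:
        return {}
    ids = np.arange(len(pts))  # stable identity for each track
    tracks = {}
    for i, p in zip(ids, pts):
        x, y = p.ravel()
        tracks[int(i)] = [(0, float(x), float(y))]
    frame_no = 0
    while True:
        ok, frame = cap.read()
        if not ok or len(pts) == 0:
            break
        frame_no += 1
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        new_pts, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
        alive = status.ravel() == 1  # drop points the tracker lost
        pts, ids = new_pts[alive], ids[alive]
        for i, p in zip(ids, pts):
            x, y = p.ravel()
            tracks[int(i)].append((frame_no, float(x), float(y)))
        prev_gray = gray
    cap.release()
    return tracks
```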

Annotation deals with adding graphics (such as text bubbles, highlights, outlines, etc.) to moving objects. Uses include sports broadcasting, surveillance video, and post-production notes for professionals. The five annotation types they implemented were graffiti, scribbles, speech balloons, path arrows, and hyperlinks.
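As a toy illustration of how an annotation might stay glued to its object, the graphic can simply follow the centroid of the tracked points the user scribbled over. This builds on the hypothetical `tracks` structure from the sketch above; the real system presumably also transforms the graphic along with the object's motion:

```python
def annotation_anchor(tracks, anchor_ids, frame_no):
    """Return the centroid of an annotation's anchor tracks in one frame,
    or None if every anchor point has been lost by that frame."""
    pts = [(x, y) for tid in anchor_ids
                  for f, x, y in tracks[tid] if f == frame_no]
    if not pts:
        return None  # object lost: hide the balloon/graffiti for this frame
    xs, ys = zip(*pts)
    return (sum(xs) / len(xs), sum(ys) / len(ys))
```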

For navigation, the new system lets the user select an object and drag the mouse to a new location on the screen. Once the mouse is released, the video jumps to a time when that object is close to the release point, effectively letting the user navigate through the video by moving objects instead of scrubbing a timeline. The system visualizes an object's range of motion by placing a "starburst widget" on it, which uses vectors to indicate the length and direction of motion the object undergoes forward and backward in time.
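Once tracks exist, the core of that lookup is simple. A much-simplified version (mine, not the paper's) just scans the dragged object's track for the frame closest to the release point:

```python
def frame_nearest_release(track, release_xy):
    """Scan one object's track for the frame closest to the mouse-release point.

    track: [(frame, x, y), ...] for the dragged object
    release_xy: (x, y) where the user let go of the mouse
    """
    rx, ry = release_xy
    best_frame, best_d2 = None, float("inf")
    for f, x, y in track:
        d2 = (x - rx) ** 2 + (y - ry) ** 2
        if d2 < best_d2:
            best_frame, best_d2 = f, d2
    return best_frame  # the time to jump the video to
```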

Video-to-still composition is all about splicing together images from the video to create a single still composition. The authors use a drag-and-drop interaction to move a selected object forward or backward through frames until it sits where the user wants it. All other objects in the frame stay frozen in place until they are directly selected and manipulated in turn. In this way, a composite image can be built with each object exactly where the user wants it.

The black car was pulled forward in time to appear right in front of the white one
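Conceptually, the final composite is just per-object pixel pulls from different frames. Here is a bare-bones sketch of my own, assuming you already have a per-frame object mask (which is exactly what the preprocessing has to provide):

```python
import numpy as np

def composite_object(base_frame, source_frame, object_mask):
    """Paste one object's pixels from source_frame onto base_frame.

    base_frame, source_frame: H x W x 3 uint8 images from the same video
    object_mask: H x W boolean array marking the object in source_frame
    """
    mask = np.asarray(object_mask, dtype=bool)
    out = base_frame.copy()
    out[mask] = source_frame[mask]  # copy only the object's pixels
    return out
```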

Discussion:
Awesome stuff... except it takes 5 minutes PER FRAME to preprocess the video, and that's at only 720 x 480 resolution! That's an epic amount of time. If they can speed that up, then they are golden. You should check out the paper for yourself!

3 comments:

  1. That sounds really cool, although I would be a little worried about the system calculating new paths for an object that I moved in a video. I understand that editing frame by frame would take too long, like you were saying. Is there a way I could define an explicit path for an object?

  2. This is really cool. Is there a video?

  3. Another annotation project; I guess it's a theme for this set of papers. Did they give any kind of practical purpose for this besides just video editing? They did a pretty good job picking out the car and being able to move it, though you do still see a little weirdness on the right side from where the angle shifted. Not bad!
