Video Understanding FAQ

What is shot break detection?

Roughly speaking, a shot break is the transition from one camera shot to the next. The exact definition varies from paper to paper in the research literature. Many researchers adopt filmmaking terminology, in which a scene is composed of one or more shots.

Why is this useful?

Shot break detection is useful in a wide variety of real-world applications. Video editors and producers usually work in terms of shots and scenes, not individual frames. Shot break detection helps them not only to find and delete shots within a current movie, but also to locate shots and scenes in previous movies. It also gives the home user more search options. Suppose you have recorded six hours of video from cable. This technology lets you skip over shots with the click of a button.

Why is this important?

Since the information in video databases can be measured in thousands of gigabytes of uncompressed data, video analysis methods that ignore the redundancy in video sequences cannot be considered practical. We therefore integrate the video analysis methods with video compression in order to exploit that redundancy.

What problems are addressed?

Two problems will be addressed: (1) video analysis, which includes segmentation, classification, and recognition, and (2) computational efficiency through integration with video compression methods. For video analysis, a method based on 2D pixel motion flow fields and the Karhunen-Loeve transform is proposed to determine the location of scene cuts and the type of camera movement within the scene. The 2D pixel motion can be estimated with methods such as optical flow and correlation. Scene cuts can be detected by analyzing the displaced frame difference or the smoothness of the pixel motion field. The Karhunen-Loeve transform was chosen because it extracts optimal linear features that describe the pixel motion fields. These optimal linear features are called eigenflow images, since they describe the set of pixel motion flow images that correspond to a particular camera movement. For face and object classification, the Karhunen-Loeve transform has been used successfully to recognize near-frontal views of human faces.
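As a rough illustration of the two ideas in this answer, the following Python sketch shows (a) cut detection by thresholding a frame difference and (b) extraction of the leading Karhunen-Loeve basis vector (the first "eigenflow") from a set of flattened motion fields. It is a minimal sketch under stated assumptions, not the method used in this work: the frame difference is computed with zero displacement (no motion compensation), the threshold value is arbitrary, the leading eigenvector is found by simple power iteration, and all function names are hypothetical.

```python
import math


def frame_difference(prev_frame, curr_frame):
    """Mean absolute pixel difference between two grayscale frames.

    Frames are non-empty lists of rows of intensities. Assumption: zero
    displacement, i.e. a plain frame difference rather than a
    motion-compensated (displaced) one.
    """
    total = 0
    count = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for p, c in zip(prev_row, curr_row):
            total += abs(p - c)
            count += 1
    return total / count


def detect_cuts(frames, threshold=30.0):
    """Return each index i where frames[i] differs strongly from frames[i - 1].

    The threshold is an illustrative value, not one taken from the text.
    """
    cuts = []
    for i in range(1, len(frames)):
        if frame_difference(frames[i - 1], frames[i]) > threshold:
            cuts.append(i)
    return cuts


def leading_eigenflow(flows, iterations=100):
    """Mean flow and leading Karhunen-Loeve basis vector of the flow fields.

    Each flow is a flattened list of motion components (u, v per pixel).
    Power iteration is applied to the scatter matrix of the mean-centered
    flows without forming that matrix explicitly. Assumption: the all-ones
    start vector is not orthogonal to the leading direction.
    """
    n = len(flows)
    dim = len(flows[0])
    mean_flow = [sum(f[k] for f in flows) / n for k in range(dim)]
    centered = [[f[k] - mean_flow[k] for k in range(dim)] for f in flows]

    v = [1.0] * dim
    for _ in range(iterations):
        # Project every centered flow onto v, then recombine: w = X^T (X v).
        proj = [sum(x[k] * v[k] for k in range(dim)) for x in centered]
        w = [sum(proj[i] * centered[i][k] for i in range(n)) for k in range(dim)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break  # all flows are identical; no dominant direction exists
        v = [c / norm for c in w]
    return mean_flow, v
```

In practice, the flow fields would come from an optical flow or correlation step, and one set of eigenflows would be computed per camera movement (pan, tilt, zoom, and so on). A new flow field could then be classified by how well each set reconstructs it.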
