This webpage contains additional material (videos) for the article:

G.D. Evangelidis and C. Bauckhage, "Efficient Subframe Video Alignment using Short Descriptors", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 10, pp. 2371-2386, Oct. 2013 (pdf)

Video results for:
Moving Cameras
Backroad Sequence (moving cameras)
Campus Sequence (moving cameras)
Highway Sequence (moving cameras)
Pedzone-2 Sequence (moving cameras, appearance variation)
Suburb Sequence (moving cameras, different weather conditions)
Static Cameras (non-rigid scene motion)
Wind Sequence (static cameras, non-rigid scene motion)
Water Sequence (static cameras, non-rigid scene motion)
Inria Sequence (static cameras, non-rigid scene motion, appearance variation)


Moving Cameras

Backroad Sequence

[Videos] Reference sequence | Input sequence

Synchronization:

In these videos, the top frame is the input (query) frame and the bottom frame is the synchronized reference frame obtained by each method.

[Videos] Proposed Quad-Tree | Proposed Quad-VD | Proposed Quad-VD-mDP | SIFT-BoW [1] | MAP-Inference [2]

Spatio-temporal alignment:

Fusion videos were created by combining the R and B channels of the input sequence with the G channel of the reference sequence, the latter warped in space and time according to the outcome of each algorithm. In this way, changes between the two sequences appear as lawn-green and hot-pink colors.
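
As a minimal illustration of this channel mixing, consider the Python/NumPy sketch below. It assumes 8-bit BGR frames of equal size and that the reference frame has already been warped in space and time by the algorithm under test; the function name is illustrative and not taken from any released code.

import numpy as np

def fuse_frames(input_bgr, warped_ref_bgr):
    # Keep the R and B channels of the input frame and take the G channel
    # from the (already warped) reference frame. Wherever the two sequences
    # disagree, the fused frame shows green or pink tints.
    assert input_bgr.shape == warped_ref_bgr.shape
    fused = input_bgr.copy()
    fused[..., 1] = warped_ref_bgr[..., 1]  # channel order is B, G, R
    return fused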

[Videos] Proposed Quad-Tree | SIFT-BoW + SIFT-flow [1],[3] | MAP-Inference + Lucas-Kanade [2] | Caspi-Irani [4]

Campus Sequence

[Videos] Reference sequence | Input sequence

Synchronization:

In these videos, the top frame is the input (query) frame and the bottom frame is the synchronized reference frame obtained by each method.

[Videos] Proposed Quad-Tree | Proposed Quad-VD | Proposed Quad-VD-mDP | SIFT-BoW [1] | MAP-Inference [2]

Spatio-temporal alignment:

Fusion videos were created by combining the R and B channels of the input sequence with the G channel of the reference sequence, the latter warped in space and time according to the outcome of each algorithm. In this way, changes between the two sequences appear as lawn-green and hot-pink colors.

[Videos] Proposed Quad-Tree | SIFT-BoW + SIFT-flow [1],[3] | MAP-Inference + Lucas-Kanade [2] | Caspi-Irani [4]

Highway Sequence

[Videos] Reference sequence | Input sequence

Synchronization:

In these videos, the top frame is the input (query) frame and the bottom frame is the synchronized reference frame obtained by each method.

[Videos] Proposed Quad-Tree | Proposed Quad-VD | Proposed Quad-VD-mDP | SIFT-BoW [1] | MAP-Inference [2]

Spatio-temporal alignment:

Fusion videos were created by combining the R and B channels of the input sequence with the G channel of the reference sequence, the latter warped in space and time according to the outcome of each algorithm. In this way, changes between the two sequences appear as lawn-green and hot-pink colors.

[Videos] Proposed Quad-Tree | SIFT-BoW + SIFT-flow [1],[3] | MAP-Inference + Lucas-Kanade [2] | Caspi-Irani [4]

Pedzone-2 Sequence (with appearance variation)

[Videos] Reference sequence | Input sequence

Spatio-temporal alignment:

Fusion videos were created by combining the R and B channels of the input sequence with the G channel of the reference sequence, the latter warped in space and time according to the outcome of each algorithm. In this way, changes between the two sequences appear as lawn-green and hot-pink colors.

[Videos] Proposed Quad-Tree | SIFT-BoW + SIFT-flow [1],[3] | Caspi-Irani [4]

Suburb Sequence (different weather conditions)

[Videos] Reference sequence | Input sequence

Spatio-temporal alignment:

Fusion videos were created by combining the R and B channels of the input sequence with the G channel of the reference sequence, the latter warped in space and time according to the outcome of each algorithm. In this way, changes between the two sequences appear as lawn-green and hot-pink colors.

[Videos] Proposed Quad-Tree | SIFT-BoW + SIFT-flow [1],[3] | Caspi-Irani [4]

Static Cameras (non-rigid scene motion)

In this comparison, the sequences were first synchronized to frame accuracy using the proposed Quad-Tree scheme followed by RANSAC-based line fitting of the frame correspondences. Then, starting from this common temporal initialization, we compare the proposed spatio-temporal ECC scheme with the algorithm of Caspi and Irani [4]. Note that when an algorithm uses 1 frame, a frame-to-subframe alignment scheme is established (which is still spatio-temporal). Uncompressed videos are provided for the cases where the difference in performance may not be obvious.
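
As a rough sketch of the frame-accurate synchronization step: candidate frame correspondences (t_in, t_ref) coming from descriptor matching can be fitted with a line t_ref = a*t_in + b using RANSAC. The Python/NumPy snippet below is a generic RANSAC line fit under that assumption; the function name, threshold, and iteration count are illustrative and not taken from the paper's implementation.

import numpy as np

def ransac_temporal_fit(pairs, n_iters=500, thresh=1.0, seed=0):
    # pairs: (t_in, t_ref) candidate frame correspondences, one row per pair.
    # Returns (a, b) of the temporal mapping t_ref = a * t_in + b.
    pairs = np.asarray(pairs, dtype=float)
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(pairs), dtype=bool)
    for _ in range(n_iters):
        i, j = rng.choice(len(pairs), size=2, replace=False)
        (x1, y1), (x2, y2) = pairs[i], pairs[j]
        if x1 == x2:
            continue  # degenerate sample, cannot define a line
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = np.abs(pairs[:, 1] - (a * pairs[:, 0] + b)) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Least-squares refinement on the largest inlier set found.
    a, b = np.polyfit(pairs[best_inliers, 0], pairs[best_inliers, 1], 1)
    return a, b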

Wind Sequence

Video 1: video_wind_1
Video 2: video_wind_2

Spatio-temporal alignment using subsequences

Proposed, 1 frame: video_wind_proposed_1
Proposed, 3 frames: video_wind_proposed_3
Proposed, 9 frames: video_wind_proposed_9
Caspi-Irani, 1 frame: video_wind_caspi_1
Caspi-Irani, 3 frames: video_wind_caspi_3
Caspi-Irani, 9 frames: video_wind_caspi_9
Uncompressed video files:
Proposed (9 frames)
Caspi-Irani (9 frames)

Water Sequence

Video 1: video_water_1
Video 2: video_water_2

Spatio-temporal alignment using subsequences

Proposed, 1 frame: video_water_proposed_1
Proposed, 3 frames: video_water_proposed_3
Proposed, 9 frames: video_water_proposed_9
Caspi-Irani, 1 frame: video_water_caspi_1
Caspi-Irani, 3 frames: video_water_caspi_3
Caspi-Irani, 9 frames: video_water_caspi_9
Uncompressed video files:
Proposed (9 frames) (70 MB)
Caspi-Irani (9 frames) (70 MB)
Proposed (3 frames) (70 MB)
Caspi-Irani (3 frames) (70 MB)

Inria Sequence (with appearance variation)

Video 1: video_inria_1
Video 2: video_inria_2

Spatio-temporal alignment using subsequences

Proposed, 1 frame: video_inria_proposed_1
Proposed, 3 frames: video_inria_proposed_3
Proposed, 9 frames: video_inria_proposed_9
Caspi-Irani, 1 frame: video_inria_caspi_1
Caspi-Irani, 3 frames: video_inria_caspi_3
Caspi-Irani, 9 frames: video_inria_caspi_9

Uncompressed video (avi) files:
Proposed (9 frames) (60 MB)
Caspi-Irani (9 frames) (60 MB)


[1] J. Sivic and A. Zisserman, "Efficient Visual Search of Videos Cast as Text Retrieval", IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 31, no. 4, pp. 591-606, 2009.
[2] F. Diego, D. Ponsa, J. Serrat and A. López, "Video Alignment for Change Detection", IEEE Trans. on Image Processing (preprint), 2010.
[3] C. Liu, J. Yuen, A. Torralba, J. Sivic and W. T. Freeman, "SIFT Flow: Dense Correspondence across Different Scenes", in Proc. ECCV, 2008.
[4] Y. Caspi and M. Irani, "Spatio-temporal Alignment of Sequences", IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 24, no. 11, pp. 1409-1424, 2002.