Video Signal Processing Research Papers

Enhancing the Error Detection Capabilities of DCT Based Codecs using Compressed Domain Dissimilarity Metrics

2025, EUROCON 2007 - The International Conference on "Computer as a Tool"

Video compression standards are implemented in wireless data transmission technologies to provide multimedia services efficiently. These compression standards generally utilize the Discrete Cosine Transform (DCT) in conjunction with... more

Enhancing the Error Detection Capabilities of the Standard Video Decoder using Pixel Domain Dissimilarity Metrics

by Carl Debono

2025, EUROCON 2007 - The International Conference on "Computer as a Tool"

The video compression standards commonly adopted in wireless multimedia services utilize variable length codes (VLC) in order to attain high compression ratios. While providing the high data rates required, this technique makes the system... more

Real-Time Whiteboard Capture and Processing Using a Video Camera for Remote Collaboration

by Zhengyou Zhang

2025, IEEE Transactions on Multimedia

This paper describes our recently developed system which captures pen strokes on whiteboards in real time using an off-the-shelf video camera. Unlike many existing tools, our system does not instrument the pens or the whiteboard. It... more

Intramodal and intermodal fusion for audio-visual biometric authentication

by Sun-yuan Kung

2025, Proceedings of 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, 2004.

This paper proposes a multiple-source multiple-sample fusion approach to identity verification. Fusion is performed at two levels: intramodal and intermodal. In intramodal fusion, the scores of multiple samples (e.g. utterances or video... more

Improvement of Exemplar Based Inpainting by Enhancement of Patch Prior

by Koustav Dutta and

2025, 2023 IEEE 3rd International Conference on Applied Electromagnetics, Signal Processing, & Communication (AESPC)

Image inpainting is the technique of filling in the missing details of a photo. The purpose of inpainting is to reconstruct lost areas realistically, in a way that looks natural to the human eye. We present a novel algorithm... more

Automated fish cage net inspection using image processing techniques

by Michalis Zervakis

2025, IET Image Processing

Fish-cage dysfunction in aquaculture installations can trigger significant negative consequences affecting operational costs. Low oxygen levels, due to excessive fouling, lead to decreased growth performance and feed efficiency.... more

Low-Complexity Rate Control for Efficient H.263 to H.264/AVC Video Transcoding

by Oscar Au

2025, 2006 International Conference on Image Processing

Rate control is a complicated problem in the H.264/AVC coding standard; existing rate control schemes usually need extra computation to estimate the complexity of frames or macroblocks (MBs). However, during transcoding,... more

Seventh-order elliptic video filter with 0.1dB pass band ripple employing CMOS CDTAs

by Hakan Kuntman

2025, AEU - International Journal of Electronics and Communications

In this paper, a CMOS realization of the current differencing transconductance amplifier (CDTA) is given, which is a newly reported active building block for current-mode signal processing. Current differencing stage of the CDTA element... more

Content-based decomposition of gesture videos

by Dimitrios Kosmopoulos

2025, IEEE Workshop on Signal Processing Systems Design and Implementation, 2005.

In this paper we present a novel method for gesture video decomposition based on the depicted content. From the initial content the key-frames are extracted and the neighboring frames are assigned to key-frames of similar content. The... more

What contextual and demographic factors predict drivers’ decision to engage in secondary tasks?

by Judith Charlton

2025, IET Intelligent Transport Systems

What contextual and demographic factors predict drivers' decision to engage in secondary tasks? IET Intelligent Transport Systems, 13(8), pp. 1218-1223.

Implementation of deep learning models in FPGA development board for recognition accuracy enhancement

by beei iaes

2024, Bulletin of Electrical Engineering and Informatics

Deep learning (DL) model performance is intricately tied to the quality of training, influenced by several parameters. Of these, the computing unit employed significantly impacts training efficiency. Traditional setups use central... more

Table 5. State-of-the-art model based on FPGA environments (Model 2)

Table 6. Epoch-wise results (MSE and computation time) for the second (state-of-the-art) proposed model

4.3. Model 2

With the aim of reducing prediction errors and minimizing processing time, Model 2 has been constructed with three convolutional layers. While the structure of this model is similar to that of the CPU (CNN) model, the detailed configuration of Model 2 is provided in Table 5.

Figure 1. PYNQ-Z2 FPGA development board structure

... FPGAs through the PYNQ interface, unlocking the potential for high-performance inference and acceleration [24].

Figure 3. PuTTY software configuration page

Table 3. The first proposed model for FPGA-based modulation classification detection

Table 4. Epoch-wise results (MSE and computation time) for the first proposed model

After the database file is uploaded into the notebook, all data rows are read and stored in separate arrays. The same procedure as in the simulation stage is used, except that the rows, whose dimensions are (50x50) points, are resampled to match the DL model, and a new dimension of the form (1x50x50) is created. Table 3 shows the model structure. The model is trained using the Adam algorithm for 20 epochs with a batch size of 20 samples. Training minimizes the error in the detection results, which are given in Table 4.

Table 7. Configuration of second DL model [29]

Table 1. Comparison of different FPGA models using different networks

An audio-visual saliency model for movie summarization

by Petros Maragos

2024, 2007 IEEE 9th International Workshop on Multimedia Signal Processing, MMSP 2007 - Proceedings

A saliency-based method for generating video summaries is presented, which exploits coupled audiovisual information from both media streams. Efficient and advanced speech and image processing algorithms to detect key frames that are... more

Comparative study of background subtraction algorithms

by Yannick Benezeth

2024, Journal of Electronic Imaging

In this paper, we present a comparative study of several state-of-the-art background subtraction methods. Approaches ranging from simple background subtraction with global thresholding to more sophisticated statistical methods have been... more
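
The simplest approach named in this abstract, background subtraction with a global threshold, can be sketched roughly as follows. This is only an illustration and not the authors' code: the function name `subtract_background`, the threshold of 30, and the tiny grayscale frames are all assumptions made for the example.

```python
def subtract_background(frame, background, threshold=30):
    """Return a binary foreground mask for a grayscale frame.

    A pixel is marked foreground (1) when its absolute difference from
    the static background model exceeds one global threshold.
    The default threshold of 30 is an assumed, illustrative value.
    """
    return [
        [1 if abs(pixel - bg_pixel) > threshold else 0
         for pixel, bg_pixel in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Hypothetical 2x3 grayscale images (intensity values 0-255).
background = [[10, 10, 10],
              [10, 10, 10]]
frame = [[12, 200, 10],
         [10, 180, 9]]

mask = subtract_background(frame, background)
print(mask)  # [[0, 1, 0], [0, 1, 0]]
```

A single global threshold is cheap but sensitive to lighting changes and camera noise, which is presumably why such a study also covers statistical background models.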

Intramodal and intermodal fusion for audio-visual biometric authentication

by Man-wai Mak

2024, Proceedings of 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, 2004.

This paper proposes a multiple-source multiple-sample fusion approach to identity verification. Fusion is performed at two levels: intramodal and intermodal. In intramodal fusion, the scores of multiple samples (e.g. utterances or video... more

Estimation of subjective quality for mixed-resolution stereoscopic video

by Moncef Gabbouj

2024

In mixed-resolution (MR) stereoscopic video, one view is presented with a lower resolution compared with the other one; therefore, a lower bitrate, a reduced computational complexity, and a decrease in memory access bandwidth can be... more

A scalable and efficient convolutional neural network accelerator using HLS for a System on Chip design

by Kim Bjerge

2024, arXiv (Cornell University)

This paper presents a configurable Convolutional Neural Network Accelerator (CNNA) for a System on Chip (SoC) design. The goal was to accelerate inference of different deep learning networks on an embedded SoC platform. The presented CNNA... more

Video-based estimation of building occupancy during emergency egress

by Andrzej Banaszuk

2024, 2008 American Control Conference

Providing real-time estimates of building occupancy to first responders during emergency events can help in search and rescue, and egress management. This paper addresses the estimation of occupancy in each zone of a building, where the... more

Fast eigenspace decomposition of correlated images

by Anthony Maciejewski

2024, IEEE Transactions on Image Processing

We present a computationally efficient algorithm for the eigenspace decomposition of correlated images. Our approach is motivated by the fact that for a planar rotation of a two-dimensional image, analytical expressions can be given for... more
