Table of Contents: MultiMedia Modeling

MultiMedia Modeling [E-Book] : 28th International Conference, MMM 2022, Phu Quoc, Vietnam, June 6-10, 2022, Proceedings, Part I / edited by Björn Þór Jónsson, Cathal Gurrin, Minh-Triet Tran, Duc-Tien Dang-Nguyen, Anita Min-Chun Hu, Binh Huynh Thi Thanh, Benoit Huet.

The two-volume set LNCS 13141 and LNCS 13142 constitutes the proceedings of the 28th International Conference on MultiMedia Modeling, MMM 2022, which took place in Phu Quoc, Vietnam, during June 6-10, 2022. The 107 papers presented in these proceedings were carefully reviewed and selected from a tot...

Personal Name(s):	Dang-Nguyen, Duc-Tien, editor
	Gurrin, Cathal, editor / Hu, Anita Min-Chun, editor / Huet, Benoit, editor / Huynh Thi Thanh, Binh, editor / Tran, Minh-Triet, editor / Þór Jónsson, Björn, editor
Edition:	1st edition 2022.
Imprint:	Cham : Springer, 2022
Physical Description:	XXVI, 641 pages 188 illustrations, 179 illustrations in color (online resource)
Note:	English
ISBN:	9783030983581
DOI:	10.1007/978-3-030-98358-1
Series Title:	Lecture Notes in Computer Science ; 13141
Subject (LOC):	Computer engineering. Computer networks. Computer vision. Education-Data processing. Multimedia systems. Pattern recognition systems.

BEST PAPER SESSION
Real-time detection of tiny objects based on a weighted bi-directional FPN
Multi-Modal Fusion Network for Rumor Detection with Texts and Images
PF-VTON: Toward High-Quality Parser-Free Virtual Try-On Network
MF-GAN: Multi-conditional fusion Generative Adversarial Network for Text-to-Image Synthesis
APPLICATIONS 1
Learning to classify weather conditions from single images without labels
Learning Image Representation via Attribute-aware Attention Networks for Fashion Classification
Toward Detail-Oriented Image-Based Virtual Try-On with Arbitrary Poses
Parallel DBSCAN-Martingale estimation of the number of concepts for automatic satellite image clustering
MULTIMEDIA APPLICATIONS - PERSPECTIVES, TOOLS & APPLICATIONS (Special Session) & BRAVE NEW IDEAS
AI for the Media Industry: Application Potential and Automation Level
Color the Word: Leveraging Web Images for Machine Translation of Untranslatable Words
ACTIVITIES & EVENTS
MGMP: Multimodal Graph Message Propagation Network for Event Detection
Pose-Enhanced Relation Feature for Action Recognition in Still Images
Prostate Segmentation of Ultrasound Images based on Interpretable-guided Mathematical Model
Spatiotemporal Perturbation Based Dynamic Consistency for Semi-Supervised Temporal Action Detection
MULTIMEDIA DATASETS FOR REPEATABLE EXPERIMENTATION (Special Session)
A Task Category Space for User-Centric Comparative Multimedia Search Evaluations
GPR1200: A Benchmark for General-Purpose Content-Based Image Retrieval
LLQA - Lifelog Question Answering Dataset
LEARNING
Category-sensitive Incremental Learning For Image-based 3D Shape Reconstruction
AdaConfigure: Reinforcement Learning-based Adaptive Configuration for Video Analytics Services
Mining Minority-class Examples With Uncertainty Estimates
Conditional Context-aware Feature Alignment for Domain Adaptive Detection Transformer
MULTIMEDIA for MEDICAL APPLICATIONS (Special Session)
Human activity recognition with IMU and vital signs feature fusion
On Identifying Pareidolia Phenomenon by Emulating Patient Behavior
Using Explainable AI to Identify Differences between Clinical and Experimental Pain Detection Models Based on Facial Expressions
APPLICATIONS 2
Double Granularity Relation Network with Self-Criticism for Occluded Person Re-Identification
A Complementary Fusion Strategy for RGB-D Face Recognition
Multi-scale Cross-modal Transformer Network for RGB-D Object Detection
Joint Re-Detection and Re-Identification for Multi-Object Tracking
MULTIMEDIA ANALYTICS for CONTEXTUAL HUMAN UNDERSTANDING (Special Session)
An Investigation into Keystroke Dynamics and Heart Rate Variability as Indicators of Stress
Fall detection using multimodal data
Prediction of Blood Glucose using Contextual LifeLog Data
Multimodal Embedding for Lifelog Retrieval
APPLICATIONS 3
A Multiple Positives Enhanced NCE Loss for Image-Text Retrieval
SAM: Self Attention Mechanism for Scene Text Recognition based on Swin Transformer
JVCSR: Video Compressive Sensing Reconstruction with Joint In-loop Reference Enhancement and Out-loop Super-resolution
Point Cloud Upsampling via a Coarse-to-fine Network
IMAGE ANALYTICS
Arbitrary Style Transfer With Adaptive Channel Network
Fast Single Image Dehazing Using Morphological Reconstruction and Saturation Compensation
One-Stage Image Inpainting with Hybrid Attention
Real-time FPGA Design for OMP Targeting 8K Image Reconstruction
SPEECH & MUSIC
Time-Frequency Attention For Speech Emotion Recognition With Squeeze-and-Excitation Blocks
Speech Intelligibility Enhancement by Non-Parallel Speech Style Conversion Using CWT and iMetricGAN Based CycleGAN
A-Muze-Net: Music Generation by Composing the Harmony based on the Generated Melody
MULTIMODAL ANALYTICS
Bi-attention modal separation network for multimodal video fusion
Combining Knowledge and Multi-modal Fusion for Meme Classification
Non-Uniform Attention Network for Multi-modal Sentiment Analysis
Multimodal Unsupervised Image-to-Image Translation Without Independent Style Encoder