Table of Contents: Speech and Computer

Speech and Computer [E-Book] : 25th International Conference, SPECOM 2023, Dharwad, India, November 29 - December 2, 2023, Proceedings, Part I / edited by Alexey Karpov, K. Samudravijaya, K. T. Deepak, Rajesh M. Hegde, Shyam S. Agrawal, S. R. Mahadeva Prasanna.

The two-volume proceedings set LNAI 14338 and 14339 constitutes the refereed proceedings of the 25th International Conference on Speech and Computer, SPECOM 2023, held in Dharwad, India, during November 29-December 2, 2023. The 94 papers included in these proceedings were carefully reviewed and sele...

Personal Name(s):	Agrawal, Shyam S., editor
	Deepak, K. T., editor / Hegde, Rajesh M., editor / Karpov, Alexey, editor / Prasanna, S. R. Mahadeva, editor / Samudravijaya, K., editor
Edition:	1st edition 2023.
Imprint:	Cham : Springer, 2023
Physical Description:	XXV, 642 pages, 226 illustrations, 158 illustrations in color (online resource)
Note:	English
ISBN:	9783031483097
DOI:	10.1007/978-3-031-48309-7
Series Title:	Lecture Notes in Artificial Intelligence ; 14338 Lecture Notes in Computer Science
Subject (LOC):	Application software. Artificial intelligence. Computer engineering. Computer networks. Computer vision. Image processing -- Digital techniques.

Automatic Speech Recognition
Extreme Learning Layer: A Boost for Spoken Digit Recognition with Spiking Neural Networks
EMO-AVSR: Two-Level Approach for Audio-Visual Emotional Speech Recognition
Significance of Audio Quality in Speech-to-Text Translation Systems
Everyday Conversations: a Comparative Study of Expert Transcriptions and ASR Outputs at a Lexical Level
Improving Automatic Speech Recognition with Dialect-Specific Language Models
Emotional speech recognition of Holocaust survivors with deep neural network models for Russian language
Computational Paralinguistics
Aggregation Strategies of Wav2vec 2.0 Embeddings for Computational Paralinguistic Tasks
Rhythm Formant Analysis for Automatic Depression Classification
Determining Alcohol Intoxication Based on Speech and Neural Networks
Linear Frequency Residual Cepstral Coefficients for Speech Emotion Recognition
Enhancing Stutter Detection in Speech using Zero Time Windowing Cepstral Coefficients and Phase Information
Source and System-based Modulation Approach for Fake Speech Detection
Digital Signal Processing
Investigation of Different Calibration Methods for Deep Speaker Embedding based Verification Systems
Learning to Predict Speech Intelligibility from Speech Distortions
Sparse Representation Frameworks for Acoustic Scene Classification
Driver Speech Detection in Real Driving Scenario
Regularization based Incremental Learning in TCNN for Robust Speech Enhancement Targeting Effective Human Machine Interaction
Candidate Speech Extraction from Multi-Speaker Single-Channel Audio Interviews
Post-Processing of Translated Speech by Pole Modification and Residual Enhancement to Improve Perceptual Quality
Region Normalized Capsule Network based Generative Adversarial Network for Non-Parallel Voice Conversion
Speech Enhancement using LinkNet Architecture
ATT: Adversarial Trained Transformer for Speech Enhancement
Human Identification by Dynamics of Changes in Brain Frequencies Using Artificial Neural Networks
Speech Prosody
Analysis of Formant Trajectories of a Speech Signal for the Purpose of Forensic Identification of a Foreign Speaker
Gestures vs. Prosodic Structure in Laboratory Ironic Speech
Sounds of <sil>ence: Acoustics of Inhalation in Read Speech
Prolongations as Hesitation Phenomena in Spoken Speech in First and Second Language
Study of Indian English Pronunciation Variabilities Relative to Received Pronunciation
Multimodal Collaboration in Expository Discourse: Verbal and Nonverbal Moves Alignment
Association of Time Domain Features with Oral Cavity Configuration during Vowel Production and its Application in Vowel Recognition
Prosodic Interaction Models in a Conversation
Natural Language Processing
Development and Research of Dialogue Agents with Long-Term Memory and Web Search
Pre- and Post-Textual Contexts in Assessment of a Message as Offensive or Defensive Aggression Verbalization
Boosting Rule-based Grapheme-to-Phoneme Conversion with Morphological Segmentation and Syllabification in Bengali
Revisiting Assessment of Text Complexity: Lexical and Syntactic Parameters Fluctuations
Analysis of Natural Language Understanding Systems with L2 Learner Specific Synthetic Grammatical Errors based on Parts-of-Speech
On the Most Frequent Sequences of Words in Russian Spoken Everyday Language (Bigrams and Trigrams): An Experience of Classification
Child Speech Processing
Recognition of the Emotional State of Children by Video and Audio Modalities by Indian and Russian Experts
Effect of Linear Prediction Order to Modify Formant Locations for Children Speech Recognition
Gammatone-Filterbank based Pitch-Normalized Cepstral Coefficients for Zero-Resource Children's ASR
System Assisted Vocal Response Analysis and Assessment of Autism in Children: A Machine Learning Based Approach
Addressing Effects of Formant Dispersion and Pitch Sensitivity for the Development of Children's KWS System
Development of Children's KWS System Perceptual Experiment and Automatic Recognition by Video, Audio and Text Modalities
Linear Frequency Residual Features for Infant Cry Classification
Speech Processing for Medicine
Identification of Voice Disorders: A Comparative Study of Machine Learning Algorithms
Transfer Learning using Whisper for Dysarthric Automatic Speech Recognition
Significance of Duration Modification in Reducing Listening Effort of Slurred Speech from Patients with Traumatic Brain Injury
Respiratory Sickness Detection from Audio Recordings using CLIP Models
Investigating the Effect of Data Impurity on the Detection Performances of Mental Disorders through Spoken Dialogues