Speech and Computer [E-Book] : 25th International Conference, SPECOM 2023, Dharwad, India, November 29 - December 2, 2023, Proceedings, Part I / edited by Alexey Karpov, K. Samudravijaya, K. T. Deepak, Rajesh M. Hegde, Shyam S. Agrawal, S. R. Mahadeva Prasanna.
The two-volume proceedings set LNAI 14338 and 14339 constitutes the refereed proceedings of the 25th International Conference on Speech and Computer, SPECOM 2023, held in Dharwad, India, during November 29-December 2, 2023. The 94 papers included in these proceedings were carefully reviewed and sele...
Personal Name(s): | Agrawal, Shyam S., editor |
Deepak, K. T., editor / Hegde, Rajesh M., editor / Karpov, Alexey, editor / Prasanna, S. R. Mahadeva, editor / Samudravijaya, K., editor | |
Edition: |
1st edition 2023. |
Imprint: |
Cham :
Springer,
2023
|
Physical Description: |
XXV, 642 pages, 226 illustrations, 158 illustrations in color (online resource) |
Note: |
English |
ISBN: |
9783031483097 |
DOI: |
10.1007/978-3-031-48309-7 |
Series Title: |
Lecture Notes in Artificial Intelligence ; 14338
Lecture Notes in Computer Science |
Subject (LOC): |
- Automatic Speech Recognition
- Extreme Learning Layer: A Boost for Spoken Digit Recognition with Spiking Neural Networks
- EMO-AVSR: Two-Level Approach for Audio-Visual Emotional Speech Recognition
- Significance of Audio Quality in Speech-to-Text Translation Systems
- Everyday Conversations: a Comparative Study of Expert Transcriptions and ASR Outputs at a Lexical Level
- Improving Automatic Speech Recognition with Dialect-Specific Language Models
- Emotional speech recognition of Holocaust survivors with deep neural network models for Russian language
- Computational Paralinguistics
- Aggregation Strategies of Wav2vec 2.0 Embeddings for Computational Paralinguistic Tasks
- Rhythm Formant Analysis for Automatic Depression Classification
- Determining Alcohol Intoxication Based on Speech and Neural Networks
- Linear Frequency Residual Cepstral Coefficients for Speech Emotion Recognition
- Enhancing Stutter Detection in Speech using Zero Time Windowing Cepstral Coefficients and Phase Information
- Source and System-based Modulation Approach for Fake Speech Detection
- Digital Signal Processing
- Investigation of Different Calibration Methods for Deep Speaker Embedding based Verification Systems
- Learning to Predict Speech Intelligibility from Speech Distortions
- Sparse Representation Frameworks for Acoustic Scene Classification
- Driver Speech Detection in Real Driving Scenario
- Regularization based Incremental Learning in TCNN for Robust Speech Enhancement Targeting Effective Human Machine Interaction
- Candidate Speech Extraction from Multi-Speaker Single-Channel Audio Interviews
- Post-Processing of Translated Speech by Pole Modification and Residual Enhancement to Improve Perceptual Quality
- Region Normalized Capsule Network based Generative Adversarial Network for Non-Parallel Voice Conversion
- Speech Enhancement using LinkNet Architecture
- ATT: Adversarial Trained Transformer for Speech Enhancement
- Human Identification by Dynamics of Changes in Brain Frequencies Using Artificial Neural Networks
- Speech Prosody
- Analysis of Formant Trajectories of a Speech Signal for the Purpose of Forensic Identification of a Foreign Speaker
- Gestures vs. Prosodic Structure in Laboratory Ironic Speech
- Sounds of <sil>ence: Acoustics of Inhalation in Read Speech
- Prolongations as Hesitation Phenomena in Spoken Speech in First and Second Language
- Study of Indian English Pronunciation Variabilities Relative to Received Pronunciation
- Multimodal Collaboration in Expository Discourse: Verbal and Nonverbal Moves Alignment
- Association of Time Domain Features with Oral Cavity Configuration during Vowel Production and its Application in Vowel Recognition
- Prosodic Interaction Models in a Conversation
- Natural Language Processing
- Development and Research of Dialogue Agents with Long-Term Memory and Web Search
- Pre- and Post-Textual Contexts in Assessment of a Message as Offensive or Defensive Aggression Verbalization
- Boosting Rule-based Grapheme-to-Phoneme Conversion with Morphological Segmentation and Syllabification in Bengali
- Revisiting Assessment of Text Complexity: Lexical and Syntactic Parameters Fluctuations
- Analysis of Natural Language Understanding Systems with L2 Learner Specific Synthetic Grammatical Errors based on Parts-of-Speech
- On the Most Frequent Sequences of Words in Russian Spoken Everyday Language (Bigrams and Trigrams): An Experience of Classification
- Child Speech Processing
- Recognition of the Emotional State of Children by Video and Audio Modalities by Indian and Russian Experts
- Effect of Linear Prediction Order to Modify Formant Locations for Children Speech Recognition
- Gammatone-Filterbank based Pitch-Normalized Cepstral Coefficients for Zero-Resource Children's ASR
- System Assisted Vocal Response Analysis and Assessment of Autism in Children: A Machine Learning Based Approach
- Addressing Effects of Formant Dispersion and Pitch Sensitivity for the Development of Children's KWS System
- Development of Children's KWS System Perceptual Experiment and Automatic Recognition by Video, Audio and Text Modalities
- Linear Frequency Residual Features for Infant Cry Classification
- Speech Processing for Medicine
- Identification of Voice Disorders: A Comparative Study of Machine Learning Algorithms
- Transfer Learning using Whisper for Dysarthric Automatic Speech Recognition
- Significance of Duration Modification in Reducing Listening Effort of Slurred Speech from Patients with Traumatic Brain Injury
- Respiratory Sickness Detection from Audio Recordings using CLIP Models
- Investigating the Effect of Data Impurity on the Detection Performances of Mental Disorders through Spoken Dialogues.