Speech and Computer [E-Book] : 23rd International Conference, SPECOM 2021, St. Petersburg, Russia, September 27-30, 2021, Proceedings / edited by Alexey Karpov, Rodmonga Potapova.
This book constitutes the proceedings of the 23rd International Conference on Speech and Computer, SPECOM 2021, held in St. Petersburg, Russia, in September 2021. The 74 papers presented were carefully reviewed and selected from 163 submissions. The papers present current research in the area of co...
Personal Name(s): | Karpov, Alexey, editor |
Potapova, Rodmonga, editor | |
Edition: |
1st edition 2021. |
Imprint: |
Cham : Springer, 2021 |
Physical Description: |
XVII, 839 pages, 204 illustrations, 129 illustrations in color (online resource) |
Note: |
English |
ISBN: |
9783030878023 |
DOI: |
10.1007/978-3-030-87802-3 |
Series Title: |
Lecture Notes in Artificial Intelligence ; 12997
Lecture Notes in Computer Science |
Contents: |
- Text-Independent Speaker Verification Employing CNN-LSTM-TDNN Hybrid Networks
- End-to-End Voice Spoofing Detection Employing Time Delay Neural Networks and Higher Order Statistics
- Assessing Velar Gestures Timing in European Portuguese Nasal Vowels with RT-MRI Data
- Designing and Deploying an Interaction Modality for Articulatory-Based Audiovisual Speech Synthesis
- Kurdish Spoken Dialect Recognition Using X-vector Speaker Embedding
- An ASR-based Tutor for Learning to Read: How to Optimize Feedback to First Graders
- Velocity Differences Between Velum Raising and Lowering Movements
- Pragmatic Markers of Russian Everyday Speech: Invariants in Dialogue and Monologue
- Language Adaptation for Speaker Recognition Systems using Contrastive Learning
- Evaluating X-vector-based Speaker Anonymization Under White-box Assessment
- Improved Prosodic Clustering for Multispeaker and Speaker-Independent Phoneme-Level Prosody Control
- Initial Experiments on Question Answering from the Intrinsic Structure of Oral History Archives
- Imagined, Intended, and Spoken Speech Envelope Synthesis from Neuromagnetic Signals
- What Causes Phonetic Reduction in Russian Speech: New Evidence from Machine Learning Algorithms
- Toxic Comment Classification Service in Social Network
- Deep Learning based Engagement Recognition in Highly Imbalanced Data
- Intraspeaker Variability of a Professional Lecturer: Ageing, Genre, Pragmatics vs. Voice Acting (Case Study)
- An Ensemble Approach for the Diagnosis of COVID-19 from Speech and Cough Sounds
- Where are We in Semantic Concept Extraction for Spoken Language Understanding?
- Learning Mizo Tones from F0 Contours using 1D-CNN
- OCR Improvements for Images of Multi-Page Historical Documents
- X-Bridge: Image-to-Image Translation with Reconstruction Capabilities
- Who is Selling to Whom - Feature Evaluation for Multi-block Classification in Invoice Information Extraction
- Multimodal Corpus Analysis of Autoblog 2020: Lecture Videos in Machine Learning
- Text and Synthetic Data for Domain Adaptation in End-to-End Speech Recognition
- Speaker-invariant Speech-To-Intent Classification for Low-Resource Languages
- Speaker-Dependent Visual Command Recognition in Vehicle Cabin: Methodology and Evaluation
- Optimised Code-Switched Language Model Data Augmentation in Four Under-Resourced South African Languages
- Synthesis Speech based Data Augmentation for Low Resource Children ASR
- End-to-End Russian Speech Recognition Models with Multi-Head Attention
- Word-level Style Control for Expressive, Non-attentive Speech Synthesis
- Perceiving Speech Aggression with and without Textual Context on Twitter Social Network Site
- Assessing Speaker Interpolation in Neural Text-to-Speech
- A Mobile Application for Detection of Amyotrophic Lateral Sclerosis via Voice Analysis
- Child's Emotional Speech Classification by Human across Two Languages: Russian & Tamil
- Analysis of Dialogues of Typically Developing Children, Children with Down Syndrome and ASD using Machine Learning Methods
- Speaker Adaptation with Continuous Vocoder-based DNN-TTS
- Automatic Recognition of the Psychoneurological State of Children: Autism Spectrum Disorders, Down Syndrome, Typical Development
- Study on Acoustic Model Personalization in a Context of Collaborative Learning Constrained by Privacy Preservation
- USC: An Open-Source Uzbek Speech Corpus and Initial Speech Recognition Experiments
- A Study of Multilingual End-to-End Speech Recognition for Kazakh, Russian, and English
- Dialog Speech Sentiment Classification for Imbalanced Datasets
- Explicit Control of the Level of Expressiveness in DNN-based Speech Synthesis by Embedding Interpolation
- Experimental Analysis of Expert and Quantitative Estimates of Syllable Recordings in the Process of Speech Rehabilitation
- Methods for Using Class Based N-gram Language Models in the Kaldi Toolkit
- Spectral Root Features for Replay Spoof Detection in Voice Assistants
- Influence of the Aggressive Internet Environment on Cognitive Personality Disorders (in Relation to the Russian Young Generation of Users)
- Media Content vs Nature Stimuli Influence on Human Brain Activity
- Can Your Eyes Tell Us Why You Hesitate? Comparing Reading Aloud in Russian as L1 and Japanese as L2
- Recognition of Heavily Accented and Emotional Speech of English and Czech Holocaust Survivors using Various DNN Architectures
- Assessing Speaker-Independent Character Information for Acted Voices
- Influence of Speaker Pre-training on Character Voice Representation
- Opinion Classification via Word and Emoji Embedding Models with LSTM
- An Equal Data Setting for Attention-based Encoder-Decoder and HMM/DNN Models: a Case Study in Finnish ASR
- Speaker-aware Training of Speech Emotion Classifier with Speaker Recognition
- Neural Network Recognition of Russian Noun and Adjective Cases in the Google Books Ngram Corpus
- Is it a Filler or a Pause? A Quantitative Analysis of Filled Pauses in Hebrew
- Modified Group Delay Function using Different Spectral Smoothing Techniques for Voice Liveness Detection
- Complex Rhythm Adjustments in Multilingual Code-Switching across Mandarin, English and Russian
- Increasing the Precision of Dysarthric Speech Intelligibility and Severity Level Estimate
- Articulation During Voice Disguise: a Pilot Study
- Improvement of Speaker Number Estimation by Applying an Overlapped Speech Detector
- Mind Your Tweet: Abusive Tweet Detection
- Speaker Authorization for Air Traffic Control Security
- Prosodic Changes with Age: a Longitudinal Study on a Famous European Portuguese Native Speaker
- Automatic Selection of the Most Characterizing Features for Detecting COPD in Speech
- Multilingual Training Set Selection for ASR in Under-Resourced Malian Languages
- Human and Transformer-Based Prosodic Phrasing in Two Speech Genres
- Learning Efficient Representations for Keyword Spotting with Triplet Loss
- Regularized Forward-Backward Decoder for Attention Models
- Induced Local Attention for Transformer Models in Speech Recognition
- Applying EEND Diarization to Telephone Recordings from a Call Center
- Acoustic Characteristics of Speech Entrainment in Dialogues in Similar Phonetic Sequences
- Predicting Biometric Error Behaviour from Speaker Embeddings and a Fast Score Normalization Scheme