Multimodal Speech Separation with Feedback Architecture
This thesis tackles the high computational cost and output inconsistency of processing long audio sequences in multimodal speech separation. We introduce the Self-Feedback RE-Sepformer (SFRS), an architecture that integrates an RNN-inspired incremental inference mechanism with a RE-Sepformer backbone. SF...
| Main Author: | |
|---|---|
| Other Authors: | , , , |
| Format: | Master's thesis |
| Language: | eng |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://jyx.jyu.fi/handle/123456789/102963 |