Multimodal Speech Separation with Feedback Architecture
This thesis tackles the high computational cost and output inconsistency of processing long audio sequences in multimodal speech separation. We introduce the Self-Feedback RE-Sepformer (SFRS), an architecture integrating an RNN-inspired incremental inference mechanism with a RE-Sepformer backbone. SF...
Main Author: | |
---|---|
Other Authors: | , , , |
Format: | Master's thesis |
Language: | eng |
Published: | 2025 |
Subjects: | |
Online Access: | https://jyx.jyu.fi/handle/123456789/102963 |