Multimodal Speech Separation with Feedback Architecture
This thesis tackles the high computational cost and output inconsistency of processing long audio sequences in multimodal speech separation. We introduce the Self-Feedback RE-Sepformer (SFRS), an architecture that integrates an RNN-inspired incremental inference mechanism with a RE-Sepformer backbone. SF...
Main author: | |
---|---|
Other authors: | |
Material type: | Master's thesis (Pro gradu) |
Language: | English |
Published: | 2025 |
Subjects: | |
Links: | https://jyx.jyu.fi/handle/123456789/102963 |