Publication:

CopyNE: Better Contextual ASR by Copying Named Entities

Shilin Zhou, Zhenghua Li, Yu Hong, M. Zhang, Zhefeng Wang, Baoxing Huai • @arXiv • 22 May 2023

TLDR: CopyNE is a novel approach that uses a span-level copying mechanism to improve how ASR transcribes entities. It can copy all tokens of an entity at once, avoiding the errors caused by homophonic or near-homophonic tokens.

Citations: 3
Abstract: Recent years have seen remarkable progress in automatic speech recognition (ASR). However, traditional token-level ASR models have struggled with accurately transcribing entities due to the problem of homophonic and near-homophonic tokens. This paper introduces a novel approach called CopyNE, which uses a span-level copying mechanism to improve ASR in transcribing entities. CopyNE can copy all tokens of an entity at once, effectively avoiding errors caused by homophonic or near-homophonic tokens that occur when predicting multiple tokens separately. Experiments on Aishell and ST-cmds datasets demonstrate that CopyNE achieves significant reductions in character error rate (CER) and named entity CER (NE-CER), especially in entity-rich scenarios. Furthermore, even when compared to the strong Whisper baseline, CopyNE still achieves notable reductions in CER and NE-CER. Qualitative comparisons with previous approaches demonstrate that CopyNE can better handle entities, effectively improving the accuracy of ASR.
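
The abstract reports results as character error rate (CER). As a minimal sketch of how that metric is commonly computed, the snippet below uses the standard definition: the character-level Levenshtein distance between reference and hypothesis, divided by the reference length. The function names `edit_distance` and `cer` are illustrative and not taken from the paper, and NE-CER is not covered because this page does not define it.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one wrong character in a two-character name,
    # the kind of homophone substitution the abstract describes.
    print(cer("张三", "章三"))
```

A single substituted character in a short entity produces a large error for that span, which is why entity-focused metrics can differ sharply from the overall CER.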
