How do I get started with VIT?

The VIT wake word and Voice Command engines can be accessed through our online tools and our MCUXpresso SDK. For VIT Speech to Intent, please contact us at voice@nxp.com with your specific requirements.

Does NXP have voice software application examples?

Yes. Visit our application software pack page or our Application Code Hub. You can also view demo videos showcasing our voice software.

What is the difference between voice UI and voice communications?

Voice UI refers to “voice-first” devices that use voice as a user interface. NXP's voice UI software technologies are VIT, VoiceSpot and VoiceSeeker.

Voice communications refers to two-way, person-to-person communication using voice; that is, telephony. NXP's voice communications software technology is Conversa.

What is the difference between VoiceSpot and VIT? When should you use one versus the other?

VoiceSpot is a highly accurate, highly optimized wake word and acoustic event detection engine. It is based on deep learning neural network techniques and requires large datasets for training. VoiceSpot is appropriate for customers who need the highest response rates with the fewest false alarms, and for customers who need to run in ultralow-power states while waiting for the voice or acoustic trigger.

The VIT software suite is built on phoneme-based automatic speech recognition (ASR) technology. This technology maps spoken phonemes (the basic building blocks of speech) into words, which can then be recognized as wake words and commands and transformed into intents and actions. Because VIT is based on phonemes, wake word and command models can be created quickly with a keyboard and NXP's online model creation tools.

The VIT wake word and Voice Command engines are appropriate for customers who want to build custom wake words and voice commands independently, or who want to quickly experiment with voice as a user interface. VIT Speech to Intent is for customers who want to create an experience similar to natural language understanding on edge processors, without relying on cloud connectivity or cloud ASR transcription services.
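To make the phoneme-based idea more concrete, here is a minimal sketch of how a stream of recognized phonemes could be mapped to words and then to a command. It is an illustration only, not VIT's API or algorithm: the lexicon, the wake word, the command table, the intent names, and the functions `phonemes_to_words` and `interpret` are all hypothetical, and a real engine works on acoustic scores rather than clean phoneme symbols.

```python
# Hypothetical pronunciation lexicon: phoneme tuples -> words (ARPAbet-like symbols).
LEXICON = {
    ("HH", "EY"): "hey",
    ("L", "AY", "T"): "light",
    ("T", "ER", "N"): "turn",
    ("AA", "N"): "on",
    ("AO", "F"): "off",
}

# Hypothetical wake word and command table: word sequences -> intent names.
WAKE_WORD = ("hey", "light")
COMMANDS = {
    ("turn", "on"): "LIGHT_ON",
    ("turn", "off"): "LIGHT_OFF",
}

MAX_WORD_LEN = max(len(key) for key in LEXICON)


def phonemes_to_words(phonemes):
    """Greedy longest-match segmentation of a phoneme sequence into words."""
    words = []
    i = 0
    while i < len(phonemes):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(MAX_WORD_LEN, len(phonemes) - i), 0, -1):
            word = LEXICON.get(tuple(phonemes[i:i + n]))
            if word is not None:
                words.append(word)
                i += n
                break
        else:
            i += 1  # no word starts here: skip this phoneme
    return words


def interpret(phonemes):
    """Return an intent if the wake word is followed by a known command, else None."""
    words = tuple(phonemes_to_words(phonemes))
    if words[:len(WAKE_WORD)] != WAKE_WORD:
        return None
    return COMMANDS.get(words[len(WAKE_WORD):])
```

Because the vocabulary in this sketch is just a table of phoneme spellings, adding a new wake word or command means typing a new entry rather than collecting and training on audio data, which is the property the answer above attributes to the phoneme-based approach.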

What is VoiceSeeker and how do you use it?

VoiceSeeker is a multi-microphone beamforming audio front-end signal processing solution for voice user interfaces. VoiceSeeker discriminates between signal and noise and is especially effective in far-field, reverberant conditions.

VoiceSeeker is offered in a standard free-to-use option and a premium option. VoiceSeeker without an acoustic echo canceler (AEC) is freely available via NXP's MCUXpresso SDK and integrates easily with VoiceSpot or VIT. The premium VoiceSeeker option includes an AEC and is available via controlled distribution from NXP.

VoiceSeeker is frequently used in far-field voice control applications such as smart speakers and home controllers, but it can also be used in the mid- and near-field where interfering noise needs to be canceled.
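For readers new to beamforming, the sketch below shows the textbook delay-and-sum technique, the simplest form of multi-microphone beamforming. It is not VoiceSeeker's algorithm, which is not described here; the function names and the use of whole-sample delays are assumptions made to keep the example short.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its integer sample delay, then average.

    channels: list of equal-rate sample lists, one per microphone.
    delays:   arrival delay of the target source at each microphone, in samples.
    """
    # Only keep the time range for which every aligned channel has data.
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[t + d] for ch, d in zip(channels, delays)) / len(channels)
        for t in range(length)
    ]


def energy(samples):
    """Sum of squared samples, used here to compare beamformer outputs."""
    return sum(x * x for x in samples)
```

When the delays match the direction of the talker, the speech adds coherently across microphones while sound from other directions adds out of phase and is attenuated. Steering with the wrong delays has the opposite effect on the talker, which is why the steered output carries more energy from the target than a mis-steered one.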

Voice Processing

Comprehensive software for voice processing at the edge


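NXP Embedded Voice Communication Suite

NXP offers a range of voice control, audio, and communication software and solutions that deliver high-quality, reliable embedded voice processing for person-to-person and person-to-machine voice applications. NXP voice communication software is designed for small-footprint, low-power applications based on our range of MCUs, MPUs, and DSPs.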

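Voice Processing Applications

Industrial Control

Consumer Electronics

Design Resources

Boards and Designs

EdgeReady Voice Solutions

Complete, production-grade hardware and software platforms, qualified by NXP, that enable rapid development and provide a turnkey solution.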

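Documentation

Edge Voice Processing Software

NXP provides reliable voice, audio, and communication solutions for human-to-machine voice processing.

Overview

September 19, 2023

Revision 1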



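Voice Processing Products

Voice Processing Software Portfolio

Audio Processing

Audio Front End

Conversational AI

Voice Calling

Voice User Interaction

Voice Enhancement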