XAF Record Example
Overview
The dsp_xaf_record application demonstrates audio processing using the DSP core and the Xtensa Audio Framework (XAF) middleware library, with a focus on audio recording, processing and voice recognition (VIT - Voice Intelligent Technology, Voice seeker).
As shown in the table below, the application is supported on several development boards, each of which may have certain limitations. Some boards also require hardware modifications or allow the use of an audio expansion board. Therefore, check the supported features and the Hardware Modifications and Example Configuration sections before running the demo.
Board | VIT | Voice seeker |
---|---|---|
EVK-MIMXRT595 | OK | 2 dmic |
EVK-MIMXRT685 | OK | 2 dmic |
MIMXRT685-AUD-EVK | OK | 1 dmic/3 dmic with DMIC board |
MIMXRT700-EVK | OK | X |
Functionality
The application includes the following main components:
ARM Core (CM33) - Handles user interface and communicates with the DSP core
DSP Core - Processes audio data using the Xtensa Audio Framework (XAF)
The typical audio processing pipeline includes:
Audio source component - DMIC audio
VIT and Voice seeker component (perform voice recognition)
Renderer component (playback on codec)
The application demonstrates recording from digital microphones (DMIC), processing the audio with voice enhancement algorithms and performing voice recognition. It prints the detected WakeWord and the list of commands to the console.
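The component chain described above can be pictured as an ordered list of stages joined pairwise. The following is a minimal, self-contained sketch in plain C. It does not call the XAF API, and `stage_names` and `describe_pipeline` are hypothetical names chosen for illustration. The stage labels are the ones printed in the demo's DSP log, and the Voice seeker stage is only present on boards that support it.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of the record pipeline; labels mirror the demo's DSP log. */
static const char *const stage_names[] = {
    "CAPTURER",
    "XA_GAIN_0",
    "XA_VOICE_SEEKER_0",
    "XA_VIT_PRE_PROC_0",
    "XA_RENDERER_0",
};

#define STAGE_COUNT (sizeof(stage_names) / sizeof(stage_names[0]))

/* Writes one "A -> B" line per adjacent stage pair into out.
 * Returns the number of links written; stops early if out is too small. */
size_t describe_pipeline(char *out, size_t out_size)
{
    size_t used = 0;
    size_t links = 0;

    if (out_size > 0) {
        out[0] = '\0';
    }
    for (size_t i = 0; i + 1 < STAGE_COUNT; i++) {
        size_t room = out_size - used;
        int n = snprintf(out + used, room, "%s -> %s\n",
                         stage_names[i], stage_names[i + 1]);
        if (n < 0 || (size_t)n >= room) {
            break; /* not enough room for this link */
        }
        used += (size_t)n;
        links++;
    }
    return links;
}
```

Calling `describe_pipeline` with a 256-byte buffer fills it with four lines, one per connection between neighbouring stages.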
Hardware Requirements
Development board (one of the following):
EVK-MIMXRT595 board
EVK-MIMXRT685 board
MIMXRT685-AUD-EVK board (optionally with 8CH-DMIC expansion board - rev B required)
MIMXRT700-EVK board
Micro USB cable
JTAG/SWD debugger
Headphones with 3.5 mm stereo jack
Personal Computer
Hardware Modifications
Some development boards need hardware modifications to run the application.
EVK-MIMXRT595:
To enable the example audio using WM8904 codec, connect pins as follows:
JP7-1 <–> JP8-2
Note: The I3C pin configuration in pin_mux.c is verified for the default 1.8 V supply. For 3.3 V, manually configure the slew rate to slow mode for I3C-SCL/SDA.
EVK-MIMXRT685:
To enable the example audio using WM8904 codec, connect pins as follows:
JP7-1 <–> JP8-2
MIMXRT685-AUD-EVK:
Set the hardware jumpers (Tower system/base module) to default settings.
Set hardware jumpers JP2 2<–>3, JP44 1<–>2 and JP45 1<–>2.
For 8CH-DMIC expansion board (optional):
Connect the 8CH-DMIC expansion board to the DMIC connector (J31) on the MIMXRT685-AUD-EVK board. For safety reasons, connect the expansion board only while the power supply is disconnected.
Set the hardware jumpers on the 8CH-DMIC expansion board to the 2MIC, 3MICA, 3MICC configuration (short J6, J9 and J10).
Set the hardware jumpers JP44 2<–>3 and JP45 2<–>3 on the MIMXRT685-AUD-EVK board for on-board DMIC bypass.
MIMXRT700-EVK:
Set the hardware jumpers to default settings.
Preparation
Connect headphones to the Audio HP / Line-Out connector.
EVK-MIMXRT595 - J4
EVK-MIMXRT685 - J4
MIMXRT685-AUD-EVK - J4, J50 for third channel when using 3 microphones
MIMXRT700-EVK - J29
Connect a micro USB cable between the PC host and the debug USB port on the development board.
EVK-MIMXRT595 - J40
EVK-MIMXRT685 - J5
MIMXRT685-AUD-EVK - J5
MIMXRT700-EVK - J54
Open a serial terminal with the following settings:
115200 baud rate
8 data bits
No parity
One stop bit
No flow control
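For readers who script the serial connection instead of using a terminal program, the settings listed above map onto POSIX termios flags roughly as follows. This is a sketch that assumes a Linux or macOS host. The helper name `apply_demo_serial_settings` is hypothetical, and the serial device path depends on the host and is not shown.

```c
#include <termios.h>

/* Applies 115200 baud, 8 data bits, no parity, one stop bit and no flow
 * control to an already-populated termios structure.
 * Returns 0 on success, -1 if the baud rate could not be set. */
int apply_demo_serial_settings(struct termios *tio)
{
    if (cfsetispeed(tio, B115200) != 0 || cfsetospeed(tio, B115200) != 0) {
        return -1;
    }
    tio->c_cflag &= ~(tcflag_t)CSIZE;             /* clear the data-size bits   */
    tio->c_cflag |= CS8;                          /* 8 data bits                */
    tio->c_cflag &= ~(tcflag_t)(PARENB | CSTOPB); /* no parity, one stop bit    */
    tio->c_cflag |= CLOCAL | CREAD;               /* ignore modem lines, enable RX */
    tio->c_iflag &= ~(tcflag_t)(IXON | IXOFF | IXANY); /* no software flow control */
#ifdef CRTSCTS
    tio->c_cflag &= ~(tcflag_t)CRTSCTS;           /* no hardware flow control   */
#endif
    return 0;
}
```

In typical use, the structure is first filled with `tcgetattr(fd, &tio)`, passed to the helper, and then written back with `tcsetattr(fd, TCSANOW, &tio)`.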
Download the program for CM33 core to the target board.
Launch the debugger in your IDE to begin running the demo.
If building the release configuration, start the xt-ocd daemon and download the program for the DSP core to the target board. If building the debug configuration, launch the Xtensa IDE or the xt-gdb debugger to begin running the demo.
Notes:
The DSP image can only be debugged using a J-Link debugger. See the document ‘Getting Started with Xplorer’ for your particular board for more information.
Example Configuration
The example can be configured by the user. Before changing the configuration, check the table in the Overview section to see whether the feature is supported on your development board.
MIMXRT685-AUD-EVK 8CH-DMIC expansion board settings:
Select how many microphones should be used
Set the BOARD_DMIC_NUM preprocessor macro to 1, 2, 3 (default) or 4 in the project for the CM33 core.
When the 8CH-DMIC expansion board is used, the DMIC_BOARD_CONNECTED macro must be set to 1 (default) in the project for the DSP core.
Important: When you set the value to 2, 3 or 4, you must connect the 8CH-DMIC expansion board and set the DMIC_BOARD_CONNECTED macro to 1. Do not forget to set the hardware jumpers JP44 2-3 and JP45 2-3.
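The rule above can be expressed as a compile-time guard. The sketch below is illustrative only: in the SDK the two macros are set in different projects (CM33 and DSP), so a check like this only works where both are visible in the same translation unit. The fallback defaults and the `dmic_config_is_valid` helper are assumptions made for this example.

```c
/* Assumed fallbacks, mirroring the defaults named in the text. */
#ifndef BOARD_DMIC_NUM
#define BOARD_DMIC_NUM 3
#endif
#ifndef DMIC_BOARD_CONNECTED
#define DMIC_BOARD_CONNECTED 1
#endif

#if (BOARD_DMIC_NUM < 1) || (BOARD_DMIC_NUM > 4)
#error "BOARD_DMIC_NUM must be 1, 2, 3 or 4"
#endif

#if (BOARD_DMIC_NUM > 1) && (DMIC_BOARD_CONNECTED != 1)
#error "2, 3 or 4 microphones require the 8CH-DMIC expansion board (DMIC_BOARD_CONNECTED = 1)"
#endif

/* Hypothetical run-time mirror of the same rule, handy for unit tests.
 * Returns 1 when the combination is allowed, 0 otherwise. */
int dmic_config_is_valid(int dmic_num, int board_connected)
{
    if (dmic_num < 1 || dmic_num > 4) {
        return 0;
    }
    if (dmic_num > 1 && board_connected != 1) {
        return 0;
    }
    return 1;
}
```

With a guard like this, an inconsistent combination fails at build time rather than producing a silent misconfiguration on the board.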
Running the Demo
The ARM application powers and clocks the DSP, so it must be loaded before the DSP application. The DSP application can be built with Xtensa Xplorer or the Xtensa C Compiler. The Cortex-M33 application can be built with the other toolchains listed in the MCUXpresso SDK Release Notes.
The release configurations of the demo combine both applications into one ARM image, so the ARM core loads and starts the DSP application on startup. Pre-compiled DSP binary images are provided in the dsp/binary/ directory. If you change the DSP application in a release configuration, rebuild the ARM application after building the DSP application. If you use MCUXpresso IDE for the CM33 core, make sure that the preprocessor symbol DSP_IMAGE_COPY_TO_RAM, found in the IDE project settings, is defined as 1 when building a release configuration.
The debug configurations build two separate applications that must be loaded independently. The toolchains are the same as described above, and the required tool versions can be found in the MCUXpresso SDK Release Notes for the board. If you use MCUXpresso IDE for the CM33 core, make sure that the preprocessor symbol DSP_IMAGE_COPY_TO_RAM, found in the IDE project settings, is defined as 0 when building a debug configuration. Load the ARM application first, because it powers and clocks the DSP.
To debug both the Cortex-M33 and the DSP side of the application, follow these steps:
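To make the role of DSP_IMAGE_COPY_TO_RAM more concrete, the sketch below shows one way a build-time switch of this kind can gate an image copy. It is not the SDK's implementation: the helper `copy_dsp_image_if_enabled`, its parameters, and the fallback default of 1 are assumptions for illustration.

```c
#include <stddef.h>
#include <string.h>

#ifndef DSP_IMAGE_COPY_TO_RAM
#define DSP_IMAGE_COPY_TO_RAM 1 /* release-style value, assumed for this sketch */
#endif

/* Hypothetical helper: copies a DSP image into DSP memory only when the
 * switch is enabled. Returns the number of bytes copied (0 if disabled
 * or if the image does not fit). */
size_t copy_dsp_image_if_enabled(unsigned char *dsp_mem, size_t dsp_mem_size,
                                 const unsigned char *image, size_t image_size)
{
#if DSP_IMAGE_COPY_TO_RAM
    if (image_size > dsp_mem_size) {
        return 0; /* refuse to overflow the destination */
    }
    memcpy(dsp_mem, image, image_size);
    return image_size;
#else
    /* Debug-style build: the DSP image is loaded separately by the debugger. */
    (void)dsp_mem;
    (void)dsp_mem_size;
    (void)image;
    (void)image_size;
    return 0;
#endif
}
```

With the switch set to 0, the function compiles to a no-op, which matches the debug flow where the DSP application is downloaded independently.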
It is necessary to run the Cortex-M33 side first and stop the application before the DSP_Start function
Run the xt-ocd daemon with proper settings
Download and debug the DSP application
To get TRACE debug output from XAF, define XF_TRACE as 1 in the project settings. The TRACE output can be saved to RAM by defining DUMP_TRACE_TO_BUF as 1 at the project level. See the initialization of the TRACE function in the xaf_main_dsp.c file. For more details, see the XAF documentation.
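The idea of dumping trace output to RAM instead of a console can be sketched as a small formatted-append buffer. This is a generic illustration, not the XAF trace implementation: `trace_to_buf`, `trace_contents` and `TRACE_BUF_SIZE` are hypothetical names, and the real hook is set up in xaf_main_dsp.c.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define TRACE_BUF_SIZE 256 /* hypothetical size for this sketch */

static char trace_buf[TRACE_BUF_SIZE];
static size_t trace_used;

/* Appends a formatted message to the RAM buffer.
 * Returns the number of characters stored, or 0 if the message did not fit
 * (in which case the buffer is left unchanged). */
int trace_to_buf(const char *fmt, ...)
{
    va_list ap;
    size_t room = TRACE_BUF_SIZE - trace_used;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(trace_buf + trace_used, room, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= room) {
        trace_buf[trace_used] = '\0'; /* discard the partial write */
        return 0;
    }
    trace_used += (size_t)n;
    return n;
}

/* Returns the accumulated trace text as a NUL-terminated string. */
const char *trace_contents(void)
{
    return trace_buf;
}
```

A buffer like this can be inspected from the debugger after the run, which is useful when no console is attached to the DSP.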
Running on CM33
When the demo runs successfully, the CM33 terminal will display the following output (example from MIMXRT700-EVK):
******************************
DSP audio framework demo start
******************************
[CM33 Main] Configure codec
[DSP_Main] Cadence Xtensa Audio Framework
[DSP_Main] Library Name : Audio Framework (Hostless)
[DSP_Main] Library Version : 3.5
[DSP_Main] API Version : 3.2
[DSP_Main] start
[DSP_Main] established RPMsg link
[CM33 Main] DSP image copied to DSP TCM
[CM33 Main][APP_DSP_IPC_Task] start
[CM33 Main][APP_Shell_Task] start
Copyright 2024 NXP
>>
Type “help” to see the command list. A similar description will be displayed on the serial console (example from MIMXRT700-EVK):
"help": List all the registered commands
"exit": Exit program
"version": Query DSP for component versions
"record_dmic": Record DMIC audio , perform voice recognition (VIT) and playback on codec
USAGE: record_dmic [language]
For voice recognition say supported WakeWord and in 3s frame supported command.
If selected model contains strings, then WakeWord and list of commands will be printed in console.
NOTE: this command does not return to the shell
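The help text states that a command must follow the WakeWord within a 3 s frame. The timing rule can be modelled with a small state machine, sketched below. VIT handles this internally, so the `cmd_window_t` type, the helper functions and the inclusive 3000 ms boundary are assumptions made purely for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define COMMAND_WINDOW_MS 3000u /* 3 s frame from the help text */

/* Hypothetical tracker for the WakeWord-then-command timing rule. */
typedef struct {
    bool armed;       /* true after a WakeWord, until one command is checked */
    uint32_t wake_ms; /* timestamp of the last WakeWord, in milliseconds     */
} cmd_window_t;

/* Call when a WakeWord is detected at time now_ms. */
void cmd_window_on_wakeword(cmd_window_t *w, uint32_t now_ms)
{
    w->armed = true;
    w->wake_ms = now_ms;
}

/* Call when a command is heard at time now_ms.
 * Returns true only if a WakeWord preceded it within the window.
 * The window is consumed by the first command, accepted or not. */
bool cmd_window_accept(cmd_window_t *w, uint32_t now_ms)
{
    if (!w->armed) {
        return false;
    }
    w->armed = false;
    /* Unsigned subtraction stays correct across timestamp wrap-around. */
    return (uint32_t)(now_ms - w->wake_ms) <= COMMAND_WINDOW_MS;
}
```

In this model, a command heard 2.5 s after the WakeWord is accepted, while one heard after 3 s is rejected and requires a new WakeWord.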
After running the “record_dmic en” command, output similar to the following will be printed:
[CM33 CMD] Setting VIT language to en
[DSP_Main] Number of channels 1, sampling rate 16000, PCM width 32
[CM33 CMD] [APP_DSP_IPC_Task] response from DSP, cmd: 13, error: 0
[DSP Record] Audio Device Ready
[CM33 CMD] DSP DMIC Recording started
[CM33 CMD] To see VIT functionality say wakeword and command
[DSP VIT] VIT Model info
[DSP VIT] VIT Model Release = 0x40a00
[DSP VIT] Language supported : English
[DSP VIT] Number of WakeWords supported : 2
[DSP VIT] Number of Commands supported : 12
[DSP VIT] VIT_Model integrating WakeWord and Voice Commands strings : YES
[DSP VIT] WakeWords supported :
[DSP VIT] 'HEY NXP'
[DSP VIT] 'HEY TV'
[DSP VIT] Voice commands supported :
[DSP VIT] 'MUTE'
[DSP VIT] 'NEXT'
[DSP VIT] 'SKIP'
[DSP VIT] 'PAIR DEVICE'
[DSP VIT] 'PAUSE'
[DSP VIT] 'STOP'
[DSP VIT] 'POWER OFF'
[DSP VIT] 'POWER ON'
[DSP VIT] 'PLAY MUSIC'
[DSP VIT] 'PLAY GAME'
[DSP VIT] 'WATCH CARTOON'
[DSP VIT] 'WATCH MOVIE'
[DSP Record] connected CAPTURER -> GAIN_0
[DSP Record] connected XA_GAIN_0 -> XA_VIT_PRE_PROC_0
[DSP Record] connected XA_VIT_PRE_PROC_0 -> XA_RENDERER_0
[DSP VIT] - WakeWord detected 1 HEY NXP
[DSP VIT] - Voice Command detected 6 STOP
Xtensa IDE log after the command starts successfully:
Number of channels 2, sampling rate 16000, PCM width 16
Audio Device Ready
connected CAPTURER -> GAIN_0
connected CAPTURER -> XA_VIT_PRE_PROC_0
connected XA_VIT_PRE_PROC_0 -> XA_RENDERER_0
Running on DSP
Debug configuration: When the demo runs successfully, the terminal will display the following:
Cadence Xtensa Audio Framework
Library Name : Audio Framework (Hostless)
Library Version : 3.2
API Version : 3.0
[DSP_Main] start
[DSP_Main] established RPMsg link
Number of channels 2, sampling rate 16000, PCM width 16
Audio Device Ready
VoiceSeekerLight lib initialized!
============= VoiceSeekerLight Configuration =============
version = 0.6.0
num mics = 2
max num mics = 4
mic0 = (35, 0, 0)
mic1 = (-35, 0, 0)
mic2 = (0, -35, 0)
num_spks = 0
max num spks = 2
samplerate = 16000
framesize_in = 32
framesize_out = 480
create_aec = 0
create_doa = 0
buffer_length_sec = 1.5
aec_filter_length_ms = 0
============= VoiceSeekerLight Memory Allocation =============
VoiceSeekerLib allocated 80592 persistent bytes
VoiceSeekerLib allocated 3840 scratch bytes
==================================== VoiceSeekerLight Memory Usage ==================================
Total = 72400 bytes
connected CAPTURER -> GAIN_0
connected XA_GAIN_0 -> XA_VOICE_SEEKER_0
connected XA_VOICE_SEEKER_0 -> XA_VIT_PRE_PROC_0
connected XA_VIT_PRE_PROC_0 -> XA_RENDERER_0
Known Issues
The release SRAM target has limited features because of memory limitations. To enable or disable a component, set the corresponding preprocessor define (e.g. XA_VIT_PRE_PROC) to 1 or 0 in the project settings. Debug and flash targets have full functionality enabled.