XAF Record Example

Overview

The dsp_xaf_record application demonstrates audio processing using the DSP core and the Xtensa Audio Framework (XAF) middleware library, with a focus on audio recording, processing, and voice recognition (VIT - Voice Intelligent Technology, and Voice seeker).

As shown in the table below, the application is supported on several development boards, each of which may have certain limitations. Some boards also require hardware modifications or allow the use of an audio expansion board. Therefore, check the supported features and the Hardware Modifications and Example Configuration sections before running the demo.

| Board | VIT | Voice seeker |
|---|---|---|
| EVK-MIMXRT595 | OK | 2 dmic |
| EVK-MIMXRT685 | OK | 2 dmic |
| MIMXRT685-AUD-EVK | OK | 1 dmic/3 dmic with DMIC board |
| MIMXRT700-EVK | OK | X |

- Dark green - Fully supported and enabled by default.
- Light green - Supported, but only available after some SW or HW modification. More information about the modifications can be found in the [Example configuration](#example-configuration) section.
- X - Not supported.

Functionality

The application includes the following main components:

  1. ARM Core (CM33) - Handles user interface and communicates with the DSP core

  2. DSP Core - Processes audio data using the Xtensa Audio Framework (XAF)

The typical audio processing pipeline includes:

  • Audio source component - DMIC audio

  • VIT and Voice seeker component (perform voice recognition)

  • Renderer component (playback on codec)

The application demonstrates recording from digital microphones (DMIC), processing the audio with voice enhancement algorithms, performing voice recognition, and printing the detected WakeWord and the list of commands to the console.

Hardware Requirements

  • Development board (one of the following):

    • EVK-MIMXRT595 board

    • EVK-MIMXRT685 board

    • MIMXRT685-AUD-EVK board (optionally with 8CH-DMIC expansion board - rev B required)

    • MIMXRT700-EVK board

  • Micro USB cable

  • JTAG/SWD debugger

  • Headphones with 3.5 mm stereo jack

  • Personal Computer

Hardware Modifications

Some development boards require hardware modifications to run the application.

  • EVK-MIMXRT595:

    To enable the example audio using WM8904 codec, connect pins as follows:

    • JP7-1 <–> JP8-2

    Note: The I3C pin configuration in pin_mux.c is verified for the default 1.8 V. For 3.3 V, manually configure the slew rate to slow mode for I3C-SCL/SDA.

  • EVK-MIMXRT685:

    To enable the example audio using WM8904 codec, connect pins as follows:

    • JP7-1 <–> JP8-2

  • MIMXRT685-AUD-EVK:

    1. Set the hardware jumpers (Tower system/base module) to default settings.

    2. Set hardware jumpers JP2 2<–>3, JP44 1<–>2 and JP45 1<–>2.

    For 8CH-DMIC expansion board (optional):

    1. Connect the 8CH-DMIC expansion board to the DMIC connector (J31) on the MIMXRT685-AUD-EVK board. For safety reasons, connect the expansion board only while the power supply is disconnected.

    2. Set the hardware jumpers on the 8CH-DMIC expansion board to the 2MIC, 3MICA, 3MICC configuration (short J6, J9, J10).

    3. Set the hardware jumpers JP44 2<–>3 and JP45 2<–>3 on the MIMXRT685-AUD-EVK board for on-board DMIC bypass.

  • MIMXRT700-EVK:

    Set the hardware jumpers to default settings.

Preparation

  1. Connect headphones to Audio HP / Line-Out connector.

    • EVK-MIMXRT595 - J4

    • EVK-MIMXRT685 - J4

    • MIMXRT685-AUD-EVK - J4, J50 for third channel when using 3 microphones

    • MIMXRT700-EVK - J29

  2. Connect a micro USB cable between the PC host and the debug USB port on the development board.

    • EVK-MIMXRT595 - J40

    • EVK-MIMXRT685 - J5

    • MIMXRT685-AUD-EVK - J5

    • MIMXRT700-EVK - J54

  3. Open a serial terminal with the following settings:

    • 115200 baud rate

    • 8 data bits

    • No parity

    • One stop bit

    • No flow control

  4. Download the program for CM33 core to the target board.

  5. Launch the debugger in your IDE to begin running the demo.

  6. If building the debug configuration, start the xt-ocd daemon, download the program for the DSP core to the target board, and launch the Xtensa IDE or xt-gdb debugger to begin running the demo. In the release configuration, the DSP program is part of the CM33 image, so no separate DSP download is needed.
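
The terminal settings from step 3 can also be applied programmatically when a host-side tool reads the demo console. The following is a minimal sketch for a Linux or other POSIX-like host and is not part of the SDK. The helper name `console_termios_init` is hypothetical, and opening the serial device (whose path depends on the host) is left out.

```c
#include <assert.h>
#include <string.h>
#include <termios.h>

/* Hypothetical host-side helper (not an SDK function): fills *tio with
 * 115200 baud, 8 data bits, no parity, one stop bit, no flow control.
 * Returns 0 on success, -1 on failure. */
int console_termios_init(struct termios *tio)
{
    assert(tio != NULL);
    memset(tio, 0, sizeof *tio);

    /* 8 data bits, receiver enabled, modem control lines ignored.
     * PARENB (parity) and CSTOPB (two stop bits) stay cleared. */
    tio->c_cflag = CS8 | CREAD | CLOCAL;

    /* c_iflag stays 0, so software flow control (IXON/IXOFF) is off.
     * No hardware flow control bit is set in c_cflag either. */

    /* A read blocks until at least one byte has arrived. */
    tio->c_cc[VMIN]  = 1;
    tio->c_cc[VTIME] = 0;

    if (cfsetispeed(tio, B115200) != 0) {
        return -1;
    }
    return cfsetospeed(tio, B115200);
}
```

After opening the serial device, the filled structure would typically be applied with `tcsetattr(fd, TCSANOW, &tio)`.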

Notes:

  • The DSP image can only be debugged using a J-Link debugger. See the document ‘Getting Started with Xplorer’ for your particular board for more information.

Example Configuration

The example can be configured by the user. Before configuring it, check the table in the Overview section to see whether the feature is supported on your development board.

  • MIMXRT685-AUD-EVK 8CH-DMIC expansion board settings:

    Select how many microphones should be used:

    • Set the BOARD_DMIC_NUM preprocessor macro to 1, 2, 3 (default), or 4 in the project for the CM33 core.

    • When the 8CH-DMIC expansion board is used, the DMIC_BOARD_CONNECTED macro must be set to 1 (default) in the project for the DSP core.

    • Important: When you set the value to 2, 3, or 4, you must connect the 8CH-DMIC expansion board and set the DMIC_BOARD_CONNECTED macro to 1. Don’t forget to set the hardware jumpers JP44 2<–>3 and JP45 2<–>3.
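
The rule in the bullets above can be expressed as a small consistency check. This is a sketch only: the two macros live in different projects (CM33 and DSP), so combining them in one file is an assumption made for illustration, and `dmic_config_is_valid` is a hypothetical helper that does not exist in the SDK. It also assumes that a single microphone is acceptable with or without the expansion board.

```c
#include <assert.h>
#include <stdbool.h>

/* Defaults as stated above. In the real example these are set in the
 * project settings of the CM33 core and the DSP core respectively. */
#ifndef BOARD_DMIC_NUM
#define BOARD_DMIC_NUM 3
#endif
#ifndef DMIC_BOARD_CONNECTED
#define DMIC_BOARD_CONNECTED 1
#endif

/* Hypothetical helper: returns true when the microphone count is in range
 * and the expansion board is marked as connected whenever more than one
 * microphone is requested. */
bool dmic_config_is_valid(int dmic_num, int board_connected)
{
    if (dmic_num < 1 || dmic_num > 4) {
        return false;
    }
    if (dmic_num >= 2 && board_connected != 1) {
        return false;
    }
    return true;
}

/* The same rule enforced at build time for the values chosen above. */
static_assert(BOARD_DMIC_NUM >= 1 && BOARD_DMIC_NUM <= 4,
              "BOARD_DMIC_NUM must be 1, 2, 3, or 4");
static_assert(BOARD_DMIC_NUM == 1 || DMIC_BOARD_CONNECTED == 1,
              "2 to 4 microphones require DMIC_BOARD_CONNECTED == 1");
```

A build-time check like this cannot verify the jumpers, so JP44 and JP45 still have to be set by hand.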

Running the Demo

The ARM application powers and clocks the DSP, so it must be loaded before the DSP application. The DSP application can be built with Xtensa Xplorer or the Xtensa C Compiler. The application for the Cortex-M33 core can be built with the other toolchains listed in the MCUXpresso SDK Release Notes.

The release configurations of the demo combine both applications into one ARM image, so the ARM core loads and starts the DSP application on startup. Pre-compiled DSP binary images are provided under the dsp/binary/ directory. If you change the DSP application in the release configuration, rebuild the ARM application after building the DSP application. If you plan to use MCUXpresso IDE for the CM33 core, make sure that the preprocessor symbol DSP_IMAGE_COPY_TO_RAM, found in the IDE project settings, is defined to the value 1 when building the release configuration.

The debug configurations build two separate applications that must be loaded independently. The ARM application powers and clocks the DSP, so it must be loaded before the DSP application. The required versions of the DSP tools (Xtensa Xplorer or Xtensa C Compiler) can be found in the MCUXpresso SDK Release Notes for the board, which also list the toolchains that can build the CM33 application. If you plan to use MCUXpresso IDE for the CM33 core, make sure that the preprocessor symbol DSP_IMAGE_COPY_TO_RAM, found in the IDE project settings, is defined to the value 0 when building the debug configuration.
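
The effect of DSP_IMAGE_COPY_TO_RAM can be pictured with a small host-runnable sketch. It is an illustration only, not the SDK code: the image bytes, the RAM buffer, and the function `dsp_prepare_image` are hypothetical stand-ins. It assumes that with the value 1 the CM33 application copies the embedded DSP image itself, and that with the value 0 the image is left for the debugger to load.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* 1 for release configurations, 0 for debug configurations. */
#ifndef DSP_IMAGE_COPY_TO_RAM
#define DSP_IMAGE_COPY_TO_RAM 1
#endif

static_assert(DSP_IMAGE_COPY_TO_RAM == 0 || DSP_IMAGE_COPY_TO_RAM == 1,
              "DSP_IMAGE_COPY_TO_RAM must be 0 or 1");

/* Stand-ins for the DSP binary embedded in the ARM image and for the
 * DSP memory it is copied to. Both are made up for this sketch. */
static const unsigned char dsp_image[] = {0xDE, 0xAD, 0xBE, 0xEF};
static unsigned char dsp_ram[sizeof dsp_image];

/* Hypothetical helper: returns true if the CM33 side placed the DSP
 * image in RAM itself, false if that is left to the debugger. */
bool dsp_prepare_image(void)
{
#if DSP_IMAGE_COPY_TO_RAM
    memcpy(dsp_ram, dsp_image, sizeof dsp_image);
    return true;
#else
    /* Debug flow: the DSP application is downloaded separately. */
    return false;
#endif
}
```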

To debug both the Cortex-M33 and DSP sides of the application, follow these steps:

  1. Run the Cortex-M33 side first and stop the application before the DSP_Start function.

  2. Run the xt-ocd daemon with the proper settings.

  3. Download and debug the DSP application.

To get TRACE debug output from the XAF, define XF_TRACE as 1 in the project settings. The TRACE output can also be saved into RAM by defining DUMP_TRACE_TO_BUF as 1 at the project level. See the initialization of the TRACE function in the xaf_main_dsp.c file. For more details, see the XAF documentation.
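
To make the interaction of the two defines concrete, the sketch below gates a trace function on XF_TRACE and redirects its output into a RAM buffer when DUMP_TRACE_TO_BUF is 1. This is not the XAF implementation (the real initialization is in xaf_main_dsp.c). The names `demo_trace`, `demo_trace_dump`, `demo_trace_length`, and the buffer size are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#ifndef XF_TRACE
#define XF_TRACE 1           /* 1 enables trace output */
#endif
#ifndef DUMP_TRACE_TO_BUF
#define DUMP_TRACE_TO_BUF 1  /* 1 stores trace text in RAM instead of printing */
#endif

#define DEMO_TRACE_BUF_SIZE 256  /* hypothetical buffer size */

static char demo_trace_buf[DEMO_TRACE_BUF_SIZE];
static size_t demo_trace_len;

/* Formats one trace message. Returns the number of characters the message
 * needed, or 0 when tracing is compiled out. */
int demo_trace(const char *fmt, ...)
{
    int n = 0;
    assert(fmt != NULL);
#if XF_TRACE
    va_list ap;
    va_start(ap, fmt);
#if DUMP_TRACE_TO_BUF
    /* demo_trace_len never exceeds size - 1, so room is at least 1. */
    size_t room = DEMO_TRACE_BUF_SIZE - demo_trace_len;
    n = vsnprintf(demo_trace_buf + demo_trace_len, room, fmt, ap);
    if (n > 0) {
        /* Advance by what was actually stored; excess text is truncated. */
        demo_trace_len += ((size_t)n < room) ? (size_t)n : room - 1;
    }
#else
    n = vprintf(fmt, ap);
#endif
    va_end(ap);
#endif
    return n;
}

/* Accessors for inspecting the RAM buffer, for example from a debugger. */
const char *demo_trace_dump(void) { return demo_trace_buf; }
size_t demo_trace_length(void) { return demo_trace_len; }
```

Keeping the trace in RAM avoids console I/O in the audio path, at the cost of a fixed-size buffer that silently truncates once it is full.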

Running on CM33

When the demo runs successfully, the CM33 terminal will display the following output (example from MIMXRT700-EVK):

    ******************************
    DSP audio framework demo start
    ******************************

    [CM33 Main] Configure codec

    [DSP_Main] Cadence Xtensa Audio Framework
    [DSP_Main] Library Name    : Audio Framework (Hostless)
    [DSP_Main] Library Version : 3.5
    [DSP_Main] API Version     : 3.2

    [DSP_Main] start
    [DSP_Main] established RPMsg link
    [CM33 Main] DSP image copied to DSP TCM
    [CM33 Main][APP_DSP_IPC_Task] start
    [CM33 Main][APP_Shell_Task] start

    Copyright  2024  NXP

    >>

Type help to see the command list. A description similar to the following is displayed on the serial console (example from MIMXRT700-EVK):

    "help": List all the registered commands

    "exit": Exit program

    "version": Query DSP for component versions

    "record_dmic": Record DMIC audio , perform voice recognition (VIT) and playback on codec
    USAGE: record_dmic [language]
    For voice recognition say supported WakeWord and in 3s frame supported command.
    If selected model contains strings, then WakeWord and list of commands will be printed in console.
    NOTE: this command does not return to the shell

After running the “record_dmic en” command, output similar to the following is printed:

    [CM33 CMD] Setting VIT language to en
    [DSP_Main] Number of channels 1, sampling rate 16000, PCM width 32
    [CM33 CMD] [APP_DSP_IPC_Task] response from DSP, cmd: 13, error: 0
    [DSP Record] Audio Device Ready
    [CM33 CMD] DSP DMIC Recording started
    [CM33 CMD] To see VIT functionality say wakeword and command
    [DSP VIT] VIT Model info
    [DSP VIT]   VIT Model Release = 0x40a00
    [DSP VIT]   Language supported : English
    [DSP VIT]   Number of WakeWords supported : 2
    [DSP VIT]   Number of Commands supported : 12
    [DSP VIT]   VIT_Model integrating WakeWord and Voice Commands strings : YES
    [DSP VIT]   WakeWords supported :
    [DSP VIT]    'HEY NXP'
    [DSP VIT]    'HEY TV'
    [DSP VIT]   Voice commands supported :
    [DSP VIT]    'MUTE'
    [DSP VIT]    'NEXT'
    [DSP VIT]    'SKIP'
    [DSP VIT]    'PAIR DEVICE'
    [DSP VIT]    'PAUSE'
    [DSP VIT]    'STOP'
    [DSP VIT]    'POWER OFF'
    [DSP VIT]    'POWER ON'
    [DSP VIT]    'PLAY MUSIC'
    [DSP VIT]    'PLAY GAME'
    [DSP VIT]    'WATCH CARTOON'
    [DSP VIT]    'WATCH MOVIE'
    [DSP Record] connected CAPTURER -> GAIN_0
    [DSP Record] connected XA_GAIN_0 -> XA_VIT_PRE_PROC_0
    [DSP Record] connected XA_VIT_PRE_PROC_0 -> XA_RENDERER_0
    [DSP VIT]  - WakeWord detected 1 HEY NXP
    [DSP VIT]  - Voice Command detected 6 STOP
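
A host-side script or test harness may want to react to the detection lines at the end of this log. The following sketch parses them, assuming the line format stays exactly as shown above ("WakeWord detected <id> <text>" and "Voice Command detected <id> <text>"). The type `vit_event_t` and the function `parse_vit_line` are hypothetical names and not part of VIT or the SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef enum {
    VIT_EVENT_NONE = 0,
    VIT_EVENT_WAKEWORD,
    VIT_EVENT_COMMAND
} vit_event_t;

/* Hypothetical helper: classifies one console line. On a match, stores the
 * numeric id in *id and copies the detected phrase (truncated if needed)
 * into text. Returns VIT_EVENT_NONE for every other line. */
vit_event_t parse_vit_line(const char *line, int *id, char *text, size_t text_size)
{
    static const char wakeword_tag[] = "WakeWord detected";
    static const char command_tag[]  = "Voice Command detected";
    vit_event_t event;
    const char *p;
    char *end;
    long value;
    size_t len;

    assert(line != NULL && id != NULL && text != NULL && text_size > 0);

    if ((p = strstr(line, wakeword_tag)) != NULL) {
        event = VIT_EVENT_WAKEWORD;
        p += sizeof wakeword_tag - 1;
    } else if ((p = strstr(line, command_tag)) != NULL) {
        event = VIT_EVENT_COMMAND;
        p += sizeof command_tag - 1;
    } else {
        return VIT_EVENT_NONE;
    }

    /* strtol skips the spaces before the id; end == p means no digits. */
    value = strtol(p, &end, 10);
    if (end == p) {
        return VIT_EVENT_NONE;
    }
    *id = (int)value;

    /* The phrase is the rest of the line, without the line terminator. */
    p = end;
    while (*p == ' ') {
        p++;
    }
    len = strcspn(p, "\r\n");
    if (len >= text_size) {
        len = text_size - 1;
    }
    memcpy(text, p, len);
    text[len] = '\0';
    return event;
}
```

Lines such as "Number of WakeWords supported : 2" do not contain either tag and are therefore ignored.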

Xtensa IDE log after the command starts successfully:

    Number of channels 2, sampling rate 16000, PCM width 16
    Audio Device Ready
    connected CAPTURER -> GAIN_0
    connected CAPTURER -> XA_VIT_PRE_PROC_0
    connected XA_VIT_PRE_PROC_0 -> XA_RENDERER_0

Running on DSP

Debug configuration: When the demo runs successfully, the terminal will display the following:

    Cadence Xtensa Audio Framework
      Library Name    : Audio Framework (Hostless)
      Library Version : 3.2
      API Version     : 3.0

    [DSP_Main] start
    [DSP_Main] established RPMsg link
    Number of channels 2, sampling rate 16000, PCM width 16

    Audio Device Ready
    VoiceSeekerLight lib initialized!
    ============= VoiceSeekerLight Configuration =============
      version = 0.6.0
      num mics = 2
      max num mics = 4
      mic0 = (35, 0, 0)
      mic1 = (-35, 0, 0)
      mic2 = (0, -35, 0)
      num_spks = 0
      max num spks = 2
      samplerate = 16000
      framesize_in = 32
      framesize_out = 480
      create_aec = 0
      create_doa = 0
      buffer_length_sec = 1.5
      aec_filter_length_ms = 0
    ============= VoiceSeekerLight Memory Allocation =============
      VoiceSeekerLib allocated 80592 persistent bytes
      VoiceSeekerLib allocated 3840 scratch bytes
    ==================================== VoiceSeekerLight Memory Usage ==================================
      Total                 = 72400 bytes

    connected CAPTURER -> GAIN_0
    connected XA_GAIN_0 -> XA_VOICE_SEEKER_0
    connected XA_VOICE_SEEKER_0 -> XA_VIT_PRE_PROC_0
    connected XA_VIT_PRE_PROC_0 -> XA_RENDERER_0

Known Issues

The release SRAM target has limited features because of memory limitations. To enable or disable a component, set the corresponding preprocessor define (e.g. XA_VIT_PRE_PROC) in the project settings to 1 or 0, respectively. Debug and flash targets have full functionality enabled.