Voice Support
The Robotics Edge platform supports voice processing capabilities to enable speech-based interaction within robotic applications. This functionality leverages imx-voice-plugins running with GStreamer for audio processing and ROS for communicating with other components.
Overview
Voice support on this platform includes:
Speech-to-Text transcription using imxasr from imx-voice-plugins (released separately).
Integration with ROS through the open-source ros-gst-bridge component, which is pre-installed on the platform.
This setup allows decoded text from voice input to be shared with other ROS nodes for further processing or decision-making.
Key Components
imx-voice-plugins: NXP proprietary GStreamer plugins for audio and voice processing, including speech transcription.
ros-gst-bridge: Open-source component, pre-installed on the Robotics Edge Platform, that facilitates communication between GStreamer pipelines and ROS nodes.
Access to Voice Plugins
The imx-voice-plugins are released separately from the Robotics Edge Platform. To obtain access to the evaluation package, please contact: voice@nxp.com
An installation guide and examples are provided within the evaluation package.
First simple example
Requires an i.MX 95 EVK or i.MX 8M Plus EVK and a microphone connected to the audio jack.
Download and install imx-voice-plugins (more detailed information is available in the README file of the package):
unzip imx-voice-plugins.zip
cp gst-plugin/libgstimx* /usr/lib/gstreamer-1.0/
cp models/moonshine/moonshine-base*.onnx /root/
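After copying the files, it can help to confirm that GStreamer actually sees the new element before building a pipeline. The following is a minimal sketch, not part of the evaluation package: it assumes the standard gst-inspect-1.0 tool is present on the target, and the helper name check_element is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: succeed only if GStreamer can find the named element.
# Uses the standard gst-inspect-1.0 tool; --exists performs a quiet check.
check_element() {
    gst-inspect-1.0 --exists "$1" >/dev/null 2>&1
}

# imxasr is the speech-to-text element provided by imx-voice-plugins.
if check_element imxasr; then
    echo "found:   imxasr"
else
    echo "missing: imxasr (check the copy step and the package README)"
fi
```

If the element is reported missing, the README in the package remains the reference for the installation steps.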
Initialize ROS environment:
. /opt/ros/jazzy/setup.sh
Start the basic GStreamer speech-to-text pipeline:
For i.MX 95 EVK:
gst-launch-1.0 -q --no-position alsasrc device=hw:wm8962audio,0 ! audioconvert ! queue ! imxasr silent=true onnx-nb-threads=5 ! rostextsink &
For i.MX 8M Plus EVK:
gst-launch-1.0 -q --no-position alsasrc device=hw:wm8960audio,0 ! audioconvert ! queue ! imxasr silent=true onnx-nb-threads=3 ! rostextsink &
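The two commands differ only in the ALSA device name and the onnx-nb-threads value, so a small wrapper could select both from a board name. This is a sketch under stated assumptions: the function name asr_pipeline and the board keys imx95 and imx8mp are hypothetical, while the device names and thread counts are copied from the commands above.

```sh
#!/bin/sh
# Hypothetical helper: print the speech-to-text pipeline for a given board.
# Board keys (imx95, imx8mp) are made up for this sketch; device names and
# thread counts are taken from the two documented commands.
asr_pipeline() {
    case "$1" in
        imx95)  device="hw:wm8962audio,0"; threads=5 ;;
        imx8mp) device="hw:wm8960audio,0"; threads=3 ;;
        *)      echo "unknown board: $1" >&2; return 1 ;;
    esac
    echo "alsasrc device=$device ! audioconvert ! queue ! imxasr silent=true onnx-nb-threads=$threads ! rostextsink"
}

# Print the pipeline for the i.MX 95 EVK; nothing is launched here.
asr_pipeline imx95
```

To launch the result, the printed pipeline can be passed to gst-launch-1.0, for example: gst-launch-1.0 -q --no-position $(asr_pipeline imx95) &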
This command can be easily adapted to use any audio input source. For example, use alsasrc device=hw:micfilaudio,0 to use the on-board digital microphones on the i.MX 95 EVK.
Use the ros2 utility to monitor the “gst_text_pub” topic:
ros2 topic echo "gst_text_pub"
Instead of using the generic ros2 utility, you can subscribe to this topic from your application to be notified of the decoded speech.
Wait a few seconds for the speech-to-text models to load. Then, when you speak into the microphone, the decoded text is displayed by the ros2 utility.
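As a rough illustration of consuming the decoded speech from a script rather than reading it by eye, the sketch below pipes ros2 topic echo output through a small filter that reacts to keywords. It assumes the topic carries std_msgs/msg/String messages, which ros2 topic echo prints as "data: <text>" lines separated by "---"; the message type is not stated in this guide and should be checked with ros2 topic info gst_text_pub. The function name handle_transcripts and the keywords are hypothetical, and a real application would more typically subscribe through a ROS client library.

```sh
#!/bin/sh
# Hypothetical filter: read `ros2 topic echo` output on stdin and react to
# keywords in the decoded text. Assumes lines of the form "data: <text>".
handle_transcripts() {
    while IFS= read -r line; do
        case "$line" in
            "data: "*) text=${line#data: } ;;   # keep only the message text
            *)         continue ;;              # skip "---" separators etc.
        esac
        # Lowercase so matching does not depend on capitalisation.
        lower=$(printf '%s' "$text" | tr '[:upper:]' '[:lower:]')
        case "$lower" in
            *stop*)    echo "command: stop" ;;
            *forward*) echo "command: forward" ;;
            *)         echo "heard: $text" ;;
        esac
    done
}

# Offline demonstration with canned input; no ROS installation is needed.
printf 'data: Please STOP now\n---\ndata: hello robot\n---\n' | handle_transcripts
```

On the target, the live equivalent would be: ros2 topic echo gst_text_pub | handle_transcripts. The filter does not handle any quoting that ros2 topic echo may add around the text.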
The transcript of the speech captured by the microphone is now available to ROS nodes, as the figure below shows:
