Audio

1. Introduction

The RV1106 SoC has a built-in audio IP block to which external analog microphones can be connected. An analog-to-digital converter captures the microphone signal and passes it to the CPU for processing. In the other direction, the CPU outputs PCM (Pulse-Code Modulation) digital audio, which is converted to an analog signal for playback. For details, refer to the audio section of the relevant datasheet. Because the Luckfox Pico Pro/Max does not expose the audio pins, this tutorial applies only to the Luckfox Pico Ultra and Ultra W.

Model                        Supported System
Luckfox Pico Ultra/Ultra W   Buildroot
  • Note: If you flash the Ubuntu 22.04 image instead, you can install the ffmpeg tool to record and play audio.

2. Sound Configuration

  1. View the sound card devices

    # arecord -l
    **** List of CAPTURE Hardware Devices ****
    card 0: rvacodec [rv-acodec], device 0: ffae0000.i2s-rv1106-hifi ff480000.acodec-0 [ffae0000.i2s-rv1106-hifi ff480000.acodec-0]
    Subdevices: 0/1
    Subdevice #0: subdevice #0

    # aplay -l
    **** List of PLAYBACK Hardware Devices ****
    card 0: rvacodec [rv-acodec], device 0: ffae0000.i2s-rv1106-hifi ff480000.acodec-0 [ffae0000.i2s-rv1106-hifi ff480000.acodec-0]
    Subdevices: 1/1
    Subdevice #0: subdevice #0
    • card 0: The audio processing IP block built into the SoC.
  2. Sound card device nodes

    # ls /dev/snd/
    by-path controlC0 pcmC0D0c pcmC0D0p
    • controlC0: Control device for the sound card. "C0" denotes card 0, the SoC's built-in audio codec.
    • pcmC0D0c: PCM device used for recording. The trailing "c" stands for capture.
    • pcmC0D0p: PCM device used for playback. The trailing "p" stands for playback.
    • by-path: Contains symbolic links that map hardware paths to device nodes.
    # ls -l /dev/snd/by-path/
    total 0
    lrwxrwxrwx 1 root root 12 Jan 1 2021 platform-acodec-sound -> ../controlC0
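
The naming scheme described above (card number after "C", device number after "D", and a trailing "c" or "p") can be decoded mechanically. The following is a minimal sketch in POSIX shell. The function name describe_snd_node is hypothetical and not part of ALSA or the SDK; it only parses the name and does not touch the device.

```shell
# describe_snd_node NAME: decode a /dev/snd node name such as pcmC0D0c.
# Hypothetical helper for illustration; it does string parsing only.
describe_snd_node() (
    name=${1##*/}                       # drop any leading directory
    case $name in
        controlC[0-9]*)
            echo "card ${name#controlC}: control"
            ;;
        pcmC[0-9]*D[0-9]*[cp])
            rest=${name#pcmC}           # e.g. 0D0c
            card=${rest%%D*}            # digits before "D"
            rest=${rest#*D}             # e.g. 0c
            dev=${rest%[cp]}            # digits before the suffix
            case $rest in
                *c) dir=capture ;;
                *p) dir=playback ;;
            esac
            echo "card $card, device $dev: $dir"
            ;;
        *)
            echo "unrecognized: $name"
            return 1
            ;;
    esac
)
```

For example, describe_snd_node /dev/snd/pcmC0D0c prints "card 0, device 0: capture", which corresponds to the hw:0,0 device used for recording later in this tutorial.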

3. Sound Card Configuration

You can use amixer or rk_mpi_amix_test --list_contents to list all available audio controls on the sound card. The source code for rk_mpi_amix_test is located in the SDK directory //media/rockit/rockit/mpi/example/mod/.

  1. View the current state of all built-in codec controls

    # rk_mpi_amix_test --list_contents
    cmd parse result:
    sound control id : 0
    control name : (null)
    control value : (null)
    list controls : 0
    list contents : 1
    Number of controls: 25
    ctl type num name value
    0 ENUM 1 I2STDM Digital Loopback Mode , DisabledMode1Mode2Mode2 Swap
    1 INT 1 ADC MIC Left Gain 2 (range 0->3)
    2 INT 1 ADC MIC Right Gain 2 (range 0->3)
    3 INT 1 ADC ALC Left Volume 6 (range 0->31)
    4 INT 1 ADC ALC Right Volume 6 (range 0->31)
    5 INT 1 ADC Digital Left Volume 195 (range 0->255)
    6 INT 1 ADC Digital Right Volume 195 (range 0->255)
    7 ENUM 1 ADC HPF Cut-off , OffOn
    8 INT 1 ALC AGC Left Volume 12 (range 0->31)
    9 INT 1 ALC AGC Right Volume 12 (range 0->31)
    10 INT 1 ALC AGC Left Max Volume 7 (range 0->7)
    11 INT 1 ALC AGC Right Max Volume 7 (range 0->7)
    12 INT 1 ALC AGC Left Min Volume 0 (range 0->7)
    13 INT 1 ALC AGC Right Min Volume 0 (range 0->7)
    14 ENUM 1 ALC AGC Left Switch , OffOn
    15 ENUM 1 ALC AGC Right Switch , OffOn
    16 ENUM 1 AGC Left Approximate Sample Rate , 96KHz48KHz44.1KHz32KHz24KHz16KHz12KHz8KHz
    17 ENUM 1 AGC Right Approximate Sample Rate , 96KHz48KHz44.1KHz32KHz24KHz16KHz12KHz8KHz
    18 ENUM 1 ADC Mode , DiffadcLSingadcLDiffadcRSingadcRSingadcLRDiffadcLR
    19 ENUM 1 ADC MICBIAS Voltage VREFx0_8VREFx0_825VREFx0_85VREFx0_875, VREFx0_9VREFx0_925VREFx0_95VREFx0_975
    20 ENUM 1 ADC Main MICBIAS Off, On
    21 ENUM 1 ADC MIC Left Switch , WorkMute
    22 ENUM 1 ADC MIC Right Switch , WorkMute
    23 INT 1 DAC LINEOUT Volume 20 (range 0->30)
    24 INT 1 DAC HPMIX Volume 2 (range 0->2)
    • I2STDM Digital Loopback Mode: Indicates whether the I2STDM controller operates in loopback mode.
      • Disabled: Default state, loopback mode is not enabled.
      • Mode1: Suitable for 4-channel use cases. Channels 1-2 are for microphone audio input, and channels 3-4 are for loopback data.
      • Mode2: Suitable for 2-channel use cases. The left channel is for microphone audio input, and the right channel is for loopback data from the right channel of playback.
      • Mode2 Swap: Suitable for 2-channel use cases. The left channel is for loopback data from the left channel of playback, and the right channel is for microphone audio input. The channel order is opposite to Mode2.
    • ACodec ADC Boost Gain: Analog gain of the codec. The valid value range is 1 to 3.
      • vol 0: Disabled and not recommended
      • vol 1: 0dB
      • vol 2: 20dB
      • vol 3: 12dB
    • ACodec ADC ALC PGA Gain: Analog gain of the codec. The value range is 0 to 31.
      • min: -9dB (vol: 0)
      • max: +37.5dB (vol: 31)
      • step: +1.5dB
      • 0dB point: vol 6
    • ACodec ADC Digital Gain: Digital gain of the codec. The value range is 0 to 255.
      • min: -97.5dB (vol: 0)
      • max: +30dB (vol: 255)
      • step: +0.5dB
      • 0dB point: vol 195
    • ADC Mode: Selects differential ("Diff") or single-ended ("Sing") input and which ADC channels are enabled. The default is differential mode. To save as much power as possible, only the left ADC channel is enabled, so "DiffadcL" is the default setting.
    • ACodec DAC Lineout Gain: The value range is 0 to 30.
      • min: -39dB (vol: 0)
      • max: +6dB (vol: 30)
      • step: +1.5dB
      • 0dB point: vol 26
    • ACodec DAC HPMIX Gain: The pre-stage gain ahead of Lineout. It usually does not need to be changed. The usable values are 1 and 2.
      • vol 0: Disabled and not recommended
      • vol 1: 0dB
      • vol 2: 6dB
    • AGC Left/Right Approximate Sample Rate: Selects one of the sample rates listed above. Common sample rates for PCM (Pulse-Code Modulation) audio include, but are not limited to, the following:
      • 8 kHz: Used in telephone systems and narrowband voice communication. Sound quality is limited.
      • 16 kHz: Common in applications such as voice recognition and voicemail, with better sound quality than 8 kHz.
      • 32 kHz: Used in early audio storage and playback devices, also common in some low-quality audio recording devices.
      • 44.1 kHz: CD-quality standard sampling rate, also the standard sampling rate for many digital audio files, suitable for music storage and playback.
      • 48 kHz: Widely used in the digital audio field, including music recording, movie production, etc., also the standard sampling rate for many professional and consumer digital audio devices.
      • 96 kHz: High-fidelity audio recording and playback, used in professional music production and audiophile-grade audio equipment.
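
The min/step figures above define a linear mapping between a control value and its gain in dB: dB = min + step × vol. The sketch below shows that arithmetic in POSIX shell with awk. The function names are hypothetical, and the pairing of each gain with a control name from the listing (for example, ALC PGA Gain with "ADC ALC Left/Right Volume", range 0->31) is an assumption based on matching ranges and default values, not something the tool output states.

```shell
# vol_to_db MIN_DB STEP_DB VOL: gain in dB for a control value.
vol_to_db() {
    awk -v min="$1" -v step="$2" -v vol="$3" \
        'BEGIN { printf "%.1f\n", min + step * vol }'
}

# db_to_vol MIN_DB STEP_DB DB: nearest control value for a dB target
# (assumes DB is not below MIN_DB; no range clamping is done).
db_to_vol() {
    awk -v min="$1" -v step="$2" -v db="$3" \
        'BEGIN { printf "%d\n", int((db - min) / step + 0.5) }'
}

# Hypothetical wrappers using the min/step values listed above.
alc_pga_db() { vol_to_db -9    1.5 "$1"; }   # assumed: ADC ALC Left/Right Volume
adc_dig_db() { vol_to_db -97.5 0.5 "$1"; }   # assumed: ADC Digital Left/Right Volume
lineout_db() { vol_to_db -39   1.5 "$1"; }   # assumed: DAC LINEOUT Volume
```

With these definitions, lineout_db 26 prints 0.0 and adc_dig_db 195 prints 0.0, matching the 0dB points listed above, and db_to_vol -39 1.5 -16.5 prints 15.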

4. Recording

  1. Record raw PCM at 16 kHz, 2 channels, 16-bit. The rk_mpi_ai_test command uses a 16-bit sample depth by default.

    rk_mpi_ai_test --sound_card_name=hw:0,0 --device_rate=16000 --device_ch=2 --out_rate=16000 --out_ch=2 --output=/tmp
    • --sound_card_name=hw:0,0: Specifies the name of the audio input device as "hw:0,0", which is usually the device name in ALSA (Advanced Linux Sound Architecture), representing the first device of the first sound card.
    • --device_rate=16000: Specifies the sampling rate of the audio input device as 16000 Hz.
    • --device_ch=2: Specifies the number of channels of the audio input device as 2 (stereo).
    • --out_rate=16000: Specifies the output audio sampling rate as 16000 Hz.
    • --out_ch=2: Specifies the number of output audio channels as 2 (stereo).
    • --output=/tmp: Specifies the file path for output audio data as /tmp.
  2. Record in WAV format at 16 kHz, 2 channels, 16-bit.

    arecord -f S16_LE -c 2 -r 16000 -D hw:0 -d 30 test.wav
    • -f S16_LE: Specifies the PCM format as 16-bit Little Endian.
    • -c 2: Specifies the number of channels as 2 (stereo).
    • -r 16000: Specifies the sampling rate as 16000 Hz.
    • -D hw:0: Specifies the audio device as hw:0.
    • -d 30: Specifies the recording time as 30 seconds.
    • test.wav: Specifies the output file name as test.wav.
  3. Record at CD quality. The -f cd option is shorthand for 16-bit little-endian samples at 44100 Hz (44.1 kHz) in stereo, which is the CD audio standard, so no separate -r or -c option is needed. Note that if neither a format nor a rate is given, arecord falls back to its own defaults of 8000 Hz, 8-bit unsigned, mono.

    arecord -f cd -Dhw:0 -d 30 test.wav
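
Before recording, it can help to estimate how much space the audio will need. Uncompressed PCM grows at rate × channels × bytes per sample per second. The sketch below does that calculation in POSIX shell. The function name pcm_bytes is hypothetical, and the result covers the sample data only; a WAV file adds a small header on top (typically 44 bytes).

```shell
# pcm_bytes RATE CHANNELS BITS SECONDS: size of the raw sample data in bytes.
# Hypothetical helper; BITS is assumed to be a multiple of 8.
pcm_bytes() {
    echo $(( $1 * $2 * ($3 / 8) * $4 ))
}
```

For the 30-second recordings above, pcm_bytes 16000 2 16 30 prints 1920000 and pcm_bytes 44100 2 16 30 prints 5292000.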

5. Playback

  1. Play audio in PCM format.

    rk_mpi_ao_test -i /root/2.pcm --sound_card_name=hw:0,0 --device_ch=2 --device_rate=16000 --input_rate=16000 --input_ch=2
  2. Play audio in WAV format.

    aplay -Dhw:0 test.wav
  3. Play audio in MP3 format.

    madplay anheqiao.mp3 
  4. Audio comes in many formats. You can use the ffmpeg tool to convert other formats to WAV.

    ffmpeg -i anheqiao.mp3 -f wav -acodec pcm_s16le -ar 44100 -ac 2 output.wav
    • -f wav: Specifies the output file format as WAV.
    • -acodec pcm_s16le: Specifies the audio codec as pcm_s16le, that is, 16-bit signed little-endian PCM.
    • -ar 44100: Specifies the sampling rate of the output audio as 44100 Hz.
    • -ac 2: Specifies the number of output audio channels as 2 (stereo).
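
The four cases above amount to choosing a tool by file type. As a rough sketch, the selection can be written as a small POSIX shell function that prints the matching command from this section without running it. The name player_cmd is hypothetical, and the raw PCM case assumes the file really is 16 kHz stereo, as in the example above.

```shell
# player_cmd FILE: print (do not run) a command for FILE, chosen by extension.
# Hypothetical helper; the command lines are the ones used in this section.
player_cmd() {
    case $1 in
        *.pcm) echo "rk_mpi_ao_test -i $1 --sound_card_name=hw:0,0 --device_ch=2 --device_rate=16000 --input_rate=16000 --input_ch=2" ;;
        *.wav) echo "aplay -Dhw:0 $1" ;;
        *.mp3) echo "madplay $1" ;;
        # Anything else: convert to WAV first, then play output.wav with aplay.
        *)     echo "ffmpeg -i $1 -f wav -acodec pcm_s16le -ar 44100 -ac 2 output.wav" ;;
    esac
}
```

Printing the command rather than executing it keeps the sketch safe to try on a machine without the sound card; on the board, the printed line can be reviewed and then run.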

6. Volume Adjustment

  1. Adjust speaker volume (recommended).

    # amixer cset name='DAC LINEOUT Volume' 15
    numid=24,iface=MIXER,name='DAC LINEOUT Volume'
    ; type=INTEGER,access=rw---R--,values=1,min=0,max=30,step=0
    : values=15
    | dBscale-min=-39.00dB,step=1.50dB,mute=1
    • Adjust according to your own needs, the range is 0-30.
  2. Adjust the volume of the audio file itself with SoX.

    # Check the largest volume factor that avoids clipping (the "Volume adjustment" line)
    # sox anheqiao.wav -n stat
    Samples read: 22058820
    Length (seconds): 250.100000
    Scaled by: 2147483647.0
    Maximum amplitude: 0.891235
    Minimum amplitude: -0.891266
    Midline amplitude: -0.000015
    Mean norm: 0.153542
    Mean amplitude: -0.000068
    RMS amplitude: 0.206771
    Maximum delta: 1.232666
    Minimum delta: 0.000000
    Mean delta: 0.060496
    RMS delta: 0.092891
    Rough frequency: 3153
    Volume adjustment: 1.122

    # Set the volume factor and generate another file
    sox -v 0.2 anheqiao.wav output.wav
    • SoX is a powerful audio processing tool, but some of its functions, such as MP3 encoding, may require specific compilation options and dependent libraries to support.
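
The "Volume adjustment" value reported by the stat effect is the largest factor that can be passed to -v without clipping (here 1.122, roughly 1 / 0.891266). The sketch below, in POSIX shell with awk, extracts that value and checks a chosen factor against it. Both function names are hypothetical. Note that sox writes the stat report to stderr, so it has to be redirected before it can be piped.

```shell
# stat_max_gain: read the output of "sox FILE -n stat" on stdin and
# print the value of the "Volume adjustment" line.
stat_max_gain() {
    awk -F': *' '/^Volume adjustment/ { print $2 }'
}

# gain_is_safe FACTOR LIMIT: succeed if FACTOR does not exceed LIMIT.
gain_is_safe() {
    awk -v f="$1" -v lim="$2" 'BEGIN { exit !(f + 0 <= lim + 0) }'
}
```

A possible way to combine them on the board:

    limit=$(sox anheqiao.wav -n stat 2>&1 | stat_max_gain)
    gain_is_safe 0.2 "$limit" && sox -v 0.2 anheqiao.wav output.wav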