7.6 Microphone Management Section
7.6.1 Overview
By default, the robot accepts interaction input from two kinds of microphone: external microphones (the depusheng handheld microphone and the flasher lapel microphone) and internal microphones (built into the robot). A silent mode is also available to disable interaction. Both the microphone source and silent mode can be set in the AimMaster software, and RPC interfaces are provided for the same two settings.
We also provide the microphone's audio output (after noise reduction, echo cancellation, and VAD). Once Agibot's own interaction chain is disabled, this audio can be fed into other interaction systems.
Internal microphone interaction logic (sound, face, lip shape, distance):
- The robot's internal fan is relatively loud, so users should speak the wake word and converse as loudly as is practical.
- Tracking focuses on the three largest faces seen by the center camera, and the tracked face switches only when face size changes significantly.
- During conversations, lip movement is used to determine whether the conversation is still ongoing, which helps reject interference and separate different speakers.
- The recommended distance is 0.5 m to 2 m in front of the robot. Users who are very tall (over 180 cm) and stand too close may have their faces outside the camera's field of view. The most important thing is to keep the face within the camera's field of view.
External microphones are directional and can be used by speaking directly into the microphone. There is no face recognition logic involved.
The interaction system also supports secondary development. The agent can be set to different modes, which lets users leave Agibot's cloud audio chain and receive only the raw audio and face data for developing a custom interaction agent.
Silent mode is a state within normal mode and can be toggled without restarting the agent.
7.6.2 RPC Interface for Switching Internal and External Microphones
| Interface Name | pb:/aimdk.protocol.AgentControlService/SetMicSourceRequest |
|---|---|
| Function Summary | Switch between internal and external microphone sources |
| Interface Type | HTTP JSON RPC |
| URL | http://192.168.100.110:59301/rpc/aimdk.protocol.AgentControlService/SetMicSourceRequest |
| Input Parameters | |
| Output Parameters | |
| Example Script | examples/agent/SetMicSource.sh |
| Notes | |
7.6.3 RPC Interface for Getting the Current Microphone Source
| Interface Name | pb:/aimdk.protocol.AgentControlService/GetMicSourceRequest |
|---|---|
| Function Summary | Get the current microphone source |
| Interface Type | HTTP JSON RPC |
| URL | http://192.168.100.110:59301/rpc/aimdk.protocol.AgentControlService/GetMicSourceRequest |
| Input Parameters | |
| Output Parameters | |
| Example Script | examples/agent/GetMicSource.sh |
| Notes | |
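The HTTP JSON RPC interfaces in this section share the same host, port, and path prefix, so one small helper can address all of them. The following is a minimal sketch, not the shipped example script: `build_rpc_request` and `call_rpc` are hypothetical helper names, and the use of POST with a `Content-Type: application/json` header is an assumption about what the endpoint expects. Because GetMicSourceRequest lists no input parameters, the sketch sends an empty JSON object; the reply fields are not documented here, so the result is returned as a plain dictionary.

```python
import json
import urllib.request

# Host, port, and path prefix are taken from the URL rows in this section.
BASE_URL = "http://192.168.100.110:59301/rpc"
SERVICE = "aimdk.protocol.AgentControlService"


def build_rpc_request(method, payload=None):
    """Build an HTTP request for one AgentControlService method."""
    body = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{SERVICE}/{method}",
        data=body,
        # Assumption: the endpoint accepts a JSON body sent with this header.
        headers={"Content-Type": "application/json"},
        method="POST",  # Assumption: requests are sent as POST.
    )


def call_rpc(method, payload=None, timeout=3.0):
    """Send the request and decode the JSON reply (needs a reachable robot)."""
    request = build_rpc_request(method, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage on a network that can reach the robot:
# print(call_rpc("GetMicSourceRequest"))
```

Splitting request construction from sending keeps the URL and body logic checkable without a robot on the network.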
7.6.4 RPC Interface for Setting Silent Mode
| Interface Name | pb:/aimdk.protocol.AgentControlService/SetVoiceEnable |
|---|---|
| Function Summary | Set silent mode |
| Interface Type | HTTP JSON RPC |
| URL | http://192.168.100.110:59301/rpc/aimdk.protocol.AgentControlService/SetVoiceEnable |
| Input Parameters | |
| Output Parameters | |
| Example Script | examples/agent/SetVoiceEnable.sh |
| Notes | |
7.6.5 Query Silent Mode RPC Interface
| Interface Name | pb:/aimdk.protocol.AgentControlService/GetVoiceEnable |
|---|---|
| Function Summary | Query the silent mode status |
| Interface Type | HTTP JSON RPC |
| URL | http://192.168.100.110:59301/rpc/aimdk.protocol.AgentControlService/GetVoiceEnable |
| Input Parameters | |
| Output Parameters | |
| Example Script | examples/agent/GetVoiceEnable.sh |
| Notes | |
7.6.6 Set Interaction Mode RPC Interface
| Interface Name | pb:/aimdk.protocol.AgentControlService/SetAgentPropertiesRequest |
|---|---|
| Function Summary | Set interaction mode |
| Interface Type | HTTP JSON RPC |
| URL | http://192.168.100.110:59301/rpc/aimdk.protocol.AgentControlService/SetAgentPropertiesRequest |
| Input Parameters | Modes: |
| Output Parameters | |
| Example Script | examples/agent/SetAgentPropertiesRequest.sh |
| Notes | |
7.6.7 Get Interaction Mode RPC Interface
| Interface Name | pb:/aimdk.protocol.AgentControlService/GetAgentPropertiesRequest |
|---|---|
| Function Summary | Query the interaction mode |
| Interface Type | HTTP JSON RPC |
| URL | http://192.168.100.110:59301/rpc/aimdk.protocol.AgentControlService/GetAgentPropertiesRequest |
| Input Parameters | |
| Output Parameters | Modes: |
| Example Script | examples/agent/GetAgentPropertiesRequest.sh |
| Notes | |
7.6.8 Noise-Reduced Microphone Audio Topic Interface
| Interface Name | /agent/process_audio_output |
|---|---|
| Function Summary | Noise-reduced microphone audio interface |
| Interface Type | ROS2 Topic |
| Output Parameters | |
| Example Script | examples/agent/get_voice.py |
| Notes | |
7.6.9 Local Face Registration Interface
This interface is not a conventional HTTP JSON RPC or ROS2 Topic. Instead, it is provided as a separate script, examples/agent/run_face_id_register.sh. The content of the script is as follows:
#!/bin/bash

# 1. The 'images' directory to register (at the same level as the shell script)
RUN_SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
IMAGES_DIR="${RUN_SCRIPT_DIR}/images"

# 2. Faceid base directory
FACEID_SCRIPT_DIR="/agibot/software/v0/scripts/agent/face_id/"
FACEID_LIB_DIR="/agibot/software/v0/bin"
FACEID_OFFLINE_FEAT="/agibot/data/param/interaction/face_id/offline_face_features"

# 3. Relative paths to the executable and configuration files
EXEC="${FACEID_SCRIPT_DIR}/face_id_register"
CONF="${FACEID_SCRIPT_DIR}/face_id_config.json"

chmod +x "$EXEC"
export LD_LIBRARY_PATH="${FACEID_LIB_DIR}":$LD_LIBRARY_PATH

# 4. Invocation (clears the existing local face features, then registers)
rm -rf "$FACEID_OFFLINE_FEAT"/*
"$EXEC" "$CONF" "$IMAGES_DIR"

Place the face images to be registered in the images directory next to the script, then run the script on ORIN to complete the registration. The mapping between IDs and images, together with each registration result, is written to Result.txt in the same directory. An example is shown below. Blurry and small faces are registered successfully in this example, but clear frontal face images such as 满足.png are still recommended to avoid lowering the recognition rate:

GID17648293009168001 满足.png OK 注册成功
GID17648293018063607 侧脸.png FAIL 人脸质量不满足要求
GID17648293020281011 过暗.png FAIL 人脸质量不满足要求
GID17648293021878934 模糊.png OK 注册成功
GID17648293024764703 过曝.png FAIL 人脸质量不满足要求
GID17648293026684491 无人脸.png FAIL 未检测到人脸
GID17648293028768487 非人脸.png FAIL 未检测到人脸
GID17648293030305970 人脸过小.png OK 注册成功

The file names describe each test image: 满足 (satisfactory), 侧脸 (side face), 过暗 (too dark), 模糊 (blurry), 过曝 (overexposed), 无人脸 (no face), 非人脸 (not a face), 人脸过小 (face too small). The result messages mean: 注册成功 (registration successful), 人脸质量不满足要求 (face quality does not meet requirements), 未检测到人脸 (no face detected).

Explanation of the face registration and recognition logic rules:
- Place JPG, PNG, or JPEG face images in the images directory, with each image containing exactly one clear frontal face. Running the script registers the faces. After registration, restart the agent: on ORIN, run aima em stop-app agent && aima em start-app agent, or simply restart the robot.
- Locally registered face features are stored in the /agibot/data/param/interaction/face_id/offline_face_features directory on ORIN.
- A registered user ID is constructed as "GID" + timestamp + a random 4-digit number. At the time of release, the current machine's SN (/agibot/data/info/sn) replaces "GID" to form the new UID.
- Faces can also be uploaded through the Lingxin platform; this is called the cloud-based face database. It can be configured with greeting information and similar settings. Once the content is distributed, it is stored in the /agibot/data/param/interaction/face_id/user_info.json file.
- Each run of the script clears the existing local database, so always re-register the complete set of faces: maintain an images folder containing every face that needs to be recognized. Any addition, deletion, or modification requires re-running the registration script.
- Matching always checks the cloud database before the local database, and stops at the first successful match.
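Because every run rebuilds the local database, it can be useful to check Result.txt after each registration and list the images that were rejected. The sketch below is an illustration under stated assumptions, not part of the SDK: `parse_result_line` and `split_uid` are hypothetical helpers, the whitespace-separated column layout is inferred from the sample output above, and the split of the ID into a 13-digit millisecond timestamp plus 4 random digits is inferred from the sample IDs rather than documented.

```python
import re
from dataclasses import dataclass

# Assumed line layout, inferred from the sample: <ID> <image name> <OK|FAIL> <message>
LINE_RE = re.compile(r"^(\S+)\s+(.+?)\s+(OK|FAIL)\s+(.*)$")
# Assumed ID layout: "GID" + 13-digit millisecond timestamp + 4 random digits.
UID_RE = re.compile(r"^GID(\d{13})(\d{4})$")


@dataclass
class RegistrationResult:
    uid: str
    image: str
    ok: bool
    message: str


def parse_result_line(line):
    """Parse one Result.txt line; return None if it does not fit the assumed layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    uid, image, status, message = match.groups()
    return RegistrationResult(uid, image, status == "OK", message)


def split_uid(uid):
    """Split a local ID into (timestamp_ms, random_suffix)."""
    match = UID_RE.match(uid)
    if match is None:
        raise ValueError(f"unexpected ID format: {uid!r}")
    return int(match.group(1)), match.group(2)


sample = [
    "GID17648293009168001 满足.png OK 注册成功",
    "GID17648293018063607 侧脸.png FAIL 人脸质量不满足要求",
]
results = [parse_result_line(line) for line in sample]
failed = [r.image for r in results if r is not None and not r.ok]
print(failed)                      # ['侧脸.png']
print(split_uid(results[0].uid))   # (1764829300916, '8001')
```

Rejected images can then be replaced with clearer frontal shots before the script is run again on the full images folder.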
7.6.10 Face Recognition Result Topic Interface
| Interface Name | /agent/vision/face_id |
|---|---|
| Function Summary | Face recognition results |
| Interface Type | ROS2 Topic |
| Output Parameters | |
| Example Script | examples/agent/get_face_id.py |
| Notes | |
7.6.11 Wake-Up Result Reporting
| Interface Name | /agent/wakeup/pb_3Aaimdk_2Eprotocol_2EWakeUpResult |
|---|---|
| Function Summary | Wake-up result reporting |
| Interface Type | ROS2 Topic |
| Output Parameters | |
| Example Script | examples/agent/get_wakeup_result.py |
| Notes | |
7.6.12 Built-In Microphone Wake Word Configuration
| Interface Name | pb:/aimdk.protocol.AgentControlService/SetCustomWakeUpWord |
|---|---|
| Function Summary | Set the wake-up word for the built-in microphone |
| Interface Type | HTTP JSON RPC |
| URL | http://192.168.100.110:59301/rpc/aimdk.protocol.AgentControlService/SetCustomWakeUpWord |
| Input Parameters | |
| Output Parameters | |
| Example Script | examples/agent/SetCustomWakeUpWord.sh |
| Notes | |
7.6.13 External Microphone Wake Word Configuration
For the external microphone, modify /agibot/data/var/agent/omnis_sdk/sherpa-onnx-kws/keywords.txt on ORIN. Add or remove phonetic entries as needed.
Wake-word format:
initial1 (space) final1 (with tone mark) (space) initial2 (space) final2 (with tone mark) ......
Wake words containing non-Chinese characters are not supported. English wake words must be converted into Chinese transliterations. Examples:
* Chinese wake word 1: x iǎo zh ì x iǎo zh ì @小智小智_zh_1
* Chinese wake word 2 (with ü): x iǎo l ǚ x iǎo l ǚ @小吕小吕_zh_1
* Chinese wake word 3 (polyphonic character): x iǎo x ī x iǎo x ī @小茜小茜_zh_1
* English wake word: h ā l óu t āng m ǔ @哈喽汤姆_en_1
Wake-word recommendations and constraints:
- Length: 3-6 Chinese characters are recommended.
- Repetition pattern: ABAB-style repetition is recommended to improve wake-up success rate.
- Pronunciation: Prefer open vowels such as a, o, and e.
- Avoid common words: Avoid common words or command words (for example, “goodbye”, “good morning”, “watch TV”) to reduce false wake-ups.
- Avoid repeated/similar sounds: Avoid duplicated characters and consecutive similar pronunciations, such as “珍珍” or “花华”.
- Avoid modal particles: Avoid light-tone particles such as ‘吧’, ‘呢’, ‘啊’, ‘的’, ‘了’, ‘吗’.
- Avoid zero-initial syllables: Avoid characters such as ‘昂’, ‘恩’, ‘安’.
- Tone diversity: Avoid using the same tone for all characters, such as ‘喀咪喀咪’.
- Limit closed vowels: Reduce use of i, u, and ü.
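Some of the recommendations above can be checked mechanically before a candidate wake word is tried on the robot. The sketch below is a rough illustration only: `lint_wake_word` is a hypothetical helper, it covers just the rules that can be tested on the characters themselves (Chinese-only, length, modal particles, the three zero-initial characters named above, and doubled characters), and it does not attempt the tone and vowel rules, which would need a pinyin conversion step.

```python
MODAL_PARTICLES = set("吧呢啊的了吗")
# Only the three example characters named in the list; not a complete set.
ZERO_INITIAL_EXAMPLES = set("昂恩安")


def is_cjk(ch):
    """True for characters in the CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def lint_wake_word(word):
    """Return a list of warnings; an empty list means no checked rule was tripped."""
    warnings = []
    if not all(is_cjk(ch) for ch in word):
        warnings.append("contains non-Chinese characters")
    if not 3 <= len(word) <= 6:
        warnings.append("length outside the recommended 3-6 characters")
    if any(ch in MODAL_PARTICLES for ch in word):
        warnings.append("contains a modal particle")
    if any(ch in ZERO_INITIAL_EXAMPLES for ch in word):
        warnings.append("contains a zero-initial character")
    # Adjacent identical characters (AA); ABAB repetition is not flagged.
    if any(a == b for a, b in zip(word, word[1:])):
        warnings.append("contains a doubled character")
    return warnings


print(lint_wake_word("小智小智"))  # []
print(lint_wake_word("珍珍"))
```

An empty result only means none of these character-level checks fired; wake-up performance still needs to be verified on the device.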
Valid syllable list (using invalid syllables may crash the process):
| a | án | áo | é | en | er | ì | iàn | iǎo |
| á | àn | ào | è | én | ér | ǐ | iǎn | iāo |
| à | ǎn | ǎo | ě | èn | èr | ī | iān | ié |
| ǎ | ān | āo | ē | ěn | ěr | ia | iáng | iè |
| ā | ang | b | ei | ēn | f | iá | iàng | iě |
| ái | áng | c | éi | éng | g | ià | iǎng | iē |
| ài | àng | ch | èi | èng | h | iǎ | iāng | ín |
| ǎi | ǎng | d | ěi | ěng | i | iā | iáo | ìn |
| āi | āng | e | ēi | ēng | í | ián | iào | ǐn |
| īn | iù | o | ou | sh | ū | uán | uè | ún |
| íng | iǔ | ó | óu | t | uá | uàn | uě | ùn |
| ìng | iū | ò | òu | u | uà | uǎn | uē | ǔn |
| ǐng | j | ǒ | ǒu | ú | uǎ | uān | üè | ūn |
| īng | k | ō | ōu | ù | uā | uáng | üě | uo |
| ióng | l | óng | p | ǔ | uái | uàng | uí | uó |
| iǒng | m | òng | q | ǘ | uài | uǎng | uì | uò |
| iōng | n | ǒng | r | ǜ | uǎi | uāng | uǐ | uǒ |
| iú | ń | ōng | s | ǚ | uāi | ué | uī | uō |
| w | x | y | z | zh |
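Since an invalid syllable may crash the process, it is worth checking each keywords.txt entry against the table before copying it to ORIN. The sketch below shows one way to do that. `check_keyword_entry` is a hypothetical helper, `VALID_TOKENS` holds only a small subset of the table (enough for the sample entries) and should be extended with the full table before real use, and the entry layout (space-separated tokens followed by an @label) is taken from the examples in this section.

```python
import unicodedata

# Subset of the valid syllable table, just enough for the sample entries;
# extend it with every cell of the table before real use.
VALID_TOKENS = {
    "x", "zh", "l", "h", "t", "m",
    "iǎo", "ì", "ǚ", "ī", "ā", "óu", "āng", "ǔ",
}


def check_keyword_entry(line, valid_tokens=VALID_TOKENS):
    """Return the tokens of one keywords.txt entry that are not in valid_tokens."""
    # Normalize so precomposed and combining tone marks compare equal.
    line = unicodedata.normalize("NFC", line.strip())
    phonetic, sep, label = line.partition("@")
    if not sep or not label:
        raise ValueError("entry has no '@label' part")
    valid = {unicodedata.normalize("NFC", tok) for tok in valid_tokens}
    return [tok for tok in phonetic.split() if tok not in valid]


print(check_keyword_entry("x iǎo zh ì x iǎo zh ì @小智小智_zh_1"))  # []
# A fused token such as "mǔ" (missing the space between initial and final) is reported.
print(check_keyword_entry("h ā l óu t āng mǔ @哈喽汤姆_en_1"))      # ['mǔ']
```

Running a check like this on every line of the edited file catches missing spaces and mistyped tone marks before the keyword spotter loads them.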