Analysis of an ARM Speech Recognition System

With the extensive use of high technology in the military field, weapons and equipment are becoming increasingly sophisticated. Traditional military training, with its long training cycles, high cost, and limited training space, often fails to achieve the expected results and no longer meets the needs of modern military training. Simulation training emerged to address these problems.

To further improve training effectiveness, this article uses an intelligent voice-interaction chip to design the teaching and playback system of a simulation trainer. The teaching system vividly demonstrates the standard operating procedure and the corresponding instrument states to the operator, which greatly shortens training time and improves training results. The playback system records each operator's voice commands, sound intensity, actions, timing, and instrument states during training; after a session is completed, the run is replayed so the operator can correct mistakes promptly. The teaching system can be viewed as a playback of the standard operating procedure. The system does not require virtual reality technology and can be implemented on a small embedded system.

1 System principle

The simulation trainer consists of a measurement and control computer and multiple slave devices, as shown in Figure 1. Only one slave device is described here. The hardware consists mainly of the measurement and control computer, an Arduino Mega2560 controller, a speech recognition unit, a sound intensity detection unit, a speech synthesis unit, a panel control unit, and an instrument panel. The panel control unit is relatively complex and contains a variety of control circuits. During simulation training, the slave device completes the whole training process under the control of the Arduino Mega2560 controller, and reproduces the recorded instrument states in the teaching and playback system. The detailed circuit design is not described here.

The speech recognition unit identifies the operator's voice commands; the sound intensity detection unit provides the basis for judging how forcefully the slave operator speaks; the Arduino Mega2560 controller monitors the state of each instrument-panel component and the operator's actions, and thereby records the operating process. The instrument states corresponding to each operation are not described individually here. During playback, the measurement and control computer reproduces the recorded operating process by commanding the corresponding slave device's Arduino Mega2560 controller according to the recorded data.

2 Unit design

2.1 Speech recognition unit design

Speech recognition technology has developed rapidly in recent years and can be classified, by recognition target, into speaker-dependent and speaker-independent recognition. Speaker-dependent recognition targets one specific person, while speaker-independent recognition targets most users; the latter usually requires collecting a large number of voice samples for recording and training before a high recognition rate can be reached.

The LD3320 speech recognition chip selected in this article is based on speaker-independent automatic speech recognition (SI-ASR) technology. The chip integrates high-precision A/D and D/A interfaces and needs no external FLASH or RAM; it can perform voice recognition, voice control, and human-machine dialogue, providing a true single-chip speech recognition solution. Moreover, the list of recognized keywords can be edited dynamically.

The speech recognition unit uses an ATmega168 as its MCU, which controls the LD3320 to perform all speech-recognition operations and uploads the recognition result to the Arduino Mega2560 controller through the serial port. All operations of the LD3320 are performed through register accesses, which support two modes: a standard parallel mode and a serial SPI mode. The parallel mode is used here, with the LD3320 data port connected to the MCU's I/O port.

The recognition process is interrupt-driven, and its flow consists of initialization, writing the keyword list, starting recognition, and responding to the recognition interrupt. The MCU program is written in the Arduino IDE and, after debugging, is flashed through the serial port; it controls the LD3320 to complete speech recognition and uploads the result to the Arduino Mega2560 controller.
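The four-stage flow above can be sketched as a small state machine. This is a minimal model, not driver code: the real LD3320 register addresses and interrupt wiring come from its datasheet, and here the chip's matcher is replaced by a simple string comparison so the control flow and the dynamically editable keyword list can be shown.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the interrupt-driven LD3320 recognition flow:
// initialize -> write keyword list -> start recognition -> handle interrupt.
enum class AsrState { Idle, Initialized, KeywordsLoaded, Listening };

class Ld3320Model {
public:
    void init() { state_ = AsrState::Initialized; }            // chip reset + register setup
    void writeKeywords(const std::vector<std::string>& kws) {  // list can be edited at any time
        keywords_ = kws;
        state_ = AsrState::KeywordsLoaded;
    }
    void startRecognition() { state_ = AsrState::Listening; }
    // Called from the (simulated) interrupt handler once a result is ready;
    // returns the keyword index, or -1 when nothing matched.
    int onInterrupt(const std::string& heard) {
        if (state_ != AsrState::Listening) return -1;
        for (size_t i = 0; i < keywords_.size(); ++i)
            if (keywords_[i] == heard) return static_cast<int>(i);
        return -1;
    }
private:
    AsrState state_ = AsrState::Idle;
    std::vector<std::string> keywords_;
};
```

On the real hardware the index returned from the interrupt handler is what the ATmega168 uploads to the Mega2560 over the serial port.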


2.2 Sound intensity detection unit design

In speech recognition, what must be distinguished is the slave operator's voice command. The sound intensity detection circuit was therefore designed so that it only needs to distinguish relative sound intensity rather than measure absolute sound level, so the required detection precision is low.

A capacitive MIC acoustic sensor converts the external acoustic signal into an electrical signal, which an NE5532 amplifier circuit turns from a weak audio input into a voltage signal of suitable amplitude. This voltage is converted by an AC/DC RMS conversion circuit, amplified again, and finally sampled by the A/D converter of the Arduino Mega2560 controller. D1 connects to the A/D input of the Arduino Mega2560 controller, and INT1 connects to its external interrupt 1. When the external sound signal exceeds a preset threshold, the transistor switches the INT1 pin from high to low, triggering the external interrupt; the controller responds to the interrupt and performs A/D sampling. The sampled data are mean-filtered, and the sound intensity data are uploaded when the measurement and control computer queries them.
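The mean filtering of the sampled data can be sketched as follows. The sample count per burst and the 10-bit ADC range of the Mega2560 are assumptions; the article does not specify them.

```cpp
#include <numeric>
#include <vector>

// Mean filter over one burst of ADC samples (10-bit, 0..1023 on the Mega2560).
// Averaging suppresses single-sample noise in the relative intensity reading.
int meanFilter(const std::vector<int>& samples) {
    if (samples.empty()) return 0;                               // no data yet
    long sum = std::accumulate(samples.begin(), samples.end(), 0L);
    return static_cast<int>(sum / static_cast<long>(samples.size()));
}
```

The filtered value is what the controller stores and later reports when the measurement and control computer polls it.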

2.3 Speech synthesis unit design

TTS (Text To Speech) technology is the trend in intelligent human-computer dialogue. A voice system based on TTS can retrieve and synthesize speech according to query conditions without any prior recording, which greatly reduces system maintenance. With this technology, the pronunciation of a voice chip can be controlled by an MCU or a PC.

This article uses the SYN6658 Chinese speech synthesis chip. The SYN6658 receives the text to be synthesized through its UART or SPI interface and completes the text-to-speech (TTS) conversion. Here the controller and the SYN6658 are connected through the UART interface: the controller sends control commands and text to the chip over the serial link, the chip synthesizes the received text into a speech signal, and the output is amplified by an LM386 power amplifier and fed to a speaker.

The SYN6658 speech synthesis circuit follows the typical application circuit in the chip's hardware datasheet and is not described here. The power amplification stage uses the LM386 audio amplifier produced by National Semiconductor.

Initialization is performed before speech synthesis, including speaker selection, digit-reading strategy, speech rate adjustment, pitch adjustment, volume adjustment, and so on.

Because the system must imitate the voices of many people, different slave devices are configured with different speaker, pitch, and rate settings to distinguish them. After initialization, the unit waits for a synthesis command from the measurement and control computer. After receiving a command, the chip returns a one-byte status to the host computer, from which the host can determine the chip's current working state.
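Building the UART synthesis command can be sketched as follows. The exact frame layout used here (0xFD frame head, big-endian length covering command, encoding, and text, command byte 0x01, then an encoding byte) is an assumption based on the commonly documented SYN-series format and should be verified against the SYN6658 datasheet.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Assumed SYN6658 synthesis frame: 0xFD | len_hi | len_lo | 0x01 | enc | text.
// The length counts everything after the two length bytes.
std::vector<uint8_t> buildSynthFrame(const std::string& text, uint8_t encoding = 0x00) {
    std::vector<uint8_t> frame;
    uint16_t len = static_cast<uint16_t>(text.size() + 2);  // cmd + encoding + text
    frame.push_back(0xFD);                  // frame head
    frame.push_back(static_cast<uint8_t>(len >> 8));    // length, high byte
    frame.push_back(static_cast<uint8_t>(len & 0xFF));  // length, low byte
    frame.push_back(0x01);                  // command: synthesize text
    frame.push_back(encoding);              // text encoding (e.g. 0x00 = GB2312)
    frame.insert(frame.end(), text.begin(), text.end());
    return frame;
}
```

After such a frame is sent over the UART, the one-byte status returned by the chip tells the host whether it is idle or still speaking.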

3 System software design

The software design of the teaching and playback system includes the software of the measurement and control computer and the software of the Arduino Mega2560 controller on each slave device.

The measurement and control computer is the control center of the whole system; its software is written in C#. The teaching and playback system mainly records operation data so that the operating process can be accurately replayed from the recorded data. The data to be recorded include the slave operator's voice commands, operation actions, the times of those commands and actions, and the instrument state corresponding to each operation. To simplify the data, a code is assigned to each event in advance, and only the codes are recorded during a session, which greatly improves the program's efficiency.

During operation, the measurement and control computer polls the lower computers every 50 ms and records the returned data, so data are recorded in 50 ms units, with a timer used for timekeeping. During playback, the current time is compared with the recorded timestamps; when a recorded time matches the current time, the measurement and control computer commands the lower computer to execute the corresponding event, replaying it.
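The 50 ms record/playback scheme can be sketched as below. The event codes and the replay callback are illustrative (the article only says events are stored as pre-assigned codes); in the real system the callback would send a command to the slave device.

```cpp
#include <cstdint>
#include <vector>

// One recorded event: a timestamp quantized to 50 ms ticks and an event code.
struct Event { uint32_t timeMs; uint8_t code; };

class Recorder {
public:
    // Record the event observed during polling tick `tick` (tick = 50 ms units).
    void record(uint32_t tick, uint8_t code) { log_.push_back({tick * 50, code}); }
    // Playback: execute every event whose timestamp matches the current tick.
    template <typename Fn>
    void playbackTick(uint32_t tick, Fn execute) const {
        for (const Event& e : log_)
            if (e.timeMs == tick * 50) execute(e.code);
    }
private:
    std::vector<Event> log_;
};
```

Driving `playbackTick` from the same 50 ms timer used during recording reproduces the original timing of the session.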

The Arduino Mega2560 controller receives control commands from the measurement and control computer and carries them out: it reads the speech recognition results, collects and processes the sound intensity data, and drives the speech synthesis unit. The Arduino Mega2560 controller receives commands using serial-port interrupts.

Results are returned only when a command is received correctly; if the measurement and control computer does not receive a reply within the time limit, a communication error is indicated and the command must be resent from the beginning.
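The host-side timeout-and-resend logic can be sketched as follows. The transport callback, timeout handling, and retry count are assumptions for illustration; the article only states that an unanswered command is resent.

```cpp
#include <functional>
#include <string>

// Send a command and wait for its acknowledgement; if no reply arrives within
// the time limit, treat it as a communication error and resend, up to a cap.
// `transmitAndWaitAck` returns true when the slave replied in time.
bool sendCommand(const std::function<bool(const std::string&)>& transmitAndWaitAck,
                 const std::string& cmd, int maxRetries = 3) {
    for (int attempt = 0; attempt < maxRetries; ++attempt) {
        if (transmitAndWaitAck(cmd))
            return true;   // command acknowledged within the time limit
        // timeout: fall through and resend from the beginning
    }
    return false;          // give up after maxRetries failed attempts
}
```

Bounding the retries keeps one dead slave device from stalling the 50 ms polling loop indefinitely.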

This article uses an intelligent voice chip to design the teaching and playback system of a simulation trainer. The system does not require the currently popular virtual reality technology and can operate solely under MCU control. It can also be implemented on small portable devices and has excellent application prospects.
