Factors affecting the video and audio quality of video conferencing systems

In practical deployments of video conferencing systems, the factors affecting video and audio quality fall mainly into three areas:

1) the quality of service of the network;

2) the performance of the MCU and the terminal;

3) the design of the conference room.

First, the network's quality of service (QoS)

At present, the networks commonly used for video conferencing systems are mainly E1 private lines and IP networks. Based on circuit switching and time-division multiplexing, an E1 line provides dedicated end-to-end bandwidth, so the network itself has a complete transmission-quality guarantee mechanism. In most cases, the main factors affecting the transmission quality of an E1 line are the quality of the transmission equipment and of the transmission lines, and such problems can usually be addressed by replacing the transmission equipment and reducing the line error rate.

The IP network is based on statistical multiplexing and packet switching. When it must carry voice, data, and video services simultaneously, its traditional "best effort" delivery mechanism exposes many problems, the most important being that it cannot provide an end-to-end bandwidth guarantee for each type of service, which leads to larger transmission delay and jitter. We must therefore optimize the IP network through technical means to reduce the impact of the network itself on the video conferencing system. These techniques have evolved into an important branch of IP networking, namely Quality of Service (QoS).

QoS refers to a network's ability to provide better service for specific traffic by means of a variety of techniques. Its main purpose is priority control over bandwidth, delay, jitter, and packet loss. Nearly any network can use QoS to achieve better efficiency.

QoS technologies fall into three service models: best-effort service, integrated services, and differentiated services, of which differentiated services is the most widely used. In the differentiated-services model, the network classifies, queues, and manages packets according to the QoS marking carried by each packet. These markings can be IP addresses, TCP port numbers, or specific fields in the IP packet header.

In actual network planning, network devices such as routers are required to provide QoS guarantees through a range of traffic-management techniques: traffic is assigned different priority levels according to service type, for example voice highest, video next, and data last, and network resources are then allocated according to these priorities.

For video conferencing, to guarantee bandwidth for the video service, the router must be able to identify and classify video packets within the IP streams passing through it, and then provide bandwidth guarantees and priority forwarding through its congestion-management mechanism. In this way, voice and video transmission quality can be maintained even when the network is congested. Mainstream router vendors currently provide QoS support based on classification, marking, and congestion management.
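
As a rough illustration of the classification and congestion-management idea described above, the sketch below implements a strict-priority scheduler in Python. The class name and the three traffic classes (voice, video, data) are illustrative assumptions, not a vendor implementation; real routers use considerably more elaborate queuing (low-latency queuing, weighted fair queuing, and so on).

```python
from collections import deque

# Strict-priority scheduler: voice first, then video, then data.
# Only illustrates the classify-then-schedule idea, not a real router.
PRIORITY_ORDER = ["voice", "video", "data"]

class StrictPriorityScheduler:
    def __init__(self):
        self.queues = {cls: deque() for cls in PRIORITY_ORDER}

    def enqueue(self, packet, traffic_class):
        """Classify the packet into its per-class queue."""
        self.queues[traffic_class].append(packet)

    def dequeue(self):
        """Always serve the highest-priority non-empty queue."""
        for cls in PRIORITY_ORDER:
            if self.queues[cls]:
                return cls, self.queues[cls].popleft()
        return None  # all queues empty

# Example: video is forwarded before data whenever both are waiting.
sched = StrictPriorityScheduler()
sched.enqueue(b"file-chunk", "data")
sched.enqueue(b"video-frame", "video")
print(sched.dequeue())  # ('video', b'video-frame')
```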

Second, the performance of MCU and terminal

Beyond the QoS guarantees provided by the network, the video conferencing equipment itself must also perform well to truly guarantee conference quality. These performance factors include the video and audio codec technologies used by the system, the design architecture of the devices, and the adaptability of the devices to harsh network conditions.

1. Video and audio codec technology

Video and audio coding technology is a key technical indicator of a video conferencing system and an important factor affecting conference quality. At present, the video coding technologies used in video conferencing systems mainly include H.261, H.263, H.264, MPEG-2, MPEG-4, etc.; the audio coding technologies mainly include G.711, G.722, G.728, G.729, MP3, etc.

Among them, the H.264 and MPEG-4 video coding standards can deliver high-definition moving images at low bandwidth with small encoding delay; as new-generation video coding and decoding standards, their advantages are obvious.

In terms of audio coding, MP3 is an efficient audio compression algorithm with a frequency response of 20 Hz to 20 kHz, a sampling frequency of 44.1 kHz, and support for two-channel encoding, so it is being adopted more and more widely.

2. The design structure of the equipment

In the early days, many MCUs and terminals in video conferencing systems were built on PC hardware running Windows. Such devices have significant limitations in codec performance, packet-forwarding efficiency, stability, and security, resulting in low video and audio quality and high latency.

As professional conference-room systems, most video conferencing products now use MCUs and terminals based on embedded architectures. This is mainly because embedded systems have streamlined instruction sets and strong real-time performance; combined with dedicated codec DSPs, they can deliver high-quality, low-latency video and audio processing while also offering high stability and security.

3. Equipment adaptability to harsh network environment

The network's QoS can guarantee video conference transmission quality only to a certain extent, and its effect is quite limited, especially in relatively harsh network environments. The adaptability of the video conferencing equipment itself to such environments therefore also has a major impact on conference quality. These adaptive capabilities include IP priority setting, IP packet sorting, duplicate IP packet control, IP packet jitter control, packet-loss retransmission, automatic rate adjustment, and lip synchronization.

1) IP Priority (IP Precedence)

When the network's QoS is planned around the differentiated-services model, packets entering the network can be classified by various matching criteria, including IP address and IP priority (IP Precedence).

Among them, audio, video, and RTCP streams can be prioritized using the IP priority field of the IP packet header. When the network matches traffic on IP Precedence, video and audio packets whose IP Precedence field has been set by the video device are placed into the priority queue, ensuring priority transmission of the video conference stream.
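
As a minimal sketch of how a sending device might mark its media packets, the Python snippet below sets the IP ToS byte on a UDP socket (on Linux). The DSCP value AF41 and the destination address are illustrative assumptions; real terminals usually expose this marking as a configuration option rather than application code.

```python
import socket

# DSCP AF41 (decimal 34) is commonly used for interactive video;
# the ToS byte carries the DSCP value in its upper six bits.
DSCP_AF41 = 34
TOS_VALUE = DSCP_AF41 << 2  # 0x88

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Mark outgoing packets so routers matching on IP Precedence/DSCP
# can place them in the priority queue (works on Linux).
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_VALUE)

# Hypothetical destination; any media payload sent on this socket
# now carries the AF41 marking in its IP header.
sock.sendto(b"rtp-payload", ("192.0.2.10", 5004))
```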

2) IP packet sorting

In general, the network's best-effort delivery mechanism cannot guarantee that the packets it forwards arrive in the correct order. For an H.323 video conferencing system, if the video device simply processes IP packets in the order they are received, out-of-sequence packets, together with packet loss and delay, will cause the video image to freeze and the sound to break up or jitter.

This problem can be solved if the video device supports IP packet sorting. As IP packets arrive, the device checks their sequence and restores out-of-order packets to their correct positions, maintaining the continuity of the audio and video streams delivered to the end user.
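
A minimal sketch of the reordering idea, assuming RTP sequence numbers are available: packets are held briefly and released in sequence order. Real terminals combine this with the jitter buffer and handle 16-bit sequence-number wraparound, which this sketch omits.

```python
# Minimal packet-reordering sketch keyed on RTP sequence numbers.
# Assumes sequence numbers do not wrap and missing packets eventually arrive.
class ReorderBuffer:
    def __init__(self, first_seq):
        self.next_seq = first_seq      # next sequence number to release
        self.pending = {}              # out-of-order packets held back

    def push(self, seq, payload):
        """Accept a packet and return any packets now releasable in order."""
        self.pending[seq] = payload
        released = []
        while self.next_seq in self.pending:
            released.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return released

buf = ReorderBuffer(first_seq=100)
print(buf.push(101, b"B"))   # []  -- packet 100 not seen yet, hold 101
print(buf.push(100, b"A"))   # [b'A', b'B'] -- released in order
```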

3) Duplicate IP packet control

As IP packets traverse the bearer network, multiple duplicate copies may be produced, either by the network itself or by retransmission mechanisms used to cope with harsh network conditions, and these duplicates can cause the video image to freeze or break up. A video device that supports duplicate packet control can discard the duplicates and maintain the continuity of the audio and video streams delivered to the end user.
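
A minimal sketch of duplicate suppression under the same assumption of RTP sequence numbers: the receiver remembers recently seen sequence numbers and drops repeats. The window size is an arbitrary illustrative choice.

```python
from collections import deque

# Drop duplicate packets by remembering recently seen sequence numbers.
class DuplicateFilter:
    def __init__(self, window=512):
        self.seen = set()
        self.order = deque()
        self.window = window

    def accept(self, seq):
        """Return True if the packet is new, False if it is a duplicate."""
        if seq in self.seen:
            return False
        self.seen.add(seq)
        self.order.append(seq)
        if len(self.order) > self.window:
            self.seen.discard(self.order.popleft())  # expire oldest entry
        return True

f = DuplicateFilter()
print(f.accept(10), f.accept(11), f.accept(10))  # True True False
```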

4) Jitter control

When audio and video IP packets leave the sender, they are spaced at regular, uniform intervals. After crossing the network, this uniform spacing is disturbed by varying delays, producing jitter. Jitter makes the audio and video streams uneven at the receiving terminal. A video device that supports jitter control can cancel the jitter with a jitter buffer, keeping the audio and video streams received by the end user smooth.
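
A minimal jitter-buffer sketch: each packet is scheduled for playout at its original send time plus a fixed buffering delay, which absorbs network jitter up to that delay. The 60 ms buffering delay and the 20 ms packet spacing are illustrative assumptions; real terminals typically size the buffer adaptively.

```python
# Fixed-delay jitter buffer sketch. Packets carry their sender-side
# timestamps (ms); each is scheduled for playout at
#   send_time + PLAYOUT_DELAY_MS
# so variations in network delay up to PLAYOUT_DELAY_MS are absorbed.
PLAYOUT_DELAY_MS = 60   # illustrative buffering delay

def playout_time(send_time_ms):
    return send_time_ms + PLAYOUT_DELAY_MS

# Packets sent every 20 ms but arriving with jitter:
arrivals = [(0, 35), (20, 42), (40, 105), (60, 70)]  # (send_time, arrival_time)
for send, arrive in arrivals:
    play = playout_time(send)
    if arrive <= play:
        print(f"packet sent at {send} ms plays smoothly at {play} ms")
    else:
        print(f"packet sent at {send} ms arrived too late ({arrive} ms) -> concealed")
```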

5) Lost packet retransmission

When the network is severely congested, network devices (such as routers) drop some video packets according to their buffer sizes and queue-management policies. Video conference packets are carried over UDP, which has no retransmission mechanism of its own, so packet loss causes dropped frames or mosaic artifacts at the receiving end. A video device that supports packet-loss retransmission adds its own loss-detection and retransmission mechanism to keep the conference image consistent.
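
A minimal sketch of application-layer loss recovery over UDP, roughly in the spirit of RTP NACK feedback (RFC 4585): the sender keeps recently sent packets in a small history, and the receiver reports a missing sequence number so the sender can resend it. The class and function names are illustrative.

```python
# Sender keeps a short history of transmitted packets so that a
# receiver-reported gap (a NACK) can be answered with a retransmission.
class RetransmittingSender:
    def __init__(self, history_size=256):
        self.history = {}                 # seq -> payload
        self.history_size = history_size
        self.seq = 0

    def send(self, payload, transmit):
        self.history[self.seq] = payload
        if len(self.history) > self.history_size:
            self.history.pop(min(self.history))   # drop the oldest entry
        transmit(self.seq, payload)
        self.seq += 1

    def handle_nack(self, missing_seq, transmit):
        if missing_seq in self.history:
            transmit(missing_seq, self.history[missing_seq])  # resend

# Receiver side: detect a gap in sequence numbers and ask for the missing packet.
def check_gap(expected_seq, received_seq, send_nack):
    for missing in range(expected_seq, received_seq):
        send_nack(missing)

sender = RetransmittingSender()
sender.send(b"frame-0", lambda s, p: None)
sender.send(b"frame-1", lambda s, p: None)
sender.handle_nack(0, lambda s, p: print("retransmit", s, p))
```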

6) Automatic rate adjustment technology

In poor network conditions, lowering the conference bit rate helps improve the continuity and overall quality of video and audio. If the video device supports dynamic rate adjustment, the terminal and the MCU can automatically adapt to the capacity and performance of the network by monitoring network conditions, dynamically adjusting the conference bit rate to give the end user the best possible video quality.

The adaptive bandwidth adjustment of the video device is realized mainly by monitoring the packet loss rate. If a terminal detects that the loss rate exceeds a specified threshold, it automatically lowers its conference rate and notifies the other participants to do the same, thereby settling on the conference rate that gives the best video and audio quality.
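
A minimal sketch of the loss-threshold rule described above. The 5% threshold, the step-down factor, and the slow ramp-up are illustrative assumptions; real terminals also negotiate the new rate with the MCU through signaling (for example H.245 flow-control messages), which is not shown.

```python
# Threshold-based rate adaptation: cut the bit rate when measured loss
# exceeds a threshold, and creep back up while the network stays clean.
LOSS_THRESHOLD = 0.05      # 5% packet loss triggers a step down
STEP_DOWN = 0.75           # reduce to 75% of the current rate
STEP_UP = 1.05             # slow recovery when loss is low
MIN_RATE, MAX_RATE = 256, 2048   # kbit/s, illustrative bounds

def adapt_rate(current_kbps, loss_ratio):
    if loss_ratio > LOSS_THRESHOLD:
        new_rate = current_kbps * STEP_DOWN
    else:
        new_rate = current_kbps * STEP_UP
    return max(MIN_RATE, min(MAX_RATE, new_rate))

rate = 1024.0
for loss in [0.01, 0.08, 0.09, 0.02, 0.0]:
    rate = adapt_rate(rate, loss)
    print(f"loss {loss:.0%} -> {rate:.0f} kbit/s")
```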

7) Lip synchronization technology

In a video conferencing system, the video and audio signals are encoded and transmitted separately. Because of factors such as IP priority and the differing sizes of video and audio packets, corresponding video and audio packets may arrive in a different order, causing lip sync to be lost.

Two main factors affect lip synchronization: differences in network transmission delay and differences in video and audio processing delay.

When the audio and video packets leave the sender, each audio packet is synchronized with its corresponding video packet. As they cross the bearer network, however, the various queuing algorithms treat audio and video packets differently, which breaks the synchronization between them; the end result is that the sound no longer matches the speaker's lips. A video device that supports lip sync can correct this using the RTP timestamp information carried in the packets: from the RTP timestamps it can determine which audio packet corresponds to which video packet and then realign them, restoring synchronization between sound and image.

At the sending end, the time taken to process audio differs from the time taken to process video. Contributing factors include the difference between the speed of sound and the speed of light, the size and shape of the room, and the complexity of the audio and video coding algorithms. To compensate for this difference, devices that support lip sync can add a fixed delay at the start of the audio stream, and can also increase or decrease the audio delay at the receiving end to correct an inappropriate delay setting at the sender. This ensures that sound and image remain synchronized when the remote site receives them.
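
A minimal sketch of playout-side lip sync, assuming each decoded audio and video frame already carries a capture timestamp mapped to a common millisecond clock (in real RTP this mapping comes from RTCP sender reports, which the sketch omits): the stream that is running ahead is delayed so both play out together.

```python
# Align audio and video playout using capture timestamps on a common
# clock (ms). Real systems derive this mapping from RTCP sender reports;
# here the timestamps are assumed to be directly comparable.
def compute_extra_delay(audio_capture_ms, audio_ready_ms,
                        video_capture_ms, video_ready_ms):
    """Return (extra_audio_delay_ms, extra_video_delay_ms) so that frames
    captured at the same moment are presented at the same moment."""
    audio_latency = audio_ready_ms - audio_capture_ms
    video_latency = video_ready_ms - video_capture_ms
    if audio_latency < video_latency:
        # audio path is faster: hold the audio back
        return video_latency - audio_latency, 0
    else:
        # video path is faster: hold the video back
        return 0, audio_latency - video_latency

# Audio decoded 40 ms after capture, video 160 ms after capture:
print(compute_extra_delay(0, 40, 0, 160))   # (120, 0) -> delay audio by 120 ms
```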

4. Audio processing technology

1) Automatic echo suppression

In a multipoint video conference, the audio encoder of each site sends its audio packets to the MCU, and the MCU broadcasts each site's audio to all the other sites. When a video conference terminal receives an audio packet, it compares the level of the decoded audio stream with that of the locally captured audio stream and removes the common portion, so that the local sound is not played back out of the venue's loudspeakers and fed back into the system, which would cause audio oscillation; echo is thereby avoided.
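
The article describes a level-comparison approach; in practice, terminals commonly realize echo control with an adaptive acoustic echo canceller. As an illustration of that general idea (not of the article's specific method), here is a minimal NLMS adaptive-filter sketch in Python/NumPy; the filter length, step size, and toy signals are arbitrary.

```python
import numpy as np

def nlms_echo_cancel(far_end, mic, filter_len=128, mu=0.5, eps=1e-8):
    """Minimal NLMS acoustic echo canceller sketch.

    far_end : samples played through the loudspeaker (the remote audio)
    mic     : samples picked up by the local microphone (speech + echo)
    Returns the echo-reduced microphone signal.
    """
    w = np.zeros(filter_len)          # adaptive estimate of the echo path
    out = np.zeros(len(mic))
    x_buf = np.zeros(filter_len)      # most recent far-end samples
    for n in range(len(mic)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = far_end[n]
        echo_est = w @ x_buf          # predicted echo at the microphone
        e = mic[n] - echo_est         # residual = local speech + estimation error
        w += mu * e * x_buf / (x_buf @ x_buf + eps)
        out[n] = e
    return out

# Toy check: the microphone hears only a scaled, delayed copy of the far end,
# so the residual should shrink toward zero as the filter converges.
rng = np.random.default_rng(0)
far = rng.standard_normal(4000)
echo = 0.6 * np.concatenate([np.zeros(10), far[:-10]])
residual = nlms_echo_cancel(far, echo)
print(np.mean(residual[:500] ** 2), np.mean(residual[-500:] ** 2))
```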

2) Automatic gain control

Since a video conference venue typically places an omnidirectional microphone at the center of the room, the microphone picks up each speaker at a different level depending on where the speaker sits relative to it.

To keep the audio level transmitted to the remote site smooth, gain processing is applied to the audio during encoding so that speakers within a certain range are heard at a consistent level and the sound at the remote venue does not swing between loud and quiet.
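
A minimal automatic-gain-control sketch in Python/NumPy: the signal is processed in short blocks, the gain needed to bring each block's RMS level to a target is computed, clamped, and smoothed so the output level stays even. The target level, gain limit, and smoothing factor are illustrative assumptions.

```python
import numpy as np

def simple_agc(signal, frame_len=160, target_rms=0.1,
               max_gain=8.0, smoothing=0.9):
    """Block-wise AGC: drive each frame's RMS level toward target_rms."""
    out = np.copy(signal).astype(float)
    gain = 1.0
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = out[start:start + frame_len]
        rms = np.sqrt(np.mean(frame ** 2)) + 1e-12
        desired = min(target_rms / rms, max_gain)
        gain = smoothing * gain + (1 - smoothing) * desired  # avoid pumping
        out[start:start + frame_len] = frame * gain
    return out

# A quiet talker (amplitude 0.02) is brought up toward the target level.
quiet = 0.02 * np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)
boosted = simple_agc(quiet)
print(round(float(np.max(np.abs(boosted))), 3))
```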

3) Background noise cancellation

Some environmental noise is inevitable during a meeting, such as the continuous noise from air conditioners, fans, and AC-powered equipment. These sounds seriously degrade the audio quality of the conference.

An automatic noise suppression system judges whether a sound is ambient noise from its level and duration and filters it out, producing clean conference audio.
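
A minimal sketch of the level-and-duration rule described above: frames whose level stays below a threshold for longer than a hold time are treated as ambient noise and attenuated. The threshold, frame length, hold time, and attenuation factor are illustrative assumptions.

```python
import numpy as np

def noise_gate(signal, frame_len=160, threshold_rms=0.02,
               hold_frames=10, attenuation=0.1):
    """Attenuate sustained low-level frames (treated as ambient noise)."""
    out = np.copy(signal).astype(float)
    quiet_run = 0
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = out[start:start + frame_len]
        rms = np.sqrt(np.mean(frame ** 2))
        if rms < threshold_rms:
            quiet_run += 1
        else:
            quiet_run = 0                # speech resets the counter
        if quiet_run >= hold_frames:     # low level has persisted: gate it
            out[start:start + frame_len] = frame * attenuation
    return out

# Steady low-level hum is attenuated once it has persisted long enough.
hum = 0.005 * np.sin(2 * np.pi * 50 * np.arange(16000) / 8000)
gated = noise_gate(hum)
print(float(np.max(np.abs(gated[-1000:]))))   # much smaller than 0.005
```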

Third, the design of the conference room

The design of the conference room is also one of the important factors affecting video conference quality; it covers the venue equipment, the venue layout, and the venue environment. Conference room design involves a wide range of topics, so for reasons of space only some factors and suggestions are listed below.

The venue equipment includes the video and audio input and output devices, such as cameras, displays, microphones, and the sound system. These devices should be configured differently according to the venue's layout and decoration in order to truly guarantee conference quality. For example, the venue's sound reinforcement system must be well matched to the venue layout to be effective, and professional sound reinforcement design relies on detailed sound-field testing and repeated tuning.

The layout of the venue includes the overall design, venue area, venue decoration, etc.:

1) The overall design of the venue should reproduce people and scenes realistically, giving participants a sense of presence so that visual and verbal information can be exchanged effectively. The images transmitted from the conference room include people, scenes, charts, and text, all of which should be clearly recognizable;

2) The venue area is recommended to be calculated at an average of 2.2 square meters per person;

3) To prevent color cast and reflections from affecting the camera image of the participants, the background wall should be a uniform light color, usually beige or gray, so that the camera lens aperture can be set properly. The other three walls, the floor, and the ceiling should avoid saturated black or bright colors and are usually light blue, light gray, and so on. No wall should carry complex patterns or ornate picture frames, to avoid image blurring and extra coding overhead when the camera pans or zooms;

4) The conference tables should be arranged in rows. To reduce facial shadows, light-colored tabletops or tablecloths should be used, and it is best to place a layer of soft material between the microphone and the tabletop to avoid excessive noise when the table is knocked;

5) Use comfortable chairs, but without small casters, to restrict movement and keep participants from drifting out of the camera's view;

6) For sound insulation, carpet should be laid on the floor, sound-absorbing material installed on the ceiling, acoustic blankets mounted on the surrounding walls, double glazing fitted to the windows, and cloths placed on the tables;

7) Proper lighting is a basic requirement for video conference rooms. Because conferences may be held at any time, artificial cold light sources should be used indoors rather than natural light, and the windows and doors should be covered with dark curtains. The light sources should not strain human vision; tri-phosphor (three-primary-color) lamps with a color temperature of 3500 K are most suitable.

The venue environment includes the indoor environment and the surrounding environment:

1) Air conditioning should be installed in the conference room to maintain a stable temperature and humidity, and its noise should be as low as possible, since excessive air-conditioning noise seriously degrades the venue's audio quality. Indoor air should be kept circulating during the meeting;

2) The conference room should be located away from outside noise and bustle, and should be sited with sound-leakage prevention, ease of use, and minimal external noise interference in mind.

