WO2017181508A1 - Multimedia conference control method and server - Google Patents

Multimedia conference control method and server

Info

Publication number
WO2017181508A1
Authority
WO
WIPO (PCT)
Prior art keywords
server
control terminal
conference control
speaking
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2016/085049
Other languages
English (en)
French (fr)
Inventor
唐春华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bangyan Technology Co Ltd
Original Assignee
Bangyan Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bangyan Technology Co Ltd
Publication of WO2017181508A1
Anticipated expiration
Ceased

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources

Definitions

  • The present invention relates to the field of multimedia conferences, and in particular, to a multimedia conference control method and a server.
  • Multimedia conference rooms have been rapidly adopted because of their functional diversity (for example, on-site conferences, academic reports, training, and teaching).
  • A multimedia conference system refers to the integration of the sound, lighting, and electrical equipment and the software associated with the conference.
  • In a multimedia conference room, whether the purpose is reporting, summarizing, or introducing products, the computer-driven interactive use of pictures, text, sound, and video fully engages the participants' senses and greatly improves the effectiveness of the meeting.
  • Multimedia is increasingly showing its advantages in the office field.
  • However, the cameras of the venue are mostly fixed and cannot track and shoot the speaker, which greatly reduces the user experience.
  • The inability of the camera to track and shoot the speaker is therefore a problem that remains to be solved.
  • The main object of the present invention is to solve the problem that the camera cannot track and shoot the speaker in a multimedia conference system.
  • The present invention provides a multimedia conference control method, which includes the following steps:
  • when receiving a speaking instruction sent by the conference control terminal, the server determines, according to the speaking instruction, the corresponding speaking seat and the orientation information corresponding to that seat;
  • the server adjusts a camera according to the determined orientation information to shoot the speaker video;
  • the server sends the speaker video to a display screen for display.
  • Before the step in which the server determines, according to the speaking instruction, the corresponding speaking seat and its orientation information, the method further includes:
  • the server displays a preset agent list (that is, the list of seats) through the conference control terminal, so that the user can select the speaker from the agent list and trigger the corresponding speaking instruction;
  • the server receives the speaking instruction sent by the conference control terminal.
  • Before the step in which the server displays the preset agent list through the conference control terminal, the method further includes:
  • the server saves the received agent list and the orientation information corresponding to each agent.
  • the method further includes:
  • the server receives video data of each of the sub-sites through a network connection
  • the server performs jigsaw processing on the video data of each of the sub-sites to obtain a puzzle video
  • the server sends the puzzle video to a display for display.
  • the step of the server receiving the video data of each of the sub-sites through the network connection includes:
  • the server detects the network bandwidth of the network connection in real time when receiving the video data of the conference site through the network connection;
  • the server determines a video bit rate and a video resolution corresponding to the changed network bandwidth when detecting that the network bandwidth changes;
  • the server switches to the determined video bit rate and video resolution to continue receiving video data.
  • The present invention further provides a multimedia conference server, which includes:
  • a receiving module configured to, when receiving a speaking instruction sent by the conference control terminal, determine, according to the speaking instruction, the corresponding speaking seat and the orientation information corresponding to that seat;
  • a control module configured to adjust a camera according to the determined orientation information to shoot the speaker video;
  • a sending module configured to send the speaker video to a display screen for display.
  • the multimedia conference server further includes a display module
  • the display module is configured to display a preset agent list by using the conference control terminal, so that the user determines a speaker based on the agent list and triggers a corresponding speaking instruction;
  • the receiving module is further configured to receive a speaking instruction sent by the conference control terminal.
  • the multimedia conference server further includes a storage module
  • the receiving module is further configured to: when receiving the setting instruction sent by the conference control terminal, receive the agent list input by the user based on the conference control terminal and the orientation information corresponding to each agent;
  • the storage module is configured to save the received agent list and the orientation information corresponding to each agent.
  • the multimedia conference server further includes a multimedia module
  • the receiving module is further configured to receive video data of each of the sub-sites through a network connection
  • the multimedia module is configured to perform jigsaw processing on video data of each of the sub-sites to obtain a puzzle video
  • the sending module is further configured to send the puzzle video to a display screen for display.
  • the receiving module includes a detecting unit, a determining unit, and a switching unit;
  • the detecting unit is configured to detect a network bandwidth of the network connection in real time when receiving video data of a sub-site through a network connection;
  • the determining unit is configured to determine a video bit rate and a video resolution corresponding to the changed network bandwidth when detecting that the network bandwidth changes;
  • the switching unit is configured to switch to the determined video bit rate and video resolution to continue receiving video data.
  • In the present invention, the server receives the speaking instruction sent by the user through the conference control terminal and, according to the speaking instruction, controls the camera to aim at the corresponding orientation and shoot the speaker video. The camera in the multimedia conference system is thus automatically aimed at the speaker, and the speaker video is automatically displayed on the display screen, so that the user at the chairman's station can trigger a speaking instruction through the conference control terminal to indicate who is speaking.
  • The corresponding speaker video is displayed on the display screen of the conference site, which greatly improves the conference effect and the user experience.
  • FIG. 1 is a hardware architecture diagram of a multimedia conference system implementing various embodiments of the present invention
  • FIG. 2 is a schematic flowchart of a first embodiment of a multimedia conference control method according to the present invention
  • FIG. 3 is a schematic flowchart of a second embodiment of a multimedia conference control method according to the present invention.
  • FIG. 4 is a schematic flowchart diagram of a third embodiment of a multimedia conference control method according to the present invention.
  • FIG. 5 is a schematic flowchart diagram of a fourth embodiment of a multimedia conference control method according to the present invention.
  • FIG. 6 is a schematic diagram of an effect of an embodiment of an agent list displayed by a conference control terminal according to the present invention.
  • FIG. 7 is a schematic diagram of functional modules of a first embodiment of a multimedia conference server according to the present invention.
  • FIG. 8 is a schematic diagram of functional modules of a second embodiment of a multimedia conference server according to the present invention.
  • FIG. 9 is a schematic diagram of functional modules of a third embodiment of a multimedia conference server according to the present invention.
  • FIG. 10 is a schematic diagram of functional modules of a fourth embodiment of a multimedia conference server according to the present invention.
  • FIG. 1 is a hardware architecture diagram of a multimedia conference system implementing various embodiments of the present invention.
  • The multimedia conference system may include a server 100, a conference control terminal 200, and external devices such as a camera 301, a microphone 302, a display screen 303, and an audio device 304.
  • The conference control terminal 200 is configured to generate a corresponding instruction according to a command input by the host user and send it to the server 100 to control various operations of the conference service.
  • The conference control terminal 200 can be a terminal such as a mobile phone, a smart phone, a notebook computer, a PAD (tablet computer), or a desktop computer.
  • The camera 301 and the microphone 302 are used to collect audio and video data.
  • The display screen 303 and the audio device 304 are configured to output the audio and video processed by the multimedia device 102.
  • the server 100 may include a multimedia device 102, a softswitch device 103, a resource access device 104, a controller 101, and the like.
  • FIG. 1 illustrates a server 100 having various devices, but it should be understood that implementing all of the devices shown is not required; more or fewer devices may be implemented instead.
  • The control signaling between the devices in the server 100 can be implemented through the SIP protocol, and the multimedia data can be carried over RTP (Real-time Transport Protocol).
  • The softswitch device 103 is configured to control registration, call routing, and the like for the conference control terminal 200 and the various resources of the conference room (such as camera resources, display resources, and microphone resources).
  • the controller 101 is used for control and management of conference services.
  • the multimedia device 102 is used for processing audio and video, such as audio mixing, video puzzles, and the like.
  • The resource access device 104 is configured to access the display screen 303, the camera 301, the microphone 302, the audio device 304, and the like in the conference room.
  • the present invention provides a multimedia conference control method.
  • FIG. 2 is a schematic flowchart diagram of a first embodiment of a multimedia conference control method according to the present invention.
  • the multimedia conference control method includes:
  • Step S10: when receiving a speaking instruction sent by the conference control terminal, the server determines, according to the speaking instruction, the corresponding speaking seat and the orientation information corresponding to that seat;
  • The host user may trigger, through the conference control terminal, a speaking instruction that instructs the corresponding speaker to speak, and the conference control terminal sends the speaking instruction to the server. When the server receives the speaking instruction, it determines the corresponding speaking seat and the orientation information corresponding to that seat, so as to control the camera to aim at that orientation and shoot the speaker video.
  • The conference control terminal may add the agent information corresponding to the speaker to the speaking instruction as speaker information. When receiving the speaking instruction, the server determines the speaker information from the instruction and uses it to query the orientation information that the server has saved locally for that speaker, so as to adjust the camera according to the orientation information.
  • the server can communicate with the conference control terminal through a SIP protocol.
  • the speaking instruction may be transmitted between the server and the conference control terminal in a format of an INFO message.
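  • The lookup in step S10 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text only says that the instruction travels as a SIP INFO message, so the `key=value` body format, the `Orientation` fields, and the names `SEAT_ORIENTATIONS`, `parse_speaking_instruction`, and `resolve_orientation` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Orientation:
    """Preset shooting angle saved for one seat (field names are assumptions)."""
    pan_deg: float
    tilt_deg: float


# Hypothetical local store on the server: seat identifier -> saved orientation.
SEAT_ORIENTATIONS: Dict[str, Orientation] = {
    "seat-1": Orientation(pan_deg=-30.0, tilt_deg=5.0),
    "seat-2": Orientation(pan_deg=15.0, tilt_deg=0.0),
}


def parse_speaking_instruction(body: str) -> Dict[str, str]:
    """Parse an assumed 'key=value' per line message body into a dict."""
    fields: Dict[str, str] = {}
    for line in body.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            fields[key.strip()] = value.strip()
    return fields


def resolve_orientation(body: str) -> Optional[Orientation]:
    """Map the speaker named in the instruction to its saved orientation.

    Returns None when the instruction names no seat or an unknown seat.
    """
    seat = parse_speaking_instruction(body).get("seat")
    return SEAT_ORIENTATIONS.get(seat) if seat else None
```

  • Returning `None` for an unknown seat leaves the decision about what to do (ignore the instruction, report an error to the terminal) to the caller; the patent text does not specify this case.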
  • Step S20: the server adjusts a camera according to the determined orientation information to shoot the speaker video;
  • According to the determined orientation information, the server adjusts the corresponding camera so that it is aimed at the speaker and shoots the speaker video.
  • The orientation information may include a preset shooting angle, according to which the server adjusts the angle of the corresponding camera to aim it at the speaker.
  • There may be a single camera or a plurality of cameras. When a plurality of cameras are used to shoot the speaking seat, orientation information is set separately for each camera with respect to the same agent, and the server controls the angle adjustment of each camera according to the orientation information corresponding to that camera.
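  • The multi-camera case described above can be illustrated with a small sketch in which each seat carries one preset per camera. The `PRESETS` table, the `Camera` stand-in class, and the `aim_cameras` function are hypothetical names for this example; the patent does not describe how camera commands are actually issued.

```python
from typing import Dict, List, Tuple

# Hypothetical presets: seat -> {camera id -> (pan, tilt) in degrees}.
PRESETS: Dict[str, Dict[str, Tuple[float, float]]] = {
    "seat-3": {"cam-front": (10.0, -2.0), "cam-side": (75.0, 0.0)},
}


class Camera:
    """Stand-in for a controllable camera; it only records the last angle."""

    def __init__(self, camera_id: str) -> None:
        self.camera_id = camera_id
        self.angle: Tuple[float, float] = (0.0, 0.0)

    def move_to(self, pan: float, tilt: float) -> None:
        # A real system would send a pan/tilt command to the device here.
        self.angle = (pan, tilt)


def aim_cameras(seat: str, cameras: Dict[str, Camera]) -> List[str]:
    """Turn every available camera that has a preset for this seat.

    Returns the ids of the cameras that were moved.
    """
    moved: List[str] = []
    for camera_id, (pan, tilt) in PRESETS.get(seat, {}).items():
        camera = cameras.get(camera_id)
        if camera is not None:
            camera.move_to(pan, tilt)
            moved.append(camera_id)
    return moved
```

  • Keeping the presets keyed by seat and then by camera mirrors the statement that orientation information is set for each camera with respect to the same agent.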
  • When receiving the speaking instruction from the conference control terminal, the server may further determine the corresponding speaker information and turn on the microphone device corresponding to the speaker to collect the speaker's audio data. The collected audio data is transmitted to the server, mixed by the media server in the server, and sent to the audio device for output.
  • Step S30 the server sends the video of the speaker to a display screen for display.
  • the server can send the video of the speaker shot by the camera to the display screen for display by using the RTP protocol.
  • The conference control terminal may further add to the speaking instruction a control command, selected by the host user, indicating whether the speaker video is to be displayed. The server determines, according to the speaking instruction, whether to send the corresponding speaker video to the display screen; if yes, the server sends the speaker video to the display screen for display; if not, the server deletes the speaker video.
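  • The display decision can be sketched as a simple gate on a flag carried in the instruction. The field name `display`, its `yes`/`no` values, and the choice to display when the flag is absent are assumptions for this example only; the patent does not define the encoding of the control command.

```python
from typing import Callable, Dict


def handle_speaker_video(instruction: Dict[str, str], frame: bytes,
                         send_to_display: Callable[[bytes], None]) -> bool:
    """Forward the speaker video only when the instruction asks for display.

    Returns True if the frame was sent, False if it was discarded.
    """
    # Assumption: a missing flag is treated as a request to display.
    show = instruction.get("display", "yes") == "yes"
    if show:
        send_to_display(frame)
    # When show is False the frame is simply not forwarded (it is dropped).
    return show
```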
  • the server can be connected to the display through a VGA/HDMI/DVI/SDI interface.
  • In this embodiment, the server receives the speaking instruction sent by the user through the conference control terminal and, according to the speaking instruction, controls the camera to aim at the corresponding orientation and shoot the speaker video, so that the camera in the multimedia conference system is automatically aimed at the speaker.
  • The speaker video is automatically displayed on the display screen, so that the user at the chairman's station can trigger a speaking instruction through the conference control terminal to indicate who is speaking.
  • The corresponding speaker video is displayed on the display screen of the conference site, which greatly improves the conference effect and the user experience.
  • FIG. 3 is a schematic flowchart diagram of a second embodiment of a multimedia conference control method according to the present invention. Based on the first embodiment of the foregoing multimedia conference control method, after the step S30, the method further includes:
  • Step S40 the server receives video data of each of the sub-sites through a network connection
  • the server can receive video data of each sub-site through the RTP protocol.
  • the server may connect to a remote conference site server or a SIP conference terminal of the conference site through the network to receive video data of each of the conference sites.
  • Step S50 the server performs the puzzle processing on the video data of each of the sub-sites to obtain a puzzle video
  • the server can implement the puzzle processing of the video data of each of the sub-sites through the multimedia device in the server to obtain a puzzle video containing the video of each of the sub-sites.
  • The server can perform the puzzle processing in various layouts, for example 1+1 (1 main-site video + 1 sub-site video), 4-screen, 6-screen, 1+4 (1 main-site video + 4 sub-site videos), 1+5 (1 main-site video + 5 sub-site videos), 9-screen, and so on.
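  • To make the layouts concrete, the sketch below computes tile rectangles for two of them: an equal grid (4-screen or 9-screen) and a 1+5 arrangement. The patent names the layouts but not their geometry, so the placement used for 1+5 (a main tile covering two thirds of each dimension, with two tiles to its right and three below) and the function names are assumptions.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def grid_layout(width: int, height: int, n: int) -> List[Rect]:
    """Equal n-by-n tiles, e.g. n=2 for 4-screen or n=3 for 9-screen."""
    tile_w, tile_h = width // n, height // n
    return [(col * tile_w, row * tile_h, tile_w, tile_h)
            for row in range(n) for col in range(n)]


def one_plus_five_layout(width: int, height: int) -> List[Rect]:
    """1+5: one large main tile plus five small tiles (assumed arrangement)."""
    w, h = width // 3, height // 3
    main = (0, 0, 2 * w, 2 * h)                           # main-site video
    right = [(2 * w, row * h, w, h) for row in range(2)]  # two tiles on the right
    bottom = [(col * w, 2 * h, w, h) for col in range(3)] # three tiles below
    return [main] + right + bottom
```

  • Each returned rectangle says where one decoded site video would be scaled and drawn in the composite frame; the compositing itself is outside the scope of this sketch.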
  • Step S60 the server sends the puzzle video to a display screen for display.
  • the server sends the puzzle video to a display for display.
  • The resource access device of the server may access a single display screen or multiple display screens. For example, when multiple screens are accessed, the first display is used to display the puzzle video of all the venues, the second display is used to display the speaker video, and the third display is used to display documents such as PPT.
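  • The three-screen example can be expressed as a small assignment table. The content labels and the function name `assign_displays` are hypothetical, and the fallback used when fewer than three screens are connected (later content shares the last screen) is an assumption that the patent does not state.

```python
from typing import Dict, List

# Order taken from the example in the text: puzzle, speaker, documents.
CONTENT_ORDER = ["puzzle", "speaker", "document"]


def assign_displays(display_ids: List[str]) -> Dict[str, str]:
    """Assign each kind of content to a connected display.

    With fewer displays than content kinds, the remaining content is routed
    to the last display (an assumed policy for this sketch).
    """
    if not display_ids:
        return {}
    last = len(display_ids) - 1
    return {content: display_ids[min(index, last)]
            for index, content in enumerate(CONTENT_ORDER)}
```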
  • In this embodiment, the server receives the video data of each sub-site and generates a puzzle video from it, so that the video of every sub-site is displayed, which improves the conference effect and the user experience.
  • FIG. 4 is a schematic flowchart diagram of a third embodiment of a multimedia conference control method according to the present invention. Based on the second embodiment of the foregoing multimedia conference control method, the step S40 includes:
  • Step S41 the server detects the network bandwidth of the network connection in real time when receiving the video data of the conference site through the network connection;
  • Step S42 the server determines a video bit rate and a video resolution corresponding to the changed network bandwidth when detecting that the network bandwidth changes.
  • step S43 the server switches to the determined video bit rate and video resolution to continue receiving video data.
  • While receiving the video data of a sub-site through the network connection, the server detects the network bandwidth of the connection in real time. When it detects that the network bandwidth has changed, the server determines the video bit rate and video resolution corresponding to the changed bandwidth, then switches to the determined bit rate and resolution to continue receiving video data. For example, the server is receiving the video data of a sub-site at a bit rate of 2000 kbps and detects that the network bandwidth has changed; if the changed bandwidth corresponds to a bit rate of 800 kbps, the server switches to 800 kbps to continue receiving the video data of that site.
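  • Steps S41 to S43 can be sketched as a lookup in a bit-rate ladder. Only the 2000 kbps and 800 kbps bit rates come from the example above; the third rung, the resolutions paired with each bit rate, the selection rule (highest bit rate not exceeding the measured bandwidth), and the names `Profile`, `LADDER`, `select_profile`, and `Receiver` are assumptions for illustration.

```python
from typing import List, NamedTuple


class Profile(NamedTuple):
    bitrate_kbps: int
    resolution: str


# Assumed ladder, highest first. Resolutions are illustrative only.
LADDER: List[Profile] = [
    Profile(2000, "1280x720"),
    Profile(800, "640x360"),
    Profile(300, "320x180"),
]


def select_profile(bandwidth_kbps: float) -> Profile:
    """Pick the highest profile whose bit rate fits the measured bandwidth."""
    for profile in LADDER:
        if profile.bitrate_kbps <= bandwidth_kbps:
            return profile
    return LADDER[-1]  # below the lowest rung: stay on the lowest profile


class Receiver:
    """Tracks the profile currently used to receive one sub-site's video."""

    def __init__(self, bandwidth_kbps: float) -> None:
        self.profile = select_profile(bandwidth_kbps)

    def on_bandwidth_sample(self, bandwidth_kbps: float) -> bool:
        """Handle a new bandwidth measurement; return True if we switched."""
        new_profile = select_profile(bandwidth_kbps)
        changed = new_profile != self.profile
        self.profile = new_profile
        return changed
```

  • For instance, a receiver created with a measured bandwidth of 2500 kbps starts on the 2000 kbps profile and switches to the 800 kbps profile when a later sample reports 900 kbps. A production system would normally add hysteresis so that small fluctuations do not cause repeated switching; that refinement is not described in the patent.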
  • In this embodiment, the resolution and bit rate of the video are adjusted according to the network bandwidth. This avoids problems such as stuttering and corrupted pictures caused by network deterioration during the conference: when the network deteriorates, the video resolution and bit rate are automatically adjusted to fit the available bandwidth, achieving the best video quality possible under the current network conditions and improving the user experience.
  • FIG. 5 is a schematic flowchart diagram of a fourth embodiment of a multimedia conference control method according to the present invention. Based on the first embodiment of the foregoing multimedia conference control method, before the step S10, the method further includes:
  • step S11 the server displays a preset agent list through the conference control terminal, so that the user determines the speaker based on the agent list and triggers a corresponding speaking instruction;
  • Step S12 The server receives the speaking instruction sent by the conference control terminal.
  • The server displays a preset agent list through the conference control terminal, so that the user can select the speaker from the agent list and trigger the corresponding speaking instruction; the server then receives the speaking instruction sent by the conference control terminal and performs the corresponding operation according to it.
  • The agent list may be stored in the server; when the server receives a display instruction triggered by the conference control terminal, the server sends the agent list to the conference control terminal for display.
  • When it detects a click operation by the host user on the agent list, the conference control terminal may trigger the corresponding speaking instruction to indicate that the participant at that agent is to speak.
  • FIG. 6 is a schematic diagram of an effect of an embodiment of a seat list displayed by a conference control terminal according to the present invention.
  • When receiving a setting instruction sent by the conference control terminal, the server may further receive the agent list entered by the user through the conference control terminal together with the orientation information corresponding to each agent; the server then saves the received agent list and the orientation information corresponding to each agent.
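  • Saving the agent list and its orientation information can be sketched as below. The JSON payload shape, the `pan`/`tilt` fields, and the `SeatStore` class are assumptions made for the example; the patent does not specify how the setting instruction encodes the list or how the server stores it.

```python
import json
from typing import Dict, List


class SeatStore:
    """In-memory store for the agent (seat) list and per-seat orientation."""

    def __init__(self) -> None:
        self._orientation: Dict[str, Dict[str, float]] = {}

    def handle_setting_instruction(self, payload: str) -> int:
        """Save every entry of an assumed JSON list; return how many were saved."""
        entries = json.loads(payload)
        for entry in entries:
            self._orientation[entry["seat"]] = {
                "pan": float(entry["pan"]),
                "tilt": float(entry["tilt"]),
            }
        return len(entries)

    def seat_list(self) -> List[str]:
        """The list that would be sent to the terminal for display."""
        return list(self._orientation)

    def orientation_of(self, seat: str) -> Dict[str, float]:
        """Orientation looked up when a speaking instruction names this seat."""
        return self._orientation[seat]
```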
  • In this embodiment, the conference control terminal triggers the corresponding speaking instruction; the server receives the speaking instruction sent by the conference control terminal and, according to it, controls the camera to aim at the corresponding orientation and shoot the speaker video, so that the camera in the multimedia conference system is automatically aimed at the speaker.
  • The speaker video is automatically displayed on the display screen, so that the user at the podium can trigger a speaking instruction through the conference control terminal to indicate who is speaking, and the corresponding speaker video is displayed on the display screen of the conference site, which greatly improves the conference effect and the user experience.
  • the execution bodies of the multimedia conference control methods of the foregoing first to fourth embodiments may each be a multimedia conference system or a server disposed in the multimedia conference system. Further, the multimedia conference control method may be implemented by a client control program installed in the multimedia conference system or the multimedia conference server.
  • the invention further provides a multimedia conference server.
  • FIG. 7 is a schematic diagram of functional modules of a first embodiment of a multimedia conference server according to the present invention.
  • the multimedia conference server includes: a receiving module 10, a control module 20, and a sending module 30.
  • The receiving module 10 is configured to, when receiving a speaking instruction sent by the conference control terminal, determine, according to the speaking instruction, the corresponding speaking seat and the orientation information corresponding to that seat;
  • The host user may trigger, through the conference control terminal, a speaking instruction that instructs the corresponding speaker to speak, and the conference control terminal sends the speaking instruction to the server. When the server receives the speaking instruction, it determines the corresponding speaking seat and the orientation information corresponding to that seat, so as to control the camera to aim at that orientation and shoot the speaker video.
  • The conference control terminal may add the agent information corresponding to the speaker to the speaking instruction as speaker information. When receiving the speaking instruction, the server determines the speaker information from the instruction and uses it to query the orientation information that the server has saved locally for that speaker, so as to adjust the camera according to the orientation information.
  • the server can communicate with the conference control terminal through a SIP protocol.
  • the speaking instruction may be transmitted between the server and the conference control terminal in a format of an INFO message.
  • The control module 20 is configured to adjust a camera according to the determined orientation information to shoot the speaker video;
  • According to the determined orientation information, the server adjusts the corresponding camera so that it is aimed at the speaker and shoots the speaker video.
  • The orientation information may include a preset shooting angle, according to which the server adjusts the angle of the corresponding camera to aim it at the speaker.
  • There may be a single camera or a plurality of cameras. When a plurality of cameras are used to shoot the speaking seat, orientation information is set separately for each camera with respect to the same agent, and the server controls the angle adjustment of each camera according to the orientation information corresponding to that camera.
  • When receiving the speaking instruction from the conference control terminal, the server may further determine the corresponding speaker information and turn on the microphone device corresponding to the speaker to collect the speaker's audio data. The collected audio data is transmitted to the server, mixed by the media server in the server, and sent to the audio device for output.
  • the sending module 30 is configured to send the video of the speaker to a display screen for display.
  • the server can send the video of the speaker shot by the camera to the display screen for display by using the RTP protocol.
  • The conference control terminal may further add to the speaking instruction a control command, selected by the host user, indicating whether the speaker video is to be displayed. The server determines, according to the speaking instruction, whether to send the corresponding speaker video to the display screen; if yes, the server sends the speaker video to the display screen for display; if not, the server deletes the speaker video.
  • the server can be connected to the display through a VGA/HDMI/DVI/SDI interface.
  • In this embodiment, the server receives the speaking instruction sent by the user through the conference control terminal and, according to the speaking instruction, controls the camera to aim at the corresponding orientation and shoot the speaker video, so that the camera in the multimedia conference system is automatically aimed at the speaker.
  • The speaker video is automatically displayed on the display screen, so that the user at the chairman's station can trigger a speaking instruction through the conference control terminal to indicate who is speaking.
  • The corresponding speaker video is displayed on the display screen of the conference site, which greatly improves the conference effect and the user experience.
  • FIG. 8 is a schematic diagram of functional modules of a second embodiment of the apparatus of the present invention.
  • the multimedia conference server further includes a multimedia module 40.
  • the receiving module 10 is further configured to receive video data of each of the sub-sites through a network connection;
  • the server can receive video data of each sub-site through the RTP protocol.
  • the server may connect over the network to a remote sub-site server or to a SIP conference terminal at the sub-site to receive the video data of each sub-site.
  • the multimedia module 40 is configured to compose the video data of the sub-sites into a mosaic (split-screen) video;
  • the server can compose the video data of the sub-sites using the multimedia device in the server, obtaining a mosaic video that contains the video of every sub-site.
  • the server can use various mosaic layouts, for example 1+1 (1 main-site video + 1 sub-site video), 4-way split, 6-way split, 1+4 (1 main-site video + 4 sub-site videos), 1+5 (1 main-site video + 5 sub-site videos), 9-way split, and so on.
  • the sending module 30 is further configured to send the mosaic video to a display screen for display.
  • the server sends the mosaic video to a display screen for display.
  • the resource access device of the server may connect a single display screen or several. For example, when several display screens are connected, the first may show the mosaic video of all conference sites, the second the speaker video, and the third documents such as PPT slides.
  • in this embodiment, the server receives the video data of each sub-site and composes it into a mosaic video for display, so that the video of every sub-site is shown, which improves the effectiveness of the conference and the user experience.
  • FIG. 9 is a schematic diagram of functional modules of a third embodiment of the apparatus of the present invention.
  • based on the second embodiment of the multimedia conference server, the receiving module 10 includes a detecting unit 11, a determining unit 12, and a switching unit 13.
  • the detecting unit 11 is configured to monitor the network bandwidth of the network connection in real time while video data of a sub-site is being received over the connection;
  • the determining unit 12 is configured to determine, on detecting that the network bandwidth has changed, the video bit rate and video resolution corresponding to the changed network bandwidth;
  • the switching unit 13 is configured to switch to the determined video bit rate and video resolution and continue receiving the video data.
  • while receiving the video data of a sub-site over the network connection, the server monitors the network bandwidth of the connection in real time; on detecting that the bandwidth has changed, the server determines the video bit rate and video resolution corresponding to the changed bandwidth; the server then switches to the determined bit rate and resolution and continues receiving the video data. For example, the server is receiving the video data of a sub-site at a bit rate of 2000 kbps and detects that the network bandwidth has changed; if the changed bandwidth suits a bit rate of 800 kbps, the server switches to 800 kbps and continues receiving the video data from the current position.
  • this embodiment adjusts the resolution and bit rate of the video according to the network bandwidth, which avoids problems such as stuttering and corrupted pictures caused by network degradation during a conference. When the network degrades, the video resolution and bit rate are adjusted automatically to fit the network bandwidth, achieving the best video quality available under the current bandwidth conditions and improving the user experience.
  • FIG. 10 is a schematic diagram of functional modules of a fourth embodiment of the apparatus of the present invention.
  • based on the first embodiment of the multimedia conference server, the multimedia conference server further includes a display module 50;
  • the display module 50 is configured to display a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction;
  • the receiving module 10 is further configured to receive the speaking instruction sent by the conference control terminal.
  • the server displays a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction; the server receives the speaking instruction sent by the conference control terminal and performs the corresponding operation according to it.
  • the seat list may be stored in the server; when the server receives a display instruction that the user triggers from the conference control terminal, it sends the seat list to the conference control terminal for display.
  • when the conference control terminal detects that the host user has clicked an entry in the seat list, it may trigger the corresponding speaking instruction to direct the participant at that seat to speak.
  • FIG. 6 is a schematic diagram of an effect of an embodiment of a seat list displayed by a conference control terminal according to the present invention.
  • the multimedia conference server further includes a storage module. The receiving module is further configured to receive, on receiving a setting instruction sent by the conference control terminal, the seat list and the orientation information corresponding to each seat that the user enters through the conference control terminal; the storage module is configured to save the received seat list and the orientation information corresponding to each seat.
  • in this embodiment, the conference control terminal triggers the corresponding speaking instruction; the server receives the speaking instruction sent by the conference control terminal and, according to it, aims the camera at the corresponding orientation to capture the speaker video. The camera in the multimedia conference system is thus aimed at the speaker automatically and the speaker video is shown on the display screen automatically, so that a user at the rostrum can trigger a speaking instruction from the conference control terminal to indicate who is to speak, and the corresponding speaker video appears on the display screen of the conference site. This greatly improves the effectiveness of the conference and the user experience.
  • the methods of the foregoing embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) that includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Telephonic Communication Services (AREA)

Abstract

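Disclosed is a multimedia conference control method comprising the following steps: when a server receives a speaking instruction sent by a conference control terminal, the server determines, according to the speaking instruction, the corresponding speaking seat and the orientation information corresponding to the speaking seat; the server adjusts a camera according to the determined orientation information to capture video of the speaking seat; and the server sends the speaking-seat video to a display screen for display. Also disclosed is a multimedia conference server. The camera in the multimedia conference system is aimed at the speaker automatically and the speaker's video is shown on the display screen automatically, so that a user at the rostrum can trigger a speaking instruction from the conference control terminal to indicate who is to speak, and the corresponding speaker's video appears on the display screen of the conference site, which greatly improves the effectiveness of the conference and the user experience.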

Description

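Multimedia conference control method and server
Technical Field
The present invention relates to the field of multimedia conferencing, and in particular to a multimedia conference control method and server.
Background
With the spread and development of multimedia technology, visual information technologies such as video conferencing and distance teaching have come into wide use in conference rooms, and multimedia conference rooms have spread rapidly thanks to their versatility (on-site meetings, academic reports, training and teaching, and so on). A multimedia conference system refers broadly to the integration of the audio, lighting, and electrical equipment and the software associated with a conference. In a multimedia conference room, whether the task is giving a report, a summary, a briefing, or a product introduction, computer-driven interactive presentation of graphics, text, sound, video, and images fully engages the participants' senses and greatly improves the effectiveness of the conference. Multimedia is also showing its advantages more and more in office work. In existing multimedia conference systems, however, the cameras at the conference site are mostly fixed and cannot follow the speaker to capture the speaker's video, which greatly degrades the user experience.
The inability of the camera in a multimedia conference system to follow the speaker and capture the speaker's video is therefore a problem in urgent need of a solution.
The above content is provided only to aid understanding of the technical solution of the present invention and is not an admission that it constitutes prior art.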
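Summary of the Invention
The main object of the present invention is to solve the problem that, in a multimedia conference system, the camera cannot follow the speaker to capture the speaker's video.
To achieve the above object, the present invention provides a multimedia conference control method, which includes the following steps:
when the server receives a speaking instruction sent by the conference control terminal, the server determines, according to the speaking instruction, the corresponding speaking seat and the orientation information corresponding to the speaking seat;
the server adjusts a camera according to the determined orientation information to capture video of the speaking seat;
the server sends the speaking-seat video to a display screen for display.
Optionally, before the step in which the server, on receiving the speaking instruction from the conference control terminal, determines the corresponding speaking seat and its orientation information according to the speaking instruction, the method further includes:
the server displays a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction;
the server receives the speaking instruction sent by the conference control terminal.
Optionally, before the step in which the server displays the preset seat list through the conference control terminal so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction, the method further includes:
on receiving a setting instruction sent by the conference control terminal, the server receives the seat list, and the orientation information corresponding to each seat, that the user enters through the conference control terminal;
the server saves the received seat list and the orientation information corresponding to each seat.
Optionally, after the step in which the server sends the speaking-seat video to the display screen for display, the method further includes:
the server receives video data from each sub-site over a network connection;
the server composes the video data of the sub-sites into a mosaic video;
the server sends the mosaic video to a display screen for display.
Optionally, before the step in which the server, on receiving the speaking instruction from the conference control terminal, determines the corresponding speaking seat and its orientation information according to the speaking instruction, the method further includes:
the server displays a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction;
the server receives the speaking instruction sent by the conference control terminal.
Optionally, before the step in which the server displays the preset seat list through the conference control terminal so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction, the method further includes:
on receiving a setting instruction sent by the conference control terminal, the server receives the seat list, and the orientation information corresponding to each seat, that the user enters through the conference control terminal;
the server saves the received seat list and the orientation information corresponding to each seat.
Optionally, the step in which the server receives video data from each sub-site over a network connection includes:
while receiving video data from a sub-site over the network connection, the server monitors the network bandwidth of the connection in real time;
on detecting that the network bandwidth has changed, the server determines the video bit rate and video resolution corresponding to the changed network bandwidth;
the server switches to the determined video bit rate and video resolution and continues receiving the video data.
Optionally, before the step in which the server, on receiving the speaking instruction from the conference control terminal, determines the corresponding speaking seat and its orientation information according to the speaking instruction, the method further includes:
the server displays a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction;
the server receives the speaking instruction sent by the conference control terminal.
Optionally, before the step in which the server displays the preset seat list through the conference control terminal so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction, the method further includes:
on receiving a setting instruction sent by the conference control terminal, the server receives the seat list, and the orientation information corresponding to each seat, that the user enters through the conference control terminal;
the server saves the received seat list and the orientation information corresponding to each seat.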
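In addition, to achieve the above object, the present invention further provides a multimedia conference server, which includes:
a receiving module, configured to determine, on receiving a speaking instruction sent by the conference control terminal, the corresponding speaking seat and the orientation information corresponding to the speaking seat according to the speaking instruction;
a control module, configured to adjust a camera according to the determined orientation information to capture video of the speaking seat;
a sending module, configured to send the speaking-seat video to a display screen for display.
Optionally, the multimedia conference server further includes a display module;
the display module is configured to display a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction;
the receiving module is further configured to receive the speaking instruction sent by the conference control terminal.
Optionally, the multimedia conference server further includes a storage module;
the receiving module is further configured to receive, on receiving a setting instruction sent by the conference control terminal, the seat list and the orientation information corresponding to each seat that the user enters through the conference control terminal;
the storage module is configured to save the received seat list and the orientation information corresponding to each seat.
Optionally, the multimedia conference server further includes a multimedia module;
the receiving module is further configured to receive video data from each sub-site over a network connection;
the multimedia module is configured to compose the video data of the sub-sites into a mosaic video;
the sending module is further configured to send the mosaic video to a display screen for display.
Optionally, the multimedia conference server further includes a display module;
the display module is configured to display a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction;
the receiving module is further configured to receive the speaking instruction sent by the conference control terminal.
Optionally, the multimedia conference server further includes a storage module;
the receiving module is further configured to receive, on receiving a setting instruction sent by the conference control terminal, the seat list and the orientation information corresponding to each seat that the user enters through the conference control terminal;
the storage module is configured to save the received seat list and the orientation information corresponding to each seat.
Optionally, the receiving module includes a detecting unit, a determining unit, and a switching unit;
the detecting unit is configured to monitor the network bandwidth of the network connection in real time while video data from a sub-site is being received over the connection;
the determining unit is configured to determine, on detecting that the network bandwidth has changed, the video bit rate and video resolution corresponding to the changed network bandwidth;
the switching unit is configured to switch to the determined video bit rate and video resolution and continue receiving the video data.
Optionally, the multimedia conference server further includes a display module;
the display module is configured to display a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction;
the receiving module is further configured to receive the speaking instruction sent by the conference control terminal.
Optionally, the multimedia conference server further includes a storage module;
the receiving module is further configured to receive, on receiving a setting instruction sent by the conference control terminal, the seat list and the orientation information corresponding to each seat that the user enters through the conference control terminal;
the storage module is configured to save the received seat list and the orientation information corresponding to each seat.
In the present invention, the server receives the speaking instruction that the user sends from the conference control terminal and, according to that instruction, aims the camera at the corresponding orientation to capture the speaker's video. The camera in the multimedia conference system is thus aimed at the speaker automatically and the speaker's video is shown on the display screen automatically, so that a user at the rostrum can trigger a speaking instruction from the conference control terminal to indicate who is to speak, and the corresponding speaker's video appears on the display screen of the conference site. This greatly improves the effectiveness of the conference and the user experience.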
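Brief Description of the Drawings
FIG. 1 is a hardware architecture diagram of a multimedia conference system for implementing the embodiments of the present invention;
FIG. 2 is a schematic flowchart of a first embodiment of the multimedia conference control method of the present invention;
FIG. 3 is a schematic flowchart of a second embodiment of the multimedia conference control method of the present invention;
FIG. 4 is a schematic flowchart of a third embodiment of the multimedia conference control method of the present invention;
FIG. 5 is a schematic flowchart of a fourth embodiment of the multimedia conference control method of the present invention;
FIG. 6 is a schematic diagram of one embodiment of the seat list displayed by the conference control terminal in the present invention;
FIG. 7 is a schematic diagram of the functional modules of a first embodiment of the multimedia conference server of the present invention;
FIG. 8 is a schematic diagram of the functional modules of a second embodiment of the multimedia conference server of the present invention;
FIG. 9 is a schematic diagram of the functional modules of a third embodiment of the multimedia conference server of the present invention;
FIG. 10 is a schematic diagram of the functional modules of a fourth embodiment of the multimedia conference server of the present invention.
The realization of the objects, the functional features, and the advantages of the present invention are further described below with reference to the embodiments and the accompanying drawings.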
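Detailed Description
It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
A multimedia conference system for implementing the embodiments of the present invention is now described with reference to the drawings. FIG. 1 is a hardware architecture diagram of that system. The multimedia conference system may include a server 100, a conference control terminal 200, and external devices such as a camera 301, a microphone 302, a display screen 303, and a loudspeaker 304.
The conference control terminal 200 generates instructions from the commands entered by the host user and sends them to the server 100 to control the various operations of the conference service. The conference control terminal 200 may be a terminal such as a mobile phone, a smartphone, a notebook computer, a PAD (tablet computer), or a desktop computer.
The camera 301 and the microphone 302 capture audio and video data. The display screen 303 and the loudspeaker 304 output the audio and video processed by the multimedia device 102.
The server 100 may include a multimedia device 102, a softswitch device 103, a resource access device 104, a controller 101, and so on. FIG. 1 shows the server 100 with these devices, but it should be understood that not all of the illustrated devices are required; more or fewer devices may be implemented instead. Control signaling between the devices inside the server 100 may be carried over the SIP protocol, and multimedia data is carried over RTP (Real-time Transport Protocol). The softswitch device 103 handles registration and call routing for the conference control terminal 200 and for the conference-room resources (camera resources, display resources, microphone resources, and so on). The controller 101 controls and manages the conference service. The multimedia device 102 processes audio and video, for example mixing audio and composing video mosaics. The resource access device 104 connects the display screen 303, camera 301, microphone 302, loudspeaker 304, and other devices in the conference room.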
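Based on the hardware architecture of the multimedia conference system described above, the present invention provides a multimedia conference control method.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of a first embodiment of the multimedia conference control method of the present invention.
In this embodiment, the multimedia conference control method includes:
Step S10: when the server receives a speaking instruction sent by the conference control terminal, the server determines, according to the speaking instruction, the corresponding speaking seat and the orientation information corresponding to the speaking seat;
The host user may trigger, through the conference control terminal, a speaking instruction that directs the corresponding speaker to speak. The conference control terminal sends the speaking instruction to the server. On receiving it, the server determines the corresponding speaking seat and that seat's orientation information according to the instruction, so as to aim the camera at the corresponding orientation and capture the speaker's video.
The conference control terminal may add the seat information of the speaker to the speaking instruction as speaking-seat information. On receiving the speaking instruction, the server determines the speaking-seat information from the instruction and uses it to look up the orientation information for that speaking seat stored locally on the server, so as to adjust the camera according to the orientation information and capture the speaking-seat video.
The server may communicate with the conference control terminal over the SIP protocol. The speaking instruction may be carried between the server and the conference control terminal in the form of an INFO message.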
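The lookup in step S10 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the source does not specify the body format of the INFO message, so the `key=value` body, the field name `seat`, the `Orientation` type, and the table `SEAT_ORIENTATIONS` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orientation:
    """Hypothetical preset shooting angle for one camera, in degrees."""
    pan_deg: float
    tilt_deg: float


# Hypothetical table saved locally on the server:
# seat id -> {camera id -> preset orientation for that seat}.
SEAT_ORIENTATIONS = {
    "seat-01": {"cam-1": Orientation(-30.0, -5.0)},
    "seat-02": {"cam-1": Orientation(15.0, -5.0),
                "cam-2": Orientation(-40.0, 0.0)},
}


def parse_speaking_instruction(body: str) -> dict:
    """Parse an assumed 'key=value' message body, one pair per line."""
    fields = {}
    for line in body.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            fields[key.strip()] = value.strip()
    return fields


def resolve_orientation(body: str) -> dict:
    """Return the per-camera orientations for the seat named in the instruction."""
    fields = parse_speaking_instruction(body)
    seat = fields.get("seat")
    if seat not in SEAT_ORIENTATIONS:
        raise KeyError(f"unknown speaking seat: {seat!r}")
    return SEAT_ORIENTATIONS[seat]
```

Keeping one orientation per camera for each seat mirrors the multi-camera case described for step S20, where every camera has its own preset for the same seat.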
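Step S20: the server adjusts a camera according to the determined orientation information to capture video of the speaking seat;
The server adjusts the corresponding camera according to the determined orientation information so that it is aimed at the speaking seat and captures the speaking-seat video. The orientation information may include a preset shooting angle, so that the server can adjust the angle of the corresponding camera according to that shooting angle and aim it at the speaking seat. Further, there may be a single camera or several. When several cameras capture the speaking-seat video, orientation information is set for each camera with respect to the same seat, and the server controls the angle adjustment of each camera according to that camera's orientation information.
Further, on receiving the speaking instruction from the conference control terminal, the server may also determine the corresponding speaking-seat information and switch on the microphone assigned to that speaking seat to capture the speaker's audio data. The captured audio is mixed by the media server inside the server and then sent to the audio equipment for output.
Step S30: the server sends the speaking-seat video to a display screen for display.
The server may send the speaking-seat video captured by the camera to the display screen over RTP. Further, the conference control terminal may add to the speaking instruction a control command, selected by the host user, indicating whether the speaking-seat video is to be displayed. The server determines from the speaking instruction whether to send the corresponding speaking-seat video to the display screen: if so, the server sends the speaking-seat video to the display screen for display; if not, it deletes the speaking-seat video.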
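The display decision in step S30 amounts to a single branch on a flag carried in the speaking instruction. A minimal sketch, assuming the instruction has already been parsed into a dictionary and that the flag is called `show_video` (a hypothetical name; the source only says a control command is added to the instruction):

```python
def handle_speaker_video(instruction: dict, frame: bytes, display_queue: list) -> bool:
    """Forward a captured frame to the display queue or drop it.

    `instruction` is an already-parsed speaking instruction; the key
    "show_video" is an assumed name for the host's display choice.
    Returns True if the frame was queued for display.
    """
    if instruction.get("show_video", True):
        display_queue.append(frame)
        return True
    # Display was declined: the frame is not kept anywhere (it is deleted).
    return False
```

Defaulting to display when the flag is absent is a design choice of this sketch, not something the source states.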
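The server may be connected to the display screen through a VGA/HDMI/DVI/SDI interface.
In this embodiment, the server receives the speaking instruction that the user sends from the conference control terminal and, according to that instruction, aims the camera at the corresponding orientation to capture the speaker's video. The camera in the multimedia conference system is thus aimed at the speaker automatically and the speaker's video is shown on the display screen automatically, so that a user at the rostrum can trigger a speaking instruction from the conference control terminal to indicate who is to speak, and the corresponding speaker's video appears on the display screen of the conference site. This greatly improves the effectiveness of the conference and the user experience.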
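Referring to FIG. 3, FIG. 3 is a schematic flowchart of a second embodiment of the multimedia conference control method of the present invention. Based on the first embodiment of the multimedia conference control method described above, the method further includes, after step S30:
Step S40: the server receives video data from each sub-site over a network connection;
The server may receive the video data of each sub-site over RTP. The server may connect over the network to a remote sub-site server or to a SIP conference terminal at the sub-site in order to receive the video data of each sub-site.
Step S50: the server composes the video data of the sub-sites into a mosaic video;
The server may compose the video data of the sub-sites using the multimedia device inside the server, obtaining a mosaic video that contains the video of every sub-site. The server may use various mosaic layouts, for example: 1+1 (1 main-site video + 1 sub-site video), 4-way split, 6-way split, 1+4 (1 main-site video + 4 sub-site videos), 1+5 (1 main-site video + 5 sub-site videos), 9-way split, and so on.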
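To make the layouts concrete, the sketch below computes tile rectangles for an equal grid split and for one possible 1+5 arrangement. The source names the layouts but does not say how the tiles are placed, so the geometry here (a 3x3 grid with the main-site video covering the top-left 2x2 cells) and the function names are assumptions.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def grid_layout(width: int, height: int, rows: int, cols: int) -> List[Rect]:
    """Equal split, row-major: rows=cols=2 gives a 4-way split, 3x3 a 9-way split."""
    tile_w, tile_h = width // cols, height // rows
    return [(c * tile_w, r * tile_h, tile_w, tile_h)
            for r in range(rows) for c in range(cols)]


def one_plus_five(width: int, height: int) -> List[Rect]:
    """Assumed 1+5 layout: a large main tile plus five small tiles in an L shape."""
    cells = grid_layout(width, height, 3, 3)
    tile_w, tile_h = width // 3, height // 3
    main = (0, 0, 2 * tile_w, 2 * tile_h)
    # Row-major indices of the 3x3 cells not covered by the main tile:
    # right column of the first two rows (2, 5) and the whole bottom row (6, 7, 8).
    small = [cells[i] for i in (2, 5, 6, 7, 8)]
    return [main] + small
```

A compositor would scale each incoming stream to its rectangle and paint it into the output frame; that step depends on the video library in use and is left out here.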
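Step S60: the server sends the mosaic video to a display screen for display.
The server sends the mosaic video to a display screen for display. Further, the resource access device of the server may connect a single display screen or several. For example, when several display screens are connected, the first may show the mosaic video of all conference sites, the second the speaker's video, and the third documents such as PPT slides.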
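The screen assignment can be sketched as a small mapping from displays to content streams. The stream names and the overflow rule (with fewer screens than streams, the extra streams share the last screen) are assumptions of this sketch; the source only gives the three-screen example.

```python
from typing import Dict, List, Sequence


def assign_displays(display_ids: Sequence[str],
                    streams: Sequence[str] = ("mosaic", "speaker", "document"),
                    ) -> Dict[str, List[str]]:
    """Map each stream to a display, in order; surplus streams go to the last display."""
    if not display_ids:
        raise ValueError("at least one display is required")
    assignment: Dict[str, List[str]] = {d: [] for d in display_ids}
    last = len(display_ids) - 1
    for index, stream in enumerate(streams):
        assignment[display_ids[min(index, last)]].append(stream)
    return assignment
```

With three displays this reproduces the example in the text: mosaic on the first, speaker video on the second, documents on the third.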
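In this embodiment, the server receives the video data of each sub-site and composes it into a mosaic video for display, so that the video of every sub-site is shown, which improves the effectiveness of the conference and the user experience.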
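Referring to FIG. 4, FIG. 4 is a schematic flowchart of a third embodiment of the multimedia conference control method of the present invention. Based on the second embodiment of the multimedia conference control method described above, step S40 includes:
Step S41: while receiving video data from a sub-site over the network connection, the server monitors the network bandwidth of the connection in real time;
Step S42: on detecting that the network bandwidth has changed, the server determines the video bit rate and video resolution corresponding to the changed network bandwidth;
Step S43: the server switches to the determined video bit rate and video resolution and continues receiving the video data.
While receiving the video data of a sub-site over the network connection, the server monitors the network bandwidth of the connection in real time; on detecting that the bandwidth has changed, the server determines the video bit rate and video resolution corresponding to the changed bandwidth; the server then switches to the determined bit rate and resolution and continues receiving the video data. For example, the server is receiving the video data of a sub-site at a bit rate of 2000 kbps and detects that the network bandwidth has changed; if the changed bandwidth suits a bit rate of 800 kbps, the server switches to 800 kbps and continues receiving the video data from the current position.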
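Steps S41 to S43 can be sketched as a table lookup plus a change check. The 2000 kbps and 800 kbps bit rates come from the example above; the bandwidth thresholds, the resolutions, the lowest rung, and the names `LADDER`, `select_profile`, and `Receiver` are hypothetical, since the source does not state how bandwidth maps to bit rate and resolution.

```python
from typing import Tuple

Profile = Tuple[int, Tuple[int, int]]  # (bit rate in kbps, (width, height))

# Assumed ladder, highest first: (minimum bandwidth in kbps, bit rate, resolution).
LADDER = [
    (2500, 2000, (1920, 1080)),
    (1000, 800, (1280, 720)),
    (500, 384, (640, 360)),
]


def select_profile(bandwidth_kbps: float) -> Profile:
    """Pick the highest rung whose bandwidth threshold is met (S42)."""
    for min_bandwidth, bitrate, resolution in LADDER:
        if bandwidth_kbps >= min_bandwidth:
            return bitrate, resolution
    # Below every threshold: fall back to the lowest rung.
    return LADDER[-1][1], LADDER[-1][2]


class Receiver:
    """Tracks the current profile and reports when a switch is needed."""

    def __init__(self, bandwidth_kbps: float) -> None:
        self.profile = select_profile(bandwidth_kbps)

    def on_bandwidth_sample(self, bandwidth_kbps: float) -> bool:
        """Feed one bandwidth measurement (S41); return True if the profile switched (S43)."""
        new_profile = select_profile(bandwidth_kbps)
        if new_profile != self.profile:
            self.profile = new_profile
            return True
        return False
```

A production system would probably smooth the measurements or add hysteresis so that a bandwidth value hovering near a threshold does not cause repeated switching; the sketch omits this for brevity.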
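This embodiment adjusts the resolution and bit rate of the video according to the network bandwidth, which avoids problems such as stuttering and corrupted pictures caused by network degradation during a conference. When the network degrades, the video resolution and bit rate are adjusted automatically to fit the network bandwidth, achieving the best video quality available under the current bandwidth conditions and improving the user experience.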
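Referring to FIG. 5, FIG. 5 is a schematic flowchart of a fourth embodiment of the multimedia conference control method of the present invention. Based on the first embodiment of the multimedia conference control method described above, the method further includes, before step S10:
Step S11: the server displays a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction;
Step S12: the server receives the speaking instruction sent by the conference control terminal.
The server displays a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction; the server receives the speaking instruction sent by the conference control terminal and performs the corresponding operation according to it.
The seat list may be stored in the server; when the server receives a display instruction that the user triggers from the conference control terminal, it sends the seat list to the conference control terminal for display. When the conference control terminal detects that the host user has clicked an entry in the seat list, it may trigger the corresponding speaking instruction to direct the participant at that seat to speak. Specifically, refer to FIG. 6, which is a schematic diagram of one embodiment of the seat list displayed by the conference control terminal in the present invention.
Further, before step S11, on receiving a setting instruction sent by the conference control terminal, the server may also receive the seat list, and the orientation information corresponding to each seat, that the user enters through the conference control terminal; the server saves the received seat list and the orientation information corresponding to each seat.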
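The setup step can be sketched as a small in-memory store that accepts the seat list from a setting instruction and later serves both the list (for display on the conference control terminal) and the per-seat orientation. The JSON payload shape and the names `SeatStore`, `pan`, and `tilt` are assumptions; the source does not define the format of the setting instruction.

```python
import json
from typing import Dict, List


class SeatStore:
    """Hypothetical server-side store for the seat list and seat orientations."""

    def __init__(self) -> None:
        self._seats: Dict[str, Dict[str, float]] = {}

    def apply_setting_instruction(self, payload: str) -> int:
        """Save seats from an assumed JSON array; return the number of seats stored."""
        for entry in json.loads(payload):
            self._seats[entry["seat"]] = {
                "pan": float(entry["pan"]),
                "tilt": float(entry["tilt"]),
            }
        return len(self._seats)

    def seat_list(self) -> List[str]:
        """Seat ids to send to the conference control terminal for display."""
        return sorted(self._seats)

    def orientation(self, seat: str) -> Dict[str, float]:
        """Orientation saved for one seat; raises KeyError for an unknown seat."""
        return self._seats[seat]
```

A short usage example under the same assumptions:

```python
store = SeatStore()
store.apply_setting_instruction('[{"seat": "A1", "pan": -30, "tilt": -5}]')
print(store.seat_list())
```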
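In this embodiment, the conference control terminal triggers the corresponding speaking instruction; the server receives the speaking instruction sent by the conference control terminal and, according to it, aims the camera at the corresponding orientation to capture the speaker's video. The camera in the multimedia conference system is thus aimed at the speaker automatically and the speaker's video is shown on the display screen automatically, so that a user at the rostrum can trigger a speaking instruction from the conference control terminal to indicate who is to speak, and the corresponding speaker's video appears on the display screen of the conference site. This greatly improves the effectiveness of the conference and the user experience.
The multimedia conference control method of the first to fourth embodiments above may be executed by the multimedia conference system or by a server provided within the multimedia conference system. Furthermore, the multimedia conference control method may be implemented by a client control program installed in the multimedia conference system or in the multimedia conference server.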
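The present invention further provides a multimedia conference server.
Referring to FIG. 7, FIG. 7 is a schematic diagram of the functional modules of a first embodiment of the multimedia conference server of the present invention.
In this embodiment, the multimedia conference server includes a receiving module 10, a control module 20, and a sending module 30.
The receiving module 10 is configured to determine, on receiving a speaking instruction sent by the conference control terminal, the corresponding speaking seat and the orientation information corresponding to the speaking seat according to the speaking instruction;
The host user may trigger, through the conference control terminal, a speaking instruction that directs the corresponding speaker to speak. The conference control terminal sends the speaking instruction to the server. On receiving it, the server determines the corresponding speaking seat and that seat's orientation information according to the instruction, so as to aim the camera at the corresponding orientation and capture the speaker's video.
The conference control terminal may add the seat information of the speaker to the speaking instruction as speaking-seat information. On receiving the speaking instruction, the server determines the speaking-seat information from the instruction and uses it to look up the orientation information for that speaking seat stored locally on the server, so as to adjust the camera according to the orientation information and capture the speaking-seat video.
The server may communicate with the conference control terminal over the SIP protocol. The speaking instruction may be carried between the server and the conference control terminal in the form of an INFO message.
The control module 20 is configured to adjust a camera according to the determined orientation information to capture video of the speaking seat;
The server adjusts the corresponding camera according to the determined orientation information so that it is aimed at the speaking seat and captures the speaking-seat video. The orientation information may include a preset shooting angle, so that the server can adjust the angle of the corresponding camera according to that shooting angle and aim it at the speaking seat. Further, there may be a single camera or several. When several cameras capture the speaking-seat video, orientation information is set for each camera with respect to the same seat, and the server controls the angle adjustment of each camera according to that camera's orientation information.
Further, on receiving the speaking instruction from the conference control terminal, the server may also determine the corresponding speaking-seat information and switch on the microphone assigned to that speaking seat to capture the speaker's audio data. The captured audio is mixed by the media server inside the server and then sent to the audio equipment for output.
The sending module 30 is configured to send the speaking-seat video to a display screen for display.
The server may send the speaking-seat video captured by the camera to the display screen over RTP. Further, the conference control terminal may add to the speaking instruction a control command, selected by the host user, indicating whether the speaking-seat video is to be displayed. The server determines from the speaking instruction whether to send the corresponding speaking-seat video to the display screen: if so, the server sends the speaking-seat video to the display screen for display; if not, it deletes the speaking-seat video.
The server may be connected to the display screen through a VGA/HDMI/DVI/SDI interface.
In this embodiment, the server receives the speaking instruction that the user sends from the conference control terminal and, according to that instruction, aims the camera at the corresponding orientation to capture the speaker's video. The camera in the multimedia conference system is thus aimed at the speaker automatically and the speaker's video is shown on the display screen automatically, so that a user at the rostrum can trigger a speaking instruction from the conference control terminal to indicate who is to speak, and the corresponding speaker's video appears on the display screen of the conference site. This greatly improves the effectiveness of the conference and the user experience.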
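Referring to FIG. 8, FIG. 8 is a schematic diagram of the functional modules of a second embodiment of the apparatus of the present invention. Based on the first embodiment of the multimedia conference server described above, the multimedia conference server further includes a multimedia module 40.
The receiving module 10 is further configured to receive video data from each sub-site over a network connection;
The server may receive the video data of each sub-site over RTP. The server may connect over the network to a remote sub-site server or to a SIP conference terminal at the sub-site in order to receive the video data of each sub-site.
The multimedia module 40 is configured to compose the video data of the sub-sites into a mosaic video;
The server may compose the video data of the sub-sites using the multimedia device inside the server, obtaining a mosaic video that contains the video of every sub-site. The server may use various mosaic layouts, for example: 1+1 (1 main-site video + 1 sub-site video), 4-way split, 6-way split, 1+4 (1 main-site video + 4 sub-site videos), 1+5 (1 main-site video + 5 sub-site videos), 9-way split, and so on.
The sending module 30 is further configured to send the mosaic video to a display screen for display.
The server sends the mosaic video to a display screen for display. Further, the resource access device of the server may connect a single display screen or several. For example, when several display screens are connected, the first may show the mosaic video of all conference sites, the second the speaker's video, and the third documents such as PPT slides.
In this embodiment, the server receives the video data of each sub-site and composes it into a mosaic video for display, so that the video of every sub-site is shown, which improves the effectiveness of the conference and the user experience.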
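Referring to FIG. 9, FIG. 9 is a schematic diagram of the functional modules of a third embodiment of the apparatus of the present invention. Based on the second embodiment of the multimedia conference server described above, the receiving module 10 includes a detecting unit 11, a determining unit 12, and a switching unit 13;
The detecting unit 11 is configured to monitor the network bandwidth of the network connection in real time while video data from a sub-site is being received over the connection;
The determining unit 12 is configured to determine, on detecting that the network bandwidth has changed, the video bit rate and video resolution corresponding to the changed network bandwidth;
The switching unit 13 is configured to switch to the determined video bit rate and video resolution and continue receiving the video data.
While receiving the video data of a sub-site over the network connection, the server monitors the network bandwidth of the connection in real time; on detecting that the bandwidth has changed, the server determines the video bit rate and video resolution corresponding to the changed bandwidth; the server then switches to the determined bit rate and resolution and continues receiving the video data. For example, the server is receiving the video data of a sub-site at a bit rate of 2000 kbps and detects that the network bandwidth has changed; if the changed bandwidth suits a bit rate of 800 kbps, the server switches to 800 kbps and continues receiving the video data from the current position.
This embodiment adjusts the resolution and bit rate of the video according to the network bandwidth, which avoids problems such as stuttering and corrupted pictures caused by network degradation during a conference. When the network degrades, the video resolution and bit rate are adjusted automatically to fit the network bandwidth, achieving the best video quality available under the current bandwidth conditions and improving the user experience.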
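Referring to FIG. 10, FIG. 10 is a schematic diagram of the functional modules of a fourth embodiment of the apparatus of the present invention. Based on the first embodiment of the multimedia conference server described above, the multimedia conference server further includes a display module 50;
The display module 50 is configured to display a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction;
The receiving module 10 is further configured to receive the speaking instruction sent by the conference control terminal.
The server displays a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction; the server receives the speaking instruction sent by the conference control terminal and performs the corresponding operation according to it.
The seat list may be stored in the server; when the server receives a display instruction that the user triggers from the conference control terminal, it sends the seat list to the conference control terminal for display. When the conference control terminal detects that the host user has clicked an entry in the seat list, it may trigger the corresponding speaking instruction to direct the participant at that seat to speak. Specifically, refer to FIG. 6, which is a schematic diagram of one embodiment of the seat list displayed by the conference control terminal in the present invention.
Further, the multimedia conference server also includes a storage module. The receiving module is further configured to receive, on receiving a setting instruction sent by the conference control terminal, the seat list and the orientation information corresponding to each seat that the user enters through the conference control terminal; the storage module is configured to save the received seat list and the orientation information corresponding to each seat.
In this embodiment, the conference control terminal triggers the corresponding speaking instruction; the server receives the speaking instruction sent by the conference control terminal and, according to it, aims the camera at the corresponding orientation to capture the speaker's video. The camera in the multimedia conference system is thus aimed at the speaker automatically and the speaker's video is shown on the display screen automatically, so that a user at the rostrum can trigger a speaking instruction from the conference control terminal to indicate who is to speak, and the corresponding speaker's video appears on the display screen of the conference site. This greatly improves the effectiveness of the conference and the user experience.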
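It should be noted that, in this document, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element qualified by the phrase "includes a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes that element.
The serial numbers of the embodiments of the present invention above are for description only and do not indicate the relative merits of the embodiments.
From the description of the embodiments above, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not thereby limit its patent scope. Any equivalent structural or procedural transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.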

Claims (18)

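  1. A multimedia conference control method, characterized in that the multimedia conference control method comprises the following steps:
    when a server receives a speaking instruction sent by a conference control terminal, the server determines, according to the speaking instruction, the corresponding speaking seat and the orientation information corresponding to the speaking seat;
    the server adjusts a camera according to the determined orientation information to capture video of the speaking seat;
    the server sends the speaking-seat video to a display screen for display.
  2. The multimedia conference control method according to claim 1, characterized in that, before the step in which the server, on receiving the speaking instruction from the conference control terminal, determines the corresponding speaking seat and the orientation information corresponding to the speaking seat according to the speaking instruction, the method further comprises:
    the server displays a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction;
    the server receives the speaking instruction sent by the conference control terminal.
  3. The multimedia conference control method according to claim 2, characterized in that, before the step in which the server displays the preset seat list through the conference control terminal so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction, the method further comprises:
    on receiving a setting instruction sent by the conference control terminal, the server receives the seat list, and the orientation information corresponding to each seat, that the user enters through the conference control terminal;
    the server saves the received seat list and the orientation information corresponding to each seat.
  4. The multimedia conference control method according to claim 1, characterized in that, after the step in which the server sends the speaking-seat video to the display screen for display, the method further comprises:
    the server receives video data from each sub-site over a network connection;
    the server composes the video data of the sub-sites into a mosaic video;
    the server sends the mosaic video to a display screen for display.
  5. The multimedia conference control method according to claim 4, characterized in that, before the step in which the server, on receiving the speaking instruction from the conference control terminal, determines the corresponding speaking seat and the orientation information corresponding to the speaking seat according to the speaking instruction, the method further comprises:
    the server displays a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction;
    the server receives the speaking instruction sent by the conference control terminal.
  6. The multimedia conference control method according to claim 5, characterized in that, before the step in which the server displays the preset seat list through the conference control terminal so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction, the method further comprises:
    on receiving a setting instruction sent by the conference control terminal, the server receives the seat list, and the orientation information corresponding to each seat, that the user enters through the conference control terminal;
    the server saves the received seat list and the orientation information corresponding to each seat.
  7. The multimedia conference control method according to claim 4, characterized in that the step in which the server receives video data from each sub-site over a network connection comprises:
    while receiving video data from a sub-site over the network connection, the server monitors the network bandwidth of the connection in real time;
    on detecting that the network bandwidth has changed, the server determines the video bit rate and video resolution corresponding to the changed network bandwidth;
    the server switches to the determined video bit rate and video resolution and continues receiving the video data.
  8. The multimedia conference control method according to claim 7, characterized in that, before the step in which the server, on receiving the speaking instruction from the conference control terminal, determines the corresponding speaking seat and the orientation information corresponding to the speaking seat according to the speaking instruction, the method further comprises:
    the server displays a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction;
    the server receives the speaking instruction sent by the conference control terminal.
  9. The multimedia conference control method according to claim 8, characterized in that, before the step in which the server displays the preset seat list through the conference control terminal so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction, the method further comprises:
    on receiving a setting instruction sent by the conference control terminal, the server receives the seat list, and the orientation information corresponding to each seat, that the user enters through the conference control terminal;
    the server saves the received seat list and the orientation information corresponding to each seat.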
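  10. A multimedia conference server, characterized in that the multimedia conference server comprises:
    a receiving module, configured to determine, on receiving a speaking instruction sent by a conference control terminal, the corresponding speaking seat and the orientation information corresponding to the speaking seat according to the speaking instruction;
    a control module, configured to adjust a camera according to the determined orientation information to capture video of the speaking seat;
    a sending module, configured to send the speaking-seat video to a display screen for display.
  11. The multimedia conference server according to claim 10, characterized in that the multimedia conference server further comprises a display module;
    the display module is configured to display a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction;
    the receiving module is further configured to receive the speaking instruction sent by the conference control terminal.
  12. The multimedia conference server according to claim 11, characterized in that the multimedia conference server further comprises a storage module;
    the receiving module is further configured to receive, on receiving a setting instruction sent by the conference control terminal, the seat list and the orientation information corresponding to each seat that the user enters through the conference control terminal;
    the storage module is configured to save the received seat list and the orientation information corresponding to each seat.
  13. The multimedia conference server according to claim 10, characterized in that the multimedia conference server further comprises a multimedia module;
    the receiving module is further configured to receive video data from each sub-site over a network connection;
    the multimedia module is configured to compose the video data of the sub-sites into a mosaic video;
    the sending module is further configured to send the mosaic video to a display screen for display.
  14. The multimedia conference server according to claim 13, characterized in that the multimedia conference server further comprises a display module;
    the display module is configured to display a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction;
    the receiving module is further configured to receive the speaking instruction sent by the conference control terminal.
  15. The multimedia conference server according to claim 14, characterized in that the multimedia conference server further comprises a storage module;
    the receiving module is further configured to receive, on receiving a setting instruction sent by the conference control terminal, the seat list and the orientation information corresponding to each seat that the user enters through the conference control terminal;
    the storage module is configured to save the received seat list and the orientation information corresponding to each seat.
  16. The multimedia conference server according to claim 13, characterized in that the receiving module comprises a detecting unit, a determining unit, and a switching unit;
    the detecting unit is configured to monitor the network bandwidth of the network connection in real time while video data from a sub-site is being received over the connection;
    the determining unit is configured to determine, on detecting that the network bandwidth has changed, the video bit rate and video resolution corresponding to the changed network bandwidth;
    the switching unit is configured to switch to the determined video bit rate and video resolution and continue receiving the video data.
  17. The multimedia conference server according to claim 16, characterized in that the multimedia conference server further comprises a display module;
    the display module is configured to display a preset seat list through the conference control terminal, so that the user can select a speaking seat from the seat list and trigger the corresponding speaking instruction;
    the receiving module is further configured to receive the speaking instruction sent by the conference control terminal.
  18. The multimedia conference server according to claim 17, characterized in that the multimedia conference server further comprises a storage module;
    the receiving module is further configured to receive, on receiving a setting instruction sent by the conference control terminal, the seat list and the orientation information corresponding to each seat that the user enters through the conference control terminal;
    the storage module is configured to save the received seat list and the orientation information corresponding to each seat.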
PCT/CN2016/085049 2016-04-21 2016-06-07 多媒体会议控制方法及服务器 Ceased WO2017181508A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610255434.5 2016-04-21
CN201610255434.5A CN105812717A (zh) 2016-04-21 2016-04-21 多媒体会议控制方法及服务器

Publications (1)

Publication Number Publication Date
WO2017181508A1 true WO2017181508A1 (zh) 2017-10-26

Family

ID=56458395

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/085049 Ceased WO2017181508A1 (zh) 2016-04-21 2016-06-07 多媒体会议控制方法及服务器

Country Status (2)

Country Link
CN (1) CN105812717A (zh)
WO (1) WO2017181508A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112616035A (zh) * 2020-11-23 2021-04-06 深圳市捷视飞通科技股份有限公司 多画面拼接方法、装置、计算机设备和存储介质

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106789914B (zh) * 2016-11-24 2020-04-14 邦彦技术股份有限公司 一种多媒体会议控制方法和系统
WO2018098780A1 (zh) * 2016-12-01 2018-06-07 深圳前海达闼云端智能科技有限公司 一种交互式广告展示方法、终端及智慧城市交互系统
CN109246383B (zh) * 2017-07-11 2022-03-29 中兴通讯股份有限公司 一种多媒体会议终端的控制方法及多媒体会议服务器
US10356362B1 (en) * 2018-01-16 2019-07-16 Google Llc Controlling focus of audio signals on speaker during videoconference
US10523864B2 (en) * 2018-04-10 2019-12-31 Facebook, Inc. Automated cinematic decisions based on descriptive models
CN109698928B (zh) * 2018-11-15 2021-04-13 贵阳朗玛信息技术股份有限公司 一种调节视频会议系统中视频流的方法及装置
CN111212218A (zh) * 2018-11-22 2020-05-29 阿里巴巴集团控股有限公司 拍摄控制方法、设备及拍摄系统
CN109547735B (zh) * 2019-01-18 2024-04-16 海南科先电子科技有限公司 一种会议集成系统
CN111245823A (zh) * 2020-01-09 2020-06-05 福建星网智慧科技股份有限公司 一种基于lte协议可移动的无线专网音视频通信系统
CN114067668B (zh) * 2020-08-04 2024-12-20 广州艾美网络科技有限公司 可调多媒体系统及其控制方法
CN116366961A (zh) * 2021-12-24 2023-06-30 广西三诺数字科技有限公司 视频会议方法、装置及计算机设备
CN114449205B (zh) * 2022-04-08 2022-07-29 浙江华创视讯科技有限公司 数据处理方法、终端设备、电子设备及存储介质
CN116312579B (zh) * 2022-09-07 2025-12-12 阿里巴巴(中国)有限公司 音频数据处理方法、存储介质和电子设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20030013017A (ko) * 2001-08-06 2003-02-14 주식회사 호스트이엔아이 프리젠테이션 시스템에서의 화자 인식 방법
CN102469295A (zh) * 2010-10-29 2012-05-23 华为终端有限公司 会议控制方法及相关设备和系统
CN102625077A (zh) * 2011-01-27 2012-08-01 深圳市合智创盈电子有限公司 一种会议记录方法、会议摄像装置、客户机及系统
CN103327250A (zh) * 2013-06-24 2013-09-25 深圳锐取信息技术股份有限公司 基于模式识别镜头控制方法
CN103986914A (zh) * 2014-05-27 2014-08-13 东南大学 无线视频监控系统中基于客户端数量的码率自适应方法

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NO333026B1 (no) * 2008-09-17 2013-02-18 Cisco Systems Int Sarl Styringssystem for et lokalt telepresencevideokonferansesystem og fremgangsmate for a etablere en videokonferansesamtale.
CN101742222A (zh) * 2009-12-30 2010-06-16 华为终端有限公司 摄像头位置的操作方法及视频会议终端
CN101877706B (zh) * 2010-06-24 2013-04-17 北京邮电大学 多终端的多媒体会议控制系统及实现方法
CN104144315B (zh) * 2013-05-06 2017-12-29 华为技术有限公司 一种多点视频会议的显示方法及多点视频会议系统
US20150146078A1 (en) * 2013-11-27 2015-05-28 Cisco Technology, Inc. Shift camera focus based on speaker position
CN204119373U (zh) * 2014-04-02 2015-01-21 中国舰船研究设计中心 一种数字会议人脸跟踪系统
CN105163134B (zh) * 2015-08-03 2018-09-07 腾讯科技(深圳)有限公司 直播视频的视频编码参数设置方法、装置及视频编码设备

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20030013017A (ko) * 2001-08-06 2003-02-14 주식회사 호스트이엔아이 프리젠테이션 시스템에서의 화자 인식 방법
CN102469295A (zh) * 2010-10-29 2012-05-23 华为终端有限公司 会议控制方法及相关设备和系统
CN102625077A (zh) * 2011-01-27 2012-08-01 深圳市合智创盈电子有限公司 一种会议记录方法、会议摄像装置、客户机及系统
CN103327250A (zh) * 2013-06-24 2013-09-25 深圳锐取信息技术股份有限公司 基于模式识别镜头控制方法
CN103986914A (zh) * 2014-05-27 2014-08-13 东南大学 无线视频监控系统中基于客户端数量的码率自适应方法

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112616035A (zh) * 2020-11-23 2021-04-06 深圳市捷视飞通科技股份有限公司 多画面拼接方法、装置、计算机设备和存储介质
CN112616035B (zh) * 2020-11-23 2023-09-19 深圳市捷视飞通科技股份有限公司 多画面拼接方法、装置、计算机设备和存储介质

Also Published As

Publication number Publication date
CN105812717A (zh) 2016-07-27

Similar Documents

Publication Publication Date Title
WO2017181508A1 (zh) 多媒体会议控制方法及服务器
WO2018094791A1 (zh) 一种多媒体会议控制方法和系统
WO2019019374A1 (zh) 智能语音设备控制家电的方法、装置及系统
WO2017107388A1 (zh) Hdmi版本切换方法及显示设备
WO2017135585A2 (en) Main speaker, sub speaker and system including the same
WO2018120457A1 (zh) 数据处理方法、装置、设备及计算机可读存储介质
WO2017201899A1 (zh) 连接蓝牙设备的方法及装置
WO2019114269A1 (zh) 一种节目续播方法、电视设备及计算机可读存储介质
WO2020010671A1 (zh) 显示方法、装置以及电视机、存储介质
WO2018000856A1 (zh) 一种实现SDN Overlay网络报文转发的方法、终端、设备及计算机可读存储介质
WO2019024336A1 (zh) 数据查询方法、装置及计算机可读存储介质
WO2017096671A1 (zh) 网络会议方法及装置
WO2017113614A1 (zh) 视频播放过程中插播广告的方法及装置
WO2018233221A1 (zh) 多窗口声音输出方法、电视机以及计算机可读存储介质
WO2019031735A1 (en) IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND IMAGE DISPLAY SYSTEM
WO2017045441A1 (zh) 基于智能电视的音频播放方法及装置
WO2017063369A1 (zh) 无线直连连接方法及装置
WO2019071762A1 (zh) 楼层位置定位方法、系统、服务器和计算机可读存储介质
WO2017181504A1 (zh) 智能调节字幕大小的方法及电视机
WO2017185480A1 (zh) 多屏互动连接方法、装置及系统
WO2017113596A1 (zh) 单独听控制方法及系统、移动终端及智能电视
WO2018205514A1 (zh) 机顶盒无线兼容性自动化测试方法、系统及可读存储介质
WO2017152527A1 (zh) 智能电视从设备应用的控制方法及智能电视
WO2017148028A1 (zh) 基于智能电视的远端网络连接方法和系统
WO2017084298A1 (zh) 电视机报警方法和系统

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16899095

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 16899095

Country of ref document: EP

Kind code of ref document: A1