
WO2024174732A1 - Display device and speech control method - Google Patents

Info

Publication number
WO2024174732A1
WO2024174732A1 PCT/CN2023/143115 CN2023143115W WO2024174732A1 WO 2024174732 A1 WO2024174732 A1 WO 2024174732A1 CN 2023143115 W CN2023143115 W CN 2023143115W WO 2024174732 A1 WO2024174732 A1 WO 2024174732A1
Authority
WO
WIPO (PCT)
Prior art keywords
control
scroll
voice
scrolling
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/CN2023/143115
Other languages
French (fr)
Chinese (zh)
Inventor
付友苹
付延松
卢可敬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hisense Visual Technology Co Ltd
Original Assignee
Hisense Visual Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hisense Visual Technology Co Ltd
Publication of WO2024174732A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L15/26 Speech to text systems
    • G10L2015/223 Execution procedure of a spoken command
    • G10L2015/225 Feedback of the input speech
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/44 Receiver circuitry for the reception of television signals according to analogue transmission standards
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Definitions

  • the present application relates to the field of voice control technology, and in particular to a display device and a voice control method.
  • voice functions are also integrated into various display devices (such as smart TVs), and users can jump to/open specified applications through voice commands.
  • the display interface of a display device maintains multiple controls that can respond to voice control commands, and the controls can display corresponding interface description words to the outside.
  • the display device matches the voice control text corresponding to the voice control command with the interface description words of each control in the current display interface one by one. If the corresponding interface description word is matched, the control corresponding to the interface description word is controlled to perform related operations. For example, if a resource library control is displayed in the current display interface, then in response to the voice control command of "open resource library", the display device can automatically select the resource library control, and then open the resource library to display a variety of multimedia resources for users to choose from.
  • some display interfaces may contain some controls without interface description words, and therefore, it is not possible to perform voice control on them by using the above interface description word matching method.
  • the present application provides a display device and a voice control method, which can effectively control controls without interface description words in a display interface.
  • the present application provides a display device, including a display, a sound collector, and a controller connected to the display and the sound collector respectively, wherein:
  • the display is configured to display an image screen and a user interface;
  • the sound collector is configured to collect a user's voice control command;
  • the controller is configured to:
  • determine at least one scroll control contained in the current display interface, and construct a voice scroll control word list for the scroll control in the current display interface; the voice scroll control word list is used to indicate the correspondence between the scroll direction of the scroll control and the semantic control word;
  • in response to the user's voice control instruction, based on the voice scroll control word list and the voice control text corresponding to the voice control instruction, control the target scroll control to perform a scroll operation in the scroll direction indicated by the voice control text; the target scroll control is the scroll control, among the scroll controls contained in the current display interface, that corresponds to the voice control text.
  • the present application provides a voice control method, comprising:
  • determining at least one scroll control contained in the current display interface, and constructing a voice scroll control word list for the scroll control in the current display interface; the voice scroll control word list is used to indicate the correspondence between the scroll direction of the scroll control and the semantic control word;
  • in response to the user's voice control instruction, based on the voice scroll control word list and the voice control text corresponding to the voice control instruction, controlling the target scroll control to perform a scroll operation in the scroll direction indicated by the voice control text; the target scroll control is the scroll control, among the scroll controls contained in the current display interface, that corresponds to the voice control text.
  • the present application also provides a computer-readable storage medium, which stores a computer program.
  • when the computer program is executed by a controller in a display device, it can implement some or all of the steps of the voice control method provided in the present application.
  • the present application also provides a computer program product, which includes a computer program, and when the computer program is executed by a controller in a display device, it can implement some or all of the steps of the voice control method provided in the present application.
  • FIG. 1 is an operation scenario between a display device and a control device shown in some embodiments of the present application.
  • FIG. 2 is a hardware configuration block diagram of a display device shown in some embodiments of the present application.
  • FIG. 3 is a software configuration block diagram of a display device shown in some embodiments of the present application.
  • FIG. 4 is a schematic diagram of control distribution of a display interface shown in some embodiments of the present application.
  • FIG. 5 is a flow chart of a method for executing voice control on a display device according to some embodiments of the present application.
  • FIG. 6 is a schematic diagram of horizontal scrolling of a scroll control shown in some embodiments of the present application.
  • FIG. 7 is a schematic diagram of vertical scrolling of a scroll control shown in some embodiments of the present application.
  • FIG. 8 is a schematic diagram of waterfall scrolling of a scroll control shown in some embodiments of the present application.
  • FIG. 9 is a flow chart of voice-controlling a scroll control to perform a scrolling operation according to some embodiments of the present application.
  • FIG. 10 is a schematic diagram of a distribution of multiple scroll controls on a display interface shown in some embodiments of the present application.
  • FIG. 11 is a schematic diagram of the voice control logic of a scroll control in a display interface shown in some embodiments of the present application.
  • the display device can have various implementation forms, for example, it can be a television, a smart TV, a laser projection device, a monitor, an electronic whiteboard (electronic bulletin board), an electronic desktop (electronic table), it can also be a personal computer, a laptop computer, a smart phone, a tablet computer, a portable wearable device, etc.
  • the above-mentioned display device can be controlled by a control device, an intelligent control device, voice, action, gesture, trigger action, etc.
  • a user may operate a display device 200 through a smart device 300 or a control apparatus 100.
  • the control apparatus 100 may be a remote controller. The remote controller communicates with the display device through infrared protocol communication, Bluetooth protocol communication, or other short-range communication methods, and controls the display device 200 wirelessly or by wire.
  • the user can input user commands through buttons on the remote controller, voice input, control panel input, etc. to control the display device 200.
  • a smart device 300 (such as a mobile terminal, a tablet computer, a computer, a laptop computer, etc.) may also be used to control the display device 200.
  • the display device 200 is controlled using an application running on the smart device.
  • the display device 200 may not use the smart device 300 or the control apparatus 100 to receive instructions, but may receive user control through touch or gestures.
  • the display device 200 can also be controlled in a manner other than the control apparatus 100 and the smart device 300.
  • the user's voice command control can be directly received through a module for acquiring voice commands configured inside the display device 200; or the user's voice command control can be received through a voice control device provided outside the display device 200.
  • the display device 200 also communicates data with the server 400, allowing the display device 200 to communicate through a local area network (LAN), a wireless local area network (WLAN), or other networks.
  • the server 400 can provide various contents and interactions to the display device 200.
  • server 400 can be a stand-alone server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (CDN), as well as big data and artificial intelligence platforms.
  • the display device 200 may include at least one of a tuner-demodulator 210, a communicator 220, a detector 230, an external device interface 240, a controller 250, a display 260, an audio output interface 270, a memory 280, a power supply, and a user interface.
  • the tuner-demodulator 210 receives broadcast television signals by wired or wireless means, and demodulates audio and video signals, as well as data signals such as Electronic Program Guide (EPG) data signals, from multiple wireless or wired broadcast television signals.
  • the tuner-demodulator 210 and the controller 250 may be located in different separate devices, that is, the tuner-demodulator 210 may also be located in an external device of the main device where the controller 250 is located, such as an external set-top box.
  • the communicator 220 is a component for communicating with an external device or server 400 according to various communication protocol types.
  • the communicator may include at least one of a Wi-Fi module, a Bluetooth module, a wired Ethernet module, other network communication protocol chips or a near field communication protocol chip, and an infrared receiver.
  • the display device 200 can establish the sending and receiving of control signals and data signals with the external control apparatus 100 or the server 400 through the communicator 220.
  • the display device can communicate with the cloud server through the communicator, send data information that needs to be processed to the cloud server, and obtain the data information processed by the cloud server.
  • the detector 230 is used to collect signals of the external environment or the interaction between the display device 200 and the outside.
  • the detector 230 includes an image collector, such as a camera, which can be used to collect external environment scenes, user attributes or user interaction gestures; or the detector 230 includes a sound collector, such as a microphone, etc., which is used to collect external sounds.
  • the sound collector can be used to collect the user's voice control instructions.
  • the external device interface 240 may include, but is not limited to, any one or more of the following interfaces: a high definition multimedia interface (HDMI), an analog or digital high definition component input interface (component), a composite video input interface (CVBS), a universal serial bus (USB) input interface, an RGB port, etc. It may also be a composite input/output interface formed by the above multiple interfaces.
  • the controller 250 includes: a central processing unit (CPU), a video processor, an audio processor, a graphics processing unit (GPU), a random access memory (RAM), a read-only memory (ROM), a first interface to an nth interface for input/output, a communication bus (Bus), etc.
  • the controller 250 can control the operation of the display device 200 and respond to the user's operation through various software control programs stored in the memory 280.
  • the controller 250 controls the overall operation of the display device 200. For example, in response to receiving a user command for selecting a UI object to be displayed on the display 260, the controller 250 can perform an operation related to the object selected by the user command.
  • the controller is configured to determine at least one scroll control contained in the current display interface and construct a voice scroll control word list for the scroll control in the current display interface; then, after the sound collector receives the user's voice control instruction, based on the voice scroll control word list and the voice control text corresponding to the voice control instruction, in the scroll controls contained in the current display interface, the target scroll control corresponding to the voice control text is controlled to perform a scroll operation in the scroll direction indicated by the voice control text.
  • the display 260 includes a display screen component for presenting images and a driving component for driving image display.
  • the display 260 is used to receive the image signal output by the controller 250, and display the video content, image content, menu control interface components and user control UI interface.
  • the display 260 may be a liquid crystal display, an OLED display, or a projection display, and may also be a projection device and a projection screen.
  • the user may input a user command in a graphical user interface (GUI) displayed on the display 260, and the user interface receives the user input command through the GUI.
  • the user may input a user command by inputting a specific sound or gesture, and the user interface receives the user input command by recognizing the sound or gesture through a detector.
  • the display 260 is configured to display image screens and a user interface.
  • a "user interface" is the medium for interaction and information exchange between application programs or operating systems and users. It converts information between its internal form and a form acceptable to users.
  • the user interface may be an interface element such as an icon, a window, or a control displayed on the display of the display device 200.
  • controls may include icons, buttons, menus, tabs, text boxes, dialog boxes, status bars, navigation bars, widgets and other visual interface elements.
  • the display device 200 divides the Android operating system into four layers, from top to bottom: the applications layer (referred to as the "application layer"), the application framework layer (referred to as the "framework layer"), the Android runtime and system library layer (referred to as the "system runtime library layer"), and the kernel layer.
  • At least one application is running in the application layer.
  • These applications may be window programs, system settings programs, clock programs, etc. that come with the Android operating system; or they may be applications developed by third-party developers.
  • application programs included in the application layer are not limited to the above examples, but may also include other application programs.
  • the framework layer provides application programming interface (API) and programming framework for applications.
  • the application framework layer includes some predefined functions.
  • the framework layer is equivalent to a processing center, which determines the actions of applications in the application layer.
  • Applications can access resources in the Android system and obtain system services during execution through API interfaces.
  • the framework layer includes managers, content providers, view component systems, etc.
  • the manager includes at least one of the following modules: an activity manager (Activity Manager) is used to interact with all activities running in the system; a location manager (Location Manager) is used to provide system services or applications with access to system location services; a package manager (Package Manager) is used to retrieve various information related to the application package currently installed on the device; a notification manager (Notification Manager) is used to control the display and clearing of notification messages; a window manager (Window Manager) is used to manage icons, windows, toolbars, wallpapers and desktop widgets on the user interface.
  • the activity manager is used to manage the life cycle of each application and the usual navigation back function, such as controlling the exit, opening, and back of the application;
  • the window manager is used to manage all window programs, such as obtaining the size of the display screen, determining whether there is a status bar, locking the screen, capturing the screen, and controlling display window changes (for example, reducing the display window, shaking the display, distorting the display, etc.).
  • the system runtime library layer provides support for the upper layer (i.e., the framework layer).
  • the Android operating system runs the C/C++ libraries contained in the system runtime library layer to implement the functions required by the framework layer.
  • the kernel layer is a layer between hardware and software.
  • the kernel layer includes at least one of the following drivers: audio driver, display driver, Bluetooth driver, camera driver, Wi-Fi driver, USB driver, HDMI driver, sensor driver (such as fingerprint sensor, temperature sensor, pressure sensor, etc.), and power driver, etc.
  • Voice functions are also integrated into various display devices. Users can view various multimedia resources and jump to/open specified applications through voice control commands.
  • voice control can use direct words to jump or switch directly between applications. That is, when the voice control text corresponding to the voice control instruction input by the user is a direct word of a certain application, the display device responds to the voice control instruction and jumps from the current application to the target application indicated by the voice control text.
  • for example, the user can input the voice control command "return to the home page" to instruct the display device to switch from the currently running video application to the home page, making it easier for the user to select other applications/multimedia resources of interest on the home page.
  • the homepage is a user interface for displaying and directly accessing various application interfaces. Users can browse recommended multimedia resources and applications on the homepage to select multimedia resources of interest to play, or open applications of interest to play/browse related content in the application.
  • voice control between scenarios cannot be implemented through the above direct words; within the different display interfaces, control can only be implemented through interface description words. That is, when the voice control text corresponding to the voice control instruction input by the user is the interface description word of a control in the current display interface, the display device responds to the voice control instruction and opens/selects the target control indicated by the voice control text in the current display interface.
  • a voice interface control word list corresponding to each display interface in the application can be constructed during the first loading process after each application update. After receiving the user's voice control command, the display device can quickly determine, in the current display interface, the target control that the user requests to operate based on the voice interface control word list.
  • after the display device receives the user's voice control command, it matches the voice control text corresponding to the voice control command with the interface description words of each control in the voice interface control word list one by one. If a corresponding interface description word is matched, the target control corresponding to that interface description word is opened/selected in the current display interface.
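The matching described above can be illustrated with a minimal Python sketch. It is not the implementation in the application: the `Control` structure, the `normalize` helper, and the exact-match rule are assumptions made for illustration, and only the "activate VIP" and "search" words come from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Control:
    """A control in the display interface (hypothetical structure)."""
    control_id: str
    description_word: Optional[str]  # None for controls such as scroll controls


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def build_interface_word_list(controls: List[Control]) -> Dict[str, Control]:
    """Map each interface description word to its control; skip controls without one."""
    return {normalize(c.description_word): c for c in controls if c.description_word}


def match_control(word_list: Dict[str, Control], voice_text: str) -> Optional[Control]:
    """Compare the voice control text with each description word, one by one."""
    wanted = normalize(voice_text)
    for word, control in word_list.items():
        if word == wanted:
            return control
    return None


controls = [
    Control("A", "activate VIP"),
    Control("search", "search"),
    Control("B", None),  # a scroll control: no description word, so it is never matched
]
word_list = build_interface_word_list(controls)
```

Because control B carries no description word, it never enters the word list, which is the limitation that the scroll-specific word list is meant to address.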
  • in addition to the above-mentioned controls carrying interface description words, the display interface also includes some controls without interface description words, such as scroll controls.
  • the display interface containing scroll controls in the display device may include at least one of a home page, a web page, and a user interface within an application, and some embodiments of the present application do not impose any restrictions on this.
  • the scroll control is used to implement sliding and page turning operations. Since the scroll control has no interface description words and the scrolling direction is not unique, it is difficult to control it by voice by building a voice interface control word list and matching interface description words.
  • the display interface includes multiple controls such as search, time, VIP, footprints, news, history, selections, multimedia resources (referred to as media resources), etc.
  • the user can select a target control from the controls displayed in the display interface through the above-mentioned control device 100, the above-mentioned smart device 300 or voice control instructions and other control methods to open the content display interface of the target control.
  • the user inputs the voice control command of "activate VIP" to the display device.
  • after the display device receives the voice control command, it traverses the interface description words of each control in the voice interface control word list of the display interface according to the voice control text corresponding to the voice control command.
  • after determining that the control corresponding to the interface description word "activate VIP" is control A, it operates control A to open the content display interface for activating VIP, so that the user can further perform related operations for activating VIP rights in the content display interface.
  • the B control can slide up or down and the C control can slide left or right, but both are displayed only as a sliding bar in the display interface and have no corresponding interface description words. Therefore, it is difficult to determine from the user's voice control instruction whether the target scroll control to be operated in the display interface is the B control or the C control.
  • the present application provides a voice control method to identify a scroll control in the current display interface and construct a voice scroll control word list of the scroll control in the current display interface, so that after receiving the user's voice control instruction, the display device 200 can determine the target scroll control to be operated in the current display interface according to the voice scroll control word list and the voice control text corresponding to the voice control instruction, and control the target scroll control to perform a scroll operation in the scroll direction indicated by the voice control text. In this way, voice control of the scroll control in the display device is realized.
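As a rough illustration of how a voice scroll control word list could relate semantic control words to scroll directions and scroll controls, consider the following Python sketch. The data layout, the `SEMANTIC_WORDS` phrases, and the rule of returning nothing when more than one control could serve a phrase are all assumptions; the text states only that the word list records the correspondence between scroll direction and semantic control word.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet, List, Optional, Tuple


class Direction(Enum):
    UP = "up"
    DOWN = "down"
    LEFT = "left"
    RIGHT = "right"


@dataclass(frozen=True)
class ScrollControl:
    control_id: str
    directions: FrozenSet[Direction]  # directions this control can scroll in


# Hypothetical semantic control words; the source does not enumerate them.
SEMANTIC_WORDS: Dict[Direction, Tuple[str, ...]] = {
    Direction.UP: ("scroll up", "slide up"),
    Direction.DOWN: ("scroll down", "slide down"),
    Direction.LEFT: ("scroll left", "slide left"),
    Direction.RIGHT: ("scroll right", "slide right"),
}

WordList = Dict[str, List[Tuple[ScrollControl, Direction]]]


def build_scroll_word_list(controls: List[ScrollControl]) -> WordList:
    """Map each semantic control word to the (control, direction) pairs it can trigger."""
    word_list: WordList = {}
    for control in controls:
        for direction in control.directions:
            for word in SEMANTIC_WORDS[direction]:
                word_list.setdefault(word, []).append((control, direction))
    return word_list


def resolve_scroll(word_list: WordList,
                   voice_text: str) -> Optional[Tuple[ScrollControl, Direction]]:
    """Return the target control and direction, or None if there is no unique target."""
    key = " ".join(voice_text.lower().split())
    candidates = word_list.get(key, [])
    return candidates[0] if len(candidates) == 1 else None


# The B control scrolls vertically and the C control horizontally, as in the text.
control_b = ScrollControl("B", frozenset({Direction.UP, Direction.DOWN}))
control_c = ScrollControl("C", frozenset({Direction.LEFT, Direction.RIGHT}))
word_list = build_scroll_word_list([control_b, control_c])
```

With these two controls, "scroll down" resolves to control B and "scroll left" to control C. How to choose among several controls that share a direction is deliberately left out of this sketch.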
  • the controller 250 in the display device 200 is configured to perform the following steps:
  • Step 510 Determine at least one scroll control included in the current display interface.
  • the current display interface may contain no scroll control, one scroll control, or multiple scroll controls. If the current display interface has no scroll control, the user can directly voice-control each control in the current display interface through its interface description word to perform operations such as opening/selecting.
  • if the current display interface contains a scroll control, the technical solution provided in the present application is executed: at least one scroll control contained in the current display interface is determined, and a voice scroll control word list is constructed for the scroll control in the current display interface, so that the scroll control in the current display interface can be voice-controlled to perform scrolling operations.
  • the current display interface in some embodiments of the present application includes at least one of a home page, a web page, and a user interface within an application.
  • when executing the above step 510, the controller is further configured to: after the display device is powered on, obtain the scroll control contained in the home page, and implement voice control of the scroll control in the home page to perform a scrolling operation through the following steps 520-530.
  • when executing the above step 510, the controller is further configured to: when a jump operation based on a web page link is detected, use web crawler technology to obtain web page related information of the jump web page, and obtain the web page content of the web page by parsing the web page related information, thereby determining whether there is a scroll control in the web page based on the web page content.
  • the total height of all content items in the webpage is determined based on the webpage content. If the total height of the content items is greater than the window height value of the display interface, there is a scroll control in the webpage, and the scrolling direction of the scroll control is vertical scrolling; if the total height of the content items is not greater than the window height value of the display interface, there is no scroll control in the webpage.
  • the scroll control in the webpage can be voice-controlled to perform a scrolling operation through the following steps 520 to 530.
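The height comparison used for web pages can be written as a short Python sketch. The function name, the pixel unit, and the use of `None` to mean "no scroll control" are assumptions chosen for illustration.

```python
from typing import Optional, Sequence


def detect_web_scroll_direction(item_heights: Sequence[int],
                                window_height: int) -> Optional[str]:
    """Return "vertical" if the page needs a scroll control, otherwise None.

    item_heights holds the height of every content item parsed from the page;
    window_height is the window height value of the display interface.
    """
    total_height = sum(item_heights)
    # A scroll control exists only when the content is strictly taller than the window.
    return "vertical" if total_height > window_height else None
```

For example, three content items of 400, 500, and 300 pixels in a 1080-pixel window total 1200 pixels, so the page is treated as containing a vertical scroll control.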
  • when executing the above step 510, the controller is further configured to: after the target application is started, monitor changes in the display interface of the target application; if the display interface changes, obtain control information of multiple controls contained in the current display interface after the change; and determine at least one scroll control contained in the current display interface based on the control information.
  • the target application can be a system application that comes with the display device 200, an application installed by the user in the display device 200, or a third-party application launched through a network link. Some embodiments of the present application do not limit the source and type of the target application.
  • the target application in the present application is any application triggered and started by the user in the display device 200 .
  • after the target application in the display device is started, it registers the Accessibility Service in the framework layer of the Android system.
  • the Accessibility Service can receive some events triggered by the Android system, such as notification status, view-related events, fingerprints, button clicks (touch), etc.
  • the accessibility service will monitor the changes in the display interface of the target application in real time.
  • the callback interface of the accessibility service will receive the interface change notification.
  • the root view node (Node Root) control of the current display interface after the change, as well as the sub-view node controls corresponding to each root view node control can be obtained, and finally all the controls contained in the current display interface can be obtained.
  • Each control has control information such as type and function description information stored in the display device.
  • the control information of each control is stored in the form of a structure tree. Therefore, based on the above characteristics, the control information of all controls in the current display interface can be obtained through the structure tree of the current display interface.
  • a control node list (Node List) corresponding to the current display interface can be generated based on the control information of all controls contained in the current display interface.
  • control node list includes controls with interface description words in the current display interface and scroll controls without interface description words.
  • control node list can be traversed to determine the type of each control in the control node list to determine the scroll control node list (Scroll Node List) corresponding to the current display interface.
  • the corresponding interface (such as the isScroll interface) is called through the accessibility service to determine whether the control corresponding to each node in the structure tree of the current display interface supports scrolling. If scrolling is supported, the control is determined to be a scroll control and is added to the scroll control node list.
  • the corresponding relationship between the display interface and the scroll control node list can be stored after the scroll control corresponding to the display interface is determined for the first time.
  • the scroll control included in the display interface can be quickly determined according to the scroll control node list corresponding to the display interface.
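The traversal and caching described above might look roughly like the following sketch. It does not use the Android accessibility API; `ControlNode`, its `scrollable` flag (standing in for the scroll-support query the text mentions), and the cache keyed by an interface identifier are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ControlNode:
    """Hypothetical stand-in for one node of the interface's structure tree."""
    name: str
    scrollable: bool = False
    description: Optional[str] = None  # interface description word, if any
    children: List["ControlNode"] = field(default_factory=list)


def collect_scroll_controls(root: ControlNode) -> List[ControlNode]:
    """Walk the tree depth-first and keep every node that supports scrolling."""
    found: List[ControlNode] = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.scrollable:
            found.append(node)
        # Push children reversed so they are visited in their original order.
        stack.extend(reversed(node.children))
    return found


# Assumed cache: interface identifier -> scroll control node list.
_scroll_cache: Dict[str, List[ControlNode]] = {}


def scroll_controls_for(interface_id: str, root: ControlNode) -> List[ControlNode]:
    """Return the cached scroll control list, building it on first use."""
    if interface_id not in _scroll_cache:
        _scroll_cache[interface_id] = collect_scroll_controls(root)
    return _scroll_cache[interface_id]
```

The cache mirrors the idea of storing the correspondence between a display interface and its scroll control node list so that later visits skip the traversal. A real implementation would also need to invalidate the entry when the interface layout changes.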
  • Step 520 Construct a voice scrolling control word list for the scrolling control in the current display interface.
  • the voice scroll control word list is used to indicate the correspondence between the scroll direction of the scroll control and the semantic control words.
  • the controller when executing the above step 520, is further configured to: obtain the display positions of multiple sub-controls controlled by the scrolling control in the current display interface; determine the scrolling direction of the scrolling control according to the display position of each sub-control; and construct a voice scrolling control word list for the scrolling control in the current display interface according to the scrolling direction.
  • the display position of the subcontrol may be the display area coordinates of the subcontrol in the current display interface.
  • a two-dimensional rectangular coordinate system is established with the upper left corner or the center point of the current display interface as the origin. Based on this coordinate system, the display position of each control (scroll control, sub-control of the scroll control, and other controls carrying interface description words) in the current display interface is determined.
  • the corresponding interface (e.g., getBounds interface) is called through the accessibility service to obtain the display position of each control in the current display interface.
  • the display position may include the boundary coordinates of the control, such as the upper boundary coordinates, the lower boundary coordinates, the left boundary coordinates, and the right boundary coordinates.
  • the display height and display width of each sub-control are determined according to the display area coordinates of each sub-control in the current display interface. Then, according to the display height and display width, the scroll direction of the scroll control to which each sub-control belongs is determined.
  • the controller when determining the scrolling direction of the scrolling control, is further configured to:
  • the scrolling direction of the scrolling control is determined to be horizontal scrolling
  • the display height is the difference between the upper and lower boundary coordinates of the control.
  • when the display heights of the sub-controls are the same, the multiple sub-controls are horizontally distributed in the current display interface.
  • the scrolling direction of the scroll control to which the multiple sub-controls belong is controlled to be horizontal scrolling, so that the corresponding sub-controls can be browsed/selected/opened by sliding left and right/turning pages.
  • the fitness recommendation item in the display interface is a scroll control
  • the scroll control can control the sliding of sub-controls corresponding to multiple fitness items.
  • the fitness recommendation items display icons of sub-controls such as finding courses, finding plans, fat loss, body shaping training, yoga, AI games, and screen casting fitness, as well as icons of other sub-controls provided by the fitness application but not displayed in the current display interface, such as icons corresponding to sub-controls such as dance and aerobics.
  • FIG6 only uses a triangular icon to represent the scroll control corresponding to the fitness recommendation item; of course, the scroll control can also be displayed in other forms, and some embodiments of the present application do not limit this.
  • the scrolling direction of the scrolling control corresponding to the fitness recommendation item is horizontal scrolling.
  • the display width is the difference between the left and right border coordinates of the control.
  • the scrolling direction of the scroll control to which the multiple sub-controls belong is controlled to be vertical scrolling, so that the corresponding sub-controls can be browsed/selected/opened by sliding up and down/turning pages.
  • the playlist in the display interface is a scroll control
  • the scroll control can control the sliding of sub-controls corresponding to multiple songs.
  • the playlist displays tabs for songs 01-09, as well as tabs for other sub-controls included in the playlist but not displayed in the current display interface, such as tabs corresponding to songs 10-121.
  • FIG7 only uses the arrow shown on the right to represent the scroll control corresponding to the playlist; of course, the scroll control can also be displayed in other forms, and some embodiments of the present application do not limit this.
  • the scrolling direction of the scroll control corresponding to the playlist is vertical scrolling.
  • the scrolling direction is waterfall scrolling.
  • the multiple sub-controls are distributed in a waterfall flow manner in the current display interface, and the scrolling direction of the scroll control to which the multiple sub-controls belong is controlled to be waterfall flow scrolling, so that multiple sub-controls can be loaded by sliding up and down/turning pages, so that the user can browse/select/open the sub-controls of interest.
  • a user interface of a picture viewing application is taken as an example, in which a scroll control is included, and the scroll control can control pictures corresponding to multiple sub-controls, and these sub-controls display corresponding pictures in a waterfall flow manner in the display interface.
  • the display interface includes pictures 1 to 8, pictures corresponding to video 9, and pictures corresponding to other sub-controls that are not displayed in the display interface.
  • FIG8 only uses the scroll bar on the right to represent the scroll control in the display page; of course, the scroll control can also be displayed in other forms, and some embodiments of the present application do not limit this.
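The direction rules above (equal display heights suggest horizontal scrolling, equal display widths suggest vertical scrolling, anything else is treated as a waterfall layout) can be sketched as follows. The `Bounds` type and the `tolerance` parameter are hypothetical. The text does not say what happens when both heights and widths are equal, as in a list of identical tabs, so the sketch adds an assumed tie-break based on how the sub-controls are aligned.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Bounds:
    """Boundary coordinates of one sub-control (origin at the top-left corner)."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def width(self) -> float:
        return self.right - self.left

    @property
    def height(self) -> float:
        return self.bottom - self.top


def _spread(values: List[float]) -> float:
    """Difference between the largest and smallest value."""
    return max(values) - min(values)


def scroll_direction(children: List[Bounds], tolerance: float = 0.0) -> str:
    """Infer 'horizontal', 'vertical' or 'waterfall' from sub-control bounds."""
    if not children:
        raise ValueError("a scroll control needs at least one sub-control")
    same_height = _spread([c.height for c in children]) <= tolerance
    same_width = _spread([c.width for c in children]) <= tolerance
    if same_height and same_width:
        # Assumed tie-break (not in the text): use alignment of the items.
        if _spread([c.top for c in children]) <= tolerance:
            return "horizontal"
        if _spread([c.left for c in children]) <= tolerance:
            return "vertical"
        return "waterfall"
    if same_height:
        return "horizontal"
    if same_width:
        return "vertical"
    return "waterfall"
```

For example, a row of icons with equal heights but different widths is classified as horizontal, while a playlist of identical tabs stacked on top of each other is classified as vertical through the tie-break.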
  • the controller when constructing a voice scrolling control word list for a scrolling control in a current display interface, is further configured to: set semantic control words corresponding to the scrolling control according to the scrolling direction of the scrolling control; and then generate a voice scrolling control word list for the scrolling control in the current display interface according to the scrolling direction and semantic control words of the scrolling control in the current display interface.
  • semantic control words include sliding semantic words and page turning semantic words.
  • the set sliding semantic words include “slide left” and “slide right”; for a scroll control with vertical scrolling direction, the set sliding semantic words include “slide up” and “slide down”; for a scroll control with waterfall scrolling direction, the set sliding semantic words include “slide up”, “slide down”, “slide left” and “slide right”.
  • page turning semantic words include “previous page” and “next page”.
  • the scroll controls in the current display interface include M controls, N controls, and P controls
  • the scroll direction of the M control is horizontal scrolling
  • the scroll direction of the N control is vertical scrolling
  • the scroll direction of the P control is waterfall scrolling.
  • Table 1 provides an exemplary voice scroll control word list.
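As a rough illustration of how such a word list could be assembled for the M, N and P controls, the sketch below maps each control's scrolling direction to the sliding words listed above and appends the page turning words. The dictionary layout and the names `SLIDE_WORDS`, `PAGE_WORDS` and `build_word_list` are assumptions, and applying the page turning words to every direction is likewise an assumption. The sketch does not reproduce the actual layout of Table 1.

```python
from typing import Dict, List

# Sliding semantic words per scrolling direction, as listed in the text.
SLIDE_WORDS: Dict[str, List[str]] = {
    "horizontal": ["slide left", "slide right"],
    "vertical": ["slide up", "slide down"],
    "waterfall": ["slide up", "slide down", "slide left", "slide right"],
}

# Page turning semantic words (assumed here to apply to every direction).
PAGE_WORDS: List[str] = ["previous page", "next page"]


def build_word_list(directions: Dict[str, str]) -> Dict[str, Dict[str, object]]:
    """Map each scroll control name to its direction and semantic control words."""
    word_list: Dict[str, Dict[str, object]] = {}
    for control, direction in directions.items():
        word_list[control] = {
            "direction": direction,
            "words": SLIDE_WORDS[direction] + PAGE_WORDS,
        }
    return word_list
```

Calling `build_word_list({"M": "horizontal", "N": "vertical", "P": "waterfall"})` yields one entry per control, each carrying its direction and the words that can address it.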
  • Step 530 In response to the user's voice control instruction, based on the voice scrolling control word list and the voice control text corresponding to the voice control instruction, the target scrolling control is controlled to perform a scrolling operation in the scrolling direction indicated by the voice control text.
  • the target scroll control is a scroll control in the scroll controls included in the current display interface that corresponds to the voice-controlled text.
  • the controller when implementing the above step 530, is further configured to perform the following sub-steps:
  • Step 531 Obtain the voice control text corresponding to the voice control instruction.
  • the display device can establish a communication connection between the controller and the cloud server through the communicator, so as to parse the voice control instructions through the cloud server, thereby reducing the data processing volume and algorithm storage resource consumption of the display device.
  • the controller when implementing the above step 531, is further configured to: send the voice control instruction to the cloud server to request the cloud server to parse and process the voice control instruction; and receive the voice control text corresponding to the voice control instruction sent by the cloud server.
  • some embodiments of the present application store the voice analysis algorithm in a cloud server. After the cloud server receives the voice control instruction sent by the display device, it analyzes the voice control instruction and obtains the voice control text corresponding to the voice control instruction.
  • if the cloud server parses the voice control instruction and determines that it is an invalid instruction (for example, no relevant control word is parsed from it), the cloud server sends abnormal-result feedback to the display device, instructing the display device not to perform a control operation on any control for the time being and to continue detecting voice control instructions input by the user.
  • Step 532 Match the voice control text with the semantic control words in the voice scroll control word list to determine the target scroll control that the voice control text requests to control in the current display interface, and the target scroll direction of the target scroll control.
  • the scrolling control is directly determined as the target scrolling control, and the target scrolling control is controlled to perform a scrolling operation in the target scrolling direction according to the voice control text, that is, step 533.
  • the controller when executing the above step 532, is further configured to: obtain the control priority of each candidate scrolling control; and determine the target scrolling control from the multiple candidate scrolling controls according to the control priority.
  • control priority of each candidate scrolling control is determined according to the management and control relationship between each candidate scrolling control and the display position of each candidate scrolling control in the current display interface.
  • the priority of the scroll control in the parent node is higher than the priority of the scroll control in the child node.
  • the priority of the scroll control whose display position is on the upper left side of the current display interface is higher than the priority of the scroll control whose display position is on the lower right side of the current display interface.
  • the display interface includes multiple scroll controls: W control, X control, Y control and Z control.
  • the W control can control the X control, the Y control and the Z control, and the priority of the W control is higher than that of the X control, the Y control and the Z control.
  • the control priorities of the X control, the Y control, and the Z control can be further determined according to their display positions in the display interface.
  • the X control is at the top of the display interface and has the highest priority; the Z control is at the bottom of the display interface and has the lowest priority. Therefore, the control priorities of the X control, the Y control, and the Z control are: control X>control Y>control Z.
  • control priority of the four scroll controls in FIG10 is: control W> control X> control Y> control Z.
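The priority rule illustrated by the W, X, Y and Z controls can be sketched as a sort. The `Candidate` type is hypothetical: `depth` stands for how deeply a scroll control is nested (a parent has a smaller depth than the controls it manages), and ordering by `top` and then `left` is one assumed reading of "upper left before lower right".

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate scroll control with the data needed for ranking."""
    name: str
    depth: int    # 0 for the outermost scroll control, larger when nested deeper
    top: float    # upper boundary coordinate in the display interface
    left: float   # left boundary coordinate in the display interface


def rank_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates from highest to lowest control priority."""
    # Parents first, then upper positions, then positions further to the left.
    return sorted(candidates, key=lambda c: (c.depth, c.top, c.left))


def pick_target(candidates: List[Candidate]) -> Candidate:
    """Return the candidate with the highest control priority."""
    if not candidates:
        raise ValueError("no candidate scroll controls")
    return rank_candidates(candidates)[0]
```

With W as the parent and X, Y, Z stacked from top to bottom, the ranking reproduces the order W, X, Y, Z given in the text.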
  • a target scroll control can be determined to perform a scroll operation according to the voice control instruction and the voice scroll control word list.
  • the controller can determine the control that the user requests to control in the current display interface in the order of "voice interface control word list first, then voice scrolling control word list"; the controller can also determine the control that the user requests to control in the current display interface in the order of "voice scrolling control word list first, then voice interface control word list", and some embodiments of the present application do not limit this.
  • the display device can first determine, based on the voice scroll control word list, whether the voice control text corresponding to the voice control instruction has a target scroll control requested to be controlled in the current display interface.
  • the voice control text is then used to determine whether the voice interface control word list of the current display interface contains a target control that the user requests to control, wherein the interface description word of the target control in the voice interface control word list is the same as the voice control text.
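One of the two lookup orders described above, "voice scrolling control word list first, then voice interface control word list", can be sketched as follows. The function name `resolve`, the plain-dictionary word lists, and the case-insensitive exact match are assumptions made for illustration; the application does not specify how text is normalized or compared.

```python
from typing import Dict, List, Optional, Tuple


def resolve(
    text: str,
    scroll_words: Dict[str, List[str]],
    interface_words: Dict[str, str],
) -> Optional[Tuple[str, List[str]]]:
    """Find which controls a voice control text addresses.

    Returns ("scroll", names) when the text matches a semantic control word,
    ("interface", names) when it equals an interface description word,
    and None when nothing matches.
    """
    normalized = text.strip().lower()
    # Scroll word list first: several scroll controls may share a word.
    candidates = [name for name, words in scroll_words.items() if normalized in words]
    if candidates:
        return ("scroll", candidates)
    # Fall back to exact matching against interface description words.
    matches = [name for name, word in interface_words.items() if word.lower() == normalized]
    if matches:
        return ("interface", matches)
    return None
```

When more than one scroll control is returned, a separate step (such as ranking by control priority) would still be needed to choose the target.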
  • Step 533 According to the voice control text, control the target scroll control to perform a scroll operation in the target scroll direction.
  • the controller when implementing the above step 533, is further configured to: obtain a target scrolling distance of the target scrolling control according to the voice control text; and control the target scrolling control to perform a scrolling operation in the target scrolling direction according to the target scrolling distance.
  • the voice control text can include sliding voice text and page turning voice text.
  • the scrolling distance may be a preset distance value.
  • the distance value may be set according to the display height, display width, display position interval, etc. of the multiple sub-controls controlled by the scroll control, and some embodiments of the present application do not limit this.
  • the scrolling distance of the scroll control can be the display position interval between the "Find Course" icon and the "Find Plan" icon; see Figure 7, the scrolling distance of the scroll control can be the display height of the tab of Song 01, that is, d; see Figure 8, the scrolling distance of the scroll control can be the display width or display height of any picture in the display interface.
  • the controller when calculating the scrolling distance, is further configured as follows: if the scrolling direction is horizontal scrolling, the scrolling distance is calculated based on the display position of the rightmost sub-control and the display position of the leftmost sub-control controlled by the target scrolling control; if the scrolling direction is vertical scrolling or waterfall scrolling, the scrolling distance is calculated based on the display position of the topmost sub-control and the display position of the bottommost sub-control controlled by the target scrolling control.
  • the scrolling distance of the scroll control can be the total display width value between the "Find Course" icon and the "Cast Screen Training" icon; see Figure 7, the scrolling distance of the scroll control can be the total display height value between the tabs of songs 01 to 09, that is, 9*d; referring to FIG8, the scrolling distance of the scroll control can be the total display height value or the total display width value of multiple pictures in the current display interface.
  • the scrolling distance can be the total display height value of Picture 2 and Picture 6 in the display interface.
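The two ways of obtaining a scrolling distance can be sketched together. The pairing used here is an interpretation rather than something the text states outright: a sliding command moves by the size of one sub-control, and a page turning command moves by the span between the outermost sub-controls. `Bounds` and `scroll_distance` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Bounds:
    """Boundary coordinates of one sub-control (origin at the top-left corner)."""
    left: float
    top: float
    right: float
    bottom: float


def scroll_distance(children: List[Bounds], direction: str, page_turn: bool) -> float:
    """Distance to scroll for one voice command.

    direction is 'horizontal', 'vertical' or 'waterfall'; vertical and
    waterfall scrolling are both measured along the vertical axis.
    """
    if not children:
        raise ValueError("a scroll control needs at least one sub-control")
    horizontal = direction == "horizontal"
    if page_turn:
        # Span between the outermost visible sub-controls.
        if horizontal:
            return max(c.right for c in children) - min(c.left for c in children)
        return max(c.bottom for c in children) - min(c.top for c in children)
    # Sliding: assumed step of one sub-control (width or height of the first).
    first = children[0]
    return (first.right - first.left) if horizontal else (first.bottom - first.top)
```

For a playlist of nine tabs of height d, the page turning distance comes out as 9*d and the sliding distance as d, in line with the Figure 7 example.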
  • the controller in the display device provided by the present application is connected to the display and the sound collector respectively.
  • the controller determines at least one scroll control included in the current display interface and constructs a voice scroll control word list of the scroll control in the current display interface.
  • the voice scroll control word list is used to represent the corresponding relationship between the scroll direction of the scroll control and the semantic control word.
  • the voice control instruction is sent to the controller, so that the controller responds to the voice control instruction, based on the pre-constructed voice scroll control word list and the voice control text corresponding to the voice control instruction, and controls the target scroll control corresponding to the voice control instruction in the current display interface to perform a scroll operation.
  • the target scroll control in the current display interface can be controlled to perform a scroll operation according to the voice control instruction, which overcomes the defect that the scroll control cannot be directly controlled by voice, realizes the full voice control of each control in the display device, improves the flexibility and convenience of the display device control method, and improves the user experience.
  • the steps in the flowcharts involved in the above embodiments may include multiple sub-steps or stages. These sub-steps or stages are not necessarily executed at the same time and may be executed at different times; nor are they necessarily executed sequentially, and they may be executed in turn or alternately with other steps, or with at least a portion of the sub-steps or stages of other steps.
  • the present application also provides a voice control method, which can be applied to the above-mentioned display device 200 or other electronic devices, and the method includes:
  • the voice scroll control word list is used to indicate the correspondence between the scroll direction of the scroll control and the semantic control word;
  • in response to the user's voice control instruction, based on the voice scrolling control word list and the voice control text corresponding to the voice control instruction, the target scrolling control is controlled to perform a scrolling operation in the scrolling direction indicated by the voice control text; the target scrolling control is a scrolling control corresponding to the voice control instruction among the scrolling controls contained in the current display interface.
  • the voice control logic of the scrolling control is as follows: after the target application is started, an accessibility service that can monitor page changes is registered, and changes in the display page are monitored through the accessibility service.
  • when the controls in the display page change, that is, when the display page changes, the root view node control in the current display interface and the sub-view node controls controlled by the root view node control are obtained according to the structure tree of the controls in the current display interface, and a control node list corresponding to the current display interface is generated according to the root view node control and the sub-view node controls.
  • control node list of the current display interface is traversed to determine at least one scroll control included in the current display interface, and a scroll control node list corresponding to the current display interface is generated.
  • the display positions of multiple sub-controls controlled by the scroll control are obtained, such as the upper boundary coordinates, lower boundary coordinates, left boundary coordinates and right boundary coordinates of each control; according to the display positions of the sub-controls controlled by the scroll control, the scroll direction of the scroll control is determined, and the semantic control words corresponding to the scroll control are set; according to the scrolling direction and semantic control words of each scrolling control, a voice scrolling control word list of the scrolling controls in the current display interface is generated.
  • when the display device receives a voice control command from the user, it traverses the voice scroll control word list according to the voice control text corresponding to the voice control command, determines the target scroll control corresponding to the voice control text, and then controls the target scroll control to perform a scroll operation.
  • the target scroll control in the current display interface is controlled to perform a scrolling operation according to the control priority.
  • the present application further provides a computer-readable storage medium, wherein the computer-readable storage medium may store a computer program, which, when called and executed by a controller in a display device or other electronic device, implements some or all steps of the voice control method provided in the present application.
  • the computer-readable storage medium may be a magnetic disk, an optical disk, a read-only storage memory, a random access memory, or the like.
  • the present application further provides a computer program product, wherein the computer program product includes a computer program, which, when called and executed by a controller in a display device or other electronic device, can implement some or all steps of the voice control method provided in the present application.

Abstract

A display device and a speech control method. The display device comprises a monitor (260), a sound collector and a controller (250), which is respectively connected to the monitor (260) and the sound collector, wherein the sound collector is configured to collect a speech control instruction of a user; and the controller (250) is configured to: determine at least one scrolling control which is comprised in the current display interface; construct a speech scrolling control word list of the scrolling control in the current display interface, wherein the speech scrolling control word list is used for representing the correlation between a scrolling direction of the scrolling control and a semantic control word; and in response to the speech control instruction of the user, on the basis of the speech scrolling control word list and speech control text corresponding to the speech control instruction, control a target scrolling control in the current display interface to execute a scrolling operation.

Description

Display Device and Voice Control Method

Cross-Reference to Related Applications

This application claims priority to Chinese patent application No. 202310155184.8, filed on February 22, 2023, the entire contents of which are incorporated by reference into this application.

Technical Field

The present application relates to the field of voice control technology, and in particular to a display device and a voice control method.

Background Art

With the development of voice control technology, voice functions have also been integrated into various display devices (such as smart TVs), and users can jump to or open a specified application through voice commands.

Typically, the display interface of a display device maintains multiple controls that can respond to voice control instructions, and each control can expose a corresponding interface description word. After receiving a user's voice control instruction, the display device matches the voice control text corresponding to the instruction against the interface description words of the controls in the current display interface one by one. If a matching interface description word is found, the control corresponding to that interface description word is controlled to perform the related operation. For example, if a resource library control is displayed in the current display interface, then in response to the voice control instruction "open resource library", the display device can automatically select the resource library control and open the resource library to display a variety of multimedia resources for the user to choose from.

However, some display interfaces contain controls without interface description words, so these controls cannot be voice-controlled by the interface description word matching described above.

Summary of the Invention

The present application provides a display device and a voice control method, which can effectively control controls without interface description words in a display interface.

In a first aspect, the present application provides a display device, including a display, a sound collector, and a controller connected to the display and the sound collector respectively, wherein:

the display is configured to display an image screen and a user interface;

the sound collector is configured to collect a user's voice control instruction;

the controller is configured to:

determine at least one scroll control included in the current display interface;

construct a voice scroll control word list for the scroll control in the current display interface, the voice scroll control word list being used to indicate the correspondence between the scroll direction of the scroll control and the semantic control words;

in response to the user's voice control instruction, based on the voice scroll control word list and the voice control text corresponding to the voice control instruction, control the target scroll control to perform a scroll operation in the scroll direction indicated by the voice control text, the target scroll control being the scroll control, among the scroll controls contained in the current display interface, that corresponds to the voice control text.

In a second aspect, the present application provides a voice control method, comprising:

determining at least one scroll control included in the current display interface;

constructing a voice scroll control word list for the scroll control in the current display interface, the voice scroll control word list being used to indicate the correspondence between the scroll direction of the scroll control and the semantic control words;

in response to the user's voice control instruction, based on the voice scroll control word list and the voice control text corresponding to the voice control instruction, controlling the target scroll control to perform a scroll operation in the scroll direction indicated by the voice control text, the target scroll control being the scroll control, among the scroll controls contained in the current display interface, that corresponds to the voice control text.

In a third aspect, the present application also provides a computer-readable storage medium storing a computer program. When the computer program is executed by a controller in a display device, it can implement some or all of the steps of the voice control method provided in the present application.

In a fourth aspect, the present application also provides a computer program product including a computer program. When the computer program is executed by a controller in a display device, it can implement some or all of the steps of the voice control method provided in the present application.

Brief Description of the Drawings

In order to more clearly illustrate some embodiments of the present application or implementations in the related art, the drawings needed for describing the embodiments or the related art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present application, and a person of ordinary skill in the art can also obtain other drawings based on these drawings.

FIG. 1 shows an operation scenario between a display device and a control device according to some embodiments of the present application;

FIG. 2 is a hardware configuration block diagram of a display device according to some embodiments of the present application;

FIG. 3 is a software configuration block diagram of a display device according to some embodiments of the present application;

FIG. 4 is a schematic diagram of the control distribution of a display interface according to some embodiments of the present application;

FIG. 5 is a schematic flowchart of a display device executing a voice control method according to some embodiments of the present application;

FIG. 6 is a schematic diagram of horizontal scrolling of a scroll control according to some embodiments of the present application;

FIG. 7 is a schematic diagram of vertical scrolling of a scroll control according to some embodiments of the present application;

FIG. 8 is a schematic diagram of waterfall scrolling of a scroll control according to some embodiments of the present application;

FIG. 9 is a schematic flowchart of voice-controlling a scroll control to perform a scroll operation according to some embodiments of the present application;

FIG. 10 is a schematic diagram of the distribution of multiple scroll controls in a display interface according to some embodiments of the present application;

FIG. 11 is a schematic diagram of the voice control logic of a scroll control in a display interface according to some embodiments of the present application.

DETAILED DESCRIPTION

To make the purpose and implementations of the present application clearer, the exemplary implementations of the present application are described clearly and completely below with reference to the drawings of the exemplary embodiments. The described exemplary embodiments are only some, not all, of the embodiments of the technical solution of the present application.

It should be noted that the brief explanations of terms in the present application are provided only to make the implementations described below easier to understand, and are not intended to limit those implementations. Unless otherwise specified, these terms should be understood according to their ordinary and customary meanings.

The terms "including" and "having", and their variants, as used in the specification and claims of the present application, are intended to cover a non-exclusive inclusion. For example, a product or device that includes a series of components is not necessarily limited to the components expressly listed, and may include other components that are not expressly listed or that are inherent to the product or device.

In the implementations of the present application, the display device can take various forms. For example, it can be a television, a smart TV, a laser projection device, a monitor, an electronic whiteboard (electronic bulletin board), or an electronic desktop (electronic table), and it can also be a personal computer, a laptop computer, a smartphone, a tablet computer, a portable wearable device, or the like.

The display device can be controlled by a control apparatus, an intelligent control device, voice, motion, gestures, trigger actions, and the like.

Referring to FIG. 1, in an exemplary control operation scenario between a display device and a control apparatus, a user may operate the display device 200 through a smart device 300 or a control apparatus 100.

In some embodiments, the control apparatus 100 may be a remote controller. Communication between the remote controller and the display device includes infrared protocol communication, Bluetooth protocol communication, and other short-range communication methods, and the display device 200 is controlled wirelessly or by wire. The user can control the display device 200 by entering user commands through buttons on the remote controller, voice input, control panel input, and the like.

In some embodiments, a smart device 300 (such as a mobile terminal, a tablet computer, a computer, or a laptop computer) may also be used to control the display device 200, for example through an application running on the smart device.

In some embodiments, the display device 200 may receive user control through touch or gestures rather than receiving instructions through the smart device 300 or the control apparatus 100.

In some embodiments, the display device 200 can also be controlled by means other than the control apparatus 100 and the smart device 300. For example, the user's voice commands can be received directly through a voice acquisition module configured inside the display device 200, or through a voice control device provided outside the display device 200.

In some embodiments, the display device 200 also exchanges data with a server 400. The display device 200 may be connected for communication through a local area network (LAN), a wireless local area network (WLAN), or another network.

The server 400 can provide various content and interactions to the display device 200.

As an example, the server 400 can be a standalone server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (CDN), and big data and artificial intelligence platforms.

Referring to FIG. 2, the display device 200 may include at least one of a tuner-demodulator 210, a communicator 220, a detector 230, an external device interface 240, a controller 250, a display 260, an audio output interface 270, a memory 280, a power supply, and a user interface.

In some embodiments, the tuner-demodulator 210 receives broadcast television signals by wired or wireless reception and demodulates audio and video signals, for example Electronic Program Guide (EPG) data signals, from multiple wireless or wired broadcast television signals.

In some embodiments, the tuner-demodulator 210 and the controller 250 may be located in separate devices. That is, the tuner-demodulator 210 may be located in a device external to the main device that houses the controller 250, such as an external set-top box.

In some embodiments, the communicator 220 is a component for communicating with an external device or the server 400 according to various communication protocol types. For example, the communicator may include at least one of a Wi-Fi module, a Bluetooth module, a wired Ethernet module, another network communication protocol chip or near-field communication protocol chip, and an infrared receiver.

In some embodiments, the display device 200 can send and receive control signals and data signals to and from the external control apparatus 100 or the server 400 through the communicator 220.

For example, the display device can communicate with a cloud server through the communicator, send the cloud server the data to be processed, and obtain the data after the cloud server has processed it.

In some embodiments, the detector 230 is used to collect signals from the external environment or from interaction between the display device 200 and the outside. For example, the detector 230 includes an image collector, such as a camera, which can be used to capture the external scene, user attributes, or user interaction gestures. Alternatively, the detector 230 includes a sound collector, such as a microphone, which is used to capture external sound.

In the voice-controlled interaction scenario of the display device described in the present application, the sound collector can be used to capture the user's voice control instructions.

In some embodiments, the external device interface 240 may include, but is not limited to, any one or more of the following: a High Definition Multimedia Interface (HDMI), an analog or digital high-definition component input interface (component), a composite video input interface (Composite Video Broadcast Signal, CVBS), a Universal Serial Bus (USB) input interface, an RGB port, and the like. It may also be a composite input/output interface formed from several of these interfaces.

In some embodiments, the controller 250 includes at least one of a central processing unit (CPU), a video processor, an audio processor, a graphics processing unit (GPU), random access memory (RAM), read-only memory (ROM), a first interface to an nth interface for input/output, a communication bus, and the like.

The controller 250 controls the operation of the display device 200 and responds to user operations through various software control programs stored in the memory 280. The controller 250 controls the overall operation of the display device 200. For example, in response to receiving a user command that selects a UI object displayed on the display 260, the controller 250 can perform the operation associated with the selected object.

In the voice-controlled interaction scenario of the display device described in the present application, the controller is configured to determine at least one scroll control contained in the current display interface and to construct a voice scroll control word list for the scroll controls in the current display interface. After the sound collector receives the user's voice control instruction, the controller uses the voice scroll control word list and the voice control text corresponding to the voice control instruction to identify, among the scroll controls in the current display interface, the target scroll control that corresponds to the voice control text, and controls it to perform a scroll operation in the scroll direction indicated by the voice control text.

In some embodiments, the display 260 includes a display screen component for presenting images and a driving component for driving image display. The display 260 receives the image signal output by the controller 250 and displays video content, image content, menu control interface components, and the user control UI.

As an example, the display 260 may be a liquid crystal display, an OLED display, or a projection display, and may also be a projection device and a projection screen.

The user may enter a user command in a graphical user interface (GUI) displayed on the display 260, in which case the user interface receives the command through the GUI. Alternatively, the user may enter a user command by producing a specific sound or gesture, in which case the user interface receives the command by recognizing the sound or gesture through the detector.

In the voice-controlled interaction scenario of the display device described in the present application, the display 260 is configured to display images and a user interface.

A "user interface" is the medium through which an application or operating system interacts and exchanges information with the user. It converts information between its internal form and a form the user can accept.

As an example, the user interface may be an interface element such as an icon, a window, or a control displayed on the display of the display device 200.

Controls may include visible interface elements such as icons, buttons, menus, tabs, text boxes, dialog boxes, status bars, navigation bars, and widgets.

Referring to FIG. 3, taking a display device 200 that runs the Android software platform as an example, the display device 200 divides the Android operating system into four layers. From top to bottom these are the applications layer (the "application layer"), the application framework layer (the "framework layer"), the Android runtime and system library layer (the "system runtime library layer"), and the kernel layer.

In some embodiments, at least one application runs in the application layer. These applications may be programs that ship with the Android operating system, such as a window program, a system settings program, or a clock program, or they may be applications developed by third-party developers.

It should be understood that the applications in the application layer are not limited to the above examples and may include other applications.

The framework layer provides an application programming interface (API) and a programming framework for applications. The application framework layer includes some predefined functions.

In other words, the framework layer acts as a processing center that decides what actions the applications in the application layer take. Through the API, a running application can access resources in the Android system and obtain system services.

In some embodiments, the framework layer includes managers, content providers, a view system, and the like.

The managers include at least one of the following modules: an Activity Manager, used to interact with all activities running in the system; a Location Manager, used to give system services or applications access to system location services; a Package Manager, used to retrieve information about the application packages currently installed on the device; a Notification Manager, used to control the display and clearing of notification messages; and a Window Manager, used to manage the icons, windows, toolbars, wallpapers, and desktop widgets on the user interface.

In some embodiments, the Activity Manager manages the life cycle of each application and the usual navigation and back functions, such as exiting, opening, and going back within an application. The Window Manager manages all window programs, for example obtaining the display size, determining whether there is a status bar, locking the screen, capturing the screen, and controlling changes to the display window (such as shrinking, shaking, or distorting it).

In some embodiments, the system runtime library layer supports the layer above it (the framework layer). When the framework layer is used, the Android operating system runs the C/C++ libraries contained in the system runtime library layer to implement the functions the framework layer requires.

In some embodiments, the kernel layer is the layer between hardware and software. The kernel layer includes at least one of the following drivers: an audio driver, a display driver, a Bluetooth driver, a camera driver, a Wi-Fi driver, a USB driver, an HDMI driver, sensor drivers (such as for a fingerprint sensor, a temperature sensor, or a pressure sensor), and a power driver.

Having introduced the control interaction scenario, hardware structure, and software platform of the display device, the following describes in detail, based on the display device 200, how the display device executes the voice control method.

As intelligent and information technologies develop, voice control covers more scenarios and a wider range, and its market share grows accordingly. Display devices also integrate voice functions, so users can use voice control instructions to view multimedia resources and to jump to or open a specified application.

Between the applications on a display device, voice control can be implemented through direct words, which allow direct jumping or switching between the applications installed on the display device. That is, when the voice control text corresponding to the user's voice control instruction is the direct word of an application, the display device responds by jumping from the current application to the target application indicated by that voice control text.

As an example, when a user wants to exit the video application currently being browsed and return to the home page, the user can issue the voice control instruction "return to the home page". This instructs the display device to switch from the running video application to the home page, where the user can select other applications or multimedia resources of interest.

It should be understood that the home page is the user interface that displays, and gives direct access to, the interfaces of the applications. On the home page, users can browse recommended multimedia resources and applications, select a multimedia resource of interest to play, or open an application of interest to play or browse content within it.

In some embodiments, a single application contains many scenarios, so voice control between those scenarios cannot be implemented with the direct words described above. Control is possible only through the interface description words in each display interface. That is, when the voice control text corresponding to the user's voice control instruction is the interface description word of a control in the current display interface, the display device responds by opening or selecting, in the current display interface, the target control indicated by that voice control text.

Because the scenarios within an application are relatively fixed, a voice interface control word list can be built for each display interface of the application during the first load after each application update. After the user's voice control instruction is received, the target control that the user is requesting to operate can then be determined quickly in the current display interface from the voice interface control word list.

As an example, after the display device receives the user's voice control instruction, it matches the corresponding voice control text against the interface description word of each control in the voice interface control word list, one by one. If a matching interface description word is found, the target control corresponding to that interface description word is opened or selected in the current display interface.
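The one-by-one matching described above can be sketched as follows. This is a minimal illustration only: the `Control` class, the `match_control` function, the whitespace and case normalization, and the sample word list are all assumptions introduced for the example and do not appear in the application.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Control:
    control_id: str   # hypothetical identifier of a control in the interface
    description: str  # the control's interface description word


def match_control(voice_text: str, word_list: List[Control]) -> Optional[Control]:
    """Compare the voice control text with each interface description word in turn."""
    normalized = voice_text.strip().lower()  # normalization is an assumption
    for control in word_list:
        if control.description.strip().lower() == normalized:
            return control  # first match is the target control
    return None  # no control in this interface carries that description word


# Hypothetical voice interface control word list for one display interface.
word_list = [
    Control("A", "activate VIP"),
    Control("S", "search"),
    Control("H", "history"),
]
```

A caller would open or select the returned control, and would fall back to other handling when `None` is returned, which is exactly the case a scroll control without a description word falls into.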

In addition, besides the controls that carry interface description words, some display interfaces also include controls that have no interface description word, such as scroll controls.

It should be noted that a display interface containing scroll controls may be at least one of a home page, a web page, and a user interface within an application. Some embodiments of the present application do not limit this.

Scroll controls are used to implement sliding and page-turning operations. Because a scroll control has no interface description word and its scroll direction is not unique, it is difficult to control it by voice by building a voice interface control word list and matching interface description words.

For ease of understanding, refer to FIG. 4, which takes the device home page as an example of a display interface. The display interface includes multiple controls such as search, time, VIP, footprints, news, history, featured, and multimedia resources ("media resources" for short). Using the control apparatus 100, the smart device 300, or a voice control instruction, the user can select a target control from the controls shown in the display interface to open the content display interface of that target control.

As an example, if the user wants to operate control A in the display interface shown in FIG. 4, the user issues the voice control instruction "activate VIP" to the display device. On receiving the instruction, the display device uses the corresponding voice control text to traverse the interface description words of the controls in the voice interface control word list of the display interface. After finding that the control whose interface description word is "activate VIP" is control A, the display device operates control A to open the content display interface for activating VIP, where the user can carry out the further operations for activating VIP rights.

However, scroll controls have no corresponding interface description words, the number of scroll controls in a display interface may vary, and the scroll direction of each scroll control is not unique. The interface description word matching described above therefore cannot determine, from the user's voice control instruction, which scroll control in the display interface the user wants to operate.

Continuing with FIG. 4, consider the scroll controls in the display interface that carry no interface description words: control B and control C. Control B can slide up or down, and control C can slide left or right. Each is shown in the display interface only as a slider bar, with no corresponding interface description word. It is therefore difficult to determine from the user's voice control instruction whether the target scroll control to be operated is control B or control C.

In view of this, the present application provides a voice control method that identifies the scroll controls in the current display interface and constructs a voice scroll control word list for them. After receiving the user's voice control instruction, the display device 200 can use the voice scroll control word list and the voice control text corresponding to the instruction to determine the target scroll control to be operated in the current display interface, and control that target scroll control to perform a scroll operation in the scroll direction indicated by the voice control text. Voice control of scroll controls in the display device is thereby achieved.

Referring to FIG. 5, when the display device 200 executes the voice control method provided in the present application, the controller 250 in the display device 200 is configured to perform the following steps:

Step 510: Determine at least one scroll control contained in the current display interface.

It should be understood that the current display interface may contain no scroll control, one scroll control, or multiple scroll controls. If the current display interface contains no scroll control, the user can use the interface description word of each control in the current display interface to control that control directly by voice, for example to open or select it.

If the current display interface contains scroll controls, the technical solution provided in the present application is executed: at least one scroll control contained in the current display interface is determined, and a voice scroll control word list is constructed for the scroll controls in the current display interface, so that those scroll controls can be controlled by voice to perform scroll operations.

As described above, the current display interface in some embodiments of the present application includes at least one of a home page, a web page, and a user interface within an application.

In some embodiments, if the current display interface is the home page, then when executing step 510 the controller is further configured to: after the display device is powered on, obtain the scroll controls contained in the home page, and implement voice control of those scroll controls through steps 520 to 530 below.

In some embodiments, if the current display interface is a web page, then when executing step 510 the controller is further configured to: on detecting a jump operation based on a web page link, use web crawler technology to obtain information about the destination web page, parse that information to obtain the content of the web page, and determine from the content whether the web page contains a scroll control.

As an example, the total height of all content items in the web page is determined from the web page content. If the total height of the content items is greater than the window height of the display interface, the web page contains a scroll control, and the scroll direction of that scroll control is vertical. If the total height of the content items is not greater than the window height of the display interface, the web page contains no scroll control.
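The height comparison above can be sketched as follows. This is an illustration under stated assumptions: the function name `detect_vertical_scroll`, the representation of content items as a list of pixel heights, and the `"vertical"` return value are hypothetical, and the sketch takes the item heights as given rather than showing how they would be parsed from the page.

```python
from typing import List, Optional


def detect_vertical_scroll(item_heights: List[int], window_height: int) -> Optional[str]:
    """Return "vertical" if the page needs a scroll control, otherwise None.

    item_heights: heights of all content items parsed from the page (assumed input).
    window_height: height of the display interface window, in the same unit.
    """
    total_height = sum(item_heights)
    if total_height > window_height:
        return "vertical"  # content overflows the window, so the page scrolls vertically
    return None  # content fits in the window, so there is no scroll control
```

For instance, three items of 400, 500, and 300 pixels in a 1080-pixel window total 1200 pixels and so imply a vertical scroll control, while two items of 540 pixels each fit exactly and imply none.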

If it is determined that the web page contains a scroll control, voice control of that scroll control can be implemented through steps 520 to 530 below.

In some embodiments, if the current display interface is a user interface within an application, then when executing step 510 the controller is further configured to: after the target application starts, monitor changes in the display interface of the target application; if the display interface changes, obtain the control information of the multiple controls contained in the changed current display interface; and determine, from the control information, at least one scroll control contained in the current display interface.

需要说明的是,目标应用可以为显示设备200自带的系统应用,也可以为用户在显示设备200中安装的应用,还可以为通过网络链接启动的第三方应用,本申请一些实施例对目标应用的来源和类型不做限制。 It should be noted that the target application can be a system application that comes with the display device 200, an application installed by the user in the display device 200, or a third-party application launched through a network link. Some embodiments of the present application do not limit the source and type of the target application.

也即是,本申请中的目标应用为用户在显示设备200中触发启动的任一应用。That is, the target application in the present application is any application that the user triggers to start on the display device 200.

在一种可能的实现方式中,显示设备中的目标应用在启动之后,会在安卓系统的框架层中注册辅助功能服务(Accessibility Service),辅助功能服务能够接收到由安卓系统触发的一些事件,比如,通知状态、视图相关事件、指纹、按键点击(touch)等。In one possible implementation, after the target application in the display device is started, it will register the Accessibility Service in the framework layer of the Android system. The Accessibility Service can receive some events triggered by the Android system, such as notification status, view-related events, fingerprints, button clicks (touch), etc.

因此,当目标应用的应用程序开始运行,启动目标应用后,辅助功能服务会实时监听目标应用的显示界面变化情况。当显示界面发生变化时,辅助功能服务的回调接口会收到界面变化通知。如此,通过辅助功能服务提供的接口,可以获取到变化后的当前显示界面的根视图节点(Node Root)控件,以及各根视图节点控件对应的子视图节点控件,最终得到当前显示界面中包含的所有控件。Therefore, when the target application starts running and the target application is started, the accessibility service will monitor the changes in the display interface of the target application in real time. When the display interface changes, the callback interface of the accessibility service will receive the interface change notification. In this way, through the interface provided by the accessibility service, the root view node (Node Root) control of the current display interface after the change, as well as the sub-view node controls corresponding to each root view node control, can be obtained, and finally all the controls contained in the current display interface can be obtained.

其中,每个控件在显示设备中均对应存储有诸如类型、功能描述信息等控件信息。在显示设备中,各控件的控件信息是以结构树的形式进行存储。因此,基于上述特性,可以通过当前显示界面的结构树,获取当前显示界面的中所有控件的控件信息。Each control has control information such as type and function description information stored in the display device. In the display device, the control information of each control is stored in the form of a structure tree. Therefore, based on the above characteristics, the control information of all controls in the current display interface can be obtained through the structure tree of the current display interface.

进一步地,可以根据当前显示界面中包含的所有控件的控件信息,生成当前显示界面对应的控件节点列表(Node List)。Furthermore, a control node list (Node List) corresponding to the current display interface can be generated based on the control information of all controls contained in the current display interface.

需要说明的是,控件节点列表中包括当前显示界面中存在界面描述词的控件和不存在界面描述词的滚动控件。It should be noted that the control node list includes controls with interface description words in the current display interface and scroll controls without interface description words.

因此,可以遍历控件节点列表,对控件节点列表中的每一个控件进行类型判断,以确定当前显示界面对应的滚动控件节点列表(Scroll Node List)。Therefore, the control node list can be traversed to determine the type of each control in the control node list to determine the scroll control node list (Scroll Node List) corresponding to the current display interface.

在一种可能的实现方式中,通过辅助功能服务调用相应的接口(比如isScroll接口),在当前显示界面的结构树中判断各节点对应的控件是否支持滚动,如果支持滚动,则确定该控件为滚动控件,将其添加至滚动控件节点列表中。In one possible implementation, the corresponding interface (such as the isScroll interface) is called through the accessibility service to determine whether the control corresponding to each node in the structure tree of the current display interface supports scrolling. If scrolling is supported, the control is determined to be a scroll control and is added to the scroll control node list.
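The traversal described above might look like the following sketch. It models the structure tree with a hypothetical `ControlNode` class rather than a real accessibility API; the `scrollable` field stands in for the result of whatever scroll-capability query (the "isScroll interface" mentioned above) the accessibility service exposes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ControlNode:
    """One node of the interface structure tree (hypothetical model)."""
    name: str
    scrollable: bool = False  # assumed result of the scroll-capability query
    children: List["ControlNode"] = field(default_factory=list)


def collect_scroll_nodes(root: ControlNode) -> List[ControlNode]:
    """Walk the tree depth-first and return the scrollable nodes in preorder."""
    scroll_nodes = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.scrollable:
            scroll_nodes.append(node)
        # Push children in reverse so they are visited left to right.
        stack.extend(reversed(node.children))
    return scroll_nodes
```

The returned list plays the role of the scroll control node list (Scroll Node List); non-scrollable controls are simply skipped.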

在一些实施例中,若显示界面中包含的滚动控件较为固定(比如,在预设的时间段内,滚动控件的显示位置、类型、功能信息等均未发生变化),则可以在首次确定该显示界面对应的滚动控件后,存储显示界面和滚动控件节点列表的对应关系。后续再打开该显示界面时,可以根据该显示界面对应的滚动控件节点列表,快速确定该显示界面中所包含的滚动控件。In some embodiments, if the scroll control included in the display interface is relatively fixed (for example, the display position, type, function information, etc. of the scroll control have not changed within a preset time period), the corresponding relationship between the display interface and the scroll control node list can be stored after the scroll control corresponding to the display interface is determined for the first time. When the display interface is subsequently opened, the scroll control included in the display interface can be quickly determined according to the scroll control node list corresponding to the display interface.
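One way to store the correspondence between a display interface and its scroll control node list is a simple lookup keyed by an interface identifier. The sketch below is an assumption-laden illustration: `ScrollNodeCache`, the `interface_id` key, and the `detect` callback are hypothetical, and the check that the scroll controls remained unchanged during the preset time period is omitted.

```python
from typing import Callable, Dict, List


class ScrollNodeCache:
    """Remembers the scroll control node list found for each display interface."""

    def __init__(self) -> None:
        self._by_interface: Dict[str, List[str]] = {}

    def get(self, interface_id: str, detect: Callable[[], List[str]]) -> List[str]:
        """Return the stored list, running `detect` only the first time."""
        if interface_id not in self._by_interface:
            self._by_interface[interface_id] = detect()
        return self._by_interface[interface_id]
```

On a later visit to the same interface the stored list is returned without traversing the structure tree again.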

步骤520:构建当前显示界面中滚动控件的语音滚动控制词列表。Step 520: Construct a voice scrolling control word list for the scrolling control in the current display interface.

其中,语音滚动控制词列表用于表示滚动控件的滚动方向和语义控制词之间的对应关系。The voice scroll control word list is used to indicate the correspondence between the scroll direction of the scroll control and the semantic control words.

在一些实施例中,执行上述步骤520时,控制器被进一步配置为:获取滚动控件控制的多个子控件在当前显示界面中的显示位置;根据各子控件的显示位置,确定滚动控件的滚动方向;根据滚动方向,构建当前显示界面中滚动控件的语音滚动控制词列表。In some embodiments, when executing the above step 520, the controller is further configured to: obtain the display positions of multiple sub-controls controlled by the scrolling control in the current display interface; determine the scrolling direction of the scrolling control according to the display position of each sub-control; and construct a voice scrolling control word list for the scrolling control in the current display interface according to the scrolling direction.

其中,子控件的显示位置可以为子控件在当前显示界面中的显示区域坐标。The display position of the subcontrol may be the display area coordinates of the subcontrol in the current display interface.

在一种可能的实现方式中,以当前显示界面的左上角/中心点作为坐标中心,建立二维直角坐标系。基于该二维直角坐标系,确定当前显示界面中各控件(滚动控件、滚动控件的子控件、其他携带界面描述词的控件)的显示位置。In a possible implementation, a two-dimensional rectangular coordinate system is established with the upper left corner/center point of the current display interface as the coordinate center. Based on the two-dimensional rectangular coordinate system, the display position of each control (scroll control, sub-control of the scroll control, and other controls carrying interface description words) in the current display interface is determined.

作为一个示例,通过辅助功能服务调用相应的接口(比如,getBounds接口),以获取当前显示界面中各控件的显示位置。其中,显示位置可以包括控件的边界坐标,比如,上边界坐标、下边界坐标、左边界坐标、右边界坐标。As an example, the corresponding interface (e.g., the getBounds interface) is called through the accessibility service to obtain the display position of each control in the current display interface. The display position may include the boundary coordinates of the control, such as the upper boundary coordinate, the lower boundary coordinate, the left boundary coordinate, and the right boundary coordinate.

进一步地,根据各子控件在当前显示界面中的显示区域坐标,确定各子控件的显示高度和显示宽度;进而根据显示高度和显示宽度,判断控制各子控件所属的滚动控件的滚动方向。Further, the display height and display width of each sub-control are determined according to the display area coordinates of each sub-control in the current display interface; then, according to the display heights and display widths, the scrolling direction of the scroll control to which the sub-controls belong is determined.

在一些实施例中,确定滚动控件的滚动方向时,控制器被进一步配置为:In some embodiments, when determining the scrolling direction of the scrolling control, the controller is further configured to:

(1)若各子控件在当前显示界面中的显示高度均相同,则确定滚动控件的滚动方向为水平滚动;(1) If the display heights of the sub-controls in the current display interface are the same, the scrolling direction of the scrolling control is determined to be horizontal scrolling;

其中,显示高度为控件的上边界坐标和下边界坐标之间的差值。The display height is the difference between the upper and lower boundary coordinates of the control.

也即是,当各子控件的显示高度均相同时,表示多个子控件在当前显示界面中是水平分布的,则控制多个子控件所属的滚动控件的滚动方向为水平滚动,以通过左右滑动/翻页,来浏览/选中/打开对应的子控件。That is, when the display heights of each sub-control are the same, it means that multiple sub-controls are horizontally distributed in the current display interface. The scrolling direction of the scroll control to which the multiple sub-controls belong is controlled to be horizontal scrolling, so that the corresponding sub-controls can be browsed/selected/opened by sliding left and right/turning pages.

在一个示例中,参见图6,以显示界面为一种健身应用内的用户界面为例,该显示界面中的健身推荐项为滚动控件,该滚动控件可以控制滑动多个健身项目分别对应的子控件。In one example, referring to FIG. 6, taking the display interface as a user interface in a fitness application as an example, the fitness recommendation item in the display interface is a scroll control, and the scroll control can control the sliding of sub-controls corresponding to multiple fitness items.

参见图6,健身推荐项中显示有找课程、找计划、减脂、塑形训练、瑜伽、AI游戏和投屏健身等子控件的图标,以及该健身应用提供的,但在当前显示界面中未显示的其他子控件的图标,比如,舞蹈、健身操等子控件所对应的图标。Referring to Figure 6, the fitness recommendation items display icons of sub-controls such as finding courses, finding plans, fat loss, body shaping training, yoga, AI games, and screen casting fitness, as well as icons of other sub-controls provided by the fitness application but not displayed in the current display interface, such as icons corresponding to sub-controls such as dance and aerobics.

为便于说明,图6中仅以三角形图标来表示健身推荐项对应的滚动控件;当然,该滚动控件也可以通过其他形态进行显示,本申请一些实施例对此不做限制。For ease of explanation, FIG. 6 uses only a triangular icon to represent the scroll control corresponding to the fitness recommendation item; of course, the scroll control can also be displayed in other forms, and some embodiments of the present application do not limit this.

在该健身应用的显示界面中,由于健身推荐项所包含的多个健身项目的图标的显示高度均为h,即健身推荐项对应的滚动控件中所包含的多个子控件的显示高度均相同,则该健身推荐项对应的滚动控件的滚动方向为水平滚动。In the display interface of the fitness application, since the display heights of the icons of the multiple fitness items included in the fitness recommendation item are all h, that is, the display heights of the multiple sub-controls included in the scrolling control corresponding to the fitness recommendation item are all the same, the scrolling direction of the scrolling control corresponding to the fitness recommendation item is horizontal scrolling.

如此,通过控制该健身推荐项对应的滚动控件沿水平方向执行滚动操作,可以依次浏览该健身应用中推荐多个健身项目,从而选择感兴趣的健身项目进行练习。In this way, by controlling the scroll control corresponding to the fitness recommendation item to perform a scroll operation in the horizontal direction, multiple fitness items recommended in the fitness application can be browsed in sequence, and then a fitness item of interest can be selected for practice.

(2)若各子控件在当前显示界面中的显示宽度均相同,则确定滚动控件的滚动方向为垂直滚动;(2) If the display widths of the sub-controls in the current display interface are the same, then the scrolling direction of the scrolling control is determined to be vertical scrolling;

其中,显示宽度为控件的左边界坐标和右边界坐标之间的差值。The display width is the difference between the left and right border coordinates of the control.

也即是,当各子控件的显示宽度均相同时,表示多个子控件在当前显示界面中是垂直分布的,则控制多个子控件所属的滚动控件的滚动方向为垂直滚动,以通过上下滑动/翻页,来浏览/选中/打开对应的子控件。That is, when the display widths of all sub-controls are the same, it means that multiple sub-controls are distributed vertically in the current display interface. The scrolling direction of the scroll control to which the multiple sub-controls belong is controlled to be vertical scrolling, so that the corresponding sub-controls can be browsed/selected/opened by sliding up and down/turning pages.

在一个示例中,参见图7,以显示界面为一种音乐应用内的用户界面为例,该显示界面中的歌单(即图7中的XXX榜单)为滚动控件,该滚动控件可以控制滑动多个歌曲分别对应的子控件。In one example, referring to FIG. 7, taking the display interface as a user interface in a music application as an example, the playlist in the display interface (i.e., the XXX list in FIG. 7) is a scroll control, and the scroll control can control the sliding of sub-controls corresponding to multiple songs.

参见图7,歌单中展示有歌曲01-09的选项卡,以及该歌单中所包含的,但在当前显示界面中未显示的其他子控件的选项卡,比如,歌曲10-121所对应的选项卡。Referring to FIG. 7 , the playlist displays tabs for songs 01-09, as well as tabs for other sub-controls included in the playlist but not displayed in the current display interface, such as tabs corresponding to songs 10-121.

为便于说明,图7中仅以右侧所示的箭头来表示歌单对应的滚动控件;当然,该滚动控件也可以通过其他形态进行显示,本申请一些实施例对此不做限制。For ease of explanation, FIG. 7 uses only the arrow shown on the right to represent the scroll control corresponding to the playlist; of course, the scroll control can also be displayed in other forms, and some embodiments of the present application do not limit this.

在该音乐应用的显示界面中,由于歌单所包含的多个歌曲的选项卡的显示宽度均为d,即歌单对应的滚动控件中所包含的多个子控件的显示宽度均相同,则该歌单对应的滚动控件的滚动方向为垂直滚动。In the display interface of the music application, since the display width of the tabs of the multiple songs included in the playlist is d, that is, the display width of the multiple sub-controls contained in the scroll control corresponding to the playlist is the same, the scrolling direction of the scroll control corresponding to the playlist is vertical scrolling.

如此,通过控制该歌单对应的滚动控件沿垂直方向执行滚动操作,可以依次浏览该歌单中包括的所有歌曲,从而选择感兴趣的歌曲进行播放。In this way, by controlling the scroll control corresponding to the playlist to perform a scroll operation in the vertical direction, all songs included in the playlist can be browsed in sequence, thereby selecting songs of interest to play.

(3)若各子控件在当前显示界面中的显示高度和/或显示宽度不同,则确定滚动控件的滚动方向为瀑布流滚动。(3) If the display heights and/or display widths of the sub-controls in the current display interface differ, the scrolling direction of the scroll control is determined to be waterfall scrolling.

也即是,若各子控件的显示高度均不相同,或者各子控件的显示宽度均不相同,亦或者,各子控件的显示高度和显示宽度均不相同,则表示多个子控件在当前显示界面中是采用瀑布流方式分布的,则控制该多个子控件所属的滚动控件的滚动方向为瀑布流滚动,以通过上下滑动/翻页,来加载多个子控件,以供用户浏览/选中/打开感兴趣的子控件。That is, if the display heights of the sub-controls are all different, or the display widths of the sub-controls are all different, or the display heights and display widths of the sub-controls are all different, it means that the multiple sub-controls are distributed in a waterfall flow manner in the current display interface, and the scrolling direction of the scroll control to which the multiple sub-controls belong is controlled to be waterfall flow scrolling, so that multiple sub-controls can be loaded by sliding up and down/turning pages, so that the user can browse/select/open the sub-controls of interest.

在一个示例中,参见图8,以显示界面为一种图片查看应用的用户界面为例,该显示界面中包括一个滚动控件,该滚动控件可以控制多个子控件对应的图片,这些子控件在显示界面中以瀑布流的方式展示对应的图片。In one example, referring to FIG. 8, taking the display interface as a user interface of a picture viewing application as an example, the display interface includes a scroll control that controls the pictures corresponding to multiple sub-controls, and these sub-controls display their corresponding pictures in a waterfall-flow layout in the display interface.

参见图8,该显示界面中包括图片1-图片8、视频9对应的图片,以及该显示界面中未展示出的其他子控件对应的图片。Referring to FIG. 8, the display interface includes pictures 1 to 8, the picture corresponding to video 9, and pictures corresponding to other sub-controls that are not displayed in the display interface.

为便于说明,图8中仅以右侧的滚动条来表示该显示页面中的滚动控件;当然,滚动控件也可以通过其他形态进行显示,本申请一些实施例对此不做限制。For ease of explanation, FIG. 8 uses only the scroll bar on the right to represent the scroll control in the display page; of course, the scroll control can also be displayed in other forms, and some embodiments of the present application do not limit this.

在该图片查看应用的显示界面中,由于图片1-8、视频9在当前显示界面中处于的不同显示区域,则图片1-8、视频9对应的子控件的边界坐标不同,因此,图片1-8、视频9对应的子控件的显示高度和/或显示宽度不同,则该滚动控件的滚动方向为瀑布流滚动。In the display interface of the picture viewing application, since pictures 1-8 and video 9 are in different display areas in the current display interface, the boundary coordinates of the sub-controls corresponding to pictures 1-8 and video 9 are different. Therefore, the display heights and/or display widths of the sub-controls corresponding to pictures 1-8 and video 9 are different, and the scrolling direction of the scroll control is waterfall scrolling.

如此,通过控制该显示页面中的滚动控件沿垂直方向执行滚动操作,每次滚动操作后都可以一次加载多张图片,从而选择感兴趣的图片执行预览/下载/处理等操作。In this way, by controlling the scroll control in the display page to perform a scroll operation in the vertical direction, multiple pictures can be loaded at one time after each scroll operation, so that pictures of interest can be selected to perform operations such as previewing/downloading/processing.
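Rules (1) to (3) above can be sketched as a single function over the boundary coordinates of the sub-controls. The `Bounds` type and `scroll_direction` function are hypothetical names. The sketch assumes a coordinate origin at the upper left corner, and it checks rule (1) before rule (2), so a layout whose sub-controls share both height and width is reported as horizontal; the text above does not say how that case is resolved.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Bounds:
    """Boundary coordinates of a sub-control, origin at the upper left corner."""
    left: int
    top: int
    right: int
    bottom: int

    @property
    def width(self) -> int:
        return self.right - self.left

    @property
    def height(self) -> int:
        return self.bottom - self.top


def scroll_direction(children: List[Bounds]) -> str:
    """Classify a scroll control from the sizes of its sub-controls."""
    if not children:
        raise ValueError("a scroll control needs at least one sub-control")
    heights = {b.height for b in children}
    widths = {b.width for b in children}
    if len(heights) == 1:   # rule (1): equal heights, horizontal layout
        return "horizontal"
    if len(widths) == 1:    # rule (2): equal widths, vertical layout
        return "vertical"
    return "waterfall"      # rule (3): heights and widths both differ
```

For instance, a row of icons that all have height h (as in FIG. 6) yields `"horizontal"`, and a list of tabs that all have width d (as in FIG. 7) yields `"vertical"`.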

在一些实施例中,构建当前显示界面中滚动控件的语音滚动控制词列表时,控制器被进一步配置为:根据滚动控件的滚动方向,设置滚动控件对应的语义控制词;进而根据当前显示界面中滚动控件的滚动方向和语义控制词,生成当前显示界面中滚动控件的语音滚动控制词列表。In some embodiments, when constructing a voice scrolling control word list for a scrolling control in a current display interface, the controller is further configured to: set semantic control words corresponding to the scrolling control according to the scrolling direction of the scrolling control; and then generate a voice scrolling control word list for the scrolling control in the current display interface according to the scrolling direction and semantic control words of the scrolling control in the current display interface.

其中,语义控制词包括滑动语义词和翻页语义词。Among them, semantic control words include sliding semantic words and page turning semantic words.

作为一个示例,对于滚动方向为水平滚动的滚动控件,设置的滑动语义词包括“向左滑动”和“向右滑动”;对于滚动方向为垂直滚动的滚动控件,设置的滑动语义词包括“向上滑动”和“向下滑动”,对于滚动方向为瀑布流滚动的滚动控件,设置的滑动语义词包括“向上滑动”、“向下滑动”、“向左滑动”和“向右滑动”。As an example, for a scroll control with horizontal scrolling direction, the set sliding semantic words include "slide left" and "slide right"; for a scroll control with vertical scrolling direction, the set sliding semantic words include "slide up" and "slide down"; for a scroll control with waterfall scrolling direction, the set sliding semantic words include "slide up", "slide down", "slide left" and "slide right".

作为一个示例,翻页语义词包括“上一页”和“下一页”。As an example, page turning semantic words include "previous page" and "next page".

在一些实施例中,假设当前显示界面中的滚动控件包括M控件、N控件和P控件,且M控件的滚动方向为水平滚动,N控件的滚动方向为垂直滚动,P控件的滚动方向为瀑布流滚动。则针对该当前显示界面,下表1给出了一种示例性的语音滚动控制词列表。In some embodiments, assume that the scroll controls in the current display interface include an M control, an N control, and a P control, where the scrolling direction of the M control is horizontal scrolling, that of the N control is vertical scrolling, and that of the P control is waterfall scrolling. For this display interface, Table 1 below gives an exemplary voice scroll control word list.

表1

Table 1
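Using only the semantic control words given in the examples above, a word list for the M, N and P controls might be built as in the following sketch. The dictionary layout and the function name `build_voice_scroll_word_list` are assumptions made for illustration and are not taken from Table 1.

```python
from typing import Dict, List

# Sliding semantic words per scrolling direction, as listed in the example above.
SLIDE_WORDS: Dict[str, List[str]] = {
    "horizontal": ["slide left", "slide right"],
    "vertical": ["slide up", "slide down"],
    "waterfall": ["slide up", "slide down", "slide left", "slide right"],
}

# Page-turning semantic words, as listed in the example above.
PAGE_WORDS: List[str] = ["previous page", "next page"]


def build_voice_scroll_word_list(scroll_controls: Dict[str, str]) -> Dict[str, dict]:
    """Map each scroll control name to its direction and semantic control words."""
    word_list = {}
    for name, direction in scroll_controls.items():
        word_list[name] = {
            "direction": direction,
            "slide_words": list(SLIDE_WORDS[direction]),
            "page_words": list(PAGE_WORDS),
        }
    return word_list
```

Calling `build_voice_scroll_word_list({"M": "horizontal", "N": "vertical", "P": "waterfall"})` produces one entry per control, each recording the correspondence between scrolling direction and semantic control words.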

步骤530:响应于用户的语音控制指令,基于语音滚动控制词列表和语音控制指令对应的语音控制文本,控制目标滚动控件按照语音控制文本指示的滚动方向执行滚动操作。Step 530: In response to the user's voice control instruction, based on the voice scrolling control word list and the voice control text corresponding to the voice control instruction, the target scrolling control is controlled to perform a scrolling operation in the scrolling direction indicated by the voice control text.

其中,目标滚动控件为当前显示界面包含的滚动控件中与语音控制文本相对应的一个滚动控件。The target scroll control is a scroll control in the scroll controls included in the current display interface that corresponds to the voice-controlled text.

在一些实施例中,参见图9,实现上述步骤530时,控制器被进一步配置为执行以下子步骤:In some embodiments, referring to FIG. 9, when implementing the above step 530, the controller is further configured to perform the following sub-steps:

步骤531:获取语音控制指令对应的语音控制文本。Step 531: Obtain the voice control text corresponding to the voice control instruction.

其中,显示设备可以通过通信器在控制器与云端服务器之间建立通信连接,以通过云端服务器对语音控制指令进行解析,从而减少显示设备的数据处理量和算法存储资源消耗量。Among them, the display device can establish a communication connection between the controller and the cloud server through the communicator, so as to parse the voice control instructions through the cloud server, thereby reducing the data processing volume and algorithm storage resource consumption of the display device.

在一些实施例中,实现上述步骤531时,控制器被进一步配置为:将语音控制指令发送给云端服务器,以请求云端服务器对语音控制指令进行解析处理;接收云端服务器发送的语音控制指令对应的语音控制文本。In some embodiments, when implementing the above step 531, the controller is further configured to: send the voice control instruction to the cloud server to request the cloud server to parse and process the voice control instruction; and receive the voice control text corresponding to the voice control instruction sent by the cloud server.

也即是,本申请一些实施例将语音解析算法存储在云端服务器中,云端服务器接收到显示设备发送的语音控制指令后,对语音控制指令进行解析,获取该语音控制指令对应的语音控制文本。That is, some embodiments of the present application store the voice analysis algorithm in a cloud server. After the cloud server receives the voice control instruction sent by the display device, it analyzes the voice control instruction and obtains the voice control text corresponding to the voice control instruction.

在一些实施例中,若云端服务器解析语音控制指令时,确定语音控制指令为无效指令(比如,未解析到相关的控制词字符信息),则向显示设备发送异常结果反馈信息,以指示该显示设备暂不对任何控件执行控制操作,继续检测用户输入的语音控制指令。In some embodiments, if the cloud server parses the voice control instruction and determines that the voice control instruction is an invalid instruction (for example, the relevant control word character information is not parsed), it sends abnormal result feedback information to the display device to instruct the display device not to perform control operations on any controls for the time being and continue to detect the voice control instructions input by the user.

步骤532:将语音控制文本与语音滚动控制词列表中的语义控制词进行匹配,确定语音控制文本在当前显示界面中请求控制的目标滚动控件,以及目标滚动控件的目标滚动方向。Step 532: Match the voice control text with the semantic control words in the voice scroll control word list to determine the target scroll control that the voice control text requests to control in the current display interface, and the target scroll direction of the target scroll control.

其中,若语音控制文本在当前显示界面中仅存在一个匹配的滚动控件,则直接将该滚动控件确定为目标滚动控件,执行根据语音控制文本,控制目标滚动控件在目标滚动方向上执行滚动操作,也即步骤533。If the voice control text matches only one scroll control in the current display interface, that scroll control is directly determined as the target scroll control, and the target scroll control is controlled, according to the voice control text, to perform a scrolling operation in the target scrolling direction, that is, step 533 is executed.
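The matching in step 532 can be illustrated with the sketch below. It assumes exact, case-insensitive matching of the voice control text against the sliding semantic words, which the description does not mandate; `WORD_TO_DIRECTION` and `match_voice_text` are hypothetical names, and page-turning words are left out for brevity.

```python
from typing import Dict, List, Optional, Tuple

# Assumed mapping from a sliding semantic word to the requested scrolling direction.
WORD_TO_DIRECTION: Dict[str, str] = {
    "slide left": "left",
    "slide right": "right",
    "slide up": "up",
    "slide down": "down",
}


def match_voice_text(
    voice_text: str, word_list: Dict[str, List[str]]
) -> Tuple[List[str], Optional[str]]:
    """Return the scroll controls whose words contain the text, plus the direction."""
    text = voice_text.strip().lower()
    candidates = [name for name, words in word_list.items() if text in words]
    return candidates, WORD_TO_DIRECTION.get(text)
```

When exactly one candidate is returned it is the target scroll control; when several are returned, a further selection among the candidates is needed.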

在一些实施例中,若当前显示界面中包含与语音控制文本相对应的多个候选滚动控件,则执行上述步骤532时,控制器被进一步配置为:获取各候选滚动控件的控制优先级;按照控制优先级,从多个候选滚动控件中确定目标滚动控件。In some embodiments, if the current display interface contains multiple candidate scrolling controls corresponding to the voice-controlled text, when executing the above step 532, the controller is further configured to: obtain the control priority of each candidate scrolling control; and determine the target scrolling control from the multiple candidate scrolling controls according to the control priority.

在一种可能的实现方式中,根据各候选滚动控件之间的管控关系,以及各候选滚动控件在当前显示界面中的显示位置,确定各候选滚动控件的控制优先级。In a possible implementation, the control priority of each candidate scrolling control is determined according to the management and control relationship between each candidate scrolling control and the display position of each candidate scrolling control in the current display interface.

其中,在当前显示界面对应的结构树中,处于父节点的滚动控件的优先级,高于处于子节点的滚动控件的优先级。Among them, in the structure tree corresponding to the current display interface, the priority of the scroll control in the parent node is higher than the priority of the scroll control in the child node.

其中,显示位置处于当前显示界面左上侧的滚动控件的优先级,高于显示位置处于当前显示界面右下侧的滚动控件的优先级。 Among them, the priority of the scroll control whose display position is on the upper left side of the current display interface is higher than the priority of the scroll control whose display position is on the lower right side of the current display interface.

作为一个示例,参见图10所示的显示界面,该显示界面中包括多个滚动控件:W控件、X控件、Y控件和Z控件。其中,W控件可以控制X控件、Y控件和Z控件,则W控件的优先级高于X控件、Y控件和Z控件。As an example, referring to the display interface shown in FIG. 10, the display interface includes multiple scroll controls: a W control, an X control, a Y control and a Z control. Because the W control can control the X control, the Y control and the Z control, the priority of the W control is higher than that of the X control, the Y control and the Z control.

关于X控件、Y控件和Z控件的控件优先级,可以进一步根据其在该显示界面中的显示位置确定,X控件处于该显示界面的最上侧,其优先级最高;Z控件处于该显示界面的最下侧,其优先级最低,因此,X控件、Y控件和Z控件的控件优先级为:控件X>控件Y>控件Z。The control priorities of the X control, the Y control, and the Z control can be further determined according to their display positions in the display interface. The X control is at the top of the display interface and has the highest priority; the Z control is at the bottom of the display interface and has the lowest priority. Therefore, the control priorities of the X control, the Y control, and the Z control are: control X>control Y>control Z.

因此,图10中四个滚动控件的控制优先级为:控件W>控件X>控件Y>控件Z。Therefore, the control priority of the four scroll controls in FIG. 10 is: control W > control X > control Y > control Z.

如此,确定了当前界面中滚动控件的控制优先级后,即可根据语音控制指令和语音滚动控制词列表,从中确定一个目标滚动控件执行滚动操作。In this way, after the control priority of the scroll control in the current interface is determined, a target scroll control can be determined to perform a scroll operation according to the voice control instruction and the voice scroll control word list.
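The two priority rules above (parent before child, upper left before lower right) can be sketched as a sort key. This is an illustration under stated assumptions: `Candidate` is a hypothetical type, tree depth is used as a proxy for the parent/child relationship, and the top coordinate is compared before the left coordinate, an ordering the description does not fix.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """A candidate scroll control (hypothetical model)."""
    name: str
    depth: int  # depth in the structure tree; a parent has a smaller depth
    top: int    # upper boundary coordinate, origin at the upper left corner
    left: int   # left boundary coordinate


def order_by_priority(candidates: List[Candidate]) -> List[Candidate]:
    """Sort candidates from highest to lowest control priority."""
    return sorted(candidates, key=lambda c: (c.depth, c.top, c.left))


def pick_target(candidates: List[Candidate]) -> Candidate:
    """Return the candidate with the highest control priority."""
    return order_by_priority(candidates)[0]
```

With the W control at depth 1 and the X, Y and Z controls at depth 2 arranged from top to bottom, the resulting order is W, X, Y, Z, consistent with the FIG. 10 example.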

应该理解的是,在接收到用户的语音控制指令后,控制器可以按照“先语音界面控制词列表,后语音滚动控制词列表”的顺序,在当前显示界面中确定用户请求控制的控件;控制器也可以按照“先语音滚动控制词列表,后语音界面控制词列表”的顺序,在当前显示界面中确定用户请求控制的控件,本申请一些实施例对此不做限制。It should be understood that after receiving the user's voice control command, the controller can determine the control that the user requests to control in the current display interface in the order of "voice interface control word list first, then voice scrolling control word list"; the controller can also determine the control that the user requests to control in the current display interface in the order of "voice scrolling control word list first, then voice interface control word list", and some embodiments of the present application do not limit this.

在一些实施例中,若确定当前显示界面中包含滚动控件,则显示设备接收到用户的语音控制指令后,可以先基于语音滚动控制词列表,确定语音控制指令对应的语音控制文本在当前显示界面中是否存在请求控制的目标滚动控件。In some embodiments, if it is determined that the current display interface includes a scroll control, after the display device receives the user's voice control instruction, it can first determine, based on the voice scroll control word list, whether the voice control text corresponding to the voice control instruction has a target scroll control requested to be controlled in the current display interface.

进一步地,若基于语音滚动控制词列表,无法确定与该语音控制文本匹配的目标滚动控件,则继续基于该语音控制文本,在当前显示界面的语音界面控制词列表中确定是否存在请求控制的目标控件。其中,该目标控件的在语音界面控制词列表中的界面描述词与该语音控制文本相同。Furthermore, if the target scrolling control that matches the voice control text cannot be determined based on the voice scrolling control word list, then the voice control text is continued to be used to determine whether there is a target control for which control is requested in the voice interface control word list of the current display interface, wherein the interface description word of the target control in the voice interface control word list is the same as the voice control text.
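The "scroll word list first, interface word list second" order described in these embodiments might be expressed as follows. The function name `resolve_target` and the shapes of both word lists are assumptions; the sketch returns the first matching scroll control and does not apply the control priority.

```python
from typing import Dict, List, Optional, Tuple


def resolve_target(
    voice_text: str,
    scroll_word_list: Dict[str, List[str]],
    interface_word_list: Dict[str, str],
) -> Optional[Tuple[str, str]]:
    """Look up the text in the scroll word list, then in the interface word list.

    Returns ("scroll", control_name), ("interface", control_name), or None
    when neither list contains a match.
    """
    text = voice_text.strip().lower()
    for name, words in scroll_word_list.items():
        if text in words:
            return ("scroll", name)
    # Fallback: the interface description word must equal the voice control text.
    for name, description in interface_word_list.items():
        if text == description.lower():
            return ("interface", name)
    return None
```

Swapping the two loops gives the alternative order, "interface word list first, scroll word list second", which the description also permits.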

步骤533:根据语音控制文本,控制目标滚动控件在目标滚动方向上执行滚动操作。Step 533: According to the voice control text, control the target scroll control to perform a scroll operation in the target scroll direction.

在一些实施例中,实现上述步骤533时,控制器被进一步配置为:根据语音控制文本,获取目标滚动控件的目标滚动距离;控制目标滚动控件在目标滚动方向上,按照目标滚动距离执行滚动操作。In some embodiments, when implementing the above step 533, the controller is further configured to: obtain a target scrolling distance of the target scrolling control according to the voice control text; and control the target scrolling control to perform a scrolling operation in the target scrolling direction according to the target scrolling distance.

其中,语音控制文本可以包括滑动语音文本和翻页语音文本。Among them, the voice control text can include sliding voice text and page turning voice text.

在一种可能的实现方式中,若语音控制文本为滑动语音文本,则滚动距离可以为预设的距离值。比如,可以根据滚动控件所控制的多个子控件的显示高度、显示宽度、显示位置间隔等设置上述距离值,本申请一些实施例对此不做限制。In a possible implementation, if the voice control text is a sliding voice text, the scrolling distance may be a preset distance value. For example, the distance value may be set according to the display height, display width, display position interval, etc. of the multiple sub-controls controlled by the scroll control, and some embodiments of the present application do not limit this.

作为一个示例,若语音控制文本为滑动语音文本,参见图6,滚动控件的滚动距离可以为“找课程”图标和“找计划”图标之间的显示位置间隔;参见图7,滚动控件的滚动距离可以为歌曲01的选项卡的显示宽度,即d;参见图8,滚动控件的滚动距离可以任一图片在该显示界面中的显示宽度或显示高度。As an example, if the voice control text is a sliding voice text: referring to FIG. 6, the scrolling distance of the scroll control may be the display position interval between the "Find Course" icon and the "Find Plan" icon; referring to FIG. 7, the scrolling distance may be the display width of the tab of song 01, that is, d; referring to FIG. 8, the scrolling distance may be the display width or display height of any picture in the display interface.

在一些实施例中,若语音控制文本为翻页语音文本,则计算滚动距离时,控制器被进一步配置为:若滚动方向为水平滚动,则根据目标滚动控件控制的最右侧子控件的显示位置和最左侧子控件的显示位置,计算滚动距离;若滚动方向为垂直滚动或瀑布流滚动,则根据目标滚动控件控制的最上侧子控件的显示位置和最下侧子控件的显示位置,计算滚动距离。In some embodiments, if the voice-controlled text is a page-turning voice text, when calculating the scrolling distance, the controller is further configured as follows: if the scrolling direction is horizontal scrolling, the scrolling distance is calculated based on the display position of the rightmost sub-control and the display position of the leftmost sub-control controlled by the target scrolling control; if the scrolling direction is vertical scrolling or waterfall scrolling, the scrolling distance is calculated based on the display position of the topmost sub-control and the display position of the bottommost sub-control controlled by the target scrolling control.

作为一个示例,若语音控制文本为翻页语音文本,参见图6,滚动控件的滚动距离可以为“找课程”图标到“投屏训练”的图标之间的总显示宽度值;参见图7,滚动控件的滚动距离可以为歌曲01到09的选项卡之间的总显示高度值,即9*d;参见图8,滚动控件的滚动距离可以多张图片在当前显示界面中的总显示高度值或总显示宽度值,比如,滚动距离可以为图片2和图片6在该显示界面中的总显示高度值。As an example, if the voice control text is a page-turning voice text: referring to FIG. 6, the scrolling distance of the scroll control may be the total display width from the "Find Course" icon to the "Screen Casting Training" icon; referring to FIG. 7, the scrolling distance may be the total display height of the tabs of songs 01 to 09, that is, 9*d; referring to FIG. 8, the scrolling distance may be the total display height or total display width of multiple pictures in the current display interface, for example, the total display height of picture 2 and picture 6 in the display interface.
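The distance rules for sliding and page-turning text can be sketched together. The names `Bounds` and `scroll_distance` are hypothetical. The sketch assumes that the page-turning distance is the span from the outer edge of the first visible sub-control to the outer edge of the last one, and that the sliding distance is a preset value supplied by the caller.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Bounds:
    """Boundary coordinates of a visible sub-control, origin at the upper left."""
    left: int
    top: int
    right: int
    bottom: int


def scroll_distance(
    command_type: str,          # "slide" or "page"
    direction: str,             # "horizontal", "vertical" or "waterfall"
    children: List[Bounds],     # sub-controls currently visible
    preset_slide_distance: int,
) -> int:
    """Return the distance to scroll for one voice command."""
    if command_type == "slide":
        return preset_slide_distance
    if command_type != "page":
        raise ValueError(f"unknown command type: {command_type}")
    if direction == "horizontal":
        # Span between the leftmost and rightmost visible sub-controls.
        return max(b.right for b in children) - min(b.left for b in children)
    # Vertical and waterfall: span between the topmost and bottommost sub-controls.
    return max(b.bottom for b in children) - min(b.top for b in children)
```

A page-turning command therefore moves the list by roughly one screenful of sub-controls, while a sliding command moves it by the smaller preset step.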

基于上述内容,本申请所提供的显示设备中的控制器分别与显示器和声音采集器连接,在显示器显示图像画面和用户界面的过程中,控制器确定当前显示界面中包括的至少一个滚动控件,并构建当前显示界面中滚动控件的语音滚动控制词列表。其中,语音滚动控制词列表用于表示滚动控件的滚动方向和语义控制词之间的对应关系。进一步地,声音采集设备在检测并采集到用户的语音控制指令后,将该语音控制指令发送给控制器,以使控制器响应于该语音控制指令,基于预先构建的语音滚动控制词列表和语音控制指令对应的语音控制文本,在当前显示界面中控制与语音控制指令相对应的目标滚动控件执行滚动操作。如此,通过识别滚动控件,并为滚动控件构建语音滚动控制词列表,即可在显示设备接收到用户的语音控制指令时,控制当前显示界面中的目标滚动控件按照该语音控制指令执行滚动操作,克服了滚动控件无法通过语音直接控制的缺陷,实现了显示设备中各控件的全语音控制,提高了显示设备控制方式的灵活性和便捷性,提升了用户体验感。Based on the above content, the controller in the display device provided by the present application is connected to the display and the sound collector respectively. In the process of the display displaying the image screen and the user interface, the controller determines at least one scroll control included in the current display interface and constructs a voice scroll control word list of the scroll control in the current display interface. Wherein, the voice scroll control word list is used to represent the corresponding relationship between the scroll direction of the scroll control and the semantic control word. Further, after the sound collection device detects and collects the user's voice control instruction, the voice control instruction is sent to the controller, so that the controller responds to the voice control instruction, based on the pre-constructed voice scroll control word list and the voice control text corresponding to the voice control instruction, and controls the target scroll control corresponding to the voice control instruction in the current display interface to perform a scroll operation. 
In this way, by identifying the scroll control and constructing a voice scroll control word list for the scroll control, when the display device receives the user's voice control instruction, the target scroll control in the current display interface can be controlled to perform a scroll operation according to the voice control instruction, which overcomes the defect that the scroll control cannot be directly controlled by voice, realizes the full voice control of each control in the display device, improves the flexibility and convenience of the display device control method, and improves the user experience.

It should be understood that although the steps in the flowcharts of the above embodiments are shown in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless expressly stated herein, the execution of these steps is not strictly limited in order, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be executed at different times; nor are they necessarily executed sequentially, and they may be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.

In addition, the present application further provides a voice control method, which may be applied to the display device 200 described above or to other electronic devices. The method includes:

determining at least one scroll control included in the current display interface;

constructing a voice scroll control word list for the scroll controls in the current display interface, where the voice scroll control word list represents the correspondence between the scroll direction of a scroll control and its semantic control words; and

in response to a voice control instruction from the user, controlling, based on the voice scroll control word list and the voice control text corresponding to the voice control instruction, a target scroll control to perform a scroll operation in the scroll direction indicated by the voice control text, where the target scroll control is the one scroll control, among the scroll controls included in the current display interface, that corresponds to the voice control instruction.
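The three steps above can be sketched end to end as follows. This is a minimal, platform-neutral illustration rather than the implementation described in the application: the `ScrollControl` class, the English control words in `WORDS`, and the exact-match lookup are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ScrollControl:
    """Hypothetical scroll control that records the last scroll it performed."""
    name: str
    direction: str                     # "horizontal" or "vertical"
    last_scroll: Optional[str] = None

    def scroll(self, word: str) -> None:
        self.last_scroll = word


# Assumed semantic control words for each scroll direction.
WORDS = {
    "horizontal": ("scroll left", "scroll right"),
    "vertical": ("scroll up", "scroll down"),
}


def build_word_list(controls: List[ScrollControl]) -> Dict[str, ScrollControl]:
    """Step 2: map each semantic control word to the control it drives."""
    word_list: Dict[str, ScrollControl] = {}
    for control in controls:
        for word in WORDS[control.direction]:
            # The first control registered for a word keeps it in this sketch.
            word_list.setdefault(word, control)
    return word_list


def handle_voice_text(text: str,
                      word_list: Dict[str, ScrollControl]) -> Optional[ScrollControl]:
    """Step 3: look up the target control for the text and scroll it."""
    key = text.strip().lower()
    target = word_list.get(key)
    if target is not None:
        target.scroll(key)
    return target


# Step 1 is assumed to have already produced these two scroll controls.
controls = [ScrollControl("banner_row", "horizontal"),
            ScrollControl("content_list", "vertical")]
word_list = build_word_list(controls)
target = handle_voice_text("Scroll Down", word_list)
```

Text that matches no semantic control word returns `None`, so a caller could pass such text on to other voice handling.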

In some embodiments, referring to FIG. 11, the voice control logic for scroll controls is as follows. After the target application starts, an accessibility service (Accessibility Service) capable of monitoring page changes is registered, and changes to the displayed page are monitored through this service. When the controls in the displayed page change, that is, when the displayed page changes, the root view node control of the current display interface and the child view node controls controlled by that root view node control are obtained from the structure tree of the controls in the current display interface, and a control node list corresponding to the current display interface is generated from the root view node control and the child view node controls.

Further, the control node list of the current display interface is traversed to determine at least one scroll control included in the current display interface, and a scroll control node list corresponding to the current display interface is generated.
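The flattening and filtering described above might look like the following sketch. It does not use the Android accessibility API; `ControlNode` and its `scrollable` flag are hypothetical stand-ins, and how a real node is recognised as a scroll control is an assumption here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ControlNode:
    """Hypothetical stand-in for one node of the interface's control tree."""
    name: str
    scrollable: bool = False
    children: List["ControlNode"] = field(default_factory=list)


def flatten(root: ControlNode) -> List[ControlNode]:
    """Build the control node list: the root view node and all descendants."""
    nodes: List[ControlNode] = []
    stack = [root]
    while stack:
        node = stack.pop()
        nodes.append(node)
        # Push children in reverse so they are visited in their listed order.
        stack.extend(reversed(node.children))
    return nodes


def find_scroll_controls(root: ControlNode) -> List[ControlNode]:
    """Traverse the control node list and keep only the scroll controls."""
    return [node for node in flatten(root) if node.scrollable]


root = ControlNode("root", children=[
    ControlNode("title"),
    ControlNode("banner_row", scrollable=True,
                children=[ControlNode("poster_1")]),
    ControlNode("content_list", scrollable=True),
])
scroll_nodes = find_scroll_controls(root)
```

An explicit stack is used instead of recursion so that a deeply nested control tree cannot exhaust the call stack.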

Further, for any scroll control, the display positions of the multiple sub-controls controlled by that scroll control are obtained, for example the upper, lower, left and right boundary coordinates of each sub-control. The scroll direction of the scroll control is determined from the display positions of its sub-controls, and the semantic control words corresponding to the scroll control are set. A voice scroll control word list for the scroll controls in the current display interface is then generated from the scroll direction and semantic control words of each scroll control.
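As a rough illustration of this step, the sketch below classifies a scroll control using the rule stated in claim 4 (equal display heights mean horizontal scrolling, equal display widths mean vertical scrolling, otherwise waterfall scrolling), checked in that order. The `Bounds` type, the English control words, and the dictionary layout of each list entry are assumptions for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Bounds:
    """Display position of a sub-control as boundary coordinates."""
    left: int
    top: int
    right: int
    bottom: int

    @property
    def width(self) -> int:
        return self.right - self.left

    @property
    def height(self) -> int:
        return self.bottom - self.top


def scroll_direction(children: List[Bounds]) -> str:
    """Classify a scroll control from the sizes of its sub-controls."""
    if not children:
        raise ValueError("a scroll control needs at least one sub-control")
    if len({c.height for c in children}) == 1:
        return "horizontal"   # all display heights equal
    if len({c.width for c in children}) == 1:
        return "vertical"     # all display widths equal
    return "waterfall"        # heights and widths both differ


# Assumed semantic control words for each scroll direction.
CONTROL_WORDS = {
    "horizontal": ["scroll left", "scroll right"],
    "vertical": ["scroll up", "scroll down"],
    "waterfall": ["scroll up", "scroll down"],
}


def build_word_list(controls: Dict[str, List[Bounds]]) -> List[dict]:
    """Create one word-list entry per scroll control."""
    entries = []
    for name, children in controls.items():
        direction = scroll_direction(children)
        entries.append({"control": name,
                        "direction": direction,
                        "words": CONTROL_WORDS[direction]})
    return entries


row = [Bounds(0, 0, 100, 50), Bounds(110, 0, 260, 50)]
column = [Bounds(0, 0, 200, 40), Bounds(0, 50, 200, 150)]
mixed = [Bounds(0, 0, 100, 40), Bounds(110, 0, 300, 120)]
word_list = build_word_list({"banner_row": row, "content_list": column})
```

Because the height check runs first, a grid whose items all share the same height and width is classified as horizontal in this sketch; whether that matches the intended behaviour is not stated in the text.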

If a voice control instruction is received from the user, then, when performing voice control, the display device traverses the voice scroll control word list according to the voice control text corresponding to the voice control instruction, determines the target scroll control corresponding to that text, and controls the target scroll control to perform a scroll operation.

It should be noted that when the current display interface includes multiple scroll controls, the target scroll control in the current display interface is controlled to perform the scroll operation according to control priority.
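One way to resolve several matching scroll controls by priority is sketched below. The text does not say how control priority is assigned, so the integer `priority` field (smaller value wins) and the `WordListEntry` type are purely illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class WordListEntry:
    """Hypothetical entry of the voice scroll control word list."""
    control: str
    direction: str
    words: Tuple[str, ...]
    priority: int  # assumption: a smaller value means a higher control priority


def pick_target(text: str, entries: List[WordListEntry]) -> Optional[WordListEntry]:
    """Traverse the word list and return the highest-priority matching control."""
    key = text.strip().lower()
    candidates = [entry for entry in entries if key in entry.words]
    if not candidates:
        return None
    return min(candidates, key=lambda entry: entry.priority)


vertical_words = ("scroll up", "scroll down")
entries = [
    WordListEntry("side_menu", "vertical", vertical_words, priority=2),
    WordListEntry("content_list", "vertical", vertical_words, priority=1),
    WordListEntry("banner_row", "horizontal",
                  ("scroll left", "scroll right"), priority=3),
]
target = pick_target("scroll down", entries)
```

Here both vertical lists match "scroll down", and `content_list` is chosen because it carries the smaller priority value.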

For the implementation principle and beneficial effects of the voice control method, refer to the embodiments above describing the configuration of the controller in the display device; they are not repeated here.

In some embodiments, the present application further provides a computer-readable storage medium. The computer-readable storage medium may store a computer program which, when called and executed by a controller in a display device or other electronic device, implements some or all of the steps of the voice control method provided in the present application.

As an example, the computer-readable storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.

It should be understood that the technical solutions in some embodiments of the present application may be implemented by means of software plus a necessary general-purpose hardware platform. Therefore, the technical solutions in some embodiments of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium.

Accordingly, in some embodiments, the present application further provides a computer program product. The computer program product includes a computer program which, when called and executed by a controller in a display device or other electronic device, can implement some or all of the steps of the voice control method provided in the present application.

For ease of explanation, the above description has been given with reference to specific embodiments. However, the discussion in the above embodiments is not intended to be exhaustive or to limit the embodiments to the specific forms disclosed. Many modifications and variations are possible in light of the above teachings. The embodiments were chosen and described to better explain the content of the present application, so that those skilled in the art can make better use of them.

That is, for a person of ordinary skill in the art, any modification, equivalent replacement, improvement, or the like made without departing from the technical concept of the present application shall fall within the scope of protection of the present application.

Claims (10)

1. A display device, comprising:
a display configured to display image frames and a user interface;
a sound collector configured to collect a voice control instruction from a user; and
a controller connected to the display and to the sound collector, wherein the controller is configured to:
determine at least one scroll control included in a current display interface;
construct a voice scroll control word list for the scroll controls in the current display interface, the voice scroll control word list representing the correspondence between the scroll direction of the scroll control and semantic control words; and
in response to the voice control instruction from the user, control, based on the voice scroll control word list and the voice control text corresponding to the voice control instruction, a target scroll control to perform a scroll operation in the scroll direction indicated by the voice control text, the target scroll control being the one scroll control, among the scroll controls included in the current display interface, that corresponds to the voice control text.

2. The display device according to claim 1, wherein the controller is further configured to:
after a target application is started, monitor changes to the display interface of the target application;
if the display interface changes, obtain control information of the multiple controls included in the changed current display interface; and
determine, according to the control information, the at least one scroll control included in the current display interface.

3. The display device according to claim 1, wherein the controller is further configured to:
obtain the display positions, in the current display interface, of multiple sub-controls controlled by the scroll control;
determine the scroll direction of the scroll control according to the display positions of the sub-controls; and
construct, according to the scroll direction, the voice scroll control word list for the scroll controls in the current display interface.

4. The display device according to claim 3, wherein the controller is further configured to:
if the display heights of the sub-controls in the current display interface are all the same, determine that the scroll direction of the scroll control is horizontal scrolling;
if the display widths of the sub-controls in the current display interface are all the same, determine that the scroll direction of the scroll control is vertical scrolling; and
if the display heights and/or display widths of the sub-controls in the current display interface differ, determine that the scroll direction of the scroll control is waterfall scrolling.

5. The display device according to any one of claims 1 to 4, wherein the controller is further configured to:
obtain the voice control text corresponding to the voice control instruction;
match the voice control text against the semantic control words in the voice scroll control word list to determine the target scroll control that the voice control text requests to control in the current display interface, and the target scroll direction of the target scroll control; and
control, according to the voice control text, the target scroll control to perform a scroll operation in the target scroll direction.

6. The display device according to claim 5, wherein the controller is further configured to:
obtain a target scroll distance of the target scroll control according to the voice control text; and
control the target scroll control to perform a scroll operation in the target scroll direction by the target scroll distance.

7. The display device according to claim 6, wherein the voice control text includes page-turning voice text, and the controller is further configured to:
if the target scroll direction is horizontal scrolling, calculate the target scroll distance according to the display position of the rightmost sub-control and the display position of the leftmost sub-control controlled by the target scroll control; and
if the target scroll direction is vertical scrolling or waterfall scrolling, calculate the target scroll distance according to the display position of the uppermost sub-control and the display position of the lowermost sub-control controlled by the target scroll control.

8. The display device according to claim 5, wherein, if the current display interface includes multiple candidate scroll controls corresponding to the voice control text, the controller is further configured to:
obtain the control priority of each candidate scroll control; and
determine the target scroll control from the multiple candidate scroll controls according to the control priority.

9. The display device according to claim 5, wherein the controller is further configured to:
send the voice control instruction to a cloud server to request the cloud server to parse the voice control instruction; and
receive, from the cloud server, the voice control text corresponding to the voice control instruction.

10. A voice control method, comprising:
determining at least one scroll control included in a current display interface;
constructing a voice scroll control word list for the scroll controls in the current display interface, the voice scroll control word list representing the correspondence between the scroll direction of the scroll control and semantic control words; and
in response to a voice control instruction from a user, controlling, based on the voice scroll control word list and the voice control text corresponding to the voice control instruction, a target scroll control to perform a scroll operation in the scroll direction indicated by the voice control text, the target scroll control being the one scroll control, among the scroll controls included in the current display interface, that corresponds to the voice control text.
PCT/CN2023/143115 2023-02-22 2023-12-29 Display device and speech control method Pending WO2024174732A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202310155184.8 2023-02-22
CN202310155184.8A CN116312514B (en) 2023-02-22 2023-02-22 Display device and voice control method

Publications (1)

Publication Number Publication Date
WO2024174732A1 true WO2024174732A1 (en) 2024-08-29

Family

ID=86837100

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/143115 Pending WO2024174732A1 (en) 2023-02-22 2023-12-29 Display device and speech control method

Country Status (2)

Country Link
CN (1) CN116312514B (en)
WO (1) WO2024174732A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116312514B (en) * 2023-02-22 2025-10-28 海信视像科技股份有限公司 Display device and voice control method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100058241A1 (en) * 2008-08-28 2010-03-04 Kabushiki Kaisha Toshiba Display Processing Apparatus, Display Processing Method, and Computer Program Product
CN112885354A (en) * 2021-01-25 2021-06-01 海信视像科技股份有限公司 Display device, server and display control method based on voice
CN113035194A (en) * 2021-03-02 2021-06-25 海信视像科技股份有限公司 Voice control method, display device and server
CN113658598A (en) * 2021-08-12 2021-11-16 海信电子科技(深圳)有限公司 Voice interaction method of display equipment and display equipment
CN116312514A (en) * 2023-02-22 2023-06-23 海信视像科技股份有限公司 Display device and voice control method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2629291A1 (en) * 2012-02-15 2013-08-21 Research In Motion Limited Method for quick scroll search using speech recognition
KR102009423B1 (en) * 2012-10-08 2019-08-09 삼성전자주식회사 Method and apparatus for action of preset performance mode using voice recognition
CN103970257A (en) * 2013-01-28 2014-08-06 联想(北京)有限公司 Information processing method and electronic equipment
CN105869643A (en) * 2016-06-06 2016-08-17 青岛海信移动通信技术股份有限公司 Terminal control method based on voice and voice control device
CN107358953A (en) * 2017-06-30 2017-11-17 努比亚技术有限公司 Sound control method, mobile terminal and storage medium
CN111968637B (en) * 2020-08-11 2024-06-14 北京小米移动软件有限公司 Terminal equipment operation mode control method and device, terminal equipment and medium

Also Published As

Publication number Publication date
CN116312514B (en) 2025-10-28
CN116312514A (en) 2023-06-23

Similar Documents

Publication Publication Date Title
US20130326583A1 (en) Mobile computing device
CN111669621A (en) A kind of media asset data distribution method, server and display device
CN118890504A (en) Display device and method for displaying details page
CN113766164B (en) Display device and signal source interface display method
WO2022237603A1 (en) Control processing method and display device
CN112632160A (en) Intelligent device and intelligent device login method
CN114339346B (en) Display device and image recognition result display method
CN117406886A (en) Display device and floating window display method
WO2024174732A1 (en) Display device and speech control method
CN115086771B (en) Video recommendation media asset display method, display equipment and server
WO2022083554A1 (en) User interface layout and interaction method, and three-dimensional display device
US12382123B2 (en) Display apparatus and display method
CN114630171A (en) Display device and configuration switching method
CN112073787A (en) Display device and home page display method
WO2024139950A1 (en) Display device and processing method for display device
CN114357279B (en) Display equipment and voice search method based on web site internal page
CN113391746B (en) Display equipment and multi-window focus control method
CN112199560B (en) Search method of setting items and display equipment
CN115185414A (en) Display device and cursor control method
CN114938467A (en) Display apparatus and display apparatus control method
CN112367550A (en) Method for realizing multi-title dynamic display of media asset list and display equipment
CN113378096B (en) Display equipment and browser residual frame clearing method
CN118820609A (en) Item recommendation method, device and storage medium based on user behavior
CN120469616A (en) Display device and content display method
CN117608426A (en) Display equipment and multi-application same-screen display method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23923895

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE