
CN109274999A - Video playing control method, device, equipment and medium - Google Patents

Video playing control method, device, equipment and medium

Info

Publication number
CN109274999A
Authority
CN
China
Prior art keywords
video
specified
information
video data
indication information
Prior art date
Legal status
Pending
Application number
CN201811169061.5A
Other languages
Chinese (zh)
Inventor
陈姿
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201811169061.5A
Publication of CN109274999A
Status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/41 Structure of client; Structure of client peripherals
    • H04N 21/4104 Peripherals receiving signals from specially adapted client devices
    • H04N 21/4126 The peripheral being portable, e.g. PDAs or mobile phones
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/437 Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
    • H04N 21/47 End-user applications
    • H04N 21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N 21/47205 End-user interface for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • H04N 21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N 21/65 Transmission of management data between client and server
    • H04N 21/658 Transmission by the client directed to the server
    • H04N 21/6587 Control parameters, e.g. trick play commands, viewpoint selection
    • H04N 21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N 21/84 Generation or processing of descriptive data, e.g. content descriptors

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • User Interface Of Digital Computer (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

This application discloses a video playing control method that includes: sending an acquisition request to a server; acquiring the video data of a specified video and its corresponding indication information, which the server provides according to the acquisition request; acquiring the annotation information of a specified object; and then playing the video data of the specified video and displaying the annotation information of the specified object according to the indication information. In this way, when watching a video, the user can identify objects in the film through the annotation information of specified objects, without quitting playback to look for the corresponding object in the movie synopsis. The method provides the user with a more convenient and more intuitive way to obtain information, thereby reducing frequent interaction between the user and the network, improving the viewing experience, and reducing the occupation and waste of video platform resources. This application also discloses a corresponding video playing control apparatus, device, and storage medium.

Description

Video playing control method, device, equipment and medium
Technical Field
The present application relates to the field of video technologies, and in particular, to a method, an apparatus, a device, and a computer-readable storage medium for controlling video playing.
Background
At present, users watching videos often find it hard to tell apart the objects that appear on screen. For example, when watching a foreign-language video, the actors may look very similar, so the viewer often cannot remember which role each actor plays; likewise, when watching an animation, the cartoon figures may closely resemble one another, so the viewer easily confuses different animal characters.
Generally, when a user watching a video cannot identify an object, the user quits playback, consults the movie synopsis, and distinguishes the specific objects based on it.
Therefore, how to reduce this frequent interaction between the user and the network, and thereby reduce the occupation and waste of video platform resources by technical means, is an urgent problem for video platforms to solve.
Disclosure of Invention
The embodiment of the application provides a video playing control method, which can reduce frequent interaction between a user and a network and reduce occupation and waste of video platform resources. Based on the above, the embodiment of the application also provides a corresponding device, equipment and computer storage medium.
In view of the above, an aspect of the present application provides a video playback control method, including:
sending an acquisition request to a server, where the acquisition request is used to request the video data of a specified video and its corresponding indication information;
acquiring the video data of the specified video and its corresponding indication information, which the server provides according to the acquisition request, where the indication information is used to instruct a client to display the annotation information of a specified object at a specified position corresponding to a specified time when the video data of the specified video is played;
acquiring the annotation information of the specified object; and
playing the video data of the specified video and displaying the annotation information of the specified object according to the indication information.
One aspect of the present application provides a video playing control method, where the method includes:
receiving an acquisition request sent by a client, where the acquisition request is used to request the video data of a specified video and its corresponding indication information; and
providing the video data of the specified video and its corresponding indication information to the client according to the acquisition request, where the indication information is used to instruct the client to display the annotation information of a specified object at a specified position corresponding to a specified time when the client plays the video data of the specified video.
One aspect of the present application provides a video playback control apparatus, the apparatus including:
a sending module, configured to send an acquisition request to a server, where the acquisition request is used to request the video data of a specified video and its corresponding indication information;
a first obtaining module, configured to obtain the video data of the specified video and its corresponding indication information, which the server provides according to the acquisition request, where the indication information is used to instruct a client to display the annotation information of a specified object at a specified position corresponding to a specified time when the client plays the video data of the specified video;
a second obtaining module, configured to obtain the annotation information of the specified object; and
a control module, configured to play the video data of the specified video and display the annotation information of the specified object according to the indication information.
One aspect of the present application provides a video playback control apparatus, the apparatus including:
a receiving module, configured to receive an acquisition request sent by a client, where the acquisition request is used to request the video data of a specified video and its corresponding indication information; and
a providing module, configured to provide the video data of the specified video and its corresponding indication information to the client according to the acquisition request, where the indication information is used to instruct the client to display the annotation information of a specified object at a specified position corresponding to a specified time when the client plays the video data of the specified video.
One aspect of the present application provides a video playback control apparatus, including a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to execute the steps of the video playback control method according to instructions in the program code.
An aspect of the present application provides a computer-readable storage medium for storing program codes for executing the above-mentioned video playback control method.
According to the technical scheme, the embodiment of the application has the following advantages:
in the method, a client acquires video data and its corresponding indication information from a server, where the indication information is used to instruct the client to display the annotation information of a specified object at a specified position corresponding to a specified time; when the client plays the video data, it displays the annotation information of the specified object according to the indication information. Therefore, when watching the video, the user can identify objects in the film through the annotation information of the specified object and no longer needs to quit playback to look for the corresponding object in the movie synopsis, which reduces frequent interaction between the user and the network and reduces the occupation and waste of video platform resources.
Drawings
Fig. 1 is a scene architecture diagram of a video playing control method in an embodiment of the present application;
fig. 2 is a flowchart of a video playing control method in an embodiment of the present application;
fig. 3 is a flowchart of a video playing control method in an embodiment of the present application;
fig. 4 is a flowchart of a video playing control method in an embodiment of the present application;
fig. 5 is a flowchart of a video playing control method in an embodiment of the present application;
fig. 6 is an interaction flowchart of a video playing control method in an actual application scenario in the embodiment of the present application;
FIG. 7 is a schematic interface diagram illustrating an operation of clicking a label component to trigger object labeling in the embodiment of the present application;
FIG. 8 is a schematic interface diagram illustrating a highlighted face image in the form of a bounding box in an embodiment of the present application;
FIG. 9 is a schematic diagram of an interface displaying a callout information input control in an embodiment of the present application;
fig. 10 is a schematic structural diagram of a video playback control apparatus according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of a video playback control apparatus according to an embodiment of the present application;
fig. 12 is a schematic structural diagram of a video playback control apparatus according to an embodiment of the present application;
fig. 13 is a schematic structural diagram of a video playback control apparatus according to an embodiment of the present application;
fig. 14 is a schematic structural diagram of a video playback control apparatus according to an embodiment of the present application;
fig. 15 is a schematic structural diagram of a video playback control apparatus according to an embodiment of the present application;
fig. 16 is a schematic structural diagram of a video playback control apparatus in an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In the method, a client obtains video data and its corresponding indication information from a server. The indication information is used to instruct the client to display the annotation information of a specified object at a specified position corresponding to a specified time, and when the client plays the video data, it displays the annotation information of the specified object according to the indication information.
Therefore, when watching the video, the user can identify objects in the film through the annotation information of the specified object, without quitting playback to find the corresponding object in the movie synopsis. The method provides the user with a more convenient way to acquire information and makes that information more intuitive, thereby reducing frequent interaction between the user and the network, improving the viewing experience, and reducing the occupation and waste of video platform resources.
It can be understood that the video playing control method provided by this application can be applied to a terminal device on which a client is installed; by running the client, the terminal device interacts with a server to provide the user with a video playing control service, so that the user can identify objects in a film through the annotation information of a specified object while watching a video, without quitting playback to search the movie synopsis for the corresponding object. The terminal device may be any computing device with data processing capability, such as a desktop computer, a notebook computer, or a smartphone. The client installed on the terminal device may be a stand-alone application, or a functional module, plug-in, or component integrated into another application; for example, it may be a plug-in integrated into a video platform to provide the video playing control service.
To facilitate understanding of the technical solution of this application, the video playing control method in the embodiments of this application is described below with reference to a specific scenario.
Fig. 1 is a scene architecture diagram of a video playing control method in an embodiment of the present application. Referring to fig. 1, the scenario includes a terminal device 10 and a server 20; a client is installed on the terminal device 10 and, when running, interacts with the server 20 to execute the video playing control method of the embodiments of this application, providing the user with a video playing control service. A specific implementation of the method is described below.
The client sends an acquisition request to the server 20 through the terminal device 10 and receives the video data of the specified video and its corresponding indication information returned by the server 20, where the indication information is used to instruct the client to display the annotation information of a specified object at a specified position corresponding to a specified time when playing the video data of the specified video. The client then acquires the annotation information of the specified object, plays the video data of the specified video, and displays the annotation information of the specified object according to the indication information.
Therefore, when watching the video, the user can identify objects in the film through the annotation information of the specified object, without quitting playback to find the corresponding object in the movie synopsis. The method thus gives the user a more convenient and more intuitive way to obtain information, reducing frequent interaction between the user and the network and reducing the occupation and waste of video platform resources.
Next, a video playback control method in the embodiment of the present application will be described in detail from the perspective of the client and the server, respectively.
First, it is described from a client perspective in conjunction with fig. 2.
Fig. 2 is a flowchart of a video playing control method in an embodiment of the present application, applied to a client. Referring to fig. 2, the method includes:
S201: sending an acquisition request to a server.
The acquisition request is used to request the video data of the specified video and its corresponding indication information. Specifically, when the user triggers a playing operation for the specified video, the client responds by sending the server an acquisition request for the video data of the specified video and its corresponding indication information.
In a specific implementation, the acquisition request carries a video identifier that uniquely identifies the specified video. As a specific example, every video on a video platform has a distinct video number, so the video number can serve as the video identifier; of course, in practice the identifier may take other forms, for example the title of the video.
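To make the exchange concrete, the sketch below shows what such an acquisition request might look like over an HTTP/JSON transport. The patent does not prescribe any transport, endpoint, or field names, so all of those here are illustrative assumptions.

```python
import requests  # assumes an HTTP/JSON transport, which the patent does not mandate

SERVER = "https://video-platform.example.com"  # hypothetical server address

def send_acquisition_request(video_id: str) -> dict:
    """Request the video data of a specified video and its indication information.

    The patent only requires that the request carry a video identifier that
    uniquely identifies the specified video; everything else is illustrative.
    """
    resp = requests.get(f"{SERVER}/videos/{video_id}", params={"with": "indication"})
    resp.raise_for_status()
    return resp.json()  # e.g. {"video": ..., "indication_info": [...]}
```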
S202: acquiring the video data of the specified video and its corresponding indication information, provided by the server according to the acquisition request.
The indication information is used to instruct the client to display the annotation information of a specified object at a specified position corresponding to a specified time when the video data of the specified video is played. In this embodiment, the server stores the video data of a video and its corresponding indication information, so that the client can obtain, according to the acquisition request, the video data of the specified video and the indication information corresponding to that video data.
The client can obtain the video data of the specified video and its corresponding indication information from the server in several ways. In one implementation, the client receives the video data of the specified video and its corresponding indication information returned by the server in response to the acquisition request. In another implementation, the client actively fetches them: it receives, in response to the acquisition request, the storage addresses of the video data of the specified video and of its indication information, and then accesses the server at those addresses to acquire the video data and the corresponding indication information.
In this embodiment, the indication information may take the form of a specific display position for the specified object at a specific time point. For example, the server identifies a face in the video through face recognition, determines when the specified object appears in the video, and from that derives the display time point of the specified object's annotation information, which may be accurate to seconds. The server then generates the indication information, for example from the display time point corresponding to the annotation information of the specified object and the display position corresponding to that display time point. In this form, the indication information includes a display time point corresponding to the annotation information of the specified object and a display position corresponding to the display time point.
Since each second of video data contains multiple frames of image data, the indication information can also take the form of a specific display position for the specified object at a specific video frame number. In some possible implementations of the embodiments of this application, the indication information may instead include a video frame number corresponding to the annotation information of the specified object and a display position corresponding to that video frame number.
It should be noted that video data is generally large, and to improve video playback response efficiency the server may slice the video data, obtaining multiple video slices for one video. The client can then fetch the video slices corresponding to the video as it plays, so the video data returned by the server may be a video slice. For example, a 90-minute movie can be divided into many slices, which shortens the loading time before playback, saves bandwidth, and relieves pressure on the server. Of course, in some possible implementations the video data may be complete video data; for a video shorter than 5 minutes, for instance, the server may return the entire video data without dividing it into slices.
When the video data is a video slice, the indication information may include a display time point, within the video slice, corresponding to the annotation information of the specified object, and the display position corresponding to that display time point; alternatively, it may include a video frame number, within the video slice, corresponding to the annotation information of the specified object, and the display position corresponding to that video frame number.
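As a reading aid, one plausible shape for a single indication record, covering both representations above as well as the video-slice case, is sketched below; the field names are assumptions, not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class IndicationEntry:
    """One indication record: where and when to show one object's annotation.

    Exactly one of display_time_s / frame_number is expected to be set,
    matching the two representations described above.
    """
    object_id: str                     # identifies the specified object
    display_time_s: Optional[float]    # display time point, accurate to seconds
    frame_number: Optional[int]        # alternative: a video frame number
    x: int                             # display position in the frame (top-left)
    y: int
    slice_index: Optional[int] = None  # set when the video data is a video slice
```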
S203: acquiring the annotation information of the specified object.
Annotation information identifies a specified object and helps the user distinguish between objects. It can be identity information of the specified object, such as a name or a category. For example, if the specified object is a person in the video, the annotation information may be the person's in-video name, that is, the role name, or alternatively the person's nickname, occupation, and so on; if the specified object is an animal in the video, the annotation information may be the animal's category, for example the category name "otter" for an animal that is not easily recognized.
The client acquires the annotation information of the specified object so that it can display that information when playing the video data, helping the user distinguish the objects in the video. In some cases the client stores annotation information for objects in the video locally; if the specified object is an object already annotated in the video, the client can obtain its annotation information directly from local storage. In other cases the server stores annotation information for objects in the video in advance, and the client obtains the annotation information of the specified object from the server.
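A minimal sketch of the two lookup paths just described, assuming the local annotations are cached in a plain dictionary and the server lookup is abstracted behind a callable; both assumptions are illustrative.

```python
from typing import Callable, Optional

def get_annotation(object_id: str,
                   local_cache: dict,
                   server_fetch: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return the annotation information for a specified object.

    Prefers annotation information cached locally (e.g. entered earlier by
    this user), and otherwise falls back to the server, mirroring the two
    cases described above.
    """
    if object_id in local_cache:    # case 1: object annotated locally
        return local_cache[object_id]
    return server_fetch(object_id)  # case 2: annotation pre-stored on the server
```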
It should be noted that S202 and S203 may be executed simultaneously or in a preset order, and the execution order of S202 and S203 does not affect the specific implementation of the present application.
S204: playing the video data of the specified video and displaying the annotation information of the specified object according to the indication information.
After acquiring the video data of the specified video, its corresponding indication information, and the annotation information, the client can play the video data; since the indication information carries the time and position at which to display the annotation information of the specified object, the client displays that annotation information at the specified position corresponding to the specified time, according to the indication information and the annotation information.
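One way a player might act on the indication information during playback is sketched below: on every time update it selects the entries whose display time point matches the current position and draws their annotations. The half-second tolerance and the drawing hook are assumptions, not part of the patent.

```python
def annotations_due(current_time_s: float, entries: list,
                    tolerance_s: float = 0.5) -> list:
    """Select indication entries whose display time point matches playback time.

    `entries` are records like IndicationEntry above; the tolerance models the
    'accurate to seconds' granularity of the display time point.
    """
    return [e for e in entries
            if e.display_time_s is not None
            and abs(e.display_time_s - current_time_s) <= tolerance_s]

# In the player's time-update callback (hypothetical API):
#   for entry in annotations_due(player.current_time, indication_entries):
#       overlay.draw_text(annotations[entry.object_id], entry.x, entry.y)
```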
In practice, once a user can already recognize a given object, the annotation no longer needs to be shown and the user may prefer an unobstructed picture. Based on this, in some possible implementations the client can also hide the annotation information so that it does not affect the viewing experience. Specifically, the client hides the annotation information of the specified object in response to an object-annotation cancellation operation. In a specific implementation, the user can choose to display or hide the annotation information of the specified object as needed: when objects in the video are hard to distinguish, displaying the annotation information helps the user tell them apart, and when the user can distinguish them, hiding the annotation information keeps it from affecting the viewing experience.
As can be seen from the above, in this method the client acquires video data and its corresponding indication information from the server, where the indication information is used to instruct the client to display the annotation information of the specified object at the specified position corresponding to the specified time, and the client displays that annotation information according to the indication information when playing the video data. The method therefore provides the user with a more convenient and more intuitive way to obtain information, reducing frequent interaction between the user and the network and reducing the occupation and waste of video platform resources.
In the embodiment shown in fig. 2, a key step of the video playing control method is acquiring the annotation information of the specified object. That annotation information can be generated by the video platform operator by pre-marking objects on the server. However, to improve the viewing experience and meet users' personalized annotation needs, this application also provides a corresponding solution in which the client offers a personalized annotation service. The user first annotates one or more annotatable objects according to actual needs; the client can then cache that annotation information locally, so that it can later obtain the annotation information of the specified object locally and display it during playback. Of course, the client may also send the annotation information of the annotated object to the server, which stores it for subsequent use. To facilitate understanding of this solution, it is explained below in connection with fig. 3.
Fig. 3 is a flowchart of a video playing control method in an embodiment of the present application. It differs from the embodiment shown in fig. 2 in adding a procedure for annotating annotatable objects; this embodiment therefore details only the differences, and for the other steps reference can be made to the embodiment shown in fig. 2. Referring to fig. 3, the method includes:
S301: in response to an object annotation operation, pausing the playback of the video data and, according to the region positions of the annotatable objects acquired from the server, highlighting the annotatable objects in the video frame shown while the video is paused.
The client pauses the playback of the video data in response to the object annotation operation and, using the region positions of the annotatable objects acquired from the server, highlights the annotatable objects in the paused video frame, so that the user can select one or more annotatable objects to annotate.
An annotatable object is an object in the video that can be, or is allowed to be, annotated; for example, it may be a person or an animal in the video. The annotatable objects may include all of the human or animal figures in the video, or only some of them, such as the lead roles. The client can provide an annotation control on the playback page; when the user triggers an object annotation operation, the client pauses the playback of the video data and, according to the region positions of the annotatable objects acquired from the server, highlights the annotatable objects in the paused video frame so that the user can select the object to annotate. In some possible implementations the client may highlight the annotatable objects with bounding boxes; specifically, it highlights every annotatable object in the paused video frame in the form of a bounding box, as sketched below.
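A minimal sketch of the highlighting step, assuming the paused frame is available as an image array and using OpenCV as the drawing layer; any rendering toolkit would do.

```python
import cv2  # illustrative drawing layer; any UI toolkit could draw the boxes

def highlight_annotatable_objects(frame, regions):
    """Draw a bounding box around every annotatable object in the paused frame.

    `regions` is the list of (x, y, width, height) region positions obtained
    from the server; colour and line thickness are arbitrary choices.
    """
    for (x, y, w, h) in regions:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    return frame
```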
S302: in response to a region selection operation on an annotatable object, displaying an annotation information input control for that object.
With the annotatable objects highlighted in the paused video frame, the user can select the object to annotate by mouse click, touch, or voice control, and the client displays the annotation information input control for that object in response to the region selection operation.
It should be noted that when several annotatable objects are shown in the paused video frame, the user may select only one at a time, with the client displaying the input control for that object in response to the selection; alternatively, in other possible implementations, the user may select several annotatable objects at once, and the client displays the annotation information input controls for all of them.
S303: receiving, in response to an annotation information input operation, the annotation information entered for the annotatable object.
Through the annotation information input control displayed by the client, the user can enter annotation information for the annotatable object, and the client receives it in response to the input operation.
The client can store the annotation information of the annotatable object locally, so that when it later needs the annotation information of a specified object it can obtain it locally and display it at the specified position corresponding to the specified time during playback.
Of course, the client may also send the received annotation information to the server, where it is stored for subsequent playback, as sketched below.
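The storage step might look like the following sketch, which keeps a local copy and optionally posts the annotation to the server; the URL, payload shape, and use of HTTP are assumptions.

```python
import requests

def submit_annotation(object_id: str, text: str, local_cache: dict,
                      server_url: str = "https://video-platform.example.com") -> None:
    """Store user-entered annotation information for an annotatable object.

    The method allows either or both destinations: a local cache for this
    client's later playback, and the server for subsequent sessions.
    """
    local_cache[object_id] = text               # keep a local copy for playback
    requests.post(f"{server_url}/annotations",  # optionally persist on the server
                  json={"object_id": object_id, "text": text})
```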
As can be seen from the above, in the video playing control method provided by the embodiments of this application, the client pauses video playback in response to an object annotation operation and highlights the annotatable objects in the paused video frame according to their region positions acquired from the server; it then displays an annotation information input control in response to a selection operation on an annotatable object, and receives the annotation information entered through it. The client can send an acquisition request to the server to obtain the video data and its corresponding indication information, take the annotation information of the specified object from the annotation information entered for annotatable objects, play the video data, and display the annotation information of the specified object according to the indication information.
Therefore, when watching the video, the user can identify objects in the film through the annotation information of the specified object, without quitting playback to find the corresponding object in the movie synopsis. The method provides the user with a more convenient and more intuitive way to obtain information, reducing frequent interaction between the user and the network and reducing the occupation and waste of video platform resources.
Fig. 2 and fig. 3 illustrate the video playing control method of the embodiments of this application in detail from the client's perspective; the method is described next in detail from the server's perspective, with reference to specific embodiments.
Fig. 4 is a flowchart of a video playing control method in an embodiment of the present application, which is applied to a server, and the method includes:
s401: and receiving an acquisition request sent by a client.
The acquisition request is used to request the video data of a specified video and its corresponding indication information; for its specific implementation, see the description above.
S402: providing the video data of the specified video and its corresponding indication information to the client according to the acquisition request.
The indication information is used to instruct the client to display the annotation information of a specified object at a specified position corresponding to a specified time when the video data is played. In a specific implementation, the server provides the video data of the specified video and its corresponding indication information to the client according to the acquisition request, so that when playing the video data the client can display the annotation information of the specified object at the specified position corresponding to the specified time, based on the video data, the corresponding indication information, and the annotation information.
The server can provide the client with the video data of the specified video and its corresponding indication information in several ways. In one implementation, the server directly returns them to the client in response to the acquisition request; in another, the server returns the storage addresses of the video data and of its indication information, from which the client then fetches them.
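A server-side sketch of the two delivery modes, written with Flask for brevity; the route, query parameter, and the stubbed storage helpers are all hypothetical.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def load_video(video_id):       # hypothetical stand-in for real storage access
    return {"id": video_id, "slices": []}

def load_indication(video_id):  # hypothetical stand-in for real storage access
    return []

@app.route("/videos/<video_id>")
def provide_video(video_id):
    """Serve a specified video's data and indication information.

    Mirrors the two implementations above: return the payload directly, or
    return storage addresses that the client then fetches itself.
    """
    if request.args.get("mode") == "by-address":
        return jsonify({"video_url": f"/store/{video_id}/video",
                        "indication_url": f"/store/{video_id}/indication"})
    return jsonify({"video": load_video(video_id),
                    "indication_info": load_indication(video_id)})
```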
It can be seen that in this method the server, in response to the acquisition request sent by the client, provides the client with the video data of the specified video and its corresponding indication information, so that when playing the video data the client can display the annotation information of the specified object at the specified position corresponding to the specified time according to the indication information. In this way, the user can identify objects in the film through the annotation information of the specified object while watching the video, without quitting playback to look for the corresponding object in the movie synopsis.
In the embodiment shown in fig. 4, when the server returns the indication information to the client, it must first determine the specified time and specified position for the specified object in order to construct the indication information. Based on this, the embodiments of this application further provide a specific implementation of the video playing control method, described in detail below.
Fig. 5 is a flowchart of a video playing control method in an embodiment of the present application. It differs from the embodiment shown in fig. 4 by adding a video preprocessing step: for a video on the video platform, the server can identify the annotatable objects in it in advance and determine the indication information corresponding to the video, providing data support so that when the client plays a specified video it can display the annotation information of a specified object according to that video's indication information. The server may preprocess every video, or only some videos according to service requirements; the process is the same for each video, so for ease of understanding only the preprocessing of one video is introduced here. Regarding the step of determining the indication information, this embodiment details only the differences from the embodiment shown in fig. 4; for the other steps, refer to that embodiment. Referring to fig. 5, the method includes:
S501: acquiring the material data corresponding to the video.
The material data includes video data of a video and an image of an annotatable object.
The image of an annotatable object can be extracted from stills or from the video itself. Taking a person in the video as the annotatable object, the image may be a face image of that person. In some possible implementations, an operator of the video platform logs in to the material-upload platform and uploads the video together with the face images of its main characters to the server, so that the server obtains the material data corresponding to the video, namely the video data and the face images of the main characters.
It should be noted that an annotatable object is not limited to a main character; it may also be a supporting character, or a non-human object such as an animal in the video, which is not limited in the embodiments of this application.
S502: identifying the video data of the video according to the image of the annotatable object, to obtain the region position of the annotatable object in the video frame.
Using image recognition, the server compares the image of the annotatable object with the frames of image data contained in the video data to find the video frames containing the annotatable object, and from them obtains the region position of the annotatable object in the video frame. If the annotatable object is a person, the corresponding region position may be the position of the person's head region; if it is an animal, it may be the position of the animal's whole body.
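One possible implementation of this recognition step, using the open-source face_recognition library as the matching backend, is sketched below; the patent names no particular algorithm, so the choice of library is an assumption.

```python
import face_recognition  # one possible backend; the patent names no algorithm

def locate_object_in_frame(frame, reference_encoding):
    """Return the region position (x, y, width, height) of the annotatable
    object in one video frame, or None if the object is absent.

    Compares the reference image's face encoding against every face detected
    in the frame, as in step S502; matching thresholds are library defaults.
    """
    locations = face_recognition.face_locations(frame)
    encodings = face_recognition.face_encodings(frame, locations)
    for (top, right, bottom, left), enc in zip(locations, encodings):
        if face_recognition.compare_faces([reference_encoding], enc)[0]:
            return (left, top, right - left, bottom - top)
    return None
```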
In some cases an annotatable object appears only briefly, and displaying annotation information at such a moment would not help the user; the server may therefore only record the region position of the annotatable object in video frames where the object appears continuously for a certain time. Specifically, for each video slice of the video data, the server identifies the slice against the image of the annotatable object and obtains the object's region position in the video frame at the display time point, where the display time point is the earliest time point at which the annotatable object first appears continuously in the video slice for at least a time threshold.
For ease of understanding, consider a specific example. Suppose the video data is sliced into 10-second slices and the time threshold is set to 2 seconds based on experience. The server identifies the slices against the face image of person A; when the face image of person A first appears continuously for 2 seconds in a slice, the earliest time point of that first 2-second run is taken as the display time point, and the server then obtains the region position of person A in the video frame at the display time point.
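The "earliest point of the first sufficiently long run" rule lends itself to a simple scan. The sketch below assumes one presence sample per second of the slice, which is an assumption of the sketch; the patent only fixes the rule itself.

```python
from typing import Optional

def display_time_point(present: list, threshold_s: int = 2) -> Optional[int]:
    """Find the display time point within one video slice.

    `present[t]` says whether the object was detected at second t of the
    slice. Returns the earliest second at which the object first appears
    continuously for `threshold_s` seconds, or None if no run is long enough
    (in which case no annotation is shown in this slice).
    """
    run_start = None
    for t, seen in enumerate(present):
        if seen:
            if run_start is None:
                run_start = t
            if t - run_start + 1 >= threshold_s:
                return run_start
        else:
            run_start = None
    return None

# e.g. a 10-second slice where person A appears during seconds 2-5 and 7-8:
# display_time_point([0, 0, 1, 1, 1, 1, 0, 1, 1, 0]) -> 2
```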
S503: determining the display position of the annotatable object's annotation information in the video frame according to the object's region position in the video frame.
Since annotation information is meant to help the user identify objects in the video, its display position should be tied to the region position of the annotatable object in the video frame, so that the user can match the annotation to the object. The display position can be on the periphery of the object's region position; as one possible implementation, it is the upper-left corner of the region position, and in other possible implementations the server may use the top-center of the region position as the display position of the annotation information in the video frame.
S504: determining the indication information corresponding to the video data of the video according to the display position of the annotatable object's annotation information in the video frame.
The indication information corresponding to the video data is used to instruct a client to display the annotation information of a specified object at a specified position corresponding to a specified time when the video data is played. The specified time represents the display time of the annotatable object's annotation information, and the specified position represents the display position corresponding to that time. Because the display position corresponds to a specific video frame, the server can determine the display time of the annotation information from that video frame, and then determine the indication information from the display time and the display position.
In some possible implementations, the server determines a display time point from the video frame, and then determines the indication information from the display position of the annotation information in that frame and the frame's display time point; in this implementation, the indication information includes the display time point corresponding to the annotatable object's annotation information and the display position corresponding to that time point.
In other possible implementations, the server determines the indication information from the display position of the annotation information in the video frame and the frame's video frame number; in this implementation, the indication information includes the video frame number corresponding to the annotatable object's annotation information and the display position corresponding to that frame number.
Determining the indication information from display time points rather than frame numbers reduces the amount of data and the computing load on the server.
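Combining S502 to S504, one indication record could be assembled as below; the dictionary keys and the choice of the upper-left corner as display position follow the options discussed above, while the exact shape is an assumption.

```python
def build_indication(object_id: str, slice_index: int, time_point_s: int,
                     region: tuple) -> dict:
    """Assemble one indication record from the results of S502 and S503.

    Uses the display-time-point representation, which the text notes is
    lighter than per-frame numbers, and places the annotation at the
    upper-left corner of the object's region position.
    """
    x, y, w, h = region
    return {"object_id": object_id,
            "slice_index": slice_index,
            "display_time_s": time_point_s,
            "position": {"x": x, "y": y}}  # upper-left corner of the region
```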
In view of the above, an embodiment of this application provides a video playing control method in which the server acquires the material data corresponding to a video, identifies the video data in the material data against the images of the annotatable objects to obtain each object's region position in the video frame, determines the display position of each object's annotation information from that region position, and then determines the indication information corresponding to the video data from those display positions. When the server returns the indication information to the client, the client can display the annotation information of the specified object at the specified position corresponding to the specified time while playing the video data. Thus, when watching the video, the user can identify objects in the film through the annotation information of the specified object, without quitting playback to find the corresponding object in the movie synopsis. The method provides the user with a more convenient and more intuitive way to obtain information, reducing frequent interaction between the user and the network and reducing the occupation and waste of video platform resources.
Fig. 2 to fig. 5 describe the video playing control method of the embodiments of this application from the perspective of the client or the server respectively; the method is described next from an interaction perspective, in a specific application scenario.
Fig. 6 is an interaction flowchart of a video playing control method in an actual application scenario of the embodiments of this application. Referring to fig. 6, the scenario includes a first client 100, a second client 200, and a server 300. The first client 100 is installed on a terminal device of the video platform's operator and manages the video data in the server 300, for example by uploading video data to it; the second client 200 is installed on a terminal device of a user of the video platform, obtains the video data of a specified video and its corresponding indication information from the server 300, and displays the annotation information of a specified object according to the indication information when playing the video data. As shown in fig. 6, the method specifically includes the following steps:
s601: the first client 100 uploads the movie video data and the face image of the main character of the movie to the server 300.
The film's video data and the face images of its main characters constitute the material data corresponding to the film.
S602: the server 300 identifies the film video data according to the face image, obtains the area position of the video frame of each face image at the display time point, and determines the display position of the annotation information in the video frame based on the area position.
The display time point is the earliest time point at which the face image first appears continuously, for at least the time threshold, in a video slice of the film. If the slice duration is 10 seconds and the time threshold is 2 seconds, the display time point is the earliest time point at which the face image first appears continuously for 2 seconds within the slice. The server 300 locates the face image and its region position in the video frame at the display time point, and then takes a spot on the periphery of that region, such as the upper-left corner, as the display position of the annotation information in the video frame.
The display time point can be illustrated with a specific example. In one video slice, if a face image appears continuously during seconds 2-5 and again during seconds 6-8, the earliest time point of the first run lasting at least 2 seconds, namely the 2-second mark, is taken as the display time point. In another video slice, if the face image appears only at the 3rd second, no run reaches the time threshold, so that slice has no display time point and no annotation information is displayed in it.
The server 300 can process each face image in turn: it obtains the display time point for each face image, the region position of that face image in the video frame at the display time point, and, from that region position, the display position in the video frame corresponding to each display time point.
S603: the second client 200 sends a first acquisition request to the server 300 in response to a video play operation triggered by a user.
S604: The server 300 returns to the second client 200 the video data of the specified video, together with the display time point corresponding to each face image in the video data and the region position in the video frame at that display time point.
As some specific examples of the application, the specified video may be a video the user selects from a movie list in the second client 200, or a video the user searches for in the second client 200's search box and selects from the results.
S605: the second client 200 pauses the play of the video data and highlights the face image in the form of a bounding box in the video frame at the time of the video pause according to the region position of the face image acquired from the server 300 in response to the object annotation operation.
As shown in fig. 7, the playing interface of the video platform client has an annotation component. When the user clicks the annotation component with a mouse to trigger an object annotation operation, the second client 200 pauses the playing of the video data in response to the operation. According to the area positions of the face images acquired from the server, the second client 200 highlights each face image in the form of a bounding box in the paused video frame; as shown in fig. 8, the paused video frame includes five face images, and each of them is highlighted with a bounding box.
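A minimal sketch of S605 follows, assuming an HTML5 video element with a canvas overlay of the same pixel size; the highlight style is illustrative only.

```typescript
interface AreaPosition { x: number; y: number; w: number; h: number; }

// Pause the video and highlight every face image in the paused frame
// as a bounding box drawn on a canvas overlay.
function onObjectAnnotation(video: HTMLVideoElement,
                            overlay: HTMLCanvasElement,
                            areas: AreaPosition[]): void {
  video.pause();
  const ctx = overlay.getContext('2d');
  if (!ctx) return;
  ctx.clearRect(0, 0, overlay.width, overlay.height);
  ctx.strokeStyle = 'yellow';
  ctx.lineWidth = 2;
  for (const a of areas) ctx.strokeRect(a.x, a.y, a.w, a.h);
}
```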
S606: the second client 200 displays an annotation information input control for the face image in response to the region selection operation for the face image.
S607: the second client 200 receives the annotation information made for the face image in response to the annotation information input operation.
As shown in fig. 9, the user may select a bounding box to perform a region selection operation for a face image. When the region of a certain face image is selected, the second client 200 displays an annotation information input control at the periphery of that region, for example, in its upper left corner. The user inputs the annotation information, such as "dewy", through the control, and the second client 200 receives the annotation information made for the face image in response to the annotation information input operation.
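S606 and S607 could be wired up as below; this is a sketch under the assumption of plain DOM APIs, and the control placement and event handling are illustrative rather than the embodiment's actual implementation.

```typescript
interface AreaPosition { x: number; y: number; w: number; h: number; }

// Map a click inside the paused frame to the bounding box it hits;
// returns -1 when no face region is selected.
function hitTest(x: number, y: number, areas: AreaPosition[]): number {
  return areas.findIndex(a =>
    x >= a.x && x <= a.x + a.w && y >= a.y && y <= a.y + a.h);
}

// Show an annotation information input control at the periphery of the
// selected area (its upper left corner here) and hand the entered text
// to a callback once the user confirms it.
function showAnnotationInput(area: AreaPosition,
                             onSubmit: (text: string) => void): void {
  const input = document.createElement('input');
  input.placeholder = 'annotation, e.g. a character name';
  input.style.position = 'absolute';
  input.style.left = `${area.x}px`;
  input.style.top = `${area.y - 28}px`;
  input.addEventListener('change', () => {
    onSubmit(input.value); // receive the annotation information
    input.remove();
  });
  document.body.appendChild(input);
}
```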
S608: the second client 200 sends a second acquisition request to the server 300.
S609: the server 300 returns the video data of the specified video and the corresponding indication information to the second client 200.
The indication information is used for instructing the client to display the annotation information of the specified face image at the specified position corresponding to the specified time. The specified time includes the display time point of the specified face image, and the specified position includes the display position, in the video frame at that display time point, of the annotation information of the specified face image.
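One plausible shape for the indication information returned in S609, using the display-time-point variant, is sketched below; all field names and the visibility window are assumptions.

```typescript
// Hypothetical indication information: specified time plus specified
// position for each annotated face image.
interface Indication {
  faceId: string;
  specifiedTime: number;                       // display time point, seconds
  specifiedPosition: { x: number; y: number }; // display position in frame
}

interface SecondAcquisitionResponse {
  videoUrl: string;
  indications: Indication[];
}

// Indications whose display time point has been reached, within a small
// visibility window around the current playback time (assumed 3 s).
function activeIndications(t: number, all: Indication[],
                           visibleFor = 3): Indication[] {
  return all.filter(i => t >= i.specifiedTime && t <= i.specifiedTime + visibleFor);
}
```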
S610: the second client 200 acquires annotation information of the specified face image.
S611: the second client 200 plays the video data and displays the annotation information of the specified face image according to the indication information.
The specific playing effect can be seen in fig. 9: when the video data is played, the second client displays the annotation information of the specified face image according to the indication information.
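S611 could hook the display into playback as sketched below; the 3-second visibility window is an assumption, since the embodiment leaves the display duration open.

```typescript
interface Indication {
  label: string;                               // annotation information
  specifiedTime: number;                       // display time point
  specifiedPosition: { x: number; y: number }; // display position
}

const VISIBLE_FOR = 3; // seconds, assumed

// While the video plays, draw each annotation once its display time
// point is reached, at its display position in the frame.
function attachAnnotationDisplay(video: HTMLVideoElement,
                                 overlay: HTMLCanvasElement,
                                 indications: Indication[]): void {
  const ctx = overlay.getContext('2d');
  if (!ctx) return;
  video.addEventListener('timeupdate', () => {
    ctx.clearRect(0, 0, overlay.width, overlay.height);
    ctx.font = '16px sans-serif';
    ctx.fillStyle = 'white';
    const t = video.currentTime;
    for (const ind of indications) {
      if (t >= ind.specifiedTime && t <= ind.specifiedTime + VISIBLE_FOR) {
        ctx.fillText(ind.label, ind.specifiedPosition.x, ind.specifiedPosition.y);
      }
    }
  });
}
```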
It should be noted that S601 and S602 constitute a preprocessing of the video data. For any video, once it has been uploaded to the server and the server has identified the main characters in it, the area position of each face image in the video frame at its display time point, and the display position of the annotation information determined from that area position, are already available; when a user later requests the video data, the corresponding data can be obtained directly from the server without performing S601 and S602 again.
In this application scenario, the server preprocesses the video data uploaded by the operator: it identifies the video data according to the face images of the main characters of the film, obtains the display time point of each face image and the area position of each face image in the video frame at that display time point, and determines the display position of the annotation information based on the area position. When the user triggers an object annotation operation, the second client corresponding to the user responds by pausing the playing of the video data and providing an information annotation path, so that the user can annotate a face image. In this way, from the acquired indication information the second client knows the display time point of the specified face image, the display position of its annotation information, and the annotation information itself, and when the video data is played, the annotation information of the specified face image is displayed at the specified position corresponding to the specified time.
Therefore, the method provides a more convenient information acquisition mode for the user, and is convenient for the user to acquire information more intuitively, thereby reducing frequent interaction between the user and the network and reducing occupation and waste of video platform resources.
Based on the foregoing specific implementation manners of the video playing control method provided in the embodiments of the present application, the embodiments of the present application further provide a video playing control apparatus. Next, the video playing control apparatus in the embodiments of the present application is described from the perspective of functional modularization with reference to the accompanying drawings.
Fig. 10 is a schematic structural diagram of a video playback control apparatus according to an embodiment of the present application, and referring to fig. 10, the apparatus 1000 includes:
a sending module 1010, configured to send an acquisition request to a server, where the acquisition request is used to request to acquire video data of a specified video and indication information corresponding to the video data;
a first obtaining module 1020, configured to obtain video data of the specified video and indication information corresponding to the video data, which are provided by the server according to the obtaining request, where the indication information is used to indicate a client to display annotation information of a specified object at a specified position corresponding to a specified time when playing the video data of the specified video;
a second obtaining module 1030, configured to obtain label information of the specified object;
the control module 1040 is configured to play the video data of the specified video and display the annotation information of the specified object according to the indication information.
Optionally, referring to fig. 11, fig. 11 is a schematic structural diagram of a video playback control apparatus according to an embodiment of the present application, and based on the structure shown in fig. 10, the apparatus further includes a first display module 1050, a second display module 1060, and a receiving module 1070, where:
the first display module 1050 is configured to, in response to an object annotation operation, pause the playing of the video data and display, in a highlighted manner, the annotatable object in the video frame at which the video is paused, according to the area position of the annotatable object acquired from the server;
the second display module 1060 is configured to, in response to a region selection operation for the annotatable object, display an annotation information input control for the annotatable object;
the receiving module 1070 is configured to receive, in response to an annotation information input operation, the annotation information made for the annotatable object.
Optionally, the first display module 1050 is specifically configured to:
highlighting the markable object in a bounding box form.
Optionally, the indication information includes a display time point corresponding to the label information of the specified object and a display position corresponding to the display time point;
the control module 1040 is specifically configured to:
and displaying the label information of the specified object at the display time point corresponding to the label information of the specified object according to the display position corresponding to the display time point.
Optionally, the indication information includes a video frame number corresponding to the annotation information of the specified object and a display position corresponding to the video frame number;
the control module 1040 is specifically configured to:
and displaying the marking information of the specified object on the video frame corresponding to the video frame number according to the display position corresponding to the video frame number.
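For the frame-number variant, the current frame can be derived from the playback time when the frame rate is known; a sketch follows, with the 25 fps figure assumed rather than given by the embodiment.

```typescript
interface FrameIndication {
  label: string;
  frameNumber: number;                // frame carrying the annotation
  position: { x: number; y: number }; // display position in that frame
}

const FPS = 25; // assumed constant frame rate

// Annotations to draw on the frame currently being presented.
function indicationsForTime(t: number, all: FrameIndication[]): FrameIndication[] {
  const frame = Math.floor(t * FPS);
  return all.filter(ind => ind.frameNumber === frame);
}
```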
Optionally, referring to fig. 12, fig. 12 is a schematic structural diagram of a video playback control apparatus according to an embodiment of the present application, and based on the structure shown in fig. 10, the apparatus further includes:
a hiding module 1080, configured to hide the annotation information of the specified object in response to an object annotation canceling operation.
As can be seen from the above, the embodiment of the present application provides a video playing control apparatus that obtains video data and the indication information corresponding to the video data from a server, where the indication information indicates that the annotation information of a specified object is to be displayed at a specified position corresponding to a specified time, and that displays the annotation information of the specified object according to the indication information when playing the video data. Therefore, the apparatus provides a more convenient information acquisition mode for the user, and is convenient for the user to acquire information more intuitively, thereby reducing frequent interaction between the user and the network and reducing occupation and waste of video platform resources.
Fig. 13 is a schematic structural diagram of a video playback control apparatus according to an embodiment of the present application, and referring to fig. 13, the apparatus 1300 includes:
a receiving module 1310, configured to receive an acquisition request sent by a client, where the acquisition request is used to request to acquire video data of a specified video and indication information corresponding to the video data;
a providing module 1320, configured to provide, according to the obtaining request, the video data of the specified video and indication information corresponding to the video data of the specified video to the client, where the indication information is used to indicate that the client displays annotation information of a specified object at a specified position corresponding to a specified time when playing the video data of the specified video.
Optionally, referring to fig. 14, fig. 14 is a schematic structural diagram of a video playback control apparatus according to an embodiment of the present application, and based on the structure shown in fig. 13, the apparatus further includes:
an obtaining module 1330, configured to obtain material data corresponding to a video, where the material data includes video data of the video and an image of an object that can be labeled;
the identifying module 1340 is configured to identify video data of the video according to the image of the markable object, so as to obtain an area position of the markable object in a video frame;
a first determining module 1350, configured to determine, according to the area position of the markable object in the video frame, the display position of the annotation information of the markable object in the video frame, where the display position is located at the periphery of the area position;
the second determining module 1360 is configured to determine, according to a display position of the annotation information of the markable object in a video frame, indication information corresponding to the video data of the video, where the indication information corresponding to the video data of the video is used to indicate a client to display the annotation information of the designated object at a designated position corresponding to a designated time when playing the video data of the video.
Optionally, the identifying module 1340 is specifically configured to:
for a video segment corresponding to the video data of the video, identify the video segment according to the image of the markable object to obtain the area position of the markable object in the video frame at the display time point, where the display time point is the earliest time point at which the markable object first appears continuously in the video segment for a duration reaching the time threshold.
Optionally, the second determining module 1360 is specifically configured to:
determine the indication information corresponding to the video data of the video according to the display position of the annotation information of the markable object in a video frame and the video frame number corresponding to the video frame, where the indication information includes the video frame number corresponding to the annotation information of the markable object and the display position corresponding to the video frame number; or,
determine the indication information corresponding to the video data of the video according to the display position of the annotation information of the markable object in a video frame and the display time point corresponding to the video frame, where the indication information includes the display time point corresponding to the annotation information of the markable object and the display position corresponding to the display time point.
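The two forms of indication information are interchangeable when the frame rate is known; a minimal sketch, assuming a constant 25 fps, which the embodiment does not specify.

```typescript
const FPS = 25; // assumed constant frame rate

// Display time point (seconds) to the video frame number, and back.
function timePointToFrameNumber(displayTimePoint: number): number {
  return Math.round(displayTimePoint * FPS);
}

function frameNumberToTimePoint(frameNumber: number): number {
  return frameNumber / FPS;
}
```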
Optionally, the display position of the annotation information of the markable object in the video frame is at the upper left corner of the area position of the markable object in the video frame.
As can be seen from the above, the embodiment of the present application provides a video playing control apparatus, which receives an acquisition request sent by a client and then, according to the acquisition request, provides the client with the video data of a specified video and the indication information corresponding to the video data, so that the client can display the annotation information of a specified object at a specified position corresponding to a specified time according to the indication information. When watching the video, the user can identify an object in the film through the annotation information of the specified object, without quitting the playing and looking up the corresponding object in the movie introduction. Therefore, the apparatus provides a more convenient information acquisition mode for the user, and is convenient for the user to acquire information more intuitively, thereby reducing frequent interaction between the user and the network and reducing occupation and waste of video platform resources.
Fig. 10 to fig. 14 describe the video playback control apparatus in the embodiments of the present application from the perspective of functional modularization. On this basis, the present application further provides a video playback control device, which is described below from the perspective of hardware implementation.
The embodiment of the present application provides a video playing control device, which may be a terminal device. As shown in fig. 15, for convenience of description, only the parts related to the embodiment of the present application are shown; for undisclosed technical details, refer to the method part of the embodiments of the present application. The terminal device may be any terminal device including a mobile phone, a tablet computer, a Personal Digital Assistant (PDA), a vehicle-mounted computer, and the like. The terminal device being a mobile phone is taken as an example:
fig. 15 is a block diagram illustrating a partial structure of a mobile phone related to the terminal device provided in the embodiment of the present application. Referring to fig. 15, the mobile phone includes: a Radio Frequency (RF) circuit 1510, a memory 1520, an input unit 1530, a display unit 1540, a sensor 1550, an audio circuit 1560, a wireless fidelity (WiFi) module 1570, a processor 1580, and a power supply 1590. Those skilled in the art will appreciate that the mobile phone structure shown in fig. 15 does not constitute a limitation; the phone may include more or fewer components than shown, combine some components, or arrange the components differently.
The memory 1520 may be used to store software programs and modules, and the processor 1580 performs various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 1520. The memory 1520 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function or an image playing function), and the like, and the data storage area may store data (such as audio data and a phonebook) created according to the use of the mobile phone. Further, the memory 1520 may include a high-speed random access memory and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The processor 1580 is the control center of the mobile phone: it connects the various parts of the entire phone through various interfaces and lines, and performs the various functions of the phone and processes data by running or executing the software programs and/or modules stored in the memory 1520 and calling the data stored in the memory 1520, thereby monitoring the phone as a whole. Optionally, the processor 1580 may include one or more processing units; preferably, the processor 1580 may integrate an application processor, which mainly handles the operating system, user interfaces, application programs, and the like, and a modem processor, which mainly handles wireless communication. It should be appreciated that the modem processor may alternatively not be integrated into the processor 1580.
Although not shown, the mobile phone may further include a camera, a bluetooth module, etc., which are not described herein.
In this embodiment, the processor 1580 included in the terminal device further has the following functions:
sending an acquisition request to a server, and receiving the video data of a specified video and the indication information corresponding to the video data returned by the server, where the indication information is used for indicating the client to display the annotation information of a specified object at a specified position corresponding to a specified time when the video data is played;
acquiring the labeling information of the specified object;
and playing the video data and displaying the labeling information of the specified object according to the indication information.
Optionally, the processor 1580 included in the terminal device may be further configured to execute any implementation manner of a video playing control method in this embodiment of the present application.
Fig. 16 is a schematic structural diagram of a video playback control device provided in this embodiment. The device may be a server. The server 1600 may vary considerably in configuration or performance, and may include one or more Central Processing Units (CPUs) 1622 (e.g., one or more processors), a memory 1632, and one or more storage media 1630 (e.g., one or more mass storage devices) storing an application program 1642 or data 1644. The memory 1632 and the storage medium 1630 may be transient or persistent storage. The program stored on the storage medium 1630 may include one or more modules (not shown), and each module may include a series of instruction operations on the server. Further, the central processing unit 1622 may be configured to communicate with the storage medium 1630 and execute, on the server 1600, the series of instruction operations in the storage medium 1630.
The server 1600 may also include one or more power supplies 1626, one or more wired or wireless network interfaces 1650, one or more input/output interfaces 1658, and/or one or more operating systems 1641, such as Windows Server™, Mac OS X™, Unix™, Linux™, and FreeBSD™.
The steps performed by the server in the above embodiment may be based on the server structure shown in fig. 16.
The CPU 1622 is configured to execute the following steps:
receiving an acquisition request sent by a client;
and returning the video data of the specified video and the indication information corresponding to the video data to the client according to the acquisition request, wherein the indication information is used for indicating the client to display the marking information of the specified object at the specified position corresponding to the specified time when the client plays the video data.
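A minimal sketch of these two server steps follows, using Node's built-in http module; the route, the in-memory store, and every field name are assumptions for illustration only.

```typescript
import * as http from 'http';

// Hypothetical in-memory store of indication information per video.
const indicationStore: Record<string, object[]> = {
  'movie-1': [{ label: 'lead actor', specifiedTime: 2,
                specifiedPosition: { x: 40, y: 60 } }],
};

// Receive the acquisition request and return a reference to the video
// data together with the corresponding indication information.
http.createServer((req, res) => {
  const match = /^\/api\/videos\/([\w-]+)$/.exec(req.url ?? '');
  if (req.method === 'GET' && match) {
    const videoId = match[1];
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({
      videoUrl: `/media/${videoId}.mp4`,
      indications: indicationStore[videoId] ?? [],
    }));
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(8080);
```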
The embodiment of the present application further provides a computer-readable storage medium, configured to store a program code, where the program code is configured to execute any one implementation manner of a video playing control method described in the foregoing embodiments.
The present application further provides a computer program product including instructions, which when run on a computer, causes the computer to execute any one of the implementation manners of the video playing control method described in the foregoing embodiments.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It should be understood that in the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" for describing an association relationship of associated objects, indicating that there may be three relationships, e.g., "a and/or B" may indicate: only A, only B and both A and B are present, wherein A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "at least one of the following" or similar expressions refer to any combination of these items, including any combination of single item(s) or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (15)

1. A video playback control method, comprising:
sending an acquisition request to a server, wherein the acquisition request is used for requesting to acquire video data of a specified video and corresponding indication information;
acquiring the video data of the specified video and corresponding indication information thereof, which are provided by the server according to the acquisition request, wherein the indication information is used for indicating a client to display the marking information of the specified object at a specified position corresponding to a specified time when the video data of the specified video is played;
acquiring the labeling information of the specified object;
and playing the video data of the specified video and displaying the labeling information of the specified object according to the indication information.
2. The method of claim 1, further comprising:
in response to an object marking operation, pausing the playing of the video data, and displaying, in a highlighted manner, the markable object in the video frame at which the video is paused, according to the area position of the markable object acquired from the server;
responding to the region selection operation of the markable object, and displaying a marking information input control aiming at the markable object;
and receiving marking information made for the markable object in response to a marking information input operation.
3. The method of claim 2, wherein said highlighting the markable object comprises:
highlighting the markable object in a bounding box form.
4. The method according to claim 1, wherein the indication information includes a display time point corresponding to the label information of the designated object and a display position corresponding to the display time point;
then, the displaying the labeling information of the designated object according to the indication information includes:
and displaying the label information of the specified object at the display time point corresponding to the label information of the specified object according to the display position corresponding to the display time point.
5. The method according to claim 1, wherein the indication information comprises a video frame number corresponding to the annotation information of the specified object and a display position corresponding to the video frame number;
then, the displaying the labeling information of the designated object according to the indication information includes:
and displaying the marking information of the specified object on the video frame corresponding to the video frame number according to the display position corresponding to the video frame number.
6. The method of claim 1, further comprising:
and hiding the labeling information of the specified object in response to the object labeling cancellation operation.
7. A video playback control method, comprising:
receiving an acquisition request sent by a client, wherein the acquisition request is used for requesting to acquire video data of a specified video and corresponding indication information;
and providing the video data of the specified video and corresponding indication information thereof to the client according to the acquisition request, wherein the indication information is used for indicating the client to display the marking information of the specified object at a specified position corresponding to a specified time when the client plays the video data of the specified video.
8. The method of claim 7, further comprising:
acquiring material data corresponding to a video, wherein the material data comprises video data of the video and an image of an object which can be marked;
identifying video data of the video according to the image of the object which can be labeled to obtain the area position of the object which can be labeled in the video frame;
determining the display position of the labeling information of the markable object in the video frame according to the area position of the markable object in the video frame, wherein the display position is positioned at the periphery of the area position;
and according to the display position of the marking information of the markable object in the video frame, determining the indication information corresponding to the video data of the video, wherein the indication information corresponding to the video data of the video is used for indicating a client to display the marking information of the designated object at the designated position corresponding to the designated time when the video data of the video is played.
9. The method of claim 8, wherein said identifying the video data of the video according to the image of the markable object to obtain the area position of the markable object in the video frame comprises:
for a video segment corresponding to the video data of the video, identifying the video segment according to the image of the markable object to obtain the area position of the markable object in the video frame at the display time point, wherein the display time point is the earliest time point at which the markable object first appears continuously in the video segment for a duration reaching the time threshold.
10. The method according to claim 8 or 9, wherein said determining the indication information corresponding to the video data of the video according to the display position of the annotation information of the markable object in the video frame comprises:
determining the indication information corresponding to the video data of the video according to the display position of the annotation information of the markable object in a video frame and the video frame number corresponding to the video frame, wherein the indication information comprises the video frame number corresponding to the annotation information of the markable object and the display position corresponding to the video frame number; or,
determining the indication information corresponding to the video data of the video according to the display position of the annotation information of the markable object in a video frame and the display time point corresponding to the video frame, wherein the indication information comprises the display time point corresponding to the annotation information of the markable object and the display position corresponding to the display time point.
11. The method according to claim 8 or 9, wherein the display position of the annotation information of the markable object in the video frame is at the upper left corner of the area position of the markable object in the video frame.
12. A video playback control apparatus, comprising:
the system comprises a sending module, a receiving module and a sending module, wherein the sending module is used for sending an acquisition request to a server, and the acquisition request is used for requesting to acquire video data of a specified video and corresponding indication information;
a first obtaining module, configured to obtain video data of the specified video and indication information corresponding to the video data, which are provided by the server according to the obtaining request, where the indication information is used to indicate a client to display annotation information of a specified object at a specified position corresponding to a specified time when the client plays the video data of the specified video;
the second acquisition module is used for acquiring the labeling information of the specified object;
and the control module is used for playing the video data of the specified video and displaying the labeling information of the specified object according to the indication information.
13. A video playback control apparatus, comprising:
the system comprises a receiving module, a sending module and a receiving module, wherein the receiving module is used for receiving an acquisition request sent by a client, and the acquisition request is used for requesting to acquire video data of a specified video and corresponding indication information;
and the providing module is used for providing the video data of the specified video and the corresponding indication information thereof to the client according to the acquisition request, wherein the indication information is used for indicating the client to display the annotation information of the specified object at a specified position corresponding to a specified time when the client plays the video data of the specified video.
14. A video playback control apparatus, characterized in that the apparatus comprises a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to execute the video playback control method according to any one of claims 1 to 6, or the video playback control method according to any one of claims 7 to 11, according to instructions in the program code.
15. A computer-readable storage medium for storing a program code for executing the video playback control method according to any one of claims 1 to 6, or the video playback control method according to any one of claims 7 to 11.