
WO2015039575A1 - Method and system for image identification - Google Patents

Method and system for image identification

Info

Publication number
WO2015039575A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion
video frame
state
camera
accordance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2014/086171
Other languages
English (en)
Inventor
Xiao Liu
Jian Ding
Hailong Liu
Bo Chen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to JP2015563118A (patent JP6026680B1)
Publication of WO2015039575A1
Anticipated expiration
Current legal status: Ceased

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/14 Picture signal circuitry for video frequency region
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/527 Global motion vector estimation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/14 Picture signal circuitry for video frequency region
    • H04N 5/144 Movement detection

Definitions

  • the present application relates to image processing and identification technologies, and in particular, to methods and systems for performing image identification on an end device.
  • a solution for performing real-time image identification on an end device includes: obtaining a video frame of a target by using a camera of the mobile terminal, and sending the video frame to a cloud server; and performing, by the cloud server, identification on the received video frame, determining corresponding description information, and feeding back the description information to the mobile terminal for display.
  • data collection from the obtained video frame may be performed on various objects, such as a book cover, a CD cover, a film poster, a bar code, a two-dimensional code, or a product logo.
  • the cloud server feeds back the description information, where the description information includes purchase information, comments, and the like for related goods.
  • the related information about the images is obtained instantly after the images are shot, which is convenient for the user.
  • a target is photographed by aiming at the target using a camera of a mobile terminal, and an obtained video frame is sent to a cloud server from the mobile terminal.
  • this manner has the following defects: the operation needs to be manually performed after the target is aimed at, which is inconvenient.
  • when the captured video frame is not clear, the cloud server cannot perform image identification, and therefore the mobile terminal cannot obtain description information about the target successfully.
  • In the second manner, data collection is performed in real time on the picture captured by the camera, and all of the collected image data is then sent to the cloud server.
  • This manner also has the following defects: because each collected video frame is sent to the cloud server in real time, a large amount of traffic is consumed and bandwidth is occupied; in addition, because some collected frames are not clear, the cloud server cannot perform identification and cannot effectively feed back an identification result.
  • the embodiments of the present disclosure provide methods and systems for image identification.
  • a method for image identification is performed at an electronic device having one or more processors, a memory, and a camera.
  • the method includes obtaining a sequence of video frames including at least a first video frame and a second video frame captured by the camera, the first video frame being captured prior to the second video frame; determining a respective motion state of the camera associated with each video frame of the sequence of video frames, including determining a first motion state of the camera associated with the second video frame by performing a motion estimation of the first video frame and the second video frame; determining whether the camera has undergone a transition of motion states from a respective moving state to a respective stationary state between capturing two consecutive video frames of the sequence of video frames; and in accordance with a determination that the camera has undergone the transition of motion states from the respective moving state to the respective stationary state between capturing the two consecutive video frames of the sequence of video frames, determining whether a latter video frame of the two consecutive video frames is valid for uploading in accordance with predetermined uploading criteria.
  • a device comprises one or more processors, a memory, a camera, and one or more program modules stored in the memory and configured for execution by the one or more processors.
  • the one or more program modules include instructions for performing the method described above.
  • a non-transitory computer readable storage medium having stored thereon instructions, which, when executed by a device, cause the device to perform the method described herein.
  • Figure 1 is a schematic flowchart of a method for performing image identification on a mobile terminal in accordance with some embodiments of the present application.
  • Figure 2 is a schematic flowchart of a method for performing image identification on a mobile terminal in accordance with some embodiments of the present application.
  • Figure 3 is a schematic flowchart of a method for performing motion estimation in accordance with some embodiments of the present application.
  • Figure 4 illustrates schematic exemplary diagrams for determining the match block in accordance with some embodiments of the present application.
  • Figure 5 is a schematic structural diagram of a mobile terminal which performs the method for image identification as discussed in Figures 1-3 in accordance with some embodiments of the present application.
  • Figure 6 is a block diagram of a server-client environment in accordance with some embodiments.
  • Figure 7 is a block diagram of a client device in accordance with some embodiments.
  • Figure 8 is a block diagram of a server system in accordance with some embodiments.
  • Figures 9A-9E are a flowchart diagram of a method for performing image identification on an end device in accordance with some embodiments of the present application.
  • Generally, a user first opens the camera, moves it to aim at a target, and then performs data collection using the camera: a process from a moving state to a stationary state. Based on this, in the present application, the motion state of each collected video frame is determined. When the motion state of the video frame is found to be from moving state to stationary state, the frame is determined to be a clear frame image, and the clear frame image is uploaded to a cloud server. In this way, only the clear frame image is sent to the cloud server, which saves traffic and bandwidth. In addition, because the cloud server feeds back the identification result based on the clear frame image, the identification result is more effective.
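  • As an illustration only, the flow just described can be sketched as follows; this is a minimal sketch under our own naming, not the claimed implementation, and the four callables passed in are hypothetical stand-ins for the components described in the rest of this application.
```python
# Minimal sketch of the client-side capture loop; every callable passed in
# is a hypothetical stand-in, not an API defined by this application.
def capture_loop(frames, estimate_motion_state, is_clear_frame, upload):
    prev_frame, prev_state = None, "stationary"
    for frame in frames:                                  # real-time collection
        state = estimate_motion_state(prev_frame, frame)  # "moving"/"stationary"
        # Upload only when the camera settles from moving to stationary
        # and the settled frame passes the clarity check.
        if prev_state == "moving" and state == "stationary" and is_clear_frame(frame):
            upload(frame)
        prev_state, prev_frame = state, frame
```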
  • Figure 1 is a schematic flowchart of a method 100 for performing image identification on an end device, e.g., a mobile terminal, in accordance with some embodiments of the present application.
  • the method 100 is performed on the real-time images captured by the mobile terminal.
  • data collection is performed (101) in real time by using a camera of a mobile terminal, and a video frame is obtained.
  • motion estimation is performed (102) on the video frame, and a motion state of the video frame is determined.
  • frame-by-frame collection is performed on the picture as the camera moves.
  • Motion estimation is performed on a video frame which is obtained in real time, to determine the motion state of the video frame.
  • motion estimation is used in video encoding technologies.
  • the motion estimation is used to process a video frame that is collected using the camera of the mobile terminal, to determine the motion state of the video frame.
  • a motion vector may be used to determine the motion state of the video frame, which includes: calculating a motion vector between a current video frame and a previous video frame, where the motion vector includes a motion amplitude and a motion direction; and determining the motion state of the video frame using the motion vector.
  • motion estimation is used to calculate the motion vector between the current video frame and the previous video frame
  • the steps used may include obtaining a center area pixel of the previous video frame; using a center area of the current video frame as a start point, and searching for an area, surrounding the start point, having a pixel similar to the center area pixel of the previous video frame, to be determined as a match block; and using a position vector between the center area of the current video frame and the match block as the motion vector.
  • the motion state includes moving state, stationary state, from moving state to stationary state, and from stationary state to moving state.
  • the motion state of the video frame may be determined using the motion vector in many manners, which may be set according to actual needs. For example, determining the motion state of the video frame using the motion vector includes reading a stored background motion state.
  • if the background motion state is stationary state, and each of the motion amplitudes of N consecutive frames from a current frame (the current frame being the first frame) is greater than a first motion threshold, where N is a natural number, the motion states of the first frame to the Nth frame are determined to be stationary state and the background motion state remains stationary; the motion state of the (N+1)th frame is then determined to be from stationary state to moving state, and the background motion state is changed to moving.
  • if the background motion state is moving state, and the motion amplitudes of N consecutive frames from the current frame (the current frame being the first frame) are less than a second motion threshold, where N is a natural number, the motion states of the first frame to the Nth frame remain moving state and the background motion state remains moving; the motion state of the (N+1)th frame is then determined to be from moving state to stationary state, and the background motion state is changed to stationary state.
  • when the background motion state is stationary state and the motion amplitude of the current frame is not greater than the first motion threshold, the method further includes determining whether the motion amplitude is greater than a third motion threshold; if so, the motion of the current frame is a micro motion, and the background motion state remains stationary state.
  • if the background motion state is stationary state, the motion amplitudes of two consecutive frames after the previous video frame are greater than S1, and the motion directions of the two consecutive frames are opposite, it is determined to be a shaking situation, and the motion states of the two consecutive frames are still determined to be stationary; if the two motions are instead in the same direction, the motion state of the latter of the two consecutive frames is determined to be from stationary to moving.
  • when it is determined that the motion state of the video frame is from moving state to stationary state, the video frame is determined to be a clear frame image, and the clear frame image is uploaded to a cloud server; when the motion state of the video frame is not from moving state to stationary state, the frame is not uploaded to the cloud server.
  • in order to improve the accuracy of determining a clear frame, after it is determined that the motion state of the video frame is from moving state to stationary state, a corner detection may also be performed.
  • the corner detection includes: calculating the number of features, such as corner points, of the video frame; and determining whether the number of corner points is greater than a threshold number of corners. When the number of corner points is greater than the threshold number of corners, the frame is determined to be a clear frame image; when it is not, the frame is determined to be a fuzzy frame image. A minimal sketch of this check follows.
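  • A minimal sketch of the clarity check, assuming OpenCV's FAST detector as the corner counter; the threshold of 100 corners is an illustrative placeholder, not a value prescribed by this application.
```python
import cv2

def is_clear_frame(gray_frame, corner_count_threshold=100):
    """Return True if the frame contains enough corner points to be 'clear'.

    gray_frame is a single-channel grayscale image; the threshold is an
    illustrative placeholder left tunable by this application.
    """
    detector = cv2.FastFeatureDetector_create()   # FAST corner detector
    keypoints = detector.detect(gray_frame, None)
    return len(keypoints) > corner_count_threshold
```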
  • the clear frame image is uploaded to a cloud server.
  • whether to upload the clear frame image may be determined based on whether the motion states of multiple consecutive video frames are stationary. For example, assuming that the current frame is the first frame, if it is determined that the first frame to the (N+1)th frame are all in the stationary state, the (N+1)th frame is determined to be the clear frame, and the clear frame image is then uploaded to the cloud server, where N is a natural number.
  • an identification result fed back from the cloud server is received (104) at the end device, and the identification result is displayed (104) .
  • the cloud server feeds back related description information, which may include purchase information, comments, and the like for related goods.
  • motion estimation is performed on a collected video frame to determine a motion state of the video frame.
  • when the motion state of the video frame is determined to be from moving state to stationary state, the video frame is determined to be a clear frame image, and the clear frame image is uploaded to a cloud server.
  • the present application uses a manner in which data is actively collected by using a camera, and a user does not need to take photos manually, which makes the operation convenient.
  • only the clear frame image is uploaded to the cloud server, instead of uploading all collected video frames to the cloud server in real time; therefore, traffic and bandwidth are saved. Because the cloud server feeds back the identification result based on the clear frame image, the identification result is more effective.
  • Figure 2 is a schematic flowchart of a method 200 for performing image identification on a mobile terminal in accordance with some embodiments of the present application.
  • data collection is performed (201) in real time using a camera of a mobile terminal, and a video frame is obtained.
  • motion estimation is performed (202) on the video frame, and a motion state of the video frame is determined (202).
  • the video frame on which motion estimation is performed is called a to-be-processed video frame in the following descriptions.
  • a motion estimation idea used for video encoding is used for processing images captured by a camera of a mobile terminal.
  • the video and the image sequence of the camera of the mobile terminal have the same consecutive image correlation; therefore, the motion estimation algorithm is universal.
  • a difference also exists between the two scenarios.
  • the image obtained by the camera of the mobile terminal generally has lower resolution, and during practical use of a user, the mobile terminal may not move at a great amplitude.
  • if a global motion estimation algorithm as used in video encoding were applied directly, the calculation would be very slow, and generally a real-time effect cannot be achieved even on a PC. Therefore, in consideration of the difference, the present application improves the motion estimation algorithm applied in video encoding, so that the algorithm achieves effective performance on various mobile terminals while consuming so few CPU resources that the consumption can basically be ignored.
  • Figure 3 is a schematic flowchart of a method 300 for performing motion estimation in accordance with some embodiments of the present application.
  • a center area pixel of a to-be-processed video frame is obtained and stored (301) .
  • a center area pixel of a previous video frame of the to-be-processed video frame is also obtained (302) .
  • each time the mobile terminal collects a video frame the mobile terminal stores a center area pixel of the video frame. For example, a pixel gray value of the center area is stored. In this step, a stored center area pixel gray value of a previous video frame adjacent to the to-be-processed video frame is extracted.
  • a center area of the to-be-processed video frame is used (303) as a start point, and an area surrounding the start point having a pixel similar to the center area pixel of the previous video frame is searched for and determined (303) as a match block.
  • Figure 4 illustrates schematic exemplary diagrams for determining the match block in step 303 of method 300 in accordance with some embodiments of the present application.
  • Figure 4 shows a first video frame 400 (i.e., the previous video frame) and a second video frame 450 (i.e., the to-be-processed video frame).
  • a neighboring region surrounding the dashed block 460 is searched, from the center toward the peripheral area, for an area 470 having a pixel gray value similar to the pixel gray value of the center area 410 in the previous video frame; the area 470 is called a match block 470.
  • the square area marked with a grid in the to-be-processed video frame 450 is the match block 470 obtained through the search.
  • the pixel gray value of the center area (x, y) 410 of the previous video frame 400 is denoted I(x, y), and the pixel gray value of a search block (e.g., area 470) is denoted I'(x, y).
  • a quadratic sum of the differences between I(x, y) and I'(x, y) is used as an index for evaluating block similarity. Assuming that the block size is N*N pixels, the sum of squared errors S is:

    S = Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} [I(x+i, y+j) - I'(x+i, y+j)]^2    (1)
  • a block with the minimum S calculated according to the above formula (1) is used as the match block 470.
  • a motion vector (e.g., vector 480) between the match block 470 and the center block 460 of the to-be-processed video frame 450 is determined according to the position from the center block 460 to the match block 470.
  • the vector 480 in Figure 4 includes a motion direction and a motion magnitude (i.e., the length of the vector 480).
  • an approximation algorithm is used in the foregoing searching process. For example, a large step length is first used for the search process, and an area with a relatively great similarity is identified. Then the step length is reduced in the identified area, and the similarity is evaluated. This step-by-step approximation is performed to obtain the final search result to identify the match block.
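  • For illustration, the block match of formula (1) and the coarse-to-fine search can be sketched with numpy as follows; the block size, search radius, and step lengths are arbitrary example values, not values prescribed by this application, and the sketch assumes grayscale frames larger than the search window.
```python
import numpy as np

def block_ssd(block_a, block_b):
    # Formula (1): sum of squared gray-value differences over an N*N block.
    diff = block_a.astype(np.int64) - block_b.astype(np.int64)
    return int(np.sum(diff * diff))

def find_motion_vector(prev_frame, cur_frame, n=16, radius=24):
    """Return (dx, dy) from the center block of cur_frame to its match block."""
    h, w = cur_frame.shape                        # 2-D grayscale frames assumed
    assert h >= n + 2 * (radius + 4) and w >= n + 2 * (radius + 4)
    cy, cx = (h - n) // 2, (w - n) // 2           # top-left of the center block
    target = prev_frame[cy:cy + n, cx:cx + n]     # center block of previous frame

    def best_offset(candidates):
        return min(candidates, key=lambda d: block_ssd(
            target,
            cur_frame[cy + d[1]:cy + d[1] + n, cx + d[0]:cx + d[0] + n]))

    # Coarse pass: large step length over the whole search window.
    coarse = [(dx, dy) for dx in range(-radius, radius + 1, 8)
                       for dy in range(-radius, radius + 1, 8)]
    dx0, dy0 = best_offset(coarse)
    # Fine pass: step length 1 around the best coarse candidate.
    fine = [(dx0 + ex, dy0 + ey) for ex in range(-4, 5) for ey in range(-4, 5)]
    return best_offset(fine)                      # the motion vector (dx, dy)
```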
  • down-sampling processing may be performed first, for example, a frame size of 2000*2000 is changed to a frame size of 400*400 through down-sampling.
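  • As a sketch of that down-sampling step (cv2.resize is standard OpenCV; the target size follows the example above):
```python
import cv2

def downsample(frame, size=(400, 400)):
    # Shrink e.g. a 2000*2000 frame to 400*400 before motion estimation,
    # so the block search touches far fewer pixels.
    return cv2.resize(frame, size, interpolation=cv2.INTER_AREA)
```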
  • in Figure 4, a rectangular area is used to indicate the video frames (e.g., the first video frame 400 and the second video frame 450), and a square area (e.g., the match block 470) is used to indicate the match block; however, any other suitable shape matching, such as diamond matching or round matching, may also be used to perform the matching process.
  • any other similarity measure, such as the mean square error, the sum of absolute errors, or the sum of mean errors, may also be used, as sketched below.
  • another search algorithm, such as a three-step search or a diamond search, may also be used.
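  • For illustration, two of the alternative similarity measures named above could be sketched as follows and substituted for block_ssd in the search sketch above; the names are ours, not the application's.
```python
import numpy as np

def block_sad(block_a, block_b):
    # Sum of absolute differences.
    return int(np.sum(np.abs(block_a.astype(np.int64) - block_b.astype(np.int64))))

def block_mse(block_a, block_b):
    # Mean square error.
    diff = block_a.astype(np.int64) - block_b.astype(np.int64)
    return float(np.mean(diff * diff))
```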
  • a position vector between the match block 470 and the center block 460 of the to-be-processed video frame 450 is calculated (304), and the position vector is used (304) as the motion vector.
  • the calculated motion vector includes a motion direction and a motion magnitude.
  • the motion state of the video frame is determined (305) using the motion vector.
  • the motion state of a video frame is mainly one of the following four states: moving state, stationary state, from moving state to stationary state, and from stationary state to moving state.
  • when the state is from moving state to stationary state, the image is ready for uploading.
  • the state from moving state to stationary state and the state from stationary state to moving state may use different magnitude thresholds.
  • a magnitude threshold of the state from moving state to stationary state is relatively high, and the magnitude threshold is indicated as a second motion threshold.
  • A magnitude threshold of the state from stationary state to moving state is relatively low, and the magnitude threshold is indicated as a first motion threshold. In some embodiments, the first motion threshold is less than the second motion threshold.
  • the mobile terminal stores a background motion state, and the background motion state may be extracted from a stored state. Then the motion state of the to-be-processed video frame can be determined by combining the background motion state, the first motion threshold, and the second motion threshold.
  • the stored background motion state is detected, where if the background motion state is stationary state, and the motion magnitudes of N consecutive frames from the current frame (the current frame being the first frame) are greater than the first motion threshold, where N is a natural number, the motion states of the first frame to the Nth frame remain stationary and the background motion state remains stationary; the motion state of the (N+1)th frame is then determined to be from stationary to moving, and the background motion state is changed to moving.
  • if the background motion state is stationary and a motion magnitude of the current frame is less than the first motion threshold, the motion state of the current frame remains stationary, and the background motion state remains stationary.
  • if the background motion state is moving, and the motion magnitudes of N consecutive frames from the current frame (the current frame being the first frame) are less than the second motion threshold, where N is a natural number, the motion states of the first frame to the Nth frame remain moving and the background motion state remains moving; the motion state of the (N+1)th frame is then determined to be from moving to stationary, and the background motion state is changed to stationary.
  • if the background motion state is moving and a motion magnitude of the current frame is greater than the second motion threshold, the motion state of the current frame remains moving, and the background motion state remains moving.
  • the method further includes determining whether the motion magnitude is greater than a third motion threshold.
  • when the motion magnitude is greater than the third motion threshold, the motion of the current frame is associated with micro motion, and the background motion state remains stationary. If the motions of M consecutive frames from the current frame (the current frame being the first frame) are micro motions in the same direction, the motion state of the Mth frame is determined to be from stationary to moving, and the background motion state is changed to moving, where M is a natural number.
  • a policy of "remaining the state" is used: for an occasional single stationary or moving state, state switching is not performed; state switching is performed only when enough state changes are accumulated. By using this policy, state stability is achieved. S1 indicates the first motion threshold, S2 the second motion threshold, and S3 the third motion threshold; S indicates the motion magnitude of the to-be-processed video frame. In some embodiments, state switching is generally performed when two state changes are accumulated, and, for micro motion, when five state changes are accumulated. In some embodiments, the policy of "remaining the state" includes the following situations, sketched in code after this list.
  • when the background motion state is stationary: (1) when S > S1, the to-be-processed video frame (denoted the Yth frame) is determined to be in a stationary state, and the background motion state remains stationary. It is then determined whether the motion magnitude of the (Y+1)th frame is still greater than S1; if so, the (Y+1)th frame is determined to be in a state from stationary to moving, and the background motion state is changed to moving. (2) When S ≤ S1, the to-be-processed video frame is determined to be in a stationary state, and the background motion state remains stationary.
  • (3) when S3 < S ≤ S1, the motion of the to-be-processed video frame (denoted the Zth frame) is determined to be a micro motion; if the motions of the Zth frame to the (Z+3)th frame are micro motions in the same direction, the Zth frame to the (Z+3)th frame are determined to be in the stationary state, and if the motion of the (Z+4)th frame is also a micro motion in the same direction, the (Z+4)th frame is determined to be in the state from stationary to moving, and the background motion state is changed to moving.
  • the number of accumulated times may be set to be any suitable number.
  • when the background motion state is moving: (1) when S < S2, the to-be-processed video frame (denoted the Yth frame) is determined to be in a moving state, and the background motion state remains moving. It is then determined whether the motion magnitude of the (Y+1)th frame is also less than S2; if so, the (Y+1)th frame is determined to be in a state from moving to stationary, and the background motion state is changed to stationary. (2) When S ≥ S2, the to-be-processed video frame is determined to be in a moving state, and the background motion state remains moving.
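  • A compact sketch of the situations above; the accumulation counts (two for ordinary transitions, five for micro motion) follow the text, while the S1/S2/S3 values and the direction encoding are illustrative placeholders of our own.
```python
class MotionStateMachine:
    """Sketch of the "remaining the state" policy; thresholds are placeholders."""

    def __init__(self, s1=8.0, s2=12.0, s3=2.0):
        self.s1, self.s2, self.s3 = s1, s2, s3    # first/second/third thresholds
        self.background = "stationary"
        self.count = 0                            # accumulated state changes
        self.micro_dir = None                     # direction of micro motions

    def update(self, s, direction):
        """Classify one frame given its magnitude s and a quantized direction."""
        if self.background == "stationary":
            if s > self.s1:                       # candidate real motion
                self.count, self.micro_dir = self.count + 1, None
                if self.count >= 2:               # switch after 2 accumulations
                    self.background, self.count = "moving", 0
                    return "stationary_to_moving"
                return "stationary"               # remain the state once
            if s > self.s3:                       # micro motion
                if direction == self.micro_dir:
                    self.count += 1
                else:
                    self.micro_dir, self.count = direction, 1
                if self.count >= 5:               # switch after 5 micro motions
                    self.background, self.count = "moving", 0
                    return "stationary_to_moving"
                return "stationary"
            self.count, self.micro_dir = 0, None  # occasional motion: ignore
            return "stationary"
        if s < self.s2:                           # candidate stop
            self.count += 1
            if self.count >= 2:
                self.background, self.count = "stationary", 0
                return "moving_to_stationary"
            return "moving"                       # remain the state once
        self.count = 0
        return "moving"
```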
  • a hand-shaking situation may also be determined. For example, if a "sudden left motion and/or sudden right motion" occurs, that is, motion vectors occur in opposite directions, it is determined to be a "hand shaking" situation. In this case, if the background is in the stationary state, the motion state is not changed until consecutive motions in the same direction are generated.
  • it is determined (306) whether to continue to perform motion estimation; if yes, method 300 returns to step 301; otherwise, the procedure ends. In some embodiments, as long as video frames continue to be obtained in step 201, motion estimation is performed on each obtained video frame.
  • when the camera is just turned on, the state may be set to stationary. Then the user moves the camera to aim at the target, and this process goes through a state from stationary to moving, a moving state, and a state from moving to stationary.
  • when the motion state of the video frame is determined to be from moving to stationary, the corresponding video frame is used as the to-be-detected video frame.
  • the number of features, e.g., corner points, of the to-be-detected video frame is calculated (204).
  • various corner detection algorithms are available, such as the Features from Accelerated Segment Test (FAST) corner detection algorithm, the Harris corner detection algorithm, the Compressed Histogram of Gradients (CHOG) corner detection algorithm, and the Fast Retina Keypoint (FREAK) corner detection algorithm, any one of which can be selected; these algorithms have good corner detection capabilities. According to the definition of an effective picture, the first requirement is clarity and the second requirement is rich texture. Based on these two requirements, FAST corner detection may be used.
  • whether the number of corner points is greater than the threshold number of corners is determined (205); if yes, the to-be-detected video frame is determined to be the clear frame image, and the clear frame image is uploaded to the cloud server; otherwise, the to-be-detected video frame is determined to be a fuzzy frame image that is not qualified for uploading to the cloud server.
  • an identification result fed back by the cloud server is received (206) , and the identification result may be displayed (206) .
  • Figure 5 is a schematic structural diagram 500 of a mobile terminal which performs the methods 100, 200, and/or 300 for image identification as discussed in Figures 1-3 in accordance with some embodiments of the present application.
  • the mobile terminal includes a data collection unit, a motion estimation unit, a clear frame determining unit, and an identification result display unit.
  • the data collection unit is configured to perform data collection in real time by using a camera of the mobile terminal, to obtain a video frame, and to send the video frame to the motion estimation unit.
  • the motion estimation unit is configured to perform motion estimation on the video frame, to determine a motion state of the video frame, and to send the motion state to the clear frame determining unit.
  • the clear frame determining unit is configured to determine whether a motion state of the video frame is from moving to stationary: if yes, the video frame is determined to be a clear frame image, and the clear frame image is uploaded to a cloud server.
  • the identification result display unit is configured to receive the identification result fed back by the cloud server, and to display the identification result.
  • the motion estimation unit includes a motion vector calculation sub-unit and a state determining sub-unit.
  • the motion vector calculation sub-unit is configured to calculate a motion vector between the video frame and a previous video frame, and to send the motion vector to the state determining sub-unit.
  • the motion vector includes a motion magnitude and a motion direction.
  • the state determining sub-unit is configured to determine the motion state of the video frame according to the motion vector.
  • the state determining sub-unit includes a state determining module, configured to read a stored background motion state, where if the background motion state is stationary, and the motion magnitudes of N consecutive frames from a current frame (the current frame being the first frame) are greater than a first motion threshold, where N is a natural number, the motion states of the first frame to the Nth frame remain stationary and the background motion state remains stationary; the motion state of the (N+1)th frame is then determined to be from stationary to moving, and the background motion state is changed to moving.
  • if the background motion state is stationary and a motion magnitude of the current frame is less than the first motion threshold, the motion state of the current frame is stationary, and the background motion state remains stationary.
  • if the background motion state is moving, and the motion magnitudes of N consecutive frames from the current frame (the current frame being the first frame) are less than a second motion threshold, where N is a natural number, the motion states of the first frame to the Nth frame remain moving and the background motion state remains moving; the motion state of the (N+1)th frame is then determined to be from moving to stationary, and the background motion state is changed to stationary.
  • the background motion state is moving, and a motion magnitude of the current frame is greater than the second motion threshold, the motion state of the current frame still is moving, and the background motion state still is moving.
  • the state determining module is further configured to, when the background motion state is stationary and the motion magnitude of the current frame is less than the first motion threshold, determine whether the motion magnitude is greater than a third motion threshold.
  • if so, the motion of the current frame is a micro motion, and the background motion state remains stationary. If the motions of M consecutive frames from the current frame (the current frame being the first frame) are micro motions in the same direction, the motion state of the Mth frame is determined to be from stationary to moving, and the background motion state is changed to moving, where M is a natural number.
  • the motion vector calculation sub-unit includes a motion vector determining module, configured to obtain a center area pixel of the previous video frame; to use the center area of the video frame as the start point and search an area surrounding the start point for a pixel similar to the center area pixel of the previous video frame, which is determined to be a match block; and to use a position vector between the center area of the video frame and the match block as the motion vector.
  • the clear frame determining unit includes a moving-to-stationary-state determining module and a corner detection module.
  • the moving-to-stationary-state determining module is configured to determine whether the motion state of the video frame is from moving to stationary; if yes, a start instruction is sent to the corner detection module.
  • the corner detection module is configured to receive the start instruction from the moving-to-stationary-state determining module, and to calculate the number of corner points of the video frame. In some embodiments, whether the number of corner points is greater than the threshold number of corners is determined. When the number of corner points is greater than the threshold number of corners, the video frame is determined to be a clear frame image, and the clear frame image is uploaded to the cloud server. Otherwise, the video frame is determined to be a fuzzy frame image.
  • Figure 6 is a block diagram of a server-client environment 600 in which image identification is performed in accordance with some embodiments.
  • server-client environment 600 includes client-side processing 602 (hereinafter "client-side module 602") and input device(s) 714 (e.g., a camera) executed on a client device 604, and server-side processing 606 (hereinafter "server-side module 606") executed on a server system 608.
  • client-side module 602 communicates with server-side module 606 through one or more networks 610.
  • Client-side module 602 provides client-side functionalities for the social networking platform (e.g., instant messaging and social networking services).
  • Server-side module 606 provides server-side functionalities, e. g. , image/video processing, and image/video information identification, for any number of client modules 602 each residing on a respective client device 604.
  • server-side module 606 includes one or more processors 612, one or more databases 614, an I/O interface to one or more clients 618, and an I/O interface to one or more external services 620.
  • I/O interface to one or more clients 618 facilitates the client-facing input and output processing for server-side module 606.
  • One or more processors 612 receive images sent from the client device 604, process the images, and provide requested image/video related information to client-side modules 602.
  • the database 614 stores various information, including but not limited to, book information, CD information, film information, and product and barcode information.
  • the cloud server feeds back the description information, where the description information includes purchase information, comments, and the like for related goods.
  • I/O interface to one or more external services 620 facilitates communications with one or more external services 622 (e.g., image/video processing services, publishers, and/or other related services).
  • Examples of client device 604 include, but are not limited to, a handheld computer, a wearable computing device, a personal digital assistant (PDA), a tablet computer, a laptop computer, a desktop computer, a cellular telephone, a smart phone, an enhanced general packet radio service (EGPRS) mobile phone, a media player, a navigation device, a game console, a television, a remote control, or a combination of any two or more of these data processing devices or other data processing devices.
  • Examples of one or more networks 610 include local area networks (LAN) and wide area networks (WAN) such as the Internet.
  • One or more networks 610 are, optionally, implemented using any known network protocol, including various wired or wireless protocols, such as Ethernet, Universal Serial Bus (USB), FIREWIRE, Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), code division multiple access (CDMA), time division multiple access (TDMA), Bluetooth, Wi-Fi, voice over Internet Protocol (VoIP), Wi-MAX, or any other suitable communication protocol.
  • Server system 608 is implemented on one or more standalone data processing apparatuses or a distributed network of computers.
  • server system 608 also employs various virtual devices and/or services of third party service providers (e.g., third-party cloud service providers) to provide the underlying computing resources and/or infrastructure resources of server system 608.
  • Server-client environment 600 shown in Figure 6 includes both a client-side portion (e. g. , client-side module 602) and a server-side portion (e. g. , server-side module 606) .
  • data processing is implemented as a standalone application installed on client device 604.
  • client-side module 602 is a thin-client that provides only user-facing input and output processing functions, and delegates all other data processing functionalities to a backend server (e. g., server system 608) .
  • Figure 7 is a block diagram of a client device in accordance with some embodiments.
  • Client device 604 typically includes one or more processing units (CPUs) 702, one or more network interfaces 704, memory 706, and one or more communication buses 708 for interconnecting these components (sometimes called a chipset).
  • Client device 604 also includes a user interface 710.
  • User interface 710 includes one or more output devices 712 that enable presentation of media content, including one or more speakers and/or one or more visual displays.
  • User interface 710 also includes one or more input devices 714, including user interface components that facilitate user input such as a keyboard, a mouse, a voice-command input unit or microphone, a touch screen display, a touch-sensitive input pad, a camera, a gesture capturing camera, or other input buttons or controls. Furthermore, some client devices 604 use a microphone and voice recognition or a camera and gesture recognition to supplement or replace the keyboard.
  • Memory 706 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other random access solid state memory devices; and, optionally, includes non-volatile memory, such as one or more magnetic disk storage devices, one or more optical disk storage devices, one or more flash memory devices, or one or more other non-volatile solid state storage devices.
  • Memory 706, optionally, includes one or more storage devices remotely located from one or more processing units 702.
  • Memory 706, or alternatively the non-volatile memory within memory 706, includes a non-transitory computer readable storage medium.
  • memory 706, or the non-transitory computer readable storage medium of memory 706, stores the following programs, modules, and data structures, or a subset or superset thereof:
  • ⁇ operating system 716 including procedures for handling various basic system services and for performing hardware dependent tasks
  • ⁇ network communication module 718 for connecting client device 604 to other computing devices (e. g. , server system 608 and external service (s) 622) connected to one or more networks 610 via one or more network interfaces 704 (wired or wireless) ;
  • ⁇ input processing module 722 for detecting one or more user inputs or interactions from one of the one or more input devices 714 and interpreting the detected input or interaction;
  • ⁃ one or more applications for execution by client device 604 (e.g., games, application marketplaces, payment platforms, social network platforms, and/or other applications);
  • ⁃ client-side module 602, which provides client-side data processing and functionalities, including but not limited to video/image processing module 751 for processing the images and/or video frames captured by the camera using any of the methods 100, 200, 300, and/or 900; the video/image processing module 751 may include any one or more of the modules and units discussed in Figure 5; and
  • ⁃ database 760 storing various data (e.g., one or more motion magnitude thresholds) associated with the video/image processing as discussed in the present application.
  • Each of the above identified elements may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above.
  • The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, or modules, and thus various subsets of these modules may be combined or otherwise rearranged in various embodiments.
  • memory 706, optionally, stores a subset of the modules and data structures identified above.
  • memory 706, optionally, stores additional modules and data structures not described above.
  • Figure 8 is a block diagram of a server system 608 in accordance with some embodiments.
  • Server system 608 typically includes one or more processing units (CPUs) 812, one or more network interfaces 804 (e.g., including the I/O interface to one or more clients 618 and the I/O interface to one or more external services 620), memory 806, and one or more communication buses 808 for interconnecting these components (sometimes called a chipset).
  • Memory 806 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other random access solid state memory devices; and, optionally, includes non-volatile memory, such as one or more magnetic disk storage devices, one or more optical disk storage devices, one or more flash memory devices, or one or more other non-volatile solid state storage devices.
  • Memory 806, optionally, includes one or more storage devices remotely located from one or more processing units 812.
  • Memory 806, or alternatively the non-volatile memory within memory 806, includes a non-transitory computer readable storage medium.
  • memory 806, or the non-transitory computer readable storage medium of memory 806, stores the following programs, modules, and data structures, or a subset or superset thereof:
  • ⁇ operating system 810 including procedures for handling various basic system services and for performing hardware dependent tasks
  • ⁇ network communication module 812 for connecting server system 608 to other computing devices (e. g. , client devices 604 and external service (s) 622) connected to one or more networks 610 via one or more network interfaces 804 (wired or wireless) ;
  • ⁃ server-side module 606, which provides server-side data processing for the social networking platform (e.g., instant messaging and social networking services), including but not limited to video/image processing module 838 for processing the images and/or video frames uploaded by the client device 604; and
  • ⁃ server database 814 storing various data associated with the video/image processing as discussed in the present application.
  • Each of the above identified elements may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above.
  • The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, or modules, and thus various subsets of these modules may be combined or otherwise rearranged in various embodiments.
  • memory 806, optionally, stores a subset of the modules and data structures identified above.
  • memory 806, optionally, stores additional modules and data structures not described above.
  • In some embodiments, at least some of the functions of server system 608 are performed by client device 604, and the corresponding sub-modules of these functions may be located within client device 604 rather than server system 608. In some embodiments, at least some of the functions of client device 604 are performed by server system 608, and the corresponding sub-modules of these functions may be located within server system 608 rather than client device 604.
  • Client device 604 and server system 608 shown in Figures 7-8, respectively, are merely illustrative, and different configurations of the modules for implementing the functions described herein are possible in various embodiments.
  • Figures 9A-9E are a flowchart diagram of a method 900 for performing image identification on an end device 604, e. g. , a mobile terminal, in accordance with some embodiments of the present application.
  • method 900 is performed by an end device with one or more processors, a memory, and a camera.
  • method 900 is performed by end device 604 ( Figures 6-7) or a component thereof (e. g. , one or more device modules 602, Figures 6-7) .
  • method 900 is governed by instructions that are stored in a non-transitory computer readable storage medium and the instructions are executed by one or more processors of the end device.
  • Optional operations are indicated by dashed lines (e. g. , boxes with dashed-line borders) .
  • one or more steps of method 900 are substantially similar to one or more steps of method 100, 200, and/or 300 as discussed with regard to Figures 1-4.
  • the client device 604 obtains (902) a sequence of video frames including at least a first video frame and a second video frame captured by the camera of the client device.
  • the first video frame is captured prior to the second video frame.
  • the first and second video frames are captured in real time.
  • the client device 604 determines (904) a respective motion state of the camera associated with each video frame of the sequence of video frames. In some embodiments, the client device determines a first motion state of the camera associated with the second video frame by performing a motion estimation of the first video frame and the second video frame. In some embodiments, the client device detects (906) a center block (e.g., center block 410 of Figure 4) of the first video frame (e.g., the previous video frame 400 of Figure 4). The client device then identifies (906) a match block (e.g., match block 470 of Figure 4) of the second video frame (e.g., the to-be-processed video frame 450 of Figure 4) which matches the center block of the first video frame in accordance with predetermined matching criteria.
  • in order to identify the match block of the second video frame, the client device 604 first selects (908) a center block (e.g., center block 460 of Figure 4) of the second video frame, where the center block of the second video frame has an area identical to that of the center block of the first video frame.
  • the client device 604 calculates (908) a difference value between the center block of the second video frame and the center block of the first video frame.
  • in some embodiments, the difference value between the center block of the second video frame and the center block of the first video frame is calculated using the sum of squared errors S shown in the following formula (1):

    S = Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} [I(x+i, y+j) - I'(x+i, y+j)]^2    (1)

  • the difference value may be determined using any other suitable method(s).
  • the client device 604 selects (908) one or more blocks in the second video frame, each having an area identical to that of the center block of the second video frame.
  • the client device then calculates (908) a difference value between each of the selected one or more blocks of the second video frame and the center block of the first video frame.
  • the client device identifies (908) one of the center block and the one or more blocks of the second video frame having a smallest difference value from the center block of the first video frame as the match block of the second video frame.
  • the one or more blocks are selected from the second video frame using an approximation algorithm.
  • the approximation algorithm includes searching from the center block of the second video frame towards a peripheral area of the second video frame.
  • a large step length is initially used each time the block is moved, until an area with a relatively greater similarity (i.e., a smaller S) is identified. Then, within the identified area, a reduced step length is used to search for the area with the greatest similarity (i.e., the smallest S).
  • a step-by-step approximation is performed as discussed herein to identify the match block with the greatest similarity and thus the smallest S.
  • the selected block and/or video frame may have any other appropriate shape, such as a square, a rectangle, a circle, a rhombus, or a diamond.
  • any other suitable formula may be used in place of the sum of squared errors S of formula (1) as discussed in the present application.
  • any other suitable searching algorithm may be used other than the approximation algorithm.
  • the client device identifies (910) a first motion vector (e.g., vector 480 of Figure 4) starting from the center block of the second video frame and ending at the match block of the second video frame.
  • the client device determines (912) whether a magnitude of the first motion vector (e.g., vector 480 of Figure 4) is greater than a predetermined threshold value. In accordance with a determination that the magnitude is greater than the predetermined threshold value, the client device determines (912) that the first motion state of the camera is a respective moving state. In accordance with a determination that the magnitude is equal to or smaller than the predetermined threshold value, the client device determines (912) that the first motion state of the camera is a respective stationary state.
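  • In isolation, that per-frame decision is a simple magnitude test; a sketch, with the threshold left to whatever the embodiment selects:
```python
import math

def frame_motion_state(motion_vector, threshold):
    # Moving if the displacement magnitude exceeds the threshold.
    dx, dy = motion_vector
    return "moving" if math.hypot(dx, dy) > threshold else "stationary"
```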
  • the client device determines (914) a second motion state of the camera immediately succeeding the first motion state of the camera.
  • the client device further selects (914) a first predetermined motion magnitude threshold value for determining the second motion state.
  • the client device further selects (914) a second predetermined motion magnitude threshold value for determining the second motion state.
  • the first predetermined motion magnitude threshold value is greater than the second predetermined motion magnitude threshold value. Therefore, when the camera is in a moving state, the larger threshold is used for determining a transition to a stationary state; when the camera is in a stationary state, the smaller threshold is used for determining a transition to a moving state, as sketched below.
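  • A one-function sketch of this hysteresis, using the first/second naming of this passage; the numeric values are placeholders of our own.
```python
def threshold_for_transition(current_state, first_threshold=12.0, second_threshold=8.0):
    # A moving camera must drop below the larger first threshold to be
    # declared stationary; a stationary camera need only exceed the
    # smaller second threshold to be declared moving.
    return first_threshold if current_state == "moving" else second_threshold
```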
  • the client device determines (916) whether the camera has undergone a transition of motion states from a respective moving state to a respective stationary state between capturing two consecutive video frames of the sequence of video frames.
  • the client device obtains (918) a predetermined number of video frames succeeding the second video frame.
  • the predetermined number can be any suitable natural number that is selected by the user or predefined by the camera settings.
  • the client device determines (918) respective succeeding motion states of the camera for capturing each of the predetermined number of video frames by performing the motion estimation for each pair of consecutive video frames of the predetermined number of video frames.
  • in accordance with a determination that the respective succeeding motion states of the camera are respective stationary states, the client device determines (918) that the camera has undergone the transition of state from the respective moving state to the respective stationary state.
  • the client device in accordance with a determination that the first motion state of the camera is a respective stationary state, the client device further determines (920) whether the camera has undergone the transition of motion states from the respective stationary state to the respective moving state in this following manner: The client device obtains (920) a predetermined number of video frames succeeding the second video frame.
  • the predetermined number can be any suitable natural number that is selected by the user or predefined by the camera settings.
  • the client device determines (920) respective succeeding motion states of the camera for capturing each of the predetermined number of video frames by performing the motion estimation for each pair of consecutive video frames of the predetermined number of video frames.
  • in accordance with a determination that the respective succeeding motion states of the camera are respective moving states, the client device determines (920) that the camera has undergone the transition of state from the respective stationary state to the respective moving state.
  • the client device determines (922) whether a latter video frame of the two consecutive video frames is valid for uploading in accordance with predetermined uploading criteria. In some embodiments, in accordance with a determination that the camera has undergone the transition of motion states from the respective moving state to the respective stationary state, the client device counts (924) a number of feature points in the latter video frame. In some examples, the feature points include corner characters.
  • the number of feature points is counted using any suitable algorithms, such as the FAST corner detection algorithm, the Harris corner detection algorithm, the CHOG corner detection algorithm, and the FREAK corner detection algorithm.
  • the quality of the video frame to be uploaded is checked at this step to ensure enough clarity and sufficient detail of the frame can be detected at the server system.
  • the client device determines (924) whether the number of the feature points in the latter video frame is greater than a predetermined threshold feature count. In accordance with a determination that the number of feature points in the latter video frame is greater than the predetermined feature count, the client device uploads (924) the latter video frame to a system server (e.g., server system 608, Figures 6 and 8).
  • in some embodiments, in accordance with a determination that the first motion state of the camera is a respective stationary state, the client device obtains (926) a predetermined number of video frames succeeding the second video frame. The client device determines (926) respective succeeding motion vectors and respective succeeding motion states of the camera for the predetermined number of video frames. In some embodiments, in accordance with a determination that the respective succeeding motion states of the camera are respective stationary states, the client device determines (926) whether the respective succeeding motion vectors share a common direction. In some embodiments, in accordance with a determination that the respective succeeding motion vectors share the common direction, the client device determines (926) whether respective magnitudes of the respective succeeding motion vectors are all greater than a third magnitude threshold value.
  • in accordance with a determination that the respective magnitudes are all greater than the third magnitude threshold value, the client device changes (926) the latest motion state of the camera to a respective moving state associated with micro motion. In some embodiments, it is sufficient for the client device to determine that the respective succeeding motion vectors share a common direction when the major components of their directions align.
  • the third magnitude threshold value may be a relatively small value (e.g., smaller than the first and second magnitude threshold values), and a larger number of video frames is predetermined to be obtained when a smaller third magnitude threshold is selected; a sketch of this micro-motion test follows below.
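  • a minimal sketch of the micro-motion test of step 926, assuming the motion vectors are available as 2-D numpy arrays; approximating the common-direction test with dot products against the first vector, and the threshold value of 0.5, are illustrative choices:

        import numpy as np

        def is_micro_motion(succeeding_vectors, third_threshold=0.5):
            # Step 926: slow but consistent drift while each frame reads stationary.
            # The vectors share a common direction when the major components of their
            # directions align, approximated here by a positive dot product with the
            # first vector.
            vs = np.asarray(succeeding_vectors, dtype=float)
            if vs.size == 0:
                return False
            mags = np.linalg.norm(vs, axis=1)
            if np.any(mags == 0.0):
                return False
            units = vs / mags[:, None]
            common_direction = bool(np.all(units @ units[0] > 0.0))
            return common_direction and bool(np.all(mags > third_threshold))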
  • in accordance with a determination that the first motion state of the camera is a respective stationary state, the client device obtains (928) a predetermined number of video frames succeeding the second video frame. For example, the client device obtains five consecutive video frames succeeding the second video frame. The client device then determines (928) respective succeeding motion vectors and respective succeeding motion states of the camera for the predetermined number of video frames. In accordance with a determination that the respective succeeding motion states of the camera are respective stationary states, the client device determines (928) whether the respective succeeding motion vectors share a common direction.
  • the client device determines (928) whether a sum of the respective magnitudes of the respective succeeding motion vectors is greater than a fourth magnitude threshold value.
  • the fourth magnitude threshold value is greater than the third magnitude threshold value as discussed at step 926.
  • the fourth magnitude threshold is greater than or comparable to the first or the second magnitude threshold value as discussed earlier in the present application.
  • in accordance with a determination that the sum is greater than the fourth magnitude threshold value, the client device changes (928) the latest motion state of the camera to a respective moving state associated with micro motion; see the sketch below.
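  • the variant of step 928 can reuse the common-direction test from the sketch above and compare the summed magnitudes against the larger fourth threshold; the value of 5.0 is a placeholder:

        import numpy as np

        def is_micro_motion_by_sum(succeeding_vectors, fourth_threshold=5.0):
            # Step 928: accumulate the magnitudes of all succeeding motion vectors
            # and compare the sum, rather than each magnitude, against the threshold.
            vs = np.asarray(succeeding_vectors, dtype=float)
            mags = np.linalg.norm(vs, axis=1)
            return float(mags.sum()) > fourth_threshold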
  • in accordance with a determination that the first motion state of the camera is a respective stationary state, the client device obtains (930) a plurality of consecutive video frames after the second video frame. The client device then determines (930) whether the respective motion vectors of each pair of consecutive video frames of the plurality of video frames have opposite directions. In accordance with a determination that the respective motion vectors of each pair of consecutive video frames of the plurality of video frames have opposite directions, the client device suppresses (930) changing the motion state of the camera from the respective stationary state to a respective moving state based on motion magnitude. In some embodiments, this motion state is regarded as a respective stationary state associated with hand shaking.
  • the state of the camera is considered stationary while hand shaking is detected; once consecutive motion vectors are detected to have the same direction, the client device determines the state of the camera using the magnitudes of the vectors, as discussed earlier in various embodiments of the present application. A sketch of this hand-shake test follows below.
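  • one possible realization of the hand-shake test of step 930, under the same assumptions as the sketches above; consecutive motion vectors that keep reversing direction (negative pairwise dot products) are treated as shake, and any magnitude-based state change is suppressed:

        import numpy as np

        def is_hand_shaking(consecutive_vectors):
            # Step 930: alternating directions between consecutive frame pairs
            # indicate hand shake, so the camera is kept in the stationary state.
            vs = np.asarray(consecutive_vectors, dtype=float)
            if len(vs) < 2:
                return False
            dots = np.einsum("ij,ij->i", vs[:-1], vs[1:])
            return bool(np.all(dots < 0.0))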
  • the present disclosure further provides a machine-readable storage medium, which stores instructions enabling a machine to execute the method described in this specification.
  • a system or an apparatus equipped with the storage medium may be provided, where software program code implementing the functions of any of the foregoing embodiments is stored in the storage medium, and a computer (or a CPU or an MPU) of the system or the apparatus reads and executes the program code stored in the storage medium.
  • an operating system running on the computer may further be enabled, based on the instructions of the program code, to perform some or all of the actual operations.
  • the program code read from the storage medium may also be written to a memory disposed on an expansion board inserted into the computer, or to a memory disposed in an expansion unit connected to the computer; the CPU on the expansion board or in the expansion unit is then enabled, based on the instructions of the program code, to perform some or all of the actual operations, so as to implement the functions of any of the foregoing embodiments.
  • embodiments of the storage medium used for providing the program code include a floppy disk, a hard disk, a magneto-optical disk, an optical disc (such as a CD-ROM, a CD-R, a CD-RW, a DVD-ROM, a DVD-RAM, a DVD-RW, or a DVD+RW), a magnetic tape, a nonvolatile memory card, and a ROM.
  • a communications network may be used for downloading the program code from a server computer.
  • stages that are not order dependent may be reordered, and other stages may be combined or broken out. While some reorderings and other groupings are specifically mentioned, others will be apparent to those of ordinary skill in the art, so the alternatives presented here are not exhaustive. Moreover, it should be recognized that the stages could be implemented in hardware, firmware, software, or any combination thereof.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

Disclosed are a method and system for image identification. The method includes: obtaining a sequence of video frames including at least a first video frame captured by the camera before a second video frame; determining a relative motion state of the camera associated with each video frame of the sequence of video frames, and determining a first motion state of the camera associated with the second video frame by performing motion estimation; determining whether the camera has undergone a transition of motion states, from a respective moving state to a respective stationary state, between two consecutive video frames; and, in accordance with a determination that the camera has undergone the transition of motion states from the respective moving state to the respective stationary state, determining whether a latter video frame of the two consecutive video frames is valid for uploading, in accordance with predetermined uploading criteria.
PCT/CN2014/086171 2013-09-18 2014-09-10 Method and system for image identification Ceased WO2015039575A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2015563118A JP6026680B1 (ja) 2013-09-18 2014-09-10 Method and system for performing image identification

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310428930.2A CN104144345B (zh) 2013-09-18 2013-09-18 Method for performing real-time image identification on a mobile terminal and the mobile terminal
CN201310428930.2 2013-09-18

Publications (1)

Publication Number Publication Date
WO2015039575A1 (fr) 2015-03-26

Family

ID=51853403

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/086171 Ceased WO2015039575A1 (fr) Method and system for image identification

Country Status (5)

Country Link
JP (1) JP6026680B1 (fr)
CN (1) CN104144345B (fr)
SA (1) SA114350742B1 (fr)
TW (1) TWI522930B (fr)
WO (1) WO2015039575A1 (fr)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN109460556A (zh) * 2017-09-06 2019-03-12 Beijing Sogou Technology Development Co., Ltd. Translation method and apparatus
  • CN108229391B (zh) 2018-01-02 2021-12-24 BOE Technology Group Co., Ltd. Gesture recognition device and server thereof, gesture recognition system, and gesture recognition method
  • EP3823267B1 (fr) * 2018-03-11 2023-05-10 Google LLC Static video recognition
  • CN110782647A (zh) * 2019-11-06 2020-02-11 重庆神缘智能科技有限公司 Intelligent meter reading system based on image recognition
  • CN110929093B (zh) * 2019-11-20 2023-08-11 Baidu Online Network Technology (Beijing) Co., Ltd. Method, apparatus, device, and medium for search control
  • CN113869258B (zh) * 2021-10-08 2025-08-26 重庆紫光华山智安科技有限公司 Traffic event detection method and apparatus, electronic device, and readable storage medium
  • CN115375974A (zh) * 2022-09-03 2022-11-22 深圳纬格科技有限公司 High-stability image capture and recognition method and system, and desk lamp

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • JP4636786B2 (ja) * 2003-08-28 2011-02-23 Casio Computer Co., Ltd. Captured image projection apparatus, image processing method for captured image projection apparatus, and program
  • JP2007096532A (ja) * 2005-09-27 2007-04-12 Canon Inc Image storage apparatus and image storage system
  • JP5221550B2 (ja) * 2007-09-14 2013-06-26 Sharp Corporation Image display device and image display method
  • JP4492724B2 (ja) * 2008-03-25 2010-06-30 Sony Corporation Image processing apparatus, image processing method, and program
  • JP5583992B2 (ja) * 2010-03-09 2014-09-03 Panasonic Corporation Signal processing device
  • CN102447870A (zh) * 2010-09-30 2012-05-09 宝利微电子系统控股公司 Still object detection method and motion compensation apparatus
  • CN102521979B (zh) * 2011-12-06 2013-10-23 北京万集科技股份有限公司 Method and system for road surface event detection based on a high-definition camera
  • CN102609957A (zh) * 2012-01-16 2012-07-25 上海智觉光电科技有限公司 Method and system for detecting picture offset of a camera device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN101019418A (zh) * 2004-07-21 2007-08-15 Processing of video data to compensate for unintentional camera movement between acquired image frames
  • CN101755448A (zh) * 2007-07-20 2010-06-23 Correcting imaging device motion during an exposure
  • CN102656876A (zh) * 2009-10-14 2012-09-05 Method and apparatus for image stabilization

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • US20210341725A1 (en) * 2019-05-29 2021-11-04 Tencent Technology (Shenzhen) Company Limited Image status determining method and apparatus, device, system, and computer storage medium
  • EP3979194A4 (fr) 2019-05-29 2022-08-17 Tencent Technology (Shenzhen) Company Limited Image state determining method and device, apparatus, system, and computer storage medium
  • US11921278B2 (en) 2019-05-29 2024-03-05 Tencent Technology (Shenzhen) Company Limited Image status determining method and apparatus, device, system, and computer storage medium
  • CN114972809A (zh) * 2021-02-19 2022-08-30 Ricoh Co., Ltd. Method, device, and computer-readable storage medium for video processing
  • CN113516018A (zh) * 2021-04-22 2021-10-19 深圳市睿联技术股份有限公司 Target detection method, security device, and readable storage medium

Also Published As

Publication number Publication date
TWI522930B (zh) 2016-02-21
CN104144345A (zh) 2014-11-12
SA114350742B1 (ar) 2015-08-30
JP6026680B1 (ja) 2016-11-16
HK1200623A1 (en) 2015-08-07
TW201512996A (zh) 2015-04-01
JP2016537692A (ja) 2016-12-01
CN104144345B (zh) 2016-08-17

Similar Documents

Publication Publication Date Title
  • WO2015039575A1 (fr) Method and system for image identification
US20240371189A1 (en) Efficient Image Analysis
US10452953B2 (en) Image processing device, image processing method, program, and information recording medium
US9179071B2 (en) Electronic device and image selection method thereof
  • KR102087882B1 (ko) Apparatus and method for identifying media streams based on visual image matching
  • CN111464716B (zh) Certificate scanning method, apparatus, device, and storage medium
  • CN106560840B (zh) Image information recognition and processing method and apparatus
  • WO2021004186A1 (fr) Face collection method, apparatus, system, device, and medium
US9275275B2 (en) Object tracking in a video stream
US10122912B2 (en) Device and method for detecting regions in an image
  • CN111259907A (zh) Content recognition method and apparatus, and electronic device
  • CN105354296B (zh) Terminal positioning method and user terminal
US11087121B2 (en) High accuracy and volume facial recognition on mobile platforms
US20190327475A1 (en) Object segmentation in a sequence of color image frames based on adaptive foreground mask upsampling
  • CN110097061B (zh) Image display method and apparatus
  • WO2015074405A1 (fr) Methods and devices for obtaining card information
  • CN111210506B (zh) Three-dimensional restoration method, system, terminal device, and storage medium
US11113998B2 (en) Generating three-dimensional user experience based on two-dimensional media content
US20170091760A1 (en) Device and method for currency conversion
  • CN109783680B (zh) Image pushing method, image obtaining method, apparatus, and image processing system
  • WO2024255425A1 (fr) Image acquisition
US10650242B2 (en) Information processing apparatus, method, and storage medium storing a program that obtain a feature amount from a frame in accordance with a specified priority order
  • CN104933688B (zh) Data processing method and electronic device
  • CN115004245A (zh) Target detection method and apparatus, electronic device, and computer storage medium
  • CN111241990A (zh) Image processing method and apparatus, computer device, and computer-readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14845268

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2015563118

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM F1205A DATED 07.07.2016)

122 Ep: pct application non-entry in european phase

Ref document number: 14845268

Country of ref document: EP

Kind code of ref document: A1