WO2021081562A3 - Multi-head text recognition model for multi-lingual optical character recognition

Info

Publication number: WO2021081562A3
Application number: PCT/US2021/014171
Authority: WO
Inventors: Kaiyu Zhang; Yuan Lin
Original assignee: Innopeak Technology Inc
Current assignee: Innopeak Technology Inc
Priority date: 2021-01-20
Filing date: 2021-01-20
Publication date: 2021-12-09
Anticipated expiration: 2023-07-20
Also published as: WO2021081562A2

Abstract

This application is directed to performing optical character recognition (OCR) using deep learning techniques. An electronic device receives an image and a language indicator that indicates that the textual content in the image corresponds to a first language. The electronic device processes the image using a multilingual text recognition model applicable to a plurality of languages. The electronic device generates a feature sequence including a plurality of probability values corresponding to the textual content of the image. The feature sequence includes a plurality of feature subsets, each of which corresponds to one of the plurality of languages. Within each feature subset, each probability value indicates a probability that respective textual content corresponds to a respective character in a dictionary of the corresponding language. The electronic device constructs a sparse mask based on the first language and combines the feature sequence and the sparse mask to determine the textual content.
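The masking step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the application: the two toy dictionaries, the layout of the feature sequence as per-language subsets concatenated in dictionary order, the use of element-wise multiplication to combine the feature sequence with the mask, and the greedy per-step argmax used to pick characters. The function names `subset_offsets`, `build_sparse_mask`, and `decode` are hypothetical.

```python
# Hypothetical per-language dictionaries (assumption: the abstract does not
# list the actual dictionaries or their sizes).
DICTIONARIES = {
    "en": ["a", "b", "c"],
    "el": ["α", "β"],
}


def subset_offsets(dictionaries):
    """Map each language to the (start, end) index range of its feature subset."""
    offsets, start = {}, 0
    for lang, chars in dictionaries.items():
        offsets[lang] = (start, start + len(chars))
        start += len(chars)
    return offsets


def build_sparse_mask(dictionaries, language):
    """Return a mask that is 1.0 over the selected language's subset, 0.0 elsewhere."""
    start, end = subset_offsets(dictionaries)[language]
    total = sum(len(chars) for chars in dictionaries.values())
    return [1.0 if start <= i < end else 0.0 for i in range(total)]


def decode(feature_sequence, dictionaries, language):
    """Combine each step with the mask (element-wise product) and pick the best character."""
    mask = build_sparse_mask(dictionaries, language)
    # Flat vocabulary in the same order as the concatenated feature subsets.
    vocab = [ch for chars in dictionaries.values() for ch in chars]
    text = []
    for step in feature_sequence:
        masked = [p * m for p, m in zip(step, mask)]
        best = max(range(len(masked)), key=masked.__getitem__)
        text.append(vocab[best])
    return "".join(text)


# Toy feature sequence: three steps, five probabilities each
# (three for "en" followed by two for "el").
feature_sequence = [
    [0.10, 0.20, 0.05, 0.60, 0.05],
    [0.30, 0.05, 0.10, 0.05, 0.50],
    [0.05, 0.10, 0.40, 0.25, 0.20],
]

print(decode(feature_sequence, DICTIONARIES, "en"))
print(decode(feature_sequence, DICTIONARIES, "el"))
```

In this sketch the same feature sequence yields different text depending on the language indicator, because the mask zeroes out every probability outside the selected language's subset before a character is chosen. A real recognizer would likely also need sequence decoding (for example, handling repeated or blank symbols), which is omitted here.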