
WO2021081562A3 - Multi-head text recognition model for multi-lingual optical character recognition


Info

Publication number
WO2021081562A3
Authority
WO
WIPO (PCT)
Prior art keywords
electronic device
image
language
textual content
optical character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2021/014171
Other languages
French (fr)
Other versions
WO2021081562A2 (en)
Inventor
Kaiyu ZHANG
Yuan Lin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Innopeak Technology Inc
Original Assignee
Innopeak Technology Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-01-20
Filing date
2021-01-20
Publication date
2021-12-09
Application filed by Innopeak Technology Inc
Priority to PCT/US2021/014171
Publication of WO2021081562A2
Publication of WO2021081562A3
Anticipated expiration
Current legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Character Discrimination (AREA)

Abstract

This application is directed to performing optical character recognition (OCR) using deep learning techniques. An electronic device receives an image and a language indicator that indicates that the textual content in the image corresponds to a first language. The electronic device processes the image using a multilingual text recognition model applicable to a plurality of languages. The electronic device generates a feature sequence including a plurality of probability values corresponding to the textual content of the image. The feature sequence includes a plurality of feature subsets that correspond to the plurality of languages. For each feature subset, each probability value indicates a probability that a respective textual content corresponds to a respective character in a dictionary of the corresponding language. The electronic device constructs a sparse mask based on the first language and combines the feature sequence and the sparse mask to determine the textual content.
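The masking step described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the two tiny dictionaries, the function names, the use of element-wise multiplication to combine the feature sequence with the sparse mask, and the per-step argmax used to pick characters. The sketch only shows the idea that one feature sequence holds a subset of probability values per language and that the mask keeps only the subset for the indicated language.

```python
# Hypothetical per-language dictionaries; a real model would use far larger ones.
DICTIONARIES = {
    "en": ["a", "b", "c"],
    "el": ["α", "β"],
}


def build_sparse_mask(language, dictionaries):
    """Return a 0/1 mask that is 1 only on the columns of the chosen language."""
    mask = []
    for lang, chars in dictionaries.items():
        mask.extend([1.0 if lang == language else 0.0] * len(chars))
    return mask


def decode(feature_sequence, language, dictionaries):
    """Combine each step of the feature sequence with the mask and pick a character.

    Assumption: the combination is an element-wise product followed by an
    argmax per step; the abstract does not specify the exact operation.
    """
    mask = build_sparse_mask(language, dictionaries)
    # Concatenated vocabulary, in the same column order as the feature subsets.
    vocabulary = [ch for chars in dictionaries.values() for ch in chars]
    text = []
    for step in feature_sequence:
        masked = [p * m for p, m in zip(step, mask)]
        best = max(range(len(masked)), key=masked.__getitem__)
        text.append(vocabulary[best])
    return "".join(text)


# Two time steps; columns 0-2 are the "en" subset, columns 3-4 the "el" subset.
feature_sequence = [
    [0.10, 0.60, 0.05, 0.20, 0.05],
    [0.05, 0.05, 0.30, 0.10, 0.50],
]

text_en = decode(feature_sequence, "en", DICTIONARIES)
text_el = decode(feature_sequence, "el", DICTIONARIES)
```

In this toy input the second step has its highest raw value in the "el" subset, so decoding with the "en" indicator gives a different character than an unmasked argmax would. That is the effect the language indicator is meant to have in this sketch.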

Priority Applications (1)

Application Number Publication Number Priority Date Filing Date Title
PCT/US2021/014171 WO2021081562A2 (en) 2021-01-20 2021-01-20 Multi-head text recognition model for multi-lingual optical character recognition

Applications Claiming Priority (1)

Application Number Publication Number Priority Date Filing Date Title
PCT/US2021/014171 WO2021081562A2 (en) 2021-01-20 2021-01-20 Multi-head text recognition model for multi-lingual optical character recognition

Publications (2)

Publication Number Publication Date
WO2021081562A2 (en) 2021-04-29
WO2021081562A3 (en) 2021-12-09

Family

ID=75620895

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
PCT/US2021/014171 Ceased WO2021081562A2 (en) 2021-01-20 2021-01-20 Multi-head text recognition model for multi-lingual optical character recognition

Country Status (1)

Country Link
WO (1) WO2021081562A2 (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021237227A1 (en) * 2021-07-01 2021-11-25 Innopeak Technology, Inc. Method and system for multi-language text recognition model with autonomous language classification
CN113744281A (en) * 2021-07-20 2021-12-03 北京旷视科技有限公司 Instance segmentation network training and instance segmentation method, device and electronic device
CN113657391A (en) * 2021-08-13 2021-11-16 北京百度网讯科技有限公司 Training method of character recognition model, and method and device for recognizing characters
CN113762269B (en) * 2021-09-08 2024-03-22 深圳市网联安瑞网络科技有限公司 Chinese character OCR recognition method, system and medium based on neural network
CN113744157B (en) * 2021-09-09 2025-06-03 深圳市医诺智能科技发展有限公司 A method and terminal for medical image denoising and enhancement
CN114022876B (en) * 2021-11-15 2025-03-07 中再云图技术有限公司 An electronic scale image text recognition method based on artificial intelligence OCR
CN114120321B (en) * 2021-12-01 2024-07-23 北京比特易湃信息技术有限公司 Text recognition method based on multi-dictionary sample weighting
CN114495111B (en) * 2022-01-20 2024-07-26 北京字节跳动网络技术有限公司 Text recognition method and device, readable medium and electronic equipment
CN114445812B (en) * 2022-01-30 2025-12-09 北京有竹居网络技术有限公司 Character recognition method, device, equipment and medium
CN114818738B (en) * 2022-03-01 2024-08-02 达观数据有限公司 A method and system for identifying customer service hotline user intention trajectory
CN114973224A (en) * 2022-04-12 2022-08-30 北京百度网讯科技有限公司 Character recognition method and device, electronic equipment and storage medium
CN114596566B (en) * 2022-04-18 2022-08-02 腾讯科技(深圳)有限公司 Text recognition method and related device
CN114495114B (en) * 2022-04-18 2022-08-05 华南理工大学 Text sequence recognition model calibration method based on CTC decoder
CN114821566B (en) * 2022-05-13 2024-06-14 北京百度网讯科技有限公司 Text recognition method, device, electronic device and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6047251A (en) * 1997-09-15 2000-04-04 Caere Corporation Automatic language identification system for multilingual optical character recognition
KR20060046128A (en) * 2004-05-20 2006-05-17 마이크로소프트 코포레이션 Low resolution OCR for camera input documents
US20110123115A1 (en) * 2009-11-25 2011-05-26 Google Inc. On-Screen Guideline-Based Selective Text Recognition
US20150370785A1 (en) * 2014-06-24 2015-12-24 Google Inc. Techniques for machine language translation of text from an image based on non-textual context information from the image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHAE HO-YEOL: "Scene Text Recognition Performance Improvement through an Add-on of an OCR based Classifier", JOURNAL OF IKEEE, vol. 24, no. 4, 1 January 2020 (2020-01-01), pages 1086 - 1092, XP055877271, ISSN: 2288-243X, DOI: 10.7471/ikeee.2020.24.4.1086 *

Also Published As

Publication number Publication date
WO2021081562A2 (en) 2021-04-29

Similar Documents

Publication Publication Date Title
WO2021081562A3 (en) Multi-head text recognition model for multi-lingual optical character recognition
CN101002198B (en) Systems and methods for spell correction of non-roman characters and words
Vinnarasu et al. Speech to text conversion and summarization for effective understanding and documentation
US11416709B2 (en) Method, apparatus, device and computer readable medium for generating VQA training data
CN102982021A (en) Method for disambiguating multiple readings in language conversion
US11907665B2 (en) Method and system for processing user inputs using natural language processing
US11741317B2 (en) Method and system for processing multilingual user inputs using single natural language processing model
CN108536654A (en) Identify textual presentation method and device
CN112231480A (en) A bert-based phonetic hybrid error correction model
CN113268576B (en) Deep learning-based department semantic information extraction method and device
CN105760359A (en) Question processing system and method thereof
Tursun et al. Noisy Uyghur text normalization
Khosrobeigi et al. A rule-based post-processing approach to improve Persian OCR performance
US20210064820A1 (en) Machine learning lexical discovery
CN111241845B (en) Automatic financial subject identification method and device based on semantic matching method
Al-Sanabani et al. Improved an algorithm for Arabic name matching
CN112148862A (en) Question intention identification method and device, storage medium and electronic equipment
Nguyen et al. OCR error correction for Vietnamese handwritten text using neural machine translation
US20210073466A1 (en) Semantic vector rule discovery
Chiu et al. Chinese spell checking based on noisy channel model
Lu et al. An automatic spelling correction method for classical mongolian
CN111738023A (en) Automatic image-text audio translation method and system
CN112668312A (en) Wrongly written character correction method and device, electronic equipment and storage medium
Liu et al. Chinese Spelling Correction: A Comprehensive Survey of Progress, Challenges, and Opportunities
Ratnam et al. Phonogram-based Automatic Typo Correction in Malayalam Social Media Comments

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 21720671; Country of ref document: EP; Kind code of ref document: A2

NENP Non-entry into the national phase
Ref country code: DE

122 EP: PCT application non-entry in European phase
Ref document number: 21720671; Country of ref document: EP; Kind code of ref document: A2