JP2018151925A

JP2018151925A - Terminal, character recognition system, control method of terminal and program

Info

Publication number: JP2018151925A
Application number: JP2017048476A
Authority: JP
Inventors: 雅人左貝; Masato Sakai
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2017-03-14
Filing date: 2017-03-14
Publication date: 2018-09-27
Anticipated expiration: 2037-03-14
Also published as: JP7091606B2

Abstract

【課題】クラウドシステムにおける、高精度且つ高速なＯＣＲ機能を実現する端末を提供する。【解決手段】端末は、被写体を撮像し画像を取得する、撮像部と、取得された画像の領域のなかから文字認識装置に文字認識を行わせる文字認識範囲を決定する、認識範囲決定部と、決定された文字認識範囲のデータを文字認識装置に出力する、出力部と、を備える。認識範囲決定部は、取得された画像を表示すると共に、ユーザが前記表示された画像上で所定の範囲を入力するための画面を表示し、入力された範囲を前記文字認識範囲に決定してもよい。【選択図】図１ PROBLEM TO BE SOLVED: To provide a terminal that realizes a high-precision, high-speed OCR function in a cloud system. SOLUTION: A terminal includes an imaging unit that images a subject and acquires an image; a recognition range determination unit that determines, from within the area of the acquired image, a character recognition range on which a character recognition device is to perform character recognition; and an output unit that outputs data of the determined character recognition range to the character recognition device. The recognition range determination unit may display the acquired image together with a screen on which the user inputs a predetermined range on the displayed image, and may determine the input range as the character recognition range. SELECTED DRAWING: Figure 1

Description

本発明は、端末、文字認識システム、端末の制御方法及びプログラムに関する。 The present invention relates to a terminal, a character recognition system, a terminal control method, and a program.

ＯＣＲ（Optical Character Recognition；光学的文字認識）と称される技術がある。ＯＣＲは、通常、専用装置にＯＣＲアプリケーションソフトが実装され、当該アプリケーションソフトにより画像を取得する際の撮像条件を制御しながら高い認識精度や高速なレスポンスを実現している。 There is a technique called OCR (Optical Character Recognition). In the OCR, OCR application software is usually mounted on a dedicated device, and high recognition accuracy and high-speed response are realized while controlling imaging conditions when an image is acquired by the application software.

また、スマートフォン等の端末にはカメラが内蔵されており、当該カメラを用いたＯＣＲ機能を実現する端末が存在する（特許文献１参照）。さらに、ＯＣＲ機能はスマートフォン等だけでなく、種々の装置にて利用される。例えば、特許文献２には、ＯＣＲ機能を利用したナンバープレート読取装置が開示されている。また、特許文献３には、クラウド（クラウドサーバ）にＯＣＲ機能を実装し、当該クラウドサーバ上にてＯＣＲを実行する技術が開示されている。 Terminals such as smartphones have built-in cameras, and some terminals realize an OCR function using the camera (see Patent Document 1). The OCR function is used not only in smartphones but also in various other devices. For example, Patent Document 2 discloses a license plate reader that uses an OCR function. Patent Document 3 discloses a technique in which an OCR function is implemented in the cloud (a cloud server) and OCR is executed on that cloud server.

Patent Document 1: 特開２００５−０９４７８２号公報 (JP 2005-094782 A)
Patent Document 2: 特開２００９−０１５４７８号公報 (JP 2009-015478 A)
Patent Document 3: 特開２０１５−２０４０１５号公報 (JP 2015-204015 A)

なお、上記先行技術文献の各開示を、本書に引用をもって繰り込むものとする。以下の分析は、本発明者らによってなされたものである。 Each disclosure of the above prior art document is incorporated herein by reference. The following analysis was made by the present inventors.

上述のように、クラウドサーバにてＯＣＲを実行することがある。しかし、実際にクラウドサーバにてＯＣＲ機能を実現することに関しては問題が多い。具体的には、ユーザから提供される画像の領域のうち、全ての領域を文字認識の対象とするのか、一部の領域を文字認識の対象とするのかクラウドサーバでは判断できない。従って、クラウドサーバでは、画像の全領域を文字認識の対象とすることになるが、そのような対応ではクラウドサーバによる高速なレスポンスは期待できない。また、所定のスピード（レスポンス）を確保するために、文字認識に係るアルゴリズム等を簡略化することも考えられるが、そのような対応は文字認識精度の悪化を招く。 As described above, OCR may be executed on a cloud server. However, actually realizing an OCR function on a cloud server involves many problems. Specifically, the cloud server cannot determine whether the entire area of an image provided by the user, or only part of it, should be subjected to character recognition. The cloud server therefore has to treat the entire image as the target of character recognition, and with that approach a high-speed response from the cloud server cannot be expected. Simplifying the character recognition algorithm to secure a given speed (response) is also conceivable, but doing so degrades character recognition accuracy.

特許文献３に開示されたシステムでは、文字領域を複数の部分領域に分割した上で、各部分領域にて文字認識を行っている。しかし、このような対応でも、ユーザが必要としない文字（文字領域）も認識することに変わりなく、高速なレスポンスは期待できない。 In the system disclosed in Patent Document 3, a character area is divided into a plurality of partial areas, and character recognition is performed on each partial area. Even with this approach, however, characters (character areas) that the user does not need are still recognized, so a high-speed response cannot be expected.

本発明は、クラウドシステムにおける、高精度且つ高速なＯＣＲ機能を実現する、端末、文字認識システム、端末の制御方法及びプログラムを提供することを目的とする。 It is an object of the present invention to provide a terminal, a character recognition system, a terminal control method, and a program that realize a high-precision and high-speed OCR function in a cloud system.

本発明の第１の視点によれば、被写体を撮像し画像を取得する、撮像部と、前記取得された画像の領域のなかから文字認識装置に文字認識を行わせる文字認識範囲を決定する、認識範囲決定部と、前記決定された文字認識範囲のデータを前記文字認識装置に出力する、出力部と、を備える、端末が提供される。 According to a first aspect of the present invention, there is provided a terminal comprising: an imaging unit that images a subject and acquires an image; a recognition range determination unit that determines, from within the area of the acquired image, a character recognition range on which a character recognition device is to perform character recognition; and an output unit that outputs data of the determined character recognition range to the character recognition device.

本発明の第２の視点によれば、文字認識装置と、前記文字認識装置に文字認識を依頼する端末と、を含み、前記端末は、被写体を撮像し画像を取得する、撮像部と、前記取得された画像の領域のなかから前記文字認識装置に文字認識を行わせる文字認識範囲を決定する、認識範囲決定部と、前記決定された文字認識範囲のデータを前記文字認識装置に出力する、出力部と、を備える、文字認識システムが提供される。 According to a second aspect of the present invention, there is provided a character recognition system including a character recognition device and a terminal that requests the character recognition device to perform character recognition, wherein the terminal comprises: an imaging unit that images a subject and acquires an image; a recognition range determination unit that determines, from within the area of the acquired image, a character recognition range on which the character recognition device is to perform character recognition; and an output unit that outputs data of the determined character recognition range to the character recognition device.

本発明の第３の視点によれば、被写体を撮像し画像を取得するステップと、前記取得された画像の領域のなかから文字認識装置に文字認識を行わせる文字認識範囲を決定するステップと、前記決定された文字認識範囲のデータを前記文字認識装置に出力するステップと、含む、端末の制御方法が提供される。 According to a third aspect of the present invention, there is provided a method for controlling a terminal, including the steps of: imaging a subject and acquiring an image; determining, from within the area of the acquired image, a character recognition range on which a character recognition device is to perform character recognition; and outputting data of the determined character recognition range to the character recognition device.

本発明の第４の視点によれば、被写体を撮像し画像を取得する処理と、前記取得された画像の領域のなかから文字認識装置に文字認識を行わせる文字認識範囲を決定する処理と、前記決定された文字認識範囲のデータを前記文字認識装置に出力する処理と、をコンピュータに実行させるプログラムが提供される。 According to a fourth aspect of the present invention, there is provided a program that causes a computer to execute: a process of imaging a subject and acquiring an image; a process of determining, from within the area of the acquired image, a character recognition range on which a character recognition device is to perform character recognition; and a process of outputting data of the determined character recognition range to the character recognition device.
なお、このプログラムは、コンピュータが読み取り可能な記憶媒体に記録することができる。記憶媒体は、半導体メモリ、ハードディスク、磁気記録媒体、光記録媒体等の非トランジェント（non-transient）なものとすることができる。本発明は、コンピュータプログラム製品として具現することも可能である。 This program can be recorded on a computer-readable storage medium. The storage medium may be a non-transient medium such as a semiconductor memory, a hard disk, a magnetic recording medium, or an optical recording medium. The present invention can also be embodied as a computer program product.

本発明の各視点によれば、クラウドシステムにおける、高精度且つ高速なＯＣＲ機能を実現する、端末、文字認識システム、端末の制御方法及びプログラムが、提供される。 According to each aspect of the present invention, a terminal, a character recognition system, a terminal control method, and a program that realize a high-precision and high-speed OCR function in a cloud system are provided.

一実施形態の概要を説明するための図である。 A diagram for explaining an overview of one embodiment.
第１の実施形態に係る文字認識システムの構成の一例を示す図である。 A diagram showing an example of the configuration of the character recognition system according to the first embodiment.
第１の実施形態に係る端末のハードウェア構成の一例を示す図である。 A diagram showing an example of the hardware configuration of the terminal according to the first embodiment.
第１の実施形態に係る文字認識サーバのハードウェア構成の一例を示す図である。 A diagram showing an example of the hardware configuration of the character recognition server according to the first embodiment.
第１の実施形態に係る端末の処理構成の一例を示す図である。 A diagram showing an example of the processing configuration of the terminal according to the first embodiment.
カメラモジュールにより取得される基礎画像の一例を示す図である。 A diagram showing an example of basic images acquired by the camera module.
画像合成部により生成される候補画像の一例を示す図である。 A diagram showing an example of a candidate image generated by the image synthesis unit.
画像検証部により生成されるユーザインターフェイスの一例を示す図である。 A diagram showing an example of a user interface generated by the image verification unit.
認識範囲決定部により提供されるユーザインターフェイスの一例を示す図である。 A diagram showing an example of a user interface provided by the recognition range determination unit.
認識範囲決定部により提供されるユーザインターフェイスの一例を示す図である。 A diagram showing an example of a user interface provided by the recognition range determination unit.
第１の実施形態に係る文字認識サーバの処理構成の一例示す図である。 A diagram showing an example of the processing configuration of the character recognition server according to the first embodiment.
第１の実施形態に係る文字認識システムの動作の一例を示すシーケンス図である。 A sequence diagram showing an example of the operation of the character recognition system according to the first embodiment.
第２の実施形態に係る認識範囲決定部の動作を説明するための図である。 A diagram for explaining the operation of the recognition range determination unit according to the second embodiment.
一実施形態に係る端末の処理構成の一例を示す図である。 A diagram showing an example of the processing configuration of a terminal according to one embodiment.

初めに、一実施形態の概要について説明する。なお、この概要に付記した図面参照符号は、理解を助けるための一例として各要素に便宜上付記したものであり、この概要の記載はなんらの限定を意図するものではない。また、各図におけるブロック間の接続線は、双方向及び単方向の双方を含む。一方向矢印については、主たる信号（データ）の流れを模式的に示すものであり、双方向性を排除するものではない。 First, an outline of one embodiment will be described. Note that the reference numerals of the drawings attached to the outline are attached to the respective elements for convenience as an example for facilitating understanding, and the description of the outline is not intended to be any limitation. In addition, the connection lines between the blocks in each drawing include both bidirectional and unidirectional directions. The unidirectional arrow schematically shows the main signal (data) flow and does not exclude bidirectionality.

一実施形態に係る端末１００は、被写体を撮像し画像を取得する、撮像部１０１と、取得された画像の領域のなかから文字認識装置に文字認識を行わせる文字認識範囲を決定する、認識範囲決定部１０２と、決定された文字認識範囲のデータを文字認識装置に出力する、出力部１０３と、を備える。 A terminal 100 according to one embodiment includes: an imaging unit 101 that images a subject and acquires an image; a recognition range determination unit 102 that determines, from within the area of the acquired image, a character recognition range on which a character recognition device is to perform character recognition; and an output unit 103 that outputs data of the determined character recognition range to the character recognition device.

端末１００は、例えば、取得された画像を画面に表示し、当該画像の領域のなかからユーザが真に文字認識を行いたい範囲を決定するためのインターフェイスを提供する。その後、端末１００は、ユーザにより入力指示された所定範囲を外部の文字認識装置に送信する。文字認識装置では、文字認識の対象が制限されるため、文字認識のための処理を簡略化する等の対策をしなくとも高速に文字認識結果を出力することができる。 For example, the terminal 100 displays an acquired image on a screen and provides an interface for determining a range in which the user really wants to perform character recognition from the area of the image. Thereafter, the terminal 100 transmits a predetermined range instructed to be input by the user to an external character recognition device. In the character recognition device, the character recognition target is limited, and therefore the character recognition result can be output at high speed without taking measures such as simplifying the process for character recognition.

以下に具体的な実施の形態について、図面を参照してさらに詳しく説明する。なお、各実施形態において同一構成要素には同一の符号を付し、その説明を省略する。 Specific embodiments are described below in more detail with reference to the drawings. In each embodiment, the same components are given the same reference numerals, and their description is omitted.

［第１の実施形態］ [First Embodiment]
第１の実施形態について、図面を用いてより詳細に説明する。 The first embodiment will be described in more detail with reference to the drawings.

図２は、第１の実施形態に係る文字認識システムの構成の一例を示す図である。図２を参照すると、文字認識システムは、端末１０と、文字認識サーバ２０と、を含んで構成される。 FIG. 2 is a diagram illustrating an example of the configuration of the character recognition system according to the first embodiment. Referring to FIG. 2, the character recognition system includes a terminal 10 and a character recognition server 20.

端末１０は、スマートフォンや携帯電話等の端末であり、カメラを内蔵する。 The terminal 10 is a terminal such as a smartphone or a mobile phone, and incorporates a camera.

文字認識サーバ２０は、端末１０から提供される画像（カメラにより撮影される画像）に対して文字認識を実行し、その結果を端末１０に応答する文字認識装置である。 The character recognition server 20 is a character recognition device that performs character recognition on an image provided by the terminal 10 (an image taken by the camera) and returns the result to the terminal 10.

文字認識サーバ２０は、クラウドシステムにより提供されるサーバであり、端末１０と文字認識サーバ２０はネットワークを介して接続されている。なお、図２には、１台の端末１０を図示しているが、実際には多数の端末１０が文字認識サーバ２０を利用する。 The character recognition server 20 is a server provided by a cloud system, and the terminal 10 and the character recognition server 20 are connected via a network. In FIG. 2, one terminal 10 is illustrated, but in reality, many terminals 10 use the character recognition server 20.

［ハードウェア構成］ [Hardware Configuration]
初めに、第１の実施形態に係る文字認識システムを構成する各種装置のハードウェア構成を説明する。 First, the hardware configuration of the various devices constituting the character recognition system according to the first embodiment will be described.

図３は、端末１０のハードウェア構成の一例を示す図である。端末１０は、例えば、内部バスにより相互に接続される、ＣＰＵ（Central Processing Unit）１１、メモリ１２、カメラモジュール１３、液晶パネル及びタッチパネル１４、無線信号送受信回路１５等を備える。 FIG. 3 is a diagram illustrating an example of a hardware configuration of the terminal 10. The terminal 10 includes, for example, a CPU (Central Processing Unit) 11, a memory 12, a camera module 13, a liquid crystal panel and touch panel 14, a wireless signal transmission / reception circuit 15, and the like that are connected to each other via an internal bus.

但し、図３に示す構成は、端末１０のハードウェア構成を限定する趣旨ではない。端末１０は、図示しないハードウェアを含んでもよい。また、端末１０に含まれるＣＰＵ等の数も図３の例示に限定する趣旨ではなく、例えば、複数のＣＰＵが端末１０に含まれていてもよい。 However, the configuration illustrated in FIG. 3 is not intended to limit the hardware configuration of the terminal 10. The terminal 10 may include hardware not shown. Further, the number of CPUs and the like included in the terminal 10 is not limited to the example illustrated in FIG. 3. For example, a plurality of CPUs may be included in the terminal 10.

メモリ１２は、ＲＡＭ（Random Access Memory）、ＲＯＭ（Read Only Memory）、補助記憶装置（ハードディスク等）等の１以上を含む。 The memory 12 includes one or more of a random access memory (RAM), a read only memory (ROM), an auxiliary storage device (such as a hard disk), and the like.

カメラモジュール１３は、レンズやＣＣＤ（Charge Coupled Device）等の撮像センサを備えるモジュールである。 The camera module 13 is a module including a lens and an imaging sensor such as a CCD (Charge Coupled Device).

液晶パネル及びタッチパネル１４は、ユーザにＧＵＩ（Graphical User Interface）を提供するための入出力デバイスである。ユーザは、液晶パネルに表示される画面及びメッセージを確認し、タッチパネルを操作して端末１０に情報を入力する。 The liquid crystal panel and the touch panel 14 are input / output devices for providing a user with a GUI (Graphical User Interface). The user checks the screen and message displayed on the liquid crystal panel, and operates the touch panel to input information to the terminal 10.

無線信号送受信回路１５は、アンテナ１６に接続され、無線信号を送受信するための回路である。 The wireless signal transmission / reception circuit 15 is connected to the antenna 16 and is a circuit for transmitting / receiving a wireless signal.

端末１０の機能は、後述する処理モジュールにより実現される。当該処理モジュールは、例えば、メモリ１２に格納されたプログラムをＣＰＵ１１が実行することで実現される。また、そのプログラムは、ネットワークを介してダウンロードするか、あるいは、プログラムを記憶した記憶媒体を用いて、更新することができる。さらに、上記処理モジュールは、半導体チップにより実現されてもよい。即ち、上記処理モジュールが行う機能は、何らかのハードウェア及び／又はソフトウェアにより実現できればよい。 The functions of the terminal 10 are realized by the processing modules described later. Each processing module is realized, for example, by the CPU 11 executing a program stored in the memory 12. The program can be updated by downloading it via a network or by using a storage medium that stores the program. The processing modules may also be realized by a semiconductor chip. In other words, it suffices that the functions performed by the processing modules can be realized by some form of hardware and/or software.

図４は、文字認識サーバ２０のハードウェア構成の一例を示す図である。文字認識サーバ２０は、情報処理装置（所謂、コンピュータ）により実現可能であり、上述したＣＰＵ、メモリ等に加え、入出力インターフェイス１７及びＮＩＣ（Network Interface Card）１８を備える。 FIG. 4 is a diagram illustrating an example of a hardware configuration of the character recognition server 20. The character recognition server 20 can be realized by an information processing apparatus (so-called computer), and includes an input / output interface 17 and a NIC (Network Interface Card) 18 in addition to the above-described CPU, memory, and the like.

入出力インターフェイス１７は、表示装置や入力装置といったデバイスのインターフェイスである。表示装置は、例えば、液晶ディスプレイ等である。入力装置は、例えば、キーボードやマウス等のユーザ操作を受け付ける装置や、ＵＳＢ（Universal Serial Bus）メモリ等の外部記憶装置から情報を入力する装置である。ユーザ（例えば、クラウドシステムの管理者）は、キーボードやマウス等を用いて、必要な情報を文字認識サーバ２０に入力する。 The input / output interface 17 is an interface of a device such as a display device or an input device. The display device is, for example, a liquid crystal display. The input device is, for example, a device that accepts user operations such as a keyboard and a mouse, and a device that inputs information from an external storage device such as a USB (Universal Serial Bus) memory. A user (for example, a cloud system administrator) inputs necessary information to the character recognition server 20 using a keyboard, a mouse, or the like.

ＮＩＣ１８は、ルータ等の通信装置に接続される通信インターフェイスである。 The NIC 18 is a communication interface connected to a communication device such as a router.

［処理モジュール］ [Processing Modules]
続いて、第１の実施形態に係る文字認識システムを構成する各種装置の処理モジュールについて説明する。 Next, the processing modules of the various devices constituting the character recognition system according to the first embodiment will be described.

［端末］ [Terminal]
図５は、端末１０の処理構成の一例を示す図である。図５を参照すると、端末１０は、無線通信制御部２０１と、撮像部２０２と、画像合成部２０３と、画像検証部２０４と、認識範囲決定部２０５と、を含んで構成される。 FIG. 5 is a diagram illustrating an example of a processing configuration of the terminal 10. Referring to FIG. 5, the terminal 10 includes a wireless communication control unit 201, an imaging unit 202, an image composition unit 203, an image verification unit 204, and a recognition range determination unit 205.

無線通信制御部２０１は、文字認識サーバ２０との間の通信を実現するための手段である。無線通信制御部２０１は、例えば、ＬＴＥ（Long Term Evolution）等のモバイル通信や無線ＬＡＮ（Local Area Network）等の通信方式によりネットワークにアクセスし、文字認識サーバ２０と通信する。 The wireless communication control unit 201 is a means for realizing communication with the character recognition server 20. The wireless communication control unit 201 accesses the network by a communication method such as mobile communication such as LTE (Long Term Evolution) or wireless LAN (Local Area Network), and communicates with the character recognition server 20.

撮像部２０２は、カメラモジュール１３を制御することで、被写体を撮像し画像（画像データ）を取得する手段である。撮像部２０２は、文字認識サーバ２０に文字認識を依頼する画像（以下、依頼画像と表記する）の基礎（ソース）となる画像を取得する。 The imaging unit 202 is means for capturing a subject and acquiring an image (image data) by controlling the camera module 13. The imaging unit 202 acquires an image serving as a basis (source) of an image for requesting character recognition to the character recognition server 20 (hereinafter referred to as a request image).

撮像部２０２は、同一の被写体から複数の基礎画像を取得する。より具体的には、撮像部２０２は、露出条件を変更しつつ、同一の被写体から複数の基礎画像を取得する。つまり、撮像部２０２は、露出条件を変更しながら対象物を連写し、複数の基礎画像を取得する。その際、撮像部２０２は、露出時間やＩＳＯ（International Organization for Standardization）感度等の露出条件を変更しながら同じ対象物を連写する。 The imaging unit 202 acquires a plurality of basic images from the same subject. More specifically, the imaging unit 202 acquires a plurality of basic images from the same subject while changing the exposure condition. That is, the imaging unit 202 continuously captures an object while changing the exposure condition, and acquires a plurality of basic images. At that time, the imaging unit 202 continuously captures the same object while changing exposure conditions such as exposure time and ISO (International Organization for Standardization) sensitivity.

例えば、撮像部２０２は、図６に示すような複数の基礎画像を取得する。なお、撮像部２０２は、複数枚の基礎画像を取得するので、ユーザがシャッターボタンを一度押せば、必要な枚数の基礎画像を取得するように動作する。 For example, the imaging unit 202 acquires a plurality of basic images as illustrated in FIG. 6. Because the imaging unit 202 acquires a plurality of basic images, it operates so as to acquire the required number of basic images when the user presses the shutter button once.

画像合成部２０３は、複数の基礎画像を合成することで、１枚の画像を生成する手段である。より具体的には、画像合成部２０３は、撮像部２０２により取得された複数の基礎画像を合成し、依頼画像の候補となる画像（以下、候補画像と表記する）を生成する。例えば、画像合成部２０３は、ＨＤＲ（High Dynamic Range）合成を実行し、複数枚の基礎画像に係るデータから１枚の画像（候補画像；依頼画像の候補）を生成する。例えば、図６に示す複数の基礎画像を合成すると図７に示すような候補画像が得られる。 The image synthesizing unit 203 is a unit that generates a single image by synthesizing a plurality of basic images. More specifically, the image synthesis unit 203 synthesizes a plurality of basic images acquired by the imaging unit 202, and generates an image that is a candidate for the requested image (hereinafter referred to as a candidate image). For example, the image composition unit 203 performs HDR (High Dynamic Range) composition, and generates one image (candidate image; candidate request image) from data related to a plurality of basic images. For example, when a plurality of basic images shown in FIG. 6 are combined, a candidate image as shown in FIG. 7 is obtained.
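The synthesis step can be illustrated with a deliberately simplified sketch. The code below is not the HDR synthesis of the embodiment; it is a basic exposure-fusion stand-in that assumes grayscale frames stored as nested lists with pixel values in [0, 1]. The function names and the `mid` and `sigma` parameters are hypothetical.

```python
import math


def well_exposedness(value, mid=0.5, sigma=0.2):
    """Weight a pixel in [0, 1] by how close it is to mid-gray (assumed weighting)."""
    return math.exp(-((value - mid) ** 2) / (2 * sigma ** 2))


def fuse_exposures(frames):
    """Fuse same-sized grayscale frames into one image by per-pixel weighted averaging.

    Each frame is a list of rows; each row is a list of floats in [0, 1].
    Pixels that are neither crushed nor blown out receive larger weights.
    """
    height, width = len(frames[0]), len(frames[0][0])
    fused = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [frame[y][x] for frame in frames]
            weights = [well_exposedness(v) for v in values]
            total = sum(weights)  # always > 0 because exp() is positive
            row.append(sum(w * v for w, v in zip(weights, values)) / total)
        fused.append(row)
    return fused
```

For each pixel, the fused value leans toward whichever basic image exposed that pixel best, which is the intuition behind combining bracketed shots into a single candidate image.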

画像検証部２０４は、合成された画像（候補画像）の品質を検証する手段である。具体的には、画像検証部２０４は、候補画像に「手ぶれ」や「ピント外れ」が生じているか否かを検証する。なお、「手ぶれ」や「ピント外れ」の検出には種々の技術を用いることができる。例えば、画像検証部２０４は、所謂、画像復元式と称される方法を用いて、候補画像に「手ぶれ」が生じているか検証できる。また、画像検証部２０４は、特許文献１に開示されるような合焦状態判定方法を用いて候補画像にピント外れが生じているか否かを検証できる。なお、ピント外れの検出方法に関しては、参考文献１（J.L. Pech-Paceco & G. Cristobal Imaging & Vision Dept. "Diatom autofocusing in brightfield microscopy; a comparative study"）の３．３節に記載された技術を用いることもできる。 The image verification unit 204 is a means for verifying the quality of the synthesized image (candidate image). Specifically, the image verification unit 204 verifies whether "camera shake" or "out of focus" has occurred in the candidate image. Various techniques can be used to detect camera shake and an out-of-focus state. For example, the image verification unit 204 can verify whether camera shake has occurred in a candidate image using a so-called image restoration method. The image verification unit 204 can also verify whether the candidate image is out of focus using an in-focus state determination method such as the one disclosed in Patent Document 1. For out-of-focus detection, the technique described in Section 3.3 of Reference 1 (J.L. Pech-Paceco & G. Cristobal, Imaging & Vision Dept., "Diatom autofocusing in brightfield microscopy; a comparative study") can also be used.

画像検証部２０４は、例えば、候補画像に「手ぶれ」も「ピント外れ」も生じていない場合に、当該候補画像の品質は高いと判定する。換言するならば、画像検証部２０４は、候補画像に「手ぶれ」及び「ピント外れ」の少なくともいずれかが生じている場合には、当該候補画像の品質は低いと判定する。 For example, the image verification unit 204 determines that the quality of the candidate image is high when neither “shake” nor “out of focus” occurs in the candidate image. In other words, the image verification unit 204 determines that the quality of the candidate image is low when at least one of “shake” and “out of focus” occurs in the candidate image.
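The quality decision described above can be sketched as follows. The focus check uses the variance of a Laplacian response, which is one widely used sharpness measure; it is an assumption here and is not claimed to be the method of Patent Document 1 or of the cited section of Reference 1. The threshold value and all function names are hypothetical, and camera-shake detection is left to the caller.

```python
def laplacian_variance(image):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    `image` is a list of rows of numeric gray levels and must be at least 3x3.
    Sharp images have strong edges and therefore a large variance.
    """
    responses = []
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            responses.append(
                image[y - 1][x] + image[y + 1][x]
                + image[y][x - 1] + image[y][x + 1]
                - 4 * image[y][x]
            )
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def is_out_of_focus(image, threshold=100.0):
    """Treat the image as out of focus when its sharpness falls below a threshold."""
    return laplacian_variance(image) < threshold


def is_high_quality(camera_shake, out_of_focus):
    """Quality is high only when neither camera shake nor defocus is present."""
    return not (camera_shake or out_of_focus)
```

A flat, featureless image yields a variance of zero and is rejected, while an image with strong local contrast passes; in practice the threshold would have to be tuned for the sensor and the resolution in use.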

画像検証部２０４は、候補画像の品質に関する検証をユーザに依頼してもよい。例えば、画像検証部２０４は、候補画像と共にその品質確認を要求するメッセージを液晶パネル等に表示し、ユーザから当該候補画像を依頼画像に設定するか否かに関する指示を入力する。具体的には、画像検証部２０４は、液晶パネル等に図８に示すような表示を行い、ユーザからの指示を入力する。 The image verification unit 204 may request the user to verify the quality of the candidate image. For example, the image verification unit 204 displays a message requesting quality confirmation together with the candidate image on a liquid crystal panel or the like, and inputs an instruction regarding whether or not to set the candidate image as a requested image from the user. Specifically, the image verification unit 204 performs display as shown in FIG. 8 on a liquid crystal panel or the like, and inputs an instruction from the user.

画像検証部２０４は、候補画像の品質に問題があれば（品質が低ければ）、画像を再撮影する旨をユーザに通知し、撮像部２０２に対して対象物の再撮影を指示する。つまり、撮像部２０２は、複数の基礎画像を合成することで生成された候補画像の品質が予め定めた基準（手ぶれ又はピント外れがあり）よりも低い場合には、被写体からの画像を再取得する。候補画像の品質に問題がなければ、画像検証部２０４は、候補画像を認識範囲決定部２０５に引き渡す。 If there is a problem with the quality of the candidate image (that is, if the quality is low), the image verification unit 204 notifies the user that the image will be retaken and instructs the imaging unit 202 to re-photograph the object. In other words, when the quality of the candidate image generated by synthesizing the plurality of basic images is lower than a predetermined standard (camera shake or an out-of-focus state is present), the imaging unit 202 re-acquires images of the subject. If there is no problem with the quality of the candidate image, the image verification unit 204 passes the candidate image to the recognition range determination unit 205.

認識範囲決定部２０５は、候補画像の領域のなかから文字認識サーバ２０に文字認識を行わせる文字認識範囲を決定する手段である。具体的には、認識範囲決定部２０５は、候補画像を液晶パネル等に表示すると共に、ユーザが表示された画像上で所定の範囲を入力するための画面を表示し、入力指示された所定範囲を文字認識範囲として決定する。即ち、認識範囲決定部２０５は、候補画像から文字認識を行う範囲を抽出して、文字認識範囲を決定する手段である。 The recognition range determination unit 205 is a means for determining, from within the area of the candidate image, the character recognition range on which the character recognition server 20 is to perform character recognition. Specifically, the recognition range determination unit 205 displays the candidate image on the liquid crystal panel or the like together with a screen on which the user inputs a predetermined range on the displayed image, and determines the range input by the user as the character recognition range. That is, the recognition range determination unit 205 is a means for extracting, from the candidate image, the range on which character recognition is to be performed and thereby determining the character recognition range.

例えば、認識範囲決定部２０５は、候補画像を液晶パネル等に表示しつつ、文字認識範囲を入力するような操作を受け付けるユーザインターフェイスを提供する。換言するならば、認識範囲決定部２０５は、候補画像を液晶パネル等に表示し、ユーザによる画像のトリミングを実行するユーザインターフェイスを提供する。 For example, the recognition range determination unit 205 provides a user interface that receives an operation for inputting a character recognition range while displaying a candidate image on a liquid crystal panel or the like. In other words, the recognition range determination unit 205 provides a user interface that displays candidate images on a liquid crystal panel or the like and performs image trimming by the user.

認識範囲決定部２０５により提供されるユーザインターフェイスには種々の形態が考えられる。 Various forms of the user interface provided by the recognition range determination unit 205 are conceivable.

例えば、図９（ａ）に示すように、認識範囲決定部２０５は、候補画像の全体と文字認識範囲入力に係るメッセージを表示する。図９（ａ）の表示に接したユーザは、ＯＣＲにて文字認識を行わせたい領域の左上に触れ、その後、右下に触れる。例えば、ユーザは、図９（ｂ）に示すような押下点２１及び押下点２２に触れたものとする。ユーザが２点に触れると、認識範囲決定部２０５は、ユーザから入力された２点を頂点とする矩形形状に囲まれる領域を文字認識範囲とする。図９（ｂ）の例では、文字「ＡＢＣ」を含む点線で囲まれた範囲が文字認識範囲に設定される。 For example, as shown in FIG. 9(a), the recognition range determination unit 205 displays the entire candidate image together with a message prompting input of the character recognition range. A user presented with the display of FIG. 9(a) touches the upper left corner of the area to be recognized by OCR and then touches its lower right corner. Suppose, for example, that the user touches the pressing point 21 and the pressing point 22 shown in FIG. 9(b). When the user has touched the two points, the recognition range determination unit 205 sets the rectangular area having those two points as opposite vertices as the character recognition range. In the example of FIG. 9(b), the range enclosed by the dotted line, which contains the characters "ABC", is set as the character recognition range.
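The two-point selection described above can be sketched as follows, assuming the image is a list of pixel rows and touch points are (x, y) pixel coordinates. The function names are hypothetical. As a design choice beyond the text, the sketch sorts the coordinates so that the two corners may be touched in either order, and it clamps them to the image bounds.

```python
def rect_from_two_points(p1, p2, width, height):
    """Return (left, top, right, bottom) for the rectangle spanned by two touch points.

    Coordinates are inclusive and clamped to the image bounds.
    """
    left, right = sorted((p1[0], p2[0]))
    top, bottom = sorted((p1[1], p2[1]))
    left, top = max(0, left), max(0, top)
    right, bottom = min(width - 1, right), min(height - 1, bottom)
    return left, top, right, bottom


def crop(image, rect):
    """Cut the inclusive rectangle out of an image stored as a list of rows."""
    left, top, right, bottom = rect
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```

Only the cropped rows and columns would then need to be sent for recognition, which is what keeps the amount of data handled by the character recognition device small.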

上記インターフェイスの他にも、ユーザによる一筆書きにより囲まれる領域を文字認識範囲とすることもできる。例えば、図１０（ａ）に示すように、認識範囲決定部２０５は、候補画像の全体と文字認識範囲入力に係るメッセージを表示する。図１０（ａ）の表示に接したユーザは、文字認識させたい範囲を指で囲うようにタッチパネルを操作する。例えば、図１０（ｂ）に示すように、文字「ＡＢＣ」を含む領域の左上から右上、右下、左下を経由して左上にユーザの指による軌跡が描かれる場合には、点線２３で囲まれた範囲が文字認識範囲に設定される。 Besides the above interface, an area enclosed by a single continuous stroke drawn by the user can also be used as the character recognition range. For example, as shown in FIG. 10(a), the recognition range determination unit 205 displays the entire candidate image together with a message prompting input of the character recognition range. A user presented with the display of FIG. 10(a) operates the touch panel so as to trace around the range to be recognized with a finger. For example, as shown in FIG. 10(b), when the user's finger traces a path from the upper left of the area containing the characters "ABC" through the upper right, lower right, and lower left and back to the upper left, the range enclosed by the dotted line 23 is set as the character recognition range.
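A single-stroke selection can be handled in a similar way. The sketch below assumes the touch panel reports the stroke as a list of (x, y) points and treats it as a closed polygon; the bounding box gives a rectangle to crop, and a ray-casting test tells whether a given pixel lies inside the traced outline. Both helper names are hypothetical, and the text does not specify how the enclosed area is actually computed.

```python
def stroke_bounding_box(stroke):
    """Return (left, top, right, bottom) of the smallest rectangle containing the stroke."""
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    return min(xs), min(ys), max(xs), max(ys)


def inside_stroke(point, stroke):
    """Ray-casting point-in-polygon test; the stroke is closed back to its first point."""
    x, y = point
    inside = False
    n = len(stroke)
    for i in range(n):
        x1, y1 = stroke[i]
        x2, y2 = stroke[(i + 1) % n]
        # Count edges that a horizontal ray starting at the point crosses.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

Cropping to the bounding box and, if desired, blanking pixels for which `inside_stroke` is false would yield an image that contains only the region the user traced.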

認識範囲決定部２０５は、ユーザにより指定された範囲を文字認識範囲と定め、当該範囲を候補画像から切り出す。認識範囲決定部２０５は、切り出した文字認識範囲に係る画像を、無線通信制御部２０１（出力部）を介して文字認識サーバ２０に送信する。なお、候補画像から切り出した文字認識範囲に係る画像が、上記依頼画像となる。認識範囲決定部２０５は、自装置（端末１０）の識別子（例えば、ＭＡＣ（Media Access Control）アドレス）を付して依頼画像に係るデータを文字認識サーバ２０に送信する。 The recognition range determination unit 205 determines a range designated by the user as a character recognition range, and cuts out the range from the candidate image. The recognition range determination unit 205 transmits an image related to the extracted character recognition range to the character recognition server 20 via the wireless communication control unit 201 (output unit). An image related to the character recognition range cut out from the candidate image is the request image. The recognition range determination unit 205 transmits the data related to the requested image to the character recognition server 20 with the identifier (for example, MAC (Media Access Control) address) of the own device (terminal 10).
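The text says only that a terminal identifier such as a MAC address is attached to the request image data; it does not define a message format. The following is a minimal sketch of one possible encoding, assuming a JSON envelope with Base64-encoded image bytes. The field names `terminal_id` and `image` are hypothetical.

```python
import base64
import json


def build_request(terminal_id, image_bytes):
    """Package the cropped request image together with the sender's identifier."""
    return json.dumps({
        "terminal_id": terminal_id,  # e.g. the terminal's MAC address
        "image": base64.b64encode(image_bytes).decode("ascii"),
    })


def parse_request(message):
    """Recover (terminal_id, image_bytes) on the receiving side."""
    payload = json.loads(message)
    return payload["terminal_id"], base64.b64decode(payload["image"])
```

Carrying the identifier with every request is what later allows the server to sort incoming images by terminal and to send each result back to the correct sender.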

［文字認識サーバ］ [Character Recognition Server]
図１１は、文字認識サーバ２０の処理構成の一例を示す図である。図１１を参照すると、文字認識サーバ２０は、通信制御部３０１と、画像管理部３０２と、文字認識制御部３０３と、文字認識部３０４と、を備える。 FIG. 11 is a diagram illustrating an example of a processing configuration of the character recognition server 20. Referring to FIG. 11, the character recognition server 20 includes a communication control unit 301, an image management unit 302, a character recognition control unit 303, and a character recognition unit 304.

通信制御部３０１は、端末１０との間の通信を制御する手段である。通信制御部３０１は、端末１０から依頼画像に係るデータを取得すると、当該画像データを画像管理部３０２に引き渡す。 The communication control unit 301 is means for controlling communication with the terminal 10. When the communication control unit 301 acquires data related to the requested image from the terminal 10, the communication control unit 301 delivers the image data to the image management unit 302.

画像管理部３０２は、端末１０から受信する依頼画像を管理する手段である。具体的には、画像管理部３０２は、端末１０から画像データを受信すると、当該受信した画像データを受信端末ごとに区分して記憶媒体に格納する。 The image management unit 302 is a means for managing request images received from the terminal 10. Specifically, when receiving image data from the terminal 10, the image management unit 302 classifies the received image data for each receiving terminal and stores it in a storage medium.

文字認識制御部３０３は、上記記憶媒体に格納された画像データによる文字認識を文字認識部３０４に行わせる手段である。具体的には、文字認識制御部３０３は、上記記憶媒体に格納された画像データを格納された順に読み出し、読み出したデータを文字認識部３０４に提供する。また、文字認識制御部３０３は、文字認識部３０４から出力される結果（認識された文字列）を、文字認識した依頼画像の送信元である端末１０に送信する。 The character recognition control unit 303 is a unit that causes the character recognition unit 304 to perform character recognition based on image data stored in the storage medium. Specifically, the character recognition control unit 303 reads the image data stored in the storage medium in the order in which it is stored, and provides the read data to the character recognition unit 304. Further, the character recognition control unit 303 transmits the result (recognized character string) output from the character recognition unit 304 to the terminal 10 that is the transmission source of the request image that has been character-recognized.

The character recognition unit 304 is the execution engine of the OCR function and executes the processing needed for character recognition, such as image conversion and pattern matching. The character recognition unit 304 outputs the character recognition result to the character recognition control unit 303.

[System Operation]
Next, the operation of the character recognition system according to the first embodiment will be described with reference to FIG. 12. FIG. 12 is a sequence diagram illustrating an example of the operation of the character recognition system according to the first embodiment.

In step S01, the terminal 10 photographs a subject in response to a user operation. In doing so, the terminal 10 acquires a plurality of basic images by continuous shooting while changing the exposure conditions.

In step S02, the terminal 10 combines the plurality of basic images to generate one candidate image.
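The source does not state how the basic images are combined in step S02. As one possible illustration only, the following sketch averages the images pixel by pixel; the per-pixel mean, the function name `composite_mean`, and the representation of a grayscale image as a list of rows are all assumptions.

```python
from typing import List, Sequence

Image = List[List[float]]  # grayscale image as rows of pixel values (assumed format)


def composite_mean(basic_images: Sequence[Image]) -> Image:
    """Combine same-sized basic images into one image by per-pixel averaging."""
    if not basic_images:
        raise ValueError("at least one basic image is required")
    height, width = len(basic_images[0]), len(basic_images[0][0])
    for img in basic_images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all basic images must have the same size")
    count = len(basic_images)
    return [
        [sum(img[y][x] for img in basic_images) / count for x in range(width)]
        for y in range(height)
    ]
```

An actual terminal would more likely use an exposure-fusion or HDR method that weights well-exposed pixels; the mean is used here only because it is the simplest way to show several exposures contributing to one candidate image.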

In step S03, the terminal 10 verifies the quality of the candidate image. Specifically, the terminal 10 determines whether camera shake, defocus, or a similar defect has occurred in the candidate image.
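The source does not specify how camera shake or defocus is detected. One commonly used heuristic, shown here purely as an assumption, is the variance of a Laplacian filter response: a blurred image has weak edges and therefore a low variance. The function names and the `min_sharpness` threshold are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values (assumed format)


def laplacian_variance(image: Image) -> float:
    """Return the variance of the 4-neighbour Laplacian over interior pixels."""
    height, width = len(image), len(image[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3")
    responses = [
        image[y - 1][x] + image[y + 1][x] + image[y][x - 1] + image[y][x + 1]
        - 4 * image[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def is_acceptable(image: Image, min_sharpness: float = 100.0) -> bool:
    """Treat the image as sharp enough when the Laplacian variance is high.

    The threshold is an illustrative value and would need tuning per device.
    """
    return laplacian_variance(image) >= min_sharpness
```

A perfectly flat image yields a variance of zero and is rejected, while an image with strong edges passes.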

If the candidate image is of low quality (step S04, N branch), the terminal 10 notifies the user that the object should be photographed again (step S05) and repeats the processing from step S01. If the candidate image is of high quality (step S04, Y branch), the terminal 10 determines the character recognition range (step S06). Specifically, the terminal 10 displays the interface screen shown in FIG. 9 or FIG. 10 and determines the character recognition range according to the user's operation.

The terminal 10 cuts the character recognition range designated by the user out of the candidate image and creates the data for the request image. The terminal 10 transmits the data for the request image to the character recognition server 20 (step S07). In other words, the terminal 10 requests the cloud system to perform character recognition on the transmitted image.

The character recognition server 20 executes character recognition on the received image (step S08).

The character recognition server 20 transmits the recognition result (the recognized characters) to the terminal 10 (step S09).
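Steps S01 to S09 can be summarized, from the terminal's point of view, in the following Python sketch. Every callable passed to `run_recognition` is a hypothetical placeholder for a unit of the terminal 10 or for the server round trip, and the `max_attempts` limit is an added assumption; the sequence in FIG. 12 simply repeats from step S01.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right and bottom are exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut the character recognition range out of the candidate image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def run_recognition(
    capture: Callable[[], List[Image]],
    composite: Callable[[List[Image]], Image],
    quality_ok: Callable[[Image], bool],
    notify_retry: Callable[[], None],
    choose_range: Callable[[Image], Box],
    request_ocr: Callable[[Image], str],
    max_attempts: int = 3,
) -> Optional[str]:
    """Run the capture, verify, select, and request sequence; None if all attempts fail."""
    for _ in range(max_attempts):
        basic_images = capture()                # S01: bracketed continuous shooting
        candidate = composite(basic_images)     # S02: one candidate image
        if not quality_ok(candidate):           # S03, S04: quality check
            notify_retry()                      # S05: ask the user to shoot again
            continue
        box = choose_range(candidate)           # S06: character recognition range
        request_image = crop(candidate, box)    # cut out the request image
        return request_ocr(request_image)       # S07 to S09: send and receive the result
    return None
```

Because only the cropped request image is passed to `request_ocr`, the amount of data sent to the server shrinks with the selected range.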

As described above, the terminal 10 according to the first embodiment acquires a plurality of basic images and combines them to generate a high-quality candidate image. The terminal 10 then verifies whether the quality of the candidate image is acceptable and provides (displays) an acceptable candidate image to the user. The user, in turn, determines the range within the candidate image on which character recognition is actually wanted. As a result, the character recognition server 20 no longer performs character recognition on areas that are of no use to the user. The character recognition server 20 can therefore achieve both a fast response and high recognition accuracy.

[Second Embodiment]
Next, a second embodiment will be described in detail with reference to the drawings.

The second embodiment describes a case in which the terminal 10 automatically determines the character recognition range within the candidate image.

The second embodiment describes a case in which the terminal 10 automatically detects a range containing a character string added to a standard-format document or the like and uses the detected range as the request image. In the second embodiment, the system configuration and the hardware and processing configurations of the terminal 10 and other devices can be the same as those described in the first embodiment, so the description corresponding to FIG. 2 and the like is omitted.

The recognition range determination unit 205 according to the second embodiment compares a template image of the standard-format document with the image photographed by the user (the candidate image) and sets the area in which the two images differ as the character recognition range. For example, the image shown in FIG. 13(a) is the template image, and the image shown in FIG. 13(b) is the candidate image (an image with no quality problem).

The recognition range determination unit 205 calculates the difference between the pixel values at corresponding positions (coordinates) in the two images. At points of the candidate image that are unchanged from the template image, the difference value is small; at points that have changed, the difference value is large. The recognition range determination unit 205 sets an area containing many points whose difference value exceeds a predetermined threshold as the character recognition range. In the example of FIG. 13, the area 31 in which a 12-digit number has been written is set as the character recognition range.
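The pixel-difference comparison can be sketched as follows. The source says only that an area containing many above-threshold points becomes the character recognition range; taking the bounding box of all such points, and the `threshold` and `min_points` values, are simplifying assumptions for illustration.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right and bottom are exclusive


def find_changed_region(
    template: Image,
    candidate: Image,
    threshold: int = 50,
    min_points: int = 1,
) -> Optional[Box]:
    """Return the bounding box of pixels that differ from the template.

    Returns None when fewer than `min_points` pixels exceed the threshold.
    Both images must already have the same size.
    """
    if len(template) != len(candidate) or any(
        len(t_row) != len(c_row) for t_row, c_row in zip(template, candidate)
    ):
        raise ValueError("images must have the same size")
    points = [
        (x, y)
        for y, (t_row, c_row) in enumerate(zip(template, candidate))
        for x, (t, c) in enumerate(zip(t_row, c_row))
        if abs(t - c) > threshold
    ]
    if len(points) < min_points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)
```

A single bounding box suits a form with one handwritten field, as in the FIG. 13 example; a form with several fields would need the changed points clustered into separate regions.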

In practice, the template image and the candidate image do not necessarily have the same size (number of dots). To calculate the character recognition range accurately, the recognition range determination unit 205 therefore preferably performs a geometric transformation or the like that converts the candidate image to the size of the template image before executing the processing for extracting the character recognition range.
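As a minimal illustration of bringing the candidate image to the template's size, the sketch below uses nearest-neighbour resampling. This covers only a change of scale; the "geometric transformation or the like" in the text could also include rotation or perspective correction, which are not shown. The function name `resize_nearest` is an assumption.

```python
from typing import List

Image = List[List[int]]


def resize_nearest(image: Image, new_width: int, new_height: int) -> Image:
    """Resize an image to (new_width, new_height) by nearest-neighbour sampling."""
    if new_width <= 0 or new_height <= 0:
        raise ValueError("target size must be positive")
    old_height, old_width = len(image), len(image[0])
    return [
        [
            # Map each target pixel back to the source pixel that covers it.
            image[y * old_height // new_height][x * old_width // new_width]
            for x in range(new_width)
        ]
        for y in range(new_height)
    ]
```

After this step the candidate and template images have the same dimensions, so pixel values can be compared at matching coordinates.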

To tolerate subtle differences between the two images, a plurality of pixels may be gathered into one group, a pixel value calculated for each group, and the groups compared between the two images. For example, four pixels form one group, and the average of the four pixel values is set as the representative value of the group (the group's pixel value). By using group pixel values calculated in the same way from both images to determine (extract) the character recognition range, the recognition range determination unit 205 can absorb subtle differences between the two images. That is, the recognition range determination unit 205 lowers the resolution of the two images being compared and compares them coarsely, thereby absorbing subtle differences between the images.
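The grouping described above, in which four pixels are averaged into one representative value, can be sketched as 2x2 block averaging. The function name `block_average` and the requirement that the image dimensions be multiples of the block size are assumptions made to keep the example short.

```python
from typing import List

Image = List[List[float]]


def block_average(image: List[List[int]], block: int = 2) -> Image:
    """Replace each block x block group of pixels with the mean of its values.

    With block=2, each group holds four pixels, as in the example in the text.
    """
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("image dimensions must be multiples of the block size")
    reduced: Image = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            total = sum(
                image[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            )
            row.append(total / (block * block))
        reduced.append(row)
    return reduced
```

Applying the same reduction to both the template image and the candidate image before taking differences makes the comparison less sensitive to one-pixel misalignment and sensor noise.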

As described above, in the second embodiment the terminal 10 automatically determines the character recognition range. The user therefore no longer needs to determine the character recognition range, which improves convenience.

The configuration of the character recognition system described in the above embodiments is an example and is not intended to limit the configuration of the system. For example, the target is not limited to images from a terminal 10 such as a smartphone; images that a stationary computer acquires from a scanner may also be used. In that case, a candidate image acquired from a scanner is not expected to suffer from problems such as camera shake, so the processing of the "image composition unit" and the "image verification unit" may be omitted as necessary. That is, the image composition processing and the image verification processing described in the above embodiments may be omitted.

Alternatively, part of the processing executed by the terminal 10 may be executed by an external server or the like. For example, the processing for combining a plurality of images may be executed by an external server such as the character recognition server 20.

In the above embodiments, the image verification unit 204 determines the quality of the candidate image, and when a low-quality candidate image is obtained, the imaging unit 202 acquires a new set of basic images. However, if the candidate image is corrected before the image verification unit 204 requests the imaging unit 202 to reacquire the image, and the correction yields a candidate image of sufficiently high quality, the image verification unit 204 need not request reacquisition. In this case, the terminal 10 includes an image correction unit 206 (see FIG. 14).

In the above embodiments, the terminal 10 checks the quality of the candidate image, but the user may instead be asked to check the quality of the candidate image, as shown in FIG. 8. In that case, the quality of the candidate image may be checked on the user interface screen for determining the character recognition range (FIG. 9 or FIG. 10). That is, a "reacquire" button may be provided on the screen of FIG. 9 or the like, and the image of the subject may be reacquired when the button is pressed.

In the above embodiments, the imaging unit 202 acquires a plurality of basic images while changing the imaging conditions from the outset, but it may initially acquire a single basic image. Alternatively, when reacquiring the image of the subject, the imaging unit 202 may set conditions different from the previous imaging conditions and acquire a plurality of basic images of the subject.

In the flowcharts used in the above description, a plurality of steps (processes) are listed in order, but the order in which the steps are executed in each embodiment is not limited to the order listed. In each embodiment, the order of the illustrated steps can be changed to the extent that the change causes no problem in substance, for example by executing processes in parallel. The matters described in the above embodiments can also be combined to the extent that they do not conflict.

Part or all of the above embodiments can also be described as in the following appendices, but are not limited to them.
[Appendix 1]
The terminal according to the first aspect described above.
[Appendix 2]
The terminal according to appendix 1, wherein the recognition range determination unit displays the acquired image together with a screen for a user to input a predetermined range on the displayed image, and determines the input range as the character recognition range.
[Appendix 3]
The terminal according to appendix 1, wherein the recognition range determination unit compares the acquired image with a predetermined template image and determines an area in which the acquired image and the template image differ as the character recognition range.
[Appendix 4]
The terminal according to any one of appendices 1 to 3, wherein the imaging unit acquires a plurality of images of the same subject, the terminal further comprises an image generation unit that generates one image by combining the plurality of images, and the recognition range determination unit determines the character recognition range from the combined image.
[Appendix 5]
The terminal according to appendix 4, wherein the imaging unit acquires the plurality of images of the same subject while changing an exposure condition.
[Appendix 6]
The terminal according to appendix 4 or 5, further comprising an image verification unit that verifies the quality of the combined image, wherein the imaging unit reacquires an image of the subject when the quality of the combined image is lower than a predetermined reference.
[Appendix 7]
The character recognition system according to the second aspect described above.
[Appendix 8]
The character recognition system according to appendix 7, wherein the recognition range determination unit displays the acquired image together with a screen for a user to input a predetermined range on the displayed image, and determines the input range as the character recognition range.
[Appendix 9]
The character recognition system according to appendix 7, wherein the recognition range determination unit compares the acquired image with a predetermined template image and determines an area in which the acquired image and the template image differ as the character recognition range.
[Appendix 10]
The character recognition system according to any one of appendices 7 to 9, wherein the imaging unit acquires a plurality of images of the same subject, the terminal further includes an image generation unit that generates one image by combining the plurality of images, and the recognition range determination unit determines the character recognition range from the combined image.
[Appendix 11]
The character recognition system according to appendix 10, wherein the imaging unit acquires the plurality of images of the same subject while changing an exposure condition.
[Appendix 12]
The character recognition system according to appendix 10 or 11, wherein the terminal further includes an image verification unit that verifies the quality of the combined image, and the imaging unit reacquires an image of the subject when the quality of the combined image is lower than a predetermined reference.
[Appendix 13]
The terminal control method according to the third aspect described above.
[Appendix 14]
The program according to the fourth aspect described above.
Note that the forms of appendix 13 and appendix 14 can be expanded into the forms of appendices 2 to 6 in the same way as the form of appendix 1.

The disclosures of the patent documents and other references cited above are incorporated herein by reference. Within the framework of the entire disclosure of the present invention (including the claims), the embodiments and examples can be changed and adjusted on the basis of its basic technical concept. Various combinations and selections of the disclosed elements (including each element of each claim, each element of each embodiment or example, and each element of each drawing) are also possible within the framework of the entire disclosure. That is, the present invention naturally includes the various variations and modifications that a person skilled in the art could make in accordance with the entire disclosure, including the claims, and the technical concept. In particular, for any numerical range stated in this document, every numerical value and sub-range within that range should be construed as specifically stated even where not explicitly mentioned.

10, 100 Terminal
11 CPU
12 Memory
13 Camera module
14 Liquid crystal panel and touch panel
15 Wireless signal transmission/reception circuit
16 Antenna
17 Input/output interface
18 NIC
20 Character recognition server
21, 22 Press point
23 Dotted line
31 Area
101, 202 Imaging unit
102, 205 Recognition range determination unit
103 Output unit
201 Wireless communication control unit
203 Image composition unit
204 Image verification unit
206 Image correction unit
301 Communication control unit
302 Image management unit
303 Character recognition control unit
304 Character recognition unit

Claims

1. A terminal comprising:
an imaging unit that captures an image of a subject and acquires the image;
a recognition range determination unit that determines, from the area of the acquired image, a character recognition range on which a character recognition device is to perform character recognition; and
an output unit that outputs data of the determined character recognition range to the character recognition device.

2. The terminal according to claim 1, wherein the recognition range determination unit displays the acquired image together with a screen for a user to input a predetermined range on the displayed image, and determines the input range as the character recognition range.

3. The terminal according to claim 1, wherein the recognition range determination unit compares the acquired image with a predetermined template image and determines an area in which the acquired image and the template image differ as the character recognition range.

4. The terminal according to any one of claims 1 to 3, wherein
the imaging unit acquires a plurality of images of the same subject,
the terminal further comprises an image generation unit that generates one image by combining the plurality of images, and
the recognition range determination unit determines the character recognition range from the combined image.

5. The terminal according to claim 4, wherein the imaging unit acquires the plurality of images of the same subject while changing an exposure condition.

6. The terminal according to claim 4 or 5, further comprising an image verification unit that verifies the quality of the combined image, wherein
the imaging unit reacquires an image of the subject when the quality of the combined image is lower than a predetermined reference.

7. A character recognition system comprising:
a character recognition device; and
a terminal that requests character recognition from the character recognition device,
wherein the terminal includes:
an imaging unit that captures an image of a subject and acquires the image;
a recognition range determination unit that determines, from the area of the acquired image, a character recognition range on which the character recognition device is to perform character recognition; and
an output unit that outputs data of the determined character recognition range to the character recognition device.

8. The character recognition system according to claim 7, wherein the recognition range determination unit displays the acquired image together with a screen for a user to input a predetermined range on the displayed image, and determines the input range as the character recognition range.

9. The character recognition system according to claim 7, wherein the recognition range determination unit compares the acquired image with a predetermined template image and determines an area in which the acquired image and the template image differ as the character recognition range.

10. The character recognition system according to any one of claims 7 to 9, wherein
the imaging unit acquires a plurality of images of the same subject,
the terminal further includes an image generation unit that generates one image by combining the plurality of images, and
the recognition range determination unit determines the character recognition range from the combined image.

11. The character recognition system according to claim 10, wherein the imaging unit acquires the plurality of images of the same subject while changing an exposure condition.

12. The character recognition system according to claim 10 or 11, wherein
the terminal further includes an image verification unit that verifies the quality of the combined image, and
the imaging unit reacquires an image of the subject when the quality of the combined image is lower than a predetermined reference.

13. A terminal control method comprising:
capturing an image of a subject and acquiring the image;
determining, from the area of the acquired image, a character recognition range on which a character recognition device is to perform character recognition; and
outputting data of the determined character recognition range to the character recognition device.

14. A program that causes a computer to execute:
a process of capturing an image of a subject and acquiring the image;
a process of determining, from the area of the acquired image, a character recognition range on which a character recognition device is to perform character recognition; and
a process of outputting data of the determined character recognition range to the character recognition device.