
WO2019026117A1 - Security system - Google Patents


Info

Publication number
WO2019026117A1
WO2019026117A1 PCT/JP2017/027660
Authority
WO
WIPO (PCT)
Prior art keywords
person
unit
camera
dialogue
security system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2017/027660
Other languages
French (fr)
Japanese (ja)
Inventor
菊池正和
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Secual
Secual Inc
Original Assignee
Secual
Secual Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Secual, Secual Inc filed Critical Secual
Priority to PCT/JP2017/027660 priority Critical patent/WO2019026117A1/en
Publication of WO2019026117A1 publication Critical patent/WO2019026117A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 9/00 Arrangements for interconnection not involving centralised switching
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B 25/00 Alarm systems in which the location of the alarm condition is signalled to a central station, e.g. fire or police telegraphic systems

Definitions

  • the present invention relates to a security system having at least a camera for imaging a person.
  • a security system that transmits an image captured by a camera installed in a building or the like to an administrator's server is disclosed in JP-A-2009-98814, JP-A-2013-30116, and JP-A-2015-2506.
  • it is also known to distribute images captured by a camera to portable devices such as smartphones via a communication line, to detect a person reflected in the images based on changes in the luminance of the images, and to start recording the captured images when such a person is detected.
  • the present invention has been made in view of such problems, and an object of the present invention is to provide a security system capable of providing an appropriate service tailored to the person captured by a camera.
  • the present invention relates to a security system having at least a camera for imaging a person.
  • the camera includes an imaging unit that images the periphery of the camera, a person detection unit that detects the person based on an image captured by the imaging unit, a feature portion detection unit that detects a feature portion of the person from the image of the person detected by the person detection unit, and a person identification unit that identifies the person based on the image of the feature portion detected by the feature portion detection unit.
  • the person detection unit may detect the person by detecting, in the image captured by the camera, a moving object reflected in the image, a change in the noise of the image, or a change in the brightness of the image. In this way, the person can be easily detected using only the image captured by the imaging unit.
  • as a result, the feature portion detection unit and the person identification unit can each execute their processing promptly (a minimal detection sketch follows this item).
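The publication does not specify an algorithm for the luminance-change detection described above, so the following is only a minimal Python sketch under assumed names and an assumed threshold: consecutive frames are compared, and a large mean luminance change is treated as a person candidate.

```python
import numpy as np

# Illustrative only: detect a person candidate from the luminance change
# between consecutive frames, one of the cues named in the disclosure.
MOTION_THRESHOLD = 12.0  # assumed mean absolute luminance delta (0-255 scale)

def luminance(frame_rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 RGB frame to a luminance (grayscale) image."""
    return frame_rgb @ np.array([0.299, 0.587, 0.114])

def person_candidate_detected(prev_frame: np.ndarray, cur_frame: np.ndarray) -> bool:
    """Return True when the frame-to-frame luminance change is large enough
    to suggest a moving object (for example, a visitor) in view."""
    delta = np.abs(luminance(cur_frame) - luminance(prev_frame))
    return float(delta.mean()) > MOTION_THRESHOLD
```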
  • the camera may further include a first microphone that detects sound around the camera.
  • the person detection unit may detect the person based on the image captured by the imaging unit and the sound detected by the first microphone.
  • the feature portion detection unit may track the feature portion by detecting the feature portion from the image of the person each time the person detection unit detects the person.
  • the camera may further include a person information registration unit in which person information on a predetermined person is registered.
  • the person specifying unit may include a feature amount extraction unit that extracts a feature amount of the feature portion from the image of the feature portion detected by the feature portion detection unit, a similarity determination processing unit that determines the similarity between the feature amount of the feature portion of the person extracted by the feature amount extraction unit and the feature amount of the feature portion of the person indicated by the person information, and an age/sex estimation processing unit that estimates the age and/or sex of the person detected by the person detection unit (a matching sketch follows this item).
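The disclosure does not fix a concrete matching method for the similarity determination processing unit, so the following Python sketch assumes that feature portions have already been reduced to fixed-length feature vectors and uses cosine similarity with an assumed threshold; all names are illustrative.

```python
import numpy as np

SIMILARITY_THRESHOLD = 0.8  # assumed acceptance threshold

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify_person(query: np.ndarray, registry: dict[str, np.ndarray]) -> str | None:
    """Return the registered person whose stored feature vector is most
    similar to the query, or None when no match clears the threshold."""
    best_name, best_score = None, SIMILARITY_THRESHOLD
    for name, stored in registry.items():
        score = cosine_similarity(query, stored)
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```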
  • the characteristic portion may be at least a face or appearance of a person detected by the person detection unit.
  • the appearance includes, for example, the appearance of the front side or the side of the visitor, and the back appearance of a resident coming out of the building.
  • the feature portion may include the clothes of the person, the design, color, or pattern of the clothes, or a mark attached to the clothes.
  • the camera may further include a first response determination unit that determines a response to the person according to the identification result when the person identification unit identifies the person.
  • the security system executes any of the following first to third responses for the person.
  • the first response is to execute a dialogue between the person and the camera; the artificial intelligence function mounted on the camera responds to the person using only the internal processing of the camera.
  • in this case, the camera further has a scenario registration unit in which a plurality of scenarios are registered, a first dialogue engine unit that selects from the scenario registration unit a scenario according to the dialogue, a first speaker that outputs the scenario selected by the first dialogue engine unit as a voice to the outside, and a second microphone that detects the voice of the person.
  • since the first dialogue engine unit, which is a function of the artificial intelligence, is provided, even if the amount of image and voice data increases due to the execution of the dialogue, the first dialogue engine unit can process the data easily and promptly. Furthermore, since the first dialogue engine unit performs dialogue learning through dialogue with the person, various functions of the artificial intelligence, including the dialogue function, can be improved (a scenario-selection sketch follows this item).
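The patent describes scenario selection only abstractly, so this Python sketch shows one plausible shape for it: scenarios are registered under categories, and the engine returns the first scenario matching the identified person's category, with a generic fallback. Category names and phrases are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Scenario:
    category: str      # e.g., "delivery", "first_visit" (assumed labels)
    fixed_phrase: str  # the fixed sentence spoken through the first speaker

SCENARIO_REGISTRY = [
    Scenario("delivery", "Thank you. Please leave the parcel in the post box."),
    Scenario("first_visit", "Hello, may I ask who you are and what your business is?"),
]

def select_scenario(person_category: str) -> Scenario:
    """Return the first registered scenario matching the person's category,
    falling back to a generic greeting when nothing matches."""
    for scenario in SCENARIO_REGISTRY:
        if scenario.category == person_category:
            return scenario
    return Scenario("generic", "Hello. How can I help you?")
```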
  • the camera may further include a related information registration unit in which information on the installation place of the camera, information on the periphery of the installation place, and/or information on a predetermined person is registered as related information. Then, when the first response determination unit determines, based on the identification result, to execute a dialogue with the person specified by the person specifying unit, the first dialogue engine unit may extract the related information from the related information registration unit and select from the scenario registration unit a scenario corresponding to the extracted related information and the dialogue.
  • by using the related information, it is possible to select a more appropriate scenario and improve the accuracy of the interaction with the person.
  • when the camera is installed in a building such as a house, the related information of a visitor may be registered in the related information registration unit, so that an appropriate dialogue can be performed with the visitor.
  • the camera may further include a related information collection unit that collects the related information from the outside and registers the collected related information in the related information registration unit.
  • when the first response determination unit determines execution of a dialogue with the identified person based on the identification result, the related information collected by the related information collection unit at the time when the imaging unit captured the image of the person may be extracted from the related information registration unit, and a scenario corresponding to the extracted related information and the dialogue may be selected from the scenario registration unit. This further improves the accuracy of the dialogue with the person.
  • the first dialogue engine unit may also extract other related information registered in the related information registration unit, select from the scenario registration unit a scenario corresponding to the extracted other related information and the dialogue, and/or notify the outside.
  • the second response is for the camera to notify the holder of an external portable device (for example, a resident of the house where the camera is installed) of the presence of the person (for example, a visitor's visit).
  • the security system further includes a portable device connectable to the camera via a communication line.
  • the camera further includes a first communication unit capable of communicating with at least the portable device via the communication line. Then, when the first response determination unit determines, based on the identification result, to notify the portable device of the presence of the person, the first communication unit transmits the notification to the portable device via the communication line to notify the holder of the presence of the person.
  • thus, the holder of the portable device can receive the notification in real time even when away from the installation location of the camera.
  • the security system can provide appropriate services to the holder.
  • when the portable device receives the notification, the holder can interact with the person.
  • the portable device includes a second communication unit capable of communicating with at least the camera via the communication line, a display unit, an operation unit, a second speaker, and a third microphone.
  • when the second communication unit receives the notification, the display unit displays the notification and/or the second speaker outputs the notification as sound. When the holder operates the operation unit to start the dialogue, the second communication unit transmits to the first communication unit, via the communication line, an instruction signal instructing the start of the dialogue. When the first communication unit receives the instruction signal, it transmits the image of the person captured by the imaging unit and the voice data of the person detected by the second microphone to the second communication unit via the communication line.
  • the display unit displays the image of the person received by the second communication unit, and the second speaker outputs the voice data of the person received by the second communication unit as sound. The second communication unit transmits the voice data of the holder detected by the third microphone to the first communication unit via the communication line, and the first speaker outputs the voice data of the holder received by the first communication unit as sound (a message-flow sketch follows this item).
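The message exchange between the first and second communication units is described only functionally, so the sketch below models it with illustrative Python message types: a notification, a start instruction, and a media packet carrying the visitor's image and voice. None of these type names come from the publication.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class VisitorNotification:
    message: str  # e.g., "A visitor is at the entrance." (assumed wording)

@dataclass
class StartDialogueInstruction:
    start: bool   # True: the holder instructed the dialogue to begin

@dataclass
class MediaPacket:
    image_jpeg: bytes  # visitor image from the imaging unit
    voice_pcm: bytes   # visitor voice from the second microphone

def camera_handle(instruction: StartDialogueInstruction,
                  capture: Callable[[], MediaPacket]) -> MediaPacket | None:
    """When the camera receives a start instruction, it begins sending the
    visitor's image and voice toward the portable device; otherwise nothing
    is transmitted."""
    return capture() if instruction.start else None
```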
  • the first dialogue engine unit can perform dialogue learning through the dialogue with the person and improve various functions of the artificial intelligence including the dialogue function.
  • the camera performs the following processing.
  • the camera further has a data storage unit that records at least the image of the person captured by the imaging unit during the dialogue and the voice data of the person detected by the second microphone.
  • the first dialogue engine unit creates a new scenario based on the contents of the dialogue with the person, and registers the created new scenario in the scenario registration unit.
  • thus, even if the amount of data stored in the data storage unit increases dramatically, the first dialogue engine unit can process the data appropriately and promptly, and the new scenario can be created and registered in the scenario registration unit.
  • since the first dialogue engine unit creates the new scenario by dialogue learning through dialogue with the person, it becomes easy to select an optimal scenario each time a dialogue with a person is executed. That is, since a new scenario is created and registered in the scenario registration unit each time a dialogue with a person takes place, the accuracy of the artificial intelligence function mounted on the camera improves, and a dialogue based on the optimal scenario can be performed (a learning sketch follows this item).
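How a new scenario is derived from a finished dialogue is left open by the disclosure; the sketch below is one assumed heuristic in Python, kept deliberately simple: reuse a camera utterance from the logged exchange as a new fixed phrase for the person's category.

```python
def create_new_scenario(dialogue_log: list[tuple[str, str]], category: str) -> dict:
    """Build a scenario entry from a (speaker, utterance) log. As a purely
    illustrative heuristic, reuse the camera's opening line of the dialogue."""
    camera_lines = [text for speaker, text in dialogue_log if speaker == "camera"]
    phrase = camera_lines[0] if camera_lines else "Hello. How can I help you?"
    return {"category": category, "fixed_phrase": phrase}

def register_scenario(registry: list[dict], scenario: dict) -> None:
    """Append the new scenario so that later selections (for example, by
    recency) can see it; the registry grows with every dialogue."""
    registry.append(scenario)
```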
  • the third response is to inquire of (request from) an external server the detailed identification of the person.
  • in this case, the security system further has a server connectable to the camera via the communication line, a peripheral information database connected to the server in which peripheral information, including information on the installation location of the camera and information on the periphery of the installation location, is registered, and a personal information database connected to the server in which personal information of predetermined persons is registered.
  • when the first response determination unit determines, based on the identification result, to inquire of the server the detailed identification of the person detected by the person detection unit, it queries the server for the detailed identification of the person via the communication line.
  • the server includes a third communication unit that receives the content of the inquiry from the first response determination unit, an inquiry content determination unit that determines the content of the inquiry received by the third communication unit, and a data matching unit that, based on the determination result, acquires the peripheral information from the peripheral information database via the third communication unit, acquires from the personal information database the personal information corresponding to the person indicated by the content of the inquiry, and specifies the person in detail by matching the acquired peripheral information and personal information with the content of the inquiry.
  • the peripheral information database and the personal information database are provided on the server side and hold a far larger amount of data for identifying the person than the camera does. Therefore, by matching the peripheral information and the personal information with the contents of the inquiry, the server can specify the person's image and the like in detail. Further, by installing artificial intelligence in the server and making it function as the inquiry content determination unit and the data matching unit, the server becomes able to provide services tailored to the person. In this case, even if no resident of the house where the camera is installed is present, the server can take an appropriate response to the person. As a result, the convenience of the security system can be improved. Furthermore, the data matching makes it easy to associate the person with content such as the peripheral information and the personal information (a matching sketch follows this item).
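As a rough Python illustration of the data matching unit (the publication gives no concrete algorithm), the inquiry is assumed to carry the visitor's face feature vector; the server matches it against the much larger personal information database, optionally narrowing the candidates with peripheral information such as persons plausible for the installation location.

```python
import numpy as np

def match_inquiry(face_vector: np.ndarray,
                  personal_db: dict[str, np.ndarray],
                  peripheral_hint: set[str] | None = None) -> str | None:
    """Return the best-matching registered person, restricted to candidates
    consistent with the peripheral information when a hint is given."""
    candidates = (
        {n: v for n, v in personal_db.items() if n in peripheral_hint}
        if peripheral_hint else personal_db
    )
    best, best_score = None, 0.8  # assumed acceptance threshold
    for name, vec in candidates.items():
        score = float(face_vector @ vec /
                      (np.linalg.norm(face_vector) * np.linalg.norm(vec)))
        if score > best_score:
            best, best_score = name, score
    return best
```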
  • the security system may further include a scenario database connected to the server and registered with a plurality of scenarios.
  • in this case, the server includes a second response determination unit that determines a response to the person according to the identification result, and a second dialogue engine unit that, when the second response determination unit decides to execute a dialogue with the person based on the identification result, selects a scenario corresponding to the dialogue from the scenario database via the third communication unit. The camera further includes a second speaker for outputting the scenario selected by the second dialogue engine unit as a voice to the outside, and a third microphone for detecting the voice of the person.
  • the camera can easily and accurately interact with the person according to the scenario selected on the server side.
  • since the server has the second dialogue engine unit, which is one function of the artificial intelligence, even if the amount of image and voice data increases due to the execution of the dialogue, the second dialogue engine unit can process the data easily and promptly.
  • the second dialogue engine unit can perform dialogue learning through dialogue with the person to improve various functions of the artificial intelligence including the dialogue function.
  • the security system further has a dialogue history database connected to the server, in which, after the end of the dialogue with the person, at least the image of the person captured by the imaging unit during the dialogue and the voice data of the person detected by the third microphone are recorded as a dialogue history.
  • the second dialog engine unit creates a new scenario using the dialog history recorded in the dialog history database, and registers the created new scenario in the scenario database.
  • thus, even if the amount of data accumulated in the dialogue history database increases dramatically, the second dialogue engine unit can process the data appropriately and promptly and create the new scenario.
  • since the new scenario is created using the dialogue history accumulated in the dialogue history database, the second dialogue engine unit can easily select an optimal, up-to-date scenario when interacting with the person. That is, since the second dialogue engine unit creates the new scenario by dialogue learning through dialogue with the person, it becomes easy to select an optimal scenario each time a dialogue with a person is executed. Also in the third response, a new scenario is created and registered in the scenario database each time a dialogue with a person takes place, so the accuracy of the artificial intelligence function installed in the server improves, and a dialogue based on the optimal scenario can be performed.
  • Figure 2 is a detailed block diagram of the camera of Figure 1;
  • Figure 3 is a detailed block diagram of the mobile device and server of Figure 1;
  • Figures 4 to 6 are flowcharts showing the operation of the security system of Figure 1.
  • the security system 10 is applied to services such as monitoring and security for a visitor (person) who visits a monitoring target such as a building (including a house), land, or office, and for persons who enter and leave the monitoring target (for example, residents entering and leaving the house).
  • the security system 10 includes a camera 12 that is installed, for example, at the entrance of the monitoring target and captures an image of the periphery of the installation location, including visitors and persons entering or leaving the monitoring target, a server 16, and a portable device 18 such as a smartphone that is owned by the owner of the monitoring target (for example, a resident of the house) and connected to the camera 12 and the server 16 via a communication line 14.
  • the person to be imaged by the camera 12 is not limited to an external person who visits the monitoring target; the concept also includes persons related to the monitoring target who do not possess the portable device 18 (for example, family members of the resident who possesses the portable device 18).
  • in FIG. 1, the case where the entrance to a house is monitored, with the camera 12, the server 16, and the portable device 18 able to transmit and receive signals by wireless communication via the communication line 14, will be described.
  • the server 16 is a cloud server of a service provider for monitoring, security, or the like. The cloud 20 containing the server 16 also contains a peripheral information database 22 (hereinafter also referred to as the peripheral information DB 22) in which the installation location of the camera 12 and information on the periphery of the installation location are stored, a personal information database 24 (hereinafter also referred to as the personal information DB 24) in which personal information of a plurality of persons including visitors is stored, a scenario database 26 (hereinafter also referred to as the scenario DB 26) in which the scenarios used in executing dialogues between a visitor and the camera 12 are stored, and a history database 28 (hereinafter also referred to as the history DB 28) serving as a dialogue history database in which the dialogue history between the visitor and the camera 12 is recorded.
  • the peripheral information includes the installation location of the camera 12 and the weather information, traffic information, crime prevention information, and the like around the installation location, and is stored in the peripheral information DB 22 in association with the time at which the information was acquired.
  • the personal information is information that identifies a person such as a visitor or a resident of the house, for example the person's photo, name, address, workplace, workplace address, and the like.
  • the personal information may include the person information, which is information on a predetermined person described later. It should be noted that the server 16 and each of the DBs 22 to 28 complement one another's functions in the cloud 20.
  • the camera 12 includes an imaging unit 12a, a microphone 12b (first and second microphones), a person detection unit 12c, a feature portion detection unit 12d, a person identification unit 12e, a response determination unit 12f (first response determination unit), a dialogue engine unit 12g (first dialogue engine unit), a related information collection unit 12h, a communication unit 12i (first communication unit), a speaker 12j (first speaker), a memory 12k, a database 12l (person information registration unit, scenario registration unit, related information registration unit), and a data storage unit 12m.
  • the person specifying unit 12e includes a feature amount extraction unit 12n, a similarity determination processing unit 12o, and an age/sex estimation processing unit 12p.
  • the camera 12 reads out and executes the program stored in the memory 12k, thereby realizing the functions of the person detection unit 12c, the feature portion detection unit 12d, the person specifying unit 12e (the feature amount extraction unit 12n, the similarity determination processing unit 12o, and the age/sex estimation processing unit 12p), the response determination unit 12f, the dialogue engine unit 12g, and the related information collection unit 12h.
  • the camera 12 is a camera equipped with an artificial intelligence function; the units realized by the execution of the program, from the person detection unit 12c through the related information collection unit 12h, are responsible for the artificial intelligence function.
  • the imaging unit 12a captures an area around the installation location of the camera 12.
  • the microphone 12b detects sound around the installation location of the camera 12 (for example, the voice of a person captured by the camera 12).
  • the person detection unit 12c detects a person reflected in the image based on the image captured by the imaging unit 12a and, additionally, the audio data detected by the microphone 12b.
  • the feature part detection unit 12d detects a feature part of the person from the image of the person detected by the person detection unit 12c.
  • the characteristic part of the person is at least the face or appearance of the person detected by the person detection unit 12c.
  • the appearance includes the appearance of the person viewed from the front, the side, or the back.
  • the feature portion may include the clothes of the person detected by the person detection unit 12c, the design, color, or pattern of the clothes, or a mark attached to the clothes of the person. For example, if the person visiting the installation location of the camera 12 is a delivery person or salesperson in charge of a business such as a delivery company, home delivery company, or postal company, the characteristic color or pattern of the uniform or hat the person wears, or the sign (trademark) of the business attached to the uniform or hat, is the feature portion of the person.
  • the person specifying unit 12e specifies the person detected by the person detecting unit 12c based on the image of the characteristic portion detected by the characteristic portion detecting unit 12d.
  • the feature amount extraction unit 12n extracts the feature amount of the feature portion from the image of the feature portion detected by the feature portion detection unit 12d.
  • the similarity determination processing unit 12o determines the degree of similarity between the extracted feature amount and the feature amount of the feature portion of the person included in the person information on predetermined persons registered in the database 12l (hereinafter also referred to as the DB 12l).
  • the age/sex estimation processing unit 12p estimates the age and/or gender of the person by comparing the feature amount of the feature portion with the person information.
  • the person information is information for specifying a predetermined person, and includes the name, age, gender, the image of the feature portion, the feature amount, and the like of the predetermined person.
  • the response determination unit 12f determines the response to the person according to the identification result.
  • the dialogue engine unit 12g selects a scenario conforming to the dialogue from among the plural scenarios registered in the DB 12l. A scenario means a predetermined fixed sentence indicating the contents of the conversation addressed from the camera 12 to the person when the camera 12 and the person interact with each other. It is desirable that the plural scenarios be classified into predetermined categories and registered according to the type of person, dialogue, and the like.
  • the related information collection unit 12h collects, as related information, information on the installation location of the camera 12, information on the periphery of the installation location, and/or information on a predetermined person (the person information described above) from the outside via the communication line 14 and the communication unit 12i, and registers the collected related information in the DB 12l. Specifically, the related information collection unit 12h collects the peripheral information stored in the peripheral information DB 22 and the personal information stored in the personal information DB 24 as related information via the communication line 14 and the communication unit 12i. It is preferable that the related information collection unit 12h register the related information in the DB 12l in association with the time at which the related information was collected.
  • the communication unit 12i transmits and receives signals by wireless communication with the portable device 18 and/or the server 16 via the communication line 14.
  • the speaker 12j outputs as a voice the fixed phrase of the scenario selected by the dialogue engine unit 12g.
  • the dialogue engine unit 12g creates a new scenario according to the contents of the dialogue and registers the created new scenario in the DB 12l. In this case, new scenarios are sorted into predetermined categories and registered according to the type of person, dialogue, and the like.
  • the server 16 includes a communication unit 16a (third communication unit), an inquiry content determination unit 16b, a data matching unit 16c, a response determination unit 16d (second response determination unit), a dialogue engine unit 16e (second dialogue engine unit), and a memory 16f.
  • the portable device 18 includes a communication unit 18a (second communication unit), a control unit 18b, a memory 18c, a display unit 18d, an operation unit 18e, a speaker 18f (second speaker), and a microphone 18g (third microphone).
  • the server 16 reads out and executes the program stored in the memory 16f to realize the functions of the inquiry content determination unit 16b, the data matching unit 16c, the response determination unit 16d, and the dialogue engine unit 16e.
  • the server 16 is a computer equipped with an artificial intelligence function, and the inquiry content determination unit 16b, the data matching unit 16c, the response determination unit 16d, and the dialogue engine unit 16e, realized by the execution of a program, are responsible for the artificial intelligence function.
  • the communication unit 16a of the server 16 transmits and receives signals by wireless communication with the camera 12 and/or the portable device 18 via the communication line 14.
  • the inquiry content determination unit 16b determines what kind of inquiry content has been received.
  • the inquiry content is an inquiry requesting that the person be identified in more detail on the server 16 side even though person identification has been performed on the camera 12 side. For example, when it is difficult for the person specifying unit 12e to specify with high accuracy a person captured by the imaging unit 12a, the above-described inquiry content is transmitted from the camera 12 to the server 16.
  • the data matching unit 16c acquires the peripheral information of the camera 12 from the peripheral information DB 22 via the communication unit 16a based on the determination result of the inquiry content determination unit 16b, acquires the personal information corresponding to the person from the personal information DB 24, and specifies the person in detail by matching the acquired peripheral information and personal information with the contents of the inquiry.
  • the response determination unit 16d determines the response to the person according to the identification result.
  • when the response determination unit 16d decides to execute a dialogue with the person based on the identification result of the data matching unit 16c, the dialogue engine unit 16e selects a scenario corresponding to the dialogue from the scenario DB 26 via the communication unit 16a. The scenarios stored in the scenario DB 26 are of the same kind as those stored in the DB 12l of the camera 12. However, it should be noted that the amount of scenario data accumulated in the scenario DB 26 is much larger than that accumulated in the DB 12l. In the scenario DB 26 as well, it is desirable that each scenario be classified into a predetermined category and registered according to the type of person, dialogue, and the like.
  • the communication unit 18a of the portable device 18 transmits and receives signals by wireless communication with the camera 12 and/or the server 16 via the communication line 14.
  • the control unit 18b controls each unit in the portable device 18 by reading out and executing the program stored in the memory 18c.
  • the display unit 18d is, for example, the display portion of a touch panel, and displays desired information.
  • the operation unit 18e is, for example, a soft key of a touch panel, and the holder of the portable device 18 can issue a desired instruction to the portable device 18 by operating the operation unit 18e.
  • the speaker 18f outputs audio data as sound to the outside.
  • the microphone 18g detects the voice of the holder of the portable device 18.
  • in the first embodiment, the person specifying unit 12e specifies a person from the image captured by the imaging unit 12a, and based on the identification result the camera 12 and the person interact with each other. Therefore, note that in the first embodiment all processing is performed within the camera 12, and no information is transmitted or received between the camera 12 and the portable device 18 and/or the server 16 via the communication line 14.
  • in FIG. 4, a case where a visitor, as the person, visits a house will be described as an example.
  • in step S1 of FIG. 4, when the visitor comes to the entrance of the house in which the camera 12 is installed, the imaging unit 12a of the camera 12 images the area around the installation location of the camera 12, including the visitor.
  • the camera 12 may image the periphery of the installation location before the visitor arrives, or it may remain in a standby state until the visitor arrives and start imaging when the visitor arrives.
  • the microphone 12b detects the sound around the installation location of the camera 12, including the voice of the visiting visitor.
  • in step S2, the person detection unit 12c detects the visitor based on the image captured by the imaging unit 12a. Specifically, the person detection unit 12c detects the visitor by the following method.
  • the person detection unit 12c detects the visitor by detecting, in images captured by the camera 12 continuously or at predetermined time intervals, a moving object reflected in the image, a change in the noise of the image, or a change in the brightness of the image. If a moving object appears even though no movement should occur in the image captured by the camera 12, it can be determined that a visitor has arrived. Also, the brightness of the image captured by the camera 12 changes if, for example, another resident turns on a light in the house; in this case too, it can be determined that the other resident has arrived as a visitor.
  • the person detection unit 12c can detect the visitor from the sound detected by the microphone 12b (for example, the voice or footsteps of the visitor).
  • the feature part detection unit 12d detects the feature part of the visitor from the image of the visitor.
  • in this case, the feature portion detection unit 12d detects the face, which is a feature portion of the visitor.
  • in step S3, the feature amount extraction unit 12n of the person specifying unit 12e extracts the feature amount of the face from the image of the visitor's face detected by the feature portion detection unit 12d.
  • in step S4, the similarity determination processing unit 12o of the person specifying unit 12e compares the feature amount of the face extracted by the feature amount extraction unit 12n with the feature amounts of the faces included in the plural pieces of person information stored in the DB 12l to determine the degree of similarity. Thus, a person who matches the visitor being determined can be specified.
  • in step S5, the age/sex estimation processing unit 12p of the person specifying unit 12e estimates the age and/or gender of the visitor by comparing the feature amount of the face extracted by the feature amount extraction unit 12n with the plural pieces of person information stored in the DB 12l. Thereby, the age and/or sex of the visitor being determined can be specified.
  • in this way, the person specifying unit 12e can specify the visitor.
  • the response determination unit 12f receives the identification result of the visitor from the person identification unit 12e and determines whether to start a dialogue (conversation) with the visitor.
  • when the visitor has been identified, the response determination unit 12f determines that a dialogue is possible (step S6: YES), and the process proceeds to the next step S7. When the visitor has not been identified, the response determination unit 12f determines that a dialogue is impossible (step S6: NO), and the process returns to step S1 to perform the processes of steps S1 to S6 again.
  • in step S7, in response to the decision by the response determination unit 12f to execute the dialogue with the visitor, the dialogue engine unit 12g selects the scenario most suitable for the dialogue from the DB 12l.
  • as described above, the related information collection unit 12h collects the peripheral information from the peripheral information DB 22 and the personal information from the personal information DB 24 via the communication unit 12i and the communication line 14, associates the collected peripheral information and personal information with the collection time, and registers them in the DB 12l as related information. Therefore, the dialogue engine unit 12g extracts from the DB 12l the related information collected at the time when the imaging unit 12a captured the image of the visitor with whom dialogue was determined to be possible, and selects from the DB 12l a scenario corresponding to the extracted related information and the determination of the response determination unit 12f. For example, the dialogue engine unit 12g selects an optimal scenario from the DB 12l in consideration of the installation location of the camera 12 included in the related information and the weather information, traffic information, and crime prevention information around the installation location (a time-indexed lookup sketch follows this item).
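Since related information is registered together with its collection time, looking up the entry in effect at the moment the visitor was imaged amounts to a nearest-earlier search over timestamps. The following Python sketch (names assumed) shows that lookup.

```python
from bisect import bisect_right

def related_info_at(entries: list[tuple[float, dict]],
                    capture_time: float) -> dict | None:
    """entries: (unix_time, info) pairs sorted by time ascending. Return the
    entry most recently collected at or before the capture time, if any."""
    times = [t for t, _ in entries]
    idx = bisect_right(times, capture_time)
    return entries[idx - 1][1] if idx else None
```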
  • in step S8, the dialogue engine unit 12g outputs the fixed phrase indicated by the scenario to the outside as a voice from the speaker 12j.
  • for example, if the visitor is a delivery person in charge for a carrier such as a delivery company, home delivery company, or postal company, a fixed phrase along the lines of "Thank you for your trouble. Please put it in the post." is output from the speaker 12j. That is, such scenarios are accumulated in the DB 12l; the dialogue engine unit 12g selects the scenario most suitable for interacting with the business operator in step S7, and in step S8 the fixed sentence of the selected scenario is output as sound from the speaker 12j.
  • in step S9, the response determination unit 12f determines whether to end the dialogue with the visitor. If the dialogue is to be continued, the process returns to step S1, and the processes of steps S1 to S9 are performed again.
  • the camera 12 can perform moving image shooting or intermittent still image shooting for the visitor by repeatedly executing the processing of steps S1 to S9.
  • since the feature portion detection unit 12d detects the visitor's face from the image of the visitor every time the person detection unit 12c detects the visitor in step S2 (that is, every time the process of step S1 is executed), it can detect and track the face.
  • while the dialogue continues, the response determination unit 12f makes an affirmative determination and continues the execution of the dialogue.
  • in step S9, when the response determination unit 12f determines the end of the dialogue with the visitor (step S9: YES), the process proceeds to the next step S10.
  • in step S10, the dialogue engine unit 12g records in the data storage unit 12m the image (of the visitor) captured by the imaging unit 12a during the dialogue and the sound (the visitor's voice data) detected by the microphone 12b. Further, in step S11, the dialogue engine unit 12g creates a new scenario according to the content of the dialogue with the visitor and registers the created new scenario in the DB 12l. In this case, the dialogue engine unit 12g may create the new scenario by correcting or changing the scenario used for the dialogue.
  • since the processes of steps S1 to S11 are executed each time a visitor arrives, the dialogue engine unit 12g performs dialogue learning through the dialogues with visitors, creates new scenarios, and registers them in the DB 12l. As a result, scenarios are accumulated in the DB 12l each time a dialogue is executed, and the scenarios in the DB 12l can be updated. Consequently, when interacting with a visitor, the dialogue engine unit 12g can select the latest, optimal scenario from the DB 12l and use it to improve the accuracy of the dialogue.
  • the dialogue engine unit 12g may select an optimal scenario from among the most recent scenarios registered in the DB 12l (for example, those from the past year). That is, since new scenarios are sequentially registered in the DB 12l over time by the above-mentioned dialogue learning, selecting the latest scenario for the same or a similar dialogue makes it possible to execute an appropriate dialogue.
  • the image and sound recorded in the data storage unit 12m in step S10 during the interaction can be deleted from the data storage unit 12m based on, for example, an input operation on the operation unit 18e by the owner of the portable device 18.
  • a second embodiment (the third response) will next be described with reference to FIG. 5.
  • in the second embodiment, the server 16 performs a more detailed identification of the visitor, and the camera 12 interacts with the visitor based on the identification result. Therefore, note that in the second embodiment the subjects of processing are the camera 12 and the server 16. Also in the second embodiment, as in the first embodiment of FIG. 4, a case where a visitor visits the building from outside and is identified based on the visitor's face will be described.
  • after step S5 of FIG. 4, the response determination unit 12f of the camera 12 proceeds to step S12 of FIG. 5.
  • in step S12, the response determination unit 12f determines, based on the identification result of the person identification unit 12e, whether to inquire of the server 16 the detailed identification of the visitor.
  • when the visitor has been identified, the response determination unit 12f determines that an inquiry to the server 16 is unnecessary (step S12: NO), and the camera 12 side executes the processing from step S6 of FIG. 4 onward. On the other hand, when it is difficult to identify the visitor on the camera 12 side, the response determination unit 12f decides to inquire of the server 16 the detailed identification of the visitor (step S12: YES).
  • the response determination unit 12f then transmits the inquiry content requesting the detailed identification of the visitor to the server 16 via the communication unit 12i and the communication line 14.
  • the inquiry content includes an instruction requesting the detailed identification of the visitor, the determination result of the response determination unit 12f, the identification result of the person identification unit 12e, the image of the visitor processed by the person identification unit 12e (the image of the visitor's face), and the like.
  • when the communication unit 16a of the server 16 receives the inquiry content from the communication unit 12i of the camera 12 via the communication line 14, it outputs the received inquiry content to the inquiry content determination unit 16b.
  • in step S14, the inquiry content determination unit 16b confirms (determines) the inquiry content input from the communication unit 16a.
  • in this case, since the input inquiry content requests the visitor's detailed identification, the inquiry content determination unit 16b outputs the inquiry content to the data matching unit 16c.
  • if the received information is not an inquiry requesting detailed identification, the inquiry content determination unit 16b does not output it to the data matching unit 16c.
  • in step S15, the data matching unit 16c recognizes, from the inquiry content input from the inquiry content determination unit 16b, that the camera 12 has requested the detailed identification of the visitor. After confirming the input inquiry content, the data matching unit 16c acquires the peripheral information on the surroundings of the installation location of the camera 12 from the peripheral information DB 22 via the communication unit 16a, and acquires the personal information corresponding to the visitor from the personal information DB 24. Next, the data matching unit 16c identifies the visitor in detail by matching the acquired peripheral information and personal information with the contents of the inquiry.
  • the data matching unit 16c has the same functions as the person specifying unit 12e of the camera 12 (the feature amount extraction unit 12n, the similarity determination processing unit 12o, and the age/sex estimation processing unit 12p). Therefore, the data matching unit 16c performs the same processing as the person specifying unit 12e using the image of the visitor's face and the like included in the inquiry content. However, the peripheral information DB 22 and the personal information DB 24 accumulate a much larger amount of data than the DB 12l of the camera 12, including information on new visitors that cannot be identified by the camera 12. Therefore, the data matching unit 16c can identify the visitor in more detail by using the information stored in the DBs 22 and 24.
  • thereafter, the process proceeds to step S6 of FIG. 4; in the second embodiment, however, the response determination unit 16d of the server 16 executes the process of step S6.
  • in step S6, the response determination unit 16d of the server 16 determines whether to start a dialogue (conversation) with the visitor based on the detailed identification result for the visitor by the data matching unit 16c. As described above, since the person image of the visitor has been specified in detail by the data matching unit 16c, the response determination unit 16d determines that a dialogue with the visitor is possible and decides to start the dialogue (step S6: YES), and the process proceeds to the next step S7.
  • in step S7, in response to the decision by the response determination unit 16d to execute the dialogue with the visitor, the dialogue engine unit 16e of the server 16 selects and acquires the scenario optimal for the dialogue from the scenario DB 26 via the communication unit 16a.
  • the scenario DB 26 stores far more scenarios than the DB 12l of the camera 12. Therefore, the dialogue engine unit 16e can reliably select the scenario optimal for the dialogue with the visitor from among the plural scenarios accumulated in the scenario DB 26.
  • the dialogue engine unit 16e may also acquire from the peripheral information DB 22 the peripheral information registered at the time the image of the visitor was captured, acquire the personal information of the visitor from the personal information DB 24, and select an optimal scenario from the scenario DB 26 in consideration of the acquired peripheral information and personal information.
  • in step S8, the dialogue engine unit 16e outputs the fixed sentence indicated by the scenario to the outside as sound from the speaker 12j via the communication unit 16a, the communication line 14, and the communication unit 12i of the camera 12. For example, when the visitor is a first-time visitor, a fixed sentence such as "Who?" or "What is it for?" can be output from the speaker 12j.
  • in step S9, the response determination unit 16d of the server 16 determines whether to end the dialogue with the visitor.
  • in step S10, the dialogue engine unit 16e of the server 16 records, via the communication unit 16a, the image captured by the imaging unit 12a during the dialogue (the image of the visitor) and the sound detected by the microphone 12b (the visitor's voice data) in the history DB 28 as the dialogue history.
  • in step S11, the dialogue engine unit 16e of the server 16 creates a new scenario according to the contents of the dialogue with the visitor using the dialogue history, and registers the created new scenario in the scenario DB 26.
  • the processes of steps S14 and S15 of FIG. 5 and steps S6 to S11 of FIG. 4 are sequentially executed each time an inquiry content is transmitted from the camera 12 to the server 16.
  • the dialogue engine unit 16e performs dialogue learning through the dialogues with visitors, creates new scenarios using the dialogue history in the history DB 28, and registers them in the scenario DB 26.
  • as a result, scenarios are accumulated in the scenario DB 26 each time a dialogue is executed, and the scenarios in the scenario DB 26 can be updated.
  • the dialogue engine unit 16e can select the latest and optimal scenario from the scenario DB 26, and use the selected scenario to improve the accuracy of the dialogue.
  • the dialogue engine unit 16e may select an optimal scenario from among the most recent scenarios registered in the scenario DB 26 (for example, those from the past year). That is, since new scenarios are sequentially registered in the scenario DB 26 over time by the above-described dialogue learning, selecting the latest scenario for the same or a similar dialogue makes it possible to execute an appropriate dialogue.
  • when the data matching unit 16c cannot specify the visitor in detail, the response determination unit 16d of the server 16 determines in step S6 of FIG. 4 that a dialogue with the visitor is impossible (step S6: NO), the process returns to step S1, and the processes of steps S1 to S5 and S12 to S15 are performed again.
  • alternatively, the processes of steps S14 and S15 of FIG. 5 may be performed on the server 16 side, the identification result of the data matching unit 16c may be transmitted from the response determination unit 16d of the server 16 to the response determination unit 12f via the communication unit 16a, the communication line 14, and the communication unit 12i of the camera 12, and the process of step S6 of FIG. 4 may be performed by the response determination unit 12f of the camera 12. In this case, the processes from step S7 onward are executed on the camera 12 side as in the first embodiment.
  • further, the processes of steps S6, S7, and S9 of FIG. 4 may be performed on the camera 12 side; upon receiving the affirmative determination result of step S9 from the response determination unit 12f on the camera 12 side, the dialogue engine unit 16e of the server 16 may record the dialogue history in the history DB 28 in step S10, create a new scenario using the dialogue history in step S11, and accumulate it in the scenario DB 26.
  • a third embodiment (the second response) will next be described with reference to FIG. 6.
  • in the third embodiment, the camera 12 notifies the portable device 18 of the visitor's visit, and a dialogue between the visitor and the holder of the portable device 18 then takes place. Therefore, note that in the third embodiment the subjects of processing are the camera 12 and the portable device 18. Also in the third embodiment, as in the first embodiment shown in FIG. 4, a case where a visitor visits the building from outside and is identified based on the visitor's face will be described.
  • after step S8 of FIG. 4, the response determination unit 12f of the camera 12 proceeds to step S16 of FIG. 6.
  • in step S16, the response determination unit 12f determines, based on the identification result of the visitor by the person identification unit 12e, whether to notify the portable device 18 of the visitor's visit.
  • when the camera 12 can handle the visit by itself, such as a visit from a business operator like a delivery company, home delivery company, or postal company, the response determination unit 12f determines that notification is unnecessary (step S16: NO) and executes the process of step S9 of FIG. 4.
  • on the other hand, when it is determined that the portable device 18 needs to be notified of the visitor's visit, such as when the visitor wants to have a conversation with a resident of the house or when the visitor is visiting for the first time (step S16: YES), the process proceeds to the next step S17.
  • in step S17, the response determination unit 12f transmits to the portable device 18, via the communication unit 12i and the communication line 14, inquiry content for notifying of the visitor's visit and for confirming whether to start a dialogue (videophone service) with the visitor.
  • in step S18, when the communication unit 18a of the portable device 18 receives the inquiry content from the camera 12, the display unit 18d displays the inquiry content, while the speaker 18f outputs a sound indicating the inquiry content.
  • by looking at the display content of the display unit 18d or listening to the sound from the speaker 18f, the owner of the portable device 18 can recognize that the inquiry about the visitor's visit and the start of the videophone service has come from the camera 12.
  • in step S19, the holder of the portable device 18 decides whether to start the videophone service based on the display content of the display unit 18d and/or the sound from the speaker 18f.
  • when the videophone service is not to be started, that is, when no dialogue with the visitor is to be performed (step S19: NO), the holder of the portable device 18 may simply not operate the operation unit 18e at all. As a result, no reply is sent from the portable device 18 to the camera 12 via the communication line 14. Therefore, if there is no response from the portable device 18 within a predetermined time from the time the inquiry content was transmitted in step S17, the response determination unit 12f of the camera 12 determines that the holder does not intend to start the videophone service, and the process proceeds to step S9 of FIG. 4.
  • alternatively, when the videophone service is not to be started (step S19: NO), the owner of the portable device 18 may operate the operation unit 18e and input an instruction indicating that the videophone service is not to be started. The communication unit 18a then transmits an instruction signal corresponding to the input instruction to the communication unit 12i of the camera 12 via the communication line 14.
  • in this case, the response determination unit 12f of the camera 12 recognizes, from the instruction signal received by the communication unit 12i, that an instruction not to start the videophone service has been given, and proceeds to the process of step S9 of FIG. 4 (a timeout-handling sketch follows this item).
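The camera must treat both an explicit decline and prolonged silence as a refusal of the videophone service. A minimal Python sketch of that decision, with an assumed timeout value and an assumed in-memory queue standing in for the communication units:

```python
import queue

RESPONSE_TIMEOUT_S = 30.0  # assumed "predetermined time" from step S17

def await_start_instruction(inbox: "queue.Queue[bool]") -> bool:
    """Return True only when the holder explicitly instructs the videophone
    service to start before the timeout elapses; silence or an explicit
    False (do-not-start) signal both count as declining."""
    try:
        return inbox.get(timeout=RESPONSE_TIMEOUT_S)  # True = start requested
    except queue.Empty:
        return False  # no reply within the window: treat as declined
```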
  • when the videophone service is to be started (step S19: YES), the holder of the portable device 18 operates the operation unit 18e and instructs the start of the videophone service. The communication unit 18a then transmits an instruction signal corresponding to the input instruction to the communication unit 12i of the camera 12 via the communication line 14.
  • in step S20, the response determination unit 12f of the camera 12 recognizes, from the instruction signal received by the communication unit 12i, that the start of the videophone service has been instructed, and decides to start the videophone service.
  • upon receiving the decision of the response determination unit 12f, the dialogue engine unit 12g transmits the image of the visitor captured by the imaging unit 12a and the voice data of the visitor detected by the microphone 12b to (the communication unit 18a of) the portable device 18 via the communication unit 12i and the communication line 14.
  • the display unit 18d of the portable device 18 displays the image of the visitor received by the communication unit 18a, and the speaker 18f outputs the sound data received by the communication unit 18a as a sound.
  • the holder of the portable device 18 views the image of the visitor displayed on the display unit 18d, listens to the sound from the speaker 18f, and then starts a conversation with the visitor.
  • the microphone 18g detects the voice of the holder, and the communication unit 18a transmits the voice (voice data) of the holder detected by the microphone 18g to the communication unit 12i of the camera 12 via the communication line 14.
  • the speaker 12j of the camera 12 outputs the sound data of the possessor received by the communication unit 12i as a sound.
  • the visitor can talk with the owner (resident of the house) by listening to the sound from the speaker 12 j.
  • a dialogue (TV phone service) is started between the visitor and the holder of the portable device 18.
In step S21, the response determination unit 12f determines whether to end the video telephone service. Specifically, the response determination unit 12f checks whether the communication unit 12i has received, from the portable device 18 via the communication line 14, an instruction signal instructing termination of the video telephone service, generated by the holder's operation of the operation unit 18e.

If no such signal has been received (step S21: NO), the response determination unit 12f determines that the video telephone service should continue. In this case, the dialogue engine unit 12g executes the process of step S20 and continues the video telephone service.

When the instruction signal instructing the end of the video telephone service is received by the communication unit 12i in step S21 (step S21: YES), the response determination unit 12f determines that the video telephone service should end. In response to this determination, the dialogue engine unit 12g stops the transmission of the visitor's image and voice data to the portable device 18 through the communication unit 12i and the communication line 14. On the portable device 18 side, transmission of the holder's voice data to the camera 12 also stops with the generation of the instruction signal instructing the end of the video telephone service. As a result, the video telephone service ends.
In step S22, the dialogue engine unit 12g records in the data storage unit 12m the image of the visitor captured by the imaging unit 12a during the dialogue (during the video telephone service) and the voice data of the visitor detected by the microphone 12b. Also in this case, the recorded image and voice of the dialogue can be deleted from the data storage unit 12m, for example, based on an input operation on the operation unit 18e by the holder of the portable device 18.
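Although the patent describes this exchange only in terms of functional units, the camera-side control of steps S17 to S21 can be pictured roughly as in the following Python sketch. It is a minimal illustration, not the patent's implementation: `channel` and `camera` (and all of their methods) are hypothetical stand-ins for the communication unit 12i with line 14 and for the imaging unit 12a, microphone 12b, and speaker 12j, and the 30-second timeout is an assumed value for the "predetermined time".

```python
import time

INQUIRY_TIMEOUT_S = 30.0  # assumed value for the "predetermined time" after step S17

def await_call_decision(channel, timeout_s=INQUIRY_TIMEOUT_S):
    """Wait for the holder's reply to the inquiry sent in step S17.

    Returns "start", "decline", or "timeout"; a timeout is treated the
    same as an explicit decline (step S19: NO).
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        reply = channel.poll_instruction()  # assumed non-blocking receive
        if reply == "start_call":
            return "start"
        if reply == "decline_call":
            return "decline"
        time.sleep(0.1)
    return "timeout"

def run_video_call(camera, channel):
    """Relay visitor image/audio to the device and holder audio back
    until an end instruction arrives (steps S20 and S21)."""
    while channel.poll_instruction() != "end_call":
        channel.send(image=camera.capture_frame(), audio=camera.read_mic())
        holder_audio = channel.receive_audio()
        if holder_audio is not None:
            camera.play_speaker(holder_audio)
```

Treating silence as a decline matches the behavior described above, where no response from the portable device 18 within the predetermined time is interpreted as having no intention to start the service.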
The first modified example concerns the response when a resident of the house enters or leaves. For example, consider the case where the holder of the portable device 18 has gone out and is absent while another resident remains in the house, and that other resident then goes out of the house. As described below, this modification changes part of the operation of the third embodiment.

In step S1 of FIG. 4, the imaging unit 12a captures an image of the other resident going out of the house, and the person detection unit 12c detects the other resident based on the captured image. The characteristic portion detection unit 12d then detects a characteristic portion of the other resident, for example, the face or appearance (back view) of the other resident. In step S5, the person specifying unit 12e identifies the other resident from the characteristic portion (face or back view).

The processing from step S6 onward is then performed, and by carrying out a dialogue with the other resident, the camera 12 can call out to the other resident who is going out of the house.
Alternatively, the third embodiment of FIG. 6 may be executed, or, as indicated by the broken line in FIG. 4, the processing of steps S6 to S8 may be omitted and the third embodiment of FIG. 6 implemented instead.
In this case, the response determination unit 12f determines to notify the portable device 18 that the other resident is going out of the house (step S16: YES), and notifies the communication unit 18a of the portable device 18 to that effect through the communication unit 12i and the communication line 14. The display unit 18d of the portable device 18 displays the notification content, and the speaker 18f outputs a sound according to the notification content.

The holder of the portable device 18 recognizes that the other resident is going out of the house by viewing the display content of the display unit 18d or by listening to the sound from the speaker 18f. Since the portable device 18 is always notified in this way when another resident goes out of the house, the holder of the portable device 18 can take necessary measures, such as executing a dialogue with the other resident in steps S19 and S20.
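The "always notify" behavior described above amounts to pushing an event to the portable device 18 whenever a departure is detected. A minimal sketch follows; the message schema, field names, and the `channel` object are invented for illustration and stand in for the communication unit 12i and the communication line 14.

```python
import json
import time

def notify_resident_departure(channel, resident_name: str, snapshot: bytes):
    """Push a departure event to the portable device (steps S16/S17).

    `resident_name` would come from the identification in step S5
    (e.g. from the back view); the payload layout is hypothetical.
    """
    event = {
        "type": "resident_departure",
        "resident": resident_name,
        "timestamp": time.time(),
        "offer_dialogue": True,   # lets the holder proceed to steps S19/S20
    }
    channel.send(payload=json.dumps(event), attachment=snapshot)
```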
The second embodiment may also be applied here so that the other resident is identified in detail on the server 16 side. In the above description, the back view of the other resident is used as the characteristic portion, but the characteristic portion may instead be the face of the other resident, the figure of the other resident viewed from the front or from the side, the clothes of the other resident, and so on. Since these characteristic portions are registered in the DB 12l as personal information (including related information), the person specifying unit 12e can identify the other resident by comparing the images and feature quantities of the characteristic portions of the other resident against the personal information (the information on the characteristic portions).
The second modified example changes part of the operation of the first and second embodiments, and identifies the visitor using an image of the visitor's clothes or the like in addition to, or in place of, an image of the visitor's face or the like.

The characteristic portion detection unit 12d detects the face or appearance that is a characteristic portion of the visitor from the image of the visitor detected by the person detection unit 12c. Furthermore, it detects the visitor's clothes, the pattern, color or design of the clothes, or marks attached to the visitor's clothes.

In step S3, the feature quantity extraction unit 12n of the person specifying unit 12e extracts the feature quantity of the face or appearance from the image of the visitor's face or appearance detected by the characteristic portion detection unit 12d. Feature quantities are likewise extracted from the images of the visitor's clothes, the pattern, color or design of the clothes, or the marks attached to the clothes.
In step S4, the similarity determination processing unit 12o compares the feature quantity of each characteristic portion extracted by the feature quantity extraction unit 12n with the feature quantities of the characteristic portions included in the plurality of pieces of personal information stored in the DB 12l. In step S5, the age/sex estimation processing unit 12p compares the feature quantity of each characteristic portion extracted by the feature quantity extraction unit 12n with the plurality of pieces of personal information stored in the DB 12l, and estimates the age and/or gender of the visitor.

In this way, the person specifying unit 12e can identify the visitor based not only on the face or appearance but also on the visitor's clothes, the pattern, color or design of the clothes, or the marks attached to the clothes. The camera 12 then executes the dialogue with the visitor.
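The similarity determination and age/sex estimation are described only functionally; one common way to realize them is to compare embedding vectors of each characteristic portion against registered ones, for example by cosine similarity. The sketch below assumes such an embedding-based approach; the record layout, threshold, and averaging rule are illustrative choices, not the patent's specification.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def identify_visitor(features: dict, personal_db: list, threshold: float = 0.8):
    """Match detected characteristic portions against registered records.

    `features` maps portion names (e.g. "face", "clothes", "mark") to
    embedding vectors; each record in `personal_db` is assumed to look
    like {"name": ..., "portions": {...}, "age": ..., "sex": ...}.
    Returns the best matching record and its score, or (None, score).
    """
    best, best_score = None, threshold
    for record in personal_db:
        scores = [
            cosine_similarity(vec, record["portions"][part])
            for part, vec in features.items()
            if part in record["portions"]
        ]
        if scores:
            score = sum(scores) / len(scores)  # average over shared portions
            if score > best_score:
                best, best_score = record, score
    return best, best_score
```

In such a design, a matched record could also supply the registered age and sex, while an unmatched visitor would fall back to the estimation by the age/sex estimation processing unit 12p.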
Suppose, for example, that a business operator visits the house where the camera 12 is installed while the only resident at home is one who has difficulty dealing with a business person. Here, a resident who has difficulty dealing with a business person means, for example, a child for whom dealing with a business person is difficult, or a person who is being treated or is recuperating in the house because of illness or the like. In step S7, the dialogue engine unit 12g selects from the DB 12l a scenario optimal for dialogue in such a case.
Specifically, the dialogue engine unit 12g extracts the related information from the DB 12l, and selects from the DB 12l a scenario corresponding to the extracted related information and the determination content of the response determination unit 12f. The dialogue engine unit 12g then outputs the fixed phrase indicated by the selected scenario to the outside as a voice from the speaker 12j. For example, a fixed phrase such as "Please put it in the delivery locker" or "Please redeliver in the latest time slot" is output as audio from the speaker 12j.
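Conceptually, the scenario selection of step S7 is a lookup keyed by the identified visitor and the related information. The toy table below illustrates this; the keys, the fallback phrase, and the third entry are invented for illustration, while the two delivery phrases are the fixed phrases quoted above.

```python
# Hypothetical scenario table standing in for DB 12l.
SCENARIOS = {
    ("delivery", "holder_absent"): "Please put it in the delivery locker.",
    ("delivery", "redeliver"): "Please redeliver in the latest time slot.",
    ("sales", "vulnerable_resident_home"):
        "The resident cannot come to the door right now.",
}

def select_scenario(visitor_category: str, related_info: dict) -> str:
    """Pick a fixed phrase based on the identified visitor and the
    related information; unknown situations get a generic fallback."""
    situation = related_info.get("situation", "holder_absent")
    return SCENARIOS.get(
        (visitor_category, situation),
        "No one can answer right now. Please leave a message.",
    )
```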
In the second embodiment, the data matching unit 16c may likewise specify the visitor based on the visitor's clothes, the pattern, color or design of the clothes, or the marks attached to the clothes, in addition to the visitor's face or appearance, in the same way as the processing by the person specifying unit 12e described above; a detailed description is therefore omitted.
The third modified example registers information on a visitor in advance in the DB 12l when the visitor's visit is known beforehand.

In step S31 of FIG. 7, it is assumed, for example, that the visitor is a business operator such as a delivery company, a home delivery company, or a postal company, and that the business operator has notified the holder of the portable device 18, or the portable device 18 itself, in advance of the scheduled visit time (scheduled delivery time) to the holder's residence (step S31: YES). If the portable device 18 is, for example, a smartphone, this notification is made by e-mail from the business operator to the portable device 18, by telephone call, or via a social network service (SNS).
In step S32, when the holder will be away from the house at the scheduled visit time, the holder operates the operation unit 18e of the portable device 18, inputs information on the business operator who will visit, and instructs the camera 12 to take a predetermined response at the scheduled visit time. The information input by the holder includes, for example, information on the visitor (the name of the company, the name of the person in charge at the company (the visitor), etc.), the purpose of the visit, information on the item to be delivered when the purpose of the visit is a delivery (the name of the sender, etc.), the scheduled visit time, and so on.

In step S33, the communication unit 18a of the portable device 18 transmits the input information to the communication unit 12i of the camera 12 via the communication line 14. In step S34, the related information collection unit 12h registers the information received by the communication unit 12i in the DB 12l as related information (and personal information).
When the business operator visits the house at the scheduled visit time, the camera 12 executes the processing from step S1 onward. Specifically, whether or not the visitor is the expected business operator is determined by the processing of steps S1 to S5. Thereafter, in step S7, an optimal scenario corresponding to the related information (the information input by the holder) registered in advance in the DB 12l is selected from the DB 12l. As a result, in step S8, the dialogue engine unit 12g can execute the dialogue with the visiting business operator according to the selected scenario even while the holder of the portable device 18 is away from home.
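The pre-registration of steps S32 to S34 and the later lookup in step S7 can be modelled as a small data store. The following sketch is illustrative only; the field names and the matching rule (matching on the company name identified in steps S1 to S5) are assumptions, not the patent's schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ExpectedVisit:
    """Related information entered by the holder in step S32."""
    company: str
    contact: str             # person in charge at the company (the visitor)
    purpose: str             # e.g. "delivery"
    sender: Optional[str]    # for deliveries: name of the sender
    scheduled_time: str      # e.g. "14:00-16:00"

class RelatedInfoDB:
    """Toy stand-in for DB 12l as used in steps S33/S34 and step S7."""
    def __init__(self) -> None:
        self._visits: list[ExpectedVisit] = []

    def register(self, visit: ExpectedVisit) -> None:
        self._visits.append(visit)  # step S34

    def match(self, visitor_company: str) -> Optional[ExpectedVisit]:
        """Look up pre-registered info for a visitor identified in steps S1-S5."""
        return next((v for v in self._visits if v.company == visitor_company), None)
```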
On the other hand, when the visit of a visitor such as a business operator is not known beforehand (step S31: NO in FIG. 7), the holder of the portable device 18 does not input the above information, and the camera 12 executes the processing of the first embodiment. In this case, the response determination unit 12f determines in step S6 of FIG. 4 to execute the dialogue with the visiting business operator, and the dialogue engine unit 12g selects the corresponding scenario from the DB 12l based on the related information in step S7. Even in this case, the dialogue engine unit 12g can execute the dialogue with the visiting business operator according to the selected scenario.
Similarly, the response determination unit 12f may determine in step S6 to execute the dialogue with the visitor, and the dialogue engine unit 12g may select the corresponding scenario from the DB 12l in step S7. In this way, the dialogue engine unit 12g can select an appropriate scenario according to the situation and execute the dialogue with the visitor according to the selected scenario.
Alternatively, the response determination unit 12f may determine in step S16 of FIG. 6 to notify the portable device 18 (step S16: YES), and the communication unit 12i may notify the communication unit 18a of the portable device 18 via the communication line 14 that a visitor different from the scheduled one has visited.
The display unit 18d of the portable device 18 displays the notification content received by the communication unit 18a, and the speaker 18f outputs a sound according to the notification content. The holder of the portable device 18 can thus understand, from the display content of the display unit 18d or the sound of the speaker 18f, that a visitor different from the original schedule has visited the house.
Thereafter, the camera 12 executes a dialogue with the business operator, and can respond, for example, by asking for the visit date and time to be changed because the holder is absent.
As another case, the response determination unit 12f executes the processes of steps S16 and S17 of FIG. 6, determines that the visitor is a suspicious person, and notifies the portable device 18 of the determination result through the communication unit 12i and the communication line 14. The display unit 18d of the portable device 18 displays the notification received by the communication unit 18a, and the speaker 18f outputs a sound according to the content of the notification. The holder of the portable device 18 learns that a suspicious person has visited by viewing the display content of the display unit 18d or by listening to the sound from the speaker 18f, and can take necessary action, such as the dialogue with the suspicious person in step S19 and thereafter.
Alternatively, the response determination unit 12f may determine in step S6 that the visitor is a suspicious person and decide to interact with the suspicious person (step S6: YES). Then, the dialogue engine unit 12g selects a scenario in step S7 and executes the dialogue in step S8. In this case, the suspicious person may not respond at all even when addressed from the camera 12. In such a case, the processing from step S16 in FIG. 6 onward is executed as in the above-described response, and the camera 12 notifies the portable device 18 of the visit of the suspicious person. As a result, the holder of the portable device 18 can take necessary measures, such as a dialogue with the suspicious person.
In this case, the imaging unit 12a images the suspicious person and the microphone 12b detects the suspicious person's voice; it is desirable to record the captured image of the suspicious person and the data of the suspicious person's voice in the data storage unit 12m together with information on the time of imaging.
The above processing may also be performed when the response determination unit 12f merely estimates that the visitor is a suspicious person, even if the visitor is not actually a suspicious person.
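Recording the suspicious person's image and voice together with the imaging time might look like the following sketch, which models the data storage unit 12m as a plain directory; the file layout and naming are invented for illustration.

```python
import json
import pathlib
import time

def record_suspicious_visit(storage_dir: str, frames, audio_chunks):
    """Persist captured frames and voice data with the imaging time.

    `frames` is assumed to be a list of encoded image bytes (e.g. JPEG)
    and `audio_chunks` a list of raw audio byte strings.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    out = pathlib.Path(storage_dir) / f"suspicious-{stamp}"
    out.mkdir(parents=True, exist_ok=True)
    for i, frame in enumerate(frames):
        (out / f"frame-{i:04d}.jpg").write_bytes(frame)
    (out / "audio.raw").write_bytes(b"".join(audio_chunks))
    (out / "meta.json").write_text(json.dumps({"captured_at": stamp}))
```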
The first to third embodiments and the first to fourth modified examples may be combined. The visitor may be identified from other characteristic portions (such as appearance and clothes) in addition to, or in place of, the visitor's face. Likewise, another resident going out of the house may be identified from other characteristic portions (face, appearance from other directions, clothes, etc.) in addition to, or in place of, the back view. Further, the second embodiment can be applied to the identification of the visitor or the other resident, so that the identification is performed on the server 16 side instead of on the camera 12 side.
As described above, by mounting artificial intelligence on the camera 12 and causing it to function as the person detection unit 12c, the characteristic portion detection unit 12d, and the person specifying unit 12e, services tailored to a specific person such as a visitor can be provided on the camera 12 side as a local area. Thereby, for example, even if the resident of the house where the camera 12 is installed (the holder of the portable device 18) is absent, the camera 12 can take an appropriate response to the visitor. As a result, the convenience of the security system 10 can be improved.
The person detection unit 12c detects a person in the image captured by the camera 12 by detecting a moving object reflected in the image, a change in the noise of the image, or a change in the brightness of the image. Thus, a person can easily be detected using only the image captured by the imaging unit 12a. As a result, with the detection result of the person detection unit 12c as a trigger, the characteristic portion detection unit 12d and the person specifying unit 12e can execute their respective processes.
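As one concrete reading of this trigger, a person candidate can be flagged from inter-frame change alone. The sketch below implements simple frame differencing over grayscale frames; the thresholds are arbitrary illustrative values, and a real system would add the noise-change check and the downstream verification by the characteristic portion detection unit 12d and the person specifying unit 12e.

```python
import numpy as np

def detect_person_candidate(prev_frame: np.ndarray,
                            curr_frame: np.ndarray,
                            motion_thresh: float = 12.0,
                            area_thresh: float = 0.02) -> bool:
    """Flag a frame as containing a person candidate from inter-frame change.

    Both frames are grayscale uint8 arrays of the same shape; casting to
    int16 avoids unsigned wrap-around when differencing. Returns True
    when the fraction of significantly changed pixels exceeds the area
    threshold, which serves as the recording/identification trigger.
    """
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    changed_fraction = (diff > motion_thresh).mean()
    return changed_fraction > area_thresh
```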
Further, the person detection unit 12c can detect a person with high accuracy by using the image captured by the imaging unit 12a and the sound detected by the microphone 12b in combination.
Since the characteristic portion detection unit 12d detects the characteristic portion of the person from the image of the person every time the person detection unit 12c detects the person, and thereby tracks the characteristic portion, it is possible to easily grasp the behavior of a person such as a visitor and to take an appropriate response to the person.

The person specifying unit 12e comprises the feature quantity extraction unit 12n, which extracts the feature quantity of the characteristic portion from the image of the person's characteristic portion detected by the characteristic portion detection unit 12d; the similarity determination processing unit 12o, which determines the similarity between the feature quantity of the person's characteristic portion extracted by the feature quantity extraction unit 12n and the feature quantity of the characteristic portion of the person indicated by the personal information; and the age/sex estimation processing unit 12p, which estimates the age and/or sex of the person detected by the person detection unit 12c by comparing the feature quantity of the person's characteristic portion extracted by the feature quantity extraction unit 12n with the personal information. It therefore becomes possible to identify easily and efficiently what kind of person the person detected by the person detection unit 12c is.
When the camera 12 is installed in a building such as a house, it is possible to easily detect a visitor to the building or the movement of a resident entering or leaving the building. By including in the characteristic portion the person's clothes, the pattern, color or design of the clothes, or the marks attached to the clothes, it also becomes possible to more easily identify what kind of person a visitor to the building or a person entering or leaving the building is. Moreover, since the camera 12 has the response determination unit 12f, which determines the response to the person according to the identification result, an appropriate response can be taken according to the person's profile.
The camera 12 can easily interact with a person according to a scenario. Since the camera 12 has the dialogue engine unit 12g, which is a function of artificial intelligence, even if the amount of image and voice data increases due to the execution of dialogue, the dialogue engine unit 12g can process it easily and quickly. Further, since the dialogue engine unit 12g performs dialogue learning through dialogue with people, it is possible to improve various functions of the artificial intelligence, including the dialogue function.
The dialogue engine unit 12g extracts from the DB 12l the related information collected by the related information collection unit 12h at the time when the imaging unit 12a captured the image of the person specified by the person specifying unit 12e, and selects from the DB 12l a scenario corresponding to the extracted related information and the dialogue. Therefore, by also referring to the related information, a more appropriate scenario is selected, and the accuracy of the dialogue with the person can be improved.
When the camera 12 is installed in a building such as a house and a visitor to the building is known in advance, an appropriate dialogue can be performed with the visitor when he or she visits the building, provided the visitor's related information has been registered in the DB 12l. Since the camera 12 includes the related information collection unit 12h, the latest related information is registered in the DB 12l, making it possible to further improve the accuracy of the dialogue with the person. In particular, since the dialogue engine unit 12g extracts from the DB 12l the related information collected by the related information collection unit 12h at the time when the imaging unit 12a captured the person, and selects from the DB 12l a scenario corresponding to the extracted related information and the dialogue, the accuracy of the dialogue with the person is further improved.
When related information corresponding to the specified person is not registered in the DB 12l, the dialogue engine unit 12g extracts other related information registered in the DB 12l, selects from the DB 12l a scenario corresponding to the other related information and the dialogue, and/or informs the portable device 18 that there is no corresponding person. Thus, when the camera 12 is installed in a building such as a house, even if the visitor to the building does not match the visitor indicated by the related information registered in the DB 12l, the other related information can be used to carry out a dialogue with the visitor.
The peripheral information DB 22 and the personal information DB 24 are provided on the server 16 side, where the amount of data available for specifying a person is far greater than at the camera 12, so the server 16 can specify the person's profile and the like in detail. Also, by installing artificial intelligence in the server 16 and making it function as the inquiry content determination unit 16b and the data matching unit 16c, a service tailored to the person can be provided on the server 16 side as a local area. In this case, too, even if the resident of the house where the camera 12 is installed is absent, the server 16 can take an appropriate response to the person. As a result, the convenience of the security system 10 can be improved.
Furthermore, by matching the data, it is possible to easily associate a person with content such as the peripheral information and the personal information.
Since the server 16 includes the response determination unit 16d and the dialogue engine unit 16e, the camera 12 can easily and accurately interact with a person according to the scenario selected on the server 16 side. Since the dialogue engine unit 16e is a function of artificial intelligence, even if the amount of image and voice data increases through the execution of dialogue, the dialogue engine unit 16e can process it easily and quickly. Furthermore, the dialogue engine unit 16e can perform dialogue learning through dialogue with a person and improve various functions of the artificial intelligence, including the dialogue function.
Since the dialogue engine unit 16e creates a new scenario using the dialogue history recorded in the history DB 28 and registers the created new scenario in the scenario DB 26, it is possible to confirm at a later date what kind of dialogue was performed, and the convenience of the security system 10 can be further improved. Further, since artificial intelligence is installed in the server 16, even if the amount of data accumulated in the history DB 28 increases dramatically, the dialogue engine unit 16e can process the data appropriately and promptly and accumulate the newly created scenarios. Moreover, since new scenarios are created using the dialogue history accumulated in the history DB 28, the dialogue engine unit 16e can easily select an optimal and up-to-date scenario when interacting with a person. That is, since the dialogue engine unit 16e creates new scenarios by dialogue learning through dialogue with people, it becomes easier to select an optimal scenario each time a dialogue with a person is executed.
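The scenario-learning loop described here, reading the dialogue history and registering derived scenarios, can be pictured as follows. `history_db`, `scenario_db`, and `summarize` are hypothetical stand-ins for the history DB 28, the scenario DB 26, and an unspecified learning routine; the patent does not prescribe this interface.

```python
def register_new_scenarios(history_db, scenario_db, summarize):
    """After dialogues end, derive new scenarios from the recorded
    exchanges and add them to the scenario store.

    `summarize` is an assumed learning routine that condenses a list of
    dialogue turns into a reusable scenario, returning None when the
    dialogue yields nothing worth registering.
    """
    for dialogue in history_db.fetch_unprocessed():
        scenario = summarize(dialogue.turns)
        if scenario is not None:
            scenario_db.insert(scenario, source=dialogue.id)
        history_db.mark_processed(dialogue.id)
```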
The holder of the portable device 18 who is away from the installation location of the camera 12 can receive notification of the presence of a person at the installation location (e.g., notification of a visitor's visit) in real time, even when not at the installation location. Thereby, the security system 10 can provide the holder with appropriate services.

After the holder of the portable device 18 receives the notification, a dialogue with the person is possible: by starting the dialogue between the person on the camera 12 side and the holder of the portable device 18, the holder can receive an appropriate service (dialogue service) in real time without being at the installation location of the camera 12. Thereby, the convenience of the security system 10 can be further improved. Even in this case, the dialogue engine unit 12g can perform dialogue learning through dialogue with the person and improve various functions of the artificial intelligence, including the dialogue function.
When the dialogue with the person ends, the camera 12 stores in the data storage unit 12m at least the image of the person captured by the imaging unit 12a during the dialogue and the voice data of the person detected by the microphone 12b, and the dialogue engine unit 12g creates a new scenario based on the contents of the dialogue with the person after the dialogue ends and registers the created new scenario in the DB 12l. Even if the amount of accumulated data increases dramatically, the dialogue engine unit 12g can process the data appropriately and promptly, create new scenarios, and register them in the DB 12l. Furthermore, since the dialogue engine unit 12g creates new scenarios by dialogue learning through dialogue with people, it becomes easier to select an optimal scenario each time a dialogue with a person is executed.
Since the artificial intelligence functions in the camera 12 and the server 16 are improved by dialogue learning, it also becomes possible, for example, to take measures such as releasing the locked state of a door when it is confirmed, in cooperation with the home security system of the house, that the person is a resident of the house.
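As a sketch of that last cooperation, the unlock decision could be gated on identification confidence; the API of the cooperating home security system and the 0.95 threshold below are hypothetical.

```python
def maybe_unlock_door(identity, confidence, home_security, min_conf=0.95):
    """Release the door lock only for a confidently identified resident.

    `identity` is assumed to be the matched personal-information record
    (or None), and `home_security` a client for the cooperating home
    security system; both interfaces are invented for illustration.
    """
    if (identity is not None
            and identity.get("role") == "resident"
            and confidence >= min_conf):
        home_security.unlock_front_door(reason=f"recognized {identity['name']}")
        return True
    return False
```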

Landscapes

  • Business, Economics & Management (AREA)
  • Emergency Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Alarm Systems (AREA)

Abstract

The camera (12) of the security system (10) has: an imaging unit (12a) for imaging the surroundings of the camera (12); a person detection unit (12c) for detecting a person based on an image captured by the imaging unit (12a); a characteristic portion detection unit (12d) for detecting a characteristic portion of the person from the image of the person detected by the person detection unit (12c); and a person identification unit (12e) for identifying the person based on the image of the characteristic portion detected by the characteristic portion detection unit (12d).

Description

Security system

The present invention relates to a security system having at least a camera for imaging a person.

For example, security systems that transmit an image captured by a camera installed in a building or the like to a server of an administrator are disclosed in JP-A-2009-98814, JP-A-2013-30116, and JP-A-2015-2506.

Also, in recent security systems, images captured by a camera are distributed to portable devices such as smartphones via a communication line; furthermore, a person reflected in the images is detected based on changes in luminance of the images captured by the camera, and recording of the captured images is started with the detection result as a trigger.

However, in conventional security systems, for example, when a camera images a visitor to a building such as a house, the movement of non-human moving bodies reflected in the captured image and changes in brightness such as the lighting of an illumination device may all be regarded as human movement. As a result, it cannot be appropriately determined whether a moving object reflected in the image is a visitor, or what kind of person the visitor is. Consequently, when the resident of the house is absent, it is not possible to provide an appropriate service tailored to the visitor (for example, interaction with the visitor or delivery of the visitor's image).

The present invention has been made in consideration of such problems, and an object of the present invention is to provide a security system capable of providing an appropriate service tailored to a person captured by a camera.

The present invention relates to a security system having at least a camera for imaging a person.

In order to achieve the above object, in the present invention, the camera has: an imaging unit for imaging the periphery of the camera; a person detection unit for detecting the person based on an image captured by the imaging unit; a characteristic portion detection unit for detecting a characteristic portion of the person from the image of the person detected by the person detection unit; and a person identification unit for identifying the person based on the image of the characteristic portion detected by the characteristic portion detection unit.

As described above, by mounting artificial intelligence on the camera and causing it to function as the person detection unit, the characteristic portion detection unit, and the person identification unit, the provision of services tailored to a specific person such as a visitor can be performed on the camera side as a local area. Thereby, for example, even when the resident of the house where the camera is installed is absent, the camera can take an appropriate response to the visitor. As a result, the convenience of the security system can be improved.

In this case, the person detection unit may detect the person in the image captured by the camera by detecting a moving object reflected in the image, by detecting a change in the noise of the image, or by detecting a change in the brightness of the image.

Thus, the person can be easily detected using only the image captured by the imaging unit. As a result, with the detection result of the person detection unit as a trigger, the characteristic portion detection unit and the person identification unit can execute their respective processes.

In addition, the camera may further include a first microphone that detects sound around the camera. In this case, the person detection unit may detect the person based on the image captured by the imaging unit and the sound detected by the first microphone.

As described above, by using the image captured by the imaging unit and the sound detected by the first microphone in combination, it is possible to detect the person with high accuracy.

Furthermore, when the camera images its surroundings continuously or at predetermined time intervals, the characteristic portion detection unit may track the characteristic portion by detecting it from the image of the person each time the person detection unit detects the person.

This makes it possible to easily grasp the behavior of the person and to take an appropriate response to the person.

Further, the camera may further include a person information registration unit in which person information on a predetermined person is registered. In this case, the person specifying unit comprises: a feature quantity extraction unit that extracts a feature quantity of the characteristic portion from the image of the characteristic portion detected by the characteristic portion detection unit; a similarity determination processing unit that determines the similarity between the feature quantity of the person's characteristic portion extracted by the feature quantity extraction unit and the feature quantity of the characteristic portion of the person indicated by the person information; and an age/sex estimation processing unit that estimates the age and/or sex of the person detected by the person detection unit by comparing the feature quantity of the person's characteristic portion extracted by the feature quantity extraction unit with the person information.

This makes it possible to easily and efficiently specify what kind of person the person detected by the person detection unit is.

In this case, the characteristic portion may be at least the face or appearance of the person detected by the person detection unit. As a result, when the camera is installed in a building such as a house, it is possible to easily detect a visitor to the building or the movement of a resident entering or leaving the building. The appearance includes, for example, the front or side view of the visitor and the back view of a resident leaving the building.

In addition, the characteristic portion may include the person's clothes, the pattern, color or design of the clothes, or a mark attached to the clothes. Thereby, when the camera is installed in a building such as a house, it is possible to easily identify what kind of person a visitor to the building or a person entering or leaving the building is.

The camera may further include a first response determination unit that determines a response to the person according to the identification result when the person identification unit identifies the person.

Thereby, it is possible to take an appropriate response according to the profile of the person.

Specifically, in accordance with the determination result of the first response determination unit, the security system executes the following first to third responses for the person.

The first response executes a dialogue between the person and the camera, in which the artificial intelligence function mounted on the camera handles the person using only the camera's internal processing.

That is, the camera further has: a scenario registration unit in which a plurality of scenarios are registered; a first dialogue engine unit that, when the first response determination unit decides based on the identification result to execute a dialogue with the person, selects a scenario corresponding to the dialogue from the scenario registration unit; a first speaker that outputs the scenario selected by the first dialogue engine unit as a voice to the outside; and a second microphone that detects the person's voice.

This allows the camera to easily interact with the person according to the scenario. In addition, since the camera has the first dialogue engine unit, which is a function of artificial intelligence, even if the amount of image and voice data increases due to the execution of the dialogue, the first dialogue engine unit can process it easily and promptly. Furthermore, since the first dialogue engine unit performs dialogue learning through dialogue with the person, it is possible to improve various functions of the artificial intelligence, including the dialogue function.

In this case, the camera may further include a related information registration unit in which information on the installation location of the camera, information on the periphery of the installation location, and/or information on a predetermined person is registered as related information. Then, when the first response determination unit decides, based on the identification result, to execute a dialogue with the person specified by the person specifying unit, the first dialogue engine unit may extract the related information from the related information registration unit and select from the scenario registration unit a scenario corresponding to the extracted related information and the dialogue.

Thereby, referring also to the related information, a more appropriate scenario can be selected, and the accuracy of the dialogue with the person can be improved. In addition, when the camera is installed in a building such as a house and a visitor to the building is known in advance, if the visitor's related information is registered in the related information registration unit, an appropriate dialogue can be performed with the visitor when the visitor visits the building.

In this case, the camera may further include a related information collection unit that collects the related information from the outside and registers the collected related information in the related information registration unit. Thus, since the latest related information is registered in the related information registration unit, it is possible to further improve the accuracy of the dialogue with the person.

Further, when the first response determination unit decides, based on the identification result, to execute a dialogue with the person specified by the person specifying unit, the first dialogue engine unit may extract from the related information registration unit the related information collected by the related information collection unit at the time when the imaging unit captured the person, and select from the scenario registration unit a scenario corresponding to the extracted related information and the dialogue. This further improves the accuracy of the dialogue with the person.

Furthermore, when related information corresponding to the person specified by the person specifying unit is not registered in the related information registration unit, the first dialogue engine unit may extract other related information registered in the related information registration unit and select from the scenario registration unit a scenario corresponding to the extracted other related information and the dialogue, and/or notify the outside. Thereby, when the camera is installed in a building such as a house, even if the visitor to the building does not match the visitor indicated by the related information registered in the related information registration unit, the dialogue with the visitor can be carried out using the other related information. It also becomes possible to notify a resident who is away from the building of the visitor's visit.

The second response notifies the holder of an external portable device (for example, a resident of the house where the camera is installed) of the presence of the person (for example, a visitor's visit) from the camera.

That is, the security system further has a portable device connectable to the camera via a communication line. In this case, the camera further has a first communication unit capable of communicating with at least the portable device via the communication line. When the first response determination unit decides, based on the identification result, to notify the portable device of the presence of the person, the first communication unit notifies the portable device of the presence of the person via the communication line.

As a result, the holder of the portable device who is away from the installation location of the camera can receive the notification in real time even when not at the installation location. Thereby, the security system can provide appropriate services to the holder.

Also, in the second response, after the holder of the portable device receives the notification, it is possible to interact with the person.

Specifically, the portable device has a second communication unit capable of communicating with at least the camera via the communication line, a display unit, an operation unit, a second speaker, and a third microphone.

Here, when the second communication unit receives the notification, the display unit displays the notification and/or the second speaker outputs the notification as a sound. When the holder of the portable device then operates the operation unit and instructs the start of a dialogue with the person, the second communication unit transmits, via the communication line, an instruction signal instructing the start of the dialogue to the first communication unit.

In this case, when the first communication unit receives the instruction signal, it transmits the image of the person captured by the imaging unit and the voice data of the person detected by the second microphone to the second communication unit via the communication line.

Thereby, the display unit displays the image of the person received by the second communication unit, the second speaker outputs the voice data of the person received by the second communication unit as sound, and the second communication unit transmits the voice data of the holder detected by the third microphone to the first communication unit via the communication line. As a result, the first speaker outputs the voice data of the holder received by the first communication unit as sound.

In this way, a dialogue is started between the person on the camera side and the holder of the portable device, so the holder can receive an appropriate service (dialogue service) in real time even when not at the installation location of the camera. Thereby, the convenience of the security system can be further improved. Even in this case, the first dialogue engine unit can perform dialogue learning through the dialogue with the person and improve various functions of the artificial intelligence, including the dialogue function.

Then, in the first response or the second response, when the dialogue with the person ends, the camera performs the following processing.

That is, the camera further has a data storage unit in which at least the image of the person captured by the imaging unit during the dialogue and the voice data of the person detected by the second microphone are recorded after the end of the dialogue with the person. In this case, after the end of the dialogue, the first dialogue engine unit creates a new scenario based on the contents of the dialogue with the person, and registers the created new scenario in the scenario registration unit.

As a result, it is possible to confirm at a later date what kind of dialogue was performed with the person, and the convenience of the security system can be further improved. Further, since artificial intelligence is mounted on the camera, the first dialogue engine unit can process the data appropriately and promptly even if the amount of data stored in the data storage unit increases dramatically, and can create the new scenarios and register them in the scenario registration unit. Moreover, since the first dialogue engine unit creates the new scenarios by dialogue learning through dialogue with the person, it becomes easier to select an optimal scenario each time a dialogue with a person is executed. That is, since a new scenario is created and registered in the scenario registration unit each time a dialogue with a person takes place, the accuracy of the artificial intelligence function mounted on the camera improves, and dialogue can be executed with an optimal scenario selected.

The third response applies when the person needs to be specified in detail, for example, when it is difficult to specify the person with the artificial intelligence function on the camera side; it inquires of (requests) an external server for detailed identification of the person.

Specifically, the security system further has: a server connectable to the camera via a communication line; a peripheral information database connected to the server, in which peripheral information including information on the installation location of the camera and information on the periphery of the installation location is registered; and a personal information database connected to the server, in which personal information of predetermined persons is registered.

In this case, when the first response determination unit decides, based on the identification result, to inquire of the server for detailed identification of the person detected by the person detection unit, it queries the server for the detailed identification of the person via the communication line.

On the other hand, the server has: a third communication unit that receives the content of the inquiry from the first response determination unit; an inquiry content determination unit that judges the content of the inquiry received by the third communication unit; and a data matching unit that, based on the judgment result of the inquiry content determination unit, acquires the peripheral information from the peripheral information database and the personal information corresponding to the person indicated by the inquiry content from the personal information database via the third communication unit, and specifies the person in detail by matching the acquired peripheral information and personal information against the content of the inquiry.

Thus, the peripheral information database and the personal information database are provided on the server side, where the amount of data available for identifying the person is far greater than at the camera. Therefore, by matching the peripheral information and the personal information against the content of the inquiry, the server can specify the person's profile and the like in detail. Further, by installing artificial intelligence in the server and making it function as the inquiry content determination unit and the data matching unit, a service tailored to the person can be provided on the server side as a local area. In this case, too, even if the resident of the house where the camera is installed is absent, the server can take an appropriate response to the person. As a result, the convenience of the security system can be improved. Furthermore, by matching the data, it is possible to easily associate the person with content such as the peripheral information and the personal information.

In addition, the security system may further have a scenario database connected to the server, in which a plurality of scenarios are registered. In this case, the server further has: a second response determination unit that, when the data matching unit has specified the person indicated by the inquiry content in detail, judges the response to the person according to the identification result; and a second dialogue engine unit that, when the second response determination unit decides based on the identification result to execute a dialogue with the person, selects a scenario corresponding to the dialogue from the scenario database via the third communication unit. The camera further has a second speaker that outputs the scenario selected by the second dialogue engine unit as a voice to the outside, and a third microphone that detects the person's voice.

Thus, the camera can easily and accurately interact with the person according to the scenario selected on the server side. In addition, since the server has the second dialogue engine unit, which is a function of artificial intelligence, even if the amount of image and voice data increases due to the execution of the dialogue, the second dialogue engine unit can process it easily and promptly. Furthermore, the second dialogue engine unit can perform dialogue learning through dialogue with the person and improve various functions of the artificial intelligence, including the dialogue function.

Further, the security system further has a dialogue history database connected to the server, in which at least the image of the person captured by the imaging unit during the dialogue and the voice data of the person detected by the third microphone are recorded as a dialogue history after the end of the dialogue with the person. In this case, after the end of the dialogue, the second dialogue engine unit creates a new scenario using the dialogue history recorded in the dialogue history database, and registers the created new scenario in the scenario database.

As a result, it is possible to confirm at a later date what kind of dialogue was performed with the person, and the convenience of the security system can be further improved. Further, since artificial intelligence is mounted on the server, the second dialogue engine unit can process the data appropriately and promptly and create the new scenarios even if the amount of data accumulated in the dialogue history database increases dramatically. Moreover, since the new scenarios are created using the dialogue history accumulated in the dialogue history database, the second dialogue engine unit can easily select an optimal and up-to-date scenario when interacting with the person. That is, since the second dialogue engine unit creates the new scenarios by dialogue learning through dialogue with the person, it becomes easier to select an optimal scenario each time a dialogue with the person is executed. Thus, also in the third response, a new scenario is created and registered in the scenario database each time a dialogue with a person takes place, so the accuracy of the artificial intelligence function installed in the server improves, and dialogue can be executed with an optimal scenario selected.

FIG. 1 is a block diagram of the security system according to the present embodiment.
FIG. 2 is a detailed block diagram of the camera of FIG. 1.
FIG. 3 is a detailed block diagram of the mobile device and the server of FIG. 1.
FIGS. 4 to 7 are flowcharts showing the operation of the security system of FIG. 1.

 Preferred embodiments of the present invention will be described below in detail with reference to the accompanying drawings.

[1. Configuration of the Present Embodiment]
 The configuration of the security system 10 according to the present embodiment will be described with reference to FIGS. 1 to 3.

 The security system 10 is applied to services such as monitoring visitors (persons) who come to a monitored object such as a building including a house, land, or an office, monitoring persons who enter and leave the monitored object (for example, residents entering and leaving a house), and guarding the monitored object.

 The security system 10 includes, for example, a camera 12 that is installed at the entrance of the monitored object and captures images of the area around the installation location, including visitors and persons entering and leaving the monitored object; a server 16 connected to the camera 12 via a communication line 14; and a mobile device 18 such as a smartphone that is owned by the owner of the monitored object (for example, a resident of the house) and is connected to the camera 12 and the server 16 via the communication line 14.

 Here, as an example, a case will be described in which the monitored object is a house and a visitor (person) comes to the entrance of the house from the outside, or a resident (person) other than the owner of the mobile device 18 enters or leaves the house. Accordingly, the persons to be imaged by the camera 12 are not limited to outsiders visiting the monitored object, but also include persons related to the monitored object who do not carry the mobile device 18 (for example, family members of the resident who owns the mobile device 18). Furthermore, FIG. 1 illustrates the case where the entrance of the house is monitored while signals can be transmitted and received by wireless communication among the camera 12, the server 16, and the mobile device 18 via the communication line 14.

 The server 16 is a cloud server of a service provider for monitoring, security, or the like. The cloud 20 containing the server 16 also contains a peripheral information database 22 (hereinafter also referred to as the peripheral information DB 22) in which the installation location of the camera 12 and peripheral information about the surroundings of the installation location are accumulated; a personal information database 24 (hereinafter also referred to as the personal information DB 24) in which personal information on a plurality of persons including visitors is accumulated; a scenario database 26 (hereinafter also referred to as the scenario DB 26) in which scenarios used for dialogues between visitors and the camera 12 are accumulated; and a history database 28 (hereinafter also referred to as the history DB 28), serving as a dialogue history database, in which the history of dialogues between visitors and the camera 12 is recorded.

 The peripheral information comprises the installation location of the camera 12 and weather information, traffic information, crime prevention information, and the like concerning the surroundings of the installation location, and is accumulated in the peripheral information DB 22 in association with the time at which the information was acquired. Personal information is information that identifies a person such as a visitor or a resident of the house, for example a photograph, name, address, place of work, and work address. The personal information may also include person information, which is information on a predetermined person described later. Note that the server 16 and the DBs 22 to 28 have functions that complement one another within the cloud 20.

 As shown in FIG. 2, the camera 12 includes an imaging unit 12a, a microphone 12b (first microphone, second microphone), a person detection unit 12c, a feature portion detection unit 12d, a person identification unit 12e, a response judgment unit 12f (first response judgment unit), a dialogue engine unit 12g (first dialogue engine unit), a related information collection unit 12h, a communication unit 12i (first communication unit), a speaker 12j (first speaker), a memory 12k, a database 12l (person information registration unit, scenario registration unit, related information registration unit), and a data storage unit 12m. The person identification unit 12e comprises a feature amount extraction unit 12n, a similarity judgment processing unit 12o, and an age/gender estimation processing unit 12p.

 The camera 12 realizes the functions of the person detection unit 12c, the feature portion detection unit 12d, the person identification unit 12e (the feature amount extraction unit 12n, the similarity judgment processing unit 12o, and the age/gender estimation processing unit 12p), the response judgment unit 12f, the dialogue engine unit 12g, and the related information collection unit 12h by reading and executing a program stored in the memory 12k. The camera 12 is a camera equipped with an artificial intelligence function, and the person detection unit 12c, the feature portion detection unit 12d, the person identification unit 12e, the response judgment unit 12f, the dialogue engine unit 12g, and the related information collection unit 12h, all realized by executing the program, bear that artificial intelligence function.

 The imaging unit 12a images the area around the installation location of the camera 12. The microphone 12b detects sounds around the installation location of the camera 12 (for example, the voice of a person imaged by the camera 12). The person detection unit 12c detects a person appearing in an image on the basis of the image captured by the imaging unit 12a and, in addition, the audio data detected by the microphone 12b. The feature portion detection unit 12d detects a feature portion of the person from the image of the person detected by the person detection unit 12c.

 A feature portion of a person is at least the face or the appearance of the person detected by the person detection unit 12c. The appearance includes the figure of the person seen from the front, obliquely, or from the side, as well as the person's back view. The feature portion may also include the clothing of the person detected by the person detection unit 12c, the pattern, color, or design of that clothing, or a mark attached to the clothing. For example, if the person belongs to a business that visits the installation location of the camera 12 (for example, a delivery person or sales representative of a delivery company, a courier service, or a postal operator), the characteristic color or pattern of the uniform or cap worn by that business, the business's mark (trademark) attached to the uniform or cap, and the like become the feature portion of the person. These feature portions are only examples; any portion of a person appearing in an image captured by the camera 12 that allows the person to be identified may serve as the person's feature portion.

 The person identification unit 12e identifies the person detected by the person detection unit 12c on the basis of the image of the feature portion detected by the feature portion detection unit 12d. Specifically, the feature amount extraction unit 12n extracts the feature amount of the feature portion from the image of the feature portion detected by the feature portion detection unit 12d. The similarity judgment processing unit 12o judges the degree of similarity between the extracted feature amount and the feature amount of the feature portion of a person included in the person information on predetermined persons registered in the database 12l (hereinafter also referred to as the DB 12l). The age/gender estimation processing unit 12p estimates the age and/or gender of the person by comparing the feature amount of the feature portion with the person information. Person information is information for identifying a predetermined person, and includes the person's name, age, gender, images of feature portions, feature amounts, and the like.
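 The present disclosure does not prescribe any particular recognition algorithm, so the following is only a minimal sketch of how the feature amount extraction, similarity judgment, and thresholding described above could fit together; the record layout, the `THRESHOLD` value, and the choice of cosine similarity are all assumptions, not part of the patent.

```python
import numpy as np

# Hypothetical person records as they might sit in the database 12l:
# each entry pairs identifying attributes with a stored feature vector.
PERSON_DB = [
    {"name": "resident_a", "age": 45, "gender": "F",
     "feature": np.ones(128) / np.sqrt(128)},
]

THRESHOLD = 0.8  # assumed similarity cutoff for a confident identification

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # One possible "degree of similarity" between two feature amounts.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify(face_feature: np.ndarray):
    """Return the best-matching registered person, or None when no record
    clears the threshold (the case later escalated to the server)."""
    scored = [(cosine_similarity(face_feature, p["feature"]), p) for p in PERSON_DB]
    best_score, best_person = max(scored, key=lambda s: s[0])
    return best_person if best_score >= THRESHOLD else None
```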

 When the person identification unit 12e identifies a person, the response judgment unit 12f judges the response to that person according to the identification result.

 When the response judgment unit 12f decides, based on the identification result of the person identification unit 12e, to execute a dialogue with the person, the dialogue engine unit 12g selects a scenario suited to that dialogue from among the plurality of scenarios registered in the DB 12l. A scenario is a predetermined fixed phrase indicating the content of the conversation to be addressed from the camera 12 to the person when the camera 12 and the person converse. In the DB 12l, the plurality of scenarios are preferably classified and registered under predetermined categories according to the type of person, type of dialogue, and the like.

 The related information collection unit 12h collects information on the installation location of the camera 12, information on the surroundings of the installation location, and/or information on predetermined persons (the person information described above) as related information from the outside via the communication line 14 and the communication unit 12i, and registers the collected related information in the DB 12l. For example, the related information collection unit 12h collects the peripheral information accumulated in the peripheral information DB 22 and the personal information accumulated in the personal information DB 24 as related information via the communication line 14 and the communication unit 12i. The related information collection unit 12h preferably registers the related information in the DB 12l in association with the time at which it was collected.
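 Purely as a sketch (no data format is specified in the text), the time-keyed registration performed by the related information collection unit 12h might look like the following; the dictionary layout and the stub `fetch_*` helpers are hypothetical stand-ins for queries to the cloud-side DBs.

```python
import time

def fetch_peripheral_info() -> dict:
    # Stub standing in for a query to the peripheral information DB 22.
    return {"weather": "rain", "traffic": "normal", "crime_alerts": []}

def fetch_personal_info() -> list:
    # Stub standing in for a query to the personal information DB 24.
    return [{"name": "resident_a", "address": "..."}]

def collect_related_info(local_db: list) -> None:
    """Collect peripheral and person information from the cloud side and
    register it locally, tied to the collection time as the text suggests."""
    local_db.append({
        "collected_at": time.time(),
        "peripheral": fetch_peripheral_info(),
        "personal": fetch_personal_info(),
    })

db_12l: list = []
collect_related_info(db_12l)
```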

 The communication unit 12i transmits and receives signals by wireless communication to and from the mobile device 18 and/or the server 16 via the communication line 14. The speaker 12j outputs the fixed phrase of the scenario selected by the dialogue engine unit 12g as voice. After a dialogue with a person ends, at least the image of the person captured by the imaging unit 12a during the dialogue and the voice data of the person detected by the microphone 12b are recorded in the data storage unit 12m. After the dialogue ends, the dialogue engine unit 12g also creates a new scenario according to the content of the dialogue and registers the created new scenario in the DB 12l. In this case, the new scenario is sorted into and registered under a predetermined category according to the type of person, type of dialogue, and the like.

 As shown in FIG. 3, the server 16 includes a communication unit 16a (third communication unit), an inquiry content judgment unit 16b, a data matching unit 16c, a response judgment unit 16d (second response judgment unit), a dialogue engine unit 16e (second dialogue engine unit), and a memory 16f. Meanwhile, the mobile device 18 includes a communication unit 18a (second communication unit), a control unit 18b, a memory 18c, a display unit 18d, an operation unit 18e, a speaker 18f (second speaker), and a microphone 18g (third microphone).

 The server 16 realizes the functions of the inquiry content judgment unit 16b, the data matching unit 16c, the response judgment unit 16d, and the dialogue engine unit 16e by reading and executing a program stored in the memory 16f. The server 16 is a computer equipped with an artificial intelligence function, and the inquiry content judgment unit 16b, the data matching unit 16c, the response judgment unit 16d, and the dialogue engine unit 16e, all realized by executing the program, bear that artificial intelligence function.

 The communication unit 16a of the server 16 transmits and receives signals by wireless communication to and from the camera 12 and/or the mobile device 18 via the communication line 14. When the communication unit 16a receives inquiry contents from the response judgment unit 12f of the camera 12, the inquiry content judgment unit 16b judges what kind of inquiry contents have been received. These inquiry contents constitute a request for the server 16 to identify in more detail a person whose identification has already been attempted once on the camera 12 side. For example, when the person identification unit 12e attempts to identify a person imaged by the imaging unit 12a but cannot do so accurately, such inquiry contents are transmitted from the camera 12 to the server 16.

 Based on the judgment result of the inquiry content judgment unit 16b, the data matching unit 16c acquires, via the communication unit 16a, the peripheral information of the camera 12 from the peripheral information DB 22 and the personal information corresponding to the person from the personal information DB 24, and identifies the person in detail by matching the acquired peripheral information and personal information against the inquiry contents.

 When the data matching unit 16c identifies a person in detail, the response judgment unit 16d judges the response to the person according to the identification result. When the response judgment unit 16d decides, based on the identification result of the data matching unit 16c, to execute a dialogue with the person, the dialogue engine unit 16e selects a scenario suited to the dialogue from the scenario DB 26 via the communication unit 16a. The scenarios accumulated in the scenario DB 26 are of the same kind as those accumulated in the DB 12l of the camera 12; note, however, that the amount of scenario data accumulated in the scenario DB 26 is far larger than that accumulated in the DB 12l. In the scenario DB 26 as well, each scenario is preferably classified and registered under a predetermined category according to the type of person, type of dialogue, and the like.

 Meanwhile, the communication unit 18a of the mobile device 18 transmits and receives signals by wireless communication to and from the camera 12 and/or the server 16 via the communication line 14. The control unit 18b controls each unit in the mobile device 18 by reading and executing a program stored in the memory 18c. The display unit 18d is, for example, the display portion of a touch panel, and displays desired information. The operation unit 18e is, for example, the soft keys of the touch panel; by operating the operation unit 18e, the owner of the mobile device 18 can give desired instructions to the mobile device 18. The speaker 18f outputs audio data to the outside as sound. The microphone 18g detects the voice of the owner of the mobile device 18.

[2. Operation of the Present Embodiment]
 Next, the operation of the security system 10 according to the present embodiment will be described with reference to FIGS. 4 to 7, referring also to FIGS. 1 to 3 as necessary. Three examples (first to third examples) and modifications of the first to third examples will be described in order below.

<2.1 First Example>
 The first example (first response) will be described with reference to FIG. 4. In the first example, the person identification unit 12e identifies a person from the image captured by the imaging unit 12a, and a dialogue is conducted between the camera 12 and the person based on the identification result. Accordingly, in the first example all processing is performed within the camera 12, and no information about the person is transmitted or received between the camera 12 and the mobile device 18 and/or the server 16 via the communication line 14. In the first example of FIG. 4, a case where a visitor, as the person, comes to the house will be described as an example.

 In step S1 of FIG. 4, when a visitor comes to the entrance of the house where the camera 12 is installed, the imaging unit 12a of the camera 12 images the area around the installation location of the camera 12, including the visitor. In this case, the camera 12 may image the surroundings of the installation location even before the visitor arrives, or may remain in a standby state until the visitor arrives and start imaging when the visitor arrives. The microphone 12b also detects sounds around the installation location of the camera 12, including the voice of the visitor.

 The person detection unit 12c detects the visitor based on the image captured by the imaging unit 12a. Specifically, the person detection unit 12c detects the visitor by the following methods.

 When the imaging unit 12a images the surroundings of the camera 12 continuously or at predetermined time intervals, that is, when the imaging unit 12a is shooting video or capturing intermittent still images, the person detection unit 12c detects the visitor by detecting, in the images captured continuously or at predetermined time intervals, a moving object appearing in the images, a change in the noise of the images, or a change in the brightness of the images.

 That is, when the resident of the house (the owner of the mobile device 18) is out and the house is unoccupied, no movement should appear in the images captured by the camera 12; if a moving object is nevertheless detected, it can be judged that a visitor has arrived.

 Also, when another resident (a family member) of the house comes home and turns on the lighting in the house, the brightness of the image captured by the camera 12 changes. In this case as well, it can be judged that a visitor, namely the other resident, has arrived.

 Furthermore, in addition to the images captured by the camera 12, the person detection unit 12c can detect the visitor from the sounds detected by the microphone 12b (for example, the visitor's voice or footsteps).
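 A minimal sketch of the brightness-change detection described above, assuming grayscale frames held as NumPy arrays; the two thresholds are illustrative only, and a real deployment would combine this with the moving-object and audio cues also mentioned in the text.

```python
import numpy as np

def luminance_change_detected(prev_frame: np.ndarray, frame: np.ndarray,
                              pixel_thresh: int = 25,
                              area_thresh: float = 0.01) -> bool:
    """Flag a possible visitor when the fraction of pixels whose brightness
    changed between two consecutive frames exceeds a small area threshold."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    changed_fraction = float((diff > pixel_thresh).mean())
    return changed_fraction > area_thresh

# Example: an all-dark frame followed by a lit frame (lights switched on).
dark = np.zeros((240, 320), dtype=np.uint8)
lit = np.full((240, 320), 200, dtype=np.uint8)
print(luminance_change_detected(dark, lit))  # True
```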

 In the next step S2, when the person detection unit 12c detects the visitor, the feature portion detection unit 12d detects the feature portion of the visitor from the image of the visitor. The following description covers the case where the feature portion detection unit 12d detects the visitor's face as the feature portion.

 In step S3, the feature amount extraction unit 12n of the person identification unit 12e extracts the feature amount of the face from the image of the visitor's face detected by the feature portion detection unit 12d.

 In step S4, the similarity judgment processing unit 12o of the person identification unit 12e judges the degree of similarity between the facial feature amount extracted by the feature amount extraction unit 12n and the facial feature amounts included in the plural pieces of person information accumulated in the DB 12l. This makes it possible to identify a person who matches the visitor being judged.

 In step S5, the age/gender estimation processing unit 12p of the person identification unit 12e estimates the age and/or gender of the visitor by comparing the facial feature amount extracted by the feature amount extraction unit 12n with the plural pieces of person information accumulated in the DB 12l. This makes it possible to determine the age and/or gender of the visitor being judged.

 By performing the processing of steps S3 to S5 in this order, the person identification unit 12e can identify the visitor.

 In the next step S6, the response judgment unit 12f receives the identification result of the visitor from the person identification unit 12e and determines whether a dialogue (conversation) with the visitor should be started.

 In this case, when the person identification unit 12e has accurately identified the visitor, the response judgment unit 12f judges that a dialogue with the visitor is possible and decides to start the dialogue with the visitor (step S6: YES), and the processing proceeds to the next step S7.

 On the other hand, when the person identification unit 12e could not identify the visitor, or could identify the visitor but not with sufficient accuracy, the response judgment unit 12f judges that a dialogue with the visitor is impossible (step S6: NO), the processing returns to step S1, and steps S1 to S6 are performed again.

 In step S7, upon receiving the decision of the response judgment unit 12f to execute the dialogue with the visitor, the dialogue engine unit 12g selects the scenario most suitable for the dialogue from the DB 12l.

 In this case, the related information collection unit 12h has, for example, collected peripheral information from the peripheral information DB 22 and personal information from the personal information DB 24 via the communication unit 12i and the communication line 14, and registered the collected peripheral information and personal information in the DB 12l as related information including person information, associated with the collection time. The dialogue engine unit 12g therefore extracts from the DB 12l the related information collected at the time when the imaging unit 12a captured the image of the visitor with whom a dialogue was judged possible, and selects from the DB 12l the scenario corresponding to the extracted related information and the decision of the response judgment unit 12f. For example, the dialogue engine unit 12g selects the optimal scenario from the DB 12l taking into account the installation location of the camera 12 included in the related information as well as the weather information, traffic information, and crime prevention information for the surroundings of the installation location.

 In step S8, the dialogue engine unit 12g outputs the fixed phrase indicated by the scenario from the speaker 12j to the outside as voice. For example, if the visitor is a delivery person for a business such as a delivery company, a courier service, or a postal operator, a fixed phrase such as "Thank you for your trouble; please put it in the post box." or "We are away, so please redeliver." is output from the speaker 12j. That is, such scenarios are accumulated in the DB 12l; the dialogue engine unit 12g selects in step S7 the scenario most suitable for the dialogue with the delivery business, and in step S8 outputs the fixed phrase of the selected scenario from the speaker 12j as voice.
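 As an illustrative sketch of steps S7 and S8 only, the category-keyed selection of a fixed phrase could be pictured as a simple lookup; the category names, context keys, and fallback rule below are assumptions, not taken from the disclosure.

```python
# Hypothetical category-keyed scenario table in the spirit of the DB 12l.
SCENARIOS = {
    ("courier", "resident_absent"): "We are away, so please redeliver.",
    ("courier", "resident_home"): "Thank you for your trouble; please put it in the post box.",
    ("unknown", "any"): "May I ask who is visiting?",
}

def select_scenario(visitor_category: str, context: str) -> str:
    """Step S7 in miniature: pick the fixed phrase for the identified
    visitor and context, falling back to a generic greeting."""
    return SCENARIOS.get((visitor_category, context),
                         SCENARIOS[("unknown", "any")])

# Step S8: the chosen phrase would be handed to the speaker 12j.
print(select_scenario("courier", "resident_absent"))
```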

 In step S9, the response judgment unit 12f judges whether to end the dialogue with the visitor. If the dialogue is to be continued, the processing returns to step S1 and steps S1 to S9 are executed again. By repeatedly executing the processing of steps S1 to S9 in this way, the camera 12 can perform video shooting or intermittent still image shooting of the visitor. Accordingly, in step S2 the feature portion detection unit 12d can detect the visitor's face from the image of the visitor and track the face each time the person detection unit 12c detects the visitor (each time the processing of step S1 is executed). Also, since the processing of steps S1 to S9 is repeated, the response judgment unit 12f makes an affirmative judgment in step S6 and the execution of the dialogue continues.

 Then, in step S9, when the response judgment unit 12f decides to end the dialogue with the visitor (step S9: YES), the processing proceeds to the next step S10.

 In step S10, the dialogue engine unit 12g records in the data storage unit 12m the images (of the visitor) captured by the imaging unit 12a during the dialogue and the sounds (the visitor's voice data) detected by the microphone 12b. In step S11, the dialogue engine unit 12g also creates a new scenario according to the content of the dialogue with the visitor and registers the created new scenario in the DB 12l. In this case, the dialogue engine unit 12g may create the new scenario by correcting or modifying the scenario used for the dialogue.

 As described above, in the first example the processing of steps S1 to S11 is executed each time a visitor arrives, so the dialogue engine unit 12g performs dialogue learning through the dialogues with visitors, creates new scenarios, and registers them in the DB 12l. Scenarios thus accumulate in the DB 12l each time a dialogue is executed, keeping the scenarios in the DB 12l up to date. As a result, when interacting with a visitor, the dialogue engine unit 12g can select the latest and most suitable scenario from the DB 12l and use the selected scenario to improve the accuracy of the dialogue.
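 The text leaves open how a "new scenario" is derived, so the following only sketches the bookkeeping side of step S11 and of the recency-based selection mentioned just below: timestamped registration under a category, plus a query limited to a recent horizon. All names and the one-year default are assumptions.

```python
from datetime import datetime, timedelta

scenario_db: dict = {}  # category -> list of {"phrase", "registered_at"}

def register_new_scenario(category: tuple, phrase: str) -> None:
    """Register a newly created scenario under its category with a timestamp."""
    scenario_db.setdefault(category, []).append(
        {"phrase": phrase, "registered_at": datetime.now()})

def recent_scenarios(category: tuple, horizon_days: int = 365) -> list:
    """Return only the scenarios registered within the horizon (e.g. the
    past year), matching the selection policy described for step S7."""
    cutoff = datetime.now() - timedelta(days=horizon_days)
    return [s for s in scenario_db.get(category, [])
            if s["registered_at"] >= cutoff]
```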

 In step S7, the dialogue engine unit 12g may also select the optimal scenario from among the most recent scenarios registered in the DB 12l (for example, those from the past year). That is, since new scenarios are successively registered in the DB 12l over time through the dialogue learning described above, selecting the most recent scenario for the same or a similar dialogue makes it possible to conduct an accurate dialogue.

 The images and sounds of the dialogue recorded in the data storage unit 12m in step S10 can be deleted from the data storage unit 12m based on, for example, an input operation on the operation unit 18e by the owner of the mobile device 18.

<2.2 Second Example>
 The second example (third response) will be described with reference to FIG. 5. In the second example, the server 16 performs a more detailed identification of the visitor, and a dialogue is conducted between the camera 12 and the visitor based on the identification result. Accordingly, note that in the second example the processing is performed mainly by the camera 12 and the server 16. In the second example as well, as in the first example of FIG. 4, a case will be described in which a visitor comes to the building from the outside and the visitor is identified based on the visitor's face.

 After the processing of step S5 in FIG. 4, the response judgment unit 12f of the camera 12 proceeds to step S12 in FIG. 5.

 In step S12, the response judgment unit 12f determines, based on the identification result of the person identification unit 12e, whether the server 16 should be asked (requested) to identify the visitor in detail.

 If the person identification unit 12e was able to accurately identify the visitor from the image of the visitor's face using the visitor image captured by the imaging unit 12a, the response judgment unit 12f judges that no inquiry to the server 16 is necessary (step S12: NO), and the camera 12 executes the processing from step S6 in FIG. 4 onward.

 On the other hand, if the person identification unit 12e cannot accurately identify the visitor from the image of the visitor's face even using the visitor image captured by the imaging unit 12a, or if the visitor's face can be recognized but the visitor does not correspond to any of the plural pieces of person information accumulated in the DB 12l (that is, the visitor is visiting for the first time), the response judgment unit 12f judges that it is difficult to identify the visitor on the camera 12 side and decides to ask the server 16 for a detailed identification of the visitor (step S12: YES).

 In the next step S13, the response judgment unit 12f transmits inquiry contents requesting a detailed identification of the visitor to the server 16 via the communication unit 12i and the communication line 14. In this case, the inquiry contents include an instruction requesting the detailed identification of the visitor, the judgment result of the response judgment unit 12f, the identification result of the person identification unit 12e, the image of the visitor used in the processing of the person identification unit 12e (the image of the visitor's face), and the like.
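 A minimal sketch of the shape these inquiry contents could take as a message structure, mirroring the four items listed above; the class and field names are illustrative, not taken from the disclosure.

```python
from dataclasses import dataclass, field

@dataclass
class Inquiry:
    """Hypothetical shape of the inquiry contents sent in step S13."""
    request: str                 # e.g. "identify_visitor_in_detail"
    judgment_result: str         # result from the response judgment unit 12f
    identification_result: dict  # partial result from the person identification unit 12e
    face_image: bytes = field(default=b"", repr=False)  # image used in the attempt

msg = Inquiry(request="identify_visitor_in_detail",
              judgment_result="low_confidence",
              identification_result={"age_estimate": 30, "gender_estimate": "M"})
```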

 When the communication unit 16a of the server 16 receives the inquiry contents from the communication unit 12i of the camera 12 via the communication line 14, it outputs the received inquiry contents to the inquiry content judgment unit 16b.

 In step S14, the inquiry content judgment unit 16b confirms (judges) the inquiry contents input from the communication unit 16a. In this case, since the input inquiry contents request a detailed identification of the visitor, the inquiry content judgment unit 16b outputs the inquiry contents to the data matching unit 16c. If the information input from the communication unit 16a is information other than inquiry contents, the inquiry content judgment unit 16b does not output it to the data matching unit 16c.

 In step S15, the data matching unit 16c can recognize, from the inquiry contents input by the inquiry content judgment unit 16b, that the camera 12 has requested a detailed identification of the visitor. After confirming the input inquiry contents, the data matching unit 16c acquires, via the communication unit 16a, the peripheral information about the surroundings of the installation location of the camera 12 from the peripheral information DB 22 and the personal information corresponding to the visitor from the personal information DB 24. The data matching unit 16c then identifies the visitor in detail by matching the acquired peripheral information and personal information against the inquiry contents.

 In this case, the data matching unit 16c has the same functions as the person identification unit 12e of the camera 12 (its feature amount extraction unit 12n, similarity judgment processing unit 12o, and age/gender estimation processing unit 12p). The data matching unit 16c therefore performs the same processing as the person identification unit 12e, using the image of the visitor's face and the like included in the inquiry contents. However, the peripheral information DB 22 and the personal information DB 24 hold a far larger amount of data than the DB 12l of the camera 12, including information on new visitors who could not be identified on the camera 12 side. Using the information accumulated in these DBs 22 and 24, the data matching unit 16c can therefore identify the visitor in more detail.
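 Sketched under the same assumptions as the camera-side example earlier, the data matching unit 16c can be pictured as the same similarity search run over the much larger cloud-side personal information DB 24; the threshold value and record layout remain hypothetical.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_visitor(face_feature: np.ndarray, personal_db: list,
                  threshold: float = 0.7):
    """Sketch of the data matching unit 16c: the same feature matching as
    on the camera side, but over the far larger personal information DB 24,
    which also holds records the camera's local DB 12l lacks."""
    best_score, best = 0.0, None
    for person in personal_db:
        score = cosine(face_feature, person["feature"])
        if score > best_score:
            best_score, best = score, person
    return best if best_score >= threshold else None
```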

 After the detailed identification of the visitor has been performed in this way, the processing proceeds to step S6 in FIG. 4. The processing of step S6 is executed by the response judgment unit 16d of the server 16.

 That is, in step S6 the response judgment unit 16d of the server 16 determines, based on the detailed identification result for the visitor from the data matching unit 16c, whether a dialogue (conversation) with the visitor should be started. As described above, since the data matching unit 16c has identified the visitor in detail, the response judgment unit 16d judges that a dialogue with the visitor is possible, decides to start the dialogue with the visitor (step S6: YES), and the processing proceeds to the next step S7.

 In step S7, upon receiving the decision of the response judgment unit 16d to execute the dialogue with the visitor, the dialogue engine unit 16e of the server 16 selects and acquires the scenario most suitable for the dialogue from the scenario DB 26 via the communication unit 16a. In this case, far more scenarios are accumulated in the scenario DB 26 than in the DB 12l of the camera 12, so the dialogue engine unit 16e can reliably select the scenario best suited to the dialogue with the visitor from among the plurality of scenarios accumulated in the scenario DB 26. Also in step S7, the dialogue engine unit 16e can acquire from the peripheral information DB 22 the peripheral information registered at the time the visitor's image was captured, acquire the visitor's personal information from the personal information DB 24, and select the optimal scenario from the scenario DB 26 taking the acquired peripheral information and personal information into account as well.

 In step S8, the dialogue engine unit 16e outputs the fixed phrase indicated by the scenario from the speaker 12j to the outside as voice via the communication unit 16a, the communication line 14, and the communication unit 12i of the camera 12. For example, if the visitor is a person visiting for the first time, fixed phrases such as "May I ask who is visiting?" or "How can I help you?" can be output from the speaker 12j.

 In step S9, the response judgment unit 16d of the server 16 judges whether to end the dialogue with the visitor. When the response judgment unit 16d decides to end the dialogue with the visitor (step S9: YES), in the next step S10 the dialogue engine unit 16e of the server 16 records, via the communication unit 16a, the images (of the visitor) captured by the imaging unit 12a during the dialogue and the sounds (the visitor's voice data) detected by the microphone 12b in the history DB 28 as a dialogue history. In step S11, the dialogue engine unit 16e of the server 16 also creates a new scenario corresponding to the content of the dialogue with the visitor using the dialogue history, and registers the created new scenario in the scenario DB 26.

 As described above, in the second example the processing of steps S14 and S15 in FIG. 5 and steps S6 to S11 in FIG. 4 is executed in order each time inquiry contents are transmitted from the camera 12 to the server 16. The dialogue engine unit 16e thus performs dialogue learning through the dialogues with visitors, creates new scenarios using the dialogue history in the history DB 28, and registers them in the scenario DB 26. Scenarios accumulate in the scenario DB 26 each time a dialogue is executed, keeping the scenarios in the scenario DB 26 up to date. As a result, when interacting with a visitor, the dialogue engine unit 16e can select the latest and most suitable scenario from the scenario DB 26 and use the selected scenario to improve the accuracy of the dialogue.

 In step S7, the dialogue engine unit 16e may also select the optimal scenario from among the most recent scenarios registered in the scenario DB 26 (for example, those from the past year). That is, since new scenarios are successively registered in the scenario DB 26 over time through the dialogue learning described above, selecting the most recent scenario for the same or a similar dialogue makes it possible to conduct an accurate dialogue.

 In the second example, if the data matching unit 16c cannot identify the visitor in detail in step S6 of FIG. 4, the response judgment unit 16d of the server 16 judges that a dialogue with the visitor is impossible (step S6: NO), the processing returns to step S1, and steps S1 to S5 and S12 to S15 are performed again. If the response judgment unit 16d of the server 16 decides in step S9 of FIG. 4 to continue the dialogue with the visitor (step S9: NO), the processing returns to step S1 and steps S1 to S5, S12 to S15, and S6 to S9 are executed again in order.

 Furthermore, the above description of the second example covers the case where the processing of steps S6, S7, and S9 to S11 in FIG. 4 is all performed on the server 16 side. In the second example, part or all of the processing in these steps may instead be performed on the camera 12 side.

 For example, the processing of steps S14 and S15 in FIG. 5 may be performed on the server 16 side, and in step S6 of FIG. 4 the identification result of the data matching unit 16c may be transmitted from the response judgment unit 16d of the server 16 to the response judgment unit 12f via the communication unit 16a, the communication line 14, and the communication unit 12i of the camera 12, so that the response judgment unit 12f of the camera 12 performs the processing of step S6. In this case, the processing from step S7 onward is executed on the camera 12 side as in the first example.

 Alternatively, the processing of steps S6, S7, and S9 in FIG. 4 may be performed on the camera 12 side, and, upon receiving an affirmative judgment result of step S9 from the response judgment unit 12f on the camera 12 side, the dialogue engine unit 16e of the server 16 may record the dialogue history in the history DB 28 in step S10, create a new scenario using the dialogue history in step S11, and accumulate it in the scenario DB 26.

<2.3 Third Example>
 The third example (second response) will be described with reference to FIG. 6. In the third example, the camera 12 notifies the mobile device 18 of the arrival of a visitor, after which a dialogue is conducted between the visitor and the owner of the mobile device 18. Accordingly, note that in the third example the processing is performed mainly by the camera 12 and the mobile device 18. In the third example as well, as in the first example of FIG. 4, a case will be described in which a visitor comes to the building from the outside and the visitor is identified based on the visitor's face.

 After the processing of step S8 in FIG. 4, the response judgment unit 12f of the camera 12 proceeds to step S16 in FIG. 6.

 In step S16, the response judgment unit 12f determines, based on the identification result of the visitor by the person identification unit 12e, whether the mobile device 18 should be notified of the visitor's arrival.

 When the camera 12 side can handle the visit by itself, as with the visit of a business such as a delivery company, a courier service, or a postal operator, and notification of the visit is therefore judged unnecessary (step S16: NO), the response judgment unit 12f executes the processing of step S9 in FIG. 4.

 On the other hand, when it is decided that the mobile device 18 needs to be notified of the visitor's arrival, such as when the visitor wishes to talk with a resident of the house or is visiting for the first time (step S16: YES), the processing proceeds to the next step S17.

 In step S17, the response judgment unit 12f transmits to the mobile device 18, via the communication unit 12i and the communication line 14, a notification of the visitor's arrival together with inquiry contents for confirming whether to start a dialogue with the visitor (a videophone service).

 In step S18, when the communication unit 18a of the mobile device 18 receives the inquiry contents from the camera 12, the display unit 18d displays the inquiry contents while the speaker 18f outputs a sound indicating them. By viewing the display content of the display unit 18d or listening to the sound from the speaker 18f, the owner of the mobile device 18 can recognize that a visitor has arrived and that the camera 12 has inquired whether to start the videophone service.

 In step S19, the owner of the mobile device 18 decides, based on the display content of the display unit 18d and/or the sound from the speaker 18f, whether to start the videophone service.

 When the videophone service is not to be started, that is, when no dialogue with the visitor is to be conducted (step S19: NO), the owner of the mobile device 18 does not operate the operation unit 18e at all. As a result, no reply is sent from the mobile device 18 to the camera 12 via the communication line 14; if no reply has arrived from the mobile device 18 after a predetermined time has elapsed from the time the inquiry contents were transmitted in step S17, the response judgment unit 12f of the camera 12 judges that the owner has no intention of starting the videophone service, and the processing proceeds to step S9 in FIG. 4.

 Alternatively, when the videophone service is not to be started (step S19: NO), the owner of the mobile device 18 may operate the operation unit 18e and input an instruction not to start the videophone service. The communication unit 18a then transmits an instruction signal corresponding to the input instruction to the communication unit 12i of the camera 12 via the communication line 14. Based on the instruction signal received by the communication unit 12i, the response judgment unit 12f of the camera 12 recognizes that an instruction not to start the videophone service has been given, and the processing proceeds to step S9 in FIG. 4.

 On the other hand, when the owner wishes to start the videophone service (step S19: YES), the owner of the mobile device 18 operates the operation unit 18e and instructs the start of the videophone service. The communication unit 18a then transmits an instruction signal corresponding to the input instruction to the communication unit 12i of the camera 12 via the communication line 14.

 In step S20, the response judgment unit 12f of the camera 12 recognizes, based on the instruction signal received by the communication unit 12i, that the start of the videophone service has been instructed, and decides to start the videophone service.

 Upon receiving the decision of the response judgment unit 12f, the dialogue engine unit 12g transmits the image of the visitor captured by the imaging unit 12a and the visitor's voice data detected by the microphone 12b to (the communication unit 18a of) the mobile device 18 via the communication unit 12i and the communication line 14.

 携帯機器18の表示部18dは、通信部18aが受信した訪問者の画像を表示すると共に、スピーカ18fは、通信部18aが受信した音声データを音として出力する。携帯機器18の所持者は、表示部18dに表示された訪問者の画像を視認し、且つ、スピーカ18fからの音を聞いた後、訪問者との会話を始める。マイクロホン18gは、所持者の声を検出し、通信部18aは、マイクロホン18gが検出した所持者の声(音声データ)を、通信回線14を介して、カメラ12の通信部12iに送信する。 The display unit 18d of the portable device 18 displays the image of the visitor received by the communication unit 18a, and the speaker 18f outputs the sound data received by the communication unit 18a as a sound. The holder of the portable device 18 visually recognizes the image of the visitor displayed on the display unit 18 d and after hearing the sound from the speaker 18 f, starts a conversation with the visitor. The microphone 18g detects the voice of the holder, and the communication unit 18a transmits the voice (voice data) of the holder detected by the microphone 18g to the communication unit 12i of the camera 12 via the communication line 14.

 カメラ12のスピーカ12jは、通信部12iが受信した所持者の音声データを音として出力する。これにより、訪問者は、スピーカ12jからの音を聞くことにより、所持者(住宅の住人)と会話をすることができる。このようにして、訪問者と携帯機器18の所持者との間で対話(TV電話サービス)が開始される。 The speaker 12j of the camera 12 outputs the sound data of the possessor received by the communication unit 12i as a sound. Thus, the visitor can talk with the owner (resident of the house) by listening to the sound from the speaker 12 j. In this way, a dialogue (TV phone service) is started between the visitor and the holder of the portable device 18.

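Schematically, the camera-side processing during the established videophone service is a relay loop: the visitor's image and voice go out to the portable device 18, and the owner's voice comes back to the speaker 12j. The sketch below assumes hypothetical transport and device objects; the disclosure specifies only what data crosses the communication line 14.

```python
def run_videophone(imaging_unit, microphone, speaker, link):
    """Relay loop for the videophone service (camera 12 side, step S20).

    `link` stands in for the communication units 12i/18a and line 14;
    its methods are assumptions for illustration only.
    """
    while not link.end_requested():              # set by the step S21 end signal
        link.send(frame=imaging_unit.capture(),  # visitor image -> display unit 18d
                  audio=microphone.read())       # visitor voice -> speaker 18f
        owner_audio = link.receive_audio()       # owner voice from microphone 18g
        if owner_audio is not None:
            speaker.play(owner_audio)            # visitor hears the owner via 12j
```
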
In step S21, the response determination unit 12f determines whether to end the videophone service. Specifically, the response determination unit 12f checks whether an instruction signal to end the service, issued when the owner of the portable device 18 operates the operation unit 18e, has been received from the portable device 18 by the communication unit 12i via the communication line 14.

When no such instruction signal has been received (step S21: NO), the response determination unit 12f decides to continue the videophone service. Following this decision, the dialogue engine unit 12g executes the processing of step S20 and the videophone service continues.

On the other hand, when the instruction signal to end the videophone service has been received by the communication unit 12i in step S21 (step S21: YES), the response determination unit 12f decides to end the service. Upon this decision, the dialogue engine unit 12g stops transmitting the visitor's image and voice data to the portable device 18 via the communication unit 12i and the communication line 14. The portable device 18, having generated the end instruction signal, likewise stops transmitting the owner's voice data to the camera 12. The videophone service thereby ends.

In step S22, the dialogue engine unit 12g records in the data storage unit 12m the visitor's image captured by the imaging unit 12a during the dialogue (during the videophone service) and the visitor's voice data detected by the microphone 12b. Here too, the recorded image and sound of the dialogue can be deleted from the data storage unit 12m, for example by an input operation on the operation unit 18e by the owner of the portable device 18.

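Step S22 amounts to writing the dialogue media into the data storage unit 12m under a key that a later delete request from the portable device 18 can reference. A minimal sketch, assuming a simple in-memory store; the actual storage format is not specified in the disclosure.

```python
import time

class DataStorage:
    """Toy stand-in for the data storage unit 12m."""

    def __init__(self):
        self._records = {}

    def record_dialogue(self, visitor_id, frames, audio):
        """Store the dialogue media keyed by visitor and capture time."""
        key = (visitor_id, int(time.time()))
        self._records[key] = {"frames": frames, "audio": audio}
        return key

    def delete_dialogue(self, key):
        """Honor a delete instruction entered via the operation unit 18e."""
        self._records.pop(key, None)
```
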
<2.4 Modifications of the First to Third Embodiments>
Next, modifications of the first to third embodiments (the first to fourth modifications) will be described. In the description of the first to fourth modifications, the operations of the first to third embodiments described above apply to any operation not specifically explained.

(2.4.1 First Modification)
The first modification handles residents of the house entering or leaving it. For example, when the owner of the portable device 18 has gone out and is absent while another resident remains in the house, and that other resident then leaves the house, the first to third embodiments are applied with part of their operations changed as follows.

Specifically, when the first modification is applied to the first embodiment, in step S1 of FIG. 4 the imaging unit 12a images the other resident leaving the house, and the person detection unit 12c detects the other resident from the captured image. In the next step S2, the feature portion detection unit 12d detects a feature portion of the other resident, for example the other resident's face or appearance (back view).

Then, in steps S3 to S5, the person identification unit 12e identifies the other resident from the feature portion (face or back view). The processing from step S6 onward is then performed, and by holding a dialogue with the other resident, the camera 12 can call out to the other resident who is about to leave the house.

Alternatively, in the first modification, the third embodiment of FIG. 6 may be executed after the processing of step S8, or, as indicated by the broken line in FIG. 4, steps S6 to S8 may be omitted and the third embodiment of FIG. 6 executed directly. In this case, in step S16 of FIG. 6, the response determination unit 12f decides to notify the portable device 18 that the other resident is leaving the house (step S16: YES). In the next step S17, the response determination unit 12f notifies the communication unit 18a of the portable device 18, via the communication unit 12i and the communication line 14, that the other resident is leaving the house. Then, in step S18, the display unit 18d of the portable device 18 displays the notification content, and the speaker 18f outputs a sound corresponding to it.

By viewing the display on the display unit 18d or by listening to the sound from the speaker 18f, the owner of the portable device 18 learns that the other resident is leaving the house. Since the portable device 18 is always notified in this way when another resident leaves the house, the owner of the portable device 18 can take any necessary measures in steps S19 and S20, such as holding a dialogue with the other resident.

In the first modification, if it is difficult to identify the other resident in the processing of steps S3 to S5 of FIG. 4, the second embodiment may be applied so that the other resident is identified in detail on the server 16 side. Although the above description uses the other resident's back view as the feature portion, the other resident's face, the other resident's figure seen from the front, obliquely or from the side, the other resident's clothes and so on may also be used as feature portions. Because these feature portions are registered in the DB 12l as person information (and its associated related information), the person identification unit 12e can identify the other resident by comparing the images and feature amounts of the other resident's feature portions against the person information (the feature portion information contained in it).

(2.4.2 Second Modification)
The second modification changes part of the operation of the first and second embodiments so that the visitor is identified from an image of the visitor's clothes or the like, in addition to or instead of the image of the visitor's face.

In this case, in the first embodiment, in step S2 of FIG. 4 the feature portion detection unit 12d detects the visitor's face or appearance, which is the visitor's feature portion, from the image of the visitor detected by the person detection unit 12c, and further detects the visitor's clothes, the pattern, color or design of the clothes, or a mark attached to the clothes.

In step S3, the feature amount extraction unit 12n of the person identification unit 12e extracts the feature amounts of the face or appearance from the image of the visitor's face or appearance detected by the feature portion detection unit 12d, and further extracts the feature amounts of the other feature portions from the images of the visitor's clothes, the pattern, color or design of the clothes, or the mark attached to the clothes.

In step S4, the similarity determination processing unit 12o determines the similarity between the feature amount of each feature portion extracted by the feature amount extraction unit 12n and the feature amounts of the feature portions included in the plural pieces of person information stored in the DB 12l. In step S5, the age/gender estimation processing unit 12p estimates the visitor's age and/or gender by comparing the feature amount of each feature portion extracted by the feature amount extraction unit 12n with the plural pieces of person information stored in the DB 12l.

In this way, the person identification unit 12e can identify the visitor based not only on the visitor's face or appearance but also on the visitor's clothes, the pattern, color or design of the clothes, or the mark attached to the clothes.

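One way to read the second modification is that the similarity decision fuses several feature portions (face, clothing pattern, color, marks) rather than relying on the face alone. The sketch below is a hypothetical weighted fusion; the disclosure prescribes neither the feature representation nor the combination rule.

```python
def fused_similarity(extracted, registered, weights=None):
    """Combine per-feature-portion similarities into one identification score.

    `extracted` and `registered` map feature-portion names (e.g. "face",
    "clothes_pattern", "mark") to feature vectors of equal length. Cosine
    similarity and the equal default weights are assumptions.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    common = [k for k in extracted if k in registered]
    if not common:
        return 0.0
    weights = weights or {k: 1.0 / len(common) for k in common}
    return sum(weights.get(k, 0.0) * cosine(extracted[k], registered[k])
               for k in common)
```
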
Thereafter, by executing the processing of steps S6 to S9, the camera 12 holds a dialogue with the visitor. Such processing is performed, for example, when the visitor is a business operator (for example, a delivery person of a courier, home-delivery or postal service) and, at the time of the visit, nobody is in the house where the camera 12 is installed, only residents who have difficulty dealing with the operator are in the house, or the residents do not wish to deal with the operator. A resident who has difficulty dealing with a business operator means, for example, a child for whom such dealings are difficult, or a person convalescing or resting at home because of illness or the like.

Therefore, in step S7, the dialogue engine unit 12g selects from the DB 12l the scenario best suited to a dialogue in such a situation.

Specifically, related information corresponding to the above situations is registered in the DB 12l as related information on the installation place (house) of the camera 12. The dialogue engine unit 12g extracts this related information from the DB 12l and selects from the DB 12l a scenario corresponding to the extracted related information and to the decision made by the response determination unit 12f.

Thus, in step S8, the dialogue engine unit 12g outputs the fixed phrases indicated by the selected scenario as speech from the speaker 12j, for example "Please put it in the delivery locker." or "Please redeliver in the latest time slot." As a result, even when nobody is in the house, only residents who have difficulty dealing with the operator are in the house, or the residents do not wish to deal with the operator, the camera 12 can respond appropriately to visitors such as business operators.

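The scenario selection of step S7 can be pictured as a lookup keyed on the identified visitor category and the household's registered circumstances. A minimal sketch, with the keys invented for illustration and the phrases taken from the examples above; the schema of the DB 12l is not disclosed.

```python
# Hypothetical scenario table standing in for part of the DB 12l.
SCENARIOS = {
    ("courier", "nobody_home"): "Please put it in the delivery locker.",
    ("courier", "redeliver"):   "Please redeliver in the latest time slot.",
    ("unknown", "nobody_home"): "Please state your name and business.",
}

def select_scenario(visitor_category, household_state):
    """Pick the fixed phrase the speaker 12j should utter in step S8."""
    key = (visitor_category, household_state)
    return SCENARIOS.get(key, SCENARIOS[("unknown", "nobody_home")])
```
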
In the case of the second embodiment, in the data matching processing of step S15 of FIG. 5, the data matching unit 16c identifies the visitor based on the visitor's clothes, the pattern, color or design of the clothes, or the mark attached to the clothes, in addition to the visitor's face or appearance, in the same way as the processing in the person identification unit 12e described above; a detailed description is therefore omitted.

(2.4.3 Third Modification)
In the third modification, when a visitor's arrival is known in advance, information on the visitor is registered in the DB 12l beforehand.

In step S31 of FIG. 7, it is assumed, for example, that the visitor is a business operator such as a courier, home-delivery or postal service, and that the operator has notified the portable device 18 or its owner in advance of the scheduled time of the visit to the owner's house (the scheduled delivery time) (step S31: YES). If the portable device 18 is a smartphone, for example, this notification is given by e-mail from the operator to the portable device 18, by telephone, or through a social networking service (SNS).

In step S32, when the owner will be away from the house at the scheduled visit time, the owner operates the operation unit 18e of the portable device 18, enters information on the business operator who will visit, and instructs the camera 12 to take a predetermined response at the scheduled visit time. The information entered by the owner includes, for example, information on the visitor (the name of the operator, the name of the operator's person in charge (the visitor), and the like), the purpose of the visit, information on the item to be delivered (the name of the sender and the like) if the purpose of the visit is a delivery, and the scheduled visit time.

In step S33, the communication unit 18a of the portable device 18 transmits the entered information to the communication unit 12i of the camera 12 via the communication line 14. Then, in step S34, the related information collection unit 12h registers the information received by the communication unit 12i in the DB 12l as related information (and person information).

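Steps S31 to S34 reduce to registering an expected-visitor record that the later identification steps can match against. A sketch under assumed field names; the disclosure lists the kinds of information (operator name, person in charge, purpose, sender, scheduled time) but not a concrete schema.

```python
from dataclasses import dataclass

@dataclass
class ExpectedVisitor:
    """Related information entered via the operation unit 18e (step S32)."""
    operator_name: str     # e.g. the courier company
    person_in_charge: str  # the expected delivery person
    purpose: str           # e.g. "delivery"
    sender: str            # sender of the package, if a delivery
    scheduled_time: float  # scheduled visit time as a Unix timestamp

def register_expected_visitor(db, visitor: ExpectedVisitor):
    """Step S34: store the record in the DB 12l as related information."""
    db.setdefault("expected_visitors", []).append(visitor)
```
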
As a result, in the first embodiment of FIG. 4, when the operator visits the house at the scheduled visit time, the camera 12 executes the processing from step S1 onward. Specifically, the processing of steps S1 to S5 determines whether the arriving visitor is the expected operator. Thereafter, in the processing of step S7, the optimal scenario corresponding to the related information registered in advance in the DB 12l (the information entered by the owner) is selected from the DB 12l. Consequently, in step S8, even when the owner of the portable device 18 is away from the house, the dialogue engine unit 12g can hold a dialogue with the visiting operator in accordance with the selected scenario.

In step S31 of FIG. 7, when the arrival of a visitor such as a business operator is not known in advance (step S31: NO), the owner of the portable device 18 does not enter the above information, and the camera 12 therefore executes the processing of the first embodiment of FIG. 4.

If, because of traffic conditions around the house or the like, the operator arrives earlier or later than the scheduled visit time, the arriving operator still matches the operator indicated by the related information (person information), except that the actual arrival time differs from the scheduled one. In this case, the response determination unit 12f decides in step S6 of FIG. 4 to hold a dialogue with the arriving operator, and the dialogue engine unit 12g selects the corresponding scenario from the DB 12l based on the related information in step S7. Even in this case, the dialogue engine unit 12g can hold a dialogue with the visiting operator in accordance with the selected scenario.

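As the passage above notes, an early or late arrival should still match the registered record as long as the identity matches; only the time differs. One plausible reading is a match on identity with a purely informational time check, sketched below reusing the hypothetical ExpectedVisitor record from the earlier sketch. The tolerance value is an assumption.

```python
ARRIVAL_TOLERANCE_SEC = 4 * 3600  # assumed window around the scheduled time

def match_expected_visitor(db, identified_operator, arrival_time):
    """Return the registered record whose operator matches the visitor.

    Identity governs the match; the time check is informational only,
    since the disclosure accepts early and late arrivals alike.
    """
    for rec in db.get("expected_visitors", []):
        if rec.operator_name == identified_operator:
            on_time = abs(arrival_time - rec.scheduled_time) <= ARRIVAL_TOLERANCE_SEC
            return rec, on_time
    return None, False
```
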
On the other hand, when a visitor other than the originally expected operator (for example, another operator) arrives at the scheduled visit time or thereabouts, the first embodiment described above is applied: the response determination unit 12f decides in step S6 to hold a dialogue with the visitor, and the dialogue engine unit 12g selects the corresponding scenario from the DB 12l in step S7. Even in a situation different from the original plan, the dialogue engine unit 12g can thus select a scenario appropriate to the situation and hold a dialogue with the visitor in accordance with the selected scenario.

Alternatively, when a visitor other than the originally expected operator arrives at the scheduled visit time or thereabouts, the third embodiment described above may be applied: the response determination unit 12f decides in step S16 of FIG. 6 to notify the portable device 18 (step S16: YES), and the communication unit 12i reports to the communication unit 18a of the portable device 18 via the communication line 14 that a different visitor has arrived. The display unit 18d of the portable device 18 then displays the notification received by the communication unit 18a, and the speaker 18f outputs a corresponding sound. From the content displayed on the display unit 18d or the sound from the speaker 18f, the owner of the portable device 18 can learn that a visitor different from the one originally expected has come to the house.

Furthermore, the third modification can be combined with the second modification: when the operator arrives at the scheduled visit time and nobody is in the house, only residents who have difficulty dealing with the operator are in the house, or the residents do not wish to deal with the operator, the system can respond as described above. In this case, the camera 12 holds a dialogue with the operator and can, for example, ask for the visit date and time to be changed because nobody is available.

Thus, in the third modification, information indicating which operator will deliver a package from which sender and when (the scheduled visit time, or a designated date and time slot) is transmitted from the portable device 18 to the camera 12, and by using the transmitted information to identify the visitor, the camera 12 can judge whether the visitor is the genuine operator indicated by that information.

(2.4.4 Fourth Modification)
The fourth modification addresses cases where the visitor cannot be identified accurately even though identification is attempted on the camera 12 side, or where the visitor shows no reaction to an attempted dialogue. It assumes cases where the visitor is a suspicious person.

In this case, if the visitor is a suspicious person, no person information for that person is registered in the DB 12l, so the visitor cannot be identified accurately even by the processing of steps S1 to S5 of FIG. 4. The response determination unit 12f therefore executes the processing of steps S16 and S17 of FIG. 6, judges that the visitor is a suspicious person, and transmits the judgment result to the communication unit 18a of the portable device 18 via the communication unit 12i and the communication line 14. In step S18, the display unit 18d of the portable device 18 displays the notification received by the communication unit 18a, and the speaker 18f outputs a sound corresponding to the notification content. By viewing the display on the display unit 18d or by listening to the sound from the speaker 18f, the owner of the portable device 18 learns that a suspicious person has arrived, and can take the necessary measures from step S19 onward, such as holding a dialogue with the suspicious person.

Alternatively, if the visitor cannot be identified accurately even after the processing of steps S1 to S5 of FIG. 4, the response determination unit 12f may judge in step S6 that the visitor is a suspicious person and decide to hold a dialogue with the suspicious person (step S6: YES). The dialogue engine unit 12g then selects a scenario in step S7 and carries out the dialogue in step S8. In this case, even when addressed from the camera 12 side, the suspicious person may not respond at all. In such a case, the processing from step S16 of FIG. 6 onward is executed in the same way as in the above response, and the camera 12 notifies the portable device 18 of the suspicious person's arrival. The owner of the portable device 18 can then take the necessary measures, such as holding a dialogue with the suspicious person.

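Both suspicious-person paths share the same skeleton: a failed or low-confidence identification routes either straight to a notification or first through a dialogue attempt that, if ignored, falls back to the notification. A compact sketch, with the threshold and the notify/dialogue hooks assumed for illustration.

```python
CONFIDENCE_THRESHOLD = 0.8  # assumed; the disclosure states no numeric value

def handle_unidentified_visitor(confidence, try_dialogue, notify_owner):
    """Fourth-modification flow for a visitor who cannot be identified.

    `try_dialogue` returns True if the visitor responded to the scenario;
    `notify_owner` sends the suspicious-person report of steps S16/S17.
    """
    if confidence >= CONFIDENCE_THRESHOLD:
        return "identified"            # normal first-embodiment flow
    if try_dialogue():                 # step S6: YES, then S7/S8
        return "dialogue"
    notify_owner("suspicious person")  # FIG. 6, steps S16 onward
    return "notified"
```
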
In either of the two examples above, it is desirable that, in the camera 12, the imaging unit 12a images the suspicious person, the microphone 12b detects the suspicious person's voice, and the captured image of the suspicious person and the data of the suspicious person's voice are recorded in the data storage unit 12m together with information on the time of capture. Although the above description assumes that the visitor is a suspicious person, the above processing may also be performed when the response determination unit 12f presumes the visitor to be a suspicious person, even if the visitor is not actually one.

(2.4.5 Other Modifications)
The description so far has mainly covered identifying a visitor coming to the building from outside by the image of the visitor's face (the first to third embodiments and so on), and identifying another resident leaving the house by the image of that resident's back view (the first modification).

In the present embodiment, the first to third embodiments and the first to fourth modifications can be combined so that a visitor is identified from other feature portions (appearance, clothes and the like) in addition to, or instead of, the visitor's face. Likewise, another resident leaving the house can be identified from other feature portions (face, appearance from other directions, clothes and the like) in addition to, or instead of, the back view. In these cases, the identification of the visitor or the other resident can also be performed on the server 16 side by applying the second embodiment, instead of being performed on the camera 12 side.

[3. Effects of the Present Embodiment]
As described above, according to the security system 10 of the present embodiment, artificial intelligence is mounted on the camera 12 and made to function as the person detection unit 12c, the feature portion detection unit 12d and the person identification unit 12e, so that services tailored to a specific person such as a visitor can be provided on the camera 12 side as a local area. For example, even when the resident of the house where the camera 12 is installed (the owner of the portable device 18) is absent, the camera 12 can respond appropriately to a visitor. As a result, the convenience of the security system 10 is improved.

In this case, the person detection unit 12c detects a person in the image captured by the camera 12 by detecting a moving object appearing in the image, a change in the noise of the image, or a change in the brightness of the image. A person can thus easily be detected using only the image captured by the imaging unit 12a, and the detection result of the person detection unit 12c can trigger the respective processing of the feature portion detection unit 12d and the person identification unit 12e.

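Detecting a person from brightness changes alone can be done with simple frame differencing between consecutive grayscale frames. A minimal sketch using NumPy; the two threshold values are assumptions, and a production detector would be considerably more robust.

```python
import numpy as np

DIFF_THRESHOLD = 12.0   # assumed mean absolute brightness change per pixel
AREA_THRESHOLD = 0.02   # assumed fraction of pixels that must have changed

def brightness_change_detected(prev_gray: np.ndarray, curr_gray: np.ndarray) -> bool:
    """Flag a possible person from brightness change between two gray frames."""
    # Widen to a signed type so uint8 subtraction cannot wrap around.
    diff = np.abs(curr_gray.astype(np.int16) - prev_gray.astype(np.int16))
    changed = (diff > DIFF_THRESHOLD).mean()  # fraction of changed pixels
    return changed > AREA_THRESHOLD
```
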
The person detection unit 12c can also detect a person more accurately by using the image captured by the imaging unit 12a together with the sound detected by the microphone 12b.

Furthermore, when the camera 12 images its surroundings continuously or at predetermined time intervals, the feature portion detection unit 12d detects the person's feature portion from the person's image each time the person detection unit 12c detects the person; by tracking the feature portion, the system can easily grasp the behavior of a person such as a visitor and respond appropriately to that person.

The person identification unit 12e is composed of the feature amount extraction unit 12n, which extracts the feature amounts of a feature portion from the image of the person's feature portion detected by the feature portion detection unit 12d; the similarity determination processing unit 12o, which determines the similarity between the feature amounts of the person's feature portion extracted by the feature amount extraction unit 12n and the feature amounts of a person's feature portion indicated by the person information; and the age/gender estimation processing unit 12p, which estimates the age and/or gender of the person detected by the person detection unit 12c by comparing the extracted feature amounts with the person information. This makes it possible to identify easily and efficiently what kind of person the person detection unit 12c has detected.

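The division of labor among the units 12n, 12o and 12p can be expressed as a three-stage pipeline. The sketch below wires the stages together with injected placeholder callables; the concrete extraction and estimation models are not part of the disclosure.

```python
def identify_person(feature_image, person_db, extract, similarity,
                    estimate_age_gender):
    """Pipeline mirroring units 12n (extract), 12o (compare), 12p (estimate).

    `extract`, `similarity` and `estimate_age_gender` are injected callables
    standing in for the unspecified models; `person_db` maps person names to
    registered feature representations.
    """
    features = extract(feature_image)                       # unit 12n
    scores = {name: similarity(features, registered)        # unit 12o
              for name, registered in person_db.items()}
    best = max(scores, key=scores.get) if scores else None
    age, gender = estimate_age_gender(features, person_db)  # unit 12p
    return {"best_match": best, "scores": scores, "age": age, "gender": gender}
```
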
In this case, if the feature portion is at least the face or appearance of the person detected by the person detection unit 12c, then when the camera 12 is installed in a building such as a house, visitors to the building and the movements of residents entering and leaving it can easily be detected.

Further, by including in the feature portion the person's clothes, the pattern, color or design of the clothes, or a mark attached to the clothes, it becomes even easier, when the camera 12 is installed in a building such as a house, to identify what kind of person a visitor to the building or a person entering or leaving it is.

Furthermore, because the camera 12 has the response determination unit 12f, which, when the person identification unit 12e identifies a person, decides how to respond to that person according to the identification result, an appropriate response matched to the person's profile can be taken.

By applying the first embodiment, the security system 10 allows the camera 12 to hold a dialogue with a person easily in accordance with a scenario. Because the camera 12 has the dialogue engine unit 12g as one function of its artificial intelligence, the dialogue engine unit 12g can process image and voice data easily and promptly even when the amount of such data grows through the execution of dialogues. Moreover, since the dialogue engine unit 12g performs dialogue learning through its dialogues with people, the functions of the artificial intelligence, including the dialogue function, can be improved.

When the response determination unit 12f decides to hold a dialogue with a person, the dialogue engine unit 12g extracts from the DB 12l the related information that the related information collection unit 12h had collected at the time the imaging unit 12a captured the person identified by the person identification unit 12e, and selects from the DB 12l a scenario corresponding to the extracted related information and the dialogue. By also referring to the related information, a more appropriate scenario can be selected and the accuracy of the dialogue with the person improved. In addition, when the camera 12 is installed in a building such as a house and a visitor to the building is known in advance, registering the visitor's related information in the DB 12l enables an appropriate dialogue with the visitor when the visitor arrives at the building.

Furthermore, because the camera 12 has the related information collection unit 12h, the latest related information is registered in the DB 12l, which further improves the accuracy of dialogues with people.

Also, when the response determination unit 12f decides to hold a dialogue with the person identified by the person identification unit 12e, the dialogue engine unit 12g extracts from the DB 12l the related information collected by the related information collection unit 12h at the time the imaging unit 12a captured the person, and selects from the DB 12l a scenario corresponding to the extracted related information and the dialogue, so that the accuracy of the dialogue with the person is improved still further.

Further, when no related information corresponding to the person identified by the person identification unit 12e is registered in the DB 12l, the dialogue engine unit 12g extracts other related information registered in the DB 12l and selects from the DB 12l a scenario corresponding to that other related information and the dialogue, and/or notifies the portable device 18 that there is no matching person. Thus, when the camera 12 is installed in a building such as a house, a dialogue with a visitor can be held using other related information even when the visitor to the building does not match any visitor indicated by the related information registered in the DB 12l, and the owner (resident) of the portable device 18 who is away from the building can be informed of the visitor's arrival.

When the second embodiment is applied to the security system 10, the peripheral information DB 22 and the personal information DB 24 are provided on the server 16 side, and the amount of data available for identifying a person is far larger than on the camera 12, so that by matching the peripheral information and personal information against the inquiry from the camera 12, the server 16 can identify the person's profile and the like in detail. Also, by mounting artificial intelligence on the server 16 and making it function as the inquiry content determination unit 16b and the data matching unit 16c, services tailored to the person can be provided on the server 16 side as a local area. In this case too, even when the resident of the house where the camera 12 is installed is absent, the server 16 can respond appropriately to the person. As a result, the convenience of the security system 10 is improved. Furthermore, the data matching makes it easy to link the person with content such as the peripheral information and the personal information.

Because the server 16 has the response determination unit 16d and the dialogue engine unit 16e, the camera 12 can hold a dialogue with a person easily and accurately in accordance with the scenario selected on the server 16 side. Since the server 16 has the dialogue engine unit 16e as one function of its artificial intelligence, the dialogue engine unit 16e can process image and voice data easily and promptly even when the amount of such data grows through the execution of dialogues. Moreover, the dialogue engine unit 16e performs dialogue learning through its dialogues with people and can improve the functions of the artificial intelligence, including the dialogue function.

After a dialogue ends, the dialogue engine unit 16e creates a new scenario using the dialogue history recorded in the history DB 28 and registers the created new scenario in the scenario DB 26, so that what kind of dialogue was held with a person can be checked at a later date, further improving the convenience of the security system 10. Because artificial intelligence is mounted on the server 16, the dialogue engine unit 16e can process the data appropriately and promptly and accumulate the newly created scenarios even when the amount of data accumulated in the history DB 28 grows dramatically. Moreover, since new scenarios are created from the dialogue histories accumulated in the history DB 28, the dialogue engine unit 16e can easily select the optimal, most up-to-date scenario when talking with a person. That is, because the dialogue engine unit 16e creates new scenarios through dialogue learning with people, it becomes easier to select the optimal scenario each time a dialogue with a person is held.

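The scenario-learning loop on the server can be thought of as mining the history DB 28 for exchanges not yet covered by the scenario DB 26 and promoting them into new scenarios. A deliberately simplified sketch; the actual learning method is not disclosed, and the data structures below are assumptions.

```python
def learn_scenarios(history_db, scenario_db):
    """Promote visitor utterances with no registered reply into new scenarios.

    `history_db` is a list of (visitor_utterance, owner_reply) pairs from
    past dialogues; `scenario_db` maps utterances to canned replies. Both
    stand in for the history DB 28 and the scenario DB 26.
    """
    created = 0
    for utterance, reply in history_db:
        if utterance not in scenario_db and reply:
            scenario_db[utterance] = reply  # reuse the owner's past reply
            created += 1
    return created
```
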
Furthermore, by applying the third embodiment to the security system 10, the owner of the portable device 18 who is away from the installation place of the camera 12 can receive, in real time, a notification that a person is present at the installation place (for example, a notification of a visitor's arrival) without being there. The security system 10 can thus provide the owner with an appropriate service.

In the third embodiment, after (the owner of) the portable device 18 receives the notification, a dialogue with the person is possible; once the dialogue between the person on the camera 12 side and the owner of the portable device 18 begins, the owner can receive an appropriate service (a dialogue service) in real time without being at the installation place of the camera 12. This further improves the convenience of the security system 10. In this case too, the dialogue engine unit 12g performs dialogue learning through its dialogues with people and can improve the functions of the artificial intelligence, including the dialogue function.

In the first and third embodiments, when a dialogue with a person ends, the camera 12 stores in the data storage unit 12m at least the image of the person captured by the imaging unit 12a during the dialogue and the voice data of the person detected by the microphone 12b, and the dialogue engine unit 12g creates, after the dialogue ends, a new scenario based on the content of the dialogue with the person and registers the created new scenario in the DB 12l.

This makes it possible to check at a later date what kind of dialogue was held with the person, further improving the convenience of the security system 10. Because artificial intelligence is mounted on the camera 12, the dialogue engine unit 12g can process the data appropriately and promptly and register new scenarios in the DB 12l even when the amount of data accumulated in the data storage unit 12m grows dramatically. Moreover, since the dialogue engine unit 12g creates new scenarios through dialogue learning with people, it becomes easier to select the optimal scenario each time a dialogue with a person is held.

Since dialogue learning improves the artificial intelligence functions of the camera 12 and the server 16 in this way, it also becomes possible, for example, to cooperate with the home security system of the house and, when a person is confirmed to be a resident of the house, take actions such as unlocking the entrance door.

The present invention is not limited to the embodiment described above, and it goes without saying that various configurations may be adopted without departing from the gist of the present invention.

Claims (19)

In a security system (10) having at least a camera (12) for imaging a person,
the camera (12) comprises:
an imaging unit (12a) for imaging the surroundings of the camera (12);
a person detection unit (12c) for detecting the person based on an image captured by the imaging unit (12a);
a feature portion detection unit (12d) for detecting a feature portion of the person from the image of the person detected by the person detection unit (12c); and
a person identification unit (12e) for identifying the person based on the image of the feature portion detected by the feature portion detection unit (12d).
The security system (10) according to claim 1, wherein the person detection unit (12c) detects the person in the image captured by the camera (12) by detecting a moving object appearing in the image, detecting a change in the noise of the image, or detecting a change in the brightness of the image.
The security system (10) according to claim 1 or 2, wherein
the camera (12) further comprises a first microphone (12b) for detecting sound around the camera (12), and
the person detection unit (12c) detects the person based on the image captured by the imaging unit (12a) and the sound detected by the first microphone (12b).
The security system (10) according to any one of claims 1 to 3, wherein, when the camera (12) images its surroundings continuously or at predetermined time intervals, the feature portion detection unit (12d) tracks the feature portion by detecting the feature portion from the image of the person each time the person detection unit (12c) detects the person.
The security system (10) according to any one of claims 1 to 4, wherein
the camera (12) further comprises a person information registration unit (12l) in which person information on predetermined persons is registered, and
the person identification unit (12e) is composed of:
a feature amount extraction unit (12n) for extracting the feature amounts of the feature portion from the image of the feature portion detected by the feature portion detection unit (12d);
a similarity determination processing unit (12o) for determining the similarity between the feature amounts of the person's feature portion extracted by the feature amount extraction unit (12n) and the feature amounts of a person's feature portion indicated by the person information; and
an age/gender estimation processing unit (12p) for estimating the age and/or gender of the person detected by the person detection unit (12c) by comparing the feature amounts of the person's feature portion extracted by the feature amount extraction unit (12n) with the person information.
The security system (10) according to any one of claims 1 to 5, wherein the feature portion is at least the face or appearance of the person detected by the person detection unit (12c).
The security system (10) according to claim 6, wherein the feature portion includes the person's clothes, the pattern, color or design of the clothes, or a mark attached to the clothes.
The security system (10) according to any one of claims 1 to 7, wherein the camera (12) further comprises a first response determination unit (12f) which, when the person identification unit (12e) identifies the person, decides how to respond to the person according to the identification result.
The security system (10) according to claim 8, wherein the camera (12) further comprises:
a scenario registration unit (12l) in which a plurality of scenarios are registered;
a first dialogue engine unit (12g) for selecting a scenario corresponding to a dialogue with the person from the scenario registration unit (12l) when the first response determination unit (12f) decides, based on the identification result, to hold the dialogue;
a first speaker (12j) for outputting the scenario selected by the first dialogue engine unit (12g) to the outside as speech; and
a second microphone (12b) for detecting the person's voice.
The security system (10) according to claim 9, wherein
the camera (12) further comprises a related information registration unit (12l) in which information on the installation place of the camera (12), information on the surroundings of the installation place, and/or information on predetermined persons is registered as related information, and
the first dialogue engine unit (12g), when the first response determination unit (12f) decides, based on the identification result, to hold a dialogue with the person identified by the person identification unit (12e), extracts the related information from the related information registration unit (12l) and selects from the scenario registration unit (12l) a scenario corresponding to the extracted related information and the dialogue.
The security system (10) according to claim 10, wherein the camera (12) further comprises a related information collection unit (12h) for collecting the related information from outside and registering the collected related information in the related information registration unit (12l).
The security system (10) according to claim 11, wherein the first dialogue engine unit (12g), when the first response determination unit (12f) decides, based on the identification result, to hold a dialogue with the person identified by the person identification unit (12e), extracts from the related information registration unit (12l) the related information collected by the related information collection unit (12h) at the time the imaging unit (12a) captured the person, and selects from the scenario registration unit (12l) a scenario corresponding to the extracted related information and the dialogue.
The security system (10) according to any one of claims 10 to 12, wherein, when related information corresponding to the person identified by the person identification unit (12e) is not registered in the related information registration unit (12l), the first dialogue engine unit (12g) extracts other related information registered in the related information registration unit (12l) and selects from the scenario registration unit (12l) a scenario corresponding to the extracted other related information and the dialogue, and/or notifies the outside.
The security system (10) according to any one of claims 9 to 13, further comprising a portable device (18) connectable to the camera (12) via a communication line (14), wherein
the camera (12) further comprises a first communication unit (12i) capable of communicating at least with the portable device (18) via the communication line (14), and
when the first response determination unit (12f) decides, based on the identification result, to notify the portable device (18) of the presence of the person, the first communication unit (12i) notifies the portable device (18) of the presence of the person via the communication line (14).
The security system (10) according to claim 14, wherein
the portable device (18) includes a second communication unit (18a) capable of communicating at least with the camera (12) via the communication line (14), a display unit (18d), an operation unit (18e), a second speaker (18f), and a third microphone (18g);
when the second communication unit (18a) receives the notification of the presence of the person, the display unit (18d) displays the notification and/or the second speaker (18f) outputs the notification as sound;
when the holder of the portable device (18) operates the operation unit (18e) to instruct the start of a dialogue with the person, the second communication unit (18a) transmits an instruction signal instructing the start of the dialogue to the first communication unit (12i) via the communication line (14);
upon receiving the instruction signal, the first communication unit (12i) transmits the image of the person captured by the imaging unit (12a) and the voice data of the person detected by the second microphone (12b) to the second communication unit (18a) via the communication line (14);
the display unit (18d) displays the image of the person received by the second communication unit (18a), the second speaker (18f) outputs the voice data of the person received by the second communication unit (18a) as sound, and the second communication unit (18a) transmits the voice data of the holder detected by the third microphone (18g) to the first communication unit (12i) via the communication line (14); and
the first speaker (12j) outputs the voice data of the holder received by the first communication unit (12i) as sound.
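The claim-15 round trip — camera media out, holder audio back — can be modeled in-process with two queues; real transport, codecs, and framing are omitted, and all payloads are placeholders:

```python
# Toy in-process model of the claim-15 message flow; payloads are placeholders.
from queue import Queue

camera_to_device = Queue()   # image frames and visitor audio (12a, 12b -> 18a)
device_to_camera = Queue()   # the holder's audio (18g -> 12i)

def camera_on_start_instruction():
    """Camera side: after the start instruction arrives, forward media."""
    camera_to_device.put(("image", b"<jpeg frame>"))    # imaging unit (12a)
    camera_to_device.put(("audio", b"<visitor pcm>"))   # second microphone (12b)

def device_loop():
    """Device side: display/play what arrives, send the holder's voice back."""
    while not camera_to_device.empty():
        kind, payload = camera_to_device.get()
        action = "display on 18d" if kind == "image" else "play on 18f"
        print(action, len(payload), "bytes")
    device_to_camera.put(("audio", b"<holder pcm>"))    # third microphone (18g)

def camera_playback_loop():
    """Camera side: the first speaker (12j) outputs the holder's voice."""
    while not device_to_camera.empty():
        _, holder_audio = device_to_camera.get()
        print("first speaker outputs", len(holder_audio), "bytes")

camera_on_start_instruction()
device_loop()
camera_playback_loop()
```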
The security system (10) according to any one of claims 9 to 13 and 15, wherein
the camera (12) further includes a data storage unit (12m) in which, after the dialogue with the person has ended, at least the image of the person captured by the imaging unit (12a) during the dialogue and the voice data of the person detected by the second microphone (12b) are recorded, and
after the dialogue has ended, the first dialogue engine unit (12g) creates a new scenario based on the content of the dialogue with the person and registers the created new scenario in the scenario registration unit (12l).
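A sketch of claim 16's post-dialogue step, again reusing the hypothetical ScenarioRegistry from the claim-10 sketch; the rule for deriving the new scenario (echoing the visitor's stated purpose) is an assumption, since the claim does not fix how the scenario is built:

```python
# Illustrative sketch; assumes ScenarioRegistry from the claim-10 sketch.
import time

data_storage = []   # stands in for the data storage unit (12m)

def on_dialogue_end(person_image, person_audio, transcript, scenario_registry):
    # Record the dialogue media in the data storage unit (12m).
    data_storage.append({"time": time.time(),
                         "image": person_image,
                         "audio": person_audio})
    # Derive and register a new scenario from the dialogue content (assumed rule).
    purpose = transcript.get("stated_purpose", "your last visit")
    new_scenario = f"Last time you mentioned {purpose}. Is that why you are here?"
    scenario_registry.scenarios[(purpose, "follow_up")] = new_scenario
    return new_scenario
```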
The security system (10) according to claim 8, further comprising:
a server (16) connectable to the camera (12) via a communication line (14);
a peripheral information database (22) connected to the server (16), in which peripheral information including information on the installation location of the camera (12) and information on the surroundings of that location is registered; and
a personal information database (24) connected to the server (16), in which personal information of predetermined persons is registered, wherein
when the first response determination unit (12f) decides based on the identification result to ask the server (16) for a detailed identification of the person detected by the person detection unit (12c), it queries the server (16) for the detailed identification of the person via the communication line (14), and
the server (16) includes:
a third communication unit (16a) that receives the inquiry content from the first response determination unit (12f);
an inquiry content determination unit (16b) that judges the inquiry content received by the third communication unit (16a); and
a data matching unit (16c) that, based on the judgment result of the inquiry content determination unit (16b), acquires the peripheral information from the peripheral information database (22) and the personal information corresponding to the person indicated by the inquiry content from the personal information database (24) via the third communication unit (16a), and identifies the person in detail by matching the acquired peripheral information and personal information against the inquiry content.
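The data matching unit's cross-check might be scored as below; the database shapes and the weights are assumptions, not taken from the patent:

```python
# Illustrative sketch of the server-side matching; shapes and weights assumed.
def match_inquiry(inquiry, peripheral_db, personal_db):
    """Stands in for the data matching unit (16c): cross-check the camera's
    inquiry against peripheral (22) and personal (24) information."""
    area = peripheral_db.get(inquiry["camera_location"], {})
    best, best_score = None, 0
    for person in personal_db.get(inquiry["person_hint"], []):
        score = 0
        if person.get("address_area") == area.get("area_code"):
            score += 1    # plausibly local to the camera's surroundings
        if person.get("face_id") == inquiry.get("face_id"):
            score += 2    # matches the feature portion the camera reported
        if score > best_score:
            best, best_score = person, score
    return best           # detailed identification, or None if nothing matched

# Usage with toy data:
peripheral = {"front-door": {"area_code": "13-104"}}
personal = {"courier": [{"name": "A. Sato", "address_area": "13-104", "face_id": 7}]}
print(match_inquiry({"camera_location": "front-door", "person_hint": "courier",
                     "face_id": 7}, peripheral, personal))
```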
The security system (10) according to claim 17, further comprising
a scenario database (26) connected to the server (16), in which a plurality of scenarios are registered, wherein
the server (16) further includes a second response determination unit (16d) that, when the data matching unit (16c) has identified in detail the person indicated by the inquiry content, determines the response to that person according to the identification result, and a second dialogue engine unit (16e) that, when the second response determination unit (16d) decides based on the identification result to execute a dialogue with the person, selects a scenario corresponding to the dialogue from the scenario database (26) via the third communication unit (16a), and
the camera (12) further includes a second speaker (12j) that outputs the scenario selected by the second dialogue engine unit (16e) to the outside as voice, and a third microphone (12b) that detects the voice of the person.
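A compact sketch of the server-side decision of claim 18; the role-keyed lookup and the "notify only when unidentified" policy are assumptions:

```python
# Illustrative sketch; policy and scenario_db layout are assumed.
def server_respond(identification, scenario_db):
    """Stands in for the second response determination unit (16d) and the
    second dialogue engine unit (16e)."""
    if identification is None:
        return ("notify_only", None)       # no dialogue with an unidentified person
    role = identification.get("role", "visitor")
    scenario = scenario_db.get(role, scenario_db.get("default"))
    return ("dialogue", scenario)          # camera outputs this via the speaker (12j)
```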
The security system (10) according to claim 18, further comprising
a dialogue history database (28) connected to the server (16), in which, after the dialogue with the person has ended, at least the image of the person captured by the imaging unit (12a) during the dialogue and the voice data of the person detected by the third microphone (12b) are recorded as a dialogue history, wherein
after the dialogue has ended, the second dialogue engine unit (16e) creates a new scenario using the dialogue history recorded in the dialogue history database (28) and registers the created new scenario in the scenario database (26).
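Claim 19 only requires that the dialogue history drive the new scenario; the "most frequent topic" heuristic below is one assumed way to do that:

```python
# Illustrative sketch; the frequency heuristic and record shape are assumed.
from collections import Counter

def scenario_from_history(dialogue_history, scenario_db):
    """Stands in for the second dialogue engine unit (16e) working over the
    dialogue history database (28) and the scenario database (26)."""
    topics = Counter(entry["topic"] for entry in dialogue_history)
    if not topics:
        return None
    topic, _ = topics.most_common(1)[0]
    scenario_db[topic] = f"You often ask about {topic}. How can I help today?"
    return scenario_db[topic]

print(scenario_from_history([{"topic": "deliveries"}, {"topic": "deliveries"},
                             {"topic": "keys"}], {}))
```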
PCT/JP2017/027660 2017-07-31 2017-07-31 Security system Ceased WO2019026117A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2017/027660 WO2019026117A1 (en) 2017-07-31 2017-07-31 Security system


Publications (1)

Publication Number Publication Date
WO2019026117A1 true WO2019026117A1 (en) 2019-02-07

Family

ID=65233779

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/027660 Ceased WO2019026117A1 (en) 2017-07-31 2017-07-31 Security system

Country Status (1)

Country Link
WO (1) WO2019026117A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001309061A (en) * 2000-04-21 2001-11-02 Nec Saitama Ltd Answering system to visitor
JP2004054536A (en) * 2002-07-19 2004-02-19 Matsushita Electric Works Ltd Wide area surveillance system
JP2012010200A (en) * 2010-06-25 2012-01-12 Sharp Corp Interphone device
JP2015060056A (en) * 2013-09-18 2015-03-30 株式会社ナガセ Education device and ic and medium for education device
WO2015137190A1 (en) * 2014-03-14 2015-09-17 株式会社日立国際電気 Video monitoring support device, video monitoring support method and storage medium
JP2016131288A (en) * 2015-01-13 2016-07-21 東芝テック株式会社 Information processing apparatus and program
JP2016158171A (en) * 2015-02-25 2016-09-01 パナソニックIpマネジメント株式会社 Notification device and intercom system


Similar Documents

Publication Publication Date Title
US8606316B2 (en) Portable blind aid device
JP5950484B2 (en) Crime prevention system
JP2004013871A (en) Security system
US20140071273A1 (en) Recognition Based Security
JP6182170B2 (en) Security system
US9959885B2 (en) Method for user context recognition using sound signatures
KR102237086B1 (en) Apparatus and method for controlling a lobby phone that enables video surveillance through a communication terminal that can use a 5G mobile communication network based on facial recognition technology
JP5826422B2 (en) Crime prevention system
JP2010081442A (en) Intercom device
JP4772509B2 (en) Doorphone and ringing tone control method for doorphone
JP2023162432A (en) security system
JP7069854B2 (en) Monitoring system and server equipment
WO2019026117A1 (en) Security system
JP6733926B1 (en) Notification system, notification method and notification computer program
CN118506808A (en) Baling identification and processing method and device, electronic equipment and storage medium
JP2010212787A (en) Door intercom device, program to function as each of means in door intercom device, door intercom system and information processing method for the door intercom device
WO2020235584A1 (en) Area management device, area management method, and program
CN118555369A (en) A method, device, computer equipment and storage medium for restoring a smart door lock
WO2021152736A1 (en) Monitoring system, monitoring device, monitoring method, and program
KR101907661B1 (en) A terminal apparatus processing vehicle information and method for operating the same
JP2019068360A (en) Intercom system
JP2018107597A (en) Intercom system
CN112929487A (en) Intelligent alarm method and device, electronic equipment and storage medium
JP2017215735A (en) Wire fraud prevention system and wire fraud prevention method
CN111479060B (en) Image acquisition method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17920326

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17920326

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP
