WO2022137526A1

WO2022137526A1 - Information processing program, information processing method, and information processing device

Info

Publication number: WO2022137526A1
Application number: PCT/JP2020/048809
Authority: WO
Inventors: 諒太下山; 直樹梅田; 信之鷲尾; 芳隆末廣; 主税斎藤
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2020-12-25
Filing date: 2020-12-25
Publication date: 2022-06-30
Anticipated expiration: 2023-06-25
Also published as: JPWO2022137526A1

Abstract

The present invention makes it possible to evaluate the quality of data appropriately. In the present invention, a storage unit stores history information indicating a history of input data and output data for each of a plurality of pieces of software. A processing unit extracts information on first input data, which serves as the start point of data processing executed by the plurality of pieces of software, on the basis of the history information stored in the storage unit. The processing unit evaluates the quality of first output data output through the data processing in accordance with a comparison of whether or not the information on the first input data matches information on predetermined data input by a user.

Description

Information processing program, information processing method, and information processing device

The present invention relates to an information processing program, an information processing method, and an information processing device.

In an information processing system, large amounts of data are processed by various software products. A series of data processing operations may be realized by combining multiple software products.
For example, a method has been proposed for managing the flow of data processed in an information processing system by using data lineage information, which represents relationships between data. The proposed method detects the association between physical data elements and business data elements based on a first data lineage representing relationships among physical data elements and a second data lineage representing relationships among business data elements.

There is also a proposal for a control device that acquires data including a value indicating the received signal strength, or the degree of error, observed when a base station device receives a signal from a terminal device, and that derives a quality index for the received signal of the base station device based on the acquired data.

International Publication No. 2018/089633
Japanese Unexamined Patent Publication No. 2019-110497

When a series of data processing operations is realized by combining a plurality of software products, the question is whether the data output by the data processing was generated through the process intended by the user. For example, if the source of the data to be processed is not the intended one, the reliability of the data processing result decreases. However, no mechanism has been established for managing the quality of data output through such a series of data processing operations by a plurality of software products.

In one aspect, an object of the present invention is to provide an information processing program, an information processing method, and an information processing device that appropriately evaluate the quality of data.

In one aspect, an information processing program is provided. The information processing program causes a computer to execute a process of extracting, based on history information indicating a history of input data and output data for each of a plurality of pieces of software, information on first input data that is the start point of data processing by the plurality of pieces of software, and evaluating the quality of first output data output by the data processing according to a comparison of whether or not the information on the first input data matches information on predetermined data input by a user.

In another aspect, an information processing method is provided.
In yet another aspect, an information processing device is provided.

In one aspect, the quality of data can be evaluated appropriately.
The above and other objects, features, and advantages of the present invention will become apparent from the following description taken in conjunction with the accompanying drawings, which represent preferred embodiments as examples of the present invention.

A diagram illustrating the information processing device of the first embodiment.
A diagram showing a system example of the second embodiment.
A diagram showing a hardware example of the information processing device.
A diagram showing an example of software in the information processing system.
A diagram showing a functional example of the information processing device.
A diagram showing an example of history information.
A diagram showing an example of history evaluation.
A diagram showing an example of security evaluation.
A diagram showing an example of access authority prediction.
A diagram showing an example of up-to-dateness evaluation.
A diagram showing an example of a comprehensive evaluation result table.
A diagram showing a first example of an evaluation result screen.
A diagram showing a second example of an evaluation result screen.
A flowchart showing a processing example of the information processing device.
A flowchart showing a history evaluation example.
A flowchart showing a security evaluation example.
A flowchart showing an up-to-dateness evaluation example.
A flowchart showing an evaluation result display control example.

Hereinafter, the present embodiments will be described with reference to the drawings.
[First Embodiment]
A first embodiment will be described.

FIG. 1 is a diagram illustrating the information processing device of the first embodiment.
The information processing device 10 has a storage unit 11 and a processing unit 12. The storage unit 11 may be a volatile storage device such as a RAM (Random Access Memory), or a non-volatile storage device such as an HDD (Hard Disk Drive) or a flash memory. The processing unit 12 may include a CPU (Central Processing Unit), a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array), and the like. The processing unit 12 may be a processor that executes a program. The term "processor" here may also include a set of a plurality of processors (a multiprocessor).

The storage unit 11 stores history information 20 indicating a history of input data and output data for each of a plurality of pieces of software. The history information 20 is acquired, for example, from an information processing system (not shown) that executes the plurality of pieces of software. For example, the information processing device 10 may be connected to the information processing system via a network, acquire the history information 20 from the information processing system via the network, and store it in the storage unit 11. Alternatively, the history information 20 may be input to the information processing device 10 by a user and stored in the storage unit 11. The history information 20 may be data lineage information indicating a history such as the source of data and the processing and shaping applied to the data.

Based on the history information 20 stored in the storage unit 11, the processing unit 12 extracts information on first input data that is the start point of data processing by the plurality of pieces of software. The processing unit 12 evaluates the quality of first output data output by the data processing according to a comparison of whether or not the information on the first input data matches information on predetermined data input by the user.

The information on the first input data may be, for example, identification information such as the data name of the first input data, or identification information of the data storage unit 41 that stores the first input data. The identification information of the data storage unit 41 may be, for example, the DB name or directory path of a DB (database) corresponding to the data storage unit 41, or the identification name of the storage device that provides the data storage unit 41. The information on the predetermined data is information about the data that the user expects as input data, such as identification information of the data the user expects as the original input, or identification information of the data storage unit that stores that data.

For example, the history information 20 includes information about data processing by software 31 and 32 executed in the information processing system. The data used in the data processing are data d1, d2, and d3. For example, the history information 20 indicates that the data d1 stored in the data storage unit 41 is input to the software 31, that the data d2 is generated from the data d1 by process p1 of the software 31, and that the data d2 is stored in the data storage unit 42.

The history information 20 also indicates that the data d2 stored in the data storage unit 42 is input to the software 32, that the data d3 is generated from the data d2 by process p2 of the software 32, and that the data d3 is stored in the data storage unit 43. The data storage units 41, 42, and 43 may be realized by a storage device included in the information processing system, or by a storage device external to the information processing system.

For example, the processing unit 12 evaluates the quality of data based on the history information 20 as follows. First, based on the history information 20, the processing unit 12 identifies the data d3 output by the data processing performed by the software 31 and 32. Then, based on the history information 20, the processing unit 12 identifies that the source data from which the data d3 was generated is the data d2. Further, based on the history information 20, the processing unit 12 identifies that the source data from which the data d2 was generated is the data d1.

Based on the history information 20, the processing unit 12 detects that the data d1 has no preceding source data. The processing unit 12 therefore identifies the data d1 as the input data at the start point of the data processing that outputs the data d3. Accordingly, based on the history information 20, the processing unit 12 extracts the information on the data d1 as the information on the start-point input data for the data d3 (step S1). The data d1 is an example of the first input data.
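
The backward trace of step S1 can be sketched as follows. This is a minimal illustration only, assuming a hypothetical record layout for the history information 20 (the field names `input`, `input_store`, and `output`, and the identifiers `store41` and `store42`, are not taken from the specification).

```python
# Hypothetical layout: one record per "data read by software -> data written".
history = [
    {"software": "software31", "process": "p1",
     "input": "d1", "input_store": "store41", "output": "d2"},
    {"software": "software32", "process": "p2",
     "input": "d2", "input_store": "store42", "output": "d3"},
]


def find_start_inputs(history, output_name):
    """Trace back from output_name to every data item that has no source data."""
    producers = {}  # output name -> names of the data it was generated from
    store_of = {}   # data name -> storage unit the data was read from
    for rec in history:
        producers.setdefault(rec["output"], []).append(rec["input"])
        store_of[rec["input"]] = rec["input_store"]

    starts, seen, stack = [], set(), [output_name]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        sources = producers.get(name, [])
        if sources:
            stack.extend(sources)  # keep walking toward the start point
        else:
            # No preceding source data: this is a start-point input.
            starts.append({"name": name, "store": store_of.get(name)})
    return sorted(starts, key=lambda s: s["name"])


print(find_start_inputs(history, "d3"))
```

For the d1, d2, d3 example in the text, the trace stops at d1, whose storage unit is reported alongside its name so that either can be compared in step S2.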

The processing unit 12 evaluates the quality of the data d3 according to a comparison of whether or not the information on the data d1 matches the information on the predetermined data (step S2). For example, the processing unit 12 may compare whether or not the identification information of the data d1 matches the identification information of the predetermined data expected by the user. In addition to, or instead of, comparing the identification information of the data, the processing unit 12 may compare whether or not the identification information of the data storage unit 41 in which the data d1 is stored matches the identification information of a predetermined data storage unit expected by the user.

When the information on the data d1 matches the information on the predetermined data, the processing unit 12 rates the quality of the data d3 higher than when it does not match. For example, the processing unit 12 may assign to the data d3 a quality index value for which a larger value indicates higher quality. In this case, the processing unit 12 adds a predetermined value to the index value when the information on the data d1 matches the information on the predetermined data, and does not add to the index value when it does not match.

When comparing a plurality of items included in the information on the data d1 with a plurality of items included in the information on the predetermined data, the processing unit 12 may add a predetermined value to the index value depending on whether or not all of the items match exactly, or may vary the value to be added according to the number of matching items.
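
The two scoring options described above (all items must match, or the added value depends on the number of matching items) can be sketched as below. The function name, the `bonus` value of 10, and the rule "bonus times number of matching items" are illustrative assumptions; the specification does not fix concrete values.

```python
def score_start_input(actual, expected, bonus=10, per_item=False):
    """Return the value to add to the quality index of the output data.

    actual / expected: dicts of items such as data name and storage unit.
    per_item=False: add `bonus` only when every expected item matches.
    per_item=True:  add `bonus` for each matching item (one possible rule).
    """
    if not expected:
        return 0  # nothing to compare against
    matches = sum(1 for key, value in expected.items() if actual.get(key) == value)
    if per_item:
        return bonus * matches
    return bonus if matches == len(expected) else 0


actual = {"name": "d1", "store": "store41"}
expected = {"name": "d1", "store": "store99"}   # user expected another store

print(score_start_input(actual, expected))                 # exact-match rule
print(score_start_input(actual, expected, per_item=True))  # per-item rule
```

With the exact-match rule a single differing item yields no addition, whereas the per-item rule still credits the matching data name.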

The processing unit 12 may also extract other data, in addition to the data d1, as start-point input data for the data d3. In that case, the processing unit 12 evaluates the quality of the data d3 according to a comparison of whether or not the information on each of the plurality of start-point input data matches the information on the predetermined data. The information on the predetermined data may include information on a plurality of data expected by the user.

The processing unit 12 may obtain the quality evaluation value by a point-deduction method instead of a point-addition method. The quality index value may also be one for which a smaller value indicates higher quality; in that case, "addition" in the above description is to be read as "subtraction".
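
A minimal sketch of evaluating several start-point inputs, in either point-addition or point-deduction style, is shown below. The names `evaluate_output`, `base`, and `step`, and the numeric values, are assumptions made for illustration.

```python
def evaluate_output(start_inputs, expected_names, base=0, step=10, deduct=False):
    """Score one output from the names of its start-point input data.

    Addition style:  +step for each start input the user expected.
    Deduction style: -step for each start input the user did not expect.
    """
    score = base
    for name in start_inputs:
        matched = name in expected_names
        if deduct and not matched:
            score -= step
        elif not deduct and matched:
            score += step
    return score


expected_names = {"A1", "B1"}  # data the user expects as original inputs

print(evaluate_output(["A1", "B1"], expected_names))
print(evaluate_output(["A1", "X9"], expected_names))
print(evaluate_output(["A1", "X9"], expected_names, base=100, deduct=True))
```

Both styles rank an output with an unexpected start-point input below one whose inputs all match; only the direction of the adjustment differs.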

According to the information processing device 10, information on the first input data, which is the start point of data processing by a plurality of pieces of software, is extracted based on history information indicating a history of input data and output data for each of the plurality of pieces of software. The quality of the first output data output by the data processing is evaluated according to a comparison of whether or not the information on the first input data matches the information on the predetermined data input by the user.

This makes it possible to evaluate the quality of data appropriately.
In an information processing system, various pieces of software are executed, and in a series of data processing operations by a plurality of pieces of software, the quality of the data output by the data processing varies depending on whether or not the data intended by the user is being processed. For example, when data processing such as analysis is performed on input data not intended by the user, the input data may contain unnecessary or incorrect information. This makes it more likely that the data processing result is wrong, which reduces the reliability of the result.

Therefore, the information processing device 10 identifies, based on the history information 20, the data d1 at the start point of the data processing by the plurality of pieces of software and checks whether or not the data d1 is the input intended by the user. In this way, the information processing device 10 can appropriately evaluate the quality of the data d3 output by the data processing.

The processing unit 12 can perform the above quality evaluation on a plurality of output data, including the data d3, output by the information processing system. The processing unit 12 may cause a display device to display the result of the quality evaluation for each of the plurality of output data including the data d3. For example, the processing unit 12 may cause the display device to display, for each output data, a data flow diagram leading from the corresponding start-point input data to that output data. In this case, the processing unit 12 may highlight the data flow diagram for any output data whose quality is evaluated to be lower than a reference. This can help the user review the data flows in the information processing system.
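
Selecting which data flow diagrams to highlight can be sketched as a simple threshold check. The score values, the `threshold` parameter, and the output names other than d3 are hypothetical; how the diagrams are actually rendered is outside this sketch.

```python
def flag_low_quality(scores, threshold):
    """Mark each output whose quality score falls below the reference value."""
    return [
        {"output": name, "score": score, "highlight": score < threshold}
        for name, score in sorted(scores.items())
    ]


# Hypothetical quality index values for three outputs.
scores = {"d3": 20, "d7": 0, "d9": 10}

for row in flag_low_quality(scores, threshold=10):
    print(row)
```

Only the entries with `highlight` set to true would have their data flow diagrams emphasized on the display.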

In the following, the functions of the information processing device 10 are described in detail using a more specific example.
[Second Embodiment]
Next, a second embodiment will be described.

FIG. 2 is a diagram showing a system example of the second embodiment.
The system of the second embodiment includes an information processing system 50 and an information processing device 100. The information processing system 50 and the information processing device 100 are connected to a network 60. The network 60 may be the Internet, a WAN (Wide Area Network), or a LAN (Local Area Network).

The information processing system 50 executes various pieces of software and performs various kinds of data processing that combine a plurality of pieces of software. A piece of software may also be called a product or a software product. The information processing system 50 includes server devices 200, 300, and so on. Each of the server devices 200, 300, ... is a server computer that executes software and is connected to the internal network of the information processing system 50. Each of the server devices 200, 300, ... may execute a plurality of pieces of software. Software executed on one server device may also cooperate with other software executed on another server device. The information processing system 50 may include a storage device that accumulates data and is used to pass data between server devices.

The information processing device 100 is a server computer that uses data lineage information from the information processing system 50 to evaluate data quality across a plurality of pieces of software in the information processing system 50. The information processing device 100 is an example of the information processing device 10 of the first embodiment.

FIG. 3 is a diagram showing a hardware example of the information processing device.
The information processing device 100 includes a CPU 101, a RAM 102, an HDD 103, a GPU (Graphics Processing Unit) 104, an input interface 105, a medium reader 106, and a NIC (Network Interface Card) 107. The CPU 101 is an example of the processing unit 12 of the first embodiment. The RAM 102 or the HDD 103 is an example of the storage unit 11 of the first embodiment.

The CPU 101 is a processor that executes program instructions. The CPU 101 loads at least part of the programs and data stored in the HDD 103 into the RAM 102 and executes the programs. The CPU 101 may include a plurality of processor cores, and the information processing device 100 may have a plurality of processors. The processes described below may be executed in parallel using a plurality of processors or processor cores. A set of a plurality of processors may be referred to as a "multiprocessor" or simply a "processor".

The RAM 102 is a volatile semiconductor memory that temporarily stores programs executed by the CPU 101 and data used by the CPU 101 for computation. The information processing device 100 may include a type of memory other than RAM, and may include a plurality of memories.

The HDD 103 is a non-volatile storage device that stores data as well as software programs such as an OS (Operating System), middleware, and application software. The information processing device 100 may include other types of storage devices such as a flash memory or an SSD (Solid State Drive), and may include a plurality of non-volatile storage devices.

The GPU 104 outputs images to a display 61 connected to the information processing device 100 in accordance with instructions from the CPU 101. Any type of display can be used as the display 61, such as a CRT (Cathode Ray Tube) display, a liquid crystal display (LCD: Liquid Crystal Display), a plasma display, or an organic EL (OEL: Organic Electro-Luminescence) display.

The input interface 105 acquires an input signal from an input device 62 connected to the information processing device 100 and outputs it to the CPU 101. As the input device 62, a pointing device such as a mouse, touch panel, touch pad, or trackball, as well as a keyboard, a remote controller, a button switch, or the like can be used. A plurality of types of input devices may be connected to the information processing device 100.

The medium reader 106 is a reading device that reads programs and data recorded on a recording medium 63. As the recording medium 63, for example, a magnetic disk, an optical disc, a magneto-optical disk (MO: Magneto-Optical disk), or a semiconductor memory can be used. Magnetic disks include flexible disks (FD: Flexible Disk) and HDDs. Optical discs include CDs (Compact Discs) and DVDs (Digital Versatile Discs).

The medium reader 106 copies, for example, a program or data read from the recording medium 63 to another recording medium such as the RAM 102 or the HDD 103. The read program is executed by, for example, the CPU 101. The recording medium 63 may be a portable recording medium and may be used to distribute programs and data. The recording medium 63 and the HDD 103 may each be referred to as a computer-readable recording medium.

The NIC 107 is an interface that is connected to the network 60 and communicates with other computers via the network 60. The NIC 107 is connected by cable to a communication device such as a switch or a router, for example.

The server devices 200, 300, ... are also realized by hardware similar to that of the information processing device 100.
FIG. 4 is a diagram showing an example of software in the information processing system.

The information processing system 50 includes data acquisition software 51, data processing software 52, a data storage unit 53, and data shaping software 54. The software listed here is only an example; the information processing system 50 may have other software that performs other processing in place of at least part of this software, or in addition to it. The data storage unit 53 is realized by a storage device included in the information processing system 50.

For example, the data acquisition software 51, the data processing software 52, and the data shaping software 54 operate in this order, each acquiring the data processed by the preceding software and outputting the data resulting from its own processing.

The data acquisition software 51 acquires input data A1 from an input data storage unit 70 and provides it to the data processing software 52. The data processing software 52 generates accumulated data A2 from the input data A1 by data processing process s1, and stores the accumulated data A2 in the data storage unit 53.

The data acquisition software 51 also acquires input data B1 from an input data storage unit 71 and provides it to the data processing software 52. The data processing software 52 generates accumulated data B2 from the input data B1 by data processing process s2, and stores the accumulated data B2 in the data storage unit 53. A plurality of input data may be used to generate a given piece of accumulated data.

The data shaping software 54 acquires the accumulated data A2 and B2 stored in the data storage unit 53. The data shaping software 54 generates utilization data A3 from the accumulated data A2 by data shaping process s3, and stores the utilization data A3 in a utilization data storage unit 80. The data shaping software 54 generates utilization data AB from the accumulated data A2 and B2 by data shaping process s4, and stores the utilization data AB in the utilization data storage unit 80. The data shaping software 54 generates utilization data AB from the accumulated data A2 and B2 by data shaping process s5, and stores the utilization data AB in the utilization data storage unit 80. The utilization data can be regarded as the output data finally produced by the series of data processing operations.

At least a part of the input data storage units 70 and 71 and the utilization data storage unit 80 may be realized by a storage device included in the information processing system 50. Alternatively, at least a part of the input data storage units 70 and 71 and the utilization data storage unit 80 may be realized by a storage device that is external to the information processing system 50 and accessible from the information processing system 50 via the network 60. Since the "accumulated data" is intermediate data used to create the "utilization data" from the "input data", it can also be called "intermediate data". Furthermore, "data" may be a unit of information such as a file, a table, or a record.

FIG. 5 is a diagram showing an example of the functions of the information processing apparatus.
The information processing apparatus 100 includes a storage unit 110, a history information analysis unit 130, an evaluation unit 140, and a display control unit 150. A storage area of the RAM 102 or the HDD 103 is used as the storage unit 110. The history information analysis unit 130, the evaluation unit 140, and the display control unit 150 are realized by the CPU 101 executing a program stored in the RAM 102.

The storage unit 110 stores information used in the processing of the history information analysis unit 130, the evaluation unit 140, and the display control unit 150. The information stored in the storage unit 110 includes the history information. The history information is acquired from the information processing system 50 by the user and stored in the storage unit 110. Alternatively, the history information may be created by having the history information analysis unit 130 analyze a query execution log acquired from the information processing system 50, and then stored in the storage unit 110.

The history information analysis unit 130 analyzes the history information stored in the storage unit 110. The history information analysis unit 130 includes an input data extraction unit 131, an access authority prediction unit 132, and a delay time calculation unit 133.

Based on the history information, the input data extraction unit 131 extracts the start-point input data of the data processing realized by the plurality of pieces of software in the information processing system 50, and stores information on that start-point input data in the storage unit 110. The input data extraction unit 131 extracts the start-point input data corresponding to a piece of utilization data by tracing the generation flow of that utilization data backwards using the history information.

Based on the history information and the information on the start-point input data, the access authority prediction unit 132 predicts the access authority of the accumulated data generated from that input data and of the utilization data generated from the accumulated data. The access authority prediction unit 132 stores the predicted access authority information in the storage unit 110. The access authority prediction unit 132 identifies the processing content, such as data processing or data shaping, based on the history information. The access authority prediction unit 132 then predicts the access authority of the data output by the data processing or data shaping from the access authority of the data input to that processing and the identified processing content.

The delay time calculation unit 133 calculates the delay time from input to output in a series of data processing, based on the history information and the generation log of the data processed by the information processing system 50, and stores the calculated delay time information in the storage unit 110.

Based on the analysis results of the history information analysis unit 130, the evaluation unit 140 evaluates the quality of the data generated by the data processing that uses the plurality of pieces of software in the information processing system 50. The evaluation unit 140 evaluates the quality of the data according to the following three evaluation types.

The first evaluation type is data history. The evaluation unit 140 evaluates the data history based on the information on the start-point input data extracted by the input data extraction unit 131. Specifically, the evaluation unit 140 evaluates the data history according to whether or not the start-point input data matches the data intended by the user.

The second evaluation type is security. The evaluation unit 140 evaluates the security of the data based on the access authority predicted by the access authority prediction unit 132. Specifically, the evaluation unit 140 evaluates the security of a piece of data generated by the information processing system 50 according to whether or not its access authority is the appropriate access authority predicted from the access authority of the data from which it was generated.

The third evaluation type is up-to-dateness. The evaluation unit 140 evaluates the up-to-dateness of the data based on the data generation delay time calculated by the delay time calculation unit 133. Specifically, the evaluation unit 140 evaluates the up-to-dateness of a piece of data generated by the information processing system 50 according to whether or not it was generated within an allowable delay time after the source data from which it was generated came into existence.
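The up-to-dateness check described above can be sketched as a comparison of a measured delay with an allowable delay. This is only a minimal illustration: the function name, the timestamps, and the binary values 1 and 0 (chosen by analogy with the other evaluation types) are assumptions and are not taken from the description.

```python
from datetime import datetime, timedelta

def evaluate_up_to_dateness(source_created, output_created, allowed_delay):
    """Return 1 if the output was generated within the allowed delay, else 0.

    The 1/0 scale is an assumption made for this sketch.
    """
    delay = output_created - source_created
    return 1 if timedelta(0) <= delay <= allowed_delay else 0

# Hypothetical timestamps for a source datum and the data generated from it.
source_created = datetime(2020, 12, 25, 9, 0)
output_created = datetime(2020, 12, 25, 9, 40)

score_ok = evaluate_up_to_dateness(source_created, output_created, timedelta(hours=1))
score_late = evaluate_up_to_dateness(source_created, output_created, timedelta(minutes=30))
```

With a one-hour allowance the 40-minute delay passes, while with a 30-minute allowance it does not.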

The evaluation unit 140 assigns an evaluation value to each of the evaluation types of data history, security, and up-to-dateness. The evaluation value is an index of the level of quality: a larger evaluation value indicates higher quality, and a smaller evaluation value indicates lower quality. A predetermined value such as "1" may be set as the upper limit of the evaluation value for each evaluation type. The evaluation unit 140 stores the evaluation value assigned to each piece of data for each evaluation type in the storage unit 110. The evaluation unit 140 also performs a comprehensive evaluation of the utilization data based on the evaluation values of the individual items for each piece of data, and stores the result of the comprehensive evaluation in the storage unit 110. The result of the comprehensive evaluation of the utilization data is also used as the evaluation result of the data processing that generated the utilization data.

The display control unit 150 displays the evaluation results of the evaluation unit 140 on the display 61 or transmits them to another computer via the network 60. The display control unit 150 outputs a data flow diagram of the information processing system 50 and controls the display mode of the icons representing data in the data flow diagram according to the evaluation values of the data. For example, the display control unit 150 displays data whose evaluation value is higher than a reference so that it can be distinguished from data whose evaluation value is lower than the reference. The display control unit 150 also narrows the display down to the data flow that corresponds to an icon designated by the user, among the plurality of data flows included in the data flow diagram.

FIG. 6 is a diagram showing an example of the history information.
The history information 111 is stored in the storage unit 110 in advance. The example of the history information 111 uses the JSON (JavaScript Object Notation) format as the data format, but other data formats may be used. JAVASCRIPT is a registered trademark.

The history information 111 records the history of the data processed by the plurality of pieces of software in the information processing system 50. As an example, FIG. 6 shows the portion of the history information 111 indicating that the accumulated data "data-A4" was aggregated by the software that performs data aggregation and that the utilization data "data-A5" was output.

The value "XXX_script1" of the variable "typeName" indicates that the processing type of the data aggregation is a script written in a predetermined script language such as Python (registered trademark). "XXX" may represent, for example, the name of the script language. The variable name "createdBy" and the value "create-user" indicate that the name of the user who created the script is "create-user".

The value "XXX_Aggregation application" of the variable "attributes.qualifiedName" indicates that the qualified name of the software is "XXX_Aggregation application".

The value "Aggregation application" of the variable "description" indicates that the processing content of the software is "Aggregation application", that is, data aggregation.

The value "sample_user" of the variable "attributes.run_user" indicates that the name of the user who executed the data aggregation is "sample_user".

The value "sample_server" of the variable "attributes.server" indicates that the name of the server that executes the data aggregation software is "sample_server".

The value "data-A4" of the variable "attributes.inputs.name" indicates that the name of the input data for the data aggregation is "data-A4".
The value "hdfs_path" of the variable "attributes.inputs.typeName" indicates that the input data type is "hdfs_path". Here, hdfs is an abbreviation for Hadoop (registered trademark) Distributed File System.

For example, information on the storage unit from which the accumulated data "data-A4" is acquired may be included in the value of the variable "attributes.inputs.name" or in the value of the variable "attributes.inputs.typeName".

The value "data-A5" of the variable "attributes.outputs.name" indicates that the name of the output data produced by the data aggregation of the accumulated data "data-A4" is "data-A5".

The value "hdfs_path" of the variable "attributes.outputs.typeName" indicates that the output data type is "hdfs_path".
For example, information on the storage unit to which the utilization data "data-A5" is output may be included in the value of the variable "attributes.outputs.name" or in the value of the variable "attributes.outputs.typeName".
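As a rough illustration, the variables described above for FIG. 6 can be assembled into one record as follows. The variable names and values are those given in the text; the nesting under "attributes", the placement of "description" at the top level, and the use of lists for "inputs" and "outputs" are assumptions inferred from the dotted variable names, since the figure itself is not reproduced here.

```python
import json

# One history record for the data aggregation step (layout is an assumption).
history_record = {
    "typeName": "XXX_script1",
    "createdBy": "create-user",
    "description": "Aggregation application",
    "attributes": {
        "qualifiedName": "XXX_Aggregation application",
        "run_user": "sample_user",
        "server": "sample_server",
        # Input and output data of the aggregation step.
        "inputs": [{"name": "data-A4", "typeName": "hdfs_path"}],
        "outputs": [{"name": "data-A5", "typeName": "hdfs_path"}],
    },
}

# Serialize to JSON text, the data format named in the description.
history_json = json.dumps(history_record, indent=2)
print(history_json)
```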

For the processing of other software as well, the history information 111 records information indicating the history of the processed data in the same data structure.
Next, a method of evaluating data based on the history information 111 will be described. First, the evaluation of the data history, that is, the history evaluation, will be described. In the history evaluation, the output data is evaluated based on the information on the start-point input data of the data processing.

FIG. 7 is a diagram showing an example of the history evaluation.
For example, the history information 111 includes information indicating the history from the input data A1 to the utilization data A3. The history information 111 also includes information indicating the history from the accumulated data By stored in the data storage unit 53a to the utilization data Bz stored in the utilization data storage unit 81. The history information 111 indicates that the utilization data Bz was generated by the data shaping process s6 of the data shaping software 54 applied to the accumulated data By.

The user-owned input data list 112 is input to the information processing apparatus 100 by the user and stored in the storage unit 110. The user-owned input data list 112 shows information on the data that the user assumes to be the start-point input data of the data processing. For example, the user-owned input data list 112 includes the items of data name and acquisition source. The data name is the name of the data. The acquisition source is the name of the storage unit from which the data is acquired. For example, "DB_A1" is the name of the input data storage unit 70.

Based on the history information 111, the input data extraction unit 131 generates an input data list 113 for the user to whom the user-owned input data list 112 belongs. The input data list 113 shows information on the start-point input data of the data processing used by that user. The input data extraction unit 131 acquires the identification information of the user to whom the user-owned input data list 112 belongs. From the history information 111, the input data extraction unit 131 identifies the processes (for example, "description") for which the identification information of that user is recorded as the execution user (for example, "run_user"). Note that the history information 111 may contain only the history of data related to the data processing used by that user. In that case, the input data extraction unit 131 can omit the processing of distinguishing the histories of that user from those of other users.

The input data extraction unit 131 follows the input data and output data of the processes identified for the user in order, from input data to output data, and thereby identifies, for example, the history from the input data A1 to the utilization data A3. Similarly, the input data extraction unit 131 identifies the history from the accumulated data By to the utilization data Bz. Note that FIG. 7 omits the histories of other data for the user. Here, for a given process, the direction from input data to output data is the forward direction, and the direction from output data to input data is the reverse direction.

Based on the history of the data related to the user extracted from the history information 111, the input data extraction unit 131 extracts the start-point input data used to obtain the output data and generates the input data list 113. The input data list 113 is stored in the storage unit 110. The input data list 113 can be regarded as a list of the start-point input data identified by the input data extraction unit 131.

For example, the input data extraction unit 131 traces the history backwards from the utilization data A3 and thereby identifies the start-point input data A1 used to obtain the utilization data A3. The data at which the backward trace terminates, that is, data for which no corresponding input exists in the history information 111 and which therefore cannot be traced back any further, corresponds to the start-point input data. The input data extraction unit 131 also traces the history backwards from the utilization data Bz and thereby identifies the start-point input data By used to obtain the utilization data Bz. Since the start-point input data By is stored in the data storage unit 53, it can also be called the accumulated data By. The input data extraction unit 131 generates the input data list 113 containing the identified start-point input data A1 and By. Like the user-owned input data list 112, the input data list 113 includes the items of data name and acquisition source.
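The backward trace described above can be sketched as a graph walk that stops at data for which no producing process is recorded. The record layout (plain "inputs" and "outputs" name lists) and the function name are simplifications assumed for illustration; they are not the format of the history information 111.

```python
def find_start_inputs(records, target):
    """Trace the history backwards from `target` and return its start-point inputs."""
    # Map each output data name to the inputs of the process that produced it.
    producers = {}
    for rec in records:
        for out in rec["outputs"]:
            producers.setdefault(out, set()).update(rec["inputs"])

    starts, seen, stack = set(), set(), [target]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        if name in producers:
            # Some process generated this data: keep tracing in the reverse direction.
            stack.extend(producers[name])
        else:
            # No producing process is recorded: this is a start point.
            starts.add(name)
    return starts

# Simplified history for the flows A1 -> A2 -> A3 and By -> Bz.
records = [
    {"process": "s1", "inputs": ["A1"], "outputs": ["A2"]},
    {"process": "s3", "inputs": ["A2"], "outputs": ["A3"]},
    {"process": "s6", "inputs": ["By"], "outputs": ["Bz"]},
]

starts_a3 = find_start_inputs(records, "A3")
starts_bz = find_start_inputs(records, "Bz")
```

Because a process may have several inputs, the function returns a set, which also covers the case where one piece of utilization data has multiple start points.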

The evaluation unit 140 generates a mismatch list 114 by comparing the user-owned input data list 112 with the input data list 113 extracted from the history information 111. The mismatch list 114 is stored in the storage unit 110. The mismatch list 114 includes the items of list, history, and acquisition source. In the list item, the data names of the records of the user-owned input data list 112 that do not exist in the input data list 113 are registered. In the history item, the data names of the records of the input data list 113 that do not exist in the user-owned input data list 112 are registered. In the acquisition source item, the acquisition source of the data in question is registered. When a data name is registered in the list item, the history item of the same record is left unset. When a data name is registered in the history item, the list item of the same record is left unset. In the figure, an unset item is indicated by a hyphen "-".

For the user-owned input data list 112 and the input data list 113 in this example, a record containing the data name "Bx" and the acquisition source "DB_Bx" from the user-owned input data list 112 is registered in the mismatch list 114. A record containing the data name "By" and the acquisition source "DB_By" from the input data list 113 is also registered in the mismatch list 114.
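The comparison that produces the mismatch list can be sketched as two set differences over (data name, acquisition source) pairs. The function name and the dictionary keys are hypothetical, and the presence of the record ("A1", "DB_A1") in both lists is inferred from the example rather than stated explicitly.

```python
def build_mismatch_list(user_list, extracted_list):
    """Compare two lists of (data name, acquisition source) pairs."""
    user_set, extracted_set = set(user_list), set(extracted_list)
    rows = []
    # Records the user expects but the history does not contain: "list" column.
    for name, source in user_list:
        if (name, source) not in extracted_set:
            rows.append({"list": name, "history": "-", "source": source})
    # Records found in the history but not expected by the user: "history" column.
    for name, source in extracted_list:
        if (name, source) not in user_set:
            rows.append({"list": "-", "history": name, "source": source})
    return rows

user_owned_list = [("A1", "DB_A1"), ("Bx", "DB_Bx")]      # list 112 (A1 entry inferred)
input_data_list = [("A1", "DB_A1"), ("By", "DB_By")]      # list 113 (A1 entry inferred)

mismatch_list = build_mismatch_list(user_owned_list, input_data_list)
```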

The evaluation unit 140 generates a history evaluation result 115 based on the mismatch list 114. The history evaluation result 115 is stored in the storage unit 110. The history evaluation result 115 includes the items of classification and evaluation value. In the classification item, information representing a data processing section that forms part of the series of data processing is registered. In the evaluation value item, the evaluation value of the data quality for each classification is registered.

Here, for example, when the data changes from start-point input data to accumulated data to utilization data, the span from the input data to the accumulated data is classified as the first section, and the span from the accumulated data to the utilization data is classified as the second section. Alternatively, the data may change from start-point input data to first accumulated data, second accumulated data, and utilization data. In this case, the span from the input data to the first accumulated data may be classified as the first section, the span from the first accumulated data to the second accumulated data as the second section, and the span from the second accumulated data to the utilization data as the third section.

Among the histories of data related to the user in the history information 111, the evaluation unit 140 identifies those whose start-point input data corresponds to the history and acquisition source information in the mismatch list 114, and sets the evaluation value to "0" for the classifications that follow that start-point input data. Conversely, among the histories of data related to the user in the history information 111, the evaluation unit 140 identifies those whose start-point input data corresponds to a data name and acquisition source in the user-owned input data list 112, and sets the evaluation value to "1" for the classifications that follow that start-point input data.

For example, a record with the evaluation value "1" is registered in the history evaluation result 115 for the classification "A1-A2". This indicates that the quality evaluation value, by the history evaluation, of the accumulated data A2 generated from the input data A1 is "1".

A record with the evaluation value "1" is also registered in the history evaluation result 115 for the classification "A2-A3". This indicates that the quality evaluation value, by the history evaluation, of the utilization data A3 generated from the accumulated data A2 is "1".

In addition, a record with the evaluation value "0" is registered in the history evaluation result 115 for the classification "By-Bz". This indicates that the quality evaluation value, by the history evaluation, of the utilization data Bz generated from the accumulated data By is "0".
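The assignment of evaluation values to classifications can be sketched as follows. This is a simplified illustration: each history is represented as a list of data names from the start point to the utilization data, and every start point that is not registered in the history column of the mismatch list is treated as matching the user-owned input data list, which is an assumption of this sketch. The function name and the "X-Y" label format are likewise hypothetical.

```python
def evaluate_history(chains, mismatched_starts):
    """Assign 1 or 0 to every section of each chain, based on its start point."""
    result = {}
    for chain in chains:
        # The value is decided once per chain from its start-point input data.
        value = 0 if chain[0] in mismatched_starts else 1
        # Each consecutive pair of data in the chain forms one classification.
        for upstream, downstream in zip(chain, chain[1:]):
            result[f"{upstream}-{downstream}"] = value
    return result

chains = [["A1", "A2", "A3"], ["By", "Bz"]]
# "By" is the data name registered in the history column of the mismatch list.
history_result = evaluate_history(chains, mismatched_starts={"By"})
```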

Note that a piece of utilization data may have multiple actual start-point data. In that case, the evaluation unit 140 can be controlled so that the more of the actual start-point data are included in the user-owned input data list 112, the higher it rates the quality of the utilization data and intermediate data.

Next, the security evaluation will be described.
FIG. 8 is a diagram showing an example of the security evaluation.
For example, the history information 111 includes information indicating the history that leads from the input data Ax stored in the input data storage unit 70, through the accumulated data Ay stored in the data storage unit 53, to the utilization data Az and AB1 stored in the utilization data storage unit 80.

The history information 111 indicates that the accumulated data Ay was generated from the input data Ax through data acquisition by the data acquisition software 51 and the data processing operation s1 of the data processing software 52. The history information 111 also indicates that the utilization data Az was generated from the accumulated data Ay by the data shaping process s3 of the data shaping software 54. Furthermore, the history information 111 indicates that the utilization data AB1 was generated from the accumulated data Ay by the data shaping process s4 of the data shaping software 54.

First, the access authority prediction unit 132 acquires input data access authority information 111a, which indicates the access authority for the start-point input data identified during the history evaluation. The input data access authority information 111a may be provided by the user or acquired from the information processing system 50. Alternatively, if the history information 111 contains data access authority information, the input data access authority information 111a may be acquired from the history information 111. The input data access authority information 111a is stored in the storage unit 110.

For example, the access authority indicates restrictions on the use of the data, such as the personnel who are permitted to access it. The access authority prediction unit 132 may identify the personnel permitted to access the data according to information on the purpose of use of the data (for example, whether it is intended for publication or is to be kept confidential within the organization).

The access authority prediction unit 132 also acquires processing and shaping information 111b based on the history information 111. The processing and shaping information 111b indicates the processing content used in a given process to obtain the output data from the input data. For example, the access authority prediction unit 132 derives the processing content by analyzing the queries issued to the software identified from the history information 111. The processing and shaping information 111b is stored in the storage unit 110.

The access authority prediction unit 132 predicts the access authority based on the input data access authority information 111a and the processing and shaping information 111b, and generates an access authority prediction result 116. The details of the access authority prediction are described later. The access authority prediction result 116 is stored in the storage unit 110.

The access authority prediction result 116 includes the items of data name and access authority. In the data name item, the name of the data is registered. In the access authority item, the access authority predicted for the data from its preceding data is registered.

For example, a record indicating that the predicted access authority for the data name "Ay" is "person in charge only" is registered in the access authority prediction result 116. A record indicating that the predicted access authority for the data name "Az" is "data administrator only" is also registered, as is a record indicating that the predicted access authority for the data name "AB1" is "anyone".

Separately from the access authority prediction result 116, the access authority prediction unit 132 acquires the access authority information 117, which indicates the actual access authority of each piece of data. The access authority information 117 is acquired from the information processing system 50 and stored in the storage unit 110. For example, the access authority information 117 includes a record indicating that the actual access authority for the data name "Ay" is "anyone", a record indicating that the actual access authority for the data name "Az" is "anyone", and a record indicating that the actual access authority for the data name "AB1" is "anyone".

The evaluation unit 140 generates the security evaluation result 118 by comparing the access authority prediction result 116 with the access authority information 117. The security evaluation result 118 is stored in the storage unit 110. The security evaluation result 118 includes the items of classification and evaluation value. These items have the same meanings as the items of the same names in the history evaluation result 115.

The evaluation unit 140 compares the access authority prediction result 116 with the access authority information 117 and determines, for data having the same data name, whether the predicted access authority matches the actual access authority. When they match, the evaluation unit 140 sets the evaluation value to "1" for the classification whose output is the corresponding data. When they do not match, the evaluation unit 140 sets the evaluation value to "0" for the classification whose output is the corresponding data.

For example, a record with the evaluation value "0" is registered in the security evaluation result 118 for the classification "between Ax and Ay". This indicates that the quality evaluation value given by the security evaluation to the accumulated data Ay, which is generated from the input data Ax, is "0".

A record with the evaluation value "0" is also registered in the security evaluation result 118 for the classification "between Ay and Az". This indicates that the quality evaluation value given by the security evaluation to the utilization data Az, which is generated from the accumulated data Ay, is "0".

Further, a record with the evaluation value "1" is registered in the security evaluation result 118 for the classification "between Ay and AB1". This indicates that the quality evaluation value given by the security evaluation to the utilization data AB1, which is generated from the accumulated data Ay, is "1".
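The comparison that yields the security evaluation result 118 can be sketched as follows. This is a minimal illustration and not the implementation of the evaluation unit 140: the function name `security_evaluation`, the dictionary layout, and the representation of a classification as an (input, output) pair are all assumptions made for the example. The records are those given for Ay, Az, and AB1 in the text.

```python
def security_evaluation(predicted, actual, classifications):
    """Return {classification label: 0 or 1}.

    predicted / actual: data name -> access authority label (hypothetical layout).
    classifications: list of (input data name, output data name) pairs.
    """
    result = {}
    for src, dst in classifications:
        # A classification scores 1 only when its output data has both a
        # predicted and an actual authority and the two are identical.
        matched = (
            dst in predicted
            and dst in actual
            and predicted[dst] == actual[dst]
        )
        result[f"{src}-{dst}"] = 1 if matched else 0
    return result


predicted = {
    "Ay": "only the person in charge",
    "Az": "only the data administrator",
    "AB1": "anyone",
}
actual = {"Ay": "anyone", "Az": "anyone", "AB1": "anyone"}
classifications = [("Ax", "Ay"), ("Ay", "Az"), ("Ay", "AB1")]

security_result = security_evaluation(predicted, actual, classifications)
print(security_result)
```

With these records, only the classification whose output is AB1 receives the value 1, which agrees with the three example records described in the text.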

Here, an example of access authority prediction will be described.
FIG. 9 is a diagram showing an example of access authority prediction.
Based on the input data access authority information 111a regarding the start-point input data, the access authority prediction unit 132 can predict the access authority of other data generated from the start-point input data. For example, the access authority prediction unit 132 may acquire the input data access authority information 111a from data catalog information in the information processing system 50, or may acquire input data access authority information 111a entered by the user.

The input data access authority information 111a includes the items of data name, column, confidentiality classification, and access authority. The name of the data is registered in the data name item. The name of a column included in the corresponding data (column name) is registered in the column item. A column is a data item included in the corresponding data. In the confidentiality classification item, the classification of confidentiality management for the corresponding column of the corresponding data is registered. In the access authority item, the access authority for the corresponding column of the corresponding data is registered.

For example, a record with the data name "Ax", the column "column_a", the confidentiality classification "company confidential", and the access authority "anyone in the company" is registered in the input data access authority information 111a. This record indicates that the information in the column "column_a" of the input data Ax is classified as "company confidential" and can be accessed by "anyone in the company". Here, "in the company" refers to all users belonging to the company to which the user in question belongs.

For other columns as well, such as "column_b" of the data name "Ax", the input data access authority information 111a registers confidentiality classifications such as "restricted to related parties" and "public information", and access authorities such as "only the person in charge" and "anyone in the company". The input data access authority information 111a may also include confidentiality classification and access authority information for other start-point input data.

The access authority prediction unit 132 acquires the processing and shaping information 111b. As described above, the access authority prediction unit 132 generates the processing and shaping information 111b by performing query analysis on each process included in a series of data processing. The access authority prediction unit 132 can also generate the processing and shaping information 111b from the relationship, recorded in the history information 111, between the input data and the output data of a given process. The processing and shaping information 111b includes the items of process, input data, output data, input column, and output column. In the process item, identification information of the processing content in each piece of software is registered. In the input data item, the input data for that processing content is registered. In the output data item, the output data for that processing content is registered. In the input column item, the name of a column in the input data (input column) is registered. In the output column item, the name of a column in the output data (output column) is registered.

For example, the processing and shaping information 111b includes a record with the process "process s1", the input data "Ax", the output data "Ay", the input column "column_a", and the output column "column_a1". This record indicates that, in the data processing s1, the column column_a1 of the accumulated data Ay is generated from the column column_a of the input data Ax.

As another example, the processing and shaping information 111b includes a record with the process "process s1", the input data "Ax", the output data "Ay", the input columns "column_b, column_c", and the output column "column_bc". This record indicates that, in the data processing s1, the column column_bc of the accumulated data Ay is generated from the columns column_b and column_c of the input data Ax.

Records for other processes, such as the data shaping process s3, are also registered in the processing and shaping information 111b.
The access authority prediction unit 132 generates a prediction result 116a of the access authority for each output column based on the input data access authority information 111a and the processing and shaping information 111b. The access authority of an output column is predicted from the access authority of its input columns. For example, when an output column has one corresponding input column, the access authority of that input column is the access authority predicted for the output column. When an output column has a plurality of input columns, the most restrictive of the access authorities of those input columns is the access authority predicted for the output column.

For example, according to the processing and shaping information 111b, the column column_a1 is generated from the column column_a. According to the input data access authority information 111a, the access authority of the column column_a is "anyone in the company". The access authority prediction unit 132 therefore predicts that the access authority of the column column_a1 of the accumulated data Ay is "anyone in the company". The access authority prediction unit 132 adds the predicted access authority "anyone in the company" to the prediction result 116a in association with the identification information of the column column_a1.

Further, according to the processing and shaping information 111b, the column "column_bc" is generated from the columns "column_b" and "column_c". According to the input data access authority information 111a, the access authority of the column "column_b" is "only the person in charge", and the access authority of the column "column_c" is "anyone in the company". The access authority prediction unit 132 therefore predicts that the access authority of the column "column_bc" of the accumulated data Ay is "only the person in charge", the more restrictive of "only the person in charge" and "anyone in the company". The access authority prediction unit 132 adds the predicted access authority "only the person in charge" to the prediction result 116a in association with the identification information of the column column_bc.

The access authority prediction unit 132 predicts the access authority of the accumulated data Ay based on the access authorities predicted for the columns column_a1 and column_bc included in the accumulated data Ay, and generates the access authority prediction result 116. For example, the access authority prediction unit 132 may take the most restrictive of the access authorities predicted for all columns of the data as the access authority of that data. In the example of the accumulated data Ay, of the access authorities "anyone in the company" and "only the person in charge" predicted for the columns column_a1 and column_bc, the most restrictive is "only the person in charge". The access authority prediction unit 132 therefore predicts that the access authority of the accumulated data Ay is "only the person in charge", and adds this predicted access authority to the access authority prediction result 116.
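The two prediction steps, per output column and then per data, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation of the access authority prediction unit 132. The numeric ranking in `RESTRICTIVENESS` is an assumption: the text only states that "only the person in charge" is more restrictive than "anyone in the company". The function names and record layout are likewise hypothetical.

```python
# Assumed ordering from least to most restrictive. Only the relation between
# the last two labels is stated in the text; the rest is an assumption.
RESTRICTIVENESS = {
    "anyone": 0,
    "anyone in the company": 1,
    "only the person in charge": 2,
}


def most_restrictive(authorities):
    """Pick the authority with the highest assumed restrictiveness rank."""
    return max(authorities, key=lambda a: RESTRICTIVENESS[a])


def predict_column_authority(input_authority, shaping_records):
    """Predict the authority of each output column (prediction result 116a).

    input_authority: {(data name, column name): authority}
    shaping_records: list of dicts with input_data, output_data,
                     input_columns (list) and output_column.
    """
    predicted = {}
    for rec in shaping_records:
        sources = [
            input_authority[(rec["input_data"], col)]
            for col in rec["input_columns"]
        ]
        key = (rec["output_data"], rec["output_column"])
        predicted[key] = most_restrictive(sources)
    return predicted


def predict_data_authority(column_prediction):
    """Take the most restrictive column authority as the data authority."""
    by_data = {}
    for (data, _column), authority in column_prediction.items():
        by_data.setdefault(data, []).append(authority)
    return {data: most_restrictive(auths) for data, auths in by_data.items()}


input_authority = {
    ("Ax", "column_a"): "anyone in the company",
    ("Ax", "column_b"): "only the person in charge",
    ("Ax", "column_c"): "anyone in the company",
}
shaping_records = [
    {"input_data": "Ax", "output_data": "Ay",
     "input_columns": ["column_a"], "output_column": "column_a1"},
    {"input_data": "Ax", "output_data": "Ay",
     "input_columns": ["column_b", "column_c"], "output_column": "column_bc"},
]

column_prediction = predict_column_authority(input_authority, shaping_records)
data_prediction = predict_data_authority(column_prediction)
print(column_prediction)
print(data_prediction)
```

Using a single ranking for both steps keeps the column-level and data-level rules identical, which matches the "most restrictive wins" wording used for both in the text.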

Note that the evaluation unit 140 may generate the security evaluation result 118 by comparing the prediction result 116a, predicted for each column of the data, with the actual access authority of each column of that data. In that case, for example, the evaluation unit 140 can make the evaluation value of the data higher as more columns have matching access authority, that is, it can evaluate the quality as higher the more columns match.
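One way to realize a column-based score is the fraction of columns whose predicted and actual authorities agree. The text only requires that more matching columns give a higher value, so the ratio used here is an assumption, as are the function name `column_match_score` and the actual per-column authorities in the example, which do not appear in the source.

```python
def column_match_score(predicted_columns, actual_columns):
    """Return the fraction of predicted columns whose authority matches.

    Both arguments map column name -> access authority label.
    """
    if not predicted_columns:
        return 0.0
    matches = sum(
        1
        for column, authority in predicted_columns.items()
        if actual_columns.get(column) == authority
    )
    return matches / len(predicted_columns)


predicted_columns = {
    "column_a1": "anyone in the company",
    "column_bc": "only the person in charge",
}
# Hypothetical actual authorities, chosen only to illustrate a partial match.
actual_columns = {
    "column_a1": "anyone in the company",
    "column_bc": "anyone in the company",
}

score = column_match_score(predicted_columns, actual_columns)
print(score)
```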

Next, the up-to-dateness evaluation will be described.
FIG. 10 is a diagram showing an example of the up-to-dateness evaluation.
The history information 111 includes information indicating the history of the data illustrated in FIG. 4. First, the delay time calculation unit 133 acquires the delay requirement information 119. In the delay requirement information 119, the time that the user tolerates as the delay from the generation of the start-point input data until the utilization data is updated is registered. The delay requirement information 119 is input to the information processing device 100 by the user and stored in the storage unit 110.

The delay requirement information 119 includes the items of data name and delay requirement. The data name of the utilization data is registered in the data name item. In the delay requirement item, the delay time permitted from the generation of the start-point input data until the utilization data is updated is registered. The upper limit of that permitted delay time may be registered in the delay requirement item.

For example, the delay requirement information 119 includes a record indicating that the delay requirement of the utilization data A3 is "within 2 hours", a record indicating that the delay requirement of the utilization data AB is "within 5 minutes", and a record indicating that the delay requirement of the utilization data B3 is "within 1 minute". The delay requirement information 119 may include delay requirement records for utilization data generated by other data processing.

The delay time calculation unit 133 generates the actual delay time information 120 based on the history information 111. The actual delay time information 120 is stored in the storage unit 110. For example, the delay time calculation unit 133 acquires, from the information processing system 50, the data update log recorded by the information processing system 50, and stores it in the storage unit 110. The data update log includes information indicating a data name and the time at which the data with that name was updated. The delay time calculation unit 133 generates the actual delay time information 120 based on the history information 111 and the data update log.

The actual delay time information 120 includes the items of data name, update time, and delay time. The data name is registered in the data name item. In the update time item, the update time of the data with the corresponding data name is registered. For simplicity, the example of FIG. 10 shows update times that all fall on the same day, but an update time may include a date. In the delay time item, the time elapsed since the start-point input data was updated, that is, the delay time, is registered. For start-point input data, the delay time is "-" (not set).

The delay time calculation unit 133 can acquire the data name and update time information from the data update log described above. Further, by tracing the history of the data based on the history information 111, the delay time calculation unit 133 can identify the start-point input data and the subsequent data generated from it, and can calculate the delay time for that subsequent data.

For example, the actual delay time information 120 includes a record with the update time "02:30" and the delay time "-" for the input data A1. Because the input data A1 is "start-point input data" identified by the input data extraction unit 131, its delay time is "-".

The actual delay time information 120 also includes a record with the update time "13:30" and the delay time "-" for the input data B1. Because the input data B1 is "start-point input data" identified by the input data extraction unit 131, its delay time is "-".

The actual delay time information 120 also includes a record with the update time "02:32" and the delay time "2 minutes" for the accumulated data A2. According to the history information 111, the accumulated data A2 is stored in the data storage unit 53 after data acquisition and data processing are performed on the input data A1. The delay time of the accumulated data A2 is therefore "2 minutes", the difference between the update time "02:30" of the input data A1 and the update time "02:32" of the accumulated data A2.

The actual delay time information 120 includes a record with the update time "13:33" and the delay time "3 minutes" for the accumulated data B2, a record with the update time "04:00" and the delay time "1 hour 30 minutes" for the utilization data A3, a record with the update time "13:33" and the delay time "3 minutes" for the utilization data AB, and a record with the update time "13:33" and the delay time "3 minutes" for the utilization data B3.
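The delay calculation can be sketched from an update log and a parent map derived from the history information. This is a hedged sketch, not the implementation of the delay time calculation unit 133. The `parents` map, the function names, and the same-day time format are assumptions. In particular, for data with more than one start point (AB derives from both A1 and B1), the sketch measures the delay from the most recently updated start point. The text does not state this rule; it is an assumption that happens to reproduce the "3 minutes" recorded for AB.

```python
from datetime import datetime, timedelta


def start_points(name, parents):
    """Trace the lineage back to the data that have no parents."""
    direct = parents.get(name, [])
    if not direct:
        return {name}
    found = set()
    for parent in direct:
        found |= start_points(parent, parents)
    return found


def actual_delays(update_log, parents):
    """Return {data name: timedelta or None}; None corresponds to "-"."""
    # Assumes all update times fall on the same day, as in the example.
    times = {n: datetime.strptime(t, "%H:%M") for n, t in update_log.items()}
    delays = {}
    for name in update_log:
        if not parents.get(name):
            delays[name] = None  # start-point input data has no delay
            continue
        # Assumption: measure from the most recently updated start point.
        latest_start = max(times[s] for s in start_points(name, parents))
        delays[name] = times[name] - latest_start
    return delays


# Hypothetical parent map consistent with the data flow described in the text.
parents = {
    "A2": ["A1"], "A3": ["A2"],
    "B2": ["B1"], "B3": ["B2"],
    "AB": ["A2", "B2"],
}
update_log = {
    "A1": "02:30", "B1": "13:30", "A2": "02:32", "B2": "13:33",
    "A3": "04:00", "AB": "13:33", "B3": "13:33",
}

delays = actual_delays(update_log, parents)
for name, delay in delays.items():
    print(name, delay)
```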

Based on the actual delay time information 120, the evaluation unit 140 evaluates the up-to-dateness of the data by determining whether the delay time calculated for each piece of utilization data satisfies the delay requirement in the delay requirement information 119, and generates the up-to-dateness evaluation result 121. The up-to-dateness evaluation result 121 is stored in the storage unit 110. The up-to-dateness evaluation result 121 includes the items of classification and evaluation value. These items have the same meanings as the items of the same names in the history evaluation result 115.

When the delay time calculated for a piece of utilization data satisfies the delay requirement, the evaluation unit 140 sets the evaluation value of each classification leading to that utilization data to "1". When the delay time does not satisfy the delay requirement, the evaluation unit 140 sets the evaluation value of each classification leading to that utilization data to "0". When both evaluation values "1" and "0" could be assigned to a classification, the evaluation value of that classification is set to "0".

In the example of the actual delay time information 120, the utilization data A3 and AB satisfy their delay requirements, so the evaluation value of each classification leading to the utilization data A3 and AB is "1". The utilization data B3, on the other hand, does not satisfy its delay requirement, so the evaluation value of each classification leading to the utilization data B3 is "0". In particular, the classification "between B1 and B2" leads to both the utilization data AB and B3, but because the delay requirement is not satisfied for the utilization data B3, its evaluation value is "0".
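The rule that a classification receives "0" whenever any utilization data it leads to misses its requirement can be sketched with a running minimum. This is an illustration under assumptions, not the implementation of the evaluation unit 140: the `parents` map, the function names, and the "X-Y" classification labels are hypothetical, and the delays are entered directly rather than computed.

```python
from datetime import timedelta


def edges_leading_to(target, parents):
    """Collect every (input, output) edge on a path into the target data."""
    edges = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, []):
            if (parent, node) not in edges:
                edges.add((parent, node))
                stack.append(parent)
    return edges


def up_to_dateness_evaluation(delays, requirements, parents):
    """Return {classification label: 0 or 1}; 0 wins when both apply."""
    result = {}
    for target, limit in requirements.items():
        value = 1 if delays[target] <= limit else 0
        for src, dst in edges_leading_to(target, parents):
            label = f"{src}-{dst}"
            # min() keeps 0 once any downstream utilization data fails.
            result[label] = min(result.get(label, 1), value)
    return result


parents = {
    "A2": ["A1"], "A3": ["A2"],
    "B2": ["B1"], "B3": ["B2"],
    "AB": ["A2", "B2"],
}
delays = {
    "A3": timedelta(hours=1, minutes=30),
    "AB": timedelta(minutes=3),
    "B3": timedelta(minutes=3),
}
requirements = {
    "A3": timedelta(hours=2),
    "AB": timedelta(minutes=5),
    "B3": timedelta(minutes=1),
}

freshness = up_to_dateness_evaluation(delays, requirements, parents)
print(freshness)
```

Because the minimum is order-independent, the result does not depend on the order in which the utilization data are evaluated.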

Note that the history information 111 may itself include information equivalent to the data update log. In that case, the delay time calculation unit 133 can generate the actual delay time information 120 from the history information 111 without separately acquiring the data update log from the information processing system 50.

In this way, the evaluation unit 140 evaluates data quality through the history evaluation, the security evaluation, and the up-to-dateness evaluation, based on the analysis results of the history information analysis unit 130. The evaluation unit 140 further performs a comprehensive evaluation of data quality based on the results of the history evaluation, the security evaluation, and the up-to-dateness evaluation. Next, the comprehensive evaluation will be described.

FIG. 11 is a diagram showing an example of a comprehensive evaluation result table.
The comprehensive evaluation result table 122 is generated by the evaluation unit 140 based on the history evaluation result 115, the security evaluation result 118, and the up-to-dateness evaluation result 121, and is stored in the storage unit 110. The comprehensive evaluation result table 122 includes the items of classification, history evaluation value, security evaluation value, up-to-dateness evaluation value, and comprehensive evaluation value.

The classification is registered in the classification item. The meaning of the classification is the same as in the history evaluation result 115. In the history evaluation value item, the evaluation value in the history evaluation result 115 for the corresponding classification, that is, the history evaluation value, is registered. In the security evaluation value item, the evaluation value in the security evaluation result 118 for the corresponding classification, that is, the security evaluation value, is registered. In the up-to-dateness evaluation value item, the evaluation value in the up-to-dateness evaluation result 121 for the corresponding classification, that is, the up-to-dateness evaluation value, is registered. In the comprehensive evaluation value item, a comprehensive evaluation value calculated from the history evaluation value, the security evaluation value, and the up-to-dateness evaluation value is registered. For example, the comprehensive evaluation value is the sum of the history evaluation value, the security evaluation value, and the up-to-dateness evaluation value.

For example, the comprehensive evaluation result table 122 includes a record with the history evaluation value "1", the security evaluation value "1", the up-to-dateness evaluation value "1", and the comprehensive evaluation value "3" for the classification "between A1 and A2". It also includes a record with the history evaluation value "0", the security evaluation value "0", the up-to-dateness evaluation value "0", and the comprehensive evaluation value "0" for the classification "between B1 and B2". Records for other classifications are also registered in the comprehensive evaluation result table 122.

In the above example, the evaluation unit 140 uses the sum (V1 + V2 + V3) of the history evaluation value (V1), the security evaluation value (V2), and the up-to-dateness evaluation value (V3) as the comprehensive evaluation value. Here, V1, V2, and V3 are positive real numbers. Other methods of calculating the comprehensive evaluation value are also conceivable. For example, the evaluation unit 140 may use, as the comprehensive evaluation value, a weighted sum (w1 * V1 + w2 * V2 + w3 * V3) in which the history evaluation value, the security evaluation value, and the up-to-dateness evaluation value are each weighted. Here, w1, w2, and w3 are positive real numbers.
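Both variants of the comprehensive evaluation value can be expressed with one function whose default weights reduce to the plain sum. The function name `comprehensive_value` and the example weights are assumptions made for illustration.

```python
def comprehensive_value(v1, v2, v3, weights=(1.0, 1.0, 1.0)):
    """Return w1*V1 + w2*V2 + w3*V3; default weights give V1 + V2 + V3."""
    w1, w2, w3 = weights
    return w1 * v1 + w2 * v2 + w3 * v3


# Records from the comprehensive evaluation result table in the text.
a1_a2 = comprehensive_value(1, 1, 1)
b1_b2 = comprehensive_value(0, 0, 0)

# Hypothetical weighting that counts the security evaluation twice.
weighted = comprehensive_value(1, 1, 0, weights=(1.0, 2.0, 1.0))
print(a1_a2, b1_b2, weighted)
```

Note that with non-unit weights the full score is no longer 3 but w1 + w2 + w3, so any threshold that relies on the full score would need to be adjusted accordingly.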

Based on the comprehensive evaluation result table 122, the display control unit 150 causes the display 61 to display an evaluation result screen showing the evaluation results. Next, examples of the evaluation result screen will be described.
FIG. 12 is a diagram showing a first example of the evaluation result screen.

The evaluation result screen 400 includes images of a data flow diagram 401 and a legend 402. The data flow diagram 401 shows the flow of data in the information processing system 50. The display control unit 150 accepts input of a user's identification information and, based on the history information 111, displays the flow of data related to that user's identification information as the data flow diagram 401.

In the data flow diagram 401, the flow of data is represented by arrows. Each arrow corresponds to one classification in the comprehensive evaluation result table 122. By coloring the arrows, the display control unit 150 presents to the user the evaluation value for each classification, that is, the quality evaluation result. The legend 402 indicates the quality level corresponding to each arrow color.

The example of FIG. 12 shows a case in which quality is distinguished by three colors. The first color represents "high" quality, the second color represents "medium" quality, and the third color represents "low" quality. The first color is, for example, green. The second color is, for example, yellow. The third color is, for example, red.

For example, the display control unit 150 draws in the first color the arrows of classifications whose comprehensive evaluation value in the comprehensive evaluation result table 122 is the full score, that is, "3". The display control unit 150 draws in the third color the arrows of classifications whose comprehensive evaluation value is "0". Further, the display control unit 150 draws in the second color the arrows of classifications whose comprehensive evaluation value is greater than 0 and less than 3. The color coding of arrows according to the comprehensive evaluation value may instead use two colors, or four or more colors.

　表示制御部１５０は、総合評価値が低い（例えば、３未満の総合評価値である）矢印を経由して得られる活用データに対して、例えばクロスマーク「Ｘ」を重ねて表示させることで、見直しを要する箇所であることをユーザに提示する。 The display control unit 150 superimposes, for example, a cross mark "X" on utilization data obtained via an arrow having a low comprehensive evaluation value (for example, a comprehensive evaluation value of less than 3), thereby showing the user that the location needs to be reviewed.

　ユーザは、評価結果画面４００に重ねて表示されるポインタＰ１を、入力デバイス６２により操作することで、データフロー図４０１におけるデータや処理などを表すアイコンを選択することができる。 The user can select an icon representing data or processing in the data flow diagram 401 by operating the pointer P1 displayed on the evaluation result screen 400 by the input device 62.

　図１３は、評価結果画面の第２の例を示す図である。
　表示制御部１５０は、評価結果画面４００における活用データＡＢのアイコンがポインタＰ１により選択されたことを検出すると、評価結果画面４００を評価結果画面５００に更新する。評価結果画面５００は、データフロー図５０１および凡例５０２の画像を含む。凡例５０２は、凡例４０２と同じである。 FIG. 13 is a diagram showing a second example of the evaluation result screen.
When the display control unit 150 detects that the icon of the utilization data AB on the evaluation result screen 400 is selected by the pointer P1, the display control unit 150 updates the evaluation result screen 400 to the evaluation result screen 500. The evaluation result screen 500 includes images of the data flow diagram 501 and the legend 502. The legend 502 is the same as the legend 402.

　データフロー図５０１では、選択された活用データＡＢからデータ整形処理ｓ４を経由して蓄積データＢ２へ遡る、逆方向の矢印が強調表示される。また、データフロー図５０１では、入力データＢ１からデータ取得処理やデータ加工処理ｓ２を経て蓄積データＢ２へ至ることを表す順方向の矢印が強調表示される。それ以外の矢印については、目立たない態様で表示される。 In the data flow diagram 501, an arrow in the reverse direction, which goes back from the selected utilization data AB to the accumulated data B2 via the data shaping process s4, is highlighted. Further, in the data flow diagram 501, a forward arrow indicating that the input data B1 reaches the accumulated data B2 via the data acquisition process and the data processing process s2 is highlighted. Other arrows are displayed in an inconspicuous manner.

　また、データ整形処理ｓ４に対するデータの入力元を遡る際、当該入力元は蓄積データＡ２，Ｂ２と２つあり、分岐している。この場合、表示制御部１５０は、総合評価値の低い方を優先的に選択して、逆方向の矢印を表示させることが考えられる。より具体的には、評価結果画面５００では、活用データＡＢから遡ったデータ整形処理ｓ４を逆に辿ると、蓄積データＡ２，Ｂ２に分岐する。したがって、表示制御部１５０は、分類「Ａ１－Ａ２間」、分類「Ｂ１－Ｂ２間」のうち、総合評価値の低い方である分類「Ｂ１－Ｂ２間」に連なる矢印を強調表示させる。これにより、ユーザはデータの品質低下の要因となる箇所を見つけ易くなる。 Further, when the input sources of the data shaping process s4 are traced back, there are two input sources, the accumulated data A2 and B2, so the path branches. In this case, it is conceivable that the display control unit 150 preferentially selects the branch having the lower comprehensive evaluation value and displays the reverse-direction arrow for it. More specifically, on the evaluation result screen 500, tracing back through the data shaping process s4 from the utilization data AB branches into the accumulated data A2 and B2. Therefore, of the classifications "between A1 and A2" and "between B1 and B2", the display control unit 150 highlights the arrows connected to the classification "between B1 and B2", which has the lower comprehensive evaluation value. This makes it easier for the user to find the location that causes the deterioration of data quality.

　なお、図１２，図１３の例では、総合評価値に対する評価結果画面４００，５００を例示したが、表示制御部１５０は、総合評価結果テーブル１２２を基に、来歴評価値、セキュリティ評価値および最新性評価値の各々に対する評価結果画面を表示させてもよい。 In the examples of FIGS. 12 and 13, the evaluation result screens 400 and 500 for the comprehensive evaluation value are illustrated, but the display control unit 150 may display, based on the comprehensive evaluation result table 122, an evaluation result screen for each of the history evaluation value, the security evaluation value, and the up-to-dateness evaluation value.

　次に、情報処理装置１００の処理手順を説明する。
　図１４は、情報処理装置の処理例を示すフローチャートである。
　情報処理装置１００は、例えば、ユーザによるデータの品質評価の開始の入力を受け付けると下記の手順を開始する。 Next, the processing procedure of the information processing apparatus 100 will be described.
FIG. 14 is a flowchart showing a processing example of the information processing apparatus.
The information processing apparatus 100 starts the following procedure when, for example, it receives an input from the user to start data quality evaluation.

　（Ｓ１０）来歴情報解析部１３０は、来歴情報１１１に基づいて、該当のユーザの識別情報に対応する入力データリスト１１３を生成する。評価部１４０は、ユーザ所持入力データリスト１１２および入力データリスト１１３に基づいてデータの来歴評価を行い、来歴評価結果１１５を生成する。評価部１４０は、生成した来歴評価結果１１５を記憶部１１０に格納する。来歴評価の手順は後述される。 (S10) The history information analysis unit 130 generates an input data list 113 corresponding to the identification information of the corresponding user based on the history information 111. The evaluation unit 140 performs a history evaluation of the data based on the user-owned input data list 112 and the input data list 113, and generates the history evaluation result 115. The evaluation unit 140 stores the generated history evaluation result 115 in the storage unit 110. The procedure for the history evaluation will be described later.

　（Ｓ１１）来歴情報解析部１３０は、来歴情報１１１に基づいてアクセス権限予測結果１１６を生成する。評価部１４０は、アクセス権限予測結果１１６および実際のアクセス権限情報１１７に基づいてデータのセキュリティ評価を行い、セキュリティ評価結果１１８を生成する。評価部１４０は、生成したセキュリティ評価結果１１８を記憶部１１０に格納する。セキュリティ評価の手順は後述される。 (S11) The provenance information analysis unit 130 generates the access authority prediction result 116 based on the provenance information 111. The evaluation unit 140 performs security evaluation of data based on the access authority prediction result 116 and the actual access authority information 117, and generates the security evaluation result 118. The evaluation unit 140 stores the generated security evaluation result 118 in the storage unit 110. The security evaluation procedure will be described later.

　（Ｓ１２）来歴情報解析部１３０は、来歴情報１１１に基づいて実遅延時間情報１２０を生成する。評価部１４０は、遅延要件情報１１９および実遅延時間情報１２０に基づいて、データの最新性評価を行い、最新性評価結果１２１を生成する。評価部１４０は、生成した最新性評価結果１２１を記憶部１１０に格納する。最新性評価の手順は後述される。 (S12) The provenance information analysis unit 130 generates the actual delay time information 120 based on the provenance information 111. The evaluation unit 140 evaluates the latestness of the data based on the delay requirement information 119 and the actual delay time information 120, and generates the latestness evaluation result 121. The evaluation unit 140 stores the generated up-to-dateness evaluation result 121 in the storage unit 110. The procedure for up-to-date evaluation will be described later.

　（Ｓ１３）評価部１４０は、来歴評価結果１１５、セキュリティ評価結果１１８および最新性評価結果１２１に基づいて総合評価結果テーブル１２２を生成する。評価部１４０は、生成した総合評価結果テーブル１２２を記憶部１１０に格納する。評価部１４０は、来歴評価結果１１５、セキュリティ評価結果１１８および最新性評価結果１２１における各分類の来歴評価値、セキュリティ評価値および最新性評価値を、総合評価結果テーブル１２２に登録する。そして、評価部１４０は、分類ごとに来歴評価値とセキュリティ評価値と最新性評価値との和を総合評価値として計算し、総合評価結果テーブル１２２に登録する。総合評価値は、来歴評価値とセキュリティ評価値と最新性評価値との重み付き和など、他の計算方法で計算されてもよい。 (S13) The evaluation unit 140 generates a comprehensive evaluation result table 122 based on the provenance evaluation result 115, the security evaluation result 118, and the up-to-dateness evaluation result 121. The evaluation unit 140 stores the generated comprehensive evaluation result table 122 in the storage unit 110. The evaluation unit 140 registers the history evaluation value, the security evaluation value, and the latestness evaluation value of each category in the history evaluation result 115, the security evaluation result 118, and the up-to-dateness evaluation result 121 in the comprehensive evaluation result table 122. Then, the evaluation unit 140 calculates the sum of the history evaluation value, the security evaluation value, and the up-to-dateness evaluation value as the comprehensive evaluation value for each classification, and registers the sum in the comprehensive evaluation result table 122. The comprehensive evaluation value may be calculated by another calculation method such as a weighted sum of the history evaluation value, the security evaluation value, and the up-to-dateness evaluation value.
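
The calculation in step S13 can be sketched as follows, assuming that each evaluation result is held as a mapping from a classification name to its evaluation value. The function name `comprehensive_values`, the mapping layout, and the `weights` parameter are assumptions made for illustration; with the default weights the result is the plain sum described above, and other weights give the weighted-sum variation.

```python
from typing import Dict, Tuple


def comprehensive_values(
    history: Dict[str, float],
    security: Dict[str, float],
    freshness: Dict[str, float],
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> Dict[str, float]:
    """Combine the three evaluation values of each classification into one value."""
    w_history, w_security, w_freshness = weights
    # A classification missing from one result is treated as having the value 0.
    classifications = set(history) | set(security) | set(freshness)
    return {
        name: w_history * history.get(name, 0)
        + w_security * security.get(name, 0)
        + w_freshness * freshness.get(name, 0)
        for name in classifications
    }


# Hypothetical values for two classifications.
table = comprehensive_values(
    history={"A1-A2": 1, "B1-B2": 0},
    security={"A1-A2": 1, "B1-B2": 1},
    freshness={"A1-A2": 1, "B1-B2": 0},
)
```

Here `table["A1-A2"]` is 3.0 (the full score) and `table["B1-B2"]` is 1.0.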

　（Ｓ１４）表示制御部１５０は、総合評価結果テーブル１２２に基づいて、評価結果表示制御を実行する。評価結果表示制御の手順は後述される。表示制御部１５０は、ユーザによる評価結果表示の終了の入力を受け付けると、評価結果表示制御を終了し、評価結果画面の表示を終了する。そして、情報処理装置１００の処理が終了する。 (S14) The display control unit 150 executes the evaluation result display control based on the comprehensive evaluation result table 122. The procedure for controlling the evaluation result display will be described later. When the display control unit 150 receives the input of the end of the evaluation result display by the user, the display control unit 150 ends the evaluation result display control and ends the display of the evaluation result screen. Then, the processing of the information processing apparatus 100 is completed.

　図１５は、来歴評価例を示すフローチャートである。
　来歴評価は、ステップＳ１０に相当する。
　（Ｓ２０）入力データ抽出部１３１は、ユーザ所持入力データリスト１１２を取得し、記憶部１１０に格納する。例えば、ユーザ所持入力データリスト１１２は、ユーザにより情報処理装置１００に入力される。 FIG. 15 is a flowchart showing a history evaluation example.
Provenance evaluation corresponds to step S10.
(S20) The input data extraction unit 131 acquires the user-owned input data list 112 and stores it in the storage unit 110. For example, the user-owned input data list 112 is input to the information processing apparatus 100 by the user.

　（Ｓ２１）入力データ抽出部１３１は、記憶部１１０に記憶された来歴情報１１１に基づいて、実際の始点の入力データのリスト、すなわち、入力データリスト１１３を生成し、記憶部１１０に格納する。このとき、入力データ抽出部１３１は、例えば来歴情報１１１に基づいて、該当のユーザの識別情報に対応する処理で用いられるデータの来歴を特定し、特定した来歴から入力データリスト１１３を生成する。 (S21) The input data extraction unit 131 generates a list of input data of the actual start point, that is, an input data list 113, based on the history information 111 stored in the storage unit 110, and stores the input data list 113 in the storage unit 110. At this time, the input data extraction unit 131 identifies the history of the data used in the process corresponding to the identification information of the corresponding user based on, for example, the history information 111, and generates the input data list 113 from the specified history.

　（Ｓ２２）評価部１４０は、評価対象の活用データを特定する。なお、来歴評価の処理において、最初にステップＳ２２を実行する時点における各データの来歴評価値の初期値は０であるとする。 (S22) The evaluation unit 140 specifies the utilization data to be evaluated. In the history evaluation process, it is assumed that the initial value of the history evaluation value of each data at the time when step S22 is first executed is 0.

　（Ｓ２３）評価部１４０は、ユーザ所持入力データリスト１１２および入力データリスト１１３に基づいて、該当の活用データに対応する実際の始点の入力データがユーザ所持入力データに一致するか否かを判定する。一致する場合、評価部１４０は、ステップＳ２４に処理を進める。一致しない場合、評価部１４０は、ステップＳ２５に処理を進める。 (S23) Based on the user-owned input data list 112 and the input data list 113, the evaluation unit 140 determines whether or not the input data of the actual start point corresponding to the utilization data in question matches the user-owned input data. If they match, the evaluation unit 140 proceeds to step S24. If they do not match, the evaluation unit 140 proceeds to step S25.

　（Ｓ２４）評価部１４０は、該当の活用データの来歴評価の評価値、すなわち、来歴評価値を加点する。また、評価部１４０は、該当の活用データに至る中間データ（蓄積データ）の来歴評価値も加点する。加点では、例えば、来歴評価値「１」（単位点数）を加算する。ただし、加算する来歴評価値は「１」以外でもよく、前述のように、所定の上限値（例えば「１」）を超えないように該当のデータへの来歴評価値が付与されてもよい。ここで、あるデータに対する品質の評価値は、当該データを出力とする分類の評価値に相当する。例えば、総合評価結果テーブル１２２の分類「Ａ１－Ａ２間」に対する評価値は、蓄積データＡ２に対する評価値と言える。 (S24) The evaluation unit 140 adds points to the evaluation value of the history evaluation of the utilization data in question, that is, the history evaluation value. The evaluation unit 140 also adds points to the history evaluation value of the intermediate data (accumulated data) leading to that utilization data. In adding points, for example, a history evaluation value of "1" (the unit score) is added. However, the history evaluation value to be added may be other than "1", and, as described above, the history evaluation value may be assigned to the data so as not to exceed a predetermined upper limit value (for example, "1"). Here, the quality evaluation value for a given piece of data corresponds to the evaluation value of the classification that outputs that data. For example, the evaluation value for the classification "between A1 and A2" in the comprehensive evaluation result table 122 can be regarded as the evaluation value for the accumulated data A2.

　（Ｓ２５）評価部１４０は、評価対象の全活用データを評価済であるか否かを判定する。評価対象の全活用データを評価済の場合、評価部１４０は、来歴評価を終了する。評価対象の全活用データを評価済でない場合、評価部１４０は、ステップＳ２２に処理を進める。 (S25) The evaluation unit 140 determines whether or not all the utilization data to be evaluated have been evaluated. If all the utilization data to be evaluated have been evaluated, the evaluation unit 140 ends the history evaluation. If not all the utilization data to be evaluated have been evaluated, the evaluation unit 140 proceeds to step S22.

　なお、ステップＳ２３，Ｓ２４の処理において、活用データに対応する実際の始点のデータが複数のこともある。その場合、評価部１４０は、実際の始点のデータのうち、ユーザ所持入力データリスト１１２に含まれる数だけ、活用データや中間データの来歴評価値を加点する。例えば、評価部１４０は、実際の始点の入力データのうち、ユーザ所持入力データリスト１１２に含まれる数が２の場合に、評価値の単位点数ａの２倍の評価値を活用データや中間データの各々の来歴評価値に加算することが考えられる。 In the processing of steps S23 and S24, there may be a plurality of actual start point data corresponding to the utilization data. In that case, the evaluation unit 140 adds points to the history evaluation values of the utilization data and the intermediate data according to the number of actual start point data that are included in the user-owned input data list 112. For example, when two of the actual start point input data are included in the user-owned input data list 112, it is conceivable that the evaluation unit 140 adds an evaluation value equal to twice the unit score a to the history evaluation value of each of the utilization data and the intermediate data.
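
The loop of steps S22 to S25, including the case of a plurality of start points, can be sketched as follows. This is only an illustration under assumed data structures: `start_points` maps each utilization data item to its actual start-point input data (corresponding to the input data list 113), `intermediates` maps it to the accumulated data on the way, and `user_inputs` stands for the user-owned input data list 112. These names and the function `history_values` are hypothetical.

```python
from typing import Dict, List, Set


def history_values(
    start_points: Dict[str, List[str]],
    intermediates: Dict[str, List[str]],
    user_inputs: Set[str],
    unit: int = 1,
) -> Dict[str, int]:
    """Add the unit score once per start point found in the user-owned list."""
    values: Dict[str, int] = {}
    for output, starts in start_points.items():       # S22: pick utilization data
        related = [output, *intermediates.get(output, [])]
        for name in related:
            values.setdefault(name, 0)                 # initial value is 0
        matched = sum(1 for s in starts if s in user_inputs)  # S23: compare
        for name in related:                           # S24: add points
            values[name] += unit * matched
    return values                                      # S25: all data evaluated


values = history_values(
    start_points={"AB": ["A1", "B1"]},
    intermediates={"AB": ["A2", "B2"]},
    user_inputs={"A1", "B1"},
)
```

With both start points present in the user-owned list, `AB`, `A2` and `B2` each receive twice the unit score. The optional upper limit mentioned in step S24 is not applied in this sketch.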

　図１６は、セキュリティ評価例を示すフローチャートである。
　セキュリティ評価は、ステップＳ１１に相当する。
　（Ｓ３０）アクセス権限予測部１３２は、ステップＳ２１で生成された入力データリスト１１３に基づいて、処理対象のユーザの識別情報に対応する始点の入力データを特定する。アクセス権限予測部１３２は、特定した始点の入力データに関する入力データアクセス権限情報１１１ａを取得し、記憶部１１０に格納する。また、アクセス権限予測部１３２は、来歴情報１１１に基づいて、始点の入力データから活用データに至るまでの処理に関する加工整形処理情報１１１ｂを取得し、記憶部１１０に格納する。 FIG. 16 is a flowchart showing an example of security evaluation.
The security evaluation corresponds to step S11.
(S30) The access authority prediction unit 132 identifies the input data of the starting point corresponding to the identification information of the user to be processed based on the input data list 113 generated in step S21. The access authority prediction unit 132 acquires the input data access authority information 111a related to the input data of the specified start point and stores it in the storage unit 110. Further, the access authority prediction unit 132 acquires the processing and shaping processing information 111b related to the processing from the input data of the start point to the utilization data based on the history information 111, and stores it in the storage unit 110.

　（Ｓ３１）アクセス権限予測部１３２は、来歴情報１１１に基づいて、始点の入力データから得られる他のデータのアクセス権限を予測し、アクセス権限予測結果１１６を生成し、記憶部１１０に格納する。始点の入力データから得られる他のデータには、始点の入力データに基づいて生成される蓄積データ（中間データ）や活用データが含まれる。また、アクセス権限予測部１３２は、情報処理システム５０のデータカタログなどから実際のデータのアクセス権限を示すアクセス権限情報１１７を取得し、記憶部１１０に格納する。 (S31) The access authority prediction unit 132 predicts the access authority of other data obtained from the input data of the start point based on the history information 111, generates the access authority prediction result 116, and stores it in the storage unit 110. Other data obtained from the input data of the start point includes accumulated data (intermediate data) and utilization data generated based on the input data of the start point. Further, the access authority prediction unit 132 acquires the access authority information 117 indicating the actual data access authority from the data catalog or the like of the information processing system 50 and stores it in the storage unit 110.

　（Ｓ３２）評価部１４０は、評価対象のデータを特定する。評価対象のデータの候補は、アクセス権限予測結果１１６に含まれる全ての蓄積データおよび全ての活用データである。評価部１４０は、評価対象のデータの候補の中から評価対象のデータを１つ特定する。なお、セキュリティ評価の処理において、最初にステップＳ３２を実行する時点における各データのセキュリティ評価値の初期値は０であるとする。 (S32) The evaluation unit 140 specifies the data to be evaluated. Candidates for the data to be evaluated are all accumulated data and all utilization data included in the access authority prediction result 116. The evaluation unit 140 specifies one data to be evaluated from the data candidates to be evaluated. In the security evaluation process, the initial value of the security evaluation value of each data at the time when step S32 is first executed is assumed to be 0.

　（Ｓ３３）評価部１４０は、アクセス権限予測結果１１６における、評価対象のデータに対して予測されたアクセス権限が、アクセス権限情報１１７における実際のアクセス権限に一致するか否かを判定する。一致する場合、評価部１４０は、ステップＳ３４に処理を進める。一致しない場合、評価部１４０は、ステップＳ３５に処理を進める。 (S33) The evaluation unit 140 determines whether or not the predicted access authority for the data to be evaluated in the access authority prediction result 116 matches the actual access authority in the access authority information 117. If they match, the evaluation unit 140 proceeds to step S34. If they do not match, the evaluation unit 140 proceeds to step S35.

　（Ｓ３４）評価部１４０は、評価対象のデータのセキュリティ評価の評価値、すなわち、セキュリティ評価値を加点する。加点では、例えば、セキュリティ評価値「１」を加算する。ただし、加点する来歴評価値は「１」以外でもよく、前述のように、所定の上限値（例えば「１」）を超えないように該当のデータへのセキュリティ評価値が付与されてもよい。 (S34) The evaluation unit 140 adds points to the evaluation value of the security evaluation of the data to be evaluated, that is, the security evaluation value. In adding points, for example, a security evaluation value of "1" is added. However, the history evaluation value to be added may be other than "1", and, as described above, the security evaluation value may be assigned to the data so as not to exceed a predetermined upper limit value (for example, "1").

　（Ｓ３５）評価部１４０は、評価対象の全データを評価済であるか否かを判定する。評価対象の全データを評価済の場合、評価部１４０は、セキュリティ評価を終了する。評価対象の全データを評価済でない場合、評価部１４０は、ステップＳ３２に処理を進める。 (S35) The evaluation unit 140 determines whether or not all the data to be evaluated have been evaluated. If all the data to be evaluated have been evaluated, the evaluation unit 140 ends the security evaluation. If all the data to be evaluated have not been evaluated, the evaluation unit 140 proceeds to step S32.
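
The comparison loop of steps S32 to S35 can be sketched as follows, assuming that an access authority is represented as a set of role names and that the access authority prediction result 116 and the access authority information 117 are both mappings from a data name to such a set. This representation and the function name `security_values` are assumptions for illustration; the specification does not fix a data format here.

```python
from typing import Dict, FrozenSet


def security_values(
    predicted: Dict[str, FrozenSet[str]],
    actual: Dict[str, FrozenSet[str]],
    unit: int = 1,
) -> Dict[str, int]:
    """Give the unit score to each data item whose predicted and actual authority match."""
    values: Dict[str, int] = {}
    for name, predicted_authority in predicted.items():   # S32: pick one candidate
        values[name] = 0                                   # initial value is 0
        if actual.get(name) == predicted_authority:        # S33: compare
            values[name] += unit                           # S34: add points
    return values                                          # S35: all data evaluated


scores = security_values(
    predicted={"A2": frozenset({"analyst"}), "AB": frozenset({"analyst"})},
    actual={"A2": frozenset({"analyst"}), "AB": frozenset({"analyst", "guest"})},
)
```

In this hypothetical example `A2` scores 1 and `AB` scores 0, because the actual authority of `AB` is wider than predicted.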

　図１７は、最新性評価例を示すフローチャートである。
　最新性評価は、ステップＳ１２に相当する。
　（Ｓ４０）遅延時間算出部１３３は、活用データに関する遅延要件情報１１９を取得する。例えば、遅延要件情報１１９は、ユーザにより情報処理装置１００に入力される。 FIG. 17 is a flowchart showing an example of up-to-date evaluation.
The up-to-dateness evaluation corresponds to step S12.
(S40) The delay time calculation unit 133 acquires the delay requirement information 119 regarding the utilization data. For example, the delay requirement information 119 is input to the information processing apparatus 100 by the user.

　（Ｓ４１）遅延時間算出部１３３は、来歴情報１１１に基づいて、活用データに対する実際のデータ更新の遅延時間を計算する。前述のように、遅延時間算出部１３３は、情報処理システム５０から取得されるデータ更新ログを、データ更新の遅延時間の計算に用いることができる。遅延時間算出部１３３は、計算した遅延時間を、記憶部１１０に格納された実遅延時間情報１２０に記録する。 (S41) The delay time calculation unit 133 calculates the delay time of the actual data update for the utilization data based on the history information 111. As described above, the delay time calculation unit 133 can use the data update log acquired from the information processing system 50 to calculate the delay time for data update. The delay time calculation unit 133 records the calculated delay time in the actual delay time information 120 stored in the storage unit 110.

　（Ｓ４２）評価部１４０は、評価対象の活用データを特定する。なお、最新性評価の処理において、最初にステップＳ４２を実行する時点における各データの最新性評価値の初期値は０であるとする。 (S42) The evaluation unit 140 specifies the utilization data to be evaluated. In the up-to-dateness evaluation process, it is assumed that the initial value of the up-to-dateness evaluation value of each data at the time when step S42 is first executed is 0.

　（Ｓ４３）評価部１４０は、遅延要件情報１１９および実遅延時間情報１２０に基づいて、該当の活用データに対して計算された遅延時間が遅延要件情報１１９に基づく許容範囲内であるか否かを判定する。許容範囲内である場合、ステップＳ４４に処理を進める。許容範囲内でない場合、ステップＳ４５に処理を進める。 (S43) Based on the delay requirement information 119 and the actual delay time information 120, the evaluation unit 140 determines whether or not the delay time calculated for the utilization data in question is within the permissible range given by the delay requirement information 119. If it is within the permissible range, the process proceeds to step S44. If it is not within the permissible range, the process proceeds to step S45.

　（Ｓ４４）評価部１４０は、該当の活用データの最新性評価の評価値、すなわち、最新性評価値を加点する。また、評価部１４０は、該当の活用データに至る中間データ（蓄積データ）の最新性評価値も加点する。加点では、例えば、最新性評価値「１」を加算する。ただし、加算する最新性評価値は「１」以外でもよく、前述のように、所定の上限値（例えば「１」）を超えないように該当のデータへの最新性評価値が付与されてもよい。 (S44) The evaluation unit 140 adds points to the evaluation value of the up-to-dateness evaluation of the utilization data in question, that is, the up-to-dateness evaluation value. The evaluation unit 140 also adds points to the up-to-dateness evaluation value of the intermediate data (accumulated data) leading to that utilization data. In adding points, for example, an up-to-dateness evaluation value of "1" is added. However, the up-to-dateness evaluation value to be added may be other than "1", and, as described above, the up-to-dateness evaluation value may be assigned to the data so as not to exceed a predetermined upper limit value (for example, "1").

　（Ｓ４５）評価部１４０は、評価対象の全活用データを評価済であるか否かを判定する。評価対象の全活用データを評価済の場合、評価部１４０は、最新性評価を終了する。評価対象の全活用データを評価済でない場合、評価部１４０は、ステップＳ４２に処理を進める。 (S45) The evaluation unit 140 determines whether or not all the utilization data to be evaluated have been evaluated. If all the utilization data to be evaluated have been evaluated, the evaluation unit 140 ends the up-to-dateness evaluation. If not all the utilization data to be evaluated have been evaluated, the evaluation unit 140 proceeds to step S42.
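
The loop of steps S42 to S45 can be sketched as follows, assuming that the delay requirement information 119 gives a maximum permissible delay per utilization data item and that "within the permissible range" means the actual delay does not exceed that maximum. The mapping layout and the names `freshness_values`, `required`, `actual` and `intermediates` are hypothetical.

```python
from datetime import timedelta
from typing import Dict, List


def freshness_values(
    required: Dict[str, timedelta],
    actual: Dict[str, timedelta],
    intermediates: Dict[str, List[str]],
    unit: int = 1,
) -> Dict[str, int]:
    """Give the unit score to utilization data (and its intermediate data) updated in time."""
    values: Dict[str, int] = {}
    for output, limit in required.items():               # S42: pick utilization data
        related = [output, *intermediates.get(output, [])]
        for name in related:
            values.setdefault(name, 0)                    # initial value is 0
        delay = actual.get(output)
        if delay is not None and delay <= limit:          # S43: within the range?
            for name in related:                          # S44: add points
                values[name] += unit
    return values                                         # S45: all data evaluated


fresh = freshness_values(
    required={"AB": timedelta(hours=1), "C": timedelta(minutes=10)},
    actual={"AB": timedelta(minutes=30), "C": timedelta(minutes=45)},
    intermediates={"AB": ["A2", "B2"], "C": ["C2"]},
)
```

In this hypothetical example `AB`, `A2` and `B2` score 1, while `C` and `C2` score 0 because the 45-minute delay exceeds the 10-minute requirement.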

　図１８は、評価結果表示制御例を示すフローチャートである。
　評価結果表示制御は、ステップＳ１４に相当する。
　（Ｓ５０）表示制御部１５０は、表示対象とする評価種別の、ユーザによる選択を受け付ける。評価種別には、来歴評価、セキュリティ評価、最新性評価、および、総合評価がある。ユーザは、これらの評価種別のうちの１つを選択してもよいし、来歴評価、セキュリティ評価および最新性評価のうちの２つの組み合わせを選択してもよい。以下では、主に総合評価が選択される場合を例示するが、他の評価種別の場合も同様の手順となる。 FIG. 18 is a flowchart showing an evaluation result display control example.
The evaluation result display control corresponds to step S14.
(S50) The display control unit 150 accepts the user's selection of the evaluation type to be displayed. Evaluation types include history evaluation, security evaluation, up-to-date evaluation, and comprehensive evaluation. The user may select one of these evaluation types, or may select a combination of two of the history evaluation, the security evaluation, and the up-to-date evaluation. In the following, the case where the comprehensive evaluation is mainly selected is illustrated, but the procedure is the same for other evaluation types.

　（Ｓ５１）表示制御部１５０は、該当のユーザの識別情報に対応するデータの来歴を示す評価結果画面４００をディスプレイ６１に表示させる。表示制御部１５０は、ネットワーク６０を介して接続されたクライアント装置などの他の装置に評価結果画面４００の情報を送信し、他の装置によって、当該他の装置に接続されたディスプレイに評価結果画面４００を表示させてもよい。表示制御部１５０は、ステップＳ５０で選択された評価種別に対応する評価値（ここでは、総合評価値）を用いて「分類」を表す矢印を色分けしたデータフロー図４０１を、評価結果画面４００の中に表示させる。評価結果画面４００は、凡例４０２の画像を含んでもよい。ステップＳ５０で、２つの評価種別が選択された場合、表示制御部１５０は、該当の分類に対する２つの評価種別での評価値の和により、矢印を色分けすればよい。 (S51) The display control unit 150 displays the evaluation result screen 400 showing the history of the data corresponding to the identification information of the corresponding user on the display 61. The display control unit 150 transmits the information of the evaluation result screen 400 to another device such as a client device connected via the network 60, and the evaluation result screen is displayed on the display connected to the other device by the other device. 400 may be displayed. The display control unit 150 displays the data flow diagram 401 in which the arrow indicating “classification” is color-coded using the evaluation value (here, the comprehensive evaluation value) corresponding to the evaluation type selected in step S50 on the evaluation result screen 400. Display inside. The evaluation result screen 400 may include the image of the legend 402. When two evaluation types are selected in step S50, the display control unit 150 may color-code the arrows according to the sum of the evaluation values of the two evaluation types for the corresponding classification.

　（Ｓ５２）表示制御部１５０は、データフロー図４０１において、何れかのデータの選択があるか否かを判定する。当該選択がある場合、表示制御部１５０は、ステップＳ５３に処理を進める。当該選択がない場合、表示制御部１５０は、ステップＳ５４に処理を進める。例えば、ユーザは、入力デバイス６２を操作して、ポインタＰ１によりデータフロー図４０１に表示されたデータを選択できる。あるいは、評価結果画面４００がユーザの使用するクライアント装置により表示される場合、ユーザは、クライアント装置の入力デバイスを操作して、何れかのデータの選択を、情報処理装置１００に入力してもよい。 (S52) The display control unit 150 determines whether or not any data is selected in the data flow diagram 401. If there is such a selection, the display control unit 150 proceeds to step S53. If there is no such selection, the display control unit 150 proceeds to step S54. For example, the user can operate the input device 62 to select, with the pointer P1, data displayed in the data flow diagram 401. Alternatively, when the evaluation result screen 400 is displayed by a client device used by the user, the user may operate the input device of the client device to input the selection of any data to the information processing apparatus 100.

　（Ｓ５３）表示制御部１５０は、総合評価値に基づいて選択されたデータから遡るデータフローを表示させる。例えば、表示制御部１５０は、データフロー図４０１において活用データＡＢが選択された場合、活用データＡＢから１つ前のデータである蓄積データＢ２まで遡るデータフローを含む評価結果画面５００を表示させる。前述のように、表示制御部１５０は、遡る経路に分岐がある場合、活用データＡＢに至る分類のうち、総合評価値が低い分類を多く含む方を優先的に選択して、選択した方の分岐先の分類を強調表示することが考えられる。例えば、評価結果画面５００では、活用データＡＢから遡ったデータ整形処理ｓ４を逆に辿ると、蓄積データＡ２，Ｂ２に分岐する。したがって、表示制御部１５０は、分類「Ａ１－Ａ２間」、分類「Ｂ１－Ｂ２間」のうち、総合評価値の低い方である分類「Ｂ１－Ｂ２間」に連なる矢印を強調表示させる。 (S53) Based on the comprehensive evaluation value, the display control unit 150 displays a data flow traced back from the selected data. For example, when the utilization data AB is selected in the data flow diagram 401, the display control unit 150 displays an evaluation result screen 500 including a data flow that traces back from the utilization data AB to the accumulated data B2, which is the immediately preceding data. As described above, when the path being traced back branches, it is conceivable that the display control unit 150 preferentially selects, among the classifications leading to the utilization data AB, the branch that includes more classifications with a low comprehensive evaluation value, and highlights the classifications on the selected branch. For example, on the evaluation result screen 500, tracing back through the data shaping process s4 from the utilization data AB branches into the accumulated data A2 and B2. Therefore, of the classifications "between A1 and A2" and "between B1 and B2", the display control unit 150 highlights the arrows connected to the classification "between B1 and B2", which has the lower comprehensive evaluation value.
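
The backward trace in step S53 can be sketched as follows for the simple case in the example, where the branch with the lower comprehensive evaluation value is followed at each branch point. The sketch assumes an acyclic flow, a mapping `upstream` from each data item to the data items it is derived from, and a mapping `value_of` holding, for each data item, the comprehensive evaluation value of the classification that outputs it. These names and the function `trace_back` are hypothetical, and the "includes more low-valued classifications" criterion is simplified here to a per-step minimum.

```python
from typing import Dict, List


def trace_back(
    selected: str,
    upstream: Dict[str, List[str]],
    value_of: Dict[str, float],
) -> List[str]:
    """Return the path to highlight, from the selected data back to a start point."""
    path = [selected]
    current = selected
    while upstream.get(current):
        # At a branch, follow the source with the lowest evaluation value.
        # Start-point input data has no value of its own and sorts last.
        current = min(
            upstream[current],
            key=lambda name: value_of.get(name, float("inf")),
        )
        path.append(current)
    return path


# Hypothetical values: classification "B1-B2" (output B2) scores lower than "A1-A2".
highlight = trace_back(
    "AB",
    upstream={"AB": ["A2", "B2"], "A2": ["A1"], "B2": ["B1"]},
    value_of={"A2": 3, "B2": 1},
)
```

With these values the highlighted path is `AB`, `B2`, `B1`, matching the example in which the arrows connected to the classification "between B1 and B2" are emphasized.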

　（Ｓ５４）表示制御部１５０は、表示終了の入力を受け付けたか否かを判定する。表示終了の入力を受け付けた場合、表示制御部１５０は、評価結果表示制御を終了する。表示制御部１５０は、表示終了の入力を受け付けていない場合、ステップＳ５２に処理を進める。 (S54) The display control unit 150 determines whether or not the input for ending the display has been accepted. When the display end input is received, the display control unit 150 ends the evaluation result display control. If the display control unit 150 does not accept the input for ending the display, the display control unit 150 proceeds to step S52.

　このように、情報処理装置１００によれば、情報処理基盤におけるデータ品質を評価するサービスを実現することができる。特に、情報処理装置１００により、データの来歴情報を活用することで、ソフトウェア製品を跨ったデータの流れを評価する。また、情報処理装置１００により、評価結果に基づき、データフロー図に問題箇所を表して表示する。 As described above, according to the information processing apparatus 100, it is possible to realize a service for evaluating the data quality in the information processing infrastructure. In particular, the information processing apparatus 100 evaluates the flow of data across software products by utilizing the history information of the data. Further, the information processing apparatus 100 displays the problematic portion on the data flow diagram based on the evaluation result.

　これにより、複数のソフトウェアを跨ぐデータの評価が可能となり、情報処理基盤全体の中での問題箇所のユーザによる特定を支援できる。
　ここで、情報処理システム５０では、種々のソフトウェアが実行される。例えば、ソフトウェア製品ごとに、データの欠損値、重複データおよびデータの種類などによりデータ自体を評価することが考えられる。しかし、複数のソフトウェア製品で構成される情報処理基盤では、ソフトウェア製品ごとの評価だけではなく、データの収集から出力先まで、一連の流れの中で適正にデータ品質が管理されているかを評価することが考えられる。 This makes it possible to evaluate data across multiple softwares and support the user's identification of problem areas in the entire information processing infrastructure.
Here, in the information processing system 50, various software is executed. For example, it is conceivable to evaluate the data itself for each software product based on missing values, duplicate data, data types, and the like. However, in an information processing infrastructure composed of multiple software products, it is conceivable to evaluate not only each software product individually but also whether data quality is properly managed throughout the entire flow from data collection to the output destination.

　ところが、複数のソフトウェア製品に跨る一連の流れの中でデータ品質を評価する仕組みが考えられていない。例えば、一連の流れ中でのデータ品質を評価するために、各ソフトウェア製品のデータ品質を属人的に評価することが考えられるが、評価に時間がかかる。特に、データ品質悪化の原因を特定するために、ユーザにデータの流れを読み解く作業を強いるのは、ユーザの負担が大きく、評価に時間がかかる。 However, no mechanism for evaluating data quality has been considered in a series of flows that span multiple software products. For example, in order to evaluate the data quality in a series of flows, it is conceivable to personally evaluate the data quality of each software product, but the evaluation takes time. In particular, forcing the user to read the data flow in order to identify the cause of the deterioration of data quality is a heavy burden on the user and takes time for evaluation.

　そこで、情報処理装置１００は、来歴情報１１１に基づいて、複数のソフトウェアによるデータ処理におけるデータの品質を適切に評価する。
　例えば、複数のソフトウェアによる一連のデータ処理において、ユーザが意図するデータが処理されているか否かにより、データ処理により出力されるデータの品質が変わる。例えば、分析などのデータ処理を行う場合、ユーザが意図しない入力データが処理されていると、当該入力データに不要な情報や誤った情報が含まれていることなどが要因となり、データ処理の結果が誤っている可能性が高まるため、当該結果の信頼性が低下する。 Therefore, the information processing apparatus 100 appropriately evaluates the quality of data in the data processing by the plurality of software based on the history information 111.
For example, in a series of data processing by a plurality of software, the quality of the data output by the data processing changes depending on whether or not the data intended by the user is processed. For example, when performing data processing such as analysis, if input data not intended by the user is processed, the input data may contain unnecessary or incorrect information, which makes the result of the data processing more likely to be wrong and thus reduces the reliability of the result.

　そこで、情報処理装置１００は、来歴情報１１１に基づき、複数のソフトウェアによるデータ処理の始点の入力データを特定する。始点の入力データがユーザの意図する入力であるか否かを確認することで、当該データ処理により出力される活用データや蓄積データの品質を適切に評価できる。また、当該評価をユーザが行うよりも速く行える。 Therefore, the information processing apparatus 100 specifies the input data of the start point of the data processing by the plurality of software based on the history information 111. By confirming whether the input data of the start point is the input intended by the user, the quality of the utilization data and the accumulated data output by the data processing can be appropriately evaluated. In addition, the evaluation can be performed faster than the user can perform.

　また、情報処理装置１００は、来歴評価に加えて、セキュリティ評価や最新性評価といった複数の評価種別でデータの品質評価を行い、複数の評価種別での評価結果から、データの品質を総合評価することで、より適切にデータの品質を評価できる。 Further, the information processing apparatus 100 evaluates the quality of data in a plurality of evaluation types such as security evaluation and up-to-dateness evaluation in addition to the history evaluation, and comprehensively evaluates the quality of the data from the evaluation results in the plurality of evaluation types. Therefore, the quality of the data can be evaluated more appropriately.

　例えば、情報処理装置１００は、次の処理を行う。
　来歴情報解析部１３０は、複数のソフトウェアの各々に対する入力データおよび出力データの履歴を示す来歴情報１１１に基づいて、複数のソフトウェアによるデータ処理の始点である第１の入力データの情報を抽出する。評価部１４０は、第１の入力データの情報がユーザにより入力された、所定のデータの情報に一致するか否かの比較に応じて、データ処理により出力される第１の出力データの品質評価を行う。 For example, the information processing apparatus 100 performs the following processing.
The history information analysis unit 130 extracts information on the first input data, which is the starting point of data processing by the plurality of software, based on the history information 111 showing the history of input data and output data for each of the plurality of software. The evaluation unit 140 evaluates the quality of the first output data output by the data processing according to a comparison of whether or not the information of the first input data matches the information of predetermined data input by the user.

　これにより、データの品質を適切に評価することができる。
　評価部１４０は、第１の入力データの情報がユーザにより入力された、所定のデータの情報に一致するか否かの比較に応じて、第１の入力データから第１の出力データに至るまでに経由する中間データの品質評価を行う。 This makes it possible to appropriately evaluate the quality of the data.
The evaluation unit 140 evaluates the quality of the intermediate data passed through on the way from the first input data to the first output data, according to a comparison of whether or not the information of the first input data matches the information of the predetermined data input by the user.

This makes it possible to appropriately evaluate the quality of not only the first output data, which is the final output of the data processing, but also the intermediate data generated in the course of the data processing.
Further, the history information analysis unit 130 acquires information on a first access authority for the first input data and, based on the first access authority and the history information 111, predicts a second access authority for the first output data. The evaluation unit 140 evaluates the quality of the first output data according to a comparison of whether the actual access authority of the first output data matches the predicted second access authority.

In this way, by checking whether the access authority of the first output data predicted from the history information 111 matches the actual access authority, it is possible to evaluate whether the first output data was generated through an appropriate process. For example, if the access authorities do not match, inappropriate processing that the user did not anticipate may have been performed in the course of generating the first output data. Therefore, when the predicted access authority does not match the actual access authority, the quality of the corresponding data is judged to be low.

In predicting the second access authority, the history information analysis unit 130 predicts, based on the first access authority, a third access authority for the intermediate data passed through on the way from the first input data to the first output data. The history information analysis unit 130 then predicts the second access authority based on the predicted third access authority.

In this way, the access authority of each piece of data is predicted in order along the forward direction of the data flow, based on the prediction result for the access authority of the intermediate data, so that the second access authority can be predicted appropriately.

The evaluation unit 140 may evaluate the quality of the intermediate data according to a comparison of whether the actual access authority of the intermediate data matches the predicted third access authority.
This makes it possible to appropriately evaluate the quality of not only the first output data, which is the final output of the data processing, but also the intermediate data generated in the course of the data processing.
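The forward prediction of access authorities and the comparison with the actual access authorities can be sketched as follows. The passage above does not state how an access authority is carried over from input data to output data, so this sketch assumes a hypothetical rule: derived data is accessible only to users who may access every one of its inputs. The record layout and the names `predict_permissions` and `find_mismatches` are likewise assumptions made for illustration.

```python
def predict_permissions(history, start_permissions):
    """Predict the access authority of every data item, following the data
    flow forwards from the start-point inputs.

    Assumed rule (hypothetical, not taken from the embodiment): derived data
    inherits the intersection of its inputs' access authorities.
    The data flow is assumed to be acyclic.
    """
    inputs_of = {}  # data item -> set of data items it is derived from
    for _software, src, dst in history:
        inputs_of.setdefault(src, set())
        inputs_of.setdefault(dst, set()).add(src)

    predicted = {}

    def resolve(node):
        if node not in predicted:
            sources = inputs_of[node]
            if sources:
                predicted[node] = frozenset.intersection(
                    *(resolve(s) for s in sources)
                )
            else:
                # Start point: use the access authority acquired for it.
                predicted[node] = frozenset(start_permissions.get(node, ()))
        return predicted[node]

    for node in inputs_of:
        resolve(node)
    return predicted


def find_mismatches(predicted, actual):
    """Return the data items whose actual access authority differs from the
    predicted one; these would be judged to be of low quality."""
    return {d for d, p in predicted.items() if frozenset(actual.get(d, ())) != p}


history = [("software_1", "d1", "d2"), ("software_2", "d2", "d3")]
predicted = predict_permissions(history, {"d1": {"alice", "bob"}})
actual = {
    "d1": {"alice", "bob"},
    "d2": {"alice", "bob"},
    "d3": {"alice", "bob", "carol"},  # wider than predicted
}
print(find_mismatches(predicted, actual))
```

Because the intermediate data d2 is resolved before d3, the same result table serves both the evaluation of the first output data and the optional evaluation of the intermediate data.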

Further, the history information analysis unit 130 acquires information on a first delay time, which is the delay allowed from generation of the first input data until the first output data is updated. The information on the first delay time is input by the user, for example. Based on data update history information and the history information 111, the history information analysis unit 130 calculates a second delay time from generation of the first input data until the first output data is updated. The evaluation unit 140 evaluates the quality of the first output data according to a comparison of whether the second delay time is shorter than the first delay time.

In this way, whether the first output data was generated through an appropriate process can be evaluated according to whether the delay time from generation of the first input data to update of the first output data satisfies the user's delay requirement. For example, when the second delay time is longer than the first delay time, the delay requirement is not satisfied, and an abnormality, performance degradation, or the like may have occurred in the process from the first input data to the first output data. Therefore, when the second delay time is longer than the first delay time, the quality of the first output data is judged to be low. According to the comparison between the second delay time and the first delay time, the evaluation unit 140 may evaluate the quality of the intermediate data passed through on the way from the first input data to the first output data, in addition to the first output data. In this case as well, when the second delay time is longer than the first delay time, the quality of the intermediate data is judged to be low.
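The delay comparison can be sketched in a few lines. The sketch assumes that the generation time of the first input data and the update time of the first output data have already been obtained from the update history; the function name `meets_delay_requirement` and the sample timestamps are hypothetical. The passage above covers only the "shorter" and "longer" cases, so treating an exactly equal delay as not satisfying the requirement is an assumption of this sketch.

```python
from datetime import datetime, timedelta


def meets_delay_requirement(input_generated_at, output_updated_at, allowed_delay):
    """Return True when the measured (second) delay time is shorter than the
    allowed (first) delay time.

    Assumption: a delay exactly equal to the allowed delay is treated as
    not meeting the requirement.
    """
    measured_delay = output_updated_at - input_generated_at
    return measured_delay < allowed_delay


generated = datetime(2020, 12, 25, 9, 0)
updated = datetime(2020, 12, 25, 9, 30)

print(meets_delay_requirement(generated, updated, timedelta(hours=1)))
print(meets_delay_requirement(generated, updated, timedelta(minutes=10)))
```

When the function returns `False`, the first output data, and optionally the intermediate data on the same path, would be judged to be of low quality.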

The display control unit 150 causes a display device to display a data flow diagram 401 covering the flow from the first input data to the first output data. The display control unit 150 changes the display mode of an image element that is included in the data flow diagram 401 and indicates the relationship between the first input data and the first output data, based on the result of the quality evaluation of the first output data.

This helps the user identify the locations to review. The arrows indicating data flows in the data flow diagram 401 are an example of image elements. The display 61 is an example of the display device. The display device may be one connected to another information processing apparatus. In that case, the display control unit 150 performs display control by transmitting information on the display content to the other information processing apparatus via the network 60.

The display control unit 150 accepts a selection of the image that represents the first output data in the data flow diagram 401 displayed on the display device. The display control unit 150 then selects, based on the evaluation values of the data passed through on the way to the first output data, the image elements to be highlighted from among the plurality of image elements indicating the relationships between the data leading to the first output data, and highlights the selected image elements.

This helps the user identify the locations to review. For example, as illustrated in FIG. 13, the data flow leading to the first output data may contain a plurality of series, each originating from the input data at a different start point (for example, the classifications "A1-A2" and "B1-B2"). In that case, the display control unit 150 may select and highlight the image elements belonging to a series that contains many classifications with low evaluation values, or to a series whose classifications have a low average evaluation value. This makes it possible to present the locations that the user should review with high priority, and helps the user identify the locations to review efficiently.
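The selection of a series to highlight can be sketched under the second criterion mentioned above, the lowest average evaluation value. The mapping from series names to evaluation values, the numeric scale, and the function name `select_series_to_highlight` are hypothetical.

```python
from statistics import mean


def select_series_to_highlight(series_scores):
    """Return the name of the series whose classifications have the lowest
    average evaluation value.

    `series_scores` maps a series name to the evaluation values of the
    classifications it contains (hypothetical layout and scale).
    """
    return min(series_scores, key=lambda name: mean(series_scores[name]))


scores = {
    "A1-A2": [0.9, 0.8],
    "B1-B2": [0.4, 0.6],
}
print(select_series_to_highlight(scores))
```

The first criterion, counting the classifications whose evaluation value falls below some threshold, could be substituted by changing only the `key` function.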

The information processing of the first embodiment can be realized by causing the processing unit 12 to execute a program. The information processing of the second embodiment can be realized by causing the CPU 101 to execute a program. The program can be recorded on a computer-readable recording medium 63.

For example, the program can be distributed by distributing the recording medium 63 on which the program is recorded. Alternatively, the program may be stored on another computer and distributed via a network. For example, a computer may store (install) the program recorded on the recording medium 63, or a program received from another computer, in a storage device such as the RAM 102 or the HDD 103, and may read the program from that storage device and execute it.

The above merely illustrates the principle of the present invention. Numerous modifications and changes are possible for those skilled in the art, and the present invention is not limited to the exact configurations and applications shown and described above. All corresponding modifications and equivalents are regarded as falling within the scope of the present invention as defined by the appended claims and their equivalents.

10 Information processing device
11 Storage unit
12 Processing unit
20 History information
31, 32 Software
41, 42, 43 Data storage unit
d1, d2, d3 Data
S1, S2 Step

Claims

An information processing program that causes a computer to execute a process comprising:
extracting, based on history information indicating a history of input data and output data for each of a plurality of pieces of software, information on first input data that is a start point of data processing by the plurality of pieces of software; and
evaluating quality of first output data output by the data processing according to a comparison of whether the information on the first input data matches information on predetermined data input by a user.

The information processing program according to claim 1, further causing the computer to execute a process comprising:
evaluating quality of intermediate data passed through on the way from the first input data to the first output data, according to the comparison of whether the information on the first input data matches the information on the predetermined data input by the user.

The information processing program according to claim 1 or 2, further causing the computer to execute a process comprising:
acquiring information on a first access authority for the first input data, and predicting a second access authority for the first output data based on the first access authority and the history information; and
evaluating the quality of the first output data according to a comparison of whether an actual access authority of the first output data matches the predicted second access authority.

The information processing program according to claim 3, wherein the predicting of the second access authority includes predicting, based on the first access authority, a third access authority for intermediate data passed through on the way from the first input data to the first output data, and predicting the second access authority based on the predicted third access authority.

The information processing program according to claim 4, further causing the computer to execute a process comprising:
evaluating quality of the intermediate data according to a comparison of whether an actual access authority of the intermediate data matches the third access authority.

The information processing program according to any one of claims 1 to 5, further causing the computer to execute a process comprising:
acquiring information on a first delay time allowed from generation of the first input data until the first output data is updated; and
calculating, based on data update history information and the history information, a second delay time from generation of the first input data until the first output data is updated, and evaluating the quality of the first output data according to a comparison of whether the second delay time is shorter than the first delay time.

The information processing program according to any one of claims 1 to 6, further causing the computer to execute a process comprising:
causing a display device to display a data flow diagram from the first input data to the first output data, and changing a display mode of an image element that is included in the data flow diagram and indicates a relationship between the first input data and the first output data, based on a result of the quality evaluation of the first output data.

The information processing program according to claim 7, further causing the computer to execute a process comprising:
upon accepting a selection of an image that represents the first output data in the data flow diagram displayed on the display device, selecting, based on evaluation values of data passed through on the way to the first output data, an image element to be highlighted from among a plurality of image elements indicating relationships between the data leading to the first output data, and highlighting the selected image element.

An information processing method in which a computer:
extracts, based on history information indicating a history of input data and output data for each of a plurality of pieces of software, information on first input data that is a start point of data processing by the plurality of pieces of software; and
evaluates quality of first output data output by the data processing according to a comparison of whether the information on the first input data matches information on predetermined data input by a user.

An information processing device comprising:
a storage unit that stores history information indicating a history of input data and output data for each of a plurality of pieces of software; and
a processing unit that extracts, based on the history information stored in the storage unit, information on first input data that is a start point of data processing by the plurality of pieces of software, and evaluates quality of first output data output by the data processing according to a comparison of whether the information on the first input data matches information on predetermined data input by a user.