CN113139071A - Document processing system and method for classifying documents by machine learning - Google Patents
- Publication number
- CN113139071A (application CN202110087670.1A)
- Authority
- CN
- China
- Prior art keywords
- classification
- machine learning
- document
- codes
- folder
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/51—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/55—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/191—Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06V30/19147—Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/413—Classification of content, e.g. text, photographs or tables
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
- Processing Or Creating Images (AREA)
Abstract
The invention relates to a document processing system and method for classifying documents by machine learning. The system comprises an input module, a processing module and one or more storage modules. During a first modeling program, the storage module is preset with a classification folder, and the classification folder corresponds to a code. After the first modeling program is completed, the input module can receive one or more document images; the processing module performs a calculation on a set of machine learning model information and the document image to generate a calculation result, and stores the document image in the classification folder according to the calculation result. Document images are thus judged in real time and classified automatically according to the codes of the corresponding classification folders, which improves the accuracy and efficiency of document classification.
Description
Technical Field
The present invention relates to document processing systems and methods, and more particularly, to a document processing system and method for classifying documents using machine learning.
Background
As technology advances, electronic devices that make office work more convenient, such as copiers, scanners and multi-function office machines, have sprung up rapidly. A document placed on the paper feeding or paper placing component of such a device is scanned by its scanning component into an electronic document for storage, which makes document storage more convenient.
When people visit public offices, banks and similar places, they must fill in paper application forms, which differ in format and by service, and submit them to the staff member handling the case. The staff member scans the paper application into an electronic document image with a copier, scanner or multi-function office machine, and then either scans the bar code (barcode or patch code) on the paper application with a bar code scanner or manually checks specific characters and form layouts, in order to classify and file the electronic document image and complete the service. However, when the bar code scanner fails or the bar code is stained and cannot be read, the staff member's working time increases and efficiency drops, so members of the public grow impatient with the long wait and complain about the staff. Alternatively, when the staff member classifies and files the electronic document image according to specific characters and form layouts and is distracted by the surroundings, for example by inquiries from the public or from other staff, the image may be misfiled or left unclassified, which may harm the interests of the public.
Similarly, when people visit a hospital or clinic, they must fill in patient data, which counter staff scan into electronic document images with a copier, scanner or multi-function office machine and then classify and file. When the bar code scanner fails or the bar code is stained and cannot be read, the working time increases, and patients may not be seen in time because of the long wait, which can be dangerous. When counter staff classify and file the electronic document images according to specific characters and form layouts and are distracted by patients, nurses or other surroundings, the images may be misfiled or left unclassified, leaving the patient data in error and possibly leading to later medical disputes.
In short, the electronic document image obtained when a copier, scanner or multi-function office machine scans a paper document does help with document processing. In the subsequent classification and filing of those images, however, an unreadable bar code slows the work, and distraction from the surroundings makes the filing inaccurate. The manual classification and filing of electronic document images in the prior art is therefore time-consuming, inefficient and error-prone.
Disclosure of Invention
In view of the above shortcomings of the prior art, a primary objective of the present invention is to provide a document processing system and method for classifying documents by machine learning. The codes of the classification folders are set in advance, a calculation is performed on each document image by machine learning, and the document image is automatically classified and stored according to the calculation result and the codes of the classification folders, which improves the accuracy and efficiency of classification.
The main technical means adopted to achieve the above object is a document processing method for classifying documents by machine learning, executed on a document processing system that is preset with one or more classification folders, each corresponding to a code. The method comprises the following steps:
receiving one or more document images;
performing a calculation on the document image and a set of machine learning model information generated by a completed first modeling program, to generate a calculation result;
and storing the document image in the corresponding classification folder according to the calculation result and the codes of the one or more classification folders.
With this method, the calculation on the document image and the machine learning model information yields a calculation result; the calculation result is compared with the codes of the one or more classification folders, and the document image is stored in the corresponding classification folder. This simplifies the classification process, and automatic classification effectively improves the efficiency and accuracy of document classification.
Another main technical means for achieving the above object is a document processing system for classifying documents by machine learning, comprising:
an input module, which acquires one or more document images;
a storage module, which is preset with a classification folder, the classification folder corresponding to a code;
a processing module, connected to the input module and to the storage module;
wherein the processing module receives the document image and performs a calculation on the document image and a set of machine learning model information generated by a completed first modeling program to generate a calculation result, and the processing module compares the calculation result with the code of the classification folder to store the document image in the classification folder.
In this system, after the input module obtains the document image, the processing module performs a calculation on the document image and the machine learning model information generated by the completed first modeling program to generate a calculation result, compares the calculation result with the code of the classification folder to determine whether the document image should be stored there, and stores the document image in the classification folder of the storage module. This simplifies the classification process, and automatic classification effectively improves the efficiency and accuracy of document classification.
As described above, the present invention provides a document processing system and method for classifying documents by machine learning.
In order to make the aforementioned and other objects of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
The following description will explain embodiments of the present invention in further detail with reference to the accompanying drawings.
FIG. 1 is a block diagram of a system architecture according to a preferred embodiment of the present invention;
FIG. 2 is another system architecture diagram of the preferred embodiment of the present invention;
FIG. 3 is a flow chart of a first method of the preferred embodiment of the present invention;
FIG. 4 is a flow chart of a second method of the preferred embodiment of the present invention;
FIG. 5 is a flow chart of a third method in accordance with the preferred embodiment of the present invention;
FIG. 6 is a flow chart of a fourth method of the preferred embodiment of the present invention;
FIG. 7 is a flow chart of a fifth method of the preferred embodiment of the present invention;
FIG. 8 is a flow chart of a sixth method of the preferred embodiment of the present invention.
The reference numbers illustrate:
11: an input module;
12: a processing module;
13: a storage module;
131: classifying folders;
132: a code;
14: expanding a storage module;
141: expanding the classified folders;
142: a code.
Detailed Description
The following description of the embodiments of the present invention is provided for illustrative purposes, and other advantages and effects of the present invention will become apparent to those skilled in the art from the present disclosure. While the invention is described in connection with the preferred embodiments, there is no intent to limit its features to those embodiments. On the contrary, the description is intended to cover alternatives or modifications that may be derived from the claims of the present invention. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, although the invention may be practiced without them. Some details have also been omitted in order to avoid obscuring the focus of the present invention.
Referring to FIG. 1, a document processing system for classifying documents by machine learning according to a preferred embodiment of the present invention includes an input module 11, a processing module 12 and at least one storage module 13. The processing module 12 is electrically connected to the input module 11 and to the storage module 13. In this embodiment, the document processing system may be a copier, a scanner, or a multifunction peripheral (MFP) with a document scanning function. Two or more storage modules 13 may also be included, each storing different data.
In this embodiment, the input module 11 obtains one or more document images from one or more paper documents that the user wants to scan, and the document images are stored in the storage module 13 after being processed by the processing module 12. Specifically, the storage module 13 is preset with a classification folder 131 that corresponds to a code 132, and the code 132 is used in a comparison to determine whether a document image is stored in the classification folder 131. When two or more storage modules 13 are included, one or more classification folders 131 may be provided in each storage module 13, each corresponding to one code 132, so that the classification folder 131 in which a document image is to be stored is determined by comparison against the codes 132 of the classification folders 131. In the preferred embodiment, when the storage modules 13 are preset with the classification folders 131, the corresponding codes 132 may be assigned automatically, with no manual or prior labeling, which simplifies the classification decision. Alternatively, where there are special identification requirements, the codes 132 may be set for the classification folders 131 when the folders are preset. These approaches are examples only and are not limiting.
In use, after the user places one or more paper documents into the copier, scanner or multifunction peripheral with a document scanning function, the input module 11 obtains the corresponding one or more document images and outputs them to the processing module 12. The processing module 12 performs a calculation on the received document image and a set of machine learning model information generated by a completed first modeling program to generate a calculation result, and compares the calculation result with the code 132 of the classification folder 131 to store the document image in the classification folder 131.
Specifically, the calculation result for a document image includes a code. The processing module 12 compares this code with the code 132 of the classification folder 131 and, if the two are the same, stores the document image in the classification folder 131.
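The code comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ClassificationFolder` class, the `store_by_code` function and the use of string codes are hypothetical names and choices, not taken from the patent, which does not specify how codes or folders are represented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClassificationFolder:
    """Hypothetical stand-in for a classification folder (131) and its code (132)."""
    code: str
    images: List[bytes] = field(default_factory=list)


def store_by_code(result_code: str, image: bytes,
                  folders: List[ClassificationFolder]) -> Optional[ClassificationFolder]:
    """Store the image in the folder whose code equals the calculation result's code.

    Returns the folder that received the image, or None when no code matches.
    """
    for folder in folders:
        if folder.code == result_code:
            folder.images.append(image)
            return folder
    return None
```

Returning `None` on a mismatch leaves the caller free to decide what happens next, for example handing the image to a fallback step.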
In addition, when a user classifies document images with the system, the processing module 12 may be unable to determine from the calculation result whether the document image should be stored in the classification folder 131. In that case the processing module 12 further executes an auxiliary determination program. In this program, the processing module 12 obtains one or more pieces of image feature information from the received document image, processes the image feature information to generate an auxiliary judgment result, and compares the auxiliary judgment result with the code 132 of the classification folder 131 to determine whether the document image is stored in the classification folder 131. Specifically, the auxiliary judgment result includes a code, and the processing module 12 compares that code with the code 132 of the classification folder 131 to decide whether to store the document image there. The auxiliary determination program thereby improves the accuracy and applicability of document image classification. In the present embodiment, the image feature information includes optical character recognition (OCR) information, document image size information, document image color information, and the like.
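As a rough illustration of how image feature information might be turned into a code, the Python sketch below tries OCR keywords first and page size second. The patent does not say how the features are processed, so the rules, the `ImageFeatures` fields and the two lookup tables are all assumptions made for this example; color information is left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ImageFeatures:
    """Hypothetical container for features already extracted from a document image."""
    ocr_text: str                # text recovered by OCR
    size_mm: Tuple[int, int]     # (width, height) of the scanned page


def auxiliary_judgment(features: ImageFeatures,
                       keyword_to_code: Dict[str, str],
                       size_to_code: Dict[Tuple[int, int], str]) -> Optional[str]:
    """Derive a folder code from image features, or None if no rule applies.

    Assumed rule order: OCR keyword match first, then exact page size.
    """
    text = features.ocr_text.lower()
    for keyword, code in keyword_to_code.items():
        if keyword.lower() in text:
            return code
    return size_to_code.get(features.size_mm)
```

The returned code plays the role of the code in the auxiliary judgment result and would then be compared with the folder codes in the usual way.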
As mentioned above, the machine learning model information that the processing module 12 uses in the calculation with the document image is generated by the first modeling program. In the first modeling program, the processing module 12 first sets the classification folder 131 with its corresponding code 132; the code 132 may be preset automatically by the processing module 12 or set manually. The input module 11 then receives a plurality of document images for machine learning, and the processing module 12 processes these document images with a machine learning program to generate the set of machine learning model information. The set of machine learning model information includes a plurality of sets of coefficients, one of which corresponds to the code 132 of the classification folder 131. When the processing module 12 finishes generating the set of machine learning model information, the first modeling program is complete. The first modeling program thus establishes the machine learning model information needed to classify document images automatically, improving the efficiency and accuracy of document image classification.
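The patent does not name a learning algorithm, so the following Python sketch substitutes a deliberately simple one, a nearest-centroid classifier, purely to show what "one set of coefficients per folder code" could look like. It assumes each document image has already been reduced to a numeric feature vector; `initial_modeling` and `calculate` are hypothetical names.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def initial_modeling(samples: List[Tuple[Vector, str]]) -> Dict[str, List[float]]:
    """Build one coefficient set (here: the mean feature vector) per folder code."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for vector, code in samples:
        if code not in sums:
            sums[code] = [0.0] * len(vector)
            counts[code] = 0
        sums[code] = [s + v for s, v in zip(sums[code], vector)]
        counts[code] += 1
    return {code: [s / counts[code] for s in total] for code, total in sums.items()}


def calculate(model: Dict[str, List[float]], vector: Vector) -> str:
    """Return the code whose coefficient set is closest to the feature vector."""
    def squared_distance(coeffs: List[float]) -> float:
        return sum((c - v) ** 2 for c, v in zip(coeffs, vector))
    return min(model, key=lambda code: squared_distance(model[code]))
```

In this stand-in, the dictionary returned by `initial_modeling` plays the role of the machine learning model information, and the string returned by `calculate` plays the role of the code in the calculation result.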
Further, in another embodiment shown in FIG. 2, when the user has other types of document images to classify, the document processing system further includes one or more expansion storage modules 14, each preset with an expansion classification folder 141, and the processing module 12 further executes an expansion modeling program so that the additional types of document images can also be classified. In the expansion modeling program, the processing module 12 sets the expansion classification folder 141, which corresponds to a code 142; the processing module 12 receives a plurality of document images for machine learning and processes them with the machine learning program to generate a new set of machine learning model information. The new set of machine learning model information includes a plurality of new sets of coefficients, which correspond to the codes 132 of the classification folder 131 and the codes 142 of the expansion classification folder 141. When the processing module 12 finishes generating the new set of machine learning model information, the expansion modeling program is complete. The new set of machine learning model information is then used in the calculation for both the new types and the previously classified types of document images, and each document image is stored in the corresponding classification folder 131 or expansion classification folder 141. This improves the extensibility and flexibility of the document processing system.
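One way to read the expansion modeling program is as a retraining step that must end with a coefficient set for every existing code 132 and every new code 142. The Python sketch below expresses only that requirement. It assumes, without support from the patent, that the earlier training samples are kept and reused alongside the new ones; `expansion_modeling` and the `train` callback are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

Sample = Tuple[Sequence[float], str]    # (feature vector, folder code)
Model = Dict[str, List[float]]          # folder code -> coefficient set


def expansion_modeling(train: Callable[[List[Sample]], Model],
                       existing_samples: List[Sample],
                       new_samples: List[Sample],
                       folder_codes: Set[str],
                       expansion_codes: Set[str]) -> Model:
    """Retrain on old and new samples and check every code has a coefficient set."""
    model = train(existing_samples + new_samples)
    missing = (folder_codes | expansion_codes) - set(model)
    if missing:
        raise ValueError(f"no training images for codes: {sorted(missing)}")
    return model
```

Passing the training routine in as a callback keeps the sketch independent of whichever learning method is actually used.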
To illustrate the foregoing embodiments, a specific use scenario is described below as a non-limiting example. Suppose the user wants to classify two bills of different types. After the input module 11 obtains the document images of the two bills, the processing module 12 performs the calculation on each document image in turn with the machine learning model information to obtain the corresponding calculation results, and compares each calculation result with the codes 132 of the classification folders 131 to determine in which classification folder 131 each bill's document image should be stored. If the document image of one of the bills cannot be matched to the code 132 of any classification folder 131, the processing module 12 executes the auxiliary determination program to determine the classification folder 131 in which it should be stored. In addition, if document images of further bill types are to be classified, a corresponding number of expansion storage modules 14 are provided, each preset with a corresponding expansion classification folder 141, and the processing module 12 completes the expansion modeling program to obtain new machine learning model information, which is used to compute calculation results corresponding to one of the classification folders 131 or one of the expansion classification folders 141.
Therefore, according to the above embodiments and specific application, the input module 11 outputs the received document images to the processing module 12, the processing module 12 performs calculation according to the machine learning model information and the received document images to generate a calculation result, the processing module 12 compares the calculation result with the codes 132 of the classification folder 131 to determine whether the document images should be classified and stored in the classification folder 131, and the processing module 12 stores the document images in the corresponding classification folder 131, so as to simplify the classification process and effectively improve the efficiency and accuracy of classifying the documents by an automated classification method.
In addition, the auxiliary judgment process can provide further auxiliary judgment for the file images which cannot be judged at present, so that the accuracy and the applicability of classifying the file images are improved.
In addition, the expansion modeling program extends the system to classify further types of document images, which improves the extensibility and flexibility of the invention.
Based on the above embodiments and specific application, the present invention further provides a document processing method for classifying documents by machine learning, shown in FIG. 3, which is executed on the document processing system described above. The document processing system is preset with one or more classification folders 131, each corresponding to a code 132, and the method includes the following steps:
receiving one or more document images (S20);
performing a calculation on the document image and a set of machine learning model information generated by the completed first modeling program, to generate a calculation result (S30);
based on the calculation result and the code 132 of the classification folder 131, the document image is stored in the corresponding classification folder 131 (S40).
As shown in FIG. 4, the step of storing the document image in the corresponding classification folder 131 according to the calculation result and the code 132 of the classification folder 131 (S40) further includes the following step:
the calculated result is compared with the codes 132 of the classification folders 131 to store the document images in the corresponding classification folders 131 (S41).
Referring to FIG. 5, the step of comparing the calculation result with the codes 132 of the classification folders 131 to store the document image in the corresponding classification folder 131 (S41) further includes the following steps:
judging, according to the code of the calculation result, whether it matches the code 132 of a classification folder 131 (S411);
if so, the document image is stored in the corresponding classification folder 131 (S412).
Referring to FIG. 5, if the judgment in step S411 is negative, that is, the code of the calculation result matches the code 132 of none of the classification folders 131, the document processing system further provides an auxiliary determination program (S413).
Referring to fig. 5 and 6, the auxiliary determination procedure includes the following steps:
acquiring one or more image feature information of the document image (S4131);
processing according to the image characteristic information to generate an auxiliary judgment result (S4132);
comparing the codes of the auxiliary judgment result with the codes 132 of the one or more classification folders 131 to store the document images in the corresponding classification folders 131 (S4133); the image feature information includes Optical Character Recognition (OCR), document image size information, document image color information, and the like.
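The branch between steps S411, S412 and S413 to S4133 can be summarized as a short control-flow sketch in Python. The `calculate` and `auxiliary` callbacks are hypothetical placeholders for the model calculation and the auxiliary determination program, and representing folders as a dictionary keyed by code is an assumption. The patent does not state what happens when the auxiliary result also fails to match; here the image is simply left unstored.

```python
from typing import Callable, Dict, List, Optional


def classify_and_store(image: bytes,
                       calculate: Callable[[bytes], Optional[str]],
                       auxiliary: Callable[[bytes], Optional[str]],
                       folders: Dict[str, List[bytes]]) -> Optional[str]:
    """Run S411 to S4133 for one image and return the folder code used, if any."""
    code = calculate(image)          # code carried by the calculation result
    if code not in folders:          # S411: no folder code matches
        code = auxiliary(image)      # S413: fall back to the auxiliary program
    if code in folders:              # S412 / S4133: store on a match
        folders[code].append(image)
        return code
    return None                      # assumed behavior: leave the image unstored
```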
In this embodiment, please refer to fig. 7, wherein the first modeling procedure further includes the following steps:
setting a code 132 corresponding to the classification folder 131 (S51);
receiving a plurality of document images (S52); wherein, the received file image is used for machine learning;
executing a machine learning program to process the document images to generate machine learning model information (S53); wherein the machine learning model information includes a plurality of sets of coefficients, one of the plurality of sets of coefficients corresponding to the code 132 of the classification folder 131.
In this embodiment, if a new document image is to be classified, the document processing system further includes more than one expanded classification folder 141, and the method further provides an expanded modeling program, and please refer to fig. 8, where the expanded modeling program further includes the following steps:
setting a code 142 of the expanded classification folder 141 (S61);
receiving a plurality of document images (S62); wherein, the received file image is used for machine learning;
processing the document images by a machine learning program to generate a new set of machine learning model information (S63); wherein the new set of machine learning model information includes a plurality of new sets of coefficients corresponding to the code 132 of the classification folder 131 and the code 142 of the expanded classification folder 141.
In summary, the above-mentioned embodiments are provided only for illustrating the principles and effects of the present invention, and not for limiting the present invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.
Claims (17)
1. A document processing method for classifying documents by machine learning, the method being performed on a document processing system preset with one or more classification folders, the classification folders having corresponding codes, the method comprising the steps of:
receiving one or more document images;
performing a calculation on the document image and a set of machine learning model information generated by a completed first modeling program, to generate a calculation result;
and storing the document image in the corresponding classification folder according to the calculation result and the code of the classification folder.
2. The method of claim 1, wherein when the step of storing the document image in the corresponding classification folder according to the calculation result and the code of the classification folder is executed, the method further comprises the steps of:
and comparing the calculation result with the codes of the classification folders to store the file images in the corresponding classification folders.
3. The method of claim 2, wherein the step of comparing the calculation result with the codes of the classification folders to store the document image in the corresponding classification folder further comprises the steps of:
determining whether the code of the calculation result matches the code of one of the classification folders;
and if so, storing the document image in the corresponding classification folder.
4. The method of claim 3, wherein if the step of determining whether the code of the calculation result matches the code of one of the classification folders finds no match, the method further provides an auxiliary judgment program.
5. The method of claim 4, wherein said auxiliary judgment program comprises the steps of:
acquiring one or more items of image feature information of the document image;
processing the image feature information to generate an auxiliary judgment result;
and comparing the code of the auxiliary judgment result with the codes of the classification folders to store the document image in the corresponding classification folder.
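The match check of claim 3 and the fallback of claims 4 and 5 can be sketched as below. The sketch assumes the image feature information consists of OCR text, page size, and a color flag (the kinds of information named later in claim 15). The specific rules inside `auxiliary_judgment`, the folder codes `INV` and `FORM`, and all function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ImageFeatures:
    """Assumed image feature information: OCR text, page size, color flag."""
    ocr_text: str
    width_mm: int
    height_mm: int
    is_color: bool


def auxiliary_judgment(features: ImageFeatures) -> Optional[str]:
    """Hypothetical rules that map image features to a folder code."""
    if "invoice" in features.ocr_text.lower():
        return "INV"
    if (features.width_mm, features.height_mm) == (210, 297) and not features.is_color:
        return "FORM"
    return None


def file_image(folders: Dict[str, List[str]], name: str,
               result_code: str, features: ImageFeatures) -> Optional[str]:
    """Store by the calculation result if it matches a folder code,
    otherwise fall back to the auxiliary judgment result."""
    code = result_code if result_code in folders else auxiliary_judgment(features)
    if code in folders:
        folders[code].append(name)
        return code
    return None  # no folder matched; the image is left unfiled


folders: Dict[str, List[str]] = {"INV": [], "FORM": []}
direct = file_image(folders, "a.png", "INV", ImageFeatures("", 210, 297, True))
fallback = file_image(folders, "b.png", "UNKNOWN",
                      ImageFeatures("Invoice No. 12", 210, 297, True))
unfiled = file_image(folders, "c.png", "UNKNOWN",
                     ImageFeatures("memo", 100, 150, True))
```

What happens to an image that neither path can place is not specified in the claims; returning `None` here is a design choice of the sketch.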
6. The method of claim 1, wherein said primary modeling program further comprises the steps of:
setting the codes corresponding to the classification folders;
receiving a plurality of document images;
and executing a machine learning program to process the plurality of document images to generate the machine learning model information.
7. The method of processing documents using machine learning for document classification according to claim 6, wherein said machine learning model information includes a plurality of sets of coefficients, one of said plurality of sets of coefficients corresponding to a code of said classification folder.
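The primary modeling program of claims 6 and 7 can be sketched as a training step that produces one coefficient set per folder code. The claims do not name a learning algorithm, so this sketch swaps in a deliberately simple stand-in: the coefficient set for a code is the mean feature vector of the training images labeled with that code. The feature vectors, folder codes, and the function name `primary_modeling` are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Assumption: each training image is a (folder code, feature vector) pair.
Sample = Tuple[str, Sequence[float]]


def primary_modeling(samples: List[Sample]) -> Dict[str, List[float]]:
    """Return model information: one coefficient set per folder code.

    Stand-in learner: the coefficient set is the per-feature mean of the
    training vectors for that code (not the algorithm of the patent).
    """
    grouped: Dict[str, List[Sequence[float]]] = defaultdict(list)
    for code, features in samples:
        grouped[code].append(features)

    model: Dict[str, List[float]] = {}
    for code, rows in grouped.items():
        # zip(*rows) iterates over feature columns across all rows.
        model[code] = [sum(column) / len(rows) for column in zip(*rows)]
    return model


training = [
    ("A01", [1.0, 0.0]),
    ("A01", [0.8, 0.2]),
    ("B02", [0.0, 1.0]),
]
model = primary_modeling(training)
```

The resulting dictionary has exactly one entry per folder code, mirroring the correspondence between coefficient sets and codes described in claim 7.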
8. The method of claim 1, wherein the document processing system further comprises one or more expanded classification folders, and wherein the method further provides an expanded modeling program.
9. The method of claim 8, wherein said expanded modeling program further comprises the steps of:
setting a code of the expanded classification folder;
receiving a plurality of document images;
and processing the plurality of document images through a machine learning program to generate a new set of machine learning model information.
10. The method of processing documents using machine learning for document classification according to claim 9, wherein said new machine learning model information includes a plurality of sets of new coefficients, said plurality of sets of new coefficients corresponding to codes of said classification folder and said expanded classification folder.
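The expanded modeling program of claims 9 and 10 can be sketched as retraining over the union of the original and expanded folder codes, so that the new model information holds a new coefficient set for every code. As an assumption, the same simple class-mean learner is used as a stand-in for the unspecified machine learning program; the codes `A01`, `B02`, `C03` and all function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[str, Sequence[float]]


def train(samples: List[Sample], codes: List[str]) -> Dict[str, List[float]]:
    """Stand-in learner: one mean-vector coefficient set per requested code."""
    model: Dict[str, List[float]] = {}
    for code in codes:
        rows = [features for c, features in samples if c == code]
        if not rows:
            raise ValueError(f"no training images for folder code {code!r}")
        model[code] = [sum(column) / len(rows) for column in zip(*rows)]
    return model


def expanded_modeling(folder_codes: List[str], expanded_codes: List[str],
                      samples: List[Sample]) -> Dict[str, List[float]]:
    """Set the expanded codes, then regenerate coefficients for all codes."""
    all_codes = list(folder_codes)
    all_codes += [c for c in expanded_codes if c not in folder_codes]
    return train(samples, all_codes)


samples = [
    ("A01", [1.0, 0.0, 0.0]),
    ("B02", [0.0, 1.0, 0.0]),
    ("C03", [0.0, 0.0, 1.0]),  # images for the expanded classification folder
]
new_model = expanded_modeling(["A01", "B02"], ["C03"], samples)
```

Retraining from scratch is only one way to obtain new coefficient sets for both the original and the expanded folders; the claims do not say whether the earlier model information is reused.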
11. The method of document processing using machine learning for document classification as claimed in claim 1, wherein the document processing system includes a copier, a scanner or a multi-function peripheral.
12. A document processing system for classifying documents using machine learning, comprising:
an input module that acquires one or more document images;
a storage module preset with classification folders, each classification folder having a corresponding code; and
a processing module connected to the input module and to the storage module;
wherein the processing module receives the document images and performs a calculation on the document images using a set of machine learning model information generated by a completed primary modeling program, to generate a calculation result, and the processing module compares the calculation result with the codes of the classification folders to store the document images in the corresponding classification folders.
13. The system of claim 12, wherein the processing module compares the code of the calculation result with the code of the classification folder, and stores the document image in the corresponding classification folder when the code of the calculation result matches the code of the classification folder.
14. The system of claim 13, wherein when the code of the calculation result does not match the code of the classification folder, the processing module further executes an auxiliary judgment program in which the processing module obtains one or more items of image feature information from the document image, processes the image feature information to generate an auxiliary judgment result, and compares the code of the auxiliary judgment result with the code of the classification folder to store the document image in the corresponding classification folder.
15. The system of claim 14, wherein the image feature information includes optical character recognition information, document image size information or document image color information.
16. The system of claim 12, wherein the input module receives a plurality of document images when the processing module executes the primary modeling program, the processing module processes the plurality of document images through a machine learning program to generate the machine learning model information, the machine learning model information includes a plurality of sets of coefficients, and one of the plurality of sets of coefficients corresponds to a code of the classification folder.
17. The system of claim 12, further comprising one or more expansion modules, each expansion module being preset with an expanded classification folder; wherein the processing module executes an expanded modeling program to set a code corresponding to the expanded classification folder, receives a plurality of document images, and processes the plurality of document images through a machine learning program to generate a new set of machine learning model information, the new set of machine learning model information comprising a plurality of new sets of coefficients that correspond to the codes of the classification folders and the code of the expanded classification folder.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW109102766 | 2020-01-30 | | |
| TW109102766A TWI750572B (en) | 2020-01-30 | 2020-01-30 | Document processing system and method for document classification using machine learning |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN113139071A (en) | 2021-07-20 |
| CN113139071B (en) | 2023-10-24 |
Family
ID=76811210
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110087670.1A (Active; CN113139071B (en)) | 2020-01-30 | 2021-01-22 | File processing system and method for classifying files by machine learning |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US11663526B2 (en) |
| CN (1) | CN113139071B (en) |
| TW (1) | TWI750572B (en) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP7268764B1 (en) * | 2022-01-14 | 2023-05-08 | 凸版印刷株式会社 | Image processing device, image processing method and image processing program |
| CN118939604B (en) * | 2024-07-24 | 2025-02-21 | 广西智汇通人力资源有限公司 | Data processing method and system based on archive information |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103699904A (en) * | 2013-12-25 | 2014-04-02 | 大连理工大学 | Image computer-aided diagnosis method for multi-sequence nuclear magnetic resonance images |
| CN105320945A (en) * | 2015-10-30 | 2016-02-10 | 小米科技有限责任公司 | Image classification method and apparatus |
| CN107220975A (en) * | 2017-07-31 | 2017-09-29 | 合肥工业大学 | Uterine neck image intelligent auxiliary judgment system and its processing method |
| CN108109680A (en) * | 2017-12-20 | 2018-06-01 | 南通艾思达智能科技有限公司 | A kind of method of settlement of insurance claim image bag sorting |
| CN109977073A (en) * | 2019-03-11 | 2019-07-05 | 厦门纵横集团科技股份有限公司 | A kind of law court's electronics folder automation filing system and its method |
| US20200019853A1 (en) * | 2018-07-13 | 2020-01-16 | Primax Electronics Ltd. | Product testing system with auxiliary judging function and auxiliary testing method applied thereto |
Family Cites Families (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8495679B2 (en) * | 2000-06-30 | 2013-07-23 | Thomson Licensing | Method and apparatus for delivery of television programs and targeted de-coupled advertising |
| US10043112B2 (en) * | 2014-03-07 | 2018-08-07 | Qualcomm Incorporated | Photo management |
| US10382792B2 (en) * | 2015-02-13 | 2019-08-13 | Lg Electronics Inc. | Method and apparatus for encoding and decoding video signal by means of transform-domain prediction |
| US11580398B2 (en) * | 2016-10-14 | 2023-02-14 | KLA-Tencor Corp. | Diagnostic systems and methods for deep learning models configured for semiconductor applications |
| US10395362B2 (en) * | 2017-04-07 | 2019-08-27 | Kla-Tencor Corp. | Contour based defect detection |
| TWI662511B (en) * | 2017-10-03 | 2019-06-11 | 財團法人資訊工業策進會 | Hierarchical image classification method and system |
| US11500533B2 (en) * | 2018-02-14 | 2022-11-15 | Lg Electronics Inc. | Mobile terminal for displaying a preview image to be captured by a camera and control method therefor |
| TW202004519A (en) * | 2018-06-05 | 2020-01-16 | 正修學校財團法人正修科技大學 | Method for automatically classifying images |
| US11537506B1 (en) * | 2018-10-26 | 2022-12-27 | Amazon Technologies, Inc. | System for visually diagnosing machine learning models |
| JP2021043775A (en) * | 2019-09-12 | 2021-03-18 | 富士ゼロックス株式会社 | Information processing device and program |
| US11423308B1 (en) * | 2019-09-20 | 2022-08-23 | Apple Inc. | Classification for image creation |
| JP7439435B2 (en) * | 2019-09-30 | 2024-02-28 | 富士フイルムビジネスイノベーション株式会社 | Information processing device and program |
| JP7357579B2 (en) * | 2020-03-30 | 2023-10-06 | シャープ株式会社 | Image processing device, image processing method and program |
| US20220075845A1 (en) * | 2020-05-18 | 2022-03-10 | Best Apps, Llc | Computer aided systems and methods for creating custom products |
- 2020-01-30: TW application TW109102766A, patent TWI750572B (en), active
- 2021-01-22: CN application CN202110087670.1A, patent CN113139071B (en), active
- 2021-01-27: US application US17/160,326, patent US11663526B2 (en), active
Non-Patent Citations (2)
| Title |
|---|
| WANG CHUNFENG et al.: "Heterogeneous transfer learning based on stack sparse auto-encoders for fault diagnosis", 《2018 CHINESE AUTOMATIC CONGRESS(CAC)》, pages 1 - 2 * |
| ZHAO Keyang (赵可杨) et al.: "Machine learning-assisted tumor diagnosis" (机器学习辅助肿瘤诊断), 《肿瘤》 (Tumor), vol. 38, no. 10, pages 987 - 991 * |
Also Published As
| Publication number | Publication date |
|---|---|
| US11663526B2 (en) | 2023-05-30 |
| TWI750572B (en) | 2021-12-21 |
| CN113139071B (en) | 2023-10-24 |
| US20210240974A1 (en) | 2021-08-05 |
| TW202129513A (en) | 2021-08-01 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US20210209359A1 (en) | Image processing apparatus, control method for image processing apparatus, and non-transitory storage medium | |
| CN101184137B (en) | Image processing method and device, image reading and forming device | |
| US8185398B2 (en) | Reading device with shortcut read function | |
| EP1473641A2 (en) | Information processing apparatus, method, storage medium and program | |
| US11620844B2 (en) | Image processing apparatus, control method of image processing apparatus, and storage medium | |
| US10530957B2 (en) | Image filing method | |
| US8300944B2 (en) | Image processing method, image processing apparatus, image reading apparatus, image forming apparatus, image processing system, and storage medium | |
| US7463772B1 (en) | De-warping of scanned images | |
| CN113139071A (en) | Document processing system and method for classifying documents by machine learning | |
| CN105787425A (en) | Information processing apparatus, system, and information processing method | |
| JP6435934B2 (en) | Document image processing program, image processing apparatus and character recognition apparatus using the program | |
| JP2009206658A (en) | Image processing method, image processor, image forming apparatus, program, and storage medium | |
| JP4859054B2 (en) | Image processing apparatus, image processing method, program, and recording medium | |
| US20210118316A1 (en) | Document checking system and grading system | |
| JP5962449B2 (en) | Determination program, determination method, and determination apparatus | |
| US10834281B2 (en) | Document size detecting by matching between image of entire document and read size image | |
| JP3093493B2 (en) | Image storage and retrieval device | |
| US8451461B2 (en) | Information processor, information processing system, and computer readable medium | |
| US11657632B2 (en) | Image processing device, image reading device, image processing method, and non-transitory computer readable medium, using two pieces of image data | |
| CN101609453A (en) | A kind of separator page and the method and apparatus that utilizes the document classification of this separator page | |
| JP6303742B2 (en) | Image processing apparatus, image processing method, and image processing program | |
| JP2007041709A (en) | Document processing system, document processing system control method, document processing apparatus, computer program, and computer-readable storage medium | |
| CN112364868A (en) | Rotation correction method and device for electronic file | |
| CN116389649B (en) | Paper scanning input storage method, equipment and computer readable storage medium | |
| JP2020004345A (en) | Image collation system, image collation method, and program |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |