CN111681301B - Method and device for processing pictures and texts in slide, terminal and storage medium
- Publication number
- CN111681301B (publication) · CN202010512421.8A (application)
- Authority
- CN
- China
- Prior art keywords
- picture
- text
- slide
- information
- named
- Prior art date
- Legal status (assumption; not a legal conclusion)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/60—Editing figures and text; Combining figures or text
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/51—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5846—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Library & Information Science (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a method, a device, a terminal and a storage medium for processing pictures and texts in a slide. The method comprises the following steps: determining picture information and text information of each slide in a slide file; performing natural language analysis on the text information of all slides to obtain named text sentences of all slides; matching a named text sentence to each picture according to the picture information in the slide file and the named text sentences of the slide file; and establishing a correspondence between the picture information and the text information in the slide file according to the picture information in the slide file and the picture naming text sentence matched to each picture. The scheme of the invention solves the problem that it is difficult to find the correspondence between the multiple pictures and the multiple pieces of text information in each page of an engineering slide, and reduces the difficulty of finding that correspondence.
Description
Technical Field
The invention belongs to the technical field of construction engineering informatization, and in particular relates to a method, a device, a terminal and a storage medium for processing pictures and texts in a slide, and more particularly to a method, a device, a terminal and a storage medium for automatically matching pictures and texts in a slide and for automatically naming and separately storing the pictures in construction engineering material slides.
Background
In the engineering management process, a large number of slides in PPT format (Microsoft Office PowerPoint, Microsoft's presentation software) are used for communication and exchange. As high-quality engineering materials, these slides are important assets for enterprises. The slides contain a large number of valuable pictures, which need to be consulted frequently in business and can be reused when producing subsequent documents. Because the pictures are scattered across slide files, they are difficult to locate quickly, so the slide files often have to be opened and searched manually, which is inefficient. It is therefore necessary to find the correspondence between the multiple pictures and the multiple pieces of text information in each page of an engineering slide so that the slides can be viewed and used more conveniently. Engineering slides are characterized by a large number of pictures and a large amount of text information, which makes the processing difficult, so selecting a suitable algorithm to find the correspondence between the multiple pictures and the multiple pieces of text information in each page of an engineering slide is particularly important.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The invention aims to overcome the above defects by providing a method, a device, a terminal and a storage medium for processing pictures and texts in a slide, so as to solve the problem that it is difficult to find the correspondence between multiple pictures and multiple pieces of text information in each page of an engineering slide, and to reduce that difficulty.
The invention provides a processing method of pictures and texts in a slide, which comprises the following steps: determining picture information and text information of each slide in the slide file; carrying out natural language analysis processing on text information of all pages of slides in the slide file to obtain named text sentences of all pages of slides in the slide file; matching the named text statement for each picture in the slide file according to the picture information in the slide file and the named text statement of the slide file to obtain the picture named text statement of each picture; establishing a corresponding relation between the picture information and the text information in the slide file according to the picture information in the slide file and the picture naming text statement obtained by matching each picture; wherein, the picture information of each slide in the slide file comprises: width and height of each slide, and picture bounding box information of each picture in each slide; text information of each slide in the slide file includes: text content, word size, and/or text bounding box information for all text statements for each page of the slide.
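Read as a data flow, the claimed method operates on two per-slide records: picture information and text information. The sketch below is only an illustrative way to hold that data in Python; the field names are assumptions and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class PictureInfo:
    # Picture bounding box within one slide: (X_min, Y_min, X_max, Y_max)
    bbox: Tuple[float, float, float, float]
    name_sentence: Optional[str] = None  # picture naming text sentence, filled in later

@dataclass
class TextInfo:
    content: str        # text content of one text sentence
    font_size: float    # word size
    bbox: Tuple[float, float, float, float]  # text bounding box: (x_min, y_min, x_max, y_max)

@dataclass
class SlideInfo:
    width: float
    height: float
    pictures: List[PictureInfo] = field(default_factory=list)
    texts: List[TextInfo] = field(default_factory=list)
```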
Optionally, determining the picture information and the text information of each slide in the slide file includes: acquiring the width and the height of each slide in the slide file, and acquiring the picture bounding box information of each picture in each slide as the picture information of each slide in the slide file; acquiring text content, word size and/or text bounding box information of all text sentences in each slide as text information of each slide in the slide file; and/or, performing natural language analysis processing on text information of all pages of slides in the slide file, including: determining each page of corpus information of each slide in the slide file and the full text corpus information of all pages of slides in the slide file based on the text information of all pages of slides in the slide file; performing natural language analysis processing on each page of corpus information and full-text corpus information to obtain a subject word set and a trivial word set of each page of slides, and a full-text subject word set and a full-text trivial word set of all pages of slides; determining a named file statement of all text information in the slide file based on each subject word set and each trivial word set of each slide and the full-text subject word sets and full-text trivial word sets of all slides; and/or, establishing a corresponding relation between the picture information and the text information in the slide file, including: and naming a text sentence for each picture in each slide in the slide file, constructing a storage result for each picture, and storing each picture according to the storage result.
Optionally, determining the per-page corpus information of each slide in the slide file and the full-text corpus information of all slides in the slide file includes: connecting the text information of each slide in the slide file in a first set order to form the corpus information of each slide; eliminating, from the text information of all slides in the slide file, text information that appears repeatedly at the same position in different slides, to obtain corrected text information of all slides; and connecting the corrected text information of all slides in a second set order to form the full-text corpus information of all slides. And/or, performing natural language analysis on each page of corpus information and the full-text corpus information includes: performing Chinese word segmentation on each corpus among the per-page corpus information and the full-text corpus information by using an N-shortest-path word segmentation algorithm to obtain word segmentation results, in which the sentences of each corpus are segmented into words and phrases of a set specification and set proper nouns; selecting, by using the TextRank algorithm, a set number of words from the word segmentation result of each corpus according to a set score order, as the subject word set of that corpus; and adding the words of the corpus that meet the trivial-word criterion (namely, words that do not belong to the subject word set but occur more frequently than any subject word) to the trivial word set of that corpus. And/or, determining the named file sentences of all text information in the slide file includes: determining, according to the word size and the text bounding box information in the text information of each slide in the slide file, the text content with the largest word size among the text sentences whose minimum Y-direction value in the text bounding box information is greater than a set coefficient times the slide height, as the named text sentence of the slide; and, by analogy, obtaining the named text sentences of all slides in the slide file as the named file sentences of all text information in the slide file. And/or, matching a named text sentence to each picture in the slide file includes: determining, according to the picture bounding box information in the picture information of each picture in each slide, all named text sentences within a set range of directions around each picture, to obtain all named text sentences of the picture; if all named text sentences of the picture are empty, determining the named text sentence of the slide on which the picture is located as the picture naming text sentence of the picture; if all named text sentences of the picture are not empty, determining weights according to all named text sentences of the picture, and determining matching results of the picture with all named text sentences according to all named text sentences of the picture; and determining, according to the maximum value of the sum of the weight and the matching result of the picture, the named text sentence corresponding to that maximum value as the named text sentence matched to the picture, so as to take it as the picture naming text sentence of the picture. And/or, constructing a storage result for each picture and storing each picture according to the storage result includes: for each slide, creating a folder and naming the folder with the named text sentence of the slide; storing each picture separately as an independent file, naming the independent file of each picture with its picture naming text sentence, and placing the named file in the newly created folder; establishing a storage record for the picture in a database; and, in the storage record of the picture, determining and storing a picture full name field of the picture according to the picture naming text sentence of the picture.
Optionally, determining the weight according to all named text sentences of the picture includes: calculating the weight according to the word size of each named text sentence of the picture, the distance between each named text sentence of the picture and the center point of the picture, and the overlap length between each named text sentence of the picture and the picture, wherein: d_j is the distance between each named text sentence of the picture and the center point of the picture, and l_j is the overlap length between each named text sentence of the picture and the picture. And/or, determining the matching result of the picture with all named text sentences according to all named text sentences of the picture includes: matching the word-segmentation words in each named text sentence of the picture, in turn, against the per-page subject word set and per-page trivial word set of the page on which the picture is located and the full-text subject word set and full-text trivial word set of all slides, so as to obtain the matching result. And/or, determining and storing the picture full name field of the picture according to the picture naming text sentence of the picture includes: if the length of the Chinese characters in the picture naming text sentence of the picture is less than or equal to a preset value, directly storing the picture naming text sentence of the picture as the picture full name field of the picture; if the length of the Chinese characters in the picture naming text sentence of the picture is greater than the preset value, extracting a set phrase from the picture naming text sentence of the picture by using an information entropy algorithm, taking the set phrase as a new picture naming text sentence of the picture, and storing the new picture naming text sentence of the picture as the picture full name field of the picture; segmenting the picture full name field of the picture into search engine words by using a fast word segmentation method, and storing the search engine words as the picture search words of the picture; and traversing all pictures in the slide file, and pairing the pictures that have the same picture search word and storing them in the database as similar pictures.
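As a rough illustration of the full-name, search-word and similar-picture steps recited above, the following sketch uses the jieba library's search-engine segmentation mode as a stand-in for the "fast word segmentation method"; the information-entropy phrase extraction is not reproduced and is replaced here by simple truncation, and the length threshold is an assumed value, not the patent's preset value.

```python
import jieba
from collections import defaultdict

def full_name_field(name_sentence: str, max_len: int = 15) -> str:
    """Picture full-name field. Short naming sentences are stored as-is; the patent
    shortens longer ones with an information-entropy phrase extraction, which is not
    reproduced here -- plain truncation stands in, and max_len is an assumed value."""
    if len(name_sentence) <= max_len:
        return name_sentence
    return name_sentence[:max_len]

def search_words(full_name: str):
    # search-engine style segmentation of the full-name field (jieba's cut_for_search
    # mode as a stand-in for the "fast word segmentation method")
    return {w for w in jieba.cut_for_search(full_name) if w.strip()}

def pair_similar_pictures(pictures):
    """pictures: iterable of (picture_id, naming_sentence). Pictures that share a
    search word are grouped so they can be stored as similar pictures."""
    by_word = defaultdict(list)
    for pid, name in pictures:
        for w in search_words(full_name_field(name)):
            by_word[w].append(pid)
    return {w: ids for w, ids in by_word.items() if len(ids) > 1}
```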
In accordance with another aspect of the present invention, there is provided a processing apparatus for pictures and texts in a slide, comprising: a determining unit for determining picture information and text information of each slide in the slide file; the processing unit is used for carrying out natural language analysis processing on the text information of all the pages of slides in the slide file to obtain named text sentences of all the pages of slides in the slide file; the processing unit is also used for matching the named text statement for each picture in the slide file according to the picture information in the slide file and the named text statement of the slide file to obtain the picture named text statement of each picture; the processing unit is also used for establishing a corresponding relation between the picture information and the text information in the slide file according to the picture information in the slide file and the picture naming text statement obtained by matching each picture; wherein, the picture information of each slide in the slide file comprises: width and height of each slide, and picture bounding box information of each picture in each slide; text information of each slide in the slide file includes: text content, word size, and/or text bounding box information for all text statements for each page of the slide.
Optionally, the determining unit determines picture information and text information of each slide in the slide file, including: acquiring the width and the height of each slide in the slide file, and acquiring the picture bounding box information of each picture in each slide as the picture information of each slide in the slide file; acquiring text content, word size and/or text bounding box information of all text sentences in each slide as text information of each slide in the slide file; and/or, the processing unit performs natural language analysis processing on text information of all pages of slides in the slide file, including: determining each page of corpus information of each slide in the slide file and the full text corpus information of all pages of slides in the slide file based on the text information of all pages of slides in the slide file; performing natural language analysis processing on each page of corpus information and full-text corpus information to obtain a subject word set and a trivial word set of each page of slides, and a full-text subject word set and a full-text trivial word set of all pages of slides; determining a named file statement of all text information in the slide file based on each subject word set and each trivial word set of each slide and the full-text subject word sets and full-text trivial word sets of all slides; and/or, the processing unit establishes a corresponding relation between the picture information and the text information in the slide file, and the method comprises the following steps: and naming a text sentence for each picture in each slide in the slide file, constructing a storage result for each picture, and storing each picture according to the storage result.
Optionally, the processing unit determines the per-page corpus information of each slide in the slide file and the full-text corpus information of all slides in the slide file, including: connecting the text information of each slide in the slide file in a first set order to form the corpus information of each slide; eliminating, from the text information of all slides in the slide file, text information that appears repeatedly at the same position in different slides, to obtain corrected text information of all slides; and connecting the corrected text information of all slides in a second set order to form the full-text corpus information of all slides. And/or, the processing unit performs natural language analysis on each page of corpus information and the full-text corpus information, including: performing Chinese word segmentation on each corpus among the per-page corpus information and the full-text corpus information by using an N-shortest-path word segmentation algorithm to obtain word segmentation results, in which the sentences of each corpus are segmented into words and phrases of a set specification and set proper nouns; selecting, by using the TextRank algorithm, a set number of words from the word segmentation result of each corpus according to a set score order, as the subject word set of that corpus; and adding the words of the corpus that meet the trivial-word criterion (namely, words that do not belong to the subject word set but occur more frequently than any subject word) to the trivial word set of that corpus. And/or, the processing unit determines the named file sentences of all text information in the slide file, including: determining, according to the word size and the text bounding box information in the text information of each slide in the slide file, the text content with the largest word size among the text sentences whose minimum Y-direction value in the text bounding box information is greater than a set coefficient times the slide height, as the named text sentence of the slide; and, by analogy, obtaining the named text sentences of all slides in the slide file as the named file sentences of all text information in the slide file. And/or, the processing unit matches a named text sentence to each picture in the slide file, including: determining, according to the picture bounding box information in the picture information of each picture in each slide, all named text sentences within a set range of directions around each picture, to obtain all named text sentences of the picture; if all named text sentences of the picture are empty, determining the named text sentence of the slide on which the picture is located as the picture naming text sentence of the picture; if all named text sentences of the picture are not empty, determining weights according to all named text sentences of the picture, and determining matching results of the picture with all named text sentences according to all named text sentences of the picture; and determining, according to the maximum value of the sum of the weight and the matching result of the picture, the named text sentence corresponding to that maximum value as the named text sentence matched to the picture, so as to take it as the picture naming text sentence of the picture. And/or, the processing unit constructs a storage result for each picture and stores each picture according to the storage result, including: for each slide, creating a folder and naming the folder with the named text sentence of the slide; storing each picture separately as an independent file, naming the independent file of each picture with its picture naming text sentence, and placing the named file in the newly created folder; establishing a storage record for the picture in a database; and, in the storage record of the picture, determining and storing a picture full name field of the picture according to the picture naming text sentence of the picture.
Optionally, the processing unit determines the weight according to all named text sentences of the picture, including: calculating the weight according to the word size of each named text sentence of the picture, the distance between each named text sentence of the picture and the center point of the picture, and the overlap length between each named text sentence of the picture and the picture, wherein: d_j is the distance between each named text sentence of the picture and the center point of the picture, and l_j is the overlap length between each named text sentence of the picture and the picture. And/or, the processing unit determines the matching result of the picture with all named text sentences according to all named text sentences of the picture, including: matching the word-segmentation words in each named text sentence of the picture, in turn, against the per-page subject word set and per-page trivial word set of the page on which the picture is located and the full-text subject word set and full-text trivial word set of all slides, so as to obtain the matching result. And/or, the processing unit determines and stores the picture full name field of the picture according to the picture naming text sentence of the picture, including: if the length of the Chinese characters in the picture naming text sentence of the picture is less than or equal to a preset value, directly storing the picture naming text sentence of the picture as the picture full name field of the picture; if the length of the Chinese characters in the picture naming text sentence of the picture is greater than the preset value, extracting a set phrase from the picture naming text sentence of the picture by using an information entropy algorithm, taking the set phrase as a new picture naming text sentence of the picture, and storing the new picture naming text sentence of the picture as the picture full name field of the picture; segmenting the picture full name field of the picture into search engine words by using a fast word segmentation method, and storing the search engine words as the picture search words of the picture; and traversing all pictures in the slide file, and pairing the pictures that have the same picture search word and storing them in the database as similar pictures.
In accordance with another aspect of the present invention, there is provided a terminal comprising: the processing device for pictures and texts in the slide.
In accordance with the above method, the present invention further provides a storage medium, where the storage medium includes a stored program, where the program, when executed, controls a device where the storage medium is located to execute the above method for processing pictures and texts in a slide.
In accordance with a further aspect of the present invention, there is provided a processor for running a program, wherein the program is run to perform the above-described method for processing pictures and texts in a slide.
According to the scheme, through automatic matching of the pictures and the texts in the engineering material slide, the corresponding relation between the pictures and the text information in each page of the engineering slide can be conveniently determined, the pictures are automatically named and stored independently, extraction, naming and independent storage of the pictures in the engineering material slide file can be rapidly and automatically completed, subsequent rapid retrieval of precious engineering pictures according to the picture names can be supported, and the use convenience is improved.
Furthermore, according to the scheme of the invention, the matching and automatic naming of the pictures and the texts in the slide are realized by adopting natural language processing and automatically analyzing the position relation and the text semantic relation of the pictures and the text blocks in the slide according to the characteristics of large picture quantity and more text information of the engineering slide file, so that the corresponding relation between a plurality of pictures and a plurality of text information in each page of the engineering slide can be conveniently determined, and the quick retrieval and the repeated use of the slide pictures can be supported.
Furthermore, according to the scheme of the invention, for the positional relation between the pictures and the text content in each slide, which is random in layout yet strongly correlated, the pictures and the text blocks are accurately matched by an intelligent algorithm according to the geometric position information, so that the correspondence between the multiple pictures and the multiple pieces of text information in each page of an engineering slide can be conveniently determined, which facilitates searching and use by the user.
Furthermore, according to the scheme of the invention, by adopting an algorithm based on the text size bounding box and the semantic score aiming at the position relation between the pictures and the text content in each page of the slide, the most relevant text can be accurately matched for each picture under the conditions that the positions of the text blocks are changeable and a plurality of pictures exist, so that the corresponding relation between the plurality of pictures and a plurality of text messages in each page of the engineering slide can be rapidly determined, and the user can conveniently search and use.
Furthermore, according to the scheme of the invention, for the positional relation between the pictures and the text content in each slide, natural language analysis algorithms such as Chinese word segmentation and TextRank are adopted in the process of determining the correspondence between the multiple pictures and the multiple pieces of text information in each page of an engineering slide; proper nouns such as personal names and place names can be extracted, and the subject words of longer paragraphs can also be extracted, so that the result corresponds to the subject of the slide and the method has a wider application range.
According to the scheme, through natural language processing, the position relation and the text semantic relation of the pictures and the text blocks in the slide are automatically analyzed, the corresponding relation between the pictures and the text information in each page of the engineering slide is determined, so that matching and automatic naming of the pictures and the text in the slide are realized, the problem of high difficulty in searching the corresponding relation between the pictures and the text information in each page of the engineering slide is solved, and the effect of reducing the searching difficulty of the corresponding relation between the pictures and the text information in each page of the engineering slide is achieved.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
FIG. 1 is a flow chart of an embodiment of a method for processing pictures and text in a slide according to the present invention;
FIG. 2 is a flow chart of an embodiment of determining picture information and text information for each slide in a slide file according to the method of the present invention;
FIG. 3 is a flow chart of an embodiment of a method of the present invention for performing a natural language analysis process on text information of all pages of slides in a slide file;
FIG. 4 is a flow chart of an embodiment of determining each page of corpus information for each slide in a slide file and full text corpus information for all pages of slides in the slide file according to the method of the present invention;
FIG. 5 is a flow chart of an embodiment of performing natural language analysis processing on each page of corpus information and full-text corpus information in the method of the present invention;
FIG. 6 is a flow chart of an embodiment of a method of the present invention for matching named text statements for each picture in a slide file;
FIG. 7 is a flow chart of an embodiment of the method of the present invention for constructing a storage result for each picture and storing each picture according to the storage result;
FIG. 8 is a flowchart illustrating an embodiment of determining and storing a picture full name field of the picture according to a picture naming text statement of the picture in the method of the present invention;
FIG. 9 is a schematic diagram illustrating an embodiment of a processing device for pictures and text in a slide according to the present invention;
FIG. 10 is a flowchart of a method for automatically matching and naming pictures in a slide according to an embodiment of the terminal of the present invention;
FIG. 11 is a diagram of an example of matching of picture naming text sentences in a slide according to an embodiment of the terminal of the present invention;
FIG. 12 is a flowchart illustrating a picture naming text sentence matching process according to an embodiment of the terminal of the present invention;
FIG. 13 is a diagram showing an example of automatic picture extraction and naming results in a slide according to an embodiment of the terminal of the present invention;
FIG. 14 is a schematic diagram of the structure of the storage of the result set in the database according to an embodiment of the terminal of the present invention;
fig. 15 is a schematic structural diagram of an apparatus for automatically matching and naming pictures in a slide according to an embodiment of the terminal of the present invention.
In the embodiment of the present invention, reference numerals are as follows, in combination with the accompanying drawings:
1-a slide information extraction module; 2-a text sentence preprocessing module; 3-a natural language analysis module; 4-a slide name matching module; 5-a picture name matching module; 6-a picture storage module; 102-a determining unit; 104-a processing unit.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to specific embodiments of the present invention and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
According to an embodiment of the present invention, a method for processing pictures and texts in a slide is provided, and a flowchart of an embodiment of the method of the present invention is shown in fig. 1. The method for processing the pictures and the texts in the slide can be mainly applied to the aspect of processing the pictures and the texts in the engineering slide file (namely, the slide in the engineering material), and the method for processing the pictures and the texts in the engineering slide file can comprise the steps S110 to S140.
At step S110, picture information and text information of each slide in the slide file are determined.
Alternatively, the specific procedure of determining the picture information and text information of each slide in the slide file in step S110 may be referred to as the following exemplary description.
The following is a flowchart of an embodiment of determining the picture information and text information of each slide in the slide file according to the method of the present invention shown in fig. 2, which further describes the specific process of determining the picture information and text information of each slide in the slide file in step S110, and may include: step S210 and step S220.
Step S210, for each slide in the slide file, acquiring the width and the height of each slide, and acquiring the picture bounding box information of each picture in each slide as the picture information of each slide in the slide file. That is, the picture information of each slide in the slide file may include: the width and height of each slide, and the picture bounding box information of each picture in each slide. And,
step S220, obtaining text content, word size and/or text bounding box information of all text sentences in each slide as text information of each slide in the slide file. That is, text information of each slide in the slide file may include: text content, word size, and/or text bounding box information for all text statements for each page of the slide.
For example: read the bounding box information of each picture p_i in the slide file, recorded as P = {p_i}, p_i = {X_imin, Y_imin, X_imax, Y_imax}; and, for each slide, read the text content, word size and bounding box information of all text sentences, recorded as T = {t_i}, t_i = {c_i, s_i, x_imin, y_imin, x_imax, y_imax}. For instance, for the k-th slide h_k of the slide file, obtain its width h_x and height h_y, read the bounding box information of each picture p_i in the k-th slide h_k, recorded as P_k = {p_i}, p_i = {X_imin, Y_imin, X_imax, Y_imax}, and read the text content, word size and bounding box information of all text sentences in the k-th slide h_k, recorded as T_k = {t_i}, t_i = {c_i, s_i, x_imin, y_imin, x_imax, y_imax}. Here c_i is the text content, s_i is the word size, and x_imin, y_imin, x_imax, y_imax are the bounding box (position) information.
For example: for each slide h_k in the slide file, as shown in FIG. 11, the width is h_x = 210 and the height is h_y = 130. Read the bounding box information of each picture p_i in the slide, recorded as P = {p_i}, p_i = {X_imin, Y_imin, X_imax, Y_imax}; and read the text content, word size and bounding box information of all text sentences, recorded as T = {t_i}, t_i = {c_i, s_i, x_imin, y_imin, x_imax, y_imax}. As shown in FIG. 11, there are four pictures P = {p_1, p_2, p_3, p_4}, with p_1 = {5, 30, 55, 90} and p_2 = {60, 40, 105, 80}, and there are 7 text sentences T = {t_1, t_2, t_3, t_4, t_5, t_6, t_7}, with t_3 = {"certain dimethyl hospital application", 24, 10, 102, 50, 108}, t_4 = {"maintenance personnel workload and level evaluation", 14, 10, 23, 50, 27} and t_5 = {"building weak link assessment", 14, 107, 47, 113, 77}.
Thus, the picture information and the text information of each slide in the slide file are firstly determined, so that the picture and the text in the engineering slide file can be conveniently processed based on the picture information and the text information of each slide.
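A minimal sketch of step S110 for .pptx files using the third-party python-pptx library is given below; the patent does not name a parsing library, so this is only an assumed implementation path. Note that python-pptx reports positions in EMUs measured from the top-left corner, whereas the FIG. 11 example above uses bottom-up coordinates.

```python
from pptx import Presentation
from pptx.enum.shapes import MSO_SHAPE_TYPE

def read_slide_info(path: str):
    prs = Presentation(path)
    slides = []
    for slide in prs.slides:
        info = {"width": prs.slide_width, "height": prs.slide_height,
                "pictures": [], "texts": []}
        for shape in slide.shapes:
            if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
                # picture bounding box (X_min, Y_min, X_max, Y_max)
                info["pictures"].append((shape.left, shape.top,
                                         shape.left + shape.width,
                                         shape.top + shape.height))
            elif shape.has_text_frame:
                for para in shape.text_frame.paragraphs:
                    text = "".join(run.text for run in para.runs)
                    sizes = [run.font.size.pt for run in para.runs
                             if run.font.size is not None]
                    if text.strip():
                        # text content, word size, and text bounding box of the shape
                        info["texts"].append((text,
                                              max(sizes) if sizes else None,
                                              (shape.left, shape.top,
                                               shape.left + shape.width,
                                               shape.top + shape.height)))
        slides.append(info)
    return slides
```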
At step S120, the text information of all the pages of the slide in the slide file is subjected to natural language analysis processing, so as to obtain named text sentences of all the pages of the slide in the slide file, so that the named text sentences of the slide file can be determined according to the text information in the slide file.
Alternatively, for a specific procedure of performing the natural language analysis processing on the text information of all the pages of the slide in the slide file in step S120, reference may be made to the following exemplary description.
The following is a flowchart of an embodiment of performing a natural language analysis on text information of all pages of slides in a slide file in the method of the present invention in connection with fig. 3, which further illustrates a specific process of performing a natural language analysis on text information of all pages of slides in the slide file in step S120, and may include: step S310 to step S330.
Step S310, based on the text information of all the pages of slides in the slide file, each page of corpus information of each page of slides in the slide file and the full text corpus information of all the pages of slides in the slide file are determined.
More optionally, the specific procedure of determining the corpus information of each page of the slides in the slide file and the full text corpus information of all pages of the slides in the slide file based on the text information of all pages of the slides in the slide file in step S310 may be referred to as the following exemplary description.
In conjunction with the flowchart of an embodiment of determining the corpus information of each slide in the slide file and the full text corpus information of all slides in the slide file in the method of the present invention shown in fig. 4, the specific process of determining the corpus information of each slide in the slide file and the full text corpus information of all slides in the slide file in step S310 may further include: step S410 and step S420.
Step S410, connecting the text information of each slide in the slide file in a first set order to form the per-page corpus information (for example, page corpus W_k). And,
Step S420, for the text information of all slides in the slide file, eliminating text information that appears repeatedly at the same position in different slides, to obtain corrected text information of all slides; and connecting the corrected text information of all slides in a second set order to form the full-text corpus information (for example, full-text corpus W_0).
For example: the text of the k-th slide is concatenated, from top to bottom and from left to right, into a passage of text to form the page corpus W_k. Then, in the order of the slide pages, all page corpora are concatenated into the full-text corpus W_0; but before concatenation, text sentences that appear repeatedly at the same position in different slides should be deleted, so as to correct the text sentence set T_k of each slide. As shown in FIG. 11, text t_2 is the title of the slide deck, appears repeatedly in all slides, and can be removed; text t_1 is a chapter name that appears repeatedly at the same position in the preceding and following slides and can be deleted.
Therefore, by determining the corpus information of each page of the slides in the slide file and the full text corpus information of all pages of the slides in the slide file based on the text information of all pages of the slides in the slide file, the processing can be performed on each page of slides and all slides respectively, and the comprehensiveness and the accuracy of processing pictures and texts in the engineering slide file can be guaranteed.
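A sketch of steps S410 and S420 follows, under the assumption that each slide's text sentences are already available as (content, bounding box) pairs; "repeated at the same position in different slides" is approximated here by an exact match of the rounded bounding box and content, which is an interpretation rather than the patent's exact rule.

```python
from collections import Counter

def build_corpora(slides):
    """slides: one list per slide of (content, (x_min, y_min, x_max, y_max)) text sentences."""
    page_corpora = []
    for texts in slides:
        # page corpus W_k: concatenate text top-to-bottom, left-to-right ("first set order"),
        # assuming a bottom-up Y axis as in the FIG. 11 example
        ordered = sorted(texts, key=lambda t: (-t[1][1], t[1][0]))
        page_corpora.append("".join(c for c, _ in ordered))

    # count how often each (rounded position, content) pair occurs across slides
    seen = Counter((tuple(round(v) for v in bbox), c)
                   for texts in slides for c, bbox in texts)

    full_parts = []
    for texts in slides:
        # drop text repeated at the same position on different slides (deck titles, chapter names)
        kept = [(c, bbox) for c, bbox in texts
                if seen[(tuple(round(v) for v in bbox), c)] == 1]
        kept.sort(key=lambda t: (-t[1][1], t[1][0]))
        full_parts.append("".join(c for c, _ in kept))
    # full-text corpus W_0: corrected page texts joined in slide order ("second set order")
    return page_corpora, "".join(full_parts)
```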
Step S320, performing natural language analysis processing on the corpus information and the full text corpus information of each page by adopting a natural language analysis method to obtain a subject word set and a trivial word set of each page of slides, and a full text subject word set and a full text trivial word set of all pages of slides.
More optionally, performing natural language analysis on each page of corpus information and the full-text corpus information in step S320 may include: performing natural language analysis on each page of corpus information and the full-text corpus information respectively, to obtain the per-page subject word set and per-page trivial word set of each slide, and the full-text subject word set and full-text trivial word set of all slides. For example: natural language analysis is performed on the page corpora (i.e., page corpus W_k) and the slide file corpus (i.e., full-text corpus W_0) to obtain the per-page subject word set E_k = {e_k,i}, the per-page trivial word set C_k = {c_k,i}, the full-text subject word set E_0 = {e_0,i}, and the full-text trivial word set C_0 = {c_0,i}. More specifically, the process may be as follows:
performing natural language analysis processing on each page of corpus information and full-text corpus information to obtain a subject word set and a trivial word set of each page of slides; and carrying out natural language analysis processing on each page of corpus information and full text corpus information to obtain full text subject word sets and full text trivial word sets of all pages of slides. The process of performing natural language analysis processing on any corpus information in each page of corpus information and full text corpus information may further specifically include the following processes.
The following is a schematic flow chart of an embodiment of performing natural language analysis processing on each page of corpus information and full text corpus information in the method of the present invention shown in fig. 5, which further illustrates a specific process of performing natural language analysis processing on each page of corpus information and full text corpus information in step S320, and may include: step S510 to step S530.
Step S510, performing Chinese word segmentation on each corpus information in each page of corpus information and full-text corpus information by using an N-shortest path word segmentation algorithm to obtain word segmentation results. In the word segmentation result, sentences in each corpus information are segmented into words and phrases with set specifications and set proper nouns, and word segmentation is obtained.
For example: a Chinese word segmentation operation is performed on the material using the N-shortest-path word segmentation algorithm, dividing each sentence into short words, phrases, and proper nouns. For example, "maintenance personnel workload and level evaluation" will be divided into {maintenance, personnel, workload, and, level, evaluation}.
In step S520, a TextRank algorithm is used to select a set number of words in the word segmentation result of each corpus according to the set score order, and the selected words are used as a subject word set (e.g., subject word set E) of the corpus.
For example: using the TextRank algorithm, the 10 words with the highest scores are selected as the subject word set E of the corpus. These words are characterized by a high frequency of occurrence and a strong correlation with the other subject words. For example, the subject word set E = {maintenance, worksheet, hospital, ...} for the example PPT.
In step S530, the words in the corpus that meet the trivial-word criterion (i.e., words that do not belong to the subject word set but occur more frequently than any subject word) are added to the trivial word set of the corpus (e.g., trivial word set C).
For example: words in the corpus that do not belong to the subject word set but occur more frequently than any subject word are placed into the trivial word set C. These trivial words are characterized by being used frequently in many different corpora; they therefore have no distinguishing power, cannot serve as keywords for a particular corpus, and instead have a negative effect on semantic summarization. For example, the trivial word set C = {certain, always, person, ...} for the page of FIG. 11. In a subsequent step, an information entropy algorithm is used to extract the important phrase in the text as the "final name" of the picture, and a fast word segmentation method is used to segment the picture's final name in the manner of a search engine, with the resulting words stored as the picture search words.
Therefore, by adopting natural language analysis algorithms such as Chinese word segmentation and TextRank, the problem that some techniques for marking pictures are not applicable to long texts and limited to short proper nouns is solved. Firstly, performing natural language analysis on the subject words of each page of the slide and the full text corpus, then directly performing natural language analysis on the associated text of each picture, and finally simplifying the associated text into short and representative file names by combining the semantics and the slide subject; proper nouns such as name and place name can be extracted, and the subject words of longer paragraphs can be extracted, and the method can correspond to the subject of the slide.
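A sketch of steps S510 to S530 using the jieba library is given below. jieba's default segmenter and its TextRank implementation stand in for the N-shortest-path word segmentation algorithm named above, and the trivial-word frequency test is one possible reading of "higher frequency than any subject word"; both are assumptions, not the patent's exact algorithms.

```python
import jieba
import jieba.analyse
from collections import Counter

def analyse_corpus(corpus: str, top_k: int = 10):
    # Chinese word segmentation (jieba's default segmenter as a stand-in for N-shortest-path)
    words = [w for w in jieba.cut(corpus) if w.strip()]
    # subject word set E: the top_k highest-scoring TextRank keywords
    subject_words = set(jieba.analyse.textrank(corpus, topK=top_k))
    freq = Counter(words)
    # threshold: frequency of the most frequent subject word (one reading of
    # "higher frequency than any subject word")
    threshold = max((freq[w] for w in subject_words), default=0)
    # trivial word set C: words outside E that exceed that frequency
    trivial_words = {w for w, n in freq.items()
                     if w not in subject_words and n > threshold}
    return subject_words, trivial_words
```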
Step S330, determining a named file sentence of all text information in the slide file based on the set of each page of subject words and the set of each page of trivial words of each page of slides, and the set of full-text subject words and the set of full-text trivial words of all pages of slides.
Therefore, on the basis of selecting a proper algorithm to find the corresponding relation between a plurality of pictures and a plurality of text messages in each page of the engineering slide, natural language processing is adopted to automatically analyze the position relation and text semantic relation of the pictures and text blocks in the slide, so that the matching and automatic naming of the pictures and the text in the slide are realized, sentences can be simplified, meanwhile, the pictures can correspond to the subjects of the slide, the naming accuracy of the pictures in the slide file is improved, and the application range of the processing mode is widened.
More optionally, determining the named file sentences of all text information in the slide file in step S330, based on the per-page subject word set and per-page trivial word set of each slide and the full-text subject word set and full-text trivial word set of all slides, may include: determining, according to the word size and the text bounding box information in the text information of each slide in the slide file, the text content with the largest word size among the text sentences whose minimum Y-direction value in the text bounding box information is greater than a set coefficient times the slide height, as the named text sentence of the slide; and, by analogy, obtaining the named text sentences of all slides in the slide file as the named file sentences of all text information in the slide file.
For example: the named text sentence h_t of each slide is calculated according to the word sizes and bounding box information of all text sentences in the slide. A preferred calculation or matching method is to select, among all text sentences whose y_imin in the bounding box information is greater than k times the slide height h_y, the text sentence t_i with the largest word size s_i as the named text sentence h_t of the slide; k is preferably 1/3. As shown in FIG. 11, the only text sentences located in the upper third of the slide are text sentence t_3 and text sentence t_6, of which text sentence t_3 has the largest word size, 24. The named text sentence of the slide is therefore text sentence t_3.
Therefore, the method and the device can be used for distinguishing the subject words and the trivial words for each slide by determining the named file sentences of all text information in the slide file based on the subject word set and the trivial word set of each page of each slide and the full-text subject word set and the full-text trivial word set of all pages of slides, and are beneficial to ensuring the accuracy of the named file sentences of all text information in the obtained slide file.
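A sketch of the slide-level naming rule above, assuming bottom-up Y coordinates as in the FIG. 11 example; the default k of 1/3 is the value mentioned in the text.

```python
def slide_name_sentence(texts, slide_height: float, k: float = 1 / 3):
    """texts: list of (content, word_size, (x_min, y_min, x_max, y_max)).
    Keeps the text sentences whose y_min lies above k * slide_height (bottom-up
    Y axis) and returns the one with the largest word size, or None if none qualify."""
    upper = [t for t in texts if t[2][1] > k * slide_height]
    if not upper:
        return None
    return max(upper, key=lambda t: t[1])[0]
```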
At step S130, according to the picture information in the slide file and the named text sentence of the slide file, the named text sentence is matched for each picture in the slide file, so as to obtain the picture named text sentence of each picture.
Optionally, in step S130, a named text sentence is matched to each picture in the slide file according to the picture information in the slide file and the named text sentences of the slide file. For example, for each picture p_i, the most appropriate named text sentence t_i = {c_i, s_i, x_imin, y_imin, x_imax, y_imax} is matched and stored into the bounding box information of each picture p_i, giving p_i = {c_i, X_imin, Y_imin, X_imax, Y_imax}. As shown in FIG. 12, this may specifically include:
the following is a schematic flow chart of an embodiment of matching a named text sentence for each picture in a slide file in the method of the present invention in connection with fig. 6, which further illustrates a specific process of matching a named text sentence for each picture in a slide file in step S130, and may include: step S610 to step S640.
Step S610, determining, according to the picture bounding box information in the picture information of each picture in each slide, all named text sentences within a set range of directions around each picture, to obtain all named text sentences of the picture. As shown in FIG. 12, according to the position information p_i = {X_imin, Y_imin, X_imax, Y_imax} of the picture, the text sentences directly above, directly below, directly to the left of, or directly to the right of the picture are calculated as TP = {tp_j} = {(t_j, w_j)}. As shown in FIG. 11, for picture p_1 = {5, 30, 55, 90}, the text sentences directly above, directly below, directly to the left of, or directly to the right of the picture may include t_3, t_4, t_5, t_6 and t_7.
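A sketch of step S610 follows. The patent does not spell out the exact "directly above, below, left or right" test, so this version requires the text box to overlap the picture's horizontal or vertical extent while lying entirely on one side of it, which is an assumption; bottom-up Y coordinates are assumed as in FIG. 11.

```python
def candidate_sentences(pic_bbox, texts):
    """texts: list of (content, word_size, (x_min, y_min, x_max, y_max)).
    Returns the text sentences lying directly above, below, left of, or right of the picture."""
    PX0, PY0, PX1, PY1 = pic_bbox
    out = []
    for content, size, (x0, y0, x1, y1) in texts:
        horiz_overlap = min(PX1, x1) - max(PX0, x0)
        vert_overlap = min(PY1, y1) - max(PY0, y0)
        # directly above or below: shares horizontal extent, entirely above/below the picture
        above_or_below = horiz_overlap > 0 and (y0 >= PY1 or y1 <= PY0)
        # directly left or right: shares vertical extent, entirely beside the picture
        left_or_right = vert_overlap > 0 and (x0 >= PX1 or x1 <= PX0)
        if above_or_below or left_or_right:
            out.append((content, size, (x0, y0, x1, y1)))
    return out
```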
Step S620, if all the named text sentences of the picture are empty, determining the named text sentences of the slides of the page where the picture is located as the picture named text sentences of the picture.
For example: as shown in FIG. 12, if the set of text sentences tp_j located directly above, directly below, directly to the left of, or directly to the right of the picture, calculated from the position information p_i = {X_imin, Y_imin, X_imax, Y_imax} of picture p_i, is empty, then the named text sentence h_t of the slide is used as the named text sentence of picture p_i.
In step S630, if all the named text sentences of the picture are not null, the weight is determined according to all the named text sentences of the picture, and the matching result of the picture and all the named text sentences is determined according to all the named text sentences of the picture.
Still further optionally, determining the weight according to all named text sentences of the picture in step S630 may include: calculating the weight according to the word size of each named text sentence of the picture, the distance between each named text sentence of the picture and the center point of the picture, and the overlap length between each named text sentence of the picture and the picture, wherein: d_j is the distance between each named text sentence of the picture and the center point of the picture, and l_j is the overlap length between each named text sentence of the picture and the picture.
For example: for each text sentence t_j located directly above, directly below, directly to the left of, or directly to the right of the picture, calculated from the position information p_i = {X_imin, Y_imin, X_imax, Y_imax} of the picture, the weight w_j is calculated from the word size of t_j, the distance between the center point of t_j and the center point of p_i, and the degree of overlap between them; here d_j is the distance between the center point of t_j and the center point of p_i, and l_j is the overlap length between picture p_i and text sentence t_j.
As shown in FIG. 11, for picture p_1, the weight of text sentence t_3 is w_3 = 0.622, the weight of text sentence t_4 is w_4 = 1.057, the weight of text sentence t_5 is w_5 = -0.527, the weight of text sentence t_6 is w_6 = -1.02, and the weight of text sentence t_7 is w_7 = -0.92; for picture p_2, the weight of text sentence t_3 is w_3 = -0.463, the weight of text sentence t_4 is w_4 = -0.365, the weight of text sentence t_5 is w_5 = 1.01, the weight of text sentence t_6 is w_6 = -0.45, and the weight of text sentence t_7 is w_7 = -0.63.
Thus, by determining the weight according to the word size of each named text sentence in all named text sentences of the picture, the distance between each named text sentence of the picture and the center point of the picture, and the overlapping length between each named text sentence of the picture and the picture, the reliability and the accuracy of each weight determination can be ensured.
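The exact preferred weighting formula belongs to the embodiment's own equation and is not quoted above; purely as an assumed illustration of how a weight w_j could combine the three quantities just named (word size, center-point distance d_j, overlap length l_j), one might write the following sketch, in which the coefficients are placeholders rather than values from this embodiment.

```python
import math

def center(box):
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)

def center_distance(picture_box, text_box):
    """d_j: distance between the center points of the text box and the picture box."""
    (px, py), (tx, ty) = center(picture_box), center(text_box)
    return math.hypot(px - tx, py - ty)

def overlap_length(picture_box, text_box):
    """l_j: length of the overlap of the horizontal extents (for boxes above/below)
    or of the vertical extents (for boxes left/right), whichever is larger."""
    px_min, py_min, px_max, py_max = picture_box
    tx_min, ty_min, tx_max, ty_max = text_box
    dx = max(0.0, min(px_max, tx_max) - max(px_min, tx_min))
    dy = max(0.0, min(py_max, ty_max) - max(py_min, ty_min))
    return max(dx, dy)

def sentence_weight(word_size, d_j, l_j, a=0.05, b=0.01, c=0.02):
    """Assumed combination only: larger word size and overlap raise the weight,
    a larger center-point distance d_j lowers it. a, b, c are illustrative."""
    return a * word_size + c * l_j - b * d_j
```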
Still further optionally, determining the matching result of the picture with all named text sentences according to all named text sentences of the picture in step S630 may include: matching the word segmentation words in each named text sentence of the picture in turn with the per-page subject word set and per-page trivial word set of the page where the picture is located, and with the full-text subject word set and full-text trivial word set of all pages of slides, to obtain the matching result.
For example: the word segmentation words in text sentence t_j are matched in turn with the words in the per-page subject word set E_k, the full-text subject word set E_0, the per-page trivial word set C_k and the full-text trivial word set C_0, and the added score b_j of text sentence t_j is then calculated from the number of word matches according to the following table. Matches with subject words add to the score and matches with trivial words deduct from it, but neither the additions nor the deductions may exceed the stated limits. Preferably, the added score b_j of text sentence t_j may be determined with reference to the following table.
| Matching condition | Score per match | Limit on the score |
| Belongs to the per-page subject word set E_k | 0.2 | 0.0~1.0 |
| Belongs to the full-text subject word set E_0 | 0.1 | 0.0~0.5 |
| Belongs to the per-page trivial word set C_k | -0.2 | -0.6~0.0 |
| Belongs to the full-text trivial word set C_0 | -0.1 | -0.3~0.0 |
As shown in fig. 11, text sentence t_3 contains one full-text subject word, "hospital", so the added score of text sentence t_3 is b_3 = 0.1. Text sentence t_4 contains one per-page subject word, "maintenance", and one full-text trivial word, "personnel", so the added score of text sentence t_4 is b_4 = 0.2 - 0.1 = 0.1; the remaining sentences are scored similarly.
Therefore, by matching the word segmentation words in each named text sentence of the picture in turn with the per-page subject word set and per-page trivial word set of the page where the picture is located, and with the full-text subject word set and full-text trivial word set of all pages of slides, so as to determine the matching result of the picture with all named text sentences, both each page of slides and all pages of slides can be processed, which helps guarantee the comprehensiveness and reliability of the matching result.
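A minimal sketch of the added score b_j computed from the table above; the rule keys, the per-rule clamping behaviour and the function names are assumptions made for illustration.

```python
# Sketch of the per-sentence added score b_j; each rule's subtotal is clamped to its stated limits.

SCORING_RULES = [
    # (word set key, score per match, lower limit, upper limit)
    ("page_subject",   0.2,  0.0, 1.0),
    ("full_subject",   0.1,  0.0, 0.5),
    ("page_trivial",  -0.2, -0.6, 0.0),
    ("full_trivial",  -0.1, -0.3, 0.0),
]

def added_score(sentence_words, word_sets):
    """sentence_words: word-segmentation result of one named text sentence.
    word_sets: dict mapping the four keys above to sets of words."""
    total = 0.0
    for key, per_match, low, high in SCORING_RULES:
        matches = sum(1 for w in sentence_words if w in word_sets[key])
        total += min(max(matches * per_match, low), high)  # clamp to the stated limits
    return total
```

With a per-page subject word set containing "maintenance" and a full-text trivial word set containing "personnel", the sketch reproduces b_4 = 0.2 - 0.1 = 0.1 from the example above.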
Step S640, according to the maximum of the sums of the weight and the matching result over all named text sentences of the picture, determining the named text sentence corresponding to that maximum as the named text sentence matched with the picture, and taking it as the picture named text sentence of the picture.
For example: as shown in fig. 12, the text sentence t_j for which the total score, namely the weight w_j plus the added score b_j of text sentence t_j, is the largest is taken as the named text sentence of picture p_i. As shown in fig. 11, the named text sentence corresponding to picture p_1 is text sentence t_4, the named text sentence corresponding to picture p_2 is text sentence t_5, the named text sentence corresponding to picture p_3 is text sentence t_6, and the named text sentence corresponding to picture p_4 is text sentence t_7.
Thus, the spatial correlation between the pictures in the slide and the text on the page is calculated first; then a subject word and trivial word matching method is used to score and quantify the key semantic information, so that the pictures correspond to the slide text blocks more accurately and the readability and understandability of the picture naming are greatly enhanced.
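Combining the two quantities, a hedged sketch of the final selection in step S640, falling back to the slide's own named text sentence as in step S620 when no candidate exists; the data layout and names are illustrative.

```python
def pick_named_sentence(candidates, slide_named_sentence):
    """candidates: list of (sentence_text, weight_w, added_score_b) for one picture.
    Returns the sentence with the largest w_j + b_j, or the slide's named text
    sentence when the candidate set is empty (step S620)."""
    if not candidates:
        return slide_named_sentence
    return max(candidates, key=lambda c: c[1] + c[2])[0]
```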
At step S140, according to the picture information in the slide file and the picture named text sentence obtained by matching each picture, a correspondence between the picture information and the text information in the slide file is established.
The picture information of each slide in the slide file may include: width and height of each slide, and picture bounding box information for each picture in each slide. The text information of each slide in the slide file may include: text content, word size, and/or text bounding box information for all text statements for each page of the slide.
Therefore, in accordance with the characteristics of engineering slide files, namely strong picture-text correlation but relatively random layout, a suitable algorithm is selected according to the positional relation between the pictures and the text content in each slide, the pictures and the text blocks are accurately matched by an intelligent algorithm based on the geometric position information, the pictures in the slide file can be named intelligently, and the readability, understandability and retrievability of the picture names in the slides can be enhanced. Natural language processing is adopted to automatically analyze the positional relation and the text semantic relation of the pictures and text blocks in the slide, so that matching and automatic naming of the pictures and the text in the slide are realized; the text related to each picture is reduced to short and representative subject words by a natural language analysis method, which facilitates indexing and quick retrieval.
Optionally, the specific process of establishing the correspondence between the picture information and the text information in the slide file in step S140, according to the picture information in the slide file and the picture named text sentence obtained by matching for each picture, may include: according to the picture named text sentence of each picture in each page of slide in the slide file, constructing a storage result for each picture and storing each picture according to the storage result, so as to establish the correspondence between the picture information and the text information in the slide file. That is, each picture in the slide file is stored as an individual file, and the individual file of each picture is named according to the picture named text sentence obtained by matching for that picture, so as to establish the correspondence between the picture information and the text information in the slide file.
For example: based on picture p i Named text sentence c i The final result is constructed and may include the picture itself, the final name, the full name of the picture, the search term of the picture, etc.
Therefore, the method and the device can name and store according to the picture information in the slide file and the picture naming text statement obtained by matching each picture, so that the corresponding relation between the picture information and the text information in the slide file can be quickly and conveniently established for the engineering slide file, and the user can conveniently check and adapt.
More optionally, for a specific procedure of naming a text sentence for each picture in each slide in the slide file, constructing a storage result for each picture, and storing each picture according to the storage result, reference may be made to the following exemplary description.
The following, in connection with the schematic flow chart of an embodiment of constructing a storage result for each picture and storing each picture according to the storage result in the method of the present invention shown in fig. 7, further illustrates the specific process of constructing the storage result for each picture and storing each picture according to the storage result, which may include: step S710 and step S720.
Step S710, for each slide, creating a folder and naming the folder by using a named text sentence of the slide; and independently storing each picture as an independent file, naming the independent file of each picture by using a picture naming text sentence, and placing the independent file in the newly created folder after naming the independent file of each picture.
For example: a new folder may be created and named with the named text sentence h_t of the slide; each picture p_i in the slide is stored independently as a file, named with its final name, such as c_i, and placed into the folder.
In step S720, a storage record is created for the picture in the database. And in the storage record of the picture, determining and storing a picture full name field of the picture according to a picture naming text sentence of the picture.
For example: a record is established for picture p_i in the database, and the picture full name field is stored directly based on the named text sentence c_i of picture p_i. For instance, the full name field of p_4 is "the lamp tube of the outpatient infusion room always flickers on and off".
Therefore, by naming text sentences for each picture in each slide in the slide file, a storage result is constructed for each picture and each picture is stored according to the storage result, so that the pictures and the texts in the slide can be conveniently and correspondingly stored based on engineering, the storage is convenient, and the searching is also convenient.
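For illustration only, a minimal sketch of steps S710 and S720 using the Python standard library; the folder layout, file extension, database engine (sqlite3) and field names are assumptions, not requirements of this embodiment.

```python
import os
import sqlite3

def store_slide_pictures(slide_name, pictures, db_path="pictures.db"):
    """pictures: list of (picture_bytes, picture_named_sentence).
    Creates a folder named after the slide's named text sentence, stores each
    picture as an independent file named by its picture named text sentence,
    and records the full name field in a database (illustrative schema)."""
    os.makedirs(slide_name, exist_ok=True)
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS picture (full_name TEXT, file_path TEXT)")
    for picture_bytes, sentence in pictures:
        path = os.path.join(slide_name, sentence + ".png")  # extension assumed
        with open(path, "wb") as f:
            f.write(picture_bytes)
        conn.execute("INSERT INTO picture (full_name, file_path) VALUES (?, ?)",
                     (sentence, path))
    conn.commit()
    conn.close()
```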
Still further alternatively, in the storing record of the picture in step S720, a specific process of determining and storing the picture full name field of the picture according to the picture naming text statement of the picture may be referred to as the following exemplary description.
The following process of determining and storing the picture full name field of the picture according to the picture naming text sentence of the picture in step S720 is further described with reference to an embodiment of determining and storing the picture full name field of the picture according to the picture naming text sentence of the picture in the method of the present invention shown in fig. 8, and may include: step S810 to step S840.
Step S810, if the length of the Chinese characters in the picture naming text sentence of the picture is less than or equal to a preset value, the picture naming text sentence of the picture is directly stored as the picture full name field of the picture.
Step S820, if the length of the Chinese characters in the picture naming text sentence of the picture is larger than the preset value, extracting a set phrase in the picture naming text sentence of the picture by using an information entropy algorithm, and taking the set phrase as a new picture naming text sentence of the picture; and storing the new picture naming text statement of the picture as a picture full name field of the picture.
For example: if the named text sentence c_i of picture p_i is too long, preferably more than 10 Chinese characters, an information entropy algorithm is used to extract the important phrase in the text, and the phrase is stored in the database as the "final name" of the picture. If the named text sentence c_i of picture p_i is short, the final name of the picture is the same as its full name. For instance, the full name of picture p_4, "the lamp tube of the outpatient infusion room always flickers on and off", is 14 Chinese characters long, so its final name is formed by extracting the phrase "the infusion room lamp tube flickering on and off".
Step S830, using the fast word segmentation method, determining the picture full name field of the picture as the search engine word of the picture, and storing the search engine word as the picture search word of the picture.
For example: and determining the picture full name field of the picture as a search engine word of the picture by using a quick word segmentation method, and storing the search engine word as a picture search word of the picture.
Step S840, traversing all pictures in the slide file, pairing several pictures with the same picture search word, and storing the paired pictures as similar pictures in a database.
For example: as post-processing, all pictures are traversed, and pictures with the same search words are paired in twos and stored in the database as similar pictures to support a picture recommendation algorithm; the final result can be seen in the example shown in fig. 14. As shown in fig. 13, for the slide shown in fig. 11, a folder "a certain Grade-2A hospital application case" is newly created, and the 4 picture files are extracted and named; the "picture search terms" are obtained using the fast word segmentation method.
Therefore, the picture full name field of the picture is determined and stored according to the picture naming text statement of the picture, the reliability of storage is guaranteed, the occupied space is small, and the convenience of searching is also guaranteed.
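A hedged sketch of steps S810 to S840; the 10-character threshold follows the preferred value above, while the segmentation library (jieba) and the placeholder for the information entropy phrase extraction are assumptions made for illustration.

```python
import itertools
import jieba  # assumed segmentation library; the embodiment only specifies "fast word segmentation"

MAX_NAME_LEN = 10  # preset length limit in Chinese characters, per the preferred value above

def full_name_field(named_sentence, extract_key_phrase):
    """extract_key_phrase stands in for the information-entropy phrase extraction,
    which is not implemented here."""
    if len(named_sentence) <= MAX_NAME_LEN:
        return named_sentence
    return extract_key_phrase(named_sentence)

def search_terms(full_name):
    # search-engine-style segmentation of the picture full name field
    return set(jieba.cut_for_search(full_name))

def similar_pairs(pictures_terms):
    """pictures_terms: dict picture_id -> set of search terms.
    Returns the picture pairs sharing at least one search term (step S840)."""
    pairs = []
    for a, b in itertools.combinations(pictures_terms, 2):
        if pictures_terms[a] & pictures_terms[b]:
            pairs.append((a, b))
    return pairs
```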
Through a large number of experiments and verification, by adopting the technical scheme of the embodiment, through automatically matching the pictures and the texts in the engineering material slide, the corresponding relation between a plurality of pictures and a plurality of text messages in each page of the engineering slide can be conveniently determined, the pictures can be automatically named and independently stored, the extraction, the naming and the independent storage of the pictures in the engineering material slide file can be rapidly and automatically completed, the subsequent rapid retrieval of precious engineering pictures according to the picture names can be supported, and the use convenience is improved.
According to an embodiment of the present invention, there is also provided a processing apparatus of pictures and texts in a slide, corresponding to the processing method of pictures and texts in a slide. Referring to fig. 9, a schematic diagram of an embodiment of the apparatus of the present invention is shown. The processing device for pictures and texts in a slide can be mainly applied to the processing aspect of pictures and texts in engineering slide files (namely, slides in engineering materials), and the processing device for the pictures and the texts in the engineering slide files can comprise: a determination unit 102 and a processing unit 104.
In an alternative example, the determining unit 102 may be configured to determine picture information and text information of each slide in the slide file. The specific function and processing of the determination unit 102 are described in step S110.
Alternatively, the determining unit 102 determines the picture information and the text information of each slide in the slide file, may include:
the determining unit 102 may be further configured to obtain, for each slide in the slide file, the width and height of each slide, and obtain the picture bounding box information of each picture in each slide, as the picture information of each slide in the slide file. That is, the picture information of each slide in the slide file may include: the width and height of each slide, and the picture bounding box information of each picture in each slide. The specific function and processing of the determining unit 102 are also described in step S210. And,
the determining unit 102 may be further configured to obtain the text content, word size and/or text bounding box information of all text sentences in each slide as the text information of each slide in the slide file. That is, the text information of each slide in the slide file may include: the text content, word size and/or text bounding box information of all text sentences of each page of slide. The specific function and processing of the determining unit 102 are also described in step S220.
For example: each picture p_i in the slide file is read, and its bounding box information is recorded as P = {p_i}, p_i = {X_imin, Y_imin, X_imax, Y_imax}; for each page of slide, the text content, word size and bounding box information of all text sentences are read and recorded as T = {t_i}, t_i = {c_i, x_imin, y_imin, x_imax, y_imax}. Specifically, for the k-th page slide h_k in the slide file, its width h_x and height h_y are acquired; the bounding box information of each picture p_i in the k-th page slide h_k is read and recorded as P_k = {p_i}, p_i = {X_imin, Y_imin, X_imax, Y_imax}; and the text content, word size and bounding box information of all text sentences in the k-th page slide h_k are read and recorded as T_k = {t_i}, t_i = {c_i, s_i, x_imin, y_imin, x_imax, y_imax}, where c_i is the text content, s_i is the word size, and x_imin, y_imin, x_imax, y_imax are the bounding box information, i.e. the position information.
For example: for each slide h_k in the slide file, as shown in fig. 11, the width is h_x = 210 and the height is h_y = 130. Each picture p_i therein is read, and its bounding box information is recorded as P = {p_i}, p_i = {X_imin, Y_imin, X_imax, Y_imax}. For each page of slide, the text content, word size and bounding box information of all text sentences are read and recorded as T = {t_i}, t_i = {c_i, s_i, x_imin, y_imin, x_imax, y_imax}. As shown in fig. 11, there are four pictures P = {p_1, p_2, p_3, p_4}, with p_1 = {5, 30, 55, 90} and p_2 = {60, 40, 105, 80}; and there are 7 text sentences T = {t_1, t_2, t_3, t_4, t_5, t_6, t_7}, with t_3 = {"a certain Grade-2A hospital application case", 24, 10, 102, 50, 108}, t_4 = {"maintenance personnel workload and level evaluation", 14, 10, 23, 50, 27}, and t_5 = {"building weak link assessment", 14, 107, 47, 113, 77}.
Thus, the picture information and the text information of each slide in the slide file are firstly determined, so that the picture and the text in the engineering slide file can be conveniently processed based on the picture information and the text information of each slide.
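For illustration only, one possible way to collect this picture and text information from a .pptx file, assuming the third-party python-pptx library (a choice not specified by this embodiment); coordinates are returned in EMU rather than in the units of fig. 11.

```python
from pptx import Presentation
from pptx.enum.shapes import MSO_SHAPE_TYPE

def read_slide_file(path):
    """Collect, per page: slide width/height, picture bounding boxes and blobs,
    and text sentences with content, word size and bounding box (all in EMU)."""
    prs = Presentation(path)
    pages = []
    for slide in prs.slides:
        pictures, texts = [], []
        for shape in slide.shapes:
            if shape.left is None or shape.top is None:
                continue  # skip shapes without an explicit position
            box = (shape.left, shape.top, shape.left + shape.width, shape.top + shape.height)
            if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
                pictures.append({"box": box, "blob": shape.image.blob})
            elif shape.has_text_frame:
                for para in shape.text_frame.paragraphs:
                    sizes = [r.font.size.pt for r in para.runs if r.font.size is not None]
                    texts.append({"content": para.text,
                                  "word_size": max(sizes) if sizes else None,
                                  "box": box})
        pages.append({"width": prs.slide_width, "height": prs.slide_height,
                      "pictures": pictures, "texts": texts})
    return pages
```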
In an alternative example, the processing unit 104 may be configured to perform natural language analysis processing on text information of all pages of slides in the slide file, so as to obtain named text sentences of all pages of slides in the slide file, so that the named text sentences of the slide file may be determined according to the text information in the slide file. The specific function and process of the processing unit 104 refer to step S120.
Optionally, the processing unit 104 performs a natural language analysis processing on the text information of all pages of slides in the slide file, which may include:
the processing unit 104 may be further configured to determine, based on text information of all the pages of the slides in the slide file, each page of corpus information of each page of the slides in the slide file, and full text corpus information of all the pages of the slides in the slide file. The specific function and processing of the processing unit 104 is also referred to in step S310.
More optionally, the determining, by the processing unit 104, each page of corpus information of each page of slides in the slide file and full text corpus information of all pages of slides in the slide file based on text information of all pages of slides in the slide file may include:
the processing unit 104 may be further configured to connect the text information of each slide in the slide file according to a first set order to form the per-page corpus information (e.g. the page corpus W_k). The specific function and processing of the processing unit 104 are also described in step S410. And,
the processing unit 104 may be further configured to, for the text information of all pages of slides in the slide file, remove the text information repeatedly appearing at the same position in different pages of slides, so as to obtain corrected text information of all pages of slides, and connect the corrected text information of all pages of slides according to a second set order to form the full-text corpus information (e.g. the full-text corpus W_0). The specific function and processing of the processing unit 104 are also described in step S420.
For example: the text of the k-th page slide is concatenated, from top to bottom and from left to right, into one piece of text to form the page corpus W_k. Then, following the order of the slide pages, all page corpora are concatenated into the full-text corpus W_0; before the concatenation, however, the text sentences repeatedly appearing at the same position in different slides should be deleted, so as to correct the text sentence set T_k of each slide. As shown in fig. 11, the text t_2 is the title name of the slide, appears repeatedly in all slides, and can be removed; the text t_1 is a chapter name that appears repeatedly at the same position in the preceding and following slides and can be deleted.
Therefore, by determining the corpus information of each page of the slides in the slide file and the full text corpus information of all pages of the slides in the slide file based on the text information of all pages of the slides in the slide file, the processing can be performed on each page of slides and all slides respectively, and the comprehensiveness and the accuracy of processing pictures and texts in the engineering slide file can be guaranteed.
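A minimal sketch of building the page corpus W_k and the full-text corpus W_0 with the duplicate removal described above, assuming the text dictionaries produced by the reading step sketched earlier and a y axis that grows downward; all names are illustrative.

```python
from collections import Counter

def page_corpus(texts):
    """Concatenate one page's text sentences top-to-bottom, then left-to-right."""
    ordered = sorted(texts, key=lambda t: (t["box"][1], t["box"][0]))
    return "".join(t["content"] for t in ordered)

def remove_repeated(pages_texts):
    """Drop text sentences whose (content, box) pair repeats on more than one page,
    e.g. running titles and chapter names."""
    seen = Counter((t["content"], t["box"]) for texts in pages_texts for t in texts)
    return [[t for t in texts if seen[(t["content"], t["box"])] == 1]
            for texts in pages_texts]

def full_corpus(pages_texts):
    """W_0: per-page corpora concatenated in page order after duplicate removal."""
    return "".join(page_corpus(texts) for texts in remove_repeated(pages_texts))
```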
The processing unit 104 may be further specifically configured to perform natural language analysis processing on each page of corpus information and full text corpus information by using a natural language analysis method, so as to obtain each page of subject word set and each page of trivial word set of each page of slide, and each page of full text subject word set and full text trivial word set of all pages of slide. The specific function and processing of the processing unit 104 is also referred to in step S320.
More optionally, the processing unit 104 performing natural language analysis processing on the per-page corpus information and the full-text corpus information by using a natural language analysis method may include: performing natural language analysis processing on the per-page corpus information and the full-text corpus information respectively, so as to obtain the per-page subject word set and the per-page trivial word set of each page of slide, and the full-text subject word set and the full-text trivial word set of all pages of slides. For example: natural language analysis is carried out on the page corpus (i.e. the page corpus W_k) and the slide file corpus (i.e. the full-text corpus W_0) to obtain the per-page subject word set E_k = {e_k,i}, the per-page trivial word set C_k = {c_k,i}, the full-text subject word set E_0 = {e_0,i} and the full-text trivial word set C_0 = {c_0,i}. More specific processes may be as follows:
the processing unit 104 may be further specifically configured to perform natural language analysis processing on each page of corpus information and full text corpus information, so as to obtain a subject word set of each page of the slide and a trivial word set of each page. And carrying out natural language analysis processing on each page of corpus information and full text corpus information to obtain full text subject word sets and full text trivial word sets of all pages of slides. The process of performing natural language analysis processing on any corpus information in each page of corpus information and full text corpus information further specifically may include the following processes:
The processing unit 104 may be further specifically configured to perform Chinese word segmentation on each piece of corpus information among the per-page corpus information and the full-text corpus information by using an N-shortest-path word segmentation algorithm, so as to obtain a word segmentation result. In the word segmentation result, the sentences in each piece of corpus information are segmented into words and phrases of a set specification and set proper nouns, yielding the word segmentation words. The specific function and processing of the processing unit 104 are also described in step S510.
For example: a Chinese word segmentation operation is performed on the material using an N-shortest-path word segmentation algorithm, and each sentence is divided into short words, phrases and proper nouns. For example, "maintenance personnel workload and level evaluation" will be divided into {maintenance, personnel, workload, and, level, evaluation}.
The processing unit 104 may be further specifically configured to select, using a TextRank algorithm, a set number of words in the word segmentation result of each corpus according to a set score order, as a subject word set (e.g. the subject word set E) of the corpus. The specific function and processing of the processing unit 104 is also referred to as step S520.
For example: using TextRank algorithm, selecting the 10 words with the highest score as the subject word set E of the corpus. These words are characterized by a high frequency of occurrence and a strong correlation with other subject words. For example, the set of subject terms e= { maintenance, worksheet, hospital, … … }, for case PPT.
The processing unit 104 may be further specifically configured to classify the words in the corpus that meet the trivial word criterion (i.e. words that do not belong to the subject word set but have a frequency higher than any subject word) into the trivial word set of the corpus (e.g. the trivial word set C). The specific function and processing of the processing unit 104 are also described in step S530.
For example: words in the corpus that do not belong to the subject word set but have a frequency higher than any subject word are classified into the trivial word set C. These trivial words are characterized by being frequently used in many different corpora; they therefore have no distinguishing power, cannot serve as keywords for a particular corpus, and have a negative effect on semantic generalization. For example, the trivial word set C = {certain, always, personnel, ...} for the page of fig. 11. In a subsequent step, an information entropy algorithm is used to extract the important phrase in the text as the "final name" of the picture, and a fast word segmentation method is used, treating the final name of the picture as input to search-engine word segmentation, to store the resulting words as the picture search words.
Therefore, by adopting natural language analysis algorithms such as Chinese word segmentation and TextRank, the problem that some techniques for marking pictures are not applicable to long texts and limited to short proper nouns is solved. Firstly, performing natural language analysis on the subject words of each page of the slide and the full text corpus, then directly performing natural language analysis on the associated text of each picture, and finally simplifying the associated text into short and representative file names by combining the semantics and the slide subject; proper nouns such as name and place name can be extracted, and the subject words of longer paragraphs can be extracted, and the method can correspond to the subject of the slide.
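As an assumed illustration of steps S510 to S530, the sketch below substitutes jieba's segmenter and TextRank implementation for the N-shortest-path segmentation and TextRank steps named above; the frequency rule for trivial words is one reading of the criterion described above, and all names are illustrative.

```python
from collections import Counter
import jieba
import jieba.analyse  # stand-ins for the segmentation and TextRank steps of this embodiment

def subject_and_trivial_words(corpus, top_k=10):
    """Return (subject word set E, trivial word set C) for one corpus.
    Trivial words: not in E but occurring at least as often as some subject word
    (one reading of the rule above)."""
    subject = set(jieba.analyse.textrank(corpus, topK=top_k))
    freq = Counter(w for w in jieba.cut(corpus) if len(w.strip()) > 1)
    min_subject_freq = min((freq[w] for w in subject if w in freq), default=0)
    trivial = {w for w, n in freq.items() if w not in subject and n > min_subject_freq}
    return subject, trivial
```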
The processing unit 104 may be further configured to determine a named-file statement of all text information in the slide file based on the set of per-page subject words and the set of per-page trivial words of each-page slide, and the set of full-text subject words and the set of full-text trivial words of all-page slides. The specific function and processing of the processing unit 104 is also referred to as step S330.
Therefore, on the basis of selecting a proper algorithm to find the corresponding relation between a plurality of pictures and a plurality of text messages in each page of the engineering slide, natural language processing is adopted to automatically analyze the position relation and text semantic relation of the pictures and text blocks in the slide, so that the matching and automatic naming of the pictures and the text in the slide are realized, sentences can be simplified, meanwhile, the pictures can correspond to the subjects of the slide, the naming accuracy of the pictures in the slide file is improved, and the application range of the processing mode is widened.
More optionally, the processing unit 104 determining the named file sentences of all text information in the slide file based on the per-page subject word set and per-page trivial word set of each page of slide, and the full-text subject word set and full-text trivial word set of all pages of slides, may include: the processing unit 104 may be further specifically configured to determine, according to the word size and the text bounding box information in the text information of each slide in the slide file, the text content with the largest word size among the text sentences whose minimum Y-direction value in the text bounding box information is greater than a set coefficient times the height of the page of slide, as the named text sentence of that page of slide; and so on, to obtain the named text sentences of all pages of slides in the slide file as the named file sentences of all text information in the slide file.
For example: the named text sentence h_t of each slide is calculated according to the word sizes and bounding box information of all text sentences in the slide. Preferably, the computing means or matching means selects, among all text sentences whose y_imin in the bounding box information is greater than k times the height h_y of the page of slide, the text sentence t_i with the largest word size s_i as the named text sentence h_t of the slide, where k is preferably 1/3. As shown in fig. 11, the text sentences in the upper third of the page are only text sentence t_3 and text sentence t_6, of which text sentence t_3 has the largest word size, 24; the named text sentence of the slide is therefore text sentence t_3.
Therefore, the method and the device can be used for distinguishing the subject words and the trivial words for each slide by determining the named file sentences of all text information in the slide file based on the subject word set and the trivial word set of each page of each slide and the full-text subject word set and the full-text trivial word set of all pages of slides, and are beneficial to ensuring the accuracy of the named file sentences of all text information in the obtained slide file.
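A short sketch of the rule just described for the named text sentence h_t of a page, assuming the same text dictionaries as in the earlier sketches and the stated condition y_imin > k·h_y with k = 1/3; names are illustrative.

```python
def slide_named_sentence(texts, slide_height, k=1/3):
    """Pick the largest-word-size sentence among those whose minimum y exceeds
    k times the slide height, per the preferred rule above."""
    upper = [t for t in texts
             if t["box"][1] > k * slide_height and t["word_size"] is not None]
    if not upper:
        return None
    return max(upper, key=lambda t: t["word_size"])["content"]
```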
In an alternative example, the processing unit 104 may be further configured to match the named text sentence for each picture in the slide file according to the picture information in the slide file and the named text sentence of the slide file, to obtain a picture named text sentence of each picture. The specific function and processing of the processing unit 104 is also referred to as step S130.
Optionally, the processing unit 104 matches a named text sentence for each picture in the slide file according to the picture information in the slide file and the named text sentences of the slide file; for example, for each picture p_i, the most appropriate named text sentence t_i = {c_i, s_i, x_imin, y_imin, x_imax, y_imax} is matched and stored together with the bounding box information of the picture, yielding p_i = {c_i, X_imin, Y_imin, X_imax, Y_imax}. As shown in fig. 12, specifically, this may include:
the processing unit 104 may be further configured to determine, according to the picture bounding box information in the picture information of each picture in each slide, all named text sentences of the picture within the set direction range, so as to obtain all named text sentences of the picture. The specific function and processing of the processing unit 104 are also described in step S610. As shown in fig. 12, the text sentences directly above, directly below, directly to the left of, or directly to the right of the picture are calculated according to the position information p_i = {X_imin, Y_imin, X_imax, Y_imax} of the picture, and recorded as TP = {tp_j} = {(t_j, w_j)}. As shown in fig. 11, for picture p_1 = {5, 30, 55, 90}, the text sentences directly above, directly below, directly to the left or directly to the right may include t_3, t_4, t_5, t_6 and t_7.
The processing unit 104 may be further configured to determine, as the picture named text sentence of the picture, the named text sentence of the slide of the page where the picture is located if all named text sentences of the picture are empty. The specific function and processing of the processing unit 104 is also referred to as step S620.
For example: as shown in fig. 12, if the set of text sentences tp_j directly above, directly below, directly to the left of, or directly to the right of picture p_i, calculated according to its position information p_i = {X_imin, Y_imin, X_imax, Y_imax}, is empty, the named text sentence h_t of the slide is used as the named text sentence of picture p_i.
The processing unit 104 may be further specifically configured to determine a weight according to all the named text sentences of the picture if all the named text sentences of the picture are not empty, and determine a matching result of the picture and all the named text sentences of the picture according to all the named text sentences of the picture. The specific function and processing of the processing unit 104 also refer to step S630.
More optionally, the processing unit 104 determining weights according to all named text sentences of the picture may include: the processing unit 104 may be further specifically configured to calculate the weight according to the word size of each named text sentence among all named text sentences of the picture, the distance between each named text sentence of the picture and the center point of the picture, and the overlap length between each named text sentence of the picture and the picture, wherein:
d_j is the distance between each named text sentence of the picture and the center point of the picture, and l_j is the overlap length between each named text sentence of the picture and the picture.
For example: for each text sentence t_j among the text sentences tp_j directly above, directly below, directly to the left of, or directly to the right of the picture, calculated according to the position information p_i = {X_imin, Y_imin, X_imax, Y_imax} of the picture, the weight w_j is calculated from the word size of t_j, the distance d_j between the center point of t_j and the center point of p_i, and the overlap length l_j between picture p_i and text sentence t_j, using a preferred calculation formula combining these three quantities.
As shown in fig. 11, for picture p_1: the weight of text sentence t_3 is w_3 = 0.622; the weight of text sentence t_4 is w_4 = 1.057; the weight of text sentence t_5 is w_5 = -0.527; the weight of text sentence t_6 is w_6 = -1.02; the weight of text sentence t_7 is w_7 = -0.92. For picture p_2: the weight of text sentence t_3 is w_3 = -0.463; the weight of text sentence t_4 is w_4 = -0.365; the weight of text sentence t_5 is w_5 = 1.01; the weight of text sentence t_6 is w_6 = -0.45; the weight of text sentence t_7 is w_7 = -0.63.
Thus, by determining the weight according to the word size of each named text sentence in all named text sentences of the picture, the distance between each named text sentence of the picture and the center point of the picture, and the overlapping length between each named text sentence of the picture and the picture, the reliability and the accuracy of each weight determination can be ensured.
More optionally, the processing unit 104 determining the matching result of the picture with all named text sentences according to all named text sentences of the picture may include: the processing unit 104 may be further specifically configured to match the word segmentation words in each named text sentence of the picture in turn with the per-page subject word set and per-page trivial word set of the page where the picture is located, and with the full-text subject word set and full-text trivial word set of all pages of slides, to obtain the matching result.
For example: the word segmentation words in text sentence t_j are matched in turn with the words in the per-page subject word set E_k, the full-text subject word set E_0, the per-page trivial word set C_k and the full-text trivial word set C_0, and the added score b_j of text sentence t_j is then calculated from the number of word matches according to the following table. Matches with subject words add to the score and matches with trivial words deduct from it, but neither the additions nor the deductions may exceed the stated limits. Preferably, the added score b_j of text sentence t_j may be determined with reference to the following table.
| Matching condition | Score per match | Limit on the score |
| Belongs to the per-page subject word set E_k | 0.2 | 0.0~1.0 |
| Belongs to the full-text subject word set E_0 | 0.1 | 0.0~0.5 |
| Belongs to the per-page trivial word set C_k | -0.2 | -0.6~0.0 |
| Belongs to the full-text trivial word set C_0 | -0.1 | -0.3~0.0 |
As shown in fig. 11, text sentence t_3 contains one full-text subject word, "hospital", so the added score of text sentence t_3 is b_3 = 0.1. Text sentence t_4 contains one per-page subject word, "maintenance", and one full-text trivial word, "personnel", so the added score of text sentence t_4 is b_4 = 0.2 - 0.1 = 0.1; the remaining sentences are scored similarly.
Therefore, by matching the word segmentation words in each named text sentence of the picture in turn with the per-page subject word set and per-page trivial word set of the page where the picture is located, and with the full-text subject word set and full-text trivial word set of all pages of slides, so as to determine the matching result of the picture with all named text sentences, both each page of slides and all pages of slides can be processed, which helps guarantee the comprehensiveness and reliability of the matching result.
And the processing unit 104 may be further configured to determine, according to the maximum of the sums of the weight and the matching result over all named text sentences of the picture, the named text sentence corresponding to that maximum as the named text sentence matched with the picture, so as to take it as the picture named text sentence of the picture. The specific function and processing of the processing unit 104 are also described in step S640.
For example: as shown in fig. 12, the text sentence t_j for which the total score, namely the weight w_j plus the added score b_j of text sentence t_j, is the largest is taken as the named text sentence of picture p_i. As shown in fig. 11, the named text sentence corresponding to picture p_1 is text sentence t_4, the named text sentence corresponding to picture p_2 is text sentence t_5, the named text sentence corresponding to picture p_3 is text sentence t_6, and the named text sentence corresponding to picture p_4 is text sentence t_7.
Thus, the spatial correlation between the pictures in the slide and the text on the page is calculated first; then a subject word and trivial word matching device is used to score and quantify the key semantic information, so that the pictures correspond to the slide text blocks more accurately and the readability and understandability of the picture naming are greatly enhanced.
In an alternative example, the processing unit 104 may be further configured to establish a correspondence between the picture information and the text information in the slide file according to the picture information in the slide file and the picture named text sentence obtained by matching each picture. The specific function and processing of the processing unit 104 is also referred to as step S140.
The picture information of each slide in the slide file may include: width and height of each slide, and picture bounding box information for each picture in each slide. The text information of each slide in the slide file may include: text content, word size, and/or text bounding box information for all text statements for each page of the slide.
Therefore, in accordance with the characteristics of engineering slide files, namely strong picture-text correlation but relatively random layout, a suitable algorithm is selected according to the positional relation between the pictures and the text content in each slide, the pictures and the text blocks are accurately matched by an intelligent algorithm based on the geometric position information, the pictures in the slide file can be named intelligently, and the readability, understandability and retrievability of the picture names in the slides can be enhanced. Natural language processing is adopted to automatically analyze the positional relation and the text semantic relation of the pictures and text blocks in the slide, so that matching and automatic naming of the pictures and the text in the slide are realized; the text related to each picture is reduced to short and representative subject words by a natural language analysis device, which facilitates indexing and quick retrieval.
Optionally, the establishing, by the processing unit 104, a correspondence between the picture information and the text information in the slide file according to the picture information in the slide file and the picture named text statement obtained by matching each picture may include:
the processing unit 104 may be further specifically configured to name a text sentence for each picture in each page of the slide in the slide file, construct a storage result for each picture, and store each picture according to the storage result, so as to establish a correspondence between picture information and text information in the slide file. That is, each picture in the slide file is stored as an individual file, and the individual file of each picture is named according to the picture named text sentence obtained by matching each picture, so as to establish the corresponding relationship between the picture information and the text information in the slide file.
For example: based on picture p i Named text sentence c i The final result is constructed and may include the picture itself, the final name, the full name of the picture, the search term of the picture, etc.
Therefore, the method and the device can name and store according to the picture information in the slide file and the picture naming text statement obtained by matching each picture, so that the corresponding relation between the picture information and the text information in the slide file can be quickly and conveniently established for the engineering slide file, and the user can conveniently check and adapt.
Optionally, the processing unit 104 names a text sentence for each picture in each slide in the slide file, constructs a storage result for each picture, and stores each picture according to the storage result, which may include:
the processing unit 104 may be further configured to create a folder for each slide and name the slide by using a named text sentence of the slide; and independently storing each picture as an independent file, naming the independent file of each picture by using a picture naming text sentence, and placing the independent file in the newly created folder after naming the independent file of each picture. The specific function and processing of the processing unit 104 is also referred to as step S710.
For example: a new folder may be created and named with the named text sentence h_t of the slide; each picture p_i in the slide is stored independently as a file, named with its final name, such as c_i, and placed into the folder.
The processing unit 104 may be further configured to establish a storage record for the picture in the database; and in the storage record of the picture, determining and storing a picture full name field of the picture according to a picture naming text sentence of the picture. The specific function and processing of the processing unit 104 is also referred to step S720.
For example: a record is established for picture p_i in the database, and the picture full name field is stored directly based on the named text sentence c_i of picture p_i. For instance, the full name field of p_4 is "the lamp tube of the outpatient infusion room always flickers on and off".
Therefore, by naming text sentences for each picture in each slide in the slide file, a storage result is constructed for each picture and each picture is stored according to the storage result, so that the pictures and the texts in the slide can be conveniently and correspondingly stored based on engineering, the storage is convenient, and the searching is also convenient.
More optionally, in the storage record of the picture, the processing unit 104 determines and stores the picture full name field of the picture according to the picture naming text statement of the picture, which may include:
The processing unit 104 may be further configured to directly store the picture naming text sentence of the picture as the picture full name field of the picture if the length of the chinese character in the picture naming text sentence of the picture is less than or equal to the preset value. The specific function and processing of the processing unit 104 also refer to step S810.
The processing unit 104 may be further configured to extract a set phrase in the picture naming text sentence of the picture by using an information entropy algorithm if the length of the chinese character in the picture naming text sentence of the picture is greater than a preset value, and use the set phrase as a new picture naming text sentence of the picture; and storing the new picture naming text statement of the picture as a picture full name field of the picture. The specific function and processing of the processing unit 104 is also referred to as step S820.
For example: if the named text sentence c_i of picture p_i is too long, preferably more than 10 Chinese characters, an information entropy algorithm is used to extract the important phrase in the text, and the phrase is stored in the database as the "final name" of the picture. If the named text sentence c_i of picture p_i is short, the final name of the picture is the same as its full name. For instance, the full name of picture p_4, "the lamp tube of the outpatient infusion room always flickers on and off", is 14 Chinese characters long, so its final name is formed by extracting the phrase "the infusion room lamp tube flickering on and off".
The processing unit 104 may be further specifically configured to determine, using a fast word segmentation method, a full name field of a picture as a search engine word of the picture, and store the search engine word as a picture search word of the picture. The specific function and processing of the processing unit 104 also refer to step S830.
For example: and determining the picture full name field of the picture as a search engine word of the picture by using a quick word segmentation method, and storing the search engine word as a picture search word of the picture.
The processing unit 104 may be further configured to traverse all the pictures in the slide file, pair several pictures with the same picture search word, and store the paired pictures as similar pictures in the database. The specific function and processing of the processing unit 104 is also referred to as step S840.
For example: as post-processing, all pictures are traversed, and pictures with the same search words are paired in twos and stored in the database as similar pictures to support a picture recommendation algorithm; the final result can be seen in the example shown in fig. 14. As shown in fig. 13, for the slide shown in fig. 11, a folder "a certain Grade-2A hospital application case" is newly created, and the 4 picture files are extracted and named; the "picture search terms" are obtained using the fast word segmentation method.
Therefore, the picture full name field of the picture is determined and stored according to the picture naming text statement of the picture, the reliability of storage is guaranteed, the occupied space is small, and the convenience of searching is also guaranteed.
Since the processes and functions implemented by the apparatus of the present embodiment substantially correspond to the embodiments, principles and examples of the methods shown in fig. 1 to 8, the description of the present embodiment is not exhaustive, and reference may be made to the related descriptions in the foregoing embodiments, which are not repeated herein.
Through a large number of experiments, the technical scheme of the invention is adopted, and the position relation and text semantic relation of the pictures and the text blocks in the slide are automatically analyzed by adopting natural language processing according to the characteristics of large picture quantity and more text information of the engineering slide file, so that the matching and automatic naming of the pictures and the text in the slide are realized, the corresponding relation between a plurality of pictures and a plurality of text information in each page of the engineering slide can be conveniently determined, and the quick searching and the repeated use of the slide pictures can be supported.
According to an embodiment of the present invention, there is also provided a terminal corresponding to a processing apparatus of pictures and text in a slide. The terminal may include: the processing device for pictures and texts in the slide.
Considering that engineering slides are characterized by a large number of pictures and a large amount of text information, difficulties inevitably arise in processes such as retrieving and using engineering slide pictures, so selecting a suitable algorithm is particularly important when finding the correspondence between the multiple pictures and the multiple pieces of text information in each page of an engineering slide.
Some software that extracts slide pictures does not retain the associated text content. Engineering slide material typically consists of engineering pictures and text statements related to the content of the pictures. Given that the pictures in the slide show have strong correlation with the text on the page, both in content and spatial location, the readability, understandability and retrievability of the picture naming will be greatly enhanced if such information can be utilized. However, the slide has more pictures and text blocks, the positions are relatively random, the text information is also dense, and how to accurately extract text sentences corresponding to each picture from the slide is a difficult problem.
In some methods for matching pictures with context information, the matching method is mostly aimed at semi-structured texts with high structuring degree, wherein the upper text and the lower text of a single picture are relatively clear, such as web page format texts and word format texts. In a slide, there may be a situation that a plurality of pictures and a plurality of words are randomly combined and arranged, and how to accurately match the most relevant text for each picture is very difficult, and some techniques have lower accuracy in picture and word matching and picture naming in the slide.
In addition, some techniques for marking pictures through text are limited to short proper nouns such as personal names and place names, whereas slide files mainly contain long natural sentences, which are unwieldy when used directly as picture file names; therefore, on the basis of selecting a suitable algorithm to find the correspondence between the multiple pictures and the multiple pieces of text information in each page of an engineering slide, how to simplify the sentences while keeping them consistent with the subject of the slide is a problem to be further solved.
In an alternative embodiment, the scheme of the invention provides a method for automatically matching pictures and texts in a slide and naming pictures, in particular to a method for automatically matching pictures and texts in a slide of construction engineering materials and automatically naming and independently storing the pictures, which can rapidly and automatically complete the extraction, naming and independent storage of the pictures in a slide file of the engineering materials and support the subsequent rapid retrieval of precious engineering pictures according to the names of the pictures.
Specifically, according to the scheme of the invention, the position relation and the text semantic relation of the pictures and the text blocks in the slide can be automatically analyzed by adopting natural language processing according to the characteristics of the engineering slide file, so that the matching and automatic naming of the pictures and the text in the slide are realized, and the quick retrieval and the repeated use of the slide pictures are supported.
In an optional embodiment, in the solution of the present invention, a method for automatically matching pictures and texts in a slide and naming pictures may include the following steps:
step 1: each picture p_i in the slide file is read, and its bounding box information is recorded as P = {p_i}, p_i = {X_imin, Y_imin, X_imax, Y_imax}; for each page of slide, the text content, word size and bounding box information of all text sentences are read and recorded as T = {t_i}, t_i = {c_i, x_imin, y_imin, x_imax, y_imax}.
For example: for the k-th page slide h_k in the slide file, its width h_x and height h_y are acquired; the bounding box information of each picture p_i in the k-th page slide h_k is read and recorded as P_k = {p_i}, p_i = {X_imin, Y_imin, X_imax, Y_imax}; and the text content, word size and bounding box information of all text sentences in the k-th page slide h_k are read and recorded as T_k = {t_i}, t_i = {c_i, s_i, x_imin, y_imin, x_imax, y_imax}, where c_i is the text content, s_i is the word size, and x_imin, y_imin, x_imax, y_imax are the bounding box information, i.e. the position information.
Here, capital letters denote the bounding values of a picture bounding box, e.g. X_imin, Y_imin, X_imax, Y_imax, and lower-case letters denote the coordinate values of a text bounding box, e.g. x_imin, y_imin, x_imax, y_imax. A bounding box is not an ordinary coordinate concept: it is delimited by straight lines parallel to the coordinate axes, such as the four broken lines in fig. 11. The bounding box, i.e. the geometric or position information, is detailed position information that not only marks the position but also allows the geometry of the object to be derived directly.
Specifically, in step 1, for each page of slide h_k in the slide file, as shown in fig. 11, the width is h_x = 210 and the height is h_y = 130. Each picture p_i therein is read, and its bounding box information is recorded as P = {p_i}, p_i = {X_imin, Y_imin, X_imax, Y_imax}. For each page of slide h, the text content, word size and bounding box information of all text sentences are read and recorded as T = {t_i}, t_i = {c_i, s_i, x_imin, y_imin, x_imax, y_imax}.
For example: as shown in fig. 11, there are four pictures P = {p_1, p_2, p_3, p_4}, with p_1 = {5, 30, 55, 90} and p_2 = {60, 40, 105, 80}; and there are 7 text sentences T = {t_1, t_2, t_3, t_4, t_5, t_6, t_7}, with t_3 = {"a certain Grade-2A hospital application case", 24, 10, 102, 50, 108}, t_4 = {"maintenance personnel workload and level evaluation", 14, 10, 23, 50, 27}, and t_5 = {"building weak link assessment", 14, 107, 47, 113, 77}.
Corresponding to step 1, step 1 may be performed using the slide information extraction module 1 during actual use. The operation performed by the slide information extraction module 1 may be referred to in the related description of step 1, and will not be described herein.
Step 2: the text of the kth page slide is formed into a section of text from top to bottom and from left to right to form page corpus W k . Then, according to the sequence of slide pages, all the page corpus is formed into full text corpus W 0 But before connecting, the text sentence set T of each slide should be revised k The text sentence repeatedly appeared at the same position of the slide is deleted. For example: as shown in fig. 11, text t 2 The title name of the slide is repeatedly appeared in all slides and can be removed; text t 1 The chapter names repeatedly appear at the same position in the front and back slides and can be deleted.
Corresponding to step 2, in the actual use process, step 2 may be performed by using the text sentence preprocessing module 2. The operation performed by the text sentence preprocessing module 2 may refer to the related description of step 2, which is not described herein.
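A possible realization of step 2 is sketched below, under the assumption that the per-page records come from the step-1 sketch; the tolerance used to decide that two sentences occupy "the same position" on different slides is an illustrative choice.

```python
# Sketch of step 2: building page corpora W_k and the full-text corpus W_0.
# The position tolerance `tol` is an illustrative assumption.
from collections import Counter

def build_corpora(pages, tol=2.0):
    # Count how often each (content, rounded position) pair occurs across all slides.
    key = lambda t: (t["c"], round(t["xmin"] / tol), round(t["ymin"] / tol))
    occurrences = Counter(key(t) for page in pages for t in page["T"])

    page_corpora, revised_pages = [], []
    for page in pages:
        # Reading order: top to bottom (descending y on the bottom-up axis), then left to right.
        ordered = sorted(page["T"], key=lambda t: (-t["ymax"], t["xmin"]))
        page_corpora.append(" ".join(t["c"] for t in ordered))                 # W_k
        # Revised T_k: drop sentences repeated at the same position on other slides.
        revised_pages.append([t for t in ordered if occurrences[key(t)] == 1])

    full_corpus = " ".join(t["c"] for kept in revised_pages for t in kept)     # W_0
    return page_corpora, full_corpus, revised_pages
```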
Step 3: natural language analysis is performed on the page corpora (i.e. page corpus W_k) and the slide file corpus (i.e. full-text corpus W_0) to obtain the subject word set E_k = {e_k,i} of each page, the trivial word set (i.e. words that do not belong to the subject word set but occur more frequently than any subject word) C_k = {c_k,i} of each page, the full-text subject word set E_0 = {e_0,i} and the full-text trivial word set C_0 = {c_0,i}. The processing of the page corpora and of the slide file corpus is similar and may include the following specific steps.
Step 3.1: Chinese word segmentation is performed on the corpus using an N-shortest-path word segmentation algorithm, dividing each sentence into short words, phrases and proper nouns. For example, "maintenance personnel workload and level evaluation" will be divided into {maintenance, personnel, workload, and, level, evaluation}.
Here, a word may be a unit with independent meaning composed of 1 to 3 Chinese characters; a phrase may be a sentence fragment consisting of 2 or more related words; a proper noun may denote a specific or unique person or thing (person name, place name, etc.).
Step 3.2: using the TextRank algorithm, the 10 words with the highest scores are selected as the subject word set E of the corpus. These words are characterized by a high frequency of occurrence and a strong correlation with the other subject words. For example, the subject word set for the example PPT is E = {maintenance, worksheet, hospital, …}.
Step 3.3: words in the corpus that meet the trivial-word criterion are collected into the trivial word set C. These trivial words are characterized by being used frequently in many different corpora; they therefore carry no distinguishing power, cannot serve as keywords of a particular corpus, and even have a negative effect on semantic summarization. For example, the trivial word set of the page of Fig. 11 is C = {certain, always, person, …}. In a subsequent step, an information entropy algorithm is used to extract the important phrase of a text as the "final name" of a picture, and a quick word segmentation method is used to segment the final name of the picture in the manner of a search engine, the obtained words being stored as the picture search words.
In correspondence with step 3, during actual use, step 3 may be performed using the natural language analysis module 3. The operations performed by the natural language analysis module 3 may be referred to in the description related to step 3, and will not be described herein.
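The following sketch illustrates step 3 using the jieba package as a convenient stand-in: its default segmenter is not the N-shortest-path algorithm named above, its TextRank keyword extractor stands in for the TextRank step, and the trivial-word rule encodes only one reading of "more frequent than any subject word"; all of these are assumptions rather than the scheme's prescribed components.

```python
# Sketch of step 3 with jieba as a stand-in segmenter and keyword extractor.
from collections import Counter
import jieba
import jieba.analyse

def analyse_corpus(corpus, top_k=10):
    words = [w for w in jieba.lcut(corpus) if w.strip()]
    # Subject word set E: the top-k TextRank keywords of the corpus.
    subject = set(jieba.analyse.textrank(corpus, topK=top_k))
    # Trivial word set C: words outside E occurring more often than the most
    # frequent subject word (one possible reading of the definition above).
    counts = Counter(words)
    max_subject_freq = max((counts[w] for w in subject), default=0)
    trivial = {w for w, n in counts.items() if w not in subject and n > max_subject_freq}
    return subject, trivial
```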
Step 4: calculate the named text sentence h_t of each slide according to the word sizes and bounding box information of all text sentences in the slide. The preferred calculation or matching method is to select the text sentences whose y_imin in the bounding box information is greater than k times the slide height h_y, and to take among them the text sentence t_i with the largest word size s_i as the named text sentence h_t of the slide; k is preferably 1/3. As shown in Fig. 11, only text sentence t_3 and text sentence t_6 lie in the upper 1/3 of the slide; of these, text sentence t_3 has the largest word size, 24, so the named text sentence of the slide is text sentence t_3.
In correspondence with step 4, step 4 may be performed by slide name matching module 4 during actual use. The operation performed by the slide name matching module 4 may be referred to in the description related to step 4, and will not be described herein.
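Step 4 may be sketched as follows, reusing the bottom-up y axis assumed in the step-1 sketch. The description gives k = 1/3, while the Fig. 11 walkthrough effectively keeps only the top third of the page, so the coefficient is deliberately left as a parameter; the coordinate convention and default value are assumptions.

```python
# Sketch of step 4: the named text sentence h_t of a slide, on the bottom-up
# y axis used in the step-1 sketch. k is a tunable coefficient.
def slide_named_sentence(page, k=1/3):
    candidates = [t for t in page["T"] if t["ymin"] > k * page["height"]]
    if not candidates:
        return None
    return max(candidates, key=lambda t: t["s"] or 0)   # largest word size wins
```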
Step 5: for each picture p_i, match the most appropriate named text sentence t_i = {c_i, s_i, x_imin, y_imin, x_imax, y_imax} and store it together with the bounding box information of the picture, obtaining p_i = {c_i, X_imin, Y_imin, X_imax, Y_imax}. As shown in Fig. 12, this may specifically include the following steps:
Step 5.1: according to the position information p_i = {X_imin, Y_imin, X_imax, Y_imax} of the picture, calculate the set of text sentences located directly above, directly below, directly to the left or directly to the right of the picture, TP = {tp_j} = {(t_j, w_j)}. As shown in Fig. 11, for picture p_1 = {5, 30, 55, 90}, the text sentences directly above, directly below, directly to the left or directly to the right include t_3, t_4, t_5, t_6 and t_7.
Step 5.2: for each text sentence t_j in the set of text sentences directly above, directly below, directly to the left or directly to the right of the picture, calculated from the position information p_i = {X_imin, Y_imin, X_imax, Y_imax} of the picture, calculate the weight w_j from the word size of t_j, the distance between its center point and the center point of picture p_i, and the overlap length between them. Here d_j is the distance between the center point of t_j and the center point of p_i, and l_j is the overlap length between picture p_i and text sentence t_j; the preferred calculation formula computes w_j as a function of the word size, d_j and l_j.
As shown in Fig. 11, for picture p_1, the weight of text sentence t_3 is w_3 = 0.622, that of t_4 is w_4 = 1.057, that of t_5 is w_5 = -0.527, that of t_6 is w_6 = -1.02 and that of t_7 is w_7 = -0.92; for picture p_2, the weight of text sentence t_3 is w_3 = -0.463, that of t_4 is w_4 = -0.365, that of t_5 is w_5 = 1.01, that of t_6 is w_6 = -0.45 and that of t_7 is w_7 = -0.63.
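Because the preferred weight formula is published only as a formula figure in the original document, the sketch below uses an illustrative weight that merely follows the stated tendencies (a larger word size and overlap length raise w_j, a larger center-point distance d_j lowers it); it will not reproduce the numeric values quoted above for Fig. 11. The candidate filter likewise implements one possible reading of "directly above, directly below, directly to the left or directly to the right".

```python
# Sketch of steps 5.1-5.2. The weight below is ILLUSTRATIVE ONLY: the scheme's
# preferred formula is not reproduced in this text, so this version simply
# combines word size, overlap l_j and center distance d_j with arbitrary
# coefficients.
import math

def interval_overlap(a_min, a_max, b_min, b_max):
    return max(0.0, min(a_max, b_max) - max(a_min, b_min))

def candidate_sentences(picture, sentences):
    """Step 5.1: text sentences directly above/below/left/right of the picture."""
    out = []
    for t in sentences:
        horizontally_aligned = interval_overlap(picture["Xmin"], picture["Xmax"],
                                                t["xmin"], t["xmax"]) > 0
        vertically_aligned = interval_overlap(picture["Ymin"], picture["Ymax"],
                                              t["ymin"], t["ymax"]) > 0
        if horizontally_aligned or vertically_aligned:
            out.append(t)
    return out

def weight(picture, t, alpha=0.01, beta=0.02, gamma=0.02):
    """Step 5.2 (illustrative): w_j from word size, center distance d_j, overlap l_j."""
    pcx = (picture["Xmin"] + picture["Xmax"]) / 2
    pcy = (picture["Ymin"] + picture["Ymax"]) / 2
    tcx, tcy = (t["xmin"] + t["xmax"]) / 2, (t["ymin"] + t["ymax"]) / 2
    d = math.hypot(pcx - tcx, pcy - tcy)                                        # d_j
    l = max(interval_overlap(picture["Xmin"], picture["Xmax"], t["xmin"], t["xmax"]),
            interval_overlap(picture["Ymin"], picture["Ymax"], t["ymin"], t["ymax"]))  # l_j
    return alpha * (t["s"] or 0) + beta * l - gamma * d    # illustrative coefficients
```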
Step 5.3: the segmented words in text sentence t_j are matched in turn against the subject word set E_k of the page, the full-text subject word set E_0, the trivial word set C_k of the page and the full-text trivial word set C_0, and the score b_j of text sentence t_j is then calculated from the number of word matches according to the following table: matches with subject words add to the score, matches with trivial words deduct from it, and neither the additions nor the deductions may exceed the stated limits.
| Matching situation | Score per match | Limit of the score |
|---|---|---|
| Belongs to the subject word set E_k of the page | 0.2 | 0.0~1.0 |
| Belongs to the full-text subject word set E_0 | 0.1 | 0.0~0.5 |
| Belongs to the trivial word set C_k of the page | -0.2 | -0.6~0.0 |
| Belongs to the full-text trivial word set C_0 | -0.1 | -0.3~0.0 |
As shown in Fig. 11, text sentence t_3 contains 1 full-text subject word, "hospital", so its score is b_3 = 0.1; text sentence t_4 contains 1 page subject word, "maintenance", and 1 full-text trivial word, "personnel", so its score is b_4 = 0.2 - 0.1 = 0.1; the remaining sentences are scored in the same way.
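The scoring table of step 5.3 can be transcribed directly. The sketch below assumes the word sets come from the step-3 sketch and, following the worked example above (where "maintenance" contributes 0.2 exactly once), counts each segmented word against at most one list in table order; that priority is an assumption, and jieba again stands in for the word segmenter.

```python
# Sketch of step 5.3: the capped additive score b_j from the table above.
import jieba

def semantic_score(sentence_text, page_subjects, full_subjects, page_trivial, full_trivial):
    page_gain = full_gain = page_loss = full_loss = 0.0
    for w in jieba.lcut(sentence_text):
        if w in page_subjects:
            page_gain += 0.2
        elif w in full_subjects:
            full_gain += 0.1
        elif w in page_trivial:
            page_loss -= 0.2
        elif w in full_trivial:
            full_loss -= 0.1
    # Clamp each contribution to the limits given in the table.
    return (min(page_gain, 1.0) + min(full_gain, 0.5)
            + max(page_loss, -0.6) + max(full_loss, -0.3))
```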
Step 5.4: the text sentence t_j whose total score, weight w_j plus score b_j, is the largest is taken as the named text sentence of picture p_i. As shown in Fig. 11, the named text sentence corresponding to picture p_1 is text sentence t_4, that of picture p_2 is text sentence t_5, that of picture p_3 is text sentence t_6, and that of picture p_4 is text sentence t_7.
Step 5.5: if the set of text sentences directly above, directly below, directly to the left or directly to the right of picture p_i, calculated from its position information p_i = {X_imin, Y_imin, X_imax, Y_imax}, is empty, then the named text sentence h_t of the slide is used as the named text sentence of picture p_i.
Corresponding to step 5, in the actual use process, step 5 may be performed by using the picture name matching module 5. The operations performed by the picture name matching module 5 may be referred to in the related description of step 5, and will not be described herein.
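Putting steps 5.1 to 5.5 together, a picture's named text sentence may be chosen as sketched below, reusing the illustrative helpers from the earlier sketches; the function names and data layout are assumptions carried over from those sketches.

```python
# Sketch of steps 5.4-5.5: for each picture, take the candidate sentence with
# the largest total w_j + b_j; when the candidate set is empty, fall back to
# the slide's named text sentence h_t.
def name_picture(picture, page, slide_sentence,
                 page_subjects, full_subjects, page_trivial, full_trivial):
    candidates = candidate_sentences(picture, page["T"])
    if not candidates:                                   # step 5.5
        return slide_sentence["c"] if slide_sentence else None
    def total(t):                                        # step 5.4: w_j + b_j
        return (weight(picture, t)
                + semantic_score(t["c"], page_subjects, full_subjects,
                                 page_trivial, full_trivial))
    return max(candidates, key=total)["c"]
```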
Step 6: based on the named text sentence c_i of picture p_i, construct the final result, which may include the picture itself, a final name, a picture full name, picture search words, etc.; this may specifically include the following steps:
Step 6.1: a new folder may be created and named with the named text sentence h_t of the slide; each picture p_i in the slide is stored independently as a file, named with its final name (e.g. c_i), and placed into the folder.
Step 6.2: a record is created for picture p_i in the database, and the picture full name field is stored directly from the named text sentence c_i of picture p_i. For example, the full name field of p_4 is "the lamp tube of the outpatient transfusion room is always turning on and off".
Step 6.3: if the named text sentence c_i of picture p_i is too long, preferably longer than 10 Chinese characters, an information entropy algorithm is used to extract the important phrase of the text, which is stored in the database as the "final name" of the picture. If the named text sentence c_i of picture p_i is short, the final name of the picture is the same as its full name. For example, the full name of picture p_4 is 14 characters long, so the phrase "transfusion room lamp tube turning on and off" is extracted as its final name.
Step 6.4: using a quick word segmentation method, the final name of the picture is segmented in the manner of a search engine and the obtained words are stored as picture search words; during later retrieval, queries are matched only against these search words rather than against the full names, which avoids mismatches. For example, the picture search words of picture p_4 are {transfusion, transfusion room, lamp, tube, on, off}. Then, as post-processing, all pictures are traversed, and pictures sharing the same search word are paired in twos and stored in the database as similar pictures to support a picture recommendation algorithm; the final result can be seen in the example shown in Fig. 14.
That is, in steps 6.2 to 6.4, a record is created for picture p_i in the database and the picture full name field is stored directly as c_i; if c_i is too long, preferably longer than 10 Chinese characters, an information entropy algorithm is used to extract the important phrase of the text, which is stored in the database as the "final name" of the picture; the "picture search words" are obtained with a quick word segmentation method; and all pictures are traversed, with pictures sharing the same search word paired in twos and stored in the database as similar pictures to support a picture recommendation algorithm. For example: as shown in Fig. 13, a folder "a certain Grade 2A hospital application" is newly created, and the 4 picture files extracted from the slide shown in Fig. 11 are named and placed into it; the final result can be seen in the example shown in Fig. 14.
In correspondence with step 6, during actual use, step 6 may be performed using the picture storage module 6. The operations performed by the image storage module 6 may be referred to in the description related to step 6, and will not be described herein.
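Step 6 may be sketched with the Python standard library plus jieba's search-engine segmentation as a stand-in for the quick word segmentation method. The information-entropy phrase extraction of step 6.3 is only stubbed with a plain truncation placeholder, and the SQLite schema, file format and folder layout are illustrative assumptions rather than part of the scheme.

```python
# Sketch of step 6: storing pictures, names, search words and similar pairs.
import itertools
import os
import sqlite3
import jieba

def extract_key_phrase(full_name):
    # Placeholder: a real implementation would extract the important phrase
    # with the information-entropy algorithm referred to in step 6.3.
    return full_name[:10]

def store_pictures(slide_name, named_pictures, db_path="pictures.db"):
    """named_pictures: list of dicts {"blob": bytes, "full_name": str}."""
    os.makedirs(slide_name, exist_ok=True)                              # step 6.1
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS picture (id INTEGER PRIMARY KEY, "
                "full_name TEXT, final_name TEXT, search_words TEXT)")
    con.execute("CREATE TABLE IF NOT EXISTS similar (a INTEGER, b INTEGER)")
    rows = []
    for pic in named_pictures:
        full_name = pic["full_name"]                                    # step 6.2
        final_name = full_name if len(full_name) <= 10 else extract_key_phrase(full_name)  # 6.3
        with open(os.path.join(slide_name, final_name + ".png"), "wb") as f:
            f.write(pic["blob"])
        search_words = set(jieba.lcut_for_search(final_name))           # step 6.4
        cur = con.execute("INSERT INTO picture (full_name, final_name, search_words) "
                          "VALUES (?, ?, ?)",
                          (full_name, final_name, " ".join(sorted(search_words))))
        rows.append((cur.lastrowid, search_words))
    # Post-processing: store pairs of pictures that share a search word as "similar".
    for (id_a, words_a), (id_b, words_b) in itertools.combinations(rows, 2):
        if words_a & words_b:
            con.execute("INSERT INTO similar (a, b) VALUES (?, ?)", (id_a, id_b))
    con.commit()
    con.close()
```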
Therefore, according to the scheme of the invention, the method for automatically matching pictures and texts in a slide and naming pictures, in view of the characteristic that the position relation between pictures and text content in each slide is random yet strongly correlated, accurately matches pictures with text blocks through an intelligent algorithm based on geometric position information, so that the readability, comprehensibility and retrievability of picture names in the slide are enhanced. The text associated with a picture is reduced to short and representative subject words by a natural language analysis method, which facilitates building indexes and quick retrieval.
Specifically, the scheme of the invention adopts an algorithm based on the text word size bounding box and the semantic score, and solves the problem of accurately matching the most relevant text for each picture under the conditions that the positions of the text blocks are changeable and a plurality of pictures exist. According to the scheme, firstly, the correlation between the pictures in the slide and the text on the page in the space position is calculated; and then, a keyword and trivial word matching method is adopted to score and quantify the key semantic information, so that the technical effects of more accurately corresponding the pictures with the slide text blocks and greatly enhancing the readability and the understandability of the picture naming are achieved.
Furthermore, the scheme of the invention adopts natural language analysis algorithms such as Chinese word segmentation and TextRank, and solves the problem that some picture-labelling techniques are not applicable to long texts and are limited to short proper nouns. In the scheme, natural language analysis is first performed on the per-page and full-text corpora of a slide file to obtain their subject words, then on the text associated with each picture, and finally the associated text is reduced, by combining its semantics with the slide subject, to a short and representative file name; proper nouns such as person names and place names can be extracted, and the subject words of longer paragraphs can also be extracted in correspondence with the subject of the slide.
Since the processes and functions implemented by the terminal of the present embodiment basically correspond to the embodiments, principles and examples of the apparatus shown in fig. 9, the description of the present embodiment is not exhaustive, and reference may be made to the related descriptions in the foregoing embodiments, which are not repeated herein.
Through a large number of experiments it has been verified that, with the technical scheme of the invention, pictures and text blocks are accurately matched according to geometric position information in view of the random yet strongly correlated position relation between pictures and text content in each slide, so that the correspondence between the multiple pictures and the multiple pieces of text information in each page of an engineering slide can be conveniently determined, facilitating retrieval and use by the user.
According to an embodiment of the present invention, there is also provided a storage medium corresponding to a method for processing pictures and texts in a slide, the storage medium including a stored program, wherein the device in which the storage medium is controlled to execute the above method for processing pictures and texts in a slide when the program runs.
Since the processes and functions implemented by the storage medium of the present embodiment substantially correspond to the embodiments, principles and examples of the methods shown in fig. 1 to 8, the description of the present embodiment is not exhaustive, and reference may be made to the related descriptions in the foregoing embodiments, which are not repeated herein.
Through a large number of experiments and verification, by adopting the technical scheme of the invention, through adopting an algorithm based on text size bounding boxes and semantic scores aiming at the position relation between pictures and text contents in each slide, the most relevant text can be accurately matched for each picture under the conditions that the positions of text blocks are changeable and a plurality of pictures exist, so that the corresponding relation between a plurality of pictures and a plurality of text messages in each page of an engineering slide can be rapidly determined, and the user can conveniently search and use.
According to an embodiment of the present invention, there is also provided a processor corresponding to a method of processing pictures and texts in a slide, the processor being configured to execute a program, wherein the program executes the above-described method of processing pictures and texts in a slide.
Since the processes and functions implemented by the processor of the present embodiment substantially correspond to the embodiments, principles and examples of the methods shown in fig. 1 to 8, the description of the present embodiment is not exhaustive, and reference may be made to the related descriptions in the foregoing embodiments, which are not repeated herein.
Through a large number of experiments it has been verified that, with the technical scheme of the invention, by addressing the position relation between pictures and text content in each slide and adopting natural language analysis algorithms such as Chinese word segmentation and TextRank in the process of determining the correspondence between the multiple pictures and the multiple pieces of text information in each page of an engineering slide, proper nouns such as person names and place names can be extracted, the subject words of longer paragraphs can also be extracted in correspondence with the subject of the slide, and the application range is wider.
In summary, it is readily understood by those skilled in the art that the above-described advantageous ways can be freely combined and superimposed without conflict.
The above description is only an example of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.
Claims (10)
1. A method for processing pictures and text in a slide, comprising:
determining picture information and text information of each slide in the slide file;
carrying out natural language analysis processing on text information of all pages of slides in the slide file to obtain named text sentences of all pages of slides in the slide file; the natural language analysis processing is carried out on the text information of all pages of slides in the slide file, and the natural language analysis processing comprises the following steps: determining each page of corpus information of each slide in the slide file and the full text corpus information of all pages of slides in the slide file based on the text information of all pages of slides in the slide file; performing natural language analysis processing on each page of corpus information and full-text corpus information to obtain a subject word set and a trivial word set of each page of slides, and a full-text subject word set and a full-text trivial word set of all pages of slides; determining a named file statement of all text information in the slide file based on each subject word set and each trivial word set of each slide and the full-text subject word sets and full-text trivial word sets of all slides; wherein determining a named file statement of all text information in the slide file comprises: according to the word size and the text bounding box information in the text information of each slide in the slide file, determining the text content with the largest word size in the part with the minimum data in the Y direction of the text bounding box information of each picture in each slide being larger than the set coefficient times of the height of each picture as the named text sentence of the picture in the slide; and analogizing to obtain named text sentences of all pictures in all pages of slides in the slide file, wherein the named text sentences are used as named file sentences of all text information in the slide file;
Matching the named text statement for each picture in the slide file according to the picture information in the slide file and the named text statement of the slide file to obtain the picture named text statement of each picture; wherein, match the text sentence of naming for every picture in the slide file, including: determining all named text sentences of each picture in a set direction range according to picture bounding box information in picture information of each picture in each slide to obtain all named text sentences of the picture; if all the named text sentences of the picture are empty, determining the named text sentences of the slides of the pages where the picture is located as the picture named text sentences of the picture; if all the named text sentences of the picture are not empty, determining weights according to all the named text sentences of the picture, and determining matching results of the picture and all the named text sentences according to all the named text sentences of the picture; and determining a named text sentence corresponding to the maximum value as a named text sentence matched with the picture according to the maximum value of the sum of the weight of the picture and the matching result of the picture, so as to take the named text sentence as a picture named text sentence of the picture;
Establishing a corresponding relation between the picture information and the text information in the slide file according to the picture information in the slide file and the picture naming text statement obtained by matching each picture;
wherein, the picture information of each slide in the slide file comprises: width and height of each slide, and picture bounding box information of each picture in each slide; text information of each slide in the slide file includes: text content, word size, and/or text bounding box information for all text statements for each page of the slide.
2. The method for processing pictures and text in a slide according to claim 1, wherein,
determining picture information and text information of each slide in the slide file comprises the following steps:
acquiring the width and the height of each slide in the slide file, and acquiring the picture bounding box information of each picture in each slide as the picture information of each slide in the slide file; the method comprises the steps of,
acquiring text content, word size and/or text bounding box information of all text sentences in each slide as text information of each slide in a slide file;
And/or the number of the groups of groups,
establishing a corresponding relation between picture information and text information in a slide file comprises the following steps:
and naming a text sentence for each picture in each slide in the slide file, constructing a storage result for each picture, and storing each picture according to the storage result.
3. The method for processing pictures and text in a slide according to claim 2, wherein,
determining each page of corpus information of each page of slides in the slide file and full-text corpus information of all pages of slides in the slide file comprises the following steps:
connecting text information of each slide in the slide file according to a first set sequence to form corpus information of each slide; the method comprises the steps of,
aiming at text information of all pages of slides in a slide file, eliminating text information repeatedly appearing at the same position in different pages of slides to obtain corrected text information of all pages of slides; the corrected text information of all the pages of slides is connected according to a second set sequence to form full text corpus information of all the pages of slides;
and/or the number of the groups of groups,
performing natural language analysis processing on each page of corpus information and full text corpus information, wherein the natural language analysis processing comprises the following steps:
Performing Chinese word segmentation on each corpus information in each page of corpus information and full-text corpus information by using an N shortest path word segmentation algorithm to obtain word segmentation results; in the word segmentation result, sentences in each corpus information are segmented into words and phrases with set specifications and set proper nouns, so that word segmentation is obtained;
selecting a set number of words in the word segmentation result of each corpus according to a set score order by using a TextRank algorithm as a subject word set of the corpus;
dividing words meeting the trivial word standard in the corpus into trivial word sets of the corpus;
and/or the number of the groups of groups,
constructing a storage result for each picture, and storing each picture according to the storage result, including:
for each slide, creating a folder and naming the folder by using a named text sentence of the slide; and storing each picture independently as an independent file, naming the independent file of each picture by using a picture naming text sentence, and placing the independent file in the newly created folder after naming the independent file of each picture;
establishing a storage record for the picture in a database; and in the storage record of the picture, determining and storing a picture full name field of the picture according to a picture naming text sentence of the picture.
4. A method for processing pictures and text in a slide according to claim 3, wherein,
determining weights according to all named text sentences of the picture, including:
calculating the weight according to the word size of each named text sentence in all named text sentences of the picture, the distance between each named text sentence of the picture and the center point of the picture, and the overlapping length between each named text sentence of the picture and the picture, wherein:
d_j is the distance between each named text statement of the picture and the center point of the picture, and l_j is the overlapping length between each named text statement of the picture and the picture;
and/or the number of the groups of groups,
determining a matching result of the picture and all named text sentences according to all named text sentences of the picture, wherein the matching result comprises the following steps:
the word segmentation words in each named text sentence of the picture are sequentially matched with the topic word set and the trivial word set of each page of the page where the picture is located, and the full-text topic word set and the full-text trivial word set of all pages of the slide, so that a matching result is obtained;
and/or the number of the groups of groups,
determining and storing a picture full name field of the picture according to a picture naming text sentence of the picture, including:
If the length of the Chinese characters in the picture naming text sentence of the picture is smaller than or equal to a preset value, directly storing the picture naming text sentence of the picture as a picture full name field of the picture;
if the length of the Chinese characters in the picture naming text sentence of the picture is larger than a preset value, extracting a set phrase in the picture naming text sentence of the picture by using an information entropy algorithm, and taking the set phrase as a new picture naming text sentence of the picture; storing a new picture naming text sentence of the picture as a picture full name field of the picture;
determining a picture full name field of the picture as a search engine word of the picture by using a quick word segmentation method, and storing the search engine word as a picture search word of the picture;
traversing all pictures in the slide file, and matching several pictures with the same picture search word as similar pictures to be stored in a database.
5. A device for processing pictures and text in a slide, comprising:
a determining unit for determining picture information and text information of each slide in the slide file;
the processing unit is used for carrying out natural language analysis processing on the text information of all the pages of slides in the slide file to obtain named text sentences of all the pages of slides in the slide file; the processing unit performs natural language analysis processing on text information of all pages of slides in the slide file, and the processing unit comprises the following steps: determining each page of corpus information of each slide in the slide file and the full text corpus information of all pages of slides in the slide file based on the text information of all pages of slides in the slide file; performing natural language analysis processing on each page of corpus information and full-text corpus information to obtain a subject word set and a trivial word set of each page of slides, and a full-text subject word set and a full-text trivial word set of all pages of slides; determining a named file statement of all text information in the slide file based on each subject word set and each trivial word set of each slide and the full-text subject word sets and full-text trivial word sets of all slides; wherein the processing unit determines a named file statement of all text information in the slide file, comprising: according to the word size and the text bounding box information in the text information of each slide in the slide file, determining the text content with the largest word size in the part with the minimum data in the Y direction of the text bounding box information of each picture in each slide being larger than the set coefficient times of the height of each picture as the named text sentence of the picture in the slide; and analogizing to obtain named text sentences of all pictures in all pages of slides in the slide file, wherein the named text sentences are used as named file sentences of all text information in the slide file;
The processing unit is also used for matching the named text statement for each picture in the slide file according to the picture information in the slide file and the named text statement of the slide file to obtain the picture named text statement of each picture; the processing unit matches named text sentences for each picture in the slide file, and comprises the following steps: determining all named text sentences of each picture in a set direction range according to picture bounding box information in picture information of each picture in each slide to obtain all named text sentences of the picture; if all the named text sentences of the picture are empty, determining the named text sentences of the slides of the pages where the picture is located as the picture named text sentences of the picture; if all the named text sentences of the picture are not empty, determining weights according to all the named text sentences of the picture, and determining matching results of the picture and all the named text sentences according to all the named text sentences of the picture; and determining a named text sentence corresponding to the maximum value as a named text sentence matched with the picture according to the maximum value of the sum of the weight of the picture and the matching result of the picture, so as to take the named text sentence as a picture named text sentence of the picture;
The processing unit is also used for establishing a corresponding relation between the picture information and the text information in the slide file according to the picture information in the slide file and the picture naming text statement obtained by matching each picture;
wherein, the picture information of each slide in the slide file comprises: width and height of each slide, and picture bounding box information of each picture in each slide; text information of each slide in the slide file includes: text content, word size, and/or text bounding box information for all text statements for each page of the slide.
6. The apparatus for processing pictures and text in a slide as set forth in claim 5 wherein,
the determination unit determines picture information and text information of each slide in the slide file, including:
acquiring the width and the height of each slide in the slide file, and acquiring the picture bounding box information of each picture in each slide as the picture information of each slide in the slide file; the method comprises the steps of,
acquiring text content, word size and/or text bounding box information of all text sentences in each slide as text information of each slide in a slide file;
And/or the number of the groups of groups,
the processing unit establishes a corresponding relation between picture information and text information in a slide file, and the method comprises the following steps:
and naming a text sentence for each picture in each slide in the slide file, constructing a storage result for each picture, and storing each picture according to the storage result.
7. The apparatus for processing pictures and text in a slide as set forth in claim 6 wherein,
the processing unit determines each page of corpus information of each page of slides in the slide file and full text corpus information of all pages of slides in the slide file, and comprises the following steps:
connecting text information of each slide in the slide file according to a first set sequence to form corpus information of each slide; the method comprises the steps of,
aiming at text information of all pages of slides in a slide file, eliminating text information repeatedly appearing at the same position in different pages of slides to obtain corrected text information of all pages of slides; the corrected text information of all the pages of slides is connected according to a second set sequence to form full text corpus information of all the pages of slides;
and/or the number of the groups of groups,
the processing unit performs natural language analysis processing on each page of corpus information and full text corpus information, and comprises the following steps:
Performing Chinese word segmentation on each corpus information in each page of corpus information and full-text corpus information by using an N shortest path word segmentation algorithm to obtain word segmentation results; in the word segmentation result, sentences in each corpus information are segmented into words and phrases with set specifications and set proper nouns, so that word segmentation is obtained;
selecting a set number of words in the word segmentation result of each corpus according to a set score order by using a TextRank algorithm as a subject word set of the corpus;
dividing words meeting the trivial word standard in the corpus into trivial word sets of the corpus;
and/or the number of the groups of groups,
the processing unit constructs a storage result for each picture, stores each picture according to the storage result, and comprises the following steps:
for each slide, creating a folder and naming the folder by using a named text sentence of the slide; and storing each picture independently as an independent file, naming the independent file of each picture by using a picture naming text sentence, and placing the independent file in the newly created folder after naming the independent file of each picture;
establishing a storage record for the picture in a database; and in the storage record of the picture, determining and storing a picture full name field of the picture according to a picture naming text sentence of the picture.
8. The apparatus for processing pictures and text in a slide as set forth in claim 7 wherein,
the processing unit determines weights according to all named text sentences of the picture, including:
calculating the weight according to the word size of each named text sentence in all named text sentences of the picture, the distance between each named text sentence of the picture and the center point of the picture, and the overlapping length between each named text sentence of the picture and the picture, wherein:
d_j is the distance between each named text statement of the picture and the center point of the picture, and l_j is the overlapping length between each named text statement of the picture and the picture;
and/or the number of the groups of groups,
the processing unit determines a matching result of the picture and all named text sentences according to all named text sentences of the picture, and the matching result comprises the following steps:
the word segmentation words in each named text sentence of the picture are sequentially matched with the topic word set and the trivial word set of each page of the page where the picture is located, and the full-text topic word set and the full-text trivial word set of all pages of the slide, so that a matching result is obtained;
And/or the number of the groups of groups,
the processing unit determines and stores the picture full name field of the picture according to the picture naming text statement of the picture, and comprises the following steps:
if the length of the Chinese characters in the picture naming text sentence of the picture is smaller than or equal to a preset value, directly storing the picture naming text sentence of the picture as a picture full name field of the picture;
if the length of the Chinese characters in the picture naming text sentence of the picture is larger than a preset value, extracting a set phrase in the picture naming text sentence of the picture by using an information entropy algorithm, and taking the set phrase as a new picture naming text sentence of the picture; storing a new picture naming text sentence of the picture as a picture full name field of the picture;
determining a picture full name field of the picture as a search engine word of the picture by using a quick word segmentation method, and storing the search engine word as a picture search word of the picture;
traversing all pictures in the slide file, and matching several pictures with the same picture search word as similar pictures to be stored in a database.
9. A terminal, comprising: a processing device for pictures and text in a slide according to any one of claims 5 to 8;
Alternatively, it includes:
a processor for executing a plurality of instructions;
a memory for storing a plurality of instructions;
wherein the plurality of instructions are to be stored by the memory and loaded and executed by the processor to perform the method of processing pictures and text in a slide as claimed in any one of claims 1 to 4.
10. A storage medium having a plurality of instructions stored therein; the plurality of instructions for loading and executing by a processor the method of processing pictures and text in a slide as claimed in any one of claims 1 to 4.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010512421.8A CN111681301B (en) | 2020-06-08 | 2020-06-08 | Method and device for processing pictures and texts in slide, terminal and storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111681301A CN111681301A (en) | 2020-09-18 |
| CN111681301B true CN111681301B (en) | 2023-05-09 |
Family
ID=72435511
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010512421.8A Active CN111681301B (en) | 2020-06-08 | 2020-06-08 | Method and device for processing pictures and texts in slide, terminal and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111681301B (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101510221A (en) * | 2009-02-17 | 2009-08-19 | 北京大学 | Enquiry statement analytical method and system for information retrieval |
| WO2017197593A1 (en) * | 2016-05-18 | 2017-11-23 | Nokia Technologies Oy | Apparatus, method and computer program product for recovering editable slide |
Non-Patent Citations (1)
| Title |
|---|
| 基于Word VBA的幻灯片打印模板制作 (Making a slide printing template based on Word VBA); 亢娟娜 (Kang Juanna); 中国包装工业 (China Packaging Industry), (12); full text * |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN107436864B (en) | Chinese question-answer semantic similarity calculation method based on Word2Vec | |
| CN112035730B (en) | Semantic retrieval method and device and electronic equipment | |
| CN105608218B (en) | The method for building up of intelligent answer knowledge base establishes device and establishes system | |
| CN112667794A (en) | Intelligent question-answer matching method and system based on twin network BERT model | |
| CN112069826A (en) | Vertical Domain Entity Disambiguation Method Fusing Topic Models and Convolutional Neural Networks | |
| CN111143672B (en) | A professional specialty scholar recommendation method based on knowledge graph | |
| CN106649868A (en) | Method and device for matching between questions and answers | |
| CN108763529A (en) | A kind of intelligent search method, device and computer readable storage medium | |
| CN118377783B (en) | SQL sentence generation method and device | |
| CN118917305B (en) | A RAG system optimization method, system, electronic device and storage medium | |
| CN105677795B (en) | Recommended method, recommendation apparatus and the recommender system of abstract semantics | |
| CN110750995A (en) | File management method based on user-defined map | |
| CN112784589B (en) | A method, device and electronic device for generating training samples | |
| CN112434211B (en) | Data processing method, device, storage medium and equipment | |
| CN109740164B (en) | Electric power defect grade identification method based on depth semantic matching | |
| JP2008282366A (en) | Question answering apparatus, question answering method, question answering program, and recording medium recording the program | |
| CN114238689A (en) | Video generation method, apparatus, electronic device, storage medium and program product | |
| CN114595379B (en) | Intelligent recommendation method and device for data standard | |
| CN118779429A (en) | Knowledge base storage and retrieval system and method based on retrieval enhancement generation | |
| EP2544100A2 (en) | Method and system for making document modules | |
| CN114707003B (en) | A method, device and storage medium for disambiguating the name of a paper author | |
| CN111681301B (en) | Method and device for processing pictures and texts in slide, terminal and storage medium | |
| CN113822521B (en) | Method, device and storage medium for detecting quality of question library questions | |
| CN120045696A (en) | Search enhancement generation method based on dynamic document block optimization | |
| CN115098642A (en) | Data processing method and device, computer equipment and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |