
CN120086699B - Smart construction site safety management and control method and system based on multi-source data analysis - Google Patents

Smart construction site safety management and control method and system based on multi-source data analysis

Info

Publication number
CN120086699B
CN120086699B (application CN202510580593.1A)
Authority
CN
China
Prior art keywords: features, text, matrix, feature, branch
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202510580593.1A
Other languages
Chinese (zh)
Other versions
CN120086699A (en)
Inventor
赵小永
刘立广
王鹏
杨武顺
宋伟青
王硕
宋奕霖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Railway Construction Group Co Ltd
Xian Engineering Co Ltd of China Railway Construction Group Co Ltd
Original Assignee
China Railway Construction Group Co Ltd
Xian Engineering Co Ltd of China Railway Construction Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Railway Construction Group Co Ltd, Xian Engineering Co Ltd of China Railway Construction Group Co Ltd filed Critical China Railway Construction Group Co Ltd
Priority to CN202510580593.1A priority Critical patent/CN120086699B/en
Publication of CN120086699A publication Critical patent/CN120086699A/en
Application granted granted Critical
Publication of CN120086699B publication Critical patent/CN120086699B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/10Pre-processing; Data cleansing
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/08Construction
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services
    • G06Q50/265Personal security, identity or safety

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Primary Health Care (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Computer Security & Cryptography (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Image Analysis (AREA)

Abstract

The invention is applicable to the field of intelligent construction sites, and provides an intelligent construction site safety control method and system based on multi-source data analysis. In the control method, a BERT model is adopted to extract features from a site information text and a weather information text respectively, so as to obtain text features. An improved SlowFast model is adopted to extract visual features from image information; a cross-modal relation matrix between the text features and the visual features is constructed, and the text features and the visual features are weighted and fused. The fused features are taken as the input of an MLP network to obtain preliminary behavior categories of workers in the image. The preliminary behavior category results are compared with the site features and the weather features, and the preliminary behavior categories are adjusted according to preset correction rules to obtain final behavior categories, on the basis of which the behaviors of workers on the construction site are safely controlled. By acquiring and fusing multi-source information and combining it with safety control measures, the control method can realize comprehensive, accurate and real-time monitoring and management of the behaviors of workers on the construction site.

Description

Intelligent building site safety control method and system based on multi-source data analysis
Technical Field
The invention belongs to the technical field of intelligent building site management and control, and particularly relates to an intelligent building site safety management and control method and system based on multi-source data analysis.
Background
The internal environment of a construction site is complex and changeable. Construction workers must operate according to the relevant regulations on the site, while also paying attention to the safety state of the area in which they are located.
At present, construction site safety management can reduce the probability of accidents to a certain extent. However, traditional site safety monitoring means have many defects: manual inspection can hardly cover the whole area and is easily affected by personnel quality and fatigue, and fixed safety notices and warnings have limited effect. Monitoring of construction workers is incomplete and discontinuous, safety supervision loopholes exist, and potential safety hazards cannot be found and warned of in time, which reduces the accuracy and reliability of site safety monitoring.
Disclosure of Invention
The embodiments of the invention aim to provide an intelligent construction site safety control method and system based on multi-source data analysis, so as to solve the technical problems described in the background section.
In order to achieve the above purpose, the present invention provides the following technical solutions.
The embodiment of the invention provides an intelligent building site safety control method based on multi-source data analysis, which comprises the following steps:
S1, acquiring multi-source information of a construction site, wherein the multi-source information comprises site information, weather information and image information;
S2, respectively carrying out text preprocessing on the site information and the weather information, respectively carrying out feature extraction on the preprocessed site information text and the preprocessed weather information text by adopting a BERT model to obtain site features and weather features, and carrying out splicing processing on the site features and the weather features to obtain text features;
S3, extracting visual features of image information by adopting an improved SlowFast model, wherein the improved SlowFast model comprises a fast branch and a slow branch, a double-layer attention module is introduced into the fast branch, a feature enhancement module is introduced into the slow branch, weights of the fast branch and the slow branch are adjusted based on a dynamic weight mechanism, and the outputs of the two branches are subjected to feature fusion according to the adjusted weights to obtain visual features;
S4, constructing a cross-modal relation matrix between the text and the visual features by calculating the Pearson correlation coefficient, and weighting and fusing the text features and the visual features according to the cross-modal relation matrix to obtain final fused features;
S5, taking the fusion features as the input of an MLP network, wherein multi-stage layered fusion residual errors are introduced into the MLP network, features are extracted through residual error MLP blocks of a plurality of stages, shallow layer features and deep layer features are fused in each stage, and the fusion features are mapped to corresponding behavior categories to obtain preliminary behavior categories of workers in the images;
S6, comparing the preliminary behavior category result with site characteristics and weather characteristics, and adjusting the preliminary behavior category according to a preset correction rule to obtain a final behavior category;
S7, safety control is conducted on the behaviors of the workers on the construction site based on the final behavior categories.
Further, in step S3, the step of introducing a dual-layer attention module in the fast branch includes:
S311, linearly transforming the input features to generate the query matrix, key matrix and value matrix of the attention mechanism;
S312, constructing a directed graph to establish attention relations among different regions: calculating the mean of the query matrix and of the key matrix within each region to generate a region query matrix and a region key matrix, and generating an adjacency matrix by taking the dot product of the region query matrix and the region key matrix, the adjacency matrix measuring the correlation among the different regions;
S313, pruning the adjacency matrix by retaining, for each region, the top k most highly correlated regions, so as to obtain a routing index matrix;
S314, aggregating the key-matrix and value-matrix tensors of all routed regions, based on an attention mechanism focused on the k routed regions, to generate an aggregated key matrix and value matrix;
S315, performing the attention operation on the aggregated key matrix and value matrix, and introducing a local context enhancement (LCE) term when deriving the result tensor, so as to obtain the output features of the fast branch.
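The regional routing attention of S311–S315 can be sketched as follows. This is a minimal numpy illustration in the spirit of BiFormer-style bi-level routing attention: random matrices stand in for the learned linear projections, and a simple residual with the value tensor stands in for the local context enhancement term (both assumptions, not the patent's exact design).

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def regional_routing_attention(x, n_regions=4, k=2, seed=0):
    """Sketch of the fast branch's dual-layer attention (S311-S315).

    x: (N, d) token features, N divisible by n_regions. Region-level
    routing first selects the top-k most correlated regions, then
    fine-grained attention runs only over the gathered tokens."""
    N, d = x.shape
    r = N // n_regions                         # tokens per region
    rng = np.random.default_rng(seed)

    # S311: linear projections (random weights stand in for learned ones)
    Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
    Q, K, V = x @ Wq, x @ Wk, x @ Wv

    # S312: region query/key = per-region means; adjacency by dot product
    Qr = Q.reshape(n_regions, r, d).mean(axis=1)
    Kr = K.reshape(n_regions, r, d).mean(axis=1)
    A = Qr @ Kr.T                              # region adjacency matrix

    # S313: prune to the top-k most correlated regions -> routing index matrix
    idx = np.argsort(-A, axis=1)[:, :k]

    out = np.empty_like(Q)
    for i in range(n_regions):
        # S314: gather key/value tensors of the k routed regions
        token_idx = np.concatenate([np.arange(j * r, (j + 1) * r) for j in idx[i]])
        Kg, Vg = K[token_idx], V[token_idx]
        # S315: fine-grained attention over the gathered tokens
        q = Q[i * r:(i + 1) * r]
        attn = softmax(q @ Kg.T / np.sqrt(d))
        out[i * r:(i + 1) * r] = attn @ Vg
    # local context enhancement: modelled here as a residual with V (assumption)
    return out + V
```

For a 16-token input split into 4 regions with k=2, each query attends to only 8 of the 16 keys, which is the routing module's intended saving.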
Further, in step S3, the feature enhancement module has a multi-branch structure: the output feature maps of the branches are spliced, the spliced feature map is added to the input feature map through a residual connection, and the result is passed through a ReLU activation function to obtain the output features of the slow branch.
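A minimal sketch of such a multi-branch feature enhancement module, with random channel-mixing matrices standing in for the learned branch convolutions (an assumption for illustration):

```python
import numpy as np

def feature_enhancement(x, n_branches=3, seed=0):
    """Sketch of the slow branch's feature enhancement module: several
    parallel branches, channel-wise splicing of their outputs, a residual
    connection with the input, then ReLU."""
    rng = np.random.default_rng(seed)
    c = x.shape[-1]
    # each branch maps c -> c // n_branches so the splice matches the input width
    branch_dim = c // n_branches
    outs = [x @ (rng.standard_normal((c, branch_dim)) * 0.1) for _ in range(n_branches)]
    spliced = np.concatenate(outs, axis=-1)    # (N, c) when c % n_branches == 0
    return np.maximum(spliced + x, 0.0)        # residual add + ReLU
```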
Further, in step S3, in the step of adjusting the weights of the fast branch and the slow branch based on the dynamic weight mechanism, the weight calculation formula is expressed as:
w = σ(conv3(conv1(P_s) − conv2(P_f)));
where P_s denotes the feature sequence of the slow branch after global average pooling; P_f denotes the feature sequence of the fast branch after global average pooling; conv1, conv2 and conv3 denote convolution operations; and σ denotes a sigmoid activation function that limits the output to the range 0 to 1, the output representing the magnitude of the weight.
Further, in step S3, the step of performing feature fusion on the outputs of the two branches according to the adjusted weights to obtain visual features includes:
denoting the output features of the slow branch and the fast branch as F_s and F_f respectively, where F_s represents the output features of the slow branch and F_f represents the output features of the fast branch;
performing global average pooling on the feature sequences of the slow branch and the fast branch respectively to obtain the pooling results P_s and P_f;
applying a convolution to each of the two pooling results and taking their difference to calculate the motion feature difference F between the fast and slow branches, expressed as: F = conv1(P_s) − conv2(P_f);
where conv1 and conv2 both represent convolution operations;
convolving the motion feature difference F and adopting a sigmoid activation function to generate the feature weight w, expressed as: w = σ(conv3(F));
where conv3 represents a convolution operation and σ represents the sigmoid activation function;
performing a point multiplication of the feature weight w with the slow-branch features F_s to generate the enhanced feature map F_e, where F_e = w ⊙ F_s;
and fusing the enhanced feature map F_e with the features of the fast branch as the subsequent input of the slow branch, thereby realizing the feature fusion of the fast and slow branches and obtaining the visual features.
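The dynamic weight mechanism above can be sketched as follows; the 1×1 convolutions are modelled as channel-mixing matrices and both branch feature sequences are assumed to share the same shape (assumptions for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dynamic_weight_fusion(f_slow, f_fast, seed=0):
    """Sketch of the dynamic weight mechanism of step S3: global average
    pooling on both branches, convolution-and-difference to get the motion
    feature difference F, a sigmoid-gated weight w, then an enhanced
    slow-branch map fused with the fast branch."""
    rng = np.random.default_rng(seed)
    c = f_slow.shape[-1]
    conv1, conv2, conv3 = (rng.standard_normal((c, c)) * 0.1 for _ in range(3))

    p_slow = f_slow.mean(axis=0)            # global average pooling over time
    p_fast = f_fast.mean(axis=0)
    F = p_slow @ conv1 - p_fast @ conv2     # motion feature difference
    w = sigmoid(F @ conv3)                  # per-channel weight in (0, 1)

    enhanced = w * f_slow                   # point-wise gating of the slow branch
    return enhanced + f_fast                # fuse with the fast branch
```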
Further, in step S4, the Pearson correlation coefficient ρ(X, Y) is expressed as:
ρ(X, Y) = cov(X, Y) / (σ_X σ_Y) = E[(X − μ_X)(Y − μ_Y)] / (σ_X σ_Y);
where cov(X, Y) is the covariance of X and Y; σ_X is the standard deviation of X; σ_Y is the standard deviation of Y; μ_X is the mean of X, μ_Y is the mean of Y, and E denotes the expected value.
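The coefficient can be computed directly from its definition:

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient: cov(X, Y) / (sigma_X * sigma_Y)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return cov / (x.std() * y.std())
```

Perfectly linearly related sequences give ±1; uncorrelated sequences give values near 0.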
Further, in step S4, the step of constructing a cross-modal relation matrix between the text and the visual features includes:
substituting the text features and the visual features into the variables of the Pearson correlation coefficient and constructing a relation fusion matrix, so as to obtain the relation matrix R_tv of the text features with respect to the visual features, where the numbers of text features and visual features are m and n respectively, so the matrix has dimension m × n; the matrix R_tv reflects the interrelationship between the text features and the visual features;
substituting the visual features and the text features into the variables of the Pearson correlation coefficient and constructing a relation fusion matrix, so as to obtain the relation matrix R_vt of the visual features with respect to the text features.
Further, in step S4, the step of weighting and fusing the text features and the visual features according to the cross-modal relation matrix to obtain the final fused features includes:
after obtaining the relation matrices R_tv and R_vt, weighting the visual features and the text features respectively, wherein:
the visual features are weighted through the text-to-visual relation matrix R_tv, expressed as V' = R_tv^T · T, where V' represents the visual features generated by weighting through the text-to-visual relation matrix;
the text features are weighted through the visual-to-text relation matrix R_vt, expressed as T' = R_vt^T · V, where T' represents the text features generated by weighting through the visual-to-text relation matrix;
T' and V' are the enhanced text features and visual features respectively;
finally, the enhanced text features and visual features are fused together by a weighted average, expressed as F_fuse = α·T' + β·V'; where F_fuse is the final fused feature vector, and α and β are hyper-parameters controlling how much the text features and the visual features contribute to the final fused feature.
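Putting the relation-matrix construction and the weighted fusion of step S4 together, a hedged numpy sketch (the exact weighting scheme is an assumption, since the text names the operations but not the matrix algebra):

```python
import numpy as np

def pearson(x, y):
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return cov / (x.std() * y.std())

def cross_modal_fusion(T, V, alpha=0.5, beta=0.5):
    """Sketch of step S4: build Pearson-based relation matrices between
    text features T (m, d) and visual features V (n, d), weight each
    modality by the other via the relation matrices, then fuse by a
    weighted average of the pooled enhanced features."""
    m, n = len(T), len(V)
    # cross-modal relation matrix: pairwise Pearson coefficients
    R_tv = np.array([[pearson(T[i], V[j]) for j in range(n)] for i in range(m)])
    R_vt = R_tv.T                          # visual-to-text relation matrix

    V_enh = R_tv.T @ T                     # (n, d) correlation-weighted visual features
    T_enh = R_vt.T @ V                     # (m, d) correlation-weighted text features

    # weighted-average fusion, alpha and beta as the contribution hyper-parameters
    return alpha * T_enh.mean(axis=0) + beta * V_enh.mean(axis=0)
```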
Further, in step S5, introducing multi-stage hierarchically fused residuals in the MLP network includes:
dividing the MLP network into a plurality of stages, each stage comprising a number of residual MLP blocks, the structure of each residual MLP block being expressed as:
H_l = H_{l−1} + MLP(H_{l−1});
where H_l represents the output of the l-th layer and MLP(·) represents a multi-layer perceptron;
at the end of each stage, fusing the output features of the current stage with the output features of the previous stage, expressed as:
F_s = concat(H_s, F_{s−1});
where F_s represents the fused features of the s-th stage and concat(·, ·) represents the feature splicing operation;
adding skip connections between different stages of the network to fuse shallow features with deep features, expressed as:
J_j = F_s + T(F_{s'}), with s' < s;
where J_j represents the output of the j-th skip connection and T(·) represents a transformation applied to the features;
inputting the fused features of the last stage into the classification layer, expressed as:
O = W·F_S + b;
where O represents the final classification output, and W and b represent the weight and bias of the classification layer respectively.
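A compact sketch of such a multi-stage residual MLP classifier, with random weights standing in for trained ones (an assumption for illustration):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_mlp_classifier(x, n_stages=2, n_classes=5, seed=0):
    """Sketch of step S5: stages of residual MLP blocks
    (H_l = H_{l-1} + MLP(H_{l-1})), with the stage outputs spliced
    together (shallow + deep features) before the classification layer."""
    rng = np.random.default_rng(seed)
    d = x.shape[-1]
    stage_outputs = []
    h = x
    for _ in range(n_stages):
        # one residual MLP block per stage
        W1 = rng.standard_normal((d, d)) * 0.1
        W2 = rng.standard_normal((d, d)) * 0.1
        h = h + relu(h @ W1) @ W2
        stage_outputs.append(h)

    # hierarchical fusion: splice shallow and deep stage features
    fused = np.concatenate(stage_outputs, axis=-1)

    # classification layer: O = W . F + b
    W = rng.standard_normal((fused.shape[-1], n_classes)) * 0.1
    b = np.zeros(n_classes)
    return fused @ W + b
```

The returned logits would be mapped to a preliminary behavior category by an argmax (or softmax) in a full pipeline.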
Another embodiment of the present invention provides an intelligent worksite safety control system based on multi-source data analysis, the control system comprising the following modules:
The data acquisition module is used for acquiring multi-source information of the construction site, wherein the multi-source information comprises site information, weather information and image information;
The text feature extraction module is used for respectively preprocessing the text of the field information and the weather information, respectively extracting features of the preprocessed field information text and the preprocessed weather information text by adopting the BERT model to obtain field features and weather features, and performing splicing processing on the field features and the weather features to obtain text features;
The visual feature extraction module is used for extracting visual features of image information by adopting an improved SlowFast model, wherein the improved SlowFast model comprises a fast branch and a slow branch, a dual-layer attention module is introduced into the fast branch, a feature enhancement module is introduced into the slow branch, weights of the fast branch and the slow branch are adjusted based on a dynamic weight mechanism, and the outputs of the two branches are subjected to feature fusion according to the adjusted weights to obtain visual features;
The feature fusion module is used for constructing a cross-modal relation matrix between the text and the visual features by calculating the Pearson correlation coefficient, and weighting and fusing the text features and the visual features according to the cross-modal relation matrix to obtain final fusion features;
The behavior classification module is used for taking the fusion characteristics as the input of the MLP network, wherein multi-stage layered fusion residual errors are introduced into the MLP network, characteristics are extracted through residual error MLP blocks of a plurality of stages, shallow layer characteristics and deep layer characteristics are fused in each stage, and the fusion characteristics are mapped to corresponding behavior categories to obtain preliminary behavior categories of workers in the image;
The behavior modification module is used for comparing the preliminary behavior category result with the site characteristics and the weather characteristics, and adjusting the preliminary behavior category according to a preset modification rule to obtain a final behavior category;
and the behavior control module is used for safely controlling the behaviors of the construction site workers based on the final behavior category.
Compared with the prior art, the intelligent building site safety control method and system based on multi-source data analysis have the beneficial effects that:
Firstly, by fusing site information, weather information and image information, the invention can understand the actual situation of a construction site more comprehensively. A BERT model is adopted to extract text features, so that the influence of the site and the weather can be considered in subsequent behavior recognition. Combined with the improved SlowFast model, introducing a dual-layer attention module into the fast branch allows motion information in the image sequence to be captured better and improves the recognition precision of worker behavior, while introducing a feature enhancement module into the slow branch enhances the extraction of key features relating to workers' use of safety equipment;
Secondly, constructing the cross-modal relation matrix by calculating the Pearson correlation coefficient quantifies the correlation between text features and visual features and provides a basis for cross-modal feature fusion. Weighting and fusing the text features and the visual features according to the relation matrix realizes deep fusion of multi-modal information, makes full use of site and weather information to assist behavior recognition, and improves the accuracy and robustness of behavior recognition;
Thirdly, the fused features are taken as the input of the MLP network. The improved MLP network, with its multi-stage hierarchically fused residual MLP blocks, enhances the adaptability of the network model to different feature modes and realizes preliminary classification of worker behaviors. Comparing the preliminary behavior classification result with the site features and weather features, and adjusting it according to preset correction rules, can further improve the accuracy of behavior recognition. The site features and weather features provide contextual information about the occurrence of the behavior, which helps to verify and correct the preliminary classification result and reduce misjudgment;
In summary, by acquiring and fusing multi-source information and combining it with safety control measures, the control method of the invention can realize comprehensive, accurate and real-time monitoring and management of the behaviors of workers on the construction site.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the following description will briefly introduce the drawings that are needed in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the present invention.
FIG. 1 is a flow chart of an implementation of the intelligent site safety control method based on multi-source data analysis of the present invention;
FIG. 2 is a sub-flowchart of the intelligent site security management and control method based on multi-source data analysis of the present invention;
FIG. 3 is another sub-flowchart of the intelligent worksite security management and control method based on multi-source data analysis of the present invention;
FIG. 4 is a further sub-flowchart of the intelligent worksite security management and control method based on multi-source data analysis of the present invention;
FIG. 5 is a block diagram of an intelligent worksite safety control system based on multi-source data analysis according to the present invention;
Fig. 6 is a block diagram of a computer device according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Specific implementations of the invention are described in detail below in connection with specific embodiments.
Referring to fig. 1, in one embodiment of the present invention, an intelligent site security control method based on multi-source data analysis is provided, the control method includes the following steps:
S1, acquiring multi-source information of a construction site, wherein the multi-source information comprises site information, weather information and image information;
By fusing site information, weather information and image information, the invention can understand the actual situation of the construction site more comprehensively. The site information mainly comprises the layout of the construction site, including the specific positions and shapes of buildings, construction areas, material stacking areas, passages and the like. Through the site information, the functions of different areas, such as aerial work areas, foundation pit work areas, welding work areas and common construction areas, can be clearly divided, and the safety level and special requirements of each area can be marked at the same time. The collection of weather information is realized in two main ways: first, a high-precision weather forecast API is called to acquire weather forecast data for the region where the construction site is located over the coming hours to days, including weather elements such as temperature, humidity, rainfall, wind force, wind direction and air pressure; second, reliable weather monitoring equipment, such as a weather station, is installed on the construction site to monitor its actual weather conditions in real time.
Furthermore, the image information is a key data source for behavior recognition, high-definition and high-frame-rate cameras are installed in each key area and key parts of the construction site, so that reasonable layout of the cameras is ensured, main operation areas, channels and dangerous areas of the construction site can be covered comprehensively, and monitoring blind areas are avoided.
S2, respectively carrying out text preprocessing on the site information and the weather information, respectively carrying out feature extraction on the preprocessed site information text and the preprocessed weather information text by adopting a BERT model to obtain site features and weather features, and carrying out splicing processing on the site features and the weather features to obtain text features;
Text features are extracted with a BERT model so that the influence of the site and the weather can be taken into account in subsequent behavior recognition. First, the site information text and the weather information text are each preprocessed to ensure the normalization and consistency of the data; preprocessing includes removing noise from the text, unifying the text format, word segmentation and the like. Then features are extracted from the preprocessed site text and weather text respectively, using the BERT model to capture rich semantic information and contextual relationships. Specifically, the preprocessed site text and weather text are each fed into a pre-trained BERT model, which encodes each word or sentence through its multi-layer bidirectional Transformer structure and generates the corresponding feature vectors, yielding the site features and weather features. Finally, the site features and the weather features are spliced to form a comprehensive text feature vector.
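A minimal sketch of the preprocessing and splicing described above; the BERT encoding itself is omitted, and the feature vectors passed to the splicing step stand in for its pooled outputs (an assumption for illustration):

```python
import re
import numpy as np

def preprocess(text):
    """Minimal text preprocessing as described in step S2: strip noise
    characters and unify the format (whitespace, lower-casing)."""
    text = re.sub(r"[^\w\s]", " ", text)     # remove punctuation noise
    return re.sub(r"\s+", " ", text).strip().lower()

def splice_features(site_feat, weather_feat):
    """Splice (concatenate) the site and weather feature vectors into one
    comprehensive text feature vector."""
    return np.concatenate([site_feat, weather_feat])
```

For example, two 4-dimensional pooled vectors splice into one 8-dimensional text feature vector.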
S3, extracting visual features of image information by adopting an improved SlowFast model, wherein the improved SlowFast model comprises a fast branch and a slow branch, a double-layer attention module is introduced into the fast branch, a feature enhancement module is introduced into the slow branch, weights of the fast branch and the slow branch are adjusted based on a dynamic weight mechanism, and the outputs of the two branches are subjected to feature fusion according to the adjusted weights to obtain visual features;
The invention combines the improved SlowFast model, can better capture the motion information in the image sequence and improve the recognition accuracy of the worker behavior by introducing a double-layer attention module into the fast branch, and can enhance the extraction capability of the worker for using the key features of the safety equipment by introducing a feature enhancement module into the slow branch;
S4, constructing a cross-modal relation matrix between the text and the visual features by calculating the Pearson correlation coefficient, and weighting and fusing the text features and the visual features according to the cross-modal relation matrix to obtain final fused features;
According to the invention, the cross-modal relation matrix is constructed by calculating the Pearson correlation coefficient, so that the correlation between text features and visual features can be quantized, a basis is provided for cross-modal feature fusion, the text features and the visual features are weighted and fused according to the relation matrix, the deep fusion of multi-modal information can be realized, the site and weather information are fully utilized to assist behavior recognition, and the accuracy and the robustness of the behavior recognition are improved;
S5, taking the fusion features as the input of an MLP network, wherein multi-stage layered fusion residual errors are introduced into the MLP network, features are extracted through residual error MLP blocks of a plurality of stages, shallow layer features and deep layer features are fused in each stage, and the fusion features are mapped to corresponding behavior categories to obtain preliminary behavior categories of workers in the images;
S6, comparing the preliminary behavior category result with site characteristics and weather characteristics, and adjusting the preliminary behavior category according to a preset correction rule to obtain a final behavior category;
S7, safety control is carried out on the behaviors of the workers on the construction site based on the final behavior category;
According to the invention, the fused features are taken as the input of the MLP network, and the improved MLP network with multi-stage hierarchically fused residual MLP blocks realizes preliminary classification of worker behaviors through feature fusion and residual connection, thereby enhancing the adaptability of the network model to different feature modes;
Further, the preliminary behavior category results are compared with the site characteristics and the weather characteristics, and are adjusted according to the preset correction rules, so that the accuracy of behavior identification can be further improved;
The site features and the weather features provide contextual information of behavior occurrence, which is helpful for verifying and correcting the preliminary classification result and reducing erroneous judgment.
In the embodiment of the invention, combining the worker's position with the site information makes it possible to determine the specific area in which the worker is currently located when managing and controlling worker behavior, for example whether the worker is in an overhead working area, near a foundation pit, in a tower crane operation area, or in a general construction passage. Different areas correspond to different expected behaviors and different safety requirements, which provides key context information for spatial perception and behavior localization. In addition, the behavior of workers in certain areas can be reasonably anticipated from the functional division of the site. For example, in a material storage area it is normal for workers to handle material, but if a worker performs welding or cutting in that area there may be a safety hazard.
Therefore, in the embodiment of the invention, fusing the site information with multi-source information such as weather information and image information provides a more comprehensive context. For example, in an overhead working area in strong wind, a worker's body may lean; combining the site information (overhead working area) and the weather information (strong wind), it can be judged more accurately that this is a normal reaction of the worker trying to keep balance in a strong-wind environment rather than a dangerous rule-violating behavior. Multi-source information fusion allows behavior recognition to fully consider site factors, thereby improving recognition accuracy.
In step S6 of the invention, the preliminary behavior category result is compared with the site features and the weather features, and the behavior category result is revised based on the comparison. This takes into account that a construction site is a complex environment with many uncertainties: weather can change suddenly, and the behavior patterns of workers can change greatly as a result, for example slower walking or more cautious operating actions. The subsequent behavior correction step of the invention can therefore further correct the behavior recognition result by considering additional factors such as the trend and speed of the weather change.
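The patent does not disclose the concrete content of the "predetermined correction rules" in step S6; as a hedged illustration only, a minimal rule table keyed on (site zone, weather, preliminary class) might look like the following sketch, in which all zone, weather and class names are hypothetical:

```python
# Hypothetical sketch of step-S6 correction rules; the rule entries below are
# illustrative assumptions, not the patent's actual rule table.
CORRECTION_RULES = [
    # (site zone, weather or None, preliminary class) -> corrected class
    (("aerial_work_area", "strong_wind", "dangerous_leaning"), "balancing_normal"),
    (("material_storage", None, "welding"), "unauthorized_hot_work"),
]

def correct_behavior(preliminary, site_zone, weather):
    """Compare the preliminary class with site/weather context and revise it."""
    for (zone, wx, cls), corrected in CORRECTION_RULES:
        if cls == preliminary and zone == site_zone and (wx is None or wx == weather):
            return corrected
    return preliminary  # no rule fired: keep the preliminary class
```

A rule with `None` as the weather field fires regardless of weather, which matches the idea that some corrections depend only on the site zone.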
Referring to fig. 2, in step S3 of the embodiment of the disclosure, the step of introducing a dual-layer attention module in the fast branch includes:
S311, linearly transforming the input features to generate the query matrix, key matrix and value matrix of the attention mechanism;
S312, constructing a directed graph and establishing attention relations between different regions: the averages of the query matrix and of the key matrix are computed within each region to produce a region query matrix and a region key matrix, and the dot product of the region query matrix and the region key matrix generates an adjacency matrix, the adjacency matrix measuring the correlation between different regions;
S313, pruning the adjacency matrix by retaining, for each region, the top k most highly correlated regions, to obtain a routing index matrix;
S314, based on the attention mechanism focused on the k routed regions, aggregating the key matrix and value matrix tensors of all routed regions to generate an aggregated key matrix and an aggregated value matrix;
S315, performing the attention operation on the aggregated key matrix and value matrix, and introducing a local context enhancement term LEC to derive the result tensor, obtaining the output features of the fast branch.
Further, in step S3, the feature enhancement module is a multi-branch structure: the output feature maps of the multiple branches are concatenated, the concatenated feature map is added to the input feature map through a residual connection, and the sum is passed through a ReLU activation function to obtain the output feature of the slow branch. In one implementation of the multi-branch structure, three branches are used, specifically comprising:
a first branch with two convolution layers: one convolution layer uses a 1×1 convolution kernel with stride stride and adjusts the channel number of the input feature map to 2×inter_planes; the other uses a 3×3 convolution kernel with stride 1 and padding 1. The first branch is used to extract local features; it can increase the channel number of the features and extract richer local features without changing the size of the feature map;
a second branch with four convolution layers: one uses a 1×1 convolution kernel with stride 1 and adjusts the channel number of the input feature map to inter_planes; another uses a 1×3 convolution kernel with stride stride and padding (0, 1) to expand the receptive field in the width direction; a third uses a 3×1 convolution kernel with stride stride and padding (1, 0) to expand the receptive field in the height direction; and a fourth uses a 3×3 convolution kernel with stride 1, padding 5 and dilation rate 5 to further extract dilated-convolution features, expanding the receptive field and capturing more context information;
a third branch with three convolution layers: one convolution layer uses a 1×1 convolution kernel with stride stride and adjusts the channel number of the input feature map to 2×inter_planes; another uses a 3×1 convolution kernel with stride stride and padding (1, 0); and the last uses a 1×3 convolution kernel with stride stride and padding (0, 1). Extracting features with the asymmetric kernels applied in a different order enriches feature diversity;
finally, the feature maps output by the three branches are concatenated along the channel dimension, added to the input feature map through a residual connection, and passed through a ReLU activation function to obtain the output feature of the slow branch.
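As a consistency check on the three branches above, the standard convolution output-size formula out = ⌊(n + 2p − d(k−1) − 1)/s⌋ + 1 shows that, with stride 1 (the case where the feature-map size is preserved), all three branches keep the same spatial size and can therefore be concatenated channel-wise. A small sketch, taking the kernel/padding/dilation values from the description and fixing stride to 1 for illustration:

```python
def conv_out(n, k, s=1, p=0, d=1):
    """Spatial output size of a convolution layer (standard formula)."""
    return (n + 2 * p - d * (k - 1) - 1) // s + 1

def chain(n, layers):
    """Apply a sequence of (kernel, padding, dilation) layers along one axis."""
    for k, p, d in layers:
        n = conv_out(n, k, p=p, d=d)
    return n

# Per-axis (kernel, padding, dilation) for each branch, from the description.
BRANCH_H = [
    [(1, 0, 1), (3, 1, 1)],                        # branch 1: 1x1, then 3x3 pad 1
    [(1, 0, 1), (1, 0, 1), (3, 1, 1), (3, 5, 5)],  # branch 2: 1x1, 1x3, 3x1, dilated 3x3
    [(1, 0, 1), (3, 1, 1), (1, 0, 1)],             # branch 3: 1x1, 3x1, 1x3
]
BRANCH_W = [
    [(1, 0, 1), (3, 1, 1)],
    [(1, 0, 1), (3, 1, 1), (1, 0, 1), (3, 5, 5)],
    [(1, 0, 1), (1, 0, 1), (3, 1, 1)],
]

def branch_shapes(h, w):
    """Spatial (H, W) output of each branch for an h-by-w input, stride 1."""
    return [(chain(h, bh), chain(w, bw)) for bh, bw in zip(BRANCH_H, BRANCH_W)]
```

Note in particular that the dilated 3×3 layer (dilation 5, padding 5) is exactly size-preserving, which is why branch 2 still lines up with the other two.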
In the embodiment of the present invention, in the step of adjusting the weights of the fast branch and the slow branch based on the dynamic weight mechanism provided in step S3, the weight calculation formula is expressed as: W = σ(conv3(conv1(P_s) − conv2(P_f))); where P_s denotes the result of global average pooling of the feature sequence of the slow branch; P_f denotes the result of global average pooling of the feature sequence of the fast branch; conv1, conv2 and conv3 denote convolution operations; and σ denotes the sigmoid activation function, which limits the output result to the range 0 to 1, the output result representing the magnitude of the weight.
As shown in fig. 3, in step S3, the step of performing feature fusion on the outputs of the two branches according to the adjusted weights to obtain visual features includes:
S321, obtaining the output features of the fast branch and the slow branch, denoted X_f and X_s respectively, where X_s denotes the output feature of the slow branch and X_f denotes the output feature of the fast branch;
S322, performing global average pooling on the feature sequences of the slow branch and the fast branch respectively to obtain the pooled results P_s and P_f;
S323, applying convolution operations to the two pooled results and taking their difference to compute the motion feature difference F of the fast and slow branches, expressed as: F = conv1(P_s) − conv2(P_f); where conv1 and conv2 each denote a convolution operation;
S324, applying a convolution operation to the motion feature difference F and generating the feature weight M with a sigmoid activation function, expressed as: M = σ(conv3(F)); where conv3 denotes a convolution operation and σ denotes the sigmoid activation function;
S325, performing a point-wise multiplication of the feature weight M with the slow-branch feature X_s to generate the enhanced feature map X_e, where X_e = M ⊙ X_s;
S326, fusing the enhanced feature map X_e with the features of the fast branch as the subsequent input of the slow branch, realizing the feature fusion of the fast and slow branches and obtaining the visual features.
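Putting the fusion steps together, a compact NumPy sketch is given below. It assumes the fast- and slow-branch features have already been brought to a common (C, H, W) shape, writes P_s and P_f for the pooled slow/fast features, and stands in plain channel-mixing matrices for the convolutions conv1–conv3 (which act on pooled vectors); the final additive fusion with the fast branch is likewise an assumption, since the patent only says the branches are "fused":

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fuse_fast_slow(xs, xf, w1, w2, w3):
    """Sketch of the fast/slow fusion: xs, xf are (C, H, W); w1-w3 are (C, C)."""
    ps = xs.mean(axis=(1, 2))           # global average pooling of slow features
    pf = xf.mean(axis=(1, 2))           # global average pooling of fast features
    f = w1 @ ps - w2 @ pf               # motion feature difference F
    m = sigmoid(w3 @ f)                 # per-channel feature weight M in (0, 1)
    xe = m[:, None, None] * xs          # enhanced map X_e = M ⊙ X_s
    return xe + xf                      # assumed additive fusion with fast branch
```

Because M is squashed into (0, 1) by the sigmoid, it acts as a soft per-channel gate on the slow-branch features rather than a hard selection.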
Further, in step S4, the Pearson correlation coefficient ρ(X, Y) is expressed as: ρ(X, Y) = cov(X, Y) / (σ_X · σ_Y), with cov(X, Y) = E[(X − μ_X)(Y − μ_Y)];
In the formula, cov(X, Y) is the covariance of X and Y; σ_X is the standard deviation of X; σ_Y is the standard deviation of Y; μ_X is the mean of X; μ_Y is the mean of Y; and E denotes the expected value.
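The coefficient is straightforward to compute directly from its definition; a short NumPy function (using population means and standard deviations, which agrees with the normalized form of the coefficient):

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation: cov(X, Y) / (sigma_X * sigma_Y)."""
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()   # E[(X - mu_X)(Y - mu_Y)]
    return cov / (x.std() * y.std())
```

The sample-vs-population choice of denominator cancels in the ratio, so this matches `np.corrcoef` exactly.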
Further, in step S4, the step of constructing the cross-modal relation matrix between the text and visual features comprises: substituting the text features and the visual features into the variables of the Pearson correlation coefficient to construct a relation fusion matrix, obtaining the text-to-visual relation fusion matrix R_tv of the text features and the visual features, where m and n are the numbers of text features and visual features and the dimension of the matrix is m×n; the matrix R_tv reflects the interrelationship between the text features and the visual features; and substituting the visual features and the text features into the variables of the Pearson correlation coefficient to construct a relation fusion matrix, using the Pearson correlation coefficient to obtain the visual-to-text relation matrix R_vt of the visual features and the text features.
In step S4 of the invention, the step of weighting and fusing the text features and the visual features according to the cross-modal relation matrix to obtain the final fused features comprises: after the relation matrices R_tv and R_vt are obtained, weighting the visual features and the text features respectively, wherein the visual features are weighted through the text-to-visual relation matrix R_tv, expressed as: F'_v = R_tv · F_v, where F'_v denotes the visual features generated by weighting with the text-to-visual relation matrix; the text features are weighted through the visual-to-text relation matrix R_vt, expressed as: F'_t = R_vt · F_t, where F'_t denotes the text features generated by weighting with the visual-to-text relation matrix; F_t and F_v are the text features and the visual features respectively; and finally, the weighted text features and visual features are fused together by a weighted average method, expressed as: F_fuse = α·F'_t + β·F'_v; where F_fuse is the final fused feature vector, and α and β are hyperparameters that control how much the text features and the visual features contribute to the final fused feature.
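The patent leaves the exact shapes of the relation matrices and feature stacks implicit; the sketch below assumes equal numbers of text and visual feature vectors (rows of T and V) so that the matrix products and the weighted average are all well-defined. The pairwise Pearson coefficients form the relation matrices:

```python
import numpy as np

def pearson_rows(A, B):
    """Pairwise Pearson coefficients between rows of A (m, d) and B (n, d)."""
    A = (A - A.mean(1, keepdims=True)) / A.std(1, keepdims=True)
    B = (B - B.mean(1, keepdims=True)) / B.std(1, keepdims=True)
    return A @ B.T / A.shape[1]

def cross_modal_fuse(T, V, alpha=0.5, beta=0.5):
    """Step-S4 sketch; assumes T and V have the same number of rows."""
    R_tv = pearson_rows(T, V)   # text-to-visual relation matrix
    R_vt = pearson_rows(V, T)   # visual-to-text relation matrix
    V_w = R_tv @ V              # visual features weighted by R_tv
    T_w = R_vt @ T              # text features weighted by R_vt
    return alpha * T_w + beta * V_w   # weighted-average fusion with alpha, beta
```

With unequal feature counts the same idea applies, but a projection to a common dimension would be needed before the final weighted average.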
Further, referring to fig. 4, in step S5, a multi-stage hierarchical fusion residual is introduced into the MLP network, including:
S411, dividing the MLP network into multiple stages, each stage comprising several residual MLP blocks, the structure of each residual MLP block being expressed as: x_{l+1} = x_l + MLP(x_l); where x_l denotes the output of the l-th layer and MLP(·) denotes a multi-layer perceptron;
S412, at the end of each stage, fusing the output features of the current stage with the output features of the previous stage, expressed as: F_s = concat(O_s, O_{s−1}); where F_s denotes the fused feature of the s-th stage, O_s and O_{s−1} denote the outputs of the current and previous stages, and concat(·, ·) denotes the feature concatenation operation;
S413, adding skip connections between different stages of the network to fuse shallow features and deep features, the skip connection being expressed as: H_j = x_j + T(x_i); where H_j denotes the output of the j-th skip connection and T(·) denotes a transformation applied to the feature;
S414, inputting the fused feature of the last stage into the classification layer, expressed as: O = W·F_S + b; where O denotes the final classification output, F_S denotes the fused feature of the last stage, and W and b denote the weight and bias of the classification layer respectively.
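The multi-stage residual structure can be sketched in NumPy as follows. This is a minimal reading of S411 to S414: a two-layer perceptron stands in for MLP(·), and a single projection matrix both reduces each concatenated (current, previous) pair back to width d and plays the role of the inter-stage transformation T(·); the patent does not fix these details:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_mlp_block(x, w1, w2):
    """S411: x_{l+1} = x_l + MLP(x_l), with a two-layer MLP."""
    return x + relu(x @ w1) @ w2

def multistage_mlp(x, stages, proj, w_cls, b_cls):
    """S411-S414 sketch. stages: list of lists of (w1, w2) block weights;
    proj: (2d, d) projection applied after each stage's concat fusion."""
    prev = x
    for blocks in stages:
        h = prev
        for w1, w2 in blocks:
            h = residual_mlp_block(h, w1, w2)
        fused = np.concatenate([h, prev], axis=-1)   # S412: fuse with previous stage
        prev = fused @ proj                          # S413-style transform back to width d
    return prev @ w_cls + b_cls                      # S414: classification layer
```

Reusing `prev` in each concatenation is what carries the shallow features forward alongside the deep ones.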
In summary, the control method of the invention can realize comprehensive, accurate and real-time monitoring and management of the actions of workers on the construction site by acquiring and fusing the multisource information and combining the safety control measures.
Referring to fig. 5, in another embodiment of the present invention, an intelligent site safety control system based on multi-source data analysis is provided, the control system includes the following modules:
a data acquisition module 81, configured to acquire multi-source information of a worksite, where the multi-source information includes site information, weather information, and image information;
the text feature extraction module 82 is configured to perform text preprocessing on the site information and the weather information, perform feature extraction on the preprocessed site information text and weather information text by using a BERT model, obtain site features and weather features, and perform splicing processing on the site features and the weather features, so as to obtain text features;
the visual feature extraction module 83 is configured to extract visual features of the image information by using an improved SlowFast model; the improved SlowFast model comprises a fast branch and a slow branch, a dual-layer attention module is introduced into the fast branch, a feature enhancement module is introduced into the slow branch, the weights of the fast branch and the slow branch are adjusted based on a dynamic weight mechanism, and the outputs of the two branches are feature-fused according to the adjusted weights to obtain visual features;
the feature fusion module 84 is configured to construct a cross-modal relation matrix between the text and the visual feature by calculating the pearson correlation coefficient, and weight-fuse the text feature and the visual feature according to the cross-modal relation matrix to obtain a final fusion feature;
The behavior classification module 85 is configured to take the fusion feature as an input of an MLP network, wherein a multi-stage layered fusion residual is introduced into the MLP network, the feature is extracted through residual MLP blocks of multiple stages, shallow and deep features are fused in each stage, and the fusion feature is mapped to a corresponding behavior class to obtain a preliminary behavior class of a worker in the image;
the behavior modification module 86 is configured to compare the preliminary behavior category result with the site feature and the weather feature, and adjust the preliminary behavior category according to a predetermined modification rule to obtain a final behavior category;
The behavior control module 87 is configured to safely control the behavior of the construction site worker based on the final behavior category.
As shown in fig. 6, the computer device includes a processor, a memory, a network interface, an input device, and a display screen connected by a system bus. The memory includes a nonvolatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program which, when executed by a processor, causes the processor to implement a smart site security management method based on multi-source data analysis.
The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform a smart worksite security management method based on multi-source data analysis.
It will be appreciated by those skilled in the art that the structure shown in FIG. 6 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In one embodiment, a computer readable storage medium is provided, and a computer program is stored on the computer readable storage medium, and when the computer program is executed by a processor, the processor is caused to execute the intelligent building site safety management method based on multi-source data analysis provided in the above embodiment.
It should be understood that, although the steps in the flowcharts of the embodiments of the present invention are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in the various embodiments may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and these sub-steps or stages are not necessarily performed sequentially but may be performed in turn or alternately with at least some of the sub-steps or stages of other steps.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing relevant hardware, where the program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. The non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above-described embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above-described embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The foregoing examples illustrate only a few embodiments of the invention and are described in detail herein without thereby limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims (9)

1.基于多源数据分析的智慧工地安全管控方法,其特征在于,该管控方法包括以下步骤:1. A smart construction site safety management and control method based on multi-source data analysis, characterized in that the management and control method comprises the following steps: S1、获取工地的多源信息,多源信息包括场地信息、天气信息和图像信息;S1. Acquire multi-source information of the construction site, including site information, weather information and image information; S2、将场地信息和天气信息分别进行文本预处理,并采用BERT模型分别对预处理后的场地信息文本和天气信息文本进行特征提取,得到场地特征和天气特征,对场地特征和天气特征进行拼接处理,得到文本特征;S2. Preprocess the venue information and weather information respectively, and use the BERT model to extract features from the preprocessed venue information text and weather information text respectively to obtain venue features and weather features, and concatenate the venue features and weather features to obtain text features; S3、采用改进SlowFast模型提取图像信息的视觉特征;改进SlowFast模型包括快支路和慢支路,在快支路中引入双层注意力模块,在慢支路中引入特征增强模块,特征增强模块为多分支结构,将多个分支的输出特征图进行拼接;拼接后特征图与输入特征图通过残差连接相加,经过ReLU激活函数输出得到慢支路的输出特征,基于动态权重机制调整快支路和慢支路的权重,根据调整的权重将两个支路的输出进行特征融合,得到视觉特征;S3. Use the improved SlowFast model to extract the visual features of image information; the improved SlowFast model includes a fast branch and a slow branch, a double-layer attention module is introduced in the fast branch, and a feature enhancement module is introduced in the slow branch. The feature enhancement module is a multi-branch structure, and the output feature maps of multiple branches are spliced; the spliced feature map is added to the input feature map through a residual connection, and the output feature of the slow branch is obtained through the ReLU activation function output, and the weights of the fast branch and the slow branch are adjusted based on the dynamic weight mechanism. The outputs of the two branches are feature fused according to the adjusted weights to obtain visual features; S4、通过计算皮尔逊相关系数,构建文本和视觉特征之间的跨模态关系矩阵,根据跨模态关系矩阵加权融合文本特征和视觉特征,得到最终的融合特征;S4. 
By calculating the Pearson correlation coefficient, a cross-modal relationship matrix between text and visual features is constructed, and the text features and visual features are weightedly fused according to the cross-modal relationship matrix to obtain the final fusion features; S5、将融合特征作为MLP网络的输入,其中,在MLP网络引入多阶段分层融合残差,通过多个阶段的残差MLP块提取特征,并在每个阶段融合浅层和深层特征,将融合特征映射到对应的行为类别,得到图像中工人的初步行为类别;S5. The fused features are used as the input of the MLP network, wherein a multi-stage hierarchical fusion residual is introduced into the MLP network, features are extracted through residual MLP blocks of multiple stages, shallow and deep features are fused at each stage, and the fused features are mapped to the corresponding behavior categories to obtain the preliminary behavior categories of the workers in the image; 所述在MLP网络引入多阶段分层融合残差的步骤中,将MLP网络分为多个阶段,每个阶段包含若干残差MLP块,在每个阶段的末尾,将当前阶段的输出特征与前一阶段的输出特征进行融合,在网络的不同阶段之间添加跳跃连接,融合浅层特征和深层特征,将最后一阶段的融合特征输入到分类层;In the step of introducing multi-stage hierarchical fusion residuals in the MLP network, the MLP network is divided into multiple stages, each stage contains a number of residual MLP blocks, at the end of each stage, the output features of the current stage are fused with the output features of the previous stage, skip connections are added between different stages of the network, shallow features and deep features are fused, and the fused features of the last stage are input into the classification layer; S6、将初步行为类别结果与场地特征、天气特征进行对比,根据预定的修正规则,对初步行为类别进行调整,得到最终行为类别;S6, comparing the preliminary behavior category results with the site characteristics and weather characteristics, and adjusting the preliminary behavior category according to a predetermined correction rule to obtain a final behavior category; S7、基于最终行为类别,对工地工人的行为进行安全管控。S7. Based on the final behavior category, safety management and control of the behavior of construction site workers is carried out. 2.根据权利要求1所述的基于多源数据分析的智慧工地安全管控方法,其特征在于,在步骤S3中,在快支路中引入双层注意力模块的步骤,包括:2. 
The smart construction site safety management and control method based on multi-source data analysis according to claim 1 is characterized in that, in step S3, the step of introducing a double-layer attention module in the fast branch includes: S311、将输入特征进行线性变化,产生注意力机制中的查询矩阵、键矩阵和值矩阵;S311, linearly transform the input features to generate a query matrix, a key matrix, and a value matrix in the attention mechanism; S312、构建有向图并建立不同区域之间的注意力关系,计算每个区域中查询矩阵和键矩阵的平均值,产生区域查询矩阵和区域键矩阵,点积区域查询矩阵和区域键矩阵生成邻接矩阵,邻接矩阵用于衡量不同区域之间的相关性;S312, construct a directed graph and establish attention relationships between different regions, calculate the average values of the query matrix and the key matrix in each region, generate a region query matrix and a region key matrix, and generate an adjacency matrix by performing a dot product of the region query matrix and the region key matrix. The adjacency matrix is used to measure the correlation between different regions; S313、对邻接矩阵进行剪枝处理,包括前k个高相关性区域邻接矩阵,得到路由索引矩阵;S313, pruning the adjacency matrix, including the first k high-correlation region adjacency matrices, to obtain a routing index matrix; S314、基于集中于k个路由区域的注意力机制,聚合所有路由区域的键矩阵和值矩阵张量,以生成聚合的键矩阵和值矩阵;S314, based on the attention mechanism focused on the k routing areas, aggregating the key matrix and value matrix tensors of all routing areas to generate an aggregated key matrix and value matrix; S315、对聚合的键矩阵和值矩阵进行注意力操作,引入局部上下文增强项LEC来导出结果张量,得到快支路的输出特征。S315. Perform an attention operation on the aggregated key matrix and value matrix, introduce a local context enhancement term LEC to derive the result tensor, and obtain the output features of the fast branch. 3.根据权利要求2所述的基于多源数据分析的智慧工地安全管控方法,其特征在于,在步骤S3中,基于动态权重机制调整快支路和慢支路的权重的步骤中,权重计算公式表示为:3. 
The smart construction site safety management and control method based on multi-source data analysis according to claim 2 is characterized in that, in step S3, in the step of adjusting the weights of the fast branch and the slow branch based on the dynamic weight mechanism, the weight calculation formula is expressed as: ; 其中,表示慢支路的特征序列经过全局平均池化结果;表示快支路的特征序列经过全局平均池化结果;conv1、conv2、conv3均表示卷积操作;σ表示sigmoid激活函数,用于将输出结果限制在0到1范围内,输出结果用于表示权重的大小。in, Indicates the result of global average pooling of the feature sequence of the slow branch; It represents the result of global average pooling of the feature sequence of the fast branch; conv1, conv2, and conv3 all represent convolution operations; σ represents the sigmoid activation function, which is used to limit the output result to the range of 0 to 1. Used to indicate the size of the weight. 4.根据权利要求3所述的基于多源数据分析的智慧工地安全管控方法,其特征在于,在步骤S3中,根据调整的权重将两个支路的输出进行特征融合,得到视觉特征的步骤,包括:4. The method for intelligent construction site safety management and control based on multi-source data analysis according to claim 3 is characterized in that, in step S3, the outputs of the two branches are subjected to feature fusion according to the adjusted weights to obtain visual features, comprising: 获取快支路的输出特征和慢支路的输出特征,分别记为,其中,表示慢支路的输出特征,表示快支路的输出特征;Get the output characteristics of the fast branch and the output characteristics of the slow branch, which are recorded as and ,in, represents the output characteristics of the slow branch, Indicates the output characteristics of the fast branch; 分别对慢支路和快支路的特征序列进行全局平均池化,得到池化结果Perform global average pooling on the feature sequences of the slow branch and the fast branch respectively to obtain the pooling results and ; 将两个池化结果分别进行卷积操作并作差,计算快慢支路的运动特征差异F,表示为:;其中,conv1和conv2均表示卷积操作,σ表示激活函数;The two pooling results are convolved and subtracted to calculate the motion feature difference F between the fast and slow branches, which is expressed as: ; Where conv1 and conv2 both represent convolution operations, and 
σ represents the activation function; 对运动特征差异F进行卷积操作并采用sigmoid激活函数生成特征权重,表示为:;其中,conv3表示卷积操作,σ表示sigmoid激活函数;Perform convolution operation on the motion feature difference F and use sigmoid activation function to generate feature weights , expressed as: ; Where conv3 represents the convolution operation and σ represents the sigmoid activation function; 将特征权重与慢支路的特征进行点乘操作,生成增强的特征图,其中,The feature weights are combined with the features of the slow branch Perform point multiplication to generate enhanced feature maps ,in, ; 将增强的特征图与快支路的特征进行融合,作为慢支路的后续输入,实现快慢支路的特征融合,得到视觉特征。The enhanced feature map The features of the fast branch are fused together as the subsequent input of the slow branch to achieve feature fusion of the fast and slow branches and obtain visual features. 5.根据权利要求4所述的基于多源数据分析的智慧工地安全管控方法,其特征在于,在步骤S4中,皮尔逊相关系数表示为:5. The smart construction site safety management and control method based on multi-source data analysis according to claim 4 is characterized in that, in step S4, the Pearson correlation coefficient It is expressed as: ; 式中,的协方差;的标准差;的标准差;的均值,的均值;E表示期望值。In the formula, yes and The covariance of yes The standard deviation of yes The standard deviation of yes The mean of yes E represents the expected value. 6.根据权利要求5所述的基于多源数据分析的智慧工地安全管控方法,其特征在于,在步骤S4中,构建文本和视觉特征之间的跨模态关系矩阵的步骤,包括:6. 
The method for intelligent construction site safety management and control based on multi-source data analysis according to claim 5 is characterized in that, in step S4, the step of constructing a cross-modal relationship matrix between text and visual features comprises: 将文本特征和视觉特征分别代入皮尔逊相关系数的变量中,构建关系融合矩阵,得到文本特征与视觉特征关系融合矩阵 ,其中,的值为文本特征和视觉特征的个数;矩阵的维度为;矩阵反映了文本特征与视觉特征之间的相互关系;Substitute the text features and visual features into the variables of the Pearson correlation coefficient respectively, construct the relationship fusion matrix, and obtain the relationship fusion matrix between text features and visual features ,in, The value of is the number of text features and visual features; the dimension of the matrix is ;matrix It reflects the relationship between text features and visual features; 将视觉特征和文本特征代入皮尔逊相关系数的变量中,构建关系融合矩阵,使用皮尔逊相关系数,可得视觉特征与文本特征关系矩阵Substitute the visual features and text features into the variables of the Pearson correlation coefficient to construct a relationship fusion matrix. Using the Pearson correlation coefficient, the relationship matrix between visual features and text features can be obtained. . 7.根据权利要求6所述的基于多源数据分析的智慧工地安全管控方法,其特征在于,在步骤S4中,根据跨模态关系矩阵加权融合文本特征和视觉特征,得到最终的融合特征的步骤,包括:7. 
The method for intelligent construction site safety management and control based on multi-source data analysis according to claim 6 is characterized in that, in step S4, the step of obtaining the final fusion feature by weighted fusion of text features and visual features according to the cross-modal relationship matrix comprises: 在得到关系矩阵和关系矩阵后,分别对视觉特征和文本特征进行加权;其中:In the relationship matrix and the relationship matrix After that, the visual features and text features are weighted respectively; where: 通过文本到视觉的加权关系矩阵来加权视觉特征,表示为:表示通过文本到视觉的关系矩阵加权后生成的视觉特征;Weighted Relationship Matrix via Text-to-Vision To weight the visual features, it is expressed as: , Represents the visual features generated by weighting the text-to-visual relationship matrix; 通过视觉到文本的加权关系矩阵来加权文本特征,表示为:表示通过视觉到文本的加权关系矩阵加权后生成的文本特征;Vision-to-text weighted relationship matrix To weight the text features, it is expressed as: , Represents the text features generated by weighting the visual-to-text weighted relationship matrix; 分别是文本特征和视觉特征; and They are text features and visual features respectively; 最后,通过加权平均的方法,将加权后的文本特征和视觉特征融合在一起,表示为:;其中,为最终融合特征向量,是控制文本特征与视觉特征在最终融合特征中贡献程度的超参数。Finally, the weighted text features and visual features are fused together by weighted averaging, expressed as: ;in, is the final fused feature vector, and It is a hyperparameter that controls the contribution of text features and visual features in the final fusion features. 8.根据权利要求7所述的基于多源数据分析的智慧工地安全管控方法,其特征在于,在步骤S5中,在MLP网络引入多阶段分层融合残差,包括:8. 
The smart construction site safety management and control method based on multi-source data analysis according to claim 7 is characterized in that, in step S5, a multi-stage hierarchical fusion residual is introduced into the MLP network, comprising: 每个残差MLP块的结构表示为:The structure of each residual MLP block is expressed as: ; 其中,表示第层的输出,表示多层感知机;in, Indicates The output of the layer, represents a multi-layer perceptron; 所述在每个阶段的末尾,将当前阶段的输出特征与前一阶段的输出特征进行融合,表示为:;其中,表示第s阶段的融合特征,表示特征拼接操作;At the end of each stage, the output features of the current stage are fused with the output features of the previous stage, expressed as: ;in, represents the fusion features of the sth stage, Represents feature concatenation operation; 所述在网络的不同阶段之间添加跳跃连接,融合浅层特征和深层特征,表示为:;其中,表示第j个跳跃连接的输出,表示对特征进行变换;The skip connection is added between different stages of the network to fuse shallow features and deep features, which can be expressed as: ;in, represents the output of the j-th skip connection, Indicates the transformation of features; 所述将最后一阶段的融合特征输入到分类层,表示为:;其中,O表示最终的分类输出,W和b分别表示分类层的权重和偏置。The fusion features of the last stage are input into the classification layer, which is expressed as: ; Where O represents the final classification output, W and b represent the weight and bias of the classification layer respectively. 9.用于实现如权利要求1至8任一项所述基于多源数据分析的智慧工地安全管控方法的管控系统,其特征在于,该管控系统包括以下模块:9. 
A management and control system for implementing the smart construction site safety management and control method based on multi-source data analysis according to any one of claims 1 to 8, characterized in that the system comprises the following modules: a data acquisition module, configured to acquire multi-source information of the construction site, the multi-source information including site information, weather information and image information; a text feature extraction module, configured to perform text preprocessing on the site information and the weather information respectively, extract features from the preprocessed site information text and weather information text respectively using a BERT model to obtain site features and weather features, and concatenate the site features and the weather features to obtain text features; a visual feature extraction module, configured to extract visual features of the image information using an improved SlowFast model, wherein the improved SlowFast model includes a fast branch and a slow branch, a double-layer attention module is introduced into the fast branch, a feature enhancement module is introduced into the slow branch, the weights of the fast branch and the slow branch are adjusted based on a dynamic weight mechanism, and the outputs of the two branches are feature-fused according to the adjusted weights to obtain the visual features; a feature fusion module, configured to construct a cross-modal relationship matrix between the text features and the visual features by computing Pearson correlation coefficients, and to weight and fuse the text features and the visual features according to the cross-modal relationship matrix to obtain the final fused features; a behavior classification module, configured to take the fused features as the input of an MLP network, wherein multi-stage hierarchical fusion residuals are introduced into the MLP network, features are extracted through residual MLP blocks in multiple stages, shallow and deep features are fused at each stage, and the fused features are mapped to corresponding behavior categories to obtain preliminary behavior categories of the workers in the image; a behavior correction module, configured to compare the preliminary behavior category results with the site features and the weather features, and to adjust the preliminary behavior categories according to predetermined correction rules to obtain final behavior categories; a behavior management and control module, configured to perform safety management and control on the behavior of construction site workers based on the final behavior categories.
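The Pearson-correlation step of the feature fusion module could look like the following sketch, assuming the relationship matrix is estimated over a batch of paired samples (shapes and names are assumptions, not taken from the patent):

```python
import numpy as np

def cross_modal_matrix(text_feats, vis_feats):
    """Pearson correlation between each text-feature dimension and each
    visual-feature dimension, estimated over a batch.
    text_feats: (n, d_t), vis_feats: (n, d_v); entry (i, j) lies in [-1, 1]."""
    t = (text_feats - text_feats.mean(axis=0)) / text_feats.std(axis=0)
    v = (vis_feats - vis_feats.mean(axis=0)) / vis_feats.std(axis=0)
    return (t.T @ v) / len(text_feats)
```

A visual dimension that is a positive linear rescaling of a text dimension yields a correlation of exactly 1 in the corresponding entry, and a negated copy yields −1, which is what makes the matrix usable as cross-modal weights.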
CN202510580593.1A 2025-05-07 2025-05-07 Smart construction site safety management and control method and system based on multi-source data analysis Active CN120086699B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510580593.1A CN120086699B (en) 2025-05-07 2025-05-07 Smart construction site safety management and control method and system based on multi-source data analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510580593.1A CN120086699B (en) 2025-05-07 2025-05-07 Smart construction site safety management and control method and system based on multi-source data analysis

Publications (2)

Publication Number Publication Date
CN120086699A CN120086699A (en) 2025-06-03
CN120086699B true CN120086699B (en) 2025-07-25

Family

ID=95855795

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510580593.1A Active CN120086699B (en) 2025-05-07 2025-05-07 Smart construction site safety management and control method and system based on multi-source data analysis

Country Status (1)

Country Link
CN (1) CN120086699B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116844236A (en) * 2023-07-13 2023-10-03 重庆理工大学 A behavior recognition method and system based on improved Slowfast
CN119919932A (en) * 2025-04-03 2025-05-02 安徽农业大学 Agricultural product classification method integrating dual-stream attention integration and cross-modal fusion

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112183313B (en) * 2020-09-27 2022-03-11 武汉大学 SlowFast-based power operation field action identification method
CN117542118A (en) * 2023-11-24 2024-02-09 中国科学技术大学 UAV aerial video action recognition method based on dynamic modeling of spatiotemporal information
CN119380106A (en) * 2024-10-30 2025-01-28 电子科技大学(深圳)高等研究院 Medical image analysis method and system based on residual MLP network with sparse attention mechanism
CN119474496B (en) * 2024-11-08 2025-09-23 长安大学 An intelligent traffic event recognition method based on large traffic model and cross-modal retrieval
CN119851348A (en) * 2024-12-31 2025-04-18 西安理工大学 Sports action recognition method based on 3D space-time attention and slowfast network


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Elevator passenger abnormal behavior recognition based on an improved SlowFast algorithm; Wang Zhiheng et al.; Journal of China Jiliang University; 2024-09-15; Vol. 35, No. 3, pp. 407-413 *

Also Published As

Publication number Publication date
CN120086699A (en) 2025-06-03

Similar Documents

Publication Publication Date Title
CN119004322B (en) A pipeline system fault diagnosis method and system based on hierarchical attention mechanism
CN119251641B (en) Method, system and equipment for predicting reliability of power transmission line based on SENet and EffNet
CN109145743A (en) A kind of image-recognizing method and device based on deep learning
CN119646271B (en) An emergency fire hazard detection method based on multimodal AI large model recognition technology
CN119691419B (en) Multi-scale extreme high wind event AI identification method, device and medium integrating physical constraints
CN119272209A (en) A foundation pit digital twin monitoring method, system and application thereof
CN117851802A (en) Water quality prediction method and device and computer readable storage medium
CN119091307A (en) Landslide hazard remote sensing detection method and system integrating spectral and terrain information
Wang et al. Multicategory fire damage detection of post‐fire reinforced concrete structural components
CN117150383B (en) A new energy vehicle power battery fault classification method based on ShuffleDarkNet37-SE
CN120086699B (en) Smart construction site safety management and control method and system based on multi-source data analysis
CN119226805B (en) Multi-mode data generalization learning method and system based on causal invariant transformation
KR102784194B1 (en) Method and electronic device for providing property prediction data of a composite based on artificial intelligence
CN118865375B (en) Cell state detection method, device and storage medium based on space-time feature fusion
CN120542959A (en) Emergency situation intelligent decision-making method, device and system based on deep learning
CN118585772B (en) Early warning methods and information release platforms applicable to emergencies
CN120012974B (en) A method, system, device and medium for predicting offshore wind power output
CN119918716B (en) ENSO long-term prediction methods, devices, and media based on multi-head spatiotemporal attention mechanisms
CN119693620B (en) Multi-scene fire detection method based on deep learning
CN119152276B (en) A local climate zone classification method based on multi-source data fusion
Cheng et al. The fusion strategy of multimodal learning in image and text recognition
CN120747607A (en) Image classification method, device, equipment and medium based on self-supervision attention
CN119251809A (en) An intelligent safety tool access detection method based on AI vision
Ye Attention-Based CNN-BiLSTM Model for La Niña Events
CN117036846A (en) A helmet wearing detection method based on hybrid connection improved YOLOv5

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant