Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness. It should be noted that, without conflict, the embodiments of the present disclosure and features in the embodiments may be combined with each other.
Data processing methods, devices and electronic equipment of the embodiments of the present disclosure are described below with reference to the accompanying drawings.
The present disclosure provides a data processing method, an apparatus and an electronic device. Rule information of a data processing process is acquired and intelligently optimized in combination with historical rule information and comprehensive root cause evaluation data, so that the rule information can be adaptively and iteratively optimized. The rule information can thereby meet the service quality analysis requirements arising from continuous evolution of a system and continuous change of service characteristics, while the accuracy of quality difference identification and positioning results and the intelligent closed loop of the whole system are improved.
As shown in fig. 1, an embodiment of the present disclosure provides a data processing method, which may include:
Step 101, acquiring an XDR data message, historical rule information and rule information of a data processing process.
The XDR data message may be a message in a transmission format for structured data records; in network transport protocols, XDR can serve as a method for encapsulating and transmitting structured data. The rule information can be a set of rules that are adjusted within a certain range and used for guiding and controlling the service quality difference identification and positioning process. The rule information may include a preset perception score threshold, a reference threshold and a challenge threshold of each perception index, a negative influence degree, a negative deviation degree, a delimiting index, a preset negative influence degree threshold, a preset negative deviation degree threshold, a preset delimiting index threshold, and the like. The historical rule information may be the experience and rules accumulated during past problem diagnosis and resolution over a certain period of time.
Different XDR data messages can be distinguished by their message headers. The header may contain information identifying the type and format of the message, and typically includes the service type, the record length, the number of records, and the like. The message content can be structured XDR records, generally in a TLV (Type-Length-Value) transmission format. Multiple XDR records can be packed into one independent data message, which improves transmission efficiency, especially when many records need to be transmitted together. Within one XDR data message, all packed XDR records are required to be of the same service type, so that the receiving end can apply unified processing logic according to the service type, and are required to have the same length, which simplifies the processing logic of the receiving end and improves processing efficiency.
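By way of a non-limiting illustration, the packing of several equal-length records of one service type behind a common header can be sketched as follows. The byte layout (three unsigned 16-bit big-endian header fields) and the function names are assumptions made for this sketch only; the disclosure does not specify them, and the TLV encoding of fields inside each record is not modeled.

```python
import struct

# Hypothetical header layout (not given in the source): three unsigned
# 16-bit big-endian fields: service type, record length, record count.
HEADER = struct.Struct(">HHH")


def pack_message(service_type, records):
    """Pack equal-length records of one service type into a single message."""
    if not records:
        raise ValueError("at least one record is required")
    record_length = len(records[0])
    if any(len(r) != record_length for r in records):
        raise ValueError("all records in one message must have the same length")
    header = HEADER.pack(service_type, record_length, len(records))
    return header + b"".join(records)


def split_message(message):
    """Return (service_type, [record bytes, ...]) for one message."""
    service_type, record_length, count = HEADER.unpack_from(message, 0)
    body = message[HEADER.size:]
    if len(body) != record_length * count:
        raise ValueError("body size does not match header")
    records = [
        body[i * record_length:(i + 1) * record_length] for i in range(count)
    ]
    return service_type, records
```

Because the record length and count are carried in the header, the receiving end can split the body with simple slicing and dispatch all records to the processing logic of a single service type.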
For the embodiment of the disclosure, the execution subject can be a data processing device or equipment. The application abstracts the logic of the service quality difference identification and positioning process and extracts parameterizable rules (namely the rule information). It then combines rule state data (namely the historical rule information) with root cause evaluation data to intelligently optimize the rules, so that the rules are adaptively and iteratively optimized. The optimal rules are managed uniformly to form a rule base, which is applied to the quality difference identification and positioning process. The application can meet the service quality analysis requirements arising from continuous evolution of the system and continuous change of service characteristics, while improving the accuracy of quality difference identification and positioning results and the intelligent closed loop of the whole system.
Step 102, determining comprehensive root cause evaluation data in the quality difference positioning process according to the XDR data message and the rule information.
The comprehensive root cause evaluation data can be data for diagnosing faults, anomalies and other problems in a network, finding out root causes and evaluating the root causes; the quality difference positioning process can be a process for diagnosing problems and analyzing reasons for poor network quality (such as cells) in network optimization.
For the embodiment of the disclosure, in the quality difference positioning process, comprehensive root cause evaluation data in the quality difference positioning process can be determined according to the XDR data message and the rule information. Wherein the XDR data message and rule information may be used to collect data about quality problems, which may then be used to determine the root cause of the quality problem (i.e., the composite root cause evaluation data). The comprehensive root cause evaluation data can be a key part in the quality difference positioning process and can be used for determining the reasons for quality problems and solving the problems.
Step 103, carrying out self-adaptive correction on the rule information according to the comprehensive root cause evaluation data and the historical rule information.
In specific application scenarios, such as network security, quality monitoring, anomaly detection, and the like, the comprehensive root cause evaluation data can provide information about the effect and efficiency of rule information, such as whether a rule can accurately identify a problem, whether valuable information can be effectively filtered out, and the like. By analyzing the comprehensive root cause evaluation data, it is possible to determine problems with rule information, such as threshold setting being too high or too low, condition setting being too wide or too narrow, and the like. The historical rule information may provide an evolving process of rule information, such as including the adjustment of rules over time, and the effect of adjustments, etc. By analyzing the historical rule information, it is possible to find the trend of adjustments to the rule information, and how these adjustments affect system performance.
The effect and efficiency of the rule information can be better understood by integrating root cause evaluation data and historical rule information, so that the rule information is adaptively corrected, and the performance and efficiency of the system are improved.
For the embodiment of the disclosure, rule optimizing can be completed intelligently by combining real-time data, root cause evaluation information and historical rule information. The rules thereby continuously adapt to changes in the network environment and become more accurate, which avoids the problem that an expert database or knowledge base cannot sense environment changes and update in real time, and avoids the bottleneck of conclusions drawn from manual experience analysis.
In summary, compared with the prior art, the data processing method provided by the present disclosure acquires the XDR data message, the historical rule information, and the rule information of the data processing process; determines comprehensive root cause evaluation data in the quality difference positioning process according to the XDR data message and the rule information; and carries out self-adaptive correction on the rule information according to the comprehensive root cause evaluation data and the historical rule information. In this way, the rule information of the data processing process is intelligently optimized in combination with the historical rule information and the comprehensive root cause evaluation data, so that the rule information can be adaptively and iteratively optimized. The service quality analysis requirements arising from continuous evolution of a system and continuous change of service characteristics can thus be met, while the accuracy of quality difference identification and positioning results and the intelligent closed loop of the whole system are improved.
Further, as a refinement and extension of the foregoing embodiments, for a complete description of a specific implementation of the method of the present disclosure, the present disclosure provides a specific method as shown in fig. 2, where the method includes:
Step 201, acquiring an XDR data message, historical rule information and rule information of a data processing process.
For the embodiment of the disclosure, the units of the present application involve generic database read operations. Since the database is not claimed as part of the system, it is not described in detail in this description.
As shown in fig. 3, the data collection unit may be an external system, generally a DPI (Deep Packet Inspection) system. It can process mirrored or optically split traffic from the switch, is responsible for collecting service quality signaling data and generating XDR records, interacts with the present system through a general transmission interface, and pushes the XDR data to the system in real time.
The data preprocessing unit can be responsible for the real-time preprocessing of the XDR data, and comprises the steps of receiving an XDR data message pushed by an external system, splitting the XDR data message into complete XDR records, performing field conversion, backfilling, verification and the like on each XDR record, generating a memory record (namely an XDR memory record) which can be efficiently processed by the system, and sending the memory record to the quality difference recognition unit.
The quality difference recognition unit can process XDR memory records in real time. It clusters service quality indexes by service dimension, calculates index scores from the reference threshold and the challenge threshold of each perception index, comprehensively calculates the perception score according to the index weights, and identifies quality difference services by the perception score threshold. The quality difference identification rules, namely the indexes and the perception score threshold, are obtained from the rule management unit. The unit also extracts XDR index distribution data and sends it to the rule state output unit, and sends the XDR records of quality difference services to the quality difference positioning unit.
The quality difference positioning unit can be responsible for delimiting and positioning quality difference problems. It first clusters the XDR records in different domain dimensions, counts the perception scores of the objects in each dimension, and compares them transversely within the relevant domain. It then screens quality difference objects according to the negative influence degree and the negative deviation degree, and finally determines root cause objects according to the delimiting indexes. The unit also extracts rule state data of the delimiting and positioning process and sends it to the rule state acquisition unit.
The root cause output unit is used for receiving the quality difference positioning root cause, counting the occurrence frequency of the root cause object, backfilling the information such as the root cause processing period and the like, and then sending the information to the external root cause processing unit.
The root cause processing unit can be an external system and is responsible for solving the root cause problem causing quality difference, backfilling the expected solving time of the problem after the solution is implemented, and feeding back the state information to the root cause evaluation unit.
The root cause evaluation unit is responsible for receiving feedback of root cause processing, verifying the optimizing effect, giving comprehensive evaluation (namely comprehensive root cause evaluation data), and then sending the comprehensive evaluation to the rule intelligent optimizing unit.
The rule intelligent optimizing unit can be responsible for intelligent optimizing of the quality difference identification and quality difference positioning rules. It analyzes rule state data and root cause evaluation data based on algorithms such as variance, correlation coefficient, multiple correlation coefficient, GMM, least squares, polynomial curve fitting, incentive strategy and the like to find an optimal rule.
The rule management unit can be responsible for optimal rule management of quality difference identification and quality difference positioning, and support updating and inquiring of rules.
The rule state acquisition unit can be responsible for loading rule state data from a database, performing conversion processing such as histogram probability distribution and the like, and then sending the rule state data to the rule intelligent optimizing unit.
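By way of a non-limiting illustration, the conversion of rule state data into a histogram probability distribution mentioned above can be sketched as follows. The bin edges and the function name are assumptions made for this sketch; the disclosure does not specify how the bins are chosen.

```python
def histogram_distribution(values, edges):
    """Return the fraction of values in each half-open bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        for i in range(len(counts)):
            if edges[i] <= value < edges[i + 1]:
                counts[i] += 1
                break  # a value belongs to at most one bin
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [count / total for count in counts]
```

Values that fall outside every bin are ignored in this sketch, so the returned fractions sum to 1 over the values that were actually binned.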
Step 202, carrying out concurrent processing on the XDR data message to obtain XDR memory records, and transmitting the XDR memory records in the form of XDR messages through a message queue.
For the embodiment of the disclosure, in order for the data preprocessing unit to support real-time processing of mass data, the system can start a plurality of threads on each host node to concurrently execute the data preprocessing task, and the computing resources are fully utilized.
Specifically, the data preprocessing unit may perform concurrent processing on the XDR data packet to obtain an XDR memory record, where the concurrent processing may be to parse multiple XDR data packets received in parallel, and convert each XDR record into a memory format (i.e. an XDR memory record) capable of being rapidly processed, and then, may perform a concurrent preprocessing process on the XDR fields, which may include default backfilling, content conversion, exception checking, and so on.
To improve preprocessing efficiency, the field cleaning logic is dynamically compiled into a binary dynamic library that can be run directly and that provides a call interface. When the XDR records of a service type need to be processed, the dynamic library interface is called with the memory blocks of the original XDR records and the context state information, and the dynamic library returns the XDR cleaning result in the form of memory block pointers. The cleaned XDR records are assembled into XDR messages that can be rapidly transferred within the system, each message supporting a plurality of XDR records, and the XDR messages are finally put into the message queue Qu1.
The message queue is a first-in first-out data structure, which can ensure the order and the integrity of data. The XDR messages are placed in the message queue, so that subsequent data processing and analysis can be performed according to a preset sequence, reordering is not needed, and the processing and analysis efficiency is improved. Meanwhile, the message queue can realize data caching and distribution, and the processing capacity and flexibility of the system are further improved.
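By way of a non-limiting illustration, concurrent preprocessing followed by delivery through a message queue can be sketched as follows. The field names (`delay_ms`, `rat`), the default values and the function names are assumptions made for this sketch; the dynamically compiled cleaning library described above is replaced here by a plain function.

```python
import queue
import threading

DEFAULTS = {"rat": "unknown"}  # hypothetical default values for backfilling


def clean_record(record):
    """Backfill defaults, convert a field, and flag invalid records."""
    cleaned = {**DEFAULTS, **record}
    cleaned["delay_ms"] = int(cleaned.get("delay_ms", 0))  # content conversion
    cleaned["valid"] = cleaned["delay_ms"] >= 0            # exception check
    return cleaned


def preprocess(batches, out_queue, workers=4):
    """Clean each batch on a worker thread and enqueue one message per batch."""
    in_queue = queue.Queue()
    for batch in batches:
        in_queue.put(batch)

    def worker():
        while True:
            try:
                batch = in_queue.get_nowait()
            except queue.Empty:
                return  # no more batches to process
            out_queue.put([clean_record(r) for r in batch])

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

In this sketch the output queue delivers messages in the order in which they were enqueued, but with several worker threads the order in which batches finish cleaning is not fixed.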
Step 203, reading the XDR messages from the message queue, and clustering the perception indexes based on the service dimension for the XDR records in each XDR message until a preset time granularity is reached, to obtain the perception indexes.
For the embodiment of the disclosure, the quality difference identification unit may read XDR messages from the message queue Qu1 and perform service-dimension-based perception index clustering on the XDR records in each XDR message. The clustering process may continue until a preset time granularity is reached, finally obtaining the service-dimension-based perception indexes. The preset time granularity may be a preset time interval aligned to whole clock boundaries, for example 5 minutes, 15 minutes, 30 minutes, 1 hour, 1 day, etc. Within this time granularity, the system continues to cluster the perception indexes so as to obtain finer-grained service insight.
As a possible implementation manner, the preset quality of service perception index model may be used to cluster the perception indexes based on different service dimensions for the XDR records in each XDR message, for example, clustering according to dimensions such as product, region, customer type, etc. The quality of service perceived index model may be a model composed of a plurality of perceived indexes that may be used to measure the quality of service, such as customer satisfaction, transaction success rate, response time, etc.
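By way of a non-limiting illustration, clustering a perception index by service dimension and time granularity can be sketched as follows. The record fields (`service`, `ts`, `success`, `attempts`) and the choice of a success-rate index are assumptions made for this sketch.

```python
from collections import defaultdict

GRANULARITY_S = 300  # 5-minute granularity, one of the values named in the text


def aggregate(records, granularity_s=GRANULARITY_S):
    """Compute a success rate per (service, time bucket start)."""
    table = defaultdict(lambda: [0, 0])  # key -> [successes, attempts]
    for rec in records:
        bucket = rec["ts"] - rec["ts"] % granularity_s  # align to bucket start
        cell = table[(rec["service"], bucket)]
        cell[0] += rec["success"]
        cell[1] += rec["attempts"]
    return {key: num / den for key, (num, den) in table.items() if den}
```

Keeping the numerator and denominator separately until the bucket is complete allows the rate to be computed once per granularity rather than averaged across records.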
Step 204, calculating the comprehensive score of the perception indexes based on the rule information, and screening quality difference services whose comprehensive score is smaller than a preset perception score threshold.
The rule information can comprise a preset perception score threshold value, a reference threshold value of a perception index, a challenge threshold value and the like; the perceived composite score may be a composite score calculated from the plurality of perceived metrics scores and the corresponding weights, representing the overall quality condition of the business.
For the embodiment of the disclosure, according to the set rule information, the comprehensive score of each perception index can be calculated, then the quality difference service with the comprehensive score lower than the preset perception score threshold is screened out, the service with poor quality can be found out, and then the service is improved and optimized to improve the overall quality of the service. The preset perception scoring threshold may be a preset perception scoring standard, which is used for judging whether the quality of the service meets the requirement.
For the embodiment of the disclosure, calculating the comprehensive score of the perception index based on the rule information may include:
Calculating a perception index score of the perception index based on the reference threshold and the challenge threshold of the perception index; and calculating the comprehensive score of the perception index according to the perception weight corresponding to the perception index score. The reference threshold may be a preset standard value of a perception index, and is used to determine whether the quality of service reaches a basic and acceptable level, for example, the quality of service can be considered to be qualified only when the score of the perception index is higher than the threshold.
The challenge threshold may be a predetermined perceived index challenge value, which is generally higher than the reference threshold, for determining whether the quality of service is at an excellent level. For example, the quality of service can be considered excellent only if the score of the perceived index is above this threshold.
As shown in fig. 4, a perception index score of each perception index is calculated based on the reference threshold and the challenge threshold of the perception index, and the comprehensive score is then calculated from the perception index scores and their corresponding perception weights. The specific formula is as follows:
$$QoE = \sum_{i} W_i \cdot KQI_i, \quad \text{and} \quad \sum_{i} W_i = 1$$
Wherein QoE represents the comprehensive score, $W_i$ represents the perception weight of the i-th perception index, and $KQI_i$ represents the score of the i-th perception index.
The reference threshold and the perception weight of each perception index come from the rule management unit. The index set capable of reflecting a user's perception of service quality, together with the corresponding threshold standards, is automatically screened and identified through the rules, so that the rules can be refined to specific service applications and meet the differing requirements of different service behaviors and characteristics. In this system, the value range of a perception index is divided into three intervals by the preset reference threshold and challenge threshold, and linear scoring is performed within each interval; the reference threshold and the challenge threshold are the two reference points for judging whether the quality of service meets or exceeds the expected standard. By default the system scores on a five-point scale, but it is not limited to this and may also use a percentage scale or another scoring scale, which helps the system establish a unified and intuitive standard. The perception indexes are scored based on the following formula.
Where baseline represents the reference threshold, challenge represents the challenge threshold, and direction represents the perception index direction, which takes the value 1 or -1: 1 represents a positive index (the larger the index value, the better), and -1 represents a negative index (the smaller the index value, the better).
For each service application, the perception comprehensive score QoE is calculated, and quality difference services whose comprehensive score is smaller than the preset perception score threshold TQoE are screened out; the system sets TQoE to 4 by default. The quality difference services are packaged into messages and sent to the message queue Qu4, and root cause analysis is then carried out by the quality difference positioning unit. At the same time, the quality difference identification unit forwards the XDR messages to the message queue Qu2.
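By way of a non-limiting illustration, computing the comprehensive score as a weighted sum of perception index scores and screening services below the default threshold of 4 can be sketched as follows. The index names, the example weights and the function names are assumptions made for this sketch.

```python
T_QOE = 4.0  # default perception score threshold stated in the text


def composite_score(kqi_scores, weights):
    """Weighted sum of KQI scores; both dicts are keyed by KQI name."""
    return sum(weights[name] * score for name, score in kqi_scores.items())


def poor_quality_services(services, weights, threshold=T_QOE):
    """Return {service: QoE} for services whose QoE is below the threshold."""
    scores = {
        service: composite_score(kqis, weights)
        for service, kqis in services.items()
    }
    return {service: qoe for service, qoe in scores.items() if qoe < threshold}
```

For example, with two indexes weighted 0.5 each, a service scoring 3.0 and 4.0 obtains a comprehensive score of 3.5 and is screened out, whereas a service scoring 5.0 and 4.0 obtains 4.5 and is not.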
And the quality difference identification unit performs perception index statistics based on the user identification dimension for each service application, and in the process, the system defaults to a statistics granularity of 1 hour, and after the statistics granularity is reached, the statistics data is stored in a database. In order to obtain enough samples and save resource expenditure, users need to be sampled according to a certain proportion before statistics, the sampling proportion can be adjusted according to actual needs, and after statistics is completed, statistical results are filtered, and users with insufficient traffic are filtered, so that accuracy and effectiveness of the statistical results are guaranteed.
When the quality difference identification unit is started, it reads the quality difference identification rules from the rule management unit, generates a rule cache inside the unit, and sets the cache state to primary. The unit then periodically reads the quality difference identification rules from the rule management unit to generate a standby cache. After the standby cache is initialized, the primary and standby caches are switched, as shown in fig. 5: the original primary cache becomes the standby cache, and the standby cache becomes the primary cache. Because the switch is completed through an atomic operation on a flag bit, the normal processing flow is not affected, and no mutual exclusion mechanism needs to be added for reading and writing the primary and standby caches.
The method specifically comprises the following steps: and reading rule information, and generating a main cache and a standby cache, wherein the main cache is used for switching with the standby cache.
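By way of a non-limiting illustration, the primary/standby rule cache can be sketched as a two-slot structure whose active slot is selected by a single flag. The class name, the `load_rules` callable and the rule key are assumptions made for this sketch. The sketch relies on the assumption that, in CPython, rebinding a single attribute is not interleaved with a reader's lookup; other runtimes may need an explicit atomic primitive.

```python
class RuleCache:
    """Two rule slots; readers always use the slot marked as primary."""

    def __init__(self, load_rules):
        self._load_rules = load_rules       # hypothetical callable returning a dict
        self._slots = [load_rules(), None]  # slot 0 starts as the primary cache
        self._primary = 0                   # flag selecting the primary slot

    def get(self, name):
        """Read a rule from whichever slot is currently primary."""
        return self._slots[self._primary][name]

    def refresh(self):
        """Rebuild the standby slot, then flip the flag to switch roles."""
        standby = 1 - self._primary
        self._slots[standby] = self._load_rules()  # fully initialise standby first
        self._primary = standby                    # single assignment does the switch
```

Readers never observe a half-built cache, because the flag is flipped only after the standby slot has been completely replaced.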
Step 205, determining comprehensive root cause evaluation data according to the XDR memory records of the quality difference service.
For the embodiment of the disclosure, determining comprehensive root cause evaluation data according to the XDR memory record of the quality difference service may specifically include:
carrying out delimitation positioning treatment on the XDR memory records to obtain root cause objects and root cause data;
Judging whether the occurrence frequency of the root cause object exceeds a preset threshold according to the root cause data, and if so, generating a root cause record; and determining comprehensive root cause evaluation data according to the evaluation scores and weights of different evaluation dimensions in the root cause records.
The delimiting and positioning process can be to determine the specific cause and position of the problem by accurately analyzing and positioning the XDR memory records. Root cause objects may be key factors that cause problems to occur, and root cause data may be data that details and analyzes these factors.
In a specific application scenario, the root cause output unit may periodically query the database for root cause data, which may be data recording the causes and situations that may cause problems to occur.
The system judges whether the occurrence frequency of the latest root cause object exceeds a preset threshold, which can be configured according to actual needs. If the threshold is exceeded, a root cause record is generated, which includes information such as the root cause conclusion and the processing period. The system can notify a root cause processing unit registered outside the system of the root cause record, so that the relevant processing unit learns of the problem in time and handles it. Meanwhile, the root cause record is saved in a database within the system for subsequent query and analysis, and expired root cause objects are cleaned up. The external root cause processing unit registers with the root cause output unit of the system so that it can receive and process the root cause records sent by the system.
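By way of a non-limiting illustration, generating root cause records only for objects whose occurrence frequency exceeds a configurable threshold can be sketched as follows. The row and record fields are assumptions made for this sketch; notification of external units and cleanup of expired objects are not modeled.

```python
from collections import Counter


def generate_root_cause_records(root_cause_rows, threshold):
    """Emit one record per root cause object seen more than `threshold` times."""
    counts = Counter(row["object"] for row in root_cause_rows)
    return [
        {"object": obj, "occurrences": n}
        for obj, n in sorted(counts.items())
        if n > threshold
    ]
```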
The root cause record can be a form of detailed record of the root cause object and the root cause data which cause the problem, and can be used for further analysis and problem solving.
In a specific application scenario, after the root cause processing unit processes the root cause problem, information such as effective time and root cause evaluation is fed back to the root cause evaluation unit. And after receiving the feedback information, the root cause evaluation unit updates the root cause record information in the database.
The system will periodically query the database for root cause records for which the time of implementation has arrived. The system then queries the database for historical data and root cause resolution implemented data associated with the root cause records to verify the effectiveness of the problem resolution.
Root cause evaluation is performed along several dimensions: influence range, repair difficulty, cost influence, customer satisfaction and solution effect. Each evaluation dimension is worth 1 point, for a total of 5 points. If no feedback is given for the root cause, the default score of each dimension is 0.6 points. Such scoring helps the system establish a unified and intuitive standard. Scoring according to the present application includes, but is not limited to, a five-point scale; a percentage scale or the like may also be used.
Then, a comprehensive evaluation score is calculated according to the weight of each dimension (the weight of each dimension is 0.2 by default). And finally, storing the root cause information and the comprehensive evaluation result into a database.
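By way of a non-limiting illustration, the weighted comprehensive evaluation over the five dimensions, using the stated defaults of 0.6 per dimension score and 0.2 per dimension weight, can be sketched as follows. The dimension identifiers and the function name are assumptions made for this sketch.

```python
DIMENSIONS = (
    "influence_range",
    "repair_difficulty",
    "cost_influence",
    "customer_satisfaction",
    "solution_effect",
)
DEFAULT_SCORE = 0.6   # per-dimension default when no feedback is given
DEFAULT_WEIGHT = 0.2  # per-dimension default weight


def comprehensive_evaluation(feedback=None, weights=None):
    """Weighted sum of dimension scores, falling back to the defaults."""
    feedback = feedback or {}
    weights = weights or {}
    return sum(
        weights.get(dim, DEFAULT_WEIGHT) * feedback.get(dim, DEFAULT_SCORE)
        for dim in DIMENSIONS
    )
```

With no feedback at all, every dimension contributes 0.2 × 0.6, so the comprehensive evaluation equals 0.6 up to floating-point rounding.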
The influence range may indicate how strongly the quality problem of the quality difference object affects the whole production process or the final product. The repair difficulty may indicate whether the quality problem of the quality difference object can be solved by a simple repair, and how difficult the repair is. The cost influence may indicate whether the quality problem of the quality difference object affects cost, and by how much. The customer satisfaction may indicate whether the quality difference object affects customer satisfaction, and by how much. The solution effect may be the recovery of the indexes after the problem is solved, and is evaluated automatically by the root cause evaluation unit.
For the embodiment of the disclosure, performing delimitation positioning processing on the XDR memory record to obtain a root cause object and a root cause data, which may specifically include:
Calculating the contribution value of each perception index to the quality difference service; clustering the XDR memory records according to different dimension objects to generate a statistical table with preset dimensions and preset granularity; counting the perception index scores of each dimension object from the statistical table based on the contribution values; screening target quality difference objects among objects of the same dimension based on the negative influence degree and the negative deviation degree of the perception index score, and determining root cause objects according to the delimiting indexes of the target quality difference objects; and determining the root cause data based on the negative influence degree, the negative deviation degree, and the root cause object.
Wherein, the negative influence degree can be the negative influence degree of the perception index on the service quality; the negative bias may be the gap between the actual performance and the expected or standard.
For the embodiment of the disclosure, the method for determining the root cause object according to the delimitation index of the target quality difference object includes the steps of:
Screening perception objects whose perception index scores do not reach a preset score; determining a perception object whose negative influence degree exceeds a preset negative influence degree threshold as an initial quality difference object, and determining an initial quality difference object whose negative deviation degree exceeds a preset negative deviation degree threshold as a target quality difference object; and determining a target quality difference object meeting preset conditions as a root cause object, wherein the preset conditions comprise: the delimiting index of the object exceeds a preset delimiting index threshold, and the delimiting indexes of objects of the same type also exceed the preset delimiting index threshold.
In a specific application scenario, the quality difference positioning unit can read XDR messages from the message queue Qu2, cluster the XDR records in each domain dimension, and generate a statistical table with preset dimensions and preset granularity. The domains may include terminals, radio cells, core network elements, content sources, and the like. The statistical table only calculates basic indexes, and each composite index KQI is split into two basic indexes, a numerator and a denominator. The granularity of the statistical table can be preset, and the statistical result needs to be stored in a database.
A quality difference service list is read from the message queue Qu4, and the contribution value of each KQI to the QoE quality difference is calculated. The calculation formula is as follows:
$$\Delta c_i = (5 - KQI_i) \cdot w_i$$
Where $KQI_i$ is the score of the i-th KQI and $w_i$ is its weight; root cause localization is performed for a KQI when $\Delta c_i$ is greater than 0.
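By way of a non-limiting illustration, the contribution value of each KQI and the selection of KQIs with a positive contribution can be sketched as follows. The index names and the function name are assumptions made for this sketch; the full score of 5 is taken from the formula above.

```python
FULL_SCORE = 5  # full KQI score used in the contribution formula


def contributions(kqi_scores, weights):
    """Return {kqi: contribution} for every KQI whose contribution exceeds 0."""
    deltas = {
        name: (FULL_SCORE - score) * weights[name]
        for name, score in kqi_scores.items()
    }
    return {name: delta for name, delta in deltas.items() if delta > 0}
```

A KQI that already has the full score contributes 0 and is therefore excluded from root cause localization.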
And (3) screening KQI index data of corresponding services from the domain statistics tables, and sequentially carrying out root cause analysis according to network elements of a core network, content sources, wireless cells and terminals based on a principle from a large pipeline to a small pipeline. And (3) transversely comparing in each field, and comparing the data of the content source which needs to be removed and passes through the network element of the quality difference core network, and screening out objects with the score of the perception index not being full score (namely, the perception objects which do not reach the preset score).
The negative influence degree of each object is calculated, and the quality difference objects having the greatest negative influence on the overall index are screened out by means of the negative influence degree threshold. The negative deviation degree of each quality difference object is calculated, and the quality difference objects deviating most negatively from the overall index are screened out by means of the negative deviation degree threshold.
The delimiting index of each quality difference object is traversed, and the quality difference objects whose delimiting index exceeds the quality difference threshold of the delimiting index are screened out. For a quality difference core network element, it is further necessary to judge that the delimiting indexes of all other services of the same type running on the core network element are also of poor quality. The quality difference objects screened out in this judging step are the root cause objects.
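The screening sequence described above can be sketched as a chain of filters. This is an illustrative Python sketch only: the `Candidate` record, its field names and the `peers_all_poor` flag are assumptions introduced here, and the sketch assumes that a larger delimiting index value means worse quality (as with a retransmission rate).

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # All field names are hypothetical; they mirror the quantities named in the text.
    name: str
    domain: str              # e.g. "core_network", "content_source", "cell", "terminal"
    score: float             # perception index score of the object
    influence: float         # negative influence degree
    deviation: float         # negative deviation degree
    delimiting_index: float  # value of the delimiting index
    peers_all_poor: bool = True  # core network only: same-type services also poor


def find_root_causes(candidates, full_score, influence_threshold,
                     deviation_threshold, delimiting_threshold):
    # 1. Keep objects whose perception index score is not the full score.
    not_full = [c for c in candidates if c.score < full_score]
    # 2. Initial quality difference objects: negative influence above threshold.
    initial = [c for c in not_full if c.influence > influence_threshold]
    # 3. Target quality difference objects: negative deviation above threshold.
    target = [c for c in initial if c.deviation > deviation_threshold]
    # 4. Root cause objects: delimiting index above its quality difference threshold;
    #    a core network object additionally needs its same-type peers to be poor.
    roots = []
    for c in target:
        if c.delimiting_index <= delimiting_threshold:
            continue
        if c.domain == "core_network" and not c.peers_all_poor:
            continue
        roots.append(c)
    return roots
```

Each stage only narrows the previous one, so the thresholds supplied by the rule management unit fully determine which objects survive.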
The perception index reference threshold, the challenge threshold, the negative influence degree threshold, the negative deviation degree threshold, the delimiting index and its quality difference threshold all come from the rule management unit, and the loading and caching mechanism is the same as that of the quality difference identification unit. The quality difference positioning unit organizes information such as root cause objects, business applications, domain types, negative influence degree, negative deviation degree and the like into records, and stores the records in a database to form root cause data.
Step 206, determining each perception index weight by utilizing an index weight optimizing algorithm and combining comprehensive root cause evaluation and historical rule information, determining the optimal optimizing result in the perception index weights by utilizing a k-means clustering method, and carrying out self-adaptive correction on the rule information by utilizing the optimal optimizing result.
For the embodiment of the disclosure, each perception index weight is determined by the index weight optimizing algorithm in combination with the comprehensive root cause evaluation and the historical rule information, the optimal optimizing result among the perception index weights is determined by the k-means clustering method, and the rule information is adaptively corrected using the optimal optimizing result. The specific implementation process can be as follows: for each service application, an objective perception index system needs to be constructed, and a different weight is given to each index, so as to meet the requirement of comprehensive evaluation of service quality. Based on the index value data, the indexes are grouped according to the service characteristics and the weight of each group is calculated; then, within each group, the Pearson correlation coefficients among the indexes and the variance of each normalized index value are calculated, and the index weights within the group are determined from them. The weight calculation formula is as follows:
$$W_i = (1 - a)\,\frac{\mathrm{Var}(X_i)}{\sum_{j=1}^{m}\mathrm{Var}(X_j)} + a\,\frac{|\mathrm{Corr}(X_i, Y)|}{\sum_{j=1}^{m}|\mathrm{Corr}(X_j, Y)|}$$
Where W_i represents the weight of feature i, X_i represents the i-th feature in the dataset, Y represents the target variable, m represents the total number of features, Var(X_i) represents the variance of the i-th feature, Corr(X_i, Y) represents the correlation coefficient between the i-th feature and the target variable, |·| represents the absolute value, and a is a constant between 0 and 1 used for adjusting the relative importance of the variance and the correlation coefficient. In this formula, the first term measures the share of the variance of feature i in the whole dataset, and the second term measures the share of the degree of correlation between feature i and the target variable in the whole dataset. The value of a determines the relative importance of the variance and the correlation coefficient in the weight calculation: if a is close to 0, the variance has the greater influence on the weight; if a is close to 1, the correlation coefficient has the greater influence on the weight; the default value of a is 0.5. W_i is multiplied by the group weight to obtain the weight of each index, and it is then checked whether the weights sum to 1; if they do not, the weights need to be fine-tuned. When the optimizing period arrives, the index weight nearest to the cluster center is found through k-means clustering and used as the optimal optimizing result, and the rule is updated through the rule management unit.
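The in-group weight calculation can be illustrated with a short Python sketch. It assumes the weight is (1 - a) times the variance share plus a times the absolute-correlation share, which is the form the description above suggests; the function names and the sample data are hypothetical, and equal shares are used as a fallback when all variances or all correlations are zero (a choice made here, not stated in the text).

```python
from math import sqrt
from statistics import mean, pvariance


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def _shares(values):
    """Normalize non-negative values so they sum to 1 (equal shares if all zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def index_weights(features, target, a=0.5):
    """features: list of normalized index series; target: target variable series."""
    var_share = _shares([pvariance(col) for col in features])
    corr_share = _shares([abs(pearson(col, target)) for col in features])
    return [(1 - a) * v + a * c for v, c in zip(var_share, corr_share)]


# Hypothetical normalized index values and target variable.
features = [[0.1, 0.4, 0.5, 0.9], [0.2, 0.2, 0.3, 0.3]]
target = [1.0, 2.0, 3.0, 4.0]
print(index_weights(features, target))
```

Because both shares sum to 1, the weights within a group sum to 1 before the group weight is applied.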
Step 207, determining each perception index threshold by using an index threshold optimizing method and combining comprehensive root cause evaluation and historical rule information, taking the average value result of the perception index threshold as an optimal optimizing result, and carrying out self-adaptive correction on the rule information by using the optimal optimizing result.
For the embodiment of the disclosure, each perception index threshold is determined by the index threshold optimizing method in combination with the comprehensive root cause evaluation and the historical rule information, the mean of the perception index thresholds is taken as the optimal optimizing result, and the rule information is adaptively corrected using the optimal optimizing result. Each index measures the quality of the service in a certain aspect, and an appropriate threshold needs to be selected for each index so that it matches the way customers evaluate the service quality. The system divides the index thresholds into a reference threshold and a challenge threshold.
The values of each index may be divided into a quality difference interval, a normal interval and a good-quality interval. The index values may be clustered using a Gaussian mixture model (GMM), and the reference threshold and the challenge threshold of the index are determined based on the clustering result. Specifically, each index can be regarded as a mixed Gaussian distribution and divided into 3 Gaussian clusters using the GMM. Then, the mean of the quality difference interval and the mean of the good-quality interval are calculated and used as the reference threshold and the challenge threshold, respectively.
When the optimizing period arrives, for the thresholds learned at each granularity, the mean of the index thresholds is taken as the optimal optimizing result, and the rule is updated through the rule management unit.
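The threshold learning step can be sketched with a small one-dimensional GMM fitted by expectation-maximization. This is a minimal illustration, not the implementation of the disclosure: the initialization scheme, the iteration count and the sample values are assumptions, and the sketch assumes a larger index value means better quality, so the cluster with the lowest mean is treated as the quality difference interval.

```python
import math


def fit_gmm_1d(values, k=3, iterations=100):
    """Fit a 1-D Gaussian mixture with EM; returns (weights, means, variances)."""
    xs = sorted(values)
    n = len(xs)
    # Deterministic start (an assumption): means at evenly spaced quantiles,
    # shared variance equal to the overall variance divided by k squared.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    overall = sum(xs) / n
    spread = sum((x - overall) ** 2 for x in xs) / n
    variances = [max(spread / (k * k), 1e-6)] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in xs:
            dens = [w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # component received no mass; keep its parameters
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj, 1e-6)
    return weights, means, variances


def learn_thresholds(values):
    """Return (reference_threshold, challenge_threshold) from a 3-cluster GMM."""
    _, means, _ = fit_gmm_1d(values, k=3)
    ordered = sorted(means)
    return ordered[0], ordered[-1]


# Hypothetical index values forming poor, normal and good groups.
values = [0.8, 0.9, 1.0, 1.1, 1.2, 2.8, 2.9, 3.0, 3.1, 3.2, 4.3, 4.4, 4.5, 4.6, 4.7]
print(learn_thresholds(values))
```

For an index where a smaller value is better (for example a delay), the roles of the lowest and highest cluster means would be swapped.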
Step 208, screening the boundary indexes with confidence degrees meeting the preset confidence degrees, determining the quality difference threshold value of the boundary indexes by combining the quality difference correlation between the perception indexes and the boundary indexes and the comprehensive root cause evaluation and history rule information, taking the quality difference threshold value with the highest occurrence frequency as the optimal optimizing result, and carrying out self-adaptive correction on the rule information by utilizing the optimal optimizing result.
For the embodiment of the disclosure, the delimiting indexes whose confidence meets the preset confidence are screened first. Specifically, in order to delimit a problem to different domains, appropriate pipeline delimiting indexes need to be selected for different service applications, different problem domains and each different perception index. A delimiting index is an index capable of distinguishing the problem domain, and generally refers to a transport layer index, such as the TCP downlink retransmission rate, the TCP uplink retransmission rate, the average TCP uplink RTT delay, the average TCP downlink RTT delay, and the like, which are called pipeline indexes in the system.
A perception index generally has no delimiting capability of its own and needs to be delimited by related pipeline indexes. However, the influence and the correlation of the pipeline indexes on the perception index differ, and fixed pipeline delimiting indexes cannot meet the requirements of different service scenarios. In addition, the pipeline transmission protocol itself can, to a certain extent, prevent the perception index from being affected. For example, a very high TCP downlink retransmission rate for a certain service indicates serious packet loss, but the fast retransmission mechanism of the TCP protocol can quickly detect the packet loss and inform the transmitting end to retransmit, thereby ensuring that the perception is not affected. Therefore, the association relation needs to be determined based on big data analysis, and determining it based on experience alone is unreliable.
Delimiting index optimizing searches, by means of the complex (multiple) correlation coefficient, for the set of pipeline indexes having the strongest correlation with the perception index. First, a static mapping relation between the perception index and the pipeline delimiting indexes is established according to the business behavior characteristics, and 1 to 3 delimiting indexes are designated for each perception index for a given domain. Then, based on this mapping relation and in combination with correlation coefficient analysis of big data, the delimiting indexes with high confidence are screened out.
When calculating the complex correlation coefficient, the number of delimiting indexes does not exceed 3. For example, if the pipeline indexes that can delimit the perception index K0 in a certain domain are K1, K2 and K3, the complex correlation coefficient is calculated respectively for (K0, K1), (K0, K2), (K0, K3), (K0, K1, K2), (K0, K1, K3) and (K0, K2, K3); when there is only one delimiting index, the simple correlation coefficient is calculated. Taking (K0, K1, K2) as an example, the complex correlation coefficient between K0 and the two indexes K1, K2 is calculated as follows:
$$r_{0.12} = \sqrt{\frac{r_{10}^{2} + r_{20}^{2} - 2\,r_{10}\,r_{20}\,r_{12}}{1 - r_{12}^{2}}}$$
Where r12 is the correlation coefficient between K1 and K2, r10 is the correlation coefficient between K1 and K0, and r20 is the correlation coefficient between K2 and K0. r0.12 is the complex correlation coefficient between K0 and K1, K2. Finally, the set with the largest absolute value of the complex correlation coefficient among all sets is taken as the delimiting index set.
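The selection of the delimiting index set can be sketched in Python. The sketch uses the standard two-predictor multiple correlation formula, sqrt((r10² + r20² − 2·r10·r20·r12) / (1 − r12²)), evaluates every set of one or two candidate indexes, and keeps the set with the largest value. The function names and the sample series are hypothetical, and the fallback for perfectly collinear candidates is an assumption made here.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Simple (Pearson) correlation coefficient; 0.0 for a constant series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def multiple_correlation(k0, k1, k2):
    """Complex correlation coefficient r_0.12 between k0 and the pair (k1, k2)."""
    r10, r20, r12 = pearson(k1, k0), pearson(k2, k0), pearson(k1, k2)
    denom = 1.0 - r12 ** 2
    if denom <= 1e-12:
        # k1 and k2 are collinear: fall back to the simple coefficient (assumption).
        return abs(r10)
    return sqrt(max(0.0, (r10 ** 2 + r20 ** 2 - 2 * r10 * r20 * r12) / denom))


def best_delimiting_set(k0, candidates):
    """candidates: {name: series}. Returns (names, coefficient) of the best set."""
    best_names, best_value = None, -1.0
    for name, series in candidates.items():
        value = abs(pearson(series, k0))
        if value > best_value:
            best_names, best_value = (name,), value
    for (n1, s1), (n2, s2) in combinations(candidates.items(), 2):
        value = multiple_correlation(k0, s1, s2)
        if value > best_value:
            best_names, best_value = (n1, n2), value
    return best_names, best_value


# Hypothetical series: K0 is fully explained by K1 and K2 together.
k1 = [1.0, 2.0, 3.0, 4.0, 5.0]
k2 = [2.0, 1.0, 4.0, 3.0, 6.0]
k3 = [5.0, 1.0, 4.0, 2.0, 3.0]
k0 = [a + b for a, b in zip(k1, k2)]
print(best_delimiting_set(k0, {"K1": k1, "K2": k2, "K3": k3}))
```

Note that a two-index set can never score lower than either of its members alone, so single-index sets win only on ties or when the second index adds nothing.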
The quality difference threshold of the delimiting index is determined through the quality difference correlation between the perception index and the delimiting index, in combination with the comprehensive root cause evaluation and the historical rule information; the quality difference threshold with the highest occurrence frequency is taken as the optimal optimizing result, and the rule information is adaptively corrected using the optimal optimizing result. Specifically, through quality difference correlation analysis of the perception index and the delimiting index, a quality difference threshold of the delimiting index is selected that can serve as the basis for judging the quality difference of the perception index in the corresponding domain. The quality difference threshold of the delimiting index is obtained by performing regression fitting of the perception index against each delimiting index with a logarithmic curve, and then taking the intersection point of the reference threshold of the perception index and the fitted curve as the quality difference threshold of the delimiting index, as shown in fig. 6.
The logarithmic equation used is: y = a·log(x) + b, where a and b are the coefficients of the logarithmic equation.
The coefficients a and b are calculated from the sample data using the least square method; then x is set to the reference threshold of the perception index, and the calculated y value is used as the quality difference threshold of the delimiting index.
When the optimizing period arrives, the delimiting index set and quality difference threshold with the highest occurrence frequency are selected as the optimal optimizing result, and the rule is updated through the rule management unit.
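The fitting step can be sketched as an ordinary least-squares fit on the transformed variable log(x). This is a minimal Python sketch; it assumes the natural logarithm (the base is not stated in the text) and strictly positive x values, and the function names and sample data are hypothetical.

```python
import math


def fit_log_curve(xs, ys):
    """Least-squares fit of y = a*ln(x) + b; every x must be > 0."""
    ts = [math.log(x) for x in xs]
    n = len(ts)
    mean_t, mean_y = sum(ts) / n, sum(ys) / n
    stt = sum((t - mean_t) ** 2 for t in ts)
    a = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys)) / stt
    b = mean_y - a * mean_t
    return a, b


def delimiting_quality_threshold(xs, ys, reference_threshold):
    """Evaluate the fitted curve at x = reference threshold of the perception index."""
    a, b = fit_log_curve(xs, ys)
    return a * math.log(reference_threshold) + b


# Hypothetical samples: x is the perception index, y is the delimiting index.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [10.0 - 2.0 * math.log(x) for x in xs]
print(fit_log_curve(xs, ys))
print(delimiting_quality_threshold(xs, ys, reference_threshold=3.0))
```

Fitting on ln(x) turns the problem into a straight-line regression, so no iterative solver is needed.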
Step 209, calculating the negative influence degree of each perceived object by using a negative influence degree calculation formula, screening root cause objects whose negative influence degree is greater than a preset negative influence degree threshold and their corresponding perception index scores, updating the negative influence degree threshold according to the root cause objects and the corresponding perception index scores, taking the current negative influence degree threshold when the optimizing period is reached as an optimal optimizing result, and carrying out self-adaptive correction on the rule information by using the optimal optimizing result.
For the embodiment of the disclosure, the negative influence degree of each perceived object is calculated by using the negative influence degree calculation formula, the root cause objects whose negative influence degree is greater than the preset negative influence degree threshold and their corresponding perception index scores are screened, the negative influence degree threshold is updated according to the root cause objects and their corresponding perception index scores, the current negative influence degree threshold when the optimizing period is reached is taken as the optimal optimizing result, and the rule information is adaptively corrected using the optimal optimizing result. Specifically:
To be able to screen out the valuable root causes, a suitable negative influence degree threshold needs to be chosen for different business applications, for different problem domains and for each different perception index. The influence degree is the degree to which an individual object affects the overall index, and the influence degree calculation formula for each individual object is as follows:
Wherein S_max represents the KQI full score; the system adopts a five-point scale, i.e. S_max = 5; S_entirety is the overall score of the KQI index; S_others(x) is the KQI index score calculated for all the remaining objects after the individual object x is removed from the whole. When the value of S_others(x) after removing the individual object x is higher than the value of S_entirety, the individual object x has a negative influence on the whole, so the calculated D(x) is positive and indicates the negative influence degree, with a value range of [0, 100]. For the KQI scoring function, see the Score_i calculation formula of the quality difference identification unit.
The negative influence degree threshold is used for screening quality difference objects of economic value. When the system is initialized, an initial negative influence degree threshold is set, called the initial threshold α0; the currently effective threshold is α, and if the system threshold has not been adjusted, α = α0. The value of a threshold is estimated from the comprehensive root cause evaluation and the root cause quantity ratio, and the value estimate is expressed as follows:
Where E_i is the comprehensive evaluation score of each root cause divided by the full score 5, i.e. normalized to [0, 1]; M is the number of root causes in the negative influence degree optimizing threshold interval; and N is the total number of root causes in the current threshold interval (M ≤ N). k is a weight coefficient that balances the root cause value against the root cause quantity; based on experiments, the default value of k is 0.6.
The negative influence degree threshold optimizing process is as follows. First, VE0 of the current granularity is calculated. Next, the root causes whose negative influence degree is greater than α + Sα are screened out (Sα is the step length, with a system default value of 0.5%), and their VE1 is calculated; if VE1 is greater than VE0, the timestamp of the granularity is put into a queue Q, otherwise the queue Q is emptied. Finally, the length of the queue Q is checked: if it is greater than T, the negative influence degree threshold is updated to α + Sα and the queue Q is emptied, where T is the threshold on the number of consecutive granularities (the system default value is 168 for hourly granularity and 14 for daily granularity).
When the optimizing period arrives, the current latest negative influence threshold value is taken as the optimal optimizing result, and the rule is updated through the rule management unit.
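The queue-based threshold update described above can be sketched as a small state machine. This is an illustrative Python sketch only: the class and parameter names are hypothetical, the value estimates VE0 and VE1 are supplied by the caller because their formula is not reproduced here, and the step is expressed in percentage points on the assumption that the degree ranges over [0, 100].

```python
from collections import deque


class ThresholdOptimizer:
    """Raises a threshold by one step after more than `required_run`
    consecutive granularities in which the raised threshold scored better."""

    def __init__(self, initial_threshold, step=0.5, required_run=168):
        self.threshold = initial_threshold  # currently effective threshold (alpha)
        self.step = step                    # step length (S_alpha)
        self.required_run = required_run    # T, e.g. 168 for hourly granularity
        self.queue = deque()                # timestamps of consecutive improvements

    def observe(self, timestamp, ve_current, ve_raised):
        """ve_current: VE0 at the current threshold; ve_raised: VE1 at threshold + step."""
        if ve_raised > ve_current:
            self.queue.append(timestamp)
        else:
            self.queue.clear()  # the run of improvements is broken
        if len(self.queue) > self.required_run:
            self.threshold += self.step
            self.queue.clear()
        return self.threshold


# Hypothetical run with T = 3: four consecutive improvements raise the threshold once.
opt = ThresholdOptimizer(initial_threshold=10.0, step=0.5, required_run=3)
for hour in range(4):
    opt.observe(hour, ve_current=0.60, ve_raised=0.65)
print(opt.threshold)
```

The same structure applies to the negative deviation degree threshold, with its own threshold, step length and value estimates.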
Step 210, calculating the negative deviation degree of each perceived object by using a deviation degree calculation formula, screening root cause objects whose negative deviation degree is greater than a preset negative deviation degree threshold and their corresponding perception index scores, updating the negative deviation degree threshold according to the root cause objects and the corresponding perception index scores, taking the current negative deviation degree threshold when the optimizing period is reached as an optimal optimizing result, and carrying out self-adaptive correction on the rule information by using the optimal optimizing result.
For the embodiment of the disclosure, the negative deviation degree of each perceived object is calculated by using the deviation degree calculation formula, the root cause objects whose negative deviation degree is greater than the preset negative deviation degree threshold and their corresponding perception index scores are screened, the negative deviation degree threshold is updated according to the root cause objects and their corresponding perception index scores, the current negative deviation degree threshold when the optimizing period is reached is taken as the optimal optimizing result, and the rule information is adaptively corrected using the optimal optimizing result. The deviation degree is the degree to which the index of an individual deviates from the overall index, and the deviation degree calculation formula for each individual object is as follows:
Wherein S_max represents the KQI full score; the system adopts a five-point scale, i.e. S_max = 5; S_entirety is the overall score of the KQI index; S_individuality(x) is the KQI score of the individual object x. If S_individuality(x) is less than S_entirety, the index of the individual object is degraded and deviates negatively, so the calculated D'(x) is positive and indicates the negative deviation degree, with a value range of [0, 100].
The value estimation formula for the negative deviation degree threshold is the same as the value estimation formula VE for the negative influence degree threshold.
The negative deviation degree threshold optimizing process is as follows. First, VE0' of the current granularity is calculated. Next, the root causes whose negative deviation degree is greater than β + Sβ are screened out (β is the currently effective negative deviation degree threshold and Sβ is the step length, with a system default value of 0.5%), and their VE1' is calculated; if VE1' is greater than VE0', the timestamp of the granularity is put into a queue Q, otherwise the queue Q is emptied. Finally, the length of the queue Q is checked: if it is greater than T, where T is the threshold on the number of consecutive granularities, the negative deviation degree threshold is updated to β + Sβ and the queue Q is emptied.
When the optimizing period arrives, the current latest negative deviation threshold value is taken as the optimal optimizing result, and the rule is updated through the rule management unit.
The rule management unit is used to classify and store the rule intelligent optimizing results. The main storage medium is a general-purpose database; each type of rule is stored in its own database table (collectively called rule tables), the rule tables are backed up periodically, and recovery of the rule tables is supported. The rule management unit provides a simple query and update interface to meet the read-write requirements of the other units on the rules.
The rule state acquisition unit is used for acquiring rule state data, which mainly comprises index statistical data of the user identification dimension and index statistical data of the various domains. First, the index statistical data of the user identification dimension is loaded from the database, converted into index histogram probability distribution data, and sent to the rule intelligent optimizing unit for intelligent optimizing of the index weights and thresholds. Second, the perception indexes and pipeline indexes are calculated from the domain statistical table data in the database and passed to the rule intelligent optimizing unit for intelligent optimizing of the delimiting indexes and quality difference thresholds. Finally, the root cause information is loaded from the database for intelligent optimizing of the negative influence degree and the negative deviation degree.
Key points of the present application may include:
(1) The processing logic of the quality difference identification and positioning process is abstracted, and parameterizable rules are extracted, including the quality index reference threshold, the challenge threshold, the weight, the negative influence degree threshold and negative deviation degree threshold of quality difference objects, the delimiting index, the quality difference threshold and the like; the business processing flow is controlled through the rules, and the rules are optimized and adjusted in time according to the actual environment and feedback, so that the efficiency and quality of the business flow are improved and the business processing flow is continuously improved;
(2) Converting rule state data based on a service quality difference identification process into histogram probability distribution, performing rule intelligent optimization of quality difference identification through algorithms such as variance, correlation coefficient, GMM and the like, applying rules such as index weight, index threshold and the like, comprehensively evaluating service perception by adopting a grading system, and identifying contribution of quality difference service and KQI to QoE quality difference;
(3) Based on the rule state data of the quality difference positioning process, the rule intelligent optimization of quality difference positioning is carried out through algorithms such as complex correlation coefficient, least square method, polynomial curve fitting, excitation strategy and the like, so that the problem that an expert database or a knowledge base cannot sense environmental change and update in real time is avoided, and the bottleneck of manual experience analysis conclusion is avoided;
(4) The optimal rule is uniformly managed, rule updating, access and persistence are supported, the quality difference identification rule is periodically read from the rule management unit by each unit, a standby cache is generated, after the standby cache is initialized, the primary cache is switched to the standby cache, and the standby cache is changed to the primary cache;
(5) And extracting feedback of root causes after processing implementation, automatically performing effect comparison verification to form comprehensive evaluation, and applying the comprehensive evaluation to rule intelligent optimization, wherein rules can be continuously improved through a feedback mechanism, and intelligent closed loop is realized without manual intervention.
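The primary/standby cache switching mentioned in key point (4) can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `RuleCache` class, its `loader` callable and the rule keys are hypothetical names, and the rule management unit is stood in for by a plain dictionary.

```python
import threading


class RuleCache:
    """Keeps a primary cache for reads and a standby cache for reloading."""

    def __init__(self, loader):
        self._loader = loader             # callable returning the current rules
        self._primary = dict(loader())    # cache served to readers
        self._standby = {}                # cache filled during a refresh
        self._lock = threading.Lock()

    def get(self, key, default=None):
        # Readers only ever see the primary cache.
        return self._primary.get(key, default)

    def refresh(self):
        # Initialize the standby cache from the rule store, then swap roles:
        # the standby becomes the primary and the old primary becomes the standby.
        fresh = dict(self._loader())
        with self._lock:
            self._standby = fresh
            self._primary, self._standby = self._standby, self._primary


# Hypothetical rule store standing in for the rule management unit.
rule_store = {"negative_influence_threshold": 10.0}
cache = RuleCache(lambda: rule_store)
rule_store["negative_influence_threshold"] = 10.5
print(cache.get("negative_influence_threshold"))  # still the old value
cache.refresh()
print(cache.get("negative_influence_threshold"))  # updated after the switch
```

Building the standby cache completely before the swap means readers never observe a half-loaded rule set.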
The application mainly abstracts the logic of the quality difference identification and positioning process, extracts parameterizable rules, combines rule state data and root cause evaluation data, carries out intelligent optimizing on the rules, enables the rules to be self-adaptive iterative optimizing, uniformly manages the optimal rules, forms a rule base and is applied to the quality difference identification and positioning process.
Compared with the prior art, the application can meet the service quality analysis requirements of continuous evolution of the system and continuous change of service characteristics, can intelligently complete rule optimizing by combining real-time data, root cause evaluation information and history rule information, continuously adapt to network environment change, improve rule accuracy, avoid the problem that an expert database or a knowledge base cannot sense environment change and update in real time, and the bottleneck of manual experience analysis conclusion, and continuously improve the accuracy of quality difference identification and positioning results and the intelligent closed loop of the whole system by the feedback mechanism of evaluation. Compared with the prior art, the technical scheme of the application can intelligently adjust the rules by using a feedback mechanism, thereby achieving the purpose of improving the accuracy of the system.
In summary, in the data processing method provided by the present disclosure, the XDR data message, the historical rule information and the rule information of the data processing process are acquired; comprehensive root cause evaluation data in the quality difference positioning process is determined according to the XDR data message and the rule information; and the rule information is adaptively corrected according to the comprehensive root cause evaluation data and the historical rule information. Compared with the prior art, by acquiring the rule information of the data processing process and combining the historical rule information and the comprehensive root cause evaluation data to intelligently optimize the rule information, the rule information can be adaptively and iteratively optimized, the service quality analysis requirements arising from the continuous evolution of the system and the continuous change of service characteristics can be met, and at the same time the accuracy of the quality difference identification and positioning results and the intelligent closed loop of the whole system are improved.
Based on the specific implementation of the method shown in fig. 1 and fig. 2, this embodiment provides a data processing apparatus, as shown in fig. 7, including: an acquisition module 31, a determination module 32 and an optimization module 33;
An obtaining module 31, configured to obtain an XDR data packet, historical rule information, and rule information in a data processing procedure;
A determining module 32, configured to determine comprehensive root cause evaluation data in a quality difference positioning process according to the XDR data packet and the rule information;
And an optimization module 33, configured to adaptively modify the rule information according to the comprehensive root cause evaluation data and the historical rule information.
In a specific application scenario, the rule information includes a preset perception score threshold, and the determining module 32 is configured to perform concurrent processing on the XDR data message to obtain XDR memory records, where the XDR memory records are transmitted in the form of XDR messages through a message queue;
Reading the XDR messages in the message queue, and clustering the XDR records in each XDR message based on the perception index of the service dimension until the preset time granularity is reached, to obtain the perception index; calculating the comprehensive score of the perception index based on the rule information, and screening quality difference services whose comprehensive score is smaller than the preset perception score threshold; and determining the comprehensive root cause evaluation data according to the XDR memory records of the quality difference services.
In a specific application scenario, the rule information includes a reference threshold and a challenge threshold of the sensing index, and the determining module 32 is configured to calculate a sensing index score of the sensing index based on the reference threshold and the challenge threshold of the sensing index; and calculating the comprehensive score of the perception index according to the perception weight corresponding to the perception index score.
In a specific application scenario, the determining module 32 may be configured to perform a delimitation positioning process on the XDR memory record, so as to obtain a root cause object and root cause data; judging whether the occurrence frequency of the root cause object exceeds a preset threshold according to the root cause data, and if so, generating a root cause record; and determining comprehensive root cause evaluation data according to the evaluation scores and weights of different evaluation dimensions in the root cause records.
In a specific application scenario, the rule information includes the negative influence degree, the negative deviation degree and the delimiting index, and the determining module 32 is configured to calculate a contribution value of each perception index to the quality difference service; clustering the XDR memory records according to different dimension objects to generate a statistical table with preset dimensions and preset granularity; counting a perception index score of each dimension object from the statistical table based on the contribution value; screening target quality difference objects under the same dimension object based on the negative influence degree and the negative deviation degree of the perception index score, and determining the root cause object according to the delimiting index of the target quality difference object; the root cause data is determined based on the negative influence degree, the negative deviation degree, and the root cause object.
In a specific application scenario, the rule information includes a preset negative influence degree threshold, a preset negative deviation degree threshold, and a preset delimiting index threshold, and the determining module 32 is configured to screen the perceived objects whose perception index scores do not reach a preset score;
determining the perceived object whose negative influence degree exceeds the preset negative influence degree threshold as an initial quality difference object, and determining the initial quality difference object whose negative deviation degree exceeds the preset negative deviation degree threshold as the target quality difference object;
Determining a target quality difference object meeting preset conditions as the root cause object, wherein the preset conditions comprise: the delimiting index of the target quality difference object exceeds the preset delimiting index threshold, and the delimiting indexes of the objects of the same type as the target quality difference object also exceed the preset delimiting index threshold.
In a specific application scenario, the optimization module 33 may be configured to: determine the weight of each perception index by using an index weight optimization algorithm in combination with the comprehensive root cause evaluation data and the historical rule information, determine the optimal optimizing result among the perception index weights by using a k-means clustering method, and adaptively correct the rule information by using the optimal optimizing result;
determine the threshold of each perception index by using an index threshold optimizing method in combination with the comprehensive root cause evaluation data and the historical rule information, take the average value of the perception index thresholds as the optimal optimizing result, and adaptively correct the rule information by using the optimal optimizing result;
screen delimitation indexes whose confidence meets a preset confidence, determine the quality difference threshold of each delimitation index through the quality difference correlation between the perception index and the delimitation index in combination with the comprehensive root cause evaluation data and the historical rule information, take the quality difference threshold with the highest occurrence frequency as the optimal optimizing result, and adaptively correct the rule information by using the optimal optimizing result;
calculate the negative influence degree of each perception object by using a negative influence degree calculation formula, screen the root cause objects whose negative influence degree is greater than the preset negative influence degree threshold together with the corresponding perception index scores, update the negative influence degree threshold according to the root cause objects and the corresponding perception index scores, take the negative influence degree threshold that is current when the optimizing period is reached as the optimal optimizing result, and adaptively correct the rule information by using the optimal optimizing result;
and calculate the negative deviation degree of each perception object by using a deviation degree calculation formula, screen the root cause objects whose negative deviation degree is greater than the preset negative deviation degree threshold together with the corresponding perception index scores, update the negative deviation degree threshold according to the root cause objects and the corresponding perception index scores, take the negative deviation degree threshold that is current when the optimizing period is reached as the optimal optimizing result, and adaptively correct the rule information by using the optimal optimizing result.
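As a non-limiting illustration, two of the selection strategies described above (the average value for perception index thresholds, and the highest occurrence frequency for quality difference thresholds of a delimitation index) may be sketched as follows. All function names, dictionary keys and numeric values are hypothetical; the index weight optimization algorithm, the k-means step and the calculation formulas are not reproduced here.

```python
from collections import Counter
from statistics import mean


def optimal_index_threshold(candidate_thresholds):
    # Average of the candidate perception index thresholds.
    return mean(candidate_thresholds)


def optimal_quality_difference_threshold(candidate_thresholds):
    # Candidate threshold with the highest occurrence frequency.
    value, _count = Counter(candidate_thresholds).most_common(1)[0]
    return value


def apply_correction(rule_info, key, optimal_value):
    # Return a corrected copy so the rule information in use is not mutated.
    corrected = dict(rule_info)
    corrected[key] = optimal_value
    return corrected


# Example values are made up for illustration only.
rules = {"perception_index_threshold": 2.5, "delimitation_index_threshold": 100}
rules_v2 = apply_correction(
    rules, "perception_index_threshold", optimal_index_threshold([2.0, 3.0, 4.0])
)
rules_v3 = apply_correction(
    rules_v2,
    "delimitation_index_threshold",
    optimal_quality_difference_threshold([100, 120, 120, 150]),
)
```

Returning a corrected copy, rather than modifying the rule information in place, is a design choice of this sketch; it keeps the rule information currently in use unchanged until the corrected version is put into effect.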
In a specific application scenario, the apparatus further includes: a generation module 34;
The generation module 34 is configured to read the rule information and generate a primary cache and a backup cache, where the primary cache is configured to be switched with the backup cache.
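As a non-limiting illustration, the primary cache and the backup cache may be sketched as a double-buffered structure. The class name RuleCache and its methods are hypothetical, and writing corrected rule information into the backup cache before switching is an assumption of this sketch; the disclosure only states that the two caches are generated from the rule information and can be switched.

```python
import threading


class RuleCache:
    """Hypothetical double-buffered holder for rule information."""

    def __init__(self, rule_info):
        self._lock = threading.Lock()
        # Both caches start from the same rule information that was read.
        self._primary = dict(rule_info)
        self._backup = dict(rule_info)

    def get(self, key):
        # Readers always see the primary cache.
        with self._lock:
            return self._primary[key]

    def stage(self, corrected_rule_info):
        # Assumption: corrected rules are written to the backup cache first.
        with self._lock:
            self._backup = dict(corrected_rule_info)

    def switch(self):
        # Swap the roles of the primary cache and the backup cache.
        with self._lock:
            self._primary, self._backup = self._backup, self._primary


# Example values are made up for illustration only.
cache = RuleCache({"negative_influence_threshold": 0.5})
cache.stage({"negative_influence_threshold": 0.6})
value_before_switch = cache.get("negative_influence_threshold")
cache.switch()
value_after_switch = cache.get("negative_influence_threshold")
```

In this sketch the previous rule information remains available in the backup cache after the switch, so a second call to switch restores it.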
It should be noted that, for other corresponding descriptions of the functional units related to the data processing apparatus provided in this embodiment, reference may be made to corresponding descriptions of the methods in fig. 1 and fig. 2, which are not repeated herein.
Based on the above-described methods as shown in fig. 1 and 2, the present disclosure also provides a computer-readable storage medium having a computer program stored thereon, which when executed by a processor, implements the above-described methods as shown in fig. 1 and 2.
Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the method of each implementation scenario of the present disclosure.
Based on the methods shown in fig. 1 and fig. 2 and the virtual device embodiment shown in fig. 7, and in order to achieve the above object, an embodiment of the present disclosure further provides an electronic device, which may be configured on an end side of a vehicle (such as an electric automobile). The device includes a storage medium and a processor: the storage medium is configured to store a computer program, and the processor is configured to execute the computer program to implement the methods shown in fig. 1 and fig. 2.
Optionally, the physical device may further include a user interface, a network interface, a camera, radio frequency (RF) circuitry, sensors, audio circuitry, a Wi-Fi module, and so on. The user interface may include a display screen and an input unit such as a keyboard, and may optionally also include a USB interface, a card reader interface, etc. The network interface may optionally include a standard wired interface, a wireless interface (e.g., a Wi-Fi interface), etc.
It will be appreciated by those skilled in the art that the physical device structure described above does not limit the physical device, which may include more or fewer components, combine certain components, or use a different arrangement of components.
The storage medium may also include an operating system and a network communication module. The operating system is a program that manages the hardware and software resources of the physical device described above and supports the execution of information processing programs and other software and/or programs. The network communication module is used to implement communication among the components within the storage medium, as well as communication with other hardware and software in the physical device.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present disclosure may be implemented by software plus a necessary general-purpose hardware platform, or by hardware. Compared with the prior art, the data processing method, apparatus and electronic device provided by the present disclosure acquire the XDR data message, the historical rule information and the rule information of the data processing process; determine comprehensive root cause evaluation data in the quality difference positioning process according to the XDR data message and the rule information; and adaptively correct the rule information according to the comprehensive root cause evaluation data and the historical rule information. In this way, the rule information of the data processing process is acquired and intelligently optimized in combination with the historical rule information and the comprehensive root cause evaluation data, so that the rule information can be adaptively and iteratively optimized and can meet the service quality analysis requirements arising from continuous evolution of the system and continuous change of service characteristics, while the accuracy of quality difference identification and positioning results and the intelligent closed loop of the whole system are improved.
It should be noted that, in this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The above is merely a specific embodiment of the disclosure to enable one skilled in the art to understand or practice the disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.