US20160350198A1 - Detection of abnormal resource usage in a data center - Google Patents
- Publication number
- US20160350198A1 (application US14/721,777)
- Authority
- US
- United States
- Prior art keywords
- resource usage
- usage data
- resource
- data
- abnormal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/006—Identification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3006—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3051—Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3065—Monitoring arrangements determined by the means or processing involved in reporting the monitored data
- G06F11/3072—Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
- G06F11/3082—Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting the data filtering being achieved by aggregating or compressing the monitored data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3447—Performance evaluation by modeling
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3452—Performance evaluation by statistical analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
- G06F11/3495—Performance evaluation by tracing or monitoring for systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/86—Event-based monitoring
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/10—Machine learning using kernel methods, e.g. support vector machines [SVM]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
Definitions
- a public cloud computing system (“a cloud”) provides shared computing resources for use by customers.
- the computing resources of a cloud are hardware and software resources.
- the hardware resources include components of servers such as cores of central processing units (CPUs), graphics processing units (GPUs), main memory, secondary storage, and so on.
- the software resources include operating systems, database systems, accounting applications, and so on.
- a typical cloud may have several data centers at various locations throughout the world. Each data center may host tens of thousands of servers.
- a customer typically purchases a subscription to use the services of the cloud.
- a customer may provide billing information and be provided an account that is accessible using logon information such as a user name and password.
- a cloud provider may offer various incentives that allow users to subscribe and use the cloud for a limited time.
- a customer can then use the servers of the cloud to execute computer programs such as for hosting websites, performing accounting functions, performing data analyses, and so on.
- a cloud may use various billing models, such as a model based on the amount of core usage, memory usage, and other resource usage.
- Clouds, like other computer systems, are susceptible to cyber-attacks. These cyber-attacks may include viruses, worms, denial-of-service attacks, and so on. Clouds are also susceptible to fraudulent use of resources resulting from exploitation of a vulnerability in the subscription process of the cloud. For example, a cloud may offer free 30-day subscriptions to new customers. When subscribing, the user may be presented with a Completely Automated Public Turing test to tell Computers and Humans Apart (“CAPTCHA”). If the user discovers a vulnerability in the subscription process that allows the user to bypass or always pass the CAPTCHA test, that user may be able to develop a computer program to create hundreds or thousands of new, but unauthorized, subscriptions.
- a user may have access to valid, but stolen, credit card numbers. Such a user can create hundreds of subscriptions using the stolen credit card numbers before the theft is identified. Such unauthorized users can then use computer resources at such a high rate that there are insufficient resources left to service the needs of authorized users. To help lessen the impact of such unauthorized uses, a cloud may be designed with a certain amount of capacity that is in excess of what is needed to support authorized users. The price of purchasing and maintaining such excess capacity can be high.
- a system for identifying abnormal resource usage in a data center employs a prediction model for each of a plurality of resources and an abnormal resource usage criterion.
- the prediction models are generated from resource usage data of the data center, and the abnormal resource usage criterion is established based on error statistics for the prediction models.
- the system retrieves current resource usage data for a current time and past resource usage data for that resource.
- the system extracts features from the past resource usage data for that resource, predicts, using the prediction model for that resource, resource usage data for the current time based on the extracted features, and determines an error between the predicted resource usage data and the current resource usage data.
- the system determines whether the errors satisfy the abnormal resource usage criterion. If so, the system indicates that an abnormal resource usage has occurred.
- FIG. 1 is a flow diagram that illustrates the generating of a classifier in some embodiments of an abnormal activity detection (“AAD”) system.
- FIG. 2 is a flow diagram that illustrates the identifying of an abnormal resource usage in some embodiments of the AAD system.
- FIG. 3 is a block diagram that illustrates components of the AAD system in some embodiments.
- FIG. 4 is a flow diagram that illustrates the processing of a generate classifier component in some embodiments.
- FIG. 5 is a flow diagram that illustrates the processing of a generate model component in some embodiments.
- FIG. 6 is a flow diagram that illustrates the processing of a generate error statistics component in some embodiments.
- FIG. 7 is a flow diagram that illustrates the processing of a calculate error data component in some embodiments.
- FIG. 8 is a flow diagram that illustrates the processing of an apply classifier component in some embodiments.
- FIG. 9 is a flow diagram that illustrates the processing of a generate classification data component in some embodiments.
- an abnormal activity detection (“AAD”) system detects when resource usage at a data center is so high that the resource usage is likely fraudulent. Such fraudulent usage may be referred to as a “fraud storm” at the data center because of the sudden onset of significant fraudulent use.
- the AAD system detects a fraud storm using a classifier to classify whether the current resource usage indicates abnormal activity resulting in abnormal resource usage at the data center.
- the AAD system may generate a prediction model for various resources to predict normal resource usage given past resource usage.
- the AAD system uses resources that are likely to increase during a fraud storm, such as number of cores in use, number of new subscriptions, amount of outbound traffic, amount of disk usage, and so on.
- the AAD system may also generate an error model to estimate the errors in the prediction models based on a comparison of predicted resource usage and actual resource usage of past resource usage. To determine if abnormal activity is occurring at the current time, the AAD system applies the classifier to past resource usage. The classifier uses the prediction models to predict resource usage for the current time based on past resource usage and then determines an error between the predicted resource usage and the current resource usage for each resource. If the errors satisfy an abnormal resource usage criterion, then the AAD system indicates that abnormal activity is occurring.
- the provider of the cloud can take various steps to stop the abnormal activity such as revoking new subscriptions that appear to be fraudulently obtained, limiting the subscription rate, placing additional protections on the subscription process, identifying and correcting the vulnerability that led to the abnormal activity, and so on.
- the AAD system generates a classifier to identify abnormal resource usage in a data center based on resource usage data collected for various resources while normal activity was occurring. For each of the resources, the AAD system collects (e.g., is provided with data collected by the data center) resource usage data at various time intervals. For example, the interval may be one hour, and the resource usage data may include the average number of cores in use during that hour and the number of new subscriptions received during that hour. For each of the intervals, the AAD system identifies the current resource usage data for that resource and extracts features from past resource usage data for one or more resources. The extracted features may include the average resource usage in the hour that was 1, 2, 4, and 8 hours ago and in the hour that was 1, 2, 4, 7, and 14 days ago.
- the extracted features may also include the average resource usage over the past 2, 4, 8, 12, 24, and 48 hours. Other features may be used from past resource usage data that may be indicative of the current resource usage data such as differences between resource usage data, variations in resource usage data, and so on. Also, the features may also include monthly and annual features to help account for seasonal variations.
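As a concrete sketch of the feature extraction described above, the following Python builds the lag and rolling-average features for one resource from an hourly usage series. The function name, input layout, and return format are illustrative assumptions, not taken from the patent.

```python
def extract_features(usage, t):
    """Build a feature vector for hour t from past hourly usage data.

    usage: list of hourly resource usage values, indexed by hour.
    Returns lag features (1, 2, 4, 8 hours and 1, 2, 4, 7, 14 days ago)
    plus rolling averages over the past 2, 4, 8, 12, 24, and 48 hours,
    mirroring the feature set described in the text.
    """
    # lag features: usage in the hour that was h hours ago
    lag_hours = [1, 2, 4, 8] + [24 * d for d in (1, 2, 4, 7, 14)]
    lags = [usage[t - h] for h in lag_hours]
    # rolling averages over the most recent w hours (excluding hour t)
    windows = [2, 4, 8, 12, 24, 48]
    rolls = [sum(usage[t - w:t]) / w for w in windows]
    return lags + rolls
```

Monthly or annual lags could be appended in the same way to capture the seasonal variations the text mentions.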
- the AAD system then generates a prediction model for each resource from the current resource usage data and the extracted features for predicting resource usage data for that resource at a current time given features extracted from past resource usage data.
- the prediction model can be generated using various regression models such as random forest regression, k-nearest neighbors regression, support vector machine (SVM) with radial basis function (RBF) kernel, linear regression, ridge linear regression, and so on.
- the AAD system also generates error statistics based on estimated errors in the prediction models derived from the collected resource usage data. For example, the error statistics may include mean and standard deviation of the errors for each resource and covariances of errors of resources.
- the AAD system then establishes from the error statistics an abnormal resource usage criterion that, when satisfied, indicates that abnormal activity is occurring.
- the abnormal resource usage criterion may be based on a p-value determined for some significance level.
- the AAD system may regenerate the classifier at various times such as periodically (e.g., weekly) or when certain events occur (e.g., a certain number of non-fraudulent new subscriptions are received).
- the AAD system identifies abnormal resource usage using the prediction models and the abnormal resource usage criterion. For each resource, the AAD system accesses current resource usage data for a current time and past resource usage data for the resources. The AAD system extracts features from the past resource usage data for these resources (i.e., the features used to generate the classifier). The AAD system then uses the prediction model for that resource to predict resource usage data for the current time based on the extracted features. The AAD system then determines an error between the predicted resource usage data and the current resource usage data. After determining the errors for each of the resources, the AAD system determines whether the determined errors satisfy the abnormal resource usage criterion. If so, the AAD system indicates that an abnormal resource usage is occurring.
- FIG. 1 is a flow diagram that illustrates the generating of a classifier in some embodiments of the AAD system.
- the AAD system generates a prediction model for each resource.
- the AAD system selects the next resource.
- in decision block 102, if all the resources have already been selected, then the AAD system continues at block 108, else the AAD system continues at block 103.
- the AAD system accesses resource usage data for the selected resource for various times.
- the resource usage data may be provided by a data center to the AAD system.
- in blocks 104-106, the component generates, for each time interval (e.g., every hour) within a window (e.g., 60 days), classification data that includes current resource usage data for that time interval and features extracted from past resource usage data. In block 104, the component selects the next time interval. In decision block 105, if all the time intervals have already been selected, then the AAD system continues at block 107, else the AAD system continues at block 106. In block 106, the AAD system identifies the current resource usage data for the time interval and extracts various features for the selected time interval from past resource usage data and then loops to block 104 to select the next time interval.
- if the AAD system regenerates the classifier, then it need only generate classification data from the time the classifier was last generated.
- the AAD system generates a prediction model for the selected resource and then loops to block 101 to select the next resource.
- the AAD system generates error statistics for the prediction models.
- the component establishes an abnormal resource usage criterion based on the error statistics and then completes.
- FIG. 2 is a flow diagram that illustrates the identifying of an abnormal resource usage in some embodiments of the AAD system.
- the AAD system may perform this identifying at various time intervals (e.g., hourly).
- the component accesses the resource usage data for the data center.
- the AAD system selects the next resource.
- in decision block 203, if all the resources have already been selected, then the AAD system continues at block 206, else the AAD system continues at block 204.
- the AAD system predicts resource usage data for the selected resource for the current time interval using the prediction model and features extracted from past resource usage data.
- the AAD system determines the error between the predicted resource usage data and the current resource usage data for the time interval. The component then loops to block 202 to select the next resource. In decision block 206, if the abnormal usage criterion is satisfied by the determined errors, then the AAD system returns an indication of abnormal resource usage, else the AAD system returns an indication of normal resource usage.
- FIG. 3 is a block diagram that illustrates components of the AAD system in some embodiments.
- the AAD system 300 includes a generate classifier component 301 , an apply classifier component 302 , a generate model component 303 , a generate error statistics component 304 , a calculate error data component 305 , and a generate classification data component 306 .
- the generate classifier component is invoked periodically to generate a classifier based on the most recent resource usage data (e.g., the past 60 days).
- the apply classifier component is invoked at various time intervals to determine whether abnormal resource usage is occurring.
- the generate model component is invoked to generate a prediction model for a resource.
- the generate error statistics component is invoked to generate error statistics for the resources to estimate the error in the generated prediction models.
- the calculate error data component is invoked to calculate the error data for a prediction model.
- the generate classification data component is invoked to generate the classification data for use in generating the classifier and in applying the classifier.
- the AAD system also includes a classifier storage 307 to store the weights for the prediction models, the error statistics, and the abnormal resource usage criterion.
- the AAD system retrieves usage data from the resource usage data storage 310 of the data center.
- the AAD system may also include a fill-in gaps component 308 to fill in gaps in the resource usage data and a supervised classifier component 309 to filter out activity that has been erroneously identified as abnormal activity.
- the computing devices and systems on which the AAD system may be implemented may include a central processing unit, input devices, output devices (e.g., display devices and speakers), storage devices (e.g., memory and disk drives), network interfaces, graphics processing units, accelerometers, cellular radio link interfaces, global positioning system devices, and so on.
- the input devices may include keyboards, pointing devices, touch screens, gesture recognition devices (e.g., for air gestures), head and eye tracking devices, microphones for voice recognition, and so on.
- the computing devices may include desktop computers, laptops, tablets, e-readers, personal digital assistants, smartphones, gaming devices, servers, and computer systems such as massively parallel systems.
- the computing devices may access computer-readable media that include computer-readable storage media and data transmission media.
- the computer-readable storage media are tangible storage means that do not include a transitory, propagating signal. Examples of computer-readable storage media include memory such as primary memory, cache memory, and secondary memory (e.g., DVD) and other storage means.
- the computer-readable storage media may have recorded on it or may be encoded with computer-executable instructions or logic that implements the AAD system.
- the data transmission media is used for transmitting data via transitory, propagating signals or carrier waves (e.g., electromagnetism) via a wired or wireless connection.
- the AAD system may be described in the general context of computer-executable instructions, such as program modules and components, executed by one or more computers, processors, or other devices.
- program modules or components include routines, programs, objects, data structures, and so on that perform particular tasks or implement particular data types.
- the functionality of the program modules may be combined or distributed as desired in various embodiments.
- aspects of the AAD system may be implemented in hardware using, for example, an application-specific integrated circuit (ASIC).
- FIG. 4 is a flow diagram that illustrates the processing of a generate classifier component in some embodiments.
- the generate classifier component 400 is invoked to generate the classifier.
- the component invokes the generate model component passing for each time interval classification data that includes features for the cores resource (Xc) and the corresponding current resource usage data (yc) for that time interval and receives the weights for the features for the model (fc) in return.
- a subset of the features of the classification data may be represented by the following table:
- the time column represents the time for the data in each row: time 0 represents the current time, time −1 represents one hour ago, time −2 represents two hours ago, and so on.
- the 0 hour column represents the current resource usage data (yc) for the corresponding time, and the other columns represent the extracted features (Xc) for the corresponding time.
- the illustrated extracted features include the number of cores in use one hour ago, eight hours ago, one day ago, and 14 days ago. For example, in the row for four hours ago, the extracted features were 7000, 11000, 9000, and 7000.
- the generate classifier component invokes the generate model component passing for each time interval classification data that includes features for the subscriptions resource (Xn) and the corresponding resource usage data (yn) for that time interval and receives the weights for the features for the model (fn) in return.
- the component invokes a generate error statistics component and receives the error statistics in return such as a covariance matrix and the mean of the errors for each resource.
- the component establishes the abnormal resource usage criterion as a p-value for a multivariate normal distribution based on a threshold significance level.
- the p-value may be generated based on a Mahalanobis distance or based on the estimated weight of a cumulative distribution function in a rectangle of values higher than the observed values. (See Genz, A. and Bretz, F., “Computation of Multivariate Normal and t Probabilities,” Springer Science & Business Media (2009).) The component then completes.
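The Mahalanobis-distance route mentioned above can be sketched in a few lines. Under a multivariate normal assumption, the squared Mahalanobis distance of the error vector is chi-squared distributed; with two resources (cores and subscriptions, as in the figures) the chi-squared survival function has the closed form exp(−d²/2), which this sketch exploits. The function names and the default significance level are assumptions.

```python
import math
import numpy as np

def error_p_value(errors, mean, cov):
    """p-value of a 2-resource error vector under a multivariate normal
    model of the errors: the squared Mahalanobis distance d2 follows a
    chi-squared distribution with 2 degrees of freedom, whose survival
    function is exp(-d2 / 2)."""
    diff = np.asarray(errors, float) - np.asarray(mean, float)
    d2 = float(diff @ np.linalg.inv(cov) @ diff)  # squared Mahalanobis distance
    return math.exp(-d2 / 2.0)

def is_abnormal(errors, mean, cov, alpha=0.01):
    """Abnormal resource usage criterion: p-value below significance level."""
    return error_p_value(errors, mean, cov) < alpha
```

For more than two resources the survival function of the general chi-squared distribution (or the cumulative-distribution approach cited above) would replace the closed form.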
- the resource usage data collected at a data center may have gaps or may have been collected during a period of abnormal activity.
- a gap may be present because, for example, a component of the data center responsible for collecting the data may have failed or may have been taken down for maintenance.
- the resource usage data that is collected during a period of abnormal activity might not be useful in predicting normal resource usage data. Such a period may be considered a gap in the collecting of normal resource usage data.
- a fill-in gap component may use various interpolation techniques such as a linear Gaussian Bayesian network or linear interpolation. With a Bayesian network, the missing data can be imputed using forward sampling with likelihood weighting or using belief propagation.
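Of the two gap-filling options above, linear interpolation is the simpler; a sketch follows (a linear Gaussian Bayesian network with forward sampling or belief propagation would be considerably more involved). The function name and the use of None to mark gaps are assumptions.

```python
def fill_gaps(series):
    """Replace None entries with values linearly interpolated between the
    nearest known neighbors. Assumes the first and last entries are known."""
    out = list(series)
    i = 0
    while i < len(out):
        if out[i] is None:
            j = i
            while out[j] is None:  # find the end of this gap
                j += 1
            lo, hi = out[i - 1], out[j]
            for k in range(i, j):  # interpolate across the gap
                out[k] = lo + (hi - lo) * (k - i + 1) / (j - i + 1)
            i = j
        i += 1
    return out
```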
- FIG. 5 is a flow diagram that illustrates the processing of a generate model component in some embodiments.
- the generate model component 500 is invoked to generate a prediction model for a resource based on current usage data and extracted features for various time intervals.
- the component uses a ridge regression model although, as described above, other models may be employed.
- the component generates prediction models for various ridge values and selects the prediction model with the smallest error.
- the ridge values may be a logarithmic set including the values 10^0, 10^1, 10^2, . . . , 10^12.
- for each ridge value, the component generates various prediction models using different subsets of the time intervals and calculates the error from the remaining time intervals.
- the component selects the next ridge value.
- in decision block 502, if all the ridge values have already been selected, then the component returns the prediction model (i.e., the weights for the features), else the component continues at block 503.
- the component selects the next time interval.
- in decision block 504, if all the time intervals have already been selected for the selected ridge, then the component loops to block 501 to select the next ridge, else the component continues at block 505.
- the component applies a linear regression technique to generate the prediction model using the selected ridge based on the subset of intervals that does not include the selected interval.
- the component uses the generated model to predict resource usage data for the selected time interval.
- the component calculates the error between the predicted resource usage data and the current resource usage data.
- in decision block 508, if the error is less than the minimum error encountered so far, then the component continues at block 509, else the component loops to block 503 to select the next time interval.
- the component sets the minimum error encountered so far to the error calculated in block 507 .
- the component sets the prediction model to the prediction model generated in block 505 . The component then loops to block 503 to select the next time interval.
- although the component is illustrated as generating a prediction model for each time interval for each ridge value, the component may generate prediction models for larger subsets of the time intervals rather than for each time interval, which may produce acceptable prediction models with less computational resources.
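The flow of FIG. 5 can be sketched with closed-form ridge regression, w = (XᵀX + λI)⁻¹Xᵀy, holding out one interval at a time and keeping the weights with the smallest held-out error. The grid mirrors the 10^0 . . . 10^12 set mentioned above; function names are assumptions.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Closed-form ridge regression: solve (X'X + lam*I) w = X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def select_ridge(X, y, lams):
    """For each ridge value, fit on all-but-one interval and score on the
    held-out interval; return the weights with the smallest error seen,
    as in blocks 508-510 of FIG. 5."""
    best_err, best_w = np.inf, None
    for lam in lams:
        for i in range(len(y)):
            mask = np.arange(len(y)) != i      # withhold interval i
            w = fit_ridge(X[mask], y[mask], lam)
            err = abs(float(X[i] @ w) - y[i])  # held-out prediction error
            if err < best_err:
                best_err, best_w = err, w
    return best_w
```

A production version would average errors over the held-out intervals per ridge value rather than keep a single minimum, a choice the text leaves open.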
- FIG. 6 is a flow diagram that illustrates the processing of a generate error statistics component in some embodiments.
- the generate error statistics component 600 generates error statistics for the prediction model using cross-validation to estimate the error.
- the component invokes a calculate error data component passing the current resource usage data and extracted features for the time intervals for the cores resource and receives the error data (ec) in return.
- the component invokes the calculate error data component passing the current resource usage data and extracted features for the time intervals for the subscriptions resource and receives the error data (en) in return.
- the component calculates the mean for the error data of the cores resource.
- the component calculates the mean for the error data of the subscriptions resource.
- the component calculates a covariance matrix based on the error data for the cores resource and the subscriptions resource and then returns.
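The error-statistics step above reduces to stacking the per-resource error series and taking their means and covariance matrix. The NumPy calls are real; the error values and variable names below are purely illustrative.

```python
import numpy as np

# illustrative error data for the two resources (not from the patent)
e_cores = np.array([1.0, -2.0, 0.5, 0.5])
e_subs = np.array([0.2, -0.1, 0.0, -0.1])

# per-resource mean errors
mean_cores = e_cores.mean()
mean_subs = e_subs.mean()

# 2x2 covariance matrix of the errors across resources
cov = np.cov(np.vstack([e_cores, e_subs]))
```

These statistics parameterize the multivariate normal model from which the p-value criterion is derived.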
- FIG. 7 is a flow diagram that illustrates the processing of a calculate error data component in some embodiments.
- the calculate error data component 700 generates a prediction model based on various subsets of the time intervals, uses the prediction model to predict resource usage data for the remaining intervals, and calculates the error between the predicted resource usage data and the current resource usage data for each such time interval.
- the component may generate five prediction models withholding a different 20% of the intervals for each prediction model.
- the component selects the next cross-validation.
- in decision block 702, if all the cross-validations have already been selected, then the component returns, else the component continues at block 703.
- the component selects a subset of the time intervals to withhold.
- the component invokes the generate model component passing an indication of the current resource usage data and features that have not been withheld.
- the component uses the generated prediction model to calculate the error between predicted resource usage data and current resource usage data for the withheld intervals.
- the component selects the next withheld interval.
- in decision block 706, if all the withheld intervals have already been selected, then the component loops to block 701 to select the next cross-validation, else the component continues at block 707.
- the component calculates the error for the selected interval and then loops to block 705 to select the next withheld interval.
- FIG. 8 is a flow diagram that illustrates the processing of an apply classifier component in some embodiments.
- the apply classifier component 800 is passed an indication of the features for the resources and determines whether the current resource usage data, when compared to the predicted resource usage data, indicates an abnormal resource usage.
- the component applies the prediction model for the cores resource to the features for the cores resource to predict resource usage data.
- the component applies the prediction model for the subscriptions resource to the features for the subscriptions resource to predict resource usage data.
- the component calculates the error between the predicted resource usage data and the current resource usage data for the cores resource.
- the component calculates the error between the predicted resource usage data and the current resource usage data for the subscriptions resource.
- decision block 805 if the errors satisfy an abnormal resource usage criterion, then the component returns an indication that the resource usage is abnormal, else the component returns an indication that the resource usage is normal.
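A minimal sketch of this apply-classifier flow, assuming the per-resource prediction models are linear weight vectors and the abnormal resource usage criterion is a threshold on the squared Mahalanobis distance of the error vector; the function and argument names are illustrative, not the patent's own identifiers.

```python
import numpy as np

def apply_classifier(models, features, current_usage, error_mean, error_cov, threshold):
    """Predict usage for each resource (e.g., cores, subscriptions), compute
    the prediction errors, and test them against an abnormality criterion."""
    errors = []
    for name, weights in models.items():
        predicted = float(np.dot(weights, features[name]))  # predicted resource usage
        errors.append(current_usage[name] - predicted)      # error for this resource
    # Criterion: squared Mahalanobis distance of the error vector under the
    # error statistics (mean and covariance) exceeds a threshold.
    d = np.array(errors) - error_mean
    distance_sq = float(d @ np.linalg.inv(error_cov) @ d)
    return "abnormal" if distance_sq > threshold else "normal"
```

In practice the threshold would come from the error statistics established when the classifier was generated.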
- the AAD system may generate a supervised classifier to filter out erroneous indications of abnormal resource usage.
- the AAD system may use as training data for the supervised classifier the resource usage data that has been indicated as being abnormal and labels (e.g., manually generated) that identify the resource usage data as being normal or abnormal.
- the AAD system may use any of a variety of supervised training techniques such as an SVM, decision trees, adaptive boosting, and so on.
- the supervised classifier component can then input the features for that abnormal resource usage data and classify it as being normal or abnormal.
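As one concrete possibility among the supervised techniques named above (SVMs, decision trees, adaptive boosting), a one-level decision tree (a decision stump) trained on the labeled abnormal indications might look like the following; the function name and flat feature layout are assumptions for illustration.

```python
def train_stump(samples, labels):
    """Train a one-level decision tree (decision stump) on feature vectors of
    usage flagged as abnormal; labels are the (e.g., manually generated)
    True/False judgments of whether the usage really was abnormal."""
    best = None  # (accuracy, feature index, threshold)
    for j in range(len(samples[0])):
        for s in samples:
            t = s[j]
            preds = [x[j] > t for x in samples]
            acc = sum(p == l for p, l in zip(preds, labels)) / len(labels)
            if best is None or acc > best[0]:
                best = (acc, j, t)
    _, j, t = best
    # The stump confirms "abnormal" only when feature j exceeds threshold t,
    # filtering out indications that fall below it.
    return lambda x: x[j] > t
```

A real deployment would favor a stronger learner, but the stump shows the filtering role the supervised classifier plays.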
- FIG. 9 is a flow diagram that illustrates the processing of a generate classification data component in some embodiments.
- the generate classification data component 900 generates the current classification data for the resources.
- In block 901, the component selects the next resource.
- In decision block 902, if all the resources have already been selected, then the component returns, else the component continues at block 903.
- In block 903, the component retrieves the current resource usage data for the selected resource.
- In block 904, the component retrieves past resource usage data for one hour ago for the selected resource.
- In block 905, the component retrieves past resource usage data for 14 days ago for the selected resource. The ellipsis between block 904 and block 905 indicates that past resource usage data may be retrieved for other intervals.
- In block 906, the component generates average resource usage data over the past two hours for the selected resource.
- In block 907, the component generates average resource usage data for the last 48 hours for the selected resource and then loops to block 901 to select the next resource.
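Assuming hourly samples stored in a flat list, the generate classification data flow above (lagged values plus rolling averages) might be sketched as follows; the function name and list layout are illustrative, and the lag and window choices mirror the examples given elsewhere in the description.

```python
def generate_classification_data(usage, t):
    """Return (current resource usage, extracted features) for hour t.
    usage is a list of hourly resource usage values; usage[t] is hour t."""
    lag_hours = [1, 2, 4, 8] + [24 * d for d in (1, 2, 4, 7, 14)]  # hours/days ago
    windows = [2, 4, 8, 12, 24, 48]            # rolling-average windows (hours)
    features = [usage[t - h] for h in lag_hours]
    features += [sum(usage[t - w:t]) / w for w in windows]
    return usage[t], features
```

The same routine serves both training (over a window of past intervals) and classification (at the current time).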
- a method performed by a computer system for generating a classifier to identify abnormal resource usage in a data center is provided.
- the method, for each of a plurality of resources, provides resource usage data for that resource at various times.
- the method, for each of a plurality of times, identifies current resource usage data for that resource for that time and extracts features from past resource usage data of that resource prior to that time.
- the method generates a prediction model for that resource from the current resource usage data and the extracted features for the times to predict resource usage data for that resource at a current time given features extracted from past resource usage data.
- the method then generates from the resource usage data for the resources error statistics for the prediction models and establishes from the error statistics an abnormal resource usage criterion.
- the method may be used in conjunction with any one of or any combination of the following embodiments.
- the method may further, for each of the plurality of resources, provide current resource usage data for a current time and past resource usage data for that resource, extract features from the past resource usage data for that resource, generate by the prediction model for that resource predicted resource usage data for the current time, and determine error between the predicted resource usage data and the current resource usage data.
- when the determined errors satisfy the abnormal resource usage criterion, the method may indicate that abnormal resource usage has occurred.
- a resource may be cores of the data center and the resource usage data for the cores may be the number of cores in use at the data center.
- the extracted features for the number of cores may include the average number of cores in use during past intervals.
- a resource may also be subscriptions to the data center and the resource usage data for the subscriptions may be the number of new subscriptions to the data center.
- the extracted features for subscriptions may include the number of new subscriptions during past intervals.
- the error statistics may be generated using cross-validation of a prediction model.
- the method may further regenerate the classifier on a periodic basis.
- the error statistics may include a mean of the errors for each resource and a covariance for each pair of resources.
- the abnormal resource usage criterion may be based on a p-value for the error statistics.
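To make the p-value criterion concrete: with two resources (cores and subscriptions), the squared Mahalanobis distance of the error vector is chi-square distributed with 2 degrees of freedom, whose survival function is exactly exp(−d²/2). A minimal sketch, with an assumed function name:

```python
import math
import numpy as np

def abnormality_p_value(errors, mean, cov):
    """p-value for observed prediction errors under a multivariate normal
    error model with the given mean and covariance (two resources, so the
    chi-square has 2 degrees of freedom)."""
    d = np.asarray(errors, float) - np.asarray(mean, float)
    d2 = float(d @ np.linalg.inv(np.asarray(cov, float)) @ d)
    return math.exp(-d2 / 2.0)  # chi-square(2) survival function at d2
```

Usage would then be flagged as abnormal when the p-value falls below a chosen significance level, e.g. `abnormality_p_value(e, m, c) < 0.01`.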
- a computer-readable storage medium stores computer-executable instructions for controlling a computing system to identify abnormal resource usage in a data center.
- the computer-executable instructions comprise instructions that access a prediction model for each of a plurality of resources and an abnormal resource usage criterion, where the prediction models may be generated from resource usage data of the data center and where the abnormal resource usage criterion may be established based on error statistics for the prediction models.
- the instructions further, for each of a plurality of resources of the data center, access current resource usage data for a current time and past resource usage data for that resource, extract features from the past resource usage data for that resource, predict by the prediction model for that resource predicted resource usage data for the current time based on the extracted features, and determine an error between the predicted resource usage data and the current resource usage data.
- the instructions further, when the determined errors satisfy the abnormal resource usage criterion, indicate an abnormal resource usage has occurred.
- the extracted features for the number of cores may include the average number of cores in use during past intervals and the extracted features for subscriptions may include the number of new subscriptions received during past intervals.
- the instructions further, for each of the plurality of resources of the data center, collect resource usage data for that resource at each of a plurality of intervals, and the extracted features include resource usage data for time intervals of one hour, one day, and one week prior to the current time.
- the instructions may further, when an abnormal resource usage has been indicated, apply a supervised classifier to the extracted features to filter out erroneous indications of abnormal resource usage.
- a computer system that identifies abnormal resource usage in a data center is provided.
- the computer system may comprise one or more computer-readable storage media storing computer-executable instructions and one or more processors for executing the computer-executable instructions stored in the one or more computer-readable storage media.
- the instructions may include instructions that access current resource usage data for a current time and features of past resource usage data for resources of the data center, and apply a classifier to the current resource usage data and the features to determine whether the current resource usage data represents an abnormal resource usage.
- the classifier may, for each of a plurality of resources of the data center, predict using a prediction model for that resource predicted resource usage data for the current time based on the features and determine an error between the predicted resource usage data and the current resource usage data and when the determined errors satisfy the abnormal resource usage criterion, indicate an abnormal resource usage has occurred.
- the instructions further include instructions for generating the classifier that, for each of the plurality of resources, for each of a plurality of times, identify current resource usage data for that resource for that time and extract features from past resource usage data for that resource and then generate a prediction model for that resource from the current resource usage data and the extracted features for the times to predict resource usage data for that resource at a current time given features extracted from past resource usage data.
- the instructions may further include instructions that generate from the resource usage data for the resources error statistics for the prediction models and establish from the error statistics an abnormal resource usage criterion.
- the classifier is regenerated at various times using resource usage data that includes resource usage data collected since the classifier was last generated.
- the prediction models may be generated using a linear regression technique.
- a resource may be cores of the data center and a resource may be subscriptions to the data center.
- the extracted features for the number of cores may include the average number of cores in use during past intervals, and the extracted features for subscriptions may include the number of new subscriptions during past intervals.
- the instructions may further, when an abnormal resource usage has been indicated, apply a supervised classifier to the extracted features to filter out erroneous indications of abnormal resource usage.
Abstract
Description
- A public cloud computing system (“a cloud”) provides shared computing resources for use by customers. The computing resources of a cloud are hardware and software resources. The hardware resources include components of servers such as cores of central processing units (CPUs), graphics processing units (GPUs), main memory, secondary storage, and so on. The software resources include operating systems, database systems, accounting applications, and so on. A typical cloud may have several data centers at various locations throughout the world. Each data center may host tens of thousands of servers.
- To use a cloud, a customer typically purchases a subscription to use the services of the cloud. When purchasing a subscription, a customer may provide billing information and be provided an account that is accessible using logon information such as a user name and password. To encourage users to become customers, a cloud provider may offer various incentives that allow the users to subscribe and use the cloud for a limited time. Once logged on, a customer can then use the servers of the cloud to execute computer programs such as for hosting websites, performing accounting functions, performing data analyses, and so on. A cloud may use various billing models, such as a model based on the amount of core usage, memory usage, and other resource usage.
- Clouds, like other computer systems, are susceptible to cyber-attacks. These cyber-attacks may include viruses, worms, denial-of-service attacks, and so on. Clouds are also susceptible to fraudulent use of resources resulting from exploitation of a vulnerability in the subscription process of the cloud. For example, a cloud may offer free 30-day subscriptions to new customers. When subscribing, the user may be presented with a Completely Automated Public Turing test to tell Computers and Humans Apart (“CAPTCHA”). If the user discovers a vulnerability in the subscription process that allows the user to bypass or always pass the CAPTCHA test, that user may be able to develop a computer program to create hundreds or thousands of new, but unauthorized, subscriptions. As another example, a user may have access to valid, but stolen, credit card numbers. Such a user can create hundreds of subscriptions using the stolen credit card numbers before the theft is identified. Such unauthorized users can then use computer resources at such a high rate that there are insufficient resources left to service the needs of authorized users. To help lessen the impact of such unauthorized uses, a cloud may be designed with a certain amount of capacity that is in excess of what is needed to support authorized users. The price of purchasing and maintaining such excess capacity can be high.
- A system for identifying abnormal resource usage in a data center is provided. In some embodiments, the system employs a prediction model for each of a plurality of resources and an abnormal resource usage criterion. The prediction models are generated from resource usage data of the data center, and the abnormal resource usage criterion is established based on error statistics for the prediction models. For each of a plurality of resources of the data center, the system retrieves current resource usage data for a current time and past resource usage data for that resource. The system then extracts features from the past resource usage data for that resource, predicts, using the prediction model for that resource, resource usage data for the current time based on the extracted features, and determines an error between the predicted resource usage data and the current resource usage data. After determining the error data for the resources, the system determines whether the errors satisfy the abnormal resource usage criterion. If so, the system indicates that an abnormal resource usage has occurred.
- This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
FIG. 1 is a flow diagram that illustrates the generating of a classifier in some embodiments of an abnormal activity detection (“AAD”) system. -
FIG. 2 is a flow diagram that illustrates the identifying of an abnormal resource usage in some embodiments of the AAD system. -
FIG. 3 is a block diagram that illustrates components of the AAD system in some embodiments. -
FIG. 4 is a flow diagram that illustrates the processing of a generate classifier component in some embodiments. -
FIG. 5 is a flow diagram that illustrates the processing of a generate model component in some embodiments. -
FIG. 6 is a flow diagram that illustrates the processing of a generate error statistics component in some embodiments. -
FIG. 7 is a flow diagram that illustrates the processing of a calculate error data component in some embodiments. -
FIG. 8 is a flow diagram that illustrates the processing of an apply classifier component in some embodiments. -
FIG. 9 is a flow diagram that illustrates the processing of a generate classification data component in some embodiments. - In some embodiments, an abnormal activity detection (“AAD”) system detects when resource usage at a data center is so high that the resource usage is likely fraudulent. Such fraudulent usage may be referred to as a “fraud storm” at the data center because of the sudden onset of significant fraudulent use. The AAD system detects a fraud storm using a classifier to classify whether the current resource usage indicates abnormal activity resulting in abnormal resource usage at the data center. To generate the classifier, the AAD system may generate a prediction model for various resources to predict normal resource usage given past resource usage. The AAD system uses resources that are likely to increase during a fraud storm, such as number of cores in use, number of new subscriptions, amount of outbound traffic, amount of disk usage, and so on. The AAD system may also generate an error model to estimate the errors in the prediction models based on comparison of predicted resource usage and actual resource usage of past resource usage. To determine if abnormal activity is occurring at the current time, the AAD system applies the classifier to past resource usage. The classifier uses the prediction models to predict a predicted resource usage for the current time based on past resource usage and then determines an error between predicted resource usage and the current resource usage for each resource. If the errors satisfy an abnormal resource usage criterion, then the AAD system indicates that abnormal activity is occurring. 
When such abnormal activity is occurring, the provider of the cloud can take various steps to stop the abnormal activity such as revoking new subscriptions that appear to be fraudulently obtained, limiting the subscription rate, placing additional protections on the subscription process, identifying and correcting the vulnerability that led to the abnormal activity, and so on.
- In some embodiments, the AAD system generates a classifier to identify abnormal resource usage in a data center based on resource usage data collected for various resources while normal activity was occurring. For each of the resources, the AAD system collects (e.g., is provided with data collected by the data center) resource usage data at various time intervals. For example, the interval may be one hour, and the resource usage data may include the average number of cores in use during that hour and the number of new subscriptions received during that hour. For each of the intervals, the AAD system identifies the current resource usage data for that resource and extracts features from past resource usage data for one or more resources. The extracted features may include the average resource usage 1, 2, 4, and 8 hours ago and 1, 2, 4, 7, and 14 days ago. The extracted features may also include the average resource usage over the past 2, 4, 8, 12, 24, and 48 hours. Other features may be used from past resource usage data that may be indicative of the current resource usage data, such as differences between resource usage data, variations in resource usage data, and so on. The features may also include monthly and annual features to help account for seasonal variations.
- The AAD system then generates a prediction model for each resource from the current resource usage data and the extracted features for predicting resource usage data for that resource at a current time given features extracted from past resource usage data. The prediction model can be generated using various regression models such as random forest regression, k-nearest neighbors regression, support vector machine (SVM) with radial basis function (RBF) kernel, linear regression, ridge linear regression, and so on. The AAD system also generates error statistics based on estimated errors in the prediction models derived from the collected resource usage data. For example, the error statistics may include mean and standard deviation of the errors for each resource and covariances of errors of resources. The AAD system then establishes from the error statistics an abnormal resource usage criterion that when satisfied indicates that the abnormal activity is occurring. For example, the abnormal resource usage criterion may be based on a p-value determined for some significance level. The AAD system may regenerate the classifier at various times such as periodically (e.g., weekly) or when certain events occur (e.g., a certain number of non-fraudulent new subscriptions are received).
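As a concrete illustration of one of the regression options named above, ridge linear regression has a closed form that could serve as the prediction model generator; a minimal numpy sketch with an assumed function name:

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression: w = (X^T X + alpha*I)^-1 X^T y.
    X holds one row of extracted features per time interval; y holds the
    corresponding current resource usage data."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)
```

The returned weight vector is the prediction model: a dot product with a new feature vector yields the predicted resource usage.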
- In some embodiments, the AAD system identifies abnormal resource usage using the prediction models and the abnormal resource usage criterion. For each resource, the AAD system accesses current resource usage data for a current time and past resource usage data for the resources. The AAD system extracts features from the past resource usage data for these resources (i.e., features used to generate the classifier). The AAD system then uses the prediction model for that resource to predict predicted resource usage data for the current time based on the extracted features. The AAD system then determines an error between the predicted resource usage data and the current resource usage data. After determining the errors for each of the resources, the AAD system determines whether the determined errors satisfy the abnormal resource usage criterion. If so, the AAD system indicates that an abnormal resource usage is occurring.
FIG. 1 is a flow diagram that illustrates the generating of a classifier in some embodiments of the AAD system. In blocks 101-107, the AAD system generates a prediction model for each resource. In block 101, the AAD system selects the next resource. In decision block 102, if all the resources have already been selected, then the AAD system continues at block 108, else the AAD system continues at block 103. In block 103, the AAD system accesses resource usage data for the selected resource for various times. The resource usage data may be provided by a data center to the AAD system. In blocks 104-106, the component generates, for each time interval (e.g., every hour) within a window (e.g., 60 days), classification data that includes current resource usage data for that time interval and features extracted from past resource usage data. In block 104, the component selects the next time interval. In decision block 105, if all the time intervals have already been selected, then the AAD system continues at block 107, else the AAD system continues at block 106. In block 106, the AAD system identifies the current resource usage data for the time interval and extracts various features for the selected time interval from past resource usage data and then loops to block 104 to select the next time interval. In some embodiments, if the AAD system regenerates the classifier, then the AAD system need only generate classification data from the time the classifier was last generated. In block 107, the AAD system generates a prediction model for the selected resource and then loops to block 101 to select the next resource. In block 108, the AAD system generates error statistics for the prediction models. In block 109, the component establishes an abnormal resource usage criterion based on the error statistics and then completes. -
FIG. 2 is a flow diagram that illustrates the identifying of an abnormal resource usage in some embodiments of the AAD system. The AAD system may perform this identifying at various time intervals (e.g., hourly). In block 201, the component accesses the resource usage data for the data center. In block 202, the AAD system selects the next resource. In decision block 203, if all the resources have already been selected, then the AAD system continues at block 206, else the AAD system continues at block 204. In block 204, the AAD system predicts resource usage data for the selected resource for the current time interval using the prediction model and features extracted from past resource usage data. In block 205, the AAD system determines the error between the predicted resource usage data and the current resource usage data for the time interval. The component then loops to block 202 to select the next resource. In decision block 206, if the abnormal usage criterion is satisfied by the determined errors, then the AAD system returns an indication of abnormal resource usage, else the AAD system returns an indication of normal resource usage. -
FIG. 3 is a block diagram that illustrates components of the AAD system in some embodiments. The AAD system 300 includes a generate classifier component 301, an apply classifier component 302, a generate model component 303, a generate error statistics component 304, a calculate error data component 305, and a generate classification data component 306. The generate classifier component is invoked periodically to generate a classifier based on the most recent resource usage data (e.g., the past 60 days). The apply classifier component is invoked at various time intervals to determine whether abnormal resource usage is occurring. The generate model component is invoked to generate a prediction model for a resource. The generate error statistics component is invoked to generate error statistics for the resources to estimate the error in the generated prediction models. The calculate error data component is invoked to calculate the error data for a prediction model. The generate classification data component is invoked to generate the classification data for use in generating the classifier and in applying the classifier. The AAD system also includes a classifier storage 307 to store the weights for the prediction models, the error statistics, and the abnormal resource usage criterion. The AAD system retrieves usage data from the resource usage data storage 310 of the data center. The AAD system may also include a fill-in-gaps component 308 to fill in gaps in the resource usage data and a supervised classifier component 309 to filter out activity that has been erroneously identified as abnormal activity. - The computing devices and systems on which the AAD system may be implemented may include a central processing unit, input devices, output devices (e.g., display devices and speakers), storage devices (e.g., memory and disk drives), network interfaces, graphics processing units, accelerometers, cellular radio link interfaces, global positioning system devices, and so on.
The input devices may include keyboards, pointing devices, touch screens, gesture recognition devices (e.g., for air gestures), head and eye tracking devices, microphones for voice recognition, and so on. The computing devices may include desktop computers, laptops, tablets, e-readers, personal digital assistants, smartphones, gaming devices, servers, and computer systems such as massively parallel systems. The computing devices may access computer-readable media that include computer-readable storage media and data transmission media. The computer-readable storage media are tangible storage means that do not include a transitory, propagating signal. Examples of computer-readable storage media include memory such as primary memory, cache memory, and secondary memory (e.g., DVD) and other storage means. The computer-readable storage media may have recorded on it or may be encoded with computer-executable instructions or logic that implements the AAD system. The data transmission media is used for transmitting data via transitory, propagating signals or carrier waves (e.g., electromagnetism) via a wired or wireless connection.
- The AAD system may be described in the general context of computer-executable instructions, such as program modules and components, executed by one or more computers, processors, or other devices. Generally, program modules or components include routines, programs, objects, data structures, and so on that perform particular tasks or implement particular data types. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments. Aspects of the AAD system may be implemented in hardware using, for example, an application-specific integrated circuit (ASIC).
FIG. 4 is a flow diagram that illustrates the processing of a generate classifier component in some embodiments. The generate classifier component 400 is invoked to generate the classifier. In block 401, the component invokes the generate model component passing for each time interval classification data that includes features for the cores resource (Xc) and the corresponding current resource usage data (yc) for that time interval and receives the weights for the features for the model (fc) in return. A subset of the features of the classification data may be represented by the following table:
  Time    0 hour    1 hour    . . .    8 hours    1 day    . . .    14 days
   0      10000      9500               5500       9750               10500
  -1       9500     10250
  -2      10250      7500
  -3       7500      6500
  -4       6500      7000              11000       9000                7000
  -5       7000
  . . .
- The time column represents the time for the data in each row. Time 0 represents the current time, time −1 represents one hour ago, time −2 represents two hours ago, and so on. The 0 hour column represents the current resource usage data (yc) for the corresponding time. For example, four hours ago the average number of cores that were in use was 6500. The other columns represent the extracted features (Xc) for the corresponding time. The illustrated extracted features include the number of cores in use one hour ago, eight hours ago, one day ago, and 14 days ago. For example, four hours ago the extracted features were 7000, 11000, 9000, and 7000. In block 402, the generate classifier component invokes the generate model component passing for each time interval classification data that includes features for the subscriptions resource (Xn) and the corresponding resource usage data (yn) for that time interval and receives the weights for the features for the model (fn) in return. In block 403, the component invokes a generate error statistics component and receives the error statistics in return such as a covariance matrix and the mean of the errors for each resource. In block 404, the component establishes the abnormal resource usage criterion as a p-value for a multivariate normal distribution based on a threshold significance level. The p-value may be generated based on a Mahalanobis distance or based on the estimated weight of a cumulative distribution function in a rectangle of values higher than the observed values. (See Genz, A. and Bretz, F., “Computation of Multivariate Normal and t Probabilities,” Springer Science & Business Media (2009).) The component then completes. - In some embodiments, the resource usage data collected at a data center may have gaps or may have been collected during a period of abnormal activity. A gap may be present because, for example, a component of the data center responsible for collecting the data may have failed or may have been taken down for maintenance. The resource usage data that is collected during a period of abnormal activity might not be useful in predicting normal resource usage data. Such a period may be considered a gap in the collecting of normal resource usage data. To fill in the gaps, a fill-in-gaps component may use various interpolation techniques such as a linear Gaussian Bayesian network or linear interpolation. With a Bayesian network, the missing data can be imputed using forward sampling with likelihood weighting or using belief propagation.
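A minimal sketch of the linear-interpolation option for the fill-in-gaps component, assuming missing samples are encoded as NaN; the function name is illustrative.

```python
import numpy as np

def fill_gaps(usage):
    """Fill gaps (NaN samples) in a resource usage series by linear
    interpolation between the nearest collected samples."""
    usage = np.array(usage, dtype=float)
    missing = np.isnan(usage)
    idx = np.arange(len(usage))
    usage[missing] = np.interp(idx[missing], idx[~missing], usage[~missing])
    return usage
```

A Bayesian-network imputation, as mentioned above, would replace the `np.interp` call with sampling from the learned network.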
FIG. 5 is a flow diagram that illustrates the processing of a generate model component in some embodiments. The generate model component 500 is invoked to generate a prediction model for a resource based on current usage data and extracted features for various time intervals. The component uses a ridge regression model, although as described above other models may be employed. The component generates prediction models for various ridge values and selects the prediction model with the smallest error. The ridge values may be a logarithmic set including the values 10^0, 10^1, 10^2, . . . , 10^12. For each ridge value, the component generates various prediction models using different subsets of the time intervals and calculates the error from the remaining time intervals. In block 501, the component selects the next ridge value. In decision block 502, if all the ridge values have already been selected, then the component returns the prediction model (i.e., the weights for the features), else the component continues at block 503. In block 503, the component selects the next time interval. In decision block 504, if all the time intervals have already been selected for the selected ridge, then the component loops to block 501 to select the next ridge, else the component continues at block 505. In block 505, the component applies a linear regression technique to generate the prediction model using the selected ridge based on the subset of intervals that does not include the selected interval. In block 506, the component uses the generated model to predict predicted resource usage data for the selected time interval. In block 507, the component calculates the error between the predicted resource usage data and the current resource usage data. In decision block 508, if the error is less than the minimum error encountered so far, then the component continues at block 509, else the component loops to block 503 to select the next time interval.
In block 509, the component sets the minimum error encountered so far to the error calculated in block 507. In block 510, the component sets the prediction model to the prediction model generated in block 505. The component then loops to block 503 to select the next time interval. Although the component is illustrated as generating a prediction model for each time interval for each ridge value, the component may generate prediction models for larger subsets of the time intervals rather than for each time interval, which may produce acceptable prediction models with fewer computational resources. -
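The search loop of FIG. 5 can be sketched with the closed-form ridge solution: for each ridge value, fit on all intervals but one, score on the held-out interval, and keep the model with the smallest held-out error. This is a hedged numpy illustration of the loop described above, not the patented implementation; the function names are illustrative.

```python
import numpy as np

def fit_ridge(X, y, ridge):
    """Closed-form ridge regression: w = (X^T X + ridge * I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge * np.eye(n_features), X.T @ y)

def select_ridge_model(X, y, ridge_values):
    """Leave-one-interval-out search over ridge values, keeping the
    prediction model (feature weights) with the smallest held-out error."""
    best_error, best_weights = np.inf, None
    for ridge in ridge_values:
        for i in range(len(y)):
            mask = np.arange(len(y)) != i              # withhold interval i
            w = fit_ridge(X[mask], y[mask], ridge)     # fit on the rest
            error = abs(X[i] @ w - y[i])               # held-out error
            if error < best_error:
                best_error, best_weights = error, w
    return best_weights
```

As the text notes, withholding larger subsets instead of single intervals trades a little model quality for far fewer fits.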
FIG. 6 is a flow diagram that illustrates the processing of a generate error statistics component in some embodiments. The generate error statistics component 600 generates error statistics for the prediction model using cross-validation to estimate the error. In block 601, the component invokes a calculate error data component passing the current resource usage data and extracted features for the time intervals for the cores resource and receives the error data (ec) in return. In block 602, the component invokes the calculate error data component passing the current resource usage data and extracted features for the time intervals for the subscriptions resource and receives the error data (en) in return. In block 603, the component calculates the mean for the error data of the cores resource. In block 604, the component calculates the mean for the error data of the subscriptions resource. In block 605, the component calculates a covariance matrix based on the error data for the cores resource and the subscriptions resource and then returns. -
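The statistics of blocks 603-605 reduce to a per-resource mean and a covariance matrix over the paired error data. A small numpy sketch, using hypothetical error values for the two resources:

```python
import numpy as np

# Hypothetical per-interval prediction errors (ec, en) for the two resources.
errors_cores = np.array([1.0, -2.0, 0.5, 0.5])
errors_subs = np.array([0.2, -0.1, 0.3, -0.4])

# Blocks 603-604: mean of the error data for each resource.
mean_cores = errors_cores.mean()
mean_subs = errors_subs.mean()

# Block 605: 2x2 covariance matrix over the error data for both resources.
cov = np.cov(np.vstack([errors_cores, errors_subs]))
```

Note that `np.cov` uses the sample (ddof=1) convention by default.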
FIG. 7 is a flow diagram that illustrates the processing of a calculate error data component in some embodiments. The calculate error data component 700 generates a prediction model based on various subsets of the time intervals, uses the prediction model to predict resource usage data for the remaining intervals, and calculates the error between the predicted resource usage data and the current resource usage data for each such time interval. The component may generate five prediction models, withholding a different 20% of the intervals for each prediction model. In block 701, the component selects the next cross-validation. In decision block 702, if all the cross-validations have already been selected, then the component returns, else the component continues at block 703. In block 703, the component selects a subset of the time intervals to withhold. In block 704, the component invokes the generate model component passing an indication of the current resource usage data and features that have not been withheld. In blocks 705-707, the component uses the generated prediction model to calculate the error between predicted resource usage data and current resource usage data for the withheld intervals. In block 705, the component selects the next withheld interval. In decision block 706, if all the withheld intervals have already been selected, then the component loops to block 701 to select the next cross-validation, else the component continues at block 707. In block 707, the component calculates the error for the selected interval and then loops to block 705 to select the next withheld interval. -
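The withholding scheme of FIG. 7, a different 20% of the intervals per model, is ordinary five-fold cross-validation. In this sketch a plain least-squares fit stands in for the generate model component; the function name and fold handling are illustrative.

```python
import numpy as np

def cross_validation_errors(X, y, folds=5, seed=0):
    """Collect a held-out prediction error for every interval by fitting
    `folds` models, each with a different 1/folds of the intervals
    withheld (20% per model when folds=5)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    errors = np.empty(len(y))
    for fold in np.array_split(order, folds):
        train = np.setdiff1d(order, fold)  # intervals not withheld
        w, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        errors[fold] = X[fold] @ w - y[fold]  # error on withheld intervals
    return errors
```

The resulting error vector is what the generate error statistics component summarizes into means and a covariance matrix.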
FIG. 8 is a flow diagram that illustrates the processing of an apply classifier component in some embodiments. The apply classifier component 800 is passed an indication of the features for the resources and determines whether the current resource usage data, when compared to the predicted resource usage data, indicates an abnormal resource usage. In block 801, the component applies the prediction model for the cores resource to the features for the cores resource to predict resource usage data. In block 802, the component applies the prediction model for the subscriptions resource to the features for the subscriptions resource to predict resource usage data. In block 803, the component calculates the error between the predicted resource usage data and the current resource usage data for the cores resource. In block 804, the component calculates the error between the predicted resource usage data and the current resource usage data for the subscriptions resource. In decision block 805, if the errors satisfy an abnormal resource usage criterion, then the component returns an indication that the resource usage is abnormal, else the component returns an indication that the resource usage is normal. - In some embodiments, the AAD system may generate a supervised classifier to filter out erroneous indications of abnormal resource usage. The AAD system may use as training data for the supervised classifier the resource usage data that has been indicated as being abnormal and labels (e.g., manually generated) that identify the resource usage data as being normal or abnormal. Once the training data is generated, the AAD system may use any of a variety of supervised training techniques such as an SVM, decision trees, adaptive boosting, and so on. After the AAD system initially indicates abnormal resource usage data, the supervised classifier component can then input the features for that abnormal resource usage data and classify it as being normal or abnormal.
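The supervised filter described above can be sketched with scikit-learn, using an SVM (one of the techniques named in the text). The training data, feature values, and labels below are hypothetical; in practice they would be feature vectors for intervals the first-stage classifier flagged, with manually generated labels.

```python
from sklearn.svm import SVC

# Hypothetical training data: feature vectors for usage the first-stage
# classifier flagged as abnormal, with manual labels
# (1 = truly abnormal, 0 = erroneous indication).
features = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9]]
labels = [0, 1, 0, 1]

filter_clf = SVC(kernel="linear").fit(features, labels)

# When a new interval is flagged, the filter decides whether the
# indication of abnormal resource usage should be kept or suppressed.
keep_alert = filter_clf.predict([[0.85, 0.95]])[0] == 1
```

Decision trees or adaptive boosting could be substituted by swapping the estimator.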
-
FIG. 9 is a flow diagram that illustrates the processing of a generate classification data component in some embodiments. The generate classification data component 900 generates the current classification data for the resources. In block 901, the component selects the next resource. In decision block 902, if all the resources have already been selected, then the component returns, else the component continues at block 903. In block 903, the component retrieves the current resource usage data for the selected resource. In block 904, the component retrieves past resource usage data for one hour ago for the selected resource. In block 905, the component retrieves past resource usage data for 14 days ago for the selected resource. The ellipsis between block 904 and block 905 indicates that past resource usage data may be retrieved for other intervals. In block 906, the component generates average resource usage data over the past two hours for the selected resource. In block 907, the component generates average resource usage data for the last 48 hours for the selected resource and then loops to block 901 to select the next resource. - In some embodiments, a method performed by a computer system for generating a classifier to identify abnormal resource usage in a data center is provided. The method, for each of a plurality of resources, provides resource usage data for that resource at various times. The method, for each of a plurality of times, identifies current resource usage data for that resource for that time and extracts features from past resource usage data of that resource prior to that time. The method generates a prediction model for that resource from the current resource usage data and the extracted features for the times to predict resource usage data for that resource at a current time given features extracted from past resource usage data.
The method then generates, from the resource usage data for the resources, error statistics for the prediction models and establishes from the error statistics an abnormal resource usage criterion. The method may be used in conjunction with any one of or any combination of the following embodiments. In some embodiments, the method may further, for each of the plurality of resources, provide current resource usage data for a current time and past resource usage data for that resource, extract features from the past resource usage data for that resource, generate by the prediction model for that resource predicted resource usage data for the current time, and determine error between the predicted resource usage data and the current resource usage data. When the determined errors satisfy the abnormal resource usage criterion, the method may indicate that abnormal resource usage has occurred. In some embodiments, a resource may be cores of the data center and the resource usage data for the cores may be the number of cores in use at the data center. The extracted features for the number of cores may include the average number of cores in use during past intervals. A resource may also be subscriptions to the data center and the resource usage data for the subscriptions may be the number of new subscriptions to the data center. The extracted features for subscriptions may include the number of new subscriptions during past intervals. In some embodiments, the error statistics may be generated using cross-validation of a prediction model. In some embodiments, the method may further regenerate the classifier on a periodic basis. In some embodiments, the error statistics may include a mean of the errors for each resource and a covariance for each pair of resources. In some embodiments, the abnormal resource usage criterion may be based on a p-value for the error statistics.
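The feature extraction described above and in FIG. 9 (usage one hour ago, 14 days ago, and averages over the past two hours and the past 48 hours) can be illustrated with pandas, assuming one sample per hour; the column names and helper function are illustrative, not from the patent.

```python
import pandas as pd

def extract_features(usage: pd.Series) -> pd.DataFrame:
    """Build per-interval classification features from past resource
    usage, assuming one usage sample per hour."""
    return pd.DataFrame({
        "one_hour_ago": usage.shift(1),               # block 904
        "fourteen_days_ago": usage.shift(14 * 24),    # block 905
        "avg_past_2h": usage.shift(1).rolling(2).mean(),    # block 906
        "avg_past_48h": usage.shift(1).rolling(48).mean(),  # block 907
    })
```

Each `shift` excludes the current interval so that features use only past resource usage data.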
- In some embodiments, a computer-readable storage medium is provided that stores computer-executable instructions for controlling a computing system to identify abnormal resource usage in a data center. The computer-executable instructions comprise instructions that access a prediction model for each of a plurality of resources and an abnormal resource usage criterion, where the prediction models may be generated from resource usage data of the data center and where the abnormal resource usage criterion may be established based on error statistics for the prediction models. The instructions further, for each of a plurality of resources of the data center, access current resource usage data for a current time and past resource usage data for that resource, extract features from the past resource usage data for that resource, predict by the prediction model for that resource predicted resource usage data for the current time based on the extracted features, and determine an error between the predicted resource usage data and the current resource usage data. The instructions further, when the determined errors satisfy the abnormal resource usage criterion, indicate that an abnormal resource usage has occurred. These instructions may be used in conjunction with any one of or any combination of the following embodiments. In some embodiments, a resource may be cores of the data center and a resource may be subscriptions to the data center. In some embodiments, the extracted features for the number of cores may include the average number of cores in use during past intervals and the extracted features for subscriptions may include the number of new subscriptions received during past intervals.
In some embodiments, the instructions further, for each of the plurality of resources of the data center, collect resource usage data for that resource at each of a plurality of intervals, and the extracted features include resource usage data for time intervals of one hour, one day, and one week prior to the current time. In some embodiments, the instructions may further, when an abnormal resource usage has been indicated, apply a supervised classifier to the extracted features to filter out erroneous indications of abnormal resource usage.
- In some embodiments, a computer system that identifies abnormal resource usage in a data center is provided. The computer system may comprise one or more computer-readable storage media storing computer-executable instructions and one or more processors for executing the computer-executable instructions stored in the one or more computer-readable storage media. The instructions may include instructions that access current resource usage data for a current time and features of past resource usage data for resources of the data center, and apply a classifier to the current resource usage data and the features to determine whether the current resource usage data represents an abnormal resource usage. The classifier may, for each of a plurality of resources of the data center, predict using a prediction model for that resource predicted resource usage data for the current time based on the features and determine an error between the predicted resource usage data and the current resource usage data, and, when the determined errors satisfy the abnormal resource usage criterion, indicate that an abnormal resource usage has occurred. These instructions may be used in conjunction with any one of or any combination of the following embodiments. In some embodiments, the instructions further include instructions for generating the classifier that, for each of the plurality of resources, for each of a plurality of times, identify current resource usage data for that resource for that time and extract features from past resource usage data for that resource and then generate a prediction model for that resource from the current resource usage data and the extracted features for the times to predict resource usage data for that resource at a current time given features extracted from past resource usage data.
In some embodiments, the instructions may further include instructions that generate, from the resource usage data for the resources, error statistics for the prediction models and establish from the error statistics an abnormal resource usage criterion. In some embodiments, the classifier is regenerated at various times using resource usage data that includes resource usage data collected since the classifier was last generated. In some embodiments, the prediction models may be generated using a linear regression technique. In some embodiments, a resource may be cores of the data center and a resource may be subscriptions to the data center. In some embodiments, the extracted features for the number of cores may include the average number of cores in use during past intervals, and the extracted features for subscriptions may include the number of new subscriptions during past intervals. In some embodiments, the instructions may further, when an abnormal resource usage has been indicated, apply a supervised classifier to the extracted features to filter out erroneous indications of abnormal resource usage.
- Although the subject matter has been described in language specific to structural features and/or acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims. Accordingly, the invention is not limited except as by the appended claims.
Claims (20)
Priority Applications (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US14/721,777 US9665460B2 (en) | 2015-05-26 | 2015-05-26 | Detection of abnormal resource usage in a data center |
| PCT/US2016/033390 WO2016191229A1 (en) | 2015-05-26 | 2016-05-20 | Detection of abnormal resource usage in a data center |
| CN201680028265.XA CN107636621B (en) | 2015-05-26 | 2016-05-20 | Detecting abnormal resource usage in a data center |
| EP16725735.1A EP3304314B1 (en) | 2015-05-26 | 2016-05-20 | Detection of abnormal resource usage in a data center |
| US15/385,718 US10402244B2 (en) | 2015-05-26 | 2016-12-20 | Detection of abnormal resource usage in a data center |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US14/721,777 US9665460B2 (en) | 2015-05-26 | 2015-05-26 | Detection of abnormal resource usage in a data center |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US15/385,718 Continuation US10402244B2 (en) | 2015-05-26 | 2016-12-20 | Detection of abnormal resource usage in a data center |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20160350198A1 true US20160350198A1 (en) | 2016-12-01 |
| US9665460B2 US9665460B2 (en) | 2017-05-30 |
Family
ID=56084439
Family Applications (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US14/721,777 Active US9665460B2 (en) | 2015-05-26 | 2015-05-26 | Detection of abnormal resource usage in a data center |
| US15/385,718 Active US10402244B2 (en) | 2015-05-26 | 2016-12-20 | Detection of abnormal resource usage in a data center |
Family Applications After (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US15/385,718 Active US10402244B2 (en) | 2015-05-26 | 2016-12-20 | Detection of abnormal resource usage in a data center |
Country Status (4)
| Country | Link |
|---|---|
| US (2) | US9665460B2 (en) |
| EP (1) | EP3304314B1 (en) |
| CN (1) | CN107636621B (en) |
| WO (1) | WO2016191229A1 (en) |
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107636621A (en) * | 2015-05-26 | 2018-01-26 | 微软技术许可有限责任公司 | The abnormal resource detected in data center uses |
| US20190188000A1 (en) * | 2017-12-20 | 2019-06-20 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Method for Preloading Application, Computer Readable Storage Medium, and Terminal Device |
| US20190196849A1 (en) * | 2017-12-21 | 2019-06-27 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Method and Device for Preloading Application, Storage Medium, and Terminal Device |
| CN110377491A (en) * | 2019-07-10 | 2019-10-25 | 中国银联股份有限公司 | A kind of data exception detection method and device |
| US10686802B1 (en) * | 2019-12-03 | 2020-06-16 | Capital One Services, Llc | Methods and systems for provisioning cloud computing services |
| CN111527478A (en) * | 2017-10-13 | 2020-08-11 | 华为技术有限公司 | System and method for detecting abnormal user experience and performance in cooperation with cloud equipment |
| US11164048B2 (en) * | 2018-05-07 | 2021-11-02 | Google Llc | Focus-weighted, machine learning disease classifier error prediction for microscope slide images |
| US11892933B2 (en) * | 2018-11-28 | 2024-02-06 | Oracle International Corporation | Predicting application performance from resource statistics |
Families Citing this family (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2018160177A1 (en) * | 2017-03-01 | 2018-09-07 | Visa International Service Association | Predictive anomaly detection framework |
| US11061885B2 (en) * | 2018-06-15 | 2021-07-13 | Intel Corporation | Autonomous anomaly detection and event triggering for data series |
| CN110865896B (en) * | 2018-08-27 | 2021-03-23 | 华为技术有限公司 | Slow disk detection method and device and computer readable storage medium |
| CN110083507B (en) * | 2019-04-19 | 2020-11-24 | 中国科学院信息工程研究所 | Method and device for classifying key performance indicators |
| US11544560B2 (en) * | 2020-04-10 | 2023-01-03 | Microsoft Technology Licensing, Llc | Prefetching and/or computing resource allocation based on predicting classification labels with temporal data |
| CN114791899A (en) * | 2021-01-25 | 2022-07-26 | 华为技术有限公司 | A database management method and device |
| US11994941B2 (en) * | 2021-09-23 | 2024-05-28 | Dell Products L.P. | Analysis and remediation of alerts |
| CN119676109A (en) * | 2024-12-17 | 2025-03-21 | 中国工商银行股份有限公司 | A resource leakage detection method, device, equipment, medium and program product |
| CN120469904B (en) * | 2025-07-16 | 2025-11-07 | 临沂大学 | Data platform resource consumption monitoring system |
Family Cites Families (46)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5668944A (en) * | 1994-09-06 | 1997-09-16 | International Business Machines Corporation | Method and system for providing performance diagnosis of a computer system |
| US6574587B2 (en) * | 1998-02-27 | 2003-06-03 | Mci Communications Corporation | System and method for extracting and forecasting computing resource data such as CPU consumption using autoregressive methodology |
| US6381735B1 (en) * | 1998-10-02 | 2002-04-30 | Microsoft Corporation | Dynamic classification of sections of software |
| US6691067B1 (en) | 1999-04-07 | 2004-02-10 | Bmc Software, Inc. | Enterprise management system and method which includes statistical recreation of system resource usage for more accurate monitoring, prediction, and performance workload characterization |
| US6810495B2 (en) * | 2001-03-30 | 2004-10-26 | International Business Machines Corporation | Method and system for software rejuvenation via flexible resource exhaustion prediction |
| US7437446B2 (en) | 2002-09-30 | 2008-10-14 | Electronic Data Systems Corporation | Reporting of abnormal computer resource utilization data |
| US7480919B2 (en) * | 2003-06-24 | 2009-01-20 | Microsoft Corporation | Safe exceptions |
| US7506215B1 (en) * | 2003-12-09 | 2009-03-17 | Unisys Corporation | Method for health monitoring with predictive health service in a multiprocessor system |
| US7254750B1 (en) * | 2004-03-30 | 2007-08-07 | Unisys Corporation | Health trend analysis method on utilization of network resources |
| US7286962B2 (en) * | 2004-09-01 | 2007-10-23 | International Business Machines Corporation | Predictive monitoring method and system |
| US7818150B2 (en) * | 2005-03-11 | 2010-10-19 | Hyperformix, Inc. | Method for building enterprise scalability models from load test and trace test data |
| WO2007041709A1 (en) | 2005-10-04 | 2007-04-12 | Basepoint Analytics Llc | System and method of detecting fraud |
| US8074115B2 (en) * | 2005-10-25 | 2011-12-06 | The Trustees Of Columbia University In The City Of New York | Methods, media and systems for detecting anomalous program executions |
| US7539907B1 (en) * | 2006-05-05 | 2009-05-26 | Sun Microsystems, Inc. | Method and apparatus for determining a predicted failure rate |
| US8346691B1 (en) | 2007-02-20 | 2013-01-01 | Sas Institute Inc. | Computer-implemented semi-supervised learning systems and methods |
| BRPI0722218A2 (en) * | 2007-12-20 | 2014-07-01 | Hewlett Packard Development Co | METHOD FOR GENERATING A MODEL REPRESENTING AT LEAST PART OF A COMPUTER BASED BUSINESS PROCESS HAVING A NUMBER OF FUNCTIONAL STEPS, SOFTWARE IN A MEDIA READ BY MACHINE AND SYSTEM FOR GENERATING A REPRESENTATIVE MODEL REPRESENTING ON A PART OF A PROCESS BASED ON A NUMBER OF FUNCTIONAL STEPS |
| US8145456B2 (en) * | 2008-09-30 | 2012-03-27 | Hewlett-Packard Development Company, L.P. | Optimizing a prediction of resource usage of an application in a virtual environment |
| US8260603B2 (en) * | 2008-09-30 | 2012-09-04 | Hewlett-Packard Development Company, L.P. | Scaling a prediction model of resource usage of an application in a virtual environment |
| US7818145B2 (en) * | 2008-09-30 | 2010-10-19 | Hewlett-Packard Development Company, L.P. | Detecting an error in a prediction of resource usage of an application in a virtual environment |
| US8880682B2 (en) * | 2009-10-06 | 2014-11-04 | Emc Corporation | Integrated forensics platform for analyzing IT resources consumed to derive operational and architectural recommendations |
| US8543522B2 (en) | 2010-04-21 | 2013-09-24 | Retail Decisions, Inc. | Automatic rule discovery from large-scale datasets to detect payment card fraud using classifiers |
| US20120066554A1 (en) * | 2010-09-09 | 2012-03-15 | Microsoft Corporation | Application query control with cost prediction |
| US8595556B2 (en) * | 2010-10-14 | 2013-11-26 | International Business Machines Corporation | Soft failure detection |
| US8738549B2 (en) * | 2010-12-21 | 2014-05-27 | International Business Machines Corporation | Predictive modeling |
| US10558544B2 (en) * | 2011-02-14 | 2020-02-11 | International Business Machines Corporation | Multiple modeling paradigm for predictive analytics |
| GB201104786D0 (en) | 2011-03-22 | 2011-05-04 | Centrix Software Ltd | A monitoring system |
| US20120266026A1 (en) * | 2011-04-18 | 2012-10-18 | Ramya Malanai Chikkalingaiah | Detecting and diagnosing misbehaving applications in virtualized computing systems |
| US8412945B2 (en) | 2011-08-09 | 2013-04-02 | CloudPassage, Inc. | Systems and methods for implementing security in a cloud computing environment |
| US9015536B1 (en) * | 2011-08-31 | 2015-04-21 | Amazon Technologies, Inc. | Integration based anomaly detection service |
| US8873813B2 (en) | 2012-09-17 | 2014-10-28 | Z Advanced Computing, Inc. | Application of Z-webs and Z-factors to analytics, search engine, learning, recognition, natural language, and other utilities |
| US8732525B2 (en) | 2011-10-11 | 2014-05-20 | International Business Machines Corporation | User-coordinated resource recovery |
| US9367803B2 (en) * | 2012-05-09 | 2016-06-14 | Tata Consultancy Services Limited | Predictive analytics for information technology systems |
| US20140058763A1 (en) | 2012-07-24 | 2014-02-27 | Deloitte Development Llc | Fraud detection methods and systems |
| CN103577268B (en) * | 2012-08-07 | 2016-12-21 | 复旦大学 | Adaptive resource Supply Method based on application load |
| US20140143012A1 (en) * | 2012-11-21 | 2014-05-22 | Insightera Ltd. | Method and system for predictive marketing campigns based on users online behavior and profile |
| EP2750432A1 (en) * | 2012-12-28 | 2014-07-02 | Telefónica, S.A. | Method and system for predicting the channel usage |
| US9031829B2 (en) | 2013-02-08 | 2015-05-12 | Machine Zone, Inc. | Systems and methods for multi-user multi-lingual communications |
| US9122681B2 (en) | 2013-03-15 | 2015-09-01 | Gordon Villy Cormack | Systems and methods for classifying electronic information using advanced active learning techniques |
| US20150039333A1 (en) | 2013-08-02 | 2015-02-05 | Optum, Inc. | Claim-centric grouper analysis |
| US9628340B2 (en) * | 2014-05-05 | 2017-04-18 | Ciena Corporation | Proactive operations, administration, and maintenance systems and methods in networks using data analytics |
| CN103986669B (en) * | 2014-05-07 | 2017-04-19 | 华东师范大学 | Assessment method of resource allocation strategy in cloud computing |
| WO2015179778A1 (en) * | 2014-05-23 | 2015-11-26 | Datarobot | Systems and techniques for predictive data analytics |
| WO2015191394A1 (en) * | 2014-06-09 | 2015-12-17 | Northrop Grumman Systems Corporation | System and method for real-time detection of anomalies in database usage |
| CN104184819B (en) * | 2014-08-29 | 2017-12-05 | 城云科技(中国)有限公司 | Multi-layer load balancing cloud resource monitoring method |
| US9665460B2 (en) * | 2015-05-26 | 2017-05-30 | Microsoft Technology Licensing, Llc | Detection of abnormal resource usage in a data center |
| US20170078850A1 (en) * | 2015-09-14 | 2017-03-16 | International Business Machines Corporation | Predicting location-based resource consumption in mobile devices |
-
2015
- 2015-05-26 US US14/721,777 patent/US9665460B2/en active Active
-
2016
- 2016-05-20 CN CN201680028265.XA patent/CN107636621B/en active Active
- 2016-05-20 EP EP16725735.1A patent/EP3304314B1/en active Active
- 2016-05-20 WO PCT/US2016/033390 patent/WO2016191229A1/en not_active Ceased
- 2016-12-20 US US15/385,718 patent/US10402244B2/en active Active
Cited By (19)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107636621A (en) * | 2015-05-26 | 2018-01-26 | 微软技术许可有限责任公司 | The abnormal resource detected in data center uses |
| CN111527478A (en) * | 2017-10-13 | 2020-08-11 | 华为技术有限公司 | System and method for detecting abnormal user experience and performance in cooperation with cloud equipment |
| US11321210B2 (en) | 2017-10-13 | 2022-05-03 | Huawei Technologies Co., Ltd. | System and method for cloud-device collaborative real-time user experience and performance abnormality detection |
| US20190188000A1 (en) * | 2017-12-20 | 2019-06-20 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Method for Preloading Application, Computer Readable Storage Medium, and Terminal Device |
| CN109947497A (en) * | 2017-12-20 | 2019-06-28 | 广东欧珀移动通信有限公司 | Application preloading method, device, storage medium and mobile terminal |
| US10908920B2 (en) * | 2017-12-20 | 2021-02-02 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Method for preloading application, computer readable storage medium, and terminal device |
| US20190196849A1 (en) * | 2017-12-21 | 2019-06-27 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Method and Device for Preloading Application, Storage Medium, and Terminal Device |
| US10891142B2 (en) * | 2017-12-21 | 2021-01-12 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Method and device for preloading application, storage medium, and terminal device |
| US11657487B2 (en) | 2018-05-07 | 2023-05-23 | Google Llc | Focus-weighted, machine learning disease classifier error prediction for microscope slide images |
| US11164048B2 (en) * | 2018-05-07 | 2021-11-02 | Google Llc | Focus-weighted, machine learning disease classifier error prediction for microscope slide images |
| US12235748B2 (en) | 2018-11-28 | 2025-02-25 | Oracle International Corporation | Predicting application performance from resource statistics |
| US11892933B2 (en) * | 2018-11-28 | 2024-02-06 | Oracle International Corporation | Predicting application performance from resource statistics |
| CN110377491A (en) * | 2019-07-10 | 2019-10-25 | 中国银联股份有限公司 | A kind of data exception detection method and device |
| US20230033742A1 (en) * | 2019-12-03 | 2023-02-02 | Capital One Services, Llc | Methods and systems for provisioning cloud computing services |
| US11831653B2 (en) * | 2019-12-03 | 2023-11-28 | Capital One Services, Llc | Methods and systems for provisioning cloud computing services |
| US11425145B2 (en) * | 2019-12-03 | 2022-08-23 | Capital One Services, Llc | Methods and systems for provisioning cloud computing services |
| US20240073218A1 (en) * | 2019-12-03 | 2024-02-29 | Capital One Services, Llc | Methods and systems for provisioning cloud computing services |
| US12155671B2 (en) * | 2019-12-03 | 2024-11-26 | Capital One Services, Llc | Methods and systems for provisioning cloud computing services |
| US10686802B1 (en) * | 2019-12-03 | 2020-06-16 | Capital One Services, Llc | Methods and systems for provisioning cloud computing services |
Also Published As
| Publication number | Publication date |
|---|---|
| EP3304314B1 (en) | 2020-06-24 |
| US10402244B2 (en) | 2019-09-03 |
| US20170161127A1 (en) | 2017-06-08 |
| US9665460B2 (en) | 2017-05-30 |
| WO2016191229A1 (en) | 2016-12-01 |
| EP3304314A1 (en) | 2018-04-11 |
| CN107636621A (en) | 2018-01-26 |
| CN107636621B (en) | 2021-06-25 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US10402244B2 (en) | Detection of abnormal resource usage in a data center | |
| US12175853B2 (en) | Adaptive severity functions for alerts | |
| US12051072B1 (en) | Fraud detection | |
| US11195183B2 (en) | Detecting a transaction volume anomaly | |
| JP7191837B2 (en) | A Novel Nonparametric Statistical Behavioral Identification Ecosystem for Power Fraud Detection | |
| EP3373543B1 (en) | Service processing method and apparatus | |
| US10491697B2 (en) | System and method for bot detection | |
| EP1975869A1 (en) | Enhanced fraud detection with terminal transaction-sequence processing | |
| US20190340615A1 (en) | Cognitive methodology for sequence of events patterns in fraud detection using event sequence vector clustering | |
| KR102548321B1 (en) | Valuable alert screening methods for detecting malicious threat | |
| US11373189B2 (en) | Self-learning online multi-layer method for unsupervised risk assessment | |
| CN108292414A (en) | Automatic recommendation of deployments in a data center | |
| US20190340614A1 (en) | Cognitive methodology for sequence of events patterns in fraud detection using petri-net models | |
| EP4143689A1 (en) | Blockchain network risk management universal blockchain data model | |
| US12112369B2 (en) | Transmitting proactive notifications based on machine learning model predictions | |
| US8560827B1 (en) | Automatically determining configuration parameters for a system based on business objectives | |
| US10931697B2 (en) | System and method of identifying fraudulent activity from a user device using a chain of device fingerprints | |
| US11755700B2 (en) | Method for classifying user action sequence | |
| US20240152926A1 (en) | Preventing digital fraud utilizing a fraud risk tiering system for initial and ongoing assessment of risk | |
| EP3721394B1 (en) | Threshold based fraud management for cloud computing system | |
| Kandhikonda | AI-Enhanced Fraud Detection in Financial Services: A Technical Deep Dive | |
| US12177265B2 (en) | Systems and methods for intelligent analysis and deployment of cybersecurity assets in a cybersecurity threat detection and mitigation platform | |
| US11763207B1 (en) | Evaluating machine learning model performance by leveraging system failures | |
| EP3674942B1 (en) | System and method of identifying fraudulent activity from a user device using a chain of device fingerprints | |
| Nerd et al. | Integration of PHAS and FAHP for Optimizing Alert Prioritization in Financial Ecosystem Security. |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:NEUVIRTH-TELEM, HANI;HILBUCH, AMIT;NAHUM, SHAY BARUCH;AND OTHERS;SIGNING DATES FROM 20160408 TO 20160413;REEL/FRAME:038271/0067 |
| | AS | Assignment | Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON. Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE NAME OF CONVEYING PARTY PREVIOUSLY RECORDED AT REEL: 038271 FRAME: 0067. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT;ASSIGNORS:NEUVIRTH-TELEM, HANI;HILBUCH, AMIT;NAHUM, SHAY BARUCH;AND OTHERS;SIGNING DATES FROM 20160408 TO 20160413;REEL/FRAME:038451/0046 |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 4 |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 8 |