
CN109299610B - Method for verifying and identifying unsafe and sensitive input in android system - Google Patents


Info

Publication number
CN109299610B
Authority
CN
China
Prior art keywords
input
sensitive
verification
validations
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811163790.XA
Other languages
Chinese (zh)
Other versions
CN109299610A (en)
Inventor
杨珉
杨哲慜
张磊
何郁郁
张振宇
洪庚
张源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University
Priority to CN201811163790.XA
Publication of CN109299610A
Application granted
Publication of CN109299610B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/03 Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
    • G06F2221/034 Test or assess a computer or a system

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Stored Programmes (AREA)

Abstract

The invention belongs to the technical field of program security analysis and vulnerability discovery, and specifically relates to a method for identifying insecure sensitive input validation in the Android system. The method comprises three stages. First, input validation identification: interrupt branches are extracted from the program code, and the code structure is analyzed to find independent program branches that contain an interrupt instruction, in order to judge whether the current program execution is intended to check its input. Second, sensitive input validation identification: natural language processing is used to cluster a large number of input parameters by semantics, and then, starting from a small number of specified known sensitive parameters, machine learning is used to infer the other, unknown sensitive parameters. Finally, vulnerability identification: the input validations that involve sensitive parameters are checked against security rules to determine whether they are insecure. Identifying this class of input validation makes it possible to locate the system-level security vulnerabilities it gives rise to, which is significant for strengthening mobile system security and preventing system-level attacks.

Description

Method for verifying and identifying unsafe and sensitive input in android system
Technical Field
The invention belongs to the technical field of program security analysis and vulnerability discovery. It draws on natural language processing, machine learning, and static information flow analysis, and specifically relates to a method for identifying insecure input validation in the Android system.
Background
More than 60% of mobile devices run the Android system, which hosts a large number of applications tied to daily life. To implement their functions, applications read and operate on Android system resources, such as the GPS device and the screen display, and perform sensitive operations, such as sending and deleting SMS messages. In the Android system, these resources and sensitive operations are managed by more than 100 system services, so access control in these services plays an important role in the security of the overall system.
The invention is based on an empirical study of a special class of key security checks in system services, defined here as sensitive input validation. The Android system contains at least 700 distinct sensitive input validations, compared with the 351 permissions it defines. They are used widely and for various purposes, for example to prevent ordinary applications from accessing sensitive system-level devices by restricting device names.
The present invention differs from conventional input validation research. Traditional research focuses on a narrow and well-defined set of sensitive inputs, such as Web inputs that may cause SQL injection attacks, and user-space pointers passed to the Linux kernel that may cause memory leak attacks. In the Android system, by contrast, it is not known in advance which inputs should be validated. The present invention therefore addresses the setting in which neither the inputs that should be validated nor the places where they need to be validated are known. Specifically, this follows from three properties of sensitive input validation in Android: (1) Unstructured. Unlike Android permission checks that rely on system-defined interfaces, such as context. In fact, any input to a public method of a system service may be subject to sensitive input validation (a conditional statement that checks a parameter). (2) Ambiguously defined. Unlike permission checking, which is described by detailed documentation of the Android permission model, there is no publicly available source that defines how sensitive input validation should be performed in Android system services. It is therefore unclear whether a given input needs to be validated and whether the validation is done correctly. (3) Fragmented. Sensitive input validation is scattered across a large number of Java classes. For example, in Android 7.0, our evaluation shows that it is spread over 173 different Java classes, whereas Android permission enforcement is concentrated in 6 classes. Furthermore, even within a single service method, sensitive input validations are often scattered across different execution paths, restricting system operations in a fine-grained manner.
Thus, although sensitive input validation in Android services is important, its security is often overlooked because it is poorly designed and implemented. First, system developers may confuse the system's security model: an Android system service may incorrectly trust input from an ordinary application, and sometimes input validation is even placed in the application process (the Android SDK). Second, system developers may neglect input validation when customizing the Android system. Despite this, there is currently no method for automatically identifying sensitive input validations in the Android system and the security vulnerabilities they give rise to.
Disclosure of Invention
The invention aims to provide a new method, driven by code structure and semantic analysis, for identifying insecure sensitive input validation, suitable for automatically identifying, at scale, the insecure sensitive input validations contained in Android system code.
The invention provides a method for identifying insecure sensitive input validation, that is, for identifying the insecure data sources relied upon when input is validated. It comprises three parts: input validation identification based on code structure analysis, sensitive input validation identification based on natural language processing and machine learning, and vulnerability identification based on security rules.
(I) Input validation identification based on code structure analysis. Interrupt branches, such as branches that throw an exception, are extracted from the program code. The code structure is analyzed to find independent program branches that contain an interrupt instruction, in order to judge whether the current program execution is intended to check its input.
(II) Sensitive input validation identification based on natural language processing and machine learning. Natural language processing is used to cluster a large number of input parameters by semantics so that synonymous parameters fall into the same cluster; then a small number of known sensitive parameters are specified, and machine learning is used to infer the other, unknown sensitive parameters.
(III) Vulnerability identification based on security rules. Each input validation that involves sensitive parameters is checked against the security rules to determine whether it is an insecure input validation.
The overall architecture of the invention is shown in FIG. 1. The three parts are described in detail below:
(I) Input validation identification based on code structure analysis
Since input validation is the core subject of the invention, a method is needed to automatically identify and study input validation in the Android system. This is challenging because input validations are neither performed through predefined system interfaces nor identifiable through fixed APIs, as permission checks are. The invention therefore relies on the code structure features inherent in input validation. First, the input must flow through the data flow to a comparison statement and be compared with a predefined value or with a result obtained dynamically from other APIs; different actions are then taken depending on the result of the comparison. Second, unlike an ordinary branch statement, input validation not only compares the input with other data but also interrupts program execution immediately when validation fails, for example by throwing a SecurityException so that execution exits immediately. The invention therefore needs to know which termination actions are typically taken when validation fails. After analyzing actual input validations in the Android system, the invention summarizes the following four interrupt operations: (1) Throwing an exception. The direct way to signal that an application input violates input validation is to throw an exception such as SecurityException or IllegalArgumentException. (2) Returning a constant. The system service uses predefined constants to indicate that the caller failed input validation and returns such a constant as the return value in the interrupt branch. (3) Logging and returning. Log information is useful for monitoring the operation of the system; the interrupt branch typically records some information about the illegal input and then returns. (4) Reclaiming resources and returning. In some cases, the system service needs to reclaim allocated resources first and then return directly.
Based on these four interrupt operations, input validation is identified as follows. First, all branch statements in system services that can receive application input are determined. Then, code structure analysis is used to judge whether each of these branch statements contains an interrupt branch. In addition, some branch statements generate a large number of program branches depending on the input; such branches generally handle different input cases rather than check the input, so they are removed from the identification results.
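The identification flow above can be sketched on a toy intermediate representation. This is only an illustration under stated assumptions: the instruction tuples, the opcode names (`throw`, `return_const`, `return_var`, `call`), the call-name sets, and the `max_targets` cutoff are all hypothetical and are not Jimple or part of the patented tool.

```python
# Hypothetical toy IR: a branch target is a list of (opcode, operand) tuples.
LOG_CALLS = {"Log.w", "Log.e", "Slog.w", "Slog.e"}   # assumed logging calls
RELEASE_CALLS = {"close", "recycle", "release"}      # assumed cleanup calls


def classify_interrupt(block):
    """Return which of the four interrupt operations a block performs, or None."""
    if not block:
        return None
    last_op, _ = block[-1]
    if last_op == "throw":
        return "throw_exception"
    if last_op != "return_const":
        return None
    # Everything before the return must be a call for the block to qualify.
    calls = [arg for op, arg in block[:-1] if op == "call"]
    if len(calls) != len(block) - 1:
        return None
    if not calls:
        return "return_constant"
    if all(c in LOG_CALLS for c in calls):
        return "log_and_return"
    if all(c in RELEASE_CALLS for c in calls):
        return "release_and_return"
    return None


def is_input_validation(targets, max_targets=4):
    """A branch statement counts as input validation if it has an interrupt
    branch and does not fan out into many input-handling branches."""
    if len(targets) > max_targets:  # dispatch-style branch, not a check
        return False
    return any(classify_interrupt(t) is not None for t in targets)
```

A real analysis would also have to confirm that the branch condition is data-dependent on application input, which this sketch leaves out.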
(II) Sensitive input validation identification based on natural language processing and machine learning
Currently, there is no efficient way to distinguish sensitive input validations from all input validations. Understanding how system services process each input parameter, and the type of operation involved, would give more accurate and complete results, but such analysis requires a large amount of prior knowledge describing which operations in the system are sensitive, and that knowledge is often difficult to obtain. The invention therefore takes a different approach: a small set of known sensitive input validations is marked as starting samples, and machine learning with association rule mining automatically learns the rest.
The traditional way to mark sensitive inputs is to use the semantic information in variable names. For example, the sensitive variable "packageName" represents the identity of the caller. However, the Android system manages a large number of system resources and uses many variable names to represent their different parts, so it is difficult to confirm their sensitivity without a full understanding of the entire system. The invention therefore specifies a few initial known sensitive input validations and then uses association rule mining to automatically discover other potentially sensitive input validations. This approach is chosen because sensitive input validations are correlated: they are usually located in the same service method. Taking "packageName" and "uid" as an example, the Android system often uses them together to verify the identity of an application, so their sensitivity is likely to be positively correlated. The detailed method is as follows:
Pre-grouping of input validations. An important requirement of association rule mining is that enough samples (occurrences) of any given variable be observed. If each unique variable name were handled separately, variables such as flag1 and flag2 might each occur only once in the codebase, making association rule mining ineffective. Variables that share a common term (or prefix or suffix) can simply be grouped together because they are semantically closely related. To this end, the invention pre-groups input validations by their input parameters in two steps: (1) Segmenting variable names and extracting stems. An input parameter name is normally a compound word segmented by letter case. For example, 'componentName' can be divided into 'component' and 'name', and 'groupOwnerAddress' into 'group', 'owner' and 'address'. Such long names are therefore broken into separate words. For each separate word, the invention further tries to identify its base word. For example, 'types' and 'subtype' derive from the base word 'type', and the prefix 'm' of 'mflag' and 'mname' should be removed. After this step, the invention obtains the root words of each input parameter. (2) Normalizing variable names. A normalized name is obtained by merging the root words of each input parameter. However, even after segmentation and stemming, meaningless qualifiers are hard to avoid and bias the final names. For example, the variable "linkaddress" is divided into "link" and "address", and both "address" and the qualifier "link" are treated as root words. To remove qualifiers, the invention calculates how often each pair of words occurs together. If two words often occur together, only the more frequent word is retained.
After these steps, the invention has grouped all input validations in the Android system: input validations whose input parameters have the same normalized name are placed in the same group, for use in the subsequent machine learning based on association rule mining.
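The two pre-grouping steps can be sketched as follows. This is a minimal illustration, not the patented implementation: the small `VOCAB` set stands in for a real dictionary lookup, the `min_pair` co-occurrence cutoff is a hypothetical parameter, and the 'm' prefix is stripped only when it is followed by an uppercase letter (the usual Android field convention), which is an assumption.

```python
import re
from collections import Counter

# Hypothetical stand-in vocabulary; a real system would use a dictionary.
VOCAB = {"type", "name", "flag", "address", "link", "package",
         "component", "group", "owner", "uid"}


def split_name(name):
    """Split a camelCase identifier into lowercase words."""
    if len(name) > 1 and name[0] == "m" and name[1].isupper():
        name = name[1:]  # drop the member-field prefix, e.g. mFlag -> Flag
    words = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [w.lower() for w in words]


def stem(word, vocab=VOCAB):
    """Return the longest vocabulary word contained in `word`, else `word`."""
    hits = [v for v in vocab if v in word]
    return max(hits, key=lambda v: (len(v), v)) if hits else word


def normalize(names, min_pair=2):
    """Map each variable name to a normalized name, dropping qualifiers."""
    roots = {n: [stem(w) for w in split_name(n) if not w.isdigit()]
             for n in names}
    word_freq = Counter(w for ws in roots.values() for w in set(ws))
    pair_freq = Counter(frozenset((a, b)) for ws in roots.values()
                        for a in set(ws) for b in set(ws) if a < b)
    result = {}
    for n, ws in roots.items():
        keep = list(dict.fromkeys(ws))  # de-duplicate, preserve order
        for pair, count in pair_freq.items():
            if count >= min_pair and pair <= set(keep):
                rare, _ = sorted(pair, key=lambda w: (word_freq[w], w))
                keep.remove(rare)  # keep only the more frequent word
        result[n] = "_".join(keep)
    return result
```

With this sketch, names such as `linkAddress` and `address` fall into the same group, while a name whose words rarely co-occur keeps all of its root words.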
Learning new sensitive input validations. Without prior knowledge, it is difficult to determine whether a validation involves sensitive input. However, system developers tend to place similar input validations close together. For example, the validations of "packageName" and "uid" are typically adjacent. The invention therefore uses the proximity of input validations as the feature for association mining, and extends the set of sensitive input validations as follows. First, the distance between each pair of input validations is calculated. Two input validations are considered adjacent if they occur in two basic blocks that share an edge. Then, if two input validation groups contain multiple adjacent pairs, the groups are associated with each other. In this way, starting from a few known sensitive input validations, the invention iteratively collects all associated groups until no new group can be found. This method can effectively find a large number of sensitive input validations.
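The iterative expansion can be sketched as a fixed-point search over associated groups. This is a simplification under stated assumptions: it replaces full association rule mining (with support and confidence measures) by a plain count of adjacent pairs, and the `min_pairs` cutoff for "multiple adjacent pairs" is a hypothetical parameter.

```python
from collections import Counter, deque


def expand_sensitive_groups(adjacent_pairs, seeds, min_pairs=2):
    """Grow the set of sensitive groups from the seed groups.

    adjacent_pairs: one (group_a, group_b) tuple per pair of adjacent
    input validations, named by the groups they belong to.
    """
    counts = Counter(frozenset(p) for p in adjacent_pairs if p[0] != p[1])
    neighbours = {}
    for pair, n in counts.items():
        if n >= min_pairs:  # groups share "multiple adjacent pairs"
            a, b = tuple(pair)
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    sensitive = set(seeds)
    queue = deque(seeds)
    while queue:  # stop when no new group can be discovered
        group = queue.popleft()
        for other in neighbours.get(group, ()):
            if other not in sensitive:
                sensitive.add(other)
                queue.append(other)
    return sensitive
```

Starting from a seed group such as `package`, every group that is repeatedly adjacent to an already sensitive group is pulled in, directly or transitively.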
(III) Vulnerability identification based on security rules
The invention detects vulnerabilities caused by insecure input validation in the Android system from two different dimensions and formulates corresponding security rules.
First, through intra-system analysis, the invention looks for insecure input validation within each Android system.
Incorrectly trusting data provided by an application. Some system services verify the caller's identity based on input parameters. Because these parameters come from the application, they can be forged, and all of them should be treated as untrusted. Thus, if a sensitive input validation checks sensitive data supplied by the application, that validation is insecure.
Incorrectly trusting code in the application process. Because input validation is unstructured, it is often placed in the application process. In particular, the Android SDK, which runs inside the application process, often includes various checks on input parameters. Typically, the Android SDK packages data from the application and forwards it to the Android system service process. During this packaging, a large number of input validations check for illegal parameters, and many of them are sensitive, yet they are omitted in the system services. These sensitive input validations are insecure because an application can bypass the Android SDK and access system services directly. Furthermore, the Android SDK is traditionally understood to include only its exposed interfaces, but interfaces that are not exposed and are annotated @hide or @SystemApi are in fact also within reach of an application, which can still access them through reflection.
Second, through inter-system analysis, the invention looks for inconsistent sensitive input validation across multiple Android systems. To find sensitive input validations weakened by third-party vendors, the invention detects inconsistencies between the original Android system and third-party customized systems. It first needs to find similar system methods in the different systems so that their input validations can be compared. The conventional approach of comparing only class names and method names is not applicable here, because many vendor-customized systems introduce new system services that perform functions similar to the original Android services but use significantly different names and are significantly less secure. The invention therefore clusters the public interfaces of different systems by the similarity of method behavior. Specifically, static taint analysis is used to represent the behavior of a method by its data dependency graph. When the behavioral similarity of two interfaces is higher than a threshold, the invention classifies them as similar methods. Then, by comparing whether similar methods have the same sensitive input validations, the invention can find many sensitive input validations omitted by third-party vendors.
By identifying such input validation, the invention can determine the system-level security vulnerabilities it gives rise to, which is significant for strengthening mobile system security and preventing system-level attacks. Specifically, traditional vulnerability discovery for identity verification in the Android system targets the Android permission system, whereas the invention considers identity verification from a new angle, that of input validation. Quantitatively, the Android system contains only about 350 permissions, while the number of sensitive inputs is as high as 700. In terms of identification difficulty, the interfaces for permission checking are well defined by the Android system and are located in only a few Java classes, whereas sensitive input validation is widely distributed and may appear in the Java classes of any system service. Furthermore, identifying sensitive inputs directly in system code depends heavily on expert experience, especially given the large size of the Android code base, so extracting sensitive parameters directly from the Android system is nearly impossible. To solve this problem, the invention is the first to propose that sensitive inputs can be identified by identifying sensitive input validations, since the input parameters checked by a sensitive input validation are generally themselves sensitive. Finally, the security rules for identifying vulnerabilities are formulated from an in-depth understanding of the layered model of the Android system, which is significant for analyzing the system architecture and enhancing its security.
Drawings
FIG. 1 shows the overall framework of the system.
FIG. 2 shows sample code for sensitive input validation.
FIG. 3 shows sensitive input keywords and clusters.
FIG. 4 shows the security rules.
Detailed Description
The invention designs and implements a new method for identifying insecure input validation that combines natural language processing and machine learning. This section describes the implementation of the framework in detail.
Input verification recognition based on code structure analysis
The Android system is analyzed using the Soot framework, a mature Java program decompilation tool. First, the Android system image is decompressed and all Java class files are extracted from it; Soot is then used to decompile them into the intermediate representation of the system code (files in Jimple format). Then, from the decompiled Jimple code, all Android system services, the methods in those services, and their input variables are extracted as the code information to be analyzed. When extracting system services, the invention considers not only all system services declared and registered in Java classes, but also the system services they use. This allows the invention to cover some system services that are implemented in native code.
For the extracted system services, the invention finds all public methods they contain by analyzing their interface definitions and then performs path-sensitive data flow taint analysis on each method. For inter-procedural taint analysis, the invention prunes a large number of nodes that applications cannot reach by filtering out nodes protected by system-level permissions, which greatly reduces the time overhead of path-sensitive taint analysis and the complexity of path traversal.
Sensitive input verification identification based on natural language processing and machine learning
The invention uses the Java-based Stanford Parser for natural language processing. The Stanford Parser is a widely used syntactic parsing tool: it parses the structure of a sentence, assigns part-of-speech tags to the tokens in the sentence, and provides several ways of presenting the dependency relationships among the tokens. It is therefore chosen to perform lexical analysis and dependency analysis. In addition, the invention uses WordNet for longest-match lookup to identify the valid root word of each word: characters are matched successively until the longest matching word is found.
The invention uses association rule mining for machine learning. The specific feature is the distance between neighboring input validations. The distance threshold is set to 3, i.e., if two input validations are found within 3 basic blocks of each other, they are considered related.
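The distance check with a threshold of 3 can be sketched as a bounded breadth-first search over a control-flow graph. The interpretation here is an assumption: distance is counted in control-flow edges, edges are treated as undirected, and the `cfg` dictionary format is hypothetical.

```python
from collections import deque


def within_distance(cfg, src, dst, threshold=3):
    """Return True if basic block `dst` is at most `threshold` edges from `src`.

    cfg: dict mapping a basic block id to the list of its successor ids.
    """
    undirected = {}
    for node, successors in cfg.items():
        for succ in successors:
            undirected.setdefault(node, set()).add(succ)
            undirected.setdefault(succ, set()).add(node)
    seen = {src}
    frontier = deque([(src, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if node == dst:
            return True
        if dist == threshold:
            continue  # do not expand past the distance limit
        for nxt in undirected.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return False
```

Two input validations located in blocks for which this function returns True would be treated as related when building the adjacent pairs.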
Vulnerability identification based on security rules
The invention uses behavioral similarity as the feature for finding similar system service methods. The similarity threshold is set to 0.7. Experiments on 4 third-party vendor customized systems showed that 0.7 is the largest threshold at which similar methods are still found; a larger threshold, e.g. 0.8, finds only identical methods rather than similar ones, which does not meet the needs of the invention. The invention then compares the input validations of similar methods by set difference, i.e., checks whether one method lacks some input validation that the other has, to find system methods whose input validation has been weakened.
With this framework, the invention implements a tool for discovering vulnerabilities based on insecure input validation in the Android system. Detection and analysis on real systems demonstrate the effectiveness of the method. First, through static analysis, the tool covers most system services in the Android system, including those partly implemented in native code. Second, the tool discovered 20 system-level vulnerabilities in 8 Android systems. For example, the system service AccessibilityManagerService identifies the application by the untrusted parameter packageName; by forging this parameter, malicious software can bypass identity verification and gain unauthorized access, leading to attacks such as interface hijacking and password leakage. The system service WindowManagerService uses the untrusted parameter Toast_Type to identify the window type; by forging this parameter, malicious software can bypass permission checking to create a system-level window, enabling window phishing attacks against any application. There are also other vulnerabilities that lead to attacks such as delegation, information leakage, and system log cleansing. These vulnerabilities are not covered by traditional work because they stem from insecure sensitive input validation; the invention fills this gap in the research field.
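The comparison step can be sketched as follows. The text does not say how behavioral similarity is computed, so this sketch assumes Jaccard similarity over the edge sets of the data dependency graphs; the method records (`deps`, `checks`) and all names in them are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, used here as an assumed
    stand-in for the behavioral similarity measure."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def weakened_validations(original, vendor, threshold=0.7):
    """List validations present in an original method but missing from a
    behaviorally similar vendor method.

    original / vendor: dict mapping a method name to
    {"deps": set of data dependency edges, "checks": set of validated params}.
    """
    findings = []
    for v_name, v in vendor.items():
        for o_name, o in original.items():
            if jaccard(o["deps"], v["deps"]) > threshold:
                missing = o["checks"] - v["checks"]  # set difference
                if missing:
                    findings.append((o_name, v_name, sorted(missing)))
    return findings
```

Each finding names an original method, a similar vendor method, and the sensitive parameters the vendor method no longer checks.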

Claims (3)

1. A method for identifying insecure sensitive input validation in the Android system, characterized in that the insecure data sources relied upon when input is validated are identified, the specific steps being:
(I) input validation identification based on code structure analysis: interrupt branches are first extracted from the program code, and the code structure features are analyzed to find independent program branches that contain an interrupt instruction, in order to judge whether the current program execution is intended to check its input;
(II) sensitive input validation identification based on natural language processing and machine learning: natural language processing is used to cluster a large number of input parameters by semantics so that synonymous parameters are grouped together; then, by specifying a small number of known sensitive parameters, machine learning is used to infer the other, unknown sensitive parameters; and finally,
(III) vulnerability identification based on security rules: the input validations that involve sensitive parameters are checked against the security rules to determine whether they are insecure input validations;
wherein, in the input validation identification based on code structure analysis:
the following four interrupt operations are first summarized: (1) throwing an exception, i.e., the direct way to signal that an application input violates input validation is to throw an exception; (2) returning a constant: the system service uses predefined constants to indicate that the caller failed input validation, and returns such a constant as the return value in the interrupt branch; (3) logging and returning: in the interrupt branch, some information about the illegal input is logged, and the method then returns; (4) reclaiming resources and returning: in some cases, the system service needs to reclaim allocated resources first and then return directly;
based on the identification of these four interrupt operations, the input validation identification proceeds as follows: first, all branch statements in system services that can receive application input are determined; then, code structure analysis is used to judge whether these branch statements contain an interrupt branch; in addition, branch statements that generate a large number of program branches depending on the input are removed from the identification results.
2. The method for identifying insecure sensitive input validation in the Android system according to claim 1, characterized in that, in the sensitive input validation identification based on natural language processing and machine learning:
using machine learning, a small set of known sensitive input validations is marked as starting samples, and association rule mining is used to let the machine learning automatically learn the rest;
when sensitive inputs are marked, a few initial known sensitive input validations are specified, and association rule mining is then used to automatically discover other potentially sensitive input validations; specifically:
pre-grouping of input validations: input validations are pre-grouped by their input parameters in the following two steps: (1) segmenting variable names and extracting stems; (2) normalizing variable names: a normalized name is obtained by merging the root words of each input parameter, wherein the frequency of occurrence of each pair of words is calculated; (3) removing qualifiers: if two words often occur together, only the more frequent word is retained; in this way, input validations whose input parameters have the same normalized name are placed in the same group, for the subsequent machine learning based on association rule mining;
learning new sensitive input validations: the set of sensitive input validations is extended by association rule mining; first, the distance between each pair of input validations is calculated; two input validations are considered adjacent if they occur in two basic blocks that share an edge; then, if two input validation groups contain multiple adjacent pairs, the groups are associated with each other; in this way, starting from a few known sensitive input validations, all associated groups are collected iteratively until no new group can be found.
3. The method for identifying insecure sensitive input validation in the Android system according to claim 2, characterized in that, in the vulnerability identification based on security rules:
first, through intra-system analysis, insecure input validation is looked for within each Android system, including:
incorrectly trusting data provided by an application: some system services verify the caller's identity based on input parameters, but because these parameters come from the application, they can be forged, and all of them should be treated as untrusted; therefore, if a sensitive input validation checks sensitive data supplied by the application, that sensitive input validation is insecure;
incorrectly trusting code in the application process: because input validation is unstructured, it is often placed in the application process;
second, through inter-system analysis, inconsistent sensitive input validation is looked for across multiple Android systems; to find sensitive input validations weakened by third-party vendors, inconsistent sensitive input validations between the original Android system and third-party customized systems are detected; first, similar system methods are found in the different systems so that their input validations can be compared for consistency; the public interfaces of the different systems are clustered by the similarity of method behavior; specifically, static taint analysis is used to represent the behavior of a method by its data dependency graph; when the behavioral similarity of two function interfaces is higher than a threshold, they are classified as similar methods; then, by comparing whether similar methods have the same sensitive input validations, many sensitive input validations omitted by third-party vendors are found.
validations between the Android original system and third-party customized systems; first, in the Find similar system methods among different systems to compare whether their input validations are consistent; cluster the common interfaces of different systems according to the similarity of method behavior; specifically, use static taint analysis technology to represent the data based on their data dependency graphs. Method behavior; when the behavior similarity of two functional interfaces is higher than the threshold, they are classified as similar methods; then, by comparing whether similar methods have the same sensitive input verification, so as to find many ignored by third-party vendors Sensitive input validation.
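The pre-grouping and group-expansion steps of claim 2 can be illustrated with a minimal Python sketch. It is not the patented implementation: the qualifier list, the toy stemmer, the `min_pairs=2` reading of "multiple adjacent pairs", and all function names are assumptions made for illustration, and the co-occurrence pruning of step (3) is omitted for brevity. Adjacency between validations (basic blocks sharing a common edge) is assumed to be supplied by an earlier control-flow analysis.

```python
import re
from collections import Counter, defaultdict

# Hypothetical qualifier words; the claims do not enumerate them.
QUALIFIERS = {"new", "old", "tmp", "cur", "m"}


def split_name(name):
    """Split a camelCase or snake_case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def stem(word):
    """Toy stemmer (assumption): strip a plural 's' from longer words."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def normalized_name(param):
    """Drop qualifiers, stem the remaining words, and merge the stems."""
    return "_".join(stem(w) for w in split_name(param) if w not in QUALIFIERS)


def pregroup(validations):
    """Map each validation id to the group named by its normalized parameter."""
    return {vid: normalized_name(param) for vid, param in validations.items()}


def expand_sensitive_groups(groups, adjacent_pairs, seeds, min_pairs=2):
    """Grow the sensitive set from seed groups until no new group is found.

    groups: validation id -> group name (output of pregroup)
    adjacent_pairs: (id, id) pairs whose basic blocks share a common edge
    seeds: group names known to be sensitive
    """
    # Count adjacent validation pairs between each pair of distinct groups.
    link_count = Counter()
    for a, b in adjacent_pairs:
        ga, gb = groups[a], groups[b]
        if ga != gb:
            link_count[frozenset((ga, gb))] += 1

    # Two groups are associated when they share enough adjacent pairs.
    neighbors = defaultdict(set)
    for pair, count in link_count.items():
        if count >= min_pairs:
            ga, gb = tuple(pair)
            neighbors[ga].add(gb)
            neighbors[gb].add(ga)

    # Iteratively collect associated groups, starting from the seeds.
    sensitive = set(seeds)
    frontier = list(seeds)
    while frontier:
        group = frontier.pop()
        for other in neighbors[group]:
            if other not in sensitive:
                sensitive.add(other)
                frontier.append(other)
    return sensitive
```

For example, parameters named `newUid` and `uid` fall into one group, and `packageName` and `newPackageNames` into another; if two adjacent validation pairs connect those groups and `uid` is a seed, the expansion also marks the `package_name` group as sensitive.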
CN201811163790.XA 2018-10-02 2018-10-02 Method for verifying and identifying unsafe and sensitive input in android system Active CN109299610B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811163790.XA CN109299610B (en) 2018-10-02 2018-10-02 Method for verifying and identifying unsafe and sensitive input in android system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811163790.XA CN109299610B (en) 2018-10-02 2018-10-02 Method for verifying and identifying unsafe and sensitive input in android system

Publications (2)

Publication Number Publication Date
CN109299610A CN109299610A (en) 2019-02-01
CN109299610B true CN109299610B (en) 2021-03-30

Family

ID=65161646

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811163790.XA Active CN109299610B (en) 2018-10-02 2018-10-02 Method for verifying and identifying unsafe and sensitive input in android system

Country Status (1)

Country Link
CN (1) CN109299610B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11715111B2 (en) * 2018-09-25 2023-08-01 Capital One Services, Llc Machine learning-driven servicing interface
CN112395884B (en) * 2020-11-15 2022-04-12 Fudan University Android API semantic relation map construction method based on code document

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101894134A (en) * 2010-06-21 2010-11-24 Nanjing University of Posts and Telecommunications A Spatial Layout Based Phishing Webpage Detection and Its Implementation Method
CN105022958A (en) * 2015-07-11 2015-11-04 Fudan University Android application used application program vulnerability detection and analysis method based on code library security specifications
CN105190564A (en) * 2013-04-11 2015-12-23 Oracle International Corporation Predictive diagnosis of SLA violations in cloud services through seasonal trending and forecasting with thread intensity analysis
CN106649783A (en) * 2016-12-28 2017-05-10 Shanghai Zhizhen Intelligent Network Technology Co., Ltd. Synonym mining method and apparatus
CN107810480A (en) * 2015-06-26 2018-03-16 Microsoft Technology Licensing, LLC Distributed according to the instruction block of performance metric
CN107925659A (en) * 2015-08-15 2018-04-17 Microsoft Technology Licensing, LLC Domain on no domain server adds virtual name
CN108171073A (en) * 2017-12-06 2018-06-15 Fudan University A kind of private data recognition methods based on the parsing driving of code layer semanteme

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Identifying User-Input Privacy in Mobile Applications at a Large Scale; Yuhong Nan et al.; IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY; 2017-03-31; entire document *

Also Published As

Publication number Publication date
CN109299610A (en) 2019-02-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant