
CN103136477B - Method and system for scanning file samples - Google Patents

Method and system for scanning file samples

Info

Publication number
CN103136477B
Authority
CN
China
Prior art keywords
file
sample
gray
samples
dangerous
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310071272.6A
Other languages
Chinese (zh)
Other versions
CN103136477A (en)
Inventor
冯鑫
李振博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd, Qizhi Software Beijing Co Ltd
Priority to CN201310071272.6A
Publication of CN103136477A
Application granted
Publication of CN103136477B
Legal status: Active
Anticipated expiration


Abstract

The invention discloses a method and system for scanning file samples. The method comprises: for gray file samples among the file samples, selecting the gray file samples to be scanned from the stored gray file samples according to a preset strategy; selecting identifiers for scanning the gray file samples according to the update record and/or the scan record of each identifier; and scanning the gray file samples to be scanned with the selected identifiers and storing the scan results, so that the scan results can be returned when a request asking whether a file sample is safe is received. A gray file sample is a file sample whose safety is unknown. The invention saves resources on the equipment that performs the scanning and improves scanning efficiency and speed.

Description

Method and system for scanning file samples

Technical field

The invention relates to the field of computer network security, and in particular to a method and system for scanning file samples.

Background

In the field of network security, virus files routinely need to be detected and removed. "Virus file" is an umbrella term for any application file deliberately created to perform unauthorized and usually harmful actions. Examples include computer viruses, backdoor programs, keyloggers, password stealers, Word and Excel macro viruses, boot sector viruses, script viruses, and Trojan horses.

In the prior art, virus detection relies mainly on the signature database model. A signature database consists of the signatures of virus file samples collected by a vendor. To produce a signature, an analysis engineer finds where a virus file differs from legitimate files and extracts a piece of file code that acts like a "keyword"; that piece of code is the signature. During scanning, the engine reads a file and matches it against all signatures in the signature database; if the file's code matches a signature, the file is judged to be a virus file.

However, the number of virus files is growing geometrically. At this explosive rate, generation and updating of the signature database often lag behind, and in many cases the standalone scan engine on a terminal cannot detect unknown virus files.

Active defense methods were therefore developed in the prior art. Active defense analyzes and judges file behavior autonomously and performs detection in real time. Instead of using signatures as the basis for identifying a virus file, it starts from the original definition of the file and uses file behavior directly as the basis for judgment. Derived approaches include using a local signature database, setting local behavior thresholds, and local heuristic antivirus scanning to identify and intercept the behavior of virus files, which protects the terminal to some extent.

However, local active defense has problems of its own. First, virus files can easily evade it. For example, packing a virus file is enough to get past the signature-database mode of local active defense, and reducing or replacing the behaviors that a virus file performs keeps it below the trigger limit of the behavior-threshold mode. In addition, local active defense depends on timely updates of the local database; if the database is not updated in time, virus files go undetected.

To address these problems, the prior art also includes cloud-security-based active defense, which does not depend on a local database and performs the analysis and comparison operations of active defense on the server side.

However, cloud-security-based active defense typically involves hundreds of millions of file samples to be checked. Each identifier that scans file samples has a signature database maintained by dedicated analysts, and file samples that have already been scanned may have been missed (false negatives) or wrongly flagged (false positives). Because the signature databases are upgraded continuously, rescanning file samples can make up for earlier false negatives and correct earlier false positives. But if every file sample is scanned on every pass, a large amount of server-side resources is consumed.

Summary of the invention

In view of the above problems, the invention proposes a method and system for scanning file samples that overcome the problem of excessive resource consumption during file scanning.

According to one aspect of the invention, a method for scanning file samples is provided, the method comprising:

for gray file samples among the file samples, selecting the gray file samples to be scanned from the stored gray file samples according to a preset strategy;

selecting identifiers for scanning the gray file samples according to the update record of each identifier and/or the scan record of each identifier;

scanning the gray file samples to be scanned with the selected identifiers, and storing the scan results so that they can be returned when a request asking whether a file sample is safe is received;

wherein a gray file sample is a file sample whose safety is unknown.

Selecting the gray file samples to be scanned from the stored gray file samples according to the preset strategy specifically comprises:

deriving a false negative rate for each gray file sample from the attributes of the gray file sample, and selecting the gray file samples to be scanned from the stored gray file samples according to the false negative rate.

The method further comprises:

for dangerous file samples among the file samples, determining the identifiers that reported each dangerous file sample as a virus file, a dangerous file sample being a file sample that has been reported as a virus file by an identifier;

rescanning the dangerous file sample with the identifiers that reported it as a virus file, and, if the scan result shows that the dangerous file sample is no longer a virus file, determining that the dangerous file sample was falsely reported as a virus file and performing a false-positive removal operation on it.
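The rescan step can be illustrated with a minimal Python sketch. It is not taken from the patent: the callable model of an identifier, the name `is_false_positive`, and the rule that every original reporter must now clear the sample are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable

# Hypothetical model: an identifier is a callable that returns True when it
# flags the sample as a virus file.
Identifier = Callable[[bytes], bool]


def is_false_positive(sample: bytes,
                      reporters: Iterable[str],
                      identifiers: Dict[str, Identifier]) -> bool:
    """Rescan a dangerous sample with only the identifiers that reported it.

    Assumption: the sample counts as a false positive only if every
    original reporter now clears it.
    """
    names = list(reporters)
    if not names:
        # No reporter on record, so there is nothing to re-check.
        return False
    return all(not identifiers[name](sample) for name in names)
```

Only the reporting identifiers are consulted, so identifiers that never flagged the sample cost nothing during the rescan.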

After determining the identifiers that reported the dangerous file sample as a virus file, the method further comprises:

deriving a false positive rate for the dangerous file sample from the determined identifiers, and establishing a correspondence between the number of identifiers and the false positive rate;

selecting the dangerous file samples to be scanned from the stored dangerous file samples according to the false positive rate;

and rescanning the dangerous file sample with the identifiers that reported it as a virus file specifically comprises:

for each dangerous file sample to be scanned, rescanning that dangerous file sample with the identifiers that reported it as a virus file.
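A small Python sketch of the correspondence between the number of reporting identifiers and the false positive rate follows. The table values, the default rate, and the function names are hypothetical; the source does not give concrete numbers. The sketch assumes that fewer reporters imply a higher false positive rate and that samples above a threshold are selected.

```python
from typing import Dict, List

# Hypothetical correspondence: number of reporting identifiers -> assumed
# false positive rate. The values are illustrative only.
FP_RATE_BY_REPORTER_COUNT = {1: 0.05, 2: 0.01, 3: 0.001}
DEFAULT_FP_RATE = 0.0001  # assumed rate when four or more identifiers agree


def false_positive_rate(reporter_count: int) -> float:
    """Look up the assumed false positive rate for a reporter count."""
    return FP_RATE_BY_REPORTER_COUNT.get(reporter_count, DEFAULT_FP_RATE)


def select_for_rescan(reporters_by_sample: Dict[str, List[str]],
                      threshold: float) -> List[str]:
    """Return the dangerous samples whose rate exceeds the threshold."""
    return [sample_id
            for sample_id, reporters in reporters_by_sample.items()
            if false_positive_rate(len(reporters)) > threshold]
```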

The method further comprises:

when a request asking whether a file sample is safe is received, recording the received request in a log;

extracting, from the log records, the file samples whose number of queries within a preset time exceeds a preset popularity threshold, the extracted file samples being active file samples.
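Extracting active file samples from the query log might look like the following Python sketch. The log format of `(timestamp, sample_id)` pairs and the parameter names are assumptions; the source only states that queries are counted within a preset time and compared with a popularity threshold.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def active_samples(log: Iterable[Tuple[float, str]],
                   now: float,
                   window: float,
                   popularity_threshold: int) -> Set[str]:
    """Return the ids queried more than `popularity_threshold` times
    during the last `window` time units.

    `log` is assumed to hold (timestamp, sample_id) pairs.
    """
    counts = Counter(sample_id
                     for timestamp, sample_id in log
                     if now - window <= timestamp <= now)
    return {sample_id for sample_id, n in counts.items()
            if n > popularity_threshold}
```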

Deriving the false negative rate of the gray file sample from its attributes specifically comprises:

extracting the gray file samples from the active file samples, and deriving the false negative rate of each extracted gray file sample from its attributes;

and selecting the gray file samples to be scanned from the stored gray file samples according to the false negative rate specifically comprises:

selecting, from the extracted gray file samples, those whose false negative rate exceeds a preset false negative rate threshold, the selected gray file samples being the gray file samples to be scanned.
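The two-stage selection (gray samples among the active ones, then those above the threshold) can be sketched in Python as follows. The `Sample` record, its status labels, and the pluggable `estimate_miss_rate` callback are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Sample:
    sample_id: str
    status: str  # hypothetical labels: "gray", "dangerous", "safe"
    size: int    # file size in bytes


def select_gray_for_scan(active: Iterable[Sample],
                         estimate_miss_rate: Callable[[Sample], float],
                         threshold: float) -> List[Sample]:
    """Keep active gray samples whose estimated false negative rate
    exceeds the preset threshold."""
    gray = [s for s in active if s.status == "gray"]
    return [s for s in gray if estimate_miss_rate(s) > threshold]
```

Restricting the estimate to active gray samples keeps the cost of the estimation step proportional to what users are actually querying.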

Determining the identifiers that reported the dangerous file sample as a virus file specifically comprises:

extracting the dangerous file samples from the active file samples, and determining the identifiers that reported each extracted dangerous file sample as a virus file;

and selecting the dangerous file samples to be scanned from the stored dangerous file samples according to the false positive rate specifically comprises:

selecting, from the extracted dangerous file samples, those whose false positive rate exceeds a preset false positive rate threshold, the selected dangerous file samples being the dangerous file samples to be scanned.

Deriving the false negative rate of the gray file sample from its attributes specifically comprises:

calculating, from statistics compiled on the characteristics of virus files together with the attributes of the gray file sample, the probability that the gray file sample is a virus file, and using that probability as a parameter for calculating the false negative rate of the gray file sample.

Calculating the probability that the gray file sample is a virus file from the statistics compiled on the characteristics of virus files and the attributes of the gray file sample specifically comprises:

calculating the probability that the gray file sample is a virus file from statistics compiled on the sizes of virus files and the size of the gray file sample;

and/or,

calculating the probability that the gray file sample is a virus file from statistics compiled on the paths of virus files and the path of the gray file sample;

and/or,

calculating the probability that the gray file sample is a virus file from a list of dangerous operation behaviors, compiled from statistics on the operation behaviors of virus files, and the operation behaviors of the gray file sample.
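The source does not say how a behavior-based probability is computed or how the size, path, and behavior probabilities are combined. The Python sketch below assumes a simple fraction-of-dangerous-behaviors score and a maximum over whichever probabilities are available; the behavior names and both functions are hypothetical.

```python
from typing import Iterable, Optional

# Hypothetical list of dangerous operation behaviors.
DANGEROUS_BEHAVIORS = {"modify_boot_sector", "inject_process", "log_keystrokes"}


def behavior_probability(behaviors: Iterable[str]) -> float:
    """Assumed score: the fraction of the sample's distinct behaviors
    that appear on the dangerous list."""
    observed = set(behaviors)
    if not observed:
        return 0.0
    return len(observed & DANGEROUS_BEHAVIORS) / len(observed)


def miss_rate(size_p: Optional[float] = None,
              path_p: Optional[float] = None,
              behavior_p: Optional[float] = None) -> float:
    """Combine whichever probabilities were computed (the "and/or" above).

    Assumption: the false negative rate is the largest available value.
    """
    available = [p for p in (size_p, path_p, behavior_p) if p is not None]
    return max(available, default=0.0)
```

Taking the maximum is a conservative choice: a sample that looks suspicious on any single attribute is still rescanned.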

Selecting identifiers for scanning the gray file samples according to the scan record of each identifier specifically comprises:

for each identifier and each gray file sample to be scanned, calculating the scan interval of that gray file sample for that identifier from the number of times the identifier has scanned the gray file sample;

selecting the identifiers for scanning the gray file samples according to the scan interval.
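This passage does not give the formula that maps a scan count to a scan interval. The Python sketch below assumes an interval that doubles with every completed scan and treats an identifier as due once that interval has elapsed since its last scan; the base interval, the record layout, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

BASE_INTERVAL_HOURS = 1.0  # hypothetical base interval


def scan_interval(scan_count: int) -> float:
    """Assumed rule: the interval doubles with each completed scan."""
    return BASE_INTERVAL_HOURS * (2 ** scan_count)


def due_identifiers(scan_records: Dict[str, Tuple[int, float]],
                    now_hours: float) -> List[str]:
    """Return the identifiers whose interval for this sample has elapsed.

    `scan_records` maps an identifier name to (scan_count, last_scan_time)
    for one gray file sample, with times expressed in hours.
    """
    return sorted(name
                  for name, (count, last_scan) in scan_records.items()
                  if now_hours - last_scan >= scan_interval(count))
```

Under this assumption a sample that an identifier has already scanned many times is revisited less and less often by that identifier.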

Selecting identifiers for scanning the gray file samples according to the update record of each identifier specifically comprises:

selecting, according to the update records, the identifiers that have been updated since the last scan.
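A minimal Python sketch of the update-based selection follows. The two dictionaries of timestamps and the rule that an identifier which has never scanned the sample is always selected are assumptions for illustration.

```python
from typing import Dict, List


def updated_since_last_scan(last_update: Dict[str, float],
                            last_scan: Dict[str, float]) -> List[str]:
    """Return the identifiers updated after they last scanned the sample.

    `last_update` maps identifier name -> time of its latest update.
    `last_scan` maps identifier name -> time it last scanned the sample.
    Assumption: an identifier with no scan on record is always selected.
    """
    return sorted(name
                  for name, updated_at in last_update.items()
                  if name not in last_scan or updated_at > last_scan[name])
```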

Before deriving the false negative rate of the gray file sample from its attributes, the method further comprises:

determining whether the first discovery time of the gray file sample is earlier than a preset time threshold; if so, setting the false negative rate of the gray file sample to 0; if not, performing the operation of deriving the false negative rate of the gray file sample from its attributes.
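The age check acts as a guard in front of the estimation step, as in the Python sketch below. The function name and the callback used for the attribute-based estimate are hypothetical.

```python
from datetime import datetime
from typing import Callable


def miss_rate_with_age_cutoff(first_seen: datetime,
                              cutoff: datetime,
                              estimate: Callable[[], float]) -> float:
    """Return 0 for samples first discovered before the cutoff; otherwise
    fall through to the attribute-based estimate."""
    if first_seen < cutoff:
        return 0.0
    return estimate()
```

With a threshold-based selection, a rate of 0 means that old samples are never picked for rescanning, and the estimate is not even computed for them.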

According to another aspect of the invention, a system for scanning file samples is disclosed, the system comprising: a sample storage device, an antivirus engine, a scan scheduling device, and a sample scanning device containing multiple identifiers;

the sample storage device is adapted to store file samples;

the scan scheduling device is adapted, for gray file samples among the file samples, to select the gray file samples to be scanned from the gray file samples stored in the sample storage device according to a preset strategy, and to select identifiers for scanning the gray file samples according to the update record and/or the scan record of each identifier;

the sample scanning device is adapted to obtain the gray file samples to be scanned from the sample storage device, scan them with the identifiers selected by the scan scheduling device, and store the scan results in the antivirus engine;

the antivirus engine is adapted to store the scan results of file samples and to return the scan results when a request asking whether a file sample is safe is received;

wherein a gray file sample is a file sample whose safety is unknown.

The scan scheduling device is specifically adapted to derive the false negative rate of each gray file sample from its attributes and to select the gray file samples to be scanned from the stored gray file samples according to the false negative rate.

The scan scheduling device is further adapted, for dangerous file samples among the file samples, to determine the identifiers that reported each dangerous file sample as a virus file, a dangerous file sample being a file sample that has been reported as a virus file by an identifier;

the sample scanning device is further adapted to obtain the dangerous file samples from the sample storage device and rescan them with the identifiers determined by the scan scheduling device; if the scan result shows that a dangerous file sample is no longer a virus file, the sample scanning device determines that the dangerous file sample was falsely reported as a virus file and performs a false-positive removal operation on it.

The scan scheduling device is further adapted, after determining the identifiers that reported the dangerous file sample as a virus file, to derive the false positive rate of the dangerous file sample from the determined identifiers, establish a correspondence between the number of identifiers and the false positive rate, and select the dangerous file samples to be scanned from the dangerous file samples stored in the sample storage device according to the false positive rate;

the sample scanning device is specifically adapted, for each dangerous file sample to be scanned, to rescan that dangerous file sample with the identifiers that reported it as a virus file.

The antivirus engine is further adapted to record the received request in a log when a request asking whether a file sample is safe is received;

the scan scheduling device is further adapted to extract, from the log records, the file samples whose number of queries within a preset time exceeds a preset popularity threshold, the extracted file samples being active file samples.

The scan scheduling device is specifically adapted to extract the gray file samples from the active file samples, derive the false negative rate of each extracted gray file sample from its attributes, and select, from the extracted gray file samples, those whose false negative rate exceeds a preset false negative rate threshold, the selected gray file samples being the gray file samples to be scanned.

The scan scheduling device is specifically adapted to extract the dangerous file samples from the active file samples, determine the identifiers that reported each extracted dangerous file sample as a virus file, derive the false positive rate of each extracted dangerous file sample from the determined identifiers, and select, from the extracted dangerous file samples, those whose false positive rate exceeds a preset false positive rate threshold, the selected dangerous file samples being the dangerous file samples to be scanned.

The scan scheduling device is specifically adapted to calculate, from statistics compiled on the characteristics of virus files together with the attributes of the gray file sample, the probability that the gray file sample is a virus file, and to use that probability as a parameter for calculating the false negative rate of the gray file sample.

The scan scheduling device is specifically adapted to calculate the probability that the gray file sample is a virus file from statistics compiled on the sizes of virus files and the size of the gray file sample;

and/or,

to calculate the probability that the gray file sample is a virus file from statistics compiled on the paths of virus files and the path of the gray file sample;

and/or,

to calculate the probability that the gray file sample is a virus file from a list of dangerous operation behaviors, compiled from statistics on the operation behaviors of virus files, and the operation behaviors of the gray file sample.

The scan scheduling device is specifically adapted, for each identifier and each gray file sample to be scanned, to calculate the scan interval of that gray file sample for that identifier from the number of times the identifier has scanned the gray file sample, and to select the identifiers for scanning the gray file samples according to the scan interval.

The scan scheduling device is specifically adapted to select, according to the update records, the identifiers that have been updated since the last scan.

The scan scheduling device is further adapted, before deriving the false negative rate of the gray file sample from its attributes, to determine whether the first discovery time of the gray file sample is earlier than a preset time threshold; if so, the false negative rate of the gray file sample is set to 0; if not, the operation of deriving the false negative rate of the gray file sample from its attributes is performed.

According to the technical solution of the invention, when the gray file samples among the file samples are scanned, the gray file samples to be scanned are selected from the stored gray file samples according to a preset strategy; the identifiers for scanning the gray file samples are selected according to the update record and/or the scan record of each identifier; and the gray file samples to be scanned are scanned with the selected identifiers and the scan results are stored, so that the scan results can be returned when a request asking whether a file sample is safe is received.

Because the gray file samples are selected according to a preset strategy and the identifiers are selected according to their update records and/or scan records, the scanning workload is reduced while earlier false negatives can still be made up for. This solves the problem that scanning every file sample consumes a large amount of resources, and it yields the benefits of saving resources on the scanning equipment and improving scanning efficiency and speed.

The above is only an overview of the technical solution of the invention. So that the technical means of the invention can be understood more clearly and implemented according to the contents of the description, and so that the above and other objects, features, and advantages of the invention are easier to understand, specific embodiments of the invention are set out below.

Brief description of the drawings

Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:

Fig. 1 shows a structural diagram of a system for scanning file samples according to an embodiment of the invention;

Fig. 2 shows a flowchart of a method for scanning file samples according to an embodiment of the invention;

Fig. 3 shows a flowchart of selecting gray file samples according to the false negative rate according to an embodiment of the invention;

Fig. 4 shows a flowchart of scanning dangerous file samples in a method for scanning file samples according to an embodiment of the invention;

Fig. 5 shows a flowchart of scanning gray file samples in a method for scanning file samples according to an embodiment of the invention;

Fig. 6 shows a flowchart of scanning dangerous file samples in a method for scanning file samples according to an embodiment of the invention.

Detailed description

Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope can be conveyed fully to those skilled in the art.

Referring to Fig. 1, a system for scanning file samples according to an embodiment of the invention is shown. The system comprises: a sample storage device 100, an antivirus engine 200, a scan scheduling device 300, and a sample scanning device 400 containing multiple identifiers. The devices of the system may all reside in the same physical machine, or they may reside in different physical machines.

The sample storage device 100 is adapted to store file samples.

The scan scheduling device 300 is adapted, for gray file samples among the file samples, to select the gray file samples to be scanned from the gray file samples stored in the sample storage device 100 according to a preset strategy, and to select identifiers for scanning the gray file samples according to the update record and/or the scan record of each identifier. An identifier may be an antivirus application used to check the safety of file samples, for example BitDefender (an antivirus application from Romania), the antivirus application provided by QVM (Qihoo Support Vector Machine), or a cloud antivirus engine. A gray file sample is a file sample whose safety is unknown.

Specifically, the scan scheduling device 300 derives a false negative rate for each gray file sample from the attributes of that sample, and selects the gray file samples to be scanned from the stored gray file samples according to the false negative rate.

For example, the scan scheduling device 300 calculates the probability that a gray file sample is a virus file from statistics collected on the characteristics of virus files together with the attributes of the gray file sample, and uses this probability as a parameter for calculating the false negative rate of the gray file sample.

For example, the scan scheduling device 300 collects statistics on the size, path, and/or behavior of virus files, and derives from these statistics the probability that a file sample is a virus file.

The scan scheduling device 300 calculates the probability that a gray file sample is a virus file from statistics on the size of virus files and the size of the gray file sample.

Virus files are usually small so that they spread easily. Statistics are collected on virus files, for example using Hadoop (a distributed computing platform), to obtain a curve relating file size to the rate at which files are reported as viruses, and the probability that a gray file sample is a virus file is read from this curve.

If the probability derived from the size of the gray file sample is the only parameter used to calculate the false negative rate, that probability is taken as the false negative rate, and gray file samples whose false negative rate is greater than a preset false negative rate threshold are selected as the gray file samples to be scanned. For example, if the false negative rate threshold is 0.001% and file samples of 10 MB or larger are reported as viruses at a rate of 0.001%, gray file samples smaller than 10 MB are selected as the gray file samples to be scanned.
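The size-based selection can be sketched as a lookup on a stepwise curve. This is only a minimal illustration: the curve values, the names `SIZE_CURVE`, `size_probability` and `select_by_size`, and the sample dictionary layout are assumptions, not statistics or interfaces from the source.

```python
from bisect import bisect_right

MB = 1024 * 1024

# Hypothetical size-to-virus-rate curve: (exclusive upper size bound, reported-virus rate).
# The rates are illustrative assumptions, not measured statistics.
SIZE_CURVE = [(1 * MB, 0.05), (5 * MB, 0.01), (10 * MB, 0.001)]
TAIL_RATE = 0.00001  # assumed rate (0.001%) for files of 10 MB or larger


def size_probability(size_bytes):
    """Read the probability that a file of this size is a virus file from the curve."""
    bounds = [bound for bound, _ in SIZE_CURVE]
    i = bisect_right(bounds, size_bytes)
    return SIZE_CURVE[i][1] if i < len(SIZE_CURVE) else TAIL_RATE


def select_by_size(samples, threshold=0.00001):
    """Keep gray samples whose size-derived false negative rate exceeds the threshold."""
    return [s for s in samples if size_probability(s["size"]) > threshold]
```

With a threshold of 0.001%, samples of 10 MB or more fall exactly on the threshold and are therefore not selected, which matches the worked example in the text.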

The scan scheduling device 300 calculates the probability that a gray file sample is a virus file from statistics on the paths of virus files and the path of the gray file sample.

Offline statistics on the paths of virus files, for example computed with Hadoop (a distributed computing platform), yield a curve relating file path to the rate at which files are reported as viruses, and the probability that a gray file sample is a virus file is read from this curve.

If the probability derived from the path of the gray file sample is the only parameter used to calculate the false negative rate, that probability is taken as the false negative rate, and gray file samples whose false negative rate is greater than the preset false negative rate threshold are selected as the gray file samples to be scanned.

The scan scheduling device 300 calculates the probability that a gray file sample is a virus file from a list of dangerous operation behaviors, compiled from statistics on the operation behaviors of virus files, and the operation behaviors of the gray file sample.

The list of dangerous operation behaviors may include one or more of the following:

writing to the registry to load automatically;

modifying the registry;

modifying system files;

modifying specified application files;

performing inter-process injection;

terminating processes;

modifying web page content in a browser; and

recording keyboard operations.

The probability that a gray file sample is a virus file is calculated from the number of dangerous operation behaviors the sample triggers. The more dangerous operation behaviors it triggers, the higher the probability that the gray file sample is a virus file. For example, the number of dangerous operation behaviors triggered by the gray file sample is divided by the total number of dangerous operation behaviors in the list to give the probability that the gray file sample is a virus file.
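The ratio described above can be sketched as follows. The behavior names and the function `behavior_probability` are hypothetical labels chosen for illustration; the source only gives the behaviors in prose.

```python
# Illustrative labels for the eight dangerous operation behaviors listed in the text.
DANGEROUS_BEHAVIORS = frozenset({
    "registry_autoload", "modify_registry", "modify_system_file",
    "modify_app_file", "process_injection", "terminate_process",
    "modify_browser_page", "record_keystrokes",
})


def behavior_probability(observed_behaviors):
    """Triggered dangerous behaviors divided by the size of the dangerous-behavior list."""
    triggered = DANGEROUS_BEHAVIORS & set(observed_behaviors)
    return len(triggered) / len(DANGEROUS_BEHAVIORS)
```

Behaviors that are not on the list are ignored, so a sample that only performs harmless operations gets a probability of 0.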

If the probability derived from the number of dangerous operation behaviors triggered by the gray file sample is the only parameter used to calculate the false negative rate, that probability is taken as the false negative rate, and gray file samples whose false negative rate is greater than the preset false negative rate threshold are selected as the gray file samples to be scanned.

If several of the probabilities obtained above are used as parameters for calculating the false negative rate, a weight may be set for each parameter, and the false negative rate of the gray file sample is the weighted sum of the parameters.
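A weighted sum of this kind might look like the sketch below. The parameter keys and the weight values are assumptions made for illustration; the source does not specify them.

```python
def false_negative_rate(probabilities, weights):
    """Weighted sum of the per-attribute probabilities (size, path, behavior, ...)."""
    return sum(weights[name] * p for name, p in probabilities.items())


# Hypothetical weights; in practice they would be tuned by the operator.
WEIGHTS = {"size": 0.2, "path": 0.3, "behavior": 0.5}
```

If the weights sum to 1 and each probability lies in [0, 1], the result also lies in [0, 1] and can be compared directly with the false negative rate threshold.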

When selecting identifiers, the identifiers used to scan the gray file samples may be selected according to the scan records of the respective identifiers.

For each identifier and each gray file sample to be scanned, the scan scheduling device 300 calculates the scan interval of that gray file sample for that identifier from the number of times the identifier has scanned the sample, and selects the identifiers used to scan the gray file sample according to the scan interval. The more times a sample has been scanned, the longer the scan interval. For example, the scan interval is calculated with the formula T = (log N) × 1.5 + 1, where N is the number of times a given identifier has scanned a given gray file sample and T is the scan interval of that gray file sample for that identifier.
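The interval formula can be sketched as below. The source does not state the base of the logarithm or the unit of T, so base 10 and an unspecified time unit are assumptions here, as is the treatment of a sample that has never been scanned (N = 0), for which the formula is undefined.

```python
import math


def scan_interval(scan_count):
    """T = (log N) * 1.5 + 1, with log assumed to be base 10."""
    if scan_count < 1:
        # Assumption: a never-scanned sample is due immediately.
        return 0.0
    return math.log10(scan_count) * 1.5 + 1


def is_due(scan_count, elapsed):
    """True if at least one scan interval has elapsed since the last scan."""
    return elapsed >= scan_interval(scan_count)
```

Because the interval grows logarithmically, samples that have already been scanned many times are revisited less and less often without ever being dropped entirely.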

When selecting identifiers, the identifiers used to scan the gray file samples may instead be selected according to the update records of the respective identifiers. The scan scheduling device 300 uses the update records to select, from the identifiers, those that have been updated since their last scan.

Alternatively, the identifiers used to scan the gray file samples may be selected according to both the update records and the scan records of the identifiers. For example, the identifiers that have been updated since the last scan are selected first, and identifiers are then selected from among them according to the scan interval.
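The combined selection (update record first, then scan interval) could be sketched as follows. The `IdentifierState` record, its field names, the base-10 logarithm, and the rule that a never-used identifier is always chosen are all assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class IdentifierState:
    name: str
    last_update: float  # time of the identifier's most recent update
    last_scan: float    # time it last scanned this sample
    scan_count: int     # number of times it has scanned this sample


def pick_identifiers(states, now):
    """Select identifiers updated since their last scan whose scan interval has elapsed."""
    chosen = []
    for s in states:
        if s.scan_count == 0:
            chosen.append(s.name)  # assumption: never-used identifiers always scan
            continue
        updated = s.last_update > s.last_scan
        interval = math.log10(s.scan_count) * 1.5 + 1
        if updated and now - s.last_scan >= interval:
            chosen.append(s.name)
    return chosen
```

An identifier that has not been updated since its last scan would produce the same verdict again, which is why it is filtered out before the interval check.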

The sample scanning device 400 is adapted to obtain the gray file samples to be scanned from the sample storage device 100, scan them with the identifiers selected by the scan scheduling device 300, and store the scan results in the antivirus engine 200.

The antivirus engine 200 is adapted to store the scan results of file samples and to return the scan result when it receives a request asking whether a file sample is safe.

This embodiment reduces the scanning workload while still making up for false negatives. It thereby solves the problem that scanning every file sample consumes a large amount of resources, and achieves the beneficial effects of saving the resources of the scanning equipment, speeding up scanning, and improving scanning efficiency.

In a preferred embodiment, the scan scheduling device 300 is further adapted to determine, before deriving the false negative rate of a gray file sample from its attributes, whether the time at which the gray file sample was first discovered is earlier than a preset time threshold. If so, the false negative rate of the gray file sample is determined to be 0; if not, the false negative rate is derived from the attributes of the gray file sample as described above.

The reason is that the earlier a gray file sample was discovered, the less likely it is to be a false negative. When the first discovery time is earlier than the preset time threshold, the gray file sample is no longer scanned, which further reduces unnecessary scan operations, saves the resources used to scan file samples, and improves scanning efficiency.
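The age check can be sketched as a guard in front of the attribute-based calculation. The function name, the `max_age` parameter, and the callback `compute_rate` are hypothetical; the source only describes the comparison against a preset time threshold.

```python
from datetime import datetime, timedelta


def gated_false_negative_rate(first_seen, now, max_age, compute_rate):
    """Return 0 for samples first discovered before the cutoff; otherwise compute the rate."""
    cutoff = now - max_age
    if first_seen < cutoff:
        return 0.0
    return compute_rate()
```

A rate of 0 can never exceed a positive false negative rate threshold, so samples older than the cutoff drop out of the scan queue without any further computation.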

In another embodiment of the present invention, the dangerous file samples among the file samples are scanned. A dangerous file sample is a file sample that an identifier has reported as a virus file. A dangerous file sample is rescanned with the identifiers that reported it as a virus file; if, after rescanning, all of those identifiers determine that the dangerous file sample is not a virus file, a false positive removal operation is performed on it. The technical solution of this embodiment is described below.

The sample storage device 100 is adapted to store file samples.

The scan scheduling device 300 is further adapted to determine, for each dangerous file sample among the file samples, the identifiers that reported the dangerous file sample as a virus file.

The sample scanning device 400 is further adapted to obtain dangerous file samples from the sample storage device 100 and to rescan each of them with the identifiers determined by the scan scheduling device 300. If the scan result shows that the dangerous file sample is no longer a virus file, the sample is determined to have been falsely reported, and a false positive removal operation is performed on it.

The false positive removal operation may consist of deleting the record that the file sample is a virus file, or of updating the record of the file sample from dangerous file sample to gray file sample or white file sample. A white file sample is a file sample that has been determined to be harmless.
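The rescan and false positive removal can be sketched as below. The record layout, the `rescan` callback, and the choice to downgrade the sample to "gray" (rather than delete the record or mark it white) are assumptions picked from the options the text lists.

```python
def recheck_dangerous_sample(record, rescan):
    """Rescan with only the reporting identifiers; clear the verdict if none still flags it.

    record: {"status": "dangerous", "reported_by": [identifier names]}
    rescan: callable(identifier_name) -> True if that identifier still reports a virus
    """
    still_flagged = [name for name in record["reported_by"] if rescan(name)]
    if not still_flagged:
        # False positive removal: here the record is updated to a gray file sample.
        record["status"] = "gray"
        record["reported_by"] = []
    return record
```

Only the identifiers that originally reported the sample are consulted, so the cost of a recheck scales with the number of reporters rather than with the total number of identifiers.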

The antivirus engine 200 is adapted to store the scan results of file samples and to return the scan result when it receives a request asking whether a file sample is safe.

In this embodiment, a dangerous file sample is scanned only with the identifiers that reported it as a virus file. False positives are therefore still corrected, while the number of scan operations is reduced and scanning efficiency is improved.

Further, a false positive rate is calculated for each dangerous file sample, and the dangerous file samples to be scanned are selected according to the false positive rate. This further reduces the number of dangerous file samples that are scanned.

After determining the identifiers that reported a dangerous file sample as a virus file, the scan scheduling device 300 derives the false positive rate of the dangerous file sample from the determined identifiers, establishing a correspondence between the number of identifiers and the false positive rate. The more identifiers are determined, the lower the false positive rate of the dangerous file sample.

The scan scheduling device 300 also selects the dangerous file samples to be scanned from the dangerous file samples stored in the sample storage device 100 according to the false positive rate.

For example, for the identifiers that reported a dangerous file sample as a virus file, an accuracy for that dangerous file may be set for each identifier. The accuracies are added together, the sum is subtracted from 1 to give the false positive rate of the dangerous file sample, and dangerous file samples whose false positive rate is greater than a preset false positive rate threshold are selected as the dangerous file samples to be scanned.

When the accuracy for a dangerous file is set, if an analyst has marked the file as a virus file, the accuracy is set directly to 1. For each identifier, the accuracy is set according to the degree of trust placed in the identifier and the number of times the identifier has scanned the dangerous file; the more scans, the higher the accuracy. For example, if antivirus engine A is highly trusted, then once the number of scans by antivirus engine A exceeds a scan threshold, the accuracy of antivirus engine A for that dangerous file is set to 1.
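The false positive rate calculation can be sketched as follows. The accuracy values in the usage are invented for illustration, and clamping the result at 0 when the accuracies sum to more than 1 is an assumption; the source only says to subtract the sum from 1.

```python
def false_positive_rate(accuracies):
    """1 minus the summed accuracies of the reporting identifiers, clamped at 0 (assumption)."""
    return max(0.0, 1.0 - sum(accuracies))


def select_for_rescan(samples, threshold):
    """samples maps a sample ID to the accuracies of the identifiers that reported it."""
    return [sid for sid, acc in samples.items() if false_positive_rate(acc) > threshold]
```

A sample confirmed by an analyst, or by a trusted identifier that has scanned it often enough, carries an accuracy of 1 and so gets a false positive rate of 0 and is never queued for rescanning.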

Selecting dangerous file samples according to the false positive rate thus further reduces unnecessary scan operations, saves the resources used to scan file samples, and improves scanning efficiency.

Further, from the identifiers determined to have reported a dangerous file sample as a virus file, the identifiers used to scan that dangerous file sample are selected according to the update records and/or the scan records of the identifiers.

Among the identifiers that determined the dangerous file sample to be a virus file, the identifiers used to scan the dangerous file sample may be selected according to the scan records of the respective identifiers.

For each determined identifier and each dangerous file sample to be scanned, the scan scheduling device 300 calculates the scan interval of that dangerous file sample for that identifier from the number of times the identifier has scanned the sample, and selects the identifiers used to scan the dangerous file sample according to the scan interval. The more times a sample has been scanned, the longer the scan interval. For example, the scan interval is calculated with the formula T = (log N) × 1.5 + 1, where N is the number of times a given identifier has scanned a given dangerous file sample and T is the scan interval of that dangerous file sample for that identifier.

Among the identifiers that determined the dangerous file sample to be a virus file, the identifiers used to scan the dangerous file sample may instead be selected according to the update records of the respective identifiers. The scan scheduling device 300 uses the update records to select, from these identifiers, those that have been updated since their last scan.

Alternatively, the identifiers used to scan the dangerous file sample may be selected according to both the update records and the scan records of the identifiers. For example, among the identifiers that determined the dangerous file sample to be a virus file, those that have been updated since the last scan are selected first, and identifiers are then selected from among them according to the scan interval.

Selecting identifiers according to their update records and/or scan records thus further reduces unnecessary scan operations, saves the resources used to scan file samples, and improves scanning efficiency.

In another embodiment of the present invention, the query popularity of file samples is counted, and the file samples to be scanned are selected from the active file samples with high popularity. This further reduces the number of file samples to be scanned and improves scanning efficiency.

The antivirus engine 200 is further adapted to record each request in a log when it receives a request asking whether a file sample is safe.

The scan scheduling device 300 is further adapted to extract from the log records the file samples whose number of queries within a preset period is greater than a preset popularity threshold; the extracted file samples are the active file samples. An active file sample is a file sample with high popularity, that is, one queried more frequently than a preset threshold.
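Extracting active file samples from the query log might be sketched as below. The log layout (a sequence of `(timestamp, sample_id)` pairs) and the parameter names are assumptions; the source does not describe the log format.

```python
from collections import Counter


def active_samples(log, now, window, popularity_threshold):
    """Return IDs queried more than popularity_threshold times within the last `window`."""
    counts = Counter(sid for ts, sid in log if now - window <= ts <= now)
    return {sid for sid, n in counts.items() if n > popularity_threshold}
```

Queries that fall outside the time window do not count toward popularity, so a sample that was heavily queried long ago is not treated as active.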

Accordingly, when gray file samples are scanned, the scan scheduling device 300 extracts the gray file samples from the active file samples, derives the false negative rate of each extracted gray file sample from its attributes, and selects, from the extracted gray file samples, those whose false negative rate is greater than the preset false negative rate threshold as the gray file samples to be scanned.

When dangerous file samples are scanned, the scan scheduling device 300 extracts the dangerous file samples from the active file samples, determines the identifiers that reported each extracted dangerous file sample as a virus file, derives the false positive rate of each extracted dangerous file sample from the determined identifiers, and selects, from the extracted dangerous file samples, those whose false positive rate is greater than the preset false positive rate threshold as the dangerous file samples to be scanned.

The scanning of gray file samples is described below with reference to a specific example.

The file sample scanning system comprises a sample storage device 100, an antivirus engine 200, a scan scheduling device 300, and a sample scanning device 400 containing multiple identifiers. In this example, the MD5 (Message Digest Algorithm 5) value of a file sample is used as the ID of the file sample. Alternatively, a 40-byte string formed from md5+sha1 may be used as the unique ID of a file sample, to avoid the ID conflicts that arise when MD5 alone is used, that is, when two different file samples produce the same MD5 value. The scan scheduling device 300 stores a file sample information library, which holds the attribute information and other related information of each file sample. For example, for each file sample it stores the size, path, and operation behaviors of the sample, whether the sample is a dangerous file sample or a gray file sample, and, when the sample is a dangerous file sample, the identifiers that reported it as a virus file.
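A combined ID of this kind could be sketched with the standard `hashlib` module. The source describes the md5+sha1 string as 40 bytes long but does not give its encoding; the sketch below simply concatenates the two hexadecimal digests, which yields 72 characters, so the exact layout is an assumption. The function name `sample_id` is also hypothetical.

```python
import hashlib


def sample_id(data: bytes) -> str:
    """Concatenate the MD5 and SHA-1 hex digests of the file content (assumed layout)."""
    return hashlib.md5(data).hexdigest() + hashlib.sha1(data).hexdigest()
```

Two different files would have to collide under both hash functions at once to share an ID, which is far less likely than an MD5 collision alone.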

The sample storage device 100 stores the file samples.

The antivirus engine 200 stores the scan results of file samples and returns the scan result when it receives a request asking whether a file sample is safe; on receiving such a request it also records the request in a log. The request includes the MD5 value of the file sample.

The scan scheduling device 300 extracts from the log records the file samples whose number of queries within a preset period is greater than the preset popularity threshold; the extracted file samples are the active file samples.

The scan scheduling device 300 extracts the gray file samples from the active file samples, obtains the attributes of each gray file sample from the file sample information library using its MD5 value, derives the false negative rate of the gray file sample from those attributes, and selects, from the extracted gray file samples, those whose false negative rate is greater than the preset false negative rate threshold as the gray file samples to be scanned.

The scan scheduling device 300 selects the identifiers used to scan the gray file samples according to the update records and the scan records of the respective identifiers.

The sample scanning device 400 obtains the gray file samples to be scanned from the sample storage device 100, scans them with the identifiers selected by the scan scheduling device 300, and stores the scan results in the antivirus engine 200.

The scanning of dangerous file samples is described below with reference to a specific example.

The sample storage device 100 stores the file samples.

The antivirus engine 200 stores the scan results of file samples and returns the scan result when it receives a request asking whether a file sample is safe; on receiving such a request it also records the request in a log. The request includes the MD5 value of the file sample.

The scan scheduling device 300 extracts from the log records the file samples whose number of queries within a preset period is greater than the preset popularity threshold; the extracted file samples are the active file samples.

The scan scheduling device 300 extracts the dangerous file samples from the active file samples, reads the information library using the MD5 value of each dangerous file sample to determine the identifiers that reported it as a virus file, derives the false positive rate of the dangerous file sample from the determined identifiers, and selects, from the extracted dangerous file samples, those whose false positive rate is greater than the preset false positive rate threshold as the dangerous file samples to be scanned.

For each dangerous file sample to be scanned, the scan scheduling device 300 selects, from the identifiers determined to have reported the sample as a virus file, the identifiers used to scan that sample according to the update records and/or the scan records of the identifiers.

The sample scanning device 400 obtains the dangerous file samples to be scanned from the sample storage device 100 and rescans each of them with the identifiers determined by the scan scheduling device 300. If the scan results of all the identifiers show that the dangerous file sample is no longer a virus file, the sample is determined to have been falsely reported as a virus file, and a false positive removal operation is performed on it.

The file scanning system of the present invention has been described above. The system reduces the scanning workload while still making up for false negatives and correcting false positives. It thereby solves the problem that scanning every file sample consumes a large amount of resources, and achieves the beneficial effects of saving the resources of the scanning equipment, speeding up scanning, and improving scanning efficiency.

FIG. 2 shows a method for scanning file samples according to an embodiment of the present invention.

Step S210: for the gray file samples among the file samples, select the gray file samples to be scanned from the stored gray file samples according to a preset policy.

Specifically, in step S210 the false negative rate of each gray file sample is derived from the attributes of the sample, and the gray file samples to be scanned are selected from the stored gray file samples according to the false negative rate. For example, the probability that a gray file sample is a virus file is calculated from statistics collected on the characteristics of virus files together with the attributes of the gray file sample, and this probability is used as a parameter for calculating the false negative rate of the gray file sample.

For example, statistics are collected on the size, path, and/or behavior of virus files, and the probability that a file sample is a virus file is derived from these statistics.

The probability that a gray file sample is a virus file is calculated from statistics on the size of virus files and the size of the gray file sample.

Virus files are usually small so that they spread easily. Statistics are collected on virus files, for example using Hadoop (a distributed computing platform), to obtain a curve relating file size to the rate at which files are reported as viruses, and the probability that a gray file sample is a virus file is read from this curve.

If the probability derived from the size of the gray file sample is the only parameter used to calculate the false negative rate, that probability is taken as the false negative rate, and gray file samples whose false negative rate is greater than a preset false negative rate threshold are selected as the gray file samples to be scanned. For example, if the false negative rate threshold is 0.001% and file samples of 10 MB or larger are reported as viruses at a rate of 0.001%, gray file samples smaller than 10 MB are selected as the gray file samples to be scanned.

The probability that a gray file sample is a virus file is calculated from statistics on the paths of virus files and the path of the gray file sample.

Offline statistics on the paths of virus files, for example computed with Hadoop (a distributed computing platform), yield a curve relating file path to the rate at which files are reported as viruses, and the probability that a gray file sample is a virus file is read from this curve.

If the probability derived from the path of the gray file sample is the only parameter used to calculate the false negative rate, that probability is taken as the false negative rate, and gray file samples whose false negative rate is greater than the preset false negative rate threshold are selected as the gray file samples to be scanned.

根据对病毒文件的操作行为进行统计而得出的危险操作行为列表,以及灰文件样本的操作行为,计算出该灰文件样本可能为病毒文件的概率。According to the list of dangerous operation behaviors obtained by counting the operation behaviors of virus files and the operation behaviors of gray file samples, the probability that the gray file samples may be virus files is calculated.

危险操作行为列表中可以包括下列操作行为中的一种或多种:The list of dangerous operational behaviors may include one or more of the following operational behaviors:

写入注册表进行自动加载;Write to the registry for automatic loading;

修改注册表;Modify the registry;

修改系统文件;Modify system files;

修改指定的应用文件;Modify the specified application file;

执行进程间注入;Perform inter-process injection;

结束进程;end process;

修改浏览器中网页内容;以及Modify the content of web pages in your browser; and

记录键盘操作。Record keystrokes.

根据灰文件样本触发的危险操作行为数量,计算出该灰文件样本可能为病毒文件的概率。触发的危险操作行为越多,灰文件样本可能为病毒文件的概率越高。例如,获得灰文件样本触发危险操作行为的数量,将该数量除以危险操作行为列表中危险操作行为的总量,得出灰文件样本可能为病毒文件的概率。According to the number of dangerous operation behaviors triggered by the gray file sample, the probability that the gray file sample may be a virus file is calculated. The more dangerous operations are triggered, the higher the probability that the gray file sample may be a virus file. For example, the number of dangerous operation behaviors triggered by the gray file samples is obtained, and the number is divided by the total number of dangerous operation behaviors in the list of dangerous operation behaviors to obtain the probability that the gray file samples may be virus files.

如果仅以依据灰文件样本触发的危险操作行为数量得出的概率为计算漏报率的参量,则以该概率为漏报率,选择漏报率大于预设的漏报率阀值的灰文件样本为待扫描的灰文件样本。If only the probability obtained based on the number of dangerous operation behaviors triggered by gray file samples is used as the parameter for calculating the false negative rate, then this probability is used as the false negative rate, and gray files with a false negative rate greater than the preset false negative rate threshold are selected The sample is a gray file sample to be scanned.

如果以上述得出的多个概率为计算漏报率的参量，则可以对应每个参量设置权重值，将参量加权求和得出灰文件样本的漏报率。If several of the probabilities obtained above are used as parameters for calculating the false negative rate, a weight can be set for each parameter, and the weighted sum of the parameters gives the false negative rate of the gray file sample.
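A minimal sketch of the weighted combination and the threshold selection follows. The parameter names (`"path"`, `"behavior"`), the weight values, and both function names are assumptions made for illustration; the source does not say how weights are chosen or whether they should sum to 1.

```python
def weighted_false_negative_rate(probabilities, weights):
    """Combine per-parameter probabilities into one rate by a weighted sum."""
    if set(probabilities) != set(weights):
        raise ValueError("each parameter needs exactly one weight")
    return sum(weights[name] * probabilities[name] for name in probabilities)


def select_for_scan(rates, threshold):
    """Keep the sample ids whose rate is strictly greater than the threshold."""
    return [sample_id for sample_id, rate in rates.items() if rate > threshold]


# Example with assumed weights of 0.5 for each parameter.
rate = weighted_false_negative_rate(
    {"path": 0.5, "behavior": 0.25},
    {"path": 0.5, "behavior": 0.5},
)
```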

步骤S220,根据各个鉴定器的更新记录和/或鉴定器的扫描记录选取用于扫描灰文件样本的鉴定器。Step S220, selecting an authenticator for scanning the gray file sample according to the update record of each authenticator and/or the scanning record of the authenticator.

其中，鉴定器可以为用于检测文件样本安全性的杀毒应用，例如bitdefender(比特梵德，来自罗马尼亚的一种杀毒应用)，以及QVM(奇虎支持向量机)提供的杀毒应用以及云杀毒引擎等。灰文件样本为安全性未知的文件样本。Here, the identifier (authenticator) may be an antivirus application used to check the security of file samples, such as Bitdefender (an antivirus application from Romania), the antivirus application provided by QVM (Qihoo Support Vector Machine), or a cloud antivirus engine. A gray file sample is a file sample whose security is unknown.

具体地,在选择鉴定器时,可以根据各个鉴定器的扫描记录选取用于扫描灰文件样本的鉴定器。Specifically, when selecting an authenticator, an authenticator for scanning the gray file sample may be selected according to the scanning records of each authenticator.

对于每个鉴定器、每个待扫描的灰文件样本，根据该鉴定器扫描该灰文件样本的次数，计算该灰文件样本对应于该鉴定器的扫描间隔，根据扫描间隔选取用于扫描灰文件样本的鉴定器。其中，扫描的次数越多则扫描间隔越长。例如，采用公式T=(logN)×1.5+1计算扫描间隔。N为某个鉴定器扫描某个灰文件样本的扫描的次数，T为该灰文件样本对应于该鉴定器的扫描间隔。For each identifier and each gray file sample to be scanned, the scanning interval of that gray file sample for that identifier is calculated from the number of times the identifier has scanned the sample, and the identifiers used to scan the gray file sample are selected according to the scanning interval. The more times a sample has been scanned, the longer the scanning interval. For example, the scanning interval is calculated with the formula T=(logN)×1.5+1, where N is the number of times a given identifier has scanned a given gray file sample and T is the scanning interval of that gray file sample for that identifier.
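The interval formula can be written out as below. The source does not state the base of the logarithm, the time unit of T, or what happens before the first scan; base 10 and an interval of 0 for a sample that has never been scanned are assumptions made here, and `scan_interval` is a hypothetical name.

```python
import math


def scan_interval(scan_count):
    """Evaluate T = (log N) * 1.5 + 1 for N previous scans.

    Assumptions: log is base 10, and a sample that has never been scanned
    (N = 0, where the formula is undefined) is due immediately.
    """
    if scan_count < 0:
        raise ValueError("scan_count must not be negative")
    if scan_count == 0:
        return 0.0
    return math.log10(scan_count) * 1.5 + 1
```

Under these assumptions the interval grows slowly with the scan count, so samples that have already been scanned many times are revisited less often.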

其中,在选择鉴定器时,也可以根据各个鉴定器的更新记录选取用于扫描灰文件样本的鉴定器。具体地,根据更新记录从各个鉴定器中选取在上次扫描后进行过更新的鉴定器。Wherein, when selecting an authenticator, an authenticator for scanning the gray file sample may also be selected according to update records of each authenticator. Specifically, the identifier that has been updated after the last scan is selected from each identifier according to the update record.

也可以,根据各个鉴定器的更新记录和鉴定器的扫描记录选取用于扫描灰文件样本的鉴定器。例如,先选取在上次扫描后进行过更新的鉴定器,然后从该选取的鉴定器中再按扫描间隔选取鉴定器。It is also possible to select an authenticator for scanning the gray file sample according to the update record of each authenticator and the scanning record of the authenticator. For example, select the identifiers that have been updated since the last scan, and then select identifiers from the selected identifiers by the scan interval.
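The two-stage selection in the example above (first filter by update record, then by scanning interval) might look like the following sketch. The `IdentifierState` record, its field names, the base-10 logarithm, and the choice to always select an identifier that has never scanned the sample are all assumptions for illustration; timestamps and intervals are assumed to share one unit.

```python
import math
from dataclasses import dataclass


@dataclass
class IdentifierState:
    name: str
    last_update: float  # when the identifier was last updated
    last_scan: float    # when it last scanned this sample
    scan_count: int     # how many times it has scanned this sample


def select_identifiers(states, now):
    """Return names of identifiers that were updated after their last scan
    of the sample and whose scanning interval has elapsed."""
    chosen = []
    for state in states:
        if state.scan_count == 0:
            chosen.append(state.name)  # assumed: never scanned means due now
            continue
        updated = state.last_update > state.last_scan
        interval = math.log10(state.scan_count) * 1.5 + 1
        due = now - state.last_scan >= interval
        if updated and due:
            chosen.append(state.name)
    return chosen
```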

步骤S230,使用所选取的鉴定器扫描待扫描的灰文件样本,并存储扫描结果,以在接收到查询文件样本是否安全的请求时,返回扫描结果。Step S230, using the selected identifier to scan the gray file sample to be scanned, and storing the scanning result, so that the scanning result can be returned when receiving a request to inquire whether the file sample is safe.

本实施例在保证弥补漏报的同时，减少了扫描的工作量，由此解决了在扫描时需要将文件样本全部扫描，导致消耗资源量大的问题，取得了节约进行扫描的设备的资源，加快扫描速度，提供扫描效率的有益效果。This embodiment reduces the scanning workload while still recovering false negatives. It thereby solves the problem that scanning every file sample consumes a large amount of resources, and achieves the beneficial effects of saving the resources of the scanning device, increasing scanning speed, and improving scanning efficiency.

在一较佳的实施例中,如图3所示,为根据本发明一个实施例的依据漏报率选取灰文件样本的流程图,所述步骤S210包括如下步骤。In a preferred embodiment, as shown in FIG. 3 , which is a flow chart of selecting gray file samples according to the false negative rate according to an embodiment of the present invention, the step S210 includes the following steps.

步骤S2102,提取灰文件样本。Step S2102, extract gray file samples.

步骤S2104,判断灰文件样本的首次发现时间是否早于预设时间阀值,如果是,则执行步骤S2106,如果否,则执行步骤S2108。Step S2104, judge whether the first discovery time of the gray file sample is earlier than the preset time threshold, if yes, execute step S2106, if not, execute step S2108.

步骤S2106，检测出该灰文件样本的漏报率为0，执行步骤S2110。Step S2106: the false negative rate of the gray file sample is determined to be 0, and step S2110 is executed.

步骤S2108,根据灰文件样本的属性得出灰文件样本的漏报率,执行步骤S2110。In step S2108, the false negative rate of the gray file sample is obtained according to the attribute of the gray file sample, and step S2110 is executed.

步骤S2110,判断灰文件样本是否提取完,如果是,执行步骤S2112,否则,执行步骤S2102。Step S2110, judging whether the gray file sample has been extracted, if yes, execute step S2112, otherwise, execute step S2102.

步骤S2112,根据漏报率从存储的灰文件样本中选取待扫描的灰文件样本。Step S2112, selecting a gray file sample to be scanned from the stored gray file samples according to the false negative rate.

因为，灰文件样本的发现时间越早，则其被漏报的可能性越小。当首次发现时间早于预设时间阀值时，则不再对该灰文件样本进行扫描，由此，进一步减少了不必要的扫描操作，节约了扫描文件样本所用资源，提高了扫描效率。This is because the earlier a gray file sample was first discovered, the less likely it is to be a false negative. When the first discovery time is earlier than the preset time threshold, the gray file sample is no longer scanned, which further reduces unnecessary scanning operations, saves the resources used for scanning file samples, and improves scanning efficiency.
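Steps S2102 to S2112 can be sketched as a loop followed by a threshold filter. The dictionary layout of a sample (`"id"`, `"first_seen"`), the function names, and the pluggable `rate_from_attributes` callback are hypothetical; the source leaves the attribute-based calculation to the methods described earlier.

```python
def rate_gray_samples(samples, time_threshold, rate_from_attributes):
    """S2102-S2110: assign a false negative rate to every gray sample."""
    rates = {}
    for sample in samples:
        if sample["first_seen"] < time_threshold:
            # S2106: samples first seen before the threshold get rate 0.
            rates[sample["id"]] = 0.0
        else:
            # S2108: otherwise derive the rate from the sample's attributes.
            rates[sample["id"]] = rate_from_attributes(sample)
    return rates


def select_gray_samples(rates, rate_threshold):
    """S2112: keep the samples whose rate exceeds the threshold."""
    return [sample_id for sample_id, rate in rates.items() if rate > rate_threshold]
```

Because old samples receive a rate of 0, they can never exceed a non-negative threshold and so drop out of the scan queue.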

在本发明的另一个实施例中，对文件样本中的危险文件样本进行扫描。危险文件样本为被鉴定器报告为病毒文件的文件样本。对于危险文件样本使用将其报告为病毒文件的鉴定器对该危险文件样本进行扫描，如果扫描后，该些鉴定器都确定该危险文件样本不为病毒文件，则对该危险文件样本进行去误报操作。本实施例的具体技术方案如下所述。In another embodiment of the present invention, dangerous file samples among the file samples are scanned. A dangerous file sample is a file sample that an identifier has reported as a virus file. A dangerous file sample is scanned with the identifiers that reported it as a virus file; if, after scanning, all of those identifiers determine that the dangerous file sample is not a virus file, a false-positive removal operation is performed on it. The specific technical solution of this embodiment is described below.

参见图4,示出了根据本发明一个实施例的文件样本的扫描方法中对危险文件样本进行扫描的流程,包括如下步骤。Referring to FIG. 4 , it shows the process of scanning dangerous file samples in the method for scanning file samples according to an embodiment of the present invention, including the following steps.

步骤S410,对于文件样本中的危险文件样本,确定报告危险文件样本为病毒文件的鉴定器。Step S410, for a dangerous file sample in the file samples, determine an identifier that reports the dangerous file sample as a virus file.

步骤S420,使用报告危险文件样本为病毒文件的鉴定器重新扫描获取的危险文件样本,如果扫描结果为该危险文件样本不再为病毒文件,则确定该危险文件样本为被误报为病毒文件的文件样本,对该危险文件样本进行去误报操作。Step S420, re-scanning the obtained dangerous file sample with the identifier that reports the dangerous file sample as a virus file, and if the scanning result shows that the dangerous file sample is no longer a virus file, then determine that the dangerous file sample is falsely reported as a virus file A file sample, performing a false positive removal operation on the dangerous file sample.

其中,去误报操作可以为删除该文件样本为病毒文件的记录,也可以为将该文件样本的记录由危险文件样本更新为灰文件样本或白文件样本。白文件样本为确定无危险的文件样本。Wherein, the false positive removal operation may be to delete the record that the file sample is a virus file, or to update the record of the file sample from a dangerous file sample to a gray file sample or a white file sample. White file samples are file samples that are determined to be harmless.

通过本实施例,对于危险文件样本,仅使用将其报告为病毒文件的鉴定器对该危险文件样本进行扫描,由此,在保证修正误报的同时,能够减少扫描操作,提高扫描效率。Through this embodiment, for a dangerous file sample, only the identifier that reports it as a virus file is used to scan the dangerous file sample, thereby ensuring correction of false positives while reducing scanning operations and improving scanning efficiency.
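The rescan-and-clear logic of steps S410 and S420 can be sketched as follows. The `scan` callback, the `records` dictionary, and the status strings are hypothetical; the source allows the false-positive removal to delete the record or to relabel the sample as gray or white, and this sketch shows only the relabeling variant.

```python
def recheck_dangerous_sample(sample_id, reporters, scan, records, cleared_status="gray"):
    """Rescan a dangerous sample with only the identifiers that reported it.

    scan(name, sample_id) is assumed to return True while the identifier
    still considers the sample a virus file. Returns True if the sample
    was cleared as a false positive.
    """
    if not reporters:
        raise ValueError("a dangerous sample needs at least one reporting identifier")
    still_flagged = [name for name in reporters if scan(name, sample_id)]
    if not still_flagged:
        # No reporting identifier flags the sample any more: false positive.
        records[sample_id] = cleared_status
        return True
    return False
```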

进一步地,计算危险文件样本的误报率,根据误报率选择进行扫描的危险文件样本。由此进一步减少了被扫描的危险文件样本。Further, the false positive rate of the dangerous file samples is calculated, and the dangerous file samples to be scanned are selected according to the false positive rate. This further reduces the number of dangerous file samples scanned.

在步骤S410后包括：根据确定的鉴定器得出危险文件样本的误报率，建立鉴定器数量和误报率的对应关系，根据误报率从存储的危险文件样本中选取待扫描的危险文件样本。其中，所确定的鉴定器的数量越多，该危险文件样本的误报率越低。After step S410, the method includes: obtaining the false positive rate of the dangerous file sample according to the determined identifiers, establishing a correspondence between the number of identifiers and the false positive rate, and selecting the dangerous file samples to be scanned from the stored dangerous file samples according to the false positive rate. The more identifiers are determined (that is, the more identifiers reported the sample as a virus file), the lower the false positive rate of the dangerous file sample.

例如，对于将危险文件样本报告为病毒文件的鉴定器，可以对各个鉴定器设置对应该危险文件的准确率，将准确率相加得和值，用1减去和值得出危险文件样本的误报率，选取误报率大于预设的误报率阀值的危险文件样本作为待扫描的危险文件样本。For example, for the identifiers that report a dangerous file sample as a virus file, an accuracy rate for that dangerous file can be set for each identifier; the accuracy rates are added to obtain a sum, and the sum is subtracted from 1 to obtain the false positive rate of the dangerous file sample. Dangerous file samples whose false positive rate is greater than the preset false positive rate threshold are selected as the dangerous file samples to be scanned.

在设置对应该危险文件的准确率时,如果是分析人员设置为病毒文件,则直接设置准确率为1。对于各个鉴定器,依据对该鉴定器的信任度和该鉴定器对危险文件的扫描次数设置准确率,扫描次数越高准确率越高。例如,对杀毒引擎A的信任度较高,则杀毒引擎A扫描次数大于扫描阀值后,确定杀毒引擎A对应该危险文件的准确率为1。When setting the accuracy rate corresponding to the dangerous file, if the analyst sets it as a virus file, directly set the accuracy rate to 1. For each authenticator, the accuracy rate is set according to the trust degree of the authenticator and the number of times the authenticator scans dangerous files, and the higher the number of scans, the higher the accuracy rate. For example, if the degree of trust in antivirus engine A is high, then after the number of scans by antivirus engine A is greater than the scanning threshold, it is determined that the accuracy rate of antivirus engine A corresponding to the dangerous file is 1.
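The false positive rate calculation described above reduces to one subtraction. In the sketch below, clamping the result at 0 when the accuracy rates sum to more than 1 is an assumption (the source does not say what happens in that case), and the function names are hypothetical.

```python
def false_positive_rate(accuracies):
    """Return 1 minus the sum of the reporting identifiers' accuracy rates.

    The result is clamped at 0, which is an assumption for the case where
    the accuracy rates add up to more than 1.
    """
    return max(0.0, 1.0 - sum(accuracies))


def select_dangerous_samples(accuracy_by_sample, threshold):
    """Keep the samples whose false positive rate exceeds the threshold."""
    return [
        sample_id
        for sample_id, accuracies in accuracy_by_sample.items()
        if false_positive_rate(accuracies) > threshold
    ]
```

An accuracy rate of 1, as set for an analyst's verdict or a trusted engine past its scan threshold, drives the rate to 0, so such samples are never selected for rescanning.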

由此,根据误报率对危险文件样本进行选取,进一步减少了不必要的扫描操作,节约了扫描文件样本所用资源,提高了扫描效率。Thus, the dangerous file samples are selected according to the false alarm rate, which further reduces unnecessary scanning operations, saves resources used for scanning file samples, and improves scanning efficiency.

进一步地,步骤S410后还包括对于确定的报告危险文件样本为病毒文件的鉴定器,根据鉴定器的更新记录和/或鉴定器的扫描记录选取用于扫描该危险文件样本的鉴定器。Further, after step S410, for the determined identifier reporting that the dangerous file sample is a virus file, selecting an identifier for scanning the dangerous file sample according to the update record of the identifier and/or the scanning record of the identifier.

其中,在将危险文件样本确定为病毒文件的鉴定器中,可以根据各个鉴定器的扫描记录选取用于扫描该危险文件样本的鉴定器。Among the identifiers that determine the dangerous file sample as a virus file, an identifier for scanning the dangerous file sample may be selected according to the scanning records of each identifier.

对于每个确定的鉴定器、每个待扫描的危险文件样本，根据该鉴定器扫描该危险文件样本的次数，计算该危险文件样本对应于该鉴定器的扫描间隔，根据扫描间隔选取用于扫描危险文件样本的鉴定器。其中，扫描的次数越多则扫描间隔越长。例如，采用公式T=(logN)×1.5+1计算扫描间隔。N为某个鉴定器扫描某个危险文件样本的扫描的次数，T为该危险文件样本对应于该鉴定器的扫描间隔。For each determined identifier and each dangerous file sample to be scanned, the scanning interval of that dangerous file sample for that identifier is calculated from the number of times the identifier has scanned the sample, and the identifiers used to scan the dangerous file sample are selected according to the scanning interval. The more times a sample has been scanned, the longer the scanning interval. For example, the scanning interval is calculated with the formula T=(logN)×1.5+1, where N is the number of times a given identifier has scanned a given dangerous file sample and T is the scanning interval of that dangerous file sample for that identifier.

其中,在将危险文件样本确定为病毒文件的鉴定器中,可以根据各个鉴定器的更新记录选取用于扫描危险文件样本的鉴定器。例如,根据更新记录从鉴定器中选取在上次扫描后进行过更新的鉴定器。Wherein, among the identifiers that determine the dangerous file sample as a virus file, an identifier for scanning the dangerous file sample may be selected according to update records of each identifier. For example, based on the update record, select from among the authenticators that have been updated since the last scan.

也可以,根据各个鉴定器的更新记录和鉴定器的扫描记录选取用于扫描危险文件样本的鉴定器。例如,在将危险文件样本确定为病毒文件的鉴定器中,先选取在上次扫描后进行过更新的鉴定器,然后从该选取的鉴定器中再按扫描间隔选取鉴定器。Alternatively, the authenticator used to scan the dangerous file sample may be selected according to the update record of each authenticator and the scanning record of the authenticator. For example, among the identifiers that determine a dangerous file sample as a virus file, first select an identifier that has been updated after the last scan, and then select an identifier according to the scan interval from the selected identifiers.

由此,根据鉴定器的更新记录和/或鉴定器的扫描记录对于鉴定器进行选取,进一步减少了不必要的扫描操作,节约了扫描文件样本所用资源,提高了扫描效率。Therefore, selecting an authenticator according to the update record of the authenticator and/or the scanning record of the authenticator further reduces unnecessary scanning operations, saves resources used for scanning file samples, and improves scanning efficiency.

在本发明的另一个实施例中,统计文件样本的查询的热度,从热度高的活跃文件样本中选取文件样本,进一步减少进行扫描的文件样本数量,提高了扫描效率。In another embodiment of the present invention, the query popularity of file samples is counted, and file samples are selected from active file samples with high popularity, further reducing the number of file samples to be scanned, and improving scanning efficiency.

所述方法还包括：The method further includes:

在接收查询文件样本是否安全的请求时,在日志中记录接收到的请求。When receiving a request to inquire whether a file sample is safe, record the received request in a log.

根据日志中的记录,提取在预设时间内查询次数大于预设热度阀值的文件样本,提取的文件样本为活跃文件样本。活跃文件样本,为热度较高,被查询的频率大于预设门限值的文件样本。According to the records in the log, the file samples whose query times are greater than the preset popularity threshold within the preset time are extracted, and the extracted file samples are active file samples. Active file samples refer to file samples with high popularity and the query frequency is greater than the preset threshold value.
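Extracting active file samples from the query log amounts to counting queries per sample inside a time window. The log layout (a list of `(timestamp, sample_id)` pairs), the parameter names, and the function name are assumptions for this sketch.

```python
from collections import Counter


def active_samples(query_log, now, window, heat_threshold):
    """Return ids queried more than heat_threshold times in the last `window`.

    query_log is assumed to be an iterable of (timestamp, sample_id) pairs,
    with timestamps in the same unit as `now` and `window`.
    """
    counts = Counter(
        sample_id
        for timestamp, sample_id in query_log
        if now - window <= timestamp <= now
    )
    return {sample_id for sample_id, count in counts.items() if count > heat_threshold}
```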

由此，在扫描灰文件样本时，所述根据灰文件样本的属性得出灰文件样本的漏报率具体包括：从活跃文件样本中提取灰文件样本，根据提取的灰文件样本的属性得出该灰文件样本的漏报率。所述根据漏报率从存储的灰文件样本中选取待扫描的灰文件样本具体包括：从提取的灰文件样本中选取漏报率大于预设漏报率阀值的灰文件样本，以选取的灰文件样本为待扫描的灰文件样本。Accordingly, when gray file samples are scanned, obtaining the false negative rate of a gray file sample according to its attributes specifically includes: extracting gray file samples from the active file samples, and obtaining the false negative rate of each extracted gray file sample according to its attributes. Selecting the gray file samples to be scanned from the stored gray file samples according to the false negative rate specifically includes: selecting, from the extracted gray file samples, those whose false negative rate is greater than the preset false negative rate threshold, and taking the selected gray file samples as the gray file samples to be scanned.

在扫描危险文件样本时，所述确定报告危险文件样本为病毒文件的鉴定器具体包括：从活跃文件样本中提取危险文件样本，确定报告提取的危险文件样本为病毒文件的鉴定器。所述根据误报率从存储的危险文件样本中选取待扫描的危险文件样本具体包括：从提取的危险文件样本中选取误报率大于预设误报率阀值的危险文件样本，以选取的危险文件样本为待扫描的危险文件样本。When dangerous file samples are scanned, determining the identifiers that report a dangerous file sample as a virus file specifically includes: extracting dangerous file samples from the active file samples, and determining the identifiers that report each extracted dangerous file sample as a virus file. Selecting the dangerous file samples to be scanned from the stored dangerous file samples according to the false positive rate specifically includes: selecting, from the extracted dangerous file samples, those whose false positive rate is greater than the preset false positive rate threshold, and taking the selected dangerous file samples as the dangerous file samples to be scanned.

以下结合一具体实例,对灰文件样本的扫描进行说明。The scanning of gray file samples will be described below in conjunction with a specific example.

在该具体实例中,以文件样本的MD5(消息摘要算法第五版)值为文件样本的标识。此外,也可以以md5+sha1的40字节长度的字符串作为文件样本的唯一标识,以避免仅以md5为标识,造成的标识冲突,即对两个不同的文件样本算出的md5值相同时,该两个文件样本的标识冲突。在文件样本信息库中,存储有各个文件样本的属性信息和其他相关信息,例如,对于每个文件样本,存储有文件样本的大小、路径、操作行为、以及文件样本是否为黑或灰文件样本,报告其为病毒文件的鉴定器等。In this specific example, the MD5 (message digest algorithm version 5) value of the file sample is used as the identifier of the file sample. In addition, the 40-byte character string of md5+sha1 can also be used as the unique identifier of the file sample to avoid the identifier conflict caused by only using md5 as the identifier, that is, when the md5 values calculated for two different file samples are the same , the two file samples have conflicting IDs. In the file sample information base, attribute information and other relevant information of each file sample are stored, for example, for each file sample, the size, path, operation behavior of the file sample, and whether the file sample is a black or gray file sample are stored , an identifier that reports it as a virus file, etc.
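A combined MD5 and SHA-1 identifier can be built with Python's standard `hashlib` module, as sketched below. Concatenating the two hexadecimal digests is an assumption: it yields a 72-character string, which does not match the 40-byte length mentioned above, and the source does not specify the encoding it has in mind. The function name `sample_identifier` is hypothetical.

```python
import hashlib


def sample_identifier(data: bytes) -> str:
    """Concatenate the MD5 and SHA-1 hex digests of the file content.

    Two files would have to collide under both hashes at once to share
    an identifier, which is the point of combining them.
    """
    md5_hex = hashlib.md5(data).hexdigest()    # 32 hex characters
    sha1_hex = hashlib.sha1(data).hexdigest()  # 40 hex characters
    return md5_hex + sha1_hex
```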

参见图5,示出了根据本发明一个实施例的文件样本的扫描方法中对灰文件样本进行扫描的流程图。Referring to FIG. 5 , it shows a flow chart of scanning a gray document sample in a method for scanning a document sample according to an embodiment of the present invention.

步骤S510,接收查询文件样本是否安全的请求,返回存储的请求中文件样本的扫描结果,在日志中记录接收到的请求。Step S510, receiving a request to inquire whether the file sample is safe, returning the stored scanning result of the file sample in the request, and recording the received request in the log.

具体地,接收的请求中包括文件样本的MD5值,根据MD5值查找扫描结果,并按MD5值记录请求。Specifically, the received request includes the MD5 value of the file sample, the scan result is searched according to the MD5 value, and the request is recorded according to the MD5 value.
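Step S510 can be pictured as a small lookup service that answers from stored scan results and logs every request by MD5 value. The class name, method names, the in-memory storage, and the `"unknown"` answer for a sample with no stored result are all assumptions made for this sketch.

```python
import time


class QueryService:
    """Answer safety queries from stored scan results and log each request."""

    def __init__(self):
        self.results = {}  # MD5 value -> stored scan result
        self.log = []      # (timestamp, MD5 value) for every request received

    def store_result(self, md5, verdict):
        self.results[md5] = verdict

    def query(self, md5, now=None):
        timestamp = time.time() if now is None else now
        self.log.append((timestamp, md5))          # record the request
        return self.results.get(md5, "unknown")    # assumed default answer
```

The log built here is the input that the later step uses to count queries and pick out active file samples.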

步骤S520,根据日志中的记录,提取在预设时间内查询次数大于预设热度阀值的文件样本,提取的文件样本为活跃文件样本。Step S520, according to the records in the log, extract file samples whose query times are greater than a preset popularity threshold within a preset time, and the extracted file samples are active file samples.

步骤S530,从活跃文件样本中提取灰文件样本,根据该灰文件样本的MD5值从文件样本信息库中获得该灰文件样本的属性,根据获得的属性得出该灰文件样本的漏报率,从提取的灰文件样本中选择所得出的漏报率大于预设漏报率阀值的灰文件样本。Step S530, extract the gray file sample from the active file sample, obtain the attribute of the gray file sample from the file sample information base according to the MD5 value of the gray file sample, and obtain the false negative rate of the gray file sample according to the obtained attribute, A gray file sample whose false negative rate is greater than a preset false negative rate threshold is selected from the extracted gray file samples.

步骤S540,根据各个鉴定器的更新记录和鉴定器的扫描记录选取用于扫描灰文件样本的鉴定器。Step S540, selecting an authenticator for scanning the gray file sample according to the update record of each authenticator and the scanning record of the authenticator.

步骤S550,使用选取的鉴定器扫描获取的灰文件样本,并存储扫描结果。Step S550, using the selected identifier to scan the acquired gray file sample, and store the scanning result.

以下结合一具体实例,对危险文件样本的扫描进行说明。The following describes the scanning of dangerous file samples in conjunction with a specific example.

参见图6,示出了根据本发明一个实施例的文件样本的扫描方法中对危险文件样本进行扫描的流程图。Referring to FIG. 6 , it shows a flow chart of scanning dangerous file samples in the method for scanning file samples according to an embodiment of the present invention.

步骤S610,接收查询文件样本是否安全的请求,返回存储的请求中文件样本的扫描结果,在日志中记录接收到的请求。Step S610, receiving a request to inquire whether the file sample is safe, returning the stored scanning result of the file sample in the request, and recording the received request in the log.

具体地,接收的请求中包括文件样本的MD5值,根据MD5值查找扫描结果,并按MD5值记录请求。Specifically, the received request includes the MD5 value of the file sample, the scan result is searched according to the MD5 value, and the request is recorded according to the MD5 value.

步骤S620,根据日志中的记录,提取在预设时间内查询次数大于预设热度阀值的文件样本,提取的文件样本为活跃文件样本。Step S620, according to the records in the log, extract file samples whose query times are greater than a preset popularity threshold within a preset time, and the extracted file samples are active file samples.

步骤S630,从活跃文件样本中提取危险文件样本,根据该危险文件样本的MD5值读取信息库中信息,确定报告提取的危险文件样本为病毒文件的鉴定器。Step S630, extracting a dangerous file sample from active file samples, reading the information in the information base according to the MD5 value of the dangerous file sample, and determining the identifier that reports the extracted dangerous file sample as a virus file.

步骤S640,根据确定的鉴定器得出提取的危险文件样本的误报率,从提取的危险文件样本中选取误报率大于预设的误报率阀值的危险文件样本作为待扫描的危险文件样本。Step S640, obtain the false positive rate of the extracted dangerous file samples according to the determined identifier, and select the dangerous file samples whose false positive rate is greater than the preset false positive rate threshold as the dangerous files to be scanned from the extracted dangerous file samples sample.

步骤S650,对于每个待扫描的危险文件样本,从确定的报告该危险文件样本为病毒文件的鉴定器中,根据鉴定器的更新记录和鉴定器的扫描记录选取用于扫描该危险文件样本的鉴定器。Step S650, for each dangerous file sample to be scanned, from the identified identifiers that report the dangerous file sample as a virus file, select an identifier for scanning the dangerous file sample according to the update record of the identifier and the scan record of the identifier. authenticator.

步骤S660,对于每个待扫描的危险文件样本,使用选取的鉴定器重新扫描该危险文件样本,如果扫描结果为该危险文件样本不再为病毒文件,则确定该危险文件样本为被误报的文件样本,对该危险文件样本进行去误报操作。Step S660, for each dangerous file sample to be scanned, use the selected identifier to re-scan the dangerous file sample, if the scanning result is that the dangerous file sample is no longer a virus file, then determine that the dangerous file sample is a false positive A file sample, performing a false positive removal operation on the dangerous file sample.

以上对于本发明的文件扫描方法进行了说明，该方法能够保证弥补漏报和修正误报的同时，减少了扫描的工作量，由此解决了在扫描时需要将文件样本全部扫描，导致消耗资源量大的问题，取得了节约进行扫描的设备的资源，加快扫描效率，提供扫描速度的有益效果。The file scanning method of the present invention has been described above. The method recovers false negatives and corrects false positives while reducing the scanning workload. It thereby solves the problem that scanning every file sample consumes a large amount of resources, and achieves the beneficial effects of saving the resources of the scanning device, improving scanning efficiency, and increasing scanning speed.

在此提供的算法和显示不与任何特定计算机、虚拟系统或者其它设备固有相关。各种通用系统也可以与基于在此的示教一起使用。根据上面的描述,构造这类系统所要求的结构是显而易见的。此外,本发明也不针对任何特定编程语言。应当明白,可以利用各种编程语言实现在此描述的本发明的内容,并且上面对特定语言所做的描述是为了披露本发明的最佳实施方式。The algorithms and displays presented herein are not inherently related to any particular computer, virtual system, or other device. Various generic systems can also be used with the teachings based on this. The structure required to construct such a system is apparent from the above description. Furthermore, the present invention is not specific to any particular programming language. It should be understood that various programming languages can be used to implement the content of the present invention described herein, and the above description of specific languages is for disclosing the best mode of the present invention.

在此处所提供的说明书中,说明了大量具体细节。然而,能够理解,本发明的实施例可以在没有这些具体细节的情况下实践。在一些实例中,并未详细示出公知的方法、结构和技术,以便不模糊对本说明书的理解。In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure the understanding of this description.

类似地,应当理解,为了精简本公开并帮助理解各个发明方面中的一个或多个,在上面对本发明的示例性实施例的描述中,本发明的各个特征有时被一起分组到单个实施例、图、或者对其的描述中。然而,并不应将该公开的方法解释成反映如下意图:即所要求保护的本发明要求比在每个权利要求中所明确记载的特征更多的特征。更确切地说,如下面的权利要求书所反映的那样,发明方面在于少于前面公开的单个实施例的所有特征。因此,遵循具体实施方式的权利要求书由此明确地并入该具体实施方式,其中每个权利要求本身都作为本发明的单独实施例。Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, in order to streamline this disclosure and to facilitate an understanding of one or more of the various inventive aspects, various features of the invention are sometimes grouped together in a single embodiment, figure, or its description. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the Detailed Description are hereby expressly incorporated into this Detailed Description, with each claim standing on its own as a separate embodiment of this invention.

本领域那些技术人员可以理解,可以对实施例中的设备中的模块进行自适应性地改变并且把它们设置在与该实施例不同的一个或多个设备中。可以把实施例中的模块或单元或组件组合成一个模块或单元或组件,以及此外可以把它们分成多个子模块或子单元或子组件。除了这样的特征和/或过程或者单元中的至少一些是相互排斥之外,可以采用任何组合对本说明书(包括伴随的权利要求、摘要和附图)中公开的所有特征以及如此公开的任何方法或者设备的所有过程或单元进行组合。除非另外明确陈述,本说明书(包括伴随的权利要求、摘要和附图)中公开的每个特征可以由提供相同、等同或相似目的的替代特征来代替。Those skilled in the art can understand that the modules in the device in the embodiment can be adaptively changed and arranged in one or more devices different from the embodiment. Modules or units or components in the embodiments may be combined into one module or unit or component, and furthermore may be divided into a plurality of sub-modules or sub-units or sub-assemblies. All features disclosed in this specification (including accompanying claims, abstract and drawings), as well as any method or method so disclosed, may be used in any combination, except that at least some of such features and/or processes or units are mutually exclusive. All processes or units of equipment are combined. Each feature disclosed in this specification (including accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.

此外,本领域的技术人员能够理解,尽管在此所述的一些实施例包括其它实施例中所包括的某些特征而不是其它特征,但是不同实施例的特征的组合意味着处于本发明的范围之内并且形成不同的实施例。例如,在下面的权利要求书中,所要求保护的实施例的任意之一都可以以任意的组合方式来使用。Furthermore, those skilled in the art will understand that although some embodiments described herein include some features included in other embodiments but not others, combinations of features from different embodiments are meant to be within the scope of the invention. and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.

本发明的各个部件实施例可以以硬件实现,或者以在一个或者多个处理器上运行的软件模块实现,或者以它们的组合实现。本领域的技术人员应当理解,可以在实践中使用微处理器或者数字信号处理器(DSP)来实现根据本发明实施例的文件样本的扫描系统中的一些或者全部部件的一些或者全部功能。本发明还可以实现为用于执行这里所描述的方法的一部分或者全部的设备或者装置程序(例如,计算机程序和计算机程序产品)。这样的实现本发明的程序可以存储在计算机可读介质上,或者可以具有一个或者多个信号的形式。这样的信号可以从因特网网站上下载得到,或者在载体信号上提供,或者以任何其他形式提供。The various component embodiments of the present invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all functions of some or all components in the document sample scanning system according to the embodiment of the present invention. The present invention can also be implemented as an apparatus or an apparatus program (for example, a computer program and a computer program product) for performing a part or all of the methods described herein. Such a program for realizing the present invention may be stored on a computer-readable medium, or may be in the form of one or more signals. Such a signal may be downloaded from an Internet site, or provided on a carrier signal, or provided in any other form.

应该注意的是上述实施例对本发明进行说明而不是对本发明进行限制,并且本领域技术人员在不脱离所附权利要求的范围的情况下可设计出替换实施例。在权利要求中,不应将位于括号之间的任何参考符号构造成对权利要求的限制。单词“包含”不排除存在未列在权利要求中的元件或步骤。位于元件之前的单词“一”或“一个”不排除存在多个这样的元件。本发明可以借助于包括有若干不同元件的硬件以及借助于适当编程的计算机来实现。在列举了若干装置的单元权利要求中,这些装置中的若干个可以是通过同一个硬件项来具体体现。单词第一、第二、以及第三等的使用不表示任何顺序。可将这些单词解释为名称。It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The use of the words first, second, and third, etc. does not indicate any order. These words can be interpreted as names.

Claims (20)

1.一种文件样本的扫描方法，所述方法包括：1. A method for scanning file samples, the method comprising: 对于文件样本中的灰文件样本，根据预设的策略从存储的灰文件样本中选取待扫描的灰文件样本；for gray file samples among the file samples, selecting gray file samples to be scanned from the stored gray file samples according to a preset policy; 根据各个鉴定器的更新记录和/或鉴定器的扫描记录选取用于扫描灰文件样本的鉴定器；selecting identifiers for scanning the gray file samples according to the update records and/or the scan records of the identifiers; 使用所选取的鉴定器扫描待扫描的灰文件样本，并存储扫描结果，以在接收到查询文件样本是否安全的请求时，返回扫描结果；scanning the gray file samples to be scanned with the selected identifiers and storing the scan results, so that the scan results are returned when a request asking whether a file sample is safe is received; 所述灰文件样本为安全性未知的文件样本；the gray file samples being file samples whose security is unknown; 其中，所述根据预设的策略从存储的灰文件样本中选取待扫描的灰文件样本具体包括：wherein selecting the gray file samples to be scanned from the stored gray file samples according to the preset policy specifically comprises: 根据灰文件样本的属性得出灰文件样本的漏报率，根据漏报率从存储的灰文件样本中选取待扫描的灰文件样本；obtaining the false negative rate of a gray file sample according to its attributes, and selecting the gray file samples to be scanned from the stored gray file samples according to the false negative rate; 所述根据灰文件样本的属性得出灰文件样本的漏报率前还包括：and, before obtaining the false negative rate of the gray file sample according to its attributes, further comprises: 判断灰文件样本的首次发现时间是否早于预设时间阀值，如果是，则检测出该灰文件样本的漏报率为0，如果否，则进行所述根据灰文件样本的属性得出灰文件样本的漏报率的操作。judging whether the first discovery time of the gray file sample is earlier than a preset time threshold; if so, determining that the false negative rate of the gray file sample is 0; if not, performing the operation of obtaining the false negative rate of the gray file sample according to its attributes.

2.根据权利要求1所述的方法，其中，2. The method of claim 1, wherein 所述方法还包括：the method further comprises: 对于文件样本中的危险文件样本，确定报告危险文件样本为病毒文件的鉴定器，所述危险文件样本为被鉴定器报告为病毒文件的文件样本；for dangerous file samples among the file samples, determining the identifiers that report a dangerous file sample as a virus file, a dangerous file sample being a file sample that an identifier has reported as a virus file; 使用报告危险文件样本为病毒文件的鉴定器重新扫描危险文件样本，如果扫描结果为该危险文件样本不再为病毒文件，则确定该危险文件样本为被误报为病毒文件的文件样本，对该危险文件样本进行去误报操作。rescanning the dangerous file sample with the identifiers that reported it as a virus file, and, if the scan result is that the dangerous file sample is no longer a virus file, determining that the dangerous file sample was falsely reported as a virus file and performing a false-positive removal operation on it.

3.根据权利要求2所述的方法，其中，3. The method of claim 2, wherein 所述确定报告危险文件样本为病毒文件的鉴定器后还包括：after determining the identifiers that report the dangerous file sample as a virus file, the method further comprises: 根据确定的鉴定器得出危险文件样本的误报率，建立鉴定器数量和误报率的对应关系；obtaining the false positive rate of the dangerous file sample according to the determined identifiers, and establishing a correspondence between the number of identifiers and the false positive rate; 根据误报率从存储的危险文件样本中选取待扫描的危险文件样本；selecting the dangerous file samples to be scanned from the stored dangerous file samples according to the false positive rate; 所述使用报告危险文件样本为病毒文件的鉴定器重新扫描危险文件样本具体包括：and rescanning the dangerous file sample with the identifiers that reported it as a virus file specifically comprises: 对于每个待扫描的危险文件样本，使用报告该危险文件样本为病毒文件的鉴定器重新扫描该危险文件样本。for each dangerous file sample to be scanned, rescanning that dangerous file sample with the identifiers that reported it as a virus file.

4.根据权利要求1至3任一项所述的方法，其中，4. The method according to any one of claims 1 to 3, wherein 所述方法还包括：the method further comprises: 在接收到查询文件样本是否安全的请求时，在日志中记录接收到的请求；when a request asking whether a file sample is safe is received, recording the received request in a log; 根据日志中的记录，提取在预设时间内查询次数大于预设热度阀值的文件样本，提取的文件样本为活跃文件样本。and, according to the records in the log, extracting the file samples whose number of queries within a preset time is greater than a preset popularity threshold, the extracted file samples being active file samples.

5.根据权利要求4所述的方法，其中，5. The method of claim 4, wherein 所述根据灰文件样本的属性得出灰文件样本的漏报率具体包括：obtaining the false negative rate of the gray file sample according to its attributes specifically comprises: 从活跃文件样本中提取灰文件样本，根据提取的灰文件样本的属性得出该灰文件样本的漏报率；extracting gray file samples from the active file samples, and obtaining the false negative rate of each extracted gray file sample according to its attributes; 所述根据漏报率从存储的灰文件样本中选取待扫描的灰文件样本具体包括：and selecting the gray file samples to be scanned from the stored gray file samples according to the false negative rate specifically comprises: 从提取的灰文件样本中选取漏报率大于预设漏报率阀值的灰文件样本，以选取的灰文件样本为待扫描的灰文件样本。selecting, from the extracted gray file samples, those whose false negative rate is greater than a preset false negative rate threshold, and taking the selected gray file samples as the gray file samples to be scanned.

6.根据权利要求4所述的方法，其中，6. The method of claim 4, wherein 所述确定报告危险文件样本为病毒文件的鉴定器具体包括：determining the identifiers that report the dangerous file sample as a virus file specifically comprises: 从活跃文件样本中提取危险文件样本，确定报告提取的危险文件样本为病毒文件的鉴定器；extracting dangerous file samples from the active file samples, and determining the identifiers that report each extracted dangerous file sample as a virus file; 所述根据误报率从存储的危险文件样本中选取待扫描的危险文件样本具体包括：and selecting the dangerous file samples to be scanned from the stored dangerous file samples according to the false positive rate specifically comprises: 从提取的危险文件样本中选取误报率大于预设的误报率阀值的危险文件样本，以选取的危险文件样本为待扫描的危险文件样本。selecting, from the extracted dangerous file samples, those whose false positive rate is greater than a preset false positive rate threshold, and taking the selected dangerous file samples as the dangerous file samples to be scanned.

7.根据权利要求1所述的方法，其中，7. The method of claim 1, wherein 所述根据灰文件样本的属性得出灰文件样本的漏报率具体包括：obtaining the false negative rate of the gray file sample according to its attributes specifically comprises: 根据对病毒文件的特征进行统计而得出的统计结果，以及灰文件样本的属性，计算出该灰文件样本可能为病毒文件的概率，以该概率作为计算该灰文件样本的漏报率的参量。calculating the probability that the gray file sample may be a virus file from statistics on the characteristics of virus files and from the attributes of the gray file sample, and using that probability as a parameter for calculating the false negative rate of the gray file sample.

8.根据权利要求7所述的方法，其中，8. 
The method of claim 7, wherein, 所述根据对病毒文件的特征进行统计而得出的统计结果,以及灰文件样本的属性,计算出该灰文件样本可能为病毒文件的概率具体包括:According to the statistical results obtained by performing statistics on the characteristics of virus files, and the attributes of gray file samples, calculating the probability that the gray file samples may be virus files specifically includes: 根据对病毒文件的大小进行统计而得出的统计结果,以及灰文件样本的大小,计算出该灰文件样本可能为病毒文件的概率;Calculate the probability that the gray file sample may be a virus file according to the statistical results obtained by counting the size of the virus file and the size of the gray file sample; 和/或,and / or, 根据对病毒文件的路径进行统计而得出的统计结果,以及灰文件样本的路径,计算出该灰文件样本可能为病毒文件的概率;Calculate the probability that the gray file sample may be a virus file according to the statistical results obtained by counting the path of the virus file and the path of the gray file sample; 和/或,and / or, 根据对病毒文件的操作行为进行统计而得出的危险操作行为列表,以及灰文件样本的操作行为,计算出该灰文件样本可能为病毒文件的概率。According to the list of dangerous operation behaviors obtained by counting the operation behaviors of virus files and the operation behaviors of gray file samples, the probability that the gray file samples may be virus files is calculated. 9.根据权利要求1所述的方法,其中,9. The method of claim 1, wherein, 所述根据各个鉴定器的扫描记录选取用于扫描灰文件样本的鉴定器具体包括:The selection of the identifier for scanning the gray file sample according to the scanning records of each identifier specifically includes: 对于每个鉴定器、每个待扫描的灰文件样本,根据该鉴定器扫描该灰文件样本的次数,计算该灰文件样本对应于该鉴定器的扫描间隔;For each identifier and each gray file sample to be scanned, calculate the scanning interval of the gray file sample corresponding to the identifier according to the number of times the identifier scans the gray file sample; 根据扫描间隔选取用于扫描灰文件样本的鉴定器。Select the identifier used to scan gray file samples based on the scan interval. 10.根据权利要求1所述的方法,其中,10. 
The method of claim 1, wherein, 所述根据各个鉴定器的更新记录选取用于扫描灰文件样本的鉴定器具体包括:The selection of the identifier for scanning the gray file sample according to the update record of each identifier specifically includes: 根据更新记录从各个鉴定器中选取在上次扫描后进行过更新的鉴定器。Selects from each authenticator the one that has been updated since the last scan based on the update record. 11.一种文件样本的扫描系统,所述系统包括:样本存储装置、查杀引擎、扫描调度装置和包含多个鉴定器的样本扫描装置;11. A scanning system for file samples, said system comprising: a sample storage device, an anti-virus engine, a scanning scheduling device and a sample scanning device comprising a plurality of identifiers; 所述样本存储装置,适于存储文件样本;The sample storage device is suitable for storing file samples; 所述扫描调度装置,适于对于文件样本中的灰文件样本,根据预设的策略从所述样本存储装置存储的灰文件样本中选取待扫描的灰文件样本,并根据各个鉴定器的更新记录和/或鉴定器的扫描记录选取用于扫描灰文件样本的鉴定器;The scanning scheduling device is adapted to select the gray file samples to be scanned from the gray file samples stored in the sample storage device according to a preset strategy for the gray file samples in the file samples, and select the gray file samples to be scanned according to the update records of each identifier and/or the authenticator's scan records to select the authenticator used to scan the gray file sample; 所述样本扫描装置,适于从所述样本存储装置获取待扫描的灰文件样本,使用所述扫描调度装置选取的鉴定器扫描获取的待扫描的灰文件样本,并将扫描结果存储到查杀引擎;The sample scanning device is adapted to acquire gray file samples to be scanned from the sample storage device, use the identifier selected by the scan scheduling device to scan the obtained gray file samples to be scanned, and store the scanning results in the scanning engine; 所述查杀引擎,适于存储样本文件的扫描结果,并在接收到查询文件样本是否安全的请求时,返回扫描结果;The killing engine is suitable for storing the scan results of the sample files, and returns the scan results when receiving a request to inquire whether the file samples are safe; 所述灰文件样本为安全性未知的文件样本;The gray file sample is a file sample with unknown security; 其中,所述扫描调度装置,具体适于根据灰文件样本的属性得出灰文件样本的漏报率,根据漏报率从存储的灰文件样本中选取待扫描的灰文件样本;Wherein, the scanning scheduling device is 
specifically adapted to obtain the false negative rate of the gray document sample according to the attribute of the gray document sample, and select the gray document sample to be scanned from the stored gray document samples according to the false positive rate; 所述扫描调度装置,还适于在根据灰文件样本的属性得出灰文件样本的漏报率前,判断灰文件样本的首次发现时间是否早于预设时间阀值,如果是,则检测出该灰文件样本的漏报率为0,如果否,则进行所述根据灰文件样本的属性得出灰文件样本的漏报率的操作。The scanning scheduling device is also suitable for judging whether the first discovery time of the gray file sample is earlier than the preset time threshold before obtaining the false negative rate of the gray file sample according to the attributes of the gray file sample, and if so, detecting The false negative rate of the gray file sample is 0, if not, perform the operation of obtaining the false negative rate of the gray file sample according to the attributes of the gray file sample. 12.根据权利要求11所述的系统,其中,12. The system of claim 11, wherein, 所述扫描调度装置,还适于对于文件样本中的危险文件样本,确定报告危险文件样本为病毒文件的鉴定器,所述危险文件样本为被鉴定器报告为病毒文件的文件样本;The scan scheduling device is further adapted to determine, for a dangerous file sample among the file samples, an identifier that reports the dangerous file sample as a virus file, and the dangerous file sample is a file sample that is reported as a virus file by the identifier; 所述样本扫描装置,还适于从所述样本存储装置获取危险文件样本,使用所述扫描调度装置确定的鉴定器重新扫描获取的危险文件样本,如果扫描结果为该危险文件样本不再为病毒文件,则确定该危险文件样本为被误报为病毒文件的文件样本,对该危险文件样本进行去误报操作。The sample scanning device is further adapted to obtain a dangerous file sample from the sample storage device, and use the identifier determined by the scan scheduling device to re-scan the obtained dangerous file sample, if the scanning result is that the dangerous file sample is no longer a virus file, then it is determined that the dangerous file sample is a file sample that is falsely reported as a virus file, and the false positive operation is performed on the dangerous file sample. 13.根据权利要求12所述的系统,其中,13. 
The system of claim 12, wherein, 所述扫描调度装置,还适于在确定报告危险文件样本为病毒文件的鉴定器后,根据确定的鉴定器得出危险文件样本的误报率,建立鉴定器数量和误报率的对应关系,根据误报率从所述样本存储装置存储的危险文件样本中选取待扫描的危险文件样本;The scanning scheduling device is further adapted to determine the false alarm rate of the dangerous file sample according to the determined identifier after determining the identifier that reports the dangerous file sample as a virus file, and establish a corresponding relationship between the number of identifiers and the false alarm rate, selecting a dangerous file sample to be scanned from the dangerous file samples stored in the sample storage device according to the false positive rate; 所述样本扫描装置,具体适于对于每个待扫描的危险文件样本,使用报告该危险文件样本为病毒文件的鉴定器重新扫描该危险文件样本。The sample scanning device is specifically adapted to, for each dangerous file sample to be scanned, re-scan the dangerous file sample using an identifier that reports the dangerous file sample as a virus file. 14.根据权利要求11至13任一项所述的系统,其中,14. A system according to any one of claims 11 to 13, wherein, 所述查杀引擎,还适于在接收到查询文件样本是否安全的请求时,在日志中记录接收到的请求;The killing engine is also adapted to record the received request in a log when receiving a request to inquire whether the file sample is safe; 所述扫描调度装置,还适于根据日志中的记录,提取在预设时间内查询次数大于预设热度阀值的文件样本,提取的文件样本为活跃文件样本。The scanning scheduling device is further adapted to extract file samples whose query times are greater than a preset popularity threshold within a preset time according to the records in the log, and the extracted file samples are active file samples. 15.根据权利要求14所述的系统,其中,15. 
The system of claim 14, wherein, 所述扫描调度装置,具体适于从活跃文件样本中提取灰文件样本,根据提取的灰文件样本的属性得出该灰文件样本的漏报率,根据所得出的漏报率从提取的灰文件样本中选取漏报率大于预设漏报率阀值的灰文件样本,以选取的灰文件样本为待扫描的灰文件样本。The scanning scheduling device is specifically adapted to extract gray file samples from active file samples, obtain the false negative rate of the gray file sample according to the attributes of the extracted gray file samples, and obtain the false negative rate from the extracted gray file samples according to the obtained false negative rate. Gray file samples with a false negative rate greater than a preset false negative rate threshold are selected from the samples, and the selected gray file samples are the gray file samples to be scanned. 16.根据权利要求14所述的系统,其中,16. The system of claim 14, wherein, 所述扫描调度装置,具体适于从活跃文件样本中提取危险文件样本,确定报告提取的危险文件样本为病毒文件的鉴定器,根据确定的鉴定器得出提取的危险文件样本的误报率,根据误报率从提取的危险文件样本中选取误报率大于预设的误报率阀值的危险文件样本,以选取的危险文件样本为待扫描的危险文件样本。The scanning scheduling device is specifically suitable for extracting dangerous file samples from active file samples, determining the identifier that reports the extracted dangerous file sample as a virus file, and obtaining the false alarm rate of the extracted dangerous file sample according to the determined identifier, Selecting dangerous file samples with a false positive rate greater than a preset false positive rate threshold from the extracted dangerous file samples according to the false positive rate, and using the selected dangerous file samples as dangerous file samples to be scanned. 17.根据权利要求11所述的系统,其中,17. 
The system of claim 11, wherein, 所述扫描调度装置,具体适于根据对病毒文件的特征进行统计而得出的统计结果,以及灰文件样本的属性,计算出该灰文件样本可能为病毒文件的概率,以该概率为计算该灰文件样本的漏报率的参量。The scanning scheduling device is specifically adapted to calculate the probability that the gray file sample may be a virus file based on the statistical results obtained by statistically counting the characteristics of the virus file and the attributes of the gray file sample, and use this probability as the basis for calculating the The parameter of the false negative rate of the gray file sample. 18.根据权利要求17所述的系统,其中,18. The system of claim 17, wherein, 所述扫描调度装置,具体适于根据对病毒文件的大小进行统计而得出的统计结果,以及灰文件样本的大小,计算出该灰文件样本可能为病毒文件的概率;The scanning scheduling device is specifically adapted to calculate the probability that the gray file sample may be a virus file according to the statistical result obtained by counting the size of the virus file and the size of the gray file sample; 和/或,and / or, 根据对病毒文件的路径进行统计而得出的统计结果,以及灰文件样本的路径,计算出该灰文件样本可能为病毒文件的概率;Calculate the probability that the gray file sample may be a virus file according to the statistical results obtained by counting the path of the virus file and the path of the gray file sample; 和/或,and / or, 根据对病毒文件的操作行为进行统计而得出的危险操作行为列表,以及灰文件样本的操作行为,计算出该灰文件样本可能为病毒文件的概率。According to the list of dangerous operation behaviors obtained by counting the operation behaviors of virus files and the operation behaviors of gray file samples, the probability that the gray file samples may be virus files is calculated. 19.根据权利要求11所述的系统,其中,19. The system of claim 11, wherein: 所述扫描调度装置,具体适于对于每个鉴定器、每个待扫描的灰文件样本,根据该鉴定器扫描该灰文件样本的次数,计算该灰文件样本对应于该鉴定器的扫描间隔,根据扫描间隔选取用于扫描灰文件样本的鉴定器。The scan scheduling device is specifically adapted to calculate, for each identifier and each gray file sample to be scanned, the scanning interval of the gray file sample corresponding to the identifier according to the number of times the identifier scans the gray file sample, Select the identifier used to scan gray file samples based on the scan interval. 
20.根据权利要求11所述的系统,其中,20. The system of claim 11, wherein: 所述扫描调度装置,具体适于根据更新记录从各个鉴定器中选取在上次扫描后进行过更新的鉴定器。The scanning scheduling device is specifically adapted to select an authenticator that has been updated after the last scan from various authenticators according to the update record.
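The selection logic recited in claims 1, 5 and 10 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names, the field names, the `virus_probability` score and the choice to use that score directly as the false negative rate are all assumptions made for illustration, as is the decision to treat an identifier that has never scanned a sample as eligible.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class GraySample:
    """Hypothetical record for a file sample whose security is unknown."""
    sample_id: str
    first_seen: datetime
    virus_probability: float  # assumed score from size/path/behavior statistics
    last_scanned: dict = field(default_factory=dict)  # identifier name -> last scan time


@dataclass
class Identifier:
    """Hypothetical record for one scanning identifier (engine)."""
    name: str
    last_updated: datetime


def false_negative_rate(sample, time_threshold):
    """Samples first discovered before the time threshold get a rate of 0."""
    if sample.first_seen < time_threshold:
        return 0.0
    # Assumption: the virus probability is used directly as the rate.
    return sample.virus_probability


def select_gray_samples(samples, time_threshold, rate_threshold):
    """Keep samples whose false negative rate exceeds the preset threshold."""
    return [s for s in samples
            if false_negative_rate(s, time_threshold) > rate_threshold]


def select_identifiers(sample, identifiers):
    """Keep identifiers updated since they last scanned this sample."""
    chosen = []
    for ident in identifiers:
        last_scan = sample.last_scanned.get(ident.name)
        # Assumption: an identifier that never scanned the sample is eligible.
        if last_scan is None or ident.last_updated > last_scan:
            chosen.append(ident)
    return chosen


if __name__ == "__main__":
    now = datetime(2013, 3, 6)
    cutoff = now - timedelta(days=30)
    samples = [
        GraySample("old", now - timedelta(days=90), 0.9),
        GraySample("new", now - timedelta(days=2), 0.6,
                   {"engine_a": now - timedelta(days=1)}),
    ]
    engines = [Identifier("engine_a", now - timedelta(days=5)),
               Identifier("engine_b", now - timedelta(days=5))]
    for s in select_gray_samples(samples, cutoff, 0.5):
        print(s.sample_id, [i.name for i in select_identifiers(s, engines)])
```

Skipping old samples and identifiers that have not changed since their last scan is what lets the scheduler avoid rescans that cannot produce a new result, which is the resource saving the abstract describes.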
CN201310071272.6A 2013-03-06 2013-03-06 The scan method of paper sample and system Active CN103136477B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310071272.6A CN103136477B (en) 2013-03-06 2013-03-06 The scan method of paper sample and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310071272.6A CN103136477B (en) 2013-03-06 2013-03-06 The scan method of paper sample and system

Publications (2)

Publication Number Publication Date
CN103136477A CN103136477A (en) 2013-06-05
CN103136477B true CN103136477B (en) 2015-09-02

Family

ID=48496294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310071272.6A Active CN103136477B (en) 2013-03-06 2013-03-06 The scan method of paper sample and system

Country Status (1)

Country Link
CN (1) CN103136477B (en)

Also Published As

Publication number Publication date
CN103136477A (en) 2013-06-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220720

Address after: Room 801, 8th floor, No. 104, floors 1-19, building 2, yard 6, Jiuxianqiao Road, Chaoyang District, Beijing 100015

Patentee after: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Address before: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park)

Patentee before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee before: Qizhi software (Beijing) Co.,Ltd.

TR01 Transfer of patent right