
CN113162818A - Method and system for realizing distributed flow acquisition and analysis - Google Patents


Info

Publication number
CN113162818A
Authority
CN
China
Prior art keywords
network traffic
samples
sample
dimension
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110138388.1A
Other languages
Chinese (zh)
Inventor
颜靖华
刘阳
王益静
黄雨晨
王晗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
National Computer Network and Information Security Management Center
Original Assignee
Institute of Information Engineering of CAS
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS and National Computer Network and Information Security Management Center
Priority to CN202110138388.1A
Publication of CN113162818A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0876 Network utilisation, e.g. volume of load or congestion level
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/134 Distributed indices
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract


The invention relates to a method and a system for realizing distributed traffic collection and analysis. The steps of the method include: collecting network traffic samples, marking the network traffic samples, and adding index labels of different dimensions; storing the collected network traffic samples in the Elasticsearch distributed search engine and retrieving the network traffic samples according to the different dimensions; computing statistics on the network traffic samples and storing them in a Redis database; and using the TCPREPLAY technology to play back the network traffic samples, replaying the network traffic either at the speed at which the packets were captured or at a specified speed. Through the collection and analysis of network data packets, the invention realizes the marking, retrieval, storage, and playback of network traffic, and by analyzing the captured network traffic the state of the network can be known in real time.


Description

Method and system for realizing distributed flow acquisition and analysis
Technical Field
The invention belongs to the field of distributed traffic acquisition and analysis, and relates to a method and a system for realizing distributed traffic acquisition and analysis, which can realize acquisition, analysis, marking, retrieval, storage and playback of PCAP traffic.
Background
With the rapid development of the internet and network applications, network traffic is exhibiting explosive growth, and its potential value is being continuously mined and utilized. Networks are bearing increasing data transmission requirements as basic conditions for data exchange and sharing, and how to realize real-time acquisition, storage and analysis of network data is a problem that must be faced by network traffic analysis. Currently, the performance of a single server is far from meeting the requirement of network data analysis, and a distributed network data acquisition and analysis mode is a development direction and a necessary means of the work. Therefore, the adoption of a distributed architecture is currently a necessary option.
A distributed network traffic analysis system mainly addresses network data acquisition, data storage, data analysis, visualization, and similar capabilities under ultra-high-speed conditions, and deploys each functional module in a distributed, loosely coupled manner. Although existing network traffic analysis technology in the industry can analyze the network, the analysis dimensions are not fine-grained enough, and the analysis results still need improvement.
Disclosure of Invention
The invention provides a method and a system for realizing distributed traffic collection and analysis, which can realize marking, searching, storing and replaying of network traffic through collection and analysis of network data packets, and can know the state of a network in real time through analyzing the captured network traffic.
The method classifies network traffic along the region dimension, bandwidth dimension, time dimension, code address dimension, two-layer protocol dimension, three-layer protocol dimension, keyword dimension, length range dimension, traffic length range dimension, and so on. It supports queries by combinations of index conditions to obtain the corresponding traffic samples and generate an information summary file, and it can play back the network samples, compute statistics on them, and so on.
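A query by combined index conditions of this kind could be expressed as an Elasticsearch bool/filter query body. The following is a minimal sketch, not the patent's implementation: the function name `build_query` and the field names `region`, `app_protocol`, and `capture_time` are hypothetical, since the patent does not specify an index mapping. Only the `bool`, `filter`, `term`, and `range` constructs come from the Elasticsearch Query DSL.

```python
from typing import Any, Dict, List, Optional


def build_query(region: Optional[str] = None,
                protocol: Optional[str] = None,
                start: Optional[str] = None,
                end: Optional[str] = None) -> Dict[str, Any]:
    """Combine optional dimension conditions into one bool/filter query body.

    All field names below are illustrative assumptions.
    """
    filters: List[Dict[str, Any]] = []
    if region is not None:
        filters.append({"term": {"region": region}})
    if protocol is not None:
        filters.append({"term": {"app_protocol": protocol}})
    if start is not None or end is not None:
        bounds: Dict[str, str] = {}
        if start is not None:
            bounds["gte"] = start  # inclusive lower bound
        if end is not None:
            bounds["lt"] = end     # exclusive upper bound
        filters.append({"range": {"capture_time": bounds}})
    # Every clause in "filter" must match, so conditions are ANDed together.
    return {"query": {"bool": {"filter": filters}}}


query = build_query(region="office-a", protocol="HTTP",
                    start="2021-02-01T00:00:00Z")
```

Each dimension that the caller leaves as `None` is simply omitted, so the same function covers single-condition and multi-condition searches.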
The technical scheme adopted by the invention is as follows:
a method for realizing distributed traffic collection and analysis comprises the following steps:
collecting a network flow sample, and adding index labels with different dimensions to the network flow sample;
storing the collected network traffic samples into an Elasticsearch distributed search engine, and retrieving the network traffic samples according to different dimensions;
counting network flow samples and storing the network flow samples into a Redis database;
and playing back the network traffic sample.
Further, the different dimensions include: region dimension, bandwidth dimension, time dimension, code address dimension, protocol dimension, keyword dimension, length range dimension, and traffic length range dimension.
Further, the Elasticsearch distributed search engine shards the index. When an index is created, its number of shards must be specified, and the shards are divided into primary shards and replica shards. When a document is stored, Elasticsearch computes the corresponding primary shard, stores the document there, and then synchronizes it to that shard's replica shards. A replica shard is not only a redundant copy of the primary shard; it can also serve queries and computations to share the load of its primary shard.
Further, the network flow samples are counted, and the counted values include byte number, packet number, stream number, average duration value, maximum duration value and minimum duration value; and according to different protocol statistics proportion conditions, sample flow statistics display is carried out through a bar chart, a pie chart and a line chart, so that a user can conveniently understand a retrieval result.
Further, for the data stored in the Redis database, a dual-guarantee storage structure of a MySQL master-slave cluster and a HDFS high-availability cluster is adopted for data persistence storage.
Further, the playback of the network traffic sample is to replay the network traffic by using TCPREPLAY technology.
Further, the network flow sample is played back, the network flow is supported to be played back according to the speed of the data packet when the sample flow is captured or the designated speed, and the transmitted data packet sequence is strictly ensured to be consistent with the real flow data packet sequence when the sample flow is captured in the playback process; the number, playback time and current playback rate of the played back packets are fed back in real time in the playback process, and the played back data packets are dynamically modified according to the MAC addresses in the playback process.
A system for realizing distributed traffic collection and analysis by adopting the method comprises:
the sample flow capturing module is used for collecting network flow samples;
the sample traffic marking module is used for adding index labels with different dimensions to the acquired network traffic samples;
the sample traffic retrieval module is used for storing the acquired network traffic samples into an Elasticsearch distributed search engine and retrieving the network traffic samples according to different dimensions;
the sample flow counting module is used for counting network flow samples and storing the network flow samples into a Redis database;
and the sample flow playback module is used for playing back the network flow sample.
The invention has the following advantages and positive effects:
(1) Elasticsearch (ES) runs as a large distributed cluster to which new servers can easily be added, and it can also run on a single machine as a lightweight search engine. Compared with a traditional relational database, ES provides full-text retrieval, synonym handling, relevance ranking, complex data analysis, and near-real-time processing of massive data. A single index is divided into several shards (Shard), which improves processing efficiency through divide and conquer. A replica (Replica) mechanism is provided: a shard can have several replicas, so the cluster keeps working normally even if some servers go down. ES also offers a simple, easy-to-use API, which makes services easy to build, deploy, and use.
(2) With TCPREPLAY, the sample traffic can be played back to a designated location either as is or after arbitrary modification, and the speed at which the sample traffic is replayed can be specified.
(3) For parsing PCAP captures, the Data Plane Development Kit (DPDK) is used to parse each PCAP capture into ELOG log form, so that data packets can be processed quickly. Secondary development on top of DPDK can greatly improve data processing performance and throughput and raise the efficiency of data-plane applications.
(4) In the aspect of data storage, the invention simultaneously utilizes two schemes of MySQL and HDFS, and also adopts a high-availability mode in the aspect of cluster construction, thereby avoiding the problem of single-point failure and further ensuring the persistence of data.
Drawings
Fig. 1 is a flow chart of the functional implementation of the present invention.
FIG. 2 is a general deployment diagram of the present invention. The master control node is used for collecting data collected by the collection server cluster and analyzing the PCAP file into an ELOG log; the distributed Redis cache is used for realizing the distributed storage of the PCAP flow samples; the MYSQL cluster and the HDFS file system are used for persisting data; an ES distributed retrieval system facilitates search queries according to different dimensions.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, the present invention shall be described in further detail with reference to the following detailed description and accompanying drawings.
The PCAP (packet capture, an application programming interface for capturing network traffic) network traffic analysis in the invention is divided into three main parts: marking, retrieval, and playback. Marking parses the captured PCAP file to generate an ELOG file, labels the generated data, and adds indexes for commonly used fields. Retrieval stores the data parsed from the PCAP file in the Elasticsearch distributed search engine, obtains traffic samples by querying with combined index conditions, and generates an information summary file; during retrieval the data is classified and counted at minute granularity, and the traffic samples are presented as bar charts, lists, pie charts, and line charts so that users can understand the retrieval results easily. Playback uses the TCPREPLAY tool to replay network traffic from the captured PCAP file at the speed at which the packets were captured or at a specified speed, provided the speed is within what the hardware can sustain. The tool can also split traffic between two network cards, write it to files, and filter and edit it in various ways as required, which provides a way to test firewalls, NIDS, and other network devices.
The key technologies used by the system are: data acquisition and data analysis; the Elasticsearch distributed search engine; Redis distributed storage; MySQL persistence; and the HDFS file system.
The invention provides a method for realizing PCAP network flow analysis, which comprises the following steps:
step 1: PCAP traffic sample analysis and labeling. After receiving the PCAP traffic sample, analyzing the PCAP traffic sample, and adding index labels with different dimensions to commonly used fields, wherein the index labels mainly comprise a region dimension, a bandwidth dimension, a time dimension, a code address dimension, a two-layer protocol dimension, a three-layer protocol dimension, a keyword dimension, a length range dimension, a traffic length range dimension and the like.
a) The region dimension. This mainly covers the access office point, the access operator, and the access direction.
b) The bandwidth dimension. Samples are analyzed by 10GE, 100GE, 10GPS, etc.
c) The time dimension. Samples are analyzed by minute, hour, day, month, and year.
d) The code address dimension. Samples are analyzed by the source and destination IP address ranges and port number ranges of the traffic.
e) The two-layer protocol dimension. Samples are analyzed by source and destination MAC address range, VLAN, MPLS, ICMP, ARP, LACP, LLDP, etc.
f) The three-layer protocol dimension. Samples are analyzed by typical applications such as DHCP, DNS, HTTP, SMTP, POP3, IMAP, FTP, etc.
g) The keyword dimension. Custom keywords at any position are supported, as is feature-code definition at layers two, three, and four.
h) The length range dimension. Samples are analyzed by the ranges of the two-layer frame check, the three-layer packet length, and the four-layer packet length.
i) The traffic length range dimension. Samples are analyzed by the range of packet data per flow and the range of flow duration.
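The labeling in Step 1 can be pictured as turning each parsed record into a set of index labels, one per dimension. The sketch below covers only a few of the dimensions and rests on several assumptions: the `FlowRecord` fields, the label names, and the `LENGTH_BUCKETS` boundaries are all hypothetical, as the patent does not define a record layout or bucket sizes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict

# Hypothetical frame-length buckets in bytes; the patent gives no boundaries.
LENGTH_BUCKETS = [(0, 64), (65, 512), (513, 1518)]


@dataclass
class FlowRecord:
    ts: float            # capture time, epoch seconds
    src_ip: str
    dst_ip: str
    dst_port: int
    app_protocol: str    # e.g. "http", "dns"
    frame_len: int       # bytes
    access_point: str    # region dimension: access office point
    link_type: str       # bandwidth dimension: e.g. "10GE"


def length_bucket(n: int) -> str:
    """Map a frame length to its range label."""
    for lo, hi in LENGTH_BUCKETS:
        if lo <= n <= hi:
            return f"{lo}-{hi}"
    return ">1518"


def tag(record: FlowRecord) -> Dict[str, object]:
    """Derive one index label per dimension from a parsed record."""
    t = datetime.fromtimestamp(record.ts, tz=timezone.utc)
    return {
        "region": record.access_point,
        "bandwidth": record.link_type,
        "time_minute": t.strftime("%Y-%m-%dT%H:%M"),
        "time_hour": t.strftime("%Y-%m-%dT%H"),
        "src_ip": record.src_ip,
        "dst_ip": record.dst_ip,
        "dst_port": record.dst_port,
        "app_protocol": record.app_protocol.upper(),
        "length_range": length_bucket(record.frame_len),
    }


labels = tag(FlowRecord(1612137665, "10.0.0.1", "10.0.0.2", 80,
                        "http", 300, "office-a", "10GE"))
```

Precomputing coarse labels such as `time_minute` and `length_range` at ingest time keeps later retrieval down to exact-match filters on those fields.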
Step 2: PCAP traffic sample retrieval. The sample traffic is further divided according to the dimensions listed in Step 1 and stored in the Elasticsearch distributed search engine, and the index is sharded to avoid the data loss that would occur when a single node's hardware reaches its storage limit.
The index is sharded as follows. Elasticsearch is built on Lucene: a node is an instance of Elasticsearch, and each shard is an instance of Lucene that contains all of Lucene's basic functions. When an index is created, its number of shards must be specified, and the shards are divided into primary shards and replica shards. When a document is stored, Elasticsearch computes which primary shard it belongs to, stores it there, and then synchronizes it to that shard's replica shards, so a replica shard can be regarded as a redundant copy of its primary shard. A replica shard is not only a redundant copy, however: it can also serve queries and computations, sharing the load of its primary shard.
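To make the shard mechanics concrete, the sketch below shows an index-creation settings body and a simplified version of the "computation" that picks a primary shard. `number_of_shards` and `number_of_replicas` are real Elasticsearch index settings. The routing function is a deliberate simplification: Elasticsearch hashes the routing value (the document ID by default) with its own hash function and routing factor, whereas this illustration substitutes `zlib.crc32` and a plain modulo. The function names are hypothetical.

```python
import zlib
from typing import Any, Dict


def index_settings(primaries: int, replicas: int) -> Dict[str, Any]:
    """Request body for creating an index with a fixed shard layout."""
    return {
        "settings": {
            "number_of_shards": primaries,   # primary shards, set at creation
            "number_of_replicas": replicas,  # replica copies per primary
        }
    }


def pick_primary_shard(doc_id: str, primaries: int) -> int:
    """Simplified routing: hash the document ID, then take it modulo the
    number of primary shards. crc32 is a stand-in hash, not the one
    Elasticsearch uses."""
    return zlib.crc32(doc_id.encode("utf-8")) % primaries


body = index_settings(primaries=3, replicas=1)
shard = pick_primary_shard("flow-000001", 3)
```

Because the shard number depends on the count of primary shards, the same document ID always lands on the same shard, which is why the shard count has to be chosen when the index is created.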
For the retrieval of data with different dimensions, the process of reading the data is specifically as follows:
a) The client sends a read request (GET request) to any node; that node becomes the coordinating node.
b) The coordinating node routes the document and forwards the request to the corresponding node. At this point a polling (round-robin) algorithm selects one copy from among the primary shard and its replica shards, so that read requests are load-balanced.
c) The node that receives the request returns the document to the coordinating node.
d) The coordinating node returns the document to the client.
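The copy selection in step b) can be sketched as a rotation over the primary shard and its replicas. This is a toy model under stated assumptions: it reads the polling algorithm as plain round-robin, the `ShardCopies` class and node names are hypothetical, and it does not reflect how any particular Elasticsearch version actually chooses a copy.

```python
from itertools import cycle
from typing import List


class ShardCopies:
    """All copies of one shard: the primary plus its replicas."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        # Replicas hold the same documents as the primary, so any copy
        # can answer a read request.
        self._ring = cycle([primary] + list(replicas))

    def next_copy(self) -> str:
        """Return the node that should serve the next read request."""
        return next(self._ring)


copies = ShardCopies("node-1", ["node-2", "node-3"])
served_by = [copies.next_copy() for _ in range(4)]
```

Successive reads rotate through `node-1`, `node-2`, `node-3` and then wrap around, which spreads read load evenly across the copies.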
Traffic samples are obtained by querying with combined index conditions. For a given time node, the system can report the statistics of each traffic sample within that time node: the number of bytes, the number of packets, the number of flows, and the average, maximum, and minimum duration. The share of each protocol (HTTP, TCP, UDP, etc.) is computed, and the sample traffic statistics are displayed as bar charts, pie charts, line charts, and so on, so that users can understand the retrieval results more easily and make decisions based on them.
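The statistics listed above can be computed from the flows retrieved for one time node roughly as follows. This is a sketch under assumptions: the `Flow` fields and the result keys are hypothetical, and the protocol share is computed by flow count, although the patent does not say whether the share is measured in flows, packets, or bytes.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Flow:
    protocol: str
    n_bytes: int
    n_packets: int
    duration: float  # seconds


def summarize(flows: List[Flow]) -> Dict[str, Any]:
    """Aggregate the per-time-node statistics named in the description."""
    if not flows:
        return {"bytes": 0, "packets": 0, "flows": 0,
                "duration_avg": 0.0, "duration_max": 0.0,
                "duration_min": 0.0, "protocol_share": {}}
    durations = [f.duration for f in flows]
    per_protocol = Counter(f.protocol for f in flows)
    return {
        "bytes": sum(f.n_bytes for f in flows),
        "packets": sum(f.n_packets for f in flows),
        "flows": len(flows),
        "duration_avg": sum(durations) / len(durations),
        "duration_max": max(durations),
        "duration_min": min(durations),
        # Assumption: share measured by number of flows per protocol.
        "protocol_share": {p: c / len(flows) for p, c in per_protocol.items()},
    }


stats = summarize([
    Flow("HTTP", 1000, 10, 1.0),
    Flow("HTTP", 500, 5, 3.0),
    Flow("UDP", 200, 2, 2.0),
    Flow("TCP", 300, 3, 2.0),
])
```

The `protocol_share` mapping is the natural input for a pie chart, while the per-time-node totals feed bar and line charts.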
Step 3: PCAP traffic sample data storage. Building on the further division of traffic samples by dimension in Step 2, the sample traffic is counted along the time dimension and the counts are stored in a Redis database. Redis is an open-source, log-type key-value database written in ANSI C; it supports networking, can run in memory or with persistence, and provides APIs for many languages.
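One way to picture the time-dimension counting is a hash of counters per minute. The sketch below uses an in-memory dictionary as a stand-in for Redis so that it runs without a server. The key scheme `stats:<YYYYmmddHHMM>` and the field names are hypothetical; against a real Redis instance the same increments would map naturally onto the `HINCRBY` command.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict


def minute_key(ts: float) -> str:
    """Build a per-minute key from an epoch timestamp (hypothetical scheme)."""
    t = datetime.fromtimestamp(ts, tz=timezone.utc)
    return "stats:" + t.strftime("%Y%m%d%H%M")


class MinuteCounters:
    """In-memory stand-in for Redis hashes keyed by minute."""

    def __init__(self) -> None:
        self._hashes: Dict[str, Dict[str, int]] = defaultdict(
            lambda: defaultdict(int))

    def record(self, ts: float, n_bytes: int, n_packets: int) -> None:
        # Equivalent in spirit to two HINCRBY calls on the minute's hash.
        counters = self._hashes[minute_key(ts)]
        counters["bytes"] += n_bytes
        counters["packets"] += n_packets

    def get(self, ts: float) -> Dict[str, int]:
        return dict(self._hashes.get(minute_key(ts), {}))


counters = MinuteCounters()
counters.record(1612137600, 100, 1)   # 2021-02-01 00:00:00 UTC
counters.record(1612137659, 50, 2)    # same minute
counters.record(1612137660, 7, 1)     # next minute
```

Keeping one small hash per minute makes coarser views (hour, day) a matter of summing a bounded number of keys.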
Because Redis is memory-based, its data is lost on power failure. We therefore enable both the AOF and RDB persistence methods. In this case, when Redis restarts it preferentially loads the AOF file to restore the original data, because the AOF file generally holds a more complete data set than the RDB file.
Redis offers strong real-time performance, but data persistence requires a further step: a dual-guarantee storage structure consisting of a MySQL master-slave cluster and an HDFS high-availability cluster is adopted, and the data is persisted to both places simultaneously.
Step 4: Sample traffic playback. The purpose of traffic playback is to prepare for testing firewalls, NIDS, and other network devices. The TCPREPLAY technology is used to play back the sample traffic. Network traffic can be replayed at the packet speed at which the sample traffic was captured or at a specified speed, and packets can be sent through a designated port of a dedicated network card at a specific time, giving a controllable simulation of live network traffic. During playback, the order of the sent packets is strictly kept consistent with the order of the real traffic packets at capture time; statistics such as the number of packets played back, the playback time, and the current playback rate are fed back in real time; and the played-back packets can be dynamically modified according to MAC address.
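The two replay modes (captured speed or a specified speed) can be sketched as building a tcpreplay command line. The helper below is illustrative only: the function name is hypothetical, and it only constructs the argument list without running it, since actual replay needs the tcpreplay tool, a real interface, and suitable privileges. It uses the `--intf1`, `--multiplier`, and `--mbps` options of tcpreplay; MAC rewriting and live statistics feedback are not covered here.

```python
from typing import List, Optional


def tcpreplay_command(pcap: str, interface: str,
                      multiplier: Optional[float] = None,
                      mbps: Optional[float] = None) -> List[str]:
    """Build a tcpreplay argument list for one capture file.

    With no speed option, tcpreplay replays at the captured packet timing.
    """
    if multiplier is not None and mbps is not None:
        raise ValueError("choose either multiplier or mbps, not both")
    cmd = ["tcpreplay", f"--intf1={interface}"]
    if multiplier is not None:
        cmd.append(f"--multiplier={multiplier}")  # scale the captured speed
    elif mbps is not None:
        cmd.append(f"--mbps={mbps}")              # fixed rate in Mbps
    cmd.append(pcap)
    return cmd


as_captured = tcpreplay_command("sample.pcap", "eth0")
at_fixed_rate = tcpreplay_command("sample.pcap", "eth0", mbps=100)
```

The resulting list can be handed to `subprocess.run` on a host where tcpreplay is installed; keeping command construction separate from execution makes the speed selection easy to check in isolation.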
Based on the same inventive concept, another embodiment of the present invention provides a system for implementing distributed traffic collection and analysis, including:
the sample flow capturing module is used for collecting network flow samples;
the sample traffic marking module is used for adding index labels with different dimensions to the acquired network traffic samples;
the sample traffic retrieval module is used for storing the acquired network traffic samples into an Elasticsearch distributed search engine and retrieving the network traffic samples according to different dimensions;
the sample flow counting module is used for counting network flow samples and storing the network flow samples into a Redis database;
and the sample flow playback module is used for playing back the network flow sample.
Based on the same inventive concept, another embodiment of the present invention provides an electronic device (computer, server, smartphone, etc.) comprising a memory storing a computer program configured to be executed by the processor, and a processor, the computer program comprising instructions for performing the steps of the inventive method.
Based on the same inventive concept, another embodiment of the present invention provides a computer-readable storage medium (e.g., ROM/RAM, magnetic disk, optical disk) storing a computer program, which when executed by a computer, performs the steps of the inventive method.
Parts of the invention not described in detail are well known to the person skilled in the art.
The foregoing disclosure of the specific embodiments of the present invention and the accompanying drawings is directed to an understanding of the present invention and its implementation, and it will be appreciated by those skilled in the art that various alternatives, modifications, and variations may be made without departing from the spirit and scope of the invention. The present invention should not be limited to the disclosure of the embodiments and drawings in the specification, and the scope of the present invention is defined by the scope of the claims.

Claims (10)

1. A method for realizing distributed traffic collection and analysis, characterized by comprising the following steps: collecting network traffic samples and adding index labels of different dimensions to the network traffic samples; storing the collected network traffic samples in an Elasticsearch distributed search engine and retrieving the network traffic samples according to the different dimensions; computing statistics on the network traffic samples and storing them in a Redis database; and playing back the network traffic samples.
2. The method according to claim 1, wherein the different dimensions include: a region dimension, a bandwidth dimension, a time dimension, a code address dimension, a protocol dimension, a keyword dimension, a length range dimension, and a traffic length range dimension.
3. The method according to claim 1, wherein the Elasticsearch distributed search engine shards the index; when an index is created, its number of shards must be specified, and the shards are divided into primary shards and replica shards; when a document is stored, Elasticsearch computes the corresponding primary shard, stores the document there, and then synchronizes it to that shard's replica shards; a replica shard is not only a redundant copy of the primary shard but can also serve queries and computations to share the load of its primary shard.
4. The method according to claim 1, wherein statistics are computed on the network traffic samples, the statistical values including the number of bytes, the number of packets, the number of flows, the average duration, the maximum duration, and the minimum duration; and the share of each protocol is computed and the sample traffic statistics are displayed as bar charts, pie charts, and line charts so that users can understand the retrieval results easily.
5. The method according to claim 1, wherein, for the data stored in the Redis database, a dual-guarantee storage structure of a MySQL master-slave cluster and an HDFS high-availability cluster is used for persistent data storage.
6. The method according to claim 1, wherein playing back the network traffic samples comprises replaying the network traffic using the TCPREPLAY technology.
7. The method according to claim 1, wherein playing back the network traffic samples supports replaying the network traffic at the packet speed at which the sample traffic was captured or at a specified speed; during playback, the order of the sent packets is strictly kept consistent with the order of the real traffic packets at capture time; real-time feedback of the number of packets played back, the playback time, and the current playback rate is supported during playback; and dynamic modification of the played-back packets according to MAC address is supported during playback.
8. A system for realizing distributed traffic collection and analysis using the method according to any one of claims 1 to 7, characterized by comprising: a sample traffic capture module for collecting network traffic samples; a sample traffic marking module for adding index labels of different dimensions to the collected network traffic samples; a sample traffic retrieval module for storing the collected network traffic samples in an Elasticsearch distributed search engine and retrieving the network traffic samples according to the different dimensions; a sample traffic statistics module for computing statistics on the network traffic samples and storing them in a Redis database; and a sample traffic playback module for playing back the network traffic samples.
9. An electronic device, characterized by comprising a memory and a processor, wherein the memory stores a computer program configured to be executed by the processor, and the computer program includes instructions for performing the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a computer, implements the method according to any one of claims 1 to 7.
CN202110138388.1A 2021-02-01 2021-02-01 Method and system for realizing distributed flow acquisition and analysis Pending CN113162818A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110138388.1A CN113162818A (en) 2021-02-01 2021-02-01 Method and system for realizing distributed flow acquisition and analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110138388.1A CN113162818A (en) 2021-02-01 2021-02-01 Method and system for realizing distributed flow acquisition and analysis

Publications (1)

Publication Number Publication Date
CN113162818A true CN113162818A (en) 2021-07-23

Family

ID=76879079

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110138388.1A Pending CN113162818A (en) 2021-02-01 2021-02-01 Method and system for realizing distributed flow acquisition and analysis

Country Status (1)

Country Link
CN (1) CN113162818A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113794719A (en) * 2021-09-14 2021-12-14 中国工商银行股份有限公司 Network abnormal traffic analysis method and device based on Elasticsearch technology and electronic equipment
CN114124470A (en) * 2021-11-01 2022-03-01 山东顺国电子科技有限公司 Network flow metadata acquisition technical algorithm
CN114138810A (en) * 2022-01-27 2022-03-04 中国民航信息网络股份有限公司 Access flow statistical method and system
CN114741467A (en) * 2022-03-07 2022-07-12 福建升腾资讯有限公司 A method and system for full-text retrieval
CN115278747A (en) * 2022-08-26 2022-11-01 恒为科技(上海)股份有限公司 A data processing method, device and storage medium
CN115987836A (en) * 2022-12-28 2023-04-18 上海天旦网络科技发展有限公司 Hierarchically scalable cloud network flow analysis method and system
CN116126855A (en) * 2022-12-28 2023-05-16 北京结慧科技有限公司 Multi-field data storage method, electronic equipment and storage medium
CN116170352A (en) * 2023-02-01 2023-05-26 北京首都在线科技股份有限公司 Network traffic processing method and device, electronic equipment and storage medium
CN116319488A (en) * 2023-05-22 2023-06-23 神州灵云(北京)科技有限公司 Method, device and storage medium for cyclic test by using pcap data packet

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103188112A (en) * 2011-12-28 2013-07-03 阿里巴巴集团控股有限公司 Network flow detection method and network flow detection device
CN109150859A (en) * 2018-08-02 2019-01-04 北京北信源信息安全技术有限公司 A kind of Botnet detection method flowing to similitude based on network flow
CN109614401A (en) * 2018-12-06 2019-04-12 航天恒星科技有限公司 Network transmission data storage system based on ElasticSearch and Hbase technology
CN110489211A (en) * 2019-08-16 2019-11-22 杭州安恒信息技术股份有限公司 Back method and device based on filter Driver on FSD frame
CN110889023A (en) * 2019-11-20 2020-03-17 河海大学常州校区 A distributed multifunctional search engine for elasticsearch
US20200341854A1 (en) * 2019-04-26 2020-10-29 EMC IP Holding Company LLC Efficient Method to Find Changed Data between Indexed Data and New Backup
US20200387357A1 (en) * 2017-12-05 2020-12-10 Agile Stacks Inc. Machine generated automation code for software development and infrastructure operations
CN112100197A (en) * 2020-07-31 2020-12-18 紫光云(南京)数字技术有限公司 Quasi-real-time log data analysis and statistics method based on Elasticsearch

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113794719A (en) * 2021-09-14 2021-12-14 中国工商银行股份有限公司 Network abnormal traffic analysis method and device based on Elasticsearch technology and electronic equipment
CN113794719B (en) * 2021-09-14 2023-07-25 中国工商银行股份有限公司 Network abnormal traffic analysis method and device based on Elasticsearch technology and electronic equipment
CN114124470A (en) * 2021-11-01 2022-03-01 山东顺国电子科技有限公司 Network flow metadata acquisition technical algorithm
CN114138810A (en) * 2022-01-27 2022-03-04 中国民航信息网络股份有限公司 Access flow statistical method and system
CN114138810B (en) * 2022-01-27 2022-04-12 中国民航信息网络股份有限公司 Access flow statistical method and system
CN114741467A (en) * 2022-03-07 2022-07-12 福建升腾资讯有限公司 A method and system for full-text retrieval
CN115278747A (en) * 2022-08-26 2022-11-01 恒为科技(上海)股份有限公司 A data processing method, device and storage medium
CN115987836A (en) * 2022-12-28 2023-04-18 上海天旦网络科技发展有限公司 Hierarchically scalable cloud network flow analysis method and system
CN116126855A (en) * 2022-12-28 2023-05-16 北京结慧科技有限公司 Multi-field data storage method, electronic equipment and storage medium
CN116170352A (en) * 2023-02-01 2023-05-26 北京首都在线科技股份有限公司 Network traffic processing method and device, electronic equipment and storage medium
CN116319488A (en) * 2023-05-22 2023-06-23 神州灵云(北京)科技有限公司 Method, device and storage medium for cyclic test by using pcap data packet
CN116319488B (en) * 2023-05-22 2023-08-11 神州灵云(北京)科技有限公司 Method, device and storage medium for cyclic test by using pcap data packet

Similar Documents

Publication Publication Date Title
CN113162818A (en) Method and system for realizing distributed flow acquisition and analysis
US12355645B2 (en) Aggregation of select network traffic statistics
US11681678B2 (en) Fast circular database
US9565076B2 (en) Distributed network traffic data collection and storage
US9210090B1 (en) Efficient storage and flexible retrieval of full packets captured from network traffic
JP6490059B2 (en) Method for processing data, tangible machine readable recordable storage medium and device, and method for querying features extracted from a data record, tangible machine readable recordable storage medium and device
US8666985B2 (en) Hardware accelerated application-based pattern matching for real time classification and recording of network traffic
US12362985B1 (en) Enhanced simple network management protocol (SNMP) connector
Taherimonfared et al. Real-time handling of network monitoring data using a data-intensive framework
Elsen et al. goProbe: a scalable distributed network monitoring solution
Touloupas et al. RASP: Real-time network analytics with distributed NoSQL stream processing
CN120342857B (en) Link configuration processing method and device of software system
CN121125249A (en) Application monitoring method, system, electronic device and storage medium
Wullink et al. ENTRADA: a High-Performance Network Traffic Data Stream
Gunnarsson qflow: a fast customer-oriented NetFlow database for accounting and data retention
JP2013062627A (en) Network information storage device, method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210723
