CN1151635C - General dispatching system based on content adaptive for colony network service - Google Patents
- Publication number
- CN1151635C (application CN02138770A; also listed as CNB021387702A)
- Authority
- CN
- China
- Prior art keywords
- module
- message
- service
- information
- handshake
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Data Exchanges In Wide-Area Networks (AREA)
- Computer And Data Communications (AREA)
Abstract
The invention is a general-purpose, content-based scheduling system for cluster network services. The core of the system has two parts: the content-based scheduler CADS, installed on the front-end machine, and the CANS module, installed on the node machines. Its workflow and methods use pseudo-server technology, piggybacking, interception and spoofing of the three-way handshake, and general-purpose scheduling. Compared with existing scheduling systems, it supports more concurrent users, raises the memory cache hit rate while also raising the utilization of cluster storage devices, and offers good generality, good scalability, and high system throughput.
Description
Technical field
The invention belongs to the technical field of cluster network servers and is a general-purpose, content-based scheduling system for cluster network services.
Background
With the rapid growth of the Internet, the network has become an indispensable tool for everyday study and work. Both existing portal sites and enterprise e-commerce platforms built on the Web Service architecture depend on powerful Web servers. Web server clusters are increasingly favored for their low price, good reliability, high throughput, and abundant system resources. To make better use of a cluster's abundant resources and scalability, a cluster network service scheduling system that balances the distribution of requests is therefore key to the performance of the cluster system.
In a cluster network service scheduling system, several physical servers jointly serve user requests: when a client request reaches the cluster network server, a dispatcher directs it to one real physical server. This both supports a single system image and improves the scalability of the system.
In existing cluster network service scheduling systems, the dispatcher does not consider the content of the request message when forwarding it to a back-end node machine. After a service request reaches the dispatcher, all back-end node machines are therefore treated alike, apart from load considerations. This leads to low utilization of the cluster's storage devices, underused main memory, an inability to satisfy network services that need content-based scheduling policies, and multiple copies of the content. In response to these shortcomings, content-based scheduling systems have become a research focus. Existing content-based scheduling systems include ArrowPoint's web switch, Apache, Squid, and KTCPVS. The first three are implemented in user space and carry a large processing overhead. KTCPVS is implemented in the operating system kernel, which avoids the overhead of switching between user space and kernel space and of memory copying (see Computer Engineering and Science, 2001, Vol. 23, No. 3). However, it is based on a network address translation (NAT) structure, shown schematically in Figure 1.
In that structure, the front-end machine forwards each client request to a node machine chosen according to the request's content. The node machine serves the request and sends the response packets back to the front-end machine, which returns them to the client. This structure removes some of the weaknesses of the existing scheduling systems described above (underused main memory, low utilization of cluster storage devices, unsatisfied network services that need content-based scheduling policies, and multiple copies). However, the front-end machine must both schedule requests and forward response packets to clients, so its load is heavy. As node machines are added, the load on the front-end machine grows until it becomes the bottleneck of the whole cluster, which reduces scalability and the number of concurrent clients. In addition, KTCPVS supports only the HTTP service and is not general-purpose.
Summary of the invention
The purpose of the invention is to address the shortcomings of existing scheduling systems by proposing a general-purpose, content-based scheduling system for cluster network services. In this system the node machines send response packets directly to the client without passing through the front-end machine, which greatly reduces the load on the front-end machine and increases the number of concurrent clients. The system also supports multiple services rather than HTTP alone, and further services can be added easily, so it is highly general.
The general-purpose, content-based scheduling system for cluster network services provided by the invention comprises a front-end machine and subnets, and is characterized in that: the front-end machine is configured with a policy customization module, a pseudo-server, and a load scheduler; each subnet consists of multiple node machines configured with a TCP protocol stack modification module and service routines; and the subnets together form a server pool;
The policy customization module lets the user customize a policy and passes that policy to the load scheduler; it comprises a policy definition user interface, a policy verification module, a policy library, and a sending kernel module;
The policy definition user interface lets the user define the policies that the cluster system permits and hands each user-defined policy to the policy verification module for verification;
The policy library stores the policy rules used by the policy verification module;
The policy verification module matches the policy received from the policy definition user interface against the policy library; if the policy does not conform, it performs error handling and reports the error to the user; otherwise it hands the policy to the sending kernel module;
The sending kernel module passes each policy that has passed verification to the load scheduler through a system call;
The pseudo-server performs the three-way handshake with the client that initiates a request and passes the resulting information to the load scheduler; it comprises an initialization module, a listening port service module, a service type library, a handshake information saving module, and a handshake information table;
The initialization module initializes the network service linked list, reads the service types and port numbers to be monitored from the service type library, and hands them to the listening port service module, which listens for network packets sent to those ports;
The listening port service module receives the service types and port numbers from the initialization module and listens for network packets on those ports; when the three-way handshake completes, control passes to the handshake information saving module;
The service type library stores the service types supported by the cluster system and the port number of each service, for use by the initialization module;
The handshake information saving module saves the handshake information to the handshake information table once the three-way handshake has completed;
The handshake information table stores the three-way handshake information and provides it to the message piggybacking module in the load scheduler;
The load scheduler dispatches request packets from clients to different node machines in different subnets according to their content, and forwards the handshake information intercepted by the pseudo-server to the node machines; it comprises a message receiving module, a general scheduler, an event receiving module, a policy library, a message piggybacking module, a schedule table, and a forwarding module;
The message receiving module receives network packets and checks whether the IP address of each packet is the virtual IP; if not, the packet is left to the original TCP/IP protocol stack; otherwise it enters the general scheduler;
The event receiving module receives policy information from the sending kernel module and stores it in the policy library;
The policy library stores policy information for use by the general scheduler;
The schedule table stores the request content and the information on the destination node machine it was scheduled to, for use by the general scheduler;
The general scheduler decides which node machine to dispatch to, based on the message analysis data from the message receiving module and the information in the policy library and the schedule table, and passes that decision to the message piggybacking module;
The message piggybacking module extracts the request content from the scheduling information passed by the general scheduler and passes it, together with the handshake information from the handshake information table, to the forwarding module;
The forwarding module forwards the packet from the message piggybacking module over the internal high-speed network to the TCP protocol stack modification module of the designated node machine;
The TCP protocol stack modification module intercepts the network packets sent by the load scheduler, extracts the handshake information, restores the original message, establishes a pseudo-connection with the client, and sends the response message to the client over that pseudo-connection.
The system combines pseudo-server technology, piggybacking, interception and spoofing of the three-way handshake, and general-purpose scheduling. It adopts a direct routing structure: the node machines send response packets directly to the client without passing through the front-end machine, which markedly reduces the load on the front-end machine and increases the number of concurrent clients. The system also supports multiple services rather than HTTP alone, and further services can be added easily, so it is highly general. Compared with existing scheduling systems, the invention supports more concurrent users, raises the memory cache hit rate while also raising the utilization of cluster storage devices, and offers good generality, good scalability, and high system throughput. Specifically, the invention has the following features.
1) Dynamic scalability
The system uses direct routing to send response packets to the client, rather than returning them through the front-end machine as network address translation does. This lightens the load on the front-end machine, raises the throughput of the whole system, allows more concurrent users, and improves the dynamic scalability of the system.
2) Support for dynamic and static page requests
The system uses content-based hybrid scheduling to support mixed workloads of dynamic and static page requests, making better use of the main-memory cache of the node machines and achieving greater processing capacity.
3) Support for persistent service
The system forwards packets directly to the destination node machine at the IP layer according to locally maintained connection records. This preserves the integrity of the service and thus supports multiple requests over a single connection.
4) Support for any TCP-based service
The system uses a general scheduler that supports multiple types of network service, and more service parsers can easily be added to support further types. The system can support any TCP-based service. Different types of service are scheduled to different subnets, which makes it easier for users to manage and expand the cluster.
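The extensibility described in feature 4 can be pictured with a minimal Python sketch of a parser registry. It is only an illustration under stated assumptions: the names `SERVICE_ROUTINES`, `service_routine`, and `identify` are hypothetical, and selecting the routine by destination port alone is a simplification, since the patent's protocol identification module also analyzes the packet content.

```python
from typing import Callable, Dict, Optional

# Hypothetical registry: destination port -> routine that extracts request content.
SERVICE_ROUTINES: Dict[int, Callable[[bytes], Optional[str]]] = {}


def service_routine(port: int):
    """Decorator that registers a content-extraction routine for one service port."""
    def register(func: Callable[[bytes], Optional[str]]):
        SERVICE_ROUTINES[port] = func
        return func
    return register


@service_routine(80)
def http_routine(payload: bytes) -> Optional[str]:
    """Return the path from an HTTP request line such as b'GET /a.html HTTP/1.1'."""
    request_line = payload.split(b"\r\n", 1)[0].decode("latin-1")
    parts = request_line.split(" ")
    return parts[1] if len(parts) == 3 else None


def identify(port: int, payload: bytes) -> Optional[str]:
    """Pick a service routine by port (an assumption) and return the request content."""
    routine = SERVICE_ROUTINES.get(port)
    return routine(payload) if routine else None
```

Under this sketch, supporting another TCP service would amount to registering one more routine; the scheduling step that consumes the extracted content would stay unchanged.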
Description of drawings
Figure 1 is a schematic diagram of the structure of a prior-art content scheduling system based on network address translation;
Figure 2 is a schematic diagram of the structure of the general-purpose, content-based scheduling system for cluster network services of the invention;
Figure 3 is a schematic diagram of the structure of the policy customization module;
Figure 4 is a schematic diagram of the structure of the pseudo-server;
Figure 5 is a schematic diagram of the structure of the load scheduler (CADS) in the front-end machine;
Figure 6 is a schematic diagram of the structure of the general scheduler in Figure 5;
Figure 7 is a schematic diagram of the structure of the TCP protocol stack modification module (CANS) on the node machine;
Figure 8 is a schematic diagram of the workflow of the system of the invention.
Detailed description
The invention is described in further detail below with reference to the drawings.
In terms of working principle, the core of the system has two parts: the content-based scheduler CADS, installed on the front-end machine, and the CANS module, installed on the node machines. As shown in Figure 2, the architecture of the general-purpose, content-based scheduling system for cluster network services (hereinafter CAVS, Content Aware Virtual Server) has two layers. The first layer is the front-end machine, which runs the content-based scheduling system (hereinafter CADS, Content Aware Dispatcher System) and the pseudo-server. The second layer is the server pool, which consists of many subnets, each providing a specific service such as the HTTP service or the proxy service. The part of the content scheduling system implemented on the node machines (hereinafter CANS, Content Aware Node server System) also belongs to this layer. The node machines in the cluster are connected by a high-speed network.
CADS dispatches the request packets sent by clients to different node machines in different subnets according to their content, so that requests for the same content are served by the same node machine. When a new request arrives, it is assigned to the corresponding node machine according to its content, and the requested content can be read directly from the cache rather than from the hard disk. This gives the whole system higher throughput, faster response, and lower latency.
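The cache-affinity idea in this paragraph can be sketched in a few lines of Python. This is a minimal illustration, not the patent's algorithm: the class name `ContentDispatcher`, its fields, and the least-assigned tie-break used for first-time requests are all assumptions, since the text does not specify how a node is chosen for content that has not been scheduled before.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ContentDispatcher:
    # Policy library (assumed shape): service type -> node machines that may serve it.
    policy: Dict[str, List[str]]
    # Schedule table (assumed shape): (service, content) -> node chosen earlier.
    schedule_table: Dict[Tuple[str, str], str] = field(default_factory=dict)
    # Count of distinct contents assigned to each node, used as a stand-in for load.
    assigned: Dict[str, int] = field(default_factory=dict)

    def dispatch(self, service: str, content: str) -> str:
        """Return the node for this request, reusing an earlier decision if one exists."""
        key = (service, content)
        node = self.schedule_table.get(key)
        if node is None:
            candidates = self.policy[service]
            # Assumed rule: pick the first node with the fewest assigned contents.
            node = min(candidates, key=lambda n: self.assigned.get(n, 0))
            self.schedule_table[key] = node
            self.assigned[node] = self.assigned.get(node, 0) + 1
        return node
```

Because repeated requests for the same content always resolve to the same node in this sketch, that node's memory cache is the one most likely to already hold the content.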
CADS also forwards the intercepted handshake information to the node machine. Using this information, the node machine replays a spoofed three-way handshake through CANS, after which it holds a pseudo-connection with the client. CANS sends the response directly to the client over the pseudo-connection instead of returning it through the front-end machine, which reduces the load on the front-end machine, supports more concurrent users, and reduces latency.
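One way to picture the piggybacking step is as a fixed header of handshake fields placed in front of the original request, which the node machine strips off again. The Python sketch below is an assumption-laden illustration: the header layout, the function names `piggyback` and `unpack`, and the inclusion of initial sequence numbers are not taken from the patent, which names only the addresses and ports among the handshake fields.

```python
import socket
import struct

# Assumed header layout, network byte order: client IP, client port, virtual IP,
# service port, client initial sequence number, pseudo-server initial sequence
# number, request length. The sequence-number fields are an assumption.
HEADER = struct.Struct("!4sH4sHIIH")


def piggyback(client_ip: str, client_port: int, virtual_ip: str, service_port: int,
              client_isn: int, server_isn: int, request: bytes) -> bytes:
    """Front-end side: prepend the handshake information to the request payload."""
    header = HEADER.pack(socket.inet_aton(client_ip), client_port,
                         socket.inet_aton(virtual_ip), service_port,
                         client_isn, server_isn, len(request))
    return header + request


def unpack(packet: bytes):
    """Node side: split a piggybacked packet into handshake info and the request."""
    c_ip, c_port, v_ip, s_port, c_isn, s_isn, length = HEADER.unpack_from(packet)
    handshake = {
        "client": (socket.inet_ntoa(c_ip), c_port),
        "virtual": (socket.inet_ntoa(v_ip), s_port),
        "client_isn": c_isn,
        "server_isn": s_isn,
    }
    request = packet[HEADER.size:HEADER.size + length]
    return handshake, request
```

With the handshake fields recovered, a node-side module would have what it needs to rebuild connection state before handing the restored request to the real service routine; that kernel-level step is outside the scope of this sketch.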
Under this architecture, when all server node machines are overloaded, the administrator can quickly add new server node machines to handle requests without copying Web documents and the like to their local hard disks, so dynamic scalability is good.
The functions of each part are described below:
Policy customization module
This module lets users customize their own policy (which must conform to the policy library) and passes it to the load scheduler in the kernel. The load scheduler then operates according to that policy, producing the behavior the user requires.
As shown in Figure 3, the module uses the policy definition user interface 20 to define a policy that conforms to the policy library, and uses the sending kernel module 23 to pass the user-defined policy to the load scheduler in the kernel, where the event receiving module 4 handles it. The functions and relationships of its parts are as follows:
1) Policy definition user interface 20: lets the user define the policies the cluster system permits, such as the system's virtual IP, which services the system supports, and which node machines are assigned to each service. It hands the user-defined policy to the policy verification module 21 for verification.
2) Policy verification module 21: matches the policy received from the policy definition user interface 20 against the policy library 22 to determine whether it conforms. If it does not, the module performs error handling and reports the error to the user; otherwise it hands the policy to the sending kernel module 23.
3) Policy library 22: stores the policy rules against which the policy verification module 21 checks user-defined rules.
4) Sending kernel module 23: passes each policy that has passed verification by the policy verification module 21 to the kernel through a system call, where the event receiving module 4 handles it.
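The verify-then-send flow of modules 20 to 23 can be illustrated with a small Python sketch of the verification step alone. The policy shape (a `virtual_ip` string plus a `services` mapping), the rule set `ALLOWED_SERVICES`, and the function name `validate_policy` are hypothetical; the patent does not define the format of policies or rules.

```python
import ipaddress
from typing import List

# Hypothetical rule library: the service types the cluster permits.
ALLOWED_SERVICES = {"http", "corba", "proxy"}


def validate_policy(policy: dict) -> List[str]:
    """Return a list of error messages; an empty list means the policy conforms."""
    errors: List[str] = []
    try:
        ipaddress.ip_address(policy.get("virtual_ip", ""))
    except ValueError:
        errors.append("virtual_ip is not a valid IP address")
    services = policy.get("services", {})
    if not services:
        errors.append("no services defined")
    for name, nodes in services.items():
        if name not in ALLOWED_SERVICES:
            errors.append(f"unsupported service: {name}")
        if not nodes:
            errors.append(f"service {name} has no node machines")
    return errors
```

In this sketch a non-empty result corresponds to the error-reporting branch, and an empty result is the point at which a verified policy would be handed on toward the kernel.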
Pseudo-server
The pseudo-server performs the three-way handshake with the client that initiates a request. The concept of a pseudo-server is defined in contrast to a real network service program (such as the Apache Web server).
A real network service program first accepts client requests on a standard port. After completing the three-way handshake with the client, it receives the client's service request and returns the corresponding content to the user, then negotiates with the client through a modified three-way handshake to close the connection. Network service programs are carefully designed to serve many connection requests at once in order to increase service capacity.
The pseudo-server differs from a real network service program in concept and operation in two respects:
(1) The pseudo-server's only function is to carry out the three-way handshake with the client. Once the handshake is complete, the pseudo-server terminates the connection.
(2) The pseudo-server can support multiple network services at the same time. Because it only listens on ports and establishes connections, it does no further processing of the request. The pseudo-server is therefore independent of the specific content and type of service, which lets it act as a general mechanism that serves several standard service ports at once. When the cluster server needs to provide multiple network services, there is thus no need to install multiple network service programs on the dispatcher.
From these two points it follows that the pseudo-server design reduces the load on the dispatcher and the complexity of the implementation (no service programs need to be installed, so their overhead is avoided), and also gives the cluster server generality. With only a pseudo-server installed on the dispatcher, the system can complete the three-way handshake and capture the handshake information for many kinds of network service.
The structure of the pseudo-server is shown in Figure 4. The initialization module 11 first performs initialization and reads the service types and ports from the service type library, handing them to the listening port service module for monitoring. After the three-way handshake completes, the handshake information saving module saves the handshake information in the handshake information table, from which it is passed to the message piggybacking module 6 in the load scheduler. The functions and relationships of its parts are as follows:
1) Initialization module 11: initializes the network service linked list, reads the service types and port numbers to be monitored from the service type library 13, and hands them to the listening port service module 12, which listens for network packets sent to those ports.
2) Listening port service module 12: receives the service types and port numbers from the initialization module 11 and listens for network packets on those ports. When the three-way handshake completes, control passes to the handshake information saving module 14, which saves the handshake information.
3) Service type library 13: stores the service types supported by the cluster system and the port number of each service.
4) Handshake information saving module 14: saves the handshake information to the handshake information table 10 once the three-way handshake has completed, so that the message piggybacking module 6 can retrieve it from the table.
5) Handshake information table 10: stores three-way handshake information such as the source IP address, source port number, destination IP address, and destination port number of the current IP packet.
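The handshake information table can be thought of as a lookup keyed by connection. The Python sketch below is one hypothetical shape for it: the names `Handshake` and `HandshakeTable`, the choice of the four address fields as the key, and the decision to remove an entry when it is taken are all assumptions made for illustration.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Handshake(NamedTuple):
    """The four fields the text lists for a completed handshake."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


ConnKey = Tuple[str, int, str, int]


class HandshakeTable:
    """In-memory table filled by the saving module and read by the piggybacking module."""

    def __init__(self) -> None:
        self._entries: Dict[ConnKey, Handshake] = {}

    def save(self, handshake: Handshake) -> None:
        # Called once the three-way handshake with the client has completed.
        self._entries[tuple(handshake)] = handshake

    def take(self, src_ip: str, src_port: int,
             dst_ip: str, dst_port: int) -> Optional[Handshake]:
        # Assumed behavior: an entry is removed once it has been piggybacked.
        return self._entries.pop((src_ip, src_port, dst_ip, dst_port), None)
```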
负载调度器(CADS)Load Scheduler (CADS)
CADS用于把客户发来的请求包和截获的握手信息根据其请求包的内容不同分别将其调度到不同子网的不同节点机上。其结构如图5、图6所示,报文接收模块1接收网络包,并进行过滤得到所需的网络包,交由通用调度器3。通用调度器中的协议识别模块24把传来的网络包进行协议识别,根据不同的服务请求调用不同的服务例程(如http服务例程25-1,corba服务例程25-2,代理服务例程25-n等),得到相应的请求内容后调用通用调度系统26进行统一处理;同时,事件接收模块4从发送内核模块23(见图3)中接收已定制的策略保存在策略库5中。通用调度器3根据报文分析数据、策略库和调度表信息决定调度到某节点机上,并经过报文捎带模块6、转发模块9将报文传给CANS的报文过滤模块16(见图7)。其功能与相互关系描述如下:CADS is used to dispatch the request packet sent by the client and the intercepted handshake information to different node machines in different subnets according to the content of the request packet. Its structure is shown in FIG. 5 and FIG. 6 . The message receiving module 1 receives network packets, and filters them to obtain the required network packets, and delivers them to the general scheduler 3 . The protocol recognition module 24 in the universal dispatcher carries out protocol recognition to the incoming network packet, and calls different service routines (such as http service routine 25-1, corba service routine 25-2, proxy service routine 25-1, etc.) Routine 25-n etc.), after obtaining corresponding request content, call general scheduling system 26 and carry out unified processing; Simultaneously, event receiving module 4 receives the customized strategy from sending kernel module 23 (seeing Fig. 3) and saves in strategy storehouse 5 middle. Universal scheduler 3 decides to schedule on a certain node machine according to message analysis data, policy storehouse and scheduling table information, and passes message to the message filter module 16 of CANS through message piggyback module 6, forwarding module 9 (seeing Fig. 7 ). Its functions and interrelationships are described as follows:
1)报文接收模块1:接收到网络包,它判断该网络包的IP是否为虚拟IP。如果不是,则交由原TCP/IP协议栈自行处理;否则,进入通用调度器3中的协议识别模块24。1) Message receiving module 1: upon receiving a network packet, it judges whether the IP of the network packet is a virtual IP. If not, it will be handled by the original TCP/IP protocol stack; otherwise, it will enter the protocol identification module 24 in the general scheduler 3 .
2)通用调度器3:它查找调度表7信息判断是否已调度过该请求,如果有,则直接从调度表中取出调度信息;否则,根据策略库5中相应的策略和协议识别模块24的处理结果,便可得到调度信息即调度到哪个子网的哪台节点机上等并保存在调度表7中。将调度信息传给报文捎带模块6。2) general scheduler 3: it searches the scheduling table 7 information to judge whether the request has been scheduled, and if so, directly takes out the scheduling information from the scheduling table; As a result of the processing, scheduling information, that is, which node machine in which subnet to schedule to, can be obtained and stored in the scheduling table 7 . Send the scheduling information to the message piggybacking module 6 .
通用调度器是本系统核心模块。它对所有类型的服务统一处理,使得系统更方便的支持更多类型的服务。目前,本系统支持http服务,corba服务,代理服务。如果需要支持其它服务,我们只需在协议识别模块24加入相应的服务的识别,如图中列举的http服务例程25-1、Corba服务例程25-2,直至代理服务例程25-n。并增加相应的服务例程对该服务进行处理,最后便由通用调度模块26进行统一调度,一种服务类型便可以得到支持。其功能与相互关系描述如下:The general scheduler is the core module of this system. It handles all types of services uniformly, making the system more convenient to support more types of services. At present, the system supports http service, corba service and proxy service. If other services need to be supported, we only need to add the identification of corresponding services in the protocol identification module 24, such as the http service routine 25-1, Corba service routine 25-2 listed in the figure, until the proxy service routine 25-n . A corresponding service routine is added to process the service, and finally the general scheduling module 26 performs unified scheduling, and one type of service can be supported. Its functions and interrelationships are described as follows:
2.1) Protocol identification module 24: Analyzes the content of the network packets received by message receiving module 1 to obtain the service type and port number the client requires, determines the requested service, and calls the corresponding service routine (HTTP service routine 25-1, CORBA service routine 25-2, proxy service routine 25-n, and so on).
2.2) HTTP service routine 25-1: Processes HTTP requests. It first parses the HTTP protocol header to obtain the content of the service request, then hands that content to general scheduling module 26 for unified scheduling.
2.3) CORBA service routine 25-2: Processes CORBA requests. It first parses the IIOP protocol header to obtain the content of the service request, then hands that content to general scheduling module 26 for unified scheduling.
2.4) Proxy service routine 25-n: Processes proxy requests. It first parses the proxy protocol header to obtain the content of the service request, then hands that content to general scheduling module 26 for unified scheduling.
2.5) General scheduling module 26: Handles all service types uniformly, which makes it easier for the system to support additional service types. It schedules the content produced by the service routines using a hybrid scheduling algorithm and hands the resulting scheduling information to message piggybacking module 6.
3) Scheduling table 7: Stores the content of each request, information about the destination node machine it was scheduled to (such as its IP address), the port number, a timer, and related information. The timer is used to track whether entries in the scheduling table have expired.
4) Event receiving module 4: Receives policy information from sending kernel module 23 (see Fig. 3) and saves it in policy library 5.
5) Policy library 5: Stores policy information, such as each service type and the node machines available for it.
6) Message piggybacking module 6: Takes the request content out of the scheduling information passed from the scheduling module, piggybacks the handshake information from handshake information table 10 onto it, and passes both to forwarding module 9.
7) Forwarding module 9: Forwards the packet passed from message piggybacking module 6 over the internal high-speed network to the designated node machine, where message filtering module 16 of that node's CANS (see Fig. 7) receives it.
8) Timer 8: Periodically sends a trigger signal to the scheduling table invalidation module so that outdated scheduling information can be deleted.
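The interplay of protocol identification, the service routines, the scheduling table, and the timer can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the port numbers, the class and function names, the time-to-live value, and the CRC-based node selection are all hypothetical. In particular, the patent's hybrid scheduling algorithm is not specified in this section, so a stable hash over the request content stands in for it.

```python
import time
import zlib
from typing import Callable, Dict, List, Tuple

# Assumed port-to-service mapping; the text only names the three services.
SERVICE_BY_PORT: Dict[int, str] = {80: "http", 683: "corba", 3128: "proxy"}


def http_routine(payload: bytes) -> str:
    """Return the request target from the HTTP request line (the 'content')."""
    request_line = payload.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = request_line.split()
    return parts[1] if len(parts) >= 2 else ""


# Registry of service routines. Supporting a new service means adding an
# entry here and a port in SERVICE_BY_PORT; the scheduler is unchanged.
ROUTINES: Dict[str, Callable[[bytes], str]] = {"http": http_routine}


class GeneralScheduler:
    """Scheduling table with expiry, backed by a policy library."""

    def __init__(self, policy: Dict[str, List[str]], ttl: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.policy = policy  # service type -> available node addresses
        self.table: Dict[Tuple[str, str], Tuple[str, float]] = {}
        self.ttl = ttl
        self.clock = clock

    def schedule(self, service: str, content: str) -> str:
        key = (service, content)
        now = self.clock()
        entry = self.table.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # already scheduled: reuse the stored node
        nodes = self.policy[service]
        # Stand-in for the unspecified hybrid algorithm: stable content hash.
        node = nodes[zlib.crc32(content.encode()) % len(nodes)]
        self.table[key] = (node, now + self.ttl)
        return node

    def expire(self) -> None:
        """Called on each timer trigger to drop outdated entries."""
        now = self.clock()
        for key in [k for k, (_, exp) in self.table.items() if exp <= now]:
            del self.table[key]


def identify_and_schedule(s: GeneralScheduler, dst_port: int,
                          payload: bytes) -> str:
    service = SERVICE_BY_PORT[dst_port]       # protocol identification
    content = ROUTINES[service](payload)      # service routine
    return s.schedule(service, content)       # unified scheduling
```

Because requests for the same content map to the same table entry, repeated requests reach the same node until the entry expires, which is what lets a node's memory cache stay warm for that content.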
TCP protocol stack modification module
This module intercepts the network packets sent by the front-end machine, extracts the handshake information they carry, restores the original message (that is, the message without handshake information), establishes a pseudo connection with the client, and sends the response message to the client over that pseudo connection. Meeting these requirements means modifying the TCP protocol stack of the node machine.
As shown in Fig. 7, message filtering module 16 receives the network packet sent by forwarding module 9 and hands it to message recovery module 17, which extracts the handshake information and restores the original message (that is, the message without handshake information). Three-way handshake masquerading module 18 then masquerades according to the handshake information, modifying the original TCP protocol stack to obtain the modified TCP protocol stack, and response sending module 19 sends the response to the original message directly to the client. The functions of the submodules and their relationships are as follows:
1) Message filtering module 16: Receives all network packets. If a packet's protocol field is not CA_TCP, the packet is not one the system needs and is handed to the original TCP/IP protocol stack. Otherwise, the packet came from forwarding module 9 and is handed to message recovery module 17.
2) Message recovery module 17: Extracts the handshake information from the network packet, restores the original message (that is, the message without handshake information), and passes both the handshake information and the original message to three-way handshake masquerading module 18.
3) Three-way handshake masquerading module 18: Masquerades at the TCP layer according to the handshake information passed from message recovery module 17, then uses the masqueraded TCP protocol to deliver the original message to the upper-layer network service program (such as an HTTP server or cache server). The network service program serves the request and passes the response packet to response packet sending module 19.
4) Response packet sending module 19: Sends the response packet to the client according to the masqueraded TCP protocol.
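The piggybacking on the front-end machine and the filtering and recovery on the node machine can be illustrated with a small Python sketch. Everything about the wire format here is an assumption made for illustration: the patent text in this section does not give the numeric value of CA_TCP or the layout of the handshake information, so the constant 253 (a protocol number reserved for experimentation), the `Handshake` fields, and the 14-byte header are hypothetical.

```python
import socket
import struct
from typing import NamedTuple, Optional, Tuple

CA_TCP = 253  # hypothetical value for the CA_TCP protocol field

# Assumed layout: client IPv4 address, client port, client ISN, server ISN.
_HEADER = struct.Struct("!4sHII")


class Handshake(NamedTuple):
    client_ip: str
    client_port: int
    client_isn: int   # initial sequence number chosen by the client
    server_isn: int   # initial sequence number chosen by the pseudo server


def piggyback(hs: Handshake, request: bytes) -> bytes:
    """Front end: prepend the handshake information to the request."""
    header = _HEADER.pack(socket.inet_aton(hs.client_ip), hs.client_port,
                          hs.client_isn, hs.server_isn)
    return header + request


def recover(packet: bytes) -> Tuple[Handshake, bytes]:
    """Node machine: split a forwarded packet into handshake and request."""
    ip, port, client_isn, server_isn = _HEADER.unpack_from(packet)
    hs = Handshake(socket.inet_ntoa(ip), port, client_isn, server_isn)
    return hs, packet[_HEADER.size:]


def filter_packet(protocol: int,
                  packet: bytes) -> Optional[Tuple[Handshake, bytes]]:
    """Return None for packets that belong to the ordinary TCP/IP stack."""
    if protocol != CA_TCP:
        return None
    return recover(packet)
```

With the recovered sequence numbers and client address, the node can present the connection to its service program as if the client had completed the handshake with the node itself, and address the response directly to the client.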
Fig. 8 shows the workflow of the system:
① When CADS receives a network packet, it checks whether the packet is addressed to the virtual IP. If not, the packet is not one the system needs, and the system hands it to the original TCP/IP protocol stack without further processing. Otherwise, it initializes a pointer of the three-way handshake information table entry type.
② CADS then looks up whether handshake information already exists. If not, the pseudo server intercepts the three-way handshake information and saves it in the handshake information table. If it does, the handshake has just been established, and processing moves on to the protocol identification module.
③ The protocol identification module first identifies the protocol by port and enters the corresponding protocol processing module. That module analyzes the packet content, extracts the relevant information, and calls the general scheduling module for unified processing.
④ The general scheduling module uses the information extracted in ③ to decide which node machine in which subnet to dispatch to. If the scheduling table already holds scheduling information for the request, the message is forwarded, with the handshake information piggybacked, to the scheduled node machine according to that information. Otherwise, a scheduling algorithm produces the scheduling information, which is saved in the scheduling table, and the message is then forwarded with the handshake information to the scheduled node machine.
⑤ When the node machine receives the message from the front-end machine, CANS extracts the handshake information, restores the client request message to its original form, and enters the TCP/IP masquerading module, which presents a masqueraded three-way handshake to the upper-layer service program.
⑥ Finally, the network service program on that node machine completes the service request and returns the result directly to the user.
For subsequent messages on the same connection (for example, multiple transmissions for one service item, or closing the connection with the modified three-way handshake), the network dispatcher forwards the messages directly to the destination node machine at the IP layer, based on the connection records it maintains locally, so that the service is completed in full.
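The front-end side of this workflow can be condensed into a short Python sketch. It is a simplified model under stated assumptions: the class name, the string return values, and the rule that an empty payload marks a handshake segment are all hypothetical, and real TCP state tracking is omitted.

```python
from typing import Callable, Dict, Tuple

ConnKey = Tuple[str, int]  # (client IP, client port)


class FrontEnd:
    """Simplified model of the CADS decision path for one packet."""

    def __init__(self, virtual_ip: str,
                 choose_node: Callable[[bytes], str]) -> None:
        self.virtual_ip = virtual_ip
        self.choose_node = choose_node             # stands in for steps 3-4
        self.handshakes: Dict[ConnKey, int] = {}   # handshake info table
        self.connections: Dict[ConnKey, str] = {}  # connection records

    def handle(self, dst_ip: str, conn: ConnKey, payload: bytes) -> str:
        # Step 1: traffic not addressed to the virtual IP is left alone.
        if dst_ip != self.virtual_ip:
            return "local-stack"
        # Subsequent packets of a dispatched connection go straight through.
        if conn in self.connections:
            return "forward:" + self.connections[conn]
        # Step 2: handshake segments (modelled as empty payloads) are
        # answered by the pseudo server and recorded.
        if not payload or conn not in self.handshakes:
            self.handshakes[conn] = self.handshakes.get(conn, 0) + 1
            return "pseudo-server"
        # Steps 3-4: first request on an established pseudo connection.
        node = self.choose_node(payload)
        self.connections[conn] = node
        return "dispatch:" + node
```

Checking the connection records before anything else mirrors the last paragraph above: once a connection has been dispatched, later packets bypass content inspection entirely.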
The following example illustrates how the system is configured in practice.
A content-based general scheduling system for cluster network services is built on a cluster system with 16 node machines. Its basic configuration is shown in Table 1.
Table 1 Example system configuration
One machine serves as the front-end machine, and the remaining service node machines are divided into subnets by service, such as a Web service subnet and a proxy service subnet. Specifically, node machine 1 acts as the front-end machine and loads the pseudo server module and the front-end content scheduling system (CADS). Node machines 2 through 8 are in the Web service subnet, and node machines 9 through 16 are in the proxy service subnet. Every node machine loads the node content service system (CANS) and the corresponding network service program: node machines 2 through 8 load an HTTP server, and node machines 9 through 16 load a cache server.
The configuration of the whole system is described as follows:
1) The service type library (13) has the fields shown in Table 2.
Table 2 Example configuration of the service type library
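For illustration, the node layout just described can be written down as a small Python structure. The dictionary keys, the module labels, and the `modules_for` helper are hypothetical names chosen for this sketch; only the node numbering and the software loaded on each node come from the text.

```python
from typing import Dict, List

FRONT_END_NODE = 1

# Subnets by service, with the node numbers given in the example.
SUBNETS: Dict[str, dict] = {
    "web": {"nodes": range(2, 9), "server": "http server"},      # nodes 2-8
    "proxy": {"nodes": range(9, 17), "server": "cache server"},  # nodes 9-16
}


def modules_for(node: int) -> List[str]:
    """Return the software loaded on a given node machine."""
    if node == FRONT_END_NODE:
        return ["pseudo server", "CADS"]
    for subnet in SUBNETS.values():
        if node in subnet["nodes"]:
            return ["CANS", subnet["server"]]
    raise ValueError(f"node {node} is not part of the 16-node cluster")
```

For example, `modules_for(8)` yields the CANS module plus the HTTP server, while `modules_for(9)` yields CANS plus the cache server.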
Claims (4)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CNB021387702A CN1151635C (en) | 2002-07-09 | 2002-07-09 | General dispatching system based on content adaptive for colony network service |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN1392701A CN1392701A (en) | 2003-01-22 |
| CN1151635C true CN1151635C (en) | 2004-05-26 |
Family
ID=4749695
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CNB021387702A Expired - Fee Related CN1151635C (en) | 2002-07-09 | 2002-07-09 | General dispatching system based on content adaptive for colony network service |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN1151635C (en) |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| C10 | Entry into substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| C06 | Publication | ||
| PB01 | Publication | ||
| C10 | Entry into substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| C14 | Grant of patent or utility model | ||
| GR01 | Patent grant | ||
| C19 | Lapse of patent right due to non-payment of the annual fee | ||
| CF01 | Termination of patent right due to non-payment of annual fee |