WO2004061550A2 - Network analysis co-processor system and method - Google Patents
Network analysis co-processor system and method
- Publication number
- WO2004061550A2 (PCT/US2003/015829)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- application
- network
- user
- expert
- monitoring
- Prior art date
- Legal status
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/04—Network management architectures or arrangements
- H04L41/046—Network management architectures or arrangements comprising network management agents or mobile agents therefor
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/02—Standardisation; Integration
- H04L41/0213—Standardised network management protocols, e.g. simple network management protocol [SNMP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/06—Generation of reports
Definitions
- the present invention relates to network monitoring and management, and more particularly to accelerating network monitoring and management.
- Networks are used to interconnect multiple devices, such as computing devices, and allow the communication of information between the various interconnected devices.
- Many organizations rely on networks to communicate information between different individuals, departments, work groups, and geographic locations.
- a network is an important resource that must operate efficiently.
- networks are used to communicate electronic mail (e-mail), share information between individuals, and provide access to shared resources, such as printers, servers, and databases.
- a network failure or inefficient operation may significantly affect the ability of certain individuals or groups to perform their required functions.
- a typical network contains multiple interconnected devices, including computers, servers, printers, and various other network communication devices such as routers, bridges, switches, and hubs.
- the multiple devices in a network are interconnected with multiple communication links that allow the various network devices to communicate with one another. If a particular network device or network communication link fails or underperforms, multiple devices, or the entire network, may be affected.
- Network management is the process of managing the various network devices and network communication links to provide the necessary network services to the users of the network.
- Typical network management systems collect information regarding the operation and performance of the network and analyze the collected information to detect problems in the network. For example, a high network utilization or a high network response time may indicate that the network (or a particular device or link in the network) is approaching an overloaded condition. In an overloaded condition, network devices may be unable to communicate at a reasonable speed, thereby reducing the usefulness of the network. In this situation, it is important to identify the network problem and the source of the problem quickly and effectively such that the proper network operation can be restored.
- the network analyzer agent may include a user interface.
- Such user interface of the network analyzer agent may be adapted for allowing the selection of the network applications.
- a predetermined set of network applications may be included from which the network analyzer agent may select.
- In use, the hardware co-processor generates application objects and sends the application objects to the network analyzer agent for reporting via the user interface. Moreover, the network analyzer agent may be adapted for displaying protocol decodes via the user interface.
- the network applications may include an expert analysis application, an intrusion detection application, a multi-network segment analysis application, a voice over Internet Protocol (VoIP) application, a mobile application, and/or an application monitoring application.
- the network analyzer agent and the hardware co-processor may share memory. Still yet, the network communications may be received from at least one media module.
- Figure 1 is a representation of a system architecture according to one embodiment.
- FIG. 2 shows a representative hardware environment that may be associated with the workstations of Figure 1, in accordance with one embodiment.
- Figure 2A illustrates one exemplary framework of the monitoring system of Figure 1, in accordance with one embodiment.
- Figure 2B illustrates another exemplary framework of the monitoring system of Figure 1, in accordance with another embodiment including an expert application.
- Figure 2C illustrates an exemplary method of use of the co-processor of Figure 2A in the context of an expert accelerator application.
- FIG. 3 illustrates an Application Monitor system according to one embodiment.
- Figure 4 is a diagram illustrating a system configuration for incorporating multiple nodes with centralized management.
- Figure 5 shows the basic hardware configuration of a Probe.
- Figure 6 shows the basic hardware configuration of the shelf system.
- Figure 7 depicts an illustrative CPCI module.
- FIG 8 depicts an HDD rear transition module (RTM).
- Figure 9A is a drawing of RTM usage in a multi-interface configuration.
- Figure 9B depicts RTM usage in a single-interface configuration.
- Figure 10 depicts CPCI bus transfer modes.
- Figure 11 shows an illustrative CPCI related hardware subclassification tree.
- Figure 12 depicts an operational environment including a node along with a set of environmental entities, which the node interacts with.
- Figure 13 is a table listing a sub-classification of users.
- Figure 14 is a high-level diagram that shows basic components of application server hardware.
- Figure 15 shows the application server top-level subsystems and dependencies.
- Figure 16 shows the UI servers provided by the Application Server.
- Figure 17 shows the primary run-time flows between application server subsystems and UI servers.
- FIG 18 is a diagram showing a Multi-Interface (MI) Expert server and its related subsystems.
- FIG. 19 depicts an RMON services subsystem and its primary flows.
- Figure 20 shows the primary flows associated with the logging manager.
- Figure 21 depicts several application server object repository packages.
- Figure 22A shows an example managed object containment view of a node as seen by the application server.
- Figure 22B depicts an example managed object containment view of a media module as seen by the application server.
- Figure 23 is a flow diagram of a process in which the configuration manager uses the compatibility objects as a rules base for managing version and capability relationships between the system and its modules (hardware and software).
- Figure 24 shows some of the relationships between the registry services and other subsystems.
- Figure 25 depicts registry entry object associations.
- Figure 26 shows a collection of triggers and trigger groups.
- Figure 28 is a high-level diagram that shows basic components of the media module hardware and dependencies.
- Figure 29 shows a top-level view of a PMD subsystem.
- Figure 30 shows a top-level view of a capture subsystem.
- Figure 31 shows a top-level view of a shared memory subsystem.
- Figure 32 shows a top-level view of a focus subsystem.
- Figure 33 shows the media module top-level subsystems and dependencies.
- Figure 34 shows the main components of the media module expert subsystem.
- Figure 35 illustrates a top-level Media Module Expert component classification.
- Figure 36 shows an example sub-classification of application expert components and the relation to a few application protocols.
- Figure 37 depicts a process for expert application performance analysis according to one embodiment.
- Figure 38 illustrates RMON object dependencies and persistence levels.
- Figure 39 shows the pipelined (flow processing and expert processing) filter and buffer components provided by the media module.
- Figure 40 depicts a process for adaptive priority data filtering according to an embodiment.
- Figure 41 is a media module general processing flow.
- Figure 42 is a high-level media module packet processing sequence diagram.
- FIG. 1 illustrates a network architecture 100, in accordance with one embodiment.
- a plurality of remote networks 102 are provided including a first remote network 104 and a second remote network 106. Also included is at least one gateway 107 coupled between the remote networks 102 and a proximate network 108.
- the networks 104, 106 may each take any form including, but not limited to a local area network (LAN), a wide area network (WAN) such as the Internet, etc.
- the gateway 107 serves as an entrance point from the remote networks 102 to the proximate network 108.
- the gateway 107 may function as a router, which is capable of directing a given packet of data that arrives at the gateway 107, and a switch, which furnishes the actual path in and out of the gateway 107 for a given packet.
- At least one data server 114 is coupled to the proximate network 108 and is accessible from the remote networks 102 via the gateway 107.
- the data server(s) 114 may include any type of computing device/groupware. Coupled to each data server 114 is a plurality of user devices 116. Such user devices 116 may include a desktop computer, laptop computer, hand-held computer, printer or any other type of logic. It should be noted that a user device 117 may also be directly coupled to any of the networks, in one embodiment.
- a monitoring system 120 is coupled to a network 108. Illustrative monitoring systems will be described in more detail below. It should be noted that additional monitoring systems and/or components thereof may be utilized with any type of network element coupled to the networks 104, 106, 108. In the context of the present description, a network element may refer to any component of a network.
- FIG 2 shows a representative hardware environment associated with a user device 116 of Figure 1, in accordance with one embodiment.
- Such figure illustrates a typical hardware configuration of a workstation having a central processing unit 210, such as a microprocessor, and a number of other units interconnected via a system bus 212.
- the workstation shown in Figure 2 includes a Random Access Memory (RAM) 214, Read Only Memory (ROM) 216, an I/O adapter 218 for connecting peripheral devices such as disk storage units 220 to the bus 212, a user interface adapter 222 for connecting a keyboard 224, a mouse 226, a speaker 228, a microphone 232, and/or other user interface devices such as a touch screen and a digital camera (not shown) to the bus 212, communication adapter 234 for connecting the workstation to a communication network 235 (e.g., a data processing network) and a display adapter 236 for connecting the bus 212 to a display device 238.
- the workstation may have resident thereon an operating system such as the Microsoft Windows® NT or Windows® 2000 Operating System (OS), the IBM OS/2 operating system, the MAC OS, or UNIX operating system. It will be appreciated that a preferred embodiment may also be implemented on platforms and operating systems other than those mentioned.
- a preferred embodiment may be written using JAVA, C, and/or C++ language, or other programming languages, along with an object oriented programming methodology.
- Object oriented programming (OOP) has become increasingly used to develop complex applications.
- Figure 2A illustrates one exemplary framework 250 of the monitoring system 120 of Figure 1, in accordance with one embodiment.
- a plurality of media modules 266 is provided for collecting network information. More exemplary information on such media modules 266 will be set forth hereinafter in greater detail.
- the network communications may include portions of network communications (i.e. packets, frames, etc.), statistical data related to network communications, or any other information relating to network communications which may be used for network analysis.
- a hardware co-processor 254 is coupled to the media modules 266 via a direct memory access (DMA) interface for receiving the network communications therefrom at least in part.
- the hardware co-processor 254 may include any hardware capable of running or executing a plurality of network applications 260 which, in turn, are adapted for processing the network communications for various purposes.
- the network applications 260 may include any applications capable of processing network communications (i.e. packets, frames, etc.), and generating objects indicative of results of such processing.
- the hardware co-processor 254 may be equipped with a predetermined set of network applications based on a sales arrangement, etc. Various examples of such network applications 260 will be set forth hereinafter in greater detail. In use, the hardware co-processor 254 generates application objects 264 that reflect the processing of the network communications.
- the applications executed by the hardware co-processor 254 may have several things in common. For example, they may involve frames, packet descriptors, and/or flow records, which are obtained from the media module(s). Table 1A illustrates various applications that may be executed utilizing the hardware co-processor 254.
- the hardware co-processor 254 may accept network communications with packet descriptors.
- flow records from the media modules 266 may also be provided.
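- As a minimal illustration of the inputs described above (all type and field names below are assumptions for illustration, not taken from the specification), the packet descriptors and flow records handed from the media modules to the hardware co-processor might be shaped roughly as follows:

```cpp
#include <cstdint>

// Hypothetical descriptor attached to each captured frame by a media module.
struct PacketDescriptor {
    std::uint64_t timestamp_ns;   // capture timestamp
    std::uint32_t frame_length;   // original length on the wire
    std::uint32_t capture_length; // bytes actually stored in the buffer
    std::uint16_t interface_id;   // which media interface saw the frame
    std::uint16_t flow_id;        // index into the flow record table, if classified
};

// Hypothetical flow record produced by the media module's flow classifier.
struct FlowRecord {
    std::uint32_t src_ip, dst_ip;     // IPv4 endpoints
    std::uint16_t src_port, dst_port; // transport ports
    std::uint8_t  protocol;           // e.g. TCP = 6, UDP = 17
    std::uint64_t packets, bytes;     // running counters for the flow
};
```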
- Coupled to the hardware co-processor 254 and the media modules 266 is a network analyzer agent 252. Such coupling involves a DMA interface. As an option, the network analyzer agent 252 and the hardware co-processor 254 may share memory via a shared memory interface 262 for flow control and configurations. Still yet, a shared command response memory interface may be provided for bi-directional communications. Such shared memory may be used for configuration, commands, notifications, responses, etc.
- the network analyzer agent 252 collects the network communications from the media modules 266 for the purpose of conducting protocol decodes, buffer management, and/or other standard network analyzer functions. Still yet, the network analyzer agent 252 controls the hardware co-processor 254.
- the network analyzer agent 252 dictates which network applications 260 are executed by the hardware co-processor 254 and the manner in which the network communications are received by the hardware co-processor 254 from the media modules 266. Still yet, the network analyzer agent 252 also receives the application objects 264 for reporting to a user via a user interface. Such user interface may further be used for the purpose of receiving user input as to the specific manner in which the network analyzer agent 252 is to control the hardware co-processor 254 and the network applications 260 that are run.
- the hardware co-processor 254 operates as an "application-enabler" and various network applications 260 may be executed in a manner that is decoupled from a user interface associated with the network analyzer agent 252.
- the network analyzer agent 252 may further avoid any COM/DCOM usage.
- processing may be accelerated by virtue of the fact that the hardware co-processor 254 is dedicated for only designed network applications 260.
- any one or more of the network applications 260 may be executed concurrently in combination on any desired platform (i.e. Unix®, Windows®, etc.).
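- To make the control relationship concrete, the following C++ sketch models how the network analyzer agent might select applications and receive application objects from the hardware co-processor; the interface, enum and callback names are illustrative assumptions, not part of the specification.

```cpp
#include <functional>
#include <string>
#include <vector>

// Illustrative identifiers for the network applications listed earlier.
enum class NetworkApplication {
    ExpertAnalysis,
    IntrusionDetection,
    MultiSegmentAnalysis,
    VoIP,
    Mobile,
    ApplicationMonitoring
};

// Generic result object produced by a co-processor application.
struct ApplicationObject {
    NetworkApplication source;
    std::string        payload;  // serialized statistics, events, etc.
};

// Abstract control interface the agent could use toward the co-processor.
class CoProcessorControl {
public:
    virtual ~CoProcessorControl() = default;

    // Select which applications the co-processor should execute.
    virtual void enableApplications(const std::vector<NetworkApplication>& apps) = 0;

    // Register a callback through which application objects are reported
    // back to the agent (and ultimately to the user interface).
    virtual void onApplicationObject(
        std::function<void(const ApplicationObject&)> callback) = 0;
};
```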
- FIG. 2B illustrates another exemplary framework 280 of the monitoring system 120 of Figure 1, in accordance with another embodiment including an expert application.
- network communications may be streamed by direct memory access to a miniport driver 296, which forwards them to the kernel driver 297. Both of these drivers may be modified per the desires of the user.
- a buffer manager 298 may then keep track of read and write pointers in a circular buffer. This behavior may be similar to what happens on the network analyzer agent 282 during capture.
- the write pointer may be updated as network communications are deposited from the kernel driver 297 and placed in a capture buffer.
- the read pointer is updated as the hardware co-processor 284 monitors accessed network communications from the capture buffer.
- a dropped event is raised when the write pointer progresses past the read pointer. This event causes the hardware co-processor 284 to provide notification that a dropped event has occurred. This gives the system a chance to reinitialize for a dropped event.
- the hardware co-processor 284 posts network communications as fast as possible to the network analyzer agent 282 by calling a function in NgExpertSvr.dll 290 and updating the read pointer.
- packet descriptors may either become part of the frame header or passed as a parameter to the function.
- flow records may be used to aid the system in parsing network communications and gathering statistics.
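- The read/write pointer and dropped-event behavior described above can be modeled as a simple circular buffer. The class below is an illustrative single-threaded model only (buffer sizing, types and the drop policy are assumptions), not the actual miniport/kernel driver code.

```cpp
#include <cstddef>
#include <vector>

// Minimal model of the capture buffer shared between the kernel driver
// (producer) and the co-processor (consumer).
class CaptureRing {
public:
    explicit CaptureRing(std::size_t slots) : buffer_(slots) {}

    // Producer side: the kernel driver deposits a frame and advances the
    // write pointer. Returns false (a "dropped" event) if the write pointer
    // would pass the read pointer, i.e. the consumer has fallen behind.
    bool write(const std::vector<unsigned char>& frame) {
        std::size_t next = (write_ + 1) % buffer_.size();
        if (next == read_) {
            dropped_ = true;          // raise the dropped event
            return false;             // caller may reinitialize the pipeline
        }
        buffer_[write_] = frame;
        write_ = next;
        return true;
    }

    // Consumer side: the co-processor takes the next frame and advances the
    // read pointer as fast as it can post frames to the analyzer agent.
    bool read(std::vector<unsigned char>& out) {
        if (read_ == write_) return false;  // nothing pending
        out = buffer_[read_];
        read_ = (read_ + 1) % buffer_.size();
        return true;
    }

    bool droppedEventRaised() const { return dropped_; }

private:
    std::vector<std::vector<unsigned char>> buffer_;
    std::size_t read_ = 0, write_ = 0;
    bool dropped_ = false;
};
```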
- An exemplary expert accelerator application will now be described in greater detail.
- One purpose of the present expert accelerator application is to develop a significant improvement in the amount of network communications per second that a system may process.
- Figure 2C illustrates an exemplary method 299 of use of the hardware co-processor of Figure 2A in the context of an expert accelerator application. Such exemplary method 299 illustrates various concepts involved in the design of the present system.
- the media module may support multiple interfaces at the same time. Each interface may monitor the same segment, different segments, or even segments on either side of a router, switch, or other network device.
- Various types of multi-segment analysis may be performed per the desires of the user including aggregation, correlation, and asynchronous analysis.
- the aggregation expert is recognized by the fact that the IP address/port number pairs have at least one node in common. This may be the case when several clients are connected to a specified server and each client connection is monitored by a different media module. However, they are connected to the same server, and one can compute a variety of different statistics, as well as determine which client is experiencing difficulties.
- the correlation model depends on tapping into both sides of a switch, router, or WAN connection. In this case, not only is one interested in response times and equipment delays, but he or she may possibly have a break down of the delays into their component parts. For example, consider an Oracle® application running over a WAN link. There is also a similar workstation running the same application locally, on the same segment as the server. One could conceivably have the Oracle® application running on the same server through a switch or router. The first configuration provides an opportunity to measure the network delay through the WAN link by using the second configuration to eliminate much of the local traffic delays.
- Monitoring connections to a typical application server from various probes that are not synchronized in time may also provide valuable information. Knowing how data flows from different parts of the network at different times of the day or on different days of the week can enable a network manager to allocate resources appropriately and assign priorities for QoS or SLA agreements. Asynchronous analysis can be performed relative to aggregation and/or correlation data.
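- For the aggregation case, where monitored connections from several media modules share a common server endpoint, the correlation reduces to a keyed accumulation such as the sketch below (types and field names are assumptions chosen for illustration).

```cpp
#include <cstdint>
#include <map>

// Endpoint shared by all aggregated connections (e.g. the server).
struct Endpoint {
    std::uint32_t ip;
    std::uint16_t port;
    bool operator<(const Endpoint& o) const {
        return ip != o.ip ? ip < o.ip : port < o.port;
    }
};

// Per-server statistics aggregated across media modules.
struct ServerStats {
    std::uint64_t connections = 0;
    std::uint64_t bytes       = 0;
    double        worst_response_ms = 0.0;
};

// Fold one per-connection observation (reported by any interface) into the
// aggregate for the server it targets.
void aggregate(std::map<Endpoint, ServerStats>& table,
               const Endpoint& server,
               std::uint64_t bytes,
               double response_ms) {
    ServerStats& s = table[server];
    s.connections += 1;
    s.bytes       += bytes;
    if (response_ms > s.worst_response_ms) s.worst_response_ms = response_ms;
}
```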
- Table 1B illustrates various additional features that may be implemented in the context of the present network analyzer accelerator application.
- the present platform includes a set of application monitoring and management tools that provide business critical application and network performance information to administrators such as CIOs and enterprise network managers.
- Such application-monitoring system is provided for domestic enterprise management.
- One purpose of this system is to enable administrators (such as CIOs and Network Managers) to introduce real-time application monitoring into service offerings.
- One embodiment provides distributed multi-segment network monitoring and correlation, with a focus on application performance.
- This multi-segment capability can be extended to multi-site monitoring and correlation (e.g. nodes placed at different geographical locations).
- the system is preferably based on a scalable, high- performance, open architecture, which can be easily adapted to support many different topologies and features.
- Figure 3 illustrates an Application Monitor system 300 according to one embodiment.
- the system can include the following topologies:
- the system includes two major components: a single application server module and one or more Media Modules.
- the role of the media module is to provide a physical observation point of network traffic on a given segment 306.
- the application server provides all administrative functions (i.e. user interface, provisioning, reports, alarms and statistics, Simple Network Management Protocol (SNMP) agent, etc.) for the system.
- In a single-interface configuration, a single monitoring interface is available in a self-contained, managed device, similar to a typical Remote Network Monitoring (RMON) probe.
- In a multi-interface configuration, a larger system is possible by providing multiple interfaces (Media Modules), which allows monitoring and real-time correlation of multiple (co-located) network segments 308.
- no higher-layer management console is required.
- This second configuration also allows the mixing and matching of different media module types.
- One exemplary benefit of this configuration would be to monitor traffic seen on the WAN-side of a router, on a backbone, and on individual branch segments all from the same system, providing a complete network view from a single administrative point.
- the system is a self-managed device, meaning that no additional EMS/NMS functionality is required for any of the supported features.
- a user can connect directly to the node using any standard web browser and immediately receive alarms, statistics and diagnosis, configure triggers, view reports, etc.
- FIG. 4 is a diagram illustrating a system configuration 400 for incorporating multiple nodes with centralized management. As shown, this may be accomplished using one of the following approaches:
- A standard Simple Network Management Protocol (SNMP) manager
- the second approach offers many benefits over a standard SNMP manager including enhanced correlation, multi-interface "Expert" functions, self-similar topology views, a rich set of triggers, system auto-discovery, etc.
- the Application Monitoring system is a high performance, scalable monitoring and analysis tool using custom, purpose-built hardware. Furthermore, the system provides advanced network and application performance monitoring capability to enterprise network managers and CIOs.
- Table 1C lists some exemplary features.
- the system platform can support a multitude of monitoring and analysis applications due to its open architecture and inherent flow classification capabilities.
- Table 2 is a partial list of applications provided by the system. These include real-time application monitoring and diagnostic services.
- Table 2 • Performance and SLA management - Application and network response time, distributions, etc.
- the system according to one embodiment is extensible in the areas shown in Table 3.
- System Hardware Components: A system hardware architecture according to a preferred embodiment is described below.
- the system hardware architecture in this example is based on the Compact PCI (CPCI) multi-processor computer platform.
- the configurations can use a chassis, power supplies and system controller (single board computer) module.
- Hardware modules can be developed per physical media type (i.e. ATM, Gigabit Ethernet, etc.) but all share a common design above the media-dependent portion.
- Probe Enclosure - small 2U CPCI chassis houses one Application Server and one Media Module
- the system can include the following Compact PCI compliant components, for example:
- the 2U backplane supports 64-bit or 32-bit bus transfers at 66 or 33 MHz
- the multi-slot backplane supports 64-bit or 32-bit bus transfers at 33 MHz
- Primary Hardware modules (6U CPCI cards):
- Rear Transition Module HDD board - provides hard drive, serial port and Ethernet for any primary hardware module. Note that this module is always required for the application server and is optional for media modules (in multi-slot configurations).
- the Compact PCI specification allows the use of multiple bus masters in a system and includes support for the items shown in Table 5.
- FIG. 5 shows the basic hardware configuration of a Probe 302. Various combinations are possible for the two configurations; however, in general the standalone probe can use a 2U pizza-box chassis 502 populated with a single media module 504 and application server Module 506.
- Figure 6 shows the basic hardware configuration of the shelf system 304.
- the shelf system can use a 16-slot chassis 602 populated with a single application server Module 604 and one or more Media Modules 606. It should be noted that the application server and media module designs are reusable in any CPCI enclosure.
- Figure 7 depicts an illustrative Compact PCI (CPCI) module 700. All hardware modules can conform to the PICMG 2.0 R3.0 Compact PCI Core Specification, which defines a shared 32 or 64-bit data transfer path running at 33 or 66 MHz, a set of standard board profiles, an optional rear transition module (rear I/O) per slot, and one or more optional PMC (mezzanine) daughter cards per standard board.
- the standard board sizes can be based on a Euro-card format and are typically available in two primary sizes, as listed in the following table.
- boards have a height profile, which dictates how many backplane slots they occupy.
- the common single-slot profile is referred to as "4HP".
- Boards may be of this unit height or multiples of it such as 8HP (double-slot), 12HP (triple- slot), etc.
- the application server module according to an illustrative embodiment is a 6U, 4HP (single-slot) CPCI single-board computer (SBC).
- the role of the system controller is generally to configure any peripheral modules via plug-and-play auto detection. This includes assignment of memory address ranges, identifying bus number, slot number, hot-swap and bus-master capabilities, etc. All CPCI backplanes have at least one designated "system-slot" where the system controller resides.
- the application server therefore is responsible for detecting, configuring, managing and downloading software to all media modules in a given system.
- the following table lists some of the application server hardware attributes.
- the media module is a 6U, 8HP (double- slot) CPCI custom hardware module which acts as the network analysis interface in any system configuration.
- the role of the media module is generally to monitor a physical network segment, perform various levels of real-time analysis and to report events and statistics to the application server Module via the CPCI backplane.
- the media module supports plug-and-play auto detection, assignment of memory address ranges, reporting bus number, slot number, hot-swap and bus-master capabilities, etc.
- Table 8 lists some of the media module hardware attributes.
- FIG. 8 depicts an HDD Rear Transition Module (RTM) 800.
- the system architecture supports a single RTM for each primary board in the system (i.e. application server or Media Module).
- the RTM is an ancillary module which provides the functions set forth in Table 9.
- the RTM module 800 may be required for the application server module in some systems, and is optional for each media module in a multi-interface system.
- Figure 9A is a drawing of RTM usage in a multi-interface configuration 900.
- an RTM 800 may provide each media module 902 with the ability to perform autonomous capture and statistics logging to disk and enables multi-segment post capture analysis without requiring disk sharing.
- FIG. 9B depicts RTM 800 usage in a single-interface configuration 920.
- streaming to the Application Server's RTM disk via the backplane may be adequate for this purpose.
- PMC Modules
- the application server supports multiple general-purpose PMC (daughter-card) modules with connector access through the front bezel.
- All primary connectors can be provided via the front bezel of the system boards.
- the auxiliary connectors (ETH and COM) can also be provided on the RTM modules.
- Figure 10 depicts CPCI bus transfer modes.
- the general transfer model taken for the system architecture is to utilize the CPCI backplane 1000 primarily for configuration, statistics, events and post capture (disk) transfers between the Media Module(s) 1002 and the Application Server 1004.
- the bulk processing of packet data is handled directly by the Media Module 1002, whereby the application server 1004 is essentially responsible for providing statistics and correlated data to the end user or management station. This approach improves performance and scalability.
- the method used for moving data between the media modules 1002 and application server 1004 can be based on a "pull" model, whereby higher-level entities retrieve data (i.e. statistics and data objects) from the lower-level entities.
- the lower-level objects are maintained by the media modules 1002 "in-place". Therefore all requests for media module generated objects (from a user or management station) result in the application server 1004 retrieving data directly from the media module(s) 1002 of interest.
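- The "pull" model can be pictured as a small retrieval interface through which the application server fetches media-module-resident objects on demand; the interface and names below are a sketch under assumed naming, not the defined object-access protocol.

```cpp
#include <string>
#include <vector>

// Opaque handle for a media-module-resident object (statistics group,
// expert object, etc.). The identifier scheme here is purely illustrative.
struct ObjectRef {
    int         slot;   // CPCI slot of the owning media module
    std::string name;   // e.g. "rmon.hostTable"
};

// Interface the application server could use to pull data "in place"
// from a media module over the backplane.
class MediaModuleObjects {
public:
    virtual ~MediaModuleObjects() = default;

    // List the objects currently maintained by this media module.
    virtual std::vector<ObjectRef> list() const = 0;

    // Retrieve a serialized snapshot of one object on demand; the object
    // itself stays resident on the media module.
    virtual std::string fetch(const ObjectRef& ref) const = 0;
};
```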
- This section will focus on a functional decomposition of the system.
- This first-level decomposition will include both hardware and software subsystems as functional entities.
- the system architecture may be open and extensible at every level.
- an object-oriented approach has been used in decomposing the system into sets of self-contained subsystems with common interfaces. These subsystems may be overloaded with different components of the same "class" to extend functionality over time without creating additional complexity.
- This approach applies not only to specific hardware and software components, but also to combined functional entities as a whole. Each of these entities may be viewed as an encapsulated subsystem comprised of hardware, software, or both which provides a particular class of functionality within the system.
- Many of the diagrams referred to herein assume some level of understanding of the UML (Unified Modeling Language) by the reader.
- UML is a standard notation for the modeling of real-world objects as a first step in developing an object-oriented design methodology.
- Figure 11 shows an illustrative CPCI related hardware subclassification tree 1100.
- the subclassification example, while quite simple, illustrates the potential overloading of media modules and CPCI enclosures within the system.
- the operational environment generally includes the elements listed in Table 10.
- Figure 12 depicts an operational environment 1200 including a node 1202 along with a set of environmental entities, which it interacts with. These environmental entities will be described in the next subsections.
- the network 1204 under observation may include one or more network segments, which may or may not have a logical relationship to one another. Some examples of segments with relation to one another are listed in Table 11.
- All observed network segments can be monitored via connections with one or more media interfaces, which are in turn realized by media modules in the system.
- Environmental equipment that the system can interact with includes three main classes:
- Machine clients i.e. network management systems
- Supporting equipment includes any external equipment that adds feature capability to the node itself in its monitoring role.
- the Modem 1206 and RAID array 1208 are considered to be of this supporting class.
- Many other types of supporting equipment may be interfaced to through CPCI option boards, PMC modules, or auxiliary interfaces.
- Machine clients, however, play a different role in that they have direct access to the managed objects of the system. Because of this, they can affect the behavior and state of the node and may be treated with the same security precautions as a human client.
- Machine clients supported by the node include SNMP managers and CORBA managers.
- the application server itself may act as a higher-layer manager to a group of elements, which may be remotely located.
- the application server software may be running on a dedicated management workstation and uses CORBA as a direct object-level access protocol.
- One example of such a CORBA client would be a second-level OSI NMS.
- the ODMG and other bodies have standardized on CORBA as the management interface above the element (EMS) level.
- the third class of equipment includes RMON probes.
- For clients in the first category, the node provides authentication and access to resources based on user privileges and provisioned policies.
- For the second type of users (indirect), the intermediate management system provides the majority of authentication and policy enforcement. In this case, the node treats the management machine as a "trusted" user and only enforces provisioned blanket policies for the machine. It should be noted that there may be situations where the node may be required to support both human and machine clients simultaneously. This type of situation is not precluded in the architecture.
- another sub-classification of users may be required based on how the client uses the node. For the present discussion, this sub-classification pertains to users from the first category (i.e. direct human clients). The sub-classification of these users can be based on the operations each class of user is interested in or allowed to perform.
- Figure 13 is a table 1300 that lists these classes.
- the application server Module is the single point of user or management interaction with the monitoring node.
- the application server Module acts as the CPCI "system controller" in any configuration; as such, it resides in the system slot of a CPCI chassis.
- the hardware for this module can be a Pentium 4 based single board computer running Linux, for example. Table 13 lists some of the features of this module.
- the application server is generally responsible for the functions listed in Table 14.
- FIG. 14 is a high-level diagram that shows the basic components 1400-1410 of the application server hardware. Illustrative components are briefly described in Table 15.
- Front bezel interfaces 1402 - Standard I/O (mouse, keyboard, SVGA, 10/100 Ethernet)
- AS Processor 1404 e.g., Pentium 3, 850 MHz Intel processor
- This section will describe illustrative software subsystems and interfaces which can comprise the application server module.
- a top-down approach will be used to introduce the overall architecture and each of the constituent subsystems.
- This architecture should be viewed as a basic model, which can be changed as more focused resources are added to the system.
- Figure 15 shows the application server top-level subsystems and dependencies.
- a set of top-level packages representing major architectural components are shown.
- the architecture is centered around the common object repository 1504 (and configuration manager 1506).
- This repository is preferably an active object database, which supports event generation when certain operations are performed on (or attributes change in) active objects.
- this portion of the architecture is used to support inter-subsystem communications and triggering functions.
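- An "active object" of this kind can be modeled as a value wrapper that notifies registered observers whenever it changes state, which is what allows repository changes to drive triggers. The sketch below uses assumed names and is illustrative only.

```cpp
#include <functional>
#include <vector>

// Minimal model of an active repository object: setting a new value fires
// registered change callbacks, which the triggers manager can listen to.
template <typename T>
class ActiveObject {
public:
    using ChangeHandler = std::function<void(const T& oldValue, const T& newValue)>;

    void subscribe(ChangeHandler handler) { handlers_.push_back(std::move(handler)); }

    void set(const T& value) {
        T old = value_;
        value_ = value;
        for (auto& h : handlers_) h(old, value_);  // event generation on change
    }

    const T& get() const { return value_; }

private:
    T value_{};
    std::vector<ChangeHandler> handlers_;
};

// Example: a utilization statistic that can drive a trigger when it crosses
// a threshold.
// ActiveObject<double> utilization;
// utilization.subscribe([](double, double now) {
//     if (now > 0.9) { /* invoke a trigger action, e.g. raise an alarm */ }
// });
```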
- a set of common engines 1508 for supporting user interface functions i.e. logging, statistics, alarm and event managers
- These engines each provide a consolidated point for sending common types of information from various sources to the UI servers 1510.
- Also shown in Figure 15 is another set of related subsystems 1511, which handle user session management including security, registering for services, and setting up triggers.
- a set of subsystems 1512 provide analysis, monitoring and administrative services either directly to clients (i.e. RMON) or through the UI servers.
- the hardware services subsystem 1514 which provides all access to hardware objects (Media Module), including events, configuration, statistics, and maintenance functions. Note that throughout this section it is assumed that inter-subsystem object access is provided through the object repository (via CORBA) and events are passed between subsystems using CORBA.
- FIG 16 shows the UI servers 1510 provided by the Application Server.
- the UI servers are responsible for providing web clients various UI elements for configuring the system or a session, creating triggers, creating and viewing reports, graphs and logs, viewing alarms, statistics and events, and performing maintenance or administrative functions.
- the web-based interface can rely on an Enterprise Java Beans (EJB) framework and can provide dynamic HTML generation via Java Server Pages (JSP) for passive clients.
- the framework can support connections with active clients for providing an event interface and enhanced functionality.
- clients may retrieve active applets (or beans) from the Application Server, which may use Java remote method invocation (RMI) to support real-time event notification and direct operations on the server.
- this mechanism allows a greater level of scalability by leveraging the power of the client machine for distributed graphics generation and logging, etc.
- the serial UI is essentially a terminal (command-line) interface for administrative and maintenance functions such as setting the IP addresses of the node, running system diagnostics, etc. It should be noted that many of the administrative functions are available through the web interface as well.
- Figure 17 shows the primary run-time flows between application server subsystems and UI servers 1510.
- the log server is the element that provides access to log files on a per user basis.
- Log files provide a time-stamped persistence mechanism for transient data and events.
- Logs may be created as user specific or as system global.
- the system global logs may be stored on the application server module, whereas user specific logs can reside on the application server or on the client machine (assuming an active client).
- the log server provides operations for creating, deleting, enabling and disabling each log.
- Per- user logs are created by adding alarms, triggers, statistics and events as "logged” in the user's registry entry.
- Global logs are created by adding alarms, triggers, statistics and events as "logged” in the SYSTEM registry entry. Once a log is created, it is accessible via the log server screens.
- the logging manager subsystem provides the actual functions for creating and adding entries to logs and dispatching information to the log server.
- the graph server is the element that provides access to various graphs on a per user basis. Graphs provide a useful mechanism for viewing of multi-dimensional data. Graphs may be generated based on user specified or system global data and events.
- the graph server provides operations for creating, deleting, enabling and disabling each graph view. Per-user graphs are created via the user's registry entry. Global graphs are created via the SYSTEM registry entry.
- the graph server additionally provides functions for creating and adding entries to graphs along with the graph type and criteria. Graphs may be generated using dynamic data or data from log files. In general the graph server receives data from the subsystems listed in Table 16.
- the report server, like the graph server, provides access to report files on a per user basis. Reports may be generated based on user specified or system global data and events. The report server provides operations for creating, deleting, enabling and disabling each report view. The report server additionally provides functions for creating and adding entries to reports along with the report type and criteria. Per-user reports are created via the user's registry entry. Global reports are created via the SYSTEM registry entry. Reports may be generated using dynamic data or data from log files. In general the report server receives data from the subsystems set forth in Table 17.
- the statistics server is the element that provides access to groups of statistics on a per user basis.
- Statistics groups may be created as user specific or as system global.
- the system global statistics can be stored on the application server module, whereas user specific statistics can reside on the application server or on the client machine (assuming an active client).
- the statistics server provides operations for creating, deleting, enabling and disabling statistics groups. Adding statistics in the user's registry entry creates per-user groups. Adding statistics in the SYSTEM registry entry creates global groups. Once a statistics group is created, it is accessible via the statistics server screens.
- the statistics manager subsystem provides the actual functions for creating and adding entries to statistics groups and dispatching information to the statistics server.
- the event server like the statistics server provides access to groups of events on a per user basis.
- Event groups may be created as user specific or as system global.
- the system global events may be stored on the application server module, whereas user specific events can reside on the application server or on the client machine (assuming an active client).
- the event server provides operations for creating, deleting, enabling and disabling event groups. Adding events in the user's registry entry creates per-user groups. Adding events in the SYSTEM registry entry creates global groups. Once an events group is created, it is accessible via the event server screens.
- the event manager subsystem provides the actual functions for creating and adding entries to event groups and dispatching information to the event server.
- the configuration server provides access to system configuration functions and information.
- Table 18 lists some of the types of configuration information available.
- the configuration server relies primarily on the configuration manager for accessing system information, but also depends on administrative services and the session manager for controlling access to privileged configuration operations.
- the triggers server is the element that provides access to triggers on a per user basis. Triggers may be created as user specific or as system global.
- the triggers server provides operations for creating, deleting, modifying, enabling and disabling triggers.
- the triggers server presents the system events and actions available to triggering functions. Adding triggers to the user's registry entry creates per-user triggers. Adding triggers in the SYSTEM registry entry creates global triggers. Once a trigger is created, it is accessible via the triggers server screens.
- the triggers manager subsystem provides the actual functions for creating and adding triggers and exchanges events and actions with other subsystems and the object database.
- the alarms server like the event and statistics servers, provides access to groups of alarms on a per user basis.
- Alarm groups may be created as user specific or as system global.
- the system global alarms may be stored on the application server module, whereas user specific alarms can reside on the application server or on the client machine (assuming an active client).
- the alarms server provides operations for creating, deleting, enabling and disabling alarm groups. Adding alarms in the user's registry entry creates per-user groups. Adding alarms in the SYSTEM registry entry creates global groups. Once an alarm group is created, it is accessible via the alarms server screens.
- the alarms manager subsystem provides the actual functions for creating and adding entries to alarm groups and dispatching information to the alarms server.
- the decode server provides various views of captured packets in a human readable format.
- the decode server receives data from the capture manager subsystem.
- the administrative server provides a system administrator with a set of functions for provisioning, maintaining and managing the system. Access to these services is typically restricted from all users except those with administrative privileges.
- the administrative services subsystem provides the actual functions for administering the system and provides an interface to the administrative server (and the administrative serial Ul server). Table 19 lists some of the operations available via the administrative server.
- MI Expert Server 1702 (See Figure 17)
- FIG 18 is a diagram showing the MI Expert server 1702 and its related subsystems.
- the MI expert server subsystem is responsible for creating, deleting, enabling and disabling expert monitoring and analysis functions on the application server.
- In the proxy mode (much like the RMON proxy module), the expert server relays expert objects, alarms, statistics and events from media modules to one or more of the UI servers or supporting engines.
- In a second mode, the expert server collects expert objects, alarms, statistics and events from multiple media modules to perform correlation across multiple interfaces based on rules sets.
- This second mode may also be used to provide information to the application server RMON agent for correlation MIBs.
- the expert server may request media modules to capture packet data to disk, which may be used to further correlate information across multiple interfaces. It should be noted that both modes could be in operation simultaneously.
- RMON Services 1704 (See Figure 17)
- FIG 19 depicts an RMON services subsystem 1704 and its primary flows.
- the RMON services subsystem is responsible for providing access to local MIB objects for external SNMP management systems as well as internal UI servers.
- Multi-interface (MI) agent module 1904
- the proxy module (much like the expert proxy mode) relays SNMP objects alarms, statistics and events from agents on media modules and the MI agent to external SNMP managers, as well as to the local manager module.
- the MI agent module provides correlation across multiple interfaces based on rules sets. This second module may use information generated by the MI expert to generate the correlation MIBs, which are available to external managers as well as to the local manager module.
- the manager module collects information from the MI agent and the media module agents (and potentially external agents) for presentation to a direct (web) user.
- the manager module may rely on local engines (logging manager, statistics manager, event manager, alarm manager and capture manager) and the UI servers to provide RMON management views to users.
- the administrative services subsystem is responsible for providing administrative functions to a (direct) client with administrative privileges.
- Two user interface servers have access to the services provided by this subsystem:
- 1. Administrative Serial UI (CLI based)
- 2. Administrative Server (web based)
- triggers may be configured to perform a subset of administrative functions based on system events, time of day, etc.
- Figure 20 shows the primary flows associated with the logging manager 1706.
- the logging manager subsystem is responsible for creating and storing system and user logs, which include time-stamped events, alarms, statistics, and other information as requested on a per session basis.
- the logging manager provides the requested log information to the log server Ul element based on logging criteria in the user and SYSTEM registry entries.
- the logging manager uses the application server hard drive to persist this data and may additionally use secondary storage (i.e. a file server) for extended capability.
- equivalent functionality may be provided on each media module when equipped with a local hard drive.
- the logging manager on the application server treats each logging manager on the media modules as a remote file server.
- the statistics manager 1708 is a common shared resource for all application engines (i.e. RMON, Expert, etc.) on the application server and equivalent functions on the media modules.
- This subsystem is used to provide (dispatch) statistics to the statistics server, graph server and report server UI elements, as well as to the logging manager.
- the various statistics may be dispatched based on intervals, change occurrence, etc. as defined in the user and SYSTEM registry entries.
- This subsystem provides dispatch filtering on a per user basis for multiple client sessions. System triggers may be provided by this subsystem to invoke actions based on statistics.
- the actual statistics objects are maintained in the object repository.
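- Per-user dispatch filtering might be modeled as a small criteria check evaluated before a statistic is forwarded to a session, as in the sketch below (the interval- and change-based fields are assumptions intended only to illustrate the idea).

```cpp
#include <cstdint>

// Per-user dispatch criteria taken from the user (or SYSTEM) registry entry.
struct DispatchCriteria {
    std::uint64_t interval_ms    = 1000;  // minimum time between dispatches
    bool          on_change_only = true;  // suppress dispatch if value unchanged
};

// Decide whether a statistic update should be dispatched to a given session.
bool shouldDispatch(const DispatchCriteria& c,
                    std::uint64_t now_ms,
                    std::uint64_t last_dispatch_ms,
                    double previous_value,
                    double current_value) {
    if (now_ms - last_dispatch_ms < c.interval_ms) return false;
    if (c.on_change_only && current_value == previous_value) return false;
    return true;
}
```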
- the alarm manager 1710 is a common shared resource for all application engines (i.e. RMON, Expert, etc.) on the application server and equivalent functions on the media modules.
- This subsystem is used to provide (dispatch) alarms to the alarms server, graph server and report server UI elements, as well as to the logging manager.
- the various alarms may be dispatched based on severity, intervals, change occurrence, etc. as defined in the user and SYSTEM registry entries.
- This subsystem provides dispatch filtering on a per user basis for multiple client sessions. System triggers may be provided by this subsystem to invoke actions based on alarms (i.e. dial a pager, etc.).
- the actual alarm objects are maintained in the object repository.
- Event Manager 1712 (See Figure 17)
- the event manager 1712, like the alarm manager 1710, is a common shared resource for all application engines (i.e. RMON, Expert, etc.) on the application server and equivalent functions on the media modules. This subsystem is used to provide (dispatch) events to the event server, graph server and report server UI elements, as well as to the logging manager.
- the capture manager subsystem like the logging manager is responsible for creating and storing trace files, which include filtered packets as requested on a per session basis.
- the capture manager provides the requested information to various clients including the decode server UI element, based on capture criteria in the user and SYSTEM registry entries.
- the capture manager uses the application server hard drive to persist this data and may additionally use secondary storage (i.e. a file server) for extended capability.
- equivalent functionality may be provided on each media module when equipped with a local hard drive.
- the capture manager on the application server treats the capture managers on the media modules as a remote file server.
- Figure 21 depicts several application server object repository packages 2100.
- the object repository 1504 is the heart of the application server and is used to store all application server objects. Virtually all application server subsystems use the object repository to store and access their objects. Several types of objects 2102 in the object repository are shown in Figure 21.
- the object repository can also provide active object capabilities meaning that objects may create notification events on creation, deletion or change of state. This functionality may be used as a triggering mechanism allowing virtually any system capability to be invoked by triggers.
- Figure 22A shows an example managed object containment view 2200 of a node as seen by the application server.
- Figure 22B depicts an example managed object containment view 2220 of a media module as seen by the application server.
- the configuration manager is responsible for providing all access to managed objects in the system. This includes managing the state and availability of hardware objects, compatibility objects, application objects, administrative, session and security objects, Ul objects and trigger objects.
- the managed objects accessed by the configuration manager are not the actual transient objects produced by applications, but are rather configuration objects, which control and reflect the state of applications, hardware, etc.
- the media module object is created upon insertion into the chassis.
- the media module sub-objects reside on the media module.
- Figure 23 is a flow diagram of a process 2300 in which the configuration manager uses the compatibility objects as a rules base for managing version and capability relationships between the system and its modules (hardware and software).
- a media module is received into the chassis.
- the application server detects the module and creates an (root) object for it in operation 2304.
- the version and capabilities of the module are detected in operation 2306, and in operation 2308, are compared with an entry of its class in the compatibility tree. If the version is incompatible, the new module is disabled in operation 2310 and an alarm is generated in operation 2312. Otherwise, the default configuration is applied to the module in operation 2314 and in operation 2316, the module is activated.
- the state of the module and all of its sub-objects are now available for further operations. This same process may apply for any additional hardware or software modules.
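- The flow of Figure 23 can be summarized in C++-style pseudocode as follows; the helper functions standing in for the configuration manager's internals are assumptions, not actual interfaces from the specification.

```cpp
#include <iostream>
#include <string>

struct ModuleInfo { std::string version; std::string capabilities; };

// Stand-ins for the configuration manager's internals (assumed, not from the spec).
bool compatibleWithClassEntry(const ModuleInfo& info) { return info.version >= "1.0"; }
void applyDefaultConfiguration(int slot) { std::cout << "default config -> slot " << slot << "\n"; }
void activateModule(int slot)            { std::cout << "activate slot " << slot << "\n"; }
void disableModule(int slot)             { std::cout << "disable slot " << slot << "\n"; }
void raiseAlarm(const std::string& why)  { std::cout << "ALARM: " << why << "\n"; }

// Invoked once a media module has been detected and its root object created
// (operations 2302-2304 of Figure 23).
void onModuleInserted(int slot, const ModuleInfo& info) {
    if (!compatibleWithClassEntry(info)) {   // operations 2306-2308
        disableModule(slot);                 // operation 2310
        raiseAlarm("incompatible module in slot " + std::to_string(slot)); // operation 2312
        return;
    }
    applyDefaultConfiguration(slot);         // operation 2314
    activateModule(slot);                    // operation 2316
}
```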
- the session manager is responsible for controlling users logging into the system, authenticating them, validating access privileges, etc.
- the session manager uses the security manager, configuration manager and registry services subsystems to perform much of this functionality.
- previously created session configurations may be loaded for the client by the session manager.
- the security manager provides authorization levels to users based on provisioned privilege and authentication policies.
- the registry services subsystem provides a capability to associate items of interest to individual users of the system or to the system itself.
- the registry can have two major classes of entries:
- the SYSTEM entry is a global entry, which can only be accessed by the system administrators or users with appropriate privileges.
- user entries are created when a user configures a session on the system. In both cases, the types of information listed in Table 21 are maintained in the registry:
- the SYSTEM registry entries are those that are viewed as "always important" on a global basis. These items may be available for viewing by all users, higher-level managers, etc., or according to individual user policies.
- the registry therefore creates a type of customizable steering mechanism that prevents events and data that are not of interest to everyone from flooding all clients.
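- a minimal sketch of that steering idea follows, assuming a registry keyed by a SYSTEM entry plus per-user entries; the entry fields, user names and topic strings are illustrative assumptions, not the registry structure itself.

```python
# Illustrative registry: one global SYSTEM entry plus one entry per user session.
registry = {
    "SYSTEM": {"interests": {"chassis-alarm", "module-failure"}},
    "alice":  {"interests": {"http-apm", "tcp-retransmissions"}},
    "bob":    {"interests": {"voip-mos"}},
}

def recipients(event_topic: str) -> list:
    """Steer an event only to sessions that registered interest in its topic.

    SYSTEM interests are treated as 'always important' and fan out to every
    user, matching the global-entry behaviour described above; everything else
    is delivered per user so uninteresting events do not flood all clients.
    """
    if event_topic in registry["SYSTEM"]["interests"]:
        return [user for user in registry if user != "SYSTEM"]
    return [user for user, entry in registry.items()
            if user != "SYSTEM" and event_topic in entry["interests"]]

print(recipients("chassis-alarm"))   # ['alice', 'bob']
print(recipients("http-apm"))        # ['alice']
```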
- Figure 24 shows some of the relationships between the registry services 2400 and other subsystems.
- Figure 25 depicts registry entry object associations 2500.
- Figure 26 shows a collection of triggers 2602 and trigger groups 2604.
- the triggers manager 1714 is indirectly responsible for the creation, deletion, activation and deactivation of triggers and directly responsible for the scheduling and invocation of actions based on triggers. This includes listening for events for enabled triggers, evaluating conditions required to fire the trigger, and invoking the action(s) for the trigger.
- the set of triggerable events and actions needs to be published by each subsystem via the configuration manager (i.e. through the managed objects for the subsystem).
- Trigger groups may be created per-user or globally via the registry.
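- the event/condition/action cycle described above can be sketched as follows; the Trigger fields, the utilization example and the callback style are assumptions for illustration, not the published trigger object model.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Trigger:
    """Hypothetical trigger: an event name, a guard condition, and actions."""
    event: str
    condition: Callable[[dict], bool]
    actions: list
    enabled: bool = True

def dispatch(triggers: list, event: str, context: dict) -> None:
    """Listen for an event, evaluate each enabled trigger's condition, fire its actions."""
    for trig in triggers:
        if trig.enabled and trig.event == event and trig.condition(context):
            for action in trig.actions:
                action(context)

# Example: start a capture when a utilization sample published by a subsystem crosses 90%.
triggers = [Trigger(event="utilization-sample",
                    condition=lambda ctx: ctx["percent"] > 90,
                    actions=[lambda ctx: print("start capture on", ctx["port"])])]
dispatch(triggers, "utilization-sample", {"percent": 95, "port": "gig0/1"})
```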
- the hardware services subsystem provides all event and object communication between the application server and other system modules. This includes CPCI backplane drivers, hardware detection and initial configuration, interrupts, data transfers, etc. Table 22 lists two mechanisms for communication over the CPCI backplane.
- the first mechanism allows the application server flexible access to all media modules in the system using an IP transport. This mode can be used to provide RMON (SNMP) access to agents on media modules and supports other direct object access protocols. Since the majority of traffic between media modules and the application server consists of configuration, events and statistics, the performance is adequate.
- the second mechanism provides a "raw" transfer mode using the PCI (memory mapped) target initiator approach. In this mode, very high-speed shared memory transfers are possible using the PCI burst DMA mechanism. This mode may be useful for accessing trace files captured to disk on the media modules, etc.
- the media module is effectively a single-board, real-time monitor/analyzer and is the single point of network monitoring for the monitoring node.
- the media module acts as a CPCI (master/slave) "peripheral controller" in any configuration and as such it may always reside in a peripheral slot of a CPCI chassis.
- the hardware for this module includes multiple microprocessors, FPGAs and other application-specific circuitry.
- the media module supports Gigabit Ethernet (and others).
- the main processor on the media module can run a real-time embedded OS (VxWorks).
- Multi-level Expert monitoring - Media, Network, Transport, Session, Service and APM
- the media module is generally responsible for the functions listed in Table 24.
- the media module hardware and software architecture is optimized based on three main functions:
- the media module is architected to optimize performance for each of these functions.
- This optimization consists of application specific hardware, distributed filtering and partitioning of software on multiple processors to provide the highest levels of run-time performance. The majority of this optimization revolves around the flow classification function, as this is central to all other functions on the media module.
- the media module is preferably a CPCI single board hardware/real-time software module.
- This board is essentially a high-powered monitor/analyzer on a CPCI module.
- Figure 28 is a high-level diagram that shows the basic components of media module hardware and dependencies. Each of the hardware components and subsystems will be described in the following sections.
- Figure 29 shows a top-level view of the PMD subsystem 2802.
- the PMD subsystem provides the items listed in Table 26.
- low-level protocol termination, e.g. GbE, ATM, POS, etc.
- Associated with each PMD type is a "media expert" function, which both encapsulates and provides a well-defined interface to the above functions.
- the media expert may be implemented as a combination of hardware and software.
- the software portion may be implemented in a dedicated task on the media module main processor, or in a dedicated PMD processor. For simpler protocols (Ethernet, etc.) the task approach can be used, whereas for more complicated protocols (that involve complex signaling), a dedicated PMD processor is preferable.
- the PMD is responsible for providing a packet-level interface to the flow classification engine. Since the flow classifier only understands packets, any cell or other transport streams may be reassembled prior to presentation to the capture control interface.
- the PMD subsystem prepends each packet passed on to the capture subsystem with a descriptor containing the information listed in Table 27.
- Timestamp • Frame type (control, etc.)
- the PMD maintains all interface counts appropriate to the media (packets, bytes, too long, too short, etc.) as well as any alarm status and control.
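- the full contents of Table 27 are not reproduced here, but a sketch of such a prepended descriptor might look like the following; only the timestamp and frame type come from the text above, while the remaining fields (interface id, length, error flag) are assumptions added for illustration.

```python
from dataclasses import dataclass

@dataclass
class PMDDescriptor:
    """Illustrative per-packet descriptor prepended by the PMD subsystem."""
    timestamp_ns: int      # from the external or local timing source
    frame_type: str        # e.g. "data" or "control"
    interface_id: int      # which physical interface received the packet (assumed field)
    length: int            # captured length in bytes (assumed field)
    errored: bool = False  # CRC/too-short/too-long indication (assumed field)

def hand_to_capture(descriptor: PMDDescriptor, packet: bytes) -> tuple:
    """Pass the descriptor together with the raw packet to the capture subsystem."""
    return (descriptor, packet)

desc = PMDDescriptor(timestamp_ns=1_000_000, frame_type="data", interface_id=0, length=64)
print(hand_to_capture(desc, b"\x00" * 64)[0])
```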
- the physical interfaces may be optical or electrical, depending on the media type. For Gigabit Ethernet, these interfaces can be optical and can be provided by GBIC devices.
- the timing interface provides a mechanism to use an outside timing source for providing per-packet timestamps. This may be used to synchronize the timing across multiple media modules in different locations.
- the external timing interface may be provided to all media modules in a shelf system by a set of predefined signals on the CPCI backplane.
- the source of these timing signals can be an optional GPS (or other) timing module.
- the uP interface provides the media module (main) processor access to all configuration and status registers, memories, etc. for the PMD. In cases where a dedicated PMD processor exists, this interface may utilize a shared memory mechanism.
- the packet level interface is used for transferring pre-filtered packets to the capture subsystem.
- This interface provides a unified (multiplexed) stream containing packets received from all physical interfaces that are destined for capture or queuing.
- This interface either provides timing to or receives timing from the capture subsystem. Buffering within the PMD resolves the timing boundary issues across this interface.
- the capture subsystem can use a demand-driven transfer mechanism to retrieve packets when available from the PMD.
- the capture subsystem provides filtering and buffering for packets received from the PMD, an interface to the flow processor for accessing packets in the capture buffer and an interface for forwarding a selected subset of the captured packets to the focus buffer.
- the capture subsystem provides a triple-ported interface to the capture buffer.
- Figure 30 shows a top-level view of the capture subsystem 2804.
- the capture subsystem provides the functions listed in Table 28.
- the packet level interface is the source of all packet data to be processed by the capture subsystem.
- the capture subsystem retrieves packets from the PMD whenever packets are available as indicated by the PMD.
- This interface uses DMA to transfer packets into the capture buffer after parsing and filtering each received packet.
- the uP interface provides the media module (flow) processor access to all configuration and status registers, memories, etc. for the capture subsystem.
- This interface is the source of all packet data to be processed by the flow processor and is controlled exclusively by the flow processor. This includes setting up filters and triggers, managing queues and initiating DMA transfers for forwarding selected packets on to the focus buffer.
- This interface can support an on-demand hardware packet transfer mechanism (DMA) into the flow processor's local memory to alleviate timing contention for the capture buffer.
- the focus buffer interface is used for transferring packets from the capture buffer into the focus buffer. This forwarding uses DMA and is under control of the flow processor. Operationally, once the flow processor has analyzed a packet in the capture buffer, a decision is made whether to forward the packet on or not. If the packet is to be forwarded, the flow processor initiates the transfer across this interface. A control mechanism can exist to indicate when the focus buffer is full.
- the capture subsystem provides two primary modes of operation, and several sub-modes within each primary mode.
- the primary modes are listed in Table 29.
- in diagnostic mode, the capture buffer takes snapshots of data from the line and provides basic (pattern) filtering capabilities.
- the buffer modes supported in diagnostic mode include those listed in Table 30.
- in fill-and-stop mode, when a capture is initiated (usually by a trigger), the buffer fills linearly until full or until a stop trigger is fired. In wrap mode, the buffer is continuously overwritten with the most recent data from the line until a stop trigger is fired.
- the start and stop capture triggers are implemented in hardware and support stop after N (bytes) capability. This allows a user defined capture window with information both before and after the event of interest.
- in monitoring mode, the capture buffer acts as a high-performance FIFO queue.
- Table 31 lists buffer modes supported in monitoring mode.
- in priority queuing mode, the buffer is segmented into two virtual queues: priority and non-priority. Each queue maintains and is accessed by separate head, tail and current offset pointers. Associated with the priority queue is a priority filter table (CAM), which contains information pertaining to the priority flows (e.g. address pairs, etc.). The buffer space for each queue varies dynamically based on the arrival of packets that meet the priority criteria (i.e. have an entry in the priority filter). Initially all packets are considered non-priority, but as the flow processor identifies a flow as being "important", information about the stream of packets that comprise the flow is written back to the queue manager and tagged as priority.
- as the number of priority flows increases, buffers are reallocated to the priority queue from the non-priority queue. Likewise, when the number of priority flows decreases, buffers are reallocated to the non-priority queue. These queues effectively appear as separate FIFOs with varying depth and are completely managed by hardware.
- This mechanism allows the flow processor to focus on servicing priority packets over non-priority packets to prevent data loss.
- the flow processor monitors the average depth of the priority queue and may selectively discard flows from the priority filter.
- the capture buffer appears as a single FIFO and gives no particular preference to the packets being captured. Packets are therefore likely to be dropped in this mode.
- Filtering Modes
- the capture subsystem supports various hardware filtering capabilities depending on operating mode (i.e. diagnostic or monitor). In any mode, a dedicated 72-bit wide content addressable memory (CAM) is used to provide the filtering on 128K flows. In diagnostic mode, patterns may be entered into the CAM based on information contained in Table 32.
- the CAM is used as a priority flow recognition mechanism, which allows the flow processor to give priority to a set of flows that contain the provisioned L3 (or other) address pairs corresponding to packets of interest.
- the normal criterion for flows of interest is an unbiased rate-throttling mechanism, whereby a population of flows is given priority based on already having been classified. This mechanism may be extended, however, by biasing the priority filter to focus on a set of flows which have some significance to the flow processor or other entity. In this case, only flows that match the focus criteria are given priority, effectively filtering out other "non-interesting" flows.
- the media module flow processor is a microprocessor subsystem dedicated to the task of flow classification.
- This processor is the main client of the capture buffer and pre-processes all packets for further analysis by the main processor.
- This processor stores the results of classification in shared memory and builds a descriptor for each packet forwarded on to the main processor (through the focus buffer).
- Tasks on the main processor may identify a flow as being important by tagging its flow record in the shared memory, which the flow processor subsequently uses as criteria for forwarding additional packets of that flow.
- This mechanism provides another type of adaptive filtering capability to reduce the probability of dropped packets for post-classification analysis.
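- the write-back feedback between the two processors might be sketched as below; the flow-key format, field names and the shared dictionary standing in for shared memory are assumptions made only to illustrate the tagging mechanism.

```python
# Illustrative shared-memory flow table: the flow processor writes classification
# results here; main-processor tasks set the 'important' tag as feedback.
shared_flow_records: dict = {}   # key: (src, dst, proto) -> flow record dict

def classify(flow_key: tuple) -> dict:
    """Flow processor side: create or update a flow record in shared memory."""
    return shared_flow_records.setdefault(flow_key, {"packets": 0, "important": False})

def tag_important(flow_key: tuple) -> None:
    """Main processor (expert) side: mark a flow so its later packets are forwarded."""
    shared_flow_records.setdefault(flow_key, {"packets": 0})["important"] = True

def should_forward(flow_key: tuple) -> bool:
    """Flow processor consults the tag before pushing a packet to the focus buffer."""
    record = classify(flow_key)
    record["packets"] += 1
    return record.get("important", False)

tag_important(("10.0.0.1", "10.0.0.2", "tcp/80"))
print(should_forward(("10.0.0.1", "10.0.0.2", "tcp/80")))   # True  -> forwarded
print(should_forward(("10.0.0.3", "10.0.0.4", "udp/53")))   # False -> filtered out
```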
- This processor can have its own dedicated program and data memories as well as access to the shared memory. The processor may or may not require an OS.
- the media module main processor can be, for example, an 800 MHz PowerPC dedicated to providing general application support for the media module.
- the main processor subsystem provides the functionality set forth in Table 33.
- This processor can run the VxWorks real-time embedded operating system.
- Figure 31 shows a top-level view of the shared memory subsystem 2810.
- the shared memory subsystem provides a data and event communication mechanism between the flow processor and the main processor. This memory is made equally available to the two processors via arbitration. All flow records created by the flow processor are stored in this memory in addition to per-packet parse descriptors. The descriptors are queued to allow the main processor to perform asynchronous processing of packets from the flow processor.
- the main processor may write-back pointers and flow control (filter) information in the shared flow records as a feedback mechanism for selecting a focus set.
- This subsystem also serves as the download, configuration and status mechanism for the flow processor and FPGAs.
- Focus Subsystem 2812 (See Figure 28)
- the focus subsystem provides buffering for packets received from the capture subsystem and an interface to the main processor for accessing those packets in the focus buffer. In effect, the focus subsystem provides a dual-ported interface to the focus buffer.
- Figure 32 shows a top-level view of the focus subsystem 2812.
- the focus subsystem provides the functionality listed in Table 34.
- the uP interface provides the media module (main) processor access to all configuration and status registers, memories, etc. for the focus subsystem.
- This interface is the source of all packet data to be processed by the main processor (expert, etc.) and is controlled exclusively by the main processor.
- This interface can support an on-demand hardware packet transfer mechanism (DMA) into the main processor's local memory to alleviate timing contention for the focus buffer.
- the capture buffer interface is used for transferring packets from the capture buffer into the focus buffer.
- This forwarding uses DMA (in the capture subsystem) and is under control of the flow processor.
- a decision is made whether to forward the packet on or not. This decision is based on indications fed back from the expert task on the main processor as to the scope (flows) the expert is interested in, and is effectively a second level of filtering. If the packet is to be forwarded, the flow processor initiates the transfer across this interface.
- a control mechanism may be provided to indicate when the focus buffer is full.
- the focus subsystem provides two primary modes of operation, and several sub-modes within each primary mode.
- the primary modes are listed in Table 35 below.
- in diagnostic mode, the focus buffer takes snapshots of data from the capture buffer based on classification (i.e. multi-layer) filtering provided by the flow processor.
- the buffer modes supported in diagnostic mode are listed in Table 36.
- in fill-and-stop mode, when a capture is initiated (usually by a trigger), the buffer fills linearly until full or until a stop trigger is fired. In wrap mode, the buffer is continuously overwritten with the most recent data from the line until a stop trigger is fired.
- the start and stop capture triggers are implemented in hardware and support stop after N (bytes) capability. This allows a user defined capture window with information both before and after the event of interest.
- in monitoring mode, the focus buffer acts as a high-performance FIFO queue.
- Table 37 lists buffer modes supported in monitoring mode.
- in priority queuing mode, the buffer is segmented into two virtual queues: priority and non-priority. Each queue maintains and is accessed by separate head, tail and current offset pointers. Associated with the priority queue is a priority tagging mechanism provided by the flow processor, which is based on which flows are important to the expert.
- the buffer space for each queue varies dynamically based on the arrival of classified packets that meet the priority criteria (i.e. have a priority entry in the flow classifier). Initially all packets are considered non-priority, but as the expert task identifies a flow as being "important", information about the stream of packets that comprise the flow is written back to the flow processor and tagged as priority.
- as the number of priority flows increases, buffers are reallocated to the priority queue from the non-priority queue. Likewise, when the number of priority flows decreases, buffers are reallocated to the non-priority queue. These queues effectively appear as separate FIFOs with varying depth and are completely managed by hardware.
- This mechanism allows the expert task to focus on servicing priority packets over non-priority packets to prevent data loss.
- the expert task monitors the average depth of the priority queue and may selectively discard flows from the priority filter.
- the focus buffer appears as a single FIFO and gives no particular preference to the packets being captured other than through flow filtering. Packets are therefore more likely to be dropped in this mode.
- the focus subsystem does not provide hardware filtering. Instead, filtering is achieved using a software feedback approach.
- the flow processor is directed by the main processor (expert) as to the focus set of applications, etc. that are forwarded on for expert processing.
- the priority queuing of a subset of flows within the focus set is used to provide additional filtering capability.
- the media module has the ability to use an optional hard drive for the persistent storage of various data.
- Table 38 lists some of the uses for the HDD module.
- the HDD (when equipped) resides on a CPCI rear transition module directly behind the media module.
- the media module provides an IDE interface on a set of user defined CPCI backplane signals.
- the CPCI backplane interface on the media module can be used for all communications with the application server or other client modules.
- This interface may be set up in transparent or non-transparent modes and provides both target and initiator capabilities.
- the main processor memory is made accessible to the application server via this interface for general communication (configuration, download, status, etc.) and any shared object access. This interface also allows the application server access to the focus buffer and local HDD.
- the media module provides a dedicated 10/100 interface via the front bezel, which may be used for debugging, alternate access for management systems, etc.
- This section will describe the software subsystems and interfaces which comprise the media module.
- a top-down approach will be used to introduce the overall architecture and each of the constituent subsystems.
- This architecture should be viewed as an illustrative model, which can be changed as more focused resources are added to the development.
- Figure 33 shows top-level subsystems and dependencies of a media module 3300 according to one embodiment.
- a set of top-level packages representing major architectural components are shown. In the following subsections, each will be described and further decomposed into additional subsystems with their descriptions.
- the architecture is centered on the common data repository 3302 (and configuration manager 3304). This repository is viewed as a shared memory database, which is accessible by all subsystems. As will be seen, this is an important part of the architecture for supporting inter-subsystem communications and triggering functions.
- a set of common engines 3306 are provided for supporting generic functions (i.e. logging, statistics, alarm and event managers).
- a set of subsystems 3308 provide analysis, monitoring and triggering services either directly to clients (i.e. expert to RMON) or to the application server.
- a hardware services subsystem 3310 provides all access to hardware objects (interfaces, HDD, etc.), including events, configuration, statistics, and maintenance functions. Note that throughout this section it is assumed that inter-subsystem object access is provided through the data repository and events are passed between subsystems using OS or hardware mechanisms.
- the persistence manager is responsible for gathering any transient objects that require storage beyond their active state. For example, APM requires that objects related to flows (connections between client, server and application) be aggregated beyond the life of a single flow involving the three parts. This requires a type of medium-term persistence so that a client may view the behavior of the flow over time. A longer-term (i.e. indefinite) persistence may also be provided for history and logging. This type of persistence requires storage to a non-volatile medium such as a hard disk.
- the persistence manager has access to three types of storage for persisting objects it is responsible for, listed in Table 39 below.
- the primary mechanism for persisting aggregated information can be to store the native flow and expert objects in a hierarchical database. Reports (RMON, etc.) may be generated on an as-needed (i.e. per-query) basis from these objects, eliminating the need to store RMON tables, etc.
- This aggregation can be performed as a background or periodic task, which collects objects from the flow processor and expert, enabling them to focus on current (transient) flows only.
- There may be a second level to this mechanism whereby the optional media module hard drive is used to provide further long-term storage for these objects.
- the FLASH database is used for storing critical configuration data, which may always be available even after power loss or reset events.
- the type of data to be stored in flash is listed in Table 40.
- the persistence manager may encapsulate all three storage mediums using a common interface (API) to minimize the impact of reassigning data from one storage area to another.
- the persistence manager therefore is responsible for the collection, storage and deletion (clean-up) of all persistent objects on the media module.
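- a hedged sketch of that common storage interface follows; the tier names, method names and in-memory stand-ins for the disk and FLASH mediums are assumptions introduced for the example, not the disclosed API.

```python
from abc import ABC, abstractmethod
from typing import Optional

class Store(ABC):
    """Common interface (API) so objects can be reassigned between storage areas."""
    @abstractmethod
    def put(self, key: str, obj: dict) -> None: ...
    @abstractmethod
    def get(self, key: str) -> Optional[dict]: ...
    @abstractmethod
    def delete(self, key: str) -> None: ...

class MemoryStore(Store):
    """Medium-term aggregation of flow/expert objects held in RAM."""
    def __init__(self) -> None:
        self._data: dict = {}
    def put(self, key: str, obj: dict) -> None:
        self._data[key] = obj
    def get(self, key: str) -> Optional[dict]:
        return self._data.get(key)
    def delete(self, key: str) -> None:
        self._data.pop(key, None)

class DiskStore(MemoryStore):
    """Stand-in for the optional media module hard drive (long-term storage)."""

class FlashStore(MemoryStore):
    """Stand-in for the FLASH database holding critical configuration data."""

class PersistenceManager:
    """Collects objects and routes them to one of the three storage mediums."""
    def __init__(self) -> None:
        self.tiers = {"memory": MemoryStore(), "disk": DiskStore(), "flash": FlashStore()}
    def persist(self, tier: str, key: str, obj: dict) -> None:
        self.tiers[tier].put(key, obj)
    def fetch(self, tier: str, key: str) -> Optional[dict]:
        return self.tiers[tier].get(key)
    def cleanup(self, tier: str, key: str) -> None:
        self.tiers[tier].delete(key)
```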
- the clients of this subsystem are listed in Table 41.
- the system may support different experts that monitor different protocol layers as well as sets of protocols/applications that make up a service.
- the experts can be turned on and off independently of other experts within the system.
- the experts can be enabled on a Media Module basis, with all interfaces within the Media Module running the same set of experts. Each individual Media Module within the system can have a different set of experts running.
- the media module expert subsystem is a real-time application monitoring and analysis engine running on the media module main processor, which builds information based on receiving per-packet data for selected flows.
- the main focus for this analysis is application performance monitoring (APM) which supports both RMON and local applications. This information is built upon and enhances information gathered by the flow processor and falls generally into three categories:
- monitoring information generally refers to functions related to providing APM metrics, deep application recognition and application subtype classification (e.g. MIME types over HTTP, etc.).
- Diagnostic information is gathered in focused monitoring modes and includes APM "drill-down" monitoring (i.e. TPM), as well as detecting any general network related anomalies. Troubleshooting information is gathered in diagnostic mode during fault isolation monitoring where a specific problem exists and a user is searching for an exact cause of the problem. This last type of information may include capture data as well as alarms and diagnoses.
- the two operating modes for the media module expert are monitoring mode and diagnostic mode. Different expert capabilities exist in each of these modes.
- TPM: transport performance metrics
- the media module expert uses the results of flow processing (classification) as a foundation for all of its operations.
- the flow processor stores the results of its parsing and classification in the shared memory between the two processors.
- the expert subsystem uses packets, events, flow records and parse descriptors produced by the flow processor in its processing and stores its own results (objects) in main processor memory.
- Figure 34 shows the main components of the media module expert subsystem 3314.
- the media module expert comprises a set of component subsystems 3402-3410, which will be described in the following sections.
- individual real-time expert components may be enabled independently of each other and do not necessarily require that all lower layers be enabled to process packets. Instead, all expert components rely on the parsing, filtering and classification results from the flow processor as a basis for their operation.
- all expert objects are tied to flows in that they are directly traceable (linked) to the flow record for the specific flow. For each flow that the expert processes, an expert flow record, containing parameter areas for each enabled component is created in main processor memory. Each expert component has access to all areas of the flow record which may provide useful information for its processing.
- Expert components are generally classified (and sub-classified) by layer according to their operations and include the main classes shown in Table 43.
- Some experts may rely on other experts.
- the Services Experts can rely on multiple subclasses within the Application Expert to evaluate the specific service, or the Application Performance Monitoring Expert may rely on a Transport Expert to drill-down on what could be causing performance problems.
- Figure 35 illustrates a top-level Media Module Expert component classification 3500.
- the network expert components are available in diagnostic mode and provide network layer analysis of potential problems that may affect application performance. Some of the functionality provided by these optional network layer expert components is set forth in Table 44 below. These expert components would not normally be activated in monitoring mode.
- the transport expert components are available in diagnostic mode and provide transport layer analysis of potential problems that may affect application performance.
- a special class of transport expert may provide transport performance metrics and is considered a diagnostic extension of APM that is used in "drill-down" mode. These metrics include statistical means, deviations, etc. and are particular to TPM.
- Some of the functionality provided by the other optional transport layer expert components are set forth in Table 45. These expert components would not normally be activated in monitoring mode.
- Session Expert 3406 (See Figure 34)
- the session expert components are available in diagnostic mode and provide session layer analysis of potential problems that may affect application performance.
- a special class of session expert (Login expert) may provide discovery and correlation of computer (host) and user names and logins and is considered a desired extension of APM.
- Table 46 illustrates some of the functionality provided by the other optional diagnostic session layer expert components.
- the application expert components are available in monitoring and diagnostic mode and provide application layer (and sub-application layer) analysis and performance metrics. There are at least two primary classes of application expert components:
- APM: Application performance monitoring
- protocol/application subclasses monitor specific protocols/applications to determine the performance of the specific protocol/application from a client's perspective, the server's perspective, and/or the network's perspective.
- Each protocol/application subclass has a set of metrics (objects) that it can use to measure the performance.
- the metrics can be applied to different response times of commands/responses, a stream of data, etc.
- metrics can be gathered on deeper evaluation of transaction (not just response times) associated with the specific protocol/application that is being monitored.
- the subclasses can evaluate performance for a single server, a set of servers, a client, a set of clients, and a set of client/server flows.
- the APM expert components are concerned with generating metrics related to application performance and are further categorized into three sub-classes, which apply individually or in combination to various application protocols based on transaction types.
- the sub-classes are listed in Table 47 below.
- the application content expert components are concerned with identifying application sub-types within a base application (e.g. JPEG MIME types within HTTP, etc.). These components are required for some applications and are used to identify tunneled applications and build more precise APM metrics.
- Figure 36 shows an example sub-classification of components of the application expert 3408 and the relation to a few application protocols. As shown, different application expert component subtypes have different requirements based on their usage. Table 48 shows several application expert component subtypes.
- the RTP application expert component 3602 is derived from the stream-oriented APM class only
- the FTP application expert component 3604 is derived from the transaction-oriented and throughput-oriented APM classes
- the HTTP application expert component 3606 is derived from the stream-oriented, transaction-oriented and throughput-oriented APM classes as well as the application content class
- the Sybase application expert component 3608 is derived from the transaction-oriented and stream-oriented APM classes as well as the application content class
- this model is not meant to imply an object-oriented language, but may be useful for a pattern-based approach to designing similar types of expert components with some degree of reuse.
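- in that pattern-based spirit, the derivations of Figure 36 could be sketched as simple class composition; Python is used purely for illustration (the document prescribes no language), and the method names are assumptions.

```python
class StreamAPM:
    def on_packet(self, pkt): ...              # stream metrics: jitter, inter-arrival, etc.

class TransactionAPM:
    def on_transaction(self, start, end): ...  # request/response latency metrics

class ThroughputAPM:
    def on_bytes(self, n): ...                 # bytes per interval metrics

class ContentExpert:
    def classify_payload(self, payload): ...   # e.g. MIME subtype identification inside HTTP

# Mirrors Figure 36: HTTP draws on all three APM classes plus content analysis,
# RTP on the stream-oriented class only.
class HTTPExpert(StreamAPM, TransactionAPM, ThroughputAPM, ContentExpert):
    pass

class RTPExpert(StreamAPM):
    pass

print(isinstance(HTTPExpert(), TransactionAPM))   # True
print(isinstance(RTPExpert(), TransactionAPM))    # False
```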
- one or more application expert components may be enabled for monitoring.
- several operating modes are provided within the application expert as listed in Table 49.
- the set of enabled applications is indicated to the flow processor so that it only passes on packets of flows containing those applications.
- in the classification processor's normal operating mode, a statistically unbiased population of flows is allowed through the capture buffer based on its ability to keep up with traffic. This allows RMON 1 and 2 processing to maintain a balanced view of the network without dropping packets of classified flows.
- the application expert however can override the classification processor's unbiased operation by giving it a set of applications (or other criteria) to be given classification priority. This "forced" classification mode affects RMON as the filtering for classification is no longer unbiased.
- the application expert works on a subset of flows within the classification set and may process a subset of those flows in a similar (unbiased or biased) approach.
- This reduced set of flows is referred to as the expert sub-population and depends on the application expert's operating mode.
- Flat mode is used to enable concurrent evaluation of a set of enabled applications.
- the number of applications enabled at a given time may have an impact on performance depending on network load.
- As the application expert processes the selected flows it may assign a priority indication to individual flows based on a provisioned application priority.
- the application expert may use an unbiased priority tagging approach, whereby selected flows from all applications are relegated to the non-priority queue of the focus buffer as a method to reduce the packet arrival rate. This ensures that the media module expert can keep up with a set of flows from all enabled applications without dropping packets for those flows.
- Roving mode is used to enable a sequential evaluation of a set of enabled applications.
- a scheduling mechanism allows each enabled application component to receive an allotted time-slice for monitoring flows containing its application. This is referred to as "roving mode", whereby a single application at a time has all expert processing bandwidth and requests the flow processor to only forward packets for those flows that contain the application of interest.
- in roving mode, a picture can be painted of the average performance of a large number of applications, with a much smaller chance of dropping packets.
- the number of applications enabled and their priority may have an impact on overall performance (i.e. how often the application is evaluated).
- the application expert may further assign an additional priority indication to individual flows.
- selected flows from the current application are relegated to the non-priority queue of the focus buffer as a method to reduce the packet arrival rate. This ensures that the media module expert can keep up with a set of flows from the current application without dropping packets for those flows.
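- a minimal sketch of such a roving scheduler is shown below, assuming a weighted round-robin over the enabled applications; the weights, callbacks and print statements are illustrative assumptions, not the scheduling algorithm disclosed here.

```python
import itertools

def request_flow_processor_filter(app: str) -> None:
    """Stand-in for asking the flow processor to forward only this application's flows."""
    print(f"flow processor: forward only {app} flows")

def run_expert_slice(app: str) -> None:
    """Stand-in for the expert analyzing the application for its allotted time-slice."""
    print(f"expert: analyzing {app} for one time slice")

def roving_schedule(enabled_apps: dict, cycles: int) -> None:
    """Give each enabled application the full expert bandwidth for one slice at a time.

    enabled_apps maps application name -> priority weight; higher weight means
    the application is visited more often within the rotation.
    """
    order = [app for app, weight in enabled_apps.items() for _ in range(weight)]
    for app in itertools.islice(itertools.cycle(order), cycles):
        request_flow_processor_filter(app)
        run_expert_slice(app)

roving_schedule({"http": 2, "ftp": 1}, cycles=6)
```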
- Focus mode is used to enable an evaluation of a particular application.
- a single application has all expert processing bandwidth and requests the flow processor to only forward packets for those flows that contain the application of interest.
- Focus mode may be entered manually by a user selecting a particular application or automatically (from one of the other modes) by setting up an auto-focus trigger.
- the application expert may assign a priority to individual flows.
- selected flows from the current application are relegated to the non-priority queue of the focus buffer as a method to reduce the packet arrival rate. This ensures that the media module expert can keep up with a set of flows from the current application without dropping packets for those flows.
- Figure 37 depicts a process 3700 for expert application performance analysis.
- in operation 3702, an application is monitored.
- in operation 3704, performance data is gathered during the monitoring of operation 3702.
- a set of metrics is generated in operation 3706 based on the performance data gathered in operation 3704.
- the performance of the application is measured from at least one of a client perspective, a server perspective, and a network perspective using the metrics. Note operations 3708, 3710, 3712.
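- as an illustrative sketch of process 3700 only, the fragment below gathers samples, derives a set of metrics and reports them per perspective; the sample field names and the simple averaging are assumptions, not the metrics defined by the APM experts.

```python
def analyze_application(samples: list) -> dict:
    """Sketch of process 3700: gather data, build metrics, report per perspective.

    Each sample is assumed to carry a client response time, a server processing
    time and a network transit time; these field names are invented for the example.
    """
    # Operations 3702/3704: monitoring yields the raw samples passed in.
    def avg(key: str) -> float:
        return sum(s[key] for s in samples) / len(samples)

    # Operation 3706: generate a set of metrics from the gathered data.
    return {
        "client_response_ms": avg("client_ms"),    # operation 3708: client perspective
        "server_process_ms":  avg("server_ms"),    # operation 3710: server perspective
        "network_transit_ms": avg("network_ms"),   # operation 3712: network perspective
    }

print(analyze_application([
    {"client_ms": 120, "server_ms": 40, "network_ms": 30},
    {"client_ms": 180, "server_ms": 90, "network_ms": 35},
]))
```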
- the system may be able to collect various statistics for a server, client, or protocol to perform the functions listed in Table 50.
- Triggers can be set on various objects that are associated with the performance metrics calculated for specific protocol/application. Each protocol/application will publish its triggerable objects. The triggers can cause the system to initiate the events listed in Table 51.
- a user has control over the APM configuration settings listed in Table 52.
- the user can also control how each application/protocol is being monitored.
- Each application/protocol specifies the reports that can be created, the objects that can be triggerable via threshold, the metrics that are of interest, etc.
- the Protocols/Applications and Metrics section defines the Protocols/Applications that have an associated application/protocol subclass. These subclasses classify the associated transaction as one or more of the classes listed in Table 53.
- a specific application/protocol subclass will only generate certain metrics. For instance, in Roving and Flat mode a smaller set of metrics (basically what is defined by APM RMON) will be used than in Focused or Diagnostics Monitor Mode (much deeper monitoring). For transaction-oriented transactions, the metrics set forth in Table 54 below may be supported.
- the metrics in Table 56 may be supported.
- Disrupted service is related to the items listed in Table 57.
- Some applications/protocols may use the transport mechanism to monitor application response times. This should not be confused with Transport Expert functions that drill down further on determining whether and where the Transport Layer is having problems.
- the Application Performance Monitoring Correlation Expert takes results from the specific application/protocol subclasses and evaluates the performance of the applications/protocols across multiple interfaces.
- the Correlation Expert interprets the difference in performance between different parts of the network. The results can help give a user a clear understanding of how the network works today, how the network works after changes have been made, and others.
- load sharing is used to see how the applications actually work over the different links (multiple links feeding a set of servers).
- in this load-sharing scenario, the user would not be able to see the same flows across the interfaces.
- the same flows can be monitored across multiple interfaces. Under this scenario, the user can see where potential bottlenecks are in the system.
- a Correlation expert can have the modes listed in Table 58.
- the system can allow a user to specify the correlation's aggregation duration.
- the system may accumulate data over a period of time based on the aggregation duration.
- the accumulated data may be stored to disk or displayed.
- the system may allow a user to view the current aggregation period.
- the system may allow a user to look at the performance of flows that are currently active.
- the system may show a distribution of applications over different links within the system.
- Session Experts provide a mechanism to track a particular client or server within the network.
- the tracking involves binding client/server MAC addresses, network addresses, Machine Names and User Names.
- Accurate bindings provide a way to ensure that the information that has been collected by the system can be related to the appropriate client and server.
- Session Experts can also be useful for tracking user sessions for specific services. For example, when logging into a Domain, the system can identify the number of attempts that failed, why a user failed, set up a trigger to monitor a particular user, etc.
- an Application Performance Monitoring Expert may exist for DHCP.
- Transport Experts provide mechanisms to monitor transport layer (e.g. TCP, SPX) functions. Transport Experts can work with other Application experts to determine whether there are problems occurring at the Transport Layer. For example, if an Application Performance Monitoring Expert detects a performance problem with a particular Server or Client, the Transport Expert can focus on transactions related to that server or client and determine whether the problem is occurring at the Transport Layer. For example, the Transport Expert can determine whether there are too many retransmissions, packets out of order, connection window problems, tunneling problems, etc. In use, the system may support the TPM MIB components.
- Network Experts examine problems within the network that will affect application performance. Network Experts are turned on as a diagnostic. The types of network problems that network experts look at can include the routes that traffic for certain clients has gone through, fragmentation issues, flapping routes, broadcast storms, multicast storms, etc.
- Media Experts
- the Gigabit Ethernet Expert monitors the physical and data link layer. The monitoring looks at basic performance over the physical interface. Performance on the physical interface can have an impact on the specific application/protocol being carried.
- the system may keep one or more of the statistics listed in Table 59 below for each link for each interface.
- the system may perform any of the functions listed in Table 60.
- the system may perform one or more of the functions in Table 61.
- Service Experts provide analysis of a particular service that is based on multiple applications/protocols.
- An example of this would be a Voice Over IP Expert that deals with multiple applications/protocols that are involved in making a call, keeping the call up, the call stream, etc.
- Another example is a Packet Cable Service Expert that analyzes different applications/protocols that are involved in providing Packet Cable Services.
- Another Service Expert could be a Network Troubleshooting Expert that uses experts at different layers to detect and evaluate problems with the network that could affect application performance.
- the services expert components can be made available in monitoring and diagnostic mode to provide a mechanism for using expert components from multiple layers to support complex, multi-protocol, multi-application services or diagnostics.
- These experts have a detailed understanding of the service that is being provided by a service provider or network owner and will typically involve multiple interrelated control plane and data plane protocols and endpoints with many states.
- Some examples of possible services experts are given in Table 62 below.
- a PacketCable services expert would need to support the following protocols: DOCSIS, GbE, ATM, POS, IP, RSVP+, MGCP, TGCP, SS7, COPS, RADIUS, TCP, RTP, IPSec, Kerberos, DQOS, etc. All of these protocols are interrelated and involved in setting up a single voice or video call in the PacketCable architecture.
- the signaling, policy enforcement, QOS, transport, billing and security planes all interoperate according to a specified model which an expert for this service would need to understand.
- a less extreme example would be a diagnostic expert which requires processing from multiple layer expert components.
- Figure 38 illustrates RMON object dependencies and persistence levels.
- the media module RMON agent 3316 uses data and services provided by the flow processor, APM and TPM experts, persistence manager and other subsystems to provide SNMP (v1/v2) clients access to the objects listed in Table 63 below.
- the media module RMON agent builds tables, events, etc. based on information provided by the flow processor (i.e. flow records) and information provided by the expert subsystem (expert flow records/objects).
- the RMON agent subsystem uses three levels of information to build reports for managers:
- Each source can have two associated time intervals for managing its objects:
- the collection interval is based on the sampling rate of the fastest higher-level client process (i.e. the most frequent client's retrieval rate of RMON tables, etc.). This interval is used to normalize the rate at which all selected objects at a given level are updated. The exception to this is the expert subsystem and flow processor subsystem, which use packet arrival rates as the update interval for their objects. This rate may be limited to some minimum interval at each level, which all clients are constrained to.
- the persistence window is based on the sampling rate of the slowest higher-level process (i.e. the least frequent client's retrieval rate of RMON tables, etc.). This interval dictates how long all selected objects may be maintained at a given level.
- This rate may be limited to some maximum interval at each level, which all clients are constrained to. After expiry of the time for this interval, inactive objects may be reclaimed for further processing.
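- the interval derivation described above can be sketched as follows; the function name, the minimum/maximum limit values and the list-of-rates representation of the clients are assumptions used only to illustrate the fastest-client/slowest-client rule.

```python
def rmon_intervals(client_rates_s: list, min_collection_s: float = 1.0,
                   max_persistence_s: float = 3600.0) -> tuple:
    """Derive the collection interval and persistence window from client poll rates.

    client_rates_s lists how often (in seconds) each higher-level client retrieves
    its RMON tables. The collection interval tracks the fastest client, floored at
    a minimum; the persistence window tracks the slowest client, capped at a
    maximum. The limit values here are illustrative, not taken from the patent.
    """
    collection_interval = max(min(client_rates_s), min_collection_s)
    persistence_window = min(max(client_rates_s), max_persistence_s)
    return collection_interval, persistence_window

print(rmon_intervals([5.0, 30.0, 300.0]))   # (5.0, 300.0)
```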
- the triggers manager 3318 is responsible for the creation, deletion, activation and deactivation of media module triggers and is optionally responsible for the scheduling and invocation of actions based on triggers (the exception being hardware based triggers). This includes listening for events for enabled triggers, evaluating conditions required to fire the trigger, and invoking the action(s) for the trigger.
- the set of triggerable events and actions needs to be published by each media module subsystem via the configuration manager (i.e. through the managed objects for the subsystem).
- Trigger groups may be created per-user or globally via the registry.
- the configuration manager 3304 is responsible for providing all access to managed objects on the media module. This includes managing the state and availability of hardware objects, compatibility objects, application objects, UI objects and trigger objects.
- the managed objects accessed by the configuration manager are not the actual transient objects produced by applications, but are rather configuration objects, which control and reflect the state of applications, hardware, etc. Note that the media module managed objects are created upon power up and reside on the media module. These objects are available for presentation via the higher-level application server configuration manager.
- the media module logging manager subsystem 3320 is responsible for creating and storing media module specific logs, which include time-stamped events, alarms, statistics, and other information as requested on a per session basis.
- the logging manager provides the requested log information to users via the higher-level logging manager on the application server.
- the logging manager uses the optional media module hard drive to persist this data and may additionally use secondary storage (i.e. a file server) for extended capability.
- the logging manager on the application server treats the logging manager on each media module as a remote file server.
- the statistics manager 3322 is a common shared resource for all application engines (i.e. RMON, Expert, etc.) on the media module.
- This subsystem is used to provide (dispatch) statistics to the application server as well as to the local logging manager.
- the various statistics may be dispatched based on intervals, change occurrence, etc. as defined in the user and SYSTEM registry entries on the application server.
- This subsystem provides dispatch filtering on a per user basis for multiple client sessions. System triggers may be provided by this subsystem to invoke actions based on statistics.
- the actual statistics objects are maintained in the main processor database.
- the alarm manager 3324 is a common shared resource for all application engines (i.e. RMON, Expert, etc.) on the media module.
- This subsystem is used to provide (dispatch) alarms to the application server as well as to the local logging manager.
- the various alarms may be dispatched based on severity, intervals, change occurrence, etc. as defined in the user and SYSTEM registry entries on the application server.
- This subsystem provides dispatch filtering on a per user basis for multiple client sessions. System triggers may be provided by this subsystem to invoke actions based on alarms (i.e. dial a pager, etc.).
- the actual alarm objects are maintained in the main processor database.
- Event Manager 3326 (See Figure 33)
- the event manager 3326, like the alarm manager 3324, is a common shared resource for all application engines (i.e. RMON, Expert, etc.) on the media module.
- This subsystem is used to provide (dispatch) events to the application server as well as to the local logging manager.
- the various events may be dispatched based on severity, intervals, change occurrence, etc. as defined in the user and SYSTEM registry entries.
- This subsystem provides dispatch filtering on a per user basis for multiple client sessions. System triggers may be provided by this subsystem to invoke actions based on events.
- the capture manager subsystem, like the logging manager, is responsible for creating and storing trace files, which include filtered packets as requested on a per session basis.
- the capture manager provides the requested information to various clients including RMON clients and application server clients (e.g. MI expert) based on capture criteria set on a per session basis.
- the capture manager uses the optional media module hard drive to persist this data and may additionally use secondary storage (i.e. a file server) for extended capability.
- the capture manager on the application server treats the capture manager on each media module as a remote file server.
- the flow classification engine 3330 is the first part in the media module processing chain for packets received from the line.
- the flow classification engine receives packets from and controls the filtering for the capture subsystem (see capture subsystem in the hardware description section).
- the flow classification engine is generally responsible for the functionality listed in Table 64.
- Deep application processing (sub-type classification, string-based recognition, etc.) can be packaged into application content experts.
- performance metrics can be packaged into a transport (TPM) expert.
- the capture subsystem may provide two packet queues to the flow classification engine:
- the queues are configured by the flow classification engine and are based on hardware filtering at the ingress of the capture buffer.
- the flow classification engine writes back L3 (or other) addresses for selected flows to the CAM priority filter in the capture subsystem. This gives packets for these flows priority in the capture buffer as well as the ability to reclaim buffers from the non-priority queue.
- This can be thought of as a type of intelligent flow throttling whereby a set of flows can always be processed without dropping packets. This may require an adaptive algorithm for maintaining an average deficit based on capture buffer depth. This will be explained in detail in a later section.
- Figure 39 shows the pipelined (flow processing and expert processing) filter and buffer components provided by the media module.
- two filters are shown (f1 and f2) 3902, 3904.
- the representation of these filters is logical rather than physical in order to provide a generalized description of the overall operation and interaction.
- the arrows pointing downward into the filters represent coefficient paths 3906 for the filters.
- the first filter (f1) 3902 provides ingress filtering for the capture buffer. This filter can be configured to operate in several modes:
- Figure 40 depicts a process 4000 for adaptive priority data filtering according to an embodiment.
- in operation 4002, all buffers are initially allocated to a low priority queue.
- in operation 4004, data is collected from a network segment and stored in the low priority queue.
- in operation 4006, the data is classified into multiple flows. The flows are prioritized into high and low priority flows in operation 4008.
- in operation 4010, high priority flows are stored in a high priority queue prior to processing, while in operation 4012 low priority flows are stored in a low priority queue prior to processing.
- Each of these queues preferably acts as a high performance first in-first out (FIFO) queue. Data in both the high and low priority queues is processed in operation 4014.
- FIFO first in-first out
- buffers from the low priority queue can be reallocated to the high priority queue if the amount of data in the high priority flows surpasses a predetermined threshold. Alternatively, if the amount of data in the high priority queue surpasses a predetermined threshold, high priority flows are selected from the high priority queue and relegated to the low priority queue. These mechanisms allow the flow processor to focus on servicing priority data over non-priority data to prevent data loss.
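- a hedged software sketch of process 4000 follows; the buffer counts, the reallocation step size and the fill threshold are assumptions for illustration (the actual mechanism is managed in hardware as described above).

```python
from collections import deque

class AdaptiveQueues:
    """Sketch of process 4000: two virtual queues sharing a fixed pool of buffers."""

    def __init__(self, total_buffers: int, threshold: float = 0.8) -> None:
        self.high: deque = deque()
        self.low: deque = deque()
        self.high_capacity = 0                 # operation 4002: all buffers start in the low queue
        self.low_capacity = total_buffers
        self.threshold = threshold             # illustrative reallocation trigger

    def enqueue(self, packet: bytes, high_priority: bool) -> None:
        """Operations 4010/4012: store classified packets by priority."""
        if high_priority:
            # Take buffers from the low queue when the high queue has no capacity
            # or is nearly full (the step of 4 buffers is an assumption).
            if self.high_capacity == 0 or len(self.high) >= self.threshold * self.high_capacity:
                self._reallocate(4)
            if len(self.high) < self.high_capacity:
                self.high.append(packet)
        elif len(self.low) < self.low_capacity:
            self.low.append(packet)            # over-capacity packets are simply dropped

    def _reallocate(self, count: int) -> None:
        moved = min(count, self.low_capacity)
        self.low_capacity -= moved
        self.high_capacity += moved

    def service(self):
        """Operation 4014: drain priority packets ahead of non-priority packets."""
        if self.high:
            return self.high.popleft()
        return self.low.popleft() if self.low else None

queues = AdaptiveQueues(total_buffers=16)
queues.enqueue(b"priority-flow-packet", high_priority=True)
queues.enqueue(b"best-effort-packet", high_priority=False)
print(queues.service())   # priority packet is serviced first
```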
- the buffer acts as a raw capture interface, whereby snapshots of data from the line are buffered based on matching include filter criteria.
- This mode will typically use the "forced set" mechanism shown in Figure 39 as driven directly or indirectly by the expert subsystem.
- This forced set is static (provisioned) in this mode and can include patterns that correspond to header fields (up through layer 3) or information from the PMD descriptor, which is prepended to each packet.
- RMON processing and filter 2 are disabled and the expert subsystem or an external client handles all processing of the captured packets.
- the flow classification engine may or may not pre-process the captured packets depending on the configuration.
- one case in which the flow processor does pre-process the packets is when the local expert is the post-processing client of the data.
- the flow processor processes the captured packets in batch once the capture stops and then forwards them to the expert subsystem.
- the second filter (f2) is not required, since the captured packets match exact criteria. If the expert subsystem is not the processing client for the captured data, the flow processor does not analyze the packets and they are simply transferred to the external client through the focus buffer (again the second filter is not used).
- the buffer acts as a FIFO interface, whereby data from the line is continuously buffered in one of two queues based on matching priority filter criteria.
- This mode will typically use the "priority set A" mechanism shown in Figure 39 as driven directly or indirectly by the flow processor subsystem.
- This priority set is dynamic in this mode and can include L3 address pairs that correspond to flows that the flow classification engine has selected to be treated as priority.
- the unbiased mode is used to provide broad coverage of as many flows as can be processed by the flow processor. This allows RMON to paint a full picture of all activity observed on the line.
- the result of this monitoring is a statistically unbiased population of flows, which can be fed on to the expert subsystem for further processing (i.e. APM, etc.).
- the size (number of flows) of the population is dynamic over time and is created by the flow classification engine using the general algorithm set forth in Table 65 below.
- the priority queue depth increases by taking buffers from the non-priority queue (i.e. reducing its depth).
- the flow processor may only service the priority queue.
- the flows that are sent on to the expert subsystem via the focus buffer are based on scoping criteria received from the application expert ("focus set" 3908 in Figure 39) based on its current monitoring mode (i.e. flat, roving, etc.).
- the expert subsystem may use a similar mechanism via the second filter (f2) to reduce the expert sub-population of flows to a level it can keep up with.
- This second filter is actually implemented in software by the expert setting a priority tag in the flow records of selected flows.
- the overall behavior is essentially the same as that of the first filter described above.
- the biased mode is used to provide focused coverage of as many flows as can be processed by the flow processor.
- the expert is in the driver's seat and adds weight to the priority mechanism used for filter (f1) 3902. This affects RMON's ability to paint a full picture of all activity observed on the line.
- the result of this monitoring is a biased population of flows, which can be fed on to the expert subsystem for further processing (i.e. APM, etc.).
- the size (number of flows) of the population is dynamic over time and is created by the flow classification engine using the same algorithm described above, with the exception that the flow discard mechanism is now biased by the expert-provided focus set. This is effectively a weighted random discard traffic shaping technique.
- the expert subsystem may use a similar mechanism via the second filter (f2) 3904 to reduce the expert sub-population of flows to a level it can keep up with.
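- the weighted random discard biasing mentioned above might be sketched as follows; the discard-probability ramp, the weight values and the application names are assumptions chosen only to show how a focus set can bias an otherwise unbiased discard decision.

```python
import random

def keep_priority(flow_app: str, focus_set: set, fill_ratio: float) -> bool:
    """Decide whether a classified flow keeps its priority filter entry.

    fill_ratio (0..1) indicates how full the priority resources are. In unbiased
    mode every flow would face the same discard probability; biasing by the
    expert-provided focus set makes focus flows far less likely to be discarded.
    """
    discard_probability = max(0.0, fill_ratio - 0.5) * 2.0   # ramps up as buffers fill (assumed model)
    weight = 0.1 if flow_app in focus_set else 1.0           # bias toward the focus set (assumed weights)
    return random.random() >= discard_probability * weight

random.seed(7)
focus = {"http", "sybase"}
print(keep_priority("http", focus, fill_ratio=0.9))   # almost always kept
print(keep_priority("dns", focus, fill_ratio=0.9))    # frequently discarded
```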
- The buffer acts as a FIFO interface exactly as in monitoring mode, but the (f1) filtering is overridden using the "forced set" mechanism described in the diagnostic mode to enter the L3 addresses of one or more servers and/or clients.
- This mode does, however, still use the "priority set A" and "priority set B" mechanisms shown in Figure 39 to throttle the number of flows that the flow engine and expert can keep up with.
- A particular server, set of servers, client or set of clients may be entered or "forced" into filter (f1) 3902 by the expert subsystem, which restricts all flows the flow processor sees to this forced set.
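The forced-set override of filter (f1) amounts to a hard admission check on L3 addresses ahead of normal classification. A minimal sketch, assuming a packet object with l3_src/l3_dst attributes and a line_buffer like the one sketched earlier:

```python
def admit_packet(pkt, forced_l3_addresses, line_buffer):
    """Forced-set override of filter (f1): only packets to or from one of the
    forced servers/clients are admitted; priority sets A and B still throttle
    what the flow engine and expert subsequently see."""
    if pkt.l3_src in forced_l3_addresses or pkt.l3_dst in forced_l3_addresses:
        line_buffer.enqueue(pkt)
        return True
    return False
```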
- Figures 41 and 42 present an example of "the life of a packet" within the media module during normal (monitoring) mode. More specifically, Figure 41 is a media module general processing flow 4100. Figure 42 is a high-level media module packet processing sequence diagram 4200.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU2003303131A AU2003303131A1 (en) | 2002-07-26 | 2003-05-19 | Network analyzer co-processor system and method |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US20625402A | 2002-07-26 | 2002-07-26 | |
| US10/206,254 | 2002-07-26 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2004061550A2 true WO2004061550A2 (fr) | 2004-07-22 |
| WO2004061550A3 WO2004061550A3 (fr) | 2007-05-24 |
Family
ID=32710586
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2003/015829 Ceased WO2004061550A2 (fr) | 2002-07-26 | 2003-05-19 | Network analyzer co-processor system and method |
Country Status (2)
| Country | Link |
|---|---|
| AU (1) | AU2003303131A1 (fr) |
| WO (1) | WO2004061550A2 (fr) |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6421720B2 (en) * | 1998-10-28 | 2002-07-16 | Cisco Technology, Inc. | Codec-independent technique for modulating bandwidth in packet network |
| US6668275B1 (en) * | 1999-12-17 | 2003-12-23 | Honeywell International Inc. | System and method for multiprocessor management |
| US7134141B2 (en) * | 2000-06-12 | 2006-11-07 | Hewlett-Packard Development Company, L.P. | System and method for host and network based intrusion detection and response |
| US20020038362A1 (en) * | 2000-09-22 | 2002-03-28 | Ranjit Bhatia | Apparatus for facilitating realtime information interexchange between a telecommunications network and a service provider |
| US7139823B2 (en) * | 2001-08-23 | 2006-11-21 | International Business Machines Corporation | Dynamic intelligent discovery applied to topographic networks |
- 2003
- 2003-05-19 AU AU2003303131A patent/AU2003303131A1/en not_active Abandoned
- 2003-05-19 WO PCT/US2003/015829 patent/WO2004061550A2/fr not_active Ceased
Cited By (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2007061659A3 (fr) * | 2005-11-21 | 2008-01-17 | Motorola Inc | Method and system for processing incoming packets in a communication network |
| US8150862B2 (en) | 2009-03-13 | 2012-04-03 | Accelops, Inc. | Multiple related event handling based on XML encoded event handling definitions |
| WO2010128841A1 (fr) * | 2009-05-08 | 2010-11-11 | Universiti Sains Malaysia | Real-time distributed network monitoring and security monitoring platform (RTDNMS) |
| US9112894B2 (en) | 2009-05-08 | 2015-08-18 | Universiti Sains Malaysia | Real time distributed network monitoring and security monitoring platform (RTD-NMS) |
| WO2010151723A3 (fr) * | 2009-06-26 | 2011-06-16 | Accelops, Inc. | Distributed methodology for approximate event counting |
| US8510432B2 (en) | 2009-06-26 | 2013-08-13 | Accelops, Inc. | Distributed methodology for approximate event counting |
| EP2317695A1 (fr) * | 2009-10-29 | 2011-05-04 | Fluke Corporation | Mixed-mode analysis |
| WO2011141212A1 (fr) * | 2010-05-10 | 2011-11-17 | Nokia Siemens Networks Oy | Data processor |
| WO2011141211A1 (fr) * | 2010-05-10 | 2011-11-17 | Nokia Siemens Networks Oy | Data display |
| GB2493661A (en) * | 2010-05-10 | 2013-02-13 | Nokia Siemens Networks Oy | Data display |
| GB2494063A (en) * | 2010-05-10 | 2013-02-27 | Nokia Siemens Networks Oy | Data processor |
| CN111836020A (zh) * | 2020-07-21 | 2020-10-27 | 苏州科达特种视讯有限公司 | Code stream transmission method and device in a monitoring system, and storage medium |
| CN113709013A (zh) * | 2021-09-09 | 2021-11-26 | 天津津航计算技术研究所 | Domestically produced 6U-CPCI gigabit network module with dual-redundancy switchover capability |
Also Published As
| Publication number | Publication date |
|---|---|
| AU2003303131A8 (en) | 2004-07-29 |
| WO2004061550A3 (fr) | 2007-05-24 |
| AU2003303131A1 (en) | 2004-07-29 |
Similar Documents
| Publication | Title |
|---|---|
| US6801940B1 (en) | Application performance monitoring expert | |
| US7299277B1 (en) | Media module apparatus and method for use in a network monitoring environment | |
| US8059532B2 (en) | Data and control plane architecture including server-side triggered flow policy mechanism | |
| US9419867B2 (en) | Data and control plane architecture for network application traffic management device | |
| US6587432B1 (en) | Method and system for diagnosing network congestion using mobile agents | |
| US6735702B1 (en) | Method and system for diagnosing network intrusion | |
| US7813352B2 (en) | Packet load shedding | |
| US6466984B1 (en) | Method and apparatus for policy-based management of quality of service treatments of network data traffic flows by integrating policies with application programs | |
| US8625448B2 (en) | Method and system for validating network traffic classification in a blade server | |
| CN109714312B (zh) | Collection policy generation method and system based on external threats | |
| WO2003100622A1 (fr) | Switch for a local area network | |
| US7266602B2 (en) | System, method and computer program product for processing accounting information | |
| JP2001519619A (ja) | Fault point measurement and performance testing of communication networks | |
| CN104113433A (zh) | Network operating system for managing and protecting networks | |
| JP2004364306A (ja) | System for controlling client-server connection requests | |
| US20250267067A1 (en) | Dynamic modification of traffic monitoring policies for a containerized environment | |
| WO2004061550A2 (fr) | Network analyzer co-processor system and method | |
| US20020174362A1 (en) | Method and system for network management capable of identifying sources of small packets | |
| WO2008121690A2 (fr) | Data and control plane architecture for a network application traffic management device | |
| US8447880B2 (en) | Network stack instance architecture with selection of transport layers | |
| US8842687B1 (en) | By-pass port facilitating network device failure detection in wide area network topologies | |
| Alhilali et al. | DESIGN AND IMPLEMENT A REAL-TIME NETWORK TRAFFIC MANAGEMENT SYSTEM USING SNMP PROTOCOL. | |
| Bikfalvi | The Management Infrastructure of a Network Measurement System | |
| Madhukar et al. | Indigenous Network Monitoring System | |
| Bikfalvi et al. | The Management infrastructure of a network measurement system for QoS parameters |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A2; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG UZ VC VN YU ZA ZM ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A2; Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | 122 | Ep: pct application non-entry in european phase | |
| | NENP | Non-entry into the national phase | Ref country code: JP |
| | WWW | Wipo information: withdrawn in national office | Country of ref document: JP |