WO2002003522A1 - System and method for performing unattended system availability tests and maintenance for server service programs
- Publication number
 - WO2002003522A1 (application PCT/US2001/020774)
 - Authority
 - WO
 - WIPO (PCT)
 - Prior art keywords
 - program
 - server service
 - service program
 - server
 - properly running
 - Prior art date
 
Classifications
- 
        
- H—ELECTRICITY
 - H04—ELECTRIC COMMUNICATION TECHNIQUE
 - H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
 - H04L43/00—Arrangements for monitoring or testing data switching networks
 - H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
 - H04L43/0805—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
 - H04L43/0817—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
 
 - 
        
- H—ELECTRICITY
 - H04—ELECTRIC COMMUNICATION TECHNIQUE
 - H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
 - H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
 - H04L41/06—Management of faults, events, alarms or notifications
 - H04L41/0654—Management of faults, events, alarms or notifications using network fault recovery
 - H04L41/0659—Management of faults, events, alarms or notifications using network fault recovery by isolating or reconfiguring faulty entities
 - H04L41/0661—Management of faults, events, alarms or notifications using network fault recovery by isolating or reconfiguring faulty entities by reconfiguring faulty entities
 
 - 
        
- H—ELECTRICITY
 - H04—ELECTRIC COMMUNICATION TECHNIQUE
 - H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
 - H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
 - H04L41/50—Network service management, e.g. ensuring proper service fulfilment according to agreements
 - H04L41/5003—Managing SLA; Interaction between SLA and QoS
 - H04L41/5009—Determining service level performance parameters or violations of service level contracts, e.g. violations of agreed response time or mean time between failures [MTBF]
 - H04L41/5012—Determining service level performance parameters or violations of service level contracts, e.g. violations of agreed response time or mean time between failures [MTBF] determining service availability, e.g. which services are available at a certain point in time
 
 - 
        
- G—PHYSICS
 - G06—COMPUTING OR CALCULATING; COUNTING
 - G06F—ELECTRIC DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/008—Reliability or availability analysis
 
 - 
        
- G—PHYSICS
 - G06—COMPUTING OR CALCULATING; COUNTING
 - G06F—ELECTRIC DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
 - G06F11/14—Error detection or correction of the data by redundancy in operation
 - G06F11/1402—Saving, restoring, recovering or retrying
 - G06F11/1415—Saving, restoring, recovering or retrying at system level
 - G06F11/1438—Restarting or rejuvenating
 
 - 
        
- G—PHYSICS
 - G06—COMPUTING OR CALCULATING; COUNTING
 - G06F—ELECTRIC DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/30—Monitoring
 - G06F11/32—Monitoring with visual or acoustical indication of the functioning of the machine
 - G06F11/324—Display of status information
 - G06F11/327—Alarm or error message display
 
 
Definitions
- the present invention relates generally to a system and method for performing unattended system availability tests and maintenance on server service programs.
 - the invention relates to a system and method for providing a notification of a status of a server service program, its related components and sub-components, and for performing maintenance on the same.
 - Server service programs are server-based applications which are designed to be accessed by a plurality of users or "clients" on a networked system. Each server service program may provide and perform a wide variety of tasks for the use of the clients within a network. For instance, a specific server service program may provide a service of retrieving and analyzing business data collected from a plurality of databases throughout a network. Increasingly, server service programs are designed to take advantage of a wider availability of larger amounts of computing resources in order to provide more sophisticated services for clients. Today, a single server service program may have access to and rely on several other servers and server service programs. A server service program of a particular company may, for example, use several databases and servers to coordinate the data for all of the company's sales activities.
 - server service programs have become increasingly difficult to monitor and maintain.
 - One particular reason for this is that any one server service program may rely on many different sub-components to function properly. In turn, each of these sub-components may lie within a different layer of network architecture and use a different set of network interfaces.
 - a status of a server service program may be measured solely by whether such server service program as a whole is continuing to operate (i.e., application level monitoring), although many of the server service program's sub-components may no longer be properly responding.
 - This type of application level monitoring has many drawbacks. First, using this type of application level monitoring, a network administrator is often unaware of a problem until a server service program unsuccessfully attempts to access a failed subcomponent. Secondly, using this type of application level monitoring, it is often impossible to determine the exact cause of a failure. Consequently, many times an entire server service program has to be restarted in order to restart operation of a single failed sub-component, thereby resulting in an increase in administrative time, network resources and other inefficiencies.
 - the system and method of the present invention are advantageous because they provide for monitoring of each component of a server service program and for one or more notifications of a status of the server service program.
 - the present invention relates to a method for performing an unattended system availability test and maintenance on a server service program incorporating at least one task and having access to at least one complimentary program via at least one link.
 - the method comprises the steps of determining whether the at least one link is active; determining whether the at least one complimentary program is properly running; determining whether the server service program is properly running; and determining whether the at least one task within the server service program is properly running.
 - the present invention relates to a system for performing an unattended system availability test and maintenance on a server service program incorporating at least one task and having access to at least one complimentary program via at least one link.
 - the system comprises a first testing element for determining whether the at least one link is active; a second testing element for determining whether the at least one complimentary program is properly running; a third testing element for determining whether the server service program is properly running; and a fourth testing element for determining whether the at least one task within the server service program is properly running.
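The four determinations above (link, complimentary program, server service program, task) can be read as an ordered chain of checks. The following is a minimal Python sketch of that ordering, not the patent's implementation; the `Check` type, the `run_availability_test` function and the lambda probes are hypothetical names introduced only for illustration.

```python
from typing import Callable, List, Optional, Tuple

# Each check is a (label, probe) pair; the probe returns True when healthy.
# These names are illustrative assumptions, not taken from the patent.
Check = Tuple[str, Callable[[], bool]]


def run_availability_test(checks: List[Check]) -> Optional[str]:
    """Run the checks in order; return the label of the first failure, or None."""
    for label, probe in checks:
        if not probe():
            return label
    return None


# Hypothetical probes standing in for the four determinations.
checks = [
    ("link", lambda: True),
    ("complimentary program", lambda: True),
    ("server service program", lambda: False),  # simulated failure
    ("task", lambda: True),
]

first_failure = run_availability_test(checks)
print(first_failure)
```

Stopping at the first failing level is a design choice of this sketch: it points at the lowest layer that is broken instead of reporting every dependent failure above it.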
 - Figure 1 is a simplified schematic representation illustrating one example of a computer network configuration for use with one embodiment of the present invention.
 - Figure 2 is a simplified flowchart of a method for performing a plurality of unattended system availability tests and maintenance for a server service program in accordance with one embodiment of the present invention.
 - FIG. 1 illustrates an example of a network arrangement 50 employing a system and method of the present invention in accordance with a preferred embodiment of the invention. It should be understood that the present invention operates independent of any particular arrangement or mix of network components and that network 50 depicted in Figure 1 is purely illustrative and simplified for the purpose of explanation.
 - network 50 comprises a client 10, an application server 12, a database server 18, and a gateway server 22.
 - Application server 12 includes a processor module 16, a service control manager (SCM) 15, a server service program 14 and a maintenance program 40.
 - Gateway server 22 includes a gateway service program 21.
 - Database server 18 includes a database program 19.
 - Server service program 14 gains access to database 18 via a link 30 and to gateway server 22 via a link 34.
 - Client 10 gains access to the database 18 via a link 26.
 - application server 12 is operated using the Windows NT™ operating system.
 - server service program 14 may be any of a variety of server service programs which rely on a plurality of supporting connections to a plurality of other programs and databases in order to provide services to a plurality of clients.
 - server service program 14 is preferably a sales program such as Siebel Sales Enterprise™ which relies on access to a plurality of sales information from across a business entity's computer network.
 - server service program 14 may rely on a variety of programs, databases and sub-components (hereafter collectively referred to as complimentary programs) for providing service to client 10. These complimentary programs, may include any programs or functions relied upon by the server service program 14 including, for example, those programs performed by a printer server, a web server, a mail server, a database server, a file server, a proxy server or an application server.
 - server service program 14 is shown in Figure 1 relying on database program 19 within database server 18, and gateway service program 21 within gateway server 22. Using these inputs, server service program 14 may provide client 10, for instance, with a plurality of sales analyses and other business services. However, if server service program 14 cannot access these complimentary programs 19 and 21, then server service program 14 will be unable to perform the sales analyses and other business services. Accordingly, maintenance program 40 is provided to test and maintain server service program 14 and its necessary complimentary programs 19 and 21.
 - maintenance program 40 should be co-located with the server service program 14 and the SCM 15 so that maintenance program 40 can directly interrogate the SCM 15. Further, according to a preferred embodiment, maintenance program 40 should not have a need to interact with a user stationed at the client 10 via a graphical user interface (GUI) or another user interface means since the maintenance program 40 is intended to continually monitor the network 50 without a need to wait for the user at the client 10 to respond to a query or a message box. Accordingly, maintenance program 40 is preferably compiled as an unattended executable program so that all message classes that may require interaction via a GUI or other user interface are redirected to, for instance, a Windows NT™ event log.
 - GUI graphical user interface
 - maintenance program 40 is preferably configured to run as a Windows NT™ service. This enables execution of the maintenance program 40 to be initiated when the network 50 starts up without a need for the maintenance program 40 to be launched by the user at the client 10. Additionally, this will allow maintenance program 40 to run in the same conditions and at the same time as any server service program it oversees.
 - Figure 2 is a flowchart illustrating the steps in the method of the present invention. As a first step 51, the maintenance program 40 is initiated and begins to run. Once initiated, in step 52, the maintenance program 40 checks the initiation file (INI file) of application server 12 for information on the configuration of the network 50 and makes a determination as to which complimentary programs are needed for proper operation of server service program 14. This determination is based upon the configuration of the network 50 and a plurality of preprogrammed user information saved within maintenance program 40.
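As an illustration of Step 52, the sketch below reads a small INI file with Python's standard `configparser` module to find which complimentary programs are required. The section names (`complimentary_programs`, `schedule`), the keys and the values are assumptions made for this example; the patent does not specify the layout of the INI file.

```python
import configparser

# Hypothetical INI content; the patent does not describe the file layout.
SAMPLE_INI = """
[complimentary_programs]
database = SalesDB
gateway = gateway-host

[schedule]
poll_seconds = 300
"""


def load_required_programs(ini_text: str) -> dict:
    """Return the name -> target mapping of required complimentary programs."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return dict(parser["complimentary_programs"])


def load_poll_seconds(ini_text: str, default: int = 60) -> int:
    """Return the wait between cycles, falling back to a default if absent."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.getint("schedule", "poll_seconds", fallback=default)


print(load_required_programs(SAMPLE_INI))
print(load_poll_seconds(SAMPLE_INI))
```

Re-reading the file on every cycle, as the flowchart does by looping back to Step 52, lets configuration changes take effect without restarting the maintenance program.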
 - INI initiation file
 - In Step 54, once the necessary complimentary programs are determined, i.e., programs 19 and 21, maintenance program 40 begins conducting a plurality of maintenance checks by checking for a plurality of active links to each of these complimentary programs 19 and 21.
 - the necessary complimentary programs 19 and 21 include at least one database and a link to this database is checked by maintenance program 40 using an ODBC (Open Data Base Connectivity) interface.
 - In Step 56, if the links to any of the necessary complimentary programs 19 and 21 are found to be inactive, maintenance program 40 clears all information regarding any active links and, in Step 58, checks to see whether any of the links to one of the complimentary programs 19 and 21 is unavailable due to a prescheduled downtime. If any of the links to one of the complimentary programs 19 and 21 is found to be unavailable due to a prescheduled downtime, then, in Step 60, maintenance program 40 waits a predetermined amount of time (corresponding to the prescheduled downtime) and, following the prescheduled downtime, in Step 52, maintenance program 40 proceeds to recheck the INI file without initiating any maintenance of the server service program 14 or a notification to the user.
 - the maintenance program 40 can determine that it was in the pre-scheduled downtime status for a system backup because it had previously checked for the link to complimentary program 19 and found it missing, but then in step 54 it found such link active. Moreover, the maintenance program 40 recognizes that it found the link to the complimentary program 19 missing at a time it expected to find such link missing (i.e., prescheduled downtime for backup over a weekend time period). In other words, maintenance program 40 recognizes that the loss of the link is a normal expected loss of connection within predefined time parameters and that the link is now restored. Maintenance program 40 then sends an email notification to warn that it is going to shut down the programs and the servers.
 - Maintenance program 40 then shuts down the tasks within the server service program 14 and the gateway service program 21 (if needed), and next shuts down programs 14 and 21 in order to clear a plurality of service log files. Maintenance program 40 then sends a special command to application server 12 to "restart" with a one-time startup delay. Maintenance program 40 starts at step 51 and proceeds to step 52 to check the INI file and then waits for the one-time startup delay. The server service program 14 and the gateway service program 21 are then started. Following the startup delay, the maintenance program 40 begins at step 54. (These steps are not shown in Figure 2.) It is important to note that maintenance program 40 may perform these shut down and restart steps unattended by the user. Prior to the present invention, these operations would have required user monitoring.
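The prescheduled-downtime test of Steps 58 and 60 amounts to asking whether a lost link falls inside an expected time window. Below is a minimal sketch under the assumption that windows are stored as (weekday, start, end) tuples; that representation, the `in_scheduled_downtime` function and the weekend backup hours are hypothetical and chosen only to mirror the weekend backup example in the text.

```python
from datetime import datetime, time

# Hypothetical weekend backup windows: (weekday, start, end),
# where weekday 0 is Monday and 6 is Sunday.
WINDOWS = [
    (5, time(22, 0), time(23, 59)),  # Saturday evening
    (6, time(0, 0), time(6, 0)),     # Sunday early morning
]


def in_scheduled_downtime(now: datetime, windows) -> bool:
    """Return True if `now` falls inside any configured downtime window."""
    for weekday, start, end in windows:
        if now.weekday() == weekday and start <= now.time() < end:
            return True
    return False


# 2001-06-30 was a Saturday, 2001-07-02 a Monday.
print(in_scheduled_downtime(datetime(2001, 6, 30, 22, 30), WINDOWS))
print(in_scheduled_downtime(datetime(2001, 7, 2, 22, 30), WINDOWS))
```

A missing link inside such a window would be treated as expected and would trigger only a wait, while the same loss outside every window would proceed to notification.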
 - In Step 74, the maintenance program 40 initiates an email notification.
 - the email notification should be sent using an application such as SendSMTP.EXE™ produced by Greyware Automation Products, Inc. or a similar application which does not use a messaging application program interface (MAPI) for messaging.
 - the email notification application may be turned off if the user does not desire to be informed of the status of system 50.
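The notification of Step 74 can be illustrated with Python's standard `email` and `smtplib` modules, which talk SMTP directly and involve no MAPI layer. This is a sketch that substitutes `smtplib` for the SendSMTP.EXE tool named above; the function names, addresses, subject format and the `enabled` switch are assumptions for illustration.

```python
import smtplib
from email.message import EmailMessage


def build_notification(sender: str, recipient: str,
                       component: str, detail: str) -> EmailMessage:
    """Compose a plain-text status message about one component."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[maintenance] {component} needs attention"
    msg.set_content(detail)
    return msg


def send_notification(msg: EmailMessage, host: str,
                      port: int = 25, enabled: bool = True) -> bool:
    """Send the message over SMTP unless notifications are turned off."""
    if not enabled:
        return False  # mirrors the option to turn notification off
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)
    return True


# Hypothetical addresses; nothing is sent because enabled=False.
note = build_notification("monitor@example.com", "admin@example.com",
                          "database link", "Link to the database is down.")
print(note["Subject"])
print(send_notification(note, "mail.example.com", enabled=False))
```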
 - In Step 62, maintenance program 40 proceeds to check each necessary complimentary program 19 and 21 for proper functioning. If any one of the necessary complimentary programs 19 and 21 is unresponsive, in Step 74, maintenance program 40 initiates an email notification to the user, goes back to Step 60 and waits a predetermined length of time, and then loops back to Step 52, and proceeds to recheck the INI file.
 - In Step 66, maintenance program 40 tests the server service program 14 and the gateway service program 21 to make sure each service program is active.
 - both the server service program 14 and the gateway service program 21 are preferably tested using Windows NT™ service calls via the SCM 15. If either the server service program 14 or the gateway service program 21 is found to be unresponsive, in Step 74, maintenance program 40 then initiates an email notification to the user, goes back to Step 60 and waits a predetermined length of time, and then, following the predetermined length of time, loops back to Step 52 where maintenance program 40 proceeds to recheck the INI file.
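One way to picture the service check of Step 66 is to read the `STATE` line that the Windows `sc query <service>` command prints. The sketch below parses such output; it is an assumption that a command-line query stands in for the direct SCM service calls the text describes, and the sample output, service name and function names are illustrative only.

```python
import re

# Matches a line such as "        STATE              : 4  RUNNING".
STATE_RE = re.compile(r"^\s*STATE\s*:\s*(\d+)\s+(\w+)", re.MULTILINE)

# Sample text in the shape printed by `sc query`; the service name is made up.
SAMPLE_RUNNING = """
SERVICE_NAME: ExampleSalesService
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 4  RUNNING
                                (STOPPABLE, NOT_PAUSABLE, ACCEPTS_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
"""

SAMPLE_STOPPED = SAMPLE_RUNNING.replace("4  RUNNING", "1  STOPPED")


def parse_service_state(sc_output: str):
    """Return the state name from `sc query` output, or None if absent."""
    match = STATE_RE.search(sc_output)
    return match.group(2) if match else None


def is_service_running(sc_output: str) -> bool:
    """Return True only when the reported state is RUNNING."""
    return parse_service_state(sc_output) == "RUNNING"


print(parse_service_state(SAMPLE_RUNNING))
print(is_service_running(SAMPLE_STOPPED))
```

In practice the text would come from something like `subprocess.run(["sc", "query", name], capture_output=True, text=True).stdout` on a Windows host.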
 - In Step 68, maintenance program 40 proceeds to verify that each one of a plurality of tasks within the server service program 14 and the gateway service program 21 (if there are any tasks within the gateway service program 21) is properly running.
 - the checks performed in Step 68 are preferably accomplished using a disk operating system (DOS) interface.
 - DOS disk operating system
 - the checks of server service program 14 and gateway service program 21 may include, for example, a plurality of checks of the status of any of the tasks being run by the server service program 14 and gateway service program 21, a plurality of checks for any changes or updates to server service program 14 and gateway service program 21, and a plurality of checks for any updates for tasks not currently being run by the server service program 14 and gateway service program 21. If any one of the necessary tasks of server service program 14 and gateway service program 21 is found to be improperly running, then, in Step 70, the maintenance program 40 may attempt to restart the improperly running one of the necessary tasks. In Step 72, a determination is made as to whether any one of the necessary tasks is unable to be restarted.
 - In Step 74, if any one of the necessary tasks cannot be restarted, maintenance program 40 initiates an email notification to the user, and then goes back to Step 60 and waits a predetermined length of time, and following the predetermined length of time, loops back to Step 52 and proceeds to recheck the INI file. If each of the failed tasks is successfully restarted, then maintenance program 40 goes to Step 60 and waits a predetermined length of time before looping back to Step 52 and proceeding to recheck the INI file.
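Steps 68 through 74 describe a check, restart and escalate pattern for individual tasks. The following sketch shows that pattern with injected callbacks; `restart_failed_tasks`, the task names and the in-memory `state` dictionary are hypothetical stand-ins for whatever interface actually queries and restarts tasks.

```python
from typing import Callable, Iterable, List


def restart_failed_tasks(tasks: Iterable[str],
                         is_running: Callable[[str], bool],
                         restart: Callable[[str], bool]) -> List[str]:
    """Try to restart each stopped task; return those that stayed down."""
    unrecovered = []
    for name in tasks:
        if is_running(name):
            continue  # Step 68: task is healthy, nothing to do
        if not restart(name):  # Step 70: attempt a restart
            unrecovered.append(name)  # Step 72: restart failed
    return unrecovered


# Simulated task table: "report" can be restarted, "sync" cannot.
state = {"import": True, "report": False, "sync": False}
restartable = {"report"}


def fake_restart(name: str) -> bool:
    if name in restartable:
        state[name] = True
    return state[name]


failed = restart_failed_tasks(state, lambda n: state[n], fake_restart)
print(failed)  # a non-empty list would trigger the Step 74 notification
```

Whatever the outcome, the flowchart then waits in Step 60 and returns to Step 52, so the caller of this function would sleep for the configured interval before the next cycle.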
 - the system and method of the present invention may be used in a variety of network configurations in which a server service program relies on additional complimentary program resources to serve a client.
 - the system and method of the invention are also highly flexible and can be easily modified and customized to fit specific situations.
 - the present invention may be used within network arrangements such as a local area network (LAN) including an Ethernet and a Token Ring access method, a metropolitan area network (MAN), and a wide area network (WAN).
 - LAN local area network
 - MAN metropolitan area network
 - WAN wide area network
 - Although the preferred embodiments are discussed with reference to the Windows NT™ environment, the present invention may also be used in a variety of other server platforms and operating environments such as, for example, Windows 95, 98 and 2000, Unix, OS/2 and NetWare.
 - the present invention may be used to test a variety of networking links including those based upon, for example, a Network File System (NFS); a Web NFS; a Server Message Block (SMB); a Samba; a Netware Core Protocol (NCP); a Distributed File System (DFS), and a Common Internet File System (CIFS) architecture, as well as use such transport protocols as, for example, TCP/IP, IPX/SPX, HTTP and NetBEUI.
 - NFS Network File System
 - SMB Server Message Block
 - NCP Netware Core Protocol
 - DFS Distributed File System
 - CIFS Common Internet File System
 
Landscapes
- Engineering & Computer Science (AREA)
 - Computer Networks & Wireless Communication (AREA)
 - Signal Processing (AREA)
 - Environmental & Geological Engineering (AREA)
 - Debugging And Monitoring (AREA)
 
Abstract
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| AU2001271645A AU2001271645A1 (en) | 2000-06-29 | 2001-06-29 | System and method for performing unattended system availability tests and maintenance for server service programs | 
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| US60587700A | 2000-06-29 | 2000-06-29 | |
| US09/605,877 | 2000-06-29 | | |
Publications (1)
| Publication Number | Publication Date | 
|---|---|
| WO2002003522A1 (fr) | 2002-01-10 |
Family
ID=24425566
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date | 
|---|---|---|---|
| PCT/US2001/020774 WO2002003522A1 (fr) | System and method for performing unattended system availability tests and maintenance for server service programs | 2000-06-29 | 2001-06-29 |
Country Status (2)
| Country | Link | 
|---|---|
| AU (1) | AU2001271645A1 (fr) | 
| WO (1) | WO2002003522A1 (fr) | 
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| US8589535B2 (en) | 2009-10-26 | 2013-11-19 | Microsoft Corporation | Maintaining service performance during a cloud upgrade | 
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| US5751941A (en) * | 1996-04-04 | 1998-05-12 | Hewlett-Packard Company | Object oriented framework for testing software | 
| US5854823A (en) * | 1996-09-29 | 1998-12-29 | Mci Communications Corporation | System and method for providing resources to test platforms | 
| US5987633A (en) * | 1997-08-20 | 1999-11-16 | Mci Communications Corporation | System, method and article of manufacture for time point validation | 
- 2001-06-29 WO PCT/US2001/020774 patent/WO2002003522A1/fr active Application Filing
 - 2001-06-29 AU AU2001271645A patent/AU2001271645A1/en not_active Abandoned
 
 
Also Published As
| Publication number | Publication date | 
|---|---|
| AU2001271645A1 (en) | 2002-01-14 | 
Similar Documents
| Publication | Publication Date | Title | 
|---|---|---|
| US7209963B2 (en) | Apparatus and method for distributed monitoring of endpoints in a management region | |
| US7234072B2 (en) | Method and system for making an application highly available | |
| US20040010716A1 (en) | Apparatus and method for monitoring the health of systems management software components in an enterprise | |
| US10474521B2 (en) | Service directory and fault injection management systems and methods | |
| KR100763326B1 (ko) | 분산 시스템에서의 근본 원인 식별 및 문제점 판정을 위한방법 및 장치 | |
| US20040153703A1 (en) | Fault tolerant distributed computing applications | |
| JP4426797B2 (ja) | 依存性に基づく影響シミュレーションおよび脆弱性分析のための方法および装置 | |
| US7505872B2 (en) | Methods and apparatus for impact analysis and problem determination | |
| US20030196148A1 (en) | System and method for peer-to-peer monitoring within a network | |
| US7120684B2 (en) | Method and system for central management of a computer network | |
| US20040003266A1 (en) | Non-invasive automatic offsite patch fingerprinting and updating system and method | |
| CN101777020B (zh) | 一种用于分布式程序的容错方法和系统 | |
| US7890616B2 (en) | System and method for validation of middleware failover behavior | |
| US7469287B1 (en) | Apparatus and method for monitoring objects in a network and automatically validating events relating to the objects | |
| US20030212788A1 (en) | Generic control interface with multi-level status | |
| US7934199B2 (en) | Automated operation of IT resources with multiple choice configuration | |
| US20090094477A1 (en) | System and program product for detecting an operational risk of a node | |
| KR20050007307A (ko) | 컴퓨터 애플리케이션을 모니터링하는 시스템 및 방법 | |
| CN116225607A (zh) | 数据库的管理方法和装置 | |
| US7206975B1 (en) | Internal product fault monitoring apparatus and method | |
| JP2003233512A (ja) | 保守機能付きクライアント監視システム及び監視サーバ及びプログラム並びにクライアント監視・保守方法 | |
| US6151686A (en) | Managing an information retrieval problem | |
| US8607328B1 (en) | Methods and systems for automated system support | |
| WO2002003522A1 (fr) | System and method for performing unattended system availability tests and maintenance for server service programs | |
| US9183068B1 (en) | Various methods and apparatuses to restart a server | 
Legal Events
| Date | Code | Title | Description | 
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
| | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101) | |
| | REG | Reference to national code | Ref country code: DE. Ref legal event code: 8642 |
| | 122 | EP: PCT application non-entry in European phase | |
| | NENP | Non-entry into the national phase | Ref country code: JP |