WO2002003522A1 - System and method for performing unattended system availability tests and maintenance for server service programs

Info

Publication number
WO2002003522A1
Authority
WO
WIPO (PCT)
Prior art keywords
program
server service
service program
server
properly running
Prior art date
Application number
PCT/US2001/020774
Other languages
French (fr)
Inventor
Timothy Haggerty
Original Assignee
Ge Financial Assurance Holdings, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ge Financial Assurance Holdings, Inc.
Priority to AU2001271645A
Publication of WO2002003522A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0659 Management of faults, events, alarms or notifications using network fault recovery by isolating or reconfiguring faulty entities
    • H04L41/0661 Management of faults, events, alarms or notifications using network fault recovery by isolating or reconfiguring faulty entities by reconfiguring faulty entities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50 Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5003 Managing SLA; Interaction between SLA and QoS
    • H04L41/5009 Determining service level performance parameters or violations of service level contracts, e.g. violations of agreed response time or mean time between failures [MTBF]
    • H04L41/5012 Determining service level performance parameters or violations of service level contracts, e.g. violations of agreed response time or mean time between failures [MTBF] determining service availability, e.g. which services are available at a certain point in time
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/008 Reliability or availability analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1438 Restarting or rejuvenating
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display

Definitions

  • the present invention relates generally to a system and method for performing unattended system availability tests and maintenance on server service programs.
  • the invention relates to a system and method for providing a notification of a status of a server service program, its related components and sub-components, and for performing maintenance on the same.
  • Server service programs are server-based applications which are designed to be accessed by a plurality of users or "clients" on a networked system. Each server service program may provide and perform a wide variety of tasks for the use of the clients within a network. For instance, a specific server service program may provide a service of retrieving and analyzing business data collected from a plurality of databases throughout a network. Increasingly, server service programs are designed to take advantage of a wider availability of larger amounts of computing resources in order to provide more sophisticated services for clients. Today, a single server service program may have access to and rely on several other servers and server service programs. A server service program of a particular company may, for example, use several databases and servers to coordinate the data for all of the company's sales activities.
  • server service programs have become increasingly difficult to monitor and maintain.
  • One particular reason for this is that any one server service program may rely on many different sub-components to function properly. In turn, each of these sub-components may lie within a different layer of network architecture and use a different set of network interfaces.
  • a status of a server service program may be measured solely by whether such server service program as a whole is continuing to operate (i.e., application level monitoring), although many of the server service program's sub-components may no longer be properly responding.
  • This type of application level monitoring has many drawbacks. First, using this type of application level monitoring, a network administrator is often unaware of a problem until a server service program unsuccessfully attempts to access a failed subcomponent. Secondly, using this type of application level monitoring, it is often impossible to determine the exact cause of a failure. Consequently, many times an entire server service program has to be restarted in order to restart operation of a single failed sub-component, thereby resulting in an increase in administrative time, network resources and other inefficiencies.
  • the system and method of the present invention are advantageous because they provide for monitoring of each component of a server service program and for one or more notifications of a status of the server service program.
  • the present invention relates to a method for performing an unattended system availability test and maintenance on a server service program incorporating at least one task and having access to at least one complimentary program via at least one link.
  • the method comprises the steps of determining whether the at least one link is active; determining whether the at least one complimentary program is properly running; determining whether the server service program is properly running; and determining whether the at least one task within the server service program is properly running.
  • the present invention relates to a system for performing an unattended system availability test and maintenance on a server service program incorporating at least one task and having access to at least one complimentary program via at least one link.
  • the system comprises a first testing element for determining whether the at least one link is active; a second testing element for determining whether the at least one complimentary program is properly running; a third testing element for determining whether the server service program is properly running; and a fourth testing element for determining whether the at least one task within the server service program is properly running.
  • Figure 1 is a simplified schematic representation illustrating one example of a computer network configuration for use with one embodiment of the present invention.
  • Figure 2 is a simplified flowchart of a method for performing a plurality of unattended system availability tests and maintenance for a server service program in accordance with one embodiment of the present invention.
  • FIG. 1 illustrates an example of a network arrangement 50 employing a system and method of the present invention in accordance with a preferred embodiment of the invention. It should be understood that the present invention operates independent of any particular arrangement or mix of network components and that network 50 depicted in Figure 1 is purely illustrative and simplified for the purpose of explanation.
  • network 50 comprises a client 10, an application server 12, a database server 18, and a gateway server 22.
  • Application server 12 includes a processor module 16, a service control manager (SCM) 15, a server service program 14 and a maintenance program 40.
  • Gateway server 22 includes a gateway service program 21.
  • Database server 18 includes a database program 19.
  • Server service program 14 gains access to database 18 via a link 30 and to gateway server 22 via a link 34.
  • Client 10 gains access to the database 18 via a link 26.
  • application server 12 is operated using the Windows NT™ operating system.
  • server service program 14 may be any of a variety of server service programs which rely on a plurality of supporting connections to a plurality of other programs and databases in order to provide services to a plurality of clients.
  • server service program 14 is preferably a sales program such as Siebel Sales Enterprise™ which relies on access to a plurality of sales information from across a business entity's computer network.
  • server service program 14 may rely on a variety of programs, databases and sub-components (hereafter collectively referred to as complimentary programs) for providing service to client 10. These complimentary programs may include any programs or functions relied upon by the server service program 14 including, for example, those programs performed by a printer server, a web server, a mail server, a database server, a file server, a proxy server or an application server.
  • server service program 14 is shown in Figure 1 relying on database program 19 within database server 18, and gateway service program 21 within gateway server 22. Using these inputs, server service program 14 may provide client 10, for instance, with a plurality of sales analyses and other business services. However, if server service program 14 cannot access these complimentary programs 19 and 21, then server service program 14 will be unable to perform the sales analyses and other business services. Accordingly, maintenance program 40 is provided to test and maintain server service program 14 and its necessary complimentary programs 19 and 21.
  • maintenance program 40 should be co-located with the server service program 14 and the SCM 15 so that maintenance program 40 can directly interrogate the SCM 15. Further, according to a preferred embodiment, maintenance program 40 should not have a need to interact with a user stationed at the client 10 via a graphical user interface (GUI) or another user interface means since the maintenance program 40 is intended to continually monitor the network 50 without a need to wait for the user at the client 10 to respond to a query or a message box. Accordingly, maintenance program 40 is preferably compiled as an unattended executable program so that all message classes that may require interaction via a GUI or other user interface are redirected to, for instance, a Windows NT™ event log.
  • GUI graphical user interface
  • maintenance program 40 is preferably configured to run as a Windows NT™ service. This enables execution of the maintenance program 40 to be initiated when the network 50 starts up without a need for the maintenance program 40 to be launched by the user at the client 10. Additionally, this will allow maintenance program 40 to run in the same conditions and at the same time as any server service program it oversees.
  • Figure 2 is a flowchart illustrating the steps in the method of the present invention. As a first step 51, the maintenance program 40 is initiated and begins to run. Once initiated, in step 52, the maintenance program 40 checks the initiation file (INI file) of application server 12 for information on the configuration of the network 50 and makes a determination as to which complimentary programs are needed for proper operation of server service program 14. This determination is based upon the configuration of the network 50 and a plurality of preprogrammed user information saved within maintenance program 40.
  • initiation file INI file
  • At Step 54, once the necessary complimentary programs are determined, i.e., programs 19 and 21, maintenance program 40 begins conducting a plurality of maintenance checks by checking for a plurality of active links to each of these complimentary programs 19 and 21.
  • the necessary complimentary programs 19 and 21 include at least one database and a link to this database is checked by maintenance program 40 using an ODBC (Open Data Base Connectivity) interface.
  • In Step 56, if the links to any of the necessary complimentary programs 19 and 21 are found to be unresponsive, then maintenance program 40 clears all information regarding any active links and, in Step 58, checks to see whether any of the links to one of the complimentary programs 19 and 21 is unavailable due to a prescheduled downtime. If any of the links to one of the complimentary programs 19 and 21 is found to be unavailable due to a prescheduled downtime, then, in Step 60, maintenance program 40 waits a predetermined amount of time (corresponding to the prescheduled downtime) and, following the prescheduled downtime, in Step 52, maintenance program 40 proceeds to recheck the INI file without initiating any maintenance of the server service program 14 or a notification to the user.
  • the maintenance program 40 can determine that it was in the pre-scheduled downtime status for a system backup because it had previously checked for the link to complimentary program 19 and found it missing, but then in step 54 it found such link active. Moreover, the maintenance program 40 recognizes that it found the link to the complimentary program 19 missing at a time it expected to find such link missing (i.e., prescheduled downtime for backup over a weekend time period). In other words, maintenance program 40 recognizes that the loss of the link is a normal expected loss of connection within predefined time parameters and that the link is now restored. Maintenance program 40 then sends an email notification to warn that it is going to shut down the programs and the servers.
  • Maintenance program 40 then shuts down the tasks within the server service program 14 and the gateway service program 21 (if needed), and next shuts down programs 14 and 21 in order to clear a plurality of service log files. Maintenance program 40 then sends a special command to application server 12 to "restart" with a one-time startup delay. Maintenance program 40 starts at step 51 and proceeds to step 52 to check the INI file and then waits for the one-time startup delay. The server service program 14 and the gateway service program 21 are then started. Following the startup delay, the maintenance program 40 begins at step 54. (These steps are not shown in Figure 2.) It is important to note that maintenance program 40 may perform these shut down and restart steps unattended by the user. Prior to the present invention, these operations would have required user monitoring.
  • In Step 74, the maintenance program 40 initiates an email notification.
  • the email notification should be sent using an application such as SendSMTP.EXE™ produced by Greyware Automation Products, Inc. or a similar application which does not use a messaging application program interface (MAPI) for messaging.
  • the email notification application may be turned off if the user does not desire to be informed of the status of system 50.
  • In Step 62, maintenance program 40 proceeds to check each necessary complimentary program 19 and 21 for proper functioning. If any one of the necessary complimentary programs 19 and 21 is unresponsive, in Step 74, maintenance program 40 initiates an email notification to the user, goes back to Step 60 and waits a predetermined length of time, and then loops back to Step 52, and proceeds to recheck the INI file.
  • In Step 66, maintenance program 40 tests the server service program 14 and the gateway service program 21 to make sure each service program is active.
  • both the server service program 14 and the gateway service program 21 are preferably tested using Windows NT™ service calls via the SCM 15. If either the server service program 14 or the gateway service program 21 is found to be unresponsive, in Step 74, maintenance program 40 then initiates an email notification to the user, goes back to Step 60 and waits a predetermined length of time, and then, following the predetermined length of time, loops back to Step 52 where maintenance program 40 proceeds to recheck the INI file.
  • In Step 68, maintenance program 40 verifies that each one of a plurality of tasks within the server service program 14 and the gateway service program 21 (if there are any tasks within the gateway service program 21) is properly running.
  • the checks performed in Step 68 are preferably accomplished using a disk operating system (DOS) interface.
  • DOS disk operating system
  • the checks of server service program 14 and gateway service program 21 may include, for example, a plurality of checks of the status of any of the tasks being run by the server service program 14 and gateway service program 21, a plurality of checks for any changes or updates to server service program 14 and gateway service program 21, and a plurality of checks for any updates for tasks not currently being run by the server service program 14 and gateway service program 21. If any one of the necessary tasks of server service program 14 and gateway service program 21 is found to be improperly running, then, in Step 70, the maintenance program 40 may attempt to restart the improperly running one of the necessary tasks. In Step 72, a determination is made as to whether any one of the necessary tasks is unable to be restarted.
  • If a task cannot be restarted, then, in Step 74, maintenance program 40 initiates an email notification to the user, and then goes back to Step 60 and waits a predetermined length of time, and following the predetermined length of time, loops back to Step 52 and proceeds to recheck the INI file. If each of the failed tasks is successfully restarted, then maintenance program 40 goes to Step 60 and waits a predetermined length of time before looping back to Step 52 and proceeding to recheck the INI file.
  • the system and method of the present invention may be used in a variety of network configurations in which a server service program relies on additional complimentary program resources to serve a client.
  • the system and method of the invention are also highly flexible and can be easily modified and customized to fit specific situations.
  • the present invention may be used within network arrangements such as a local area network (LAN) including an Ethernet and a Token Ring access method, a metropolitan area network (MAN), and a wide area network (WAN).
  • LAN local area network
  • MAN metropolitan area network
  • WAN wide area network
  • Although the preferred embodiments are discussed with reference to the Windows NT™ environment, the present invention may also be used in a variety of other server platforms and operating environments such as, for example, Windows 95, 98 and 2000, Unix, OS/2 and NetWare.
  • the present invention may be used to test a variety of networking links including those based upon, for example, a Network File System (NFS); a Web NFS; a Server Message Block (SMB); a Samba; a Netware Core Protocol (NCP); a Distributed File System (DFS), and a Common Internet File System (CIFS) architecture, as well as use such transport protocols as, for example, TCP/IP, IPX/SPX, HTTP and NetBEUI.
  • NFS Network File System
  • SMB Server Message Block
  • NCP Netware Core Protocol
  • DFS Distributed File System
  • CIFS Common Internet File System
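
The task check, restart attempt and notification of Steps 68 through 74 listed above can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the task names, the `restart` callback and the e-mail addresses are placeholders, and Python's standard `email` package is swapped in for the SendSMTP.EXE™ utility the text mentions. The sketch only builds the message; an actual deployment could hand it to `smtplib.SMTP(...).send_message(msg)`, which likewise does not rely on MAPI.

```python
from email.message import EmailMessage
from typing import Callable, Dict, List


def check_and_restart_tasks(
    task_status: Dict[str, bool],
    restart: Callable[[str], bool],
) -> List[str]:
    """Try to restart every task that is not running (Step 70).

    Returns the names of tasks that could not be restarted (Step 72).
    """
    unrecovered = []
    for name, running in task_status.items():
        if not running and not restart(name):
            unrecovered.append(name)
    return unrecovered


def build_notification(unrecovered: List[str], sender: str, recipient: str) -> EmailMessage:
    """Build the e-mail that Step 74 would send for unrecovered tasks."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Server service program: tasks could not be restarted"
    msg.set_content("\n".join(unrecovered))
    return msg


# Hypothetical task table: True means the task is running properly.
status = {"report writer": False, "sync manager": False, "object manager": True}


def fake_restart(name: str) -> bool:
    # Stand-in for a real restart command; only "sync manager" recovers.
    return name == "sync manager"


unrecovered = check_and_restart_tasks(status, fake_restart)
if unrecovered:
    msg = build_notification(unrecovered, "monitor@example.com", "admin@example.com")
    print(msg["Subject"])
    print(msg.get_content())
```

Separating the restart pass from the notification keeps the e-mail optional, which matches the statement that notification may be turned off.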

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Debugging And Monitoring (AREA)

Abstract

A system and method are provided for use in performing an unattended system availability test and maintenance on a server service program. The method of this invention includes the steps of determining whether each necessary link (54) is active; determining whether each necessary complimentary program (62) is properly running; and determining whether each necessary task within said server service program (68) is properly running.

Description

SYSTEM AND METHOD FOR PERFORMING UNATTENDED
SYSTEM AVAILABILITY TESTS AND MAINTENANCE FOR
SERVER SERVICE PROGRAMS
FIELD OF THE INVENTION
The present invention relates generally to a system and method for performing unattended system availability tests and maintenance on server service programs.
Specifically, the invention relates to a system and method for providing a notification of a status of a server service program, its related components and sub-components, and for performing maintenance on the same.
BACKGROUND OF THE INVENTION
Server service programs are server-based applications which are designed to be accessed by a plurality of users or "clients" on a networked system. Each server service program may provide and perform a wide variety of tasks for the use of the clients within a network. For instance, a specific server service program may provide a service of retrieving and analyzing business data collected from a plurality of databases throughout a network. Increasingly, server service programs are designed to take advantage of a wider availability of larger amounts of computing resources in order to provide more sophisticated services for clients. Today, a single server service program may have access to and rely on several other servers and server service programs. A server service program of a particular company may, for example, use several databases and servers to coordinate the data for all of the company's sales activities.
One downside to the increased sophistication of server service programs is that they have become increasingly difficult to monitor and maintain. One particular reason for this is that any one server service program may rely on many different sub-components to function properly. In turn, each of these sub-components may lie within a different layer of network architecture and use a different set of network interfaces.
Consequently, a status of a server service program may be measured solely by whether such server service program as a whole is continuing to operate (i.e., application level monitoring), although many of the server service program's sub-components may no longer be properly responding.
This type of application level monitoring has many drawbacks. First, using this type of application level monitoring, a network administrator is often unaware of a problem until a server service program unsuccessfully attempts to access a failed subcomponent. Secondly, using this type of application level monitoring, it is often impossible to determine the exact cause of a failure. Consequently, many times an entire server service program has to be restarted in order to restart operation of a single failed sub-component, thereby resulting in an increase in administrative time, network resources and other inefficiencies.
SUMMARY OF THE INVENTION
It is therefore an object of the present invention to provide a system and method for performing a plurality of unattended system availability tests and maintenance on a server service program. The system and method of the present invention are advantageous because they provide for monitoring of each component of a server service program and for one or more notifications of a status of the server service program.
It is a further object of the invention to provide a system and method for restarting one or more failed sub-components of a server service program and for providing one or more notifications of such failed sub-components.
Additional objects and advantages of the present invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and attained by means of instrumentalities and combinations, particularly pointed out in the appended claims.
To achieve the objects and in accordance with the purpose of the invention, as embodied and broadly described herein, in its broadest aspects, the present invention relates to a method for performing an unattended system availability test and maintenance on a server service program incorporating at least one task and having access to at least one complimentary program via at least one link. The method comprises the steps of determining whether the at least one link is active; determining whether the at least one complimentary program is properly running; determining whether the server service program is properly running; and determining whether the at least one task within the server service program is properly running.
In another aspect, the present invention relates to a system for performing an unattended system availability test and maintenance on a server service program incorporating at least one task and having access to at least one complimentary program via at least one link. The system comprises a first testing element for determining whether the at least one link is active; a second testing element for determining whether the at least one complimentary program is properly running; a third testing element for determining whether the server service program is properly running; and a fourth testing element for determining whether the at least one task within the server service program is properly running.
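The four determinations named in the method and system above can be pictured as an ordered list of probes that stops at the first failure. The following Python sketch is illustrative only: the `Check` type, the probe functions and their return values are assumptions made for the example, not part of the disclosed system.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Check:
    name: str                   # label reported when the probe fails
    probe: Callable[[], bool]   # returns True when the element is healthy


def run_availability_test(checks: List[Check]) -> Optional[str]:
    """Run the probes in order; return the first failing name, or None."""
    for check in checks:
        if not check.probe():
            return check.name
    return None


# Hypothetical probes standing in for the four testing elements.
checks = [
    Check("link", lambda: True),
    Check("complimentary program", lambda: True),
    Check("server service program", lambda: False),
    Check("task", lambda: True),
]

print(run_availability_test(checks))  # prints: server service program
```

Running the probes from the outermost dependency inward means a reported failure points at a specific layer rather than at the application as a whole.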
The accompanying drawings, which are incorporated in and constitute a part of this specification, together with the description, serve to further explain the principles of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a simplified schematic representation illustrating one example of a computer network configuration for use with one embodiment of the present invention.
Figure 2 is a simplified flowchart of a method for performing a plurality of unattended system availability tests and maintenance for a server service program in accordance with one embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
Reference will now be made in detail to the present preferred embodiment of the invention, an example of which is illustrated in the accompanying drawings in which like reference characters refer to corresponding elements.
Figure 1 illustrates an example of a network arrangement 50 employing a system and method of the present invention in accordance with a preferred embodiment of the invention. It should be understood that the present invention operates independent of any particular arrangement or mix of network components and that network 50 depicted in Figure 1 is purely illustrative and simplified for the purpose of explanation. As shown in Figure 1, network 50 comprises a client 10, an application server 12, a database server 18, and a gateway server 22. Application server 12 includes a processor module 16, a service control manager (SCM) 15, a server service program 14 and a maintenance program 40. Gateway server 22 includes a gateway service program 21. Database server 18 includes a database program 19. Server service program 14 gains access to database 18 via a link 30 and to gateway server 22 via a link 34. Client 10 gains access to the database 18 via a link 26. According to a preferred embodiment, application server 12 is operated using the Windows NT™ operating system.
In accordance with the present invention, server service program 14 may be any of a variety of server service programs which rely on a plurality of supporting connections to a plurality of other programs and databases in order to provide services to a plurality of clients. According to a preferred embodiment of the present invention, server service program 14 is preferably a sales program such as Siebel Sales Enterprise™ which relies on access to a plurality of sales information from across a business entity's computer network.
Within the scope of the present invention, server service program 14 may rely on a variety of programs, databases and sub-components (hereafter collectively referred to as complimentary programs) for providing service to client 10. These complimentary programs may include any programs or functions relied upon by the server service program 14 including, for example, those programs performed by a printer server, a web server, a mail server, a database server, a file server, a proxy server or an application server. For purposes of illustration, server service program 14 is shown in Figure 1 relying on database program 19 within database server 18, and gateway service program 21 within gateway server 22. Using these inputs, server service program 14 may provide client 10, for instance, with a plurality of sales analyses and other business services. However, if server service program 14 cannot access these complimentary programs 19 and 21, then server service program 14 will be unable to perform the sales analyses and other business services. Accordingly, maintenance program 40 is provided to test and maintain server service program 14 and its necessary complimentary programs 19 and 21.
According to a preferred embodiment, maintenance program 40 should be co-located with the server service program 14 and the SCM 15 so that maintenance program 40 can directly interrogate the SCM 15. Further, according to a preferred embodiment, maintenance program 40 should not have a need to interact with a user stationed at the client 10 via a graphical user interface (GUI) or another user interface means since the maintenance program 40 is intended to continually monitor the network 50 without a need to wait for the user at the client 10 to respond to a query or a message box. Accordingly, maintenance program 40 is preferably compiled as an unattended executable program so that all message classes that may require interaction via a GUI or other user interface are redirected to, for instance, a Windows NT™ event log.
Further, according to a preferred embodiment, maintenance program 40 is preferably configured to run as a Windows NT™ service. This enables execution of the maintenance program 40 to be initiated when the network 50 starts up without a need for the maintenance program 40 to be launched by the user at the client 10. Additionally, this will allow maintenance program 40 to run in the same conditions and at the same time as any server service program it oversees.
Figure 2 is a flowchart illustrating the steps in the method of the present invention. As a first step 51, the maintenance program 40 is initiated and begins to run. Once initiated, in step 52, the maintenance program 40 checks the initiation file (INI file) of application server 12 for information on the configuration of the network 50 and makes a determination as to which complimentary programs are needed for proper operation of server service program 14. This determination is based upon the configuration of the network 50 and a plurality of preprogrammed user information saved within maintenance program 40.
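As a rough illustration of step 52, the Python sketch below reads an INI file and derives the list of complimentary programs to test. The section name `maintenance`, the keys `complimentary_programs` and `poll_seconds`, and the sample values are all assumptions; the actual layout of the INI file is not given here.

```python
import configparser
from typing import List

# Hypothetical INI content; the real file layout is not described.
SAMPLE_INI = """
[maintenance]
complimentary_programs = database, gateway
poll_seconds = 300
"""


def required_programs(ini_text: str) -> List[str]:
    """Return the complimentary programs named in the INI text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # fallback="" covers a missing section or key without raising.
    raw = parser.get("maintenance", "complimentary_programs", fallback="")
    return [name.strip() for name in raw.split(",") if name.strip()]


print(required_programs(SAMPLE_INI))  # prints: ['database', 'gateway']
```

Rereading the file on every pass, as the loop back to step 52 does, lets configuration changes take effect without restarting the monitor.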
At Step 54, once the necessary complimentary programs are determined, i.e., programs 19 and 21, maintenance program 40 begins conducting a plurality of maintenance checks by checking for a plurality of active links to each of these complimentary programs 19 and 21. According to a preferred embodiment, the necessary complimentary programs 19 and 21 include at least one database and a link to this database is checked by maintenance program 40 using an ODBC (Open Database Connectivity) interface.
In Step 56, if the links to any of the necessary complimentary programs 19 and 21 are found to be unresponsive, then maintenance program 40 clears all information regarding any active links and, in Step 58, checks to see whether any of the links to one of the complimentary programs 19 and 21 is unavailable due to a prescheduled downtime. If any of the links to one of the complimentary programs 19 and 21 is found to be unavailable due to a prescheduled downtime, then, in Step 60, maintenance program 40 waits a predetermined amount of time (corresponding to the prescheduled downtime) and, following the prescheduled downtime, in Step 52, maintenance program 40 proceeds to recheck the INI file without initiating any maintenance of the server service program 14 or a notification to the user.
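The branch between a prescheduled downtime (Steps 58 and 60) and an unexpected outage (Step 74) might be sketched as below. The downtime windows, the function names and the returned action labels are hypothetical; the source only says that the downtime is prescheduled, for example for a weekend backup.

```python
from datetime import datetime, time

# Hypothetical schedule entries: (weekday, start, end), with Monday == 0.
DOWNTIME_WINDOWS = [
    (5, time(22, 0), time(23, 59, 59)),  # Saturday late evening
    (6, time(0, 0), time(6, 0)),         # Sunday early morning
]


def in_scheduled_downtime(now, windows=DOWNTIME_WINDOWS):
    """True if `now` falls inside any configured downtime window."""
    return any(
        now.weekday() == day and start <= now.time() <= end
        for day, start, end in windows
    )


def next_action(link_ok, now):
    """Decide what follows a single link check."""
    if link_ok:
        return "check_programs"    # continue to Step 62
    if in_scheduled_downtime(now):
        return "wait"              # Step 60, no notification
    return "notify_then_wait"      # Step 74, then Step 60
```

For instance, a failed link at 22:30 on Saturday, 1 July 2000 maps to `"wait"`, while the same failure on Monday morning maps to `"notify_then_wait"`.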
Following re-checking the INI file in step 52 after the pre-scheduled downtime in step 58, the maintenance program 40 can determine that it was in the pre-scheduled downtime status for a system backup because it had previously checked for the link to complimentary program 19 and found it missing, but then in step 54 it found such link active. Moreover, the maintenance program 40 recognizes that it found the link to the complimentary program 19 missing at a time it expected to find such link missing (i.e., prescheduled downtime for backup over a weekend time period). In other words, maintenance program 40 recognizes that the loss of the link is a normal expected loss of connection within predefined time parameters and that the link is now restored. Maintenance program 40 then sends an email notification to warn that it is going to shut down the programs and the servers. Maintenance program 40 then shuts down the tasks within the server service program 14 and the gateway service program 21 (if needed), and next shuts down programs 14 and 21 in order to clear a plurality of service log files. Maintenance program 40 then sends a special command to application server 12 to "restart" with a one-time startup delay. Maintenance program 40 starts at step 51 and proceeds to step 52 to check the INI file and then waits for the one-time startup delay. The server service program 14 and the gateway service program 21 are then started. Following the startup delay, the maintenance program 40 begins at step 54. (These steps are not shown in Figure 2.) It is important to note that maintenance program 40 may perform these shut down and restart steps unattended by the user. Prior to the present invention, these operations would have required user monitoring.
If any one of the complimentary programs 19 and 21 is determined to be unavailable during a time when such complimentary program should be up and running, then in Step 74, the maintenance program 40 initiates an email notification. According to the preferred embodiment, the email notification should be sent using an application such as SendSMTP.EXE™ produced by Greyware Automation Products, Inc. or a similar application which does not use a messaging application program interface (MAPI) for messaging. The email notification application may be turned off if the user does not desire to be informed of the status of system 50. Once the email notification is sent in Step 74, maintenance program 40 waits a predetermined length of time in Step 60 and then loops back to Step 52 to recheck the INI file.
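A notification over plain SMTP, without MAPI, could look roughly like the following sketch. It uses Python's standard `smtplib` and `email` modules in place of SendSMTP.EXE, and the subject format, the default host and port, and the `enabled` switch are assumptions made for illustration.

```python
import smtplib
from email.message import EmailMessage


def build_alert(sender, recipient, program, detail):
    """Compose the notification for one unavailable program."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[maintenance] {program} unavailable"
    msg.set_content(detail)
    return msg


def send_alert(msg, host="localhost", port=25, enabled=True):
    """Send over plain SMTP; return False when notification is turned off."""
    if not enabled:
        return False
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.send_message(msg)
    return True
```

Keeping message construction separate from delivery means the `enabled` flag can model the option of turning notification off without touching the rest of the loop.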
Once each link to the complimentary programs 19 and 21 is confirmed to be active and open, in Step 62, maintenance program 40 proceeds to check each necessary complimentary program 19 and 21 for proper functioning. If any one of the necessary complimentary programs 19 and 21 is unresponsive, in Step 74, maintenance program 40 initiates an email notification to the user, goes back to Step 60 and waits a predetermined length of time, and then loops back to Step 52, and proceeds to recheck the INI file.
Once each of the necessary complimentary programs 19 and 21 is tested and any errors reported via email notification, in Step 66, maintenance program 40 tests the server service program 14 and the gateway service program 21 to make sure each service program is active. According to the preferred embodiment, both the server service program 14 and the gateway service program 21 are preferably tested using Windows NT™ service calls via the SCM 15. If either the server service program 14 or the gateway service program 21 is found to be unresponsive, in Step 74, maintenance program 40 then initiates an email notification to the user, goes back to Step 60 and waits a predetermined length of time, and then, following the predetermined length of time, loops back to Step 52 where maintenance program 40 proceeds to recheck the INI file.
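One way to approximate the Step 66 check from a script is to query the service control manager through the Windows `sc query` command and read the reported state. This is a stand-in for the direct service calls described above, not the patent's implementation; the function names are hypothetical, and the parser assumes the usual `STATE : <code> <name>` line in the command's output.

```python
import re
import subprocess


def parse_service_state(sc_output):
    """Extract the state name (e.g. 'RUNNING') from `sc query` output, or None."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def service_is_running(service_name):
    """Windows only: ask the SCM about one service through the `sc` command."""
    result = subprocess.run(
        ["sc", "query", service_name], capture_output=True, text=True
    )
    return result.returncode == 0 and parse_service_state(result.stdout) == "RUNNING"
```

Separating the parser from the subprocess call lets the parsing logic be exercised on any platform with captured output.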
Once server service program 14 and gateway service program 21 are each confirmed to be active, then, in Step 68, maintenance program 40 proceeds to check to verify that each one of a plurality of tasks within the server service program 14 and the gateway service program 21 (if there are any tasks within the gateway service program 21) is properly running. According to a preferred embodiment, the checks performed in Step 68 are preferably accomplished using a disk operating system (DOS) interface. The checks of server service program 14 and gateway service program 21 may include, for example, a plurality of checks of the status of any of the tasks being run by the server service program 14 and gateway service program 21, a plurality of checks for any changes or updates to server service program 14 and gateway service program 21, and a plurality of checks for any updates for tasks not currently being run by the server service program 14 and gateway service program 21. If any one of the necessary tasks of server service program 14 and gateway service program 21 is found to be improperly running, then, in Step 70, the maintenance program 40 may attempt to restart the improperly running one of the necessary tasks. In Step 72, a determination is made as to whether any one of the necessary tasks is unable to be restarted. If so, in Step 74, maintenance program 40 initiates an email notification to the user, and then goes back to Step 60 and waits a predetermined length of time, and following the predetermined length of time, loops back to Step 52 and proceeds to recheck the INI file. If each of the failed tasks is successfully restarted, then maintenance program 40 goes to Step 60 and waits a predetermined length of time before looping back to Step 52 and proceeding to recheck the INI file.
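The check, restart and notify sequence of Steps 68 through 74 can be sketched as a single pass over the task list. The callables `is_running`, `restart` and `notify` are hypothetical hooks supplied by the caller; how tasks are actually probed or restarted through the DOS interface is not specified here.

```python
def run_task_checks(tasks, is_running, restart, notify):
    """One pass over Steps 68-74; return the tasks that could not be restarted."""
    failed = []
    for task in tasks:
        if is_running(task):          # Step 68: task is properly running
            continue
        restart(task)                 # Step 70: attempt a restart
        if not is_running(task):      # Step 72: did the restart succeed?
            failed.append(task)
    if failed:
        notify(failed)                # Step 74: notify the user
    # In either case the caller then waits (Step 60) and rechecks the INI file (Step 52).
    return failed
```

Passing the probes in as arguments keeps the control flow independent of any particular operating system interface.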
As is readily apparent from the above detailed description, the system and method of the present invention may be used in a variety of network configurations in which a server service program relies on additional complimentary program resources to serve a client. The system and method of the invention are also highly flexible and can be easily modified and customized to fit specific situations. For instance, the present invention may be used within network arrangements such as a local area network (LAN) including an Ethernet and a Token Ring access method, a metropolitan area network (MAN), and a wide area network (WAN). Further, although the preferred embodiments are discussed with reference to the Windows NT™ environment, the present invention may also be used in a variety of other server platforms and operating environments such as, for example, Windows 95, 98 and 2000, Unix, OS/2 and NetWare.
Additionally, the present invention may be used to test a variety of networking links including those based upon, for example, a Network File System (NFS); a Web NFS; a Server Message Block (SMB); a Samba; a NetWare Core Protocol (NCP); a Distributed File System (DFS), and a Common Internet File System (CIFS) architecture, as well as use such transport protocols as, for example, TCP/IP, IPX/SPX, HTTP and NetBEUI.
The invention has been described with particular reference to preferred embodiments which are intended to be illustrative rather than restrictive. Alternative embodiments will become apparent to those skilled in the art to which this invention pertains without departing from its spirit and scope. Thus, such variations and modifications of the present invention can be effected within the spirit and scope of the following claims.

Claims

WHAT IS CLAIMED IS:
1. A method for performing an unattended system availability test and maintenance on a server service program incorporating at least one task and having access to at least one complimentary program via at least one link, said method comprising the steps of: determining whether said at least one link is active; determining whether said at least one complimentary program is properly running; determining whether said server service program is properly running; and determining whether said at least one task within said server service program is properly running.
2. The method of claim 1, wherein said server service program operates within a Windows NT™ operating system and wherein the determination of the proper running of said server service program is made using a Windows NT™ service call.
3. The method of claim 2, wherein said at least one complimentary program is a database program and said at least one link is a database link.
4. The method of claim 3, wherein the testing of said at least one database link and said database program is performed using an ODBC interface.
5. The method of claim 4, wherein said determination as to whether said at least one task within said server service program is properly running is made using a DOS interface.
6. The method of claim 5, wherein said method further comprises the steps of: providing notification to a system user if a determination is made that said at least one database link is not active; providing notification to a system user if a determination is made that said at least one database is not properly running; providing notification to a system user if a determination is made that said server service program is not properly running; and providing notification to a system user if a determination is made that said at least one task within said server service program is not properly running.
7. The method of claim 6, wherein said notification is provided via email and does not utilize a MAPI programming interface.
8. The method of claim 7, wherein said method further comprises the step of: restarting said at least one task within said server service program when a determination is made that said at least one task is not properly running.
9. The method of claim 8, wherein said server service program has access to at least one gateway service program and wherein said method further comprises the steps of: testing whether said gateway service is properly running using a Windows NT™ service call; and providing notification to a system user if a determination is made that said gateway service is not properly running.
10. A system for performing an unattended system availability test and maintenance on a server service program incorporating at least one task and having access to at least one complimentary program via at least one link, said system comprising: a first testing element for determining whether said at least one link is active; a second testing element for determining whether said at least one complimentary program is properly running; a third testing element for determining whether said server service program is properly running; and a fourth testing element for determining whether said at least one task within said server service program is properly running.
11. The system of claim 10, wherein said server service program operates within a Windows NT™ operating system and wherein said third testing element is a Windows NT™ service call.
12. The system of claim 11, wherein said at least one complimentary program is a database program and said at least one link is a database link and wherein each of said first and second testing elements is an ODBC interface.
13. The system of claim 12, wherein each of said third and fourth testing elements is a DOS interface.
14. The system of claim 13, wherein said system further comprises: a notification element for providing a notification if a determination is made that any one of said at least one database link is not active, said at least one database is not properly running, said server service program is not properly running, or if at least one task within said server service program is not properly running.
15. The system of claim 14, wherein said notification element is an email program not utilizing a MAPI programming interface.
16. The system of claim 14, wherein said system further comprises: a restarting element for restarting said at least one task within said server service program when a determination is made that said at least one task is not properly running.
17. The system of claim 15, wherein said server service program has access to at least one gateway service program and wherein said system further comprises: a Windows NT™ service call to determine whether said gateway service is properly running; and an email notification element for providing a notification to a system user if a determination is made that said gateway service is not properly running, wherein said email notification element does not utilize the MAPI programming interface.
PCT/US2001/020774 2000-06-29 2001-06-29 System and method for performing unattended system availability tests and maintenance for server service programs WO2002003522A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001271645A AU2001271645A1 (en) 2000-06-29 2001-06-29 System and method for performing unattended system availability tests and maintenance for server service programs

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US60587700A 2000-06-29 2000-06-29
US09/605,877 2000-06-29

Publications (1)

Publication Number Publication Date
WO2002003522A1 true WO2002003522A1 (en) 2002-01-10

Family

ID=24425566

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/020774 WO2002003522A1 (en) 2000-06-29 2001-06-29 System and method for performing unattended system availability tests and maintenance for server service programs

Country Status (2)

Country Link
AU (1) AU2001271645A1 (en)
WO (1) WO2002003522A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8589535B2 (en) 2009-10-26 2013-11-19 Microsoft Corporation Maintaining service performance during a cloud upgrade

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5751941A (en) * 1996-04-04 1998-05-12 Hewlett-Packard Company Object oriented framework for testing software
US5854823A (en) * 1996-09-29 1998-12-29 Mci Communications Corporation System and method for providing resources to test platforms
US5987633A (en) * 1997-08-20 1999-11-16 Mci Communications Corporation System, method and article of manufacture for time point validation

Also Published As

Publication number Publication date
AU2001271645A1 (en) 2002-01-14

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP