US20130159249A1 - Hybrid data replication - Google Patents
- Publication number
- US20130159249A1 (application US13/326,892)
- Authority
- US
- United States
- Prior art keywords
- replication
- log
- database
- type
- block
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2097—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements maintaining the standby controller/processing unit updated
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1658—Data re-synchronization of a redundant component, or initial sync of replacement, additional or spare unit
- G06F11/1662—Data re-synchronization of a redundant component, or initial sync of replacement, additional or spare unit the resynchronized component or unit being a persistent storage device
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2358—Change logging, detection, and notification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2379—Updates performed during online database operations; commit processing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/80—Database-specific techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/84—Using snapshots, i.e. a logical point-in-time copy of the data
Definitions
- the present invention relates generally to data processing environments and, more particularly, to a system providing methodology for hybrid data replication.
- Computers are very powerful tools for storing and providing access to vast amounts of information.
- Computer databases are a common mechanism for storing information on computer systems while providing easy access to users.
- a typical database is an organized collection of related information stored as “records” having “fields” of information.
- a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
- a database management system or DBMS is typically provided as a software cushion or layer.
- the DBMS shields the database user from knowing or even caring about the underlying hardware-level details.
- all requests from users for access to the data are processed by the DBMS. For example, information may be added or removed from data files, information retrieved from or updated in such files, and so forth, all without user knowledge of the underlying system implementation.
- the DBMS provides users with a conceptual view of the database that is removed from the hardware level.
- the general construction and operation of database management systems is well known in the art. See e.g., Date, C., “An Introduction to Database Systems, Seventh Edition”, Addison Wesley, 2000.
- a replicate database is a duplicate or mirror copy of the primary database (or a subset of the primary database) that is maintained either locally at the same site as the primary database, or remotely at a different location than the primary database.
- the availability of a replicate copy of the primary database enables a user (e.g., a corporation or other business) to reconstruct a copy of the database in the event of the loss, destruction, or unavailability of the primary database.
- Database replication technologies comprise a mechanism or tool for duplicating data from a primary source or “publisher” to one or more “subscribers”. The data may also be transformed during this process of replication.
- a primary database may publish items of data to a number of different subscribers. Also, in many cases, each of these subscribers is only interested in receiving a subset of the data maintained by the primary database. In this type of environment, each of the subscribers specifies particular types or items of data (“subscribed items”) that the subscriber wants to receive from the primary database.
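- The subscription model described above can be illustrated with a minimal sketch. It assumes that a "subscribed item" is simply a table name and that a change is a `(table, row)` pair; the `Subscriber` class and `publish` function are hypothetical names introduced for illustration and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Subscriber:
    """Hypothetical subscriber that only wants changes for certain tables."""
    name: str
    subscribed_items: set
    received: list = field(default_factory=list)


def publish(change, subscribers):
    """Deliver a (table, row) change only to subscribers that asked for that table."""
    table, _row = change
    for sub in subscribers:
        if table in sub.subscribed_items:
            sub.received.append(change)


hr = Subscriber("hr", {"employees"})
finance = Subscriber("finance", {"employees", "payroll"})

publish(("payroll", {"id": 1, "salary": 100}), [hr, finance])
publish(("employees", {"id": 1, "name": "Ann"}), [hr, finance])
```

Here `hr` receives only the `employees` change, while `finance` receives both, which mirrors each subscriber receiving only a subset of the published data.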
- replication typically requires the replicate to be materialized before replication begins.
- Materialization refers to the process of copying data, specified by a subscriber, from a published primary database to a replicate database, thereby initializing the replicate table(s). Once materialized, replication may proceed immediately.
- Continuous replication refers to log-based replication from a primary database to a replicate database and offers near real-time protection by capturing data completely via the log. Snapshot replication offers a point-in-time copy and thus is considered mutually exclusive to continuous replication for a given primary and replicate database pair.
- Such mutual exclusivity may compromise performance for certain environments that would benefit from being able to switch from one form of replication to another. For example, there may be situations, such as certain times of day, when it is known that limited activity would be occurring in a primary database, such that snapshot replication would be sufficient, while at other times, continuous replication would be preferable. Accordingly, a need exists for an approach to replication that avoids these limitations. The present invention addresses such a need.
- the invention includes system, method, computer program product embodiments and combinations and sub-combinations thereof for hybrid data replication. Aspects include identifying a type of database data replication, the type including a combination of replication approaches, and managing replication based on the identified type, including coordinated switching from one replication approach to another automatically with transactional consistency maintained among source and target databases.
- the capable utilization of multiple replication approaches allows greater flexibility without negatively impacting performance and/or data integrity.
- FIG. 1 illustrates a network in which the present invention, or portions thereof, can be implemented, in accordance with an embodiment of the present invention.
- FIG. 2 is a block diagram illustrating a hybrid replication system, in accordance with an embodiment of the present invention.
- FIGS. 3 , 4 , 5 , 6 , 7 , 8 , 9 , and 10 are flowcharts illustrating processes for hybrid replication in accordance with an embodiment of the present invention.
- FIG. 11 illustrates an example computer useful for implementing components of embodiments of the invention.
- the present invention relates to a system, method, computer program product embodiments and combinations and sub-combinations thereof for providing methodology for hybrid data replication.
- FIG. 1 illustrates a replication environment 100 in which the present invention, or portions thereof, can be implemented.
- a source database engine 102 is able to communicate over network 104 with replication engine 106 , and is a source of transactions that modify data and are captured for replication and distribution to target database engine 107 , via replication engine 106 , in accordance with an embodiment of the present invention.
- Network 104 can be any type of network or combination of networks such as, but not limited to, a local area network, wide area network, or the Internet.
- Network 104 may be any form of a wired network or a wireless network, or a combination thereof.
- the replication environment 100 can be configured in a number of ways in order to achieve the same result, and the aforementioned configuration is shown by way of example, and not limitation.
- source database engine 102 may be located in a single physical computing device or cluster of computing devices.
- Further, source database engine 102 and target database engine 107 may be any form of database and can include, but are not limited to, a device having a processor and memory for executing and storing instructions. Such a database may include software, firmware, and hardware or some combination thereof.
- the software may include one or more applications and an operating system.
- the hardware can include, but is not limited to, a processor, memory and user interface display. An optional input device, such as a mouse, stylus or any other pointing device, may be used.
- a publish-and-subscribe model for replicating data across the network 104 is utilized. Users “publish” data that is available in a primary database of the source database engine 102 , and other users “subscribe” to the data for delivery in a target database of target database engine 107 via replication engine 106 . Users can replicate both data changes (e.g., update, insert, and delete operations) and stored procedures using this method.
- An embodiment of the replication engine 106 is the Sybase Replication Server, which is well known and described in publicly available documents.
- replication occurs using hybrid replication that capably utilizes multiple replication approaches (e.g., snapshot and continuous replication capabilities) in a manner that allows greater flexibility, including coordinated switching between the approaches automatically, without negatively impacting performance and/or data integrity.
- hybrid replication includes auto-switching from snapshot to continuous once the data is materialized (e.g., ‘snapshot+continuous’ replication), processing intermittent and incremental snapshots (e.g., ‘snapshot+snapshot’ replication), or stopping continuous replication and following it with a snapshot after some time delay (e.g., ‘continuous replication+snapshot’ replication), with each form being applicable to publisher, subscriber, and table levels of data, and able to vary among these levels.
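- The three hybrid forms named above can be written down as a small data model. This is only a sketch: the enum names, the `policy` dictionary, and the rule that a table-level setting overrides a publisher-level one are assumptions made for illustration, since the text states only that the form may vary among levels.

```python
from enum import Enum


class Approach(Enum):
    SNAPSHOT = "snapshot"
    CONTINUOUS = "continuous"


class HybridType(Enum):
    """Each hybrid form is an ordered pair: first approach, then the follow-on approach."""
    SNAPSHOT_THEN_CONTINUOUS = (Approach.SNAPSHOT, Approach.CONTINUOUS)
    SNAPSHOT_THEN_SNAPSHOT = (Approach.SNAPSHOT, Approach.SNAPSHOT)
    CONTINUOUS_THEN_SNAPSHOT = (Approach.CONTINUOUS, Approach.SNAPSHOT)


# Hypothetical policy: a hybrid type declared per (level, name).
policy = {
    ("publisher", "sales_db"): HybridType.SNAPSHOT_THEN_CONTINUOUS,
    ("table", "sales_db.audit_log"): HybridType.SNAPSHOT_THEN_SNAPSHOT,
}


def hybrid_type_for(table, publisher, policy, default=None):
    """Assumed precedence: a table-level entry wins over a publisher-level entry."""
    return policy.get(("table", table), policy.get(("publisher", publisher), default))
```

For example, `hybrid_type_for("sales_db.orders", "sales_db", policy)` falls back to the publisher-level setting, while the `audit_log` table uses its own.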
- policy declarations enable specification of the manner and type of hybrid replication, as described in co-pending U.S. patent application Ser. No. ______, filed ______, entitled “Directing a Data Replication Environment through Policy Declaration”, (attorney docket #1933.2060000), assigned to the assignee of the present invention and incorporated herein by reference in its entirety.
- a manager 202 identifies the type of replication (e.g., such as from a policy declaration), and manages the hybrid replication processing for replicating data from a primary database 204 to a replicate database 206 by controlling a snapshot replication module 208 and continuous replication module 210 , the details of which are described herein below with reference to FIGS. 3-10 .
- the processes for snapshot and continuous replication may run in parallel. Further, switching between snapshot and continuous replication can occur at any time through components acting as replication entities that provide targeted and replication-specific functionality.
- snapshot replication module 208 operates by reading directly from primary database 204 (i.e., it is not log-based) and utilizes predefined process components, extract 212 ( FIG. 3 ) and load 214 ( FIG. 4 ), that run in parallel to each other. Extraction is performed as publisher activity, and loading is performed as subscriber activity, with the manager 202 determining how many extract and load components are to be used for each snapshot replication being performed.
- Continuous replication reads from a log 216 and utilizes a log reader 218 , command constructor 220 , capture 222 ( FIG. 6 ), compile 224 ( FIG. 7 ), and apply 226 ( FIG. 8 ) components, where the log reader 218 , command constructor 220 , and capture 222 components correspond to publisher activities, and the compile 224 and apply 226 components correspond to subscriber activities.
- Manager 202 plans how to most efficiently and effectively materialize tables for replication.
- materialization refers to a process of starting log based replication.
- the process occurs through snapshot replication (a complete snapshot including catch-up replication of data) with a switchover to log based replication.
- the physical and logical operations of the process are managed by determining a number of paths and the levels of parallelism to use, including determining a number of extract and load components to deploy, and within the extract and load components, a level of concurrency used, e.g., table or partial-tables (sections), and the number of I/O (input/output) streams.
- materialization may be atomic or non-atomic, where atomicity refers to an operation or transaction that is guaranteed to be completed atomically, such that if any part of the operations or transaction fails, all of it will fail.
- atomic approaches restrict how transactions occur while data is selected, while non-atomic approaches do not, thereby allowing transactions to be performed on the data being selected.
- Auto-correction supports data consistency for non-atomic approaches in accordance with an embodiment of the invention.
- Markers such as extract begin, extract end, load begin, load end, activate, deactivate, and snapshot, are utilized as transactional commands that flow through the replication path and provide inter-component asynchronous communication. By flowing along with the data, the markers mark locations in the replication process when certain actions happen in support of the coordinated auto-switching activities of the manager 202 of potentially multiple, parallel hybrid replications, as will be more fully understood in conjunction with the description of FIGS. 3-10 .
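- The idea of markers travelling in-band with the data can be sketched as follows. The `Marker` and `Row` classes and the `consume` function are hypothetical; the sketch assumes a single replication path modelled as a FIFO queue and only records at which point in the data stream each marker was seen.

```python
from collections import deque
from dataclasses import dataclass

# Marker names listed in the text.
MARKER_NAMES = {
    "extract begin", "extract end", "load begin", "load end",
    "activate", "deactivate", "snapshot",
}


@dataclass(frozen=True)
class Marker:
    name: str

    def __post_init__(self):
        if self.name not in MARKER_NAMES:
            raise ValueError(f"unknown marker: {self.name}")


@dataclass(frozen=True)
class Row:
    table: str
    data: tuple


def consume(path):
    """Drain the path in order; report each marker with the number of rows seen before it."""
    events = []
    rows_seen = 0
    while path:
        item = path.popleft()
        if isinstance(item, Marker):
            events.append((item.name, rows_seen))
        else:
            rows_seen += 1
    return events


path = deque([
    Marker("extract begin"),
    Row("orders", (1, "widget")),
    Row("orders", (2, "gadget")),
    Marker("extract end"),
])
events = consume(path)
```

Because the markers share the queue with the rows, a downstream component learns that extraction ended exactly after the second row, without any separate synchronous signalling.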
- in FIGS. 3 and 4, flowchart diagrams of extract and load processing for snapshot replication, respectively, are illustrated in accordance with an embodiment of the invention.
- the process continues by determining whether the extraction is occurring during incremental snapshot replication (block 318 ), which, when true, causes a last transaction timestamp to be read (block 320 ).
- a marker named “snapshot” is utilized as described herein below. Essentially, incremental snapshots provide a slice of data in some time window without regard to how the data arrived in that form and require a transactionally consistent initial snapshot, such that a log-based replication “records” all changed data.
- an extract begin marker is signaled (block 322 ). As long as there is more data (as determined via block 324 ), it is extracted from the primary database and sent to storage. The extraction either occurs relative to the last transaction timestamp (block 328 ) or not (block 330 ), depending upon whether the replication is incremental snapshot replication (determined via block 326 ). Once there is no more data, a timestamp is stored to reflect the last transaction extracted (block 332 ), and an extract end marker is signaled (block 334 ) before the process ends.
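- The extract flow of blocks 318 through 334 can be sketched as a single function. This is a simplified reading, not the patented implementation: rows are assumed to be `(timestamp, row)` pairs with integer timestamps, `storage` is a plain list, `state` is a dictionary that persists the last transaction timestamp, and `signal` is any callable that receives marker names. All of these names are hypothetical.

```python
def extract(primary_rows, storage, state, signal, incremental=False):
    """Copy rows from the primary into storage, bracketed by extract markers."""
    # Blocks 318/320: for an incremental snapshot, read the last transaction timestamp.
    last_ts = state.get("last_ts") if incremental else None

    signal("extract begin")                          # block 322
    newest = last_ts
    for ts, row in primary_rows:                     # block 324: while more data
        if last_ts is not None and ts <= last_ts:
            continue                                 # block 328: only data after last_ts
        storage.append(row)                          # blocks 328/330: send to storage
        if newest is None or ts > newest:
            newest = ts

    if newest is not None:
        state["last_ts"] = newest                    # block 332: store last timestamp
    signal("extract end")                            # block 334


rows = [(1, "a"), (2, "b")]
state, markers = {}, []
full_snapshot = []
extract(rows, full_snapshot, state, markers.append)

rows.append((3, "c"))
increment = []
extract(rows, increment, state, markers.append, incremental=True)
```

The first call extracts everything; the second, incremental call extracts only the row committed after the stored timestamp.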
- the load begins, and a check is made for receipt of a deactivate marker (block 410 ). If no marker has been received, a load begin marker is signaled (block 412 ), and the data to be loaded is read from storage and loaded into the replicate database 206 (block 416 ) while load data remains (as determined via block 414 ). When no more data needs to be loaded, a load end marker is signaled (block 418 ).
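- The load flow of blocks 410 through 418 can be sketched in the same spirit. The function below is hypothetical: it assumes storage is a FIFO queue, the replicate database is a list, and that a received deactivate marker simply ends the load without moving any data (the text does not spell out that branch).

```python
from collections import deque


def load(storage, replicate, signal, deactivate_received=False):
    """Move extracted rows from storage into the replicate, bracketed by load markers."""
    if deactivate_received:          # block 410 (assumed: end without loading)
        return False
    signal("load begin")             # block 412
    while storage:                   # block 414: while load data remains
        replicate.append(storage.popleft())   # block 416
    signal("load end")               # block 418
    return True


storage = deque(["a", "b", "c"])
replicate, markers = [], []
loaded = load(storage, replicate, markers.append)
skipped = load(deque(["z"]), replicate, markers.append, deactivate_received=True)
```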
- utilizing compensating components in accordance with an embodiment of the invention allows materialization to occur with no suspension of non-materializing tables while continuing to maintain consistency.
- the compensating components know only of the table being materialized, so they will suspend transactions occurring during the materialization of that one table, and when done, perform catch-up based on the materializing table activity. In this manner, the non-materialized tables may continue to replicate via the original compile and apply, achieving non-blocking continuous replication of tables not being materialized and supporting automatic switchover from snapshot to continuous replication.
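- The non-blocking behavior described above can be approximated with a per-table router. This is a loose sketch under strong assumptions: the `CompensatingRouter` class is hypothetical, changes are opaque values, and "catch-up" is modelled as replaying held changes in arrival order once materialization of that table finishes.

```python
from collections import defaultdict


class CompensatingRouter:
    """Hold changes for a materializing table; let every other table keep replicating."""

    def __init__(self):
        self.materializing = set()
        self.held = defaultdict(list)   # table -> changes awaiting catch-up
        self.applied = []               # (table, change) pairs written to the replicate

    def begin_materialization(self, table):
        self.materializing.add(table)

    def on_change(self, table, change):
        if table in self.materializing:
            self.held[table].append(change)        # suspended for this table only
        else:
            self.applied.append((table, change))   # normal compile/apply path

    def end_materialization(self, table):
        self.materializing.discard(table)
        for change in self.held.pop(table, []):    # catch-up for the materialized table
            self.applied.append((table, change))


router = CompensatingRouter()
router.begin_materialization("orders")
router.on_change("orders", "o1")
router.on_change("customers", "c1")   # not materializing: applied immediately
router.on_change("orders", "o2")
router.end_materialization("orders")
```

Changes to `customers` are never delayed, while the `orders` changes are applied in order once its materialization completes.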
- FIG. 5 illustrates a flowchart diagram of a continuous replication process in accordance with an embodiment of the invention.
- continuous replication includes a log reader process (block 510 ), which is a straightforward and well-known process of reading the primary database log 216 . Once the log 216 is read, the construction of commands follows (block 512 ) also using well-known techniques. A capture process then occurs (block 514 ).
- in FIG. 6, a flowchart diagram of the capture process (block 514 ) in accordance with an embodiment is presented.
- the capture begins and a determination is made to identify whether a deactivate marker has been received (block 602 ). If so, the schema (i.e., the state data unique to the component for the replicating table) is set inactive (block 612 ), and the timestamp of the last transaction is stored (block 614 ) before the capture ends.
- when not in suspend mode (block 604 is false), the compensating apply is resumed (block 606 ); when in suspend mode (block 604 is true), apply is resumed (block 608 ).
- commands from the primary database log file are received and sorted into transactions (block 616 ).
- a check is then made for an activate or extract begin marker (block 618 ) that results in the schema being set to active (block 620 ) when true.
- Processing continues by determining whether all commands for a transaction are active (block 622 ). If false, the commands are discarded (block 624 ), or if true, the commands are sent to the compile component in commit order (block 626 ) before the capture ends.
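- The sorting and filtering steps of the capture process (blocks 616 through 626) can be sketched as below. The log record layout (`txn`, `op`, `table` keys) is hypothetical, and the sketch interprets "all commands for a transaction are active" as every command touching a table whose schema is active. It also assumes that transactions without a commit record are not forwarded.

```python
from collections import defaultdict


def capture(commands, active_tables):
    """Group log commands into transactions and emit the active ones in commit order."""
    pending = defaultdict(list)   # txn id -> commands seen so far
    committed = []                # txn ids in the order their commits appear

    for cmd in commands:                          # block 616: sort into transactions
        if cmd["op"] == "commit":
            committed.append(cmd["txn"])
        else:
            pending[cmd["txn"]].append(cmd)

    out = []
    for txn in committed:
        cmds = pending[txn]
        if all(c["table"] in active_tables for c in cmds):   # block 622
            out.append((txn, cmds))                          # block 626: commit order
        # otherwise the transaction's commands are discarded (block 624)
    return out


log = [
    {"txn": "T1", "op": "insert", "table": "orders"},
    {"txn": "T2", "op": "update", "table": "orders"},
    {"txn": "T3", "op": "insert", "table": "scratch"},
    {"txn": "T2", "op": "commit"},
    {"txn": "T3", "op": "commit"},
    {"txn": "T1", "op": "commit"},
    {"txn": "T4", "op": "insert", "table": "orders"},   # never commits
]
result = capture(log, active_tables={"orders"})
```

`T2` is forwarded before `T1` because it committed first, `T3` is discarded because it touches an inactive table, and `T4` is not forwarded because it never commits.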
- following the capture, compile and apply occur, as shown in FIG. 5 .
- the flow of FIG. 5 following the capture (block 514 ) includes all components involved during continuous replication, with compile (block 516 ) and apply (block 518 ) and compensating compile (block 520 ) and compensating apply (block 522 ) (C-compile and C-apply) components running in parallel to each other.
- the C-compile and C-apply components (collectively referred to as compensating subscribers) allow for the materialization to occur with no suspend, the processes for which are presented more fully herein below with reference to FIGS. 9-10 .
- in FIG. 7, a flowchart diagram for a compile process 516 in accordance with an embodiment of the invention is illustrated.
- the compile proceeds with receipt of the commands in commit order from the capture component (block 714 ).
- a determination of whether the materialization is being done in suspend mode follows (block 716 ). When true, and no extract begin or end markers have been received (as determined via blocks 718 and 720 ), the commands are compiled and the transactions grouped (block 722 ). The commands are then sent to the apply component (block 723 ) and the compile ends.
- a check for receipt of a “snapshot” marker is done (block 729 ) to account for an embodiment in which continuous replication is followed by snapshot replication.
- when a “snapshot” marker has been received, the schema is marked inactive (block 730 ) and the compile process ends.
- when a “snapshot” marker has not been received and the schema is active (as determined via block 731 ), the process continues with the compilation of commands and grouping of transactions (block 722 ). The commands are then sent to the apply component (block 723 ), and the process ends.
- the schema is set to active (block 724 ), either when an extract begin marker has been received and after enabling auto-correction (block 726 ) when the materialization is not atomic (determined via block 728 ), or when the schema is not active (block 731 is false) and an activate marker has been received (determined via block 732 ). If no activate marker is received (block 732 is negative), the process ends.
- FIG. 8 illustrates a flowchart diagram of the apply component processing in accordance with an embodiment of the invention.
- the commands are received from the compiler (block 812 ), and a check is made to identify if materialization is being done in suspend mode (block 813 ).
- the apply process suspends itself (block 815 ), and the process ends.
- the apply process also suspends itself when no extract begin marker has been received, and a snapshot marker has been received (as determined via block 816 ).
- the transactions are written to the replicate database (block 820 ) and the process ends.
- the process also ends following setting of the schema to inactive (block 818 ) when the materialization is not being done in suspend mode, and a snapshot marker has been received (as determined via block 817 ).
- the process ends when the materialization is not being done in suspend mode, no snapshot marker has been received, the schema is not active, and no activate marker has been received (as determined via blocks 813 , 817 , 821 and 822 ).
- the schema is set to active (block 824 ).
- the processing loops until a compensating apply is no longer active (determined via block 826 ), enabling catch-up replication of the materialized table to complete prior to resuming normal continuous replication activity on the materialized table, and then proceeds with writing the transactions to the replicate database (block 820 ). Writing the transactions is reached also when the schema is already set to active (determined via block 821 ).
- the schema is marked as inactive (block 828 ), and if dematerialization is not needed (determined via block 830 ), the process ends. If dematerialization is needed, the processing loops until a compensating apply is no longer active (via block 832 ), and then purges the replicated data from the replicate database (block 834 ), signals the completion of the purge (block 836 ), and ends.
- FIG. 9 illustrates a flowchart diagram of a compensating compile process in accordance with an embodiment of the invention. It initiates and receives commands from the capture process 514 ( FIG. 6 ) in commit order (block 910 ). If a deactivate marker is received (determined via block 912 ), the schema is set to inactive (block 913 ), and the process ends. If not received, a check is made for receipt of a snapshot marker (block 915 ), and when received, the schema is marked active (block 916 ) and the process ends.
- the compilation also is reached when an extract end marker is received and after determining if atomic materialization is needed (block 928 ), and after auto correction is disabled (block 930 ) when atomic materialization is not needed.
- the schema gets marked active (block 932 ), and the need for atomic materialization is checked (via block 934 ), to enable auto correction appropriately (block 936 ) before proceeding with compilation (block 924 ) for sending to the compensating apply (block 926 ) before the process ends.
- the compensating apply process 522 in accordance with an embodiment is illustrated by the block flow diagram of FIG. 10 . It initiates and receives the compiled commands from the compensating compile (block 1010 ). If a deactivate marker is received (determined via block 1012 ), the schema is set to inactive (block 1014 ) and the process ends. If not deactivated but an extract begin marker is received (determined via block 1015 ), the compensating apply suspends itself (block 1016 ), and the process ends. The compensating apply also suspends itself when no extract begin marker has been received, but a snapshot marker has been received (as determined via block 1017 ), and the schema has been set to active (block 1018 ).
- an activate marker is sent (block 1022 ). Once the activate marker is sent or when there is a delay switchover (block 1021 is affirmative), the transactions are written to the replicate database (block 1024 ), and the process ends. Alternatively, the transaction writing is reached when the load end marker is not received, and no activate marker has been received (determined via block 1026 ). When the activate marker has been received, the schema is marked inactive (block 1028 ), and the compensating compile is stopped, as well as the compensating apply itself, (block 1030 ), before the process ends.
- hybrid replication in accordance with the present invention supports switching replication approaches, as desired.
- snapshot replication can be followed by continuous replication, either in an automatic form or a manual form, without compromising a transactionally consistent state.
- the automatic form occurs via compensating subscribers and markers, with a switch over to continuous replication.
- a manual case occurs when a snapshot replication was requested without materialization, and then at some later time the same entity is started with continuous replication. In such cases, the activate marker would be sent from an external source, such as a stored procedure, instead of an implicit send from the compensating subscriber.
- FIG. 11 illustrates an example computer system 1100 in which the present invention, or portions thereof, can be implemented as computer-readable code.
- the methods illustrated by the flowcharts of FIGS. 3-10 can be implemented in system 1100 .
- Various embodiments of the invention are described in terms of this example computer system 1100 . After reading this description, it will become apparent to a person skilled in the relevant art how to implement the invention using other computer systems and/or computer architectures.
- Computer system 1100 includes one or more processors, such as processor 1104 .
- Processor 1104 can be a special purpose or a general purpose processor.
- Processor 1104 is connected to a communication infrastructure 1106 (for example, a bus or network).
- Computer system 1100 also includes a main memory 1108 , preferably random access memory (RAM), and may also include a secondary memory 1110 .
- Secondary memory 1110 may include, for example, a hard disk drive 1112 , a removable storage drive 1114 , and/or a memory stick.
- Removable storage drive 1114 may comprise a floppy disk drive, a magnetic tape drive, an optical disk drive, a flash memory, or the like.
- the removable storage drive 1114 reads from and/or writes to a removable storage unit 1118 in a well known manner.
- Removable storage unit 1118 may comprise a floppy disk, magnetic tape, optical disk, etc. which is read by and written to by removable storage drive 1114 .
- removable storage unit 1118 includes a computer usable storage medium having stored therein computer software and/or data.
- secondary memory 1110 may include other similar means for allowing computer programs or other instructions to be loaded into computer system 1100 .
- Such means may include, for example, a removable storage unit 1122 and an interface 1120 .
- Examples of such means may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units 1122 and interfaces 1120 which allow software and data to be transferred from the removable storage unit 1122 to computer system 1100 .
- Computer system 1100 may also include a communications interface 1124 .
- Communications interface 1124 allows software and data to be transferred between computer system 1100 and external devices.
- Communications interface 1124 may include a modem, a network interface (such as an Ethernet card), a communications port, a PCMCIA slot and card, or the like.
- Software and data transferred via communications interface 1124 are in the form of signals which may be electronic, electromagnetic, optical, or other signals capable of being received by communications interface 1124 . These signals are provided to communications interface 1124 via a communications path 1126 .
- Communications path 1126 carries signals and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link or other communications channels.
- the terms “computer program medium” and “computer usable medium” are used to generally refer to media such as removable storage unit 1118 , removable storage unit 1122 , and a hard disk installed in hard disk drive 1112 . Signals carried over communications path 1126 can also embody the logic described herein. Computer program medium and computer usable medium can also refer to memories, such as main memory 1108 and secondary memory 1110 , which can be memory semiconductors (e.g. DRAMs, etc.). These computer program products are means for providing software to computer system 1100 .
- Computer programs are stored in main memory 1108 and/or secondary memory 1110 . Computer programs may also be received via communications interface 1124 . Such computer programs, when executed, enable computer system 1100 to implement the present invention as discussed herein. In particular, the computer programs, when executed, enable processor 1104 to implement the processes of the present invention, such as the method illustrated by the flowcharts of FIGS. 3-10 . Accordingly, such computer programs represent controllers of the computer system 1100 . Where the invention is implemented using software, the software may be stored in a computer program product and loaded into computer system 1100 using removable storage drive 1114 , interface 1120 , hard drive 1112 or communications interface 1124 .
- the invention is also directed to computer program products comprising software stored on any computer useable medium.
- Such software, when executed in one or more data processing devices, causes the data processing device(s) to operate as described herein.
- Embodiments of the invention employ any computer useable or readable medium, known now or in the future.
- Examples of computer useable mediums include, but are not limited to, primary storage devices (e.g., any type of random access memory), secondary storage devices (e.g., hard drives, floppy disks, CD ROMS, ZIP disks, tapes, magnetic storage devices, optical storage devices, MEMS, nanotechnological storage device, etc.), and communication mediums (e.g., wired and wireless communications networks, local area networks, wide area networks, intranets, etc.).
Description
- 1. Field of the Invention
- The present invention relates generally to data processing environments and, more particularly, to a system providing methodology for hybrid data replication.
- 2. Background Art
- Computers are very powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
- Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software cushion or layer. In essence, the DBMS shields the database user from knowing or even caring about the underlying hardware-level details. Typically, all requests from users for access to the data are processed by the DBMS. For example, information may be added or removed from data files, information retrieved from or updated in such files, and so forth, all without user knowledge of the underlying system implementation. In this manner, the DBMS provides users with a conceptual view of the database that is removed from the hardware level. The general construction and operation of database management systems is well known in the art. See e.g., Date, C., “An Introduction to Database Systems, Seventh Edition”, Addison Wesley, 2000.
- Increasingly, businesses run mission-critical systems which store information on database management systems. Each day more and more users base their business operations on mission-critical systems which store information on server-based database systems, such as Sybase® Adaptive Server® Enterprise (ASE) (available from Sybase, Inc. of Dublin, Calif.). As a result, the operations of the business are dependent upon the availability of data stored in their databases. Because of the mission-critical nature of these systems, users of these systems need to protect themselves against loss of the data due to software or hardware problems, disasters such as floods, earthquakes, or electrical power loss, or temporary unavailability of systems resulting from the need to perform system maintenance.
- One well-known approach that is used to guard against loss of critical business data maintained in a given database (the “primary database”) is to maintain one or more standby or replicate databases. A replicate database is a duplicate or mirror copy of the primary database (or a subset of the primary database) that is maintained either locally at the same site as the primary database, or remotely at a different location than the primary database. The availability of a replicate copy of the primary database enables a user (e.g., a corporation or other business) to reconstruct a copy of the database in the event of the loss, destruction, or unavailability of the primary database.
- Database replication technologies comprise a mechanism or tool for duplicating data from a primary source or “publisher” to one or more “subscribers”. The data may also be transformed during this process of replication.
- In many cases, a primary database may publish items of data to a number of different subscribers. Also, in many cases, each of these subscribers is only interested in receiving a subset of the data maintained by the primary database. In this type of environment, each of the subscribers specifies particular types or items of data (“subscribed items”) that the subscriber wants to receive from the primary database.
- In current replication environments, replication typically requires the replicate to be materialized before replication begins. Materialization refers to the process of copying data, specified by a subscriber, from a published primary database to a replicate database, thereby initializing the replicate table(s). Once materialized, replication may proceed immediately.
- Depending on the needs of a given environment, continuous replication or snapshot replication may be performed following materialization. Continuous replication refers to log-based replication from a primary database to a replicate database and offers near real-time protection by capturing data completely via the log. Snapshot replication offers a point-in-time copy and thus is considered mutually exclusive to continuous replication for a given primary and replicate database pair.
- Such mutual exclusivity may compromise performance for certain environments that would benefit from being able to switch from one form of replication to another. For example, there may be situations, such as certain times of day, when it is known that limited activity would be occurring in a primary database, such that snapshot replication would be sufficient, while at other times, continuous replication would be preferable. Accordingly, a need exists for an approach to replication that avoids these limitations. The present invention addresses such a need.
- Briefly stated, the invention includes system, method, computer program product embodiments and combinations and sub-combinations thereof for hybrid data replication. Aspects include identifying a type of database data replication, the type including a combination of replication approaches, and managing replication based on the identified type, including coordinated switching from one replication approach to another automatically with transactional consistency maintained among source and target databases.
- Through these aspects, the ability to use multiple replication approaches (e.g., snapshot and continuous replication capabilities) allows greater flexibility without negatively impacting performance and/or data integrity.
- Further embodiments, features, and advantages of the invention, as well as the structure and operation of the various embodiments of the invention, are described in detail below with reference to accompanying drawings.
- The accompanying drawings, which are incorporated herein and form part of the specification, illustrate embodiments of the present invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the relevant art(s) to make and use the invention.
- FIG. 1 illustrates a network in which the present invention, or portions thereof, can be implemented, in accordance with an embodiment of the present invention.
- FIG. 2 is a block diagram illustrating a hybrid replication system, in accordance with an embodiment of the present invention.
- FIGS. 3, 4, 5, 6, 7, 8, 9, and 10 are flowcharts illustrating processes for hybrid replication in accordance with an embodiment of the present invention.
- FIG. 11 illustrates an example computer useful for implementing components of embodiments of the invention.
- The features and advantages of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings. In the drawings, like reference numbers generally indicate identical, functionally similar, and/or structurally similar elements. Generally, the drawing in which an element first appears is indicated by the leftmost digit(s) in the corresponding reference number.
- The present invention relates to a system, method, computer program product embodiments and combinations and sub-combinations thereof for providing methodology for hybrid data replication.
- FIG. 1 illustrates a replication environment 100 in which the present invention, or portions thereof, can be implemented. A source database engine 102 is able to communicate over network 104 with replication engine 106, and is a source of transactions that modify data and are captured for replication and distribution to target database engine 107, via replication engine 106, in accordance with an embodiment of the present invention.
- Network 104 can be any type of network or combination of networks such as, but not limited to, a local area network, wide area network, or the Internet. Network 104 may be any form of a wired network or a wireless network, or a combination thereof.
- One skilled in the relevant arts will further recognize that the replication environment 100 can be configured in a number of ways in order to achieve the same result, and the aforementioned configuration is shown by way of example, and not limitation. For instance, in accordance with an embodiment of the present invention, source database engine 102 may be located in a single physical computing device or cluster of computing devices.
- Further, source database engine 102 and target database engine 107 may be any form of database and can include, but are not limited to, a device having a processor and memory for executing and storing instructions. Such a database may include software, firmware, and hardware or some combination thereof. The software may include one or more applications and an operating system. The hardware can include, but is not limited to, a processor, memory and user interface display. An optional input device, such as a mouse, stylus or any other pointing device, may be used.
- In an embodiment, a publish-and-subscribe model for replicating data across the network 104 is utilized. Users “publish” data that is available in a primary database of the source database engine 102, and other users “subscribe” to the data for delivery in a target database of target database engine 107 via replication engine 106. Users can replicate both data changes (e.g., update, insert, and delete operations) and stored procedures using this method. An embodiment of the replication engine 106 is the Sybase Replication Server, which is well known and described in publicly available documents.
- In accordance with an embodiment of the invention, replication occurs using hybrid replication that capably utilizes multiple replication approaches (e.g., snapshot and continuous replication capabilities) in a manner that allows greater flexibility, including coordinated switching between the approaches automatically, without negatively impacting performance and/or data integrity. Thus, hybrid replication, in accordance with an embodiment, includes auto-switching from snapshot to continuous once the data is materialized (e.g., ‘snapshot+continuous’ replication), processing intermittent and incremental snapshots (e.g., ‘snapshot+snapshot’ replication), or stopping continuous replication and following it with a snapshot after some time delay (e.g., ‘continuous replication+snapshot’ replication), with each form being applicable to publisher, subscriber, and table levels of data, and able to vary among these levels. In this manner, for example, at the table level within a publisher, the strategy policy for one table may be changed without affecting the other tables within that same publisher. In an embodiment, policy declarations enable specification of the manner and type of hybrid replication, as described in co-pending U.S. patent application Ser. No. ______, filed ______, entitled “Directing a Data Replication Environment through Policy Declaration”, (attorney docket #1933.2060000), assigned to the assignee of the present invention and incorporated herein by reference in its entirety.
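- The three hybrid forms, and the ability to vary them per table without touching the rest of a publisher, can be pictured as a small lookup. The following is a minimal Python sketch under stated assumptions, not the policy-declaration syntax of the referenced application: `HybridType`, `Policy`, and `resolve` are hypothetical names, and the rule that the most specific level wins (table, then subscriber, then publisher) is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class HybridType(Enum):
    # The three hybrid forms named in the description.
    SNAPSHOT_THEN_CONTINUOUS = "snapshot+continuous"
    SNAPSHOT_THEN_SNAPSHOT = "snapshot+snapshot"
    CONTINUOUS_THEN_SNAPSHOT = "continuous+snapshot"


@dataclass
class Policy:
    """Hypothetical policy holder: one default plus per-level overrides."""
    publisher_default: HybridType
    subscriber_overrides: Dict[str, HybridType] = field(default_factory=dict)
    table_overrides: Dict[str, HybridType] = field(default_factory=dict)

    def resolve(self, table: str, subscriber: Optional[str] = None) -> HybridType:
        # Assumption: the most specific level that declares a type wins.
        if table in self.table_overrides:
            return self.table_overrides[table]
        if subscriber is not None and subscriber in self.subscriber_overrides:
            return self.subscriber_overrides[subscriber]
        return self.publisher_default


policy = Policy(publisher_default=HybridType.SNAPSHOT_THEN_CONTINUOUS)
# Changing one table's strategy leaves every other table on the publisher default.
policy.table_overrides["audit_log"] = HybridType.SNAPSHOT_THEN_SNAPSHOT
```

Here `policy.resolve("orders")` still yields the publisher default, while `policy.resolve("audit_log")` yields the table-level override.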
- Referring now to FIG. 2, a block diagram is represented and illustrates a hybrid replication system in accordance with an embodiment of the invention. Within the replication engine 106, a manager 202 identifies the type of replication (e.g., such as from a policy declaration), and manages the hybrid replication processing for replicating data from a primary database 204 to a replicate database 206 by controlling a snapshot replication module 208 and continuous replication module 210, the details of which are described herein below with reference to FIGS. 3-10. In an embodiment, for separate replications, e.g., for a given publisher-subscriber, in which some tables can be snapshot and some can be continuous, or for given multiple subscribers, in which some subscribers (and tables) can be snapshot and some can be continuous, the processes for snapshot and continuous replication may run in parallel. Further, switching between snapshot and continuous replication can occur at any time through components acting as replication entities that provide targeted and replication-specific functionality.
- In general, snapshot replication 208 operates by reading directly from primary database 204 (i.e., it is not log-based) and utilizes predefined process components, extract 212 (FIG. 3) and load 214 (FIG. 4), that run in parallel to each other. Extraction is performed as publisher activity, and loading is performed as subscriber activity, with the manager 202 determining how many extract and load components are to be used for each snapshot replication being performed.
- Continuous replication reads from a log 216 and utilizes a log reader 218, command constructor 220, capture 222 (FIG. 6), compile 224 (FIG. 7), and apply 226 (FIG. 8) components, where the log reader 218, command constructor 220, and capture 222 components correspond to publisher activities, and the compile 224 and apply 226 components correspond to subscriber activities.
- Manager 202 plans how to most efficiently and effectively materialize tables for replication. For purposes of this disclosure, materialization refers to a process of starting log based replication. In an embodiment, the process occurs through snapshot replication (a complete snapshot including catch-up replication of data) with a switchover to log based replication.
- The physical and logical operations of the process are managed by determining a number of paths and the levels of parallelism to use, including determining a number of extract and load components to deploy, and within the extract and load components, a level of concurrency used, e.g., table or partial-tables (sections), and the number of I/O (input/output) streams.
- Further, materialization may be atomic or non-atomic, where atomicity refers to an operation or transaction that is guaranteed to be completed atomically, such that if any part of the operations or transaction fails, all of it will fail. Generally, atomic approaches restrict how transactions occur while data is selected, while non-atomic do not, thereby allowing transactions to be performed on the data being selected. Auto-correction, as is commonly understood, supports data consistency for non-atomic approaches in accordance with an embodiment of the invention.
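- The description leaves auto-correction as "commonly understood". One common reading, assumed here rather than taken from this document, is that inserts and updates are applied as a delete followed by an insert, so that replaying a logged change over a row the snapshot has already copied neither fails nor leaves the replicate inconsistent. The Python sketch below illustrates that reading only; `apply_with_autocorrection` and the dictionary-as-table model are hypothetical.

```python
from typing import Dict, Tuple

Row = Dict[str, object]
Change = Tuple[str, int, Row]  # (operation, primary key, new row values)


def apply_with_autocorrection(table: Dict[int, Row], change: Change) -> None:
    """Apply one logged change idempotently (assumed auto-correction behavior)."""
    op, key, row = change
    if op in ("insert", "update"):
        table.pop(key, None)    # delete the row if the snapshot already copied it
        table[key] = dict(row)  # then insert the logged version
    elif op == "delete":
        table.pop(key, None)    # deleting a missing row is not an error
    else:
        raise ValueError(f"unknown operation: {op}")


# Non-atomic materialization: the snapshot already copied row 1 in its final
# state, and the log then replays the changes that produced that state.
replicate: Dict[int, Row] = {1: {"name": "new"}}
for logged in [
    ("insert", 1, {"name": "old"}),
    ("update", 1, {"name": "new"}),
    ("delete", 2, {}),
]:
    apply_with_autocorrection(replicate, logged)
```

After the replay, the replicate holds row 1 in its final state, even though a plain insert of an existing key or a delete of a missing key would normally be rejected.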
- Markers, such as extract begin, extract end, load begin, load end, activate, deactivate, and snapshot, are utilized as transactional commands that flow through the replication path and provide inter-component asynchronous communication. By flowing along with the data, the markers mark locations in the replication process when certain actions happen in support of the coordinated auto-switching activities of the manager 202 of potentially multiple, parallel hybrid replications, as will be more fully understood in conjunction with the description of FIGS. 3-10.
- Referring now to FIGS. 3 and 4, flowchart diagrams of extract and load processing for snapshot replication, respectively, are illustrated in accordance with an embodiment of the invention. Once the extraction process starts, an initial check is made to determine if a stop operation has been signaled (block 310). If true, the extraction of data stops (block 312), and a deactivation marker is sent to the load component (block 314) before the process ends (block 316).
- When no stop operation has been signaled, the process continues by determining whether the extraction is occurring during incremental snapshot replication (block 318), which, when true, causes a last transaction timestamp to be read (block 320). To support incremental snapshots, a marker named “snapshot” is utilized as described herein below. Essentially, incremental snapshots provide a slice of data in some time window without regard to how the data arrived in that form and require a transactionally consistent initial snapshot, such that a log-based replication “records” all changed data.
- When the extraction is not occurring during incremental snapshot replication, (block 318 is not true), or once the timestamp has been read, an extract begin marker is signaled (block 322). As long as there is more data (as determined via block 324), it is extracted from the primary database and sent to storage. The extraction either occurs relative to the last transaction timestamp (block 328) or not (block 330), depending upon whether the replication is incremental snapshot replication (determined via block 326). Once there is no more data, a timestamp is stored to reflect the last transaction extracted (block 332), and an extract end marker is signaled (block 334) before the process ends.
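- The extract flow of FIG. 3 can be sketched as a function that emits markers in-band with the data. This is an illustrative Python sketch, not the patented implementation: the `(timestamp, payload)` row shape, the tuple encoding of markers, and the choice to signal incremental mode by passing a `last_timestamp` are all assumptions.

```python
from typing import Iterable, List, Optional, Tuple

Row = Tuple[int, str]      # (transaction timestamp, payload); hypothetical shape
Item = Tuple[str, object]  # ("marker", name) or ("data", row)


def extract(rows: Iterable[Row],
            stop_requested: bool = False,
            last_timestamp: Optional[int] = None) -> Tuple[List[Item], Optional[int]]:
    """Return the items sent toward storage and the timestamp to remember."""
    out: List[Item] = []
    if stop_requested:                            # block 310
        out.append(("marker", "deactivate"))      # blocks 312-314
        return out, last_timestamp                # block 316
    out.append(("marker", "extract begin"))       # block 322
    newest = last_timestamp
    for ts, payload in rows:                      # block 324: more data?
        if last_timestamp is not None and ts <= last_timestamp:
            continue                              # block 328: incremental, skip old rows
        out.append(("data", (ts, payload)))       # blocks 328/330
        newest = ts if newest is None else max(newest, ts)
    out.append(("marker", "extract end"))         # block 334
    return out, newest                            # block 332: caller stores this timestamp


rows = [(1, "a"), (2, "b"), (3, "c")]
full, ts = extract(rows)                                        # complete snapshot
incremental, ts2 = extract(rows + [(4, "d")], last_timestamp=ts)  # only the new slice
```

In this sketch the second call extracts only the row stamped after the first snapshot, which is the "slice of data in some time window" that incremental snapshots provide.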
- For load processing, the load begins, and a check is made for receipt of a deactivate marker (block 410). If no marker has been received, a load begin marker is signaled (block 412), and the data to be loaded is read from storage and loaded into the replicate database 206 (block 416) while load data remains (as determined via block 414). When no more data needs to be loaded, a load end marker is signaled (block 418).
- Once the load end is signaled, a check is made to determine whether a materialization suspend mode is set (block 420). In contrast to typical operations, which suspend replication on all tables until a materialization of a selected table completes in order to maintain consistency, utilizing compensating components in accordance with an embodiment of the invention allows materialization to occur with no suspension of non-materializing tables yet continuing to maintain consistency. The compensating components know only of the table being materialized, so they will suspend transactions occurring during the materialization of that one table, and when done, perform catch-up based on the materializing table activity. In this manner, the non-materialized tables may continue to replicate via the original compile and apply, achieving non-blocking continuous replication of tables not being materialized and supporting automatic switchover from snapshot to continuous replication.
- Thus, when a set suspend mode is identified, an apply process is resumed (block 422, see FIG. 8), while when a no suspend mode is identified (block 420 is negative), a compensating apply process resumes (block 424, see FIG. 10), before the load process ends.
- When the initial check for a receipt of a deactivate marker is affirmative (block 410), data loading is stopped (block 426). A check is then made to identify whether dematerialization is desired (via block 428), and if so, replicated data is purged from the replicate database (block 430). Once the purge is done or when dematerialization is not set, the compensating compile and apply components of a continuous replication flow for the table being materialized that may be active are stopped (block 432) before the load process ends.
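- The load flow of FIG. 4 can be sketched in the same style. The Python below is a minimal illustration under assumptions: storage is modeled as a list of `("marker", name)` and `("data", (key, row))` tuples, the replicate database as a dictionary, and the returned action strings are hypothetical stand-ins for signaling markers or resuming components.

```python
from typing import Dict, List, Tuple

Item = Tuple[str, object]  # ("marker", name) or ("data", (key, row))


def load(storage: List[Item], replicate: Dict[int, str],
         suspend_mode: bool, dematerialize: bool = False) -> List[str]:
    """Load extracted data and report which follow-up actions were taken."""
    actions: List[str] = []
    if ("marker", "deactivate") in storage:                  # block 410
        actions.append("stop loading")                       # block 426
        if dematerialize:                                    # block 428
            replicate.clear()                                # block 430
            actions.append("purge replicate")
        actions.append("stop compensating compile and apply")  # block 432
        return actions
    actions.append("load begin")                             # block 412
    for kind, value in storage:                              # blocks 414-416
        if kind == "data":
            key, row = value
            replicate[key] = row
    actions.append("load end")                               # block 418
    # Blocks 420-424: which apply resumes depends on the suspend mode.
    actions.append("resume apply" if suspend_mode else "resume compensating apply")
    return actions


extracted = [("marker", "extract begin"), ("data", (1, "a")),
             ("data", (2, "b")), ("marker", "extract end")]
replicate: Dict[int, str] = {}
actions = load(extracted, replicate, suspend_mode=False)
```

With `suspend_mode=False` the final action is to resume the compensating apply, which is what lets the other tables keep replicating through the original compile and apply during materialization.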
- FIG. 5 illustrates a flowchart diagram of a continuous replication process in accordance with an embodiment of the invention. As mentioned previously with reference to FIG. 2, continuous replication includes a log reader process (block 510), which is a straightforward and well-known process of reading the primary database log 216. Once the log 216 is read, the construction of commands follows (block 512) also using well-known techniques. A capture process then occurs (block 514).
- Referring now to FIG. 6, a flowchart diagram of the capture process (block 514) in accordance with an embodiment is presented. The capture begins and a determination is made to identify whether a deactivate marker has been received (block 602). If so, the schema (i.e., the state data unique to the component for the replicating table) is set inactive (block 612), and the timestamp of the last transaction is stored (block 614) before the capture ends.
- When no deactivate marker has been received, a check is made to determine whether materialization is to occur with or without suspend mode (block 604) to account for an embodiment in which snapshot replication follows snapshot replication (assuming log-based change data is accumulating from the previous snapshot). When in no-suspend mode (block 604 is false), upon receipt of an externally issued “snapshot” marker to the primary database, the compensating apply is resumed (block 606), while when in suspend mode (block 604 is true), apply is resumed (block 608).
- Subsequently, commands from the primary database log file are received and sorted into transactions (block 616). A check is then made for an activate or extract begin marker (block 618) that results in the schema being set to active (block 620) when true. Processing continues by determining whether all commands for a transaction are active (block 622). If false, the commands are discarded (block 624), or if true, the commands are sent to the compile component in commit order (block 626) before the capture ends.
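- The grouping and filtering step of the capture process (blocks 616 through 626) can be sketched as follows. This Python sketch rests on assumptions: the `Command` fields are hypothetical, "all commands for a transaction are active" is read as "every command touches a table whose schema is active", and marker handling (blocks 618-620) is omitted by passing the set of active tables in directly.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class Command:
    txn_id: int      # transaction the command belongs to
    commit_seq: int  # commit order of that transaction in the log
    table: str
    op: str


def capture(commands: Iterable[Command], active_tables: Set[str]) -> List[List[Command]]:
    """Group commands into transactions, drop inactive ones, order by commit."""
    txns: Dict[int, List[Command]] = defaultdict(list)
    for cmd in commands:                                    # block 616
        txns[cmd.txn_id].append(cmd)
    kept = [cmds for cmds in txns.values()
            if all(c.table in active_tables for c in cmds)]  # blocks 622-624
    kept.sort(key=lambda cmds: cmds[0].commit_seq)          # block 626: commit order
    return kept


log = [
    Command(2, 1, "orders", "insert"),
    Command(1, 2, "orders", "update"),
    Command(1, 2, "audit", "insert"),  # "audit" is inactive, so txn 1 is discarded
    Command(3, 0, "orders", "delete"),
]
forwarded = capture(log, active_tables={"orders"})
```

Here `forwarded` contains transaction 3 followed by transaction 2; transaction 1 is dropped because one of its commands targets an inactive table.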
- Once the capture ends, compile and apply occur, as shown in FIG. 5. For completeness, the flow of FIG. 5 following the capture (block 514) includes all components involved during continuous replication, with compile (block 516) and apply (block 518) and compensating compile (block 520) and compensating apply (block 522) (C-compile and C-apply) components running in parallel to each other. As described previously, the C-compile and C-apply components (collectively referred to as compensating subscribers) allow for the materialization to occur with no suspend, the processes for which are presented more fully herein below with reference to FIGS. 9-10.
- Referring now to FIG. 7, a flowchart diagram for a compile process 516 in accordance with an embodiment of the invention is illustrated. Unless a deactivate marker has been received (determined via block 710), which causes the materialization schema to be set inactive (block 712) and ends the process, the compile proceeds with receipt of the commands in commit order from the capture component (block 714). A determination of whether the materialization is being done in suspend mode follows (block 716). When true, and no extract begin or end markers have been received (as determined via blocks 718 and 720), the commands are compiled and the transactions grouped (block 722). The commands are then sent to the apply component (block 723) and the compile ends.
- When the materialization is not being done in suspend mode (block 716 is negative), a check for receipt of a “snapshot” marker is done (block 729) to account for an embodiment in which continuous replication is followed by snapshot replication. When received, the schema is marked inactive (block 730) and the compile process ends. When a “snapshot” marker has not been received, and the schema is active (as determined via block 731), the process continues with the compilation of commands and grouping of transactions (block 722). The commands are then sent to the apply component (block 723), and the process ends.
- The schema is set to active (block 724), either when an extract begin marker has been received and after enabling auto-correction (block 726) when the materialization is not atomic (determined via block 728), or when the schema is not active (block 731 is false) and an activate marker has been received (determined via block 732). If no activate marker is received (block 732 is negative), the process ends.
- FIG. 8 illustrates a flowchart diagram of the apply component processing in accordance with an embodiment of the invention. When no deactivate marker has been received (as determined via block 810), the commands are received from the compiler (block 812), and a check is made to identify if materialization is being done in suspend mode (block 813). When true, and an extract begin marker has been received (determined via block 814), the apply process suspends itself (block 815), and the process ends. The apply process also suspends itself when no extract begin marker has been received, and a snapshot marker has been received (as determined via block 816). When a snapshot marker has not been received, the transactions are written to the replicate database (block 820) and the process ends. The process also ends following setting of the schema to inactive (block 818) when the materialization is not being done in suspend mode, and a snapshot marker has been received (as determined via block 817). Alternatively, the process ends when the materialization is not being done in suspend mode, no snapshot marker has been received, the schema is not active, and no activate marker has been received (as determined via blocks 813, 817, 821 and 822).
- When an activate marker has been received, the schema is set to active (block 824). The processing loops until a compensating apply is no longer active (determined via block 826), enabling catch-up replication of the materialized table to complete prior to resuming normal continuous replication activity on the materialized table, and then proceeds with writing the transactions to the replicate database (block 820). Writing the transactions is reached also when the schema is already set to active (determined via block 821).
- When a deactivate marker has been received, the schema is marked as inactive (block 828), and if dematerialization is not needed (determined via block 830), the process ends. If dematerialization is needed, the processing loops until a compensating apply is no longer active (via block 832), and then purges the replicated data from the replicate database (block 834), signals the completion of the purge (block 836), and ends.
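- The branching of the apply component in FIG. 8 is easier to follow as a single decision function. The Python sketch below is one reading of the flowchart text, offered as an illustration only: `apply_step`, its string results, and the simplification that at most one marker is seen per pass are assumptions, and the wait loops (blocks 826 and 832) and the dematerialization purge are reduced to labels.

```python
from typing import Optional, Tuple


def apply_step(marker: Optional[str], suspend_mode: bool,
               schema_active: bool) -> Tuple[str, bool]:
    """Return (action, new schema_active) for one pass of the apply component."""
    if marker == "deactivate":                         # block 810
        return "deactivate", False                     # block 828
    if suspend_mode:                                   # block 813
        if marker in ("extract begin", "snapshot"):    # blocks 814, 816
            return "suspend", schema_active            # block 815
        return "write", schema_active                  # block 820
    if marker == "snapshot":                           # block 817
        return "end", False                            # block 818
    if schema_active:                                  # block 821
        return "write", True                           # block 820
    if marker == "activate":                           # block 822
        # Blocks 824-826: activate, let the compensating apply finish, then write.
        return "wait for compensating apply, then write", True
    return "end", False


# A table being materialized without suspend: nothing is applied until the
# activate marker arrives, after which ordinary writes resume.
before = apply_step(None, suspend_mode=False, schema_active=False)
switch = apply_step("activate", suspend_mode=False, schema_active=False)
after = apply_step(None, suspend_mode=False, schema_active=switch[1])
```

The three calls trace the no-suspend switchover: the pass before activation ends without writing, the activating pass waits for catch-up and then writes, and later passes write directly.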
- FIG. 9 illustrates a flowchart diagram of a compensating compile process in accordance with an embodiment of the invention. It initiates and receives commands from the capture process 514 (FIG. 6) in commit order (block 910). If a deactivate marker is received (determined via block 912), the schema is set to inactive (block 913), and the process ends. If not received, a check is made for receipt of a snapshot marker (block 915), and when received, the schema is marked active (block 916) and the process ends.
- When no snapshot marker has been received (block 915 is negative), a check is made for an extract begin marker (block 917). If no extract begin or end marker is received (determined via block 918), and an activate marker is received (determined via block 920), the schema is marked inactive and the process ends. When no activate marker is received, the commands are compiled and the transactions grouped (block 924) for sending to the compensating apply (block 926) before the process ends.
- The compilation also is reached when an extract end marker is received and after determining if atomic materialization is needed (block 928), and after auto correction is disabled (block 930) when atomic materialization is not needed. Alternatively, when an extract begin marker has been received, the schema gets marked active (block 932), and the need for atomic materialization is checked (via block 934), to enable auto correction appropriately (block 936) before proceeding with compilation (block 924) for sending to the compensating apply (block 926) before the process ends.
- The compensating apply process 522 in accordance with an embodiment is illustrated by the block flow diagram of FIG. 10. It initiates and receives the compiled commands from the compensating compile (block 1010). If a deactivate marker is received (determined via block 1012), the schema is set to inactive (block 1014) and the process ends. If not deactivated but an extract begin marker is received (determined via block 1015), the compensating apply suspends itself (block 1016), and the process ends. The compensating apply also suspends itself when no extract begin marker has been received, but a snapshot marker has been received (as determined via block 1017), and the schema has been set to active (block 1018).
- When no snapshot marker is received but a load end marker is received (determined via block 1020) and there is no delay switchover (determined via block 1021), an activate marker is sent (block 1022). Once the activate marker is sent or when there is a delay switchover (block 1021 is affirmative), the transactions are written to the replicate database (block 1024), and the process ends. Alternatively, the transaction writing is reached when the load end marker is not received, and no activate marker has been received (determined via block 1026). When the activate marker has been received, the schema is marked inactive (block 1028), and the compensating compile is stopped, as well as the compensating apply itself, (block 1030), before the process ends.
- As described herein, hybrid replication in accordance with the present invention supports switching replication approaches, as desired. For example, snapshot replication can be followed by continuous replication, either in an automatic form or a manual form, without compromising a transactionally consistent state. The automatic form occurs via compensating subscribers and markers, with a switch over to continuous replication. A manual case occurs when a snapshot replication was requested without materialization, and then at some later time the same entity is started with continuous replication. In such cases, the activate marker would be sent from an external source, such as a stored procedure, instead of an implicit send from the compensating subscriber. Further, whether the switching occurs from non-log-based to log-based, from log-based to non-log-based, from non-log-based to non-log-based (incremental snapshot), or from log-based to log-based, one table's replication occurs without impacting other tables in the same publisher/subscriber.
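- The automatic switchover driven by the compensating apply of FIG. 10 can be condensed into a marker-to-actions mapping. The Python sketch below is an illustrative reading of that flowchart, not the patented implementation: `compensating_apply_step` and its action strings are hypothetical, and it assumes a single marker (or none) is observed per pass.

```python
from typing import List, Optional


def compensating_apply_step(marker: Optional[str],
                            delay_switchover: bool = False) -> List[str]:
    """Return the ordered actions the compensating apply takes for one pass."""
    if marker == "deactivate":                    # block 1012
        return ["set schema inactive"]            # block 1014
    if marker == "extract begin":                 # block 1015
        return ["suspend"]                        # block 1016
    if marker == "snapshot":                      # block 1017
        return ["set schema active", "suspend"]   # blocks 1018, 1016
    if marker == "load end":                      # block 1020
        # Block 1021: an immediate switchover sends the activate marker itself.
        actions = [] if delay_switchover else ["send activate marker"]
        return actions + ["write transactions"]   # blocks 1022, 1024
    if marker == "activate":                      # block 1026
        return ["set schema inactive",            # block 1028
                "stop compensating compile",      # block 1030
                "stop compensating apply"]
    return ["write transactions"]                 # block 1024


automatic = compensating_apply_step("load end")
manual = compensating_apply_step("load end", delay_switchover=True)
handoff = compensating_apply_step("activate")
```

In the automatic form the load end marker triggers an implicit activate marker; with a delayed switchover the compensating apply keeps writing catch-up transactions until an activate marker arrives from elsewhere, at which point the compensating subscribers stop and ordinary continuous replication takes over.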
Various aspects of the present invention can be implemented by software, firmware, hardware, or a combination thereof. FIG. 11 illustrates an example computer system 1100 in which the present invention, or portions thereof, can be implemented as computer-readable code. For example, the methods illustrated by the flowcharts of FIGS. 3-10 can be implemented in system 1100. Various embodiments of the invention are described in terms of this example computer system 1100. After reading this description, it will become apparent to a person skilled in the relevant art how to implement the invention using other computer systems and/or computer architectures.

Computer system 1100 includes one or more processors, such as processor 1104. Processor 1104 can be a special purpose or a general purpose processor. Processor 1104 is connected to a communication infrastructure 1106 (for example, a bus or network).

Computer system 1100 also includes a main memory 1108, preferably random access memory (RAM), and may also include a secondary memory 1110. Secondary memory 1110 may include, for example, a hard disk drive 1112, a removable storage drive 1114, and/or a memory stick. Removable storage drive 1114 may comprise a floppy disk drive, a magnetic tape drive, an optical disk drive, a flash memory, or the like. The removable storage drive 1114 reads from and/or writes to a removable storage unit 1118 in a well known manner. Removable storage unit 1118 may comprise a floppy disk, magnetic tape, optical disk, etc. which is read by and written to by removable storage drive 1114. As will be appreciated by persons skilled in the relevant art(s), removable storage unit 1118 includes a computer usable storage medium having stored therein computer software and/or data.

In alternative implementations, secondary memory 1110 may include other similar means for allowing computer programs or other instructions to be loaded into computer system 1100. Such means may include, for example, a removable storage unit 1122 and an interface 1120. Examples of such means may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units 1122 and interfaces 1120 which allow software and data to be transferred from the removable storage unit 1122 to computer system 1100.

Computer system 1100 may also include a communications interface 1124. Communications interface 1124 allows software and data to be transferred between computer system 1100 and external devices. Communications interface 1124 may include a modem, a network interface (such as an Ethernet card), a communications port, a PCMCIA slot and card, or the like. Software and data transferred via communications interface 1124 are in the form of signals which may be electronic, electromagnetic, optical, or other signals capable of being received by communications interface 1124. These signals are provided to communications interface 1124 via a communications path 1126. Communications path 1126 carries signals and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link or other communications channels.

In this document, the terms “computer program medium” and “computer usable medium” are used to generally refer to media such as removable storage unit 1118, removable storage unit 1122, and a hard disk installed in hard disk drive 1112. Signals carried over communications path 1126 can also embody the logic described herein. Computer program medium and computer usable medium can also refer to memories, such as main memory 1108 and secondary memory 1110, which can be memory semiconductors (e.g. DRAMs, etc.). These computer program products are means for providing software to computer system 1100.

Computer programs (also called computer control logic) are stored in main memory 1108 and/or secondary memory 1110. Computer programs may also be received via communications interface 1124. Such computer programs, when executed, enable computer system 1100 to implement the present invention as discussed herein. In particular, the computer programs, when executed, enable processor 1104 to implement the processes of the present invention, such as the method illustrated by the flowcharts of FIGS. 3-10. Accordingly, such computer programs represent controllers of the computer system 1100. Where the invention is implemented using software, the software may be stored in a computer program product and loaded into computer system 1100 using removable storage drive 1114, interface 1120, hard drive 1112 or communications interface 1124.

The invention is also directed to computer program products comprising software stored on any computer useable medium. Such software, when executed in one or more data processing device, causes a data processing device(s) to operate as described herein. Embodiments of the invention employ any computer useable or readable medium, known now or in the future. Examples of computer useable mediums include, but are not limited to, primary storage devices (e.g., any type of random access memory), secondary storage devices (e.g., hard drives, floppy disks, CD ROMS, ZIP disks, tapes, magnetic storage devices, optical storage devices, MEMS, nanotechnological storage device, etc.), and communication mediums (e.g., wired and wireless communications networks, local area networks, wide area networks, intranets, etc.).

While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example only, and not limitation. It will be understood by those skilled in the relevant art(s) that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined in the appended claims. It should be understood that the invention is not limited to these examples. The invention is applicable to any elements operating as described herein. Accordingly, the breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.
Claims (20)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US13/326,892 US9672126B2 (en) | 2011-12-15 | 2011-12-15 | Hybrid data replication |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20130159249A1 (en) | 2013-06-20 |
| US9672126B2 (en) | 2017-06-06 |
Family
ID=48611216
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US13/326,892 Active 2032-12-10 US9672126B2 (en) | 2011-12-15 | 2011-12-15 | Hybrid data replication |
Country Status (1)
| Country | Link |
|---|---|
| US (1) | US9672126B2 (en) |
Cited By (24)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8935207B2 (en) | 2013-02-14 | 2015-01-13 | Sap Se | Inspecting replicated data |
| US9110847B2 (en) | 2013-06-24 | 2015-08-18 | Sap Se | N to M host system copy |
| US20160224327A1 (en) * | 2015-02-02 | 2016-08-04 | Lenovo Enterprise Solutions (Singapore) Pte. Ltd. | Linking a Program with a Software Library |
| US20160328460A1 (en) * | 2015-05-08 | 2016-11-10 | International Business Machines Corporation | Idling individually specified objects during data replication |
| US9836516B2 (en) | 2013-10-18 | 2017-12-05 | Sap Se | Parallel scanners for log based replication |
| US9965536B2 (en) | 2013-12-27 | 2018-05-08 | Sybase, Inc. | Replication description model for data distribution |
| US20180307562A1 (en) * | 2016-10-14 | 2018-10-25 | Tencent Technology (Shenzhen) Company Limited | Data recovery method, device and storage medium |
| US10198493B2 (en) | 2013-10-18 | 2019-02-05 | Sybase, Inc. | Routing replicated data based on the content of the data |
| US20190052516A1 (en) * | 2017-08-11 | 2019-02-14 | Microsoft Technology Licensing, Llc | Correlation across non-logging components |
| US20190266276A1 (en) * | 2018-02-26 | 2019-08-29 | Servicenow, Inc. | Instance data replication |
| US10484461B2 (en) | 2017-08-11 | 2019-11-19 | Microsoft Technology Licensing, Llc | Correlation across non-logging components |
| US10965517B2 (en) | 2017-08-11 | 2021-03-30 | Microsoft Technology Licensing, Llc | Correlation across non-logging components |
| US11182260B1 (en) | 2021-03-02 | 2021-11-23 | International Business Machines Corporation | Avoiding recovery log archive access in database accelerator environments |
| US11226878B1 (en) | 2021-03-02 | 2022-01-18 | International Business Machines Corporation | Accelerator-based database recovery |
| US11226984B2 (en) * | 2019-08-13 | 2022-01-18 | Capital One Services, Llc | Preventing data loss in event driven continuous availability systems |
| USRE49042E1 (en) * | 2015-08-31 | 2022-04-19 | Paypal, Inc. | Data replication between databases with heterogenious data platforms |
| US11397718B2 (en) | 2020-09-09 | 2022-07-26 | International Business Machines Corporation | Dynamic selection of synchronization update path |
| CN115048453A (en) * | 2022-05-17 | 2022-09-13 | 度小满科技(北京)有限公司 | Data synchronization method, device, equipment and storage medium |
| US11500733B2 (en) | 2021-03-19 | 2022-11-15 | International Business Machines Corporation | Volatile database caching in a database accelerator |
| US11630814B2 (en) * | 2020-12-10 | 2023-04-18 | International Business Machines Corporation | Automated online upgrade of database replication |
| US11675809B2 (en) | 2021-03-02 | 2023-06-13 | International Business Machines Corporation | Replicating data changes using multiple storage devices and tracking records of pending data changes stored on the storage devices |
| US11704335B2 (en) | 2020-11-13 | 2023-07-18 | International Business Machines Corporation | Data synchronization in a data analysis system |
| US11797570B2 (en) | 2021-03-19 | 2023-10-24 | International Business Machines Corporation | Asynchronous persistency of replicated data changes in a database accelerator |
| US12360855B1 (en) * | 2020-05-05 | 2025-07-15 | Cohesity, Inc. | Systems and methods for protecting data |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10664361B1 (en) | 2018-07-17 | 2020-05-26 | Amazon Technologies, Inc. | Transactionally consistent backup of partitioned storage |
| US11537310B2 (en) | 2021-02-05 | 2022-12-27 | Microsoft Technology Licensing, Llc | Threading of replication based on data type |
Citations (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5991768A (en) * | 1996-06-21 | 1999-11-23 | Oracle Corporation | Finer grained quiescence for data replication |
| US20020133507A1 (en) * | 2001-03-16 | 2002-09-19 | Iti, Inc. | Collision avoidance in database replication systems |
| US20050149578A1 (en) * | 2003-11-04 | 2005-07-07 | Sustman Paul A. | Hybrid real-time data replication |
| US20050172092A1 (en) * | 2004-02-04 | 2005-08-04 | Lam Wai T. | Method and system for storing data |
| US20050203908A1 (en) * | 2004-03-12 | 2005-09-15 | Sahn Lam | Managing data replication policies |
| US20060195666A1 (en) * | 2005-02-25 | 2006-08-31 | Naoko Maruyama | Switching method of data replication mode |
| US7509468B1 (en) * | 2006-02-02 | 2009-03-24 | Symantec Operating Corporation | Policy-based data protection |
| US7603395B1 (en) * | 2006-05-02 | 2009-10-13 | Emc Corporation | Using pseudosnapshots for continuous data protection systems to surface a copy of data |
| US20090300304A1 (en) * | 2008-06-02 | 2009-12-03 | International Business Machines Corporation | Managing consistency groups using heterogeneous replication engines |
| US20090313311A1 (en) * | 2008-06-12 | 2009-12-17 | Gravic, Inc. | Mixed mode synchronous and asynchronous replication system |
| US20140289188A1 (en) * | 2013-03-15 | 2014-09-25 | Factual, Inc. | Apparatus, systems, and methods for batch and realtime data processing |
| US9268811B1 (en) * | 2010-10-25 | 2016-02-23 | Symantec Corporation | Replay of writes in replication log |
Family Cites Families (15)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4977594A (en) * | 1986-10-14 | 1990-12-11 | Electronic Publishing Resources, Inc. | Database usage metering and protection system and method |
| JP2004533220A (en) | 2001-02-21 | 2004-11-04 | ディヴァーサ コーポレイション | Enzyme having α-amylase activity and method of using the same |
| US7739240B2 (en) | 2002-12-09 | 2010-06-15 | Hewlett-Packard Development Company, L.P. | Replication and replica management in a wide area file system |
| US7415467B2 (en) * | 2003-03-06 | 2008-08-19 | Ixion, Inc. | Database replication system |
| US8010484B2 (en) | 2003-06-16 | 2011-08-30 | Sap Aktiengesellschaft | Generating data subscriptions based on application data |
| US7240054B2 (en) * | 2004-02-27 | 2007-07-03 | International Business Machines Corporation | Techniques to preserve data constraints and referential integrity in asynchronous transactional replication of relational tables |
| US7360111B2 (en) * | 2004-06-29 | 2008-04-15 | Microsoft Corporation | Lossless recovery for computer systems with remotely dependent data recovery |
| US20060047713A1 (en) * | 2004-08-03 | 2006-03-02 | Wisdomforce Technologies, Inc. | System and method for database replication by interception of in memory transactional change records |
| US7523110B2 (en) | 2005-03-03 | 2009-04-21 | Gravic, Inc. | High availability designated winner data replication |
| US7571168B2 (en) | 2005-07-25 | 2009-08-04 | Parascale, Inc. | Asynchronous file replication and migration in a storage network |
| US7865673B2 (en) * | 2005-11-04 | 2011-01-04 | Oracle America, Inc. | Multiple replication levels with pooled devices |
| US7651593B2 (en) | 2005-12-19 | 2010-01-26 | Commvault Systems, Inc. | Systems and methods for performing data replication |
| US20080010513A1 (en) | 2006-06-27 | 2008-01-10 | International Business Machines Corporation | Controlling computer storage systems |
| WO2010102084A2 (en) | 2009-03-05 | 2010-09-10 | Coach Wei | System and method for performance acceleration, data protection, disaster recovery and on-demand scaling of computer applications |
| US9378105B2 (en) * | 2010-12-10 | 2016-06-28 | Veritas Technologies Llc | System and method for optimizing replication |
- 2011-12-15: US application US13/326,892, patent US9672126B2, status Active
Cited By (35)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8935207B2 (en) | 2013-02-14 | 2015-01-13 | Sap Se | Inspecting replicated data |
| US9110847B2 (en) | 2013-06-24 | 2015-08-18 | Sap Se | N to M host system copy |
| US10198493B2 (en) | 2013-10-18 | 2019-02-05 | Sybase, Inc. | Routing replicated data based on the content of the data |
| US9836516B2 (en) | 2013-10-18 | 2017-12-05 | Sap Se | Parallel scanners for log based replication |
| US9965536B2 (en) | 2013-12-27 | 2018-05-08 | Sybase, Inc. | Replication description model for data distribution |
| US20160224327A1 (en) * | 2015-02-02 | 2016-08-04 | Lenovo Enterprise Solutions (Singapore) Pte. Ltd. | Linking a Program with a Software Library |
| US20160328461A1 (en) * | 2015-05-08 | 2016-11-10 | International Business Machines Corporation | Idling individually specified objects during data replication |
| US10083216B2 (en) * | 2015-05-08 | 2018-09-25 | International Business Machines Corporation | Idling individually specified objects during data replication |
| US10089375B2 (en) * | 2015-05-08 | 2018-10-02 | International Business Machines Corporation | Idling individually specified objects during data replication |
| US20160328460A1 (en) * | 2015-05-08 | 2016-11-10 | International Business Machines Corporation | Idling individually specified objects during data replication |
| USRE49042E1 (en) * | 2015-08-31 | 2022-04-19 | Paypal, Inc. | Data replication between databases with heterogenious data platforms |
| US20180307562A1 (en) * | 2016-10-14 | 2018-10-25 | Tencent Technology (Shenzhen) Company Limited | Data recovery method, device and storage medium |
| US11429488B2 (en) * | 2016-10-14 | 2022-08-30 | Tencent Technology (Shenzhen) Company Limited | Data recovery method based on snapshots, device and storage medium |
| US20190052516A1 (en) * | 2017-08-11 | 2019-02-14 | Microsoft Technology Licensing, Llc | Correlation across non-logging components |
| US10560308B2 (en) * | 2017-08-11 | 2020-02-11 | Microsoft Technology Licensing, Llc | Correlation across non-logging components |
| US10965517B2 (en) | 2017-08-11 | 2021-03-30 | Microsoft Technology Licensing, Llc | Correlation across non-logging components |
| US10484461B2 (en) | 2017-08-11 | 2019-11-19 | Microsoft Technology Licensing, Llc | Correlation across non-logging components |
| USRE49866E1 (en) * | 2017-08-11 | 2024-03-05 | Microsoft Technology Licensing, Llc | Correlation across non-logging components |
| USRE49366E1 (en) | 2017-08-11 | 2023-01-10 | Microsoft Technology Licensing, Llc | Correlation across non-logging components |
| USRE49914E1 (en) | 2017-08-11 | 2024-04-09 | Microsoft Technology Licensing, Llc | Correlation across non-logging components |
| US10990605B2 (en) * | 2018-02-26 | 2021-04-27 | Servicenow, Inc. | Instance data replication |
| US20190266276A1 (en) * | 2018-02-26 | 2019-08-29 | Servicenow, Inc. | Instance data replication |
| US11226984B2 (en) * | 2019-08-13 | 2022-01-18 | Capital One Services, Llc | Preventing data loss in event driven continuous availability systems |
| US12436969B2 (en) | 2019-08-13 | 2025-10-07 | Capital One Services, Llc | Preventing data loss in event driven continuous availability systems |
| US11921745B2 (en) | 2019-08-13 | 2024-03-05 | Capital One Services, Llc | Preventing data loss in event driven continuous availability systems |
| US12360855B1 (en) * | 2020-05-05 | 2025-07-15 | Cohesity, Inc. | Systems and methods for protecting data |
| US11397718B2 (en) | 2020-09-09 | 2022-07-26 | International Business Machines Corporation | Dynamic selection of synchronization update path |
| US11704335B2 (en) | 2020-11-13 | 2023-07-18 | International Business Machines Corporation | Data synchronization in a data analysis system |
| US11630814B2 (en) * | 2020-12-10 | 2023-04-18 | International Business Machines Corporation | Automated online upgrade of database replication |
| US11182260B1 (en) | 2021-03-02 | 2021-11-23 | International Business Machines Corporation | Avoiding recovery log archive access in database accelerator environments |
| US11675809B2 (en) | 2021-03-02 | 2023-06-13 | International Business Machines Corporation | Replicating data changes using multiple storage devices and tracking records of pending data changes stored on the storage devices |
| US11226878B1 (en) | 2021-03-02 | 2022-01-18 | International Business Machines Corporation | Accelerator-based database recovery |
| US11797570B2 (en) | 2021-03-19 | 2023-10-24 | International Business Machines Corporation | Asynchronous persistency of replicated data changes in a database accelerator |
| US11500733B2 (en) | 2021-03-19 | 2022-11-15 | International Business Machines Corporation | Volatile database caching in a database accelerator |
| CN115048453A (en) * | 2022-05-17 | 2022-09-13 | 度小满科技(北京)有限公司 | Data synchronization method, device, equipment and storage medium |
Also Published As
| Publication number | Publication date |
|---|---|
| US9672126B2 (en) | 2017-06-06 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US9672126B2 (en) | Hybrid data replication | |
| US8996458B2 (en) | High volume, high speed adaptive data replication | |
| KR101956236B1 (en) | Data replication technique in database management system | |
| US11157370B2 (en) | Consistent backup of a distributed database system | |
| US9760617B2 (en) | Applying transaction log in parallel | |
| US9785518B2 (en) | Multi-threaded transaction log for primary and restore/intelligence | |
| US8412674B2 (en) | Replication resynchronization | |
| US11132350B2 (en) | Replicable differential store data structure | |
| US7610314B2 (en) | Online tablespace recovery for export | |
| US20070294319A1 (en) | Method and apparatus for processing a database replica | |
| US20070288526A1 (en) | Method and apparatus for processing a database replica | |
| US20150205850A1 (en) | Eager replication of uncommitted transactions | |
| EP2746971A2 (en) | Replication mechanisms for database environments | |
| WO2019109854A1 (en) | Data processing method and device for distributed database, storage medium, and electronic device | |
| CN107908503A (en) | Recover database from standby system streaming | |
| WO2009004620A2 (en) | Method and system for data storage and management | |
| US11176004B2 (en) | Test continuous log replay | |
| US11966297B2 (en) | Identifying database archive log dependency and backup copy recoverability | |
| US11494271B2 (en) | Dynamically updating database archive log dependency and backup copy recoverability | |
| US7698319B2 (en) | Database system management method, database system, database device, and backup program | |
| US20060190460A1 (en) | Method and mechanism of handling reporting transactions in database systems | |
| US11372838B2 (en) | Parallel processing of changes in a distributed system | |
| US8892830B2 (en) | Changing ownership of cartridges | |
| US11150964B1 (en) | Sequential processing of changes in a distributed system | |
| US11301341B2 (en) | Replication system takeover with handshake |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: SYBASE, INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: DEWALL, RHETT DONDI; SHANG, HEPING; SIGNING DATES FROM 20111201 TO 20111205; REEL/FRAME: 027388/0839 |
| | FEPP | Fee payment procedure | Free format text: PAYER NUMBER DE-ASSIGNED (ORIGINAL EVENT CODE: RMPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 4 |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 8 |