CN115357661A - Database data synchronization method and device, storage medium and processor - Google Patents
- Publication number
- CN115357661A (application number CN202210911970.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- event
- database
- target
- synchronized
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
- G06F11/3476—Data logging
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
- G06F16/244—Grouping and aggregation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24553—Query execution of query operations
- G06F16/24554—Unary operations; Data partitioning operations
- G06F16/24556—Aggregation; Duplicate elimination
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application discloses a data synchronization method and device for a database, a storage medium, and a processor. The method includes: acquiring the data to be synchronized from a source database that requires data synchronization, where the data to be synchronized includes data generated by a target event; splitting the target event of the data to be synchronized to obtain first data generated by a first event and second data generated by a second event; and performing data synchronization according to the first event and the second event, so that the data to be synchronized is synchronized to the corresponding target database. The target database is of a different type from the source database. The target event is executed with different data structures in different types of databases, whereas the first event and the second event are executed with the same data structure in different types of databases. This solves the problem in the related art that data synchronization involving distributed databases has to be implemented by merging, sorting, and aggregating logs, which results in poor operability and, in some cases, makes synchronization impossible.
Description
Technical field
The present application relates to the field of data synchronization, and in particular to a data synchronization method and device for a database, a storage medium, and a processor.
Background
The overall data synchronization architecture for a distributed database consists of the following parts: the source database, which currently may be a distributed relational database, a distributed file system, an unstructured database, and so on; the target database, which currently may likewise be a distributed relational database, a distributed file system, an unstructured database, and so on; the management node cluster, which handles the data verification configuration, pushes that configuration to the verification nodes, and receives the data synchronization status, progress, and other information fed back by the verification nodes; the synchronization node cluster, which is the module that executes the specific data verification process; and the coordinator cluster, which is the module that coordinates data verification.
When a distributed database, as the source, is synchronized to a non-distributed database, the underlying shard databases advance through synchronization at different rates. When the source data undergoes a cross-shard update, the logs must be aggregated and sorted to guarantee eventual consistency of the data, which is very costly in performance and poor in operability. When a non-distributed database, as the source, is synchronized to a distributed database, a field may be the shard key in the target distributed database; after the source updates the value of that field, the target cannot be synchronized directly because it forbids clients from performing cross-shard operations. In addition, the data synchronization process cannot be redone: if a record at the source has undergone successive primary key changes, resetting the log position to an earlier point in time and redoing the entire synchronization process produces a primary key conflict error.
No effective solution has yet been proposed for the problem in the related art that data synchronization involving distributed databases has to be implemented by merging, sorting, and aggregating logs, which results in poor operability and, in some cases, makes synchronization impossible.
Summary of the invention
The main purpose of the present application is to provide a data synchronization method and system for a database, so as to solve the problem in the related art that data synchronization involving distributed databases has to be implemented by merging, sorting, and aggregating logs, which results in poor operability and, in some cases, makes synchronization impossible.
To achieve the above purpose, according to one aspect of the present application, a data synchronization method for a database is provided, including: acquiring the data to be synchronized from a source database that requires data synchronization, where the data to be synchronized includes data generated by a target event; splitting the target event of the data to be synchronized to obtain first data and second data, where the first data is data generated by a first event, the second data is data generated by a second event, and the superposition of the first event and the second event is identical to the target event; and performing data synchronization according to the first event and the second event, so that the data to be synchronized is synchronized to the corresponding target database, where the target database is of a different type from the source database, the target event is executed with different data structures in different types of databases, and the first event and the second event are executed with the same data structure in different types of databases.
Optionally, performing data synchronization according to the first event and the second event, so that the data to be synchronized is synchronized to the corresponding target database, includes: determining, according to the types of the source database and the target database, whether to modify the first data or the second data; where the first data or the second data is modified, inputting the first event and the second event into the target database according to the modified first data or second data together with the unmodified second data or first data, so as to synchronize the data to be synchronized to the target database; and/or, where the first data or the second data is not modified, inputting the first event and the second event into the target database according to the execution attribute of the first data or the second data, so as to synchronize the data to be synchronized to the target database.
Optionally, where the source database is a distributed database and the target database is a non-distributed database, it is determined that the first data needs to be modified; the first data is modified in a preset manner; and the first event and the second event are input into the target database according to the modified first data and the unmodified second data, so as to synchronize the data to be synchronized to the target database.
Optionally, where the source database is a non-distributed database and the target database is a distributed database, it is determined that neither the first data nor the second data needs to be modified; and the first event and the second event are input into the target database according to the first execution attribute of the first event and the second event, so as to synchronize the data to be synchronized to the target database.
Optionally, the method further includes: setting an attribute parameter of the second data to determine a second execution attribute of the second event; and inputting the first event and the second event into the target database according to the second execution attribute, so as to synchronize the data to be synchronized to the target database.
Optionally, the target event is an update event, the first event is a delete event, and the second event is an insert event.
Optionally, modifying the first data in the preset manner includes: adding a shard key to the search condition in the statement data of the delete event, where the shard key identifies the storage location and the modified version of the data, and where, when the first event is a delete event, the first data is the statement data of that delete event; and/or the first execution attribute is the attribute under which the delete event and the insert event are executed and written into the target database; and/or the second execution attribute is the attribute under which the insert event, when written into the target database, follows the execution strategy of inserting the row if it does not exist and updating it if it does.
To achieve the above purpose, according to another aspect of the present application, a data synchronization device for a database is provided. The device includes: an acquisition module, configured to acquire the data to be synchronized from a source database that requires data synchronization, where the data to be synchronized includes data generated by a target event; a splitting module, configured to split the target event of the data to be synchronized to obtain first data and second data, where the first data is data generated by a first event, the second data is data generated by a second event, and the superposition of the first event and the second event is identical to the target event; and a synchronization module, configured to perform data synchronization according to the first event and the second event, so that the data to be synchronized is synchronized to the corresponding target database, where the target database is of a different type from the source database, the target event is executed with different data structures in different types of databases, and the first event and the second event are executed with the same data structure in different types of databases.
According to another aspect of the present application, a computer-readable storage medium is also provided. The storage medium is used to store a program, and the program executes the data synchronization method for a database described in any one of the above.
According to another aspect of the present application, an electronic device is also provided, including one or more processors and a memory. The memory is used to store one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors implement the data synchronization method for a database described in any one of the above.
Through the present application, the data to be synchronized is acquired from a source database that requires data synchronization, where the data to be synchronized includes data generated by a target event; the target event of the data to be synchronized is split to obtain first data and second data, where the first data is data generated by a first event, the second data is data generated by a second event, and the superposition of the first event and the second event is identical to the target event; and data synchronization is performed according to the first event and the second event, so that the data to be synchronized is synchronized to the corresponding target database, where the target database is of a different type from the source database, the target event is executed with different data structures in different types of databases, and the first event and the second event are executed with the same data structure in different types of databases.
By acquiring the data to be synchronized generated by the target event and splitting the target event into a first event and a second event that can each be synchronized directly to the target database, the data to be synchronized is split into first data and second data. In this way, the data of a target event that would otherwise depend on log merging and aggregation is split and synchronized directly into the target database. This lowers the operational difficulty of synchronizing the data of the target event and achieves the technical effect of effective and accurate data synchronization without log merging and aggregation, thereby solving the problem in the related art that data synchronization involving distributed databases has to be implemented by merging, sorting, and aggregating logs, which results in poor operability and, in some cases, makes synchronization impossible.
Description of drawings
The drawings, which form a part of the present application, are provided for further understanding of the application. The illustrative embodiments of the application and their descriptions are used to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of a data synchronization method for a database provided according to an embodiment of the present application;
Fig. 2 is a flowchart of data synchronization for a distributed database in the related art, provided according to an implementation of the present application;
Fig. 3 is a flowchart of data synchronization for a distributed database provided according to an implementation of the present application;
Fig. 4 is a schematic diagram of an actual execution example of data synchronization for a distributed database provided according to an implementation of the present application;
Fig. 5 is a flowchart of data synchronization with a non-distributed database as the source, provided according to an implementation of the present application;
Fig. 6 is a flowchart of repeatable writing to the target database during data synchronization, provided according to an implementation of the present application;
Fig. 7 is a schematic diagram of a data synchronization device for a database provided according to an embodiment of the present application;
Fig. 8 is a schematic diagram of an electronic device provided according to an embodiment of the present application.
Detailed description
It should be noted that, where there is no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
To enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in this application, without creative effort, shall fall within the scope of protection of this application.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used may be interchanged where appropriate, so that the embodiments of the application described herein can be implemented. In addition, the terms "comprising" and "having", and any variations of them, are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
It should be noted that the relevant information (including but not limited to user equipment information and user personal information) and data (including but not limited to data for display and data for analysis) involved in this disclosure are information and data authorized by the user or fully authorized by all parties. For example, an interface is provided between this system and the relevant users or institutions. Before relevant information is obtained, an acquisition request must be sent to the user or institution through the interface, and the relevant information is obtained only after the consent fed back by the user or institution has been received.
Embodiment
The present invention is described below in conjunction with preferred implementation steps. Fig. 1 is a flowchart of a data synchronization method for a database provided according to an embodiment of the present application. As shown in Fig. 1, the method includes the following steps:
Step S101: acquire the data to be synchronized from a source database that requires data synchronization, where the data to be synchronized includes data generated by a target event;
Step S102: split the target event of the data to be synchronized to obtain first data and second data, where the first data is data generated by a first event, the second data is data generated by a second event, and the superposition of the first event and the second event is identical to the target event;
Step S103: perform data synchronization according to the first event and the second event, so that the data to be synchronized is synchronized to the corresponding target database, where the target database is of a different type from the source database, the target event is executed with different data structures in different types of databases, and the first event and the second event are executed with the same data structure in different types of databases.
Through the above steps, the data to be synchronized generated by the target event is acquired, and the target event is split into a first event and a second event that can each be synchronized directly to the target database, so that the data to be synchronized is split into first data and second data. In this way, the data of a target event that would otherwise depend on log merging and aggregation is split and synchronized directly into the target database. This lowers the operational difficulty of synchronizing the data of the target event and achieves the technical effect of effective and accurate data synchronization without log merging and aggregation, thereby solving the problem in the related art that data synchronization involving distributed databases has to be implemented by merging, sorting, and aggregating logs, which results in poor operability and, in some cases, makes synchronization impossible.
The above steps may be executed by the database itself or by a functional module for managing the database. The functional module may be placed in the source database or the target database, may be set up separately outside the source and target databases, or may be placed on a third-party device outside the source and target databases, which establishes data links with both databases and manages the data synchronization between them.
The data to be synchronized covers the case where the target event is an update (modification) event. When a modification event occurs in the database, its data changes, and the changed data can be recorded for incremental data synchronization. Distributed and non-distributed databases have different data structures. For data synchronization of update events, a non-distributed database cannot establish the order of its data, whereas a distributed database can determine it from the shard key, so synchronizing update-event data between distributed and non-distributed databases faces certain obstacles.
This embodiment splits the update event into a first event and a second event whose superposition is identical to the target event, that is, the update event. The first event may be a delete event and the second event may be an insert event. Both delete and insert events can be synchronized directly in distributed and non-distributed databases, and when an update event is split, the resulting delete and insert events have a defined order. Therefore, the ordered delete and insert events allow the data of the update event to be synchronized directly into the target database.
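As a minimal sketch of the split described above, the following Python fragment turns one update event into an ordered delete/insert pair. The `ChangeEvent` type, its field names, and the `orders` table are hypothetical illustrations; the application does not specify a concrete event format.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class ChangeEvent:
    """One row-level change read from the source database log (hypothetical format)."""
    op: str                 # "insert", "update" or "delete"
    table: str
    before: Dict[str, Any]  # row image before the change (empty for insert)
    after: Dict[str, Any]   # row image after the change (empty for delete)


def split_update(event: ChangeEvent) -> List[ChangeEvent]:
    """Split an update into an ordered [delete, insert] pair; pass other events through."""
    if event.op != "update":
        return [event]
    # The delete carries the old row image, the insert carries the new one.
    delete = ChangeEvent("delete", event.table, before=dict(event.before), after={})
    insert = ChangeEvent("insert", event.table, before={}, after=dict(event.after))
    return [delete, insert]  # order matters: delete first, then insert


update = ChangeEvent(
    "update", "orders",
    before={"id": 1, "region": "east"},
    after={"id": 1, "region": "west"},
)
events = split_update(update)
print([e.op for e in events])
```

Applying the two returned events in order leaves the target in the same state as the original update would, which is the sense in which their superposition equals the target event.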
Optionally, performing data synchronization according to the first event and the second event, so that the data to be synchronized is synchronized to the corresponding target database, includes: determining, according to the types of the source database and the target database, whether to modify the first data or the second data; where the first data or the second data is modified, inputting the first event and the second event into the target database according to the modified first data or second data together with the unmodified second data or first data, so as to synchronize the data to be synchronized to the target database; and/or, where the first data or the second data is not modified, inputting the first event and the second event into the target database according to the execution attribute of the first data or the second data, so as to synchronize the data to be synchronized to the target database.
Specifically, implementation usually falls into two cases. In the first case, the source database is a distributed database and the target database is a non-distributed database. In the second case, the source database is a non-distributed database and the target database is a distributed database. The delete event and the insert event are handled differently in the two cases.
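The choice between the two cases can be sketched as a small decision function. The `DbKind` enum and the function name are hypothetical; the only rule taken from the text is that the delete event's data is modified when a distributed source is synchronized to a non-distributed target, and left unmodified in the opposite direction.

```python
from enum import Enum


class DbKind(Enum):
    """Hypothetical labels for the two kinds of database discussed in the text."""
    DISTRIBUTED = "distributed"
    NON_DISTRIBUTED = "non_distributed"


def delete_needs_modification(source: DbKind, target: DbKind) -> bool:
    """Return True when the delete event's data must be modified before it is applied.

    Case 1 (distributed -> non-distributed): modify the delete event.
    Case 2 (non-distributed -> distributed): apply both events unmodified.
    Same-kind pairs are outside the scope of the method; this sketch returns False.
    """
    return source is DbKind.DISTRIBUTED and target is DbKind.NON_DISTRIBUTED


print(delete_needs_modification(DbKind.DISTRIBUTED, DbKind.NON_DISTRIBUTED))
print(delete_needs_modification(DbKind.NON_DISTRIBUTED, DbKind.DISTRIBUTED))
```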
Specifically, in the first case, given the order of the delete event and the insert event, a delete event may delete the wrong data when it is executed, because the data version differs before and after the insert event is executed. Therefore, an identifier of the data version that the delete event targets, namely the shard key, needs to be added to the delete event to ensure that the delete event acts on exactly the intended data.
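The effect of adding the shard key to the delete condition can be illustrated with Python's built-in `sqlite3` module standing in for a non-distributed target. The table `orders`, the columns `id`, `region`, and `amount`, and the helper `build_delete` are assumptions made for illustration only; the application does not give a concrete statement format.

```python
import sqlite3
from typing import Any, Dict, List, Tuple


def build_delete(table: str, before: Dict[str, Any],
                 primary_key: List[str], shard_key: List[str]) -> Tuple[str, List[Any]]:
    """Build a parameterized DELETE whose WHERE clause covers primary key and shard key.

    Table and column names are interpolated directly, so they are assumed to come
    from trusted metadata; row values are always passed as parameters.
    """
    columns = list(primary_key) + [c for c in shard_key if c not in primary_key]
    where = " AND ".join(f"{col} = ?" for col in columns)
    params = [before[col] for col in columns]
    return f"DELETE FROM {table} WHERE {where}", params


# Old row image of an update that moved the row from region 'east' to 'west'.
sql, params = build_delete(
    "orders", {"id": 1, "region": "east", "amount": 10},
    primary_key=["id"], shard_key=["region"],
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)")
# The target already holds the newer version of the row (region = 'west').
conn.execute("INSERT INTO orders VALUES (1, 'west', 10)")

# A late-arriving delete for the old version matches nothing, so nothing is lost.
conn.execute(sql, params)
remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(sql, params, remaining)
```

A delete keyed on `id` alone would have removed the newer row in this scenario; including the shard key restricts the delete to the version it was generated for.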
可选的,在源数据库为分布式数据库,目标数据库为非分布式数据库的情况下,确定第一数据需要修改;根据预设方式对第一数据进行修改;根据修改后的第一数据,以及未修改的第二数据,将第一事件和第二事件输入到目标数据库,以将待同步数据同步到目标数据库。Optionally, when the source database is a distributed database and the target database is a non-distributed database, it is determined that the first data needs to be modified; the first data is modified according to a preset method; according to the modified first data, and For unmodified second data, input the first event and the second event into the target database, so as to synchronize the data to be synchronized to the target database.
在第二种情况下,考虑到分布式数据库出于性能和安全方面的考虑,通常会禁止客户端执行跨分片更新动作,这也将导致源端为非分布式数据库时涉及相关字段时会无法进行数据同步的问题。因此,对update事件进行解析,根据update事件的变更前后信息构造delete事件和insert事件,相当于对无法直接执行的语句进行等效转化,从而入库目标端分布式数据库,实现数据同步的最终一致性。也即是不需要对delete事件和insert事件进行修改,只需要将update事件拆分为delete事件和insert事件就可以进行数据同步了。In the second case, considering the performance and security considerations of distributed databases, the client is usually prohibited from performing cross-shard update actions, which will also cause the source to be non-distributed databases when related fields are involved. Unable to sync data. Therefore, analyzing the update event and constructing the delete event and insert event according to the information before and after the change of the update event is equivalent to equivalently converting the statement that cannot be directly executed, so as to store the target distributed database and achieve the final consistency of data synchronization sex. That is to say, there is no need to modify the delete event and insert event, and only need to split the update event into delete event and insert event to perform data synchronization.
可选的,在源数据库为非分布式数据库,目标数据库为分布式数据库的情况下,确定第一数据或第二数据不需要修改;根据第一事件的和第二事件的第一执行属性,将第一事件和第二事件输入到目标数据库,以将待同步数据同步到目标数据库。Optionally, when the source database is a non-distributed database and the target database is a distributed database, it is determined that the first data or the second data does not need to be modified; according to the first execution attribute of the first event and the second event, Input the first event and the second event into the target database, so as to synchronize the data to be synchronized to the target database.
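The equivalent split described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Event` type, the `split_update` function, and the column names `id`, `region`, and `amount` are hypothetical, and it assumes the parsed log supplies full before and after row images.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple

Row = Dict[str, Any]


@dataclass(frozen=True)
class Event:
    kind: str  # "delete" or "insert"
    row: Row   # full row image carried by the event


def split_update(before: Row, after: Row) -> Tuple[Event, Event]:
    """Turn one update event into an equivalent delete + insert pair.

    The delete carries the before-change image and the insert carries
    the after-change image, so applying both has the same net effect
    as the original update.
    """
    return Event("delete", dict(before)), Event("insert", dict(after))


# Hypothetical update that moves a row from one shard-key value to another.
delete_ev, insert_ev = split_update(
    {"id": 1, "region": "east", "amount": 10},
    {"id": 1, "region": "west", "amount": 10},
)
```

Keeping the full before image on the delete event is what later allows the shard key to be added to its search condition.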
It should be noted that, in both the first case and the second case, the insert event can be executed with an insert-if-absent, update-if-present strategy, which keeps the data safe before and after the insert event executes.
Optionally, the method further includes: setting attribute parameters of the second data to determine a second execution attribute of the second event; and inputting the first event and the second event into the target database according to the second execution attribute, so as to synchronize the data to be synchronized to the target database.
Optionally, modifying the first data in the preset manner includes: adding a shard key to the search condition in the statement data of the delete event, where the shard key identifies the storage location and modification version of the data and, when the first event is a delete event, the first data is the statement data of the delete event; and/or the first execution attribute is the attribute that the delete event and the insert event are executed and written to the target database; and/or the second execution attribute is the attribute that the insert event, when written to the target database, uses the insert-if-absent, update-if-present execution strategy.
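As a rough sketch of the two statement rewrites named here, the helpers below build a delete whose search condition includes the shard key and an insert that updates on key conflict. The function names, the table `orders`, and its columns are hypothetical. The generated text uses MySQL-style `INSERT ... ON DUPLICATE KEY UPDATE` with `%s` placeholders; other targets (for example Oracle) would need a different upsert form, and the upsert helper assumes the row has at least one non-key column.

```python
from typing import Any, Dict, List, Sequence, Tuple

Row = Dict[str, Any]


def build_delete(table: str, row: Row, pk_cols: Sequence[str],
                 shard_key_cols: Sequence[str]) -> Tuple[str, List[Any]]:
    """DELETE whose WHERE clause holds the primary key plus the shard key."""
    cols = list(pk_cols) + [c for c in shard_key_cols if c not in pk_cols]
    where = " AND ".join(f"{c} = %s" for c in cols)
    return f"DELETE FROM {table} WHERE {where}", [row[c] for c in cols]


def build_upsert(table: str, row: Row,
                 pk_cols: Sequence[str]) -> Tuple[str, List[Any]]:
    """INSERT that falls back to an update when the key already exists."""
    cols = list(row)
    placeholders = ", ".join(["%s"] * len(cols))
    updates = ", ".join(f"{c} = VALUES({c})" for c in cols if c not in pk_cols)
    sql = (f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders}) "
           f"ON DUPLICATE KEY UPDATE {updates}")
    return sql, [row[c] for c in cols]


delete_sql, delete_params = build_delete(
    "orders", {"id": 1, "region": "east", "amount": 10}, ["id"], ["region"])
upsert_sql, upsert_params = build_upsert(
    "orders", {"id": 1, "region": "west", "amount": 10}, ["id"])
```

The statements are returned with separate parameter lists so that values are bound by the database driver rather than interpolated into the SQL text.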
In addition, in practice a record often undergoes consecutive primary-key changes within a period of time. The data synchronization component replays the change history of the record by parsing logs. Sometimes, because of data repair or for other reasons, the synchronization position must be moved back to a historical point in time and synchronization run again; if the target table is not cleaned up first, primary-key conflicts prevent the data from being written to the target. The repeatable-write method splits each update event into delete + insert statements and applies the insert-if-absent, update-if-present strategy when the insert is written, so the synchronization process can be redone without affecting eventual consistency.
It should be noted that the steps shown in the flowcharts of the accompanying drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
It should be noted that the present application also provides an optional implementation, which is described in detail below.
This implementation provides a redoable cross-shard synchronization method between non-distributed and distributed databases. It mainly comprises: a method for synchronizing to a non-distributed database when cross-shard update data appears in a distributed source database; a method for writing to the target when the source is a non-distributed database and a field is the shard key at the target; and a method for repeatable synchronization writes. By means of equivalent data splitting, qualifying conditions on writes, and appropriate use of the database's native primary-key merging, the method avoids the log merging, sorting and aggregation step that synchronization from a distributed database to a non-distributed database would otherwise require, which greatly improves the performance and operability of data synchronization. It removes the restriction that cross-shard updates cannot be executed when a non-distributed database is synchronized to a distributed database. It also makes synchronization writes repeatable: during data repair there is no need to clean up existing data at the target, and the data reaches eventual consistency once the synchronization position catches up, which is of great practical value in operation and maintenance.
1. Method for synchronizing to a non-distributed database when cross-shard update data appears in a distributed source database:
Fig. 2 is a flowchart of data synchronization for a distributed database in the related art. As shown in Fig. 2, the conventional processing flow includes log aggregation and sorting steps, whose purpose is to ensure that the data still reaches eventual consistency when the underlying shard databases differ in synchronization progress. The conventional method is very costly in performance and has poor operability.
Fig. 3 is a flowchart of data synchronization for a distributed database according to an implementation of the present application. As shown in Fig. 3, when a cross-shard update occurs at the source, the proxy layer splits it into a delete action and an insert action; after log parsing, the logs of all underlying databases are read separately and synchronized independently. The SQL statements receive extra processing when written to the target: a qualifying condition is added to the delete statement, that is, the shard key is added to the original where condition in addition to the primary-key field; and the insert uses the insert-if-absent, update-if-present strategy. After this processing, eventual consistency is unaffected regardless of whether the target executes the delete statement or the insert statement first, and no complex log sorting and merging step is needed.
Fig. 4 is a schematic diagram of an actual execution example of data synchronization for a distributed database according to an implementation of the present application. Taking the execution process shown in Fig. 4 as an example, when data is written during synchronization, the delete statement and the insert...on duplicate key update statement are executed on different shards of the target-side distributed database. Whichever statement executes first, the eventual consistency of data synchronization is unaffected:
1) If the insert...on duplicate key update statement executes first and the delete statement second, the data inserted or updated by the insert statement is not removed by the broadcast delete, because the where condition of the delete statement includes the shard-key condition.
2) If the delete statement executes first and the insert...on duplicate key update statement second, the order matches the expected order and data synchronization reaches consistency.
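The two orderings above can be checked with a small in-memory model. This is only a sketch under stated assumptions: the target table is modeled as a Python dict keyed by a hypothetical primary key `id`, `region` stands in for the shard key, and `apply_delete`, `apply_upsert`, and `run` are illustrative names rather than part of the described system.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def apply_delete(table: Dict[int, Row], pk: int,
                 shard_key: Optional[str] = None) -> None:
    """Delete by primary key; if a shard key is given it must match too."""
    row = table.get(pk)
    if row is not None and (shard_key is None or row["region"] == shard_key):
        del table[pk]


def apply_upsert(table: Dict[int, Row], row: Row) -> None:
    """Insert if absent, update if present."""
    table[row["id"]] = dict(row)


def run(order: List[str], guarded: bool) -> Dict[int, Row]:
    """Apply the split events in the given order and return the final table."""
    table = {1: {"id": 1, "region": "east", "amount": 10}}
    for step in order:
        if step == "delete":
            apply_delete(table, 1, "east" if guarded else None)
        else:
            apply_upsert(table, {"id": 1, "region": "west", "amount": 10})
    return table


expected = {1: {"id": 1, "region": "west", "amount": 10}}
same_result = (run(["delete", "upsert"], guarded=True) == expected
               and run(["upsert", "delete"], guarded=True) == expected)
unguarded_loses_row = run(["upsert", "delete"], guarded=False) == {}
```

With the shard key in the delete condition both orders end in the same state, whereas a delete on the primary key alone removes the freshly written row when it arrives second.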
2. Method for writing to the target when the source is a non-distributed database and a field is the shard key at the target:
Fig. 5 is a flowchart of data synchronization with a non-distributed database as the source according to an implementation of the present application. As shown in Fig. 5, a distributed database usually prohibits clients from performing cross-shard updates for performance and security reasons, so when the source is a non-distributed database, data involving the relevant field cannot be synchronized directly. The method of this implementation parses the update event at the data synchronization component level and constructs delete and insert events from the before-change and after-change information of the update. This amounts to an equivalent transformation of a statement that cannot be executed directly, so that the data can be written to the target distributed database and data synchronization reaches eventual consistency.
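To illustrate why the split sidesteps the cross-shard restriction, the sketch below models a two-shard target. Everything here is an assumption for illustration: shards are plain dicts, the hypothetical column `tenant` is the shard key, and routing is a simple modulo, which a real distributed database would replace with its own routing rule.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

NUM_SHARDS = 2
# Each shard is modeled as a dict keyed by the primary key "id".
shards: List[Dict[int, Row]] = [dict() for _ in range(NUM_SHARDS)]


def shard_of(shard_key: int) -> int:
    """Hypothetical routing rule: modulo on an integer shard key."""
    return shard_key % NUM_SHARDS


def apply_update_as_delete_insert(before: Row, after: Row) -> None:
    """Apply an update that changes the shard key as delete + insert.

    Each statement touches exactly one shard, so no single statement
    has to move a row across shards.
    """
    shards[shard_of(before["tenant"])].pop(before["id"], None)
    shards[shard_of(after["tenant"])][after["id"]] = dict(after)


# Existing row lives on the shard that owns tenant 3.
shards[shard_of(3)][1] = {"id": 1, "tenant": 3, "name": "a"}

# Source-side update changes the shard key from 3 to 4.
apply_update_as_delete_insert(
    {"id": 1, "tenant": 3, "name": "a"},
    {"id": 1, "tenant": 4, "name": "a"},
)
```

After the call the row is gone from the shard for tenant 3 and present on the shard for tenant 4, which is the state a direct cross-shard update would have produced.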
3. Repeatable synchronization write method:
In business use, a record often undergoes consecutive primary-key changes within a period of time. The data synchronization component replays the change history of the record by parsing logs. Sometimes, because of data repair or for other reasons, the synchronization position must be moved back to a historical point in time and synchronization run again; if the target table is not cleaned up first, primary-key conflicts prevent the data from being written to the target. The repeatable-write method splits each update event into delete + insert statements and applies the insert-if-absent, update-if-present strategy when the insert is written, so the synchronization process can be redone without affecting eventual consistency.
Fig. 6 is a flowchart of repeatable writes during data synchronization according to an implementation of the present application. Fig. 6 compares how the conventional method and the method of this implementation handle repeated synchronization. With this method, the synchronization process can be replayed after the position has been adjusted arbitrarily, without cleaning up the target table, which makes it highly operable and practical.
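The contrast between the two approaches can be reproduced with Python's built-in `sqlite3` module. This is a sketch under assumptions, not the described system: an in-memory SQLite table `t` stands in for the target, SQLite's `INSERT OR REPLACE` stands in for the insert-if-absent, update-if-present strategy, and the log is a hypothetical record whose primary key changes from 1 to 2 to 3. The shard-key condition is omitted because the model has a single table.

```python
import sqlite3


def fresh_db() -> sqlite3.Connection:
    """Create an empty in-memory target table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
    return conn


# Conventional replay: the log is applied as recorded.
TRADITIONAL = [
    ("INSERT INTO t (id, val) VALUES (?, ?)", (1, "a")),
    ("UPDATE t SET id = ? WHERE id = ?", (2, 1)),
    ("UPDATE t SET id = ? WHERE id = ?", (3, 2)),
]

# Redoable replay: each update becomes delete + upsert.
UPSERT = "INSERT OR REPLACE INTO t (id, val) VALUES (?, ?)"
DELETE = "DELETE FROM t WHERE id = ?"
REDOABLE = [
    (UPSERT, (1, "a")),
    (DELETE, (1,)), (UPSERT, (2, "a")),
    (DELETE, (2,)), (UPSERT, (3, "a")),
]


def replay(conn: sqlite3.Connection, statements) -> None:
    """Apply every statement of the log in order."""
    for sql, params in statements:
        conn.execute(sql, params)


def rows(conn: sqlite3.Connection):
    """Return the table contents ordered by primary key."""
    return conn.execute("SELECT id, val FROM t ORDER BY id").fetchall()
```

Replaying `TRADITIONAL` a second time against an uncleaned table fails on the last update, because a row with id 3 already exists; replaying `REDOABLE` any number of times leaves the single row with id 3.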
Compared with the prior art, this implementation uses qualifying conditions on writes to avoid the log merging, sorting and aggregation step in synchronization from a distributed database to a non-distributed database, which reduces the overall complexity of the synchronization system and greatly increases synchronization speed. Equivalent data splitting removes the restriction that cross-shard updates cannot be executed when a non-distributed database is synchronized to a distributed database, while preserving eventual consistency. Appropriate use of the database's native primary-key conflict merging, combined with equivalent data splitting, makes data synchronization repeatable, which is highly operable and valuable for operation and maintenance in practice.
For example, a production system uses both a MySQL-based distributed database and an Oracle database, divided by responsibility. Initially a full data cutover is required to import all business data from the distributed database into the Oracle database. After that, about 110 GB of business data generated in the distributed database is synchronized to the Oracle database every day, while the relevant configuration data in the Oracle database is synchronized to the distributed database.
The specific implementation steps are as follows:
1. A data synchronization system developed with the method of this implementation synchronizes table data; it supports both full synchronization and real-time incremental synchronization;
2. In this embodiment data must be synchronized in both directions between the distributed and non-distributed databases, so two separate synchronization links are established: one from the distributed database to the Oracle database and one from the Oracle database to the distributed database. After the corresponding source database information and target server information are filled in, the synchronization mapping is generated automatically with one click;
3. Start the synchronization link from the distributed database to the Oracle database, perform one full data synchronization, and record the time at which the full synchronization started;
4. Move the incremental synchronization position to a point before the full-synchronization start time, then start incremental synchronization from the distributed database to the Oracle database. Cross-shard update actions are handled by the method described in this implementation and require no extra user action;
5. Start the synchronization link from the Oracle database to the distributed database and perform incremental synchronization of configuration data. When a field is the shard key in the distributed database, the update statement cannot be executed there directly because cross-shard actions are prohibited. The synchronization system uses the method described in this implementation to split the data equivalently and synchronize it to the target.
6. During daily synchronization, an application occasionally deletes by mistake some of the incremental data synchronized from the distributed database to the Oracle database. The synchronization position can then be adjusted by time, and the method described in this implementation redoes the synchronization process to repair the data.
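Steps 4 and 6 both rely on moving the incremental position back to a point in time. One way this might look is sketched below; the checkpoint list of `(timestamp, position)` pairs, the position labels, and the function `position_before` are hypothetical, since the source does not say how positions are stored. The list is assumed to be sorted by timestamp.

```python
from bisect import bisect_left
from typing import List, Tuple


def position_before(checkpoints: List[Tuple[int, str]],
                    full_sync_start: int) -> str:
    """Return the latest log position recorded strictly before a given time.

    `checkpoints` holds (timestamp, position) pairs sorted by timestamp.
    """
    times = [t for t, _ in checkpoints]
    idx = bisect_left(times, full_sync_start)
    if idx == 0:
        raise ValueError("no checkpoint earlier than the requested time")
    return checkpoints[idx - 1][1]


# Hypothetical checkpoints recorded by the synchronization component.
checkpoints = [(100, "pos-1"), (200, "pos-2"), (300, "pos-3")]
start_position = position_before(checkpoints, full_sync_start=250)
```

Starting slightly too early is harmless here because the delete + insert split makes replay repeatable, so events that the full synchronization already covered are simply applied again.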
Distributed and non-distributed databases are synchronized efficiently and without obstacles; the synchronization process can be redone; the incremental synchronization position can be adjusted freely by point in time; and the data reaches eventual consistency once incremental synchronization catches up.
An embodiment of the present application further provides a data synchronization apparatus for a database. It should be noted that this apparatus can be used to execute the data synchronization method for a database provided by the embodiments of the present application. The apparatus is described below.
Fig. 7 is a schematic diagram of a data synchronization apparatus for a database according to an embodiment of the present application. As shown in Fig. 7, the apparatus includes an acquisition module 72, a splitting module 74, and a synchronization module 76, which are described in detail below.
The acquisition module 72 is configured to acquire data to be synchronized from a source database that requires data synchronization, where the data to be synchronized includes data generated by a target event. The splitting module 74 is connected to the acquisition module 72 and is configured to split the target event of the data to be synchronized to obtain first data and second data, where the first data is generated by a first event, the second data is generated by a second event, and the combination of the first event and the second event is equivalent to the target event. The synchronization module 76 is connected to the splitting module 74 and is configured to perform data synchronization according to the first event and the second event, synchronizing the data to be synchronized to the corresponding target database, where the target database and the source database are of different types, the target event executes with different data structures in different types of database, and the first event and the second event execute with the same data structure in different types of database.
The data synchronization apparatus provided by the embodiment of the present application acquires the data to be synchronized generated by the target event and splits the target event into a first event and a second event that can be synchronized directly to the target database, thereby splitting the data to be synchronized into first data and second data. Data of a target event that would otherwise rely on log merging and aggregation is thus split and synchronized directly to the target database. This makes the data of the target event easier to synchronize and achieves effective, accurate synchronization without log merging and aggregation, which solves the problem in the related art that synchronization involving a distributed database requires merging, sorting and aggregating logs, resulting in poor operability and, in some cases, an inability to synchronize.
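The three modules can be pictured as a small pipeline. The classes below are a hypothetical sketch only: the class names mirror the module names, but the event dictionaries, the `shard` column, and the dict-backed target are assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]
Event = Dict[str, Any]


class AcquisitionModule:
    """Acquires the events to be synchronized from the source."""

    def __init__(self, source_log: Iterable[Event]) -> None:
        self._log = list(source_log)

    def acquire(self) -> List[Event]:
        return list(self._log)


class SplittingModule:
    """Splits an update (target event) into delete + insert events."""

    def split(self, event: Event) -> List[Event]:
        if event["kind"] != "update":
            return [event]
        return [{"kind": "delete", "row": event["before"]},
                {"kind": "insert", "row": event["after"]}]


class SynchronizationModule:
    """Writes events to a dict-backed target keyed by "id"."""

    def __init__(self, target: Dict[int, Row]) -> None:
        self._target = target

    def sync(self, events: Iterable[Event]) -> None:
        for ev in events:
            row = ev["row"]
            if ev["kind"] == "delete":
                current = self._target.get(row["id"])
                # Delete only when the shard key also matches.
                if current is not None and current["shard"] == row["shard"]:
                    del self._target[row["id"]]
            else:
                # Insert if absent, update if present.
                self._target[row["id"]] = dict(row)


target: Dict[int, Row] = {1: {"id": 1, "shard": "a", "v": 1}}
acquirer = AcquisitionModule([{
    "kind": "update",
    "before": {"id": 1, "shard": "a", "v": 1},
    "after": {"id": 1, "shard": "b", "v": 1},
}])
splitter = SplittingModule()
syncer = SynchronizationModule(target)
for event in acquirer.acquire():
    syncer.sync(splitter.split(event))
```

Each module depends only on the output of the previous one, which matches the acquire, split, synchronize connection order described for modules 72, 74, and 76.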
The data synchronization apparatus for a database includes a processor and a memory. The acquisition module 72, the splitting module 74, the synchronization module 76 and so on are stored in the memory as program units, and the processor executes these program units to implement the corresponding functions.
The processor contains a kernel, which fetches the corresponding program unit from the memory. One or more kernels may be provided, and by adjusting kernel parameters the problem in the related art is solved that synchronization involving a distributed database requires merging, sorting and aggregating logs, resulting in poor operability and, in some cases, an inability to synchronize.
The memory may include non-persistent memory in computer-readable media, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
An embodiment of the present invention provides a computer-readable storage medium on which a program is stored; when the program is executed by a processor, the data synchronization method for a database is implemented.
An embodiment of the present invention provides a processor configured to run a program, where the data synchronization method for a database is executed when the program runs.
Fig. 8 is a schematic diagram of an electronic device according to an embodiment of the present application. As shown in Fig. 8, an embodiment of the present application provides an electronic device 80 that includes a processor, a memory, and a program stored in the memory and executable on the processor; when the processor executes the program, the steps of any of the above methods are implemented.
The device in the present application may be a server, a PC, a PAD, a mobile phone, or the like.
The present application further provides a computer program product which, when executed on a data synchronization device for a database, is adapted to execute a program initialized with the steps of any of the above methods.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data synchronization device for a database to produce a machine, such that the instructions executed by the processor of the computer or other programmable data synchronization device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data synchronization device for a database to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture that includes instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data synchronization device for a database, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include non-persistent memory in computer-readable media, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include persistent and non-persistent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The above are merely embodiments of the present application and are not intended to limit it. Those skilled in the art may make various modifications and changes to the present application. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of the claims of the present application.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210911970.1A CN115357661A (en) | 2022-07-29 | 2022-07-29 | Database data synchronization method and device, storage medium and processor |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN115357661A true CN115357661A (en) | 2022-11-18 |
Family
ID=84031371
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202210911970.1A Pending CN115357661A (en) | 2022-07-29 | 2022-07-29 | Database data synchronization method and device, storage medium and processor |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN115357661A (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN117112696A (en) * | 2023-08-30 | 2023-11-24 | 新华三大数据技术有限公司 | Data synchronization methods, devices, computer equipment and storage media |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106844507A (en) * | 2016-12-27 | 2017-06-13 | 星环信息科技(上海)有限公司 | A kind of method and apparatus of data batch processing |
| CN110209728A (en) * | 2019-04-22 | 2019-09-06 | 凯通科技股份有限公司 | A kind of Distributed Heterogeneous Database synchronous method, electronic equipment and storage medium |
| CN113254461A (en) * | 2021-02-07 | 2021-08-13 | 浪潮云信息技术股份公司 | Optimization method and system for realizing database synchronization based on NIFI |
| WO2021189670A1 (en) * | 2020-03-26 | 2021-09-30 | 上海依图网络科技有限公司 | Data synchronization method, data synchronization system, data synchronization apparatus, medium, and system |
| CN114490865A (en) * | 2021-12-22 | 2022-05-13 | 天翼云科技有限公司 | Database synchronization method, device, equipment and computer storage medium |
2022-07-29: CN, application CN202210911970.1A (patent/CN115357661A/en), active, Pending
Non-Patent Citations (1)
| Title |
|---|
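| Yang Dong; Xie Fei; Yang Xiaogang; He Zunwen; SUDONG YANG: "Research and Implementation of Distributed Database Technology" (分布式数据库技术的研究与实现), 电子科学技术 (Electronic Science and Technology), no. 01, 10 January 2015 (2015-01-10) * |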
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US9218405B2 (en) | | Batch processing and data synchronization in cloud-based systems |
| KR101956236B1 (en) | | Data replication technique in database management system |
| CN109906448B (en) | | Methods, devices and media for facilitating operations on pluggable databases |
| RU2767149C2 (en) | | Method and configuration for automated testing system |
| CN107644030A (en) | | Data synchronization method for distributed database, relevant apparatus and system |
| US8494888B2 (en) | | Offline modification of business data |
| CN104885077A (en) | | Managing continuous queries with archived relations |
| CN102272751B (en) | | Data integrity in a database environment through background synchronization |
| WO2021147935A1 (en) | | Log playback method and apparatus |
| US12253979B2 (en) | | Self-healing data synchronization |
| WO2022095366A1 (en) | | Redis-based data reading method and apparatus, device, and readable storage medium |
| EP3049968A1 (en) | | Master schema shared across multiple tenants with dynamic update |
| US12147556B2 (en) | | Providing data as a service using a multi-tenant system |
| CN116467275A (en) | | Shared remote storage method, device, system, electronic equipment and storage medium |
| WO2024040902A1 (en) | | Data access method, distributed database system and computing device cluster |
| CN114238494A (en) | | Data synchronization processing method and device, computer equipment and storage medium |
| CN105989049A (en) | | Data middle layer realizing method and system |
| US11789971B1 (en) | | Adding replicas to a multi-leader replica group for a data set |
| CN115357661A (en) | | Database data synchronization method and device, storage medium and processor |
| CN104881336A (en) | | Data backup method and device |
| CN108897822A (en) | | A kind of data-updating method, device, equipment and readable storage medium storing program for executing |
| CN116244384A (en) | | Data synchronization method, data synchronization device, electronic equipment and storage medium |
| CN111782619A (en) | | A method, synchronization device and storage medium for incremental synchronization of documents between servers |
| US10534756B1 (en) | | Systems and methods for cross-referencing electronic documents on distributed storage servers |
| CN117724741A (en) | | Mapper configuration file update method, device and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |