CN107544999B - Synchronization device and synchronization method for retrieval system, and retrieval system and method - Google Patents
- Publication number
- CN107544999B (application number CN201610487175.9A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a synchronization device and a synchronization method for a retrieval system, as well as a retrieval system and a retrieval method. The retrieval system comprises a synchronization node and a cluster node. The synchronization node is independent of the cluster node and comprises one or more synchronization devices, each of which comprises at least one synchronization trigger, a data grabber and a dump device. The synchronization trigger generates a synchronization trigger indication containing data identification information of the data to be synchronized when it detects that a synchronization trigger condition is met. The data grabber grabs the data corresponding to the data identification information from the corresponding data source according to the synchronization trigger indication and transmits the data to the dump device. The dump device dumps the grabbed data to the corresponding data table in the cluster node, and the cluster node stores the data table and provides the retrieval service. With the invention, the cluster node does not need to be restarted during upgrading or data transfer; actual retrieval requirements are met, retrieval efficiency is improved, and the system offers high availability and high concurrency.
Description
Technical Field
The present invention relates to a retrieval technology, and in particular, to a synchronization apparatus and a synchronization method for a retrieval system, and a retrieval system and a retrieval method.
Background
Many existing application scenarios need to provide retrieval services for users. For example, the business operations of Nuomi are organized around stores, and many applications need to retrieve store information, including multi-field query and sorting, Chinese-language retrieval of store names and brand names, and distance retrieval based on geographic coordinates.
At present, a retrieval system that provides a retrieval service can be implemented in many ways, for example with tools or components such as Elasticsearch (ES), River, and Elasticsearch-jetty.
Taking ES as an example, the ES retrieval architecture provides retrieval services through a plurality of cluster nodes. Each cluster node provides a database for storing data, and the database is composed of data tables (Types). The database is provided with an index (Index); based on the index, a cluster node can respond to a user's data retrieval request, query the corresponding data in the database and data tables, and return the data to the user. The data in a data table can be supplied by different types of data sources, including MySQL, MongoDB and RabbitMQ, and the data table acquires data by calling the data sources.
Two main stages are involved in the retrieval service: data synchronization and user data operations (which can include reading, writing, querying and the like). Existing retrieval systems have shortcomings in both stages.
For data synchronization, an ES-based system needs to provide different synchronization modes for different types of data sources, and the program that performs the synchronization is installed on the cluster nodes as a plug-in. This causes two defects: (1) a separate synchronization plug-in must be written for each type of data source; (2) to run a synchronization plug-in, the cluster nodes must stop working and can be started again only after the plug-in is installed. In summary, whenever a new function is added in plug-in form, for example during an upgrade, the cluster needs to be restarted, which makes the service unstable.
For user data operations, the cluster nodes of an ES system cannot perform permission control; that is, any user data operation instruction sent to a cluster node is executed. Moreover, the data of different users is not physically isolated in the cluster nodes, so misoperation or malicious operation can easily cause problems.
Disclosure of Invention
The invention provides a synchronization device and a synchronization method for a retrieval system, as well as a retrieval system and a retrieval method, which allow a user to flexibly configure a synchronization strategy according to different design requirements and thus obtain a retrieval system that matches those requirements.
According to an aspect of the present invention, there is provided a synchronization device applied to a retrieval system, including: a synchronization trigger, used for generating a synchronization trigger indication when it detects that a synchronization trigger condition is met, the synchronization trigger indication comprising data identification information of data to be synchronized; a data grabber, used for grabbing data corresponding to the data identification information from a corresponding data source according to the synchronization trigger indication and transmitting the grabbed data to a dump device; and the dump device, used for dumping the grabbed data to a corresponding data table in a cluster node, the cluster node being used for storing the data table and providing a retrieval service.
According to another aspect of the present invention, there is provided a retrieval system comprising a cluster node and a synchronization node, the synchronization node being independent of the cluster node and comprising one or more synchronization devices as described above, the cluster node being configured to store data tables and provide retrieval services.
According to another aspect of the present invention, there is provided a retrieval system, including a cluster node and a synchronization node, the cluster node being configured to store a data table and provide a retrieval service, the synchronization node being independent of the cluster node and configured to synchronize the data table in the cluster node, the cluster node further including: the permission control module is used for judging whether an account corresponding to the retrieval request has operation permission or not according to an account identifier, operation content, an operation object and a permission configuration table in the received user retrieval request after receiving the user retrieval request, wherein the permission configuration table stores a mapping relation among the account identifier, the operation content, the operation object and the operation permission; if the operation authority is provided, transmitting the retrieval request to a retrieval application program interface of the cluster node, and searching a matched result in a data table of the cluster node; and if the operation authority is not available, shielding the retrieval request.
According to another aspect of the present invention, there is provided a synchronization method applied to a retrieval system, including: generating a synchronization trigger indication when at least one synchronization trigger detects that a synchronization trigger condition is met, the synchronization trigger indication comprising data identification information of data to be synchronized; grabbing, by a data grabber, data corresponding to the data identification information from a data source according to the synchronization trigger indication, and transmitting the grabbed data to a dump device; and dumping, by the dump device, the grabbed data to a corresponding data table in a cluster node, the cluster node being used for storing the data table and providing a retrieval service.
According to another aspect of the present invention, there is provided a retrieval method including: after receiving a user retrieval request, searching a data table of a cluster node for a matching result, wherein the data table is synchronized by a synchronization device according to the synchronization method, and the synchronization device is independent of the cluster node.
According to another aspect of the present invention, there is provided a retrieval method including: after receiving a user retrieval request, judging whether an account corresponding to the retrieval request has an operation authority or not according to an account identifier, operation content, an operation object and an authority configuration table in the received user retrieval request, wherein the authority configuration table stores a mapping relation among the account identifier, the operation content, the operation object and the operation authority; if the operation authority is provided, transmitting the retrieval request to a retrieval application program interface of the cluster node, and searching a matched result in a data table of the cluster node; and if the operation authority is not available, shielding the retrieval request.
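The permission check in this aspect amounts to a lookup in a mapping table followed by either forwarding or blocking the request. The following Python sketch is illustrative only: the table layout, the names `PermissionTable`, `has_permission` and `handle_request`, and the sample accounts are assumptions made for the example and do not come from the disclosure.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical permission configuration table:
# (account identifier, operation content, operation object) -> allowed?
PermissionTable = Dict[Tuple[str, str, str], bool]


def has_permission(table: PermissionTable, account: str,
                   operation: str, target: str) -> bool:
    """Return True only if the table explicitly grants the operation."""
    return table.get((account, operation, target), False)


def handle_request(table: PermissionTable, account: str, operation: str,
                   target: str,
                   search_api: Callable[[str], List[str]]) -> Optional[List[str]]:
    """Forward the request to the retrieval API if permitted, else shield it."""
    if not has_permission(table, account, operation, target):
        return None  # shielded: the request never reaches the retrieval API
    return search_api(target)


# Made-up example data.
table: PermissionTable = {
    ("app_a", "read", "store_index"): True,
    ("app_a", "write", "store_index"): False,
}


def fake_api(index: str) -> List[str]:
    """Stand-in for the cluster node's retrieval API."""
    return [f"hit from {index}"]


allowed = handle_request(table, "app_a", "read", "store_index", fake_api)
blocked = handle_request(table, "app_b", "read", "store_index", fake_api)
```

In this sketch an account that is missing from the table is treated as having no permission, which is one possible default; the text does not specify how unknown accounts are handled.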
According to the invention, the synchronization node is set up independently of the cluster node, so the cluster node does not need to be restarted when the synchronization node is installed or updated, which improves service stability. Different synchronization trigger conditions are configured according to different design requirements, and different synchronization trigger indications are generated to control the data grabber and the dump device to execute different data synchronization strategies, meeting actual retrieval requirements. Under the same conditions, a single synchronization trigger synchronizes 100 records from a data source in real time in less than 1 second and synchronizes 10 million records in batch in less than 1 hour, and the response time of the retrieval system is less than 500 milliseconds.
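As a rough reading of these figures (taking the batch figure as 10 million records and assuming a steady rate, which the text does not state), the implied minimum sustained throughput per synchronization trigger is:

```latex
\[
r_{\text{real-time}} > \frac{100\ \text{records}}{1\ \text{s}} = 100\ \text{records/s},
\qquad
r_{\text{batch}} > \frac{10^{7}\ \text{records}}{3600\ \text{s}} \approx 2778\ \text{records/s}.
\]
```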
Drawings
Fig. 1 is a schematic structural diagram of a retrieval system according to a first embodiment of the present invention;
Fig. 2 is a schematic structural diagram illustrating an example of a synchronization apparatus according to a first embodiment of the present invention;
Fig. 3 is a flowchart showing an example of a synchronization method for a retrieval system according to a first embodiment of the present invention;
Fig. 4 is a schematic structural diagram of another example of a synchronization apparatus according to the first embodiment of the present invention;
Fig. 5 is a flowchart illustrating another example of a synchronization method for a retrieval system according to the first embodiment of the present invention;
Fig. 6 is a flowchart of a retrieval method according to a first embodiment of the invention;
Figs. 7A-7D are schematic diagrams illustrating an example of one implementation of a retrieval method according to an embodiment of the invention;
Fig. 8 is a diagram showing a structure of a retrieval system according to a second embodiment of the present invention;
Fig. 9 is a flowchart showing an example of a retrieval method according to the second embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It should be further noted that, for the convenience of description, only some structures related to the present invention are shown in the drawings, not all of them.
Embodiment One
Fig. 1 is a schematic structural diagram of a retrieval system 1 according to an embodiment of the present invention. The embodiment is applicable where synchronization triggers configured by a user generate synchronization trigger indications and data is synchronized to a designated cluster according to different synchronization strategies.
As shown in fig. 1, the retrieval system 1 comprises a synchronization node 110 and a cluster node 120, the synchronization node 110 being independent of the cluster node 120. The cluster node 120 stores a data table 121, and is configured to perform a search in the data table 121 and provide a search result after receiving a search request sent by a user. The synchronization node 110 includes one or more synchronization devices for synchronizing the data tables 121 in the cluster nodes 120.
The synchronization node 110 is a control terminal that is provided independently of the cluster node 120 and can exchange data with the cluster in order to synchronize data from a data source to the cluster. One or more synchronization devices may be created in the synchronization node 110 to perform data synchronization operations. While one synchronization device is being upgraded, the other synchronization devices 111 can continue to provide service, and the cluster node 120 does not need to be taken down for maintenance.
Fig. 2 is a schematic structural diagram illustrating an example of a synchronization apparatus according to a first embodiment of the present invention. As shown in fig. 2, the synchronization apparatus includes at least one synchronization trigger 111-1, 111-2, …, 111-n, a data grabber 112, and a dump device 113.
The synchronization trigger 111 is configured to generate a synchronization trigger indication when it detects that the synchronization trigger condition is satisfied, where the synchronization trigger indication includes data identification information of the data to be synchronized, and to send the generated synchronization trigger indication to the data grabber 112. In an example of the present invention, after the synchronization trigger 111 has detected information by listening, it may obtain the data identification information of the data to be synchronized based on its own configuration information and the detected information.
In an example of the present invention, the synchronization trigger 111 may include at least one of a timing trigger, a message trigger, and a log trigger. The timing trigger is a functional module that generates a synchronization trigger indication for synchronizing the data to be synchronized of a set data source to the target data table 121 when the current time reaches a set time or a set period elapses. The message trigger is a functional module that generates such a synchronization trigger indication when a set message appears among the operation messages of the data source. The log trigger is a functional module that generates such a synchronization trigger indication when the operation log of the data source shows that data of the data source has been updated.
The configuration information of the timing trigger may include a trigger period, which may be, for example, a set time or a set period, and the data identification field to be monitored, which may be unique identification information of the data to be monitored, such as a data ID. The configuration information of the message trigger may include a message cluster, a message queue name, and a message token; it is used to configure the channel through which messages are received. The configuration information of the log trigger may include the databus source and the log data to be listened to.
Generating a synchronization trigger indication when the synchronization trigger condition is detected to be met includes at least one of the following cases:
(1) When the timing trigger detects that the current time reaches the set time or that a set period has elapsed, it generates a synchronization trigger indication containing the unique identification information of the data that needs to be synchronized, based on the monitoring information indicating which data need to be synchronized;
(2) When the message trigger detects that a set message appears among the operation messages of a data source, it extracts the unique identification information of the data that needs to be synchronized from the set message and generates a synchronization trigger indication containing that unique identification information;
(3) When the log trigger acquires and parses the operation log of the data source and detects a data update of the data source, it generates a synchronization trigger indication containing the unique identification information of the data that needs to be synchronized.
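The three cases above can be sketched as three small functions that each return a trigger indication carrying only record identifiers. This is a minimal Python sketch under stated assumptions: the class `TriggerIndication`, the function names, the message layout and the log line format are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TriggerIndication:
    """Synchronization trigger indication: carries IDs of the data to sync."""
    trigger_type: str           # "timing", "message" or "log"
    data_ids: Tuple[int, ...]   # unique identification information


def timing_trigger(now: float, last_run: float, period: float,
                   pending_ids: List[int]) -> Optional[TriggerIndication]:
    """Case (1): fire when the set period has elapsed."""
    if now - last_run >= period and pending_ids:
        return TriggerIndication("timing", tuple(pending_ids))
    return None


def message_trigger(message: dict,
                    expected_type: str) -> Optional[TriggerIndication]:
    """Case (2): fire on a set message and extract the ID from it."""
    if message.get("type") == expected_type:
        return TriggerIndication("message", (message["id"],))
    return None


def log_trigger(log_line: str) -> Optional[TriggerIndication]:
    """Case (3): fire when a log entry shows a data update.

    The log format 'UPDATE <table> id=<n>' is invented for this sketch.
    """
    match = re.search(r"UPDATE\s+\w+\s+id=(\d+)", log_line)
    if match:
        return TriggerIndication("log", (int(match.group(1)),))
    return None


a = timing_trigger(now=120.0, last_run=0.0, period=60.0, pending_ids=[1, 2])
b = message_trigger({"type": "store_changed", "id": 7}, "store_changed")
c = log_trigger("UPDATE store id=42")
```

Note that in all three cases the indication contains identifiers rather than the data itself; the data is read later by the data grabber.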
After receiving the synchronization trigger indication, the data grabber 112 grabs the data corresponding to the data identification information from the corresponding data source and transmits the grabbed data to the dump device 113, for example to a dump queue of the dump device 113. Grabbing can be performed in the manner of the distributed service framework Dubbo or in SQL. The grabbing strategy can be batch grabbing or time-limited grabbing. In batch grabbing, the number of records to be grabbed in one grab operation is preset; the grabber grabs that number of records from the data source and transmits them to the dump device 113. In time-limited grabbing, the execution time of one grab operation is also preset: if the number of records grabbed so far has not reached the set number but the execution time of the current grab operation has reached the preset execution time, the records grabbed so far are transmitted to the dump device 113. When the amount of data to be synchronized is larger than a set threshold, the batch grabbing strategy is adopted, which reduces prolonged occupation of network and disk IO. When the amount of data to be synchronized is smaller than the set threshold, the time-limited grabbing strategy is adopted to meet timeliness requirements.
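The two grabbing strategies can be illustrated with a single generator that flushes a batch either when it is full or when the time allowed for one grab has elapsed. This is a simplified Python sketch, not the patented implementation: the names `choose_strategy` and `grab` are hypothetical, and the time limit is only checked when a record arrives, whereas a real grabber would also need a timer.

```python
import time
from typing import Callable, Iterable, Iterator, List


def choose_strategy(volume: int, threshold: int) -> str:
    """Batch grabbing above the threshold, time-limited grabbing otherwise.

    The text does not say what happens at exactly the threshold; this
    sketch arbitrarily treats that case as time-limited.
    """
    return "batch" if volume > threshold else "time_limited"


def grab(records: Iterable[dict], batch_size: int, time_limit: float,
         clock: Callable[[], float] = time.monotonic) -> Iterator[List[dict]]:
    """Yield batches destined for the dump device.

    A batch is flushed when it holds batch_size records, or when the
    current grab operation has run for time_limit seconds.
    """
    batch: List[dict] = []
    started = clock()
    for record in records:
        batch.append(record)
        if len(batch) >= batch_size or clock() - started >= time_limit:
            yield batch
            batch = []
            started = clock()
    if batch:
        yield batch  # flush whatever is left at the end


rows = [{"id": i} for i in range(5)]
sizes = [len(b) for b in grab(rows, batch_size=2, time_limit=1000.0)]
```

The `clock` parameter exists only so that the time-limited path can be exercised deterministically with a fake clock.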
In an example of the present invention, the correspondence between the data grabber and the data source may be obtained based on the configuration information of the data grabber. The configuration information of the data grabber may include the address information of the corresponding data source. In one example of the invention, the data grabber may include an SQL grabber and a Dubbo grabber. When the data grabber is an SQL grabber, the configuration information may include a database address. In addition, preferably, the configuration information may further include a database user name, a database password, and a monitoring statement. When the data grabber is a Dubbo grabber, the configuration information may include a remote service address. In addition, preferably, the configuration information may further include a service name, a method name, and a parameter.
The dump device 113 is used to dump the grabbed data into the corresponding data table 121 in the cluster node 120, so that the cluster node 120 can use it to provide retrieval services. In an example of the present invention, the correspondence between the dump device 113 and the data table 121 in the cluster node 120 may be obtained based on the configuration information of the dump device 113. The configuration information of the dump device 113 may include the data structure of the index data.
Furthermore, preferably, the synchronization apparatus may further include a configuration module (not shown) through which the user provides the configuration information of the synchronization trigger, the configuration information of the data grabber, and the configuration information of the dump device.
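To make the configuration items listed above concrete, the following Python sketch groups them into plain data classes. All class names, field names and sample values are assumptions for illustration; the disclosure names the kinds of information but does not define a schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class TimingTriggerConfig:
    period_seconds: int   # trigger period
    id_field: str         # data identification field to be monitored


@dataclass
class MessageTriggerConfig:
    cluster: str          # message cluster
    queue_name: str       # message queue name
    token: str            # message token


@dataclass
class LogTriggerConfig:
    databus_source: str   # databus source
    log_data: str         # log data to be listened to


@dataclass
class SqlGrabberConfig:
    database_address: str
    username: str = ""
    password: str = ""
    statement: str = ""   # monitoring statement


@dataclass
class DumpConfig:
    # Data structure of the index data: field name -> field type.
    index_structure: Dict[str, str] = field(default_factory=dict)


TriggerConfig = Union[TimingTriggerConfig, MessageTriggerConfig, LogTriggerConfig]


@dataclass
class SyncDeviceConfig:
    triggers: List[TriggerConfig]
    grabber: SqlGrabberConfig
    dump: DumpConfig


# Placeholder values only; none of these come from the disclosure.
config = SyncDeviceConfig(
    triggers=[
        TimingTriggerConfig(period_seconds=60, id_field="store_id"),
        MessageTriggerConfig(cluster="mq-cluster", queue_name="store_updates",
                             token="<token>"),
    ],
    grabber=SqlGrabberConfig(database_address="db.example.internal:3306"),
    dump=DumpConfig(index_structure={"store_name": "text"}),
)
```

A Dubbo grabber configuration (remote service address, service name, method name, parameter) could be modelled the same way.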
Fig. 3 is a flowchart illustrating an example of a synchronization method for a retrieval system according to a first embodiment of the present invention. As shown in fig. 3, in step S310, when at least one synchronization trigger detects that a synchronization trigger condition is met, the synchronization trigger generates a synchronization trigger indication, which includes data identification information of the data to be synchronized, and sends it to the data grabber. In step S320, after receiving the synchronization trigger indication, the data grabber grabs the data corresponding to the data identification information from the corresponding data source according to the indication and transmits the grabbed data to the dump device. Next, in step S330, the dump device dumps the grabbed data into the corresponding data table in the cluster node, so that the cluster node can provide the retrieval service.
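Steps S310 to S330 can be sketched as one pass through a trigger, a grabber and a dump device connected by a dump queue. In the Python sketch below, plain dictionaries stand in for the data source and for the data table in the cluster node, and the name `run_sync` is hypothetical; it illustrates the data flow only.

```python
from queue import Queue
from typing import Callable, Dict, Iterable, List

DataSource = Dict[int, dict]   # record ID -> record (stand-in for a database)
DataTable = Dict[int, dict]    # stand-in for a data table in the cluster node


def run_sync(trigger: Callable[[], Iterable[int]], source: DataSource,
             table: DataTable) -> int:
    """Run S310 -> S320 -> S330 once; return the number of records dumped."""
    # S310: the trigger yields an indication holding the IDs to synchronize.
    indication: List[int] = list(trigger())

    # S320: the grabber reads the identified records from the data source
    # and places them on the dump queue.
    dump_queue: Queue = Queue()
    for record_id in indication:
        if record_id in source:
            dump_queue.put((record_id, source[record_id]))

    # S330: the dump device drains the queue into the data table.
    dumped = 0
    while not dump_queue.empty():
        record_id, record = dump_queue.get()
        table[record_id] = dict(record)
        dumped += 1
    return dumped


source: DataSource = {1: {"name": "A"}, 2: {"name": "B"}}
table: DataTable = {}
count = run_sync(lambda: [1, 2, 3], source, table)
```

In this sketch an identifier that is absent from the data source (ID 3 here) is simply skipped; how the real system treats missing or deleted records is not described in this passage.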
Preferably, before the synchronization trigger generates the synchronization trigger indication, the method may further include: receiving the configuration information of the synchronization trigger, the configuration information of the data grabber and the configuration information of the dump device, which are provided by a user. The configuration information of the synchronization trigger, the data grabber and the dump device is as described above and is not repeated here.
After the configuration information is received, the data identification information of the data to be synchronized may be obtained based on the configuration information of the synchronization trigger and the information detected by the synchronization trigger, the correspondence between the data grabber and the data source may be obtained based on the configuration information of the data grabber, and the correspondence between the dump device and the data table in the cluster node may be obtained based on the configuration information of the dump device.
Fig. 4 is a schematic structural diagram illustrating another example of a synchronization apparatus according to a first embodiment of the present invention. As shown in fig. 4, the synchronization apparatus includes at least one synchronization trigger 111'-1, 111'-2, …, 111'-n, a data grabber 112', a dump device 113', and a scheduler 114.
The synchronization trigger 111' is configured to generate a synchronization trigger indication when it detects that the synchronization trigger condition is satisfied, the synchronization trigger indication including data identification information of the data to be synchronized, and to send the generated synchronization trigger indication to the scheduler 114. After receiving at least one synchronization trigger indication from the synchronization trigger, the scheduler 114 sends the indication to the data grabber 112' according to the scheduling policy in the scheduler, so that the data grabber obtains the corresponding data.
The data grabber 112' grabs the data corresponding to the data identification information from the corresponding data source according to the received synchronization trigger indication and transmits the grabbed data to the scheduler 114. The scheduler 114 sends the received data to the dump device 113' according to the scheduling policy in the scheduler, and the dump device 113' dumps the data into the corresponding data table.
In an example of the present invention, scheduling the data grabber to acquire the corresponding data, by the scheduler and according to the scheduling policy in the scheduler, after at least one synchronization trigger indication has been received from the synchronization trigger, may include: putting the received synchronization trigger indication or indications into a task pool as tasks; selecting, according to the scheduling policy in the scheduler, the synchronization trigger indications to be allocated to the data grabber; and allocating the selected synchronization trigger indications to the data grabber so that it grabs the corresponding data.
In one example of the present invention, the scheduling policy may include: a maximum number of assigned synchronization trigger indications, a maximum dump data amount, and a synchronization trigger indication allocation mechanism. The maximum number of assigned synchronization trigger indications is the maximum number of synchronization trigger indications assigned to the data grabber at a time, that is, the maximum number the data grabber can process at a time; generally, it corresponds to the maximum number of threads the data grabber can run concurrently at a time. The maximum dump data amount is the maximum amount of data the dump device dumps at a time; generally, it corresponds to the maximum number of threads the dump device can run concurrently at a time. Furthermore, preferably, the synchronization trigger indication allocation mechanism may include a priority allocation mechanism and/or a resource-saving allocation mechanism.
Under the priority allocation mechanism, the scheduler determines the synchronization trigger indications to be allocated according to their priority; for example, it selects indications in order of priority from high to low, up to the maximum number of assigned synchronization trigger indications. Specifically, after receiving a synchronization trigger indication, the scheduler may assign it a priority according to the type of trigger that generated it. For example, indications generated by the timing trigger are given normal priority, while indications generated by the message trigger and the log trigger are given high priority. Indications generated by the message trigger and the log trigger are ordered by trigger time.
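The task pool and the priority allocation mechanism can be sketched with a priority queue keyed by priority and trigger time. This Python sketch is one possible reading, not the patented scheduler: the class `Scheduler`, its method names, and the numeric priority values are assumptions, and for simplicity it orders all indications of equal priority by trigger time.

```python
import heapq
from typing import List, Tuple

# Lower number = higher priority. The mapping follows the example in the
# text (message and log: high priority; timing: normal priority).
PRIORITY = {"message": 0, "log": 0, "timing": 1}


class Scheduler:
    """Holds trigger indications in a task pool and hands them out in order."""

    def __init__(self, max_assign: int) -> None:
        self.max_assign = max_assign   # max indications per assignment
        self._pool: List[Tuple[int, float, int, str]] = []
        self._seq = 0                  # tie-breaker for identical keys

    def submit(self, trigger_type: str, trigger_time: float,
               data_id: str) -> None:
        """Put one synchronization trigger indication into the task pool."""
        entry = (PRIORITY[trigger_type], trigger_time, self._seq, data_id)
        heapq.heappush(self._pool, entry)
        self._seq += 1

    def next_assignment(self) -> List[str]:
        """Pop up to max_assign indications, highest priority first."""
        count = min(self.max_assign, len(self._pool))
        return [heapq.heappop(self._pool)[3] for _ in range(count)]


scheduler = Scheduler(max_assign=2)
scheduler.submit("timing", 1.0, "t1")
scheduler.submit("log", 3.0, "l1")
scheduler.submit("message", 2.0, "m1")
first = scheduler.next_assignment()
second = scheduler.next_assignment()
```

Here the timing indication was submitted first but is assigned last, because the message and log indications outrank it and fill the first assignment of two.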
Under the resource-saving allocation mechanism, single synchronization trigger indications are collected into a batch so that grabbing and dumping are performed in batches, which reduces the number of executions of the grabber and the dump device and thus the consumption of CPU, network IO and disk IO.
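A minimal Python sketch of collecting single indications into batches follows. The function name `merge_indications` and the `max_batch` parameter are hypothetical. Dropping duplicate identifiers is an added assumption of this sketch, not something the text states; it relies on the grabber reading the current data by identifier, so grabbing the same record twice in one batch would be redundant.

```python
from typing import Iterable, List


def merge_indications(single_ids: Iterable[int],
                      max_batch: int) -> List[List[int]]:
    """Collect single indications into batches of at most max_batch IDs.

    Duplicate IDs are dropped (assumption of this sketch) while the
    first-seen order is preserved.
    """
    unique = list(dict.fromkeys(single_ids))
    return [unique[i:i + max_batch] for i in range(0, len(unique), max_batch)]


batches = merge_indications([5, 3, 5, 8, 3, 9], max_batch=3)
```

Six single indications become two grab-and-dump executions in this example instead of six.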
Preferably, when the synchronization apparatus 110' includes a configuration module, the configuration module can be further used to configure a scheduling policy in the scheduler.
Fig. 5 is a flowchart illustrating another example of a synchronization method for a retrieval system according to the first embodiment of the present invention.
As shown in fig. 5, in step S510, when at least one synchronization trigger detects that a synchronization trigger condition is satisfied, the synchronization trigger generates a synchronization trigger indication containing data identification information of the data to be synchronized and sends it to a scheduler. In step S520, after receiving at least one synchronization trigger indication from the synchronization trigger, the scheduler sends the indication to the data grabber according to the scheduling policy in the scheduler, so that the data grabber obtains the corresponding data. Next, in step S530, the data grabber grabs the data corresponding to the data identification information from the corresponding data source according to the received synchronization trigger indication and transmits the grabbed data to the scheduler. Subsequently, in step S540, the scheduler sends the received data to the dump device according to the scheduling policy in the scheduler. Then, in step S550, the dump device dumps the data into the corresponding data table, so that the cluster node can use it to provide the retrieval service.
Preferably, before the synchronization trigger generates the synchronization trigger indication, the synchronization method may further include: receiving the configuration information of the synchronization trigger, the configuration information of the data grabber, the configuration information of the dump device and the scheduling policy of the scheduler, which are provided by a user. The configuration information of the synchronization trigger, the data grabber and the dump device and the scheduling policy of the scheduler are as described above and are not repeated here.
Fig. 6 shows a flowchart of a retrieval method according to a first embodiment of the present invention, which is executed by the retrieval system shown in fig. 1. As shown in fig. 6, in step S610, a retrieval request input by a user is received. Then, in step S620, a matching result is searched for in a data table of the cluster node based on the received retrieval request; the data table is synchronized, by a synchronization device independent of the cluster node, according to the synchronization method described above.
Fig. 7A to 7D are schematic diagrams showing an implementation example of the retrieval method according to the embodiment of the present invention. Fig. 7A shows an application being registered and a token being acquired on a web page; fig. 7B shows the index mapping and synchronization configuration being created under the registered application on a web page; fig. 7C shows the system automatically completing real-time synchronization of the retrieval data; fig. 7D shows the application retrieving the index data under the application through the API.
In this example, a user with a search requirement first registers in the RTS system to obtain an app and a token, as shown in fig. 7A. The rules for data synchronization and the data structure of the index are specified through the configuration module, as shown in fig. 7B. The system then automatically completes the synchronization of the retrieval data, as shown in fig. 7C. Users can search the data through the native ES API, which facilitates migration of legacy projects, as shown in fig. 7D.
In summary, the synchronization trigger, the grabber and the dump device in the synchronization node 110 are split into separate components: the synchronization trigger indicates the data to be synchronized, the grabber grabs the data and passes it to a dump queue, and the dump device takes the data from the dump queue and synchronizes it to the corresponding data table 121 of the cluster node 120. In this way, the synchronization node 110 holds no data state. Therefore, a plurality of synchronization strategies can be configured for the same batch of data (data corresponding to the same index). Because of this stateless characteristic, the synchronization strategies can be executed concurrently, and version errors in the synchronized data, which could otherwise arise when the execution order of the synchronization strategies differs from the order of the data updates, are avoided. For example, at least one synchronization trigger can be configured for the same index, and different synchronization trigger conditions can be used to generate synchronization trigger indications, so that the grabber and the dump device execute data synchronization according to those indications.
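One reading of the stateless property is that an indication carries only identifiers, and the grabber reads the current value from the data source when it runs. The sketch below contrasts this with a hypothetical design, not described in the source, in which each event carries a snapshot of the data. All function names and the sample values are assumptions for illustration.

```python
def apply_snapshot_events(events, table):
    """Hypothetical stateful design: each event carries a copy of the data."""
    for row_id, snapshot in events:
        table[row_id] = snapshot


def apply_indications(indications, source, table):
    """Stateless design: each indication carries only the row identifier,
    and the value is read from the source at execution time."""
    for row_id in indications:
        table[row_id] = source[row_id]


# The source was updated twice (v1, then v2) and now holds v2.
source = {"doc-1": "v2"}

# Snapshot events executed out of order: the v1 snapshot overwrites v2.
stale_table = {}
apply_snapshot_events([("doc-1", "v2"), ("doc-1", "v1")], stale_table)

# Two indications for the same row, e.g. from two different triggers.
# Whatever order they run in, each one copies the current source value.
fresh_table = {}
apply_indications(["doc-1", "doc-1"], source, fresh_table)

print(stale_table["doc-1"], fresh_table["doc-1"])
```

In the snapshot design the table ends with the stale value, while the identifier-only design always ends with the value currently held by the source, which is why concurrent execution of several strategies is safe under this reading.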
According to the technical solution of this embodiment, the synchronization node is set up independently of the cluster node, so the cluster node does not need to be restarted when the synchronization node is installed or updated, which improves service stability. Different synchronization trigger conditions can be configured according to different design requirements, generating different synchronization trigger indications that control the data grabber and the dump device to execute different data synchronization strategies. This meets actual retrieval requirements, improves retrieval efficiency, and provides high availability and high concurrency.
Example two
Fig. 8 is a schematic structural diagram of a retrieval system in the second embodiment of the present invention. The technical solution of this embodiment is based on the above embodiment, and further includes a configuration platform 130 and an authority control module 122.
The authority control module 122 is arranged in the cluster node 120 and is used to judge, according to the account identifier, operation content and operation object in a received user retrieval request and an authority configuration table, whether the account corresponding to the request has the operation authority, where the authority configuration table stores a mapping relationship among account identifier, operation content and operation object. If the account has the operation authority, the retrieval request is transmitted to the retrieval application program interface of the cluster node, and a matching result is searched for in the data table of the cluster node; if the account does not have the operation authority, the retrieval request is shielded. The advantage of this arrangement is that access to the data in the cluster node 120 is controlled by authority, which meets the requirements of data isolation and sharing when multiple users work cooperatively.
Preferably, the account identifier may include an application name and an account password, the operation object includes database index information and data table information, and the operation content includes an operation authority, such as a read or write authority. The authority configuration table includes mapping relationships among application names, account passwords, database index information (i.e., index information), data table information (i.e., type information), and operation authorities. For example, a type in a dedicated index is defined to store the ACL mapping rules. Each ACL mapping rule defines an application name (app), a key (key), indexes (indices), data tables (types), and permissions (permissions). When an operation request input by a user is detected, it is checked against the ACL mapping rules. If the operation request satisfies the ACL mapping rules, the operation request is executed; otherwise, the user is prompted that there is no operation authority. The authority configuration table may be stored in the authority control module or elsewhere in the cluster node.
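An ACL mapping rule with the fields listed above can be modelled as a small record. The following Python sketch is illustrative only: the `AclRule` class, the `is_allowed` function and the sample values are assumptions, and the source does not specify how rules are stored or matched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclRule:
    """One ACL mapping rule: who may do what on which index and type."""
    app: str                 # application name
    key: str                 # account key / password
    index: str               # database index
    type_: str               # data table (type) within the index
    permissions: frozenset   # e.g. {"read"} or {"read", "write"}


def is_allowed(rules, app, key, index, type_, operation):
    """Return True if any rule grants `operation` on (index, type_)
    to the account identified by (app, key)."""
    return any(
        rule.app == app
        and rule.key == key
        and rule.index == index
        and rule.type_ == type_
        and operation in rule.permissions
        for rule in rules
    )


# Hypothetical sample rules.
rules = [
    AclRule("shop-app", "k1", "orders", "order", frozenset({"read", "write"})),
    AclRule("shop-app", "k1", "users", "profile", frozenset({"read"})),
]

can_read = is_allowed(rules, "shop-app", "k1", "users", "profile", "read")
can_write = is_allowed(rules, "shop-app", "k1", "users", "profile", "write")
```

Here `can_read` is `True` and `can_write` is `False`, because the second rule grants only read access to the `users`/`profile` table.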
In this way, a user's account can be given read-write authority for the indexes or data tables that the user created. For an index or data table not created by the user, the account has no operation authority. If the user wants to access an index or data table for which the account has no operation authority, the user needs to apply to the administrator, who assigns read authority to the user's account. The advantage of this arrangement is that authority can be controlled at the index level and the type level, so that each user's read-write authority for each index and type can be controlled.
The configuration platform 130 provides a WEB-based human-computer interaction configuration interface, through which a user inputs configuration information to configure the authority configuration table. The configuration information includes authority configuration parameters (a user account, a key, and the operable indexes), data table structure parameters (a data table name and a data table type), data grabbing configuration parameters (a data source, a network address url, an account, a key, and a grabbing mode), and data synchronization configuration parameters (a data source, a network address url, an account, a key, a trigger mode, and a trigger frequency). The configuration platform 130 implements authority-based access control and data synchronization configuration, so that users do not need to develop code themselves, which improves research and development efficiency.
According to the technical solution of this embodiment, the authority control module 122 performs authority-based access control on the data in the cluster node 120, which meets the requirements of data isolation and sharing during multi-user cooperative work. The configuration platform 130 implements authority-based access control and data synchronization configuration, so that users do not need to develop code themselves, which improves research and development efficiency.
Furthermore, it is noted here that preferably, the configuration module in the synchronization apparatus may also be provided in the configuration platform 130.
Fig. 9 is a flowchart showing an example of a retrieval method according to the second embodiment of the present invention.
As shown in fig. 9, in step S910, the authority control module in the cluster node receives a retrieval request of a user through an authority control plug-in running in the cluster node, where the retrieval request includes an account identifier, operation content, and an operation object. In one example, the account identifier may include an application name and an account password, the operation object includes database index information and data table information, and the operation content includes an operation authority.
After the retrieval request of the user is received, in step S920 the account identifier, the operation content and the operation object information are extracted from the retrieval request. Then, in step S930, the authority configuration table in the authority control module is queried based on the extracted account identifier, operation content and operation object information to determine whether the account has the operation authority, where the authority configuration table stores the mapping relationship among account identifier, operation content, operation object and operation authority. For example, the authority configuration table includes mapping relationships among application names, account passwords, database index information, data table information, and operation authorities.
For example, a type in a dedicated index is defined to store the ACL mapping rules, i.e., the authority configuration table. Each ACL mapping rule defines an application name (app), a key (key), indexes (indices), data tables (types), and permissions (permissions).
After intercepting an operation instruction of a user, the authority control plug-in extracts the account identifier from the operation instruction, queries the authority configuration table according to the account identifier, and determines the indexes or data tables that the user can operate on (write and/or read). It then extracts the operation object from the operation instruction and matches it against those indexes or data tables. Finally, it extracts the operation content from the operation instruction and compares it with the user's operation authority for the successfully matched index or data table, thereby judging whether the user has the operation authority.
If the account has the operation authority, then in step S940 the retrieval request is transmitted to the retrieval application program interface of the cluster node, and a matching result is searched for in the data table of the cluster node.
If the account does not have the operation authority, then in step S950 the retrieval request is shielded.
An operation instruction input by the user is invalid when the user has no operation authority on the data table named in the operation instruction, or when the user's operation authority on that data table does not match the operation content of the instruction. For example, the user has only read authority on the data table, but the operation content of the input instruction is a write operation on that data table.
When the operation instruction input by the user is invalid, the operation instruction is shielded and the user is prompted that there is no operation authority.
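Steps S920 to S950, including the two kinds of invalid instruction just described, can be sketched as a single request handler. This is a hedged illustration rather than the plug-in's actual code: the request field names, the nested-dictionary layout of `PERMISSION_TABLE`, the `handle_request` function and the `fake_search_api` stand-in are all assumptions.

```python
# Assumed layout: (app, key) -> {(index, type): allowed operations}
PERMISSION_TABLE = {
    ("shop-app", "k1"): {
        ("orders", "order"): {"read", "write"},
        ("users", "profile"): {"read"},
    },
}


def handle_request(request, search_api, table=PERMISSION_TABLE):
    """Check a request against the permission table, then forward or shield it."""
    # S920: extract account identifier, operation object and operation content.
    account = (request["app"], request["key"])
    target = (request["index"], request["type"])
    operation = request["operation"]

    # S930: find what the account may operate on, then match the target.
    operable = table.get(account, {})
    allowed_ops = operable.get(target)
    if allowed_ops is None:
        # S950, invalid case 1: no authority at all on this data table.
        return {"shielded": True, "reason": "no operation authority on data table"}
    if operation not in allowed_ops:
        # S950, invalid case 2: authority exists but does not cover the operation.
        return {"shielded": True, "reason": "operation not permitted on data table"}

    # S940: forward the request to the retrieval API.
    return {"shielded": False, "result": search_api(request)}


def fake_search_api(request):
    """Stand-in for the cluster node's retrieval API."""
    return f"results from {request['index']}/{request['type']}"


ok = handle_request(
    {"app": "shop-app", "key": "k1", "index": "users",
     "type": "profile", "operation": "read"},
    fake_search_api,
)
denied = handle_request(
    {"app": "shop-app", "key": "k1", "index": "users",
     "type": "profile", "operation": "write"},
    fake_search_api,
)
```

The `denied` response corresponds to the example in the text: the account has only read authority on the table, so a write instruction is shielded without reaching the retrieval API.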
According to the technical solution of this embodiment, an authority control plug-in runs in the cluster node to control account authority at the index level or the data table level. The data of different users is stored in physical isolation, users cannot operate on data for which they have no operation authority, and misoperation and malicious operation are avoided.
It will be appreciated that the invention also extends to computer programs, particularly computer programs on or in a carrier, adapted for putting the invention into practice. The program may be in the form of source code, object code, or code intermediate between source and object code, such as a partially compiled form, or in any other form suitable for use in the implementation of the method according to the invention. It will also be noted that such programs may have many different architectural designs. For example, program code implementing the functionality of a method or system according to the invention may be subdivided into one or more subroutines.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in some detail by the above embodiments, the invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the invention, and the scope of the invention is determined by the scope of the appended claims.
Claims (25)
1. A synchronization apparatus for use in a search system, comprising:
the synchronous trigger is used for generating a synchronous trigger instruction when the synchronous trigger condition is monitored to be met, and the synchronous trigger instruction comprises data identification information of data to be synchronized;
the data grabber is used for grabbing data corresponding to the data identification information from a corresponding data source according to the synchronous trigger instruction and transmitting the grabbed data to the dump device;
the dump device is used for dumping the captured data to a corresponding data table in a cluster node, and the cluster node is used for storing the data table and providing retrieval service;
the configuration module is used for providing configuration information of the synchronous trigger, configuration information of the data grabber and configuration information of the dump memory for a user; the corresponding relation between the data grabber and the data source and the corresponding relation between the data dump and the data table in the cluster node are respectively obtained based on the configuration information of the data grabber and the configuration information of the data dump;
wherein the retrieval system comprises a synchronization node and the cluster node; the synchronization node is used for creating one or more synchronization devices to execute data synchronization operation;
the synchronous node is a control terminal which is independent of the cluster node, and the control terminal is used for carrying out data transmission with the cluster and synchronizing data from the data source to the cluster; the cluster is composed of at least two cluster nodes, and the cluster nodes are server clusters for storing data tables.
2. The synchronization apparatus according to claim 1, further comprising:
the data identification information of the data to be synchronized is obtained based on the configuration information of the synchronization trigger and the information monitored by the synchronization trigger.
3. The synchronization apparatus of claim 1 or 2, wherein the synchronization trigger comprises at least one of a timing trigger, a message trigger, and a log trigger;
and when the synchronization triggering condition is monitored to be met, generating a synchronization triggering indication, wherein the synchronization triggering indication comprises at least one of the following conditions:
when the current time is monitored to meet the set moment through the timing trigger or a set period is reached, the synchronous trigger indication is generated;
generating the synchronous trigger indication when monitoring that a set message is generated in an operation message of a data source through the message trigger;
and when the operation log of the data source is acquired and analyzed through the log trigger and the data update of the data source is monitored, generating the synchronous trigger indication.
4. The synchronization apparatus according to claim 2, further comprising:
and the scheduler is used for scheduling the data grabber to acquire corresponding data and/or scheduling the dump memory to store the grabbed data into a corresponding data table according to a scheduling strategy in the scheduler after receiving at least one synchronous trigger instruction sent by the synchronous trigger and/or the data grabbed by the data grabber.
5. The synchronization apparatus according to claim 4, wherein the scheduler is specifically configured to:
after receiving at least one synchronous trigger instruction sent by the synchronous trigger, putting the received at least one synchronous trigger instruction into a task pool as a task;
acquiring a synchronous trigger instruction to be allocated to the data grabber according to a scheduling strategy in a scheduler;
and distributing the acquired synchronous trigger indication to be distributed to the data grabber to grab corresponding data.
6. The synchronization apparatus of claim 5, wherein the scheduling policy comprises: maximum allocation synchronization trigger indication number, maximum dump data number, and synchronization trigger indication allocation mechanism.
7. The synchronization apparatus of claim 6, wherein the synchronization trigger indication allocation mechanism comprises:
a priority assignment mechanism; and/or
a resource saving allocation mechanism.
8. The synchronization apparatus of claim 4, wherein the configuration module is further configured to:
and configuring a scheduling strategy in the scheduler.
9. A retrieval system, characterized in that the retrieval system comprises a synchronization node comprising one or more synchronization devices according to any of claims 1 to 8.
10. The retrieval system of claim 9, further comprising:
the authority control module is configured in the cluster node and used for judging whether an account corresponding to the retrieval request has an operation authority or not according to an account identifier, operation content, an operation object and an authority configuration table in the received user retrieval request, wherein the authority configuration table stores a mapping relation among the account identifier, the operation content and the operation object;
if the operation authority is provided, transmitting the retrieval request to a retrieval application program interface of the cluster node, and searching a matched result in a data table of the cluster node; and if the operation authority is not available, shielding the retrieval request.
11. The retrieval system of claim 10, wherein the account identifier comprises an application name and an account password, the operation object comprises database index information and data table information, the operation content comprises an operation right, and the right configuration table comprises a mapping relationship of the application name, the account password, the database index information, the data table information and the operation right.
12. The retrieval system of claim 10 or 11, further comprising:
and the configuration platform is used for providing a human-computer interaction configuration interface based on a WEB mode, and a user inputs configuration information to configure the permission configuration table.
13. Retrieval system according to any of claims 9 to 11, characterised in that the cluster node is an ElasticSearch cluster node.
14. A retrieval system, characterized in that the retrieval system comprises a synchronization node and a cluster node, said synchronization node being adapted to synchronize data tables in said cluster node, said synchronization node comprising one or more synchronization devices according to any of claims 1 to 8; the cluster node further comprises: the permission control module is used for judging whether an account corresponding to the retrieval request has operation permission or not according to an account identifier, operation content, an operation object and a permission configuration table in the received user retrieval request after receiving the user retrieval request, wherein the permission configuration table stores a mapping relation among the account identifier, the operation content, the operation object and the operation permission;
if the operation authority is provided, transmitting the retrieval request to a retrieval application program interface of the cluster node, and searching a matched result in a data table of the cluster node; and if the operation authority is not available, shielding the retrieval request.
15. A synchronization method for use in a search system, comprising:
when the condition that synchronous triggering is met is monitored through at least one synchronous trigger, generating synchronous triggering indication, wherein the synchronous triggering indication comprises data identification information of data to be synchronized;
capturing data corresponding to the data identification information from a corresponding data source through a data capturing device according to the synchronous trigger indication, and transmitting the captured data to a dump device;
dumping the captured data to a corresponding data table in a cluster node through the dump memory, wherein the cluster node is used for storing the data table and providing retrieval service;
wherein the retrieval system comprises a synchronization node and the cluster node; the synchronization node is used for creating one or more synchronization devices to execute data synchronization operation;
the synchronous node is a control terminal which is independent of the cluster node, and the control terminal is used for carrying out data transmission with the cluster and synchronizing data from the data source to the cluster; the cluster is composed of at least two cluster nodes, and the cluster nodes are server clusters for storing data tables;
wherein the method further comprises:
receiving configuration information of the synchronous trigger, configuration information of the data grabber and configuration information of the dump device, which are provided by a user;
and acquiring the corresponding relation between the data grabber and the data source and the corresponding relation between the dump device and the data table in the cluster node respectively based on the configuration information of the synchronous trigger, the configuration information of the data grabber and the configuration information of the dump device.
16. The synchronization method according to claim 15, characterized in that the method further comprises:
the data identification information of the data to be synchronized is obtained based on the configuration information of the synchronization trigger and the information monitored by the synchronization trigger.
17. The synchronization method according to claim 15 or 16, wherein the synchronization trigger comprises at least one of a timing trigger, a message trigger, and a log trigger;
and generating a synchronization trigger indication when the synchronization trigger condition is monitored to be met by at least one synchronization trigger, wherein the synchronization trigger indication comprises at least one of the following conditions:
when the current time is monitored to meet the set time through the timing trigger or a set period is reached, the synchronous trigger indication is generated;
when monitoring that a set message is generated in an operation message of a data source through the message trigger, generating the synchronous trigger indication;
and when the operation log of the data source is acquired and analyzed through the log trigger and the data update of the data source is monitored, generating the synchronous trigger indication.
18. The synchronization method according to claim 15, further comprising:
after receiving at least one synchronous trigger instruction sent by the synchronous trigger and/or data grabbed by the data grabber, respectively scheduling the data grabber by the scheduler to acquire corresponding data and/or scheduling the dump memory to store the grabbed data into a corresponding data table according to a scheduling strategy in the scheduler.
19. The synchronization method according to claim 18, wherein after receiving at least one synchronization trigger indication sent by the synchronization trigger, scheduling the data fetcher by the scheduler to obtain corresponding data according to a scheduling policy in the scheduler comprises:
after receiving at least one synchronous trigger instruction sent by the synchronous trigger, putting the received at least one synchronous trigger instruction into a task pool as a task;
acquiring a synchronous trigger instruction to be distributed to the data grabber according to a scheduling strategy in a scheduler;
and allocating the acquired synchronous trigger instruction to be allocated to the data grabber to grab the corresponding data.
20. The synchronization method of claim 18, wherein the scheduling policy comprises: maximum allocation synchronization trigger indication number, maximum dump data number, and synchronization trigger indication allocation mechanism.
21. The synchronization method of claim 20, wherein the synchronization trigger indication allocation mechanism comprises:
a priority assignment mechanism; and/or
a resource saving allocation mechanism.
22. The synchronization method of claim 18, wherein the scheduling policy is configured based on user-provided configuration information.
23. A method of searching, the method comprising:
after receiving the user retrieval request, searching the data table of the cluster node for a matched result,
wherein the data table is synchronized with a synchronization method according to any of claims 15 to 22 by a synchronization means, which is independent of the cluster nodes.
24. The retrieval method of claim 23, wherein the method further comprises:
after receiving a user retrieval request, judging whether an account corresponding to the retrieval request has an operation authority or not according to an account identifier, operation content, an operation object and an authority configuration table in the received user retrieval request, wherein the authority configuration table stores a mapping relation among the account identifier, the operation content, the operation object and the operation authority;
if the operation authority is provided, transmitting the retrieval request to a retrieval application program interface of the cluster node, and searching a matched result in a data table of the cluster node; and if the operation authority is not available, shielding the retrieval request.
25. A method of searching, the method comprising:
after receiving a user retrieval request, judging whether an account corresponding to the retrieval request has an operation authority or not according to an account identifier, operation content, an operation object and an authority configuration table in the received user retrieval request, wherein the authority configuration table stores a mapping relation among the account identifier, the operation content, the operation object and the operation authority;
if the operation authority is provided, transmitting the retrieval request to a retrieval application program interface of the cluster node, and searching a matching result in a data table of the cluster node, wherein the data table is synchronized by a synchronization device according to the synchronization method of any one of claims 15 to 22, and the synchronization device is independent of the cluster node;
and if the operation authority is not available, shielding the retrieval request.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201610487175.9A CN107544999B (en) | 2016-06-28 | 2016-06-28 | Synchronization device and synchronization method for retrieval system, and retrieval system and method |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN107544999A (en) | 2018-01-05 |
| CN107544999B (en) | 2022-10-21 |
Family
ID=60962461
Family Applications (1)
| Application Number | Status | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|---|
| CN201610487175.9A | Active | CN107544999B (en) | 2016-06-28 | 2016-06-28 | Synchronization device and synchronization method for retrieval system, and retrieval system and method |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN107544999B (en) |
Family Cites Families (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| AU2009308480B2 (en) * | 2008-10-21 | 2015-12-17 | Google Inc. | Search based specification for data synchronization |
| WO2011100365A1 (en) * | 2010-02-09 | 2011-08-18 | Google Inc. | Method and system for dynamically replicating data within a distributed storage system |
| CN102567378A (en) * | 2010-12-28 | 2012-07-11 | 上海杉达学院 | Information retrieval system based on heterogeneous data |
| CN103136231B (en) * | 2011-11-25 | 2016-03-02 | 中国移动通信集团江苏有限公司 | Method of data synchronization between a kind of heterogeneous database and system |
| CN103530568B (en) * | 2012-07-02 | 2016-01-20 | 阿里巴巴集团控股有限公司 | Authority control method, Apparatus and system |
| CN104572672B (en) * | 2013-10-15 | 2018-10-02 | 北大方正集团有限公司 | The synchronous method and application system of heterogeneous database |
| CN104679896A (en) * | 2015-03-18 | 2015-06-03 | 成都金本华科技股份有限公司 | Intelligent retrieval method under big data environment |
| CN105262831B (en) * | 2015-10-30 | 2019-02-22 | 北京奇艺世纪科技有限公司 | Method, apparatus and synchronization system for synchronizing data between storage systems |
| CN105227683B (en) * | 2015-11-11 | 2018-10-19 | 中国建设银行股份有限公司 | LDAP company-data synchronization method and system |
- 2016-06-28 CN application CN201610487175.9A, patent CN107544999B (en), status: Active
Also Published As
| Publication number | Publication date |
|---|---|
| CN107544999A (en) | 2018-01-05 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN107544999B (en) | | Synchronization device and synchronization method for retrieval system, and retrieval system and method |
| US10698712B2 (en) | | Methods and apparatus to manage virtual machines |
| EP3069495B1 (en) | | Client-configurable security options for data streams |
| CN105933137A (en) | | Resource management method, device and system |
| US10310900B2 (en) | | Operating programs on a computer cluster |
| JP2016536690A (en) | | Partition-based data stream processing framework |
| JP5686034B2 (en) | | Cluster system, synchronization control method, server device, and synchronization control program |
| US12050519B2 (en) | | Messaging system failover |
| US11042409B2 (en) | | Leader election with lifetime term |
| CN116450355A (en) | | Multi-cluster model training method, device, equipment and medium |
| US9514176B2 (en) | | Database update notification method |
| CN111240892A (en) | | Data backup method and device |
| CN113672240A (en) | | Container-based multi-machine-room batch automatic deployment application method and system |
| CN111258726A (en) | | Task scheduling method and device |
| CN113157411B (en) | | Celery-based reliable configurable task system and device |
| JP2016091555A (en) | | Data staging management system |
| US10579419B2 (en) | | Data analysis in storage system |
| CN105373563A (en) | | Database switching method and apparatus |
| CN115454606A (en) | | Task scheduling method, device, medium and related equipment under remote multi-activity architecture |
| US20080127301A1 (en) | | Delivering Callbacks Into Secure Application Areas |
| US20200110739A1 (en) | | Transaction processing method, apparatus, and device |
| CN111741097B (en) | | Method for tenant to monopolize node, computer equipment and storage medium |
| CN113448775B (en) | | Multi-source heterogeneous data backup method and device |
| CN115481156A (en) | | Data processing method, device, equipment and medium |
| US12346735B2 (en) | | Workload execution on backend systems |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |