
CN111783131A - Data protection method, computer device and computer-readable storage medium - Google Patents

Info

Publication number
CN111783131A
Authority
CN
China
Prior art keywords
data
query
interference
identification
generating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910272613.3A
Other languages
Chinese (zh)
Inventor
邓铮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201910272613.3A
Publication of CN111783131A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/62: Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218: Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a data protection method, the method comprising: receiving a data query instruction, acquiring target data corresponding to the data query instruction, and acquiring identification data corresponding to the target data; determining at least one query condition in the data query instruction; generating interference data according to the identification data and the at least one query condition; and adding the interference data into the target data, and returning the target data added with the interference data. The present disclosure also provides a computer device and a computer-readable storage medium.

Description

Data protection method, computer device and computer-readable storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data protection method, a computer device, and a computer-readable storage medium.
Background
In an era of rapidly developing network and big data technology, data brings great value but also makes privacy protection difficult. How to protect private data and prevent sensitive information from leaking over highly connected networks has become a new challenge.
At present, conventional privacy protection schemes usually use differential privacy to perturb a user's private data so that it is not revealed. However, such schemes are overly simple and random, so the protection they provide has poor stability, low accuracy and poor security.
Disclosure of Invention
The purpose of the present disclosure is to provide a data protection method, a computer device, and a computer-readable storage medium, which are used to solve the defects of poor stability, low accuracy, and poor security of the privacy protection scheme in the prior art.
One aspect of the present disclosure provides a data protection method, including: receiving a data query instruction, acquiring target data corresponding to the data query instruction, and acquiring identification data corresponding to the target data; determining at least one query condition in the data query instruction; generating interference data according to the identification data and the at least one query condition; and adding the interference data to the target data, and returning the target data added with the interference data.
According to an embodiment of the present disclosure, the generating interference data according to the identification data and the at least one query condition includes: extracting query elements included in each query condition in the at least one query condition; recombining the query elements included in each query condition in the at least one query condition to obtain at least one combined element; and generating the interference data according to the identification data and the at least one combined element.
According to an embodiment of the present disclosure, the above recombining query elements included in each query condition of the at least one query condition to obtain at least one combined element includes: dividing query elements included in each query condition in the at least one query condition into constant categories and/or operator categories; and combining the query elements included in each query condition in the at least one query condition according to the divided categories to obtain at least one combined element.
According to an embodiment of the present disclosure, the generating the interference data according to the identification data and the at least one combination element includes: sorting the identification data, and generating an identification character string according to the sorted identification data; converting the identification character string into the identification numerical value; and generating the interference data according to the identification value and the at least one combination element.
According to an embodiment of the present disclosure, the generating the interference data according to the identification value and the at least one combination element includes: generating a conditional character string according to the identification numerical value and each combination element in the at least one combination element to obtain at least one conditional character string; converting each condition character string in the at least one condition character string into an input numerical value to obtain the at least one input numerical value; generating the interference data according to the at least one input value.
According to an embodiment of the present disclosure, the generating the interference data according to the at least one input numerical value includes: inputting each input numerical value in the at least one input numerical value into a random number generator to obtain at least one interference coefficient; generating interference sub-data according to each interference coefficient in the at least one interference coefficient to obtain at least one piece of interference sub-data, wherein the interference sub-data has a mean of 0 and a standard deviation given by a formula in N, where N is the number of the identification data (the formula appears only as an image in the original publication, Figure BDA0002018901000000021, and is not reproduced here); and generating the interference data according to the at least one piece of interference sub-data.
According to an embodiment of the present disclosure, the receiving a data query instruction, acquiring target data corresponding to the data query instruction, and acquiring identification data corresponding to the target data includes: receiving the data query instruction, and judging whether the data query instruction has a function of querying the identification data; and if not, adding the function of querying the identification data to the data query instruction, acquiring target data corresponding to the data query instruction to which the function has been added, and acquiring the identification data corresponding to the target data.
According to an embodiment of the present disclosure, the judging whether the data query instruction has a function of querying the identification data includes: identifying the constituent elements of the data query instruction; and judging whether the constituent elements contain an element corresponding to the function of querying the identification data.
Yet another aspect of the disclosure provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor when executing the computer program being adapted to implement the steps of the method as claimed in any of the above.
Yet another aspect of the disclosure provides a computer readable storage medium having stored thereon a computer program for implementing the steps of the method as claimed in any of the above when executed by a processor.
According to the data protection method of the present disclosure, after a data query instruction is received, the identification data corresponding to the target data is obtained in addition to the target data itself. At least one query condition in the data query instruction is determined, interference data is generated according to the identification data and the at least one query condition, the interference data is added to the queried target data, and the target data with the interference data added is returned, thereby protecting the target data. Because the identification data is unique, the same identification data and the same query conditions always yield the same interference data, so the interference data and the target data protected by it do not change between identical queries. The embodiments of the present disclosure can therefore ensure the stability, accuracy and security of query results and avoid the poor stability and low accuracy of prior-art privacy protection schemes.
Drawings
FIG. 1 schematically illustrates a flow diagram of a method of data protection according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow diagram of a method of data protection according to another embodiment of the present disclosure;
FIG. 3 schematically illustrates a schematic diagram of a split attack according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a schematic diagram of a differential attack according to an embodiment of the disclosure;
FIG. 5 schematically illustrates a block diagram of a data protection system according to an embodiment of the present disclosure; and
FIG. 6 schematically shows a hardware architecture diagram of a computer device adapted to implement the data protection method according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the present disclosure more clearly understood, the present disclosure is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the disclosure and are not intended to limit the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
Before introducing the data protection scheme of the present disclosure, an application scenario of an embodiment is described. The scenario is only an example and is not limiting. When a querying party queries target data at the privacy level, the server may generate interference data from the identification data that uniquely corresponds to the target data and from the query conditions in the data query instruction, encrypt the queried target data with the interference data, and return the encrypted target data to the querying party. For example, suppose a querying party wants to know the distribution trend of user salaries (the target data) in an enterprise. The querying party sends a data query instruction to the server from a client such as a desktop computer. After receiving the instruction, the server queries the corresponding user salaries and user identifiers (the identification data), generates interference data according to the user identifiers and the query conditions in the instruction, encrypts the user salaries with the interference data, and returns the encrypted user salaries to the client.
In the prior art, private data is usually protected by a method based on differential privacy. Specifically, random noise following a Laplace distribution is added when a query result is returned, and the intensity of the noise is controlled by configuring the sensitivity Δf and the differential privacy budget. However, this prior-art solution has the following drawbacks: (1) Poor result stability: because different noise is added on each query, the same query statement does not necessarily return the same result each time. (2) Poor logical continuity: across several queries whose results should stand in a known size relation, the returned results may contradict the actual situation. For example, when the numbers of people in a class older than 25 and older than 30 are queried separately, the former result may come out smaller than the latter because different noise is added to each. (3) Poor data security: differential privacy introduces a privacy budget that each query consumes, but by issuing the same query a large number of times and averaging the results, the noise can be cancelled and the real data recovered.
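The averaging weakness described above can be illustrated with a small simulation. The sketch below is not part of the disclosure: the true value, noise scale and repetition count are made-up figures, and the helper names are hypothetical. It draws Laplace noise as the difference of two exponential samples and averages many noisy answers to the same query.

```python
import random


def laplace_noise(rng: random.Random, scale: float) -> float:
    # The difference of two independent exponential samples with mean
    # `scale` follows a Laplace distribution centred on 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def averaged_answer(true_value: float, scale: float, repetitions: int,
                    seed: int = 0) -> float:
    # Simulate asking the same query `repetitions` times, each time with
    # fresh noise, and average the noisy answers.
    rng = random.Random(seed)
    total = sum(true_value + laplace_noise(rng, scale)
                for _ in range(repetitions))
    return total / repetitions


# Hypothetical figures: a true salary total of 100.0 and a noise scale of 5.0.
single = averaged_answer(100.0, 5.0, repetitions=1)
many = averaged_answer(100.0, 5.0, repetitions=100_000)
print(f"one query: {single:.3f}, average of 100000 queries: {many:.3f}")
```

With fresh noise on every query the average drifts toward the true value as the repetition count grows, which is the behaviour that deterministic, query-dependent noise is meant to prevent.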
The scheme of the present disclosure can overcome these prior-art defects of poor result stability, poor logical continuity and poor data security.
Fig. 1 schematically shows a flow chart of a data protection method according to an embodiment of the present disclosure.
As shown in fig. 1, the data protection method may include steps S101 to S104, wherein:
Step S101, receiving a data query instruction, acquiring target data corresponding to the data query instruction, and acquiring identification data corresponding to the target data.
The target data may include private data that needs to be encrypted, such as salaries of users, confidential data inside enterprises, parameters of devices, and so on. The identification data may uniquely correspond to the target data, such as a user job number, a user identification number, a user phone number, a profile number, a device number, and the like.
Receiving the data query instruction, acquiring the target data, and acquiring the identification data may include: receiving the data query instruction and judging whether it has a function of querying the identification data; and if not, adding that function to the data query instruction, acquiring the target data corresponding to the data query instruction with the function added, and acquiring the identification data corresponding to the target data. Judging whether the data query instruction has a function of querying the identification data may include: identifying the constituent elements of the data query instruction; and judging whether the constituent elements contain an element corresponding to the function of querying the identification data.
The data query instruction may include constituent elements such as a query target, a query table, query conditions, sorting conditions, grouping conditions, and so forth. Because the embodiment of the present disclosure generates the interference data from the identification data, it first needs to determine whether the data query instruction has a function of querying the identification data. For example, the constituent elements of the data query instruction may be identified, and it may then be determined whether they include an element corresponding to the function of querying the identification data. If so, the data query instruction may be responded to directly; if not, the function of querying the identification data may be added to the data query instruction automatically to form a new data query instruction.
For example, suppose the data query instruction is select salary from table_t where class = 'A' and name != 'tom'. Performing semantic analysis on the data query instruction, that is, identifying all of its constituent elements, shows that the query target is salary, the query table is table_t, and the query conditions are class = 'A' and name != 'tom'. The identified constituent elements show that the data query instruction has no function of querying the identification data. The function of querying the identification data, uid, can therefore be added, and the rewritten data query instruction is select uid, salary from table_t where class = 'A' and name != 'tom'.
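The rewriting step can be sketched in a few lines. This is only an illustration under strong assumptions: it handles a single flat SELECT ... FROM ... statement with a regular expression rather than real semantic analysis, the function name ensure_id_column is hypothetical, and the table name table_t is assumed from the example.

```python
import re


def ensure_id_column(sql: str, id_column: str = "uid") -> str:
    """Return `sql` with `id_column` added to the SELECT list if absent."""
    match = re.match(r"\s*select\s+(.*?)\s+from\s+", sql,
                     flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("only simple SELECT ... FROM ... statements are handled")
    columns = [column.strip().lower() for column in match.group(1).split(",")]
    if id_column.lower() in columns:
        return sql  # the statement already queries the identification data
    start = match.start(1)  # position where the SELECT list begins
    return sql[:start] + id_column + ", " + sql[start:]


original = "select salary from table_t where class = 'A' and name != 'tom'"
rewritten = ensure_id_column(original)
print(rewritten)
```

A production system would use a proper SQL parser, since a regular expression cannot cope with subqueries, aliases or quoted identifiers.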
Step S102, determining at least one query condition in the data query instruction;
Semantic analysis may be performed on the data query instruction to determine at least one query condition in it; preferably, all query conditions in the data query instruction are determined.
For example, if the data query instruction is select uid, salary from table_t where class = 'A' and name != 'tom', then the query conditions are class = 'A' and name != 'tom'.
Step S103, generating the interference data according to the identification data and the at least one query condition.
Wherein generating the interference data according to the identification data and the at least one query condition may include: extracting query elements included in each query condition in the at least one query condition; recombining the query elements included in each query condition in the at least one query condition to obtain at least one combined element; generating the interference data according to the identification data and the at least one combined element. The re-combining the query elements included in each query condition of the at least one query condition to obtain at least one combined element may include: dividing query elements included in each query condition in the at least one query condition into a constant category and/or an operator category; and combining the query elements included in each query condition in the at least one query condition according to the divided categories to obtain at least one combined element.
For any query condition, the query condition may include a plurality of query elements, all query elements in the query condition are extracted, and then all query elements in the query condition are recombined to obtain a combined element. The rule of the recombination may be to classify all query elements in the query condition, for example, the query elements may be classified into a constant category and an operator category, and then combine all query elements in the query condition according to the classified categories.
For example, for the query condition class = 'A', the query elements are class, = and A. These query elements are divided into the constant category (class, A) and the operator category (=). The query elements are then combined according to the divided categories; for example, placing the constant category before the operator category yields the combined element classA=. Similarly, for the query condition name != 'tom', the corresponding combined element is nametom!=.
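The classification and recombination can be sketched as follows, assuming each condition has the simple form "column operator value". The operator list and the function names are illustrative choices, not taken from the disclosure.

```python
import re

# Longer operators come first so that "!=" is not read as "=".
OPERATORS = ("!=", "<=", ">=", "=", "<", ">")


def split_condition(condition: str):
    """Split 'column op value' into (constant elements, operator elements)."""
    pattern = "|".join(re.escape(op) for op in OPERATORS)
    match = re.search(pattern, condition)
    if match is None:
        raise ValueError(f"no operator found in {condition!r}")
    left = condition[:match.start()].strip()
    # Drop the quotes around string literals such as 'A'.
    right = condition[match.end():].strip().strip("'\"")
    return [left, right], [match.group(0)]


def combine(condition: str) -> str:
    """Place the constant category before the operator category."""
    constants, operators = split_condition(condition)
    return "".join(constants) + "".join(operators)


print(combine("class = 'A'"))
print(combine("name != 'tom'"))
```

Because the combined element depends only on the elements and their categories, conditions that differ only in whitespace produce the same combined element.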
According to an embodiment of the present disclosure, the generating the interference data according to the identification data and the at least one combination element may include: sorting the identification data, and generating an identification character string according to the sorted identification data; converting the identification character string into the identification numerical value; and generating the interference data according to the identification value and the at least one combined element. Wherein generating the interference data according to the identification value and the at least one combination element may include: generating a conditional character string according to the identification numerical value and each combined element in the at least one combined element to obtain at least one conditional character string; converting each condition character string in the at least one condition character string into an input numerical value to obtain the at least one input numerical value; generating the interference data based on the at least one input value.
Specifically, before sorting the identification data, the identification data may be subjected to deduplication processing, then the deduplicated identification data is sorted, and then the identification character string is generated according to the deduplicated and sorted identification data. When the identification character string is converted into the identification numerical value, the SHA-2 algorithm or the MD5 algorithm may be used for the conversion, and the embodiment of the disclosure is not limited.
For example, suppose the identification data queried by the rewritten query statement select uid, salary from table_t where class = 'A' and name != 'tom' is 001, 004, 003, 001 and 002. Deduplication yields 001, 004, 003 and 002; sorting in ascending order yields 001, 002, 003 and 004; and splicing the sorted identification data yields the identification character string 001002003004. The identification character string is then converted into an identification numerical value, for example 23, using the SHA-2 algorithm.
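A minimal sketch of this step is given below. It assumes SHA-256 as the SHA-2 variant and reads the whole digest as a big-endian integer; the disclosure does not say how the digest becomes a number, and the value 23 in the example is only illustrative, so the number produced here will differ.

```python
import hashlib


def identification_value(ids):
    """Deduplicate, sort and splice the identifiers, then hash the string."""
    unique_sorted = sorted(set(ids))       # 001, 002, 003, 004
    id_string = "".join(unique_sorted)     # "001002003004"
    digest = hashlib.sha256(id_string.encode("utf-8")).digest()
    # Assumption: interpret the 32-byte digest as one large integer.
    return id_string, int.from_bytes(digest, "big")


id_string, id_value = identification_value(["001", "004", "003", "001", "002"])
print(id_string)
print(id_value)
```

Deduplicating and sorting first means the value depends only on which identifiers were returned, not on their order or multiplicity in the result set.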
For each combined element, the identification numerical value may be concatenated with the combined element to generate a condition character string. For example, if the identification numerical value is 23 and the combined element is classA=, the generated condition character string is classA=23; if the identification numerical value is 23 and the combined element is nametom!=, the generated condition character string is nametom!=23. Each generated condition character string can be converted into an input numerical value using the SHA-2 algorithm or the MD5 algorithm. Note that the same algorithm should be used to convert all of the condition character strings.
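The conversion of condition character strings into input numerical values might look like the sketch below. SHA-256 and the truncation of the digest to its first 8 bytes are assumptions made here to obtain a compact seed; the function name input_value is hypothetical.

```python
import hashlib


def input_value(combined_element: str, id_value: int) -> int:
    """Concatenate element and identification value, then hash to a number."""
    condition_string = f"{combined_element}{id_value}"   # e.g. "classA=23"
    digest = hashlib.sha256(condition_string.encode("utf-8")).digest()
    # Assumption: keep the first 8 bytes as a 64-bit unsigned integer.
    return int.from_bytes(digest[:8], "big")


# The same algorithm is applied to every condition character string.
seeds = [input_value(element, 23) for element in ("classA=", "nametom!=")]
print(seeds)
```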
According to an embodiment of the present disclosure, generating the interference data according to the at least one input numerical value may include: inputting each input numerical value in the at least one input numerical value into a random number generator to obtain at least one interference coefficient; generating interference sub-data according to each interference coefficient in the at least one interference coefficient to obtain at least one piece of interference sub-data, wherein the interference sub-data has a mean of 0 and a standard deviation given by a formula in N, where N is the number of the identification data (the formula appears only as an image in the original publication, Figure BDA0002018901000000091, and is not reproduced here); and generating the interference data according to the at least one piece of interference sub-data.
For each input numerical value, an interference coefficient following the standard Gaussian distribution can be generated. Multiplying the interference coefficient by the standard deviation (shown only as images in the original publication, Figures BDA0002018901000000092 and BDA0002018901000000093) yields interference sub-data with a mean of 0 and that standard deviation. The interference data is then generated from all of the interference sub-data, for example by successively adding or subtracting them, which is not limited in this disclosure. The Box-Muller algorithm or the ziggurat algorithm may be used to generate the interference coefficients.
For example, the condition character string classA=23 may be converted into an input numerical value, for example 12, using the SHA-2 algorithm, and the condition character string nametom!=23 may be converted into an input numerical value, for example 33, using the same algorithm. Inputting each input numerical value, here 12 and 33, into the random number generator in turn yields the interference coefficients, for example a1 = 9 and a2 = 6. Multiplying each interference coefficient by the standard deviation (an image in the original publication, Figure BDA0002018901000000101) yields the interference sub-data Y1 and Y2 (Figure BDA0002018901000000102), and adding the two pieces of interference sub-data yields the interference data Y = Y1 + Y2.
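The seeded generation of interference data can be sketched as below. Two things are assumptions: Python's standard generator stands in for the Box-Muller or ziggurat generator named in the text, and the standard deviation is passed in as a parameter because its formula is not reproduced in this text, so the value 2.0 is a placeholder.

```python
import random


def interference(input_values, std_dev: float) -> float:
    """Sum one Gaussian interference sub-value per input numerical value."""
    total = 0.0
    for seed in input_values:
        # Seeding a fresh generator makes the coefficient a pure function
        # of the input numerical value.
        coefficient = random.Random(seed).gauss(0.0, 1.0)
        total += coefficient * std_dev     # interference sub-data
    return total


# Input numerical values 12 and 33 follow the example; 2.0 is a placeholder.
first = interference([12, 33], std_dev=2.0)
second = interference([12, 33], std_dev=2.0)
print(first, second)
```

Both calls print the same number, which is the property the scheme relies on: the same identification data and query conditions always produce the same interference data.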
Step S104, adding the interference data to the target data, and returning the target data with the interference data added.
The adding of the interference data to the target data may be directly adding or subtracting the interference data and the target data, and the like, which is not limited in this embodiment of the disclosure.
It should be noted that, the embodiment of the present disclosure only encrypts the data that needs to be returned, and does not change the original target data. For example, the target data is stored in a database, and the encrypted target data is returned to the terminal sending the data query instruction as a return result after the acquired target data is encrypted according to the embodiment of the disclosure, but the embodiment of the disclosure does not change the target data originally stored in the database.
According to an embodiment of the present disclosure, after the interference data is added to the target data and before the result is returned, it may further be determined whether the query result (that is, the target data with the interference data added) satisfies historical query logic continuity. For example, if the database has not been updated, the current data query instruction is compared with the previous one; if the query range of the current instruction is larger than that of the previous instruction, the current query result should be larger than the previous query result, and so on. If the query result satisfies historical query logic continuity, the data query instruction, the target data with the interference data added, the database update timestamp (recorded if the database has been updated) and the like are recorded, and the target data with the interference data added is returned. If the current query result does not satisfy historical query logic continuity, the previous query result is returned, and the step of inputting each input numerical value in the at least one input numerical value into the random number generator to obtain at least one interference coefficient is executed again.
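One way to picture the continuity check is the sketch below. It assumes a deliberately narrow case that the disclosure does not prescribe: both queries are counts of the hypothetical form "age > threshold" against an unchanged database, so a lower threshold means a wider query range. The QueryRecord type and its fields are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class QueryRecord:
    threshold: int       # the query counts rows with age > threshold
    noisy_count: float   # result after the interference data was added


def is_consistent(current: QueryRecord, previous: QueryRecord) -> bool:
    """Check that a wider query range never returns a smaller noisy count."""
    if current.threshold < previous.threshold:    # current range is wider
        return current.noisy_count >= previous.noisy_count
    if current.threshold > previous.threshold:    # current range is narrower
        return current.noisy_count <= previous.noisy_count
    # Identical range on an unchanged database: results must match exactly.
    return current.noisy_count == previous.noisy_count


previous = QueryRecord(threshold=30, noisy_count=11.4)
print(is_consistent(QueryRecord(threshold=25, noisy_count=17.9), previous))
print(is_consistent(QueryRecord(threshold=25, noisy_count=10.2), previous))
```

When the check fails, the text above calls for returning the previous result and regenerating the interference coefficients; that retry loop is left out of the sketch.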
A detailed flow diagram of an embodiment of the present disclosure may be as shown in FIG. 2, where:
Step S201, parsing the SQL statement and identifying its constituent elements;
Step S202, judging whether the SQL statement queries the identification data; if not, executing step S203, and if so, executing step S204;
Step S203, rewriting the SQL statement to add the function of querying the identification data;
Step S204, executing the query and obtaining the result;
Step S205, sorting the identification data and concatenating it into an identification character string;
Step S206, parsing each query condition;
Step S207, recording the query elements belonging to the constant category and the query elements belonging to the operator category in each query condition;
Step S208, generating a seed from each query condition and the identification data;
Step S209, generating a noise sub-value satisfying a Gaussian distribution from each seed;
Step S210, aggregating the noise sub-values to obtain the encrypted target data;
Step S211, judging whether historical query logic continuity is satisfied; if so, executing step S212, otherwise returning to step S209;
Step S212, recording the SQL statement, the encrypted target data and the database update timestamp;
Step S213, returning the encrypted target data to the querying user.
In the embodiment of the present disclosure, the SQL statement is also referred to as a data query instruction. Whether the SQL statement has the function of querying identification data (also referred to as a user set) can be determined by identifying the constituent elements of the SQL statement. If it does, the SQL statement can be executed directly; if not, the SQL statement is rewritten by the method described in the above embodiment, and the query is performed based on the rewritten SQL statement. The queried identification data and target data are obtained; the identification data is then deduplicated, sorted, and concatenated to generate an identification character string, and the identification character string is converted into an identification numerical value using a preset algorithm such as an SHA-2 algorithm. All query conditions in the SQL statement are determined, the query elements belonging to the constant category and the query elements belonging to the operator category are determined in each query condition, and the query elements in each query condition are then combined according to the divided categories to obtain combined elements. Each combined element is converted, together with the identification numerical value, into a condition character string, and each condition character string is converted into an input numerical value using a preset algorithm such as the SHA-2 algorithm. Each input numerical value is then input in turn into a random number generator to obtain an interference coefficient. These interference coefficients are then used to generate interference sub-data with the standard deviation given in formula image BDA0002018901000000121. The interference data is obtained by adding all the interference sub-data, and the encrypted target data (also called the noise-added result) is obtained by adding the interference data to the target data.
Further, whether logical continuity with the query history is satisfied is judged. For example, if the database has not been updated, the current SQL statement is compared with the previous SQL statement: if the query range of the current SQL statement is larger than that of the previous SQL statement, the current noise-added result should be larger than the previous noise-added result. For instance, if the database has not been updated, the current query counts the users older than 32, and the previous query counted the users older than 40, then the current noise-added result should be greater than the previous one. If logical continuity with the query history is judged to be satisfied, the current SQL statement, the encrypted target data, the database update timestamp (recorded if the database has been updated), and the like are recorded, and the encrypted target data is returned to the querying user, who may be the user that sent the SQL statement. If logical continuity is judged not to be satisfied, the previous noise-added result is returned to the querying user, and the process returns to step S209.
According to the embodiment of the disclosure, as long as the original query statement is unchanged, the queried target data and identification data are unchanged, and the interference data generated from the identification data is therefore unchanged, so that the encrypted target data is unchanged and the stability and accuracy of the query are ensured.
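The noise-generation steps described above can be illustrated with a minimal Python sketch. It is not the implementation of the disclosure: SHA-256 is used as one member of the SHA-2 family, the interference coefficient is assumed to be a standard normal draw from a seeded generator, and `sigma` is a hypothetical stand-in for the standard deviation defined by the formula image, which is not reproduced here. The function names and the format of the combined-element strings are likewise assumptions.

```python
import hashlib
import random


def sha256_int(text: str) -> int:
    """Map a string to an integer with SHA-256 (a member of the SHA-2 family)."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")


def identification_value(ids) -> int:
    """Deduplicate, sort and concatenate the identification data, then hash it."""
    id_string = ",".join(sorted({str(i) for i in ids}))
    return sha256_int(id_string)


def interference_data(ids, combined_elements, sigma: float) -> float:
    """Sum one zero-mean interference sub-datum per combined element.

    sigma is an assumed parameter; the source defines the standard
    deviation by a formula that is not available in this text.
    """
    id_value = identification_value(ids)
    total = 0.0
    for element in combined_elements:
        condition_string = f"{id_value}|{element}"       # condition character string
        input_value = sha256_int(condition_string)       # input numerical value
        coefficient = random.Random(input_value).gauss(0.0, 1.0)  # assumed coefficient
        total += sigma * coefficient                     # interference sub-data
    return total


def noisy_result(target: float, ids, combined_elements, sigma: float = 1.0) -> float:
    """Add the interference data to the target data."""
    return target + interference_data(ids, combined_elements, sigma)
```

Because every random draw is seeded from the identification data and the query condition, repeating the same query over the same user set yields the same noise-added result, which is the stability property the paragraph describes.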
Meanwhile, since the historical queries are recorded, when encrypted target data is subsequently returned, the returned noise-added result can be kept logically consistent with the noise-added results returned previously. The situation in which the initial query range is small and the subsequent query range is large, yet the initially returned result is larger than the subsequently returned result, will therefore not occur.
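The logical-continuity check can be sketched as follows, under simplifying assumptions: every recorded query is a count of users whose age exceeds a threshold, and results are only compared when the database update timestamp is identical. The `HistoryEntry` record and its field names are hypothetical, and equal results are treated as acceptable here, which the source does not specify.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class HistoryEntry:
    sql: str
    age_threshold: int    # hypothetical: the query counts users with age > threshold
    noisy_result: float   # noise-added result that was (or would be) returned
    db_timestamp: int     # database update timestamp at query time


def is_logically_continuous(current: HistoryEntry,
                            history: Sequence[HistoryEntry]) -> bool:
    """Return True if the current result does not contradict recorded results."""
    for past in history:
        if past.db_timestamp != current.db_timestamp:
            continue  # the database changed, so the results are not comparable
        wider = current.age_threshold < past.age_threshold
        narrower = current.age_threshold > past.age_threshold
        if wider and current.noisy_result < past.noisy_result:
            return False  # a wider range returned a smaller count
        if narrower and current.noisy_result > past.noisy_result:
            return False  # a narrower range returned a larger count
    return True
```

In the example from the text, a query for users older than 32 that follows a query for users older than 40 passes the check only if its noise-added count is not smaller than the earlier one.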
Further, the scheme provided by the disclosure was tested by first executing a splitting attack and then executing a differential attack. As shown in fig. 3, in the splitting attack the attacker constructs a different query pair each time. The age condition changes from pair to pair, and the corresponding identification data changes with it, so the noise associated with the age condition and the identification data can be eliminated by issuing multiple queries and averaging. The maker condition, however, is the same in every query, so its noise is stable and cannot be eliminated by averaging. Thus, even if the query results are summed and averaged many times, the result obtained still contains noise. As shown in fig. 4, the conditions of the left and right queries in the differential attack differ significantly, so in the returned query results each age group contains different noise. The corresponding values in the two results therefore differ, and the attack cannot succeed by comparing the values on the two sides to find the difference.
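The effect of the splitting attack can be illustrated with a small simulation. This is a simplified model rather than the test setup of the disclosure: each call stands for one total reconstructed by the attacker, the noise is keyed only on hypothetical condition strings (the identification data is omitted), and a unit standard deviation is assumed.

```python
import hashlib
import random


def seeded_noise(key: str, sigma: float = 1.0) -> float:
    """Deterministic zero-mean Gaussian noise derived from a string key."""
    seed = int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest(), "big")
    return random.Random(seed).gauss(0.0, sigma)


def noisy_count(true_count: int, age_cut: int, maker: str) -> float:
    """Hypothetical noisy answer with one noise term per query condition."""
    return true_count + seeded_noise(f"age<{age_cut}") + seeded_noise(f"maker={maker}")


def averaged_error(true_count: int, maker: str, age_cuts) -> float:
    """Average error seen by an attacker who varies only the age condition."""
    answers = [noisy_count(true_count, cut, maker) for cut in age_cuts]
    return sum(answers) / len(answers) - true_count
```

Averaging over many age cuts drives the age-related noise toward zero, but the averaged error stays close to `seeded_noise("maker=...")`, because that term is identical in every query.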
The data protection scheme of the present disclosure associates the interference data with the identification data and the query conditions. By tying the interference data to the query, it provides stable interference data whose source is controllable; at the same time, historical query statements and query results are recorded and checked for logical consistency, which further ensures the accuracy of the query results. For each query condition, the scheme generates interference sub-data with a mean of 0 and the standard deviation given in formula image BDA0002018901000000131. The interference sub-data, whose mean is constant and whose standard deviation varies, is summed, and the resulting interference data is added to the target data, which improves the security of data protection.
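The statement about an unchanged mean and a changing standard deviation can be made concrete with a short derivation. It assumes, beyond what the text states, that the k interference sub-data terms ε_1, ..., ε_k are uncorrelated and that ε_i has standard deviation σ_i; the symbols k, ε_i and σ_i are introduced here for illustration only.

```latex
\varepsilon = \sum_{i=1}^{k} \varepsilon_i ,
\qquad
\mathbb{E}[\varepsilon] = \sum_{i=1}^{k} \mathbb{E}[\varepsilon_i] = 0 ,
\qquad
\operatorname{Var}(\varepsilon) = \sum_{i=1}^{k} \sigma_i^{2} ,
\qquad
\operatorname{sd}(\varepsilon) = \sqrt{\sum_{i=1}^{k} \sigma_i^{2}}
```

Under these assumptions the summed interference data keeps a mean of 0 regardless of the number of query conditions, while its standard deviation grows with that number; with equal σ_i = σ it equals σ√k.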
FIG. 5 schematically illustrates a block diagram of a data protection system according to an embodiment of the disclosure.
As shown in fig. 5, the data protection system 500 may include a receiving module 510, a determining module 520, a generating module 530, and an adding module 540, wherein:
a receiving module 510, configured to receive a data query instruction, obtain target data corresponding to the data query instruction, and obtain identification data corresponding to the target data;
a determining module 520, configured to determine at least one query condition in the data query instruction;
a generating module 530, configured to generate interference data according to the identification data and the at least one query condition;
an adding module 540, configured to add the interference data to the target data, and return the target data to which the interference data is added.
According to the data protection system provided by the disclosure, after a data query instruction is received, the identification data corresponding to the target data is acquired in addition to the target data corresponding to the data query instruction. At least one query condition in the data query instruction is determined, interference data is generated according to the identification data and the at least one query condition, the generated interference data is added to the queried target data, and the target data with the interference data added is returned, thereby protecting the target data. Because the identification data is unique, the interference data generated from the same identification data and the same query conditions is the same, so the interference data, and the target data protected by it, do not change between identical queries. The embodiment of the disclosure can thus ensure the stability, accuracy, and security of the query result, and avoids the poor stability and low accuracy of prior-art privacy protection schemes.
As an alternative embodiment, the generating module may include: the extracting unit is used for extracting the query elements included by each query condition in the at least one query condition; the combination unit is used for recombining the query elements included in each query condition in the at least one query condition to obtain at least one combination element; a generating unit, configured to generate the interference data according to the identification data and the at least one combination element.
As an alternative embodiment, the combination unit may include: a dividing subunit, configured to divide query elements included in each query condition of the at least one query condition into a constant category and/or an operator category; and the combining subunit is used for combining the query elements included in each query condition in the at least one query condition according to the divided categories to obtain at least one combined element.
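One way to picture the dividing and combining subunits is the sketch below. The text does not spell out the exact combination rule, so everything here is an assumption: conditions are taken to have the simple form `<column> <operator> <constant>`, and one combined element is built per category by joining the column name with the elements of that category.

```python
import re

# Hypothetical pattern for simple conditions of the form "<column> <operator> <constant>".
CONDITION = re.compile(r"^\s*(\w+)\s*(<=|>=|<>|!=|=|<|>)\s*(.+?)\s*$")


def split_condition(condition: str) -> dict:
    """Divide the query elements of one condition into operator and constant categories."""
    match = CONDITION.match(condition)
    if match is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    column, operator, constant = match.groups()
    return {"column": column, "operator": [operator], "constant": [constant]}


def combine_elements(condition: str) -> list:
    """Build one combined element per category (an assumed combination rule)."""
    parts = split_condition(condition)
    combined = []
    for category in ("operator", "constant"):
        joined = ",".join(parts[category])
        combined.append(f"{parts['column']}:{category}:{joined}")
    return combined
```

With this rule, `age > 32` and `age > 40` share the operator element but differ in the constant element, so only part of their noise would change between the two queries.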
As an alternative embodiment, the generating unit may include: a sorting subunit, configured to sort the identification data and generate an identification character string according to the sorted identification data; a conversion subunit, configured to convert the identification character string into an identification numerical value; and a generating subunit, configured to generate the interference data according to the identification numerical value and the at least one combined element.
As an optional embodiment, the generating subunit may further be configured to: generate a condition character string according to the identification numerical value and each combined element in the at least one combined element to obtain at least one condition character string; convert each condition character string in the at least one condition character string into an input numerical value to obtain at least one input numerical value; and generate the interference data according to the at least one input numerical value.
As an optional embodiment, when generating the interference data according to the at least one input numerical value, the generating subunit is further configured to: input each input numerical value in the at least one input numerical value into a random number generator to obtain at least one interference coefficient; generate interference sub-data according to each interference coefficient in the at least one interference coefficient to obtain at least one piece of interference sub-data, wherein the interference sub-data has a mean of 0 and the standard deviation given in formula image BDA0002018901000000151, where N is the number of pieces of identification data; and generate the interference data according to the at least one piece of interference sub-data.
As an alternative embodiment, the receiving module may include: a receiving unit, configured to receive the data query instruction and judge whether the data query instruction has the function of querying the identification data; and an adding unit, configured to, when the data query instruction is judged not to have the function of querying the identification data, add that function to the data query instruction, acquire the target data corresponding to the data query instruction to which the function has been added, and acquire the identification data corresponding to the target data.
As an alternative embodiment, for determining whether the data query instruction has the function of querying the identification data, the receiving unit may include: an identification subunit, configured to identify the constituent elements of the data query instruction; and a judging subunit, configured to judge whether the constituent elements contain an element corresponding to the function of querying the identification data.
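The identification and judging subunits can be sketched with a deliberately naive string-based check. The column name `user_id` is hypothetical, only simple `SELECT ... FROM ...` statements are handled, and instead of rewriting the statement in place (the rewriting method itself is described in an earlier embodiment) the sketch builds a companion query that returns the matching identifiers. A real system would use a proper SQL parser.

```python
import re

# Matches simple "SELECT <columns> FROM <rest>" statements only (an assumption).
SELECT = re.compile(
    r"^\s*SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<rest>.+?)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def queries_identification(sql: str, id_column: str = "user_id") -> bool:
    """Judge whether the select list already contains the identification column."""
    match = SELECT.match(sql)
    if match is None:
        raise ValueError("only simple SELECT ... FROM ... statements are handled")
    tokens = re.findall(r"\w+", match.group("cols").lower())
    return id_column.lower() in tokens


def identification_query(sql: str, id_column: str = "user_id") -> str:
    """Build a companion query returning the identifiers matched by the statement."""
    match = SELECT.match(sql)
    if match is None:
        raise ValueError("only simple SELECT ... FROM ... statements are handled")
    return f"SELECT DISTINCT {id_column} FROM {match.group('rest')}"
```

For an aggregate such as `SELECT COUNT(*) FROM users WHERE age > 32`, the check reports that no identification data is queried, and the companion query retrieves the user set over which the noise is then seeded.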
Fig. 6 schematically shows a hardware architecture diagram of a computer device adapted to implement the data protection method according to an embodiment of the present disclosure. In this embodiment, the computer device 600 is a device capable of automatically performing numerical calculation and/or information processing in accordance with instructions set or stored in advance. For example, the computer device may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of a plurality of servers). As shown in fig. 6, the computer device 600 includes at least, but is not limited to, a memory 610, a processor 620, and a network interface 630, which may be communicatively coupled to each other via a system bus. Wherein:
the memory 610 includes at least one type of computer-readable storage medium including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 610 may be an internal storage module of the computer device 600, such as a hard disk or a memory of the computer device 600. In other embodiments, the memory 610 may also be an external storage device of the computer device 600, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. provided on the computer device 600. Of course, the memory 610 may also include both internal and external memory modules of the computer device 600. In this embodiment, the memory 610 is generally used for storing an operating system and various application software installed in the computer device 600, such as program codes of a data protection method. In addition, the memory 610 may also be used to temporarily store various types of data that have been output or are to be output.
Processor 620 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 620 generally serves to control the overall operation of the computer device 600, such as performing control and processing related to data interaction or communication with the computer device 600. In this embodiment, the processor 620 is configured to execute the program codes stored in the memory 610 or process data.
The network interface 630 may include a wireless network interface or a wired network interface, and is typically used to establish communication connections between the computer device 600 and other computer devices. For example, the network interface 630 is used to connect the computer device 600 to an external terminal via a network and to establish a data transmission channel and a communication connection between the computer device 600 and the external terminal. The network may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile Communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, Wi-Fi, etc.
It is noted that FIG. 6 only shows a computer device having components 610-630, but it is understood that not all of the shown components are required and that more or fewer components may be implemented instead.
In this embodiment, the program code of the data protection method stored in the memory 610 may further be divided into one or more program modules that are executed by one or more processors (in this embodiment, the processor 620) to implement the present invention.
The present embodiment also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the data protection method in the embodiments.
In this embodiment, the computer-readable storage medium includes a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the computer readable storage medium may be an internal storage unit of the computer device, such as a hard disk or a memory of the computer device. In other embodiments, the computer readable storage medium may be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the computer device. Of course, the computer-readable storage medium may also include both internal and external storage devices of the computer device. In this embodiment, the computer-readable storage medium is generally used for storing an operating system and various types of application software installed in the computer device, for example, the program codes of the data protection method in the embodiment, and the like. Further, the computer-readable storage medium may also be used to temporarily store various types of data that have been output or are to be output.
It will be apparent to those skilled in the art that the modules or steps of the embodiments of the invention described above may be implemented by a general-purpose computing device, and may be centralized on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from that described herein; the modules or steps may also be fabricated as separate integrated circuit modules, or several of them may be fabricated as a single integrated circuit module. Thus, embodiments of the invention are not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A method for data protection, the method comprising:
receiving a data query instruction, acquiring target data corresponding to the data query instruction, and acquiring identification data corresponding to the target data;
determining at least one query condition in the data query instruction;
generating interference data according to the identification data and the at least one query condition;
and adding the interference data into the target data, and returning the target data added with the interference data.
2. The method of claim 1, wherein generating interference data based on the identification data and the at least one query condition comprises:
extracting query elements included in each query condition in the at least one query condition;
recombining the query elements included in each query condition in the at least one query condition to obtain at least one combined element;
generating the interference data according to the identification data and the at least one combined element.
3. The method according to claim 2, wherein the recombining the query elements included in each of the at least one query condition to obtain at least one combined element comprises:
dividing query elements included in each query condition in the at least one query condition into a constant category and/or an operator category;
and combining the query elements included in each query condition in the at least one query condition according to the divided categories to obtain at least one combined element.
4. The method of claim 2, wherein generating the interference data based on the identification data and the at least one combined element comprises:
sorting the identification data, and generating an identification character string according to the sorted identification data;
converting the identification character string into an identification numerical value;
and generating the interference data according to the identification numerical value and the at least one combined element.
5. The method of claim 4, wherein generating the interference data based on the identification numerical value and the at least one combined element comprises:
generating a condition character string according to the identification numerical value and each combined element in the at least one combined element to obtain at least one condition character string;
converting each condition character string in the at least one condition character string into an input numerical value to obtain at least one input numerical value;
generating the interference data based on the at least one input numerical value.
6. The method of claim 5, wherein generating the interference data based on the at least one input numerical value comprises:
inputting each input numerical value in the at least one input numerical value into a random number generator to obtain at least one interference coefficient;
generating interference subdata according to each interference coefficient in the at least one interference coefficient to obtain at least one piece of interference subdata, wherein the interference subdata has a mean of 0 and the standard deviation given in formula image FDA0002018900990000021, where N is the number of pieces of identification data;
and generating the interference data according to the at least one piece of interference subdata.
7. The method of claim 1, wherein the receiving a data query instruction, obtaining target data corresponding to the data query instruction, and obtaining identification data corresponding to the target data comprises:
receiving the data query instruction, and judging whether the data query instruction has a function of querying the identification data;
if not, adding the function of querying the identification data to the data query instruction, acquiring target data corresponding to the data query instruction to which the function of querying the identification data has been added, and acquiring the identification data corresponding to the target data.
8. The method of claim 7, wherein the determining whether the data query instruction has a function of querying the identification data comprises:
identifying constituent elements of the data query instruction;
and judging whether the component elements contain elements corresponding to the function of inquiring the identification data.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor is adapted to carry out the steps of the method according to any of claims 1 to 8 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, is adapted to carry out the steps of the method of any one of claims 1 to 8.
CN201910272613.3A 2019-04-04 2019-04-04 Data protection method, computer device and computer-readable storage medium Pending CN111783131A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910272613.3A CN111783131A (en) 2019-04-04 2019-04-04 Data protection method, computer device and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910272613.3A CN111783131A (en) 2019-04-04 2019-04-04 Data protection method, computer device and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN111783131A true CN111783131A (en) 2020-10-16

Family

ID=72755385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910272613.3A Pending CN111783131A (en) 2019-04-04 2019-04-04 Data protection method, computer device and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN111783131A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117993027A (en) * 2024-03-28 2024-05-07 之江实验室 Data protection method and device for repeated query attack

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20201016