[go: up one dir, main page]

CN116094708B - Privacy protection method, terminal and storage medium of DBSCAN algorithm - Google Patents

Privacy protection method, terminal and storage medium of DBSCAN algorithm

Info

Publication number
CN116094708B
CN116094708B CN202310080172.3A CN202310080172A CN116094708B CN 116094708 B CN116094708 B CN 116094708B CN 202310080172 A CN202310080172 A CN 202310080172A CN 116094708 B CN116094708 B CN 116094708B
Authority
CN
China
Prior art keywords
server
points
point
data
dbscan algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310080172.3A
Other languages
Chinese (zh)
Other versions
CN116094708A (en
Inventor
王小伟
张旭
吴睿振
孙华锦
王凛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Original Assignee
Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd filed Critical Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Priority to CN202310080172.3A priority Critical patent/CN116094708B/en
Publication of CN116094708A publication Critical patent/CN116094708A/en
Application granted granted Critical
Publication of CN116094708B publication Critical patent/CN116094708B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0816Key establishment, i.e. cryptographic processes or cryptographic protocols whereby a shared secret becomes available to two or more parties, for subsequent use
    • H04L9/0819Key transport or distribution, i.e. key establishment techniques where one party creates or otherwise obtains a secret value, and securely transfers it to the other(s)
    • H04L9/0825Key transport or distribution, i.e. key establishment techniques where one party creates or otherwise obtains a secret value, and securely transfers it to the other(s) using asymmetric-key encryption or public key infrastructure [PKI], e.g. key signature or public key certificates
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/20Network architectures or network communication protocols for network security for managing network security; network security policies in general
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/12Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0816Key establishment, i.e. cryptographic processes or cryptographic protocols whereby a shared secret becomes available to two or more parties, for subsequent use
    • H04L9/085Secret sharing or secret splitting, e.g. threshold schemes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0861Generation of secret information including derivation or calculation of cryptographic keys or passwords
    • H04L9/0863Generation of secret information including derivation or calculation of cryptographic keys or passwords involving passwords or one-time passwords
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/50Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Storage Device Security (AREA)

Abstract

本发明涉及信息技术领域,具体涉及DBSCAN算法的隐私保护方法、终端及存储介质。该方法通过DH协议分别建立客户端与服务器A,B之间的密钥;其中,设第l个客户端与服务器A之间的密钥为与服务器B之间的密钥为客户端利用加法秘密共享,将数据点划分为两部分,分别使用对应的密钥加密后发送给对应的服务器A,B;服务器A,B分别对接收到的数据点的秘密共享值解密,获得任意两点间的距离与M的大小;服务器A通过两点间的距离与ε的大小关系,得到每一点的ε‑邻域内的点集;服务器A通过DBSCAN算法获得聚类结果。本发明节省了客户端的计算和通讯花费,并降低了云服务器的计算花费。

The present invention relates to the field of information technology, and in particular to a privacy protection method, terminal and storage medium for a DBSCAN algorithm. The method uses a DH protocol to establish keys between a client and servers A and B respectively; wherein, the key between the first client and server A is The key between server B is The client uses additive secret sharing to divide the data points into two parts, and encrypts them with corresponding keys and sends them to the corresponding servers A and B. Servers A and B decrypt the secret sharing values of the received data points to obtain the distance between any two points and the size of M. Server A obtains the point set in the ε-neighborhood of each point through the relationship between the distance between the two points and ε. Server A obtains the clustering result through the DBSCAN algorithm. The present invention saves the computing and communication costs of the client and reduces the computing costs of the cloud server.

Description

Privacy protection method, terminal and storage medium of DBSCAN algorithm
Technical Field
The present invention relates to the field of information technologies, and in particular, to a privacy protection method, a terminal, and a storage medium for a DBSCAN algorithm.
Background
With the rapid development of cloud services, people have made great changes in the way mass data is processed and calculated. When the computing capacity of the local computer is limited and the computing capacity required for solving the problem is huge, the data can be uploaded to the cloud server, and the work of the user can be completed by means of the powerful computing capacity of the cloud server. Today's cloud servers involve content that encompasses aspects of life, such as personal physical status, family member information, company member constitution, etc., that may be related to personal privacy or business confidentiality, which if uploaded directly to the server would result in the risk of revealing the privacy or business confidentiality. For the application scenario of machine learning, since the data volume and the calculation volume of the application scenario tend to be relatively large, the application scenario is often outsourced to a server for operation. Common unsupervised machine learning algorithms are the K-means algorithm, the DBSCAN algorithm, and the like. For the K-means algorithm, there are already some efficient and safe ways of computation. Compared with the K-means algorithm, the DBSCAN algorithm can find any cluster in any shape under the condition of noisy points, and has a remarkable effect of removing malicious data.
In the prior art, a BGV homomorphic encryption algorithm is adopted to secret user data related to a DBSCAN algorithm. The user encrypts the data and then uploads the data to the cloud server, and as the BGV algorithm meets the property of full homomorphism, the cloud server sends the calculation result to the user after performing a series of calculations on the ciphertext, and the user can obtain the correct algorithm result after decrypting the calculation result.
Due to the complexity of fully homomorphic encryption algorithms, there are many operations that increase the computational expense and network expense of clients as well as greatly increasing the amount of server computation during operation of the algorithm. For example, before data encryption, data is first encoded, matrix operation is performed on each piece of data, the encrypted ciphertext may be several times as large as the original data, and the ciphertext needs to be sent to a cloud server through a network. In order to solve the technical problem, a privacy protection method, a terminal and a storage medium of a DBSCAN algorithm are provided.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention provides a privacy protection method, a terminal and a storage medium of a DBSCAN algorithm.
In order to achieve the above object, the embodiment of the present invention provides the following technical solutions:
in a first aspect, in one embodiment provided by the present invention, there is provided a privacy protection method of a DBSCAN algorithm, the method including the steps of:
Respectively establishing keys between a client and the servers A and B through DH protocol;
wherein, the key between the first client and the server A is set as The key with the server B is
The client divides the data point into two parts by utilizing addition secret sharing, encrypts the data point by using corresponding keys respectively and then sends the encrypted data point to corresponding servers A and B;
The server A, B decrypts the secret sharing value of the received data point respectively, obtain the distance between any two points and the size of M;
the server A obtains a point set in epsilon-neighbor of each point through the size relation between the distance between two points and epsilon, and the server A obtains a clustering result through a DBSCAN algorithm.
As a further aspect of the invention, data points are usedAndFor illustration. Assume that server a gets the data points AndSecret shared value of (2) isData points obtained by server BAndSecret shared value of (2) is The method for obtaining the distance between any two points and the size of M comprises the following steps:
s301, calculating by server A Server B computing
S302, the servers A and B respectively obtain data through multiplication secret sharingAnd is also provided withAndSatisfy the following requirements
S303, calculating by the server A to obtain data The server B calculates the data
S304, the server B sends sign (V B) to the server A, and the server A preliminarily judgesAnd epsilon size;
S305, comparing the magnitudes of |V A | and |V B | by adopting a method OT_Compare (|V A|,|VB |) between the servers A and B, thereby obtaining And epsilon.
As a further scheme of the invention, the server B sends sign (V B) to the server A, and the server A preliminarily judgesAnd epsilon, which includes the following cases:
I. V A and V B are 0 or more, and can be obtained
II. V A and V B are 0 or less, and can be obtained
III, V A is greater than or equal to 0, V B is less than 0, V A is less than or equal to 0, and V B is greater than 0, then adopting a method OT_Compare (I V A|,|VB I) to Compare the magnitudes of I V A I and I V B I between the servers A and B, thereby obtainingAnd epsilon.
As a further aspect of the present invention, the sizes of |v A | and |v B | are compared between servers a, B using the method ot_compound (|v A|,|VB |), thereby obtainingAnd epsilon, which includes the following cases:
I. if sign (V A)·sign(|VA|-|VB |) is not less than 0, it can be obtained
II. If sign (V A)·sign(|VA|-|VB |) is less than or equal to 0, then
As a further aspect of the present invention, the server a obtains a clustering result through a DBSCAN algorithm, which includes:
Input sample set Neighborhood parameters (. Epsilon., M).
The output is class division c= { C 1,C2,…,Ck }.
As a further scheme of the invention, the server A obtains a clustering result through a DBSCAN algorithm, and the method comprises the following steps:
S401, all core points are obtained;
if at some point (e.g ) The number of points in the epsilon-neighborhood of (i.e.)) Setting the set of all core points as omega;
s402, marking all core points as unvisited;
s403, taking out a core point from omega (After removal)) Marking the point as accessed, setting the current class sequence number k, and creating a queue of the current core pointCreating a set of points of a current category
S404, taking out an element from the core point queue omega k (After removal)) Acquiring all atNon-visited point set in epsilon-neighborhood of (c-c)Non-accessed core point setWill beThese points within epsilon-neighborhood are marked as accessed, and the current class point set is updated toThe core points of the points are collectedTo the core point queue (after joining));
S405, repeating the step S404 until the core point queue omega k is empty after the step S404 is executed, wherein the current category point set C k forms a category, and eliminating the core points marked as accessed from omega;
S406, repeating the steps S403-S405 until the element in omega is null, and obtaining the category division of C= { C 1,C2,…,Ck }.
As a further aspect of the present invention, the servers a, B do not cross each other.
As a further aspect of the present invention, client dataWherein, the The value range of i is 1-m, and the values of m and n are positive integers.
In a second aspect, in yet another embodiment provided by the present invention, a terminal is provided, including a memory storing a computer program and a processor implementing the steps of the privacy preserving method of the DBSCAN algorithm when the computer program is loaded and executed by the processor.
In a third aspect, in yet another embodiment of the present invention, a storage medium is provided storing a computer program which, when loaded and executed by a processor, implements the steps of the privacy preserving method of the DBSCAN algorithm.
The technical scheme provided by the invention has the following beneficial effects:
According to the privacy protection method, the terminal and the storage medium of the DBSCAN algorithm, too many matrix operations are not needed to be carried out on the client, the number of parameters uploaded to cloud services is greatly reduced, and the calculation and communication cost of the client is saved. And the algorithm does not need the cloud server to perform extremely complex operation on the ciphertext, so that the calculation cost of the cloud server is effectively reduced. The data provider may be a plurality of clients. After the data is divided by adopting an addition secret sharing technology, each divided part is respectively encrypted and sent to different servers, so that interception of other people can be prevented. And then, a multiplication secret sharing technology and an unintentional transmission technology are used between the two servers to acquire the size relation between the distance between the two points and the neighborhood range threshold, so that the contents such as core points, density reachable relations, density connection relations and the like in the DBCSCAN algorithm can be acquired, and further, the safe calculation of the DBSCAN algorithm is realized. Compared with the existing results, the method and the system save the calculation and communication costs of the client and reduce the calculation costs of the cloud server.
These and other aspects of the invention will be more readily apparent from the following description of the embodiments. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are necessary for the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention and that other embodiments may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a privacy preserving method of DBSCAN algorithm of an embodiment of the present invention;
fig. 2 is a block diagram of a terminal according to an embodiment of the present invention.
In the figure, a processor-501, a communication interface-502, a memory-503 and a communication bus-504 are shown.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The flow diagrams depicted in the figures are merely illustrative and not necessarily all of the elements and operations/steps are included or performed in the order described. For example, some operations/steps may be further divided, combined, or partially combined, so that the order of actual execution may be changed according to actual situations.
It is to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
In particular, embodiments of the present invention are further described below with reference to the accompanying drawings.
Referring to fig. 1, fig. 1 is a flowchart of a privacy protection method of a DBSCAN algorithm according to an embodiment of the present invention, and as shown in fig. 1, the privacy protection method of the DBSCAN algorithm includes steps S10 to S40. The privacy protection method of the DBSCAN algorithm is applied to a system, wherein the system comprises at least one client and at least one server A and at least one server B.
S10, secret keys between the client and the servers A and B are respectively established through DH protocol.
Wherein, the key between the first client and the server A is set asThe key with the server B is
And S20, the client divides the data point into two parts by utilizing addition secret sharing, encrypts the data point by using the corresponding secret key and sends the encrypted data point to the corresponding servers A and B.
In embodiments of the invention, for example, data pointsT represents matrix transposition, divided into by addition secretAnd then respectively using the corresponding key pairsAnd after encryption, the encrypted data are sent to the servers A and B.
S30, the server A and the server B respectively decrypt the secret sharing values of the received data points, and the distance between any two points and the size of M are obtained by the following steps.
We use the data pointsAndFor illustration.
Suppose that server A getsAndSecret shared value of (2) isObtained by server BAndSecret shared value of (2) is The method comprises the following specific steps:
s301, calculating by server A Server B computing
S302, the servers A and B respectively obtain data through multiplication secret sharingThey satisfy
S303, calculating by the server A to obtain data The server B calculates the data
It can be seen that V A+VB results inSo long as the relation between V A+VB and 0 can be judgedAnd epsilon.
S304, the server B sends sign (V B) to the server A, and the server A preliminarily judgesAnd epsilon. The following conditions are adopted:
I. V A and V B are 0 or more, and can be obtained
II. V A and V B are 0 or less, and can be obtained
III, V A is greater than or equal to 0, V B is less than 0, V A is less than or equal to 0, and V B is greater than 0, and the process proceeds to step S305.
S305, comparing the magnitudes of |V A | and |V B | by adopting a method OT_Compare (|V A|,|VB |) between the servers A and B, thereby obtainingAnd epsilon. The following cases are divided into
I. if sign (V A)·sign(|VA|-|VB |) is not less than 0, it can be obtained
II. If sign (V A)·sign(|VA|-|VB |) is less than or equal to 0, then
S40, the server A obtains a point set in the epsilon-neighborhood of each point according to the size relation between the distance between the two points and epsilon, and the server A obtains a clustering result according to a DBSCAN algorithm.
The invention firstly divides the data into two parts by an addition secret sharing technology, encrypts the two parts and sends the encrypted data to the corresponding cloud server. After the cloud server decrypts the data, the secret sharing value of the square of the distance between two points can be obtained by using the addition and multiplication secret sharing. And then, obtaining the magnitude relation between the square of the distance between the two points and the square of the distance threshold value by using the unintentional transmission, and further obtaining the magnitude relation between the distance between the two points and the distance threshold value. Then, one server can obtain a core point, a boundary point and a noise point in the DBSCAN algorithm by utilizing the distance size relation, and obtain the density direct, density reachable and density connection relation of the data points by the core point, the boundary point and the noise point, so as to realize the secret calculation of the DBSCAN algorithm.
The DBSCAN algorithm describes the degree of compactness of a sample set based on the number of samples in the neighborhood. We first describe the parameters associated with the DBSCAN algorithm.
Assuming that the data point set isThere are the following relevant definitions
Epsilon-neighborhood. For the followingIts epsilon-neighborhood refers to the sum of the data point set DIs not greater than epsilon, i.e.The number of this set is counted as
Core points. For any data pointIf it is epsilon-neighborhoodComprises at least M samples, i.eThenIs the core point.
The density is directly reached. If it isIs positioned atWithin epsilon-neighborhood of (2), andIs the core point, then calledFrom the following componentsThe density is directly reached. The opposite is not necessarily the case, unlessIs also the core point. The density is directly asymmetric.
The density can be achieved. For the followingAndIf a sample sequence is presentSatisfy the following requirements AndFrom the following componentsThe density is direct and is calledFrom the following componentsThe density can be achieved. It is easy to see that the density is achieved with transitivity. As with the direct density, the density can be achieved without symmetry.
And (3) density connection. For the followingAndIf there is a core objectMake the following stepsAndAre all provided withThe density is up toAndAnd (3) density connection. Density-connected symmetry can be obtained.
Wherein the servers A and B are not mutually communicated.
Client dataWherein the method comprises the steps ofT represents the matrix transpose, i.e., all data points are n-dimensional vectors. Neighborhood parameters (. Epsilon., M).
In the embodiment of the present invention, the server a obtains a clustering result through a DBSCAN algorithm, which includes:
Input sample set Neighborhood parameters (. Epsilon., M).
The output is class division c= { C 1,C2,…,Ck }, where the value of k is not known prior to program execution.
In the embodiment of the invention, the server A obtains the clustering result through the DBSCAN algorithm, and the method comprises the following steps:
s401, all core points are obtained. If at some point (e.g ) The number of points in the epsilon-neighborhood of (i.e.)) This is then the core point. Let the set of all core points be Ω.
S402, marking all core points as unvisited.
S403, taking out a core point from omega(After removal)) Marking the point as accessed, setting the current class sequence number k, and creating a queue of the current core pointCreating a set of points of a current category
S404, taking out an element from the core point queue omega k (After removal)) Acquiring all atNon-visited point set in epsilon-neighborhood of (c-c)Non-accessed core point setWill beThese points within epsilon-neighborhood are marked as accessed, and the current class point set is updated toThe core points of the points are collectedTo the core point queue (after joining))。
S405, repeating the step S404 until the core point queue Ω k is empty after the step S404 is completed. At this time, the current category point set C k forms a category, and core points marked as accessed are removed from Ω.
S406, repeating the steps S403-S405 until the element in omega is null, and obtaining the category division of C= { C 1,C2,…,Ck }.
The key to the confidentiality of the DBSCAN algorithm is the confidentiality of user data information. It can be seen that in the above algorithm, the secret calculation of the DBSCAN algorithm can be realized as long as we can realize the secret judgment of the distance and epsilon of the point. With respect to the measurement of the distance of the data points, we use Euclidean distance in the present invention.
Among them, secret sharing is an important technology in multiparty security computing, and is widely used in the field of privacy protection because it is relatively simple to use. Only two-way additive secret sharing is used in the present invention, which is briefly described below.
In additive secret sharing, data x is randomly split into the sum of two data (x 0,x1), i.e., x=x 0+x1, and then x 0,x1 is stored separately at the participants P 0,P1. The recovery of data x must be performed in combination with the data of party P 0,P1.
The party P 0 may be provided with the data x and the other party P 1 with the data y. The additive secret sharing may be implemented as follows.
1. P 0 generates a random number x 0 and computes x 1=x-x0, then sends x 1 to P 1.
2. P 1 generates a random number y 0 and calculates y 1=y-y0, then sends y 0 to P 0.
3. P 0 calculated z 0=x0+y0,P1 calculated z 1=x1+y1.
At this time, the sum of x and y is calculated, and only z 0 and z 1 are needed to be summarized to the demander, and the result is z 0+z1. It can be seen that in this process, the data x and y are not revealed, so that the data privacy is well protected.
In addition, if data information is eavesdropped during the data transmission process, the data privacy still has the risk of leakage, the data can be encrypted before the sender sends the data, and for the sake of simplicity of calculation, a traditional symmetric encryption algorithm can be used. With respect to the transmission of the key, we use Diffie-Hellman (hereinafter referred to as DH) key exchange method. The DH key exchange protocol is briefly described next.
Alice and Bob want to share a single key for symmetric encryption. But the communication channel between them is not secure. All information passing through this channel is seen by the adversary Eve. How do they exchange information to let Eve know this key?
The security of the DH algorithm depends on the degree of difficulty in computing discrete logarithms. The concept of primitive root is needed in the following schemes, and we give a definition of this.
Definition 1a is the primitive root of n if the smallest positive power m, which makes a m =1 mod n hold, satisfies m=Φ n. Where Φ n is the Euler function.
Thus, for any primitive root, a, of integer b and prime number, p, there is a unique power, i, such that b=a i modp, 0≤i≤p-1. The discrete logarithm difficulty problem is that given a, b, p, it is difficult to calculate i.
The following is a DH protocol scheme:
alice and Bob agree on p and g first, where p is a large prime number, g is the primitive root of p, and p and g are disclosed. Eve also knows their values.
Alice takes a private integer a, not known to anyone, and gives Bob the calculation result a=g a modp. Eve also sees the value of A.
3. Similarly, bob takes a private integer B and sends Alice the result b=g b modp. Also Eve will see what is passing B.
Alice calculates s=b amodp=(gb)amodp=gab modp.
Bob can also calculate s=a bmodp=(ga)bmodp=gab modp.
Alice and Bob now have a common key S. Although Eve sees p, g, A, B, she cannot know the specific values of a and B given the difficulty in computing discrete logarithms. Eve has no knowledge of what is the key S.
The 1-out-of-N OT can be implemented as follows:
1. And (5) a preparation stage. The protocol is in the order of the group of big prime number q The upper operation (i.e. the operation results in the protocol are all the results in the modulo q sense), the selection groupIs a primitive root g. A random predictive function H (e.g., SHA-1) is selected. The parameters q, g, H are shared by Alice and Bob.
2. In the initialization stage, alice selects N-1 random numbers C 1,C2,…,CN-1. A random number r is then selected and g r is calculated and then C 1,C2,…,CN-1,gr is sent to Bob. Alice pre-calculates (C 1)r,(C2)r,…,(CN-1)r. (Bob cannot obtain the discrete logarithm of C 1,C2,…,CN-1 and the value of r due to the discrete logarithm difficulty).
3. On-line computing stage:
a. Bob selects a random number k, sets
PK 0 then sends Alice. (Alice cannot obtain the value of k).
B. Alice calculates (PK 0)r, then calculates (PK i)r=(Ci/PK0)r, 1. Ltoreq.i≤n-1. Then selects a random string R (where R is selected long enough to ensure that the Hash values for the two different data pairs are different) and encrypts each M i, 0≤i≤n-1 with H ((PK i)r,R,i)⊕Mi), then sends the encryption result and R to Bob.
C. Bob can calculate (PKσ)r=(Cσ/PK0)r=(gk)r=(gr)k, and then decrypt using H ((PK σ)r, R, σ) to yield M σ.
The two numbers can then be safely compared in size using this protocol.
Suppose Alice owns data x and Bob owns data y. And x, y e {0,... We can compare the x, y magnitudes using the following procedure.
1. Alice constructs N plaintext messages
2. Bob obtains the value of m y through 1-out-of-N OT, and can obtain a result
Bob then sends the size result of both to Alice.
If x, y is a generally real number, x can be expressed in N-ary form, e.g., x=x p-1…x0.x-1…x-q, i.e.The integer has p bits and the decimal has q bits. y can also be expressed as the form y=y p-1…y0·y-1…y-q of N-ary. The two sizes are then compared starting from the most significant bit,
1. If x i=yi, -q≤i≤p, x=y.
2. If k is present, x i=yi is present when i > k, and x i>yi is present when i=k, then x > y.
3. If k is present, x i=yi is present when i > k, and x i<yi is present when i=k, x < y.
We denote the above procedure of comparing the two sizes by ot_component (x, y).
Secret sharing is an important technology in multiparty security computing, and is widely used in the field of privacy protection because it is relatively simple to use.
In two-party addition secret sharing, data x is randomly split into the sum of two data (x 0,x1), i.e., x=x 0+x1, and then x 0,x1 is held at the participants P 0,P1, respectively. The recovery of data x must be performed in combination with the data of party P 0,P1.
The party P 0 may be provided with the data x and the other party P 1 with the data y. The addition secret sharing can be implemented as follows.
1. P 0 generates a random number x 0 and computes x 1=x-x0, then sends x 1 to P 1.
2. P 1 generates a random number y 0 and calculates y 1=y-y0, then sends y 0 to P 0.
3. P 0 calculated z 0=x0+y0,P1 calculated z 1=x1+y1.
At this time, the sum of x and y is calculated, and only z 0 and z 1 are needed to be summarized to the demander, and the result is z 0+z1. It can be seen that in this process, the data x and y are not revealed, so that the data privacy is well protected.
Two-party multiplicative secret sharing can be implemented with 1-out-of-N OT.
Let party P 0 possess data x and party P 1 possess data y. X is expressed as N-ary form x=x p-1…x0·x-1…x-q, i.eThe integer has p bits and the decimal has q bits. y can also be expressed as the form y=y p-1…y0·y-1…y-q of N-ary. When i= -q, p-1, the following methods were used for calculation, respectively
1. Bob generates (m i,0,…,mi,N-1), where m i,0 is a random number, m i,j=Nijy-mi,0.
2. Alice uses 1-out-of-N OT acquisition
3. Alice calculationBob calculation
It is easy to see that z A+zB =x·y, i.e. to obtain the value of x·y, only the value of z A,zB needs to be summarized, and the values of x, y do not need to be revealed.
The invention adopts the addition secret sharing technology to divide the data, then encrypts each divided part and sends the encrypted part to different servers, thereby preventing others from eavesdropping. And then, a multiplication secret sharing technology and an unintentional transmission technology are used between the two servers to acquire the size relation between the distance between the two points and the neighborhood range threshold, so that the contents such as core points, density reachable relations, density connection relations and the like in the DBCSCAN algorithm can be acquired, and further, the safe calculation of the DBSCAN algorithm is realized. Compared with the existing results, the method saves the calculation and communication cost of the client and reduces the calculation cost of the cloud server.
It should be understood that although described in a certain order, the steps are not necessarily performed sequentially in the order described. The steps are not strictly limited to the order of execution unless explicitly recited herein, and the steps may be executed in other orders. Moreover, some steps of the present embodiment may include a plurality of steps or stages, which are not necessarily performed at the same time, but may be performed at different times, and the order of the steps or stages is not necessarily sequential, but may be performed alternately or alternately with at least a part of the steps or stages in other steps or other steps.
In one embodiment, referring to fig. 2, a terminal is further provided in an embodiment of the present invention, where the terminal includes a processor 501, a communication interface 502, a memory 503, and a communication bus 504, where the processor 501, the communication interface 502, and the memory 503 perform communication with each other through the communication bus 504.
A memory 503 for storing a computer program;
The processor 501 is configured to execute the privacy protection method of the DBSCAN algorithm when executing the computer program stored in the memory 503, and the processor executes the instructions to implement the steps in the method embodiments described above:
The communication bus mentioned by the above terminal may be a peripheral component interconnect standard (PERIPHERAL COMPONENTINTERCONNECT, abbreviated as PCI) bus or an extended industry standard architecture (Extended Industry StandardArchitecture, abbreviated as EISA) bus, etc. The communication bus may be classified as an address bus, a data bus, a control bus, or the like. For ease of illustration, the figures are shown with only one bold line, but not with only one bus or one type of bus.
The communication interface is used for communication between the terminal and other devices.
The memory may include random access memory (Random Access Memory, RAM) or may include non-volatile memory (non-volatile memory), such as at least one disk memory. Optionally, the memory may also be at least one memory device located remotely from the aforementioned processor.
The processor may be a general-purpose processor including a central Processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), a digital signal processor (DIGITAL SIGNAL Processing, DSP), an application specific integrated circuit (Application SpecificIntegrated Circuit, ASIC), a Field-Programmable GATE ARRAY, FPGA, or other Programmable logic device, discrete gate or transistor logic device, or discrete hardware components.
The terminal comprises user equipment and network equipment. The user equipment comprises, but is not limited to, a computer, a smart phone, a PDA and the like, and the network equipment comprises, but is not limited to, a single network server, a server group formed by a plurality of network servers or Cloud Computing (Cloud Computing) based Cloud formed by a large number of computers or network servers, wherein the Cloud Computing is one of distributed Computing and is a super virtual computer formed by a group of loosely coupled computer sets. The terminal can independently operate to realize the invention, and can also access the network and realize the invention through the interaction operation with other terminals in the network. The network where the terminal is located includes, but is not limited to, the internet, a wide area network, a metropolitan area network, a local area network, a VPN network, and the like.
It should also be understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
In one embodiment of the invention there is also provided a storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
Those skilled in the art will appreciate that implementing all or part of the above described embodiment methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the above described embodiment methods. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include at least one of non-volatile and volatile memory.
It should be understood that as used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly supports the exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items. The foregoing embodiment of the present invention has been disclosed with reference to the number of embodiments for the purpose of description only, and does not represent the advantages or disadvantages of the embodiments.
It will be appreciated by persons skilled in the art that the foregoing discussion of any embodiment is merely exemplary and is not intended to imply that the scope of the disclosure of embodiments of the invention, including the claims, is limited to such examples, that technical features of the above embodiments or different embodiments may be combined and that many other variations of the different aspects of the embodiments of the invention as described above exist within the spirit of the embodiments of the invention, which are not provided in detail for clarity. Therefore, any omission, modification, equivalent replacement, improvement, etc. of the embodiments should be included in the protection scope of the embodiments of the present invention.

Claims (9)

1.一种DBSCAN算法的隐私保护方法,其特征在于,该方法包括:1. A privacy protection method for a DBSCAN algorithm, characterized in that the method comprises: 通过DH协议分别建立客户端与服务器A,B之间的密钥;Use the DH protocol to establish keys between the client and servers A and B respectively; 其中,设第l个客户端与服务器A之间的密钥为,与服务器B之间的密钥为Among them, let the key between the lth client and server A be , the key between server B is ; 客户端利用加法秘密共享,将数据点划分为两部分,分别使用对应的密钥加密后发送给对应的服务器A,BThe client uses additive secret sharing to divide the data point into two parts, encrypts them with the corresponding keys and sends them to the corresponding servers A and B ; 服务器A,B分别对接收到的数据点的秘密共享值解密,获得任意两点间的距离与M的大小;Servers A and B decrypt the secret shared values of the received data points respectively to obtain the distance between any two points and the size of M ; 服务器A通过两点间的距离与ε的大小关系,得到每一点的邻域内的点集;Server A obtains the value of each point by the relationship between the distance between two points and ε . The set of points in the neighborhood; 服务器A通过DBSCAN算法获得聚类结果;Server A obtains clustering results using the DBSCAN algorithm; 服务器A得到数据点的秘密共享值为,服务器B得到数据点的秘密共享值为;T为矩阵转置,所述获得任意两点间的距离与M的大小,包括如下步骤:Server A gets the data point and The secret sharing value is , server B gets the data point and The secret sharing value is ; T is the matrix transpose, and the method of obtaining the distance between any two points and the size of M includes the following steps: 服务器A计算,服务器B计算Server A calculates , server B calculates ; 通过乘法秘密共享,服务器分别得到数据,,且满足Through multiplicative secret sharing, the server Get data separately , ,and satisfy ; 服务器计算得到数据,服务器B计算得到数据,其中,ε为数据点的领域半径;server Calculate the data , server B calculates the data , where ε is the area radius of the data point; 服务器B将sign(V B)发送给服务器A,由服务器A初步判断ε的大小;Server B sends sign( V B ) to Server A , which makes a preliminary judgment and the size of ε ; 在服务器A,B之间采用方法比较的大小,得到ε的大小;Between servers A and B , the method Compare and The size of and the size of ε ; 其中,ε为数据点的领域半径,M为样本数量。Among them, ε is the domain radius of the data point, and M is the number of samples. 2.如权利要求1所述的DBSCAN算法的隐私保护方法,其特征在于,所述服务器B将sign(V B)发送给服务器A,由服务器A初步判断ε的大小,其包括以下几种情况:2. The privacy protection method of the DBSCAN algorithm as claimed in claim 1, characterized in that the server B sends sign( VB ) to the server A , and the server A preliminarily determines With the size of ε , it includes the following cases: V AV B均大于等于0,则得到If both VA and VB are greater than or equal to 0, we get ; V AV B均小于等于0,则得到If both VA and VB are less than or equal to 0, we get ; V A大于等于0和V B小于0、V A小于等于0和V B大于0,则在服务器A,B之间采用方法比较的大小,得到ε的大小。 If VA is greater than or equal to 0 and VB is less than 0, or VA is less than or equal to 0 and VB is greater than 0, then the method is used between servers A and B. Compare and The size of with the size of ε . 3.如权利要求1所述的DBSCAN算法的隐私保护方法,其特征在于,在服务器A,B之间采用方法比较的大小,得到ε的大小,其包括以下几种情况:3. The privacy protection method of the DBSCAN algorithm as claimed in claim 1, characterized in that a method is used between servers A and B Compare and The size of With the size of ε , it includes the following cases: 如果,则得到if , then we get , 如果,则得到if , then we get . 4.如权利要求1所述的DBSCAN算法的隐私保护方法,其特征在于,所述服务器A通过DBSCAN算法获得聚类结果,其包括:4. The privacy protection method of the DBSCAN algorithm according to claim 1, wherein the server A obtains the clustering result through the DBSCAN algorithm, which comprises: 输入:样本集,邻域参数(εM) ;Input: Sample set , neighborhood parameter ( ε , M ); 输出:类划分Output: Classification . 5.如权利要求1所述的DBSCAN算法的隐私保护方法,其特征在于,所述服务器A通过DBSCAN算法获得聚类结果,包括如下步骤:5. The privacy protection method of the DBSCAN algorithm according to claim 1, characterized in that the server A obtains the clustering result through the DBSCAN algorithm, comprising the following steps: S401、求出所有的核心点;S401, finding all core points; 如果某一点的邻域内点的个数大于M,这一点则为核心点;设所有核心点的集合为ΩIf a certain point If the number of points in the neighborhood is greater than M , then this point is a core point. Let the set of all core points be Ω ; S402、将所述核心点标记为未访问;S402, marking the core point as unvisited; S403、从Ω取出一个核心点,将此点标记为已访问,设置当前类别序号k,创建当前核心点的队列,创建当前类别点集S403. Take a core point from Ω , mark this point as visited, set the current category number k , and create a queue for the current core point , create the current category point set ; S404、从核心点队列Ω k中取出一个元素,取出后,获取所有在邻域内未访问的点集,未访问的核心点集;将邻域内的这些点都标记为已访问,将当前类别点集更新为,将这些点中的核心点集加入到核心点队列中;S404, take out an element from the core point queue Ω k , after taking out , get all of The set of points not visited in the neighborhood , the set of unvisited core points ;Will of These points in the neighborhood are marked as visited, and the current category point set is updated to , the core point set among these points Join the core point queue; S405、重复步骤S404,直到执行完步骤S404后,核心点队列Ω k为空;当前类别点集C k构成一类,从Ω中剔除标记为已访问的核心点;S405, repeat step S404 until after step S404 is executed, the core point queue Ω k is empty; the current category point set C k constitutes one category, and the core points marked as visited are removed from Ω ; S406、重复步骤S403-S405,直至Ω中的元素为空,得到类别划分为S406, repeat steps S403-S405 until the element in Ω is empty, and the category division is obtained. . 6.如权利要求1所述的DBSCAN算法的隐私保护方法,其特征在于,所述服务器A,B相互之间不串通。6. The privacy protection method of the DBSCAN algorithm as described in claim 1 is characterized in that the servers A and B do not collude with each other. 7.如权利要求1所述的DBSCAN算法的隐私保护方法,其特征在于,客户端数据,其中,n维向量,i的取值范围为1-m;m和n的取值均正整数。7. The privacy protection method of the DBSCAN algorithm as claimed in claim 1, characterized in that the client data ,in, , is an n- dimensional vector, the value range of i is 1-m; the values of m and n are both positive integers. 8.一种终端,包括存储器和处理器,所述存储器存储有计算机程序,所述处理器加载并执行所述计算机程序时实现如权利要求1-7任一项所述的DBSCAN算法的隐私保护方法的步骤。8. A terminal comprising a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the privacy protection method of the DBSCAN algorithm as described in any one of claims 1 to 7 when loading and executing the computer program. 9.一种存储介质,存储有计算机程序,所述计算机程序被处理器加载并执行时实现如权利要求1-7任一项所述的DBSCAN算法的隐私保护方法的步骤。9. A storage medium storing a computer program, wherein when the computer program is loaded and executed by a processor, the steps of the privacy protection method of the DBSCAN algorithm as described in any one of claims 1 to 7 are implemented.
CN202310080172.3A 2023-01-30 2023-01-30 Privacy protection method, terminal and storage medium of DBSCAN algorithm Active CN116094708B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310080172.3A CN116094708B (en) 2023-01-30 2023-01-30 Privacy protection method, terminal and storage medium of DBSCAN algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310080172.3A CN116094708B (en) 2023-01-30 2023-01-30 Privacy protection method, terminal and storage medium of DBSCAN algorithm

Publications (2)

Publication Number Publication Date
CN116094708A CN116094708A (en) 2023-05-09
CN116094708B true CN116094708B (en) 2025-07-15

Family

ID=86198971

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310080172.3A Active CN116094708B (en) 2023-01-30 2023-01-30 Privacy protection method, terminal and storage medium of DBSCAN algorithm

Country Status (1)

Country Link
CN (1) CN116094708B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117195017A (en) * 2023-09-07 2023-12-08 中国人民解放军国防科技大学 Privacy-preserving density clustering method, electronic devices and storage media
CN118410521B (en) * 2024-06-27 2024-09-13 山东云海国创云计算装备产业创新中心有限公司 Multi-client data privacy processing method, system, equipment, medium and product

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113919513A (en) * 2021-10-22 2022-01-11 全球能源互联网研究院有限公司南京分公司 Method and device for aggregating security of federated learning and electronic equipment
CN114154554A (en) * 2021-10-28 2022-03-08 上海海洋大学 Privacy protection outsourcing data KNN algorithm based on non-collusion double-cloud server

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6660319B2 (en) * 2017-02-03 2020-03-11 Kddi株式会社 Classification device, classification method and classification program
US10771488B2 (en) * 2018-04-10 2020-09-08 Cisco Technology, Inc. Spatio-temporal anomaly detection in computer networks using graph convolutional recurrent neural networks (GCRNNs)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113919513A (en) * 2021-10-22 2022-01-11 全球能源互联网研究院有限公司南京分公司 Method and device for aggregating security of federated learning and electronic equipment
CN114154554A (en) * 2021-10-28 2022-03-08 上海海洋大学 Privacy protection outsourcing data KNN algorithm based on non-collusion double-cloud server

Also Published As

Publication number Publication date
CN116094708A (en) 2023-05-09

Similar Documents

Publication Publication Date Title
CN114586313B (en) System and method for signing information
Liu et al. An efficient privacy-preserving outsourced calculation toolkit with multiple keys
RU2534944C2 (en) Method for secure communication in network, communication device, network and computer programme therefor
CN115510502B (en) PCA method and system for privacy protection
CN111162906B (en) Collaborative secret sharing method, device, system and medium based on vast transmission algorithm
JP2020052393A (en) Post-quantum asymmetric key encryption system with one-to-many distributed key management based on double encapsulation of prime modulo
WO2018184407A1 (en) K-means clustering method and system having privacy protection
CN116681141A (en) Privacy-protected federated learning method, terminal and storage medium
CN116094708B (en) Privacy protection method, terminal and storage medium of DBSCAN algorithm
Erkin et al. Privacy-preserving distributed clustering
WO2017041669A1 (en) Password based key exchange from ring learning with er-rors
US10630476B1 (en) Obtaining keys from broadcasters in supersingular isogeny-based cryptosystems
US20240396735A1 (en) Round optimal oblivious transfers from isogenies
CN116361649A (en) An Efficient Unbalanced PSI Based on Bloom Filter and Hash
US20240333492A1 (en) Statistically private oblivious transfer from cdh
CN112953700B (en) A method, system and storage medium for improving the efficiency of secure multi-party computing
CN107852324B (en) Method and encryption node for encrypting messages
Battarbee et al. Cryptanalysis of semidirect product key exchange using matrices over non-commutative rings
US20240330485A1 (en) Third-party private set intersection with communication and computational efficiency
CN117273156A (en) A privacy set intersection method based on error learning problem
Patel et al. Comparative evaluation of elliptic curve cryptography based homomorphic encryption schemes for a novel secure multiparty computation
CN114564730B (en) Method, device and medium for calculating federated group statistics based on symmetric encryption
Bouamama et al. EdgeSA: Secure Aggregation for Privacy-Preserving Federated Learning in Edge Computing
Gong et al. Nearly optimal protocols for computing multi-party private set union
CN118540125A (en) Privacy set intersection method, system and device based on cloud assistance and threshold homomorphic encryption

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant