WO2025061284A1 - Devices and methods for providing privacy-preserving user data for recommendation system - Google Patents
Devices and methods for providing privacy-preserving user data for recommendation system
- Publication number
- WO2025061284A1 (PCT/EP2023/076020)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user data
- ldp
- client device
- privacy
- ldp mechanism
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
- G06F21/6254—Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
Definitions
- the present disclosure relates to digital security, in particular privacy of user data. More specifically, the present disclosure relates to devices and methods for providing privacy-preserving user data to a server, in particular a recommendation server configured to generate one or more recommendations based on the privacy-preserving user data.
- a recommendation system (also referred to as a recommender system) is an online information filtering system that usually provides suggestions for items, i.e. recommendations that are most pertinent to a particular user. Typically, the suggestions refer to various decision-making processes, such as what product to purchase, what music to listen to, or what online news to read. Recommendation systems are particularly useful when an individual needs to choose an item from a potentially overwhelming number of items that a service may offer. For determining the most suitable recommendations, for instance, online advertisements for a user, a recommendation system usually requires user data, such as a search history, a history of previously clicked advertisements, user location, user age, user gender, and the like, resulting in a substantial risk that this user data may get compromised, i.e. leaked during one or more of the different stages of the recommendation process.
- ML machine learning
- neural networks for generating the most suitable recommendations based on the user data, which have been trained with a training dataset of user data.
- AI artificial intelligence
- the suggested solutions focus on preventing a potential attacker from inferring whether the data of a specific user was used or not to train the recommendation system, e.g., by using differentially private training of the ML model.
- each user by means of his client device sends a request, for instance, for an advertisement to an advertisement recommendation server, where the request includes information about the user, i.e. user data.
- a potential attacker, e.g., the server itself or an external attacker, may then infer private information about the user from the sent information.
- a client device for using a service, in particular a recommendation service, provided by a server, in particular a recommendation server, based on user data associated with a user of the client device.
- the client device may be a user equipment, UE.
- the client device is configured to apply a first local differential privacy, LDP, mechanism and a second LDP mechanism to the user data for obtaining privacy-preserving user data, wherein the first LDP mechanism is a randomized selection-based LDP mechanism and wherein the second LDP mechanism is an LDP mechanism configured to add noise to the input data of the LDP mechanism.
- the client device is configured to transmit the privacy-preserving user data to the server.
- the client device allows: (a) preventing the inference of private user data (sometimes also referred to as attributes) based on the privacy-preserving user data provided to the recommendation server; (b) preventing the inference of private user data based on the server-side information, e.g., chosen recommendation(s) or the recommendation-user matching scores; (c) maintaining a high utility, i.e. providing a meaningful response, in particular recommendation based on the user data; and (d) fulfilling the hardware and user experience requirements with an acceptable speed, bandwidth and memory footprint.
- the server is a recommendation server
- the client device is configured, in response to transmitting the privacy-preserving user data to the recommendation server, to receive one or more recommendations based on the privacy-preserving user data from the recommendation server.
- the client device according to the first aspect may receive recommendations based on its private user data without the risk of the private user data being leaked.
- the client device comprises a display configured to display the one or more recommendations received from the recommendation server to the user of the client device.
- the client device according to the first aspect may efficiently inform the user about the one or more recommendations received from the recommendation server.
- for applying the first LDP mechanism and the second LDP mechanism to the user data, the client device is configured to first apply the first LDP mechanism to the user data and then apply the second LDP mechanism to the output of the first LDP mechanism for obtaining the privacy-preserving user data.
- the client device is configured to apply first the randomized selection-based first LDP mechanism and subsequently the noise-adding second LDP mechanism to the private user data.
- for applying the first LDP mechanism and the second LDP mechanism to the user data, the client device is configured to first apply the second LDP mechanism to the user data and then apply the first LDP mechanism to the output of the second LDP mechanism for obtaining the privacy-preserving user data.
- the client device is configured to apply first the noise-adding second LDP mechanism and subsequently the randomized selection-based first LDP mechanism to the private user data.
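The two orderings described above can be sketched as a simple function composition. This is only an illustration under stated assumptions: the names `protect`, `selection` and `noise` are hypothetical, and the toy stand-in mechanisms are deterministic placeholders chosen to make the ordering visible; they are not LDP mechanisms and do not appear in the source.

```python
from typing import Callable, List

Vector = List[float]
Mechanism = Callable[[Vector], Vector]


def protect(user_data: Vector,
            selection: Mechanism,
            noise: Mechanism,
            selection_first: bool = True) -> Vector:
    """Chain the two mechanisms in one of the two described orders.

    selection: stands in for the first (randomized selection-based) mechanism.
    noise:     stands in for the second (noise-adding) mechanism.
    """
    if selection_first:
        return noise(selection(user_data))
    return selection(noise(user_data))


# Deterministic toy stand-ins (NOT private), used only to show the order.
def toy_selection(v: Vector) -> Vector:
    return [2.0 * x for x in v]


def toy_noise(v: Vector) -> Vector:
    return [x + 1.0 for x in v]


a = protect([1.0], toy_selection, toy_noise, selection_first=True)   # (1 * 2) + 1
b = protect([1.0], toy_selection, toy_noise, selection_first=False)  # (1 + 1) * 2
```

The differing results for `a` and `b` show only that the order of application is a real design choice; which order is preferable depends on the concrete mechanisms.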
- the randomized selection-based LDP mechanism, i.e. the first LDP mechanism, is configured to output, with a defined probability, either the user data or randomly sampled generic user data.
- the randomized selection-based LDP mechanism is a Generalized Randomized Response, GRR, mechanism.
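For orientation, the textbook form of Generalized Randomized Response over a finite domain of k values keeps the true value with probability e^ε / (e^ε + k − 1) and otherwise reports one of the other values uniformly at random. The sketch below implements that textbook form; the source only names GRR, so the parameter `epsilon`, the finite `domain`, and the function names are assumptions made for illustration.

```python
import math
import random
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def grr_keep_probability(epsilon: float, domain_size: int) -> float:
    """Probability of reporting the true value in textbook GRR."""
    if domain_size < 2:
        raise ValueError("GRR needs a domain with at least two values")
    e = math.exp(epsilon)
    return e / (e + domain_size - 1)


def grr(value: T, domain: Sequence[T], epsilon: float,
        rng: Optional[random.Random] = None) -> T:
    """Report `value` truthfully with the GRR keep probability,
    otherwise report a different domain value chosen uniformly."""
    if value not in domain:
        raise ValueError("value must belong to the domain")
    rng = rng or random.Random()
    if rng.random() < grr_keep_probability(epsilon, len(domain)):
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)
```

A smaller `epsilon` lowers the keep probability and therefore strengthens the privacy guarantee at the cost of utility.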
- the second LDP mechanism for adding noise is a Laplace LDP mechanism or a Gaussian LDP mechanism.
- the client device according to the first aspect may efficiently implement the noise adding second LDP mechanism in the form of a Laplace or Gaussian LDP mechanism.
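A Laplace-based noise-adding mechanism for a numeric vector could look roughly as follows. This is a sketch under assumptions that are not in the source: the vector is first clipped to an L1 norm of at most `l1_bound` (so that any two inputs differ by at most `2 * l1_bound` in L1 distance), and the noise scale is set to that sensitivity divided by a privacy parameter `epsilon`. All function and parameter names are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def clip_l1(vec: Sequence[float], bound: float) -> List[float]:
    """Scale the vector down so that its L1 norm does not exceed `bound`."""
    norm = sum(abs(x) for x in vec)
    if norm <= bound or norm == 0.0:
        return list(vec)
    factor = bound / norm
    return [x * factor for x in vec]


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Laplace(0, scale) sample as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_ldp(vec: Sequence[float], epsilon: float, l1_bound: float = 1.0,
                rng: Optional[random.Random] = None) -> List[float]:
    """Clip the input, then add independent Laplace noise to every coordinate."""
    rng = rng or random.Random()
    clipped = clip_l1(vec, l1_bound)
    scale = 2.0 * l1_bound / epsilon  # assumed L1 sensitivity divided by epsilon
    return [x + laplace_noise(scale, rng) for x in clipped]
```

A Gaussian variant would replace the noise sampler and calibrate to an L2 bound instead; it is not shown here.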
- the client device is configured to implement, i.e. operate, a machine learning, ML, model, wherein the ML model is configured to generate the user data as a user data embedding based on raw user data.
- the ML model for generating the user data embedding, for instance in the form of a vector, may be a neural network.
- the randomized selection-based LDP mechanism, i.e. the first LDP mechanism, is configured to output, with a defined probability, either the user data embedding based on the raw user data or a generic user data embedding randomly sampled from a distribution of simulated user data embeddings.
- the client device is configured to implement, i.e. operate, a further machine learning, ML, model, wherein the further ML model is configured to generate the plurality of simulated user data embeddings of the distribution of simulated user data embeddings.
- the further ML model for generating the distribution of simulated user data embeddings may be a further neural network.
- the further ML model of the client device according to the first aspect may generate a distribution of simulated user data embeddings being indistinguishable from an embedding of the user data of a real user.
- the user data or the raw user data may comprise one or more of the following: a name; an age; an address; a gender; a search history; an application usage by the user; a browser history; and/or information about online advertisements previously selected, i.e. clicked by the user.
- a method is provided for using a service, in particular a recommendation service, provided by a server, in particular a recommendation server, based on user data associated with a user of a client device.
- the method comprises the steps of applying a first local differential privacy, LDP, mechanism and a second LDP mechanism to the user data for obtaining privacy-preserving user data, wherein the first LDP mechanism is a randomized selection-based LDP mechanism and wherein the second LDP mechanism is an LDP mechanism configured to add noise to the input data of the LDP mechanism; and transmitting the privacy-preserving user data to the server.
- the server is a recommendation server and the method according to the second aspect further comprises, in response to transmitting the privacy-preserving user data to the recommendation server, receiving one or more recommendations based on the privacy-preserving user data from the recommendation server.
- the step of applying the first LDP and the second LDP to the user data comprises first applying the first LDP to the user data and then applying the second LDP to the output of the first LDP for obtaining the privacy-preserving user data.
- the step of applying the first LDP and the second LDP to the user data comprises first applying the second LDP to the user data and then applying the first LDP to the output of the second LDP for obtaining the privacy-preserving user data.
- the method according to the second aspect of the present disclosure can be performed by the client device according to the first aspect of the present disclosure.
- further features of the method according to the second aspect of the present disclosure result directly from the functionality of the client device according to the first aspect of the present disclosure as well as its different implementation forms described above and below.
- a computer program product comprising a computer-readable storage medium for storing program code which causes a computer or a processor to perform the method according to the second aspect when the program code is executed by the computer or the processor.
- Fig. 1 shows a schematic diagram illustrating a recommendation system including a plurality of client devices according to an embodiment and a recommendation server providing a recommendation service to the plurality of client devices;
- Fig. 2 shows a schematic diagram illustrating in more detail the interaction between a conventional client device and a recommendation server for using a recommendation service;
- Fig. 3 shows a schematic diagram illustrating in more detail the interaction between a client device according to an embodiment and a recommendation server for using a recommendation service;
- Fig. 4 shows a flow diagram illustrating processing steps of a method of operating a client device according to an embodiment.
- a disclosure in connection with a described method may also hold true for a corresponding device or system configured to perform the method and vice versa.
- a corresponding device may include one or a plurality of units, e.g. functional units, to perform the described one or plurality of method steps (e.g. one unit performing the one or plurality of steps, or a plurality of units each performing one or more of the plurality of steps), even if such one or more units are not explicitly described or illustrated in the figures.
- a specific apparatus is described based on one or a plurality of units, e.g.
- a corresponding method may include one step to perform the functionality of the one or plurality of units (e.g. one step performing the functionality of the one or plurality of units, or a plurality of steps each performing the functionality of one or more of the plurality of units), even if such one or plurality of steps are not explicitly described or illustrated in the figures. Further, it is understood that the features of the various exemplary embodiments and/or aspects described herein may be combined with each other, unless specifically noted otherwise.
- Figure 1 shows a schematic diagram illustrating a recommendation system 100 comprising a recommendation server 140 (also referred to as a recommender server 140) and a plurality of client devices 120 using a recommendation service provided by the recommendation server 140.
- the recommendation server 140 may be implemented, for instance, as a cloud server 140 configured to communicate with the plurality of client devices 120 via a communication network, such as the Internet.
- the plurality of client devices 120 may comprise, for instance, smartphones, tablet computers, laptop computers, notebook computers, desktop computers, smart cars, smart TVs or other communication devices capable of using an online recommendation service.
- the recommendation system 100 may be configured to recommend online advertisements, online products, music, videos/movies, video games, online news, search results for online searches and the like.
- each client device 120 may comprise processing circuitry 121, e.g. one or more processors 121, a communication interface 123 and/or a memory 125.
- the processing circuitry 121 may be implemented in hardware and/or software and may comprise digital circuitry, or both analog and digital circuitry.
- Digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or general-purpose processors.
- the communication interface 123 may be configured to communicate with the recommendation server 140 via wired and/or wireless connections via a wired and/or wireless communication network, such as the Internet.
- the memory 125 of the client device 120 may be configured to store executable program code which, when executed by the processing circuitry 121, causes the client device 120 to perform the functions and methods described herein. As illustrated in figure 1 and as will be described in more detail below, in an embodiment the memory 125 of the client device 120 may store user data 120d, i.e. data that is private to the user.
- the private user data may comprise, for instance, the name of the user; the age of the user; the address of the user; the gender of the user; an online search history of the user; an application usage by the user; a browser history of the user; and/or information about online advertisements previously selected by the user.
- the recommendation server 140 may comprise processing circuitry 141, e.g. one or more processors 141, a communication interface 143 and/or a memory 145.
- the processing circuitry 141 may be implemented in hardware and/or software and may comprise digital circuitry, or both analog and digital circuitry.
- Digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or general-purpose processors.
- the communication interface 143 may be configured to communicate with the plurality of client devices 120 via wired and/or wireless connections via a wired and/or wireless communication network, such as the Internet.
- the memory 145 of the recommendation server 140 may be configured to store executable program code which, when executed by the processing circuitry 141, causes the recommendation server 140 to perform the functions and methods described herein.
- the recommendation system 100 processes user data for generating one or more recommendations in a privacy-preserving and computationally inexpensive manner.
- Figure 2 shows a schematic diagram illustrating in more detail the interaction between a conventional client device 20 and a recommendation server 40, in particular an advertisement recommendation server 40 for implementing a conventional recommendation system 10, in particular an advertisement recommendation system 10.
- the client device 20 comprises an ML model 21 configured and trained to generate, based on user data 20d, a user data embedding 30, e.g. a user data feature vector 30, and to provide the user data embedding 30 to the recommendation server 40.
- the user data embedding 30 may be, for instance, part of a request for a recommendation, in particular an advertisement recommendation, from the recommendation server 40 and may be sent, for instance, via the Internet to the recommendation server 40.
- a first attacker device 50 may try to intercept the user data embedding 30 and to retrieve the original user data 20d based on an attribute inference model 51 and thereby reveal private user data, such as the age, the gender, the search history, or the previously clicked advertisements of the user of the client device 20.
- a CTR prediction ML model 44 of the recommendation server 40 takes as input data the user data embedding 30, e.g. the user data feature vector 30, provided by the client device 20 and an advertisement embedding 43 generated by an advertisement embedding ML model 42 of the recommendation server 40 based on advertisement data 41, such as advertisement category, product price, and the like. Based on this input data the CTR prediction ML model
- a second attacker device 60 may try to retrieve the advertisement embedding 43 and/or the advertisement recommendations 45 and to generate on the basis of this data the original user data 20d and thereby reveal private user data, such as the age, gender, the search history, the previously clicked advertisements of the user of the client device 20.
- the conventional recommendation system 10 of figure 2 has three potential weak points that an attacker might try to exploit for gaining access to private user data.
- an attacker can infer private user data by using (a) the user data embedding 30 (see attacker device 50 in figure 2), (b) the advertisement embeddings 43 and the corresponding matching scores 45 (see attacker device 60 in figure 2), and (c) a combination of (a) and (b), i.e., the user data embedding 30 and the advertisement embeddings 43 as well as the matching scores 45.
- embodiments of the recommendation system 100 shown in figure 1 fulfill the following requirements for protecting private user data as will be described in more detail in the following: (a) prevent the inference of private user data (sometimes also referred to as attributes) based on the user information provided to the recommendation server; (b) prevent the inference of private user data based on the server-side information, e.g., the chosen recommendation(s) or the recommendation-user matching scores; (c) maintain a high utility, i.e. the recommendations determined by the recommendation server 140 based on the user data are meaningful and of high quality for the user; and (d) fulfill the hardware and user experience requirements, i.e., the solution speed, bandwidth and memory footprint are acceptable.
- Figure 3 shows a schematic diagram illustrating in more detail the interaction between one of the plurality of client devices 120 according to an embodiment and the recommendation server 140, in particular advertisement recommendation server 140 for implementing a recommendation system 100, in particular an advertisement recommendation system 100 according to an embodiment.
- the client device 120 is configured, for instance by means of its processing circuitry 120a, to apply a first local differential privacy, LDP, mechanism 122 (referred to as LDP 1 in figure 3) and a second LDP mechanism 126 (referred to as LDP 2 in figure 3) to the user data 120d for obtaining privacy-preserving user data 130 (referred to as protected user embedding in figure 3), wherein the first LDP mechanism 122 is a randomized selection-based LDP mechanism 122 and wherein the second LDP mechanism 126 is an LDP mechanism 126 configured to add noise to the input data of the LDP mechanism 126.
- the client device 120 is further configured to transmit the privacy-preserving user data 130 to the recommendation server 140, for instance, as part of a request for recommended advertisements to be displayed on a display of the client device 120.
- the user data 120d may comprise, for instance, one or more of the following: a name; an age; an address; a gender; a search history; an application usage by the user; a browser history; and/or information about online advertisements previously selected by the user.
- the client device 120 is configured to first apply the first LDP mechanism 122 to the user data 120d and then apply the second LDP mechanism 126 to the output of the first LDP mechanism 122 for obtaining the privacy-preserving user data 130.
- the client device 120 may be configured to first apply the second LDP mechanism 126 to the user data 120d and then apply the first LDP mechanism 122 to the output of the second LDP mechanism 126 for obtaining the privacy-preserving user data 130.
- the randomized selection-based first LDP mechanism 122 is configured to output with a defined probability either the user data embedding 124 (or alternatively the user data 120d if no user embedding ML model 121 is employed) or randomly sampled user data, for instance, the generic, i.e. synthetic user data embedding 125 shown in figure 3 randomly sampled from a distribution of simulated user data embeddings.
- the randomized selection-based first LDP mechanism 122 is a Generalized Randomized Response, GRR, mechanism 122.
- the client device 120 may implement a further ML model 123 (referred to as the synthetic embedding generator 123 in figure 3). Since the user of the client device 120 does not have access to embeddings of other users, the further ML model 123 generates synthetic user embeddings 125 to play the role of the randomly sampled embeddings. In an embodiment, the synthetic user embeddings 125 follow the distribution of the true user embeddings 124, and therefore are indistinguishable from the true ones (from the perspective of the attacker 150, 160 or the server 140).
- the second LDP mechanism 126 for adding noise is a Laplace LDP mechanism or a Gaussian LDP mechanism.
- the privacy-preserving user data 130 may be, for instance, part of a request for a recommendation, in particular an advertisement recommendation, from the recommendation server 140 and may be sent, for instance, via the Internet to the recommendation server 140.
- a second attacker device 160 may retrieve the advertisement embedding 143 and/or the advertisement recommendations 145, but will not be able to retrieve the original user data 120d based on an attribute inference model 161 due to the sequential application of the first and the second LDP mechanism 122, 126 used by the client device 120 for generating the privacy-preserving user data 130 (which is used by the recommendation server 140 for generating the advertisement recommendations 145).
- the client device 120 protects against attribute inference based on the advertisement information 143, 145 (which may be obtained by the attacker device 160) and, thus, fulfills requirement (b) mentioned above. This is because the server 140 (or the attacker device 160) cannot infer private user data of the user with a high confidence since the received privacy-preserving user data 130 may be with a certain probability a synthetically generated user embedding 125.
- the sequential application of the first and the second LDP mechanism 122, 126 still allows the recommendation server 140 to provide meaningful recommendations, in particular advertisements based on the privacy-preserving user data 130 (thus, fulfilling requirement (c) mentioned above) and may be implemented in a computationally efficient (small computational overhead) and flexible manner with a low memory footprint and, thus, also fulfills requirement (d) mentioned above.
- the sequential application of the two LDP mechanisms 122, 126 results in a composite LDP mechanism, whose privacy budget parameter is a combination of the privacy budget parameters of the two LDP mechanisms 122, 126, i.e. a combination of the first privacy parameter and the second privacy parameter.
- the sequential application of the two LDP mechanisms 122, 126 as part of the same query yields a higher privacy protection (mathematically corresponding to a lower compound epsilon) than the application of only one of the LDP mechanisms 122, 126.
- the sequential application of the first and second LDP mechanism implemented by the client device 120 is different from the case where two different LDP mechanisms are applied as part of two separate queries (in this case the privacy protection is lower than applying a single LDP mechanism as part of a single query).
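The difference between chaining two mechanisms within one query and issuing two separate queries can be illustrated with two standard differential privacy arguments. The source does not state a formula for the compound epsilon, so the following is a generic sketch rather than the disclosed parameter combination. Assumed notation, not taken from the source: M_1 satisfies pure ε_1-LDP, M_2 satisfies pure ε_2-LDP on every possible input, x and x' are any two user inputs, and S is any set of outputs (for continuous outputs the sums become integrals). A Gaussian mechanism satisfies only an approximate (ε, δ) guarantee and is not covered by this sketch.

```latex
% (1) Post-processing: the chain inherits the guarantee of M_1.
\Pr[M_2(M_1(x)) \in S]
  = \sum_{y} \Pr[M_1(x) = y]\,\Pr[M_2(y) \in S]
  \le e^{\varepsilon_1} \sum_{y} \Pr[M_1(x') = y]\,\Pr[M_2(y) \in S]
  = e^{\varepsilon_1}\,\Pr[M_2(M_1(x')) \in S]

% (2) The chain output is a mixture of outputs of M_2, so it also inherits the guarantee of M_2.
\Pr[M_2(M_1(x)) \in S]
  \le \sup_{y} \Pr[M_2(y) \in S]
  \le e^{\varepsilon_2} \inf_{y'} \Pr[M_2(y') \in S]
  \le e^{\varepsilon_2}\,\Pr[M_2(M_1(x')) \in S]

% One query, mechanisms chained:
\varepsilon_{\mathrm{chain}} \le \min(\varepsilon_1, \varepsilon_2)

% Two separate queries on the same user data (basic sequential composition):
\varepsilon_{\mathrm{total}} \le \varepsilon_1 + \varepsilon_2
```

These generic bounds only show that the chained mechanism is never weaker than either mechanism alone, whereas two separate releases consume both budgets. Whether the compound epsilon is strictly lower, as stated above, depends on the concrete mechanisms and is not derived here.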
- the simulated user embeddings may be determined by a different entity and stored on the client device 120, in particular in its memory 120c (instead of using the synthetic data generator model 123).
- the client device 120 may retrieve a sample from the stored simulated user embeddings each time the outcome of the first LDP mechanism 122 is to send a simulated user embedding to the recommendation server 140.
- the simulated user embeddings should be indistinguishable from the true embeddings from the perspective of the recommendation server 140.
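The variant with pre-computed simulated embeddings stored on the client device might be organized as in the following sketch. The class name `SimulatedEmbeddingPool`, its methods, and the `keep_probability` parameter are hypothetical; the source does not specify how the stored embeddings are represented or how the selection probability is chosen.

```python
import random
from typing import List, Optional, Sequence

Vector = List[float]


class SimulatedEmbeddingPool:
    """Holds simulated user embeddings that were generated elsewhere
    and stored on the client device."""

    def __init__(self, embeddings: Sequence[Sequence[float]],
                 seed: Optional[int] = None) -> None:
        if not embeddings:
            raise ValueError("the pool must contain at least one embedding")
        self._pool = [list(e) for e in embeddings]
        self._rng = random.Random(seed)

    def sample(self) -> Vector:
        """Retrieve one stored simulated embedding uniformly at random."""
        return list(self._rng.choice(self._pool))

    def select(self, true_embedding: Sequence[float],
               keep_probability: float) -> Vector:
        """Return the true embedding with `keep_probability`,
        otherwise a sample from the stored simulated embeddings."""
        if not 0.0 <= keep_probability <= 1.0:
            raise ValueError("keep_probability must lie in [0, 1]")
        if self._rng.random() < keep_probability:
            return list(true_embedding)
        return self.sample()
```

The privacy argument rests on the stored embeddings being indistinguishable from true ones; the sketch does not check that property.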
- Figure 4 shows a flow diagram illustrating processing steps of a method 400 for using a service provided by the server 140 based on user data 120d or a user data embedding 124 associated with a user of the client device 120.
- the method 400 comprises a step 401 of applying a first LDP mechanism 122 and a second LDP mechanism 126 to the user data 120d or the user data embedding 124 for obtaining privacy-preserving user data 130.
- the first LDP mechanism 122 is a randomized selection-based LDP mechanism and the second LDP mechanism 126 is an LDP mechanism configured to add noise to the input data of the LDP mechanism.
- the method 400 comprises a step 403 of transmitting the privacy-preserving user data to the server.
- the server 140 is a recommendation server 140 and the method 400 further comprises, in response to the step 403 of transmitting the privacy-preserving user data 130 to the recommendation server 140, receiving one or more recommendations 145 based on the privacy-preserving user data 130 from the recommendation server 140.
- the step 401 of applying the first LDP mechanism 122 and the second LDP mechanism 126 to the user data 120d or the user data embedding 124 comprises first applying the first LDP mechanism 122 to the user data 120d or the user data embedding 124 and then applying the second LDP mechanism 126 to the output of the first LDP mechanism 122 for obtaining the privacy-preserving user data 130.
- the step 401 of applying the first LDP mechanism 122 and the second LDP mechanism 126 to the user data 120d or the user data embedding 124 comprises first applying the second LDP mechanism 126 to the user data 120d or the user data embedding 124 and then applying the first LDP mechanism 122 to the output of the second LDP mechanism 126 for obtaining the privacy-preserving user data 130.
- the method 400 shown in figure 4 can be performed by the client device 120 according to different embodiments described above. Thus, further features of the method 400 result directly from the functionality of the client device 120 as well as its different embodiments described above and below.
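The flow of method 400 can be summarized in a few lines. The sketch assumes that the combined LDP step and the network transport are supplied as callables; `method_400`, `apply_ldp`, `transmit`, and the fake server below are hypothetical names used only for illustration.

```python
from typing import Callable, List

Vector = List[float]


def method_400(user_data: Vector,
               apply_ldp: Callable[[Vector], Vector],
               transmit: Callable[[Vector], List[str]]) -> List[str]:
    """Step 401: protect the user data on the client device.
    Step 403: transmit only the protected data and return the
    recommendations received in response."""
    protected = apply_ldp(user_data)   # step 401
    return transmit(protected)         # step 403 and reception of the response


# Stand-ins for illustration: a deterministic placeholder for the LDP step
# (NOT private) and a fake server that records what it receives.
seen_by_server: List[Vector] = []


def fake_transmit(protected: Vector) -> List[str]:
    seen_by_server.append(protected)
    return ["ad-1"]


recommendations = method_400([0.5, 0.25],
                             lambda v: [x + 1.0 for x in v],
                             fake_transmit)
```

The point of the structure is that `transmit` only ever receives the output of `apply_ldp`, never the raw user data.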
- the disclosed system, apparatus, and method may be implemented in other manners.
- the described embodiment of an apparatus is merely exemplary.
- the unit division is merely logical function division and may be another division in an actual implementation.
- a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed.
- the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented by using some interfaces.
- the indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.
- the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- functional units in the embodiments disclosed herein may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Bioethics (AREA)
- Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Strategic Management (AREA)
- Finance (AREA)
- Data Mining & Analysis (AREA)
- General Business, Economics & Management (AREA)
- Artificial Intelligence (AREA)
- Game Theory and Decision Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Economics (AREA)
- Marketing (AREA)
- Entrepreneurship & Innovation (AREA)
- Mathematical Physics (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
A client device (120) for using a service, in particular a recommendation service, provided by a server (140) based on user data (120d) associated with a user of the client device (120) is disclosed. The client device (120) is configured to apply a first local differential privacy, LDP, mechanism and a second LDP mechanism to the user data (120d) for obtaining privacy-preserving user data, wherein the first LDP mechanism is a randomized selection-based LDP mechanism and wherein the second LDP mechanism is an LDP mechanism configured to add noise to the input data of the LDP mechanism. Moreover, the client device (120) is configured to transmit the privacy-preserving user data to the server (140).
Description
DEVICES AND METHODS FOR PROVIDING PRIVACY-PRESERVING USER DATA FOR RECOMMENDATION SYSTEM
TECHNICAL FIELD
The present disclosure relates to digital security, in particular privacy of user data. More specifically, the present disclosure relates to devices and methods for providing privacy-preserving user data to a server, in particular a recommendation server configured to generate one or more recommendations based on the privacy-preserving user data.
BACKGROUND
A recommendation system (also referred to as a recommender system) is an online information filtering system that usually provides suggestions for items, i.e. recommendations that are most pertinent to a particular user. Typically, the suggestions refer to various decision-making processes, such as what product to purchase, what music to listen to, or what online news to read. Recommendation systems are particularly useful when an individual needs to choose an item from a potentially overwhelming number of items that a service may offer. For determining the most suitable recommendations, for instance, online advertisements for a user, a recommendation system usually requires user data, such as a search history, a history of previously clicked advertisements, user location, user age, user gender, and the like, resulting in a substantial risk that this user data may get compromised, i.e. leaked during one or more of the different stages of the recommendation process.
Often recommendation systems employ artificial intelligence (AI), i.e. machine learning (ML) models, such as neural networks, which have been trained with a training dataset of user data, for generating the most suitable recommendations based on the user data. There have been some suggestions for ensuring the privacy of the training dataset, i.e., data of previous users that has been used to train the recommendation system (e.g., collected from users who consented to the processing of their data). The suggested solutions focus on preventing a potential attacker from inferring whether or not the data of a specific user was used to train the recommendation system, e.g., by using differentially private training of the ML model. Fewer solutions have been suggested for ensuring the privacy at the inference stage, i.e., when the trained ML model is deployed and serves new users. During the inference stage, each user sends, by means of a client device, a request, for instance for an advertisement, to an advertisement recommendation server, where the request includes information about the user, i.e. user data. A potential attacker, e.g., the server itself or an external attacker, may then reveal/infer private information about the user by using the sent information.
Thus, there is a need for privacy-preserving solutions for recommendation systems, in particular the inference stage of a recommendation system.
SUMMARY
It is an objective to provide improved devices and methods for providing privacy-preserving user data to a server, in particular a recommendation server configured to generate one or more recommendations based on the privacy-preserving user data.
The foregoing and other objectives are achieved by the subject matter of the independent claims. Further implementation forms are apparent from the dependent claims, the description and the figures.
According to a first aspect, a client device is provided for using a service, in particular a recommendation service, provided by a server, in particular a recommendation server, based on user data associated with a user of the client device. In an implementation form the client device may be a user equipment, UE. The client device is configured to apply a first local differential privacy, LDP, mechanism and a second LDP mechanism to the user data for obtaining privacy-preserving user data, wherein the first LDP mechanism is a randomized selection-based LDP mechanism and wherein the second LDP mechanism is an LDP mechanism configured to add noise to the input data of the LDP mechanism. Moreover, the client device is configured to transmit the privacy-preserving user data to the server.
The client device according to the first aspect and its different implementation forms described below allow: (a) preventing the inference of private user data (sometimes also referred to as attributes) based on the privacy-preserving user data provided to the recommendation server; (b) preventing the inference of private user data based on the server-side information, e.g., chosen recommendation(s) or the recommendation-user matching scores; (c) maintaining a high utility, i.e. providing a meaningful response, in particular a recommendation, based on the user data; and (d) fulfilling the hardware and user experience requirements with an acceptable speed, bandwidth and memory footprint.
In a further possible implementation form, the server is a recommendation server, and the client device is configured, in response to transmitting the privacy-preserving user data to the recommendation server, to receive one or more recommendations based on the privacy-preserving user data from the recommendation server. Thus, the client device according to the first aspect may receive recommendations based on its private user data without the risk of the private user data being leaked.
In a further possible implementation form, the client device comprises a display configured to display the one or more recommendations received from the recommendation server to the user of the client device. Thus, the client device according to the first aspect may efficiently inform the user about the one or more recommendations received from the recommendation server.
In a further possible implementation form, for applying the first LDP mechanism and the second LDP mechanism to the user data, the client device is configured to first apply the first LDP mechanism to the user data and then apply the second LDP mechanism to the output of the first LDP mechanism for obtaining the privacy-preserving user data. In other words, in this implementation form, the client device is configured to apply first the randomized selection-based first LDP mechanism and subsequently the noise-adding second LDP mechanism to the private user data.
Alternatively, in a further possible implementation form, for applying the first LDP mechanism and the second LDP mechanism to the user data, the client device is configured to first apply the second LDP mechanism to the user data and then apply the first LDP mechanism to the output of the second LDP mechanism for obtaining the privacy-preserving user data. In other words, in this implementation form, the client device is configured to apply first the noise-adding second LDP mechanism and subsequently the randomized selection-based first LDP mechanism to the private user data.
In a further possible implementation form, the randomized selection-based LDP mechanism, i.e. the first LDP mechanism, is configured to output with a defined probability either the user data or randomly sampled generic user data. In an implementation form, the randomized selection-based LDP mechanism is a Generalized Randomized Response, GRR, mechanism. Thus, the client device according to the first aspect may efficiently implement the randomized selection-based first LDP mechanism in the form of a GRR mechanism.
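The randomized selection described above can be illustrated by the following minimal sketch. It is not taken from the disclosure: the function name `randomized_select`, the parameter `keep_prob` and the uniform sampling from a finite pool of generic items are assumptions made purely for illustration.

```python
import random
from typing import Sequence, TypeVar

T = TypeVar("T")


def randomized_select(true_item: T, generic_pool: Sequence[T],
                      keep_prob: float, rng: random.Random) -> T:
    """Return true_item with probability keep_prob, otherwise a generic item.

    Hypothetical helper: the generic item is drawn uniformly from
    generic_pool, which is an assumption of this sketch.
    """
    if not 0.0 <= keep_prob <= 1.0:
        raise ValueError("keep_prob must lie in [0, 1]")
    if not generic_pool:
        raise ValueError("generic_pool must not be empty")
    if rng.random() < keep_prob:
        return true_item
    return rng.choice(generic_pool)


# Usage: over many draws the true item appears in roughly keep_prob of cases,
# so a single observed output does not reveal whether it is the true item.
rng = random.Random(0)
pool = ["generic-a", "generic-b", "generic-c"]
outputs = [randomized_select("true", pool, 0.7, rng) for _ in range(10_000)]
print(outputs.count("true") / len(outputs))
```

An observer of a single output cannot tell with certainty whether it is the true item or a generic one, which is the property the selection-based mechanism relies on.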
In a further possible implementation form, the second LDP mechanism for adding noise is a Laplace LDP mechanism or a Gaussian LDP mechanism. Thus, the client device according to the first aspect may efficiently implement the noise adding second LDP mechanism in the form of a Laplace or Gaussian LDP mechanism.
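A noise-adding mechanism of the Laplace type may be sketched as follows. The clipping bound `bound`, the per-vector L1 sensitivity `2 * bound * len(vector)` and the function names are assumptions of this sketch (a common textbook calibration), not details stated in the disclosure.

```python
import random
from typing import List, Sequence


def clip(vector: Sequence[float], bound: float) -> List[float]:
    """Clip every coordinate to [-bound, bound] so that sensitivity is finite."""
    return [max(-bound, min(bound, v)) for v in vector]


def sample_laplace(scale: float, rng: random.Random) -> float:
    """Draw from Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_ldp(vector: Sequence[float], epsilon: float, bound: float,
                rng: random.Random) -> List[float]:
    """Add Laplace noise calibrated to the L1 sensitivity of a clipped vector."""
    if epsilon <= 0 or bound <= 0:
        raise ValueError("epsilon and bound must be positive")
    clipped = clip(vector, bound)
    # Two clipped vectors differ by at most 2 * bound in each coordinate.
    sensitivity = 2.0 * bound * len(clipped)
    scale = sensitivity / epsilon
    return [v + sample_laplace(scale, rng) for v in clipped]


# Usage with illustrative values only.
rng = random.Random(1)
print(laplace_ldp([0.2, -1.7, 0.9], epsilon=4.0, bound=1.0, rng=rng))
```

A Gaussian variant would replace `sample_laplace` by `rng.gauss(0.0, sigma)`, with `sigma` calibrated to the desired privacy guarantee.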
In a further possible implementation form, the client device is configured to implement, i.e. operate, a machine learning, ML, model, wherein the ML model is configured to generate the user data as a user data embedding based on raw user data. In an implementation form, the ML model for generating the user data embedding, for instance in the form of a vector, may be a neural network. In an implementation form, the randomized selection-based LDP mechanism, i.e. the first LDP mechanism, is configured to output with a defined probability either the user data embedding based on the raw user data or a generic user data embedding randomly sampled from a distribution of simulated user data embeddings.
In a further possible implementation form, the client device is configured to implement, i.e. operate, a further machine learning, ML, model, wherein the further ML model is configured to generate the plurality of simulated user data embeddings of the distribution of simulated user data embeddings. In an implementation form, the further ML model for generating the distribution of simulated user data embeddings may be a further neural network. Thus, the further ML model of the client device according to the first aspect may generate a distribution of simulated user data embeddings being indistinguishable from an embedding of the user data of a real user.
In a further possible implementation form, the user data or the raw user data may comprise one or more of the following: a name; an age; an address; a gender; a search history; an application usage by the user; a browser history; and/or information about online advertisements previously selected, i.e. clicked by the user.
According to a second aspect, a method is provided for using a service, in particular a recommendation service, provided by a server, in particular a recommendation server, based on user data associated with a user of a client device. The method comprises the steps of applying a first local differential privacy, LDP, mechanism and a second LDP mechanism to the user data for obtaining privacy-preserving user data, wherein the first LDP mechanism is a randomized selection-based LDP mechanism and wherein the second LDP mechanism is an LDP mechanism configured to add noise to the input data of the LDP mechanism; and transmitting the privacy-preserving user data to the server.
In a further possible implementation form, the server is a recommendation server and the method according to the second aspect further comprises, in response to transmitting the privacy-preserving user data to the recommendation server, receiving one or more recommendations based on the privacy-preserving user data from the recommendation server.
In a further possible implementation form, the step of applying the first LDP and the second LDP to the user data comprises first applying the first LDP to the user data and then applying the second LDP to the output of the first LDP for obtaining the privacy-preserving user data. Alternatively, in a further possible implementation form, the step of applying the first LDP and the second LDP to the user data comprises first applying the second LDP to the user data and then applying the first LDP to the output of the second LDP for obtaining the privacy-preserving user data.
The method according to the second aspect of the present disclosure can be performed by the client device according to the first aspect of the present disclosure. Thus, further features of the method according to the second aspect of the present disclosure result directly from the functionality of the client device according to the first aspect of the present disclosure as well as its different implementation forms described above and below.
According to a third aspect, a computer program product is provided, comprising a computer-readable storage medium for storing program code which causes a computer or a processor to perform the method according to the second aspect when the program code is executed by the computer or the processor.
Details of one or more embodiments are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.
BRIEF DESCRIPTION OF THE DRAWINGS
In the following, embodiments of the present disclosure are described in more detail with reference to the attached figures and drawings, in which:
Fig. 1 shows a schematic diagram illustrating a recommendation system including a plurality of client devices according to an embodiment and a recommendation server providing a recommendation service to the plurality of client devices;
Fig. 2 shows a schematic diagram illustrating in more detail the interaction between a conventional client device and a recommendation server for using a recommendation service;
Fig. 3 shows a schematic diagram illustrating in more detail the interaction between a client device according to an embodiment and a recommendation server for using a recommendation service; and
Fig. 4 shows a flow diagram illustrating processing steps of a method of operating a client device according to an embodiment.
In the following, identical reference signs refer to identical or at least functionally equivalent features.
DETAILED DESCRIPTION OF THE EMBODIMENTS
In the following description, reference is made to the accompanying figures, which form part of the disclosure, and which show, by way of illustration, specific aspects of embodiments of the present disclosure or specific aspects in which embodiments of the present disclosure may be used. It is understood that embodiments of the present disclosure may be used in other aspects and comprise structural or logical changes not depicted in the figures. The following detailed description, therefore, is not to be taken in a limiting sense, and the scope of the present disclosure is defined by the appended claims.
For instance, it is to be understood that a disclosure in connection with a described method may also hold true for a corresponding device or system configured to perform the method and vice versa. For example, if one or a plurality of specific method steps are described, a corresponding device may include one or a plurality of units, e.g. functional units, to perform the described one or plurality of method steps (e.g. one unit performing the one or plurality of steps, or a plurality of units each performing one or more of the plurality of steps), even if such one or more units are not explicitly described or illustrated in the figures. On the other hand, for example, if a specific apparatus is described based on one or a plurality of units, e.g. functional units, a corresponding method may include one step to perform the functionality of the one or plurality of units (e.g. one step performing the functionality of the one or plurality of units, or a plurality of steps each performing the functionality of one or more of the plurality of units), even if such one or plurality of steps are not explicitly described or illustrated in the figures. Further, it is understood that the features of the various exemplary embodiments and/or aspects described herein may be combined with each other, unless specifically noted otherwise.
Figure 1 shows a schematic diagram illustrating a recommendation system 100 comprising a recommendation server 140 (also referred to as a recommender server 140) and a plurality of client devices 120 using a recommendation service provided by the recommendation server 140. The recommendation server 140 may be implemented, for instance, as a cloud server 140 configured to communicate with the plurality of client devices 120 via a communication network, such as the Internet. The plurality of client devices 120 may comprise, for instance, smartphones, tablet computers, laptop computers, notebook computers, desktop computers, smart cars, smart TVs or other communication devices capable of using an online recommendation service. In an embodiment, the recommendation system 100 may be configured to recommend online advertisements, online products, music, videos/movies, video games, online news, search results for online searches and the like.
As illustrated in figure 1, each client device 120 may comprise processing circuitry 121, e.g. one or more processors 121, a communication interface 123 and/or a memory 125. The processing circuitry 121 may be implemented in hardware and/or software and may comprise digital circuitry, or both analog and digital circuitry. Digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or general-purpose processors. The communication interface 123 may be configured to communicate with the recommendation server 140 via wired and/or wireless connections via a wired and/or wireless communication network, such as the Internet. The memory 125 of the client device 120 may be configured to store executable program code which, when executed by the processing circuitry 121, causes the client device 120 to perform the functions and methods described herein. As illustrated in figure 1 and as will be described in more detail below, in an embodiment the memory 125 of the client device 120 may store user data 120d, i.e. data that is private to the user. In an embodiment, the private user data may comprise, for instance, the name of the user; the age of the user; the address of the user; the gender of the user; an online search history of the user; an application usage by the user; a browser history of the user; and/or information about online advertisements previously selected by the user.
As further illustrated in figure 1, the recommendation server 140 may also comprise processing circuitry 141, e.g. one or more processors 141, a communication interface 143 and/or a memory 145. The processing circuitry 141 may be implemented in hardware and/or software and may comprise digital circuitry, or both analog and digital circuitry. Digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or general-purpose processors. The communication interface 143 may be configured to communicate with the plurality of client devices 120 via wired and/or wireless connections via a wired and/or wireless communication network, such as the Internet. The memory 145 of the recommendation server 140 may be configured to store executable program code which, when executed by the processing circuitry 141, causes the recommendation server 140 to perform the functions and methods described herein.
As will be described in more detail below in the context of figure 3, the recommendation system 100 processes user data for generating one or more recommendations in a privacy-preserving and computationally inexpensive manner. Before describing embodiments of the client devices 120 and the recommendation server 140 in more detail, in the following some technical background as well as terminology will be introduced in the context of figure 2 making use of one or more of the following abbreviations:
LDP Local Differential Privacy
ML Machine Learning
GRR Generalized Randomized Response
CTR Click-Through Rate
Figure 2 shows a schematic diagram illustrating in more detail the interaction between a conventional client device 20 and a recommendation server 40, in particular an advertisement recommendation server 40, for implementing a conventional recommendation system 10, in particular an advertisement recommendation system 10. The client device 20 comprises a ML model 21 configured and trained to generate, based on user data 20d, a user data embedding 30, e.g. a user data feature vector 30, and to provide the user data embedding 30 to the recommendation server 40. The user data embedding 30 may be, for instance, part of a request for a recommendation, in particular an advertisement recommendation, from the recommendation server 40 and may be sent, for instance, via the Internet to the recommendation server 40. A first attacker device 50 may try to intercept the user data embedding 30 and to retrieve the original user data 20d based on an attribute inference model 51 and thereby reveal private user data, such as the age, the gender, the search history, or the previously clicked advertisements of the user of the client device 20.
A CTR prediction ML model 44 of the recommendation server 40 takes as input data the user data embedding 30, e.g. the user data feature vector 30, provided by the client device 20 and an advertisement embedding 43 generated by an advertisement embedding ML model 42 of the recommendation server 40 based on advertisement data 41, such as advertisement category, product price, and the like. Based on this input data the CTR prediction ML model 44 of the recommendation server 40 generates one or more advertisement recommendations 45 together with a respective matching score and returns the advertisement recommendations back to the client device 20. A second attacker device 60 may try to retrieve the advertisement embedding 43 and/or the advertisement recommendations 45 and to generate on the basis of this data the original user data 20d and thereby reveal private user data, such as the age, the gender, the search history, or the previously clicked advertisements of the user of the client device 20.
As will be appreciated, the conventional recommendation system 10 of figure 2 has three potential weak points that an attacker might try to exploit for gaining access to private user data. That is, an attacker can infer private user data by using (a) the user data embedding 30 (see attacker device 50 in figure 2), (b) the advertisement embeddings 43 and the corresponding matching scores 45 (see attacker device 60 in figure 2), and (c) a combination of (a) and (b), i.e., user data and advertisement embeddings 30, 43 as well as the matching scores 45.
In order to address the security risks, in particular the privacy risks, of the conventional recommendation system 10 shown in figure 2, embodiments of the recommendation system 100 shown in figure 1 fulfill the following requirements for protecting private user data, as will be described in more detail in the following: (a) prevent the inference of private user data (sometimes also referred to as attributes) based on the user information provided to the recommendation server; (b) prevent the inference of private user data based on the server-side information, e.g., the chosen recommendation(s) or the recommendation-user matching scores; (c) maintain a high utility, i.e. the recommendations determined by the recommendation server 140 based on the user data are meaningful and of high quality for the user; and (d) fulfill the hardware and user experience requirements, i.e., the solution speed, bandwidth and memory footprint are acceptable.
Figure 3 shows a schematic diagram illustrating in more detail the interaction between one of the plurality of client devices 120 according to an embodiment and the recommendation server 140, in particular advertisement recommendation server 140, for implementing a recommendation system 100, in particular an advertisement recommendation system 100 according to an embodiment. As will be described in more detail below and as shown in figure 3, generally the client device 120 is configured, for instance by means of its processing circuitry 120a, to apply a first local differential privacy, LDP, mechanism (referred to as LDP 1 in figure 3) 122 and a second LDP mechanism (referred to as LDP 2 in figure 3) 126 to the user data 120d for obtaining privacy-preserving user data 130 (referred to as protected user embedding in figure 3), wherein the first LDP mechanism 122 is a randomized selection-based LDP mechanism 122 and wherein the second LDP mechanism 126 is an LDP mechanism 126 configured to add noise to the input data of the LDP mechanism 126. The client device 120 is further configured to transmit the privacy-preserving user data 130 to the recommendation server 140, for instance, as part of a request for recommended advertisements to be displayed on a display of the client device 120. As already described above, the user data 120d may comprise, for instance, one or more of the following: a name; an age; an address; a gender; a search history; an application usage by the user; a browser history; and/or information about online advertisements previously selected by the user.
As will be appreciated, in the embodiment shown in figure 3 the client device 120 is configured to first apply the first LDP mechanism 122 to the user data 120d and then apply the second LDP mechanism 126 to the output of the first LDP mechanism 122 for obtaining the privacy-preserving user data 130. In a further embodiment, the client device 120 may be configured to first apply the second LDP mechanism 126 to the user data 120d and then apply the first LDP mechanism 122 to the output of the second LDP mechanism 126 for obtaining the privacy-preserving user data 130.
In the embodiment shown in figure 3 the client device 120 comprises a ML model 121 configured and trained to generate, based on the raw user data 120d, a user data embedding 124, e.g. a user data feature vector 124, wherein the first and second LDP mechanism 122, 126 are sequentially applied to the user data embedding 124, e.g. the user data feature vector 124. In a further embodiment, the client device 120 may be configured to directly apply the first and second LDP mechanism 122, 126 sequentially to the user data 120d (i.e. without the intermediate ML model 121). As already described above, the first LDP mechanism 122 is a randomized selection-based LDP mechanism 122. As indicated in figure 3, the randomized selection-based first LDP mechanism 122 is configured to output with a defined probability either the user data embedding 124 (or alternatively the user data 120d if no user embedding ML model 121 is employed) or randomly sampled user data, for instance, the generic, i.e. synthetic, user data embedding 125 shown in figure 3 randomly sampled from a distribution of simulated user data embeddings. In an embodiment, the randomized selection-based first LDP mechanism 122 is a Generalized Randomized Response, GRR, mechanism 122. As will be appreciated, for the first LDP mechanism 122 in the form of a GRR mechanism the true user data embedding 124 is output with a fixed probability.
For generating the plurality of simulated user data embeddings of the distribution of simulated user data embeddings used for the selection made by the first LDP mechanism 122, the client device 120 may implement a further ML model 123 (referred to as the synthetic embedding generator 123 in figure 3). Since the user of the client device 120 does not have access to embeddings of other users, the further ML model 123 generates synthetic user embeddings 125 to play the role of the randomly sampled embeddings. In an embodiment, the synthetic user embeddings 125 follow the distribution of the true user embeddings 124, and therefore are indistinguishable from the true ones (from the perspective of the attacker 150, 160 or the server 140).
The output of the randomized selection-based first LDP mechanism 122, i.e. either the user data embedding 124 (or alternatively the user data 120d if no user embedding ML model 121 is employed) or the synthetic user data embedding 125 randomly sampled from the distribution of simulated user data embeddings, is the input of the second LDP mechanism 126. As already described above, the second LDP mechanism 126 is configured to add noise to its input data, i.e. to the user data embedding 124 (or alternatively the user data 120d if no user embedding ML model 121 is employed) or the synthetic user data embedding 125 randomly sampled from the distribution of simulated user data embeddings, and thereby generate the privacy-preserving user data 130 (referred to as protected user embedding 130 in figure 3), which is provided to the recommendation server 140. In an embodiment, the second LDP mechanism 126 for adding noise is a Laplace LDP mechanism or a Gaussian LDP mechanism. The privacy-preserving user data 130 may be, for instance, part of a request for a recommendation, in particular an advertisement recommendation, from the recommendation server 140 and may be sent, for instance, via the Internet to the recommendation server 140. A first attacker device 150 may intercept the privacy-preserving user data 130, but will not be able to retrieve the original user data 120d based on an attribute inference model 151 due to the sequential application of the first and the second LDP mechanism 122, 126 used by the client device 120 for generating the privacy-preserving user data 130. More specifically, the attacker device 150 (or the server 140) cannot infer private user data of the user with a high accuracy due to the added noise provided by the second LDP mechanism 126.
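The sequential application of the two mechanisms, in either order, may be sketched as a composition of two functions on an embedding vector. All names (`make_selector`, `make_gaussian_noiser`, `protect`), the use of Gaussian noise with a directly supplied standard deviation `sigma`, and the small fixed pool of synthetic embeddings are assumptions of this sketch and are not specified by the disclosure.

```python
import random
from typing import Callable, List, Sequence

Vector = List[float]
Mechanism = Callable[[Sequence[float]], Vector]


def make_selector(synthetic_pool: Sequence[Sequence[float]],
                  keep_prob: float, rng: random.Random) -> Mechanism:
    """Selection step: keep the input with keep_prob, else a synthetic embedding."""
    def select(embedding: Sequence[float]) -> Vector:
        if rng.random() < keep_prob:
            return list(embedding)
        return list(rng.choice(synthetic_pool))
    return select


def make_gaussian_noiser(sigma: float, rng: random.Random) -> Mechanism:
    """Noise step: add independent zero-mean Gaussian noise to every coordinate."""
    def add_noise(embedding: Sequence[float]) -> Vector:
        return [v + rng.gauss(0.0, sigma) for v in embedding]
    return add_noise


def protect(embedding: Sequence[float], first: Mechanism,
            second: Mechanism) -> Vector:
    """Apply two mechanisms one after the other and return the protected vector."""
    return second(first(embedding))


# Usage with illustrative values only.
rng = random.Random(2)
select = make_selector([[0.0, 0.0], [1.0, 1.0]], keep_prob=0.8, rng=rng)
noise = make_gaussian_noiser(sigma=0.1, rng=rng)
protected = protect([0.3, 0.7], select, noise)    # selection first, then noise
alternative = protect([0.3, 0.7], noise, select)  # noise first, then selection
print(protected, alternative)
```

Passing the two mechanisms as arguments keeps the order of application a configuration choice, mirroring the two embodiments described for the client device 120.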
A CTR prediction ML model 144 of the recommendation server 140 shown in figure 3 takes as input data the privacy-preserving user data 130, e.g. the protected user embedding 130, provided by the client device 120 and an advertisement embedding 143 generated by an advertisement embedding ML model 142 of the recommendation server 140 based on advertisement data 141, such as advertisement category, product price, and the like. Based on this input data the CTR prediction ML model 144 of the recommendation server 140 generates one or more advertisement recommendations 145 together with a respective matching score and returns the advertisement recommendations back to the client device 120. A second attacker device 160 may retrieve the advertisement embedding 143 and/or the advertisement recommendations 145, but will not be able to retrieve the original user data 120d based on an attribute inference model 161 due to the sequential application of the first and the second LDP mechanism 122, 126 used by the client device 120 for generating the privacy-preserving user data 130 (which is used by the recommendation server 140 for generating the advertisement recommendations 145).
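The server-side step of matching a protected user embedding against advertisement embeddings may be illustrated by the following sketch. The disclosure does not specify the architecture of the CTR prediction ML model 144; the dot product followed by a logistic function used here, as well as the names `matching_score` and `recommend`, are stand-in assumptions for illustration only.

```python
import math
from typing import Dict, List, Sequence, Tuple


def matching_score(user_embedding: Sequence[float],
                   ad_embedding: Sequence[float]) -> float:
    """Map the dot product of two embeddings to (0, 1) with a stable sigmoid."""
    dot = sum(u * a for u, a in zip(user_embedding, ad_embedding))
    if dot >= 0:
        return 1.0 / (1.0 + math.exp(-dot))
    e = math.exp(dot)
    return e / (1.0 + e)


def recommend(user_embedding: Sequence[float],
              ad_embeddings: Dict[str, Sequence[float]],
              top_k: int) -> List[Tuple[str, float]]:
    """Return the top_k advertisements with their matching scores, best first."""
    scored = [(ad_id, matching_score(user_embedding, emb))
              for ad_id, emb in ad_embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Usage with illustrative embeddings only.
ads = {"ad-sports": [1.0, 0.0], "ad-travel": [0.0, 1.0], "ad-mixed": [0.5, 0.5]}
print(recommend([0.9, 0.1], ads, top_k=2))
```

Because the server only ever sees the protected embedding, the scores it computes reflect either a noisy true embedding or a noisy synthetic one.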
As will be appreciated, due to the sequential combination of the first and the second LDP mechanism 122, 126 the client device 120 ensures a strong privacy of the private user data 120d. Additional privacy may be achieved by generating the privacy-preserving user data 130 based on the user data embedding 124 instead of the “raw” user data 120d (split-inference). The client device 120 protects against attribute inference based on the privacy-preserving user data 130 (which may be intercepted by attacker device 150) via the noise-adding second LDP mechanism 126, such as a Laplace or Gaussian LDP mechanism and, thus, fulfills requirement (a) mentioned above. Moreover, due to the usage of the first randomized response LDP mechanism 122 the client device 120 protects against attribute inference based on the advertisement information 143, 145 (which may be obtained by the attacker device 160) and, thus, fulfills requirement (b) mentioned above. This is because the server 140 (or the attacker device 160) cannot infer private user data of the user with a high confidence since the received privacy-preserving user data 130 may be with a certain probability a synthetically generated user embedding 125. As will be further appreciated, the sequential application of the first and the second LDP mechanism 122, 126 still allows the recommendation server 140 to provide meaningful recommendations, in particular advertisements, based on the privacy-preserving user data 130 (thus, fulfilling requirement (c) mentioned above) and may be implemented in a computationally efficient (small computational overhead) and flexible manner with a low memory footprint and, thus, also fulfills requirement (d) mentioned above.
In an embodiment, the client device 120 may be configured to adjust a first privacy parameter of the first LDP mechanism 122 for adjusting the degree of privacy generated by the first LDP mechanism 122 and/or a second privacy parameter of the second LDP mechanism 126 for adjusting the degree of privacy generated by the second LDP mechanism 126. In other words, in an embodiment, each LDP mechanism 122, 126 has a privacy budget parameter (also known as an epsilon parameter) that can be tuned such that setting a higher value for this respective parameter leads to lower privacy. As will be appreciated, this allows the client device 120 to control how much performance/utility may be lost due to the LDP mechanisms 122, 126 by changing the value of this parameter, for instance, for achieving a desired performance. It can be shown mathematically that the sequential application of the two LDP mechanisms 122, 126 results in a composite LDP mechanism, whose privacy budget parameter is a combination of the privacy budget parameters of the two LDP mechanisms 122, 126, i.e. a combination of the first privacy parameter and the second privacy parameter.
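The effect of tuning the two privacy budget parameters can be illustrated with a minimal sketch. The formulas below are the usual textbook conventions for a Generalized Randomized Response mechanism and a Laplace mechanism, not something the present disclosure specifies; the function names, the `domain_size` parameter and the `sensitivity` parameter are assumptions introduced only for illustration (for embeddings, a bounded sensitivity would typically require clipping first).

```python
import math


def grr_keep_probability(epsilon: float, domain_size: int = 2) -> float:
    """Probability of reporting the true value under Generalized Randomized
    Response with budget `epsilon` over `domain_size` possible values
    (standard convention; assumed here, not taken from the disclosure)."""
    if epsilon < 0 or domain_size < 2:
        raise ValueError("epsilon must be >= 0 and domain_size >= 2")
    e = math.exp(epsilon)
    return e / (e + domain_size - 1)


def laplace_scale(epsilon: float, sensitivity: float = 1.0) -> float:
    """Scale b of Laplace noise for budget `epsilon`, given an assumed
    L1 sensitivity of the (clipped) input."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    return sensitivity / epsilon


if __name__ == "__main__":
    # A higher budget means the true value is kept more often and less
    # noise is added, i.e. lower privacy and higher utility.
    for eps in (0.5, 1.0, 4.0):
        print(eps, grr_keep_probability(eps), laplace_scale(eps))
```

Under these assumed conventions, raising either parameter moves the corresponding mechanism toward higher utility and lower privacy, which matches the trade-off described above.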
It can be further mathematically proven that the sequential application of the two LDP mechanisms 122, 126 as part of the same query yields a higher privacy protection (mathematically corresponding to a lower compound epsilon) than the application of only one of the LDP mechanisms 122, 126. As will be appreciated, the sequential application of the first and second LDP mechanism implemented by the client device 120 is different from the case where two different LDP mechanisms are applied as part of two separate queries (in this case the privacy protection is lower than applying a single LDP mechanism as part of a single query).
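The difference between chaining two mechanisms within one query and issuing two separate queries can be checked numerically on a toy case. The sketch below swaps in two binary randomized-response mechanisms for the mechanisms 122, 126 purely because their composite can be computed in closed form; it is an illustration under that assumption, not the proof referred to above, and all function names are hypothetical.

```python
import math


def rr_keep_probability(epsilon: float) -> float:
    """Probability that binary randomized response reports the true bit."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def chained_rr_epsilon(eps1: float, eps2: float) -> float:
    """Effective epsilon when the output of one binary randomized response
    is fed into a second one and only the final bit is released."""
    p1, p2 = rr_keep_probability(eps1), rr_keep_probability(eps2)
    # The final bit equals the input if both stages keep it or both flip it.
    same = p1 * p2 + (1.0 - p1) * (1.0 - p2)
    return math.log(same / (1.0 - same))


def separate_queries_epsilon(eps1: float, eps2: float) -> float:
    """Basic sequential composition: releasing both outputs adds budgets."""
    return eps1 + eps2


if __name__ == "__main__":
    print(chained_rr_epsilon(1.0, 1.0))        # below either budget
    print(separate_queries_epsilon(1.0, 1.0))  # sum of the budgets
```

In this toy case with both budgets set to 1.0, the chained mechanism has an effective epsilon of about 0.43, while releasing the two outputs as separate queries costs 2.0.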
In a further embodiment, the simulated user embeddings may be determined by a different entity and stored on the client device 120, in particular in its memory 120c (instead of using the synthetic data generator model 123). In this embodiment, the client device 120 may retrieve a sample from the stored simulated user embeddings each time the outcome of the first LDP mechanism 122 is to send a simulated user embedding to the recommendation server 140. As will be appreciated, also in this embodiment the simulated user embeddings should be indistinguishable from the true embeddings from the perspective of the recommendation server 140.
Figure 4 shows a flow diagram illustrating processing steps of a method 400 for using a service provided by the server 140 based on user data 120d or a user data embedding 124 associated with a user of the client device 120. The method 400 comprises a step 401 of applying a first LDP mechanism 122 and a second LDP mechanism 126 to the user data 120d or the user data embedding 124 for obtaining privacy-preserving user data 130. As already described above, the first LDP mechanism 122 is a randomized selection-based LDP mechanism and the second LDP mechanism 126 is an LDP mechanism configured to add noise to the input data of the LDP mechanism. Furthermore, the method 400 comprises a step 403 of transmitting the privacy-preserving user data 130 to the server 140.
In an embodiment, the server 140 is a recommendation server 140 and the method 400 further comprises, in response to the step 403 of transmitting the privacy-preserving user data 130 to the recommendation server 140, receiving one or more recommendations 145 based on the privacy-preserving user data 130 from the recommendation server 140.
In an embodiment, the step 401 of applying the first LDP mechanism 122 and the second LDP mechanism 126 to the user data 120d or the user data embedding 124 comprises first applying the first LDP mechanism 122 to the user data 120d or the user data embedding 124 and then applying the second LDP mechanism 126 to the output of the first LDP mechanism 122 for obtaining the privacy-preserving user data 130. Alternatively, the step 401 of applying the first LDP mechanism 122 and the second LDP mechanism 126 to the user data 120d or the user data embedding 124 comprises first applying the second LDP mechanism 126 to the user data 120d or the user data embedding 124 and then applying the first LDP mechanism 122 to the output of the second LDP mechanism 126 for obtaining the privacy-preserving user data 130.
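The two orderings of step 401 can be sketched as follows. This is a minimal illustration, assuming a Laplace mechanism for the noise step, a stored pool of simulated embeddings as the source of synthetic data, and hypothetical names (`randomized_selection`, `add_laplace_noise`, `step_401`, `keep_prob`, `scale`) that do not appear in the disclosure; it is not the applicant's implementation.

```python
import random
from typing import Callable, List, Sequence

Vector = List[float]


def randomized_selection(embedding: Sequence[float],
                         sample_synthetic: Callable[[], Sequence[float]],
                         keep_prob: float,
                         rng: random.Random) -> Vector:
    """Selection-based mechanism: forward the input with probability
    `keep_prob`, otherwise forward a simulated embedding."""
    if rng.random() < keep_prob:
        return list(embedding)
    return list(sample_synthetic())


def add_laplace_noise(vec: Sequence[float], scale: float,
                      rng: random.Random) -> Vector:
    """Noise-adding mechanism: the difference of two exponential draws
    with rate 1/scale is a Laplace(0, scale) sample."""
    rate = 1.0 / scale
    return [x + rng.expovariate(rate) - rng.expovariate(rate) for x in vec]


def step_401(embedding: Sequence[float],
             sample_synthetic: Callable[[], Sequence[float]],
             keep_prob: float, scale: float, rng: random.Random,
             selection_first: bool = True) -> Vector:
    """Apply both mechanisms in one of the two described orders."""
    if selection_first:
        chosen = randomized_selection(embedding, sample_synthetic,
                                      keep_prob, rng)
        return add_laplace_noise(chosen, scale, rng)
    # Alternative order: noise first, then selection. A simulated
    # embedding chosen here is forwarded without added noise.
    noisy = add_laplace_noise(embedding, scale, rng)
    return randomized_selection(noisy, sample_synthetic, keep_prob, rng)


if __name__ == "__main__":
    rng = random.Random(7)
    pool = [[0.0, 0.1, 0.2], [0.3, 0.1, 0.0]]  # hypothetical stored pool
    protected = step_401([0.5, 0.4, 0.9], lambda: rng.choice(pool),
                         keep_prob=0.8, scale=0.5, rng=rng)
    print(protected)  # this vector would be transmitted in step 403
```

Passing the synthetic source as a callable keeps the sketch agnostic as to whether simulated embeddings come from a generator model or from a pool stored in memory.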
The method 400 shown in figure 4 can be performed by the client device 120 according to different embodiments described above. Thus, further features of the method 400 result directly from the functionality of the client device 120 as well as its different embodiments described above and below.
The person skilled in the art will understand that the "blocks" ("units") of the various figures (method and apparatus) represent or describe functionalities of embodiments of the present disclosure (rather than necessarily individual "units" in hardware or software) and thus describe equally functions or features of apparatus embodiments as well as method embodiments (unit = step).
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. The described embodiment of an apparatus is merely exemplary. For example, the unit division is merely logical function division and may be another division in an actual implementation. For example, a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented by using some interfaces. The indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments. In addition, functional units in the embodiments disclosed herein may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit.
Claims
1. A client device (120) for using a service provided by a server (140) based on user data (120d; 124) associated with a user of the client device (120), wherein the client device (120) is configured to: apply a first local differential privacy, LDP, mechanism (122) and a second LDP mechanism (126) to the user data (120d; 124) for obtaining privacy-preserving user data (130), wherein the first LDP mechanism (122) is a randomized selection-based LDP mechanism (122) and wherein the second LDP mechanism (126) is a LDP mechanism (126) configured to add noise to the input data of the LDP mechanism (126); and transmit the privacy-preserving user data (130) to the server (140).
2. The client device (120) of claim 1, wherein the server (140) is a recommendation server (140) and wherein the client device (120) is configured, in response to transmitting the privacy-preserving user data (130) to the recommendation server (140), to receive one or more recommendations (145) based on the privacy-preserving user data (130) from the recommendation server (140).
3. The client device (120) of claim 2, wherein the client device (120) comprises a display configured to display the one or more recommendations (145) received from the recommendation server (140).
4. The client device (120) of any one of the preceding claims, wherein for applying the first LDP mechanism (122) and the second LDP mechanism (126) to the user data (120d; 124) the client device (120) is configured to first apply the first LDP mechanism (122) to the user data (120d; 124) and then apply the second LDP mechanism (126) to the output of the first LDP mechanism (122) for obtaining the privacy-preserving user data (130).
5. The client device (120) of any one of claims 1 to 3, wherein for applying the first LDP mechanism (122) and the second LDP mechanism (126) to the user data (120d) the client device (120) is configured to first apply the second LDP mechanism (126) to the user data (120d; 124) and then apply the first LDP mechanism (122) to the output of the second LDP mechanism (126) for obtaining the privacy-preserving user data (130).
6. The client device (120) of any one of the preceding claims, wherein the randomized selection-based LDP mechanism (122) is configured to output with a defined probability either the user data (120d) or randomly sampled generic user data.
7. The client device (120) of claim 6, wherein the randomized selection-based LDP mechanism (122) is a Generalized Randomized Response, GRR, mechanism (122).
8. The client device (120) of any one of the preceding claims, wherein the second LDP mechanism (126) for adding noise is a Laplace LDP mechanism (126) or a Gaussian LDP mechanism (126).
9. The client device (120) of any one of the preceding claims, wherein the client device (120) is configured to implement a machine learning, ML, model (121), wherein the ML model (121) is configured to generate the user data as a user data embedding (124) based on raw user data (120d).
10. The client device (120) of claim 9, wherein the randomized selection-based LDP mechanism (122) is configured to output with a defined probability either the user data embedding (124) based on the raw user data (120d) or a generic user data embedding (125) randomly sampled from a distribution of simulated user data embeddings.
11. The client device (120) of claim 10, wherein the client device (120) is configured to implement a further machine learning, ML, model (123), wherein the further ML model (123) is configured to generate the plurality of simulated user data embeddings of the distribution of simulated user data embeddings.
12. The client device (120) of any one of the preceding claims, wherein the client device (120) is a user equipment, UE, (120).
13. The client device (120) of any one of the preceding claims, wherein the user data (120d) or the raw user data (120d) comprises one or more of the following: a name; an age; an address; a gender; a search history; an application usage by the user; a browser history; and/or information about online advertisements previously selected by the user.
14. A method (400) for using a service provided by a server (140) based on user data (120d; 124) associated with a user of a client device (120), wherein the method (400) comprises: applying (401) a first local differential privacy, LDP, mechanism (122) and a second LDP mechanism (126) to the user data (120d; 124) for obtaining privacy-preserving user data (130), wherein the first LDP mechanism (122) is a randomized selection-based LDP mechanism (122) and wherein the second LDP mechanism (126) is a LDP mechanism (126) configured to add noise to the input data of the LDP mechanism (126); and transmitting (403) the privacy-preserving user data (130) to the server (140).
15. The method (400) of claim 14, wherein the server (140) is a recommendation server (140) and wherein the method (400) further comprises, in response to transmitting (403) the privacy-preserving user data (130) to the recommendation server (140), receiving one or more recommendations (145) based on the privacy-preserving user data (130) from the recommendation server (140).
16. The method (400) of claim 14 or 15, wherein applying (401) the first LDP mechanism (122) and the second LDP mechanism (126) to the user data (120d; 124) comprises first applying the first LDP mechanism (122) to the user data (120d; 124) and then applying the second LDP mechanism (126) to the output of the first LDP mechanism (122) for obtaining the privacy-preserving user data (130).
17. The method (400) of claim 14 or 15, wherein applying (401) the first LDP mechanism (122) and the second LDP mechanism (126) to the user data (120d; 124) comprises first applying the second LDP mechanism (126) to the user data (120d; 124) and then applying the first LDP mechanism (122) to the output of the second LDP mechanism (126) for obtaining the privacy-preserving user data (130).
18. A computer program product comprising a computer-readable storage medium for storing a program code which causes a computer or a processor to perform the method (400) of any one of claims 14 to 17, when the program code is executed by the computer or the processor.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2023/076020 WO2025061284A1 (en) | 2023-09-21 | 2023-09-21 | Devices and methods for providing privacy-preserving user data for recommendation system |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2023/076020 WO2025061284A1 (en) | 2023-09-21 | 2023-09-21 | Devices and methods for providing privacy-preserving user data for recommendation system |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025061284A1 (en) | 2025-03-27 |
Family
ID=88188779
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2023/076020 Pending WO2025061284A1 (en) | 2023-09-21 | 2023-09-21 | Devices and methods for providing privacy-preserving user data for recommendation system |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025061284A1 (en) |
- 2023-09-21: WO application PCT/EP2023/076020, published as WO2025061284A1 (en), status: active, pending
Non-Patent Citations (3)
| Title |
|---|
| SEIRA HIDANO ET AL: "Degree-Preserving Randomized Response for Graph Neural Networks under Local Differential Privacy", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 7 October 2022 (2022-10-07), XP091336258 * |
| SHIJIE ZHANG ET AL: "Comprehensive Privacy Analysis on Federated Recommender System against Attribute Inference Attacks", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 19 March 2023 (2023-03-19), XP091463072 * |
| ZHIDONG SHEN ET AL: "RRN: A differential private approach to preserve privacy in image classification", IET IMAGE PROCESSING, IET, UK, vol. 17, no. 7, 20 March 2023 (2023-03-20), pages 2192 - 2203, XP006118376, ISSN: 1751-9659, DOI: 10.1049/IPR2.12784 * |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US8438184B1 (en) | Uniquely identifying a network-connected entity | |
| JP6745384B2 (en) | Method and apparatus for pushing information | |
| JP4947477B1 (en) | RECOMMENDATION DEVICE, RECOMMENDATION METHOD, AND RECOMMENDATION PROGRAM | |
| US11281734B2 (en) | Personalized recommender with limited data availability | |
| US11501331B2 (en) | System for providing proof and attestation services for claim verification | |
| US20150051973A1 (en) | Contextual-bandit approach to personalized news article recommendation | |
| US11430049B2 (en) | Communication via simulated user | |
| US20210019375A1 (en) | Computing system including virtual agent bot providing semantic topic model-based response | |
| US12135820B2 (en) | Automatically detecting unauthorized re-identification | |
| CN113392200A (en) | Recommendation method and device based on user learning behaviors | |
| US11907963B2 (en) | On-device privacy-preservation and personalization | |
| CN112269942B (en) | Method, device and system for recommending object and electronic equipment | |
| US11586688B2 (en) | Computerized anonymous permission-based communications system with micro-catalog server enabling permission-based third-party communications | |
| WO2025061284A1 (en) | Devices and methods for providing privacy-preserving user data for recommendation system | |
| CN116049531B (en) | Method and device for determining associated applications, method and device for determining recommended content | |
| CN113112312B (en) | Method, apparatus and computer-readable storage medium for generating a model for a user | |
| US11100558B1 (en) | Recommendations utilizing noise detection and filtering | |
| US11673044B1 (en) | Contextual recommendations in media streaming | |
| Wadpelli et al. | Manifesta: An Event Management Platform Using Recommendation System | |
| CN114742614B (en) | Recommended methods, servers, clients, computer media and devices | |
| US12362945B1 (en) | Techniques for limiting manipulation of URLs | |
| US11144956B1 (en) | Targeted media delivery based on previous consumer interactions | |
| CN117473155A (en) | Training method and device for interaction result prediction model |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23776323; Country of ref document: EP; Kind code of ref document: A1 |