WO2025165372A1 - Maintaining and deploying personalization weights for a base artificial neural network based on data permissions - Google Patents
Maintaining and deploying personalization weights for a base artificial neural network based on data permissions
- Publication number
- WO2025165372A1 (PCT/US2024/014293)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- weights
- personalization
- base
- permission
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2291—User-Defined Types; Storage management thereof
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/9032—Query formulation
- G06F16/90332—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/096—Transfer learning
Definitions
- This specification relates to processing data using machine learning models.
- Machine learning models receive an input and generate an output, e.g., a predicted output, based on the received input.
- Some machine learning models are parametric models and generate the output based on the received input and on values of the weights of the model.
- Some machine learning models are deep models that employ multiple layers of models to generate an output for a received input.
- a deep neural network is a deep machine learning model that includes an output layer and one or more hidden layers that each apply a non-linear transformation to a received input to generate an output.
- This specification describes a system implemented as computer programs on one or more computers in one or more locations that can generate, maintain, and deploy a set of personalization weights for use with a base neural network (“base network”).
- the system can maintain a set of personalization weights for each of one or more granted permissions for one or more users.
- a granted permission refers to an allowance for data sharing between a corresponding set of one or more software applications (“applications” or “apps”) or software application features, e.g., specific functional aspects of a software application, for a given user.
- the granted permission can be a cross-feature permission, e.g., across features within the software application, a cross-app permission, e.g., across software applications, or a cross-app cross-feature permission, e.g., across features from different software applications.
- the corresponding set of personalization weights is learned using data from the apps, features, or both specified by the permission, e.g., not using data from any apps or features not specified by the permission.
- the data used to train the set of personalization weights can be generated as a result of an interaction by the user with respect to the apps, features, or both specified by the permission.
- the user can generate data by searching for restaurants nearby using a map application and selecting a few of interest to view in greater detail.
- the user can generate data using a chat feature in one or more applications.
- the user can generate data on an email application based on the time of viewing or sending emails.
- the personalization weights for a particular granted permission can be used to process inputs to the base network, e.g., the system can process the request using the set of base weights and personalization weights. More specifically, in response to a user request for processing using the base network, the system can identify the set of personalization weights corresponding to the particular granted permission and can process the request using the base network and the identified set of personalization weights. In some cases, this can involve loading the set of personalization weights as a subset of the weights of the base network.
- the system can generate a new set of personalization weights corresponding to the particular granted permission.
- the system can train a parameter-efficient adaptation of the base network using data derived from the one or more software applications specified by the particular granted permission, e.g., using a low-rank approximation of the updates to the base network.
- the base network is a language processing neural network (language processing network) configured to process one or more prompts as input, e.g., a large language model.
- the system can alternatively train soft prompts as the set of personalization weights.
- a computer-implemented method for maintaining a plurality of sets of personalization weights for a given user wherein each set of personalization weights corresponds to a respective granted permission that allows data sharing between a corresponding set of one or more software applications and has been determined by training a base neural network on training data that includes data for the given user from the corresponding set of one or more software applications, receiving a request for processing by the base neural network, determining a particular granted permission associated with the request, and processing the request using the base neural network and in accordance with a set of personalization weights corresponding to the particular granted permission.
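- As a concrete illustration of the claimed flow, the following sketch shows one way the maintain/receive/determine/process steps could fit together; all names (`PermissionStore`, `handle_request`, the request fields) are hypothetical and not from the specification:

```python
# Minimal sketch of the claimed method; all names are hypothetical.
from typing import Dict, Optional

class PermissionStore:
    """Maps a granted-permission identifier to its trained personalization weights."""

    def __init__(self) -> None:
        self._weights: Dict[str, dict] = {}

    def put(self, permission_id: str, weights: dict) -> None:
        self._weights[permission_id] = weights

    def get(self, permission_id: str) -> Optional[dict]:
        return self._weights.get(permission_id)

def handle_request(request: dict, store: PermissionStore, base_network):
    # 1. Determine the particular granted permission associated with the request.
    permission_id = request["permission_id"]
    # 2. Identify the corresponding set of personalization weights, if any.
    personalization = store.get(permission_id)
    # 3. Process the request with the base weights plus, when present,
    #    the permission-specific personalization weights.
    return base_network.forward(request["input"], extra_weights=personalization)
```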
- the system of this specification can be used to facilitate direct user control over which user data can be leveraged to personalize processing with a base network, e.g., based on the set of personalization weights used.
- the system can enable the sharing of data between chat software applications or between an email application and a messaging application based on a granted user permission.
- the system can enable the sharing of data intra-application, e.g., the user can grant a permission to share data between different features of an application, e.g., a search and recommendation feature. More specifically, the system can enable the sharing of data within the context of a particular granted permission, while prohibiting the sharing of the data between applications, features, or both outside the context of granted permissions.
- the system can generate and maintain a set of personalization weights corresponding with each particular granted user permission. Since the personalization weights are associated with a particular granted user permission, the user can decide to revoke a particular granted user permission at any time.
- the system can update the set of personalization weights, e.g., by removing the set of personalization weights associated with the particular granted permission, according to the removal specified by the user request.
- the separation of personalization weights from the remainder of the base weights of the base network (“base network weights”) can enable efficient training of the personalization weights using the base network. More specifically, the system can generate the personalization weights by finetuning only a subset of base network weights, e.g., the system can freeze the majority of the base network weights and use a low-rank approximation to update a target set of update weights, or freeze all of the base network weights and use prompt tuning to learn a new “soft” prompt that can be provided as input to the base network, thereby reducing the computational resources needed to generate the set of personalization weights for a particular granted permission.
- the system can drastically reduce the resources required to train the personalization weights for the base network by using a low-rank approximation or prompt tuning since each language processing model can have billions or trillions of weights to update each training iteration.
- FIG. 3 provides an overview of the example cross-app permission management system of FIG. 1 deploying a set of personalization weights for a language processing network.
- FIG. 4 is a block diagram that illustrates an example prompt tuning technique that the cross-app permission management system of FIG. 1 can use to generate a set of personalization weights for a language processing network.
- FIG. 5 demonstrates how an example cross-app permission management system can manage a set of personalization weights for a set of granted permissions.
- FIG. 6 is a flow diagram of an example process for identifying and deploying a set of personalization weights for use with a base network.
- FIG. 7 is a flow diagram of an example process for generating and maintaining a set of personalization weights for use with a base network.
- FIG. 1 shows an example permission management system 100.
- the permission management system 100 is an example of a system implemented as computer programs on one or more computers in one or more locations in which the systems, components, and techniques described below are implemented.
- the permission management system 100 can generate, maintain, and deploy a set of one or more personalization weights for one or more users for use with a base network.
- the personalization weights can include a set of one or more values that can be used along with the weights of the base network for personalized processing.
- Each set of personalization weights corresponds to a granted permission, e.g., an allowance for data sharing between a corresponding set of one or more software applications, e.g., cross-app, cross-feature, or cross- app cross feature, for a given user.
- the base network can be used by any one of the applications or features managed by the permission management system 100 in order to provide a user experience or tool for the user.
- the base network can be a chat-bot, email-writing assistant, strategy aid for game playing, notification manager, etc.
- the base network can provide a generic functionality to the user by processing inputs using a base set of weights.
- the base network can also provide a customized functionality for a particular user based on a set of granted permissions that are managed by the permission management system 100.
- the base network can provide a customized functionality, e.g., personalized processing, for a user by processing inputs using a set of personalization weights corresponding to the particular granted permission, e.g., by processing inputs using the set of base weights and the personalization weights.
- the system can generate the personalization weights by finetuning only a subset of base network weights, e.g., the system can freeze the majority of the base network weights and use a low-rank approximation to update a target set of update weights.
- the base network is a language processing model
- the system can freeze all of the base network weights and use prompt tuning to learn a new “soft” prompt as the set of personalization weights for a particular granted permission.
- the personalization weights can be used in conjunction with the base set of weights to adapt the outputs generated by the base neural network for personalized processing, e.g., the base network can process inputs using the set of personalization weights and the base set of weights, which are shared between the respective granted permissions.
- the system 100 can deploy the set of personalization weights corresponding to a particular granted permission in response to a request from a user device 105 for processing, e.g., personalized processing, using the base network.
- the base network can be trained to perform any kind of machine learning task, e.g., a computer vision, text processing, audio processing, multi-modal task, etc.
- the base network is a neural network that is configured to perform an image processing task for which a user can have a preference for personalized processing.
- the base network can receive one or more input images and process the input images, e.g., process the intensity values of the pixels of each input image, to generate a network output for the input image.
- the user can have a preference for the base network’s output for image captioning or visual question-answering, e.g., the user can have specific language style preferences or characteristics for captioning or the format of the answer.
- the base network is a neural network that is configured to perform an audio processing task for which a user can have a preference for personalized processing.
- the base network can process a spoken utterance from a user and transcribe the spoken utterance into text of a different style and format as desired by the user.
- the output generated by the neural network may be a score for each of a set of podcasts or songs, each score representing an estimated likelihood that the user will enjoy the podcast or song.
- the user can have a preference for how an input audio, e.g., a lecture, can be summarized into a shorter format according to the user’s preferences.
- the base network is a neural network that is configured to perform a natural language processing or understanding task, e.g., an entailment task, a paraphrase task, a textual similarity task, a sentiment task, a sentence completion task, a grammaticality task, and so on, that operates on a sequence of text in some natural language.
- the user can have a preference for the output, e.g., that the output adhere to the user’s language style.
- the output generated by the neural network may be a score for each of a set of pieces of text in another language, with each score representing an estimated likelihood that the piece of text in the other language is a proper translation of the input text into the other language.
- the user can have a preference that the translation be proper in the other language while aligning with the specific manner in which the user communicates in the language of the processed input.
- the user can prefer that the output be more casual, e.g., include slang and idioms, or more formal, e.g., for a professional setting.
- the base network is a neural network that is configured to perform a multi-modal task that involves processing one or more input modalities, e.g., image, text, audio, etc.
- the user can have a preference for image generation using a vision-language model, e.g., the base network can process images of an apple and a basket together with a photo of the user and the text input “me holding a basket of apples” and output a generated image of the user holding a basket of apples.
- the user can have a preference for the style of the generated output image, e.g., post-impressionist, pencil or crayon drawing, cartoon, etc.
- the user can have a preference for the base network’s output for visual question answering, e.g., the base network can provide an accurate response while adhering to user preferences, e.g., regarding specificity of the answer.
- the user can have a preference for the base network’s output for image captioning, e.g., the user can have specific language style preferences or characteristics for captioning.
- the captions include information about the experience from a travel blog the user wrote or from a travel website that the user accessed when planning the trip.
- the permission management system 100 can receive a permission request 110 from a user device 105, e.g., a laptop, tablet, smart-phone, smart-watch, etc., that can include a permission 112 specifying data sharing between a set of one or more software applications or a request to remove a previously granted permission.
- the received permission 112 can specify the sharing of data across applications, e.g., between an email application and a map application.
- the received permission 112 can specify the sharing of data across features of different applications, e.g., between the review feature of a hypermarket store application and the review feature of the map application, or between two messaging application program interfaces (APIs) of different social media applications.
- the received permission 112 can indicate the sharing of data between features of the same application, e.g., between different camera editing features of a photo application.
- the system 100 can process the permission request 110 using a personalization engine 120.
- the system 100 can use a network, e.g., the internet 115, to transmit the permission request 110 to the personalization engine 120.
- the personalization engine 120 can be executed on the user device 105.
- the engine 120 can process the permission request 110 to determine a set of personalization weights, e.g., the personalization weights that can be used for personalized processing using a base network, e.g., the pretrained base network 125, or a set of personalization weights that need to be removed from the system 100.
- determining the set of personalization weights can include identifying a set of pre-existing personalization weights 155 that correspond to the permission request 110.
- In other cases, determining the set of personalization weights can include determining that the corresponding set of personalization weights is not yet maintained by the system 100. In this case, the subsystem 130 can initiate a request to generate the set of personalization weights corresponding to the particular received permission 112, as will be described in more detail below.
- the engine 120 can determine the corresponding set of personalization weights 155 for a particular received permission 112 using a permissions subsystem 130.
- the permissions subsystem 130 can identify the set of personalization weights 155 corresponding to the particular received permission 112, e.g., by accessing the personalization weights from a personalization database 150.
- the permissions subsystem 130 can also allow a user to manage their respective granted permissions.
- the permissions subsystem 130 can present the granted permissions for the user in response to a request to review the granted permissions, e.g., using a user interface.
- the permission subsystem 130 can enable a user to revoke a granted permission and, in response to a request 110 to remove a permission, remove the corresponding personalization weights from the personalization database 150.
- An example for removing personalization weights from the database 150 will be covered in more detail in FIG. 5.
- the personalization database 150 can maintain a repository of all active, e.g., provided and not yet removed, granted permissions.
- the personalization database 150 can maintain one or more data structures for one or more users of the system 100, each including one or more sets of personalization weights associated with a primary key based at least on the particular granted permission 112.
- the personalization database 150 can maintain one or more data structures for permissions associated with one or more specific software applications, e.g., per-product personalization weights.
- the personalization database 150 can maintain one or more data structures for permissions associated with one or more features of one or more software applications, e.g., per-feature personalization weights.
- the data structures in the database 150 can be tables and the permission subsystem 130 can identify a primary key, e.g., a unique permission identifier for the permission 112, that can be used to access the corresponding personalization weights in the database 150.
- the primary key can be a hash code associated with the permission 112 or a unique string, e.g., a string that combines a user identifier with one or more applications and one or more feature identifiers based on the allowed data sharing specified by the permission 112.
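- For illustration, here is a minimal sketch of how such a primary key might be derived; the identifiers are invented, and the specification leaves the exact hashing or string scheme open:

```python
import hashlib

def make_primary_key(user_id: str, app_ids: list, feature_ids: list) -> str:
    """Derive a unique permission identifier from the allowed sharing scope.

    Either variant described above works: the deterministic combined string
    built below, or (as returned here) a hash code computed from it.
    """
    scope = "|".join([user_id, *sorted(app_ids), *sorted(feature_ids)])
    return hashlib.sha256(scope.encode("utf-8")).hexdigest()

# e.g., a cross-app permission between an email app and a map app:
key = make_primary_key("user-123", ["email_app", "map_app"], [])
```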
- the permission subsystem 130 can include a data structure to maintain the primary keys for the permissions associated with the application.
- the permissions subsystem 130 can maintain a dictionary of all sets of personalization weights maintained by the personalization weight database 150.
- the dictionary can be used to map the permission 112 to the primary key that can be used to identify the corresponding set of personalization weights 155 in the database 150.
- the permissions subsystem 130 can maintain a graph including a set of nodes and a set of edges, where each node is representative of an application or a feature and each edge that connects a pair of nodes is representative of a given permission specifying data sharing between the applications or features.
- the edges can include the primary key as associated metadata.
- the permission subsystem 130 can access the corresponding set of personalization weights 155 using the identified primary key.
- the permission subsystem 130 can include a set of hard-coded logic-rules, e.g., a set of if-then statements, that can be used to identify the primary key for the permission 112, which can be used to access the corresponding set of personalization weights 155.
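- A minimal sketch of this lookup path, combining the dictionary mapping with simple if-then style normalization; the container names and the permission descriptor format are assumptions for illustration:

```python
# Hypothetical lookup path: permission -> primary key -> row in the database.
permission_to_key = {}   # dictionary maintained by the permissions subsystem
personalization_db = {}  # personalization weights table, keyed by primary key

def canonicalize(permission: dict) -> str:
    # Simple if-then style normalization of a permission descriptor into the
    # unique string form discussed above.
    apps = ",".join(sorted(permission.get("apps", [])))
    feats = ",".join(sorted(permission.get("features", [])))
    return f"{permission['user']}::{apps}::{feats}"

def lookup_weights(permission: dict):
    key = permission_to_key.get(canonicalize(permission))
    if key is None:
        return None  # no weights yet generated for this permission
    return personalization_db.get(key)
```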
- the personalization engine 120 can then incorporate the personalization weights 155 with the pretrained base network 125 into a deployed network 160, which will be described in further detail below.
- the permission request 110 received from the user can indicate the removal of a permission from the system.
- the subsystem 130 can access the corresponding set of personalization weights 155 using the identified primary key and remove the personalization weights 155. As an example, this can involve entirely deleting the set of personalization weights 155. As another example, this can involve updating, e.g., retraining, the set of personalization weights 155 to remove the effect of the shared data specified by the permission request 110. Removing personalization weights in response to a user request will be covered in more detail in FIG. 5. In particular, removing the personalization weights 155 can depend on the type of personalization weights 155 the system has trained. For example, personalization weights created with a low-rank approximation, which will be described in more detail in FIG. 2, can require a different removal technique than personalization weights created with prompt tuning, which will be described in more detail in FIG. 4.
- the permissions subsystem 130 can determine that the set of personalization weights corresponding with the received permission 112 do not exist in the database 150.
- the engine 120 can generate the set of personalization weights corresponding to the particular permission 112, e.g., using a training subsystem 140 that can access the pretrained base network 125 and derive data from the one or more software applications specified by the particular granted permission 112 for finetuning, e.g., training the corresponding set of personalization weights 155 using the pretrained base network 125.
- the set of personalization weights can include a subset of the weights of a pretrained base network 125.
- the subset of the weights can represent a small percentage, e.g., less than 2%, 5%, or 10%, of the overall weights of the base network 125.
- the pretrained base network 125 can be a neural network with any appropriate machine learning architecture.
- the base network can have any appropriate number of neural network layers (e.g., 1 layer, 5 layers, or 10 layers) of any appropriate type (e.g., fully-connected layers, attention layers, convolutional layers, etc.) connected in any appropriate configuration (e.g., as a linear sequence of layers, or as a directed graph of layers).
- the pretrained base network 125 can have any appropriate architecture that allows the network to perform the particular machine learning task, e.g., to map network inputs of the type and dimensions required by the task to network outputs of the type and dimensions required by the task.
- the base network 125 can be a convolutional neural network, e.g., a neural network having a ResNet architecture, an Inception architecture, an EfficientNet architecture, and so on, or a Transformer neural network, e.g., a vision Transformer.
- the base network 125 can be a recurrent neural network, e.g., a long short-term memory (LSTM) or gated recurrent unit (GRU) based neural network, or a large language model, e.g., a Transformer neural network.
- the base network 125 can be a feed-forward neural network, e.g., an MLP, that includes multiple fully-connected layers.
- the network 125 is pretrained such that the personalization engine 120 can deploy the pretrained base network 125 without any personalization weights 165.
- the deployed network 280, e.g., the pretrained base network 125, can still maintain generic functionality without personalization by processing a user request using the base set of weights.
- the pretrained base network 125 can be a language processing neural network.
- the base network 125 can have a recurrent neural network architecture that is configured to sequentially process an input and trained to perform next element prediction, e.g., to define a likelihood score distribution over a set of next elements.
- the base network 125 can be a recurrent neural network (RNN), long short-term memory (LSTM), or gated-recurrent unit (GRU).
- the base network 125 can be an encoder-decoder transformer configured to perform parallel processing of the contents of the multimodal input using a multi-headed attention mechanism.
- the base network 125 can be a large language model, e.g., a foundation model such as a transformer large language model that has been configured to process a prompt as input, e.g., a question, statement, code snippet or example, to generate an output.
- Large language models have been demonstrated to achieve state of the art performance in semantic understanding, e.g., their ability to effectively capture semantic information from prompt inputs.
- the training subsystem 140 can derive the data needed for training the pretrained base network 125 in accordance with the particular granted permission 112 by accessing a cache of data that can be stored on the user-device 105.
- the data specified by the permission 112 can become accessible to the personalization engine 120 upon receipt of the permission request 110.
- the engine 120 can receive the user data for the features included in the cross-feature permission from the first software application.
- the permission 112 includes a cross-app cross-feature permission between a feature A of the first software application and a feature B of a second software application
- the engine 120 can receive the user data for feature A and feature B that can be used to train the base network 125. Receiving personalization data for the correct data specified by the permission 112 will be covered in greater detail in FIG. 5.
- the training subsystem 140 can train the pretrained base network 125 using one or more techniques that can be used to update a subset of the pretrained base network 125 weights.
- the subsystem 140 can use a low-rank approximation (LoRA) 142, e.g., by finetuning a subset of base network weights with a matrix decomposition to approximate updates to the base network, or, in the case that the pretrained base network 125 is a language processing model, the system can use LoRA 142 or prompt tuning 144, e.g., training a set of personalization weights that can be prepended to prompt inputs received for processing by the language processing model.
- An example of training personalization weights using a low-rank approximation will be covered in more detail in FIG. 2.
- the system 100 can train one or more adapter modules as the personalization weights.
- the system can insert each untrained adapter module with respect to a particular place within the base network and train the adapter modules to yield personalized processing.
- the personalization engine 120 can then deploy the generated personalization weights 155 with the pretrained base network 125, e.g., as the deployed network 160 which can process requests using the base network according to the personalization weights.
- the deployed network 160 can be maintained by the engine 120 and accessible to the user via the internet 115, e.g., the deployed network 160 can be maintained in the cloud.
- a user request for processing can involve transmitting input data for the deployed network 160 to the personalization engine 120 for processing using the deployed network 160.
- the engine 120 can transmit the personalization weights 155 to the user-device 105 and the personalization weights 155 can be combined with an on-edge pretrained base network 185, e.g., by loading the personalization weights 155 as a subset of the weights of the base network 185.
- the engine 120 can generate the personalization weights 155 and the system 100 can store the weights 155 on the user-device 105.
- the system can maintain a user database of personalization weights on each user-device, e.g., such that the user retains all of their personalization information on their device 105.
- the engine 120 can access the database 150, e.g., using the internet 115, in response to a user request for generating a set of personalization weights or removing a permission.
- the engine 120 can access and load the weights directly on the user device 105.
- the on-edge pretrained base network 185 can be the same as the pretrained base network 125 maintained by the engine 120.
- the on-device base network 185 can be a smaller version, e.g., with fewer weights, of the pretrained base network 125 maintained by the engine.
- the on-edge base network 185 can require fewer computational resources and can perform processing faster than the pretrained base network 125 in the engine 120.
- the engine 120 can train one or more projection models to translate the personalization weights 155 into the proper dimensionality of the on-device base network 185, as will be discussed in more detail in FIG. 5.
- the engine 120 can incorporate the personalization weights 155 into the personalization weight database 150.
- the permission subsystem 130 can be able to identify the corresponding personalization weights 155 for the permission 112 in the database 150 after the personalization weights 155 have been generated.
- the permission subsystem 130 can also monitor a set of criteria pertaining to retraining the corresponding personalization weights 155.
- the set of criteria can include a specification that a certain amount of time has elapsed since the personalization weights 155 were most recently updated.
- the set of criteria can include a data accumulation criterion specifying that the amount of new data pertaining to the permission has accumulated above a set threshold, e.g., in the data cache the system 100 can access to train the personalization weights.
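- A short sketch of how these retraining criteria could be checked; the threshold values are placeholders, since the specification does not fix them:

```python
import time

# Placeholder thresholds; the specification leaves the exact values open.
MAX_AGE_SECONDS = 30 * 24 * 3600  # retrain if the weights are ~30 days old
DATA_THRESHOLD = 1000             # retrain once enough new examples accumulate

def should_retrain(last_updated: float, new_examples: int) -> bool:
    """Return True if either retraining criterion for a permission is met."""
    stale = (time.time() - last_updated) > MAX_AGE_SECONDS
    enough_new_data = new_examples >= DATA_THRESHOLD
    return stale or enough_new_data
```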
- FIG. 2 is a block diagram that illustrates a simplification of an example low-rank approximation (LoRA) technique that the permission management system 100 of FIG. 1 can employ to train a set of personalization weights for the base network.
- the set of personalization weights includes a low-rank factorization of an update weight matrix that can be used to update a given weight matrix of the base network.
- Although FIG. 2 presents the low-rank approximation technique with respect to updating one set of target weights in a particular layer, e.g., the target update weights 210, the technique can be performed on multiple sets of target weights to update corresponding different layers of the base network.
- the system can use a low-rank approximation to approximate an update to the base network weights during each training update of the base network, e.g., by optimizing a product of two smaller matrices in order to reduce the dimensionality of the calculation required to compute the change in weights required by the update. More specifically, performing a low-rank approximation refers to breaking up a matrix containing the base network weights identified for updating into a product of two smaller matrices that when multiplied together can recover the values of the base network weights with high fidelity.
- the low-rank decomposition can represent $W_0 + \Delta W \approx W_0 + BA$, where $\Delta W$ is the update to the base network weights $W_0$ and the product $BA$ approximates $\Delta W$.
- the rank of a matrix refers to the number of linearly independent vectors, e.g., the number of rows or columns within the matrix decomposition $BA$ that are not linear combinations of the others.
- the rank determined specifies the dimensionality of the update needed by providing a constraint on the dimensions of the two smaller matrices. For example, in the case in which $B$ is a matrix of dimension $d \times r$ and $A$ has dimension $r \times k$, where $r$ must be the same to enable the matrix multiplication, the rank $r$ can be a value much less than the minimum of $d$ and $k$, e.g., $r \ll \min(d, k)$.
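- In symbols, with the dimensions above, the decomposition and its cost are:

$$W_0 \in \mathbb{R}^{d \times k}, \qquad \Delta W \approx BA, \qquad B \in \mathbb{R}^{d \times r}, \quad A \in \mathbb{R}^{r \times k}, \quad r \ll \min(d, k),$$

so training $B$ and $A$ requires $r(d + k)$ weights instead of the $dk$ weights of a full update, e.g., about 65K rather than roughly 16.8M weights for $d = k = 4096$ and $r = 8$.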
- training the set of personalization weights can include finetuning the pretrained base network.
- the system can identify a set of target update weights 210 from the pretrained weights 200 of the base network.
- each circle represents a neuron of a neural network in a single layer, e.g., each circle is an oversimplification of a respective weight within a layer of weights in the base network.
- Example values for the target update weights 210 are shown in the subset of base weights 215. In particular, these values are the $W_0$ of $W_0 + \Delta W$.
- the identified target update weights 210 can be chosen as any subset of the pretrained weights 200.
- the subset of weights can include only a small fraction of the pretrained weights 200, e.g., 2%, 5%, 10%, or certain organizational blocks of the base network, e.g., one or more layers that function as a unit within the base network, e.g., a residual block in a ResNet, a convolutional block in a convolutional neural network, etc.
- the subset of weights can come from different organization blocks or layers of the base network.
- the target update weights 210 can be identified with respect to their location in the network, e.g., the system can identify a subset of target weights 210 in the beginning, middle, or end of the network. In the particular example depicted, the system has identified a subset of target update weights 210 of a layer of the pretrained weights 200 for updating.
- the updates $\Delta W$ for the target update weights 210 can be initialized as a decomposition into a product of two matrices, e.g., the low-rank decomposition 220, based on the rank of the target update weights 210.
- the initialized low-rank update approximation 220 provides an overview of approximating a training update to the base network using matrix 1 240 and matrix 2 250.
- matrix 1 240 can be $A$ and matrix 2 250 can be $B$ in $W_0 + \Delta W \approx W_0 + BA$. More specifically, the rank of $A$ and $B$ can be chosen such that the dimensionality of the target update weights 210 can be recovered.
- matrix 2 250 has a dimensionality of $6 \times 2$ and matrix 1 240 has a dimensionality of $2 \times 6$, such that when $B$ and $A$ are multiplied together respecting matrix multiplication rules, the result has the same dimensionality as the target update weights 210, e.g., $6 \times 6$, such that the updates $\Delta W$ can be added to the subset of base weights 215.
- Example initialized values for A and B are depicted in the initialized low-rank decomposition 220.
- the system can set one of the two matrices to zero, e.g., by setting all of the values of the matrix to zero, and can use a random Gaussian initialization or other random initialization for the values of the other matrix, e.g., by randomly sampling the values of the other matrix from a Gaussian distribution with variance $\sigma^2$.
- matrix 2 250 has been set to zero and matrix 1 240 has been drawn from a standard normal distribution with variance of one.
- the system can then update matrix 1 240 and matrix 2 250 by training with the data from the one or more software applications corresponding to the permission over a sequence of training iterations.
- the system can perform gradient updates in accordance with an objective function, e.g., based on minimizing a loss function applicable to the base network, over the sequence of training iterations in order to generate an approximation of the target update weights using the low-rank approximation.
- the target update weights 210 can be kept frozen while the system updates matrix 1 240 and matrix 2 250 to approximate the update $\Delta W \approx BA$.
- training can yield the trained matrix 1 245 and the trained matrix 2 255, which can be multiplied together to provide the trained low-rank update approximation 225.
- the trained low-rank update approximation 225 can then be added to the frozen base weights, e.g., the subset of base weights 215.
- a modified forward pass using the matrix decomposition can involve multiplying an input $x$ with the quantity $(W_0 + BA)$. More details on training a low-rank approximation are included in Hu, E., et al.: “LoRA: Low-Rank Adaptation of Large Language Models” (arXiv:2106.09685v2).
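- The following numpy sketch mirrors the FIG. 2 setup (a $6 \times 6$ frozen weight block, rank 2) with a squared-error stand-in loss; the training inputs and targets are placeholders, since in the described system the data would come from the apps and features named by the permission:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 6, 6, 2                       # dimensions matching the FIG. 2 example

W0 = rng.normal(size=(d, k))            # frozen subset of base weights (215)
B = np.zeros((d, r))                    # matrix 2 (250): initialized to zero
A = rng.normal(size=(r, k))             # matrix 1 (240): Gaussian initialization

def forward(x):
    # Modified forward pass: multiply the input with (W0 + BA); W0 stays frozen.
    return x @ (W0 + B @ A).T

lr = 1e-2
for step in range(200):
    x = rng.normal(size=(1, k))         # placeholder for permission-scoped data
    y_target = rng.normal(size=(1, d))  # placeholder target
    y = forward(x)
    grad_out = 2 * (y - y_target)       # gradient of the squared-error loss
    grad_M = grad_out.T @ x             # gradient w.r.t. (W0 + BA), shape (d, k)
    grad_B = grad_M @ A.T               # gradients flow only into B and A;
    grad_A = B.T @ grad_M               # the frozen base weights never change
    B -= lr * grad_B
    A -= lr * grad_A

W_personalized = W0 + B @ A             # trained update merged with frozen weights
```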
- the system can train matrix 1 240 and matrix 2 250 until a termination criterion is met, e.g., when a certain number of training iterations or amount of time has elapsed, or when the loss function falls below a predetermined threshold.
- the resultant trained matrix 1 245 and matrix 2 255 represent the updates to the target update weights 210 and can be multiplied together and combined with the subset of base weights 215 as the set of personalization weights.
- the combination of the updates and the target update weights 210 can be performed in parallel such that the resultant personalization weights include the updates from each subset.
- the system can also train different types of personalization weights, e.g., soft prompt personalization weights for use in a base language processing model.
- FIG. 3 demonstrates the example permission management system 100 of FIG. 1 managing a set of personalization weights for an example base language processing network.
- the personalization engine 120 can receive a permission 112 from a user device 105 indicating a sharing of data across one or more software applications for processing with a base language processing network.
- the personalization engine 120 functions as described in FIG. 1 to generate, deploy, or generate and deploy personalization weights that can be used with the base language processing network in response to a request by the user for personalized processing with the base language processing network.
- the base language processing network can be a pretrained base large language model (LLM) 345.
- the pretrained large language model can have been trained over a common language corpus and previously finetuned on a task specific dataset to specialize the large language model as a virtual assistant that users can access for a variety of purposes.
- large language models are becoming increasingly ubiquitous across software applications as AI assistants, e.g., virtual assistants that can be used to write emails, recommendations, edit photos, summarize search results, maintain a calendar, etc.
- the one or more software applications can provide the LLM 345 as a nonpersonalized assistant 340 to the user of the user device 105.
- the personalization engine 120 can enable personalization of the pretrained base LLM 345 as a personalized large language model (LLM) assistant 330 in the case that the received permission 112 specifies personalization of the LLM 345.
- the engine 120 can deploy a non-personalized LLM assistant 340 in the application, e.g., using the base LLM 345.
- the separation of personalization weights from the remainder of the base language processing network weights can enable efficient training of the personalization weights for the personalized LLM 335.
- Large language models can have billions or trillions of weights to update each training iteration.
- the system can drastically reduce the computational resources required to train the personalization weights.
- the system can use a low-rank approximation, e.g., as detailed in FIG. 2, or prompt tuning, which will be described in more detail in FIG. 4, to generate the personalization weights for the base language processing network.
- the user of user device 105 can grant a particular permission 112 to leverage data collected in an email application in order to access a personalized LLM assistant 330 for writing reviews in a map or restaurant review application.
- the user can grant permissions for the map or restaurant review application to provide personalization features based on data collected in the email application.
- the system can train personalization weights with the pretrained base LLM 345 to learn personalized writing, e.g., based on the style and format of the user, e.g., for writing reviews.
- the system can then deploy the personalization weights such that the personalized LLM assistant 330 can write reviews on behalf of the user by suggesting or generating review text, e.g., in the map or restaurant review application.
- the system can remove the personalization weights and the user can elect to use the non-personalized LLM assistant 340 to assist in writing reviews.
- the user of user device 105 can grant a permission 112 to learn a preferred summarization style from interactions with the non-personalized LLM assistant 340 in a text editor application in order to summarize search results in an internet application.
- the system can train personalization weights with a pretrained base LLM 345 to learn summarization preferences, e.g., that the user prefers bullet-point summaries over single-sentence summaries in the text editor application, and deploy the personalization weights in a personalized LLM assistant 330 that can summarize search results in the internet application according to the user’s preferences.
- the user can revoke the permission 112, and the system can remove the personalization weights.
- An example of the system removing personalization weights in response to a request from a user will be covered in more detail in FIG. 5.
- FIG. 4 is a block diagram that illustrates an example prompt tuning technique that the permission management system 100 of FIG. 1 can use to generate a set of personalization weights for an example base language processing network, e.g., the example base large language model of FIG. 3.
- the base large language model (LLM) 345 can be configured to process prompts as inputs.
- the prompts can be one or more of a question, statement, image, audio clip, etc. input by the user as a request to the virtual assistant.
- the user can prompt the base large language model by inputting the text “Summarize this news article.” with a corresponding news article, inputting an audio clip containing the question “When did the Spanish American War end?”, or submitting an image of the handwritten arithmetic problem “What is 2+2?”, etc. to generate an output as a response.
- when the base LLM 345 is deployed as a non-personalized assistant, the user will receive outputs based on the pretrained weights of the base LLM 345.
- Prompt tuning involves processing a set of example prompt-response pairs 400 to generate a set of tunable tokens 425, with different subsets of the tunable tokens 425 corresponding with each of the example prompt-response pairs 400.
- the tunable tokens 425 can then be updated during a training process over a sequence of training iterations using backpropagation, as described in further detail below.
- the tunable tokens 425 can be used as a soft prompt 440, e.g., the tunable tokens 425 can be used as the set of personalization weights.
- the soft prompt 440 can be prepended to an input prompt 452 to the base LLM 345 in order to condition the base LLM 345 output with respect to a personalization preference of the user represented by the soft prompt 440.
- the soft prompt 440 can cause the base LLM 345 to generate a conditioned output 460, e.g., the task-specific tokens can be used to tune the base LLM 345 output based on user preferences, e.g., a style, format of response, desired brevity, etc. included in the example prompts 400.
- the system can generate the soft prompt 440 from a set of example prompt-response pairs 400.
- the system can compile a set of example prompt-response pairs 400 according to a received permission for a personalized LLM, e.g., personalized with respect to data shared between one or more applications as specified by the permission.
- the example prompt-responses 400 can include prompts and responses that the user curated to demonstrate one or more preferences 410 to the base LLM 345, e.g., a desired style 412 or format 414 that the user expects responses of the personalized LLM to align with for given inputs.
- the example prompt-response pairs 400 can include a set of one or more emails, e.g., 5, 20, 50 emails, or all the emails the user has written, that the system samples from the data derived from the email application.
- the example prompt-response pairs 400 can include a set of one or more before-and-after editing photos from the social media posts that the user has curated to be used as personalization data, e.g., that the user feels demonstrates their desired style of photo editing.
- each tunable token of the soft prompt 440 can include a set of token weights that can be updated through gradient descent using the base LLM 345, e.g., for the given example prompt-response.
- the tunable tokens 425 can be learned by freezing the base LLM 345 and updating only the weights that correspond with the tunable tokens 425 over a sequence of training iterations.
- the tunable tokens 425 can be initialized from existing tokens of the base LLM 345 vocabulary using the tokenizer before training of the soft prompt 440 begins, e.g., to ensure the soft prompt 440 is restricted to the legal output classes defined by the existing tokens of the base LLM 345.
- a reserved embedding from the vocabulary can be identified based on a hard token, e.g., a token that is provided to the tokenizer of the base LLM 345, but not used, e.g., "<extra_id_0>".
- the tunable tokens 425 can be randomly initialized vectors of the same dimension as a vocabulary token that can bypass the tokenizer, e.g., can be provided directly to the base LLM 345 and learned using gradient descent.
- the set of tunable tokens 425 can be generated using a prompt encoder model, e.g., the prompt encoder model 420, e.g., a model external to the base LLM 345, that is configured to process a set of example prompt-response pairs 400 to generate the tunable tokens 425 that can be tuned to incorporate signals, e.g., patterns, from the set of examples 400 during training.
- the soft prompt model 420 can be configured to recognize the similarities between the prompt-response pairs 400 and to create one or more embeddings, e.g., tunable tokens 425, that represent the similarities.
- the tunable tokens 425 can be prepended to a training prompt that can be processed by the base LLM 345. More specifically, the pretrained LLM 345 can process the training input to generate an input prompt embedding and the system can prepend the tunable tokens 425 to the user input prompt embedding in order to provide a representation of the user’s preferences for the base LLM 345 response.
- the system can update the weights of the tunable tokens 425, e.g., by providing training updates 450 to tune the values of the tunable tokens 425 directly or to update the weights of the prompt encoder model 420 to indirectly tune the values of the tunable tokens 425.
- the training updates 450 can be based on an objective function based on a measure of the response 430 generated by the base LLM 345 aligning with the preferences demonstrated by the example responses.
- the system can train the personalization weights, e.g., the soft prompt 440, for the base LLM 345 by updating the weights of the tunable tokens 425 to maximize the probability of the conditioned output 460 based on the user’s personalization preference represented by the set of example prompt-response pairs 400. More details on using prompt tuning are included in Lester, B., et al.: “The Power of Scale for Parameter-Efficient Prompt Tuning” (arXiv:2104.08691v2).
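- A minimal numpy sketch of the mechanics described above: the soft prompt is the only trainable tensor and is prepended to the frozen input embeddings before the frozen base model runs; the sizes and names are illustrative, not from the specification:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_soft, vocab = 16, 4, 1000    # illustrative sizes

# The soft prompt (440): the only trainable weights; the base LLM stays frozen.
soft_prompt = 0.02 * rng.normal(size=(n_soft, d_model))

# Stand-in for the frozen embedding table of the base LLM.
embedding_table = rng.normal(size=(vocab, d_model))

def build_model_input(user_token_ids):
    # Prepend the tunable tokens to the user's input prompt embedding so the
    # frozen model is conditioned on the preferences the soft prompt encodes.
    prompt_embeddings = embedding_table[user_token_ids]
    return np.concatenate([soft_prompt, prompt_embeddings], axis=0)

x = build_model_input(np.array([5, 17, 42]))   # shape: (n_soft + 3, d_model)
# During training, backpropagation through the frozen LLM produces gradients
# only for soft_prompt; embedding_table and all other base weights stay fixed.
```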
- the system can store the final soft prompt 440 as the set of personalization weights.
- the soft prompt 440 can then be used to condition every output of the base LLM 345 applicable to the permission with respect to the user preference encoded by the soft prompt, e.g., a prompt 450 can include a user input 452 and the soft prompt 440.
- the system can personalize the pretrained base LLM 345 by prepending the soft prompt 440 to the user input prompt embedding during processing in order to generate the conditioned output 460.
- FIG. 5 demonstrates how an example cross-app permission management system can manage a set of personalization weights for a set of granted permissions.
- the permission management system 100 of FIG. 1 can manage the permissions 500, which have been previously received from one or more users.
- the system can use the permissions subsystem 130 to manage the received permissions for one or more software applications and for one or more features within the applications.
- the permissions subsystem 130 can organize the permissions based on applications, e.g., can manage permissions for software application 1 (app 1) 512, software application 2 (app 2) 514, software application 3 (app 3) 522, and software application 4 (app 4) 524.
- the system can manage permissions for one or more features of an application, e.g., the feature 2 516.
- the subsystem 130 can maintain additional levels of organization, e.g., suites of software applications.
- sets of one or more related software applications can be included in a suite of software applications, e.g., a collection of applications that share similar user interfaces or contain related functionality.
- suites can organize the one or more applications based on a characteristic purpose of the application, e.g., app 1 512 and app 2 514 can both be included in suite 1 510 since they are messaging applications.
- suites can organize the one or more applications based on the entity that created the applications, e.g., app 3 522 and app 4 524 can both be included in suite 2 520 since they were created by the same entity.
- data collected within one application of a software suite, e.g., app 3 522, can be more easily combined with data in another application of the same software suite, e.g., app 4 524.
- the permissions subsystem 130 can manage the one or more permissions 500, which can include one or more cross-feature permissions, e.g., the feature 2 and 4 permission 506, cross-app permissions, e.g., the app 1 and 2 permission 502 and the app 1 and 3 permission 508, or cross-feature cross-app permissions, e.g., the feature 2 and 4 permission 506.
- the subsystem 130 can manage cross-suite permissions, e.g., the feature 2 and 4 permission 506 and the app 1 and 3 permission 508.
- the permission subsystem 130 can additionally manage user requests for permission removal, e.g., the remove app 3 from suite 2 permission 504, as will be described in further detail below.
- the permissions 500 can be associated with personalization data 530, e.g., data from a user-device data cache based on the received permission.
- the system 100 can generate the personalization weights corresponding with each permission, e.g., using the training subsystem 140, by training a base network, e.g., the pretrained base network 125, for each permission 502, 506, and 508 with the associated personalization data 530.
- the system can train a low-rank approximation as the set of personalization weights, as discussed in FIG. 2, or a soft prompt as the set of personalization weights, as discussed in FIG. 4, in the case that the base network is a language processing network.
- the personalization weights can be stored in the personalization weight database 150, e.g., as LoRA weights 542 in the case that low-rank approximation was used or as soft prompt(s) 540 in the case that prompt tuning was used for training.
- the system 100 can train the personalization weights using app 1 and 2 data 532, e.g., using email and messaging application data, for the suite 1 permission 502.
- the system 100 can train personalization weights using feature 2 and feature 4 data 536, e.g., using photo editing data from a photo editing and social media application, for the permission 506.
- the system 100 can train the personalization weights using app 3 and 4 data 538, e.g., data from a messaging application in suite 1 and a messaging application in suite 2, for the permission 508.
- the system can train or generate multiple sets of personalization weights for each feature, corresponding with each different pretrained base network, using the data specified by the permission.
- a granted permission can specify training a personalized chat assistant for use in three different social media applications that each have a respective base language processing network.
- the system can train three sets of personalization weights for a cross-suite feature corresponding with three different pretrained base networks.
- the system can generate different personalization weights for different pretrained base networks using one or more projection models, e.g., in order to project the personalization weights between the parameter spaces of the different pretrained base networks.
- the system can generate the personalization weights for one or more different pretrained base networks using the personalization weights for a first pretrained base network without access to the data specified by the permission.
- the system can train one or more projection models to map between parameter spaces, e.g., a pair of models to map between the personalization weights of a first pretrained base network and the personalization weights of a second pretrained base network.
- the projection models can be distillation models.
- the system can process a set of inputs using the first network and the first personalization weights to generate a first output, can process the same set of inputs using the second network and the second personalization weights to generate a second output, and can update the personalization weights of the second network in accordance with minimizing a loss function based on the discrepancy of the first and second outputs.
- the system can train a bidirectional mapping model to map between the first and second parameter spaces.
- the system can train three bidirectional projection models to map the personalization weights between pretrained base networks, e.g., a projection model to bidirectionally map between the first and second base network parameter spaces, a projection model to bidirectionally map between the first and third base network parameter spaces, and a projection model to bidirectionally map between the second and third base network parameter spaces.
- the first and second pretrained base networks can be multi-layer perceptron networks, and the system can train a first model to process the personalization weights in the first parameter space to generate the personalization weights in the second parameter space, and a second model to process the personalization weights in the second parameter space to generate the personalization weights in the first parameter space.
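The distillation variant described above can be sketched as follows; this is a minimal illustration assuming PyTorch, and it treats the two personalized networks as callables that take an input batch and a personalization-weight tensor, which is an assumption rather than an API from the specification:

```python
# Hedged sketch of projecting personalization weights between base networks by
# distillation: the second network's personalization weights are updated to
# minimize the discrepancy between the two networks' outputs on shared inputs.
import torch
import torch.nn.functional as F

def distill_personalization(first_net, second_net, first_pw, second_pw,
                            inputs: torch.Tensor, steps: int = 100,
                            lr: float = 1e-3) -> torch.Tensor:
    second_pw = second_pw.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([second_pw], lr=lr)
    with torch.no_grad():
        target = first_net(inputs, first_pw)   # first output, held fixed
    for _ in range(steps):
        opt.zero_grad()
        pred = second_net(inputs, second_pw)   # second output
        loss = F.mse_loss(pred, target)        # discrepancy-based loss
        loss.backward()
        opt.step()
    return second_pw.detach()
```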
- the permissions subsystem 120 can also manage the removal of a permission.
- the subsystem 120 can receive a request to remove app 3 personalization from suite 2 504, e.g., to remove the impact of personalization using app 3 data from personalization weights trained from app 3 and app 4 data.
- the subsystem 120 can manage the removal of the permission 504 depending on the training technique used to generate the personalization weights.
- the subsystem 120 can delete the personalization weights and retrain the personalization weights only using data from app 4.
- since the low-rank approximation yields updated values of a subset of the base network parameters from training with app 3 and app 4 data, the signal provided by the data from app 3 cannot be isolated and removed from the personalization weights.
- the system can delete the personalization weights and retrain the soft prompt using only app 4 data.
- the system can train a set of personalization weights for a permission that subsumes another permission, e.g., for app 3 and 4, using the set of personalization weights for the subsumed permission, e.g., for app 3.
- the system can have previously trained a soft prompt or LoRA for app 3 using app 3 data, then a soft prompt or LoRA for app 3 and 4 using app 4 data and the frozen soft prompt or frozen LoRA for app 3.
- the system can recover the subsumed permission since it was frozen, e.g., did not update, during further training of the personalization weights.
- the system can recover the set of weights for the subsumed permission.
- the system can recover the set of personalization weights for the subsumed permission, e.g., for app 3, from the set of personalization weights for app 3 and 4.
- the system can revert to the frozen soft prompt for app 3, e.g., in response to a request revoking the permission for app 3 and 4 but maintaining the permission for app 3, without affecting app 3 personalization.
- the system cannot recover the soft prompt for app 4 from the soft prompt for app 3 and 4.
- the system can delete the soft prompt for app 3 and 4 and retrain the soft prompt for app 4 using app 4 data.
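The removal bookkeeping above can be summarized in a short sketch; the store keys and the `retrain_app4` helper are hypothetical:

```python
# Hypothetical sketch of revoking the app 3 and 4 permission while keeping the
# app 3 permission: the frozen app-3 soft prompt survives, but app-4-only
# personalization must be retrained from app 4 data alone.
def revoke_combined_permission(store: dict, user: str, retrain_app4) -> None:
    store.pop((user, "app3_and_app4"), None)  # delete the combined soft prompt
    # the frozen prompt at (user, "app3") is untouched and can be reused as-is
    store[(user, "app4")] = retrain_app4()    # cannot be recovered, so retrain
```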
- FIG. 6 is a flow diagram of an example process for identifying and deploying a set of personalization weights for use with a base network.
- the process 600 will be described as being performed by a system of one or more computers located in one or more locations.
- a cross-application permission management system, e.g., the permission management system 100 of FIG. 1, appropriately programmed in accordance with this specification, can perform the process 600.
- the system can receive a request for processing by a base network (step 610), and determine a particular granted permission associated with the request (step 620).
- the granted permission can specify data sharing between one or more software applications, one or more features of an application, or features of different applications.
- the granted permission can correspond with a set of personalization weights, e.g., a subset of weights of the base neural network that have been trained for use with the base network for personalized processing, e.g., in accordance with the particular granted permission.
- An example of training the set of personalization weights will be covered in more detail in FIG. 7.
- the system can then process the request using the base network in accordance with the personalization weights (step 630), e.g., using both the base network weights and the personalization weights.
- this can involve loading the set of personalization weights on the base network, such that the base network can process an input using both the set of base network weights, e.g., a pretrained set of weights, and the set of personalization weights.
- in the case that the base network is a language processing model configured to process one or more prompts as input, processing the request can involve the system prepending the set of personalization weights to each prompt input, e.g., as a soft prompt.
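Process 600 can be summarized with a short sketch; `weight_db` and `run_base` are hypothetical stand-ins for the personalization weight database 150 and the base network 125:

```python
# Hedged sketch of process 600: look up the weights for the granted permission
# associated with the request, then process with base plus personalization weights.
def handle_request(user_id: str, prompt: str, permission_id: str,
                   weight_db: dict, run_base):
    pw = weight_db.get((user_id, permission_id))  # steps 610-620
    # step 630: the frozen base weights and the personalization weights are
    # used together; for a language model, pw may instead be prepended to the
    # prompt as a soft prompt
    return run_base(prompt, personalization_weights=pw)
```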
- FIG. 7 is a flow diagram of an example process for generating and maintaining a set of personalization weights for use with a base network.
- the process 700 will be described as being performed by a system of one or more computers located in one or more locations.
- a cross-application permission management system, e.g., the permission management system 100 of FIG. 1, appropriately programmed in accordance with this specification, can perform the process 700.
- the system can receive a granted permission from a user-device (step 710), e.g., a laptop, tablet, smart-phone, smart-watch, etc., that indicates data sharing between a set of one or more software applications.
- the particular granted permission can be a cross-feature permission, e.g., permission across features within the software application, a cross-app permission, e.g., permission across software applications, or a cross-app cross-feature permission, e.g., permission across features within different software applications.
- the system can derive data from the one or more software applications specified by the particular granted permission (step 720) and generate a set of personalization weights for a base network (step 730).
- the system can derive the data needed for training the base network in accordance with the particular granted permission by accessing a cache of data that can be stored on the user-device.
- the system can obtain the data that corresponds with the relevant data for the permission, e.g., cross-app, cross-feature, cross-app and cross-feature, and train a corresponding set of personalization weights for the base network.
- the base network has been pretrained, e.g., the weights of the base network have been previously trained, such that processing with the base network without the personalization weights yields a generic functionality, e.g., in accordance with a set of base weights.
- the set of personalization weights can include a subset of the weights of a pretrained base network.
- the subset of the weights can represent a small percentage, e.g., less than 2%, 5%, or 10%, of the overall weights of the base network that can be trained as personalization weights.
- the system can use a low-rank approximation (LoRA) to train a subset of base network weights with a matrix decomposition to approximate updates to the base network.
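A minimal LoRA-style sketch of this decomposition, consistent with the initialization described in embodiment 7 below (first matrix drawn from a standard normal, second matrix zeros), is shown here; it assumes PyTorch, and the layer and rank choices are illustrative:

```python
# Hedged LoRA sketch: the update to a frozen base weight W is approximated by
# the low-rank product B @ A; only A and B are trained as personalization weights.
import torch

class LoRALinear(torch.nn.Module):
    def __init__(self, base: torch.nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():       # base weights stay frozen
            p.requires_grad_(False)
        out_f, in_f = base.weight.shape
        self.A = torch.nn.Parameter(torch.randn(rank, in_f))   # N(0, 1) init
        self.B = torch.nn.Parameter(torch.zeros(out_f, rank))  # zero init

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + x @ (self.B @ self.A).T          # W x + (B A) x
```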
- the system can use LoRA or prompt tuning, e.g., training a set of personalization weights that can be prepended to prompt inputs for the language processing model.
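Prompt tuning can likewise be sketched in a few lines; this is a minimal illustration assuming PyTorch, with the frozen language model `lm` treated as a callable over embedded inputs, and the token count and embedding size chosen arbitrarily:

```python
# Hedged prompt-tuning sketch: a set of tunable token embeddings (the soft
# prompt) is prepended to each embedded prompt input; only it is trained.
import torch

num_tokens, embed_dim = 20, 768
soft_prompt = torch.nn.Parameter(torch.randn(num_tokens, embed_dim) * 0.02)

def personalized_forward(lm, input_embeds: torch.Tensor) -> torch.Tensor:
    # input_embeds: (batch, seq_len, embed_dim)
    batch = input_embeds.shape[0]
    prefix = soft_prompt.unsqueeze(0).expand(batch, -1, -1)
    return lm(torch.cat([prefix, input_embeds], dim=1))  # lm weights frozen
```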
- the system can then process inputs to the base network using the personalization weights with the base network weights to yield personalized processing.
- the system can maintain the personalization weights (step 740).
- the system can maintain one or more sets of personalization weights for one or more users by storing the personalization weights in a personalization weight database.
- the system can then use a permissions subsystem to access personalization weights corresponding to incoming requests for processing using the base network in accordance with a particular permission.
- the system can then identify and deploy a set of personalization weights corresponding to the particular permission for use with a base network as described in FIG. 6.
- Embodiments of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, in tangibly-embodied computer software or firmware, in computer hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them.
- Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible non-transitory storage medium for execution by, or to control the operation of, data processing apparatus.
- the computer storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
- the program instructions can be encoded on an artificially-generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus for execution by a data processing apparatus.
- data processing apparatus refers to data processing hardware and encompasses all kinds of apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers.
- the apparatus can also be, or further include, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
- the apparatus can optionally include, in addition to hardware, code that creates an execution environment for computer programs, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
- a computer program which may also be referred to or described as a program, software, a software application, an app, a module, a software module, a script, or code, can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages; and it can be deployed in any form, including as a standalone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- a program may, but need not, correspond to a file in a file system.
- a program can be stored in a portion of a file that holds other programs or data, e.g., one or more scripts stored in a markup language document, in a single file dedicated to the program in question, or in multiple coordinated files, e.g., files that store one or more modules, subprograms, or portions of code.
- a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a data communication network.
- the term "engine" is used broadly to refer to a software-based system, subsystem, or process that is programmed to perform one or more specific functions.
- an engine will be implemented as one or more software modules or components, installed on one or more computers in one or more locations.
- one or more computers will be dedicated to a particular engine; in other cases, multiple engines can be installed and running on the same computer or computers.
- the processes and logic flows described in this specification can be performed by one or more programmable computers executing one or more computer programs to perform functions by operating on input data and generating output.
- the processes and logic flows can also be performed by special purpose logic circuitry, e.g., an FPGA or an ASIC, or by a combination of special purpose logic circuitry and one or more programmed computers.
- Computers suitable for the execution of a computer program can be based on general or special purpose microprocessors or both, or any other kind of central processing unit.
- a central processing unit will receive instructions and data from a read-only memory or a random access memory or both.
- the essential elements of a computer are a central processing unit for performing or executing instructions and one or more memory devices for storing instructions and data.
- the central processing unit and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
- a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks.
- a computer need not have such devices.
- a computer can be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device, e.g., a universal serial bus (USB) flash drive, to name just a few.
- Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.
- embodiments of the subject matter described in this specification can be implemented on a computer having a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer.
- Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input.
- a computer can interact with a user by sending documents to and receiving documents from a device that is used by the user; for example, by sending web pages to a web browser on a user’s device in response to requests received from the web browser.
- a computer can interact with a user by sending text messages or other forms of message to a personal device, e.g., a smartphone that is running a messaging application, and receiving responsive messages from the user in return.
- Data processing apparatus for implementing machine learning models can also include, for example, special-purpose hardware accelerator units for processing common and compute-intensive parts of machine learning training or production, i.e., inference, workloads.
- Machine learning models can be implemented and deployed using a machine learning framework, e.g., a TensorFlow framework, or a Jax framework.
- Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back-end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a frontend component, e.g., a client computer having a graphical user interface, a web browser, or an app through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back-end, middleware, or frontend components.
- the components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network (LAN) and a wide area network (WAN), e.g., the Internet.
- the computing system can include clients and servers.
- a client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
- a server transmits data, e.g., an HTML page, to a user device, e.g., for purposes of displaying data to and receiving user input from a user interacting with the device, which acts as a client.
- Data generated at the user device, e.g., a result of the user interaction, can be received at the server from the device.
- Embodiment 1 is a method comprising: maintaining a plurality of sets of personalization weights for a given user, wherein each set of personalization weights corresponds to a respective granted permission that allows data sharing between a corresponding set of one or more software applications and has been determined by training a base neural network on training data that includes data for the given user from the corresponding set of one or more software applications; receiving a request for processing by the base neural network; determining a particular granted permission associated with the request; and processing the request using the base neural network and in accordance with a set of personalization weights corresponding to the particular granted permission.
- Embodiment 2 is the method of embodiment 1, wherein processing the request using the base neural network and in accordance with the set of personalization weights corresponding to the particular granted permission comprises: processing the request using the base neural network and in accordance with (i) the set of personalization weights corresponding to the particular granted permission and (ii) a base set of weights for the base neural network that are shared between the respective granted permissions.
- Embodiment 3 is the method of any one of embodiments 1-2, further comprising: transmitting the set of personalization weights corresponding to the particular granted permission to a user-device of the given user; and loading the set of personalization weights as a subset of the base set of weights of the base neural network.
- Embodiment 4 is the method of any one of embodiments 1-3, wherein determining the particular granted permission associated with the request further comprises: identifying the set of personalization weights corresponding to the particular granted permission from the maintained plurality of sets of personalization weights.
- Embodiment 5 is the method of any one of embodiments 1-3, wherein determining the particular granted permission associated with the request further comprises: determining that the set of personalization weights corresponding to the particular granted permission is not in the maintained plurality of sets of personalization weights; and generating the set of personalization weights corresponding to the particular granted permission in response to a request from the given user.
- Embodiment 6 is the method of embodiment 5, wherein generating the set of personalization weights corresponding to the particular granted permission in response to the request from the given user comprises: receiving the particular granted permission indicative of allowing data sharing between one or more software applications; training the base neural network in accordance with the particular granted permission, wherein training comprises: deriving data from the one or more software applications specified by the particular granted permission; training the set of personalization weights corresponding to the particular granted permission while holding the base set of weights fixed; and including the set of personalization weights corresponding to the particular granted permission in the maintained plurality of sets of personalization weights for the given user.
- Embodiment 7 is the method of embodiment 6, wherein training the set of personalization weights comprises training a low-rank approximation of weights representative of updates to the base neural network using the data from the one or more software applications, wherein training the low-rank approximation of weights comprises: identifying a set of target update weights comprising a subset of the base set of weights of the base neural network; receiving a target rank specifying a dimensionality of the low-rank approximation of weights; representing an update approximation for training updates to the target update weights using matrix decomposition in accordance with the target rank, wherein the matrix decomposition comprises a first matrix comprising values drawn from a standard normal distribution and a second matrix comprising zero values; updating the first and second matrix by training with the data from the one or more software applications to perform gradient updates in accordance with an objective function; and generating an approximation of the target update weights using the low-rank approximation of weights as the set of personalization weights.
- Embodiment 8 is the method of embodiment 7, wherein the set of target update weights comprise one or more layers of the base neural network.
- Embodiment 9 is the method of any one of embodiments 6-8, wherein the base neural network comprises a base language processing neural network configured to process one or more prompts.
- Embodiment 10 is the method of embodiment 9, wherein training the set of personalization weights comprises performing prompt tuning to generate a soft prompt, wherein performing prompt tuning comprises: receiving a set of example prompt-response pairs that parameterize a specific task from the given user; processing the set of example prompt-response pairs using the base language processing network to generate a soft prompt comprising a set of tunable tokens, each comprising a set of embedding weights, wherein processing the set of example prompt-response pairs comprises: prepending the set of tunable tokens to one or more training inputs; and training each set of embedding weights in the set of tunable tokens by processing the one or more training inputs with the set of tunable tokens using the language processing model.
- Embodiment 11 is the method of embodiment 2, when dependent on the method of embodiment 10, further comprising prepending the soft prompt to a given user prompt to process the request using the base language processing neural network in accordance with the set of personalization weights.
- Embodiment 12 is the method of embodiment 11, wherein the set of example prompt-response pairs are indicative of preference of the user.
- Embodiment 13 is the method of embodiment 12, wherein the preference of the user comprises a style and format preference.
- Embodiment 14 is the method of any one of embodiments 1-13, wherein maintaining the plurality of sets of personalization weights for a given user comprises maintaining one or more of sets of per-product personalization weights and sets of per-feature personalization weights.
- Embodiment 15 is the method of any one of embodiments 1-14, further comprising: receiving a request to identify one or more previously granted permissions from the given user; and providing the corresponding one or more sets of personalization weights for the one or more previously granted permissions to the given user.
- Embodiment 16 is the method of any one of embodiments 1-15, further comprising: receiving a request specifying a previously granted permission for removal; identifying the one or more sets of personalization weights corresponding to the previously granted permission for removal; and updating the plurality of sets of personalization weights for the given user in accordance with the specified removal.
- Embodiment 17 is the method of embodiment 16, wherein updating the maintained plurality of sets of personalization weights in accordance with the specified removal comprises: removing the corresponding one or more sets of personalization weights from the maintained plurality of sets of personalization weights.
- Embodiment 18 is the method of embodiment 17, when dependent on the method of embodiment 7, further comprising retraining the low-rank approximation of weights for a corresponding set of remaining granted permissions in accordance with the updated set of permissions.
- Embodiment 19 is the method of embodiment 18, when dependent on the method of embodiment 10, wherein removing the corresponding one or more sets of personalization weights comprises removing the soft prompt corresponding with the specified removal permission.
- Embodiment 20 is the method of any one of embodiments 1-19, wherein maintaining a plurality of sets of personalization weights for the given user further comprises retraining a first set of personalization weights for the given user periodically using snapshot data, wherein the snapshot data comprises an accumulation of data from at least two software applications corresponding with the first set of personalization weights collected in accordance with one or more threshold criteria.
- Embodiment 21 is a system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform the method of any one of embodiments 1 to 20.
- Embodiment 22 is a computer storage medium encoded with a computer program, the program comprising instructions that are operable, when executed by data processing apparatus, to cause the data processing apparatus to perform the method of any one of embodiments 1 to 20.
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating, maintaining, and deploying a set of personalization weights for use with a base neural network. In one aspect, a system maintains a plurality of sets of personalization weights for a given user, each set of personalization weights corresponding to a respective granted permission that allows data sharing between a corresponding set of one or more software applications and having been determined by training a base neural network on training data that includes data for the given user from the corresponding set of one or more software applications; receives a request for processing by the base neural network; determines a particular granted permission associated with the request; and processes the request using the base neural network and in accordance with a set of personalization weights corresponding to the particular granted permission.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2024/014293 WO2025165372A1 (fr) | 2024-02-02 | 2024-02-02 | Maintien et déploiement de poids de personnalisation pour un réseau de neurones artificiels de base sur la base d'autorisations de données |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025165372A1 true WO2025165372A1 (fr) | 2025-08-07 |
Family
ID=90364181
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2024/014293 Pending WO2025165372A1 (fr) | 2024-02-02 | 2024-02-02 | Maintien et déploiement de poids de personnalisation pour un réseau de neurones artificiels de base sur la base d'autorisations de données |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025165372A1 (fr) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP4002231A1 (fr) * | 2020-11-18 | 2022-05-25 | Telefonica Digital España, S.L.U. | Apprentissage fédéré en tant que service |
| US20230325725A1 (en) * | 2022-04-12 | 2023-10-12 | Google Llc | Parameter Efficient Prompt Tuning for Efficient Models at Scale |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24711031; Country of ref document: EP; Kind code of ref document: A1 |