WO2025114004A1 - System and method using a large language model - Google Patents
System and method using a large language model
- Publication number
- WO2025114004A1 WO2025114004A1 PCT/EP2024/082204 EP2024082204W WO2025114004A1 WO 2025114004 A1 WO2025114004 A1 WO 2025114004A1 EP 2024082204 W EP2024082204 W EP 2024082204W WO 2025114004 A1 WO2025114004 A1 WO 2025114004A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- language model
- request
- probability
- result
- answer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
- G06F40/35—Discourse or dialogue representation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/253—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/098—Distributed learning, e.g. federated learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/0985—Hyperparameter optimisation; Meta-learning; Learning-to-learn
Definitions
- the invention relates to a system and a method in which a result is generated from a requirement formulated in natural language using a large language model.
- LLMs Large language models offer a range of possible applications. For example, they can be used to select the right component for a given scenario from a range of available components in a complex industrial environment using natural language input.
- the components can be automation components.
- a system that includes the following basic components:
- the user is provided with an input mask (graphical user interface, GUI) that has a text input field and a text output field as well as a confirmation or submit button (e.g., a "Submit" button).
- a backend that uses a large language model is invisible to the user.
- a user describes their application in natural language in the text input field. After clicking the submit button, the content of the text input field is forwarded to the Large Language Model.
- the Large Language Model processes the input (infers about it) and provides its output in the text output field of the input mask.
- the output text can generally fall into three categories.
- the first category is an indication of an incorrect input.
- the second category includes output texts that contain an indication of missing information in the input.
- the third category includes output texts that contain the actually desired information, for example a selection from the available automation components appropriate for the input.
- the input texts leading to these categories each exhibit one of the following three properties: the input text may have no connection to an application description, for example a spam request or a completely incorrect input; it may contain an overly abstract form of the application description; or it may describe the application sufficiently.
- Inference processing, i.e., the evaluation of input text by the Large Language Model, requires considerable hardware resources on the backend and causes significant energy consumption. Depending on the load, users may have to wait a long time for a response. Spam requests also add to the load and resource consumption.
- the object of the present invention is to create a system with a large language model for use in the described technical field, in which the use of resources and energy consumption are reduced.
- a further object is to specify an operating method for such a system.
- the inventive system for providing a result for a requirement formulated in natural language comprises a first large language model.
- the first large language model is configured to generate the result from the requirement.
- the system further comprises a second language model, which requires fewer resources than the first and is designed to generate a two-valued response from the request.
- the system is further configured to forward the request to the first large language model if the two-valued response takes the first of its two possible values, and to provide a definable warning as the result of the request if it takes the second.
- a first large language model and a second large language model which has reduced resources compared to the first, are provided.
- the request is forwarded to the second language model, and the second language model generates a two-valued response therefrom.
- the request is forwarded to the first large language model, which generates a result from the request.
- a definable answer is provided as the result of the request.
- the second language model is resource-reduced, for example by using a reduced number of neurons (layers), i.e., parameters, and is designed to provide only a two-valued answer, for example "sufficient" and "insufficient".
- if the answer is "not sufficient", the request does not reach the first large language model at all. Instead of a result that the large language model would generate specifically for the request, a fixed answer or warning is issued. This is just as helpful for the user, but requires considerably fewer resources to create because it is predetermined.
- Language models are artificial neural networks with a large number of parameters. They are designed to generate a natural language response from a natural language input by repeatedly predicting the most likely next word in the response. Large language models are those that have a very large number of parameters (> 10⁹).
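The generation loop described above can be illustrated with a deliberately tiny sketch. The bigram probability table and the vocabulary below are invented purely for illustration and bear no relation to any real large language model:

```python
# Toy illustration of autoregressive generation: repeatedly pick the
# most likely next word given the previous one. The probability table
# is an invented example, not real model data.
NEXT_WORD_PROBS = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"robot": 0.7, "line": 0.3},
    "robot": {"moves": 0.8, "<end>": 0.2},
    "moves": {"<end>": 1.0},
}

def generate(max_words: int = 10) -> list[str]:
    words = []
    current = "<start>"
    for _ in range(max_words):
        candidates = NEXT_WORD_PROBS.get(current, {"<end>": 1.0})
        # greedy decoding: always take the most probable next word
        current = max(candidates, key=candidates.get)
        if current == "<end>":
            break
        words.append(current)
    return words
```

A real model conditions each prediction on the entire preceding text rather than only the previous word, but the repeated next-word prediction loop is the same.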
- the request can be forwarded to the second language model with a first probability different from 1, and to the first large language model with the second, complementary probability.
- the second language model can have a number of parameters in the range from 10⁸ (100 million) to 10⁹ (1 billion). Preferably, a second language model with between 200 million and 400 million parameters is used.
- the ratio of the number of parameters of the first large language model to that of the second language model can be at least 5:1, in particular at least 10:1 or at least 20:1.
- the second language model is appropriately adapted for the task of the two-valued answer in that the output layer of the already trained second language model permits only a classification, for example by having two output neurons.
- the system may include a device for determining a metric for system utilization. This is typically a software component.
- the metric is preferably incorporated into the probability. This can ensure that, during periods of low utilization, even potentially inappropriate requests are processed by the first large language model, thus providing responses that go beyond the specified warning.
- the probability is preferably determined as the product of the utilization metric and a definable upper limit for the probability.
- the definable upper limit ensures that, during operation, a portion of the requests is always directed to the first large language model.
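The product rule for the forwarding probability can be sketched as follows; the names `u_sys` and `p_max` follow the description above, and the numeric values in the usage comment are illustrative assumptions:

```python
def forwarding_probability(u_sys: float, p_max: float) -> float:
    """Probability of routing a request to the resource-reduced
    second language model, computed as the product of the current
    utilization u_sys (in [0, 1]) and a definable upper limit p_max.
    """
    if not 0.0 <= u_sys <= 1.0:
        raise ValueError("utilization must lie in [0, 1]")
    return u_sys * p_max

# At zero load the probability is 0, so every request reaches the
# first large language model; even at full load, a share of at
# least 1 - p_max of the requests still does.
```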
- the system may have a user interface that can be output on a data processing device and includes an input option for the request and an output option for the result generated from the request.
- Data processing devices include, for example, computers, tablets and Smartphones. Input can be written (typed) or verbal (dictated).
- the system can comprise a device for natural language processing (NLP) designed to generate a two-valued classification for a result that the first large language model generates from the request, wherein the system is designed to feed the request as an input value and the classification as the correct output value to the second language model for training.
- the classification by the device expediently corresponds to the two-valued answer to be generated by the second language model.
- the device can itself be a large language model. Preferably, however, it is a device that searches the result for definable pieces of text, for example "insufficient", "not clear", "cannot" and others that indicate an unusable request in the result.
- Figure 1 shows a system for selecting a suitable component from automation technology for an application description in natural language
- Figure 2 shows the system in a training configuration.
- Figure 1 shows a system 10 for selecting a suitable component from automation technology for an application description in natural language.
- the component may be a control system for a production line in a manufacturing plant.
- the component may be a suitable robot for an activity in a manufacturing plant.
- the system 10 described below helps users make such a selection.
- the system 10 includes a user interface 120, which is used for interaction with users and thus for entering the application description in natural language.
- the user interface is expediently displayed on the screen of a computing device.
- the computing device can be a tablet, smartphone, or a PC.
- the user interface 120 can be implemented as a program, i.e., run on the computing device, but can also be a web interface, i.e., simply displayed in the style of a website.
- the input is conveniently made in writing in an input field 121 provided for this purpose. It is understood that the input can be made not only via keyboard but also as dictation, which is converted into text.
- the user interface 120 further includes a submit button 122. If the user is satisfied with the request text, they can submit the request text by pressing the button 122. This transmits the request text to a backend 100.
- the backend 100 processes the request text and generates a result.
- the backend 100 is typically not implemented locally on the computing device, as this would require too many resources, but is instead connected to the computing device via the Internet.
- the request text is transmitted to the backend 100 via the Internet.
- the backend 100 includes a first Large Language Model 105.
- Large Language Models are well known and are based on very large neural networks with numbers of weights running into the billions. They are trained on extensive collections of language data, typically drawn from the Internet and other written sources.
- the backend 100 further comprises a second language model 110.
- the second language model 110 has reduced resource requirements compared to the first large language model 105, i.e., it is designed such that its operation places less strain on the hardware and/or results in lower energy consumption. This can be achieved, for example, by the second language model 110 using a smaller number of artificial neurons.
- the second language model 110 can be equipped with only 80% or only 60% of the neurons of the first large language model 105.
- the backend 100 also includes a software component that operates as a switch 115. During operation, an incoming request text is forwarded to the switch 115.
- the switch 115 is connected to a device 116 for determining utilization, which is expediently also a software component.
- the device 116 determines the utilization u_sys of a definable part of the backend 100, for example the first large language model 105.
- the utilization u_sys is usually output as a number between 0 and 1, where 0 stands for no utilization and 1 for full utilization. The value determined for the current utilization u_sys is available to the switch 115 as an input value.
- the existing request text is forwarded with probability p to the second language model 110 and with the complementary probability 1 - p to the first large language model 105.
- a decision about forwarding is then made based on the probability p. This is usually done by determining a random number between 0 and 1 and comparing it with the probability p.
- if the utilization is zero, the request text is forwarded to the first Large Language Model 105 and processed. However, if the load is non-zero, a portion of the incoming request texts is forwarded to the second Language Model 110.
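The behavior of the switch 115 can be sketched as below. The value of `P_MAX` and the returned route labels are illustrative assumptions; the actual model calls are stand-in return values, not real inference:

```python
import random

P_MAX = 0.8  # definable upper limit for p; value chosen arbitrarily here

def route_request(request_text: str, u_sys: float, rng=random.random):
    """Sketch of the switch 115: compute p = u_sys * P_MAX, draw a
    random number in [0, 1), and compare it with p to decide where
    the request text is forwarded."""
    p = u_sys * P_MAX
    if rng() < p:
        return ("second_model", request_text)  # screening path
    return ("first_model", request_text)       # full LLM path
```

At `u_sys = 0` the draw can never fall below `p = 0`, so every request takes the full LLM path, matching the zero-load case described above.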
- the second language model 110 operates with reduced resources compared to the first large language model 105. It is also designed to generate a two-valued result, i.e. to output exactly one first or second answer option 111, 112 for each request text.
- the first answer option 111 is essentially “OK”.
- the second answer option 112 is essentially “not OK”. It is understood that the concrete output form for the two answer options is practically arbitrary, as long as it is used consistently. With an appropriately designed neural network, the output can consist of the value of a single output neuron. If the second language model 110 outputs a text, for example, a 1 can be output for “OK” and a 0 for “not OK”, or exactly the texts “OK” and “not OK”.
- the second language model 110 therefore works as a classifier for the request texts. Since the second Language Model 110, in contrast to the first Large Language Model 105, only has to generate a two-valued answer and not a factually correct answer text in natural language, it can be implemented in a resource-saving manner.
- the request text is forwarded to the first large language model 105 and processed.
- the processing corresponds to that which occurs when the request text is forwarded directly from the switch 115 to the first large language model 105.
- the first large language model 105 prepares a response to the request text, which is displayed as text output 123 in the user interface 120.
- the request text is not forwarded to the first large language model 105 and is also not processed. Instead, in this case, a defined warning 124 is displayed in the user interface 120, informing the user that the request text is not sufficient to generate a suitable response.
- the defined warning 124 is therefore not a response generated by a large language model, but is already defined and stored. It is therefore also not adapted to the request text.
- the second language model 110 is trained as shown in Figure 2.
- the system 10 that is also shown in Figure 1 is used again.
- Corresponding elements are designated by the same reference symbols as in Figure 1.
- at least part of the training for the second language model 110 is carried out using request texts that are processed in the system 10. While prefabricated request texts can in principle also be used, Figure 2 shows training with real request texts, i.e., request texts created by users.
- the first Large Language Model 105 produces a response to the request text, which is displayed as text output 123 in the user interface 120. Furthermore, the response of the first large language model 105 is also passed on to a device 210.
- the device 210 is a device for natural language processing which is designed to carry out a classification for the response of the first large language model 105. The classification corresponds in its result to that carried out by the second language model 110, but here it takes place using the response of the first large language model 105 instead of the request text.
- the two response options 211, 212 of the device 210 therefore correspond to the response options "OK" and "not OK".
- the device 210 can, for example, itself be a large language model. Alternatively, and much more simply, the device 210 searches the output of the first large language model 105 for specific keywords, such as "cannot" or "insufficient". This search and a corresponding keyword catalog require minimal resources compared to a large language model.
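A keyword-based variant of the device 210 might look like the sketch below; the keyword catalog is an illustrative assumption, not the patent's actual list:

```python
# Hedged sketch of a keyword-based device 210: scan the first model's
# response for phrases indicating an unusable request. The catalog
# below is invented for illustration.
FAILURE_KEYWORDS = ("cannot", "insufficient", "not clear")

def classify_response(response: str) -> str:
    """Return 'not OK' if any failure keyword occurs in the response,
    else 'OK' (the two answer options 212 and 211, respectively)."""
    text = response.lower()
    if any(keyword in text for keyword in FAILURE_KEYWORDS):
        return "not OK"
    return "OK"
```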
- the answer option 211, 212 that the device 210 produces for the request text is passed on to a software component for labeling 215. There, the answer option 211, 212 is assigned to the request text. The pairs of request text and answer formed in this way are transmitted to the second language model 110 as training data in a step 220.
- the respective request text serves as the input variable, since the second language model 110 always processes request texts as its input.
- the answer option 211, 212 serves as the correct output variable because it corresponds to the correct result to be delivered by the second language model 110.
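The labeling step 215/220 can be sketched end to end as follows. The keyword classifier is a stand-in for the device 210, and all names and keyword choices are illustrative assumptions:

```python
def keyword_classifier(response: str) -> str:
    # Stand-in for device 210; the keyword list is illustrative.
    bad = ("cannot", "insufficient", "not clear")
    return "not OK" if any(k in response.lower() for k in bad) else "OK"

def build_training_pairs(requests_and_responses):
    """Sketch of the labeling component 215: pair each request text
    (input variable) with the classification of the first model's
    response (correct output variable), yielding training data for
    the second language model."""
    return [
        (request, keyword_classifier(response))
        for request, response in requests_and_responses
    ]
```

Each resulting pair supervises the second model to predict, from the request text alone, whether the first model would produce a usable answer.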
- the second Language Model 110 can thus be trained both in advance and during operation with relatively little effort, which in turn reduces the system's resource consumption.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Evolutionary Computation (AREA)
- Data Mining & Analysis (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Machine Translation (AREA)
Abstract
A second language model is added to a system for providing a result for a request formulated in natural language by means of a first large language model. The second language model is resource-reduced compared to the first large language model and is designed to generate a two-valued response from the request. For one of the two possible responses, a predetermined answer is provided in response to the request instead of a response from the first large language model.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| DE102023211799.1 | 2023-11-27 | ||
| DE102023211799.1A DE102023211799A1 (de) | 2023-11-27 | 2023-11-27 | System und Verfahren mit einem Large Language Model |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025114004A1 true WO2025114004A1 (fr) | 2025-06-05 |
Family
ID=93651148
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/EP2024/082204 Pending WO2025114004A1 (fr) | 2023-11-27 | 2024-11-13 | Système et procédé utilisant un grand modèle de langage |
Country Status (2)
| Country | Link |
|---|---|
| DE (1) | DE102023211799A1 (fr) |
| WO (1) | WO2025114004A1 (fr) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20250291934A1 (en) * | 2024-03-18 | 2025-09-18 | Logistics and Supply Chain MultiTech R&D Centre Limited | Computer Implemented Method of Evaluating LLMs |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11152009B1 (en) * | 2012-06-20 | 2021-10-19 | Amazon Technologies, Inc. | Routing natural language commands to the appropriate applications |
-
2023
- 2023-11-27 DE DE102023211799.1A patent/DE102023211799A1/de active Pending
-
2024
- 2024-11-13 WO PCT/EP2024/082204 patent/WO2025114004A1/fr active Pending
Patent Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11152009B1 (en) * | 2012-06-20 | 2021-10-19 | Amazon Technologies, Inc. | Routing natural language commands to the appropriate applications |
Non-Patent Citations (3)
| Title |
|---|
| CHEN LINGJIAO ET AL: "FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance", 9 May 2023 (2023-05-09), XP093249305, Retrieved from the Internet <URL:https://arxiv.org/pdf/2305.05176> * |
| ONG ISAAC ET AL: "RouteLLM: An Open-Source Framework for Cost-Effective LLM Routing | LMSYS Org", 1 July 2024 (2024-07-01), XP093249279, Retrieved from the Internet <URL:https://lmsys.org/blog/2024-07-01-routellm/> * |
| SHNITZER TAL ET AL: "Large Language Model Routing with Benchmark Datasets", 27 September 2023 (2023-09-27), XP093249289, Retrieved from the Internet <URL:https://arxiv.org/pdf/2309.15789> * |
Also Published As
| Publication number | Publication date |
|---|---|
| DE102023211799A1 (de) | 2025-05-28 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 24812709 Country of ref document: EP Kind code of ref document: A1 |