WO2025111610A1 - Artificial intelligence systems and methods for patient charts - Google Patents
- Publication number
- WO2025111610A1 (international application PCT/US2024/057340)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- computing device
- data objects
- clinical
- client computing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/60—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Definitions
- the present disclosure relates to the use of large language models (LLMs), and more particularly, to the use of an LLM to generate structured data from unstructured documents in a clinical health setting.
- structured clinical registries allow for automated back-end data processing through various applications (e.g., data visualization applications, auditing or reporting applications, real-time viewing applications for physicians, nurses, and other staff, etc.).
- structured clinical registries may be searched or parsed to identify, from a multiplicity of patients in the clinical registry, one, two, three or more patients to form a cohort for a clinical trial based upon the stored information for each patient (e.g., a group of patients may be chosen for a clinical trial based upon having a shared clinical diagnosis and comparable attributes).
- the clinical data is likely structured differently from the clinical registry (e.g., the recorded clinical data may contain fields that do not match one-to-one with the clinical registry). Populating a clinical registry on behalf of one or more patients requires focused effort to convert the usually unstructured clinical data into structured entries into the clinical registry.
- the present disclosure proposes a use of a large language model (i.e., one or more large language models (LLMs)) to extract structured information from unstructured clinical data (e.g., typed or handwritten freeform clinical notes or reports).
- the LLM may identify data elements contained within the input, identify locations in a structured database (e.g., a clinical registry) corresponding to the identified data elements, and populate the corresponding structured database locations with the corresponding identified data elements.
- a user may specify a particular set of unstructured clinical data (e.g., one or more case files, documents, pages, paragraphs, etc.), request a particular one or more data elements, and use the LLM to retrieve the requested one or more data elements from the unstructured clinical data, e.g., by applying the LLM to the unstructured clinical data, or, if the structured database has already been populated with the requested one or more data elements (e.g., as a result of prior requests for the data element(s)) automatically retrieving the requested one or more data elements from the structured database.
- the LLM and the structured database populated therewith may be used, for example, to identify patient cohorts for clinical trials, query the populated structured database (e.g., via further LLM techniques), and supply functionality to various clinical applications.
- functionality of the LLM may be made accessible via a user-facing application (i.e., one or more applications), which may be accessible via mobile devices, laptop computers, desktop computers, and/or other devices of medical personnel such as physicians, nurses, residents, consultants, data extractors, etc.
- Medical personnel may use the LLM to convert any unstructured clinical data into corresponding structured database entries, for example by providing images or electronic documents containing the unstructured clinical data to the LLM.
- medical personnel may provide natural language queries or other queries to the LLM to return various clinical information from the unstructured clinical notes and/or the structured database.
- a computer-implemented method may be provided, the method being implemented via one or more processors.
- the computer-implemented method may include (1) obtaining one or more data objects defining clinical data associated with at least one patient, the one or more data objects not matching a structure of a structured database, (2) analyzing the one or more obtained data objects using one or more large language machine learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database, (3) populating the structured database with the one or more extracted data elements; and (4) providing an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device.
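The four claimed steps can be sketched end to end. The following is a minimal, hedged illustration only: a trivial keyword scan stands in for the LLM extraction step, an in-memory dictionary stands in for the structured database, and all field and function names are hypothetical, not part of the disclosure.

```python
# Illustrative sketch of the four-step method; not the disclosed implementation.
REGISTRY_FIELDS = ["patient_id", "diagnosis", "procedure", "cpt_code"]

def extract_elements(unstructured_text):
    # (2) Placeholder for the LLM call: a keyword scan stands in for the
    # model so that the sketch stays self-contained and runnable.
    elements = {}
    for field in REGISTRY_FIELDS:
        marker = field + ":"
        for line in unstructured_text.splitlines():
            if line.lower().startswith(marker):
                elements[field] = line.split(":", 1)[1].strip()
    return elements

def ingest(note, database):
    # (1) obtain a data object, (2) analyze/extract, (3) populate.
    elements = extract_elements(note)
    patient = elements.get("patient_id", "unknown")
    database.setdefault(patient, {}).update(elements)
    return elements

def query(database, patient, field):
    # (4) provide an indication of extracted elements in response to a query.
    return database.get(patient, {}).get(field)

db = {}
ingest("patient_id: P-17\ndiagnosis: appendicitis\nprocedure: appendectomy", db)
print(query(db, "P-17", "diagnosis"))  # appendicitis
```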
- the method may include additional, fewer, and/or alternate actions, including actions described herein.
- one or more computer readable media may be provided.
- the one or more computer readable media may store computer-executable instructions that, when executed via one or more processors of one or more computers, cause the one or more computers to (1) obtain one or more data objects defining clinical data associated with at least one patient, the one or more data objects not matching a structure of a structured database, (2) analyze the one or more obtained data objects using one or more large language machine learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database, (3) populate the structured database with the one or more extracted data elements, and (4) provide an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device.
- the one or more computer readable media may store additional, fewer, and/or alternate instructions, including instructions described herein.
- a computing system may be provided.
- the computing system may include one or more processors, and one or more memories (e.g., non-transitory memories) having stored thereon computer-executable instructions.
- the computer-executable instructions when executed by the one or more processors, may cause the computing system to (1) obtain one or more data objects defining clinical data associated with at least one patient, the one or more data objects not matching a structure of a structured database, (2) analyze the one or more obtained data objects using one or more large language machine learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database, (3) populate the structured database with the one or more extracted data elements, and (4) provide an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device.
- the computing system may include additional, fewer, and/or alternate components, including components described herein.
- the computing system may be configured to perform additional, fewer, and/or alternate actions, including
- FIG. 1 depicts an example computing environment in which techniques of the present description may be implemented, in accordance with some embodiments
- FIG. 2 depicts a block diagram of example flows of data, in accordance with some embodiments
- FIG. 3 depicts a flow diagram of still other example flows of data, in accordance with some embodiments.
- FIG. 4 depicts a block diagram of an example computer-implemented method, in accordance with some embodiments.
- Systems and methods of the present disclosure relate, inter alia, to use of a large language model (LLM, i.e., one or more large language models) to extract structured database entries (e.g., entries of a clinical registry) from unstructured clinical data, for example in response to natural language queries and other requests for particular structured data from unstructured documents, such as handwritten or typewritten notes, clinical reports, and/or other clinical data not aligning with the structure of the structured database (e.g., a clinical data registry).
- the LLM of the present disclosure may receive, as input, clinical data including unstructured clinical notes and/or other clinical information not matching the structure of an established structured database. Based upon the received input, the LLM identifies data elements contained therein, and matches the data elements to corresponding fields and patients in the structured database. The LLM then may appropriately populate the structured database using the corresponding identified data elements from the received clinical data, and may return relevant information to a user in response to a natural language query or database query, for example.
- LLM-based functionalities to receive queries and/or to populate the structured database from unstructured clinical data may be provided, for example, at various client electronic computing devices associated with medical personnel such as physicians, nurses, medical residents, consultants, dedicated data extractors, and/or other qualified/authorized personnel.
- these client computing devices may include desktop computers, laptop computers, tablets, smartphones, and/or smart wearable devices.
- medical personnel may, for example, capture one or more images of handwritten clinical notes or printed reports, upload locally stored computerized records (e.g., typed notes or other reports), or export electronic documents from another one or more applications executing at the client computing device.
- the LLM intelligently interprets the unstructured clinical data and populates clinical information into appropriate fields according to the predefined structure of the structured database.
- the unstructured clinical notes and the data elements identified therein can be applied for various purposes.
- the structured database and/or LLM support querying for various applications.
- various computer applications execute database queries (e.g., SQL queries) of the structured database to obtain populated clinical information contained therein.
- Queried clinical information may be used, for example, for charts, graphs, and/or other visualizations created/managed by data visualization applications (e.g., Tableau), or for reports generated/managed by clinical data auditing/reporting applications (e.g., REDCap).
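As a hedged illustration of the kind of back-end database query such applications might execute, the following uses SQLite in place of whatever engine a production clinical registry would use; the table layout and column names are assumptions for the sketch.

```python
# Sketch only: SQLite stands in for a production registry engine, and the
# `registry` table schema is an illustrative assumption.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE registry (
    patient_id TEXT, field TEXT, value TEXT, recorded TEXT)""")
conn.executemany(
    "INSERT INTO registry VALUES (?, ?, ?, ?)",
    [("P-17", "diagnosis", "appendicitis", "2024-11-20"),
     ("P-17", "cpt_code", "44950", "2024-11-21"),
     ("P-42", "diagnosis", "cholecystitis", "2024-11-22")])

# E.g., a reporting application pulling all populated fields for one patient.
rows = conn.execute(
    "SELECT field, value FROM registry WHERE patient_id = ? ORDER BY field",
    ("P-17",)).fetchall()
print(rows)  # [('cpt_code', '44950'), ('diagnosis', 'appendicitis')]
```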
- the structured database and LLM support natural language queries from users.
- medical personnel may provide natural language input (e.g., via speech or text) prompting the structured database for particular information associated with one or more patients (e.g., “show me the patient’s most recent radiology report,” “show me blood work reports within four days of [a particular procedure, e.g., surgery] performed by [a particular personnel, e.g., surgeon],” “show me vital measurements for each new patient to whom I have been assigned in the past three days,” or “show me the patient’s notes from immediately after the patient’s MRI procedure”).
- the LLM may responsively incorporate the specified source(s) into the structured database, and generate a response to the query and to future queries for the same or similar information.
- the LLM may provide output in the form of particular clinical parameters and/or the context of the documents in which the parameters appear (e.g., to output a paragraph, page or entirety of a handwritten note containing an element of information requested by the user).
- Natural language queries to the LLM may, in some embodiments, be phrased in the form of questions, in which case the query may include options for answers to be provided via the LLM (e.g., “was the patient’s [location] nerve preserved during [procedure]? Options: yes/no/partially”).
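A question-with-options query of this kind might be assembled into a model prompt along the following lines; the template and helper name are illustrative assumptions, not the disclosed prompt format.

```python
# Hypothetical helper for phrasing a question-style query with answer options.
def options_prompt(question, options, context):
    return (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        f"Answer with exactly one of: {', '.join(options)}."
    )

p = options_prompt(
    "Was the patient's facial nerve preserved during parotidectomy?",
    ["yes", "no", "partially"],
    "Op note: facial nerve identified and preserved throughout.")
print(p.splitlines()[-1])  # Answer with exactly one of: yes, no, partially.
```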
- the LLM may return as output the answer(s) to the question(s), e.g., via visual and/or audio output.
- a client application(s) supporting the LLM additionally returns a clinical document or portion thereof indicating the answer(s) to the question(s), such that the user may view the context of the answer(s).
- the user may refine the previously submitted query to thereby narrow the LLM output further from among the initially returned results.
- natural language queries to the LLM may include any one or more of various filters for information contained in (or requested to be obtained/stored in) the structured database, including but not limited to patient, hospital, hospital wing, room, type of procedure, portion of the body, symptom, diagnoses or disorder, medical personnel involved, date/time, current procedural terminology (CPT) code, and/or other medically relevant parameters in accordance with data maintained by the structured database (e.g., a clinical registry).
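The filter-based narrowing described above can be sketched as a simple attribute match over previously returned results; the record fields used here (patient, report type, date) are hypothetical stand-ins for the listed filter parameters.

```python
# Illustrative refinement of query results by attribute filters.
def refine(results, **filters):
    # Narrow previously returned results by additional attribute filters,
    # mirroring the iterative query refinement described above.
    for key, wanted in filters.items():
        results = [r for r in results if r.get(key) == wanted]
    return results

reports = [
    {"patient": "P-17", "type": "radiology", "date": "2024-11-20"},
    {"patient": "P-17", "type": "blood_work", "date": "2024-11-21"},
    {"patient": "P-42", "type": "radiology", "date": "2024-11-22"},
]
first = refine(reports, patient="P-17")     # all of one patient's reports
second = refine(first, type="radiology")    # narrowed by report type
print(len(first), len(second))  # 2 1
```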
- the LLM may prompt the user for additional information to refine the output of the LLM.
- the determination of what query refinements or clarifications are required from the user may be generated by the LLM based upon the particular initial output provided by the LLM.
- the interactive input/output between the LLM and the user may produce a Chatbot functionality by which the user uses the LLM to request and refine selections of unstructured data objects (e.g., notes, clinical reports, etc.) to be incorporated into the structured database and provided to the user, and/or requests access to information from the unstructured data that has already been incorporated into the structured database.
- a user may provide one or more natural language queries (or database queries) to the LLM via user-facing application(s) to summon any particular data already contained in the structured database, and/or specify sources to be incorporated into the structured database and data from which to be provided to the user (contingent upon authorization of the user to access the requested data).
- the user may for example, cause data to be added, modified, and/or removed from the structured database (e.g., to thereby modify or maintain a clinical registry).
- the structured database generated and/or modified via the LLM of the present disclosure may be utilized to identify cohorts of patients for selection in clinical trials. That is, clinical information extracted from unstructured clinical data may include information identifying eligibility of respective patients for clinical trials (e.g., pathology findings, biomarkers, etc.), as well as commonalities and/or contrasts between groups of patients that may be used to formulate two or more sub-groups of patients for involvement in the clinical trial. Users retrieving one or more patients for cohort selection may view and/or modify selections generated from the structured database to thereby confirm or refine cohort selection.
- clinical information extracted via the LLM from unstructured clinical data may be utilized by medical personnel to identify other treatments/therapies for a patient based upon data from the structured database.
- a natural language query to the structured database via the LLM may, for example, include “has the patient previously been prescribed [a given drug] for [a given disorder]?” or “is the patient eligible for treatment using [a given therapy]?”
- the LLM described herein may be used to generate an entirely new structured database, e.g., upon applying the LLM to unstructured data from a new entity such as a new hospital, urgent care facility, governmental organization, etc. Particularly, upon defining a multiplicity of structured data fields that will make up the new structured database, the LLM can then be applied to extract corresponding structured data from unstructured clinical data managed by the entity and populate the corresponding structured data into the new structured database.
- Integration of an LLM into a structured database and applications associated therewith presents a number of improvements over traditional techniques associated with generating, populating, and maintaining structured databases.
- use of the LLM of the present disclosure significantly increases the amount of data that can feasibly be input to the structured database from unstructured clinical data (i.e., freeform clinical data and/or other clinical data not matching a structure of the structured database), without requiring medical personnel to granularly parse the unstructured clinical data.
- accuracy of the intake of clinical information into the structured database is improved.
- preliminary experimental work demonstrates that an LLM described herein may extract at least 80% of structured database fields from unstructured clinical data with an accuracy of between 90% and 95% at a speed of one data element per second. Further experimentation is expected to produce still further improvements to these performance metrics.
- the improved volume and accuracy of intake of information into the structured database improves the usability of the structured database for various applications described herein (e.g., using clinical registries and/or other structured databases for cohort identification for clinical trials and/or treatment/therapy identification, use of clinical information for data visualization or auditing/reporting, etc.).
- the user-facing natural language query functionalities described herein may allow for user-friendly, intuitive, and (in some cases) hands-free querying of the structured database to obtain various relevant information associated with a patient or a medical practice.
- the techniques herein improve the speed with which the relevant information can be queried, by allowing the user to specify sources of information (e.g., clinical records, notes, or portions thereof), and by automatically incorporating the sources into the structured database and sourcing information therefrom.
- the LLM (i.e., one or more large language models) of the present disclosure may include various existing or future large language models, which may be tailored to the specific medical use cases described herein.
- the LLM includes the Pathways Language Model (PaLM) 2.0, and/or the Gemini LLM and/or chatbot developed by Google. Additionally or alternatively, though, the LLM may include GPT 3.5, GPT 4, and/or another LLM(s) developed by OpenAI, Llama 2, OpenLLaMA, Falcon, Dolly 2.0, and/or other open-source and/or private large language models. Further details regarding the training and implementation of the LLM will be provided in subsequent portions of the present disclosure.
- FIG. 1 depicts an exemplary computing environment 100 in which techniques disclosed herein may be implemented.
- the environment 100 may include computing resources for training and/or operating machine learning models (particularly, including one or more large language models (LLMs)) to perform functionalities described herein, e.g., interpreting unstructured clinical data to populate a structured database, and/or responding to natural language queries of the structured database.
- the computing environment 100 may include a client computing device 102, a server computing device 104, an electronic network 106, a structured database 108 (e.g., including or consisting of one or more clinical registries), a context electronic database 110 and a model electronic database 112.
- the computing environment may further include one or more cloud application programming interfaces (APIs) 114.
- the components of the computing environment 100 may be communicatively connected to one another via the electronic network 106 (e.g., one or more wired and/or wireless communication networks), in various implementations.
- the client computing device 102 may implement, inter alia, operation of one or more applications for obtaining unstructured clinical data (e.g., via capturing of images, uploading of locally stored documents, or exporting of data/documents from other applications accessed via the client computing device 102).
- the client computing device 102 may be implemented as one or more computing devices (e.g., one or more laptops, one or more mobile computing devices, one or more tablets, one or more wearable devices, one or more cloud-computing virtual instances, etc.).
- a plurality of client computing devices 102 may be part of the environment 100 - for example, a first user may access a client computing device 102 that is a laptop, while a second user accesses a client computing device 102 that is a smartphone, while yet a third user accesses a client computing device 102 that is a wearable device. Each of these respective users may participate to manage structured database information associated with same or different patients, hospitals, etc.
- the client computing device 102 may include one or more processors 120, one or more network interface controllers 122, one or more memories 124, an input device 126, an output device 128 and a client API 130.
- the one or more memories 124 may have stored thereon one or more modules 140 (e.g., one or more sets of instructions).
- the one or more processors 120 may include one or more central processing units, one or more graphics processing units, one or more field-programmable gate arrays, one or more application-specific integrated circuits, one or more tensor processing units, one or more digital signal processors, one or more neural processing units, one or more RISC-V processors, one or more coprocessors, one or more specialized processors/accelerators for artificial intelligence or machine learning-specific applications, one or more microcontrollers, etc.
- the client computing device 102 may include one or more network interface controllers 122, such as Ethernet network interface controllers, wireless network interface controllers, etc.
- the network interface controllers 122 may include advanced features, in some implementations, such as hardware acceleration, specialized networking protocols, etc.
- the memories 124 of the client computing device 102 may include volatile and/or nonvolatile storage media.
- the memories 124 may include one or more random access memories, one or more read-only memories, one or more cache memories, one or more hard disk drives, one or more solid-state drives, one or more non-volatile memory express drives, one or more optical drives, one or more universal serial bus flash drives, one or more external hard drives, one or more network-attached storage devices, one or more cloud storage instances, one or more tape drives, etc.
- the memories 124 may have stored thereon one or more modules 140, for example, as one or more sets of computer-executable instructions.
- the modules 140 may include additional storage, such as one or more operating systems (e.g., Microsoft Windows, GNU/Linux, Mac OSX, etc.).
- the operating systems may be configured to run the modules 140 during operation of the client computing device 102 - for example, the modules 140 may include additional modules and/or services for receiving and processing data from one or more other components of the environment 100 such as the one or more cloud APIs 114 or the server computing device 104.
- the modules 140 may be implemented using any suitable computer programming language(s) (e.g., Python, JavaScript, C, C++, Rust, C#, Swift, Java, Go, LISP, Ruby, Fortran, etc.).
- the modules 140 may include a model configuration module 142, an API module 144, an input processing module 146, an authentication/security module 148, a context module 150 and a clinical data capture module 152, in some implementations. In some implementations, more or fewer modules 140 may be included.
- the modules 140 may be configured to communicate with one another (e.g., via inter-process communication, via a bus, via sockets, pipes, message queues, etc.).
- the model configuration module 142 may include one or more sets of computer-executable instructions (i.e., software, code, etc.) for configuring one or more models (e.g., one or more LLMs) for extracting structured information from unstructured clinical data (e.g., any of various clinical data described herein, such as freeform notes/reports or clinical data having a structure not matching that of a clinical registry and/or another structured database(s)).
- the model configuration module 142 may be omitted from the modules 140, or its access may be restricted to administrative users only.
- one or more of the modules 140 may be packaged into a downloadable application (e.g., a smart phone app available from an app store) that enables registered but nonprivileged (i.e., non-administrative) users to access the environment 100 using their consumer client computing device 102.
- one or more of the client computing device 102 may be locked down, such that the client computing device 102 is controlled hardware, accessible only to those who have physical access to certain areas (e.g., of a hospital, urgent care facility, etc.).
- the model configuration module 142 may include instructions for generating one or more graphical user interfaces that allow a user (e.g., a physician, nurse, data extractor, etc.) to identify one or more unstructured clinical data objects (e.g., pages, files, portions thereof, etc.) for data extraction and population into a structured database (e.g., stored via the one or more servers 104).
- the API module 144 may include one or more sets of computer-executable instructions for accessing one or more remote APIs, and/or for enabling one or more other components within the environment 100 to access functionality of the client computing device 102.
- the API module 144 may enable a remote user to specify and/or query unstructured clinical data objects, structured clinical data elements, natural language queries and responses thereto, and/or other data that may be stored locally at the client computing device 102 via the context database 110.
- the API module 144 may enable other client applications (i.e., not applications facilitated by the modules 140) to connect to the client computing device 102, for example, to send queries or prompts, and to receive responses from the client computing device 102.
- the API module 144 may include instructions for authentication, rate limiting and error handling.
- the client computing device 102 may enable one or more users to access one or more trained models (e.g., LLMs) by providing input prompts that are processed by one or more trained models.
- the input processing module 146 may perform pre-processing of user prompts prior to their being input into one or more models, and/or post-processing of outputs produced by one or more models.
- the input processing module 146 may process data input into one or more input fields, voice inputs or other input methods (e.g., file attachments) depending upon the application.
- the input processing module 146 may receive inputs directly via the input device 126, in some implementations (e.g., natural language queries received via text input or voice input).
- the input processing module 146 may perform post-processing of output received from one or more trained models.
- the input processing module 146 may include instructions for handling errors and for displaying errors to users (e.g., via the output device 128).
- the input processing module 146 may cause one or more graphical user interfaces to be displayed, for example to enable the user to enter information directly via a text field.
- the authentication/security module 148 may include one or more sets of computer-executable instructions for implementing access control mechanisms for one or more trained models, ensuring that the models can only be accessed by those who are authorized to do so, and that the access of those users is private and secure. It should be appreciated that the security module 148 may permission users/agents based upon their respective permissions in a clinical context (e.g., permissions of particular medical personnel to access potentially sensitive medical information associated with particular patients, cohorts, medical facilities or portions thereof, etc.).
- Generally, trained models, especially language models (e.g., large language models (LLMs)), require state information in order to meaningfully carry on a dialogue with a user or with another trained model.
- given, for example, a first query asking about the weather in Chicago and a second query asking “what about tomorrow?”, the model should understand that, in context, the second query relates to the first query, insofar as the user is asking about the weather tomorrow in the same location (Chicago).
- many systems add statefulness to models using context information. This may be implemented using sliding context windows, wherein a predetermined number of tokens (e.g., 4096 maximum tokens in the case of GPT 3.5, equivalent to about 3000 words) may be “remembered” by the LLM and can be used to enrich multiple sequential prompts input into the LLM (for example, when the LLM is used in a chat mode).
- the context module 150 may include one or more sets of computer-executable instructions for maintaining state of the type found in this example, and other types of state information.
- the context module 150 may implement sliding window context, in some implementations. In other implementations, the context module 150 may perform other types of state maintaining strategies. For example, the context module 150 may implement a strategy in which information from the immediately preceding prompt is part of the window, regardless of the size of that prior prompt.
- the context module 150 may implement a strategy in which one or more prior prompts are included in each current prompt.
- This prompt stuffing, or prompt concatenation, technique may be limited by prompt size constraints: once the total size of the prompt exceeds the prompt limit, the model immediately loses state information related to the parts of the prompt truncated from the prompt.
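As a non-limiting illustration, the sliding-window/prompt-concatenation strategy described above might be sketched as follows in Python. Token counting is approximated here by whitespace splitting, and all names are hypothetical; a production implementation would use the model's actual tokenizer and token limit.

```python
from collections import deque

def build_prompt(history, new_message, max_tokens=4096):
    """Concatenate prior turns with the new message, dropping the oldest
    turns once the (whitespace-approximated) token budget is exceeded."""
    window = deque([new_message])
    count = len(new_message.split())
    for turn in reversed(history):
        turn_tokens = len(turn.split())
        if count + turn_tokens > max_tokens:
            break  # older turns fall out of the window: that state is lost
        window.appendleft(turn)
        count += turn_tokens
    return "\n".join(window)
```

With a generous budget, the Chicago weather query from the earlier example remains in the window and the follow-up "What about tomorrow?" retains its context; with a small budget, the earlier turn is truncated and that state is lost, as described above.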
- the clinical data capture module 152 may include one or more sets of computer-executable instructions for obtaining unstructured clinical data objects (e.g., pages, files, portions thereof, etc., for example in response to user queries directly or indirectly identifying the unstructured clinical data objects or broader sets of sources including the unstructured clinical data objects), and/or extracting structured clinical data matching the structured database from the unstructured data.
- portions of operation of the clinical data capture module are instead executed at the one or more servers 104.
- a portion of the clinical data capture module 152 may obtain one or more unstructured clinical data objects and upload the object(s) to the one or more server(s) 104, which may use one or more trained LLMs to extract structured clinical data and populate the structured database 108.
- the clinical data capture module 152 at the client computing device 102 extracts the structured clinical data elements and uploads the extracted elements to the structured database 108, e.g., via the one or more servers 104.
- the clinical data capture module may access one or more trained LLMs, which may be stored at the client computing device 102, accessed from the server computing device 104 via the network 106, and/or accessed via other means, including for example means depicted in FIG. 1.
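The extraction path described above might be sketched as follows. The LLM is passed in as an opaque callable so the sketch is provider-agnostic; the schema field names, the prompt wording, and the JSON output convention are all illustrative assumptions, not the patent's specified interface.

```python
import json

def extract_structured(note_text, schema_fields, llm):
    """Ask an LLM (any provider, passed in as a callable) to pull
    schema-matching fields out of an unstructured clinical note.
    Fields the model cannot find are expected to come back as null."""
    prompt = (
        "Extract the following fields as a JSON object, using null for "
        f"fields not present: {', '.join(schema_fields)}\n\nNote:\n{note_text}"
    )
    raw = llm(prompt)
    record = json.loads(raw)
    # Keep only fields that belong to the structured database's schema.
    return {field: record.get(field) for field in schema_fields}
```

The returned dictionary can then be uploaded to the structured database 108, whether the extraction runs on the client computing device 102 or at the one or more servers 104.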
- the server computing device 104 may include one or more processors 160, one or more network interface controllers 162, one or more memories 164, an input device (not depicted), an output device (not depicted) and a server API 166.
- the one or more memories 164 may have stored thereon one or more modules 170 (e.g., one or more sets of instructions).
- the one or more processors 160 may include one or more central processing units, one or more graphics processing units, one or more field-programmable gate arrays, one or more application-specific integrated circuits, one or more tensor processing units, one or more digital signal processors, one or more neural processing units, one or more RISC-V processors, one or more coprocessors, one or more specialized processors/accelerators for artificial intelligence or machine learning-specific applications, one or more microcontrollers, etc.
- the server computing device 104 may include one or more network interface controllers 162, such as Ethernet network interface controllers, wireless network interface controllers, etc.
- the network interface controllers 162 may include advanced features, in some implementations, such as hardware acceleration, specialized networking protocols, etc.
- the memories 164 of the server computing device 104 may include volatile and/or non-volatile storage media.
- the memories 164 may include one or more random access memories, one or more read-only memories, one or more cache memories, one or more hard disk drives, one or more solid-state drives, one or more non-volatile memory express, one or more optical drives, one or more universal serial bus flash drives, one or more external hard drives, one or more network-attached storage devices, one or more cloud storage instances, one or more tape drives, etc.
- the memories 164 may have stored thereon one or more modules 170, for example, as one or more sets of computer-executable instructions.
- the memories 164 may additionally have stored thereon one or more operating systems (e.g., Microsoft Windows, GNU/Linux, Mac OS X, etc.).
- the operating systems may be configured to run the modules 170 during operation of the server computing device 104 - for example, the modules 170 may include additional modules and/or services for receiving and processing data from one or more other components of the environment 100 such as the one or more cloud APIs 114 or the client computing device 102.
- the modules 170 may be implemented using any suitable computer programming language(s) (e.g., Python, JavaScript, C, C++, Rust, C#, Swift, Java, Go, LISP, Ruby, Fortran, etc.).
- the modules 170 may include a data collection module 172, a data pre-processing module 174, a model pretraining module 176, a fine-tuning module 178, a model training module 180, a checkpointing module 182, a hyperparameter tuning module 184, a validation and testing module 186, an auto-prompting module 188, a model operation module 190 and an ethics and bias module 192.
- more or fewer modules 170 may be included.
- the modules 170 may be configured to communicate with one another (e.g., via inter-process communication, via a bus, via sockets, pipes, message queues, etc.).
- the modules 170 may respond to network requests (e.g., via the API 166) or other requests received via the network 106 (e.g., via the client computing device 102 or other components of the environment 100).
- the data collection module 172 may be configured to collect information used to train one or more modules. In general, the information collected may be any suitable information used for training a language model.
- the data collection module 172 may collect data via web scraping, via API calls/access, via database extract-transform-load (ETL) processes, etc.
- Sources accessed by the data collection module 172 include social media websites, books, websites, academic publications, web forums/ interest sites (e.g., Reddit, Facebook, bulletin boards, medical journals, etc.), etc.
- the data collection module 172 may access data sources by active means (e.g., scraping or other retrieval) or may access existing corpuses.
- the data collection module 172 may include sets of instructions for performing data collection in parallel, in some implementations.
- the data collection module 172 may store collected data in one or more electronic databases, such as a database accessible via the cloud APIs 114 or via a local electronic database (not depicted).
- the data may be stored in a structured and/or unstructured format.
- the data collection module 172 may store large data volumes used for training one or more models (i.e., training data). For example, the data collection module 172 may store terabytes, petabytes, exabytes or more of training data.
- the data collection module 172 may retrieve data from the structured database 108. For example, the data collection module 172 may process the retrieved/ received data and sort the data into multiple subsets based on information included within the structured database 108.
- the data pre-processing module 174 may include instructions for pre-processing data collected by the data collection module 172.
- the data pre-processing module 174 may perform text extraction and/or cleaning operations on data collected by the data collection module 172.
- the data pre-processing module 174 may perform preprocessing operations, such as lexical parsing, tokenizing, case conversions and other string splitting/munging.
- the data pre-processing module 174 may perform data deduplication, filtering, annotation, compliance, version control, validation, quality control, etc.
- one or more human reviewers may be looped into the process of pre-processing data collected by the data pre-processing module 174.
- a distributed work queue may be used to transmit batch jobs and receive human-computed responses from one or more human workers.
- the data pre-processing module 174 may store copied and/or modified copies of the training data in an electronic database.
- the data pre-processing module 174 may include instructions for parsing the unstructured text received by the data collection module 172 to structure the text.
- the present techniques may train one or more models to perform language generation tasks that include token generation. Both training inputs and model outputs may be tokenized.
- tokenization refers to the process by which text used for training is divided into units such as words, subwords or characters. Tokenization may break a single word into multiple subwords (e.g., “LLM” may be tokenized as “L” and “LM”).
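A greedy longest-match subword tokenizer, a simplification of the BPE-style tokenizers commonly used with LLMs, might look as follows; the vocabulary and the greedy strategy are illustrative assumptions rather than the patent's specified tokenizer.

```python
def tokenize(text, vocab):
    """Greedy longest-match subword tokenization: split text into the
    longest vocabulary pieces available, falling back to single
    characters for unknown input."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character falls back to itself
            i += 1
    return tokens
```

Under a vocabulary containing "L" and "LM" but not "LLM", this reproduces the "LLM" → "L", "LM" split from the example above.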
- the present techniques may train one or more models using a set of tokens (e.g., a vocabulary) that includes many (e.g., thousands or more) of tokens. These tokens may be embedded into a vector. This vector of token “embeddings” may include numerical representations of the individual tokens in the vocabulary in high-dimensional vector space.
- the modules 170 may access and modify the embeddings during training to learn relationships between tokens. These relationships effectively represent semantic language meaning.
- the embeddings may be stored in a specialized database (e.g., a vector store, a graph database, etc.).
- Embedding databases may include specialized features, such as efficient retrieval, similarity search and scalability.
- the server computing device 104 may include a local electronic embedding database (not depicted).
- a remote embedding database service may be used (e.g., via the cloud APIs 114). Such a remote embedding database service may be based on an open source or proprietary model (e.g., Milvus, Pinecone, Redis, Postgres, MongoDB, Facebook AI Similarity Search (FAISS), etc.).
- the server computing device 104 may include instructions (e.g., in the data collection module 172) for adding training data to one or more specialized databases, and for accessing it to train models.
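The efficient-retrieval and similarity-search features of an embedding database can be illustrated with a brute-force sketch over a toy in-memory store. The 4-dimensional vectors below are illustrative assumptions (trained embeddings are typically hundreds or thousands of dimensions), and a deployment would use an indexed store such as FAISS or Milvus rather than a linear scan.

```python
import math

# Toy in-memory embedding store: token/document id -> vector.
store = {
    "fever":   [0.9, 0.1, 0.0, 0.2],
    "cough":   [0.8, 0.2, 0.1, 0.1],
    "invoice": [0.0, 0.9, 0.8, 0.0],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def top_k(query_vec, k=1):
    """Brute-force similarity search; a vector store performs the same
    lookup with indexing structures for scale."""
    return sorted(store, key=lambda key: -cosine(store[key], query_vec))[:k]
```

A query vector close to the "fever" embedding retrieves "fever" ahead of the unrelated "invoice" entry, which is the behavior that similarity search over learned embeddings is intended to provide.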
- the present techniques may include language modeling, wherein one or more deep learning models are trained by processing token sequences using a large language model architecture.
- a transformer architecture may be used to process a sequence of tokens.
- Such a transformer model may include a plurality of layers including self-attention and feedforward neural networks. This architecture may enable the model to learn contextual relationships between the tokens, and to predict the next token in a sequence, based upon the preceding tokens. During training, the model is provided with the sequence of tokens and it learns to predict a probability distribution over the next token in the sequence.
- This training process may include updating one or more model parameters (e.g., weights or biases) using an objective function that minimizes the difference between the predicted distribution and a true next token in the training data.
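The objective described above, minimizing the difference between the predicted distribution and the true next token, is commonly realized as a cross-entropy loss over a softmax of the model's output logits. A minimal sketch, with a toy vocabulary and hand-picked logits standing in for real model outputs:

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution over the vocabulary."""
    exps = [math.exp(x - max(logits)) for x in logits]  # shifted for stability
    total = sum(exps)
    return [e / total for e in exps]

def next_token_loss(logits, true_index):
    """Cross-entropy between the model's predicted distribution and the
    true next token in the training sequence; lower is better."""
    probs = softmax(logits)
    return -math.log(probs[true_index])
```

When the logits favor the true next token, the loss is small; when they favor some other token, the loss grows, which is the signal used to update the weights and biases.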
- model parameters e.g., weights or biases
- Alternatives to the transformer architecture may include recurrent neural networks, long short-term memory networks, gated recurrent networks, convolutional neural networks, recursive neural networks, and other modeling architectures.
- the modules 170 may include instructions for performing pretraining of a language model (e.g., an LLM), for example, in a pretraining module 176.
- the pretraining module 176 may include one or more sets of instructions for performing pretraining, which as used herein, generally refers to a process that may span pre-processing of training data via the data pre-processing module 174 and initialization of an as-yet untrained language model.
- as used herein, a pre-trained model is one that has not yet been trained on specific tasks.
- the model pretraining module 176 may include instructions that initialize one or more model weights.
- the model pretraining module 176 may initialize the weights to have random values.
- the model pretraining module 176 may train one or more models using unsupervised learning, wherein the one or more models process one or more tokens (e.g., preprocessed data output by the data pre-processing module 174) to learn to predict one or more elements (e.g., tokens).
- the model pretraining module 176 may include one or more optimizing objective functions that the model pretraining module 176 applies to the one or more models, to cause the one or more models to predict one or more most-likely next tokens, based on the likelihood of tokens in the training data.
- the model pretraining module 176 causes the one or more models to learn linguistic features such as grammar and syntax.
- the pretraining module 176 may include additional steps, including training, data batching, hyperparameter tuning and/or model checkpointing.
- the model pretraining module 176 may include instructions for generating a model that is pretrained for a general purpose, such as general text processing/understanding.
- This model may be known as a “base model” in some implementations.
- the base model may be further trained by downstream training process(es), for example, those training processes described with respect to the fine-tuning module 178.
- the model pretraining module 176 generally trains foundational models that have general understanding of language and/or knowledge. Pretraining may be a distinct stage of model training in which training data of a general and diverse nature (i.e., not specific to any particular task or subset of knowledge) is used to train the one or more models.
- a single model may be trained and copied. Copies of this model may serve as respective base models for a plurality of finetuned models.
- base models may be trained to have specific levels of knowledge common to more advanced agents.
- the model pretraining module 176 may train a medical student base model that may be subsequently used to fine tune an internist model, a surgeon model, a resident model, etc. In this way, the base model can start from a relatively advanced stage, without requiring pretraining of each more advanced model individually.
- This strategy represents an advantageous improvement, because pretraining can take a long time (many days), and pretraining the common base model only requires that pretraining process to be performed once.
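The copy-then-specialize strategy above might be sketched as follows. The parameter dictionary and the fixed-offset "fine-tuning" are placeholders (real fine-tuning applies gradient updates); the point illustrated is that each specialist starts from an independent copy of the once-pretrained base model.

```python
import copy

# Pretrained once; shared by every downstream specialist (toy parameters).
base_model = {"weights": [0.12, -0.40, 0.77], "vocab_size": 50000}

def fine_tune(base, adjustment):
    """Start a specialist from a deep copy of the shared base model so the
    expensive pretraining stage is performed only once. The additive
    adjustment is a stand-in for actual gradient-based fine-tuning."""
    specialist = copy.deepcopy(base)
    specialist["weights"] = [w + adjustment for w in specialist["weights"]]
    return specialist
```

Fine-tuning an internist model this way leaves the base model's parameters untouched, so the same base can also seed a surgeon model, a resident model, and so on.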
- the modules 170 may include a fine-tuning module 178.
- the fine-tuning module 178 may include instructions that train the one or more models further to perform specific tasks. For example, the fine-tuning module 178 may train each of a plurality of agents to generate one or more outputs that are based on each respective agent’s personality or characteristics. Specifically, the fine-tuning module 178 may include instructions that train one or more models to generate respective language outputs (e.g., text generation), summarization, question answering or translation activities based on the characteristics of each respective agent.
- the fine-tuning module 178 may include sets of instructions for retrieving one or more structured data sets, such as time series generated by the data preprocessing module 174.
- the fine-tuning module 178 may include instructions for configuring an objective function for performing a specific task, such as generating text in accordance with a structure of the structured database 108 or in accordance with output expected by a medical personnel utilizing an LLM.
- the fine-tuning module 178 may include instructions for fine-tuning a pathologist model, based on a base language model. A medical resident model may be fine-tuned by the fine-tuning module 178, wherein the base model is the same used to fine-tune the pathologist model.
- the fine-tuning module 178 may train many (e.g., hundreds or more) additional models.
- the fine-tuning module 178 may include user-selectable parameters that affect the fine-tuning of the one or more models.
- a “caution” bias parameter may be included that represents medical conservativeness. This bias parameter may be adjusted to affect the cautiousness with which the resulting trained model (i.e., agent) approaches medical decision-making. Additional models may be trained, for additional personas/tasks, as discussed below.
- one or more open source frameworks may be used.
- Example frameworks include TensorFlow, Keras, MXNet, Caffe, scikit-learn, and PyTorch.
- frameworks such as OpenLLM and LangChain may be used, in some implementations.
- the fine-tuning module 178 may use an algorithm such as stochastic gradient descent or another optimization technique to adjust weights of the pretrained model.
- Fine-tuning may be an optional operation, in some implementations.
- training may be performed by the training module 180 after pretraining by the model pretraining module 176.
- the model training module 180 may perform task- specific training like the fine-tuning module 178, on a smaller scale or with a more tailored objective. For example, whereas the fine-tuning module 178 may fine tune a model to learn knowledge corresponding to a surgeon, the model training module 180 may further train the model to learn knowledge of a plastic surgeon, an orthopedic surgeon, etc.
- the training module 180 may include one or more submodules, including the checkpointing module 182, the hyperparameter tuning module 184, the validation and testing module 186 and the auto-prompting module 188.
- the checkpointing module 182 may perform checkpointing, which is saving of a model’s parameters.
- the checkpointing module 182 may store checkpoints during training and at the conclusion of training, for example, in the model electronic database 112. In this way, the model may be run (e.g., for testing and validation) at multiple stages and its training parameters loaded, and also retrained from a checkpoint. In this way, the model can be run and trained forward without being re-trained from the beginning, which may save significant time (e.g., days of computation).
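A minimal sketch of the save/resume cycle performed by the checkpointing module 182 follows. JSON files stand in for the model electronic database 112, and the flat parameter list is an illustrative assumption; real checkpoints serialize large tensors in binary formats.

```python
import json
import os

def save_checkpoint(params, step, directory):
    """Persist model parameters at a training step so the run can later
    be resumed from here instead of restarting from the beginning."""
    path = os.path.join(directory, f"checkpoint_{step}.json")
    with open(path, "w") as f:
        json.dump({"step": step, "params": params}, f)
    return path

def load_checkpoint(path):
    """Restore saved parameters and the step at which they were saved."""
    with open(path) as f:
        state = json.load(f)
    return state["params"], state["step"]
```

Loading a checkpoint restores both the parameters (for testing and validation at an intermediate stage) and the step counter (for retraining forward from that point).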
- the hyperparameter tuning module 184 may include hyperparameters such as batch size, model size, learning rate, etc. These hyperparameters may be adjusted to influence model training.
- the hyperparameter tuning module 184 may include instructions for tuning hyperparameters by successive evaluation.
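Tuning by successive evaluation can be sketched as an exhaustive grid search: every combination of hyperparameter values is evaluated and the best-scoring one kept. The evaluation callable and the grid below are illustrative assumptions; real tuning would score each combination by training and validating a model.

```python
from itertools import product

def grid_search(evaluate, grid):
    """Successively evaluate every hyperparameter combination in `grid`
    (a dict of name -> list of candidate values) and return the
    best-scoring combination (higher score is better)."""
    best_score, best_params = float("-inf"), None
    for combo in product(*grid.values()):
        params = dict(zip(grid.keys(), combo))
        score = evaluate(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score
```

Batch size, model size, and learning rate from the description above would each appear as one key of the grid.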
- the validation and testing module 186 may include sets of instructions for validating and testing one or more machine learning models, including those generated by the model pretraining module 176, the fine-tuning module 178 and the model training module 180.
- the auto-prompting module 188 may include sets of instructions for performing auto-prompting of one or more models. Specifically, the auto-prompting module 188 may enrich a prompt with additional information.
- the auto-prompting module 188 may include additional information in a prompt, so that the model receiving the prompt has additional context or directions that it can use. This may allow the auto-prompting module 188 to fine-tune a base model using one-shot or few-shot learning, in some implementations.
- the auto-prompting module 188 may also be used to focus the output of the one or more models.
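Prompt enrichment of the kind performed by the auto-prompting module 188 might be sketched as follows, with directions and few-shot question/answer examples prepended to the user's query; the Q/A template is an illustrative assumption.

```python
def enrich_prompt(user_query, examples, directions):
    """Prepend directions and few-shot examples to the user's query so
    the model has additional context at inference time."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{directions}\n\n{shots}\n\nQ: {user_query}\nA:"
```

The directions serve to focus the model's output, while the examples supply the one-shot or few-shot context described above.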
- the training module 180 may train multi-modal models.
- the training module 180 may train a plurality of models each capable of drawing from multimodal data types such as written text, imaging data, laboratory data, real-time monitoring data, pathology images, etc.
- the training module 180 may train a single model capable of processing the multimodal data types.
- a trained multimodal model may be used in conjunction with another model (e.g., a large language model) to provide non-text data interactions with users.
- the model operation module 190 may operate one or more trained models. Specifically, the model operation module 190 may initialize one or more trained models, load parameters into the model(s), and provide the model(s) with inference data (e.g., prompt inputs). In some implementations, the model operation module 190 may deploy one or more trained model (e.g., a pretrained model, a fine-tuned model and/or a trained model) onto a cloud computing device (e.g., via the API 166). The model operation module 190 may receive one or more inputs, for example from the client computing device 102, and provide those inputs (e.g., one or more prompts) to the trained model.
- the API 166 may include elements for receiving requests to the model, and for generating outputs based on model outputs.
- the API 166 may include a RESTful API that receives a GET or POST request including a prompt parameter.
- the model operation module 190 may receive the request from the API 166, pass the prompt parameter into the trained model, and receive a corresponding output.
- the prompt parameter may be “What is the smallest bone in the human body?”
- the prompt output may be “The stapes bone of the inner ear is the smallest bone in the human body.”
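The GET-with-prompt-parameter path through the API 166 might be sketched as a minimal request handler; the query-string interface and the 400 error convention are illustrative assumptions, and a real service would sit behind a web framework with the authentication/security controls described earlier.

```python
from urllib.parse import parse_qs

def handle_prompt_request(query_string, model):
    """Minimal handler for a GET request carrying a `prompt` parameter:
    extract the prompt, pass it to the trained model (a callable), and
    wrap the model's output in a response object."""
    params = parse_qs(query_string)
    prompt = params.get("prompt", [""])[0]
    if not prompt:
        return {"status": 400, "error": "missing prompt parameter"}
    return {"status": 200, "output": model(prompt)}
```

Given the smallest-bone prompt above, the handler returns the model's stapes-bone answer in the response body.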
- multi-modal modeling may be used.
- the data preprocessing module may, for example, process and understand image data, audio data, video data, etc.
- the server computing device 104 may interpret and respond to queries that involve understanding content from these different modalities.
- the server computing device 104 may include an image processing module (not depicted) including instructions for performing image analysis on images provided by users, or images retrieved from patient EHR data.
- the server computing device 104 may generate outputs in modalities other than text.
- the server computing device 104 may generate an audio response, an image, etc.
- the operating module 190 may include a set of computer-executable instructions that when executed by one or more processors (e.g., the processors 160) cause a computer (e.g., the server computing device 104) to perform retrieval-augmented generation. Specifically, the operating module 190 may perform retrieval-augmented generation based upon inputs or queries received from the user. This allows the operating module 190 to tailor responses of a model based on the specific input and context, such as the medical issue, patient, or clinical data record under discussion. For example, one or more models may be pre-trained, fine-tuned and/or trained as discussed above. During that training, the model may learn to generate tokens based on general language understanding as well as application-specific training. Such a model at that point may be static, insofar as it cannot access further information when presented with an input query.
- the operating module 190 may perform retrieval operations, such as searching or selecting information from a document, a database, or another source.
- the operating module 190 may include instructions for processing user input and for performing a keyword search, a regular expression search, a similarity search, etc. based upon that user input.
- the operating module 190 may input the results of that search, along with the user input, into the trained model.
- the trained model may process this additional retrieved information to augment, or contextualize, the generation of tokens that represent responses to the user’s query.
- retrieval augmented generation applied in this manner allows the model to dynamically generate outputs that are more relevant to the user’s input query at runtime.
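The retrieve-then-generate flow above might be sketched as follows. Keyword overlap stands in for the keyword, regular expression, or similarity search the operating module 190 may perform, and the prompt template is an illustrative assumption.

```python
import re

def _terms(text):
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, k=1):
    """Rank documents by keyword overlap with the query and keep the
    top k (a stand-in for keyword/regex/similarity search)."""
    q = _terms(query)
    return sorted(documents, key=lambda d: -len(q & _terms(d)))[:k]

def rag_answer(query, documents, model, k=1):
    """Augment the prompt with retrieved context before generation."""
    context = "\n".join(retrieve(query, documents, k))
    return model(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")
```

Because the retrieved context is assembled at query time, the otherwise static trained model can ground its generated tokens in the specific patient or clinical record under discussion.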
- Information that may be retrieved may include data corresponding to a patient (e.g., patient demographic information, medical history, clinical notes, diagnoses, medications, allergies, immunizations, laboratory results, oncology information, radiation and imaging information, vitals, etc.) and additional training information, such as medical journals, notes or speech transcripts from symposia or other meetings/conferences, etc.
- the present techniques may trigger retrieval augmented generation by processing a prompt, in some implementations.
- a prompt may be processed by the input processing module 146 of the client computing device 102, prior to processing the prompt by the one or more generative models.
- the input processing module 146 may trigger retrieval augmented generation based on the presence of certain inputs, such as patient information, or a request for specific information, in the form of keywords.
- the input processing module 146 may perform entity recognition or other natural language processing functions to determine whether the prompt should be processed using retrieval augmented generation prior to being provided to the trained model.
- prompts may be received via the input processing module 146 of the client computing device 102 and transmitted to the server computing device 104 via the electronic network 106.
- the output of the model may be modulated prior to being transmitted, output or otherwise displayed to a user.
- the ethics and bias module 192 may process the prompt input prior to providing the prompt input to the trained model, to avoid passing objectionable content into the trained model.
- the ethics and bias module 192 may process the output of the trained model, also to avoid providing objectionable output. It should be appreciated that trained language models may be unpredictable, and thus, processing outputs for ethical and bias concerns (especially in a medical context) may be important.
- the client computing device 102 and the server computing device 104 may communicate with one another via the network 106.
- the client computing device 102 and/or the server computing device 104 may offload some or all of their respective functionality to the one or more cloud APIs 114.
- the one or more cloud APIs 114 may include one or more public clouds, one or more private clouds and/or one or more hybrid clouds.
- the one or more cloud APIs 114 may include one or more resources provided under one or more service models, such as Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS), and Function as a Service (FaaS).
- the one or more cloud APIs 114 may include one or more cloud computing resources, such as computing instances, electronic databases, operating systems, email resources, etc.
- the one or more cloud APIs 114 may include distributed computing resources that enable, for example, the model pretraining module 176 and/or other of the modules 170 to distribute parallel model training jobs across many processors.
- the one or more cloud APIs 114 may include one or more language operation APIs, such as OpenAI, Bing, Claude.ai, etc.
- the one or more cloud APIs 114 may include an API configured to operate one or more open source models, such as Llama 2.
- the electronic network 106 may be a collection of interconnected devices, and may include one or more local area networks, wide area networks, subnets, and/or the Internet.
- the network 106 may include one or more networking devices such as routers, switches, etc. Each device within the network 106 may be assigned a unique identifier, such as an IP address, to facilitate communication.
- the network 106 may include wired (e.g., Ethernet cables) and wireless (e.g., Wi-Fi) connections.
- the network 106 may include a topology such as a star topology (devices connected to a central hub), a bus topology (devices connected along a single cable), a ring topology (devices connected in a circular fashion), and/or a mesh topology (devices connected to multiple other devices).
- the electronic network 106 may facilitate communication via one or more networking protocols, such as packet protocols (e.g., Internet Protocol (IP)) and/or application-layer protocols (e.g., HTTP, SMTP, SSH, etc.).
- the network 106 may perform routing and/or switching operations using routers and switches.
- the network 106 may include one or more firewalls, file servers and/or storage devices.
- the network 106 may include one or more subnetworks such as a virtual LAN (VLAN).
- the environment 100 may include one or more electronic databases, such as a relational database that uses structured query language (SQL) and/or a NoSQL database or other schema-less database suited for the storage of unstructured or semi-structured data.
- These electronic databases may include, for example, the structured database 108 and/or the model database 112.
- the present techniques may store training data, training parameters and/or trained models in an electronic database such as the database 112.
- one or more trained machine learning models may be serialized and stored in a database (e.g., as a binary, a JSON object, etc.). Such a model can later be retrieved, deserialized and loaded into memory and then used for predictive purposes.
- the one or more trained models and their respective training parameters may also be stored as blob objects.
- Cloud computing APIs may also be used to store trained models, via the cloud APIs 114. Examples of these services include AWS SageMaker, Google AI Platform and Azure Machine Learning.
- a user may access a prompt graphical user interface via the client computing device 102.
- the prompt graphical user interface may be configured by the model configuration module 142 and generated by the input processing module 146, and displayed by the input processing module 146 via the output device 128.
- the model configuration module 142 may configure the graphical user interface to accept prompts and display corresponding prompt outputs generated by one or more models processing the accepted prompts.
- the input processing module 146 may be configured to transmit the prompts input by the user via the electronic network 106 to the API 166 of the server computing device 104.
- the API 166 may process the user inputs via one or more trained models, or agents.
- one or more models may already be trained, including pretraining and fine-tuning. These trained models may be selectively loaded into the one or more agent objects based on configuration parameters, and/or based upon the content of the user’s input prompts.
- the user may engage in a question-answer session with the client computing device 102, for example using the LLM of the present description to refine, clarify, or correct a request for information from the structured database 108, and/or to identify unstructured documents to be incorporated into the structured database 108.
- the environment 100 may include additional, fewer, and/or alternate computing components, in various possible implementations.
- FIGS. 2 and 3 respectively depict a block diagram and flow diagram of example flows of data in accordance with various implementations of techniques of the present disclosure.
- a flow diagram involves unstructured clinical data 204.
- the unstructured clinical data 204 may, for example, include freeform, handwritten or typed notes, for example written by a physician, nurse, specialist, etc., clinical notes, and/or other suitable data described herein.
- to the extent the unstructured clinical data 204 has any structure, that structure does not match the structure of a structured database 212 (e.g., a structured clinical registry).
- the structured database 212 contains various organized fields in which particular corresponding values are placed to populate the structured database 212. Freeform notes, on the other hand, usually do not have structured fields, and to the extent that any of the data 204 contains structured fields, these fields usually do not correspond one-to-one with the structure of the structured database 212.
- Unstructured clinical data 204 may, for example, be stored on a client computing device of a user (e.g., medical personnel), stored on a filing system of a medical facility, accessible via one or more applications from a client computing device of a user, and/or accessed via other means to supply the unstructured clinical data 204 to the large language model (LLM) as described herein.
- unstructured clinical data 204 may include physical documents (e.g., handwritten notes or printed clinical reports), which may be provided to the LLM, for example, by taking photographs or electronic scans of the physical documents before or during the process of providing the physical documents to the LLM and/or querying data therefrom (which may for example perform optical character recognition on data contained therein).
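A minimal sketch of wrapping a photographed or scanned physical document for analysis follows, under the assumption that an optical character recognition step precedes the LLM. The `ocr_image` function is a stub standing in for a real OCR engine (e.g., Tesseract) or a multimodal LLM; all names are illustrative.

```python
# Hedged sketch: preparing a photographed/scanned document for the LLM.
# ocr_image is a stub; in practice an OCR library (or the LLM itself,
# if multimodal) would perform the character recognition.

def ocr_image(image_bytes: bytes) -> str:
    """Placeholder for optical character recognition of a scanned page."""
    return image_bytes.decode("utf-8", errors="ignore")  # stub only

def prepare_unstructured_input(image_bytes: bytes, patient_id: str) -> dict:
    """Wrap recognized text in a data object suitable for LLM analysis."""
    return {
        "patient_id": patient_id,
        "source": "scanned_document",
        "text": ocr_image(image_bytes),
    }

obj = prepare_unstructured_input(b"Post-op note: nerve preserved.", "P-001")
```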
- the LLM receives the unstructured clinical data 204 and identifies particular data elements therein to populate the structured database 212 (e.g., in response to queries for information from the unstructured clinical data 204).
- the structured database 212 also stores analyzed unstructured clinical data 214, which may correspond to the unstructured clinical data 204 with pointers to specific data elements therein added based on the analysis by the LLM. That is, for any particular data element identified from an unstructured clinical data object (e.g., a page, a file, etc.), the structured database 212 may store the original data object itself with a pointer to where the particular data element appears in the original data object.
- the structured database 212 may return not just the data element, but at least a portion of the original data object where the data element appears such that the user/program can view the context of the data element (e.g., the structured database 212 may return a page containing the data element, with the data element itself highlighted or otherwise signified).
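The pointer scheme described above may be sketched as follows; the dataclass fields, character offsets, and the `**` highlighting convention are assumptions for illustration only.

```python
# Illustrative sketch of storing an extracted data element together with a
# pointer into the original unstructured data object, so that a query can
# return the element highlighted within its surrounding context.

from dataclasses import dataclass

@dataclass
class ElementPointer:
    object_id: str  # e.g., a page or file identifier
    start: int      # character offset where the data element begins
    end: int        # character offset where the data element ends

def element_in_context(documents: dict, ptr: ElementPointer, window: int = 20) -> str:
    """Return the data element highlighted within surrounding context."""
    text = documents[ptr.object_id]
    lo, hi = max(0, ptr.start - window), min(len(text), ptr.end + window)
    return text[lo:ptr.start] + "**" + text[ptr.start:ptr.end] + "**" + text[ptr.end:hi]

docs = {"page-7": "Estimated blood loss: 50 mL. Nerve preserved bilaterally."}
snippet = element_in_context(docs, ElementPointer("page-7", 29, 44))
```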
- a human user may form a natural language query 222 to the structured database 212, for example via text input, voice input, and/or other input at a client computing device of the user.
- a natural language query may include, for example, “show me the patient’s reports immediately after Wednesday’s surgery” or “was the patient’s nerve preserved during surgery? Options: Yes/no/partially.”
- the LLM may convert the natural language query into a corresponding database query (e.g., SQL query) that can be used to search the structured database 212 for particular data objects (e.g., pages, files, etc.) or data elements therefrom.
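This conversion step can be illustrated with a toy stand-in. In the described system the LLM itself performs the translation, so the keyword rules and table/column names below are illustrative assumptions showing only the input/output contract.

```python
# Toy stand-in for LLM-driven conversion of a natural language query into a
# database (SQL) query. The keyword rules and schema names are assumptions;
# an actual implementation would prompt the LLM to emit the SQL directly.

def natural_language_to_sql(query: str) -> str:
    """Map a natural language request to a SQL query over a reports table."""
    q = query.lower()
    if "radiology" in q:
        return "SELECT * FROM reports WHERE report_type = 'radiology';"
    if "blood" in q:
        return "SELECT * FROM reports WHERE report_type = 'blood_work';"
    return "SELECT * FROM reports;"

sql = natural_language_to_sql("show me the patient's radiology report")
```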
- the LLM may effectively provide a chatbot functionality, wherein the LLM returns objects/data elements in accordance with user requests and allows the user to further refine, clarify, or modify requests for information as needed, for example based on original output of the LLM.
- the LLM may return an indication of both a blood work report and a radiology report for viewing at the user’s client computing device.
- the LLM may ask the user (e.g., via text or simulated voice output) whether the user wants to see the blood work report, the radiology report, both, or neither.
- the user may then provide further input (e.g., via typing, speech, selection of user interface options, etc.) indicating “show me the radiology report,” in response to which the LLM may cause the radiology report (or relevant portions thereof, when requested) to be presented at the user’s client computing device.
- the LLM, upon returning responses to a natural language query 222, may include additional questions to be asked of the user to augment the structured database 212 (or more particularly, to augment a returned clinical data object). For example, upon returning a patient’s radiology report to a physician, the LLM may identify additional questions to ask of the physician to augment the radiology report or another associated portion of the patient’s records. Based upon further input (i.e., answers) provided by the physician, the LLM may identify corresponding portions of the structured database 212 to update or augment.
- database queries 226 may be provided to the structured database 212.
- Database queries 226 may be generated and submitted, for example, via various applications executing on behalf of medical personnel, a medical facility, or another entity associated therewith, such as a data visualization application (e.g., Tableau) or an auditing and/or reporting application (e.g., REDCap).
- in some scenarios, the structured database 212 already includes the requested information at the time a query is received.
- the query may refer to one or more sources of information that include unstructured clinical data 204 that has already been analyzed and populated in the structured database 212.
- one or more sources are obtained and incorporated into the structured database 212 based on the query itself.
- the query may include a request for patient information associated with the one or more sources, or the query may specifically identify the one or more sources from which relevant data (such as patient information, etc.) is to be extracted.
- Turning to FIG. 3, other example flows of information are depicted. These flows of information generally include (1) interactions between a user 302 and an application (“app”) 304 (e.g., executing at the client device 102 of FIG. 1 from memory 124 via the processor 120), (2) interactions between the application 304 and an app service 306 (e.g., a back-end service executing at the server computing device 104 and/or another server computing device), and (3) interactions between the app service 306 and a large language model (LLM) service 308 (e.g., Google Gemini and/or another suitable LLM, which may for example be stored and executed at the server computing device 104 and/or another server computing device).
- at step 1, the user 302 uses the application 304 (e.g., via the input device 126 and/or output device 128 of FIG. 1) to identify one or more patients regarding whom clinical data is to be obtained.
- Actions between the user 302 and application 304 herein may, for example, take the form of one or more natural language queries and/or one or more database queries (e.g., as described with respect to FIG. 2).
- Step 1 may include action 1.1, where the application 304 communicates with the app service 306 to identify and return information indicating the identified patients and records associated therewith (e.g., records already analyzed to produce structured data in a structured database such as a clinical registry, and/or unstructured records not yet analyzed via the techniques of this disclosure).
- the application 304 may display indications of patients and records associated therewith to the user 302 (e.g., via the output device 128 of FIG. 1).
- the user 302 filters records returned from step 1, e.g., via interactions between the user 302, application 304, and app service 306.
- the user may, for example, select one or more sources from which desired data is to be retrieved, e.g., using filtering mechanisms to limit to a particular hospital, hospital wing, room, type of procedure, portion of the body, symptom, diagnoses or disorder, medical personnel involved, date/time, current procedural terminology (CPT) code, and/or other medically relevant parameter.
- Step 2 may include actions 2.1 between the application 304 and app service 306 to review and refine subsets of filtered records in response to filtering selections from the user 302.
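The record filtering of step 2 may be sketched as a simple predicate over record fields; the field names (`cpt_code`, `hospital`) are illustrative assumptions.

```python
# Hedged sketch of step 2: filtering returned records by medically relevant
# parameters (hospital, CPT code, etc.). Field names are assumptions only.

def filter_records(records: list, **criteria) -> list:
    """Keep only records whose fields match every supplied criterion."""
    return [
        r for r in records
        if all(r.get(field) == value for field, value in criteria.items())
    ]

records = [
    {"patient": "P-001", "cpt_code": "61510", "hospital": "General"},
    {"patient": "P-002", "cpt_code": "99213", "hospital": "General"},
]
craniotomy_records = filter_records(records, cpt_code="61510")
```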
- at step 3, the user 302 extracts particular subsets of information from the filtered records of step 2, e.g., by defining particular data elements and/or portions of the filtered records desired to be analyzed via the LLM of the present disclosure.
- Step 3 may include action 3.1, where the application 304 communicates with the app service 306 to prepare and review prompts to the LLM service 308, e.g., in the form of natural language queries (or additionally/alternatively, in embodiments, database queries).
- Preparing and reviewing prompts at action 3.1 may include the user 302 reviewing and/or modifying the prompts via interactions with the application 304.
- Extracting information at step 3 further includes actions 3.1.1 and 3.1.2, where the app service 306 provides the prepared prompts to the LLM service 308 and receives output from the LLM service 308.
- the app service 306 and application 304 return results to the user, e.g., in the form of structured data from the structured database that is responsive to the prompt(s) and that is sourced from unstructured records or portions thereof specified by the user 302.
- the user 302 reviews the results returned from step 3, e.g., via further interactions with the application 304. If desired, the user 302 may re-submit a prompt by repeating actions of steps 1, 2, and/or 3, e.g., to modify a parameter of a previous prompt and/or to obtain further information associated with previously returned results.
- FIG. 4 depicts a block diagram of an example computer-implemented method 400 in accordance with at least some of the techniques described herein.
- the method 400 may be performed via computing components of the environment 100 described with respect to FIG. 1.
- a computing system includes one or more processors and one or more computer memories (e.g., one or more non-transitory memories) storing instructions thereon that, when executed via the one or more processors, cause the computing system to perform actions of the method 400, alone or in combination with other actions of this disclosure.
- one or more computer readable media (e.g., one or more non-transitory computer readable media)
- the method 400 includes obtaining one or more data objects defining clinical data associated with at least one patient (402).
- the one or more data objects may be entirely unstructured (e.g., freeform text in the form of handwritten or typewritten notes, unstructured reports, etc.) or at least do not match a structure of a structured database (e.g., do not have one-to-one correspondence with fields to be populated in the structured database).
- Obtaining the one or more data objects may include obtaining photo images, files, etc., for example from local storage of the client computing device, from a camera unit of the client computing device, and/or via export from another application executing on or otherwise accessed via the client computing device.
- the method 400 also includes analyzing the one or more obtained data objects using one or more large language machine learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database (404).
- the method 400 still further includes populating the structured database with the one or more extracted elements to thereby update the structured database based upon information contained in the one or more obtained data objects (406).
- the method 400 still yet further includes providing an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device (408).
- the query may, for example, be a natural language query or a database query received from the client computing device, e.g., as described with respect to FIG. 2 or FIG. 3.
- the query itself defines the specific patient information to be returned (alternatively, the query may simply request information about a patient or group of patients more generally). Moreover, in some embodiments, the query identifies one or more data sources from which to extract the one or more data elements matching the structure of the structured database. In these scenarios, action 402 may include obtaining the unstructured data object(s) based on the query itself, e.g., by identifying unstructured documents to analyze using the large language model and input into the structured database at actions 404 and 406, respectively.
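Taken together, actions 402-408 of the method 400 may be sketched end-to-end, with a keyword-matching stub standing in for the large language model and a plain dictionary standing in for the structured database. All names here are illustrative assumptions, not the actual implementation.

```python
# End-to-end sketch of method 400: obtain a data object (402), extract
# elements via an LLM stub (404), populate a structured "database" (406),
# and return an indication of the elements (408).

def llm_extract(data_object: str, fields: list) -> dict:
    """Stub for LLM extraction of structured fields from freeform text."""
    extracted = {}
    lowered = data_object.lower()
    for field in fields:
        marker = field.replace("_", " ") + ":"  # e.g., "blood_loss" -> "blood loss:"
        if marker in lowered:
            start = lowered.index(marker) + len(marker)
            extracted[field] = data_object[start:].split(".")[0].strip()
    return extracted

structured_db = {}  # stands in for the structured database / clinical registry

def method_400(patient_id: str, data_object: str, fields: list) -> dict:
    elements = llm_extract(data_object, fields)                # action 404
    structured_db.setdefault(patient_id, {}).update(elements)  # action 406
    return structured_db[patient_id]                           # action 408

note = "Operative note. Blood loss: 50 mL. Nerve status: preserved."
result = method_400("P-001", note, ["blood_loss", "nerve_status"])
```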
- the method 400 may include still additional, fewer, and/or alternate actions, including various suitable actions described in this disclosure.
- the method 400 provides original data objects (e.g., files, photos, etc.) back to a user from the structured database, with the provided original data objects including indications (e.g., annotations, highlights, etc.) where particular requested data elements are found.
- actions of the method 400 are performed iteratively, such that the user of the client computing device provides further requests to the large language model based upon the output of action 408.
- the structured database is accessed to identify one or more patients or groups of patients in the structured database for a clinical trial.
- any reference to "one implementation” or “an implementation” means that a particular element, feature, structure, or characteristic described in connection with the implementation is included in at least one implementation.
- the appearances of the phrase “in one implementation” in various places in the specification are not necessarily all referring to the same implementation.
- the terms “comprises,” “comprising,” “includes,” “including,” “has,” “having” or any other variation thereof, are intended to cover a non-exclusive inclusion.
- a process, method, article, or apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
- the present disclosure contemplates a number of aspects, including but not limited to:
- a computing system comprising: one or more processors; and one or more memories having stored thereon computer-executable instructions that, when executed by the one or more processors, cause the computing system to: obtain one or more data objects defining clinical data associated with at least one patient, the one or more data objects not matching a structure of a structured database; analyze the one or more obtained data objects using one or more large language machine learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database; populate the structured database with the one or more extracted data elements; and provide an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device.
- a computer-implemented method performed via one or more processors, the method comprising: obtaining one or more data objects defining clinical data associated with at least one patient, the one or more data objects not matching a structure of a structured database; analyzing the one or more obtained data objects using one or more large language machine learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database; populating the structured database with the one or more extracted data elements; and providing an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device.
- One or more non-transitory computer-readable media storing instructions that, when executed via one or more processors of one or more computers, cause the one or more computers to: obtain one or more data objects defining clinical data associated with at least one patient, the one or more data objects not matching a structure of a structured database; analyze the one or more obtained data objects using one or more large language machine learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database; populate the structured database with the one or more extracted data elements; and provide an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device.
Abstract
A large language model (LLM) may extract information from unstructured data from a clinical setting, including for example freeform physicians' notes and/or patient reports. Extracted information may be used to populate a structured database (e.g., a clinical data registry), which may then be queried via database queries from various applications, for example to identify clinical trial cohorts or generate data visualizations or reports. Additionally, the LLM integrated with the structured database may support natural language queries from users (e.g., medical personnel) to support on-demand retrieval of relevant patient parameters.
Description
ARTIFICIAL INTELLIGENCE SYSTEMS AND METHODS FOR PATIENT CHARTS CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to and the benefit of the filing date of U.S. Provisional Patent Application No. 63/602,541, filed November 24, 2023 and titled “ChartMiner: Utilizing Al for Structured Data Retrieval from Patient Charts,” the entirety of which is hereby incorporated by reference herein.
FIELD OF THE DISCLOSURE
[0002] The present disclosure relates to the use of large language models (LLMs), and more particularly, to the use of an LLM to generate structured data from unstructured documents in a clinical health setting.
BACKGROUND
[0003] In clinical health settings such as hospitals, physician’s offices, urgent care facilities, etc., it is desirable to generate, populate, and maintain structured clinical data registries that define, for any patient, values for a variety of predefined fields. Fields in a clinical dataset for a patient in an electronically stored database (e.g., a clinical registry) can, for example, record clinical data contained in patient progress reports, operative notes, radiology or pathology reports, etc. Populating and maintaining the data registries in a predefined, structured format has many benefits. For one, the structured format allows a human to more easily locate, open, and parse any desired information for any desired patient. Further, the structured format of clinical registries, as with much structured data, allows for automated back-end data processing through various applications (e.g., data visualization applications, auditing or reporting applications, real-time viewing applications for physicians, nurses, and other staff, etc.). Still further, structured clinical registries may be searched or parsed to identify, from a multiplicity of patients in the clinical registry, one, two, three or more patients to form a cohort for a clinical trial based upon the stored information for each patient (e.g., a group of patients may be chosen for a clinical trial based upon having a shared clinical diagnosis and comparable attributes). Of course, the electronic nature of clinical registries, in combination with modern communications technologies, makes the data contained within clinical registries available to various persons and applications, agnostic to the physical location(s) from which the clinical registry is populated or accessed.
[0004] Despite the benefits described above, though, the structured nature of clinical registries presents challenges and drawbacks when implemented in real clinical settings. Specifically, converting unstructured clinical notes into structured clinical datasets presents significant challenges. It is estimated that around 80% of all clinical data is unstructured, freeform text written in natural language (e.g., freeform notes typed or written on paper, for example reflecting patient progress reports, operative notes, clinical observations, radiology or pathology reports, etc.). At best, if this clinical data is structured at all, the clinical data is likely structured differently from the clinical registry (e.g., the recorded clinical data may contain fields that do not match one-to-one with the clinical registry). Populating a clinical registry on behalf of one or more patients requires focused effort to convert the usually unstructured clinical data into structured entries into the clinical registry.
[0005] Specifically, populating the clinical registry often involves manual data extraction or entry by a nurse, medical resident, consultant, or data extraction specialist. This process is slow and methodical, effectively limiting the amount of data that can be populated and maintained in the clinical registry. Conventionally, data extractors are left with two general approaches for handling the mismatch between unstructured clinical data (e.g., freeform notes or reports) and the structured clinical registry. In one approach, the data extractor can limit the data entered into the clinical registry to only the structured fields contained within or easily identifiable from the unstructured clinical data. This approach is not optimal as it may effectively discard a significant amount of the information contained in the unstructured clinical notes. This results in less patient data being available to provide to clinical applications, use for identifying patient cohorts for clinical trials, inform clinical decisions, etc. In a second possible approach for handling the unstructured clinical data, the data extractor may spend an extensive amount of time and mental effort to translate particular elements or combinations of elements in the unstructured clinical notes to respective entries in the clinical registry. This approach, though, is quite complex and not easily scalable in a clinical setting that may require extraction/entry of data for tens, hundreds, or even thousands of patients and medical personnel. Moreover, either of the two approaches identified above fundamentally relies upon the abilities of the data extractor to accurately identify and record data into the clinical registry.
[0006] As a result of the drawbacks of the conventional approaches above, there exists a risk that data maintained in a clinical registry is incomplete and/or inaccurate, limiting the potential of references or queries of the clinical registry to inform clinical decisions, identify patient cohorts for clinical trials, or supply data to visualization, auditing, reporting, or other applications.
SUMMARY
[0007] The present disclosure proposes a use of a large language model (i.e., one or more large language models (LLMs)) to extract structured information from unstructured clinical data (e.g., typed or handwritten freeform clinical notes or reports). Specifically, for any input comprising unstructured clinical data, the LLM may identify data elements contained within the input, identify locations in a structured database (e.g., a clinical registry) corresponding to the identified data elements, and populate the corresponding structured database locations with the corresponding identified data elements. In embodiments, a user may specify a particular set of unstructured clinical data (e.g., one or more case files, documents, pages, paragraphs, etc.), request a particular one or more data elements, and use the LLM to retrieve the requested one or more data elements from the unstructured clinical data, e.g., by applying the LLM to the unstructured clinical data, or, if the structured database has already been populated with the requested one or more data elements (e.g., as a result of prior requests for the data element(s)), automatically retrieving the requested one or more data elements from the structured database. The LLM and the structured database populated therewith may be used, for example, to identify patient cohorts for clinical trials, query the populated structured database (e.g., via further LLM techniques), and supply functionality to various clinical applications.
[0008] In implementations described herein, functionality of the LLM may be made accessible via a user-facing application (i.e., one or more applications), which may be accessible via mobile devices, laptop computers, desktop computers, and/or other devices of medical personnel such as physicians, nurses, residents, consultants, data extractors, etc. Medical personnel may use the LLM to convert any unstructured clinical data into corresponding structured database entries, for example by providing images or electronic documents containing the unstructured clinical data to the LLM. Additionally, in some implementations, medical personnel may provide natural language queries or other queries to the LLM to return various clinical information from the unstructured clinical notes and/or the structured database.
[0009] In one implementation, a computer-implemented method may be provided, the method being implemented via one or more processors. The computer-implemented method may include (1) obtaining one or more data objects defining clinical data associated with at least one patient, the one or more data objects not matching a structure of a structured database, (2) analyzing the one or more obtained data objects using one or more large language machine learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database, (3) populating the structured database with the one or more extracted data elements, and (4) providing an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device. The method may include additional, fewer, and/or alternate actions, including actions described herein.
[0010] In another implementation, one or more computer readable media (e.g., one or more non-transitory computer readable media) may be provided. The one or more computer readable media may store computer-executable instructions that, when executed via one or more processors of one or more computers, cause the one or more computers to (1) obtain one or more data objects defining clinical data associated with at least one patient, the one or more data objects not matching a structure of a structured database, (2) analyze the one or more obtained data objects using one or more large language machine learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database, (3) populate the structured database with the one or more extracted data elements, and (4) provide an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device. The one or more computer readable media may store additional, fewer, and/or alternate instructions, including instructions described herein.
[0011] In still another implementation, a computing system may be provided. The computing system may include one or more processors, and one or more memories (e.g., non-transitory memories) having stored thereon computer-executable instructions. The computer-executable instructions, when executed by the one or more processors, may cause the computing system to (1) obtain one or more data objects defining clinical data associated with at least one patient, the one or more data objects not matching a structure of a structured database, (2) analyze the one or more obtained data objects using one or more large language machine learning models to extract,
from the one or more obtained data objects, one or more data elements matching the structure of the structured database, (3) populate the structured database with the one or more extracted data elements, and (4) provide an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device. The computing system may include additional, fewer, and/or alternate components, including components described herein. Moreover, the computing system may be configured to perform additional, fewer, and/or alternate actions, including actions described herein.
[0012] Advantages will become more apparent to those of ordinary skill in the art from the following description of the preferred implementations, which have been shown and described by way of illustration. As will be realized, the present implementations can be capable of other and different implementations, and their details are capable of modification in various respects. Accordingly, the drawings and description are to be regarded as illustrative in nature and not as restrictive.
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] FIG. 1 depicts an example computing environment in which techniques of the present description may be implemented, in accordance with some embodiments;
[0014] FIG. 2 depicts a block diagram of example flows of data, in accordance with some embodiments;
[0015] FIG. 3 depicts a flow diagram of still other example flows of data, in accordance with some embodiments; and
[0016] FIG. 4 depicts a block diagram of an example computer-implemented method, in accordance with some embodiments.
[0017] The figures depict embodiments of this disclosure for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternate embodiments of the structures and methods illustrated herein may be employed without departing from the principles set forth herein. The figures are not to scale. Instead, they are drawn to clarify implementations of this disclosure. Connecting lines or connectors shown in the various figures presented are intended to represent example functional relationships, physical couplings, or logical couplings between the various elements. In general, the same reference
numbers will be used throughout the drawing(s) and accompanying written description to refer to the same or like parts.
DETAILED DESCRIPTION
[0018] Reference will now be made in detail to the various embodiments and implementations of the present disclosure illustrated in the accompanying drawings. Wherever possible, the same or like reference numbers will be used throughout the drawings to refer to the same or like features. Certain terminology is used in the following description for convenience only and is not limiting.
[0019] Systems and methods of the present disclosure relate, inter alia, to use of a large language model (LLM, i.e., one or more large language models) to extract structured database entries (e.g., entries of a clinical registry) from unstructured clinical data, for example in response to natural language queries and other requests for particular structured data from unstructured documents, such as handwritten or typewritten notes, clinical reports, and/or other clinical data not aligning with the structure of the structured database (e.g., a clinical data registry).
[0020] Traditionally, approximately 80% of clinical data recorded in real clinical settings is made up of unstructured data, including for example typewritten notes, notes recorded by hand, or reports otherwise being unstructured or not matching a structure of fields included in the structured database. Such unstructured clinical data may, for example, include patient progress reports, operative notes, radiology or pathology reports, etc. The LLM of the present disclosure may receive, as input, clinical data including unstructured clinical notes and/or other clinical information not matching the structure of an established structured database. Based upon the received input, the LLM identifies data elements contained therein, and matches the data elements to corresponding fields and patients in the structured database. The LLM then may appropriately populate the structured database using the corresponding identified data elements from the received clinical data, and may return relevant information to a user in response to a natural language query or database query, for example.
[0021] LLM-based functionalities to receive queries and/or to populate the structured database from unstructured clinical data may be provided, for example, at various client electronic computing devices associated with medical personnel such as physicians, nurses, medical
residents, consultants, dedicated data extractors, and/or other qualified/authorized personnel. By way of example and not limitation, these client computing devices may include desktop computers, laptop computers, tablets, smartphones, and/or smart wearable devices. Using one or more applications executing at a client computing device(s), medical personnel may, for example, capture one or more images of handwritten clinical notes or printed reports, upload locally stored computerized records (e.g., typed notes or other reports), or export electronic documents from one or more other applications executing at the client computing device.
[0022] In any case, once supplied to an LLM, the LLM intelligently interprets the unstructured clinical data and populates clinical information into appropriate fields according to the predefined structure of the structured database. Once provided to the structured database using the LLM, the unstructured clinical notes and the data elements identified therein can be applied for various purposes.
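The extraction flow of the preceding paragraphs can be sketched in miniature. The disclosure does not prescribe a particular implementation; the registry field names, prompt wording, and helper functions below are hypothetical, shown only to illustrate prompting an LLM for structured output and validating its JSON reply against the predefined fields of the structured database.

```python
import json

# Illustrative sketch only: field names and prompt wording are hypothetical.
REGISTRY_FIELDS = ["patient_id", "procedure", "procedure_date", "surgeon"]

def build_extraction_prompt(note_text: str) -> str:
    """Ask an LLM to emit one JSON object keyed by the registry's fields."""
    fields = ", ".join(REGISTRY_FIELDS)
    return (
        "Extract the following fields from the clinical note below and "
        f"reply with a single JSON object having exactly these keys: {fields}. "
        "Use null for any field not present in the note.\n\nNote:\n" + note_text
    )

def parse_llm_reply(reply: str) -> dict:
    """Validate the LLM's JSON reply against the registry schema, keeping
    only known fields, before the record is inserted into the database."""
    record = json.loads(reply)
    return {field: record.get(field) for field in REGISTRY_FIELDS}
```

In such a sketch, the client application (or server) would send the built prompt to the LLM and pass the reply through the validation step before populating the structured database.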
[0023] For example, in implementations, the structured database and/or LLM support querying for various applications. For example, various computer applications execute database queries (e.g., SQL queries) of the structured database to obtain populated clinical information contained therein. Queried clinical information may be used, for example, for charts, graphs, and/or other visualizations created/managed by data visualization applications (e.g., Tableau), or for reports generated/managed by clinical data auditing/reporting applications (e.g., REDCap).
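As a minimal sketch of the database-query pathway described above, assuming a hypothetical single-table registry schema (the disclosure does not specify one), a downstream visualization or auditing application might issue parameterized SQL queries of the populated structured database:

```python
import sqlite3

# Hypothetical registry schema, for illustration only.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE registry ("
    "patient_id TEXT, field TEXT, value TEXT, source_doc TEXT)"
)
con.executemany(
    "INSERT INTO registry VALUES (?, ?, ?, ?)",
    [
        ("p-001", "procedure", "appendectomy", "op_note_17.txt"),
        ("p-001", "procedure_date", "2024-03-02", "op_note_17.txt"),
        ("p-002", "procedure", "cholecystectomy", "op_note_18.txt"),
    ],
)

def query_field(patient_id: str, field: str):
    """SQL query of the structured database, e.g., for a reporting application;
    the source document is returned alongside the value to preserve context."""
    cur = con.execute(
        "SELECT value, source_doc FROM registry WHERE patient_id = ? AND field = ?",
        (patient_id, field),
    )
    return cur.fetchall()
```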
[0024] Additionally or alternatively, in implementations, the structured database and LLM support natural language queries from users. Using one or more applications at a client computing device(s), for example, medical personnel may provide natural language input (e.g., via speech or text) prompting the structured database for particular information associated with one or more patients (e.g., “show me the patient’s most recent radiology report,” “show me blood work reports within four days of the [a particular procedure, e.g., surgery] performed by [a particular personnel, e.g., surgeon],” “show me vital measurements for each new patient to whom I have been assigned in the past three days,” or “show me the patient’s notes from immediately after the patient’s MRI procedure”). Where a source(s) of unstructured information (e.g., a particular portion of a physician’s notes) specified in a query has not already been provided to the structured database, the LLM may responsively incorporate the specified source(s) into the
structured database, and generate a response to the query and to future queries for the same or similar information.
[0025] In any case, from the structured database, the LLM may provide output in the form of particular clinical parameters and/or the context of the documents in which the parameters appear (e.g., to output a paragraph, page, or entirety of a handwritten note containing an element of information requested by the user). Natural language queries to the LLM may, in some embodiments, be phrased in the form of questions, in which case the query may include options for answers to be provided via the LLM (e.g., “was the patient’s [location] nerve preserved during [procedure]? Options: yes/no/partially”). The LLM may return as output the answer(s) to the question(s), e.g., via visual and/or audio output. In some embodiments, a client application(s) supporting the LLM additionally returns a clinical document or portion thereof indicating the answer(s) to the question(s), such that the user may view the context of the answer(s). When multiple documents are returned via the LLM, the user may refine the previously submitted query to thereby narrow the LLM output further from among the initially returned results.
[0026] Effectively, natural language queries to the LLM may include any one or more of various filters for information contained in (or requested to be obtained/stored in) the structured database, including but not limited to patient, hospital, hospital wing, room, type of procedure, portion of the body, symptom, diagnoses or disorder, medical personnel involved, date/time, current procedural terminology (CPT) code, and/or other medically relevant parameters in accordance with data maintained by the structured database (e.g., a clinical registry). In the instance that providing output via the LLM requires refinement or clarification from the user (e.g., to clarify which procedure, date, etc. from which the user seeks information), the LLM may prompt the user for additional information to refine the output of the LLM. The determination of what query refinements or clarifications are required from the user may be generated by the LLM based upon the particular initial output provided by the LLM. Effectively, the interactive input/output between the LLM and the user may produce a Chatbot functionality by which the user uses the LLM to request and refine selections of unstructured data objects (e.g., notes, clinical reports, etc.) to be incorporated into the structured database and provided to the user, and/or requests access to information from the unstructured data that has already been incorporated into the structured database.
[0027] In some implementations, one or more user-facing applications (i.e., executing at one or more client computing devices, e.g., of medical personnel) may access the structured database to generate, manage, and/or maintain the structured database. For example, similarly to as described above, a user may provide one or more natural language queries (or database queries) to the LLM via user-facing application(s) to summon any particular data already contained in the structured database, and/or specify sources to be incorporated into the structured database and data from which to be provided to the user (contingent upon authorization of the user to access the requested data). The user may, for example, cause data to be added, modified, and/or removed from the structured database (e.g., to thereby modify or maintain a clinical registry).
[0028] In some implementations, the structured database generated and/or modified via the LLM of the present disclosure may be utilized to identify cohorts of patients for selection in clinical trials. That is, clinical information extracted from unstructured clinical data may include information identifying eligibility of respective patients for clinical trials (e.g., pathology findings, biomarkers, etc.), as well as commonalities and/or contrasts between groups of patients that may be used to formulate two or more sub-groups of patients for involvement in the clinical trial. Users retrieving one or more patients for cohort selection may view and/or modify selections generated from the structured database to thereby confirm or refine cohort selection.
[0029] Similarly, clinical information extracted via the LLM from unstructured clinical data may be utilized by medical personnel to identify other treatments/therapies for a patient based upon data from the structured database. A natural language query to the structured database via the LLM may, for example, include “has the patient previously been prescribed [a given drug] for [a given disorder]?” or “is the patient eligible for treatment using [a given therapy]?”
[0030] In addition to (or alternatively to) being usable to service an existing structured database, the LLM described herein may be used to generate an entirely new structured database, e.g., upon applying the LLM to unstructured data from a new entity such as a new hospital, urgent care facility, governmental organization, etc. Particularly, upon defining a multiplicity of structured data fields that will make up the new structured database, the LLM can then be applied to extract corresponding structured data from unstructured clinical data managed by the entity and populate the corresponding structured data into the new structured database.
[0031] Integration of an LLM into a structured database and applications associated therewith (particularly, user-facing applications), presents a number of improvements over traditional techniques associated with generating, populating, and maintaining structured databases. For one, use of the LLM of the present disclosure significantly increases the amount of data that can feasibly be input to the structured database from unstructured clinical data (i.e., freeform clinical data and/or other clinical data not matching a structure of the structured database), without requiring medical personnel to granularly parse the unstructured clinical data. Additionally, accuracy of the intake of clinical information into the structured database is improved. For example, preliminary experimental work demonstrates that an LLM described herein may extract at least 80% of structured database fields from unstructured clinical data with an accuracy of between 90% and 95% at a speed of one data element per second. Further experimentation is expected to produce still further improvements to these performance metrics. The improved volume and accuracy of intake of information into the structured database, in turn, improves the usability of the structured database for various applications described herein (e.g., using clinical registries and/or other structured databases for cohort identification for clinical trials and/or treatment/therapy identification, use of clinical information for data visualization or auditing/reporting, etc.). Still additionally, the user-facing natural language query functionalities described herein may allow for user-friendly, intuitive, and (in some cases) hands-free querying of the structured database to obtain various relevant information associated with a patient or a medical practice. 
Still yet additionally, the techniques herein improve the speed with which the relevant information can be queried, by allowing the user to specify sources of information (e.g., clinical records, notes, or portions thereof) and automatically incorporating those sources into the structured database and sourcing information therefrom.
[0032] The LLM (i.e., one or more large language models) of the present disclosure may include various existing or future large language models, which may be tailored to the specific medical use cases described herein. In some envisioned implementations, the LLM includes the Pathways Language Model (PaLM) 2.0, and/or the Gemini LLM and/or chatbot developed by Google. Additionally or alternatively, though, the LLM may include GPT 3.5, GPT 4, and/or another LLM(s) developed by OpenAI, Llama 2, OpenLLaMA, Falcon, Dolly 2.0, and/or other open-source and/or private large language models. Further details regarding the training and implementation of the LLM will be provided in subsequent portions of the present disclosure.
[0033] Example Computing Environment
[0034] FIG. 1 depicts an exemplary computing environment 100 in which techniques disclosed herein may be implemented. The environment 100 may include computing resources for training and/or operating machine learning models (particularly, including one or more large language models (LLMs)) to perform functionalities described herein, e.g., interpreting unstructured clinical data to populate a structured database, and/or responding to natural language queries of the structured database.
[0035] The computing environment 100 may include a client computing device 102, a server computing device 104, an electronic network 106, a structured database 108 (e.g., including or consisting of one or more clinical registries), a context electronic database 110 and a model electronic database 112. The computing environment may further include one or more cloud application programming interfaces (APIs) 114. The components of the computing environment 100 may be communicatively connected to one another via the electronic network 106 (e.g., one or more wired and/or wireless communication networks), in various implementations.
[0036] The client computing device 102 may implement, inter alia, operation of one or more applications for obtaining unstructured clinical data (e.g., via capturing of images, uploading of locally stored documents, or exporting of data/documents from other applications accessed via the client computing device 102). In some implementations, the client computing device 102 may be implemented as one or more computing devices (e.g., one or more laptops, one or more mobile computing devices, one or more tablets, one or more wearable devices, one or more cloud-computing virtual instances, etc.). A plurality of client computing devices 102 may be part of the environment 100 - for example, a first user may access a client computing device 102 that is a laptop, while a second user accesses a client computing device 102 that is a smartphone, while yet a third user accesses a client computing device 102 that is a wearable device. Each of these respective users may participate in managing structured database information associated with same or different patients, hospitals, etc.
[0037] The client computing device 102 may include one or more processors 120, one or more network interface controllers 122, one or more memories 124, an input device 126, an output device 128 and a client API 130. The one or more memories 124 may have stored thereon one or more modules 140 (e.g., one or more sets of instructions).
[0038] In some implementations, the one or more processors 120 may include one or more central processing units, one or more graphics processing units, one or more field-programmable gate arrays, one or more application-specific integrated circuits, one or more tensor processing units, one or more digital signal processors, one or more neural processing units, one or more RISC-V processors, one or more coprocessors, one or more specialized processors/accelerators for artificial intelligence or machine learning-specific applications, one or more microcontrollers, etc.
[0039] The client computing device 102 may include one or more network interface controllers 122, such as Ethernet network interface controllers, wireless network interface controllers, etc. The network interface controllers 122 may include advanced features, in some implementations, such as hardware acceleration, specialized networking protocols, etc.
[0040] The memories 124 of the client computing device 102 may include volatile and/or nonvolatile storage media. For example, the memories 124 may include one or more random access memories, one or more read-only memories, one or more cache memories, one or more hard disk drives, one or more solid-state drives, one or more non-volatile memory express, one or more optical drives, one or more universal serial bus flash drives, one or more external hard drives, one or more network- attached storage devices, one or more cloud storage instances, one or more tape drives, etc.
[0041] As noted, the memories 124 may have stored thereon one or more modules 140, for example, as one or more sets of computer-executable instructions. In some implementations, the modules 140 may include additional storage, such as one or more operating systems (e.g., Microsoft Windows, GNU/Linux, Mac OSX, etc.). The operating systems may be configured to run the modules 140 during operation of the client computing device 102 - for example, the modules 140 may include additional modules and/or services for receiving and processing data from one or more other components of the environment 100 such as the one or more cloud APIs 114 or the server computing device 104. The modules 140 may be implemented using any suitable computer programming language(s) (e.g., Python, JavaScript, C, C++, Rust, C#, Swift, Java, Go, LISP, Ruby, Fortran, etc.).
[0042] The modules 140 may include a model configuration module 142, an API module 144, an input processing module 146, an authentication/security module 148, a context module 150
and a clinical data capture module 152, in some implementations. In some implementations, more or fewer modules 140 may be included. The modules 140 may be configured to communicate with one another (e.g., via inter-process communication, via a bus, via sockets, pipes, message queues, etc.).
[0043] The model configuration module 142 may include one or more sets of computer-executable instructions (i.e., software, code, etc.) for configuring one or more models (e.g., one or more LLMs) for extracting structured information from unstructured clinical data (e.g., any of various clinical data described herein, such as freeform notes/reports or clinical data having a structure not matching that of a clinical registry and/or another structured database(s)).
[0044] In some implementations, the model configuration module 142 may be omitted from the modules 140, or its access may be restricted to administrative users only. For example, in some implementations, one or more of the modules 140 may be packaged into a downloadable application (e.g., a smart phone app available from an app store) that enables registered but nonprivileged (i.e., non-administrative) users to access the environment 100 using their consumer client computing device 102. In other implementations, one or more of the client computing device 102 may be locked down, such that the client computing device 102 is controlled hardware, accessible only to those who have physical access to certain areas (e.g., of a hospital, urgent care facility, etc.).
[0045] The model configuration module 142 may include instructions for generating one or more graphical user interfaces that allow a user (e.g., a physician, nurse, data extractor, etc.) to identify one or more unstructured clinical data objects (e.g., pages, files, portions thereof, etc.) for data extraction and population into a structured database (e.g., stored via the one or more servers 104).
[0046] The API module 144 may include one or more sets of computer-executable instructions for accessing one or more remote APIs, and/or for enabling one or more other components within the environment 100 to access functionality of the client computing device 102. For example, the API module 144 may enable a remote user to specify and/or query unstructured clinical data objects, structured clinical data elements, natural language queries and responses thereto, and/or other data that may be stored locally at the client computing device 102 via the context database 110. In some implementations, the API module 144 may enable other client
applications (i.e., not applications facilitated by the modules 140) to connect to the client computing device 102, for example, to send queries or prompts, and to receive responses from the client computing device 102. The API module 144 may include instructions for authentication, rate limiting and error handling.
[0047] As noted, the client computing device 102 may enable one or more users to access one or more trained models (e.g., LLMs) by providing input prompts that are processed by one or more trained models. The input processing module 146 may perform pre-processing of user prompts prior to being input into one or more models, and/or post-processing of outputs output by one or more models. For example, the input processing module 146 may process data input into one or more input fields, voice inputs or other input methods (e.g., file attachments) depending upon the application. The input processing module 146 may receive inputs directly via the input device 126, in some implementations (e.g., natural language queries received via text input or voice input).
[0048] In some implementations, the input processing module 146 may perform post-processing of output received from one or more trained models. In some implementations, post-processing (and/or pre-processing) may include implementing content moderation mechanisms, to prevent misuses of trained models or inappropriate content generation. The input processing module 146 may include instructions for handling errors and for displaying errors to users (e.g., via the output device 128). The input processing module 146 may cause one or more graphical user interfaces to be displayed, for example to enable the user to enter information directly via a text field.
[0049] The authentication/security module 148 may include one or more sets of computer-executable instructions for implementing access control mechanisms for one or more trained models, ensuring that the model can only be accessed by those who are authorized to do so, and that the access of those users is private and secure. It should be appreciated that the security module 148 may grant permissions to users/agents based upon their respective permissions in a clinical context (e.g., permissions of particular medical personnel to access potentially sensitive medical information associated with particular patients, cohorts, medical facilities or portions thereof, etc.).
[0050] Generally, trained models, especially trained language models, require state information in order to meaningfully carry on a dialogue with a user or with another trained model. For example, if a user prompts a trained model with a question such as “What is the weather in Chicago today?” followed by a second prompt “And how about tomorrow?” the model should understand that, in context, the second query relates to the first query, insofar as the user is asking about the weather tomorrow in the same location (Chicago).
[0051] However, language models (e.g., large language models (LLMs)) are generally stateless, meaning that after they process a prompt, they have no internal record or memory of the information that was input, or the information that was generated as part of the language model’s processing. Thus, many systems add statefulness to models using context information. This may be implemented using sliding context windows, wherein a predetermined number of tokens (e.g., 4096 maximum tokens in the case of GPT 3.5, equivalent to about 3000 words) may be “remembered” by the LLM and can be used to enrich multiple sequential prompts input into the LLM (for example, when the LLM is used in a chat mode).
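A sliding context window of the kind described above can be sketched as follows. The function name is hypothetical, and the whitespace-based token counter is a stand-in for a real tokenizer; the point is only that older messages fall out of the fixed-size window and thus out of the model's effective memory:

```python
def sliding_window(messages, max_tokens, count_tokens=lambda m: len(m.split())):
    """Keep the most recent messages whose combined token count fits within
    the window; older messages are dropped and forgotten by the model."""
    kept, total = [], 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(msg)
        if total + cost > max_tokens:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))  # restore chronological order
```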
[0052] The context module 150 may include one or more sets of computer-executable instructions for maintaining state of the type found in this example, and other types of state information. The context module 150 may implement sliding window context, in some implementations. In other implementations, the context module 150 may perform other types of state maintaining strategies. For example, the context module 150 may implement a strategy in which information from the immediately preceding prompt is part of the window, regardless of the size of that prior prompt.
[0053] In some implementations, the context module 150 may implement a strategy in which one or more prior prompts are included in each current prompt. This prompt stuffing technique, or prompt concatenation, may be limited by prompt size constraints — once the total size of the prompt exceeds the prompt limit, the model immediately loses state information related to parts of the prompt truncated from the prompt.
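The prompt stuffing (prompt concatenation) strategy and its truncation behavior can be sketched as follows, with a simple character limit standing in for a real token limit; the function name is illustrative only:

```python
def stuff_prompt(prior_prompts, current_prompt, max_chars=2000):
    """Concatenate prior prompts into the current prompt ("prompt stuffing").
    Once the combined size exceeds the limit, the oldest text is truncated,
    and any state information it carried is lost to the model."""
    combined = "\n".join(list(prior_prompts) + [current_prompt])
    if len(combined) > max_chars:
        combined = combined[-max_chars:]  # keep only the most recent characters
    return combined
```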
[0054] The clinical data capture module 152 may include one or more sets of computer-executable instructions for obtaining unstructured clinical data objects (e.g., pages, files, portions thereof, etc., for example in response to user queries directly or indirectly identifying the unstructured clinical data objects or broader sets of sources including the unstructured clinical data objects), and/or extracting structured clinical data matching the structured database from the unstructured data. In some implementations, portions of operation of the clinical data capture module are instead executed at the one or more servers 104. For example, a portion of the clinical data capture module 152 may obtain one or more unstructured clinical data objects and upload the object(s) to the one or more server(s) 104, which may use one or more trained LLMs to extract structured clinical data and populate the structured database 108. In another implementation, the clinical data capture module 152 at the client computing device 102 extracts the structured clinical data elements and uploads the extracted elements to the structured database 108, e.g., via the one or more servers 104. Accordingly, the clinical data capture module may access one or more trained LLMs, which may be stored at the client computing device 102, accessed from the server computing device 104 via the network 106, and/or accessed via other means, including for example means depicted in FIG. 1.
[0055] The server computing device 104 may include one or more processors 160, one or more network interface controllers 162, one or more memories 164, an input device (not depicted), an output device (not depicted) and a server API 166. The one or more memories 164 may have stored thereon one or more modules 170 (e.g., one or more sets of instructions).
[0056] In some implementations, the one or more processors 160 may include one or more central processing units, one or more graphics processing units, one or more field-programmable gate arrays, one or more application-specific integrated circuits, one or more tensor processing units, one or more digital signal processors, one or more neural processing units, one or more RISC-V processors, one or more coprocessors, one or more specialized processors/accelerators for artificial intelligence or machine learning-specific applications, one or more microcontrollers, etc.
[0057] The server computing device 104 may include one or more network interface controllers 162, such as Ethernet network interface controllers, wireless network interface controllers, etc. The network interface controllers 162 may include advanced features, in some implementations, such as hardware acceleration, specialized networking protocols, etc.
[0058] The memories 164 of the server computing device 104 may include volatile and/or non-volatile storage media. For example, the memories 164 may include one or more random access memories, one or more read-only memories, one or more cache memories, one or more
hard disk drives, one or more solid-state drives, one or more non-volatile memory express, one or more optical drives, one or more universal serial bus flash drives, one or more external hard drives, one or more network-attached storage devices, one or more cloud storage instances, one or more tape drives, etc.
[0059] As noted, the memories 164 may have stored thereon one or more modules 170, for example, as one or more sets of computer-executable instructions. In some implementations, the modules 170 may include additional storage, such as one or more operating systems (e.g., Microsoft Windows, GNU/Linux, Mac OSX, etc.). The operating systems may be configured to run the modules 170 during operation of the server computing device 104 - for example, the modules 170 may include additional modules and/or services for receiving and processing data from one or more other components of the environment 100 such as the one or more cloud APIs 114 or the client computing device 102. The modules 170 may be implemented using any suitable computer programming language(s) (e.g., Python, JavaScript, C, C++, Rust, C#, Swift, Java, Go, LISP, Ruby, Fortran, etc.).
[0060] In some implementations, the modules 170 may include a data collection module 172, a data pre-processing module 174, a model pretraining module 176, a fine-tuning module 178, a model training module 180, a checkpointing module 182, a hyperparameter tuning module 184, a validation and testing module 186, an auto-prompting module 188, a model operation module 190 and an ethics and bias module 192. In some implementations, more or fewer modules 170 may be included. The modules 170 may be configured to communicate with one another (e.g., via inter-process communication, via a bus, via sockets, pipes, message queues, etc.). The modules 170 may respond to network requests (e.g., via the API 166) or other requests received via the network 106 (e.g., via the client computing device 102 or other components of the environment 100).
[0061] The data collection module 172 may be configured to collect information used to train one or more models. In general, the information collected may be any suitable information used for training a language model. The data collection module 172 may collect data via web scraping, via API calls/access, via database extract-transform-load (ETL) processes, etc. Sources accessed by the data collection module 172 include social media websites, books, websites, academic publications, web forums/interest sites (e.g., Reddit, Facebook, bulletin boards, medical journals, etc.), etc. The data collection module 172 may access data sources by active means (e.g., scraping or other retrieval) or may access existing corpora. The data collection module 172 may include sets of instructions for performing data collection in parallel, in some implementations. The data collection module 172 may store collected data in one or more electronic databases, such as a database accessible via the cloud APIs 114 or via a local electronic database (not depicted). The data may be stored in a structured and/or unstructured format. In some implementations, the data collection module 172 may store large data volumes used for training one or more models (i.e., training data). For example, the data collection module 172 may store terabytes, petabytes, exabytes or more of training data.
[0062] In some implementations, the data collection module 172 may retrieve data from the structured database 108. For example, the data collection module 172 may process the retrieved/ received data and sort the data into multiple subsets based on information included within the structured database 108.
[0063] The data pre-processing module 174 may include instructions for pre-processing data collected by the data collection module 172. In particular, the data pre-processing module 174 may perform text extraction and/or cleaning operations on data collected by the data collection module 172. The data pre-processing module 174 may perform pre-processing operations, such as lexical parsing, tokenizing, case conversions and other string splitting/munging. In some implementations, the data pre-processing module 174 may perform data deduplication, filtering, annotation, compliance, version control, validation, quality control, etc. In some implementations, one or more human reviewers may be looped into the process of pre-processing data collected by the data collection module 172. For example, a distributed work queue may be used to transmit batch jobs and receive human-computed responses from one or more human workers. Once pre-processed, the data pre-processing module 174 may store copies and/or modified versions of the training data in an electronic database. In some implementations, the data pre-processing module 174 may include instructions for parsing the unstructured text received by the data collection module 172 to structure the text.
[0064] Generally, the present techniques may train one or more models to perform language generation tasks that include token generation. Both training inputs and model outputs may be tokenized. Herein, tokenization refers to the process by which text used for training is divided
into units such as words, subwords or characters. Tokenization may break a single word into multiple subwords (e.g., “LLM” may be tokenized as “L” and “LM”). The present techniques may train one or more models using a set of tokens (e.g., a vocabulary) that includes many (e.g., thousands or more) tokens. These tokens may be embedded into a vector. This vector of tokens, or “embeddings,” may include numerical representations of the individual tokens in the vocabulary in high-dimensional vector space. The modules 170 may access and modify the embeddings during training to learn relationships between tokens. These relationships effectively represent semantic language meaning.
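For illustration only, the tokenization and embedding concepts described above may be sketched as follows. The vocabulary, the greedy longest-match strategy, and the embedding dimensionality below are hypothetical placeholders, not elements of the disclosure:

```python
# Illustrative sketch: a toy subword tokenizer and an embedding table.
import random

VOCAB = ["L", "LM", "token", "iz", "ation", "med", "ical"]

def tokenize(text, vocab=VOCAB):
    """Greedy longest-match tokenization into subword units."""
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Prefer the longest vocabulary entry matching at position i.
        for unit in sorted(vocab, key=len, reverse=True):
            if text.startswith(unit, i):
                match = unit
                break
        if match is None:
            match = text[i]  # fall back to a single character
        tokens.append(match)
        i += len(match)
    return tokens

# Each vocabulary token maps to a vector in high-dimensional space;
# training would adjust these values to encode semantic relationships.
random.seed(0)
EMBEDDINGS = {tok: [random.gauss(0, 1) for _ in range(8)] for tok in VOCAB}

print(tokenize("LLM"))  # prints ['L', 'LM'], per the example above
```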
[0065] In some implementations, a specialized database (e.g., a vector store, a graph database, etc.) may be used to store and query the embeddings. Embedding databases may include specialized features, such as efficient retrieval, similarity search and scalability. For example, the server computing device 104 may include a local electronic embedding database (not depicted). In some implementations, a remote embedding database service may be used (e.g., via the cloud APIs 114). Such a remote embedding database service may be based on an open source or proprietary model (e.g., Milvus, Pinecone, Redis, Postgres, MongoDB, Facebook AI Similarity Search (FAISS), etc.). The server computing device 104 may include instructions (e.g., in the data collection module 172) for adding training data to one or more specialized databases, and for accessing it to train models.
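The similarity-search feature of such an embedding database may be sketched, for illustration only, as an in-memory stand-in; the vector values and document identifiers below are invented, and a production store (e.g., FAISS or Milvus) would add indexing for scalability:

```python
# Illustrative sketch of embedding similarity search over a toy store.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy "store": document id -> embedding vector (values invented).
store = {
    "note_1": [0.9, 0.1, 0.0],
    "note_2": [0.0, 1.0, 0.2],
    "note_3": [0.7, 0.3, 0.1],
}

def nearest(query_vec, k=2):
    """Return the k document ids most similar to the query embedding."""
    ranked = sorted(store, key=lambda d: cosine_similarity(query_vec, store[d]),
                    reverse=True)
    return ranked[:k]

print(nearest([1.0, 0.0, 0.0]))  # prints ['note_1', 'note_3']
```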
[0066] The present techniques may include language modeling, wherein one or more deep learning models are trained by processing token sequences using a large language model architecture. For example, in some implementations, a transformer architecture may be used to process a sequence of tokens. Such a transformer model may include a plurality of layers including self-attention and feedforward neural networks. This architecture may enable the model to learn contextual relationships between the tokens, and to predict the next token in a sequence, based upon the preceding tokens. During training, the model is provided with the sequence of tokens and it learns to predict a probability distribution over the next token in the sequence. This training process may include updating one or more model parameters (e.g., weights or biases) using an objective function that minimizes the difference between the predicted distribution and a true next token in the training data.
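The training objective described above, i.e., predicting a probability distribution over the next token given preceding tokens, may be illustrated with a deliberately minimal bigram count model. A real transformer learns this distribution with self-attention layers and gradient descent; the toy corpus and counting approach here are for illustration only:

```python
# Minimal analogue of next-token prediction: a bigram count model.
from collections import Counter, defaultdict

corpus = "the patient reports pain . the patient denies fever .".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1  # count each observed (context, next-token) pair

def next_token_distribution(prev):
    """Predicted probability distribution over the next token."""
    total = sum(counts[prev].values())
    return {tok: c / total for tok, c in counts[prev].items()}

# After "patient", the model assigns equal probability to the two
# continuations observed in the training data.
print(next_token_distribution("patient"))
```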
[0067] Alternatives to the transformer architecture may include recurrent neural networks, long short-term memory networks, gated recurrent networks, convolutional neural networks, recursive neural networks, and other modeling architectures.
[0068] In some implementations, the modules 170 may include instructions for performing pretraining of a language model (e.g., an LLM), for example, in a pretraining module 176. The pretraining module 176 may include one or more sets of instructions for performing pretraining, which as used herein, generally refers to a process that may span pre-processing of training data via the data pre-processing module 174 and initialization of an as-yet untrained language model. In general, a pre-trained model is one that has no prior training on specific tasks. For example, the model pretraining module 176 may include instructions that initialize one or more model weights. In some implementations, the model pretraining module 176 may initialize the weights to have random values. The model pretraining module 176 may train one or more models using unsupervised learning, wherein the one or more models process one or more tokens (e.g., preprocessed data output by the data pre-processing module 174) to learn to predict one or more elements (e.g., tokens). The model pretraining module 176 may include one or more optimizing objective functions that the model pretraining module 176 applies to the one or more models, to cause the one or more models to predict one or more most-likely next tokens, based on the likelihood of tokens in the training data. In general, the model pretraining module 176 causes the one or more models to learn linguistic features such as grammar and syntax. The pretraining module 176 may include additional steps, including training, data batching, hyperparameter tuning and/or model checkpointing.
[0069] The model pretraining module 176 may include instructions for generating a model that is pretrained for a general purpose, such as general text processing/understanding. This model may be known as a “base model” in some implementations. The base model may be further trained by downstream training process(es), for example, those training processes described with respect to the fine-tuning module 178. The model pretraining module 176 generally trains foundational models that have general understanding of language and/or knowledge. Pretraining may be a distinct stage of model training in which training data of a general and diverse nature (i.e., not specific to any particular task or subset of knowledge) is used to train the one or more models. In some implementations, a single model may be trained
and copied. Copies of this model may serve as respective base models for a plurality of finetuned models.
[0070] In some implementations, base models may be trained to have specific levels of knowledge common to more advanced agents. For example, the model pretraining module 176 may train a medical student base model that may be subsequently used to fine tune an internist model, a surgeon model, a resident model, etc. In this way, the base model can start from a relatively advanced stage, without requiring pretraining of each more advanced model individually. This strategy represents an advantageous improvement, because pretraining can take a long time (many days), and pretraining the common base model only requires that the pretraining process be performed once.
[0071] The modules 170 may include a fine-tuning module 178. The fine-tuning module 178 may include instructions that train the one or more models further to perform specific tasks. For example, the fine-tuning module 178 may train each of a plurality of agents to generate one or more outputs that are based on each respective agent’s personality or characteristics. Specifically, the fine-tuning module 178 may include instructions that train one or more models to generate respective language outputs (e.g., text generation), summarization, question answering or translation activities based on the characteristics of each respective agent.
[0072] Continuing the example, the fine-tuning module 178 may include sets of instructions for retrieving one or more structured data sets, such as time series generated by the data preprocessing module 174. For example, the fine-tuning module 178 may include instructions for configuring an objective function for performing a specific task, such as generating text in accordance with a structure of the structured database 108 or in accordance with output expected by a medical personnel utilizing an LLM. For example, the fine-tuning module 178 may include instructions for fine-tuning a pathologist model, based on a base language model. A medical resident model may be fine-tuned by the fine-tuning module 178, wherein the base model is the same as that used to fine-tune the pathologist model. The fine-tuning module 178 may train many (e.g., hundreds or more) additional models.
[0073] In some implementations, the fine-tuning module 178 may include user-selectable parameters that affect the fine-tuning of the one or more models. For example, a “caution” bias parameter may be included that represents medical conservativeness. This bias parameter may
be adjusted to affect the cautiousness with which the resulting trained model (i.e., agent) approaches medical decision-making. Additional models may be trained, for additional personas/tasks, as discussed below.
[0074] In some implementations, to manage complexity of fine-tuning and other machine learning operations of the server computing device 104, one or more open source frameworks may be used. Example frameworks include TensorFlow, Keras, MXNet, Caffe, scikit-learn, and PyTorch. Specifically for training and operating language models, frameworks such as OpenLLM and LangChain may be used, in some implementations. The fine-tuning module 178 may use an algorithm such as stochastic gradient descent or another optimization technique to adjust weights of the pretrained model.
[0075] Fine-tuning may be an optional operation, in some implementations. In some implementations, training may be performed by the training module 180 after pretraining by the model pretraining module 176. In some implementations, the model training module 180 may perform task-specific training like the fine-tuning module 178, on a smaller scale or with a more tailored objective. For example, whereas the fine-tuning module 178 may fine tune a model to learn knowledge corresponding to a surgeon, the model training module 180 may further train the model to learn knowledge of a plastic surgeon, an orthopedic surgeon, etc.
[0076] The training module 180 may include one or more submodules, including the checkpointing module 182, the hyperparameter tuning module 184, the validation and testing module 186 and the auto-prompting module 188. The checkpointing module 182 may perform checkpointing, which is the saving of a model’s parameters. The checkpointing module 182 may store checkpoints during training and at the conclusion of training, for example, in the model electronic database 112. In this way, the model may be run (e.g., for testing and validation) at multiple stages with its training parameters loaded, and may also be retrained from a checkpoint. In this way, the model can be run and trained forward without being re-trained from the beginning, which may save significant time (e.g., days of computation). The hyperparameter tuning module 184 may manage hyperparameters such as batch size, model size, learning rate, etc. These hyperparameters may be adjusted to influence model training. The hyperparameter tuning module 184 may include instructions for tuning hyperparameters by successive evaluation. The validation and testing module 186 may include sets of instructions for validating and testing one
or more machine learning models, including those generated by the model pretraining module 176, the fine-tuning module 178 and the model training module 180. The auto-prompting module 188 may include sets of instructions for performing auto-prompting of one or more models. Specifically, the auto-prompting module 188 may enrich a prompt with additional information. The auto-prompting module 188 may include additional information in a prompt, so that the model receiving the prompt has additional context or directions that it can use. This may allow the auto-prompting module 188 to fine-tune a base model using one-shot or few-shot learning, in some implementations. The auto-prompting module 188 may also be used to focus the output of the one or more models.
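The prompt enrichment performed by such an auto-prompting step may be sketched, for illustration only, as follows; the context fields and few-shot example format are hypothetical placeholders, not structures defined by the disclosure:

```python
# Illustrative sketch of auto-prompting: wrapping a raw user prompt with
# additional context and few-shot examples before model submission.
def auto_prompt(user_prompt, context=None, few_shot_examples=None):
    """Enrich a raw prompt with context and one-shot/few-shot examples."""
    parts = []
    if context:
        parts.append("Context: " + context)
    for example in few_shot_examples or []:
        parts.append(f"Example question: {example['q']}\nExample answer: {example['a']}")
    parts.append("Question: " + user_prompt)
    return "\n\n".join(parts)

enriched = auto_prompt(
    "Was the nerve preserved?",
    context="Patient underwent prostatectomy on 2024-05-01.",
    few_shot_examples=[{"q": "Was margin status negative?", "a": "Yes"}],
)
print(enriched)
```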
[0077] In some implementations, the training module 180 may train multi-modal models. For example, the training module 180 may train a plurality of models each capable of drawing from multimodal data types such as written text, imaging data, laboratory data, real-time monitoring data, pathology images, etc. In some cases, the training module 180 may train a single model capable of processing the multimodal data types. In some implementations, a trained multimodal model may be used in conjunction with another model (e.g., a large language model) to provide non-text data interactions with users.
[0078] The model operation module 190 may operate one or more trained models. Specifically, the model operation module 190 may initialize one or more trained models, load parameters into the model(s), and provide the model(s) with inference data (e.g., prompt inputs). In some implementations, the model operation module 190 may deploy one or more trained models (e.g., a pretrained model, a fine-tuned model and/or a trained model) onto a cloud computing device (e.g., via the API 166). The model operation module 190 may receive one or more inputs, for example from the client computing device 102, and provide those inputs (e.g., one or more prompts) to the trained model. In some implementations, the API 166 may include elements for receiving requests to the model, and for generating outputs based on model outputs. For example, the API 166 may include a RESTful API that receives a GET or POST request including a prompt parameter. The model operation module 190 may receive the request from the API 166, pass the prompt parameter into the trained model and receive a corresponding output. For example, the prompt parameter may be “What is the smallest bone in the human body?” The prompt output may be “The stapes bone of the inner ear is the smallest bone in the human body.”
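The request handling described above may be sketched, for illustration only, as a handler extracting the prompt parameter from a POST body; the JSON field names and the model callable are assumptions, not a definitive implementation of the API 166:

```python
# Illustrative sketch of prompt-parameter extraction for a POST request.
import json

def handle_post(body, model):
    """Extract the prompt parameter from a request body and invoke the model."""
    try:
        prompt = json.loads(body)["prompt"]
    except (json.JSONDecodeError, KeyError):
        return {"status": 400, "error": "missing or malformed prompt parameter"}
    return {"status": 200, "output": model(prompt)}

# A stand-in "trained model" for demonstration purposes only.
def echo_model(prompt):
    return f"Model response to: {prompt}"

response = handle_post('{"prompt": "What is the smallest bone in the human body?"}',
                       echo_model)
print(response["status"], response["output"])
```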
[0079] As discussed, in some implementations, multi-modal modeling may be used. The data preprocessing module may, for example, process and understand image data, audio data, video data, etc. The server computing device 104 may interpret and respond to queries that involve understanding content from these different modalities. For example, the server computing device 104 may include an image processing module (not depicted) including instructions for performing image analysis on images provided by users, or images retrieved from patient EHR data. In some implementations, the server computing device 104 may generate outputs in modalities other than text. For example, the server computing device 104 may generate an audio response, an image, etc.
[0080] The operating module 190 may include a set of computer-executable instructions that when executed by one or more processors (e.g., the processors 160) cause a computer (e.g., the server computing device 104) to perform retrieval-augmented generation. Specifically, the operating module 190 may perform retrieval-augmented generation based upon inputs or queries received from the user. This allows the operating module 190 to tailor responses of a model based on the specific input and context, such as the medical issue, patient, or clinical data record under discussion. For example, one or more models may be pre-trained, fine-tuned and/or trained as discussed above. During that training, the model may learn to generate tokens based on general language understanding as well as application-specific training. Such a model at that point may be static, insofar as it cannot access further information when presented with an input query.
[0081] When the model is used at runtime, however, such as when deployed in the environment 100, the operating module 190 may perform retrieval operations, such as searching or selecting information from a document, a database, or another source. The operating module 190 may include instructions for processing user input and for performing a keyword search, a regular expression search, a similarity search, etc. based upon that user input. The operating module 190 may input the results of that search, along with the user input, into the trained model. Thus, the trained model may process this additional retrieved information to augment, or contextualize, the generation of tokens that represent responses to the user’s query. In sum, retrieval augmented generation applied in this manner allows the model to dynamically generate outputs that are more relevant to the user’s input query at runtime. Information that may be retrieved may include data corresponding to a patient (e.g., patient demographic information,
medical history, clinical notes, diagnoses, medications, allergies, immunizations, laboratory results, oncology information, radiation and imaging information, vitals, etc.) and additional training information, such as medical journals, notes or speech transcripts from symposia or other meetings/conferences, etc.
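The retrieval-augmented generation flow described above may be sketched, for illustration only, as follows: search a document store based on the user query, then prepend the retrieved results to the prompt before generation. The documents and the naive word-overlap scoring below are invented placeholders; a production system might instead use the embedding similarity search discussed earlier:

```python
# Illustrative sketch of retrieval-augmented generation (RAG).
documents = {
    "ehr_allergies": "Patient allergic to penicillin.",
    "ehr_labs": "Hemoglobin 13.2 g/dL on 2024-05-02.",
    "journal_note": "Symposium notes on post-operative care.",
}

def retrieve(query, k=1):
    """Rank documents by count of shared lowercase words with the query."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(documents[d].lower().split())),
                    reverse=True)
    return [documents[d] for d in scored[:k]]

def augmented_prompt(query):
    """Contextualize token generation by prepending retrieved information."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nUser query: {query}"

print(augmented_prompt("is the patient allergic to penicillin"))
```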
[0082] The present techniques may trigger retrieval augmented generation by processing a prompt, in some implementations. For example, a prompt may be processed by the input processing module 146 of the client computing device 102, prior to processing the prompt by the one or more generative models. The input processing module 146 may trigger retrieval augmented generation based on the presence of certain inputs, such as patient information, or a request for specific information, in the form of keywords. The input processing module 146 may perform entity recognition or other natural language processing functions to determine whether the prompt should be processed using retrieval augmented generation prior to being provided to the trained model.
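The trigger check described above may be sketched, for illustration only, as simple keyword/pattern matching deciding whether a prompt warrants retrieval-augmented generation; the trigger terms below are hypothetical, and a production system might instead use entity recognition as noted:

```python
# Illustrative sketch of deciding whether a prompt should trigger
# retrieval-augmented generation, based on patient-specific keywords.
import re

TRIGGER_PATTERNS = [
    r"\bpatient\b", r"\bMRN\s*\d+", r"\blab(oratory)?\s+results?\b",
    r"\breport(s)?\b",
]

def needs_retrieval(prompt):
    """Return True when the prompt references patient-specific information."""
    return any(re.search(p, prompt, re.IGNORECASE) for p in TRIGGER_PATTERNS)

print(needs_retrieval("Show me the patient's radiology report"))   # True
print(needs_retrieval("What is the smallest bone in the body?"))   # False
```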
[0083] As discussed above, prompts may be received via the input processing module 146 of the client computing device 102 and transmitted to the server computing device 104 via the electronic network 106. In some implementations, the output of the model may be modulated prior to being transmitted, output or otherwise displayed to a user.
[0084] For example, the ethics and bias module 192 may process the prompt input prior to providing the prompt input to the trained model, to avoid passing objectionable content into the trained model. In some implementations, the ethics and bias module 192 may process the output of the trained model, also to avoid providing objectionable output. It should be appreciated that trained language models may be unpredictable, and thus, processing outputs for ethical and bias concerns (especially in a medical context) may be important.
[0085] The client computing device 102 and the server computing device 104 may communicate with one another via the network 106. In some implementations, the client computing device 102 and/or the server computing device 104 may offload some or all of their respective functionality to the one or more cloud APIs 114. In implementations, the one or more cloud APIs 114 may include one or more public clouds, one or more private clouds and/or one or more hybrid clouds. The one or more cloud APIs 114 may include one or more resources provided under one or more service models, such as Infrastructure as a Service (IaaS), Platform as a
Service (PaaS), Software as a Service (SaaS), and Function as a Service (FaaS). For example, the one or more cloud APIs 114 may include one or more cloud computing resources, such as computing instances, electronic databases, operating systems, email resources, etc. The one or more cloud APIs 114 may include distributed computing resources that enable, for example, the model pretraining module 176 and/or other of the modules 170 to distribute parallel model training jobs across many processors.
[0086] In some implementations, the one or more cloud APIs 114 may include one or more language operation APIs, such as OpenAI, Bing, Claude.ai, etc. In other implementations, the one or more cloud APIs 114 may include an API configured to operate one or more open source models, such as Llama 2.
[0087] The electronic network 106 may be a collection of interconnected devices, and may include one or more local area networks, wide area networks, subnets, and/or the Internet. The network 106 may include one or more networking devices such as routers, switches, etc. Each device within the network 106 may be assigned a unique identifier, such as an IP address, to facilitate communication. The network 106 may include wired (e.g., Ethernet cables) and wireless (e.g., Wi-Fi) connections. The network 106 may include a topology such as a star topology (devices connected to a central hub), a bus topology (devices connected along a single cable), a ring topology (devices connected in a circular fashion), and/or a mesh topology (devices connected to multiple other devices). The electronic network 106 may facilitate communication via one or more networking protocols, such as packet protocols (e.g., Internet Protocol (IP)) and/or application-layer protocols (e.g., HTTP, SMTP, SSH, etc.). The network 106 may perform routing and/or switching operations using routers and switches. The network 106 may include one or more firewalls, file servers and/or storage devices. The network 106 may include one or more subnetworks such as a virtual LAN (VLAN).
[0088] The environment 100 may include one or more electronic databases, such as a relational database that uses structured query language (SQL) and/or a NoSQL database or other schema-less database suited for the storage of unstructured or semi-structured data. These electronic databases may include, for example, the structured database 108 and/or the model database 112.
[0089] The present techniques may store training data, training parameters and/or trained models in an electronic database such as the database 112. Specifically, one or more trained machine learning models may be serialized and stored in a database (e.g., as a binary, a JSON object, etc.). Such a model can later be retrieved, deserialized and loaded into memory and then used for predictive purposes. The one or more trained models and their respective training parameters (e.g., weights) may also be stored as blob objects. Cloud computing APIs may also be used to store trained models, via the cloud APIs 114. Examples of these services include AWS SageMaker, Google AI Platform and Azure Machine Learning.
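The serialize/store/deserialize cycle described above may be sketched, for illustration only, using standard-library serialization; the parameter layout below is an invented placeholder for a trained model's weights:

```python
# Illustrative sketch of storing a trained model's parameters as a JSON
# object or a binary blob, then deserializing for later inference.
import json
import pickle

model_params = {"weights": [[0.1, 0.2], [0.3, 0.4]], "bias": [0.0, 0.1]}

# Option 1: JSON, suitable for a document store or a text column.
json_blob = json.dumps(model_params)

# Option 2: pickle, suitable for a binary blob column or object storage.
binary_blob = pickle.dumps(model_params)

# Later retrieval: deserialize and load into memory for predictive use.
restored = json.loads(json_blob)
assert restored == model_params
restored_binary = pickle.loads(binary_blob)
assert restored_binary == model_params
print("round-trip successful")
```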
[0090] In operation, a user may access a prompt graphical user interface via the client computing device 102. The prompt graphical user interface may be configured by the model configuration module 142 and generated by the input processing module 146, and displayed by the input processing module 146 via the output device 128. The model configuration module 142 may configure the graphical user interface to accept prompts and display corresponding prompt outputs generated by one or more models processing the accepted prompts. The input processing module 146 may be configured to transmit the prompts input by the user via the electronic network 106 to the API 166 of the server computing device 104. The API 166 may process the user inputs via one or more trained models, or agents.
[0091] At the time the user accesses the prompt graphical user interface, one or more models may already be trained, including pretraining and fine-tuning. These trained models may be selectively loaded into the one or more agent objects based on configuration parameters, and/or based upon the content of the user’s input prompts.
[0092] In some implementations, the user may engage in a question-answer session with the client computing device 102, for example using the LLM of the present description to refine, clarify, or correct a request for information from the structured database 108, and/or to identify unstructured documents to be incorporated into the structured database 108.
[0093] The environment 100 may include additional, fewer, and/or alternate computing components, in various possible implementations.
[0094] Example Data Flows
[0095] In consideration of the foregoing, FIGS. 2 and 3 respectively depict a block diagram and flow diagram of example flows of data in accordance with various implementations of techniques of the present disclosure.
[0096] First referring to FIG. 2, a flow diagram involves unstructured clinical data 204. The unstructured clinical data 204 may, for example, include freeform, handwritten or typed notes, for example written by a physician, nurse, specialist, etc., clinical notes, and/or other suitable data described herein. In any case, generally speaking, the unstructured clinical data 204 is such that its structure (if present) does not match that of a structured database 212 (e.g., a structured clinical registry). That is, the structured database 212 contains various organized fields in which particular corresponding values are placed to populate the structured database 212, whereas freeform notes on the other hand usually do not have structured fields, and to the extent that any of the data 204 contains any structured fields, these fields usually do not correspond one-to-one with the structure of the structured database 212.
[0097] Unstructured clinical data 204 may, for example, be stored on a client computing device of a user (e.g., medical personnel), stored on a filing system of a medical facility, accessible via one or more applications from a client computing device of a user, and/or accessed via other means to supply the unstructured clinical data 204 to the large language model (LLM) as described herein. In some embodiments, unstructured clinical data 204 may include physical documents (e.g., handwritten notes or printed clinical reports), which may be provided to the LLM, for example, by taking photographs or electronic scans of the physical documents before or during the process of providing the physical documents to the LLM and/or querying data therefrom (which may for example perform optical character recognition on data contained therein).
[0098] According to various techniques of the present disclosure, the LLM receives the unstructured clinical data 204 and identifies particular data elements therein to populate the structured database 212 (e.g., in response to queries for information from the unstructured clinical data 204). In embodiments, the structured database 212 also stores analyzed unstructured clinical data 214, which may correspond to the unstructured clinical data 204 with pointers to specific data elements therein added based on the analysis by the LLM. That is, for any particular data element identified from an unstructured clinical data object (e.g., a page, a
file, etc.), the structured database 212 may store the original data object itself with a pointer to where the particular data element appears in the original data object. Thus, if a user or program subsequently requests the data element from the structured database 212 (e.g., via a natural language query or a database query), the structured database 212 may return not just the data element, but at least a portion of the original data object where the data element appears such that the user/program can view the context of the data element (e.g., the structured database 212 may return a page containing the data element, with the data element itself highlighted or otherwise signified).
[0099] Using the LLM, a human user may form a natural language query 222 to the structured database 212, for example via text input, voice input, and/or other input at a client computing device of the user. A natural language query may include, for example, “show me the patient’s reports immediately after Wednesday’s surgery” or “was the patient’s nerve preserved during surgery? Options: Yes/no/partially.” The LLM may convert the natural language query into a corresponding database query (e.g., SQL query) that can be used to search the structured database 212 for particular data objects (e.g., pages, files, etc.) or data elements therefrom. As described in the foregoing, the LLM may effectively provide a chatbot functionality, wherein the LLM returns objects/data elements in accordance with user requests and allows the user to further refine, clarify, or modify requests for information as needed, for example based on original output of the LLM. For example, for the query “show me the patient’s reports immediately after Wednesday’s surgery,” the LLM may return an indication of both a blood work report and a radiology report for viewing at the user’s client computing device. The LLM may ask the user (e.g., via text or simulated voice output) whether the user wants to see the blood work report, the radiology report, both, or neither. The user may then provide further input (e.g., via typing, speech, selection of user interface options, etc.) indicating “show me the radiology report,” in response to which the LLM may cause the radiology report (or relevant portions thereof, when requested) to be presented at the user’s client computing device. It should be appreciated that this is only one example and that innumerable queries, query refinements, and flows may be possible.
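The conversion of a natural language query into a corresponding database query may be sketched, for illustration only, against a minimal in-memory registry. The table and column names and the naive pattern matching below are assumptions; in the techniques described, the translation would be performed by the LLM itself rather than hand-written rules:

```python
# Illustrative sketch of natural-language-to-SQL conversion against a
# toy stand-in for the structured database 212.
import re
import sqlite3

def nl_to_sql(question):
    """Map a narrow class of questions onto a parameterized SQL query."""
    m = re.search(r"reports.*after (\w+day)", question, re.IGNORECASE)
    if m:
        return ("SELECT title FROM reports WHERE day_after = ?",
                (m.group(1).capitalize(),))
    raise ValueError("unsupported question")

# Minimal in-memory registry with invented example rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (title TEXT, day_after TEXT)")
conn.executemany("INSERT INTO reports VALUES (?, ?)",
                 [("blood work report", "Wednesday"),
                  ("radiology report", "Wednesday"),
                  ("intake form", "Monday")])

sql, params = nl_to_sql("show me the patient's reports immediately after Wednesday's surgery")
rows = [r[0] for r in conn.execute(sql, params)]
print(rows)  # both the blood work and radiology reports are returned
```

Consistent with the chatbot flow described above, the query returns both matching reports, after which the user could further refine the request (e.g., "show me the radiology report").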
[0100] In some implementations, the LLM upon returning responses to a natural language query 222 may include additional questions to be asked of the user to augment the structured database 212 (or more particularly, to augment a returned clinical data object). For example, upon
returning a patient’s radiology report to a physician, the LLM may identify additional questions to ask of the physician to augment the radiology report or another associated portion of the patient’s records. Based upon further input (i.e., answers) provided by the physician, the LLM may identify corresponding portions of the structured database 212 to update or augment.
[0101] In addition to or alternatively to natural language queries 222, database queries 226 (e.g., SQL queries) may be provided to the structured database 212. Database queries 226 may be generated and submitted, for example, via various applications executing on behalf of medical personnel, a medical facility, or another entity associated therewith. For example, a data visualization application (e.g., Tableau) may request particular data elements from the structured database 212 for use in generating a chart for a patient or aggregated data describing a multiplicity of patients treated by a medical facility. As another example, an auditing and/or reporting application (e.g., REDCap) may generate and submit requests for information to be used in various reporting contexts. Because the structure of a database query 226 may match the structure of the structured database 212, use of the LLM may not be required to return appropriate information from a database query 226.
[0102] In some scenarios, the structured database 212 at the time of receiving a query (e.g., natural language query 222 or database query 226) already includes the information requested. For example, the query may refer to one or more sources of information that include unstructured clinical data 204 that has already been analyzed and populated in the structured database 212. Alternatively, in some scenarios, one or more sources are obtained and incorporated into the structured database 212 based on the query itself. For example, the query may include a request for patient information associated with the one or more sources, or the query may specifically identify the one or more sources from which relevant data (such as patient information, etc.) is to be extracted.
[0103] Moving to FIG. 3, other example flows of information are depicted. These flows of information generally include (1) interactions between a user 302 and an application (“app”) 304 (e.g., executing at the client device 102 of FIG. 1 from memory 124 via the processor 120), (2) interactions between the application 304 and an app service 306 (e.g., back-end service executing at the server computing device 104 and/or another server computing device), and (3) interactions between the app service 306 and a large language model (LLM) service 308 (e.g., Google
Gemini and/or another suitable LLM, which may for example be stored and executed at the server computing device 104 and/or another server computing device). Steps and actions within these steps will be described, which generally may be performed for example by corresponding components of the environment 100 of FIG. 1. The steps of FIG. 3 may be augmented and/or substituted by other actions in this detailed description, in various embodiments.
[0104] At a step 1, the user 302 uses the application 304 (e.g., via the input device 126 and/or output device 128 of FIG. 1) to identify one or more patients regarding whom clinical data is to be obtained. Actions between the user 302 and application 304 herein may, for example, take the form of one or more natural language queries and/or one or more database queries (e.g., as described with respect to FIG. 2). Step 1 may include actions 1.1 and 1.2, where the application 304 communicates with the app service 306 to identify and return information indicating the identified patients and records associated therewith (e.g., records already analyzed to produce structured data in a structured database such as a clinical registry, and/or unstructured records not yet analyzed via the techniques of this disclosure). The application 304 may display indications of patients and records associated therewith to the user 302 (e.g., via the output device 128 of FIG. 1).
[0105] At a step 2, the user 302 filters records returned from step 1, e.g., via interactions between the user 302, application 304, and app service 306. The user may, for example, select one or more sources from which desired data is to be retrieved, e.g., using filtering mechanisms to limit to a particular hospital, hospital wing, room, type of procedure, portion of the body, symptom, diagnoses or disorder, medical personnel involved, date/time, current procedural terminology (CPT) code, and/or other medically relevant parameter. Step 2 may include actions 2.1 between the application 304 and app service 306 to review and refine subsets of filtered records in response to filtering selections from the user 302.
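The filtering mechanisms of step 2 may be illustrated with a minimal, non-limiting sketch; the function and field names (e.g., `filter_records`, `cpt_code`) are assumptions for illustration only, and other medically relevant parameters would follow the same pattern:

```python
def filter_records(records, hospital=None, cpt_code=None, date_range=None):
    """Limit a set of returned records by medically relevant parameters.

    Each keyword mirrors one of the example filtering mechanisms above
    (hospital, CPT code, date/time range); a record is kept only if it
    satisfies every filter that was supplied.
    """
    out = []
    for rec in records:
        if hospital is not None and rec.get("hospital") != hospital:
            continue
        if cpt_code is not None and rec.get("cpt_code") != cpt_code:
            continue
        if date_range is not None:
            start, end = date_range
            if not (start <= rec.get("date", "") <= end):
                continue
        out.append(rec)
    return out
```

Successive calls with progressively narrower parameters correspond to the review-and-refine loop of action 2.1.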
[0106] At a step 3, the user 302 extracts particular subsets of information from the filtered records of step 2, e.g., by defining particular data elements and/or portions of the filtered records desired to be analyzed via the LLM of the present disclosure. Step 3 may include action 3.1, where the application 304 communicates with the app service 306 to prepare and review prompts to the LLM service 308, e.g., in the form of natural language queries (or additionally/alternatively, in embodiments, database queries). Preparing and reviewing prompts
at action 3.1 may include the user 302 reviewing and/or modifying the prompts via interactions with the application 304. Extracting information at step 3 further includes actions 3.1.1 and 3.1.2, where the app service 306 provides the prepared prompts to the LLM service 308 and receives output from the LLM service 308. At action 3.2, the app service 306 and application 304 return results to the user, e.g., in the form of structured data from the structured database that is responsive to the prompt(s) and that is sourced from unstructured records or portions thereof specified by the user 302.
[0107] At a step 4, the user 302 reviews the results returned from step 3, e.g., via further interactions with the application 304. If desired, the user 302 may re-submit a prompt by repeating actions of steps 1, 2, and/or 3, e.g., to modify a parameter of a previous prompt and/or to obtain further information associated with previously returned results.
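The four steps above may be sketched end to end as follows. This is a non-limiting illustration: the `app_service` and `llm_service` objects, their method names, and the prompt format are all hypothetical stand-ins for the components of FIG. 3:

```python
def run_extraction_flow(app_service, llm_service, patient_ids, keep, data_elements):
    """Illustrative end-to-end flow mirroring steps 1-4 of FIG. 3."""
    # Step 1: identify patients and their associated records.
    records = app_service.find_records(patient_ids)
    # Step 2: filter records down to the sources of interest
    # (`keep` is any filtering predicate chosen by the user).
    filtered = [r for r in records if keep(r)]
    # Step 3: prepare a prompt naming the desired data elements and
    # submit it to the LLM service (actions 3.1, 3.1.1, 3.1.2).
    prompt = "Extract {} from: {}".format(
        ", ".join(data_elements), "; ".join(r["text"] for r in filtered))
    output = llm_service.complete(prompt)
    # Step 4 (action 3.2): return results for the user to review;
    # the user may then re-submit a refined prompt.
    return output
```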
Example Computer-Implemented Method
[0108] FIG. 4 depicts a block diagram of an example computer-implemented method 400 in accordance with at least some of the techniques described herein.
[0109] In various implementations, the method 400 may be performed via computing components of the environment 100 described with respect to FIG. 1. In some implementations, a computing system includes one or more processors and one or more computer memories (e.g., one or more non-transitory memories) storing instructions thereon that, when executed via the one or more processors, cause the computing system to perform actions of the method 400, alone or in combination with other actions of this disclosure. In some implementations, one or more computer readable media (e.g., one or more non-transitory computer readable media) store instructions that, when executed via one or more computers, cause the one or more computers to perform actions of the method 400, alone or in combination with other actions of this disclosure.
[0110] The method 400 includes obtaining one or more data objects defining clinical data associated with at least one patient (402). The one or more data objects may be entirely unstructured (e.g., freeform text in the form of handwritten or typewritten notes, unstructured reports, etc.) or at least do not match a structure of a structured database (e.g., do not have one-to-one correspondence with fields to be populated in the structured database). Obtaining the one or more data objects may include obtaining photo images, files, etc., for example from local storage of the client computing device, from a camera unit of the client computing device, and/or
via export from another application executing on or otherwise accessed via the client computing device.
[0111] The method 400 also includes analyzing the one or more obtained data objects using one or more large language machine learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database (404).
[0112] The method 400 still further includes populating the structured database with the one or more extracted elements to thereby update the structured database based upon information contained in the one or more obtained data objects (406).
[0113] The method 400 further includes providing an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device (408). The query may, for example, be a natural language query or a database query received from the client computing device, e.g., as described with respect to FIG. 2 or FIG. 3.
[0114] In some embodiments, the query itself defines the specific patient information to be returned (alternatively, the query may simply request information about a patient or group of patients more generally). Moreover, in some embodiments, the query identifies one or more data sources from which to extract the one or more data elements matching the structure of the structured database. In these scenarios, action 402 may include obtaining the unstructured data object(s) based on the query itself, e.g., by identifying unstructured documents to be analyzed using the large language model and input into the structured database at actions 404 and 406, respectively.
[0115] The method 400 may include still additional, fewer, and/or alternate actions, including various suitable actions described in this disclosure. In some implementations, the method 400 provides original data objects (e.g., files, photos, etc.) back to a user from the structured database, with the provided original data objects including indications (e.g., annotations, highlights, etc.) where particular requested data elements are found. Further, in some implementations, actions of the method 400 are performed iteratively, such that the user of the client computing device provides further requests to the large language model based upon the output of action 408. Still further, in some implementations, the structured database is accessed to identify one or more patients or groups of patients in the structured database for a clinical trial.
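By way of example, and not limitation, actions 402-408 of the method 400 may be condensed into a short sketch; the helper names (`method_400`, `llm_extract`) and data shapes are hypothetical:

```python
def method_400(data_objects, llm_extract, structured_db, query):
    """Sketch of method 400: obtain (402), analyze (404),
    populate (406), and respond to a query (408)."""
    # 402: the data objects (freeform notes, reports, etc.) have been
    # obtained; 404: the LLM extracts data elements matching the
    # structure of the structured database from each object.
    extracted = []
    for obj in data_objects:
        extracted.extend(llm_extract(obj))
    # 406: the structured database is populated with the extracted
    # data elements.
    structured_db.extend(extracted)
    # 408: an indication of the extracted data elements is provided in
    # response to the query from the client computing device.
    return [e for e in structured_db if e["patient_id"] == query["patient_id"]]
```

In an iterative use, the returned indication would inform the user's next query, repeating the loop described above.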
[0116] Additional Considerations
[0117] The following considerations also apply to the foregoing discussion. Throughout this specification, plural instances may implement operations or structures described as a single instance. Although individual operations of one or more methods are illustrated and described as separate operations, one or more of the individual operations may be performed concurrently, and nothing requires that the operations be performed in the order illustrated. These and other variations, modifications, additions, and improvements fall within the scope of the subject matter herein.
[0118] It should also be understood that, unless a term is expressly defined in this patent using the sentence “As used herein, the term is hereby defined to mean . . . ” or a similar sentence, there is no intent to limit the meaning of that term, either expressly or by implication, beyond its plain or ordinary meaning, and such term should not be interpreted to be limited in scope based on any statement made in any section of this patent (other than the language of the claims). To the extent that any term recited in the claims at the end of this patent is referred to in this patent in a manner consistent with a single meaning, that is done for sake of clarity only so as to not confuse the reader, and it is not intended that such claim term be limited, by implication or otherwise, to that single meaning. Finally, unless a claim element is defined by reciting the word "means" and a function without the recital of any structure, it is not intended that the scope of any claim element be interpreted based on the application of 35 U.S.C. § 112(f).
[0119] Unless specifically stated otherwise, discussions herein using words such as "processing," "computing," "calculating," "determining," "presenting," "displaying," or the like may refer to actions or processes of a machine (e.g., a computer) that manipulates or transforms data represented as physical (e.g., electronic, magnetic, or optical) quantities within one or more memories (e.g., volatile memory, non-volatile memory, or a combination thereof), registers, or other machine components that receive, store, transmit, or display information.
[0120] As used herein any reference to "one implementation" or "an implementation" means that a particular element, feature, structure, or characteristic described in connection with the implementation is included in at least one implementation. The appearances of the phrase "in one implementation" in various places in the specification are not necessarily all referring to the same implementation.
[0121] As used herein, the terms "comprises," "comprising," "includes," "including," "has," "having" or any other variation thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, article, or apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Further, unless expressly stated to the contrary, "or" refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).
[0122] In addition, use of "a" or "an" is employed to describe elements and components of the implementations herein. This is done merely for convenience and to give a general sense of the invention. This description should be read to include one or at least one and the singular also includes the plural unless it is obvious that it is meant otherwise.
[0123] The various embodiments described above can be combined to provide further embodiments. All U.S. patents, U.S. patent application publications, U.S. patent applications, foreign patents, foreign patent applications and non-patent publications referred to in this specification and/or listed in the Application Data Sheet are incorporated herein by reference, in their respective entireties, for all purposes. Implementations of the embodiments can be modified if necessary to employ concepts of the various patents, applications, and publications to provide yet further embodiments.
[0124] These and other changes can be made to the embodiments in light of the above-detailed description. In general, in the following claims, the terms used should not be construed to limit the claims to the specific embodiments disclosed in the specification and the claims but should be construed to include all possible embodiments along with the full scope of equivalents to which such claims are entitled. Accordingly, the claims are not limited by the disclosure.
[0125] Upon reading this disclosure, those of skill in the art will appreciate still additional alternative structural and functional designs for implementing the concepts disclosed herein, through the principles disclosed herein. Thus, while particular implementations and applications have been illustrated and described, it is to be understood that the disclosed implementations are not limited to the precise construction and components disclosed herein. Various modifications, changes and variations, which will be apparent to those skilled in the art, may be made in the
arrangement, operation and details of the method and apparatus disclosed herein without departing from the spirit and scope defined in the appended claims.
[0126] By way of example, and not limitation, the present disclosure contemplates a number of aspects, including but not limited to:
[0127] 1. A computing system comprising: one or more processors; and one or more memories having stored thereon computer-executable instructions that, when executed by the one or more processors, cause the computing system to: obtain one or more data objects defining clinical data associated with at least one patient, the one or more data objects not matching a structure of a structured database; analyze the one or more obtained data objects using one or more large language machine learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database; populate the structured database with the one or more extracted data elements; and provide an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device.
[0128] 2. The computing system of claim 1, wherein the query includes a request for patient information corresponding to the one or more extracted data elements.
[0129] 3. The computing system of aspect 1 or 2, wherein the query identifies one or more sources from which to extract the one or more data elements matching the structure of the structured database, the identified one or more sources including the obtained one or more data objects.
[0130] 4. The computing system of aspect 3, wherein the obtaining of the one or more data objects is based on the obtained one or more data objects being included in the one or more sources.
[0131] 5. The computing system of any one of aspects 1-4, wherein the one or more data objects include one or more clinical reports.
[0132] 6. The computing system of any one of aspects 1-5, wherein the one or more data objects include one or more handwritten clinical notes.
[0133] 7. The computing system of any one of aspects 1-6, wherein the one or more data objects include one or more typewritten clinical notes.
[0134] 8. The computing system of any one of aspects 1-7, wherein at least one of the one or more data objects is obtained via exporting the one or more data objects from a local storage of the client computing device, or via one or more applications executing at the client computing device.
[0135] 9. The computing system of any one of aspects 1-8, wherein the instructions to provide an indication of the one or more extracted data elements to a client computing device include instructions to cause the client computing device to display a corresponding data object in which the one or more extracted data elements appear, the corresponding data object being selected from among the one or more data objects.
[0136] 10. The computing system of aspect 9, wherein the instructions to cause the client computing device to display the corresponding data object include instructions to cause the client computing device to display one or more visual indicators of the one or more extracted data elements.
[0137] 11. The computing system of any one of aspects 1-10, wherein the query includes a natural language query from the client computing device.
[0138] 12. The computing system of aspect 11, wherein the natural language query is defined based on text input from the client computing device.
[0139] 13. The computing system of aspect 11, wherein the natural language query is defined based on voice input from the client computing device.
[0140] 14. The computing system of any one of aspects 1-10, wherein the query includes a database query.
[0141] 15. The computing system of any one of aspects 1-14, wherein the structured database includes a clinical registry.
[0142] 16. The computing system of any one of aspects 1-15, configured to perform the actions of the computing system of any other suitable one of aspects 1-15.
[0143] 17. A computer-implemented method performed via one or more processors, the method comprising: obtaining one or more data objects defining clinical data associated with at least one patient, the one or more data objects not matching a structure of a structured database; analyzing the one or more obtained data objects using one or more large language machine
learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database; populating the structured database with the one or more extracted data elements; and providing an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device.
[0144] 18. The computer-implemented method of aspect 17, wherein the query includes a request for patient information corresponding to the one or more extracted data elements.
[0145] 19. The computer-implemented method of aspect 17 or 18, wherein the query identifies one or more sources from which to extract the one or more data elements matching the structure of the structured database, the identified one or more sources including the obtained one or more data objects.
[0146] 20. The computer-implemented method of aspect 19, wherein the obtaining of the one or more data objects is based on the obtained one or more data objects being included in the one or more sources.
[0147] 21. The computer-implemented method of any one of aspects 17-20, wherein the one or more data objects include one or more clinical reports.
[0148] 22. The computer-implemented method of any one of aspects 17-21, wherein the one or more data objects include one or more handwritten clinical notes.
[0149] 23. The computer-implemented method of any one of aspects 17-22, wherein the one or more data objects include one or more typewritten clinical notes.
[0150] 24. The computer-implemented method of any one of aspects 17-23, wherein at least one of the one or more data objects is obtained via exporting the one or more data objects from a local storage of the client computing device, or via one or more applications executing at the client computing device.
[0151] 25. The computer-implemented method of any one of aspects 17-24, wherein providing the indication of the one or more extracted data elements to a client computing device includes causing the client computing device to display a corresponding data object in which the one or more extracted data elements appear, the corresponding data object being selected from among the one or more data objects.
[0152] 26. The computer-implemented method of aspect 25, wherein causing the client computing device to display the corresponding data object includes causing the client computing device to display one or more visual indicators of the one or more extracted data elements.
[0153] 27. The computer-implemented method of any one of aspects 17-26, wherein the query includes a natural language query from the client computing device.
[0154] 28. The computer-implemented method of aspect 27, wherein the natural language query is defined based on text input from the client computing device.
[0155] 29. The computer-implemented method of aspect 27, wherein the natural language query is defined based on voice input from the client computing device.
[0156] 30. The computer-implemented method of any one of aspects 17-26, wherein the query includes a database query.
[0157] 31. The computer-implemented method of any one of aspects 17-30, wherein the structured database includes a clinical registry.
[0158] 32. The computer-implemented method of any one of aspects 17-31, in combination with any other suitable one of aspects 17-31.
[0159] 33. The computer-implemented method of any one of aspects 17-32, performed via the computing system of any suitable one of aspects 1-16.
[0160] 34. One or more non-transitory computer-readable media storing instructions that, when executed via one or more processors of one or more computers, cause the one or more computers to: obtain one or more data objects defining clinical data associated with at least one patient, the one or more data objects not matching a structure of a structured database; analyze the one or more obtained data objects using one or more large language machine learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database; populate the structured database with the one or more extracted data elements; and provide an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device.
[0161] 35. The one or more computer-readable media of aspect 34, wherein the query includes a request for patient information corresponding to the one or more extracted data elements.
[0162] 36. The one or more computer-readable media of aspect 34 or 35, wherein the query identifies one or more sources from which to extract the one or more data elements matching the structure of the structured database, the identified one or more sources including the obtained one or more data objects.
[0163] 37. The one or more computer-readable media of aspect 36, wherein the obtaining of the one or more data objects is based on the obtained one or more data objects being included in the one or more sources.
[0164] 38. The one or more computer-readable media of any one of aspects 34-37, wherein the one or more data objects include one or more clinical reports.
[0165] 39. The one or more computer-readable media of any one of aspects 34-38, wherein the one or more data objects include one or more handwritten clinical notes.
[0166] 40. The one or more computer-readable media of any one of aspects 34-39, wherein the one or more data objects include one or more typewritten clinical notes.
[0167] 41. The one or more computer-readable media of any one of aspects 34-40, wherein at least one of the one or more data objects is obtained via exporting the one or more data objects from a local storage of the client computing device, or via one or more applications executing at the client computing device.
[0168] 42. The one or more computer-readable media of any one of aspects 34-41, wherein the instructions to provide an indication of the one or more extracted data elements to a client computing device include instructions to cause the client computing device to display a corresponding data object in which the one or more extracted data elements appear, the corresponding data object being selected from among the one or more data objects.
[0169] 43. The one or more computer-readable media of aspect 42, wherein the instructions to cause the client computing device to display the corresponding data object include instructions to cause the client computing device to display one or more visual indicators of the one or more extracted data elements.
[0170] 44. The one or more computer-readable media of any one of aspects 34-43, wherein the query includes a natural language query from the client computing device.
[0171] 45. The one or more computer-readable media of aspect 44, wherein the natural language query is defined based on text input from the client computing device.
[0172] 46. The one or more computer-readable media of aspect 44, wherein the natural language query is defined based on voice input from the client computing device.
[0173] 47. The one or more computer-readable media of any one of aspects 34-43, wherein the query includes a database query.
[0174] 48. The one or more computer-readable media of any one of aspects 34-47, executed via the computing system of any suitable one of aspects 1-15.
[0175] 49. The one or more computer-readable media of any one of aspects 34-48, wherein the structured database includes a clinical registry.
[0176] 50. The one or more computer-readable media of any one of aspects 34-49, containing instructions to perform the method of any suitable one of aspects 17-33.
[0177] 51. The one or more computer-readable media of any one of aspects 34-50, in combination with the one or more computer-readable media of any other suitable one of aspects 34-50.
[0178] 52. Any one of aspects 1-51, in combination with any other suitable one of aspects 1-51.
Claims
1. A computer-implemented method performed via one or more processors, the method comprising: obtaining one or more data objects defining clinical data associated with at least one patient, the one or more data objects not matching a structure of a structured database; analyzing the one or more obtained data objects using one or more large language machine learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database; populating the structured database with the one or more extracted data elements; and providing an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device.
2. The computer-implemented method of claim 1, wherein the query includes a request for patient information corresponding to the one or more extracted data elements.
3. The computer-implemented method of claim 1, wherein the query identifies one or more sources from which to extract the one or more data elements matching the structure of the structured database, the identified one or more sources including the obtained one or more data objects.
4. The computer-implemented method of claim 3, wherein the obtaining of the one or more data objects is based on the obtained one or more data objects being included in the one or more sources.
5. The computer-implemented method of claim 1, wherein the one or more data objects include one or more handwritten clinical notes.
6. The computer-implemented method of claim 1, wherein at least one of the one or more data objects is obtained via exporting the one or more data objects from a local storage of the client computing device, or via one or more applications executing at the client computing device.
7. The computer-implemented method of claim 1, wherein providing the indication of the one or more extracted data elements to a client computing device includes causing the client computing device to display a corresponding data object in which the one or more extracted data elements appear, the corresponding data object being selected from among the one or more data objects.
8. The computer-implemented method of claim 1, wherein the query includes a natural language query from the client computing device.
9. The computer-implemented method of claim 1, wherein the structured database includes a clinical registry.
10. One or more non-transitory computer readable media storing instructions that, when executed via one or more processors of one or more computers, cause the one or more computers to: obtain one or more data objects defining clinical data associated with at least one patient, the one or more data objects not matching a structure of a structured database; analyze the one or more obtained data objects using one or more large language machine learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database; populate the structured database with the one or more extracted data elements; and provide an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device.
11. The one or more non-transitory computer readable media of claim 10, wherein the query includes a request for patient information corresponding to the one or more extracted data elements.
12. The one or more non-transitory computer readable media of claim 10, wherein the query identifies one or more sources from which to extract the one or more data elements matching the structure of the structured database, the identified one or more sources including the obtained one or more data objects.
13. The one or more non-transitory computer readable media of claim 10, wherein the obtaining of the one or more data objects is based on the obtained one or more data objects being included in the one or more sources.
14. The one or more non-transitory computer readable media of claim 10, wherein the one or more data objects include one or more handwritten clinical notes.
15. The one or more non-transitory computer readable media of claim 10, wherein at least one of the one or more data objects is obtained via exporting the one or more data objects from a local storage of the client computing device, or via one or more applications executing at the client computing device.
16. The one or more non-transitory computer readable media of claim 10, wherein the instructions to provide an indication of the one or more extracted data elements to a client computing device include instructions to cause the client computing device to display a corresponding data object in which the one or more extracted data elements appear, the corresponding data object being selected from among the one or more data objects.
17. The one or more non-transitory computer readable media of claim 10, wherein the query includes a natural language query from the client computing device.
18. The one or more non-transitory computer readable media of claim 10, wherein the structured database includes a clinical registry.
19. A computing system comprising: one or more processors; and one or more memories having stored thereon computer-executable instructions that, when executed by the one or more processors, cause the computing system to: obtain one or more data objects defining clinical data associated with at least one patient, the one or more data objects not matching a structure of a structured database; analyze the one or more obtained data objects using one or more large language machine learning models to extract, from the one or more obtained data objects, one or more data elements matching the structure of the structured database; populate the structured database with the one or more extracted data elements; and provide an indication of the one or more extracted data elements to a client computing device of a user in response to a query from the client computing device.
20. The computing system of claim 19, wherein the query identifies one or more sources from which to extract the one or more data elements matching the structure of the structured database, the identified one or more sources including the obtained one or more data objects, and wherein the obtaining of the one or more data objects is based on the obtained one or more data objects being included in the one or more sources.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363602541P | 2023-11-24 | 2023-11-24 | |
| US63/602,541 | 2023-11-24 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025111610A1 true WO2025111610A1 (en) | 2025-05-30 |
Family
ID=93924692
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2024/057340 Pending WO2025111610A1 (en) | 2023-11-24 | 2024-11-25 | Artificial intelligence systems and methods for patient charts |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025111610A1 (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20200159848A1 (en) * | 2018-11-20 | 2020-05-21 | International Business Machines Corporation | System for responding to complex user input queries using a natural language interface to database |
| US20220044812A1 (en) * | 2019-02-20 | 2022-02-10 | Roche Molecular Systems, Inc. | Automated generation of structured patient data record |
| US20230153641A1 (en) * | 2021-11-16 | 2023-05-18 | ExlService Holdings, Inc. | Machine learning platform for structuring data in organizations |
Non-Patent Citations (3)
| Title |
|---|
| AGRAWAL MONICA: "Towards Scalable Structured Data from Clinical Text", Ph.D. thesis, 31 March 2023, XP093243005, retrieved from the Internet: <URL:https://hdl.handle.net/1721.1/150049> |
| ARUN JAMES THIRUNAVUKARASU: "Large language models in medicine", Nature Medicine, vol. 29, no. 8, 17 June 2023, pages 1930-1940, ISSN: 1078-8956, DOI: 10.1038/s41591-023-02448-8, XP093163890, retrieved from the Internet: <URL:https://www.nature.com/articles/s41591-023-02448-8.pdf> |
| KATIKAPALLI SUBRAMANYAM KALYAN: "A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4", arXiv.org, 4 October 2023, XP091638932 |
Similar Documents
| Publication | Title |
|---|---|
| US10395641B2 (en) | Modifying a language conversation model |
| CN112509690A (en) | Method, apparatus, device and storage medium for controlling quality |
| US10691827B2 (en) | Cognitive systems for allocating medical data access permissions using historical correlations |
| US10585902B2 (en) | Cognitive computer assisted attribute acquisition through iterative disclosure |
| US11321534B2 (en) | Conversation space artifact generation using natural language processing, machine learning, and ontology-based techniques |
| US11847411B2 (en) | Obtaining supported decision trees from text for medical health applications |
| US20200311610A1 (en) | Rule-based feature engineering, model creation and hosting |
| US10540440B2 (en) | Relation extraction using Q and A |
| US11195119B2 (en) | Identifying and visualizing relationships and commonalities amongst record entities |
| US20180121607A1 (en) | Electronic health record quality enhancement |
| US20210133605A1 (en) | Method, apparatus, and computer program products for hierarchical model feature analysis and decision support |
| US20250378070A1 (en) | Answer generation using machine reading comprehension and supported decision trees |
| JP2025102692A (en) | Context-based task execution method and system for goal-oriented dialogue |
| US11526509B2 (en) | Increasing pertinence of search results within a complex knowledge base |
| WO2025111558A1 (en) | Methods and systems for optimizing healthcare data management using generative artificial intelligence agents |
| WO2025145165A1 (en) | Chart and nearest neighbor patient mapping and llm output |
| Freise et al. | Automatic prompt optimization techniques: Exploring the potential for synthetic data generation |
| Shafi et al. | Llm-therapist: A rag-based multimodal behavioral therapist as healthcare assistant |
| CN113836284A (en) | Method and device for constructing knowledge base and generating response statement |
| WO2025145006A1 (en) | Techniques for optimizing summary generation using generative artificial intelligence models |
| WO2025111610A1 (en) | Artificial intelligence systems and methods for patient charts |
| Puppala et al. | SCAN: A HealthCare Personalized ChatBot with Federated Learning Based GPT |
| Oruche et al. | Science gateway adoption using plug‐in middleware for evidence‐based healthcare data management |
| Rough et al. | Core Concepts in Pharmacoepidemiology: Principled Use of Artificial Intelligence and Machine Learning in Pharmacoepidemiology and Healthcare Research |
| WO2025111262A1 (en) | Systems and methods for analyzing a corpus of documents using large language machine learning models |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24827447; Country of ref document: EP; Kind code of ref document: A1 |