WO2025093962A1 - Methods, media, and systems for human in the loop supervision for real-time machine assisted notes clustering - Google Patents
- Publication number
- WO2025093962A1 (PCT/IB2024/059838)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- users
- clusters
- notes
- clustering
- models
- Prior art date
- Legal status (assumed; not a legal conclusion)
- Pending
Classifications
- G—PHYSICS › G06—COMPUTING OR CALCULATING; COUNTING › G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N3/00—Computing arrangements based on biological models › G06N3/02—Neural networks › G06N3/08—Learning methods
- G—PHYSICS › G06—COMPUTING OR CALCULATING; COUNTING › G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N20/00—Machine learning
Definitions
- At least one non-transitory computer-readable medium encoded with instructions that, when executed, configure at least one processor for receiving data that sorts objects into preliminary clusters and generating models comprising clustering decisions leading to formation of the preliminary clusters.
- the non-transitory computer-readable medium also includes instructions for generating, based on the models, predictions about sequences of clustering decisions to be made by users and comparing the models to subsequent clustering decisions made by the users.
- the non-transitory computer-readable medium also includes instructions for identifying at least one model meeting at least one threshold for accuracy, specificity, or both, for a plurality of the users and generating, using the at least one identified model, a plurality of superordinate level clusters that each comprise clustering strategies exhibited by a subset of the users, wherein at least some of the users corresponding to the superordinate level clusters differ among the superordinate level clusters.
- a computing device comprises at least one memory and at least one processor coupled to at least one of the at least one computer memory, the at least one processor being configured to perform operations to receive data that sorts objects into preliminary clusters and generate models comprising clustering decisions leading to formation of the preliminary clusters.
- the processor is further configured to generate, based on the models, predictions about sequences of clustering decisions to be made by users and compare the models to subsequent clustering decisions made by the users.
- the processor is further configured to identify at least one model meeting at least one threshold for accuracy, specificity, or both, for a plurality of the users and generate, using the at least one identified model, a plurality of superordinate level clusters that each comprise clustering strategies exhibited by a subset of the users, wherein at least some of the users corresponding to the superordinate level clusters differ among the superordinate level clusters.
- In yet another embodiment, a method includes receiving data that sorts objects into preliminary clusters and generating models comprising clustering decisions leading to formation of the preliminary clusters. The method also includes generating, based on the models, predictions about sequences of clustering decisions to be made by users and comparing the models to subsequent clustering decisions made by the users. The method further includes identifying at least one model meeting at least one threshold for accuracy, specificity, or both, for a plurality of the users and generating, using the at least one identified model, a plurality of superordinate level clusters that each comprise clustering strategies exhibited by a subset of the users, wherein at least some of the users corresponding to the superordinate level clusters differ among the superordinate level clusters.
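The claimed sequence of operations (receive preliminary clusters, model each user's decisions, score the models against subsequent decisions, keep threshold-meeting models, form superordinate-level clusters) can be pictured with a minimal sketch. All function and variable names below are illustrative assumptions, not terms from the specification, and a single most-frequent clustering dimension stands in for a full model of a user's decisions:

```python
from collections import Counter

def model_user(decisions):
    """Model a user's clustering strategy as their most frequent dimension.

    `decisions` is a list of (note, dimension) pairs recorded while the
    user sorted notes into preliminary clusters.
    """
    dims = Counter(dim for _, dim in decisions)
    return dims.most_common(1)[0][0]

def score_model(predicted_dim, subsequent_decisions):
    """Fraction of later decisions consistent with the modeled dimension."""
    if not subsequent_decisions:
        return 0.0
    hits = sum(1 for _, dim in subsequent_decisions if dim == predicted_dim)
    return hits / len(subsequent_decisions)

def superordinate_clusters(user_models, min_users):
    """Group users whose models agree into superordinate-level clusters.

    Only strategies exhibited by at least `min_users` users survive, so
    different superordinate clusters cover different subsets of users.
    """
    groups = {}
    for user, dim in user_models.items():
        groups.setdefault(dim, []).append(user)
    return {dim: users for dim, users in groups.items()
            if len(users) >= min_users}
```

A real implementation would replace the most-frequent-dimension model with the trained models the disclosure describes; the control flow, however, follows the claim's order of operations.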
- Figure 1 is a representation illustrating one example of a user capturing an image of a workspace with notes using an image capture device on a mobile device according to various embodiments herein;
- Figure 2 is a block diagram illustrating one example of a mobile device utilized to implement various embodiments herein;
- Figure 3 is a block diagram of computing hardware utilized to implement various embodiments herein;
- Figure 4A is a flow diagram schematically illustrating a method for generating a superordinate level cluster that captures clustering strategies across multiple users in accordance with embodiments herein;
- Figure 4B is a flow diagram schematically illustrating a method for generating multiple superordinate level clusters that represent different subpopulations of users in accordance with embodiments herein;
- Figure 5 is a diagram schematically illustrating users manually clustering notes in accordance with embodiments herein;
- Figure 6 illustrates a graphical user interface for clustering digital notes in accordance with embodiments herein;
- Figure 7 is a diagram schematically illustrating a high-level system overview of a human in the loop automatic clustering solution in accordance with embodiments herein;
- Figure 8A is a diagram schematically illustrating generation of a subset of labels in accordance with embodiments herein;
- Figure 8B is a diagram schematically illustrating semi-supervised model training in accordance with embodiments herein;
- Figure 8C is a diagram schematically illustrating model scoring on an unseen dataset in accordance with embodiments herein;
- Figure 9 is a diagram schematically illustrating a natural language processing/understanding workflow in accordance with embodiments herein;
- Figure 10 is a diagram schematically illustrating a similarity analysis workflow in accordance with embodiments herein;
- Figure 11 is a diagram schematically illustrating cluster evaluation in accordance with embodiments herein;
- Figure 12 is a diagram schematically illustrating pairwise comparison of notes and incorporating feedback in accordance with embodiments herein;
- Figure 13 illustrates a graphical user interface for clustering digital notes by geography in accordance with embodiments herein;
- Figure 14 illustrates a graphical user interface for clustering digital notes by sport in accordance with embodiments herein;
- Figure 15 illustrates a graphical user interface for clustering digital notes by team type in accordance with embodiments herein.
- the present disclosure describes embodiments of human in the loop supervision for real-time machine assisted notes clustering.
- Embodiments of the approaches described herein may involve a human (i.e., a subject matter expert, also referred to interchangeably as an SME) in the loop who, in the beginning of the process, plays the role of a teacher teaching a system what new unseen data looks like and the category it belongs to. This can either happen in real-time, in which the subject matter expert teaches the system to learn a new category, or in batch, where the subject matter expert can give positive or negative rewards. In such embodiments, the subject matter expert will be in the loop until the system hits convergence in learning all the new labels.
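The teacher role described above might be sketched as follows. The class name, the +1/-1 reward values, and the convergence test are illustrative assumptions rather than details taken from the disclosure:

```python
class SMELoop:
    """Human-in-the-loop teacher: the SME labels unseen notes in real
    time or rewards batched predictions until the label set converges."""

    def __init__(self):
        self.labels = {}      # note -> category taught or confirmed by the SME
        self.confidence = {}  # category -> running reward total

    def teach(self, note, category):
        """Real-time mode: the SME directly teaches a new category."""
        self.labels[note] = category
        self.confidence.setdefault(category, 0)

    def reward(self, note, predicted, correct):
        """Batch mode: the SME gives a positive or negative reward."""
        delta = 1 if correct else -1
        self.confidence[predicted] = self.confidence.get(predicted, 0) + delta
        if correct:
            self.labels[note] = predicted

    def converged(self, expected_categories):
        """The SME stays in the loop until all new labels are learned."""
        return expected_categories <= set(self.labels.values())
```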
- At least some aspects of the present disclosure are directed to systems, media, and methods of humans in the loop executing some of the initial clustering decisions. Aspects of the present disclosure may also be directed toward algorithms using the human clustering decisions as a supervised learning feedback loop to drive automated clustering processes, improving the speed/power of automated clustering and the degree to which the automated clustering decisions match the results that would have occurred with laborious manual decisions about each of the notes (hundreds, thousands, etc.).
- the note management system can improve the efficiency in capturing and extracting note content from a large number of notes.
- the note management system in some embodiments may improve the efficiency in grouping and managing notes.
- One or more superordinate clusters may ultimately be generated as a result of utilizing a subject matter expert in the loop, such that the superordinate cluster(s) may be able to replicate the accuracy and/or specificity of human users within certain thresholds, while delivering significantly improved speed. If one superordinate cluster is utilized in the embodiment depicted in FIG. 4A, efficiency/speed may be optimized while still meeting accuracy and/or specificity thresholds. If a plurality of superordinate clusters are utilized in the embodiment depicted in FIG. 4B, the collective accuracy/specificity of the superordinate clusters may be optimized while still delivering significant speed advantages over manual clustering. More specifically, in some embodiments, the clustering strategies of various users may be accounted for in more detail within a particular superordinate cluster that does not have to account for all other users.
- notes may include physical notes and digital notes.
- Physical notes may generally refer to objects with a general boundary and recognizable content. Physical notes may include the resulting objects after people write, draw, or enter via other type of inputs on the objects, for example, paper, white board, or other objects accepting the inputs.
- physical notes may include hand-written Post-it® notes, paper, or film, white-board with drawings, posters, signs, and the like.
- physical notes may be generated using digital means, e.g., printing onto printable Post-it® notes or printed document.
- one object can include several notes. For example, several ideas can be written on a piece of poster paper or a white-board.
- marks such as lines, shapes, colors, symbols, markers, stickers, and the like, can be applied to the edges of the notes.
- Physical notes may be two-dimensional or three-dimensional. Physical notes may have various shapes and sizes. For example, a physical note may be a 7.62 x 7.62 cm (3 x 3 inches) note, a 66.04 x 99.06 cm (26 x 39 inches) poster, a triangular metal sign, and the like.
- physical notes have known shapes and/or sizes that may conform to standards, such as legal, A3, A4, and other size standards, and known shapes, which may not be limited to geometric shapes, such as stars, circles, rectangles, or the like.
- Digital notes may refer to digital objects with information and/or ideas.
- Digital notes may be generated using digital inputs.
- Digital inputs may include, for example, keyboards, touch screens, digital cameras, digital recording devices, stylus, digital pens, and the like, which in some embodiments may correspond to I/O (input/output) 176 in FIG. 2 and/or input devices 306 in FIG. 3.
- Voice input metadata, by way of non-limiting example, may be added to notes via a digital recording device and/or any other device capable of detecting sound.
- digital notes may be obtained utilizing non-image techniques such as with a digital pen having one or more inertial measurement units or the like.
- digital notes may be representative of physical notes.
- notes may be used in one or more collaboration spaces.
- a collaboration space may refer to a physical gathering area allowing more than one person to share ideas and thoughts with each other, as depicted in FIG. 5.
- a collaboration space may include virtual spaces allowing a group of persons to share ideas and thoughts remotely, in addition to a physical gathering area, as depicted in FIG. 6.
- a gathering space may include a hybrid approach, with some users being in-person and others participating virtually.
- collaboration need not be performed at the same time.
- a user may perform some clustering of notes and stop, with another user continuing on with the clustering at a later point in time when the first user is no longer present or participating.
- FIG. 1 illustrates an example of a note recognition environment 100.
- environment 100 may include one or more mobile devices 115 to capture and recognize one or more notes 122 from a workspace 120.
- the mobile device 115 may provide an execution environment for one or more software applications that, as described, may efficiently capture and extract note content from a large number of physical notes, such as the collection of notes 122 from workspace 120.
- notes 122 may be the results of a collaborative brainstorming session having multiple participants.
- a mobile device 115 and the software executing thereon may perform a variety of note-related operations, including automated creation of digital notes representative of physical notes 122 of workspace 120.
- a mobile device 115 may include, among other components, an image capture device 118 and a presentation device 128.
- mobile device 115 may include one or more processors, microprocessors, internal memory and/or data storage and other electronic circuitry for executing software or firmware to provide the functionality described herein, which may correspond in some embodiments to computing components depicted in FIG. 3.
- Image capture device 118 may be a camera or any other suitable component configured to capture image data representative of workspace 120 and notes 122 positioned therein.
- the image data may capture one or more visual representations of an environment, such as workspace 120, having one or more physical notes.
- Post-it® digital solutions may be utilized to provide the ability to synthesize notes on a digital whiteboard.
- image capture device 118 may comprise other components capable of capturing image data, such as a video recorder, an infrared camera, a CCD (Charge Coupled Device) array, a laser scanner, or the like, which may correspond in some embodiments to image capture device 118 and/or input device 306 in FIG. 3.
- the captured image data can include at least one of an image, a video, a sequence of images (i.e., multiple images taken within a time period and/or with an order), a collection of images, image portion(s), and/or the like, and the term input image may refer to any suitable type of image data.
- Presentation device 128 may include, but is not limited to, an electronically addressable display, such as a liquid crystal display (LCD) or other type of display device capable of use with mobile device 115. In some embodiments this may correspond to I/O 176 in FIG. 2 and/or display/output device(s) 304 in FIG. 3. In some embodiments, mobile device 115 may generate content to display on presentation device 128 for the notes in a variety of formats, for example, a list, grouped in rows and/or columns, a flow diagram, or the like. Mobile device 115 may, in some cases, communicate display information for presentation by other devices, such as a tablet computer, a projector, an electronic billboard, or other external device.
- mobile device 115 may provide a platform for creating and manipulating digital notes representative of physical notes 122.
- mobile device 115 may be configured to process image data produced by image capture device 118 to detect and recognize at least one of physical notes 122 positioned within workspace 120.
- the mobile device 115 may be configured to recognize note(s) by determining the general boundary of the note(s). After a note is recognized, mobile device 115 may extract the content of at least one of the one or more notes, where the content may be the visual information of note 122.
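The boundary determination and content extraction can be pictured, in highly simplified form, as finding the bounding box of bright (note-colored) pixels in a grayscale image; a production system would use contour detection or a trained detector instead. The threshold value and function names below are illustrative assumptions:

```python
def note_boundary(image, threshold=128):
    """Return (top, left, bottom, right) of pixels brighter than
    `threshold` in a 2-D grayscale image (a list of rows), or None
    if no note-colored pixels are present.

    A toy stand-in for determining a note's general boundary.
    """
    rows = [r for r, row in enumerate(image) if any(p > threshold for p in row)]
    cols = [c for row in image for c, p in enumerate(row) if p > threshold]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))

def extract_content(image, boundary):
    """Crop the recognized note region so its content can be processed."""
    top, left, bottom, right = boundary
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```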
- mobile device 115 may implement techniques for automated detection and recognition of physical notes 122 and extraction of information, content or other characteristics associated with each of the physical notes. For example, mobile device 115 may allow a user 126 fine-grain control over techniques used by mobile device 115 to detect and recognize physical notes 122. As one example, mobile device 115 may allow a user 126 to select between marker-based detection techniques, in which one or more of notes 122 includes a physical fiducial mark on the surface of the note, and/or non-marker-based techniques, in which no fiducial mark is used.
- mobile device 115 may provide user 126 with an improved electronic environment for generating and manipulating corresponding digital notes representative of physical notes 122.
- mobile device 115 may provide mechanisms allowing user 126 to easily add digital notes to, edit notes within, and/or delete digital notes from a set of digital notes representative of the brainstorming activity associated with workspace 120.
- mobile device 115 may provide functionality by which user 126 is able to record and manage relationships between groups of notes 122.
- mobile device 115 may provide functionality by which user 126 is able to export the digital notes to other systems, such as cloud-based repositories (e.g., cloud server 112) and/or other computing devices (e.g., computer system 114 and/or mobile device 116).
- Mobile device 115 may be a mobile phone.
- mobile device 115 may be a tablet computer, a personal digital assistant (PDA), a laptop computer, a media player, an e-book reader, a wearable computing device (e.g., a watch, eyewear, a glove), or any other type of mobile or non-mobile computing device suitable for performing the techniques described herein.
- Referring to FIG. 2, a block diagram illustrates a non-limiting example of a mobile device 115 that operates in accordance with the techniques described herein.
- the mobile device of FIG. 2 is described with respect to mobile device 115 of FIG. 1.
- mobile device 115 may include various hardware components that provide core functionality for operation of the device, some or all of which may correspond to the computing hardware components depicted in FIG. 3.
- mobile device 115 may include one or more programmable processors 170 configured to operate according to executable instructions (i.e., program code), typically stored in a computer-readable medium or data storage 168 such as a static, random-access memory (SRAM) device or a Flash memory device.
- I/O 176 may include one or more devices, such as a keyboard, camera button, power button, volume button, home button, back button, menu button, or presentation device 128 as described in FIG. 1.
- Transmitter 172 and receiver 174 may provide wireless communication with other devices, such as cloud server 112, computer system 114, or other mobile device 116 as described in FIG. 1, via a wireless communication interface as described in FIG. 1, such as but not limited to high-frequency radio frequency (RF) signals.
- Mobile device 115 may include additional discrete digital logic or analog circuitry not shown in FIG. 2, and may be embodied in any suitable type of device such as a smartphone, laptop, tablet, wearable device (such as wearable augmented reality glasses), and the like.
- operating system 164 may execute on processor 170 and provide an operating environment for one or more user applications 177 (commonly referred to as "apps"), including note management application 178.
- User applications 177 may, by way of non-limiting example, comprise executable program code stored in computer-readable storage device (e.g., data storage 168) for execution by processor 170.
- user applications 177 may comprise firmware or, in some examples, may be implemented in discrete logic.
- a mobile device 115 may receive input image data and process the input image data in accordance with the techniques described herein.
- image capture device 118 may capture an input image of an environment having a one or more notes, such as workspace 120 of FIG. 1 having many notes 122.
- a mobile device 115 may receive image data from external sources, such as a cloud server 112, a computer system 114, and/or a mobile device 116, via a receiver 174.
- mobile device 115 may store the image data in data storage 168 for access and processing by note management application 178 and/or other user applications 177.
- note management application 178 may construct and/or control GUI 179 to provide an improved electronic environment for generating and/or manipulating corresponding digital notes representative of physical notes 122.
- note management application 178 may construct GUI 179 to include a mechanism that allows user 126 to easily add digital notes to and/or delete digital notes from defined sets of digital notes recognized from the image data.
- note management application 178 may provide functionality by which user 126 is able to record and/or manage relationships between groups of the digital notes by way of GUI 179.
- Referring to FIG. 3, a block diagram illustrates computing hardware, such as an exemplary computing device 300, through which embodiments of the disclosure can be implemented, such as those depicted and/or described in FIGS. 1-2, which may include, by way of non-limiting examples, cloud server 112, computer system 114, mobile device 115, mobile device 116, and/or any other suitable device.
- Computing device 300 as described herein is but one example of a suitable computing device and does not suggest any limitation on the scope of any embodiments presented.
- Nothing illustrated or described with respect to the computing device 300 should be interpreted as being required or as creating any type of dependency with respect to any element or plurality of elements.
- the computing device 300 may include, but need not be limited to, a desktop, laptop, server, client, tablet, smartphone, computing cloud or any other type of device that can utilize data.
- the computing device 300 includes at least one processor 302 and memory comprising non-volatile memory 308 and/or volatile memory 310.
- the processor 302 may but need not correspond to processor 170 depicted in FIG. 2.
- the computing device 300 may include one or more displays, display hardware, and/or output devices 304 such as, for example, AR/VR/MR/XR hardware (which may utilize input devices 306 such as imaging sensors), monitors, speakers, headphones, projectors, wearable-displays, holographic displays, printers, and the like.
- Output devices 304 may further include, for example, displays and/or speakers, devices that emit energy (radio, microwave, infrared, visible light, ultraviolet, x-ray and gamma ray), electronic output devices (WiFi, radar, laser, etc.), audio (of any frequency), and the like.
- Computing device 300 may further include one or more input devices 306 which can include, by way of example, any type of mouse, keyboard, disk/media drive, memory stick/thumb-drive, memory card, pen, touch-input device, biometric scanner, gaze and/or blink tracker, tracker, voice/auditory input device, motion-detector, camera, scale, and any device capable of measuring data such as motion data (e.g., an accelerometer, GPS, a magnetometer, a gyroscope, etc.), biometric data (e.g., blood pressure, pulse, heart rate, perspiration, temperature, voice, facial-recognition, motion/gesture tracking, gaze tracking, iris or other types of eye recognition, hand geometry, oxygen saturation, glucose level, fingerprint, DNA, dental records, weight, or any other suitable type of biometric data, etc.), video/still images, and audio (including human-audible and human-inaudible ultrasonic sound waves).
- Input devices 306 may include any type of device capable of receiving data, whether from another device, visual and/or audio data captured from the real world, object detection data, and the like.
- Input devices 306 may include cameras (with or without audio recording), such as digital and/or analog cameras, still cameras, video cameras, thermal imaging cameras, infrared cameras, imaging sensors, cameras with a charge-couple display, night-vision cameras, three-dimensional cameras, webcams, audio recorders, and the like.
- input device 306 and/or display/output device 304 may correspond to I/O 176 depicted in FIG. 2.
- Computing device 300 in some embodiments includes non-volatile memory 308 (e.g., ROM, flash memory, etc.), volatile memory 310 (e.g., RAM, etc.), or a combination thereof.
- a network interface 312 may facilitate communications over a network 314 with other data source(s) such as a database 318 via wires, a wide area network, a local area network, a personal area network, a cellular network, a satellite network, and the like.
- Suitable local area networks may include wired Ethernet and/or wireless technologies such as, for example, wireless fidelity (Wi-Fi).
- Suitable personal area networks may include wireless technologies such as, for example, IrDA, Bluetooth, Wireless USB, Z-Wave, ZigBee, and/or other near field communication protocols. Suitable personal area networks may similarly include wired computer buses such as, for example, USB and FireWire. Suitable cellular networks may include, but are not limited to, technologies such as LTE, WiMAX, UMTS, CDMA, GSM, and the like.
- Network interface 312 can be communicatively coupled to any device capable of transmitting and/or receiving data via one or more network(s) 314.
- the network interface 312 may correspond to a transmitter 172 and/or a receiver 174 as depicted in FIG. 2.
- the network interface 312 may include a communication transceiver for sending and/or receiving any wired or wireless communication.
- the network interface 312 may include an antenna, a modem, LAN port, Wi-Fi card, WiMax card, mobile communications hardware, near-field communication hardware, satellite communication hardware and/or any wired or wireless hardware for communicating with other networks and/or devices.
- a computer-readable medium 316 comprises one or more computer-readable media, each of which is non-transitory.
- a computer readable medium may reside, for example, within an input device 306, non-volatile memory 308, volatile memory 310, or any combination thereof.
- a readable storage medium can include tangible media that is able to store instructions associated with, or used by, a device or system.
- a computer readable medium, also referred to herein as a non-transitory computer readable medium includes, by way of non-limiting examples: RAM, ROM, cache, fiber optics, EPROM/Flash memory, CD/DVD/BD-ROM, hard disk drives, solid-state storage, optical or magnetic storage devices, diskettes, electrical connections having a wire, or any combination thereof.
- a non-transitory computer readable medium may also include, for example, a system or device that is of a magnetic, optical, semiconductor, or electronic type.
- a non-transitory computer readable medium excludes carrier waves and/or propagated signals taking any number of forms such as optical, electromagnetic, or combinations thereof.
- the computing device 300 may include one or more network interfaces 312 to facilitate communication with one or more remote devices, which may include, for example, client and/or server devices.
- the network interface 312 may also be described as a communications module, as these terms may be used interchangeably.
- the database 318 is depicted as being accessible over the network 314 and may reside within a server, the cloud, or any other configuration to support being able to remotely access data and store data in the database 318.
- database 318, non-volatile memory 308, volatile memory 310, and/or computer readable medium 316 may correspond to data storage 168 depicted in FIG. 2.
- Referring to FIG. 4A, flow diagram 400 schematically illustrates the generation of a superordinate level cluster that captures clustering strategies across multiple users.
- users may begin sorting notes, which they created and/or imported, into clusters. This may be done with digital notes as depicted in Figs. 6 and 13-15, with physical notes as depicted in Figs. 1 and 5, or any combination thereof.
- notes may be sorted into clusters utilizing any quantity of suitable dimensions.
- notes with information about sports teams may be sorted according to dimensions such as type of sport, geographical location, and type of team name.
- the sorting of notes into clusters may be received, for example, as data by users utilizing a graphical user interface as depicted in FIGS. 6, 13-15 and/or users physically sorting notes and captured as depicted in FIG. 1.
- multiple models (referred to interchangeably as alternative models and/or candidate models) may be generated from the users' preliminary clustering decisions.
- the alternative models may be utilized to perform clustering based upon inferences derived from analyzing how the users performed their clustering. More specifically, this may be accomplished by analyzing the real-time clustering decisions that the users make and generating models of the dimensions against which each user tends to cluster notes. These dimensions may be analyzed to identify the policies that the users are using to cluster the notes.
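One hedged way to picture this dimension inference: for each candidate dimension, measure how often two notes placed in the same cluster agree on that dimension's value, and take the best-agreeing dimension as the user's policy. The pairwise-agreement metric and all names below are assumptions for illustration:

```python
from itertools import combinations

def infer_dimension(clusters, attributes):
    """Return the candidate dimension that best explains a user's clusters.

    `clusters` maps cluster id -> list of note ids; `attributes` maps
    note id -> {dimension: value}.  For each dimension we count, over all
    same-cluster note pairs, how often the two notes agree on that
    dimension, and return the dimension with the highest agreement rate
    along with all the scores.
    """
    dims = next(iter(attributes.values())).keys()
    scores = {}
    for dim in dims:
        agree = total = 0
        for notes in clusters.values():
            for a, b in combinations(notes, 2):
                total += 1
                agree += attributes[a][dim] == attributes[b][dim]
        scores[dim] = agree / total if total else 0.0
    return max(scores, key=scores.get), scores
```

The per-dimension scores could then serve as the basis for the policies attributed to each user.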
- the models may be utilized to make predictions about sequences of clustering decisions that have yet to be executed by the users.
- Embodiments herein may theorize and therefore predict that some users tend to cluster based on geographic dimensions (cities, in one example), whereas some users tend to cluster based on the dimension of which sport type a team plays (football, baseball, basketball, etc.) whereas other users tend to cluster by mascot category (e.g., animal, historical persona, etc.).
- a model may make a prediction about how the users, or a subset of the users, will sort notes. Predictions may be made about which unclustered note will next be added to a cluster, along with the cluster to which the note will be added.
- predictions may include a note being added to a cluster, and subsequently moved to a different cluster.
- the models may examine concordance and/or discordance between alternative models and/or human clustering decisions.
- the subsequent clustering of a note can be thought of as an opportunity for hypothesis testing.
- if a user clusters notes based on geographic dimensions, then they should cluster a particular note into cluster X, where the other notes in that cluster are more similar on the geographic dimension. In this way, the accuracy of a certain model's predictions can be used to reinforce model confidence in some embodiments.
- a determination may be made as to whether the models that have been generated, across these models, meet one or more accuracy and/or specificity thresholds for a sufficient number of the users. This may entail, for example, determining the accuracy of any candidate model for predicting where users will sort notes during the previous hypothesis testing at block 408 and evaluating this against an accuracy threshold. This determination may be made by one or more subject matter experts, who may or may not be among the users. In some embodiments, if the accuracy of a given candidate model meets or exceeds a given threshold, it may be deemed a working model for how, and against which dimensions, the various users guide their clustering. Otherwise, if the accuracy and/or specificity threshold(s) are not met (condition NO), the flow diagram may return to block 408; in this way, for example, more hypothesis-testing data may be collected.
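This determination can be sketched as a simple gate over per-user accuracy and specificity metrics. The 0.8 thresholds, the minimum-user count, and the input layout below are illustrative assumptions:

```python
def find_working_models(candidates, accuracy_min=0.8, specificity_min=0.8,
                        min_users=3):
    """Return candidate models meeting the accuracy/specificity thresholds
    for enough users.  An empty result corresponds to the NO branch: more
    hypothesis-testing data should be collected before proceeding.

    `candidates` maps model name -> per-user metrics, e.g.
    {"geo": {"alice": {"accuracy": 0.9, "specificity": 0.85}, ...}}.
    """
    working = []
    for name, per_user in candidates.items():
        passing = [u for u, m in per_user.items()
                   if m["accuracy"] >= accuracy_min
                   and m["specificity"] >= specificity_min]
        if len(passing) >= min_users:
            working.append(name)
    return working
```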
- the model may alert the users that they can stop manually sorting and/or it may generate a superordinate-level cluster that may be a logically consistent hybrid of individual stakeholder models.
- an alert may be provided to the users that there is enough confidence in one of the clustering models that the human users can stop manual clustering, and the system may generate a singular superordinate cluster of the notes that covers some or all of the clustering methods the current users are utilizing. This may mean, by way of non-limiting example, that the selected model covers more of the clustering methods that the current users are utilizing than the alternatives, and does so more accurately.
- the superordinate level cluster may be updated based upon feedback by one or more subject matter experts, where the feedback may be received in real-time (or near real-time, at intervals, and the like) and/or in batch (i.e., x instances of feedback are delivered at a time, etc.). In one example, this could entail the generation of a cluster based on geographic dimensions, the sport the team plays, and mascot category.
- the superordinate cluster may comprise one or more clustering strategies from all of the users or a subset of the users.
- the superordinate level cluster may incorporate a plurality of clustering strategies from a plurality of the users.
- the system might also report out what percentage of users in the crowd employed a given method of clustering (e.g., that 50% of users tend to cluster based on geographic dimensions (cities, for example), whereas 35% of users tend to cluster based on the dimension of which sport the team plays (e.g., football, baseball, basketball), and 15% of users tend to cluster by mascot category (e.g., animals)).
- the system may notify the human users to stop manually clustering. In this way, the system can utilize a singular superordinate model for automated completion of the clustering of the notes.
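The percentage report mentioned above amounts to counting users per inferred strategy; a minimal sketch, with an entirely hypothetical user-to-strategy mapping:

```python
from collections import Counter

# Hypothetical mapping from users to their inferred clustering strategies
user_strategies = {
    "u1": "geography", "u2": "geography", "u3": "geography",
    "u4": "sport", "u5": "sport", "u6": "mascot",
}

counts = Counter(user_strategies.values())
# Percentage of the crowd employing each method of clustering
report = {s: round(100 * n / len(user_strategies)) for s, n in counts.items()}
```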
- in FIG. 4B, a flow diagram 450 schematically illustrates the generation of multiple superordinate level clusters that represent different subpopulations of users.
- users may begin sorting notes, which they created and/or imported, into clusters. This may be done with digital notes as depicted in FIGS. 6 and 13-15, with physical notes as depicted in FIGS. 1 and 5, or any combination thereof. As discussed in more detail herein, notes may be sorted into clusters utilizing any quantity of suitable dimensions. As discussed further with respect to FIGS. 13-15, notes with information about sports teams may be sorted according to dimensions such as type of sport, geographical location, and type of team name.
- the sorting of notes into clusters may be received, for example, as data by users utilizing a graphical user interface as depicted in FIGS. 6, 13-15 and/or users physically sorting notes and captured as depicted in FIG. 1.
- multiple models, referred to interchangeably as alternative models and/or candidate models, may be generated.
- the alternative models may be utilized to perform clustering based upon inferences derived from analyzing how the users performed their clustering. More specifically, this may be accomplished by analyzing the real-time clustering decisions that the users make and generating models of the dimensions against which each user tends to cluster notes. These dimensions may be analyzed to identify the policies that the users are using to cluster the notes.
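One simple way to infer the dimension a user clusters against, consistent with the passage above though not prescribed by it, is to score each candidate dimension by how homogeneous the user's clusters are on that dimension. The notes and dimension names below are illustrative:

```python
# Score each candidate dimension by within-cluster homogeneity (purity) and
# treat the most consistent dimension as the user's clustering policy.

def dimension_purity(clusters, dim):
    """Mean fraction of notes per cluster sharing the cluster's modal value."""
    purities = []
    for notes in clusters:
        values = [note[dim] for note in notes]
        top = max(values.count(v) for v in set(values))
        purities.append(top / len(values))
    return sum(purities) / len(purities)

def infer_dimension(clusters, dims):
    return max(dims, key=lambda d: dimension_purity(clusters, d))

# A user's clusters, homogeneous on "city" but mixed on "sport"
clusters = [
    [{"city": "Boston", "sport": "hockey"}, {"city": "Boston", "sport": "baseball"}],
    [{"city": "Dallas", "sport": "football"}, {"city": "Dallas", "sport": "hockey"}],
]
policy = infer_dimension(clusters, ["city", "sport"])
```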
- the models may be utilized to make predictions about sequences of clustering decisions that have yet to be executed by the users.
- Embodiments herein may theorize, and therefore predict, that some users tend to cluster based on geographic dimensions (cities, in one example), whereas others tend to cluster based on the type of sport a team plays (football, baseball, basketball, etc.), and still others tend to cluster by mascot category (e.g., animal, historical persona, etc.).
- a model may make a prediction about how the users, or a subset of the users, will sort notes. Predictions may be made about which unclustered note will next be added to a cluster, along with the cluster to which that note will be added.
- predictions may include a note being added to a cluster, and subsequently moved to a different cluster.
- the models may examine concordance and/or discordance between alternative models and/or human clustering decisions.
- the subsequent clustering of a note can be thought of as an opportunity for hypothesis testing.
- if a user clusters notes based on geographic dimensions, then they should cluster a particular note into cluster X, where the other notes in that cluster are more similar on the geographic dimension. In this way, the accuracy of a certain model’s predictions can be used to reinforce model confidence in some embodiments.
- a determination may be made as to whether the models that have been generated collectively meet one or more accuracy and/or specificity thresholds for a sufficient number of the users. This may entail, for example, determining the accuracy of any candidate model in predicting where users will sort notes during the previous hypothesis testing at block 458 and evaluating this against an accuracy threshold. In some embodiments, if multiple candidate models meet/exceed a given threshold, they may be deemed to collectively describe which dimensions the various users utilize to guide clustering, and how. Otherwise, if accuracy and/or specificity threshold(s) are not met (condition NO), then the flow diagram may return to block 458. Thus, as an example, more hypothesis-testing data may be collected in this way.
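In this multi-model variant, every candidate that clears the threshold survives, so the surviving set collectively covers different subpopulations of users. A hedged sketch, with hypothetical model names, accuracies, and threshold:

```python
ACCURACY_THRESHOLD = 0.8  # assumed configurable value

# Hypothesis-testing accuracies of three hypothetical candidate models
candidate_accuracies = {"geography": 0.91, "sport": 0.84, "mascot": 0.55}

# Keep every model meeting the threshold (condition YES for the set)
working_models = [m for m, a in candidate_accuracies.items() if a >= ACCURACY_THRESHOLD]
# Condition NO: no survivors, so return to block 458 for more data
collect_more_data = not working_models
```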
- the model may alert the users that they can stop manually sorting and/or it may generate multiple superordinate-level clusters that may each be a logically consistent hybrid of individual stakeholder models.
- the system would alert the users that the system has enough confidence in these clustering models that the human users can stop manual clustering, and the system would generate multiple superordinate clusters of the notes that cover all of the clustering methods the current users are utilizing.
- one or more of the superordinate level clusters may be updated based upon feedback by one or more subject matter experts, where the feedback may be received in real-time (or near real-time, at intervals, and the like) and/or in batch (i.e., x instances of feedback are delivered at a time, etc.).
- this would mean that the system would generate multiple superordinate clusters based on geographic dimensions, the sport the team plays, and mascot category.
- the superordinate clusters may each and/or collectively comprise one or more clustering strategies from all of the users or a subset of the users.
- the superordinate level clusters may each and/or collectively incorporate a plurality of clustering strategies from a plurality of the users.
- the system might also report out what percentage of users in the crowd employed a given method of clustering (e.g., that 50% of users tend to cluster based on geographic dimensions (cities, for example), whereas 35% of users tend to cluster based on the dimension of which sport the team plays (e.g., football, baseball, basketball), and 15% of users tend to cluster by mascot category (e.g., animals)).
- the system may notify the human users to stop manually clustering. In this way, the system can utilize multiple superordinate models for automated completion of the clustering of the notes.
- users 502 are depicted in an environment 500 manually moving notes 504 into clusters 506.
- users 502 may work together, individually, and/or in any suitable fashion to sort notes 504 into clusters 506 based upon one or more criteria/dimensions.
- the image-capture technology depicted in FIGS. 1-3 may be utilized to capture images and/or videos of the notes 504 being put into clusters 506.
- Clusters 506 may be any suitable size, involve any number of notes 504 and/or any number of clusters 506, and may involve any suitable distance(s) between notes 504 and/or clusters 506.
- Digital notes 602 may be electronically represented on a digital surface 600 to display any suitable number/type(s) of digital notes within a graphical user interface.
- notes 602 may be created within the digital surface 600, may be captured from physical notes as depicted in FIG. 1, and/or any combination thereof.
- notes 602 may contain multiple pieces of information, such as a name 604 of a sport team, a sport type 606 associated with the team, and/or a city 608 associated with the team.
- the digital notes 602 within the graphical user interface may be arranged in any suitable/desired fashion by any number of users on one or more devices.
- each user may access the digital surface 600 using their own device.
- the graphical user interface may include, by way of non-limiting example, a control panel 650 having one or more options such as note deletion 652, digital note creation 654, and a digital note sizing button 656.
- one or more users may move digital notes 602 into one or more clusters. Any number of users present at the digital surface 600 may move digital notes 602 for any reason.
- the digital notes 602 may be clustered by one or more criteria, such as by geography (e.g., city 608), sport (e.g., sport type 606), and/or team name type (e.g., name). Any other suitable criteria for clustering may be utilized, such as color and/or size of the digital note 602.
- continual and/or periodic digital capture of physical notes may be utilized to update their positions.
- their position may be determined by manipulation from users via the digital surface 600.
- in FIG. 7, a high-level system overview of the human in the loop automatic clustering solution 700 is depicted.
- a known subset of a content category may be supplied by a subject matter expert during every new brainstorming session if the system has not yet learned this content category.
- the subject matter expert may thus provide a data subset and associated labels and/or categories, which may be put into data storage at block 704.
- a self-supervised machine learning model approach may be used in this embodiment to reduce data labeling overheads for subject matter experts.
- Semi-supervised (referred to herein interchangeably as self-supervised) learning is an approach to machine learning that combines a small amount of labeled data with a large amount of unlabeled data during training.
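The idea can be illustrated with a deliberately tiny, self-contained toy: a few labeled "embeddings" seed the model, and one self-training pass absorbs the unlabeled pool. A nearest-neighbour lookup stands in for the trained model here, and every value below is hypothetical.

```python
# Seed set labeled by the subject matter expert: embedding -> category
labeled = {0.1: "A", 0.2: "A", 0.9: "B"}
unlabeled = [0.15, 0.85, 0.8]   # the (here tiny) unlabeled pool

def predict(x, labeled):
    """Nearest labeled neighbour stands in for the trained model."""
    nearest = min(labeled, key=lambda e: abs(e - x))
    return labeled[nearest]

# One self-training pass: score unlabeled points and absorb the predictions
for x in list(unlabeled):
    labeled[x] = predict(x, labeled)
    unlabeled.remove(x)
```

In practice a library implementation (e.g., a self-training wrapper around a supervised classifier) would replace the nearest-neighbour stand-in.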
- a machine learning model may utilize the contents of the data storage at block 704 to learn the new data representation.
- the machine learning model may be triggered for either a batch of records or to proceed in a real-time or near real-time fashion, although any suitable timing for processing records may be utilized in other embodiments.
- the model may learn the latent representations of the content category. Based on this latent representation, the model may predict categories based on the unlabeled dataset. The generated categories may then be submitted to the subject matter expert(s) for approval. The subject matter expert(s) may then issue either positive or negative feedback, such that labels may be generated for subject matter review by returning to block 702.
- the closed loop system of this embodiment may continue until a steady set of labels gets generated. Put another way, a determination may be made that the process is to continue until a steady state of labels is obtained. Upon achieving a steady state, the subject matter expert(s) may no longer be needed in the loop.
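The steady-state stopping rule above can be sketched as comparing the label set between consecutive iterations; the label history below is invented purely for illustration.

```python
def labels_for(iteration):
    """Stand-in for one model-plus-SME-feedback pass (hypothetical outputs)."""
    history = [{"A"}, {"A", "B"}, {"A", "B", "C"}, {"A", "B", "C"}]
    return history[min(iteration, len(history) - 1)]

iteration, labels = 0, labels_for(0)
while True:
    iteration += 1
    new_labels = labels_for(iteration)
    if new_labels == labels:   # steady state: the SME can leave the loop
        break
    labels = new_labels
```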
- a flow diagram 802 depicts the generation of a subset of labels.
- data may be important to obtaining an accurate model, yet data labelling may become a time-consuming process when the size of the dataset is large.
- uncategorized Post-it® Notes residing on a Post-it® Board may serve as an input source.
- uncategorized notes, which may be digital, physical, and/or a combination thereof, may serve as one or more input sources.
- a subject matter expert may label a set of categories.
- the Post-it® Notes that the subject matter expert has provided may form a subset of labels for the content in the Post-it® Board, such as by categories A, B, and C.
- the labelled notes/data could be viewed as the training data, and the yet-to-be labelled data could be viewed as the test data.
- the data may be structured in, by way of non-limiting example, an n*2 form comprising content and category.
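The n*2 structure, with the SME-labelled rows serving as training data and the rest as test data per the preceding passages, might look like this; the note texts and category names are illustrative:

```python
# n*2 rows of (content, category); None marks not-yet-labelled content
rows = [
    ("Boston Bruins", "A"),
    ("Dallas Cowboys", "B"),
    ("Chicago Cubs", "C"),
    ("Miami Dolphins", None),
    ("Denver Nuggets", None),
]

train = [(content, cat) for content, cat in rows if cat is not None]  # labelled subset
test = [content for content, cat in rows if cat is None]              # to be scored
```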
- a flow diagram 812 depicts semi-supervised model training.
- a semi/self-supervised learning model may be trained on the data comprising content and categories 814, utilizing the label subset provided by the subject matter expert resulting at the end of step 1 in FIG. 8A.
- a semi/self-supervised learning model autoencoder may be utilized to reduce the time consumed by the one or more subject matter experts in labeling data on the Post-it® Board.
- the model may learn the overall latent representations in the dataset.
- an offline model may be trained on a subset of content and its associated labels/category, such that the model may learn the latent representation of the new data.
- This step may in some embodiments also involve natural language processing and/or understanding capabilities to semantically understand content in the Post-it® Notes.
- unlabeled data may be scored into n-categories by the trained self-supervised learning model.
- the semi-supervised learning model may be put into a package model. This may be done, for example, to ensure that the model can be easily deployed and maintained in a production environment. Accordingly, this may provide a trained semi/self-supervised learning model at 820.
- a flow diagram 822 depicts model scoring on an unseen dataset.
- the trained semi/self-supervised learning model 820 may be used to score the remaining unlabeled data from the Post-it® board 824.
- the final categories may then be submitted to a subject matter expert for their evaluation.
- the subject matter expert may review the labels and either provide a positive reward indicating a successful model iteration or a negative reward penalizing the model.
- the flow in FIG. 8C may be regarded as iterative, such that after the data is structured at block 830, another iteration may be performed utilizing the updated data returning to block 824. As discussed with respect to FIG. 7, steps 1-3 in FIG. 8 may continue until the model reaches convergence.
- a natural language processing/understanding workflow 900 is depicted.
- the process may start with the textual content being annotated by the subject matter expert.
- the text may be subjected to text tokenization, utilizing a word and sentence tokenizer, a punctuation tokenizer, a byte pair tokenizer, Bidirectional Encoder Representations from Transformers (BERT) tokenizer, and/or any suitable tokenizer(s), by way of non-limiting examples.
- the text may be subjected to standardization procedures to tokenize words, remove stopwords, numbers, special characters, and/or the like.
- the text may be processed with word length standardization, unknown sequence handling, and stopword removal, by way of non-limiting examples. Additionally, if content needs to be grouped into n-gram combinations to make sense semantically, such operations can also be carried out in the standardization procedures.
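The standardization steps above can be sketched as follows, assuming lowercasing, punctuation/number stripping, whitespace tokenization, and a tiny illustrative stopword list; real pipelines would use a fuller tokenizer and stopword set.

```python
import re

STOPWORDS = {"the", "a", "an", "of", "and"}  # tiny illustrative subset

def standardize(text):
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)   # drop numbers and special characters
    tokens = text.split()                    # simple whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]

tokens = standardize("The Bears of Chicago, est. 1919!")
```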
- pre-trained or custom word embeddings may be used to map text content onto a relevant correlated co-ordinate space to obtain weights that may help in content classification/categorization.
- machine learning/artificial intelligence approaches may be utilized, such as long short-term memory networks, recurrent neural networks, and the like.
- Non-limiting examples include BERT, Embeddings from Language Models (ELMo), Generative Pre-trained Transformer (GPT), long short-term memory (LSTM), transformers, and the like.
- a similarity analysis workflow 1000 is depicted. This embodiment may be utilized to help avoid having content of the same type being entered by the subject matter expert. In some embodiments, the efficacy and/or efficiency may be dependent upon the data variety in the subset. To ensure subject matter experts do not select data that is too similar, a pre-filter approach is depicted as performing a pairwise similarity analysis 1006 on all the data points 1002, 1004. This may be utilized to remove data points that are too similar to one another, while still retaining distinctly different data points. As depicted in this embodiment, an instance of text content A 1002 may be removed, while another instance may be retained along with text content B 1004. This type of check may be utilized to eliminate user error or unconscious bias that could occur when selecting data subsets from which the model may learn.
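The pre-filter can be sketched with a simple token-overlap (Jaccard) similarity; the 0.8 threshold and the example texts are hypothetical, and a production system might instead compare embeddings.

```python
SIM_THRESHOLD = 0.8  # assumed value

def jaccard(a, b):
    """Token-overlap similarity between two short texts."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def prefilter(texts):
    kept = []
    for t in texts:
        # retain only items distinct from everything already kept
        if all(jaccard(t, k) < SIM_THRESHOLD for k in kept):
            kept.append(t)
    return kept

# The duplicate instance is removed; the distinct item is retained
kept = prefilter(["faster checkout flow", "faster checkout flow", "new loyalty program"])
```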
- clustering algorithms are usually evaluated based on minimal intra-cluster distance and maximal inter-cluster distance.
- First may be extrinsic measures, which in some embodiments require ground truth labels 1102, which may be considered the truth against which clusters and/or categories may be checked for validity.
- Second may be intrinsic measures, which do not require ground truth labels.
- Cluster evaluation metrics 1104 in this embodiment may be utilized to evaluate the clustering model’s performance to eventually provide a reward (positive/negative), which in turn may be used to improve the model performance to achieve convergence.
- Cluster evaluation metrics may include, by way of non-limiting examples, the Adjusted Rand Index, Fowlkes-Mallows scores, Completeness and V-Measure, Davies-Bouldin Index, Calinski-Harabasz Index, and/or the like. In utilizing such cluster evaluation metrics in some embodiments, a higher index score indicates better clustering performance (although for some metrics, such as the Davies-Bouldin Index, lower values indicate better clustering).
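As a concrete instance of an extrinsic measure, the (unadjusted) Rand index counts note pairs on which a predicted clustering agrees with the ground-truth labels. The labels below are invented; in practice library implementations of the adjusted variant would be used.

```python
from itertools import combinations

def rand_index(truth, pred):
    """Fraction of pairs clustered consistently with the ground truth."""
    agree = total = 0
    for i, j in combinations(range(len(truth)), 2):
        # A pair agrees if both clusterings group it the same way
        agree += (truth[i] == truth[j]) == (pred[i] == pred[j])
        total += 1
    return agree / total

truth = ["geo", "geo", "sport", "sport"]   # ground truth labels
pred = ["c1", "c1", "c2", "c1"]            # model's clusters
score = rand_index(truth, pred)
```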
- the results of the cluster evaluation metrics 1104 may be expressed as rewards, or a figure of merit may be utilized to comparatively assess and/or rank the clustering performances. Any suitable type of machine learning model 1108, or any other suitable type of model, may then be utilized to perform the modeling, such that the machine learning model results 1110 may serve as input, along with the ground truth labels 1102, for further evaluation with the cluster evaluation metrics 1104.
- in FIG. 12, a diagram of a pairwise comparison of notes and incorporating feedback 1200 is depicted.
- a set of notes may be presented to the user for their feedback 1202. This may incorporate their approval, such as within a user interface 1204, as to whether particular content has been categorized accurately or not.
- the user interface 1204 in this embodiment may ensure that users spend less time providing feedback on the model’s performance in categorizing content.
- the user interface 1204 may provide a mechanism to collect user inputs as to whether a cluster label presented to them has correctly been classified into a correct category or not at block 1206, and then stored within a data storage layer 1208, which may serve as input for a machine learning model or any other suitable type of model.
- the machine learning model results 1212 may then serve as input for further evaluation within the user interface 1204.
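The collect-store-retrain loop described above can be sketched as follows; the record fields and the ±1 reward encoding are hypothetical choices, not taken from the disclosure.

```python
data_storage = []  # stand-in for the data storage layer (block 1208)

def record_feedback(note, predicted_category, approved):
    """Append one approval/rejection from the user interface as a reward."""
    data_storage.append({
        "note": note,
        "category": predicted_category,
        "reward": 1 if approved else -1,   # positive/negative reward
    })

record_feedback("Boston Bruins", "hockey", approved=True)
record_feedback("Chicago Bears", "baseball", approved=False)

# Net reward fed back to the machine learning model
net_reward = sum(r["reward"] for r in data_storage)
```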
- Digital notes 1310 in this embodiment may have dimensions related to team name 1312 (also referred to herein as mascot), sport type 1314, and city 1316, although any types and/or quantity of criteria may be utilized.
- Clusters 1302, 1304, 1306, 1308 may be of any suitable size, involve any number of digital notes 1310, and may involve any suitable distance(s) between clusters.
- Non-limiting examples of interface options include a panel 1350 of various options such as note creation 1352, note view enlargement 1354 for editing, and/or text options 1356 including any suitable text editing options (including text size, font, alignment, color, and/or the like).
- Other options may include a cluster shape option 1358 wherein the digital notes 1310 within a selected cluster 1304 may be reorganized into a variety of possible configurations/shapes/arrangements.
- Other options may include a cluster deletion option 1360, cluster approval option 1362, and the like.
- digital notes 1310 representing sports teams have been clustered by city 1316 (i.e., geography).
- digital notes 1408 may have dimensions related to team name 1410, sport type 1412, and city 1414, although any types and/or quantity of criteria may be utilized.
- There may be a hockey cluster 1402 of digital notes 1408 clustering sports teams whose sport type 1412 is hockey, a football cluster 1404 of digital notes 1408 clustering sports teams whose sport type 1412 is football, a baseball cluster 1406 of digital notes 1408 clustering sports teams whose sport type 1412 is baseball, and the like.
- in FIG. 15, a graphical user interface 1500 for clustering digital notes by type of team (e.g., team name and/or mascot) is depicted.
- digital notes 1518 in this embodiment may have dimensions related to team name 1512, sport type 1514, and city 1516, although any types and/or quantity of criteria may be utilized.
- a clothing cluster 1502 of digital notes 1518 clustering those sports teams whose team name 1512 relates to types of clothing, a people cluster 1504 of digital notes 1518 clustering those sports teams whose team name 1512 relates to types of people, an animals cluster 1506 of digital notes 1518 clustering those sports teams whose team name 1512 relates to types of animals, a miscellaneous cluster 1508 of digital notes 1518 clustering those sports teams whose team name 1512 relates to miscellaneous team names, a mythical creatures cluster 1510 of digital notes 1518 clustering those sports teams whose team name 1512 relates to mythical creatures, and the like.
Abstract
A method includes receiving data that sorts objects into preliminary clusters and generating models comprising clustering decisions leading to formation of the preliminary clusters. The method also includes generating, based on the models, predictions about sequences of clustering decisions to be made by users and comparing the models to subsequent clustering decisions made by the users. The method further includes identifying at least one model meeting at least one threshold for accuracy, specificity, or both, for a plurality of the users and generating, using the at least one identified model, a plurality of superordinate level clusters that each comprise clustering strategies exhibited by a subset of the users, wherein at least some of the users corresponding to the superordinate level clusters differ among the superordinate level clusters.
Description
METHODS, MEDIA, AND SYSTEMS FOR HUMAN IN THE LOOP SUPERVISION FOR REAL-TIME MACHINE ASSISTED NOTES CLUSTERING
BACKGROUND
[0001] Notes have been broadly used in recording, sharing, and communicating ideas and information. For example, during a collaboration session (e.g., brainstorming session), participants write down ideas on Post-it® notes, whiteboard(s), notebook(s) (reusable or non-reusable), and/or paper, and then share with one another. During a typical brainstorming exercise, individuals or groups of individuals generate dozens or even hundreds of individual ideas represented as individual digital or physical Post-it® Notes. Sifting through those notes and grouping them into clusters is time-consuming and can be inefficient.
SUMMARY
[0002] In an embodiment, at least one non-transitory computer-readable medium is encoded with instructions that, when executed, configure at least one processor for receiving data that sorts objects into preliminary clusters and generating models comprising clustering decisions leading to formation of the preliminary clusters. The non-transitory computer-readable medium also includes instructions for generating, based on the models, predictions about sequences of clustering decisions to be made by users and comparing the models to subsequent clustering decisions made by the users. The non-transitory computer-readable medium also includes instructions for identifying at least one model meeting at least one threshold for accuracy, specificity, or both, for a plurality of the users and generating, using the at least one identified model, a plurality of superordinate level clusters that each comprise clustering strategies exhibited by a subset of the users, wherein at least some of the users corresponding to the superordinate level clusters differ among the superordinate level clusters.
[0003] In another embodiment, a computing device comprises at least one memory and at least one processor coupled to the at least one memory, the at least one processor being configured to perform operations to receive data that sorts objects into preliminary clusters and generate models comprising clustering decisions leading to formation of the preliminary clusters. The processor is further configured to generate, based on the models, predictions about sequences of clustering decisions to be made by users and compare the models to subsequent clustering decisions made by the users. The processor is further configured to identify at least one model meeting at least one threshold for accuracy, specificity, or both, for a plurality of the users and generate, using the at least one identified model, a plurality of superordinate level clusters that each comprise clustering strategies exhibited by a subset of the users, wherein at least some of the users corresponding to the superordinate level clusters differ among the superordinate level clusters.
[0004] In yet another embodiment, a method includes receiving data that sorts objects into preliminary clusters and generating models comprising clustering decisions leading to formation of the preliminary clusters. The method also includes generating, based on the models, predictions about sequences of clustering decisions to be made by users and comparing the models to subsequent clustering decisions made by the users. The method further includes identifying at least one model meeting at least one threshold for accuracy, specificity, or both, for a plurality of the users and generating, using the at least one identified model, a plurality of superordinate level clusters that each comprise clustering strategies exhibited by a subset of the users, wherein at least some of the users corresponding to the superordinate level clusters differ among the superordinate level clusters.
BRIEF DESCRIPTION OF THE DRAWINGS
[0005] Figure 1 is a representation illustrating one example of a user capturing an image of a workspace with notes using an image capture device on a mobile device according to various embodiments herein;
[0006] Figure 2 is a block diagram illustrating one example of a mobile device utilized to implement various embodiments herein;
[0007] Figure 3 is a block diagram of computing hardware utilized to implement various embodiments herein;
[0008] Figure 4A is a flow diagram schematically illustrating a method for generating a superordinate level cluster that captures clustering strategies across multiple users in accordance with embodiments herein;
[0009] Figure 4B is a flow diagram schematically illustrating a method for generating multiple superordinate level clusters that represent different subpopulations of users in accordance with embodiments herein;
[0010] Figure 5 is a diagram schematically illustrating users manually clustering notes in accordance with embodiments herein;
[0011] Figure 6 illustrates a graphical user interface for clustering digital notes in accordance with embodiments herein;
[0012] Figure 7 is a diagram schematically illustrating a high-level system overview of a human in the loop automatic clustering solution in accordance with embodiments herein;
[0013] Figure 8A is a diagram schematically illustrating generation of a subset of labels in accordance with embodiments herein;
[0014] Figure 8B is a diagram schematically illustrating semi-supervised model training in accordance with embodiments herein;
[0015] Figure 8C is a diagram schematically illustrating model scoring on an unseen dataset in accordance with embodiments herein;
[0016] Figure 9 is a diagram schematically illustrating a natural language processing/understanding workflow in accordance with embodiments herein;
[0017] Figure 10 is a diagram schematically illustrating a similarity analysis workflow in accordance with embodiments herein;
[0018] Figure 11 is a diagram schematically illustrating cluster evaluation in accordance with embodiments herein;
[0019] Figure 12 is a diagram schematically illustrating pairwise comparison of notes and incorporating feedback in accordance with embodiments herein;
[0020] Figure 13 illustrates a graphical user interface for clustering digital notes by geography in accordance with embodiments herein;
[0021] Figure 14 illustrates a graphical user interface for clustering digital notes by sport in accordance with embodiments herein; and
[0022] Figure 15 illustrates a graphical user interface for clustering digital notes by team type in accordance with embodiments herein.
DETAILED DESCRIPTION
[0023] The present disclosure describes embodiments of human in the loop supervision for real-time machine assisted notes clustering. Embodiments of the approaches described herein may involve a human (i.e., a subject matter expert, also referred to interchangeably as an SME) in the loop who, in the beginning of the process, plays the role of a teacher teaching a system what new unseen data looks like and the category it belongs to. This can either happen in real-time, in which the subject matter expert teaches the system to learn a new category, or in batch, where the subject matter expert can give positive or negative rewards. In such embodiments, the subject matter expert will be in the loop until the system hits convergence in learning all the new labels.
[0024] At least some aspects of the present disclosure are directed to systems, media, and methods of humans in the loop executing some of the initial clustering decisions. Aspects of the present disclosure may also be directed toward algorithms using the human clustering decisions as a supervised learning feedback loop to drive automated clustering processes to improve the speed/power of automated clustering and the degree to which the automated clustering decisions match the results that would have occurred with manual, laborious decisions about each of the notes (hundreds, thousands, etc.). In some embodiments, the note management system can improve the efficiency in capturing and extracting note content from a large number of notes. In addition, the note management system in some embodiments may improve the efficiency in grouping and managing notes. One or more superordinate clusters may ultimately be generated as a result of utilizing a subject matter expert in the loop, such that the superordinate cluster(s) may be able to replicate the accuracy and/or specificity of human users within certain thresholds, while delivering significantly improved speed. If one superordinate cluster is utilized, as in the embodiment depicted in FIG. 4A, efficiency/speed may be optimized while still meeting accuracy and/or specificity thresholds. If a plurality of superordinate clusters are utilized, as in the embodiment depicted in FIG. 4B, the collective accuracy/specificity of the superordinate clusters may be optimized while still delivering significant speed advantages over manual clustering. More specifically, in some embodiments, the clustering strategies of various users may be accounted for in more detail within a particular superordinate cluster that does not have to account for all other users.
[0025] In general, notes may include physical notes and digital notes. Physical notes may generally refer to objects with a general boundary and recognizable content. Physical notes may include the resulting objects after people write, draw, or enter via other types of inputs on the objects, for example, paper, white board, or other objects accepting the inputs. By way of non-limiting example, physical notes may include hand-written Post-it® notes, paper, or film, white-board with drawings, posters, signs, and the like. In some cases, physical notes may be generated using digital means, e.g., printing onto printable Post-it® notes or a printed document. In some cases, one object can include several notes. For example, several ideas can be written on a piece of poster paper or a white-board. In some embodiments, to facilitate the recognition of these notes, marks, such as lines, shapes, colors, symbols, markers, stickers, and the like, can be applied to the edges of the notes. Physical notes may be two-dimensional or three-dimensional. Physical notes may have various shapes and sizes. For example, a physical note may be a 7.62 x 7.62 cm (3 x 3 inches) note, a 66.04 x 99.06 cm (26 x 39 inches) poster, a triangular metal sign, and the like. In some cases, physical notes have known shapes and/or sizes that may conform to standards, such as legal, A3, A4, and other size standards, and known shapes, which may not be limited to geometric shapes, such as stars, circles, rectangles, or the like. In other cases, physical notes may have non-standardized sizes and/or irregular shapes. Digital notes may refer to digital objects with information and/or ideas. Digital notes may be generated using digital inputs. Digital inputs may include, for example, keyboards, touch screens, digital cameras, digital recording devices, stylus, digital pens, and the like, which in some embodiments may
correspond to I/O (input/output) 176 in FIG. 2 and/or input devices 306 in FIG. 3. Voice input metadata, by way of non-limiting example, may be added to notes via a digital recording device and/or any other devices capable of detecting sound. In some embodiments, digital notes may be obtained utilizing non-image techniques such as with a digital pen having one or more inertial measurement units or the like. In some embodiments, digital notes may be representative of physical notes.
[0026] In some embodiments, notes may be used in one or more collaboration spaces. In some embodiments, a collaboration space may refer to a physical gathering area allowing more than one person to share ideas and thoughts with each other, as depicted in FIG. 5. In addition to the gathering area, a collaboration space may include virtual spaces allowing a group of persons to share ideas and thoughts remotely, as depicted in FIG. 6. In some embodiments, a gathering space may include a hybrid approach, with some users being in-person and others participating virtually. In embodiments, collaboration need not be performed at the same time. By way of non-limiting example, a user may perform some clustering of notes and stop, with another user continuing on with the clustering at a later point in time when the first user is no longer present or participating.
[0027] FIG. 1 illustrates an example of a note recognition environment 100. In the non-limiting example of FIG. 1, environment 100 may include one or more mobile devices 115 to capture and recognize one or more notes 122 from a workspace 120. As described herein, the mobile device 115 may provide an execution environment for one or more software applications that, as described, may efficiently capture and extract note content from a large number of physical notes, such as the collection of notes 122 from workspace 120. In this non-limiting example, notes 122 may be the results of a collaborative brainstorming session having multiple participants. As described herein, a mobile device 115 and the software executing thereon may perform a variety of note-related operations, including automated creation of digital notes representative of physical notes 122 of workspace 120.
[0028] In some embodiments, a mobile device 115 may include, among other components, an image capture device 118 and a presentation device 128. In addition, although not shown in FIG. 1, mobile device 115 may include one or more processors, microprocessors, internal memory and/or data storage and other electronic circuitry for executing software or firmware to provide the functionality described herein, which may correspond in some embodiments to computing components depicted in FIG. 3.
[0029] Image capture device 118 may be a camera or any other suitable component configured to capture image data representative of workspace 120 and notes 122 positioned therein. In other words, the image data may capture one or more visual representations of an environment, such as workspace 120, having one or more physical notes. In some embodiments, Post-it® digital solutions may be utilized to provide the ability to synthesize notes on a digital whiteboard. Although discussed as a camera of mobile device 115, image capture device 118 may comprise other components capable of capturing image data, such as
a video recorder, an infrared camera, a CCD (Charge Coupled Device) array, a laser scanner, or the like, which may correspond in some embodiments to image capture device 118 and/or input device 306 in FIG. 3. Moreover, the captured image data can include at least one of an image, a video, a sequence of images (i.e., multiple images taken within a time period and/or with an order), a collection of images, image portion(s), and/or the like, and the term input image may refer to any suitable type of image data.
[0030] Presentation device 128 may include, but is not limited to, an electronically addressable display, such as a liquid crystal display (LCD) or other type of display device capable of use with mobile device 115. In some embodiments this may correspond to I/O 176 in FIG. 2 and/or display/output device(s) 304 in FIG. 3. In some embodiments, mobile device 115 may generate content to display on a presentation device 128 for the notes in a variety of formats, for example, a list, grouped in rows and/or columns, a flow diagram, or the like. Mobile device 115 may, in some cases, communicate display information for presentation by other devices, such as a tablet computer, a projector, an electronic billboard or other external device.
[0031] As described herein, mobile device 115, and the software executing thereon, may provide a platform for creating and manipulating digital notes representative of physical notes 122. For example, in general, mobile device 115 may be configured to process image data produced by image capture device 118 to detect and recognize at least one of physical notes 122 positioned within workspace 120. In some examples, the mobile device 115 may be configured to recognize note(s) by determining the general boundary of the note(s). After a note is recognized, mobile device 115 may extract the content of at least one of the one or more notes, where the content may be the visual information of note 122.
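By way of non-limiting example, determining the general boundary of each note might be sketched as a connected-component pass over a binarized image. The following toy implementation, its names, and the 4-connectivity rule are illustrative assumptions for exposition, not the disclosed detection technique:

```python
def note_boundaries(mask):
    """Return (top, left, bottom, right) bounding boxes for each 4-connected
    bright region in a binary image, a toy stand-in for detecting the
    general boundary of each physical note in captured image data."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                stack, ys, xs = [(r, c)], [], []
                seen[r][c] = True
                while stack:  # flood fill one connected note region
                    y, x = stack.pop()
                    ys.append(y)
                    xs.append(x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                boxes.append((min(ys), min(xs), max(ys), max(xs)))
    return boxes

# Two separated rectangular "notes" on a blank 10x10 workspace image.
img = [[False] * 10 for _ in range(10)]
for r in range(1, 4):
    for c in range(1, 4):
        img[r][c] = True   # note A
for r in range(6, 9):
    for c in range(5, 9):
        img[r][c] = True   # note B
boxes = note_boundaries(img)
```

Each returned box delimits one candidate note, from which content extraction could then proceed.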
[0032] As further described herein, mobile device 115 may implement techniques for automated detection and recognition of physical notes 122 and extraction of information, content or other characteristics associated with each of the physical notes. For example, mobile device 115 may allow a user 126 fine-grained control over techniques used by mobile device 115 to detect and recognize physical notes 122. As one example, mobile device 115 may allow a user 126 to select between marker-based detection techniques in which one or more of notes 122 includes a physical fiducial mark on the surface of the note and/or non-marker-based techniques in which no fiducial mark is used.
[0033] In addition, mobile device 115 may provide user 126 with an improved electronic environment for generating and manipulating corresponding digital notes representative of physical notes 122. By way of non-limiting example, mobile device 115 may provide mechanisms allowing user 126 to easily add digital notes to, edit notes within, and/or delete digital notes from a set of digital notes representative of the brainstorming activity associated with workspace 120. In some embodiments, mobile device 115 may provide functionality by which user 126 is able to record and manage relationships between groups of notes 122. In some embodiments, mobile device 115 may provide functionality by which user 126 is able to export the digital notes to other systems, such as cloud-based repositories (e.g., cloud server 112) and/or
other computing devices (e.g., computer system 114 and/or mobile device 116). In some embodiments, mobile device 115 may be a mobile phone. In other embodiments, mobile device 115 may be a tablet computer, a personal digital assistant (PDA), a laptop computer, a media player, an e-book reader, a wearable computing device (e.g., a watch, eyewear, a glove), or any other type of mobile or non-mobile computing device suitable for performing the techniques described herein.
[0034] Turning to FIG. 2, a block diagram illustrates a non-limiting example of a mobile device 115 that operates in accordance with the techniques described herein. By way of non-limiting example, the mobile device of FIG. 2 is described with respect to mobile device 115 of FIG. 1. In this embodiment, mobile device 115 may include various hardware components that provide core functionality for operation of the device, some or all of which may correspond to the computing hardware components depicted in FIG. 3. For example, mobile device 115 may include one or more programmable processors 170 configured to operate according to executable instructions (i.e., program code), typically stored in a computer-readable medium or data storage 168 such as a static random-access memory (SRAM) device or a Flash memory device. I/O 176 (i.e., input/output) may include one or more devices, such as a keyboard, camera button, power button, volume button, home button, back button, menu button, or presentation device 128 as described in FIG. 1. Transmitter 172 and receiver 174 may provide wireless communication with other devices, such as cloud server 112, computer system 114, or other mobile device 116 as described in FIG. 1, via a wireless communication interface as described in FIG. 1, such as but not limited to high-frequency radio frequency (RF) signals. Mobile device 115 may include additional discrete digital logic or analog circuitry not shown in FIG. 2, and may be embodied in any suitable type of device such as a smartphone, laptop, tablet, wearable device (such as wearable augmented reality glasses), and the like.
[0035] In general, operating system 164 may execute on processor 170 and provide an operating environment for one or more user applications 177 (commonly referred to as “apps”), including note management application 178. User applications 177 may, by way of non-limiting example, comprise executable program code stored in a computer-readable storage device (e.g., data storage 168) for execution by processor 170. As other non-limiting examples, user applications 177 may comprise firmware or, in some examples, may be implemented in discrete logic.
[0036] In operation, a mobile device 115 may receive input image data and process the input image data in accordance with the techniques described herein. For example, image capture device 118 may capture an input image of an environment having one or more notes, such as workspace 120 of FIG. 1 having many notes 122. As another non-limiting example, a mobile device 115 may receive image data from external sources, such as a cloud server 112, a computer system 114, and/or a mobile device 116, via a receiver 174. In some embodiments, mobile device 115 may store the image data in data storage 168 for access and processing by note management application 178 and/or other user applications 177.
[0037] As shown in FIG. 2, user applications 177 may invoke kernel functions of operating system 164 to output a graphical user interface (GUI) 179 for presenting information to a user of a mobile device 115. As further described below, and also reflected further in an embodiment depicted with respect to FIG. 6, note management application 178 may construct and/or control GUI 179 to provide an improved electronic environment for generating and/or manipulating corresponding digital notes representative of physical notes 122. For example, note management application 178 may construct GUI 179 to include a mechanism that allows user 126 to easily add digital notes to and/or delete digital notes from defined sets of digital notes recognized from the image data. In some embodiments, note management application 178 may provide functionality by which user 126 is able to record and/or manage relationships between groups of the digital notes by way of GUI 179.
[0038] Referring now to FIG. 3, a block diagram illustrates computing hardware, such as an exemplary computing device 300, through which embodiments of the disclosure can be implemented, such as those depicted and/or described in FIGS. 1-2, which may include, by way of non-limiting examples, cloud server 112, computer system 114, mobile device 115, mobile device 116, and/or any other suitable device. Computing device 300 as described herein is but one example of a suitable computing device and does not suggest any limitation on the scope of any embodiments presented. Nothing illustrated or described with respect to the computing device 300 should be interpreted as being required or as creating any type of dependency with respect to any element or plurality of elements. In various embodiments, the computing device 300 may include, but need not be limited to, a desktop, laptop, server, client, tablet, smartphone, computing cloud or any other type of device that can utilize data. In an embodiment, the computing device 300 includes at least one processor 302 and memory comprising non-volatile memory 308 and/or volatile memory 310. The processor 302 may but need not correspond to processor 170 depicted in FIG. 2. The computing device 300 may include one or more displays, display hardware, and/or output devices 304 such as, for example, AR/VR/MR/XR hardware (which may utilize input devices 306 such as imaging sensors), monitors, speakers, headphones, projectors, wearable-displays, holographic displays, printers, and the like. Output devices 304 may further include, for example, displays and/or speakers, devices that emit energy (radio, microwave, infrared, visible light, ultraviolet, x-ray and gamma ray), electronic output devices (WiFi, radar, laser, etc.), audio (of any frequency), and the like.
[0039] Computing device 300 may further include one or more input devices 306 which can include, by way of example, any type of mouse, keyboard, disk/media drive, memory stick/thumb-drive, memory card, pen, touch-input device, biometric scanner, gaze and/or blink tracker, tracker, voice/auditory input device, motion-detector, camera, scale, and any device capable of measuring data such as motion data (e.g., an accelerometer, GPS, a magnetometer, a gyroscope, etc.), biometric data (e.g., blood pressure, pulse, heart rate, perspiration, temperature, voice, facial-recognition, motion/gesture tracking, gaze tracking, iris or other types of eye recognition, hand geometry, oxygen saturation, glucose level, fingerprint, DNA, dental
records, weight, or any other suitable type of biometric data, etc.), video/still images, and audio (including human-audible and human-inaudible ultrasonic sound waves). Input devices 306 may include any type of device capable of receiving data, whether from another device, visual and/or audio data captured from the real world, object detection data, and the like. Input devices 306 may include cameras (with or without audio recording), such as digital and/or analog cameras, still cameras, video cameras, thermal imaging cameras, infrared cameras, imaging sensors, cameras with a charge-coupled device (CCD), night-vision cameras, three-dimensional cameras, webcams, audio recorders, and the like. By way of non-limiting example, input device 306 and/or display/output device 304 may correspond to I/O 176 depicted in FIG. 2.
[0040] Computing device 300 in some embodiments includes non-volatile memory 308 (e.g., ROM, flash memory, etc.), volatile memory 310 (e.g., RAM, etc.), or a combination thereof. A network interface 312 may facilitate communications over a network 314 with other data source(s) such as a database 318 via wires, a wide area network, a local area network, a personal area network, a cellular network, a satellite network, and the like. Suitable local area networks may include wired Ethernet and/or wireless technologies such as, for example, wireless fidelity (Wi-Fi). Suitable personal area networks may include wireless technologies such as, for example, IrDA, Bluetooth, Wireless USB, Z-Wave, ZigBee, and/or other near field communication protocols. Suitable personal area networks may similarly include wired computer buses such as, for example, USB and FireWire. Suitable cellular networks may include, but are not limited to, technologies such as LTE, WiMAX, UMTS, CDMA, GSM, and the like. Network interface 312 can be communicatively coupled to any device capable of transmitting and/or receiving data via one or more network(s) 314. By way of non-limiting example, the network interface 312 may correspond to a transmitter 172 and/or a receiver 174 as depicted in FIG. 2. Accordingly, the network interface 312 may include a communication transceiver for sending and/or receiving any wired or wireless communication. For example, the network interface 312 may include an antenna, a modem, LAN port, Wi-Fi card, WiMax card, mobile communications hardware, near-field communication hardware, satellite communication hardware and/or any wired or wireless hardware for communicating with other networks and/or devices.
[0041] A computer-readable medium 316 comprises one or more computer-readable mediums, each of which is non-transitory. A computer readable medium may reside, for example, within an input device 306, non-volatile memory 308, volatile memory 310, or any combination thereof. A readable storage medium can include tangible media that is able to store instructions associated with, or used by, a device or system. A computer readable medium, also referred to herein as a non-transitory computer readable medium, includes, by way of non-limiting examples: RAM, ROM, cache, fiber optics, EPROM/Flash memory, CD/DVD/BD-ROM, hard disk drives, solid-state storage, optical or magnetic storage devices, diskettes, electrical connections having a wire, or any combination thereof. A non-transitory computer readable medium may also include, for example, a system or device that is of a magnetic, optical, semiconductor, or electronic type. A non-transitory computer readable medium excludes carrier waves
and/or propagated signals taking any number of forms such as optical, electromagnetic, or combinations thereof.
[0042] The computing device 300 may include one or more network interfaces 312 to facilitate communication with one or more remote devices, which may include, for example, client and/or server devices. The network interface 312 may also be described as a communications module, as these terms may be used interchangeably. The database 318 is depicted as being accessible over the network 314 and may reside within a server, the cloud, or any other configuration to support being able to remotely access data and store data in the database 318. By way of non-limiting example, database 318, non-volatile memory 308, volatile memory 310, and/or computer readable medium 316 may correspond to data storage 168 depicted in FIG. 2.
[0043] Turning to FIG. 4A, a flow diagram 400 schematically illustrates the generation of a superordinate level cluster that captures clustering strategies across multiple users. At block 402, users may begin sorting notes, which they created and/or imported, into clusters. This may be done with digital notes as depicted in Figs. 6 and 13-15, with physical notes as depicted in Figs. 1 and 5, or any combination thereof. As discussed in more detail herein, notes may be sorted into clusters utilizing any quantity of suitable dimensions. As discussed further with respect to FIGS. 13-15, notes with information about sports teams may be sorted according to dimensions such as type of sport, geographical location, and type of team name. The sorting of notes into clusters (e.g., preliminary clusters) may be received, for example, as data by users utilizing a graphical user interface as depicted in FIGS. 6, 13-15 and/or users physically sorting notes and captured as depicted in FIG. 1. At block 404, multiple models (referred to interchangeably as alternative models and/or candidate models) may be generated to account for the observed clustering decisions of the users. The alternative models may be utilized to perform clustering based upon inferences derived from analyzing how the users performed their clustering. More specifically, this may be accomplished by analyzing the real-time clustering decisions that the users make and generating models of the dimensions against which each user tends to cluster notes. These dimensions may be analyzed to identify the policies that the users are using to cluster the notes.
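By way of non-limiting example, the inference at block 404 of identifying which dimension a user tends to cluster against might be sketched as follows; the purity heuristic, the function names, and the sample data are illustrative assumptions rather than the claimed method:

```python
from collections import defaultdict

def infer_dimension(decisions):
    """Infer which dimension best explains a user's clustering decisions.
    decisions: list of (note_attrs, cluster_id) pairs, where note_attrs maps
    a dimension name (e.g. "sport", "city") to that note's value. Returns
    the dimension whose values are most consistent (pure) within the
    observed clusters, together with its purity score."""
    best_dim, best_purity = None, -1.0
    for dim in decisions[0][0]:
        by_cluster = defaultdict(list)
        for attrs, cluster in decisions:
            by_cluster[cluster].append(attrs[dim])
        # purity: fraction of notes sharing their cluster's majority value
        agree = sum(max(vals.count(v) for v in set(vals))
                    for vals in by_cluster.values())
        purity = agree / len(decisions)
        if purity > best_purity:
            best_dim, best_purity = dim, purity
    return best_dim, best_purity

# A hypothetical user who has been grouping sports-team notes by sport type:
decisions = [
    ({"sport": "baseball", "city": "New York"}, "cluster1"),
    ({"sport": "baseball", "city": "Boston"}, "cluster1"),
    ({"sport": "football", "city": "New York"}, "cluster2"),
]
dim, purity = infer_dimension(decisions)
```

Here the "sport" dimension perfectly separates the observed clusters while "city" does not, so the candidate model for this user would be a sport-type policy.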
[0044] At block 406, the models may be utilized to make predictions about sequences of clustering decisions that have yet to be executed by the users. Embodiments herein may theorize and therefore predict that some users tend to cluster based on geographic dimensions (cities, in one example), whereas some users tend to cluster based on the dimension of which sport type a team plays (football, baseball, basketball, etc.) whereas other users tend to cluster by mascot category (e.g., animal, historical persona, etc.). In some embodiments, a model may make a prediction about how the users or a subset of the users will sort notes. Predictions may be made about which unclustered note will next be added to a cluster, along with predicting the cluster to which the note will be added. In some embodiments, predictions may include a note being added to a cluster, and subsequently moved to a different cluster.
[0045] At block 408, the system may examine concordance and/or discordance between the alternative models and the human clustering decisions. In embodiments, the subsequent clustering of a note can be thought of as an opportunity for hypothesis testing. By way of non-limiting example, if a user clusters notes based on geographic dimensions, then they should cluster a particular note into cluster X where the other notes in that cluster are more similar on the geographic dimension. In this way, the accuracy of a certain model’s predictions can be used to reinforce model confidence in some embodiments.
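By way of non-limiting example, the hypothesis testing at block 408 might be sketched as a running agreement score between a candidate model's predictions and the users' actual decisions; the exponential-smoothing rule, the weight, and all names are illustrative assumptions:

```python
def update_confidence(model_predict, observed, confidence, weight=0.1):
    """Exponentially smoothed agreement between a candidate model's
    predicted cluster and the cluster a user actually chose. Each
    concordant decision nudges confidence up; each discordant one
    nudges it down."""
    for note, actual_cluster in observed:
        hit = 1.0 if model_predict(note) == actual_cluster else 0.0
        confidence = (1 - weight) * confidence + weight * hit
    return confidence

# A hypothetical geographic model predicts the cluster from a city field.
geo_model = lambda note: note["city"]
observed = [
    ({"city": "New York"}, "New York"),  # concordant decision
    ({"city": "Boston"}, "Boston"),      # concordant decision
    ({"city": "New York"}, "Chicago"),   # discordant decision
]
confidence = update_confidence(geo_model, observed, confidence=0.5)
```

Two concordant decisions raise the score above its 0.5 starting point, and the final discordant one pulls it partway back, mirroring how prediction accuracy reinforces or erodes model confidence.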
[0046] At block 410, a determination may be made as to whether the multiple models that have been generated, taken collectively, meet one or more accuracy and/or specificity thresholds for a sufficient number of the users. This may entail, for example, determining the accuracy of any candidate model for predicting where users will sort notes during the previous hypothesis-testing at block 408 and evaluating this against an accuracy threshold. This determination may be made by one or more subject matter experts that may or may not be part of the users. In some embodiments, if the accuracy of the given candidate model meets/exceeds a given threshold, it may be deemed to be a working model for how and which dimensions the various users use to guide clustering. Otherwise, if accuracy and/or specificity threshold(s) are not met (condition NO), then the flow diagram may return to block 408. Thus, as an example, more hypothesis testing data may be collected in this way.
[0047] Otherwise, if accuracy and/or specificity threshold(s) are met (condition YES), then at block 412 the model may alert the users that they can stop manually sorting and/or it may generate a superordinate-level cluster that may be a logically consistent hybrid of individual stakeholder models. In other words, an alert may be provided to the user that there is enough confidence in one of the clustering models that the human users can stop manual clustering and the system may generate a singular superordinate cluster of the notes that covers some or all of the clustering methods the current users are utilizing. This may mean, by way of non-limiting example, that the selected model covers more of the clustering methods that the current users are utilizing, and in a more accurate way. In some embodiments, the superordinate level cluster may be updated based upon feedback by one or more subject matter experts, where the feedback may be received in real-time (or near real-time, intervals, and the like) and/or in batch (i.e., x instances of feedback are delivered at a time, etc.). In one example, this could entail the generation of a cluster based on geographic dimensions, sport the team plays, and by mascot category. For example, once accuracy of predicted clustering of notes all reach a certain threshold respectively, there may be confidence that the model identified that 50% of users tend to cluster based on geographic dimensions (cities, in this example), whereas 35% of users tend to cluster based on the dimension of which sport the team plays (football, baseball, basketball, etc.) and 15% of users tend to cluster by mascot category (e.g., animal, etc.). By way of non-limiting example, the superordinate cluster may comprise one or more clustering strategies from all of the users or a subset of the users. In another non-limiting example, the superordinate level cluster may incorporate a plurality of clustering strategies from a plurality of the users.
[0048] Accordingly, the system might also report out what percentage of users in the crowd employed a given method of clustering (e.g., that 50% of users tend to cluster based on geographic dimensions (cities, for example) whereas 35% of users tend to cluster based on the dimension of which sport the team plays (e.g., football, baseball, basketball), and 15% of users tend to cluster by mascot category (e.g., animals)). By way of non-limiting example, suppose that after 20% of the total notes have been manually sorted by users, one of the candidate models may have a demonstrated accuracy of at least 95% on the predicted clustering of the test notes (e.g., all the manual notes for which the system had an a priori predicted sorting). Continuing with this non-limiting example, the system, based on the performance of the candidate model having met or exceeded the accuracy threshold to be deemed a superordinate model, may notify the human users to stop manually clustering. In this way, the system can utilize a singular superordinate model for automated completion of the clustering of the notes.
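By way of non-limiting example, the threshold check of blocks 410-412 and the percentage report described above might be sketched as follows; the 95% threshold, the function names, and the sample data are illustrative assumptions:

```python
def select_superordinate(candidates, threshold=0.95):
    """candidates: dict of model name -> (accuracy, set of covered users).
    Return the qualifying model covering the most users, else None (i.e.,
    keep collecting hypothesis-testing data)."""
    qualifying = [(len(users), name)
                  for name, (accuracy, users) in candidates.items()
                  if accuracy >= threshold]
    return max(qualifying)[1] if qualifying else None

def strategy_report(user_strategies):
    """Report the percentage of users employing each clustering strategy."""
    total = len(user_strategies)
    counts = {}
    for strategy in user_strategies.values():
        counts[strategy] = counts.get(strategy, 0) + 1
    return {s: round(100 * n / total) for s, n in counts.items()}

# After a fraction of notes are sorted, one candidate clears the threshold.
candidates = {"geographic": (0.96, {"u1", "u2"}), "sport": (0.90, {"u3"})}
strategies = {"u1": "geographic", "u2": "geographic",
              "u3": "sport", "u4": "mascot"}
```

When `select_superordinate` returns a model name, the system could alert the users to stop manual sorting and complete the clustering automatically; when it returns `None`, the flow loops back for more hypothesis testing.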
[0049] Turning to FIG. 4B, a flow diagram 450 schematically illustrates the generating of multiple superordinate level clusters that represent different subpopulations of users. At block 452, users may begin sorting notes, which they created and/or imported, into clusters. This may be done with digital notes as depicted in Figs. 6 and 13-15, with physical notes as depicted in Figs. 1 and 5, or any combination thereof. As discussed in more detail herein, notes may be sorted into clusters utilizing any quantity of suitable dimensions. As discussed further with respect to FIGS. 13-15, notes with information about sports teams may be sorted according to dimensions such as type of sport, geographical location, and type of team name. The sorting of notes into clusters (e.g., preliminary clusters) may be received, for example, as data by users utilizing a graphical user interface as depicted in FIGS. 6, 13-15 and/or users physically sorting notes and captured as depicted in FIG. 1. At block 454, multiple models (referred to interchangeably as alternative models and/or candidate models) may be generated to account for the observed clustering decisions of the users. The alternative models may be utilized to perform clustering based upon inferences derived from analyzing how the users performed their clustering. More specifically, this may be accomplished by analyzing the real-time clustering decisions that the users make and generating models of the dimensions against which each user tends to cluster notes. These dimensions may be analyzed to identify the policies that the users are using to cluster the notes.
[0050] At block 456, the models may be utilized to make predictions about sequences of clustering decisions that have yet to be executed by the users. Embodiments herein may theorize and therefore predict that some users tend to cluster based on geographic dimensions (cities, in one example), whereas some users tend to cluster based on the dimension of which sport type a team plays (football, baseball, basketball, etc.) whereas other users tend to cluster by mascot category (e.g., animal, historical persona, etc.). In some embodiments, a model may make a prediction about how the users or a subset of the users will sort notes. Predictions may be made about which unclustered note will next be added to a cluster, along with predicting
the cluster to which the note will be added. In some embodiments, predictions may include a note being added to a cluster, and subsequently moved to a different cluster.
[0051] At block 458, the system may examine concordance and/or discordance between the alternative models and the human clustering decisions. In embodiments, the subsequent clustering of a note can be thought of as an opportunity for hypothesis testing. By way of non-limiting example, if a user clusters notes based on geographic dimensions, then they should cluster a particular note into cluster X where the other notes in that cluster are more similar on the geographic dimension. In this way, the accuracy of a certain model’s predictions can be used to reinforce model confidence in some embodiments.
[0052] At block 460, a determination may be made as to whether the multiple models that have been generated, taken collectively, meet one or more accuracy and/or specificity thresholds for a sufficient number of the users. This may entail, for example, determining the accuracy of any candidate model for predicting where users will sort notes during the previous hypothesis-testing at block 458 and evaluating this against an accuracy threshold. In some embodiments, if multiple candidate models meet/exceed a given threshold, they may be deemed to collectively determine how and which dimensions the various users utilize to guide clustering. Otherwise, if accuracy and/or specificity threshold(s) are not met (condition NO), then the flow diagram may return to block 458. Thus, as an example, more hypothesis testing data may be collected in this way.
[0053] Otherwise, if accuracy and/or specificity threshold(s) are met (condition YES), then at block 462 the model may alert the users that they can stop manually sorting and/or it may generate multiple superordinate-level clusters that may be logically consistent hybrids of individual stakeholder models. In other words, the system would alert the user that the system has enough confidence in these clustering models that the human users can stop manual clustering and the system would generate multiple superordinate clusters of the notes that cover all of the clustering methods the current users are utilizing. In some embodiments, one or more of the superordinate level clusters may be updated based upon feedback by one or more subject matter experts, where the feedback may be received in real-time (or near real-time, intervals, and the like) and/or in batch (i.e., x instances of feedback are delivered at a time, etc.). In the present example, this would mean that the system would generate multiple superordinate clusters based on geographic dimensions, sport the team plays, and by mascot category. For example, once accuracy of the predicted clusters of notes all reach a certain threshold respectively, there may be confidence that multiple superordinate models collectively identified that 50% of users tend to cluster based on geographic dimensions (cities, in this example), whereas 35% of users tend to cluster based on the dimension of which sport the team plays (football, baseball, basketball, etc.) and 15% of users tend to cluster by mascot category (e.g., animal, etc.). By way of non-limiting example, the superordinate clusters may each and/or collectively comprise one or more clustering strategies from all of the users or a subset of the users. In another non-limiting example, the superordinate level clusters may each and/or collectively incorporate a plurality of clustering strategies from a plurality of the users.
[0054] Accordingly, the system might also report what percentage of users in the crowd employed a given method of clustering (e.g., that 50% of users tend to cluster based on geographic dimensions (cities, for example), whereas 35% of users tend to cluster based on the dimension of which sport the team plays (e.g., football, baseball, basketball), and 15% of users tend to cluster by mascot category (e.g., animals)). By way of non-limiting example, suppose that after 20% of the total notes have been manually sorted by users, certain models have a demonstrated accuracy of at least 95% on the predicted clustering of the test notes (e.g., all the manually sorted notes for which the system had an a priori predicted sorting). Continuing with this non-limiting example, the system, based on the performance of certain candidate models having exceeded the accuracy threshold to be deemed superordinate models, may notify the human users to stop manually clustering. In this way, the system can utilize multiple superordinate models for automated completion of the clustering of the notes.
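The percentage report described above may be sketched as follows; the user names and the mapping of each user to an inferred clustering strategy are hypothetical inputs, standing in for whatever the candidate models infer.

```python
from collections import Counter

def strategy_report(user_strategies):
    """Given a mapping of user -> inferred clustering dimension, report the
    percentage of users in the crowd employing each strategy."""
    counts = Counter(user_strategies.values())
    total = len(user_strategies)
    return {strategy: round(100 * n / total) for strategy, n in counts.items()}

# Hypothetical assignment of 20 users to inferred strategies:
users = {f"user{i}": "geography" for i in range(10)}
users.update({f"user{i}": "sport" for i in range(10, 17)})
users.update({f"user{i}": "mascot" for i in range(17, 20)})
print(strategy_report(users))  # {'geography': 50, 'sport': 35, 'mascot': 15}
```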
[0055] Turning to FIG. 5, users 502 are depicted in an environment 500 manually moving notes 504 into clusters 506. As discussed in more detail herein, users 502 may work together, individually, and/or in any suitable fashion to sort notes 504 into clusters 506 based upon one or more criteria/dimensions. In various embodiments, the image-capture technology depicted in FIGS. 1-3 may be utilized to capture images and/or videos of the notes 504 being put into clusters 506. Clusters 506 may be of any suitable size, involve any number of notes 504 and/or any number of clusters 506, and may involve any suitable distance(s) between notes 504 and/or clusters 506.
[0056] Turning to FIG. 6, a graphical user interface for clustering digital notes 602 is depicted. Digital notes 602 may be electronically represented on a digital surface 600 to display any suitable number/type(s) of digital notes within a graphical user interface. By way of non-limiting example, notes 602 may be created within the digital surface 600, may be captured from physical notes as depicted in FIG. 1, and/or any combination thereof. Also, by way of non-limiting example, and as further depicted with respect to FIGS. 13-15, notes 602 may contain multiple pieces of information, such as a name 604 of a sports team, a sport type 606 associated with the team, and/or a city 608 associated with the team. The digital notes 602 within the graphical user interface may be arranged in any suitable/desired fashion by any number of users on one or more devices. By way of non-limiting example, each user may access the digital surface 600 using their own device. The graphical user interface may include, by way of non-limiting example, a control panel 650 having one or more options such as note deletion 652, digital note creation 654, and a digital note sizing button 656. In this embodiment, one or more users may move digital notes 602 into one or more clusters. Any number of users present at the digital surface 600 may move digital notes 602 for any reason. By way of non-limiting example, and as depicted in FIGS. 13-15, the digital notes 602 may be clustered by one or more criteria, such as by geography (e.g., city 608), sport (e.g., sport type 606), and/or team name type (e.g., name 604). Any other suitable criteria for clustering may be utilized, such as color and/or size of the digital note 602. In some embodiments, a hybrid approach may be utilized, with digitally-captured physical notes appearing on the same digital surface 600 with digitally-created digital notes 602. In such embodiments, continual and/or periodic digital capture of physical notes may be utilized to update their positions. In other embodiments, once physical notes are digitally captured, their positions may be determined by manipulation from users via the digital surface 600.
[0057] Turning to FIG. 7, a high-level system overview of the human in the loop automatic clustering solution 700 is depicted. At block 702, by way of non-limiting example, a known subset of a content category may be supplied by a subject matter expert during every new brainstorming session if the system has not yet learned this content category. The subject matter expert may thus provide a data subset and associated labels and/or categories, which may be put into data storage at block 704. A self-supervised machine learning model approach may be used in this embodiment to reduce data-labeling overhead for subject matter experts. Semi-/self-supervised learning is an approach to machine learning that combines a small amount of labeled data with a large amount of unlabeled data during training.
[0058] At block 706, a machine learning model may utilize the contents of the data storage at block 704 to learn the new data representation. In this embodiment, the machine learning model may be triggered for either a batch of records or to proceed in a real-time or near real-time fashion, although any suitable timing for processing records may be utilized in other embodiments. The model may learn the latent representations of the content category. Based on this latent representation, the model may predict categories for the unlabeled dataset. The generated categories may then be submitted to the subject matter expert(s) for approval. The subject matter expert(s) may then issue either positive or negative feedback, such that labels may be generated for subject matter review by returning to block 702. At block 708, the closed-loop system of this embodiment may continue until a steady set of labels is generated. Put another way, a determination may be made that the process is to continue until a steady state of labels is obtained. Upon achieving a steady state, the subject matter expert(s) may no longer be needed in the loop.
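The closed loop of blocks 702-708 may be sketched as follows; the `refine` callable, the example label names, and the iteration cap are hypothetical stand-ins for one model-plus-expert pass, and the steady-state test (two consecutive passes producing identical label sets) is one non-limiting way to operationalize "a steady set of labels".

```python
def closed_loop_labeling(initial_labels, refine, max_iters=10):
    """Iterate label generation until the label set is steady (block 708).

    refine: callable simulating one pass of model prediction plus subject
    matter expert feedback; returns a possibly-updated label set.
    Stops when two consecutive passes produce identical labels, at which
    point the expert is no longer needed in the loop.
    """
    labels = set(initial_labels)
    for _ in range(max_iters):
        new_labels = set(refine(labels))
        if new_labels == labels:   # steady state reached
            return labels
        labels = new_labels
    return labels

# Hypothetical refinement that merges a redundant category once:
def refine(labels):
    return (labels - {"city/region"}) | {"geography"} if "city/region" in labels else labels

print(closed_loop_labeling({"geography", "sport", "city/region"}, refine))
```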
[0059] Turning to step 1 as depicted in FIG. 8A as part of a multi-part flow 800, a flow diagram 802 depicts the generation of a subset of labels. In this embodiment, data may be important to obtaining an accurate machine learning model, yet data labelling may become a time-consuming process when the size of the dataset is large. At 804, uncategorized Post-it® Notes residing on a Post-it® Board may serve as an input source. In embodiments, uncategorized notes may be digital, physical, and/or a combination thereof, and may serve as one or more input sources. At 806, a subject matter expert may label a set of categories. At 808, the Post-it® Notes that the subject matter expert has labeled may form a subset of labels for the content in the Post-it® Board, such as by categories A, B, and C. In some embodiments, the labelled notes/data could be viewed as the training data, and the yet-to-be-labelled data could be viewed as the test data. At block 810, the data may be structured in, by way of non-limiting example, an n*2 form comprising content and category.
[0060] Turning to step 2 as depicted in FIG. 8B, a flow diagram 812 depicts semi-supervised model training. A semi/self-supervised learning model may be trained on the data comprising content and categories 814, utilizing the label subset provided by the subject matter expert at the end of step 1 in FIG. 8A. At 816, a semi/self-supervised learning autoencoder may be utilized to reduce the time consumed by the one or more subject matter experts in labeling data on the Post-it® Board. The model may learn the overall latent representations in the dataset. As part of this step, an offline model may be trained on a subset of content and its associated labels/categories, such that the model may learn the latent representation of the new data. This step may, in some embodiments, also involve natural language processing and/or understanding capabilities to semantically understand content in the Post-it® Notes. In the online phase, unlabeled data may be scored into n categories by the trained self-supervised learning model. At 818, the semi-supervised learning model may be packaged. This may be done, for example, to ensure that the model can be easily deployed and maintained in a production environment. Accordingly, this may provide a trained semi/self-supervised learning model at 820.
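A minimal self-training sketch of steps 1-3 follows; it substitutes a simple word-overlap score for the autoencoder's learned latent representation (a deliberate simplification), and the note texts, category names, and confidence rule are all hypothetical.

```python
def self_train(labeled, unlabeled, rounds=5):
    """Learn from a small expert-labeled subset, then iteratively pseudo-label
    the unlabeled notes (the offline/online phases of step 2, greatly simplified).

    labeled: list of (text, category) pairs from the subject matter expert.
    unlabeled: list of note texts to be scored into categories.
    Word overlap with each category's accumulated vocabulary stands in for the
    latent-representation scoring of the trained model.
    """
    vocab = {}                               # category -> set of words seen so far
    for text, cat in labeled:
        vocab.setdefault(cat, set()).update(text.lower().split())
    pseudo = {}
    remaining = list(unlabeled)
    for _ in range(rounds):
        progressed = False
        for text in list(remaining):
            words = set(text.lower().split())
            scores = {c: len(words & v) for c, v in vocab.items()}
            best = max(scores, key=scores.get)
            if scores[best] > 0:             # confident enough to pseudo-label
                pseudo[text] = best
                vocab[best].update(words)    # the "model" absorbs the new note
                remaining.remove(text)
                progressed = True
        if not remaining or not progressed:
            break
    return pseudo

labeled = [("chicago bears football", "sport:football"),
           ("boston red sox baseball", "sport:baseball")]
unlabeled = ["green bay packers football", "new york yankees baseball"]
print(self_train(labeled, unlabeled))
```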
[0061] Turning to step 3 as depicted in FIG. 8C, a flow diagram 822 depicts model scoring on an unseen dataset. At 826, the trained semi/self-supervised learning model 820 may be used to score the remaining unlabeled data from the Post-it® Board 824. At 828, the final categories may then be submitted to a subject matter expert for their evaluation. At 830, the subject matter expert may review the labels and either provide a positive reward indicating a successful model iteration or a negative reward penalizing the model. In some embodiments, the flow in FIG. 8C may be regarded as iterative, such that after the subject matter expert's review at block 830, another iteration may be performed utilizing the updated data, returning to block 824. As discussed with respect to FIG. 7, steps 1-3 in FIGS. 8A-8C may continue until the model reaches convergence.
[0062] Turning to FIG. 9, a natural language processing/understanding workflow 900 is depicted. At block 902, the process may start with the textual content being annotated by the subject matter expert. At block 904, the text may be subjected to text tokenization, utilizing a word and sentence tokenizer, a punctuation tokenizer, a byte pair tokenizer, a Bidirectional Encoder Representations from Transformers (BERT) tokenizer, and/or any suitable tokenizer(s), by way of non-limiting examples. At block 906, the text may be subjected to standardization procedures to normalize tokens and remove stopwords, numbers, special characters, and/or the like. The text may be processed with word length standardization, unknown-sequence handling, and stopword removal, by way of non-limiting examples. Additionally, if content needs to be grouped into n-gram combinations to make sense semantically, such operations can also be carried out in the standardization procedures.
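The tokenization and standardization of blocks 904-906 may be sketched as follows; the regular-expression word tokenizer and the small stopword list are hypothetical simplifications of the tokenizer options named above.

```python
import re

STOPWORDS = {"the", "a", "an", "of", "and", "or"}  # illustrative stopword list

def standardize(text):
    """Tokenize, then standardize: lowercase, strip punctuation and numbers,
    and remove stopwords (blocks 904-906, simplified)."""
    # A simple word tokenizer: runs of letters only, so numbers and
    # special characters are dropped in the same pass.
    tokens = re.findall(r"[a-zA-Z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

print(standardize("The Bears of Chicago, founded 1920!"))  # ['bears', 'chicago', 'founded']
```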
[0063] At block 908, to obtain richer features for classification, pre-trained or custom word embeddings may be used to map text content onto a relevant correlated coordinate space to obtain weights that may help in content classification/categorization. At block 910, machine learning/artificial intelligence approaches (e.g., long short-term memory networks, recurrent neural networks) may be applied to learn the representations and classify content into the n categories. Non-limiting examples include BERT, Embeddings from Language Models (ELMo), Generative Pre-trained Transformer (GPT), long short-term memory (LSTM) networks, transformers, and the like.
[0064] Turning to FIG. 10, a similarity analysis workflow 1000 is depicted. This embodiment may be utilized to help avoid having multiple pieces of content of the same type entered by the subject matter expert. In some embodiments, the efficacy and/or efficiency may depend upon the data variety in the subset. To help ensure subject matter experts do not select data points that are too similar to one another, a pre-filter approach is depicted as performing a pairwise similarity analysis 1006 on all the data points 1002, 1004. This may be utilized to remove data points that are too similar to one another, while still retaining distinctly different data points. As depicted in this embodiment, an instance of text content A 1002 may be removed, while another instance may be retained along with text content B 1004. This type of check may be utilized to eliminate user error or unconscious bias that could occur when selecting the data subsets from which the model may learn.
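The pairwise pre-filter may be sketched as follows; Jaccard word overlap and the 0.8 cutoff are hypothetical stand-ins for whatever similarity measure and threshold an embodiment uses.

```python
def prefilter(texts, max_similarity=0.8):
    """Pairwise similarity pre-filter: drop any data point too similar to an
    already-retained one, keeping only distinctly different data points."""
    def jaccard(a, b):
        # Word-set overlap as a simple similarity measure between two texts.
        wa, wb = set(a.lower().split()), set(b.lower().split())
        return len(wa & wb) / len(wa | wb)
    retained = []
    for text in texts:
        if all(jaccard(text, kept) <= max_similarity for kept in retained):
            retained.append(text)
    return retained

notes = ["chicago bears football",
         "chicago bears football",      # duplicate instance of content A: removed
         "boston red sox baseball"]     # content B: retained
print(prefilter(notes))
```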
[0065] Turning to FIG. 11, a cluster evaluation diagram 1100 is depicted. In some embodiments, clustering algorithms are evaluated based on minimal intra-cluster distance and maximal inter-cluster distance. There may be two classes of measures for assessing clustering performance. First are extrinsic measures, which in some embodiments require ground truth labels 1102, which may be considered the truth against which clusters and/or categories may be checked for validity. Second are intrinsic measures, which do not require ground truth labels. Cluster evaluation metrics 1104 in this embodiment may be utilized to evaluate the clustering model's performance and eventually provide a reward (positive/negative), which in turn may be used to improve the model performance to achieve convergence. Cluster evaluation metrics may include, by way of non-limiting examples, the Adjusted Rand Index, Fowlkes-Mallows scores, Completeness and V-Measure, the Davies-Bouldin Index, the Calinski-Harabasz Index, and/or the like. For many such cluster evaluation metrics, the higher the index score, the better the clustering performance (although for some, such as the Davies-Bouldin Index, lower scores indicate better clustering). At block 1106, the results of the cluster evaluation metrics 1104 may be expressed as rewards, or a figure of merit may be utilized to comparatively assess and/or rank the clustering performances. Any suitable type of machine learning model 1108, or any other suitable type of model, may then be utilized to perform the modeling, such that the machine learning model results 1110 may serve as input, along with the consumer ground truth labels 1102, for further evaluation with the cluster evaluation metrics 1104.
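As one extrinsic-measure sketch, the plain (unadjusted) Rand index below compares a predicted cluster assignment against ground-truth labels by pairwise agreement; it is shown instead of the Adjusted Rand Index named above (which corrects this value for chance), and the example label vectors are hypothetical.

```python
from itertools import combinations

def rand_index(truth, pred):
    """Plain Rand index: fraction of point pairs on which the predicted
    clustering and the ground-truth labels agree (same-cluster vs. different-
    cluster). 1.0 is perfect agreement; higher is better."""
    agree = 0
    pairs = list(combinations(range(len(truth)), 2))
    for i, j in pairs:
        same_truth = truth[i] == truth[j]
        same_pred = pred[i] == pred[j]
        agree += same_truth == same_pred
    return agree / len(pairs)

# Hypothetical ground-truth categories vs. predicted clusters for four notes:
truth = ["geo", "geo", "sport", "sport"]
pred  = ["c1",  "c1",  "c2",    "c1"]
print(rand_index(truth, pred))  # 0.5
```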
[0066] Turning to FIG. 12, a diagram 1200 of a pairwise comparison of notes and incorporation of feedback is depicted. Based on the notes categorization, in some embodiments, a set of notes may be presented to the user for their feedback 1202. This may incorporate their approval, such as within a user interface 1204, as to whether particular content has been categorized accurately or not. The user interface 1204 in this embodiment may ensure that users spend less time providing feedback on the model's performance in categorizing content. The user interface 1204 may provide a mechanism to collect user inputs as to whether a cluster label presented to them has been classified into a correct category or not at block 1206; these inputs may then be stored within a data storage layer 1208, which may serve as input for a machine learning model or any other suitable type of model. The machine learning model results 1212 may then serve as input for further evaluation within the user interface 1204.
[0067] Turning to FIG. 13, a graphical user interface 1300 for clustering digital notes by geography is depicted. Digital notes 1310 in this embodiment may have dimensions related to team name 1312 (also referred to herein as mascot), sport type 1314, and city 1316, although any types and/or quantity of criteria may be utilized. Clusters 1302, 1304, 1306, 1308 may be of any suitable size, involve any number of digital notes 1310, and may involve any suitable distance(s) between clusters. Non-limiting examples of interface options include a panel 1350 of various options such as note creation 1352, note view enlargement 1354 for editing, and/or text options 1356 including any suitable text editing options (including text size, font, alignment, color, and/or the like). Other options may include a cluster shape option 1358 wherein the digital notes 1310 within a selected cluster 1304 may be reorganized into a variety of possible configurations/shapes/arrangements. Other options may include a cluster deletion option 1360, cluster approval option 1362, and the like.
[0068] In this non-limiting example, digital notes 1310 representing sports teams have been clustered by city 1316 (i.e., geography). Thus, there may be an east coast cluster 1302 of digital notes 1310 representing those sports teams whose respective cities 1316 are located on the east coast, a midwest cluster 1304 of digital notes 1310 representing those sports teams whose cities 1316 are located in the midwest, a southeast cluster 1306 of digital notes 1310 representing those sports teams whose cities 1316 are located in the southeast, and a south and west cluster 1308 of digital notes 1310 representing those sports teams whose cities 1316 are located in the south and west, and the like.
[0069] Turning to FIG. 14, a graphical user interface 1400 for clustering digital notes by type of sport is depicted. Continuing with the embodiment depicted in FIG. 13, digital notes 1408 may have dimensions related to team name 1410, sport type 1412, and city 1414, although any types and/or quantity of criteria may be utilized. There may be a hockey cluster 1402 of digital notes 1408 clustering sports teams whose sport type 1412 is hockey, a football cluster 1404 of digital notes 1408 clustering sports teams whose sport type 1412 is football, a baseball cluster 1406 of digital notes 1408 clustering sports teams whose sport type 1412 is baseball, and the like.
[0070] Turning to FIG. 15, a graphical user interface 1500 for clustering digital notes by type of team (e.g., team name and/or mascot) is depicted. Continuing with the embodiment depicted in FIGS. 13-14, digital notes 1518 in this embodiment may have dimensions related to team name 1512, sport type 1514, and city 1516, although any types and/or quantity of criteria may be utilized. There may be a clothing cluster 1502 of digital notes 1518 clustering those sports teams whose team name 1512 relates to types of
clothing, a people cluster 1504 of digital notes 1518 clustering those sports teams whose team name 1512 relates to types of people, an animals cluster 1506 of digital notes 1518 clustering those sports teams whose team name 1512 relates to types of animals, a miscellaneous cluster 1508 of digital notes 1518 clustering those sports teams whose team name 1512 relates to miscellaneous team names, a mythical creatures cluster 1510 of digital notes 1518 clustering those sports teams whose team name 1512 relates to mythical creatures, and the like.
[0071] As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” encompass embodiments having plural referents, unless the content clearly dictates otherwise. As used in this specification and the appended claims, the term “or” is generally employed in its sense including “and/or” unless the content clearly dictates otherwise.
[0072] While particular embodiments have been illustrated and described herein, it should be understood that various other changes and modifications may be made without departing from the scope of the claimed subject matter. Moreover, although various aspects of the claimed subject matter have been described herein, such aspects need not be utilized in combination. It is therefore intended that the appended claims cover all such changes and modifications that are within the scope of the claimed subject matter.
Claims
1. At least one non-transitory computer-readable medium encoded with instructions that, when executed, configure at least one processor for: receiving data sorting objects into preliminary clusters; generating models comprising clustering decisions leading to formation of the preliminary clusters; generating, based on the models, predictions about sequences of clustering decisions to be made by users; comparing the models to subsequent clustering decisions made by the users; identifying at least one model meeting at least one threshold for accuracy, specificity, or both, for a plurality of the users; and generating, using the at least one identified model, a plurality of superordinate level clusters that each comprise clustering strategies exhibited by a plurality subset of the users, wherein at least some of the users corresponding to the superordinate level clusters differ among the superordinate level clusters.
2. The at least one non-transitory computer-readable medium of claim 1 wherein the received data corresponds to one or more users sorting notes into the clusters.
3. The at least one non-transitory computer-readable medium of claim 1 encoded with further instructions for receiving feedback of each of the superordinate level clusters by at least one subject matter expert, wherein each subject matter expert is not a user.
4. The at least one non-transitory computer-readable medium of claim 3 encoded with further instructions for updating at least one of the superordinate level clusters based upon feedback by the at least one subject matter expert.
5. The at least one non-transitory computer-readable medium of claim 3 wherein the feedback is received in real-time.
6. The at least one non-transitory computer-readable medium of claim 3 wherein the feedback is received in batch.
7. The at least one non-transitory computer-readable medium of claim 1 encoded with further instructions for, upon identifying the at least one of the models meeting the at least one threshold, notifying the users to stop sorting notes.
8. The at least one non-transitory computer-readable medium of claim 1 wherein generating, using the at least one identified model, a plurality of superordinate level clusters that each comprise clustering strategies exhibited by a plurality subset of the users further comprises completing one or more of the preliminary clusters.
9. A computing device comprising: at least one memory; and at least one processor coupled to at least one of the at least one memory, the at least one processor being configured to perform operations to: receive data sorting objects into preliminary clusters; generate models comprising clustering decisions leading to formation of the preliminary clusters; generate, based on the models, predictions about sequences of clustering decisions to be made by users; compare the models to subsequent clustering decisions made by the users; identify at least one model meeting at least one threshold for accuracy, specificity, or both, for a plurality of the users; and generate, using the at least one identified model, a plurality of superordinate level clusters that each comprise clustering strategies exhibited by a plurality subset of the users, wherein at least some of the users corresponding to the superordinate level clusters differ among the superordinate level clusters.
10. The computing device of claim 9 wherein the received data corresponds to one or more users sorting notes into the clusters.
11. The computing device of claim 9 wherein the at least one processor is further configured to receive feedback of each of the superordinate level clusters by at least one subject matter expert, wherein each subject matter expert is not a user.
12. The computing device of claim 11 wherein the at least one processor is further configured to update at least one of the superordinate level clusters based upon feedback by the at least one subject matter expert.
13. The computing device of claim 11 wherein the feedback is received in real-time.
14. The computing device of claim 11 wherein the feedback is received in batch.
15. The computing device of claim 9 wherein the at least one processor is further configured to, upon identifying the at least one of the models meeting the at least one threshold, notify the users to stop sorting notes.
16. The computing device of claim 9 wherein generating, using the at least one identified model, a plurality of superordinate level clusters that each comprise clustering strategies exhibited by a plurality subset of the users further comprises completing one or more of the preliminary clusters.
17. A method comprising: receiving data sorting objects into preliminary clusters; generating models comprising clustering decisions leading to formation of the preliminary clusters; generating, based on the models, predictions about sequences of clustering decisions to be made by users; comparing the models to subsequent clustering decisions made by the users; identifying at least one model meeting at least one threshold for accuracy, specificity, or both, for a plurality of the users; and generating, using the at least one identified model, a plurality of superordinate level clusters that each comprise clustering strategies exhibited by a plurality subset of the users, wherein at least some of the users corresponding to the superordinate level clusters differ among the superordinate level clusters.
18. The method of claim 17 wherein the received data corresponds to one or more users sorting notes into the clusters.
19. The method of claim 17 further comprising receiving feedback of each of the superordinate level clusters by at least one subject matter expert, wherein each subject matter expert is not a user.
20. The method of claim 19 further comprising updating at least one of the superordinate level clusters based upon feedback by the at least one subject matter expert.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363594455P | 2023-10-31 | 2023-10-31 | |
| US63/594,455 | 2023-10-31 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025093962A1 true WO2025093962A1 (en) | 2025-05-08 |
Family
ID=93150574
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/IB2024/059838 Pending WO2025093962A1 (en) | 2023-10-31 | 2024-10-08 | Methods, media, and systems for human in the loop supervision for real-time machine assisted notes clustering |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025093962A1 (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20100312725A1 (en) * | 2009-06-08 | 2010-12-09 | Xerox Corporation | System and method for assisted document review |
| US20220230089A1 (en) * | 2021-01-15 | 2022-07-21 | Microsoft Technology Licensing, Llc | Classifier assistance using domain-trained embedding |
| US20220309250A1 (en) * | 2021-03-23 | 2022-09-29 | Hewlett Packard Enterprise Development Lp | Facilitating an automated, interactive, conversational troubleshooting dialog regarding a product support issue via a chatbot |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24791105; Country of ref document: EP; Kind code of ref document: A1 |