SE2000041A1 - Method and system for anonymously tracking and/or analysing health states in a population - Google Patents
Info
- Publication number
- SE2000041A1
- Authority
- SE
- Sweden
- Prior art keywords
- health
- individuals
- group
- population
- group identifier
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
- G06F21/6254—Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/68—Arrangements of detecting, measuring or recording means, e.g. sensors, in relation to patient
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
- G06F21/32—User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/535—Tracking the activity of the user
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W12/00—Security arrangements; Authentication; Protecting privacy or anonymity
- H04W12/02—Protecting privacy or anonymity, e.g. protecting personally identifiable information [PII]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/02—Services making use of location information
- H04W4/029—Location-based management or tracking services
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Public Health (AREA)
- Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Physics & Mathematics (AREA)
- Biomedical Technology (AREA)
- Primary Health Care (AREA)
- Bioethics (AREA)
- Software Systems (AREA)
- Epidemiology (AREA)
- Pathology (AREA)
- Databases & Information Systems (AREA)
- Quality & Reliability (AREA)
- Data Mining & Analysis (AREA)
- Biophysics (AREA)
- Heart & Thoracic Surgery (AREA)
- Molecular Biology (AREA)
- Surgery (AREA)
- Animal Behavior & Ethology (AREA)
- Veterinary Medicine (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
There are provided methods and systems for anonymously tracking and/or analysing transition of individual subjects between different health states. In particular, there is provided a computer-implemented method for enabling estimation of the amount, ratio and/or rate of individuals in a population transitioning and/or coinciding between two or more health states. The method comprises the steps of receiving (S1) identifying data from two or more individuals, generating (S2), by one or more processors, a group identifier for each individual that is effectively uncorrelated with a transition between health states, and storing (S3): the group identifier of each individual together with data describing health state; and/or a counter per health state and group identifier.
Description
METHOD AND SYSTEM FOR ANONYMOUSLY TRACKING AND/OR ANALYSING HEALTH STATES IN A POPULATION
TECHNICAL FIELD
The invention generally relates to the issue of anonymity in technological applications; and technological aspects of data collection and data/population statistics, and more specifically concerns the technical field of health monitoring and/or analysis, and especially tracking and/or estimating or measuring transition between health states and/or methods and systems and computer programs for enabling such estimation.
BACKGROUND
Legislation and public opinion increasingly drive a movement towards a right of anonymity in technology. This conflicts with the need to collect data about health in populations in order to automate or optimize healthcare and societies. Pharmaceutical companies depend on health information to improve their medications and dosages, and hospitals depend on similar information to improve their treatments and recommendations. New health monitoring devices rely on population data in order to create recommendations, warnings or other interventions.
Technologies that enable data collection for statistical purposes while preserving personal anonymity are in high demand. In particular, the tracking of health developments in a population is problematic, since the reidentification of an individual at a later time is commonly the very definition of a breach of said individual's right to anonymity. This means that the whole idea of anonymous tracking of the health of a population is somewhat counter-intuitive, since it is often practically impossible on the individual level.
Current anonymization methodologies used for tracking the health of people that are based on pseudo-anonymization or unique identifiers are clearly unable to fulfill these needs, which means that companies avoid collecting data on transitions between health states. Other common anonymization methods, such as k-anonymization, first store personal data and then anonymize them, which is still a violation of the individual's privacy until the anonymization takes place. It is highly desirable to find systems able to collect data on such transitions between health states of individuals without violating anonymity. This anonymity needs to extend to computer networks, medical devices, mobile devices, computers, phones, cars and other devices that can be used to reasonably identify individuals and monitor their health states.
In some related anonymization technologies, encryption with a very minor destruction of information has been used, so that individuals can be reidentified with sufficiently high probability (commonly with error rates of one in several hundred million identifications) that any misidentification can be neglected altogether. However, such encryption techniques, irrespective of whether they are or are not practically reversible, are not deemed to be compatible with the legislative interpretation of strict anonymization nor with public opinion of the same, since the possibility of the reidentification act itself is the defining attribute of personal data even if the specific device used and/or the identifier used cannot be known.
SUMMARY
It is a general object to provide a system for providing anonymity while calculating statistics on health transitions (i.e. transitions between health states) in populations.
It is a specific object to provide a system and method for preserving anonymity while estimating or measuring the transition of individuals between two or more health states.
It is another object to provide a system for anonymously tracking and/or analysing transition between health states of individual subjects, referred to as individuals, in a population.
It is also an object to provide a health monitoring system comprising such a system.
Yet another object is to provide a computer-implemented method for enabling estimation of the amount and/or transition of individual subjects, referred to as individuals, in a population moving and/or coinciding between two or more health states.
A further object is to provide a method for generating a measure of transition of individual subjects, referred to as individuals, between health states.
Still another object is to provide a computer program and/or computer-program product configured to perform such a computer-implemented method.
These and other objects are met by embodiments as defined herein.
According to a first aspect, there is provided a system comprising:
- one or more processors;
- an anonymization module configured to, by the one or more processors: receive, for each one of a multitude of individual subjects, referred to as individuals, in a population of individuals, identifying information representative of an identity of the individual, and generate a group identifier based on the identifying information of the individual to effectively perform microaggregation of the population into corresponding groups;
- a memory configured to store group identifier counters for each of two or more group identifiers from each of two or more health states associated with the corresponding individuals;
- an estimator configured to, by the one or more processors: receive counter information from at least two group identifier counters, and generate one or more transition measures related to individuals passing from one health state to another health state.
According to a second aspect, there is provided a system for anonymously tracking and/or analysing transition between health states of individual subjects, referred to as individuals. The system is configured to determine, for each individual in a population of multiple individuals, a group identifier based on a hashing function using information representative of an identity of the individual as input, wherein each group identifier corresponds to a group of individuals, the identity information of which results in the same group identifier, thereby effectively performing microaggregation of the population into at least two groups.
The system is further configured to keep track, per group, of assignment data representing the number of assignments to two or more health states by individuals belonging to the group.
The system is also configured to determine at least one (health) transition measure representative of the number of individuals passing from a first health state to a second health state, or being assigned both to the first health state and also to the second health state, based on assignment data per group identifier.
According to a third aspect, there is provided a health monitoring system comprising a system according to the first or second aspect.
According to a fourth aspect, there is provided a computer-implemented method for enabling estimation of the amount, ratio and/or rate of individuals in a population transitioning and/or coinciding between two or more health states. The method comprises the steps of:
- receiving identifying data from two or more individuals;
- generating, by one or more processors, a group identifier for each individual that is a priori effectively uncorrelated with a transition between health states; and
- storing: the group identifier of each individual together with data describing health state; and/or a counter per health state and group identifier.
According to a fifth aspect, there is provided a computer-implemented method for generating a measure of transition of individual subjects, referred to as individuals, between health states. The method comprises the steps of:
- configuring one or more processors to receive counters of anonymous and approximately independently distributed group identities originating from assignment of individuals to each of two health states;
- generating, using said one or more processors, a health transition measure between two health states using a linear correlation between counters of group identities for each of the two health states;
- storing said health transition measure to a memory.
According to a sixth aspect, there is provided a computer program comprising instructions, which when executed by at least one processor, cause the at least one processor to perform the computer-implemented method according to the fourth aspect and/or fifth aspect.
According to a seventh aspect, there is provided a computer-program product comprising a non-transitory computer-readable medium having stored thereon such a computer program.
According to an eighth aspect, there is provided a system for performing the method according to the fourth aspect and/or fifth aspect.
In this way, it is actually possible to provide anonymity while allowing data collection and calculation of statistics on populations of individuals.
In particular, the proposed technology enables preservation of anonymity while estimating or measuring the transition of individuals between two or more health states.
In general, the invention provides improved technologies for enabling and/or securing anonymity in connection with data collection and statistics.
Other advantages offered by the invention will be appreciated when reading the below description of embodiments of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention, together with further objects and advantages thereof, may best be understood by making reference to the following description taken together with the accompanying drawings, in which:
FIG. 1 is a schematic diagram illustrating an example of a system according to an embodiment.
FIG. 2 is a schematic diagram illustrating an example of micro-aggregation of a population into groups.
FIG. 3 is a schematic diagram illustrating another example of micro-aggregation of a population into groups, including the concept of group identifier counters.
FIG. 4 is a schematic diagram illustrating how each group of individuals may be associated with a set of health states N, optionally each for a set of points in time.
FIG. 5 is a schematic diagram illustrating the association of health state data and useful identifying information (ID).
FIG. 6 is a schematic diagram illustrating an example of a health monitoring system.
FIG. 7 is a schematic flow diagram illustrating an example of a computer-implemented method for enabling estimation of the amount, ratio and/or rate of individuals in a population transitioning and/or coinciding between two or more health states.
FIG. 8 is a schematic flow diagram illustrating another example of a computer-implemented method for enabling estimation of the amount, ratio and/or rate of individuals in a population transitioning and/or coinciding between two or more health states.
FIG. 9 is a schematic diagram illustrating an example of health outcome, or transition of one or more individuals from health state A to health state B.
FIG. 10 is a schematic diagram illustrating an example of a computer-implementation according to an embodiment.
FIG. 11 is a schematic flow diagram illustrating an example of a computer-implemented method for generating a measure of transition of individual subjects, referred to as individuals, between health states.
DETAILED DESCRIPTION
Throughout the drawings, the same reference numbers are used for similar or corresponding elements.
For a better understanding of the proposed technology, it may be useful to begin with a brief analysis of the technical problem.
A careful analysis by the inventor has revealed that it is possible to divide and microaggregate a population into groups without storing personal data. Further, it is, perhaps surprisingly, possible to construct a health monitoring or surveillance system that is able to measure health state transitions using microaggregated data even if this microaggregation is based on factors that are independent and/or independently distributed from the transitions and/or their distribution. Importantly, the proposed invention also works even if the used factors are uncorrelated with the transitions and if any estimation of this joint distribution is infeasible. The invention is thus applicable on general populations using almost any identifying factors (i.e. types of data) without any need for further knowledge of the underlying distributions.
In the following, non-limiting examples of the proposed technology will be described, with reference to the exemplary schematic diagrams of FIG. 1 to FIG. 11.
FIG. 1 is a schematic diagram illustrating an example of a system according to an embodiment. In this particular example, the system 10 basically comprises one or more processors 11, an anonymization module 12, an estimator 13, an input/output module 14, and a memory 15 with one or more counters 16, also referred to as group identifier counters.
According to a first aspect of the invention, there is provided a system 10 comprising:
- one or more processors 11;
- an anonymization module 12 configured to, by the one or more processors: receive, for each one of a multitude of individual subjects, also referred to as individuals, in a population of individuals, identifying information representative of an identity of the individual, and generate a group identifier based on the identifying information of the individual to effectively perform microaggregation of the population into corresponding groups;
- a memory 15 configured to store group identifier counters 16 for each of two or more group identifiers from each of two or more health states associated with the corresponding individuals;
- an estimator 13 configured to, by the one or more processors: receive counter information from at least two group identifier counters, and generate one or more transition measures related to individuals passing from one health state to another health state.
By way of example, the anonymization module 12 may be configured to generate a group identifier based on the identifying information of the individual by using a hashing function.
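As a minimal sketch of such a hashing-based anonymization step, the snippet below derives a group identifier from the leading bits of a SHA-256 digest. The function name `group_identifier`, the parameter `bits` and the example MAC address are illustrative assumptions and are not taken from the description.

```python
import hashlib


def group_identifier(identifying_info: str, bits: int = 10) -> int:
    """Map identifying information to one of 2**bits groups.

    Hypothetical helper: keeps only the leading `bits` bits of a SHA-256
    digest, so that many individuals share each group identifier.
    """
    if not 1 <= bits <= 256:
        raise ValueError("bits must be between 1 and 256")
    digest = hashlib.sha256(identifying_info.encode("utf-8")).digest()
    # Interpret the digest as a 256-bit integer and keep the top `bits` bits.
    return int.from_bytes(digest, "big") >> (256 - bits)


# Example with a made-up MAC address; only the group identifier is kept.
gid = group_identifier("00:1A:2B:3C:4D:5E", bits=10)
```

With `bits=10` the population is divided into at most 1024 groups; a smaller value gives larger groups and hence less information about any single individual.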
FIG. 2 is a schematic diagram illustrating an example of micro-aggregation of a population into groups. By way of example, a population of subjects/objects under study may be micro-aggregated into groups by using suitable one-way hashing. In short, a basic idea is to use, for each one of a multitude of individuals, identifying information (such as ID#1, ID#2, ID#Y) representative of an identity of the individual, and generate a group identifier (Group ID#1, Group ID#X) based on the identifying information of the individual to effectively perform microaggregation of the population into corresponding groups (Group #1, Group #X).
FIG. 3 is a schematic diagram illustrating another example of micro-aggregation of a population into groups, including the concept of group identifier counters. There are group identifier counters 16 for each of two or more group identifiers from each of two or more health states or localities associated with the corresponding individuals. In other words, each of at least two groups (with corresponding group identifiers) has a number (K, L, M) of group identifier counters for maintaining counts from each of two or more health states or localities associated with the corresponding individuals of the considered group.
The estimator 13, also referred to as a health transition estimator, may then be configured to receive counter information from at least two group identifier counters, and generate one or more transition measures related to individuals passing from one health state to another health state.
FIG. 4 is a schematic diagram illustrating how each group of individuals may be associated with a set of health states N, optionally each for a set of points in time.
Optionally, the system 10 comprises an input module 14 configured to, by the one or more processors: receive health state data, for each one of the multitude of individuals, representative of a health state, and match the health state of the individual with a group identifier counter 16 corresponding to the group identifier related to the individual.
For example, each group identifier counter 16 for each group identifier also corresponds to a specific health state.
By way of example, the one or more transition measures include the number and/or ratio of individuals passing from one health state to another health state.
In a particular example, at least one of said one or more health transition measures is generated at least partly based on a linear transform of the counter information of two or more group identifier counters.
For example, the anonymization module 12 and/or the identifying information representative of the identity of an individual may be stochastic, and the stochasticity of the identifying information (identifier) and/or anonymization module may be taken into consideration when generating the linear transform.
As an example, the linear transform may be at least partly based on a correlation between two group identifier counters, from which a baseline corresponding to the expected correlation from two independently generated populations is subtracted.
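The correlation-minus-baseline idea can be sketched as follows. This is one possible estimator, not necessarily the exact transform intended in the description; it assumes (our assumption) that group identifiers are assigned uniformly and independently across individuals. The name `transition_estimate` and its arguments are hypothetical.

```python
from typing import Sequence


def transition_estimate(counts_a: Sequence[int], counts_b: Sequence[int]) -> float:
    """Estimate how many individuals were counted in both health states.

    counts_a[g] and counts_b[g] are the group identifier counters of group g
    for health state A and health state B. Assumes uniform, independent
    assignment of individuals to groups.
    """
    if len(counts_a) != len(counts_b) or len(counts_a) < 2:
        raise ValueError("need two counter vectors over the same >= 2 groups")
    num_groups = len(counts_a)
    # Un-normalised correlation between the two counter vectors.
    correlation = sum(a * b for a, b in zip(counts_a, counts_b))
    # Baseline: expected correlation if the two populations were unrelated.
    baseline = sum(counts_a) * sum(counts_b) / num_groups
    # Rescale so that the estimate is unbiased under the uniform assumption.
    return (correlation - baseline) / (1.0 - 1.0 / num_groups)
```

For evenly spread counters such as `[2, 2, 2, 2]` in both states the correlation equals the baseline and the estimate is zero; an excess of correlation over the baseline is what indicates shared individuals.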
FIG. 5 is a schematic diagram illustrating the association of health state data and useful identifying information (ID).
Non-limiting examples of identifying information representative of the identity of an individual may include at least one of:
- an Internet Protocol (IP) address;
- a mobile phone number or subscriber identity;
- a car license number;
- biometric data originating from a subject;
- a MAC-address;
- a credit card number;
- bar codes;
- home coordinates;
- name;
- age or day of birth;
- social security number;
- tax identification number;
- patient number and similar identifiers; and/or
the identity may be an implicit link to a computer and the corresponding group identifier may be stored as a cookie.
This means one or more of the above information items and/or a combination thereof.
In a particular example, the anonymization module is configured to operate based on a random table, a pseudorandom table, a cryptographic hash function and/or other similar function that is effectively uncorrelated with the aspect of interest the system is designed to study.
As an example, the hashing process may be non-deterministic.
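One hypothetical way to make the hashing non-deterministic is to replace the hashed group by a uniformly random group with some probability. The sketch below illustrates this and the corresponding correction factor for a baseline-corrected counter correlation; the names `noisy_group_identifier`, `match_scale` and `p_random` are assumptions for illustration, and uniform hashing is assumed.

```python
import hashlib
import random


def noisy_group_identifier(identifying_info: str, num_groups: int,
                           p_random: float, rng: random.Random) -> int:
    """Return the hashed group, or a uniformly random group with probability p_random."""
    if rng.random() < p_random:
        return rng.randrange(num_groups)
    digest = hashlib.sha256(identifying_info.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_groups


def match_scale(num_groups: int, p_random: float) -> float:
    """Factor by which the baseline-corrected correlation shrinks.

    An individual recorded in two health states lands in the same group with
    certainty only if neither assignment was randomised, which happens with
    probability (1 - p_random)**2; otherwise the match probability is
    1 / num_groups, the same as for unrelated individuals.
    """
    return (1.0 - p_random) ** 2 * (1.0 - 1.0 / num_groups)
```

Dividing a baseline-corrected correlation by `match_scale(...)` is one way in which the stochasticity of the anonymization step could be taken into account in the estimate.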
By way of example, it may be considered important that data of at least two individuals is collected or expected to be collected per unique hash.
Alternatively, or as a complement, the proposed technology may be viewed in the following way.
According to another aspect, there is provided a system for anonymously tracking and/or analysing transition between health states of individual subjects, referred to as individuals.
The system is configured to determine, for each individual in a population of multiple individuals, a group identifier based on a hashing function using information representative of an identity of the individual as input. Each group identifier corresponds to a group of individuals, the identity information of which results in the same group identifier, thereby effectively performing microaggregation of the population into at least two groups.
The system is further configured to keep track, per group identifier, of assignment data representing the number of assignments to two or more health states by individuals belonging to the group.
The system is also configured to determine at least one transition measure (for the whole population) of the number of individuals passing from a first health state to a second health state or being assigned both to the first health state and also to the second health state based on assignment data per group identifier.
With exemplary reference to FIG. 1 and/or FIG. 10, the system may comprise processing circuitry 11; 110 and memory 15; 120, wherein the memory 15; 120 comprises instructions, which, when executed by the processing circuitry 11; 110, cause the system to anonymously track and/or analyse transition between health states of individuals.
According to yet another aspect, the proposed technology provides a health monitoring or surveillance system 50 comprising a system 10 as described herein, as schematically illustrated in FIG. 6.
FIG. 7 is a schematic flow diagram illustrating an example of a computer-implemented method for enabling estimation of the amount, ratio and/or rate of individuals in a population transitioning and/or coinciding between two or more health states.
Basically, the method comprises the steps of:
S1: receiving identifying data from two or more individuals;
S2: generating, by one or more processors, a group identifier for each individual that is a priori effectively uncorrelated with a transition between health states; and
S3: storing: the group identifier of each individual together with data describing health state; and/or a (group identifier) counter per health state and group identifier.
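Steps S1 to S3 can be sketched as a small ingestion routine that keeps only a counter per health state and group identifier. The names `NUM_GROUPS`, `group_identifier` and `record_observations`, the patient labels and the choice of 256 groups are illustrative assumptions rather than part of the described method.

```python
import hashlib
from collections import Counter
from typing import Iterable, Tuple

NUM_GROUPS = 256  # illustrative choice, not taken from the text


def group_identifier(identifying_data: str) -> int:
    """S2: hash identifying data to a group identifier."""
    digest = hashlib.sha256(identifying_data.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_GROUPS


def record_observations(observations: Iterable[Tuple[str, str]]) -> Counter:
    """S1-S3: consume (identifying data, health state) pairs, keep only counters."""
    counters: Counter = Counter()
    for identifying_data, health_state in observations:  # S1: receive
        gid = group_identifier(identifying_data)         # S2: generate
        counters[(health_state, gid)] += 1                # S3: store counter
        # The identifying data itself is never stored.
    return counters


counters = record_observations([
    ("patient-001", "A"),
    ("patient-002", "A"),
    ("patient-001", "B"),
])
```

In this sketch the identifying data exists only transiently inside the loop; what persists is the counter table keyed by health state and group identifier.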
By way of example, the group identifier may be generated by applying a hashing function that effectively removes any pre-existing correlation between the identifying data and tendency to be assigned to one or more of the health states.
For example, the population being measured may be an unknown sample from a greater population, with the greater population being large enough that the expected number of individuals that would be assigned to each group identifier is two or more.
Optionally, the generation of the group identifier may be partly stochastic each time it is applied.
By way of example, the identifying data may include, per individual, information representative of the identity of the individual. Non-limiting examples of such information may include at least one of:
- an Internet Protocol (IP) address;
- a mobile phone number or subscriber identity;
- a car license number;
- biometric data originating from a subject;
- a MAC-address;
- a credit card number;
- bar codes;
- home coordinates;
- name;
- age or day of birth;
- social security number;
- tax identification number;
- patient number and similar identifiers; and/or
where the identity is an implicit link to a computer and the corresponding group identifier is stored as a cookie.
FIG. 8 is a schematic flow diagram illustrating another example of a computer-implemented method for enabling estimation of the amount, ratio and/or rate of individuals in a population transitioning and/or coinciding between two or more health states. In this particular example, the method further comprises the step of:
S4: generating a health transition measure between two health states using counters of group identities for each of the two health states.
For example, the generation of the health transition measure may be based on a linear transform of the population counters.
Optionally, the linear transform may include a correlation between a vector describing the health transition per group identifier in a first location and a vector describing the health transition per group identifier in a second location.
As an example, a baseline is subtracted from the correlation, the baseline corresponding to the expected correlation between the two vectors.
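One way to see where such a baseline can come from is the following worked expectation. It rests on a simplifying assumption of ours, not stated in the text: each individual is assigned uniformly and independently to one of G groups. The symbols are our own notation: a_g and b_g are the counters of group g for the two vectors, n_A and n_B are their totals, and T is the number of individuals counted in both.

```latex
\begin{align*}
\mathbb{E}\Big[\sum_{g=1}^{G} a_g b_g\Big]
  &= T + \frac{n_A n_B - T}{G}, \\
\hat{T}
  &= \frac{\sum_{g=1}^{G} a_g b_g - \dfrac{n_A n_B}{G}}{1 - \dfrac{1}{G}},
  \qquad \mathbb{E}\big[\hat{T}\big] = T .
\end{align*}
```

The first line counts pairs of observations that fall in the same group: the T shared individuals always match, while each of the remaining n_A n_B - T pairs matches with probability 1/G. Subtracting the baseline n_A n_B / G and rescaling then isolates T.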
For example, the number of individuals in the population may be two or more per group identifier.
Optionally, activity data representative of one or more actions or activities of each individual may also be stored together with the corresponding group identifier and data describing health state, enabling analysis and understanding not only of direct health state aspects but also of actions or activities of individuals. It may also be possible to store tempo-spatial data defining temporal and/or spatial aspects such as time and/or place of an individual in a given health state.
FIG. 11 is a schematic flow diagram illustrating an example of a computer-implemented method for generating a measure of transition of individual subjects, referred to as individuals, between health states.
Basically, the method comprises the steps of:
S11: configuring one or more processors to receive counters of anonymous and approximately independently distributed group identities originating from assignments of individuals to each of two health states;
S12: generating, using said one or more processors, a health transition measure between two health states using a linear correlation between counters of group identities for each of the two health states; and
S13: storing said health transition measure to a memory.
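Steps S11 to S13 can be illustrated end to end on synthetic data. The sketch below is a hedged illustration only: the population, the identity strings, `NUM_GROUPS` and the helper names are all made up, and the estimator assumes a hash that behaves like a uniform random assignment.

```python
import hashlib

NUM_GROUPS = 4096  # illustrative; all names and sizes here are assumptions


def group_of(identity: str) -> int:
    digest = hashlib.sha256(identity.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_GROUPS


def counters_for(identities) -> list:
    counts = [0] * NUM_GROUPS
    for identity in identities:
        counts[group_of(identity)] += 1
    return counts


# Synthetic population: 3000 individuals in state A, 3000 in state B,
# of which 1000 (person-2000 .. person-2999) appear in both.
state_a = [f"person-{i}" for i in range(0, 3000)]
state_b = [f"person-{i}" for i in range(2000, 5000)]

# S11: receive the per-group counters for each health state.
counts_a = counters_for(state_a)
counts_b = counters_for(state_b)

# S12: correlation between the counters minus the independence baseline.
correlation = sum(a * b for a, b in zip(counts_a, counts_b))
baseline = len(state_a) * len(state_b) / NUM_GROUPS
estimate = (correlation - baseline) / (1 - 1 / NUM_GROUPS)

# S13: store the measure (here simply in a dictionary held in memory).
measures = {("A", "B"): estimate}
```

The estimate fluctuates around the true overlap of 1000; under the uniform assumption its standard deviation is roughly the square root of the baseline, so more groups give a tighter estimate at the cost of fewer individuals per group.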
For a better understanding, various aspects of the proposed technology will now be described with reference to non-limiting examples of some of the basic key features followed by some optional features.
The invention receives some identifying data that is able to, with a high probability, uniquely identify an individual and/or personal item of an individual. Such data can be discrete numberings, for example MAC-addresses, IP addresses, license plate numbers, bar codes or random numbers stored in a cookie file. It may alternatively be continuous data, for example home coordinates, biometric measurements or a floating-point measurement identifying some unique characteristic of a personal device. It may also be types of personal data such as tax identification numbers, social security numbers, names, home addresses and phone numbers. It may also be any combination and/or function of such data from one or more sources. Depending on the definition used, this identifying data may be similar to concepts such as identifiers and/or quasi-identifiers.
The invention comprises a hashing module. A hashing module, in our sense, is a system that is able to retrieve identifying data and generate some data about a person's identity that is sufficient to identify the individual to some group that is substantially smaller than the whole population, but not sufficiently small to uniquely identify the individual. This effectively divides the population into groups with one or more individuals, i.e. it performs an automatic online microaggregation of the population. These groups should ideally be independent from the health transitions being studied. In other words, we seek to divide them in such a way that the expectation of the transition of each group should be approximately the same. In particular, the variance in any pair of groups should be approximately independently distributed. Expressed differently, we would like to be able to consider the group as an effectively random subset of the population in our statistical estimates. For example, this can be achieved by applying a cryptographic hash or other hash that has a so-called avalanche effect. A specific example of a suitable hash, if locality-sensitivity is not desired, is a subset of bits of a cryptographic hash, such as SHA-2, of a size suitable to represent the desired number of groups that correspond to the number of individuals we would like to have per group. Padding with a constant set of bits can be used in this example to reach necessary message length. However, this specific example of hash brings some overhead to the computational requirements and hashing modules better adapted for this specific purpose can also be designed, as the application herein does not necessitate all the cryptographic requirements.
In other words, any correlation, whether linear or of another type, that could significantly bias the resulting measure from the system should effectively be removed by the hashing module.
As an example, a sufficient approximation of a random mapping, such as a system based on block ciphers or pseudorandom number generation, can achieve this goal.
In some aspects of the invention, depending on the required conditions for anonymity, the number of groups may be set so that two or more people, either from the population whose data has been retrieved or from some greater population of which that population is essentially a random sample, are expected to be assigned to each group. The invention allows an efficient unbiased estimation in both of these cases as well as with more extreme anonymizing hashing schemes with a very large number of individuals per group.
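The hashing module described above can be illustrated with a minimal sketch, assuming Python and its standard hashlib module. The function name group_identifier and the parameter group_bits are hypothetical and are not taken from the description; the sketch keeps the leading bits of a SHA-256 digest (a member of the SHA-2 family) so that 10 bits give 1024 groups.

```python
import hashlib


def group_identifier(identifying_data: str, group_bits: int = 10) -> int:
    """Map identifying data to one of 2**group_bits group identifiers.

    Illustrative assumption: the leading bits of a SHA-256 digest are kept,
    so the avalanche effect makes the group effectively random with respect
    to the identifying data.
    """
    if not 1 <= group_bits <= 256:
        raise ValueError("group_bits must be between 1 and 256")
    digest = hashlib.sha256(identifying_data.encode("utf-8")).digest()
    # Interpret the 256-bit digest as an integer and keep only the top bits.
    return int.from_bytes(digest, "big") >> (256 - group_bits)


if __name__ == "__main__":
    # The identifier itself would be deleted right after this call.
    print(group_identifier("00:1A:2B:3C:4D:5E"))
```

In such a sketch the number of bits would be chosen so that the expected number of individuals per group meets the anonymity conditions discussed above.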
The hash key, representing a group identifier, can be stored explicitly, for example as a number in a database, or implicitly, for example by having a separate list per hash key.
In other words, the hashing module takes some identifying data of a population, the identifying data not being values of direct interest to the population study, and generates effectively (i.e. an approximation sufficiently good for the purposes herein) randomly sampled subgroups from the whole population. The hashing module as described herein has several purposes: ensuring/guaranteeing the decorrelation of data from the transition (i.e. using a group identifier that has, possibly unlike the identifying data, effectively no correlation with the transition) and anonymizing the data by microaggregating it while preserving some limited information about the identity of each individual. In some embodiments of the invention the hashing module may also, as described in more detail below, serve to preserve limited information about the data itself by using a locality-sensitive hashing.
The statistics collected per group identifier are instrumental in generating the transition statistics for the (whole) studied population comprising a multitude of such groups. The purpose of the invention is not to measure the differences between the groups as such, as the decorrelation intentionally generates subdivisions of the population that are in themselves meaningless, due to the effective removal of any potential correlations between members of the group.
For the purpose of providing anonymity it is important that this hashing takes place effectively online (or in real-time and/or near real-time), i.e. continuously with only a short delay between the acquiring of the identifier and the generation of the hash key. In the preferred embodiment the hashing takes place inside a general-purpose computer located in a sensor system or a general-purpose computer immediately receiving this value. The value should not be able to be externally accessed with reasonable effort before being processed. Immediately after processing, the identifier should be deleted. However, if needed, the data may be batched at various points and/or otherwise handled over some small time interval (for example transmission in nightly batches) in the preferred embodiment if this extended type of online processing is necessary for reasonable technical requirements and if it is also not considered to substantially weaken the provided anonymity of the subject. In contrast, offline methods are generally applied after the whole data collection has been completed.
The people, devices, etc. that are assigned to the same group by the hashing module are denoted by a group identifier.
As an example concerning the suitability of hashing modules, divisions into groups based on large continuous ranges of one or more of many meaningful variables, such as yearly income, home location, IP range or height, are unsuitable criteria, as this is likely to result in different expected transition patterns for each group. On the other hand, we could use for example a cryptographic hash on an initial grouping into sufficiently small ranges of any of these criteria to produce a set of groups that is effectively indistinguishable from a random subset of the whole population.
Alternatively, we could save a cookie on the user's computer that is a pseudo-randomly generated number in a certain range that is small enough that several users are expected to get the same number.
Stochastic group assignments will not prevent the technology from being applied and can also add a meaningful layer of extra anonymity. Certain data, such as biometric data, usually contains some noise level due to measurement error and/or other factors that makes any subsequent group assignment based on this data a stochastic mapping as a function of the identity. In other cases, stochastic elements can be added on purpose. For example, the system may simply roll a dice and assign an individual to a group according to a deterministic mapping 50 % of the time and assign the individual to a completely random group the other 50 % of the time. The data can still be used in our system as long as the distribution of this stochastic assignment is known and/or can be estimated. Further, the simple dice strategy above will be roughly equivalent to a k-anonymity with k=2 in addition to the anonymity already provided by the grouping.
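The dice strategy above may be sketched as follows. This is an illustrative sketch only: the function names, the SHA-256 based deterministic mapping and the default of 1000 groups are assumptions made for the example, not details given in the description.

```python
import hashlib
import random


def deterministic_group(identifier: str, num_groups: int) -> int:
    """Deterministic mapping from an identifier to a group (assumed SHA-256 based)."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_groups


def stochastic_group(identifier: str, num_groups: int = 1000,
                     p_consistent: float = 0.5, rng=None) -> int:
    """Use the deterministic group with probability p_consistent,
    otherwise assign a uniformly random group."""
    rng = rng or random.Random()
    if rng.random() < p_consistent:
        return deterministic_group(identifier, num_groups)
    return rng.randrange(num_groups)
```

Because p_consistent is known, the distribution of the stochastic assignment is known, which is the condition stated above for the data to remain usable.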
An individual is used in descriptions of the invention to refer to any individual person, identifiable device and/or similar object that can be considered linked to a person and used to identify a person. For example, cars, mobile phones and network cards can be considered as individuals in the context of this invention, since tracking these objects allows tracking of individuals.
The invention further comprises a group identifier counter for two or more health states. A health state is any population somehow defined and being of interest to a study of health transition. It may, for example, be defined as: people living in a certain area, people admitted to a certain hospital on Fridays, people using a smart health monitor with high heart rate, people with certain syndromes, people volunteering to a study and/or other similar such categories that could be of interest to a health study. In additional examples, any combination of health status, diagnosis, treatment, intervention, monitoring, syndrome, test results, sensor data ranges, localisation and/or time may be used to define health states. The counter itself keeps track of how many people from each of two or more group identifiers are in that health state. This can, for example, be a relative number, as in a percentage of individuals, and/or an absolute number of individuals. It can be stored in a variety of ways, for example as a vector or as a number of database entries with anonymized identifiers indicating to which group identifier the entry belongs. Also, counters encoding information about the number of people in other ways may be used, such as a Boolean value indicating if the number of people in the group is higher than average or higher than a set threshold. Many other ways to encode the information such that information about the number of people per group identifier per health state can be extracted are obvious to the skilled person.
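As an illustration of the vector form of storage mentioned above, a group identifier counter could be sketched as below. The class and method names are hypothetical, and absolute counts are assumed; relative or Boolean encodings would need a different structure.

```python
class GroupIdentifierCounter:
    """Keeps one count vector per health state, indexed by group identifier."""

    def __init__(self, num_groups: int):
        self.num_groups = num_groups
        self._counts = {}  # health state -> list of counts per group

    def record(self, health_state: str, group_id: int) -> None:
        """Register one assignment of a group identifier to a health state."""
        if not 0 <= group_id < self.num_groups:
            raise ValueError("group_id out of range")
        vector = self._counts.setdefault(health_state, [0] * self.num_groups)
        vector[group_id] += 1

    def vector(self, health_state: str) -> list:
        """Return a copy of the count vector (all zeros if the state is unseen)."""
        return list(self._counts.get(health_state, [0] * self.num_groups))
```

Only the group identifier reaches the counter in this sketch; the identifying data is assumed to have been hashed and deleted beforehand.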
A transition herein refers to any assignment to two different health states. This can be assignment to a health state defined as people in a certain region that have been diagnosed with a certain syndrome and an assignment to another health state defined as people that have deceased in that region. Transition does not necessarily imply change in health. It can, for example, be the assignment to a group of healthy individuals in year 1 and assignment to a group of healthy individuals in year 2, i.e. if the invention is measuring how many people stay free from disease in a certain population by default and/or after some intervention. Transition can, for example, also be in/out of groups, i.e. that they no longer belong to any of the studied health groups under treatment after year 1. The non-studied group can then be viewed as an implicitly defined health group. Transition also does not necessarily imply that the two health states have a simple separation in time. For example, the transitioning from a health state defined as being diagnosed with a certain disease and the health state of being cured from the disease can be measured in a population for two years without having any separation in time between the two groups. In other cases, the temporal direction of the transition may, for example, be undefined and/or different for various individuals in the group (e.g. having no separation between people first eating chocolate and then becoming allergic and people first being allergic and then starting to eat chocolate in a study of whether chocolate influences allergy).
The group identifier counter and/or any data tied to group identities may be modified in any way, for example by removing outliers, filtering specific locations, filtering group identities that coincide with known individuals, or by performing further microaggregation of any data.
The spatial aspect of a health state can also be virtual extents of IP addresses, domain names, frames or similar aspects that describe the connection between a person and part of the state of an electronic device and that describe the state of the person's interaction with it.
The measurement may use the data from the group identifier counter to measure the transition of individuals from one health state (A) to another health state (B). Since each group identifier contains a multitude of individuals, we cannot know precisely how many people from a certain group that was present in A were also present in B. Instead, the invention exploits higher order statistics to generate noisy measurements.
FIG. 9 is a schematic diagram illustrating an example of transition of one or more individuals from health state A to health state B.
In a preferred embodiment, a baseline is established by estimating the expected number of assignments per group, for example by dividing the total number of assignments for all groups in the group identifier counter by the number of groups. Such an expectation baseline may also contain a model of the bias, e.g. in case the expected bias of sensor systems and/or similar that are used directly or indirectly in generating the hash key (i.e. the group identifier) can be calculated by some system depending on factors such as location, recording conditions and time of recording. Additionally, the baseline may be designed taking into consideration population behavioural models, for example: the tendency for repeated assignments to a health state per individual and/or the behaviour of patients that are not recorded for some reason. By subtracting this baseline, the preferred embodiment arrives at the skew of the data per group. Skew of data herein refers to how some particular data is distributed compared to the expectation from the generating distribution.
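The simplest form of the baseline and skew described above, without any bias or behavioural model, can be sketched as follows. The function names are illustrative assumptions.

```python
def expected_per_group(counts):
    """Baseline: total number of assignments divided by the number of groups."""
    return sum(counts) / len(counts)


def skew_per_group(counts):
    """Skew: deviation of each group's count from the expectation baseline."""
    baseline = expected_per_group(counts)
    return [c - baseline for c in counts]


if __name__ == "__main__":
    # Four groups holding 40 assignments in total give a baseline of 10 per group.
    print(skew_per_group([8, 12, 10, 10]))  # [-2.0, 2.0, 0.0, 0.0]
```

A bias model or behavioural model, where available, would replace the flat baseline with a per-group expectation before the subtraction.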
The correlation between the variances per group in A and B represents the skew of the joint distribution. A careful consideration by the inventor reveals that a measure of the number of individuals can be achieved by exploiting the fact that the group identifier and probability of an individual to transition from A to B can effectively be considered independent and identically distributed, which is indeed guaranteed by the design of the hashing module. For example, by relying on the assumption of the independence attribute and by using: knowledge of the stochastic aspect of the distribution of the hashing module (which may include models of any sensor noise, transmission noise and other factors involved), if applicable; and a behavioural model that describes the distribution of the number of visits per individual etc., we can create a baseline skew (often identical to 0) of the joint distribution that would be expected if the two populations being assigned to A and B were, from a stochastic perspective, independently generated. We can also, using a similar behavioural model and/or knowledge of the stochastic distribution in the hashing module, estimate the skew of the joint distribution in case the two populations consisted of exactly the same individuals (often equal to 1 if deterministic). For example, such a skew for perfectly coinciding populations may be adjusted based on models of sensor noise, wherein the sensor noise model can be dependent on other factors, such as location, group identifier, identifier noise and/or knowledge of the stochasticity in the hashing process. In a simple example with homogeneous groups, compensating for a hashing module with a 50 % chance of consistent group assignment for each individual (with otherwise random assignment between all groups) could double the population estimate for the same skew compared to the estimate for a 100 % accurate hashing module.
A statistical measure of the number of individuals can then be generated by performing a linear interpolation between these two extremes based on the actual skew as measured by comparing the estimators. Note that these steps are only an example, but that the independence assumption will result in the transition measurement being representable as a linear transform, such as the one indicated in some aspect described herein. Various specific embodiments and ways to design specific such embodiments can be arrived at by the skilled person from this and other examples and descriptions herein.
The complexity in generating such a measure without the independence assumption made possible by the inherent design of the hashing module would in most cases be prohibitive and this simplification is a key feature of the invention. Note that this simplification does not only simplify the precise design process of the embodiments, but will also result in cheaper, faster and/or more energy efficient methods and systems due to the reduced number of processing operations and/or simplification in the hardware architecture required.
The behavioural model may be generated independently from the generation of the measure. Alternatively, it may be updated through iterative steps with the measurement. For example, it may be updated through an expectation-maximization scheme that iterates between updates of the transition measure and improved estimates of the behavioural model. In yet another alternative, the transition measure and the behavioural model can be updated together, for example through defining an explicit or implicit likelihood as a function of both the measure and the behavioural model. The measure and the behavioural model in this example can then be jointly improved through algorithms such as genetic algorithms or gradient descent. Any combination of the alternatives above is also viable.
The groups do not necessarily need to be of the same distribution (for example having identical estimated group sizes) a priori. With different expected group sizes, the transition estimation will affect the estimated value per group counter and the (normalized) correlation in a straightforward manner. Any related estimation of variance for the transition measure might become more convoluted, as the Gaussian approximation of the distribution of correlations might be invalid if the group differences are large.
An imperfect hash, in terms of eliminating noticeable correlations, can usually, but not always, be avoided. A particular case where it usually cannot be avoided is with certain continuous identifiers. For example, continuous measurements of biometric data can be hashed using a locality-sensitive hashing (LSH), which allows continuous measurements that contain sensor noise to be used in microaggregation for our purposes. Such a hash function can be approximately, but not perfectly, decorrelating. Any choice of a specific LSH necessitates a balance between its decorrelating properties and its locality preserving properties. Even if it is largely decorrelating the data it is still likely to preserve some remaining small bias in the distribution of the hash resulting from any correlation between biometric measurement and a priori tendency to be assigned to a health state (if such correlations are at all present in the original continuous distribution). A term in the baseline ("err"), further elaborated on below, may then be used as a first order compensation of such remaining correlations while preserving the benefits of the linearity of the transform. Note that we do not strictly use decorrelation such as that from the avalanche effect in this setting but assume that small scale correlations resulting from the locality-sensitivity have a small effect on the resulting statistics (in other words, the correlations are effectively removed). In particular, any relevant correlation between the data and a priori tendency to be assigned a health state is likely to be a large-scale pattern. LSH-based hashing modules are not limited to continuous data, but could be utilized for other data, for example integer values, as well.
As a particular example of LSH, a locality-sensitive hashing may be designed by splitting the space of continuous identifier values into 30 000 smaller regions. A cryptographic hash may be used to effectively randomly assign 30 regions to each of 1000 group identifiers. This means that two effectively independently sampled noisy continuous identifiers received from an individual have a large probability of being assigned to the same group. At the same time, two different groups may be likely to have a negligible difference between them due to each group consisting of independently sampled regions of the feature space. The decorrelation will generally be effective if the regions are much smaller than the correlation patterns of interest. For many well-behaved continuous distributions both the noise resistance and the effective decorrelation of the groups can be achieved at the same time. Since an individual may be assigned to different regions solely due to the noise in the identifying data, it may be beneficial to compensate the estimation for the resulting stochasticity in the group identifier assignment.
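The region-based LSH example above may be sketched for a one-dimensional continuous identifier. The sketch assumes values in the half-open interval [low, high), and it obtains exactly 30 regions per group by ordering the regions by their SHA-256 digest and cutting the ordering into equal chunks; that ordering trick and all names are assumptions made for the illustration.

```python
import hashlib

NUM_REGIONS = 30_000
NUM_GROUPS = 1_000
REGIONS_PER_GROUP = NUM_REGIONS // NUM_GROUPS  # 30


def _region_digest(region: int) -> bytes:
    """Cryptographic hash of a region index, used only to shuffle the regions."""
    return hashlib.sha256(region.to_bytes(4, "big")).digest()


# Order the regions by digest (an effectively random order), then give each
# consecutive run of 30 regions the same group identifier.
REGION_TO_GROUP = [0] * NUM_REGIONS
for rank, region in enumerate(sorted(range(NUM_REGIONS), key=_region_digest)):
    REGION_TO_GROUP[region] = rank // REGIONS_PER_GROUP


def lsh_group(value: float, low: float = 0.0, high: float = 1.0) -> int:
    """Map a continuous identifier to a group via the region it falls in."""
    if not low <= value < high:
        raise ValueError("value outside the identifier range")
    region = int((value - low) / (high - low) * NUM_REGIONS)
    return REGION_TO_GROUP[min(region, NUM_REGIONS - 1)]
```

Two noisy measurements that fall in the same region receive the same group, while neighbouring regions are scattered over unrelated groups.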
As an example of the above concepts concerning LSH, people over 120 cm of height may be significantly less likely to have health issues than those under 120 cm, while the corresponding a priori difference between people whose height is 119.5-120 cm and people between 120.0-120.5 cm of height is likely to be negligible and hence approximately uncorrelated.
It can be noted that for a large number of samples and a large number of possible hashes the correlation of two independent populations is approximately normally distributed. This makes it easy to also present confidence intervals for generated measures if desired.
In an example preferred embodiment a server in the example system applies a hashing module to received identifiers and stores an integer between 1 and 1000, effectively random due to the avalanche effect. Assuming the number of individuals to be 10000 assigned to health state A and B respectively and assuming individuals can only transition from A to B (e.g. A is some state in year 1 and B some state in year 2) and with no other correlation between the corresponding populations assigned to A and B, the expected mean for both health states is 10000 / 1000 = 10 individuals per group. We may encode the measured number of individuals per group in integer valued vectors n_a and n_b respectively. We can now calculate the unit length relative variance vectors v_a and v_b as v_a = (n_a - 10)/norm(n_a - 10) etc. (where the function norm(x) is the norm of the vector and subtracting from a vector signifies removing the scalar value from each component). Assuming that every individual assigned to A is also assigned to B we arrive at a perfect correlation, E[v_a * v_b] = 1 (where * is the dot product if used between vectors and E[] is the expectation). Instead assuming that the populations in A and B always consist of different individuals, we can estimate a baseline as E[v_a * v_b] = 0, here using the uncorrelated assumption made feasible due to the use of a hashing module.
Assume now that the number of individuals at B, c3, consists of two groups of individuals, c1 (with relative variance vector v_a1) transitioning from A and c2 (with relative variance vector v_a2) not transitioning from A. The expected correlation in this case becomes E[c3*v_b*v_a1] = E[(c1*v_a1 + c2*v_a2)*v_a1] = c1. This means we can measure the expected number of individuals transitioning from A to B as n_ab = v_b * v_a1 * 10000. Assuming we measure a linear correlation of 0.45 between v_b and v_a in this example we arrive at a measure of 4500 individuals, or 45 % of the individuals in B, transitioning from A. In other words, we arrive at an unbiased measurement using strictly anonymous microaggregated data that can be implemented as a linear transform through the use of a decorrelating hash module. The data generated by the hash module in the example may be considered anonymous and uploaded to any database without storing personal data. The described calculations herein can then preferably be performed on a cloud server/database through the use of lambda functions or other such suitable computing options for the low-cost calculations required to perform a linear transform. Note that the correlation used in these particular calculations is some linear correlation, as this type of correlation is a result from the transition, while any other correlation type may be assumed to have been effectively removed by the hashing module.
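The worked example above (1000 groups, 10000 individuals in each of A and B) can be sketched as a small simulation. The simulation itself, the function names and the use of uniformly random group assignment in place of a real hashing module are assumptions made for illustration; the estimate is noisy and will only be close to the true number of transitioning individuals.

```python
import math
import random


def unit_relative_variance(counts):
    """Subtract the mean count per group and scale the result to unit length."""
    mean = sum(counts) / len(counts)
    centered = [c - mean for c in counts]
    norm = math.sqrt(sum(x * x for x in centered))
    if norm == 0:
        raise ValueError("all groups have identical counts")
    return [x / norm for x in centered]


def estimate_transitions(n_a, n_b, population_size):
    """Correlation (dot product of unit vectors) times the population size,
    assuming equally sized populations in A and B."""
    v_a = unit_relative_variance(n_a)
    v_b = unit_relative_variance(n_b)
    correlation = sum(x * y for x, y in zip(v_a, v_b))
    return correlation * population_size


def simulate(num_groups=1000, population=10_000, transitioning=4_500, seed=0):
    """Draw random groups for A; B reuses the groups of the transitioning part."""
    rng = random.Random(seed)
    groups_a = [rng.randrange(num_groups) for _ in range(population)]
    newcomers = [rng.randrange(num_groups) for _ in range(population - transitioning)]
    groups_b = groups_a[:transitioning] + newcomers
    n_a, n_b = [0] * num_groups, [0] * num_groups
    for g in groups_a:
        n_a[g] += 1
    for g in groups_b:
        n_b[g] += 1
    return n_a, n_b


if __name__ == "__main__":
    n_a, n_b = simulate()
    print(round(estimate_transitions(n_a, n_b, 10_000)))
```

Only the two count vectors enter the estimate, which is why the calculation can run on data that holds no identifiers.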
The counters and/or correlation may be normalized or rescaled in any way as part of generating the estimate. The various calculations should be interpreted in a general sense and can be performed or approximated with any of a large number of possible variations in the order of operations and/or specific subroutines that implicitly perform effectively the same mapping between input and output data as the calculations mentioned herein in their most narrow sense. Such variations will be obvious to the skilled person and/or automatically designed, for example by compilers and/or various other systems and methods. In case of a slightly imperfect hash function the resulting error in the above assumptions can be partly compensated for by assuming E[v_a2 * v_b] = err, where err is some correlation in the data that can be estimated empirically by comparing two different independent samplings from the population (i.e. measuring traffic at two spots that can have no correlation with each other). The expectation then follows the following equality: c1 = E[(c1*v_a1 + c2*v_a2)*v_b] - err. This err term may for example be used as a baseline or part of a baseline.
Note that in the calculation above, we implicitly assume, for simplicity, that the total populations in A and B are identical and/or that B is greater. If the number of people in A is greater than in B, then the expected correlation, assuming everyone in B comes from A, needs to compensate for the variation in v_a caused by randomly selecting a subset of the population in A of size equal to the population in B. In other words, the group identifier counter for any randomly selected subset of A of size B is expected to deviate slightly from the group identifier counter of the whole population in A. This means that even if everyone in B comes from A the group identifier counters in A and B will not be perfectly correlated. By estimating the maximum expected correlation, compensated for the greater population size in A, when everyone in B has transitioned from A, we receive the expected correlation for this case. This can be used to set up a linear transform that is used to estimate any rate of transition by interpolating between the expected correlation when everyone in B has transitioned from A and the expected correlation when no person transitions from A. The details of the calculation of compensations due to population sizes such as described herein and similar variations and compensations on this theme will be obvious to the skilled person.
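One possible form of the population-size compensation is sketched below. The description does not give a formula; the sketch assumes that, for a deterministic hashing module with effectively uniform group assignment, the maximum expected correlation when B is a random subset of A is approximately sqrt(|B| / |A|). That expression and the function names are assumptions of this illustration.

```python
import math


def max_expected_correlation(size_a: int, size_b: int) -> float:
    """Assumed upper extreme: correlation expected when everyone in B comes
    from A and B is a random subset of A (approximately sqrt(|B| / |A|))."""
    if size_b <= 0 or size_b > size_a:
        raise ValueError("requires 0 < size_b <= size_a")
    return math.sqrt(size_b / size_a)


def interpolate_transitions(correlation: float, size_a: int, size_b: int,
                            baseline: float = 0.0) -> float:
    """Linear interpolation between the no-transition baseline and the
    everyone-transitioned extreme, scaled to the population in B."""
    upper = max_expected_correlation(size_a, size_b)
    fraction = (correlation - baseline) / (upper - baseline)
    return fraction * size_b
```

With equal population sizes the upper extreme is 1 and the sketch reduces to the simpler calculation in the earlier example, so a measured correlation of 0.45 with 10000 individuals in each state again gives about 4500.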
Optionally, the group identifier may be used to create a difference in the population. For example, a treatment and/or intervention may be randomized based on the group identifier. Even if the groups are effectively uncorrelated a priori, part of the system and/or method may introduce a difference. In this case, the intervention (or other introduced difference) is preferably chosen so as to maximize the difference between group identifiers, e.g. some intervention takes place for all members of some groups and no intervention (or an alternative intervention) takes place for all members of the other groups etc. The group identifier is then no longer independent of the aspect of interest to the study. Depending on what is being studied, the group identifier counter can in some of these cases be reduced to a counter of the various interventions for the purposes of studying the intervention. A larger number of group identifiers, on the other hand, allows for simultaneous study of other aspects of interest on the population.
In the following, a non-exhaustive number of non-limiting examples of specific technological applications will be outlined.
1. Anonymously comparing effectiveness of two different treatments. People volunteering to a study can be randomized into two groups, each being assigned a different treatment administered by a specialist. Their social security number is hashed into a group id and added to a group identifier counter. Three months after treatment their response to the treatment is recorded by a different specialist as one of five different categories. For each category the identity of the patient is again hashed and the result added to a group identifier counter.
By studying the correlation between the initial treatment groups and the outcomes the effect of each treatment can be studied blindly and fully anonymously without storing any personal data about the patients.
2. Anonymously comparing and/or studying the effect of diet on cardiac disease. A questionnaire with several variables describing the intake of various food types and cooking oils is sent by health authorities to all inhabitants in a city, encoded with pseudo-anonymous identifiers matched against a registry. When it is returned to the health authorities the social security number is retrieved from the registry and hashed into a group id without human intervention. Ten years later the possible responses to the questionnaire are divided into five types of diets. The social security numbers of all patients seeking treatment at hospitals in the city and all people deceased in the past ten years are hashed into group identities. For each diet type a group identifier counter is created and compared against the group identifier counters from the patients and the deceased, and the correlation between diet and hospital treatment rates and mortality is estimated anonymously. The raw estimate is then corrected for the response rate to the questionnaire, the age distribution and immigration and emigration numbers in order to achieve a smaller bias.
3. All wearable devices of a certain model that measures health variables have a unique MAC address. This MAC address is hashed into a group identifier and certain patterns describing the patient's heart function and step counter are uploaded to the key together with the group identifier. All data is time stamped.
The step counter is sent regularly. Over time it is possible to deduce, with the methods described herein, how the heart function changes into different patterns depending on the step count over 1, 2 and 3 months. This can further be divided into subpopulations depending on the starting pattern of the heart of the user. These combinations can be structured into a matrix form and used to create a Markov model that can guide exercise for a patient month-by-month.
4. Patients volunteer to a double-blind placebo-controlled study. Group identities are generated using their social security number. Each group identifier is assigned a batch of either medication or placebo, with the contents unknown to both patients and their caretakers. Three years later, half the group identifiers are randomly assigned to a rehabilitation treatment. Five years later, their social security number is again transformed into a group identifier and stored in a database together with details of their general health. The effect of the medication can easily be estimated by comparing groups that received treatment with other groups. At the same time, we can see the effect of rehabilitation both on the medicated group and the placebo group. In this example, the effect can be estimated even if, for example, the population studied at year five also contains other people not participating in the study.
In each of these examples, multiple assignments of the same individual to the same health state will naively be indistinguishable from multiple assignments from different individuals. As such, if the precise number of unique individuals is desired, a behavioural model may, as an example, be combined with the generated measure. We may for example measure the average number of recurring assignments using a related and/or different method to the one described herein.
Such a behavioural model can then be used, for example, as indicated in the more general description, to compensate the transition model by dividing the total number of assignments to a health state by the average number of recurring assignments and so generate a measure of the number of unique assignments. Many other types of behavioural models can also be fitted to the data using the general methodology described herein and complex behavioural models may result from the combination of several such submodels.
The whole population may also be divided into subpopulations of interest. For example, patients may be divided into subpopulations by criteria such as male/female, age, region, etc., before applying the hashing. Each subpopulation is then considered a separate population being studied for the purposes herein, even if the same hashing function may be shared across several subpopulations. This information can be stored as separate counters, or the additional information can be stored explicitly together with the group identifier.
These examples above are not exhaustive of the possibilities. Measures can be arrived at using several other linear methods, such as comparing any set of 2 or more numbers that are functions of the hash in a similar fashion as the example above.
It will be appreciated that the methods and devices described above can be combined and re-arranged in a variety of ways, and that the methods can be performed by one or more suitably programmed or configured digital signal processors and other known electronic circuits (e.g. discrete logic gates interconnected to perform a specialized function, or application-specific integrated circuits).
Many aspects of this invention are described in terms of sequences of actions that can be performed by, for example, elements of a programmable computer system.
The steps, functions, procedures and/or blocks described above may be implemented in hardware using any conventional technology, such as discrete circuit or integrated circuit technology, including both general-purpose electronic circuitry and application-specific circuitry.
Alternatively, at least some of the steps, functions, procedures and/or blocks described above may be implemented in software for execution by a suitable computer or processing device such as a microprocessor, Digital Signal Processor (DSP) and/or any suitable programmable logic device such as a Field Programmable Gate Array (FPGA) device and a Programmable Logic Controller (PLC) device.
It should also be understood that it may be possible to re-use the general processing capabilities of any device in which the invention is implemented. It may also be possible to re-use existing software, e.g. by reprogramming of the existing software or by adding new software components.
It is also possible to provide a solution based on a combination of hardware and software. The actual hardware-software partitioning can be decided by a system designer based on a number of factors including processing speed, cost of implementation and other requirements.
FIG. 10 is a schematic diagram illustrating an example of a computer-implementation 100 according to an embodiment. In this particular example, at least some of the steps, functions, procedures, modules and/or blocks described herein are implemented in a computer program 125; 135, which is loaded into the memory 120 for execution by processing circuitry including one or more processors 110. The processor(s) 110 and memory 120 are interconnected to each other to enable normal software execution. An optional input/output device may also be interconnected to the processor(s) 110 and/or the memory 120 to enable input and/or output of relevant data such as input parameter(s) and/or resulting output parameter(s).
The term "processor" should be interpreted in a general sense as any system or device capable of executing program code or computer program instructions to perform a particular processing, determining or computing task.
The processing circuitry including one or more processors 110 is thus configured to perform, when executing the computer program 125, well-defined processing tasks such as those described herein.
In particular, the proposed technology provides a computer program comprising instructions, which when executed by at least one processor, cause the at least one processor to perform the computer-implemented method described herein.
The processing circuitry does not have to be dedicated to only execute the above-described steps, functions, procedures and/or blocks, but may also execute other tasks.
Moreover, this invention can additionally be considered to be embodied entirely within any form of computer-readable storage medium having stored therein an appropriate set of instructions for use by or in connection with an instruction-execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch instructions from a medium and execute the instructions.
The software may be realized as a computer program product, which is normally carried on a non-transitory computer-readable medium, for example a CD, DVD, USB memory, hard drive or any other conventional memory device. The software may thus be loaded into the operating memory of a computer or equivalent processing system for execution by a processor. The computer/processor does not have to be dedicated to only execute the above-described steps, functions, procedures and/or blocks, but may also execute other software tasks.
The flow diagram or diagrams presented herein may be regarded as a computer flow diagram or diagrams, when performed by one or more processors. A corresponding apparatus may be defined as a group of function modules, where each step performed by the processor corresponds to a function module. In this case, the function modules are implemented as a computer program running on the processor.
The computer program residing in memory may thus be organized as appropriate function modules configured to perform, when executed by the processor, at least part of the steps and/or tasks described herein.
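By way of a non-limiting illustration, the organization of a computer program into function modules can be sketched as follows. This is a minimal sketch only, not the implementation of the invention: the module names, the use of SHA-256, the number of groups and the simple per-group counter are all assumptions introduced here for illustration, chosen to echo the hash and group identifier concepts referred to in this document.

```python
import hashlib
from collections import Counter
from typing import Iterable

NUM_GROUPS = 8  # assumed number of group identifiers (illustrative value)


def group_identifier_module(identifying_data: str) -> int:
    """Function module for one step: map identifying data to a coarse
    group identifier using a one-way hash (SHA-256 is an assumption)."""
    digest = hashlib.sha256(identifying_data.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_GROUPS


def counting_module(group_ids: Iterable[int]) -> Counter:
    """Function module for a further step: count records per group
    identifier, so that only group-level data is kept."""
    return Counter(group_ids)


def run_program(records: Iterable[str]) -> Counter:
    """The computer program: chains the function modules, with each
    processing step corresponding to one module."""
    group_ids = (group_identifier_module(record) for record in records)
    return counting_module(group_ids)
```

Calling `run_program` on a list of hypothetical identifier strings returns a counter keyed by group identifier. Because each step is a separate function, a single module could be replaced by a hardware realization without changing the others, which is the point of the modular organization.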
Alternatively, it is possible to realize the module(s) predominantly by hardware modules, or alternatively by hardware, with suitable interconnections between relevant modules. Particular examples include one or more suitably configured digital signal processors and other known electronic circuits, e.g. discrete logic gates interconnected to perform a specialized function, and/or Application Specific Integrated Circuits (ASICs) as previously mentioned. Other examples of usable hardware include input/output (I/O) circuitry and/or circuitry for receiving and/or sending signals. The extent of software versus hardware is purely a matter of implementation choice.
It is becoming increasingly popular to provide computing services (hardware and/or software) where the resources are delivered as a service to remote locations over a network. By way of example, this means that functionality, as described herein, can be distributed or re-located to one or more separate physical nodes or servers. The functionality may be re-located or distributed to one or more jointly acting physical and/or virtual machines that can be positioned in separate physical node(s), i.e. in the so-called cloud. This is sometimes also referred to as cloud computing, which is a model for enabling ubiquitous on-demand network access to a pool of configurable computing resources such as networks, servers, storage, applications and general or customized services.
The embodiments described above are to be understood as a few illustrative examples of the present invention. It will be understood by those skilled in the art that various modifications, combinations and changes may be made to the embodiments without departing from the scope of the present invention. In particular, different part solutions in the different embodiments can be combined in other configurations, where technically possible.
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| SE2000041A SE544098C2 (en) | 2020-02-25 | 2020-02-25 | Method and system for anonymously tracking and/or analysing health states in a population |
| US17/059,366 US20210366603A1 (en) | 2019-09-25 | 2020-08-26 | Methods for anonymously tracking and/or analysing health in a population of subjects |
| PCT/IB2020/057982 WO2021059053A1 (en) | 2019-09-25 | 2020-08-26 | Methods and systems for anonymously tracking and/or analysing health in a population of subjects |
| US17/247,530 US11404167B2 (en) | 2019-09-25 | 2020-12-15 | System for anonymously tracking and/or analysing health in a population of subjects |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| SE2000041A SE544098C2 (en) | 2020-02-25 | 2020-02-25 | Method and system for anonymously tracking and/or analysing health states in a population |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| SE2000041A1 (en) | 2021-08-26 |
| SE544098C2 (en) | 2021-12-21 |
Family
ID=77663150
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| SE2000041A SE544098C2 (en) | 2019-09-25 | 2020-02-25 | Method and system for anonymously tracking and/or analysing health states in a population |
Country Status (1)
| Country | Link |
|---|---|
| SE (1) | SE544098C2 (en) |
Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20120204026A1 (en) * | 2011-02-04 | 2012-08-09 | Palo Alto Research Center Incorporated | Privacy-preserving aggregation of time-series data |
| US8972187B1 (en) * | 2013-06-28 | 2015-03-03 | Google Inc. | Varying the degree of precision in navigation data analysis |
| US20150149208A1 (en) * | 2013-11-27 | 2015-05-28 | Accenture Global Services Limited | System for anonymizing and aggregating protected health information |
| GB2549786A (en) * | 2016-04-29 | 2017-11-01 | Fujitsu Ltd | A system and method for storing and controlling access to behavioural data |
| US20180307859A1 (en) * | 2013-11-01 | 2018-10-25 | Anonos Inc. | Systems and methods for enforcing centralized privacy controls in de-centralized systems |
| US20190026491A1 (en) * | 2017-07-24 | 2019-01-24 | Mediasift Limited | Event processing system |
| US20190073489A1 (en) * | 2017-09-05 | 2019-03-07 | Qualcomm Incorporated | Controlling access to data in a health network |
| WO2020050760A1 (en) * | 2018-09-07 | 2020-03-12 | Indivd Ab | System and method for handling anonymous biometric and/or behavioural data |
Non-Patent Citations (3)
| Title |
|---|
| Chris Clifton; Murat Kantarcioglu; Jaideep Vaidya; Xiaodong Lin; Michael Y Zhu, Tools for privacy preserving distributed data mining, ACM SIGKDD explorations newsletter, 2002-12-01, Association for Computing Machinery, Inc, US, 4, 28-34 doi:10.1145/772862.772867 * |
| Jin Hao; Luo Yan; Li Peilong; Mathew Jomol, A Review of Secure and Privacy-Preserving Medical Data Sharing, IEEE Access, USA, pg 61656 - 61669, 2019-05-23, doi:10.1109/ACCESS.2019.2916503 * |
| Rebollo-Monedero David; Forné Jordi; Soriano Miguel; Puiggalí Allepuz Jordi, k-Anonymous microaggregation with preservation of statistical dependence, INFORMATION SCIENCES, 2016-01-07, AMSTERDAM, NL, 342, 1-23, doi:10.1016/j.ins.2016.01.012 * |
Also Published As
| Publication number | Publication date |
|---|---|
| SE544098C2 (en) | 2021-12-21 |