WO2025145004A1 - Methods, systems, compositions and kits for target detection - Google Patents
Methods, systems, compositions and kits for target detection
- Publication number
- WO2025145004A1 (application PCT/US2024/062056)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- detection
- nucleic acid
- target nucleic
- oligonucleotide
- hypercode
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6813—Hybridisation assays
- C12Q1/6816—Hybridisation assays characterised by the detection means
- C12Q1/682—Signal amplification
Definitions
- genomic and multi-omic testing can enable higher precision in clinical decision making, from assessing a subject's risk of disease, through disease diagnosis and treatment monitoring for response and disease recurrence, to informing which drug and treatment regimen may be most beneficial. In short, such testing can enable personalized patient care.
- aspects disclosed herein provide methods for determining the presence of one or more target nucleic acid molecules, the method comprising: (a) providing a plurality of target nucleic acid molecules from a sample; (b) providing a plurality of recognition elements, wherein each recognition element of the plurality of recognition elements comprises: (i) one or more target recognition regions complementary to regions of a target nucleic acid molecule of the plurality of target nucleic acid molecules; and (ii) a hypercode from a set of hypercodes, wherein the hypercode from the set of hypercodes is associated with one target nucleic acid molecule of the plurality of target nucleic acid molecules, and wherein the hypercode from the set of hypercodes comprises a plurality of segments that corresponds to at least two computational states from a set of computational states; (c) amplifying a subset of the plurality of recognition elements that are hybridized to the plurality of target nucleic acid molecules to produce a plurality of concatemeric amplification products; (d) hybridizing
- the detection polynucleotide complex of the plurality of detection polynucleotide complexes or the detection oligonucleotide comprises one or more detection moieties, wherein the one or more detection moieties comprises an organic dye, a fluorophore, a quantum dot, or a combination thereof. In some embodiments, the one or more detection moieties comprises the fluorophore. In some embodiments, the detection polynucleotide complex of the plurality of detection polynucleotide complexes or the detection oligonucleotide comprises one or more optically distinct fluorescent moieties.
- the plurality of detection polynucleotide complexes comprises two or more distinct detection polynucleotide complexes, and wherein the detection oligonucleotide of each of the two or more distinct detection polynucleotide complexes comprises one or more optically distinct fluorescent moieties.
- the plurality of detection polynucleotide complexes comprises three or more distinct detection polynucleotide complexes.
- the plurality of detection polynucleotide complexes comprises four or more distinct detection polynucleotide complexes.
- the detection oligonucleotide comprises a length of 5 to 25 nucleotides.
- the anchor oligonucleotide comprises a length of 20 to 100 nucleotides. In some embodiments, the anchor oligonucleotide comprises a length of 40 to 50 nucleotides. In some embodiments, the method further comprises forming the plurality of detection polynucleotide complexes. In some embodiments, forming the plurality of detection polynucleotide complexes comprises hybridizing the detection oligonucleotide to the anchor oligonucleotide prior to the hybridizing of (d).
- the hybridizing of (d) comprises hybridizing the detection polynucleotide complex of the plurality of detection polynucleotide complexes to the hypercode of the set of hypercodes on the plurality of concatemeric amplification products.
- forming the plurality of detection polynucleotide complexes comprises hybridizing the detection oligonucleotide to the anchor oligonucleotide concurrently with hybridizing the first portion of the anchor oligonucleotide to the portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products.
- forming the plurality of detection polynucleotide complexes comprises hybridizing the detection oligonucleotide to the anchor oligonucleotide after hybridizing the first portion of the anchor oligonucleotide to the portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products. In some embodiments, forming the plurality of detection polynucleotide complexes comprises hybridizing the detection oligonucleotide to the anchor oligonucleotide substantially simultaneously with hybridizing the first portion of the anchor oligonucleotide to the portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products.
- the sample comprises a tissue, one or more cells, plasma, blood, urine, or a combination thereof.
- the method further comprises extracting the plurality of target nucleic acid molecules from the tissue, the one or more cells, the plasma, the blood, the urine, or a combination thereof.
- the plurality of target nucleic acid molecules comprises DNA.
- the plurality of target nucleic acid molecules comprises RNA.
- the RNA comprises mRNA.
- the method further comprises (g) imaging the detection polynucleotide complex of the plurality of detection polynucleotide complexes hybridized to the at least a portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products.
- the method further comprises repeating (d) through (g) for each segment of the plurality of segments of the hypercode from each concatemeric amplification product of the plurality of concatemeric amplification products.
- the repeating of (d) through (g) comprises 2 to 15 iterative repetitions. In some embodiments, the repeating of (d) through (g) comprises 2 to 10 iterative repetitions.
- the detecting of (e) comprises fluorescence detection. In some embodiments, the method further comprises applying a soft decision algorithm to a detected hypercode profile. In some embodiments, the method further comprises performing (b) through (g) concurrently for each target nucleic acid molecule from the plurality of target nucleic acid molecules from the sample. In some embodiments, the plurality of target nucleic acid molecules comprises 10 to 10,000 target nucleic acid molecules. In some embodiments, the plurality of target nucleic acid molecules comprises 10 to 1,000 target nucleic acid molecules. In some embodiments, each recognition element of the plurality of recognition elements comprises an amplification primer binding sequence. In some embodiments, the plurality of recognition elements comprises 10 to 10,000 recognition elements.
- the plurality of recognition elements comprises 10 to 1,000 recognition elements. In some embodiments, the plurality of segments comprises 2 to 10 segments. In some embodiments, each segment of the plurality of segments comprises a length of 5 to 30 contiguous nucleotides. In some embodiments, each segment of the plurality of segments comprises a length of 5 to 25 contiguous nucleotides. In some embodiments, each segment of the plurality of segments comprises at least 5 contiguous nucleotides. In some embodiments, the at least 5 contiguous nucleotides each correspond to a computational state from the set of computational states, wherein each computational state is different from the other computational states of the set of computational states. In some embodiments, the set of computational states comprises 5 to 30 computational states.
- the method further comprises determining a Hamming distance between any two hypercodes of the set of hypercodes. In some embodiments, the method further comprises determining a Hamming distance between any two segments of a hypercode of the set of hypercodes. In some embodiments, the Hamming distance is 2 to 8. In some embodiments, the method further comprises repeating (d) through (f). In some embodiments, the repeating of (d) through (f) comprises two repeats per segment of the plurality of segments. In some embodiments, the repeating of (d) through (f) comprises three repeats per segment of the plurality of segments. In some embodiments, the repeating of (d) through (f) comprises four repeats per segment of the plurality of segments.
- the one or more target recognition regions in each of the recognition elements of the plurality of recognition elements comprises a 5’ arm and a 3’ arm, wherein the 5’ arm and the 3’ arm each comprise a sequence complementary to a portion of the target nucleic acid molecule.
- the method further comprises providing a splint oligonucleotide probe comprising a 3’ region and a 5’ region, wherein the 5’ arm of the recognition element is complementary to the 3’ region of the splint oligonucleotide probe and the 3’ arm of the recognition element is complementary to the 5’ region of the splint oligonucleotide probe.
- the method further comprises ligating and circularizing the plurality of recognition elements that are hybridized to the plurality of target nucleic acid molecules.
- the amplifying comprises performing rolling circle amplification or multiple strand displacement amplification.
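The Hamming-distance embodiments above can be illustrated with a minimal sketch. It assumes each hypercode is represented as a tuple of computational states, one per segment; the example codes, the state labels 0 to 3, and the function names are hypothetical and are not taken from the disclosure. The error-detection and error-correction bounds are the standard ones for a block code with minimum distance d.

```python
from itertools import combinations


def hamming(a, b):
    """Number of segment positions at which two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same number of segments")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(codes):
    """Smallest Hamming distance over all pairs in the code set."""
    return min(hamming(a, b) for a, b in combinations(codes, 2))


# Hypothetical set of 4-segment hypercodes over computational states 0-3.
codes = [(0, 0, 0, 0), (1, 1, 1, 0), (2, 2, 0, 2), (3, 0, 3, 3)]

d_min = min_pairwise_distance(codes)
detectable = d_min - 1          # segment errors that can always be detected
correctable = (d_min - 1) // 2  # segment errors that can always be corrected

print(d_min, detectable, correctable)
```

A larger minimum distance tolerates more miscalled segments, at the cost of fewer usable codes for a given number of segments and states.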
- aspects disclosed herein provide methods for determining the presence of one or more target nucleic acid molecules, the method comprising: (a) providing a plurality of target nucleic acid molecules from a sample; (b) providing a plurality of recognition elements, wherein each recognition element of the plurality of recognition elements comprises: (i) one or more target recognition regions complementary to regions of a target nucleic acid molecule of the plurality of target nucleic acid molecules; and (ii) a hypercode from a set of hypercodes, wherein the hypercode from the set of hypercodes is associated with one target nucleic acid molecule of the plurality of target nucleic acid molecules, and wherein the hypercode from the set of hypercodes comprises a plurality of segments that corresponds to at least two computational states from a set of computational states; (c) amplifying a subset of the plurality of recognition elements that are hybridized to the plurality of target nucleic acid molecules to produce a plurality of concatemeric amplification products; (d) hybridizing
- the detection oligonucleotide comprises one or more detection moieties, wherein the one or more detection moieties comprises an organic dye, a fluorophore, a quantum dot, or a combination thereof. In some embodiments, the one or more detection moieties comprises the fluorophore. In some embodiments, the detection oligonucleotide comprises one or more optically distinct fluorescent moieties. In some embodiments, the plurality of detection oligonucleotides comprises two or more distinct detection oligonucleotides. In some embodiments, the plurality of detection oligonucleotides comprises three or more distinct detection oligonucleotides.
- the plurality of detection oligonucleotides comprises four or more distinct detection oligonucleotides. In some embodiments, the detection oligonucleotide comprises a length of 5 to 25 nucleotides.
- the sample comprises a tissue, one or more cells, plasma, blood, urine, or a combination thereof.
- the method further comprises extracting the plurality of target nucleic acid molecules from the tissue, the one or more cells, the plasma, the blood, the urine, or a combination thereof.
- the plurality of target nucleic acid molecules comprises DNA. In some embodiments, the plurality of target nucleic acid molecules comprises RNA. In some embodiments, the RNA comprises mRNA.
- the method further comprises (g) imaging the detection oligonucleotide of the plurality of detection oligonucleotides hybridized to the portion of the hypercode of the concatemeric amplification product of the plurality of concatemeric amplification products. In some embodiments, the method further comprises repeating (d) through (g) for each segment of the plurality of segments of the hypercode from each concatemeric amplification product of the plurality of concatemeric amplification products. In some embodiments, the repeating of (d) through (g) comprises 2 to 15 iterative repetitions. In some embodiments, the repeating of (d) through (g) comprises 2 to 10 iterative repetitions.
- the method further comprises performing (b) through (g) concurrently for each target nucleic acid molecule from the plurality of target nucleic acid molecules from the sample.
- the detecting of (e) comprises fluorescence detection.
- the method further comprises applying a soft decision algorithm to a detected hypercode profile.
- the plurality of target nucleic acid molecules comprises 10 to 10,000 target nucleic acid molecules.
- the plurality of target nucleic acid molecules comprises 10 to 1,000 target nucleic acid molecules.
- each recognition element of the plurality of recognition elements comprises an amplification primer binding sequence.
- the plurality of recognition elements comprises 10 to 10,000 recognition elements.
- the plurality of recognition elements comprises 10 to 1,000 recognition elements. In some embodiments, the plurality of segments comprises 2 to 10 segments. In some embodiments, each segment of the plurality of segments comprises a length of 5 to 30 contiguous nucleotides. In some embodiments, each segment of the plurality of segments comprises a length of 5 to 25 contiguous nucleotides. In some embodiments, each segment of the plurality of segments comprises at least 5 contiguous nucleotides. In some embodiments, the at least 5 contiguous nucleotides each correspond to a computational state from the set of computational states, wherein each computational state is different from other computational states of the set of computational states. In some embodiments, the set of computational states comprises 5 to 30 computational states.
- the method further comprises determining a Hamming distance between any two hypercodes of the set of hypercodes. In some embodiments, the method further comprises determining a Hamming distance between any two segments of a hypercode. In some embodiments, the Hamming distance is 2 to 8. In some embodiments, the method further comprises repeating (d) through (f). In some embodiments, the repeating of (d) through (f) comprises two repeats per segment of the plurality of segments. In some embodiments, the repeating of (d) through (f) comprises three repeats per segment of the plurality of segments. In some embodiments, the repeating of (d) through (f) comprises four repeats per segment of the plurality of segments.
- the one or more target recognition regions in each of the recognition elements of the plurality of recognition elements comprises a 5’ arm and a 3’ arm, wherein the 5’ arm and the 3’ arm each comprise a sequence complementary to a portion of the target nucleic acid molecule.
- the method further comprises providing a splint oligonucleotide probe comprising a 3’ region and a 5’ region, wherein the 5’ arm of the recognition element is complementary to the 3’ region of the splint oligonucleotide probe and the 3’ arm of the recognition element is complementary to the 5’ region of the splint oligonucleotide probe.
- the method further comprises ligating and circularizing the plurality of recognition elements that are hybridized to the plurality of target nucleic acid molecules.
- the amplifying comprises performing rolling circle amplification or multiple strand displacement amplification.
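The segment and computational-state ranges above determine how many distinct hypercodes are available. As a rough sketch, assuming every segment can take any state independently, the code space is the number of states raised to the number of segments. The function names below are hypothetical, and the calculation ignores any minimum Hamming distance requirement, which would reduce the usable subset.

```python
def code_space_size(n_states, n_segments):
    """Number of distinct codes when each segment takes one of n_states."""
    return n_states ** n_segments


def min_segments(n_states, n_targets):
    """Smallest segment count whose unconstrained code space covers n_targets."""
    if n_states < 2:
        raise ValueError("at least two computational states are required")
    segments = 1
    while code_space_size(n_states, segments) < n_targets:
        segments += 1
    return segments


# Example: 5 computational states and 10,000 targets.
print(code_space_size(5, 5))    # 3125, too small for 10,000 targets
print(min_segments(5, 10_000))  # 6, since 5**6 = 15625
```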
- aspects disclosed herein provide systems, comprising: (a) a plurality of recognition elements, wherein each recognition element of the plurality of recognition elements comprises:
- a hypercode from a set of hypercodes wherein the hypercode from the set of hypercodes is associated with each target nucleic acid molecule of the plurality of target nucleic acid molecules from the sample including the corresponding target nucleic acid molecule in (i), and wherein the hypercode comprises a plurality of segments that corresponds to at least two computational states of a set of computational states; and (b) a plurality of detection polynucleotide complexes comprising: (i) a detection oligonucleotide; and (ii) an anchor oligonucleotide, wherein: (1) a first portion of the anchor oligonucleotide is complementary to a segment of the hypercode or a portion thereof; and
- a second portion of the anchor oligonucleotide is complementary to a portion of the detection oligonucleotide.
- the sample comprises a tissue, one or more cells, plasma, blood, urine, or a combination thereof.
- the plurality of target nucleic acid molecules comprises DNA.
- the plurality of target nucleic acid molecules comprises RNA.
- the RNA comprises mRNA.
- each segment of the plurality of segments comprises at least 5 contiguous nucleotides, wherein the at least 5 contiguous nucleotides each correspond to a computational state that is different from another computational state of the set of computational states.
- the set of computational states comprises 2 to 20 computational states. In some embodiments, the set of computational states comprises 2 to 10 computational states. In some embodiments, the set of computational states comprises 4 computational states. In some embodiments, the set of codes comprises a Hamming distance of any two codes of the set of codes of 3 to 5. In some embodiments, the set of codes comprises a Hamming distance of any two codes of the set of codes of 3. In some embodiments, the set of codes comprises a Hamming distance of any two codes of the set of codes of 4. In some embodiments, the set of codes comprises a Hamming distance of any two codes of the set of codes of 5. In some embodiments, the detection oligonucleotide comprises one or more fluorescent molecules.
- the one or more fluorescent molecules comprises an organic dye, a biological fluorophore, a quantum dot, or a combination thereof.
- the detection oligonucleotide comprises a length of 5 to 10 nucleotides.
- the anchor oligonucleotide comprises a length of 10 to 25 nucleotides.
- the plurality of detection polynucleotide complexes comprises 2 or more species of the plurality of detection polynucleotide complexes, and wherein the detection oligonucleotide of each of the 2 or more species comprises one or more fluorescent molecules, wherein the one or more fluorescent molecules are optically distinct.
- the plurality of detection polynucleotide complexes comprises 3 or more species of the plurality of detection polynucleotide complexes, and wherein the detection oligonucleotide of each of the 3 or more species comprises one or more fluorescent molecules, wherein the one or more fluorescent molecules are optically distinct.
- the plurality of detection polynucleotide complexes comprises 4 or more species of the plurality of detection polynucleotide complexes, and wherein the detection oligonucleotide of each of the 4 or more species comprises one or more fluorescent molecules, wherein the one or more fluorescent molecules are optically distinct.
- the system further comprises a solid substrate configured to immobilize nucleic acids.
- the solid substrate comprises a welled plate or a flow cell, wherein a surface of the welled plate or a surface of the flow cell comprises a cation-coating layer coupled thereto.
- the system further comprises (c) a fluid flow controller; (d) an imaging system; (e) a computer system; or (f) any combination of (c) to (e).
- the fluid flow controller comprises one or more pumps, valves, mixing manifolds, reagent reservoirs, waste reservoirs, or any combination thereof.
- the fluid flow controller is configured to provide programmable control of fluid flow velocity, volumetric fluid flow rate, timing of reagent or buffer introduction, or any combination thereof.
- a detection polynucleotide complex is bound to a recognition element of the plurality of recognition elements, or a concatemeric amplification product thereof to form a detectable binding complex.
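The complementarity relationships in the detection polynucleotide complex (anchor to hypercode segment, anchor to detection oligonucleotide) can be sketched with a reverse-complement helper. The sequences, the function names, and the order in which the two anchor portions are joined are all assumptions made for illustration; the disclosure does not specify them here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_anchor(segment, detection_oligo):
    """Hypothetical anchor: a first portion complementary to the hypercode
    segment followed by a second portion complementary to the detection
    oligonucleotide. The portion order is an assumption."""
    return reverse_complement(segment) + reverse_complement(detection_oligo)


# Hypothetical sequences, not taken from the disclosure.
segment = "ACGTTGCAAGGCTTAC"   # 16-nt hypercode segment
detection_oligo = "GATTACAGG"  # 9-nt detection oligonucleotide

anchor = design_anchor(segment, detection_oligo)
print(anchor, len(anchor))     # 25-nt anchor
```

With these lengths the anchor falls at the upper end of the 10 to 25 nucleotide range and the detection oligonucleotide within the 5 to 10 nucleotide range recited for the system embodiments.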
- FIG. 2 shows a schematic diagram of an example of a recognition element and a target nucleic acid molecule according to some embodiments herein.
- FIG. 3 shows a schematic diagram illustrating non-limiting examples of factors considered in the design of recognition elements according to some embodiments herein.
- FIG. 9A shows a dual ligation recognition element strategy for use when a target of interest has high homology with a homolog or pseudogene.
- FIG. 10A shows an image illustrating examples of detection oligonucleotides according to some embodiments herein including an example of a structure of a detector oligonucleotide.
- FIG. 10B shows an image illustrating examples of detection oligonucleotides according to some embodiments herein including four optically distinct detector oligonucleotides.
- FIG. 10C shows an image illustrating examples of detection oligonucleotides according to some embodiments herein including four non-optically distinct detector oligonucleotides.
- FIG. 12A shows schematic illustrations of detection polynucleotide complexes including a schematic diagram illustrating an example of a pool of 16 anchor oligonucleotides and four different detector oligonucleotides.
- FIG. 12B shows schematic illustrations of detection polynucleotide complexes including an image illustrating an example of one detection polynucleotide complex hybridized to a portion of a code (e.g., segment) of a recognition element.
- FIG. 13A shows schematic illustrations of detection polynucleotide complexes including a schematic diagram illustrating a pool of 16 anchor oligonucleotides and four detection oligonucleotides.
- FIG. 13B shows schematic illustrations of detection polynucleotide complexes including a pool of 16 detection polynucleotide complexes where each potential detection polynucleotide complex comprises a different combination of anchor oligonucleotide and detection oligonucleotide pairs.
- FIG. 14 shows an example of an encoded assay workflow for detecting methylated DNA.
- FIG. 15 shows an example of a table illustrating how states are assigned using four colors in a two flow decode system.
- FIG. 16A shows representative intensity profiles for hypercodes fluorescently labeled in concatemeric amplification products including profiles that decode to a hypercode with median skew.
- FIG. 16B shows representative intensity profiles for hypercodes fluorescently labeled in concatemeric amplification products including profiles that decode to a hypercode with high skew.
- FIG. 16C shows representative intensity profiles for hypercodes fluorescently labeled in concatemeric amplification products including profiles that decode to a hypercode with high skew.
- FIG. 17 is a table illustrating examples of sequences of 64 unique detection polynucleotide complexes used to identify 64 possible codes.
- FIG. 18 is an example of a table of the permutations that may be used to achieve a relatively large combination code space from which to select a subset of codes for detecting and decoding a recognition element code.
- FIG. 19A shows examples of plots and graphs demonstrating a hypercoding strategy for enabling accurate, high-plexity detection of concatemeric products, including (i) cycles of hypercode detection and (ii) representative intensity vectors of both a learned profile and the ideal intensity profile.
- FIG. 19B shows examples of plots and graphs demonstrating a hypercoding strategy for enabling accurate, high-plexity detection of concatemeric products, including a graph demonstrating an example of a profile skew.
- FIG. 19C shows examples of plots and graphs demonstrating a hypercoding strategy for enabling accurate, high-plexity detection of concatemeric products, including a graph demonstrating an example of a profile score.
- FIG. 19D shows examples of plots and graphs demonstrating a hypercoding strategy for enabling accurate, high-plexity detection of concatemeric products, including a graph demonstrating an example of decoded concatemeric amplification products for a range of target input concentrations.
- FIG. 19E shows examples of plots and graphs demonstrating a hypercoding strategy for enabling accurate, high-plexity detection of concatemeric products, including a graph demonstrating decoded error rates as a function of target input concentration.
- FIG. 22 is a schematic diagram of an example of a soft decision decoding algorithm of the present disclosure.
- FIG. 25 shows a non-limiting example of a web/mobile application provision system; in this case, a system providing browser-based and/or native mobile user interfaces.
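The four-color, two-flow state assignment described for FIG. 15 can be sketched as an enumeration of ordered color pairs. The channel names and the particular numbering of states below are hypothetical; the actual table of FIG. 15 is not reproduced in this text, so this only illustrates that four colors read over two flows yield 16 distinguishable combinations.

```python
from itertools import product

# Hypothetical channel names for four optically distinct labels.
COLORS = ("blue", "green", "yellow", "red")


def assign_states(colors, n_flows):
    """Map each ordered color combination across n_flows to a state index."""
    combos = product(colors, repeat=n_flows)
    return {combo: state for state, combo in enumerate(combos)}


states = assign_states(COLORS, 2)

print(len(states))                # 16 combinations from 4 colors x 2 flows
print(states[("blue", "green")])  # 1 under this arbitrary numbering
```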
- NGS Next generation sequencing
- qPCR quantitative polymerase chain reaction
- the target molecule may be a nucleic acid molecule from a sample (e.g., a biological sample) or a nucleic acid molecule serving as a surrogate of a target molecule that is other than a nucleic acid molecule (e.g., polypeptide, sugar, metabolite, etc.).
- Methods, systems, compositions and kits of the present disclosure provide encoded assays (or components of the encoded assays) comprising a recognition element that uniquely recognizes and binds to a target molecule from a sample under conditions sufficient that the recognition element undergoes a molecular transformation in the presence (but not in the absence) of the target molecule to produce a modified recognition element.
- the detection polynucleotide complexes of the present disclosure may include a detection oligonucleotide having a detectable label and a nucleic acid sequence configured to bind to an anchor oligonucleotide, wherein the anchor oligonucleotide is configured to bind to a segment of the code and the detection oligonucleotide, as shown for example in FIG. 12B.
- a recognition element comprises a circularizable linear DNA molecule that comprises a code or hypercode unique to a particular target of interest that when hybridized to target sequences can conform into a padlock probe configuration.
- a recognition element in the presence of the complementary genomic sequence, wild type or variant, can hybridize to its intended target sequences of interest and undergo a conformational change into a circular DNA molecule; the two adjacent sequences of the circularized recognition element can then be ligated together to form a modified, circularized recognition element.
- the modified, circularized recognition element can be amplified to increase the number of hypercodes for robust downstream fluorescence detection.
- the methods described herein relate to determining the presence of a target molecule from a sample.
- the presence of the target molecule is determined by introducing a plurality of recognition elements to the sample, wherein each recognition element comprises a target recognition region specific to a target in the sample under conditions sufficient to bind the recognition elements to the respective targets.
- the target recognition regions are complementary to the respective target nucleic acid molecule.
- the recognition elements bound to the respective target molecules may be selectively amplified to produce a plurality of amplification products comprising amplified codes.
- the amplification products are immobilized on a substrate (e.g., welled plate, flow cell).
- the amplification to generate amplification products is performed in solution.
- the amplification to generate amplification products is performed on a substrate.
- a plurality of detection polynucleotide complexes is introduced to the plurality of amplification products.
- a detection oligonucleotide and an anchor oligonucleotide are added to the amplification products, wherein the detection oligonucleotide and the anchor oligonucleotide assemble to form a detection polynucleotide complex substantially simultaneously with the anchor oligonucleotide hybridizing to a code or a portion of a code.
- the detection polynucleotide complex comprises a detection oligonucleotide and an anchor oligonucleotide.
- a portion of an anchor oligonucleotide may be complementary to at least a portion of the amplification product.
- Another portion of the anchor oligonucleotide may be complementary to at least a portion of the detection oligonucleotide.
- the detectable binding complexes are imaged with an imaging system disclosed elsewhere herein to obtain signals associated with the code for each amplification product. This process may be repeated for each segment of each code, thereby building a color profile for each code.
- methods comprising analyzing a plurality of target nucleic acid molecules from a sample, providing a plurality of recognition elements, wherein each recognition element of the plurality comprises one or more target recognition regions complementary to a corresponding target nucleic acid molecule of the plurality of target nucleic acid molecules; and a code from a set of codes, wherein the code is associated with one or more target nucleic acid molecules of the plurality from the sample including the corresponding target nucleic acid molecule.
- the code can comprise a plurality of segments that corresponds to at least two computational states of a set of computational states.
- the methods comprise selectively amplifying a subset of the plurality of recognition elements bound to the plurality of target nucleic acid molecules to produce a plurality of amplification products, wherein each amplification product comprises a code, introducing a plurality of detection polynucleotide complexes to the plurality of amplification products, wherein each detection polynucleotide complex of the plurality comprises a detection oligonucleotide and an anchor oligonucleotide, wherein a portion of each anchor oligonucleotide is complementary to at least a portion of the amplification product of the plurality of amplification products, and another portion of the anchor oligonucleotide is complementary to at least a portion of the detection oligonucleotide forming a plurality of detectable binding complexes.
- each detectable binding complex comprises a detection polynucleotide complex bound to an amplification product of the plurality of amplification products, further including imaging the plurality of detectable binding complexes to obtain signals associated with the different segments of the plurality of segments of each code of the plurality of codes for each amplification product of the plurality of amplification products, iteratively repeating the operations of the introducing, forming, and imaging for each segment of each code of the plurality of codes and applying a soft decision decoding algorithm to the plurality of codes to predict a presence of the one or more target nucleic acid molecules from the sample.
- methods comprising analyzing a plurality of target nucleic acid molecules from a sample, providing a plurality of recognition elements, wherein each recognition element of the plurality comprises one or more target recognition regions complementary to a corresponding target nucleic acid molecule of the plurality of target nucleic acid molecules; and a code from a set of codes, wherein the code is associated with one or more target nucleic acid molecules of the plurality from the sample including the corresponding target nucleic acid molecule.
- the methods comprise introducing a plurality of detection oligonucleotides to the plurality of amplification products, wherein each detection oligonucleotide comprises a sequence complementary to a portion of a code and further comprises a detectable moiety and, when bound to a segment of the code, is called a detectable binding complex, wherein each detectable binding complex comprises a detection oligonucleotide bound to an amplification product of the plurality of amplification products, imaging the plurality of detectable binding complexes to obtain signals associated with the different segments of the plurality of segments of each code of the plurality of codes for each amplification product of the plurality of amplification products, iteratively repeating the operations of the introducing, forming, and imaging for each segment of each code of the plurality of codes and applying a soft decision decoding algorithm to the plurality of codes to predict a presence of the one or more target nucleic acid molecules from the sample.
- the methods described herein may comprise iteratively repeating the operations of: (i) introducing detection oligonucleotide complexes; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes to obtain signals, in order to generate a pattern of states indicative of each code.
- the method is performed for each segment within a code.
- each operation comprises a cycle whereby a first pool of detection polynucleotide complexes is introduced to a plurality of modified recognition elements associated with respective target molecules, or amplification products thereof, under conditions sufficient to bind a detection polynucleotide complex to a modified recognition element or an amplification product thereof to facilitate detection of the segment of a code.
- the iteratively repeating operations (i) to (iii) comprises adding and cycling additional pools of detection polynucleotide complexes in a sequential manner until all or substantially all of the segments for each of the codes are detected.
- methods further comprise a wash operation between operations (ii) and (iii) to remove one or more of unbound detection polynucleotide complexes, unbound detection oligonucleotides, or anchor oligonucleotides.
- methods further comprise a dehybridization operation in between cycles to destabilize and remove a first pool of detection polynucleotide complexes.
- FIG. 5 provides a non-limiting example of a process 500 for detecting the presence of a target nucleic acid molecule.
- the process 500 comprises the operations of incubation 510, wash and image 520, and dehybridization and wash 530.
- detection polynucleotide complexes comprising a detection oligonucleotide and an anchor oligonucleotide may be incubated with amplification products of recognition elements comprising a code of one or more segments.
- a detection polynucleotide complex may bind to a complementary segment of a code, or a portion thereof, present on a recognition element.
- the detection polynucleotide complexes that are not bound to a segment of a code, or a portion thereof, may be removed by washing, and the detection polynucleotide complex that is bound to the segment of a code, or a portion thereof, may be imaged.
- the detection polynucleotide complex that is bound to the segment of a code, or a portion thereof, may be dehybridized and removed from the assay environment.
- the methods described herein relate to performing process 500 and iteratively repeating operations 510, 520, and 530 to determine the code state or color profile of an amplified recognition element, thereby identifying the presence of the target nucleic acid molecule.
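The cycle structure of process 500 can be pictured with a small simulation. This is only a sketch under stated assumptions: the function name `build_color_profiles`, the color labels, and the idea that each cycle reads exactly one segment are illustrative choices, not details taken from the disclosure.

```python
from typing import Dict, List


def build_color_profiles(codes: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Run one incubate/image/dehybridize cycle per code segment (operations 510-530)."""
    n_segments = max(len(code) for code in codes.values())
    profiles: Dict[str, List[str]] = {name: [] for name in codes}
    for segment in range(n_segments):
        # 510: incubate with the pool for this segment; a complex binds only
        # where the amplification product carries a matching segment.
        bound = {name: code[segment] for name, code in codes.items() if segment < len(code)}
        # 520: wash away unbound complexes, then image what remains bound.
        for name, color in bound.items():
            profiles[name].append(color)
        # 530: dehybridize and wash so the next cycle starts from a clean state.
        bound.clear()
    return profiles


# Hypothetical four-segment codes for two amplification products.
profiles = build_color_profiles({
    "target_1": ["red", "green", "red", "blue"],
    "target_2": ["green", "green", "blue", "red"],
})
```

After the last cycle, each entry in `profiles` holds the ordered colors observed for one amplification product, which is the color profile that a later decoding step would consume.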
- the encoded assays disclosed herein are capable of multiplex target detection.
- the readout of the encoded assays can be measured alongside the readout of various molecular assays that may be performed in parallel, thereby enabling a multiomic platform for the analysis of different target molecules from a sample.
- An assay workflow may comprise the following operations.
- a sample may be collected or provided.
- the sample may be whole blood, lymphatic fluid, serum, plasma, sweat, tear, saliva, sputum, cerebrospinal fluid, amniotic fluid, seminal fluid, vaginal excretion, serous fluid, synovial fluid, pericardial fluid, peritoneal fluid, pleural fluid, transudates, exudates, cystic fluid, bile, urine, gastric fluid, intestinal fluid, fecal samples, liquids containing single or multiple cells, liquids containing organelles, fluidized tissues, fluidized organisms, liquids containing multi-celled organisms, tissues, cells, biopsy samples, biological swabs or biological washes.
- a blood or saliva sample may be collected.
- a whole blood sample may be collected and processed to separate the plasma fraction from the cellular components of whole blood.
- Target molecule extraction, concentration, conversion, and/or purification processes may be performed.
- the target molecule may be DNA (e.g., cell-free DNA).
- a proteinase K digestion operation may be used to digest proteins present in the plasma sample.
- a heat denaturation operation (e.g., 94-98 °C for 20-30 seconds)
- a bead-based extraction and concentration protocol may be used to capture single-stranded DNA in the plasma sample.
- the bead-based extraction protocol uses magnetically responsive nucleic acid capture beads.
- the bead-bound DNA may be released from the capture beads using an elution buffer (or other elution means suitable to the capture bead used) to produce a processed DNA sample for analysis.
- the DNA sample may be further processed in a bisulfite conversion reaction for analysis of the methylation status of a set of targets in the sample.
- a skilled artisan will understand the methods that can be used to extract and/or purify target molecules such as nucleic acids or proteins, from a sample.
- the DNA sample may be transferred into an analysis cartridge or device according to some embodiments herein.
- the analysis cartridge or device may comprise a reaction vessel.
- reaction vessels include a plate, a well, a container, a tube, a flow cell, a microfluidic chip, or the like.
- the plate may be a welled plate, such as a 12-well plate, a 24-well plate, a 48-well plate, a 96-well plate, a 384 well plate, a 1536-well plate, and the like.
- the reaction vessel, or a reaction surface thereof, may be optically clear to enable optical target detection in the reaction vessel.
- the reaction vessel may comprise a glass surface.
- the reaction vessel may comprise a glass-bottomed well plate.
- the reaction vessel may comprise a surface coating that promotes sequestration of nucleic acid amplification products.
- the reaction vessel may comprise a cationic coating.
- FIG. 1 provides an example of an encoded assay workflow.
- a recognition element (see also FIG. 2) comprises 5’ and 3’ ends that are complementary to target sequences of interest.
- a recognition element further comprises a code, also known as a hypercode.
- the hypercode can be used as a proxy for the presence of a target of interest.
- Target nucleic acids of interest are incubated with the recognition element wherein they can hybridize, if present, to their complementary sequences in the recognition element (middle top illustration).
- the ends of the recognition element are adjacently located and ligation between the 5’ and 3’ ends of the recognition element can occur. If there is no target of interest, there is no hybridization and no ligation.
- the reactions can be treated with one or more exonucleases, thereby digesting any linear nucleic acids present in the reaction that did not participate in the hybridization and ligation events.
- the circularized recognition elements are aliquoted into a welled plate where they are immobilized onto the surface of the welled plate.
- a polymerase (e.g., DNA polymerase)
- the concatenated amplified products can be queried with fluorescent detection complexes (bottom left of FIG. 1) (see also FIG. 5) and imaged.
- a hypercode profile can be generated upon multiple query events, also called cycles or flows. The resulting hypercode profile can be decoded to identify the hypercode which can be used as a proxy for the presence of the target of interest.
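The hybridization-and-ligation step of this workflow can be sketched in a few lines. The helper names `reverse_complement` and `can_circularize` are hypothetical, and the model assumes perfect, gap-free complementarity, which is a simplification of real hybridization. In a padlock-style probe, the two arms can be ligated only when their binding sites sit directly next to each other on the target.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def can_circularize(arm5: str, arm3: str, target: str) -> bool:
    """True when both probe arms bind the target at directly adjacent sites.

    Read 5' to 3' on the target, the site bound by the 5' arm is followed
    immediately by the site bound by the 3' arm, so that the probe's 3' end
    and 5' end meet at a ligatable nick.
    """
    site = reverse_complement(arm5) + reverse_complement(arm3)
    return site in target


# Hypothetical target and arms: the arms are built from two adjacent sites.
arm5 = reverse_complement("ACGTAC")
arm3 = reverse_complement("GGATCA")
present = can_circularize(arm5, arm3, "TTTACGTACGGATCATTT")   # adjacent sites
gapped = can_circularize(arm5, arm3, "TTTACGTACAGGATCATTT")   # one-base gap
absent = can_circularize(arm5, arm3, "TTTTTTTTTTTTTTTTTT")    # no target
```

Only the first case would yield a circular molecule that survives exonuclease treatment; the other two leave the probe linear.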
- the target molecule can be uniquely recognized by and bound to a recognition element associated with a hypercode (and optionally other elements).
- the recognition event for the set of target molecules uses a plurality of coded recognition elements.
- the recognition event for the set of target molecules uses a panel of molecular inversion probes.
- the recognition event for the set of target molecules uses a panel of padlock probes. The recognition event yields a set of hypercoded target molecules comprising the target molecule and the recognition element.
- the recognition event may include sequence-specific binding between a 5’ probe arm and a 3’ probe arm of the recognition element to the target nucleic acid molecule under conditions sufficient to form a binding complex comprising the recognition element and the target nucleic acid molecule.
- the recognition element is a padlock probe
- the 5’ probe arm and the 3’ probe arm comprise a target recognition element that binds to two adjacent sequences in the target molecule.
- the 5’ probe arm and the 3’ probe arm bind to the target nucleic acid molecule at 3’ and 5’ regions flanking the target region leaving a gap between the 5’ probe arm and the 3’ probe arm of the padlock probe.
- the recognition event may include sequence-specific binding between a 5’ probe arm and a 3’ probe arm of the recognition element to a 3’ region and a 5’ region of a bridge oligonucleotide having a target-specific element complementary to the target nucleic acid molecule interposed between the 3’ region and the 5’ region.
- the bridge oligonucleotide is a surrogate for the target molecule.
- the bridge oligonucleotide and the recognition element are introduced to the target nucleic acid molecule under conditions sufficient to form a ternary binding complex comprising the recognition element, the bridge oligonucleotide and target nucleic acid molecule.
- the recognition element may include sequence-specific binding between the target nucleic acid molecule and a target-binding region of a pre-circularized recognition element.
- the target nucleic acid molecule is a surrogate for the target molecule, such as, for example, a cleavage product from a flap endonuclease cleavage reaction between a dual-probe recognition element and the target molecule.
- the target nucleic acid molecule serves as a primer for an amplification reaction, such as a rolling circle amplification (RCA) reaction or a multiple strand displacement reaction.
- An exonuclease cleanup operation may be used following ligation of the recognition element to digest any remaining single stranded nucleic acid, such as unhybridized recognition elements, amplification primers, and single-stranded target molecules.
- exonucleases useful for digesting remaining single-stranded nucleic acids include Exonuclease I, Exonuclease VII, Msz Exonuclease I, T5 exonuclease, Exonuclease V, DNase I, or any combination thereof.
- An amplification event for modified recognition elements may be performed.
- the amplification event may be a rolling circle amplification (RCA) reaction to generate a set of concatenated amplification products.
- the amplification event thereby yields a set of concatenated amplified recognition elements including their unique codes (e.g., codes present in the modified recognition elements) that can be correlated to the target molecule.
- An amplification event could further be a multiple strand displacement reaction to generate a set of target molecule-specific amplification products.
- a detection event followed by a decoding event for each amplified code as found in the amplified recognition elements may be performed to identify the code of an amplified modified recognition element.
- the code may be detected by hybridization of one or more segments of the code (and optionally other elements) to a detection polynucleotide complex or a detection oligonucleotide of the present disclosure.
- the detection events detect the code as a surrogate or proxy for identifying the presence of the target molecule in the sample. Decoding the detection events may in some cases make use of a soft decision decoding algorithm.
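To illustrate what soft decision decoding can mean in this setting, the sketch below scores every code in a codebook against the raw per-channel intensities from each cycle instead of first forcing a single color call per cycle. The scoring rule (sum of log-normalized intensities), the function name `soft_decode`, and the example codebook are assumptions made for illustration; the disclosure does not specify the algorithm at this level.

```python
import math
from typing import Dict, Sequence, Tuple


def soft_decode(signals: Sequence[Sequence[float]],
                codebook: Dict[str, Sequence[int]]) -> Tuple[str, float]:
    """Return the code whose channel sequence best explains the intensities.

    signals[c][k] is the intensity of channel k in cycle c; each codebook
    entry lists the expected channel index for every cycle.
    """
    eps = 1e-9  # avoids log(0) for channels with no signal
    best_name, best_score = "", -math.inf
    for name, code in codebook.items():
        score = 0.0
        for intensities, channel in zip(signals, code):
            total = sum(intensities) + eps
            score += math.log(intensities[channel] / total + eps)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical three-code codebook over four channels and four cycles.
codebook = {"A": [0, 1, 2, 0], "B": [1, 1, 2, 3], "C": [0, 2, 1, 0]}
# Cycle 1 is ambiguous: channel 1 is slightly brighter than channel 0.
signals = [
    [0.45, 0.55, 0.00, 0.00],
    [0.10, 0.80, 0.10, 0.00],
    [0.00, 0.20, 0.70, 0.10],
    [0.60, 0.10, 0.10, 0.20],
]
decoded, _ = soft_decode(signals, codebook)
```

A hard decision would call the brightest channel in each cycle, giving `[1, 1, 2, 0]`, which matches no entry in this codebook. The soft score still recovers code `A`, because the weak evidence in cycle 1 is outweighed by the strong evidence in cycle 4.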
- a bioinformatics analysis of the code information (and optionally other elements) from the detection operation may be performed.
- the bioinformatic analysis may be performed by one or more computer systems as described herein.
- the amplification event and the detection event may occur in a step wise manner, such that first the modified recognition elements are amplified followed by one or more washes, followed by the detection event and the decoding event.
- presence of the codes may be determined with a detection and decoding by hybridization process disclosed herein.
- a plurality of detection polynucleotide complexes or detection oligonucleotides may be introduced to the amplified modified recognition elements iteratively for detection of each segment, or a portion thereof, within all or substantially all amplified codes in the amplified recognition elements.
- FIG. 14 is a schematic diagram illustrating an example of a process 1400 of using a bisulfite conversion reaction in combination with a coded recognition element to detect a methylated target nucleic acid of interest.
- a DNA sample may include a target sequence of interest 1410 that may be methylated (e.g., 1410a “Methylated Target”) or unmethylated (e.g., 1410b “Unmethylated Target”) at a CpG site of interest.
- a bisulfite conversion reaction is used to convert non-methylated cytosines to thymines (C → T) in the target sequence 1410b.
- target sequence 1410 is recognized and bound by a recognition element comprising a code, 1415.
- Recognition element 1415 includes a 3'-terminal G nucleotide that base pairs with the target C at the CpG site of interest.
- ligation of recognition element 1415 only occurs when the 3'-terminus of the recognition element (e.g., a guanine “G”) hybridizes to the target site “C” of interest in target sequence 1410a to generate a circularized modified recognition element 1420. No ligation occurs at the mismatched target site “T” in the bisulfite converted target sequence 1410b and consequently, there is no ligation and circularization of the recognition element.
- the circular modified recognition element 1420 may be amplified in an amplification reaction to generate an amplification product comprising many copies of the circular modified recognition element including its code (among other elements) and the code may be detected and decoded.
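The logic of process 1400 can be mimicked in silico. The sketch below is a simplification with hypothetical names (`bisulfite_convert`, `ligates`): it follows the description above in writing the converted base as T, and it treats ligation as depending only on whether the probe's 3'-terminal base pairs with the interrogated target base.

```python
from typing import Set

# Watson-Crick partner of each base.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def bisulfite_convert(seq: str, methylated: Set[int]) -> str:
    """Convert every C to T except those at methylated (protected) positions."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def ligates(probe_3prime_base: str, target_base: str) -> bool:
    """Ligation is modeled as requiring a matched 3'-terminal base pair."""
    return PAIRS[probe_3prime_base] == target_base


# Hypothetical target with the CpG cytosine of interest at index 2.
target, cpg = "AACGTA", 2
methylated_target = bisulfite_convert(target, {cpg})     # C is protected
unmethylated_target = bisulfite_convert(target, set())   # C becomes T

circularizes_on_methylated = ligates("G", methylated_target[cpg])
circularizes_on_unmethylated = ligates("G", unmethylated_target[cpg])
```

Only the methylated template leaves a C for the 3'-terminal G to pair with, so only that template would give rise to a circular, amplifiable recognition element.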
- the recognition element may be a padlock probe as shown in 1415.
- a molecular inversion probe that includes a 3'-terminal single base gap at a target site of interest may be used.
- a gap-fill and ligation event using only a single added nucleotide (at a minimum) may be used to generate the circular modified recognition element comprising the code only when the nucleotide corresponding to the target site of interest is incorporated.
- This approach provides two forms of specificity to the assay: (i) the 3'-terminus of the recognition element recognizes and binds the interrogated site; and (ii) a single base extension reaction that incorporates the nucleotide corresponding to the target site of interest occurs.
- Hypercoding of targets of interest can be enabled by utilizing a multi-functional oligonucleotide called a “recognition element”.
- a plurality of recognition elements is provided in assays disclosed herein.
- each recognition element in the plurality of recognition elements may comprise one or more target recognition regions.
- the target recognition regions of the recognition elements may comprise one or more nucleic acid sequence(s) complementary to a target molecule of interest.
- the one or more nucleic acid sequences of a recognition element may hybridize to one or more nucleic acid sequences of the target molecule.
- the target recognition region is configured to bind to one or more regions of the target molecule of interest flanking a target of interest (e.g., SNP, indel, and so on). In some embodiments, the target recognition region is configured to bind to the target of interest (e.g., the target recognition region base pairs with the SNP).
- the recognition element comprises a code. In some embodiments, the target molecule comprises the code. In either embodiment, the code may be detected as a surrogate or proxy for the presence of the target molecule.
- A non-limiting example of a recognition element comprising target recognition regions and a code is depicted in FIG. 2. In this non-limiting depiction, the recognition element comprises two target recognition regions (e.g., one at the 5’ end and another at the 3’ end of the recognition element) and a code comprising four segments.
- the structure of the recognition elements may vary.
- the recognition element may adopt a specific structure when hybridized to a target nucleic acid.
- Non-limiting examples of a recognition element configuration may include a padlock probe, a molecular inversion probe, a hairpin oligonucleotide, a single-stranded oligonucleotide, a double-stranded oligonucleotide, or a combination thereof.
- the recognition element is linear.
- the linear recognition element is circularized during the molecular transformation once hybridized to the respective target molecule.
- the recognition element is circular prior to the molecular transformation.
- the target molecule may serve as a primer for an amplification reaction (e.g., rolling circle amplification, multiple strand displacement amplification, etc.).
- the recognition element is configured to be a padlock probe once the recognition element is hybridized to the sequences of the target molecule of interest.
- Padlock probes may be referred to as linear oligonucleotides whose ends are complementary to adjacent target sequences.
- the two ends (e.g., 5’ end and 3’ end) of the recognition element may be brought into contact, generating a padlock probe configuration for subsequent circularization by ligation.
- a recognition element as described herein further comprises a hypercode, or code, which can be used to uniquely identify the recognition element, and hence the target of interest to which it can hybridize, thereby providing an indirect determination of whether a target of interest is present in a sample.
- the recognition element may comprise a number of target recognition regions.
- the recognition element may comprise two target recognition regions.
- one target recognition region may be present at the 5’ end of the recognition element, and another target recognition region may be present at the 3’ end of the recognition element.
- the target recognition region may be interposed between the 5’ end and the 3’ end of the recognition element.
- the recognition element may comprise one or more target recognition regions, two or more target recognition regions, three or more target recognition regions, four or more target recognition regions, five or more target recognition regions, six or more target recognition regions, seven or more target recognition regions, eight or more target recognition regions, nine or more target recognition regions, 10 or more target recognition regions, 15 or more target recognition regions, 20 or more target recognition regions, or 25 or more target recognition regions.
- the recognition element may comprise 25 or less target recognition regions, 20 or less target recognition regions, 15 or less target recognition regions, 10 or less target recognition regions, nine or less target recognition regions, eight or less target recognition regions, seven or less target recognition regions, six or less target recognition regions, five or less target recognition regions, four or less target recognition regions, three or less target recognition regions, or two or less target recognition regions.
- each target recognition region of the recognition element may comprise a plurality of nucleotides.
- each target recognition region may comprise a length of 2 or more nucleotides, 3 or more nucleotides, 4 or more nucleotides, 5 or more nucleotides, 10 or more nucleotides, 15 or more nucleotides, 20 or more nucleotides, 25 or more nucleotides, 30 or more nucleotides, 35 or more nucleotides, 40 or more nucleotides, 45 or more nucleotides, or 50 or more nucleotides.
- each target recognition region comprises a length of 50 or less nucleotides, 45 or less nucleotides, 40 or less nucleotides, 35 or less nucleotides, 30 or less nucleotides, 25 or less nucleotides, 20 or less nucleotides, 15 or less nucleotides, 5 or less nucleotides, 4 or less nucleotides, 3 or less nucleotides, or 2 or less nucleotides.
- Recognition element design
- a recognition element (also known as a Plenoid™)
- the first operation is to select the target sequences, including a screen for sequence similarity in the genome (FIG. 8A).
- the recognition element for a locus may be designed using a dual-ligation strategy instead of a standard design.
- a pharmacogenomic target such as CYP2D6 is difficult to genotype because of high homology with the pseudogene CYP2D7, so using a dual-ligation strategy to identify targets of interest in CYP2D6 and not in CYP2D7 overcomes that difficulty and allows CYP2D6 targets of interest to be uniquely identified in the presence of background nucleic acids from the pseudogene CYP2D7 (see FIG. 9A).
- a strategy to genotype variants in regions with high homology to homologs or pseudogenes includes a two-ligation approach wherein one ligation event occurs at the base of interest and a second ligation event occurs at a downstream site where the high homology region and the region of interest differ. Additionally, a bridge oligonucleotide can be added to fill the gap between the two ligation sites.
- FIG. 9B demonstrates the success of following this two-ligation strategy.
- the target of interest lies in the CYP2D6 gene and is differentiated from the pseudogene CYP2D7 using a dual ligation recognition element strategy compared to a single ligation (e.g., “normal”) recognition element strategy.
- Graph (i) shows results using a single ligation recognition element strategy in identifying a variant rs774671100, wherein known HomRef (Ref) genotyped samples show elevated counts for the alternative, or Alt, allele.
- the second operation is to select 5’ and 3’ regions (also known as probe arms) of the recognition element that are complementary to the target sequences of interest identified in the first operation (FIG. 8B).
- 5’ and 3’ complementary regions, which may be of varying lengths, are selected based on thermodynamic predictions for hybridization to the intended target sequences.
- the third operation is to assign the complementary 5’ and 3’ sequences from the second operation to a hypercode by computing and minimizing the potential for interaction between the complementary 5’ and 3’ sequences and the hypercode assigned thereto (FIG. 8C).
- the sequences for a target of interest are generally provided as a standard .vcf (e.g., variant call format) file that includes the chromosome number, position, locus id, reference allele (e.g., ref allele), and alternative (e.g., variant, Alt, Het) allele(s).
- each locus is evaluated as N recognition elements where N is the total number of alleles (reference and alternative).
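As a sketch of how a variant list might be turned into per-allele design tasks, the snippet below reads the first five tab-separated columns of VCF-style data lines and emits one entry per allele, so that a locus with one reference and two alternative alleles yields N = 3 entries. The function name `alleles_per_locus` and the output field names are hypothetical, and the parser ignores everything a full VCF reader would handle beyond those five columns.

```python
from typing import Dict, List


def alleles_per_locus(vcf_text: str) -> List[Dict[str, str]]:
    """Return one design entry per allele (reference plus each alternative)."""
    entries: List[Dict[str, str]] = []
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, locus_id, ref, alt = line.split("\t")[:5]
        for allele in [ref] + alt.split(","):
            entries.append({
                "chrom": chrom,
                "pos": pos,
                "locus_id": locus_id,
                "allele": allele,
                "kind": "ref" if allele == ref else "alt",
            })
    return entries


# Made-up records used only to exercise the parser.
example_vcf = (
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "chr1\t1000\tlocus_a\tA\tG\n"
    "chr2\t2000\tlocus_b\tC\tT,G\n"
)
designs = alleles_per_locus(example_vcf)
```

Here `designs` holds five entries: two for `locus_a` and three for `locus_b`.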
- the reference genome of interest can be queried to obtain a sequence that extends, for example, 45 bases before the locus and 44 bases after the locus.
- the reference base (e.g., wild type nucleotide)
- the alternative base(s) (e.g., variant nucleotide(s))
- a variant of interest could be a single nucleotide polymorphism or other variant of interest.
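The flank extraction can be sketched as simple slicing. The function name `allele_windows`, the 1-based `pos` convention, and the use of an in-memory string for the genome are assumptions for illustration; a real pipeline would query an indexed reference instead.

```python
from typing import Dict, Sequence


def allele_windows(genome: str, pos: int, ref: str, alts: Sequence[str],
                   before: int = 45, after: int = 44) -> Dict[str, str]:
    """Return, per allele, the locus flanked by `before` and `after` bases.

    `pos` is the 1-based position of the first reference base.
    """
    i = pos - 1
    if genome[i:i + len(ref)] != ref:
        raise ValueError("reference allele does not match the genome at pos")
    left = genome[max(i - before, 0):i]
    right = genome[i + len(ref):i + len(ref) + after]
    return {allele: left + allele + right for allele in [ref, *alts]}


# Synthetic 120-base sequence standing in for a reference genome.
genome = "ACGT" * 30
windows = allele_windows(genome, pos=61, ref="A", alts=["G"])
```

For a single-base locus this gives 45 + 1 + 44 = 90 bases per allele, with the reference or alternative base at index 45 of each window.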
- the sequence of the target of interest can be divided into two separate portions: a portion of the target of interest that is complementary to the 5’ end of a recognition element and a portion of the target of interest that is complementary to the 3’ end of a recognition element.
- the reference/alternative target of interest can be located at the end of the 3’ end of the recognition element.
- the reference/alternative target of interest can be located at the end of the 5’ end of the recognition element.
- a list of target sequences of decreasing length, from 45 nucleotides down to a minimum sequence length of 15 nucleotides, is generated together with the reverse complements of those sequences.
- a Nearest Neighbor estimation of the thermodynamic parameters between the two sequences is computed (e.g., ΔG and melting temperature).
- the pairs with a melting temperature closest to 65 °C and 60 °C, respectively, are selected.
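The arm-selection loop can be sketched as follows. Two points are assumptions rather than details from the text: candidates are shortened from the end away from the junction (so the junction-proximal end stays fixed), and the melting temperature is estimated with a simple GC-content formula as a stand-in for the Nearest Neighbor calculation, which would need a full thermodynamic parameter table. All function names are hypothetical.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def estimate_tm(seq: str) -> float:
    """Stand-in Tm estimate from GC content; NOT a Nearest Neighbor model."""
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def candidate_pairs(flank: str, max_len: int = 45,
                    min_len: int = 15) -> List[Tuple[str, str]]:
    """Target sequences of decreasing length paired with their reverse complements."""
    longest = min(max_len, len(flank))
    targets = [flank[-n:] for n in range(longest, min_len - 1, -1)]
    return [(t, reverse_complement(t)) for t in targets]


def pick_pair(flank: str, target_tm: float) -> Tuple[str, str]:
    """Choose the pair whose estimated Tm is closest to the requested value."""
    return min(candidate_pairs(flank),
               key=lambda pair: abs(estimate_tm(pair[0]) - target_tm))


# Hypothetical 45-base flank next to the junction.
flank = ("ACGTTGCA" * 6)[:45]
pairs = candidate_pairs(flank)
best_target, best_arm = pick_pair(flank, target_tm=65.0)
```

Swapping `estimate_tm` for a Nearest Neighbor implementation would leave the selection logic unchanged, which is why the estimator is kept as a separate function.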
- a hypercode unique to those sequences, and hence to the target of interest, is assigned to the recognition element.
- a series of potential end/hypercode sequence matches is computed and screened for secondary structure due to potential intermolecular interactions between the sequences.
- this screening strategy involves randomly selecting 25 hypercodes for each recognition element from the selected codespace. For each hypercode, the 5’ target end sequence is placed in close proximity to the 5’ end of the potential hypercode sequence and the 3’ target end sequence is placed in close proximity to the 3’ end of the hypercode sequence, as such generating a linear recognition element.
- one or more primer binding site sequences can be inserted into the recognition element for amplification.
- a hypercode or a portion thereof can be utilized as a primer binding site for amplification.
- a sliding window along the sequence of the unique recognition element sequences can be generated, wherein the window is stepped every 10 nucleotides. At each window placement the thermodynamic interactions between the sequence within the window and the full recognition element sequence can be computed. Upon the window reaching the end of the linear recognition element sequence, the minimum ΔG is reported across all windows.
- the recognition element that has the maximum ΔG (e.g., a ΔG value greater than 0 to indicate unfavorable intermolecular interactions)
- its incorporated hypercode is reserved for the particular target of interest as represented in the 5’ and 3’ end sequences of the linear recognition element.
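The screening procedure above (sample hypercodes, assemble a linear element, slide a window, keep the worst window score, then pick the best element) can be sketched as below. The interaction score is a deliberately crude stand-in for ΔG: it is the negated length of the longest stretch in the window whose reverse complement occurs anywhere in the element. The 20-nucleotide window size, the fixed random seed, and all function names are assumptions for illustration.

```python
import random
from typing import Sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def interaction_score(window: str, element: str) -> float:
    """Stand-in for delta G: more negative means a longer complementary stretch."""
    longest = 0
    for i in range(len(window)):
        for j in range(i + longest + 1, len(window) + 1):
            if reverse_complement(window[i:j]) in element:
                longest = j - i
            else:
                break  # longer stretches from this start cannot match either
    return -float(longest)


def min_window_score(element: str, window: int = 20, step: int = 10) -> float:
    """Slide a window in steps of 10 and report the minimum score seen."""
    last_start = max(len(element) - window, 0)
    return min(interaction_score(element[s:s + window], element)
               for s in range(0, last_start + 1, step))


def assign_hypercode(arm5: str, arm3: str, codespace: Sequence[str],
                     n_candidates: int = 25, seed: int = 0) -> str:
    """Sample candidate hypercodes and keep the one with the maximum score."""
    rng = random.Random(seed)
    picks = rng.sample(list(codespace), min(n_candidates, len(codespace)))
    scored = [(min_window_score(arm5 + code + arm3), code) for code in picks]
    return max(scored)[1]


# Toy codespace: the first code is fully complementary to the 5' arm.
chosen = assign_hypercode("A" * 10, "C" * 10, ["TTTTTTTTTT", "ACACACACAC"])
```

In this toy case the poly-T code pairs with the poly-A arm and scores poorly, so the second code is the one reserved for the target.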
- the first consideration includes ligation errors that may occur due to ligase specificity.
- the second consideration includes pseudogenes and homologs that may have high homology between the specified locus and target of interest and other regions in the genome.
- a trimmed target sequence can be generated, by looking, for example, at 20 nucleotides before and after the target locus of interest.
- a Basic Local Alignment Search Tool (BLAST) search can be performed using the 20 nucleotide target sequence to identify homology that is > 80% match with the 20 nucleotide target sequence.
- All hits reported from the BLAST search that are >80% match to the 20 nucleotide target sequence can be added to a list of potential homology matches that may cause off-target hybridization events with a recognition element.
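The filtering step can be sketched without running BLAST itself. The snippet assumes each hit has already been reduced to an ungapped sequence of the same length as the query, which is a simplification (actual BLAST output reports percent identity and may contain gaps). The record fields `name` and `sequence` and both function names are hypothetical.

```python
from typing import Dict, List


def percent_identity(query: str, subject: str) -> float:
    """Percent of positions that match between two equal-length sequences."""
    if len(query) != len(subject):
        raise ValueError("sequences must have the same length")
    matches = sum(q == s for q, s in zip(query, subject))
    return 100.0 * matches / len(query)


def homology_candidates(query: str, hits: List[Dict[str, str]],
                        threshold: float = 80.0) -> List[Dict[str, str]]:
    """Keep hits whose identity to the query is strictly above the threshold."""
    return [hit for hit in hits
            if percent_identity(query, hit["sequence"]) > threshold]


# Made-up 20-nucleotide query and hits.
query = "ACGTACGTACGTACGTACGT"
hits = [
    {"name": "hit1", "sequence": "TCGTACGTACGTACGTACGT"},  # 19/20 = 95%
    {"name": "hit2", "sequence": "TGCAACGTACGTACGTACGT"},  # 16/20 = 80%
    {"name": "hit3", "sequence": "TTTTTTTTTTTTTTTTTTTT"},  # 5/20 = 25%
]
flagged = homology_candidates(query, hits)
```

Only `hit1` is retained, because the comparison is strict and a hit at exactly 80% does not exceed the threshold.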
- the BLAST alignments can be reviewed, for example from 200 bases before and after each locus of interest.
- the sequence alignments can be reviewed, for example by moving in the 5’ direction and comparing the bases of the reference sequence against the alignment hits.
- the mismatch position can be recorded as a potential anchor site for a double ligation recognition element design strategy. Additional anchor sites can be recorded as they are identified.
- the anchor sites that are identified moving along the 5’ direction can be considered for 3’ ligation sites and the anchor sites found moving in the 3’ direction can be considered for 5’ ligation sites, or vice versa if targeting the reverse strand.
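The search for anchor sites can be sketched as a base-by-base comparison that walks outward from the locus in each direction. The function name `anchor_sites`, the 0-based indexing, and the assumption of an ungapped, equal-length alignment are illustrative simplifications.

```python
from typing import Dict, List


def anchor_sites(reference: str, homolog: str, locus: int) -> Dict[str, List[int]]:
    """Record mismatch positions on each side of the locus, nearest first.

    `reference` and `homolog` are aligned, equal-length sequences and
    `locus` is the 0-based index of the base of interest.
    """
    if len(reference) != len(homolog):
        raise ValueError("aligned sequences must have the same length")
    toward_5prime = [i for i in range(locus - 1, -1, -1)
                     if reference[i] != homolog[i]]
    toward_3prime = [i for i in range(locus + 1, len(reference))
                     if reference[i] != homolog[i]]
    return {"toward_5prime": toward_5prime, "toward_3prime": toward_3prime}


# Toy alignment with one mismatch on each side of the locus at index 5.
sites = anchor_sites("ACGTACGTAC", "ACCTACGTTC", locus=5)
```

Each recorded index is a position where the region of interest and its homolog differ, so a second ligation junction placed there could discriminate between them.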
- available databases for allele information can also be queried and if the alleles result in a matching homolog the allele frequency at the site can be recorded. If the allele frequency is below a specified threshold, then it can be assumed that these homologs, when looking across samples, can be corrected for. If the allele frequency is above a specified threshold, then a dual ligation recognition element as described herein can be implemented for target of interest identification.
- a bridge oligonucleotide comprising the target sequence between the target sequence of interest and the anchor nucleotide can be generated, wherein the 5’ end sequence and 3’ end sequence of the recognition element hybridize adjacent to the ends of the bridge oligonucleotide when hybridized to the target of interest.
- the target of interest sequence is found on the 5’ end sequence and the anchor nucleotide is found on the 3’ end sequence of the recognition element.
- the target of interest sequence is found on the 3’ end sequence and the anchor nucleotide is found on the 5’ end sequence of the recognition element.
- the methods described herein may include amplification of a nucleic acid.
- the nucleic acid is a recognition element.
- the nucleic acid is a target nucleic acid molecule.
- the nucleic acid is a combination of a recognition element and a target nucleic acid molecule, or a complement thereof.
- the amplification is selective amplification. For example, in some embodiments, amplification occurs if a target recognition region of a recognition element recognizes and binds to a complementary target nucleic acid.
- amplification occurs if a primer is used that is complementary to one or more of a portion of a target recognition region, a portion of a segment of a code, or another sequence in the recognition element that is complementary to a primer used for amplification.
- the amplification is non-selective.
- randomers can be used to prime amplification from one or more recognition elements.
- a universal primer can be used to prime amplification from a plurality of recognition elements.
- the methods described herein may include selectively amplifying a subset of nucleic acids.
- a subset of a plurality of recognition elements bound to a plurality of target nucleic acid molecules may be amplified.
- the subset may comprise a percentage of the total amount of recognition elements bound to target nucleic acid molecules as described herein.
- the subset may include 5% or more, 10% or more, 15% or more, 20% or more, 25% or more, 30% or more, 35% or more, 40% or more, 45% or more, 50% or more, 55% or more, 60% or more, 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, or 95% or more of the total amount of recognition elements bound to target nucleic acid molecules.
- the subset may include 95% or less, 90% or less, 85% or less, 80% or less, 75% or less, 70% or less, 65% or less, 60% or less, 55% or less, 50% or less, 45% or less, 40% or less, 35% or less, 30% or less, 25% or less, 20% or less, 15% or less, 10% or less, or 5% or less of the total amount of recognition elements bound to target nucleic acid molecules.
- the amplification may include rolling circle amplification (RCA).
- the amplification may include multiple strand displacement amplification.
- RCA may generate a concatemer as an amplification product, wherein the concatemer contains multiple copies of the circularized modified recognition element, including associated codes, target recognition regions, and any other functional sequences that are included in the circular modified recognition element.
- RCA may be performed while the circularized recognition element is in solution.
- RCA may be performed on a circularized recognition element while the circularized recognition element is immobilized, either reversibly or non-reversibly, on a solid substrate or surface.
- RCA is performed on a modified recognition element that is still hybridized to the target molecule.
- the terms “solid substrate” and “solid surface” may be referred to herein as a surface or substrate.
- the substrate is a bead, a flow cell, a microwell, or a nanowell.
- the substrate is coated with a composition that enhances target molecule immobilization.
- the substrate is charged.
- the substrate is positively charged or negatively charged.
- the substrate is an anionic substrate.
- the substrate is a cationic substrate.
- the substrate comprises an immobilization composition, such as polyacrylamide, branched PEI, linear PEI, poly(β-aminoester) and poly(amidoamine), PEG, a gel, poly-L-lysine, silane, agarose, muscle mimetic catecholamine polymer, and the like.
- the substrate has no charge.
- FIG. 6 shows a schematic diagram illustrating RCA amplification of a recognition element to yield a concatemeric amplification product.
- primer 616b, which is complementary to recognition element sequence 616, is hybridized to circular modified recognition element 625 and used to initiate the RCA reaction to generate an amplification product 630.
- Amplification product 630 is a polymeric concatemeric molecule that includes multiple repeated copies of circular modified recognition element 625, wherein each copy includes primer 616, code 614, a functional sequence 612, target recognition regions, and a second functional sequence 618.
- the complement of modified recognition element 625 is indicated by the dashed line.
- one or more functional sequences which may be included in a recognition element include, but are not limited to, a unique molecular identifier (UMI) sequence, a sequencing primer sequence, an index sequence, a restriction endonuclease sequence, a cleavage sequence, or combinations thereof.
- An RCA reaction may be performed in the presence of the cationic polymer coated surface, resulting in simultaneous immobilization and amplification of an amplification product.
- RCA primers may be supplied in solution or bound to the cationic polymer-coated surface prior to, or concurrent with, performing the RCA reaction.
- a Hamming distance selection criterion may be implemented between any two segments of a code.
- a Hamming distance between two segments in a code refers to the number of states that differ between the segments. In essence, the Hamming distance measures the number of changes that would need to be made to a first segment to change the string of states to the second segment.
- the Hamming distance may be a minimum Hamming distance. In some embodiments, the Hamming distance may be a maximum Hamming distance.
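As an illustration of the Hamming distance criterion described above, the following minimal sketch counts the states that differ between two equal-length segments and checks a minimum pairwise distance across a set of segments. The state strings, the function names, and the threshold are hypothetical and are not taken from the disclosure.

```python
from itertools import combinations


def hamming_distance(seg_a, seg_b):
    """Count positions at which two equal-length state strings differ."""
    if len(seg_a) != len(seg_b):
        raise ValueError("segments must have the same length")
    return sum(a != b for a, b in zip(seg_a, seg_b))


def meets_min_distance(segments, min_distance):
    """Return True if every pair of segments differs in at least min_distance states."""
    return all(
        hamming_distance(a, b) >= min_distance
        for a, b in combinations(segments, 2)
    )


# Hypothetical segments written as strings of states 0-3 (an assumed 4-state system).
segments = ["0123", "1230", "2301"]
print(hamming_distance("0123", "0321"))
print(meets_min_distance(segments, 3))
```

A maximum Hamming distance criterion could be checked the same way by replacing `>=` with `<=`.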
- a segment may serve as a primer for amplification.
- a segment may comprise an amplification primer binding sequence for rolling circle amplification (RCA) for generating a plurality of amplification products.
- a plurality of segments is present on the recognition elements provided herein.
- the nucleotide or nucleic acid sequence of each segment corresponds to one or more computation states for performing a decoding process of the present disclosure.
- one or more segments of a code may be detected with a first pool of detection polynucleotide complexes to produce one or more detectable binding complexes.
- the one or more detectable binding complexes once imaged, produce one or more optical signals.
- a series of optical signals may be observed (e.g., a code profile).
- each application of detection oligonucleotides to amplification products for detection is called a “flow” or “cycle”, and the number of flows or cycles is the number of times a particular segment of an amplification product is queried, or the number of times detection oligonucleotides or detection polynucleotide complexes are flowed or cycled over an amplification product in order to detect a segment sequence.
- one or more optical signals observed from querying an amplification product with detection polynucleotide complexes translates to one or more computational states such that each optical signal may be used to decode a code.
- the plurality of segments on the recognition element may correspond to at least three computational states.
- the optical signal may be a color or a non-color.
- the optical signal may be a combination of colors (e.g., when the detection polynucleotide complex comprises a plurality of detectable labels).
- the computational states are numbers (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.).
- additional ways to increase the number of computational states for decoding include, but are not limited to, adding levels of identifiability associated with a particular detectable signal, such as whether a detectable signal is brighter or dimmer compared to a normal level of signal, or whether a combination of detectable colors is used to identify a particular nucleotide.
- the number of computational states that could be used is only limited by practicality for any given assay.
- a detection scheme using a larger number of computational states may require greater instrument complexity, which may lead to potential drawbacks such as color crosstalk, wherein the computational states used in the detection scheme may become difficult to distinguish from other computational states.
- a greater number of computational states may require that a more complex detection tool be used.
- three or more computational states may be used in the methods described herein.
- the methods described herein may use one or more computational states, five or more computational states, 10 or more computational states, 15 or more computational states, 20 or more computational states, 25 or more computational states, 30 or more computational states, 35 or more computational states, 40 or more computational states, 45 or more computational states, or 50 or more computational states.
- the methods described herein may use 50 or less computational states, 45 or less computational states, 40 or less computational states, 35 or less computational states, 30 or less computational states, 25 or less computational states, 20 or less computational states, 15 or less computational states, 10 or less computational states, or five or less computational states.
- each segment of a code may correspond to a combination of computational states.
- each segment may correspond to one or more computational states, two or more computational states, three or more computational states, four or more computational states, five or more computational states, six or more computational states, seven or more computational states, eight or more computational states, nine or more computational states, or 10 or more computational states.
- each segment may correspond to 10 or less computational states, nine or less computational states, eight or less computational states, seven or less computational states, six or less computational states, five or less computational states, four or less computational states, three or less computational states, or two or less computational states. It is the combinations of detected signals that are used to build a code profile which can be decoded for identifying the presence of a target molecule.
- the methods described herein may include introducing 1,000 or less, 900 or less, 800 or less, 700 or less, 600 or less, 500 or less, 400 or less, 300 or less, 200 or less, 100 or less, 50 or less, 25 or less, 20 or less, 15 or less, 10 or less, nine or less, eight or less, seven or less, six or less, five or less, four or less, three or less, or two or less detection polynucleotide complexes to an amplification product.
Detection Oligonucleotide
- each detection polynucleotide complex may comprise a detection oligonucleotide.
- the detection oligonucleotide may comprise a portion comprising a detectable label (e.g., fluorescent molecule) and another portion configured to bind to at least a portion of an anchor oligonucleotide.
- the portion configured to bind to the anchor oligonucleotide comprises a nucleic acid sequence complementary to a portion of the nucleic acid sequence of the anchor oligonucleotide.
- FIG. 10A shows a nonlimiting example of a structure of a detection oligonucleotide comprising a fluorescent molecule 1010 and a portion complementary to at least a portion of an anchor oligonucleotide 1020.
- the detection oligonucleotide may comprise various nucleotide lengths. In some embodiments, the detection oligonucleotide may comprise a length of about 5 to about 25 nucleotides. In some embodiments, the detection oligonucleotide may comprise a length of about 5 to about 20 nucleotides. In some embodiments, the detection oligonucleotide may comprise a length of about 5 to about 15 nucleotides. In some embodiments, the detection oligonucleotide may comprise a length of about 5 to about 10 nucleotides. In some embodiments, the detection oligonucleotide may comprise a length of about 5 to about 8 nucleotides.
- the detection oligonucleotide may comprise a length of between about 5 to about 100 nucleotides, between about 10 to about 80 nucleotides, between about 20 to about 60 nucleotides, between about 30 to about 50 nucleotides, or between about 15 to about 30 nucleotides.
- the detection oligonucleotide may comprise one or more detectable labels.
- one or more detectable labels comprise a fluorescent moiety. The fluorescent moiety may emit in the red, far-red, near-red, yellow, green, blue, or ultraviolet wavelengths.
- the fluorescent moiety comprises one or more of 6-FAM (6-carboxyfluorescein), JOE (6-carboxy-4',5'-dichloro-2',7'-dimethoxyfluorescein), TAMRA (6-carboxytetramethylrhodamine), 5-Cy5 (5-carboxyrhodamine), 5-Cy5.5 (5-carboxylic acid succinimidyl ester), 5-Cy7 (5-carboxyrhodamine), HEX (hexachlorofluorescein), Alexa Fluor 488 (AF488), Alexa Fluor 514 (AF514), Texas Red, Cyanine 3, Cyanine 5, Pacific Blue, Tetramethyl rhodamine, Oxazole Yellow, Atto647N, and Rhodamine 6G (R6G).
- the fluorescent moiety may comprise an organic dye, a biological fluorophore, a quantum dot, or a combination thereof.
- the organic dye may comprise an organic molecule.
- the organic dye may comprise a coumarin, a cyanine, a benzofuran, a quinoline, a quinazolinone, an indole, a benzazole, a borapolyazaindacene, a xanthene, or a combination thereof.
- the organic dye may correspond to a color.
- the organic dye may correspond to a green, a yellow, a blue, an indigo, a red, an orange, a purple, a pink, a violet, or a combination thereof.
- the organic dye may correspond to no color.
- the organic dye may correspond to a black color.
- the organic dye may correspond to a white color.
- the fluorophore may emit a color in the visible light spectrum. In some embodiments, the fluorophore may emit in a wavelength in the range between 400 nanometers (nm) and 900 nm. In some embodiments, the fluorophore may emit in a wavelength between about 400 nm and about 475 nm, about 475 nm and about 490 nm, about 490 nm and about 530 nm, about 530 nm and about 575 nm, about 575 nm and about 600 nm, about 600 nm and about 700 nm, or about 700 nm and about 800 nm.
- the fluorophore may emit a wavelength of 400 nm or more, 425 nm or more, 450 nm or more, 475 nm or more, 500 nm or more, 525 nm or more, 550 nm or more, 575 nm or more, 600 nm or more, 625 nm or more, 650 nm or more, 675 nm or more, 700 nm or more, 725 nm or more, 750 nm or more, 775 nm or more, 800 nm or more, 825 nm or more, 850 nm or more, 875 nm or more, or 900 nm or more.
- the fluorophore may emit a wavelength of 900 nm or less, 875 nm or less, 850 nm or less, 825 nm or less, 800 nm or less, 775 nm or less, 750 nm or less, 725 nm or less, 700 nm or less, 675 nm or less, 650 nm or less, 625 nm or less, 600 nm or less, 575 nm or less, 550 nm or less, 525 nm or less, 500 nm or less, 475 nm or less, 450 nm or less, 425 nm or less, or 400 nm or less.
- the wavelength of light that the fluorophore emits may correspond to a color on the visible spectrum. Examples of colors include, but are not limited to, green, blue, red, yellow, orange, pink, purple, or a combination thereof.
- a fluorophore emitting light in a wavelength between about 400 nm and about 475 nm may produce a purple color.
- a fluorophore emitting in a wavelength between about 420 nm and about 530 nm may produce a blue color.
- a fluorophore emitting in a wavelength between about 490 nm and about 575 nm may produce a green color.
- a fluorophore emitting in a wavelength between about 530 nm and about 600 nm may produce a yellow color. In some embodiments, a fluorophore emitting in a wavelength between about 575 nm and about 750 nm may produce an orange color. In some embodiments, a fluorophore emitting in a wavelength between about 600 nm and about 800 nm may produce a red color.
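The emission ranges listed above overlap, so a single wavelength can fall under more than one named color. The following minimal sketch tabulates those approximate ranges and returns every color whose range contains a given wavelength. The dictionary, the function name, and the inclusive treatment of range boundaries are assumptions made for illustration only.

```python
# Approximate emission ranges in nanometers, as listed in the text above.
# The ranges overlap, so a wavelength may map to more than one color.
COLOR_RANGES_NM = {
    "purple": (400, 475),
    "blue": (420, 530),
    "green": (490, 575),
    "yellow": (530, 600),
    "orange": (575, 750),
    "red": (600, 800),
}


def candidate_colors(wavelength_nm):
    """Return all color names whose range contains the wavelength (bounds inclusive)."""
    return [
        name
        for name, (low, high) in COLOR_RANGES_NM.items()
        if low <= wavelength_nm <= high
    ]


print(candidate_colors(410))
print(candidate_colors(520))
print(candidate_colors(850))
```

Returning a list rather than a single color makes the overlap explicit, which is relevant when choosing labels that need to be optically distinct.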
- using a larger number of optically distinct detectable labels may lead to a detection process that identifies a target molecule in less time as compared to using a fewer number of optically distinct detectable labels when querying an amplification product.
- a detection scheme using a larger number of optically distinct detectable labels may lead to greater instrument complexity, which may lead to fluorescence detection crosstalk, whereby the fluorescence emission spectra of the optically distinct fluorescent moieties may not yield distinct fluorescence signals.
- a detection scheme using a greater number of optically distinct fluorescent moieties may require use of a more complex detection tool.
- the detectable labels may be optically distinct.
- FIG. 10B shows a set of four detection oligonucleotides (e.g., 1001, 1002, 1003, 1004) that are each optically distinct from one another. As such, detection oligonucleotides 1001, 1002, 1003, and 1004 emit different wavelengths of light when imaged.
- the fluorescent moieties may not be optically distinct.
- FIG. 10C shows a set of four detection oligonucleotides (e.g., 1005, 1005, 1005, and 1005) that are not optically distinct from one another. As such, detection oligonucleotides 1005, 1005, 1005, and 1005 emit similar wavelengths of light during imaging, resulting in target molecule data that is difficult to interpret.
Anchor Oligonucleotide
- the portion of the amplification product that the anchor oligonucleotide may be complementary to is a segment of a code.
- the anchor oligonucleotide may be complementary to a segment 1115 of a code on an amplified recognition element.
- the anchor oligonucleotide may comprise various nucleotide lengths. In some embodiments, the anchor oligonucleotide may comprise a length of about 20 to about 100 nucleotides. In some embodiments, the anchor oligonucleotide may comprise a length of about 30 to about 70 nucleotides. In some embodiments, the anchor oligonucleotide may comprise a length of about 40 to about 50 nucleotides. In some embodiments, the anchor oligonucleotide may comprise a length of about 10 to about 20 nucleotides.
- the anchor oligonucleotide may comprise a length of one or more nucleotides, two or more nucleotides, three or more nucleotides, four or more nucleotides, five or more nucleotides, six or more nucleotides, seven or more nucleotides, eight or more nucleotides, nine or more nucleotides, 10 or more nucleotides, 15 or more nucleotides, 20 or more nucleotides, 25 or more nucleotides, 30 or more nucleotides, 35 or more nucleotides, 40 or more nucleotides, 45 or more nucleotides, 50 or more nucleotides, 55 or more nucleotides, 60 or more nucleotides, 65 or more nucleotides, 70 or more nucleotides, 75 or more nucleotides, 80 or more nucleotides, 85 or more nucleotides, 90 or more nucleo
- the anchor oligonucleotide may comprise 100 or less nucleotides, 95 or less nucleotides, 90 or less nucleotides, 85 or less nucleotides, 80 or less nucleotides, 75 or less nucleotides, 70 or less nucleotides, 65 or less nucleotides, 60 or less nucleotides, 55 or less nucleotides, 50 or less nucleotides, 45 or less nucleotides, 40 or less nucleotides, 35 or less nucleotides, 30 or less nucleotides, 25 or less nucleotides, 20 or less nucleotides, 15 or less nucleotides, 10 or less nucleotides, nine or less nucleotides, eight or less nucleotides, seven or less nucleotides, six or less nucleotides, five or less nucleotides, four or less nucleotides, three or less nucleotides, or two or less
- FIG. 11B shows an example of a detection polynucleotide complex comprising a detection oligonucleotide 1130 and an anchor oligonucleotide 1140.
- the detection oligonucleotide and the anchor oligonucleotide form a detection polynucleotide complex.
- the detection polynucleotide complex when associated with the target molecule, forms a detectable binding complex that is suitable for detection using the imaging system of the present disclosure.
- a plurality of detectable binding complexes is formed when a pool of detection oligonucleotides, anchor oligonucleotides, or detection polynucleotide complexes are introduced to a plurality of target molecules from a sample.
- upon formation of detectable binding complexes, a plurality of signals may be observed, wherein one or more signals correspond to a single detectable binding complex, thereby generating a signal profile for a code that can be decoded.
- the methods herein may use 50 or less, 45 or less, 40 or less, 35 or less, 30 or less, 25 or less, 20 or less, 15 or less, ten or less, nine or less, eight or less, seven or less, six or less, five or less, four or less, three or less, or two or less detection pools.
- the intensity footprints of all amplified recognition elements that are confidently assigned to type k by a basic "hard" decoding operation are averaged.
- the hypercode profiles can be refined using empirical intensity data, effectively applying one or more iterations of a clustering algorithm to the initial values.
- the intensity profiles can be scaled to unit norm.
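The averaging and scaling steps described above can be sketched as a single refinement pass: footprints are grouped by their hard-decoded assignment, averaged element-wise, and scaled to unit norm. This is a minimal sketch under stated assumptions. The function names and data are hypothetical, and the Euclidean (L2) norm is assumed because the text does not specify which norm is used.

```python
import math
from collections import defaultdict


def unit_norm(profile):
    """Scale a profile so that its Euclidean (L2) norm equals 1."""
    norm = math.sqrt(sum(v * v for v in profile))
    if norm == 0:
        raise ValueError("cannot normalize an all-zero profile")
    return [v / norm for v in profile]


def refine_profiles(footprints, assignments):
    """Average the footprints assigned to each type, then scale to unit norm."""
    grouped = defaultdict(list)
    for footprint, code_type in zip(footprints, assignments):
        grouped[code_type].append(footprint)
    refined = {}
    for code_type, members in grouped.items():
        mean = [sum(column) / len(members) for column in zip(*members)]
        refined[code_type] = unit_norm(mean)
    return refined


# Hypothetical footprints (one intensity per channel/flow) and hard assignments.
footprints = [[3.0, 4.0], [3.0, 4.0], [0.0, 2.0]]
assignments = ["A", "A", "B"]
print(refine_profiles(footprints, assignments))
```

Re-assigning footprints to the refined profiles and repeating the pass would correspond to further iterations of a k-means-style clustering update.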
- Intensity profiles can be used to decode amplified recognition elements, which can be used as an indirect indicator of whether a target of interest is present in a biological sample.
- a matrix-variate Gaussian mixture model can be proposed to capture the correlation of the fluorescence signals across color channels and flows, wherein this model can be used in a Bayesian estimator called "PosTCode".
- a prior distribution of the probabilities of targets is assumed.
- the conditional probability of a given fluorescent signature over all channels and readout flows given a particular code in the codespace can be modeled as a Gaussian mixture with correlations over the channels and over the flows.
- the posterior probability of each codeword is then computed from the conditional probabilities and the prior distribution. The process can be iterated since the correlation matrices in the model need to also be estimated.
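The posterior computation described above can be written out with Bayes' rule. The notation here is assumed for illustration and does not appear in the disclosure: $X$ is the $C \times F$ matrix of intensities over $C$ color channels and $F$ flows, $\pi_k$ is the prior probability of code $k$ among $K$ codes, $M_k$ is the expected intensity matrix for code $k$, and $\Sigma_C$ and $\Sigma_F$ are the correlation matrices over channels and flows. For simplicity, a single matrix-variate Gaussian component per code is shown rather than a full mixture.

```latex
P(k \mid X) = \frac{\pi_k \, p(X \mid k)}{\sum_{j=1}^{K} \pi_j \, p(X \mid j)},
\qquad
p(X \mid k) =
\frac{\exp\!\left(-\tfrac{1}{2}\,\operatorname{tr}\!\left[\Sigma_F^{-1}\,(X - M_k)^{\top}\,\Sigma_C^{-1}\,(X - M_k)\right]\right)}
     {(2\pi)^{CF/2}\,\lvert\Sigma_C\rvert^{F/2}\,\lvert\Sigma_F\rvert^{C/2}}
```

Because $\Sigma_C$ and $\Sigma_F$ are not known in advance, they would be re-estimated from the currently assigned data between posterior updates, which is why the procedure is iterative.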
- the model can be complex, but also very powerful as long as the probability distributions involved are members of the exponential family.
- the model includes fitting a large number of parameters, particularly as plexity increases, and empirical data may not always be a good match to the Gaussian mixture model.
- typically a more robust decode performance was obtained using the Manhattan distance, rather than the sum-squared distance, which would be appropriate for underlying Gaussian statistics.
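The difference between the two distance measures can be seen in a small nearest-profile decoder. The sketch below is illustrative only. The profiles, the observed values, and the function names are hypothetical, and the example is constructed so that a single outlying intensity changes the sum-squared result but not the Manhattan result.

```python
def manhattan(a, b):
    """Sum of absolute differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def sum_squared(a, b):
    """Sum of squared differences (squared L2 distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_code(observed, profiles, distance=manhattan):
    """Return the code whose reference profile is closest to the observed profile."""
    return min(profiles, key=lambda code: distance(observed, profiles[code]))


# Hypothetical reference profiles and an observation of code "A"
# in which the last intensity is an outlier.
profiles = {"A": [1.0, 1.0, 1.0, 1.0], "B": [2.0, 2.0, 2.0, 2.0]}
observed = [1.0, 1.0, 1.0, 4.0]

print(nearest_code(observed, profiles))                        # Manhattan
print(nearest_code(observed, profiles, distance=sum_squared))  # sum-squared
```

Squaring the differences gives a single large deviation more weight than several small ones, which is one reason an absolute-difference measure can be more tolerant of data that depart from Gaussian statistics.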
- the number of distinctive hypercodes profiled can scale with the number of cycles and resolvable optical signatures at each cycle, thereby expanding the potential for assay complexity (FIG. 20C).
- hypercodes were designed within the framework of a 4-state system, with variable number of segments, and variable number of cycles for decoding the segments.
- assay complexities comprising up to 12,000 hypercodes were achieved.
- errors in determining a hypercode profile were mitigated by reducing the number of segments in a hypercode and enabling two separate optical signatures per segment (FIG. 15).
- the methods described herein may relate to using an algorithm to predict the presence of target nucleic acid molecules in a sample.
- the algorithm is a soft decision decoding algorithm.
- the algorithm is applied to the detectable signals of the codes or amplified codes (i.e., a code profile) for predicting the presence of a target nucleic acid in a sample.
- the methods disclosed herein may comprise soft decision decoding to predict, or determine the probability of, the presence of the code in a recognition element or amplification product thereof, wherein the presence of the code correlates with, and serves as a proxy for, the presence of a target nucleic acid in a sample.
- the methods described herein may use soft decision decoding.
- the methods described herein may use hard decision decoding.
- in hard decision decoding, signals from queried concatemers are extracted from images. The same is true for soft decision decoding, in that signals that are generated and imaged are extracted from the images.
- in hard decision decoding, hard basecalls are generated from the intensities of the signals, whereas in soft decision decoding no hard basecalls are necessary because the full signal range is retained.
- the code assignment for hard decision decoding is determined by matching the nucleotide reads to codes, whereas with soft decision decoding the signals are cross-correlated against the expected signals and the most likely code is assigned. As such, soft decision decoding is a probabilistic methodology. When using soft decision decoding techniques, it is not necessary for the model to identify each base specifically.
- signals generated during each cycle of a detection process may be detected and recorded to produce a data set that may be used as input into a model to calculate a probability that a specific code is present without requiring a hard decision decoding model.
- a soft decision decoding model developed according to the methods of the disclosure may nevertheless include assigning a probability or identity to each nucleotide in the sequence of a code.
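The contrast between hard and soft decision decoding can be sketched with a toy example. The hard decoder calls the brightest channel in each cycle and requires an exact codeword match, while the soft decoder keeps the full intensities and scores each codeword by correlating the signals with the expected (one-hot) pattern. The codebook, the intensities, and the function names are hypothetical, and the unnormalized correlation score is a simplification chosen for illustration.

```python
def hard_decode(intensities, codebook):
    """Call the brightest channel per cycle, then require an exact codeword match."""
    calls = tuple(
        max(range(len(cycle)), key=cycle.__getitem__) for cycle in intensities
    )
    for name, codeword in codebook.items():
        if tuple(codeword) == calls:
            return name
    return None  # no codeword matches the hard calls


def soft_decode(intensities, codebook):
    """Score each codeword by the summed intensity in its expected channels."""
    def score(codeword):
        return sum(cycle[channel] for cycle, channel in zip(intensities, codeword))

    return max(codebook, key=lambda name: score(codebook[name]))


# Hypothetical codebook: expected channel index per cycle (3 cycles, 4 channels).
codebook = {"code_1": (0, 1, 2), "code_2": (3, 2, 1)}

# Hypothetical intensities; the second cycle is ambiguous between channels 1 and 2.
intensities = [
    [0.90, 0.05, 0.00, 0.05],
    [0.10, 0.42, 0.45, 0.03],
    [0.00, 0.10, 0.80, 0.10],
]

print(hard_decode(intensities, codebook))
print(soft_decode(intensities, codebook))
```

In this example the ambiguous second cycle produces a hard call that matches no codeword, while the soft score still favors `code_1` because the evidence from the other cycles is retained.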
- the permutation space on a recognition element is the totality of factors that determines the number of unique nucleotide possibilities at each segment of a code.
- the factors of the code space comprise the number of segments present on the recognition element, the number of incubation periods or times a segment is queried with a detection pool (e.g., flows or cycles), and the number of computational states used in the methods and systems described herein.
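To make the scaling of the code space concrete, the sketch below computes an upper bound on the number of distinct code profiles from the three factors named above. It assumes, for illustration only, that each readout (one flow of one segment) independently takes one of the available computational states, so the bound is the number of states raised to the number of readouts. The function names and parameters are hypothetical.

```python
from itertools import product


def code_space_size(num_states, num_segments, flows_per_segment=1):
    """Upper bound on distinct code profiles: states ** (segments * flows)."""
    readouts = num_segments * flows_per_segment
    return num_states ** readouts


def enumerate_codes(num_states, num_readouts):
    """List every possible profile as a tuple of state indices (small cases only)."""
    return list(product(range(num_states), repeat=num_readouts))


print(code_space_size(4, 3))      # 4 states, 3 segments, 1 flow each
print(code_space_size(4, 2, 2))   # 4 states, 2 segments, 2 flows each
print(len(enumerate_codes(4, 3)))
```

A usable codeset would generally be smaller than this bound once selection criteria such as a minimum Hamming distance between codes are applied.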
- a confidence score is computed from the difference between the intensity profile of each feature and the trained profiles. Several filters are applied to remove outliers, duplicates, and low confidence decoded concatemers. The final output is a table of decoded concatemers with associated filter status, confidence score, and most likely assignment to one of the codes of the codeset used in the recognition elements of the assay.
- a recognition element comprises a larger code, for example a code with four segments instead of two or three.
- a recognition element comprising a larger code may result in a detection scheme with improved error correction compared to a smaller code. Additionally, in some embodiments, a larger code may result in a lower signal-to-noise ratio.
- a recognition element comprises a smaller code, for example a code with two segments.
- a recognition element comprising a smaller code may result in a detection scheme with lower error correction abilities. Further, in some embodiments, a smaller code may result in a higher signal-to-noise ratio.
Error Correction for Profile Decoding
- FIG. 16A shows representative intensity profiles for 100 concatenated amplification products with a profile score of ⁇ 0.67 that decode to the hypercode with median skew.
- the learned intensity profile and the ideal intensity profile are superimposed and reflect the systematic deviations in the measured intensity associated with this hypercode.
- FIG. 16B shows representative intensity profiles for 100 concatenated amplification products with a profile score of ⁇ 0.71 that decode to the hypercode with high skew
- FIG. 16C shows representative intensity profiles for 100 concatenated amplification products with a profile score of ⁇ 0.65 that decode to the hypercode with high skew.
- a simple hybridization based detection workflow eliminates the need for multi-component detection reagents and detection enzymatic chemistries.
- Detection schemes can be tuned to address assay requirements such as minimized decode error rates, reduced readout time, higher plexity and/or improved dynamic range quantification of a small number of targets of interest by varying the number of fluorescence states and cycles.
- machine learning based hypercode decoding allows for systematic noise compensation and robust error correction beyond the biological limitations of ligases, while keeping decoding time low for high plexity panels.
Iteratively repeating operations
- the methods described herein may relate to iteratively repeating the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes.
- the iterative repetition of the operations may be performed for each segment of a code.
- the iteratively repeating the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes may comprise a number of iterative repetitions.
- the methods described herein may comprise about 2 to about 50 iterative repetitions, about 2 to about 10 iterative repetitions, about 2 to about 8 iterative repetitions, or about four iterative repetitions of the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes.
- the method described herein may comprise one or more, two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, nine or more, 10 or more, 15 or more, 20 or more, 25 or more, 30 or more, 35 or more, 40 or more, 45 or more, or 50 or more iterative repetitions of the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes.
- the method described herein may comprise 50 or less, 45 or less, 40 or less, 35 or less, 30 or less, 25 or less, 20 or less, 15 or less, 10 or less, nine or less, eight or less, seven or less, six or less, five or less, four or less, three or less, or two or less iterative repetitions of the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes.
- the number of iterative repetitions of the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes may correspond to the number of segments present in a code of the recognition element.
- the code of the recognition element may comprise a number of segments.
- each segment of the code of the recognition element may undergo a number of iterative repetitions of the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes.
- the sample may comprise a biological sample.
- the sample may comprise whole blood, lymphatic fluid, serum, plasma, sweat, tears, saliva, sputum, cerebrospinal fluid, amniotic fluid, seminal fluid, vaginal excretion, serous fluid, synovial fluid, pericardial fluid, peritoneal fluid, pleural fluid, transudates, exudates, cystic fluid, bile, urine, gastric fluid, intestinal fluid, fecal samples, liquids containing single or multiple cells, liquids containing organelles, fluidized tissues, fluidized organisms, liquids containing multi-celled organisms, biological swabs, biological washes, or a combination thereof.
- the sample may comprise whole blood.
- the whole blood may be venous blood or capillary blood (e.g., obtained by a fingerstick).
- the target nucleic acid molecules may include, but are not limited to, DNA or RNA.
- the target nucleic acid molecules may include DNA.
- the target nucleic acid molecules may include RNA.
- the target nucleic acid molecules may include a combination of DNA and RNA.
- the target nucleic acid molecules are fragments or components of DNA.
- the target nucleic acid molecules are fragments or components of RNA.
- the target nucleic acid molecules comprise complementary DNA or cDNA, transcribed from RNA (e.g., mRNA).
- the target nucleic acid molecules comprise mRNA.
- the target nucleic acid molecules may comprise DNA.
- the DNA may be genomic DNA.
- the DNA may include one or more single nucleotide variants (SNVs), single nucleotide polymorphisms (SNPs), insertions/deletions (indels), copy number variants (CNVs), methylated nucleotides, or any combination thereof.
- the DNA may include cell-free DNA (cfDNA).
- the cfDNA may include maternal cfDNA, fetal cfDNA, or combinations thereof, or cfDNA from a solid tumor.
- the DNA may include circulating tumor cell DNA or ctcDNA.
- the DNA may include a synthetic DNA target, such as a product of a polymerase chain reaction (PCR).
- the DNA may be transcribed from single-stranded RNA templates, such as complementary DNA (cDNA) from a first strand or second strand synthesis reaction.
- the DNA may be PCR-amplified or RT-PCR-amplified DNA or complementary DNA (cDNA).
- the DNA is genomic DNA or fragmented genomic DNA.
- the target nucleic acid molecules may comprise RNA.
- the RNA may include messenger RNA (mRNA).
- the mRNA may be a splice variant.
- the RNA may include microRNA (miRNA).
- the RNA may include pre-miRNA.
- the RNA may include pri-miRNA.
- the RNA may include mRNA.
- the RNA may include pre-mRNA.
- the RNA may include viral RNA.
- the RNA may include viroid RNA.
- the RNA may include virusoid RNA.
- the RNA may include circular RNA (circRNA).
- the RNA may include ribosomal RNA (rRNA).
- the RNA may include transfer RNA (tRNA).
- the RNA may include pre-tRNA.
- the RNA may include long non-coding RNA (lncRNA).
- the RNA may include small nuclear RNA (snRNA).
- the RNA may include circulating RNA.
- the RNA may include cell-free RNA.
- the RNA may include exosomal RNA.
- the RNA may include vector-expressed RNA.
- the RNA may include synthetic RNA.
- the systems disclosed herein relate to detecting a target nucleic acid molecule.
- the systems comprise a plurality of recognition elements and a plurality of detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides.
- the systems comprise a solid substrate configured to immobilize detectable binding complexes.
- the systems comprise a welled plate or a flow cell.
- the systems comprise a fluid flow controller, a temperature controller, an imaging system, a computer system, or any combination thereof.
- the systems comprise a plurality of recognition elements of the present disclosure.
- the recognition element of the plurality comprises one or more target regions complementary to a corresponding target nucleic acid molecule in a sample.
- the recognition element comprises a code of a set of codes, wherein the code is associated with one or more target nucleic acid molecules in the sample.
- the code comprises a plurality of segments that correspond to one or more computational states of a set of computational states.
- the plurality of detection polynucleotide complexes comprises a detection oligonucleotide and an anchor oligonucleotide, wherein a portion of each anchor oligonucleotide is complementary to a portion of a different segment of the plurality of segments; and another portion of the anchor oligonucleotide is complementary to at least a portion of the detection oligonucleotide.
- the systems disclosed herein may include a solid substrate or a solid surface.
- the solid substrates and surfaces disclosed herein may be referred to as a substrate, a support, a solid support, or a surface.
- the substrate may be configured to immobilize a detectable binding complex.
- the substrate may be configured to immobilize circularized modified recognition elements, amplification products, or both.
- the substrate may be modified for attachment of a nucleic acid described herein.
- the substrate may be modified for attachment of the amplification products described herein.
- Example solid substrates include, but are not limited to, glass, modified or functionalized glass, plastics, polysaccharides, nylon, nitrocellulose, ceramics, resins, silica, silica-based materials, carbon, metals, inorganic glasses, plastics, optical fiber bundles, and other polymers.
- the plastic solid substrates may include acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, or polyurethanes.
- the silica-based solid substrates may include silicon or modified silicon.
- the substrate may be a welled plate. In some embodiments, the substrate may be a 96-well plate. In some embodiments, the substrate may be a 4-well plate, a 6-well plate, an 8-well plate, a 12-well plate, a 24-well plate, a 48-well plate, a 384-well plate, an 864-well plate, or a 1,536-well plate. In some embodiments, the substrate may have greater than or equal to 96 wells. In some embodiments, the substrate may have less than or equal to 96 wells. In some embodiments, the substrate may be a flow cell. In some embodiments, the flow cell may have two or more lanes. In some embodiments, the flow cell may have two or fewer lanes.
- the substrate may be a microarray, a slide, a chip, a microwell, a tube, a column, a particle, or a bead.
- the substrate may be a microarray, such as a DNA microarray.
- the substrate may be a paramagnetic bead.
- the substrate may comprise a coating.
- the coating may comprise a layer that may be charged.
- the coating layer may be positively charged.
- the coating layer may be negatively charged.
- the coating may be non-charged.
- the substrate may comprise a surface comprising a cation-coating layer.
- the substrate may comprise a surface comprising an anion-coating layer.
- the substrate may comprise a surface comprising a neutral-charged layer.
- the substrate may be coated with streptavidin.
- the substrate may be coated with avidin.
- the substrate may be coated with one or more antibodies.
- the systems disclosed herein may include a fluid flow controller, a temperature controller, an imaging system, a computer system, or any combination thereof.
- the systems disclosed herein may comprise a fluidics system.
- the fluidics system may comprise a fluid flow controller.
- the fluid flow controller may comprise one or more pumps, valves, mixing manifolds, reagent reservoirs, waste reservoirs, or any combination thereof.
- the fluidics system and subcomponents of the fluidics system are fluidically connected to the reaction vessel of the present disclosure.
- the fluidics system and subcomponents of the fluidics system iteratively flow reagents (e.g., buffers, detection oligonucleotides, anchor oligonucleotides, detection polynucleotide complexes, etc.) into the reaction vessel.
- the reaction vessel comprises a solid substrate configured to immobilize the modified recognition elements or amplification products thereof.
- the systems disclosed herein may comprise a temperature system.
- the temperature system may comprise a temperature controller.
- the temperature controller may be incorporated into the systems described herein to facilitate accuracy of the methods and systems described herein.
- the temperature controller may comprise temperature control components.
- Non-limiting examples of temperature control components include resistive heating elements, infrared light sources, heating or cooling devices, heat sinks, thermocouples, thermistors, or a combination thereof.
- the temperature controller may provide changes in temperature over specified time intervals.
- the temperature controller may provide an increase in temperature.
- the temperature controller may provide a decrease in temperature.
- the temperature controller may provide for cycling of temperatures between two or more set temperatures so that thermocycling or amplification may be performed.
- the temperature controller may provide a constant temperature.
- the systems disclosed herein may comprise an imaging system.
- signals produced by the detectable binding complexes disclosed herein may be imaged by the imaging systems disclosed herein.
- the imaging system may comprise one or more light sources, one or more optical components, one or more filters, one or more imaging sensors for imaging and detection, or a combination thereof.
- the one or more light sources may comprise light from a bulb.
- the one or more optical components may comprise lenses, mirrors, prisms, optical filters, colored glass filters, narrowband interference filters, broadband interference filters, dichroic reflectors, diffraction gratings, apertures, optical fibers, optical waveguides, or a combination thereof.
- the one or more imaging sensors may comprise a charge-coupled device (CCD) sensor or camera, a complementary metal-oxide-semiconductor (CMOS) imaging sensor or camera, a negative-channel metal-oxide semiconductor (NMOS) imaging sensor or camera, or a combination thereof.
- Referring to FIG. 23, a block diagram is shown depicting an exemplary machine that includes a computer system 2300 (e.g., a processing or computing system) within which a set of instructions can execute for causing a device to perform or execute any one or more of the aspects and/or methodologies for static code scheduling of the present disclosure.
- the components in FIG. 23 are examples only and do not limit the scope of use or functionality of any hardware, software, embedded logic component, or a combination of two or more such components implementing particular embodiments.
- Computer system 2300 may include one or more processors 2301, a memory 2303, and a storage 2308 that communicate with each other, and with other components, via a bus 2340.
- the bus 2340 may also link a display 2332, one or more input devices 2333 (which may, for example, include a keypad, a keyboard, a mouse, a stylus, etc.), one or more output devices 2334, one or more storage devices 2335, and various tangible storage media 2336. All of these elements may interface directly or via one or more interfaces or adaptors to the bus 2340.
- the various tangible storage media 2336 can interface with the bus 2340 via storage medium interface 2326.
- Computer system 2300 may have any suitable physical form, including but not limited to one or more integrated circuits (ICs), printed circuit boards (PCBs), mobile handheld devices (such as mobile telephones or PDAs), laptop or notebook computers, distributed computer systems, computing grids, or servers.
- Computer system 2300 includes one or more processor(s) 2301 (e.g., central processing units (CPUs), general purpose graphics processing units (GPGPUs), or quantum processing units (QPUs)) that carry out functions.
- Processor(s) 2301 optionally contain a cache memory unit.
- Processor(s) 2301 are configured to assist in execution of computer readable instructions.
- Computer system 2300 may provide functionality for the components depicted in FIG. 23 as a result of the processor(s) 2301 executing non-transitory, processor-executable instructions embodied in one or more tangible computer-readable storage media, such as memory 2303, storage 2308, storage devices 2335, and/or storage medium 2336.
- the computer-readable media may store software that implements particular embodiments, and processor(s) 2301 may execute the software.
- Memory 2303 may read the software from one or more other computer-readable media (such as mass storage device(s) 2335, 2336) or from one or more other sources through a suitable interface, such as network interface 2320.
- the software may cause processor(s) 2301 to carry out one or more processes or one or more steps of one or more processes described or illustrated herein. Carrying out such processes or steps may include defining data structures stored in memory 2303 and modifying the data structures as directed by the software.
- the memory 2303 may include various components (e.g., machine readable media) including, but not limited to, a random-access memory component (e.g., RAM 2304) (e.g., static RAM (SRAM), dynamic RAM (DRAM), ferroelectric random access memory (FRAM), phase-change random access memory (PRAM), etc.), a read-only memory component (e.g., ROM 2305), and any combinations thereof.
- ROM 2305 may act to communicate data and instructions unidirectionally to processor(s) 2301
- RAM 2304 may act to communicate data and instructions bidirectionally with processor(s) 2301.
- ROM 2305 and RAM 2304 may include any suitable tangible computer-readable media described below.
- a basic input/output system 2306 (BIOS) including basic routines that help to transfer information between elements within computer system 2300, such as during start-up, may be stored in the memory 2303.
- Fixed storage 2308 is connected bidirectionally to processor(s) 2301, optionally through storage control unit 2307.
- Fixed storage 2308 provides additional data storage capacity and may also include any suitable tangible computer-readable media described herein.
- Storage 2308 may be used to store operating system 2309, executable(s) 2310, data 2311, applications 2312 (application programs), and the like.
- Storage 2308 can also include an optical disk drive, a solid-state memory device (e.g., flash-based systems), or a combination of any of the above.
- Information in storage 2308 may, in appropriate cases, be incorporated as virtual memory in memory 2303.
- storage device(s) 2335 may be removably interfaced with computer system 2300 (e.g., via an external port connector (not shown)) via a storage device interface 2325.
- storage device(s) 2335 and an associated machine-readable medium may provide non-volatile and/or volatile storage of machine-readable instructions, data structures, program modules, and/or other data for the computer system 2300.
- software may reside, completely or partially, within a machine-readable medium on storage device(s) 2335.
- software may reside, completely or partially, within processor(s) 2301.
- Bus 2340 connects a wide variety of subsystems.
- reference to a bus may encompass one or more digital signal lines serving a common function, where appropriate.
- Bus 2340 may be any of several types of bus structures including, but not limited to, a memory bus, a memory controller, a peripheral bus, a local bus, and any combinations thereof, using any of a variety of bus architectures.
- such architectures include an Industry Standard Architecture (ISA) bus, an Enhanced ISA (EISA) bus, a Micro Channel Architecture (MCA) bus, a Video Electronics Standards Association local bus (VLB), a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCI-X) bus, an Accelerated Graphics Port (AGP) bus, a HyperTransport (HTX) bus, a serial advanced technology attachment (SATA) bus, and any combinations thereof.
- Computer system 2300 may also include an input device 2333.
- a user of computer system 2300 may enter commands and/or other information into computer system 2300 via input device(s) 2333.
- Examples of an input device(s) 2333 include, but are not limited to, an alpha-numeric input device (e.g., a keyboard), a pointing device (e.g., a mouse or touchpad), a touchpad, a touch screen, a multi-touch screen, a joystick, a stylus, a gamepad, an audio input device (e.g., a microphone, a voice response system, etc.), an optical scanner, a video or still image capture device (e.g., a camera), and any combinations thereof.
- the input device is a Kinect, Leap Motion, or the like.
- Input device(s) 2333 may be interfaced to bus 2340 via any of a variety of input interfaces 2323 (e.g., input interface 2323) including, but not limited to, serial, parallel, game port, USB, FIREWIRE, THUNDERBOLT, or any combination of the above.
- Information and data can be displayed through a display 2332.
- Examples of a display 2332 include, but are not limited to, a cathode ray tube (CRT), a liquid crystal display (LCD), a thin film transistor liquid crystal display (TFT-LCD), an organic light-emitting diode (OLED) display such as a passive-matrix OLED (PMOLED) or active-matrix OLED (AMOLED) display, a plasma display, and any combinations thereof.
- the display 2332 can interface to the processor(s) 2301, memory 2303, and fixed storage 2308, as well as other devices, such as input device(s) 2333, via the bus 2340.
- the computer programs described herein may apply selection criterion or selection criteria to a set of nucleic acid segments.
- the computer programs may sort the nucleic acid segments, determine or compute characteristics of the nucleic acid segments, perform calculations, reorder the nucleic acid segments, or a combination thereof.
- the computer programs described herein may store information related to the nucleic acid segments.
- the computer program may use information stored related to the nucleic acid segments to apply selection criteria to the nucleic acid segments.
- the computer program may receive information and/or data related to nucleic acid segments, selection criteria, or a combination thereof.
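- As a non-limiting illustration, the following minimal Python sketch shows one way a program might apply selection criteria to a set of nucleic acid segments and reorder the segments that pass. The two criteria (GC fraction and longest homopolymer run), their thresholds, and all function names are assumptions chosen for illustration; the disclosure does not specify them.

```python
from typing import Callable, List, Sequence


def gc_fraction(segment: str) -> float:
    """Fraction of G or C bases in a segment (0.0 for an empty segment)."""
    if not segment:
        return 0.0
    s = segment.upper()
    return (s.count("G") + s.count("C")) / len(s)


def longest_homopolymer(segment: str) -> int:
    """Length of the longest run of a single repeated base."""
    longest = run = 0
    previous = ""
    for base in segment.upper():
        run = run + 1 if base == previous else 1
        longest = max(longest, run)
        previous = base
    return longest


def default_criteria() -> List[Callable[[str], bool]]:
    """Hypothetical selection criteria; thresholds are illustrative only."""
    return [
        lambda s: 0.35 <= gc_fraction(s) <= 0.65,
        lambda s: longest_homopolymer(s) <= 3,
    ]


def select_segments(
    segments: Sequence[str], criteria: Sequence[Callable[[str], bool]]
) -> List[str]:
    """Keep segments that pass every criterion, ordered by closeness to 50% GC."""
    kept = [s for s in segments if all(rule(s) for rule in criteria)]
    return sorted(kept, key=lambda s: abs(gc_fraction(s) - 0.5))


candidates = ["ACGTTAGCAT", "AAAAACGTGC", "ACGTACGTAC", "GGGGCCCCGG"]
selected = select_segments(candidates, default_criteria())
```

- In this sketch each criterion is a separate function, so criteria received as input can be added or removed without changing the selection routine.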
- an application provision system alternatively has a distributed, cloud-based architecture 2500 and comprises elastically load balanced, auto-scaling web server resources 2510 and application server resources 2520 as well as synchronously replicated databases 2530.
- a computer program includes a mobile application provided to a mobile computing device.
- the mobile application is provided to a mobile computing device at the time it is manufactured.
- the mobile application is provided to a mobile computing device via the computer network described herein.
- a compiler is a computer program(s) that transforms source code written in a programming language into binary object code such as assembly language or machine code.
- Suitable compiled programming languages include, by way of non-limiting examples, C, C++, Objective-C, COBOL, Delphi, Eiffel, Java™, Lisp, Python™, Visual Basic, and VB .NET, or combinations thereof. Compilation is often performed, at least in part, to create an executable program.
- a computer program includes one or more executable compiled applications.
- plug-in frameworks are available that enable development of plug-ins in various programming languages, including, by way of non-limiting examples, C++, Delphi, Java™, PHP, Python™, and VB .NET, or combinations thereof.
- Web browsers are software applications, designed for use with network-connected computing devices, for retrieving, presenting, and traversing information resources on the World Wide Web. Suitable web browsers include, by way of non-limiting examples, Microsoft® Internet Explorer®, Mozilla® Firefox®, Google® Chrome, Apple® Safari®, Opera Software® Opera®, and KDE Konqueror. In some embodiments, the web browser is a mobile web browser. Mobile web browsers (also called microbrowsers, mini-browsers, and wireless browsers) are designed for use on mobile computing devices including, by way of non-limiting examples, handheld computers, tablet computers, netbook computers, subnotebook computers, smartphones, music players, personal digital assistants (PDAs), and handheld video game systems.
- Suitable mobile web browsers include, by way of non-limiting examples, Google® Android® browser, RIM BlackBerry® Browser, Apple® Safari®, Palm® Blazer, Palm® WebOS® Browser, Mozilla® Firefox® for mobile, Microsoft® Internet Explorer® Mobile, Amazon® Kindle® Basic Web, Nokia® Browser, Opera Software® Opera® Mobile, and Sony® PSP™ browser.
- the platforms, systems, media, and methods disclosed herein include software, server, and/or database modules, or use of the same.
- software modules are created by techniques known to those of skill in the art using machines, software, and languages known to the art.
- the software modules disclosed herein are implemented in a multitude of ways.
- a software module comprises a file, a section of code, a programming object, a programming structure, a distributed computing resource, a cloud computing resource, or combinations thereof.
- a software module comprises a plurality of files, a plurality of sections of code, a plurality of programming objects, a plurality of programming structures, a plurality of distributed computing resources, a plurality of cloud computing resources, or combinations thereof.
- the one or more software modules comprise, by way of non-limiting examples, a web application, a mobile application, a standalone application, and a distributed or cloud computing application.
- software modules are in one computer program or application. In other embodiments, software modules are in more than one computer program or application. In some embodiments, software modules are hosted on one machine. In other embodiments, software modules are hosted on more than one machine. In further embodiments, software modules are hosted on a distributed computing platform such as a cloud computing platform. In some embodiments, software modules are hosted on one or more machines in one location. In other embodiments, software modules are hosted on one or more machines in more than one location.
- the platforms, systems, media, and methods disclosed herein include one or more databases, or use of the same.
- suitable databases include, by way of non-limiting examples, relational databases, non-relational databases, object oriented databases, object databases, entity-relationship model databases, associative databases, XML databases, document oriented databases, and graph databases. Further non-limiting examples include SQL, PostgreSQL, MySQL, Oracle, DB2, Sybase, and MongoDB.
- a database is Internet-based.
- a database is web-based.
- a database is cloud computing-based.
- a database is a distributed database.
- a database is based on one or more local computer storage devices.
- the kits may comprise one or more buffers or reagents. In some embodiments, the kits may comprise two buffers or reagents, or a combination thereof. In some embodiments, a first buffer or reagent of the kits described herein may be configured to promote hybridization. In some embodiments, a second buffer or reagent of the kits described herein may be configured to promote de-hybridization. In some embodiments, the kits may comprise one or more reagents. In some embodiments, the kits may comprise three reagents. In some embodiments, the first reagent of the kits described herein may comprise a set of recognition elements.
- the second reagent of the kits described herein may comprise a set of detection polynucleotide complexes, wherein each detection polynucleotide complex may comprise a hybridized detection oligonucleotide and an anchor oligonucleotide.
- the second reagent of the kits described herein may comprise a set of detection oligonucleotides and a set of associated anchor oligonucleotides, wherein a subset of the detection oligonucleotides and a subset of associated anchor oligonucleotides may be assembled into detection polynucleotide complexes prior to a detection assay, whereas the remaining detection oligonucleotides and associated anchor oligonucleotides may assemble in the detection assay proper.
- the third reagent of the kits described herein may comprise a sample comprising a plurality of target nucleic acid molecules.
- the first reagent, the second reagent, and the third reagent may be found in separate containers within the kit. In some embodiments, the first reagent, the second reagent, and the third reagent may be found in the same container within the kit.
- the kits may comprise instructions for use, a manual, a protocol, or a combination thereof. In some embodiments, the kits may comprise a tube, a bottle, a glass jar, a container, or a combination thereof. In some embodiments, the kits may comprise instrumentation, including but not limited to a centrifuge, a heating element, a cooling element, a shaker, an incubator, or a combination thereof.
- the kit may comprise components configured to perform any one of the methods described herein.
- components of the kit may be stored at room temperature, below room temperature, below 15°C, below 10°C, below 5°C, below 0°C, below -20°C, or below -40°C, or a combination thereof.
- different components within the kit may be stored at different temperatures.
- the term “about” in some cases refers to an amount that is approximately the stated amount or that is near the stated amount by 10%, 5%, or 1%, including increments therein.
- each of the expressions “at least one of A, B and C”, “at least one of A, B, or C”, “one or more of A, B, and C”, “one or more of A, B, or C” and “A, B, and/or C” means A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B and C together.
- “Linked” with respect to two nucleic acids means not only a fusion of a first moiety to a second moiety at the C-terminus or the N-terminus, but also includes insertion of the first moiety to the second moiety into a common nucleic acid.
- the nucleic acid A may be linked directly to nucleic acid B such that A is adjacent to B (-A-B-), but nucleic acid A may be linked indirectly to nucleic acid B, by intervening nucleotide or nucleotide sequence C between A and B (e.g., -A-C-B- or -B-C-A-).
- the term “linked” is intended to encompass these various possibilities.
- “Set” includes sets of one or more elements or objects.
- a “subset” of a set includes any number of elements or objects from the set, from one up to all of the elements of the set.
- “Subject” includes any plant or animal, including without limitation, humans.
- “Detecting” or “decoding” with respect to a code includes determining the presence of a known code or a probability of the presence of a known code with or without determining the nucleotide by nucleotide sequence of the code.
- Decoding may be hard decision decoding.
- Decoding may be soft decision decoding.
- “Hard decision decoding” or “hard decision” refers to a method or model that includes making a call for each nucleotide in a nucleic acid segment (commonly referred to as a “base call”) in order to determine the sequence of nucleotides in the nucleic acid segment.
- Models of the inventive concepts incorporate hard decision decoding models.
- the particular nucleic acid being detected may be or include a code of the inventive concepts described herein.
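- As a non-limiting illustration of hard decision decoding as defined above, the following minimal Python sketch makes one call per position by taking the brightest channel, then looks up the called sequence in a set of known codes. The channel order, the intensity values, the codebook contents, and the function names are assumptions for illustration only.

```python
from typing import Dict, Optional, Sequence

BASES = "ACGT"  # assumed channel order: one intensity channel per base


def call_bases(intensities: Sequence[Sequence[float]]) -> str:
    """Make one hard call per position: the base whose channel is brightest."""
    calls = []
    for row in intensities:
        brightest = max(range(len(BASES)), key=lambda i: row[i])
        calls.append(BASES[brightest])
    return "".join(calls)


def decode_hard(
    intensities: Sequence[Sequence[float]], codebook: Dict[str, str]
) -> Optional[str]:
    """Return the target associated with the called sequence, or None if unknown."""
    return codebook.get(call_bases(intensities))


# Hypothetical codebook mapping code sequences to target names.
codebook = {"ACGT": "target_1", "TTGA": "target_2"}
readings = [
    [0.9, 0.1, 0.0, 0.1],
    [0.2, 0.7, 0.1, 0.0],
    [0.1, 0.1, 0.8, 0.2],
    [0.0, 0.3, 0.1, 0.6],
]
called = call_bases(readings)
target = decode_hard(readings, codebook)
```

- Because each position is reduced to a single call, the relative brightness of the non-selected channels is discarded before the codebook lookup.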
- “Soft decision decoding” or “soft decision” refers to a method or a model that uses data collected during a sequencing or detecting process to calculate a probability that a particular nucleic acid or nucleic acid segment is present.
- the probability may be calculated without making a base call for each nucleotide in a nucleic acid segment.
- a probability is calculated without making a hard call that a string of nucleic acids in a segment are present.
- a probabilistic decoding algorithm is applied to the recorded signal(s) upon completion of signal collection.
- a probability of the presence of each of the codes may be determined without discarding signal, in contrast to a hard decision decoding method in which hard calls are made during the signal collection process.
- the data may, for example, include or be calculated from, fluorescent intensity readings in spectral bands for signals produced by the sequencing/detecting chemistry.
- soft decision decoding uses data collected during a sequencing/detecting process to calculate a probability that a particular nucleic acid segment from a known set of sequences is present.
- Models of the inventive concepts may be used for soft decision decoding.
- the particular nucleic acid or nucleic acid segment being detected may be or include a code of the inventive concepts.
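- As a non-limiting illustration of soft decision decoding, the following minimal Python sketch computes a probability for each code in a known set directly from recorded intensities, without making a call at any individual cycle. The noise model (independent Gaussian noise with a shared standard deviation `sigma`), the uniform prior over codes, the idealized one-channel-per-cycle signal, and all names are assumptions for illustration; the disclosure does not specify a particular probabilistic model.

```python
import math
from typing import Dict, List, Sequence


def expected_signal(code: Sequence[int], n_channels: int) -> List[float]:
    """Idealized flattened signal: 1.0 in the channel named at each cycle, else 0.0."""
    signal: List[float] = []
    for channel in code:
        signal.extend(1.0 if c == channel else 0.0 for c in range(n_channels))
    return signal


def code_posteriors(
    observed: Sequence[float],
    codebook: Dict[str, Sequence[int]],
    n_channels: int,
    sigma: float = 0.3,
) -> Dict[str, float]:
    """Posterior probability of each known code given all recorded intensities."""
    log_likelihoods = {}
    for name, code in codebook.items():
        ideal = expected_signal(code, n_channels)
        if len(ideal) != len(observed):
            raise ValueError("observed signal length does not match code length")
        squared_error = sum((o - e) ** 2 for o, e in zip(observed, ideal))
        log_likelihoods[name] = -squared_error / (2 * sigma**2)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_likelihoods.values())
    weights = {n: math.exp(v - top) for n, v in log_likelihoods.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


# Hypothetical two-cycle, two-channel codebook (channel index per cycle).
codebook = {"code_a": [0, 1], "code_b": [1, 1], "code_c": [0, 0]}
observed = [0.8, 0.2, 0.1, 0.7]  # cycle 1 channels, then cycle 2 channels
posteriors = code_posteriors(observed, codebook, n_channels=2)
best = max(posteriors, key=posteriors.get)
```

- The probabilities are computed only after all signal has been collected, and every intensity value contributes to the result for every code.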
- “Crosstalk” refers to the situation in which a signal from one nucleotide addition reaction may be picked up by multiple channels (referred to as “color crosstalk”) or the situation in which a signal from a concatemer or sequencing cluster interferes with an adjacent or nearby cluster or concatemer (referred to as “cluster crosstalk” or “concatemer crosstalk”).
- “Coupled channel” means a set of optical elements for sensing and recording an electromagnetic signal from a sequencing reaction.
- optical elements include lenses, filters, mirrors, and cameras.
- “Spectral band” or “spectral region” means a continuous wavelength range in the electromagnetic spectrum.
- “Flow” or “cycle” refers to a single incubation period with a detection polynucleotide complex pool, or a pool of detection oligonucleotides and anchor oligonucleotides, and a concatemeric amplification product.
- “Code Space” refers to the totality of factors that determine the number of unique nucleotide possibilities at each segment on a recognition element.
- “Bind” refers to and includes both covalent and non-covalent interactions.
- bind can include any degree of hybridization between two nucleic acid sequences.
- bind can include covalent binding, where the sharing of electrons between atoms occurs (e.g., a chemical bond).
- a covalent bond can be reversible or irreversible, depending on the need. Hydrogen bonds, van der Waals interactions and other weak interactions between molecules are also considered “bonds” for the present disclosure.
- Target nucleic acids are extracted from a sample, be it a eukaryotic or prokaryotic sample. Extracting the nucleic acids from samples may be performed by existing methods available as known to a skilled artisan.
- An aliquot of the extracted target nucleic acids is combined with a pool of recognition elements. This combination can be performed in a tube, a well of a plate, or other contained devices.
- the combination may further include a buffer and a ligase, for example a high-fidelity thermostable DNA ligase, for example in a 20 µL reaction volume.
- the ligation reaction is incubated and cycled between 95°C and 60°C, for example six times, to increase the formation of hybrids between the target nucleic acids and their complementary regions in the recognition elements and the ligation of the ends of the recognition elements that are brought together via the hybridization event to generate circularized recognition elements.
- the reactions are treated with an exonuclease to remove linear nucleic acids.
- an exonuclease mixture can be added and incubated at 37°C for 30 min., followed by exonuclease inactivation at 95°C for 5 min.
- the circularized recognition elements can be immobilized on the surface of a pre-treated substrate comprising a cationic coating by incubating the substrate with the amplification products at 42°C for 60 min.
- the circularized recognition elements can be amplified to generate concatemeric amplification products.
- the amplification reaction may include an amplification primer, deoxynucleotide triphosphates (dNTPs), a buffer solution, and a DNA polymerase such as Phi29 DNA polymerase for performing rolling circle amplification (RCA).
- the RCA reaction can be incubated at 42°C for 1-2 hrs, after which the wells can be washed, leaving the concatemeric amplification products in a wash buffer such as a Tris-EDTA wash solution in anticipation of the detection part of the assay.
- the plate comprising the concatemeric amplification products can be placed in a fluorescent imaging instrument with an optical subsystem with at least a four color imager with two excitation paths, for example 520 nanometers (nm) and 639 nm, and two cameras for capturing images of two spectrally separated wavelengths per excitation path.
- Detection takes the form of an iterative process including 1) hybridization of detection polynucleotide complexes to the concatemeric amplification products which share homology, 2) imaging of the hybridization event (e.g., readout cycles), and 3) removing the detection complexes to allow for the next cycle or flow of hybridization and imaging.
- This process can be repeated any number of times, depending on the number of segments in a hypercode and the number of states needed to provide hypercode profiles for a given recognition element.
- the number of readout cycles therefore changes depending on the plexity of the panel. Detection, imaging, and detection complex removal can all be done at room temperature.
- a process 500 for detecting a nucleotide sequence was conducted.
- a recognition element comprising a code with five segments is used. The segments are illustrated in FIG. 5. Operations 510, 520, and 530 are followed and repeated five times.
- a recognition element comprising a code with five segments is incubated with a detection pool comprising sixteen detection oligonucleotides; however, only six of the sixteen detection oligonucleotides are depicted in this example.
- the recognition element and detection pools are incubated together.
- One of the detection oligonucleotides in the detection pool is complementary to a segment (e.g., segment one) of the recognition element. As shown in FIG. 5, the detection oligonucleotide bound to its complementary segment one.
- the unbound detection oligonucleotides are washed away, and the bound detection oligonucleotides are imaged as described herein.
- the bound detection oligonucleotides include a detection oligonucleotide portion with a fluorescent molecule attached thereto. Imaging involves a fluorescent imaging system including a microscope. The fluorescent molecules emit a color on the visible spectrum during imaging.
- the bound detection oligonucleotide is de-hybridized and washed away.
- De-hybridization includes the addition of a reagent in solution that promotes the dehybridization of the bound detection oligonucleotide.
- Operations 510, 520, and 530 of process 500 are repeated five times with different detection pools of detection oligonucleotides. During each repetition, one of the detection oligonucleotides hybridizes to a segment on the recognition element and the signal is captured.
- Detection polynucleotide complexes can be dispensed into the wells. For example, 50 µL of a solution comprising 16 detection polynucleotide complexes in a buffered solution comprising mono and divalent salts, EDTA and a surfactant can be added to the wells with the concatemeric amplification products, and the reactions incubated for 10 min. to allow for hybridization of the anchor oligonucleotide to its complementary hypercode sequence. After hybridization, the detection complexes are removed, the wells washed with the last wash remaining in the wells, and the reactions subsequently imaged.
- the wash buffer is aspirated, the wells are stripped by the addition of a stripping solution to remove the hybridized detection complexes, the stripping solution is removed, the wells are washed, and a new solution with new detection complexes is added to the wells.
- the operations are repeated until all the defined readout cycles are performed for each well.
- the hypercode profiles generated by the multiple readout cycles for each well can be decoded to identify the hypercode present, and hence the target nucleic acid present.
- a recognition element comprising a code with five segments and two target specific recognition element arms is used.
- the recognition element is separately incubated with five detection pools wherein each pool has four differently labeled detection polynucleotide complexes.
- Each detection polynucleotide complex is composed of a detection oligonucleotide and an anchor oligonucleotide (See FIGs. 10A-10C and 11A-11B).
- the five segments occupy a large portion of the recognition element.
- this detection scheme illustrates a tradeoff between: (1) the five segments taking up a large portion of the recognition element and (2) a small number (e.g., 4) of detection polynucleotide complexes present in each detection pool.
- each detection pool corresponded to one of the five segments on the recognition element.
- Each of the four detection polynucleotide complexes in each detection pool comprises a fluorescent molecule that is optically distinct.
- This detection scheme is composed of the following factors: five segments on the recognition element, one incubation (e.g., flow or cycle) per segment (e.g., five total flows or cycles), five detection pools with four detection polynucleotide complexes per detection pool, and four optically distinct fluorescent molecules per detection pool.
- this detection scheme utilizes 20 total detection polynucleotide complexes, and results in 1,024 possible permutations. The larger the permutation space, the larger number of codes that can be derived for a set minimum Hamming-distance criteria across the set of codes.
- the amplified recognition element is incubated with a first detection pool.
- One of the four detection polynucleotide complexes of the first detection pool is complementary to one of the segments on the recognition element.
- a hybridization event occurs between the detection polynucleotide complex and the corresponding complementary segment of the recognition element.
- the un-bound detection polynucleotide complexes in the first detection pool are washed away.
- the fluorescent molecule present on the bound detection oligonucleotide complex is imaged, resulting in capturing the emission fluorescence signal of the fluorescent molecule.
- a de-hybridization event occurs, and the bound detection polynucleotide complex is removed from its corresponding complementary segment.
- the operations of incubation, hybridization, wash, image, de-hybridization, and wash are iteratively repeated five times, which accounts for the five segments present on the recognition element, and a hypercode profile of the fluorescence images is generated.
- one of the detection polynucleotide complexes in the detection pool is bound to a corresponding segment of the recognition element, and the bound detection polynucleotide complex is imaged, de-hybridized, and removed by washing.
- the process continues for three additional iterative repetitions. After five repetitions, five detection polynucleotide complexes are bound to five segments on the recognition element, which results in 20 images.
- the resulting combination of images, the hypercode profile, allows for detection of the hypercode (e.g., the combination of segments that make up the code), which is used as a surrogate for detection of the target nucleic acid molecule that originally hybridizes to the recognition element.
- a recognition element comprising a code comprising one segment is used, and the segment is incubated (e.g., flowed, cycled) with five detection pools.
- This example uses five detection pools, and each detection pool includes 1,024 detection polynucleotide complexes.
- a segment of a code on the recognition element occupies a small portion of the recognition element, unlike in Detection Scheme 1 of Example 2.
- the tradeoff in this detection scheme is between: (1) a segment taking up a small portion of the recognition element, and (2) a large number of detection polynucleotide complexes (e.g., 1,024) present in each detection pool.
- each detection pool includes 1,024 detection polynucleotide complexes.
- this detection scheme includes the following factors: one segment on the recognition element, five incubation periods (e.g., flows or cycles) per segment (e.g., five total flows), five detection pools with 1,024 detection polynucleotide complexes per detection pool, and 4 fluorescent molecules.
- this detection scheme utilizes 4,096 detection polynucleotide complexes, and results in 1,024 possible combinations. The larger the number of possible combinations that can be made, the larger the permutation space that may exist. The larger the permutation space, the larger the set of codes that can be derived from the permutation space.
- the segment is incubated with a first detection pool.
- One of the 1,024 detection polynucleotide complexes of the first detection pool is complementary to the segment of the recognition element.
- a hybridization event occurs between the detection polynucleotide complex and the segment.
- the 1,023 un-bound detection polynucleotide complexes are washed away.
- the bound detection polynucleotide complex is imaged. After imaging, a de-hybridization event occurs, and the bound detection polynucleotide complex is removed from the segment by washing.
- a hypercode profile also known as a code profile.
- a second pool of detection polynucleotide complexes is incubated with the segment on the recognition element, where one of the detection polynucleotide complexes of the second detection pool hybridizes to the segment, and the bound detection oligonucleotide complex is imaged, de-hybridized, and washed.
- the detection process continues for the remaining three detection pools where each of the remaining detection pools is incubated with the segment of the recognition element.
- each detection polynucleotide complex fluoresces through one of four different fluorescent molecules.
- the resulting combination of images allows for detection of the code (e.g., the combination of segments that make up the code, in this example one segment), which is used as a surrogate for determining the presence of the target nucleic acid molecule that originally hybridizes to the recognition element.
- a recognition element with four segments is used and each segment is incubated (e.g., flowed or cycled) with two detection pools.
- in this detection scheme, four detection pools are used, and each detection pool comprises 16 detection polynucleotide complexes.
- the four segments on the recognition element and the eight incubation periods (e.g., two flows per segment) of the segments balance the factors illustrated in FIG. 3 and FIG. 4.
- the four segments take up a preferred amount of space on the recognition element.
- this preferred scheme requires detection pools with 16 detection polynucleotide complexes each.
- this detection scheme comprises four segments present on the recognition element taking up a preferred space on the recognition element, and also 16 detection polynucleotide complexes per detection pool.
- a recognition element with two target specific arms and four segments is used.
- Each of the four segments is incubated (e.g., two flows per segment) with two detection pools that results in an eight-flow detection scheme.
- the 16 detection polynucleotide complexes in each detection pool are divided into four groups of four, where each group is assigned one of four fluorescent molecules (e.g., colors).
- FIG. 12A shows an example of a detection pool used in this example where groups of four detection polynucleotide complexes include the same detection oligonucleotide and the same fluorescent molecule.
- the present detection scheme is composed of the following factors: four segments on the recognition element, two incubation periods (e.g., flows) per segment (e.g., eight total flows), four detection pools with 16 detection polynucleotide complexes per detection pool, and four fluorescent molecules used per detection pool.
- this detection scheme requires 64 detection polynucleotide complexes (e.g., four detection pools with 16 detection polynucleotide complexes in each detection pool) and results in 1,024 possible combinations.
- the larger the number of possible combinations that can be made the larger the permutation space that may exist.
- the larger the permutation space the larger the code space that can be derived from the permutation space.
- the 64 anchor oligonucleotide sequences that can be used with different detection oligonucleotides to generate 64 unique detection polynucleotide complexes for code identification are illustrated in FIG. 17.
- the segments on the recognition element are incubated (e.g., flowed or cycled) with a first detection pool.
- a hybridization event occurs between the detection polynucleotide complex and the corresponding complementary segment on the recognition element.
- a detection polynucleotide complex is hybridized to its corresponding segment. After hybridization, the 15 un-bound detection polynucleotide complexes are washed away. After washing, the bound detection polynucleotide complex is imaged. After imaging, a de-hybridization event occurs, and the bound detection polynucleotide complex is removed from its corresponding complementary segment.
- the operations of incubation, hybridization, wash, image, de-hybridization, and wash are repeated a number of times thereby generating a hypercode profile.
- the same first detection pool or a second detection pool of detection polynucleotide complexes is incubated with the same first segment or a second segment on the recognition element, where one of the detection polynucleotide complexes of the selected detection pool hybridizes to a corresponding complementary segment on the recognition element, and the bound detection polynucleotide complex is imaged, de-hybridized, and washed.
- the process continues for the remaining detection pools and segments.
- the detection polynucleotide complexes are bound to four segments on the recognition element, each fluorescing one of four colors through its fluorescent molecule during imaging.
- the resulting combination of images allows for detection of the code (e.g., the combination of segments that make up the code), which is used as a surrogate for determining the presence of the target nucleic acid molecule that originally hybridizes to the recognition element.
- the combination of colors in flow one and flow two may provide the unique sequence of segment one.
- the combination of colors in flow three and flow four may provide the unique sequence of segment two.
- the combination of colors in flow five and flow six may provide the unique sequence of segment three.
- the combination of colors in flow seven and flow eight may provide the unique sequence of segment four.
- the combination of all eight colors in the four segments and eight flows may provide the unique sequence of segments one through four, which can be used as a proxy for the original hybridization event between the target nucleic acid and the recognition element.
- each detection pool undergoes two incubation periods (e.g., flows or cycles) for each segment of a concatemeric amplification product, resulting in an eight-flow system.
- one detection pool may be incubated with the concatemeric amplification product twice (e.g., flows one and two)
- a second detection pool may be incubated with the concatemeric amplification product twice (e.g., flows three and four)
- a third detection pool may be incubated with the concatemeric amplification product twice (e.g., flows five and six)
- a fourth detection pool may be incubated with the concatemeric amplification product twice (e.g., flows seven and eight).
- each detection pool is made up of the same detection oligonucleotide and anchor oligonucleotide sequences.
- the combination of detection oligonucleotides and anchor oligonucleotides in each detection pool may be different in each flow (e.g., flows one and two, flows three and four, flows five and six, and flows seven and eight).
- FIG. 13A shows a detection pool for flow one of segment one and FIG. 13B shows a detection pool for flow two of segment one.
- each anchor oligonucleotide is labeled with a number (e.g., numbers one through 16), and each detection oligonucleotide is labeled with a three-digit number representing the emission wavelength of the attached fluorescent moiety (e.g., 488 nm, 532 nm, 568 nm, 647 nm).
- the combination of detection oligonucleotides and anchor oligonucleotides differs between flows one and two.
- anchor oligonucleotides 1, 2, 3, and 4 are attached to a detection oligonucleotide comprising a fluorescent moiety that emits at 568 nm.
- anchor oligonucleotides 1, 8, 11, and 14 are attached to detection oligonucleotide 568.
- anchor oligonucleotides 9, 10, 11, and 12 are attached to detection oligonucleotides comprising a fluorescent moiety that emits at 647 nm.
- anchor oligonucleotides 9, 16, 4, and 6 are attached to a detection oligonucleotide comprising a fluorescent moiety that emits at 647 nm.
- although the detection oligonucleotides and anchor oligonucleotides in flows one and two comprise the same sequences, the pairing of detection oligonucleotides comprising different fluorescent moieties with anchor oligonucleotides differs between flows one and two.
- Example 6 State Aggregator
- a detection pool comprising detection polynucleotide complexes may be used herein. Detection polynucleotide complexes in the detection pool may bind to segments on the recognition element.
- FIG. 15 represents the 16 unique detection polynucleotide complex possibilities in each detection pool in a detection scheme using four fluorescent molecules and including a recognition element comprising four segments (but could be any number of segments) with two flows per segment.
- the Cycle 1 column represents which of the four fluorescent molecules (e.g., fluorescent molecule 1, 2, 3 or 4) is imaged during the first cycle.
- the Cycle 2 column represents which of the four fluorescent molecules is imaged during the second cycle.
- the State Pair column lists the two states from the first cycle and the second cycle (Cycle 1 and Cycle 2).
- unique detection polynucleotide complex one will include fluorescent molecule one in cycle one and fluorescent molecule one in cycle two, providing a “11” representation for the State Pair.
- the unique detection polynucleotide complex eight will include fluorescent molecule two in cycle one and fluorescent molecule four in cycle two, providing a “24” representation for the State Pair.
- unique detection polynucleotide complex 15 will include fluorescent molecule four in cycle one and fluorescent molecule three in cycle two, providing a “43” representation for the State Pair. As shown in FIG.
- each unique detection polynucleotide complex in the group of 16 detection polynucleotide complexes includes a different combination of fluorescent molecules, or hypercode profile, as such the assay has the ability to differentiate between different hypercodes based on their profiles, and hence the presence of different targets of interest associated with the hypercode.
- 64 detection polynucleotide complexes in the detection pool may be used.
- the 64 detection polynucleotide complexes may be designed to be orthogonal to each other.
- Each of the 64 detection polynucleotide complexes is represented by a number (e.g., numbers one through 64).
- the detection pool for segment one may include detection polynucleotide complexes one to 16.
- the detection pool for segment two may include detection polynucleotide complexes 17 to 32.
- the detection pool for segment three may include detection polynucleotide complexes 33 to 48.
- the detection pool for segment four may include detection polynucleotide complexes 49 to 64. Examples of anchor oligonucleotide sequences of each detection polynucleotide are provided in FIG. 17.
- FIG. 18 is a table of the permutations (e.g., colors, cycles/segment, total segments, and total cycles) that may be used to achieve a relatively large codespace from which to select a subset of codes.
- FIG. 20A is a table showing the relationship of the number of codes in a codespace, such that as the code space increases so does the number of codes for potential use in detection and decoding schemes for target nucleic acid identification using recognition elements as described herein.
- FIG. 20A is a summary table of the codespace generated by the indicated number of segments, cycles, and colors. A representative code set size is selected from the codespace, which enables the detection of numbers of targets. The greater the number of segments, cycles, and colors, the larger the codespace and therefore the greater the number of possible targets that can be detected with the expanded codeset in a single assay.
- FIG. 20B is a table demonstrating that, for a four color detection system, the number of cycles, or flows, for querying a segment and the number of potential code possibilities can change depending on desired Hamming distance (HD).
- FIG. 21 is a schematic diagram of an example of a trellis codespace and a process of using the trellis codespace to select a set of codes with desired properties for an assay.
- FIG. 7 shows example results of assay performance using the hypercodes described herein. Two plexities were examined, a lower plexity of 1,000 different hypercodes and targets of interest and a higher plexity of 12,000 different hypercodes and targets of interest. Hypercodes were considered detected when they were detected more than 10 times within a well. Median hypercode coverage for each assay was observed well above the minimum allowable hypercodes for being considered detectable.
- the Decode Count represents the overall coefficient of variation (CV) of the total well to well counts.
- the Raw Hypercode Count represents the CV of hypercode counts across all the hypercodes present that hybridized to the same target.
- the Normalized Hypercode Count is normalized for the relative abundance of each hypercode measured across the wells.
- the Decode Error Rate is the probability that any of the decoded calls is incorrect.
- the data demonstrates the ability of the methods disclosed herein to encode and decode targets of interest with high yield and low error rates, across three orders of magnitude in plexity.
- Hypercoding as described herein can provide for genotyping and quantitation of targets of interest that is comparable, or superior, to next generation sequencing or microarray technologies.
- the methods described herein can provide a faster sample-to-answer turnaround time.
- the use of double ligation recognition elements that can discern a target of interest in a sample from high-homology pseudogenes or homologs overcomes the limitations of microarrays and next generation sequencing in this regard.
- the methods described herein provide data of thousands of targets of interest in parallel, thereby providing a rapid and inexpensive alternative to existing technologies.
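
The counting behind the detection schemes described above can be sketched in a few lines of Python: the 16 State Pairs described for FIG. 15 (four fluorescent molecules over two cycles), the 1,024 permutations of the five-segment, four-color scheme, and a minimum Hamming-distance criterion for choosing codes. This is a minimal illustration under stated assumptions, not the disclosed implementation. The function names (`state_pairs`, `codespace_size`, `hamming_distance`, `select_codes`) and the greedy selection strategy are hypothetical and introduced here only for illustration.

```python
from itertools import product

COLORS = (1, 2, 3, 4)  # four optically distinct fluorescent molecules


def state_pairs(colors=COLORS, cycles_per_segment=2):
    """List every color sequence one segment can show over its cycles."""
    return ["".join(str(c) for c in combo)
            for combo in product(colors, repeat=cycles_per_segment)]


def codespace_size(n_colors, total_cycles):
    """Count the distinct color sequences across all readout cycles."""
    return n_colors ** total_cycles


def hamming_distance(code_a, code_b):
    """Count the positions at which two equal-length codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b))


def select_codes(n_cycles, min_distance, colors=COLORS):
    """Greedily keep codes at least min_distance from every code kept so far.

    Hypothetical selection strategy, used only to illustrate the criterion.
    """
    chosen = []
    for candidate in product(colors, repeat=n_cycles):
        if all(hamming_distance(candidate, kept) >= min_distance
               for kept in chosen):
            chosen.append(candidate)
    return chosen


pairs = state_pairs()
print(len(pairs))                      # 16 state pairs per segment
print(pairs[0], pairs[7], pairs[14])   # complexes 1, 8, 15 -> 11 24 43
print(codespace_size(4, 5))            # 1024 (four colors, five cycles)

codes = select_codes(n_cycles=5, min_distance=2)
print(len(codes), "codes selected with minimum Hamming distance 2")
```

Raising `min_distance` shrinks the selected set, which mirrors the statement that the number of usable codes depends on the desired Hamming distance.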
Landscapes
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Life Sciences & Earth Sciences (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Microbiology (AREA)
- Immunology (AREA)
- Physics & Mathematics (AREA)
- Molecular Biology (AREA)
- Biotechnology (AREA)
- Biophysics (AREA)
- Analytical Chemistry (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
Provided herein are methods, systems, compositions, and kits for detecting and decoding an encoded assay. The present disclosure provides low cost, multiplexed and automatable assays by hypercoding, a scalable technology for detection and quantitation of multi-omics targets. Hypercoding, or coding, utilizes signals from fluorescent hybridization with an error-corrected code to enable accurate detection of high-plexity targets of interest from samples. The present disclosure provides encoded assays for multiplex target detection from a sample suitable for detection by hybridization. Detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides, of the present disclosure are utilized in the detection and decoding of the hypercodes to produce optical signals capable of being decoded by a soft decision decoding algorithm for determining the presence of a target of interest, or multiple targets of interest, in a sample.
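
As a rough illustration of the soft decision decoding mentioned above, the sketch below scores every codeword in a codebook against per-cycle, per-color signal intensities and keeps the best-supported codeword, rather than first committing to one color per cycle. The abstract does not specify the algorithm, so this is a generic example under stated assumptions: the codebook, the target names, the intensity values, and the functions `normalize`, `soft_decode`, and `hard_calls` are all hypothetical.

```python
def normalize(cycle):
    """Scale one cycle's per-color intensities so they sum to 1."""
    total = sum(cycle)
    if total <= 0:
        raise ValueError("cycle has no signal")
    return [value / total for value in cycle]


def soft_decode(intensities, codebook):
    """Return (name, score) of the codeword best supported by the signal.

    intensities: one list per readout cycle, one value per color channel.
    codebook: maps a target name to its expected color index in each cycle.
    """
    weights = [normalize(cycle) for cycle in intensities]
    scores = {
        name: sum(cycle[color] for cycle, color in zip(weights, code))
        for name, code in codebook.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


def hard_calls(intensities):
    """Pick the brightest color in each cycle independently."""
    return tuple(max(range(len(cycle)), key=cycle.__getitem__)
                 for cycle in intensities)


# Hypothetical three-cycle, four-color example (all values are made up).
codebook = {"target_A": (0, 1, 2), "target_B": (3, 3, 0)}
observed = [
    [0.80, 0.10, 0.05, 0.05],   # clearly color 0
    [0.10, 0.40, 0.05, 0.45],   # ambiguous between colors 1 and 3
    [0.05, 0.05, 0.85, 0.05],   # clearly color 2
]

print(hard_calls(observed))                 # (0, 3, 2), not a codeword
print(soft_decode(observed, codebook)[0])   # target_A
```

In this toy case the per-cycle hard calls do not match any codeword, while the soft score still recovers `target_A` because the ambiguous second cycle contributes partial evidence to it.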
Description
METHODS, SYSTEMS, COMPOSITIONS AND KITS FOR TARGET DETECTION
CROSS-REFERENCE
[001] This application claims the benefit of United States Provisional Patent Application Serial No. 63/616,299, filed December 29, 2023, which is incorporated herein by reference in its entirety.
BACKGROUND
[002] Advances in genomic and multi-omic technologies have transformed translational research and are poised to do the same for clinical care and diagnostics. Genomic and multi-omic testing can enable higher precision in clinical decision making, from assessing a subject's risk of disease, to disease diagnosis, to monitoring treatment response and disease recurrence, to informing which drug and treatment regimen may be most beneficial. In short, such testing enables personalized patient care.
[003] To unlock the benefits of omics testing at a population level, testing at scale is critical for both powering translational studies and for driving routine adoption in the health care system. There is a need for molecular assays that are fast, affordable, and easy to implement and standardize across many labs, for example, methods capable of performing tests in a fast, efficient, and less costly manner while assaying multiple samples and multiple targets simultaneously for providing more information for clinicians and diagnosticians to make more informed decisions on the health and treatment of a subject.
SUMMARY
[004] Aspects disclosed herein provide methods for determining the presence of one or more target nucleic acid molecules, the method comprising: (a) providing a plurality of target nucleic acid molecules from a sample; (b) providing a plurality of recognition elements, wherein each recognition element of the plurality of recognition elements comprises: (i) one or more target recognition regions complementary to regions of a target nucleic acid molecule of the plurality of target nucleic acid molecules; and (ii) a hypercode from a set of hypercodes, wherein the hypercode from the set of hypercodes is associated with one target nucleic acid molecule of the plurality of target nucleic acid molecules, and wherein the hypercode from the set of hypercodes comprises a plurality of segments that corresponds to at least two computational states from a set
of computational states; (c) amplifying a subset of the plurality of recognition elements that are hybridized to the plurality of target nucleic acid molecules to produce a plurality of concatemeric amplification products; (d) hybridizing a detection polynucleotide complex of a plurality of detection polynucleotide complexes to a concatemeric amplification product of the plurality of concatemeric amplification products, wherein each detection polynucleotide complex of the plurality of detection polynucleotide complexes comprises: (i) a detection oligonucleotide; and (ii) an anchor oligonucleotide, wherein: (1) a first portion of the anchor oligonucleotide is complementary to at least a portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products, and (2) a second portion of the anchor oligonucleotide is complementary to at least a portion of the detection oligonucleotide; (e) detecting the detection polynucleotide complex of the plurality of detection polynucleotide complexes hybridized to the at least a portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products; and (f) determining the presence of the target nucleic acid molecule in the sample based on the detection of the hypercode. In some embodiments, the detection polynucleotide complex of the plurality of detection polynucleotide complexes or the detection oligonucleotide comprises one or more detection moieties, wherein the one or more detection moieties comprises an organic dye, a fluorophore, a quantum dot, or a combination thereof. In some embodiments, the one or more detection moieties comprises the fluorophore. In some embodiments, the detection polynucleotide complex of the plurality of detection polynucleotide complexes or the detection oligonucleotide comprises one or more optically distinct fluorescent moieties. 
In some embodiments, the plurality of detection polynucleotide complexes comprises two or more distinct detection polynucleotide complexes, and wherein the detection oligonucleotide of each of the two or more distinct detection polynucleotide complexes comprises one or more optically distinct fluorescent moieties. In some embodiments, the plurality of detection polynucleotide complexes comprises three or more distinct detection polynucleotide complexes. In some embodiments, the plurality of detection polynucleotide complexes comprises four or more distinct detection polynucleotide complexes. In some embodiments, the detection oligonucleotide comprises a length of 5 to 25 nucleotides. In some embodiments, the anchor oligonucleotide comprises a length of 20 to 100 nucleotides. In some embodiments, the anchor oligonucleotide comprises a length of 40 to 50 nucleotides. In some embodiments, the method further comprises forming the plurality of detection polynucleotide complexes. In some embodiments, forming the plurality of detection polynucleotide complexes comprises hybridizing the detection oligonucleotide to the anchor
oligonucleotide prior to the hybridizing of (d). In some embodiments, the hybridizing of (d) comprises hybridizing the detection polynucleotide complex of the plurality of detection polynucleotide complexes to the hypercode of the set of hypercodes on the plurality of concatemeric amplification products. In some embodiments, forming the plurality of detection polynucleotide complexes comprises hybridizing the detection oligonucleotide to the anchor oligonucleotide concurrently with hybridizing the first portion of the anchor oligonucleotide to the portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products. In some embodiments, forming the plurality of detection polynucleotide complexes comprises hybridizing the detection oligonucleotide to the anchor oligonucleotide after hybridizing the first portion of the anchor oligonucleotide to the portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products. In some embodiments, forming the plurality of detection polynucleotide complexes comprises hybridizing the detection oligonucleotide to the anchor oligonucleotide substantially simultaneously with hybridizing the first portion of the anchor oligonucleotide to the portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products. In some embodiments, the sample comprises a tissue, one or more cells, plasma, blood, urine, or a combination thereof. In some embodiments, the method further comprises extracting the plurality of target nucleic acid molecules from the tissue, the one or more cells, the plasma, the blood, the urine, or a combination thereof. In some embodiments, the plurality of target nucleic acid molecules comprises DNA. In some embodiments, the plurality of target nucleic acid molecules comprises RNA. In some embodiments, the RNA comprises mRNA. 
In some embodiments, the method further comprises (g) imaging the detection polynucleotide complex of the plurality of detection polynucleotide complexes hybridized to the at least a portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products. In some embodiments, the method further comprises repeating (d) through (g) for each segment of the plurality of segments of the hypercode from each concatemeric amplification product of the plurality of concatemeric amplification products. In some embodiments, the repeating of (d) through (g) comprises 2 to 15 iterative repetitions. In some embodiments, the repeating of (d) through (g) comprises 2 to 10 iterative repetitions. In some embodiments, the detecting of (e) comprises fluorescence detection. In some embodiments, the method further comprises applying a soft decision algorithm to a detected hypercode profile. In some embodiments, the method further comprises performing (b) through (g) concurrently for each target nucleic acid molecule from the plurality of target nucleic acid molecules from the
sample. In some embodiments, the plurality of target nucleic acid molecules comprises 10 to 10,000 target nucleic acid molecules. In some embodiments, the plurality of target nucleic acid molecules comprises 10 to 1,000 target nucleic acid molecules. In some embodiments, each recognition element of the plurality of recognition elements comprises an amplification primer binding sequence. In some embodiments, the plurality of recognition elements comprises 10 to 10,000 recognition elements. In some embodiments, the plurality of recognition elements comprises 10 to 1,000 recognition elements. In some embodiments, the plurality of segments comprises 2 to 10 segments. In some embodiments, each segment of the plurality of segments comprises a length of 5 to 30 contiguous nucleotides. In some embodiments, each segment of the plurality of segments comprises a length of 5 to 25 contiguous nucleotides. In some embodiments, each segment of the plurality of segments comprises at least 5 contiguous nucleotides. In some embodiments, the at least 5 contiguous nucleotides each correspond to a computational state from the set of computational states, wherein each computational state is different from the other computational states of the set of computational states. In some embodiments, the set of computational states comprises 5 to 30 computational states. In some embodiments, the method further comprises determining a Hamming distance between any two hypercodes of the set of hypercodes. In some embodiments, the method further comprises determining a Hamming distance between any two segments of a hypercode of the set of hypercodes. In some embodiments, the Hamming distance is 2 to 8. In some embodiments, the method further comprises repeating (d) through (f). In some embodiments, the repeating of (d) through (f) comprises two repeats per segment of the plurality of segments. 
In some embodiments, the repeating of (d) through (f) comprises three repeats per segment of the plurality of segments. In some embodiments, the repeating of (d) through (f) comprises four repeats per segment of the plurality of segments. In some embodiments, the one or more target recognition regions in each of the recognition elements of the plurality of recognition elements comprises a 5’ arm and a 3’ arm, wherein the 5’ arm and the 3’ arm each comprise a sequence complementary to a portion of the target nucleic acid molecule. In some embodiments, the method further comprises providing a splint oligonucleotide probe comprising a 3’ region and a 5’ region, wherein the 5’ arm of the recognition element is complementary to the 3’ region of the splint oligonucleotide probe and the 3’ arm of the recognition element is complementary to the 5’ region of the splint oligonucleotide probe. In some embodiments, the method further comprises ligating and circularizing the plurality of recognition elements that are hybridized to the plurality
of target nucleic acid molecules. In some embodiments, the amplifying comprises performing rolling circle amplification or multiple strand displacement amplification.
[005] Aspects disclosed herein provide methods for determining the presence of one or more target nucleic acid molecules, the method comprising: (a) providing a plurality of target nucleic acid molecules from a sample; (b) providing a plurality of recognition elements, wherein each recognition element of the plurality of recognition elements comprises: (i) one or more target recognition regions complementary to regions of a target nucleic acid molecule of the plurality of target nucleic acid molecules; and (ii) a hypercode from a set of hypercodes, wherein the hypercode from the set of hypercodes is associated with one target nucleic acid molecule of the plurality of target nucleic acid molecules, and wherein the hypercode from the set of hypercodes comprises a plurality of segments that corresponds to at least two computational states from a set of computational states; (c) amplifying a subset of the plurality of recognition elements that are hybridized to the plurality of target nucleic acid molecules to produce a plurality of concatemeric amplification products; (d) hybridizing a detection oligonucleotide of a plurality of detection oligonucleotides to a concatemeric amplification product of the plurality of concatemeric amplification products, wherein each detection oligonucleotide of the plurality of detection oligonucleotides comprises: (i) a nucleic acid sequence that is complementary to a portion of the hypercode; and (ii) a detectable moiety; (e) detecting the detection oligonucleotide of the plurality of detection oligonucleotides hybridized to the portion of the hypercode of the concatemeric amplification product of the plurality of concatemeric amplification products via the detectable moiety of the detection oligonucleotide; and (f) determining the presence of the target nucleic acid molecule in the sample based on the detection of the hypercode. 
In some embodiments, the detection oligonucleotide comprises one or more detection moieties, wherein the one or more detection moieties comprises an organic dye, a fluorophore, a quantum dot, or a combination thereof. In some embodiments, the one or more detection moieties comprises the fluorophore. In some embodiments, the detection oligonucleotide comprises one or more optically distinct fluorescent moieties. In some embodiments, the plurality of detection oligonucleotides comprises two or more distinct detection oligonucleotides. In some embodiments, the plurality of detection oligonucleotides comprises three or more distinct detection oligonucleotides. In some embodiments, the plurality of detection oligonucleotides comprises four or more distinct detection oligonucleotides. In some embodiments, the detection oligonucleotide comprises a length of 5 to 25 nucleotides. In some embodiments, the sample comprises a tissue, one or more cells, plasma, blood, urine, or a combination thereof. In some
embodiments, the method further comprises extracting the plurality of target nucleic acid molecules from the tissue, the one or more cells, the plasma, the blood, the urine, or a combination thereof. In some embodiments, the plurality of target nucleic acid molecules comprises DNA. In some embodiments, the plurality of target nucleic acid molecules comprises RNA. In some embodiments, the RNA comprises mRNA. In some embodiments, the method further comprises (g) imaging the detection oligonucleotide of the plurality of detection oligonucleotides hybridized to the portion of the hypercode of the concatemeric amplification product of the plurality of concatemeric amplification products. In some embodiments, the method further comprises repeating (d) through (g) for each segment of the plurality of segments of the hypercode from each concatemeric amplification product of the plurality of concatemeric amplification products. In some embodiments, the repeating of (d) through (g) comprises 2 to 15 iterative repetitions. In some embodiments, the repeating of (d) through (g) comprises 2 to 10 iterative repetitions. In some embodiments, the method further comprises performing (b) through (g) concurrently for each target nucleic acid molecule from the plurality of target nucleic acid molecules from the sample. In some embodiments, the detecting of (e) comprises fluorescence detection. In some embodiments, the method further comprises applying a soft decision algorithm to a detected hypercode profile. In some embodiments, the plurality of target nucleic acid molecules comprises 10 to 10,000 target nucleic acid molecules. In some embodiments, the plurality of target nucleic acid molecules comprises 10 to 1,000 target nucleic acid molecules. In some embodiments, each recognition element of the plurality of recognition elements comprises an amplification primer binding sequence. In some embodiments, the plurality of recognition elements comprises 10 to 10,000 recognition elements. 
In some embodiments, the plurality of recognition elements comprises 10 to 1,000 recognition elements. In some embodiments, the plurality of segments comprises 2 to 10 segments. In some embodiments, each segment of the plurality of segments comprises a length of 5 to 30 contiguous nucleotides. In some embodiments, each segment of the plurality of segments comprises a length of 5 to 25 contiguous nucleotides. In some embodiments, each segment of the plurality of segments comprises at least 5 contiguous nucleotides. In some embodiments, the at least 5 contiguous nucleotides each correspond to a computational state from the set of computational states, wherein each computational state is different from other computational states of the set of computational states. In some embodiments, the set of computational states comprises 5 to 30 computational states. In some embodiments, the method further comprises determining a Hamming distance between any two hypercodes of the set of hypercodes. In some embodiments, the method further comprises
determining a Hamming distance between any two segments of a hypercode. In some embodiments, the Hamming distance is 2 to 8. In some embodiments, the method further comprises repeating (d) through (f). In some embodiments, the repeating of (d) through (f) comprises two repeats per segment of the plurality of segments. In some embodiments, the repeating of (d) through (f) comprises three repeats per segment of the plurality of segments. In some embodiments, the repeating of (d) through (f) comprises four repeats per segment of the plurality of segments. In some embodiments, the one or more target recognition regions in each of the recognition elements of the plurality of recognition elements comprises a 5’ arm and a 3’ arm, wherein the 5’ arm and the 3’ arm each comprise a sequence complementary to a portion of the target nucleic acid molecule. In some embodiments, the method further comprises providing a splint oligonucleotide probe comprising a 3’ region and a 5’ region, wherein the 5’ arm of the recognition element is complementary to the 3’ region of the splint oligonucleotide probe and the 3’ arm of the recognition element is complementary to the 5’ region of the splint oligonucleotide probe. In some embodiments, the method further comprises ligating and circularizing the plurality of recognition elements that are hybridized to the plurality of target nucleic acid molecules. In some embodiments, the amplifying comprises performing rolling circle amplification or multiple strand displacement amplification.
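As a minimal sketch of the Hamming distance determination described above, a hypercode can be represented as a string in which each character stands for the computational state of one segment. The function names and the four example codes below are hypothetical illustrations and are not taken from the disclosure.

```python
from itertools import combinations


def hamming_distance(code_a, code_b):
    """Count the segment positions at which two equal-length codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same number of segments")
    return sum(a != b for a, b in zip(code_a, code_b))


def min_pairwise_distance(codes):
    """Smallest Hamming distance between any two codes in the set."""
    return min(hamming_distance(a, b) for a, b in combinations(codes, 2))


# Hypothetical set of four-segment codes; each digit is one computational state.
example_codes = ["0000", "0111", "1012", "2220"]
minimum = min_pairwise_distance(example_codes)
```

A code set whose minimum pairwise distance is d allows up to d - 1 misread segments to be detected as errors, which is one reason a minimum distance in the recited range may be desirable.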
[006] Aspects disclosed herein provide systems, comprising: (a) a plurality of recognition elements, wherein each recognition element of the plurality of recognition elements comprises:
(i) one or more target regions complementary to a corresponding target nucleic acid molecule of a plurality of target nucleic acid molecules from a sample; and (ii) a hypercode from a set of hypercodes, wherein the hypercode from the set of hypercodes is associated with each target nucleic acid molecule of the plurality of target nucleic acid molecules from the sample including the corresponding target nucleic acid molecule in (i), and wherein the hypercode comprises a plurality of segments that corresponds to at least two computational states of a set of computational states; and (b) a plurality of detection polynucleotide complexes comprising: (i) a detection oligonucleotide; and (ii) an anchor oligonucleotide, wherein: (1) a first portion of the anchor oligonucleotide is complementary to a segment of the hypercode or a portion thereof; and
(2) a second portion of the anchor oligonucleotide is complementary to a portion of the detection oligonucleotide. In some embodiments, the sample comprises a tissue, one or more cells, plasma, blood, urine, or a combination thereof. In some embodiments, the plurality of target nucleic acid molecules comprises DNA. In some embodiments, the plurality of target nucleic acid molecules comprises RNA. In some embodiments, the RNA comprises mRNA. In some embodiments, each segment of the plurality of segments comprises at least 5 contiguous nucleotides, wherein the at least 5 contiguous nucleotides each correspond to a computational state that is different from another computational state of the set of computational states. In some embodiments, the set of computational states comprises 2 to 20 computational states. In some embodiments, the set of computational states comprises 2 to 10 computational states. In some embodiments, the set of computational states comprises 4 computational states. In some embodiments, the set of codes comprises a Hamming distance of any two codes of the set of codes of 3 to 5. In some embodiments, the set of codes comprises a Hamming distance of any two codes of the set of codes of 3. In some embodiments, the set of codes comprises a Hamming distance of any two codes of the set of codes of 4. In some embodiments, the set of codes comprises a Hamming distance of any two codes of the set of codes of 5. In some embodiments, the detection oligonucleotide comprises one or more fluorescent molecules. In some embodiments, the one or more fluorescent molecules comprises an organic dye, a biological fluorophore, a quantum dot, or a combination thereof. In some embodiments, the detection oligonucleotide comprises a length of 5 to 10 nucleotides. In some embodiments, the anchor oligonucleotide comprises a length of 10 to 25 nucleotides. In some embodiments, the plurality of detection polynucleotide complexes comprises 2 or more species of the plurality of detection polynucleotide complexes, and wherein the detection oligonucleotide of each of the 2 or more species comprises one or more fluorescent molecules, wherein the one or more fluorescent molecules are optically distinct. 
In some embodiments, the plurality of detection polynucleotide complexes comprises 3 or more species of the plurality of detection polynucleotide complexes, and wherein the detection oligonucleotide of each of the 3 or more species comprises one or more fluorescent molecules, wherein the one or more fluorescent molecules are optically distinct. In some embodiments, the plurality of detection polynucleotide complexes comprises 4 or more species of the plurality of detection polynucleotide complexes, and wherein the detection oligonucleotide of each of the 4 or more species comprises one or more fluorescent molecules, wherein the one or more fluorescent molecules are optically distinct. In some embodiments, the system further comprises a solid substrate configured to immobilize nucleic acids. In some embodiments, the solid substrate comprises a welled plate or a flow cell, wherein a surface of the welled plate or a surface of the flow cell comprises a cation-coating layer coupled thereto. In some embodiments, the system further comprises (c) a fluid flow controller; (d) an imaging system; (e) a computer system; or (f) any combination of (c) to (e). In some embodiments, the fluid flow controller comprises one or more pumps, valves, mixing manifolds, reagent reservoirs, waste reservoirs, or any combination
thereof. In some embodiments, the fluid flow controller is configured to provide programmable control of fluid flow velocity, volumetric fluid flow rate, timing of reagent or buffer introduction, or any combination thereof. In some embodiments, a detection polynucleotide complex is bound to a recognition element of the plurality of recognition elements, or a concatemeric amplification product thereof to form a detectable binding complex.
[007] Aspects disclosed herein provide kits comprising: (a) a plurality of recognition elements; (b) a plurality of detection polynucleotide complexes comprising (i) a plurality of detection oligonucleotides and (ii) a plurality of anchor oligonucleotides; and (c) instructions for use of (a) and (b) according to any one of the methods disclosed herein. In some embodiments, the kits further comprise (d) a first buffer, wherein the first buffer is configured to promote hybridization; and (e) a second buffer, wherein the second buffer is configured to promote de-hybridization.
[008] Aspects disclosed herein provide compositions comprising a plurality of detectable binding complexes, wherein each detectable binding complex of the plurality of detectable binding complexes comprises a concatemeric amplification product comprising a recognition element sequence, wherein the recognition element sequence comprises complementary sequences to a target nucleic acid, and a hypercode sequence comprising one or more segment sequences, and a plurality of detection polynucleotide complexes hybridized to the one or more segment sequences, wherein a detection polynucleotide complex of the plurality of detection polynucleotide complexes comprises a detection oligonucleotide comprising a detection moiety hybridized to an anchor oligonucleotide, and wherein a portion of the anchor oligonucleotide of the detection polynucleotide complex is hybridized to the one or more segment sequences.
BRIEF DESCRIPTION OF THE DRAWINGS
[009] An understanding of the features and advantages of the present disclosure is obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the disclosure are utilized.
[010] FIG. 1 shows an example workflow according to some embodiments herein.
[011] FIG. 2 shows a schematic diagram of an example of a recognition element and a target nucleic acid molecule according to some embodiments herein.
[012] FIG. 3 shows a schematic diagram illustrating non-limiting examples of factors considered in the design of recognition elements according to some embodiments herein.
[013] FIG. 4 shows a schematic diagram illustrating non-limiting examples of factors considered in determining a permutation code space according to some embodiments herein.
[014] FIG. 5 shows a schematic diagram illustrating an example of a process for detecting a nucleotide sequence according to some embodiments herein.
[015] FIG. 6 shows a schematic diagram of rolling circle amplification (RCA) to produce a concatemeric amplification product according to some embodiments herein.
[016] FIG. 7 shows a table illustrating hypercode performance evaluated for a low plexity and a high plexity assay.
[017] FIG. 8A shows a schematic diagram illustrating an algorithm for generating a recognition element including screening for targets of interest.
[018] FIG. 8B shows a schematic diagram illustrating an algorithm for generating a recognition element including recognition element 5’ and 3’ region design generation.
[019] FIG. 8C shows a schematic diagram illustrating an algorithm for generating a recognition element including recognition element creation.
[020] FIG. 9A shows a dual ligation recognition element strategy for use when a target of interest has high homology with a homolog or pseudogene.
[021] FIG. 9B shows graphs indicating the success of using a recognition element strategy from FIG. 9A to differentiate between a target of interest in a background of homologous sequences from a homolog or pseudogene.
[022] FIG. 10A shows an image illustrating examples of detection oligonucleotides according to some embodiments herein including an example of a structure of a detector oligonucleotide.
[023] FIG. 10B shows an image illustrating examples of detection oligonucleotides according to some embodiments herein including four optically distinct detector oligonucleotides.
[024] FIG. 10C shows an image illustrating examples of detection oligonucleotides according to some embodiments herein including four non-optically distinct detector oligonucleotides.
[025] FIG. 11A shows examples of a detection polynucleotide according to some embodiments herein including an example of an anchor oligonucleotide hybridized to a portion of a code.
[026] FIG. 11B shows examples of a detection polynucleotide according to some embodiments herein including an example of a detection oligonucleotide complexed with a corresponding anchor oligonucleotide to generate a detection polynucleotide complex.
[027] FIG. 12A shows schematic illustrations of detection polynucleotide complexes including a schematic diagram illustrating an example of a pool of 16 anchor oligonucleotides and four different detector oligonucleotides.
[028] FIG. 12B shows schematic illustrations of detection polynucleotide complexes including an image illustrating an example of one detection polynucleotide complex hybridized to a portion of a code (e.g., segment) of a recognition element.
[029] FIG. 13A shows schematic illustrations of detection polynucleotide complexes including a schematic diagram illustrating a pool of 16 anchor oligonucleotides and four detection oligonucleotides.
[030] FIG. 13B shows schematic illustrations of detection polynucleotide complexes including a pool of 16 detection polynucleotide complexes where each potential detection polynucleotide complex comprises a different combination of anchor oligonucleotide and detection oligonucleotide pairs.
[031] FIG. 14 shows an example of an encoded assay workflow for detecting methylated DNA.
[032] FIG. 15 shows an example of a table illustrating how states are assigned using four colors in a two flow decode system.
[033] FIG. 16A shows representative intensity profiles for hypercodes fluorescently labeled in concatemeric amplification products including profiles that decode to a hypercode with median skew.
[034] FIG. 16B shows representative intensity profiles for hypercodes fluorescently labeled in concatemeric amplification products including profiles that decode to a hypercode with high skew.
[035] FIG. 16C shows representative intensity profiles for hypercodes fluorescently labeled in concatemeric amplification products including profiles that decode to a hypercode with high skew.
[036] FIG. 17 is a table illustrating examples of sequences of 64 unique detection polynucleotide complexes used to identify 64 possible codes.
[037] FIG. 18 is an example of a table of the permutations that may be used to achieve a relatively large combination code space from which to select a subset of codes for detecting and decoding a recognition element code.
[038] FIG. 19A shows examples of plots and graphs demonstrating a hypercoding strategy for enabling accurate, high plexity detection of concatemeric products including (i) cycles of hypercode detection and (ii) representative intensity vectors of both a learned profile and the ideal intensity profile.
[039] FIG. 19B shows examples of plots and graphs demonstrating a hypercoding strategy for enabling accurate, high plexity detection of concatemeric products including a graph demonstrating an example of a profile skew.
[040] FIG. 19C shows examples of plots and graphs demonstrating a hypercoding strategy for enabling accurate, high plexity detection of concatemeric products including a graph demonstrating an example of a profile score.
[041] FIG. 19D shows examples of plots and graphs demonstrating a hypercoding strategy for enabling accurate, high plexity detection of concatemeric products including a graph demonstrating an example of decoded concatemeric amplification products for a range of target input concentrations.
[042] FIG. 19E shows examples of plots and graphs demonstrating a hypercoding strategy for enabling accurate, high plexity detection of concatemeric products including a graph demonstrating decoded error rates as a function of target input concentration.
[043] FIG. 20A shows an example of a summary table of the number of codes available in a code space, the number of segments, cycles (e.g., flows), and different detection moieties (e.g., colors) needed for a given number of potential targets for detection.
[044] FIG. 20B shows an example of a summary table of the codespace of cycles (e.g., flows) and Hamming distance (HD) and their importance in generating different code sets.
[045] FIG. 20C shows an example of a summary table of how plexity can increase based on either increasing the number of detection cycles or the number of differentiable optical states.
[046] FIG. 21 is a schematic diagram of an example of a trellis codespace and a process of using the trellis codespace to select a set of codes with desired properties from a large codespace for an assay.
[047] FIG. 22 is a schematic diagram of an example of a soft decision decoding algorithm of the present disclosure.
[048] FIG. 23 shows a non-limiting example of a computing system; in this case, a system with one or more processors, memory, storage, and a network interface.
[049] FIG. 24 shows a non-limiting example of a cloud-based web/mobile application provision system; in this case, a system comprising an elastically load balanced, auto-scaling web server and application server resources as well as synchronously replicated databases.
[050] FIG. 25 shows a non-limiting example of a web/mobile application provision system; in this case, a system providing browser-based and/or native mobile user interfaces.
DETAILED DESCRIPTION
[051] To unlock benefits of omics testing at a population level, testing at scale is critical for both powering translational studies and for driving routine adoption in the health care system. There is a need for molecular assays that are fast, affordable, and easy to implement and standardize across many labs. Next generation sequencing (NGS) has been critical to the discovery of new clinically relevant variants and bringing new applications into the translational and clinical space. However, while the cost of NGS has lessened, large scale implementation remains challenging due to the overall complexity, long turnaround times, the need for specialized skill sets, and data storage requirements. Tests relying on quantitative polymerase chain reaction (qPCR) remain a workhorse in clinical labs, due to workflow simplicity, speed and low cost. While qPCR is effective for low target numbers (<20), expansion to multiple targets is constrained by available fluorophores and optical detection systems, requiring multiple reactions per sample to cover additional targets, thus limiting scalability.
[052] Alternative strategies achieve high plexities using encoded assays that rely on aligning a target of interest, for example a nucleic acid target or a protein of a target, to a pre-determined code that serves as a proxy for the presence of the target of interest from a biological sample. Although this has been demonstrated with microarray and sequencing data, there is no current platform that is built for fast, highly quantitative detection of codes and, from the detection of a code, determination of the presence of the code’s associated target of interest.
[053] The present disclosure provides methods, systems and compositions for a platform that enables high-plex detection and quantitation of targets of interest from a sample by correlating detection of a hypercode with a target of interest from a sample for a fast, simple and highly-plexed assay. The methods and system presented herein provide for detecting and quantifying multiple samples and multiple targets of interest in those samples, in parallel and at high plexity.
[054] The present disclosure provides methods, systems, compositions and kits for multiplexed target molecule detection utilizing a detection and decoding by hybridization approach. The target molecule may be a nucleic acid molecule from a sample (e.g., a biological sample) or a nucleic acid molecule serving as a surrogate of a target molecule that is other than a nucleic acid molecule (e.g., polypeptide, sugar, metabolite, etc.). Methods, systems, compositions and kits of the present disclosure provide encoded assays (or components of the encoded assays) comprising a recognition element that uniquely recognizes and binds to a target molecule from a sample under conditions sufficient that the recognition element undergoes a molecular transformation in the presence (but not in the absence) of the target molecule to produce a modified recognition element.
[055] Such modified recognition elements may have a nucleic acid code or hypercode, which is associated or correlated with the target molecule. The modified recognition element may be amplified to produce multiple copies of the recognition element including its code. For example, where the molecular transformation is a ligation event adjoining a 3’ probe arm and a 5’ probe arm of the recognition element, the amplification may be rolling circle amplification (RCA). The code or hypercode may be made up of one or more segments wherein each segment may hybridize to a detectable oligonucleotide or a detection polynucleotide complex to determine the presence of the one or more segments, which can be correlated to the presence of the target molecule. The detection polynucleotide complexes of the present disclosure may include a detection oligonucleotide having a detectable label and a nucleic acid sequence configured to bind to an anchor oligonucleotide, wherein the anchor oligonucleotide is configured to bind to a segment of the code and the detection oligonucleotide, as shown for example in FIG. 12B. Diverse pools of the detection polynucleotide complexes utilized over multiple rounds of detection, also called flows or cycles, (e.g., for each segment of a code) are able to deliver a plurality of signals (e.g., fluorescent signals), also known as states, that may be combined to generate a pattern of detected colors, signals or states which can be subsequently decoded (e.g., soft decision decoding) to determine the probability of the presence of the hypercode which serves as a proxy or surrogate for the presence of the target molecule.
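The way signals from multiple flows may combine into states can be sketched as follows. This is an illustration only: it assumes that a state is an ordered tuple of the colors observed across the flows used to read one segment, so that four colors over two flows yield 16 states. The color names and function names are hypothetical and do not come from the disclosure.

```python
from itertools import product

# Hypothetical labels for four optically distinct detection moieties.
COLORS = ("blue", "green", "yellow", "red")


def assign_states(colors, flows):
    """Map each ordered color tuple (one color per flow) to a state index.

    Assumption: a state is defined by the sequence of colors seen across
    the flows; the disclosure may assign states differently.
    """
    return {combo: index for index, combo in enumerate(product(colors, repeat=flows))}


def code_space_size(n_states, n_segments):
    """Number of distinct codes when each segment independently takes one state."""
    return n_states ** n_segments


two_flow_states = assign_states(COLORS, flows=2)   # 4 colors ** 2 flows = 16 states
three_segment_codes = code_space_size(len(COLORS), 3)  # 4 ** 3 = 64 codes
```

Under these assumptions, the number of states grows with the number of colors and flows, and the code space grows exponentially with the number of segments; only a subset of that space would typically be selected as the working code set.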
Detecting a nucleotide sequence
[056] Hypercoding of omics targets can be enabled using a multi-functional linear nucleic acid molecule called a “recognition element”. A recognition element comprises a circularizable linear DNA molecule that comprises a code or hypercode unique to a particular target of interest and that, when hybridized to target sequences, can conform into a padlock probe configuration. In some embodiments, in the presence of the complementary genomic sequence, wild type or variant, a recognition element can hybridize to its intended target sequences of interest and undergo a conformational change into a circular DNA molecule, and the two adjacent sequences of the circularized recognition element can be ligated together to form a modified, circularized recognition element. In some embodiments, in the absence of target sequences of interest, there is no hybridization and therefore no circularized DNA molecule and no ligation of adjacent ends. In some embodiments, after ligation, the modified, circularized recognition element can be amplified to increase the number of hypercodes for downstream robust fluorescence detection.
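The padlock geometry described above can be illustrated with a simplified sequence check. The sketch assumes perfect Watson-Crick complementarity and exact string matching, and it ignores thermodynamics, mismatches, and gap filling. The function names and example sequences are hypothetical.

```python
# Translation table for DNA base complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def can_circularize(five_prime_arm, three_prime_arm, target):
    """Check whether both arms can sit on adjacent target regions.

    After ligation the 3' arm is joined to the 5' arm, so the junction reads
    three_prime_arm + five_prime_arm (5' to 3'). The target must contain the
    reverse complement of that junction for the two ends to abut.
    """
    junction = three_prime_arm + five_prime_arm
    return reverse_complement(junction) in target


# Hypothetical arms and targets, all written 5' to 3'.
five_arm = "ACGTAC"
three_arm = "GGATCC"
matching_target = "TT" + reverse_complement(three_arm + five_arm) + "AA"
unrelated_target = "TTTTTTTTTTTTTTTT"
```

With these inputs, `can_circularize(five_arm, three_arm, matching_target)` is true and the same call with `unrelated_target` is false, mirroring the statement that no circularization or ligation occurs in the absence of the target sequence.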
[057] The methods described herein relate to determining the presence of a target molecule from a sample. In some embodiments, the presence of the target molecule is determined by introducing a plurality of recognition elements to the sample, wherein each recognition element comprises a target recognition region specific to a target in the sample under conditions sufficient to bind the recognition elements to the respective targets. In some embodiments, the target recognition regions are complementary to the respective target nucleic acid molecule. In some embodiments, the recognition element comprises a code that is associated with target molecules that may be detected and used as a surrogate, or proxy, for the target molecule. In some embodiments, the code comprises a plurality of segments that each correspond to one or more computational states (e.g., fluorescent colors, or signals) that are used in a decoding process to determine the presence of the code and therefore the presence of the target nucleic acid molecule. In some embodiments, the decoding process comprises a soft decision decoding algorithm.
[058] The recognition elements bound to the respective target molecules may be selectively amplified to produce a plurality of amplification products comprising amplified codes. In some embodiments, the amplification products are immobilized on a substrate (e.g., welled plate, flow cell). In some embodiments, the amplification to generate amplification products is performed in solution. In some embodiments, the amplification to generate amplification products is performed on a substrate.
[059] In some embodiments, a plurality of detection polynucleotide complexes is introduced to the plurality of amplification products. In some embodiments, a detection oligonucleotide and an anchor oligonucleotide are added to the amplification products, wherein the detection oligonucleotide and the anchor oligonucleotide assemble to form a detection polynucleotide complex substantially simultaneously to the anchor oligonucleotide hybridizing to a code or a portion of a code.
[060] In some embodiments, the detection polynucleotide complex comprises a detection oligonucleotide and an anchor oligonucleotide. A portion of an anchor oligonucleotide may be complementary to at least a portion of the amplification product. Another portion of the anchor oligonucleotide may be complementary to at least a portion of the detection oligonucleotide. When the detection polynucleotide complex has an anchor oligonucleotide complementary to a portion of a code of an amplification product, detectable binding complexes with the portion of the code are formed. In some embodiments, the detectable binding complexes are imaged with an imaging system disclosed elsewhere herein to obtain signals associated with the code for each amplification product. This process may be repeated for each segment of each code, thereby building a color profile for each code. A decoding process (e.g., soft decision decoding) may be applied to the color profile to determine or predict the presence of each code, thereby identifying the presence of the target molecules in the sample.
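The decoding of a color profile can be illustrated with a minimal sketch. It assumes that each code segment maps to one of four fluorescent colors and that imaging yields, for every cycle, an intensity per color. The codebook, the intensity values, and the scoring rule (a sum of log-normalized intensities) are hypothetical choices made for illustration; they are not the decoding algorithm of the disclosure.

```python
import math

COLORS = ("R", "G", "B", "Y")

# Hypothetical codebook: one color state per code segment.
CODEBOOK = {
    "code_A": ("R", "G", "B", "Y"),
    "code_B": ("G", "G", "R", "Y"),
    "code_C": ("B", "Y", "R", "G"),
}


def soft_decode(profile, codebook=CODEBOOK, eps=1e-6):
    """Return (best_code, scores) for a color profile.

    profile: list of dicts, one per imaging cycle, mapping color -> intensity.
    No single color is called per cycle (that would be a hard decision);
    every candidate code is scored against all measured intensities.
    """
    scores = {}
    for name, states in codebook.items():
        score = 0.0
        for cycle, state in zip(profile, states):
            total = sum(cycle[c] for c in COLORS) + eps * len(COLORS)
            # Normalized intensity is used as a rough probability of the state.
            score += math.log((cycle[state] + eps) / total)
        scores[name] = score
    best = max(scores, key=scores.get)
    return best, scores


# Illustrative (made-up) intensities; cycle 2 is ambiguous between G and Y.
profile = [
    {"R": 0.10, "G": 0.80, "B": 0.05, "Y": 0.05},
    {"R": 0.03, "G": 0.45, "B": 0.02, "Y": 0.50},
    {"R": 0.90, "G": 0.04, "B": 0.03, "Y": 0.03},
    {"R": 0.05, "G": 0.05, "B": 0.05, "Y": 0.85},
]
best_code, code_scores = soft_decode(profile)
```

In this made-up profile, calling the brightest color in each cycle gives G, Y, R, Y, which matches no code in the codebook. Scoring every code against the full set of intensities still ranks `code_B` highest.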
[061] In some embodiments, provided herein are methods comprising analyzing a plurality of target nucleic acid molecules from a sample, providing a plurality of recognition elements, wherein each recognition element of the plurality comprises one or more target recognition regions complementary to a corresponding target nucleic acid molecule of the plurality of target nucleic acid molecules; and a code from a set of codes, wherein the code is associated with one or more target nucleic acid molecules of the plurality from the sample including the corresponding target nucleic acid molecule. The code can comprise a plurality of segments that corresponds to at least two computational states of a set of computational states. In some embodiments, the methods comprise selectively amplifying a subset of the plurality of recognition elements bound to the plurality of target nucleic acid molecules to produce a plurality of amplification products, wherein each amplification product comprises a code, introducing a plurality of detection polynucleotide complexes to the plurality of amplification products, wherein each detection polynucleotide complex of the plurality comprises a detection oligonucleotide and an anchor oligonucleotide, wherein a portion of each anchor oligonucleotide is complementary to at least a portion of the amplification product of the plurality of amplification products, and another portion of the anchor oligonucleotide is complementary to at least a portion of the detection oligonucleotide forming a plurality of detectable binding complexes. 
In some embodiments, each detectable binding complex comprises a detection polynucleotide complex bound to an amplification product of the plurality of amplification products, further including imaging the plurality of detectable binding complexes to obtain signals associated with the different segments of the plurality of segments of each code of the plurality of codes for each amplification product of the plurality of amplification products, iteratively repeating the operations of the introducing, forming, and imaging for each segment of each code of the plurality of codes and applying a soft decision decoding algorithm to the plurality of codes to predict a presence of the one or more target nucleic acid molecules from the sample.
[062] In some embodiments, provided herein are methods comprising analyzing a plurality of target nucleic acid molecules from a sample, providing a plurality of recognition elements, wherein each recognition element of the plurality comprises one or more target recognition regions complementary to a corresponding target nucleic acid molecule of the plurality of target nucleic acid molecules; and a code from a set of codes, wherein the code is associated with one or more target nucleic acid molecules of the plurality from the sample including the corresponding target nucleic acid molecule. In some embodiments, the code comprises a plurality of segments that corresponds to at least two computational states of a set of computational states, selectively amplifying a subset of the plurality of recognition elements bound to the plurality of target nucleic acid molecules to produce a plurality of amplification products, wherein each amplification product comprises the code. In some embodiments, the methods comprise introducing a plurality of detection oligonucleotides to the plurality of amplification products, wherein each detection oligonucleotide comprises a sequence complementary to a portion of a code and further comprises a detectable moiety and, when bound to a segment of the code, is called a detectable binding complex, wherein each detectable binding complex comprises a detection oligonucleotide bound to an amplification product of the plurality of amplification products, imaging the plurality of detectable binding complexes to obtain signals associated with the different segments of the plurality of segments of each code of the plurality of codes for each amplification product of the plurality of amplification products, iteratively repeating the operations of the introducing, forming, and imaging for each segment of each code of the plurality of codes and applying a soft decision decoding algorithm to the plurality of codes to predict a presence of the one or more target nucleic acid molecules from the sample.
[063] In some embodiments, the methods described herein may comprise iteratively repeating the operations of: (i) introducing detection oligonucleotide complexes; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes to obtain signals, in order to generate a pattern of states indicative of each code. In some embodiments, the method is performed for each segment within a code. In some embodiments, each operation comprises a cycle whereby a first pool of detection polynucleotide complexes is introduced to a plurality of modified recognition elements associated with respective target molecules, or amplification products thereof, under conditions sufficient to bind a detection polynucleotide complex to a modified recognition element or an amplification product thereof to facilitate detection of the segment of a code. In some embodiments, the iteratively repeating operations (i) to (iii) comprises adding and cycling additional pools of detection polynucleotide complexes in a sequential manner until all or substantially all of the segments for each of the codes are detected. In some embodiments, methods further comprise a wash operation between operations (ii) and (iii) to remove one or more of unbound detection polynucleotide complexes, unbound detection oligonucleotides, or anchor oligonucleotides. In some embodiments, methods further comprise a
dehybridization operation in between cycles to destabilize and remove a first pool of detection polynucleotide complexes.
[064] FIG. 5 provides a non-limiting example of a process 500 for detecting the presence of a target nucleic acid molecule. The process 500 comprises the operations of incubation 510, wash and image 520, and dehybridization and wash 530. At operation 510, detection polynucleotide complexes comprising a detection oligonucleotide and an anchor oligonucleotide may be incubated with amplification products of recognition elements comprising a code of one or more segments. A detection polynucleotide complex may bind to a complementary segment of a code, or a portion thereof, present on a recognition element. At operation 520, the detection polynucleotide complexes that are not bound to a segment of a code, or a portion thereof, may be removed by washing, and the detection polynucleotide complex that is bound to the segment of a code, or a portion thereof, may be imaged. At operation 530, the detection polynucleotide complex that is bound to the segment of a code, or a portion thereof, may be dehybridized and removed from the assay environment. The methods described herein relate to performing process 500 and iteratively repeating operations 510, 520, and 530 to determine the code state or color profile of an amplified recognition element, thereby identifying the presence of the target nucleic acid molecule.
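The cyclic structure of process 500 can be sketched as a simple loop. This is a toy model under stated assumptions: the class and function names are hypothetical, binding is treated as perfect, and each segment is read as a single color label. It only shows how repeating operations 510, 520, and 530 once per segment accumulates a color profile.

```python
from dataclasses import dataclass, field


@dataclass
class AmplifiedProduct:
    """Hypothetical amplified recognition element: one color state per code segment."""
    code: tuple
    bound: list = field(default_factory=list)  # complexes currently hybridized


def incubate(product, segment_index):
    # Operation 510: the complex matching this segment binds its complement.
    product.bound.append((segment_index, product.code[segment_index]))


def wash_and_image(product):
    # Operation 520: unbound complexes are washed away; the bound label is read.
    _, color = product.bound[-1]
    return color


def dehybridize_and_wash(product):
    # Operation 530: bound complexes are removed before the next cycle.
    product.bound.clear()


def read_color_profile(product):
    """Run one cycle (510, 520, 530) per segment and collect the observed colors."""
    profile = []
    for segment_index in range(len(product.code)):
        incubate(product, segment_index)
        profile.append(wash_and_image(product))
        dehybridize_and_wash(product)
    return tuple(profile)


product = AmplifiedProduct(code=("R", "G", "B", "Y"))
observed = read_color_profile(product)
```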
Encoded Assay Workflows
[065] The encoded assays disclosed herein are capable of multiplex target detection. The readout of the encoded assays can be measured alongside the readout of various molecular assays that may be performed in parallel, thereby enabling a multiomic platform for the analysis of different target molecules from a sample.
[066] An assay workflow, according to some embodiments herein, may comprise the following operations.
[067] A sample may be collected or provided. The sample may be whole blood, lymphatic fluid, serum, plasma, sweat, tear, saliva, sputum, cerebrospinal fluid, amniotic fluid, seminal fluid, vaginal excretion, serous fluid, synovial fluid, pericardial fluid, peritoneal fluid, pleural fluid, transudates, exudates, cystic fluid, bile, urine, gastric fluid, intestinal fluid, fecal samples, liquids containing single or multiple cells, liquids containing organelles, fluidized tissues, fluidized organisms, liquids containing multi-celled organisms, tissues, cells, biopsy samples, biological swabs or biological washes. For example, a blood or saliva sample may be collected. In one example, a whole blood sample may be collected and processed to separate the plasma fraction from the cellular components of whole blood.
[068] Target molecule extraction, concentration, conversion, and/or purification processes may be performed. The target molecule may be DNA. DNA (e.g., cell-free DNA) in the plasma sample may be extracted, purified, and concentrated for analysis. A proteinase K digestion operation may be used to digest proteins present in the plasma sample. In some cases, a heat denaturation operation (e.g., 94-98°C for 20-30 seconds) may be used to denature double-stranded DNA into single-stranded DNA. A bead-based extraction and concentration protocol may be used to capture single-stranded DNA in the plasma sample. In some embodiments, the bead-based extraction protocol uses magnetically responsive nucleic acid capture beads. The bead-bound DNA may be released from the capture beads using an elution buffer (or other elution means suitable to the capture bead used) to produce a processed DNA sample for analysis. In one embodiment, the DNA sample may be further processed in a bisulfite conversion reaction for analysis of the methylation status of a set of targets in the sample. A skilled artisan will understand the methods that can be used to extract and/or purify target molecules, such as nucleic acids or proteins, from a sample.
[069] The DNA sample may be transferred into an analysis cartridge or device according to some embodiments herein. The analysis cartridge or device may comprise a reaction vessel. Non-limiting examples of reaction vessels include a plate, a well, a container, a tube, a flow cell, a microfluidic chip, or the like. The plate may be a welled plate, such as a 12-well plate, a 24-well plate, a 48-well plate, a 96-well plate, a 384-well plate, a 1536-well plate, and the like. The reaction vessel, or a reaction surface thereof, may be optically clear to enable optical target detection in the reaction vessel. The reaction vessel may comprise a glass surface. The reaction vessel may comprise a glass-bottomed well plate. The reaction vessel may comprise a surface coating that promotes sequestration of nucleic acid amplification products. For example, the reaction vessel may comprise a cationic coating.
[070] A recognition event for each target molecule in a set of target molecules may be performed. FIG. 1 provides an example of an encoded assay workflow. Starting from the top left of FIG. 1, a recognition element (see also FIG. 2) comprises 5’ and 3’ ends that are complementary to target sequences of interest. A recognition element further comprises a code, also known as a hypercode. In this instance of FIG. 1, there are four segments to the hypercode, wherein the hypercode can be used as a proxy for the presence of a target of interest. Target nucleic acids of interest are incubated with the recognition element wherein they can hybridize, if present, to their complementary sequences in the recognition element (middle top illustration). After hybridization (e.g., in the presence of the target sequence of interest), the ends of the
recognition element are adjacently located and ligation between the 5’ and 3’ ends of the recognition element can occur. If there is no target of interest, there is no hybridization and no ligation. The reactions can be treated with one or more exonucleases, thereby digesting any linear nucleic acids present in the reaction that did not participate in the hybridization and ligation events.
[071] Still referring to FIG. 1, after exonuclease digestion the circularized recognition elements are aliquoted into a welled plate where they are immobilized onto the surface of the welled plate. The addition of a polymerase (e.g., DNA polymerase) allows for amplification and concatenation of the circularized recognition elements. The concatenated amplified products can be queried with fluorescent detection complexes (bottom left of FIG. 1) (see also FIG. 5) and imaged. A hypercode profile can be generated upon multiple query events, also called cycles or flows. The resulting hypercode profile can be decoded to identify the hypercode which can be used as a proxy for the presence of the target of interest.
[072] The target molecule can be uniquely recognized by and bound to a recognition element associated with a hypercode (and optionally other elements). In one example, the recognition event for the set of target molecules uses a plurality of coded recognition elements. In another example, the recognition event for the set of target molecules uses a panel of molecular inversion probes. In another example, the recognition event for the set of target molecules uses a panel of padlock probes. The recognition event yields a set of hypercoded target molecules comprising the target molecule and the recognition element.
[073] The recognition event may include sequence-specific binding between a 5’ probe arm and a 3’ probe arm of the recognition element to the target nucleic acid molecule under conditions sufficient to form a binding complex comprising the recognition element and the target nucleic acid molecule. In embodiments where the recognition element is a padlock probe, the 5’ probe arm and the 3’ probe arm comprise a target recognition element that binds to two adjacent sequences in the target molecule. In another embodiment, the 5’ probe arm and the 3’ probe arm bind to the target nucleic acid molecule at 3’ and 5’ regions flanking the target region, leaving a gap between the 5’ probe arm and the 3’ probe arm of the padlock probe.
[074] The recognition event may include sequence-specific binding between a 5’ probe arm and a 3’ probe arm of the recognition element to a 3’ region and a 5’ region of a bridge oligonucleotide having a target-specific element complementary to the target nucleic acid molecule interposed between the 3’ region and the 5’ region. In some embodiments, the bridge oligonucleotide is a surrogate for the target molecule. In some embodiments, the bridge oligonucleotide and the recognition element are introduced to the target nucleic acid molecule under conditions sufficient to form a ternary binding complex comprising the recognition element, the bridge oligonucleotide and target nucleic acid molecule.
[075] The recognition event may include sequence-specific binding between the target nucleic acid molecule and a target-binding region of a pre-circularized recognition element. In some embodiments, the target nucleic acid molecule is a surrogate for the target molecule, such as, for example, a cleavage product from a flap endonuclease cleavage reaction between a dual-probe recognition element and the target molecule. In some embodiments, the target nucleic acid molecule serves as a primer for an amplification reaction, such as a rolling circle amplification (RCA) reaction or a multiple strand displacement reaction.
[076] A transformation event for each recognition element may be performed. The transformation event may comprise one or more enzymes under conditions sufficient to circularize the recognition element. The transformation event may include a ligation reaction between the 3’ probe arm and the 5’ probe arm of the recognition element by a ligating enzyme. The ligating enzyme may be a DNA ligase or catalytically active portion thereof. Non-limiting examples of ligases include any ligase which can ligate the 3’ hydroxyl group to a 5’ phosphate group of a DNA molecule, for example T4 DNA ligase, thermostable T4 DNA ligase, AmpLigase Thermostable DNA ligase, HiFi Taq DNA ligase, and the like. The transformation event may include a gap-fill ligation reaction in embodiments where there is a gap between the 3’ probe arm and the 5’ probe arm of the recognition element following hybridization of the recognition element to the target nucleic acid molecule. In addition to the ligase, a polymerizing enzyme may be used to synthesize DNA from the 3’ end of the recognition element until it abuts the 5’ end of the recognition element in the gap. The polymerizing enzyme may be a DNA polymerase or a catalytically active portion thereof. The DNA polymerase may be a thermostable polymerase, a thermolabile polymerase, Bst polymerase, a Bst-like polymerase, a Therminator X polymerase, a Bst3.0 polymerase, an ArcticZymes polymerase, or a Bsm DNA polymerase. The transformation event may produce a modified recognition element, e.g., a version of the recognition element that is ligated or gap-filled and ligated. In one example, transformation of a recognition element in a ligation or gap-fill ligation reaction generates a circular modified recognition element.
[077] An exonuclease cleanup operation may be used following ligation of the recognition element to digest any remaining single-stranded nucleic acid, such as unhybridized recognition elements, amplification primers, and single-stranded target molecules. Non-limiting examples of exonucleases useful for digesting remaining single-stranded nucleic acids include Exonuclease I, Exonuclease VII, Msz Exonuclease I, T5 exonuclease, Exonuclease V, DNase I, or any combination thereof.
[078] An amplification event for modified recognition elements may be performed. In one example, the amplification event may be a rolling circle amplification (RCA) reaction to generate a set of concatenated amplification products. The amplification event thereby yields a set of concatenated amplified recognition elements including their unique codes (e.g., codes present in the modified recognition elements) that can be correlated to the target molecule. An amplification event could further be a multiple strand displacement reaction to generate a set of target molecule-specific amplification products.
[079] A detection event followed by a decoding event for each amplified code as found in the amplified recognition elements may be performed to identify the code of an amplified modified recognition element. In one example, the code may be detected by hybridization of one or more segments of the code (and optionally other elements) to a detection polynucleotide complex or a detection oligonucleotide of the present disclosure. The detection events detect the code as a surrogate or proxy for identifying the presence of the target molecule in the sample. Decoding the detection events may in some cases make use of a soft decision decoding algorithm.
[080] A bioinformatics analysis of the code information (and optionally other elements) from the detection operation may be performed. The bioinformatic analysis may be performed by one or more computer systems as described herein.
[081] In some embodiments, the amplification event and the detection event may occur in a stepwise manner, such that the modified recognition elements are first amplified, followed by one or more washes, followed by the detection event and the decoding event.
[082] In some embodiments, presence of the codes may be determined with a detection and decoding by hybridization process disclosed herein. For example, a plurality of detection polynucleotide complexes or detection oligonucleotides may be introduced to the amplified modified recognition elements iteratively for detection of each segment, or a portion thereof, within all or substantially all amplified codes in the amplified recognition elements.
[083] FIG. 14 is a schematic diagram illustrating an example of a process 1400 of using a bisulfite conversion reaction in combination with a coded recognition element to detect a methylated target nucleic acid of interest. In this example, a DNA sample may include a target sequence of interest 1410 that may be methylated (e.g., 1410a “Methylated Target”) or unmethylated (e.g., 1410b “Unmethylated Target”) at a CpG site of interest. A bisulfite conversion reaction is used to convert non-methylated cytosines to uracils, which are read as thymines (C → T), in the target sequence 1410b.
[084] In the recognition event, target sequence 1410 is recognized and bound by a recognition element comprising a code, 1415. Recognition element 1415, in this example, includes a 3'- terminal G nucleotide that base pairs with the target C at the CpG site of interest.
[085] In some embodiments, in a transformation event, ligation of recognition element 1415 only occurs when the 3'-terminus of the recognition element (e.g., a guanine “G”) hybridizes to the target site “C” of interest in target sequence 1410a to generate a circularized modified recognition element 1420. No ligation occurs at the mismatched target site “T” in the bisulfite converted target sequence 1410b and consequently, there is no ligation and circularization of the recognition element. The circular modified recognition element 1420 may be amplified in an amplification reaction to generate an amplification product comprising many copies of the circular modified recognition element including its code (among other elements) and the code may be detected and decoded.
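The methylation-dependent ligation logic of process 1400 can be expressed as a short in-silico sketch. It assumes an idealized conversion in which every unmethylated cytosine reads as thymine and every methylated cytosine is protected, and it models ligation as requiring a Watson-Crick pair at the 3'-terminal base of the recognition element. The function names and the toy sequence are hypothetical, and strand orientation is ignored for simplicity.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Return seq with every unmethylated C read as T; methylated C is kept."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def ligation_permitted(probe_3prime_base, target_base):
    """Idealized rule: ligation needs a matched base pair at the probe 3' terminus."""
    return COMPLEMENT[probe_3prime_base] == target_base


target = "ATCGTA"  # toy sequence with a CpG; the interrogated C is at index 2
cpg_index = 2

methylated = bisulfite_convert(target, methylated_positions={cpg_index})
unmethylated = bisulfite_convert(target)

# A recognition element with a 3'-terminal G only pairs with the protected C.
circularizes_on_methylated = ligation_permitted("G", methylated[cpg_index])
circularizes_on_unmethylated = ligation_permitted("G", unmethylated[cpg_index])
```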
[086] In one embodiment of process 1400, the recognition element may be a padlock probe as shown in 1415. In another embodiment, a molecular inversion probe that includes a 3'-terminal single base gap at a target site of interest may be used. A gap-fill and ligation event using only a single added nucleotide (at a minimum) may be used to generate the circular modified recognition element comprising the code only when the nucleotide corresponding to the target site of interest is incorporated. This approach provides two forms of specificity to the assay: (i) the 3'-terminus of the recognition element recognizes and binds the interrogated site; and (ii) a single base extension reaction that incorporates the nucleotide corresponding to the target site of interest occurs.
Recognition Elements
[087] Hypercoding of targets of interest can be enabled by utilizing a multi-functional oligonucleotide called a “recognition element”. In some embodiments, a plurality of recognition elements is provided in assays disclosed herein. In some embodiments, each recognition element in the plurality of recognition elements may comprise one or more target recognition regions. The target recognition regions of the recognition elements may comprise one or more nucleic acid sequence(s) complementary to a target molecule of interest. In some embodiments, the one or more nucleic acid sequences of a recognition element may hybridize to one or more nucleic acid sequences of the target molecule. In some embodiments, the target recognition region is configured to bind to one or more regions of the target molecule of interest flanking a target of interest (e.g., SNP, indel, and so on). In some embodiments, the target recognition region is configured to bind to the target of interest (e.g., the target recognition region base pairs with the SNP). In some embodiments, the recognition element comprises a code. In some embodiments, the target molecule comprises the code. In either embodiment, the code may be detected as a surrogate or proxy for the presence of the target molecule. A non-limiting example of a recognition element comprising target recognition regions and a code is depicted in FIG. 2. In this non-limiting depiction, the recognition element comprises two target recognition regions (e.g., one at the 5’ end and another at the 3’ end of the recognition element) and a code comprising four segments (e.g., as an example).
[088] In some embodiments, the structure of the recognition elements may vary. In some embodiments, the structure of the recognition element may configure into a specific structure when hybridized to a target nucleic acid. Non-limiting examples of a recognition element configuration may include a padlock probe, a molecular inversion probe, a hairpin oligonucleotide, a single-stranded oligonucleotide, a double-stranded oligonucleotide, or a combination thereof. In some embodiments, the recognition element is linear. In some embodiments, the linear recognition element is circularized during the molecular transformation once hybridized to the respective target molecule. In some embodiments, the recognition element is circular prior to the molecular transformation. In one embodiment, the target molecule may serve as a primer for an amplification reaction (e.g., rolling circle amplification, multiple strand displacement amplification, etc.).
[089] In some embodiments, the recognition element is configured to be a padlock probe once the recognition element is hybridized to sequences of the target molecule of interest. Padlock probes may be referred to as linear oligonucleotides whose ends are complementary to adjacent target sequences. Upon hybridization to a target molecule, the two ends (e.g., 5’ end and 3’ end) of the recognition element may be brought into contact, generating a padlock probe configuration for subsequent circularization by ligation.
[090] There are several factors to consider in the design of a recognition element. The diagram depicted in FIG. 3 illustrates some of these factors, with corresponding potential advantages and drawbacks.
[091] A recognition element as described herein further comprises a hypercode, or code, which can be used to uniquely identify the recognition element, and hence the target of interest to which it can hybridize, thereby providing an indirect determination of whether a target of interest is present in a sample.
Target recognition regions
[092] In some embodiments, each recognition element provided may comprise one or more target recognition regions. The target recognition regions of the recognition element are configured to hybridize to a target molecule. In some embodiments, the target recognition regions may be complementary to a sequence of a target nucleic acid molecule.
[093] In some embodiments, the recognition element may comprise a number of target recognition regions. For example, as depicted in FIG. 2, the recognition element may comprise two target recognition regions. In FIG. 2, one target recognition region may be present at the 5’ end of the recognition element, and another target recognition region may be present at the 3’ end of the recognition element. Alternatively, the target recognition region may be interposed between the 5’ end and the 3’ end of the recognition element. In some embodiments, the recognition element may comprise one or more target recognition regions, two or more target recognition regions, three or more target recognition regions, four or more target recognition regions, five or more target recognition regions, six or more target recognition regions, seven or more target recognition regions, eight or more target recognition regions, nine or more target recognition regions, 10 or more target recognition regions, 15 or more target recognition regions, 20 or more target recognition regions, or 25 or more target recognition regions. In some embodiments, the recognition element may comprise 25 or less target recognition regions, 20 or less target recognition regions, 15 or less target recognition regions, 10 or less target recognition regions, nine or less target recognition regions, eight or less target recognition regions, seven or less target recognition regions, six or less target recognition regions, five or less target recognition regions, four or less target recognition regions, three or less target recognition regions, or two or less target recognition regions.
[094] In some embodiments, each target recognition region of the recognition element may comprise a plurality of nucleotides. In some embodiments, each target recognition region may comprise a length of 2 or more nucleotides, 3 or more nucleotides, 4 or more nucleotides, 5 or more nucleotides, 10 or more nucleotides, 15 or more nucleotides, 20 or more nucleotides, 25 or more nucleotides, 30 or more nucleotides, 35 or more nucleotides, 40 or more nucleotides, 45 or more nucleotides, or 50 or more nucleotides. In some embodiments, each target recognition region comprises a length of 50 or less nucleotides, 45 or less nucleotides, 40 or less nucleotides, 35 or less nucleotides, 30 or less nucleotides, 25 or less nucleotides, 20 or less nucleotides, 15 or less nucleotides, 5 or less nucleotides, 4 or less nucleotides, 3 or less nucleotides, or 2 or less nucleotides.
Recognition element design
[095] In some embodiments, there are three major operations in designing a recognition element (also known as a Plenoid™) to identify and hybridize to its target.
[096] In some embodiments, the first operation is to select the target sequences, including a screen for sequence similarity in the genome (FIG. 8A). If similarity is found, the recognition element for a locus may be designed utilizing a dual-ligation strategy instead of a standard design. For example, a pharmacogenomic target such as CYP2D6 is difficult to genotype because of high homology with a pseudogene, CYP2D7. Using a dual-ligation strategy to identify targets of interest in CYP2D6 and not in CYP2D7 overcomes that difficulty and allows CYP2D6 targets of interest to be uniquely identified in a background of nucleic acids from the pseudogene CYP2D7 (see FIG. 9A).
[097] Referring to FIG. 9A, a strategy to genotype variants in regions with high homology to homologs or pseudogenes, for example, includes a two-ligation approach wherein one ligation event occurs at the base of interest and a second ligation event occurs at a downstream site where the high homology region and the region of interest differ. Additionally, a bridge oligonucleotide can be added to fill the gap between the two ligation sites. FIG. 9B demonstrates the success of following this two-ligation strategy. In FIG. 9B, the target of interest lies in the CYP2D6 gene and is differentiated from the pseudogene CYP2D7 using a dual ligation recognition element strategy compared to a single ligation (e.g., “normal”) recognition element strategy. Graph (i) shows results using a single ligation recognition element strategy in identifying a variant rs774671100, wherein known HomRef (Ref) genotyped samples show elevated counts for the alternative, or Alt, allele. However, as seen in graph (ii) when implementing a dual-ligation recognition element strategy the off-target counts are dramatically decreased. FIG. 9B graph (iii) shows overlapping clusters and indistinguishable reference allele counts using a single ligation recognition element strategy, compared to graph (iv) where a dual-ligation recognition element strategy was implemented which resulted in distinguishable reference allele counts.
[098] In some embodiments, the second operation is to select 5’ and 3’ regions (also known as probe arms) of the recognition element that are complementary to the target sequences of interest identified in the first operation (FIG. 8B). 5’ and 3’ complementary regions, which may be of varying lengths, are selected based on thermodynamic predictions for hybridization to the intended target sequences.
[099] In some embodiments, the third operation is to assign the complementary 5’ and 3’ sequences from the second operation to a hypercode by computing and minimizing the potential
for interaction between the complementary 5’ and 3’ sequences and the hypercode assigned thereto (FIG. 8C).
[100] In some embodiments, the sequences for a target of interest are generally provided as a standard .vcf (e.g., variant call format) file that includes the chromosome number, position, locus id, reference allele (e.g., ref allele), and alternative (e.g., variant, Alt, Het) allele(s). Generally, each locus is evaluated as N recognition elements where N is the total number of alleles (reference and alternative). For each potential recognition element, the reference genome of interest can be queried to obtain a sequence that extends, for example, 45 bases before the locus and 44 bases after the locus. For example, the reference base (e.g., wild type nucleotide) is replaced with the alternative base(s) (e.g., variant nucleotide(s)) being targeted as the target of interest by the recognition element. For example, a variant of interest could be a single nucleotide polymorphism or other variant of interest. The sequence of the target of interest can be divided into two separate portions: a portion of the target of interest that is complementary to the 5’ end of a recognition element and a portion of the target of interest that is complementary to the 3’ end of a recognition element. In some embodiments, the reference/alternative target of interest can be located at the end of the 3’ end of the recognition element. In some embodiments, the reference/alternative target of interest can be located at the end of the 5’ end of the recognition element.
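The construction of per-allele target sequences described above can be sketched as follows. This is a minimal illustration only, assuming a single-nucleotide variant, a reference held as an in-memory string, and a 0-based locus index (VCF positions are 1-based and would need converting). The function name `allele_target_sequences` and its parameters are hypothetical and are not taken from the disclosure.

```python
def allele_target_sequences(reference, locus, ref_allele, alt_alleles,
                            before=45, after=44):
    """Return one target sequence per allele (reference and alternatives).

    reference   : reference sequence as a string (assumption: already loaded)
    locus       : 0-based index of the variant base in `reference`
    ref_allele  : expected reference base at `locus`
    alt_alleles : list of alternative bases (single-nucleotide variants only)
    """
    if locus < before or locus + after >= len(reference):
        raise ValueError("not enough flanking sequence around the locus")
    if reference[locus] != ref_allele:
        raise ValueError("reference allele does not match the reference sequence")

    left = reference[locus - before:locus]            # 45 bases before the locus
    right = reference[locus + 1:locus + 1 + after]    # 44 bases after the locus

    # One candidate target per allele: the reference base is kept for the
    # reference allele and swapped for each alternative base.
    return {allele: left + allele + right
            for allele in [ref_allele, *alt_alleles]}
```

With the default flank sizes each returned sequence is 90 bases long, and a locus with one alternative allele yields N = 2 candidate targets.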
[101] In some embodiments, for each target sequence of interest, a list of decreasing target sequence lengths from 45 nucleotides to a minimum sequence length of 15 nucleotides, and the reverse complements of those sequences, is generated. For each target sequence of interest and its reverse complement (a pair), a Nearest Neighbor estimation of the thermodynamic parameters between the two sequences is computed (e.g., ΔG and melting temperature). For the pairs of 5’ end targets and 3’ end targets, the pairs with a melting temperature closest to 65 °C and 60 °C, respectively, are selected.
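The length-selection step above can be illustrated with a short sketch. Two assumptions are made here and are not from the disclosure: the melting temperature is estimated with a simple GC-content formula, Tm = 64.9 + 41 (GC - 16.4) / N, as a stand-in for the Nearest Neighbor estimate (it is not a nearest-neighbor model), and candidates are shortened from the end farthest from the locus. All function names are hypothetical.

```python
def simple_tm(seq):
    """Rough GC-content Tm estimate in degrees C (stand-in, not Nearest Neighbor)."""
    gc = sum(base in "GC" for base in seq.upper())
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def candidate_arms(flank, fixed_end, max_len=45, min_len=15):
    """List candidate arm sequences of decreasing length (45 down to 15 nt).

    fixed_end = "3p" keeps the 3' end of `flank` fixed and trims the 5' side;
    any other value keeps the 5' end fixed and trims the 3' side.
    """
    lengths = range(min(max_len, len(flank)), min_len - 1, -1)
    if fixed_end == "3p":
        return [flank[-n:] for n in lengths]
    return [flank[:n] for n in lengths]


def pick_arm(flank, target_tm, fixed_end, tm_fn=simple_tm):
    """Pick the candidate whose estimated Tm is closest to `target_tm`."""
    return min(candidate_arms(flank, fixed_end),
               key=lambda seq: abs(tm_fn(seq) - target_tm))
```

Under these assumptions, `pick_arm(flank, 65.0, "3p")` and `pick_arm(flank, 60.0, "5p")` would correspond to the 65 °C and 60 °C targets named in the text; a real design would pass a nearest-neighbor estimator as `tm_fn`.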
[102] In some embodiments, after both the 5’ target sequence end and a 3’ target sequence end for a recognition element are determined a hypercode unique to those sequences, and hence the target of interest, is assigned to the recognition element. A series of potential end/hypercode sequence matches is computed and screened for secondary structure due to potential intermolecular interactions between the sequences. As an example, this screening strategy involves randomly selecting 25 hypercodes for each recognition element from the selected codespace. For each hypercode, the 5’ target end sequence is placed in close proximity to the 5’ end of the potential hypercode sequence and the 3’ target end sequence is placed in close
proximity to the 3’ end of the hypercode sequence, as such generating a linear recognition element. In some embodiments, one or more primer binding site sequences can be inserted into the recognition element for amplification. In some embodiments, a hypercode or a portion thereof can be utilized as a primer binding site for amplification. A sliding window along the sequence of the unique recognition element sequences can be generated, wherein the window is stepped every 10 nucleotides. At each window placement the thermodynamic interactions between the sequence within the window and the full recognition element sequence can be computed. Upon the window reaching the end of the linear recognition element sequence, the minimum ΔG is reported across all windows. Once the minimum ΔG is computed for all linear recognition elements, the recognition element that has the maximum ΔG (e.g., a ΔG value greater than 0 to indicate unfavorable intermolecular interactions) and its incorporated hypercode is reserved for the particular target of interest as represented in the 5’ and 3’ end sequences of the linear recognition element.
[103] In some embodiments, there can be two additional considerations for recognition element design. The first consideration includes ligation errors that may occur due to ligase specificity. The second consideration includes pseudogenes and homologs that may have high homology between the specified locus and target of interest and other regions in the genome. To detect these homologies, a trimmed target sequence can be generated, by looking, for example, at 20 nucleotides before and after the target locus of interest. A Basic Local Alignment Search Tool (BLAST) search can be performed using the 20 nucleotide target sequence to identify homology that is > 80% match with the 20 nucleotide target sequence. All hits reported from the BLAST search that are >80% match to the 20 nucleotide target sequence can be added to a list of potential homology matches that may cause off-target hybridization events with a recognition element. The BLAST alignments can be reviewed, for example from 200 bases before and after each locus of interest. The sequence alignments can be reviewed, for example by moving in the 5’ direction and comparing the bases of the reference sequence against the alignment hits. In the event of a mismatch between the reference and one or more hit sequences, the mismatch position can be recorded as a potential anchor site for a double ligation recognition element design strategy. Additional anchor sites can be recorded as they are identified.
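The anchor-site scan described above can be sketched as follows, assuming the BLAST hits have already been aligned to the reference window without gaps, so that every hit string has the same length as the reference and positions correspond one to one. The function name `find_anchor_sites` is hypothetical; handling of gapped alignments is not shown.

```python
def find_anchor_sites(reference, hits, locus):
    """Record positions where the reference differs from one or more hits.

    reference : reference window around the locus (string)
    hits      : gap-free aligned homolog sequences, same length as `reference`
    locus     : 0-based index of the locus of interest within the window

    Returns two lists of positions, ordered by increasing distance from the
    locus: sites found walking in the 5' direction and in the 3' direction.
    """
    if any(len(hit) != len(reference) for hit in hits):
        raise ValueError("hits must be gap-free and the same length as the reference")

    def is_mismatch(pos):
        return any(hit[pos] != reference[pos] for hit in hits)

    five_prime = [pos for pos in range(locus - 1, -1, -1) if is_mismatch(pos)]
    three_prime = [pos for pos in range(locus + 1, len(reference)) if is_mismatch(pos)]
    return five_prime, three_prime
```

Each recorded position is only a candidate anchor; whether it is usable as a second ligation site would still depend on the strand and allele-frequency considerations discussed in the text.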
[104] In some embodiments, if the locus was targeted along the forward strand, then the anchor sites that are identified moving along the 5’ direction can be considered for 3’ ligation sites and the anchor sites found moving in the 3’ direction can be considered for 5’ ligation sites, or vice versa if targeting the reverse strand. In some embodiments, available databases for allele
information can also be queried and if the alleles result in a matching homolog the allele frequency at the site can be recorded. If the allele frequency is below a specified threshold, then it can be assumed that these homologs, when looking across samples, can be corrected for. If the allele frequency is above a specified threshold, then a dual ligation recognition element as described herein can be implemented for target of interest identification. As described herein for a dual ligation recognition element, a bridge oligonucleotide comprising target sequence between the target sequence of interest to the anchor nucleotide can be generated, wherein the 5’ end sequence and 3’ end sequence of the recognition element hybridizes adjacent to the ends of the bridge oligonucleotide when hybridized to the target of interest. In some embodiments, the target of interest sequence is found on the 5’ end sequence and the anchor nucleotide is found on the 3’ end sequence of the recognition element. In some embodiments, the target of interest sequence is found on the 3’ end sequence and the anchor nucleotide is found on the 5’ end sequence of the recognition element.
Amplification
[105] The methods described herein may include amplification of a nucleic acid. In some embodiments, the nucleic acid is a recognition element. In some embodiments, the nucleic acid is a target nucleic acid molecule. In some embodiments, the nucleic acid is a combination of a recognition element and a target nucleic acid molecule, or a complement thereof. In some embodiments, the amplification is selective amplification. For example, in some embodiments, amplification occurs if a target recognition region of a recognition element recognizes and binds to a complementary target nucleic acid. In some embodiments, amplification occurs if a primer is used that is complementary to one or more of a portion of a target recognition region, a portion of a segment of a code, or another sequence in the recognition element that is complementary to a primer used for amplification. In some embodiments, the amplification is non-selective. For example, in some embodiments randomers can be used to prime amplification from one or more recognition elements. In another example, a universal primer can be used to prime amplification from a plurality of recognition elements.
[106] In some embodiments, the methods described herein may include selectively amplifying a subset of nucleic acids. For example, in some embodiments, a subset of a plurality of recognition elements bound to a plurality of target nucleic acid molecules may be amplified. The subset may comprise a percentage of the total amount of recognition elements bound to target nucleic acid molecules as described herein. In some embodiments, the subset may include 5% or more, 10% or more, 15% or more, 20% or more, 25% or more, 30% or more, 35% or more, 40% or more,
45% or more, 50% or more, 55% or more, 60% or more, 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, or 95% or more of the total amount of recognition elements bound to target nucleic acid molecules. In some embodiments, the subset may include 95% or less, 90% or less, 85% or less, 80% or less, 75% or less, 70% or less, 65% or less, 60% or less, 55% or less, 50% or less, 45% or less, 40% or less, 35% or less, 30% or less, 25% or less, 20% or less, 15% or less, 10% or less, or 5% or less of the total amount of recognition elements bound to target nucleic acid molecules.
[107] In some embodiments, the amplification may include rolling circle amplification (RCA). In some embodiments, the amplification may include multiple strand displacement amplification. In some embodiments, RCA may generate a concatemer as an amplification product, wherein the concatemer contains multiple copies of the circularized modified recognition element, including associated codes, target recognition regions, and any other functional sequences that are included in the circular modified recognition element. In some embodiments, RCA may be performed while the circularized recognition element is in solution. In some embodiments, RCA may be performed on a circularized recognition element while the circularized recognition element is immobilized, either reversibly or non-reversibly, on a solid substrate or surface. In some embodiments, RCA is performed on a modified recognition element that is still hybridized to the target molecule. The terms “solid substrate” and “solid surface” may be referred to herein as a surface or substrate. In some embodiments, the substrate is a bead, a flow cell, a microwell, or a nanowell. In some embodiments, the substrate is coated with a composition that enhances target molecule immobilization. In some embodiments, the substrate is charged. In some embodiments, the substrate is positively charged or negatively charged. In some embodiments, the substrate is an anionic substrate. In some embodiments, the substrate is a cationic substrate. In some embodiments, the substrate comprises an immobilization composition, such as polyacrylamide, branched PEI, linear PEI, poly(β-aminoester) and poly(amidoamine), PEG, a gel, poly-L-lysine, silane, agarose, muscle mimetic catecholamine polymer, and the like. In some embodiments, the substrate has no charge.
[108] FIG. 6, for example, shows a schematic diagram illustrating RCA amplification of a recognition element to yield a concatemeric amplification product. As illustrated in FIG. 6, rolling circle amplification using primer 616b that is complementary to recognition element sequence 616 is hybridized to circular modified recognition element 625 and used to initiate the RCA reaction to generate an amplification product 630. Amplification product 630 is a polymeric concatemeric molecule that includes multiple repeated copies of circular modified recognition
element 625, wherein each copy includes primer 616, code 614, a functional sequence 612, target recognition regions, and a second functional sequence 618. In this example, the complement of modified recognition element 625 is indicated by the dashed line. In some embodiments, one or more functional sequences which may be included in a recognition element include, but are not limited to, a unique molecular identifier (UMI) sequence, a sequencing primer sequence, an index sequence, a restriction endonuclease sequence, a cleavage sequence, a unique molecular identifier, or combinations thereof. An RCA reaction may be performed in the presence of the cationic polymer coated surface, resulting in simultaneous immobilization and amplification of an amplification product. RCA primers may be supplied in solution or bound to the cationic polymer-coated surface prior to, or concurrent with, performing the RCA reaction.
[109] In some embodiments, amplification may include on-surface polymerase chain reaction (PCR), isothermal amplification, RCA, or a combination thereof. In some embodiments, amplification may include polymerase chain reaction (PCR). In some embodiments, PCR is multiplexed PCR. The amplification methods disclosed herein may include isothermal amplification. Non-limiting examples of isothermal amplification include Nicking endonuclease amplification reaction (NEAR), Transcription mediated amplification (TMA), Loop-mediated isothermal amplification (LAMP), Helicase-dependent amplification (HDA), Nucleic Acid Sequence Based Amplification (NASBA), Strand displacement amplification (SDA), Multiple Displacement Amplification (MDA), Rolling Circle Amplification (RCA), bridge amplification, or Ramification (RAM) amplification method. In some embodiments, the amplification method is provided in Fakruddin M, Mannan KS, Chowdhury A, Mazumdar RM, Hossain MN, Islam S, Chowdhury MA. Nucleic acid amplification: Alternative methods of polymerase chain reaction. J Pharm Bioallied Sci. 2013 Oct;5(4):245-52, which is hereby incorporated by reference in its entirety.
Codes or Hypercodes
[110] The methods described herein relate to the use of a hypercode, also known as a code, as part of a recognition element. The terms “hypercode” and “code” are used interchangeably herein. In some embodiments, in the presence of the complementary sequence, a recognition element hybridizes at the 5’ and 3’ ends to the complementary sequences in a target of interest. The incorporation of a hypercode or code into a recognition element provides a unique way to identify the originally hybridized target sequences of interest to the recognition element, thereby serving as a proxy for the presence of the target of interest in a sample. In some embodiments, the hypercode is selected from a set of codes wherein the set of codes make up a code space. In some
embodiments, the hypercode is associated with one or more target nucleic acid molecules. In some embodiments, the hypercode comprises a plurality of segments, where each segment corresponds to one or more computational states that are used in a decoding process of the present disclosure. In some embodiments, the decoded hypercodes may be used as surrogates or proxies of target molecules, thereby serving as an indirect analysis of the presence of a target molecule from a sample as the codes correlate with the presence of a target molecule that hybridized to a recognition element.
[111] In some embodiments, the hypercode is associated with one or more target nucleic acid molecules. For example, as shown in FIG. 2, the code (or hypercode) may be associated with one target nucleic acid molecule. In some embodiments, the code is associated with two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, nine or more, or 10 or more target nucleic acid molecules. In some embodiments, the code is associated with 10 or less, nine or less, eight or less, seven or less, six or less, five or less, four or less, three or less, or two or less target nucleic acid molecules.
[112] In some embodiments, the code present on the recognition element is selected from a set of codes which comprise a codespace. In some embodiments, each code from the set of codes may be from a predetermined set of codes. In some embodiments, each code from the set of codes may be selected to ensure that the selected code differs from other codes in the set of codes. As such, in some embodiments, several selection criteria may be implemented to generate a set of codes. In some embodiments, selection of the codes may comprise a Hamming distance.
[113] In some embodiments, to generate a code selected from a set of codes for use in a recognition element, a Hamming distance (HD) selection criterion may be implemented between any two codes of the set of codes. A Hamming distance between two codes in a set of codes may refer to the number of states that differ between two codes in the set of codes. In essence, the Hamming distance measures the number of changes that would need to be made to a first code to change the string of states to the second code. As such, the codes cannot have a Hamming distance greater than the length of the code. For example, if the length of a code, being the number of cycles or flows of decoding runs, is eight, and each cycle or flow corresponds to one state (and therefore eight states), then the maximum Hamming distance is eight. In some embodiments, the Hamming distance may be a minimum Hamming distance. In some embodiments, the Hamming distance may be a maximum Hamming distance. In some embodiments, a minimum Hamming distance may be from about 2-10. In some embodiments, the Hamming distance is between 2-7. In some embodiments, the Hamming distance is between 3-5. FIG. 20B is an example of a table showing the number of codes that would satisfy a Hamming distance (HD) of 3, 4 or 5 given a) the number of detection colors or color states (in this example, four color channels for detection), and b) the number of times the codes are queried (e.g., flowed) by detection oligonucleotides or detection polynucleotide complexes. As can be seen, as the HD increases the number of codes that can be used decreases, whereas as the number of flows that a code is exposed to increases the number of codes that can be used increases, thereby demonstrating a few of the parameters as listed in FIG. 3 that need to be taken into account when defining codes for use in recognition elements for target identification.
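The Hamming distance criterion can be made concrete with a short sketch. Codes are represented here as strings of state labels (for example "1" to "4" for four color states), which is an illustrative choice and not a representation prescribed by the disclosure; the function names are hypothetical.

```python
from itertools import combinations


def hamming(code_a, code_b):
    """Number of flow positions at which two equal-length codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("Hamming distance requires codes of equal length")
    return sum(a != b for a, b in zip(code_a, code_b))


def min_pairwise_hamming(codes):
    """Smallest Hamming distance between any two codes in a set."""
    return min(hamming(a, b) for a, b in combinations(codes, 2))


def satisfies_min_hd(codes, min_hd):
    """True if every pair of codes differs in at least `min_hd` positions."""
    return min_pairwise_hamming(codes) >= min_hd
```

For eight-flow codes the distance can never exceed eight, which is reached when two codes differ at every flow.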
[114] The code may have a certain length in nucleotides. In some embodiments, the code has a length of greater than or equal to about three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 contiguous nucleotides. In some embodiments, the code has a length of fewer than or equal to about 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, or 10 contiguous nucleotides. In some embodiments, the length is about 5 to about 200, about 10 to about 150, about 15 to about 100, about 20 to about 90, or about 30 to about 80 contiguous nucleotides.
[115] In some embodiments, each code from a set of codes is generated using a 4-ary nucleotide alphabet of A, C, G, and T. In some embodiments, each code of a set of codes is generated using a 3-ary nucleotide alphabet of a set of three of A, C, G, and T. In some embodiments, the codes can be generated from arbitrary states 1 to 4, corresponding to the fluorophores that are associated with a unique string of oligonucleotides. The numbers are used in the abstract but serve as a means to enumerate colors or mixes of colors that are utilized to query the codes for decoding.
[116] The flexibility of using recognition elements with hypercodes allows for assay optimization on various levels. For example, the number of detection events can be reduced, thereby lowering costs and runtimes for low-plexity assays (e.g., quantifying a small number of targets of interest over a maximal dynamic range), or increased when higher plexity assays are needed. Further, detection can be performed with fewer fluorescent moieties and more queries to maximize signal-to-noise and minimize error rates, or conversely with more fluorescent moieties to further reduce data output time. The present disclosure demonstrates the scalability of the disclosed methods, with over 10,000 hypercodes and measurements of absolute analyte concentration spanning up to at least a 10 fold dynamic range with sensitivities as low as 1 fM.
Codespace Design
[117] Each code or hypercode used in a recognition element is generated based on a codespace design. A “codespace” is a collection of “codewords” that can be used to uniquely identify a recognition element in an assay and is based on colors and/or their combination. Each codeword comprises a string of states drawn from a collection of possible states, for example a four-color state codespace is encoded using a number of different fluorescent moieties, in this example four different fluorescent moieties, which emit at different wavelengths.
[118] Given s states or colors (corresponding to fluorescence states or different emission spectra) and F readout flows or detection cycles, there can be s^F possible codewords or different possible color state combinations. The present disclosure reports data corresponding to s=4 and F=8, so 4^8 (65,536) potential codewords. However, the number of flows or detection events or cycles can be decreased (for faster data output) or increased (for higher target plexity and/or higher minimum Hamming distance between codewords). A strength of decoding by hybridization as described herein is that the number of states or colors s can be increased by using more fluorescent moieties for detection events, combinations of fluorescent moieties, or fluorescence levels, for example where each amplification product is detected with only one of several colors that maximizes ease of decoding with high signal-to-noise ratio (SNR). As such, the number of distinctive hypercodes can scale with the number of cycles and resolvable optical signatures at each cycle thereby expanding the potential assay complexity.
[119] The list of possible codewords can be filtered using heuristics derived from data output. For example, low-complexity codewords that include long runs of a single color state can be excluded, out of concern that such codewords may be more vulnerable to being misread through biochemistry or optical artifacts. For example, a single color state could be given the number 1, and the fluorescent moiety could be FITC, such that using the same color multiple times in a row for detection events would result in a state profile of 1111111. Such a long run of the same color detection could be misread by an instrument thereby misidentifying a target of interest.
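A minimal sketch of codeword enumeration and low-complexity filtering follows. Codewords are tuples of integer states 1..s, one per flow. The run-length threshold `max_run` is a hypothetical parameter chosen for illustration; the disclosure does not state which run length is treated as too long.

```python
from itertools import groupby, product


def longest_run(codeword):
    """Length of the longest run of identical consecutive states."""
    return max(len(list(group)) for _, group in groupby(codeword))


def valid_codewords(num_states, num_flows, max_run):
    """Enumerate all codewords, dropping those with a run longer than `max_run`.

    Without filtering there are num_states ** num_flows codewords,
    e.g. 4 ** 8 = 65,536 for four color states and eight flows.
    """
    states = range(1, num_states + 1)
    return [cw for cw in product(states, repeat=num_flows)
            if longest_run(cw) <= max_run]
```

Setting `max_run` equal to the number of flows disables the filter and returns the full set, which is a convenient check that the enumeration itself is complete.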
[120] Given the collection of codewords of different fluorescent profiles that are unique, a codespace can be generated, wherein the codespace comprises a collection of codewords with a minimum pairwise distance threshold. Selecting a maximum-size codespace (enabling the largest possible assay plexity) can be challenging since, for example, enumerating all codeword combinations can become computationally intractable for nontrivial cases. As such, one strategy is to heuristically generate multiple codespaces and select the candidate codespace with the largest number of codewords. One strategy in designing a codespace is to begin with an empty
list and an available list of all valid codewords. One codeword can be selected at random from the list of available codewords and added to the empty list. Any candidates whose Hamming distance from the chosen codeword is smaller than a chosen cutoff can be removed from the list of available codewords. This selection step can be repeated until no more codewords can be added to the previously empty list. As such, the codespace generation process can be repeated many times to generate many potential codeword candidates.
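The random construction strategy above can be sketched as follows. The function names, the use of a seeded `random.Random`, and the `attempts` parameter are illustrative assumptions; the sketch keeps the largest codespace found over repeated runs, as the text suggests.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def random_codespace(codewords, min_hd, rng):
    """Greedy random construction of a codespace with pairwise distance >= min_hd."""
    available = list(codewords)
    chosen = []                      # starts as the empty list
    while available:
        pick = rng.choice(available)
        chosen.append(pick)
        # Drop every candidate closer than the cutoff to the new codeword
        # (this also removes the picked codeword itself, at distance 0).
        available = [cw for cw in available if hamming(cw, pick) >= min_hd]
    return chosen


def best_codespace(codewords, min_hd, attempts, seed=0):
    """Repeat the random construction and keep the largest codespace found."""
    rng = random.Random(seed)
    return max((random_codespace(codewords, min_hd, rng) for _ in range(attempts)),
               key=len)
```

Each run ends only when no remaining codeword is far enough from all chosen ones, so every result is maximal in the sense that nothing more can be added, although it is not guaranteed to be the largest possible codespace.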
[121] In an alternative strategy, codeword sets can be generated by deliberately choosing “snug codewords” which are as close as possible to the codewords already in the set, without violating the Hamming distance cutoff. The alternative strategy is similar to the random construction strategy; the difference lies in how the next available codeword is selected. For example, for each available codeword, the number of already-selected codewords at the minimum allowable Hamming distance is tracked. The next codeword is chosen at random from the subset of available codewords with the largest number of nearest neighbors. A snug codeword based codespace design strategy can yield higher codespace size (e.g., 1,166 valid codewords for a four-color state eight-flow data output) compared to the random selection strategy (e.g., 966 valid codewords).
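One way to read the snug-codeword strategy is sketched below. It is an interpretation under stated assumptions: a "nearest neighbor" is taken to be an already-selected codeword at exactly the minimum allowable Hamming distance, and ties are broken uniformly at random. The function name and data layout are hypothetical.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def snug_codespace(codewords, min_hd, rng):
    """Greedy construction that prefers candidates packed tightly against the set."""
    # Map each available codeword to the number of already-selected codewords
    # lying at exactly the minimum allowable distance from it.
    available = {cw: 0 for cw in codewords}
    chosen = []
    while available:
        most = max(available.values())
        snug = [cw for cw, count in available.items() if count == most]
        pick = rng.choice(snug)
        chosen.append(pick)
        remaining = {}
        for cw, count in available.items():
            dist = hamming(cw, pick)
            if dist >= min_hd:       # still compatible with every chosen codeword
                remaining[cw] = count + (dist == min_hd)
        available = remaining
    return chosen
```

As with the random strategy, the construction can be repeated with different seeds and the largest result kept; this sketch makes no claim about reproducing the codespace sizes reported in the text.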
[122] The implementation of the codespace generation strategies described herein evaluated 10,000 candidate four-state/eight-flow codespaces in 3.7 hours on a 3.6 GHz desktop PC, with the maximum codespace size obtained by the 4721st iteration after 1.75 hours. Runtime increases for still larger state or flow counts but can readily be accelerated by using additional threads. FIG. 20B demonstrates the codespace sizes for variable numbers of flows with variable Hamming distances (HD) of three, four or five when using four color states.
Hypercode design
[123] Once the codespace is defined, sequences for each hypercode can be applied to the codewords in the codespace for generating unique hypercodes for each recognition element. Each hypercode is a unique nucleotide sequence, comprising a number of short distinct segments (e.g., nucleic acid segments), which are unique in their locational position in a recognition element, and which correlate with the 5’ and 3’ ends of the recognition element, which are in turn specific for a target of interest in a biological sample. As such, a hypercode can be used as an indirect surrogate or proxy for the presence of a target of interest, or the absence thereof. Collectively, all of the possible hypercode sequences that could be incorporated into recognition elements is referred to as the hypercode space. The design of a highly sensitive and specific hypercode space comprises careful selection of nucleic acid segments with favorable biochemical properties. A
nucleic acid segment of a hypercode hybridizes specifically to an anchor oligonucleotide that in turn hybridizes to a detection oligonucleotide (thereby generating a detection polynucleotide complex), while exhibiting low affinity for hybridizing to other anchor oligonucleotide sequences that are used to detect other nucleic acid segments of a hypercode.
[124] Given p segment positions each of nucleotide length L in a recognition element, there can be 4^(pL) possible hypercodes (e.g., if there are four positions in the hypercode to be filled). In some embodiments, there can be fewer or more positions to be filled in a hypercode, as desired for plexity. From this set of 4^(pL) possible hypercodes, a subset of sequences with favorable biochemical properties can be selected. As a first operation, segments that are anticipated to be vulnerable to readout failure due to mis-hybridization are removed from the subset. For example, nucleic acid elements with repeated nucleotide sequences such as AAAAAAAA are excluded because failure to dehybridize a detection polynucleotide could cause some other sequence, such as AAAAACGT, to be easily misread as the original sequence. As a second operation, to minimize off-target hybridization, the number of nucleic acid segments in a set can be further reduced by removing nucleic acid segments whose complements have a high predicted melting temperature (Tm) when hybridized to other nucleic acid segments used in the hypercode set.
[125] After reduction for biochemical suitability, the largest possible set of nucleic acid segments S is chosen such that the minimum Levenshtein distance (e.g., number of nucleotide positions that are different) between any pair of segments is larger than a determined cutoff value. From the resulting subset of nucleic acid segments, the set of hypercodes can be built, wherein one of N segments is chosen for each segment location in a hypercode. A nucleic acid segment for a given position is determined using the flows, or detection events, that hybridize to that position. For example, given F total flows and F/p flows per position, N = s^(F/p) nucleic acid segments at each position can be determined. The number of potential combinations constructed in this way is c = N^p. For example, with p = 4 and N = 4^(8/4) = 16, a set of 16^4 = 65,536 candidate hypercodes is possible, from which all the hypercodes or codes whose corresponding codewords fall within a designed codespace as previously described can be selected.
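The counting in this paragraph, and the Levenshtein distance it relies on, can be checked with a short sketch. The sketch assumes four color states, eight flows and four segment positions, a reading that reproduces the 16 segments per position and the 65,536 candidate hypercodes stated in the text. The function names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance: minimum number of substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        cur = [i]
        for j, char_b in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                        # deletion
                           cur[j - 1] + 1,                     # insertion
                           prev[j - 1] + (char_a != char_b)))  # substitution or match
        prev = cur
    return prev[-1]


def segments_per_position(num_states, num_flows, num_positions):
    """N: distinct segments needed at each position (one per state pattern)."""
    if num_flows % num_positions:
        raise ValueError("flows must divide evenly among positions")
    return num_states ** (num_flows // num_positions)


def candidate_hypercodes(num_states, num_flows, num_positions):
    """c = N ** p: number of hypercodes built from one segment per position."""
    n = segments_per_position(num_states, num_flows, num_positions)
    return n ** num_positions
```

Unlike a position-by-position comparison, the Levenshtein distance also counts insertions and deletions, so a segment shifted by one base (for example ACGT versus CGTA) has distance 2 even though all four aligned positions differ.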
Hypercode Segments
[126] In some embodiments, the hypercodes comprise one or more segments. For example, as shown in FIG. 2, the code comprises four segments.
[127] In some embodiments, the recognition elements provided herein comprise a code comprising one or more segments. The one or more segments, or the complements thereof,
within the code may be used as a proxy for detection of the target nucleic acid molecules recognized by the recognition element.
[128] The number of segments present in a code of a recognition element may be considered in the design of the recognition element, as shown in FIG. 3. The number of segments in a code of a recognition element helps to determine the nucleotide length of the recognition element. For example, a recognition element that includes a code comprising five segments may comprise a greater nucleotide length than a recognition element that includes a code of only two segments. As shown in FIG. 3, a recognition element with a larger nucleotide length may run up against synthesis limits and can lead to a greater risk of synthesis errors. Alternatively, a recognition element with a smaller nucleotide length may avoid synthesis limits and risks in synthesis errors (FIG. 3). A recognition element with a larger nucleotide length may include less space for other portions of the recognition element, such as the target recognition regions. A recognition element with a smaller nucleotide length may include more space for other portions of the recognition element, such as the target recognition regions or additional functional sequences.
[129] In some embodiments, the code comprises a number of segments. For example, FIG. 2 depicts a non-limiting recognition element with four segments that comprise the code. In some embodiments, the code comprises about 2 to about 10 segments. In some embodiments, the code comprises about 2 to about 8 segments. In some embodiments, the code comprises about 3 to about 5 segments. In some embodiments, the code comprises at least about 4 segments, at least about 5 segments, at least about 6 segments, at least about 7 segments, at least about 8 segments, at least about 9 segments, or at least about 10 segments.
[130] In some embodiments, each segment may comprise a length in nucleotides. In some embodiments, each segment may comprise a length of about 10 to about 30 nucleotides. In some embodiments, each segment may comprise a length of about 10 to about 25 nucleotides. In some embodiments, each segment may comprise a length of about 15 to about 20 nucleotides. In some embodiments, each segment may comprise a length of about 2 or more nucleotides, about 4 or more nucleotides, about 6 or more nucleotides, about 8 or more nucleotides, about 10 or more nucleotides, about 12 or more nucleotides, about 14 or more nucleotides, about 16 or more nucleotides, about 18 or more nucleotides, about 20 or more nucleotides, or about 22 or more nucleotides. In some embodiments, the segments in a code are of the same length. In some embodiments, the segments in a code are not the same length.
[131] As with a code in a recognition element, a Hamming distance selection criterion may be implemented between any two segments of a code. A Hamming distance between two segments
in a code refers to the number of states that differ between the segments. In essence, the Hamming distance measures the number of changes that would need to be made to a first segment to change the string of states to the second segment. In some embodiments, the Hamming distance may be a minimum Hamming distance. In some embodiments, the Hamming distance may be a maximum Hamming distance. In some embodiments, a minimum Hamming distance may be from about 2 to about 20, about 3 to about 19, about 4 to about 18, about 5 to about 17, about 6 to about 16, about 7 to about 15, about 8 to about 14, about 9 to about 13, or about 10 to about 12. In some embodiments, a minimum Hamming distance may be greater than or equal to about 2, greater than or equal to about 3, greater than or equal to about 4, greater than or equal to about 5, greater than or equal to about 6, greater than or equal to about 7, greater than or equal to about 8, greater than or equal to about 9, greater than or equal to about 10, greater than or equal to about 11, greater than or equal to about 12, greater than or equal to about 13, greater than or equal to about 14, greater than or equal to about 15, greater than or equal to about 16, greater than or equal to about 17, greater than or equal to about 18, greater than or equal to about 19, or greater than or equal to about 20.
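The Hamming distance criterion described above can be illustrated with a minimal sketch. The segment strings, the function names, and the threshold used here are hypothetical and chosen only for illustration; the sketch assumes that all segments being compared have the same length.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Count the positions at which two equal-length state strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_hamming(segments):
    """Smallest Hamming distance over all pairs of segments in a code."""
    return min(hamming_distance(a, b) for a, b in combinations(segments, 2))


def passes_min_distance(segments, threshold):
    """True if every pair of segments differs in at least `threshold` positions."""
    return min_pairwise_hamming(segments) >= threshold


# Hypothetical segments, not taken from the disclosure.
segments = ["ACGTACGT", "ACGTTGCA", "TGCAACGT"]

if __name__ == "__main__":
    print(min_pairwise_hamming(segments))
    print(passes_min_distance(segments, threshold=4))
```

A candidate set of segments would be accepted only if `passes_min_distance` returns true for the chosen minimum Hamming distance.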
[132] In some embodiments, a segment may serve as a primer for amplification. For example, a segment may comprise an amplification primer binding sequence for rolling circle amplification (RCA) for generating a plurality of amplification products. In some embodiments, a plurality of segments is present on the recognition elements provided herein. In some embodiments, the nucleotide or nucleic acid sequence of each segment corresponds to one or more computation states for performing a decoding process of the present disclosure. For example, one or more segments of a code may be detected with a first pool of detection polynucleotide complexes to produce one or more detectable binding complexes. In some embodiments, the one or more detectable binding complexes, once imaged, produce one or more optical signals. When all or substantially all segments of the code are detected by iteratively applying additional pools of detectable oligonucleotide complexes to the amplification products, a series of optical signals may be observed (e.g., a code profile). The application of detection oligonucleotides to amplification products for detection is called “flow” or “cycle”, wherein “flow” or “cycle” is the number of times a particular segment of an amplification product is queried, or the number of times detection oligonucleotides or detection polynucleotide complexes are flowed or cycled over an amplification product in order to detect a segment sequence. In some embodiments, one or more optical signals observed from querying an amplification product with detection polynucleotide complexes translates to one or more computational states such that each optical
signal may be used to decode a code. In some embodiments, the plurality of segments on the recognition element may correspond to at least three computational states. In some embodiments, the optical signal may be a color or a non-color. In some embodiments, the optical signal may be a combination of colors (e.g., when the detection polynucleotide complex comprises a plurality of detectable labels). In some embodiments, the computational states are numbers (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.). In some embodiments, each detection polynucleotide complex that comprises a detectable label comprises a state; as such, when four different fluorescent moieties are used as detectable labels, there are four states, 1 to 4, each corresponding to the emitted light wavelength of the detectable label. However, the number of states can be larger depending on the combination of detectable labels with each unique detection polynucleotide complex. For example, FIGs. 13A-B show an example of 16 unique detection polynucleotide complexes wherein each has one of four fluorescent moieties. As such, in this example there can be 16 computational states used for decoding if all 16 unique detection polynucleotide complexes are used to detect a corresponding amplification product. However, additional ways to increase the number of computational states for decoding include, but are not limited to, adding levels of identifiability associated with a particular detectable signal, such as whether a detectable signal is brighter or dimmer compared to a normal level of signal, or whether a combination of detectable colors is used to identify a particular nucleotide. As such, the number of computational states that could be used is only limited by practicality for any given assay.
[133] In some embodiments, the methods described herein may use a number of computational states. The number of computational states used in the methods and systems described herein may be considered in the design of the recognition elements (FIG. 3). For example, in some embodiments, a detection scheme using a larger number of computational states may lead to a larger code space, which may allow for a greater amount of information that may be detected. In some embodiments, a detection scheme using a smaller number of computational states may be limited in the amount of information that can be detected. In some embodiments, using a larger number of computational states may result in a faster detection process (less time to determine a target molecule compared to using a smaller number of computational states). In some embodiments, a detection scheme using a larger number of computational states may require greater instrument complexity, which may lead to potential drawbacks such as color crosstalk, wherein the computational states used in the detection scheme may become difficult to distinguish from other computational states. In some embodiments, a greater number of computational states may require that a more complex detection tool be used.
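The relationship between the number of computational states and the size of the code space can be sketched with simple arithmetic. The sketch below assumes, for illustration only, that each segment of a code independently takes one of the available states and that every combination is permitted; in practice a selection criterion such as a minimum Hamming distance would reduce the usable code space. The function names are hypothetical.

```python
def code_space_size(num_states: int, num_segments: int) -> int:
    """Upper bound on distinct codes when each segment takes one of num_states."""
    if num_states < 1 or num_segments < 0:
        raise ValueError("num_states must be >= 1 and num_segments >= 0")
    return num_states ** num_segments


def segments_needed(num_states: int, num_codes: int) -> int:
    """Fewest segments whose unrestricted code space holds num_codes codes."""
    if num_states < 2:
        raise ValueError("at least two states are needed to grow the code space")
    segments, size = 0, 1
    while size < num_codes:
        size *= num_states
        segments += 1
    return segments


if __name__ == "__main__":
    for states in (3, 4, 16):
        for segs in (2, 4, 5):
            print(states, segs, code_space_size(states, segs))
```

Under these assumptions, moving from 4 states to 16 states reduces the number of segments needed to cover 1,000 codes from five to three, which is consistent with the trade-off between code length and instrument complexity described above.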
[134] In some embodiments, three or more computational states may be used in the methods described herein. In some embodiments, the methods described herein may use one or more computational states, five or more computational states, 10 or more computational states, 15 or more computational states, 20 or more computational states, 25 or more computational states, 30 or more computational states, 35 or more computational states, 40 or more computational states, 45 or more computational states, or 50 or more computational states. In some embodiments, the methods described herein may use 50 or less computational states, 45 or less computational states, 40 or less computational states, 35 or less computational states, 30 or less computational states, 25 or less computational states, 20 or less computational states, 15 or less computational states, 10 or less computational states, or five or less computational states.
[135] In some embodiments, each segment of a code may correspond to a combination of computational states. In some embodiments, each segment may correspond to one or more computational states, two or more computational states, three or more computational states, four or more computational states, five or more computational states, six or more computational states, seven or more computational states, eight or more computational states, nine or more computational states, or 10 or more computational states. In some embodiments, each segment may correspond to 10 or less computational states, nine or less computational states, eight or less computational states, seven or less computational states, six or less computational states, five or less computational states, four or less computational states, three or less computational states, or two or less computational states. It is the combinations of detected signals that are used to build a code profile which can be decoded for identifying the presence of a target molecule.
Detection Polynucleotide Complexes
[136] The methods described herein may include introducing detection polynucleotide complexes or a detection oligonucleotide. The methods described herein may include introducing detection polynucleotide complexes or detection oligonucleotides to the circularized and amplified recognition elements. The methods described herein may include introducing a plurality of detection polynucleotide complexes to the plurality of concatemeric amplification products. Each detection polynucleotide complex may comprise a detection oligonucleotide and an anchor oligonucleotide.
[137] In some embodiments, the methods described herein may include introducing a plurality of detection polynucleotide complexes. In some embodiments, the methods described herein may include introducing a plurality of detection oligonucleotides. In some embodiments, the methods described herein may include introducing one or more, two or more, three or more, four or more,
five or more, six or more, seven or more, eight or more, nine or more, 10 or more, 15 or more, 20 or more, 25 or more, 50 or more, 100 or more, 200 or more, 300 or more, 400 or more, 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, or 1,000 or more detection polynucleotide complexes to an amplification product. In some embodiments, the methods described herein may include introducing 1,000 or less, 900 or less, 800 or less, 700 or less, 600 or less, 500 or less, 400 or less, 300 or less, 200 or less, 100 or less, 50 or less, 25 or less, 20 or less, 15 or less, 10 or less, nine or less, eight or less, seven or less, six or less, five or less, four or less, three or less, or two or less detection polynucleotide complexes to an amplification product.
Detection Oligonucleotide
[138] In some embodiments, each detection polynucleotide complex may comprise a detection oligonucleotide. In some embodiments, the detection oligonucleotide may comprise a portion comprising a detectable label (e.g., fluorescent molecule) and another portion configured to bind to at least a portion of an anchor oligonucleotide. In some embodiments, the portion configured to bind to the anchor oligonucleotide comprises a nucleic acid sequence complementary to a portion of the nucleic acid sequence of the anchor oligonucleotide. FIG. 10A shows a nonlimiting example of a structure of a detection oligonucleotide comprising a fluorescent molecule 1010 and a portion complementary to at least a portion of an anchor oligonucleotide 1020.
[139] The detection oligonucleotide may comprise various nucleotide lengths. In some embodiments, the detection oligonucleotide may comprise a length of about 5 to about 25 nucleotides. In some embodiments, the detection oligonucleotide may comprise a length of about 5 to about 20 nucleotides. In some embodiments, the detection oligonucleotide may comprise a length of about 5 to about 15 nucleotides. In some embodiments, the detection oligonucleotide may comprise a length of about 5 to about 10 nucleotides. In some embodiments, the detection oligonucleotide may comprise a length of about 5 to about 8 nucleotides.
[140] In some embodiments, the detection oligonucleotide may comprise a length of between about 5 to about 100 nucleotides, between about 10 to about 80 nucleotides, between about 20 to about 60 nucleotides, between about 30 to about 50 nucleotides, or between about 15 to about 30 nucleotides. In some embodiments, the detection oligonucleotide may comprise one or more detectable labels. In some embodiments, one or more detectable labels comprise a fluorescent moiety. The fluorescent moiety may emit in the red, far-red, near-red, yellow, green, blue, or ultraviolet wavelengths. In some embodiments, the fluorescent moiety comprises one or more of 6-FAM (6-carboxyfluorescein), JOE (6-carboxy-4',5'-dichloro-2',7'-dimethoxyfluorescein), TAMRA (6-carboxytetramethylrhodamine), 5-Cy5 (5-carboxyrhodamine), 5-Cy5.5 (5-carboxylic acid succinimidyl ester), 5-Cy7 (5-carboxyrhodamine), HEX (hexachlorofluorescein), Alexa Fluor 488 (AF488), Alexa Fluor 514 (AF514), Texas Red, Cyanine 3, Cyanine 5, Pacific Blue, Tetramethyl rhodamine, Oxazole Yellow, Atto647N, and Rhodamine 6G (R6G).
[141] FIG. 10A shows a non-limiting example of a structure of a detection oligonucleotide comprising a fluorescent molecule 1010. In some embodiments, the detection oligonucleotide may comprise one or more fluorescent moieties. In some embodiments, the detection oligonucleotide may comprise two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, nine or more, or 10 or more fluorescent moieties. In some embodiments, the detection oligonucleotide may comprise 10 or less, nine or less, eight or less, seven or less, six or less, five or less, four or less, three or less, or two or less fluorescent moieties.
[142] In some embodiments, the fluorescent moiety may comprise an organic dye, a biological fluorophore, a quantum dot, or a combination thereof. In some embodiments, the organic dye may comprise an organic molecule. In some embodiments, the organic dye may comprise a coumarin, a cyanine, a benzofuran, a quinoline, a quinazolinone, an indole, a benzazole, a borapolyazaindacene, a xanthene, or a combination thereof. The organic dye may correspond to a color. For example, the organic dye may correspond to a green, a yellow, a blue, an indigo, a red, an orange, a purple, a pink, a violet, or a combination thereof. In some embodiments, the organic dye may correspond to no color. In some embodiments, the organic dye may correspond to a black color. In some embodiments, the organic dye may correspond to a white color.
[143] In some embodiments, during imaging, the fluorophore may emit a color in the visible light spectrum. In some embodiments, the fluorophore may emit in a wavelength in the range between 400 nanometers (nm) and 900 nm. In some embodiments, the fluorophore may emit in a wavelength between about 400 nm and about 475 nm, about 475 nm and about 490 nm, about 490 nm and about 530 nm, about 530 nm and about 575 nm, about 575 nm and about 600 nm, about 600 nm and about 700 nm, or about 700 nm and about 800 nm. In some embodiments, the fluorophore may emit a wavelength of 400 nm or more, 425 nm or more, 450 nm or more, 475 nm or more, 500 nm or more, 525 nm or more, 550 nm or more, 575 nm or more, 600 nm or more, 625 nm or more, 650 nm or more, 675 nm or more, 700 nm or more, 725 nm or more, 750 nm or more, 775 nm or more, 800 nm or more, 825 nm or more, 850 nm or more, 875 nm or more, or 900 nm or more. In some embodiments, the fluorophore may emit a wavelength of 900 nm or less, 875 nm or less, 850 nm or less, 825 nm or less, 800 nm or less, 775 nm or less, 750 nm or less, 725 nm or less, 700 nm or less, 675 nm or less, 650 nm or less, 625 nm or less, 600
nm or less, 575 nm or less, 550 nm or less, 525 nm or less, 500 nm or less, 475 nm or less, 450 nm or less, 425 nm or less, or 400 nm or less.
[144] The wavelength of light that the fluorophore emits may correspond to a color on the visible spectrum. Examples of colors include, but are not limited to, green, blue, red, yellow, orange, pink, purple, or a combination thereof. For example, in some embodiments, a fluorophore emitting light in a wavelength between about 400 nm and about 475 nm may produce a purple color. In some embodiments, a fluorophore emitting in a wavelength between about 420 nm and about 530 nm may produce a blue color. In some embodiments, a fluorophore emitting in a wavelength between about 490 nm and about 575 nm may produce a green color. In some embodiments, a fluorophore emitting in a wavelength between about 530 nm and about 600 nm may produce a yellow color. In some embodiments, a fluorophore emitting in a wavelength between about 575 nm and about 750 nm may produce an orange color. In some embodiments, a fluorophore emitting in a wavelength between about 600 nm and about 800 nm may produce a red color.
[145] The methods described herein may use a number of detectable labels (e.g., fluorescent moieties). The detectable labels (e.g., fluorescent moieties) may be optically distinct. The number of optically distinct detectable labels used in the methods described herein may impact the amount of information that may be detected. For example, a detection scheme using a larger number of optically distinct detectable labels may allow for multiplexing of codes from the code space, which may allow for a greater amount of target molecule related information to be detected. In some embodiments, a detection scheme using a smaller number of optically distinct detectable labels may be limited in the amount of information that may be detected from an amplification product. In some embodiments, using a larger number of optically distinct detectable labels may lead to a detection process that identifies a target molecule in less time as compared to using a fewer number of optically distinct detectable labels when querying an amplification product. In some embodiments, a detection scheme using a larger number of optically distinct detectable labels may lead to greater instrument complexity, which may lead to fluorescence detection crosstalk, whereby the fluorescence emission spectra of the optically distinct fluorescent moieties may not yield distinct fluorescence signals. In some embodiments, a detection scheme using a greater number of optically distinct fluorescent moieties may require use of a more complex detection tool.
[146] In some embodiments, the detectable labels (e.g., fluorescent moieties) may be optically distinct. For example, FIG. 10B shows a set of four detection oligonucleotides (e.g., 1001, 1002, 1003, 1004) that are each optically distinct from one another. As such, detection oligonucleotides 1001, 1002, 1003, and 1004 emit different wavelengths of light when imaged. In some embodiments, the fluorescent moieties may not be optically distinct. For example, FIG. 10C shows a set of four detection oligonucleotides (e.g., 1005, 1005, 1005, and 1005) that are not optically distinct from one another. As such, detection oligonucleotides 1005, 1005, 1005, and 1005 emit similar wavelengths of light during imaging, resulting in target molecule data that is difficult to interpret.
Anchor Oligonucleotide
[147] In some embodiments, each detection polynucleotide complex may comprise an anchor oligonucleotide. In some embodiments, each anchor oligonucleotide comprises a portion that is complementary to at least a portion of a modified recognition element or an amplification product thereof, and another portion that is complementary to at least a portion of a detection oligonucleotide. FIG. 11A shows a non-limiting example of the structure of an anchor oligonucleotide comprising a portion that may be complementary to a detection oligonucleotide 1110 and a portion that may be complementary to an amplification product 1120. In some embodiments, the portion of the amplification product that the anchor oligonucleotide may be complementary to is a segment of a code. For example, as shown in FIG. 11A, the anchor oligonucleotide may be complementary to a segment 1115 of a code on an amplified recognition element.
[148] In some embodiments, the anchor oligonucleotide may comprise various nucleotide lengths. In some embodiments, the anchor oligonucleotide may comprise a length of about 20 to about 100 nucleotides. In some embodiments, the anchor oligonucleotide may comprise a length of about 30 to about 70 nucleotides. In some embodiments, the anchor oligonucleotide may comprise a length of about 40 to about 50 nucleotides. In some embodiments, the anchor oligonucleotide may comprise a length of about 10 to about 20 nucleotides.
[149] In some embodiments, the anchor oligonucleotide may comprise a length of one or more nucleotides, two or more nucleotides, three or more nucleotides, four or more nucleotides, five or more nucleotides, six or more nucleotides, seven or more nucleotides, eight or more nucleotides, nine or more nucleotides, 10 or more nucleotides, 15 or more nucleotides, 20 or more nucleotides, 25 or more nucleotides, 30 or more nucleotides, 35 or more nucleotides, 40 or more nucleotides, 45 or more nucleotides, 50 or more nucleotides, 55 or more nucleotides, 60 or more nucleotides, 65 or more nucleotides, 70 or more nucleotides, 75 or more nucleotides, 80 or more nucleotides, 85 or more nucleotides, 90 or more nucleotides, 95 or more nucleotides, or 100 or more nucleotides. In some embodiments, the anchor oligonucleotide may comprise 100 or less
nucleotides, 95 or less nucleotides, 90 or less nucleotides, 85 or less nucleotides, 80 or less nucleotides, 75 or less nucleotides, 70 or less nucleotides, 65 or less nucleotides, 60 or less nucleotides, 55 or less nucleotides, 50 or less nucleotides, 45 or less nucleotides, 40 or less nucleotides, 35 or less nucleotides, 30 or less nucleotides, 25 or less nucleotides, 20 or less nucleotides, 15 or less nucleotides, 10 or less nucleotides, nine or less nucleotides, eight or less nucleotides, seven or less nucleotides, six or less nucleotides, five or less nucleotides, four or less nucleotides, three or less nucleotides, or two or less nucleotides.
[150] FIG. 11B shows an example of a detection polynucleotide complex comprising a detection oligonucleotide 1130 and an anchor oligonucleotide 1140.
[151] Referring to FIG. 12B, the detection oligonucleotide and the anchor oligonucleotide form a detection polynucleotide complex. Still referring to FIG. 12B, the detection polynucleotide complex, when associated with the target molecule, forms a detectable binding complex that is suitable for detection using the imaging system of the present disclosure. A plurality of detectable binding complexes is formed when a pool of detection oligonucleotides, anchor oligonucleotides, or detection polynucleotide complexes are introduced to a plurality of target molecules from a sample. Upon formation of detectable binding complexes, a plurality of signals may be observed, wherein one or more signals correspond to a single detectable binding complex, thereby generating a signal profile for a code that can be decoded.
Detection Pools
[152] In some embodiments, the methods provided herein relate to providing detection polynucleotide complexes, detection oligonucleotides, or anchor oligonucleotides. In some embodiments, the detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides provided may be provided in one or more detection pools. In some embodiments, the methods herein may use one or more, two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, nine or more, ten or more, 15 or more, 20 or more, 25 or more, 30 or more, 40 or more, 45 or more, or 50 or more detection pools. In some embodiments, the methods herein may use 50 or less, 45 or less, 40 or less, 35 or less, 30 or less, 25 or less, 20 or less, 15 or less, ten or less, nine or less, eight or less, seven or less, six or less, five or less, four or less, three or less, or two or less detection pools.
[153] In some embodiments, each detection pool provided may comprise a number of detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides. In some embodiments, each detection pool may comprise two or more, three or more, four or more, five or more, ten or more, 15 or more, 25 or more, 50 or more, 100 or more, 150 or more, 250 or
more, 500 or more, 1,000 or more, 1,500 or more, 2,500 or more, or 5,000 or more detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides. In some embodiments, each detection pool may comprise 5,000 or less, 2,500 or less, 1,500 or less, 1,000 or less, 500 or less, 250 or less, 150 or less, 100 or less, 50 or less, 25 or less, 15 or less, ten or less, nine or less, eight or less, seven or less, six or less, five or less, four or less, three or less, or two or less detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides.
[154] As shown in FIG. 3, the number of detection pools and the number of detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides provided in each detection pool may be considered in the design of the recognition elements. In some embodiments, one advantage to using a smaller number of detection pools in the methods described herein may be to lower design costs. In some embodiments, one advantage to using a larger number of detection pools in the methods described herein for detection by hybridization may be a larger amount of information that may be detected. For example, a larger permutation space and a larger code space may allow for the simultaneous detection of a larger number of codes and therefore a larger number of target molecules.
[155] To identify a hypercode from a concatemeric amplification product, a segment is probed twice and sequentially with a unique set of detection polynucleotide complexes, also known as a read out cycle (FIG. 19A). An array of fluorescent images is generated in each of the four color channels, for example for 10 read out cycles. Subsequent image analysis and signal processing generates an intensity vector for each product, which can then be matched to the nearest consensus hypercode profile that is associated with the target of interest.
Imaging
[156] The methods described herein may include imaging. The methods described herein may include imaging a plurality of detectable binding complexes to obtain identifiable signals or states. In some embodiments, the signals or states are associated with the segments of a code for each amplification product. In some embodiments, the imaging is performed by an imaging system of the present disclosure.
[157] In some embodiments, the imaging may be conducted using an imaging system. The imaging system may comprise at the minimum a camera, a detector, an illuminator, a condenser, or a combination thereof. In some embodiments, the imaging may include images of fluorescence emission, luminescence, or a combination thereof. In some embodiments, the imaging systems comprise components or sub-systems of a larger system that may also include fluidics modules,
temperature control modules, translation stages, robotic fluid dispensing and/or microplate handling, processors or computers, instrument control software, data analysis and display software, etc. In some embodiments, the imaging system is a fluorescence imaging system. In some embodiments, the imaging may include fluorescent images from the fluorescent moieties present on the detector polynucleotide on the detection oligonucleotide complex.
[158] In some embodiments, the image may comprise fluorescence information from a variety of wavelengths. In some embodiments, the fluorescence information may comprise emission data from a wavelength from about 220 to about 830 nanometers (nm), about 230 to about 820 nm, about 240 to about 810 nm, about 250 to about 800 nm, about 260 to about 790 nm, about 270 to about 780 nm, about 280 to about 770 nm, about 290 to about 760 nm, about 300 to about 750 nm, about
310 to about 740 nm, about 320 to about 730 nm, about 330 to about 720 nm, about 340 to about 710 nm, about 350 to about 700 nm, about 360 to about 690 nm, about 370 to about 680 nm, about 380 to about 670 nm, about 390 to about 660 nm, about 400 to about 650 nm, about 410 to about 640 nm, about 420 to about 630 nm, about 430 to about 620 nm, about 440 to about 610 nm, about 450 to about 600 nm, about 460 to about 590 nm, about 470 to about 580 nm, about 480 to about 570 nm, about 490 to about 560 nm, about 500 to about 550 nm, about 510 to about 540 nm, about 520 to about 530 nm, or a combination thereof. In some embodiments, the fluorescence data may comprise emission data from a wavelength of about 220 nm, about 230 nm, about 240 nm, about 250 nm, about 260 nm, about 270 nm, about 280 nm, about 290 nm, about 300 nm, about 310 nm, about 320 nm, about 330 nm, about 340 nm, about 350 nm, about 360 nm, about 370 nm, about 380 nm, about 390 nm, about 400 nm, about 410 nm, about 420 nm, about 430 nm, about 440 nm, about 450 nm, about 460 nm, about 470 nm, about 480 nm, about 490 nm, about 500 nm, about 510 nm, about 520 nm, about 530 nm, about 540 nm, about 550 nm, about 560 nm, about 570 nm, about 580 nm, about 590 nm, about 600 nm, about 610 nm, about 620 nm, about 630 nm, about 640 nm, about 650 nm, about 660 nm, about 670 nm, about 680 nm, about 690 nm, about 700 nm, about 710 nm, about 720 nm, about 730 nm, about 740 nm, about 750 nm, about 760 nm, about 770 nm, about 780 nm, about 790 nm, about 800 nm, about 810 nm, about 820 nm, about 830 nm, or a combination thereof.
Image Analysis and Signal Processing
[159] The imaging of fluorescently labeled detection polynucleotide complexes for detection of circularized and amplified recognition elements can be performed on a fluorescent instrument. The decoding of the images begins with collectively analyzing the images of all the detection events across all of the fluorescent channels for an amplified product. Subsequent analysis
operations can involve intensity normalizations, feature identification, intensity extractions, and data conditioning operations. Minimizing optical, electrical, biochemical, thermal and motion-induced noise sources is preferable for optimizing decoding of the fluorescent detection profile of an amplified product. As such, the imaging pipeline can employ subpixel feature (e.g., amplified recognition element) detection methods. Subsequently, subpixel registration techniques can be applied to pinpoint the same amplified product in all readout flows and channels. Optical and biochemical imperfections, such as non-linear distortion and non-uniform illumination, can be corrected for in-situ or via prior calibrated correction factors. The extraction of fluorescent intensities can be accomplished by convolving the feature image raw pixels with an appropriately tuned extraction kernel. These computationally intensive processing operations can be implemented using, for example, C++ and Halide, thereby enabling pseudo-real-time processing.
[160] In some embodiments, following extraction the raw intensity values from the population of amplified recognition elements can undergo normalization. For example, mapping all intensities into a standard range to eliminate outliers, for example into the [0.5, 0.995]% range of all intensity values can be performed. Spatial variability in foreground and background intensities resulting from factors such as non-uniform illumination can be corrected for. For example, an image can be divided into a grid of sub-images, and high and low percentiles of intensities in each sub-region can be measured. Fluorescence intensities from the amplified products can be rescaled such that, after correction, the background and foreground intensities of the various subregions are comparable. To minimize edge effects across sub-regions, bilinear interpolation can be applied to the correction factors. In some embodiments, the addition of color-balanced control recognition elements can help ensure that all fluorescent color channels observe some bright objects in each readout flow, regardless of the true application sample plexity and the sample assayed. In some embodiments, color crosstalk caused by the overlap of emission spectra of different fluorophores used in the assay can be corrected.
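The mapping of raw intensities into a standard range can be sketched as percentile clipping followed by rescaling. This is a minimal illustration and not the disclosed implementation: the function names are hypothetical, and the default cut-offs (the 0.5th and 99.5th percentiles) are an assumed reading of the range stated above. The grid-based illumination correction and bilinear interpolation are not shown.

```python
def percentile(sorted_vals, q):
    """Linearly interpolated percentile q (0 to 100) of a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("no intensity values supplied")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def clip_and_rescale(intensities, low_q=0.5, high_q=99.5):
    """Clip intensities to the [low_q, high_q] percentiles and map them to [0, 1].

    The default percentiles are assumptions made for this sketch.
    """
    ordered = sorted(intensities)
    low = percentile(ordered, low_q)
    high = percentile(ordered, high_q)
    if high == low:
        return [0.0 for _ in intensities]
    return [(min(max(v, low), high) - low) / (high - low) for v in intensities]


if __name__ == "__main__":
    raw = [12.0, 15.0, 14.0, 900.0, 13.0, 16.0, 11.0, 0.1]
    print(clip_and_rescale(raw, low_q=10, high_q=90))
```

Clipping before rescaling keeps a few extreme values, such as saturated or dark pixels, from compressing the range occupied by the remaining intensities.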
[161] In some embodiments, analyzing crosstalk as a linear mixing operation and correcting for it by “unmixing” the observed intensities can be an effective approach for electrophoretic sequencing. The linear models can be measured for a particular combination of fluorescent dye moieties, excitation lasers, and filters used. Once the crosstalk between colors is quantified, it can be corrected by multiplying the received intensities by the inverse of the 4 × 4 matrix of color crosstalk coefficients. Finally, the intensity footprint of each amplified recognition element can be normalized to a unit norm.
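The linear unmixing step can be sketched as follows. The crosstalk coefficients in `CROSSTALK` are invented for illustration and would in practice be measured for a given combination of dyes, lasers, and filters; the function names are hypothetical, and the Euclidean norm is assumed for the final normalization because the norm is not specified above.

```python
import math


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("crosstalk matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matvec(m, v):
    """Multiply matrix m by column vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def unmix(observed, crosstalk):
    """Recover per-dye intensities, assuming observed = crosstalk * true."""
    return matvec(invert(crosstalk), observed)


def unit_norm(v):
    """Scale a vector to unit Euclidean length (assumed choice of norm)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


# Hypothetical 4 x 4 crosstalk coefficients (rows: channels, columns: dyes).
CROSSTALK = [
    [1.00, 0.20, 0.00, 0.00],
    [0.10, 1.00, 0.15, 0.00],
    [0.00, 0.10, 1.00, 0.25],
    [0.00, 0.00, 0.05, 1.00],
]

if __name__ == "__main__":
    observed = [0.2, 1.0, 0.1, 0.0]  # signal from the second dye leaking into neighbors
    print(unit_norm(unmix(observed, CROSSTALK)))
```

In a full pipeline the inverse would be computed once per instrument configuration and applied to the four channel intensities of every readout flow.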
Hypercode intensity profiles
[162] Even after conditioning the received intensity footprint into its normalized form, there may still exist signal variations such that, even without noise, the intensity of an amplified recognition element may not be a clean 0/1 sequence of intensity measurements. Instead, for example, amplified recognition elements of a given type k may share a common baseline readout intensity footprint $P_k = [p_1, p_2, \ldots, p_{5f}]$, referred to as a “hypercode profile”. To find the hypercode profile for amplified recognition element species k, the intensity footprints of all amplified recognition elements that are confidently assigned to type k using a basic "hard" decoding operation are averaged. The hypercode profiles can be refined using empirical intensity data, effectively applying one or more iterations of a clustering algorithm to the initial values. The intensity profiles can be scaled to unit norm.
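The profile refinement described above resembles a few iterations of a centroid-based clustering algorithm. The following Python sketch illustrates one way this could look. It is an assumption-laden illustration, not the disclosed implementation: the names are hypothetical, profiles are initialized from the ideal 0/1 design, every footprint is assigned to its nearest profile (no confidence threshold is applied), and the nearest profile is chosen by Manhattan distance.

```python
import numpy as np

def unit_rows(matrix):
    """Scale each row to unit L2 norm, leaving all-zero rows unchanged."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix / np.where(norms == 0, 1.0, norms)

def refine_profiles(footprints, ideal_profiles, n_iterations=3):
    """Learn empirical hypercode profiles from unit-norm footprints.

    footprints:     (n_features, d) normalized intensity footprints
    ideal_profiles: (n_codes, d) designed 0/1 profiles used as initial values
    """
    footprints = np.asarray(footprints, dtype=float)
    profiles = unit_rows(np.asarray(ideal_profiles, dtype=float))
    for _ in range(n_iterations):
        # Manhattan distance from every footprint to every current profile.
        dist = np.abs(footprints[:, None, :] - profiles[None, :, :]).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(profiles.shape[0]):
            members = footprints[labels == k]
            if len(members):
                profiles[k] = members.mean(axis=0)  # average assigned footprints
        profiles = unit_rows(profiles)
    return profiles
```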
[163] Intensity profiles can be used to decode amplified recognition elements, which can be used as an indirect indicator of whether a target of interest is present in a biological sample. With evidence that residual noise and errors in the intensities are distributed according to independent Laplacian probability distributions, a good approximation of the probability formula is:
$$\hat{c} = \arg\min_{c \in C} d_M(Y_{norm}, P_c)$$
where $d_M(Y, X)$ is the Manhattan distance between $Y$ and $X$, defined as
$$d_M(Y, X) = \sum_{i=1}^{5f} |y_i - x_i|$$
and $Y_{norm}$ is the normalized intensity footprint.
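The nearest-profile rule above can be written in a few lines. The sketch below, in Python with NumPy, uses hypothetical names and assumes that footprints and profiles are already normalized and stored as rows of equal length.

```python
import numpy as np

def decode_manhattan(footprints, profiles):
    """Assign each footprint to the profile with the smallest Manhattan distance.

    Returns (best_index, best_distance) for every footprint.
    """
    footprints = np.asarray(footprints, dtype=float)
    profiles = np.asarray(profiles, dtype=float)
    # dist[i, c] = sum_j |footprints[i, j] - profiles[c, j]|
    dist = np.abs(footprints[:, None, :] - profiles[None, :, :]).sum(axis=2)
    best = dist.argmin(axis=1)
    return best, dist[np.arange(len(best)), best]
```

Because the absolute difference grows linearly, a single outlying readout affects this distance less than it would affect a sum-squared distance.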
[164] In an effort to approach the problem stochastically, a matrix-variate Gaussian mixture model can be proposed to capture the correlation of the fluorescence signals across color channels and flows, wherein this model can be used in a Bayesian estimator called "PosTCode". Specifically, a prior distribution of the probabilities of targets is assumed. For example, the conditional probability of a given fluorescent signature over all channels and readout flows given a particular code in the codespace can be modeled as a Gaussian mixture with correlations over the channels and over the flows. The posterior probability of each codeword is then computed from the conditional probabilities and the prior distribution. The process can be iterated since the correlation matrices in the model also need to be estimated. The model can be complex, but also very powerful as long as the probability distributions involved are members of the exponential family. However, the model includes fitting a large number of parameters, particularly
as plexity increases, and empirical data may not always be a good match to the Gaussian mixture model. For the methods described herein, typically a more robust decode performance was obtained using the Manhattan distance, rather than the sum-squared distance, which would be appropriate for underlying Gaussian statistics.
[165] The number of distinctive hypercodes profiled can scale with the number of cycles and resolvable optical signatures at each cycle, thereby expanding the potential for assay complexity (FIG. 20C). In some embodiments, hypercodes were designed within the framework of a 4-state system, with a variable number of segments and a variable number of cycles for decoding the segments. In some embodiments, assay complexities comprising up to 12,000 hypercodes were achieved. In some embodiments, errors in determining a hypercode profile were mitigated by reducing the number of segments in a hypercode and enabling two separate optical signatures per segment (FIG. 15).
Soft decision decoding
[166] The methods described herein may relate to using an algorithm to predict the presence of target nucleic acid molecules in a sample. In some embodiments, the algorithm is a soft decision decoding algorithm. In some embodiments, the algorithm is applied to the detectable signals of the codes or amplified codes, a code profile, for predicting the presence of a target nucleic acid in a sample.
[167] The methods disclosed herein may comprise soft decision decoding to predict, or determine the probability of, the presence of the code in a recognition element or amplification product thereof, wherein the presence of the code correlates with, and serves as a proxy for, the presence of a target nucleic acid in a sample. In some embodiments, the methods described herein may use soft decision decoding.
[168] In some embodiments, the methods described herein may use hard decision decoding. For hard decision decoding, signals from queried concatemers are extracted from images. This is the same for soft decision decoding, in that signals that are generated and imaged are extracted from the images. For hard decision decoding, hard basecalls are generated from the intensities of the signals, whereas with soft decision decoding no hard basecalls are necessary as all of the signal range is retained. The code assignment for hard decision decoding is determined by matching the nucleotide reads to codes, whereas with soft decision decoding, the signals are cross-correlated against the expected signals and the most likely code is assigned. As such, soft decision decoding is a probabilistic methodology. When using soft decision decoding techniques, it is not necessary for the model to identify each base specifically. For example, signals (e.g., fluorescent signals)
generated during each cycle of a detection process may be detected and recorded to produce a data set that may be used as input into a model to calculate a probability that a specific code is present without requiring a hard decision decoding model. Although it is not necessary using a soft decision decoding model to make a hard decision about the identity of each nucleotide, a soft decision decoding model developed according to the methods of the disclosure may nevertheless include assigning a probability or identity to each nucleotide in the sequence of a code.
[169] The permutation space on a recognition element is the totality of factors that determines the number of unique nucleotide possibilities at each segment of a code. The factors of the code space comprise the number of segments present on the recognition element, the number of incubation periods or times a segment is queried with a detection pool (e.g., flows or cycles), and the number of computational states used in the methods and systems described herein.
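As a rough counting illustration of how these factors interact, the sketch below gives an upper bound on the number of distinct state sequences. It rests on an assumption not stated in the disclosure: that each readout of each segment independently resolves one of the available computational states, so that the bound is the number of states raised to the total number of readouts. The function names are hypothetical, and an error-correcting codeset would use only a subset of these sequences.

```python
def max_codes(n_states, n_segments, readouts_per_segment):
    """Upper bound on distinct state sequences under the independence assumption."""
    return n_states ** (n_segments * readouts_per_segment)

def min_readouts(n_states, n_codes):
    """Smallest total number of readouts whose upper bound covers n_codes."""
    if n_states < 2:
        raise ValueError("at least two states are needed to distinguish codes")
    readouts, capacity = 0, 1
    while capacity < n_codes:
        capacity *= n_states
        readouts += 1
    return readouts
```

Under this assumption, a 4-state system needs at least seven readouts in total before the bound (4^7 = 16,384) exceeds 12,000 sequences.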
[170] FIG. 22 details an example of a soft decision decoding algorithmic workflow for determining the presence of a code and hence the presence of a target nucleic acid from a sample. Images of the sample queried with detection polynucleotide complexes are acquired, aligned, and processed to extract the intensity of the features of interest across the imaged field of view in multiple spectral channels. The corrected intensities of said features are then fed through a series of algorithms that make up the soft decision decoder. First, the intensity profiles of the codes are learned based on features of high confidence or high intensity. Second, this trained model provides a template for each code against which the rest of the features of interest are compared. Third, a confidence score is computed from the difference between the intensity profile of each feature and the trained profiles. Several filters are applied to remove outliers, duplicates, and low confidence decoded concatemers. The final output is a table of decoded concatemers with associated filter status, confidence score, and most likely assignment to one of the codes of the codeset used in the recognition elements of the assay.
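The comparison, scoring, and filtering steps of such a workflow can be sketched as follows in Python with NumPy. The score used here (one minus the ratio of the best to the second-best Manhattan distance) and the threshold value are hypothetical choices made for illustration. The disclosure states only that the score derives from the difference between a feature's profile and the trained profiles. Outlier and duplicate filters are omitted, and at least two trained profiles are assumed.

```python
import numpy as np

def decode_table(footprints, profiles, code_names, min_score=0.5):
    """Build a table of decoded features with a confidence score and filter status."""
    footprints = np.asarray(footprints, dtype=float)
    profiles = np.asarray(profiles, dtype=float)
    dist = np.abs(footprints[:, None, :] - profiles[None, :, :]).sum(axis=2)
    order = np.argsort(dist, axis=1)
    rows = []
    for i in range(len(footprints)):
        best, runner_up = order[i, 0], order[i, 1]
        d_best, d_next = dist[i, best], dist[i, runner_up]
        # Hypothetical score: 1 when the feature lies on the best profile,
        # 0 when the two closest profiles are equally distant.
        score = 1.0 - d_best / d_next if d_next > 0 else 0.0
        rows.append({
            "feature": i,
            "code": code_names[best],
            "score": float(score),
            "passed": bool(score >= min_score),
        })
    return rows
```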
[171] In some embodiments, a recognition element comprises a larger code, for example a code with four segments instead of two or three. In some embodiments, a recognition element comprising a larger code may result in a detection scheme with improved error correction compared to a smaller code. Additionally, in some embodiments, a larger code may result in a lower signal-to-noise ratio.
[172] In some embodiments, a recognition element comprises a smaller code, for example a code with two segments. In some embodiments, a recognition element comprising a smaller code may result in a detection scheme with lower error correction abilities. Further, in some embodiments, a small code may result in a higher signal-to-noise ratio.
Error Correction for Profile Decoding
[173] Soft decision decoding comprises, for each readout cycle, the determination of the most likely fluorophore state. For example, if the sequence of states $c = (c_1, c_2, \ldots, c_f)$ perfectly matches a codeword, the concatenated amplification product is considered decoded with no mismatch. However, due to readout noise sources such as non-uniform illumination, or cross-hybridization, there may be errors in the states read out for particular amplification products and readout cycles. Traditional error correction enables correction of up to $x$ state errors in a codespace with minimum pairwise Hamming distance $2x+1$, where the sequence of states $c$ can be corrected to the unique valid codeword where at most $x$ states disagree with $c$. This state-by-state decoding can have limited accuracy. Knowledge about the probabilistic distribution of the intensities can be utilized to apply a more effective assignment and error correction. A probabilistic "soft" decoding assigns hypercodes as
$$\hat{c} = \arg\max_{c \in C} \Pr(c \mid Y)$$
where $Y = \{y_1, y_2, \ldots, y_{5f}\}$ is the entire $f$ cycle × 5 color intensity vector of a concatemeric amplification product. This formulation has the advantage that it can probabilistically weigh the effect of different noise sources and track statistical dependencies of intensities across readout cycles. For example, due to secondary structures that might form, a particular hypercode may exhibit brighter-than-average intensity in certain readout cycles and dimmer-than-average intensity in others.
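For comparison with the soft approach, the state-by-state correction described in this paragraph can be illustrated with a small Python sketch. The names and the example codebook are hypothetical. With a minimum pairwise Hamming distance of at least 2x+1, at most one codeword can lie within x mismatches of any read state sequence, which is what makes the correction unambiguous.

```python
def hamming(a, b):
    """Number of positions at which two equal-length state sequences differ."""
    return sum(s != t for s, t in zip(a, b))

def min_pairwise_distance(codebook):
    """Minimum Hamming distance over all pairs of codewords."""
    words = list(codebook)
    return min(hamming(a, b) for i, a in enumerate(words) for b in words[i + 1:])

def correct(states, codebook):
    """Return the unique codeword within x = (d_min - 1) // 2 mismatches, else None."""
    x = (min_pairwise_distance(codebook) - 1) // 2
    matches = [word for word in codebook if hamming(states, word) <= x]
    return matches[0] if len(matches) == 1 else None

# Hypothetical 4-state codebook (states 0-3), four readout cycles, d_min = 4.
CODEBOOK = ["0123", "1230", "2301"]
```

Here `correct("0120", CODEBOOK)` recovers `"0123"` from a single state error, while `correct("0100", CODEBOOK)` returns `None` because two states disagree with every codeword.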
Decode Performance
[174] To assess accuracy and performance of the integrated methods described herein which comprise the assay, detection instrument and detection/decoding operations, metrics were developed to, for example, estimate error rates and predict decoding confidence scores.
[175] Accurate decoding may rely on unambiguous and consistent classification of up to 2 million intensity vectors. Image analysis and signal processing may generate an intensity vector that is matched to the nearest consensus hypercode profile that is uniquely associated with its target analyte. Implementing larger expected intensity differences among hypercodes by design may improve decode performance; however, doing so may reduce assay plexity.
[176] To help increase decode performance, machine learning was implemented in the decoding algorithm to compensate for the systematic signal variations from hypercode profiles, termed “profile skew” (FIG. 19B). Hypercodes that exhibited larger profile skews benefited most from machine learning implementation, for example by increasing the number of decoded hypercode profiles (FIG. 15). Refining hypercode profiles using empirical intensity data also improved the confidence of assigning hypercode profiles to concatemeric amplification products, termed the “profile score”. The outcome was useful in decoding hypercodes that were dimmer (less fluorescent) and noisier (FIGs. 16A-C).
[177] FIG. 16A shows representative intensity profiles for 100 concatenated amplification products with a profile score of < 0.67 that decode to the hypercode with median skew. The learned intensity profile and the ideal intensity profile are superimposed and reflect the systematic deviations in the measured intensity associated with this hypercode. FIG. 16B shows representative intensity profiles for 100 concatenated amplification products with a profile score of < 0.71 that decode to the hypercode with high skew, and FIG. 16C shows representative intensity profiles for 100 concatenated amplification products with a profile score of < 0.65 that decode to the hypercode with high skew.
[178] Refining hypercode profiles using empirical intensity data also improved both the confidence of assigning a hypercode profile to a concatemeric amplification product (the “profile score”) and the total number of concatemeric amplification product hypercode profiles decoded (FIG. 19C).
[179] By implementing both empirical intensity profiles and a machine learning strategy, hundreds of thousands to millions of concatemeric amplification products were decoded across a 10-fold DNA input (FIG. 19D). The decoding error rate when utilizing a machine learning strategy was generally less than 1:10,000 (FIG. 19E). As such, the use of both empirical intensity profiles in decoding and an error-correcting codespace based on a machine learning strategy contributed to highly accurate hypercode profile determinations, and hence to accurate determination of the presence of a target of interest in a sample.
[180] By utilizing detection polynucleotide complexes, a simple hybridization based detection workflow eliminates the need for multi-component detection reagents and detection enzymatic chemistries. Detection schemes can be tuned to address assay requirements such as minimized decode error rates, reduced readout time, higher plexity and/or improved dynamic range quantification of a small number of targets of interest by varying the number of fluorescence states and cycles. Additionally, machine learning based hypercode decoding allows for systematic noise compensation and robust error correction beyond the biological limitations of ligases, while keeping decoding time low for high plexity panels.
Iteratively repeating operations
[181] The methods described herein may relate to iteratively repeating the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes. In some embodiments, the iterative repetition of the operations may be performed for each segment of a code.
[182] In some embodiments, iteratively repeating the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes may comprise a number of iterative repetitions. For example, the methods described herein may comprise about 2 to about 50 iterative repetitions, about 2 to about 10 iterative repetitions, about 2 to about 8 iterative repetitions, or about four iterative repetitions of the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes. In some embodiments, the method described herein may comprise one or more, two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, nine or more, 10 or more, 15 or more, 20 or more, 25 or more, 30 or more, 35 or more, 40 or more, 45 or more, or 50 or more iterative repetitions of the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes. In some embodiments, the method described herein may comprise 50 or less, 45 or less, 40 or less, 35 or less, 30 or less, 25 or less, 20 or less, 15 or less, 10 or less, nine or less, eight or less, seven or less, six or less, five or less, four or less, three or less, or two or less iterative repetitions of the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes.
[183] In some embodiments, the number of iterative repetitions of the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes may correspond to the number of segments present in a code of the recognition element. As described above, the code of the recognition element may comprise a number of segments.
[184] In some embodiments, each segment of the code of the recognition element may undergo a number of iterative repetitions of the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes. For example, the methods described herein may comprise iteratively repeating the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes two times per segment, three times per segment, or four times per segment. In some embodiments, the methods described herein may comprise iteratively repeating the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes two or more times per segment, three or more times per segment, four or more times per segment, five or more times per segment, six or more times per segment, seven or more times per segment, eight or more times per segment, nine or more times per segment, 10 or more times per segment, 11 or more times per segment, 12 or more times per segment, 13 or more times per segment, 14 or more times per segment, 15 or more times per segment, 16 or more times per segment, 17 or more times per segment, 18 or more times per segment, 19 or more times per segment, or 20 or more times per segment.
In some embodiments, the methods described herein may comprise iteratively repeating the operations of: (i) introducing detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides; (ii) forming detectable binding complexes; and (iii) imaging the detectable binding complexes 20 or less times per segment, 19 or less times per segment, 18 or less times per segment, 17 or less times per segment, 16 or less times per segment, 15 or less times per segment, 14 or less times per segment, 13 or less times per segment, 12 or less times per segment, 11 or less times per segment, 10 or less times per segment, nine or less times per segment, eight or less times per segment, seven or less times per segment, six or less times per segment, five or less times per segment, four or less times per segment, three or less times per segment, or two or less times per segment.
[185] As shown in FIG. 3, the number of iterative repetitions and the number of iterative repetitions per segment may be considered in the design of the recognition element. In some embodiments, a detection scheme comprising a smaller number of iterative repetitions and iterative repetitions per segment may require a greater number of detection pools and detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides per
detection pool, which may result in greater costs and resources needed for the detection scheme. In some embodiments, a detection scheme comprising a greater number of iterative repetitions per segment may allow for a greater amount of information that may be detected and for a larger code space. In some embodiments, a detection scheme comprising a greater number of iterative repetitions and iterative repetitions per segment may result in greater error correction capabilities. As shown in FIG. 4, the number of iterative repetitions, or cycles or flows, per segment may also influence the code space.
Samples
[186] In some embodiments, the methods described herein relate to providing a sample. The sample may be a biological sample. The sample may comprise target nucleic acid molecules. The target nucleic acid molecules may be used to detect a nucleotide sequence.
[187] In some embodiments, the sample may comprise a biological sample. In some embodiments, the sample may comprise whole blood, lymphatic fluid, serum, plasma, sweat, tears, saliva, sputum, cerebrospinal fluid, amniotic fluid, seminal fluid, vaginal excretion, serous fluid, synovial fluid, pericardial fluid, peritoneal fluid, pleural fluid, transudates, exudates, cystic fluid, bile, urine, gastric fluid, intestinal fluid, fecal samples, liquids containing single or multiple cells, liquids containing organelles, fluidized tissues, fluidized organisms, liquids containing multi-celled organisms, biological swabs, biological washes, or a combination thereof. In some embodiments, the sample may comprise whole blood. The whole blood may be venous blood or capillary blood (e.g., obtained by a fingerstick).
[188] In some embodiments, the sample may be from a subject. In some embodiments, the subject may be a mammal. The mammal may include a fox, bear, dog, cat, monkey, sheep, cow, or pig. The subject may be a human. The subject may be male. The subject may be female. The subject may be an adult (e.g., 18 years of age or older). The subject may be a child (e.g., less than 18 years of age). The subject may be a vertebrate. In some embodiments, a sample is from a eukaryote or a prokaryote. In some embodiments, the sample is a viral sample, for example a DNA virus or an RNA virus. In some embodiments, the sample is a microorganism and could be one or more of a pathogenic microorganism or a bioterrorism related microorganism. In some embodiments, the sample may be from a plant, for example from a crop plant species such as corn, soybeans, cotton, wheat, etc. In some embodiments, the sample may be from a non-crop plant species.
[189] The methods described herein may relate to providing one or more samples. In some embodiments, the methods may relate to providing one or more, two or more, three or more, four
or more, five or more, six or more, seven or more, eight or more, nine or more, 10 or more, 11 or more, 12 or more, 13 or more, 14 or more, 15 or more, 16 or more, 17 or more, 18 or more, 19 or more, or 20 or more samples. In some embodiments, the methods may relate to providing 25 or more, 50 or more, 75 or more, 100 or more, 125 or more, 150 or more, 175 or more, 200 or more, 225 or more, 250 or more, 275 or more, or 300 or more samples. In some embodiments, the methods may relate to providing 20 or less, 19 or less, 18 or less, 17 or less, 16 or less, 15 or less, 14 or less, 13 or less, 12 or less, 11 or less, 10 or less, nine or less, eight or less, seven or less, six or less, five or less, four or less, three or less, or two or less samples. In some embodiments, the methods may comprise providing 300 or less, 275 or less, 250 or less, 225 or less, 200 or less, 175 or less, 150 or less, 125 or less, 100 or less, 75 or less, 50 or less, or 25 or less samples.
[190] The sample that is provided may comprise nucleic acid molecules. In some embodiments, the sample that is provided may comprise a plurality of nucleic acid molecules.
[191] In some embodiments, the sample that is provided may comprise a plurality of target nucleic acid molecules. In some embodiments, the sample may comprise one or more target nucleic acid molecules, two or more target nucleic acid molecules, three or more target nucleic acid molecules, four or more target nucleic acid molecules, five or more target nucleic acid molecules, six or more target nucleic acid molecules, seven or more target nucleic acid molecules, eight or more target nucleic acid molecules, nine or more target nucleic acid molecules, 10 or more target nucleic acid molecules, 15 or more target nucleic acid molecules, 25 or more target nucleic acid molecules, 50 or more target nucleic acid molecules, 100 or more target nucleic acid molecules, 250 or more target nucleic acid molecules, 500 or more target nucleic acid molecules, 750 or more target nucleic acid molecules, or 1,000 or more target nucleic acid molecules. In some embodiments, the sample may comprise 1,000 or less target nucleic acid molecules, 750 or less target nucleic acid molecules, 500 or less target nucleic acid molecules, 250 or less target nucleic acid molecules, 100 or less target nucleic acid molecules, 50 or less target nucleic acid molecules, 25 or less target nucleic acid molecules, 15 or less target nucleic acid molecules, 10 or less target nucleic acid molecules, nine or less target nucleic acid molecules, eight or less target nucleic acid molecules, seven or less target nucleic acid molecules, six or less target nucleic acid molecules, five or less target nucleic acid molecules, four or less target nucleic acid molecules, three or less target nucleic acid molecules, or two or less target nucleic acid molecules.
[192] In some embodiments, the target nucleic acid molecules may include, but are not limited to, DNA or RNA. The target nucleic acid molecules may include DNA. The target nucleic acid
molecules may include RNA. The target nucleic acid molecules may include a combination of DNA and RNA. In some embodiments, the target nucleic acid molecules are fragments or components of DNA. In some embodiments, the target nucleic acid molecules are fragments or components of RNA. In some embodiments, the target nucleic acid molecules comprise complementary DNA or cDNA, transcribed from RNA (e.g., mRNA). In some embodiments, the target nucleic acid molecules comprise mRNA.
[193] The target nucleic acid molecules may comprise DNA. The DNA may be genomic DNA. The DNA may include one or more single nucleotide variants (SNVs), single nucleotide polymorphisms (SNPs), insertions/deletions (indels), copy number variants (CNVs), methylated nucleotides, or any combination thereof. In some embodiments, the DNA may include cell-free DNA (cfDNA). The cfDNA may include maternal cfDNA, fetal cfDNA, or combinations thereof, or cfDNA from a solid tumor. In some embodiments, the DNA may include circulating tumor cell DNA or ctcDNA. In some embodiments, the DNA may include a synthetic DNA target, such as a product of a polymerase chain reaction (PCR). In some embodiments, the DNA may be transcribed from single-stranded RNA templates, such as complementary DNA (cDNA) from a first strand or second strand synthesis reaction. The DNA may be PCR-amplified or RT-PCR amplified DNA or complementary DNA (cDNA). In some embodiments, DNA is genomic DNA or fragmented genomic DNA.
[194] The target nucleic acid molecules may comprise RNA. The RNA may include messenger RNA (mRNA). The mRNA may be a splice variant. The RNA may include microRNA (miRNA). The RNA may include pre-miRNA. The RNA may include pri-miRNA. The RNA may include mRNA. The RNA may include pre-mRNA. The RNA may include viral RNA. The RNA may include viroid RNA. The RNA may include virusoid RNA. The RNA may include circular RNA (circRNA). The RNA may include ribosomal RNA (rRNA). The RNA may include transfer RNA (tRNA). The RNA may include pre-tRNA. The RNA may include long non-coding RNA (lncRNA). The RNA may include small nuclear RNA (snRNA). The RNA may include circulating RNA. The RNA may include cell-free RNA. The RNA may include exosomal RNA. The RNA may include vector-expressed RNA. The RNA may include synthetic RNA.
SYSTEMS
[195] The systems disclosed herein relate to detecting a target nucleic acid molecule. In some embodiments, the systems comprise a plurality of recognition elements and a plurality of detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides. In some embodiments, the systems comprise a solid substrate configured to immobilize
detectable binding complexes. In some embodiments, the systems comprise a welled plate or a flow cell. In some embodiments, the systems comprise a fluid flow controller, a temperature controller, an imaging system, a computer system, or any combination thereof.
[196] In some embodiments, the systems comprise a plurality of recognition elements of the present disclosure. In some embodiments, the recognition element of the plurality comprises one or more target regions complementary to a corresponding target nucleic acid molecule in a sample. In some embodiments, the recognition element comprises a code of a set of codes, wherein the code is associated with one or more target nucleic acid molecules in the sample. In some embodiments, the code comprises a plurality of segments that correspond to one or more computational states of a set of computational states. In some embodiments, the plurality of detection polynucleotide complexes comprises a detection oligonucleotide and an anchor oligonucleotide, wherein a portion of each anchor oligonucleotide is complementary to a portion of a different segment of the plurality of segments; and another portion of the anchor oligonucleotide is complementary to at least a portion of the detection oligonucleotide.
[197] In some embodiments, the systems disclosed herein may include a solid substrate or a solid surface. The solid substrates and surfaces disclosed herein may be referred to as a substrate, a support, a solid support, or a surface. In some embodiments, the substrate may be configured to immobilize a detectable binding complex. In some embodiments, the substrate may be configured to immobilize circularized modified recognition elements, amplification products, or both.
[198] The substrate may be modified for attachment of a nucleic acid described herein. For example, the substrate may be modified for attachment of the amplification products described herein. Example solid substrates include, but are not limited to, glass, modified or functionalized glass, plastics, polysaccharides, nylon, nitrocellulose, ceramics, resins, silica, silica-based materials, carbon, metals, inorganic glasses, optical fiber bundles, and other polymers. In some embodiments, the plastic solid substrates may include acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, or polyurethanes. In some embodiments, the silica-based solid substrates may include silicon or modified silicon.
[199] In some embodiments, the substrate may be a welled plate. In some embodiments, the substrate may be a 96-well plate. In some embodiments, the substrate may be a 4-well plate, a 6-well plate, an 8-well plate, a 12-well plate, a 24-well plate, a 48-well plate, a 384-well plate, an 864-well plate, or a 1,536-well plate. In some embodiments, the substrate may have greater than
or equal to 96 wells. In some embodiments, the substrate may have less than or equal to 96 wells. In some embodiments, the substrate may be a flow cell. In some embodiments, the flow cell may have two or more lanes. In some embodiments, the flow cell may have two or less lanes. In some embodiments, the substrate may be a microarray, a slide, a chip, a microwell, a tube, a column, a particle, or a bead. In some embodiments, the substrate may be a microarray, such as a DNA microarray. In some embodiments, the substrate may be a paramagnetic bead.
[200] In some embodiments, the substrate may comprise a coating. In some embodiments, the coating may comprise a layer that may be charged. In some embodiments, the coating layer may be positively charged. In some embodiments, the coating layer may be negatively charged. In some embodiments, the coating may be non-charged. In some embodiments, the substrate may comprise a surface comprising a cation-coating layer. In some embodiments, the substrate may comprise a surface comprising an anion-coating layer. In some embodiments, the substrate may comprise a surface comprising a neutral-charged layer. In some embodiments, the substrate may be coated with streptavidin. In some embodiments, the substrate may be coated with avidin. In some embodiments, the substrate may be coated with one or more antibodies.
[201] In some embodiments, the systems disclosed herein may include a fluid flow controller, a temperature controller, an imaging system, a computer system, or any combination thereof.
Fluidics systems
[202] The systems disclosed herein may comprise a fluidics system. The fluidics system may comprise a fluid flow controller. In some embodiments, the fluid flow controller may comprise one or more pumps, valves, mixing manifolds, reagent reservoirs, waste reservoirs, or any combination thereof. In some embodiments, the fluidics system and subcomponents of the fluidics system are fluidically connected to the reaction vessel of the present disclosure. In some embodiments, the fluidics system and subcomponents of the fluidics system iteratively flow reagents (e.g., buffers, detection oligonucleotides, anchor oligonucleotides, detection polynucleotide complexes, etc.) into the reaction vessel. In some embodiments, the reaction vessel comprises a solid substrate configured to immobilize the modified recognition elements or amplification products thereof.
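By way of non-limiting illustration, the iterative flow of reagents into the reaction vessel might be planned in software as an ordered list of commands that a fluid flow controller could then carry out. The sketch below is an assumption made for illustration only: the names FlowCommand and build_flow_plan, the reagent order, and the number of rounds are hypothetical and are not taken from the present disclosure, and no pump or valve hardware is addressed.

```python
from typing import List, NamedTuple, Sequence


class FlowCommand(NamedTuple):
    """One planned delivery of a reagent to the reaction vessel (hypothetical)."""
    round_number: int
    reagent: str


def build_flow_plan(reagents_per_round: Sequence[str], rounds: int) -> List[FlowCommand]:
    """Repeat the same reagent order for each round and return the full plan."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    return [
        FlowCommand(round_number, reagent)
        for round_number in range(1, rounds + 1)
        for reagent in reagents_per_round
    ]


# Illustrative reagent order only; the disclosure does not fix an order.
plan = build_flow_plan(
    ["buffer", "anchor oligonucleotide", "detection oligonucleotide"],
    rounds=2,
)
for command in plan:
    print(f"round {command.round_number}: flow {command.reagent}")
```

In such a sketch, the plan is kept separate from any hardware driver so that the same sequence could be reviewed or logged before being executed.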
Temperature systems
[203] The systems disclosed herein may comprise a temperature system. The temperature system may comprise a temperature controller. The temperature controller may be incorporated into the systems described herein to facilitate accuracy of the methods and systems described herein. In some embodiments, the temperature controller may comprise temperature control components. Non-limiting examples of temperature control components include resistive heating elements, infrared light sources, heating or cooling devices, heat sinks, thermocouples, thermistors, or a combination thereof. In some embodiments, the temperature controller may provide changes in temperature over specified time intervals. In some embodiments, the temperature controller may provide an increase in temperature. In some embodiments, the temperature controller may provide a decrease in temperature. In some embodiments, the temperature controller may provide for cycling of temperatures between two or more set temperatures so that thermocycling or amplification may be performed. In some embodiments, the temperature controller may provide a constant temperature.
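By way of non-limiting illustration, cycling between two or more set temperatures over specified time intervals might be represented as a repeating list of holds, from which the set temperature at any elapsed time can be looked up. The sketch below is an assumption for illustration only: the names Step and setpoint_at are hypothetical, the set points and hold times are placeholders not taken from the present disclosure, and ramp time between set points is ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    """One hold in a temperature program (hypothetical structure)."""
    temperature_c: float
    hold_s: float


def setpoint_at(steps: List[Step], cycles: int, elapsed_s: float) -> Optional[float]:
    """Return the set temperature at elapsed_s, or None outside the program."""
    cycle_length = sum(step.hold_s for step in steps)
    if cycle_length <= 0:
        raise ValueError("the program must have a positive total hold time")
    if elapsed_s < 0 or elapsed_s >= cycles * cycle_length:
        return None
    # Position within the current cycle, then walk the holds in order.
    offset = elapsed_s % cycle_length
    for step in steps:
        if offset < step.hold_s:
            return step.temperature_c
        offset -= step.hold_s
    return None


# Placeholder two-temperature program repeated for three cycles.
program = [Step(95.0, 15.0), Step(60.0, 30.0)]
for t in (0, 15, 50, 135):
    print(t, setpoint_at(program, cycles=3, elapsed_s=t))
```

A constant-temperature embodiment corresponds to a program with a single step, and a controller loop could compare this set point against a thermocouple or thermistor reading.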
Imaging systems
[204] The systems disclosed herein may comprise an imaging system. In some embodiments, signals produced by the detectable binding complexes disclosed herein may be imaged by the imaging systems disclosed herein.
[205] The imaging system may comprise one or more light sources, one or more optical components, one or more filters, one or more imaging sensors for imaging and detection, or a combination thereof. In some embodiments, the one or more light sources may comprise light from a bulb. In some embodiments, the one or more optical components may comprise lenses, mirrors, prisms, optical filters, colored glass filters, narrowband interference filters, broadband interference filters, dichroic reflectors, diffraction gratings, apertures, optical fibers, optical waveguides, or a combination thereof. In some embodiments, the one or more imaging sensors may comprise a charge-coupled device (CCD) sensor or camera, a complementary metal-oxide-semiconductor (CMOS) imaging sensor or camera, a negative-channel metal-oxide semiconductor (NMOS) imaging sensor or camera, or a combination thereof.
Computing systems
[206] Various operations of the methods and systems disclosed herein may be performed by a computer system of the present disclosure. Referring to FIG. 23, a block diagram is shown depicting an exemplary machine that includes a computer system 2300 (e.g., a processing or computing system) within which a set of instructions can execute for causing a device to perform or execute any one or more of the aspects and/or methodologies for static code scheduling of the present disclosure. The components in FIG. 23 are examples only and do not limit the scope of use or functionality of any hardware, software, embedded logic component, or a combination of two or more such components implementing particular embodiments.
[207] Computer system 2300 may include one or more processors 2301, a memory 2303, and a storage 2308 that communicate with each other, and with other components, via a bus 2340. The bus 2340 may also link a display 2332, one or more input devices 2333 (which may, for example, include a keypad, a keyboard, a mouse, a stylus, etc.), one or more output devices 2334, one or more storage devices 2335, and various tangible storage media 2336. All of these elements may interface directly or via one or more interfaces or adaptors to the bus 2340. For instance, the various tangible storage media 2336 can interface with the bus 2340 via storage medium interface 2326. Computer system 2300 may have any suitable physical form, including but not limited to one or more integrated circuits (ICs), printed circuit boards (PCBs), mobile handheld devices (such as mobile telephones or PDAs), laptop or notebook computers, distributed computer systems, computing grids, or servers.
[208] Computer system 2300 includes one or more processor(s) 2301 (e.g., central processing units (CPUs), general purpose graphics processing units (GPGPUs), or quantum processing units (QPUs)) that carry out functions. Processor(s) 2301 optionally contain a cache memory unit 2302 for temporary local storage of instructions, data, or computer addresses. Processor(s) 2301 are configured to assist in execution of computer readable instructions. Computer system 2300 may provide functionality for the components depicted in FIG. 23 as a result of the processor(s) 2301 executing non-transitory, processor-executable instructions embodied in one or more tangible computer-readable storage media, such as memory 2303, storage 2308, storage devices 2335, and/or storage medium 2336. The computer-readable media may store software that implements particular embodiments, and processor(s) 2301 may execute the software. Memory 2303 may read the software from one or more other computer-readable media (such as mass storage device(s) 2335, 2336) or from one or more other sources through a suitable interface, such as network interface 2320. The software may cause processor(s) 2301 to carry out one or more processes or one or more steps of one or more processes described or illustrated herein. Carrying out such processes or steps may include defining data structures stored in memory 2303 and modifying the data structures as directed by the software.
[209] The memory 2303 may include various components (e.g., machine readable media) including, but not limited to, a random-access memory component (e.g., RAM 2304) (e.g., static RAM (SRAM), dynamic RAM (DRAM), ferroelectric random access memory (FRAM), phase-change random access memory (PRAM), etc.), a read-only memory component (e.g., ROM 2305), and any combinations thereof. ROM 2305 may act to communicate data and instructions unidirectionally to processor(s) 2301, and RAM 2304 may act to communicate data and instructions bidirectionally with processor(s) 2301. ROM 2305 and RAM 2304 may include any suitable tangible computer-readable media described below. In one example, a basic input/output system 2306 (BIOS), including basic routines that help to transfer information between elements within computer system 2300, such as during start-up, may be stored in the memory 2303.
[210] Fixed storage 2308 is connected bidirectionally to processor(s) 2301, optionally through storage control unit 2307. Fixed storage 2308 provides additional data storage capacity and may also include any suitable tangible computer-readable media described herein. Storage 2308 may be used to store operating system 2309, executable(s) 2310, data 2311, applications 2312 (application programs), and the like. Storage 2308 can also include an optical disk drive, a solid-state memory device (e.g., flash-based systems), or a combination of any of the above. Information in storage 2308 may, in appropriate cases, be incorporated as virtual memory in memory 2303.
[211] In one example, storage device(s) 2335 may be removably interfaced with computer system 2300 (e.g., via an external port connector (not shown)) via a storage device interface 2325. Particularly, storage device(s) 2335 and an associated machine-readable medium may provide non-volatile and/or volatile storage of machine-readable instructions, data structures, program modules, and/or other data for the computer system 2300. In one example, software may reside, completely or partially, within a machine-readable medium on storage device(s) 2335. In another example, software may reside, completely or partially, within processor(s) 2301.
[212] Bus 2340 connects a wide variety of subsystems. Herein, reference to a bus may encompass one or more digital signal lines serving a common function, where appropriate. Bus 2340 may be any of several types of bus structures including, but not limited to, a memory bus, a memory controller, a peripheral bus, a local bus, and any combinations thereof, using any of a variety of bus architectures. As an example, and not by way of limitation, such architectures include an Industry Standard Architecture (ISA) bus, an Enhanced ISA (EISA) bus, a Micro Channel Architecture (MCA) bus, a Video Electronics Standards Association local bus (VLB), a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCI-X) bus, an Accelerated Graphics Port (AGP) bus, a HyperTransport (HTX) bus, a serial advanced technology attachment (SATA) bus, and any combinations thereof.
[213] Computer system 2300 may also include an input device 2333. In one example, a user of computer system 2300 may enter commands and/or other information into computer system 2300 via input device(s) 2333. Examples of an input device(s) 2333 include, but are not limited to, an alpha-numeric input device (e.g., a keyboard), a pointing device (e.g., a mouse or touchpad), a touch screen, a multi-touch screen, a joystick, a stylus, a gamepad, an audio input device (e.g., a microphone, a voice response system, etc.), an optical scanner, a video or still image capture device (e.g., a camera), and any combinations thereof. In some embodiments, the input device is a Kinect, Leap Motion, or the like. Input device(s) 2333 may be interfaced to bus 2340 via any of a variety of input interfaces 2323 (e.g., input interface 2323) including, but not limited to, serial, parallel, game port, USB, FIREWIRE, THUNDERBOLT, or any combination of the above.
[214] In particular embodiments, when computer system 2300 is connected to network 2330, computer system 2300 may communicate with other devices, specifically mobile devices and enterprise systems, distributed computing systems, cloud storage systems, cloud computing systems, and the like, connected to network 2330. Communications to and from computer system 2300 may be sent through network interface 2320. For example, network interface 2320 may receive incoming communications (such as requests or responses from other devices) in the form of one or more packets (such as Internet Protocol (IP) packets) from network 2330, and computer system 2300 may store the incoming communications in memory 2303 for processing. Computer system 2300 may similarly store outgoing communications (such as requests or responses to other devices) in the form of one or more packets in memory 2303 and communicate them to network 2330 from network interface 2320. Processor(s) 2301 may access these communication packets stored in memory 2303 for processing.
[215] Examples of the network interface 2320 include, but are not limited to, a network interface card, a modem, and any combination thereof. Examples of a network 2330 or network segment 2330 include, but are not limited to, a distributed computing system, a cloud computing system, a wide area network (WAN) (e.g., the Internet, an enterprise network), a local area network (LAN) (e.g., a network associated with an office, a building, a campus or other relatively small geographic space), a telephone network, a direct connection between two computing devices, a peer-to-peer network, and any combinations thereof. A network, such as network 2330, may employ a wired and/or a wireless mode of communication. In general, any network topology may be used.
[216] Information and data can be displayed through a display 2332. Examples of a display 2332 include, but are not limited to, a cathode ray tube (CRT), a liquid crystal display (LCD), a thin film transistor liquid crystal display (TFT-LCD), an organic light-emitting diode (OLED) display such as a passive-matrix OLED (PMOLED) or active-matrix OLED (AMOLED) display, a plasma display, and any combinations thereof. The display 2332 can interface to the processor(s) 2301, memory 2303, and fixed storage 2308, as well as other devices, such as input device(s) 2333, via the bus 2340. The display 2332 is linked to the bus 2340 via a video interface 2322, and transport of data between the display 2332 and the bus 2340 can be controlled via the graphics control 2321. In some embodiments, the display is a video projector. In some embodiments, the display is a head-mounted display (HMD) such as a VR headset. In further embodiments, suitable VR headsets include, by way of non-limiting examples, HTC Vive, Oculus Rift, Samsung Gear VR, Microsoft HoloLens, Razer OSVR, FOVE VR, Zeiss VR One, Avegant Glyph, Freefly VR headset, and the like. In still further embodiments, the display is a combination of devices such as those disclosed herein.
[217] In addition to a display 2332, computer system 2300 may include one or more other peripheral output devices 2334 including, but not limited to, an audio speaker, a printer, a storage device, and any combinations thereof. Such peripheral output devices may be connected to the bus 2340 via an output interface 2324. Examples of an output interface 2324 include, but are not limited to, a serial port, a parallel connection, a USB port, a FIREWIRE port, a THUNDERBOLT port, and any combinations thereof.
[218] In addition, or as an alternative, computer system 2300 may provide functionality as a result of logic hardwired or otherwise embodied in a circuit, which may operate in place of or together with software to execute one or more processes or one or more steps of one or more processes described or illustrated herein. Reference to software in this disclosure may encompass logic, and reference to logic may encompass software. Moreover, reference to a computer-readable medium may encompass a circuit (such as an IC) storing software for execution, a circuit embodying logic for execution, or both, where appropriate. The present disclosure encompasses any suitable combination of hardware, software, or both.
[219] Those of skill in the art will appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality.
[220] The various illustrative logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
[221] The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by one or more processor(s), or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
[222] In accordance with the description herein, suitable computing devices include, by way of non-limiting examples, server computers, desktop computers, laptop computers, notebook computers, sub-notebook computers, netbook computers, notepad computers, set-top computers, media streaming devices, handheld computers, Internet appliances, mobile smartphones, tablet computers, personal digital assistants, video game consoles, and vehicles. Those of skill in the art will also recognize that select televisions, video players, and digital music players with optional computer network connectivity are suitable for use in the system described herein. Suitable tablet computers, in various embodiments, include those with booklet, slate, and convertible configurations, known to those of skill in the art.
[223] In some embodiments, the computing device includes an operating system configured to perform executable instructions. The operating system is, for example, software, including programs and data, which manages the device’s hardware and provides services for execution of applications. Those of skill in the art will recognize that suitable server operating systems include, by way of non-limiting examples, FreeBSD, OpenBSD, NetBSD®, Linux, Apple® Mac OS X Server®, Oracle® Solaris®, Windows Server®, and Novell® NetWare®. Those of skill in the art will recognize that suitable personal computer operating systems include, by way of non-limiting examples, Microsoft® Windows®, Apple® Mac OS X®, UNIX®, and UNIX-like operating systems such as GNU/Linux®. In some embodiments, the operating system is provided by cloud computing. Those of skill in the art will also recognize that suitable mobile smartphone operating systems include, by way of non-limiting examples, Nokia® Symbian® OS, Apple® iOS®, Research In Motion® BlackBerry OS®, Google® Android®, Microsoft® Windows Phone® OS, Microsoft® Windows Mobile® OS, Linux®, and Palm® WebOS®. Those of skill in the art will also recognize that suitable media streaming device operating systems include, by way of non-limiting examples, Apple TV®, Roku®, Boxee®, Google TV®, Google Chromecast®, Amazon Fire®, and Samsung® HomeSync®. Those of skill in the art will also recognize that suitable video game console operating systems include, by way of non-limiting examples, Sony® PS3®, Sony® PS4®, Microsoft® Xbox 360®, Microsoft Xbox One, Nintendo® Wii®, Nintendo® Wii U®, and Ouya®.
Non-transitory computer readable storage medium
[224] In some embodiments, the platforms, systems, media, and methods disclosed herein include one or more non-transitory computer readable storage media encoded with a program including instructions executable by the operating system of an optionally networked computing device. In further embodiments, a computer readable storage medium is a tangible component of a computing device. In still further embodiments, a computer readable storage medium is optionally removable from a computing device. In some embodiments, a computer readable storage medium includes, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, solid state memory, magnetic disk drives, magnetic tape drives, optical disk drives, distributed computing systems including cloud computing systems and services, and the like. In some cases, the program and instructions are permanently, substantially permanently, semipermanently, or non-transitorily encoded on the media.
Computer program
[225] In some embodiments, the platforms, systems, media, and methods disclosed herein include at least one computer program, or use of the same. A computer program includes a sequence of instructions, executable by one or more processor(s) of the computing device’s CPU, written to perform a specified task. Computer readable instructions may be implemented as program modules, such as functions, objects, Application Programming Interfaces (APIs), computing data structures, and the like, which perform particular tasks or implement particular abstract data types. In light of the disclosure provided herein, those of skill in the art will recognize that a computer program may be written in various versions of various languages.
[226] The functionality of the computer readable instructions may be combined or distributed as desired in various environments. In some embodiments, a computer program comprises one sequence of instructions. In some embodiments, a computer program comprises a plurality of sequences of instructions. In some embodiments, a computer program is provided from one location. In other embodiments, a computer program is provided from a plurality of locations. In various embodiments, a computer program includes one or more software modules. In various embodiments, a computer program includes, in part or in whole, one or more web applications, one or more mobile applications, one or more standalone applications, one or more web browser plug-ins, extensions, add-ins, or add-ons, or combinations thereof.
[227] In some embodiments, the computer programs described herein may be used to perform at least one function. The computer programs described herein may perform functions related to storing data, receiving data, analyzing data, exporting data, or a combination thereof. In some embodiments, the computer programs described herein may perform functions related to applying selection criteria, including in silico selection criteria, functional selection criteria, or a combination thereof. In some embodiments, the computer programs may receive sequence information, including sequence information for nucleic acid segments. The sequence information may be configured as an array, a table, a list, or a combination thereof. The sequence information may be formatted in a variety of ways, including, but not limited to, a .txt file, a FASTA file, an .xls file, or a combination thereof. The computer programs described herein may apply a selection criterion or selection criteria to a set of nucleic acid segments. The computer programs may sort the nucleic acid segments, determine or compute characteristics of the nucleic acid segments, perform calculations, reorder the nucleic acid segments, or a combination thereof. In some embodiments, the computer programs described herein may store information related to the nucleic acid segments. In some embodiments, the computer program may use stored information related to the nucleic acid segments to apply selection criteria to the nucleic acid segments. In certain embodiments, the computer program may receive information and/or data related to nucleic acid segments, selection criteria, or a combination thereof. In some embodiments, the computer programs may perform functions related to analyzing data from functional assays, including, but not limited to, functional assays described herein. In some embodiments, analyzing data from functional assays may comprise image analysis, image quantification, intensity quantification, feature identification, or a combination thereof. The computer programs described herein may also export information. In some embodiments, the exported information may comprise images, files, data tables, documents, folders, or a combination thereof.
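By way of non-limiting illustration, receiving sequence information in FASTA format, computing a characteristic of each nucleic acid segment, applying in silico selection criteria, and sorting the retained segments might be sketched as follows. The particular characteristic (GC fraction), the length and GC thresholds, the segment names, and the function names parse_fasta, gc_fraction, and select_segments are assumptions chosen for illustration; the present disclosure does not specify them.

```python
from typing import Dict, Iterable, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_name: sequence} mapping."""
    records: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # Fall back to a positional name if the header line is empty.
            name = header[0] if header else f"record_{len(records) + 1}"
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def select_segments(
    records: Dict[str, str], min_length: int, gc_min: float, gc_max: float
) -> List[Tuple[str, float]]:
    """Keep segments meeting the criteria, sorted by GC fraction then name."""
    kept = [
        (name, gc_fraction(seq))
        for name, seq in records.items()
        if len(seq) >= min_length and gc_min <= gc_fraction(seq) <= gc_max
    ]
    return sorted(kept, key=lambda item: (item[1], item[0]))


# Hypothetical input; sequences are placeholders, not from the disclosure.
fasta_text = """>seg1
ACGTACGTAC
>seg2
GGGGCCCCGG
>seg3
ATATATGCAT
>seg4
ACGT
"""
records = parse_fasta(fasta_text.splitlines())
selected = select_segments(records, min_length=8, gc_min=0.2, gc_max=0.7)
print(selected)
```

In this sketch, seg2 is excluded by the GC ceiling and seg4 by the minimum length, and the retained segments could then be exported as a data table.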
Web application
[228] In some embodiments, a computer program includes a web application. In light of the disclosure provided herein, those of skill in the art will recognize that a web application, in various embodiments, utilizes one or more software frameworks and one or more database systems. In some embodiments, a web application is created upon a software framework such as Microsoft® .NET or Ruby on Rails (RoR). In some embodiments, a web application utilizes one or more database systems including, by way of non-limiting examples, relational, non-relational, object oriented, associative, XML, and document oriented database systems. In further embodiments, suitable relational database systems include, by way of non-limiting examples, Microsoft® SQL Server, MySQL™, and Oracle®. Those of skill in the art will also recognize that a web application, in various embodiments, is written in one or more versions of one or more languages. A web application may be written in one or more markup languages, presentation definition languages, client-side scripting languages, server-side coding languages, database query languages, or combinations thereof. In some embodiments, a web application is written to some extent in a markup language such as Hypertext Markup Language (HTML), Extensible Hypertext Markup Language (XHTML), or Extensible Markup Language (XML). In some embodiments, a web application is written to some extent in a presentation definition language such as Cascading Style Sheets (CSS). In some embodiments, a web application is written to some extent in a client-side scripting language such as Asynchronous JavaScript and XML (AJAX), Flash® ActionScript, JavaScript, or Silverlight®. In some embodiments, a web application is written to some extent in a server-side coding language such as Active Server Pages (ASP), ColdFusion®, Perl, Java™, JavaServer Pages (JSP), Hypertext Preprocessor (PHP), Python™, Ruby, Tcl, Smalltalk, WebDNA®, or Groovy. In some embodiments, a web application is written to some extent in a database query language such as Structured Query Language (SQL). In some embodiments, a web application integrates enterprise server products such as IBM® Lotus Domino®. In some embodiments, a web application includes a media player element. In various further embodiments, a media player element utilizes one or more of many suitable multimedia technologies including, by way of non-limiting examples, Adobe® Flash®, HTML 5, Apple® QuickTime®, Microsoft® Silverlight®, Java™, and Unity®.
[229] Referring to FIG. 24, in a particular embodiment, an application provision system comprises one or more databases 2400 accessed by a relational database management system (RDBMS) 2410. Suitable RDBMSs include Firebird, MySQL, PostgreSQL, SQLite, Oracle Database, Microsoft SQL Server, IBM DB2, IBM Informix, SAP Sybase, Teradata, and the like. In this embodiment, the application provision system further comprises one or more application servers 2420 (such as Java servers, .NET servers, PHP servers, and the like) and one or more web servers 2430 (such as Apache, IIS, GWS and the like). The web server(s) optionally expose one or more web services via application programming interfaces (APIs) 2440. Via a network, such as the Internet, the system provides browser-based and/or mobile native user interfaces.
[230] Referring to FIG. 25, in a particular embodiment, an application provision system alternatively has a distributed, cloud-based architecture 2500 and comprises elastically load balanced, auto-scaling web server resources 2510 and application server resources 2520 as well as synchronously replicated databases 2530.
Mobile application
[231] In some embodiments, a computer program includes a mobile application provided to a mobile computing device. In some embodiments, the mobile application is provided to a mobile computing device at the time it is manufactured. In other embodiments, the mobile application is provided to a mobile computing device via the computer network described herein.
[232] In view of the disclosure provided herein, a mobile application is created by techniques known to those of skill in the art using hardware, languages, and development environments known to the art. Those of skill in the art will recognize that mobile applications are written in several languages. Suitable programming languages include, by way of non-limiting examples, C, C++, C#, Objective-C, Java™, JavaScript, Pascal, Object Pascal, Python™, Ruby, VB.NET, WML, and XHTML/HTML with or without CSS, or combinations thereof.
[233] Suitable mobile application development environments are available from several sources. Commercially available development environments include, by way of non-limiting examples, AirplaySDK, alcheMo, Appcelerator®, Celsius, Bedrock, Flash Lite, .NET Compact Framework, Rhomobile, and WorkLight Mobile Platform. Other development environments are available without cost including, by way of non-limiting examples, Lazarus, MobiFlex, MoSync, and Phonegap. Also, mobile device manufacturers distribute software developer kits including, by way of non-limiting examples, iPhone and iPad (iOS) SDK, Android™ SDK, BlackBerry® SDK, BREW SDK, Palm® OS SDK, Symbian SDK, webOS SDK, and Windows® Mobile SDK.
[234] Those of skill in the art will recognize that several commercial forums are available for distribution of mobile applications including, by way of non-limiting examples, Apple® App Store, Google® Play, Chrome WebStore, BlackBerry® App World, App Store for Palm devices, App Catalog for webOS, Windows® Marketplace for Mobile, Ovi Store for Nokia® devices, Samsung® Apps, and Nintendo® DSi Shop.
[235] In some embodiments, a computer program includes a standalone application, which is a program that is run as an independent computer process, not an add-on to an existing process, e.g., not a plug-in. Those of skill in the art will recognize that standalone applications are often compiled. A compiler is a computer program(s) that transforms source code written in a programming language into binary object code such as assembly language or machine code. Suitable compiled programming languages include, by way of non-limiting examples, C, C++, Objective-C, COBOL, Delphi, Eiffel, Java™, Lisp, Python™, Visual Basic, and VB .NET, or combinations thereof. Compilation is often performed, at least in part, to create an executable program. In some embodiments, a computer program includes one or more executable compiled applications.
[236] In some embodiments, the computer program includes a web browser plug-in (e.g., extension, etc.). In computing, a plug-in is one or more software components that add specific functionality to a larger software application. Makers of software applications support plug-ins to enable third-party developers to create abilities which extend an application, to support easily adding new features, and to reduce the size of an application. When supported, plug-ins enable customizing the functionality of a software application. For example, plug-ins are commonly used in web browsers to play video, generate interactivity, scan for viruses, and display particular file types. Those of skill in the art will be familiar with several web browser plug-ins including Adobe® Flash® Player, Microsoft® Silverlight®, and Apple® QuickTime®. In some embodiments, the toolbar comprises one or more web browser extensions, add-ins, or add-ons. In some embodiments, the toolbar comprises one or more explorer bars, tool bands, or desk bands.
[237] In view of the disclosure provided herein, those of skill in the art will recognize that several plug-in frameworks are available that enable development of plug-ins in various programming languages, including, by way of non-limiting examples, C++, Delphi, Java™, PHP, Python™, and VB .NET, or combinations thereof.
[238] Web browsers (also called Internet browsers) are software applications, designed for use with network-connected computing devices, for retrieving, presenting, and traversing information resources on the World Wide Web. Suitable web browsers include, by way of non-limiting examples, Microsoft® Internet Explorer®, Mozilla® Firefox®, Google® Chrome, Apple® Safari®, Opera Software® Opera®, and KDE Konqueror. In some embodiments, the web browser is a mobile web browser. Mobile web browsers (also called microbrowsers, mini-browsers, and wireless browsers) are designed for use on mobile computing devices including, by way of non-limiting examples, handheld computers, tablet computers, netbook computers, subnotebook
computers, smartphones, music players, personal digital assistants (PDAs), and handheld video game systems. Suitable mobile web browsers include, by way of non-limiting examples, Google® Android® browser, RIM BlackBerry® Browser, Apple® Safari®, Palm® Blazer, Palm® WebOS® Browser, Mozilla® Firefox® for mobile, Microsoft® Internet Explorer® Mobile, Amazon® Kindle® Basic Web, Nokia® Browser, Opera Software® Opera® Mobile, and Sony® PSP™ browser.
[239] In some embodiments, the platforms, systems, media, and methods disclosed herein include software, server, and/or database modules, or use of the same. In view of the disclosure provided herein, software modules are created by techniques known to those of skill in the art using machines, software, and languages known to the art. The software modules disclosed herein are implemented in a multitude of ways. In various embodiments, a software module comprises a file, a section of code, a programming object, a programming structure, a distributed computing resource, a cloud computing resource, or combinations thereof. In further various embodiments, a software module comprises a plurality of files, a plurality of sections of code, a plurality of programming objects, a plurality of programming structures, a plurality of distributed computing resources, a plurality of cloud computing resources, or combinations thereof. In various embodiments, the one or more software modules comprise, by way of non-limiting examples, a web application, a mobile application, a standalone application, and a distributed or cloud computing application. In some embodiments, software modules are in one computer program or application. In other embodiments, software modules are in more than one computer program or application. In some embodiments, software modules are hosted on one machine. In other embodiments, software modules are hosted on more than one machine. In further embodiments, software modules are hosted on a distributed computing platform such as a cloud computing platform. In some embodiments, software modules are hosted on one or more machines in one location. In other embodiments, software modules are hosted on one or more machines in more than one location.
[240] In some embodiments, the platforms, systems, media, and methods disclosed herein include one or more databases, or use of the same. In view of the disclosure provided herein, those of skill in the art will recognize that many databases are suitable for storage and retrieval of nucleic acid segment sequences or information derived from analysis thereof. In various embodiments, suitable databases include, by way of non-limiting examples, relational databases, non-relational databases, object oriented databases, object databases, entity-relationship model databases, associative databases, XML databases, document oriented databases, and graph databases.
Further non-limiting examples include SQL, PostgreSQL, MySQL, Oracle, DB2, Sybase, and MongoDB. In some embodiments, a database is Internet-based. In further embodiments, a database is web-based. In still further embodiments, a database is cloud computing-based. In a particular embodiment, a database is a distributed database. In other embodiments, a database is based on one or more local computer storage devices.
[241] Provided herein are kits related to the methods and systems described herein. In some embodiments, the kits may comprise one or more of a plurality of recognition elements, a plurality of detection oligonucleotides, a plurality of anchor oligonucleotides, a plurality of detection polynucleotide complexes, one or more buffers, one or more reagents, instructions for use, a manual, a protocol, or a combination thereof. In some embodiments, the kits may comprise one or more containers. In some embodiments, the kits may comprise a plurality of recognition elements, a plurality of detection polynucleotide complexes, or detection oligonucleotides and anchor oligonucleotides, and instructions for use.
[242] In some embodiments, the kits may comprise one or more buffers or reagents. In some embodiments, the kits may comprise two buffers or reagents, or a combination thereof. In some embodiments, a first buffer or reagent of the kits described herein may be configured to promote hybridization. In some embodiments, a second buffer or reagent of the kits described herein may be configured to promote de-hybridization. In some embodiments, the kits may comprise one or more reagents. In some embodiments, the kits may comprise three reagents. In some embodiments, the first reagent of the kits described herein may comprise a set of recognition elements. In some embodiments, the second reagent of the kits described herein may comprise a set of detection polynucleotide complexes, wherein each detection polynucleotide complex may comprise a hybridized detection oligonucleotide and an anchor oligonucleotide. In some embodiments, the second reagent of the kits described herein may comprise a set of detection oligonucleotides and a set of associated anchor oligonucleotides, wherein a subset of the detection oligonucleotides and a subset of associated anchor oligonucleotides may be assembled into detection polynucleotide complexes prior to a detection assay, whereas the remaining detection oligonucleotides and associated anchor oligonucleotides may assemble in the detection assay proper. In some embodiments, the third reagent of the kits described herein may comprise a sample comprising a plurality of target nucleic acid molecules. In some embodiments, the first reagent, the second reagent, and the third reagent may be found in separate containers within the kit. In some embodiments, the first reagent, the second reagent, and the third reagent may be found in the same container within the kit.
[243] In some embodiments, the kits may comprise instructions for use, a manual, a protocol, or a combination thereof. In some embodiments, the kits may comprise a tube, a bottle, a glass jar, a container, or a combination thereof. In some embodiments, the kits may comprise instrumentation, including but not limited to a centrifuge, a heating element, a cooling element, a shaker, an incubator, or a combination thereof. In some embodiments, the kit may comprise components configured to perform any one of the methods described herein. In some embodiments, components of the kit may be stored at room temperature, below room temperature, below 15°C, below 10°C, below 5°C, below 0°C, below -20°C, or below -40°C, or a combination thereof. In some embodiments, different components within the kit may be stored at different temperatures.
[244] Unless defined otherwise, all terms of art, notations and other technical and scientific terms or terminology used herein are intended to have the same meaning as is commonly understood by one of ordinary skill in the art to which the claimed subject matter pertains. In some cases, terms with commonly understood meanings are defined herein for clarity and/or for ready reference, and the inclusion of such definitions herein should not necessarily be construed to represent a substantial difference over what is generally understood in the art.
[245] As used herein, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. Any reference to “or” herein is intended to encompass “and/or” unless otherwise stated.
[246] As used herein, the term “about” in some cases refers to an amount that is approximately the stated amount or that is near the stated amount by 10%, 5%, or 1%, including increments therein.
[247] As used herein, the phrases “at least one”, “one or more”, and “and/or” are open-ended expressions that are both conjunctive and disjunctive in operation. For example, each of the expressions “at least one of A, B and C”, “at least one of A, B, or C”, “one or more of A, B, and C”, “one or more of A, B, or C” and “A, B, and/or C” means A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B and C together.
[248] The terms “coded” and “encoded” are intended to have the same meaning and are herein used interchangeably.
[249] “Linked” with respect to two nucleic acids means not only a fusion of a first moiety to a second moiety at the C-terminus or the N-terminus, but also includes insertion of the first moiety to the second moiety into a common nucleic acid. Thus, for example, the nucleic acid A may be linked directly to nucleic acid B such that A is adjacent to B (-A-B-), but nucleic acid A may be
linked indirectly to nucleic acid B, by intervening nucleotide or nucleotide sequence C between A and B (e.g., -A-C-B- or -B-C-A-). The term “linked” is intended to encompass these various possibilities.
[250] “Optimum,” “optimal,” “optimize” and the like are not intended to limit the inventive concepts to the absolute optimum state of the aspect or characteristic being optimized but will include improved but less than optimum states.
[251] “Sample” means a source of target or analyte. Examples of samples include biological samples, such as whole blood, lymphatic fluid, serum, plasma, sweat, tear, saliva, sputum, cerebrospinal fluid, amniotic fluid, seminal fluid, vaginal excretion, serous fluid, synovial fluid, pericardial fluid, peritoneal fluid, pleural fluid, transudates, exudates, cystic fluid, bile, urine, gastric fluid, intestinal fluid, fecal samples, liquids containing single or multiple cells, liquids containing organelles, fluidized tissues, fluidized organisms, liquids containing multi-celled organisms, biological swabs and biological washes. Samples may be from any organism (e.g., prokaryotes, eukaryotes, plants, animals, humans) or other samples (e.g., environmental or forensic samples).
[252] “Set” includes sets of one or more elements or objects. A “subset” of a set includes any number of elements or objects from the set, from one up to all of the elements of the set.
[253] “Subject” includes any plant or animal, including without limitation, humans.
[254] “Detecting” or “decoding” with respect to a code includes determining the presence of a known code or a probability of the presence of a known code with or without determining the nucleotide by nucleotide sequence of the code. Decoding may be hard decision decoding. Decoding may be soft decision decoding.
[255] “Identify,” “determine” and the like with respect to codes, targets or analytes of the inventive concepts are intended to include any or all of: (A) an indication of the presence or absence of the relevant code, target or analyte, (B) an indication of the probability of the presence or absence of the relevant code, target or analyte, and/or (C) quantification of the relevant code, target or analyte.
[256] “Hard decision decoding” or “hard decision” refers to a method or model that includes making a call for each nucleotide in a nucleic acid segment (commonly referred to as a “base call”) in order to determine the sequence of nucleotides in the nucleic acid segment. Models of the inventive concepts incorporate hard decision decoding models. The particular nucleic acid being detected may be or include a code of the inventive concepts described herein.
[257] “Soft decision decoding” or “soft decision” refers to a method or a model that uses data collected during a sequencing or detecting process to calculate a probability that a particular nucleic acid or nucleic acid segment is present. The probability may be calculated without making a base call for each nucleotide in a nucleic acid segment. In another example, a probability is calculated without making a hard call that a string of nucleic acids in a segment is present. Instead of making a hard call for each nucleotide or nucleotide segment, a probabilistic decoding algorithm is applied to the recorded signal(s) upon completion of signal collection. A probability of the presence of each of the codes may be determined without discarding signal, in contrast to hard decision decoding methods, in which hard calls are made during the signal collection process. In soft decision decoding, the data may, for example, include, or be calculated from, fluorescent intensity readings in spectral bands for signals produced by the sequencing/detecting chemistry. In one embodiment, soft decision decoding uses data collected during a sequencing/detecting process to calculate a probability that a particular nucleic acid segment from a known set of sequences is present. Models of the inventive concepts may be used for soft decision decoding. The particular nucleic acid or nucleic acid segment being detected may be or include a code of the inventive concepts.
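The contrast between hard and soft decision decoding can be illustrated with a short Python sketch. It is not the decoding model of the disclosure. It assumes a hypothetical noise model (independent Gaussian noise with one standard deviation `sigma`, an ideal reading of 1.0 in the expected channel and 0.0 elsewhere), a uniform prior, and a small invented codebook; the names `CODEBOOK`, `hard_decode`, and `soft_decode` are illustrative only.

```python
import math

# Hypothetical codebook: the expected color channel (0-3) in each readout cycle.
CODEBOOK = {
    "code_A": (0, 1, 2),
    "code_B": (0, 3, 1),
    "code_C": (2, 2, 0),
}

def hard_decode(intensities):
    """Call the brightest channel in every cycle, then look for an exact match."""
    calls = tuple(
        max(range(len(cycle)), key=cycle.__getitem__) for cycle in intensities
    )
    for name, code in CODEBOOK.items():
        if code == calls:
            return name
    return None  # the per-cycle calls match no known code

def soft_decode(intensities, sigma=0.5):
    """Return P(code | signal) for every code without per-cycle calls.

    Assumption for illustration: Gaussian noise around an ideal signal of
    1.0 in the expected channel and 0.0 in the other channels.
    """
    log_like = {}
    for name, code in CODEBOOK.items():
        total = 0.0
        for cycle, expected in zip(intensities, code):
            for channel, value in enumerate(cycle):
                ideal = 1.0 if channel == expected else 0.0
                total -= (value - ideal) ** 2 / (2 * sigma ** 2)
        log_like[name] = total
    # Uniform prior; subtract the maximum before exponentiating for stability.
    peak = max(log_like.values())
    weights = {name: math.exp(v - peak) for name, v in log_like.items()}
    norm = sum(weights.values())
    return {name: w / norm for name, w in weights.items()}

# Three cycles, four channels each; cycle 2 is nearly tied between channels 1 and 3.
signal = [
    (0.9, 0.1, 0.0, 0.1),
    (0.1, 0.55, 0.0, 0.5),
    (0.1, 0.2, 0.8, 0.0),
]
posterior = soft_decode(signal)
best = max(posterior, key=posterior.get)
```

The hard path commits to one channel per cycle before consulting the codebook, whereas the soft path scores every known code against all recorded intensities and returns a probability for each.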
[258] “Crosstalk” refers to the situation in which a signal from one nucleotide addition reaction may be picked up by multiple channels (referred to as “color crosstalk”) or the situation in which a signal from a concatemer or sequencing cluster interferes with an adjacent or nearby cluster or concatemer (referred to as “cluster crosstalk” or “concatemer crosstalk”).
[259] “Color channel” means a set of optical elements for sensing and recording an electromagnetic signal from a sequencing reaction. Examples of optical elements include lenses, filters, mirrors, and cameras.
[260] “Spectral band” or “spectral region” means a continuous wavelength range in the electromagnetic spectrum.
[261] “Flow” or “cycle” refers to a single incubation period with a detection polynucleotide complex pool, or a pool of detection oligonucleotides and anchor oligonucleotides, and a concatemeric amplification product.
[262] “Code Space” refers to the totality of factors that determine the number of unique nucleotide possibilities at each segment on a recognition element.
[263] “Bind”, “bound” refers to and includes both covalent and non-covalent interactions. For example, “bind” can include any degree of hybridization between two nucleic acid sequences. Alternatively, “bind” can include covalent binding, where the sharing of electrons between atoms
occurs (e.g., a chemical bond). A covalent bond can be reversible or irreversible, depending on the need. Hydrogen bonds, van der Waals interactions and other weak interactions between molecules are also considered “bonds” for the present disclosure.
EXAMPLES
[264] The following illustrative examples are representative of embodiments of the methods, compositions and systems described herein and are not meant to be limiting in any way.
Example 1. Sample Preparation and General Encoded Assay Workflow
[265] This example describes a general workflow for the encoded assays disclosed herein. Target nucleic acids are extracted from a sample, be it a eukaryotic or prokaryotic sample. Extracting the nucleic acids from samples may be performed by existing methods available as known to a skilled artisan.
[266] An aliquot of the extracted target nucleic acids is combined with a pool of recognition elements. This combination can be performed in a tube, a well of a plate, or other contained devices. To the reaction vessel is further added a buffer and a ligase, for example a high-fidelity thermostable DNA ligase, for example in a 20 µL reaction volume. The ligation reaction is incubated and cycled between 95°C and 60°C, for example six times, to increase the formation of hybrids between the target nucleic acids and their complementary regions in the recognition elements and the ligation of the ends of the recognition elements that are brought together via the hybridization event to generate circularized recognition elements.
[267] After the hybridization and ligation cycles, the reactions are treated with an exonuclease to remove linear nucleic acids. For example, to each 20 µL ligation reaction an exonuclease mixture can be added and incubated at 37°C for 30 min., followed by exonuclease inactivation at 95°C for 5 min.
[268] Aliquots of the exonuclease treated circularized recognition elements are transferred to new substrates that comprise optically clear glass bottoms. The circularized recognition elements are amplified by rolling circle amplification (RCA) or multiple strand displacement amplification, thereby yielding concatemeric amplification products. For example, an exonuclease treated reaction that is aliquoted into a new substrate such as one or more wells of a welled plate can be immobilized onto the bottom of the substrate, if the substrate has been pretreated with a composition that can immobilize nucleic acids. For example, treating wells of a multi-welled plate with a cationic composition known to immobilize nucleic acids can be used to capture circularized recognition elements. The circularized recognition elements can be
immobilized on the surface of a pre-treated substrate comprising a cationic coating by incubating the substrate with the amplification products at 42°C for 60 min. A skilled artisan will understand myriad other methods for immobilizing an amplification product on a substrate surface.
[269] After immobilization, the circularized recognition elements can be amplified to generate concatemeric amplification products. For example, to the immobilized and circularized recognition elements can be added an amplification primer, deoxynucleotide triphosphates (dNTPs), a buffer solution and a DNA polymerase such as Phi29 DNA polymerase for performing rolling circle amplification (RCA). The RCA reaction can be incubated at 42°C for 1-2 hrs, after which the wells can be washed, leaving the concatemeric amplification products in a wash buffer such as a Tris-EDTA wash solution in anticipation of the detection part of the assay.
[270] The plate comprising the concatemeric amplification products can be placed in a fluorescent imaging instrument with an optical subsystem with at least a four-color imager with two excitation paths, for example 520 nanometers (nm) and 639 nm, and two cameras for capturing images of two spectrally separated wavelengths per excitation path. Detection takes the form of an iterative process including 1) hybridization of detection polynucleotide complexes to the concatemeric amplification products which share homology, 2) imaging of the hybridization event (e.g., readout cycles), and 3) removing the detection complexes to allow for the next cycle or flow of hybridization and imaging. This process can be repeated any number of times, depending on the number of segments in a hypercode and the number of states needed to provide hypercode profiles for a given recognition element. The number of readout cycles therefore changes depending on the plexity of the panel. Detection, imaging, and detection complex removal can all be done at room temperature.
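As a rough illustration of how the number of readout cycles scales with the design of the recognition element, the Python sketch below multiplies the number of segments by the number of flows per segment. The product rule and the function name `readout_cycles` are assumptions made for this sketch; the segment and flow counts are those stated in Examples 3, 4, and 5.

```python
def readout_cycles(segments: int, flows_per_segment: int) -> int:
    """Total hybridize/image/strip cycles for one recognition element design."""
    if segments < 1 or flows_per_segment < 1:
        raise ValueError("segments and flows_per_segment must be positive")
    return segments * flows_per_segment

# (segments, flows per segment) as stated in Examples 3, 4, and 5.
schemes = {
    "low pool density": (5, 1),
    "high pool density": (1, 5),
    "intermediate pool density": (4, 2),
}
cycles = {name: readout_cycles(*design) for name, design in schemes.items()}
```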
Example 2. Overview of a Detection Scheme
[271] Referring to FIG. 5, a process 500 for detecting a nucleotide sequence was conducted. In this example, a recognition element comprising a code with five segments is used. The segments are illustrated in FIG. 5. Operations 510, 520, and 530 are followed and repeated five times.
[272] At operation 510, a recognition element comprising a code with five segments is incubated with a detection pool comprising sixteen detection oligonucleotides; however, only six of the sixteen detection oligonucleotides are depicted in this example. The recognition element and detection pool are incubated together. One of the detection oligonucleotides in the detection pool is complementary to a segment (e.g., segment one) of the recognition element. As shown in FIG. 5, the detection oligonucleotide binds to its complementary segment one.
[273] At operation 520, the unbound detection oligonucleotides are washed away, and the bound detection oligonucleotides are imaged as described herein. The bound detection oligonucleotides include a detection oligonucleotide portion with a fluorescent molecule attached thereto. Imaging involves a fluorescent imaging system including a microscope. The fluorescent molecules emit a color on the visible spectrum during imaging.
[274] At operation 530, the bound detection oligonucleotide is de-hybridized and washed away. De-hybridization includes the addition of a reagent in solution that promotes the dehybridization of the bound detection oligonucleotide. Operations 510, 520, and 530 of process 500 are repeated five times with different detection pools of detection oligonucleotides. During each repetition, one of the detection oligonucleotides hybridizes to a segment on the recognition element and the signal is captured.
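Operations 510, 520, and 530 can be sketched as a simple loop in Python. This is a toy simulation under stated assumptions, not the disclosed implementation: hybridization is reduced to an exact label match, each pool holds four oligonucleotides rather than sixteen for brevity, and the segment labels, color names, `DetectionOligo`, and `run_process` are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DetectionOligo:
    target_segment: str  # label of the segment this oligo is complementary to
    color: str           # fluorescent label carried by the oligo

def run_process(element_segments, detection_pools):
    """Simulate one pass of operations 510, 520, and 530 per detection pool."""
    profile = []
    for pool in detection_pools:
        # 510: incubate; only oligos complementary to a segment stay bound.
        bound = [o for o in pool if o.target_segment in element_segments]
        # 520: unbound oligos are washed away and the bound color is imaged.
        profile.append(bound[0].color if bound else None)
        # 530: de-hybridize and wash so nothing carries over to the next cycle.
        bound.clear()
    return profile

COLORS = ("green", "yellow", "red", "far-red")  # illustrative labels
VARIANTS = "abcd"                               # four candidate sequences per segment

# One pool per segment position; each pool covers every variant of that segment.
pools = [
    [DetectionOligo(f"s{k}{v}", c) for v, c in zip(VARIANTS, COLORS)]
    for k in range(1, 6)
]
element = {"s1a", "s2c", "s3b", "s4a", "s5d"}  # the code carried by one element
profile = run_process(element, pools)
```

Five passes through the loop yield one color per cycle, and the ordered list of colors plays the role of the captured signal for the five-segment code.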
[275] To prepare concatemeric amplification products (found in Example 1) for detection, wash buffer in the sample wells can be removed. Detection polynucleotide complexes can be dispensed into the wells. For example, 50 µL of a solution comprising 16 detection polynucleotide complexes in a buffered solution comprising mono- and divalent salts, EDTA and a surfactant can be added to the wells with the concatemeric amplification products, and the reactions incubated for 10 min. to allow for hybridization of the anchor oligonucleotide to its complementary hypercode sequence. After hybridization, the detection complexes are removed, the wells washed with the last wash remaining in the wells, and the reactions subsequently imaged. Following imaging, the wash buffer is aspirated, the wells are stripped by the addition of a stripping solution to remove the hybridized detection complexes, the stripping buffer removed, the wells washed, and a new solution with new detection complexes is added to the wells. The operations are repeated until all the defined readout cycles are performed for each well. The hypercode profiles generated by the multiple readout cycles for each well can be decoded to identify the hypercode present, and hence the target nucleic acid present.
Example 3. Detection Scheme 1: Low Pool Density
[276] In this detection scheme, a recognition element comprising a code with five segments and two target specific recognition element arms is used. The recognition element is separately incubated with five detection pools wherein each pool has four differently labeled detection polynucleotide complexes. Each detection polynucleotide complex is composed of a detection oligonucleotide and an anchor oligonucleotide (See FIGs. 10A-10C and 11A-11B). In this detection scheme, the five segments occupy a large portion of the recognition element. As such, this detection scheme illustrates a tradeoff between: (1) the five segments taking up a large
portion of the recognition element and (2) a small number (e.g., 4) of detection polynucleotide complexes present in each detection pool.
[277] In this detection scheme, each detection pool corresponded to one of the five segments on the recognition element. Each of the four detection polynucleotide complexes in each detection pool comprises a fluorescent molecule that is optically distinct. This detection scheme is composed of the following factors: five segments on the recognition element, one incubation (e.g., flow or cycle) per segment (e.g., five total flows or cycles), five detection pools with four detection polynucleotide complexes per detection pool, and four optically distinct fluorescent molecules per detection pool. In total, this detection scheme utilizes 20 total detection polynucleotide complexes, and results in 1,024 possible permutations. The larger the permutation space, the larger number of codes that can be derived for a set minimum Hamming-distance criteria across the set of codes.
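The arithmetic behind the 1,024 permutations, and the effect of a minimum Hamming-distance criterion on the usable code set, can be sketched in Python as follows. The greedy selection shown here is a generic illustration chosen for brevity, not the code-design method of the disclosure, and the names `hamming` and `greedy_codebook` are hypothetical.

```python
from itertools import product

COLORS = 4    # optically distinct fluorescent molecules per detection pool
SEGMENTS = 5  # one readout cycle per segment

def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def greedy_codebook(min_distance):
    """Keep each word that is at least min_distance from every word kept so far."""
    kept = []
    for word in product(range(COLORS), repeat=SEGMENTS):
        if all(hamming(word, other) >= min_distance for other in kept):
            kept.append(word)
    return kept

permutation_space = COLORS ** SEGMENTS  # 4 ** 5
codes_d1 = greedy_codebook(1)           # every word is usable
codes_d2 = greedy_codebook(2)           # any two codes differ in >= 2 cycles
```

With these settings the full space holds 1,024 words, and the greedy pass keeps 256 of them at a minimum distance of 2, so a single miscalled cycle can never turn one kept code into another.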
[278] To begin the process, the amplified recognition element is incubated with a first detection pool. One of the four detection polynucleotide complexes of the first detection pool is complementary to one of the segments on the recognition element. As such, a hybridization event occurs between the detection polynucleotide complex and the corresponding complementary segment of the recognition element. After hybridization, the un-bound detection polynucleotide complexes in the first detection pool are washed away. After washing, the fluorescent molecule present on the bound detection polynucleotide complex is imaged, resulting in capturing the emission fluorescence signal of the fluorescent molecule. After imaging, a de-hybridization event occurs, and the bound detection polynucleotide complex is removed from its corresponding complementary segment.
[279] To continue the process, the operations of incubation, hybridization, wash, image, dehybridization, and wash are iteratively repeated five times, which accounts for the five segments present on the recognition element, and a hypercode profile of the fluorescence images is generated. In each iterative repetition, one of the detection polynucleotide complexes in the detection pool is bound to a corresponding segment of the recognition element, and the bound detection polynucleotide complex is imaged, de-hybridized, and removed by washing. The process continues for three additional iterative repetitions. After five repetitions, five detection polynucleotide complexes are bound to five segments on the recognition element, which results in 20 images. The resulting combination of images, the hypercode profile, allows for detection of the hypercode (e.g., the combination of segments that make up the code), which is used as a
surrogate for detection of the target nucleic acid molecule that originally hybridizes to the recognition element.
Example 4. Detection Scheme 2: High Pool Density
[280] A recognition element comprising a code comprising one segment is used, and the segment is incubated (e.g., flowed, cycled) with five detection pools. This example uses five detection pools, and each detection pool includes 1,024 detection polynucleotide complexes. In this scheme, a segment of a code on the recognition element occupies a small portion of the recognition element, unlike Detection Scheme 1 of Example 3. As such, the tradeoff in this detection scheme is between: (1) a segment taking up a small portion of the recognition element, and (2) a large number of detection polynucleotide complexes (e.g., 1,024) present in each detection pool.
[281] In this detection scheme, a recognition element with two target specific recognition arms and one segment is used. In this detection scheme, each detection pool includes 1,024 detection polynucleotide complexes. As such, this detection scheme includes the following factors: one segment on the recognition element, five incubation periods (e.g., flows or cycles) per segment (e.g., five total flows), five detection pools with 1,024 detection polynucleotide complexes per detection pool, and 4 fluorescent molecules. In total, this detection scheme utilizes 4,096 detection polynucleotide complexes, and results in 1,024 possible combinations. The larger the number of possible combinations that can be made, the larger the permutation space that may exist. The larger the permutation space, the larger the set of codes that can be derived from the permutation space.
[282] To begin, the segment is incubated with a first detection pool. One of the 1,024 detection polynucleotide complexes of the first detection pool is complementary to the segment of the recognition element. As such, a hybridization event occurs between the detection polynucleotide complex and the segment. After hybridization, the 1,023 un-bound detection polynucleotide complexes are washed away. After wash, the bound detection polynucleotide complex is imaged. After imaging, a de-hybridization event occurs, and the bound detection polynucleotide complex is removed from the segment by washing.
[283] To continue, the operations of incubation, hybridization, wash, image, de-hybridization, and wash are repeated a number of times, thereby generating a hypercode profile (also known as a code profile). For example, a second pool of detection polynucleotide complexes is incubated with the segment on the recognition element, where one of the detection polynucleotide complexes of the second detection pool hybridizes to the segment, and the bound detection
polynucleotide complex is imaged, de-hybridized, and washed. The detection process continues for the remaining three detection pools where each of the remaining detection pools is incubated with the segment of the recognition element. After five rounds, five detection polynucleotide complexes are bound to the segment on the recognition element, and each detection polynucleotide complex fluoresces one of four different fluorescent molecules. The resulting combination of images allows for detection of the code (e.g., the combination of segments that make up the code, in this example one segment), which is used as a surrogate for determining the presence of the target nucleic acid molecule that originally hybridizes to the recognition element.
Example 5. Detection Scheme 3: Intermediate Pool Density
[284] In this example, a recognition element with four segments is used and each segment is incubated (e.g., flowed or cycled) with two detection pools. In this detection scheme, four detection pools are used, and each detection pool comprises 16 detection polynucleotide complexes. In this scheme, the four segments on the recognition element and the eight incubation periods (e.g., two flows per segment) of the segments balance the factors illustrated in FIG. 3 and FIG. 4. As such, in this scheme, the four segments take up a preferred amount of space on the recognition element. Furthermore, this preferred scheme requires detection pools with 16 detection polynucleotide complexes each. As such, this detection scheme comprises four segments present on the recognition element taking up a preferred space on the recognition element, and also 16 detection polynucleotide complexes per detection pool.
[285] In this example, a recognition element with two target specific arms and four segments is used. Each of the four segments is incubated (e.g., two flows per segment) with two detection pools that results in an eight-flow detection scheme.
[286] In this detection scheme, the 16 detection polynucleotide complexes in each detection pool are divided into four groups of four, where each group is assigned one of four fluorescent molecules (e.g., colors). For example, FIG. 12A shows an example of a detection pool used in this example where groups of four detection polynucleotide complexes include the same detection oligonucleotide and the same fluorescent molecule.
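The grouping of one detection pool can be sketched as a small Python data structure. The anchor identifiers, the color names, the in-order assignment of anchors to groups, and the helper `build_pool` are placeholders chosen for this sketch and do not reproduce the layout of FIG. 12A.

```python
from collections import Counter

COLORS = ("color_1", "color_2", "color_3", "color_4")  # placeholder labels

def build_pool(anchor_ids):
    """Assign 16 anchor oligonucleotides to four color groups of four, in order."""
    if len(anchor_ids) != 16:
        raise ValueError("expected 16 anchor oligonucleotides per pool")
    return {anchor: COLORS[i // 4] for i, anchor in enumerate(anchor_ids)}

pool = build_pool([f"anchor_{n:02d}" for n in range(1, 17)])
group_sizes = Counter(pool.values())  # how many complexes share each color
```

Because four complexes share each color in this layout, a single flow narrows the bound complex to a group of four rather than identifying it uniquely.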
[287] As such, the present detection scheme is composed of the following factors: four segments on the recognition element, two incubation periods (e.g., flows) per segment (e.g., eight total flows), four detection pools with 16 detection polynucleotide complexes per detection pool, and four fluorescent molecules used per detection pool. In total, this detection scheme requires 64 detection polynucleotide complexes (e.g., four detection pools with 16 detection polynucleotide complexes in each detection pool) and results in 1,024 possible combinations. The larger the
number of possible combinations that can be made, the larger the permutation space that may exist. The larger the permutation space, the larger the code space that can be derived from the permutation space. The 64 anchor oligonucleotide sequences that can be used with different detection oligonucleotides to generate 64 unique detection polynucleotide complexes for code identification are illustrated in FIG. 17.
[288] To begin, the segments on the recognition element are incubated (e.g., flowed or cycled) with a first detection pool. During the first incubation period (e.g., flow or cycle), one of the 16 detection polynucleotide complexes in the first detection pool is complementary to one of the segments on the recognition element. As such, that detection polynucleotide complex hybridizes to its corresponding complementary segment. After hybridization, the 15 unbound detection polynucleotide complexes are washed away. After washing, the bound detection polynucleotide complex is imaged. After imaging, a de-hybridization event occurs, and the bound detection polynucleotide complex is removed from its corresponding complementary segment.
[289] To continue, the operations of incubation, hybridization, wash, image, de-hybridization, and wash are repeated a number of times thereby generating a hypercode profile. For example, the same first detection pool or a second detection pool of detection polynucleotide complexes is incubated with the same first segment or a second segment on the recognition element, where one of the detection polynucleotide complexes of the selected detection pool hybridizes to a corresponding complementary segment on the recognition element, and the bound detection polynucleotide complex is imaged, de-hybridized, and washed. The process continues for the remaining detection pools and segments. After eight rounds, eight detection polynucleotide complexes are bound to four segments on the recognition element, each fluorescing one of four colors through its fluorescent molecule during imaging. The resulting combination of images allows for detection of the code (e.g., the combination of segments that make up the code), which is used as a surrogate for determining the presence of the target nucleic acid molecule that originally hybridizes to the recognition element.
[290] In some embodiments, the combination of colors in flow one and flow two may provide the unique sequence of segment one. In some embodiments, the combination of colors in flow three and flow four may provide the unique sequence of segment two. In some embodiments, the combination of colors in flow five and flow six may provide the unique sequence of segment three. In some embodiments, the combination of colors in flow seven and flow eight may provide the unique sequence of segment four. In some embodiments, the combination of all eight colors in the four segments and eight flows may provide the unique sequence of segments one through four, which can be used as a proxy for the original hybridization event between the target nucleic acid and the recognition element.
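The flow-to-segment grouping described above can be sketched as follows. This is a minimal illustration only: the function name `flows_to_segment_states` and the observed color calls are hypothetical, and colors are assumed to be encoded as integers 1 through 4.

```python
def flows_to_segment_states(colors, flows_per_segment=2):
    """Group an ordered list of per-flow color calls into one tuple per segment.

    With two flows per segment, flows 1-2 map to segment one, flows 3-4 to
    segment two, and so on.
    """
    if len(colors) % flows_per_segment != 0:
        raise ValueError("number of flows must be a multiple of flows_per_segment")
    return [
        tuple(colors[i:i + flows_per_segment])
        for i in range(0, len(colors), flows_per_segment)
    ]


# Hypothetical color calls for flows one through eight (not data from the source).
observed = [1, 1, 2, 4, 4, 3, 3, 2]
segment_states = flows_to_segment_states(observed)
print(segment_states)
```

Each tuple is the pair of colors read for one segment; the ordered list of four tuples stands in for the code carried by the recognition element.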
[291] In this detection scheme, each detection pool undergoes two incubation periods (e.g., flows or cycles) for each segment of a concatemeric amplification product, resulting in an eight-flow system. For example, one detection pool may be incubated with the concatemeric amplification product twice (e.g., flows one and two), a second detection pool may be incubated with the concatemeric amplification product twice (e.g., flows three and four), a third detection pool may be incubated with the concatemeric amplification product twice (e.g., flows five and six), and a fourth detection pool may be incubated with the concatemeric amplification product twice (e.g., flows seven and eight). In some embodiments, each detection pool is made up of the same detection oligonucleotide and anchor oligonucleotide sequences. However, the combination of detection oligonucleotides and anchor oligonucleotides in each detection pool may be different in each flow (e.g., flows one and two, flows three and four, flows five and six, and flows seven and eight).
[292] FIG. 13A, for example, shows a detection pool for flow one of segment one and FIG. 13B shows a detection pool for flow two of segment one. In FIGs. 13A-B, each anchor oligonucleotide is labeled with a number (e.g., numbers one through 16), and each detection oligonucleotide is labeled with a three-digit number representing the emission wavelength of the attached fluorescent moiety (e.g., 488 nm, 532 nm, 568 nm, 647 nm). As shown in FIGs. 13A and 13B, the combinations of detection oligonucleotides and anchor oligonucleotides differ between flows one and two. For example, in the flow one detection pool, anchor oligonucleotides 1, 2, 3, and 4 are attached to a detection oligonucleotide comprising a fluorescent moiety that emits at 568 nm. In the flow two detection pool, however, anchor oligonucleotides 1, 8, 11, and 14 are attached to detection oligonucleotide 568. As yet another example, in the flow one detection pool, anchor oligonucleotides 9, 10, 11, and 12 are attached to detection oligonucleotides comprising a fluorescent moiety that emits at 647 nm. However, in the flow two detection pool, anchor oligonucleotides 9, 16, 4, and 6 are attached to a detection oligonucleotide comprising a fluorescent moiety that emits at 647 nm. As such, while the detection oligonucleotides and anchor oligonucleotides in flows one and two comprise the same sequences, the pairing of anchor oligonucleotides with detection oligonucleotides comprising different fluorescent moieties differs between flows one and two.
Example 6. State Aggregator
[293] As described above, a detection pool comprising detection polynucleotide complexes may be used herein. Detection polynucleotide complexes in the detection pool may bind to segments on the recognition element.
[294] FIG. 15 represents the 16 unique detection polynucleotide complex possibilities in each detection pool in a detection scheme using four fluorescent molecules and including a recognition element comprising four segments (but could be any number of segments) with two flows per segment. The Cycle 1 column represents which of the four fluorescent molecules (e.g., fluorescent molecule 1, 2, 3 or 4) is imaged during the first cycle. The Cycle 2 column represents which of the four fluorescent molecules is imaged during the second cycle. The State Pair column lists the two states from the first cycle and the second cycle (Cycle 1 and Cycle 2).
[295] As shown in FIG. 15, for example, unique detection polynucleotide complex one will include fluorescent molecule one in cycle one and fluorescent molecule one in cycle two, providing a “11” representation for the State Pair. As another example, unique detection polynucleotide complex eight will include fluorescent molecule two in cycle one and fluorescent molecule four in cycle two, providing a “24” representation for the State Pair. As another example, unique detection polynucleotide complex 15 will include fluorescent molecule four in cycle one and fluorescent molecule three in cycle two, providing a “43” representation for the State Pair. As shown in FIG. 15, each unique detection polynucleotide complex in the group of 16 detection polynucleotide complexes includes a different combination of fluorescent molecules, or hypercode profile. As such, the assay has the ability to differentiate between different hypercodes based on their profiles, and hence to determine the presence of the different targets of interest associated with those hypercodes.
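The State Pair enumeration can be illustrated with a short sketch. It assumes the 16 complexes are numbered in lexicographic order of (Cycle 1, Cycle 2), which is consistent with the three examples given above (1 → “11”, 8 → “24”, 15 → “43”) but is otherwise an assumption about the layout of FIG. 15. The function name `state_pairs` is hypothetical.

```python
from itertools import product


def state_pairs(n_colors=4, n_cycles=2):
    """Map complex number (1-based) to its State Pair string.

    Assumes lexicographic ordering over cycles, e.g. 11, 12, 13, 14, 21, ...
    """
    combos = product(range(1, n_colors + 1), repeat=n_cycles)
    return {
        index: "".join(str(color) for color in combo)
        for index, combo in enumerate(combos, start=1)
    }


pairs = state_pairs()
print(len(pairs))                       # 16 unique complexes per pool
print(pairs[1], pairs[8], pairs[15])    # 11 24 43
```

With four fluorescent molecules and two cycles there are 4 × 4 = 16 distinct State Pairs, one per detection polynucleotide complex in the pool.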
Example 7. Assay Plexity Factors
[296] In a detection scheme where a recognition element comprises four segments and two cycles per segment (e.g., four segments and eight cycles), 64 detection polynucleotide complexes in the detection pool may be used. The 64 detection polynucleotide complexes may be designed to be orthogonal to each other.
[297] Each of the 64 detection polynucleotide complexes is represented by a number (e.g., numbers one through 64). In some embodiments, the detection pool for segment one may include detection polynucleotide complexes one to 16. In some embodiments, the detection pool for segment two may include detection polynucleotide complexes 17 to 32. In some embodiments, the detection pool for segment three may include detection polynucleotide complexes 33 to 48. In some embodiments, the detection pool for segment four may include detection polynucleotide complexes 49 to 64. Examples of anchor oligonucleotide sequences of each detection polynucleotide are provided in FIG. 17.
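The numbering of complexes into per-segment pools can be expressed as simple index arithmetic. The sketch below is illustrative only; the helper names `pool_for_segment` and `segment_for_complex` are hypothetical.

```python
POOL_SIZE = 16  # detection polynucleotide complexes per detection pool


def pool_for_segment(segment, pool_size=POOL_SIZE):
    """Return the complex numbers (1-based) assigned to a segment's pool."""
    start = (segment - 1) * pool_size + 1
    return range(start, start + pool_size)


def segment_for_complex(complex_number, pool_size=POOL_SIZE):
    """Return the segment (1-based) whose pool contains the given complex."""
    return (complex_number - 1) // pool_size + 1


for segment in range(1, 5):
    pool = pool_for_segment(segment)
    print(segment, pool[0], pool[-1])
```

For four segments this reproduces the ranges listed above: 1 to 16, 17 to 32, 33 to 48, and 49 to 64.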
[298] FIG. 18 is a table of the permutations (e.g., colors, cycles/segment, total segments, and total cycles) that may be used to achieve a relatively large codespace from which to select a subset of codes. FIG. 20A is a table showing the relationship of the number of codes in a codespace, such that as the code space increases so does the number of codes for potential use in detection and decoding schemes for target nucleic acid identification using recognition elements as described herein.
[299] FIG. 20A is a summary table of the codespace generated by the indicated number of segments, cycles, and colors. A representative code set size is selected from the codespace, which enables the detection of numbers of targets. The greater the number of segments, cycles, and colors, the larger the codespace and therefore the greater the number of possible targets that can be detected with the expanded codeset in a single assay. FIG. 20B is a table demonstrating that, for a four color detection system, the number of cycles, or flows, for querying a segment and the number of potential code possibilities can change depending on the desired Hamming distance (HD). FIG. 20C demonstrates that increasing either the number of readout cycles or the number of distinct identifiable optical states can increase the pool plexity exponentially, thereby allowing for increased plexity for an assay. FIG. 21 is a schematic diagram of an example of a trellis codespace and a process of using the trellis codespace to select a set of codes with desired properties for an assay.
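To illustrate how a minimum Hamming distance constrains the usable codes, the sketch below computes the Hamming distance between two equal-length codes and selects a subset with a simple greedy filter. The greedy filter is a stand-in chosen for brevity; it is not the trellis-based selection of FIG. 21, and the function names are hypothetical.

```python
from itertools import product


def hamming(code_a, code_b):
    """Number of positions at which two equal-length codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have equal length")
    return sum(a != b for a, b in zip(code_a, code_b))


def greedy_code_set(n_colors, n_cycles, min_distance):
    """Greedily keep codes that are at least min_distance from all kept codes.

    Candidates are visited in lexicographic order; the result is one valid
    code set, not necessarily the largest possible one.
    """
    selected = []
    for candidate in product(range(1, n_colors + 1), repeat=n_cycles):
        if all(hamming(candidate, kept) >= min_distance for kept in selected):
            selected.append(candidate)
    return selected


# Four colors, two cycles: 16 codes with no distance constraint (HD >= 1).
print(len(greedy_code_set(4, 2, 1)))
# Requiring HD >= 2 shrinks the usable set.
print(greedy_code_set(4, 2, 2))
```

Raising the required Hamming distance reduces the number of codes available for a fixed number of colors and cycles, which is the trade-off the paragraph above describes; adding cycles or colors enlarges the space from which such a set is drawn.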
Example 8. Performance of Recognition Elements and Hypercodes
[300] Recognition elements were evaluated on two key metrics: 1) the uniformity of coverage across the individual hypercodes within a well and 2) the reproducibility of coverage across multiple wells. Experiments were performed as described in Example 1, using a 1,000 plex assay and a 12,000 plex assay.
[301] FIG. 7 shows example results of assay performance using the hypercodes described herein. Two plexities were examined, a lower plexity of 1,000 different hypercodes and targets of interest and a higher plexity of 12,000 different hypercodes and targets of interest. Hypercodes were considered detected when they were detected more than 10 times within a well. Median hypercode coverage for each assay was well above the minimum count required for a hypercode to be considered detected. The Decode Count represents the overall coefficient of variation (CV) of the total well to well counts. The Raw Hypercode Count represents the CV of hypercode counts across all the hypercodes present that hybridized to the same target. The Normalized Hypercode Count represents the CV of hypercode counts after normalizing for the relative abundance of each hypercode measured across the wells. The Decode Error Rate is the probability that any of the decoded calls is incorrect.
[302] It was determined that overall decoding counts were consistent from well to well with a CV of 1.8% (n=3) for the 1,000 plex assay and 7.8% (n=4) for the 12,000 plex assay. Within a well, similar numbers of decoded recognition elements were observed for different hypercodes, with a hypercode-to-hypercode CV of 65.9% for the 1,000 plex assay and 50.9% for the 12,000 plex assay, which could be further reduced to 7.1% for the 1,000 plex assay and 28.1% for the 12,000 plex assay by normalizing for the relative abundance of each hypercode measured across the wells.
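For reference, a coefficient of variation such as those reported above is the standard deviation divided by the mean, expressed as a percentage. The sketch below uses made-up well counts, not data from this example, and assumes the sample (n − 1) standard deviation; the source does not state which estimator was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(counts):
    """Return the CV of a list of counts as a percentage.

    Uses the sample standard deviation (n - 1 denominator), which is an
    assumption; a population standard deviation would give a smaller value.
    """
    if len(counts) < 2:
        raise ValueError("at least two counts are required")
    return 100.0 * stdev(counts) / mean(counts)


# Hypothetical total decode counts for three wells.
well_counts = [980, 1000, 1020]
cv_percent = coefficient_of_variation(well_counts)
print(f"{cv_percent:.1f}%")  # 2.0%
```

The same function applies to hypercode-to-hypercode variation within a well by passing per-hypercode counts instead of per-well totals.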
[303] The data demonstrate the ability of the methods disclosed herein to encode and decode targets of interest with high yield and low error rates, across three orders of magnitude in plexity. Hypercoding as described herein can provide for genotyping and quantitation of targets of interest that is comparable, or superior, to next generation sequencing or microarray technologies. For example, the methods described herein can provide a faster sample-to-answer turnaround time. Further, the use of double ligation recognition elements that can discern a target of interest from high homology pseudogenes or homologs in a sample overcomes the limitations of microarrays and next generation sequencing in this regard. Additionally, the methods described herein provide data on thousands of targets of interest in parallel, thereby providing a rapid and inexpensive alternative to existing technologies.
[304] All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.
[305] While preferred embodiments of the present disclosure have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. It is not intended that the disclosure be limited by the specific examples provided within the specification. While the disclosure has been described with reference to the aforementioned specification, the descriptions and illustrations of the embodiments herein are not meant to be construed in a limiting sense. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the disclosure. Furthermore, it shall be understood that all aspects of the disclosure are not limited to the specific depictions, configurations or relative proportions set forth herein which depend upon a variety of conditions and variables. It should be understood that various alternatives to the embodiments of the disclosure described herein may be employed in practicing the disclosure. It is therefore contemplated that the inventive concepts shall also cover any such alternatives, modifications, variations or equivalents. It is intended that the following claims define the scope of the disclosure and that methods and structures within the scope of these claims and their equivalents be covered thereby.
Claims
1. A method for determining the presence of one or more target nucleic acid molecules, the method comprising:
(a) providing a plurality of target nucleic acid molecules from a sample;
(b) providing a plurality of recognition elements, wherein each recognition element of the plurality of recognition elements comprises:
(i) one or more target recognition regions complementary to regions of a target nucleic acid molecule of the plurality of target nucleic acid molecules; and
(ii) a hypercode from a set of hypercodes, wherein the hypercode from the set of hypercodes is associated with one target nucleic acid molecule of the plurality of target nucleic acid molecules, and wherein the hypercode from the set of hypercodes comprises a plurality of segments that corresponds to at least two computational states from a set of computational states;
(c) amplifying a subset of the plurality of recognition elements that are hybridized to the plurality of target nucleic acid molecules to produce a plurality of concatemeric amplification products;
(d) hybridizing a detection polynucleotide complex of a plurality of detection polynucleotide complexes to a concatemeric amplification product of the plurality of concatemeric amplification products, wherein each detection polynucleotide complex of the plurality of detection polynucleotide complexes comprises:
(i) a detection oligonucleotide; and
(ii) an anchor oligonucleotide, wherein:
(1) a first portion of the anchor oligonucleotide is complementary to at least a portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products, and
(2) a second portion of the anchor oligonucleotide is complementary to at least a portion of the detection oligonucleotide;
(e) detecting the detection polynucleotide complex of the plurality of detection polynucleotide complexes hybridized to the at least a portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products; and
(f) determining the presence of the target nucleic acid molecule in the sample based on the detection of the hypercode.
2. The method of claim 1, wherein the detection polynucleotide complex of the plurality of detection polynucleotide complexes or the detection oligonucleotide comprises one or more detection moieties, wherein the one or more detection moieties comprises an organic dye, a fluorophore, a quantum dot, or a combination thereof.
3. The method of claim 2, wherein the one or more detection moieties comprises the fluorophore.
4. The method of any one of claims 1-3, wherein the detection polynucleotide complex of the plurality of detection polynucleotide complexes or the detection oligonucleotide comprises one or more optically distinct fluorescent moieties.
5. The method of any one of claims 1-4, wherein the plurality of detection polynucleotide complexes comprises two or more distinct detection polynucleotide complexes, and wherein the detection oligonucleotide of each of the two or more distinct detection polynucleotide complexes comprises one or more optically distinct fluorescent moieties.
6. The method of any one of claims 1-4, wherein the plurality of detection polynucleotide complexes comprises three or more distinct detection polynucleotide complexes.
7. The method of any one of claims 1-4, wherein the plurality of detection polynucleotide complexes comprises four or more distinct detection polynucleotide complexes.
8. The method of any one of claims 1-7, wherein the detection oligonucleotide comprises a length of 5 to 25 nucleotides.
9. The method of any one of claims 1-8, wherein the anchor oligonucleotide comprises a length of 20 to 100 nucleotides.
10. The method of claim 9, wherein the anchor oligonucleotide comprises a length of 40 to 50 nucleotides.
11. The method of any one of claims 1-10, further comprising forming the plurality of detection polynucleotide complexes.
12. The method of claim 11, wherein forming the plurality of detection polynucleotide complexes comprises hybridizing the detection oligonucleotide to the anchor oligonucleotide prior to the hybridizing of (d).
13. The method of claim 12, wherein the hybridizing of (d) comprises hybridizing the detection polynucleotide complex of the plurality of detection polynucleotide complexes to the hypercode of the set of hypercodes on the plurality of concatemeric amplification products.
14. The method of claim 13, wherein forming the plurality of detection polynucleotide complexes comprises hybridizing the detection oligonucleotide to the anchor oligonucleotide concurrently with hybridizing the first portion of the anchor oligonucleotide to the portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products.
15. The method of claim 13, wherein forming the plurality of detection polynucleotide complexes comprises hybridizing the detection oligonucleotide to the anchor oligonucleotide after hybridizing the first portion of the anchor oligonucleotide to the portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products.
16. The method of claim 13, wherein forming the plurality of detection polynucleotide complexes comprises hybridizing the detection oligonucleotide to the anchor oligonucleotide substantially simultaneously with hybridizing the first portion of the anchor oligonucleotide to the portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products.
17. The method of any one of claims 1-16, wherein the sample comprises a tissue, one or more cells, plasma, blood, urine, or a combination thereof.
18. The method of claim 17, further comprising extracting the plurality of target nucleic acid molecules from the tissue, the one or more cells, the plasma, the blood, the urine, or a combination thereof.
19. The method of any one of claims 1-18, wherein the plurality of target nucleic acid molecules comprises DNA.
20. The method of any one of claims 1-18, wherein the plurality of target nucleic acid molecules comprises RNA.
21. The method of claim 20, wherein the RNA comprises mRNA.
22. The method of any one of claims 1-21, further comprising:
(g) imaging the detection polynucleotide complex of the plurality of detection polynucleotide complexes hybridized to the at least a portion of the hypercode in the concatemeric amplification product of the plurality of concatemeric amplification products.
23. The method of claim 22, further comprising repeating (d) through (g) for each segment of the plurality of segments of the hypercode from each concatemeric amplification product of the plurality of concatemeric amplification products.
24. The method of claim 23, wherein the repeating of (d) through (g) comprises 2 to 15 iterative repetitions.
25. The method of claim 24, wherein the repeating of (d) through (g) comprises 2 to 10 iterative repetitions.
26. The method of any one of claims 1-25, wherein the detecting of (e) comprises fluorescence detection.
27. The method of any one of claims 1-26, further comprising applying a soft decision algorithm to a detected hypercode profile.
28. The method of any one of claims 1-27, further comprising performing (b) through (g) concurrently for each target nucleic acid molecule from the plurality of target nucleic acid molecules from the sample.
29. The method of any one of claims 1-28, wherein the plurality of target nucleic acid molecules comprises 10 to 10,000 target nucleic acid molecules.
30. The method of claim 29, wherein the plurality of target nucleic acid molecules comprises 10 to 1,000 target nucleic acid molecules.
31. The method of any one of claims 1-30, wherein each recognition element of the plurality of recognition elements comprises an amplification primer binding sequence.
32. The method of any one of claims 1-31, wherein the plurality of recognition elements comprises 10 to 10,000 recognition elements.
33. The method of claim 32, wherein the plurality of recognition elements comprises 10 to 1,000 recognition elements.
34. The method of any one of claims 1-33, wherein the plurality of segments comprises 2 to 10 segments.
35. The method of any one of claims 1-34, wherein each segment of the plurality of segments comprises a length of 5 to 30 contiguous nucleotides.
36. The method of claim 35, wherein each segment of the plurality of segments comprises a length of 5 to 25 contiguous nucleotides.
37. The method of any one of claims 1-34, wherein each segment of the plurality of segments comprises at least 5 contiguous nucleotides.
38. The method of claim 37, wherein the at least 5 contiguous nucleotides each correspond to a computational state from the set of computational states, wherein each computational state is different from the other computational states of the set of computational states.
39. The method of any one of claims 1-38, wherein the set of computational states comprises 5 to 30 computational states.
40. The method of any one of claims 1-39, further comprising determining a Hamming distance between any two hypercodes of the set of hypercodes.
41. The method of any one of claims 1-39, further comprising determining a Hamming distance between any two segments of a hypercode of the set of hypercodes.
42. The method of claim 40 or 41, wherein the Hamming distance is 2 to 8.
43. The method of any one of claims 1-42, further comprising repeating (d) through (f).
44. The method of claim 43, wherein the repeating of (d) through (f) comprises two repeats per segment of the plurality of segments.
45. The method of claim 43, wherein the repeating of (d) through (f) comprises three repeats per segment of the plurality of segments.
46. The method of claim 43, wherein the repeating of (d) through (f) comprises four repeats per segment of the plurality of segments.
47. The method of any one of claims 1-46, wherein the one or more target recognition regions in each of the recognition elements of the plurality of recognition elements comprises a 5’ arm and a 3’ arm, wherein the 5’ arm and the 3’ arm each comprise a sequence complementary to a portion of the target nucleic acid molecule.
48. The method of claim 47, further comprising providing a splint oligonucleotide probe comprising a 3’ region and a 5’ region, wherein the 5’ arm of the recognition element is complementary to the 3’ region of the splint oligonucleotide probe and the 3’ arm of the recognition element is complementary to the 5’ region of the splint oligonucleotide probe.
49. The method of any one of claims 1-48, further comprising ligating and circularizing the plurality of recognition elements that are hybridized to the plurality of target nucleic acid molecules.
50. The method of any one of claims 1-49, wherein the amplifying comprises performing rolling circle amplification or multiple strand displacement amplification.
51. A method for determining the presence of one or more target nucleic acid molecules, the method comprising:
(a) providing a plurality of target nucleic acid molecules from a sample;
(b) providing a plurality of recognition elements, wherein each recognition element of the plurality of recognition elements comprises:
(i) one or more target recognition regions complementary to regions of a target nucleic acid molecule of the plurality of target nucleic acid molecules; and
(ii) a hypercode from a set of hypercodes, wherein the hypercode from the set of hypercodes is associated with one target nucleic acid molecule of the plurality of target nucleic acid molecules, and wherein the hypercode from the set of hypercodes comprises a plurality of segments that corresponds to at least two computational states from a set of computational states;
(c) amplifying a subset of the plurality of recognition elements that are hybridized to the plurality of target nucleic acid molecules to produce a plurality of concatemeric amplification products;
(d) hybridizing a detection oligonucleotide of a plurality of detection oligonucleotides to a concatemeric amplification product of the plurality of concatemeric amplification products, wherein each detection oligonucleotide of the plurality of detection oligonucleotides comprises:
(i) a nucleic acid sequence that is complementary to a portion of the hypercode; and
(ii) a detectable moiety;
(e) detecting the detection oligonucleotide of the plurality of detection oligonucleotides hybridized to the portion of the hypercode of the concatemeric amplification product of the plurality of concatemeric amplification products via the detectable moiety of the detection oligonucleotide; and
(f) determining the presence of the target nucleic acid molecule in the sample based on the detection of the hypercode.
52. The method of claim 51, wherein the detection oligonucleotide comprises one or more detection moieties, wherein the one or more detection moieties comprises an organic dye, a fluorophore, a quantum dot, or a combination thereof.
53. The method of claim 52, wherein the one or more detection moieties comprises the fluorophore.
54. The method of any one of claims 51-53, wherein the detection oligonucleotide comprises one or more optically distinct fluorescent moieties.
55. The method of any one of claims 51-54, wherein the plurality of detection oligonucleotides comprises two or more distinct detection oligonucleotides.
56. The method of any one of claims 51-54, wherein the plurality of detection oligonucleotides comprises three or more distinct detection oligonucleotides.
57. The method of any one of claims 51-54, wherein the plurality of detection oligonucleotides comprises four or more distinct detection oligonucleotides.
58. The method of any one of claims 51-57, wherein the detection oligonucleotide comprises a length of 5 to 25 nucleotides.
59. The method of any one of claims 51-58, wherein the sample comprises a tissue, one or more cells, plasma, blood, urine, or a combination thereof.
60. The method of claim 59, further comprising extracting the plurality of target nucleic acid molecules from the tissue, the one or more cells, the plasma, the blood, the urine, or a combination thereof.
61. The method of any one of claims 51-60, wherein the plurality of target nucleic acid molecules comprises DNA.
62. The method of any one of claims 51-60, wherein the plurality of target nucleic acid molecules comprises RNA.
63. The method of claim 62, wherein the RNA comprises mRNA.
64. The method of any one of claims 51-63, further comprising:
(g) imaging the detection oligonucleotide of the plurality of detection oligonucleotides hybridized to the portion of the hypercode of the concatemeric amplification product of the plurality of concatemeric amplification products.
65. The method of claim 64, further comprising repeating (d) through (g) for each segment of the plurality of segments of the hypercode from each concatemeric amplification product of the plurality of concatemeric amplification products.
66. The method of claim 65, wherein the repeating of (d) through (g) comprises 2 to 15 iterative repetitions.
67. The method of claim 65, wherein the repeating of (d) through (g) comprises 2 to 10 iterative repetitions.
68. The method of any one of claims 51-67, further comprising performing (b) through (g) concurrently for each target nucleic acid molecule from the plurality of target nucleic acid molecules from the sample.
69. The method of any one of claims 51-68, wherein the detecting of (e) comprises fluorescence detection.
70. The method of any one of claims 51-69, further comprising applying a soft decision algorithm to a detected hypercode profile.
71. The method of any one of claims 51-70, wherein the plurality of target nucleic acid molecules comprises 10 to 10,000 target nucleic acid molecules.
72. The method of claim 71, wherein the plurality of target nucleic acid molecules comprises 10 to 1,000 target nucleic acid molecules.
73. The method of any one of claims 51-72, wherein each recognition element of the plurality of recognition elements comprises an amplification primer binding sequence.
74. The method of any one of claims 51-73, wherein the plurality of recognition elements comprises 10 to 10,000 recognition elements.
75. The method of claim 74, wherein the plurality of recognition elements comprises 10 to 1,000 recognition elements.
76. The method of any one of claims 51-75, wherein the plurality of segments comprises 2 to 10 segments.
77. The method of any one of claims 51-76, wherein each segment of the plurality of segments comprises a length of 5 to 30 contiguous nucleotides.
78. The method of claim 77, wherein each segment of the plurality of segments comprises a length of 5 to 25 contiguous nucleotides.
79. The method of any one of claims 51-76, wherein each segment of the plurality of segments comprises at least 5 contiguous nucleotides.
80. The method of claim 79, wherein the at least 5 contiguous nucleotides each correspond to a computational state from the set of computational states, wherein each computational state is different from other computational states of the set of computational states.
81. The method of any one of claims 51-80, wherein the set of computational states comprises 5 to 30 computational states.
82. The method of any one of claims 51-81, further comprising determining a Hamming distance between any two hypercodes of the set of hypercodes.
83. The method of any one of claims 51-81, further comprising determining a Hamming distance between any two segments of a hypercode.
84. The method of claim 82 or 83, wherein the Hamming distance is 2 to 8.
85. The method of any one of claims 51-84, further comprising repeating (d) through (f).
86. The method of claim 85, wherein the repeating of (d) through (f) comprises two repeats per segment of the plurality of segments.
87. The method of claim 85, wherein the repeating of (d) through (f) comprises three repeats per segment of the plurality of segments.
88. The method of claim 85, wherein the repeating of (d) through (f) comprises four repeats per segment of the plurality of segments.
89. The method of any one of claims 51-88, wherein the one or more target recognition regions in each of the recognition elements of the plurality of recognition elements comprises a 5’ arm and a 3’ arm, wherein the 5’ arm and the 3’ arm each comprise a sequence complementary to a portion of the target nucleic acid molecule.
90. The method of claim 89, further comprising providing a splint oligonucleotide probe comprising a 3’ region and a 5’ region, wherein the 5’ arm of the recognition element is complementary to the 3’ region of the splint oligonucleotide probe and the 3’ arm of the recognition element is complementary to the 5’ region of the splint oligonucleotide probe.
91. The method of any one of claims 51-90, further comprising ligating and circularizing the plurality of recognition elements that are hybridized to the plurality of target nucleic acid molecules.
92. The method of any one of claims 51-91, wherein the amplifying comprises performing rolling circle amplification or multiple strand displacement amplification.
93. A system, comprising:
(a) a plurality of recognition elements, wherein each recognition element of the plurality of recognition elements comprises:
(i) one or more target regions complementary to a corresponding target nucleic acid molecule of a plurality of target nucleic acid molecules from a sample; and
(ii) a hypercode from a set of hypercodes, wherein the hypercode from the set of hypercodes is associated with each target nucleic acid molecule of the plurality of target nucleic acid molecules from the sample including the corresponding target nucleic acid molecule in (i), and wherein the hypercode comprises a plurality of segments that corresponds to at least two computational states of a set of computational states; and
(b) a plurality of detection polynucleotide complexes comprising:
(i) a detection oligonucleotide; and
(ii) an anchor oligonucleotide, wherein:
(1) a first portion of the anchor oligonucleotide is complementary to a segment of the hypercode or a portion thereof; and
(2) a second portion of the anchor oligonucleotide is complementary to a portion of the detection oligonucleotide.
94. The system of claim 93, wherein the sample comprises a tissue, one or more cells, plasma, blood, urine, or a combination thereof.
95. The system of claim 93 or 94, wherein the plurality of target nucleic acid molecules comprises DNA.
96. The system of any one of claims 93-95, wherein the plurality of target nucleic acid molecules comprises RNA.
97. The system of claim 96, wherein the RNA comprises mRNA.
98. The system of any one of claims 93-97, wherein each segment of the plurality of segments comprises at least 5 contiguous nucleotides, wherein the at least 5 contiguous nucleotides each correspond to a computational state that is different from another computational state of the set of computational states.
99. The system of any one of claims 93-98, wherein the set of computational states comprises 2 to 20 computational states.
100. The system of claim 99, wherein the set of computational states comprises 2 to 10 computational states.
101. The system of claim 100, wherein the set of computational states comprises 4 computational states.
102. The system of any one of claims 93-101, wherein the set of codes comprises a Hamming distance of any two codes of the set of codes of 3 to 5.
103. The system of any one of claims 93-101, wherein the set of codes comprises a Hamming distance of any two codes of the set of codes of 3.
104. The system of any one of claims 93-101, wherein the set of codes comprises a Hamming distance of any two codes of the set of codes of 4.
105. The system of any one of claims 93-101, wherein the set of codes comprises a Hamming distance of any two codes of the set of codes of 5.
106. The system of any one of claims 93-105, wherein the detection oligonucleotide comprises one or more fluorescent molecules.
107. The system of claim 106, wherein the one or more fluorescent molecules comprises an organic dye, a biological fluorophore, a quantum dot, or a combination thereof.
108. The system of any one of claims 93-107, wherein the detection oligonucleotide comprises a length of 5 to 10 nucleotides.
109. The system of any one of claims 93-108, wherein the anchor oligonucleotide comprises a length of 10 to 25 nucleotides.
110. The system of any one of claims 93-109, wherein the plurality of detection polynucleotide complexes comprises 2 or more species of the plurality of detection polynucleotide complexes, and wherein the detection oligonucleotide of each of the 2 or more species comprises one or more fluorescent molecules, wherein the one or more fluorescent molecules are optically distinct.
111. The system of any one of claims 93-109, wherein the plurality of detection polynucleotide complexes comprises 3 or more species of the plurality of detection polynucleotide complexes, and wherein the detection oligonucleotide of each of the 3 or more species comprises one or more fluorescent molecules, wherein the one or more fluorescent molecules are optically distinct.
112. The system of any one of claims 93-109, wherein the plurality of detection polynucleotide complexes comprises 4 or more species of the plurality of detection polynucleotide complexes, and wherein the detection oligonucleotide of each of the 4 or more species comprises one or more fluorescent molecules, wherein the one or more fluorescent molecules are optically distinct.
113. The system of any one of claims 93-112, further comprising a solid substrate configured to immobilize nucleic acids.
114. The system of claim 113, wherein the solid substrate comprises a welled plate or a flow cell, wherein a surface of the welled plate or a surface of the flow cell comprises a cation coating layer coupled thereto.
115. The system of any one of claims 93-114, further comprising:
(c) a fluid flow controller;
(d) an imaging system;
(e) a computer system; or
(f) any combination of (c) to (e).
116. The system of claim 115, wherein the fluid flow controller comprises one or more pumps, valves, mixing manifolds, reagent reservoirs, waste reservoirs, or any combination thereof.
117. The system of claim 115 or 116, wherein the fluid flow controller is configured to provide programmable control of fluid flow velocity, volumetric fluid flow rate, timing of reagent or buffer introduction, or any combination thereof.
118. The system of any one of claims 93-117, wherein a detection polynucleotide complex is bound to a recognition element of the plurality of recognition elements, or a concatemeric amplification product thereof, to form a detectable binding complex.
119. A kit comprising:
(a) a plurality of recognition elements;
(b) a plurality of detection polynucleotide complexes comprising (i) a plurality of detection oligonucleotides and (ii) a plurality of anchor oligonucleotides; and
(c) instructions for use of (a) and (b) according to the method of any one of claims 1-92.
120. The kit of claim 119, further comprising:
(d) a first buffer, wherein the first buffer is configured to promote hybridization; and
(e) a second buffer, wherein the second buffer is configured to promote dehybridization.
121. A composition comprising a plurality of detectable binding complexes, wherein each detectable binding complex of the plurality of detectable binding complexes comprises a concatemeric amplification product comprising a recognition element sequence, wherein the recognition element sequence comprises complementary sequences to a target nucleic acid, and a hypercode sequence comprising one or more segment sequences, and a plurality of detection polynucleotide complexes hybridized to the one or more segment sequences, wherein a detection polynucleotide complex of the plurality of detection polynucleotide complexes comprises a detection oligonucleotide comprising a detection moiety hybridized to an anchor oligonucleotide, and wherein a portion of the anchor oligonucleotide of the detection polynucleotide complex is hybridized to the one or more segment sequences.
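Illustrative note on the Hamming distance and soft decision language in claims 70, 82-84 and 101-105: the claims do not specify an algorithm, so the following is only a minimal sketch under stated assumptions. It assumes a hypercode can be modeled as a tuple of per-segment computational states (four states, four segments), that the detected hypercode profile is a per-segment probability over states, and that soft decision decoding means picking the codebook entry with the highest summed log-probability. The names `CODEBOOK`, `PROFILE`, `hamming`, `min_pairwise_distance` and `soft_decode` are hypothetical and do not come from the application.

```python
from itertools import combinations
from math import log


def hamming(a, b):
    """Count the segment positions at which two equal-length hypercodes differ."""
    if len(a) != len(b):
        raise ValueError("hypercodes must have the same number of segments")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(codebook):
    """Smallest Hamming distance between any two hypercodes in the codebook."""
    return min(hamming(a, b) for a, b in combinations(codebook.values(), 2))


def soft_decode(profile, codebook):
    """Return the target whose hypercode best explains the detected profile.

    profile: one dict per segment, mapping state -> probability (assumed input).
    The score is the sum of log-probabilities of the states the hypercode expects.
    """
    def score(code):
        # Floor probabilities so a missing or zero entry does not break log().
        return sum(log(max(seg.get(state, 0.0), 1e-9))
                   for seg, state in zip(profile, code))

    return max(codebook, key=lambda name: score(codebook[name]))


# Hypothetical codebook: 4 segments, states drawn from {0, 1, 2, 3}.
CODEBOOK = {
    "target_A": (0, 0, 0, 0),
    "target_B": (1, 1, 1, 0),
    "target_C": (2, 2, 0, 2),
    "target_D": (0, 3, 3, 3),
}

# Hypothetical detected profile; the third segment is ambiguous between 0 and 1.
PROFILE = [
    {0: 0.05, 1: 0.90, 2: 0.03, 3: 0.02},
    {0: 0.10, 1: 0.80, 2: 0.05, 3: 0.05},
    {0: 0.45, 1: 0.40, 2: 0.05, 3: 0.10},
    {0: 0.85, 1: 0.05, 2: 0.05, 3: 0.05},
]

if __name__ == "__main__":
    print(min_pairwise_distance(CODEBOOK))
    print(soft_decode(PROFILE, CODEBOOK))
```

In this toy codebook every pair of hypercodes differs in at least three segments. A hard per-segment call on `PROFILE` would read (1, 1, 0, 0), which matches no codebook entry; scoring the full probability profile instead still resolves it to `target_B`, the only entry within one segment of that call.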
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363616299P | 2023-12-29 | 2023-12-29 | |
| US63/616,299 | 2023-12-29 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025145004A1 (en) | 2025-07-03 |
Family
ID=94383779
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2024/062056 (Pending; WO2025145004A1 (en)) | 2023-12-29 | 2024-12-27 | Methods, systems, compositions and kits for target detection |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025145004A1 (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3754028A1 (en) * | 2019-06-18 | 2020-12-23 | Apollo Life Sciences GmbH | Method of signal encoding of analytes in a sample |
| US20220235403A1 (en) * | 2021-01-26 | 2022-07-28 | 10X Genomics, Inc. | Nucleic acid analog probes for in situ analysis |
| WO2023096672A1 (en) * | 2021-11-23 | 2023-06-01 | Pleno, Inc. | Multiplexed detection of target biomolecules |
| WO2023141588A1 (en) * | 2022-01-21 | 2023-07-27 | 10X Genomics, Inc. | Multiple readout signals for analyzing a sample |
| WO2023172915A1 (en) * | 2022-03-08 | 2023-09-14 | 10X Genomics, Inc. | In situ code design methods for minimizing optical crowding |
- 2024-12-27: WO application PCT/US2024/062056 (WO2025145004A1), active, pending
Non-Patent Citations (1)
| Title |
|---|
| FAKRUDDIN M, MANNAN KS, CHOWDHURY A, MAZUMDAR RM, HOSSAIN MN, ISLAM S, CHOWDHURY MA.: "Nucleic acid amplification: Alternative methods of polymerase chain reaction.", J PHARM BIOALLIED SCI., vol. 5, no. 4, October 2013 (2013-10-01), pages 245 - 52, XP055846001, DOI: 10.4103/0975-7406.120066 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Xie et al. | Designing highly multiplex PCR primer sets with simulated annealing design using dimer likelihood estimation (SADDLE) | |
| Quick et al. | Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples | |
| Draghici et al. | Reliability and reproducibility issues in DNA microarray measurements | |
| San Segundo-Val et al. | Introduction to the gene expression analysis | |
| Li et al. | Multi-platform assessment of transcriptome profiling using RNA-seq in the ABRF next-generation sequencing study | |
| Su et al. | Next-generation sequencing and its applications in molecular diagnostics | |
| JP2023072089A (en) | Methods and compositions for analyzing nucleic acids | |
| US11473140B2 (en) | Highly selective omega primer amplification of nucleic acid sequences | |
| JP2021101732A (en) | Method for determining tumor gene copy number by analysis of cell-free dna | |
| Teder et al. | TAC-seq: targeted DNA and RNA sequencing for precise biomarker molecule counting | |
| US20130309676A1 (en) | Biased n-mers identification methods, probes and systems for target amplification and detection | |
| Chen et al. | Isothermal self-primer exponential amplification reaction (SPEXPAR) for highly sensitive detection of single-stranded nucleic acids and proteins | |
| Chen et al. | Highly accurate fluorogenic DNA sequencing with information theory–based error correction | |
| JP6858783B2 (en) | Single nucleotide polymorphism and indel multiple allele genotyping | |
| Chan et al. | Detecting m6A at single-molecular resolution via direct RNA sequencing and realistic training data | |
| US20160230224A1 (en) | Methods and apparatus to sequence a nucleic acid | |
| JP2020530261A (en) | Methods for Accurate Computational Degradation of DNA Mixtures from Contributors of Unknown Genotypes | |
| JP2024099818A (en) | Methods and systems for detecting transplant rejection | |
| CN120167013A (en) | Methods and compositions for enriching nucleic acid molecules for sequencing | |
| Rykalina et al. | Exome sequencing from nanogram amounts of starting DNA: comparing three approaches | |
| Gunaratne et al. | Large-scale integration of microRNA and gene expression data for identification of enriched microRNA–mRNA associations in biological systems | |
| Chow et al. | Concepts and new developments in droplet-based single cell multi-omics | |
| Zhuang et al. | Integrating magnetic-bead-based sample extraction and molecular barcoding for the one-step pooled RT-qPCR assay of viral pathogens without retesting | |
| WO2025145004A1 (en) | Methods, systems, compositions and kits for target detection | |
| JP7606554B2 (en) | Correction of deamination-induced sequence errors |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24847120; Country of ref document: EP; Kind code of ref document: A1 |