
WO2019123398A1 - Method of analysis of mutations in the hepatitis b virus and uses thereof - Google Patents

Method of analysis of mutations in the hepatitis b virus and uses thereof

Info

Publication number
WO2019123398A1
Authority
WO
WIPO (PCT)
Prior art keywords
hbv
orf
subject
subgenotype
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/IB2018/060480
Other languages
French (fr)
Inventor
William Grant Hartley ABBOTT
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New Zealand Health Innovation Hub Management Ltd
Original Assignee
New Zealand Health Innovation Hub Management Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New Zealand Health Innovation Hub Management Ltd filed Critical New Zealand Health Innovation Hub Management Ltd
Priority to CN201880079424.8A priority Critical patent/CN111465703A/en
Publication of WO2019123398A1 publication Critical patent/WO2019123398A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/70 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving virus or bacteriophage
    • C12Q1/701 Specific hybridization probes
    • C12Q1/706 Specific hybridization probes for hepatitis
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/112 Disease subtyping, staging or classification
    • C12Q2600/118 Prognosis of disease development
    • C12Q2600/156 Polymorphic or mutational markers

Definitions

  • the present invention relates to methods for determining the liver inflammation status of, determining the genetic status of a hepatitis B virus (HBV) in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with HBV genotype A, genotype B or genotype C, and related systems, kits, uses, devices, computer-readable media and apparatus.
  • HBV hepatitis B virus
  • Chronic hepatitis B (CHB) is a form of chronic liver inflammation caused by high levels of replication of the hepatitis B virus (HBV) in the liver cells of infected patients. If untreated, CHB leads to liver complications, including inflammation, cirrhosis, liver failure and liver cancer. Modern anti-viral treatments that suppress replication of the hepatitis B virus in patients with active liver inflammation result in lower rates of liver cirrhosis, liver failure and liver cancer. These treatments can also reverse pre-existing liver damage.
  • HBV-infected patients are monitored by regular blood testing to measure serum alanine amino transferase (ALT) - a liver enzyme that is released into the blood when liver damage occurs, indicating liver inflammation.
  • ALT serum alanine amino transferase
  • commencement of anti-viral therapy in patients with elevated serum ALT substantially reduces the future risk of liver complications.
  • liver inflammation and CHB develops with no increase in serum ALT levels. Consequently, these patients present with late stage disease that is often untreatable.
  • the invention relates to a method of determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S open reading frame (S ORF) of HBV genotype A, B or C, the method comprising a) obtaining sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, based on a S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region; and b) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status
  • S ORF S open reading frame
  • the invention in another aspect relates to a system for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the system comprising a) a measurement tool that analyses sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, b) a processor, c) a computer readable medium, and d) an analysis tool stored on the computer-readable medium that is adapted to be executed on the processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic
  • the system comprises a measurement tool comprising one or more oligonucleotide primers that target or are based on a S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with for use in determining the frequency of non-synonymous mutations at each of two or more codons in the S ORF region.
  • the system comprises a measurement tool comprising one or more oligonucleotide primers that target a S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with for use in determining the frequency of non-synonymous mutations at each of two or more codons in the S ORF region.
  • the system comprises a measurement tool comprising one or more oligonucleotide primers based on a S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with for use in determining the frequency of non-synonymous mutations at each of two or more codons in the S ORF region.
  • the system comprises a measurement tool that generates sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject and the analysis tool is adapted to select an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with and compare sequence information generated by the measurement tool to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region.
  • the invention in another aspect relates to a method of determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the method comprising determining the frequency of non-synonymous mutations in the S ORF region of the HBV by a method comprising a) providing sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, b) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, c) comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and d) determining the frequency of non-s
  • the invention relates to a system for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the system comprising a) a measurement tool that generates sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, b) a processor, c) a computer readable medium, and d) an analysis tool stored on the computer-readable medium that is adapted to be executed on the processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV by i) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, ii) comparing the sequence information generated by the measurement
  • the invention relates to a kit for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C
  • the kit comprising a) an oligonucleotide primer pair for amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to produce an amplimer, the primer pair comprising i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C or nu
  • the invention relates to a kit for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C
  • the kit comprising a) an oligonucleotide primer pair for amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to produce an amplimer, the primer pair comprising i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C
  • the invention relates to use of a measurement tool that generates sequence information about at least a portion of an S ORF region of HBV present in a sample obtained from a subject in the manufacture of a system for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C; wherein the determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C is carried out by a method described herein.
  • the invention relates to use of an agent that generates sequence information about at least a portion of an S ORF region of HBV present in a sample obtained from a subject in the manufacture of a kit or system for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C; wherein the determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in the subject is carried out by a method described herein.
  • the invention relates to a device for determining the frequency of non-synonymous mutations in an S ORF region of HBV in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C comprising: a) a memory that stores instructions; and b) a processor that retrieves instructions from the memory and executes the instructions i) to determine the frequency of non-synonymous mutations in the S ORF region of HBV by selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, comparing sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-s
  • the invention relates to a computer readable medium having instructions stored thereon, which, when executed by a processor, cause the processor to perform operations that implement a method to determine the frequency of non-synonymous mutations in an S ORF region of HBV in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C by selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that a subject is infected with, comparing sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons.
  • the invention relates to use of an oligonucleotide primer pair for amplifying at least a portion of an S ORF region of HBV present in a sample obtained from a subject to produce an amplimer in the manufacture of a kit or system for determining the liver inflammation status of a subject or the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the primer pair comprising a) a forward primer comprising a sequence of at least about 5 contiguous
  • nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C or nucleotides 1500-3221 and nucleotides 1-60 when the HBV is genotype A; and b) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 100-2000; wherein the determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in the subject is carried out by a method described herein.
  • the invention relates to use of one or more nucleotide probes that hybridise to specific nucleotide sequences in the S ORF region of HBV for generating sequence information about at least a portion of the S ORF region of HBV present in a sample obtained from a subject in the manufacture of a kit or system for determining the liver inflammation status of a subject or the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, wherein the determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in the subject is carried out by a method described herein.
  • the invention relates to use of a) a measurement tool that generates sequence information about at least a portion of an S ORF region of HBV present in a sample obtained from a subject, b) a processor, c) a computer readable medium, and d) an analysis tool stored on the computer-readable medium that is adapted to be executed on the processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV by selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, comparing the sequence information generated by the measurement tool to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons; in the preparation of a
  • the invention in another aspect relates to an apparatus for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the apparatus comprising a) means for amplifying at least a portion of the S ORF region of the HBV
  • the invention in another aspect relates to an apparatus for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the apparatus comprising a) means for generating sequence information about the S ORF of the HBV that the subject is infected with, and b) means for determining the frequency of non-synonymous mutations in the S ORF region of the HBV by i) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, ii) comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and iii) determining the frequency of non-synonymous mutations in the
  • the means for amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to produce an amplimer comprises an oligonucleotide primer pair described herein.
  • the means for generating sequence information is a nucleotide sequencer.
  • the means for generating sequence information comprises one or more nucleotide probes that hybridise to specific nucleotide sequences in the S ORF region of HBV for generating sequence information about at least a portion of the S ORF region of HBV present in a sample obtained from a subject.
  • the invention relates to use of a) an oligonucleotide primer pair for amplifying at least a portion of an S ORF region of HBV present in a sample obtained from a subject to produce an amplimer, the primer pair comprising i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C or nucleotides 1500-3221 and nucleotides 1-60 when the HBV is genotype A; and ii) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with compris
  • amplimer c) a processor, d) a computer readable medium, and e) an analysis tool stored on the computer-readable medium that is adapted to be executed on the processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV by selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, comparing the sequence information generated by the nucleotide sequencer to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons; in the preparation of a system for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing l
  • the invention relates to use of a) an oligonucleotide primer pair for amplifying at least a portion of an S ORF region of HBV present in a sample obtained from a subject to produce an amplimer, the primer pair comprising i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C or nucleotides 1500-3221 and nucleotides 1-60 when the HBV is genotype A; and ii) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with
  • the invention relates to use of a) one or more nucleotide probes that hybridise to specific nucleotide sequences in the S ORF region of HBV for generating sequence information about at least a portion of the S ORF region of HBV present in a sample obtained from a subject, and b) an analysis tool stored on a computer-readable medium that is adapted to be executed on a processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV by selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons; in the
  • the invention in another aspect relates to a method of monitoring the response to therapy, predicting the response to a therapy, selecting a therapy, determining the optimal timing, duration or regimen of a therapy in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the method comprising determining the frequency of non-synonymous mutations in the S ORF region of the HBV by a method comprising i) providing sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, ii) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, iii) comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and iv) determining the frequency of non-synonymous mutations in the S ORF region
  • the therapy is chemotherapy, drug therapy, immunotherapy, gene therapy, or a combination of any two or more thereof.
  • the therapy is immunotherapy.
  • the invention relates to a method of monitoring the response to immunotherapy, predicting the response to an immunotherapy, selecting a therapy, determining the optimal timing, duration or regimen of a therapy in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the method comprising determining the frequency of non-synonymous mutations in the S ORF region of the HBV by a method comprising i) providing sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, ii) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, iii) comparing the sequence information to the S ORF reference sequence to
  • immunotherapy select a therapy, or determine the optimal timing, duration or regimen of a therapy based on the frequency of non-synonymous mutations in the S ORF region of the HBV.
  • the subject is infected with an HBV comprising an S ORF of HBV genotype C, HBV genotype B, HBV genotype A, or a combination of any two or more thereof.
  • the frequency of non-synonymous mutations is determined at 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 35, 40, 45, 50, 55 or 60 or more codons in the S ORF region, and various ranges may be selected from between any of these values, for example, from about 2 to about 60, about 2 to about 50, 2 to about 40, about 2 to about 35, about 2 to about 31, about 2 to about 30, about 2 to about 25, about 2 to about 20, about 2 to about 15, about 2 to about 10, about 2 to about 5, about 3 to about 60, about 3 to about 50, about 3 to about 40, about 3 to about 30, about 3 to about 20, about 3 to about 10, about 4 to about 40, about 4 to about 30, about 4 to about 20, about 4 to about 10, about 5 to about 60, about 5 to about 50, about 5 to about 40, about 5 to about 30, about 5 to about 20, about 5 to about 10, about 7 to about 40, about 7 to about
  • the frequency of non-synonymous mutations is determined at 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 or 32 codons corresponding to the codons at positions 4, 35, 51, 54, 56, 73, 81, 84, 120, 132, 135, 141, 184, 188, 192, 195, 198, 214, 219, 221, 242, 270, 275, 300, 334, 358, 363, 377, 378, 382, 387 and 391 of a S ORF reference sequence of any one of SEQ ID NO: 1-14, and various ranges can be selected from between any of these values, for example, from about 0 to about 32, about 0 to about 25, about 0 to about 20, about 2 to about 32, about 2 to about 25, about 2 to about 20, about 3 to about 32, about 3 to about 25, about 3 to about 20, about 5 to about 32, about 5 to about 25, about 5 to
  • the frequency of non-synonymous mutations is determined at two or more codons corresponding to the codons at positions 4, 35, 51, 54, 56, 120, 132, 135, 141, 188, 192, 195, 198, 214, 219, 221, 242, 270, 275, 334, 358, 363, 377, 378, 382 and 387 of a S ORF reference sequence of any one of SEQ ID NO: 1-14, or a combination of any two or more thereof.
  • the frequency of non-synonymous mutations is determined at two or more codons corresponding to the codons at positions 4, 120, 132, 135 and 378 of a S ORF reference sequence of any one of SEQ ID NO: 1-14.
  • the frequency of non-synonymous mutations is determined at two or more codons that are under positive selection pressure.
  • the frequency of non-synonymous mutations is determined at two or more codons within a portion of the S ORF region of HBV comprising codons corresponding to codons 1-400, 1-350, 1-300, 1-250, 1-200, 1-150, 1-100, 1-50, 50-400, 50-350, 50-300, 50-250, 50-200, 50-150, 50-100, 100-400, 100-350, 100-300, 100-250, 100-200, 100-150, 150-400, 150-350, 150-300, 150-250, 150-200, 200-400, 200-350, 200-300, 200-250, 250-400, 250-350, 250-300, 300-400, 300-350, or 350-400 of an S ORF reference sequence of any one of SEQ ID NO: 1-14.
  • selecting an S ORF reference sequence corresponding to the subgenotype of the S ORF of the HBV the subject is infected with comprises a) comparing the sequence information to one or more S ORF subgenotype
  • corresponding to the subgenotype of the S ORF of the HBV the subject is infected with comprises a) providing information about the subgenotype of the S ORF of the HBV the subject is infected with, and b) selecting an S ORF reference sequence corresponding to the subgenotype of the S ORF of the HBV the subject is infected with.
  • the S ORF reference sequence is a) SEQ ID NO: 1 when the subgenotype of the HBV is genotype A1; b) SEQ ID NO: 2 when the subgenotype of the HBV is genotype A2; c) SEQ ID NO: 3 when the subgenotype of the HBV is genotype B1; d) SEQ ID NO: 4 when the subgenotype of the HBV is genotype B2; e) SEQ ID NO: 5 when the subgenotype of the HBV is genotype B3; f) SEQ ID NO: 6 when the subgenotype of the HBV is genotype B4; g) SEQ ID NO: 7 when the subgenotype of the HBV is genotype B5; h) SEQ ID NO: 8 when the subgenotype of the HBV is genotype C1; i) SEQ ID NO: 9 when the subgenotype of the HBV is genotype C2; j) SEQ ID NO: 10 when the subgenotype of the HBV is genotype C3A; k) SEQ ID NO: 11 when the subgenotype of the HBV is genotype C3B;
  • the method comprises selecting an S ORF reference sequence comprising, or the S ORF reference sequence comprises, a sequence that corresponds to 2, 3, 4, 5, 6, 7, 8, 9 or 10 or more subgenotypes of HBV genotypes A, B or C.
  • the S ORF reference sequence corresponds to all known
  • sequence information is generated about, or an oligonucleotide primer pair amplifies a portion of the S ORF region of the HBV comprising at least about 10, 20, 50, 100, 200, 250, 300, 400, 500, 750, 1000 or 1200 contiguous nucleotides of the S ORF of the HBV, and various ranges may be selected between any of these values, for example from about 10 to about 1200, about 200 to about 1200, about 500 to about 1200, about 750 to about 1200 or about 1000 to about 1200 contiguous nucleotides.
  • the method comprises amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject using an oligonucleotide primer pair to produce an amplimer and generating sequence information about the amplimer.
  • the method comprises adding the sample or the amplimer to one or more nucleotide probes that hybridise to specific nucleotides in the S ORF region of HBV to generate the sequence information.
  • the measurement tool comprises an oligonucleotide primer pair for amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to produce an amplimer.
  • the measurement tool comprises a nucleotide sequencer for generating sequence information from the amplimer or from the sample.
  • the measurement tool comprises an oligonucleotide primer pair for amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to produce an amplimer and a nucleotide sequencer for generating sequence information from the amplimer.
  • the measurement tool comprises one or more nucleotide probes that hybridise to specific nucleotides in the S ORF region of HBV for generating sequence information from the sample or the amplimer.
  • the oligonucleotide primer pair comprises a) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C or nucleotides 1500-3221 and nucleotides 1-60 when the HBV is genotype A; and b) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 100-2000 of the HBV genome.
  • the oligonucleotide primer pair comprises i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C or nucleotides 1500-3221 and nucleotides 1-60 when the HBV is genotype A; and ii) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence comprising nucleotides 100-2000; or b) the oligonucleotide primer pair comprises i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence comprising nucleotides 1800-3215 when the HBV is genotype B or C or nucleotides 1800-3221 when the HBV is
  • the forward primer comprises a sequence of at least about 5, 6, 7, 8, 9, 10, 12, 15, 17, 18, 20, 25, 30, 35, 40, 45 or 50 contiguous nucleotides of a portion of an HBV genome reference sequence, and various ranges may be selected from between these values, for example, from about 5 to 50, 5 to 30, 7 to 30, 10 to 30, 15 to 30, 20 to 30, 25 to 30, 5 to 25, 7 to 25, 10 to 25, 15 to 25, 20 to 25, 5 to 20, 7 to 20, 10 to 20, 15 to 20, 5 to 15, 7 to 15, 10 to 15 contiguous nucleotides of a portion of the HBV genome.
  • the reverse primer comprises a sequence that is complementary to at least about 5, 6, 7, 8, 9, 10, 12, 15, 17, 20, 25, 30, 35, 40, 45 or 50 contiguous nucleotides of a portion of an HBV genome reference sequence, and various ranges may be selected from between these values, for example, from about 5 to 50, 5 to 30, 7 to 30, 10 to 30, 15 to 30, 20 to 30, 25 to 30, 5 to 25, 7 to 25, 10 to 25, 15 to 25, 20 to 25, 5 to 20, 7 to 20, 10 to 20, 15 to 20, 5 to 15, 7 to 15, 10 to 15 contiguous nucleotides of a portion of the HBV genome.
  • the HBV genome reference sequence is a) SEQ ID NO: 15 when the subgenotype of the HBV is genotype A1; b) SEQ ID NO: 16 when the subgenotype of the HBV is genotype A2; c) SEQ ID NO: 17 when the subgenotype of the HBV is genotype B1; d) SEQ ID NO: 18 when the subgenotype of the HBV is genotype B2; e) SEQ ID NO: 19 when the subgenotype of the HBV is genotype B3; f) SEQ ID NO: 20 when the subgenotype of the HBV is genotype B4; g) SEQ ID NO: 21 when the subgenotype of the HBV is genotype B5; h) SEQ ID NO: 22 when the subgenotype of the HBV is genotype C1; i) SEQ ID NO: 23 when the subgenotype of the HBV is genotype C2; j) SEQ ID NO: 24 when the subgenotype of the HBV is genotype C3A; k) SEQ ID NO: 25 when the subgenotype of the HBV is genotype C3B;
  • the method comprises using two or more primers to amplify at least a portion of the S ORF region comprising the two or more codons, wherein the two or more primers each comprise or are complementary to 5 or more contiguous nucleotides of an S ORF reference sequence of any one of SEQ ID NOs: 1-14.
  • one or more of the primers are designed to anneal specifically to a wild-type codon of the S ORF reference sequence at which the frequency of non-synonymous mutations is to be determined.
  • one or more of the primers are designed to anneal specifically to a codon of the S ORF reference sequence comprising a non-synonymous mutation.
  • ORF region of the HBV present in a sample obtained from the subject based on an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region is determined using amplification refractory mutation system (ARMS).
  • S ORF region of the HBV indicates a) the subject has a positive liver inflammation status, b) the HBV is subject to positive selection pressure, c) the subject is susceptible to or has an increased risk of developing liver
  • S ORF region of the HBV is determined by a) comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, b) assigning a numerical score for each codon based on the frequency of non-synonymous mutations at that codon, c) combining the numerical scores for each codon to provide a combined
  • a positive numerical score is assigned for each codon having a frequency of non-synonymous mutations of at least about 5%, 10%, 15%, 20%, 25%, 30%, 40% or at least about 50%.
  • the positive score for each codon is independently 0.5, 1, 1.5, 2 or 2.5.
  • the combined numerical score representing the frequency of non-synonymous mutations in the S ORF of the HBV of greater than 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 indicates that a) the subject has a positive liver inflammation status, b) the HBV is subject to positive selection pressure, c) the subject is susceptible to or has an increased risk of developing liver
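The scoring scheme in the preceding items (a positive score for each codon whose non-synonymous frequency reaches a threshold, summed into a combined score and compared with a cutoff) can be sketched in Python. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names are hypothetical, and the 10% threshold, the score of 1 per codon and the cutoff of 2 are single values picked from the ranges listed above.

```python
def combined_score(freq_by_codon, min_frequency=0.10, score_per_codon=1.0):
    """Sum a fixed positive score over codons whose frequency meets the threshold.

    freq_by_codon maps a codon position to a frequency in [0, 1], or to None
    when no reads covered that codon (assumed convention).
    """
    return sum(
        score_per_codon
        for freq in freq_by_codon.values()
        if freq is not None and freq >= min_frequency
    )


def exceeds_cutoff(score, cutoff=2):
    """True when the combined score is greater than the chosen cutoff."""
    return score > cutoff


# Illustrative (made-up) frequencies at codons 4, 120, 132, 135 and 378.
example = {4: 0.50, 120: 0.12, 132: 0.02, 135: 0.30, 378: None}
score = combined_score(example)
flagged = exceeds_cutoff(score)
```

Keeping the threshold, per-codon score and cutoff as parameters reflects the fact that the text lists several alternative values for each.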
  • the subject a) is HBeAg negative, b) has normal serum ALT levels, c) has a serum HBV-DNA titre of greater than about 2,000 IU/ml, or d) has a combination of any two or more of a) to c).
  • the subject is HBeAg positive or has elevated serum ALT levels or is both HBeAg positive and has elevated serum ALT levels.
  • the liver complications of chronic HBV infection comprise liver cirrhosis, liver cancer, liver failure, liver inflammation, liver damage, liver dysfunction, or a combination of any two or more thereof.
  • This invention may also be said broadly to consist in the parts, elements and features referred to or indicated in the specification of the application, individually or collectively, and any or all combinations of any two or more of said parts, elements or features, and where specific integers are mentioned herein which have known equivalents in the art to which this invention relates, such known equivalents are deemed to be incorporated herein as if individually set forth.
  • Figure 1 is a Receiver Operating Characteristic (ROC) curve showing the relationship between the sensitivity and specificity of the frequency of non-synonymous mutations in the S ORF of HBV to predict liver disease.
  • Figure 2 is a survival analysis comparing the outcome at the ten year follow up of nine subjects with a score of greater than or equal to 3 (Group 1) with that of seven subjects with a score of less than or equal to 2 (Group 0). Scores were calculated as described above for Figure 1.
  • the present inventors have identified codons in the S ORF of the HBV genome that are under positive selection pressure.
  • the present inventors have surprisingly determined that determining the frequency of non-synonymous mutations across the S ORF region of HBV based on the frequency of non-synonymous mutations at codons under positive selection pressure is useful to detect or predict early stage liver inflammation and/or an increased risk of developing liver complications of chronic HBV infection in patients infected with HBV.
  • the present invention relates to a method of determining the liver inflammation status of, determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with HBV genotype A, B or C by determining the frequency of non-synonymous mutations at two or more codons in the S ORF region of HBV.
  • the invention also provides related systems, kits, measurement tools, agents, devices, computer readable media, oligonucleotide primer pairs and apparatus and uses thereof.
  • an HBV comprising an S ORF of HBV genotype A, B or C refers to HBV comprising an S ORF that substantially corresponds to the S ORF region of a subgenotype of any one of HBV genotypes A, B or C. It includes an HBV comprising an S ORF having substantial sequence identity to an S ORF reference sequence of any one of HBV subgenotypes A1, A2, B1, B2, B3, B4, B5, C1, C2, C3A, C3B, C4, C5 and C6 described herein.
  • HBV comprising an S ORF region having substantial sequence identity to an S ORF consensus sequence of any as yet unclassified or unidentified subgenotype of HBV genotype A, B or C.
  • HBV has an S ORF having at least about 70%, 75%, 80%, 90%, 95% or at least 99% sequence identity to an S ORF reference sequence described herein.
  • nucleotide sequence of the S ORF is substantially identical to an S ORF reference sequence described herein but for one or more synonymous substitutions.
  • the term “indel” as used herein refers to an insertion or deletion of one or more nucleotides in the wild-type HBV genome.
  • non-synonymous mutation refers to a single nucleotide mutation in an HBV S ORF sequence that results in a codon that codes for a different amino acid to the amino acid coded for at the equivalent codon position in a wild-type, consensus or reference HBV S ORF sequence.
  • sequence that results in a codon that codes for the same amino acid as the amino acid coded for at the equivalent codon position in a wild-type, consensus or reference HBV S ORF sequence.
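The distinction between synonymous and non-synonymous changes comes down to comparing the amino acids encoded by the reference codon and the observed codon. The sketch below assumes the standard genetic code and uses a hypothetical helper name, `classify_codon`; it does not check that only a single nucleotide differs.

```python
from itertools import product

# Standard genetic code, built with bases in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon(reference_codon, observed_codon):
    """Return 'identical', 'synonymous' or 'non-synonymous' for a codon pair."""
    ref = reference_codon.upper()
    obs = observed_codon.upper()
    if ref == obs:
        return "identical"
    if CODON_TABLE[ref] == CODON_TABLE[obs]:
        return "synonymous"
    return "non-synonymous"


silent = classify_codon("CTT", "CTC")    # both codons encode leucine
changed = classify_codon("GAT", "GAA")   # aspartate versus glutamate
```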
  • a suitable forward primer comprises a sequence of at least about 5 contiguous nucleotides of the sequence of the non-template strand (also known as the plus or sense strand) of the HBV genome.
  • the forward primer binds to the template strand of the HBV genome (also known as the minus or antisense strand).
  • a suitable reverse primer for use in PCR reactions described herein comprises at least about 5 contiguous nucleotides of the sequence of the template strand (also known as the minus or antisense strand) of the HBV genome.
  • the reverse primer binds to the non-template strand of the HBV genome (also known as the plus or sense strand).
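The strand relationships described for the forward and reverse primers can be illustrated with a short Python sketch. It assumes a linear stretch of plus-strand sequence (wrap-around on the circular genome is not handled), and the function names, toy sequence and default primer length are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_pair(plus_strand, fwd_start, rev_end, length=20):
    """Derive a primer pair from a plus-strand sequence (1-based positions).

    The forward primer copies the plus strand starting at fwd_start, so it
    anneals to the minus strand. The reverse primer is the reverse complement
    of the plus-strand segment ending at rev_end, so it anneals to the plus strand.
    """
    if length < 5:
        raise ValueError("primers of at least about 5 nucleotides are expected")
    forward = plus_strand[fwd_start - 1:fwd_start - 1 + length]
    reverse = reverse_complement(plus_strand[rev_end - length:rev_end])
    return forward, reverse


# Toy sequence for illustration only; not an HBV sequence.
toy_plus = "ATGCGTACCGGTTAGC"
fwd, rev = primer_pair(toy_plus, fwd_start=1, rev_end=16, length=5)
```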
  • under positive selection pressure refers to codons in the S ORF of HBV at which the rate of non-synonymous substitutions exceeds the rate expected by random processes.
  • Substitution rates of codon evolution can be determined using methods known in the art, including methods described herein in the examples. For example, substitution rates can be inferred by constructing phylogenies of S ORF sequences of HBV obtained from one or more subjects and then estimating the overall proportion of codons that are under positive selection pressure across the phylogenetic tree using methods such as Phylogenetic Analysis by Maximum Likelihood (PAML).
  • the terms "genotype”, “subgenotype” and “sub-subgenotype” refer to genetically distinct groups of the hepatitis B virus that have arisen during evolution of the virus.
  • the consensus, wild-type nucleotide sequences of the groups currently recognized in the art are described herein. The terms are intended to include any new genotypes, subgenotypes and sub-subgenotypes recognized in the future.
  • a "subject” refers to a vertebrate that is a mammal, for example, a human.
  • the phrase "liver complications of chronic HBV (CHB) infection” and related terms as used herein relates generally to any liver pathologies or conditions resulting from chronic HBV infection including, but not limited to, liver inflammation, liver damage, liver dysfunction, liver cirrhosis, liver cancer (for example, hepatocellular carcinoma), and liver failure.
  • wild-type as used herein with reference to genome sequences, S ORF sequences and codons therein and amino acid sequences for a given HBV genotype, subgenotype or sub-subgenotype refers to the recognised consensus sequence or codon for that HBV genotype, subgenotype or sub-subgenotype.
  • HBV is a DNA virus that belongs to the Hepadnaviridae family.
  • the circular genome of HBV is encased in a protein core and is partially double-stranded.
  • the genome consists of a non-coding negative (template) strand that is attached to the polymerase protein at its 5' end.
  • the terminal 3' nucleotides of the negative strand overlap with the 5' sequence.
  • the positive coding strand is usually 1700-2800 nucleotides in length and the 5' sequence of the positive strand bridges the gap between the 3' and 5' ends of the negative strand.
  • cccDNA covalently closed circular DNA
  • the cccDNA of genotype B and genotype C viruses is 3,215 nucleotides in length.
  • the cccDNA of genotype A viruses is 3,221 nucleotides in length.
  • cccDNA contains four overlapping open reading frames (ORF).
  • the C ORF encodes the core protein and the P25 precursor protein for HBeAg.
  • the X ORF is predicted to encode a transcriptional transactivator.
  • the S ORF encodes the 3 envelope proteins of the mature virion, and the P ORF encodes a protein with several functions related to replication of the HBV genome.
  • There is substantial variation in the sequence of HBV between individual chronically infected subjects. There are conserved patterns to this sequence variation that allow most HBVs to be classified into broad groups known as genotypes, subgenotypes and sub-subgenotypes.
  • nine genotypes of the hepatitis B virus have been defined in the human population. They are known as genotypes A, B, C, D, E, F, G, H and I, and they are distinguished by haplotypes of nucleotide substitutions in at least 8% of their nucleotide sequence. These genotypes differ in their geographical distribution throughout the world and also in their ability to cause disease. For example, the genotype C hepatitis B virus predominantly occurs in East Asia and the Pacific and is associated with the highest rates of progression to liver cirrhosis and liver cancer.
  • There is a second level of variation within each genotype known as subgenotypes. These subgenotypes also differ in their geographical distributions.
  • the subgenotypes of genotypes A, B and C of the hepatitis B virus that are well recognised are A1 and A2; B1, B2, B3, B4 and B5; and C1, C2, C3, C4, C5 and C6.
  • two sub-subgenotypes of subgenotype C3 have been described in Tonga, and they are known as sub-subgenotypes C3A and C3B.
  • a further source of variation in hepatitis B virus sequences is recombination of genetic material from two viruses of different genotypes that have infected the same patient. The most common of these are identified as genotype B/C, genotype C/D or genotype A/D. Lastly, the dominant sequence of a hepatitis B virus changes with time within patients as a result of selection pressure and genetic drift.
  • This intra-individual variation can include mutations of nucleotides that are included in the haplotypes that define genotypes, subgenotypes and sub-subgenotypes.
  • the consensus nucleotide sequences for the genome of HBV subgenotypes A1 and A2 are set out in SEQ ID NOs: 15 and 16, respectively.
  • the method of the invention comprises the steps of a) obtaining sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject based on an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, b) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status of the HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of the non-synonymous mutations in the S ORF region.
  • the sample obtained from the subject is processed for use in the method of the invention.
  • the sample comprises serum or plasma separated from a whole blood sample extracted from the subject.
  • the sample comprises fresh frozen tissue or formalin-fixed tissue.
  • the sample comprises whole blood, serum, plasma, or liver tissue.
  • the tissue sample is obtained during a biopsy or surgery.
  • HBV DNA may be extracted from the sample using any suitable method known in the art including, but not limited to, the methods described in the Examples. Suitable commercial kits for extracting viral DNA are widely available and well known in the art.
  • the method comprises selecting or designing an appropriate S ORF reference sequence. Recombination events between HBV genomes can result in mixed HBV genotypes and subgenotypes, and so it may be necessary to determine the subgenotype of the S ORF region specifically.
  • the method comprises providing information about the subgenotype of the S ORF of the HBV the subject is infected with, for example, information from an earlier test, and selecting an S ORF reference sequence corresponding to that subgenotype.
  • the method comprises comparing the sequence information about the S ORF region of the HBV from a sample obtained from the subject as described herein to one or more subgenotype S ORF consensus sequences to determine the subgenotype of the S ORF of the HBV the subject is infected with and selecting a corresponding S ORF reference sequence.
  • the method comprises performing one or more sequence alignments of the sequence information about at least a portion of the S ORF with one or more HBV genotype, subgenotype or sub-subgenotype S ORF consensus sequences.
  • the method comprises performing one or more pairwise sequence alignments of the sequence information about at least a portion of the S ORF with one or more HBV genotype, subgenotype or sub-subgenotype S ORF consensus sequences.
  • An example of such a method of determining S ORF subgenotype is described in Example 1.
  • the sequence alignment is performed using one to many pairwise alignments (using but not limited to BLAST, Smith-Waterman or Needleman-Wunsch based algorithms).
  • Software may be selected from a group of programs including but not limited to the Burrows-Wheeler aligner program, novoalign, bowtie, mrsFAST, Partek, SOAP, MAQ, Segemehl, SSHA and Stampy. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
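Choosing a subgenotype by pairwise alignment amounts to scoring the sample sequence against each consensus sequence and keeping the best-scoring one. The sketch below uses a plain Needleman-Wunsch global alignment score with arbitrary scoring parameters; the function names, the labels `toy_1` and `toy_2` and the short sequences are hypothetical placeholders rather than real HBV consensus sequences, and a production analysis would normally rely on one of the established aligners named above.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score, keeping one matrix row at a time."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, 1):
        cur = [i * gap]
        for j, cb in enumerate(b, 1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def best_matching_reference(query, consensus_by_label):
    """Return the label of the consensus sequence with the highest alignment score."""
    return max(consensus_by_label,
               key=lambda label: nw_score(query, consensus_by_label[label]))


# Toy placeholder sequences for illustration only.
toy_consensus = {"toy_1": "ATGGAGAACA", "toy_2": "ATGGAAAGCA"}
choice = best_matching_reference("ATGGAAAGCT", toy_consensus)
```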
  • the method comprises performing one or more multiple sequence alignments of the sequence information about at least a portion of the S ORF with one or more HBV genotype, subgenotype or sub-subgenotype S ORF consensus sequences.
  • Techniques for conducting multiple alignments include but are not limited to progressive alignment construction, iterative methods, consensus methods, hidden Markov models, phylogeny-aware models, motif finding or non-coding multiple sequence alignment.
  • Software that might be used includes but is not limited to Clustal, MAFFT, T-Coffee, PRRN/PRRP, CHAOS/DIALIGN, M-Coffee, MergeAlign, POA, SAM, HMMER, PRANK, PAGAN, MEME/MAST and EDNA.
  • the method comprises determining the subgenotype S ORF consensus sequence to which the sequence information preferentially aligns and selecting the preferentially aligned S ORF sequence as the S ORF reference sequence.
  • in some embodiments it is necessary to distinguish regions of the S ORF derived from different genotypes, subgenotypes or sub-subgenotypes in patients infected with recombinant viruses.
  • the method comprises constructing an S ORF reference sequence by identifying the genotype, subgenotype or sub-subgenotype of two or more regions of the S ORF of HBV the subject is infected with, selecting a region of an S ORF reference sequence disclosed herein corresponding to each region of differing subgenotype or sub-subgenotype and combining these regions to form a full length S ORF reference sequence tailored to that subject.
  • the method of the invention is suitable for determining the frequency of non-synonymous mutations in the S ORF of any subgenotype of HBV genotypes A, B or C for which an S ORF subgenotype sequence is available.
  • a suitable S ORF reference sequence for an HBV subgenotype may be generated by aligning two or more S ORF nucleotide sequences obtained from HBV identified as being of that subgenotype and generating a consensus sequence.
  • Alignment software suitable for generating consensus sequences is well known in the art, including software programs described herein.
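As a minimal sketch of consensus generation, assuming the input sequences have already been aligned to equal length, the most frequent character in each alignment column can be taken as the consensus base. The function name and toy sequences are hypothetical, and the tie-breaking rule (the character seen first in the column wins) is an arbitrary choice made for this illustration.

```python
from collections import Counter


def majority_consensus(aligned_sequences):
    """Return the per-column majority character of equal-length aligned sequences."""
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*aligned_sequences)
    )


# Toy aligned sequences for illustration only.
consensus = majority_consensus(["ATGGAG", "ATGGAA", "ACGGAG"])
```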
  • the one or more HBV S ORF reference sequences comprises one or more sequences of SEQ ID NOs: 1-14.
  • S ORF reference sequences for HBV subgenotypes A1 and A2 are provided by SEQ ID NOs: 1 and 2, respectively.
  • S ORF reference sequences for HBV subgenotypes B1, B2, B3, B4 and B5 are provided by SEQ ID NOs: 3-7, respectively.
  • S ORF reference sequences for subgenotypes Cl, C2, C3A, C3B, C4, C5 and C6 are provided by SEQ ID NOs: 8-14, respectively.
  • the S ORF reference sequence corresponding to the subgenotype of the S ORF region of the HBV the subject is infected with comprises a nucleotide sequence having at least about 90%, 95%, 97%, 98% or at least about 99% sequence identity to the S ORF reference sequence corresponding to that subgenotype described herein.
  • the S ORF reference sequence comprises sequence information about a portion of the S ORF region.
  • the S ORF reference sequence solely comprises sequence information about the two or more codons at which the frequency of non-synonymous mutations is to be determined for the genotype or subgenotype of the HBV the patient is infected with.
  • the S ORF reference sequence comprises two or more discrete sequences wherein each sequence corresponds to a codon at which the frequency of non-synonymous mutations is to be determined.
  • the reference sequence comprises a sequence of the S ORF region, or a portion thereof, wherein one or more nucleotides that do not correspond to the two or more codons at which the frequency of non-synonymous mutations is to be determined are designated "n".
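A reference sequence in which every nucleotide outside the codons of interest is designated "n" can be derived from a full reference by simple masking. The sketch below assumes 1-based codon numbering from the start of the S ORF; the function name and the toy sequence are hypothetical.

```python
def mask_reference(reference, codons_to_keep):
    """Replace every nucleotide outside the listed codons (1-based) with 'n'."""
    masked = ["n"] * len(reference)
    for codon in codons_to_keep:
        start = 3 * (codon - 1)
        # Copy the three nucleotides of this codon back into the masked sequence.
        masked[start:start + 3] = reference[start:start + 3]
    return "".join(masked)


# Toy sequence for illustration only: keep codons 2 and 4 of a 4-codon stretch.
masked_toy = mask_reference("ATGGATCTTAAA", {2, 4})
```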
  • the S ORF reference sequence comprises a sequence designed to correspond to two or more subgenotypes of HBV.
  • the S ORF reference sequence comprises a consensus sequence of two or more subgenotypes of HBV. If, for example, the S ORF reference sequence is designed to correspond to all known subgenotypes of a particular genotype of HBV, it may not be necessary to determine the subgenotype of the HBV the subject is infected with in order to carry out the method of the invention.
  • the S ORF reference sequence comprises a sequence that corresponds to two or more, three or more, four or more or five or more subgenotypes of genotype C.
  • the S ORF reference sequence comprises a sequence that corresponds to all known subgenotypes of HBV genotype C. In another embodiment the S ORF reference sequence comprises a sequence designed to correspond to all known genotypes and subgenotypes of the HBV.
  • Additional subgenotypes of HBV may be identified in the future.
  • S ORF reference sequences corresponding to these new subgenotypes can be generated as described above. As nucleotide sequences for new subgenotypes become available, these sequences can be included in a method described above to identify whether these subgenotypes are present in the sample obtained from the subject.
  • the method comprises a) determining the frequency of non-synonymous mutations in the S ORF region of the HBV the subject is infected with by a method comprising i) providing sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, ii) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, iii) comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and iv) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons; and b) determining the liver inflammation status of the subject or the genetic status of the HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of the non-synonymous mutations in the S ORF region.
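The comparison step, in which sequence information is compared with the reference to obtain a non-synonymous frequency at each codon of interest, can be sketched as follows. The sketch assumes that reads have already been aligned to the reference without indels and padded to the reference length, with `N` or `-` marking uncovered positions; the function name, that input convention and the toy data are assumptions, and the standard genetic code is used.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nonsynonymous_frequency(reference, aligned_reads, codon_positions):
    """Fraction of covering reads whose codon encodes a different amino acid.

    codon_positions are 1-based codon numbers. A codon that no read covers
    maps to None.
    """
    frequencies = {}
    for pos in codon_positions:
        start = 3 * (pos - 1)
        ref_aa = CODON_TABLE[reference[start:start + 3].upper()]
        covered = changed = 0
        for read in aligned_reads:
            codon = read[start:start + 3].upper()
            if codon not in CODON_TABLE:  # gap, N or truncated read
                continue
            covered += 1
            if CODON_TABLE[codon] != ref_aa:
                changed += 1
        frequencies[pos] = changed / covered if covered else None
    return frequencies


# Toy data for illustration only (reference encodes M, D, L).
toy_reference = "ATGGATCTT"
toy_reads = ["ATGGAACTT", "ATGGATCTC", "ATGGATCTT", "ATGNNNCTT"]
freqs = nonsynonymous_frequency(toy_reference, toy_reads, [2, 3])
```

Synonymous changes (such as CTT to CTC at codon 3 in the toy data) count towards coverage but not towards the non-synonymous frequency.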
  • the S ORF region of HBV genotype A comprises 1,200 nucleotides comprising nucleotides 2854-3221 and 1-832 of the circular HBV genome. Consensus nucleotide sequences for the S ORF of HBV subgenotypes A1 and A2 are set out in SEQ ID NOs: 1 and 2, respectively.
  • the S ORF region of HBV genotypes B and C comprises 1,200 nucleotides comprising nucleotides 2848-3215 and 1-832 of the circular HBV genome.
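Because the S ORF wraps around the origin of the circular genome, converting an S ORF nucleotide position into a genome coordinate needs modular arithmetic. The sketch below uses only the start positions and genome lengths stated in the text (2854 and 3,221 for genotype A; 2848 and 3,215 for genotypes B and C); the function and table names are hypothetical.

```python
# (first genome nucleotide of the S ORF, genome length) per genotype, as stated in the text.
S_ORF_LAYOUT = {"A": (2854, 3221), "B": (2848, 3215), "C": (2848, 3215)}
S_ORF_LENGTH = 1200


def s_orf_to_genome(s_orf_position, genotype):
    """Map a 1-based S ORF nucleotide position to a 1-based circular genome position."""
    if not 1 <= s_orf_position <= S_ORF_LENGTH:
        raise ValueError("S ORF positions run from 1 to 1200")
    start, genome_length = S_ORF_LAYOUT[genotype]
    # Shift to 0-based, wrap around the circular genome, shift back to 1-based.
    return (start + s_orf_position - 2) % genome_length + 1


first_a = s_orf_to_genome(1, "A")      # 2854
wrap_a = s_orf_to_genome(369, "A")     # first nucleotide after the origin
last_b = s_orf_to_genome(1200, "B")    # 832
```

The 368 nucleotides before the origin plus the 832 after it account for the 1,200 nucleotides of the S ORF in both layouts.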
  • Consensus nucleotide sequences for the S ORF of HBV subgenotypes B1, B2, B3, B4 and B5 are set out in SEQ ID NOs: 3-7, respectively.
  • Consensus nucleotide sequences for the S ORF of HBV subgenotypes C1, C2, C3A, C3B, C4, C5 and C6 are set out in SEQ ID NOs: 8-14, respectively.
  • the 1,200 nucleotide S ORF region comprises 400 codons.
  • the codons are numbered such that codon 1 comprises nucleotides 1-3, codon 2 comprises nucleotides 4-6 and so forth to codon 400 comprising nucleotides 1,198-1,200 of the S ORF.
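The codon numbering convention reduces to simple arithmetic: codon k spans S ORF nucleotides 3k-2 to 3k. A short sketch with hypothetical function names:

```python
def codon_to_nucleotides(codon):
    """Return the (first, last) 1-based S ORF nucleotide positions of a codon."""
    if not 1 <= codon <= 400:
        raise ValueError("S ORF codons run from 1 to 400")
    return 3 * codon - 2, 3 * codon


def nucleotide_to_codon(nucleotide):
    """Return the 1-based codon number containing an S ORF nucleotide position."""
    if not 1 <= nucleotide <= 1200:
        raise ValueError("S ORF nucleotides run from 1 to 1200")
    return (nucleotide - 1) // 3 + 1


last_codon_span = codon_to_nucleotides(400)
```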
  • in an S ORF sequence comprising one or more indels, specific codons located after the indel are identified by reference to the relevant subgenotype S ORF reference sequence.
  • the method comprises generating sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject. In one embodiment the method comprises generating sequence information about the entire 1,200 nucleotides of the S ORF region. In another embodiment the method comprises generating sequence information about a portion of the S ORF region comprising two or more codons at which the frequency of non-synonymous mutations is to be determined.
  • the method comprises amplifying the S ORF region, or a portion thereof, to produce an amplimer.
  • the S ORF, or a portion thereof, is amplified using sequential nested PCR reactions or other amplification strategies known in the art to reduce mispriming or non-specific amplification, or to enhance specific amplification of the desired amplimer.
  • the method comprises performing two initial rounds of DNA synthesis using oligonucleotide primers wherein the 3' portion is comprised of a random nucleotide sequence and the 5' portion is a defined sequence.
  • An example of such a strategy is described in Bohlander et al., 2002, Genomics 13: 1322-1324.
  • an oligonucleotide primer comprising a 5' "defined” sequence comprising a sequence that is absent in HBV and human genomes and optimised for use in a PCR reaction according to criteria known to those skilled in the art and a 3' "random" sequence comprising 3 or more nucleotides selected at random is annealed to HBV DNA and extended using a polymerase (for example, a T7 DNA polymerase) resulting in first strand DNA synthesis. Following denaturation and re-annealing, a second strand synthesis is performed using the same or a similarly designed primer to produce a complementary second strand. This double-stranded DNA is then amplified in a PCR reaction using primers that anneal to the 5' defined sequences to amplify the S ORF region, or a portion thereof.
  • the method comprises generating sequence information about the amplimer using massive parallel sequencing technology (also known as second generation sequencing or next generation sequencing) in which multiple, spatially-separated DNA templates are sequenced simultaneously.
  • Any suitable massive parallel sequencing technology known in the art can be used.
  • the method comprises pyrosequencing, reversible terminator chemistry, sequencing by ligation or phospholinked fluorescent nucleotides.
  • Suitable platform technologies are well known in the art, and include, but are not limited to, Illumina, Ion Torrent, Roche, Life Technologies, Complete Genomics, Helicos Biosciences, GS FLX Titanium or Pacific Biosciences platforms.
  • amplimers are fragmented by sonication. After fragmentation, end-repair, adenylation of 3' ends, adaptor ligation and amplification of the adaptor ligated libraries are performed using the NEBNext Ultra library preparation reagents for Illumina platform (NEB). Individual barcodes are added at the time of library formation using the NEBNext Multiplex Oligos Set-I. The Illumina libraries are sequenced on a NextSeq500 instrument (Illumina Inc.) to obtain about 10 million paired-end (PE) reads (150bp x 2) from each sample.
  • the method comprises generating sequence information about the S ORF directly from HBV DNA extracted from a suitable sample obtained from the subject using massively parallel sequencing technology.
  • Any suitable extraction method known in the art that produces partial or complete HBV DNA suitable for sequencing can be used.
  • Any suitable massively parallel sequencing technology known in the art can be used including the methods described above.
  • the method comprises using DNA hybridisation to obtain nucleotide or amino acid sequence information about the HBV from the sample.
  • Suitable DNA hybridisation techniques include, but are not limited to, hybridisation of DNA, RNA or cDNA to DNA, cDNA or RNA probes attached either to a solid phase such as glass, plastic or silicon biochips or to microscopic polystyrene beads. Such techniques can be used to generate sequence information from partial or complete HBV DNA extracted from the sample or from an amplimer.
  • the method comprises generating sequence information about the S ORF using single molecule-based sequencing technology. Any suitable sequencing technology known in the art can be used. In various embodiments the method comprises sequencing by hybridisation or sequencing by synthesis. Suitable platform technologies are well known in the art, and include, but are not limited to, Ion Torrent, Life Technologies, or Pacific Biosciences platforms.
  • Suitable oligonucleotide primers can be designed using strategies and methods known in the art.
  • a first step is to identify a portion of the HBV genome comprising the S ORF to be amplified. The portion comprises all of the codons in the S ORF to be assessed.
  • the next step is to identify suitable primers to achieve the desired amplimer: a forward primer comprising 5 or more contiguous nucleotides at the 5' end of the portion to be amplified and a reverse primer comprising a sequence complementary to 5 or more contiguous nucleotides at the 3' end of the portion of the genome to be amplified.
  • Forward and reverse primers are chosen based on consideration of parameters including, but not limited to, primer length, primer melting temperature, primer annealing temperature, GC content and primer secondary structures.
  • primer design tools are available to assist primer design including, but not limited to, Primer Premier, Primer-BLAST, DNASTAR, Primer3 and Oligo Primer Analysis Software. It will be apparent to those skilled in the art that unique primer pairs may be required for specific HBV subgenotypes and that such primers can be designed using the methods outlined above.
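The primer parameters listed above (length, GC content, melting temperature) can be checked with a few lines of code. The following is a minimal Python sketch, not the design procedure used in the Examples: the function names and the example sequence are hypothetical, and the melting temperature uses the Wallace rule (2 °C per A/T, 4 °C per G/C), a rough approximation suited only to short oligonucleotides. Dedicated tools such as Primer3 use more accurate models.

```python
def gc_content(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Approximate melting temperature in degrees C (Wallace rule)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement, as needed when deriving a reverse primer."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


# Hypothetical 20-mer for illustration only (not a primer from the Examples).
example = "ATGCGTACCTGAAGCTTGCA"
print(len(example), gc_content(example), wallace_tm(example))
print(reverse_complement(example))
```

A candidate pair would typically be screened by comparing these values for the forward and reverse primers before checking secondary structure in a dedicated tool.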
  • the forward primer is derived from a region of the HBV genome comprising nucleotides 1500-3215 and nucleotides 1-60 (when the HBV is genotype B or C) or nucleotides 1500-3221 and nucleotides 1-60 (when the HBV is genotype A) and the reverse primer is derived from a region of the HBV genome comprising nucleotides 1-2000.
  • both primers anneal to a region of the HBV genome that falls outside the S ORF so that the PCR reaction amplifies the entire S ORF.
  • the forward primer is designed to anneal to a position that partially overlaps with or falls wholly within the S ORF region of the HBV genome and the reverse primer anneals to a region of the HBV genome that falls outside the S ORF region of the HBV genome, or vice versa.
  • both primers are designed to anneal to a position that partially overlaps with or falls wholly within the S ORF region of the HBV genome.
  • the forward primer is derived from a region of the HBV genome comprising nucleotides 1500-2848 of the HBV genome and the reverse primer is derived from a region of the S ORF comprising nucleotides 832-2000.
  • the primers are universal primers designed to bind and amplify a portion of the HBV genome of any HBV of any genotype or subgenotype.
  • Such universal primers can be designed by targeting highly conserved regions of the HBV genome of sufficient length to accommodate primers. Examples of such primers are provided in the Examples below. Due to the frequent and rapid mutation of HBV, it may be necessary to design a suite of one or more forward primers and one or more reverse primers designed for use with the specific genotypes or subgenotypes or for use as a backup if a first primer pair does not produce the desired amplimer.
  • the method of the invention comprises comparing the sequence information about a portion of the S ORF region of HBV in a sample obtained from the subject, for example, the reads obtained by next generation sequencing, and the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region. The frequency of non-synonymous mutations across the S ORF region is then determined based on the frequency of non-synonymous mutations at each codon.
  • sequence information is assembled into haplotypes followed by alignment of the assembled sequences (typically called contigs) to the S ORF reference sequence and analysis of genetic variation between individual contigs is performed.
  • the method comprises performing a bioinformatics analysis, for example, a sequence similarity analysis as described above and in the Examples, between the sequence information and the S ORF reference sequence to identify the presence or absence of non-synonymous mutations at each of two or more codons in the S ORF.
  • Software programs suitable for mutation identification are well known in the art and include, but are not limited to, SAMtools, GATK HaplotypeCaller, FreeBayes, GATK UnifiedGenotyper, SOAPsnp, VarScan, and Platypus.
  • the method comprises determining whether each detected mutation is a synonymous or non-synonymous mutation. This will be performed with reference to the genetic code, which is familiar to those with expertise in the art.
  • the method comprises determining the frequency of non-synonymous mutations at each of two or more codons in the S ORF.
  • Each sequence is individually aligned to the S ORF reference sequence followed by counting the number of all sequence reads from that subject that align to each codon within the S ORF. This will provide the denominator for non-synonymous mutation frequency calculations.
  • the frequency of non-synonymous mutations at each of two or more codons in the S ORF will be calculated as the number of non-synonymous mutations detected at each codon divided by the number of sequence reads at that codon.
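The classification and frequency calculation described above can be sketched in Python. This is an illustrative sketch under stated assumptions, not the pipeline used in the Examples: reads are assumed to be already aligned in frame and reduced to the codon observed at each S ORF codon position, the input layout and function names are hypothetical, and the standard genetic code is used.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def is_non_synonymous(ref_codon: str, read_codon: str) -> bool:
    """True if the read codon encodes a different amino acid than the reference."""
    return CODON_TABLE[read_codon] != CODON_TABLE[ref_codon]


def non_synonymous_frequency(ref_codons, reads_by_codon):
    """Per-codon frequency = non-synonymous reads / all reads covering that codon.

    ref_codons: list of reference codons (index 0 holds codon 1).
    reads_by_codon: dict mapping codon number -> list of codons seen in reads
                    (hypothetical input layout).
    """
    freqs = {}
    for pos, observed in reads_by_codon.items():
        ref = ref_codons[pos - 1]
        depth = len(observed)  # denominator: reads aligned to this codon
        if depth == 0:
            continue  # no coverage, so no frequency can be computed
        mutated = sum(is_non_synonymous(ref, codon) for codon in observed)
        freqs[pos] = mutated / depth
    return freqs


# Hypothetical two-codon reference and four reads per codon.
reference = ["ATG", "GAG"]
reads = {1: ["ATG", "ATG", "ATA", "ATG"], 2: ["GAG", "GAA", "GAT", "GAG"]}
print(non_synonymous_frequency(reference, reads))
```

In the example, GAA is synonymous with GAG (both glutamate) and is therefore not counted, whereas GAT (aspartate) is.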
  • the method comprises determining a numerical score representing the frequency of non-synonymous mutations across the S ORF. In one embodiment the method comprises assigning a category or grade based on the frequency of non-synonymous mutations across the S ORF.
  • the method comprises assigning a numerical score for each codon based on the frequency of non-synonymous mutations at that codon.
  • the numerical score is assigned based on a threshold frequency of non-synonymous mutations at each codon. For example, the method comprises assigning a score of 0 if the frequency of non-synonymous mutations at that codon is less than the threshold frequency, and assigning a positive score if the frequency of non-synonymous mutations at that codon is equal to or greater than the threshold frequency.
  • the method comprises assigning a category or grade defined by a threshold frequency of non-synonymous mutations at each codon.
  • the positive score assigned at each codon may be weighted, for example, a higher score may be assigned to codons under the strongest positive selection pressure.
  • the numerical score assigned at each codon is combined to provide a combined numerical score representing the frequency of non-synonymous mutations in the S ORF region of the HBV.
  • the category or grade assigned at each codon is combined to provide a combined grade representing the frequency of non-synonymous mutations in the S ORF region of the HBV.
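The threshold-based scoring described above can be sketched in Python. This is a minimal sketch rather than a definitive implementation: the function name, the default positive score of 1 per codon, the optional weights and the example frequencies are all assumptions made for illustration, and the threshold is left as a parameter.

```python
def combined_score(freqs, threshold, weights=None):
    """Sum per-codon scores: 0 below the threshold, a positive score otherwise.

    freqs: dict mapping codon number -> non-synonymous mutation frequency.
    threshold: frequency at or above which a codon scores positively.
    weights: optional dict of codon number -> positive score (default 1),
             e.g. larger values for codons under stronger positive selection.
    """
    weights = weights or {}
    score = 0
    for codon, freq in freqs.items():
        if freq >= threshold:
            score += weights.get(codon, 1)
    return score


# Hypothetical frequencies for three codons (illustration only).
freqs = {4: 0.02, 35: 0.15, 51: 0.40}
print(combined_score(freqs, threshold=0.10))                   # unweighted
print(combined_score(freqs, threshold=0.10, weights={51: 2}))  # codon 51 weighted
```

Keeping the weights separate from the threshold makes it possible to compare an unweighted count with a weighted score on the same frequencies.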
  • the method comprises detecting the frequency of non-synonymous mutations at each codon by amplification refractory mutation system (ARMS) assay.
  • In an ARMS assay, amplification by PCR is performed using primers comprising 3' terminal sequences that are designed to distinguish between the bases present at each codon in the S ORF reference sequence and non-synonymous mutations at that codon present in the S ORF of the HBV in a sample obtained from the subject.
  • PCR primers for use in ARMS are designed based on the S ORF reference corresponding to the subgenotype of the S ORF region of the HBV the subject is infected with.
  • a PCR primer is designed for use in a PCR reaction to specifically amplify at least a portion of the S ORF comprising each codon only when the wildtype codon is present, i.e. the nucleotide sequence of the codon in the S ORF reference sequence for the subgenotype of the S ORF of the HBV the subject is infected with.
  • primers are designed for use in PCR reactions to specifically amplify at least a portion of the S ORF comprising each codon when each possible non-synonymous mutation at that codon is present.
  • the amplimers of each reaction are subjected to agarose gel electrophoresis and the bands corresponding to each amplimer are detected and quantified to determine the frequency of non-synonymous mutations using methods known in the art.
  • the method comprises detecting the presence or absence of amino acid substitutions in the protein sequence of the S ORF using mass spectrometry, labelled antibodies, fluorescent-labelled probes, metal-labelled probes or any other molecules that distinguish wild-type and mutated amino acids.
  • the wild-type amino acids at each position are those amino acids coded for by the codons at the corresponding positions in the S ORF reference sequence.
  • the labelled antibodies, probes or other molecules are detected by flow cytometry or by cytometry by time of flight (CyTOF) analysis.
  • the presence of non-synonymous mutations in the nucleotide or protein sequence of the S ORF will be detected by analysis of transfected, transduced or infected cells in in vitro culture or by any analysis of the supernatants from in vitro cultures.
  • the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at each codon can be determined using any method described herein.
  • CHB Chronic hepatitis B
  • liver complications including liver inflammation, cirrhosis and liver cancer.
  • Patients with CHB may be asymptomatic for many years and only present with clinical symptoms at an advanced stage of disease.
  • the method of the invention is suitable for screening any subject infected with HBV genotype A, B or C.
  • the method is particularly contemplated for screening subjects with chronic HBV who do not exhibit clinical symptoms of, or who have not been diagnosed with, liver complications of HBV, for example, subjects considered to be inactive healthy carriers (IHC) of HBV.
  • the subject tests negative for hepatitis B e antigen (HBeAg), has normal serum ALT levels, has a serum HBV-DNA titre of greater than about 2,000 IU/ml, or any combination thereof.
  • HBeAg negativity usually indicates that a patient has a low risk of developing liver inflammation and cirrhosis, and most patients are regarded as being inactive healthy carriers (IHC).
  • a sub-group of HBeAg-negative patients with a normal baseline ALT level and HBV-DNA levels greater than 2,000 IU/ml can develop a form of chronic liver inflammation known as HBeAg-negative chronic hepatitis B (e-CHB), which is associated with rapid progression to liver cirrhosis.
  • the frequency of non-synonymous mutations in the S ORF region of HBV is used to determine the liver inflammation status of the subject or the genetic status of the HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of non-synonymous mutations in the S ORF region.
  • the frequency of non-synonymous mutations in the S ORF region of the HBV indicates that the subject is suffering from liver inflammation, for example, early stage liver inflammation with no clinical symptoms.
  • the method is used to determine the genetic status of the HBV the patient is infected with.
  • the frequency of non-synonymous mutations indicates that the HBV is under positive selection pressure.
  • the frequency of non-synonymous mutations in the S ORF region of the HBV indicates that the subject is susceptible to or at risk of developing liver inflammation or one or more liver complications of chronic HBV infection.
  • the liver complications of chronic HBV infection comprise liver cirrhosis, liver cancer, liver failure, liver inflammation, liver damage, liver dysfunction, or a combination of any two or more thereof.
  • the subject is HBeAg-positive or has elevated serum ALT levels or is both HBeAg-positive and has elevated serum ALT levels.
  • the method is used to confirm that the subject is suffering from liver inflammation, for example, early stage liver inflammation.
  • the frequency of non-synonymous mutations in the S ORF region indicates that the subject is susceptible to or at risk of developing liver inflammation or liver complications within 1 year, or 2, 3, 4, 5, 6, 7, 8, 9 or 10 years.
  • a numerical score representing the frequency of non-synonymous mutations in the S ORF region is used to determine clinical outcome. For example, a numerical score greater than or equal to a predetermined threshold score indicates that the HBV is under positive selection pressure or that the subject is susceptible to or at risk of developing liver inflammation or liver complications of chronic HBV infection.
  • a numerical score of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more indicates that the HBV is under positive selection pressure or that the subject is susceptible to or at risk of developing liver inflammation or liver complications of chronic HBV infection.
  • the frequency of non-synonymous mutations in the S ORF is used to determine that a subject should undergo further clinical testing, for example a liver scan or biopsy to identify liver complications of HBV.
  • the frequency of non-synonymous mutations is used to determine a clinical screening protocol for that patient, for example, to determine a frequency of testing for serum ALT, HBeAg, HBV DNA levels.
  • the frequency of non-synonymous mutations is used to recommend that the method of the invention should be repeated within a certain timeframe.
  • the frequency of non-synonymous mutations in the S ORF region is used to monitor the response to therapy, predict the response to a therapy, select a therapy, or determine the optimal timing, duration or regimen of a therapy in the subject.
  • Immunotherapy includes any therapy that alters or enhances the immune response of a subject to a pathogen, in particular, a virus such as HBV.
  • immunotherapy particularly contemplates treatment or therapy administered to a subject with a chronic HBV infection.
  • the frequency of non-synonymous mutations in the S ORF region is used to monitor the response to immunotherapy, predict the response to an immunotherapy, select a therapy, or determine the optimal timing, duration or regimen of a therapy in the subject.
  • the frequency of non-synonymous mutations in the S ORF region is determined in the subject at a regular interval. In one embodiment an increase in the frequency of non-synonymous mutations in the S ORF region is indicative of an increased risk of developing liver inflammation or liver complications of chronic HBV or that the subject should commence a particular immunotherapy or other therapy regime.
  • This example describes cloning and analysis of the HBV S ORF from a first patient cohort infected with HBV.
  • the New Zealand Hepatitis B Screening Programme monitors 1,439 Tongan adults with a chronic HBV infection at 6-12 monthly intervals. 345 of these subjects were recruited into this study at the time of a routine blood test. An extra 3 mL serum was taken for HBV-DNA analyses, and genomic DNA was extracted from peripheral blood leucocytes for HLA class I genotyping. All subjects gave written consent, and the study was approved by the Northern X Regional Ethics Committee of the New Zealand Ministry of Health.
  • HBV-DNA was extracted from 300 µl of serum using the High Pure Viral Nucleic Acid kit (Roche Diagnostics, Indianapolis, USA). There are both between-subject and within-subject differences in the sequence of the HBV in the Tongan population.
  • the first step was to obtain a consensus sequence of base pairs 1700-2411.
  • the primers used for amplifying this sequence were identified by finding conserved sequences in a Multalin (http://multalin.toulouse.inra.fr/multalin/multalin.html) alignment of 54 genotype C and genotype D sequences obtained from the NCBI nucleotide database (http://www.ncbi.nlm.nih.gov/sites/entrez). From the alignment, three forward primers were determined which all match two reverse primers (all shown in Table 1). Tm values were calculated using Oligo Primer Analysis Software, version 7.58 (Molecular Biology Insights Inc, CO).
  • Base numbers correspond to a genotype C3 sequence (Genbank number X75656).
  • the HBV genome was cloned in two fragments.
  • a 2.6 kb clone that included the S ORF, the terminal 2455 bp of the P ORF and the initial 403 bp of the X ORF was amplified using primers that are complementary to conserved sequences in the 1700-2411 bp sequence described above and were designed to amplify the minus strand of the HBV.
  • conserved sequences were identified by visual inspection of the 47 1700-2411 bp sequences in the Multalin sequence alignment program.
  • the upper primer for the first PCR reaction was either Forward 4 or Forward 5 and the lower primer was either Reverse 3 or Reverse 4 (Table 2).
  • PCR cycling conditions for each reaction were determined using Oligo Primer Analysis Software, Version 6 (Molecular Biology Insights; www.oligo.net).
  • the PCR strategy for all PCR reactions described above was a 3 stage touchdown protocol.
  • the denaturation temperature was 96°C for 10 seconds for the first stage, and 2°C above the Tm of the product for 10 seconds for the last two stages.
  • the first and second stage annealing temperatures were set at the Tm of the primers (with a maximum of 72°C) for 15 seconds, decreasing at 0.2°C and 0.5°C per cycle for 8 cycles and 4 cycles for the first and second stages respectively.
  • the third stage annealing temperature was set at the optimal value calculated by the Oligo Primer Analysis Software for 30 cycles.
  • the extension temperature was 72°C, and the extension time was 50 to 60 seconds per kilobase of product. All amplifications of HBV-DNA were performed using Accuprime Taq DNA Polymerase High Fidelity (Invitrogen Life Technologies, Carlsbad, CA).
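The annealing schedule of the 3 stage touchdown protocol described above can be written out cycle by cycle. The Python sketch below is illustrative only and rests on two assumptions not stated in the text: that the first cycle anneals at the primer Tm itself, and that the second stage continues from the temperature reached at the end of the first stage. The function name and the example temperatures are hypothetical.

```python
def touchdown_annealing(primer_tm, optimal_anneal, max_start=72.0):
    """Return the annealing temperature (degrees C) for each PCR cycle."""
    temp = min(primer_tm, max_start)  # start at the primer Tm, capped at 72
    schedule = []
    # (decrement per cycle, number of cycles) for stages 1 and 2
    for step, cycles in ((0.2, 8), (0.5, 4)):
        for _ in range(cycles):
            schedule.append(round(temp, 1))
            temp -= step
    # Stage 3: fixed annealing temperature for 30 cycles
    schedule.extend([optimal_anneal] * 30)
    return schedule


# Hypothetical primer Tm and stage 3 annealing temperature.
temps = touchdown_annealing(primer_tm=65.0, optimal_anneal=58.0)
print(len(temps), temps[:12])
```

Under these assumptions the protocol runs for 42 cycles in total (8 + 4 + 30); denaturation and extension steps are not modelled here.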
  • Sanger sequencing reactions contained 20-40 ng of cloned DNA and 1 µl of ABI Prism BigDye Terminator v3.1 cycle sequencing mix (Applied Biosystems, Foster City, CA). The sequencing reactions were performed using annealing temperatures of either 56°C or 50.5°C, depending on the Tm of the sequencing primer. The sequencing reaction products were purified by magnetic beads (CleanSEQ, Agencourt Biosciences Corp., Beverly, MA), and analysed on an ABI Prism genetic analyser 3130XL fitted with a 50 cm capillary array using POP7 polymer. Sequence analysis and contig assembly were performed using Chromas 1.61 (Technelysium Pty Ltd, Queensland, Australia).
  • genotype of each HBV sequence was determined by aligning the sequence of the amplimer produced by any one of the primer pairs against a panel of sequences obtained from the NCBI nucleotide database (Genbank Accession and Version in parentheses) : genotype A (AF297623.1, AY128092.1, AB126580.1), genotype B
  • genotype D (AB126581.1, AB104711.1, AB104712.1), genotype E (AB091256.1, AB091255.1), genotype F (AB036920.1, AY179734.1, AY179735.1), genotype G (AB064315.1, AF405706.1) and genotype H (AY090460.1, AY090457.1).
  • the first amplimers sequenced (containing bases 1700-2411, see above) contained a haplotype of 52 bases that distinguish genotype C from genotype D in the Tongan population, and this sequence was used to exclude genotype D subjects.
  • the possibility that any subject had a virus of mixed genotype was excluded by aligning the sequences of the 2.6kb clones with the panel of sequences above. No Tongan subjects with a virus comprising a mixture of genotype C and genotype D sequences were identified.
  • HBV genotype C from 19 subjects, with a range of 5 to 12 clones per subject. Clones containing indels and/or nonsense mutations were excluded from further analysis.
  • This example describes cloning and analysis of the HBV genome obtained from a second patient cohort infected with HBV.
  • the New Zealand Hepatitis B Screening Programme monitors over 10,000 adults with a chronic HBV infection at 6-12 monthly intervals. 250 of these subjects who were HBeAg-negative and with a normal ALT level when they entered the screening program were recruited independently of age, ethnicity or gender into this study at the time of a routine blood test. All subjects gave written consent for their initial screening serum sample to be used for the study, which was approved by the Northern X Regional Ethics Committee of the New Zealand Ministry of Health. HBV-DNA levels were measured using the COBAS Ampliprep/COBAS Taqman HBV Test, version 2.0, which has a linear range of 1.30 to 8.23 log10 IU/ml. This study included 12 subjects with HBV-DNA >2,000 IU/ml and 5 subjects with HBV-DNA <2,000 IU/ml who were all genotype C.
  • HBV DNA was extracted from 500 µl of serum using the QIAamp UltraSens Virus Kit (QIAGEN GmbH, Hilden, Germany). The S open reading frame of the HBV was amplified in sequential nested PCR reactions. All PCR reactions used 5 µl of template and Phusion High-Fidelity DNA Polymerase (Finnzymes, Espoo, Finland). The optimal annealing temperatures were determined from the manufacturer's website (www.finnzymes.com).
  • An initial PCR reaction was performed using either Primer Forward 4 or Primer Forward 5 with Reverse Primer 4 listed in Table 2 above.
  • a 3 stage touchdown protocol was used.
  • the denaturation temperature was 98°C for 10 seconds.
  • the first and second stage annealing temperatures were initiated at the Tm of the primers (with a maximum of 72°C) for 10 seconds, decreasing at 0.2°C and 0.5°C per cycle for 8 cycles and 4 cycles for the first and second stages respectively.
  • the third stage annealing temperature was set at the optimal value calculated by the manufacturer's website for 25 cycles.
  • the extension temperature was 72°C, and the extension time was 30 seconds per kilobase of product.
  • a nested PCR was performed on the product of the initial PCR.
  • the nested PCR used HF buffer and a primer pair comprising the forward primer and one of the reverse primers listed in Table 5 below.
  • a 3 stage touchdown protocol was used as described above for the initial PCR.
  • genotype of each HBV sequence was determined by aligning the sequence produced by any one of the primer pairs against a panel of HBV sequences as described above for Example 1.
  • Table 6 summarises the PAML analysis of 298 S ORF clones obtained from the 36 subjects in Example 1 and Example 2. It lists all codons for which ω was > 2.0, identifying the codons with strongest positive selection pressure. Positive selection pressure in codons marked with asterisks (* or **) has reached a conventional level (Pr>0.95) of two-tailed statistical significance.
  • Table 6: Results of PAML analysis of HBV genotype C clones
  • the sensitivity (or true positive rate) is calculated as the number of subjects with greater than or equal to that score who developed liver disease divided by the total number of subjects who developed liver disease.
  • the false positive rate, presented in the ROC curve as 1-specificity, is calculated as the number of subjects with greater than or equal to that score who did not develop liver disease divided by the total number of subjects who did not develop liver disease. Since all subjects had a score of greater than or equal to zero, this score is 100% sensitive and has a 100% false positive rate. A score of greater than or equal to 2 is 100% sensitive and has a false positive rate of 38%. A score of greater than or equal to 5 is 75% sensitive with a false positive rate of 0%.
  • The survival curve shown in Figure 2 compares the 10 year follow up outcome of 9 subjects with a score of greater than or equal to 3 (group 1) with the outcome of 7 subjects with a score of less than or equal to 2 (group 0).
  • the x axis shows the time to development of liver inflammation in 7 subjects in group 1 and 1 subject in group 0; and the duration of follow up in the 8 subjects who continued to have inactive disease for over 10 years is also shown (+) as the censored observations.
  • the time to development of liver disease in the single subject from group 0 (with a score of 2) was longer than in any subject from group 1.
  • Both the subjects in group 1 who did not develop liver disease had scores of 3.
  • one of these subjects had a C subgenotype that did not fit any currently known subgenotype and thus the score for this subject may not have been a reflection of immune-mediated positive selection.
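The sensitivity and false positive rate calculations described above for the ROC analysis can be sketched in Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the two score lists are invented for illustration (they are not the study data), chosen so that the resulting rates mirror those quoted above.

```python
def roc_point(scores_disease, scores_no_disease, cutoff):
    """Sensitivity and false positive rate for the rule 'score >= cutoff'."""
    true_pos = sum(s >= cutoff for s in scores_disease)
    false_pos = sum(s >= cutoff for s in scores_no_disease)
    sensitivity = true_pos / len(scores_disease)
    false_positive_rate = false_pos / len(scores_no_disease)
    return sensitivity, false_positive_rate


def lowest_cutoff_without_false_positives(scores_disease, scores_no_disease):
    """Lowest observed score whose false positive rate is 0, or None."""
    for cutoff in sorted(set(scores_disease) | set(scores_no_disease)):
        _, fpr = roc_point(scores_disease, scores_no_disease, cutoff)
        if fpr == 0:
            return cutoff
    return None


# Invented scores for 8 subjects who developed liver disease and 8 who did not.
developed = [2, 3, 5, 5, 6, 6, 7, 8]
did_not = [0, 0, 1, 1, 1, 2, 3, 3]
for cutoff in (0, 2, 5):
    print(cutoff, roc_point(developed, did_not, cutoff))
print(lowest_cutoff_without_false_positives(developed, did_not))
```

Evaluating every observed score as a candidate cutoff yields the points of the ROC curve; the second function picks out the lowest cutoff at which no subject without disease is flagged.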
  • NGS Next Generation Sequencing
  • NGS sequences were aligned with Burrows-Wheeler Aligner to a panel of reference genotype C subgenotype sequences for subgenotypes C1, C2, C3a, C3b and C4 (SEQ ID Nos: 22-26). Alignments were analysed with SAMtools program and a custom filter to identify which of the positively-selected codons identified in Example 3 (viz codons 4, 35, 51, 54, 56, 73, 81, 84, 120, 132, 135,
  • restriction digestion was performed with Sau96I following the manufacturer's instructions (New England Biolabs, #R0165).
  • This example demonstrates the utility of next generation sequencing for detecting non-synonymous mutations in the S ORF of HBV in a method of the invention.
  • the definition of active hepatitis B virus induced liver disease for the purpose of this investigation is a subject with a chronic hepatitis B virus infection who has been deemed to have developed at least one of liver inflammation requiring treatment, liver cirrhosis, liver failure or liver cancer by his or her physician.
  • HBV DNA was extracted from sera with HBV-DNA> 2,000 IU/ml using the Qiagen Ultrasens viral extraction kit.
  • genotype and subgenotype of the virus from each subject was determined by aligning the consensus sequence of the S ORF derived from the next generation sequence with the reference sequences of HBV subgenotypes, A1, A2, B1, B2,
  • a score for each subject was calculated as the number of positively-selected codons having a frequency of non-synonymous mutations greater than 0.10.
  • ROC analyses identify the positively-selected codons that are most strongly predictive of the development of active hepatitis B virus-induced liver disease.
  • the purpose of this example is to assess the utility of the method of the invention for predicting the development of active, hepatitis B virus-induced liver disease utilising the optimal combination of positively selected codons identified in Example 6 in a new cohort of genotype C, HBeAg-negative chronic hepatitis B virus subjects with normal serum ALT levels.
  • results determine the lowest subject score that identifies subjects with active hepatitis B virus-induced liver disease with a false positive rate of 0%. It is recommended that any subject with a score greater than or equal to this number be started on antiviral therapy for the purpose of preventing or reversing active hepatitis B virus-induced liver disease.
  • results are used to determine the lowest score in any subject who went on to develop active liver disease. It is recommended that any subject with a score lower than this number continue participation in the screening program, primarily for the purpose of early identification of liver cancers that can occur in the absence of liver cirrhosis. These subjects should not be given anti-viral therapy.
  • liver fibroscan, liver ultrasound, liver biopsy. It is recommended that this assay be repeated in these patients in 5-10 years' time.
  • the purpose of this example is to identify the S open reading frame codons in the genotype B hepatitis B virus that are under positive selection pressure in subjects who are HBeAg-negative and who have normal serum ALT levels.
  • the utility of the frequency of non-synonymous mutations in these codons to predict active, hepatitis B virus-induced liver disease is assessed.
  • the S ORF of the hepatitis B virus is cloned and sequenced for 50 sera as in Example 2 above. 300 clones are sequenced from the 50 sera.
  • Phylogenetic analysis of the 300 HBV clones to identify codons under positive selection pressure is performed as described in Example 3 above.
  • the S ORF is amplified from the second 50 of these sera and subjected to next generation sequencing as described above.
  • An ROC analysis is performed to confirm that counting non-synonymous mutations at the positively-selected codons identified in Stage 1 is predictive of the development of active, hepatitis B virus-induced liver disease, using the methods described in Example 7 above.
  • Stage 1 identifies codons in the genotype B HBV S ORF that are under positive selection pressure.
  • Stage 2 determines whether the frequency of non-synonymous mutations at the codons identified at Stage 1 predicts whether a patient will develop either active or inactive liver disease over the following 5 years.
  • the ROC analysis determines which patient score predicts the development of active liver disease with a false positive rate of 0% as in Example 4 above.
  • the present invention provides a method of determining the liver inflammation status of or susceptibility or increased risk of liver complication of CHB in patients infected with HBV genotypes A, B or C.
  • the present invention has applicability in medical, clinical and laboratory research applications.

Abstract

Methods of determining liver inflammation status or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic hepatitis B virus (HBV) infection in subjects infected with HBV genotypes A, B or C based on the frequency of non-synonymous mutations in the S ORF region of the HBV genome and related systems, kits, measurement tools, agents, devices, computer readable media, oligonucleotide primer pairs and apparatus and uses thereof.

Description

METHOD OF ANALYSIS OF MUTATIONS IN THE HEPATITIS B VIRUS AND USES THEREOF
FIELD OF THE INVENTION
[0001] The present invention relates to methods for determining the liver inflammation status of, determining the genetic status of a hepatitis B virus (HBV) in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with HBV genotype A, genotype B or genotype C, and related systems, kits, uses, devices, computer-readable media and apparatus.
BACKGROUND TO THE INVENTION
[0002] Chronic hepatitis B (CHB) is a form of chronic liver inflammation caused by high levels of replication of the hepatitis B virus (HBV) in the liver cells of patients infected with the hepatitis B virus. If untreated, CHB leads to liver complications, including inflammation, cirrhosis, liver failure and liver cancer. Modern anti-viral treatments that suppress replication of the hepatitis B virus in patients with active liver inflammation result in lower rates of liver cirrhosis, liver failure and liver cancer. These treatments can also reverse pre-existing liver damage.
[0003] HBV-infected patients are monitored by regular blood testing to measure serum alanine aminotransferase (ALT), a liver enzyme that is released into the blood when liver damage occurs, indicating liver inflammation. Commencement of anti-viral therapy in patients with elevated serum ALT substantially reduces the future risk of liver complications. However, for a subset of patients, liver inflammation and CHB develop with no increase in serum ALT levels. Consequently, these patients present with late stage disease that is often untreatable.
[0004] There is a continuing need for an alternative laboratory test to detect or monitor liver inflammation in HBV-infected patients, identify susceptibility to or detect a risk of developing liver inflammation or liver complications of chronic HBV infection in HBV-infected patients, and/or inform clinical decision-making about anti-viral therapy in HBV-infected patients.
[0005] It is an object of the present invention to meet one or more of these needs, or to at least provide the public with a useful choice.
SUMMARY OF THE INVENTION
[0006] In one aspect the invention relates to a method of determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S open reading frame (S ORF) of HBV genotype A, B or C, the method comprising a) obtaining sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, based on a S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region; and b) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status of the HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of the non-synonymous mutations in the S ORF region.
[0007] In another aspect the invention relates to a system for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the system comprising a) a measurement tool that analyses sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, b) a processor, c) a computer readable medium, and d) an analysis tool stored on the computer-readable medium that is adapted to be executed on the processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status of HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of non-synonymous mutations in the S ORF region; the analysis of step (a) or the determination of step (d) being based on a S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with.
[0008] In one embodiment the system comprises a measurement tool comprising one or more oligonucleotide primers that target or are based on a S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with for use in determining the frequency of non-synonymous mutations at each of two or more codons in the S ORF region.
[0009] In one embodiment the system comprises a measurement tool comprising one or more oligonucleotide primers that target a S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with for use in determining the frequency of non-synonymous mutations at each of two or more codons in the S ORF region.
[0010] In one embodiment the system comprises a measurement tool comprising one or more oligonucleotide primers based on a S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with for use in determining the frequency of non-synonymous mutations at each of two or more codons in the S ORF region.
[0011] In an alternative embodiment the system comprises a measurement tool that generates sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject and the analysis tool is adapted to select an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with and compare sequence information generated by the measurement tool to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region.
[0012] In another aspect the invention relates to a method of determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the method comprising determining the frequency of non-synonymous mutations in the S ORF region of the HBV by a method comprising a) providing sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, b) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, c) comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and d) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status of the HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of the non-synonymous mutations in the S ORF region.
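
Steps a) to d) of the preceding paragraph can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the implementation used by the applicant: the function names are hypothetical, the sequences are assumed to be in-frame, gap-free and already aligned to the selected subgenotype reference, and combining the per-codon frequencies by summing them is an assumption, since the paragraph does not state how they are combined.

```python
# Standard genetic code, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def nonsynonymous_frequencies(reference, reads, codons):
    """Per-codon fraction of reads whose amino acid differs from the reference.

    `codons` holds 1-based codon numbers within the aligned region.
    """
    freqs = {}
    for codon in codons:
        start = 3 * (codon - 1)
        ref_aa = CODON_TABLE[reference[start:start + 3]]
        # Skip reads with an ambiguous or truncated codon at this position.
        called = [CODON_TABLE[r[start:start + 3]] for r in reads
                  if r[start:start + 3] in CODON_TABLE]
        changed = sum(1 for aa in called if aa != ref_aa)
        freqs[codon] = changed / len(called) if called else 0.0
    return freqs


def s_orf_score(reference, reads, codons):
    """Hypothetical summary score: the sum of the per-codon frequencies."""
    return sum(nonsynonymous_frequencies(reference, reads, codons).values())


# Toy sequences for illustration only; these are not HBV sequences.
reference = "ATGGAGAAC"
reads = ["ATGGAGAAC", "ATGGAAAAC", "ATGGACAAC", "ATGGAGAAA"]
print(nonsynonymous_frequencies(reference, reads, [2, 3]))
```

In the toy data the second read carries a synonymous change at codon 2 (GAG to GAA, both glutamate) and is therefore not counted, while the third and fourth reads each carry one amino acid change.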
[0013] In a further aspect the invention relates to a system for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the system comprising a) a measurement tool that generates sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, b) a processor, c) a computer readable medium, and d) an analysis tool stored on the computer-readable medium that is adapted to be executed on the processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV by i) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, ii) comparing the sequence information generated by the measurement tool to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, iii) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status of HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of non-synonymous mutations in the S ORF region.
[0014] In a further aspect the invention relates to a kit for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the kit comprising a) an oligonucleotide primer pair for amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to produce an amplimer, the primer pair comprising i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C or nucleotides 1500-3221 and nucleotides 1-60 when the HBV is genotype A; and ii) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 100-2000, b) amplification reagents, c) an analysis tool for execution on a processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV by i) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, ii) comparing sequence information generated by a measurement tool about at least a portion of the S ORF region present in the amplimer to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, iii) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status of HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of non-synonymous mutations in the S ORF region, or d) any combination of any two or more of a) to c).
[0015] In a further aspect the invention relates to a kit for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the kit comprising a) an oligonucleotide primer pair for amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to produce an amplimer, the primer pair comprising i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C or nucleotides 1500-3221 and nucleotides 1-60 when the HBV is genotype A; and ii) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 100-2000, b) amplification reagents, c) one or more nucleotide probes that hybridise to specific nucleotide sequences in the S ORF region of HBV for generating sequence information about at least a portion of the S ORF region of HBV present in a sample obtained from a subject, d) an analysis tool for execution on a processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV by i) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, ii) comparing sequence information about at least a portion of the S ORF region present in the amplimer generated by a measurement tool or the sequence information generated by the one or more nucleotide probes to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, iii) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status of HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of non-synonymous mutations in the S ORF region, or e) any combination of any two or more of a) to d).
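
The primer-pair definition in the kits above can be checked mechanically. The sketch below is illustrative only and rests on assumptions: the genome reference is taken to be a plain string whose first character is nucleotide 1, the forward region is built by joining nucleotides 1500 to the end of the genome with nucleotides 1-60 (which amounts to reading through the origin of the circular genome), and all function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def forward_primer_region(genome, genotype):
    """Nucleotides 1500-3221 (genotype A) or 1500-3215 (B or C), then 1-60."""
    end = 3221 if genotype == "A" else 3215
    return genome[1499:end] + genome[:60]


def reverse_primer_region(genome):
    """Nucleotides 100-2000 of the genome reference sequence."""
    return genome[99:2000]


def shares_run(primer, region, k=5):
    """True if the primer contains k contiguous nucleotides of the region."""
    return any(primer[i:i + k] in region for i in range(len(primer) - k + 1))


def primer_pair_ok(forward, reverse, genome, genotype, k=5):
    """Forward primer matches its region; reverse primer is complementary to its region."""
    return (shares_run(forward, forward_primer_region(genome, genotype), k)
            and shares_run(reverse_complement(reverse),
                           reverse_primer_region(genome), k))
```

A run of 5 nucleotides is a weak constraint, because most 5-mers occur somewhere in a region of this size by chance. The check therefore only tests conformance with the wording above and says nothing about whether a primer is specific enough to be useful.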
[0016] In another aspect the invention relates to use of a measurement tool that generates sequence information about at least a portion of an S ORF region of HBV present in a sample obtained from a subject in the manufacture of a system for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C; wherein the determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C is carried out by a method described herein.
[0017] In a further aspect the invention relates to use of an agent that generates sequence information about at least a portion of an S ORF region of HBV present in a sample obtained from a subject in the manufacture of a kit or system for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C; wherein the determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in the subject is carried out by a method described herein.
[0018] In another aspect the invention relates to a device for determining the frequency of non-synonymous mutations in an S ORF region of HBV in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C comprising: a) a memory that stores instructions; and b) a processor that retrieves instructions from the memory and executes the instructions i) to determine the frequency of non-synonymous mutations in the S ORF region of HBV by selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, comparing sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons.
[0019] In a further aspect the invention relates to a computer readable medium having instructions stored thereon, which, when executed by a processor, cause the processor to perform operations that implement a method to determine the frequency of non-synonymous mutations in an S ORF region of HBV in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C by selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that a subject is infected with, comparing sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons.
[0020] In another aspect the invention relates to use of an oligonucleotide primer pair for amplifying at least a portion of an S ORF region of HBV present in a sample obtained from a subject to produce an amplimer in the manufacture of a kit or system for determining the liver inflammation status of a subject or the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the primer pair comprising a) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C or nucleotides 1500-3221 and nucleotides 1-60 when the HBV is genotype A; and b) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 100-2000; wherein the determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in the subject is carried out by a method described herein.
[0021] In another aspect the invention relates to use of one or more nucleotide probes that hybridise to specific nucleotide sequences in the S ORF region of HBV for generating sequence information about at least a portion of the S ORF region of HBV present in a sample obtained from a subject in the manufacture of a kit or system for determining the liver inflammation status of a subject or the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, wherein the determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in the subject is carried out by a method described herein.
[0022] In a further aspect the invention relates to use of a) a measurement tool that generates sequence information about at least a portion of an S ORF region of HBV present in a sample obtained from a subject, b) a processor, c) a computer readable medium, and d) an analysis tool stored on the computer-readable medium that is adapted to be executed on the processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV by selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, comparing the sequence information generated by the measurement tool to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons; in the preparation of a system for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C.
[0023] In another aspect the invention relates to an apparatus for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the apparatus comprising a) means for amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to produce an amplimer, b) means for generating sequence information from the amplimer, and c) means for determining the frequency of non-synonymous mutations in the S ORF region of the HBV by i) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, ii) comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and iii) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status of HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of non-synonymous mutations in the S ORF region.
[0024] In another aspect the invention relates to an apparatus for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the apparatus comprising a) means for generating sequence information about the S ORF of the HBV that the subject is infected with, and b) means for determining the frequency of non-synonymous mutations in the S ORF region of the HBV by i) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, ii) comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and iii) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status of HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of non-synonymous mutations in the S ORF region.
[0025] In one embodiment the means for amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to produce an amplimer comprises an oligonucleotide primer pair described herein.
[0026] In one embodiment the means for generating sequence information is a nucleotide sequencer. In another embodiment the means for generating sequence information comprises one or more nucleotide probes that hybridise to specific nucleotide sequences in the S ORF region of HBV for generating sequence information about at least a portion of the S ORF region of HBV present in a sample obtained from a subject.
[0027] In another aspect the invention relates to use of a) an oligonucleotide primer pair for amplifying at least a portion of an S ORF region of HBV present in a sample obtained from a subject to produce an amplimer, the primer pair comprising i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C or nucleotides 1500-3221 and nucleotides 1-60 when the HBV is genotype A; and ii) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 100-2000; and b) a nucleotide sequencer for generating sequence information from the amplimer, c) a processor, d) a computer readable medium, and e) an analysis tool stored on the computer-readable medium that is adapted to be executed on the processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV by selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, comparing the sequence information generated by the nucleotide sequencer to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons; in the preparation of a system for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C.
[0028] In a further aspect the invention relates to use of a) an oligonucleotide primer pair for amplifying at least a portion of an S ORF region of HBV present in a sample obtained from a subject to produce an amplimer, the primer pair comprising i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C or nucleotides 1500-3221 and nucleotides 1-60 when the HBV is genotype A; and ii) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 100-2000; and b) amplification reagents, and c) an analysis tool stored on a computer-readable medium that is adapted to be executed on a processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV by selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, comparing sequence information generated by a measurement tool about at least a portion of the S ORF region present in the amplimer to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons; in the preparation of a system for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C.
[0029] In a further aspect the invention relates to use of a) one or more nucleotide probes that hybridise to specific nucleotide sequences in the S ORF region of HBV for generating sequence information about at least a portion of the S ORF region of HBV present in a sample obtained from a subject, and b) an analysis tool stored on a computer-readable medium that is adapted to be executed on a processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV by selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons; in the preparation of a system for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C.
[0030] In another aspect the invention relates to a method of monitoring the response to therapy, predicting the response to a therapy, selecting a therapy, determining the optimal timing, duration or regimen of a therapy in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the method comprising determining the frequency of non-synonymous mutations in the S ORF region of the HBV by a method comprising i) providing sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, ii) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, iii) comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and iv) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to monitor the response to immunotherapy, predict the response to an immunotherapy, select a therapy, or determine the optimal timing, duration or regimen of a therapy based on the frequency of non-synonymous mutations in the S ORF region of the HBV.
[0031] In various embodiments the therapy is chemotherapy, drug therapy, immunotherapy, gene therapy, or a combination of any two or more thereof. In one embodiment the therapy is immunotherapy.
[0032] In another aspect the invention relates to a method of monitoring the response to immunotherapy, predicting the response to an immunotherapy, selecting a therapy, determining the optimal timing, duration or regimen of a therapy in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the method comprising determining the frequency of non-synonymous mutations in the S ORF region of the HBV by a method comprising i) providing sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, ii) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, iii) comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and iv) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to monitor the response to immunotherapy, predict the response to an immunotherapy, select a therapy, or determine the optimal timing, duration or regimen of a therapy based on the frequency of non-synonymous mutations in the S ORF region of the HBV.
[0033] Any of the embodiments or preferences described herein may relate to any of the aspects herein alone or in combination with any one or more embodiments or preferences described herein, unless stated or indicated otherwise.
[0034] In one embodiment the subject is infected with an HBV comprising an S ORF of HBV genotype C, HBV genotype B, HBV genotype A, or a combination of any two or more thereof.
[0035] In various embodiments the frequency of non-synonymous mutations is determined at 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 35, 40, 45, 50, 55 or 60 or more codons in the S ORF region, and various ranges may be selected from between any of these values, for example, from about 2 to about 60, about 2 to about 50, about 2 to about 40, about 2 to about 35, about 2 to about 31, about 2 to about 30, about 2 to about 25, about 2 to about 20, about 2 to about 15, about 2 to about 10, about 2 to about 5, about 3 to about 60, about 3 to about 50, about 3 to about 40, about 3 to about 30, about 3 to about 20, about 3 to about 10, about 4 to about 40, about 4 to about 30, about 4 to about 20, about 4 to about 10, about 5 to about 60, about 5 to about 50, about 5 to about 40, about 5 to about 30, about 5 to about 20, about 5 to about 10, about 7 to about 40, about 7 to about 30, about 7 to about 20, about 7 to about 10, about 10 to about 60, about 10 to about 50, about 10 to about 40, about 10 to about 30, about 10 to about 20, about 10 to about 15, about 20 to about 60, about 20 to about 50, about 20 to about 40, about 25 to about 60, about 25 to about 50, or about 25 to about 40 codons of the S ORF region of HBV.
[0036] In various embodiments the frequency of non-synonymous mutations is determined at 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 or 32 codons corresponding to the codons at positions 4, 35, 51, 54, 56, 73, 81, 84, 120, 132, 135, 141, 184, 188, 192, 195, 198, 214, 219, 221, 242, 270, 275, 300, 334, 358, 363, 377, 378, 382, 387 and 391 of an S ORF reference sequence of any one of SEQ ID NO: 1-14, and various ranges can be selected from between any of these values, for example, from about 0 to about 32, about 0 to about 25, about 0 to about 20, about 2 to about 32, about 2 to about 25, about 2 to about 20, about 3 to about 32, about 3 to about 25, about 3 to about 20, about 5 to about 32, about 5 to about 25, about 5 to about 20, about 8 to about 31, about 8 to about 25, about 8 to about 20, about 10 to about 32, about 10 to about 25, about 10 to about 20, about 12 to about 32, about 12 to about 25, about 12 to about 20, about 15 to about 32, about 15 to about 25, or about 15 to about 20 codons corresponding to the codons at positions 4, 35, 51, 54, 56, 73, 81, 84, 120, 132, 135, 141, 184, 188, 192, 195, 198, 214, 219, 221, 242, 270, 275, 300, 334, 358, 363, 377, 378, 382, 387 and 391 of an S ORF reference sequence of any one of SEQ ID NO: 1-14.
[0037] In various embodiments the frequency of non-synonymous mutations is determined at two or more codons corresponding to the codons at positions 4, 35, 51, 54, 56, 120, 132, 135, 141, 188, 192, 195, 198, 214, 219, 221, 242, 270, 275, 334, 358, 363, 377, 378, 382 and 387 of an S ORF reference sequence of any one of SEQ ID NO: 1-14, or a combination of any two or more thereof. In one embodiment the frequency of non-synonymous mutations is determined at two or more codons corresponding to the codons at positions 4, 120, 132, 135 and 378 of an S ORF reference sequence of any one of SEQ ID NO: 1-14.
[0038] In one embodiment the frequency of non-synonymous mutations is determined at two or more codons that are under positive selection pressure.
[0039] In various embodiments the frequency of non-synonymous mutations is determined at two or more codons within a portion of the S ORF region of HBV comprising codons corresponding to codons 1-400, 1-350, 1-300, 1-250, 1-200, 1-150, 1-100, 1-50, 50-400, 50-350, 50-300, 50-250, 50-200, 50-150, 50-100, 100-400, 100-350, 100-300, 100-250, 100-200, 100-150, 150-400, 150-350, 150-300, 150-250, 150-200, 200-400, 200-350, 200-300, 200-250, 250-400, 250-350, 250-300, 300-400, 300-350, or 350-400 of an S ORF reference sequence of any one of SEQ ID NO: 1-14.
[0040] In one embodiment selecting an S ORF reference sequence corresponding to the subgenotype of the S ORF of the HBV the subject is infected with comprises a) comparing the sequence information to one or more S ORF subgenotype reference sequences to determine the subgenotype of the S ORF of the HBV the subject is infected with, and b) selecting an S ORF reference sequence corresponding to the subgenotype of the S ORF of the HBV the subject is infected with.
[0041] In another embodiment selecting an S ORF reference sequence corresponding to the subgenotype of the S ORF of the HBV the subject is infected with comprises a) providing information about the subgenotype of the S ORF of the HBV the subject is infected with, and b) selecting an S ORF reference sequence corresponding to the subgenotype of the S ORF of the HBV the subject is infected with.
[0042] In various embodiments the S ORF reference sequence is
a) SEQ ID NO: 1 when the subgenotype of the HBV is genotype A1;
b) SEQ ID NO: 2 when the subgenotype of the HBV is genotype A2;
c) SEQ ID NO: 3 when the subgenotype of the HBV is genotype B1;
d) SEQ ID NO: 4 when the subgenotype of the HBV is genotype B2;
e) SEQ ID NO: 5 when the subgenotype of the HBV is genotype B3;
f) SEQ ID NO: 6 when the subgenotype of the HBV is genotype B4;
g) SEQ ID NO: 7 when the subgenotype of the HBV is genotype B5;
h) SEQ ID NO: 8 when the subgenotype of the HBV is genotype C1;
i) SEQ ID NO: 9 when the subgenotype of the HBV is genotype C2;
j) SEQ ID NO: 10 when the subgenotype of the HBV is genotype C3A;
k) SEQ ID NO: 11 when the subgenotype of the HBV is genotype C3B;
l) SEQ ID NO: 12 when the subgenotype of the HBV is genotype C4;
m) SEQ ID NO: 13 when the subgenotype of the HBV is genotype C5; or
n) SEQ ID NO: 14 when the subgenotype of the HBV is genotype C6.
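The reference selection described in the preceding paragraphs (compare the sequence information to subgenotype reference sequences, then pick the matching SEQ ID NO) might be sketched as follows. This is a minimal illustration, not the analysis tool of the invention: it assumes the query and references are already aligned and of equal length, uses simple percent identity as the similarity measure, and the names `SEQ_ID_BY_SUBGENOTYPE`, `percent_identity` and `select_reference` are hypothetical. The toy sequences in the usage note are invented and are not SEQ ID NOs: 1-14.

```python
# Hypothetical lookup table built from the subgenotype-to-SEQ ID NO list above.
SEQ_ID_BY_SUBGENOTYPE = {
    "A1": 1, "A2": 2, "B1": 3, "B2": 4, "B3": 5, "B4": 6, "B5": 7,
    "C1": 8, "C2": 9, "C3A": 10, "C3B": 11, "C4": 12, "C5": 13, "C6": 14,
}


def percent_identity(query: str, reference: str) -> float:
    """Percentage of matching positions between two pre-aligned sequences."""
    if not query or len(query) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(q == r for q, r in zip(query.upper(), reference.upper()))
    return 100.0 * matches / len(query)


def select_reference(query: str, references: dict) -> tuple:
    """Return (subgenotype, SEQ ID NO) of the reference most similar to the query.

    `references` maps a subgenotype name (e.g. "C2") to its S ORF reference
    sequence. Assumption: highest percent identity decides the subgenotype.
    """
    best = max(references, key=lambda name: percent_identity(query, references[name]))
    return best, SEQ_ID_BY_SUBGENOTYPE[best]
```

For example, with the invented references `{"A1": "ATGGAGAACA", "C2": "ATGCAGAATA"}` and the query `"ATGCAGAATA"`, the function returns the C2 entry. A production tool would normally align the sequences first and handle ambiguous bases.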
[0043] In one embodiment the method comprises selecting an S ORF reference sequence comprising, or the S ORF reference sequence comprises, a sequence that corresponds to 2, 3, 4, 5, 6, 7, 8, 9 or 10 or more subgenotypes of HBV genotypes A, B or C. In one embodiment the S ORF reference sequence corresponds to all known subgenotypes of HBV genotype A, B or C.
[0044] In various embodiments sequence information is generated about, or an oligonucleotide primer pair amplifies a portion of the S ORF region of the HBV comprising at least about 10, 20, 50, 100, 200, 250, 300, 400, 500, 750, 1000 or 1200 contiguous nucleotides of the S ORF of the HBV, and various ranges may be selected between any of these values, for example from about 10 to about 1200, about 200 to about 1200, about 500 to about 1200, about 750 to about 1200 or about 1000 to about 1200 contiguous nucleotides.
[0045] In one embodiment the method comprises amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject using an oligonucleotide primer pair to produce an amplimer and generating sequence information about the amplimer.
[0046] In one embodiment the method comprises adding the sample or the amplimer to one or more nucleotide probes that hybridise to specific nucleotides in the S ORF region of HBV to generate the sequence information.
[0047] In one embodiment the measurement tool comprises an oligonucleotide primer pair for amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to produce an amplimer. In one embodiment the measurement tool comprises a nucleotide sequencer for generating sequence information from the amplimer or from the sample. In one embodiment the measurement tool comprises an oligonucleotide primer pair for amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to produce an amplimer and a nucleotide sequencer for generating sequence information from the amplimer.
[0048] In one embodiment the measurement tool comprises one or more nucleotide probes that hybridise to specific nucleotides in the S ORF region of HBV for generating sequence information from the sample or the amplimer.
[0049] In various embodiments the oligonucleotide primer pair comprises a) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C or nucleotides 1500-3221 and nucleotides 1-60 when the HBV is genotype A; and b) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 100-2000 of the HBV genome.
[0050] In various embodiments
a) the oligonucleotide primer pair comprises i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C or nucleotides 1500-3221 and nucleotides 1-60 when the HBV is genotype A; and ii) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence comprising nucleotides 100-2000; or
b) the oligonucleotide primer pair comprises i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence comprising nucleotides 1800-3215 when the HBV is genotype B or C or nucleotides 1800-3221 when the HBV is genotype A; and ii) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence comprising nucleotides 700-1800; or
c) the oligonucleotide primer pair comprises i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence comprising nucleotides 1500-2848; and ii) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence comprising nucleotides 700-1800; or
d) the oligonucleotide primer pair comprises i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence comprising nucleotides 1800-3215; and ii) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence comprising nucleotides 832-1800; or
e) the oligonucleotide primer pair comprises i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence comprising nucleotides 1800-2848; and ii) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence comprising nucleotides 832-1800; or
f) the oligonucleotide primer pair comprises i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence comprising nucleotides 2000-2848; and ii) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence comprising nucleotides 832-1500;
wherein the HBV genome reference sequence is a sequence of any one of SEQ ID NOs: 15-28 corresponding to a subgenotype of the HBV that the subject is infected with.
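Several of the forward-primer regions above, such as nucleotides 1500-3215 together with nucleotides 1-60, run through the numbering origin of the circular genome. The sketch below shows one way such a region could be extracted and scanned for candidate primer windows. It is illustrative only: the function names are hypothetical, coordinates are assumed to be 1-based and inclusive, and the default window of 20 nucleotides is simply one value chosen from the primer lengths listed in this specification.

```python
def circular_region(genome: str, start: int, end: int) -> str:
    """Return nucleotides start..end (1-based, inclusive) of a circular genome.

    When end < start the region wraps through the origin, so (1500, 60) means
    nucleotide 1500 to the last nucleotide, followed by nucleotides 1 to 60.
    """
    n = len(genome)
    if not (1 <= start <= n and 1 <= end <= n):
        raise ValueError("coordinates must lie within the genome")
    if start <= end:
        return genome[start - 1:end]
    return genome[start - 1:] + genome[:end]


def candidate_windows(region: str, length: int = 20) -> list:
    """All contiguous windows of `length` nucleotides within the region."""
    return [region[i:i + length] for i in range(len(region) - length + 1)]
```

For a 3,215-nucleotide genotype B or C genome, `circular_region(genome, 1500, 60)` spans 1,776 nucleotides (1,716 up to position 3215 plus 60 after the origin).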
[0051] In various embodiments the forward primer comprises a sequence of at least about 5, 6, 7, 8, 9, 10, 12, 15, 17, 18, 20, 25, 30, 35, 40, 45 or 50 contiguous nucleotides of a portion of an HBV genome reference sequence, and various ranges may be selected from between these values, for example, from about 5 to 50, 5 to 30, 7 to 30, 10 to 30, 15 to 30, 20 to 30, 25 to 30, 5 to 25, 7 to 25, 10 to 25, 15 to 25, 20 to 25, 5 to 20, 7 to 20, 10 to 20, 15 to 20, 5 to 15, 7 to 15, 10 to 15 contiguous nucleotides of a portion of the HBV genome.
[0052] In various embodiments the reverse primer comprises a sequence that is complementary to at least about 5, 6, 7, 8, 9, 10, 12, 15, 17, 20, 25, 30, 35, 40, 45 or 50 contiguous nucleotides of a portion of an HBV genome reference sequence, and various ranges may be selected from between these values, for example, from about 5 to 50, 5 to 30, 7 to 30, 10 to 30, 15 to 30, 20 to 30, 25 to 30, 5 to 25, 7 to 25, 10 to 25, 15 to 25, 20 to 25, 5 to 20, 7 to 20, 10 to 20, 15 to 20, 5 to 15, 7 to 15, 10 to 15 contiguous nucleotides of a portion of the HBV genome.
[0053] In various embodiments the HBV genome reference sequence is
a) SEQ ID NO: 15 when the subgenotype of the HBV is genotype A1;
b) SEQ ID NO: 16 when the subgenotype of the HBV is genotype A2;
c) SEQ ID NO: 17 when the subgenotype of the HBV is genotype B1;
d) SEQ ID NO: 18 when the subgenotype of the HBV is genotype B2;
e) SEQ ID NO: 19 when the subgenotype of the HBV is genotype B3;
f) SEQ ID NO: 20 when the subgenotype of the HBV is genotype B4;
g) SEQ ID NO: 21 when the subgenotype of the HBV is genotype B5;
h) SEQ ID NO: 22 when the subgenotype of the HBV is genotype C1;
i) SEQ ID NO: 23 when the subgenotype of the HBV is genotype C2;
j) SEQ ID NO: 24 when the subgenotype of the HBV is genotype C3A;
k) SEQ ID NO: 25 when the subgenotype of the HBV is genotype C3B;
l) SEQ ID NO: 26 when the subgenotype of the HBV is genotype C4;
m) SEQ ID NO: 27 when the subgenotype of the HBV is genotype C5; and
n) SEQ ID NO: 28 when the subgenotype of the HBV is genotype C6.
[0054] In one embodiment the method comprises using two or more primers to amplify at least a portion of the S ORF region comprising the two or more codons, wherein the two or more primers each comprise or are complementary to 5 or more contiguous nucleotides of an S ORF reference sequence of any one of SEQ ID NOs: 1-14. In one embodiment one or more of the primers are designed to anneal specifically to a wild-type codon of the S ORF reference sequence at which the frequency of non-synonymous mutations is to be determined. In one embodiment one or more of the primers are designed to anneal specifically to a codon of the S ORF reference sequence comprising a non-synonymous mutation.
[0055] In one embodiment sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, based on an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region is determined using an amplification refractory mutation system (ARMS).
[0056] In various embodiments the frequency of non-synonymous mutations in the S ORF region of the HBV indicates a) the subject has a positive liver inflammation status, b) the HBV is subject to positive selection pressure, c) the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection, or d) a combination of any two or more of a) to c).
[0057] In various embodiments the frequency of non-synonymous mutations in the S ORF region of the HBV is determined by a) comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, b) assigning a numerical score for each codon based on the frequency of non-synonymous mutations at that codon, c) combining the numerical scores for each codon to provide a combined numerical score representing the frequency of non-synonymous mutations in the S ORF region of the HBV.
[0058] In various embodiments a positive numerical score is assigned for each codon having a frequency of non-synonymous mutations of at least about 5%, 10%, 15%, 20%, 25%, 30%, 40% or at least about 50%. In various embodiments the positive score for each codon is independently 0.5, 1, 1.5, 2 or 2.5.
[0059] In various embodiments the combined numerical score representing the frequency of non-synonymous mutations in the S ORF of the HBV of greater than 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 indicates that a) the subject has a positive liver inflammation status, b) the HBV is subject to positive selection pressure, c) the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection, or d) a combination of any two or more of a) to c).
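The scoring steps in the three preceding paragraphs (a score per codon from its mutation frequency, summed into a combined score, then compared with a cut-off) can be sketched as below. The function names are hypothetical, and the defaults (a 30% frequency threshold, a positive score of 1, and a cut-off of 2) are just one combination picked from the values listed above, not a recommended setting. The frequencies in the usage note are invented.

```python
def codon_score(frequency: float, threshold: float = 0.30,
                positive_score: float = 1.0) -> float:
    """Score one codon: positive_score if its non-synonymous mutation
    frequency reaches the threshold, otherwise 0."""
    return positive_score if frequency >= threshold else 0.0


def combined_score(frequencies: dict, threshold: float = 0.30,
                   positive_score: float = 1.0) -> float:
    """Sum the per-codon scores. `frequencies` maps codon position -> frequency
    (a fraction between 0 and 1) of non-synonymous mutations at that codon."""
    return sum(codon_score(f, threshold, positive_score)
               for f in frequencies.values())


def exceeds_cutoff(score: float, cutoff: float = 2.0) -> bool:
    """True when the combined score is greater than the chosen cut-off."""
    return score > cutoff
```

With invented frequencies `{4: 2/6, 120: 3/6, 132: 1/6, 135: 0.0, 378: 4/6}`, three codons reach the 30% threshold, so the combined score is 3.0 and exceeds a cut-off of 2.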
[0060] In various embodiments the subject a) is HBeAg negative, b) has normal serum ALT levels, c) has a serum HBV-DNA titre of greater than about 2,000 IU/ml, or d) has a combination of any two or more of a) to c).
[0061] In another embodiment the subject is HBeAg positive or has elevated serum ALT levels, or is both HBeAg positive and has elevated serum ALT levels.
[0062] In various embodiments the liver complications of chronic HBV infection comprise liver cirrhosis, liver cancer, liver failure, liver inflammation, liver damage, liver dysfunction, or a combination of any two or more thereof.
[0063] The term "comprising" as used in this specification means "consisting at least in part of". When interpreting statements in this specification which include that term, the features, prefaced by that term in each statement, all need to be present but other features can also be present. Related terms such as "comprise" and "comprised" are to be interpreted in the same manner.
[0064] This invention may also be said broadly to consist in the parts, elements and features referred to or indicated in the specification of the application, individually or collectively, and any or all combinations of any two or more of said parts, elements or features, and where specific integers are mentioned herein which have known equivalents in the art to which this invention relates, such known equivalents are deemed to be incorporated herein as if individually set forth.
[0065] It is intended that reference to a range of numbers disclosed herein (for example, 1 to 10) also incorporates reference to all rational numbers within that range (for example, 1, 1.1, 2, 3, 3.9, 4, 5, 6, 6.5, 7, 8, 9, and 10) and also any range of rational numbers within that range (for example, 2 to 8, 1.5 to 5.5, and 3.1 to 4.7) and, therefore, all sub-ranges of all ranges expressly disclosed herein are hereby expressly disclosed. These are only examples of what is specifically intended and all possible combinations of numerical values between the lowest value and the highest value enumerated are to be considered to be expressly stated in this application in a similar manner.
[0066] In this specification, where reference has been made to external sources of information, including patent specifications and other documents, this is generally for the purpose of providing a context for discussing the features of the present invention. Unless stated otherwise, reference to such sources of information is not to be construed, in any jurisdiction, as an admission that such sources of information are prior art or form part of the common general knowledge in the art.
[0067] Other aspects of the invention may become apparent from the following description which is given by way of example only and with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0068] The invention will now be described by way of example only and with reference to the drawings in which:
[0069] Figure 1 is a Receiver Operating Characteristic (ROC) curve showing the relationship between the sensitivity and specificity of the frequency of non-synonymous mutations in the S ORF of HBV to predict liver disease. Sixteen subjects with an HBeAg-negative, genotype C chronic HBV infection with an HBV-DNA > 2,000 IU/ml and a normal serum ALT level were classified as having developed either serious hepatitis B-related liver disease (liver inflammation, liver cirrhosis, liver failure or liver cancer - Group A) or inactive liver disease (Group B) ten years after serum samples were obtained (8 subjects in each group). Six clones from each subject were analysed. The number of the 32 positively selected codons (Pr>0.90) identified in Table 6 that had 2 or more of the 6 clones containing a non-synonymous mutation was counted in each subject. The ROC curve plots the relationship between the sensitivity (number of Group A subjects with greater than or equal to that score divided by the total number of Group A subjects) and specificity (number of Group B subjects with greater than or equal to that score divided by the total number of Group B subjects) of the test for each possible score (in small squares).
[0070] Figure 2 is a survival analysis comparing the outcome at the ten year follow up of nine subjects with a score of greater than or equal to 3 (Group 1) with that of seven subjects with a score of less than or equal to 2 (Group 0). Scores were calculated as described above for Figure 1.
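The per-subject score and the ROC points described for Figure 1 could be computed along the following lines. This is a sketch with hypothetical function names and invented inputs, not the study data. Note that the Group B fraction computed here (subjects at or above the threshold) is the quantity the figure description pairs with sensitivity; in conventional ROC usage that fraction is the false-positive rate, and specificity is one minus it, so the sketch returns both.

```python
def subject_score(mutated_clone_counts: dict, min_clones: int = 2) -> int:
    """Count codons at which at least `min_clones` of the analysed clones
    carry a non-synonymous mutation.

    `mutated_clone_counts` maps codon position -> number of clones mutated.
    """
    return sum(1 for count in mutated_clone_counts.values() if count >= min_clones)


def roc_points(group_a_scores: list, group_b_scores: list) -> list:
    """For each observed score used as a threshold, return a tuple of
    (threshold, sensitivity, group_b_fraction, specificity)."""
    points = []
    for t in sorted(set(group_a_scores) | set(group_b_scores)):
        # Fraction of Group A (disease) subjects scoring at or above t.
        sensitivity = sum(s >= t for s in group_a_scores) / len(group_a_scores)
        # Fraction of Group B (inactive disease) subjects scoring at or above t.
        group_b_fraction = sum(s >= t for s in group_b_scores) / len(group_b_scores)
        points.append((t, sensitivity, group_b_fraction, 1.0 - group_b_fraction))
    return points
```

With invented scores `[3, 4, 5, 2]` for Group A and `[0, 1, 2, 3]` for Group B, a threshold of 3 gives a sensitivity of 0.75 and a Group B fraction of 0.25.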
DETAILED DESCRIPTION OF THE INVENTION
[0071] The present inventors have identified codons in the S ORF of the HBV genome that are under positive selection pressure. The present inventors have surprisingly determined that determining the frequency of non-synonymous mutations across the S ORF region of HBV based on the frequency of non-synonymous mutations at codons under positive selection pressure is useful to detect or predict early stage liver inflammation and/or an increased risk of developing liver complications of chronic HBV infection in patients infected with HBV. Thus the present invention relates to a method of determining the liver inflammation status of, determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with HBV genotype A, B or C by determining the frequency of non-synonymous mutations at two or more codons in the S ORF region of HBV. The invention also provides related systems, kits, measurement tools, agents, devices, computer readable media, oligonucleotide primer pairs and apparatus and uses thereof.
1. Definitions
[0072] The phrase "an HBV comprising an S ORF of HBV genotype A, B or C" as used herein refers to HBV comprising an S ORF that substantially corresponds to the S ORF region of a subgenotype of any one of HBV genotypes A, B or C. It includes an HBV comprising an S ORF having substantial sequence identity to an S ORF reference sequence of any one of HBV subgenotypes A1, A2, B1, B2, B3, B4, B5, C1, C2, C3A, C3B, C4, C5 and C6 described herein. It also includes HBV comprising an S ORF region having substantial sequence identity to an S ORF consensus sequence of any as yet unclassified or unidentified subgenotype of HBV genotype A, B or C. In various embodiments the HBV has an S ORF having at least about 70%, 75%, 80%, 90%, 95% or at least 99% sequence identity to an S ORF reference sequence described herein. In one embodiment the nucleotide sequence of the S ORF is substantially identical to an S ORF reference sequence described herein but for one or more synonymous substitutions.
[0073] The term "indel" as used herein refers to an insertion or deletion of one or more nucleotides in the wild-type HBV genome.
[0074] The term "non-synonymous mutation" as used herein refers to a single nucleotide mutation in an HBV S ORF sequence that results in a codon that codes for a different amino acid to the amino acid coded for at the equivalent codon position in a wild-type, consensus or reference HBV S ORF sequence.
[0075] The term "synonymous mutation" or "synonymous substitution" as used herein refers to a single nucleotide mutation or substitution in an HBV S ORF sequence that results in a codon that codes for the same amino acid as the amino acid coded for at the equivalent codon position in a wild-type, consensus or reference HBV S ORF sequence.
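The distinction between synonymous and non-synonymous mutations defined above can be illustrated by translating the reference codon and the sample codon with the standard genetic code and comparing the resulting amino acids. The sketch below is an illustration only: `classify_codon` is a hypothetical helper, it compares whole codons without checking that exactly one nucleotide differs, and it treats a change to a stop codon as non-synonymous.

```python
# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_codon(reference_codon: str, sample_codon: str) -> str:
    """Classify a sample codon against the reference codon at the same position.

    Returns "identical", "synonymous" (same amino acid) or "non-synonymous"
    (different amino acid, or a change to or from a stop codon).
    """
    ref, obs = reference_codon.upper(), sample_codon.upper()
    if ref == obs:
        return "identical"
    if CODON_TABLE[ref] == CODON_TABLE[obs]:
        return "synonymous"
    return "non-synonymous"
```

For instance, CTG to CTA leaves leucine unchanged and is synonymous, whereas CTG to CCG replaces leucine with proline and is non-synonymous.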
[0076] The term "primer" as used herein refers to a short nucleotide strand comprising a sequence of 10 or more nucleotides that hybridises to a template DNA sequence and serves as the starting point for a polymerase chain reaction (PCR). For use in PCR reactions described herein, a suitable forward primer comprises a sequence of at least about 5 contiguous nucleotides of the sequence of the non-template strand (also known as the plus or sense strand) of the HBV genome. The forward primer binds to the template strand of the HBV genome (also known as the minus or antisense strand). A suitable reverse primer for use in PCR reactions described herein comprises at least about 5 contiguous nucleotides of the sequence of the template strand (also known as the minus or antisense strand) of the HBV genome. The reverse primer binds to the non-template strand of the HBV genome (also known as the plus or sense strand).
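The strand relationships in this definition can be made concrete: a forward primer is read directly from the plus (sense) strand, while a reverse primer is the reverse complement of a plus-strand segment, which is the minus-strand sequence read 5' to 3'. The sketch below assumes a linear plus-strand template with 1-based coordinates and does not handle regions that wrap around the circular genome; the function names and the toy sequence in the usage note are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A<->T, C<->G, read backwards)."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_pair(plus_strand: str, fwd_start: int, rev_end: int,
                length: int = 20) -> tuple:
    """Return (forward, reverse) primers flanking plus_strand[fwd_start..rev_end].

    fwd_start: 1-based position of the first nucleotide of the forward primer.
    rev_end:   1-based position of the last nucleotide of the amplified region.
    The forward primer copies the plus strand; the reverse primer is the
    reverse complement of the last `length` nucleotides of the region.
    """
    if rev_end - fwd_start + 1 < length:
        raise ValueError("region is shorter than the primer length")
    forward = plus_strand[fwd_start - 1:fwd_start - 1 + length]
    reverse = reverse_complement(plus_strand[rev_end - length:rev_end])
    return forward, reverse
```

On the toy sequence `"ATGCGTACCGTTAGC"`, `primer_pair(seq, 1, 15, 5)` gives the forward primer `ATGCG` and the reverse primer `GCTAA`.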
[0077] The phrase "under positive selection pressure" as used herein refers to codons in the S ORF of HBV at which the rate of non-synonymous substitutions exceeds the rate expected by random processes. Substitution rates of codon evolution can be determined using methods known in the art, including methods described herein in the examples. For example, substitution rates can be inferred by constructing phylogenies of S ORF sequences of HBV obtained from one or more subjects and then estimating the overall proportion of codons that are under positive selection pressure across the phylogenetic tree using methods such as Phylogenetic Analysis of Maximum Likelihood (PAML).
[0078] The terms "genotype", "subgenotype" and "sub-subgenotype" (and related classifications) refer to genetically distinct groups of the hepatitis B virus that have arisen during evolution of the virus. The consensus, wild-type nucleotide sequences of the groups currently recognized in the art are described herein. The terms are intended to include any new genotypes, subgenotypes and sub-subgenotypes recognized in the future.
[0079] A "subject" refers to a vertebrate that is a mammal, for example, a human.
[0080] The phrase "liver complications of chronic HBV (CHB) infection" and related terms as used herein relates generally to any liver pathologies or conditions resulting from chronic HBV infection including, but not limited to, liver inflammation, liver damage, liver dysfunction, liver cirrhosis, liver cancer (for example, hepatocellular carcinoma), and liver failure.
[0081] As used herein the term "and/or" means "and" or "or", or both.
[0082] As used herein "(s)" following a noun means the plural and/or singular forms of the noun.
[0083] The term "wild-type" as used herein with reference to genome sequences, S ORF sequences and codons therein and amino acid sequences for a given HBV genotype, subgenotype or sub-subgenotype refers to the recognised consensus sequence or codon for that HBV genotype, subgenotype or sub-subgenotype.
2. HBV genome
[0084] HBV is a DNA virus that belongs to the Hepadnaviridae family.
[0085] The circular genome of HBV is encased in a protein core and is partially double-stranded. The genome consists of a non-coding negative (template) strand that is attached to the polymerase protein at its 5' end. The terminal 3' nucleotides of the negative strand overlap with the 5' sequence. The positive coding strand is usually 1700-2800 nucleotides in length and the 5' sequence of the positive strand bridges the gap between the 3' and 5' ends of the negative strand. On entry to the hepatocyte nucleus, both strands are completed creating a circular, double stranded genome known as covalently closed circular DNA (cccDNA). The cccDNA of genotype B and genotype C viruses is 3,215 nucleotides in length. The cccDNA of genotype A viruses is 3,221 nucleotides in length. cccDNA contains four overlapping open reading frames (ORF). The C ORF encodes the core protein and the P25 precursor protein for HBeAg. The X ORF is predicted to encode a transcriptional transactivator. The S ORF encodes the 3 envelope proteins of the mature virion, and the P ORF encodes a protein with several functions related to replication of the HBV genome.
[0086] There is substantial variation in the sequence of HBV between individual chronically infected subjects. There are conserved patterns to this sequence variation that allow most HBVs to be classified into broad groups known as genotypes, subgenotypes and sub-subgenotypes. To date, nine genotypes of the hepatitis B virus have been defined in the human population. They are known as genotypes A, B, C, D, E, F, G, H and I, and they are distinguished by haplotypes of nucleotide substitutions in at least 8% of their nucleotide sequence. These genotypes differ in their geographical distribution throughout the world and also in their ability to cause disease. For example, the genotype C hepatitis B virus predominantly occurs in East Asia and the Pacific and is associated with the highest rates of progression to liver cirrhosis and liver cancer.
[0087] There is a second level of variation within each genotype known as subgenotypes. These subgenotypes also differ in their geographical distributions. The well-recognised subgenotypes of genotypes A, B and C of the hepatitis B virus are A1 and A2; B1, B2, B3, B4 and B5; and C1, C2, C3, C4, C5 and C6. In addition, two sub-subgenotypes of subgenotype C3 have been described in Tonga, and they are known as sub-subgenotypes C3A and C3B. Given that hepatitis B virus sequences are still being identified that contain haplotypes of fixed nucleotide substitutions that do not match any known subgenotype or sub-subgenotype, it is likely that more subgenotypes and more sub-subgenotypes will be identified in the future.
[0088] Some of the sequence variation that defines genotypes, subgenotypes and sub-subgenotypes of HBV occurs in the S ORF. Accordingly, in order to determine the frequency of non-synonymous mutations across the S ORF, it is necessary to select an S ORF reference sequence as described below corresponding to the subgenotype of a subject's virus for analysis according to the method of the invention.
[0089] A further source of variation in hepatitis B virus sequences is recombination of genetic material from two viruses of different genotypes that have infected the same patient. The most common of these are identified as genotype B/C, genotype C/D or genotype A/D. Lastly, the dominant sequence of a hepatitis B virus changes with time within patients as a result of selection pressure and genetic drift. This intra-individual variation can include mutations of nucleotides that are included in the haplotypes that define genotypes, subgenotypes and sub-subgenotypes.
[0090] The consensus nucleotide sequences for the genome of HBV subgenotypes A1 and A2 are set out in SEQ ID NOs: 15 and 16, respectively.
[0091] The consensus nucleotide sequences for the genome of HBV subgenotypes B1, B2, B3, B4 and B5 are set out in SEQ ID NOs: 17-21, respectively.
[0092] The consensus nucleotide sequences for the genome of HBV subgenotypes C1, C2, C3A, C3B, C4, C5 and C6 are set out in SEQ ID NOs: 22-28, respectively. Methods for determining consensus nucleotide sequences for the HBV genome of a particular HBV subgenotype are described below.
3. Method of the invention
[0093] The method of the invention comprises the steps of a) obtaining sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject based on an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, b) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status of the HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of the non-synonymous mutations in the S ORF region.
Obtaining a sample and extracting viral DNA
[0094] Any sample obtained from the subject that comprises HBV DNA is suitable. In some embodiments the sample obtained from the subject is processed for use in the method of the invention. For example, in one embodiment the sample comprises serum or plasma separated from a whole blood sample extracted from the subject. In another embodiment the sample comprises fresh frozen tissue or formalin-fixed tissue. In various embodiments the sample comprises whole blood, serum, plasma, or liver tissue. In various embodiments the tissue sample is obtained during a biopsy or surgery.
[0095] HBV DNA may be extracted from the sample using any suitable method known in the art including, but not limited to, the methods described in the Examples. Suitable commercial kits for extracting viral DNA are widely available and well known in the art.
Selecting an S ORF reference sequence
[0096] Due to the nucleotide sequence variation between HBV subgenotypes, including within the S ORF region, in various embodiments the method comprises selecting or designing an appropriate S ORF reference sequence. Recombination events between HBV genomes can result in mixed HBV genotypes and subgenotypes, and so it may be necessary to determine the subgenotype of the S ORF region specifically.
[0097] In one embodiment the method comprises providing information about the subgenotype of the S ORF of the HBV the subject is infected with, for example, information from an earlier test, and selecting an S ORF reference sequence corresponding to that subgenotype.
[0098] In another embodiment the method comprises comparing the sequence information about the S ORF region of the HBV from a sample obtained from the subject as described herein to one or more subgenotype S ORF consensus sequences to determine the subgenotype of the S ORF of the HBV the subject is infected with and selecting a corresponding S ORF reference sequence.
[0099] In one embodiment the method comprises performing one or more sequence alignments of the sequence information about at least a portion of the S ORF with one or more HBV genotype, subgenotype or sub-subgenotype S ORF consensus sequences.
[00100] In one embodiment the method comprises performing one or more pairwise sequence alignments of the sequence information about at least a portion of the S ORF with one or more HBV genotype, subgenotype or sub-subgenotype S ORF consensus sequences. An example of such a method of determining S ORF subgenotype is described in Example 1. In various embodiments the sequence alignment is performed using one-to-many pairwise alignments (using, but not limited to, BLAST, Smith-Waterman or Needleman-Wunsch based algorithms). Software may be selected from a group of programs including but not limited to the Burrows-Wheeler aligner program, novoalign, bowtie, mrsFAST, Partek, SOAP, MAQ, Segemehl, SSHA and Stampy. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
[00101] In one embodiment the method comprises performing one or more multiple sequence alignments of the sequence information about at least a portion of the S ORF with one or more HBV genotype, subgenotype or sub-subgenotype S ORF consensus sequences. Techniques for conducting multiple alignments include but are not limited to progressive alignment construction, iterative methods, consensus methods, hidden Markov models, phylogeny-aware models, motif finding or non-coding multiple sequence alignment. Software that might be used includes but is not limited to Clustal, MAFFT, T-Coffee, PRRN/PRRP, CHAOS/DIALIGN, M-Coffee, MergeAlign, POA, SAM, HMMER, PRANK, PAGAN, MEME/MAST and EDNA.
[00102] In one embodiment the method comprises determining the subgenotype S ORF consensus sequence to which the sequence information preferentially aligns and selecting the preferentially aligned S ORF sequence as the S ORF reference sequence.
[00103] In some embodiments it is necessary to distinguish regions of the S ORF derived from different genotypes, subgenotypes or sub-subgenotypes in patients infected with recombinant viruses. In one embodiment the method comprises constructing an S ORF reference sequence by identifying the genotype, subgenotype or sub-subgenotype of two or more regions of the S ORF of the HBV the subject is infected with, selecting a region of an S ORF reference sequence disclosed herein corresponding to each region of differing subgenotype or sub-subgenotype and combining these regions to form a full length S ORF reference sequence tailored to that subject.
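The selection of a preferentially aligned consensus described in paragraph [00102] might be sketched as follows. This is a simplified illustration that assumes the subject's sequence and every candidate consensus have already been aligned to a common length by one of the alignment tools named above; the function names, the use of percent identity as the ranking criterion and the toy sequences are assumptions, not the method of Example 1.

```python
def identity(query, consensus):
    """Fraction of informative aligned positions at which two sequences agree."""
    if len(query) != len(consensus):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (q, c)
        for q, c in zip(query.upper(), consensus.upper())
        if q not in "-N" and c not in "-N"  # skip gaps and ambiguous bases
    ]
    if not pairs:
        return 0.0
    return sum(q == c for q, c in pairs) / len(pairs)


def select_reference(query, consensus_by_subgenotype):
    """Return (subgenotype, identity) for the best-matching consensus sequence."""
    scores = {
        name: identity(query, seq)
        for name, seq in consensus_by_subgenotype.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy, hypothetical 9-nucleotide "consensus" fragments for illustration only.
candidates = {"C1": "ATGGAGAAC", "C2": "ATGGAAAGC"}
best_name, best_identity = select_reference("ATGGAGAAC", candidates)
```

For a recombinant virus as in paragraph [00103], the same comparison could be applied separately to each region of the S ORF, with the best-matching regions then concatenated into a composite reference.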
[00104] In subjects infected with two or more genotypes, subgenotypes or sub-subgenotypes of HBV, in various embodiments it may be necessary to identify the frequency of non-synonymous mutations in the S ORF region separately for each distinct genotype.
[00105] The method of the invention is suitable for determining the frequency of non-synonymous mutations in the S ORF of any subgenotype of HBV genotypes A, B or C for which an S ORF subgenotype sequence is available.
[00106] A suitable S ORF reference sequence for an HBV subgenotype may be generated by aligning two or more S ORF nucleotide sequences obtained from HBV identified as being of that subgenotype and generating a consensus sequence. Alignment software suitable for generating consensus sequences is well known in the art, including software programs described herein.
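As an illustration of the consensus step in paragraph [00106], the following sketch takes sequences that have already been aligned to equal length and calls the most frequent base in each column. The majority-vote rule and the use of "N" for tied columns are assumptions made for this example; dedicated alignment software may apply different rules.

```python
from collections import Counter


def consensus(aligned_sequences):
    """Majority-vote consensus of equal-length, pre-aligned sequences."""
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*(seq.upper() for seq in aligned_sequences)):
        counts = Counter(column)
        top_base, top_count = counts.most_common(1)[0]
        tied = [base for base, n in counts.items() if n == top_count]
        # Assumption: a column with no single most frequent base is reported as N.
        result.append(top_base if len(tied) == 1 else "N")
    return "".join(result)
```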
[00107] In various embodiments the one or more HBV S ORF reference sequences comprises one or more sequences of SEQ ID NOs: 1-14.
[00108] S ORF reference sequences for HBV subgenotypes A1 and A2 are provided by SEQ ID NOs: 1 and 2, respectively. S ORF reference sequences for HBV subgenotypes B1, B2, B3, B4 and B5 are provided by SEQ ID NOs: 3-7, respectively. S ORF reference sequences for subgenotypes C1, C2, C3A, C3B, C4, C5 and C6 are provided by SEQ ID NOs: 8-14, respectively. In various embodiments the S ORF reference sequence corresponding to the subgenotype of the S ORF region of the HBV the subject is infected with comprises a nucleotide sequence having at least about 90%, 95%, 97%, 98% or at least about 99% sequence identity to the S ORF reference sequence corresponding to that subgenotype described herein.
[00109] In one embodiment the S ORF reference sequence comprises sequence information about a portion of the S ORF region. In one embodiment the S ORF reference sequence solely comprises sequence information about the two or more codons at which the frequency of non-synonymous mutations is to be determined for the genotype or subgenotype of the HBV the patient is infected with. For example, in one embodiment the S ORF reference sequence comprises two or more discrete sequences wherein each sequence corresponds to a codon at which the frequency of non-synonymous mutations is to be determined. In another embodiment the reference sequence comprises a sequence of the S ORF region, or a portion thereof, wherein one or more nucleotides that do not correspond to the two or more codons at which the frequency of non-synonymous mutations is to be determined are designated "n".
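The masked reference described in paragraph [00109] can be pictured with a short sketch in which every nucleotide outside the codons of interest is replaced by "n". The function name, the 1-based codon numbering from the 5' end of the reference and the toy sequence are assumptions for illustration.

```python
def mask_reference(reference, codons_of_interest):
    """Replace every nucleotide outside the listed codons with 'n'.

    Codons are numbered from 1, starting at the 5' end of the reference.
    """
    keep = set()
    for codon in codons_of_interest:
        start = 3 * (codon - 1)  # 0-based index of the codon's first nucleotide
        if codon < 1 or start + 3 > len(reference):
            raise ValueError(f"codon {codon} lies outside the reference")
        keep.update(range(start, start + 3))
    return "".join(
        base if index in keep else "n" for index, base in enumerate(reference)
    )


# Toy 4-codon sequence; codons 2 and 4 are retained.
masked = mask_reference("ATGGAGAACACA", [2, 4])
```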
[00110] In one embodiment the S ORF reference sequence comprises a sequence designed to correspond to two or more subgenotypes of HBV. For example, in one embodiment the S ORF reference sequence comprises a consensus sequence of two or more subgenotypes of HBV. If, for example, the S ORF reference sequence is designed to correspond to all known subgenotypes of a particular genotype of HBV, it may not be necessary to determine the subgenotype of the HBV the subject is infected with in order to carry out the method of the invention. In one embodiment the S ORF reference sequence comprises a sequence that corresponds to two or more, three or more, four or more or five or more subgenotypes of genotype C. In another embodiment the S ORF reference sequence comprises a sequence that corresponds to all known subgenotypes of HBV genotype C. In another embodiment the S ORF reference sequence comprises a sequence designed to correspond to all known genotypes and subgenotypes of the HBV.
[00111] Additional subgenotypes of HBV may be identified in the future. S ORF reference sequences corresponding to these new subgenotypes can be generated as described above. As nucleotide sequences for new subgenotypes become available, these sequences can be included in a method described above to identify whether these subgenotypes are present in the sample obtained from the subject.
Providing sequence information
[00112] In one embodiment the method comprises a) determining the frequency of non-synonymous mutations in the S ORF region of the HBV the subject is infected with by a method comprising i) providing sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, ii) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, iii) comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and iv) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons; and b) determining the liver inflammation status of the subject or the genetic status of the HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of the non-synonymous mutations in the S ORF region.
Generating sequence information
[00113] The S ORF region of HBV genotype A comprises 1,200 nucleotides comprising nucleotides 2854-3221 and 1-832 of the circular HBV genome. Consensus nucleotide sequences for the S ORF of HBV subgenotypes A1 and A2 are set out in SEQ ID NOs: 1 and 2, respectively.
[00114] The S ORF region of HBV genotypes B and C comprises 1,200 nucleotides comprising nucleotides 2848-3215 and 1-832 of the circular HBV genome. Consensus nucleotide sequences for the S ORF of HBV subgenotypes B1, B2, B3, B4 and B5 are set out in SEQ ID NOs: 3-7, respectively. Consensus nucleotide sequences for the S ORF of HBV subgenotypes C1, C2, C3A, C3B, C4, C5 and C6 are set out in SEQ ID NOs: 8-14, respectively.
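Because the S ORF spans the origin of the circular genome, converting a genome coordinate to a position within the 1,200-nucleotide S ORF requires modular arithmetic. The sketch below uses only the genome lengths and S ORF boundaries stated in the description (3,221 nucleotides and start 2854 for genotype A; 3,215 nucleotides and start 2848 for genotypes B and C); the function and constant names are illustrative assumptions.

```python
# genotype -> (genome length, genome position of the first S ORF nucleotide)
GENOME_LAYOUT = {"A": (3221, 2854), "B": (3215, 2848), "C": (3215, 2848)}
S_ORF_LENGTH = 1200


def genome_to_s_orf(position, genotype):
    """Map a 1-based genome coordinate to a 1-based S ORF coordinate."""
    genome_length, orf_start = GENOME_LAYOUT[genotype]
    if not 1 <= position <= genome_length:
        raise ValueError("position lies outside the genome")
    # Distance from the S ORF start, wrapping around the circular genome.
    offset = (position - orf_start) % genome_length
    if offset >= S_ORF_LENGTH:
        raise ValueError("position lies outside the S ORF")
    return offset + 1
```

For genotype A, nucleotides 2854-3221 contribute 368 positions and nucleotides 1-832 contribute the remaining 832, which sums to 1,200; the same arithmetic holds for 2848-3215 and 1-832 in genotypes B and C.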
[00115] The 1,200 nucleotide S ORF region comprises 400 codons. For the purpose of identifying specific codons of the S ORF region herein, the codons are numbered consecutively from the 5' end of the relevant subgenotype S ORF consensus sequence identified above. For example, codon 1 comprises nucleotides 1-3, codon 2 comprises nucleotides 4-6 and so forth to codon 400 comprising nucleotides 1,198-1,200 of the S ORF. For an S ORF sequence comprising one or more indels, specific codons located after the indel are identified by reference to the relevant subgenotype S ORF reference sequence.
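The codon numbering convention of paragraph [00115] reduces to simple arithmetic, sketched below with hypothetical helper names. Positions are 1-based and refer to the S ORF reference sequence, so the sketch does not itself handle indels in a subject's sequence.

```python
S_ORF_CODONS = 400


def codon_to_nucleotides(codon):
    """Return the (first, last) 1-based S ORF nucleotide positions of a codon."""
    if not 1 <= codon <= S_ORF_CODONS:
        raise ValueError("codon number must be between 1 and 400")
    return 3 * codon - 2, 3 * codon


def nucleotide_to_codon(position):
    """Return the codon number containing a 1-based S ORF nucleotide position."""
    if not 1 <= position <= 3 * S_ORF_CODONS:
        raise ValueError("position must be between 1 and 1,200")
    return (position - 1) // 3 + 1
```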
[00116] In one embodiment the method comprises generating sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject. In one embodiment the method comprises generating sequence information about the entire 1,200 nucleotides of the S ORF region. In another embodiment the method comprises generating sequence information about a portion of the S ORF region comprising two or more codons at which the frequency of non-synonymous mutations is to be determined.
[00117] In one embodiment the method comprises amplifying the S ORF region, or a portion thereof, to produce an amplimer. In one embodiment the S ORF, or a portion thereof, is amplified using sequential nested PCR reactions or other amplification strategies known in the art to reduce mispriming, non-specific amplification or to enhance specific amplification of the desired amplimer.
[00118] In one embodiment the method comprises performing two initial rounds of DNA synthesis using oligonucleotide primers wherein the 3' portion is comprised of a random nucleotide sequence and the 5' portion is a defined sequence. An example of such a strategy is described in Bohlander et al., 2002, Genomics 13: 1322-1324. In one embodiment an oligonucleotide primer comprising a 5' "defined" sequence comprising a sequence that is absent in HBV and human genomes and optimised for use in a PCR reaction according to criteria known to those skilled in the art and a 3' "random" sequence comprising 3 or more nucleotides selected at random is annealed to HBV DNA and extended using a polymerase (for example, a T7 DNA polymerase) resulting in first strand DNA synthesis. Following denaturation and re-annealing, a second strand synthesis is performed using the same or a similarly designed primer to produce a complementary second strand. This double-stranded DNA is then amplified in a PCR reaction using primers that anneal to the 5' defined sequences to amplify the S ORF region, or a portion thereof.
[00119] In one embodiment the method comprises generating sequence information about the amplimer using massively parallel sequencing technology (also known as second generation sequencing or next generation sequencing) in which multiple, spatially-separated DNA templates are sequenced simultaneously. Any suitable massively parallel sequencing technology known in the art can be used. In various embodiments the method comprises pyrosequencing, reversible terminator chemistry, sequencing by ligation or phospholinked fluorescent nucleotides. Suitable platform technologies are well known in the art, and include, but are not limited to, Illumina, Ion Torrent, Roche, Life Technologies, Complete Genomics, Helicos Biosciences, GS FLX Titanium or Pacific Biosciences platforms.
[00120] An exemplary method is described in the examples. Briefly, amplimers are fragmented by sonication. After fragmentation, end-repair, adenylation of 3' ends, adaptor ligation and amplification of the adaptor ligated libraries are performed using the NEBNext Ultra library preparation reagents for Illumina platform (NEB). Individual barcodes are added at the time of library formation using the NEBNext Multiplex Oligos Set-I. The Illumina libraries are sequenced on a NextSeq500 instrument (Illumina Inc.) to obtain about 10 million paired-end (PE) reads (150bp x 2) from each sample.
[00121] In one embodiment the method comprises generating sequence information about the S ORF directly from HBV DNA extracted from a suitable sample obtained from the subject using massively parallel sequencing technology. Any suitable extraction method known in the art that produces partial or complete HBV DNA suitable for sequencing can be used. Any suitable massively parallel sequencing technology known in the art can be used including the methods described above.
[00122] In one embodiment the method comprises using DNA hybridisation to obtain nucleotide or amino acid sequence information about the HBV from the sample. Suitable DNA hybridisation techniques include, but are not limited to, hybridisation of DNA, RNA or cDNA to DNA, cDNA or RNA probes attached either to a solid phase such as glass, plastic or silicon biochips or to microscopic polystyrene beads. Such techniques can be used to generate sequence information from partial or complete HBV DNA extracted from the sample or from an amplimer.
[00123] In one embodiment the method comprises generating sequence information about the S ORF using single molecule-based sequencing technology. Any suitable sequencing technology known in the art can be used. In various embodiments the method comprises sequencing by hybridisation or sequencing by synthesis. Suitable platform technologies are well known in the art, and include, but are not limited to, Ion Torrent, Life Technologies, or Pacific Biosciences platforms.
[00124] Other sequencing methods suitable for generating sequencing information about the S ORF region are well known in the art, including, for example, Sanger sequencing.
Oligonucleotide primer design
[00125] Suitable oligonucleotide primers can be designed using strategies and methods known in the art. A first step is to identify a portion of the HBV genome comprising the S ORF to be amplified. The portion comprises all of the codons in the S ORF to be assessed. The next step is to identify suitable primers to achieve the desired amplimer: a forward primer comprising 5 or more contiguous nucleotides at the 5' end of the portion to be amplified and a reverse primer comprising a sequence complementary to 5 or more contiguous nucleotides at the 3' end of the portion of the genome to be amplified. Forward and reverse primers are chosen based on consideration of parameters including, but not limited to, primer length, primer melting temperature, primer annealing temperature, GC content and primer secondary structures. A number of primer design tools are available to assist primer design including, but not limited to, Primer Premier, Primer-BLAST, DNASTAR, Primer3 and Oligo Primer Analysis Software. It will be apparent to those skilled in the art that unique primer pairs may be required for specific HBV subgenotypes and that such primers can be designed using the methods outlined above.
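Two of the primer parameters listed in paragraph [00125], GC content and melting temperature, can be estimated with a few lines of code. The sketch below uses the Wallace rule (2 °C per A or T, 4 °C per G or C), which is only a rough approximation suited to short oligonucleotides; it is offered as an assumption for illustration, and the dedicated primer design tools named above use more detailed models. The reverse-complement helper shows how a reverse primer could be derived from the 3' end of a target region given in the forward orientation.

```python
def gc_content(primer):
    """Fraction of G and C bases in a primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rough melting temperature in degrees Celsius (Wallace rule)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def reverse_complement(sequence):
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    complement = str.maketrans("ACGT", "TGCA")
    return sequence.upper().translate(complement)[::-1]
```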
[00126] In one embodiment the forward primer is derived from a region of the HBV genome comprising nucleotides 1500-3215 and nucleotides 1-60 (when the HBV is genotype B or C) or nucleotides 1500-3221 and nucleotides 1-60 (when the HBV is genotype A) and the reverse primer is derived from a region of the HBV genome comprising nucleotides 1-2000.
[00127] In one embodiment both primers anneal to a region of the HBV genome that falls outside the S ORF so that the PCR reaction amplifies the entire S ORF. In one embodiment the forward primer is designed to anneal to a position that partially overlaps with or falls wholly within the S ORF region of the HBV genome and the reverse primer anneals to a region of the HBV genome that falls outside the S ORF region of the HBV genome, or vice versa.
[00128] In one embodiment both primers are designed to anneal to a position that partially overlaps with or falls wholly within the S ORF region of the HBV genome. For example, the forward primer is derived from a region of the HBV genome comprising nucleotides 1500-2848 of the HBV genome and the reverse primer is derived from a region of the S ORF comprising nucleotides 832-2000.
[00129] In a preferred embodiment the primers are universal primers designed to bind and amplify a portion of the HBV genome of any HBV of any genotype or subgenotype. Such universal primers can be designed by targeting highly conserved regions of the HBV genome of sufficient length to accommodate primers. Examples of such primers are provided in the Examples below. Due to the frequent and rapid mutation of HBV, it may be necessary to design a suite of one or more forward primers and one or more reverse primers designed for use with the specific genotypes or subgenotypes or for use as a backup if a first primer pair does not produce the desired amplimer.
Determining the frequency of non-synonymous mutations in the S ORF region
[00130] In one embodiment the method of the invention comprises comparing the sequence information about a portion of the S ORF region of HBV in a sample obtained from the subject, for example, the reads obtained by next generation sequencing, and the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region. The frequency of non-synonymous mutations across the S ORF region is then determined based on the frequency of non-synonymous mutations at each codon.
[00131] In one embodiment, the sequence information is assembled into haplotypes followed by alignment of the assembled sequences (typically called contigs) to the S ORF reference sequence and analysis of genetic variation between individual contigs is performed.
[00132] In one embodiment the method comprises performing a bioinformatics analysis, for example, a sequence similarity analysis as described above and in the Examples, between the sequence information and the S ORF reference sequence to identify the presence or absence of non-synonymous mutations at each of two or more codons in the S ORF. Software programs suitable for mutation identification are well known in the art and include, but are not limited to, SAMtools, GATK HaplotypeCaller, FreeBayes, GATK UnifiedGenotyper, SOAPsnp, VarScan, and Platypus.
[00133] In one embodiment the method comprises determining whether each detected mutation is a synonymous or non-synonymous mutation. This will be performed with reference to the genetic code, which is familiar to those with expertise in the art.
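The classification in paragraph [00133] can be sketched by translating the reference codon and the observed codon with the standard genetic code and comparing the resulting amino acids. The table below is the standard code written in the conventional T, C, A, G order; the function name and the "undetermined" category for codons containing ambiguous bases are assumptions for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(reference_codon, observed_codon):
    """Classify an observed codon relative to the reference codon."""
    ref = reference_codon.upper()
    obs = observed_codon.upper()
    if ref == obs:
        return "identical"
    if ref not in CODON_TABLE or obs not in CODON_TABLE:
        return "undetermined"  # e.g. a codon containing N or a gap
    if CODON_TABLE[ref] == CODON_TABLE[obs]:
        return "synonymous"
    return "non-synonymous"
```

For example, GAA to GAG leaves glutamate unchanged and is synonymous, whereas GAA to GAT replaces glutamate with aspartate and is non-synonymous.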
[00134] In one embodiment the method comprises determining the frequency of non-synonymous mutations at each of two or more codons in the S ORF. Each sequence is individually aligned to the S ORF reference sequence followed by counting the number of all sequence reads from that subject that align to each codon within the S ORF. This will provide the denominator for non-synonymous mutation frequency calculations. The frequency of non-synonymous mutations at each of two or more codons in the S ORF will be calculated as the number of non-synonymous mutations detected at each codon divided by the number of sequence reads at that codon.
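The per-codon calculation in paragraph [00134] is a simple ratio: non-synonymous observations at a codon divided by the number of reads covering that codon. The sketch below assumes the counts have already been tallied from aligned reads; the dictionary-based interface, the function name and the choice to report None for uncovered codons are illustrative assumptions.

```python
def non_synonymous_frequency(non_synonymous_counts, read_depth):
    """Per-codon frequency of non-synonymous mutations.

    non_synonymous_counts: codon number -> reads carrying a non-synonymous change
    read_depth: codon number -> total reads aligned to that codon (denominator)
    """
    frequencies = {}
    for codon, depth in read_depth.items():
        if depth == 0:
            frequencies[codon] = None  # assumption: no coverage, no estimate
            continue
        frequencies[codon] = non_synonymous_counts.get(codon, 0) / depth
    return frequencies


# Hypothetical counts: codon 45 has 50 mutant reads out of 1,000.
example = non_synonymous_frequency({45: 50}, {45: 1000, 46: 200, 47: 0})
```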
[00135] In one embodiment the method comprises determining a numerical score representing the frequency of non-synonymous mutations across the S ORF. In one embodiment the method comprises assigning a category or grade based on the frequency of non-synonymous mutations across the S ORF.
[00136] In one embodiment the method comprises assigning a numerical score for each codon based on the frequency of non-synonymous mutations at that codon. In one embodiment the numerical score is assigned based on a threshold frequency of non-synonymous mutations at each codon. For example, the method comprises assigning a score of 0 if the frequency of non-synonymous mutations at that codon is less than the threshold frequency, and assigning a positive score if the frequency of non-synonymous mutations at that codon is equal to or greater than the threshold frequency. In one embodiment the method comprises assigning a category or grade defined by a threshold frequency of non-synonymous mutations at each codon.
[00137] The positive score assigned at each codon may be weighted, for example, a higher score may be assigned to codons under the strongest positive selection pressure.
[00138] In one embodiment the numerical score assigned at each codon is combined to provide a combined numerical score representing the frequency of non-synonymous mutations in the S ORF region of the HBV. In one embodiment the category or grade assigned at each codon is combined to provide a combined grade representing the frequency of non-synonymous mutations in the S ORF region of the HBV.
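The thresholded, optionally weighted scoring of paragraphs [00136] to [00138] might be sketched as follows. The threshold value, the default weight of 1 and the use of a simple sum to combine per-codon scores are assumptions chosen for illustration; this section of the description does not fix any of them.

```python
def codon_score(frequency, threshold, weight=1.0):
    """Score 0 below the threshold, otherwise a positive (weighted) score."""
    return weight if frequency >= threshold else 0.0


def combined_score(frequencies, threshold, weights=None):
    """Sum the per-codon scores across the codons assessed.

    frequencies: codon number -> frequency of non-synonymous mutations
    weights: optional codon number -> weight, e.g. higher for codons under
             the strongest positive selection pressure (hypothetical values)
    """
    weights = weights or {}
    return sum(
        codon_score(freq, threshold, weights.get(codon, 1.0))
        for codon, freq in frequencies.items()
    )


# Hypothetical frequencies at three codons with a hypothetical 5% threshold.
frequencies = {10: 0.02, 45: 0.10, 120: 0.30}
unweighted = combined_score(frequencies, threshold=0.05)
weighted = combined_score(frequencies, threshold=0.05, weights={120: 2.0})
```

A combined score produced in this way could then be compared against a predetermined threshold score when interpreting the result.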
ARMS assay and other methods
[00139] In an alternative embodiment, the method comprises detecting the frequency of non-synonymous mutations at each codon by amplification refractory mutation system (ARMS) assay. In an ARMS assay, amplification by PCR is performed using primers comprising 3' terminal sequences that are designed to distinguish between the bases present at each codon in the S ORF reference sequence and non-synonymous mutations at that codon present in the S ORF of the HBV in a sample obtained from the subject.
[00140] PCR primers for use in ARMS are designed based on the S ORF reference sequence corresponding to the subgenotype of the S ORF region of the HBV the subject is infected with. In order to determine the frequency of non-synonymous mutations at the two or more codons of interest in the S ORF, a PCR primer is designed for use in a PCR reaction to specifically amplify at least a portion of the S ORF comprising each codon only when the wild-type codon is present, i.e. the nucleotide sequence of the codon in the S ORF reference sequence for the subgenotype of the S ORF of the HBV the subject is infected with. Other primers are designed for use in PCR reactions to specifically amplify at least a portion of the S ORF comprising that codon when each possible non-synonymous mutation at that codon is present. The amplimers of each reaction are subjected to agarose gel electrophoresis and the bands corresponding to each amplimer are detected and quantified to determine the frequency of non-synonymous mutations using methods known in the art.
[00141] In other embodiments, the method comprises detecting the presence or absence of amino acid substitutions in the protein sequence of the S ORF using mass spectrometry, labelled antibodies, fluorescent-labelled probes, metal-labelled probes or any other molecules that distinguish wild-type and mutated amino acids. The wild-type amino acids at each position are those amino acids coded for by the codons at the corresponding positions in the S ORF reference sequence.
[00142] In some embodiments, the labelled antibodies, probes or other molecules are detected by flow cytometry or by cytometry by time of flight (CyTOF) analysis.
[00143] In other embodiments, the presence of non-synonymous mutations in the nucleotide or protein sequence of the S ORF will be detected by analysis of transfected, transduced or infected cells in in vitro culture or by any analysis of the supernatants from in vitro cultures.
[00144] In these embodiments, the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at each codon can be determined using any method described herein.
Using the frequency to predict HBV genetic status and clinical disease
[00145] Chronic hepatitis B (CHB) is associated with liver complications including liver inflammation, cirrhosis and liver cancer. Patients with CHB may be asymptomatic for many years and only present with clinical symptoms at an advanced stage of disease.
[00146] The method of the invention is suitable for screening any subject infected with HBV genotype A, B or C. The method is particularly contemplated for screening subjects with chronic HBV who do not exhibit clinical symptoms of, or who have not been diagnosed with, liver complications of HBV, for example, subjects considered to be inactive healthy carriers (IHC) of HBV.
[00147] In other embodiments the subject tests negative for hepatitis B e antigen (HBeAg), has normal serum ALT levels, has a serum HBV-DNA titre of greater than about 2,000 IU/ml, or any combination thereof. HBeAg negativity usually indicates that a patient has a low risk of developing liver inflammation and cirrhosis, and most patients are regarded as being inactive healthy carriers (IHC). However, a sub-group of HBeAg-negative patients with a normal baseline ALT level and HBV-DNA levels greater than 2,000 IU/ml can develop a form of chronic liver inflammation known as HBeAg-negative chronic hepatitis B (e-CHB), which is associated with rapid progression to liver cirrhosis.
[00148] In various embodiments the frequency of non-synonymous mutations in the S ORF region of HBV is used to determine the liver inflammation status of the subject or the genetic status of the HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection.
[00149] In one embodiment the frequency of non-synonymous mutations in the S ORF region of the HBV indicates that the subject is suffering from liver inflammation, for example, early stage liver inflammation with no clinical symptoms.
[00150] In one embodiment the method is used to determine the genetic status of the HBV the patient is infected with. In one embodiment the frequency of non-synonymous mutations indicates that the HBV is under positive selection pressure.
[00151] In one embodiment the frequency of non-synonymous mutations in the S ORF region of the HBV indicates that the subject is susceptible to or at risk of developing liver inflammation or one or more liver complications of chronic HBV infection. In various embodiments the liver complications of chronic HBV infection comprise liver cirrhosis, liver cancer, liver failure, liver inflammation, liver damage, liver dysfunction, or a combination of any two or more thereof.
[00152] In other embodiments the subject is HBeAg-positive or has elevated serum ALT levels or is both HBeAg-positive and has elevated serum ALT levels. In some embodiments the method is used to confirm that the subject is suffering from liver inflammation, for example, early stage liver inflammation.
[00153] In one embodiment the frequency of non-synonymous mutations in the S ORF region indicates that the subject is susceptible to or at risk of developing liver inflammation or liver complications within 1 year, or 2, 3, 4, 5, 6, 7, 8, 9 or 10 years.
[00154] In one embodiment a numerical score representing the frequency of non-synonymous mutations in the S ORF region is used to determine clinical outcome. For example, a numerical score greater than or equal to a predetermined threshold score indicates that the HBV is under positive selection pressure or that the subject is susceptible to or at risk of developing liver inflammation or liver complications of chronic HBV infection.
[00155] In various embodiments a numerical score of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more indicates that the HBV is under positive selection pressure or that the subject is susceptible to or at risk of developing liver inflammation or liver complications of chronic HBV infection.
[00156] In various embodiments the frequency of non-synonymous mutations in the S ORF is used to determine that a subject should undergo further clinical testing, for example a liver scan or biopsy to identify liver complications of HBV. In other embodiments the frequency of non-synonymous mutations is used to determine a clinical screening protocol for that patient, for example, to determine a frequency of testing for serum ALT, HBeAg and HBV DNA levels. In other embodiments the frequency of non-synonymous mutations is used to recommend that the method of the invention should be repeated within a certain timeframe.
[00157] In various embodiments the frequency of non-synonymous mutations in the S ORF region is used to monitor the response to therapy, predict the response to a therapy, select a therapy, or determine the optimal timing, duration or regimen of a therapy in the subject.
[00158] Immunotherapy includes any therapy that alters or enhances the immune response of a subject to a pathogen, in particular, a virus such as HBV. The term immunotherapy particularly contemplates treatment or therapy administered to a subject with a chronic HBV infection.
[00159] In various embodiments the frequency of non-synonymous mutations in the S ORF region is used to monitor the response to immunotherapy, predict the response to an immunotherapy, select a therapy, or determine the optimal timing, duration or regimen of a therapy in the subject.
[00160] In various embodiments the frequency of non-synonymous mutations in the S ORF region is determined in the subject at a regular interval. In one embodiment an increase in the frequency of non-synonymous mutations in the S ORF region is indicative of an increased risk of developing liver inflammation or liver complications of chronic HBV or that the subject should commence a particular immunotherapy or other therapy regime.
EXAMPLES
EXAMPLE 1
[00161] This example describes cloning and analysis of the HBV S ORF from a first patient cohort infected with HBV.
1. Methods
[00162] The New Zealand Hepatitis B Screening Programme monitors 1,439 Tongan adults with a chronic HBV infection at 6-12 monthly intervals. 345 of these subjects were recruited into this study at the time of a routine blood test. An extra 3 mL serum was taken for HBV-DNA analyses, and genomic DNA was extracted from peripheral blood leucocytes for HLA class I genotyping. All subjects gave written consent, and the study was approved by the Northern X Regional Ethics Committee of the New Zealand Ministry of Health.
[00163] Initially, the full HBV genome from the serum of 47 of the 345 subjects was sequenced for the purpose of determining the wild-type sequences of the HBV in Tonga, and identifying conserved primer sites for sequencing and PCR. Next, all HBeAg-negative subjects with a normal serum ALT level and a genotype C infection, who had PCR-detectable HBV DNA, were selected for a cross-sectional analysis of S ORF mutations. None of these subjects had ever been treated for chronic hepatitis B.
Cloning and sequencing the HBV genome
[00164] HBV-DNA was extracted from 300 µl of serum using the High Pure Viral Nucleic Acid kit (Roche Diagnostics, Indianapolis, USA). There are both between-subject and within-subject differences in the sequence of the HBV in the Tongan population. Consequently it was necessary to design specific PCR primers for amplifying and sequencing the HBV from some individual subjects, especially those who were HBeAg-negative.
[00165] The first step was to obtain a consensus sequence of base pairs 1700-2411. The primers used for amplifying this sequence were identified by finding conserved sequences in a Multalin (http://multalin.toulouse.inra.fr/multalin/multalin.html) alignment of 54 genotype C and genotype D sequences obtained from the NCBI nucleotide database (http://www.ncbi.nlm.nih.gov/sites/entrez). From the alignment, three forward primers were determined which all match two reverse primers (all shown in Table 1). Tm values were calculated using Oligo Primer Analysis Software, version 7.58 (Molecular Biology Insights Inc, CO).
Table 1: Primers used to amplify region 1700-2411 of the HBV genome
Figure imgf000046_0001
Base numbers correspond to a genotype C3 sequence (Genbank number X75656).
[00166] The HBV genome was cloned in two fragments. A 2.6 kb clone that included the S ORF, the terminal 2455 bp of the P ORF and the initial 403 bp of the X ORF was amplified using primers that are complementary to conserved sequences in the 1700-2411 bp sequence described above and were designed to amplify the minus strand of the HBV. Conserved sequences were identified by visual inspection of the 47 1700-2411 bp sequences in the Multalin sequence alignment program (http://multalin.toulouse.inra.fr/multalin/multalin.html). The conserved sequences that were used as primers to clone the 2.6 kb fragment are described in Table 2 and Table 3.
Table 2: External primer pairs used to clone the 2.6kb fragment
Figure imgf000047_0001
Table 3: Primers used for the nested PCR reaction
Figure imgf000047_0002
Table 4: Internal sequencing primers
Figure imgf000047_0003
[00167] The upper primer for the first PCR reaction was either Forward 4 or Forward 5 and the lower primer was either Reverse 3 or Reverse 4 (Table 2). The PCR products from the first reaction were purified and concentrated using polyethylene glycol (PEG) MW = 8000 (P5413, Sigma Aldrich, St. Louis, MO) and resuspended in 5 µl of water, and an aliquot of between 0.1-5.0 µl was used as the template for a second, nested PCR reaction using a primer pair comprising either Upper 6 or Upper 7 as the upper primer and Reverse 4 as the lower primer (Table 3).
[00168] PCR cycling conditions for each reaction were determined using Oligo Primer Analysis Software, Version 6 (Molecular Biology Insights; www.oligo.net). The PCR strategy for all PCR reactions described above was a 3 stage touchdown protocol. The denaturation temperature was 96°C for 10 seconds for the first stage, and 2°C above the Tm of the product for 10 seconds for the last two stages. The first and second stage annealing temperatures were set at the Tm of the primers (with a maximum of 72°C) for 15 seconds, decreasing at 0.2°C and 0.5°C per cycle for 8 cycles and 4 cycles for the first and second stages respectively. The third stage annealing temperature was set at the optimal value calculated by the Oligo Primer Analysis Software for 30 cycles. The extension temperature was 72°C, and the extension time was 50 to 60 seconds per kilobase of product. All amplifications of HBV-DNA were performed using Accuprime Taq DNA Polymerase High Fidelity (Invitrogen Life Technologies, Carlsbad, CA).
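As an illustration only, the annealing temperatures of the 3 stage touchdown protocol described above can be tabulated cycle by cycle. The following Python sketch assumes that each of the first two stages starts at the primer Tm (capped at 72°C) and that the first cycle of a stage runs at that starting temperature. The function names and the example Tm and optimal annealing values are hypothetical and are not output of the Oligo software.

```python
def touchdown_annealing_schedule(primer_tm, optimal_anneal, final_cycles=30):
    """Return the annealing temperature (deg C) for each cycle of the 3 stage protocol."""
    start = min(primer_tm, 72.0)  # primer Tm, capped at a maximum of 72 deg C
    schedule = []
    # Stage 1: 8 cycles, decreasing by 0.2 deg C per cycle.
    # Stage 2: 4 cycles, decreasing by 0.5 deg C per cycle.
    # Assumption: each of these stages restarts from the capped primer Tm.
    for step, cycles in ((0.2, 8), (0.5, 4)):
        for cycle in range(cycles):
            schedule.append(round(start - step * cycle, 1))
    # Stage 3: fixed annealing temperature for the remaining cycles.
    schedule.extend([optimal_anneal] * final_cycles)
    return schedule


def extension_seconds(product_kb, seconds_per_kb=60):
    """Extension time in seconds for a product of the given length in kilobases."""
    return product_kb * seconds_per_kb


# Hypothetical primer Tm of 65.0 deg C and optimal annealing temperature of 58.0 deg C.
schedule = touchdown_annealing_schedule(65.0, 58.0)
print(schedule[:8])    # stage 1 temperatures
print(schedule[8:12])  # stage 2 temperatures
print(len(schedule))   # total number of cycles
```

With 50 to 60 seconds per kilobase, `extension_seconds` gives roughly 130 to 156 seconds for the 2.6 kb fragment.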
[00169] PEG-cleaned amplimers from the nested PCR reaction were A-tailed (30-300 ng of amplimer, 0.88 µl 10x ammonium sulphate buffer (pH = 8.4), 0.8 µl 2 mM dATP, 0.33 µl 25 mM MgCl2, 3 U Taq polymerase, water to 8.8 µl) for 30 minutes at 72°C, ligated into pGEM-T (Promega, Madison, WI), and used to transform E. coli (DH5α). One full length clone was sequenced from each PCR reaction.
[00170] Sanger sequencing reactions contained 20-40 ng of cloned DNA and 1 µl of ABI Prism BigDye Terminator v3.1 cycle sequencing mix (Applied Biosystems, Foster City, CA). The sequencing reactions were performed using annealing temperatures of either 56°C or 50.5°C, depending on the Tm of the sequencing primer. The sequencing reaction products were purified by magnetic beads (CleanSEQ, Agencourt Biosciences Corp., Beverly, MA), and analysed on an ABI Prism genetic analyser 3130XL fitted with a 50 cm capillary array using POP7 polymer. Sequence analysis and contig assembly were performed using Chromas 1.61 (Technelysium Pty Ltd, Queensland, Australia).
[00171] To minimise the risk of contamination between samples, all PCR cocktails were prepared in a PCR workstation (Bigneat Ltd, Hampshire, UK), and transferred to a separate room for addition of templates. Thermal cycling and agarose gel analysis of PCR products were conducted in a third room. Transformation and cloning of PCR products was performed in an externally-vented fume cabinet, with an immediately adjacent, dedicated 4-8°C refrigerator, 37°C incubator and water bath. There is a unique repertoire of mutations in the HBV population from all subjects, and cross-contamination between subjects can be identified from the mutation profiles.
Genotyping the HBV
[00172] The genotype of each HBV sequence was determined by aligning the sequence of the amplimer produced by any one of the primer pairs against a panel of sequences obtained from the NCBI nucleotide database (Genbank Accession and Version in parentheses): genotype A (AF297623.1, AY128092.1, AB126580.1), genotype B (AY167100.1, AY206391.1, AY206390.1), genotype C (AY167090.1, AY167096.1, AY167092.1, X75656.1), genotype D (AB126581.1, AB104711.1, AB104712.1), genotype E (AB091256.1, AB091255.1), genotype F (AB036920.1, AY179734.1, AY179735.1), genotype G (AB064315.1, AF405706.1) and genotype H (AY090460.1, AY090457.1). The first amplimers sequenced (containing bases 1700-2411, see above) contained a haplotype of 52 bases that distinguish genotype C from genotype D in the Tongan population, and this sequence was used to exclude genotype D subjects. The possibility that any subject had a virus of mixed genotype was excluded by aligning the sequences of the 2.6 kb clones with the panel of sequences above. No Tongan subjects with a virus comprising a mixture of genotype C and genotype D sequences were identified.
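The alignment-based genotype assignment described above can be sketched in simplified form as follows. The sketch assumes that the query and the reference sequences have already been aligned to a common length, and it assigns the genotype of the closest reference by percentage identity. The function names are hypothetical and the short toy sequences are placeholders, not real HBV sequences or the Genbank entries listed above.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences are identical."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def assign_genotype(query, reference_panel):
    """Return (genotype, identity) of the reference closest to the query."""
    best_genotype, best_identity = None, -1.0
    for genotype, references in reference_panel.items():
        for reference in references:
            identity = percent_identity(query, reference)
            if identity > best_identity:
                best_genotype, best_identity = genotype, identity
    return best_genotype, best_identity


# Toy, pre-aligned placeholder sequences (hypothetical, for illustration only).
panel = {"C": ["ACGTACGTAC"], "D": ["ACGTTCGTTC"]}
print(assign_genotype("ACGTACGTAA", panel))
```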
[00173] 142 clones were obtained that contained the full length S ORF sequence of HBV genotype C from 19 subjects, with a range of 5 to 12 clones per subject. Clones containing indels and/or nonsense mutations were excluded from further analysis.
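The exclusion of clones containing indels or nonsense mutations could be automated along the following lines. This is a minimal sketch, not the procedure used in the study: it assumes each clone is supplied as an in-frame S ORF coding sequence, treats any change in length relative to the reference as an indel, and treats any stop codon before the final codon as a nonsense mutation. The function names and toy sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_premature_stop(orf):
    """True if a stop codon occurs anywhere before the final codon."""
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 3, 3)]
    return any(codon in STOP_CODONS for codon in codons)


def keep_clone(orf, reference_length):
    """Keep a clone only if it has no length change and no premature stop."""
    orf = orf.upper()
    if len(orf) != reference_length:
        return False  # a length difference implies an insertion or deletion
    return not has_premature_stop(orf)


# Toy 12-base sequences (hypothetical), reference length 12.
clones = ["ATGGAGAACTAA", "ATGTAGAACTAA", "ATGGAGAATAA"]
print([keep_clone(clone, 12) for clone in clones])
```

A length check does not detect an insertion that is balanced by a deletion elsewhere; an alignment against the reference would be needed for that.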
EXAMPLE 2
[00174] This example describes cloning and analysis of the HBV genome obtained from a second patient cohort infected with HBV.
1. Methods
[00175] The New Zealand Hepatitis B Screening Programme monitors over 10,000 adults with a chronic HBV infection at 6-12 monthly intervals. 250 of these subjects who were HBeAg-negative and with a normal ALT level when they entered the screening program were recruited independently of age, ethnicity or gender into this study at the time of a routine blood test. All subjects gave written consent for their initial screening serum sample to be used for the study, which was approved by the Northern X Regional Ethics Committee of the New Zealand Ministry of Health. HBV-DNA levels were measured using the COBAS Ampliprep/COBAS Taqman HBV Test, version 2.0, which has a linear range of 1.30 to 8.23 log10 IU/ml. This study included 12 subjects with HBV-DNA > 2,000 IU/ml and 5 subjects with HBV-DNA < 2,000 IU/ml who were all genotype C.
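Because the assay range is quoted in log10 IU/ml while the grouping threshold is quoted in IU/ml, a small conversion helps when comparing the two. The sketch below is illustrative only; the constant and function names are hypothetical.

```python
import math

LINEAR_RANGE_LOG10 = (1.30, 8.23)  # assay linear range in log10 IU/ml (from the text)
VIRAL_LOAD_CUTOFF = 2000           # grouping threshold in IU/ml (from the text)


def log10_titre(iu_per_ml):
    """Convert an HBV-DNA titre in IU/ml to log10 IU/ml."""
    return math.log10(iu_per_ml)


def in_linear_range(iu_per_ml):
    """True if the titre falls inside the assay's linear range."""
    low, high = LINEAR_RANGE_LOG10
    return low <= log10_titre(iu_per_ml) <= high


def above_cutoff(iu_per_ml):
    """True if the titre is greater than 2,000 IU/ml."""
    return iu_per_ml > VIRAL_LOAD_CUTOFF


print(round(log10_titre(VIRAL_LOAD_CUTOFF), 2))  # the 2,000 IU/ml threshold in log10 units
```

On this scale the 2,000 IU/ml threshold is about 3.30 log10 IU/ml, well inside the linear range of the assay.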
Cloning and sequencing the HBV genome.
[00176] HBV DNA was extracted from 500 µl of serum using the QIAamp UltraSens Virus Kit (QIAGEN GmbH, Hilden, Germany). The S open reading frame of the HBV was amplified in sequential nested PCR reactions. All PCR reactions used 5 µl of template and Phusion High Fidelity DNA Polymerase (Finnzymes, Espoo, Finland). The optimal annealing temperatures were determined from the manufacturer's website (www.finnzymes.com).
[00177] An initial PCR reaction was performed using either Primer Forward 4 or Primer Forward 5 with Reverse Primer 4 listed in Table 2 above. A 3 stage touchdown protocol was used. The denaturation temperature was 98°C for 10 seconds. The first and second stage annealing temperatures were initiated at the Tm of the primers (with a maximum of 72°C) for 10 seconds, decreasing at 0.2°C and 0.5°C per cycle for 8 cycles and 4 cycles for the first and second stages respectively. The third stage annealing temperature was set at the optimal value calculated by the manufacturer's website for 25 cycles. The extension temperature was 72°C, and the extension time was 30 seconds per kilobase of product.
[00178] A nested PCR was performed on the product of the initial PCR. The nested PCR used HF buffer and a primer pair comprising the forward primer and one of the reverse primers listed in Table 5 below. A 3 stage touchdown protocol was used as described above for the initial PCR.
Table 5: Primer pairs used in nested PCR
Figure imgf000050_0001
[00179] PEG-cleaned amplimers were A-tailed and transformed into E. coli as described in Example 1.
[00180] Sanger sequencing was performed as described for Example 1.
[00181] Measures to minimise the risk of contamination were taken as described for Example 1.
Genotyping the HBV
[00182] The genotype of each HBV sequence was determined by aligning the sequence produced by any one of the primer pairs against a panel of HBV sequences as described above for Example 1.
[00183] 156 clones containing the full length S ORF sequence were obtained from 17 subjects. Clones containing indels and/or nonsense mutations were excluded from further analysis.
EXAMPLE 3
[00184] This example describes PAML analyses of HBV genotype C to identify S ORF codons under positive selection pressure.
1. Methods
[00185] PAML analyses were performed on the S ORF clones obtained as described in Examples 1 and 2 above.
[00186] The nature of the selection pressures acting on the HBV viruses was investigated using the methods of Yang et al. (Mol. Biol. Evol. 2005; 22: 1107-1118). In these methods, the rate at which non-synonymous substitutions occur (dN) in the codon sequence is compared with the rate for synonymous substitutions (dS). These rates are usually compared by calculating their ratio (often represented as ω). When evolution occurs by the random processes of mutation and drift of nucleotides, the rate of non-synonymous substitutions is expected to be the same as that for synonymous substitutions (ω = 1) and selection is said to be neutral. When amino acid-altering substitutions are disadvantageous, and purged, then the ratio ω will fall in the range 0-1. This represents negative selection. When selection favours amino acid diversity, the rate of non-synonymous substitutions exceeds the rate expected by random processes (ω > 1), to give positive selection. These substitution rates of codon evolution must be inferred on a phylogenetic tree.
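The interpretation of the ratio ω described above can be written as a small helper. This sketch only labels a ratio that has already been estimated; it does not estimate dN or dS, which requires a codon substitution model fitted on a phylogenetic tree. The function name and the numerical tolerance used to call a ratio neutral are hypothetical choices.

```python
def selection_regime(dn, ds, tolerance=1e-9):
    """Return (omega, label) for given non-synonymous (dn) and synonymous (ds) rates."""
    if ds <= 0:
        raise ValueError("ds must be positive to form the ratio dn/ds")
    omega = dn / ds
    if abs(omega - 1.0) <= tolerance:
        return omega, "neutral"
    if omega > 1.0:
        return omega, "positive"
    return omega, "negative"


print(selection_regime(1.0, 0.5))  # omega above 1
print(selection_regime(0.2, 0.4))  # omega between 0 and 1
print(selection_regime(0.5, 0.5))  # omega equal to 1
```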
[00187] The phylogenies of the viruses obtained as described in Examples 1 and 2 were constructed using the maximum likelihood method in PhyML (Guindon et al., Proc. Natl. Acad. Sci. USA 2004; 101: 12957-12962). The Phylogenetic Analysis by Maximum Likelihood program (PAML; model 2a; Yang et al., Mol. Biol. Evol. 2005; 22: 1107-1118) was then used to estimate the overall proportion of codons in the whole alignment that were under positive, neutral or negative selection pressures across the whole tree. Each codon was then assigned to the selection category (negative, neutral or positive) to which it had the greatest posterior probability of belonging, as calculated using the Bayes Empirical Bayes (BEB) criterion.
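The final assignment step, in which each codon is placed in the category with the greatest posterior probability, can be sketched as below. The sketch assumes the per-codon posterior probabilities have already been obtained from the PAML analysis; the data structure, the function name and the probability values are hypothetical and are not results from Table 6.

```python
def assign_selection_class(posteriors):
    """Return the selection category with the greatest posterior probability."""
    return max(posteriors, key=posteriors.get)


# Hypothetical posterior probabilities for three codons (illustrative values only).
site_posteriors = {
    4: {"negative": 0.02, "neutral": 0.03, "positive": 0.95},
    10: {"negative": 0.90, "neutral": 0.08, "positive": 0.02},
    20: {"negative": 0.30, "neutral": 0.60, "positive": 0.10},
}

assignments = {codon: assign_selection_class(p) for codon, p in site_posteriors.items()}
print(assignments)
```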
2. Results
[00188] Table 6 summarises the PAML analysis of 298 S ORF clones obtained from the 36 subjects in Example 1 and Example 2. It lists all codons for which ω was > 2.0, identifying the codons with strongest positive selection pressure. Positive selection pressure in codons marked with asterisks (* or **) has reached a conventional level (Pr>0.95) of two-tailed statistical significance.
Table 6: Results of PAML analysis of HBV genotype C clones
Figure imgf000052_0001
EXAMPLE 4
[00189] This example investigates the correlation between the frequency of non-synonymous mutations at codons in the S ORF under positive selection pressure with the development of chronic hepatitis B (CHB).
1. Methods
[00190] Sixteen subjects from Example 1 (n = 4) and Example 2 (n = 12) were identified as having an HBeAg-negative, genotype C chronic hepatitis B virus infection with a serum HBV-DNA > 2,000 IU/ml and a normal serum ALT level. There was a minimum of 10 years follow up from collection of the analysed serum sample available for all these subjects through the local laboratory database and hospital notes. This allowed every subject to be classified as having either developed serious hepatitis B-related liver disease (liver inflammation, liver cirrhosis, liver failure or liver cancer) or having developed inactive liver disease over this time. There were 8 subjects in each group. Every subject had 6 cloned full length S ORF sequences available for analysis. Only the first 6 clones obtained were analysed in subjects from whom more than 6 clones were obtained. The number of the above 32 positively selected codons (Pr>0.90) that had either 2, 3, 4, 5 or 6 of the 6 clones containing a non-synonymous mutation was counted in each subject. This potentially can result in any number from 0 to 32. This number is the score for that subject. The scores found in the 16 subjects ranged from 0 to 11.
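The scoring rule described above can be sketched as follows. The sketch assumes that each clone has already been reduced to the set of codon positions at which it carries a non-synonymous mutation; that representation, the function name and the toy clone data are hypothetical. The codon list is the list of 32 positively-selected codons given in Example 5.

```python
POSITIVELY_SELECTED_CODONS = (
    4, 35, 51, 54, 56, 73, 81, 84, 120, 132, 135, 141, 184, 188, 192, 195,
    198, 214, 219, 221, 242, 270, 275, 300, 334, 358, 363, 377, 378, 382,
    387, 391,
)


def subject_score(clone_mutations, codons=POSITIVELY_SELECTED_CODONS,
                  min_clones=2, max_clones=6):
    """Count codons mutated in at least `min_clones` of the first `max_clones` clones."""
    clones = clone_mutations[:max_clones]  # only the first 6 clones are analysed
    score = 0
    for codon in codons:
        mutated_clones = sum(1 for mutated in clones if codon in mutated)
        if mutated_clones >= min_clones:
            score += 1
    return score


# Toy data (hypothetical): mutated codon positions in each of 7 clones.
toy_clones = [{4, 35}, {4}, {35, 51}, {4}, set(), {120}, {51}]
print(subject_score(toy_clones))  # the 7th clone is ignored
```

In the toy data codon 4 is mutated in three clones and codon 35 in two, while codons 51 and 120 are each mutated in only one of the first six clones, so only two codons contribute to the score.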
2. Results
[00191] A Receiver Operating Characteristic (ROC) curve (Figure 1) was generated from this data to show the relationship between the sensitivity and specificity of the test for each possible score.
[00192] The sensitivity (or true positive rate) is calculated as the number of subjects with a score greater than or equal to that score who developed liver disease divided by the total number of subjects who developed liver disease. The false positive rate is calculated as the number of subjects with a score greater than or equal to that score who did not develop liver disease divided by the total number of subjects who did not develop liver disease. This is presented in the ROC curve as 1-specificity. Since all subjects had a score of greater than or equal to zero, this score is 100% sensitive and has a 100% false positive rate. A score of greater than or equal to 2 is 100% sensitive and has a false positive rate of 38%. A score of greater than or equal to 5 is 75% sensitive with a false positive rate of 0%.
[00193] The survival curve shown in Figure 2 compares the 10 year follow up outcome of 9 subjects with a score of greater than or equal to 3 (group 1) with the outcome of 7 subjects with a score of less than or equal to 2 (group 0).
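The sensitivity and false positive rate calculation described for the ROC curve can be sketched as below. The function name is hypothetical and the scores are toy values chosen for illustration; they are not the scores of the 16 study subjects.

```python
def roc_point(scores_disease, scores_no_disease, threshold):
    """Return (sensitivity, false_positive_rate) for the rule 'score >= threshold'."""
    true_positives = sum(score >= threshold for score in scores_disease)
    false_positives = sum(score >= threshold for score in scores_no_disease)
    sensitivity = true_positives / len(scores_disease)
    false_positive_rate = false_positives / len(scores_no_disease)
    return sensitivity, false_positive_rate


# Toy scores (hypothetical) for subjects who did and did not develop liver disease.
disease = [3, 5, 7, 11]
no_disease = [0, 1, 2, 3]

for threshold in range(0, 12):
    print(threshold, roc_point(disease, no_disease, threshold))
```

Plotting sensitivity against false positive rate over all thresholds gives the ROC curve; a threshold of zero always yields the point (1.0, 1.0).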
[00194] The x axis shows the time to development of liver inflammation in 7 subjects in group 1 and 1 subject in group 0; and the duration of follow up in the 8 subjects who continued to have inactive disease for over 10 years is also shown (+) as the censored observations. Figure 2 shows a statistically significant difference between the 2 groups (Wilcoxon Chi-square=7.2, p=0.007). The time to development of liver disease in the single subject from group 0 (with a score of 2) was longer than in any subject from group 1. Both the subjects in group 1 who did not develop liver disease had scores of 3. In addition one of these subjects had a C subgenotype that did not fit any currently known subgenotype and thus the score for this subject may not have been a reflection of immune-mediated positive selection.
[00195] The results show that determination of the frequency of non-synonymous mutations in HBV is useful for sensitively and specifically predicting the development of liver disease in HBV-infected patients.
EXAMPLE 5
[00196] The purposes of the studies described in this example were:
i) to determine whether the same repertoire of positively-selected mutations was detected by Next Generation Sequencing (NGS) of 66 pooled clones as by Sanger sequencing of the same 66 individual clones. This was to confirm that NGS could identify the same range of mutations at the 32 positively-selected codons of interest as had been found by AT cloning;
ii) to determine if the full repertoire of mutations found in 36 Sanger-sequenced clones made from 6 subjects with either liver inflammation or inactive disease could be found in one NGS run containing one PCR reaction pooled from each subject. This was to determine whether amplification bias in the PCR reactions resulted in failure to identify some mutations; and
iii) to assess the best method for fractionation of template DNA before NGS.
1. First study
[00197] In the first study, 66 S ORF clones of known sequence from 11 subjects in Example 2 were pooled (10 ng DNA/clone) for Illumina NGS sequencing. The clones were first fragmented by mechanical shearing on an Epishear sonicator (Active Motif) for 15 minutes at 65% amplitude and at 4°C with 30 seconds on and 30 seconds off cycles. After fragmentation, the end-repair, adenylation of 3' ends, adaptor ligation and amplification of the adaptor ligated libraries were performed using the NEBNext Ultra library preparation reagents for Illumina platform (NEB). The Illumina libraries were then sequenced on the NextSeq500 instrument (Illumina Inc.) to obtain about 10 million paired-end (PE) reads (150 bp x 2) from each sample.
[00198] For bioinformatic analysis, the NGS sequences were aligned with Burrows-Wheeler Aligner to a panel of reference genotype C subgenotype sequences for subgenotypes C1, C2, C3a, C3b and C4 (SEQ ID Nos: 22-26). Alignments were analysed with the SAMtools program and a custom filter to identify which of the positively-selected codons identified in Example 3 (viz codons 4, 35, 51, 54, 56, 73, 81, 84, 120, 132, 135, 141, 184, 188, 192, 195, 198, 214, 219, 221, 242, 270, 275, 300, 334, 358, 363, 377, 378, 382, 387 or 391) contained non-synonymous mutations at a frequency of greater than 0.01 of the reads. All mutations identified in 31 of the 32 positively selected codons from the 66 Sanger sequenced clones were identified in the NGS sequence. The DNA sequence surrounding codon 84 was missing from the NGS sequence for reasons we could not determine, and thus the mutation repertoire at this codon could not be assessed. We concluded that NGS was accurate enough to detect all mutations in S ORF codons under positive selection.
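The custom filter itself is not described in detail, so the following Python sketch shows only one plausible way to test a single codon against the 0.01 frequency threshold. It assumes that the aligned reads have already been summarised as a count of each observed codon at the position of interest, uses the standard genetic code, and counts a change to a stop codon as non-synonymous. The function names and the read counts are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def nonsynonymous_fraction(reference_codon, codon_counts):
    """Fraction of reads whose codon encodes a different residue from the reference."""
    reference_aa = CODON_TABLE[reference_codon]
    # Ignore codons containing ambiguous bases such as N.
    valid = {c: n for c, n in codon_counts.items() if c in CODON_TABLE}
    total = sum(valid.values())
    if total == 0:
        return 0.0
    changed = sum(n for c, n in valid.items() if CODON_TABLE[c] != reference_aa)
    return changed / total


def exceeds_threshold(reference_codon, codon_counts, threshold=0.01):
    """True if the non-synonymous fraction is greater than the threshold."""
    return nonsynonymous_fraction(reference_codon, codon_counts) > threshold


# Hypothetical read counts at one codon whose reference is GGA (glycine).
counts = {"GGA": 950, "GGG": 20, "AGA": 30, "GNA": 5}
print(nonsynonymous_fraction("GGA", counts), exceeds_threshold("GGA", counts))
```

Here GGG is synonymous with GGA, so only the AGA reads contribute to the non-synonymous fraction.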
2. Second study
[00199] 90 ng of one PCR reaction product used for AT cloning from each of the 6 liver inflammation subjects from Example 2 were combined and a further 90 ng of one PCR reaction product from each of the 6 inactive disease subjects from Example 2 were also combined in a separate tube. These two complex mixtures of PCR products were fractionated by either sonication (the 6 liver inflammation subjects) or restriction enzyme digestion (the 6 inactive disease subjects). The restriction digestion was performed with Sau96I following the manufacturer's instructions (New England Biolabs, #R0165).
[00200] End-repair, adenylation of 3' ends, adaptor ligation and amplification of the adaptor ligated libraries were performed using the NEBNext Ultra library preparation reagents for Illumina platform (NEB). Libraries were sequenced on the Illumina Next Generation Sequencing platform. Sonication resulted in between 850,000 and 1.1 million reads at each of the 32 codons of interest. In contrast, restriction enzyme fractionation resulted in low numbers of reads at 5 of the 32 codons, with a range of between 9,000 and 813,000 reads per codon. The results show that sonication was the best method for fractionation of the PCR products.
[00201] There were 43 non-synonymous mutations detected in the Sanger-sequenced clones from the sonicated amplimers. 41 of these were detected by NGS. There were 20 non-synonymous mutations detected in the Sanger-sequenced clones from the restriction enzyme-digested amplimers. 15 of these were also detected by NGS.
[00202] This example demonstrates the utility of next generation sequencing for detecting non-synonymous mutations in the S ORF of HBV in a method of the invention.
EXAMPLE 6
[00203] The purposes of this example were to:
a) determine whether the presence of non-synonymous mutations at positively-selected codons 4, 35, 51, 54, 56, 73, 81, 84, 120, 132, 135, 141, 184, 188, 192, 195, 198, 214, 219, 221, 242, 270, 275, 300, 334, 358, 363, 377, 378, 382, 387 or 391 is increased in genotype C, HBeAg-negative chronic hepatitis B virus subjects with normal serum ALT levels at the time of diagnosis of active hepatitis B virus-induced liver disease (defined as liver inflammation requiring treatment, liver cirrhosis, liver failure or liver cancer), and
b) identify the positively-selected codons that are most strongly predictive of the development of active hepatitis B virus-induced liver disease (defined as liver inflammation requiring treatment, liver cirrhosis, liver failure or liver cancer) in genotype C, HBeAg-negative chronic hepatitis B virus subjects with normal serum ALT levels.
1. Methods
[00204] 150 consecutive subjects with an HBeAg-negative chronic HBV infection of genotype C who have normal serum ALT levels and serum HBV-DNA > 2,000 IU/ml are being recruited from the outpatient department of Nanfang Hospital of Southern Medical University in Guangzhou, China. Patients are assessed for the presence of active liver inflammation requiring treatment, liver cirrhosis, liver failure or liver cancer by their physician according to current Asian guidelines (Asian-Pacific clinical practice guidelines on the management of hepatitis B: a 2015 update. Hepatology International 2015; DOI: 10.1007/s12072-015-9675-4). 47 patients were assessed for this example.
[00205] The definition of active hepatitis B virus induced liver disease for the purpose of this investigation is a subject with a chronic hepatitis B virus infection who has been deemed to have developed at least one of liver inflammation requiring treatment, liver cirrhosis, liver failure or liver cancer by his or her physician.
[00206] HBV DNA was extracted from sera with HBV-DNA> 2,000 IU/ml using the Qiagen Ultrasens viral extraction kit.
[00207] One PCR reaction that amplifies the S ORF of the hepatitis B virus from each subject was performed as described in Example 2 above. This was individually barcoded at the time of library formation using the NEBNext Multiplex Oligos Set-I. Next generation sequencing of the library from each subject, following fractionation by sonication, was carried out using the methodology described in Example 5 above.
[00208] The genotype and subgenotype of the virus from each subject was determined by aligning the consensus sequence of the S ORF derived from the next generation sequence with the reference sequences of HBV subgenotypes A1, A2, B1, B2, B3, B4, B5, C1, C2, C3A, C3B, C4, C5 and C6 described herein.
[00209] Bioinformatic analyses were carried out as described in Example 5 above. The frequency of non-synonymous mutations occurring in more than 0.10 of reads at each of S ORF codons 4, 35, 51, 54, 56, 73, 81, 84, 120, 132, 135, 141, 184, 188, 192, 195, 198, 214, 219, 221, 242, 270, 275, 300, 334, 358, 363, 377, 378, 382, 387 or 391 was calculated for each subject.
[00210] A score for each subject was calculated as the number of positively-selected codons having a frequency of non-synonymous mutations greater than 0.10.
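The NGS-based score described above can be sketched as a simple count over per-codon frequencies. The sketch assumes the fraction of reads carrying a non-synonymous mutation has already been computed for each codon; the function name, the optional `codons_of_interest` parameter and the frequency values are hypothetical.

```python
def ngs_score(codon_frequencies, codons_of_interest=None, min_frequency=0.10):
    """Count codons whose non-synonymous mutation frequency is greater than min_frequency."""
    if codons_of_interest is None:
        codons_of_interest = codon_frequencies.keys()
    return sum(
        1 for codon in codons_of_interest
        if codon_frequencies.get(codon, 0.0) > min_frequency
    )


# Hypothetical per-codon frequencies for one subject (illustrative values only).
frequencies = {4: 0.25, 35: 0.12, 51: 0.10, 120: 0.02, 132: 0.40, 195: 0.55}

score = ngs_score(frequencies)
print(score, score >= 4)  # score, and whether it is at least 4
```

A frequency of exactly 0.10 does not count, because the text requires a frequency greater than 0.10.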
[00211] The percentage of patients who have been deemed to require treatment for chronic HBV infection by their physician on the basis of their clinical assessment at the time they entered the study who have a score of greater than or equal to 4 was determined.
[00212] An analysis at 5 years will also be performed. In this analysis, ROC curves will be created for different combinations of positively selected codons to determine which combination of positively selected codons provides the highest area under the curve and thus the highest level of discrimination between subjects who have been treated for active liver disease over the preceding 5 years and patients who have continued to have inactive disease.
2. Results
[00213] All subjects with evidence of liver fibrosis or cirrhosis by fibroscan had a score of greater than or equal to 4. This indicates that the method of the invention is useful for identifying serious chronic liver disease in HBeAg-negative patients with a chronic HBV infection of genotype C who have normal ALT levels and HBV-DNA levels > 2,000 IU/ml.
[00214] The ROC analyses identify the positively-selected codons that are most strongly predictive of the development of active hepatitis B virus-induced liver disease.
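The search over combinations of positively-selected codons for the highest area under the ROC curve could be sketched as follows. This is an illustration under stated assumptions, not the analysis performed: each subject is represented by the set of codons exceeding the mutation frequency threshold, the area under the curve is computed as the probability that a subject with active disease outscores a subject with inactive disease (ties counting one half), and only panels of one fixed size are searched. All function names and the toy data are hypothetical.

```python
from itertools import combinations


def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def panel_score(mutated_codons, panel):
    """Number of panel codons that are mutated in a subject."""
    return len(mutated_codons & set(panel))


def best_panel(active, inactive, candidate_codons, panel_size):
    """Return (panel, auc) for the panel of the given size with the highest AUC."""
    best = None
    for panel in combinations(candidate_codons, panel_size):
        area = auc([panel_score(s, panel) for s in active],
                   [panel_score(s, panel) for s in inactive])
        if best is None or area > best[1]:
            best = (panel, area)
    return best


# Toy data (hypothetical): mutated codons per subject in each outcome group.
active = [{4, 35}, {4, 120}, {35, 120}]
inactive = [{120}, set(), {51}]
print(best_panel(active, inactive, (4, 35, 51, 120), panel_size=2))
```

A panel chosen this way is fitted to the cohort used to select it, so its performance is best confirmed in an independent cohort. An exhaustive search over all subsets of 32 codons would also be far larger than this fixed-size example.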
EXAMPLE 7
[00215] The purpose of this example is to assess the utility of the method of the invention for predicting the development of active, hepatitis B virus-induced liver disease utilising the optimal combination of positively selected codons identified in Example 6 in a new cohort of genotype C, HBeAg-negative chronic hepatitis B virus subjects with normal serum ALT levels.
1. Methods
[00216] Stored serum samples from 100 subjects with a chronic hepatitis B virus infection who were HBeAg negative with normal ALT levels at the time of their initial blood test are obtained from the outpatient department of Nanfang Hospital of Southern Medical University in Guangzhou, China. These patients are being assessed for the presence of active liver inflammation requiring treatment, liver cirrhosis, liver failure or liver cancer by their physician according to the current Asian guidelines described for Example 6 above. The samples are tested to identify all the subjects with an HBV-DNA level greater than 2,000 IU/ml.
[00217] The S ORF of the hepatitis B virus from all subjects with a genotype C infection is amplified, barcoded and sequenced using next generation sequencing as described in Example 6 above.
[00218] Subjects are genotyped and sub-genotyped as described above in Example 6.
[00219] Bioinformatic and ROC analyses are undertaken as described for Example 6 above when clinical information is available after 5 years follow up for each patient. A score for each subject will be calculated as the number of positively-selected codons identified in Example 6 having a frequency of non-synonymous mutations greater than 0.10.
2. Results
[00220] The results determine the lowest subject score that identifies subjects with active hepatitis B virus-induced liver disease with a false positive rate of 0%. It is recommended that any subject with a score greater than or equal to this number be started on antiviral therapy for the purpose of preventing or reversing active hepatitis B virus-induced liver disease.
[00221] The results are used to determine the lowest score in any subject who went on to develop active liver disease. It is recommended that any subject with a score lower than this number continue participation in the screening program, primarily for the purpose of early identification of liver cancers that can occur in the absence of liver cirrhosis. These subjects should not be given anti-viral therapy.
[00222] For subjects having a score or scores that occur in both subjects who develop active liver disease and subjects who develop inactive liver disease over 10 years, further investigations are recommended, e.g. liver fibroscan, liver ultrasound, liver biopsy. It is recommended that this assay be repeated in these patients in 5-10 years' time.
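The three recommendations set out in the results of this example can be sketched as a simple decision rule. The sketch assumes integer scores, takes the upper threshold as one more than the highest score seen in any subject with inactive disease (the lowest score with a false positive rate of 0%), and takes the lower threshold as the lowest score seen in any subject who developed active disease. The function names, the recommendation strings and the toy scores are hypothetical.

```python
def derive_thresholds(active_scores, inactive_scores):
    """Return (lower, upper) score thresholds from outcome-labelled scores."""
    lower = min(active_scores)        # lowest score in any subject with active disease
    upper = max(inactive_scores) + 1  # lowest integer score with no false positives
    return lower, upper


def recommend(score, lower, upper):
    """Map a subject score to one of the three recommendations in the text."""
    if score >= upper:
        return "start antiviral therapy"
    if score < lower:
        return "continue screening programme"
    return "further investigation and repeat assay"


# Toy scores (hypothetical) from subjects with known outcomes.
lower, upper = derive_thresholds([3, 5, 7, 11], [0, 1, 2, 3])
for score in (2, 3, 5):
    print(score, recommend(score, lower, upper))
```

In the toy data a score of 3 occurs in both outcome groups, so it falls in the intermediate zone where further investigation is recommended.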
EXAMPLE 8
[00223] The purpose of this example is to identify the S open reading frame codons in the genotype B hepatitis B virus that are under positive selection pressure in subjects who are HBeAg-negative and who have normal serum ALT levels. The utility of the frequency of non-synonymous mutations in these codons to predict active, hepatitis B virus-induced liver disease is assessed.
1. Methods
Stage 1
[00224] Stored serum samples from 100 subjects with a chronic hepatitis B virus infection who were HBeAg negative with normal ALT levels at the time of their initial blood test are obtained from the outpatient department of Nanfang Hospital of Southern Medical University in Guangzhou, China. These patients are being assessed for the presence of active liver inflammation requiring treatment, liver cirrhosis, liver failure or liver cancer by their physician according to the current Asian guidelines described for Example 6 above. HBV DNA is extracted from these 100 sera using the Qiagen Ultrasens viral extraction kit.
[00225] The S ORF of the hepatitis B virus is cloned and sequenced for 50 sera as in Example 2 above. 300 clones are sequenced from the 50 sera.
[00226] Phylogenetic analysis of the 300 HBV clones to identify codons under positive selection pressure is performed as described in Example 3 above.
Stage 2
[00227] The S ORF is amplified from the second 50 of these sera and subjected to next generation sequencing as described above. An ROC analysis is performed to confirm that counting non-synonymous mutations at the positively-selected codons identified in Stage 1 is predictive of the development of active, hepatitis B virus-induced liver disease, using the methods described in Example 7 above.
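An ROC analysis of the kind referred to in this paragraph can be sketched as follows. This is a generic illustration, not the analysis of Example 7: the function names `roc_points` and `auc` are hypothetical, the data are invented, and the decision rule assumed is "score greater than or equal to the threshold predicts active disease".

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float],
               labels: Sequence[int]) -> List[Tuple[float, float, float]]:
    """Return (threshold, false positive rate, true positive rate) for each
    distinct score, highest threshold first. labels: 1 = active, 0 = inactive."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both outcome groups must be present")
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / neg, tp / pos))
    return points


def auc(points: List[Tuple[float, float, float]]) -> float:
    """Trapezoidal area under the ROC curve, with (0, 0) and (1, 1) added."""
    xy = [(0.0, 0.0)] + [(f, t) for _, f, t in points] + [(1.0, 1.0)]
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(xy, xy[1:]))


# Invented subject scores and 5-year outcomes (1 = active liver disease).
scores = [0, 1, 1, 2, 4, 3, 5, 6, 8]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1]

points = roc_points(scores, labels)
print(points)
print(auc(points))
```

The thresholds whose false positive rate is zero are the candidates for a treatment cut-off; the lowest of them gives the highest sensitivity at a 0% false positive rate.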
2. Results
[00228] Stage 1 identifies codons in the genotype B HBV S ORF that are under positive selection pressure.
[00229] Stage 2 determines whether the frequency of non-synonymous mutations at the codons identified at Stage 1 predicts whether a patient will develop either active or inactive liver disease over the following 5 years.
[00230] The ROC analysis determines which patient score predicts the development of active liver disease with a false positive rate of 0% as in Example 4 above.
INDUSTRIAL APPLICATION
[00231] The present invention provides a method of determining the liver inflammation status of or susceptibility or increased risk of liver complication of CHB in patients infected with HBV genotypes A, B or C. The present invention has applicability in medical, clinical and laboratory research applications.
[00232] Where in the foregoing description reference has been made to elements or integers having known equivalents, then such equivalents are included as if they were individually set forth.
[00233] Although the invention has been described by way of example and with reference to particular embodiments, it is to be understood that modifications and/or improvements may be made without departing from the scope or spirit of the invention.

Claims

1. A method of determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the method comprising a) obtaining sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, based on a S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region; and b) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status of the HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of the non-synonymous mutations in the S ORF region.
2. A system for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the system comprising a) a measurement tool that analyses sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, b) a processor, c) a computer readable medium, and d) an analysis tool stored on the computer-readable medium that is adapted to be executed on the processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status of HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of non-synonymous mutations in the S ORF region; the analysis of step (a) or the determination of step (d) being based on a S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with.
3. A method of determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the method comprising determining the frequency of non-synonymous mutations in the S ORF region of the HBV by a method comprising a) providing sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, b) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, c) comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, and d) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status of the HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of the non-synonymous mutations in the S ORF region.
4. A system for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the system comprising a) a measurement tool that generates sequence information about at least a portion of the S ORF region of the HBV present in a sample obtained from the subject, b) a processor, c) a computer readable medium, and d) an analysis tool stored on the computer-readable medium that is adapted to be executed on the processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV by i) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, ii) comparing the sequence information generated by the measurement tool to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, iii) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status of HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of non-synonymous mutations in the S ORF region.
5. A method of claim 1 or 3 comprising amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject using an oligonucleotide primer pair to produce an amplimer and generating sequence information about the amplimer.
6. A system of claim 2 or 4 wherein the measurement tool comprises an oligonucleotide primer pair for amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to produce an amplimer.
7. A system of claim 2 or 4 or 6 wherein the measurement tool comprises a nucleotide sequencer for generating sequence information from the sample or from the amplimer.
8. A system of claim 2 or 4 wherein the measurement tool comprises an oligonucleotide primer pair for amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to produce an amplimer and a nucleotide sequencer for generating sequence information from the amplimer.
9. A method of claim 1, 3, or 5 comprising adding the sample or the amplimer to one or more nucleotide probes that hybridise to specific nucleotides in the S ORF region of HBV to generate the sequence information.
10. A system of claim 2, 4 or 6 wherein the measurement tool comprises one or more nucleotide probes that hybridise to specific nucleotides in the S ORF region of HBV to generate the sequence information from the sample or amplimer.
11. A method or system of any one of claims 5 to 10 wherein the oligonucleotide primer pair comprises a) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C or nucleotides 1500-3221 and nucleotides 1-60 when the HBV is genotype A; and b) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 100-2000 of the HBV genome.
12. A kit for determining the liver inflammation status of, or determining the genetic status of an HBV in, or identifying susceptibility to or detecting a risk of developing liver inflammation or liver complications of chronic HBV infection in a subject infected with an HBV comprising an S ORF of HBV genotype A, B or C, the kit comprising a) an oligonucleotide primer pair for amplifying at least a portion of the S ORF region of the HBV present in a sample obtained from the subject to produce an amplimer, the primer pair comprising i) a forward primer comprising a sequence of at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 1500-3215 and nucleotides 1-60 when the HBV is genotype B or C or nucleotides 1500-3221 and nucleotides 1-60 when the HBV is genotype A; and ii) a reverse primer comprising a sequence that is complementary to at least about 5 contiguous nucleotides of a portion of an HBV genome reference sequence corresponding to a subgenotype of the HBV that the subject is infected with comprising nucleotides 100-2000 of the HBV genome, b) amplification reagents, c) one or more nucleotide probes that hybridise to specific nucleotide sequences in the S ORF region of HBV for generating sequence information about at least a portion of the S ORF region of HBV present in a sample obtained from a subject, d) an analysis tool for execution on a processor to determine the frequency of non-synonymous mutations in the S ORF region of the HBV by i) selecting an S ORF reference sequence corresponding to a subgenotype of the S ORF of the HBV that the subject is infected with, ii) comparing sequence information generated by a measurement tool about at least a portion of the S ORF region present in the amplimer or the sequence information generated by the one or more nucleotide probes to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, iii) determining the frequency of non-synonymous mutations in the S ORF region based on the frequency of non-synonymous mutations at the two or more codons to determine the liver inflammation status of the subject or the genetic status of HBV or whether the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection based on the frequency of non-synonymous mutations in the S ORF region, or e) any combination of any two or more of a) to d).
13. A method, system or kit of any one of claims 1 to 12 wherein the subject is infected with an HBV comprising an S ORF of HBV genotype C.
14. A method, system or kit of any one of claims 1 to 12 wherein the subject is infected with an HBV comprising an S ORF of HBV genotype B.
15. A method, system or kit of any one of claims 1 to 14 wherein the two or more codons in the S ORF region are selected from the group comprising codons corresponding to the codons at positions 4, 35, 51, 54, 56, 73, 81, 84, 120, 132, 135, 141, 184, 188, 192, 195, 198, 214, 219, 221, 242, 270, 275, 300, 334, 358, 363, 377, 378, 382, 387 and 391 of an S ORF reference sequence of any one SEQ ID NOs: 1-14.
16. A method, system or kit of any one of claims 3 to 15 wherein selecting an S ORF reference sequence corresponding to the subgenotype of the S ORF of the HBV the subject is infected with comprises a) comparing the sequence information to one or more S ORF subgenotype reference sequences to determine the subgenotype of the S ORF of the HBV the subject is infected with, and b) selecting an S ORF reference sequence corresponding to the subgenotype of the S ORF of the HBV the subject is infected with.
17. A method, system or kit of any one of claims 3 to 15 wherein selecting an S ORF reference sequence corresponding to the subgenotype of the S ORF of the HBV the subject is infected with comprises a) providing information about the subgenotype of the S ORF of the HBV the subject is infected with, and b) selecting an S ORF reference sequence corresponding to the subgenotype of the S ORF of the HBV the subject is infected with.
18. A method, system or kit of any one of claims 1 to 17 comprising selecting the S ORF reference sequence of or wherein the S ORF reference sequence comprises a) SEQ ID NO: 1 when the subgenotype of the HBV is genotype A1; b) SEQ ID NO: 2 when the subgenotype of the HBV is genotype A2; c) SEQ ID NO: 3 when the subgenotype of the HBV is genotype B1; d) SEQ ID NO: 4 when the subgenotype of the HBV is genotype B2; e) SEQ ID NO: 5 when the subgenotype of the HBV is genotype B3; f) SEQ ID NO: 6 when the subgenotype of the HBV is genotype B4; g) SEQ ID NO: 7 when the subgenotype of the HBV is genotype B5; h) SEQ ID NO: 8 when the subgenotype of the HBV is genotype C1; i) SEQ ID NO: 9 when the subgenotype of the HBV is genotype C2; j) SEQ ID NO: 10 when the subgenotype of the HBV is genotype C3a; k) SEQ ID NO: 11 when the subgenotype of the HBV is genotype C3b; l) SEQ ID NO: 12 when the subgenotype of the HBV is genotype C4; m) SEQ ID NO: 13 when the subgenotype of the HBV is genotype C5; and n) SEQ ID NO: 14 when the subgenotype of the HBV is genotype C6.
19. A method, system or kit of any one of claims 1 to 17 comprising selecting an S ORF reference sequence corresponding to, or wherein the S ORF reference sequence corresponds to, two or more subgenotypes of HBV genotypes A, B or C.
20. A method, system or kit of any one of claims 1 to 19 wherein the liver complications of chronic HBV infection comprise liver cirrhosis, liver cancer, liver failure, liver inflammation, liver damage, liver dysfunction, or a combination of any two or more thereof.
21. A method, system or kit of any one of claims 1 to 20 wherein the method comprises determining, or the system or kit comprises an analysis tool adapted to be executed on a processor to determine, the frequency of non-synonymous mutations at 3 or more, 4 or more, 5 or more or 10 or more codons of the S ORF region of the HBV.
22. A method, system or kit of any one of claims 1 to 21 wherein the method comprises determining, or the system or kit comprises an analysis tool adapted to be executed on a processor to determine, the frequency of non-synonymous mutations at at least 2, 3, 5, 10, 15, 20 or 25 codons corresponding to the codons at positions 4, 35, 51, 54, 56, 73, 81, 84, 120, 132, 135, 141, 184, 188, 192, 195, 198, 214, 219, 221, 242, 270, 275, 300, 334, 358, 363, 377, 378, 382, 387 and 391 of an S ORF reference sequence of any one SEQ ID NOs: 1-14.
23. A method, system or kit of any one of claims 1 to 22 wherein the method comprises determining, or the system or kit comprises an analysis tool adapted to be executed on a processor to determine, the frequency of non-synonymous mutations at two or more codons within a portion of the S ORF region of HBV comprising codons corresponding to codons 1-400, codons 1-300, codons 1-200, codons 1-100, codons 100-400, codons 100-300, codons 100-200, codons 200-400, codons 200-300, or codons 300-400 of an S ORF reference sequence of any one SEQ ID NOs: 1-14.
24. A method, system or kit of any one of claims 1 to 23 wherein the method comprises providing sequence information about, the system comprises a measurement tool that generates sequence information about, or the kit comprises an oligonucleotide primer pair for amplifying a portion of the S ORF region of the HBV comprising codons corresponding to codons 1-400, 1-300, codons 1-200, codons 1-100, codons 100-400, codons 100-300, codons 100-200, codons 200-400, codons 200-300, or codons 300-400 of an S ORF reference sequence of any one SEQ ID NOs: 1-14.
25. A method, system or kit of any one of claims 1 to 24 wherein the method comprises providing sequence information about, the system comprises a measurement tool that generates sequence information about, or the kit comprises an oligonucleotide primer pair for amplifying a portion of the S ORF region of the HBV comprising at least about 10, 20, 50, 100, 200, 250, 300, 400, 500, 750, 1000 or 1200 contiguous nucleotides of the S ORF of the HBV.
26. A method, system or kit of any one of claims 11 to 25 wherein the HBV genome reference sequence is a) SEQ ID NO: 15 when the subgenotype of the HBV is genotype A1; b) SEQ ID NO: 16 when the subgenotype of the HBV is genotype A2; c) SEQ ID NO: 17 when the subgenotype of the HBV is genotype B1; d) SEQ ID NO: 18 when the subgenotype of the HBV is genotype B2; e) SEQ ID NO: 19 when the subgenotype of the HBV is genotype B3; f) SEQ ID NO: 20 when the subgenotype of the HBV is genotype B4; g) SEQ ID NO: 21 when the subgenotype of the HBV is genotype B5; h) SEQ ID NO: 22 when the subgenotype of the HBV is genotype C1; i) SEQ ID NO: 23 when the subgenotype of the HBV is genotype C2; j) SEQ ID NO: 24 when the subgenotype of the HBV is genotype C3a; k) SEQ ID NO: 25 when the subgenotype of the HBV is genotype C3b; l) SEQ ID NO: 26 when the subgenotype of the HBV is genotype C4; m) SEQ ID NO: 27 when the subgenotype of the HBV is genotype C5; or n) SEQ ID NO: 28 when the subgenotype of the HBV is genotype C6.
27. A method, system or kit of any one of claims 1 to 26 wherein the sequence information is generated by next generation sequencing.
28. A method, system or kit of any one of claims 1 to 27 wherein the frequency of non-synonymous mutations in the S ORF region of the HBV indicates a) the subject has a positive liver inflammation status, b) the HBV is subject to positive selection pressure, c) the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection, or d) a combination of any two or more of a) to c).
29. A method, system or kit of any one of claims 1 to 28 comprising a) comparing the sequence information to the S ORF reference sequence to determine the frequency of non-synonymous mutations at each of two or more codons in the S ORF region, b) assigning a numerical score for each codon based on the frequency of non-synonymous mutations at that codon, and c) combining the numerical scores for each codon to provide a combined numerical score representing the frequency of non-synonymous mutations in the S ORF region of the HBV.
30. A method, system or kit of claim 29 comprising assigning a positive numerical score for each codon having a frequency of non-synonymous mutations of at least about 5%, 10%, 15%, 20%, 25%, 30%, 40% or at least about 50%.
31. A method, system or kit of claim 30 wherein the positive score for each codon is independently 0.5, 1, 1.5, 2 or 2.5.
32. A method, system or kit of any one of claims 29 to 31 wherein a combined numerical score representing the frequency of non-synonymous mutations in the S ORF of the HBV of greater than 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 indicates that a) the subject has a positive liver inflammation status, b) the HBV is subject to positive selection pressure, c) the subject is susceptible to or has an increased risk of developing liver inflammation or liver complications of chronic HBV infection, or d) a combination of any two or more of a) to c).
33. A method, system or kit of any one of claims 1 to 32 wherein the subject a) is HBeAg negative, b) has normal serum ALT levels, c) has a serum HBV-DNA titre of greater than about 2,000 IU/ml, or d) has a combination of any two or more of a) to c).
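By way of illustration only, and without limiting the claims, the comparison of sequence information to a reference sequence to determine the frequency of non-synonymous mutations at each codon might be sketched as follows. The sketch assumes the reads are already aligned in frame to the reference, are gap-free and are the same length as the reference. The function name `nonsyn_frequency` is hypothetical, and the nine-nucleotide reference is a toy sequence, not any of SEQ ID NOs: 1-14. The standard genetic code is used.

```python
from itertools import product
from typing import Dict, List

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def nonsyn_frequency(reference: str, reads: List[str]) -> Dict[int, float]:
    """For each codon (numbered from 1), return the fraction of reads whose
    codon encodes a different amino acid from the reference codon."""
    freqs = {}
    for i in range(len(reference) // 3):
        ref_aa = CODON_TABLE[reference[3 * i:3 * i + 3]]
        covered = nonsyn = 0
        for read in reads:
            aa = CODON_TABLE.get(read[3 * i:3 * i + 3])
            if aa is None:  # skip codons with ambiguous or missing bases
                continue
            covered += 1
            if aa != ref_aa:
                nonsyn += 1
        freqs[i + 1] = nonsyn / covered if covered else 0.0
    return freqs


# Toy reference (Met-Glu-Asn) and four toy reads.
reference = "ATGGAGAAC"
reads = [
    "ATGGAGAAC",  # identical to the reference
    "ATGGAAAAC",  # GAG -> GAA, synonymous (Glu)
    "ATGGACAAC",  # GAG -> GAC, non-synonymous (Glu -> Asp)
    "ATGGAGAAA",  # AAC -> AAA, non-synonymous (Asn -> Lys)
]

print(nonsyn_frequency(reference, reads))
```

The per-codon frequencies returned by such a function could then be thresholded and summed into a combined numerical score. Handling of insertions, deletions and sequencing error is deliberately left out of this sketch.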
PCT/IB2018/060480 2017-12-21 2018-12-21 Method of analysis of mutations in the hepatitis b virus and uses thereof Ceased WO2019123398A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201880079424.8A CN111465703A (en) 2017-12-21 2018-12-21 Method for analyzing mutation of hepatitis B virus and its use

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
NZ738558 2017-12-21
NZ73855817 2017-12-21

Publications (1)

Publication Number Publication Date
WO2019123398A1 true WO2019123398A1 (en) 2019-06-27

Family

ID=66993168

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2018/060480 Ceased WO2019123398A1 (en) 2017-12-21 2018-12-21 Method of analysis of mutations in the hepatitis b virus and uses thereof

Country Status (3)

Country Link
CN (1) CN111465703A (en)
TW (1) TW201928064A (en)
WO (1) WO2019123398A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111621607A (en) * 2020-07-21 2020-09-04 中南大学湘雅二医院 Method and kit for detecting HBV genotype and/or X region mutation, CDS standard sequence of HBx, primer and application
CN113593639A (en) * 2021-08-05 2021-11-02 湖南大学 Method and system for analyzing and monitoring virus genome variation

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114250325A (en) * 2021-12-31 2022-03-29 广西壮族自治区疾病预防控制中心 Primer, probe, kit and method for rapidly detecting 9 genotypes of hepatitis B virus

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB0024442D0 (en) * 2000-10-05 2000-11-22 Isis Innovation Genetic factors affecting the outcome of viral infections
CN101586170B (en) * 2009-07-06 2011-11-30 重庆医科大学 Method and kits for detecting genotype of hepatitis B virus
EP2629096A1 (en) * 2012-02-20 2013-08-21 Roche Diagniostics GmbH HBV immunocomplexes for response prediction and therapy monitoring of chronic HBV patients

Non-Patent Citations (10)

* Cited by examiner, † Cited by third party
Title
ABBOTT, W.G.H. ET AL.: "Selection pressure on the hepatitis B virus pre-S/S and P open reading frames in Tongan subjects with a chronic hepatitis B virus infection", ANTIVIRAL RESEARCH, vol. 96, no. 2, 31 August 2012 (2012-08-31) - November 2012 (2012-11-01), pages 148 - 157, XP055621629 *
CREMER, J. ET AL.: "Genetic variation of hepatitis B surface antigen among acute and chronic hepatitis B virus infections in The Netherlands", JOURNAL OF MEDICAL VIROLOGY, vol. 90, no. 10, 24 May 2018 (2018-05-24) - October 2018 (2018-10-01), pages 1576 - 1585, XP055621626 *
GAO, S. ET AL.: "Chronic hepatitis B carriers with acute on chronic liver failure show increased HBV surface gene mutations, including immune escape variants", VIROLOGY JOURNAL, vol. 14, no. 203, 24 October 2017 (2017-10-24), pages 1 - 8, XP055621620 *
LIU, S. ET AL.: "Associations between hepatitis B virus mutations and the risk of hepatocellular carcinoma: a meta-analysis", JNCI: JOURNAL OF THE NATIONAL CANCER INSTITUTE, vol. 101, no. 15, 5 August 2009 (2009-08-05), pages 1066 - 1082, XP055621613 *
POLLICINO, T. ET AL.: "Hepatitis B virus PreS/S gene variants: pathobiology and clinical implications", J. HEPATOL, vol. 61, 2014, pages 408 - 417, XP029034906, doi:10.1016/j.jhep.2014.04.041 *
SALPINI, R.: "Hepatitis B surface antigen genetic elements critical for immune escape correlate with hepatitis B virus reactivation upon immunosuppression", HEPATOLOGY, vol. 61, no. 3, 21 November 2014 (2014-11-21) - March 2015 (2015-03-01), pages 823 - 833, XP055621600 *
WANG T. ET AL.: "Sequence analysis of the Pre-S gene in chronic asymptomatic HBV carriers with low-level HBsAg", INTERNATIONAL JOURNAL OF MOLECULAR MEDICINE, vol. 42, 17 August 2018 (2018-08-17), pages 2689 - 2699, XP055621622 *
YIN J ET AL.: "Significant association of different preS mutations with hepatitis B-related cirrhosis or hepatocellular carcinoma", J. GASTROENTEROL, vol. 45, 2010, pages 1063 - 1071, XP019851414 *
ZHANG, A.-Y. ET AL.: "Deep sequencing analysis of quasispecies in the HBV pre-S region and its association with hepatocellular carcinoma", J. GASTROENTEROL, vol. 52, 2017, pages 1064 - 1074, XP036303668, [retrieved on 20170328], doi:10.1007/s00535-017-1334-1 *
ZHANG, A.-Y. ET AL.: "Evolutionary Changes of Hepatitis B Virus Pre-S Mutations Prior to Development of Hepatocellular Carcinoma", PLOS ONE, vol. 10, no. 9, 30 September 2015 (2015-09-30), pages e0139478, XP055621595 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111621607A (en) * 2020-07-21 2020-09-04 中南大学湘雅二医院 Method and kit for detecting HBV genotype and/or X region mutation, CDS standard sequence of HBx, primer and application
CN113593639A (en) * 2021-08-05 2021-11-02 湖南大学 Method and system for analyzing and monitoring virus genome variation
CN113593639B (en) * 2021-08-05 2023-08-25 湖南大学 Method and system for analyzing and monitoring variation of viral genome

Also Published As

Publication number Publication date
TW201928064A (en) 2019-07-16
CN111465703A (en) 2020-07-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18892921

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18892921

Country of ref document: EP

Kind code of ref document: A1

WWW Wipo information: withdrawn in national office

Ref document number: 764464

Country of ref document: NZ