GENETICALLY ENGINEERED CLONE OF HEPATITIS E VIRUS (HEV) GENOME WHICH IS INFECTIOUS, ITS PRODUCTION AND USES
The present invention relates to a unique genetically engineered nucleic acid cDNA clone of hepatitis E virus (HEV) genome found to be infectious in cell culture and to a process for the in-vitro synthesis of such a cDNA clone of HEV. Also included within the scope of the invention is the expression from said clone of viral proteins adapted for use in the detection of antibodies to HEV in biological samples, and a diagnostic test kit for the presence of HEV infection essentially comprising the expressed proteins.
Historically, two types of hepatitis virus were initially identified, which were referred to as hepatitis A and hepatitis B. Other types of this virus were known but, until they were positively identified, were lumped collectively under the identity "non A - non B" (NANB) hepatitis virus. Subsequently, as a result of epidemiological studies and animal transmission, the existence of two or more viruses within the so-called entity NANB hepatitis viruses was suspected. One type of these viruses was found to be parenterally transmittable and this was identified as a flavivirus referred to as hepatitis C [Alter et al., Lancet 1, 459-463 (1978); Hollinger et al., Intervirology 10, 60-68 (1978); Tabor et al., Lancet 1, 463-466 (1978)].
In the description which follows, references to and acknowledgments of earlier research and articles are identified by means of numerals within square brackets which are cited in the bibliographical reference sheets included at the end of this description. Further advances in diagnostics and the understanding of epidemiology led to the identification of other types of virus from the group of NANB agents. One such virus was found to cause water-borne epidemics and sporadic hepatitis and was termed hepatitis E virus, conveniently abbreviated as HEV. It is HEV which forms the subject of the applicants' study and of the present invention. HEV is now established as an etiological agent of both the epidemic and sporadic forms of water-borne hepatitis, which is endemic to the Indian subcontinent and prevalent in most parts of the developing world. In fact, infection due to HEV accounts for one third of the sporadic acute viral hepatitis and almost all the described epidemics in the Indian subcontinent [26, 39]. It is a major health problem in tropical and subtropical parts of the world.
The first well-characterized HEV epidemic was reported in Delhi, India in 1955. Several other epidemics have since been described in developing countries on nearly ... viral infection which is enterically transferred affects millions of persons in the areas mentioned and is particularly associated with acute liver failure and high mortality, particularly in the instance of pregnant women.
According to the eighth report of the International Committee of Taxonomy of Viruses, HEV has been provisionally classified as the prototype member of the group of hepatitis-E viruses. HEV has been found to have a positive strand polyadenylated ribonucleic acid (RNA) as genome with a size of ~7.2 kb and possessing three open reading frames identified as ORF 1, ORF 2 and ORF 3.
ORF 1 encodes the putative non-structural polyprotein including domains representative of (a) a viral methyltransferase, (b) a papain-like cysteine protease, (c) an RNA helicase, and (d) a viral RNA dependent RNA polymerase. ORF 2 encodes an ~88 kDa glycoprotein that is the major viral capsid protein. ORF 3 encodes a 13.5 kDa phosphoprotein of unknown function.
The coding sequences in the HEV genome are flanked by 5' and 3' non-coding regions (NCRs) which are respectively 27 and 68 nucleotides long. These NCRs play an important role as cis-acting elements for the genomic RNA replication of HEV, as has been reported for other positive strand RNA viruses.
Of the three open reading frames, ORF1 is the largest in size, starts 27 nucleotides downstream of the 5' end and terminates at position 5079, resulting in a protein of 1683 amino acids. ORF 2 begins 37 nucleotides downstream of ORF1 and codes for a protein of 660 amino acids. The third ORF (ORF3) is the smallest in size, encoding a protein of 123 amino acids, and overlaps with ORF1 at its terminal base [55]. Non-structural ORF 1 is believed to code for a putative polyprotein which has different motifs, such as those for a viral methyltransferase, a papain-like cysteine protease, an RNA helicase and an RNA dependent RNA polymerase [28]. However, none of these putative functional regions has been characterized so far.
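The figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch only: it assumes that position 5079 is the last nucleotide of the ORF1 stop codon and that "37 nucleotides downstream" means 37 intervening nucleotides. The constant and function names are illustrative and are not taken from the specification, and the coordinates are those stated in the text rather than values recomputed from the sequence listing.

```python
# Coordinates as quoted in the description (not recomputed from the listing).
FIVE_PRIME_NCR_NT = 27      # length of the 5' non-coding region
ORF1_LAST_NT = 5079         # assumed: last nucleotide of ORF1, stop codon included
ORF1_TO_ORF2_GAP_NT = 37    # assumed: nucleotides lying between ORF1 and ORF2


def orf_length_aa(first_nt: int, last_nt: int) -> int:
    """Residues encoded by an ORF spanning first_nt..last_nt.

    Positions are 1-based and inclusive; the span is assumed to end
    with a stop codon, which is not counted as a residue.
    """
    span = last_nt - first_nt + 1
    if span % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    return span // 3 - 1


orf1_first_nt = FIVE_PRIME_NCR_NT + 1                    # first coding nucleotide
orf1_aa = orf_length_aa(orf1_first_nt, ORF1_LAST_NT)     # 1683
orf2_first_nt = ORF1_LAST_NT + ORF1_TO_ORF2_GAP_NT + 1   # 5117

print(orf1_aa, orf2_first_nt)
```

Under these assumptions the stated ORF1 coordinates are consistent with the stated length of 1683 amino acids.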
The structural ORFs have been expressed in prokaryotic and eukaryotic systems and the immunogenicity of the resulting proteins has been reported in a number of studies [20, 34, 40]. The inventors have earlier expressed ORF2 and ORF3 in animal cells. The ORF2 protein (pORF2) is an 88 kDa glycoprotein that is expressed intracellularly as well as on the cell surface [23]. The ORF3 protein (pORF3) is a 13.5 kDa phosphoprotein, which is phosphorylated by the cellular mitogen activated protein kinase and associates with the cytoskeleton [64].
The complete non-structural ORF1 and its putative functional domains have been similarly reconstructed and expressed in prokaryotic and eukaryotic systems [2]. These three ORFs originating from a single viral isolate were joined in proper orientation to produce a full-length cDNA clone of HEV, which was used to generate full-length HEV RNA. Such in-vitro generated RNA was shown to be infectious in the tissue culture system. This is similar to observations with the other positive stranded RNA viruses [1, 46, 48, 53].
The hepatitis E virus has been cloned and sequenced from several countries, among them India, Pakistan, Myanmar, China and Mexico. However, only partial expression of the viral structural proteins has been described so far by other workers.
Some of these fusion products have been used in attempts to produce diagnostic test kits and vaccines.
On the other hand, the replication mechanism of this virus has until now been virtually unexplored and, in the absence of a reliable in-vitro culture system, the replication and transcription strategy of HEV has been very poorly understood. The in-vitro propagation and production of HEV from the primary hepatocytes of experimentally infected cynomolgus macaques has been reported [54]. Unfortunately, this was of only limited utility due to difficulties associated with animal experimentation.
Nevertheless, several cDNA clones of other positive-strand RNA viruses have been shown to be infectious in cell culture and experimental animal systems [13, 46].
In the absence of an appropriate tissue culture system for HEV so far, the best approach to study HEV replication has been to identify the cis-acting elements in the RNA and the viral and cellular proteins that form the replication complex. Such an approach has been based on the projection that information regarding the replication control may help to design strategies for intervention and that computer analysis could help localize putative structured RNA interacting domains within the viral proteins. In fact, a computer assisted comparison of the polymerases of positive strand RNA viruses has revealed that the putative polymerases of HEV, beet necrotic yellow vein virus (BNYVV) and rubella virus (RubV) form a distinct tight group within supergroup III of viral polymerases. A detailed analysis of the RdRp structure based on available sequences and the poliovirus 3Dpol protein has been reviewed.
In researching the present invention, the inventors have focussed on two major areas:
a. Developing a complete virus genome clone which can be used to infect cells and thereby provide a valuable in-vitro system for designing and analysing the effects of drugs and other inhibitors and vaccines which can help in the prevention or cure of HEV associated liver disease and/or severe complications arising therefrom; and
b. Studying the potential of the 3' NCR of HEV in serving as the replication initiation site for the HEV RNA genome.
In their attempts to develop a complete virus genome clone which is infectious, the inventors have considered and analyzed the potential of a full-length in-vitro synthesized HEV RNA from a cDNA clone of an Indian isolate of HEV. This has involved the experimental replication and expression of HEV from such RNA in cultured hepatoma cells, as well as the production of infectious virions ascertained through experimental infection of a rhesus macaque.
In its broadest novelty, therefore, the present invention provides a unique genetically engineered nucleic acid clone of hepatitis E virus (HEV) genome which comprises the complete structural and non-structural genes, i.e. the complete coding sequence of HEV flanked by the 5' and 3' non-coding regions, along with a 3' polyadenylated tail to the 3' non-coding region, said clone being capable of infecting cultured liver cells and replicating, transcribing and assembling the HEV therein. The nucleic acid clone of the invention comprises a DNA which is complementary to the hepatitis E virus genome. Conveniently, this clone is derived using multiple PCR amplification followed by an assembly strategy.
It has been ascertained that the complete coding sequence (Seq. ID No. 1) of the nucleic acid clone of HEV genome of the present invention is as follows:
SEQ ID No. 1: atg gaggcccatc agtttctcaa ggctcccggc atcactactg ctgttgagca ggctgctcta gccacggcca actctgccct ggcgaatgct gtggtagtta ggccttttct ttctcaccag cagattgaga ttctcattaa cctaatgcaa cctcgccagc ttgttttccg ccccgaggtt ttctggaatc aacccatcca gcgtgtcatt cataacgagc tggagcttta ctgccgcgct cgctccggcc gctgtcttga aattggcgcc catccccgct caataaatga taatcctaat gtggtccacc gctgcttcct ccgccctgtt gggcgtgatg ttcagcgctg gtatactgct cccactcgcg ggccggctgc taattgccgc cgttccgcgt tgcgtgggct tcccgctgct gaccgcacat actgcttcga cgggttttct
ggctgtagct gccccgccga gacgggtatc gccctttact ccctccatga tatgtcacca tctgatgttg ccgaggccat gttccgccat ggtatgacgc ggctttatgc tgccctccat cttccgcctg aggtcttgct gccccctggc acatatcgca ccgcatcgta tttgctgatt catgacggca ggcgcgttgt ggtgacgtat gagggtgata ctagtgctgg ttacaaccac gatgtctcca acttgcgctc ctggattaga accaccaagg ttaccggaga ccatcccctc gttatcgagc gggttagggc cattggctgc cactttgttc tcttgctcac ggcagccccg gagccatcac ctatgcctta tgttccttac ccccggtcta ccgaggtcta tgtccgatcg atcttcggcc cgggtggtac cccttcctta ttcccaacct catgctccac taagtcgacc ttccacgctg tccctgccca tatttgggac cgtcttatgc tgttcggggc caccttggat gaccaagcct tttgctgctc ccgtttaatg acctaccttc gcggcattag ctacaaggtc actgttggta cccttgtggc taatgaaggc cggaacgcct ctgaggacgc cctcacagct gtcatcactg ccgcctatct taccatttgc caccagcggt atctccgcac ccaggctata tccaagggga tccgtcgtct tgaacgggag catgaccaga agtttataac acgcctttac agctggctct tcgagaagtc cggccgtgat tacatccctg gccgtcagtt ggagttctac gcccagtgta ggcgctggct ttcggccggc tttcatcttg atccacgtgt actggttttt gacgagtcgg ccccctgcca ttgtaggact gtgatccgca aggcgctctc gaagttttgc tgctttatga agtggcttgg tcaggagtgc acctgttttc ttcaacctgc agaaggcgtc gtcggcgacc agggtcatga taacgaatcc tatgaggggt ccgatgttga ccctgctgag tccgccatta gtgacatctc tgggtcctat gtcgtccctg gcacagccct ccaaccgctc taccaggccc tcgatctccc cgatgagatt gtggctcgcg cgtgccggct gaccgccaca gtaaaggtct cccaggtcga tgggcggatc gattgcgaga cccttcttgg taacaaaacc ttccgcacgt cgtttgtcga cggggcggtc ttagagacca atggcccaga gcgccacaat ctctcctttg atgccagtca gagcactatg gccgctggcc ctttcagtct cacctatgcc gcctctgcag ctgggctgga ggtgcgctat gttggtgccg ggcttgacca tcgggcgatt tttgcccccg gtgtttcacc ccggtcaaac cccggcgagg tcaccgcctt ctgctctgcc ctatataggt tcaaccgtga ggcccagcgc cattcgctga ccggtaactt atggttccat cctgaggggc ttattggcct ctttgccccg ttttcgcctg ggcatgtctg ggagtcggct aaaccattct gtggcgagag cacactttac acccgtactt ggtcggaggt tgatgccgtc tctagtccaa cccggcccga tttgggtttt atgtctgagc ctcctatacc tagtagggcc gccacgccta ccttggcggc ccctctaccc ccccttgcac cggacccttc ccctccttct tctgccccgg 
cgctcgatga gccggcttct gccgctacct ccggggtccc ggccataacc caccagacgg cccggcaccg ccgcctgctc ttcacctacc cggatggctc taaggtattc gccggctcgc tgttcgagtc gacatgcacg tggctcgtta acgcgtctaa tgttgaccac tgccctggcg gcgggctttg ccatgcattt taccaaaggt accccgcctc ctttgatgct
gcctgttttg tgatgcgcga cggtgcggcc gcgtacacac tgaccccccg gccaataatt catcgtgtcg cccctgatta taggttggaa cataacccaa agaggcttga ggctgcttat cgggagactt gttcccgcct cggtaccgct gcatacccgc tcctcgggac cggcatatac caggtgccga tcggtcccag ctttgacgcc tgggagcgga accaccgccc cggggatgag ttgtaccttc ctgaacttgc tgccagatgg tttgaggcca ataggccgac ccgcccaact ctcactataa ctgaggatgc tgcacggaca gcgaatctgg ccatcgagct tgactcagcc acagatgtcg gccgggcctg tgctggctgt cgggttaccc ctggcgttgt tcaataccag tttaccgcag gtgtgcctgg atccggcaag tcccgctcca tcacccgagc cgatgtggac gttgtcgtgg tcccgacgcg cgagttgcgt aatgcctggc gccgtcgcgg ctttgctgcc ttcaccccgc acactgccgc tagagtcacc gacgggcgcc gggttgtcat tgatgaggct ccatccctcc cccctcacct gttgctgctc cacatgcagc gggccgccac cgtccacctt cttggcgacc cgaatcagat cccagccatc gactttgagc accctgggct cgtccccgcc atcaggcccg acttagcccc tacctcctgg tggcatgtta cccatcgctg ccctgcggat gtatgtgagt tgatccgtgg tgcatacccc atgatccaga ccactagccg ggttctccgt tcgttgttct ggggtgagcc tgccgtcggg cagaaactag tgttcaccca ggcggccaag cccgccaacc ccggctcagt gacggtccac gattcgcagg gcgctaccta cacttatacc actattattg ccacagcaga tgcccggggc cttattcagt cgtctcgggc tcatgccatt gttgctctga cgcgccacac tgagaagtgg gtcatcattg acgcaccagg cctgcttcgc gaggtgggca tctccgatgc aatcgttaat aactttttcc tcgctggtgg cgaaattggt catcagcgcc catctgttat tccccgtggc aaccctgacg ccaatgttga caccctggct gccttcccac cgtcttgcca gattagtgcc ttccatcagt tagctgagga gctcggccac agacctgccc ctgttgcagc tgttctacca ccctgccccg agctcgaaca gggcctcctc tatctgcccc aggagctcac cacctgtgat agtgtcgtaa catttgaatt aacagatatt gtgcactgcc gcatggccgc cccgagccag cgcaaggccg tagtgtccac actcgtgggc cgctacggcc gtcgcacaaa gctctacaat gcttcccact ctgatgttcg cgactctctc gcccgtttta tccctgccat tggccccgta caggtcacaa cctgtgaatt gtacgagtta gtggaggcca tggtcgagaa gggccaggat ggctccgccg tccttgagct cgatctttgc aaccgtgatg tgtccaggat caccttcttc cagaaagatt gtaacaagtt caccacaggt gagaccattg ctcatggtaa agtgggccag ggcatctcgg cctggagcaa gaccttctgc gccctctttg gcccttggtt tcgcgccatt gagaaggcta ttctggcctt gctccctcag ggtgtgtttt 
acggtgatgc ctttgatgac accgtcttct cagcggctgt ggccgcagca aaggcatcca tggtgtttga gaatgacttt tctgagtttg actccaccca gaataacttt tctttgggtc tagagtgtgc tattatggag gagtgcggga tgccgcaggg gctcatccgc ttgtatcacc ttataaggtc tgcgtggatc ctgcaggccc cgaaggagtc tctgctaggg
ttttggaaga aacactccgg cgagcccggc actcttctat ggaatactgt ctggaatatg gctgttatta cccactgtta tgacttccgc gatttgcagg tagctgcctt taaaggtgat gattcgatag tgctttgcag tgagtatcgt cagagtccag gagctgctgt cctgatcgcc ggctgtggct tgaagttgaa ggtagatttc cgcccgatgc gtttgtatgc aggtgttgtg gtggcccccg gccttggcgc gcttcctgat gtcgtgcgct tcgccggccg gcttaccgag aagaattggg gccctggccc tgagcgggcg gacgagctcc gcatcgctgt tagtgacttc ctccgcaagc tcacgaatgt ggctcagatg tgtgtggatg ttgtttcccg tgtttatggg gtttcccctg ggctcgttca taacctgatt ggcatgctac aggctgttgc tgatggcaag gcacatttca ctgagtcagt aaaaccagtg ctcgacctga caaattcaat cttgtgtcgg gtggaatgaa taacatgtct tttgctgcgc ccatgggttc gcgaccatgg gccctcggcc tattttgttg ctgttcctca tgtttctgcc tatgctgctc gcgccaccgc ccggtcagcc gtctggccgc cgtcgtgggc ggcgcagcgg cggttccggc ggtggtttct ggggtgaccg ggttgattct cagcccttcg caatccccta tattcatcca accaacccct tcgccccgaa tgtcaccgct gcggccgggg ctggacctcg tgttcgccaa cccgtccgac cactcggctc cgcttggcgc gaccaggccc agcgccccgc cgctgcctca cgtcgtagac ctaccacagc tggggccgcg ccgctaaccg cggtcgctcc ggcccatgac accccgccag tgcctgatgt cgactcccgc ggcgccatct tgcgccggca gtacaaccta tcaacatctc cccttacctc ttccgtggcc accggtacta acctggttct ttatgccgcc cctcttagtc cgcttttacc ccttcaggac ggtaccaata ctcatataat ggccacggaa gcttctaatt atgcccagta ccgggttgcc cgtgccacga tccgttaccg cccgctggtc cccaacgctg tcggcggtta cgccatctcc atctcattct ggccacagac tacccccacc ccgacgtccg ttgatatgaa ttcaataacc tcgacggatg ttcgcatttt agtccagccc ggcatagcct ctgagcttgt gatcccaagc gagcgcctac actatcgtaa ccaaggttgg cgctctgtcg agacctccgg ggtggccgag gaggaggcca cctccggtct tgttatgctt tgcatacatg gctcacccgt aaattcctat actaatacac cctataccgg tgcccttggg ctgctggact ttgcccttga gcttgagttt cgcaacctta cccccggtaa tactaatacg cgggtctccc gttattccag cactgctcgc caccgccttc gtcgcggtgc ggacgggact gccgagctca ccaccacggc tgctacccgc tttatgaagg acctctattt tactagtact aatggtgttg gtgagatcgg ccgcgggata gccctcaccc tgttcaacct tgctgacact ttgcttggcg gcctgccgac agaattgatt tcgtcggctg gtggccagct gttctactcc cgtcccgttg tctcagccaa tggcgagccg 
actgttaagc tgtatacatc tgtagagaat gctcagcagg ataagggtat tgcaatcccg aatgatattg acctcggaga atctcgtgtg gttatccagg attatgataa ccaacatgaa caagaccggc cgacgccttc tccagccccg tcgcgccctt tctctgttct tcgagctaat gatgtgcttt ggctctctct caccgctgcc gagtatgacc agtccaccta
tggctcttcg actggcccag tttatgtttc tgactctgtg accttggtta atgttgccac cggcgcgcag gccgttgccc ggtcgctcga ttggaccaag gtcacacttg acggtcgccc tctctccacc atccagcagt attcgaagat cttctttgtc ctgccgctcc gcgggaagct ctctttctgg gaggcaggca caactaggcc cgggtaccct tataattaca acaccactgc aagcgaccaa ctgcttgtcg agaatgccgc cgggcaccgg gttgctattt ccacttacac cactagcctg ggtgctggcc ccgtctctat ttctgcggtt gccgtcttag gcccccactc tgcgctagca ttgcttgagg atactttgga ttaccctgcc cgcgcccata cttttgatga cttctgccca gagtgccgcc cccttggcct ccagggctgc gctttccagt ctactgtcgc tgagctccag cgccttaaga tgaaggtggg taaaactcgg gagtta.
As stated, the coding sequence in the HEV genome is flanked by 5' and 3' non-coding regions (NCRs) which are 27 and 68 nucleotides long, respectively [55]. These non-coding regions are projected as having the potential to play an important role as cis-acting elements for the genomic RNA replication of HEV, as has been reported for other positive strand RNA viruses [21, 30, 49, 52, 56, 62].
The 5' non-coding region of the nucleic acid clone of HEV genome of the present invention is homologous or complementary to an RNA sequence (Seq. ID No. 2) selected from:
SEQ ID No. 2: aggcagaccacatatgtggtcgatgcc
The 3' non-coding region of the nucleic acid clone of HEV genome of the present invention is homologous or complementary to an RNA sequence (Seq. ID No. 3) selected from:
SEQ ID No. 3: tagtttatttgcttgtgccccccttctctctgttgcttatttctcatttctgcgttccgcgctccctg
When the coding and non-coding sequences are taken together, the complete sequence (Seq. ID No. 4) of the nucleic acid clone of HEV genome of the present invention comprises:
SEQ ID No. 4: aggcagacca catatgtggt cgatgccatg gaggcccatc agtttctcaa ggctcccggc 60 atcactactg ctgttgagca ggctgctcta gccacggcca actctgccct ggcgaatgct 120 gtggtagtta ggccttttct ttctcaccag cagattgaga ttctcattaa cctaatgcaa 180
cctcgccagc ttgttttccg ccccgaggtt ttctggaatc aacccatcca gcgtgtcatt 240 cataacgagc tggagcttta ctgccgcgct cgctccggcc gctgtcttga aattggcgcc 300 catccccgct caataaatga taatcctaat gtggtccacc gctgcttcct ccgccctgtt 360 gggcgtgatg ttcagcgctg gtatactgct cccactcgcg ggccggctgc taattgccgc 420 cgttccgcgt tgcgtgggct tcccgctgct gaccgcacat actgcttcga cgggttttct 480 ggctgtagct gccccgccga gacgggtatc gccctttact ccctccatga tatgtcacca 540 tctgatgttg ccgaggccat gttccgccat ggtatgacgc ggctttatgc tgccctccat 600 cttccgcctg aggtcttgct gccccctggc acatatcgca ccgcatcgta tttgctgatt 660 catgacggca ggcgcgttgt ggtgacgtat gagggtgata ctagtgctgg ttacaaccac 720 gatgtctcca acttgcgctc ctggattaga accaccaagg ttaccggaga ccatcccctc 780 gttatcgagc gggttagggc cattggctgc cactttgttc tcttgctcac ggcagccccg 840 gagccatcac ctatgcctta tgttccttac ccccggtcta ccgaggtcta tgtccgatcg 900 atcttcggcc cgggtggtac cccttcctta ttcccaacct catgctccac taagtcgacc 960 ttccacgctg tccctgccca tatttgggac cgtcttatgc tgttcggggc caccttggat 1020 gaccaagcct tttgctgctc ccgtttaatg acctaccttc gcggcattag ctacaaggtc 1080 actgttggta cccttgtggc taatgaaggc cggaacgcct ctgaggacgc cctcacagct 1 140 gtcatcactg ccgcctatct taccatttgc caccagcggt atctccgcac ccaggctata 1200 tccaagggga tccgtcgtct tgaacgggag catgaccaga agtttataac acgcctttac 1260 agctggctct tcgagaagtc cggccgtgat tacatccctg gccgtcagtt ggagttctac 1320 gcccagtgta ggcgctggct ttcggccggc tttcatcttg atccacgtgt actggttttt 1380 gacgagtcgg ccccctgcca ttgtaggact gtgatccgca aggcgctctc gaagttttgc 1440 tgctttatga agtggcttgg tcaggagtgc acctgttttc ttcaacctgc agaaggcgtc 1500 gtcggcgacc agggtcatga taacgaatcc tatgaggggt ccgatgttga ccctgctgag 1560 tccgccatta gtgacatctc tgggtcctat gtcgtccctg gcacagccct ccaaccgctc 1620 taccaggccc tcgatctccc cgatgagatt gtggctcgcg cgtgccggct gaccgccaca 1680 gtaaaggtct cccaggtcga tgggcggatc gattgcgaga cccttcttgg taacaaaacc 1740 ttccgcacgt cgtttgtcga cggggcggtc ttagagacca atggcccaga gcgccacaat 1800 ctctcctttg atgccagtca gagcactatg gccgctggcc ctttcagtct cacctatgcc 1860 gcctctgcag ctgggctgga 
ggtgcgctat gttggtgccg ggcttgacca tcgggcgatt 1920 tttgcccccg gtgtttcacc ccggtcaaac cccggcgagg tcaccgcctt ctgctctgcc 1980 ctatataggt tcaaccgtga ggcccagcgc cattcgctga ccggtaactt atggttccat 2040 cctgaggggc ttattggcct ctttgccccg ttttcgcctg ggcatgtctg ggagtcggct 2100 aaaccattct gtggcgagag cacactttac acccgtactt ggtcggaggt tgatgccgtc 2160 tctagtccaa cccggcccga tttgggtttt atgtctgagc ctcctatacc tagtagggcc 2220
gccacgccta ccttggcggc ccctctaccc ccccttgcac cggacccttc ccctccttct 2280 tctgccccgg cgctcgatga gccggcttct gccgctacct ccggggtccc ggccataacc 2340 caccagacgg cccggcaccg ccgcctgctc ttcacctacc cggatggctc taaggtattc 2400 gccggctcgc tgttcgagtc gacatgcacg tggctcgtta acgcgtctaa tgttgaccac 2460 tgccctggcg gcgggctttg ccatgcattt taccaaaggt accccgcctc ctttgatgct 2520 gcctgttttg tgatgcgcga cggtgcggcc gcgtacacac tgaccccccg gccaataatt 2580 catcgtgtcg cccctgatta taggttggaa cataacccaa agaggcttga ggctgcttat 2640 cgggagactt gttcccgcct cggtaccgct gcatacccgc tcctcgggac cggcatatac 2700 caggtgccga tcggtcccag ctttgacgcc tgggagcgga accaccgccc cggggatgag 2760 ttgtaccttc ctgaacttgc tgccagatgg tttgaggcca ataggccgac ccgcccaact 2820 ctcactataa ctgaggatgc tgcacggaca gcgaatctgg ccatcgagct tgactcagcc 2880 acagatgtcg gccgggcctg tgctggctgt cgggttaccc ctggcgttgt tcaataccag 2940 tttaccgcag gtgtgcctgg atccggcaag tcccgctcca tcacccgagc cgatgtggac 3000 gttgtcgtgg tcccgacgcg cgagttgcgt aatgcctggc gccgtcgcgg ctttgctgcc 3060 ttcaccccgc acactgccgc tagagtcacc gacgggcgcc gggttgtcat tgatgaggct 3120 ccatccctcc cccctcacct gttgctgctc cacatgcagc gggccgccac cgtccacctt 3180 cttggcgacc cgaatcagat cccagccatc gactttgagc accctgggct cgtccccgcc 3240 atcaggcccg acttagcccc tacctcctgg tggcatgtta cccatcgctg ccctgcggat 3300 gtatgtgagt tgatccgtgg tgcatacccc atgatccaga ccactagccg ggttctccgt 3360 tcgttgttct ggggtgagcc tgccgtcggg cagaaactag tgttcaccca ggcggccaag 3420 cccgccaacc ccggctcagt gacggtccac gattcgcagg gcgctaccta cacttatacc 3480 actattattg ccacagcaga tgcccggggc cttattcagt cgtctcgggc tcatgccatt 3540 gttgctctga cgcgccacac tgagaagtgg gtcatcattg acgcaccagg cctgcttcgc 3600 gaggtgggca tctccgatgc aatcgttaat aactttttcc tcgctggtgg cgaaattggt 3660 catcagcgcc catctgttat tccccgtggc aaccctgacg ccaatgttga caccctggct 3720 gccttcccac cgtcttgcca gattagtgcc ttccatcagt tagctgagga gctcggccac 3780 agacctgccc ctgttgcagc tgttctacca ccctgccccg agctcgaaca gggcctcctc 3840 tatctgcccc aggagctcac cacctgtgat agtgtcgtaa catttgaatt aacagatatt 3900 gtgcactgcc 
gcatggccgc cccgagccag cgcaaggccg tagtgtccac actcgtgggc 3960 cgctacggcc gtcgcacaaa gctctacaat gcttcccact ctgatgttcg cgactctctc 4020 gcccgtttta tccctgccat tggccccgta caggtcacaa cctgtgaatt gtacgagtta 4080 gtggaggcca tggtcgagaa gggccaggat ggctccgccg tccttgagct cgatctttgc 4140 aaccgtgatg tgtccaggat caccttcttc cagaaagatt gtaacaagtt caccacaggt 4200 gagaccattg ctcatggtaa agtgggccag ggcatctcgg cctggagcaa gaccttctgc 4260
gccctctttg gcccttggtt tcgcgccatt gagaaggcta ttctggcctt gctccctcag 4320 ggtgtgtttt acggtgatgc ctttgatgac accgtcttct cagcggctgt ggccgcagca 4380 aaggcatcca tggtgtttga gaatgacttt tctgagtttg actccaccca gaataacttt 4440 tctttgggtc tagagtgtgc tattatggag gagtgcggga tgccgcaggg gctcatccgc 4500 ttgtatcacc ttataaggtc tgcgtggatc ctgcaggccc cgaaggagtc tctgctaggg 4560 ttttggaaga aacactccgg cgagcccggc actcttctat ggaatactgt ctggaatatg 4620 gctgttatta cccactgtta tgacttccgc gatttgcagg tagctgcctt taaaggtgat 4680 gattcgatag tgctttgcag tgagtatcgt cagagtccag gagctgctgt cctgatcgcc 4740 ggctgtggct tgaagttgaa ggtagatttc cgcccgatgc gtttgtatgc aggtgttgtg 4800 gtggcccccg gccttggcgc gcttcctgat gtcgtgcgct tcgccggccg gcttaccgag 4860 aagaattggg gccctggccc tgagcgggcg gacgagctcc gcatcgctgt tagtgacttc 4920 ctccgcaagc tcacgaatgt ggctcagatg tgtgtggatg ttgtttcccg tgtttatggg 4980 gtttcccctg ggctcgttca taacctgatt ggcatgctac aggctgttgc tgatggcaag 5040 gcacatttca ctgagtcagt aaaaccagtg ctcgacctga caaattcaat cttgtgtcgg 5100 gtggaatgaa taacatgtct tttgctgcgc ccatgggttc gcgaccatgg gccctcggcc 5160 tattttgttg ctgttcctca tgtttctgcc tatgctgctc gcgccaccgc ccggtcagcc 5220 gtctggccgc cgtcgtgggc ggcgcagcgg cggttccggc ggtggtttct ggggtgaccg 5280 ggttgattct cagcccttcg caatccccta tattcatcca accaacccct tcgccccgaa 5340 tgtcaccgct gcggccgggg ctggacctcg tgttcgccaa cccgtccgac cactcggctc 5400 cgcttggcgc gaccaggccc agcgccccgc cgctgcctca cgtcgtagac ctaccacagc 5460 tggggccgcg ccgctaaccg cggtcgctcc ggcccatgac accccgccag tgcctgatgt 5520 cgactcccgc ggcgccatct tgcgccggca gtacaaccta tcaacatctc cccttacctc 5580 ttccgtggcc accggtacta acctggttct ttatgccgcc cctcttagtc cgcttttacc 5640 ccttcaggac ggtaccaata ctcatataat ggccacggaa gcttctaatt atgcccagta 5700 ccgggttgcc cgtgccacga tccgttaccg cccgctggtc cccaacgctg tcggcggtta 5760 cgccatctcc atctcattct ggccacagac tacccccacc ccgacgtccg ttgatatgaa 5820 ttcaataacc tcgacggatg ttcgcatttt agtccagccc ggcatagcct ctgagcttgt 5880 gatcccaagc gagcgcctac actatcgtaa ccaaggttgg cgctctgtcg agacctccgg 5940 ggtggccgag 
gaggaggcca cctccggtct tgttatgctt tgcatacatg gctcacccgt 6000 aaattcctat actaatacac cctataccgg tgcccttggg ctgctggact ttgcccttga 6060 gcttgagttt cgcaacctta cccccggtaa tactaatacg cgggtctccc gttattccag 6120 cactgctcgc caccgccttc gtcgcggtgc ggacgggact gccgagctca ccaccacggc 6180 tgctacccgc tttatgaagg acctctattt tactagtact aatggtgttg gtgagatcgg 6240 ccgcgggata gccctcaccc tgttcaacct tgctgacact ttgcttggcg gcctgccgac 6300
agaattgatt tcgtcggctg gtggccagct gttctactcc cgtcccgttg tctcagccaa 6360 tggcgagccg actgttaagc tgtatacatc tgtagagaat gctcagcagg ataagggtat 6420 tgcaatcccg aatgatattg acctcggaga atctcgtgtg gttatccagg attatgataa 6480 ccaacatgaa caagaccggc cgacgccttc tccagccccg tcgcgccctt tctctgttct 6540 tcgagctaat gatgtgcttt ggctctctct caccgctgcc gagtatgacc agtccaccta 6600 tggctcttcg actggcccag tttatgtttc tgactctgtg accttggtta atgttgccac 6660 cggcgcgcag gccgttgccc ggtcgctcga ttggaccaag gtcacacttg acggtcgccc 6720 tctctccacc atccagcagt attcgaagat cttctttgtc ctgccgctcc gcgggaagct 6780 ctctttctgg gaggcaggca caactaggcc cgggtaccct tataattaca acaccactgc 6840 aagcgaccaa ctgcttgtcg agaatgccgc cgggcaccgg gttgctattt ccacttacac 6900 cactagcctg ggtgctggcc ccgtctctat ttctgcggtt gccgtcttag gcccccactc 6960 tgcgctagca ttgcttgagg atactttgga ttaccctgcc cgcgcccata cttttgatga 7020 cttctgccca gagtgccgcc cccttggcct ccagggctgc gctttccagt ctactgtcgc 7080 tgagctccag cgccttaaga tgaaggtggg taaaactcgg gagttatagt ttatttgctt 7140 gtgcccccct tctctctgtt gcttatttct catttctgcg ttccgcgctc cctg 7194
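The relationship between SEQ ID Nos. 2, 3 and 4 can be checked mechanically. The sketch below is illustrative only: it uses the first printed row and the last two printed rows of SEQ ID No. 4 rather than the whole listing, and the variable and function names are assumptions introduced for the example.

```python
def strip_layout(printed: str) -> str:
    """Remove the blanks that the printed listing inserts every ten bases."""
    return "".join(printed.split())


# SEQ ID No. 2 and SEQ ID No. 3 as printed in the description.
NCR5 = "aggcagaccacatatgtggtcgatgcc"
NCR3 = "tagtttatttgcttgtgccccccttctctctgttgcttatttctcatttctgcgttccgcgctccctg"

# First row (positions 1-60) and last two rows (7081-7194) of SEQ ID No. 4.
head = strip_layout(
    "aggcagacca catatgtggt cgatgccatg gaggcccatc agtttctcaa ggctcccggc"
)
tail = strip_layout(
    "tgagctccag cgccttaaga tgaaggtggg taaaactcgg gagttatagt ttatttgctt "
    "gtgcccccct tctctctgtt gcttatttct catttctgcg ttccgcgctc cctg"
)

FULL_LENGTH_NT = 7194  # last position number printed in SEQ ID No. 4

flanks_match = head.startswith(NCR5) and tail.endswith(NCR3)
coding_length_nt = FULL_LENGTH_NT - len(NCR5) - len(NCR3)

print(len(NCR5), len(NCR3), flanks_match, coding_length_nt)
```

On the printed data the two flanking sequences are 27 and 68 nucleotides long and sit at the extreme ends of SEQ ID No. 4, leaving 7099 nucleotides between them.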
The non-structural gene (ORF1) of the nucleic acid clone of HEV genome codes for the polyprotein having the following amino acid sequence Seq. ID No. 5:
SEQ ID No. 5:
ORF1 : MEAHQFLKAPGITTAVEQAALATANSALANAVVVRPFLSHQQIEILINLM
QPRQLVFRPEVFWNQPIQRVIHNELELYCRARSGRCLEIGAHPRSINDNPNVVHR CFLRPVGRDVQRWYTAPTRGPAANCRRSALRGLPAADRTYCFDGFSGCSCPAET GIALYSLHDMSPSDVAEAMFRHGMTRLYAALHLPPEVLLPPGTYRTASYLLIHD GRRVVVTYEGDTSAGYNHDVSNLRSWIRTTKVTGDHPLVIERVRAIGCHFVLLL TAAPEPSPMPYVPYPRSTEVYVRSIFGPGGTPSLFPTSCSTKSTFHAVPAHIWDRL MLFGATLDDQAFCCSRLMTYLRGISYKVTVGTLVANEGRNASEDALTAVITAAY LTICHQRYLRTQAISKGIRRLEREHDQKFITRLYSWLFEKSGRDYIPGRQLEFYAQ CRRWLSAGFHLDPRVLVFDESAPCHCRTVIRKALSKFCCFMKWLGQECTCFLQP AEGVVGDQGHDNESYEGSDVDPAESAISDISGSYVVPGTALQPLYQALDLPDEIV ARACRLTATVKVSQVDGRIDCETLLGNKTFRTSFVDGAVLETNGPERHNLSFDA SQSTMAAGPFSLTYAASAAGLEVRYVGAGLDHRAIFAPGVSPRSNPGEVTAFCS ALYRFNREAQRHSLTGNLWFHPEGLIGLFAPFSPGHVWESAKPFCGESTLYTRT WSEVDAVSSPTRPDLGFMSEPPIPSRAATPTLAAPLPPLAPDPSPPSSAPALDEPAS
AATSGVPAITHQTARHRRLLFTYPDGSKVFAGSLFESTCTWLVNASNVDHCPGG GLCHAFYQRYPASFDAACFVMRDGAAAYTLTPRPIIHRVAPDYRLEHNPKRLEA AYRETCSRLGTAAYPLLGTGIYQVPIGPSFDAWERNHRPGDELYLPELAARWFE ANRPTRPTLTITEDAARTANLAIELDSATDVGRACAGCRVTPGVVQYQFTAGVP GSGKSRSITRADVDVVVVPTRELRNAWRRRGFAAFTPHTAARVTDGRRVVIDEA PSLPPHLLLLHMQRAATVHLLGDPNQIPAIDFEHPGLVPAIRPDLAPTSWWHVTH RCPADVCELIRGAYPMIQTTSRVLRSLFWGEPAVGQKLVFTQAAKPANPGSVTV HDSQGATYTYTTIIATADARGLIQSSRAHAIVALTRHTEKWVIIDAPGLLREVGIS DAIVNNFFLAGGEIGHQRPSVIPRGNPDANVDTLAAFPPSCQISAFHQLAEELGHR PAPVAAVLPPCPELEQGLLYLPQELTTCDSVVTFELTDIVHCRMAAPSQRKAVVS TLVGRYGRRTKLYNASHSDVRDSLARFIPAIGPVQVTTCELYELVEAMVEKGQD GSAVLELDLCNRDVSRITFFQKDCNKFTTGETIAHGKVGQGISAWSKTFCALFGP WFRAIEKAILALLPQGVFYGDAFDDTVFSAAVAAAKASMVFENDFSEFDSTQNN FSLGLECAIMEECGMPQGLIRLYHLIRSAWILQAPKESLLGFWKKHSGEPGTLLW NTVWNMAVITHCYDFRDLQVAAFKGDDSIVLCSEYRQSPGAAVLIAGCGLKLK VDFRPMRLYAGVVVAPGLGALPDVVRFAGRLTEKNWGPGPERADELRIAVSDF LRKLTNVAQMCVDVVSRVYGVSPGLVHNLIGMLQAVA DGKAHFTESVKPVLDLTNSILCRVE.
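The correspondence between the nucleotide listing and the amino acid listing can be spot-checked by translating a short stretch of cDNA with the standard genetic code. The following is a minimal sketch, not the inventors' procedure: the helper name `translate` and the choice of the first 33 nucleotides of SEQ ID No. 1 are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna: str) -> str:
    """Translate a cDNA string (layout blanks allowed) until a stop codon."""
    seq = "".join(cdna.lower().split())
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":          # stop codon: end of the reading frame
            break
        residues.append(aa)
    return "".join(residues)


# First 33 nucleotides of SEQ ID No. 1 as printed.
orf1_start = "atg gaggcccatc agtttctcaa ggctcccggc"
print(translate(orf1_start))  # MEAHQFLKAPG
```

The result matches the first eleven residues printed for ORF1 in SEQ ID No. 5. The same kind of check can help resolve residues that are illegible in a printed listing.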
The structural genes (ORF2 and ORF3) of the nucleic acid clone of HEV genome code for proteins having the following amino acid sequences Seq ID No. 6 and Seq. ID No. 7, respectively:
SEQ ID No. 6:
ORF2:
MGPRPILLLFLMFLPMLLAPPPGQPSGRRRGRRSGGSGGGFWGDRVDSQP FAIPYIHPTNPFAPNVTAAAGAGPRVRQPVRPLGSAWRDQAQRPAAASRRRPTT AGAAPLTAVAPAHDTPPVPDVDSRGAILRRQYNLSTSPLTSSVATGTNLVLYAAP LSPLLPLQDGTNTHIMATEASNYAQYRVARATIRYRPLVPNAVGGYAISISFWPQ TTPTPTSVDMNSITSTDVRILVQPGIASELVIPSERLHYRNQGWRSVETSGVAEEE ATSGLVMLCIHGSPVNSYTNTPYTGALGLLDFALELEFRNLTPGNTNTRVSRYSS TARHRLRRGADGTAELTTTAATRFMKDLYFTSTNGVGEIGRGIALTLFNLADTLL GGLPTELISSAGGQLFYSRPVVSANGEPTVKLYTSVENAQQDKGIAIPNDIDLGES RVVIQDYDNQHEQDRPTPSPAPSRPFSVLRANDVLWLSLTAAEYDQSTYGSSTGP VYVSDSVTLVNVATGAQAVARSLDWTKVTLDGRPLSTIQQYSKIFFVLPLRGKL SFWEAGTTRPGYPYNYNTTASDQLLVENAAGHRVAISTYTTSLGAGPVSISAVA VLGPHSALALLEDTLDYPARAHTFDDFCPECRPLGLQGCAFQSTVAELQRLKMK VGKTREL
SEQ ID No. 7:
ORF3:
MNNMSFAAPMGSRPWALGLFCCCSSCFCLCCSRHRPVSRLAAVVGGAA AVPAVVSGVTGLILSPSQSPIFIQPTPSPRMSPLRPGLDLVFANPSDHSAPLGATRP SAPPLPHVVDLPQLGPRR.
As stated, the nucleic acid clone of HEV genome of the present invention comprises a DNA which is complementary to the hepatitis E virus genome. Conveniently, this DNA complementary to the hepatitis E genome is a DNA from an epidemic strain derived from Abrahampatnam, Andhra Pradesh, India.
To elucidate the potential role of the 3' NCR in HEV as a replication initiation site for its RNA genome, the inventors have analyzed the simulated structure of the 3' end and its interaction with the viral RdRp and cellular proteins, which were partially characterized by RNase protection and UV-crosslinking assays. Also analyzed were the structural aspects of the HEV RdRp by comparison with results from computer predictions of secondary and tertiary structures of various representative RdRps [38], with special reference to the crystal structure of the poliovirus RdRp [19]. This in itself can serve as a target for drug development.
These research endeavours have successfully established that the 3' non-coding region of the nucleic acid clone of HEV genome of the present invention is characterized by the presence of cis-acting regulatory elements capable of interacting with recombinant viral RNA-dependent RNA polymerase (RdRp) and cellular proteins for virus growth, said elements being thus usable as targets for inhibitors of virus replication. Examples of such inhibitors of virus replication include anti-sense ribonucleic acid (RNA), ribozymes, analogues and inhibitors of the viral replicase.
The cis-acting regulatory elements of the 3' end which are usable as targets for inhibitors of virus replication possess the following sequence Seq ID No. 8:
SEQ ID No. 8:
gcuccagcgccuuaagaugaagguggguaaaacucgggaguuauaguuuauuugcuugugccccccuucucucuguugcuuauuucucauuucugcguuccgcgcucccug
The cis-acting regulatory elements of the 3' end which are usable as targets for inhibitors of virus replication possess the structure shown in Figure 8A of the accompanying drawings. The 3' polyadenylated tail of the 3' non-coding region preferably comprises at least 3 adenosine residues.
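Since anti-sense RNA is named above as one class of inhibitor directed at the 3' cis-acting elements, the base-pairing relationship can be illustrated in silico. This is a hedged sketch only: the function name, the choice of an 11-nucleotide window at the extreme 3' end, and the idea of selecting windows by simple slicing are assumptions made for illustration, and nothing here reflects an experimentally validated inhibitor design.

```python
# Watson-Crick pairing for RNA, written as a translation table.
RNA_COMPLEMENT = str.maketrans("acgu", "ugca")


def antisense_rna(target: str) -> str:
    """Return the reverse complement of an RNA target, read 5' to 3'."""
    return target.lower().translate(RNA_COMPLEMENT)[::-1]


# Final 24 nucleotides of SEQ ID No. 8 as given in the description.
three_prime_end = "cauuucugcguuccgcgcucccug"

# Hypothetical target window: the last 11 nucleotides of the genome.
window = three_prime_end[-11:]          # cgcgcucccug
print(antisense_rna(window))            # cagggagcgcg
```

Applying the function twice returns the original window, which is a convenient sanity check on any candidate sequence.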
A unique genetic engineered nucleic acid clone of hepatitis E virus genome of the invention, which can be transcribed to produce infectious transcripts, has a sequence number AF076239.
Production of a replication and transcription competent (infectious) clone will mitigate the difficulties in the analysis of virus biology. For this purpose, the in-vitro produced full-length HEV RNA was used to study the replication of the virus via gene transfer. In this context, the HEV genome was cloned downstream of the T7 promoter, transcribed in-vitro, and the transcripts were characterized.
Thus, according to a preferred feature of the invention, there is provided a process for the production of a unique genetic engineered nucleic acid clone of hepatitis E virus (HEV) genome which comprises:
(i) amplifying fragments or components of the genome by use of reverse transcription-polymerase chain reaction;
(ii) cloning and sequencing the amplified fragments or components;
(iii) assembling the cloned sub-genomic components by use of overlap amplification and specific restriction fragment cloning to form a full-length HEV cDNA clone; and
(iv) introducing said full-length HEV cDNA clone downstream of a T7 promoter to convert said cDNA clone into a full-length RNA genome using in-vitro transcription.
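The assembly step can be pictured in silico as joining sequenced fragments through the ends they share. The following Python sketch is only an illustration of that idea on three toy fragments cut from the first 46 nucleotides of Seq ID No. 4; the function names and the minimum-overlap parameter are hypothetical and do not form part of the process described.

```python
# Sketch: join overlapping sequenced fragments into one contiguous sequence.
# Function names and the min_overlap default are illustrative assumptions.

def merge_by_overlap(left: str, right: str, min_overlap: int = 8) -> str:
    """Join two fragments where a suffix of `left` equals a prefix of `right`."""
    longest = min(len(left), len(right))
    # Try the longest candidate overlap first so the best match is used.
    for size in range(longest, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    raise ValueError("fragments do not share a sufficient overlap")


def assemble(fragments, min_overlap: int = 8) -> str:
    """Merge an ordered list of fragments from 5' to 3'."""
    assembled = fragments[0]
    for fragment in fragments[1:]:
        assembled = merge_by_overlap(assembled, fragment, min_overlap)
    return assembled


toy_fragments = [
    "aggcagaccacatatgtggtcg",
    "atatgtggtcgatgccatggagg",
    "gccatggaggcccatcagtttc",
]
full = assemble(toy_fragments)
print(full)
```

In the laboratory the order of the fragments is fixed by the primer design and restriction sites, which is why the sketch takes the fragments as an ordered list rather than searching for their order.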
In a particularly advantageous embodiment of the invention, the full-length cDNA clone or the coding components thereof may be cloned in an expression vector system to induce therefrom the expression of viral proteins.
The viral proteins thus expressed from the full-length cDNA clone or its coding components find use in the detection of antibodies to HEV in biological samples such as blood, serum, plasma, blood cells, lymphocytes, bile and liver tissue biopsy.
Accordingly, the invention also includes within its scope an in-vitro method for detecting antibodies to HEV in a biological sample obtained from a subject, which method comprises:
(i) contacting said sample with the viral proteins expressed from the cloning of the full-length cDNA clone or its coding components in an expression vector system, said contacting being effected under conditions that permit binding of HEV-specific antibodies in the sample to said proteins; and
(ii) detecting such binding of antibodies in said sample to said proteins, any such binding of antibodies to said proteins being indicative of the presence of antibodies to HEV in the sample.
As a still further feature, the present invention also includes a diagnostic test kit for the presence of HEV infection essentially comprising viral proteins expressed from the unique genetically engineered full-length cDNA clone of HEV genome of the invention.
According to a further embodiment of the invention, the unique nucleic acid clone of hepatitis E virus (HEV) genome can be mutated to produce non-infectious HEV particles. These non-infectious HEV particles so produced can be combined with a pharmaceutically acceptable adjuvant to produce an HEV vaccine.
The invention also envisages a high throughput assay system for rapid anti-HEV drug testing which comprises the nucleic acid clone of hepatitis E virus (HEV) genome described herein modified by the insertion therein of a reporter gene for the analysis of replication inhibition.
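How a reporter read-out might be converted into a measure of replication inhibition can be sketched as follows. This is a minimal Python illustration under stated assumptions: the description does not specify the reporter, the read-out or the calculation, so the function name, the background correction and the example readings are all hypothetical.

```python
# Sketch: percent inhibition of replication from reporter signals.
# The formula and the readings are illustrative assumptions, not taken
# from the description.

def percent_inhibition(treated: float, untreated: float, background: float = 0.0) -> float:
    """Inhibition relative to the untreated control after background subtraction."""
    signal_range = untreated - background
    if signal_range <= 0:
        raise ValueError("untreated signal must exceed background")
    return 100.0 * (1.0 - (treated - background) / signal_range)


# Hypothetical reporter readings in arbitrary units.
inhibition = percent_inhibition(treated=300.0, untreated=1100.0, background=100.0)
print(f"{inhibition:.1f} % inhibition")
```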
The invention will be explained in greater detail with reference to the following non-limitative Examples which are provided as exemplary of the invention.
EXAMPLE 1
Assembly of a full-length HEV cDNA clone
As part of their efforts to assemble a full-length HEV cDNA clone, the inventors have analyzed the potential of a full-length in-vitro synthesized HEV RNA. The three known ORFs {GenBank Acc. No. AF028091 (ORF1), Acc. No. U22532 (ORF2 and ORF3)}, along with the non-coding regions, were cloned using sub-genomic PCR amplification followed by a reconstruction strategy [2, 23]. The cDNA clone complementary to the hepatitis E virus genome was derived from an Indian isolate of HEV from an epidemic strain from Abrahampatnam, A.P., India. The clone in question was generated using multiple PCR amplification followed by an assembly strategy and possesses the following sequence Seq ID No. 4:
SEQ ID No. 4:
aggcagacca catatgtggt cgatgccatg gaggcccatc agtttctcaa ggctcccggc 60 atcactactg ctgttgagca ggctgctcta gccacggcca actctgccct ggcgaatgct 120 gtggtagtta ggccttttct ttctcaccag cagattgaga ttctcattaa cctaatgcaa 180 cctcgccagc ttgttttccg ccccgaggtt ttctggaatc aacccatcca gcgtgtcatt 240 cataacgagc tggagcttta ctgccgcgct cgctccggcc gctgtcttga aattggcgcc 300 catccccgct caataaatga taatcctaat gtggtccacc gctgcttcct ccgccctgtt 360 gggcgtgatg ttcagcgctg gtatactgct cccactcgcg ggccggctgc taattgccgc 420 cgttccgcgt tgcgtgggct tcccgctgct gaccgcacat actgcttcga cgggttttct 480 ggctgtagct gccccgccga gacgggtatc gccctttact ccctccatga tatgtcacca 540 tctgatgttg ccgaggccat gttccgccat ggtatgacgc ggctttatgc tgccctccat 600 cttccgcctg aggtcttgct gccccctggc acatatcgca ccgcatcgta tttgctgatt 660 catgacggca ggcgcgttgt ggtgacgtat gagggtgata ctagtgctgg ttacaaccac 720 gatgtctcca acttgcgctc ctggattaga accaccaagg ttaccggaga ccatcccctc 780 gttatcgagc gggttagggc cattggctgc cactttgttc tcttgctcac ggcagccccg 840 gagccatcac ctatgcctta tgttccttac ccccggtcta ccgaggtcta tgtccgatcg 900 atcttcggcc cgggtggtac cccttcctta ttcccaacct catgctccac taagtcgacc 960 ttccacgctg tccctgccca tatttgggac cgtcttatgc tgttcggggc caccttggat 1020 gaccaagcct tttgctgctc ccgtttaatg acctaccttc gcggcattag ctacaaggtc 1080 actgttggta cccttgtggc taatgaaggc cggaacgcct ctgaggacgc cctcacagct 1 140 gtcatcactg ccgcctatct taccatttgc caccagcggt atctccgcac ccaggctata 1200 tccaaεεεεa tccεtcεtct tεaacε εaε catεaccaεa agtttataac acεcctttac 1260
agctggctct tcgagaagtc cggccgtgat tacatccctg gccgtcagtt ggagttctac 1320 gcccagtgta ggcgctggct ttcggccggc tttcatcttg atccacgtgt actggttttt 1380 gacgagtcgg ccccctgcca ttgtaggact gtgatccgca aggcgctctc gaagttttgc 1440 tgctttatga agtggcttgg tcaggagtgc acctgttttc ttcaacctgc agaaggcgtc 1500 gtcggcgacc agggtcatga taacgaatcc tatgaggggt ccgatgttga ccctgctgag 1560 tccgccatta gtgacatctc tgggtcctat gtcgtccctg gcacagccct ccaaccgctc 1620 taccaggccc tcgatctccc cgatgagatt gtggctcgcg cgtgccggct gaccgccaca 1680 gtaaaggtct cccaggtcga tgggcggatc gattgcgaga cccttcttgg taacaaaacc 1740 ttccgcacgt cgtttgtcga cggggcggtc ttagagacca atggcccaga gcgccacaat 1800 ctctcctttg atgccagtca gagcactatg gccgctggcc ctttcagtct cacctatgcc 1860 gcctctgcag ctgggctgga ggtgcgctat gttggtgccg ggcttgacca tcgggcgatt 1920 tttgcccccg gtgtttcacc ccggtcaaac cccggcgagg tcaccgcctt ctgctctgcc 1980 ctatataggt tcaaccgtga ggcccagcgc cattcgctga ccggtaactt atggttccat 2040 cctgaggggc ttattggcct ctttgccccg ttttcgcctg ggcatgtctg ggagtcggct 2100 aaaccattct gtggcgagag cacactttac acccgtactt ggtcggaggt tgatgccgtc 2160 tctagtccaa cccggcccga tttgggtttt atgtctgagc ctcctatacc tagtagggcc 2220 gccacgccta ccttggcggc ccctctaccc ccccttgcac cggacccttc ccctccttct 2280 tctgccccgg cgctcgatga gccggcttct gccgctacct ccggggtccc ggccataacc 2340 caccagacgg cccggcaccg ccgcctgctc ttcacctacc cggatggctc taaggtattc 2400 gccggctcgc tgttcgagtc gacatgcacg tggctcgtta acgcgtctaa tgttgaccac 2460 tgccctggcg gcgggctttg ccatgcattt taccaaaggt accccgcctc ctttgatgct 2520 gcctgttttg tgatgcgcga cggtgcggcc gcgtacacac tgaccccccg gccaataatt 2580 catcgtgtcg cccctgatta taggttggaa cataacccaa agaggcttga ggctgcttat 2640 cgggagactt gttcccgcct cggtaccgct gcatacccgc tcctcgggac cggcatatac 2700 caggtgccga tcggtcccag ctttgacgcc tgggagcgga accaccgccc cggggatgag 2760 ttgtaccttc ctgaacttgc tgccagatgg tttgaggcca ataggccgac ccgcccaact 2820 ctcactataa ctgaggatgc tgcacggaca gcgaatctgg ccatcgagct tgactcagcc 2880 acagatgtcg gccgggcctg tgctggctgt cgggttaccc ctggcgttgt tcaataccag 2940 tttaccgcag 
gtgtgcctgg atccggcaag tcccgctcca tcacccgagc cgatgtggac 3000 gttgtcgtgg tcccgacgcg cgagttgcgt aatgcctggc gccgtcgcgg ctttgctgcc 3060 ttcaccccgc acactgccgc tagagtcacc gacgggcgcc gggttgtcat tgatgaggct 3120 ccatccctcc cccctcacct gttgctgctc cacatgcagc gggccgccac cgtccacctt 3180 cttggcgacc cgaatcagat cccagccatc gactttgagc accctgggct cgtccccgcc 3240 atcaggcccg acttagcccc tacctcctgg tggcatgtta cccatcgctg ccctgcggat 3300
gtatgtgagt tgatccgtgg tgcatacccc atgatccaga ccactagccg ggttctccgt 3360 tcgttgttct ggggtgagcc tgccgtcggg cagaaactag tgttcaccca ggcggccaag 3420 cccgccaacc ccggctcagt gacggtccac gattcgcagg gcgctaccta cacttatacc 3480 actattattg ccacagcaga tgcccggggc cttattcagt cgtctcgggc tcatgccatt 3540 gttgctctga cgcgccacac tgagaagtgg gtcatcattg acgcaccagg cctgcttcgc 3600 gaggtgggca tctccgatgc aatcgttaat aactttttcc tcgctggtgg cgaaattggt 3660 catcagcgcc catctgttat tccccgtggc aaccctgacg ccaatgttga caccctggct 3720 gccttcccac cgtcttgcca gattagtgcc ttccatcagt tagctgagga gctcggccac 3780 agacctgccc ctgttgcagc tgttctacca ccctgccccg agctcgaaca gggcctcctc 3840 tatctgcccc aggagctcac cacctgtgat agtgtcgtaa catttgaatt aacagatatt 3900 gtgcactgcc gcatggccgc cccgagccag cgcaaggccg tagtgtccac actcgtgggc 3960 cgctacggcc gtcgcacaaa gctctacaat gcttcccact ctgatgttcg cgactctctc 4020 gcccgtttta tccctgccat tggccccgta caggtcacaa cctgtgaatt gtacgagtta 4080 gtggaggcca tggtcgagaa gggccaggat ggctccgccg tccttgagct cgatctttgc 4140 aaccgtgatg tgtccaggat caccttcttc cagaaagatt gtaacaagtt caccacaggt 4200 gagaccattg ctcatggtaa agtgggccag ggcatctcgg cctggagcaa gaccttctgc 4260 gccctctttg gcccttggtt tcgcgccatt gagaaggcta ttctggcctt gctccctcag 4320 ggtgtgtttt acggtgatgc ctttgatgac accgtcttct cagcggctgt ggccgcagca 4380 aaggcatcca tggtgtttga gaatgacttt tctgagtttg actccaccca gaataacttt 4440 tctttgggtc tagagtgtgc tattatggag gagtgcggga tgccgcaggg gctcatccgc 4500 ttgtatcacc ttataaggtc tgcgtggatc ctgcaggccc cgaaggagtc tctgctaggg 4560 ttttggaaga aacactccgg cgagcccggc actcttctat ggaatactgt ctggaatatg 4620 gctgttatta cccactgtta tgacttccgc gatttgcagg tagctgcctt taaaggtgat 4680 gattcgatag tgctttgcag tgagtatcgt cagagtccag gagctgctgt cctgatcgcc 4740 ggctgtggct tgaagttgaa ggtagatttc cgcccgatgc gtttgtatgc aggtgttgtg 4800 gtggcccccg gccttggcgc gcttcctgat gtcgtgcgct tcgccggccg gcttaccgag 4860 aagaattggg gccctggccc tgagcgggcg gacgagctcc gcatcgctgt tagtgacttc 4920 ctccgcaagc tcacgaatgt ggctcagatg tgtgtggatg ttgtttcccg tgtttatggg 4980 gtttcccctg 
ggctcgttca taacctgatt ggcatgctac aggctgttgc tgatggcaag 5040 gcacatttca ctgagtcagt aaaaccagtg ctcgacctga caaattcaat cttgtgtcgg 5100 gtggaatgaa taacatgtct tttgctgcgc ccatgggttc gcgaccatgg gccctcggcc 5160 tattttgttg ctgttcctca tgtttctgcc tatgctgctc gcgccaccgc ccggtcagcc 5220 gtctggccgc cgtcgtgggc ggcgcagcgg cggttccggc ggtggtttct ggggtgaccg 5280 εεttεattct caεcccttcg caatccccta tattcatcca accaacccct tcεccccgaa 5340
tgtcaccgct gcggccgggg ctggacctcg tgttcgccaa cccgtccgac cactcggctc 5400 cgcttggcgc gaccaggccc agcgccccgc cgctgcctca cgtcgtagac ctaccacagc 5460 tggggccgcg ccgctaaccg cggtcgctcc ggcccatgac accccgccag tgcctgatgt 5520 cgactcccgc ggcgccatct tgcgccggca gtacaaccta tcaacatctc cccttacctc 5580 ttccgtggcc accggtacta acctggttct ttatgccgcc cctcttagtc cgcttttacc 5640 ccttcaggac ggtaccaata ctcatataat ggccacggaa gcttctaatt atgcccagta 5700 ccgggttgcc cgtgccacga tccgttaccg cccgctggtc cccaacgctg tcggcggtta 5760 cgccatctcc atctcattct ggccacagac tacccccacc ccgacgtccg ttgatatgaa 5820 ttcaataacc tcgacggatg ttcgcatttt agtccagccc ggcatagcct ctgagcttgt 5880 gatcccaagc gagcgcctac actatcgtaa ccaaggttgg cgctctgtcg agacctccgg 5940 ggtggccgag gaggaggcca cctccggtct tgttatgctt tgcatacatg gctcacccgt 6000 aaattcctat actaatacac cctataccgg tgcccttggg ctgctggact ttgcccttga 6060 gcttgagttt cgcaacctta cccccggtaa tactaatacg cgggtctccc gttattccag 6120 cactgctcgc caccgccttc gtcgcggtgc ggacgggact gccgagctca ccaccacggc 6180 tgctacccgc tttatgaagg acctctattt tactagtact aatggtgttg gtgagatcgg 6240 ccgcgggata gccctcaccc tgttcaacct tgctgacact ttgcttggcg gcctgccgac 6300 agaattgatt tcgtcggctg gtggccagct gttctactcc cgtcccgttg tctcagccaa 6360 tggcgagccg actgttaagc tgtatacatc tgtagagaat gctcagcagg ataagggtat 6420 tgcaatcccg aatgatattg acctcggaga atctcgtgtg gttatccagg attatgataa 6480 ccaacatgaa caagaccggc cgacgccttc tccagccccg tcgcgccctt tctctgttct 6540 tcgagctaat gatgtgcttt ggctctctct caccgctgcc gagtatgacc agtccaccta 6600 tggctcttcg actggcccag tttatgtttc tgactctgtg accttggtta atgttgccac 6660 cggcgcgcag gccgttgccc ggtcgctcga ttggaccaag gtcacacttg acggtcgccc 6720 tctctccacc atccagcagt attcgaagat cttctttgtc ctgccgctcc gcgggaagct 6780 ctctttctgg gaggcaggca caactaggcc cgggtaccct tataattaca acaccactgc 6840 aagcgaccaa ctgcttgtcg agaatgccgc cgggcaccgg gttgctattt ccacttacac 6900 cactagcctg ggtgctggcc ccgtctctat ttctgcggtt gccgtcttag gcccccactc 6960 tgcgctagca ttgcttgagg atactttgga ttaccctgcc cgcgcccata cttttgatga 7020 cttctgccca 
gagtgccgcc cccttggcct ccagggctgc gctttccagt ctactgtcgc 7080 tgagctccag cgccttaaga tgaaggtggg taaaactcgg gagttatagt ttatttgctt 7140 gtgcccccct tctctctgtt gcttatttct catttctgcg ttccgcgctc cctg 7194
These fragments were expressed in both prokaryotic as well as eukaryotic systems [2, 23, 40]. These sub-genomic fragments were produced by PCR cloning and assembly from a single viral isolate grown in a rhesus monkey. These previously described cloned fragments were used for the reconstruction of a full-length genomic cDNA clone of HEV. Briefly, an XhoI restriction enzyme site was engineered in the primer 7195 sequence (5'GCctcgagTTTTTCAGGGAGCGCGGAACGCA3'), which had a stretch of five thymidine bases at the 3' end, to produce an ORF2-pBluescript SK (+) (Stratagene, Germany) clone. This clone was extended at the 5' end by inserting a PCR-amplified fragment of HEV covering nucleotides 4438 to 5285 using standard cloning procedures. The clone HEV ORF1-pSGI was digested with EcoRI and XbaI to release an insert ranging from nucleotide 1-4449, whereas the insert ranging from nucleotide 4449-7195 was released from the 4438-7195 pBluescript SK (+) clone by digestion with XbaI and XhoI restriction enzymes. These inserts were cloned into the pSGI vector [22] digested with EcoRI and XhoI restriction enzymes in a three-way ligation.
The reconstruction referred to above is schematically represented in Figure 1 of the accompanying drawings. The strategy shown has been used for the assembly of a full-length HEV cDNA clone. The ORF2-pCR-Script SK(+) clone was extended at its 5' end with a PCR-amplified fragment (nt. 4438-5285) using the inherent restriction sites BstEII and XbaI. An XhoI site was created at the 3' end using another PCR-amplified fragment with an XhoI site in the primer, which replaced a fragment extending from nt. 6881-7195. This fragment (nt. 4449-7195) and the fragment of ORF1 (nt. 1-4449) were used in a three-way ligation to create the complete cDNA clone of HEV in the pSGI vector. This full-length HEV cDNA clone was completely sequenced after reconstruction (GenBank Acc. No. AF076239) and named pSGI-HEV(I).
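The design of the 3' primer can be checked with a short Python sketch. It splits the primer 7195 sequence quoted in this Example into the engineered XhoI site, the run of thymidines and the annealing region, and confirms that the annealing region is the reverse complement of the last 18 nucleotides of Seq ID No. 4. The recognition sequence CTCGAG for XhoI is standard; the helper and variable names are hypothetical, and the sketch is an illustration rather than part of the described procedure.

```python
# Sketch: dissect the 3' primer used to engineer the XhoI site.
# Helper and variable names are illustrative assumptions.

PRIMER_7195 = "GCctcgagTTTTTCAGGGAGCGCGGAACGCA"  # as quoted in this Example
XHOI_SITE = "CTCGAG"                             # XhoI recognition sequence
CDNA_LAST_18 = "tgcgttccgcgctccctg"              # last 18 nt of Seq ID No. 4


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence in upper case."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


primer = PRIMER_7195.upper()
site_start = primer.find(XHOI_SITE)
after_site = primer[site_start + len(XHOI_SITE):]
t_run = len(after_site) - len(after_site.lstrip("T"))  # thymidines after the site
annealing = after_site[t_run:]

print("XhoI site at primer position", site_start + 1)
print("thymidine run length:", t_run)
print("anneals to cDNA 3' end:", reverse_complement(annealing) == CDNA_LAST_18.upper())
```

Because the primer is complementary to the sense strand, its five thymidines place five adenosines immediately after the viral 3' end in a run-off transcript made from the XhoI-linearized clone.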
EXAMPLE 2
In-vitro Transcription of Full-Length HEV RNA
In-vitro transcription of the cDNA clone {pSGI-HEV(I)} assembled according to Example 1 was carried out using T7 RNA polymerase (Stratagene, Germany) to generate full-length HEV transcripts. Briefly, the caesium chloride (CsCl) gradient purified plasmid {pSGI-HEV(I)} containing full-length HEV cDNA was digested with XhoI to produce a linear DNA template. RNA was in-vitro transcribed using 50 units of T7 RNA polymerase in a 50 μl reaction volume containing 40 mM Tris-HCl pH 8, 8 mM MgCl2, 50 mM NaCl, 2 mM spermidine, 30 mM DTT, 400 μM each rNTPs and 5 μg template DNA. The reaction was incubated at 37°C for 30 minutes, followed by digestion with DNaseI (RQ1 RNase-free DNase, Promega, USA) for one hour at 37°C. The reaction was extracted with phenol/chloroform, followed by a second round of DNaseI digestion and phenol/chloroform extraction to ensure no carry-over of the template DNA with the transcripts. The transcripts were ethanol precipitated at -70°C for one hour. The RNA pellet was washed with cold 70% ethanol, dried, and its integrity was analyzed by 0.8% formaldehyde agarose gel electrophoresis followed by ethidium bromide staining and northern hybridization.
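As a worked example of setting up such a reaction, the volume of each stock solution needed for a 50 μl reaction follows from C1·V1 = C2·V2. In the Python sketch below the final concentrations are those read from the description above, whereas the stock concentrations are purely hypothetical values chosen for illustration; the function name is likewise an assumption.

```python
# Sketch: stock volumes for a 50 ul transcription reaction (C1*V1 = C2*V2).
# Stock concentrations are hypothetical; they are not given in the description.

REACTION_VOLUME_UL = 50.0

# Final concentrations in mM, as read from the description.
FINAL_MM = {
    "Tris-HCl": 40.0,
    "MgCl2": 8.0,
    "NaCl": 50.0,
    "spermidine": 2.0,
    "DTT": 30.0,
    "each rNTP": 0.4,
}

# Assumed stock concentrations in mM (illustrative only).
STOCK_MM = {
    "Tris-HCl": 1000.0,
    "MgCl2": 1000.0,
    "NaCl": 5000.0,
    "spermidine": 100.0,
    "DTT": 1000.0,
    "each rNTP": 10.0,
}


def stock_volume_ul(final_mm: float, stock_mm: float,
                    total_ul: float = REACTION_VOLUME_UL) -> float:
    """Volume of stock that gives the final concentration in the total volume."""
    if stock_mm <= final_mm:
        raise ValueError("stock must be more concentrated than the final mix")
    return final_mm * total_ul / stock_mm


volumes = {name: stock_volume_ul(FINAL_MM[name], STOCK_MM[name]) for name in FINAL_MM}
for name, volume in volumes.items():
    print(f"{name}: {volume:.2f} ul")
```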
According to such northern hybridization procedure, in-vitro produced RNA was resolved on a 0.8% formaldehyde denaturing agarose gel and transferred to a nylon membrane (Hybond, Amersham Intl., UK) in the presence of 20X SSC. The membrane was washed with 10X SSC solution, air-dried and subjected to UV-crosslinking in an ultraviolet crosslinker (Stratagene, Germany). The membrane was put in prehybridization solution (6X SSC, 5X Denhardt's solution, 0.5% sodium dodecyl sulfate (SDS), 100 μg calf thymus DNA per ml of the solution) and incubated at 68°C for 6 hours in a hybridization oven (Shel Lab, Model 1004, USA). The hybridization was subsequently carried out in fresh pre-hybridization solution containing 1x10 cpm of an alpha 32P dCTP labelled probe generated from the full-length ORF2 clone of HEV [39]. The probe was prepared by using a commercial random priming kit (Prime-it, Stratagene, Germany) as per the manufacturer's protocol.
Following hybridization for 16 hours at 68°C, the membrane was washed as follows:
a) 2x SSC, 0.1% SDS for 5 minutes at room temperature,
b) 0.2x SSC, 0.1% SDS for 5 minutes at room temperature, twice,
c) 0 SSC, 0.1% SDS for 15 minutes at 42°C, and
d) 0.1x SSC, 0.1% SDS for 15 minutes at 68°C.
All the solutions were discarded as radioactive waste.
Following the last wash, the membrane was wrapped in Saran Wrap and exposed for autoradiography using Kodak X-Omat AR film with Du-Pont intensifying screens (Du-Pont, USA).
The analysis of in-vitro transcribed full-length HEV RNA from the cDNA clone {pSGI-HEV(I)} on 0.8% formaldehyde agarose gel electrophoresis is photographically shown in Figure 2 of the drawings. In this figure, lane 1 shows the RNA on staining the gel with ethidium bromide. Lane 2 shows the RNA as detected by northern hybridization using an HEV-specific probe. Lane M shows the molecular size RNA marker in kilobases.
The majority of the HEV transcripts were about 7.2 kb in size, corresponding to the complete genome when compared with a standard RNA molecular weight marker (Life Technologies, USA).
EXAMPLE 3
RNA Transfection and Metabolic Pulse Labelling
HepG2 cells were maintained in Dulbecco's modified Eagle's Medium (DMEM) containing 10% fetal calf serum (Life Technologies, USA). Cells at about 50% confluency were used for transfection of HEV RNA. Twenty micrograms of in-vitro produced RNA were transfected by a liposome-mediated method (Lipofectamine, Life Technologies, USA) as per the manufacturer's guidelines. The plasmid vector (pSGI) served as a control for the transfection. For each 60-mm diameter culture dish, 20 μg of the HEV RNA and 10 μl of Lipofectamine were diluted in 1.5 ml of serum-free medium. The mixture was kept at room temperature for 30 minutes and gently overlaid on to the monolayer. The medium was replaced with fresh medium containing 10% fetal calf serum after 6 hours and the cells were kept in an atmosphere of 5% CO2. After 72 hours the cells were harvested for extraction of total RNA.
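To give a sense of scale, the amount of RNA used per dish can be converted into an approximate number of genome-length molecules. The Python sketch below assumes a transcript of 7194 nucleotides (the length of Seq ID No. 4 as numbered in the listing) and uses a common rule-of-thumb approximation for the molar mass of single-stranded RNA, 320.5 g/mol per nucleotide plus 159 g/mol; both the approximation and the function names are assumptions introduced here for illustration.

```python
# Sketch: approximate molar amount and copy number of transfected RNA.
# The molar-mass formula is a rule-of-thumb approximation, not from the description.

AVOGADRO = 6.022e23  # molecules per mole


def ssrna_molar_mass(length_nt: int) -> float:
    """Approximate molar mass of single-stranded RNA in g/mol."""
    return length_nt * 320.5 + 159.0


def rna_picomoles(mass_ug: float, length_nt: int) -> float:
    """Picomoles of RNA in a given mass (micrograms)."""
    moles = mass_ug * 1e-6 / ssrna_molar_mass(length_nt)
    return moles * 1e12


def rna_copies(mass_ug: float, length_nt: int) -> float:
    """Approximate number of RNA molecules in a given mass (micrograms)."""
    return rna_picomoles(mass_ug, length_nt) * 1e-12 * AVOGADRO


pmol = rna_picomoles(20.0, 7194)
copies = rna_copies(20.0, 7194)
print(f"{pmol:.2f} pmol, about {copies:.2e} molecules per dish")
```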
Transfected cells were pulse labelled with 35S methionine-cysteine (100 μCi/ml/60 mm plate) (Promix, Amersham Intl., UK) for four hours at 72 hours post transfection in methionine-cysteine deficient DMEM (Sigma, USA). The metabolically labelled cells were harvested and proteins were immunoprecipitated using HEV-specific polyclonal antibodies. Similar labelling experiments were also carried out at 12, 24, 36, 72 and 96 hours to determine the expression kinetics of the viral RNA dependent RNA polymerase (RdRp). A batch of the HEV RNA transfected cells was maintained in culture and allowed to grow for the next 45 days (8 passages). These cells were analyzed at 3, 7, 15, 33 and 45 days post transfection for the presence of anti-sense RNA replicative intermediates using strand-specific PCR.
EXAMPLE 4
Detection of Anti-sense HEV RNA
Strand-specific PCR was carried out to detect anti-sense and sense HEV RNA in the cells transfected according to the procedure of Example 3 [37]. Total RNA was isolated from the cells at days 3, 7, 15, 33 and 45 post-transfection by a single-step RNA isolation method [8].
Results showed that both sense and anti-sense strands of the HEV genome were detected by strand-specific RT-PCR in the HEV RNA transfected HepG2 cells. The pSGI transfected cells remained negative for HEV sequences at all points of time. The HEV RNA transfected cells were positive for sense as well as anti-sense strands of the HEV genome at 3, 7, 15 and 33 days.
A serial log fold dilution of the total RNA extracted at 72 hours (3 days) was carried out to determine an approximate ratio of sense and anti-sense strands. For sense strand detection, the HepG2 cells transfected with plasmid pSGI and serum from an HEV infected monkey with viremia served as negative and positive controls, respectively. For anti-sense detection, RNA isolated from HEV infected monkey liver containing the anti-sense replicative intermediate served as positive control, whereas the bile fluid or serum from the same viremic animal served as negative control. For strand-specific detection, the reverse transcription was carried out using either the sense or the anti-sense primer. Following cDNA synthesis, the RNA in the reaction mix was degraded by digestion with RNase H (2 units) and RNase A (1 μg) (Promega, USA). Following RNase treatment, the reaction mix was extracted once with phenol/chloroform and ethanol precipitated. The precipitated cDNA was used for PCR amplification using both sense and anti-sense primers.
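The serial log dilution amounts to an endpoint titration: the last dilution at which a strand is still amplified gives an order-of-magnitude estimate of its abundance. A minimal Python sketch of that arithmetic is given below, using the endpoints reported in the results that follow, read here as 10^-6 for the negative strand and 10^-15 for the positive strand. The function name is hypothetical, and the estimate assumes that both strands are reverse transcribed and amplified with similar efficiency.

```python
# Sketch: relative abundance of two strands from endpoint dilutions.
# Assumes similar detection efficiency for both strands; names are illustrative.

def fold_excess(endpoint_exp_abundant: int, endpoint_exp_rare: int) -> float:
    """Fold excess of the strand detected at the higher dilution.

    Arguments are the base-10 exponents of the last positive dilutions,
    e.g. -15 for a 10^-15 dilution.
    """
    return 10.0 ** (endpoint_exp_rare - endpoint_exp_abundant)


# Endpoints as read from the reported results: positive strand 10^-15,
# negative strand 10^-6.
ratio = fold_excess(endpoint_exp_abundant=-15, endpoint_exp_rare=-6)
print(f"positive strand in roughly {ratio:.0e}-fold excess")
```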
Results of the experiment with serial log fold dilution carried out at 72 hours post transfection showed that the negative strand of the HEV genome was detected up to a dilution of 10^-6 of the total HepG2 cell RNA. On the other hand, the positive strand was detected up to a dilution of 10^-15. The positive strand was thus more abundant than the negative-sense strand. These results are depicted in Figures 3A and 3B of the drawings, in which the photograph shows the amplification product of the HEV genome in strand-specific PCR carried out on total RNA extracted from HepG2 cells transfected with full-length positive-sense in-vitro synthesized HEV RNA. Figure 3A represents detection of the negative-sense strand of HEV RNA and Figure 3B represents detection of the positive-sense strand of HEV RNA. Explanations of the other integers shown in Figure 3 are as follows:
Figure 3A: Detection of negative strand of HEV RNA:
Lanes: M = marker, 100 bp ladder (Life Technologies, USA); P = positive control (RNA isolated from the HEV infected Rhesus monkey liver); N = negative control (RNA extracted from the bile fluid of HEV infected Rhesus monkey); 1 = RNA (1 μg) extracted from HepG2 cells transfected with in-vitro synthesized HEV RNA at 72 hours (neat); 2 through 9 = successive log dilutions of the transfected HepG2 cell RNA.
Figure 3B: Detection of positive sense strand of HEV RNA:
Lanes: M = marker, 100 bp ladder (Life Technologies, USA); P = positive control (RNA extracted from the HEV infected Rhesus monkey serum); N = negative control (RNA extracted from the normal control Rhesus monkey serum); 1 = RNA (1 μg) extracted from HepG2 cells transfected with in-vitro synthesized HEV RNA at 72 hours; 2 through 8 = log dilutions of total RNA from transfected HepG2 cells (the last three being 6 = 10^-16, 7 = 10^-17 and 8 = 10^-18 dilutions).
For hybridisation-based detection, total RNA (30 μg) extracted from the transfected cells was immobilized on a nylon membrane (Amersham Intl., UK) using a Hybri-slot manifold (Life Technologies, USA). RNA from cells transfected with plasmid pSGI was used as negative control. In addition, the in-vitro transcribed unlabelled sense and anti-sense HEV RNA (2.5 μg each) were used as positive and negative controls alternatively. Before transfer, the manifold was cleaned with 0.1 M NaOH and rinsed twice with DEPC-treated water. The membrane was cut to the size of the manifold and soaked in 20X SSC for 10 minutes prior to blotting. Approximately 30 μg of the total RNA isolated from the HEV RNA transfected cells was mixed with 3 volumes of denaturing solution [5 ml deionized formamide, 1.62 ml of 37% formaldehyde and 1 ml MOPS buffer (0.2 M MOPS, 0.5 M sodium acetate and 0.01 M EDTA, pH 8.0)] and incubated at 65°C for 15 minutes. The mixture was chilled on ice, diluted with 1 ml of ice-cold 20X SSC and loaded in defined wells. The suction was continued until the samples were exhausted in all the wells. The membrane was air dried, UV cross-linked and incubated in pre-hybridization solution as described above. Sense and anti-sense specific riboprobes were prepared by transcription with T7 and SP6 polymerases (Riboprobe system-T7 and Riboprobe system-SP6, Promega, USA) using direct and reverse oriented clones of an HEV cDNA encompassing nucleotides 1-457 in pCR-Script SK (-) (Stratagene, Germany) and pGEM-T (Promega, USA) vectors, respectively. The transcription reaction was carried out in the presence of alpha 33P UTP (2500 Ci/mmol, Amersham Intl., UK). Prior to in-vitro transcription, the template DNA was linearized by restriction enzyme digestion at the end of the fragment. The reaction mixture was treated with DNaseI, extracted with phenol/chloroform and ethanol precipitated as mentioned above. Hybridization and washing conditions were similar to those described earlier.
Thoroughly washed membrane was wrapped in Saran Wrap and exposed for autoradiography.
Results: Strand-specific RNA hybridization carried out using alpha 33P UTP labelled riboprobes confirmed the presence of anti-sense RNA in the HEV RNA transfected HepG2 cells. Both sense and anti-sense HEV RNA could be detected in transfected cells by this method, while control cells transfected with pSGI vector did not show any signal on hybridization. These results are shown in Figure 4 of the drawings, in which the photograph shows strand-specific slot hybridisation for detection of positive and negative sense HEV RNA in total RNA extracted from HepG2 cells transfected with full-length in-vitro transcribed HEV RNA.
Panel A of Figure 4 shows hybridisation with alpha 33P UTP labelled riboprobe of anti-sense polarity. Slots 1, 2, 3 and 4 represent the following:
1. In-vitro synthesized positive sense HEV RNA (Positive control)
2. In-vitro synthesized negative sense HEV RNA (Negative control)
3. RNA isolated from HepG2 cells transfected with full-length in-vitro transcribed HEV RNA
4. RNA isolated from HepG2 cells transfected with pSGI vector as control.
Panel B of Figure 4 shows hybridisation with alpha 33P UTP labelled riboprobe of sense polarity. Slots 1, 2, 3 and 4 represent the following:
1. In-vitro synthesized HEV RNA of negative sense polarity (Positive control)
2. In-vitro synthesized HEV RNA of positive sense polarity (Negative control)
3. RNA isolated from HepG2 cells transfected with full-length in-vitro transcribed HEV RNA
4. RNA isolated from HepG2 cells transfected with pSGI vector as control.
In-vitro synthesized sense and anti-sense HEV RNA were used to validate the specificity of the hybridisation method in detecting the strands. The presence of anti-sense HEV RNA was reconfirmed by hybridisation.
EXAMPLE 5
Detection of Viral Proteins in Transfected Cells
The cells transfected with HEV RNA according to the procedure of Example 3 were lysed in 750 μl of RIPA buffer (10 mM Tris-HCl pH 8.0, 140 mM NaCl, 5 mM iodoacetamide, 0.5% Triton X-100, 1% SDS, 1% sodium deoxycholate, 2 mM phenylmethylsulfonyl fluoride). The clarified lysate was incubated with 7 μl of anti-ORF2, anti-ORF3 or anti-ORF1 antibodies (putative anti-methyl transferase (anti-met), putative anti-helicase (anti-hel) and putative anti-RNA dependent RNA polymerase (anti-RdRp)) independently on ice for one hour. The polyclonal antibodies were raised in rabbits against the structural proteins (pORF2, pORF3) and components of the non-structural polyprotein pORF1 (putative methyl transferase, helicase and RdRp regions) as described elsewhere [2, 40]. The antigen-antibody complexes formed were further incubated with 100 μl of a 10% suspension of pre-swollen Protein A Sepharose-4B (Pharmacia, Uppsala, Sweden) and the reaction mixture was kept at 4°C with slow end-to-end shaking. After one hour, the reaction mixture was centrifuged for one minute at 10,000 rpm in a refrigerated microcentrifuge (Hermle, Germany). The supernatant was discarded as radioactive waste and the beads were washed thrice with 1 ml of RIPA buffer for 10 minutes at 4°C with shaking. The complex was boiled with 50 μl of 2X SDS-PAGE sample buffer (50 mM Tris-HCl pH 6.8, 2% SDS, 5% 2-beta-mercaptoethanol, 0.1% bromophenol blue) and analyzed on an SDS 6-15% gradient PAGE. The gel was treated with 0.5 M sodium salicylate, washed, dried and exposed for autoradiography as described above.
Results: The HEV ORF2 and ORF3 proteins (pORF2 and pORF3) were detected by immunoprecipitation with their corresponding specific antibodies. An autoradiograph showing immunoprecipitation of the structural proteins from the HEV RNA transfected HepG2 cells is depicted in Figure 5A of the drawings. The arrow-marked signals in lanes 2 and 4 represent the pORF2 (~72 kDa) and pORF3 (13.5 kDa), respectively. The pSGI transfected cells were immunoprecipitated with anti-ORF2 and anti-ORF3 antibodies (Figure 5A, lanes 1 and 3) to serve as control. The immunoprecipitates were analyzed on SDS 6-15% gradient PAGE and visualized by fluorography. The molecular sizes of 14C labelled markers (Amersham Int., UK), in kilodaltons, are indicated on the right.
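The apparent size of pORF3 can be compared with a simple estimate from sequence length. The Python sketch below assumes that the 123-residue amino acid sequence given earlier in this description (beginning MNNMSFAAPMGSRPW) is pORF3, which is not stated explicitly there, and applies the common rule of thumb of about 110 Da per residue; the constant and the function name are assumptions made for illustration, and post-translational modification is ignored.

```python
# Sketch: rough molecular mass of a protein from its length.
# Assumes the 123-residue sequence listed earlier is pORF3 and uses an
# average residue mass of ~110 Da (rule of thumb, not from the description).

PUTATIVE_PORF3 = (
    "MNNMSFAAPMGSRPWALGLFCCCSSCFCLCCSRHRPVSRLAAYVGGAA"
    "AVPAVVSGVTGLILSPSQSPIFIQPTPSPRMSPLRPGLDLVFANPSDHSAPLGATRP"
    "SAPPLPHVVDLPQLGPRR"
)

AVERAGE_RESIDUE_DA = 110.0


def estimated_mass_kda(protein: str) -> float:
    """Estimate molecular mass in kDa from the number of residues."""
    return len(protein) * AVERAGE_RESIDUE_DA / 1000.0


mass_kda = estimated_mass_kda(PUTATIVE_PORF3)
print(f"{len(PUTATIVE_PORF3)} residues, about {mass_kda:.1f} kDa")
```

Under these assumptions the estimate is close to the 13.5 kDa signal assigned to pORF3 in Figure 5A.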
The signals corresponding to putative domains of methyl transferase (~35 kDa) and helicase (~38 kDa) could be detected by immunoprecipitation. These are autoradiographically depicted in Figure 5B of the drawings. The putative helicase and methyl transferase were detected in samples prepared at 72 hours post transfection. The autoradiograph of Figure 5B shows immunoprecipitation of putative domains of the non-structural polyprotein corresponding to methyl transferase and helicase from the HEV RNA transfected HepG2 cells. The arrow-marked signals in lanes 2 and 3 represent the signals corresponding to putative methyl transferase (~35 kDa) and helicase (~38 kDa), respectively. The mock transfected pSGI cells were immunoprecipitated with anti-met and anti-hel antibodies (Figure 5B, lanes 1 and 4) to serve as control. The immunoprecipitates were analyzed on SDS 6-15% gradient PAGE and visualized by fluorography. Molecular size markers (in kilodaltons) are indicated on the right (Rainbow markers, Amersham Int., UK).
There were additional unique high molecular weight bands in the immunoprecipitation experiments using antibodies against putative helicase and methyl transferase domains. These may be intermediates in the processing of the ORF1 protein (Figure 5B).
Signals corresponding to the putative domain of RdRp (~36 kDa) could be detected by immunoprecipitation at 12, 24 and 36 hours post transfection. The autoradiograph of Figure 5C shows immunoprecipitation of putative domains of the non-structural polyprotein corresponding to RNA dependent RNA polymerase (RdRp) from the HEV RNA transfected HepG2 cells. The immunoprecipitation was carried out using anti-RdRp antibodies at the stated 12, 24 and 36 hours post transfection. For control, the HepG2 cells were transfected with pSGI vector and immunoprecipitated at 36 hours with the same antibody. The immunoprecipitates were analyzed on SDS 6-15% gradient PAGE and visualized by fluorography. Molecular size markers (in kilodaltons) are indicated on the right (Rainbow markers, Amersham Int., UK). Putative RdRp could not be detected in transfected cells at 72 and 96 hours (data not shown).
In all the immunoprecipitation experiments, pSGI transfected HepG2 cells were used as control and specific polypeptide species were found to be missing from these control cells.
For immunofluorescence studies, transfection was carried out on cells grown on cover slips (30 mm diameter). After 72 hours, the cells on coverslips were washed twice with phosphate buffered saline (PBS, pH 7.2) and fixed with 4% para-formaldehyde at room temperature for 30 minutes. Following fixation, the cells were incubated with 0.1% saponin (Sigma, USA) on ice for 10 minutes, washed with PBS and incubated with 1:100 diluted anti-pORF2, anti-pORF3, anti-met and anti-hel antibodies for one hour in a humid chamber at 37°C. The cells were washed with PBS and further incubated with goat anti-rabbit FITC (fluorescein isothiocyanate) conjugate (1:100 dilution) (Sigma, USA) at 37°C in the humid chamber for 45 minutes. After three washings with PBS, the coverslips were mounted on glass slides and observed under a confocal microscope (Bio-Rad, USA). Similarly, immunofluorescence labelling of transfected cells with anti-RdRp antibody was carried out 24 hours post transfection. Cells transfected with the vector pSGI served as negative control.
Both the structural (pORF2 and pORF3) and non-structural proteins (corresponding to putative domains of Met, Hel and RdRp) were demonstrated in the transfected cells following staining with their corresponding antibodies. The results are depicted in Figure 6 of the drawings, which is a composite photograph showing immunofluorescent antibody staining in in-vitro synthesised HEV RNA transfected HepG2 cells and control cells transfected with pSGI vector. Panels B, D, F, H and K of Figure 6 represent the immunofluorescent staining of the HEV transfected HepG2 cells with anti-pORF2, anti-pORF3, anti-met, anti-hel and anti-RdRp antibodies, respectively. Panels A, C, E, G and J represent the immunostaining of the control HepG2 cells with the same antibodies corresponding to the test panel. All the immunostaining was carried out at 72 hours post transfection, except that with the anti-RdRp antibodies, which was performed at 24 hours post transfection.
None of the antibodies showed any significant signal with control cells transfected with vector plasmid pSGI (Figure 6).
EXAMPLE 6
Experimental Infection of Rhesus Monkeys (M. mulatta) by Inoculation with Culture Supernatant from HEV RNA Transfected Cells
Ethical clearance was obtained from the primate research facility of the All India Institute of Medical Sciences for experimentation on rhesus monkeys. The animals (M-1690, M-1761, M-1927 and M-2197) were put under quarantine and any prior infection was ruled out. Pre-inoculation blood was collected aseptically. The sera were analyzed for the presence of IgM anti-HEV and IgG anti-HEV, and for HEV RNA by RT-PCR.
Animals negative for these markers were used for further experiments. One rhesus monkey (M-1690) was injected intravenously with 6 ml of pooled culture supernatants from HEV RNA transfected HepG2 cells (prepared according to the procedure of Example 3) collected at 72 hours. The control monkey (M-1761) was injected only with PBS and kept under the same conditions. Two other monkeys (M-1927 and M-2197) were injected with 100 μg of HEV RNA (transcribed according to Example 2), together with animal injectable RNase inhibitor (Promega, USA), into the liver at multiple sites after performing a mini-laparotomy. After the inoculation, blood samples were collected aseptically twice a week and serum was stored at -70°C for further use. The alanine aminotransferase (ALT) and aspartate aminotransferase (AST) values were monitored using a commercial assay kit (Boehringer Mannheim, Germany). HEV antibodies were detected by an in-house ELISA system using recombinant HEV proteins (pORF2, pORF3 and pORF1) (SK Panda, unpublished). The presence of HEV RNA was investigated by RT-PCR as described earlier [22].
Results: Following inoculation of animal M-1690, HEV RNA was observed by RT-PCR in the sera collected between days 24 and 37. During this period (24-37 days) the AST and ALT values increased to between 1.5 and 2.5 times the normal level (53-100 IU/liter). IgM class anti-HEV antibodies directed against the ORF1, ORF2 and ORF3 viral proteins were detected after 4 weeks and persisted for the next 14 days. The ratio of optical density (OD) between pre-inoculation and positive sera was in the range of 1:8 to 1:15, which is typical for HEV infection in rhesus monkeys. The animals (M-1927, M-2197) which received in-vitro produced HEV RNA, as well as the control monkey (M-1761), remained normal, with no rise in ALT and AST values and no seroconversion for antibodies. They also remained negative for HEV RNA in serum (viremia) throughout the follow-up period.
The IgG anti-HEV antibodies were detected in the infected monkey (M-1690) 3 months after the inoculation.
Figure 7 of the drawings represents a photograph of a 2% agarose gel depicting the RT-nested PCR products (343 bp) for the HEV genome amplified from the serum of the infected rhesus monkey Macaca mulatta (M-1690). Lane P indicates the positive control, Lane N indicates the control monkey (M-1761) and Lane M represents the 100 bp DNA ladder (Life Technologies, USA). Lanes 1, 2, 3, 4 and 5 represent the serum collected from monkey M-1690 on days 24, 28, 33, 37 and 43, respectively, post injection. The arrow indicates amplified fragments from the HEV genome.
EXAMPLE 7
Computer Derived Structural Analysis of the 3' end of HEV RNA
Sequences of five geographically different HEV isolates, from India (AF076239), Pakistan (M80851), Myanmar (M73218), China (L08816) and Mexico (M74506), were analyzed for conserved domains within the 3' non-coding region (NCR) using Clustal (DNAStar Inc, USA). Subsequently, 300 nucleotides at the 3' end followed by the poly(A) stretch were subjected to secondary structure prediction using the MFOLD 2.3 program [66]. RNA parameters were used as given by Walter [60]. All the optimal and sub-optimal conformers predicted by MFOLD were visualized by ESSA, an interactive tool for analyzing RNA secondary structure [7]. A more elaborate secondary structure analysis was carried out for the conserved stem-loop structures found within the last 110 bases at the 3' end of various isolates with added poly(A) sequences. These were equivalent to nucleotide positions 7084-7194-A11 of the Indian isolate (AF076239).
Results: The nucleotide sequence analysis of the 3' non-coding region of the full-length HEV sequences in Genbank [India (AF076239), Pakistan (M80851), Myanmar (M73218), China (L08816) and Mexico (M74506)] showed a high degree of homology, with very few C→U variations in the Pakistani, Myanmar and Chinese isolates and a single A→G variation in the Myanmar isolate (see Figure 8A of the drawings). The Mexican isolate had 9 extra nucleotides at the 3' end and showed several variations within the sequence. Secondary structure prediction of the last 300 nucleotides at the 3' end indicated the presence of a number of stems and loops, of which two stem-loop structures at the extreme 3' end were found to be conserved amongst all viral isolates, with minor differences in the Mexican isolate. This conserved region spans nucleotide positions 7084 to 7194 in the Indian isolate of HEV and its predicted structure is shown in Figure 8A. Stem-loop structures 1 and 2 (SL1, SL2) comprise nucleotides 7173-7194 and 7089-7163, respectively, separated by a single strand region (SS) of nt 7164-7172. The SL1 structure involves the poly(A) stretch at its 3' end in base-pairing. All the nucleotide variations in other isolates were found in the bulge regions (Figure 8A). In the Mexican isolate, SL1 involves four A residues in base-pairing instead of three as observed in the others, and the SL2 structure also bifurcates into two hairpin loops, unlike in the other isolates (data not shown).
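The coordinates quoted above for the conserved 3' end can be captured in a small lookup, which is convenient when checking which structural element a given nucleotide falls in. The following is a minimal Python sketch and not part of the original analysis: the function name `region_of` and the dictionary layout are illustrative assumptions, positions follow the numbering of the Indian isolate (AF076239), and any position beyond nt 7194 is simply reported as poly(A), even though part of that stretch is described as pairing within SL1.

```python
# Structural elements of the conserved 3' end (numbering of AF076239),
# using the coordinates quoted in the text. Names are illustrative.
REGIONS = {
    "SL2": (7089, 7163),
    "SS": (7164, 7172),
    "SL1": (7173, 7194),
}
GENOME_END = 7194  # last genomic nucleotide before the poly(A) stretch


def region_of(nt: int) -> str:
    """Return the structural element that contains nucleotide position nt."""
    if nt > GENOME_END:
        return "poly(A)"
    for name, (start, end) in REGIONS.items():
        if start <= nt <= end:
            return name
    return "outside SL1/SS/SL2"


if __name__ == "__main__":
    for position in (7090, 7168, 7180, 7196):
        print(position, region_of(position))
```

Such a lookup is only a bookkeeping aid; it does not predict structure and would need different coordinates for the Mexican isolate, whose 3' end differs.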
EXAMPLE 8
Construction of Recombinant Plasmids and Sequencing
A PCR-based strategy was used to generate the cDNA clones for the production of RNA transcripts corresponding to the 3' end of the HEV genome and its various mutant forms. A schematic representation and description of the constructs is shown in Figure 9 of the drawings. Forward and reverse primers were designed using the OLIGO 4.0 software and synthesized on an automated DNA synthesizer (Model 392, Applied Biosystems). These primers are identified in Table I at the end of this description. The primers were used to amplify regions of the 3' end using a HEV cDNA 6046-7194 nt pCRScript clone [40] as a template [which was later used for the construction of the infectious HEV cDNA clone pSGI-HEV(I) of the present invention] and a combination of Taq (Promega) and Pfu (Stratagene) DNA polymerases. The forward primer contained the T7 promoter sequence immediately upstream of the HEV sequence. The amplified products were gel purified, polished with the Klenow fragment of DNA polymerase (Amersham International) and phosphorylated with polynucleotide kinase (Amersham International). Subsequently, these fragments were cloned into the pUC18 vector (Ready-To-Go pUC18 Sma I/BAP + ligase; Pharmacia Biotech, Sweden) and sequenced by Sanger's dideoxy chain termination method using the Sequenase version 2.0 sequencing kit (Amersham International).
EXAMPLE 9
Production of RNA Transcripts
For the generation of run-off transcripts, plasmids were linearized by cutting with the appropriate restriction enzyme at the 3' end of the insert. The Xho I restriction enzyme site present in the 3' end primer was used to linearize the clones [3'(+)An] and [s3'(+)An], and the Bam HI site of the vector MCS was used for the other mutants (Table I). Correct production of RNA transcripts was ensured by linearizing the clones with restriction enzymes giving rise to 5' overhangs and not 3' overhangs, which might otherwise lead to transcription of the wrong strand of DNA. The in-vitro transcription reaction was performed with T7 RNA polymerase (Life Technologies Inc., USA) in the presence of 500 μM each of ATP, CTP and GTP along with 1 μM UTP and 50 μCi [32P] UTP (2500 Ci/mmol; Amersham, UK and BARC, India) and RNasin (Promega, USA) at 37°C for 1 hr. The reaction mixtures were then treated with RNase-free DNase I (Amersham, UK), extracted with phenol:chloroform and precipitated twice with ethanol. Soluble and TCA precipitable counts were measured in a liquid scintillation counter (Beckman, USA) to estimate the incorporation level. The total amount of RNA synthesized was determined from the amount of limiting rNTP (UTP) present in the reaction, the maximum theoretical yield and the percent incorporation. The specific activity of the probe was expressed as the total incorporated cpm/total μg of RNA synthesized [43]. Up to 60% incorporation of the [32P] UTP was routinely achieved, resulting in labelled transcripts with a specific activity of typically 10^8 cpm/μg. The transcription reaction generated transcripts of the expected size as visualized on an 8% denaturing polyacrylamide gel. Unlabelled RNA was similarly transcribed using 500 μM of each of the NTPs, phenol:chloroform extracted and ethanol precipitated.
EXAMPLE 10
Metal Ion and Enzymatic Probing of RNA Secondary Structure
In order to confirm the predicted structure at the 3' end, Pb2+ ion induced cleavage and RNase probing of the RNA transcript of the 3' end with a poly(A) stretch [3'(+)An] was performed. The in-vitro transcribed RNA from the Hind III linearised clone [3'(+)An] was treated with calf intestinal alkaline phosphatase (Promega) and purified on an 8% polyacrylamide/8 M urea gel. The RNA bands were visualized by UV shadowing using a fluorescent TLC plate, cut and eluted overnight in a buffer containing 1% SDS and 2.5 M ammonium acetate. The transcripts were ethanol precipitated, 5' end labelled using T4 polynucleotide kinase (Promega) and [γ-32P] ATP (5000 Ci/mmol; BARC, India), and further purified on an 8% polyacrylamide/8 M urea gel to remove the unincorporated nucleotides. Autoradiographic exposure of 1 min enabled the visualization of the labelled bands. Prior to reaction, the 32P-labeled RNA transcripts were supplemented with carrier tRNA (Calbiochem) to a final concentration of 8 μM and subjected to a denaturation/renaturation procedure by heating the samples at 65°C for 3 min and slow cooling to 25°C over 60 min. The renaturation step was performed in a buffer containing 10 mM Tris/HCl, pH 7.2, 10 mM MgCl2 and 40 mM NaCl for lead ion induced cleavage, and in the respective enzyme buffers for the enzymatic cleavage reactions. Subsequently, Pb(OAc)2 (Sigma) solution was added to a final concentration of 2.5 mM or 5 mM, and the digestion was conducted at 25°C for 15 min. For enzymatic cleavage, Ribonuclease ONE (0.1 U) (Promega), S1 nuclease (0.4125 U) (Amersham International) and Mung Bean nuclease (0.1 U) (Amersham International) were independently added and the reaction proceeded at 25°C for 10 min. All the reactions were quenched by adding an equal volume of 7 M urea, 10 mM EDTA solution and loaded on a 15% polyacrylamide/7 M urea gel at ~10^4 cpm/well. Electrophoresis was performed in 0.5X TBE buffer at 2000 V for ~1.5 h and ~4 h for short and long runs, respectively. Autoradiography was performed at -80°C with an intensifying screen (DuPont). The RNA cleavage products were run along with the products of alkaline RNA hydrolysis and limited ribonuclease T1 (Calbiochem) digestion of the same RNA. The alkaline hydrolysis ladder was generated by incubating the RNA with formamide in boiling water for 15 min. Partial T1 digestion was performed under denaturing conditions (50 mM sodium citrate, pH 4.5, 7 M urea) with 0.05 U of enzyme at 55°C for 10 min. The above procedure was carried out after standardization of the reaction, gel electrophoresis and autoradiography conditions.
Results: Catalysis by Pb2+ is known to be specific for single stranded regions, loops and bulges, leaving out the double stranded RNA stretches [10, 11, 12]. In the present structure probing studies, Pb2+ hydrolysis gave a more descriptive picture of the structure than the enzymatic probing utilizing different single strand cutting ribonucleases such as Ribonuclease ONE, S1 nuclease and Mung Bean nuclease. This might be because the smaller Pb2+ ions have better access to the folded RNA structure than the enzymes. The digestion pattern observed is summarized in Figure 8B and a representative structure probing gel is shown in Figure 10.
Figure 8B shows the nucleotides within the native RNA that were determined to be susceptible to Pb(II) and RNases. These are labelled p-Pb2+; o-Ribonuclease ONE; s-S1 nuclease; m-Mung Bean nuclease. Nucleotides prone to Pb(II) hydrolysis in the RNA-RdRp complex are shown in brackets [ ].
With reference to Figure 10, cleavage products obtained by hydrolysis of 5' end labeled RNA transcripts with Pb2+ ions and digestion with single strand specific RNases (Ribonuclease ONE, S1 nuclease and Mung Bean nuclease) were separated on a 15% polyacrylamide gel and given short and long runs. Prior to the reaction, the labeled RNA transcripts were supplemented with tRNA to a concentration of 8 μM and subjected to the denaturation/renaturation procedure. The results according to the lanes shown in Figure 10 were as follows:
Lane A: control incubation of the RNA without Pb2+ ions or RNases.
Lane B: limited RNase T1 hydrolysis under denaturing conditions, cutting at every G residue. Some of the numbers of G residues in the RNA sequence are depicted on the left.
Lane C: formamide ladder.
Lanes D & E: Pb (II) induced hydrolysis at 2.5 and 5 mM Pb(OAc)2.
Lane F: same as Lane E (5 mM Pb(OAc)2) but with the omission of the denaturation/renaturation procedure.
Lanes G, H & I: digestion with Ribonuclease ONE, S1 nuclease and Mung Bean nuclease, respectively.
Pb2+ and RNases, including S1 nuclease and Mung Bean nuclease, cut at every single nucleotide from nt 7164 to 7172, which forms a single strand region (SS) in the computer predicted structure model. Occasional cuts within this region were observed with Ribonuclease ONE too. Pb2+ ions could identify and thus confirm the loops and bulge regions in SL1 and SL2, the reactivities being shown by nt 7177, 7182-83, 7189-93 and 7198-99 in the SL1 structure and nt 7094-97, 7113-19, 7125-27, 7130-33, 7136, 7141, 7154 and 7159 in the SL2 structure. Occasional hydrolysis was observed at either end of the loops, such as at nt 7098, 7100, 7111, 7112 and 7124, pointing towards some conformational flexibility in these regions. Different RNases cleaved at the bulge and loop regions of SL1 and SL2 as depicted in Figure 8B. Prior to the reaction with Pb(OAc)2, the end labelled RNA transcripts were denatured and then cooled slowly for efficient folding and to obtain the maximum number of native molecules. However, omission of this procedure had no effect on the probing profiles obtained according to Figure 10, suggesting that the RNA secondary structure as determined in these experiments forms rapidly and reproducibly under the chosen conditions.
Various point mutants and deletion mutants were constructed as shown in Figure 9, based on the secondary structure model of Figure 8A. The predicted structure of the mutants is demonstrated in Figure 11.
EXAMPLE 11
Cloning, Expression and Purification of the HEV RdRp Domain
The putative RdRp coding region covering nucleotide positions 3546-5106 in the HEV genome [28] was amplified as a larger fragment (3493-5163 nt) by PCR using a combination of Taq (Promega) and Pfu (Stratagene) DNA polymerases from a HEV cDNA clone comprising nucleotides 2346-5163 in the pCRScript vector [2] [which was later used for the reconstruction of the full-length HEV cDNA infectious clone pSGI-HEV(I) of the present invention]. The primers used were: Forward primer, ACAGctcgagcccgggGCATGATTCAGTCG, nucleotides 3493 to 3523 (Genbank AF076239); Reverse primer, GCGaagcttCg taccTGGTCGCGAACCCATGG, nucleotides 5163 to 5131 (Genbank AF076239). The forward primer sequence was modified at the 5' end to create the restriction enzyme sites for Xho I and Sma I (lower case nt in the primer sequence) and the start codon ATG. Similarly, the reverse primer was modified at the 5' end to incorporate the restriction enzyme sites for Kpn I and Hind III (lower case nt in the primer sequence). The amplified fragment was cloned into the pGEM-T vector (Promega) and sequenced as described earlier. It was then subcloned into the expression vector pRSET (Invitrogen) as an Xho I-Kpn I fragment. The recombinant pRSET C-RdRp clone was transformed into E. coli BL21-DE3 (Stratagene) and an overnight (O/N) culture was grown in Luria-Bertani (LB) medium. The overnight culture was inoculated at 1% into NZ-Amine medium supplemented with 0.4% glucose, 1X M9 salts, 1 mM MgSO4 and 50 μg/ml ampicillin, and the culture was incubated at 37°C with shaking. Isopropyl β-D-thiogalactopyranoside (IPTG) was added at 1 mM concentration when the culture attained an optical density of 0.5. The culture was further incubated at 37°C with shaking for 3 hours. Cells were pelleted and solubilised in 2X Laemmli sample buffer (100 mM Tris-HCl, pH 6.8, 2% β-mercaptoethanol, 4% SDS, 0.2% bromophenol blue, 20% glycerol), and the extracts were separated on a sodium dodecyl sulfate (SDS)-10% polyacrylamide gel along with a high molecular weight marker (Life Technologies). The protein bands were visualized by Coomassie blue staining. Specificity of the expressed protein was determined by immunoblotting with either rabbit sera raised against E. coli expressed RdRp [2] or patient sera as the source of primary antibodies, and peroxidase conjugated swine anti-rabbit and anti-human immunoglobulins (Sigma), respectively, as secondary antibodies. Colour development was carried out with 3,3'-diaminobenzidine (Sigma) as the substrate.
The over-expressed hexa-histidine tagged recombinant RdRp protein was purified on Ni2+-NTA-Agarose (Qiagen, Germany) as per the manufacturer's instructions. The protein was refolded while immobilized on the column to avoid the formation of misfolded aggregates. Renaturation was carried out over a period of 1.5 hours using a linear 6 M-1 M urea gradient in 500 mM NaCl, 20% glycerol, Tris/HCl pH 7.4, containing phenyl-methyl-sulfonyl-fluoride (PMSF) (Sigma) as a protease inhibitor. After refolding, the bound protein was eluted by the addition of 250 mM imidazole (Sigma). The eluted fractions were checked on an SDS-10% polyacrylamide gel and those containing high amounts of RdRp were further concentrated using the Centrisart I starter kit (20,000 dalton cut-off) (Sartorius, Germany). Protein concentration was determined by the Bradford assay (Bio-Rad).
Results: Putative RdRp was expressed as a 63 kDa His-tagged protein in E. coli, as shown in Figures 12A and 12B. Figure 12A depicts the Coomassie brilliant blue stained SDS-10% polyacrylamide gel showing the expression and purification of recombinant HEV RdRp protein in E. coli as a 63 kDa band. E. coli BL21(DE3) cells were transformed with pRSET C vector containing the HEV RdRp coding sequence (nt 3493-5163) in-frame and induced with 1 mM IPTG. The expressed RdRp protein containing the 6 His tag was purified on a Ni-NTA column. U-uninduced lane; I-induced lane; P-lane showing the band of purified RdRp protein; M-high molecular weight marker (Life Technologies). The arrowhead indicates the expressed protein band. The 63 kDa band is composed of ~59 kDa RdRp and ~4 kDa His tag from the vector sequence.
Figure 12B shows the western blot of E. coli expressed recombinant HEV RdRp with patient sera and rabbit sera raised against RdRp. Lanes 1 & 2: immunoblot with patient sera of the uninduced and induced lanes of RdRp, respectively. Lanes 3 & 4: immunoblot with preimmune rabbit sera of the uninduced and induced RdRp lanes, respectively. Lanes 5 & 6: immunoblot with rabbit sera raised against RdRp [2] of the uninduced and induced lanes of RdRp, respectively.
EXAMPLE 12
Preparation of Cell Extracts
HepG2, Cos-7 and Hep2 cells were cultured in Dulbecco's Modified Eagle's Medium (DMEM) (Life Technologies Inc., USA) supplemented with 10% fetal bovine serum (Life Technologies Inc., USA) in a 5% CO2 atmosphere. Confluent layers of HepG2 cells grown in 25 cm2 flasks were washed twice with ice cold phosphate buffered saline (PBS), then harvested with a rubber policeman and pelleted by centrifugation at 3000 rpm in the 220.87VO1 rotor of a Hermle 323K centrifuge (Germany) for 2 min at 4°C. The cell pellet was re-suspended in 100 μl of resuspension buffer (20 mM Tris-HCl (pH 7.5), 2 mM DTT, 20% glycerol and 400 mM KCl). The cells were lysed with three cycles of freezing at -80°C and thawing on ice. The cell debris was removed by centrifugation at 12,000 rpm for 10 min at 4°C in the same rotor and centrifuge, and the supernatant, with a protein concentration of 5-10 μg/μl, was used as the cell lysate.
HepG2 cells transfected with full-length HEV genome RNA [of the present invention] were maintained similarly and the whole cell extract was prepared as described above.
EXAMPLE 13
Electrophoretic Mobility Shift Assay (EMSA)
EMSA was standardized for protein concentration, buffer conditions, reaction temperature and time, type of gel and electrophoresis conditions. In the standard binding assay, 10 μg of whole cell extract protein or 5 μg of purified RdRp and ~1 ng of [32P] UTP labelled RNA probe (~10^3 cpm) were incubated at 28°C for 20 min in a buffer containing 10 mM Hepes (pH 7.6), 0.3 mM MgCl2, 60 mM KCl, 2% glycerol and 1 mM DTT. Prior to addition of the labelled RNA probe, 80 μg or 20 μg of E. coli tRNA (Calbiochem, USA) was added to the reaction mixture containing cell extract or RdRp, respectively, and incubated for 10 min at 28°C. RNA-protein complexes were then subjected to electrophoresis on a 5% nondenaturing polyacrylamide gel (acrylamide:bisacrylamide ratio = 60:1) containing 5% glycerol in 0.5X Tris-Borate-EDTA (TBE) buffer at 250 V at 4°C. Gels were fixed, dried and autoradiographed.
Partially purified and refolded RdRp was added to the binding reaction mix at concentrations ranging from 1 μg to 20 μg. Similar dose experiments with cellular proteins were performed in the range of 2.5 to 20 μg total protein.
To confirm the specificity of the RdRp-RNA and cellular protein-RNA interactions, specific and non-specific competitor RNAs were added to the binding reaction mix prior to addition of the probe. Unlabelled RNA corresponding to the probe as a specific competitor and E. coli tRNA as a non-specific competitor were used at a molar excess of 5-50 fold with respect to the probe. Human placental mRNA and oligo dT (18-mer) were also used at a molar excess of 20-60 fold as non-specific competitors to study the specificity of the RdRp-RNA interaction.
A supershift assay was performed to further confirm the specificity of the RdRp-RNA interaction. The RdRp-RNA complex in the binding mix was incubated with the rabbit sera raised against E. coli expressed RdRp for 20 min at 30°C. The rabbit sera had earlier been verified for the presence of anti-RdRp antibodies by immunoblotting [2]. The RdRp-RNA complex was also incubated with pre-immune sera in a separate reaction mix to be used as a control. The reaction mix was then subjected to nondenaturing polyacrylamide gel electrophoresis as described above.
Quantitative estimation of binding was performed by measuring the ratio of RNA probe bound to RdRp to the unbound probe under optimal binding conditions and a known RdRp concentration. This was accomplished by cutting out the gel pieces corresponding to the bound and unbound probe and measuring the counts by scintillation. Percent binding was determined as follows:
Percent binding = [bound probe / (bound probe + unbound probe)] x 100
Results: Binding of RdRp was observed with the wild type transcript with the poly(A) tail [3'(+)An] (Figure 13A, lane 2), but not with a similar RNA lacking the poly(A) stretch [3'(-)] (Figure 13A, lane 4). The binding of RdRp to [3'(+)An] RNA showed a dose response, since the band intensity increased on increasing the amount of RdRp for a fixed amount of RNA probe (Figure 13B). The binding showed a saturation effect, and an optimal binding condition of 5 μg RdRp protein for 1 ng (10^3 cpm) of input RNA was used for subsequent experiments. The specificity of the complex formation between RdRp and [3'(+)An] was evaluated in competition assays (Figures 13C and 13D). While a specific competitor, unlabelled [3'(+)An] RNA, disrupted the RdRp-RNA complex completely at a 5-10 fold molar excess (Figure 13C, lanes 3 to 6), a non-specific competitor such as E. coli tRNA failed to disrupt the complex even at a 50-fold molar excess (Figure 13C, lanes 7 to 10). To rule out the possibility that the poly(A) tail itself confers all specificity to this RdRp-RNA interaction, human placental mRNA was also used as a non-specific competitor. It could not disrupt the RdRp-[3'(+)An] RNA complex even at a 60-fold molar excess (Figure 13D, lanes 3 and 4). In another competition assay, oligo dT also failed to compete with the RdRp-[3'(+)An] RNA complex at a 60-fold molar excess (Figure 13D, lanes 5 and 6). In the supershift assay (Figure 13E), addition of rabbit anti-RdRp sera shifted the specific complex (C) to a higher position in the gel (band marked S in Figure 13E, lane 3). This shifted complex was absent with the pre-immune sera, though a non-specific band was observed with both sera above the complex S (Figure 13E, lanes 3 and 4).
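The percent binding calculation used for the quantitative estimation can be written out as a short function. This is a minimal Python sketch of the stated formula only; the function name `percent_binding` and the example counts are hypothetical and are not values from the experiments described here.

```python
def percent_binding(bound_cpm: float, unbound_cpm: float) -> float:
    """Percent of probe found in the complex: bound / (bound + unbound) x 100."""
    total = bound_cpm + unbound_cpm
    if total <= 0:
        raise ValueError("total counts must be positive")
    return 100.0 * bound_cpm / total


if __name__ == "__main__":
    # Hypothetical scintillation counts, for illustration only.
    print(round(percent_binding(300.0, 700.0), 1))  # prints 30.0
```

In practice, background counts from an empty gel piece would normally be subtracted from both values before applying the formula; that step is an assumption here and is not described in the text.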
To identify domains within the 3' end of the HEV genome involved in interaction with RdRp, a mutational study was carried out. Various deletion and point mutations of the 3' end transcripts were made and assayed for their binding activity with the purified viral RdRp. The results are summarized in Figure 14 of the drawings. A major deletion of nucleotides 7163-7194 (the last 32 nucleotides) from the 3' NCR together with omission of the 3' poly(A) sequences [3'(+)D] (Figure 11 ii) completely abolished RdRp binding, indicating the requirement of this region for complex formation. Further, no binding was observed with the wild type 3' end lacking only the 3' poly(A) sequences [3'(-)] (Figure 11 i), even with increased concentrations of RdRp. This clearly suggested that the 3' end poly(A) sequences are essential for binding of RdRp to the 3' non-coding region. From the secondary structure models (Figure 11 i) it is apparent that removal of poly(A) disrupts the necessary structure in SL1, which may be required for binding. To further analyze the 3' end-RdRp interaction, two major deletions in the 3' end were generated {[3'(+)dAn] and [s3'(+)An]} (Figures 11 iii and 11 iv). While both of these mutants contained the poly(A) stretch at their 3' ends, mutant [3'(+)dAn] was deleted of nt 7173-7194 and lacked SL1, and mutant [s3'(+)An] was deleted of nt 7084-7138 and did not form SL2. Both these mutants showed less than 5% of the binding observed with the wild type 3' end containing poly(A) sequences. Dose response experiments further indicated that amounts of RdRp lower than the optimal level (5 μg) failed to bind to these mutants even to such a low degree (data not shown). This clearly suggested that the integrity of the SL1 and SL2 domains, along with the presence of the poly(A) tail, is necessary for binding of the viral RdRp to the 3' end of the HEV genome.
To determine whether the structure of SL1 is important for RdRp binding, two point mutants {[3'(+)M1] & [3'(+)M2]} (Figures 11 v and 11 vi) and two deletion mutants {[3'(+)M3] & [3'(+)M4]} (Figures 11 vii and 11 viii) were generated. These mutations compromised the interaction at the optimal concentration of RdRp to varying degrees. A single G→A point mutation at nucleotide 7194 in [3'(+)M1], which resulted in the disruption of SL1 by opening the interior loop and stem region (a), reduced the binding to 48%. A compensatory C→T point mutation at nucleotide 7176 in the [3'(+)M1] background was introduced in mutant [3'(+)M2], which restored base-pairing. This mutant showed a slightly increased binding efficiency of ~60%. The deletion mutant [3'(+)M3] lacking the interior loop showed only 13% binding, whereas the deletion mutant [3'(+)M4] lacking the hairpin loop still showed about 60% binding.
To explore the possibility of host cell proteins binding to the 3' end of HEV RNA, binding studies were performed with extracts prepared from liver as well as non-liver cells. The HepG2 cell line, which has a human hepatoblastoma origin, was chosen as a representative liver cell, and SV40-transformed monkey kidney Cos-7 and human epithelial Hep2 cells were used as representative non-liver cells. Total cell extracts were allowed to bind to the in-vitro transcribed [32P] UTP labelled RNA probes corresponding to the 3' end with added poly(A) sequences [3'(+)An] in the presence of an excess of tRNA to reduce non-specific binding. The amount of cell extract was titrated and it was found that 10 μg of total protein was the optimal amount to produce distinct RNA-protein complexes by EMSA (Figure 15A). Two major retarded bands were observed; one complex (C1) migrated close to the free probe (F) and another complex (C2) migrated far more slowly in the gel (Figure 15A, lanes 1 to 6). Thus, it is possible that more than one cellular protein binds to the 3' end. Unlike the binding of RdRp to the 3' end, binding of cell extract proteins did not require the presence of a poly(A) tail, as is apparent from binding studies with the probe lacking the poly(A) stretch [3'(-)] (Figure 15A, lanes 7 to 12), but removal of the 3'-terminal nt 7163-7194 together with the poly(A) tail, as in mutant transcript [3'(+)D], abrogated all binding activity (Figure 15A, lanes 13 to 18). Binding studies with different cell extracts (HepG2, Cos-7 and Hep2) demonstrated the formation of the major complex C2 with all cell extracts (Figure 15B, lanes 2 to 5), but the minor complex C1 was absent with Cos-7 cell extract (Figure 15B, lane 3). Competition assays were performed using unlabelled probes as specific competitors and E. coli tRNA as the non-specific competitor. Whereas a 5 to 10 fold molar excess of the specific competitor inhibited complex formation (Figure 16A, lanes 3 to 6 and Figure 16B, lanes 3 to 6), even a 50 fold molar excess of the non-specific competitor was unable to inhibit complex formation (Figure 16A, lanes 7 to 10 and Figure 16B, lanes 7 to 10), thus establishing the specificity of the complexes formed. Deletion of SL1 [3'(+)dAn, Δ7173-7194 nt] (Figure 11 iii) from the 3' end still allowed the formation of complex C1 but partially abrogated the formation of complex C2 (Figure 16C, lane 4). Deletion of SL2, as in [s3'(+)An, Δ7084-7138 nt] (Figure 11 iv), from the 3' end completely abolished complex C1 and showed reduced binding in complex C2 (Figure 16C, lane 6), thus indicating the regions of the 3' end that interact with cellular proteins.
EXAMPLE 14
UV-induced Crosslinking of RNA-protein Complexes
For the UV-crosslinking assay, the binding reaction was set up as described in Example 13 with 40 μg of cell extract protein and ~10^6 cpm of RNA probe. The reaction mixture containing the RNA-protein complexes was UV irradiated on ice for 30 min in a CL-1000 Ultraviolet Crosslinker (UVP Products limited, UK) at a distance of 5 cm from the UV source. 20 μg of RNase A (Sigma, USA) and 20 U of RNase T1 (Calbiochem, USA) were added and the reaction mix was incubated at 37°C for 30 min to digest the unbound RNA completely. UV-crosslinked products were boiled in 2X Laemmli sample buffer for 3 min and analysed on an SDS-12% polyacrylamide gel. Gels were fixed, dried and the complexes were analysed by autoradiography along with Rainbow molecular weight marker (RPN 756; Amersham, UK) for size estimation.
Results: To identify the sizes of the cellular proteins that bind to the 3' end of HEV genomic RNA, a UV-crosslinking experiment was performed. Following RNA-protein binding, the 32P label was transferred from the 3'(+)An RNA probe to proteins bound directly to it by subjecting the complex to UV irradiation, and the unbound regions of RNA were digested with RNase A and RNase T1. Proteins to which the 32P label was transferred by crosslinking were analysed on an SDS-12% polyacrylamide gel. Two major sets of polypeptides were found to cross-link to the 3'(+)An probe. One formed a doublet of ~45 kDa and the other a doublet of ~95-105 kDa on an SDS-polyacrylamide gel (Figure 17A, lane 2). Several other minor bands were also observed, which possibly represent other proteins of the complex. No such polypeptide bands were observed in the control lane lacking HepG2 cell extract (Figure 17A, lane 1).
To test whether the bound proteins were specific to cells of hepatic origin, binding and UV-crosslinking experiments were repeated with cell extracts derived from non-hepatic Cos-7 and Hep2 cells. While both polypeptides were observed to cross-link to the 3' end probe with extracts from HepG2 and Hep2 cells (Figure 17B, lanes 3 and 5), the ~45 kDa polypeptides were missing from the Cos-7 cells (Figure 17B, lane 4). Instead, the Cos-7 cell extract showed crosslinking of another polypeptide of slightly higher mobility. For the same amount of extract used, the level of the 45 kDa polypeptide was significantly higher in HepG2 (liver) cells than in Hep2 (non-liver) cells.
To look into the possibility of induction of these or other host proteins upon viral infection, binding and UV-crosslinking studies were also performed with the HepG2 cells transfected with the full-length HEV genome. No extra bound complexes or protein bands were observed in comparison to the uninfected cells (Figure 17B, lane 6).
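By way of illustration only, the size estimation of crosslinked polypeptides against a molecular weight marker can be sketched as a straight-line fit of log10(molecular weight) against relative mobility (Rf), which is a common approximation for SDS-polyacrylamide gels. The description does not state how the sizes were actually calculated, and the marker sizes and Rf values below are hypothetical placeholders, not the sizes of the Rainbow marker or measurements from Figure 17.

```python
import math


def fit_marker_curve(markers):
    """Least-squares fit of log10(MW) = intercept + slope * Rf.

    `markers` is a list of (Rf, MW in kDa) pairs read from the marker lane.
    """
    xs = [rf for rf, _ in markers]
    ys = [math.log10(mw) for _, mw in markers]
    n = len(markers)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_mw_kda(rf, intercept, slope):
    """Apparent molecular weight (kDa) of a band with relative mobility `rf`."""
    return 10 ** (intercept + slope * rf)


if __name__ == "__main__":
    # Hypothetical marker lane: (Rf, MW in kDa).
    hypothetical_markers = [(0.10, 220.0), (0.25, 97.0), (0.40, 66.0),
                            (0.55, 45.0), (0.70, 30.0), (0.85, 20.0)]
    a, b = fit_marker_curve(hypothetical_markers)
    for band_rf in (0.27, 0.56):  # hypothetical crosslinked bands
        print(band_rf, round(estimate_mw_kda(band_rf, a, b), 1))
```

Because a crosslinked RNA fragment remains attached to the protein after RNase digestion, apparent sizes obtained in this way are approximate.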
EXAMPLE 15
Pb(II)-Induced Cleavage of RNA-RdRp Complex
The 5' end labeled probe of [3'(+)An] was allowed to bind to RdRp as mentioned above and the complex was subjected to reaction with Pb(OAc)2 at final concentrations of 1.25, 2.5 and 5 mM at 25°C for 10 min. The reaction was stopped by the addition of EDTA to a final concentration of 10 mM, extracted with water-saturated phenol, and the RNA was precipitated with ethanol. The samples were dissolved in 7 M urea/10 mM EDTA solution and loaded on the gel as described above.
Results: Pb2+-induced hydrolysis carried out on the native 3' end RNA molecule and on the one complexed with viral RdRp further confirmed the nucleotides involved in such an interaction, as shown in Figure 8B and Figure 18. Figure 18 is an autoradiographic depiction of Pb(II)-induced cleavage of the 5' end labeled HEV 3' end RNA. The cleavage in the native form is presented in lanes D and E and as complexed with RdRp in lanes F to H. Lane A represents control incubation of RNA without Pb2+ ions. Lane B shows limited RNase T1 hydrolysis under denaturing conditions, cutting at every G residue. Some of the numbers of G residues in the RNA sequence are depicted on the left. Lane C is the formamide ladder. Lanes D and E show the reactions of the native form of RNA with 2.5 and 5 mM Pb2+. Lanes F, G and H show the reactions of the RNA-RdRp complex with 5, 2.5 and 1.25 mM Pb2+. Although this approach limited the identification of interacting nucleotides to the single stranded regions, loops and bulges, it was nevertheless helpful in corroborating the results obtained using different mutant forms of the 3' end. The nucleotides that could still be hydrolyzed by Pb2+ on forming the RNA-RdRp complex are bracketed in Figure 8B, thus representing the unmasked nucleotides in such an interaction. It is quite evident from the footprinting data that the RdRp interacting sites are spread over the SL1, SL2 and SS regions. This is indeed shown by the experiments with deletion mutants [3'(+)dAn, 3'(+)D and s3'(+)An], wherein the deletion of any one of these three regions abrogates all the binding activity, as shown in Figure 14. Finer mutational analysis involving minor deletions in SL1, such as the lack of three C residues (7190-92) in mutant 3'(+)M3 (Figure 11vii), brings down the binding activity to 13% of the normal. This is in good agreement with the Pb2+ hydrolysis data of the RNA-RdRp complex, wherein all three of these C residues are masked and thus resistant to hydrolysis (Figure 8B).
Some of the nucleotides in the double stranded regions (7178 and 7176), which were resistant to hydrolysis in the native form of RNA, become exposed to Pb2+ ions in the RNA-RdRp complex (Figure 8B). This might be due to local conformational changes during complex formation of RNA with RdRp.
EXAMPLE 16
Structural analysis of RNA dependent RNA polymerase
The putative HEV RdRp sequence was aligned with the poliovirus RdRp sequence to screen for homology and the presence of "palm" domain motifs A, B, C, D and E [38]. The motif domains were also subjected to secondary structure predictions by the neural net method of Rost and Sander [50]. The sequence alignment, secondary structure predictions, structure-based sequence alignment and partial crystal structure of the poliovirus RdRp were used to build a core structure of the "palm" domain for HEV RdRp. The mutations and loops were generated using Swiss PDB viewer [18] and the model was energy minimized with AMBER 5.0 [42].
Results: The reported crystal structures of various nucleic acid polymerases have revealed the overall structure to resemble a right hand [38]. By homology modeling, we have constructed a similar structure for HEV RdRp, comprising "palm" (composed of five motifs: A, B, C, D and E), "fingers" and "thumb" domains, as illustrated in Figure 19A of the drawings. In this figure, the "fingers" and "thumb" domains are shown to correspond to the relative positions of the domains in space. Common essential residues found in poliovirus RdRp and in HEV RdRp are marked along with residue positions in the HEV RdRp sequence. In Figure 19A, dark ribbons represent α helices, light ribbons represent β strands and the dark backbone represents turns. The sequence alignment and identified polymerase motifs A through E within the palm domain are shown in Figure 19B. Consensus secondary structures as given by Hansen et al. 1997 [19] are shown at the top of each aligned motif sequence. The arrows represent β strands, dotted lines correspond to turns and rods represent α helices. The most highly conserved residues (in comparison of RdRps: BMV 2a, TMV p183, TBSV p92, HCV NS5B, Qβ replicase subunit II, poliovirus and HIV RT) are shown in bold letters with asterisks, high similarity residues in underlined letters with double dots, and low similarity residues in italics with single dots.
It was found that strands of the antiparallel β sheets in the core region of the palm domain are composed of residues from motifs A, C and part of D, whereas the α helices are composed of residues from motif B and the remainder of motif D (Figure 19A). The computer generated core structure of the HEV RdRp palm domain resembles that of the poliovirus RdRp, as is apparent from a superimposed picture of the two RdRps. The core structure of the polymerase palm domain is also characterized by its similarity to the RNA recognition motif (RRM) found in splicing proteins and several ribosomal proteins. In view of these facts, a three dimensional model of the core structure of the HEV-RdRp "palm" domain has been proposed (Figure 19A). The palm domain is in turn composed of various motifs (A, B, C, D and E) divided on the basis of secondary structure as well as their functional relevance.
Motif A (Figure 19B) - The secondary structure prediction of HEV RdRp shows that motif A consists of a β strand identical to that in poliovirus RdRp (POL-RdRp), forming one of the four β strands of the core structure. This is followed by a short helical turn. Near the end of the first β strand is a completely conserved aspartate (amino acid 268) in POL-RdRp, HEV-RdRp as well as in other classes of RNA polymerases. The region following this is α helical, as found in the predicted as well as the POL-RdRp crystal structure. Motif A is involved in magnesium coordination and possibly sugar selection. The aspartate (amino acid 268) near the end of the β strand is likely to be involved in coordination of divalent cations during nucleotidyl catalysis. The second aspartate (amino acid 273) is thought to be involved in sugar selection. This aspartate is conserved in all positive stranded RNA viral RdRps. An aspartate at this position favors NTPs over dNTPs, perhaps by direct interaction with the 2' hydroxyl group of an incoming NTP [19].
Motif B (Figure 19B) - The predicted structure of this region was found to be similar in all classes of polymerases. The crystal structure of POL-RdRp suggests a loop formation containing a highly conserved glycine residue (amino acid 326). The loop is also found in the secondary structure prediction of HEV-RdRp. Five residues downstream of the glycine is a threonine (amino acid 330), conserved in all RdRps and found to be important for RdRp activity. Motif B also forms one of the two helices that pack beneath the four-stranded antiparallel β sheet of the polymerase core structure. The predicted extent of the α helical region matches well with that of hepatitis C virus (HCV) NS5B [38].
However, considering the consensus arrived at on the basis of sequence alignments and secondary structure prediction of various classes of polymerases by O'Reilly and Kao [38], the inventors have taken the helical region to be of the same length as that of POL-RdRp in their model of the core of the palm domain. A highly conserved asparagine residue (amino acid 334) found in all RdRps characterizes the helical region. This residue is thought to play a role in the discrimination of ribose versus deoxyribose. Mutation of this asparagine is known to abolish RdRp activity in HCV NS5B.
Motif C (Figure 19B) - Motif C forms a β-turn structure, which is similar in all classes of polymerases, and it positions the two aspartates (GDD motif in the turn region) (amino acids 359 and 360) close to the conserved aspartate of motif A. In POL-RdRp, the two aspartates occupy positions 3 and 4 of the type II β turn, placing the aspartate side chains on the outside face of the core structure. The first aspartate of the GDD motif is thought to be involved in coordination of the second divalent cation. The second aspartate, although not absolutely conserved in all classes of polymerases, is conserved in RdRps. In HEV RdRp the GDD motif is found to be conserved.
Motif D (Figure 19B) - Motif D of POL-RdRp forms an α helix-turn-β strand structure. Although the PHD (Predict at Heidelberg) program predicted this region accurately for poliovirus, there are inconsistencies in the structures predicted for other polymerases [38]. However, on the basis of local sequence similarity, the inventors have assigned it to be the same in the HEV-RdRp core model. The functions of residues in this motif are unclear. The lysine towards the end of the motif is found to be conserved in all RTs and nearly all RdRps, but not in HEV-RdRp.
Motif E (Figure 19B) - Motif E is unique to RdRps and RTs; it forms a short β strand-turn-β strand and is positioned between the palm and thumb. However, the structure of motif E varies in different classes of polymerases and even in different crystals of HIV-RT. The hydrophobic residues in this motif seem to be important for the interaction with the palm core structure. The hydrogen bonding interactions with the β strand of the thumb are thought to be crucial in proper positioning of the thumb on substrate binding.
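By way of illustration only, locating a short conserved motif such as the GDD triplet of motif C in a polymerase sequence can be sketched as a simple pattern scan. The function name and the short sequence below are hypothetical; the sequence is a toy string and not the HEV RdRp sequence, and a plain text scan does not replace the alignment and secondary structure prediction described above.

```python
import re


def find_motif(protein_seq, motif="GDD"):
    """Return 1-based start positions of every (possibly overlapping) match."""
    pattern = "(?=" + re.escape(motif.upper()) + ")"  # lookahead keeps overlaps
    return [m.start() + 1 for m in re.finditer(pattern, protein_seq.upper())]


if __name__ == "__main__":
    toy_sequence = "MKGDDLAGDD"  # hypothetical toy sequence
    for start in find_motif(toy_sequence):
        # The two aspartates of a GDD match sit at start + 1 and start + 2.
        print(start, start + 1, start + 2)
```

In the numbering used in this description, where the two aspartates are amino acids 359 and 360, a GDD match would be expected to start at position 358.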
DISCUSSION
As an overview of the present invention vis-a-vis the prior art, it has been established that for efficient productive infection, the in-vitro produced viral RNA transcripts have to mimic the virion RNA as closely as possible. This is because the viral genome has to interact with several viral as well as cellular proteins. These interactions determine the efficiency of replication, transcription and translation. The parameters which affect infection by gene transfer are: heterogeneity of the transcript population, presence of point mutations, and the sequences at the 5' and 3' ends, i.e. the number of non-viral nucleotides, the presence of a cap structure at the 5' end and a poly(A) tail at the 3' end. The problem of heterogeneity in the transcript population is mainly due to the poor fidelity of RNA polymerase [1, 15]. As a result it may hamper the infectivity of the transcripts [5].
According to the inventors' experiments, this was circumvented by use of a large quantity of RNA for transfection studies. Adding a cap structure to the 5' end could also possibly circumvent this problem. The effect of non-viral sequences at the extreme ends of viral transcripts may also play an important role, and it has been observed that 5' extensions generally decrease or abolish infectivity whereas 3' extensions are tolerated to a limited extent [5]. Reduced infectivity of dengue RNA transcripts has been reported due to the presence of non-viral sequences at the 5' end, whereas non-viral nucleotides at the 3' end of the dengue RNA transcript did not abolish infectivity [32]. In other positive stranded RNA viruses such as poliovirus [59], hepatitis A virus (HAV) [13] and Sindbis virus [48], the in-vitro produced infectious RNA from the cDNA clone had additional 5' non-viral sequences. In these cases the infectivity was found to be lower than that of the virion RNA. However, such a comparison was not possible in the case of HEV because it is still not possible to culture this virus. There was only one non-viral nucleotide at the 3' end of the inventive HEV clone following digestion of the cDNA clone with XhoI restriction enzyme prior to in-vitro transcription. The HEV transcript used in the studies according to the present invention has 12 non-viral nucleotides at the 5' end in addition to the complete viral genome (Genbank Accession No. AF076239). These additional nucleotides in the transcript at the 5' (12 nucleotides) and 3' (1 nucleotide) ends did not abolish its competence for replication as observed in these studies. In recent studies by other researchers, the presence of a cap structure in the HEV genome has been described [24]. However, the present invention demonstrates replication of HEV RNA without a cap structure. Therefore, it may be presumed that the presence of a cap structure is not obligatory for replication of the HEV genome.
The negative strand of viral RNA usually serves as the replicative intermediate in most of the positive stranded RNA viruses. Such a species was demonstrated for HEV in the transfected HepG2 cells, indicating active viral replication. The anti-sense strand was found to be present in a lower amount than the sense strand, as in other positive stranded virus systems [53]. While the sense strand was detected up to 10" dilution of the template RNA, the anti-sense strand was detected up to a dilution of 10" . This is possibly because the anti-sense pregenome tends to get converted into the sense strand faster. In addition, some of the RNA used for transfection may persist even after thorough washing and may lead to the very high level of sense strand detected. In most of the positive stranded RNA viruses, the positive and negative strands are synthesized in a non-equimolar ratio, i.e. the positive strand is produced in excess of the negative strand [53]. It is believed that this is due to the variation in interaction of different cellular proteins and/or RdRp involved in regulating the rate of initiation of viral RNA synthesis from the positive or negative sense template. Viral replication was detected in the cells for 6 passages (33 days). Thereafter, neither the sense nor the anti-sense HEV RNA could be detected (45 days).
The transfected viral genome was not only capable of replication but also expressed viral proteins in the transfected cells and released infectious virus into the culture supernatant, as evaluated by experimental infection of a rhesus monkey. Nearly 20% of the cells were transfected, as observed by immunofluorescence assay. The metabolically labelled viral proteins were immunoprecipitated from transfected cells using their respective antibodies derived from both structural and putative non-structural regions. The components of the non-structural polyprotein, identified from the predicted homology to putative methyl transferase, helicase and RdRp domains, were immunoprecipitated separately from the transfected cells. The putative RdRp was detected up to 36 hours post-transfection. This is possibly because it is an early protein and undergoes rapid degradation. Therefore, no signal corresponding to RdRp could be detected at 72 and 96 hours. However, the other proteins, which include the putative methyl transferase and helicase, were detected at 72 hours post-transfection. This indicates that the protein product from the ORF1 region undergoes processing.
No processing of the ORF1 polyprotein (~186 kDa) was observed in our earlier experiments with the expression of HEV ORF1 [2]. No processed putative functional proteins could be individually identified either in an in-vitro coupled translation system or in HepG2 cells transfected with the ORF1 gene. Incubation of the eukaryotic expression product of ORF1 did not reveal any degradation over 24 hours at 37°C [2]. However, the complete genome of HEV incorporating the same ORF1 shows processed components that could be immunoprecipitated with putative domain specific antibodies. Therefore, it is predicted that processing of the non-structural polyprotein occurs only in the context of the complete virus genome. The other viral proteins, either directly or indirectly through cell dependent mechanisms, may activate proteases responsible for such processing. A putative protease domain has been identified in the ORF1 gene based on sequence comparison [28]. However, this has not been characterized yet. The possibility of the viral protease needing activation cannot be ruled out. The ORF3 protein is a phosphoprotein that binds to src homology domain III. It is phosphorylated by mitogen activated protein (MAP) kinase. Therefore, this possibly can play a role in protein phosphorylation [64]. Whether this protein alters the activity of any cellular or viral protease to initiate the polyprotein processing needs further investigation.
Inoculation of the culture supernatant from the RNA transfected cells was able to produce infection in one rhesus monkey. This was evidenced by a rise in serum transaminase, direct detection of the viral genome and the appearance of IgM and later IgG anti-HEV antibodies in the serum of the inoculated animal. This is possible only when intact virus is released into the culture supernatant, as inoculation of in-vitro produced HEV RNA did not produce infection. This method of gene transfer is unique in the sense that it permits the recovery of an infectious agent from cells transfected with in-vitro produced RNA from an HEV cDNA clone generated by assembly of PCR amplified subgenomic fragments. Similar assembly of PCR amplified fragments has been described earlier [44]. It has always been believed that during PCR amplification, errors in nucleotide incorporation lead to the production of mutated fragments that may not be functionally active. However, the experience of the inventors indicates that using simple methods such as adding a proofreading enzyme (Pfu DNA polymerase; Stratagene, Germany) during amplification can avoid this problem. This model of HEV gene transfer can now be used to facilitate studies on evolution, pathogenesis, molecular biology and drug development relevant to understanding and controlling HEV infection.
Based on the replication strategy of other positive strand RNA viruses and the genomic organization of HEV, a basic replication mechanism for HEV has been hypothesized earlier [47]. Upon entry of the virus into the cell, the non-structural gene products are presumably expressed from the full-length positive sense genome and are then involved in the earliest stages of replication. The replicase unit, i.e. RdRp either alone or in association with cellular proteins, subsequently directs the synthesis of negative sense pre-genomic RNA from the 3' end of the viral genome. Such a negative strand HEV RNA intermediate has indeed been detected earlier [37] and the research leading to the present invention supports this kind of replication pattern. Subsequently, the synthesis of progeny positive strand RNA takes place using the de novo synthesized negative strand as the template. Thus, the entire process of replication of the HEV RNA genome minimally utilizes an interaction between viral and host cell proteins and the 3' and 5' terminal RNA motifs. Similar interactions have been characterized in Japanese encephalitis virus [6], West Nile virus [4], encephalomyocarditis virus [14], alfalfa mosaic virus [17], hepatitis A virus [31], brome mosaic virus [45], mouse hepatitis virus [62, 63], Sindbis alphavirus [41] and hepatitis C virus [61].
In the light of the proposed replication model, studies were undertaken to analyze the RNA-protein interactions that occur at the 3' end of HEV RNA, which might serve as the key step in the initiation of replication of the viral genome. To the inventors' knowledge, this is the first set of experiments describing the interaction of viral RdRp and cellular proteins with the 3' end of the HEV genome. It also describes the RNA sequences and secondary structures responsible for the formation of these specific RNA-protein complexes. The present invention attempts to characterize the host proteins interacting with the 3' end. It also reports on the HEV RdRp structure in relation to the structure of other polymerases, in particular the crystal derived structure of poliovirus RdRp.
Chemical and enzymatic probing of the 3' end of HEV RNA with five adenosine residues confirmed the computer predicted structure model derived by the MFOLD program (Figures 8A and 8B). Chemical probing utilized Pb2+, which not only discriminates the single stranded regions and loops from double stranded stem structures, but also has the ability to distinguish between regions of increased conformational flexibility as well as altered conformation of RNA [10, 11, 12]. The approach of Pb(II)-induced hydrolysis has been successfully applied in structural studies of several RNA molecules, such as tRNA [3, 29, 35, 51], mouse U1 small nuclear RNA [65], E. coli 16S RNA [16] and different prokaryotic 5S rRNAs [9, 12].
The earlier attempts to identify the RdRps of positive-strand RNA viruses have been based solely on sequence conservation [25, 27]. According to the present invention, the inventors have tried to further analyze the structural and functional aspects of HEV RdRp by comparing it with the computer predictions of other representative RdRps [38], with special reference to the recent report of the partial crystal structure of poliovirus RdRp [19]. The reported crystal structures of various nucleic acid polymerases have, as already stated herein, revealed that the overall structure of a polymerase molecule resembles a right hand comprising "palm", "fingers" and "thumb" domains. It is also proposed that there appears to be a conservation of tertiary structure rather than primary sequence. Although the 'fingers' and 'thumb' domains differ among polymerases, all the known classes of polymerases have a similar core structure in the 'palm' domain. In view of these facts, a three dimensional model of the core structure of the HEV-RdRp 'palm' domain has been proposed (Figure 19A).
The 3' ends of alfalfa mosaic virus (AAMV) [17], turnip crinkle virus (TCV) [52], human rhinovirus [57] and encephalomyocarditis virus [14] are known to specifically bind to their respective RNA dependent RNA polymerases. Using gel shift assays, the present invention has demonstrated that the purified and refolded HEV RdRp binds specifically to the 3' end of the HEV RNA genome with the poly(A) tail. tRNA was included in the binding reaction mix in sufficient quantity to avoid any non-specific binding. The specificity of the interaction was further confirmed in competition assays. Specific competitors, i.e. cold RNA corresponding to the probe, disrupted the retarded complex even at 5-10 fold molar excess, but non-specific competitors such as E. coli tRNA could not abolish complex formation even at concentrations as high as 50-fold molar excess. Supershift assay using anti-RdRp antibodies further confirmed the specificity of the interaction. The 3' end lacking the poly(A) stretch did not bind to RdRp, even with increased amounts of protein (20 μg). Thus, the interaction between the 3' end and RdRp shows an absolute requirement for the poly(A) tail, which might be due to its participation in the formation of SL1. A similar requirement for the poly(A) tail has been established earlier for encephalomyocarditis virus [14]. The functional relevance of a 3' pseudoknot that includes part of the 3' poly(A) tail has been shown for efficient replication of bamboo mosaic potexvirus RNA [58]. Specificity of the viral RdRp binding to the viral sequences rather than to polyadenylated RNA was confirmed by the inability of polyadenylated mRNAs to compete with the interaction. Deletion mutants lacking either SL1 or SL2 did not form the complex with RdRp. This suggests that the complete 3' end domain, including the two stem loops and the poly(A) tail, is recognized by the viral RdRp.
Pb2+-induced hydrolysis of the RNA-RdRp complex revealed that the interacting sites were indeed spread over SL1, SL2 and even the SS region. Modifications in SL1 by deletion and point mutations affected the binding to varying degrees (Figure 14). Opening of the interior loop and stem region (a) by a point mutation brings down the binding efficiency to 48%. A compensatory point mutation that restores SL1 base pairing was able to partially restore binding to 60%. Deletion of three cytosines in the interior loop affects the RNA-RdRp interaction drastically, bringing it down to merely 13%. In contrast, removal of stem region (b) and the hairpin loop had little effect on RdRp binding. These combined results suggest that both structured RNA motifs and sequence elements (CCC in the interior loop) are important for RdRp binding.
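By way of illustration only, percentage binding efficiencies of the kind quoted above can be derived from gel shift band intensities by expressing the fraction of probe shifted into the complex for a mutant relative to that for the wild-type probe. The description does not state how the percentages were actually calculated; the function names and the band intensities below are hypothetical.

```python
def fraction_bound(complex_signal, free_signal):
    """Fraction of the labeled probe found in the retarded complex."""
    total = complex_signal + free_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return complex_signal / total


def relative_binding_percent(mutant, wild_type):
    """Binding of a mutant probe as a percentage of wild-type binding.

    Each argument is a (complex_signal, free_signal) pair of band
    intensities in arbitrary units from the same gel.
    """
    return 100.0 * fraction_bound(*mutant) / fraction_bound(*wild_type)


if __name__ == "__main__":
    wild_type = (5000.0, 5000.0)        # hypothetical intensities
    mutants = {
        "point mutant": (2400.0, 7600.0),
        "compensatory mutant": (3000.0, 7000.0),
        "deletion mutant": (650.0, 9350.0),
    }
    for name, bands in mutants.items():
        print(name, round(relative_binding_percent(bands, wild_type), 1))
```

Normalizing to the wild-type lane of the same gel removes differences in probe specific activity and exposure time between experiments.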
It is postulated that the initiation of replication also requires binding of host cell proteins at the 3' end of positive-strand RNAs to form a replicase complex in conjunction with the RNA dependent RNA polymerase enzyme. Recently many such interactions have been identified. The 3' ends of simian hemorrhagic fever virus RNA [21], hepatitis A virus RNA (presumed to form a pseudoknot structure) [31], Sindbis virus [30] and West Nile virus [4], and the conserved 11-nucleotide sequence at the 3' end of the coronavirus mouse hepatitis virus [63], bind to their respective host cell proteins. The host cell proteins can also interact with the viral RNA replication complex, as in brome mosaic virus [45]. In their binding studies with the hepatic cell extract, the inventors have indeed shown that the 3' end of the HEV genome forms specific complexes with cellular proteins. Two major complexes were observed on native gels, suggesting the involvement of more than one host factor in such an interaction. The smeared signal near the top of EMSA gels also indicates that a high molecular weight complex with multiple protein subunits may interact with the 3' end. The formation of these complexes is specific, as shown by competition experiments. Interaction with the cellular proteins is independent of the poly(A) stretch, removal of which disrupts SL1. Some point and deletion mutations in SL1 do not seem to affect complex formation with the cellular proteins. However, complete absence of SL1 results in partial abrogation of the formation of complex C2, whereas absence of SL2 abolishes the formation of complex C1 completely. The definite biological role of the host cell proteins interacting with viral RNA genomes has not been clarified so far. Most such proteins have been found to be associated with cellular RNA-processing pathways or the translation machinery [33]. Some of these cell proteins may aid in replication of the viral genome in conjunction with the RdRp-replicase complex. The two major polypeptides interacting with the 3' end of the HEV RNA genome were identified as doublets of ~45 kDa and ~95-105 kDa, together with 3-4 minor species represented by faint bands. The 45 kDa protein might be an important specificity determining factor, as interaction with such a protein is absent with the Cos-7 cell extract. However, the Cos-7 cell extract showed crosslinking of a smaller polypeptide, which might suggest that this protein is somewhat smaller in monkey cells than in human cells, or in kidney cells than in liver cells. Proteins with similar molecular weights have been identified in other systems. In brome mosaic virus, a host protein of 45 kDa bound to the RdRp was determined to be an analog of eIF-3 subunit p41 [45]. The 3' end of hepatitis A RNA is also known to bind to a 45 kDa host protein [31]. A 103 kDa cellular protein is reported to bind to the 3' end of mouse hepatitis virus [63] and another 105 kDa protein binds to the sense and anti-sense strands of the 3' non-coding region of West Nile virus [4]. The 3' end of rubella virus negative strand RNA binds to a 97 kDa cellular protein whose intensity is increased in infected cells [36]. The doublet signals observed in the UV-crosslinking assays might be due to RNA binding to different proteins of slightly different molecular weights, to differentially modified forms of the same protein (e.g. glycosylated and un-glycosylated forms) or to differential digestion of the RNA after UV-crosslinking. Differences in the intensities of the detected bands do not necessarily reflect differences in binding capabilities, but could also be due to differences in the respective amounts of the proteins in the cell extract or in their UV-crosslinking abilities.
The inventors' studies described herein demonstrate the specific binding of purified and refolded HEV RdRp to the 3' end of the viral genome, which necessitates the structured SL1 and SL2 domains and the poly(A) tail. Multiple sequence and structural factors are recognized by RdRp, which may explain the specificity of recognition of the HEV RNA. Comparison of the HEV RdRp structural analysis with other RNA polymerases reveals the presence of conserved structured domains with discrete functions. Cellular proteins also form complexes at the SL1 and SL2 sequence regions, and the interaction is independent of the presence of the poly(A) stretch. The molecular weights of the interacting cellular proteins have been identified to be ~45 kDa and ~95-105 kDa, as detected by UV-induced crosslinking studies. By forming the specific binding site for viral RdRp and cellular proteins, the 3' end of the HEV genome is projected as assuming a potential role as a cis-acting element for the initiation of replication. In such a situation, the 3' end of the HEV genome holds great promise as a potential target for synthetic molecules, anti-sense and ribozyme therapy.
TABLE I
Primers used for the amplification of the 3' end of HEV RNA and generation of its mutants

Probe          Forward primer                      Reverse primer
3' (+) An      pT7-7084-GCTCCAGCGCCTTAAGATGAA      GCCTCGAGTTTTTCAGGGAGCGCGGAACGCA
3' (+)         pT7-7084-GCTCCAGCGCCTTAAGATGAA      AGGGAGCGCGGAACGCAGAAATGAGAAATAAGCAACAGA
3' (+) D       pT7-7084-GCTCCAGCGCCTTAAGATGAA      GCAACAGAGAGAAGGGGGGCACA
3' (+) d An    pT7-7084-GCTCCAGCGCCTTAAGATGAA      TTTTTTGAGAAATAAGCAACA
s3' (+) An     pT7-7139-TTGTGCCCCCCTTCTCTCTG       CCTCGAGTTTTTCAGGGAGCGCGGAACGCA
3' (+) M1      pT7-7084-GCTCCAGCGCCTTAAGATGAA      TTTTTTAGGGAGCGCGGAACGCAGA
3' (+) M2      pT7-7084-GCTCCAGCGCCTTAAGATGAA      TTTTAGGGAGCGCGGAACGCAAAAATG
3' (+) M3      pT7-7084-GCTCCAGCGCCTTAAGATGAA      TTTTTCGCGCGGAACGCAGA
3' (+) M4      pT7-7084-GCTCCAGCGCCTTAAGATGAA      TTTTTCAGGGAAGAAATGAGAAATAAGC

pT7 represents the T7 RNA polymerase promoter sequence (TGTAATACGACTCACTATAGG). 7084 and 7139 are the nucleotide positions in the HEV genome at which the forward primers begin.
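The primer design in Table I can be illustrated with a minimal Python sketch. It assumes that the spaces inside the printed primer sequences are extraction artifacts and can be removed; the function and constant names are hypothetical and not part of the original description. The sketch builds a full forward primer (T7 promoter followed by the HEV-specific part) and shows that the reverse primer of the poly(A)-containing probe, read as its reverse complement, corresponds to the last 18 nucleotides of SEQ ID No. 4 followed by five A residues and a short tail containing the sequence CTCGAG (the XhoI recognition sequence).

```python
# T7 promoter prefix exactly as given in the Table I footnote.
T7_PROMOTER = "TGTAATACGACTCACTATAGG"

# HEV-specific part of the forward primer beginning at nucleotide 7084.
FWD_7084 = "GCTCCAGCGCCTTAAGATGAA"

# Reverse primer of the poly(A)-containing probe, spaces removed
# (assumption: the spaces in the printed table are extraction artifacts).
REV_AN = "GCCTCGAGTTTTTCAGGGAGCGCGGAACGCA"

# Last 18 nucleotides of SEQ ID No. 4 (positions 7177-7194), upper-cased.
GENOME_3P_END = "TGCGTTCCGCGCTCCCTG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def full_forward_primer(hev_part: str) -> str:
    """Prefix an HEV-specific primer with the T7 promoter sequence."""
    return T7_PROMOTER + hev_part


if __name__ == "__main__":
    print("forward primer:", full_forward_primer(FWD_7084))
    # Sense-strand sequence that the reverse primer adds to the 3' end.
    print("3' end of product:", reverse_complement(REV_AN))
```

Read this way, the five T residues in the reverse primer give the amplified template a short run of A residues after the viral 3' end, which is consistent with the poly(A) requirement discussed in the text.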
REFERENCES
1. Ahlquist, P., and M. Janda. 1984. cDNA cloning and in-vitro transcription of the complete brome mosaic virus genome. Mol. Cell. Biol. 4:2876-82.
2. Ansari, I. H., S. K. Nanda, H. Durgapal, S. Agrawal, S. K. Mohanty, D. Gupta, S. Jameel and S. K. Panda. 2000. Cloning, Sequencing and Expression of the Hepatitis E virus non-structural Open Reading Frame 1 (ORF1). J. Med. Virol. 60:275-283.
3. Behlen, L., J.R. Sampson, A.B. DiRenzo and O.C. Uhlenbeck. 1990. Lead-catalyzed cleavage of yeast tRNAPhe mutants. Biochem. 29:2515-2523.
4. Blackwell, J.L. and M.A. Brinton. 1995. BHK cell proteins that bind to the 3' stem loop structure of the West Nile virus genome RNA. J. Virol. 69:5650-5658.
5. Boyer, J. C., and A. L. Haenni. 1994. Infectious transcripts and cDNA clones of RNA viruses. Virology. 19:415-26.
6. Chen, C.J., M.D. Kuo, L.J. Chien, S.L. Hsu, Y.M. Wang and J.H. Lin. 1997. RNA-protein interactions: involvement of NS3, NS5 and 3' noncoding regions of Japanese encephalitis virus genomic RNA. J. Virol. 71:3466-3473.
7. Chetouani, F., P. Monestie, P. Thebault, C. Gaspin and B. Michot. 1997. ESSA: an integrated and interactive computer tool for analysing RNA secondary structure. Nucleic Acids Res. 25:3514-22.
8. Chomczynski, P., and N. Sacchi. 1987. Single step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal. Biochem. 162:156-159.
9. Ciesiolka, J. and W.J. Krzyzosiak. 1996. Structural analysis of two plant 5S rRNA species and fragments thereof by lead-induced hydrolysis. Biochem. Mol. Biol. Int. 39:319-328.
10. Ciesiolka, J., D. Michalowski, J. Wrzesinski, J. Krajewski and W.J. Krzyzosiak. 1998. Patterns of cleavages induced by lead ions in defined RNA secondary structure motifs. J. Mol. Biol. 275:211-220.
11. Ciesiolka, J., S. Lorenz and V.A. Erdmann. 1992a. Structural analysis of three prokaryotic 5S rRNA species and selected 5S rRNA - ribosomal-protein complexes by means of Pb(II)-induced hydrolysis. Eur. J. Biochem. 204:575-581.
12. Ciesiolka, J., S. Lorenz and V.A. Erdmann. 1992b. Different conformational forms of Escherichia coli and rat liver 5S rRNA revealed by Pb(II)-induced hydrolysis. Eur. J. Biochem. 204:583-589.
13. Cohen, J. I., J. R. Ticehurst, S. M. Feinstone, B. Rosenblum, and R. H. Purcell. 1987. Hepatitis A virus cDNA and its RNA transcripts are infectious in cell culture. J. Virol. 61:3035-9.
14. Cui, T., S. Sankar and A.G. Porter. 1993. Binding of Encephalomyocarditis virus RNA polymerase to the 3' noncoding region of the viral RNA is specific and requires the 3' poly A tail. J. Biol. Chem. 268:26093-26098.
15. Dawson, W. O., D. L. Beck, D. A. Knorr, and G. L. Grantham. 1986. cDNA cloning of the complete genome of tobacco mosaic virus and production of infectious transcripts. Proc. Natl. Acad. Sci. USA. 83:1832-1836.
16. Gornicki, P., F. Baudin, P. Romby, M. Wiewiorowski, W.J. Krzyzosiak, J.P. Ebel, C. Ehresmann and B. Ehresmann. 1989. Use of Pb(II) to probe the structure of large RNAs. Conformation of 3' terminal domain of E.coli 16S rRNA and its involvement in building the tRNA binding sites. J. Biomol. Struct. Dynam. 6:971-984.
17. Graaff, M.D., C. Thorburn and E.M.J. Jaspars. 1995. Interaction between RNA dependent RNA polymerase of Alfalfa Mosaic virus and its template; Oxidation of vicinal hydroxyl groups blocks in vitro RNA synthesis. Virol. 213:650-654.
18. Guex, N. and M.C. Peitsch. 1998. Swiss-PDB Viewer. Glaxo Wellcome Experimental Research v 3.2.
19. Hansen, J.L. 1997. Structure of the RNA-dependent RNA polymerase of poliovirus. Structure 5:1109-22.
20. He, J.A., A.W. Tam, P.O. Yarbough, G.R. Reyes and M. Carl. 1993. Expression and diagnostic utility of hepatitis E virus putative structural proteins expressed in insect cells. Journal of Clinical Microbiology 31:2167-73.
21. Hwang, Y.K. and M.A. Brinton. 1998. A 68-nucleotide sequence within the 3' noncoding region of simian hemorrhagic fever virus negative strand RNA binds to four MA104 cell proteins. J Virol. 72:4341-4351.
22. Jameel, S., H. Durgapal, C. M. Habibullah, M. S. Khuroo, and S. K. Panda. 1992. Enteric non-A, non-B hepatitis: epidemics, animal transmission and hepatitis E virus detection by the polymerase chain reaction. J. Med. Virol. 37:263-270.
23. Jameel, S., M. Zafrullah, M. H. Ozdener, and S. K. Panda. 1996. Expression in animal cells and characterization of the hepatitis E virus structural proteins. J. Virol. 70:207-216.
24. Kabrane-Lazizi, Y., M. Xiang-Jin, R. H. Purcell and S. U. Emerson. 1999. Evidence that the genomic RNA of hepatitis E virus is capped. J. Virol. 73:8848-8850.
25. Kamer, G. and P. Argos. 1984. Primary structural comparison of RNA-dependent polymerases from plant, animal and bacterial viruses. Nucleic Acids Res. 12:7269-7282.
26. Khuroo, M. S., W. Deurmeyer, S. A. Zargar, M. A. Ahanger, and M. A. Shah. 1983. Acute sporadic non-A, non-B hepatitis in India. Am. J. Epidemiol. 118:360-364.
27. Koonin, E.V. 1991. The phylogeny of RNA-dependent RNA polymerases of positive-strand RNA viruses. J. Gen. Virol. 72:2197-2206.
28. Koonin, E.V., A. E. Gorbalenya, M. A. Purdy, M. N. Rozanov, G. R. Reyes, and D.W. Bradley. 1992. Computer-assisted assignment of functional domains in the non-structural polyprotein of hepatitis E virus: delineation of new group of animal and plant positive-strand RNA viruses. Proc. Natl. Acad. Sci. USA 89:8259-63.
29. Krzyzosiak, W.J., T. Marciniec, M. Wiewiorowski, P. Romby, J.P. Ebel and R. Giege. 1998. Characterisation of Pb(II) induced cleavages in tRNAs in solution and effect of the Y-base in yeast tRNAPhe. Biochem. 27:5771-5777.
30. Kuhn, R.J., Z. Hong and J.H. Strauss. 1990. Mutagenesis of the 3' nontranslated region of Sindbis virus RNA. J. Virol. 64:1465-1476.
31. Kusov, Y., M. Weitz, G. Dollenmeier, V.G. Muller and G. Siegl. 1996. RNA-protein interactions at the 3' end of the hepatitis A virus RNA. J. Virol. 70:1890-1897.
32. Lai, C. J., B. T. Zhao, H. Hori, and M. Bray. 1991. Infectious RNA transcribed from stably cloned full-length cDNA of dengue type 4 virus. Proc. Natl. Acad. Sci. USA 88:5139-43.
33. Lai, M.M.C. 1998. Cellular factors in the transcription and replication of viral RNA genomes: a parallel to DNA-dependent RNA transcription. Virology 244:1-12.
34. Li, F., H. Zhuang, S. Kolivas, S.A. Locarnini, D.A. Anderson. 1994. Persistent and transient antibody responses to hepatitis E virus detected by western immunoblot using open reading frame 2 and 3 and glutathione S-transferase fusion proteins. Journal of Clinical Microbiology 32:2060-2066.
35. Marciniec, T., J. Ciesiolka, J. Wrzesinski and W.J. Krzyzosiak. 1989. Identification of magnesium, europium and lead binding sites in E.coli and lupine tRNAPhe by specific metal ion induces cleavages. FEBS Lett. 243:293-298.
36. Nakahasi, H.L., X.Q. Cao, T.A. Rouault and T.Y. Liu. 1991. Specific binding of host cell proteins to the 3' terminal stem loop structure of rubella virus negative strand RNA. J. Virol. 65:5961-5967.
37. Nanda, S.K., S.K. Panda, H. Durgapal and S. Jameel. 1994b. Detection of negative strand of hepatitis E virus RNA in the livers of experimentally infected rhesus monkeys: evidence for viral replication. J. Med. Virol. 42:237-240.
38. O'Reilly, E.K. and C.C. Kao. 1998. Analysis of RNA-dependent RNA polymerase structure and function as guided by known polymerase structures and computer predictions of secondary structure. Virology 252:287-303.
39. Panda, S. K., R. Datta, K. Kaur, A. J. Zuckerman, and N. C. Nayak. 198 Enterically transmitted non-A, non-B hepatitis: recovery of virus-like particles from an epidemic in South Delhi and transmission studies in rhesus monkeys. Hepatology. 10:466-472.
40. Panda, S. K., S. K. Nanda, M. Zafrullah, I. H. Ansari, M. H. Ozdener, and S. Jameel. 1995. An Indian strain of hepatitis E virus (HEV): Cloning, sequencing and expression of structural region and antibody responses in sera from individuals from an area of high-level HEV endemicity. J. Clin. Microbiol. 33:2653-2659.
41. Pardigon, N., E. Lenches and J.H. Strauss. 1993. Multiple binding sites for cellular proteins in the 3' end of sindbis alphavirus minus-sense RNA. J. Virol. 67:5003-5011.
42. Pearlman, D.A., D.A. Case, J.W. Caldwell, W.S. Ross, T.E. Cheatham, S.E. DeBolt, D.M. Ferguson, G.L. Seibel and P.A. Kollman. 1995. "AMBER, a Package of Computer Programs for Applying Molecular Mechanics, Normal Mode Analysis, Molecular Dynamics and Free Energy Calculations to Simulate the Structural and Energetic Properties of Molecules". A computer simulation software developed by University of California, 1995.
43. Promega Co. USA. 1996. ISBN 1-882274-57-1. Promega Protocols and Applications Guide. III Edition, Determination of percent incorporation and specific activity. p. 118-119.
44. Pugachev, K.V., E. S. Abernathy, and T. K. Frey. 1997. Improvement of the specific infectivity of the rubella virus (RUB) infectious clone: determinants of cytopathogenicity induced by RUB map to the non-structural proteins. J. Virol. 71(1):562-8.
45. Quadt, R., C.C. Kao, K.S. Browning, R.P. Hershberger and P. Ahlquist. 1993. Characterization of a host protein associated with brome mosaic virus dependent RNA polymerase. Proc. Natl. Acad. Sci. USA 90:1498-1502.
46. Racaniello, V. R., and D. Baltimore. 1981. Cloned poliovirus complementary DNA is infectious in mammalian cells. Science. 214:916-919.
47. Reyes, G.R., C.C. Huang, A.W. Tam and M.A. Purdy. 1993. Molecular organisation and replication of hepatitis E virus (HEV). Arch. Virol. Suppl 7:15-25.
48. Rice, C. M., R. Levis, J. H. Strauss, and H. V. Huang. 1987. Production of infectious RNA transcripts from Sindbis virus cDNA clones: mapping of lethal mutations, rescue of a temperature-sensitive marker, and in vitro mutagenesis to generate defined mutants. J. Virol. 61:3809-19.
49. Rohll, J.B., N. Percy, R. Ley, D.J. Evans, J.W. Almond and W.S. Barclay. 1994. The 5' untranslated regions of Picornavirus RNAs contain independent functional domains essential for RNA replication and translation. J. Virol. 68:4384-4391.
50. Rost, B. and C. Sander. 1993. Prediction of protein structure at better than 70% accuracy. J. Mol. Biol. 232:584-599.
51. Sampson, J.R., F.X. Sullivan, L.S. Behlen, A.B. DiRenzo and O.C. Uhlenbeck. 1987. Characterization of two RNA catalyzed RNA cleavage reaction. Cold Spring Harbor Symp. Quant. Biol. 52:267-275.
52. Song, C. and A.E. Simon. 1995. Requirement of a 3' terminal stem loop in in vitro transcription by an RNA dependent RNA polymerase. J. Mol. Biol. 254:6-14.
53. Strauss, J. H., and E. G. Strauss. 1994. The alphaviruses: gene expression, replication, and evolution. Microbiol. Rev. 58:491-562.
54. Tam, A. W., R. White, E. Reed, M. Short, Y. Zhang, T. R. Furest, and R. E. Lanford. 1996. In-vitro Propagation and Production of Hepatitis E Virus from in vivo-Infected Primary Macaque Hepatocytes. Virology. 215:1-9.
55. Tam, A.W., M.M. Smith, M.E. Guerra, C.C. Huang, D.W. Bradley, K.E. Fry and G.R. Reyes. 1991. Hepatitis E virus (HEV): molecular cloning and sequencing of the full-length viral genome. Virology 185:120-31.
56. Tiley, L., M. Hagen, J.T. Matthews and M. Krystal. 1994. Sequence specific binding of the Influenza virus RNA polymerase to sequences located at the 5' ends of the viral RNAs. J. Virol. 68:5108-5116.
57. Todd, S., J.H.C. Nguyen and B.L. Semler. 1995. RNA - protein interactions directed by the 3' end of Human Rhinovirus genomic RNA. J. Virol. 69:000-000(a-j).
58. Tsai, C.H., C.P. Cheng, C.W. Peng, B.Y. Lin, N.S. Lin and Y.H. Hsu. 1999. Sufficient length of a poly (A) tail for the formation of a potential pseudoknot is required for efficient replication of bamboo mosaic potexivirus RNA. J. Virol. 73:2703-2709.
59. van der Werf, S., J. Bradley, E. Wimmer, F. W. Studier, and J. J. Dunn. 1986. Synthesis of infectious poliovirus RNA by purified T7 RNA polymerase. Proc. Natl. Acad. Sci. USA 83:2330-4.
60. Walter, A.E. and D.H. Turner. 1994. Sequence dependence of stability for coaxial stacking of RNA helixes with Watson-Crick base paired interfaces. Biochemistry 33:12715-9.
61. Yen, J.H., S.C. Chang, C.R. Hu, S.C. Chu, S.S. Lin, Y.S. Hsieh and M.F. Chang. 1995. Cellular proteins specifically bind to the 5' noncoding region of hepatitis C virus RNA. Virology 208:723-732.
62. Yu, W. and J.L. Leibowitz. 1995a. A conserved motif at the 3' end of mouse hepatitis virus genomic RNA required for host protein binding and viral RNA replication. Virol. 214:128-138.
63. Yu, W. and J.L. Leibowitz. 1995b. Specific binding of host cellular proteins to multiple sites within the 3' end of mouse hepatitis virus genomic RNA. J. Virol. 69:2016-2023.
64. Zafrullah, M., M. H. Ozdener, S. K. Panda, and S. Jameel. 1997. The ORF3 protein of HEV is a phosphoprotein that associates with cytoskeleton. J. Virol. 71:9045-9053.
65. Zietkiewicz, E., J. Ciesiolka, W.J. Krzyzosiak and R. Slomski. 1990. The secondary structure model of mouse U1 snRNA as determined from the results of Pb2+-induced hydrolysis. In Nuclear Structure and Function (Harris JR and Zbraski JB, eds), pp. 453-457, Plenum Press, New York.
66. Zuker, M. 1989. On finding all suboptimal foldings of an RNA molecule. Science 244:48-52. Review.
SEQUENCE LISTING PART
<110> Department of Science and Technology; All India Institute of Medical Sciences; Panda, Subrat Kumar; Ansari, Israrul Haque; Agrawal, Shipra; Durgapal, Hemlata.
<120> GENETICALLY ENGINEERED CLONE OF HEPATITIS E VIRUS (HEV) GENOME WHICH IS INFECTIOUS, ITS PRODUCTION AND USES
<130> PANDA/HEV
<150> IN 106/DEL/2000
<151> 2000-02-07
<160>
<210> SEQ ID No. 1
<211> 7099
<212> cDNA
<213> Hepatitis E Virus
<400> SEQ ID No. 1 atg gaggcccatc agtttctcaa ggctcccggc atcactactg ctgttgagca ggctgctcta gccacggcca actctgccct ggcgaatgct gtggtagtta ggccttttct ttctcaccag cagattgaga ttctcattaa cctaatgcaa cctcgccagc ttgttttccg ccccgaggtt ttctggaatc aacccatcca gcgtgtcatt cataacgagc tggagcttta ctgccgcgct cgctccggcc gctgtcttga aattggcgcc catccccgct caataaatga taatcctaat gtggtccacc gctgcttcct ccgccctgtt gggcgtgatg ttcagcgctg gtatactgct cccactcgcg ggccggctgc taattgccgc cgttccgcgt tgcgtgggct tcccgctgct gaccgcacat actgcttcga cgggttttct ggctgtagct gccccgccga gacgggtatc gccctttact ccctccatga tatgtcacca tctgatgttg ccgaggccat gttccgccat ggtatgacgc ggctttatgc tgccctccat cttccgcctg aggtcttgct gccccctggc acatatcgca ccgcatcgta tttgctgatt
catgacggca ggcgcgttgt ggtgacgtat gagggtgata ctagtgctgg ttacaaccac gatgtctcca acttgcgctc ctggattaga accaccaagg ttaccggaga ccatcccctc gttatcgagc gggttagggc cattggctgc cactttgttc tcttgctcac ggcagccccg gagccatcac ctatgcctta tgttccttac ccccggtcta ccgaggtcta tgtccgatcg atcttcggcc cgggtggtac cccttcctta ttcccaacct catgctccac taagtcgacc ttccacgctg tccctgccca tatttgggac cgtcttatgc tgttcggggc caccttggat gaccaagcct tttgctgctc ccgtttaatg acctaccttc gcggcattag ctacaaggtc actgttggta cccttgtggc taatgaaggc cggaacgcct ctgaggacgc cctcacagct gtcatcactg ccgcctatct taccatttgc caccagcggt atctccgcac ccaggctata tccaagggga tccgtcgtct tgaacgggag catgaccaga agtttataac acgcctttac agctggctct tcgagaagtc cggccgtgat tacatccctg gccgtcagtt ggagttctac gcccagtgta ggcgctggct ttcggccggc tttcatcttg atccacgtgt actggttttt gacgagtcgg ccccctgcca ttgtaggact gtgatccgca aggcgctctc gaagttttgc tgctttatga agtggcttgg tcaggagtgc acctgttttc ttcaacctgc agaaggcgtc gtcggcgacc agggtcatga taacgaatcc tatgaggggt ccgatgttga ccctgctgag tccgccatta gtgacatctc tgggtcctat gtcgtccctg gcacagccct ccaaccgctc taccaggccc tcgatctccc cgatgagatt gtggctcgcg cgtgccggct gaccgccaca gtaaaggtct cccaggtcga tgggcggatc gattgcgaga cccttcttgg taacaaaacc ttccgcacgt cgtttgtcga cggggcggtc ttagagacca atggcccaga gcgccacaat ctctcctttg atgccagtca gagcactatg gccgctggcc ctttcagtct cacctatgcc gcctctgcag ctgggctgga ggtgcgctat gttggtgccg ggcttgacca tcgggcgatt tttgcccccg gtgtttcacc ccggtcaaac cccggcgagg tcaccgcctt ctgctctgcc ctatataggt tcaaccgtga ggcccagcgc cattcgctga ccggtaactt atggttccat cctgaggggc ttattggcct ctttgccccg ttttcgcctg ggcatgtctg ggagtcggct aaaccattct gtggcgagag cacactttac acccgtactt ggtcggaggt tgatgccgtc tctagtccaa cccggcccga tttgggtttt atgtctgagc ctcctatacc tagtagggcc gccacgccta ccttggcggc ccctctaccc ccccttgcac cggacccttc ccctccttct tctgccccgg cgctcgatga gccggcttct gccgctacct ccggggtccc ggccataacc caccagacgg cccggcaccg ccgcctgctc ttcacctacc cggatggctc taaggtattc gccggctcgc tgttcgagtc gacatgcacg tggctcgtta acgcgtctaa tgttgaccac tgccctggcg 
gcgggctttg ccatgcattt taccaaaggt accccgcctc ctttgatgct gcctgttttg tgatgcgcga cggtgcggcc gcgtacacac tgaccccccg gccaataatt catcgtgtcg cccctgatta taggttggaa cataacccaa agaggcttga ggctgcttat cgggagactt gttcccgcct cggtaccgct gcatacccgc tcctcgggac cggcatatac
caggtgccga tcggtcccag ctttgacgcc tgggagcgga accaccgccc cggggatgag ttgtaccttc ctgaacttgc tgccagatgg tttgaggcca ataggccgac ccgcccaact ctcactataa ctgaggatgc tgcacggaca gcgaatctgg ccatcgagct tgactcagcc acagatgtcg gccgggcctg tgctggctgt cgggttaccc ctggcgttgt tcaataccag tttaccgcag gtgtgcctgg atccggcaag tcccgctcca tcacccgagc cgatgtggac gttgtcgtgg tcccgacgcg cgagttgcgt aatgcctggc gccgtcgcgg ctttgctgcc ttcaccccgc acactgccgc tagagtcacc gacgggcgcc gggttgtcat tgatgaggct ccatccctcc cccctcacct gttgctgctc cacatgcagc gggccgccac cgtccacctt cttggcgacc cgaatcagat cccagccatc gactttgagc accctgggct cgtccccgcc atcaggcccg acttagcccc tacctcctgg tggcatgtta cccatcgctg ccctgcggat gtatgtgagt tgatccgtgg tgcatacccc atgatccaga ccactagccg ggttctccgt tcgttgttct ggggtgagcc tgccgtcggg cagaaactag tgttcaccca ggcggccaag cccgccaacc ccggctcagt gacggtccac gattcgcagg gcgctaccta cacttatacc actattattg ccacagcaga tgcccggggc cttattcagt cgtctcgggc tcatgccatt gttgctctga cgcgccacac tgagaagtgg gtcatcattg acgcaccagg cctgcttcgc gaggtgggca tctccgatgc aatcgttaat aactttttcc tcgctggtgg cgaaattggt catcagcgcc catctgttat tccccgtggc aaccctgacg ccaatgttga caccctggct gccttcccac cgtcttgcca gattagtgcc ttccatcagt tagctgagga gctcggccac agacctgccc ctgttgcagc tgttctacca ccctgccccg agctcgaaca gggcctcctc tatctgcccc aggagctcac cacctgtgat agtgtcgtaa catttgaatt aacagatatt gtgcactgcc gcatggccgc cccgagccag cgcaaggccg tagtgtccac actcgtgggc cgctacggcc gtcgcacaaa gctctacaat gcttcccact ctgatgttcg cgactctctc gcccgtttta tccctgccat tggccccgta caggtcacaa cctgtgaatt gtacgagtta gtggaggcca tggtcgagaa gggccaggat ggctccgccg tccttgagct cgatctttgc aaccgtgatg tgtccaggat caccttcttc cagaaagatt gtaacaagtt caccacaggt gagaccattg ctcatggtaa agtgggccag ggcatctcgg cctggagcaa gaccttctgc gccctctttg gcccttggtt tcgcgccatt gagaaggcta ttctggcctt gctccctcag ggtgtgtttt acggtgatgc ctttgatgac accgtcttct cagcggctgt ggccgcagca aaggcatcca tggtgtttga gaatgacttt tctgagtttg actccaccca gaataacttt tctttgggtc tagagtgtgc tattatggag gagtgcggga tgccgcaggg gctcatccgc ttgtatcacc 
ttataaggtc tgcgtggatc ctgcaggccc cgaaggagtc tctgctaggg ttttggaaga aacactccgg cgagcccggc actcttctat ggaatactgt ctggaatatg gctgttatta cccactgtta tgacttccgc gatttgcagg tagctgcctt taaaggtgat gattcgatag tgctttgcag tgagtatcgt cagagtccag gagctgctgt cctgatcgcc
ggctgtggct tgaagttgaa ggtagatttc cgcccgatgc gtttgtatgc aggtgttgtg gtggcccccg gccttggcgc gcttcctgat gtcgtgcgct tcgccggccg gcttaccgag aagaattggg gccctggccc tgagcgggcg gacgagctcc gcatcgctgt tagtgacttc ctccgcaagc tcacgaatgt ggctcagatg tgtgtggatg ttgtttcccg tgtttatggg gtttcccctg ggctcgttca taacctgatt ggcatgctac aggctgttgc tgatggcaag gcacatttca ctgagtcagt aaaaccagtg ctcgacctga caaattcaat cttgtgtcgg gtggaatgaa taacatgtct tttgctgcgc ccatgggttc gcgaccatgg gccctcggcc tattttgttg ctgttcctca tgtttctgcc tatgctgctc gcgccaccgc ccggtcagcc gtctggccgc cgtcgtgggc ggcgcagcgg cggttccggc ggtggtttct ggggtgaccg ggttgattct cagcccttcg caatccccta tattcatcca accaacccct tcgccccgaa tgtcaccgct gcggccgggg ctggacctcg tgttcgccaa cccgtccgac cactcggctc cgcttggcgc gaccaggccc agcgccccgc cgctgcctca cgtcgtagac ctaccacagc tggggccgcg ccgctaaccg cggtcgctcc ggcccatgac accccgccag tgcctgatgt cgactcccgc ggcgccatct tgcgccggca gtacaaccta tcaacatctc cccttacctc ttccgtggcc accggtacta acctggttct ttatgccgcc cctcttagtc cgcttttacc ccttcaggac ggtaccaata ctcatataat ggccacggaa gcttctaatt atgcccagta ccgggttgcc cgtgccacga tccgttaccg cccgctggtc cccaacgctg tcggcggtta cgccatctcc atctcattct ggccacagac tacccccacc ccgacgtccg ttgatatgaa ttcaataacc tcgacggatg ttcgcatttt agtccagccc ggcatagcct ctgagcttgt gatcccaagc gagcgcctac actatcgtaa ccaaggttgg cgctctgtcg agacctccgg ggtggccgag gaggaggcca cctccggtct tgttatgctt tgcatacatg gctcacccgt aaattcctat actaatacac cctataccgg tgcccttggg ctgctggact ttgcccttga gcttgagttt cgcaacctta cccccggtaa tactaatacg cgggtctccc gttattccag cactgctcgc caccgccttc gtcgcggtgc ggacgggact gccgagctca ccaccacggc tgctacccgc tttatgaagg acctctattt tactagtact aatggtgttg gtgagatcgg ccgcgggata gccctcaccc tgttcaacct tgctgacact ttgcttggcg gcctgccgac agaattgatt tcgtcggctg gtggccagct gttctactcc cgtcccgttg tctcagccaa tggcgagccg actgttaagc tgtatacatc tgtagagaat gctcagcagg ataagggtat tgcaatcccg aatgatattg acctcggaga atctcgtgtg gttatccagg attatgataa ccaacatgaa caagaccggc cgacgccttc tccagccccg tcgcgccctt tctctgttct tcgagctaat 
gatgtgcttt ggctctctct caccgctgcc gagtatgacc agtccaccta tggctcttcg actggcccag tttatgtttc tgactctgtg accttggtta atgttgccac cggcgcgcag gccgttgccc ggtcgctcga ttggaccaag gtcacacttg acggtcgccc tctctccacc atccagcagt attcgaagat cttctttgtc ctgccgctcc gcgggaagct
ctctttctgg gaggcaggca caactaggcc cgggtaccct tataattaca acaccactgc aagcgaccaa ctgcttgtcg agaatgccgc cgggcaccgg gttgctattt ccacttacac cactagcctg ggtgctggcc ccgtctctat ttctgcggtt gccgtcttag gcccccactc tgcgctagca ttgcttgagg atactttgga ttaccctgcc cgcgcccata cttttgatga cttctgccca gagtgccgcc cccttggcct ccagggctgc gctttccagt ctactgtcgc tgagctccag cgccttaaga tgaaggtggg taaaactcgg gagtta.
<210> SEQ ID No. 2
<211> 27
<212> cDNA
<213> Hepatitis E Virus
<400> SEQ ID No. 2 aggcagaccacatatgtggtcgatgcc
<210> SEQ ID No. 3
<211> 68
<212> cDNA
<213> Hepatitis E Virus
<400> SEQ ID No. 3 tagtttatttgcttgtgccccccttctctctgttgcttatttctcatttctgcgttccgcgctccctg
<210> SEQ ID No. 4
<211> 7194
<212> cDNA
<213> Hepatitis E Virus
<400> SEQ ID No. 4 aggcagacca catatgtggt cgatgccatg gaggcccatc agtttctcaa ggctcccggc 60 atcactactg ctgttgagca ggctgctcta gccacggcca actctgccct ggcgaatgct 120 gtggtagtta ggccttttct ttctcaccag cagattgaga ttctcattaa cctaatgcaa 180 cctcgccagc ttgttttccg ccccgaggtt ttctggaatc aacccatcca gcgtgtcatt 240 cataacgagc tggagcttta ctgccgcgct cgctccggcc gctgtcttga aattggcgcc 300
catccccgct caataaatga taatcctaat gtggtccacc gctgcttcct ccgccctgtt 360 gggcgtgatg ttcagcgctg gtatactgct cccactcgcg ggccggctgc taattgccgc 420 cgttccgcgt tgcgtgggct tcccgctgct gaccgcacat actgcttcga cgggttttct 480 ggctgtagct gccccgccga gacgggtatc gccctttact ccctccatga tatgtcacca 540 tctgatgttg ccgaggccat gttccgccat ggtatgacgc ggctttatgc tgccctccat 600 cttccgcctg aggtcttgct gccccctggc acatatcgca ccgcatcgta tttgctgatt 660 catgacggca ggcgcgttgt ggtgacgtat gagggtgata ctagtgctgg ttacaaccac 720 gatgtctcca acttgcgctc ctggattaga accaccaagg ttaccggaga ccatcccctc 780 gttatcgagc gggttagggc cattggctgc cactttgttc tcttgctcac ggcagccccg 840 gagccatcac ctatgcctta tgttccttac ccccggtcta ccgaggtcta tgtccgatcg 900 atcttcggcc cgggtggtac cccttcctta ttcccaacct catgctccac taagtcgacc 960 ttccacgctg tccctgccca tatttgggac cgtcttatgc tgttcggggc caccttggat 1020 gaccaagcct tttgctgctc ccgtttaatg acctaccttc gcggcattag ctacaaggtc 1080 actgttggta cccttgtggc taatgaaggc cggaacgcct ctgaggacgc cctcacagct 1 140 gtcatcactg ccgcctatct taccatttgc caccagcggt atctccgcac ccaggctata 1200 tccaagggga tccgtcgtct tgaacgggag catgaccaga agtttataac acgcctttac 1260 agctggctct tcgagaagtc cggccgtgat tacatccctg gccgtcagtt ggagttctac 1320 gcccagtgta ggcgctggct ttcggccggc tttcatcttg atccacgtgt actggttttt 1380 gacgagtcgg ccccctgcca ttgtaggact gtgatccgca aggcgctctc gaagttttgc 1440 tgctttatga agtggcttgg tcaggagtgc acctgttttc ttcaacctgc agaaggcgtc 1500 gtcggcgacc agggtcatga taacgaatcc tatgaggggt ccgatgttga ccctgctgag 1560 tccgccatta gtgacatctc tgggtcctat gtcgtccctg gcacagccct ccaaccgctc 1620 taccaggccc tcgatctccc cgatgagatt gtggctcgcg cgtgccggct gaccgccaca 1680 gtaaaggtct cccaggtcga tgggcggatc gattgcgaga cccttcttgg taacaaaacc 1740 ttccgcacgt cgtttgtcga cggggcggtc ttagagacca atggcccaga gcgccacaat 1800 ctctcctttg atgccagtca gagcactatg gccgctggcc ctttcagtct cacctatgcc 1860 gcctctgcag ctgggctgga ggtgcgctat gttggtgccg ggcttgacca tcgggcgatt 1920 tttgcccccg gtgtttcacc ccggtcaaac cccggcgagg tcaccgcctt ctgctctgcc 1980 ctatataggt tcaaccgtga 
ggcccagcgc cattcgctga ccggtaactt atggttccat 2040 cctgaggggc ttattggcct ctttgccccg ttttcgcctg ggcatgtctg ggagtcggct 2100 aaaccattct gtggcgagag cacactttac acccgtactt ggtcggaggt tgatgccgtc 2160 tctagtccaa cccggcccga tttgggtttt atgtctgagc ctcctatacc tagtagggcc 2220 gccacgccta ccttggcggc ccctctaccc ccccttgcac cggacccttc ccctccttct 2280 tctgccccgg cgctcgatga gccggcttct gccgctacct ccggggtccc ggccataacc 2340
caccagacgg cccggcaccg ccgcctgctc ttcacctacc cggatggctc taaggtattc 2400 gccggctcgc tgttcgagtc gacatgcacg tggctcgtta acgcgtctaa tgttgaccac 2460 tgccctggcg gcgggctttg ccatgcattt taccaaaggt accccgcctc ctttgatgct 2520 gcctgttttg tgatgcgcga cggtgcggcc gcgtacacac tgaccccccg gccaataatt 2580 catcgtgtcg cccctgatta taggttggaa cataacccaa agaggcttga ggctgcttat 2640 cgggagactt gttcccgcct cggtaccgct gcatacccgc tcctcgggac cggcatatac 2700 caggtgccga tcggtcccag ctttgacgcc tgggagcgga accaccgccc cggggatgag 2760 ttgtaccttc ctgaacttgc tgccagatgg tttgaggcca ataggccgac ccgcccaact 2820 ctcactataa ctgaggatgc tgcacggaca gcgaatctgg ccatcgagct tgactcagcc 2.880 acagatgtcg gccgggcctg tgctggctgt cgggttaccc ctggcgttgt tcaataccag 2940 tttaccgcag gtgtgcctgg atccggcaag tcccgctcca tcacccgagc cgatgtggac 3000 gttgtcgtgg tcccgacgcg cgagttgcgt aatgcctggc gccgtcgcgg ctttgctgcc 3060 ttcaccccgc acactgccgc tagagtcacc gacgggcgcc gggttgtcat tgatgaggct 3120 ccatccctcc cccctcacct gttgctgctc cacatgcagc gggccgccac cgtccacctt 3180 cttggcgacc cgaatcagat cccagccatc gactttgagc accctgggct cgtccccgcc 3240 atcaggcccg acttagcccc tacctcctgg tggcatgtta cccatcgctg ccctgcggat 3300 gtatgtgagt tgatccgtgg tgcatacccc atgatccaga ccactagccg ggttctccgt 3360 tcgttgttct ggggtgagcc tgccgtcggg cagaaactag tgttcaccca ggcggccaag 3420 cccgccaacc ccggctcagt gacggtccac gattcgcagg gcgctaccta cacttatacc 3480 actattattg ccacagcaga tgcccggggc cttattcagt cgtctcgggc tcatgccatt 3540 gttgctctga cgcgccacac tgagaagtgg gtcatcattg acgcaccagg cctgcttcgc 3600 gaggtgggca tctccgatgc aatcgttaat aactttttcc tcgctggtgg cgaaattggt 3660 catcagcgcc catctgttat tccccgtggc aaccctgacg ccaatgttga caccctggct 3720 gccttcccac cgtcttgcca gattagtgcc ttccatcagt tagctgagga gctcggccac 3780 agacctgccc ctgttgcagc tgttctacca ccctgccccg agctcgaaca gggcctcctc 3840 tatctgcccc aggagctcac cacctgtgat agtgtcgtaa catttgaatt aacagatatt 3900 gtgcactgcc gcatggccgc cccgagccag cgcaaggccg tagtgtccac actcgtgggc 3960 cgctacggcc gtcgcacaaa gctctacaat gcttcccact ctgatgttcg cgactctctc 4020 gcccgtttta 
tccctgccat tggccccgta caggtcacaa cctgtgaatt gtacgagtta 4080 gtggaggcca tggtcgagaa gggccaggat ggctccgccg tccttgagct cgatctttgc 4140 aaccgtgatg tgtccaggat caccttcttc cagaaagatt gtaacaagtt caccacaggt 4200 gagaccattg ctcatggtaa agtgggccag ggcatctcgg cctggagcaa gaccttctgc 4260 gccctctttg gcccttggtt tcgcgccatt gagaaggcta ttctggcctt gctccctcag 4320 ggtgtgtttt acggtgatgc ctttgatgac accgtcttct cagcggctgt ggccgcagca 4380
aaggcatcca tggtgtttga gaatgacttt tctgagtttg actccaccca gaataacttt 4440 tctttgggtc tagagtgtgc tattatggag gagtgcggga tgccgcaggg gctcatccgc 4500 ttgtatcacc ttataaggtc tgcgtggatc ctgcaggccc cgaaggagtc tctgctaggg 4560 ttttggaaga aacactccgg cgagcccggc actcttctat ggaatactgt ctggaatatg 4620 gctgttatta cccactgtta tgacttccgc gatttgcagg tagctgcctt taaaggtgat 4680 gattcgatag tgctttgcag tgagtatcgt cagagtccag gagctgctgt cctgatcgcc 4740 ggctgtggct tgaagttgaa ggtagatttc cgcccgatgc gtttgtatgc aggtgttgtg 4800 gtggcccccg gccttggcgc gcttcctgat gtcgtgcgct tcgccggccg gcttaccgag 4860 aagaattggg gccctggccc tgagcgggcg gacgagctcc gcatcgctgt tagtgacttc 4920 ctccgcaagc tcacgaatgt ggctcagatg tgtgtggatg ttgtttcccg tgtttatggg 4980 gtttcccctg ggctcgttca taacctgatt ggcatgctac aggctgttgc tgatggcaag 5040 gcacatttca ctgagtcagt aaaaccagtg ctcgacctga caaattcaat cttgtgtcgg 5100 gtggaatgaa taacatgtct tttgctgcgc ccatgggttc gcgaccatgg gccctcggcc 5160 tattttgttg ctgttcctca tgtttctgcc tatgctgctc gcgccaccgc ccggtcagcc 5220 gtctggccgc cgtcgtgggc ggcgcagcgg cggttccggc ggtggtttct ggggtgaccg 5280 ggttgattct cagcccttcg caatccccta tattcatcca accaacccct tcgccccgaa 5340 tgtcaccgct gcggccgggg ctggacctcg tgttcgccaa cccgtccgac cactcggctc 5400 cgcttggcgc gaccaggccc agcgccccgc cgctgcctca cgtcgtagac ctaccacagc 5460 tggggccgcg ccgctaaccg cggtcgctcc ggcccatgac accccgccag tgcctgatgt 5520 cgactcccgc ggcgccatct tgcgccggca gtacaaccta tcaacatctc cccttacctc 5580 ttccgtggcc accggtacta acctggttct ttatgccgcc cctcttagtc cgcttttacc 5640 ccttcaggac ggtaccaata ctcatataat ggccacggaa gcttctaatt atgcccagta 5700 ccgggttgcc cgtgccacga tccgttaccg cccgctggtc cccaacgctg tcggcggtta 5760 cgccatctcc atctcattct ggccacagac tacccccacc ccgacgtccg ttgatatgaa 5820 ttcaataacc tcgacggatg ttcgcatttt agtccagccc ggcatagcct ctgagcttgt 5880 gatcccaagc gagcgcctac actatcgtaa ccaaggttgg cgctctgtcg agacctccgg 5940 ggtggccgag gaggaggcca cctccggtct tgttatgctt tgcatacatg gctcacccgt 6000 aaattcctat actaatacac cctataccgg tgcccttggg ctgctggact ttgcccttga 6060 gcttgagttt
cgcaacctta cccccggtaa tactaatacg cgggtctccc gttattccag 6120 cactgctcgc caccgccttc gtcgcggtgc ggacgggact gccgagctca ccaccacggc 6180 tgctacccgc tttatgaagg acctctattt tactagtact aatggtgttg gtgagatcgg 6240 ccgcgggata gccctcaccc tgttcaacct tgctgacact ttgcttggcg gcctgccgac 6300 agaattgatt tcgtcggctg gtggccagct gttctactcc cgtcccgttg tctcagccaa 6360 tggcgagccg actgttaagc tgtatacatc tgtagagaat gctcagcagg ataagggtat 6420
tgcaatcccg aatgatattg acctcggaga atctcgtgtg gttatccagg attatgataa 6480 ccaacatgaa caagaccggc cgacgccttc tccagccccg tcgcgccctt tctctgttct 6540 tcgagctaat gatgtgcttt ggctctctct caccgctgcc gagtatgacc agtccaccta 6600 tggctcttcg actggcccag tttatgtttc tgactctgtg accttggtta atgttgccac 6660 cggcgcgcag gccgttgccc ggtcgctcga ttggaccaag gtcacacttg acggtcgccc 6720 tctctccacc atccagcagt attcgaagat cttctttgtc ctgccgctcc gcgggaagct 6780 ctctttctgg gaggcaggca caactaggcc cgggtaccct tataattaca acaccactgc 6840 aagcgaccaa ctgcttgtcg agaatgccgc cgggcaccgg gttgctattt ccacttacac 6900 cactagcctg ggtgctggcc ccgtctctat ttctgcggtt gccgtcttag gcccccactc 6960 tgcgctagca ttgcttgagg atactttgga ttaccctgcc cgcgcccata cttttgatga 7020 cttctgccca gagtgccgcc cccttggcct ccagggctgc gctttccagt ctactgtcgc 7080 tgagctccag cgccttaaga tgaaggtggg taaaactcgg gagttatagt ttatttgctt 7140 gtgcccccct tctctctgtt gcttatttct catttctgcg ttccgcgctc cctg 7194
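The relationship between the listed sequences can be checked with a short Python sketch. It assumes, as the lengths in the listing suggest, that SEQ ID No. 4 is the 5' non-coding sequence (SEQ ID No. 2) followed by SEQ ID No. 1 and the 3' terminal sequence (SEQ ID No. 3). Only the terminal fragments of SEQ ID No. 4 are reproduced here, and all names are hypothetical.

```python
# SEQ ID No. 2 (27 nt) and SEQ ID No. 3 (68 nt), copied from the listing.
SEQ2 = "aggcagaccacatatgtggtcgatgcc"
SEQ3 = ("tagtttattt" "gcttgtgccc" "cccttctctc" "tgttgcttat"
        "ttctcatttc" "tgcgttccgc" "gctccctg")

# Declared lengths of SEQ ID No. 1 and SEQ ID No. 4.
SEQ1_LENGTH = 7099
SEQ4_LENGTH = 7194

# First 30 and last 74 nucleotides of SEQ ID No. 4 (positions 1-30, 7121-7194).
SEQ4_HEAD = "aggcagacca" "catatgtggt" "cgatgccatg"
SEQ4_TAIL = ("gagttatagt" "ttatttgctt" "gtgcccccct" "tctctctgtt"
             "gcttatttct" "catttctgcg" "ttccgcgctc" "cctg")


def assembly_is_consistent() -> bool:
    """Check that the lengths and termini agree with a SEQ2 + SEQ1 + SEQ3 layout."""
    lengths_ok = len(SEQ2) + SEQ1_LENGTH + len(SEQ3) == SEQ4_LENGTH
    head_ok = SEQ4_HEAD.startswith(SEQ2) and SEQ4_HEAD[len(SEQ2):] == "atg"
    tail_ok = SEQ4_TAIL.endswith(SEQ3)
    return lengths_ok and head_ok and tail_ok


if __name__ == "__main__":
    print("consistent:", assembly_is_consistent())
    print("first coding nucleotide at position", len(SEQ2) + 1)
```

Under this layout the first ATG of SEQ ID No. 1 falls at position 28 of SEQ ID No. 4.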
<210> SEQ ID No. 5
<211> 1693
<212> Protein
<213> Hepatitis E Virus
<400> SEQ ID No. 5
MEAHQFLKAPGITTAVEQAALATANSALANAVVVRPFLSHQQIEILINLM
QPRQLVFRPEVFWNQPIQRVIHNELELYCRARSGRCLEIGAHPRSINDNPN
VVHRCFLRPVGRDVQRWYTAPTRGPAANCRRSALRGLPAADRTYCFDGF
SGCSCPAETGIALYSLHDMSPSDVAEAMFRHGMTRLYAALHLPPEVLLPP
GTYRTASYLLIHDGRRVVVTYEGDTSAGYNHDVSNLRSWIRTTKVTGDH
PLVIERVRAIGCHFVLLLTAAPEPSPMPYVPYPRSTEVYVRSIFGPGGTPSL
FPTSCSTKSTFHAVPAHIWDRLMLFGATLDDQAFCCSRLMTYLRGISYKV
TVGTLVANEGRNASEDALTAVITAAYLTICHQRYLRTQAISKGIRRLEREH
DQKFITRLYSWLFEKSGRDYIPGRQLEFYAQCRRWLSAGFHLDPRVLVFD
ESAPCHCRTVIRKALSKFCCFMKWLGQECTCFLQPAEGVVGDQGHDNES
YEGSDVDPAESAISDISGSYVVPGTALQPLYQALDLPDEIVARACRLTATV
KVSQVDGRIDCETLLGNKTFRTSFVDGAVLETNGPERHNLSFDASQSTMA
AGPFSLTYAASAAGLEVRYVGAGLDHRAIFAPGVSPRSNPGEVTAFCSAL
YRFNREAQRHSLTGNLWFHPEGLIGLFAPFSPGHVWESAKPFCGESTLYT
RTWSEVDAVSSPTRPDLGFMSEPPIPSRAATPTLAAPLPPLAPDPSPPSSAP
ALDEPASAATSGVPAITHQTARHRRLLFTYPDGSKVFAGSLFESTCTWLV
NASNVDHCPGGGLCHAFYQRYPASFDAACFVMRDGAAAYTLTPRPIIHR
VAPDYRLEHNPKRLEAAYRETCSRLGTAAYPLLGTGIYQVPIGPSFDAWE
RNHRPGDELYLPELAARWFEANRPTRPTLTITEDAARTANLAIELDSATD
VGRACAGCRVTPGVVQYQFTAGVPGSGKSRSITRADVDVVVVPTRELRN
AWRRRGFAAFTPHTAARVTDGRRVVIDEAPSLPPHLLLLHMQRAATVHL
LGDPNQIPAIDFEHPGLVPAIRPDLAPTSWWHVTHRCPADVCELIRGAYP
MIQTTSRVLRSLFWGEPAVGQKLVFTQAAKPANPGSVTVHDSQGATYTY
TTIIATADARGLIQSSRAHAIVALTRHTEKWVIIDAPGLLREVGISDAIVN
FFLAGGEIGHQRPSVIPRGNPDANVDTLAAFPPSCQISAFHQLAEELGHRP
APVAAVLPPCPELEQGLLYLPQELTTCDSVVTFELTDIVHCRMAAPSQRK
AVVSTLVGRYGRRTKLYNASHSDVRDSLARFIPAIGPVQVTTCELYELVE
AMVEKGQDGSAVLELDLCNRDVSRITFFQKDCNKFTTGETIAHGKVGQGI
SAWSKTFCALFGPWFRAIEKAILALLPQGVFYGDAFDDTVFSAAVAAAK
ASMVFENDFSEFDSTQNNFSLGLECAIMEECGMPQGLIRLYHLIRSAWILQ
APKESLLGFWKKHSGEPGTLLWNTVWNMAVITHCYDFRDLQVAAFKGD
DSIVLCSEYRQSPGAAVLIAGCGLKLKVDFRPMRLYAGVVVAPGLGALP
DVVRFAGRLTEKNWGPGPERADELRIAVSDFLRKLTNVAQMCVDVVSR
VYGVSPGLVHNLIGMLQAVADGKAHFTESVKPVLDLTNSILCRVE.
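The correspondence between the cDNA of SEQ ID No. 1 and the protein of SEQ ID No. 5 can be illustrated with a minimal Python sketch. It assumes the standard genetic code, and it translates only the first 153 nucleotides of SEQ ID No. 1 as copied from the listing; the function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases t, c, a, g at each position.
_BASES = "tcag"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}

# First 153 nucleotides of SEQ ID No. 1 (51 codons), spacing as in the listing.
ORF1_START_CDNA = (
    "atg gaggcccatc agtttctcaa ggctcccggc atcactactg ctgttgagca "
    "ggctgctcta gccacggcca actctgccct ggcgaatgct gtggtagtta "
    "ggccttttct ttctcaccag cagattgaga ttctcattaa cctaatgcaa"
)


def translate(cdna: str) -> str:
    """Translate a cDNA string in frame 1, stopping at the first stop codon."""
    seq = "".join(cdna.split()).lower()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate(ORF1_START_CDNA))
```

The same function can be applied to any in-frame stretch of the listed cDNA to cross-check individual residues of SEQ ID Nos. 5 to 7 against the nucleotide sequence.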
<210> SEQ ID No. 6
<211> 660
<212> Protein
<213> Hepatitis E Virus
<400> SEQ ID No. 6
MGPRPILLLFLMFLPMLLAPPPGQPSGRRRGRRSGGSGGGFWGDRVDSQP
FAIPYIHPTNPFAPNVTAAAGAGPRVRQPVRPLGSAWRDQAQRPAAASRR
RPTTAGAAPLTAVAPAHDTPPVPDVDSRGAILRRQYNLSTSPLTSSVATGT
NLVLYAAPLSPLLPLQDGTNTHIMATEASNYAQYRVARATIRYRPLVPNA
VGGYAISISFWPQTTPTPTSVDMNSITSTDVRILVQPGIASELVIPSERLHYR
NQGWRSVETSGVAEEEATSGLVMLCIHGSPVNSYTNTPYTGALGLLDFAL
ELEFRNLTPGNTNTRVSRYSSTARHRLRRGADGTAELTTTAATRFMKDLY
FTSTNGVGEIGRGIALTLFNLADTLLGGLPTELISSAGGQLFYSRPVVSANG
EPTVKLYTSVENAQQDKGIAIPNDIDLGESRVVIQDYDNQHEQDRPTPSPA
PSRPFSVLRANDVLWLSLTAAEYDQSTYGSSTGPVYVSDSVTLVNVATG
AQAVARSLDWTKVTLDGRPLSTIQQYSKIFFVLPLRGKLSFWEAGTTRPG
YPYNYNTTASDQLLVENAAGHRVAISTYTTSLGAGPVSISAVAVLGPHSA
LALLEDTLDYPARAHTFDDFCPECRPLGLQGCAFQSTVAELQRLKMKVG
KTREL
<210> SEQ ID No. 7
<211> 123
<212> Protein
<213> Hepatitis E Virus
<400> SEQ ID No. 7
MNNMSFAAPMGSRPWALGLFCCCSSCFCLCCSRHRPVSRLAAVVGGAA
AVPAVVSGVTGLILSPSQSPIFIQPTPSPRMSPLRPGLDLVFANPSDHSAPLG
ATRPSAPPLPHVVDLPQLGPRR.
<210> SEQ ID No. 8
<211> 111
<212> cDNA
<213> Hepatitis E Virus
<400> SEQ ID No. 8 gcuccagcgccuuaagaugaagguggguaaaacucgggaguuauaguuuauuugcuugugccccccuuc ucucuguugcuuauuucucauuucugcguuccgcgcucccug
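SEQ ID No. 8 is written with uracil and has the same length as the region of SEQ ID No. 4 that begins at nucleotide 7084, the start position of the forward primers in Table I. The following Python sketch, offered as an illustration under that assumption, transcribes nucleotides 7084 to 7194 of SEQ ID No. 4 into RNA and compares the result with SEQ ID No. 8. The names are hypothetical.

```python
# Nucleotides 7084-7194 of SEQ ID No. 4, in blocks of ten.
CDNA_7084_7194 = (
    "gctccagcgc" "cttaagatga" "aggtgggtaa" "aactcgggag" "ttatagttta"
    "tttgcttgtg" "ccccccttct" "ctctgttgct" "tatttctcat" "ttctgcgttc"
    "cgcgctccct" "g"
)

# SEQ ID No. 8 as listed, with the internal line-break space removed.
SEQ8_RNA = (
    "gcuccagcgc" "cuuaagauga" "agguggguaa" "aacucgggag" "uuauaguuua"
    "uuugcuugug" "ccccccuucu" "cucuguugcu" "uauuucucau" "uucugcguuc"
    "cgcgcucccu" "g"
)


def transcribe(cdna: str) -> str:
    """Return the sense-strand RNA for a cDNA string (t replaced by u)."""
    return "".join(cdna.split()).lower().replace("t", "u")


if __name__ == "__main__":
    rna = transcribe(CDNA_7084_7194)
    print(len(rna), rna == SEQ8_RNA)
```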