PULMONARY SURFACTANT PROTEINS
Field of the Invention
This invention relates to proteins isolated from human lung lavage, methods for obtaining said proteins and uses thereof.
Background of the Invention
Throughout this application various publications are referenced. Full citations for these publications may be found at the end of the specification. The disclosures of these publications are hereby incorporated by reference in order to more fully describe the state of the art to which this invention pertains.
Hyaline Membrane Disease (HMD) and Respiratory Distress Syndrome (RDS) are synonymous terms denoting the clinical condition of pulmonary dysfunction in premature infants. The disease is attributable to the absence of surface active material (surfactant) which lines the air-alveolar interface in the lung and prevents collapse of the alveoli during respiration. Current therapy is predominantly supportive. However, recent clinical trials indicate that one promising therapy is the instillation of bovine-derived surfactant into the lungs of the neonate.
Surface tension in the alveoli of the lung is lowered by a lipoprotein complex called pulmonary surfactant. This complex consists of phospholipid and 5-10% protein (King, 1982). The protein fraction of the surfactant is composed of nonserum and serum proteins. The major surfactant-associated protein is reportedly a 35,000 dalton nonserum sialoglycoprotein (Shelley et al., 1982; Bhattacharyya et al., 1975; Sueishi and Benson, 1981; King et al., 1973; Katyal and Singh, 1981). This protein reportedly seems to be important for the normal function of the pulmonary surfactant (King et al., 1983; Hawgood et al., 1985). It is present in reduced amounts in amniotic fluid samples taken shortly before the birth of infants who subsequently develop respiratory distress syndrome (Katyal and Singh, 1984; Shelley et al., 1982; King et al., 1975). Recently the biosynthesis of a 35,000 dalton protein in normal human lung tissue was studied and, in an in vitro translation reaction, proteins of 29 and 31 kDa were identified as the primary translation products (Floros et al., 1985). A 35 kDa protein also accumulates in the lungs of patients with alveolar proteinosis (Bhattacharyya and Lynn, 1978; Bhattacharyya and Lynn, 1980a). This protein has the same electrophoretic mobility, immunological determinants and peptide mapping as the 35 kDa protein from normal human broncho-alveolar lavage material (Phelps et al., 1984; Whitsett et al., 1985).
In addition to the above-mentioned proteins, the presence in rat lungs of a number of lower molecular weight surfactant-associated proteins has recently been reported. See D. L. Wang, A. Chandler and A. B. Fisher, Fed. Proc. 44(4): 1024 (1985), Abstract No. 3587 (ca. 9000 dalton rat protein) and S. Katyal and G. Singh, Fed. Proc. 44(6): 1890 (1985), Abstract No. 8639 (10,000 - 12,000 dalton rat protein).
Finally, a Feb. 6, 1985 press release from California Biotechnology Inc. reports the cloning and "detailed manipulation" of "the gene encoding human lung surfactant protein." However, the press release does not characterize that protein or describe the "detailed manipulations." Two other reports of possible surfactant-related proteins have also been published recently, namely, J.A. Whitsett et al., 1986, Pediatr. Res. 20:460 and A. Takahashi et al., 1986, BBRC 135:527.
The present invention relates to a new group of proteins recovered and purified from lung lavage of patients with alveolar proteinosis, methods for obtaining the proteins, corresponding recombinant proteins, antibodies to the proteins for use in diagnostic products, compositions containing the novel proteins, and methods for using the compositions, e.g. in the treatment of infants afflicted with conditions such as Respiratory Distress Syndrome (RDS), as a drug delivery vehicle in the administration of other therapeutic materials to the lungs or other organs and in the treatment of adult RDS, which can occur during cardiopulmonary operations or in other situations when the lungs are filled with fluid and natural pulmonary surfactant production and/or function ceases. While it is possible that one or more of the proteins described hereinafter is similar or identical to proteins discussed in the abovementioned papers, the exact relationship of the protein of this invention to prior proteins cannot at present be confirmed given the inadequacies of the prior disclosures with respect to amino acid or nucleotide sequence data, surfactant activity of prior proteins and the like.
Summary of the Invention
This invention relates to novel proteins useful for enhancing pulmonary surfactant activity, methods for obtaining said proteins and compositions containing one or more of the proteins. The proteins of this invention include the following:
1. A protein characterized by a molecular weight of about 35 kd and by being encoded for by the DNA sequence depicted in Table 1.
2. A protein characterized by a molecular weight of about 35 kd and by being encoded for by the DNA sequence depicted in Table 2.
3. A protein encoded for by the DNA sequence of Table 6 or by a DNA sequence capable of hybridizing thereto and characterized by a molecular weight of about 5.5-9 kd; and
4. A protein characterized by a molecular weight of about 6 kd and an amino acid composition as set forth in Table 4.
Detailed Description of the Invention
composition of the latter 6kd protein is set forth in Table 4.
The two approximately 6kd proteins differ significantly from each other with respect to amino acid composition, as well as from the protein described by Tanaka, Chem. Pharm. Bull. 31:4100 (1983). Additionally, the N-terminal peptide sequence of the cold butanol-insoluble 6 kd protein was determined (Table 5). For the sake of simplicity, both low molecular weight PSP proteins are referred to hereinafter as "6k" proteins based on their approximate apparent molecular weights as determined by conventional SDS-PAGE. It should be understood, however, that the actual molecular weights of these proteins are in the range of 5.5-9 kilodaltons.
The fact that the four proteins can be obtained in pure form by the above-described methods now makes it possible for one to apply conventional methods to elucidate the amino acid composition and sequence of the proteins; to prepare oligonucleotide probes based on the elucidated peptide sequences; to identify genomic DNA or cDNA encoding the proteins by conventional means, e.g., via (a) hybridization of labeled oligonucleotide probes to DNA of an appropriate library (Jacobs et al., 1985), (b) expression cloning (Wong et al., 1985) and screening for surfactant enhancing activity or (c) immunoreactivity of the expressed protein with antibodies to the proteins or fragments thereof; and to produce corresponding recombinant proteins using the identified genomic DNA or cDNA and conventional expression technology, i.e., by culturing genetically engineered host cells such as microbial, insect or mammalian host cells containing the DNA so identified, for instance, transformed with the DNA or with an expression vector containing the DNA.
By way of example, tryptic fragments of one of the two 35 kd proteins were prepared and sequenced. Oligonucleotide probes were synthesized based on the elucidated peptide sequence of the tryptic fragments and were used to screen a lambda gt10 cDNA library made from human lung mRNA. Numerous clones were identified which hybridized to the probes. DNAs from two of these positive clones (PSAP-1 and PSAP-2) were subcloned into M13 for DNA sequencing, thus generating the clones MPSAP-1A and MPSAP-6A. The nucleotide sequence for the cDNA clones encoding each of the two 35kd surfactant proteins was thereby elucidated and is presented above in Tables 1 and 2, respectively. The sequences of subclones encoding the two 35 kd proteins are similar to each other but not identical. The sequence differences result in restriction fragment polymorphism between the two clones with respect to the coding region recognized by the restriction enzyme PstI. Considerably more nucleotide variation between the two clones was found in their 3' untranslated regions. Plasmids PSP35K-1A-10 and PSP35K-6A-8 were constructed by inserting the approximately 940-950 nucleotide EcoRI fragments depicted in Tables 1 and 2, respectively, into the EcoRI site of plasmid SP65 (see infra). PSP35K-1A-10 contains the polylinker site adjacent to the EcoRI site at cDNA position 1, while PSP35K-6A-8 contains the polylinker site adjacent to the EcoRI site at cDNA position 947. PSP35K-1A-10 and PSP35K-6A-8 have been deposited with the American Type Culture Collection (ATCC), Rockville, MD under accession Nos. ATCC 40243 and 40244, respectively.
Additionally, oligonucleotide probes based on the N-terminal sequence of the cold butanol-insoluble 6K protein (See Table 5) were synthesized and were used to screen a cDNA library prepared from human lung mRNA (Toole et al., 1984) as described in greater detail in Example 4, below. Several clones which hybridized to the probes were identified.
Based on hybridization intensity one clone was selected, subcloned into M13 and sequenced. Plasmid PSP6K-17-3 was constructed by inserting the cloned cDNA so identified as an EcoRI fragment into the EcoRI site of plasmid SP65 (D.A. Melton et al., 1984, Nucleic Acids Res., 12:7035-7056). PSP6K-17-3 has been deposited with the ATCC under accession No. ATCC 40245. The nucleotide sequence of the cloned cDNA insert is shown in Table 6.
(-) = Not determined; positions 8, 11 and 12 were unidentified.
As those skilled in the art will appreciate, the cDNA insert in PSP6K-17-3 contains an open reading frame encoding a protein having a molecular weight of over 40kd. It is presently believed that the primary translation product is further processed, e.g., by Type II pneumocytes (Alveolar Type II cells), to yield the approximately 6K protein. It is contemplated that the cloned cDNA, portions thereof or sequences capable of hybridizing thereto may be expressed in host cells or cell lines by conventional expression methods to produce "recombinant" proteins having surfactant or surfactant enhancing activity.
With respect to the cloned approximately 6K protein, this invention encompasses vectors containing a heterologous DNA sequence encoding the characteristic peptide sequence Ile through Cys corresponding to nucleotides A-656 through C-757 of the sequence shown in Table 6, i.e., IKRIQAMIPKGALAVAVAQVCRVVPLVAGGICQC. One such vector contains the nucleotide sequence
ATC AAG CGG ATC CAA GCC ATG ATT CCC AAG GGT GCG CTA GCT GTG GCA GTG GCC CAG GTG TGC CGC GTG GTA CCT CTG GTG GCG GGC GGC ATC TGC CAG TGC
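As a consistency check, the 102-nucleotide sequence printed above can be translated with the standard genetic code and compared with the stated Ile-through-Cys peptide. The following Python sketch is illustrative only: the names `CODON_TABLE`, `translate`, `INSERT` and `PEPTIDE` are introduced here for the example and do not appear in the specification.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence (whitespace ignored) to one-letter amino acids."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Nucleotide sequence of the exemplary vector insert, copied from the text above.
INSERT = (
    "ATC AAG CGG ATC CAA GCC ATG ATT CCC AAG GGT GCG CTA GCT GTG GCA GTG "
    "GCC CAG GTG TGC CGC GTG GTA CCT CTG GTG GCG GGC GGC ATC TGC CAG TGC"
)

PEPTIDE = translate(INSERT)
print(PEPTIDE)
```

Under these assumptions the translation reproduces the 34-residue peptide given in the text, which is consistent with the span of nucleotides 656 through 757 (102 nucleotides, 34 codons).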
Other vectors of this invention contain a heterologous DNA sequence encoding the characteristic peptide sequence substantially as depicted in the underlined peptide region of Table 6, i.e., FPIPLPYCWLCRALIKRIQAMIPKGALAVAVAQVCRVVPLVAGGICQCLAERYSVILLDTLLGRML. One such vector contains the DNA sequence substantially as depicted in the underlined nucleotide sequence of Table 6, i.e.,
TTC CCC ATT CCT CTC CCC TAT TGC TGG CTC TGC AGG GCT CTG ATC AAG CGG ATC CAA GCC ATG ATT CCC AAG GGT GCG CTA GCT GTG GCA GTG GCC CAG GTG TGC CGC GTG GTA CCT CTG GTG GCG GGC GGC ATC TGC CAG TGC CTG GCT GAG CGC TAC TCC GTC ATC CTG CTC GAC ACG CTG CTG GGC ATG CTG
Another exemplary vector contains a heterologous DNA sequence, such as the nucleotide sequence depicted in Table 6, which encodes the full-length peptide sequence of Table 6. DNA inserts for such vectors which comprise a DNA sequence shorter than the full-length cDNA of PSP6K-17-3, depicted in Table 6, may be synthesized by known methods, e.g. using an automated DNA synthesizer, or may be prepared from the full-length cDNA sequence by conventional methods such as loop-out mutagenesis or cleavage with restriction enzymes and ligation. Vectors so prepared may be used to express the subject proteins by conventional means or may be used in the assembly of vectors with larger cDNA inserts. In the former case the vector will also contain a promoter to which the DNA insert is operatively linked and may additionally contain an amplifiable and/or selectable marker, all as is well known in the art.
The proteins of this invention may thus be produced by recovering and purifying the naturally occurring proteins from human pulmonary lavage material as described herein. Alternatively, the corresponding "recombinant" proteins may be produced by expression of the DNA sequence encoding the desired protein by conventional expression methodology using microbial, insect or, preferably, mammalian host cells. Suitable vectors as well as methods for inserting therein the desired DNA are well known in the art. Suitable host cells for transfection or transformation by such vectors and expression of the cDNA are also known in the art.
Mammalian cell expression vectors, for example, may be synthesized by techniques well known to those skilled in this art. The components of the vectors such as the bacterial replicons, selection genes, enhancers, promoters, and the like may be obtained from natural sources or synthesized by known procedures. See Kaufman, Proc. Natl. Acad. Sci. 82: 689-693 (1985).
Established cell lines, including transformed cell lines, are suitable as hosts. Normal diploid cells, cell strains derived from in vitro culture of primary tissue, as well as primary explants are also suitable. Candidate cells need not be genotypically deficient in the selection gene so long as the selection gene is dominantly acting.
The host cells preferably will be established mammalian cell lines. For stable integration of vector DNA into chromosomal DNA, and for subsequent amplification of integrated vector DNA, both by conventional methods, CHO (Chinese Hamster Ovary) cells are generally preferred. Alternatively, the vector DNA may include all or part of the bovine papilloma virus genome (Lusky et al., Cell, 36:391-401 (1984)) and be carried in cell lines such as C127 mouse cells as a stable episomal element. Other usable mammalian cell lines include HeLa, COS-1 monkey cells, mouse L-929 cells, 3T3 lines derived from Swiss, Balb-c or NIH mice, BHK or HaK hamster cell lines and the like. Cell lines derived from Alveolar Type II cells may be preferred in certain cases such as expression of the 6K protein (alone or with one or more other proteins of this invention) using the cDNA insert from PSP6K-17-3 or a fragment thereof.
Stable transformants then are screened for expression of the product by standard immunological or enzymatic assays. The presence of the DNA encoding the proteins may be detected by standard procedures such as Southern blotting. Transient expression of the DNA encoding the proteins during the several days after introduction of the expression vector DNA into suitable host cells such as COS-1 monkey cells is measured without selection by activity or immunological assay of the proteins in the culture medium.
In the case of bacterial expression, the DNA encoding the protein may be further modified to contain preferred codons for bacterial expression as is known in the art and preferably is operatively linked in-frame to a nucleotide sequence encoding a secretory leader polypeptide permitting bacterial secretion of the mature variant protein, also as is known in the art. The compounds expressed in mammalian, insect or microbial host cells may then be recovered, purified, and/or characterized with respect to physicochemical, biochemical and/or clinical parameters, all by known methods.
One or more of the proteins of this invention may be combined with a pharmaceutically acceptable fatty acid or lipid such as dipalmitoylphosphatidyl choline or with mixtures of such fatty acids or lipids which may be obtained from commercial sources or by conventional methods, or with natural surfactant lipids to provide a formulated pulmonary surfactant composition. Natural surfactant lipids may be extracted by known methods from lung lavage, e.g. bovine or human lung lavage. Typically the weight ratios of total lipids to total proteins in the composition will be about 20:1 to about 100:1. At the levels currently being tested in clinical trials, one dose of the surfactant composition corresponds to 1-2 mg of total protein and 98-99 mg of total lipid.
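The relationship between the stated dose and the stated lipid-to-protein weight ratios can be checked with simple arithmetic. The Python sketch below is illustrative only; the function names are hypothetical, and the pairing of 98 mg lipid with 2 mg protein and 99 mg lipid with 1 mg protein (a 100 mg total dose) is an assumption read from the figures in the paragraph above.

```python
def lipid_to_protein_ratio(lipid_mg: float, protein_mg: float) -> float:
    """Weight ratio of total lipid to total protein in a formulated dose."""
    if protein_mg <= 0:
        raise ValueError("protein mass must be positive")
    return lipid_mg / protein_mg


def within_typical_range(ratio: float, low: float = 20.0, high: float = 100.0) -> bool:
    """True if the ratio falls in the 'about 20:1 to about 100:1' range from the text."""
    return low <= ratio <= high


# Assumed dose compositions (mg lipid, mg protein), each totalling 100 mg.
DOSES = [(98.0, 2.0), (99.0, 1.0)]

RATIOS = [lipid_to_protein_ratio(lipid, protein) for lipid, protein in DOSES]
for (lipid, protein), ratio in zip(DOSES, RATIOS):
    print(f"{lipid:g} mg lipid : {protein:g} mg protein -> {ratio:g}:1, "
          f"in range: {within_typical_range(ratio)}")
```

On these assumptions the clinical-trial dose corresponds to ratios of 49:1 and 99:1, both inside the typical range quoted above.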
It is contemplated that certain subcombinations of (1) the 35 kd protein encoded for by the nucleotide sequence of Table 1, (2) the 35 kd protein encoded for by the nucleotide sequence of Table 2, (3) the 6 kd protein encoded by the cDNA sequence of Table 6 and having the amino acid composition set forth in Table 3 and (4) the 6 kd protein having the amino acid composition set forth in Table 4 may be especially useful in the treatment of patients with particular clinical indications. Thus, this invention specifically contemplates the following subcombinations and compositions containing such subcombinations:
(a) proteins (1), (2), (3) or (4);
(b) proteins (1) and (2);
(c) proteins (1) and (3);
(d) proteins (1) and (4);
(e) proteins (2) and (3);
(f) proteins (2) and (4);
(g) proteins (1), (2), and (3);
(h) proteins (2), (3), and (4);
(i) proteins (3) and (4);
(j) proteins (1), (2), and (4); and
(k) proteins (1), (3) and (4).
At present compositions containing proteins (3) and/or (4) are preferred.
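The lettered subcombinations can be compared against a systematic enumeration. The Python sketch below is illustrative only (the variable names are hypothetical); it checks that items (b) through (k) are exactly the two- and three-member subsets of proteins (1) through (4), and counts how many of them contain protein (3) and/or (4).

```python
from itertools import combinations

PROTEINS = (1, 2, 3, 4)

# Items (b) through (k) as listed in the text, written as sorted tuples.
LISTED = {
    (1, 2), (1, 3), (1, 4), (2, 3), (2, 4),   # (b)-(f)
    (1, 2, 3), (2, 3, 4), (3, 4),             # (g)-(i)
    (1, 2, 4), (1, 3, 4),                     # (j)-(k)
}

PAIRS = set(combinations(PROTEINS, 2))
TRIPLES = set(combinations(PROTEINS, 3))

# Subcombinations that include protein (3) and/or protein (4).
WITH_3_OR_4 = sorted(c for c in LISTED if 3 in c or 4 in c)

print(len(PAIRS), len(TRIPLES), LISTED == PAIRS | TRIPLES, len(WITH_3_OR_4))
```

Item (a) covers the four single proteins, so together with the six pairs and four triples the list names fourteen compositions; the combination of all four proteins is not separately listed.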
EXPERIMENTAL EXAMPLES
Example 1
Isolation and Characterization of the 35Kd Surfactant Associated Proteins
Pulmonary lavage (50 ml) from an alveolar proteinosis patient was centrifuged at 10,000 x g for 5 min. The pellet was collected and washed 5 times in 20 mM Tris HCl, 0.5 M NaCl, pH 7.4. The lipids and lipid-associated proteins were extracted from the washed pellet by shaking with 50 ml 1-butanol for 1 hr at room temperature. The butanol-insoluble material was collected by centrifugation, washed with distilled water and dissolved in 50 mM sodium phosphate, pH 6.0 and 6M guanidine HCl. The protein was applied to a Vydac C4 reverse phase column and eluted with a gradient of acetonitrile:2-propanol (2:1, v:v) containing 0.1% trifluoroacetic acid. The major protein peak eluting at 50% B was collected and evaporated to dryness. The proteins present were analyzed by SDS-PAGE (Laemmli, 1970).
Alkylation and Tryptic Mapping
The protein so obtained (approx. 50ug) was taken up and reduced in 200mM Tris, 1mM EDTA, 6M guanidine HCl, 20mM DTT, pH 8.5 at 37°C for 2 hrs. Solid iodoacetamide was added to a final concentration of 60mM and the reaction incubated at 0°C for 2 hrs under argon in the dark. The reaction was stopped and the reagents removed by dialysis into 0.1M NH4HCO3, 50mM 2-mercaptoethanol, pH 7.5 followed by further dialysis into 100mM NH4HCO3, pH 7.5. The alkylated protein was digested with trypsin (3% trypsin by weight) at 37°C for 16 hrs and the digest chromatographed over a C18 Vydac reverse phase HPLC column (4.6x250mm).
The tryptic peptides were eluted with a linear gradient of 95% acetonitrile and 0.1% TFA, collected and subjected to N-terminal Edman degradation using an Applied Biosystems Model 470A protein sequencer. The PTH-amino acids were analyzed by the method of Hunkapiller and Hood (1983). Sequence data so obtained for tryptic fragments T19, T26 and T28 are presented below in Table 7.
Example 2
Isolation and Characterization of the Low Molecular Weight Lipid Associated Proteins
The butanol extract obtained in Example 1 was stored at -20°C, causing precipitation of one of the low MW proteins. The precipitate was collected by centrifugation and dried under vacuum. The butanol layer containing butanol-soluble protein was evaporated to dryness. The precipitated cold butanol-insoluble protein and the cold butanol-soluble protein were then purified in parallel by the same method as follows. Each crude protein was separately dissolved in CHCl3:MeOH (2:1, v/v), applied to Sephadex LH20 columns and eluted with CHCl3:MeOH (2:1). The proteins were then analyzed by SDS-PAGE. Fractions containing the protein were pooled and evaporated to dryness. Amino acid composition was determined by hydrolysis in 6 N HCl at 110°C for 22 hrs followed by chromatography on a Beckman model 63000 amino acid analyzer. N-terminal sequence was determined on an Applied Biosystems 470A sequencer. Molecular weights were determined on 10-20% gradient SDS polyacrylamide gels.
Example 3
Screening of the cDNA Library and Sequencing of Clones for the 35Kd Proteins
Based on the amino acid sequence of tryptic fragment T28 (Table 7), an oligonucleotide probe was synthesized. The probe consisted of four pools of 20 mers and each pool contained 32 different sequences. The sequences of the 20 mers are depicted in Table 8.
A cDNA library from human lung mRNA was prepared as described in Toole et al. (1984) and screened with the total mixture of the four pools using tetramethylammonium chloride as a hybridization solvent (Jacobs et al., 1985). Between 0.5-1% of the phage clones were positive with this probe.
DNA from two of these clones was subcloned into M13 for DNA sequence analysis. By using Pool II as a sequencing primer, the nucleotide sequence corresponding to tryptic fragment T26 was identified in both clones, confirming that the isolated clones code for the major protein species found in the partially purified 35kd protein from lavage material of alveolar proteinosis patients (see above).
The two clones differed in nucleotide sequence at three positions out of 250 nucleotides. Both clones were completely sequenced by generating an ordered set of deletions with Bal 31 nuclease, recloning into other M13 vectors and sequencing via the dideoxynucleotide chain termination procedure (Vieira and Messing, 1982; Sanger et al., 1977). One clone corresponded to a full-length copy of the type referred to as 1A (Table 1), the second to an incomplete copy of the type referred to as 6A (Table 2). By using an oligonucleotide specific for type 6A, a full-length clone of this type was identified. The 5' EcoRI fragment of the lambda gt10 cDNA clone was subcloned into M13 and sequenced as above by using specific oligonucleotides as primers. This sequence is presented in Table 2. The two clones differ within the coding region at 7 nucleotides which led to amino acid changes and at 6 nucleotides which resulted in silent changes. These changes result in restriction fragment polymorphism between the two clones within the coding region for the restriction enzyme PstI. Clone 6A has 2 PstI sites at the nucleotide positions 454 and 478 (Table 2) and clone 1A has 3 PstI sites at 454, 478 and 756 (Table 1). Additional DNA sequencing of each clone at the 3' untranslated region revealed a large 1kb untranslated region and considerably more nucleotide variation between MPSAP-1A and MPSAP-6A.
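The PstI polymorphism described above can be illustrated by computing the fragments expected from each clone. The Python sketch below is a rough illustration only: it assumes a linear EcoRI fragment of 947 nucleotides for both clones (a figure taken from the cDNA position stated for PSP35K-6A-8; the text gives the fragments only as approximately 940-950 nucleotides), and it treats each reported site position as the cut point, ignoring the exact cleavage offset within the recognition sequence. The function name is hypothetical.

```python
def fragment_lengths(total_length: int, cut_positions: list[int]) -> list[int]:
    """Lengths of the fragments produced by cutting a linear molecule at the given positions."""
    cuts = sorted(set(cut_positions))
    if any(not 0 < cut < total_length for cut in cuts):
        raise ValueError("cut positions must lie strictly inside the molecule")
    bounds = [0, *cuts, total_length]
    return [end - start for start, end in zip(bounds, bounds[1:])]


ASSUMED_INSERT_LENGTH = 947  # assumption; stated only as approximately 940-950 nt

# PstI site positions as reported for each clone.
FRAGMENTS_6A = fragment_lengths(ASSUMED_INSERT_LENGTH, [454, 478])
FRAGMENTS_1A = fragment_lengths(ASSUMED_INSERT_LENGTH, [454, 478, 756])

print("6A:", FRAGMENTS_6A)
print("1A:", FRAGMENTS_1A)
```

Under these assumptions the additional site at position 756 in clone 1A splits the 3'-most fragment of clone 6A into two smaller fragments, which is the kind of difference a restriction digest would reveal.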
DNA Binding and Hybrid Selection
Very dilute (10-15 ug/ml) single stranded DNA from either subclone MPSAP-1A or MPSAP-1B was applied to nitrocellulose paper (10 ug/cm2) under vacuum (Kafatos et al., 1979). MPSAP-1A represents the M13 subclone of a 0.9kb EcoRI fragment in one orientation in M13. MPSAP-1B represents the same fragment cloned in the opposite orientation in M13. Each filter (1cm2) was cut into nine equal size pieces and each piece was used in a 20-30ul hybridization reaction. Each reaction contained human lung RNA (5mg/ml), 50% deionized formamide (Fluka AG Chemical Corp.), 10mM PIPES [Piperazine-N,N'-bis(2-ethanesulfonic acid)] pH 6.4 and 0.4M NaCl (Miller et al., 1983). The source and preparation of the RNA have been reported previously (Floros et al., 1985). Each hybridization reaction was routinely incubated at 50°C for 3 hrs. At the end of the incubation period each filter was washed five times, for five minutes each, with 1ml of 1X SSC (0.15M NaCl, 0.015M sodium citrate), 0.5% SDS at 60°C. It was then washed three times, for five minutes each, with 1ml of 2mM EDTA, pH 7.9 at 60°C. The selected RNA was eluted by boiling for one minute in 300ul of 1mM EDTA pH 7.9 and 10ug of yeast tRNA (Boehringer Mannheim). The precipitated RNA was translated, immunoprecipitated and subjected to one and two dimensional gel electrophoresis as described in Floros et al., 1985.
Example 4
Screening of the cDNA Library and Sequencing of Clones for the 6Kd Proteins
Based on the first six amino acids of the sequence shown in Table 5 an oligonucleotide probe was synthesized. The probe consisted of six pools of 17 mers. Three of the pools each contained 128 different sequences, and three of the pools each contained 64 different sequences. Based on the first seven amino acids two pools of 20 mers were synthesized. These pools contained either 384 or 192 different sequences.
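The size of the 17 mer pools can be related to the degeneracy of the genetic code. The Python sketch below is illustrative only. Table 5 is not reproduced in this text, so the sketch assumes that its first six residues are Phe-Pro-Ile-Pro-Leu-Pro, i.e. the start of the underlined peptide of Table 6 quoted in the Detailed Description; that assumption, and all names in the code, are introduced here and are not statements of the specification.

```python
from itertools import product
from math import prod

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODONS_FOR: dict[str, list[str]] = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))


def probe_degeneracy(peptide: str, probe_length: int) -> int:
    """Number of distinct oligonucleotides of probe_length nt that can encode the start of peptide."""
    full, partial = divmod(probe_length, 3)
    if full + (1 if partial else 0) > len(peptide):
        raise ValueError("peptide is too short for the requested probe length")
    counts = [len(CODONS_FOR[aa]) for aa in peptide[:full]]
    if partial:
        # Only the first `partial` bases of the last codon are part of the probe.
        counts.append(len({codon[:partial] for codon in CODONS_FOR[peptide[full]]}))
    return prod(counts)


ASSUMED_N_TERMINUS = "FPIPLP"  # assumption: first six residues of Table 5
DEGENERACY_17MER = probe_degeneracy(ASSUMED_N_TERMINUS, 17)
POOL_TOTAL = 3 * 128 + 3 * 64  # six 17 mer pools as described in the text

print(DEGENERACY_17MER, POOL_TOTAL)
```

Under the stated assumption, the full degeneracy of a 17 mer covering six residues (2 x 4 x 3 x 4 x 6 x 1) equals the combined size of the six pools, which would be consistent with the pools jointly covering every possible coding sequence.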
A cDNA library from human lung mRNA was prepared as described in Toole et al. (1984) and screened with the total mixture of the six pools using tetramethylammonium chloride as a hybridization solvent (Jacobs et al., 1985). Approximately 100,000 phage were screened, and 100 phage which hybridized to the probe were plaque purified. The phage were then pooled into groups of 25 and screened with the individual 17 mer and 20 mer pools. Six phage which hybridized most intensely to one of the 20 mer oligonucleotide probes and one of the corresponding 17 mer pools (pool 1447 containing 128 different sequences) were plaque purified. The 17 mer pool 1447 was divided into four pools of 32 different sequences and hybridized to a dot blot of DNA prepared from these phage.
Based on the hybridization intensity, DNA from one of these six phage was subcloned into M13 for DNA sequence analysis. A sequence corresponding in identity and position to the amino acids shown in Table 5 was obtained, confirming that the isolated clone coded for the approximately 6kd cold butanol-insoluble protein found in the lavage material of alveolar proteinosis patients (see above).
The first clone obtained was presumed to be an incomplete copy of the mRNA because it lacked an initiating methionine, and was used to isolate longer clones. Two clones were completely sequenced by generating an ordered set of deletions with Bal 31 nuclease, recloning into other M13 vectors and sequencing via the dideoxynucleotide chain termination procedure (Vieira and Messing, 1982; Sanger et al., 1977). One clone corresponded to a full-length copy of the type referred to as 17 (Table 6); the second began at nucleotide 148 of clone 17. Sequence of the 5' end of a third clone confirmed the sequence of the 5' end of clone 17. The clones are identical throughout the coding region and differ only at two positions in the 3' untranslated region.
REFERENCES
1. Bhattacharyya, S.N., and Lynn, W.S. (1978) Biochim. Biophys. Acta 537, 329-335
2. Bhattacharyya, S.N., and Lynn, W. S. (1980) Biochim. Biophys. Acta 625, 451-458
3. Bhattacharyya, S.N., Passero, M.A., DiAugustine, R.P., and Lynn, W. S. (1975) J. Clin. Invest. 55, 914-920
4. Floros, J., Phelps, D.S., and Taeusch, W.H. (1985) J. Biol. Chem. 260, 495-500
5. Hawgood, S., Benson, B. J., and Hamilton, Jr. R. L. (1985) Biochemistry 24, 184-190
6. Hunkapiller, M. W. and Hood, L. E. (1983) Methods in Enzymology 91, 486-
7. Jacobs, K., Shoemaker, C., Rudersdorf, R., Neil, S. D., Kaufman, R. J., Mufson, A., Seehra, J., Jones, S. S., Hewick, R., Fritsch, E. E., Kawakita, M., Shimizu, T., and Miyake, T. (1985) Nature (Lond.) 313, 806-810.
8. Kafatos, E., Jones, W. C., and Efstratiadis, A. (1979) Nucleic Acids Res. 7, 1541-1552.
9. Katyal, S. L., Amenta, J. S., Singh, G., and Silverman, J. A. (1984) Am. J. Obstet. Gynecol. 148, 48-53.
10. Katyal, S. L. and Singh, G. (1981) Biochim. Biophys. Acta 670, 323-331.
11. King, R. J., Carmichael, M. C., and Horowitz, P.M. (1983) J. Biol. Chem. 258, 10672-10680.
12. King, R. J. (1982) J. Appl. Physiol. Exercise Physiol. 53, 1-8.
13. King, R. J., Klass, D. J., Gikas, E. G., and Clements, J. A. (1973) Am. J. Physiol. 224, 788-795.
14. King, R. J., Ruch, J., Gikas, E. G., Platzker, A. C. G., and Creasy, R. K. (1975) J. of Applied Phys. 39, 735-741.
15. Laemmli, U. K. (1970) Nature (Lond.) 227, 680-685.
16. Miller, J. S., Paterson, B. M., Ricciardi, R. P., Cohen, L., and Roberts, B. E. (1983) Methods in Enzymology 101, 650-674.
17. Phelps, D. S., Taeusch, W. H., Benson, B., and Hawgood, S. (1984) Biochim. Biophys. Acta 791, 226-238.
18. Shelley, S. A., Balis, J. U., Paciga, J. E., Knuppel, R. A., Ruffolo, E. H., and Bouis, P. J. (1982) Am. J. Obstet. Gynecol. 144, 224-228.
19. Sigrist, H., Sigrist-Nelson, K., and Gither, G. (1977) BBRC 74, 178-184.
20. Sueishi, K., and Benson, G. J. (1981) Biochim. Biophys. Acta 665, 442-453.
21. Toole, J. J., Knopf, J. L., Wozney, J. M., Sultzman, L. A., Bucker, J. L., Pittman, D. D., Kaufman, R. J., Brown, E., Shoemaker, C., Orr, E. C., Amphlett, G. W., Foster, W. G., Coe, M. L., Knutson, G. L., Eass, D. N., Hewick, R. M. (1984) Nature (Lond.) 312, 342-347.
22. Whitsett, J. A., Hull, W., Ross, G., and Weaver, T. (1985) Pediatric Res. 19, 501-508.
23. Wong, G.G. et al. (1985) Science 228, 810-815.
INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)
(51) International Patent Classification 4: C07H 21/00, C12N 15/00 A1
(11) International Publication Number: WO 87/ 0203
(43) International Publication Date: 9 April 1987 (09.04.8
(21) International Application Number: PCT/US86/02034
(22) International Filing Date: 26 September 1986 (26.09.86)
(31) Priority Application Numbers: 781,130; 897,183
(32) Priority Dates: 26 September 1985 (26.09.85); 15 August 1986 (15.08.86)
(33) Priority Country: US
(74) Agents: BERSTEIN, David, L.; Genetics Institute, Inc., 87 CambridgePark Drive, Cambridge, MA 02140 (US) et al.
(81) Designated States: AT (European patent), BE (European patent), CH (European patent), DE (European patent), FR (European patent), GB (European patent), IT (European patent), JP, LU (European patent), NL (European patent), SE (European patent)
Published
With international search report.
(71) Applicants: GENETICS INSTITUTE, INC. [US/US]; 87 CambridgePark Drive, Cambridge, MA 02140-2387 (US). BRIGHAM AND WOMEN'S HOSPITAL [US/US]; 75 Francis Street, Boston, MA 02155 (US).
(72) Inventors: TAEUSCH, H., William, Jr.; 20 Chapel Street, Apt. 103A, Brookline, MA 02146 (US). JACOBS, Kenneth, A.; 151 Beaumont Avenue, Newton, MA 02160 (US). STEINBRINK, D., Randall; 45 Marion Road, Belmont, MA 02178 (US). FLOROS, Joanna; PHELPS, Davis, S.; 38 Greaton Road, West Roxbury, MA 02132 (US).
(54) Title: PULMONARY SURFACTANT PROTEINS
(57) Abstract
Novel proteins useful for enhancing pulmonary surfactant activity, methods for obtaining said proteins and compositions containing one or more of the proteins. The proteins of this invention include two separate proteins characterized by molecular weights of about 35 kd and two separate proteins characterized by molecular weights of about 5.5-9 kd.
PULMONARY SURFACTANT PROTEINS
Field of the Invention
This invention relates to proteins isolated from human lung lavage, methods for obtaining said proteins and uses thereof.
Background of the Invention
Throughout this application various publications are referenced. Full citations for these publications may be found at the end of the specification. The disclosure of these publications are hereby incorporated by reference in order to more fully describe the state of the art to which this invention pertains.
Hyaline Membrane Disease (HMD) and Respiratory Distress Syndrome (RDS) are synonymous terms denoting the clinical condition of pulmonary dysfunction in premature infants. The disease is attributable to the absence of surface active material (surfactant) which lines the air-alveolar interface in the lung and prevents collapse of the alveoli during respiration. Current therapy is predominantly supportive. However, recent clinical trials indicate that one promising therapy is the instillation of bovine-derived surfactant into the lungs of the neonate.
Surface tension in the alveoli of the lung is lowered by a lipoprotein complex called pulmonary surfactant. This complex consists of phospholipid and 5-10% protein (King, 1982). The protein fraction of the surfactant is composed of nonserum and serum proteins. The major surfactant-associated protein is reportedly a 35,000 dalton nonserum sialoglycoprotein (Shelley et al., 1982; Bhattacharyya et al., 1975; Sueishi and Benson, 1981; King et al., 1973; Katyal and Singh, 1981). This protein reportedly seems to be important for the normal function of the pulmonary surfactant (King et al., 1983; Hawgood et al., 1985). It is present in reduced amounts in amniotic fluid samples taken shortly before the birth of infants who subsequently develop respiratory distress syndrome (Katyal and Singh, 1984; Shelley et al., 1982; King et al., 1975). Recently the biosynthesis of a 35,000 dalton protein in normal human lung tissue was studied and, in an in vitro translation reaction, proteins of 29 and 31 kDa were identified as the primary translation products (Floros et al., 1985). A 35kDa protein also accumulates in the lungs of patients with alveolar proteinosis (Bhattacharyya and Lynn, 1978; Bhattacharyya and Lynn, 1980a). This protein has the same electrophoretic mobility, immunological determinants and peptide mapping as the 35kDa protein from normal human broncho-alveolar lavage material (Phelps et al., 1984; Whitsett et al., 1985).
In addition to the above-mentioned proteins, the presence in rat lungs of a number of lower molecular weight surfactant-associated proteins has recently been reported. See D. L. Wang, A. Chandler and A. B. Fisher, Fed. Proc. 44(4): 1024 (1985), Abstract No. 3587 (ca. 9000 dalton rat protein) and S. Katyal and G. Singh, Fed. Proc. 44(6): 1890 (1985), Abstract No. 8639 (10,000 - 12,000 dalton rat protein).
Finally, a Feb. 6, 1985 press release from California Biotechnology Inc. reports the cloning and "detailed manipulation" of "the gene encoding human lung surfactant protein." However, the press release does not characterize that protein or describe the "detailed manipulations." Two other reports of possible surfactant-related proteins have also been published recently, namely, J.A. Whitsett et al., 1986, Pediatr. Res. 20:460 and A. Takahashi et al., 1986, BBRC 135:527.
The present invention relates to a new group of proteins recovered and purified from lung lavage of patients with alveolar proteinosis, methods for obtaining the proteins, corresponding recombinant proteins, antibodies to the proteins for use in diagnostic products, compositions containing the novel proteins, and methods for using the compositions, e.g. in the treatment of infants afflicted with conditions such as Respiratory Distress Syndrome (RDS), as a drug delivery vehicle in the administration of other therapeutic materials to the lungs or other organs and in the treatment of adult RDS, which can occur during cardiopulmonary operations or in other situations when the lungs are filled with fluid and natural pulmonary surfactant production and/or function ceases. While it is possible that one or more of the proteins described hereinafter is similar or identical to proteins discussed in the abovementioned papers, the exact relationship of the protein of this invention to prior proteins cannot at present be confirmed given the inadequacies of the prior disclosures with respect to amino acid or nucleotide sequence data, surfactant activity of prior proteins and the like.
Summary of the Invention
This invention relates to novel proteins useful for enhancing pulmonary surfactant activity, methods for obtaining said proteins and compositions containing one or more of the proteins. The proteins of this invention include the following:
1. A protein characterized by a molecular weight of about 35 kd and by being encoded for by the DNA sequence depicted in Table 1.
2. A protein characterized by a molecular weight of about 35 kd and by being encoded for by the DNA sequence depicted in Table 2.
3. A protein encoded for by the DNA sequence of Table 6 or by a DNA sequence capable of hybridizing thereto and characterized by a molecular weight of about 5.5-9 kd; and
4. A protein characterized by a molecular weight of about 6 kd and an amino acid composition as set forth in Table 4.
Detailed Description of the Invention
composition of the latter 6kd protein is set forth in Table 4.
The two approximately 6kd proteins differ significantly from each other with respect to amino acid composition, as well as from the protein described by Tanaka, Chem. Pharm. Bull. 311:4100 (1983). Additionally, the N-terminal peptide sequence of the cold butanol-insoluble 6 kd protein was determined (Table 5). For the sake of simplicity, both low molecular weight PSP proteins are referred to hereinafter as "6k" proteins based on their approximate apparent molecular weights as determined by conventional SDS-PAGE. It should be understood, however, that the actual molecular weights of these proteins are in the range of 5.5-9 kilodaltons.
The fact that the four proteins can now be obtained in pure form by the above-described methods now makes it possible for one to apply conventional methods to elucidate the amino acid composition and sequence of the proteins; to prepare oligonucleotide probes based on the elucidated peptide sequences; to identify genomic DNA or cDNA encoding the proteins by conventional means, e.g., via (a) hybridization of labeled oligonucleotide probes to DNA of an appropriate library (Jacobs et al., 1985), (b) expression cloning (Wong et al., 1985) and screening for surfactant enhancing activity or (c) immunoreactivity of the expressed protein with antibodies to the proteins or fragments thereof; and to produce corresponding recombinant proteins using the identified genomic DNA or cDNA and conventional expression technology i.e. by culturing genetically engineered host cells such as microbial, insect or mammalian host cells containing the DNA so identified, for instance, transformed with the DNA or with an expression vector containing the DNA.
By way of example, tryptic fragments of one of the two 35 kd proteins were prepared and sequenced. Oligonucleotide probes were synthesized based on the elucidated peptide sequence of the tryptic fragments and were used to screen a lambda gt10 cDNA library made from human lung mRNA. Numerous clones were identified which hybridized to the probes. DNAs from two of these positive clones (PSAP-1 and PSAP-2) were subcloned into M13 for DNA sequencing, thus generating the clones MPSAP-1A and MPSAP-6A. The nucleotide sequence for the cDNA clones encoding each of the two 35kd surfactant proteins was thereby elucidated and is presented above in Tables 1 and 2, respectively. The sequences of subclones encoding the two 35 kd proteins are similar to each other but not identical. The sequence differences result in restriction fragment polymorphism between the two clones with respect to the coding region recognized by the restriction enzyme PstI. Considerably more nucleotide variation between the two clones was found in their 3' untranslated regions. Plasmids PSP35K-1A-10 and PSP35K-6A-8 were constructed by inserting the approximately 940-950 nucleotide EcoRI fragments depicted in Tables 1 and 2, respectively, into the EcoRI site of plasmid SP65 (see infra). PSP35K-1A-10 contains the polylinker site adjacent to the EcoRI site at cDNA position 1, while PSP35K-6A-8 contains the polylinker site adjacent to the EcoRI site at cDNA position 947. PSP35K-1A-10 and PSP35K-6A-8 have been deposited with the American Type Culture Collection (ATCC), Rockville, MD under accession Nos. ATCC 40243 and 40244, respectively.
Additionally, oligonucleotide probes based on the N-terminal sequence of the cold butanol-insoluble 6K protein (see Table 5) were synthesized and were used to screen a cDNA library prepared from human lung mRNA (Toole et al., 1984) as described in greater detail in Example 4, below. Several clones which hybridized to the probes were identified.
Based on hybridization intensity one clone was selected, subcloned into M13 and sequenced. Plasmid PSP6K-17-3 was constructed by inserting the cloned cDNA so identified as an EcoRI fragment into the EcoRI site of plasmid SP65 (D.A. Melton et al., 1984, Nucleic Acids Res., 12:7035-7056). PSP6K-17-3 has been deposited with the ATCC under accession No. ATCC 40245. The nucleotide sequence of the cloned cDNA insert is shown in Table 6.
(-): Not determined; positions 8, 11 and 12 were unidentified.
As those skilled in the art will appreciate, the cDNA insert in PSP6K-17-3 contains an open reading frame encoding a protein having a molecular weight of over 40kd. It is presently believed that the primary translation product is further processed, e.g., by Type II pneumocytes (Alveolar Type II cells), to yield the approximately 6K protein. It is contemplated that the cloned cDNA, portions thereof or sequences capable of hybridizing thereto may be expressed in host cells or cell lines by conventional expression methods to produce "recombinant" proteins having surfactant or surfactant enhancing activity.
With respect to the cloned approximately 6K protein, this invention encompasses vectors containing a heterologous DNA sequence encoding the characteristic peptide sequence Ile through Cys corresponding to nucleotides A-656 through C-757 of the sequence shown in Table 6, i.e., IKRIQAMIPKGALAVAVAQVCRVVPLVAGGICQC. One such vector contains the nucleotide sequence
Other vectors of this invention contain a heterologous DNA sequence encoding the characteristic peptide sequence substantially as depicted in the underlined peptide region of Table 6, i.e., FPIPLPYCWLCRALIKRIQAMIPKGALAVAVAQVCRVVPLVAGGICQCLAERYSVILLDTLLGRML. One such vector contains the DNA sequence substantially as depicted in the underlined nucleotide sequence of Table 6, i.e.,
Another exemplary vector contains a heterologous DNA sequence, such as the nucleotide sequence depicted in Table 6, which encodes the full-length peptide sequence of Table 6. DNA inserts for such vectors which comprise a DNA sequence shorter than the full-length cDNA of PSP6K-17-3, depicted in Table 6, may be synthesized by known methods, e.g. using an automated DNA synthesizer, or may be prepared from the full-length cDNA sequence by conventional methods such as loop-out mutagenesis or cleavage with restriction enzymes and ligation. Vectors so prepared may be used to express the subject proteins by conventional means or may be used in the assembly of vectors with larger cDNA inserts. In the former case the vector will also contain a promoter to which the DNA insert is operatively linked and may additionally contain an amplifiable and/or selectable marker, all as is well known in the art.
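As a quick consistency check on the coordinates quoted above, the 34-residue Ile-through-Cys peptide should occupy exactly 3 x 34 = 102 nucleotides, which is the inclusive span from A-656 through C-757. The following Python sketch only verifies that arithmetic; the variable and function names are illustrative and are not part of the specification.

```python
# Peptide sequence as quoted in the description (Ile through Cys).
PEPTIDE = "IKRIQAMIPKGALAVAVAQVCRVVPLVAGGICQC"

# Nucleotide coordinates quoted in the description (inclusive, Table 6 numbering).
FIRST_NT = 656
LAST_NT = 757


def span_length(first, last):
    """Number of nucleotides in an inclusive coordinate range."""
    return last - first + 1


n_residues = len(PEPTIDE)
n_nucleotides = span_length(FIRST_NT, LAST_NT)

# One codon (three nucleotides) per residue.
print(n_residues, n_nucleotides, n_nucleotides == 3 * n_residues)
```

The same check can be applied to any shorter insert prepared from the full-length cDNA, provided its first and last coordinates are known.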
The proteins of this invention may thus be produced by recovering and purifying the naturally occurring proteins from human pulmonary lavage material as described herein. Alternatively, the corresponding "recombinant" proteins may be produced by expression of the DNA sequence encoding the desired protein by conventional expression methodology using microbial or insect or, preferably, mammalian host cells. Suitable vectors as well as methods for inserting therein the desired DNA are well known in the art. Suitable host cells for transfection or transformation by such vectors and expression of the cDNA are also known in the art.
Mammalian cell expression vectors, for example, may be synthesized by techniques well known to those skilled in this art. The components of the vectors such as the bacterial replicons, selection genes, enhancers, promoters, and the like may be obtained from natural sources or synthesized by known procedures. See Kaufman, Proc. Natl. Acad. Sci. 82: 689-693 (1985).
Established cell lines, including transformed cell lines, are suitable as hosts. Normal diploid cells, cell strains derived from in vitro culture of primary tissue, as well as primary explants are also suitable. Candidate cells need not be genotypically deficient in the selection gene so long as the selection gene is dominantly acting.
The host cells preferably will be established mammalian cell lines. For stable integration of vector DNA into chromosomal DNA, and for subsequent amplification of integrated vector DNA, both by conventional methods, CHO (Chinese Hamster Ovary) cells are generally preferred. Alternatively, the vector DNA may include all or part of the bovine papilloma virus genome (Lusky et al., Cell, 36:391-401 (1984)) and be carried in cell lines such as C127 mouse cells as a stable episomal element. Other usable mammalian cell lines include HeLa, COS-1 monkey cells, mouse L-929 cells, 3T3 lines derived from Swiss, Balb-c or NIH mice, BHK or HaK hamster cell lines and the like. Cell lines derived from Alveolar Type II cells may be preferred in certain cases such as expression of the 6K protein (alone or with one or more other proteins of this invention) using the cDNA insert from PSP6K-17-3 or a fragment thereof.
Stable transformants then are screened for expression of the product by standard immunological or enzymatic assays. The presence of the DNA encoding the proteins may be detected by standard procedures such as Southern blotting. Transient expression of the DNA encoding the proteins during the several days after introduction of the expression vector DNA into suitable host cells such as COS-1 monkey cells is measured without selection by activity or immunological assay of the proteins in the culture medium.
In the case of bacterial expression, the DNA encoding the protein may be further modified to contain preferred codons for bacterial expression as is known in the art and preferably is operatively linked in-frame to a nucleotide sequence encoding a secretory leader polypeptide permitting bacterial secretion of the mature variant protein, also as is known in the art. The compounds expressed in mammalian, insect or microbial host cells may then be recovered, purified, and/or characterized with respect to physicochemical, biochemical and/or clinical parameters, all by known methods.
One or more of the proteins of this invention may be combined with a pharmaceutically acceptable fatty acid or lipid such as dipalmitoylphosphatidyl choline or with mixtures of such fatty acids or lipids which may be obtained from commercial sources or by conventional methods, or with natural surfactant lipids to provide a formulated pulmonary surfactant composition. Natural surfactant lipids may be extracted by known methods from lung lavage, e.g. bovine or human lung lavage. Typically the weight ratios of total lipids to total proteins in the composition will be about 20:1 to about 100:1. At the levels currently being tested in clinical trials, one dose of the surfactant composition corresponds to 1-2 mg of total protein and 98-99 mg. of total lipid.
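The dose figures quoted above can be related to the stated lipid-to-protein weight ratios with a short calculation. The Python sketch below is illustrative only: it takes the two ends of the quoted dose (1 mg protein with 99 mg lipid, and 2 mg protein with 98 mg lipid) and checks that each falls within the 20:1 to 100:1 range. The function name and the pairing of the end values are assumptions made for the example.

```python
def lipid_to_protein_ratio(lipid_mg, protein_mg):
    """Weight ratio of total lipid to total protein."""
    return lipid_mg / protein_mg


# (lipid mg, protein mg) at the two ends of the quoted dose; pairing is assumed.
doses = [(99.0, 1.0), (98.0, 2.0)]

ratios = [lipid_to_protein_ratio(lipid, protein) for lipid, protein in doses]

for (lipid, protein), ratio in zip(doses, ratios):
    in_range = 20.0 <= ratio <= 100.0
    print(f"{lipid:g} mg lipid : {protein:g} mg protein -> {ratio:g}:1, within 20:1-100:1: {in_range}")
```

Under these assumptions the dose corresponds to ratios of 99:1 and 49:1, i.e. toward the lipid-rich end of the stated range.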
It is contemplated that certain subcombinations of (1) the 35 kd protein encoded for by the nucleotide sequence of Table 1, (2) the 35 kd protein encoded for by the nucleotide sequence of Table 2, (3) the 6 kd protein encoded by the cDNA sequence of Table 6 and having the amino acid composition set forth in Table 3 and (4) the 6 kd protein having the amino acid composition set forth in Table 4 may be especially useful in the treatment of patients with particular clinical indications. Thus, this invention specifically contemplates the following subcombinations and compositions containing such subcombinations:
(a) proteins (1), (2), (3) or (4);
(b) proteins (1) and (2);
(c) proteins (1) and (3);
(d) proteins (1) and (4);
(e) proteins (2) and (3);
(f) proteins (2) and (4);
(g) proteins (1), (2), and (3);
(h) proteins (2), (3), and (4);
(i) proteins (3) and (4);
(j) proteins (1), (2), and (4); and
(k) proteins (1), (3) and (4).
At present compositions containing proteins (3) and/or (4) are preferred.
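Items (a) through (k) together cover every single protein, every pair and every triple of the four proteins. As an illustrative cross-check, the Python sketch below enumerates all proper, non-empty subsets of the four protein labels and counts how many include protein (3) and/or (4); the names used are hypothetical and the enumeration is not part of the specification.

```python
from itertools import combinations

# Labels of the four proteins as numbered in the text.
PROTEINS = (1, 2, 3, 4)


def proper_subcombinations(items):
    """All non-empty subsets that leave out at least one item."""
    return [
        combo
        for size in range(1, len(items))
        for combo in combinations(items, size)
    ]


subcombinations = proper_subcombinations(PROTEINS)

# Subcombinations containing protein (3) and/or protein (4).
with_3_or_4 = [c for c in subcombinations if 3 in c or 4 in c]

print(len(subcombinations), len(with_3_or_4))
```

The enumeration yields four singles, six pairs and four triples, matching the groupings listed in (a) through (k).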
EXPERIMENTAL EXAMPLES
Example 1
Isolation and Characterization of the 35Kd Surfactant-Associated Proteins
Pulmonary lavage (50 ml) from an alveolar proteinosis patient was centrifuged at 10,000 x g for 5 min. The pellet was collected and washed 5 times in 20 mM Tris HCl, 0.5 M NaCl, pH 7.4. The lipids and lipid-associated proteins were extracted from the washed pellet by shaking with 50 ml 1-butanol for 1 hr at room temperature. The butanol-insoluble material was collected by centrifugation, washed with distilled water and dissolved in 50 mM sodium phosphate, pH 6.0 and 6M guanidine HCl. The protein was applied to a Vydac C4 reverse phase column and eluted with a gradient of acetonitrile: 2-propanol (2:1, v:v) containing 0.1% trifluoroacetic acid. The major protein peak eluting at 50% B was collected and evaporated to dryness. The proteins present were analyzed by SDS-PAGE (Laemmli, 1970).
Alkylation and Tryptic Mapping
The protein so obtained (approx. 50ug) was taken up and reduced in 200mM Tris, 1mM EDTA, 6M guanidine HCl, 20mM DTT, pH8.5 at 37°C for 2 hrs. Solid iodoacetamide was added to a final concentration of 60mM and the reaction incubated at 0°C for 2 hrs under argon in the dark. The reaction was stopped and the reagents removed by dialysis into 0.1M NH4HCO3, 50mM 2-mercaptoethanol, pH7.5 followed by further dialysis into 100mM NH4HCO3, pH7.5. The alkylated protein was digested with trypsin (3% trypsin by weight) at 37°C for 16 hrs and the digest chromatographed over a C18 Vydac reverse phase HPLC column (4.6x250mm).
The tryptic peptides were eluted with a linear gradient of 95% acetonitrile and 0.1% TFA, collected and subjected to N-terminal Edman degradation using an Applied Biosystems Model 470A protein sequencer. The PTH-amino acids were analyzed by the method of Hunkapiller and Hood (1983). Sequence data so obtained for tryptic fragments T19, T26 and T28 are presented below in Table 7.
Example 2
Isolation and Characterization of the Low Molecular Weight Lipid-Associated Proteins
The butanol extract obtained in Example 1 was stored at -20°C causing precipitation of one of the low MW proteins. The precipitate was collected by centrifugation and dried under vacuum. The butanol layer containing butanol-soluble protein was evaporated to dryness. The precipitated cold butanol-insoluble protein and the cold butanol-soluble protein were then purified in parallel by the same method as follows. Each crude protein was separately dissolved in CHCl3:MeOH (2:1, v/v), applied to Sephadex LH20 columns and eluted with CHCl3:MeOH (2:1). The proteins were then analyzed by SDS-PAGE. Fractions containing the protein were pooled and evaporated to dryness. Amino acid composition was determined by hydrolysis in 6 N HCl at 110°C for 22 hrs followed by chromatography on a Beckman model 63000 amino acid analyzer. N-terminal sequence was determined on an Applied Biosystems 470A sequencer. Molecular weights were determined on 10-20% gradient SDS polyacrylamide gels.
Example 3
Screening of the cDNA Library and Sequencing of Clones for the 35Kd Proteins
Based on the amino acid sequence of tryptic fragment T28 (Table 7), an oligonucleotide probe was synthesized. The probe consisted of four pools of 20 mers and each pool contained 32 different sequences. The sequences of the 20 mers are depicted in Table 8.
A cDNA library from human lung mRNA was prepared as described in Toole et al., (1984) and screened with the total mixture of the four pools using tetramethylammonium chloride as a hybridization solvent (Jacobs et al., 1985). Between 0.5-1% of the phage clones were positive with this probe.
DNA from two of these clones was subcloned into M13 for DNA sequence analysis. By using Pool II as a sequencing primer, the nucleotide sequence corresponding to tryptic fragment T26 was identified in both clones, confirming that the isolated clones code for the major protein species found in the partially purified 35kd protein from lavage material of alveolar proteinosis patients (see above).
The two clones differed in nucleotide sequence at three positions out of 250 nucleotides. Both clones were completely sequenced by generating an ordered set of deletions with Bal 31 nuclease, recloning into other M13 vectors and sequencing via the dideoxynucleotide chain termination procedure (Vieira and Messing, 1982; Sanger et al., 1977). One clone corresponded to a full-length copy of the type referred to as 1A (Table 1), the second to an incomplete copy of the type referred to as 6A (Table 2). By using an oligonucleotide specific for type 6A, a full-length clone of this type was identified. The 5' EcoRI fragment of the lambda gt10 cDNA clone was subcloned into M13 and sequenced as above by using specific oligonucleotides as primers. This sequence is presented in Table 2. The two clones differ within the coding region at 7 nucleotides which led to amino acid changes and at 6 nucleotides which resulted in silent changes. These changes result in restriction fragment polymorphism between the two clones within the coding region for the restriction enzyme PstI. Clone 6A has 2 PstI sites at the nucleotide position 454 and 478 (Table 2) and clone 1A has 3 PstI sites at 454, 478 and 756 (Table 1). Additional DNA sequencing of each clone at the 3' untranslated region revealed a large 1kb untranslated region and considerably more nucleotide variation between MPSAP-1A and MPSAP-6A.
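The PstI polymorphism described above can be illustrated by computing the fragment lengths expected from a complete digest of each insert. The Python sketch below is a simplified model under stated assumptions: both inserts are treated as linear fragments of 947 nucleotides (taken from the EcoRI position quoted for PSP35K-6A-8), and each quoted site position is treated as the cut point, ignoring the offset of the actual cleavage within the recognition sequence. The function name is hypothetical.

```python
def fragment_lengths(insert_length, cut_positions):
    """Lengths of the pieces produced by cutting a linear insert at each position."""
    bounds = [0] + sorted(cut_positions) + [insert_length]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Assumed insert length (nucleotides); the text gives "approximately 940-950".
INSERT_LENGTH = 947

# PstI site positions as quoted in the text for each clone type.
PSTI_SITES = {
    "1A": [454, 478, 756],
    "6A": [454, 478],
}

fragments = {
    clone: fragment_lengths(INSERT_LENGTH, sites)
    for clone, sites in PSTI_SITES.items()
}

for clone, lengths in fragments.items():
    print(clone, lengths)
```

In this model the additional site at position 756 in clone 1A splits the 3' fragment seen in clone 6A into two shorter fragments, which is the kind of difference a restriction digest would reveal.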
DNA Binding and Hybrid Selection
Very dilute (10-15 ug/ml) single stranded DNA from either subclone MPSAP-1A or MPSAP-1B was applied to nitrocellulose paper (10 ug/cm2) under vacuum (Kafatos et al., 1979). MPSAP-1A represents the M13 subclone of a 0.9kb EcoRI fragment in one orientation in M13. MPSAP-1B represents the same fragment cloned in the opposite orientation in M13. Each filter (1cm2) was cut into nine equal size pieces and each piece was used in a 20-30ul hybridization reaction. Each reaction contained human lung RNA (5mg/ml), 50% deionized formamide (Fluka AG Chemical Corp.), 10mM PIPES [Piperazine-N,N'-bis(2-ethanesulfonic acid)] pH 6.4 and 0.4M NaCl (Miller et al., 1983). The source and preparation of the RNA have been reported previously (Floros et al., 1985). Each hybridization reaction was routinely incubated at 50°C for 3 hrs. At the end of the incubation period each filter was washed five times, for five minutes each, with 1ml of 1X SSC (0.15M NaCl, 0.015M sodium citrate), 0.5% SDS at 60°C. It was then washed three times, for five minutes each, with 1ml of 2mM EDTA, pH 7.9 at 60°C. The selected RNA was eluted by boiling for one minute in 300ul of 1mM EDTA pH 7.9 and 10ug of yeast tRNA (Boehringer, Mannheim). The precipitated RNA was translated, immunoprecipitated and subjected to one and two dimensional gel electrophoresis as described in Floros et al., 1985.
Example 4
Screening of the cDNA Library and Sequencing of Clones for the 6Kd Proteins
Based on the first six amino acids of the sequence shown in Table 5 an oligonucleotide probe was synthesized. The probe consisted of six pools of 17 mers. Three of the pools each contained 128 different sequences, and three of the pools each contained 64 different sequences. Based on the first seven amino acids two pools of 20 mers were synthesized. These pools contained either 384 or 192 different sequences.
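The pool sizes quoted above reflect codon degeneracy: each amino acid can be encoded by one to six codons, so a probe back-translated from a peptide must cover the product of those choices. The Python sketch below counts the distinct oligonucleotides of a given length that can encode the start of a peptide under the standard genetic code. Table 5 is not reproduced in this text, so the peptide used here (FPIPLP, the first six residues of the underlined peptide quoted earlier in the description) is an assumption chosen only for illustration, and the function and variable names are hypothetical.

```python
from collections import defaultdict

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODONS_FOR = defaultdict(list)
_codons = [a + b + c for a in BASES for b in BASES for c in BASES]
for _codon, _aa in zip(_codons, AMINO_ACIDS):
    CODONS_FOR[_aa].append(_codon)


def oligo_degeneracy(peptide, n_nt):
    """Number of distinct n_nt-long oligonucleotides encoding the start of peptide."""
    full_codons, extra_nt = divmod(n_nt, 3)
    total = 1
    for aa in peptide[:full_codons]:
        total *= len(CODONS_FOR[aa])
    if extra_nt:
        # Only the leading nucleotides of the next codon are included in the probe.
        prefixes = {codon[:extra_nt] for codon in CODONS_FOR[peptide[full_codons]]}
        total *= len(prefixes)
    return total


PEPTIDE = "FPIPLP"  # assumed for illustration; Table 5 is not shown in this text
degeneracy = oligo_degeneracy(PEPTIDE, 17)

# Total sequences across the six 17-mer pools described in the text.
pool_total = 3 * 128 + 3 * 64

print(degeneracy, pool_total)
```

Under that assumption the count comes to 576, which equals the 3 x 128 + 3 x 64 sequences in the six 17-mer pools. Splitting a degenerate probe into pools in this way keeps the number of sequences per hybridization manageable.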
A cDNA library from human lung mRNA was prepared as described in Toole et al., (1984) and screened with the total mixture of the six pools using tetramethylammonium chloride as a hybridization solvent (Jacobs et al., 1985). Approximately 100,000 phage were screened, and 100 phage which hybridized to the probe were plaque purified. The phage were then pooled into groups of 25 and screened with the individual 17 mer and 20 mer pools. Six phage which hybridized most intensely to one of the 20 mer oligonucleotide probes and one of the corresponding 17 mer pools (pool 1447 containing 128 different sequences) were plaque purified. The 17 mer pool 1447 was divided into four pools of 32 different sequences and hybridized to a dot blot of DNA prepared from these phage.
Based on the hybridization intensity, DNA from one of these six phage was subcloned into M13 for DNA sequence analysis. A sequence corresponding in identity and position to the amino acids shown in Table 5 was obtained, confirming that the isolated clone coded for the approximately 6kd cold butanol-insoluble protein found in the lavage material of alveolar proteinosis patients (see above).
The first clone obtained was presumed to be an incomplete copy of the mRNA because it lacked an initiating methionine, and was used to isolate longer clones. Two clones were completely sequenced by generating an ordered set of deletions with Bal 31 nuclease, recloning into other M13 vectors and sequencing via the dideoxynucleotide chain termination procedure (Vieira and Messing, 1982; Sanger et al., 1977). One clone corresponded to a full-length copy of the type referred to as 17 (Table 6); the second began at nucleotide 148 of clone 17. Sequence of the 5' end of a third clone confirmed the sequence of the 5' end of clone 17. The clones are identical throughout the coding region and differ only at two positions in the 3' untranslated region.
REFERENCES
1. Bhattacharyya, S. N., and Lynn, W. S. (1978) Biochim. Biophys. Acta 537, 329-335.
2. Bhattacharyya, S. N., and Lynn, W. S. (1980) Biochim. Biophys. Acta 625, 451-458.
3. Bhattacharyya, S.N., Passero, M.A., DiAugustine, R.P., and Lynn, W. S. (1975) J. Clin. Invest. 55, 914-920
4. Floros, J., Phelps, D.S., and Taeusch, W.H. (1985) J. Biol. Chem. 260, 495-500
5. Hawgood, S., Benson, B. J., and Hamilton, Jr. R. L. (1985) Biochemistry 24, 184-190
6. Hunkapiller, M. W. and Hood, L. E. (1983) Methods in Enzymology 91, 486¬
7. Jacobs, K., Shoemaker, C., Rudersdorf, R., Neil, S. D., Kaufman, R. J., Mufson, A., Seehra, J., Jones, S. S., Hewick, R., Fritsch, E. E., Kawakita, M., Shimizu, T., and Miyake, T. (1985) Nature (Lond.) 313, 806-810.
8. Kafatos, E., Jones, W. C., and Efstratiadis, A. (1979) Nucleic Acids Res. 7, 1541-1552.
9. Katyal, S. L., Amenta, J. S., Singh, G., and Silverman, J. A. (1984) Am. J. Obstet. Gynecol. 148, 48-53.
10. Katyal, S. L. and Singh, G. (1981) Biochim. Biophys. Acta 670, 323-331.
11. King, R. J., Carmichael, M. C., and Horowitz, P. M. (1983) J. Biol. Chem. 258, 10672-10680.
12. King, R. J. (1982) J. Appl. Physiol. Exercise Physiol. 53, 1-8.
13. King, R. J., Klass, D. J., Gikas, E. G., and Clements, J. A. (1973) Am. J. Physiol. 224, 788-795.
14. King, R. J., Ruch, J., Gikas, E. G., Platzker, A. C. G., and Creasy, R. K. (1975) J. of Applied Phys. 39, 735-741.
15. Laemmli, U. K. (1970) Nature (Lond.) 227, 680-685.
16. Miller, J. S., Paterson, B. M., Ricciardi, R. P., Cohen, L., and Roberts, B. E. (1983) Methods in Enzymology 101, 650-674.
17. Phelps, D. S., Taeusch, W. H., Benson, B., and Hawgood, S. (1984) Biochim. Biophys. Acta 791, 226-238.
18. Shelley, S. A., Balis, J. U., Paciga, J. E., Knuppel, R. A., Ruffolo, E. H., and Bouis, P. J. (1982) Am. J. Obstet. Gynecol. 144, 224-228.
19. Sigrist, H., Sigrist-Nelson, K., and Gither, G. (1977) BBRC 74, 178-184.
20. Sueishi, K., and Benson, G. J. (1981) Biochim. Biophys. Acta 665, 442-453.
21. Toole, J. J., Knopf, J. L., Wozney, J. M., Sultzman, L. A., Bucker, J. L., Pittman, D. D., Kaufman, R. J., Brown, E., Shoemaker, C., Orr, E. C., Amphlett, G. W., Foster, W. G., Coe, M. L., Knutson, G. L., Eass, D. N., Hewick, R. M. (1984) Nature (Lond.) 312, 342-347.
22. Whitsett, J. A., Hull, W., Ross, G., and Weaver, T. (1985) Pediatric Res. 19, 501-508.
23. Wong, G.G. et al., 1985, Science, 228:810-815