WO2004055158A2 - Therapeutic polypeptides, nucleic acids encoding same, and methods of use

Info

Publication number
WO2004055158A2
Authority
WO
WIPO (PCT)
Prior art keywords
polypeptide
novx
seq
protein
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2003/031817
Other languages
French (fr)
Other versions
WO2004055158A3 (en
Inventor
David Anderson
Edward Voss
Nikolai Khramtsov
Xiaojia Guo (Sasha)
Daniel Rieger
Li Li
Glennda Smithson
Ramesh Kekuda
Peter Mezes
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CuraGen Corp
Original Assignee
CuraGen Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CuraGen Corp
Priority to AU2003282764A (AU2003282764A1)
Publication of WO2004055158A2
Publication of WO2004055158A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides

Definitions

  • the present invention relates to novel polypeptides, and the novel nucleic acids encoding them, having properties related to stimulation of biochemical or physiological responses in a cell, a tissue, an organ or an organism. More particularly, the novel polypeptides are gene products of novel genes, or are specified biologically active fragments or derivatives thereof. Methods of use encompass diagnostic and prognostic assay procedures as well as methods of treating diverse pathological conditions.
  • Eukaryotic cells are characterized by biochemical and physiological processes that under normal conditions are extremely highly balanced to achieve the preservation and propagation of the cells.
  • the regulation of the biochemical and physiological processes involves intricate signaling pathways.
  • signaling pathways involve extracellular signaling proteins, cellular receptors that bind the signaling proteins, and signal transducing components located within the cells.
  • Signaling proteins may be classified as endocrine effectors, paracrine effectors or autocrine effectors.
  • Endocrine effectors are signaling molecules secreted by a given organ into the circulatory system, which are then transported to a distant target organ or tissue.
  • the target cells include the receptors for the endocrine effector, and when the endocrine effector binds, a signaling cascade is induced.
  • Paracrine effectors involve secreting cells and receptor cells in close proximity to each other, for example two different classes of cells in the same tissue or organ. One class of cells secretes the paracrine effector, which then reaches the second class of cells, for example by diffusion through the extracellular fluid.
  • the second class of cells contains the receptors for the paracrine effector; binding of the effector results in induction of the signaling cascade that elicits the corresponding biochemical or physiological effect.
  • Autocrine effectors are highly analogous to paracrine effectors, except that the same cell type that secretes the autocrine effector also contains the receptor. Thus the autocrine effector binds to receptors on the same cell, or on identical neighboring cells. The binding process then elicits the characteristic biochemical or physiological effect.
  • Signaling processes may elicit a variety of effects on cells and tissues including by way of nonlimiting example induction of cell or tissue proliferation, suppression of growth or proliferation, induction of differentiation or maturation of a cell or tissue, and suppression of differentiation or maturation of a cell or tissue.
  • pathological conditions involve dysregulation of expression of important effector proteins.
  • the dysregulation is manifested as diminished or suppressed level of synthesis and secretion of protein effectors.
  • the dysregulation is manifested as increased or up-regulated level of synthesis and secretion of protein effectors.
  • a subject may be suspected of suffering from a condition brought on by altered or mis-regulated levels of a protein effector of interest. Therefore there is a need to assay for the level of the protein effector of interest in a biological sample from such a subject, and to compare the level with that characteristic of a nonpathological condition. There also is a need to provide the protein effector as a product of manufacture.
  • Administration of the effector to a subject in need thereof is useful in treatment of the pathological condition. Accordingly, there is a need for a method of treatment of a pathological condition brought on by diminished or suppressed levels of the protein effector of interest. In addition, there is a need for a method of treatment of a pathological condition brought on by increased or up-regulated levels of the protein effector of interest.
  • Antibodies are multichain proteins that bind specifically to a given antigen, and bind poorly, or not at all, to substances deemed not to be cognate antigens. They are comprised of two short chains termed light chains and two long chains termed heavy chains. These chains are constituted of immunoglobulin domains, of which generally there are two classes: one variable domain per chain, one constant domain in light chains, and three or more constant domains in heavy chains. The antigen-specific portion of the immunoglobulin molecules resides in the variable domains; the variable domains of one light chain and one heavy chain associate with each other to generate the antigen-binding moiety.
  • Antibodies that bind immunospecifically to a cognate or target antigen bind with high affinities, and are thus useful in assaying specifically for the presence of the antigen in a sample. In addition, they have the potential of inactivating the activity of the antigen.
  • the present invention is based in part upon the discovery of isolated polypeptides including amino acid sequences selected from mature forms of the amino acid sequences selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34.
  • novel nucleic acids and polypeptides are referred to herein as NOV1a, NOV1b, NOV1c, NOV2a, NOV2b, NOV2c, NOV2d, NOV3a, NOV3b, etc.
  • NOVX nucleic acid or polypeptide sequences.
  • the present invention also is based in part upon variants of a mature form of the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, wherein any amino acid in the mature form is changed to a different amino acid, provided that no more than 15% of the amino acid residues in the sequence of the mature form are so changed.
  • the present invention includes the amino acid sequences selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34.
  • the invention also comprises variants of the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 wherein any amino acid specified in the chosen sequence is changed to a different amino acid, provided that no more than 15% of the amino acid residues in the sequence are so changed.
  • the invention also involves fragments of any of the mature forms of the amino acid sequences selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, or any other amino acid sequence selected from this group.
  • the invention also comprises fragments from these groups in which up to 15% of the residues are changed.
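The 15% limit on changed residues described above can be illustrated with a short check. This is only a sketch under stated assumptions: it treats a variant as a same-length sequence that differs by substitutions alone, the function names are hypothetical, and the example sequence is made up rather than taken from the sequence listing.

```python
def fraction_changed(reference: str, variant: str) -> float:
    """Return the fraction of positions where `variant` differs from `reference`.

    Assumption: only substitutions are considered, so both sequences
    must have the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("this sketch handles substitutions only; lengths must match")
    if not reference:
        raise ValueError("reference sequence is empty")
    changed = sum(1 for a, b in zip(reference, variant) if a != b)
    return changed / len(reference)


def within_variant_limit(reference: str, variant: str, max_fraction: float = 0.15) -> bool:
    """True if no more than `max_fraction` of the residues are changed."""
    return fraction_changed(reference, variant) <= max_fraction


# Hypothetical 20-residue sequence, NOT from the patent's sequence listing.
reference = "MKTAYIAKQRQISFVKSHFS"
one_change = "MKTAYIAKQRQISFVKSHFA"    # 1 of 20 residues changed
four_changes = "AKTAYIAKQRAISFVASHFA"  # 4 of 20 residues changed

print(within_variant_limit(reference, one_change))
print(within_variant_limit(reference, four_changes))
```

The same helper can be called with `max_fraction` set to 0.10, 0.05, 0.02 or 0.01 for the tighter limits recited elsewhere in the text.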
  • the present invention encompasses polypeptides that are naturally occurring allelic variants of the sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34.
  • allelic variants include amino acid sequences that are the translations of nucleic acid sequences differing by a single nucleotide from nucleic acid sequences selected from the group consisting of SEQ ID NOS: 2n-1, wherein n is an integer between 1 and 34.
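The numbering convention used throughout (polypeptides at SEQ ID NO:2n, nucleic acids at SEQ ID NO:2n-1) can be made concrete with a small sketch. It assumes that "between 1 and 34" is inclusive and that the nucleic acid at 2n-1 is the one paired with the polypeptide at 2n for the same n. The text implies this pairing, but this excerpt does not state it outright. The function names are hypothetical.

```python
N_VALUES = range(1, 35)  # assumption: "between 1 and 34" includes both endpoints


def nucleic_acid_seq_id(n: int) -> int:
    """SEQ ID NO of the nucleic acid for index n (odd numbers: 2n-1)."""
    if n not in N_VALUES:
        raise ValueError("n must be an integer from 1 to 34")
    return 2 * n - 1


def polypeptide_seq_id(n: int) -> int:
    """SEQ ID NO of the polypeptide for index n (even numbers: 2n)."""
    if n not in N_VALUES:
        raise ValueError("n must be an integer from 1 to 34")
    return 2 * n


# (nucleic acid SEQ ID NO, polypeptide SEQ ID NO) for every n
seq_id_pairs = [(nucleic_acid_seq_id(n), polypeptide_seq_id(n)) for n in N_VALUES]

print(seq_id_pairs[0], seq_id_pairs[-1], len(seq_id_pairs))
```

Under these assumptions the sequence listing would span SEQ ID NO:1 through SEQ ID NO:68, with odd entries being nucleic acids and even entries polypeptides.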
  • the variant polypeptide where any amino acid changed in the chosen sequence is changed to provide a conservative substitution.
  • the invention comprises a pharmaceutical composition involving a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 and a pharmaceutically acceptable carrier.
  • the invention involves a kit, including, in one or more containers, this pharmaceutical composition.
  • the invention includes the use of a therapeutic in the manufacture of a medicament for treating a syndrome associated with a human disease, the disease being selected from a pathology associated with a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 wherein said therapeutic is the polypeptide selected from this group.
  • the invention comprises a method for determining the presence or amount of a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 in a sample, the method involving providing the sample; introducing the sample to an antibody that binds immunospecifically to the polypeptide; and determining the presence or amount of antibody bound to the polypeptide, thereby determining the presence or amount of polypeptide in the sample.
  • the invention includes a method for determining the presence of or predisposition to a disease associated with altered levels of a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 in a first mammalian subject, the method involving measuring the level of expression of the polypeptide in a sample from the first mammalian subject; and comparing the amount of the polypeptide in this sample to the amount of the polypeptide present in a control sample from a second mammalian subject known not to have, or not to be predisposed to, the disease, wherein an alteration in the expression level of the polypeptide in the first subject as compared to the control sample indicates the presence of or predisposition to the disease.
  • the invention involves a method of identifying an agent that binds to a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, the method including introducing the polypeptide to the agent; and determining whether the agent binds to the polypeptide.
  • the agent could be a cellular receptor or a downstream effector.
  • the invention involves a method for identifying a potential therapeutic agent for use in treatment of a pathology, wherein the pathology is related to aberrant expression or aberrant physiological interactions of a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, the method including providing a cell expressing the polypeptide of the invention and having a property or function ascribable to the polypeptide; contacting the cell with a composition comprising a candidate substance; and determining whether the substance alters the property or function ascribable to the polypeptide; whereby, if an alteration observed in the presence of the substance is not observed when the cell is contacted with a composition devoid of the substance, the substance is identified as a potential therapeutic agent.
  • the invention involves a method for screening for a modulator of activity or of latency or predisposition to a pathology associated with a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, the method including administering a test compound to a test animal at increased risk for a pathology associated with the polypeptide of the invention, wherein the test animal recombinantly expresses the polypeptide of the invention; measuring the activity of the polypeptide in the test animal after administering the test compound; and comparing the activity of the protein in the test animal with the activity of the polypeptide in a control animal not administered the polypeptide, wherein a change in the activity of the polypeptide in the test animal relative to the control animal indicates the test compound is a modulator of latency of, or predisposition to, a pathology associated with the polypeptide of the invention.
  • the recombinant test animal could express a test protein transgene or express the transgene under the control of a promoter at an increased level relative to a wild-type test animal.
  • the promoter may or may not be the native gene promoter of the transgene.
  • the invention involves a method for modulating the activity of a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, the method including introducing a cell sample expressing the polypeptide with a compound that binds to the polypeptide in an amount sufficient to modulate the activity of the polypeptide.
  • the invention involves a method of treating or preventing a pathology associated with a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, the method including administering the polypeptide to a subject in which such treatment or prevention is desired in an amount sufficient to treat or prevent the pathology in the subject.
  • the subject could be human.
  • the invention involves a method of treating a pathological state in a mammal, the method including administering to the mammal a polypeptide in an amount that is sufficient to alleviate the pathological state, wherein the polypeptide is a polypeptide having an amino acid sequence at least 95% identical to a polypeptide having the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 or a biologically active fragment thereof.
  • the invention involves an isolated nucleic acid molecule comprising a nucleic acid sequence encoding a polypeptide having an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34; a variant of a mature form of the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 wherein any amino acid in the mature form of the chosen sequence is changed to a different amino acid, provided that no more than 15% of the amino acid residues in the sequence of the mature form are so changed; the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34; a variant of the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, in which any amino acid specified in the chosen sequence is changed to a different amino acid, provided that no more than 15% of the amino acid residues
  • the invention comprises an isolated nucleic acid molecule having a nucleic acid sequence encoding a polypeptide comprising an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34, wherein the nucleic acid molecule comprises the nucleotide sequence of a naturally occurring allelic nucleic acid variant.
  • the invention involves an isolated nucleic acid molecule including a nucleic acid sequence encoding a polypeptide having an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34 that encodes a variant polypeptide, wherein the variant polypeptide has the polypeptide sequence of a naturally occurring polypeptide variant.
  • the invention comprises an isolated nucleic acid molecule having a nucleic acid sequence encoding a polypeptide comprising an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34, wherein the nucleic acid molecule differs by a single nucleotide from a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 2n-1, wherein n is an integer between 1 and 34.
  • the invention includes an isolated nucleic acid molecule having a nucleic acid sequence encoding a polypeptide including an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34, wherein the nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of the nucleotide sequence selected from the group consisting of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34; a nucleotide sequence wherein one or more nucleotides in the nucleotide sequence selected from the group consisting of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34 is changed from that selected from the group consisting of the chosen sequence to a different nucleotide provided that no more than 15% of the nucleotides are so changed; a nucleic acid fragment of the sequence selected from the group consisting of SEQ ID NO:2n-1, where
  • the invention includes an isolated nucleic acid molecule having a nucleic acid sequence encoding a polypeptide including an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34, wherein the nucleic acid molecule hybridizes under stringent conditions to the nucleotide sequence selected from the group consisting of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, or a complement of the nucleotide sequence.
  • the invention includes an isolated nucleic acid molecule having a nucleic acid sequence encoding a polypeptide including an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34, wherein the nucleic acid molecule has a nucleotide sequence in which any nucleotide specified in the coding sequence of the chosen nucleotide sequence is changed from that selected from the group consisting of the chosen sequence to a different nucleotide provided that no more than 15% of the nucleotides in the chosen coding sequence are so changed, an isolated second polynucleotide that is a complement of the first polynucleotide, or a fragment of any of them.
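The notion of a complementary polynucleotide mentioned above can be sketched in a few lines. This is an illustration only: it assumes plain DNA letters (A, C, G, T), the function names are hypothetical, and the text does not say whether "complement" means the base-wise complement or the reverse complement, so both are shown.

```python
# Base-pairing table for DNA; upper- and lower-case letters are both handled.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Base-wise complement, read in the same direction as the input."""
    return seq.translate(_DNA_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complementary strand written 5' to 3' (complement, then reversed)."""
    return complement(seq)[::-1]


# Hypothetical short sequence, not from the sequence listing.
example = "ATGC"
print(complement(example))
print(reverse_complement(example))
```

The reverse complement is the form usually meant when a probe is described as hybridizing to a target strand, since the two strands of a duplex run antiparallel.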
  • the invention includes a vector involving the nucleic acid molecule having a nucleic acid sequence encoding a polypeptide including an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34.
  • This vector can have a promoter operably linked to the nucleic acid molecule. This vector can be located within a cell.
  • the invention involves a method for determining the presence or amount of a nucleic acid molecule having a nucleic acid sequence encoding a polypeptide including an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34 in a sample, the method including providing the sample; introducing the sample to a probe that binds to the nucleic acid molecule; and determining the presence or amount of the probe bound to the nucleic acid molecule, thereby determining the presence or amount of the nucleic acid molecule in the sample.
  • the presence or amount of the nucleic acid molecule is used as a marker for cell or tissue type.
  • the cell type can be cancerous.
  • the invention involves a method for determining the presence of or predisposition for a disease associated with altered levels of a nucleic acid molecule having a nucleic acid sequence encoding a polypeptide including an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34 in a first mammalian subject, the method including measuring the amount of the nucleic acid in a sample from the first mammalian subject; and comparing the amount of the nucleic acid in the sample of step (a) to the amount of the nucleic acid present in a control sample from a second mammalian subject known not to have or not be predisposed to, the disease; wherein an alteration in the level of the nucleic acid in the first subject as compared to the control sample indicates the presence of or predisposition to the disease.
  • the invention further provides an antibody that binds immunospecifically to a NOVX polypeptide.
  • the NOVX antibody may be monoclonal, humanized, or a fully human antibody.
  • the antibody has a dissociation constant for the binding of the NOVX polypeptide to the antibody less than 1 x 10⁻⁹ M. More preferably, the NOVX antibody neutralizes the activity of the NOVX polypeptide.
  • the invention provides for the use of a therapeutic in the manufacture of a medicament for treating a syndrome associated with a human disease, associated with a NOVX polypeptide.
  • a therapeutic is a NOVX antibody.
  • the invention provides a method of treating or preventing a NOVX-associated disorder, a method of treating a pathological state in a mammal, and a method of treating or preventing a pathology associated with a polypeptide by administering a NOVX antibody to a subject in an amount sufficient to treat or prevent the disorder.
  • the present invention provides novel nucleotides and polypeptides encoded thereby. Included in the invention are the novel nucleic acid sequences, their encoded polypeptides, antibodies, and other related compounds.
  • the sequences are collectively referred to herein as “NOVX nucleic acids” or “NOVX polynucleotides” and the corresponding encoded polypeptides are referred to as “NOVX polypeptides” or “NOVX proteins.” Unless indicated otherwise, “NOVX” is meant to refer to any of the novel sequences disclosed herein. Table A provides a summary of the NOVX nucleic acids and their encoded polypeptides.
  • Table A indicates the homology of NOVX polypeptides to known protein families.
  • nucleic acids and polypeptides, antibodies and related compounds according to the invention corresponding to a NOVX as identified in column 1 of Table A are useful in therapeutic and diagnostic applications implicated in, for example, pathologies and disorders associated with the known protein families identified in column 5 of Table A.
  • Pathologies, diseases, disorders and condition and the like that are associated with NOVX sequences include, but are not limited to: e.g., cardiomyopathy, atherosclerosis, hypertension, congenital heart defects, aortic stenosis, atrial septal defect (ASD), vascular calcification, fibrosis, atrioventricular (A-V) canal defect, ductus arteriosus, pulmonary stenosis, subaortic stenosis, ventricular septal defect (VSD), valve diseases, tuberous sclerosis, scleroderma, obesity, metabolic disturbances associated with obesity, transplantation, osteoarthritis, rheumatoid arthritis, osteochondrodysplasia, adrenoleukodystrophy, congenital adrenal hyperplasia, prostate cancer, diabetes, metabolic disorders, neoplasm; adenocarcinoma, lymphoma, uterus cancer, fertility, glomerulonephritis, hemophilia,
  • NOVX polypeptides of the present invention show homology to, and contain domains that are characteristic of members of such protein families. Details of the sequence relatedness and domain analysis for each NOVX are presented in Example A.
  • the NOVX nucleic acids and polypeptides are used to screen for molecules, which inhibit or enhance NOVX activity or function. Specifically, the nucleic acids and polypeptides according to the invention are used as targets for the identification of small molecules that modulate or inhibit associated diseases.
  • the NOVX nucleic acids and polypeptides are also useful for detecting and differentiating specific cell types, tissues, pathological tissues, cell activation states and the like. Details of expression analysis for each NOVX are presented in Example C. Accordingly, the NOVX nucleic acids, polypeptides, antibodies and related compounds according to the invention have diagnostic and therapeutic applications in the detection of a variety of diseases with differential expression in normal vs. diseased tissues, e.g. detection of cancer.
  • NOVX clones. The NOVX nucleic acids and proteins of the invention are useful in diagnostic and therapeutic applications and as a research tool. These include serving as a specific or selective nucleic acid or protein diagnostic and/or prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to be assessed, as well as potential therapeutic applications such as the following: (i) a protein therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene ablation), (v) a composition promoting tissue regeneration in vitro and in vivo, and (vi) a biological defense weapon.
  • the invention includes an isolated polypeptide comprising an amino acid sequence selected from the group consisting of: (a) a mature form of the amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 34; (b) a variant of a mature form of the amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 34, wherein any amino acid in the mature form is changed to a different amino acid, provided that no more than 15%, no more than 10%, no more than 5% no more than 2% or no more than 1 % of the amino acid residues in the sequence of the mature form are so changed; (c) an amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 34; (d) a variant of the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 wherein any amino acid specified in the chosen sequence is
  • the invention includes an isolated nucleic acid molecule comprising a nucleic acid sequence encoding a NOVX polypeptide comprising an amino acid sequence selected from the group consisting of: (a) a mature form of the amino acid sequence given SEQ ID NO: 2n, wherein n is an integer between 1 and 34; (b) a variant of a mature form of the amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 34 wherein any amino acid in the mature form of the chosen sequence is changed to a different amino acid, provided that no more than 15%, no more than 10%, no more than 5%, no more than 2%, or no more than 1 % of the amino acid residues in the sequence of the mature form are so changed; (c) the amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 34; (d) a variant of the amino acid sequence selected from the group consisting of SEQ ID NO: 2n,
  • the invention includes an isolated nucleic acid molecule, wherein said nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of: (a) the nucleotide sequence selected from the group consisting of SEQ ID NO: 2n-1 , wherein n is an integer between 1 and 34; (b) a nucleotide sequence wherein one or more nucleotides in the nucleotide sequence selected from the group consisting of SEQ ID NO: 2n-1 , wherein n is an integer between 1 and 34 is changed from that selected from the group consisting of the chosen sequence to a different nucleotide provided that no more than 15%, no more than 10%, no more than 5%, no more than 2%, or no more than 1 % of the nucleotides are so changed; (c) a nucleic acid fragment of the sequence selected from the group consisting of SEQ ID NO: 2n-1, wherein n is an integer between 1 and 34; and (d) a nucleic
  • nucleic acid molecules that encode NOVX polypeptides or biologically active portions thereof. Also included in the invention are nucleic acid fragments sufficient for use as hybridization probes to identify NOVX-encoding nucleic acids (e.g., NOVX mRNAs) and fragments for use as PCR primers for the amplification and/or mutation of NOVX nucleic acid molecules.
  • nucleic acid molecule is intended to include DNA molecules (e.g., cDNA or genomic DNA), RNA molecules (e.g., mRNA), analogs of the DNA or RNA generated using nucleotide analogs, and derivatives, fragments and homologs thereof.
  • the nucleic acid molecule may be single-stranded or double-stranded, but preferably is comprised of double-stranded DNA.
  • a NOVX nucleic acid can encode a mature NOVX polypeptide.
  • a "mature" form of a polypeptide or protein disclosed in the present invention is the product of a naturally occurring polypeptide or precursor form or proprotein.
  • the naturally occurring polypeptide, precursor or proprotein includes, by way of nonlimiting example, the full-length gene product encoded by the corresponding gene. Alternatively, it may be defined as the polypeptide, precursor or proprotein encoded by an ORF described herein.
  • the product "mature" form arises, by way of nonlimiting example, as a result of one or more naturally occurring processing steps that may take place within the cell (e.g., host cell) in which the gene product arises.
  • Examples of such processing steps leading to a "mature" form of a polypeptide or protein include the cleavage of the N-terminal methionine residue encoded by the initiation codon of an ORF, or the proteolytic cleavage of a signal peptide or leader sequence.
  • a mature form arising from a precursor polypeptide or protein that has residues 1 to N, where residue 1 is the N-terminal methionine would have residues 2 through N remaining after removal of the N-terminal methionine.
  • a "mature" form of a polypeptide or protein may arise from a step of post-translational modification other than a proteolytic cleavage event. Such additional processes include, by way of non-limiting example, glycosylation, myristylation or phosphorylation.
  • a mature polypeptide or protein may result from the operation of only one of these processes, or a combination of any of them.
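The two proteolytic processing steps named above (removal of the N-terminal methionine and cleavage of a signal peptide) can be illustrated with a minimal Python sketch. The function names and the `signal_length` parameter are hypothetical and chosen only for illustration; the specification does not state where any signal peptide is cleaved, so the cleavage position must be supplied by the caller.

```python
def remove_initiator_met(precursor: str) -> str:
    """Return residues 2..N when residue 1 is methionine; otherwise return the input unchanged."""
    if precursor.startswith("M"):
        return precursor[1:]
    return precursor


def cleave_signal_peptide(precursor: str, signal_length: int) -> str:
    """Return the residues that remain after removing `signal_length` N-terminal residues.

    `signal_length` is an assumed input; predicting it is outside the scope of this sketch.
    """
    if not 0 <= signal_length <= len(precursor):
        raise ValueError("signal_length must lie within the precursor")
    return precursor[signal_length:]
```

For a precursor with residues 1 to N, the first function leaves residues 2 through N, matching the example given in the text.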
  • probe refers to nucleic acid sequences of variable length, preferably between at least about 10 nucleotides (nt), about 100 nt, or as many as approximately 6,000 nt, depending upon the specific use. Probes are used in the detection of identical, similar, or complementary nucleic acid sequences. Longer length probes are generally obtained from a natural or recombinant source, are highly specific, and are much slower to hybridize than shorter-length oligomer probes. Probes may be single-stranded or double-stranded and designed to have specificity in PCR, membrane-based hybridization technologies, or ELISA-like technologies.
  • nucleic acid molecule as used herein, is as defined in United States Patent 6,600,019, column 69, lines 8 to 30, the disclosure of which is hereby incorporated in toto by reference.
  • a nucleic acid molecule of the invention, e.g., a nucleic acid molecule having the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, or a complement of this nucleotide sequence, can be isolated using standard molecular biology techniques and the sequence information provided herein.
  • NOVX molecules can be isolated using standard hybridization and cloning techniques (e.g., as described in Sambrook, et al., (eds.), MOLECULAR CLONING: A LABORATORY MANUAL 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989; and Ausubel, et al., (eds.), CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, NY, 1993).
  • a nucleic acid of the present invention can also be amplified as described in United States Patent 6,600,019, column 69, lines 45 to 54, the disclosure of which is hereby incorporated in toto by reference.
  • oligonucleotide refers to a series of linked nucleotide residues as defined in United States Patent 6,600,019, columns 69 and 70, the disclosure of which is hereby incorporated in toto herein by reference.
  • an oligonucleotide comprising a nucleic acid molecule less than 100 nt in length would further comprise at least 6 contiguous nucleotides of SEQ ID NO:2n-1 , wherein n is an integer between 1 and 34, or a complement thereof.
  • an isolated nucleic acid molecule of the invention comprises a nucleic acid molecule that is a complement of the nucleotide sequence shown in SEQ ID NO:2n-1 , wherein n is an integer between 1 and 34, or a portion of this nucleotide sequence (e.g., a fragment that can be used as a probe or primer or a fragment encoding a biologically-active portion of a NOVX polypeptide).
  • a nucleic acid molecule that is complementary to the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, is one that is sufficiently complementary to that nucleotide sequence that it can hydrogen bond to it with few or no mismatches, thereby forming a stable duplex.
  • As used herein, the terms “complementary” and “binding” are as defined in United States Patent 6,600,019, column 70, lines 15 to 27, the disclosure of which is hereby incorporated in toto herein by reference.
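As an illustration of complementarity "with few or no mismatches", the following Python sketch computes the reverse complement of a DNA strand and counts the positions at which a candidate strand fails to pair with it. The function names are hypothetical, only Watson-Crick pairing of A, C, G and T is modeled, and no threshold for a "stable duplex" is implied.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(strand: str, candidate: str) -> int:
    """Count positions where `candidate` does not base-pair with `strand`.

    Both sequences are written 5' to 3' and must have equal length.
    """
    if len(strand) != len(candidate):
        raise ValueError("sequences must have equal length")
    expected = reverse_complement(strand)
    return sum(a != b for a, b in zip(expected, candidate.upper()))
```

A result of zero means the candidate is the exact complement; small non-zero values correspond to the "few mismatches" case.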
  • a "fragment” provided herein is defined as a sequence of at least 6 (contiguous) nucleic acids or at least 4 (contiguous) amino acids, a length sufficient to allow for specific hybridization in the case of nucleic acids or for specific recognition of an epitope in the case of amino acids, and is at most some portion less than a full length sequence. Fragments may be derived from any contiguous portion of a nucleic acid or amino acid sequence of choice.
  • a full-length NOVX clone is identified as containing an ATG translation start codon and an in-frame stop codon. Any disclosed NOVX nucleotide sequence lacking an ATG start codon therefore encodes a truncated C-terminal fragment of the respective NOVX polypeptide, and requires that the corresponding full-length cDNA extend in the 5' direction of the disclosed sequence. Any disclosed NOVX nucleotide sequence lacking an in-frame stop codon similarly encodes a truncated N-terminal fragment of the respective NOVX polypeptide, and requires that the corresponding full-length cDNA extend in the 3' direction of the disclosed sequence.
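The full-length criterion above can be sketched as a simple check. This is an illustrative simplification, not the method used to annotate the disclosed sequences: it assumes the reading frame starts at the first nucleotide of the supplied sequence, and the function name and return strings are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_clone(seq: str) -> str:
    """Classify a sequence read in frame 0 by the presence of a start and an in-frame stop codon."""
    seq = seq.upper()
    has_start = seq.startswith("ATG")
    # Split into complete codons in frame 0, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    has_stop = any(codon in STOP_CODONS for codon in codons)
    if has_start and has_stop:
        return "full-length"
    if has_stop:
        return "5' extension needed"   # no ATG: C-terminal fragment
    if has_start:
        return "3' extension needed"   # no in-frame stop: N-terminal fragment
    return "5' and 3' extension needed"
```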
  • a “derivative” is a nucleic acid sequence or amino acid sequence formed from the native compounds either directly, by modification or partial substitution.
  • An “analog” is a nucleic acid sequence or amino acid sequence that has a structure similar to, but not identical to, the native compound, e.g., it differs from the native compound in certain components or side chains. Analogs may be synthetic or derived from a different evolutionary origin and may have a similar or opposite metabolic activity compared to wild type.
  • a “homolog” is a nucleic acid sequence or amino acid sequence of a particular gene that is derived from different species. Derivatives and analogs may be full length or other than full length.
  • nucleic acids or proteins of the invention include, but are not limited to, molecules comprising regions that are substantially homologous to the nucleic acids or proteins of the invention, in various embodiments, by at least about 70%, 80%, or 95%, or more identity, with a preferred identity of 80-95%, and most preferred identity of 98-99% or more, over a nucleic acid or amino acid sequence of identical size or when compared to an aligned sequence in which the alignment is done by a computer homology program known in the art, or whose encoding nucleic acid is capable of hybridizing to the complement of a sequence encoding the proteins under stringent, moderately stringent, or low stringent conditions. See Ausubel, et al., supra, and below.
  • a “homologous nucleic acid sequence” or “homologous amino acid sequence,” or variations thereof, are as defined in United States Patent 6,600,019, column 71 , the disclosure of which is hereby incorporated in toto by reference.
  • Homologous nucleic acid sequences according to the present invention include those nucleic acid sequences that encode conservative amino acid substitutions (see below) in SEQ ID NO:2n-1 , wherein n is an integer between 1 and 34, as well as a polypeptide possessing NOVX biological activity. Various biological activities of the NOVX proteins are described below.
  • a NOVX polypeptide is encoded by the open reading frame ("ORF") of a NOVX nucleic acid.
  • An ORF corresponds to a nucleotide sequence that could potentially be translated into a polypeptide.
  • a stretch of nucleic acids comprising an ORF is uninterrupted by a stop codon.
  • An ORF that represents the coding sequence for a full protein begins with an ATG “start” codon and terminates with one of the three “stop” codons, namely, TAA, TAG, or TGA.
  • an ORF may be any part of a coding sequence, with or without a start codon, a stop codon, or both.
  • a minimum size requirement is often set, e.g., a stretch of DNA that would encode a protein of 50 amino acids or more.
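The ORF definition above (an ATG start, an uninterrupted run of codons, one of the three stop codons, and a minimum size) can be sketched as a small scanner. This is a minimal illustration that searches only the three forward reading frames of the given strand; the function name and the `min_codons` parameter are hypothetical, with the default of 50 taken from the example size mentioned in the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 50) -> list[tuple[int, int]]:
    """Return (start, end) index pairs of ATG...stop ORFs on the forward strand.

    `end` is exclusive and includes the stop codon. An ORF is reported only if it
    encodes at least `min_codons` amino acids (the stop codon is not counted).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs
```

Scanning the reverse complement as well would cover the other three frames; that step is omitted here to keep the sketch short.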
  • the nucleotide sequences determined from the cloning of the human NOVX genes allow for the generation of probes and primers designed for use in identifying and/or cloning NOVX homologues in other cell types, e.g., from other tissues, as well as NOVX homologues from other vertebrates.
  • the probe/primer typically comprises substantially purified oligonucleotide.
  • the oligonucleotide typically comprises a region of nucleotide sequence that hybridizes under stringent conditions to at least about 12, 25, 50, 100, 150, 200, 250, 300, 350 or 400 consecutive sense strand nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34; or an anti-sense strand nucleotide sequence of SEQ ID NO:2n-1 , wherein n is an integer between 1 and 34; or of a naturally occurring mutant of SEQ ID NO:2n-1 , wherein n is an integer between 1 and 34.
  • Probes based on the human NOVX nucleotide sequences can be used to detect transcripts or genomic sequences encoding the same or homologous proteins.
  • the probe has a detectable label attached, e.g. the label can be a radioisotope, a fluorescent compound, an enzyme, or an enzyme co-factor.
  • Such probes can be used as a part of a diagnostic test kit for identifying cells or tissues which mis-express a NOVX protein, such as by measuring a level of a NOVX-encoding nucleic acid in a sample of cells from a subject, e.g., detecting NOVX mRNA levels or determining whether a genomic NOVX gene is up- or down-regulated or has been mutated or deleted.
  • a polypeptide having a biologically-active portion of a NOVX polypeptide refers to polypeptides as defined in United States Patent 6,600,019, column 71, lines 59 to 64, the disclosure of which is hereby incorporated in toto by reference.
  • a nucleic acid fragment encoding a "biologically-active portion of NOVX” can be prepared by isolating a portion of SEQ ID NO:2n-1 , wherein n is an integer between 1 and 34, that encodes a polypeptide having a NOVX biological activity (the biological activities of the NOVX proteins are described below), expressing the encoded portion of NOVX protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of NOVX.
  • a variant sequence can include a single nucleotide polymorphism (SNP).
  • SNP can, in some instances, be referred to as a "cSNP" to denote that the nucleotide sequence containing the SNP originates as a cDNA.
  • SNPs occurring within genes may result in an alteration of the amino acid encoded by the gene at the position of the SNP.
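Whether a SNP within a coding sequence alters the encoded amino acid can be checked by translating the affected codon before and after the change. The sketch below assumes the standard genetic code and a coding sequence that starts in frame at position 0; the function name, the 0-based `position` argument and the returned labels are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def snp_effect(cds: str, position: int, alt_base: str) -> str:
    """Classify a single-base change in an in-frame coding sequence."""
    cds = cds.upper()
    codon_start = position - position % 3
    ref_codon = cds[codon_start:codon_start + 3]
    offset = position - codon_start
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"
```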
  • Preferred embodiments include NOV1f, NOV1g, NOV1h, NOV1i, NOV2e, NOV2f, NOV5b, NOV5c, NOV6d, NOV6e, NOV6f, and NOV6g.
  • NOVX Nucleic Acid and Polypeptide Variants
  • the invention further encompasses nucleic acid molecules that differ from the nucleotide sequences of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, due to degeneracy of the genetic code and thus encode the same NOVX proteins as that encoded by the nucleotide sequences of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34.
  • an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence of SEQ ID NO:2n, wherein n is an integer between 1 and 34.
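Degeneracy of the genetic code means that nucleotide sequences differing at synonymous positions encode the same protein. A minimal Python sketch, assuming the standard genetic code and sequences supplied in frame, makes this concrete; the `translate` helper is hypothetical and the short sequences in the usage note are invented for illustration, not taken from any SEQ ID NO.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

For example, `translate("ATGGCTAAA")` and `translate("ATGGCCAAG")` both give the same three-residue peptide even though the two nucleotide sequences differ at two positions.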
  • DNA sequence polymorphisms that lead to changes in the amino acid sequences of the NOVX polypeptides may exist within a population (e.g., the human population).
  • Such genetic polymorphism in the NOVX genes may exist among individuals within a population due to natural allelic variation.
  • the terms "gene” and “recombinant gene” refer to nucleic acid molecules comprising an open reading frame (ORF) encoding a NOVX protein, preferably a vertebrate NOVX protein.
  • Such natural allelic variations can typically result in 1-5% variance in the nucleotide sequence of the NOVX genes. Any and all such nucleotide variations and resulting amino acid polymorphisms in the NOVX polypeptides, which are the result of natural allelic variation and that do not alter the functional activity of the NOVX polypeptides, are intended to be within the scope of the invention. Moreover, nucleic acid molecules encoding NOVX proteins from other species, and thus that have a nucleotide sequence that differs from a human SEQ ID NO:2n-1 , wherein n is an integer between 1 and 34, are intended to be within the scope of the invention.
  • Nucleic acid molecules corresponding to natural allelic variants and homologues of the NOVX cDNAs of the invention can be isolated based on their homology to the human NOVX nucleic acids disclosed herein using the human cDNAs, or a portion thereof, as a hybridization probe according to standard hybridization techniques under stringent hybridization conditions.
  • an isolated nucleic acid molecule of the invention is at least 6 nucleotides in length and hybridizes under stringent conditions to the nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34.
  • the nucleic acid is at least 10, 25, 50, 100, 250, 500, 750, 1000, 1500, or 2000 or more nucleotides in length.
  • an isolated nucleic acid molecule of the invention hybridizes to the coding region.
  • Homologs i.e., nucleic acids encoding NOVX proteins derived from species other than human
  • other related sequences e.g., paralogs
  • nucleic acid molecule of the invention that hybridizes under stringent conditions to a sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, corresponds to a naturally-occurring nucleic acid molecule.
  • a "naturally-occurring" nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).
  • a nucleic acid sequence that is hybridizable to the nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:2n-1 , wherein n is an integer between 1 and 34, or fragments, analogs or derivatives thereof, under conditions of moderate stringency is provided.
  • moderate stringency hybridization conditions are hybridization in 6X SSC, 5X Denhardt's solution, 0.5% SDS and 100 mg/ml denatured salmon sperm DNA at 55 °C, followed by one or more washes in 1X SSC, 0.1% SDS at 37 °C.
  • Other conditions of moderate stringency that may be used are well-known within the art. See, Ausubel, et al.
  • nucleic acid that is hybridizable to the nucleic acid molecule comprising the nucleotide sequences of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, or fragments, analogs or derivatives thereof, under conditions of low stringency, is provided.
  • low stringency hybridization conditions are hybridization in 35% formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 mg/ml denatured salmon sperm DNA, 10% (wt/vol) dextran sulfate at 40 °C, followed by one or more washes in 2X SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS at 50 °C.
  • Other conditions of low stringency that may be used are well known in the art (e.g., as employed for cross-species hybridizations). See, e.g., Ausubel, supra; and Proc Natl Acad Sci USA 78: 6789-6792 (1981).
  • nucleotide sequences of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, thereby leading to changes in the amino acid sequences of the encoded NOVX protein, without altering the functional ability of that NOVX protein.
  • nucleotide substitutions leading to amino acid substitutions at "non-essential" amino acid residues can be made in the sequence of SEQ ID NO:2n, wherein n is an integer between 1 and 34.
  • non-essential amino acid residue is a residue that can be altered from the wild-type sequences of the NOVX proteins without altering their biological activity, whereas an "essential" amino acid residue is required for such biological activity.
  • amino acid residues that are conserved among the NOVX proteins of the invention are predicted to be particularly non-amenable to alteration. Amino acids for which conservative substitutions can be made are well-known within the art.
  • NOVX proteins that contain changes in amino acid residues that are not essential for activity.
  • NOVX proteins differ in amino acid sequence from SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, yet retain biological activity.
  • the isolated nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the protein comprises an amino acid sequence at least about 80% homologous to SEQ ID NO:2n, wherein n is an integer between 1 and 34; more preferably at least about 90% homologous, even more preferably at least about 95% homologous, most preferably 98-99% homologous to SEQ ID NO:2n, wherein n is an integer between 1 and 34.
  • An isolated nucleic acid molecule encoding a NOVX protein homologous to the protein of SEQ ID NO:2n, wherein n is an integer between 1 and 34, can be created by introducing one or more nucleotide substitutions, additions or deletions into the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, such that one or more amino acid substitutions, additions or deletions are introduced into the encoded protein.
  • Mutations can be introduced into any one of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, by standard techniques, such as described in United States Patent 6,600,019, column 74, line 58 to column 75, line 14, the disclosure of which is hereby incorporated in toto by reference.
  • the encoded protein can be expressed by any recombinant technology known in the art and the activity of the protein can be determined.
  • amino acid families may also be determined based on side chain interactions.
  • Substituted amino acids may be fully conserved "strong” residues or fully conserved “weak” residues.
  • the "strong” group of conserved amino acid residues may be any one of the following groups: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY, FYW, wherein the single letter amino acid codes are grouped by those amino acids that may be substituted for each other.
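The "strong" groups listed above lend themselves to a simple lookup: two residues count as a strong conservative substitution if they are different and appear together in at least one group. The sketch below encodes exactly the nine groups given in the text; the function name is hypothetical.

```python
# The nine "strong" groups exactly as listed in the text.
STRONG_GROUPS = ("STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW")


def is_strong_substitution(original: str, replacement: str) -> bool:
    """Return True if the two single-letter residues differ and share a "strong" group."""
    a, b = original.upper(), replacement.upper()
    return a != b and any(a in group and b in group for group in STRONG_GROUPS)
```

Note that membership is not transitive across groups: M and V share MILV and M and F share MILF, but V and F share no group.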
  • a mutant NOVX protein can be assayed for (i) the ability to form protein:protein interactions with other NOVX proteins, other cell-surface proteins, or biologically-active portions thereof; (ii) complex formation between a mutant NOVX protein and a NOVX ligand; or (iii) the ability of a mutant NOVX protein to bind to an intracellular target protein or biologically-active portion thereof (e.g., avidin proteins).
  • a mutant NOVX protein can be assayed for the ability to regulate a specific biological function (e.g., regulation of insulin release).
  • NOVX gene expression can be attenuated by RNA interference.
  • One approach well-known in the art is short interfering RNA (siRNA) mediated gene silencing where expression products of a NOVX gene are targeted by specific double stranded NOVX derived siRNA nucleotide sequences that are complementary to at least a 19-25 nt long segment of the NOVX gene transcript, including the 5' untranslated (UT) region, the ORF, or the 3' UT region.
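The idea of an siRNA complementary to a 19-25 nt segment of the transcript can be sketched by enumerating every window of a chosen length along the transcript and deriving the complementary (antisense) RNA strand for each. This is only an enumeration under stated assumptions: the function name is hypothetical, 3' overhangs are not modeled, and no ranking or efficacy filtering of candidates is implied.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def sirna_candidates(transcript: str, length: int = 21):
    """Yield (start, target, antisense) for each window of `length` nt in the transcript.

    `target` is the transcript segment as RNA; `antisense` is its reverse complement,
    both written 5' to 3'. `length` must be in the 19-25 nt range given in the text.
    """
    if not 19 <= length <= 25:
        raise ValueError("length must be between 19 and 25 nt")
    rna = transcript.upper().replace("T", "U")
    for start in range(len(rna) - length + 1):
        target = rna[start:start + length]
        yield start, target, target.translate(_RNA_COMPLEMENT)[::-1]
```

Because the windows run over the whole transcript, candidates from the 5' UT region, the ORF and the 3' UT region are all produced.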
  • Targeted genes can be a NOVX gene, or an upstream or downstream modulator of the NOVX gene.
  • upstream or downstream modulators of a NOVX gene include, e.g., a transcription factor that binds the NOVX gene promoter, a kinase or phosphatase that interacts with a NOVX polypeptide, and polypeptides involved in a NOVX regulatory pathway.
  • A therapeutic method of the invention contemplates administering a NOVX siRNA construct as therapy to compensate for increased or aberrant NOVX expression or activity.
  • the NOVX ribopolynucleotide is obtained and processed into siRNA fragments, or a NOVX siRNA is synthesized, as described above.
  • the NOVX siRNA is administered to cells or tissues using known nucleic acid transfection techniques, as described above.
  • a NOVX siRNA specific for a NOVX gene will decrease or knockdown NOVX transcription products, which will lead to reduced NOVX polypeptide production, resulting in reduced NOVX polypeptide activity in the cells or tissues.
  • the present invention also encompasses a method of treating a disease or condition associated with the presence of a NOVX protein in an individual comprising administering to the individual an RNAi construct that targets the mRNA of the protein (the mRNA that encodes the protein) for degradation.
  • a specific RNAi construct includes a siRNA or a double stranded gene transcript that is processed into siRNAs.
  • the target protein is not produced or is not produced to the extent it would be in the absence of the treatment.
  • a NOVX siRNA is used in therapy. Methods for the generation and use of a NOVX siRNA are known to those skilled in the art. Example techniques are provided below.
  • Sense RNA (ssRNA) and antisense RNA (asRNA) of NOVX are produced using known methods such as transcription in RNA expression vectors.
  • the sense and antisense RNA are about 500 bases in length each.
  • the produced ssRNA and asRNA (0.5 µM) in 10 mM Tris-HCl (pH 7.5) with 20 mM NaCl are heated to 95 °C for 1 min, then cooled and annealed at room temperature for 12 to 16 h.
  • the RNAs are precipitated and resuspended in lysis buffer (below).
  • RNAs are electrophoresed in a 2% agarose gel in TBE buffer and stained with ethidium bromide. See, e.g., Sambrook et al., Molecular Cloning. Cold Spring Harbor Laboratory Press, Plainview, N.Y. (1989).
  • Untreated rabbit reticulocyte lysate (Ambion) is assembled according to the manufacturer's directions. dsRNA is incubated in the lysate at 30 °C for 10 min prior to the addition of mRNAs. Then NOVX mRNAs are added and the incubation is continued for an additional 60 min. The molar ratio of double stranded RNA to mRNA is about 200:1. The NOVX mRNA is radiolabeled (using known techniques) and its stability is monitored by gel electrophoresis. In a parallel experiment made with the same conditions, the double stranded RNA is internally radiolabeled with α-32P-ATP.
  • the band of double stranded RNA, about 21-23 bps, is eluted.
  • the efficacy of these 21-23 mers for suppressing NOVX transcription is assayed in vitro using the same rabbit reticulocyte assay described above using 50 nanomolar of double stranded 21-23 mer for each assay.
  • the sequence of these 21-23 mers is then determined using standard nucleic acid sequencing techniques.
  • 21 nt RNAs, based on the sequence determined above, are chemically synthesized using Expedite RNA phosphoramidites and thymidine phosphoramidite (Proligo, Germany). Synthetic oligonucleotides are deprotected and gel-purified (Genes & Dev. 15, 188-200, 2001), followed by Sep-Pak C18 cartridge (Waters, Milford, Mass., USA) purification (Biochemistry, 32:11658-11668, 1993).
  • Single-stranded RNAs (20 µM) are incubated in annealing buffer (100 mM potassium acetate, 30 mM HEPES-KOH at pH 7.4, 2 mM magnesium acetate) for 1 min at 90 °C followed by 1 h at 37 °C.
  • a cell culture known in the art to regularly express NOVX is propagated using standard conditions. 24 hours before transfection, at approx. 80% confluency, the cells are trypsinized and diluted 1:5 with fresh medium without antibiotics (1-3 × 10^5 cells/ml) and transferred to 24-well plates (500 µl/well). Transfection is performed using a commercially available lipofection kit and NOVX expression is monitored using standard techniques with positive and negative controls. A positive control is cells that naturally express NOVX while a negative control is cells that do not express NOVX. Base-paired 21 and 22 nt siRNAs with overhanging 3' ends mediate efficient sequence-specific mRNA degradation in lysates and in cell culture. Different concentrations of siRNAs are used.
  • siRNAs are effective at concentrations that are several orders of magnitude below the concentrations applied in conventional antisense or ribozyme gene targeting experiments.
  • Another aspect of the invention pertains to isolated antisense nucleic acid molecules that are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, or fragments, analogs or derivatives thereof.
  • An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense" nucleic acid encoding a protein (e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence).
  • antisense nucleic acid molecules comprise a sequence complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire NOVX coding strand, or to only a portion thereof.
  • Nucleic acid molecules encoding fragments, homologs, derivatives and analogs of a NOVX protein of SEQ ID NO:2n, wherein n is an integer between 1 and 34, or antisense nucleic acids complementary to a NOVX nucleic acid sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, are additionally provided.
  • an antisense nucleic acid molecule is antisense to a "coding region" of the coding strand of a nucleotide sequence encoding a NOVX protein.
  • coding region refers to the region of the nucleotide sequence comprising codons which are translated into amino acid residues.
  • the antisense nucleic acid molecule is antisense to a "noncoding region" of the coding strand of a nucleotide sequence encoding the NOVX protein.
  • noncoding region refers to 5' and 3' sequences which flank the coding region that are not translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).
  • antisense nucleic acids of the invention can be designed according to the rules of Watson and Crick or Hoogsteen base pairing.
  • the antisense nucleic acid molecule can be complementary to the entire coding region of NOVX mRNA, but more preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of NOVX mRNA.
  • the antisense oligonucleotide can be complementary to the region surrounding the translation start site of NOVX mRNA.
  • An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length.
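An antisense oligonucleotide directed at the region surrounding the translation start site can be sketched as the reverse complement of a window of the coding strand centered on the ATG. The function name and the `upstream` and `downstream` flank parameters are hypothetical; the sketch produces an unmodified DNA oligonucleotide and does not address the modified nucleotides discussed elsewhere in the text.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(coding_strand: str, start_codon_index: int,
                    upstream: int = 10, downstream: int = 10) -> str:
    """Return a DNA oligo (5' to 3') antisense to the window around the start codon.

    The window covers `upstream` nt before the ATG, the ATG itself, and
    `downstream` nt after it, clipped to the ends of the sequence.
    """
    seq = coding_strand.upper()
    lo = max(0, start_codon_index - upstream)
    hi = min(len(seq), start_codon_index + 3 + downstream)
    return seq[lo:hi].translate(_COMPLEMENT)[::-1]
```

With the default flanks the oligonucleotide is at most 23 nt long, which falls within the range of lengths listed above.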
  • an antisense nucleic acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions using procedures known in the art.
  • an antisense nucleic acid can be chemically synthesized using naturally-occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids (e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used).
  • modified nucleotides are well-known in the art and may be found and discussed, for example, in United States Patent 6,600,019, column 76, line 21, to column 77, line 21 , the disclosure of which is hereby incorporated in toto by reference.
  • Ribozymes and PNA Moieties
  • Nucleic acid modifications include, by way of non-limiting example, modified bases, and nucleic acids whose sugar phosphate backbones are modified or derivatized. These modifications are carried out at least in part to enhance the chemical stability of the modified nucleic acid, such that they may be used, for example, as antisense binding nucleic acids in therapeutic applications in a subject.
  • an antisense nucleic acid of the invention is a ribozyme.
  • Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a complementary region.
  • ribozymes (e.g., hammerhead ribozymes as described in Nature 334:585, 1988) can be used to catalytically cleave NOVX mRNA transcripts to thereby inhibit translation of NOVX mRNA.
  • a ribozyme having specificity for a NOVX-encoding nucleic acid can be designed based upon the nucleotide sequence of a NOVX cDNA disclosed herein (SEQ ID NO:2n-1, wherein n is an integer between 1 and 34).
  • a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a NOVX-encoding mRNA. See, U.S. Patents 4,987,071 and 5,116,742.
  • NOVX mRNA can also be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, Science 261:1411 (1993).
  • NOVX gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the NOVX nucleic acid (e.g., the NOVX promoter and/or enhancers) to form triple helical structures that prevent transcription of the NOVX gene in target cells.
  • the NOVX nucleic acids can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule as has been reported (see United States Patent 6,600,019, column 77, line 54 to column 78, line 15, the disclosure of which is hereby incorporated in toto by reference).
  • the oligonucleotide according to the present invention may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Proc. Natl. Acad. Sci. U.S.A.
  • oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., BioTechniques 6:958, 1988) or intercalating agents (see, e.g., Pharm. Res. 5:539, 1988).
  • the oligonucleotide may be conjugated to another molecule, e.g., a peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered cleavage agent, and the like.
  • a polypeptide according to the present invention includes a polypeptide of the amino acid sequence of NOVX polypeptides whose sequences are provided in any one of SEQ ID NO:2n, wherein n is an integer between 1 and 34.
  • the invention also includes a mutant or variant protein any of whose residues may be changed from the corresponding residues shown in any one of SEQ ID NO:2n, wherein n is an integer between 1 and 34, while still encoding a protein that maintains its NOVX activities and physiological functions, or a functional fragment thereof.
  • One aspect of the invention pertains to isolated NOVX proteins, and biologically-active portions thereof, or derivatives, fragments, analogs or homologs thereof.
  • polypeptide fragments suitable for use as immunogens to raise anti-NOVX antibodies are also provided.
  • native NOVX proteins can be isolated from cells or tissue sources by an appropriate purification scheme using standard protein purification techniques.
  • NOVX proteins are produced by recombinant DNA techniques.
  • a NOVX protein or polypeptide can be synthesized chemically using standard peptide synthesis techniques.
  • an “isolated” or “purified” polypeptide or protein or biologically-active portion thereof is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the NOVX protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized as more fully defined in United States Patent 6,600,019, column 79, lines 28 to 44, the disclosure of which is incorporated in toto herein.
  • the language “substantially free of chemical precursors or other chemicals” is as more fully defined in United States Patent 6,600,019, column 79, lines 51 to 55, the disclosure of which is incorporated in toto herein.
  • the language “substantially free of chemical precursors or other chemicals” includes preparations of NOVX proteins having less than about 30% (by dry weight) of chemical precursors or non-NOVX chemicals, preferably less than about 20%, even more preferably less than about 10%, still more preferably less than about 5%, and most preferably less than 1-2% chemical precursors or non-NOVX chemicals.
  • Biologically-active portions of NOVX proteins include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequences of the NOVX proteins (e.g., the amino acid sequence of SEQ ID NO:2n, wherein n is an integer between 1 and 34) that include fewer amino acids than the full-length NOVX proteins, and exhibit at least one activity of a NOVX protein.
  • biologically-active portions comprise a domain or motif with at least one activity of the NOVX protein.
  • a biologically-active portion of a NOVX protein can be a polypeptide which is, for example, 10, 25, 50, 100 or more amino acid residues in length.
  • other biologically-active portions, in which other regions of the protein are deleted can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native NOVX protein.
  • the NOVX protein has an amino acid sequence of SEQ ID NO:2n, wherein n is an integer between 1 and 34.
  • the NOVX protein is substantially homologous to SEQ ID NO:2n, wherein n is an integer between 1 and 34, and retains the functional activity of the protein of SEQ ID NO:2n, wherein n is an integer between 1 and 34, yet differs in amino acid sequence due to natural allelic variation or mutagenesis, as described in detail, below.
  • the NOVX protein is a protein that comprises an amino acid sequence at least about 80% homologous to the amino acid sequence of SEQ ID NO:2n, wherein n is an integer between 1 and 34, and retains the functional activity of the NOVX proteins of SEQ ID NO:2n, wherein n is an integer between 1 and 34.
  • the determination of homology between two or more sequences is as more fully defined in United States Patent 6,600,019, column 80, line 30 to column 81, line 10, the disclosure of which is incorporated in toto herein wherein the CDS (encoding) part of the DNA sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34.
  • sequence identity refers to the degree to which two polynucleotide or polypeptide sequences are identical on a residue-by-residue basis over a particular region of comparison.
  • percentage of sequence identity is calculated by comparing two optimally aligned sequences over that region of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I, in the case of nucleic acids) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the region of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity.
  • substantially identical denotes a characteristic of a polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least 80 percent sequence identity, preferably at least 85 percent identity and often 90 to 95 percent sequence identity, more usually at least 99 percent sequence identity as compared to a reference sequence over a comparison region.
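The percentage-of-sequence-identity calculation described above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been optimally aligned and padded with "-" gap characters to equal length; the function names and the decision not to count gap-to-gap positions as matches are illustrative assumptions, not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a comparison window of two pre-aligned sequences.

    Assumption: both inputs are already aligned and have the same length,
    with '-' marking gaps. The window size is the full aligned length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # Count positions at which the identical residue occurs in both sequences.
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    # Divide matched positions by the window size and multiply by 100.
    return 100.0 * matches / len(aligned_a)


def is_substantially_identical(aligned_a: str, aligned_b: str,
                               threshold: float = 80.0) -> bool:
    """Apply the 'at least 80 percent' floor stated for substantial identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned 10-residue windows that differ at a single position give 90 percent identity, which meets the 80 percent floor.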
  • NOVX chimeric or fusion proteins
  • As used herein, a NOVX “chimeric protein” or “fusion protein” comprises a NOVX polypeptide operatively-linked to a non-NOVX polypeptide.
  • A “NOVX polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a NOVX protein of SEQ ID NO:2n, wherein n is an integer between 1 and 34, whereas a “non-NOVX polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein that is not substantially homologous to the NOVX protein, e.g., a protein that is different from the NOVX protein and that is derived from the same or a different organism.
  • a NOVX fusion protein comprises at least one biologically-active portion of a NOVX protein.
  • a NOVX fusion protein comprises at least two biologically-active portions of a NOVX protein.
  • a NOVX fusion protein comprises at least three biologically-active portions of a NOVX protein.
  • the term "operatively-linked" is intended to indicate that the NOVX polypeptide and the non-NOVX polypeptide are fused in-frame with one another.
  • the non-NOVX polypeptide can be fused to the N-terminus or C-terminus of the NOVX polypeptide.
  • the fusion protein is a GST-NOVX fusion protein in which the NOVX sequences are fused to the C-terminus of the GST (glutathione S-transferase) sequences.
  • Such fusion proteins can facilitate the purification of recombinant NOVX polypeptides.
  • the fusion protein is a NOVX protein containing a heterologous signal sequence at its N-terminus.
  • expression and/or secretion of NOVX can be increased through use of a heterologous signal sequence.
  • the fusion protein is a NOVX-immunoglobulin fusion protein in which the NOVX sequences are fused to sequences derived from a member of the immunoglobulin protein family.
  • the NOVX-immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject to inhibit an interaction between a NOVX ligand and a NOVX protein on the surface of a cell, to thereby suppress NOVX-mediated signal transduction in vivo.
  • the NOVX-immunoglobulin fusion proteins can be used to affect the bioavailability of a NOVX cognate ligand. Inhibition of the NOVX ligand/NOVX interaction may be useful therapeutically for both the treatment of proliferative and differentiative disorders, as well as modulating (e.g. promoting or inhibiting) cell survival.
  • the NOVX-immunoglobulin fusion proteins of the invention can be used as immunogens to produce anti-NOVX antibodies in a subject, to purify NOVX ligands, and in screening assays to identify molecules that inhibit the interaction of NOVX with a NOVX ligand.
  • a NOVX chimeric or fusion protein of the invention can be produced by standard recombinant DNA techniques as described and more fully defined in United States Patent 6,600,019, column 82, lines 15 to 37, the disclosure of which is incorporated in toto herein.
  • NOVX Agonists and Antagonists
  • the present invention also pertains to variants of the NOVX proteins that function as either NOVX agonists (i.e., mimetics) or as NOVX antagonists as more fully defined in United States Patent 6,600,019, column 82, line 40 to column 83, line 19, the disclosure of which is incorporated in toto herein.
  • antibody refers to immunoglobulin molecules and immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain an antigen binding site that specifically binds (immunoreacts with) an antigen.
  • Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2 fragments, and an Fab expression library.
  • Antibodies may be any of the classes IgG, IgM, IgA, IgE and IgD, and include subclasses such as IgG1, IgG2, and others.
  • the light chain may be a kappa chain or a lambda chain. Reference herein to antibodies includes a reference to all such classes, subclasses and types of antibody species.
  • An isolated NOVX full length protein or a portion or fragment thereof can be used as an immunogen to generate antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal and monoclonal antibody preparation.
  • An antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence of the full length protein, such as an amino acid sequence of SEQ ID NO:2n, wherein n is an integer between 1 and 34, and encompasses an epitope.
  • the antigenic peptide may comprise at least 10 amino acid residues, or at least 15, at least 20, or at least 30 amino acid residues.
  • Epitopes encompassed by the antigenic peptide are regions of the protein that are located on its surface; commonly these are hydrophilic regions.
  • At least one epitope encompassed by the antigenic peptide is a region of NOVX that is located on the surface of the protein, e.g., a hydrophilic region and may be determined by a hydrophobicity analysis of the NOVX protein sequence.
  • hydropathy plots showing regions of hydrophilicity and hydrophobicity may be generated by any method well known in the art (for example see PNAS USA 78:3824,1981; and J. Mol. Biol. 157:105, 1982).
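A hydropathy profile of the kind referred to above can be sketched in Python as follows. The sketch uses the hydropathy scale from the cited J. Mol. Biol. 157:105 (1982) reference (Kyte and Doolittle) with a sliding-window average; the window length of 7 and the cutoff of 0.0 for flagging hydrophilic windows are illustrative assumptions, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values; positive values are hydrophobic,
# negative values are hydrophilic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list[float]:
    """Mean hydropathy of every full window along the protein sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[res] for res in seq]  # KeyError on unknown residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophilic_windows(sequence: str, window: int = 7,
                        cutoff: float = 0.0) -> list[tuple[int, float]]:
    """(start index, mean score) of windows below the cutoff.

    Such windows are candidate surface-exposed regions; the cutoff is an
    assumption chosen for illustration only.
    """
    profile = hydropathy_profile(sequence, window)
    return [(i, score) for i, score in enumerate(profile) if score < cutoff]
```

Windows returned by `hydrophilic_windows` are only candidates for surface-located epitopes; any such prediction would need experimental confirmation.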
  • epitope includes any protein determinant capable of specific binding to an immunoglobulin or T-cell receptor.
  • Epitopic determinants usually consist of chemically active surface groupings of molecules such as amino acids or sugar side chains and usually have specific three dimensional structural characteristics, as well as specific charge characteristics.
  • a NOVX polypeptide or a fragment thereof comprises at least one antigenic epitope.
  • An anti-NOVX antibody of the present invention is said to specifically bind to antigen NOVX when the equilibrium binding constant (KD) is ≤ 1 μM, preferably ≤ 100 nM, more preferably ≤ 10 nM, and most preferably ≤ 100 pM to about 1 pM, as measured by assays including radioligand binding assays or similar assays known to skilled artisans.
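The binding thresholds stated above can be expressed as a small Python sketch, assuming each threshold is an upper bound on the measured equilibrium binding constant and that the value is supplied in molar units. The function name and the tier labels are hypothetical and serve only to illustrate the ordering of the thresholds.

```python
def binding_tier(kd_molar: float) -> str:
    """Map a measured equilibrium binding constant (in mol/L) to a tier.

    Thresholds follow the text: 1 uM, 100 nM, 10 nM and 100 pM.
    Tier labels are illustrative, not terms defined in the disclosure.
    """
    if kd_molar <= 0:
        raise ValueError("the binding constant must be positive")
    if kd_molar <= 100e-12:   # at or below 100 pM
        return "most preferred"
    if kd_molar <= 10e-9:     # at or below 10 nM
        return "more preferred"
    if kd_molar <= 100e-9:    # at or below 100 nM
        return "preferred"
    if kd_molar <= 1e-6:      # at or below 1 uM
        return "specific binding"
    return "not specific"
```

A lower value indicates tighter binding, so the checks run from the most stringent threshold to the least stringent one.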
  • NOVX nucleic acid molecules are used directly for production of antibodies recognizing NOVX polypeptides.
  • Antibodies can be prepared by genetic or DNA-based immunization. It has been shown that intramuscular immunization of mice with a naked DNA plasmid led to expression of reporter proteins in muscle cells (Science 247:1465, 1990) and that this technology could stimulate an immune response (Nature 356:152, 1992). The success of genetic immunization in stimulating both cellular and humoral immune responses has been widely reported (e.g. in Annu. Rev. Immunol. 15:617, 1997; Immunol. Today 19:89, 1998; Annu. Rev. Immunol. 18:927, 2000). Using this technology, antibodies can be generated through immunization with a cDNA sequence encoding the protein in question. Following genetic immunization, the animal's immune system is activated in response to the synthesis of the foreign protein.
  • the quantity of protein produced in vivo following genetic immunization is within the picogram to nanogram range, which is much lower than the amounts of protein introduced by conventional immunization protocols.
  • a very efficient immune response is achieved because the foreign protein is expressed directly in, or is quickly taken up by, antigen-presenting dendritic cells (J. Leuk. Biol. 66:350, 1999; J. Exp. Med. 186:1481, 1997; Nat. Med. 2:1122, 1996).
  • a further increase in the effectiveness of genetic immunization is due to the inherent immune-enhancing properties of the DNA itself, i.e., the presence of CpG-motifs in the plasmid backbone, which activate both dendritic cells (J. Immunol. 161:3042, 1998) and B-cells (Nature 374:546, 1995).
  • Anti-NOVX antibodies can further comprise humanized or human antibodies. Humanization can be performed following methods known in the art (Nature, 321:522-525, 1986; Nature, 332:323-327, 1988; Science, 239:1534-1536, 1988; U.S. Patent No. 5,225,539; and Curr. Op. Struct. Biol., 2:593-596, 1992).
  • Human Antibodies
  • Fully human antibodies are antibody molecules in which the entire sequence of both the light chain and the heavy chain, including the CDRs, arise from human genes. Such antibodies are termed "human antibodies", or “fully human antibodies” herein.
  • Human monoclonal antibodies can be prepared by methods known in the art, see, for example, Immunol Today 4:72, 1983; In: MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc., pp. 77-96, 1985; PNAS USA 80:2026, 1983; J. Mol. Biol., 227:381, 1991; J. Mol. Biol., 222:581, 1991; U.S. Patent Nos.
  • techniques can be adapted for the production of single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
  • methods can be adapted for the construction of Fab expression libraries (see e.g., Science 246:1275, 1989) to allow rapid and effective identification of monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments, analogs or homologs thereof.
  • Antibody fragments that contain the idiotypes to a protein antigen may be produced by techniques known in the art including, but not limited to: (i) an F (ab .
  • Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that have binding specificities for at least two different antigens. In the present case, one of the binding specificities is for an antigenic protein of the invention.
  • the second binding target is any other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.
  • bispecific antibodies are known in the art, see Nature, 305:537, 1983 and may be purified by affinity chromatography steps (EMBO J., 10:3655, 1991).
  • bispecific antibodies see, for example, Methods in Enzymology, 121:210 (1986); Science 229:81 (1985); J. Exp. Med. 175:217 (1992); J. Immunol. 148(5):1547 (1992); "diabody” technology described in PNAS USA 90:6444 (1993); and single-chain Fv (sFv) dimers in J. Immunol. 152:5368 (1994).
  • Antibodies with more than two valencies are contemplated, see for example J. Immunol. 147:60 (1991).
  • Heteroconjugate antibodies composed of two covalently joined antibodies are also within the scope of the present invention, see for example, U.S. Patent No. 4,676,980 and EP 03089. It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic protein chemistry, including those involving crosslinking agents. For example, immunotoxins can be constructed using a disulfide exchange reaction or by forming a thioether bond. Examples of suitable reagents for this purpose include iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.
  • the invention also pertains to immunoconjugates comprising an antibody according to the present invention conjugated to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a radioconjugate).
  • the antibodies disclosed herein can also be formulated as immunoliposomes prepared by methods known in the art, such as described in PNAS USA, 82:3688, 1985; PNAS USA, 77:4030, 1980; and U.S. Pat. Nos. 4,485,045; 4,544,545; and 5,013,556; J. Biol. Chem., 257:286, 1982; J. National Cancer Inst., 81(19):1484, 1989.
  • methods for the screening of antibodies that possess the desired specificity include, but are not limited to, enzyme linked immunosorbent assay (ELISA) and other immunologically mediated techniques known within the art.
  • selection of antibodies that are specific to a particular domain of an NOVX protein is facilitated by generation of hybridomas that bind to the fragment of an NOVX protein possessing such a domain.
  • antibodies that are specific for a desired domain within an NOVX protein, or derivatives, fragments, analogs or homologs thereof, are also provided herein.
  • Antibodies directed against a NOVX protein of the invention may be used in methods known within the art relating to the localization and/or quantitation of a NOVX protein (e.g., for use in measuring levels of the NOVX protein within appropriate physiological samples, for use in diagnostic methods, for use in imaging the protein, and the like).
  • antibodies specific to a NOVX protein, or derivative, fragment, analog or homolog thereof, that contain the antibody derived antigen binding domain are utilized as pharmacologically active compounds (referred to hereinafter as "Therapeutics").
  • An antibody specific for a NOVX protein of the invention can be used to isolate a NOVX polypeptide by standard techniques, such as immunoaffinity, chromatography or immunoprecipitation.
  • An antibody to a NOVX polypeptide can facilitate the purification of a natural NOVX antigen from cells, or of a recombinantly produced NOVX antigen expressed in host cells.
  • an anti-NOVX antibody can be used to detect the antigenic NOVX protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the antigenic NOVX protein.
  • Antibodies directed against a NOVX protein can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance.
  • detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials.
  • suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase;
  • suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin;
  • suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin;
  • an example of a luminescent material includes luminol;
  • examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.
  • Antibodies of the present invention may be used as therapeutic agents. Such agents will generally be employed to treat or prevent a disease or pathology in a subject.
  • An antibody preparation, preferably one having high specificity and high affinity for its target antigen, is administered to the subject and will generally have an effect due to its binding with the target.
  • Such an effect may be one of two kinds, depending on the specific nature of the interaction between the given antibody molecule and the target antigen in question.
  • administration of the antibody may abrogate or inhibit the binding of the target with an endogenous ligand to which it naturally binds.
  • the antibody binds to the target and masks a binding site of the naturally occurring ligand, wherein the ligand serves as an effector molecule.
  • the receptor mediates a signal transduction pathway for which ligand is responsible.
  • the effect may be one in which the antibody elicits a physiological result by virtue of binding to an effector binding site on the target molecule.
  • the target, a receptor having an endogenous ligand which may be absent or defective in the disease or pathology, binds the antibody as a surrogate effector ligand, initiating a receptor-based signal transduction event by the receptor.
  • a therapeutically effective amount of an antibody of the invention relates generally to the amount needed to achieve a therapeutic objective. As noted above, this may be a binding interaction between the antibody and its target antigen that, in certain cases, interferes with the functioning of the target, and in other cases, promotes a physiological response.
  • the amount required to be administered will furthermore depend on the binding affinity of the antibody for its specific antigen, and will also depend on the rate at which an administered antibody is depleted from the free volume of the subject to which it is administered.
  • Common ranges for therapeutically effective dosing of an antibody or antibody fragment of the invention may be, by way of nonlimiting example, from about 0.1 mg/kg body weight to about 50 mg/kg body weight. Common dosing frequencies may range, for example, from twice daily to once a week.
  • Antibodies specifically binding a protein of the invention, as well as other molecules identified by the screening assays disclosed herein, can be administered for the treatment of various disorders in the form of pharmaceutical compositions.
  • Principles and considerations involved in preparing such compositions, as well as guidance in the choice of components are provided, for example, in Remington: The Science And Practice Of Pharmacy 19th ed. (Alfonso R. Gennaro, et al., editors) Mack Pub. Co., Easton, Pa.: 1995; Drug Absorption Enhancement: Concepts, Possibilities, Limitations, And Trends, Harwood Academic Publishers, Langhorne, Pa., 1994; and Peptide And Protein Drug Delivery (Advances In Parenteral Sciences, Vol. 4), 1991, M. Dekker, New York.
  • Where the antigenic protein is intracellular and whole antibodies are used as inhibitors, internalizing antibodies are preferred.
  • liposomes can also be used to deliver the antibody, or an antibody fragment, into cells. Where antibody fragments are used, the smallest inhibitory fragment that specifically binds to the binding domain of the target protein is preferred.
  • peptide molecules can be designed that retain the ability to bind the target protein sequence. Such peptides can be synthesized chemically and/or produced by recombinant DNA technology. See, e.g., PNAS USA, 90:7889, 1993.
  • the formulation herein can also contain more than one active compound as necessary for the particular indication being treated, preferably those with complementary activities that do not adversely affect each other.
  • the composition can comprise an agent that enhances its function, such as, for example, a cytotoxic agent, cytokine, chemotherapeutic agent, or growth-inhibitory agent.
  • Such molecules are suitably present in combination in amounts that are effective for the purpose intended.
  • the active ingredients can also be entrapped in microcapsules prepared, for example, by coacervation techniques or by interfacial polymerization, for example, hydroxymethylcellulose or gelatin-microcapsules and poly-(methylmethacrylate) microcapsules, respectively, in colloidal drug delivery systems (for example, liposomes, albumin microspheres, microemulsions, nano-particles, and nanocapsules) or in macroemulsions.
  • the formulations to be used for in vivo administration must be sterile. This is readily accomplished by filtration through sterile filtration membranes.
  • sustained-release preparations can be prepared. Suitable examples of sustained-release preparations include semipermeable matrices of solid hydrophobic polymers containing the antibody, which matrices are in the form of shaped articles, e.g., films, or microcapsules. Examples of sustained-release matrices include polyesters, hydrogels (for example, poly (2-hydroxyethyl-methacrylate), or poly(vinylalcohol)), polylactides (U.S. Pat. No.
  • copolymers of L-glutamic acid and γ-ethyl-L-glutamate, non-degradable ethylene-vinyl acetate
  • degradable lactic acid-glycolic acid copolymers such as the LUPRON DEPOT™ (injectable microspheres composed of lactic acid-glycolic acid copolymer and leuprolide acetate)
  • poly-D-(-)-3-hydroxybutyric acid. While polymers such as ethylene-vinyl acetate and lactic acid-glycolic acid enable release of molecules for over 100 days, certain hydrogels release proteins for shorter time periods.
  • An agent for detecting an analyte protein is, for example, an antibody capable of binding to an analyte protein, preferably an antibody with a detectable label.
  • Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab')2), can be used.
  • the term "labeled", with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with another reagent that is directly labeled.
  • biological sample is intended to include tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. Included within the usage of the term "biological sample”, therefore, is blood and a fraction or component of blood including blood serum, blood plasma, or lymph. That is, the detection method of the invention can be used to detect an analyte mRNA, protein, or genomic DNA in a biological sample in vitro as well as in vivo.
  • in vitro techniques for detection of an analyte mRNA include Northern hybridizations and in situ hybridizations.
  • In vitro techniques for detection of an analyte protein include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations, and immunofluorescence.
  • In vitro techniques for detection of an analyte genomic DNA include Southern hybridizations. Procedures for conducting immunoassays are described, for example in "ELISA: Theory and
  • In vivo techniques for detection of an analyte protein include introducing into a subject a labeled anti-analyte protein antibody.
  • the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques.
  • Another aspect of the invention pertains to vectors, preferably expression vectors, containing a nucleic acid encoding a NOVX protein according to the present invention, or derivatives, fragments, analogs or homologs thereof.
  • NOVX recombinant expression vectors and host cells are more fully defined in United States Patent 6,600,019, column 92, line 31 to column 96, line 9, the disclosure of which is incorporated in toto herein.
  • Transgenic NOVX Animals
  • The host cells of the invention can also be used to produce non-human transgenic animals by methods known in the art, for example as described in U.S. Patent Nos. 4,736,866; 4,870,009; and 4,873,191; and Hogan, 1986. In: MANIPULATING THE MOUSE EMBRYO, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; Cell 51:503 (1987); Cell 69:915, 1992; In: TERATOCARCINOMAS AND EMBRYONIC STEM CELLS: A PRACTICAL APPROACH, Robertson, ed. IRL, Oxford, pp. 113-152, 1987; Curr. Opin. Biotechnol.
  • The NOVX nucleic acid molecules, NOVX proteins, and anti-NOVX antibodies of the present invention (also referred to herein as "active compounds"), and derivatives, fragments, analogs and homologs thereof, can be incorporated into pharmaceutical compositions suitable for administration.
  • Such compositions are well known in the compounding arts and are further described in United States Patent 6,600,019, column 98, line 25 to column 101, line 14, the disclosure of which is incorporated in toto herein.
  • compositions of the present invention can be included in a container, pack, or dispenser together with instructions for administration.
  • the isolated nucleic acid molecules of the invention can be used to express NOVX protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect NOVX mRNA (e.g., in a biological sample) or a genetic lesion in a NOVX gene, and to modulate NOVX activity, as described further, below.
  • the NOVX proteins can be used to screen drugs or compounds that modulate the NOVX protein activity or expression as well as to treat disorders characterized by insufficient or excessive production of NOVX protein or production of NOVX protein forms that have decreased or aberrant activity compared to NOVX wild-type protein (e.g., diabetes (regulates insulin release); obesity (binds and transports lipids); metabolic disturbances associated with obesity, the metabolic syndrome X as well as anorexia and wasting disorders associated with chronic diseases and various cancers; infectious disease (possesses anti-microbial activity); and the various dyslipidemias).
  • the anti-NOVX antibodies of the invention can be used to detect and isolate NOVX proteins and modulate NOVX activity.
  • the invention can be used in methods to influence appetite, absorption of nutrients and the disposition of metabolic substrates in both a positive and negative fashion.
  • the invention further pertains to novel agents identified by the screening assays described herein and uses thereof for treatments as described, supra.
  • the present invention also provides for a method (also referred to herein as a "screening assay") for identifying modulators, i.e., candidate or test compounds or agents (e.g., peptides, peptidomimetics, small molecules or other drugs) that bind to NOVX proteins or have a stimulatory or inhibitory effect on, e.g., NOVX protein expression or NOVX protein activity.
  • the invention also includes compounds identified in the screening assays described herein. Such assays are well known in the art and are more fully described in United States Patent 6,600,019, column 102, line 45 to column 195, line 51, the disclosure of which is incorporated in toto herein.
  • modulators of NOVX protein expression are identified in a method wherein a cell is contacted with a candidate compound and the expression of NOVX mRNA or protein in the cell is determined. The level of expression of NOVX mRNA or protein in the presence of the candidate compound is compared to the level of expression of NOVX mRNA or protein in the absence of the candidate compound. The candidate compound can then be identified as a modulator of NOVX mRNA or protein expression based upon this comparison. For example, when expression of NOVX mRNA or protein is greater (i.e., statistically significantly greater) in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of NOVX mRNA or protein expression.
  • Alternatively, when expression of NOVX mRNA or protein is less (i.e., statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of NOVX mRNA or protein expression.
  • the level of NOVX mRNA or protein expression in the cells can be determined by methods described herein for detecting NOVX mRNA or protein.
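The comparison of expression levels in the presence and absence of a candidate compound can be sketched in Python as follows. The text requires only that the difference be statistically significant and does not name a test; the exact two-sided permutation test on the difference of means, the significance level of 0.05, and the function names used here are all illustrative assumptions.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated: list[float], control: list[float]) -> float:
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(control)
    observed = abs(mean(treated) - mean(control))
    total = sum(pooled)
    extreme = 0
    n_splits = 0
    # Enumerate every way of relabelling the pooled measurements.
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_control)
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
        n_splits += 1
    return extreme / n_splits


def classify_modulator(treated: list[float], control: list[float],
                       alpha: float = 0.05) -> str:
    """Label the candidate compound from expression measurements.

    treated: expression levels measured in the presence of the compound.
    control: expression levels measured in its absence.
    """
    if permutation_p_value(treated, control) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(control) else "inhibitor"
```

With very few replicates an exact permutation test cannot reach small p-values (three measurements per group give a minimum of 0.1), so at least four replicates per condition are needed for this sketch to identify a modulator at the assumed 0.05 level.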
  • the NOVX proteins can be used as "bait proteins" in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Patent No. 5,283,317; Cell 72:223, 1993; J. Biol. Chem. 268:12046, 1993; Biotechniques 14:920, 1993; and Oncogene 8:1693, 1993), to identify other proteins that bind to or interact with NOVX ("NOVX-binding proteins" or "NOVX-bp") and modulate NOVX activity.
  • NOVX-binding proteins are also involved in the propagation of signals by the NOVX proteins as, for example, upstream or downstream elements of the NOVX pathway.
  • the present invention further pertains to novel agents identified by the aforementioned screening assays and uses thereof for treatments as described herein.
  • Detection Assays
  • cDNA sequences identified herein can be used in numerous ways as polynucleotide reagents.
  • these sequences can be used to: (i) map their respective genes on a chromosome and, thus, locate gene regions associated with genetic disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample.
  • NOVX genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the NOVX sequences. Computer analysis of the NOVX sequences can be used to rapidly select primers that do not span more than one exon in the genomic DNA, since primers spanning more than one exon would complicate the amplification process. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the NOVX sequences will yield an amplified fragment. See, for example, Science 220:919 (1983). Somatic cell hybrids containing only fragments of human chromosomes can also be produced by using human chromosomes with translocations and deletions.
  • Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location; see Verma et al., HUMAN CHROMOSOMES: A MANUAL OF BASIC TECHNIQUES (Pergamon Press, New York 1988).
  • differences in the DNA sequences between individuals affected and unaffected with a disease associated with the NOVX gene can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.
  • the present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, pharmacogenomics, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual prophylactically.
  • diagnostic assays for determining NOVX protein and/or nucleic acid expression as well as NOVX activity, in the context of a biological sample (e.g., blood, serum, cells, tissue) to thereby determine whether an individual is afflicted with a disease or disorder, or is at risk of developing a disorder, associated with aberrant NOVX expression or activity.
  • the disorders include, but are not limited to metabolic disorders, diabetes, obesity, infectious disease, anorexia, cancer-associated cachexia, cancer, neurodegenerative disorders, Alzheimer's Disease, Parkinson's Disorder, immune disorders, and hematopoietic disorders, and the various dyslipidemias, metabolic disturbances associated with obesity, the metabolic syndrome X and wasting disorders associated with chronic diseases and various cancers.
  • Another aspect of the invention provides methods for determining NOVX protein, nucleic acid expression or activity in an individual to thereby select appropriate therapeutic or prophylactic agents for that individual (referred to herein as "pharmacogenomics").
  • Pharmacogenomics allows for the selection of agents (e.g., drugs) for therapeutic or prophylactic treatment of an individual based on the genotype of the individual (e.g., the genotype of the individual examined to determine the ability of the individual to respond to a particular agent).
  • Yet another aspect of the invention pertains to monitoring the influence of agents (e.g., drugs, compounds) on the expression or activity of NOVX in clinical trials.
  • An exemplary method for detecting the presence or absence of NOVX in a biological sample involves obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting NOVX protein or the nucleic acid (e.g., mRNA, genomic DNA) that encodes NOVX protein such that the presence of NOVX is detected in the biological sample.
  • An agent for detecting NOVX mRNA or genomic DNA is a labeled nucleic acid probe capable of hybridizing to NOVX mRNA or genomic DNA as described herein.
  • an agent for detecting NOVX protein can be an antibody capable of binding to NOVX protein, preferably an antibody with a detectable label as described herein.
  • the biological sample contains protein molecules from the test subject.
  • the biological sample can contain mRNA molecules from the test subject or genomic DNA molecules from the test subject.
  • the invention also encompasses kits for detecting the presence of NOVX in a biological sample.
  • the kit can comprise: a labeled compound or agent capable of detecting NOVX protein or mRNA in a biological sample; means for determining the amount of NOVX in the sample; and/or means for comparing the amount of NOVX in the sample with a standard.
  • the compound or agent can be packaged in a suitable container.
  • the kit can further comprise instructions for using the kit to detect NOVX protein or nucleic acid.
  • Assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant NOVX expression or activity.
  • the methods of the invention can also be used to detect genetic lesions in a NOVX gene.
  • a subject with the lesioned gene is at risk for a disorder characterized by aberrant cell proliferation and/or differentiation.
  • such genetic lesions can be detected by ascertaining the existence of at least one of: (i) a deletion of one or more nucleotides from a NOVX gene; (ii) an addition of one or more nucleotides to a NOVX gene; (iii) a substitution of one or more nucleotides of a NOVX gene; (iv) a chromosomal rearrangement of a NOVX gene; (v) an alteration in the level of a messenger RNA transcript of a NOVX gene; (vi) aberrant modification of a NOVX gene, such as of the methylation pattern of the genomic DNA; (vii) the presence of a non-wild-type splicing pattern of a messenger RNA transcript of a NOVX gene; (viii) a non-wild-type level of a NOVX protein; (ix) allelic loss of a NOVX gene; and (x) inappropriate post-translational modification of a NOVX protein.
  • detection of the lesion involves the use of a probe/primer in a polymerase chain reaction (PCR) (U.S. Patent Nos. 4,683,195 and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see Science 241:1077, 1988; and PNAS USA 91:360, 1994), the latter of which can be particularly useful for detecting point mutations in the NOVX gene (see Nucl. Acids Res. 23:675, 1995).
  • This method can include the steps of collecting a sample of cells from a patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the nucleic acid sample with one or more primers that specifically hybridize to a NOVX gene under conditions such that hybridization and amplification of the NOVX gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample.
  • Alternative amplification methods include: self-sustained sequence replication (PNAS USA 87:1874, 1990); transcriptional amplification system (PNAS USA 86:1173, 1989); Qβ Replicase (BioTechnology 6:1197, 1988); or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques well known to those of skill in the art. These detection schemes are especially useful for the detection of nucleic acid molecules if such molecules are present in very low numbers.
  • mutations in a NOVX gene from a sample cell can be identified by alterations in restriction enzyme cleavage patterns.
  • sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined by gel electrophoresis and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA.
  • sequence-specific ribozymes (see, e.g., U.S. Patent No. 5,493,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
  • genetic mutations in NOVX can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to high-density arrays containing hundreds or thousands of oligonucleotide probes (see, e.g., Human Mutation 7:244, 1996; Nat. Med. 2:753, 1996). For example, genetic mutations in NOVX can be identified in two-dimensional arrays containing light-generated DNA probes. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations.
  • Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
  • any of a variety of sequencing reactions known in the art can be used to directly sequence the NOVX gene and detect mutations by comparing the sequence of the sample NOVX with the corresponding wild-type (control) sequence.
  • for examples of sequencing reactions, see PNAS USA 74:560 (1977); PNAS USA 74:5463 (1977); Biotechniques 19:448, 1995; Adv. Chromatography 36:127, 1996; and Appl. Biochem. Biotechnol. 38:147, 1993.
  • Other methods for detecting mutations in the NOVX gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (see, e.g., Science 230:1242, 1985; PNAS USA 85:4397, 1988; Methods Enzymol. 217:286, 1992).
  • the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so-called "DNA mismatch repair" enzymes) in defined systems for detecting and mapping point mutations in NOVX cDNAs (see Carcinogenesis 15:1657, 1994; U.S. Patent No. 5,459,039).
  • alterations in electrophoretic mobility will be used to identify mutations in NOVX genes, for example by single strand conformation polymorphism (SSCP) analysis (see PNAS USA 86:2766, 1989; Mutat. Res. 285:125, 1993; Genet. Anal. Tech. Appl. 9:73, 1992; Trends Genet. 7:5, 1991).
  • the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (see, e.g., Nature 313:495, 1985; Biophys. Chem. 265:12753, 1987).
  • examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (see, e.g., Nature 324:163, 1986; PNAS USA 86:6230, 1989).
  • Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization; see, e.g., Nucl. Acids Res. 17:2437-2448, 1989) or at the extreme 3'-terminus of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (see, e.g., Tibtech. 11:238, 1993).
  • amplification may also be performed using Taq ligase (see, e.g., PNAS USA 88:189, 1991).
  • ligation will occur only if there is a perfect match at the 3'-terminus of the 5' sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.
  • the methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a NOVX gene.
  • Pharmacogenomics: agents or modulators that have a stimulatory or inhibitory effect on NOVX activity (e.g., NOVX gene expression), as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) disorders.
  • the disorders include but are not limited to, e.g., those diseases, disorders and conditions listed above, and more particularly include those diseases, disorders, or conditions associated with homologs of a NOVX protein, such as those summarized in Table A.
  • Pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) permits the selection of effective agents (e.g., drugs) for prophylactic or therapeutic treatments based on a consideration of the individual's genotype. Such pharmacogenomics can further be used to determine appropriate dosages and therapeutic regimens. Accordingly, the activity of NOVX protein, expression of NOVX nucleic acid, or mutation content of NOVX genes in an individual can be determined to thereby select appropriate agent(s) for therapeutic or prophylactic treatment of the individual. Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons, e.g., Clin. Exp. Pharmacol.
  • pharmacogenetic studies can be used to apply genotyping of polymorphic alleles encoding drug-metabolizing enzymes to the identification of an individual's drug responsiveness phenotype. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a NOVX modulator, such as a modulator identified by one of the exemplary screening assays described herein.
  • the invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant NOVX expression or activity.
  • ADAMs and MMPs are broadly relevant to extracellular proteolysis. Proteolysis of the extracellular matrix plays a critical role in establishing tissue architecture during development and in tissue degradation in diseases such as cancer, arthritis, Alzheimer disease, and a variety of inflammatory conditions.
  • the proteolytic enzymes responsible include members of diverse protease families and they may work in concert or in cascades to degrade or process molecules.
  • ADAM family members are quite similar in domain organization, bearing, from amino to carboxyl termini, a signal peptide, a proregion, a zinc metalloprotease catalytic domain with the typical reprolysin signature motif, a disintegrin domain, a cysteine-rich domain, an EGF-like domain, and, in many cases, a membrane-spanning region and a cytoplasmic domain with signaling potential.
  • Members of the ADAMTS family differ substantially from the prototypic ADAM structure in that they lack the EGF-like domain, do not have a canonical disintegrin sequence, and possess modules with similar thrombospondin type 1 repeats.
  • the novel gene described here is a novel splice variant of ADAM-TS 18: the first 1062 aa (residues 1-1062) are the same; however, the last 3 exons (1063-1133 aa, 1134-1182 aa, 1183-1212 aa) differ from ADAM-TS 18. These last 3 exons have perfect splicing sites, while the 3' end of ADAM-TS 18 (1063-1181 aa) is from intronic sequence. Because of the presence of the thrombospondin type 1 domain and the reprolysin domain, the novel sequence described here has properties and functions similar to these genes. This novel gene has a role in the regulation of cellular functions, the modulation of which has implications in cancer and inflammatory diseases.
  • the a disintegrin-like and metalloprotease (reprolysin type) with thrombospondin type 1 motif, 12-like gene disclosed in this invention maps to chromosome 16.
  • NOV1 clones were analyzed, and the nucleotide and encoded polypeptide sequences are shown in Table 1A.
  • Z 1 is Y or H; Z 2 is K or E; Z 3 is L or I.
  • NOV1e TINEDTGLGLAFTIAHESGHNFGMIHDGEGNPCRKAEGNIMSPTLTGNNGVFS SSCSRQ
  • NOV1a SEQ ID NO 2
  • NOV1b SEQ ID NO 4
  • NOV1c SEQ ID NO 6
  • NOV1d SEQ ID NO 8
  • NOV1e SEQ ID NO 10
  • NOV1f SEQ ID NO 12
  • NOV1g SEQ ID NO 14
  • NOV1h SEQ ID NO 16
  • NOV1i SEQ ID NO 18
  • PSG a new signal peptide prediction method N-region: length 2; pos.chg 0; neg.chg 1 H-region: length 17; peak value 0.00 PSG score: -4.40
  • GvH von Heijne's method for signal seq. recognition
  • GvH score (threshold: -2.1): 3.61 possible cleavage site: between 16 and 1 >>> Seems to have no N-terminal signal peptide
  • ALOM Klein et al's method for TM region allocation
  • KKXX-like motif in the C-terminus CTRK SKL: peroxisomal targeting signal in the C-terminus: none PTS2: 2nd peroxisomal targeting signal: none VAC: possible vacuolar targeting motif: none RNA-binding motif: none Actinin-type actin-binding motif: type 1 : none type 2: none NMYR: N-myristoylation pattern : none
  • Prenylation motif none memYQRL: transport motif from cell surface to Golgi: none
  • NOV1a protein was found to have homology to the proteins shown in the BLASTP data in Table 1E.
  • NOV2 clone was analyzed, and the nucleotide and encoded polypeptide sequences are shown in Table 2A. Table 2A. NOV2 Sequence Analysis
  • DNA Sequence ORF Start: ATG at 14 ORF Stop: at 944 CACCGGATCCACCATGGCGCTGAGGCG. 3CCATCGCGACTCCGGCTCTGCGCTCGGCTGCCTGACTTCT
  • VKPVTPVCRVPKAVPVGK AT HCQESEGHPRPHYSWYRNDVPLPTDSRANPRFRNSSFHLNSETGTL
  • NOV2b PTDSRANPRFRNSSFHLNS ⁇ TGTLVFTAVHKDDSGQYYCIASNDAGSARCEEQEMEVYDL
  • PSG a new signal peptide prediction method N-region: length 10; pos.chg 4; neg.chg 0 H-region: length 3; peak value -4.60 PSG score: -9.00
  • GvH von Heijne's method for signal seq. recognition
  • GvH score (threshold: -2.1 ): 1.54 possible cleavage site: between 30 and 31
  • NNCN Reinhardt's method for Cytoplasmic/Nuclear discrimination Prediction: cytoplasmic Reliability: 76.7
  • NOV2a protein was found to have homology to the proteins shown in the BLASTP data in Table 2E.
  • RXR- beta Retinoic acid receptor
  • RAR retinoic acid receptors
  • RARA alpha
  • RARB beta
  • RARG gamma
  • RXR-beta forms heterodimers with RAR, preferentially increasing its DNA binding and transcriptional activity on promoters containing retinoic acid, but not thyroid hormone or vitamin D, response elements.
  • RXR-beta also heterodimerizes with thyroid hormone and vitamin D receptors, increasing both DNA binding and transcriptional function on their respective response elements.
  • RXR-alpha also forms heterodimers with these receptors.
  • Retinoid X receptor coregulators selectively target the high affinity binding of retinoic acid, thyroid hormone, and vitamin D receptors to their cognate DNA response elements.
  • Retinoids are involved in controlling the function of the dopaminergic mesolimbic pathway, and defects in retinoic acid signaling contribute to disorders such as Parkinson disease and schizophrenia (Krezel et al. (1998) Impaired locomotion and dopamine signaling in retinoid receptor mutant mice. Science 279: 863-867).
  • RXR heterodimers serve as key regulators in cholesterol homeostasis by governing reverse cholesterol transport from peripheral tissues, bile acid synthesis in liver, and cholesterol absorption in intestine.
  • RXR/LXR heterodimers inhibit cholesterol absorption through upregulation of ABC1 expression in the small intestine.
  • Activation of RXR/FXR heterodimers represses CYP7A1 expression and bile acid production, leading to a failure to solubilize and absorb cholesterol (Lu et al., 2000 Molecular basis for feedback regulation of bile acid synthesis by nuclear receptors. Molec. Cell 6: 507-515).
  • the Retinoic acid receptor RXR-beta-like gene disclosed in this invention maps to chromosome 6.
  • NOV3 clones were analyzed, and the nucleotide and encoded polypeptide sequences are shown in Table 3A.
  • PSG a new signal peptide prediction method N-region: length 6; pos.chg 1; neg.chg 0 H-region: length 6; peak value -7.06 PSG score: -11.46
  • GvH von Heijne's method for signal seq. recognition
  • GvH score (threshold: -2.1): -5.15 possible cleavage site: between 58 and 59
  • NUCDISC discrimination of nuclear localization signals pat4: RRRR (5) at 37 pat4: RRRP (4) at 38 pat7: none bipartite: RKEMHCGVASRWRRRRP at 25 content of basic residues: 11.9%
  • Nuclear hormones receptors DNA-binding region signature (PS00031): found CAICGDRSSGKHYGVYSCEGCKGFFKR at 205
  • NOV3a protein was found to have homology to the proteins shown in the BLASTP data in Table 10E.
  • Example 4 NOV4, CG190229, Dihydrolipoamide branched chain transacylase.
  • Dihydrolipoyl transacylase (acyltransferase, E2) is a component of the branched-chain alpha-keto acid dehydrogenase complex and has dihydrolipoyl dehydrogenase (E3) binding and lipoyl-bearing domains. Mutation in this enzyme causes a subset of maple syrup urine disease in the Ashkenazi Jewish population.
  • the Dihydrolipoamide branched chain transacylase-like gene disclosed in this invention maps to chromosome 1. NOV4 clones were analyzed, and the nucleotide and encoded polypeptide sequences are shown in Table 4A. Table 4A. NOV4 Sequence Analysis NOV4a, CG190229-02 SEQ ID NO: 33 1276 bp DNA Sequence ORF Start: at 2 ORF Stop: at 1268
  • NOV4b (SEQ ID NO: 36)
  • PSG a new signal peptide prediction method N-region: length 9; pos.chg 1; neg.chg 0 H-region: length 2; peak value -10.23 PSG score: -14.63
  • GvH von Heijne's method for signal seq. recognition
  • GvH score (threshold: -2.1): -6.09 possible cleavage site: between 15 and 16
  • NUCDISC discrimination of nuclear localization signals pat4: none pat7: none bipartite: none content of basic residues: 13.0% NLS Score: -0.47
  • NNCN Reinhardt's method for Cytoplasmic/Nuclear discrimination Prediction: cytoplasmic Reliability: 89
  • NOV4a protein was found to have homology to the proteins shown in the BLASTP data in Table 4E.
  • the NOV5 clone was analyzed, and the nucleotide and encoded polypeptide sequences are shown in Table 5A.
  • Xi is C or T. NOV5c, CG194245 SEQ ID NO: 42 629 aa MW at 71268.6kD Protein Sequence
  • Z ⁇ is H or Y.
  • NOV5a (SEQ ID NO: 40) NOV5b (SEQ ID NO: 42) Further analysis of the NOV5a protein yielded the following properties shown in Table 5C.
  • PSG a new signal peptide prediction method N-region: length 0; pos.chg 0; neg.chg 0 H-region: length 28; peak value 10.51 PSG score: 6.11
  • GvH von Heijne's method for signal seq. recognition
  • GvH score (threshold: -2.1): -1.72 possible cleavage site: between 24 and 25
  • Gavel prediction of cleavage sites for mitochondrial preseq R-2 motif at 74 ARQ
  • NUCDISC discrimination of nuclear localization signals pat4: KRRP (4) at 48 pat4: KHRK (3) at 581 pat7: none bipartite: none content of basic residues: 12.6% NLS Score: -0.03
  • NNCN Reinhardt's method for Cytoplasmic/Nuclear discrimination Prediction: cytoplasmic Reliability: 94.1
  • NOV5a protein was found to have homology to the proteins shown in the BLASTP data in Table 5E.
  • Example 6 NOV6, CG196732, NM_21797 like.
  • the NOV6 clone was analyzed, and the nucleotide and encoded polypeptide sequences are shown in Table 6A.
  • CAACAAATX 2 TGGAGGCGCCATGGTCTGGGCCATTGATCTGGATGACTTCACTGGCACTTTCTGCAACC
  • NOV6a (SEQ ID NO 44) NOV6b (SEQ ID NO 46) NOV6c (SEQ ID NO 48) NOV6d (SEQ ID NO 50) NOV6e (SEQ ID NO 52) NOV6f (SEQ ID NO 54)
  • PSG a new signal peptide prediction method N-region: length 8; pos.chg 1 ; neg.chg 1 H-region: length 8; peak value 5.96 PSG score: 1.56
  • GvH von Heijne's method for signal seq. recognition
  • GvH score (threshold: -2.1): -8.56 possible cleavage site: between 38 and 39
  • D/E content 2 S/T content: ⁇
  • Gavel prediction of cleavage sites for mitochondrial preseq R-2 motif at 30 LRQ
  • NUCDISC discrimination of nuclear localization signals pat4: none pat7: none bipartite: none content of basic residues: 6.2% NLS Score: -0.47
  • NNCN Reinhardt's method for Cytoplasmic/Nuclear discrimination Prediction: cytoplasmic Reliability: 89
  • NOV6a protein was found to have homology to the proteins shown in the BLASTP data in Table 6E.
  • Example 7 NOV7, CG53147, CG53147-FLF/SalR.
  • the NOV7 clone was analyzed, and the nucleotide and encoded polypeptide sequences are shown in Table 7A.
  • NOV7e 316900924 SEQ ID NO: 66 527 aa MW about 60000kD Protein Sequence MPGLRRDRLLTLLLLGALLSADLYFHLWPQVQRQLRPR ⁇ RPRGCPCTGRASSLARDSAAAASD
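
The restriction-pattern comparison described in the list above (sample and control DNA digested with a restriction endonuclease, with fragment lengths compared) can be illustrated with a minimal sketch. The toy sequences, the function names `digest` and `patterns_differ`, and the choice of an EcoRI-like site (GAATTC, cut after the first base) are assumptions made for illustration only; none of them comes from the NOVX sequences or from the source text. The sketch treats the DNA as linear and reports fragment lengths only, as a gel would.

```python
def digest(seq, site, cut_offset):
    """Return sorted fragment lengths after cutting a linear sequence.

    A cut is made at every occurrence of `site`; `cut_offset` is the number
    of bases of the site that stay on the upstream fragment.
    """
    seq = seq.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted(end - begin for begin, end in zip(bounds, bounds[1:]))


def patterns_differ(sample, control, site="GAATTC", cut_offset=1):
    """True when sample and control give different fragment-length patterns."""
    return digest(sample, site, cut_offset) != digest(control, site, cut_offset)


# Hypothetical toy sequences: the sample carries one substitution (A -> C)
# that destroys the second recognition site.
control = "AAAAGAATTCTTTTTTGAATTCCC"
sample = "AAAAGAATTCTTTTTTGACTTCCC"

print(digest(control, "GAATTC", 1))
print(digest(sample, "GAATTC", 1))
print(patterns_differ(sample, control))
```

Identical fragment patterns do not rule out a mutation, since only changes that create or destroy a recognition site, or that alter a fragment's length, are visible to this kind of comparison.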

Abstract

Disclosed herein are novel nucleic acid sequences that encode novel polypeptides. Also disclosed are polypeptides encoded by these nucleic acid sequences, and antibodies that immunospecifically bind to the polypeptide, as well as derivatives, variants, mutants, or fragments of the novel polypeptide, polynucleotide, or antibody specific to the polypeptide. Vectors, host cells, antibodies and recombinant methods for producing the polypeptides and polynucleotides, as well as methods for using same, are also included. The invention further discloses therapeutic, diagnostic and research methods for diagnosis, treatment, and prevention of disorders involving any one of these novel human nucleic acids and proteins.

Description

THERAPEUTIC POLYPEPTIDES, NUCLEIC ACIDS ENCODING SAME,
AND METHODS OF USE
RELATED APPLICATIONS
This application is a continuation in part of U.S.S.N. 09/972211, filed October 5, 2001, which claims priority to U.S.S.N. 60/238323, filed October 5, 2000; and claims priority to U.S.S.N. 60/421698, filed October 28, 2002; U.S.S.N. 60/421155, filed October 25, 2002; U.S.S.N. 60/420968, filed October 24, 2002; U.S.S.N. 60/416662, filed October 7, 2002; U.S.S.N. 60/418808, filed October 16, 2002; and U.S.S.N. 60/423795, filed November 5, 2002, each of which is incorporated herein by reference in its entirety.
FIELD OF THE INVENTION
The present invention relates to novel polypeptides, and the novel nucleic acids encoding them, having properties related to stimulation of biochemical or physiological responses in a cell, a tissue, an organ or an organism. More particularly, the novel polypeptides are gene products of novel genes, or are specified biologically active fragments or derivatives thereof. Methods of use encompass diagnostic and prognostic assay procedures as well as methods of treating diverse pathological conditions.
BACKGROUND OF THE INVENTION
Eukaryotic cells are characterized by biochemical and physiological processes that under normal conditions are exquisitely balanced to achieve the preservation and propagation of the cells. When such cells are components of multicellular organisms such as vertebrates, or more particularly organisms such as mammals, the regulation of the biochemical and physiological processes involves intricate signaling pathways. Frequently, such signaling pathways involve extracellular signaling proteins, cellular receptors that bind the signaling proteins, and signal transducing components located within the cells. Signaling proteins may be classified as endocrine effectors, paracrine effectors or autocrine effectors. Endocrine effectors are signaling molecules secreted by a given organ into the circulatory system, which are then transported to a distant target organ or tissue. The target cells include the receptors for the endocrine effector, and when the endocrine effector binds, a signaling cascade is induced. Paracrine effectors involve secreting cells and receptor cells in close proximity to each other, for example two different classes of cells in the same tissue or organ. One class of cells secretes the paracrine effector, which then reaches the second class of cells, for example by diffusion through the extracellular fluid. The second class of cells contains the receptors for the paracrine effector; binding of the effector results in induction of the signaling cascade that elicits the corresponding biochemical or physiological effect. Autocrine effectors are highly analogous to paracrine effectors, except that the same cell type that secretes the autocrine effector also contains the receptor. Thus the autocrine effector binds to receptors on the same cell, or on identical neighboring cells. The binding process then elicits the characteristic biochemical or physiological effect.
Signaling processes may elicit a variety of effects on cells and tissues including, by way of nonlimiting example, induction of cell or tissue proliferation, suppression of growth or proliferation, induction of differentiation or maturation of a cell or tissue, and suppression of differentiation or maturation of a cell or tissue.
Many pathological conditions involve dysregulation of expression of important effector proteins. In certain classes of pathologies the dysregulation is manifested as a diminished or suppressed level of synthesis and secretion of protein effectors. In other classes of pathologies the dysregulation is manifested as an increased or up-regulated level of synthesis and secretion of protein effectors. In a clinical setting a subject may be suspected of suffering from a condition brought on by altered or mis-regulated levels of a protein effector of interest. Therefore there is a need to assay for the level of the protein effector of interest in a biological sample from such a subject, and to compare the level with that characteristic of a nonpathological condition. There also is a need to provide the protein effector as a product of manufacture. Administration of the effector to a subject in need thereof is useful in treatment of the pathological condition. Accordingly, there is a need for a method of treatment of a pathological condition brought on by diminished or suppressed levels of the protein effector of interest. In addition, there is a need for a method of treatment of a pathological condition brought on by increased or up-regulated levels of the protein effector of interest.
Antibodies are multichain proteins that bind specifically to a given antigen, and bind poorly, or not at all, to substances deemed not to be cognate antigens. They are comprised of two short chains termed light chains and two long chains termed heavy chains. These chains are constituted of immunoglobulin domains, of which generally there are two classes: one variable domain per chain, one constant domain in light chains, and three or more constant domains in heavy chains. The antigen-specific portion of the immunoglobulin molecules resides in the variable domains; the variable domains of one light chain and one heavy chain associate with each other to generate the antigen-binding moiety. Antibodies that bind immunospecifically to a cognate or target antigen bind with high affinities, and are thus useful in assaying specifically for the presence of the antigen in a sample. In addition, they have the potential of inactivating the activity of the antigen.
Therefore there is a need to assay for the level of a protein effector of interest in a biological sample from such a subject, and to compare this level with that characteristic of a nonpathological condition. In particular, there is a need for such an assay based on the use of an antibody that binds immunospecifically to the antigen. There further is a need to inhibit the activity of the protein effector in cases where a pathological condition arises from elevated or excessive levels of the effector based on the use of an antibody that binds immunospecifically to the effector. Thus, there is a need for the antibody as a product of manufacture. There further is a need for a method of treatment of a pathological condition brought on by an elevated or excessive level of the protein effector of interest based on administering the antibody to the subject.
SUMMARY OF THE INVENTION
The present invention is based in part upon the discovery of isolated polypeptides including amino acid sequences selected from mature forms of the amino acid sequences selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34. The novel nucleic acids and polypeptides are referred to herein as NOV1a, NOV1b, NOV1c, NOV2a, NOV2b, NOV2c, NOV2d, NOV3a, NOV3b, etc. These nucleic acids and polypeptides, as well as derivatives, homologs, analogs and fragments thereof, will hereinafter be collectively designated as "NOVX" nucleic acid or polypeptide sequences.
The present invention also is based in part upon variants of a mature form of the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, wherein any amino acid in the mature form is changed to a different amino acid, provided that no more than 15% of the amino acid residues in the sequence of the mature form are so changed.
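By way of nonlimiting illustration, the 15% limit on changed residues can be checked computationally. The following is a minimal sketch only, assuming that changes are substitutions between two sequences of equal length; the function names and the 20-residue example sequences are hypothetical and are not NOVX sequences.

```python
def fraction_changed(reference: str, variant: str) -> float:
    """Return the fraction of positions at which variant differs from reference.

    Assumption: only substitutions are considered, so both sequences
    must have the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("substitution-only comparison requires equal lengths")
    if not reference:
        raise ValueError("reference sequence is empty")
    changed = sum(1 for a, b in zip(reference, variant) if a != b)
    return changed / len(reference)


def within_variant_limit(reference: str, variant: str,
                         max_fraction: float = 0.15) -> bool:
    """True if no more than max_fraction of the residues are changed."""
    return fraction_changed(reference, variant) <= max_fraction


# Hypothetical 20-residue mature form and a variant with two substitutions.
mature = "MKTAYIAKQRQISFVKSHFS"
variant = "MKTAYLAKQRQISFVKSHFT"
print(within_variant_limit(mature, variant))
```

The same check could be applied with a limit of 10%, 5%, 2% or 1% by passing a different `max_fraction`.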
In one embodiment, the present invention includes the amino acid sequences selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34. In another embodiment, the invention also comprises variants of the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 wherein any amino acid specified in the chosen sequence is changed to a different amino acid, provided that no more than 15% of the amino acid residues in the sequence are so changed. The invention also involves fragments of any of the mature forms of the amino acid sequences selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, or any other amino acid sequence selected from this group. The invention also comprises fragments from these groups in which up to 15% of the residues are changed.
In another embodiment, the present invention encompasses polypeptides that are naturally occurring allelic variants of the sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34. These allelic variants include amino acid sequences that are the translations of nucleic acid sequences differing by a single nucleotide from nucleic acid sequences selected from the group consisting of SEQ ID NOS: 2n-1, wherein n is an integer between 1 and 34. In the variant polypeptide, any amino acid that is changed in the chosen sequence is changed to provide a conservative substitution.
In another embodiment, the invention comprises a pharmaceutical composition involving a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 and a pharmaceutically acceptable carrier. In another embodiment, the invention involves a kit, including, in one or more containers, this pharmaceutical composition.
In another embodiment, the invention includes the use of a therapeutic in the manufacture of a medicament for treating a syndrome associated with a human disease, the disease being selected from a pathology associated with a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 wherein said therapeutic is the polypeptide selected from this group.
In another embodiment, the invention comprises a method for determining the presence or amount of a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 in a sample, the method involving providing the sample; introducing the sample to an antibody that binds immunospecifically to the polypeptide; and determining the presence or amount of antibody bound to the polypeptide, thereby determining the presence or amount of polypeptide in the sample.
In another embodiment, the invention includes a method for determining the presence of or predisposition to a disease associated with altered levels of a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 in a first mammalian subject, the method involving measuring the level of expression of the polypeptide in a sample from the first mammalian subject; and comparing the amount of the polypeptide in this sample to the amount of the polypeptide present in a control sample from a second mammalian subject known not to have, or not to be predisposed to, the disease, wherein an alteration in the expression level of the polypeptide in the first subject as compared to the control sample indicates the presence of or predisposition to the disease.
In another embodiment, the invention involves a method of identifying an agent that binds to a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, the method including introducing the polypeptide to the agent; and determining whether the agent binds to the polypeptide. The agent could be a cellular receptor or a downstream effector.
In another embodiment, the invention involves a method for identifying a potential therapeutic agent for use in treatment of a pathology, wherein the pathology is related to aberrant expression or aberrant physiological interactions of a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, the method including providing a cell expressing the polypeptide of the invention and having a property or function ascribable to the polypeptide; contacting the cell with a composition comprising a candidate substance; and determining whether the substance alters the property or function ascribable to the polypeptide; whereby, if an alteration observed in the presence of the substance is not observed when the cell is contacted with a composition devoid of the substance, the substance is identified as a potential therapeutic agent.
In another embodiment, the invention involves a method for screening for a modulator of activity or of latency or predisposition to a pathology associated with a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, the method including administering a test compound to a test animal at increased risk for a pathology associated with the polypeptide of the invention, wherein the test animal recombinantly expresses the polypeptide of the invention; measuring the activity of the polypeptide in the test animal after administering the test compound; and comparing the activity of the protein in the test animal with the activity of the polypeptide in a control animal not administered the polypeptide, wherein a change in the activity of the polypeptide in the test animal relative to the control animal indicates the test compound is a modulator of latency of, or predisposition to, a pathology associated with the polypeptide of the invention. The recombinant test animal could express a test protein transgene or express the transgene under the control of a promoter at an increased level relative to a wild-type test animal. The promoter may or may not be the native gene promoter of the transgene.
In another embodiment, the invention involves a method for modulating the activity of a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, the method including introducing a cell sample expressing the polypeptide with a compound that binds to the polypeptide in an amount sufficient to modulate the activity of the polypeptide.
In another embodiment, the invention involves a method of treating or preventing a pathology associated with a polypeptide with an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, the method including administering the polypeptide to a subject in which such treatment or prevention is desired in an amount sufficient to treat or prevent the pathology in the subject. The subject could be human.
In another embodiment, the invention involves a method of treating a pathological state in a mammal, the method including administering to the mammal a polypeptide in an amount that is sufficient to alleviate the pathological state, wherein the polypeptide is a polypeptide having an amino acid sequence at least 95% identical to a polypeptide having the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 or a biologically active fragment thereof.
In another embodiment, the invention involves an isolated nucleic acid molecule comprising a nucleic acid sequence encoding a polypeptide having an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34; a variant of a mature form of the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 wherein any amino acid in the mature form of the chosen sequence is changed to a different amino acid, provided that no more than 15% of the amino acid residues in the sequence of the mature form are so changed; the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34; a variant of the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, in which any amino acid specified in the chosen sequence is changed to a different amino acid, provided that no more than 15% of the amino acid residues in the sequence are so changed; a nucleic acid fragment encoding at least a portion of a polypeptide comprising the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 or any variant of the polypeptide wherein any amino acid of the chosen sequence is changed to a different amino acid, provided that no more than 10% of the amino acid residues in the sequence are so changed; and the complement of any of the nucleic acid molecules. In another embodiment, the invention comprises an isolated nucleic acid molecule having a nucleic acid sequence encoding a polypeptide comprising an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34, wherein the nucleic acid molecule comprises the nucleotide sequence of a naturally occurring allelic nucleic acid variant.
In another embodiment, the invention involves an isolated nucleic acid molecule including a nucleic acid sequence encoding a polypeptide having an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34 that encodes a variant polypeptide, wherein the variant polypeptide has the polypeptide sequence of a naturally occurring polypeptide variant.
In another embodiment, the invention comprises an isolated nucleic acid molecule having a nucleic acid sequence encoding a polypeptide comprising an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34, wherein the nucleic acid molecule differs by a single nucleotide from a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 2n-1, wherein n is an integer between 1 and 34.
In another embodiment, the invention includes an isolated nucleic acid molecule having a nucleic acid sequence encoding a polypeptide including an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34, wherein the nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of the nucleotide sequence selected from the group consisting of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34; a nucleotide sequence wherein one or more nucleotides in the nucleotide sequence selected from the group consisting of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34 is changed from that selected from the group consisting of the chosen sequence to a different nucleotide provided that no more than 15% of the nucleotides are so changed; a nucleic acid fragment of the sequence selected from the group consisting of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34; and a nucleic acid fragment wherein one or more nucleotides in the nucleotide sequence selected from the group consisting of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34 is changed from that selected from the group consisting of the chosen sequence to a different nucleotide provided that no more than 15% of the nucleotides are so changed.
In another embodiment, the invention includes an isolated nucleic acid molecule having a nucleic acid sequence encoding a polypeptide including an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34, wherein the nucleic acid molecule hybridizes under stringent conditions to the nucleotide sequence selected from the group consisting of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, or a complement of the nucleotide sequence.
In another embodiment, the invention includes an isolated nucleic acid molecule having a nucleic acid sequence encoding a polypeptide including an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34, wherein the nucleic acid molecule has a nucleotide sequence in which any nucleotide specified in the coding sequence of the chosen nucleotide sequence is changed from that selected from the group consisting of the chosen sequence to a different nucleotide provided that no more than 15% of the nucleotides in the chosen coding sequence are so changed, an isolated second polynucleotide that is a complement of the first polynucleotide, or a fragment of any of them.
In another embodiment, the invention includes a vector involving the nucleic acid molecule having a nucleic acid sequence encoding a polypeptide including an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34. This vector can have a promoter operably linked to the nucleic acid molecule. This vector can be located within a cell.
In another embodiment, the invention involves a method for determining the presence or amount of a nucleic acid molecule having a nucleic acid sequence encoding a polypeptide including an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34 in a sample, the method including providing the sample; introducing the sample to a probe that binds to the nucleic acid molecule; and determining the presence or amount of the probe bound to the nucleic acid molecule, thereby determining the presence or amount of the nucleic acid molecule in the sample. The presence or amount of the nucleic acid molecule is used as a marker for cell or tissue type. The cell type can be cancerous.
In another embodiment, the invention involves a method for determining the presence of or predisposition for a disease associated with altered levels of a nucleic acid molecule having a nucleic acid sequence encoding a polypeptide including an amino acid sequence selected from the group consisting of a mature form of the amino acid sequence given SEQ ID NO:2n, wherein n is an integer between 1 and 34 in a first mammalian subject, the method including measuring the amount of the nucleic acid in a sample from the first mammalian subject; and comparing the amount of the nucleic acid in the sample of step (a) to the amount of the nucleic acid present in a control sample from a second mammalian subject known not to have or not be predisposed to, the disease; wherein an alteration in the level of the nucleic acid in the first subject as compared to the control sample indicates the presence of or predisposition to the disease.
The invention further provides an antibody that binds immunospecifically to a NOVX polypeptide. The NOVX antibody may be monoclonal, humanized, or a fully human antibody. Preferably, the antibody has a dissociation constant for the binding of the NOVX polypeptide to the antibody less than 1 × 10⁻⁹ M. More preferably, the NOVX antibody neutralizes the activity of the NOVX polypeptide.
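By way of nonlimiting illustration, the preferred affinity threshold can be expressed numerically. The sketch below assumes the standard relationship in which the equilibrium dissociation constant equals the dissociation rate constant divided by the association rate constant; the function names and the example rate constants are hypothetical and are not measured values for any NOVX antibody.

```python
def dissociation_constant(k_on: float, k_off: float) -> float:
    """Equilibrium dissociation constant K_d = k_off / k_on, in molar units.

    k_on is the association rate constant (per molar per second) and
    k_off is the dissociation rate constant (per second).
    """
    if k_on <= 0 or k_off < 0:
        raise ValueError("k_on must be positive and k_off non-negative")
    return k_off / k_on


def meets_preferred_affinity(k_d: float, threshold: float = 1e-9) -> bool:
    """True if K_d is below the threshold (default 1 x 10^-9 M)."""
    return k_d < threshold


# Hypothetical rate constants chosen only to illustrate the arithmetic.
k_d = dissociation_constant(k_on=1e5, k_off=1e-5)
print(k_d, meets_preferred_affinity(k_d))
```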
In a further aspect, the invention provides for the use of a therapeutic in the manufacture of a medicament for treating a syndrome associated with a human disease, associated with a NOVX polypeptide. Preferably the therapeutic is a NOVX antibody.
In yet a further aspect, the invention provides a method of treating or preventing a NOVX-associated disorder, a method of treating a pathological state in a mammal, and a method of treating or preventing a pathology associated with a polypeptide by administering a NOVX antibody to a subject in an amount sufficient to treat or prevent the disorder.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. In the case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and are not intended to be limiting.
Other features and advantages of the invention will be apparent from the following detailed description and claims.
DETAILED DESCRIPTION OF THE INVENTION
The present invention provides novel nucleotides and polypeptides encoded thereby. Included in the invention are the novel nucleic acid sequences, their encoded polypeptides, antibodies, and other related compounds. The sequences are collectively referred to herein as "NOVX nucleic acids" or "NOVX polynucleotides" and the corresponding encoded polypeptides are referred to as "NOVX polypeptides" or "NOVX proteins." Unless indicated otherwise, "NOVX" is meant to refer to any of the novel sequences disclosed herein. Table A provides a summary of the NOVX nucleic acids and their encoded polypeptides.
TABLE A. Sequences and Corresponding SEQ ID Numbers
[Figure imgf000009_0001: first portion of Table A, provided as an image in the original]
NOV2b CG136984-01 21 22 Junction adhesion molecule 3 - Homo sapiens
NOV2c CG136984-03 23 24 Junction adhesion molecule 3 - Homo sapiens
NOV2d 312713075 25 26 Junction adhesion molecule 3 - Homo sapiens
NOV2e SNP 13382593 27 28 Junction adhesion molecule 3 - Homo sapiens
NOV2f CG136984 29 30 Junction adhesion molecule 3 - Homo sapiens
NOV3a CG189936-02 31 32 Retinoic acid receptor RXR-beta - Homo sapiens
NOV4a CG190229-02 33 34 Lipoamide acyltransferase component of branched-chain alpha-keto acid dehydrogenase complex - Homo sapiens
NOV4b CG190229-04 35 36 Lipoamide acyltransferase component of branched-chain alpha-keto acid dehydrogenase complex - Homo sapiens
NOV5a CG194245-03 37 38 Very-long-chain acyl-CoA synthetase - Homo sapiens
NOV5b SNP C99.877 of CG194245-03 39 40 Very-long-chain acyl-CoA synthetase - Homo sapiens
NOV5c CG194245 41 42 Very-long-chain acyl-CoA synthetase - Homo sapiens
NOV6a CG196732-01 43 44 Chitinase family - Homo sapiens
NOV6b CG196732-02 45 46 Chitinase family - Homo sapiens
NOV6c CG196732-03 47 48 Chitinase family - Homo sapiens
NOV6d SNP 13382594 49 50 Chitinase family - Homo sapiens
NOV6e SNP 13382595 51 52 Chitinase family - Homo sapiens
NOV6f SNP 13382596 53 54 Chitinase family - Homo sapiens
NOV6g CG196732 55 56 Chitinase family - Homo sapiens
NOV7a CG53147-02 57 58 Chitinase family - Homo sapiens
NOV7b CG53147-01 59 60 Chitinase family - Homo sapiens
NOV7c CG53147-03 61 62 Chitinase family - Homo sapiens
NOV7d 316900904 63 64 Chitinase family - Homo sapiens
NOV7e 316900924 65 66 Chitinase family - Homo sapiens
Table A indicates the homology of NOVX polypeptides to known protein families. Thus, the nucleic acids and polypeptides, antibodies and related compounds according to the invention corresponding to a NOVX as identified in column 1 of Table A are useful in therapeutic and diagnostic applications implicated in, for example, pathologies and disorders associated with the known protein families identified in column 5 of Table A.
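By way of nonlimiting illustration, the SEQ ID numbering convention used throughout this specification, in which a nucleic acid sequence is SEQ ID NO:2n-1 and its encoded polypeptide is SEQ ID NO:2n, can be sketched as follows. The function names are hypothetical, and the range of n is assumed here to include both 1 and 34.

```python
from typing import Tuple


def seq_id_pair(n: int) -> Tuple[int, int]:
    """Return (nucleic acid SEQ ID NO, polypeptide SEQ ID NO) for index n.

    Assumption: "between 1 and 34" is read as inclusive of both endpoints.
    """
    if not 1 <= n <= 34:
        raise ValueError("n must be an integer between 1 and 34")
    return 2 * n - 1, 2 * n


def index_from_seq_id(seq_id: int) -> int:
    """Return the index n for either member of a SEQ ID pair."""
    n = (seq_id + 1) // 2
    if not 1 <= n <= 34:
        raise ValueError("SEQ ID NO is outside the range 1 to 68")
    return n


# NOV2b is listed in Table A with SEQ ID NOs 21 and 22, i.e. n = 11.
print(seq_id_pair(11))
```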
Pathologies, diseases, disorders, conditions and the like that are associated with NOVX sequences include, but are not limited to, cardiomyopathy, atherosclerosis, hypertension, congenital heart defects, aortic stenosis, atrial septal defect (ASD), vascular calcification, fibrosis, atrioventricular (A-V) canal defect, ductus arteriosus, pulmonary stenosis, subaortic stenosis, ventricular septal defect (VSD), valve diseases, tuberous sclerosis, scleroderma, obesity, metabolic disturbances associated with obesity, transplantation, osteoarthritis, rheumatoid arthritis, osteochondrodysplasia, adrenoleukodystrophy, congenital adrenal hyperplasia, prostate cancer, diabetes, metabolic disorders, neoplasm, adenocarcinoma, lymphoma, uterus cancer, fertility, glomerulonephritis, hemophilia, hypercoagulation, idiopathic thrombocytopenic purpura, immunodeficiencies, psoriasis, skin disorders, graft versus host disease, AIDS, bronchial asthma, lupus, Crohn's disease, inflammatory bowel disease, ulcerative colitis, multiple sclerosis, treatment of Albright Hereditary Osteodystrophy, infectious disease, anorexia, cancer-associated cachexia, cancer, neurodegenerative disorders, Alzheimer's Disease, Parkinson's Disorder, immune disorders, hematopoietic disorders, and the various dyslipidemias, schizophrenia, depression, asthma, emphysema, allergies, the metabolic syndrome X and wasting disorders associated with chronic diseases and various cancers, as well as conditions such as transplantation, neuroprotection, fertility, or regeneration (in vitro and in vivo).
NOVX polypeptides of the present invention show homology to, and contain domains that are characteristic of members of such protein families. Details of the sequence relatedness and domain analysis for each NOVX are presented in Example A.
The NOVX nucleic acids and polypeptides are used to screen for molecules that inhibit or enhance NOVX activity or function. Specifically, the nucleic acids and polypeptides according to the invention are used as targets for the identification of small molecules that modulate or inhibit associated diseases.
The NOVX nucleic acids and polypeptides are also useful for detecting and differentiating specific cell types, tissues, pathological tissues, cell activation states and the like. Details of expression analysis for each NOVX are presented in Example C. Accordingly, the NOVX nucleic acids, polypeptides, antibodies and related compounds according to the invention have diagnostic and therapeutic applications in the detection of a variety of diseases with differential expression in normal vs. diseased tissues, e.g. detection of cancer.
Additional utilities for NOVX nucleic acids and polypeptides according to the invention are disclosed herein.
NOVX clones
The NOVX nucleic acids and proteins of the invention are useful in diagnostic and therapeutic applications and as a research tool. These include serving as a specific or selective nucleic acid or protein diagnostic and/or prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to be assessed, as well as potential therapeutic applications such as the following: (i) a protein therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene ablation), (v) a composition promoting tissue regeneration in vitro and in vivo, and (vi) a biological defense weapon.
In one specific embodiment, the invention includes an isolated polypeptide comprising an amino acid sequence selected from the group consisting of: (a) a mature form of the amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 34; (b) a variant of a mature form of the amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 34, wherein any amino acid in the mature form is changed to a different amino acid, provided that no more than 15%, no more than 10%, no more than 5%, no more than 2%, or no more than 1% of the amino acid residues in the sequence of the mature form are so changed; (c) an amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 34; (d) a variant of the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34 wherein any amino acid specified in the chosen sequence is changed to a different amino acid, provided that no more than 15%, no more than 10%, no more than 5%, no more than 2%, or no more than 1% of the amino acid residues in the sequence are so changed; and (e) a fragment of any of (a) through (d).
In another specific embodiment, the invention includes an isolated nucleic acid molecule comprising a nucleic acid sequence encoding a NOVX polypeptide comprising an amino acid sequence selected from the group consisting of: (a) a mature form of the amino acid sequence given SEQ ID NO: 2n, wherein n is an integer between 1 and 34; (b) a variant of a mature form of the amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 34 wherein any amino acid in the mature form of the chosen sequence is changed to a different amino acid, provided that no more than 15%, no more than 10%, no more than 5%, no more than 2%, or no more than 1% of the amino acid residues in the sequence of the mature form are so changed; (c) the amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 34; (d) a variant of the amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 34, in which any amino acid specified in the chosen sequence is changed to a different amino acid, provided that no more than 15%, no more than 10%, no more than 5%, no more than 2%, or no more than 1% of the amino acid residues in the sequence are so changed; (e) a nucleic acid fragment encoding at least a portion of a polypeptide comprising the amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 34 or any variant of said polypeptide wherein any amino acid of the chosen sequence is changed to a different amino acid, provided that no more than 15%, no more than 10%, no more than 5%, no more than 2%, or no more than 1% of the amino acid residues in the sequence are so changed; and (f) the complement of any of said nucleic acid molecules.
In yet another specific embodiment, the invention includes an isolated nucleic acid molecule, wherein said nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of: (a) the nucleotide sequence selected from the group consisting of SEQ ID NO: 2n-1, wherein n is an integer between 1 and 34; (b) a nucleotide sequence wherein one or more nucleotides in the nucleotide sequence selected from the group consisting of SEQ ID NO: 2n-1, wherein n is an integer between 1 and 34 is changed from that selected from the group consisting of the chosen sequence to a different nucleotide provided that no more than 15%, no more than 10%, no more than 5%, no more than 2%, or no more than 1% of the nucleotides are so changed; (c) a nucleic acid fragment of the sequence selected from the group consisting of SEQ ID NO: 2n-1, wherein n is an integer between 1 and 34; and (d) a nucleic acid fragment wherein one or more nucleotides in the nucleotide sequence selected from the group consisting of SEQ ID NO: 2n-1, wherein n is an integer between 1 and 34 is changed from that selected from the group consisting of the chosen sequence to a different nucleotide provided that no more than 15%, no more than 10%, no more than 5%, no more than 2%, or no more than 1% of the nucleotides are so changed.
NOVX Nucleic Acids and Polypeptides
One aspect of the invention pertains to isolated nucleic acid molecules that encode NOVX polypeptides or biologically active portions thereof. Also included in the invention are nucleic acid fragments sufficient for use as hybridization probes to identify NOVX-encoding nucleic acids (e.g., NOVX mRNAs) and fragments for use as PCR primers for the amplification and/or mutation of NOVX nucleic acid molecules. As used herein, the term "nucleic acid molecule" is intended to include DNA molecules (e.g., cDNA or genomic DNA), RNA molecules (e.g., mRNA), analogs of the DNA or RNA generated using nucleotide analogs, and derivatives, fragments and homologs thereof. The nucleic acid molecule may be single-stranded or double-stranded, but preferably is comprised of double-stranded DNA.
A NOVX nucleic acid can encode a mature NOVX polypeptide. As used herein, a "mature" form of a polypeptide or protein disclosed in the present invention is the product of a naturally occurring polypeptide or precursor form or proprotein. The naturally occurring polypeptide, precursor or proprotein includes, by way of nonlimiting example, the full-length gene product encoded by the corresponding gene. Alternatively, it may be defined as the polypeptide, precursor or proprotein encoded by an ORF described herein. The product "mature" form arises, by way of nonlimiting example, as a result of one or more naturally occurring processing steps that may take place within the cell (e.g., host cell) in which the gene product arises. Examples of such processing steps leading to a "mature" form of a polypeptide or protein include the cleavage of the N-terminal methionine residue encoded by the initiation codon of an ORF, or the proteolytic cleavage of a signal peptide or leader sequence. Thus a mature form arising from a precursor polypeptide or protein that has residues 1 to N, where residue 1 is the N-terminal methionine, would have residues 2 through N remaining after removal of the N-terminal methionine.
Alternatively, a mature form arising from a precursor polypeptide or protein having residues 1 to N, in which an N-terminal signal sequence from residue 1 to residue M is cleaved, would have the residues from residue M+1 to residue N remaining. Further as used herein, a "mature" form of a polypeptide or protein may arise from a step of post-translational modification other than a proteolytic cleavage event. Such additional processes include, by way of non-limiting example, glycosylation, myristylation or phosphorylation. In general, a mature polypeptide or protein may result from the operation of only one of these processes, or a combination of any of them.
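By way of nonlimiting illustration, the two cleavage events described above can be sketched with simple string operations on a precursor sequence written from residue 1 to residue N. The function names and the 11-residue example precursor are hypothetical; the position M of the signal-peptide cleavage site is taken as a given input and is not predicted by this sketch.

```python
def remove_initiator_methionine(precursor: str) -> str:
    """Return residues 2 through N of a precursor whose residue 1 is Met."""
    if not precursor.startswith("M"):
        raise ValueError("precursor does not begin with methionine")
    return precursor[1:]


def cleave_signal_peptide(precursor: str, m: int) -> str:
    """Return residues M+1 through N after removal of residues 1 through M."""
    if not 1 <= m < len(precursor):
        raise ValueError("M must satisfy 1 <= M < N")
    return precursor[m:]


# Hypothetical 11-residue precursor, not a NOVX sequence.
precursor = "MKLLVAAGSTQ"
print(remove_initiator_methionine(precursor))  # residues 2..11
print(cleave_signal_peptide(precursor, 5))     # residues 6..11
```

Post-translational modifications such as glycosylation, myristylation or phosphorylation do not change the residue string and are therefore not represented in this sketch.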
The term "probe", as utilized herein, refers to nucleic acid sequences of variable length, preferably between at least about 10 nucleotides (nt), about 100 nt, or as many as approximately 6,000 nt, depending upon the specific use. Probes are used in the detection of identical, similar, or complementary nucleic acid sequences. Longer length probes are generally obtained from a natural or recombinant source, are highly specific, and are much slower to hybridize than shorter-length oligomer probes. Probes may be single-stranded or double-stranded and designed to have specificity in PCR, membrane-based hybridization technologies, or ELISA-like technologies.
The term "isolated" nucleic acid molecule, as used herein, is as defined in United States Patent 6,600,019, column 69, lines 8 to 30, the disclosure of which is hereby incorporated in toto by reference. A nucleic acid molecule of the invention, e.g., a nucleic acid molecule having the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, or a complement of this nucleotide sequence, can be isolated using standard molecular biology techniques and the sequence information provided herein. Using all or a portion of the nucleic acid sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, as a hybridization probe, NOVX molecules can be isolated using standard hybridization and cloning techniques (e.g., as described in Sambrook, et al., (eds.), MOLECULAR CLONING: A LABORATORY MANUAL 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989; and Ausubel, et al., (eds.), CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, NY, 1993).
A nucleic acid of the present invention can also be amplified as described in United States Patent 6,600,019, column 69, lines 45 to 54, the disclosure of which is hereby incorporated in toto by reference.
As used herein, the term "oligonucleotide" refers to a series of linked nucleotide residues as defined in United States Patent 6,600,019, columns 69 and 70, the disclosure of which is hereby incorporated in toto herein by reference. In one embodiment of the present invention, an oligonucleotide comprising a nucleic acid molecule less than 100 nt in length would further comprise at least 6 contiguous nucleotides of SEQ ID NO:2n-1 , wherein n is an integer between 1 and 34, or a complement thereof.
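As an illustration only, the requirement that an oligonucleotide comprise at least 6 contiguous nucleotides of a reference sequence or its complement can be checked with a short sketch. The function names and the example sequences are hypothetical; the complement is read here as the reverse complement written 5' to 3', which is an assumption of this sketch.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def shares_contiguous_run(oligo: str, reference: str, k: int = 6) -> bool:
    """True if the oligo contains at least k contiguous nucleotides that
    occur in the reference sequence or in its reverse complement."""
    oligo, reference = oligo.upper(), reference.upper()
    complement = reverse_complement(reference)
    kmers = {oligo[i:i + k] for i in range(len(oligo) - k + 1)}
    return any(kmer in reference or kmer in complement for kmer in kmers)


reference = "ATGGCTAAGGCTTCAGGATCC"   # hypothetical stand-in for a SEQ ID NO:2n-1 sequence
print(shares_contiguous_run("TTTAAGGCTTTT", reference))
print(shares_contiguous_run("TTTTTTTTTTTT", reference))
```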
In another embodiment, an isolated nucleic acid molecule of the invention comprises a nucleic acid molecule that is a complement of the nucleotide sequence shown in SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, or a portion of this nucleotide sequence (e.g., a fragment that can be used as a probe or primer or a fragment encoding a biologically-active portion of a NOVX polypeptide). A nucleic acid molecule that is complementary to the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, is one that is sufficiently complementary to the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, that it can hydrogen bond with few or no mismatches to the nucleotide sequence shown in SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, thereby forming a stable duplex.
As used herein, the term "complementary", and the term "binding" are as defined in United States Patent 6,600,019, column 70, lines 15 to 27, the disclosure of which is hereby incorporated in toto herein by reference. A "fragment" provided herein is defined as a sequence of at least 6 (contiguous) nucleic acids or at least 4 (contiguous) amino acids, a length sufficient to allow for specific hybridization in the case of nucleic acids or for specific recognition of an epitope in the case of amino acids, and is at most some portion less than a full length sequence. Fragments may be derived from any contiguous portion of a nucleic acid or amino acid sequence of choice.
A full-length NOVX clone is identified as containing an ATG translation start codon and an in-frame stop codon. Any disclosed NOVX nucleotide sequence lacking an ATG start codon therefore encodes a truncated C-terminal fragment of the respective NOVX polypeptide, and requires that the corresponding full-length cDNA extend in the 5' direction of the disclosed sequence. Any disclosed NOVX nucleotide sequence lacking an in-frame stop codon similarly encodes a truncated N-terminal fragment of the respective NOVX polypeptide, and requires that the corresponding full-length cDNA extend in the 3' direction of the disclosed sequence.
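The classification described above can be sketched as follows. This is a simplified illustration, assuming the disclosed sequence is already in the correct reading frame starting at its first nucleotide; the function name and the returned labels are hypothetical, and the case in which both codons are missing is an extrapolation not stated in the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_clone(sequence: str) -> str:
    """Classify a coding sequence by the presence of an ATG start codon and
    an in-frame stop codon (frame assumed to begin at position 0)."""
    seq = sequence.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    has_start = bool(codons) and codons[0] == "ATG"
    has_stop = any(codon in STOP_CODONS for codon in codons)
    if has_start and has_stop:
        return "full-length"
    if has_stop:
        # No ATG: C-terminal fragment; full-length cDNA extends 5'.
        return "C-terminal fragment (extend 5')"
    if has_start:
        # No in-frame stop: N-terminal fragment; full-length cDNA extends 3'.
        return "N-terminal fragment (extend 3')"
    return "internal fragment (extend 5' and 3')"  # extrapolated case


for clone in ("ATGGCTTAA", "GCTGCTTAA", "ATGGCTGCT"):
    print(clone, "->", classify_clone(clone))
```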
A "derivative" is a nucleic acid sequence or amino acid sequence formed from the native compounds either directly, by modification or partial substitution. An "analog" is a nucleic acid sequence or amino acid sequence that has a structure similar to, but not identical to, the native compound, e.g., it differs from the native compound with respect to certain components or side chains. Analogs may be synthetic or derived from a different evolutionary origin and may have a similar or opposite metabolic activity compared to wild type. A "homolog" is a nucleic acid sequence or amino acid sequence of a particular gene that is derived from different species. Derivatives and analogs may be full length or other than full length. Derivatives or analogs of the nucleic acids or proteins of the invention include, but are not limited to, molecules comprising regions that are substantially homologous to the nucleic acids or proteins of the invention, in various embodiments, by at least about 70%, 80%, or 95%, or more identity, with a preferred identity of 80-95%, and most preferred identity of 98-99% or more, over a nucleic acid or amino acid sequence of identical size or when compared to an aligned sequence in which the alignment is done by a computer homology program known in the art, or whose encoding nucleic acid is capable of hybridizing to the complement of a sequence encoding the proteins under stringent, moderately stringent, or low stringent conditions. See Ausubel, et al., supra, and below. A "homologous nucleic acid sequence" or "homologous amino acid sequence," or variations thereof, are as defined in United States Patent 6,600,019, column 71, the disclosure of which is hereby incorporated in toto by reference.
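For sequences of identical size, the percent identity thresholds mentioned above can be computed position by position. The following is a minimal sketch only; it assumes the two sequences are already aligned without gaps, and it does not reproduce the computer homology programs referred to in the text. The function name and example sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions with identical residues in two equal-length,
    pre-aligned (ungapped) sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of identical size")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


native = "MKTAYIAKQR"    # hypothetical native sequence
variant = "MKTAYIAKQK"   # hypothetical derivative, one substitution
identity = percent_identity(native, variant)
print(f"{identity:.1f}% identity; meets 70% threshold: {identity >= 70}")
```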
Homologous nucleic acid sequences according to the present invention include those nucleic acid sequences that encode conservative amino acid substitutions (see below) in SEQ ID NO:2n-1 , wherein n is an integer between 1 and 34, as well as a polypeptide possessing NOVX biological activity. Various biological activities of the NOVX proteins are described below.
A NOVX polypeptide is encoded by the open reading frame ("ORF") of a NOVX nucleic acid. An ORF corresponds to a nucleotide sequence that could potentially be translated into a polypeptide. A stretch of nucleic acids comprising an ORF is uninterrupted by a stop codon. An ORF that represents the coding sequence for a full protein begins with an ATG "start" codon and terminates with one of the three "stop" codons, namely, TAA, TAG, or TGA. For the purposes of this invention, an ORF may be any part of a coding sequence, with or without a start codon, a stop codon, or both. For an ORF to be considered as a good candidate for coding for a bona fide cellular protein, a minimum size requirement is often set, e.g., a stretch of DNA that would encode a protein of 50 amino acids or more. The nucleotide sequences determined from the cloning of the human NOVX genes allow for the generation of probes and primers designed for use in identifying and/or cloning NOVX homologues in other cell types, e.g. from other tissues, as well as NOVX homologues from other vertebrates. The probe/primer typically comprises a substantially purified oligonucleotide. The oligonucleotide typically comprises a region of nucleotide sequence that hybridizes under stringent conditions to at least about 12, 25, 50, 100, 150, 200, 250, 300, 350 or 400 consecutive nucleotides of a sense strand nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34; or an anti-sense strand nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34; or of a naturally occurring mutant of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34. Probes based on the human NOVX nucleotide sequences can be used to detect transcripts or genomic sequences encoding the same or homologous proteins. In various embodiments, the probe has a detectable label attached, e.g. the label can be a radioisotope, a fluorescent compound, an enzyme, or an enzyme co-factor.
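The ORF definition given above (an ATG start codon, an uninterrupted run of codons, one of the stop codons TAA, TAG or TGA, and a minimum size such as 50 amino acids) can be illustrated with a simple scan. This sketch is an assumption-laden simplification: it searches only the three forward reading frames of the supplied strand and reports only ORFs that have both a start and a stop codon. The function name and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_aa: int = 50) -> list[tuple[int, int]]:
    """Return (start, end) coordinates (0-based, end exclusive, stop codon
    included) of ATG...stop ORFs encoding at least min_aa amino acids."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_aa:  # codons before the stop
                    orfs.append((start, i + 3))
                start = None                   # look for the next ORF
    return orfs


toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"   # hypothetical 6-residue ORF
print(find_orfs(toy, min_aa=3))   # short threshold so the toy ORF is reported
print(find_orfs(toy))             # default 50-residue threshold
```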
Such probes can be used as a part of a diagnostic test kit for identifying cells or tissues which mis-express a NOVX protein, such as by measuring a level of a NOVX-encoding nucleic acid in a sample of cells from a subject e.g., detecting NOVX mRNA levels or determining whether a genomic NOVX gene is up or down regulated or has been mutated or deleted.
"A polypeptide having a biologically-active portion of a NOVX polypeptide", according to the present invention, refers to polypeptides as defined in United States Patent 6,600,019, columns 71 , lines 59 to 64, the disclosure of which is hereby incorporated in toto by reference. A nucleic acid fragment encoding a "biologically-active portion of NOVX" can be prepared by isolating a portion of SEQ ID NO:2n-1 , wherein n is an integer between 1 and 34, that encodes a polypeptide having a NOVX biological activity (the biological activities of the NOVX proteins are described below), expressing the encoded portion of NOVX protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of NOVX.
NOVX Single Nucleotide Polymorphisms
Variant sequences are also included in the present invention. A variant sequence can include a single nucleotide polymorphism (SNP). A SNP can, in some instances, be referred to as a "cSNP" to denote that the nucleotide sequence containing the SNP originates as a cDNA. SNPs occurring within genes may result in an alteration of the amino acid encoded by the gene at the position of the SNP. Preferred embodiments include NOV1f, NOV1g, NOV1h, NOV1i, NOV2e, NOV2f, NOV5b, NOV5c, NOV6d, NOV6e, NOV6f, and NOV6g.
NOVX Nucleic Acid and Polypeptide Variants
The invention further encompasses nucleic acid molecules that differ from the nucleotide sequences of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, due to degeneracy of the genetic code and thus encode the same NOVX proteins as that encoded by the nucleotide sequences of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34. In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence of SEQ ID NO:2n, wherein n is an integer between 1 and 34.
In addition to the human NOVX nucleotide sequences of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, it will be appreciated by those skilled in the art that DNA sequence polymorphisms that lead to changes in the amino acid sequences of the NOVX polypeptides may exist within a population (e.g., the human population). Such genetic polymorphism in the NOVX genes may exist among individuals within a population due to natural allelic variation. As used herein, the terms "gene" and "recombinant gene" refer to nucleic acid molecules comprising an open reading frame (ORF) encoding a NOVX protein, preferably a vertebrate NOVX protein. Such natural allelic variations can typically result in 1-5% variance in the nucleotide sequence of the NOVX genes. Any and all such nucleotide variations and resulting amino acid polymorphisms in the NOVX polypeptides, which are the result of natural allelic variation and that do not alter the functional activity of the NOVX polypeptides, are intended to be within the scope of the invention. Moreover, nucleic acid molecules encoding NOVX proteins from other species, and thus that have a nucleotide sequence that differs from a human SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, are intended to be within the scope of the invention. Nucleic acid molecules corresponding to natural allelic variants and homologues of the NOVX cDNAs of the invention can be isolated based on their homology to the human NOVX nucleic acids disclosed herein using the human cDNAs, or a portion thereof, as a hybridization probe according to standard hybridization techniques under stringent hybridization conditions.
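Degeneracy of the genetic code, as referred to above, means that two different nucleotide sequences can encode the same protein. The sketch below illustrates this with the standard genetic code; the two example coding sequences are hypothetical and are not taken from any SEQ ID NO of the disclosure.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence (DNA alphabet, frame 0) up to the first
    stop codon. Raises KeyError on characters other than A, C, G, T."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


variant_a = "ATGGCTAAATAA"   # hypothetical coding sequence
variant_b = "ATGGCCAAGTGA"   # differs at three positions, synonymous codons
print(translate(variant_a), translate(variant_b))
```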
Accordingly, in another embodiment, an isolated nucleic acid molecule of the invention is at least 6 nucleotides in length and hybridizes under stringent conditions to the nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34. In another embodiment, the nucleic acid is at least 10, 25, 50, 100, 250, 500, 750, 1000, 1500, or 2000 or more nucleotides in length. In yet another embodiment, an isolated nucleic acid molecule of the invention hybridizes to the coding region. Homologs (i.e., nucleic acids encoding NOVX proteins derived from species other than human) or other related sequences (e.g., paralogs) can be obtained by low, moderate or high stringency hybridization with all or a portion of the particular human sequence as a probe using methods well known in the art for nucleic acid hybridization and cloning.
As used herein, the phrase "stringent hybridization conditions" is as defined in United States Patent 6,600,019, column 73, lines 7 to 42, the disclosure of which is hereby incorporated in toto by reference. An isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to a sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, corresponds to a naturally-occurring nucleic acid molecule. As used herein, a "naturally-occurring" nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).
In a second embodiment, a nucleic acid sequence that is hybridizable to the nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, or fragments, analogs or derivatives thereof, under conditions of moderate stringency is provided. A non-limiting example of moderate stringency hybridization conditions is hybridization in 6X SSC, 5X Denhardt's solution, 0.5% SDS and 100 mg/ml denatured salmon sperm DNA at 55 °C, followed by one or more washes in 1X SSC, 0.1% SDS at 37 °C. Other conditions of moderate stringency that may be used are well-known within the art. See, Ausubel, et al. (eds.), 1993, CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, NY, and Krieger, 1990; GENE TRANSFER AND EXPRESSION, A LABORATORY MANUAL, Stockton Press, NY. In a third embodiment, a nucleic acid that is hybridizable to the nucleic acid molecule comprising the nucleotide sequences of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, or fragments, analogs or derivatives thereof, under conditions of low stringency, is provided. A non-limiting example of low stringency hybridization conditions is hybridization in 35% formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 mg/ml denatured salmon sperm DNA, 10% (wt/vol) dextran sulfate at 40 °C, followed by one or more washes in 2X SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS at 50 °C. Other conditions of low stringency that may be used are well known in the art (e.g., as employed for cross-species hybridizations). See, e.g., Ausubel, supra; and Proc Natl Acad Sci USA 78: 6789-6792 (1981).
Conservative Mutations
In addition to naturally-occurring allelic variants of NOVX sequences that may exist in the population, the skilled artisan will further appreciate that changes can be introduced by mutation into the nucleotide sequences of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, thereby leading to changes in the amino acid sequences of the encoded NOVX protein, without altering the functional ability of that NOVX protein. For example, nucleotide substitutions leading to amino acid substitutions at "non-essential" amino acid residues can be made in the sequence of SEQ ID NO:2n, wherein n is an integer between 1 and 34. A "non-essential" amino acid residue is a residue that can be altered from the wild-type sequences of the NOVX proteins without altering their biological activity, whereas an "essential" amino acid residue is required for such biological activity. For example, amino acid residues that are conserved among the NOVX proteins of the invention are predicted to be particularly non-amenable to alteration. Amino acids for which conservative substitutions can be made are well-known within the art.
Another aspect of the invention pertains to nucleic acid molecules encoding NOVX proteins that contain changes in amino acid residues that are not essential for activity. Such NOVX proteins differ in amino acid sequence from SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, yet retain biological activity. In one embodiment, the isolated nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the protein comprises an amino acid sequence at least about 80% homologous to SEQ ID NO:2n, wherein n is an integer between 1 and 34; more preferably at least about 90% homologous, even more preferably at least about 95% homologous, most preferably 98-99% homologous to SEQ ID NO:2n, wherein n is an integer between 1 and 34.
An isolated nucleic acid molecule encoding a NOVX protein homologous to the protein of SEQ ID NO:2n, wherein n is an integer between 1 and 34, can be created by introducing one or more nucleotide substitutions, additions or deletions into the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, such that one or more amino acid substitutions, additions or deletions are introduced into the encoded protein.
Mutations can be introduced into any one of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, by standard techniques, such as described in United States Patent 6,600,019, column 74, line 58 to column 75, line 14, the disclosure of which is hereby incorporated in toto by reference. Following mutagenesis of a nucleic acid of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, the encoded protein can be expressed by any recombinant technology known in the art and the activity of the protein can be determined.
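The sequence numbering convention used throughout (nucleotide sequence SEQ ID NO:2n-1 paired with the encoded amino acid sequence SEQ ID NO:2n) can be written out as simple arithmetic. In this sketch the range "between 1 and 34" is assumed to be inclusive; the function name is hypothetical.

```python
def seq_id_pair(n: int) -> tuple[int, int]:
    """Return (nucleic acid SEQ ID NO, polypeptide SEQ ID NO) for index n.

    Assumes n runs from 1 to 34 inclusive, giving odd-numbered nucleotide
    sequences (2n-1) and even-numbered amino acid sequences (2n).
    """
    if not 1 <= n <= 34:
        raise ValueError("n must be an integer between 1 and 34")
    return 2 * n - 1, 2 * n


for n in (1, 2, 34):
    nucleic, polypeptide = seq_id_pair(n)
    print(f"n={n}: nucleotide SEQ ID NO:{nucleic}, polypeptide SEQ ID NO:{polypeptide}")
```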
The relatedness of amino acid families may also be determined based on side chain interactions. Substituted amino acids may be fully conserved "strong" residues or fully conserved "weak" residues. The "strong" group of conserved amino acid residues may be any one of the following groups: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY, FYW, wherein the single letter amino acid codes are grouped by those amino acids that may be substituted for each other. Likewise, the "weak" group of conserved residues may be any one of the following: CSA, ATV, SAG, STNK, STPA, SGND, SNDEQK, NDEQHK, NEQHRK, HFY, wherein the letters within each group represent the single letter amino acid code. In one embodiment, a mutant NOVX protein can be assayed for (i) the ability to form protein:protein interactions with other NOVX proteins, other cell-surface proteins, or biologically-active portions thereof; (ii) complex formation between a mutant NOVX protein and a NOVX ligand; or (iii) the ability of a mutant NOVX protein to bind to an intracellular target protein or biologically-active portion thereof (e.g., avidin proteins). In yet another embodiment, a mutant NOVX protein can be assayed for the ability to regulate a specific biological function (e.g., regulation of insulin release).
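The "strong" and "weak" conservation groups listed above lend themselves to a simple lookup. The following sketch classifies a single substitution using exactly those groups; the function name, the returned labels, and the rule that a "strong" match takes precedence over a "weak" one are assumptions made for illustration.

```python
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "HFY"]


def classify_substitution(original: str, substituted: str) -> str:
    """Classify a one-letter amino acid substitution against the strong
    and weak conservation groups."""
    a, b = original.upper(), substituted.upper()
    if a == b:
        return "identical"
    if any(a in group and b in group for group in STRONG_GROUPS):
        return "strong"
    if any(a in group and b in group for group in WEAK_GROUPS):
        return "weak"
    return "non-conservative"


for pair in (("I", "L"), ("S", "G"), ("W", "D")):
    print(pair, classify_substitution(*pair))
```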
Interfering RNA
In one aspect of the invention, NOVX gene expression can be attenuated by RNA interference. One approach well-known in the art is short interfering RNA (siRNA) mediated gene silencing where expression products of a NOVX gene are targeted by specific double stranded NOVX derived siRNA nucleotide sequences that are complementary to at least a 19-25 nt long segment of the NOVX gene transcript, including the 5' untranslated (UT) region, the ORF, or the 3' UT region. See, e.g., PCT applications WO00/44895, WO99/32619, WO01/75164, WO01/92513, WO01/29058, WO01/89304, WO02/16620, and WO02/29858, each incorporated by reference herein in their entirety. Targeted genes can be a NOVX gene, or an upstream or downstream modulator of the NOVX gene. Nonlimiting examples of upstream or downstream modulators of a NOVX gene include, e.g., a transcription factor that binds the NOVX gene promoter, a kinase or phosphatase that interacts with a NOVX polypeptide, and polypeptides involved in a NOVX regulatory pathway.
A therapeutic method of the invention contemplates administering a NOVX siRNA construct as therapy to compensate for increased or aberrant NOVX expression or activity. The NOVX ribopolynucleotide is obtained and processed into siRNA fragments, or a NOVX siRNA is synthesized, as described above. The NOVX siRNA is administered to cells or tissues using known nucleic acid transfection techniques, as described above. A NOVX siRNA specific for a NOVX gene will decrease or knockdown NOVX transcription products, which will lead to reduced NOVX polypeptide production, resulting in reduced NOVX polypeptide activity in the cells or tissues.
The present invention also encompasses a method of treating a disease or condition associated with the presence of a NOVX protein in an individual comprising administering to the individual an RNAi construct that targets the mRNA of the protein (the mRNA that encodes the protein) for degradation. A specific RNAi construct includes a siRNA or a double stranded gene transcript that is processed into siRNAs. Upon treatment, the target protein is not produced or is not produced to the extent it would be in the absence of the treatment. In specific embodiments, a NOVX siRNA is used in therapy. Methods for the generation and use of a NOVX siRNA are known to those skilled in the art. Example techniques are provided below.
Production of RNAs
Sense RNA (ssRNA) and antisense RNA (asRNA) of NOVX are produced using known methods such as transcription in RNA expression vectors. In the initial experiments, the sense and antisense RNA are about 500 bases in length each. The produced ssRNA and asRNA (0.5 μM) in 10 mM Tris-HCl (pH 7.5) with 20 mM NaCl are heated to 95 °C for 1 min, then cooled and annealed at room temperature for 12 to 16 h. The RNAs are precipitated and resuspended in lysis buffer (below). To monitor annealing, RNAs are electrophoresed in a 2% agarose gel in TBE buffer and stained with ethidium bromide. See, e.g., Sambrook et al., Molecular Cloning. Cold Spring Harbor Laboratory Press, Plainview, N.Y. (1989).
Lysate Preparation
Untreated rabbit reticulocyte lysate (Ambion) is assembled according to the manufacturer's directions. dsRNA is incubated in the lysate at 30 °C for 10 min prior to the addition of mRNAs. Then NOVX mRNAs are added and the incubation continued for an additional 60 min. The molar ratio of double stranded RNA and mRNA is about 200:1. The NOVX mRNA is radiolabeled (using known techniques) and its stability is monitored by gel electrophoresis. In a parallel experiment made with the same conditions, the double stranded RNA is internally radiolabeled with α-32P-ATP. Reactions are stopped by the addition of 2X proteinase K buffer and deproteinized as described previously (Genes Dev., 13:3191-3197, 1999). Products are analyzed by electrophoresis in 15% or 18% polyacrylamide sequencing gels using appropriate RNA standards. By monitoring the gels for radioactivity, the natural production of 10 to 25 nt RNAs from the double stranded RNA can be determined.
The band of double stranded RNA, about 21-23 bps, is eluted. The efficacy of these 21-23 mers for suppressing NOVX transcription is assayed in vitro using the same rabbit reticulocyte assay described above using 50 nanomolar of double stranded 21-23 mer for each assay. The sequence of these 21-23 mers is then determined using standard nucleic acid sequencing techniques.
RNA Preparation
21 nt RNAs, based on the sequence determined above, are chemically synthesized using Expedite RNA phosphoramidites and thymidine phosphoramidite (Proligo, Germany). Synthetic oligonucleotides are deprotected and gel-purified (Genes & Dev. 15, 188-200, 2001), followed by Sep-Pak C18 cartridge (Waters, Milford, Mass., USA) purification (Biochemistry, 32:11658-11668, 1993).
These single-stranded RNAs (20 μM) are incubated in annealing buffer (100 mM potassium acetate, 30 mM HEPES-KOH at pH 7.4, 2 mM magnesium acetate) for 1 min at 90 °C followed by 1 h at 37 °C.
Cell Culture
A cell culture known in the art to regularly express NOVX is propagated using standard conditions. 24 hours before transfection, at approx. 80% confluency, the cells are trypsinized and diluted 1:5 with fresh medium without antibiotics (1-3 × 10^5 cells/ml) and transferred to 24-well plates (500 μl/well). Transfection is performed using a commercially available lipofection kit and NOVX expression is monitored using standard techniques with positive and negative controls. A positive control is cells that naturally express NOVX while a negative control is cells that do not express NOVX. Base-paired 21 and 22 nt siRNAs with overhanging 3' ends mediate efficient sequence-specific mRNA degradation in lysates and in cell culture. Different concentrations of siRNAs are used. An efficient concentration for suppression in vitro in mammalian culture is between 25 nM and 100 nM final concentration. This indicates that siRNAs are effective at concentrations that are several orders of magnitude below the concentrations applied in conventional antisense or ribozyme gene targeting experiments.
The above method provides a way both for the deduction of NOVX siRNA sequence and the use of such siRNA for in vitro suppression. In vivo suppression may be performed using the same siRNA using well known in vivo transfection or gene therapy transfection techniques.
Antisense Nucleic Acids
Another aspect of the invention pertains to isolated antisense nucleic acid molecules that are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, or fragments, analogs or derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense" nucleic acid encoding a protein (e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence). In specific aspects, antisense nucleic acid molecules are provided that comprise a sequence complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire NOVX coding strand, or to only a portion thereof. Nucleic acid molecules encoding fragments, homologs, derivatives and analogs of a NOVX protein of SEQ ID NO:2n, wherein n is an integer between 1 and 34, or antisense nucleic acids complementary to a NOVX nucleic acid sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34, are additionally provided.
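The relationship between a "sense" coding strand and an antisense sequence complementary to a portion of it can be shown with a short sketch based on Watson-Crick pairing. The function names and the example coding strand are hypothetical, and chemical modifications of the oligonucleotide are not modeled.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Watson-Crick reverse complement of a DNA sequence, written 5'->3'."""
    return dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def antisense_to_region(coding_strand: str, start: int, length: int) -> str:
    """Return an antisense sequence complementary to `length` nucleotides of
    the coding strand beginning at 0-based position `start`."""
    if start < 0 or length <= 0 or start + length > len(coding_strand):
        raise ValueError("region must lie within the coding strand")
    return reverse_complement(coding_strand[start:start + length])


coding = "ATGGCTAAGGCTTCAGGATCC"             # hypothetical coding strand
oligo = antisense_to_region(coding, 0, 10)   # region beginning at the ATG
print(oligo)
```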
In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region" of the coding strand of a nucleotide sequence encoding a NOVX protein. The term "coding region" refers to the region of the nucleotide sequence comprising codons which are translated into amino acid residues. In another embodiment, the antisense nucleic acid molecule is antisense to a "noncoding region" of the coding strand of a nucleotide sequence encoding the NOVX protein. The term "noncoding region" refers to 5' and 3' sequences which flank the coding region that are not translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).
Given the coding strand sequences encoding the NOVX protein disclosed herein, antisense nucleic acids of the invention can be designed according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule can be complementary to the entire coding region of NOVX mRNA, but more preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of NOVX mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of NOVX mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally-occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids (e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used). Examples of modified nucleotides are well-known in the art and may be found and discussed, for example, in United States Patent 6,600,019, column 76, line 21, to column 77, line 21, the disclosure of which is hereby incorporated in toto by reference.
Ribozymes and PNA Moieties
Nucleic acid modifications include, by way of non-limiting example, modified bases, and nucleic acids whose sugar phosphate backbones are modified or derivatized. These modifications are carried out at least in part to enhance the chemical stability of the modified nucleic acid, such that they may be used, for example, as antisense binding nucleic acids in therapeutic applications in a subject.
In one embodiment, an antisense nucleic acid of the invention is a ribozyme. Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes as described in Nature 334:585, 1988) can be used to catalytically cleave NOVX mRNA transcripts to thereby inhibit translation of NOVX mRNA. A ribozyme having specificity for a NOVX-encoding nucleic acid can be designed based upon the nucleotide sequence of a NOVX cDNA disclosed herein (SEQ ID NO:2n-1, wherein n is an integer between 1 and 34). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a NOVX-encoding mRNA. See, U.S. Patents 4,987,071 and 5,116,742. NOVX mRNA can also be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, Science 261:1411 (1993).
Alternatively, NOVX gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the NOVX nucleic acid (e.g., the NOVX promoter and/or enhancers) to form triple helical structures that prevent transcription of the NOVX gene in target cells. See, e.g., Anticancer Drug Des. 6:569, 1991; Ann. N.Y. Acad. Sci. 660:27, 1992; and BioEssays 14:80, 1992.
In various embodiments, the NOVX nucleic acids can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule as has been reported (see, United States Patent 6,600,019, column 77, line 54 to column 78, line 15, the disclosure of which is hereby incorporated in toto by reference).
In other embodiments, the oligonucleotide according to the present invention may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Proc. Natl. Acad. Sci. U.S.A. 86:6553, 1989; Proc. Natl. Acad. Sci. 84:648, 1987; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO 89/10134). In addition, oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., BioTechniques 6:958, 1988) or intercalating agents (see, e.g., Pharm. Res. 5:539, 1988). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered cleavage agent, and the like.
NOVX Polypeptides
A polypeptide according to the present invention includes a polypeptide of the amino acid sequence of NOVX polypeptides whose sequences are provided in any one of SEQ ID NO:2n, wherein n is an integer between 1 and 34. The invention also includes a mutant or variant protein any of whose residues may be changed from the corresponding residues shown in any one of SEQ ID NO:2n, wherein n is an integer between 1 and 34, while still encoding a protein that maintains its NOVX activities and physiological functions, or a functional fragment thereof. One aspect of the invention pertains to isolated NOVX proteins, and biologically-active portions thereof, or derivatives, fragments, analogs or homologs thereof. Also provided are polypeptide fragments suitable for use as immunogens to raise anti-NOVX antibodies. In one embodiment, native NOVX proteins can be isolated from cells or tissue sources by an appropriate purification scheme using standard protein purification techniques. In another embodiment, NOVX proteins are produced by recombinant DNA techniques. Alternative to recombinant expression, a NOVX protein or polypeptide can be synthesized chemically using standard peptide synthesis techniques.
An "isolated" or "purified" polypeptide or protein or biologically-active portion thereof is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the NOVX protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized, as more fully defined in United States Patent 6,600,019, column 79, lines 28 to 44, the disclosure of which is incorporated in toto herein. The language "substantially free of chemical precursors or other chemicals" is as more fully defined in United States Patent 6,600,019, column 79, lines 51 to 55, the disclosure of which is incorporated in toto herein. In one embodiment, the language "substantially free of chemical precursors or other chemicals" includes preparations of NOVX proteins having less than about 30% (by dry weight) of chemical precursors or non-NOVX chemicals, preferably less than about 20%, even more preferably less than about 10%, still more preferably less than about 5%, and most preferably less than 1-2% chemical precursors or non-NOVX chemicals.
Biologically-active portions of NOVX proteins include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequences of the NOVX proteins (e.g., the amino acid sequence of SEQ ID NO:2n, wherein n is an integer between 1 and 34) that include fewer amino acids than the full-length NOVX proteins, and exhibit at least one activity of a NOVX protein. Typically, biologically-active portions comprise a domain or motif with at least one activity of the NOVX protein. A biologically-active portion of a NOVX protein can be a polypeptide which is, for example, 10, 25, 50, 100 or more amino acid residues in length. Moreover, other biologically-active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native NOVX protein.
In an embodiment, the NOVX protein has an amino acid sequence of SEQ ID NO:2n, wherein n is an integer between 1 and 34. In other embodiments, the NOVX protein is substantially homologous to SEQ ID NO:2n, wherein n is an integer between 1 and 34, and retains the functional activity of the protein of SEQ ID NO:2n, wherein n is an integer between 1 and 34, yet differs in amino acid sequence due to natural allelic variation or mutagenesis, as described in detail, below. Accordingly, in another embodiment, the NOVX protein is a protein that comprises an amino acid sequence at least about 80% homologous to the amino acid sequence of SEQ ID NO:2n, wherein n is an integer between 1 and 34, and retains the functional activity of the NOVX proteins of SEQ ID NO:2n, wherein n is an integer between 1 and 34.
Determining Homology Between Two or More Sequences
According to the present invention, the determination of homology between two or more sequences is as more fully defined in United States Patent 6,600,019, column 80, line 30 to column 81, line 10, the disclosure of which is incorporated in toto herein, wherein the CDS (encoding) part of the DNA sequence is that of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34.
The term "sequence identity" refers to the degree to which two polynucleotide or polypeptide sequences are identical on a residue-by-residue basis over a particular region of comparison. The term "percentage of sequence identity" is calculated by comparing two optimally aligned sequences over that region of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I, in the case of nucleic acids) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the region of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity. The term "substantial identity" as used herein denotes a characteristic of a polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least 80 percent sequence identity, preferably at least 85 percent identity and often 90 to 95 percent sequence identity, more usually at least 99 percent sequence identity as compared to a reference sequence over a comparison region.
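The percentage calculation described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the two sequences have already been optimally aligned by some other tool and padded to equal length, with "-" marking gap positions (the gap symbol and the function names are assumptions made for the example, not part of the disclosure). Every column of the comparison window counts toward the denominator, and a column counts as matched only when both sequences carry the same residue.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of sequence identity over a pre-aligned comparison window.

    Both inputs must be the same length; '-' denotes a gap (an assumption
    of this sketch). Gap columns are never counted as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # Number of matched positions: identical residue in both sequences.
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    # Divide by the window size and multiply by 100.
    return 100.0 * matches / len(aligned_a)


def is_substantially_identical(aligned_a: str, aligned_b: str,
                               threshold: float = 80.0) -> bool:
    """True when identity meets the threshold (80 percent by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For instance, two aligned ten-residue windows that differ at a single position share nine matched positions out of ten, which meets the 80 percent threshold used in the definition of "substantial identity".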
Chimeric and Fusion Proteins
The present invention also provides NOVX chimeric or fusion proteins. As used herein, a NOVX "chimeric protein" or "fusion protein" comprises a NOVX polypeptide operatively-linked to a non-NOVX polypeptide. A "NOVX polypeptide" refers to a polypeptide having an amino acid sequence corresponding to a NOVX protein of SEQ ID NO:2n, wherein n is an integer between 1 and 34, whereas a "non-NOVX polypeptide" refers to a polypeptide having an amino acid sequence corresponding to a protein that is not substantially homologous to the NOVX protein, e.g., a protein that is different from the NOVX protein and that is derived from the same or a different organism. Within a NOVX fusion protein the NOVX polypeptide can correspond to all or a portion of a NOVX protein. In one embodiment, a NOVX fusion protein comprises at least one biologically-active portion of a NOVX protein. In another embodiment, a NOVX fusion protein comprises at least two biologically-active portions of a NOVX protein. In yet another embodiment, a NOVX fusion protein comprises at least three biologically-active portions of a NOVX protein. Within the fusion protein, the term "operatively-linked" is intended to indicate that the NOVX polypeptide and the non-NOVX polypeptide are fused in-frame with one another. The non-NOVX polypeptide can be fused to the N-terminus or C-terminus of the NOVX polypeptide.
In one embodiment, the fusion protein is a GST-NOVX fusion protein in which the NOVX sequences are fused to the C-terminus of the GST (glutathione S-transferase) sequences. Such fusion proteins can facilitate the purification of recombinant NOVX polypeptides.
In another embodiment, the fusion protein is a NOVX protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of NOVX can be increased through use of a heterologous signal sequence. In yet another embodiment, the fusion protein is a NOVX-immunoglobulin fusion protein in which the NOVX sequences are fused to sequences derived from a member of the immunoglobulin protein family. The NOVX-immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject to inhibit an interaction between a NOVX ligand and a NOVX protein on the surface of a cell, to thereby suppress NOVX-mediated signal transduction in vivo. The NOVX-immunoglobulin fusion proteins can be used to affect the bioavailability of a NOVX cognate ligand. Inhibition of the NOVX ligand/NOVX interaction may be useful therapeutically for both the treatment of proliferative and differentiative disorders, as well as modulating (e.g. promoting or inhibiting) cell survival. Moreover, the NOVX-immunoglobulin fusion proteins of the invention can be used as immunogens to produce anti-NOVX antibodies in a subject, to purify NOVX ligands, and in screening assays to identify molecules that inhibit the interaction of NOVX with a NOVX ligand.
A NOVX chimeric or fusion protein of the invention can be produced by standard recombinant DNA techniques as described and more fully defined in United States Patent 6,600,019, column 82, lines 15 to 37, the disclosure of which is incorporated in toto herein.
NOVX Agonists and Antagonists
The present invention also pertains to variants of the NOVX proteins that function as either NOVX agonists (i.e., mimetics) or as NOVX antagonists as more fully defined in United States Patent 6,600,019, column 82, line 40 to column 83, line 19, the disclosure of which is incorporated in toto herein.
Anti-NOVX Antibodies
Included in the invention are antibodies to NOVX proteins, or fragments of NOVX proteins or a derivative, fragment, analog, homolog or ortholog thereof. The term "antibody" as used herein refers to immunoglobulin molecules and immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2 fragments, and an Fab expression library. Antibodies may be any of the classes IgG, IgM, IgA, IgE and IgD, and include subclasses such as IgG1, IgG2, and others. The light chain may be a kappa chain or a lambda chain. Reference herein to antibodies includes a reference to all such classes, subclasses and types of antibody species.
An isolated NOVX full length protein, or a portion or fragment thereof, can be used as an immunogen to generate antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal and monoclonal antibody preparation. An antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence of the full length protein, such as an amino acid sequence of SEQ ID NO:2n, wherein n is an integer between 1 and 34, and encompasses an epitope. The antigenic peptide may comprise at least 10 amino acid residues, or at least 15, at least 20, or at least 30 amino acid residues. Epitopes encompassed by the antigenic peptide may be regions of the protein that are located on its surface; commonly these are hydrophilic regions.
In certain embodiments of the invention, at least one epitope encompassed by the antigenic peptide is a region of NOVX that is located on the surface of the protein, e.g., a hydrophilic region and may be determined by a hydrophobicity analysis of the NOVX protein sequence. As a means for targeting antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity may be generated by any method well known in the art (for example see PNAS USA 78:3824,1981; and J. Mol. Biol. 157:105, 1982).
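A hydropathy profile of the kind mentioned above can be sketched as a sliding-window average over per-residue hydropathy values. The sketch below uses the widely published Kyte-Doolittle scale, which appears to correspond to the J. Mol. Biol. 157:105 citation; the window size of 7, the cutoff of 0.0 and the function names are assumptions chosen for illustration rather than values taken from this disclosure. Windows with a low average are the more hydrophilic ones and are therefore candidate surface regions.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list[float]:
    """Mean hydropathy of every full window along the protein sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    # A KeyError here signals a residue outside the 20 standard amino acids.
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophilic_windows(sequence: str, window: int = 7,
                        cutoff: float = 0.0) -> list[int]:
    """Start indices (0-based) of windows whose mean falls below the cutoff."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, score in enumerate(profile) if score < cutoff]
```

Plotting the values returned by `hydropathy_profile` against window position gives the hydropathy plot; the indices from `hydrophilic_windows` mark stretches that could be examined further as candidate antigenic peptides.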
The term "epitope" includes any protein determinant capable of specific binding to an immunoglobulin or T-cell receptor. Epitopic determinants usually consist of chemically active surface groupings of molecules such as amino acids or sugar side chains and usually have specific three dimensional structural characteristics, as well as specific charge characteristics. A NOVX polypeptide or a fragment thereof comprises at least one antigenic epitope. An anti-NOVX antibody of the present invention is said to specifically bind to antigen NOVX when the equilibrium binding constant (KD) is <1 μM, preferably ≤ 100 nM, more preferably ≤ 10 nM, and most preferably < 100 pM to about 1 pM, as measured by assays including radioligand binding assays or similar assays known to skilled artisans.
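The graded binding thresholds in the preceding paragraph can be expressed as a simple lookup. The sketch below is illustrative only: the function name and tier labels are hypothetical, the KD is assumed to be supplied in molar units, and the strict versus inclusive comparisons follow the symbols as they appear in the text.

```python
def binding_tier(kd_molar: float) -> str:
    """Map an equilibrium binding constant (KD, in mol/L) to a preference tier."""
    if kd_molar <= 0:
        raise ValueError("KD must be a positive concentration")
    if kd_molar < 100e-12:      # below 100 pM
        return "most preferred"
    if kd_molar <= 10e-9:       # at or below 10 nM
        return "more preferred"
    if kd_molar <= 100e-9:      # at or below 100 nM
        return "preferred"
    if kd_molar < 1e-6:         # below 1 uM
        return "specific"
    return "not specific"
```

A KD measured in a radioligand binding assay would be converted to molar units before being passed to `binding_tier`.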
Various procedures known within the art may be used for the production of polyclonal or monoclonal antibodies directed against a protein of the invention, or against derivatives, fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY).
In another embodiment NOVX nucleic acid molecules are used directly for production of antibodies recognizing NOVX polypeptides. Antibodies can be prepared by genetic or DNA-based immunization. It has been shown that intramuscular immunization of mice with a naked DNA plasmid led to expression of reporter proteins in muscle cells (Science 247:1465, 1990) and that this technology could stimulate an immune response (Nature. 356:152, 1992). The success of genetic immunization in stimulating both cellular and humoral immune responses has been widely reported (e.g. in Annu. Rev. Immunol. 15:617, 1997; Immunol. Today 19:89, 1998; Annu. Rev. Immunol. 18:927, 2000). Using this technology, antibodies can be generated through immunization with a cDNA sequence encoding the protein in question. Following genetic immunization, the animal's immune system is activated in response to the synthesis of the foreign protein.
The quantity of protein produced in vivo following genetic immunization is within the picogram to nanogram range, which is much lower than the amounts of protein introduced by conventional immunization protocols. Despite these low levels of protein, a very efficient immune response is achieved because the foreign protein is expressed directly in, or is quickly taken up by, antigen-presenting dendritic cells (J. Leuk. Biol. 66:350, 1999; J. Exp. Med. 186:1481, 1997; Nat. Med. 2:1122, 1996). A further increase in the effectiveness of genetic immunization is due to the inherent immune-enhancing properties of the DNA itself, i.e., the presence of CpG-motifs in the plasmid backbone, which activate both dendritic cells (J. Immunol. 161:3042, 1998) and B-cells (Nature 374:546, 1995).
Genetic immunization and production of high affinity monoclonal antibodies have been successful in mice (Biotechniques 16:616, 1994; J. Biotechnol. 51:191, 1996; Hybridoma 17:569, 1998; J. Virol. 72:4541, 1998; J. Immunol. 160:1458, 1998; J. Biotechnol. 73:119, 1999). It has been shown that monoclonal antibodies of the mature IgG subclasses can be obtained (Hybridoma 17:569, 1998) and single chain libraries can be generated from genetically immunized mice (PNAS USA 95:669, 1998). It has also been shown that genetic immunization can generate antibodies in other species such as rabbits (J. Lipid. Res. 38:2627, 1997) and turkeys (J. Lipid. Res. 38:2627, 1999). Genetic immunization has been used for the production of human antibodies recognizing extracellular targets.
Humanized Antibodies
Anti-NOVX antibodies can further comprise humanized or human antibodies. Humanization can be performed following methods known in the art (Nature, 321:522-525, 1986; Nature, 332:323-327, 1988; Science, 239:1534-1536, 1988; U.S. Patent No. 5,225,539; and Curr. Op. Struct. Biol., 2:593-596, 1992).
Human Antibodies
Fully human antibodies are antibody molecules in which the entire sequence of both the light chain and the heavy chain, including the CDRs, arise from human genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. Human monoclonal antibodies can be prepared by methods known in the art, see, for example, Immunol Today 4:72, 1983; In: MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc., pp. 77-96, 1985; PNAS USA 80:2026, 1983; J. Mol. Biol., 227:381, 1991; J. Mol. Biol., 222:581, 1991; U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; 5,661,016; 5,939,598, and 5,916,771; Bio/Technology 10:779-783, 1992; Nature 368:856, 1994; Nature 368:812, 1994; Nature Biotechnology 14:845, 1996; Nature Biotechnology 14:826, 1996; and Intern. Rev. Immunol. 13:65, 1995.
Fab Fragments and Single Chain Antibodies
According to the invention, techniques can be adapted for the production of single-chain antibodies specific to an antigenic protein of the invention (see, e.g., U.S. Patent No. 4,946,778). In addition, methods can be adapted for the construction of Fab expression libraries (see, e.g., Science 246:1275, 1989) to allow rapid and effective identification of monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments, analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.
Bispecific Antibodies
Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that have binding specificities for at least two different antigens. In the present case, one of the binding specificities is for an antigenic protein of the invention. The second binding target is any other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.
Methods for making bispecific antibodies are known in the art (see Nature, 305:537, 1983), and the resulting antibodies may be purified by affinity chromatography steps (EMBO J., 10:3655, 1991). For further details of generating bispecific antibodies see, for example, Methods in Enzymology, 121:210 (1986); Science 229:81 (1985); J. Exp. Med. 175:217 (1992); J. Immunol. 148(5):1547 (1992); "diabody" technology described in PNAS USA 90:6444 (1993); and single-chain Fv (sFv) dimers in J. Immunol. 152:5368 (1994). Antibodies with more than two valencies are contemplated, see for example J. Immunol. 147:60 (1991).
Heteroconjugate Antibodies
Heteroconjugate antibodies composed of two covalently joined antibodies are also within the scope of the present invention, see for example, U.S. Patent No. 4,676,980 and EP 03089. It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic protein chemistry, including those involving crosslinking agents. For example, immunotoxins can be constructed using a disulfide exchange reaction or by forming a thioether bond. Examples of suitable reagents for this purpose include iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.
Effector Function Engineering
It can be desirable to modify the antibody of the invention with respect to effector function, see for example, J. Exp Med., 176:1191, 1992; J. Immunol., 148:2918, 1992; Cancer Research, 53:2560, 1993; Anti-Cancer Drug Design, 3:219, 1989.
Immunoconjugates
The invention also pertains to immunoconjugates comprising an antibody according to the present invention conjugated to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a radioconjugate). Immunoconjugates are further described in United States Patent 6,600,019, column 91, line 54 to column 92, line 31, the disclosure of which is incorporated in toto herein.
Immunoliposomes
The antibodies disclosed herein can also be formulated as immunoliposomes prepared by methods known in the art, such as described in PNAS USA, 82:3688, 1985; PNAS USA, 77:4030, 1980; and U.S. Pat. Nos. 4,485,045; 4,544,545; and 5,013,556; J. Biol. Chem., 257:286, 1982; J. National Cancer Inst., 81(19):1484, 1989.
Diagnostic Applications of Antibodies Directed Against the Proteins of the Invention
In one embodiment, methods for the screening of antibodies that possess the desired specificity include, but are not limited to, enzyme linked immunosorbent assay (ELISA) and other immunologically mediated techniques known within the art. In a specific embodiment, selection of antibodies that are specific to a particular domain of a NOVX protein is facilitated by generation of hybridomas that bind to the fragment of a NOVX protein possessing such a domain. Thus, antibodies that are specific for a desired domain within a NOVX protein, or derivatives, fragments, analogs or homologs thereof, are also provided herein.
Antibodies directed against a NOVX protein of the invention may be used in methods known within the art relating to the localization and/or quantitation of a NOVX protein (e.g., for use in measuring levels of the NOVX protein within appropriate physiological samples, for use in diagnostic methods, for use in imaging the protein, and the like). In a given embodiment, antibodies specific to a NOVX protein, or derivative, fragment, analog or homolog thereof, that contain the antibody derived antigen binding domain, are utilized as pharmacologically active compounds (referred to hereinafter as "Therapeutics").
An antibody specific for a NOVX protein of the invention (e.g., a monoclonal antibody or a polyclonal antibody) can be used to isolate a NOVX polypeptide by standard techniques, such as immunoaffinity chromatography or immunoprecipitation. An antibody to a NOVX polypeptide can facilitate the purification of a natural NOVX antigen from cells, or of a recombinantly produced NOVX antigen expressed in host cells. Moreover, such an anti-NOVX antibody can be used to detect the antigenic NOVX protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the antigenic NOVX protein. Antibodies directed against a NOVX protein can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include 125I, 131I, 35S or 3H.
Antibody Therapeutics
Antibodies of the present invention, including polyclonal, monoclonal, humanized and fully human antibodies, may be used as therapeutic agents. Such agents will generally be employed to treat or prevent a disease or pathology in a subject. An antibody preparation, preferably one having high specificity and high affinity for its target antigen, is administered to the subject and will generally have an effect due to its binding with the target. Such an effect may be one of two kinds, depending on the specific nature of the interaction between the given antibody molecule and the target antigen in question. In the first instance, administration of the antibody may abrogate or inhibit the binding of the target with an endogenous ligand to which it naturally binds. In this case, the antibody binds to the target and masks a binding site of the naturally occurring ligand, wherein the ligand serves as an effector molecule. Thus the receptor mediates a signal transduction pathway for which ligand is responsible.
Alternatively, the effect may be one in which the antibody elicits a physiological result by virtue of binding to an effector binding site on the target molecule. In this case the target, a receptor having an endogenous ligand which may be absent or defective in the disease or pathology, binds the antibody as a surrogate effector ligand, initiating a receptor-based signal transduction event by the receptor.
A therapeutically effective amount of an antibody of the invention relates generally to the amount needed to achieve a therapeutic objective. As noted above, this may be a binding interaction between the antibody and its target antigen that, in certain cases, interferes with the functioning of the target, and in other cases, promotes a physiological response. The amount required to be administered will furthermore depend on the binding affinity of the antibody for its specific antigen, and will also depend on the rate at which an administered antibody is depleted from the free volume of the subject to which it is administered. Common ranges for therapeutically effective dosing of an antibody or antibody fragment of the invention may be, by way of nonlimiting example, from about 0.1 mg/kg body weight to about 50 mg/kg body weight. Common dosing frequencies may range, for example, from twice daily to once a week.
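As a worked illustration of the weight-based range stated above, the per-administration amount scales linearly with body weight. The sketch below only performs that arithmetic; the function name and the example body weight are hypothetical, and the default bounds are the nonlimiting 0.1 mg/kg and 50 mg/kg figures from the text.

```python
def dose_range_mg(body_weight_kg: float,
                  low_mg_per_kg: float = 0.1,
                  high_mg_per_kg: float = 50.0) -> tuple[float, float]:
    """Lower and upper per-administration amounts in mg for a given weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("lower bound exceeds upper bound")
    # Amount (mg) = dose (mg/kg) x body weight (kg).
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


# Hypothetical 70 kg subject.
low_mg, high_mg = dose_range_mg(70.0)
```

For a 70 kg subject the range therefore spans roughly 7 mg to 3500 mg per administration, before any adjustment for binding affinity or clearance rate as discussed in the paragraph above.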
Pharmaceutical Compositions of Antibodies
Antibodies specifically binding a protein of the invention, as well as other molecules identified by the screening assays disclosed herein, can be administered for the treatment of various disorders in the form of pharmaceutical compositions. Principles and considerations involved in preparing such compositions, as well as guidance in the choice of components are provided, for example, in Remington: The Science And Practice Of Pharmacy 19th ed. (Alfonso R. Gennaro, et al., editors) Mack Pub. Co., Easton, Pa. : 1995; Drug Absorption Enhancement : Concepts, Possibilities, Limitations, And Trends, Harwood Academic Publishers, Langhorne, Pa., 1994; and Peptide And Protein Drug Delivery (Advances In Parenteral Sciences, Vol. 4), 1991, M. Dekker, New York.
If the antigenic protein is intracellular and whole antibodies are used as inhibitors, internalizing antibodies are preferred. However, liposomes can also be used to deliver the antibody, or an antibody fragment, into cells. Where antibody fragments are used, the smallest inhibitory fragment that specifically binds to the binding domain of the target protein is preferred. For example, based upon the variable-region sequences of an antibody, peptide molecules can be designed that retain the ability to bind the target protein sequence. Such peptides can be synthesized chemically and/or produced by recombinant DNA technology. See, e.g., PNAS USA, 90:7889, 1993. The formulation herein can also contain more than one active compound as necessary for the particular indication being treated, preferably those with complementary activities that do not adversely affect each other. Alternatively, or in addition, the composition can comprise an agent that enhances its function, such as, for example, a cytotoxic agent, cytokine, chemotherapeutic agent, or growth-inhibitory agent. Such molecules are suitably present in combination in amounts that are effective for the purpose intended.
The active ingredients can also be entrapped in microcapsules prepared, for example, by coacervation techniques or by interfacial polymerization, for example, hydroxymethylcellulose or gelatin-microcapsules and poly-(methylmethacrylate) microcapsules, respectively, in colloidal drug delivery systems (for example, liposomes, albumin microspheres, microemulsions, nano-particles, and nanocapsules) or in macroemulsions.
The formulations to be used for in vivo administration must be sterile. This is readily accomplished by filtration through sterile filtration membranes.
Sustained-release preparations can be prepared. Suitable examples of sustained-release preparations include semipermeable matrices of solid hydrophobic polymers containing the antibody, which matrices are in the form of shaped articles, e.g., films, or microcapsules. Examples of sustained-release matrices include polyesters, hydrogels (for example, poly(2-hydroxyethyl-methacrylate), or poly(vinylalcohol)), polylactides (U.S. Pat. No. 3,773,919), copolymers of L-glutamic acid and γ-ethyl-L-glutamate, non-degradable ethylene-vinyl acetate, degradable lactic acid-glycolic acid copolymers such as the LUPRON DEPOT™ (injectable microspheres composed of lactic acid-glycolic acid copolymer and leuprolide acetate), and poly-D-(-)-3-hydroxybutyric acid. While polymers such as ethylene-vinyl acetate and lactic acid-glycolic acid enable release of molecules for over 100 days, certain hydrogels release proteins for shorter time periods.
ELISA Assay
An agent for detecting an analyte protein is, for example, an antibody capable of binding to an analyte protein, preferably an antibody with a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab')2) can be used. The term "labeled", with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with another reagent that is directly labeled. Examples of indirect labeling include detection of a primary antibody using a fluorescently-labeled secondary antibody and end-labeling of a DNA probe with biotin such that it can be detected with fluorescently-labeled streptavidin. The term "biological sample" is intended to include tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. Included within the usage of the term "biological sample", therefore, is blood and a fraction or component of blood including blood serum, blood plasma, or lymph. That is, the detection method of the invention can be used to detect an analyte mRNA, protein, or genomic DNA in a biological sample in vitro as well as in vivo. For example, in vitro techniques for detection of an analyte mRNA include Northern hybridizations and in situ hybridizations. In vitro techniques for detection of an analyte protein include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations, and immunofluorescence. In vitro techniques for detection of an analyte genomic DNA include Southern hybridizations. Procedures for conducting immunoassays are described, for example, in "ELISA: Theory and Practice: Methods in Molecular Biology", Vol. 42, J. R. Crowther (Ed.) Humana Press, Totowa, NJ, 1995; "Immunoassay", E. Diamandis and T. Christopoulos, Academic Press, Inc., San Diego, CA, 1996; and "Practice and Theory of Enzyme Immunoassays", P. Tijssen, Elsevier Science Publishers, Amsterdam, 1985. Furthermore, in vivo techniques for detection of an analyte protein include introducing into a subject a labeled anti-analyte protein antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques.
NOVX Recombinant Expression Vectors and Host Cells
Another aspect of the invention pertains to vectors, preferably expression vectors, containing a nucleic acid encoding a NOVX protein according to the present invention, or derivatives, fragments, analogs or homologs thereof. As used herein, NOVX recombinant expression vectors and host cells are more fully defined in United States Patent 6,600,019, column 92, line 31 to column 96, line 9, the disclosure of which is incorporated in toto herein.
Transgenic NOVX Animals
The host cells of the invention can also be used to produce non-human transgenic animals by methods known in the art, for example as described in U.S. Patent Nos. 4,736,866; 4,870,009; and 4,873,191; and Hogan, 1986. In: MANIPULATING THE MOUSE EMBRYO, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; Cell 51:503 (1987); Cell 69:915, 1992; In: TERATOCARCINOMAS AND EMBRYONIC STEM CELLS: A PRACTICAL APPROACH, Robertson, ed. IRL, Oxford, pp. 113-152, 1987; Curr. Opin. Biotechnol. 2:823-829, 1991; PCT International Publication Nos.: WO 90/11354; WO 91/01140; WO 92/0968; and WO 93/04169; the cre/loxP recombinase system (PNAS USA 89:6232, 1992); a recombinase system (Science 251:1351, 1991); and clones of the non-human transgenic animals described in Nature 385:810, 1997.
Pharmaceutical Compositions
The NOVX nucleic acid molecules, NOVX proteins, and anti-NOVX antibodies (also referred to herein as "active compounds") of the present invention, and derivatives, fragments, analogs and homologs thereof, can be incorporated into pharmaceutical compositions suitable for administration. Such compositions are well known in the compounding arts and are further described in United States Patent 6,600,019, column 98, line 25 to column 101, line 14, the disclosure of which is incorporated in toto herein.
The pharmaceutical compositions of the present invention can be included in a container, pack, or dispenser together with instructions for administration.
Screening and Detection Methods
The isolated nucleic acid molecules of the invention can be used to express NOVX protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect NOVX mRNA (e.g., in a biological sample) or a genetic lesion in a NOVX gene, and to modulate NOVX activity, as described further below. In addition, the NOVX proteins can be used to screen drugs or compounds that modulate the NOVX protein activity or expression as well as to treat disorders characterized by insufficient or excessive production of NOVX protein or production of NOVX protein forms that have decreased or aberrant activity compared to NOVX wild-type protein (e.g., diabetes (regulates insulin release); obesity (binds and transports lipids); metabolic disturbances associated with obesity; the metabolic syndrome X; anorexia and wasting disorders associated with chronic diseases and various cancers; infectious disease (possesses anti-microbial activity); and the various dyslipidemias). In addition, the anti-NOVX antibodies of the invention can be used to detect and isolate NOVX proteins and modulate NOVX activity. In yet a further aspect, the invention can be used in methods to influence appetite, absorption of nutrients and the disposition of metabolic substrates in both a positive and negative fashion.
The invention further pertains to novel agents identified by the screening assays described herein and uses thereof for treatments as described, supra.
Screening Assays
The present invention also provides for a method (also referred to herein as a "screening assay") for identifying modulators, i.e., candidate or test compounds or agents (e.g., peptides, peptidomimetics, small molecules or other drugs) that bind to NOVX proteins or have a stimulatory or inhibitory effect on, e.g., NOVX protein expression or NOVX protein activity. The invention also includes compounds identified in the screening assays described herein. Such assays are well known in the art and are more fully described in United States Patent 6,600,019, column 102, line 45 to column 195, line 51, the disclosure of which is incorporated in toto herein.
In another embodiment, modulators of NOVX protein expression are identified in a method wherein a cell is contacted with a candidate compound and the expression of NOVX mRNA or protein in the cell is determined. The level of expression of NOVX mRNA or protein in the presence of the candidate compound is compared to the level of expression of NOVX mRNA or protein in the absence of the candidate compound. The candidate compound can then be identified as a modulator of NOVX mRNA or protein expression based upon this comparison. For example, when expression of NOVX mRNA or protein is greater (i.e., statistically significantly greater) in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of NOVX mRNA or protein expression. Alternatively, when expression of NOVX mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of NOVX mRNA or protein expression. The level of NOVX mRNA or protein expression in the cells can be determined by methods described herein for detecting NOVX mRNA or protein.
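The comparison described above can be sketched in code. The following is a minimal illustration only, assuming replicate expression measurements are available for treated and untreated cells; the function names, the choice of an exact permutation test, and the 0.05 threshold are assumptions made for this sketch and are not specified by the disclosure.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Two-sided exact permutation p-value for a difference in mean expression."""
    pooled = list(treated) + list(control)
    n = len(treated)
    observed = abs(mean(treated) - mean(control))
    total = sum(pooled)
    hits = 0
    count = 0
    # Enumerate every way of relabeling n of the pooled values as "treated".
    for idx in combinations(range(len(pooled)), n):
        group_sum = sum(pooled[i] for i in idx)
        group_mean = group_sum / n
        rest_mean = (total - group_sum) / (len(pooled) - n)
        # Small tolerance guards against floating-point rounding.
        if abs(group_mean - rest_mean) >= observed - 1e-12:
            hits += 1
        count += 1
    return hits / count


def classify_candidate(treated, control, alpha=0.05):
    """Label a candidate compound from expression levels (hypothetical helper)."""
    if permutation_p_value(treated, control) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(control) else "inhibitor"
```

With very few replicates an exact permutation test cannot reach small p-values (three replicates per group give a minimum two-sided p of 0.1), so any real assay would need a replicate count and statistical test chosen for the experimental design.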
In yet another aspect of the invention, the NOVX proteins can be used as "bait proteins" in a two-hybrid assay or three hybrid assay (see, e.g., U.S. Patent No. 5,283,317; Cell 2:223, 1993; J. Biol. Chem. 268:12046, 1993; Biotechniques 14:920, 1993; and Oncogene 8:1693, 1993), to identify other proteins that bind to or interact with NOVX ("NOVX-binding proteins" or "NOVX-bp") and modulate NOVX activity. Such NOVX-binding proteins are also involved in the propagation of signals by the NOVX proteins as, for example, upstream or downstream elements of the NOVX pathway.
The present invention further pertains to novel agents identified by the aforementioned screening assays and uses thereof for treatments as described herein.
Detection Assays
Portions or fragments of the cDNA sequences identified herein (and the corresponding complete gene sequences) can be used in numerous ways as polynucleotide reagents. By way of example, and not of limitation, these sequences can be used to: (i) map their respective genes on a chromosome and, thus, locate gene regions associated with genetic disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. Some of these applications are described in the subsections below.
Chromosome Mapping
Once the sequence (or a portion of the sequence) of a gene has been isolated, this sequence can be used to map the location of the gene on a chromosome ("chromosome mapping"). Briefly, NOVX genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the NOVX sequences. Computer analysis of the NOVX sequences can be used to rapidly select primers that do not span more than one exon in the genomic DNA, since primers spanning more than one exon would complicate the amplification process. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the NOVX sequences will yield an amplified fragment. See, for example, Science 220:919 (1983). Somatic cell hybrids containing only fragments of human chromosomes can also be produced by using human chromosomes with translocations and deletions.
Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location; see Verma, et al., HUMAN CHROMOSOMES: A MANUAL OF BASIC TECHNIQUES (Pergamon Press, New York 1988). Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. The relationship between genes and disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes; see Nature, 325:783).
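The primer selection step can be illustrated with a short sketch. It assumes the exon boundaries are already known as coordinates on the cDNA; the function name, the half-open 0-based coordinate convention, and the default primer length are assumptions for illustration, and the sketch ignores melting temperature, GC content and specificity, which real primer design would also consider.

```python
def primers_within_single_exon(cdna, exon_bounds, length=20):
    """Return (start, primer) pairs whose window lies inside one exon.

    exon_bounds is a list of (start, end) pairs in 0-based, half-open
    cDNA coordinates (a hypothetical input format for this sketch).
    """
    if not 15 <= length <= 25:
        raise ValueError("primer length should be 15-25 bp")
    candidates = []
    for exon_start, exon_end in exon_bounds:
        # Only windows that start and end within this exon are kept,
        # so no candidate crosses an exon-exon junction.
        for start in range(exon_start, exon_end - length + 1):
            candidates.append((start, cdna[start:start + length]))
    return candidates
```

For a 40 bp toy cDNA with exons at (0, 22) and (22, 40), a 20 bp primer fits only in the first exon, at start positions 0, 1 and 2.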
Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the NOVX gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.
Predictive Medicine
The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, pharmacogenomics, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual prophylactically. Accordingly, one aspect of the invention relates to diagnostic assays for determining NOVX protein and/or nucleic acid expression as well as NOVX activity, in the context of a biological sample (e.g., blood, serum, cells, tissue) to thereby determine whether an individual is afflicted with a disease or disorder, or is at risk of developing a disorder, associated with aberrant NOVX expression or activity. The disorders include, but are not limited to metabolic disorders, diabetes, obesity, infectious disease, anorexia, cancer-associated cachexia, cancer, neurodegenerative disorders, Alzheimer's Disease, Parkinson's Disorder, immune disorders, and hematopoietic disorders, and the various dyslipidemias, metabolic disturbances associated with obesity, the metabolic syndrome X and wasting disorders associated with chronic diseases and various cancers.
Another aspect of the invention provides methods for determining NOVX protein, nucleic acid expression or activity in an individual to thereby select appropriate therapeutic or prophylactic agents for that individual (referred to herein as "pharmacogenomics"). Pharmacogenomics allows for the selection of agents (e.g., drugs) for therapeutic or prophylactic treatment of an individual based on the genotype of the individual (e.g., the genotype of the individual examined to determine the ability of the individual to respond to a particular agent.)
Yet another aspect of the invention pertains to monitoring the influence of agents (e.g., drugs, compounds) on the expression or activity of NOVX in clinical trials. An exemplary method for detecting the presence or absence of NOVX in a biological sample involves obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting NOVX protein or the nucleic acid (e.g., mRNA, genomic DNA) that encodes NOVX protein such that the presence of NOVX is detected in the biological sample. An agent for detecting NOVX mRNA or genomic DNA is a labeled nucleic acid probe capable of hybridizing to NOVX mRNA or genomic DNA as described herein. An agent for detecting NOVX protein can be an antibody capable of binding to NOVX protein, preferably an antibody with a detectable label as described herein. In one embodiment, the biological sample contains protein molecules from the test subject. Alternatively, the biological sample can contain mRNA molecules from the test subject or genomic DNA molecules from the test subject. The invention also encompasses kits for detecting the presence of NOVX in a biological sample. For example, the kit can comprise: a labeled compound or agent capable of detecting NOVX protein or mRNA in a biological sample; means for determining the amount of NOVX in the sample; and/or means for comparing the amount of NOVX in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect NOVX protein or nucleic acid.
Assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant NOVX expression or activity. The methods of the invention can also be used to detect genetic lesions in a NOVX gene (characterized by at least one of an alteration affecting the integrity of a gene encoding a NOVX protein, or the misexpression of the NOVX gene), thereby determining if a subject with the lesioned gene is at risk for a disorder characterized by aberrant cell proliferation and/or differentiation. For example, such genetic lesions can be detected by ascertaining the existence of at least one of: (i) a deletion of one or more nucleotides from a NOVX gene; (ii) an addition of one or more nucleotides to a NOVX gene; (iii) a substitution of one or more nucleotides of a NOVX gene; (iv) a chromosomal rearrangement of a NOVX gene; (v) an alteration in the level of a messenger RNA transcript of a NOVX gene; (vi) aberrant modification of a NOVX gene, such as of the methylation pattern of the genomic DNA; (vii) the presence of a non-wild-type splicing pattern of a messenger RNA transcript of a NOVX gene; (viii) a non-wild-type level of a NOVX protein; (ix) allelic loss of a NOVX gene; and (x) inappropriate post-translational modification of a NOVX protein.
In certain embodiments, detection of the lesion involves the use of a probe/primer in a polymerase chain reaction (PCR) (U.S. Patent Nos. 4,683,195 and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, Science 241:10770, 1988; and PNAS USA 91:360, 1994), the latter of which can be particularly useful for detecting point mutations in the NOVX gene (see, Nucl. Acids Res. 23:675, 1995). This method can include the steps of collecting a sample of cells from a patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the nucleic acid sample with one or more primers that specifically hybridize to a NOVX gene under conditions such that hybridization and amplification of the NOVX gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample.
Alternative amplification methods include: self sustained sequence replication (PNAS USA 87:1874, 1990); transcriptional amplification system (PNAS USA 86:1173, 1989); Qβ Replicase (BioTechnology 6:1197, 1988); or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques well known to those of skill in the art. These detection schemes are especially useful for the detection of nucleic acid molecules if such molecules are present in very low numbers. In an alternative embodiment, mutations in a NOVX gene from a sample cell can be identified by alterations in restriction enzyme cleavage patterns. For example, sample and control DNA is isolated, amplified (optionally), digested with one or more restriction endonucleases, and fragment length sizes are determined by gel electrophoresis and compared. Differences in fragment length sizes between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, e.g., U.S. Patent No. 5,493,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
In other embodiments, genetic mutations in NOVX can be identified by hybridizing a sample and control nucleic acids, e.g., DNA or RNA, to high-density arrays containing hundreds or thousands of oligonucleotide probes (e.g., Human Mutation 7:244, 1996; Nat. Med. 2:753, 1996). For example, genetic mutations in NOVX can be identified in two-dimensional arrays containing light-generated DNA probes. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
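The restriction-pattern comparison can be sketched as an in silico digest. This is an illustrative sketch only: the helper names are hypothetical, the DNA is treated as a linear single strand, and EcoRI (recognition site GAATTC, cutting after the first G) is used purely as an example enzyme that the disclosure does not name.

```python
def fragment_lengths(dna, site, cut_offset):
    """Fragment lengths after cutting linear DNA at every occurrence of site.

    cut_offset is the number of bases into the site at which the cut falls.
    """
    cuts = []
    pos = dna.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = dna.find(site, pos + 1)
    edges = [0] + cuts + [len(dna)]
    return [b - a for a, b in zip(edges, edges[1:]) if b > a]


def pattern_differs(sample, control, site="GAATTC", cut_offset=1):
    """True when sample and control give different sets of fragment lengths."""
    sample_frags = sorted(fragment_lengths(sample, site, cut_offset))
    control_frags = sorted(fragment_lengths(control, site, cut_offset))
    return sample_frags != control_frags
```

A point mutation that destroys one recognition site merges two control fragments into one longer sample fragment, which is the kind of difference a gel comparison would reveal.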
In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the NOVX gene and detect mutations by comparing the sequence of the sample NOVX with the corresponding wild-type (control) sequence. For examples of sequencing reactions see PNAS USA 74:560 (1977); PNAS USA 74:5463 (1977); Biotechniques 19:448, 1995; Adv. Chromatography 36:127, 1996; and Appl. Biochem. Biotechnol. 38:147, 1993. Other methods for detecting mutations in the NOVX gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (see, e.g., Science 230:1242, 1985; PNAS USA 85:4397, 1988; Methods Enzymol. 217:286, 1992).
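Once a sample sequence has been obtained, the comparison against the wild-type sequence is a position-by-position check. The sketch below is a simplified illustration with a hypothetical function name; it assumes the two sequences are already aligned and of equal length, so it reports substitutions only and would need a proper alignment step to handle insertions or deletions.

```python
def point_differences(sample, wild_type):
    """List (position, wild_type_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned and
    of equal length (an assumption of this sketch).
    """
    if len(sample) != len(wild_type):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, w, s)
        for i, (w, s) in enumerate(zip(wild_type, sample))
        if w != s
    ]
```

Whether a reported difference is a disease-associated mutation or a benign polymorphism cannot be decided from the comparison alone and requires data from several individuals.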
In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called "DNA mismatch repair" enzymes) in defined systems for detecting and mapping point mutations in NOVX cDNAs, see Carcinogenesis 15:1657, 1994; U.S. Patent No. 5,459,039.
In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in NOVX genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (PNAS USA 86:2766, 1989; Mutat. Res. 285:125, 1993; Genet. Anal. Tech. Appl. 9:73, 1992; Trends Genet. 7:5, 1991).
In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) e.g. Nature 313: 495, 1985; Biophys. Chem. 265: 12753, 1987. Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension, e.g. Nature 324: 163, 1986; PNAS USA 86: 6230, 1989.
Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization, e.g., Nucl. Acids Res. 17:2437-2448, 1989) or at the extreme 3'-terminus of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (e.g., Tibtech. 11:238, 1993). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection, e.g., Mol. Cell Probes 6:1, 1992. It is anticipated that in certain embodiments amplification may also be performed using Taq ligase for amplification, e.g., PNAS USA 88:189, 1991. In such cases, ligation will occur only if there is a perfect match at the 3'-terminus of the 5' sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification. The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a NOVX gene.
Pharmacogenomics
Agents, or modulators that have a stimulatory or inhibitory effect on NOVX activity (e.g., NOVX gene expression), as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) disorders. The disorders include, but are not limited to, those diseases, disorders and conditions listed above, and more particularly include those diseases, disorders, or conditions associated with homologs of a NOVX protein, such as those summarized in Table A.
Pharmacogenomics, the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug, permits the selection of effective agents (e.g., drugs) for prophylactic or therapeutic treatments based on a consideration of the individual's genotype. Such pharmacogenomics can further be used to determine appropriate dosages and therapeutic regimens. Accordingly, the activity of NOVX protein, expression of NOVX nucleic acid, or mutation content of NOVX genes in an individual can be determined to thereby select appropriate agent(s) for therapeutic or prophylactic treatment of the individual. Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons, e.g., Clin. Exp. Pharmacol. Physiol., 23:983, 1996; Clin. Chem., 43:254, 1997. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism).
Thus, the activity of NOVX protein, expression of NOVX nucleic acid, or mutation content of NOVX genes in an individual can be determined to thereby select appropriate agent(s) for therapeutic or prophylactic treatment of the individual. In addition, pharmacogenetic studies can be used to apply genotyping of polymorphic alleles encoding drug-metabolizing enzymes to the identification of an individual's drug responsiveness phenotype. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a NOVX modulator, such as a modulator identified by one of the exemplary screening assays described herein.
Monitoring of Effects During Clinical Trials
Monitoring the influence of agents (e.g., drugs, compounds) on the expression or activity of NOVX (e.g., the ability to modulate aberrant cell proliferation and/or differentiation) can be applied not only in basic drug screening, but also in clinical trials as described more fully in United States Patent 6,600,019, column 115, line 31 to column 116, line 29, the disclosure of which is incorporated in toto herein.
Methods of Treatment
The invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant NOVX expression or activity.
Diseases and Disorders
Diseases and disorders, prophylactic methods, therapeutic methods, and determination of the biological effect of the therapeutic to which the present invention may be applied are more fully disclosed in United States Patent 6,600,019, column 116, line 45 to column 118, line 51, the disclosure of which is incorporated in toto herein.
A more complete understanding of the terms, and scope of the present invention may be obtained in reference to the following discussion and examples, all of which are illustrative of the present invention and are not to be taken as limiting the scope and breadth of the present invention in any manner.
EXAMPLE 1
NOV1, CG110205, A DISINTEGRIN-LIKE AND METALLOPROTEASE (REPROLYSIN TYPE) WITH THROMBOSPONDIN TYPE 1 MOTIF, 12.
Two groups of zinc metalloproteinases in particular, ADAMs and MMPs, are broadly relevant to extracellular proteolysis. Proteolysis of the extracellular matrix plays a critical role in establishing tissue architecture during development and in tissue degradation in diseases such as cancer, arthritis, Alzheimer disease, and a variety of inflammatory conditions. The proteolytic enzymes responsible include members of diverse protease families and they may work in concert or in cascades to degrade or process molecules. Most ADAM family members are quite similar in domain organization, bearing, from amino to carboxyl termini, a signal peptide, a proregion, a zinc metalloprotease catalytic domain with the typical reprolysin signature motif, a disintegrin domain, a cysteine-rich domain, an EGF-like domain, and, in many cases, a membrane-spanning region and a cytoplasmic domain with signaling potential. Members of the ADAMTS family differ substantially from the prototypic ADAM structure in that they lack the EGF-like domain, do not have a canonical disintegrin sequence, and possess modules with similar thrombospondin type 1 repeats.
The novel gene described here is a novel splice variant of ADAMS-TS 18: the first 1062 aa (residues 1-1062) are the same; however, the last 3 exons (1063-1133 aa, 1134-1182 aa, 1183-1212 aa) differ from ADAMS-TS 18. These last 3 exons have perfect splicing sites, while the 3' end of ADAMS-TS 18 (1063-1181 aa) is from intronic sequence. Because of the presence of the thrombospondin type 1 domain and the reprolysin domain, the novel sequence described here has properties and functions similar to these genes. This novel gene has a role in the regulation of cellular functions, the modulation of which has implications in cancer and inflammatory diseases.
The disintegrin-like and metalloprotease (reprolysin type) with thrombospondin type 1 motif, 12-like gene disclosed in this invention maps to chromosome 16.
NOV1 clones were analyzed, and the nucleotide and encoded polypeptide sequences are shown in Table 1A.
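As a consistency check on the table entries, the reported open reading frame coordinates can be related to the reported protein length. The sketch below assumes that "ORF Start" and "ORF Stop" give the 1-based position of the first base of the start and stop codons respectively; this reading is an assumption, although it agrees with the 1212 aa length listed for SEQ ID NO: 2. The function name is hypothetical.

```python
def orf_codon_count(start, stop):
    """Number of coding codons between a start codon and a stop codon.

    start: 1-based position of the first base of the start codon.
    stop: 1-based position of the first base of the stop codon
    (the stop codon itself is not counted).
    """
    span = stop - start
    if span <= 0 or span % 3 != 0:
        raise ValueError("start and stop are not in the same reading frame")
    return span // 3


# NOV1a: ORF Start ATG at 83, ORF Stop TGA at 3719
print(orf_codon_count(83, 3719))
```

Under that coordinate convention, bases 83 through 3718 are coding, which is 3636 bases or 1212 codons.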
Table 1A. NOV1 Sequence Analysis
NOV1a, CG110205-03 SEQ ID NO: 1 3882 bp
DNA Sequence ORF Start: ATG at 83 ORF Stop: TGA at 3719
CGGCCGCGGAAAGAATGCGCGCCGCCCGTGCGCTCCGCCTGCCGCGTCTGGCCACCCGCAGCCGCCGC
GTCCGCACCTGΆCCATGGAGTGCGCCCTCCTGCTCGCGTGTGCCTTCCCGGCTGCGGGTTCGGGCCCG
CCGAGGGGCCTGGCGGGACTGGGGCGCGTGGCCAAGGCGCTCCAGCTGTGCTGCCTCTGCTGTGCGTC GGTCGCCGCGGCCTTAGCCAGTGACAGCAGCAGCGGCGCCAGCGGATTAAΆTGATGATTACGTCTTTG CACGCCAGTAGAAGTAGACTCAGCCGGGTCATATATTTCACACGACATTTTGCACAACGGCAGGAAA 'AAGCGATCGGCGCAGAATGCCAGAAGCTCCCTGCACTACCGATTTTCAGCATTTGGACAGGAACTGCA ^CTTAGAACTTAAGCCCTCGGCGATTTTGAGCAGTCACTTTATTGTCCAGGTACTTGGAAAAGATGGTG JCTTCAGAGACTCAGAAACCCGAGGTGCAGCAATGCTTCTATCAGGGATTTATCAGAAATGACAGCTCC ITCCTCTGTCGCTGTGTCTACGTGTGCTGGCTTGTCAGGTTTAATAAGGACACGAAAAAATGAATTCCT ICATCTCGCCATTACCTCAGCTTCTGGCCCAGGAACACAACTACAGCTCCCCTGCGGGTCACCATCCTC JACGTACTGTACAAAAGGACAGCAGAGGAGAAGATCCAGCGGTACCGTGGCTACCCCGGCTCTGGCCGG LAATTATCCTGGTTACTCCCCAAGTCACATTCCCCATGCATCTCAGAGTCGAGAGACAGAGTATCACCA JTCGAAGGTTGCAAAAGCAGCATTTTTGTGGACGACGCAAGAAATATGCTCCCAAGCCTCCCACAGAGG JACACCTATCTAAGGTTTGATGAATATGGGAGCTCTGGGCGACCCAGAAGATCAGCTGGAAAATCACAA JAAGGGCCTCAATGTGGAAACCCTCGTGGTGGCAGACAAGAAAATGGTGGAAAAGCATGGCAAGGGAAA ITGTCACCACATACATTCTCACAGTAATGAAGGTTTCTGGCCTATTTAAAGATGGGACTATTGGAAGTG JACATAAACGTGGTTGTGGTGAGCCTAATTCTTCTGGAACAAGAACCTGGAGGATTATTGATCAACCAT
(CATGCAGACCAGTCTCTGAATAGTTTTTGTCAATGGCAGTCTGCCCTCATTGGAAAGAATGGCAAGAG
JACATGATCATGCCATCTTACTAACAGGATTTGATATTTGTTCTTGGAAGAATGAACCATGTGACACTC TAGGGTTTGCCCCCACCAGTGGAATGTGCTCTAAGTACCGAAGTTGTACCATCAATGAGGACACAGGA ICTTGGCCTTGCCTTCACCATCGCTCATGAGTCAGGGCACAACTTTGGTATGATTCACGACGGAGAAGG IGAATCCCTGCAGAAAGGCTGAAGGCAΆTATCATGTCTCCCACACTGACCGGAAACAATGGAGTGTTTT ICATGGTCTTCCTGCAGCCGCCAGTATCTCAAGAAATTCCTCAGCACACCTCAGGCGGGGTGTCTAGTG JGATGAGCCCAAGCAAGCAGGACAGTATAAATATCCGGACAAACTACCAGGACAGATTTATGATGCTGA JCACACAGTGTAAATGGCAATTTGGAGCAAAAGCCAAGTTATGCAGCCTTGGTTTTGTGAAGGATATTT ΪGCAAATCACTTTGGTGCCACCGAGTAGGCCACAGGTGTGAGACCAAGTTTATGCCCGCAGCAGAAGGG ΪACCGTTTGTGGCTTGAGTATGTGGTGTCGGCAAGGCCAGTGCGTAAAGTTTGGGGAGCTCGGGCCCCG IGCCCATCCACGGCCAGTGGTCCGCCTGGTCGAAGTGGTCAGAATGTTCCCGGACATGTGGTGGAGGAG |TCAAGTTCCAGGAGAGACACTGCAATAACCCCAAGCCTCAGTATGGTGGCATATO?CTGTCCAGGTTCT ΪAGCCGTATTTATCAGC GTGCAATATTAACCCTTGCAATGAAAATAGCTTGGATTTTCGGGCTCAACA JGTGTGCAGAATATAACAGCAAACCTTTCCGTGGATGGTTCTACCAGTGGAAACCCTATACAAAAGTGG JAAGAGGAAGATCGATGCAAACTGTACTGCAAGGCTGAGAACTTTGAATTTTTTTTTGCAATGTCCGGC JAAAGTGAAAGATGGAACTCCCTGCTCCCCAAACAAAAATGATGTTTGTATTGACGGGGTTTGTGAACT IAGTGGGATGTGATCATGAACTAGGCTCTAAAGCAGTTTCAGATGCTTGTGGCGTTTGCAAAGGTGATA JATTCAACTTGCAAGTTTTATAAAGGCCTGTACCTCAACCAGCATAAAGCAAATGAATATTATCCGGTG JGTCCTCATTCCAGCTGGCGCCCGAAGCATCGAAATCCAGGAGCTGCAGGTTTCCTCCAGTTACCTCGC ΪAGTTCGAAGCCTCAGTCAAAAGTATTACCTCACCGGGGGCTGGAGCATCGACTGGCCTGGGGAGTTCC CTTCGCTGGGACCACGTTTGAATACCAGCGCTCTTTCAACCGCCCGGAACGTCTGTACGCGCCAGGG ICCCACAAATGAGACGCTGGTCTTTGAAATTCTGATGCAAGGCAAAAATCCAGGGATAGCTTGGAAGTA JTGCACTTCCCAAGGTCATGAATGGAACTCCACCAGCCACAAAAAGACCTGCCTATACCTGGAGTATCG JTGCAGTCAGAGTGCTCCGTCTCCTGTGGTGGAGGTTACATAAATGTAAAGGCCATTTGCTTGCGAGAT JCAAAATACTCAAGTCAATTCCTCATTCTGCAGTGCAAAAACCAAGCCAGTAACTGAGCCCAAAATCTG ICAACGCTTTCTCCTGCCCGGCTTACTGGATGCCAGGTGAATGGAGTACATGCAGCAAGTCCTGTGCTG IGAGGCCAGCAGAGCCGAAAGATCCAGTGTGTGCAAAAGAAGCCCTTCCAAAAGGAGGAAGCAGTGTTG JCATTCTCTCTGTCCAGTAAGCACACCCACTCAGGTCCAAGCCTGCAACAGCCATGCCTGCCCTCCACA IATGGAGCCTTGGACCCTGGTCTCAGTGTTCCAAGACCTGTGGACGAGGGGTGAGGAAGCGTGAACTCC 
.TCTGCAAGGGCTCTGCCGCAGAAACCCTCCCCGAGAGCCAGTGTACCAGTCTCCCCAGACCTGAGC-.G JCAGGAGGGCTGTGTGCTTGGACGATGCCCCAAGAACAGCCGGCTACAGTGGGTCGCTTCTTCGTGGAG 1CGAGTGTTCTGCAACCTGTGGTTTGGGTGTGAGGAAGAGGGAGATGAAGTGCAGCGAGAAGGGCTTCC
JAGGGAAAGCTGATAACTTTCCCAGAGCGAAGATGCCGTAATATTAAGAAACCAAATCTGGACTTGGAA JGAGACCTGCAACCGACGGGCTTGCCCAGCCCATCCAGTGTACAACATGGTAGCTGGATGGTATTCATT JGCCGTGGCAGCAGTGCACAGTCACCTGTGGGGGAGGGGTCCAGACCCGGTCAGTCCACTGTGTTCAGC IAAGGCCGGCCTTCCTCAAGTTGTCTGCTCCATCAGAAACCTCCGGTGCTACGAGCCTGTI\ATACAAAC "TTCTGTCCAGCTCCTGAAAAGAGAGATTCTGCAGGAAGCCAGTTACCATGTTGTGATGGTCCTCAAGC AGTCCATGAAGAAGGACTGAGGTTTCCTGACAACCACTGGGCTATGTGAGAAAGCCTTCCAGAAGCAG
IATTCCCCAGCCCAGCCGAGTATTCAGATAACTACAGCCCCACCTCGAATCTTTCTTGCAGCCTGATGA
GACCTCAAGCCAGAACCACACAGCTAACCATTTCTGAATTTCCATGCAATAAATGCATAATGTTATAA
GCCAAA
NOV1a, CG110205-03 SEQ ID NO: 2 1212 aa MW at 133862.0kD
Protein Sequence
! ECAL--, ACAFPAAGSGPPRGLAGI-GRVA ALQLCCLCCASVAAAr-ASDSSSGASGLNDDYVFVTPVE
IVDSAGSYISHDI HNGRKKRSAQNARSS HYRFSAFGQELHLE KPSAir-SSHFIVQVLG DGASETQ KPEVQQCFYQGFIRNDSSSSVAVSTCAG SG IRTRKNEFLISPLPQLLAQEHNYSSPAGHHPHVLY
'jRTAEEKIQRYRGYPGSGRNYPGYSPSHIPHASQSRETEYHHRRLQKQHFCGRRKKYAPKPPTEDTYliR FDEYGSSGRPRRSAGKSQKGLNVET VVADKKMVΕKHGKG1WTTYILTVM VSG FKDGTIGSDINVV VVSLI EQEPGG IlffiHADQSLNSFCQWQSALIGKNGKRHDHAILLTGFDICSWKNEPCDTLGFAP SGMCS YRSCTIKTEDTGLGLAFTIAHESGHNFGMIHDGEGNPCRKAEGNIMSPTLTGNNGVFS SSC SRQY KKFLSTPQAGCLVDEPKQAGQYKYPDKLPGQIYDADTQCK QFGA--^K CSLGFVKDIC S W CHRVGHRCETKFMPAAEGTVCG S WCRQGQCVKFGELGPRPIHGQ SA SKWSECSRTCGGGV FQE RHCIsmPKPQYGGIFCPGSSRIYQ CNINPCN---NSLDFRAQQCAEYMSKPFRGWFYQ KPYTKVEEEDR C-^YCKAEaSTFΞFFFA SGKVKDGTPCSPNK DVCIDGVCΞ VGCDHELGSKAVSDACGVCKGDNSTCK iFYKGLY NOHKANEYYPWLIPAGARSIEIOELOVSSSY AVRSLSOKYY TGGWSID PGEFPFAGT §TFEYQRSF]SMPERLYAPGPΑ?NETLVFEILMQGK PGIAW YALPKVMISIGTPPATKRPAY--WSIVQSEC
JSVSCGGGYIERVICAICLRDQNTQWSSFCSAKTKPV EPKICK^^
IRKIQCVQK-KPFQKEΞAVLHS CPVSTPTQVOACNSHACPPQWS GPWSQCSKTCGRGΛ KRE CKGS
IAAETLPESQCTS PRPE QEGCVLGRCP NSRLQWVASSWSΞCSATCGLGVRKREMKCSEKGFQGKLI
ITFPE-^CRNIKKPL-ILDLEETC RRACPAHPVYLSRMVAGWYS PWQQCTVTCGGGVQTRSVHCVQQGRPS
(SSC HQKPPV RACNTNFCPAPEKRDSAGSQLPCCDGPQAVHEEG RFPDNHWAM
NOV1b, CG110205-07 SEQ ID NO: 3 1611 bp
DNA Sequence ORF Start: at 1 ORF Stop: end of sequence
IAAGCTTGAACTTAAGCCCTCGGCGATTTTGAGCAGTCACTTTATTGTCCAGGTACTTGGAAAAGATGG :TGCTTCAGAGACTCAGAAACCCGAGGTGCAGCAATGCTTCTATCAGGGATTTATCAGAAATGACAGCT ICCTCCTCTGTCGCTGTGTCTACGTGTGCTGGCTTGTCAGGTTTAATAAGGACACGAAAAAATGAATTC ICTCATCTCGCCATTACCTCAGCTTCTGGCCCAGGAACACAACTACAGCTCCCCTGCGGGTCACCATCC JTCACGTACTGTACAAAAGGACAGCAGAGGAGAAGATCCAGCGGTACCGTGGCTACCCCGGCTCTGGCC JGGAATTATCCTGGTTACTCCCCAAGTCACATTCCCCATGCATCTCAGAGTCGAGAGACAGAGO.ATCAC ICATCGAAGGTTGCAAAAGCAGCATTTTTGTGGACGACGCAAGAAATATGCTCCCAAGCCTCCCACAGA !GGACACCTATCTAAGGTTTGATGAATATGGGAGCTCTGGGCGACCCAGAAGATCAGCTGGAAAATCAC IAAAAGGGCCTCAATGTGGAAACCCTCGTGGTGGCAGACAAGAAAATGGTGGAAAAGCATGGCAAGGGA IAATGTCACCACATACATTCTCACAGTAATGAACATGGTTTCTGGCCTATTTAAAGATGGGACTATTGG IAAGTGACATAAACGTGGTTGTGGTGAGCCTAATTCTTCTGGAACAAGAACCTGGAGGATTATTGATCA ACCATCATGCAGACCAGTCTCTGAATAGTTTTTGTCAATGGCAGTCTGCCCTCATTGGAAAGAATGGC IAAGAGACATGATCATGCCATCTTACTAACAGGATTTGATATTTGTTC-.TGGAAGAATGAACCATGTGA ICACTCTAGGGTTTGCCCCCATCAGTGGAATGTGCTCTAAGTACCGAAGTTGTACCATCAATGAGGACA [CAGGACTTGGCCTTGCCTTCACCATCGCTCATGAGTCAGGGCACAACTTTGGTATGATTCACGACGGA AAGGGAATCCCTGCAGAAAGGCTGAAGGCAATATCATGTCTCCCACACTGACCGGAAACAATGGAGT JGTTTTCATGGTCTTCCTGCAGCCGCCAGTATCTCAAGAAATTCCTCAGCACACCTCAGGCGGGGTGTC TAGTGGATGAGCCCAAGCAAGCAGGACAGTATAAATATCCGGACAAACTACCAGGACAGATTTATGAT IGCTGACACACAGTGTAAATGGCAATTTGGAGCAAAAGCCAAGTTATGCAGCCTTGGTTTTGTGAAGGA TATTTGCAAATCACTTTGGTGCCACCGAGTAGGCCACAGGTGTGAGACCAAGTTTATGCCCGCAGCAG JAAGGGACCGTTTGTGGCTTGAGTATGTGGTGTCGGCAAGGCCAGTGCGTAAAGTTTGGGGAGCTCGGG JCCCCGGCCCATCCACGGCCAGTGGTCCGCCTGGTCGAAGTGGTCAGAATGTTCCCGGACATGTGGTGG ^AGGAGTCAAGTTCCAGGAGAGACACTGCAATAACCCCAAGCCTCAGTATGGTGGCTTATTCTGTCCAG JGTTCTAGCCGTATTTATCAGCTGTGCAATATTAACCCTTGCCTCGAG
NOV1b, CG110205-07 SEQ ID NO: 4 537 aa MW at 59663.4kD
Protein Sequence
SKLELKPSAILSSHFIVQVLG DGASETQKPEVQQCFYQGFIRNDSSSSVAVSTCAGLSG IRTRKWEF ILISPLPQLLAQEHNYSSPAGHHPHVLYKRTAEEKIQRYRGYPGSGRNYPGYSPSHIPHASQSRETEYH
IHRR QKQHFCGRRKKYAP PPTEDTY RFDEYGSSGRPRRSAGKSQKGLNVETLVVADKKMVEKHGKG
INVTTYILTVMN VSG FKDGTIGSDIΪ^^
IKRHDHAILLTGFDICS K EPCDTLGFAPISGMCSKYRSCTIWEDTG G AFTIAHESGH FG IHDG EGNPCRKAEGNI SPTLTGN GVFS SSCSRQY KKFLSTPQAGCLVDEPKQAGQYKYPDK PGQIYD
'ADTQCK QFGAKAKLCSLGFV DICKSLWCHRVGHRCΞTK-MPAAEGTVCG SMWCRQGQCV FGΞLG
IPRPIHGQ SA S WSECSRTCGGGVKFQERHCKT P PQYGG FCPGSSRIYQLCNINPCLE
NOV1c, 318171228 SEQ ID NO: 5 624 bp
DNA Sequence ORF Start: at 1 ORF Stop: end of sequence
^GCTTGTGGAAACCCTCGTGGTGGCAGACAAGAAAATGGTGGAAAAGCATGGCAAGGGAAATGTCAC ICACATACATTCTCACAGTAATGAACATGGTTTCTGGCCTATTTAAAGATGGGACTATTGGAAGTGACA JTAAACGTGGTTGTGGTGAGCCTAATTCTTCTGGAACAAGAACCTGGAGGATTATTGATCAACCATCAT IGCAGACCAGTCTCTGAATAGTTTTTGTCAATGGCAGTCTGCCCTCATTGGAAAGAATGGCAAGAGACA JTGATCATGCCATCTTACTAACAGGATTTGATATTTGTTCTTGGAAGAATGAACCATGTGACACTCTAG IGGTTTGCCCCCTVTCAGTGGAATGTGCTCTAAGTACCGAAGTTGTACCATCAATGAGGACACAGGACTT GGCCTTGCCTTCACCATCGCTCATGAGTCAGGGCACAACTTTGGTATGATTCACGATGGAGAAGGGAA TCCCTGCAGAAAGGCTGAAGGCAATATCATGTCTCCCACACTGACCGGAAACAATGGAGTGTTTTCAT GGTCTTCCTGCAGCCGCCAGTATCTCAAGAAATTCCTCAGCACACCTCAGGCGGGGTGTCTAGTGGAT JGAGCCCCTCGAG
NOV1c, 318171228 SEQ ID NO: 6 208 aa MW at 22572.6kD
Protein Sequence
KLVETLWADK-^-WEKHG GIWTTYI T
ADQSI-WSFCQWQSALIGK G-^HDHAIL TGFDICSWKNEPCDT GFAPISG CSKYRSCTIISIEDTG G--.AFTIAHESGHNFGMIHDGEGNPCRKAEGNIMSPTLTGNNGλFS SSCSRQY]-ιKKFLSTPQAGC VD EPLE
NOV1d, 318171484 SEQ ID NO: 7 1066 bp
DNA Sequence ORF Start: at 2 ORF Stop: end of sequence
Figure imgf000044_0001
;CTCATCTCGCCATTACCTCAGCTTCTGGCCCAGGAACACAACCACAGCTCCCCTGCGGGTCACCATCC JTCACGTACTGTACAAAAGGACAGCAGAGGAGAAGATCCAGCGGTACCGTGGCTACCCCGGCTCTGGCC IGGAATTATCCTGGTTACTCCCCAAGTCACATTCCCCATGCATCTCAGAGTCGAGAGACAGAGTATCAC (CATCGAAGGTTGCAAAAGCAGCATTTTTGTGGACGACGCAAGAAATATGCTCCCAAGCCTCCCACAGA JGGACACCTATCTAAGGTTTGATGAATATGGGAGCTCTGGGCGACCCAGAAGATCAGCTGGAAAATCAC ΪAAAAGGGCCTCAATGTGGAAACCCTCGTGGTGGCAGACAAGAAAATGGTGGAAAAGCATGGCAAGGGA
ΪAATGTCACCACATACATTCTCACAGTAATGAACATGGTTTCTGGCCTATTTAAAGATGGGACTATTGG
IAAGTGACATAAACGTGGTTGTGGTGAGCCTAATTCTTCTGGAACAAGAACCTGGAGGATTATTGATCA ACCATCATGCAGACCAGTCTCTGAATAGTTTTTGTCAATGGCAGTCTGCCCTCATTGGAAAGAATGGC FAAGAGACATGATCATGCCATCTTACTAACAGGATTTGATATTTGTTCTTGGAAGAATGAACCATGTGA ICACTCTAGGGTTTGCCCCCATCAGTGGAATGTGCTCTAAGTACCGAAGTTGTACCATCAATGAGGACA JCAGGACTTGGCCTTGCCTTCACCATCGCTCATGAGTCAGGGCACAACTTTGGTATGATTCACGATGGA IGAAGGGAATCCCTGCAGAAAGGCTGAAGGCAATATCATGTCTCCCACACTGACCGGAAACAATGGAGT GTTTTCATGGTCTTCCTGCAGCCGCCAGTATCTCAAGAAATTCCTCAGCACACCTCAGGCGGGGTGTC TAGTGGA GAGCCCAAGCAAGCAGGACAGTATAAATATCCGGACAAACTACCAGGACAGATTTATGAT JGCTGACACACAGTGTAAATGGCAATTTGGAGCAAAAGCCAAGTTATGCAGCCTTGGTTTTGTGAAGGA JTATTTGCAAATCACTTTGGTGCCACCGAGTAGGCCACAGGTGTGAGACCAAGTTTATGCCCGCAGCAG SAAGGGACCGTTTGTGGCTTGAGTATGTGGTGTCGGCAAGGCCAGTGCGTAAAGTTTGGGGAGCTCGGG ICCCCGGCCCATCCACGGCCAGTGGTCCGCCTGGTCGAAGTGGTCAGAATGTTCCCGGACATGTGGTGG AGGAGTCAAGTTCCAGGAGAGACACTGCAATAACCCCAAGCCTCAGTATGGTGGCTTATTCTGTCCAG GTTCTAGCCGTATTTATCAGCTGTGCAATATTAACCCTTGCAATGAAAATAGCTTGGATTTTCGGGCT CAACAGTGTGCAGAATATAACAGCAAACCTTTCCGTGGATGGTTCTACCAGTGGAAACCCTATACAAA AGTGGAAGAGGAAGATCGATGCAAACTGTACTGCAAGGCTGAGAACTTTGAATTTTTTTTTGCAATGT CCGGCAAAGTGAAAGATGGAACTCCCTGCTCCCCAAACAAAAATGATGTTTGTATTGACGGGGTTTGT IGAACTAGTGGGATGTGATCATGAACTAGGCTCTAAAGCAGTTTCAGATGCTTGTGGCGTTTGCAAAGG ITGATAATTCAACTTGCAAGTTTTATAAAGGCCTGTACCTCAACCAGCATAAAGCAAATGAATATTATC ICGGTGGTCCTCATTCCAGCTGGCGCCCGAAGCATCGAAATCCAGGAGCTGCAGGTTTCCTCCAGTTAC ICTCGCAGTTCGAAGCCTCAGTCAAAAGTATTACCTCACCGGGGGCTGGAGCATCGACTGGCCTGGGGA (GTTCCCCTTCGCTGGGACCACGTTTGAATACCAGCGCTCTTTCAACCGCCCGGAACGTCTGTACGCGC .CAGGGCCCACAAATGAGACGCTGATTCTGATGCAAGGCAAAAATCCAGGGATAGCTTGGAAGTATGCA SCTTCCCAAGGTCATGAATGGAACTCCACCAGCCACAAAAAGACCTGCCTATACCTGCTGGATGCCAGG JTGAATGGAGTACATGCAGCAAGGCCTGTGCTGGAGGCCAGCAGAGCCGAAAGATCCAGTGTGTGCAAA
IAGAAGCCCTTCCAAAAGGAGGAAGCAGTGTTGCATTCTCTCTGTCCAGTGAGCACACCCACTCAGGTC
ICAAGCCTGCAACAGCCATGCCTGCCCTCCACAATGGAGCCTTGGACCCTGGTCTCAGTGTTCCAAGAC CTGTGGACGAGGGGTGAGGAAGCGTGAACTCCTCTGCAAGGGCTCTGCCGCAGAAACCCTCCCCGAGA ;GCCAGTGTACCAGTCTCCCCAGACCTGAGCTGCAGGAGGGCTGTGTGCTTGGACGATGCCCCAAGAAC JAGCCGGCTACAGTGGGTCGCTTCTTCGTGGAGCGAGTGTTCTGCAACCTGTGGTTTGGGTGTGAGGAA IGAGGGAGATGAAGTGCAGCGAGAAGGGCTTCCAGGGAAAGCTGATAACTTTCCCAGAGCGAAGATGCC JGTAATATTAAGAAACCAAATCTGGACTTGGAAGAGACCTGCAACCGACGGGCTTGCCCAGCCCATCCA JGTGTACAACATGGTAGCTGGATGGTATTCATTGCCGTGGCAGCAGTGCACAGTCACCTGTGGGGGAGG
GGTCCAGACCCGGTCAGTCCACTGTGTTCAGCAAGGCCGGCCTTCCTCAAGTTGTCTGCTCCATCAGA
AACCTCCGGTGCTACGAGCCTGTAATACAAACTTCTGTCCAGCTCCTGAAAAGAGAGAGGATCCATCC
TGCGTAGATTTCTTCAACTGGTGTCACCTAGTTCCTCAGCATGGTGTCTGCAACCACAAGTTTTACGG
AAAACAATGCTGCAAGTCATGCACAAGGAAGATCTGATCTTGGTGTCCTCCCCAGCACCTTATGGCCA
GGGGCTTACCTTTCAACCTCTAGAGA
NOV1f, 13379193 SEQ ID NO: 12 1162 aa MW at 128750.6kD
Protein Sequence
MECA L ACAFPAAGSGPPRGLAGLGRVAKΆ Q CCLCCASVAAALASDSSSGASGLNDDYVFVTPVΈ
VDSAGSY1SHDILH GRKKRSAQNARSSLHYRFSAFGQELHLELKPSAI---SSHFIVQV GKDGASETQ
KPEVQQCFYQGFIRWDSSSSVAVSTCAG SGLIRTRKNEFLISP PQ AQEHNHSSPAGHHPHVI-YK
IRTAEEKIQRYRGYPGSGRNYPGYSPSHIPHASQSRETEYHHRRLQKQHFCGRRKKYAP PPTEDTYLR
JFDEYGSSGRPKRSAGKSQKGL VΕT WADKKMV---KHGKGNVOT
JVVVSLILLEQEPGG IKΓHHADQS NSFCQWQSALIGKWG RHDHAI TGFDICSWKISIEPCDTLGFA
IPISG CSKYRSCTI EDTG G AFTIAHESGH FG IHDGEGNPCRKAEGNIMSPT TG NGVFS SΞ
JCSRQY KKF STPQAGC VDEPKQAGQYKYPDK PGQIYDADTQC QFGAKAK CSLGFV DICKSL
J CHRVGHRCETKFMPAAEGTVCGLSMWCRQGQCVKFGE GPRPIHGQ SA S WSECSRTCGGGVKFQ
ERHCNWPKPQYGGLFCPGSSRIYQ CNINPCNENSLDFRAQQCAEYNSKPFRG FYQ KPYTKVEEED CK YCKAENFEFFFAMSG VKDGTPCSPNKNDVCIDGVCELVGCDHELGSKAVSDACGVCKGDNSTC
KFYKGLY---WQHKANEYYPVVLIPAGARSIEIQE QVSSSYI-AVRS SQ YY TGGWSID PGEFPFAG
TTFΞYQRSF RPΞRLYAPGPTNΞTLILMQG NPGIAWKYALPKVMNGTPPATKRPAYTCW-MPGEWSTC
SKACAGGQQSRKIQCVQK PFQKEEAVLHSLCPVSTPTQVQACNSHACPPQ SLGPWSQCSKTCGRGV
RKREL C GSAAET PESQCTSLPRPE QEGCV--.GRCPI^SR QWVASSWSECSATCGLGΛ R---MKC
SEKGFQGKLITFPERRC-^IK PNLDLEETC RRACPAHPVYMWAG YSLPWQQCTVTCGGGVQTRS
VHCVQQGRPSSSCLLHQKPPVLRACNTNFCPAPEKREDPSCVDFF WCHLVPQHGVCNHKFYGKQCCK
SCTRKI
NOV1g, 13379194 SEQ ID NO: 13 3630 bp
DNA Sequence ORF Start: ATG at 85 ORF Stop: TGA at 3571
TGCGGCCGCGGAAAGAATGCGCGCCGCCCGTGCGCTCCGCCTGCCGCGTCTGGCCACCCGCAGCCGCC
IGCGTCCGCACCTGACCATGGAGTGCGCCCTCOTGCTCGCGTGTGCCTTCCCGGCTGCGGGTTCGGOR-N ICGCCGAGGGGCCTGGCGGGACTGGGGCGCGTGGCCAAGGCGCTCCAGCTGTGCTGCCTCTGCTGTGCG I CGGTCGCCGCGGCCTTAGCCAGTGACAGCAGCAGCGGCGCCAGCGGATTAAATGATGATTACGTCTT JTGTCACGCCAGTAGAAGTAGACTCAGCCGGGTCATATATTTCACACGACATTTTGCACAACGGCAGGA JAAAAGCGATCGGCGCAGAATGCCAGAAGCTCCCTGCACTACCGATTTTCAGCATTTGGACAGGAACTG JCACTTAGAACTTAAGCCCTCGGCGATTTTGAGCAGTCACTTTATTGTCCAGGTACTTGGAAAAGATGG .TGCTTCAGAGACTCAGAAACCCGAGGTGCAGCAATGCTTCTATCAGGGATTTATCAGAAATGACAGCT JCCTCCTCTGTCGCTGTGTCTACGTGTGCTGGCTTGTCAGGTTTAATAAGGACACGAAAAAATGAATTC JCTCATCTCGCCATTACCTCAGCTTCTGGCCCAGGAACACAACTACAGCTCCCCTGCGGGTCACCA-.EE JTCACGTACTGTACAAAAGGACAGCAGAGGAGAAGATCCAGCGGTACCGTGGCTACCCCGGCTCTGGCC JGGAATTATCCTGGTTACTCCCCAAGTCACATTCCCCATGCATCTCAGAGTCGAGAGACAGAGTATCAC ΪCATCGAAGGTTGCAAAAGCAGCATTTTTGTGGACGACGCAAGAAATATGCTCCCAAGCCTCCCACAGA SGGACACCTATCTAAGGTTTGATGAATATGGGAGCTCTGGGCGACCCAGAAGATCAGCTGGAAAATCAC -AAAAGGGCCTCAATGTGGAAACCCTCGTGGTGGCAGACAAGAAAATGGTGGAAAAGCATGGCAAGGGA JAATGTCACCACATACATTCTCACAGTAATGAACATGGTTTCTGGCCTATTTAAAGATGGGACTATTGG LAAGTGACATAAACGTGGTTGTGGTGAGCCTAATTCTTCTGGAACAAGAACCTGGAGGATTATTGATCA JACCATCATGCAGACCAGTCTCTGAATAGTTTTTGTCAATGGCAGTCTGCCCTCATTGGAAAGAATGGC JAAGAGACATGATCATGCCATCTTACTAACAGGATTTGATATTTGTTCTTGGAAGAATGAACCATGTGA CACTCTAGGGTTTGCCCCCATCAGTGGAATGTGCTCTAAGTACCGAAGTTGTACCATCAATGAGGACA CAGGACTTGGCCTTGCCTTCACCATCGCTCATGAGTCAGGGCACAACTTTGGTATGATTCACGATGGA GAAGGGAATCCCTGCAGAAAGGCTGAAGGCAATATCATGTCTCCCACACTGACCGGAAACAATGGAGT GTTTTCATGGTCTTCCTGCAGCCGCCAGTATCTCAAGAAATTCCTCAGCACACCTCAGGCGGGGTGTC ΪTAGTGGATGAGCCCAAGCAAGCAGGACAGTATAAATATCCGGACAAACTACCAGGACAGATTTATGAT (GCTGACACACAGTGTAAATGGCAATTTGGAGCAAAAGCCAAGTTATGCAGCCTTGGTTTTGTGAAGGA JTATTTGCAAATCACTTTGGTGCCACCGAGTAGGCCACAGGTGTGAGACCAAGTTTATGCCCGCAGCAG JAAGGGACCGTTTGTGGCTTGAGTATGTGGTGTCGGCAAGGCCAGTGCGTAAAGTTTGGGGAGCTCGGG CCCCGGCCCATCCACGGCCAGTGGTCCGCCTGGTCGAAGTGGTCAGAATGTTCCCGGACATGTGGTGG IAGGAGTCAAGTTCCAGGAGAGACACTGCAATAACCCCAAGCCTCAGTATGGTGGCATATTCTGTCCAG 
IGTTCTAGCCGTATTTATCAGCTGTGCAATATTAACCCTTGCAATGAAAATAGCTTGGATTTTCGGGCT ICAACAGTGTGCAGAATATAACAGCAAACCTTTCCGTGGATGGTTCTACCAGTGGAAACCCTATACAAA JAGTGGAAGAGGAAGATCGATGCAAACTGTACTGCAAGGCTGAGAACTTTGAATTTTTTTTTGCAATGT ICCGGCAAAGTGAAAGATGGAACTCCCTGCTCCCCAAACAAAAATGATGTTTGTATTGACGGGGTTTGT! IGAACTAGTGGGATGTGATCATGAACTAGGCTCTAAAGCAGTTTCAGATGCTTGTGGCGTTTGCAAAGG! !TGATAATTCAACTTGCAAGTTTTATAAAGGCCTGTACCTCAACCAGCATAAAGCAAATGAATATTATC! JCGGTGGTCCTCATTCCAGCTGGCGCCCGAAGCATCGAAATCCAGGAGCTGCAGGTTTCCTCCAGTTAC CTCGCAGTTCGAAGCCTCAGTCAAAAGTATTACCTCACCGGGGGCTGGAGCATCGACTGGCCTGGGGA JGTTCCCCTTCGCTGGGACCACGTTTGAATACCAGCGCTCTTTCAACCGCCCGGAACGTCTGTACGCGC JCAGGGCCCACAAATGAGACGCTGATTCTGATGCAAGGCAAAAATCCAGGGATAGCTTGGAAGTATGCA ΪCTTCCCAAGGTCATGAATGGAACTCCACCAGCCACAAAAAGACCTGCCTATACCTGCTGGATGCCAGG JTGAATGGAGTACATGCAGCAAGGCCTGTGCTGGAGGCCAGCAGAGCCGAAAGATCCAGTGTGTGCAAA JAGAAGCCCTTCCAAAAGGAGGAAGCAGTGTTGCATTCTCTCTGTCCAGTGAGCACACCCACTCAGGTC (CAAGCCTGCAACAGCCATGCCTGCCCTCCACAATGGAGCCTTGGACCCTGGTCTCAGTGTTCCAAGAC JCTGTGGACGAGGGGTGAGGAAGCGTGAACTCCTCTGCAAGGGCTCTGCCGCAGAAACCCTCCCCGAGA IGCCAGTGTACCAGTCTCCCCAGACCTGAGCTGCAGGAGGGCTGTGTGCTTGGACGATGCCCCAAGAAC IAGCCGGCTACAGTGGGTCGCTTCTTCGTGGAGCGAGTGTTCTGCAACCTGTGGTTTGGGTGTGAGGAA JGAGGGAGATGAAGTGCAGCGAGAAGGGCTTCCAGGGAAAGCTGATAACTTTCCCAGAGCGAAGATGCC 5GTAATATTAAGAAACCAAATCTGGACTTGGAAGAGACCTGCAACCGACGGGCTTGCCCAGCCCATCCA JGTGTACAACATGGTAGCTGGATGGTATTCATTGCCGTGGCAGCAGTGCACAGTCACCTGTGGGGGAGG
JGGTCCAGACCCGGTCAGTCCACTGTGTTCAGCAAGGCCGGCCTTCCTCAAGTTGTCTGCTCCATCAGA JAACCTCCGGTGCTACGAGCCTGTAATACAAACTTCTGTCCAGCTCCTGAAAAGAGAGAGGATCCATCC TGCGTAGATTTCTTCAACTGGTGTCACCTAGTTCCTCAGCATGGTGTCTGCAACCACAAGTTTTACGG IAAAACAATGCTGCAAGTCATGCACAAGGAAGATCTGATCTTGGTGTCCTCCCCAGCACCTTATGGCCA ''GGGGCTTACCTTTCAACCTCTAGAGA
NOV1g, 13379194 SEQ ID NO: 14 1162 aa MW at 128776.6kD
Protein Sequence
MECALL ACAFPAAGSGPPRGLAGLGRVAKALQ CCI-CCASVAAALASDSSSGASGLISJDDYVFVTPVE VDSAGSYISHDI HNGRKKRSAQNARSSLHYRFSAFGQELH E KPSAILSSHFIVQVLGKDGASETQ KPEVQQCFYQGFIR DSSSSVAVSTCAGLSGLIRTRKNΞF ISP PQ AQEHNYSSPAGHHPHVLYK RTAEEKIQRYRGYPGSGRNYPGYSPSHIPHASQSRETEYHHRRLQKQHFCGRRKKYAP PPTΞDTYLR FDΞYGSSGRPRRSAGKSQ GLNVΕTLλA/ADKKlWEKHGKGl TTYILTVK-NMVSGLFKDGTIGSDI V VVVS I LEQEPGGLLI HHADQS NSFCQ QSALIGKNGKRHDHAIL TGFDICS KNEPCDTLGFA PISGMCSKYRSCTINEDTGLGLAFTIAHESGHNFGMIHDGEGNPCR AEGNIMSPT TGKT GVFSWSS JCSRQY KF STPQAGC VDEPKQAGQYKYPD LPGQIYDADTQCKWQFGAKA CS GFVKDICKS jWCHRVGHRCETKFMPAAEGTVCG SMWCROGOCVKFGELGPRPIHGO SA SKWSECSRTCGGGVKFO 'ERHCNNPKPQYGGIFCPGSSRIYQLCNINPCK-ΞNSLDFRAQQCAEYNSKPFRGWFYQWKPYTKVEEED RC YC AENFEFFFAMSGKV DGTPCSP KNDVCIDGVCELVGCDHELGSKAVSDACGVCKGDNSTC KFYKGLYL-NQH ANEYYPVVLIPAGARSIEIQE QVSSSYLAVRSLSQKYYLTGG SIDWPGEFPFAG TTFEYQRSFl^PER YAPGPTNETLI QGK PGIAWKYALP GTPPATKRPAYTC PGΞWSTC SKACAGGQQSKKIQCVQKKPFQKEEAV HSLCPVSTPTQVQACNSHACPPQWSLGP SQCS TCGRGV RKREL CKGSAAET PΞSQCTSLPRPE QEGCVIJGRCPKNSR Q VASS SECSATCGLGVRKRE KC SEKGFQG-O^ITFPERRCRNIK-KPl^D EΞTCNRRACPAHPVYKrK-VAGWΪSLPW
^ HCVQQGRPSSSCL HQKPPV RACNTNFCPAPEKREDPSCVDFFNWCH VPQHGVCNHKFYGKQCCK JSCTRKI
NOV1h, 13379195 SEQ ID NO: 15 3630 bp
DNA Sequence ORF Start: ATG at 85 ORF Stop: TGA at 3571
TGCGGCCGCGGAAAGAATGCGCGCCGCCCGTGCGCTCCGCCTGCCGCGTCTGGCCACCCGCAGCCGCC
FGCGTCCGCACCTGACCATGGAGTGCGCCCTCCTGCTCGCGTGTGCCTTCCCGGCTGCGGGTTCGGGCC ΪCGCCGAGGGGCCTGGCGGGACTGGGGCGCGTGGCCAAGGCGCTCCAGCTGTGCTGCCTCTGCTGTGCG ITCGGTCGCCGCGGCCTTAGCCAGTGACAGCAGCAGCGGCGCCAGCGGATTAAATGATGATTACGTCTT JTGTCACGCCAGTAGAAGTAGACTCAGCCGGGTCATATATTTCACACGACATTTTGCACAACGGCAGGA JAAAAGCGATCGGCGCAGAATGCCAGAAGCTCCCTGCACTACCGATTTTCAGCATTTGGACAGGAACTG JCACTTAGAACTTAAGCCCTCGGCGATTTTGAGCAGTCACTTTATTGTCCAGGTACTTGGAAAAGATGG .TGCTTCAGAGACTCAGAAACCCGAGGTGCAGCAATGCTTCTATCAGGGATTTATCAGAAATGACAGCT ]CCTCCTCTGTCGCTGTGTCTACGTGTGCTGGCTTGTCAGGTTTAATAAGGACACGAAAAAATGAATTC ICTCATCTCGCCATTACCTCAGCTTCTGGCCCAGGAACACAACTACAGCTCCCCTGCGGGTCACCATCC JTCACGTACTGTACAAAAGGACAGCAGAGGAGAAGATCCAGCGGTACCGTGGCTACCCCGGCTCTGGCC JGGAATTATCCTGGTTACTCCCCAAGTCACATTCCCCATGCATCTCAGAGTCGAGAGACAGAGTATCAC LCATCGAAGGTTGCAAAAGCAGCATTTTTGTGGACGACGCAAGAAATATGCTCCCGAGCCTCCCACAGA .GGACACCTATCTAAGGTTTGATGAATATGGGAGCTCTGGGCGACCCAGAAGATCAGCTGGAAAATCAC 'AAAAGGGCCTCAATGTGGAAACCCTCGTGGTGGCAGACAAGAAAATGGTGGAAAAGCATGGCAAGGGA IAATGTCACCACATACATTCTCACAGTAATGAACATGGTTTCTGGCCTATTTAAAGATGGGACTATTGG IAAGTGACATAAACGTGGTTGTGGTGAGCCTAATTCTTCTGGAACAAGAACCTGGAGGATTATTGATCA JACCATCATGCAGACCAGTCTCTGAATAGTTTTTGTCAATGGCAGTCTGCCCTCATTGGAAAGAATGGC JAAGAGACATGATCATGCCATCTTACTAACAGGATTTGATATTTGTTCTTGGAAGAATGAACCATGTGA ]CACTCTAGGGTTTGCCCCCATCAGTGGAATGTGCTCTAAGTACCGAAGTTGTACCATCAATGAGGACA JCAGGACTTGGCCTTGCCTTCACCATCGCTCATGAGTCAGGGCACAACTTTGGTATGATTCACGATGGA JGAAGGGAATCCCTGCAGAAAGGCTGAAGGCAATATCATGTCTCCCACACTGACCGGAAACAATGGAGT ΪGTTTTCATGGTCTTCCTGCAGCCGCCAGTATCTCAAGAAATTCCTCAGCACACCTCAGGCGGGGTGTC ^TAGTGGATGAGCCCAAGCAAGCAGGACAGTATAAATATCCGGACAAACTACCAGGACAGATTTATGAT !GCTGACACACAGTGTAAATGGCAATTTGGAGCAAAAGCCAAGTTATGCAGCCTTGGTTTTGTGAAGGA JTATTTGCAAATCACTTTGGTGCCACCGAGTAGGCCACAGGTGTGAGACCAAGTTTATGCCCGCAGCAG JAAGGGACCGTTTGTGGCTTGAGTATGTGGTGTCGGCAAGGCCAGTGCGTAAAGTTTGGGGAGCTCGGG CCCGGCCCATCCACGGCCAGTGGTCCGCCTGGTCGAAGTGGTCAGAATGTTCCCGGACATGTGGTGG .AGGAGTCAAGTTCCAGGAGAGACACTGCAATAACCCCAAGCCTCAGTATGGTGGCTTATTCTGTCCAG 
IGTTCTAGCCGTATTTATCAGCTGTGCAATATTAACCCTTGCAATGAAAATAGCTTGGATTTTCGGGCT JCAACAGTGTGCAGAATATAACAGCAAACCTTTCCGTGGATGGTTCTACCAGTGGAAACCCTATACAAA JAGTGGAAGAGGAAGATCGATGCAAACTGTACTGCAAGGCTGAGAACTTTGAATTTTTTTTTGCAATGT -CCGGCAAAGTGAAAGATGGAACTCCCTGCTCCCCAAACAAAAATGATGTTTGTATTGACGGGGTTTGT GAACTAGTGGGATGTGATCATGAACTAGGCTCTAAAGCAGTTTCAGATGCTTGTGGCGTTTGCAAAGG TGATAATTCAACTTGCAAGTTTTATAAAGGCCTGTACCTCAACCAGCATAAAGCAAATGAATATTATC CGGTGGTCCTCATTCCAGCTGGCGCCCGAAGCATCGAAATCCAGGAGCTGCAGGTTTCCTCCAGTTAC CTCGCAGTTCGAAGCCTCAGTCAAAAGTATTACCTCACCGGGGGCTGGAGCATCGACTGGCCTGGGGA GTTCCCCTTCGCTGGGACCACGTTTGAATACCAGCGCTCTTTCAACCGCCCGGAACGTCTGTACGCGC CAGGGCCCACAAATGAGACGCTGATTCTGATGCAAGGCAAAAATCCAGGGATAGCTTGGAAGTATGCA CTTCCCAAGGTCATGAATGGAACTCCACCAGCCACAAAAAGACCTGCCTATACCTGCTGGATGCCAGG TGAATGGAGTACATGCAGCAAGGCCTGTGCTGGAGGCCAGCAGAGCCGAAAGATCCAGTGTGTGCAAA AGAAGCCCTTCCAAAAGGAGGAAGCAGTGTTGCATTCTCTCTGTCCAGTGAGCACACCCACTCAGGTC CAAGCCTGCAACAGCCATGCCTGCCCTCCACAATGGAGCCTTGGACCCTGGTCTCAGTGTTCCAAGAC CTGTGGACGAGGGGTGAGGAAGCGTGAACTCCTCTGCAAGGGCTCTGCCGCAGAAACCCTCCCCGAGA GCCAGTGTACCAGTCTCCCCAGACCTGAGCTGCAGGAGGGCTGTGTGCTTGGACGATGCCCCAAGAAC AGCCGGCTACAGTGGGTCGCTTCTTCGTGGAGCGAGTGTTCTGCAACCTGTGGTTTGGGTGTGAGGAA GAGGGAGATGAAGTGCAGCGAGAAGGGCTTCCAGGGAAAGCTGATAACTTTCCCAGAGCGAAGATGCC GTAATATTAAGAAACCAAATCTGGACTTGGAAGAGACCTGCAACCGACGGGCTTGCCCAGCCCATCCA GTGTACAACATGGTAGCTGGATGGTATTCATTGCCGTGGCAGCAGTGCACAGTCACCTGTGGGGGAGG
GGTCCAGACCCGGTCAGTCCACTGTGTTCAGCAAGGCCGGCCTTCCTCAAGTTGTCTGCTCCATCAGA AACCTCCGGTGCTACGAGCCTGTAATACAAACTTCTGTCCAGCTCCTGAAAAGAGAGAGGATCCATCC TGCGTAGATTTCTTCAACTGGTGTCACCTAGTTCCTCAGCATGGTGTCTGCAACCACAAGTTTTACGG AAAACAATGCTGCAAGTCATGCACAAGGAAGATCTGATCTTGGTGTCCTCCCCAGCACCTTATGGCCA GGGGCTTACCTTTCAACCTCTAGAGA
NOV1h, 13379195 SEQ ID NO: 16 1162 aa MW at 128777.6kD
Protein Sequence
MECAI-L ACAFPAAGSGPPRG AG GRVAKALQ--.CC CCASVAAALASDSSSGASGLNDDYVFVTPVE
VDSAGSYISHDILH GRKKRSAQNARSS--HYRFSAFGQΞLHLΞLKPSAI SSHFIVQVLGKDGASΞTQ
!ΚPEVQQCFYQGFIRLTOSSSSVAVSTCAG SGLIRTRKKREF--.ISP PQLLAQE-- YSSPAGHHPHVLYK
IRTAEEKIQRYRGYPGSGRNYPGYSPSHIPHASQSRΞTEYHHRRLQKQHFCGRRKKYAPEPPTEDTYLR
JFDEYGSSGRPRRSAGKSQKGI-JSRVET WADKKMVΕK^^
JVΛA7SLI EQEPGG LINHHADQS NSFCQ QSALIG NGK- HDHAIL TGFDICSWKNEPCDTLGFA
PISGMCSKYRSCTIISJΞDTGLG AFTIAHESGHNFG IHDGEGNPCRKAEGNIMSPTLTG NGVFS SS
CSRQYLKKF STPQAGCLVDΞP QAGQYKYPDKLPGQIYDADTQCK QFGA AKLCS GFVKDICKS
( CHRVGHRCET FMPAAEGN^CGLSMWCRQGQCV FGELGPRPIHGQWSAWSKWSECSRTCGGG-VKFQ
JERHCNNP PQYGGLFC PGS SRI YQLCNINPCNENS DFRAQQCAEYNSKPFRGWFYQWKPYTKVEEED
'RCKLYCKAENFΞFFFAMSGKVKDGTPCSPNKNDVCIDGVCELVGCDHELGSKAVSDACGVCKGDNSTC
J FY GLYI- QH A-NTEYYPVVLIPAGARSIEIQΞLQVSSSYLAVRS SQKYYIJTGGWSIDWPGEFPFAG
JTTFEYQRSFNRPER YAPGPTNETLILMQGKNPGIAW YALPKVL^GTPPATKRPAYTCWIMPGE STC
SKACAGGQQSKKIQCVQKKPFQKΞEAVLHSLCPVSTPTQVQACNSHACPPQ S GP SQCSKTCGRGV
RKRE--.LCKGSAAETLPESQCTSLPRPELQEGCVLGRCPKLNRSRLQ VASSWSECSATCG GVRKRΞLV-KC
SEKGFQGKLITFPE-^CR IKKPN D EETCNR-^CPAHPΛRYRAWAG YSLPWQQCTVTCGGGVQTRS
VHCVQQGRPSSSC LHQKPPV RACNTNFCPAPEKREDPSCVDFF CH VPQHGVC H FYG QCCK
SCTRKI
NOV1i, CG110205 SEQ ID NO: 17 3630 bp
DNA Sequence ORF Start: ATG at 85 ORF Stop: TGA at 3571
TGCGGCCGCGGAAAGAATGCGCGCCGCCCGTGCGCTCCGCCTGCCGCGTCTGGCCACCCGCAGCCGCC iGCGTCCGCACCTGACCATGGAGTGCGCCCTCCTGCTCGCGTGTGCCTTCCCGGCTGCGGGTTCGGGCC jCGCCGAGGGGCCTGGCGGGACTGGGGCGCGTGGCCAAGGCGCTCCAGCTGTGCTGCCTCTGCTGTGCG JTCGGTCGCCGCGGCCTTAGCCAGTGACAGCAGCAGCGGCGCCAGCGGATTAAATGATGATTACGTCTT JTGTCACGCCAGTAGAAGTAGACTCAGCCGGGTCATATATTTCACACGACATTTTGCACAACGGCAGGA .AAAAGCGATCGGCGCAGAATGCCAGAAGCTCCCTGCACTACCGATTTTCAGCATTTGGACAGGAACTG JCACTTAGAACTTAAGCCCTCGGCGATTTTGAGCAGTCACTTTATTGTCCAGGTACTTGGAAAAGATGG ITGCTTCAGAGACTCAGAAACCCGAGGTGCAGCAATGCTTCTATCAGGGATTTATCAGAAATGACAGCT JCCTCCTCTGTCGCTGTGTCTACGTGTGCTGGCTTGTCAGGTTTAATAAGGACACGAAAAAATGAATTC
CTCATCTCGCCATTACCTCAGCTTCTGGCCCAGGAACACAACX1ACAGCTCCCCTGCGGGTCACCATCC
TCACGTACTGTACAAAAGGACAGCAGAGGAGAAGATCCAGCGGTACCGTGGCTACCCCGGCTCTGGCC
GGAATTATCCTGGTTACTCCCCAAGTCACATTCCCCATGCATCTCAGAGTCGAGAGACAGAGTATCAC
CATCGAAGGTTGCAAAAGCAGCATTTTTGTGGACGACGCAAGAAATATGCTCCCX2AGCCTCCCACAGA jGGACACCTATCTAAGGTTTGATGAATATGGGAGCTCTGGGCGACCCAGAAGATCAGCTGGAAAATCAC lAAAAGGGCCTCAATGTGGAAACCCTCGTGGTGGCAGACAAGAAAATGGTGGAAAAGCATGGCAAGGGA jAATGTCACCACATACATTCTCACAGTAATGAACATGGTTTCTGGCCTATTTAAAGATGGGACTATTGG ^AAGTGACATAAACGTGGTTGTGGTGAGCCTAATTCTTCTGGAACAAGAACCTGGAGGATTATTGATCA ACCATCATGCAGACCAGTCTCTGAATAGTTTTTGTCAATGGCAGTCTGCCCTCATTGGAAAGAATGGC AAGAGACATGATCATGCCATCTTACTAACAGGATTTGATATTTGTTCTTGGAAGAATGAACCATGTGA CACTCTAGGGTTTGCCCCCATCAGTGGAATGTGCTCTAAGTACCGAAGTTGTACCATCAATGAGGACA CAGGACTTGGCCTTGCCTTCACCATCGCTCATGAGTCAGGGCACAACTTTGGTATGATTCACGATGGA GAAGGGAATCCCTGCAGAAAGGCTGAAGGCAATATCATGTCTCCCACACTGACCGGAAACAATGGAGT GTTTTCATGGTCTTCCTGCAGCCGCCAGTATCTCAAGAAATTCCTCAGCACACCTCAGGCGGGGTGTC TAGTGGATGAGCCCAAGCAAGCAGGACAGTATAAATATCCGGACAAACTACCAGGACAGATTTATGAT GCTGACACACAGTGTAAATGGCAATTTGGAGCAAAAGCCAAGTTATGCAGCCTTGGTTTTGTGAAGGA TATTTGCAAATCACTTTGGTGCCACCGAGTAGGCCACAGGTGTGAGACCAAGTTTATGCCCGCAGCAG AAGGGACCGTTTGTGGCTTGAGTATGTGGTGTCGGCAAGGCCAGTGCGTAAAGTTTGGGGAGCTCGGG CCCCGGCCCATCCACGGCCAGTGGTCCGCCTGGTCGAAGTGGTCAGAATGTTCCCGGACATGTGGTGG
AGGAGTCAAGTTCCAGGAGAGACACTGCAATAACCCCAAGCCTCAGTATGGTGGCX3TATTCTGTCCAG
GTTCTAGCCGTATTTATCAGCTGTGCAATATTAACCCTTGCAATGAAAATAGCTTGGATTTTCGGGCT
CAACAGTGTGCAGAATATAACAGCAAACCTTTCCGTGGATGGTTCTACCAGTGGAAACCCTATACAAA
AGTGGAAGAGGAAGATCGATGCAAACTGTACTGCAAGGCTGAGAACTTTGAATTTTTTTTTGCAATGT
CCGGCAAAGTGAAAGATGGAACTCCCTGCTCCCCAAACAAAAATGATGTTTGTATTGACGGGGTTTGT
GAACTAGTGGGATGTGATCATGAACTAGGCTCTAAAGCAGTTTCAGATGCTTGTGGCGTTTGCAAAGG
TGATAATTCAACTTGCAAGTTTTATAAAGGCCTGTACCTCAACCAGCATAAAGCAAATGAATATTATC
CGGTGGTCCTCATTCCAGCTGGCGCCCGAAGCATCGAAATCCAGGAGCTGCAGGTTTCCTCCAGTTAC
CTCGCAGTTCGAAGCCTCAGTCAAAAGTATTACCTCACCGGGGGCTGGAGCATCGACTGGCCTGGGGA
GTTCCCCTTCGCTGGGACCACGTTTGAATACCAGCGCTCTTTCAACCGCCCGGAACGTCTGTACGCGC
CAGGGCCCACAAATGAGACGCTGATTCTGATGCAAGGCAAAAATCCAGGGATAGCTTGGAAGTATGCA
CTTCCCAAGGTCATGAATGGAACTCCACCAGCCACAAAAAGACCTGCCTATACCTGCTGGATGCCAGG
TGAATGGAGTACATGCAGCAAGGCCTGTGCTGGAGGCCAGCAGAGCCGAAAGATCCAGTGTGTGCAAA
AGAAGCCCTTCCAAAAGGAGGAAGCAGTGTTGCATTCTCTCTGTCCAGTGAGCACACCCACTCAGGTC
CAAGCCTGCAACAGCCATGCCTGCCCTCCACAATGGAGCCTTGGACCCTGGTCTCAGTGTTCCAAGAC
CTGTGGACGAGGGGTGAGGAAGCGTGAACTCCTCTGCAAGGGCTCTGCCGCAGAAACCCTCCCCGAGA
GCCAGTGTACCAGTCTCCCCAGACCTGAGCTGCAGGAGGGCTGTGTGCTTGGACGATGCCCCAAGAAC
AGCCGGCTACAGTGGGTCGCTTCTTCGTGGAGCGAGTGTTCTGCAACCTGTGGTTTGGGTGTGAGGAA
GAGGGAGATGAAGTGCAGCGAGAAGGGCTTCCAGGGAAAGCTGATAACTTTCCCAGAGCGAAGATGCC
GTAATATTAAGAAACCAAATCTGGACTTGGAAGAGACCTGCAACCGACGGGCTTGCCCAGCCCATCCA
GTGTACAACATGGTAGCTGGATGGTATTCATTGCCGTGGCAGCAGTGCACAGTCACCTGTGGGGGAGG
GGTCCAGACCCGGTCAGTCCACTGTGTTCAGCAAGGCCGGCCTTCCTCAAGTTGTCTGCTCCATCAGA
AACCTCCGGTGCTACGAGCCTGTAATACAAACTTCTGTCCAGCTCCTGAAAAGAGAGAGGATCCATCC
TGCGTAGATTTCTTCAACTGGTGTCACCTAGTTCCTCAGCATGGTGTCTGCAACCACAAGTTTTACGG
AAAACAATGCTGCAAGTCATGCACAAGGAAGATCTGATCTTGGTGTCCTCCCCAGCACCTTATGGCCA
GGGGCTTACCTTTCAACCTCTAGAGA
Wherein X1 is C or T; X2 is A or G; X3 is T or A.
NOV1i, CG110205 SEQ ID NO: 18 1162 aa MW at 28776.6kD
Protein Sequence
MECA LLACAFPAAGSGPPRG--ιAGLGRVAKA QLCCLCCASVAAALASDSSSGASGLNDDYVFVTPVE VDSAGSYISHDILHNGR KRSAQNARSSLHYRFSAFGQELHLΞLKPSAI SSHFIVQVLGKDGASETQ
JKPEVQQCFYQGFIRNDSSSSVAVSTCAGLSGLIRTRKNΞF ISPLPQL AQEHNZ1SSPAGHHPHVLYK
JRTAEEKIQRYRGYPGSGR YPGYSPSHIPHASQSRETEYHHRR QKQHFCGRR KYAPZ2PPTEDTYLR ;FDEYGSSGRPRRSAGKSQ GLNVETLVVADKKMVEKHGKGNVTTYILTViy-- VSGLF DGTIGSDINV 'VVVSLILLEQΞPGG LINHHADQS NSFCQ QSA IGKNGK-flHDHAILLTGFDICS KNEPCDT GFA IplSG CSKYRSCTINEDTGLGLAFTIAHESGH FG IHDGEGNPCRKAEGNIMSPTLTG NGλΛFSWSS CSRQYLKKF STPQAGCLVDΞPKQAGQYKYPDKLPGQIYDADTQCK QFGAKA CS--1GFVKDICKSL CHRVGHRCET FMPAAEGTVCGLSMWCRQGQCV FGELGPRPIHGQWSAWS WSECSRTCGGGVKFQ
JΞRHCMSTPKPQYGGZ3FCPGSSRIYQ CNINPCNENS DFRAQQCAΞYNS PFRG FYQ KPYTKVEEED
IRCKLYCKAENFEFFFAMSG ^^DGTPCSPNKNDVCIDGVCELVGCDHELGS AVSDACGVCKGDNSTC
;KFYKGLY---NQHKA3STΞYYPVVLIPAGARSIEIQELQVSSSYLAVRSLSQKYYLTGG SID PGEFPFAG jTTFEYQRSFNRPERLYAPGPTNET I MQGKNPGIAlraYALPKVM GTPPAT-πiPAYTCW PGEWSTC iSKACAGGQQSRKIQCVQKKPFQKΞEAVLHS CPVSTPTQVQACNSHACPPQ S GP SQCSKTCGRGV
.RKRELLCKGSAAETLPESQCTS PRPE QEGCVLGRCPKNSRLQWVASSWSΞCSATCGLGVRKRE KC
|SEKGFQGK ITFPERRCR IKKP DLEETCNRRACPAHPVYNMV"AGWYS P QQCTVTCGGGVQTRS VHCVQQGRPSSSC LHQKPPV--,RACNTNFCPAPEKREDPSCVDFFNWCH VPQHGVCNH FYG QCCK
ISCTRKI
Wherein Z1 is Y or H; Z2 is K or E; Z3 is L or I.
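The relationship between the variable nucleotide positions of SEQ ID NO: 17 and the variable residues Z1 to Z3 of SEQ ID NO: 18, and between the stated ORF coordinates and the stated protein length, can be checked with a short sketch. The following is illustrative only and is not part of the disclosed method. It assumes the standard genetic code, 1-based coordinates in which the ORF stop coordinate is the first base of the stop codon, and that each variable base occupies the first position of its codon, as the flanking sequence suggests (codons read here as X1-A-C, X2-A-G and X3-T-A). The names `orf_codons` and `variant_residues` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def orf_codons(start, stop):
    """Number of sense codons between an ORF start and stop coordinate.

    Assumption: both coordinates are 1-based and `stop` is the first base
    of the stop codon, so the coding span is stop - start nucleotides.
    """
    span = stop - start
    if span <= 0 or span % 3 != 0:
        raise ValueError("coordinates do not describe an in-frame ORF")
    return span // 3


def variant_residues(codon_template, alternative_bases):
    """Translate a codon for each alternative base at its variable position.

    `codon_template` holds '{}' where the variable base sits, e.g. '{}AC'.
    """
    return [CODON_TABLE[codon_template.format(b)] for b in alternative_bases]


# Codon contexts as read from the flanking sequence (an assumption).
VARIANT_CODONS = {
    "X1": ("{}AC", "CT"),  # C or T
    "X2": ("{}AG", "AG"),  # A or G
    "X3": ("{}TA", "TA"),  # T or A
}

if __name__ == "__main__":
    print("codons in ORF 85..3571:", orf_codons(85, 3571))
    for name, (template, bases) in VARIANT_CODONS.items():
        print(name, variant_residues(template, bases))
```

Under these assumptions the ORF coordinates 85 and 3571 span 1162 codons, which agrees with the stated length of 1162 aa, and the three variable codons translate to H/Y, K/E and L/I, which agrees with the alternatives listed for Z1, Z2 and Z3.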
A ClustalW comparison of the above protein sequences yields the sequence alignment shown in Table 1B.
Table 1B. Comparison of the NOV1 protein sequences.
NOVIa MECA LACAFPAAGSGPPRGLAGLGRVAKALQLCC CCASVAAALASDSSSGASGLNDD
NOVIb MECAL LACAFPAAGSGPPRGLAG GRVAKALQ CCLCCASVAAA ASDSSSGASGIiNDD
NOVIc
NCVld
NOVle
NOVlf
NOVlg 1X-ECALLLACAFPAAGSGPPRGLAGLGRVAKALQLCC CCASVAAALASDSSSGASG---NDD
NOVlh MECAL--.LACAFPAAGSGPPRGLAGLGRVA ALQLCCLCCASVAAAASDSSSGASGI-NDD
NOV1i MΞCAL LACAFPAAGSGPPRGLAGLGRVAKALQLCC CCASVAAALASDSSSGASGLNDD
NOVIa YVFVTPVEVDSAGSYISHDILHNGRKKRSAQNARSSLHYRFSAFGQE H ELKPSAII-SS
NOVIb YVFVTPVEVDSAGSYISHDILHNGRKKRSAQNARSSLHYRFSAFGQΞLHLΞLKPSAILSS
NOVIc K ELKPSAII-SS
NCVld
NOVle
NOVlf K ΞLKPSAI SS
NOVlg YVFVTPVEVDSAGSYISHDILHNGRKKRSAQNARSS HYRFSAFGQELH EL PSAILSS
NOVlh. YVFVTPVEVDSAGSYISHDI HNGRKKRSAQNARSSLHYRFSAFGQE HI-ELKPSAILSS NOVli YVFVTPVEVDSAGSYISHDir-HNGRKKRSAQNARSS--.HYRFSAFGQELHLEl-KPSAI SS
NOVIa HFIVQVLG DGASETQKPEVQQCFYQGFIRNDSSSSVAVSTCAGLSGLIRTRKNEF ISP NOVIb HFIVQVLG DGASETQ PEVQQCFYQGFIRNDSSSSVAVSTCAGLSGLIRTRINΞFr-ISP NOVIc HFIVQVLGKDGASETQKPΞVQQCFYQGFIRNDSSSSVAVSTCAGLSG 1RTRKNEF ISP NOVld NOVle NOVlf HFIVQVLGEXIGASETQKPΞVQQCFYQGFIRNDSSSSVAVSTCAGLSG IRTRKNEF--,ISP NOVlg HFIVQVLGKDGASETQ PEVQQCFYQGFIRNDSSSSVAVSTCAGLSGLIRTRKNEFLISP NOVlh HFIVQVLGKDGASETQKPEVQQCFYQGFIRNDSSSSVAVSTCAGLSG IRTRKNEFLISP NOVli HFIVQVGKDGASETQKPEVQQCFYQGFIRNDSSSSVAVSTCAGLSGLIRTRKNEFLISP
NOVIa PQ LAQΞHNYSSPAGHHPHVLYKRTAEEKIQRYRGYPGSGRNYPGYSPSHIPHASQSRE NOVIb LPQ--.LAQEHNYSSPAGHHPHVLYKRTAEE IQRYRGYPGSGRNYPGYSPSHIPHASQSRE NOVIc LPQ LAQEHNYSSPAGHHPHVLYKRTAEE IQRYRGYPGSGRNYPGYSPSHIPHASQSRE NOVld NOVle NOVlf PQ AQEHNYSSPAGHHPHVLY RTAEEKIQRYRGYPGSGRNYPGYSPSHIPHASQSRE NOVlg LPQL AQΞHNHSSPAGHHPHV YKRTAEEKIQRYRGYPGSGRNYPGYSPSHIPHASQSRE NOVlh LPQLLAQΞHNYSSPAGHHPHVLYKRTAEEKIQRYRGYPGSGRNYPGYSPSHIPHASQSRE NOVli LPQLLAQEHNYSSPAGHHPHVLYKRTAEΞKIQRYRGYPGSGRNYPGYSPSHIPHASQSRE
NOVIa TEYHHRR QKQHFCGRRK YAPKPPTEDTY RFDEYGSSGRPRRSAGKSQ GLNVETLW NOVIb TEYHHRR QKQHFCGRRK YAPKPPTEDTY RFDΞYGSSGRPRRSAGKSQKG NVETW NOVIc TEYHHRR QKQHFCGRRKKYAPKPPTEDTYLRFDEYGSSGRPRRSAGKSQKG NVETLW NOVld KVETLW NOVle TGSVΞTI-W NOVlf TEYHHRRLQ QHFCGRRKKYAPKPPTEDTY RFDΞYGSSGRPRRSAGKSQKG NVETLW NOVlg TEYHHRR QKQHFCGRRKKYAPKPPTΞDTYLRFDEYGSSGRPRRSAGKSQKG NVETLW NOVlh TEYHHRRLQ QHFCGRRK YAPKPPTEDTY RFDEYGSSGRPRRSAGKSQKGLNVΞTI-W NOVli TEYHHRR QKQHFCGRRKKYAPEPPTEDTY RFDEYGSSGRPRRSAGKSQ GLNVETLW
NOVIa ADK MVΈKHGKGNVTTYILTVNMVSGLFKDGTIGSDINVVVVSLIL EQEPGG LINHH NOVIb ADKKMVEKHGKGNVTTYILTVMKVS-G FKDGTIGSDINVVVVSLI LΞQEPGGRILINHH NOVIc ADK-KMVEKHGKGNVTTYILTVMKTJWSG ^ NOVld ADKDWEKHG GNVTTYI TVM-Π-RVSGLF-^GTIGSDINVVVVSLILLEQΞPGG--ILINHH NOVle ADKQMVEKHGKGNVTTYI TVMNMVSG FKDGTIGSDINVVVVS I LΞQEPGGL INHH NOVlf ADKKMVEKHGKGNVTTYILTVMN]V-VSG F DGTIGSDINVVVVSLILLEQEPGG --IINHH NOVlg ADKK VEKHGKGNVTTYILTVMNLWSG F-ΠSGTIGSDINVVVVS ILLEQEPGGL INHH NOVlh ADK-KMVEKHGKGNVTTYILTVI MVSG FKD^ NOVli ADK KfVEKHGKGNVTTYILTViy-Ni-VSGLFKDGTIGSDINVVVVSLI-^
NOVIa ADQSLNSFCQWQSALIGKNGKRHDHAIL TGFDICSW NEPCDT GFAPISGMCSKYRSC NOVIb ADQSLNSFCQWQSALIG1.G RHDHAI TGFDICSWKNΞPCDTLGFAPTSGMCSKYRSC NOVIc ADQSLNSFCQWQSA IGKNGKRHDHA1LLTGFDICS KNEPCDT GFAPISGMCSKYRSC NOVld ADQSLNSFCQWQSALIGKNGK-EIHDHAIL TGFDICS KNEPCDTLGFAPISG CSKYRSC NOVle ADQSLNSFCQ QSA IGKNGKRHDHAI LTGFDICS KNEPCDTLGFAPISGMCS YRSC NOVlf ADQSLNSFCQWQSALIGKNG-EIHDHAI TGFDICSWKNEPCDT GFAPISGMCSKYRSC NOVlg ADQS NSFCQWQSALIGKNGKRHDHA1 TGFDICSWKNEPCDTLGFAPISGMCSKYRSC NOVlh ADQS NSFCQ QSALIGKNGKRHDHAI LTGFDICSWKNEPCDT GFAPISGMCSKYRSC NOVli ADQS NSFCQWQSALIG NGKRHDHAILLTGFDICS KNΞPCDT GFAPISGMCSKYRSC
NOVIa TINEDTG GLAFTIAHESGHNFG IHDGEGNPCRKAEGNIMSPTLTGNNGVFS SSCSRQ NOVIb TINEDTGLGLAFTIAHESGHNFG IHDGΞGNPCR AEGNIMSPTLTGNNGVFSWSSCSRQ NOVIc TINΞDTG G AFTIAHESGHNFGMIHDGEGNPCRKAEGNIMSPTLTGNNGVFS SSCSRQ NOVld TINEDTGLGLAFTIAHESGHNFGMIHDGEGNPCR AEGNIMSPTLTGNNGVFSWSSCSRQ
NOV1e TINEDTGLGLAFTIAHESGHNFGMIHDGEGNPCRKAEGNIMSPTLTGNNGVFS SSCSRQ
NOV1f TINEDTGLG AFTIAHESGHNFG IHDGEGNPCRKAEGNI SPTLTGNNGVFSWSSCSRQ
NOVlg TINEDTGLGAFTIAHESGHNFGMIHDGEGNPCR AEGNIMSPTLTGNNGVFS SSCSRQ
NOVlh TINEDTGLGLAFTIAHESGHNFG IHDGEGNPCRKAEGNIMSPTLTGNNGVFSWSSCSRQ
NOVli TINEDTGLGLAFTIAHESGHNFGMIHDGEGNPCRKAEGNI SPTLTGNNGVFS SSCSRQ
NOVIa Y KFLSTPQAGC VDEPKQAGQYKYPDKI-PGQIYDADTQCK QFGAKAKLCS GFVKDI
NOVIb YLKKF STPQAGCLVDEPKQAGQYKYPDKLPGQIYDADTQCKWQFGAKAKLCSLGFVKDI
NOVIc YLKKFLSTPQAGCLVDEPKQAGQYKYPDKLPGQIYDADTQCKWQFGAKAK CS GFVKDI
NOVld YLKKF STPQAGCI-VDEPLE
NOV1e Y K F STPQAGCLVDEPKQAGQYKYPDKI-PGQIYDADTLCKWQFGAKAK CSLGFVKDI
NOVlf YLKKFLSTPQAGCLVDEP E
NOVlg YL KF STPQAGC VDΞP QAGQYKYPDKLPGQIYDADTQCK QFGAKAKLCSLGFVKDI
NOVlh Y KKFLSTPQAGCLVDEPKQAGQYKYPDK PGQIYDADTQCK QFGAKAKLCSLGFVKDI
NOV1i Y KKFLSTPQAGC VDEPKQAGQY YPDK PGQIYDADTQCKMQFGAKAKLCS GFVKDI
NOVIa CKSLWCHRVGHRCETKFMPAAEGTVCGLS WCRQGQCVKFGELGPRPIHGQWSAWSKWSE NOVIb CKSL CHRVGHRCETKFMPAAEGTVCGLSMWCRQGQCVKFGELGPRPIHGQWSAWSKWSE NOVIc CKS WCHRVGHRCETKFMPAAEGTVCGLS WCRQGQCVKFGΞ GPRPIHGQWSAWSK SΞ NOVld NOVle CKS WCHRVGHRCETKFMPAAEGTVCGLSMWCRQGQCVKFGE GPRPIHGQWSAWS WSΞ NOVlf NOVlg CKSLWCHRVGHRCETKF PAAEGTVCG SMWCRQGQCVKFGELGPRPIHGQWSAWSKWSE NOVlh CKSLWCHRVGHRCETKFMPAAEGTVCGLSMWCRQGQCVKFGΞ GPRPIHGQWSA SK SΞ NOVli CKSLWCHRVGHRCΞTKFMPAAEGTVCG SMWCRQGQCVKFGE GPRPIHGQWSAWSKWSE
NOVIa CSRTCGGGVKFQERHCNNP PQYGGLFCPGSSRIYQ CNINPCNENSLDFRAQQCAEYNS
NOVIb CSRTCGGGVKFQERHCNNPKPQYGGIFCPGSSRIYQ CNINPCNENSLDFRAQQCAEYNS
NOV1C CSRTCGGGVKFQERHCNNPKPQYGGLFCPGSSRIYQLCNINPC E
NOVld
NOV1e CSRTCGGGVKFQERHCNNPKPQYGGLFCPGSSRIYQLCNINPC EG
NOVlf
NOVlg CSRTCGGGVKFQERHCNNPKPQYGGLFCPGSSRIYQLCNINPCNENSLDFRAQQCAEYNS
NOVlh CSRTCGGGVKFQERHCNNPKPQYGGIFCPGSSRIYQ CNINPCNENSLDFRAQQCAEYNS
NOV1i CSRTCGGGVKFQERHCNNPKPQYGGLFCPGSSRIYQ CNINPCNENSLDFRAQQCAEYNS
NOVIa KPFRGWFYQW PYTKVEEEDRCK YCKAENFΞFFFAMSGKVKDGTPCSPNKNDVCIDGVC
NOVIb KPFRGWTFYQ KPYT VEEEDRCK YCKAENFEFFFAMSG VKDGTPCSPNKNDVCIDGVC
NOVIc
NOVld
NOVle
NOVlf
NOVlg KPFRGWrFYQ PYTKVEEEDRCKLYCKAENFEFEFA SG VDGTPCSPNKNbVCIDGVC
NOVlh KPFRGWTFYQWKPYTKVEEEDRCKLYCKAENFEFFFAMSGKVKDGTPCSPNKNDVCIDGVC
NOVli PFRGWFYQWKPYTKVEEEDRCKLYC AENFEFFFAMSGKVKDGTPCSPNKNDVCIDGVC
NOVIa E VGCDHELGSKAVSDACGVCKGDNSTCKFYKGLYI-NQHKANEYYPVV IPAGARSIEIQ
NOVIb ELVGCDHELGSKAVSDACGVC GDNSTCKFYKGLYI-NQHKANEYYPW IPAGARSIEIQ
NOVIc
NOVld
NOVle
NOVlf
NOVlg ΞLVGCDHELGSAVSDACGVCKGDNSTCKFYKGLY-^QH ANEYYPVV IPAGARSIΞIQ
NOVlh ELVGCDHELGSKAVSDACGVCKGDNSTC FY GLY---NQHKANEYYPVVLIPAGARSIEIQ NOVli Ξ VGCDHELGSKAVSDACGVCKGDNSTCKFYKGLYLNQH ANEYYPWLIPAGARSIEIQ
NOVIa ELQVSSSYLAVRS SQKYYLTGGWSIDWPGEFPFAGTTFΞYQRSFNRPΞR YAPGPTNET
NOVIb ELQVSSSY AVRSLSQKYYI-TGG SIDWPGEFPFAGTTFEYQRSFNRPER YAPGPTNΞT
NOVIc
NOVld
NOVle
NOVlf
NOVlg EI-QVSSSY AVRSLSQKYYLTGG SIDWPGΞFPFAGTTFΞYQRSFNRPER YAPGPTNET
NOVlh ELQVSSSY AVRS SQKYYLTGGWSIDWPGEFPFAGTTFEYQRSFNRPERLYAPGPTNET
NOVli E QVSSSYLAVRS SQKYYLTGG SIDWPGEFPFAGTTFΞYQRSFNRPERLYAPGPTNET
NOVIa Lir-MQG-^PGIAVKYA PKVMNGTPPATKRPAYTCWMPGE STCSKACAGGQQSR IQCV
NOVIb VFΞI---MQGKNPGIAKYA PKVMNGTPPATKRPAYTWSIVQSECSVSCGGGYINVAIC
NOVIc
NOVld
NOVle
NOVlf — r
NOVlg I---MQGKNPGIAW YALPKV1-INGTPPATKRPAYTCWMPGEWSTCSKACAGGQQSR IQCV
NOVlh I---MQG NPGIAVmYALPVMNGTPPATKRPAYTCWMPGEWSTCSKACAGGQQSRKIQCV
NOVli I--MQGKNPGIAWKYA P MSTGTPPATKRPAYTCWMPGEWSTCSKACAGGQQSR IQCV
NOVIa QKKPFQKEEAVHSLCPVSTPTQVQACNSHACPPQWSLGPWSQCSKTCGRGVRKRELLCK
NOVIb LRDQNTQVNSSFCSA T PVTEPKICNAFSCPAYWMPGEWSTCS SCAGGQQSKKIQCVQ
NOVIc
NOVld —
NOVle —
NOVlf
NOVlg Q KPFQKEEAV--1HS CPVSTPTQVQACNSHACPPQWS GPWSQCS TCGRGVRKREL C
NOVlh Q PFQKEEAVLHSLCPVSTPTQVQACNSHACPPQWSLGP SQCSKTCGRGVRKREL CK
NOVli QK PFQ EEAVLHSLCPVSTPTQVQACNSHACPPQ SLGP SQCSKTCGRGVRKRELL.CK
NOVIa GSAAETLPΞSQCTS PRPELQEGCV GRCPKNSRLQWVASSWSECSATCGLGV REMKC
NOVIb KPFQKEΞAVLHSLCPVSTPTQVQACNSHACPPQWSLGP SQCSKTCGRGVRKRΞL CKG
NOVIc
NOVld •
NOVle
NOVlf
NOV1g GSAAETLPESQCTSI-PRPE QEGCV GRCPKNSR QWVASSWSECSATCG GVR REMKC
NOVlh GSAAΞT--JPESQCTS--1PRPELQEGCVLGRCP NSR QWVASSWSECSATCG GVR RE KC
NOVli GSAAETLPESQCTSLPRPELQΞGCVLGRCPKNSRLQWVASSWSECSATCGLGVR RΞMKC
NOVIa SEKGFQGKLITFPΞRRCRNIKKP-^DLEETCNRRACPAHPVY-MVAGWYSLPWQQCTVTC
NOVIb SAAETLPΞSQCTSLPRPELQEGCVLGRCPWSR Q VASS SECSATCGLGVRKREMKCS
NOVIc
NOVld
NOVle
NOVlf
NOVlg SΞKGFQGKLITFPERRCRNI-^PN D EETCNRRACPAHPVYNMVAGWYSLPWQQCTVTC
NOVlh SEKGFQGKLITFPERRCRNIK-KPN D ΞETCNRRACPAHPVYOTWAG YSLP QQCTVTC
NOV1i SEKGFQGK ITFPE-lRCRNIKKPNLD EΞTCNRRACPAHPVYlSriW'AG YS PWQQCTVTC
NOVIa GGGVQTRSVHCVQQGRPSSSC1.LHQ PPVI.RACNTNFCPAPEKREDPSCVDFFN CHI-VP NOVIb EKGFQG LITFPERRCRNIKKPNDLEETCNRRACPAHPVYMaVAGWYSI-P QQCTVTCG NOVIc NOVld
NOVle
NOVlf
NOVlg GGGVQTRSVHCVQQGRPSSSCLLHQ PPVRACNTNFCPAPEKRΞDPSCVDFFNWCHI-VP
NOVlh GGGVQTRSVHCVQQGRPSSSCLLHQKPPVRACNTNFCPAPΞKRΞDPSCVDFFN CHLVP
NOVli GGGVQTRSVHCVQQGRPSSSCL HQKPPVLRACNTNFCPAPEKREDPSCVDFFNWCHVP
NOV1a QHGVCNHKFYGKQCCKSCTRKI
NOV1b GGVQTRSVHCVQQGRPSSSCL HQKPPVRACNTNFCPAPEKRDSAGSQLPCCDGPQAVH
NOV1c
NOV1d
NOV1e
NOV1f
NOV1g QHGVCNHKFYGKQCCKSCTRKI-
NOV1h QHGVCNHKFYGKQCCKSCTRKI-
NOV1i QHGVCNHKFYGKQCCKSCTRKI-

NOV1a
NOV1b EEGLRFPDNHWAM
NOV1c
NOV1d
NOV1e
NOV1f
NOV1g
NOV1h
NOV1i
NOV1a (SEQ ID NO: 2)
NOV1b (SEQ ID NO: 4)
NOV1c (SEQ ID NO: 6)
NOV1d (SEQ ID NO: 8)
NOV1e (SEQ ID NO: 10)
NOV1f (SEQ ID NO: 12)
NOV1g (SEQ ID NO: 14)
NOV1h (SEQ ID NO: 16)
NOV1i (SEQ ID NO: 18)
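A multiple alignment such as the one above is commonly summarized by the pairwise percent identity between aligned rows. The following is a minimal sketch of one such calculation, not the ClustalW scoring itself. It assumes that the two rows have already been aligned to equal length, that `-` marks a gap, and that columns containing a gap in either row are excluded from the comparison; other conventions for the denominator exist. The function name `percent_identity` and the short example rows are hypothetical.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Percent identity between two rows of an existing alignment.

    Columns where either row holds a gap are skipped; identity is the
    share of the remaining columns in which the residues are equal.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            continue  # gapped column, not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy rows for illustration only, not taken from Table 1B.
    print(percent_identity("ACDE", "ACDF"))
    print(percent_identity("MECA-L", "MECALL"))
```

Excluding gapped columns keeps truncated variants from being penalized for regions they do not cover, which is one reasonable choice when comparing fragments against full-length sequences.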
Further analysis of the NOV1a protein yielded the following properties shown in Table 1C.
Table 1C. Protein Sequence Properties NOV1a
SignalP analysis: Cleavage site between residues 48 and 49
PSORT II analysis:
PSG: a new signal peptide prediction method N-region: length 2; pos.chg 0; neg.chg 1 H-region: length 17; peak value 0.00 PSG score: -4.40
GvH: von Heijne's method for signal seq. recognition GvH score (threshold: -2.1): 3.61 possible cleavage site: between 16 and 1
>>> Seems to have no N-terminal signal peptide
ALOM: Klein et al's method for TM region allocation Init position for calculation: 1
Tentative number of TMS(s) for the threshold 0.5: 2 INTEGRAL Likelihood = -4.78 Transmembrane 31 - 47 INTEGRAL Likelihood = -2.50 Transmembrane 332 - 348 PERIPHERAL Likelihood = 1.85 (at 1) ALOM score: -4.78 (number of TMSs: 2)
MTOP: Prediction of membrane topology (Hartmann et al.) Center position for calculation: 38 Charge difference: -6.0 C(-3.0) - N( 3.0) N >= C: N-terminal side will be inside
>>> membrane topology: type 3a
MITDISC: discrimination of mitochondrial targeting seq R content: 2 Hyd Moment(75): 3.45 Hyd Moment(95): 5.03 G content: 5 D/E content: 2 S/T content: 3 Score: -7.19
Gavel: prediction of cleavage sites for mitochondrial preseq R-2 motif at 37 GRV|AK
NUCDISC: discrimination of nuclear localization signals pat4: RKKR (5) at 85 pat4: RRKK (5) at 256 pat4: RPRR (4) at 281 pat7: PERRCRN (4) at 1033 bipartite: none content of basic residues: 12.0% NLS Score: 0.75
KDEL: ER retention motif in the C-terminus: none
ER Membrane Retention Signals: KKXX-like motif in the C-terminus: CTRK
SKL: peroxisomal targeting signal in the C-terminus: none
PTS2: 2nd peroxisomal targeting signal: none
VAC: possible vacuolar targeting motif: none
RNA-binding motif: none
Actinin-type actin-binding motif: type 1: none type 2: none
NMYR: N-myristoylation pattern: none
Prenylation motif: none
memYQRL: transport motif from cell surface to Golgi: none
Tyrosines in the tail: none
Dileucine motif in the tail: none
checking 63 PROSITE DNA binding motifs: none
checking 71 PROSITE ribosomal protein motifs: none
checking 33 PROSITE prokaryotic DNA binding motifs: none
NNCN: Reinhardt's method for Cytoplasmic/Nuclear discrimination Prediction: nuclear Reliability: 89
COIL: Lupas's algorithm to detect coiled-coil regions total: 0 residues
Final Results (k = 9/23):
34.8 %: nuclear 30.4 %: mitochondrial 26.1 %: endoplasmic reticulum 4.3 %: peroxisomal 4.3 %: cytoplasmic
>> prediction for CG110205-01 is nuc (k=23)
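The percentages in the PSORT II "Final Results" line are consistent with a nearest-neighbour vote: each reported value equals a whole number of votes divided by k. The following is a minimal sketch, not PSORT II itself, that recovers the vote counts implied by the NOV1a figures for k = 23. The function name `implied_votes` and the rounding tolerance are assumptions made for illustration.

```python
def implied_votes(percentages, k):
    """Recover whole-number vote counts that reproduce one-decimal percentages for k neighbours."""
    votes = {}
    for label, pct in percentages.items():
        n = round(pct * k / 100)
        # The count is accepted only if n/k rounds back to the reported percentage.
        if abs(100 * n / k - pct) > 0.05 + 1e-9:
            raise ValueError(f"{pct}% for {label!r} is not a multiple of 1/{k}")
        votes[label] = n
    return votes


# Percentages as reported for NOV1a in Table 1C.
nov1a = {
    "nuclear": 34.8,
    "mitochondrial": 30.4,
    "endoplasmic reticulum": 26.1,
    "peroxisomal": 4.3,
    "cytoplasmic": 4.3,
}
votes = implied_votes(nov1a, k=23)
print(votes, "total:", sum(votes.values()))
```

Under this reading the five classes account for all 23 neighbours, with the largest share assigned to the nuclear class, in agreement with the reported prediction.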
A search of the NOV1a protein against the Geneseq database, a proprietary database that contains sequences published in patents and patent publications, yielded several homologous proteins shown in Table 1D.
Figure imgf000055_0001
Figure imgf000056_0001
In a BLAST search of public sequence databases, the NOV1a protein was found to have homology to the proteins shown in the BLASTP data in Table 1E.
Figure imgf000056_0002
PFam analysis predicts that the NOV1a protein contains the domains shown in Table 1F.
Figure imgf000057_0001
Example 2. NOV2, CG189936, Junctional Adhesion Molecule 3
The NOV2 clone was analyzed, and the nucleotide and encoded polypeptide sequences are shown in Table 2A.
Table 2A. NOV2 Sequence Analysis
NOV2a, CG136984-02 SEQ ID NO: 19 952 bp DNA Sequence ORF Start: ATG at 14 ORF Stop: at 944
CACCGGATCCACCATGGCGCTGAGGCGGCCATCGCGACTCCGGCTCTGCGCTCGGCTGCCTGACTTCT
TCCTGCTGCTGCTTTTCAGGGGCTGCCTGATAGGGGCTGTAAATCTCAAATCCAGCAATCGAACCCCA
GTGGTACAGGAATTTGAAAGTGTGGAACTGTCTTGCATCATTACGGATTCGCAGACAAGTGACCCCAG
GATCGAGTGGAAGAAAATTCAAGATGAACAAACCACATATGTGTTTTTTGACAACAAAATTCAGGGAG
ACTTGGCGGGTCGTGCAGAAATACTGGGGAAGACATCCCTGAAGATCTGGAATGTGACACGGAGAGAC
TCAGCCCTTTATCGCTGTGAGGTCGTTGCTCGAAATGACCGCAAGGAAATTGATGAGATTGTGATCGA
GTTAACTGTGCAAGTGAAGCCAGTGACCCCTGTCTGTAGAGTGCCGAAGGCTGTACCAGTAGGCAAGA
TGGCAACACTGCACTGCCAGGAGAGTGAGGGCCACCCCCGGCCTCACTACAGCTGGTATCGCAATGAT
GTACCACTGCCCACGGATTCCAGAGCCAATCCCAGATTTCGCAATTCTTCTTTCCACTTAAACTCTGA
AACAGGCACTTTGGTGTTCACTGCTGTTCACAAGGACGACTCTGGGCAGTACTACTGCATTGCTTCCA
ATGACGCAGGCTCAGCCAGGTGTGAGGAGCAGGAGATGGAAGTCTATGACCTGAACATTGGCGGAATT
ATTGGGGGGGTTCTGGTTGTCCTTGCTGTACTGGCCCTGATCACGTTGGGCATCTGCTGTGCATACAG
ACGTGGCTACTTCATCAACAATAAACAGGATGGAGAAAGTTACAAGAACCCAGGGAAACCAGATGGAG
TTAACTACATCCGCACTGACGAGGAGGGCGACTTCAGACACAAGTCATCGTTTGTGATCCTCGAGGGC
NOV2a, CG136984-02 SEQ ID NO: 20 310 aa MW at 35009.5kD Protein Sequence
MALRRPSRLRLCARLPDFFLLLLFRGCLIGAVNLKSSNRTPVVQEFESVELSCIITDSQTSDPRIEWK
KIQDEQTTYVFFDNKIQGDLAGRAEILGKTSLKIWNVTRRDSALYRCEVVARNDRKEIDEIVIELTVQ
VKPVTPVCRVPKAVPVGKMATLHCQESEGHPRPHYSWYRNDVPLPTDSRANPRFRNSSFHLNSETGTL
VFTAVHKDDSGQYYCIASNDAGSARCEEQEMEVYDLNIGGIIGGVLVVLAVLALITLGICCAYRRGYF
INNKQDGESYKNPGKPDGVNYIRTDEEGDFRHKSSFVI
NOV2b, CG136984-01 SEQ ID NO: 21 939 bp DNA Sequence ORF Start: ATG at 7 ORF Stop: TGA at 937
GCCCTTATGGCGCTGACGCGGCCATCGCGACTCCGGCTCTGCGCTCGGCTGCCTGACTTCTTCCTGCT
GCTGCTTTTCAGGGGCTGCCTGATAGGGGCTGTAAATCTCAAATCCAGCAATCGAACCCCAGTGGTAC
AGGAATTTGAAAGTGTGGAACTGTCTTGCATCATTACGGATTCGCAGACAAGTGACCCCAGGATCGAG
TGGAAGAAAATTCAAGATGAACAAACCACATATGTGTTTTTTGACAACAAAATTCAGGGAGACTTGGC
GGGTCGTGCAGAAATACTGGGGAAGACATCCCTGAAGATCTGGAATGTGACACGGAGAGACTCAGCCC
TTTATCGCTGTGAGGTCGTTGCTCGAAATGACCGCAAGGAAATTGATGAGATTGTGATCGAGTTAACT
GTGCAAGTGAAGCCAGTGACCCCTGTCTGTAGAGTGCCGAAGGCTGTACCAGTAGGCAAGATGGCAAC
ACTGCACTGCCAGGAGAGTGAGGGCCACCCCCGGCCTCACTACAGCTGGTATCGCAATGATGTACCAC
TGCCCACGGATTCCAGAGCCAATCCCAGATTTCGCAATTCTTCTTTCCACTTAAACTCTGAAACAGGC
ACTTTGGTGTTCACTGCTGTTCACAAGGACGACTCTGGGCAGTACTACTGCATTGCTTCCAATGACGC
AGGCTCAGCCAGGTGTGAGGAGCAGGAGATGGAAGTCTATGACCTGAACATTGGCGGAATTATTGGGG
GGGTTCTGGTTGTCCTTGCTGTACTGGCCCTGATCACGTTGGGCATCTGCTGTGCATACAGACGTGGT
TACTTCATCAACAATAAACAGGATGGAGAAAGTTACAAGAACCCAGGGAAACCAGATGGAGTTAACTA
CATCCGCACTGACGAGGAGGGCGACTTCAGACACAAGTCATCGTTTGTGATCTGA
NOV2b, CG136984-01 SEQ ID NO: 22 310 aa MW at 34954.4kD Protein Sequence
MALTRPSRLRLCARLPDFFLLLLFRGCLIGAVNLKSSNRTPVVQEFESVELSCIITDSQTSDPRIEWK
KIQDEQTTYVFFDNKIQGDLAGRAEILGKTSLKIWNVTRRDSALYRCEVVARNDRKEIDEIVIELTVQ
VKPVTPVCRVPKAVPVGKMATLHCQESEGHPRPHYSWYRNDVPLPTDSRANPRFRNSSFHLNSETGTL
VFTAVHKDDSGQYYCIASNDAGSARCEEQEMEVYDLNIGGIIGGVLVVLAVLALITLGICCAYRRGYF
INNKQDGESYKNPGKPDGVNYIRTDEEGDFRHKSSFVI
NOV2c, CG136984-03 SEQ ID NO: 23 637 bp DNA Sequence ORF Start: at 11 ORF Stop: at 629
CACCGGATCCGCTGTAAATCTCAAATCCAGCAATCGAACCCCAGTGGTACAGGAATTTGAAAGTGTGG
AACTGTCTTGCATCATTACGGATTCGCAGACAAGTGACCCCAGGATCGAGTGGAAGAAAATTCAAGAT
GAACAAACCACATATGTGTTTTTTGACAACAAAATTCAGGGAGACTTGGCGGGTCGTGCAGAAATACT
GGGGAAGACATCCCTGAAGATCTGGAATGTGACACGGAGAGACTCAGCCCTTTATCGCTGTGAGGTCG
TTGCTCGAAATGACCGCAAGGAAATTGATGAGATTGTGATCGAGTTAACTGTGCAAGTGAAGCCAGTG
ACCCCTGTCTGTAGAGTGCCGAAGGCTGTACCAGTAGGCAAGATGGCAACACTGCACTGCCAGGAGAG
TGAGGGCCACCCCCGGCCTCACTACAGCTGGTATCGCAATGATGTACCACTGCCCACGGATTCCAGAG
CCAATCCCAGATTTCGCAATTCTTCTTTCCACTTAAACTCTGAAACAGGCACTTTGGTGTTCACTGCT
GTTCACAAGGACGACTCTGGGCAGTACTACTGCATTGCTTCCAATGACGCAGGCTCAGCCAGGTGTGA
GGAGCAGGAGATGGAACTCGAGGGC
NOV2c, CG136984-03 SEQ ID NO: 24 206 aa MW at 23390.0kD Protein Sequence
AVNLKSSNRTPVVQEFESVELSCIITDSQTSDPRIEWKKIQDEQTTYVFFDNKIQGDLAGRAEILGKT
SLKIWNVTRRDSALYRCEVVARNDRKEIDEIVIELTVQVKPVTPVCRVPKAVPVGKMATLHCQESEGH
PRPHYSWYRNDVPLPTDSRANPRFRNSSFHLNSETGTLVFTAVHKDDSGQYYCIASNDAGSARCEEQE
ME
NOV2d, 312713075 SEQ ID NO: 25 618 bp DNA Sequence ORF Start: at 1 ORF Stop: end of sequence
GCTGTAAATCTCAAATCCAGCAATCGAACCCCAGTGGTACAGGAATTTGAAAGTGTGG
AACTGTCTTGCATCATTACGGATTCGCAGACAAGTGACCCCAGGATCGAGTGGAAGAAAATTCAAGAT
GAACAAACCACATATGTGTTTTTTGACAACAAAATTCAGGGAGACTTGGCGGGTCGTGCAGAAATACT
GGGGAAGACATCCCTGAAGATCTGGAATGTGACACGGAGAGACTCAGCCCTTTATCGCTGTGAGGTCG
TTGCTCGAAATGACCGCAAGGAAATTGATGAGATTGTGATCGAGTTAACTGTGCAAGTGAAGCCAGTG
ACCCCTGTCTGTAGAGTGCCGAAGGCTGTACCAGTAGGCAAGATGGCAACACTGCACTGCCAGGAGAG
TGAGGGCCACCCCCGGCCTCACTACAGCTGGTATCGCAATGATGTACCACTGCCCACGGATTCCAGAG
CCAATCCCAGATTTCGCAATTCTTCTTTCCACTTAAACTCTGAAACAGGCACTTTGGTGTTCACTGCT
GTTCACAAGGACGACTCTGGGCAGTACTACTGCATTGCTTCCAATGACGCAGGCTCAGCCAGGTGTGA
GGAGCAGGAGATGGAA
NOV2d, 312713075 SEQ ID NO: 26 206 aa MW at 23934.5kD Protein Sequence
AVNLKSSNRTPVVQEFESVELSCIITDSQTSDPRIEWKKIQDEQTTYVFFDNKIQGDLAGRAEIL
GKTSLKIWNVTRRDSALYRCEVVARNDRKEIDEIVIELTVQVKPVTPVCRVPKAVPVGKMATLHCQES
EGHPRPHYSWYRNDVPLPTDSRANPRFRNSSFHLNSETGTLVFTAVHKDDSGQYYCIASNDAGSARCE
EQEME
NOV2e, 13382593 SEQ ID NO: 27 952 bp DNA Sequence ORF Start: ATG at 14 ORF Stop: at 944
CACCGGATCCACCATGGCGCTGAGGCGGCCACCGCGACTCCGGCTCTGCGCTCGGCTGCCTGACTTCT
TCCTGCTGCTGCTTTTCAGGGGCTGCCTGATAGGGGCTGTAAATCTCAAATCCAGCAATCGAACCCCA
GTGGTACAGGAATTTGAAAGTGTGGAACTGTCTTGCATCATTACGGATTCGCAGACAAGTGACCCCAG
GATCGAGTGGAAGAAAATTCAAGATGAACAAACCACATATGTGTTTTTTGACAACAAAATTCAGGGAG
ACTTGGCGGGTCGTGCAGAAATACTGGGGAAGACATCCCTGAAGATCTGGAATGTGACACGGAGAGAC
TCAGCCCTTTATCGCTGTGAGGTCGTTGCTCGAAATGACCGCAAGGAAATTGATGAGATTGTGATCGA
GTTAACTGTGCAAGTGAAGCCAGTGACCCCTGTCTGTAGAGTGCCGAAGGCTGTACCAGTAGGCAAGA
TGGCAACACTGCACTGCCAGGAGAGTGAGGGCCACCCCCGGCCTCACTACAGCTGGTATCGCAATGAT
GTACCACTGCCCACGGATTCCAGAGCCAATCCCAGATTTCGCAATTCTTCTTTCCACTTAAACTCTGA
AACAGGCACTTTGGTGTTCACTGCTGTTCACAAGGACGACTCTGGGCAGTACTACTGCATTGCTTCCA
ATGACGCAGGCTCAGCCAGGTGTGAGGAGCAGGAGATGGAAGTCTATGACCTGAACATTGGCGGAATT
ATTGGGGGGGTTCTGGTTGTCCTTGCTGTACTGGCCCTGATCACGTTGGGCATCTGCTGTGCATACAG
ACGTGGCTACTTCATCAACAATAAACAGGATGGAGAAAGTTACAAGAACCCAGGGAAACCAGATGGAG
TTAACTACATCCGCACTGACGAGGAGGGCGACTTCAGACACAAGTCATCGTTTGTGATCCTCGAGGGC
NOV2e, 13382593 SEQ ID NO: 28 310 aa MW at 35019.5kD Protein Sequence
MALRRPPRLRLCARLPDFFLLLLFRGCLIGAVNLKSSNRTPVVQEFESVELSCIITDSQTSDPRIEWK
KIQDEQTTYVFFDNKIQGDLAGRAEILGKTSLKIWNVTRRDSALYRCEVVARNDRKEIDEIVIELTVQ
VKPVTPVCRVPKAVPVGKMATLHCQESEGHPRPHYSWYRNDVPLPTDSRANPRFRNSSFHLNSETGTL
VFTAVHKDDSGQYYCIASNDAGSARCEEQEMEVYDLNIGGIIGGVLVVLAVLALITLGICCAYRRGYF
INNKQDGESYKNPGKPDGVNYIRTDEEGDFRHKSSFVI
NOV2f, CG136984 SEQ ID NO: 29 952 bp DNA Sequence ORF Start: ATG at 14 ORF Stop: at 944
CACCGGATCCACCATGGCGCTGAGGCGGCCAX1CGCGACTCCGGCTCTGCGCTCGGCTGCCTGACTTCT
TCCTGCTGCTGCTTTTCAGGGGCTGCCTGATAGGGGCTGTAAATCTCAAATCCAGCAATCGAACCCCA
GTGGTACAGGAATTTGAAAGTGTGGAACTGTCTTGCATCATTACGGATTCGCAGACAAGTGACCCCAG
GATCGAGTGGAAGAAAATTCAAGATGAACAAACCACATATGTGTTTTTTGACAACAAAATTCAGGGAG
ACTTGGCGGGTCGTGCAGAAATACTGGGGAAGACATCCCTGAAGATCTGGAATGTGACACGGAGAGAC
TCAGCCCTTTATCGCTGTGAGGTCGTTGCTCGAAATGACCGCAAGGAAATTGATGAGATTGTGATCGA
GTTAACTGTGCAAGTGAAGCCAGTGACCCCTGTCTGTAGAGTGCCGAAGGCTGTACCAGTAGGCAAGA
TGGCAACACTGCACTGCCAGGAGAGTGAGGGCCACCCCCGGCCTCACTACAGCTGGTATCGCAATGAT
GTACCACTGCCCACGGATTCCAGAGCCAATCCCAGATTTCGCAATTCTTCTTTCCACTTAAACTCTGA
AACAGGCACTTTGGTGTTCACTGCTGTTCACAAGGACGACTCTGGGCAGTACTACTGCATTGCTTCCA
ATGACGCAGGCTCAGCCAGGTGTGAGGAGCAGGAGATGGAAGTCTATGACCTGAACATTGGCGGAATT
ATTGGGGGGGTTCTGGTTGTCCTTGCTGTACTGGCCCTGATCACGTTGGGCATCTGCTGTGCATACAG
ACGTGGCTACTTCATCAACAATAAACAGGATGGAGAAAGTTACAAGAACCCAGGGAAACCAGATGGAG
TTAACTACATCCGCACTGACGAGGAGGGCGACTTCAGACACAAGTCATCGTTTGTGATCCTCGAGGGC
Wherein X1 is T or C.
NOV2f, CG136984 SEQ ID NO: 30 310 aa MW at 35009.5kD Protein Sequence
MALRRPZ1RLRLCARLPDFFLLLLFRGCLIGAVNLKSSNRTPVVQEFESVELSCIITDSQTSDPRIEWK
KIQDEQTTYVFFDNKIQGDLAGRAEILGKTSLKIWNVTRRDSALYRCEVVARNDRKEIDEIVIELTVQ
VKPVTPVCRVPKAVPVGKMATLHCQESEGHPRPHYSWYRNDVPLPTDSRANPRFRNSSFHLNSETGTL
VFTAVHKDDSGQYYCIASNDAGSARCEEQEMEVYDLNIGGIIGGVLVVLAVLALITLGICCAYRRGYF
INNKQDGESYKNPGKPDGVNYIRTDEEGDFRHKSSFVI
Wherein Z1 is S or P.
A ClustalW comparison of the above protein sequences yields the following sequence alignment shown in Table 2B.
Table 2B. Comparison of the NOV2 protein sequences.
NOV2a MALRRPSRLRLCARLPDFFLLLLFRGCLIGAVNLKSSNRTPVVQEFESVELSCIITDSQT
NOV2b MALTRPSRLRLCARLPDFFLLLLFRGCLIGAVNLKSSNRTPVVQEFESVELSCIITDSQT
NOV2c ------------------------------AVNLKSSNRTPVVQEFESVELSCIITDSQT
NOV2d ------------------------------AVNLKSSNRTPVVQEFESVELSCIITDSQT
NOV2a SDPRIEWKKIQDEQTTYVFFDNKIQGDLAGRAEILGKTSLKIWNVTRRDSALYRCEVVAR
NOV2b SDPRIEWKKIQDEQTTYVFFDNKIQGDLAGRAEILGKTSLKIWNVTRRDSALYRCEVVAR
NOV2c SDPRIEWKKIQDEQTTYVFFDNKIQGDLAGRAEILGKTSLKIWNVTRRDSALYRCEVVAR
NOV2d SDPRIEWKKIQDEQTTYVFFDNKIQGDLAGRAEILGKTSLKIWNVTRRDSALYRCEVVAR
NOV2a NDRKEIDEIVIELTVQVKPVTPVCRVPKAVPVGKMATLHCQESEGHPRPHYSWYRNDVPL
NOV2b NDRKEIDEIVIELTVQVKPVTPVCRVPKAVPVGKMATLHCQESEGHPRPHYSWYRNDVPL
NOV2c NDRKEIDEIVIELTVQVKPVTPVCRVPKAVPVGKMATLHCQESEGHPRPHYSWYRNDVPL
NOV2d NDRKEIDEIVIELTVQVKPVTPVCRVPKAVPVGKMATLHCQESEGHPRPHYSWYRNDVPL
NOV2a PTDSRANPRFRNSSFHLNSETGTLVFTAVHKDDSGQYYCIASNDAGSARCEEQEMEVYDL
NOV2b PTDSRANPRFRNSSFHLNSETGTLVFTAVHKDDSGQYYCIASNDAGSARCEEQEMEVYDL
NOV2c PTDSRANPRFRNSSFHLNSETGTLVFTAVHKDDSGQYYCIASNDAGSARCEEQEME----
NOV2d PTDSRANPRFRNSSFHLNSETGTLVFTAVHKDDSGQYYCIASNDAGSARCEEQEME----
NOV2a NIGGIIGGVLVVLAVLALITLGICCAYRRGYFINNKQDGESYKNPGKPDGVNYIRTDEEG
NOV2b NIGGIIGGVLVVLAVLALITLGICCAYRRGYFINNKQDGESYKNPGKPDGVNYIRTDEEG
NOV2c ------------------------------------------------------------
NOV2d ------------------------------------------------------------
NOV2a DFRHKSSFVI
NOV2b DFRHKSSFVI
NOV2c ----------
NOV2d ----------
NOV2a (SEQ ID NO 20)
NOV2b (SEQ ID NO 22)
NOV2c (SEQ ID NO 24)
NOV2d (SEQ ID NO 26)
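As an illustration only, the following sketch compares two aligned rows column by column and lists the substitutions between them. The input strings are the first alignment rows of NOV2a and NOV2b from Table 2B; the helper name `substitutions` is hypothetical. The mass-shift estimate uses standard average residue masses (Arg 156.1875, Thr 101.1051), which are supplied here as an assumption and are not taken from the specification.

```python
# Average residue masses in daltons (standard reference values, assumed here).
AVG_RESIDUE_MASS = {"R": 156.1875, "T": 101.1051}


def substitutions(row_a, row_b):
    """Return (position, residue_a, residue_b) for each mismatched, ungapped column (1-based)."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(row_a, row_b), start=1)
        if a != b and "-" not in (a, b)
    ]


nov2a_row1 = "MALRRPSRLRLCARLPDFFLLLLFRGCLIGAVNLKSSNRTPVVQEFESVELSCIITDSQT"
nov2b_row1 = "MALTRPSRLRLCARLPDFFLLLLFRGCLIGAVNLKSSNRTPVVQEFESVELSCIITDSQT"

diffs = substitutions(nov2a_row1, nov2b_row1)
# Mass change going from the NOV2a residue to the NOV2b residue at each difference.
shift = sum(AVG_RESIDUE_MASS[b] - AVG_RESIDUE_MASS[a] for _, a, b in diffs)
print(diffs, round(shift, 1))
```

The single Arg-to-Thr difference at residue 4 gives an estimated shift of about -55.1 Da, which matches the difference between the molecular weights listed for NOV2a (35009.5) and NOV2b (34954.4) in Table 2A.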
Further analysis of the NOV2a protein yielded the following properties shown in Table 2C.
Table 2C. Protein Sequence Properties NOV2a
SignalP analysis: Cleavage site between residues 31 and 32
PSORT II analysis:
PSG: a new signal peptide prediction method N-region: length 10; pos.chg 4; neg.chg 0 H-region: length 3; peak value -4.60 PSG score: -9.00
GvH: von Heijne's method for signal seq. recognition GvH score (threshold: -2.1): 1.54 possible cleavage site: between 30 and 31
>>> Seems to have no N-terminal signal peptide
ALOM: Klein et al's method for TM region allocation Init position for calculation: 1
Tentative number of TMS(s) for the threshold 0.5: 2 INTEGRAL Likelihood = -4.51 Transmembrane 18 - 34 INTEGRAL Likelihood = -13.48 Transmembrane 249 - 265 PERIPHERAL Likelihood = 5.09 (at 128) ALOM score: -13.48 (number of TMSs: 2)
MTOP: Prediction of membrane topology (Hartmann et al.) Center position for calculation: 25 Charge difference: -1.0 C( 0.0) - N( 1.0) N >= C: N-terminal side will be inside
>>> membrane topology: type 3a
MITDISC: discrimination of mitochondrial targeting seq R content: 5 Hyd Moment(75): 13.14 Hyd Moment(95): 14.90 G content: 0 D/E content: 1 S/T content: 1 Score: 1.87
Gavel: prediction of cleavage sites for mitochondrial preseq R-2 motif at 49 NRT|PV content of basic residues: 12.9% NLS Score: -0.47
NNCN: Reinhardt's method for Cytoplasmic/Nuclear discrimination Prediction: cytoplasmic Reliability: 76.7
Final Results (k = 9/23):
66.7 %: endoplasmic reticulum 22.2 %: mitochondrial 11.1 %: cytoplasmic
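SignalP places the predicted NOV2a cleavage site between residues 31 and 32. The sketch below shows, under that prediction, how the 1-based residue numbering maps onto a split of the sequence into a putative signal peptide and the remaining chain. Only the first 60 residues of NOV2a are used, and the function name `split_at_cleavage` is hypothetical.

```python
def split_at_cleavage(protein, last_signal_residue):
    """Split a protein after the given 1-based residue number."""
    if not 0 < last_signal_residue < len(protein):
        raise ValueError("cleavage position must fall inside the sequence")
    return protein[:last_signal_residue], protein[last_signal_residue:]


# First 60 residues of NOV2a (SEQ ID NO: 20).
nov2a_nterm = "MALRRPSRLRLCARLPDFFLLLLFRGCLIGAVNLKSSNRTPVVQEFESVELSCIITDSQT"

signal, remainder = split_at_cleavage(nov2a_nterm, 31)
print(signal)
print(remainder[:10])
```

With this split the putative signal peptide ends in ...CLIGA and the remaining chain begins VNLKSSNRTP.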
A search of the NOV2a protein against the Geneseq database, a proprietary database that contains sequences published in patents and patent publications, yielded several homologous proteins shown in Table 2D.
Figure imgf000061_0001
Figure imgf000062_0001
In a BLAST search of public sequence databases, the NOV2a protein was found to have homology to the proteins shown in the BLASTP data in Table 2E.
Figure imgf000062_0002
PFam analysis predicts that the NOV2a protein contains the domains shown in Table 2F. Specific amino acid residues of CG136984-02 for each domain are shown in column 2; equivalent domains in other NOV2 and CG136984 family proteins are also encompassed herein.
Table 2F. Domain Analysis of NOV2a
Pfam Domain; NOV2a Match Region: Amino Acid Residues; Identities/Similarities; Expect Value
Figure imgf000063_0001
Example 3. NOV3, CG189936-02 Retinoic acid receptor RXR-beta
The gene described here encodes a novel splice variant of Retinoic acid receptor RXR-beta, a coregulator of the retinoic acid receptors (RAR) alpha (RARA), beta (RARB), and gamma (RARG). RXR-beta forms heterodimers with RAR, preferentially increasing its DNA binding and transcriptional activity on promoters containing retinoic acid response elements, but not thyroid hormone or vitamin D response elements. RXR-beta also heterodimerizes with thyroid hormone and vitamin D receptors, increasing both DNA binding and transcriptional function on their respective response elements. RXR-alpha also forms heterodimers with these receptors. Retinoid X receptor coregulators selectively target the high-affinity binding of retinoic acid, thyroid hormone, and vitamin D receptors to their cognate DNA response elements. Retinoids are involved in controlling the function of the dopaminergic mesolimbic pathway, and defects in retinoic acid signaling contribute to disorders such as Parkinson disease and schizophrenia (Krezel et al. (1998) Impaired locomotion and dopamine signaling in retinoid receptor mutant mice. Science 279: 863-867). RXR heterodimers serve as key regulators of cholesterol homeostasis by governing reverse cholesterol transport from peripheral tissues, bile acid synthesis in the liver, and cholesterol absorption in the intestine. Activation of RXR/LXR heterodimers inhibits cholesterol absorption through upregulation of ABC1 expression in the small intestine. Activation of RXR/FXR heterodimers represses CYP7A1 expression and bile acid production, leading to a failure to solubilize and absorb cholesterol (Lu et al. (2000) Molecular basis for feedback regulation of bile acid synthesis by nuclear receptors. Molec. Cell 6: 507-515). The Retinoic acid receptor RXR-beta-like gene disclosed in this invention maps to chromosome 6.
NOV3 clones were analyzed, and the nucleotide and encoded polypeptide sequences are shown in Table 3A.
Table 3A. NOV3 Sequence Analysis
NOV3a, CG189936-02 SEQ ID NO: 31 1636 bp DNA Sequence ORF Start: ATG at 17 ORF Stop: at 1628
ICACCAGATCTCCCACCATGTCTTGGGCCGCTCGCCCGCCCTTCCTCCCTCAGCGGCATGCCGCAGGGC IAGTGTGGGCCGGTGGGGGTGCGAAAAGAAATGCATTGTGGGGTCGCGTCCCGGTGGCGGCGGCGACGG JCCCTGGCTGGATCCCGCAGCGGCGGCGGCGGCGGCGGTGGCAGGCGGAGAACAACAAACCCCGGAGCC JGGAGCCAGGGGAGGCTGGACGGGACGGGATGGGCGACAGCGGGCGGGACTCCCGAAGCCCAGACAGCT ICCTCCCCAAATCCCCTTCCCCAGGGAGTCCCTCCCCCTTCTCCTCCTGGGCCACCCCTACCCCCTTCA JACAGCTCCATCCCTTGGAGGCTCTGGGGCCCCACCCCCACCCCCGATGCCACCACCCCCACTGGGCTC STCCCTTTCCAGTCATCAGTTCTTCCATGGGGTCCCCTGGTCTGCCCCCTCCAGCTCCCCCAGGATTCT CCGGGCCTGTCAGCAGCCCCCAGATTAACTCAACAGTGTCACTCCCTGGGGGTGGGTCTGGCCCCCCT GAAGATGTGAAGCCACCAGTCTTAGGGGTCCGGGGCCTGCACTGTCCACCCCCTCCAGGTGGCCCTGG GGCTGGCAAACGGCTATGTGCAATCTGCGGGGACAGAAGCTCAGGCAAACACTACGGGGTTTACAGCT GTGAGGGTTGCAAGGGCTTCTTCAAACGCACCATCCGCAAAGACCTTACATACTCTTGCCGGGACAAC AAAGACTGCACAGTGGACAAGCGCCAGCGGAACCGCTGTCAGTACTGCCGCTATCAGAAGTGCCTGGC CACTGGCATGAAGAGGGAGGCGGTACAGGAGGAGCGTCAGCGGGGAAAGGACAAGGATGGGGATGGGG IAGGGGGCTGGGGGAGCCCCCGAGGAGATGCCTGTGGACAGGATCCTGGAGGCAGAGCTTGCTGTGGAA FCAGAAGAGTGACCAGGGCGTTGAGGGTCCTGGGGGAACCGGGGGTAGCGGCAGCAGCCCAAATGACCC (TGTGACTAACATCTGTCAGGCAGCTGACAAACAGCTATTCACGCTTGTTGAGTGGGCGAAGAGGATCC JCACACTTTTCCTCCTTGCCTCTGGATGATCAGGTCATATTGCTGCGGGCAGGCTGGAATGAACTCCTC JATTGCCTCCTTTTCACACCGATCCATTGATGTTCGAGATGGCATCCTCCTTGCCACAGGTCTTCACGT JGCACCGCAACTCAGCCCATTCAGCAGGAGTAGGAGCCATCTTTGATCGGTCCCTCTCCAGGGTGCTGA ICAGAGCTAGTGTCCAAAATGCGTGACATGAGGATGGACAAGACAGAGCTTGGCTGCCTGAGGGCAATC JATTCTGTTTAATCCAGATGCCAAGGGCCTCTCCAACCCTAGTGAGGTGGAGGTCCTGCGGGAGAAAGT IGTATGCATCACTGGAGACCTACTGCAAACAGAAGTACCCTGAGCAGCAGGGACGGTTTGCCAAGCTGC JTGCTACGTCTTCCTGCCCTCCGGTCCATTGGCCTTAAGTGTCTAGAGCATCTGTTTTTCTTCAAGCTC JATTGGTGACACCCCCATCGACACCTTCCTCATGGAGATGCTTGAGGCTCCCCATCAACTGGCCGTCGA jCGGC
NOV3a, CG189936-02 SEQ ID NO: 32 537 aa MW at 57364.7kD Protein Sequence
IMSWAARPPFLPQRHAAGQCGPVGVRKEMHCGVASRWRRRRP LDPAAAAAAAVAGGEQQTPΞPEPGEA IGRDGMGDSGRDSRSPDSSSPNPLPQGVPPPSPPGPPLPPSTAPSLGGSGAPPPPPMPPPPLGSPFPVI JSSSMGSPGLPPPAPPGFSGPVSSPQINSTVSLPGGGSGPPEDVKPPVLGVRGLHCPPPPGGPGAGKRL SCAICGDRSSG-CHYGVYSCEGCKGFFKRTIRKDLTYSCRDNKDCTVDKRQRNRCQYCRYQKCLATGMKR 1EAVQEERQRGKDKDGDGEGAGGAPE---MPVDRILEAELAVEQKSDQGVEGPGGTGGSGSSPNDPVTNIC QAADKQLFTLVE AKRIPHFSSLPLDDQVILLRAGWNELLIASFSHRSIDVRDGILLATGLHVHRNSA IHSAGVGAIFDRSLSRVLTELVSKIYLRDMRMDKTELGCLRAIILFNPDAKGLSNPSEVEVLRΞKVYASLE
ITYCKQKYPEQQGRFAKLLLRLPALRSIGLKCLΞHLFFFKLIGDTPIDTFL EMLEAPHQLA
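The relationship between the nucleotide listing and the encoded polypeptide in Table 3A can be checked by translating from the stated ORF start. The following is a minimal sketch using the standard genetic code; it is not the method used in the specification. The input is the first 61 nucleotides of the NOV3a DNA sequence, and the names `CODON_TABLE` and `translate` are introduced here for illustration.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, orf_start):
    """Translate from a 1-based ORF start until a stop codon or the end of the sequence."""
    residues = []
    for pos in range(orf_start - 1, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[pos:pos + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First 61 nucleotides of NOV3a (SEQ ID NO: 31); the ORF is stated to start at position 17.
nov3a_5prime = "CACCAGATCTCCCACCATGTCTTGGGCCGCTCGCCCGCCCTTCCTCCCTCAGCGGCATGCC"
peptide = translate(nov3a_5prime, 17)
print(peptide)
```

The translated fragment agrees with the N-terminus of the NOV3a protein sequence (SEQ ID NO: 32) listed in Table 3A.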
Further analysis of the NOV3a protein yielded the following properties shown in Table 3C.
Table 3C. Protein Sequence Properties NOV3a
SignalP analysis: No Known Signal Sequence Predicted
PSORT II analysis:
PSG: a new signal peptide prediction method N-region: length 6; pos.chg 1; neg.chg 0 H-region: length 6; peak value -7.06 PSG score: -11.46
GvH: von Heijne's method for signal seq. recognition GvH score (threshold: -2.1): -5.15 possible cleavage site: between 58 and 59
>>> Seems to have no N-terminal signal peptide
ALOM: Klein et al's method for TM region allocation Init position for calculation: 1
Tentative number of TMS(s) for the threshold 0.5: 0 number of TMS(s) .. fixed PERIPHERAL Likelihood = 3.45 (at 369) ALOM score: 3.45 (number of TMSs: 0)
MITDISC: discrimination of mitochondrial targeting seq R content: 3 Hyd Moment(75): 5.78 Hyd Moment(95): 7.59 G content: 3 D/E content: 1 S/T content: 1 Score: -3.26
Gavel: prediction of cleavage sites for mitochondrial preseq R-2 motif at 50 RRP|WL
NUCDISC: discrimination of nuclear localization signals pat4: RRRR (5) at 37 pat4: RRRP (4) at 38 pat7: none bipartite: RKEMHCGVASRWRRRRP at 25 content of basic residues: 11.9% NLS Score: 0.75
checking 63 PROSITE DNA binding motifs: Nuclear hormones receptors DNA-binding region signature (PS00031): found CAICGDRSSGKHYGVYSCEGCKGFFKR at 205
NNCN: Reinhardt's method for Cytoplasmic/Nuclear discrimination Prediction: nuclear Reliability: 89
Final Results (k = 9/23):
87.0 %: nuclear 13.0 %: mitochondrial
A search of the NOV3a protein against the Geneseq database, a proprietary database that contains sequences published in patents and patent publications, yielded several homologous proteins shown in Table 3D.
Figure imgf000065_0001
In a BLAST search of public sequence databases, the NOV3a protein was found to have homology to the proteins shown in the BLASTP data in Table 3E.
Figure imgf000065_0002
Figure imgf000066_0001
PFam analysis predicts that the NOV3a protein contains the domains shown in Table 3F. Specific amino acid residues of CG189936-02 for each domain are shown in column 2; equivalent domains in other NOV3 and CG189936 family proteins are also encompassed herein.
Figure imgf000066_0002
Example 4. NOV4, CG190229, Dihydrolipoamide branched chain transacylase.
These sequences are novel splice variants of Dihydrolipoamide branched chain transacylase. Dihydrolipoyl transacylase (acyltransferase, E2) is a component of the branched-chain alpha-keto acid dehydrogenase complex and has dihydrolipoyl dehydrogenase (E3) binding and lipoyl-bearing domains. Mutations in this enzyme cause a subset of maple syrup urine disease cases in the Ashkenazi Jewish population. The Dihydrolipoamide branched chain transacylase-like gene disclosed in this invention maps to chromosome 1. NOV4 clones were analyzed, and the nucleotide and encoded polypeptide sequences are shown in Table 4A.
Table 4A. NOV4 Sequence Analysis
NOV4a, CG190229-02 SEQ ID NO: 33 1276 bp DNA Sequence ORF Start: at 2 ORF Stop: at 1268
CACCGGATCCACCATGGCTGCAGTCCGTATGCTGAGAACCTGGAGCAGGAATGCGGGGAAGCTGATTT GTGTTCGCTATTTTCAAACATGTGGTAATGTTCATGTTTTGAAGCCAAATTATGTGTGTTTCTTTGGT TATCCTTCATTCAAGTATAGTCATCCACATCACTTCCTGAAAACAACTGCTGCTCTCCGTGGACAGGT
TGTTCAGTTCAAGCTCTCAGACATTGGAGAAGGGATTAGAGAAGTAACTGTTAAAGAATGGTATGTAA AGAAGGAGATACAGTGTCTCAGTTTGATAGCATCTGTGAAGTTCAAAGTGATAAAGCTTCTGTTACC ATCACTAGTCGTTATGATGGAGTCATTAAAAAACTCTATTATAATCTAGACGATATTGCCTATGTGGG GAAGCCATTAGTAGACATAGAAACGGAAGCTTTAAAAGATTCAGAAGAAGATGTTGTTGAAACTCCTG CAGTGTCTCATGATGAACATACACACCAAGAGATAAAGGGCCGAAAAACACTGGCAACTCCTGCAGTT ICGCCGTCTGGCAATGGAAAACAATATTAAGCTGAGTGAAGTTGTTGGCTCAGGAAAAGATGGCAGAAT JACTTAAAGAAGATATCCTCAACTATTTGGAAAAGCAGACAGGAGCTATATTGCCTCCTTCACCCAAAG ' TGAAATTATGCCACCTCCACCAAAGCCAAAAGACATGACTGTTCCTATACTAGTATCAAAACCTCCG JGTATTCACAGGCAAAGACAAAACAGAACCCATAAAAGGCTTTCAAAAAGCAATGGTCAAGACTATGTC JTGCAGCCCTGAAGATACCTCATTTTGGTTATTGTGATGAGATTGACCTTACTGAACTGGTTAAGCTCC .GAGAAGAATTAAAACCCATTGCATTTGCTCGTGGAATTAAACTCTCCTTTATGCCTTTCTTCTTAAAG IGCTGCTTCCTTGGGATTACTACAGTTTCCTATCCTTAACGCTTCTGTGGATGAAAACTGCCAGAATAT JAACATATAAGATTGGTGGTACCTTTGCCAAACCAGTGATAATGCCACCTGAAGTAGCCATTGGGGCCC JTTGGATCAATTAAGGCCATTCCCCGATTTAACCAGAAAGGAGAAGTATATAAGGCACAGATAATGAAT PTGAGCTGGTCAGCTGATCACAGAGTTATTGATGGTGCTACAATGTCACGCTTCTCCAATTTGTGGAA IATCCTATTTAGAAAACCCAGCTTTTATGCTACTAGATCTGAAACTCGAGGGC jNOV4a, CG190229-02 (SEQ ID NO: 34 422 aa iMW at 47147.3kD jProtein Sequence I
PGSTMAAVRMLRTWSRNAGKLICVRYFQTCGNVHVLKPNYVCFFGYPSFKYSHPHHFLKTTAALRGQV IVQFKLSDIGEGIREVTVKEWYVKEGDTVSQFDSICΞVQSDKASVTITSRYDGVIKKLYYNLDDIAYVG JKPLVDIETEALKDSΞEDWETPAVSHDEHTHQEIKGRKTLATPAVRRLAMΞNNIKLSEWGSGKDGRI J KΞDILNYLEKQTGAILPPSPKVEIMPPPPKPKDMTVPILVSKPPVFTGKDKTEPIKGFQKAMVKTMS IAALKIPHFGYCDΞIDLTELVKLREELKPIAFARGIKLSFMPFFLKAASLGLLQFPILNASVDENCQNI !ΤYKIGGTFAKPVIMPP---VAIGALGSIKAIPRBTXRQKGEVYK-AQIMNVSWSADHRVIDGAT SRFSNLWK .SYLENPAF LLDLK
NOV4b, CG190229-04 SEQ ID NO: 35 1390 bp DNA Sequence ORF Start: at 1 ORF Stop: at 1369
JATGGCTGCAGTCCGTATGCTGAGAACCTGGAGCAGGAATGCGGGGAAGCTGATTT GTGTTCGCTATTTTCAAACATGTGGTAATGTTCATGTTTTGAAGCCAAATTATGTGTGTTTCTTTGGT
JTATCCTTCATTCAAGTATAGTCATCCACATCACTTCCTGAAAACAACTGCTGCTCTCCGTGGACAGGT
-TGTTCAGTTCAAGCTCTCAGACATTGGAGAAGGGATTAGAGAAGTAACTGTTAAAGAATGGTATGTAA
SAAGAAGGAGATACAGTGTCTCAGTTTGATAGCATCTGTGAAGTTCAAAGTGATAAAGCTTCTGTTACC
JATCACTAGTCGTTATGATGGAGTCATTAAAAAACTCTATTATAATCTAGACGATATTGCCTATGTGGG
GAAGCCATTAGTAGACATAGAAACGGAAGCTTTAAAAGATTCAGAAGAAGATGTTGTTGAAACTCCTG
ICAGTGTCTCATGATGAACATACACACCAAGAGATAAAGGGCCGAAAAACACTGGCAACTCCTGCAGTT
|CGCCGTCTGGCAATGGAAAACAATATTAAGCTGAGTGAAGTTGTTGGCTCAGGAAAAGATGGCAGAAT
IACTTAAAGAAGATATCCTCAACTATTTGGAAAAGCAGACAGGAGCTATATTGCCTCCTTCACCCAAAG
'TTGAAATTATGCCACCTCCACCAAAGCCAAAAGACATGACTGTTCCTATACTAGTATCAAAACCTCCG
JGTATTCACAGGCAAAGACAAAACAGAACCCATAAAAGGCTTTCAAAAAGCAATGGTCAAGACTATGTC
ΪTGCAGCCCTGAAGATACCTCATTTTGGTTATTGTGATGAGATTGACCTTACTGAACTGGTTAAGCTCC
IGAGAAGAATTAAAACCCATTGCATTTGCTCGTGGAATTAAACTCTCCTTTATGCCTTTCTTCTTAAAG
IGCTTCTCATAACATTGGGATAGCAATGGATACTGAGCAGGGTTTGATTGTCCCTAATGTGAAAAATGT
JTCAGATCTGCTCTATATTTGACATCGCCACTGAACTGAACCGCCTCCAGAAATTGGGCTCTGTGGGTC
JAGCTCAGCACCACTGATCTTACAGGAGGAACATTTACTCTTTCCAACATTGGATCAATTGGTGGTACC
'JTTTGCCAAACCAGTGATAATGCCACCTGAAGTAGCCATTGGGGCCCTTGGATCAATTAAGGCCATTCC
SCCGATTTAACCAGAAAGGAGAAGTATATAAGGCACAGATAATGAATGTGAGCTGGTCAGCTGATCACA
JGAGTTATTGATGGTGCTACAATGTCACGCTTCTCCAATTTGTGGAAATCCTATTTAGAAAACCCAGCT
ITTTATGCTACTAGATCTGAAACTCGAGGGC
NOV4b, CG190229-04 SEQ ID NO: 36 460 aa MW about 50000.7kD Protein Sequence
MAAVRMLRTWSRNAGKLICVRYFQTCGNVHVLKPNYVCFFGYPSFKYSHPHHFLKTTAALRGQV
VQFKLSDIGEGIREVTVKEWYVKEGDTVSQFDSICEVQSDKASVTITSRYDGVIKKLYYNLDDIAYVG
KPLVDIETEALKDSEEDVVETPAVSHDEHTHQEIKGRKTLATPAVRRLAMENNIKLSEVVGSGKDGRI
LKEDILNYLEKQTGAILPPSPKVEIMPPPPKPKDMTVPILVSKPPVFTGKDKTEPIKGFQKAMVKTMS
AALKIPHFGYCDEIDLTELVKLREELKPIAFARGIKLSFMPFFLKASHNIGIAMDTEQGLIVPNVKNV
QICSIFDIATELNRLQKLGSVGQLSTTDLTGGTFTLSNIGSIGGTFAKPVIMPPEVAIGALGSIKAIP
RFNQKGEVYKAQIMNVSWSADHRVIDGATMSRFSNLWKSYLENPAFMLLDLK
A ClustalW comparison of the above protein sequences yields the following sequence alignment shown in Table 4B.
Table 4B. Comparison of the NOV4 protein sequences.
NOV4a K--^VRMLRTWSRNAGKLICVRYFQTCGNVHVLKPNYVCFFGYPSFKYSHPHHFLKT N0V4b ]y--AAVRMLRTWSRNAGKLICVRYFQTCGNVHVLKPNYVCFFGYPSFKYSHPHHFLKT
NOV4a TAALRGQVVQFKLSDIGEGIREVTVKEWYVKEGDTVSQFDSICEVQSDKASVTITSRYDG
NOV4b TAALRGQVVQFKLSDIGEGIREVTVKEWYVKEGDTVSQFDSICEVQSDKASVTITSRYDG
NOV4a VIKKLYYNLDDIAYVGKPLVDIETEALKDSEEDVVETPAVSHDEHTHQEIKGRKTLATPA
NOV4b VIKKLYYNLDDIAYVGKPLVDIETEALKDSEEDVVETPAVSHDEHTHQEIKGRKTLATPA
NOV4a VRRLAMENNIKLSEVVGSGKDGRILKEDILNYLEKQTGAILPPSPKVEIMPPPPKPKDMT
NOV4b VRRLAMENNIKLSEVVGSGKDGRILKEDILNYLEKQTGAILPPSPKVEIMPPPPKPKDMT
NOV4a VPILVSKPPVFTGKDKTEPIKGFQKAMVKTMSAALKIPHFGYCDEIDLTELVKLREELKP
NOV4b VPILVSKPPVFTGKDKTEPIKGFQKAMVKTMSAALKIPHFGYCDEIDLTELVKLREELKP
NOV4a IAFARGIKLSFMPFFLKAAS
NOV4b IAFARGIKLSFMPFFLKAS HNIGIAMDTEQGLIV
NOV4a LGLLQFPILNASVD ENCQN ITYKIGGTFAKPVIMPP
NOV4b PNVKNVQICSIFDIATELNRLQKLGSVGQLSTTDLTGGTFTLSNIGSIGGTFAKPVIMPP
NOV4a EVAIGALGSIKAIPRFNQKGEVYKAQIMNVSWSADHRVIDGATMSRFSNLWKSYLENPAF
NOV4b EVAIGALGSIKAIPRFNQKGEVYKAQIMNVSWSADHRVIDGATMSRFSNLWKSYLENPAF
NOV4a MLLDLK
NOV4b MLLDLK
NOV4a (SEQ ID NO: 34)
NOV4b (SEQ ID NO: 36)
Further analysis of the NOV4a protein yielded the following properties shown in Table 4C.
Table 4C. Protein Sequence Properties NOV4a
SignalP analysis: No Known Signal Sequence Predicted
PSORT II analysis:
PSG: a new signal peptide prediction method N-region: length 9; pos.chg 1; neg.chg 0 H-region: length 2; peak value -10.23 PSG score: -14.63
GvH: von Heijne's method for signal seq. recognition GvH score (threshold: -2.1): -6.09 possible cleavage site: between 15 and 16
>>> Seems to have no N-terminal signal peptide
ALOM: Klein et al's method for TM region allocation Init position for calculation: 1
Tentative number of TMS(s) for the threshold 0.5: 0 number of TMS(s) .. fixed PERIPHERAL Likelihood = 1.70 (at 313) ALOM score: 1.70 (number of TMSs: 0)
MITDISC: discrimination of mitochondrial targeting seq R content: 5 Hyd Moment(75): 0.44 Hyd Moment(95): 1.83 G content: 5 D/E content: 1 S/T content: 11 Score: -0.78
Gavel: prediction of cleavage sites for mitochondrial preseq R-2 motif at 75 LRG|QV
NUCDISC: discrimination of nuclear localization signals pat4: none pat7: none bipartite: none content of basic residues: 13.0% NLS Score: -0.47
NNCN: Reinhardt's method for Cytoplasmic/Nuclear discrimination Prediction: cytoplasmic Reliability: 89
Final Results (k = 9/23):
65.2 %: mitochondrial
17.4 %: cytoplasmic
8.7 %: nuclear
8.7 %: vesicles of secretory system
A search of the NOV4a protein against the Geneseq database, a proprietary database that contains sequences published in patents and patent publications, yielded several homologous proteins shown in Table 4D.
Figure imgf000069_0001
In a BLAST search of public sequence databases, the NOV4a protein was found to have homology to the proteins shown in the BLASTP data in Table 4E.
Figure imgf000070_0001
PFam analysis predicts that the NOV4a protein contains the domains shown in Table 4F. Specific amino acid residues of CG190229-02 for each domain are shown in column 2; equivalent domains in other NOV4 and CG190229 family proteins are also encompassed herein.
Figure imgf000070_0002
Example 5. NOV5
The NOV5 clone was analyzed, and the nucleotide and encoded polypeptide sequences are shown in Table 5A.
Table 5A. NOV5 Sequence Analysis
NOV5a, CG194245-03 SEQ ID NO: 37 1906 bp DNA Sequence ORF Start: ATG at 11 ORF Stop: at 1898
CACCGGATCCATGCTTTCCGCCATCTACACAGTCCTGGCGGGACTGCTGTTCCTGCCGCTCCTGGTGA ACCTCTGCTGCCCATACTTCTTCCAGGACATAGGCTACTTCTTGAAGGTGGCCGCCGTGGGCCGGAGG (GTGCGCAGCTACGGGAAGCGGCGGCCGGCGCGCACCATCCTGCGGGCGTTCCTGGAGAAAGCGCGCCA IGACGCCACACAAGCCTTTTCTGCTCTTCCGCGACGAGACTCTCACCTACGCGCAGGTGGACCGGCGCA JGCAATCAAGTGGCCCGGGCGCTGCACGACCACCTCGGCCTGCGCCAGGGAGACTGCGTGGCGCTCCTT LATGGGTAACGAGCCGGCCTACGTGTGGCTGTGGCTGGGGCTGGTGAAGCTGGGCTGTGCCATGGCGTG .CCTCAATTACAACATCCGCGCGAAGTCCCTGCTGCACTGCTTCCAGTGCTGCGGGGCGAAGGTGCTGC ITGGTGTCGCCAGAACTACAAGCAGCTGTCGAAGAGATACTGCCAAGCCTTAAAAAAGATGATGTGTCC JATCTATTATGTGAGCAGAACTTCTAACACAGATGGGATTGACTCTTTCCTGGACAAAGTGGATGAAGT
|ATCAACTGAACCTATCCCAGAGTCATGGAGGTCTGAAGTCACTTTTTCCACTCCTGCCTTATACATTT
SATACTTCTGGAACCACAGGTCTTCCAAAAGCAGCCATGATCACTCATCAGCGCATATGGTATGGAACT JGGCCTCACTTTTGTAAGCGGATTGAAGGCAGATGATGTCATCTATATCACTCTGCCCTTTTACCACAG ITGCTGCACTACTGATTGGCATTCACGGATGTATTGTGGCTGGTAAGCTTTTTCTACAAAATGTTGGAG IGTGCTACTCTTGCCTTGCGGACTAAATTTTCAGCCAGCCAGTTTTGGGATGACTGCAGAAAATACAAC SGTCACTGTCATTCAGTATATCGGTGAACTGCTTCGGTATTTATGCAACTCACCACAGAAACCAAATGA ΪCCGTGATCATAAAGTGAGACTGGCACTGGGAAATGGCTTACGAGGAGATGTGTGGAGACAATTTGTCA !AGAGATTTGGGGACATATGCATCTATGAGTTCTATGCTGCCACTGAAGGCAATATTGGATTTATGAAT JTATGCGAGAAAAGTTGGTGCTGTTGGAAGAGTAAACTACCTACAGAAAAAAATCATAACTTATGACCT GATTAAATATGATGTGGAGAAAGATGAACCTGTCCGTGATGAAAATGGATATTGCGTCAGAGTTCCCA 'AAGGTGAAGTTGGACTTCTGGTTTGCAAAATCACACAACTTACACCATTTAATGGCTATGCTGGAGCA
IAAGGCTCAGACAGAGAAGAAAAAACTGAGAGATGTCTTTAAGAAAGGAGACCTCTATTTCAACAGTGG
JAGATCTCTTAATGGTTGACCATGAAAATTTCATCTATTTCCACGACAGAGTTGGAGATACATTCCGGT IGGAAAGGGGAAAATGTGGCCACCACTGAAGTTGCTGATACAGTTGGACTGGTTGATTTTGTCCAAGAA ΪGTAAATGTTTATGGAGTGCATGTGCCAGATCATGAGGGTCGCATTGGCATGGCCTCCATCAAAATGAA !AGAAAACCATGAATTTGATGGAAAGAAACTCTTTCAGCACATTGCTGATTACCTACCTAGTTATGCAA JGGCCCCGGTTTCTAAGAATACAGGACACCATTGAGATCACTGGAACTTTTAAACACCGCAAAATGACC JCTGGTGGAGGAGGGCTTTAACCCTGCTGTCATCAAAGATGCCTTGTATTTCTTGGATGACACAGCAAA SAATGTATGTGCCTATGACTGAGGACATCTATAATGCCATAAGTGCTAAAACCCTGAAACTCGTCGACG
!GC
NOV5a, CG194245-03 SEQ ID NO: 38 629 aa MW at 71268.6kD Protein Sequence
IMLSAIYTVLAGLLFLPLLVNLCCPYFFQDIGYFLKVAAVGRRVRSYGKRRPARTILRAFLEKARQTPH PFLLFRDETLTYAQVDRRSNQVARALHDHLGLRQGDCVALLMGNEPAYVWL LGLVKLGCAMACLNY INIRAKSLLHCFQCCGAKVLLVSPELQAAVEEILPSLKKDDVSIYYVSRTSNTDGIDSFLDKVDΞVSTE JPIPESWRSEVTFSTPALYIYTSGTTGLPKAAMITHQRIWYGTGLTFVSGLKADDVIYITLPFYHSAAL ΪLIGIHGCIVAGKLFLQNVGGATLALRTKFSASQF DDCRKYNVTVIQYIGELLRYLCNSPQKPNDRDH fKVRLALGNGLRGDVWRQFVKRFGDICIYEFYAATEGNIGFMNYARKVGAVGRVNYLQKKIITYDLIKY jDVEKDEPVRDΞNGYCVRVPKGEVGLLVCKITQLTPFNGYAGAKAQTEKKKLRDVFKKGDLYFNSGDLL J VDHENFIYFHDRVGDTFRWKGENVATTEVADTVGLVDFVQEVNVYGVHVPDHEGRIG ASIKMKΞNH lEFDGKKLFQHIADYLPSYARPRFLRIQDTIΞITGTFKHRKMTLVEEGFNPAVIKDALYFLDDTAKMYV ΪPMTEDIYNA SAKTLKL
NOV5b, C99.877 SEQ ID NO: 39 1906 bp DNA Sequence ORF Start: ATG at 11 ORF Stop: at 1898
CACCGGATCCATGCTTTCCGCCATCTACACAGTCCTGGCGGGACTGCTGTTCCTGCCGCTCCTGGTGA
ACCTCTGCTGCCCATACTTCTTCCAGGACATAGGCTACTTCTTGAAGGTGGCCGCCGTGGGCCGGAGG GTGCGCAGCTACGGGAAGCGGCGGCCGGCGCGCACCATCCTGCGGGCGTTCCTGGAGAAAGCGCGCCA GACGCCACACAAGCCTTTTCTGCTCTTCCGCGACGAGACTCTCACCTACGCGCAGGTGGACCGGCGCA GCAATCAAGTGGCCCGGGCGCTGCACGACCACCTCGGCCTGCGCCAGGGAGACTGCGTGGCGCTCCTT ATGGGTAACGAGCCGGCCTACGTGTGGCTGTGGCTGGGGCTGGTGAAGCTGGGCTGTGCCATGGCGTG CCTCAATTACAACATCCGCGCGAAGTCCCTGCTGCACTGCTTCCAGTGCTGCGGGGCGAAGGTGCTGC TGGTGTCGCCAGAACTACAAGCAGCTGTCGAAGAGATACTGCCAAGCCTTAAAAAAGATGATGTGTCC ATCTATTATGTGAGCAGAACTTCTAACACAGATGGGATTGACTCTTTCCTGGACAAAGTGGATGAAGT ATCAACTGAACCTATCCCAGAGTCATGGAGGTCTGAAGTCACTTTTTCCACTCCTGCCTTATACATTT JATACTTCTGGAACCACAGGTCTTCCAAAAGCAGCCATGATCACTCATCAGCGCATATGGTATGGAACT IGGCCTCACTTTTGTAAGCGGATTGAAGGCAGATGATGTCATCTATATCACTCTGCCCTTTTACCACAG JTGCTGCACTACTGATTGGCATTCACGGATGTATTGTGGCTGGTAAGCTTTTTCTACAAAATGTTGGAG IGTGCTACTCTTGCCTTGCGGACTAAATTTTCAGCCAGCCAGTTTTGGGATGACTGCAGAAAATACAAC JGTCACTGTCATTCAGTATATCGGTGAACTGCTTCGGTATTTATGCAACTCACCACAGAAACCAAATGA ICCGTGATCATAAAGTGAGACTGGCACTGGGAAATGGCTTACGAGGAGATGTGTGGAGACAATTTGTCA AGAGATTTGGGGACATATGCATCTATGAGTTCTATGCTGCCACTGAAGGCAATATTGGATTTATGAAT JTATGCGAGAAAAGTTGGTGCTGTTGGAAGAGTAAACTACCTACAGAAAAAAATCATAACTTATGACCT IGATTAAATATGATGTGGAGAAAGATGAACCTGTCCGTGATGAAAATGGATATTGCGTCAGAGTTCCCA 'AAGGTGAAGTTGGACTTCTGGTTTGCAAAATCACACAACTTACACCATTTAATGGCTATGCTGGAGCA JAAGGCTCAGACAGAGAAGAAAAAACTGAGAGATGTCTTTAAGAAAGGAGACCTCTATTTCAACAGTGG 'AGATCTCTTAATGGTTGACCATGAAAATTTCATCTATTTCCACGACAGAGTTGGAGATACATTCCGGT IGGAAAGGGGAAAATGTGGCCACCACTGAΆGTTGCTGATACAGTTGGACTGGTTGATTTTGTCCAAGAA
(GTAAATGTTTATGGAGTGCATGTGCCAGATCATGAGGGTCGCATTGGCATGGCCTCCATCAAAATGAA lAGAAAACCATGAATTTGATGGAAAGAAACTCTTTCAGCACATTGCTGATTACCTACCTAGTTATGCAA fGGCCCCGGTTTCTAAGAATACAGGACACCATTGAGATCACTGGAACTTTTAAATACCGCAAAATGACC CTGGTGGAGGAGGGCTTTAACCCTGCTGTCATCAAAGATGCCTTGTATTTCTTGGATGACACAGCAAA AATGTATGTGCCTATGACTGAGGACATCTATAATGCCATAAGTGCTAAAACCCTGAAACTCGTCGACG GC
NOV5b, C99.877 SEQ ID NO: 40 629 aa MW at 71294.7kD Protein Sequence
MLSAIYTVLAGLLFLPLLVNLCCPYFFQDIGYFLKVAAVGRRVRSYGKRRPARTILRAFLEKARQTPH KPFLLFRDETLTYAQVDRRSNQVARALHDHLGLRQGDCVALLMGNEPAYVWLWLGLVKLGCAMACLNY NIRAKSLLHCFQCCGAKVLLVSPELQAAVEEILPSLKKDDVSIYYVSRTSNTDGIDSFLDKVDEVSTE PIPES RSEVTFSTPALYIYTSGTTGLPKAAMITHQRI YGTGLTFVSGLKADDVIYITLPFYHSAAL LIGIHGCIVAGKLFLQNVGGATLALRTKFSASQFWDDCRKYNVTVIQYIGELLRYLCNSPQKPNDRDH KVRLALGNGLRGDVWRQFVKRFGDICIYEFYAATEGNIGF NYARKVGAVGRVNYLQKKIITYDLIKY DVEKDEPVRD---NGYCVRVPKGEVGLLVCKITQLTPFNGYAGAKAQTEKKKLRDVFKKGDLYFNSGDLL VDHE-NFIYFHDRVGDTFRWKG---NVATT--IVADTVGLVDFVQEVNVYGVHVPDHΞGRIGMASIKMKENH EFDGKKLFQHIADYLPSYARPRFLRIQDTIEITGTFKYRKMTLVEEGFNPAVIKDALYFLDDTAKMYV P TΞD YNAISAKTLKL jNOV5c, CG194245 SEQ ID NO: 41 1906 bp (DNA Sequence ORF Start: ATG ORF Stop: at 1898 at 11 jCACCGGATCCATGCTTTCCGCCATCTACACAGTCCTGGCGGGACTGCTGTTCCTGCCGCTCCTGGTGA
ACCTCTGCTGCCCATACTTCTTCCAGGACATAGGCTACTTCTTGAAGGTGGCCGCCGTGGGCCGGAGG GTGCGCAGCTACGGGAAGCGGCGGCCGGCGCGCACCATCCTGCGGGCGTTCCTGGAGAAAGCGCGCCA GACGCCACACAAGCCTTTTCTGCTCTTCCGCGACGAGACTCTCACCTACGCGCAGGTGGACCGGCGCA GCAATCAAGTGGCCCGGGCGCTGCACGACCACCTCGGCCTGCGCCAGGGAGACTGCGTGGCGCTCCTT LATGGGTAACGAGCCGGCCTACGTGTGGCTGTGGCTGGGGCTGGTGAAGCTGGGCTGTGCCATGGCGTG CTCAATTACAACATCCGCGCGAAGTCCCTGCTGCACTGCTTCCAGTGCTGCGGGGCGAAGGTGCTGC TGGTGTCGCCAGAACTACAAGCAGCTGTCGAAGAGATACTGCCAAGCCTTAAAAAAGATGATGTGTCC .ATCTATTATGTGAGCAGAACTTCTAACACAGATGGGATTGACTCTTTCCTGGACAAAGTGGATGAAGT JATCAACTGAACCTATCCCAGAGTCATGGAGGTCTGAAGTCACTTTTTCCACTCCTGCCTTATACATTT ATACTTCTGGAACCACAGGTCTTCCAAAAGCAGCCATGATCACTCATCAGCGCATATGGTATGGAACT GGCCTCACTTTTGTAAGCGGATTGAAGGCAGATGATGTCATCTATATCACTCTGCCCTTTTACCACAG TGCTGCACTACTGATTGGCATTCACGGATGTATTGTGGCTGGTAAGCTTTTTCTACAAAATGTTGGAG GTGCTACTCTTGCCTTGCGGACTAAATTTTCAGCCAGCCAGTTTTGGGATGACTGCAGAAAATACAAC GTCACTGTCATTCAGTATATCGGTGAACTGCTTCGGTATTTATGCAACTCACCACAGAAACCAAATGA JCCGTGATCATAAAGTGAGACTGGCACTGGGAAATGGCTTACGAGGAGATGTGTGGAGACAATTTGTCA AGAGATTTGGGGACATATGCATCTATGAGTTCTATGCTGCCACTGAAGGCAATATTGGATTTATGAAT TATGCGAGAAAAGTTGGTGCTGTTGGAAGAGTAAACTACCTACAGAAAAAAATCATAACTTATGACCT GATTAAATATGATGTGGAGAAAGATGAACCTGTCCGTGATGAAAATGGATATTGCGTCAGAGTTCCCA JAAGGTGAAGTTGGACTTCTGGTTTGCAAAATCACACAACTTACACCATTTAATGGCTATGCTGGAGCA JAAGGCTCAGACAGAGAAGAAAAAACTGAGAGATGTCTTTAAGAAAGGAGACCTCTATTTCAACAGTGG JAGATCTCTTAATGGTTGACCATGAAAATTTCATCTATTTCCACGACAGAGTTGGAGATACATTCCGGT .GGAAAGGGGAAAATGTGGCCACCACTGAAGTTGCTGATACAGTTGGACTGGTTGATTTTGTCCAAGAA JGTAAATGTTTATGGAGTGCATGTGCCAGATCATGAGGGTCGCATTGGCATGGCCTCCATCAAAATGAA :AGAAAACCATGAATTTGATGGAAAGAAACTCTTTCAGCACATTGCTGATTACCTACCTAGTTATGCAA
GGCCCCGGTTTCTAAGAATACAGGACACCATTGAGATCACTGGAACTTTTAAAX1ACCGCAAAATGACC
CTGGTGGAGGAGGGCTTTAACCCTGCTGTCATCAAAGATGCCTTGTATTTCTTGGATGACACAGCAAA
AATGTATGTGCCTATGACTGAGGACATCTATAATGCCATAAGTGCTAAAACCCTGAAACTCGTCGACG
GC
Wherein X1 is C or T.
NOV5c, CG194245 SEQ ID NO: 42 629 aa MW at 71268.6kD Protein Sequence
J LSAIYTVLAGLLFLPLLVNLCCPYFFQDIGYFLKVAAVGRRVRSYGKRRPARTILRAFLEKARQTPH JKPFLLFRDETLTYAQVDRRSNQVARALHDHLGLRQGDCVALLMGNEPAYVWLWLGLVKLGCAMACLNY JNIRAKSLLHCFQCCGAKVLLVSPΞLQAAVEΞILPSLKKDDVSIYYVSRTSNTDGIDSFLDKVDEVSTE "PIPESWRSEVTFSTPALYIYTSGTTGLPKAA ITHQRIWYGTGLTFVSGLKADDVIYITLPFYHSAAL ILIGIHGCIVAGKLFLQNVGGATLALRTKFSASQF DDCRKYNVTVIQYIGELLRYLCNSPQKPNDRDH JKVRLALGNGLRGDVWRQFVKRFGDICIYΞFYAATEGNIGFJY-NYARKVGAVGRVNYLQKKIITYDLIKY JDVEKDEPVRDENGYCVRVPKGEVGLLVCKITQLTPFNGYAGAKAQTEKKKLRDVFKKGDLYFNSGDLL JMVDH--3STFIYFHDRVGDTFR GENVATTWADTVGLVDFVQEVNVYGV^^
(EFDGKKLFQHIADYLPSYARPRFLRIQDTIEITGTFKZJ^RKMTLVEEGFNPAVIKDALYFLDDTAKMYV JPMTEDIYNAISAKTLKL
Wherein Z1 is H or Y.
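The relationship between the nucleotide placeholder X1 (C or T) and the amino acid placeholder Z1 (H or Y) can be illustrated with a short sketch. It assumes that X1 is the first base of a codon whose remaining bases are AC, as suggested by the corresponding TAC codon in SEQ ID NO: 39; the function name and the two-entry codon table are illustrative only.

```python
# Only the two codons needed for this illustration (standard genetic code).
CODON_TO_AA = {"CAC": "H", "TAC": "Y"}


def expand_ambiguous_codon(codon, ambiguity):
    """Return every concrete codon obtained by substituting placeholder bases.

    `ambiguity` maps a placeholder character to the bases it may stand for.
    """
    results = [""]
    for base in codon:
        choices = ambiguity.get(base, base)
        results = [prefix + choice for prefix in results for choice in choices]
    return results


# "X" stands for X1 (C or T); the trailing "AC" is an assumption about the
# reading frame, taken from the TAC codon at this position in SEQ ID NO: 39.
codons = expand_ambiguous_codon("XAC", {"X": "CT"})
residues = [CODON_TO_AA[c] for c in codons]
print(codons, residues)
```

Under these assumptions, X1 = C gives histidine and X1 = T gives tyrosine, which agrees with the statement that Z1 is H or Y.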
A ClustalW comparison of the above protein sequences yields the following sequence alignment shown in Table 5B.
Table 5B. Comparison of the NOV5 protein sequences.
NOV5a MLSAIYTVLAGLLFLPLLVNLCCPYFFQDIGYFLKVAAVGRRVRSYGKRRPARTILRAFL
NOV5b MLSAIYTVLAGLLFLPLLVNLCCPYFFQDIGYFLKVAAVGRRVRSYGKRRPARTILRAFL

NOV5a EKARQTPHKPFLLFRDETLTYAQVDRRSNQVARALHDHLGLRQGDCVALLMGNEPAYVWL
NOV5b EKARQTPHKPFLLFRDETLTYAQVDRRSNQVARALHDHLGLRQGDCVALLMGNEPAYVWL

NOV5a WLGLVKLGCAMACLNYNIRAKSLLHCFQCCGAKVLLVSPELQAAVEEILPSLKKDDVSIY
NOV5b WLGLVKLGCAMACLNYNIRAKSLLHCFQCCGAKVLLVSPELQAAVEEILPSLKKDDVSIY

NOV5a YVSRTSNTDGIDSFLDKVDEVSTEPIPESWRSEVTFSTPALYIYTSGTTGLPKAAMITHQ
NOV5b YVSRTSNTDGIDSFLDKVDEVSTEPIPESWRSEVTFSTPALYIYTSGTTGLPKAAMITHQ

NOV5a RIWYGTGLTFVSGLKADDVIYITLPFYHSAALLIGIHGCIVAGKLFLQNVGGATLALRTK
NOV5b RIWYGTGLTFVSGLKADDVIYITLPFYHSAALLIGIHGCIVAGKLFLQNVGGATLALRTK

NOV5a FSASQFWDDCRKYNVTVIQYIGELLRYLCNSPQKPNDRDHKVRLALGNGLRGDVWRQFVK
NOV5b FSASQFWDDCRKYNVTVIQYIGELLRYLCNSPQKPNDRDHKVRLALGNGLRGDVWRQFVK

NOV5a RFGDICIYEFYAATEGNIGFMNYARKVGAVGRVNYLQKKIITYDLIKYDVEKDEPVRDEN
NOV5b RFGDICIYEFYAATEGNIGFMNYARKVGAVGRVNYLQKKIITYDLIKYDVEKDEPVRDEN

NOV5a GYCVRVPKGEVGLLVCKITQLTPFNGYAGAKAQTEKKKLRDVFKKGDLYFNSGDLLMVDH
NOV5b GYCVRVPKGEVGLLVCKITQLTPFNGYAGAKAQTEKKKLRDVFKKGDLYFNSGDLLMVDH

NOV5a ENFIYFHDRVGDTFRWKGENVATTEVADTVGLVDFVQEVNVYGVHVPDHEGRIGMASIKM
NOV5b ENFIYFHDRVGDTFRWKGENVATTEVADTVGLVDFVQEVNVYGVHVPDHEGRIGMASIKM

NOV5a KENHEFDGKKLFQHIADYLPSYARPRFLRIQDTIEITGTFKHRKMTLVEEGFNPAVIKDA
NOV5b KENHEFDGKKLFQHIADYLPSYARPRFLRIQDTIEITGTFKYRKMTLVEEGFNPAVIKDA

NOV5a LYFLDDTAKMYVPMTEDIYNAISAKTLKL
NOV5b LYFLDDTAKMYVPMTEDIYNAISAKTLKL

NOV5a (SEQ ID NO: 40)
NOV5b (SEQ ID NO: 42)
Further analysis of the NOV5a protein yielded the following properties shown in Table 5C.
Table 5C. Protein Sequence Properties NOV5a
SignalP analysis: Cleavage site between residues 25 and 26
PSORT II analysis:
PSG: a new signal peptide prediction method
N-region: length 0; pos.chg 0; neg.chg 0
H-region: length 28; peak value 10.51
PSG score: 6.11
GvH: von Heijne's method for signal seq. recognition
GvH score (threshold: -2.1): -1.72
possible cleavage site: between 24 and 25
>>> Seems to have a cleavable signal peptide (1 to 24)
ALOM: Klein et al's method for TM region allocation
Init position for calculation: 25
Tentative number of TMS(s) for the threshold 0.5: 2
INTEGRAL Likelihood = -2.39 Transmembrane 118 - 134
INTEGRAL Likelihood = -3.50 Transmembrane 271 - 287
PERIPHERAL Likelihood = 2.01 (at 142)
ALOM score: -3.50 (number of TMSs: 2)
MTOP: Prediction of membrane topology (Hartmann et al.)
Center position for calculation: 12
Charge difference: 1.0 C( 2.0) - N( 1.0)
C > N: C-terminal side will be inside
>>> Caution: Inconsistent mtop result with signal peptide
>>> membrane topology: type 3b
MITDISC: discrimination of mitochondrial targeting seq
R content: 0 Hyd Moment(75): 0.63
Hyd Moment(95): 3.53 G content: 1
D/E content: 1 S/T content: 2
Score: -5.93
Gavel: prediction of cleavage sites for mitochondrial preseq
R-2 motif at 74 ARQ|TP
NUCDISC: discrimination of nuclear localization signals
pat4: KRRP (4) at 48
pat4: KHRK (3) at 581
pat7: none
bipartite: none
content of basic residues: 12.6%
NLS Score: -0.03
ER Membrane Retention Signals:
KKXX-like motif in the C-terminus: KTLK
NNCN: Reinhardt's method for Cytoplasmic/Nuclear discrimination
Prediction: cytoplasmic
Reliability: 94.1
Final Results (k = 9/23):
44.4 %: endoplasmic reticulum
11.1 %: vacuolar
11.1 %: Golgi
11.1 %: nuclear
11.1 %: cytoplasmic
11.1 %: mitochondrial
A search of the NOV5a protein against the Geneseq database, a proprietary database that contains sequences published in patents and patent publications, yielded several homologous proteins, shown in Table 5D.
Figure imgf000075_0001
In a BLAST search of public sequence databases, the NOV5a protein was found to have homology to the proteins shown in the BLASTP data in Table 5E.
Figure imgf000075_0002
Figure imgf000076_0001
PFam analysis predicts that the NOV5a protein contains the domains shown in Table 5F. Specific amino acid residues of CG194245-03 for each domain are shown in column 2; equivalent domains in other NOV5 and CG194245 family proteins are also encompassed herein.
Table 5F. Domain Analysis of NOV5a
Pfam Domain: AMP-binding
NOV5a Match Region: 80..521
Identities/Similarities for the Matched Region: 112/451 (25%) / 307/451 (68%)
Expect Value: 1.7e-78
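The percentages reported for the AMP-binding domain match follow directly from the residue counts. As a minimal sketch (the helper name is illustrative, and rounding to the nearest whole percent is an assumption about how the table was prepared):

```python
def percent_of_region(matching_positions, region_length):
    """Whole-number percentage of matched positions within an aligned region."""
    return round(100 * matching_positions / region_length)


# Counts taken from Table 5F (AMP-binding domain, 451 aligned positions).
identity_pct = percent_of_region(112, 451)
similarity_pct = percent_of_region(307, 451)
print(identity_pct, similarity_pct)
```

With the counts from Table 5F this gives 25 and 68, in agreement with the tabulated percentages.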
Example 6. NOV6, CG196732, NM_21797 like.
The NOV6 clone was analyzed, and the nucleotide and encoded polypeptide sequences are shown in Table 6A.
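The ORF coordinates and polypeptide lengths listed in Table 6A can be cross-checked against one another. The sketch below assumes 1-based coordinates in which "ORF Start" is the first base of the ATG and "ORF Stop" is the first base after the last coding codon; this convention is inferred from the tabulated numbers, not stated in the text, and the function name is illustrative.

```python
def orf_residue_count(orf_start, orf_stop):
    """Number of codons between the ORF start and the ORF stop position.

    Assumes orf_start is the first base of the start codon and orf_stop is
    the first base after the last coding codon (both 1-based).
    """
    span = orf_stop - orf_start
    if span <= 0 or span % 3 != 0:
        raise ValueError("ORF span is not a positive multiple of three")
    return span // 3


# (ORF start, ORF stop, reported length in aa) as listed in Table 6A.
table_6a = {
    "NOV6a": (157, 1261, 368),
    "NOV6b": (104, 1532, 476),
    "NOV6c": (14, 1118, 368),
}

for clone, (start, stop, reported_aa) in table_6a.items():
    print(clone, orf_residue_count(start, stop), reported_aa)
```

Under this convention each computed length equals the reported one, for example (1261 - 157) / 3 = 368 residues for NOV6a.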
Table 6A. NOV6 Sequence Analysis
NOV6a, CG196732-01 SEQ ID NO: 43 1354 bp DNA Sequence ORF Start: ATG at 157 ORF Stop: TAA at 1261
GAAACCTCCTCGTCTGTGCACGAACAGGTGGCCGACTCTGGAGCCCAGGCTGTTGCTTTCCAGTCTGG TCGTGAATCCTCCATAGTCTGGAACAGCCAGCTGAAAACTCTCCTGGCCATTGGAGGCTGGAACTTCA
GGACTGCCCCTTTCACTGCCATGGTTTCTACTCCTGAGAACCGCCAGACTTTCATCACCTCAGTCATC
AAATTCCTGCGCCAGTATGAGTTTGACGGGCTGGACTTTGACTGGGAGTACCCTGGCTCTCGTGGGAG CCCTCCTCAGGACAAGCATCTCTTCACTGTCCTGGTGCAGGAAATGCGTGAAGCTTTTGAGCAGGAGG CCAAGCAGATCAACAAGCCCAGGCTGATGGTCACTGCTGCAGTAGCTGCTGGCATCTCCAATATCCAG TCTGGCTATGAGATCCCCCAACTGTCACAGTACCTGGACTACATCCATGTCATGACCTACGACCTCCA TGGCTCCTGGGAGGGCTACACTGGAGAGAACAGCCCCCTCTACAAATACCCGACTGACACCGGCAGCA ACGCCTACCTCAATGTGGATTATGTCATGAACTACTGGAAGGACAATGGAGCACCAGCTGAGAAGCTC ATCGTTGGATTCCCTACCTATGGACACAACTTCATCCTGAGCAACCCCTCCAACACTGGAATTGGTGC CCCCACCTCTGGTGCTGGTCCTGCTGGGCCCTATGCCAAGGAGTCTGGGATCTGGGCTTACTACGAGA TCTGTACCTTCCTGAAAAATGGAGCCACTCAGGGATGGGATGCCCCTCAGGAAGTGCCTTATGCCTAT CAGGGCAATGTGTGGGTTGGCTATGACAACGTCAAGAGCTTCGATATTAAGGCTCAATGGCTTAAGCA JCAACAAATTTGGAGGCGCCATGGTCTGGGCCATTGATCTGGATGACTTCACTGGCACTTTCTGCAACC JAGGGCAAGTTTCCCCTAATCTCCACCCTGAAGAAGGCCCTTGGCCTGCAGAGTGCAAGTTGCACGGCT JCCAGCTCAGCCCATTGAGCCAATAACTGCTGCTCCCAGTGGCAGCGGGAACGGGAGCGGGAGTAGCAG JCTCTGGAGGCAGCTCGGGAGGCAGTGGATTCTGTGCTGGCAGAGCCAACGGCCTCTACCCCGTGGCAA ATAACAGAAATGCCTTCTGGCACTGCGTGAATGGAGTCACGTACCAGCAGAACTGCCAGGCCGGGCTT sGTCTTCGACACCAGCTGTGATTGCTGCAAGTGGGrATAAACCTGACCTGGTCTATATTCCCTAGAGTT
CCAGTCTCTTTTGCTTAGGACATGTTGCCCCTACCTAAAGTCCTGCAATAAAATCAGCAGTC
!NOV6a, CG196732-01 SEQ ID NO: 44 I368 aa JMW at 40082.3kD jProtein Sequence jft-VSTPENRQTF TSVIKFLRQYEFDGLDFD EYPGSRGSPPQDKHLFTVLVQEMREAFEQEAKQINKP IRIJWTAAVAAGISNIQSGYEIPQLSQYLDYIHVMTYDLHGS EGYTGENSPLYKYPTDTGSNAYLNVD ΪYV1-NYWKDNGAPAEKLIVGFPTYGHNFILSNPSNTGIGAPTSGAGPAGPYAKESGI AYYEICTFLKN GATQGWDAPQ--TVPYAYQGNVlrtrVGYDNVKSFDIKAQWLKHNKFGGAMVWAIDLDDFTGTFCNQGKFPLI jSTLKKALGLQSASCTAPAQPIEPITAAPSGSGNGSGSSSSGGSSGGSGFCAG-^-ANGLYPVA-SJNRNAFW IHCVNGVTYQQNCQAGLVFDTSCDCCNWA
NOV6b, CG196732-02 SEQ ID NO: 45 1625 bp
DNA Sequence ORF Start: ATG at 104 ORF Stop: TAA at 1532
GCTTTCCAGTCTGGTGGTGAATCCTCCATAGTCTGAAGCCTTTGTGATAACCACAGAATCAGAACATA
TAAAAAGCTCTGCGGGACTGGTGCTGACTGCAACCATGACAAAGCTTATTCTCCTCACAGGTCTTGTC
JCTTATACTGAATTTGCAGCTCGGCTCTGCCTACCAGCTGACATGCTACTTCACCAACTGGGCCCAGTA JCCGGCCAGGCCTGGGGCGCTTCATGCCTGACAACATCGACCCCTGCCTCTGTACCCACCTGATCTACG •CCTTTGCTGGGAGGCAGAACAACGAGATCACCACCATCGAATGGAACGATGTGACTCTCTACCAAGCT ITTCAATGGCCTGAAAAATAAGAACAGCCAGCTGAAAACTCTCCTGGCCATTGGAGGCTGGAACTTCGG IGACTGCCCCTTTCACTGCCATGGTTTCTACTCCTGAGAACCGCCAGACTTTCATCACCTCAGTCATCA JAATTCCTGCGCCAGTATGAGTTTGACGGGCTGGACTTTGACTGGGAGTACCCTGGCTCTCGTGGGAGC JCCTCCTCAGGACAAGCATCTCTTCACTGTCCTGGTGCAGGAAATGCGTGAAGCTTTTGAGCAGGAGGC CAAGCAGATCAACAAGCCCAGGCTGATGGTCACTGCTGCAGTAGCTGCTGGCATCTCCAATATCCAGT CTGGCTATGAGATCCCCCAACTGTCACAGTACCTGGACTACATCCATGTCATGACCTACGACCTCCAT GGCTCCTGGGAGGGCTACACTGGAGAGAACAGCCCCCTCTACAAATACCCGACTGACACCGGCAGCAA CGCCTACCTCAATGTGGATTATGTCATGAACTACTGGAAGGACAATGGAGCACCAGCTGAGAAGCTCA TCGTTGGATTCCCTACCTATGGACACAACTTCATCCTGAGCAACCCCTCCAACACTGGAATTGGTGCC CCCACCTCTGGTGCTGGTCCTGCTGGGCCCTATGCCAAGGAGTCTGGGATCTGGGCTTACTACGAGAT CTGTACCTTCCTGAAAAATGGAGCCACTCAGGGATGGGATGCCCCTCAGGAAGTGCCTTATGCCTATC JAGGGCAATGTGTGGGTTGGCTATGACAACATCAAGAGCTTCGATATTAAGGCTCAATGGCTTAAGCAC IAACAAATTTGGAGGCGCCATGGTCTGGGCCATTGATCTGGATGACTTCACTGGCACTTTCTGCAACCA SGGGCAAGTTTCCCCTAATCTCCACCCTGAAGAΆGGCCCTCGGCCTGCAGAGTGCAAGTTGCACGGCTC ICAGCTCAGCCCATTGAGCCAATAACTGCTGCTCCCAGTGGCAGCGGGAACGGGAGCGGGAGTAGCAGC CTGGAGGCAGCTCGGGAGGCAGTGGATTCTGTGCTGTCAGAGCCAACGGCCTCTACCCCGTGGCAAA TAACAGAAATGCCTTCTGGCACTGCGTGAATGGAGTCACGTACCAGCAGAACTGCCAGGCCGGGCTTG TCTTCGACACCAGCTGTGATTGCTGCAACTGGGCATAAACCTGACCTGGTCTATATTCCCTAGAGTTC
CAGTCTCTTTTGCTTAGGACATGTTGCCCCTACCTAAAGTCCTGCAATAAAATCAGCAGTC
NOV6b, CG196732-02 SEQ ID NO: 46 476 aa MW at 52270.3kD Protein Sequence
MTKLILLTGLVLILNLQLGSAYQLTCYFTNWAQYRPGLGRFMPDNIDPCLCTHLIYAFAGRQNNEITT IE^røDVTLYQAFNGLKNKNSQLKTLLAIGGWFGTAPFTA VSTPENRQTFITSVIKFLRQYEFDGLD FDWEYPGSRGSPPQDKHLFTVLVQEMREAFΞQΞAKQINKPRLMVTAAVAAGISNIQSGYEIPQLSQYL DYIHVMTYDLHGS EGYTGENSPLYKYPTDTGSNAYLNVDYVMNY KDNGAPAEKLIVGFPTYGHNFI .LSNPSNTGIGAPTSGAGPAGPYAKESGIWAYYEICTFLKNGATQGWDAPQEVPYAYQGNVWVGYDNIK SFDIKAQ LKHNKFGGAMVAIDLDDFTGTFCNQGKFPLISTLKKALGLQSASCTAPAQPIEPITAAP ϊSGSGNGSGSSSSGGSSGGSGFCAVRANGLYPVANNRNAFWHCVNGVTYQQNCQAGLVFDTSCDCCN A
NOV6c, CG196732-03 SEQ ID NO: 47 1126 bp DNA Sequence ORF Start: ATG at 14 ORF Stop: at 1118
CACCGGATCCACCATGGTTTCTACTCCTGAGAACCGCCAGACTTTCATCACCTCAGTCATCAAATTCC
TGCGCCAGTATGAGTTTGACGGGCTGGACTTTGACTGGGAGTACCCTGGCTCTCGTGGGAGCCCTCCT CAGGACAAGCATCTCTTCACTGTCCTGGTGCAGGAAATGCGTGAAGCTTTTGAGCAGGAGGCCAAGCA GATCAACAAGCCCAGGCTGATGGTCACTGCTGCAGTAGCTGCTGGCATCTCCAATATCCAGTCTGGCT ATGAGATCCCCCAACTGTCACAGTACCTGGACTACATCCATGTCATGACCTACGACCTCCATGGCTCC TGGGAGGGCTACACTGGAGAGAACAGCCCCCTCTACAAATACCCGACTGACACCGGCAGCAACGCCTA CCTCAATGTGGATTATGTCATGAACTACTGGAAGGACAATGGAGCACCAGCTGAGAAGCTCATCGTTG GATTCCCTACCTATGGACACAACTTCATCCTGAGCAACCCCTCCAACACTGGAATTGGTGCCCCCACC TCTGGTGCTGGTCCTGCTGGGCCCTATGCCAAGGAGTCTGGGATCTGGGCTTACTACGAGATCTGTAC CTTCCTGAAAAATGGAGCCACTCAGGGATGGGATGCCCCTCAGGAAGTGCCTTATGCCTATCAGGGCA ATGTGTGGGTTGGCTATGACAACATCAAGAGCTTCGATATTAAGGCTCAATGGCTTAAGCACAACAAA TTTGGAGGCGCCATGGTCTGGGCCATTGATCTGGATGACTTCACTGGCACTTTCTGCAACCAGGGCAA GTTTCCCCTAATCTCCACCCTGAAGAAGGCCCTCGGCCTGCAGAGTGCAAGTTGCACGGCTCCAGCTC AGCCCATTGAGCCAATAACTGCTGCTCCCAGTGGCAGCGGGAACGGGAGCGGGAGTAGCAGCTCTGGA GGCAGCTCGGGAGGCAGTGGATTCTGTGCTGTCAGAGCCAACGGCCTCTACCCCGTGGCAAATAACAG AAATGCCTTCTGGCACTGCGTGAATGGAGTCACGTACCAGCAGAACTGCCAGGCCGGGCTTGTCTTCG JACACCAGCTGTGATTGCTGCAACTGGGCAGTCGACGGc" jNOVΘc, CG 196732-03 SEQ ID NO: 48 368 aa MW at 40138.4kD (Protein Sequence
SMVSTPENRQTFITSVIKFLRQYEFDGLDFDWEYPGSRGSPPQDKHLFTVLVQE REAFEQEAKQINKP (R---MVTAAVAAGISNIQSGYEIPQLSQYLDYIHVMTYDLHGSWEGYTGΞNSPLYKYPTDTGSNAYLNVD LYVMNYWKDNGAPAEKLIVGFPTYGHNFILSNPSNTGIGAPTSGAGPAGPYAKESGI AYYEICTFLKN JGATQGWDAPQEVPYAYQGNVWVGYDNIKSFDIKAQ LK-^INKFGGAMVWAIDLDDFTGTFCNQGKFPLI SSTLKKALGLQSASCTAPAQPIEPITAAPSGSGNGSGSSSSGGSSGGSGFCAV-RANGLYPVANNRNAFW ΉCVNGVTYQQNCQAGLVFDTSCDCCNWA
NOV6d, 13382594 DNA Sequence SEQ ID NO: 49 1354 bp
ORF Start: ATG at 157 ORF Stop: TAA at 1261
IGAAACCTCCTCGTCTGTGCACGAACAGGTGGCCGACTCTGGAGCCCAGGCTGTTGCTTTCCAGTCTGG GGTGAATCCTCCATAGTCTGGAACAGCCAGCTGAAAACTCTCCTGGCCATTGGAGGCTGGAACTTCA JGGACTGCCCCTTTCACTGCCATGGTTTCTACTCCTGAGAACCGCCAGACTTTCATCACCTCAGTCATC JAAATTCCTGCGCCAGTATGAGTTTGACGGGCTGGACTTTGACTGGGAGTACCCTGGCTCTCGTGGGAG ICCCTCCTCAGGACAAGCATCTCTTCACTGTCCTGGTGCAGGAAATGCGTGAAGCTTTTGAGCAGGAGG .CCAAGCAGATCAACAAGCCCAGGCTGATGGTCACTGCTGCAGTAGCTGCTGGCATCTCCAATATCCAG JTCTGGCTATGAGATCCCCCAACTGTCACAGTACCTGGACTACATCCATGTCATGACCTACGACCTCCA ITGGCTCCTGGGAGGGCTACACTGGAGAGAACAGCCCCCTCTACAAATACCCGACTGACACCGGCAGCA JACGCCTACCTCAATGTGGATTATGTCATGAACTACTGGAAGGACAATGGAGCACCAGCTGAGAAGCTC SATCGTTGGATTCCCTACCTATGGACACAACTTCATCCTGAGCAACCCCTCCAACACTGGAATTGGTGC JCCCCACCTCTGGTGCTGGTCCTGCTGGGCCCTATGCCAAGGAGTCTGGGATCTGGGCTTACTACGAGA JTCTGTACCTTCCTGAAAAATGGAGCCACTCAGGGATGGGATGCCCCTCAGGAAGTGCCTTATGCCTAT ICAGGGCAATGTGTGGGTTGGCTATGACAACGTCAAGAGCTTCGATATTAAGGCTCAATGGCTTAAGCA JCAACAAATTTGGAGGCGCCATGGTCTGGGCCATTGATCTGGATGACTTCACTGGCACTTTCTGCAACC IAGGGCAAGTTTCCCCTAATCTCCACCCTGAAGAAGGCCCTTGGCCTGCAGAGTGCAAGTTGCACGGCT ICCAGCTCAGCCCATTGAGCCAATAACTGCTGCTCCCAGTGGCAGCGGGAACGGGAGCGGGAGTAGCAG TCTGGAGGCAGCTCGGGAGGCAGTGGATTCTGTGCTGTCAGAGCCAACGGCCTCTACCCCGTGGCAA JATAACAGAAATGCCTTCTGGCACTGCGTGAATGGAGTCACGTACCAGCAGAACTGCCAGGCCGGGCTT JGTCTTCGACACCAGCTGTGATTGCTGCAACTGGGCATAAACCTGACCTGGTCTATATTCCCTAGAGTT JCCAGTCTCTTTTGCTTAGGACATGTTGCCCCTACCTAAAGTCCTGCAATAAAATCAGCAGTC iNOV6d, 13382594 SEQ ID NO: 50 368 aa MW at 40124.4kD jProtein Sequence
VSTPENRQTFITSVIKFLRQYΞFDGLDFDWEYPGSRGSPPQDKHLFTVLVQEMREAFEQEAKQINKP (RI-J^TAAVAAGISNIQSGYEIPQLSQYLDYIHVMTYDLHGS EGYTGENSPLYKYPTDTGSNAYLNVD IYVΪ-NYWKDNGAPAEKLIVGFPTYGHNFILSNPSNTGIGAPTSGAGPAGPYAKΞSGI AYYEICTFLKN ATQG DAPQEVPYAYQGNVVN/GYDNVKSFDIKAQWLKHNKFGGAMVWAIDLDDFTGTFCNQGKFPLI JSTLKKALGLQSASCTAPAQPIΞPITAAPSGSGNGSGSSSSGGSSGGSGFCAVR-WGLYPVANNRNAFW JHCVNGVTYQQNCQAGLVFDTSCDCCNWA
NOV6e, 13382595 SEQ ID NO: 51 1354 bp DNA Sequence ORF Start: ATG at 157 ORF Stop: TAA at 1261
JGAAACCTCCTCGTCTGTGCACGAACAGGTGGCCGACTCTGGAGCCCAGGCTGTTGCTTTCCAGTCTGG JTGGTGAATCCTCCATAGTCTGGAACAGCCAGCTGAAAACTCTCCTGGCCATTGGAGGCTGGAACTTCA JGGACTGCCCCTTTCACTGCCATGGTTTCTACTCCTGAGAACCGCCAGACTTTCATCACCTCAGTCATC AAATTCCTGCGCCAGTATGAGTTTGACGGGCTGGACTTTGACTGGGAGTACCCTGGCTCTCGTGGGAG CCCTCCTCAGGACAAGCATCTCTTCACTGTCCTGGTGCAGGAAATGCGTGAAGCTTTTGAGCAGGAGG CCAAGCAGATCAACAAGCCCAGGCTGATGGTCACTGCTGCAGTAGCTGCTGGCATCTCCAATATCCAG TCTGGCTATGAGATCCCCCAACTGTCACAGTACCTGGACTACATCCATGTCATGACCTACGACCTCCA TGGCTCCTGGGAGGGCTACACTGGAGAGAACAGCCCCCTCTACAAATACCCGACTGACACCGGCAGCA jACGCCTACCTCAATGTGGATTATGTCATGAACTACTGGAAGGACAATGGAGCACCAGCTGAGAAGCTC iATCGTTGGATTCCCTACCTATGGACACAACTTCATCCTGAGCAACCCCTCCAACACTGGAATTGGTGC CCCCACCTCTGGTGCTGGTCCTGCTGGGCCCTATGCCAAGGAGTCTGGGATCTGGGCTTACTACGAGA TCTGTACCTTCCTGAAAAATGGAGCCACTCAGGGATGGGATGCCCCTCAGGAAGTGCCTTATGCCTAT CAGGGCAATGTGTGGGTTGGCTATGACAACGTCAAGAGCTTCGATATTAAGGCTCAATGGCTTAAGCA CAACAAATCTGGAGGCGCCATGGTCTGGGCCATTGATCTGGATGACTTCACTGGCACTTTCTGCAACC AGGGCAAGTTTCCCCTAATCTCCACCCTGAAGAAGGCCCTTGGCCTGCAGAGTGCAAGTTGCACGGCT CCAGCTCAGCCCATTGAGCCAATAACTGCTGCTCCCAGTGGCAGCGGGAACGGGAGCGGGAGTAGCAG CTCTGGAGGCAGCTCGGGAGGCAGTGGATTCTGTGCTGGCAGAGCCAACGGCCTCTACCCCGTGGCAA ATAACAGAAATGCCTTCTGGCACTGCGTGAATGGAGTCACGTACCAGCAGAACTGCCAGGCCGGGCTT GTCTTCGACACCAGCTGTGATTGCTGCAACTGGGCATAAACCTGACCTGGTCTATATTCCCTAGAGTT CCAGTCTCTTTTGCTTAGGACATGTTGCCCCTACCTAAAGTCCTGCAATAAAATCAGCAGTC
NOV6e, 13382595 SEQ ID NO: 52 368 aa MW at 40022.2kD Protein Sequence
MVSTP-i-NRQTFITSVIKFLRQYEFDGLDFD EYPGSRGSPPQDKHLFTVLVQEMREAFEQΞAKQ NKP RL VTAAVAAGISNIQSGYΞIPQLSQYLDYIHVTYDLHGS EGYTGENSPLYKYPTDTGSNAYLNVD YVMNYWKDNGAPAEKLIVGFPTYGHNFILSNPSNTGIGAPTSGAGPAGPYAKESGIWAYYEICTFLKN GATQGWDAPQ- PYAYQGNVWVGYDNVKSFDIKAQWLKHNKSGGAMVWAIDLDDFTGTFCNQGKFPLI STLKKALGLQSASCTAPAQPIEPITAAPSGSGNGSGSSSSGGSSGGSGFCAGRANGLYPVANNRNAF HCVNGVTYQQNCQAGLVFDTSCDCCNWA
NOV6f, 13382596 SEQ ID NO: 53 1354 bp
DNA Sequence ORF Start: ATG at 157 ORF Stop: TAA at 1261
GAAACCTCCTCGTCTGTGCACGAACAGGTGGCCGACTCTGGAGCCCAGGCTGTTGCTTTCCAGTCTGG
TGGTGAATCCTCCATAGTCTGGAACAGCCAGCTGAAAACTCTCCTGGCCATTGGAGGCTGGAACTTCA JGGACTGCCCCTTTCACTGCCATGGTTTCTACTCCTGAGAACCGCCAGACTTTCATCACCTCAGTCATC ΪAAATTCCTGCGCCAGTATGAGTTTGACGGGCTGGACTTTGACTGGGAGTACCCTGGCTCTCGTGGGAG ΪCCCTCCTCAGGACAAGCATCTCTTCACTGTCCTGGTGCAGGAAATGCGTGAAGCTTTTGAGCAGGAGG FCCAAGCAGATCAACAAGCCCAGGCTGATGGTCACTGCTGCAGTAGCTGCTGGCATCTCCAATATCCAG JTCTGGCTATGAGATCCCCCAACTGTCACAGTACCTGGACTACATCCATGTCATGACCTACGACCTCCA JTGGCTCCTGGGAGGGCTACACTGGAGAGAACAGCCCCCTCTACAAATACCCGACTGACACCGGCAGCA .ACGCCTACCTCAATGTGGATTATGTCATGAACTACTGGAAGGACAATGGAGCACCAGCTGAGAAGCTC ATCGTTGGATTCCCTACCTATGGACACAACTTCATCCTGAGCAACCCCTCCAACACTGGAATTGGTGC 'CCCCACCTCTGGTGCTGGTCCTGCTGGGCCCTATGCCAAGGAGTCTGGGATCTGGGCTTACTACGAGA JTCTGTACCTTCCTGAAAAATGGAGCCACTCAGGGATGGGATGCCCCTCAGGAAGTGCCTTATGCCTAT -CAGGGCAATGTGTGGGTTGGCTATGACAACATCAAGAGCTTCGATATTAAGGCTCAATGGCTTAAGCA ICAACAAATTTGGAGGCGCCATGGTCTGGGCCATTGATCTGGATGACTTCACTGGCACTTTCTGCAACC ΪAGGGCAAGTTTCCCCTAATCTCCACCCTGAAGAAGGCCCTTGGCCTGCAGAGTGCAAGTTGCACGGCT JCCAGCTCAGCCCATTGAGCCAATAACTGCTGCTCCCAGTGGCAGCGGGAACGGGAGCGGGAGTAGCAG JCTCTGGAGGCAGCTCGGGAGGCAGTGGATTCTGTGCTGGCAGAGCCAACGGCCTCTACCCCGTGGCAA JATAACAGAAATGCCTTCTGGCACTGCGTGAATGGAGTCACGTACCAGCAGAACTGCCAGGCCGGGCTT JGTCTTCGACACCAGCTGTGATTGCTGCAACTGGGCATAAACCTGACCTGGTCTATATTCCCTAGAGTT .CCAGTCTCTTTTGCTTAGGACATGTTGCCCCTACCTAAAGTCCTGCAATAAAATCAGCAGTC
NOV6f, 13382596 SEQ ID NO: 54 368 aa MW at 40096.3kD Protein Sequence
(MVSTPENRQTFITSVIKFLRQYEFDGLDFD ΞYPGSRGSPPQDKHLFTVLVQEMREAFEQEAKQINKP jRLMVTAAVAAGISNIQSGYEIPQLSQYLDYIHVMTYDLHGS ΞGYTGENSPLYKYPTDTGSNAYLNVD ;YVMNYWKDNGAPAEKLIVGFPTYGHNFILSNPSNTGIGAPTSGAGPAGPYAKΞSGI AYYEICTFLKN !GATQG DAPQ--rVPYAYQGNV /GYDNIKSFDIKAQWLKHNKFGGAMVWAIDLDDFTGTFCNQGKFPLI ISTLKKALGLQSASCTAPAQPIEPITAAPSGSGNGSGSSSSGGSSGGSGFCAGRANGLYPVANNRNAFW JHCVNGVTYQQNCQAGLVFDTSCDCCNWA jNOVβg, CG196732 SEQ ED NO: 55 J1354 bp jDNA Sequence ORF Start: ATG at 157 ORF Stop: TAA at 1261
GAAACCTCCTCGTCTGTGCACGAACAGGTGGCCGACTCTGGAGCCCAGGCTGTTGCTTTCCAGTCTGG
TGGTGAATCCTCCATAGTCTGGAACAGCCAGCTGAAAACTCTCCTGGCCATTGGAGGCTGGAACTTCA
GGACTGCCCCTTTCACTGCCATGGTTTCTACTCCTGAGAACCGCCAGACTTTCATCACCTCAGTCATC
AAATTCCTGCGCCAGTATGAGTTTGACGGGCTGGACTTTGACTGGGAGTACCCTGGCTCTCGTGGGAG CCCTCCTCAGGACAAGCATCTCTTCACTGTCCTGGTGCAGGAAATGCGTGAAGCTTTTGAGCAGGAGG CCAAGCAGATCAACAAGCCCAGGCTGATGGTCACTGCTGCAGTAGCTGCTGGCATCTCCAATATCCAG TCTGGCTATGAGATCCCCCAACTGTCACAGTACCTGGACTACATCCATGTCATGACCTACGACCTCCA TGGCTCCTGGGAGGGCTACACTGGAGAGAACAGCCCCCTCTACAAATACCCGACTGACACCGGCAGCA ACGCCTACCTCAATGTGGATTATGTCATGAACTACTGGAAGGACAATGGAGCACCAGCTGAGAAGCTC ATCGTTGGATTCCCTACCTATGGACACAACTTCATCCTGAGCAACCCCTCCAACACTGGAATTGGTGC CCCCACCTCTGGTGCTGGTCCTGCTGGGCCCTATGCCAAGGAGTCTGGGATCTGGGCTTACTACGAGA TCTGTACCTTCCTGAAAAATGGAGCCACTCAGGGATGGGATGCCCCTCAGGAAGTGCCTTATGCCTAT
CAGGGCAATGTGTGGGTTGGCTATGACAACX1TCAAGAGCTTCGATATTAAGGCTCAATGGCTTAAGCA
CAACAAATX2TGGAGGCGCCATGGTCTGGGCCATTGATCTGGATGACTTCACTGGCACTTTCTGCAACC
AGGGCAAGTTTCCCCTAATCTCCACCCTGAAGAAGGCCCTTGGCCTGCAGAGTGCAAGTTGCACGGCT
CCAGCTCAGCCCATTGAGCCAATAACTGCTGCTCCCAGTGGCAGCGGGAACGGGAGCGGGAGTAGCAG
CTCTGGAGGCAGCTCGGGAGGCAGTGGATTCTGTGCTGX3CAGAGCCAACGGCCTCTACCCCGTGGCAA ATAACAGAAATGCCTTCTGGCACTGCGTGAATGGAGTCACGTACCAGCAGAACTGCCAGGCCGGGCTT GTCTTCGACACCAGCTGTGATTGCTGCAACTGGGCATAAACCTGACCTGGTCTATATTCCCTAGAGTT CCAGTCTCTTTTGCTTAGGACATGTTGCCCCTACCTAAAGTCCTGCAATAAAATCAGCAGTC
Wherein X1 is G or A; X2 is T or C; X3 is G or T.
NOV6g, CG196732 SEQ ID NO: 56 368 aa MW at 40082.3kD Protein Sequence
MVSTPENRQTFITSVIKFLRQYEFDGLDFDWEYPGSRGSPPQDKHLFTVLVQEMREAFEQEAKQINKP JR----WTAAVAAGISNIQSGYΞIPQLSQYLDYIHVMTYDLHGS ΞGYTGΞNSPLYKYPTDTGSNAY----WI) {YVMNYWKDNGAPAEKLIVGFPTYGHNFILSNPSNTGIGAPTSGAGPAGPYAKESGIWAYYEICTFLKN
JGATQGWDAPQΗ-VPYAYQGNVV^GYDNZ1KSFDIKAQWLKHNKZ2GGAMVWAIDLDDFTGTFCNQGKFPLI
]sTLKKALGLQSASCTAPAQPIΞPITAAPSGSGNGSGSSSSGGSSGGSGFCAZ3RANGLYPVANNRNAFW JHCVNGVTYQQNCQAGLVFDTSCDCCNWA
Wherein Z1 is V or I; Z2 is F or S; Z3 is G or V.
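The substitutions at Z1, Z2 and Z3 also appear to account for the differences among the molecular weights reported in Table 6A. The following sketch is a plausibility check only. The average residue masses are standard reference values that do not appear in the text, the assignment of substitutions to each clone is read from the protein sequences in Table 6A, and the reported figures are treated as plain numbers (they are numerically consistent with daltons).

```python
# Average residue masses (amino acid minus water); standard reference values,
# not taken from the text.
RESIDUE_MASS = {
    "G": 57.0519,
    "V": 99.1326,
    "I": 113.1594,
    "F": 147.1766,
    "S": 87.0782,
}

REFERENCE_MW = 40082.3  # NOV6a (SEQ ID NO: 44), as reported in Table 6A

# clone -> (substitutions relative to NOV6a as (old, new), reported MW)
VARIANTS = {
    "NOV6c": ([("V", "I"), ("G", "V")], 40138.4),
    "NOV6d": ([("G", "V")], 40124.4),
    "NOV6e": ([("F", "S")], 40022.2),
    "NOV6f": ([("V", "I")], 40096.3),
}


def predicted_mw(reference_mw, substitutions):
    """Shift a reference molecular weight by the mass change of each substitution."""
    delta = sum(RESIDUE_MASS[new] - RESIDUE_MASS[old] for old, new in substitutions)
    return reference_mw + delta


for clone, (substitutions, reported_mw) in VARIANTS.items():
    print(clone, round(predicted_mw(REFERENCE_MW, substitutions), 1), reported_mw)
```

Each predicted value agrees with the reported one to within rounding; for example, replacing Phe with Ser lowers the mass by about 60.1, matching the difference between NOV6a (40082.3) and NOV6e (40022.2).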
A ClustalW comparison of the above protein sequences yields the following sequence alignment shown in Table 6B.
Table 6B. Comparison of the NOV6 protein sequences.
NOV6a ------------------------------------------------------------
NOV6b MTKLILLTGLVLILNLQLGSAYQLTCYFTNWAQYRPGLGRFMPDNIDPCLCTHLIYAFAG
NOV6c ------------------------------------------------------------
NOV6d ------------------------------------------------------------
NOV6e ------------------------------------------------------------
NOV6f ------------------------------------------------------------

NOV6a ------------------------------------------------MVSTPENRQTFI
NOV6b RQNNEITTIEWNDVTLYQAFNGLKNKNSQLKTLLAIGGWNFGTAPFTAMVSTPENRQTFI
NOV6c ------------------------------------------------MVSTPENRQTFI
NOV6d ------------------------------------------------MVSTPENRQTFI
NOV6e ------------------------------------------------MVSTPENRQTFI
NOV6f ------------------------------------------------MVSTPENRQTFI

NOV6a TSVIKFLRQYEFDGLDFDWEYPGSRGSPPQDKHLFTVLVQEMREAFEQEAKQINKPRLMV
NOV6b TSVIKFLRQYEFDGLDFDWEYPGSRGSPPQDKHLFTVLVQEMREAFEQEAKQINKPRLMV
NOV6c TSVIKFLRQYEFDGLDFDWEYPGSRGSPPQDKHLFTVLVQEMREAFEQEAKQINKPRLMV
NOV6d TSVIKFLRQYEFDGLDFDWEYPGSRGSPPQDKHLFTVLVQEMREAFEQEAKQINKPRLMV
NOV6e TSVIKFLRQYEFDGLDFDWEYPGSRGSPPQDKHLFTVLVQEMREAFEQEAKQINKPRLMV
NOV6f TSVIKFLRQYEFDGLDFDWEYPGSRGSPPQDKHLFTVLVQEMREAFEQEAKQINKPRLMV

NOV6a TAAVAAGISNIQSGYEIPQLSQYLDYIHVMTYDLHGSWEGYTGENSPLYKYPTDTGSNAY
NOV6b TAAVAAGISNIQSGYEIPQLSQYLDYIHVMTYDLHGSWEGYTGENSPLYKYPTDTGSNAY
NOV6c TAAVAAGISNIQSGYEIPQLSQYLDYIHVMTYDLHGSWEGYTGENSPLYKYPTDTGSNAY
NOV6d TAAVAAGISNIQSGYEIPQLSQYLDYIHVMTYDLHGSWEGYTGENSPLYKYPTDTGSNAY
NOV6e TAAVAAGISNIQSGYEIPQLSQYLDYIHVMTYDLHGSWEGYTGENSPLYKYPTDTGSNAY
NOV6f TAAVAAGISNIQSGYEIPQLSQYLDYIHVMTYDLHGSWEGYTGENSPLYKYPTDTGSNAY

NOV6a LNVDYVMNYWKDNGAPAEKLIVGFPTYGHNFILSNPSNTGIGAPTSGAGPAGPYAKESGI
NOV6b LNVDYVMNYWKDNGAPAEKLIVGFPTYGHNFILSNPSNTGIGAPTSGAGPAGPYAKESGI
NOV6c LNVDYVMNYWKDNGAPAEKLIVGFPTYGHNFILSNPSNTGIGAPTSGAGPAGPYAKESGI
NOV6d LNVDYVMNYWKDNGAPAEKLIVGFPTYGHNFILSNPSNTGIGAPTSGAGPAGPYAKESGI
NOV6e LNVDYVMNYWKDNGAPAEKLIVGFPTYGHNFILSNPSNTGIGAPTSGAGPAGPYAKESGI
NOV6f LNVDYVMNYWKDNGAPAEKLIVGFPTYGHNFILSNPSNTGIGAPTSGAGPAGPYAKESGI

NOV6a WAYYEICTFLKNGATQGWDAPQEVPYAYQGNVWVGYDNVKSFDIKAQWLKHNKFGGAMVW
NOV6b WAYYEICTFLKNGATQGWDAPQEVPYAYQGNVWVGYDNIKSFDIKAQWLKHNKFGGAMVW
NOV6c WAYYEICTFLKNGATQGWDAPQEVPYAYQGNVWVGYDNIKSFDIKAQWLKHNKFGGAMVW
NOV6d WAYYEICTFLKNGATQGWDAPQEVPYAYQGNVWVGYDNVKSFDIKAQWLKHNKFGGAMVW
NOV6e WAYYEICTFLKNGATQGWDAPQEVPYAYQGNVWVGYDNVKSFDIKAQWLKHNKSGGAMVW
NOV6f WAYYEICTFLKNGATQGWDAPQEVPYAYQGNVWVGYDNIKSFDIKAQWLKHNKFGGAMVW

NOV6a AIDLDDFTGTFCNQGKFPLISTLKKALGLQSASCTAPAQPIEPITAAPSGSGNGSGSSSS
NOV6b AIDLDDFTGTFCNQGKFPLISTLKKALGLQSASCTAPAQPIEPITAAPSGSGNGSGSSSS
NOV6c AIDLDDFTGTFCNQGKFPLISTLKKALGLQSASCTAPAQPIEPITAAPSGSGNGSGSSSS
NOV6d AIDLDDFTGTFCNQGKFPLISTLKKALGLQSASCTAPAQPIEPITAAPSGSGNGSGSSSS
NOV6e AIDLDDFTGTFCNQGKFPLISTLKKALGLQSASCTAPAQPIEPITAAPSGSGNGSGSSSS
NOV6f AIDLDDFTGTFCNQGKFPLISTLKKALGLQSASCTAPAQPIEPITAAPSGSGNGSGSSSS

NOV6a GGSSGGSGFCAGRANGLYPVANNRNAFWHCVNGVTYQQNCQAGLVFDTSCDCCNWA
NOV6b GGSSGGSGFCAVRANGLYPVANNRNAFWHCVNGVTYQQNCQAGLVFDTSCDCCNWA
NOV6c GGSSGGSGFCAVRANGLYPVANNRNAFWHCVNGVTYQQNCQAGLVFDTSCDCCNWA
NOV6d GGSSGGSGFCAVRANGLYPVANNRNAFWHCVNGVTYQQNCQAGLVFDTSCDCCNWA
NOV6e GGSSGGSGFCAGRANGLYPVANNRNAFWHCVNGVTYQQNCQAGLVFDTSCDCCNWA
NOV6f GGSSGGSGFCAGRANGLYPVANNRNAFWHCVNGVTYQQNCQAGLVFDTSCDCCNWA

NOV6a (SEQ ID NO: 44)
NOV6b (SEQ ID NO: 46)
NOV6c (SEQ ID NO: 48)
NOV6d (SEQ ID NO: 50)
NOV6e (SEQ ID NO: 52)
NOV6f (SEQ ID NO: 54)
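Positions at which the NOV6 proteins differ can be located programmatically from an alignment such as Table 6B. The sketch below is illustrative: it uses only the 60-column block of Table 6B that begins WAYYEICTFL, and the function name is hypothetical.

```python
# One 60-column block of the Table 6B alignment (the block starting WAYYEICTFL).
ALIGNED_BLOCK = {
    "NOV6a": "WAYYEICTFLKNGATQGWDAPQEVPYAYQGNVWVGYDNVKSFDIKAQWLKHNKFGGAMVW",
    "NOV6b": "WAYYEICTFLKNGATQGWDAPQEVPYAYQGNVWVGYDNIKSFDIKAQWLKHNKFGGAMVW",
    "NOV6c": "WAYYEICTFLKNGATQGWDAPQEVPYAYQGNVWVGYDNIKSFDIKAQWLKHNKFGGAMVW",
    "NOV6d": "WAYYEICTFLKNGATQGWDAPQEVPYAYQGNVWVGYDNVKSFDIKAQWLKHNKFGGAMVW",
    "NOV6e": "WAYYEICTFLKNGATQGWDAPQEVPYAYQGNVWVGYDNVKSFDIKAQWLKHNKSGGAMVW",
    "NOV6f": "WAYYEICTFLKNGATQGWDAPQEVPYAYQGNVWVGYDNIKSFDIKAQWLKHNKFGGAMVW",
}


def variable_columns(aligned_rows):
    """Return 0-based column indices at which the aligned rows are not all identical."""
    rows = list(aligned_rows.values())
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned rows must have equal length")
    return [i for i, column in enumerate(zip(*rows)) if len(set(column)) > 1]


columns = variable_columns(ALIGNED_BLOCK)
for i in columns:
    residues = {name: row[i] for name, row in ALIGNED_BLOCK.items()}
    print(i, residues)
```

Within this block the variable columns are the V/I and F/S positions; the third variable position (G/V) lies in the final block of the alignment and is not covered by this excerpt.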
Further analysis of the NOV6a protein yielded the following properties shown in Table 6C.
Table 6C. Protein Sequence Properties NOV6a
SignalP analysis: No Known Signal Sequence Predicted
PSORT II analysis:
PSG: a new signal peptide prediction method
N-region: length 8; pos.chg 1; neg.chg 1
H-region: length 8; peak value 5.96
PSG score: 1.56
GvH: von Heijne's method for signal seq. recognition
GvH score (threshold: -2.1): -8.56
possible cleavage site: between 38 and 39
>>> Seems to have no N-terminal signal peptide
ALOM: Klein et al's method for TM region allocation
Init position for calculation: 1
Tentative number of TMS(s) for the threshold 0.5: 0
number of TMS(s) .. fixed
PERIPHERAL Likelihood = 3.55 (at 70)
ALOM score: 3.55 (number of TMSs: 0)
MTOP: Prediction of membrane topology (Hartmann et al.)
Center position for calculation: 6
Charge difference: 3.0 C( 3.0) - N( 0.0)
C > N: C-terminal side will be inside
>>> Caution: Inconsistent mtop result with signal peptide
MITDISC: discrimination of mitochondrial targeting seq
R content: 2 Hyd Moment(75): 5.41
Hyd Moment(95): 3.73 G content: 0
D/E content: 2 S/T content: 5
Score: -4.09
Gavel: prediction of cleavage sites for mitochondrial preseq
R-2 motif at 30 LRQ|YE
NUCDISC: discrimination of nuclear localization signals
pat4: none
pat7: none
bipartite: none
content of basic residues: 6.2%
NLS Score: -0.47
NNCN: Reinhardt's method for Cytoplasmic/Nuclear discrimination
Prediction: cytoplasmic
Reliability: 89
Final Results (k = 9/23):
52.2 %: cytoplasmic
30.4 %: mitochondrial
13.0 %: nuclear
4.3 %: vesicles of secretory system
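The percentages under "Final Results (k = 9/23)" are consistent with vote fractions among k nearest neighbours. This reading is an assumption, since the text does not describe how the percentages are derived. The sketch below tests which of the two k values reproduces the Table 6C percentages; the function name and tolerance are illustrative.

```python
# Localization percentages as listed in Table 6C for NOV6a.
TABLE_6C = {
    "cytoplasmic": 52.2,
    "mitochondrial": 30.4,
    "nuclear": 13.0,
    "vesicles of secretory system": 4.3,
}


def neighbour_counts(percentages, k, tolerance=0.051):
    """Return per-site vote counts if the percentages are consistent with k votes.

    Returns None when the rounded counts do not sum to k or do not reproduce
    the listed percentages to within the tolerance (one decimal place).
    """
    counts = {site: round(pct * k / 100) for site, pct in percentages.items()}
    if sum(counts.values()) != k:
        return None
    for site, pct in percentages.items():
        if abs(100 * counts[site] / k - pct) > tolerance:
            return None
    return counts


for k in (9, 23):
    print(k, neighbour_counts(TABLE_6C, k))
```

Under this assumption the Table 6C percentages correspond to 12, 7, 3 and 1 of 23 neighbours, and they are not reproduced by k = 9.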
A search of the NOV6a protein against the Geneseq database, a proprietary database that contains sequences published in patents and patent publications, yielded several homologous proteins, shown in Table 6D.
Figure imgf000082_0001
AAG62543: Disease treatment related protein SEQ ID NO: 5 - Homo sapiens, 476 aa. [WO200136633-A1, 25-MAY-2001]
NOV6a residues 1..368; match residues 109..476
Identities/Similarities for the matched region: 366/368 (99%) / 367/368 (99%)
Expect value: 0.0
In a BLAST search of public sequence databases, the NOV6a protein was found to have homology to the proteins shown in the BLASTP data in Table 6E.
Figure imgf000083_0001
PFam analysis predicts that the NOV6a protein contains the domains shown in Table 6F. Specific amino acid residues of CG196732-01 for each domain are shown in column 2; equivalent domains in other NOV6 and CG196732 family proteins are also encompassed herein.
Figure imgf000083_0002
Example 7. NOV7, CG53147, CG53147-FLF/SalR.
The NOV7 clone was analyzed, and the nucleotide and encoded polypeptide sequences are shown in Table 7A.
Table 7A. NOV7 Sequence Analysis
NOV7a, CG53147-02 SEQ ID NO: 57 1566 bp
DNA Sequence ORF Start: ATG at 1 ORF Stop: end of sequence
IATGCCGGGGCTGCGCCGGGACCGCCTACTGACTCTGCTGCTGCTGGGCGCGCTGCTCTCCGCCGACCT JCTACTTCCACCTCTGGCCCCAAGTACAGCGCCAGCTGCGGCCTCGGGAGCGCCCGCGGGGGTGCCCGT ^GCACCGGCCGCGCCTCCTCCCTGGCGCGGGACTCGGCCGCAGCTGCCTCGGACCCCGGCACGATCGTG FCACAACTTTTCCCGAACCGAGCCCCGGACTGAACCGGCTGGCGGCAGCCACAGCGGGTCGAGCTCCAA (GTTGCAGGCCCTCTTCGCCCACCCGCTGTACAACGTCCCGGAGGAGCCGCCTCTCCTGGGAGCCGAGG JACTCACTCCTGGCCAGCCAGGAGGCGCTGCGGTATTACCGGAGGAAGGTGGCCCGCTGGAACAGGCGA ^CACAAGATGTACAGAGAGCAGATGAACCTTACCTCCCTGGACCCCCCACTGCAGCTCCGACTCGAGGC JCAGCTGGGTCCAGTTCCACCTGGGTATTAACCGCCATGGGCTCTACTCCCGGTCCAGCCCTGTTGTCA IGCAAACTTCTGCAAGACATGAGGCACTTTCCCACCATCAGTGCTGATTACAGTCAAGATGAGAAAGCC JTTGCTGGGGGCATGTGACTGCACCCAGATTGTGAAACCCAGTGGGGTCCACCTCAAGCTGGTGCTGAG GTTCTCGGATTTCGGGAAGGCCATGTTCAAACCCATGAGACAGCAGCGAGATGAGGAGACACCAGTGG JACTTCTTCTACTTCATTGACTTTCAGAGACACAATGCTGAGATCGCAGCTTTCCATCTGGACAGGATT JCTGGACTTCCGACGGGTGCCGCCAACAGTGGGGAGGATAGTAAATGTCACCAAGGAAATCCTAGAGGT |CACCAAGAATGAAATCCTGCAGAGTGTTTTCTTTGCCTCTCCAGTGAGCAACGTGTGCTTCTTCGCCA JAGTGTCCATACATGTGCAAGACGGAGTATGCTGTCTGTGGCAACCCACACCTGCTGGAGGGTTCCCTC ΪTCTGCCTTCCTGCCGTCCCTCAACCTGGCCCCCAGGCTGTCTGTGCCCAACCCCTGGATCCGCTCCTA ICACACTGGCAGGAAAAGAGGAGTGGGAGGTCAATCCCCTTTACTGTGACACAGTGAAACAGATCTACC JCGTACAACAACAGCCAGCGGCTCCTCAATGTCATCGACATGGCCATCTTCGACTTCTTGATAGGGAAT LATGGACCGGCACCATTATGAGATGTTCACCAAGTTCGGGGATGATGGGTTCCTTATTCACCTTGACAA JCGCCAGAGGGTTCGGACGACACTCCCATGATGAAATCTCCATCCTCTCGCCTCTCTCCCAGTGCTGCA .TGATAAAAAAGAAAACACTTTTGCACCTGCAGCTGCTGGCCCAAGCTGACTACAGACTCAGCGATGTG JATGCGAGAATCACTGCTGGAAGACCAGCTCAGCCCTGTCCTCACTGAACCCCACCTCCTTGCCCTGGA JTCGAAGGCTCCAAACCATCCTAAGGACAGTGGAGGGGTGCATAGTGGTCCATGGACAGCAGAGTGTCA
TA
NOV7a, CG53147-02 SEQ ID NO: 58 522 aa MW at 59563.8kD Protein Sequence
JMPGLRRDRLLTLLLLGALLSADLYFHLWPQVQRQLRPRΞRPRGCPCTGRASSLARDSAAAASDPGTIV JHNFSRTEPRTEPAGGSHSGSSSKLQALFAHPLYNVPΞEPPLLGAEDSLLASQEALRYYRRKVARWNRR JHKIY-YREQMKRLTSLDPPLQLRLEASWQFHLGINRHGLYSRSSP SIKLLQDMRHFPTISADYSQDEKA JLLGACDCTQIVKPSGVHLKLVLRFSDFGKAMFKPMRQQRDEΞTPVDFFYFIDFQRHNAEIAAFHLDRI JLDFRRVPPTVGRIVNVTKEILEVTKNEILQSVFFASPVSNVCFFAKCPYMCKTEYAVCGNPHLLEGSL SAFLPSLNLAPRLSVPNPWIRSYTLAGKEEWEWPLYCDTVKQIYPYNNSQRLLNVIDMAIFDFLIGN JMDRHHYEMFTKFGDDGFLIHLDNARGFGRHSHDΞISILSPLSQCC IKKKTLLHLQLLAQADYRLSDV MRESLLΞDQLSPVLTEPHLLALDRRLQTILRTVEGCIWHGQQSVI
NOV7b, CG53147-01 SEQ ID NO: 59 1937 bp DNA Sequence ORF Start: 1 ORF Stop: End of Sequence
ATGCCGGGGCTGCGCCGGGACCGCCTAC
TGACTCTGCTACTGCTGGGCGCGCTGCTCTCCGCCGACCTCTACTTCCACCTCTGGCCCCAAGTACAG
CGCCAGCTGCGGCCTCGGGAGCGCCCGCGGGGGTGCCCGTGCACCGGCCGCGCCTCCTCCCTGGCGCG
GGACTCGGCCGCAGCTGCCTCGGACCCCGGCACGATCGTGCACAACTTTTCCCGAACCGAGCCCCGGA
CTGAACCGGCTGGCGGCAGCCACAGCGGGTCGAGCTCCAAGTTGCAGGCCCTCTTCGCCCACCCGCTG
TACAACGTCCCGGAGGAGCCGCCTCTCCTGGGAGCCGAGGACTCGCTCCTGGCCAGCCAGGAGGCGCT
GCGGTATTACCGGAGGAAGGTGGCCCGCTGGAACAGGCGACACAAGATGTACAGAGAGCAGATGAACC
TTACCTCCCTGGACCCCCCACTGCAGCTCCGACTCGAGGCCAGCTGGGTCCAGTTCCACCTGGGTATT
AACCGCCATGGGCTCTACTCCCGGTCCAGCCCTGTTGTCAGCAAACTTCTGCAAGACATGAGGCACTT
TCCCACCATCAGTGCTGATTACAGTCAAGATGAGAAAGCCTTGCTGGGGGCATGTGACTGCACCCAGA
TTGTGAAACCCAGTGGGGTCCACCTCAAGCTGGTGCTGAGGTTCTCGGATTTCGGGAAGGCCATGTTC
AAACCCATGAGACAGCAGCGAGATGAGGAGACACCAGTGGACTTCTTCTACTTCATTGACTTTCAGAG
ACACAATGCTGAGATCGCAGCTTTCCATCTGGACAGGATTCTGGACTTCCGACGGGTGCCGCCAACAG
TGGGGAGGATAGTAAATGTCACCAAGGAAATCCTAGAGGTCACCAAGAATGAAATCCTGCAGAGTGTT
TTCTTTGTCTCTCCAGCGAGCAACGTGTGCTTCTTCGCCAAGTGTCCATACATGTGCAAGACGGAGTA
TGCTGTCTGTGGCAACCCACACCTGCTGGAGGGTTCCCTCTCTGCCTTCCTGCCGTCCCTCAACCTGG
CCCCCAGGCTGTCTGTGCCCAACCCCTGGATCCGCTCCTACACACTGGCAGGAAAAGAGGAGTGGGAG
GTCAATCCCCTTTACTGTGACACAGTGAAACAGATCTACCCGTACAACAACAGCCAGCGGCTCCTCAA
TGTCATCGACATGGCCATCTTCGACTTCTTGATAGGGAATATGGACCGGCACCATTATGAGATGTTCA
CCAAGTTCGGGGATGATGGGTTCCTTATTCACCTTGACAACGCCAGAGGGTTCGGACGACACTCCCAT
GATGAAATCTCCATCCTCTCGCCTCTCTCCCAGTGCTGCATGATAAAAAAGAAAACACTTTTGCACCT
GCAGCTGCTGGCCCAAGCTGACTACAGACTCAGCGATGTGATGCGAGAATCACTGCTGGAAGACCAGC
TCAGCCCTGTCCTCACTGAACCCCACCTCCTTGCCCTGGATCGAAGGCTCCAAACCATCCTAAGGACA
GTGGAGGGGTGCATAGT
NOV7b, CG53147-01 SEQ ID NO: 60 522 aa MW is about 60103.5kD Protein Sequence
PGLRRDRLLTLLLLGALLSADLYFHLWPQVQRQLRPRERPRGCPCTGRASSLARDSAA
AASDPGTIVHNFSRTEPRTΞPAGGSHSGSSSKLQALFAHPLYNVPEEPPLLGAEDSLLASQEALRYYR
RKVARWNR-FIHKMYREQ NLTSLDPPLQLRLEASWVQFHLGINRHGLYSRSSPVVSKLLOD RHFPTIS JADYSQDEKALLGACDCTQIVKPSGVHLKLVLRFSDFGKAMFKPMRQQRDEETPVDFFYFIDFQRHNAE JIAAFHLDRILDFRRVPPTVGRIV-WTKEILEVTKNEILQSVFFVSPASNVCFFAKCPYMCKTEYAVCG INPHLLEGSLSAFLPSLNLAPRLSVPNPWIRSYTLAGKEEWEVNPLYCDTVKQIYPYNNSQRLLNVIDM IAIFDFLIGNMDRHHYEMFTKFGDDGFLIHLDNARGFGRHSHDEISILSPLSQCCMIKKKTLLHLQLLA JQADYRLSDVMRESLLEDQLSPVLTEPHLLALDRRLQTILRTVEGCIVAHGQQSVIV
NOV7c, CG53147-03 SEQ ID NO: 61 1593 bp DNA Sequence ORF Start: ATG at 16 ORF Stop: TAG at 1582
ICACCGCGGCCGCACCATGCCGGGGCTGCGCCGGGACCGCCTACTGACTCTGCTGCTGCTGGGCGCGCT -GCTCTCCGCCGACCTCTACTTCCACCTCTGGCCCCAAGTACAGCGCCAGCTGCGGCCTCGGGAGCGCC ICGCGGGGGTGCCCGTGCACCGGCCGCGCCTCCTCCCTGGCGCGGGACTCGGCCGCAGCTGCCTCGGAC ICCCGGCACGATCGTGCACAACTTTTCCCGAACCGAGCCCCGGACTGAACCGGCTGGCGGCAGCCACAG JCGGGTCGAGCTCCAAGTTGCAGGCCCTCTTCGCCCACCCGCTGTACAACGTCCCGGAGGAGCCGCCTC JTCCTGGGAGCCGAGGACTCGCTCCTGGCCAGCCAGGAGGCGCTGCGGTATTACCGGAGGAAGGTGGCC SCGCTGGAACAGGCGACACAAGATGTACAGAGAGCAGATGAACCTTACCTCCCTGGACCCCCCACTGCA IGCTCCGACTCGAGGCCAGCTGGGTCCAGTTCCACCTGGGTATTAACCGCCATGGGCTCTACTCCCGGT CCAGCCCTGTTGTCAGCAAACTTCTGCAAGACATGAGGCACTTTCCCACCATCAGTGCTGATTACAGT CAAGATGAGAAAGCCTTGCTGGGGGCATGTGACTGCACCCAGATTGTGAAACCCAGTGGGGTCCACCT CAAGCTGGTGCTGAGGTTCTCGGATTTCGGGAAGGCCATGTTCAAACCCATGAGACAGCAGCGAGATG AGGAGACACCAGTGGACTTCTTCTACTTCATTGACTTTCAGAGACACAATGCTGAGATCGCAGCTTTC ICATCTGGACAGGATTCTGGACTTCCGACGGGTGCCGCCAACAGTGGGGAGGATAGTAAATGTCACCAA JGGAAATCCTAGAGGTCACCAAGAATGAAATCCTGCAGAGTGTTTTCTTTGTCTCTCCAGCGAGCAACG TGTGCTTCTTCGCCAAGTGTCCATACATGTGCAAGACGGAGTATGCTGTCTGTGGCAACCCACACCTG ICTGGAGGGTTCCCTCTCTGCCTTCCTGCCGTCCCTCAACCTGGCCCCCAGGCTGTCTGTGCCCAACCC ICTGGATCCGCTCCTACACACTGGCAGGAAAAGAGGAGTGGGAGGTCAATCCCCTTTACTGTGACACAG JTGAAACAGATCTACCCGTACAACAACAGCCAGCGGCTCCTCAATGTCATCGACATGGCCATCTTCGAC ΪTTCTTGATAGGGAATATGGACCGGCACCATTATGAGATGTTCACCAAGTTCGGGGATGATGGGTTCCT TATTCACCTTGACAACGCCAGAGGGTTCGGACGACACTCCCATGATGAAATCTCCATCCTCTCGCCTC TCTCCCAGTGCTGCATGATAAAAAAGAAAACACTTTTGCACCTGCAGCTGCTGGCCCAAGCTGACTAC AGACTCAGCGATGTGATGCGAGAATCACTGCTGGAAGACCAGCTCAGCCCTGTCCTCACTGAACCCCA ICCTCCTTGCCCTGGATCGAAGGCTCCAAACCATCCTAAGGACAGTGGAGGGGTGCATAGTGGTCCATG JGACAGCAGAGTGTCATATAGGTCGACGGC
NOV7c, CG53147-03 SEQ ID NO: 62 522 aa MW at 59563.8kD Protein Sequence
JMPGLRRDRLLTLLLLGALLSADLYFHLWPQVQRQLRPRERPRGCPCTGRASSLARDSAAAASDPGTIV 'HNFSRTEPRTEPAGGSHSGSSSKLQALFAHPLYNVPEEPPLLGAEDSLLASQEALRYYRRKVARWNRR JHK3YREQI-MLTSLDPPLQLRLEASWVQFHLGINRHGLYSRSSPVVSKLLQD RHFPTISADYSQDEKA ILLGACDCTQIVKPSGVHLKLVLRFSDFGKAMFKPMRQQRDEΞTPVDFFYFIDFQRHNAEIAAFHLDRI ILDFRRVPPTVGRIVNVTKEILEVTKNEILQSVFFVSPASNVCFFAKCPYMCKTΞYAVCGNPHLLEGSL ISAFLPSLK^APRLSVPNPWIRSYTLAGKEE EVNPLYCDTVKQIYPYNNSQRLLNVIDMAIFDFLIGN
MDRHHYEMFTKFGDDGFLIHLDNARGFGRHSHDEISILSPLSQCCMIKKKTLLHLQLLAQADYRLSDV MRESLLEDQLSPVLTEPHLLALDRRLQTILRTVEGCIVVHGQQSVI
NOV7d, 316900904 SEQ ID NO: 63 1578 bp DNA Sequence ORF Stop: TAG at 1579
IATGCCGGGGCTGCGCCGGGACCGCCTACTGACTCTGCTGCTGCTGGGCGCGCT
JGCTCTCCGCCGACCTCTACTTCCACCTCTGGCCCCAAGTACAGCGCCAGCTGCGGCCTCGGGAGCGCC
JCGCGGGGGTGCCCGTGCACCGGCCGCGCCTCCTCCCTGGCGCGGGACTCGGCCGCAGCTGCCTCGGAC
ICCCGGCACGATCGTGCACAACTTTTCCCGAACCGAGCCCCGGACTGAACCGGCTGGCGGCAGCCACAG
JCGGGTCGAGCTCCAAGTTGCAGGCCCTCTTCGCCCACCCGCTGTACAACGTCCCGGAGGAGCCGCCTC
JTCCTGGGAGCCGAGGACTCGCTCCTGGCCAGCCAGGAGGCGCTGCGGTATTACCGGAGGAAGGTGGCC
ICGCTGGAACAGGCGACACAAGATGTACAGAGAGCAGATGAACCTTACCTCCCTGGACCCCCCACTGCA
JGCTCCGACTCGAGGCCAGCTGGGTCCAGTTCCACCTGGGTATTAACCGCCATGGGCTCTACTCCCGGT
.CCAGCCCTGTTGTCAGCAAACTTCTGCAAGACATGAGGCACTTTCCCACCATCAGTGCTGATTACAGT
SCAAGATGAGAAAGCCTTGCTGGGGGCATGTGACTGCACCCAGATTGTGAAACCCAGTGGGGTCCACCT
|CAAGCTGGTGCTGAGGTTCTCGGATTTCGGGAAGGCCATGTTCAAACCCATGAGACAGCAGCGAGATG
JAGGAGACACCAGTGGACTTCTTCTACTTCATTGACTTTCAGAGACACAATGCTGAGATCGCAGCTTTC
CATCTGGACAGGATTCTGGACTTCCGACGGGTGCCGCCAACAGTGGGGAGGATAGTAAATGTCACCAA
GGAAATCCTAGAGGTCACCAAGAATGAAATCCTGCAGAGTGTTTTCTTTGTCTCTCCAGCGAGCAACG
ITGTGCTTCTTCGCCAAGTGTCCATACATGTGCAAGACGGAGTATGCTGTCTGTGGCAAACCACACCTG
JCTGGAGGGTTCCCTCTCTGCCTTCCTGCCGTCCCTCAACCTGGCCCCCAGGCTGTCTGTGCCCAACCC
|CTGGATCCGCTCCTACACACTGGCAGGAAAAGAGGAGTGGGAGGTCAATCCCCTTTACTGTGACACAG
JTGAAACAGATCTACCCGTACAACAACAGCCAGCGGCTCCTCAATGTCATCGACATGGCCATCTTCGAC
FTTCTTGATAGGGAATATGGACCGGCACCATTATGAGATGTTCACCAAGTTCGGGGATGATGGGTTCCT
TATTCACCTTGACAACGCCAGAGGGTTCGGACGACACTCCCATGATGAAATCTCCATCCTCTCGCCTC
TCTCCCAGTGCTGCATGATAAAAAAGAAAACACTTTTGCACCTGCAGCTGCTGGCCCAAGCTGACTAC AGACTCAGCGATGTGATGCGAGAATCACTGCTGGAAGACCAGCTCAGCCCTGTCCTCACTGAACCCCA CCTCCTTGCCCTGGATCGAAGGCTCCAAACCATCCTAAGGACAGTGGAGGGGTGCATAGTGGTCCATG GACAGCAGAGTGTCATATAGGTCGACGG
NOV7d, 316900904 SEQ ID NO: 64 527 aa MW about 60000kD Protein Sequence
MPGLRRDRLLTLLLLGALLSADLYFHLWPQVQRQLRPRERPRGCPCTGRASSLARDSAAAASD
IPGTIVHNFSRTEPRTΞPAGGSHSGSSSKLQALFAHPLYNVPEEPPLLGAEDSLLASQEALRYYRRKVA
SRWNRRHK YREQMNLTSLDPPLQLRLEASWVQFHLGINRHGLYSRSSPVVSKLLQDMRHFPTISADYS
JQDEKALLGACDCTQIVKPSGVHLKLVLRFSDFGKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEIAAF
!HLDRILDFRRVPPTVGRIVNVTKEILEVTKNEILQSVFFVSPASNVCFFAKCPYMCKTEYAVCGKPHL
JLEGSLSAFLPSLNLAP-^SVPNPWIRSYTLAGKE--mrEVNPLYCDTVKQIYPYNNSQRLLNVIDMAIFD
JFLIGNMDRHHYEMFTKFGDDGFLIHLDNARGFGRHSHDEISILSPLSQCCMIKKKTLLHLQLLAQADY iRLSDVMRESLLEDQLSPVLTEPHLLALDRRLQTILRTVEGCIWHGQQSVI
NOV7e, 316900924 SEQ ID NO: 65 1578 bp
DNA Sequence ORF Start: at 1 ORF Stop: TAG at 1579
ATGCCGGGGCTGCGCCGGGACCGCCTACTGACTCTGCTGCTGCTGGGCGCGCT
IGCTCTCCGCCGACCTCTACTTCCACCTCTGGCCCCAAGTACAGCGCCAGCTGCGGCCTCGGGAGCGCC JCGCGGGGGTGCCCGTGCACCGGCCGCGCCTCCTCCCTGGCGCGGGACTCGGCCGCAGCTGCCTCGGAC ^CCCGGCACGATCGTGCACAACTTTTCCCGAACCGAGCCCCGGACTGAACCGGCTGGCGGCAGCCACAG ICGGGTCGAGCTCCAAGTTGCAGGCCCTCTTCGCCCACCCGCTGTACAACGTCCCGGAGGAGCCGCCTC iTCCTGGGAGCCGAGGACTCGCTCCTGGCCAGCCAGGAGGCGCTGCGGTATTACCGGAGGAAGGTGGCC iCGCTGGAACAGGCGACACAAGATGTACAGAGAGCAGATGAACCTTACCTCCCTGGACCCCCCACTGCA fGCTCCGACTCGAGGCCAGCTGGGTCCAGTTCCACCTGGGTATTAACCGCCATGGGCTCTACTCCCGGT SCCAGCCCTGTTGTCAGCAAACTTCTGCAAGACATGAGGCACTTTCCCACCATCAGTGCTGATTACAGT iCAAGATGAGAAAGCCTTGCTGGGGGCATGTGACTGCACCCAGATTGTGAAACCCAGTGGGGTCCACCT iCAAGCTGGTGCTGAGGTTCTCGGATTTCGGGAAGGCCATGTTCAAACCCATGAGACAGCAGCGAGATG :AGGAGACACCAGTGGACTTCTTCTACTTCATTGACTTTCAGAGACACAATGCTGAGATCGCAGCTTTC CATCTGGACAGGATTCTGGACTTCCGACGGGTGCCGCCAACAGTGGGGAGGATAGTAAATGTCACCAA GGAAATCCTAGAGGTCACCAAGAATGAAATCCTGCAGAGTGTTTTCTTTGTCTCTCCAGCGAGCAACG TGTGCTTCTTCGCCAAGTGTCCATACATGTGCAAGACGGAGTATGCTGTCTGTGGCAACCCACACCTG jCTGGAGGGTTCCCTCTCTGCCTTCCTGCCGTCCCTCAACCTGGCCCCCAGGCTGTCTGTGCCCAACCC JCTGGATCCGCTCCTACACACTGGCAGGAAAAGAGGAGTGGGAGGTCAATCCCCTTTACTGTGACACAG jTGAAACAGATCTACCCGTACAACAACAGCCAGCGGCTCCTCAATGTCATCGACATGGCCATCTTCGAC jTTCTTGATAGGGAATATGGACCGGCACCATTATGAGATGTTCACCAAGTTCGGGGATGATGGGTTCCT jTATTCACCTTGACAACGCCAGAGGGTTCGGACGACACTCCCATGATGAAATCTCCATCCTCTCGCCTC jTCTCCCAGTGCTGCATGATAAAAAAGAAAACACTTTTGCACCTGCAGCTGCTGGCCCAAGCTGACTAC AGACTCAGCGATGTGATGCGAGAATCACTGCTGGAAGACCAGCTCAGCCCTGTCCTCACTGAACCCCA JCCTCCTTGCCCTGGATCGAAGGCTCCAAACCATCCTAAGGACAGTGGAGGGGTGCATAGTGGTCCATG IGACAGCAGAGTGTCATATAGGTCGACGGC
NOV7e, 316900924 SEQ ID NO: 66 527 aa MW about 60000kD Protein Sequence
MPGLRRDRLLTLLLLGALLSADLYFHLWPQVQRQLRPRERPRGCPCTGRASSLARDSAAAASD
JPGTIVHNFSRTEPRTEPAGGSHSGSSSKLQALFAHPLYNVPEEPPLLGAEDSLLASQEALRYYRRKVA
JRWNRRHKR-YREQM LTSLDPPLQLRLEASWVQFHLGINRHGLYSRSSPVVSKLLQDMRHFPTISADYS jQDEKALLGACDCTQIVKPSGVHLKLVLRFSDFGKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEIAAF
IHLDRILDFRRVPPTVGRIVNVTKEILEVTKNEILQSVFFVSPASNVCFFAKCPYMCKTEYAVCGNPHL
JLEGSLSAFLPSLNLAPRLSVPNP IRSYTLAGKEEWEVNPLYCDTVKQIYPYNNSQRLLNVID AIFD iFLIGIlMDRHHYEMFTKFGDDGFLIHLDNARGFGRHSHDEISILSPLSQCCMIKKKTLLHLQLLAQADY
IRLSDVMRΞSLLΞDQLSPVLTEPHLLALDRRLQTILRTVEGCIWHGQQSVI
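The ORF coordinates listed for NOV7c above (ATG at 16, TAG at 1582, 522 aa) can be checked for internal consistency with simple arithmetic. The sketch below is illustrative only; the function name and the convention that both coordinates are 1-based positions of the first base of the codon are assumptions, not part of the original disclosure.

```python
def orf_codon_count(start_pos: int, stop_pos: int) -> int:
    """Number of coding codons between a start codon and a stop codon.

    Assumes (hypothetically) that both positions are 1-based and point at
    the first base of the start codon and of the stop codon, respectively.
    The stop codon itself is not counted.
    """
    coding_nt = stop_pos - start_pos
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return coding_nt // 3


# NOV7c as listed: ORF Start ATG at 16, ORF Stop TAG at 1582
nov7c_codons = orf_codon_count(16, 1582)
```

Under these assumptions the coding region spans 1566 nucleotides, which corresponds to the 522 aa length stated for SEQ ID NO: 62.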
A ClustalW comparison of the above protein sequences yields the following sequence alignment shown in Table 7B.
Table 7B. Comparison of the NOV7 protein sequences.
NOV7a MPGLRRDRLLTLLLLGALLSADLYFHLWPQVQRQLRPRERPRGCPCTGRAS
NOV7b MPGLRRDRLLTLLLLGALLSADLYFHLWPQVQRQLRPRERPRGCPCTGRAS
NOV7c MPGLRRDRLLTLLLLGALLSADLYFHLWPQVQRQLRPRERPRGCPCTGRAS
NOV7d MPGLRRDRLLTLLLLGALLSADLYFHLWPQVQRQLRPRERPRGCPCTGRAS
NOV7e MPGLRRDRLLTLLLLGALLSADLYFHLWPQVQRQLRPRERPRGCPCTGRAS
NOV7a SLARDSAAAASDPGTIVHNFSRTEPRTEPAGGSHSGSSSKLQALFAHPLYNVPEEPPLLG
NOV7b SLARDSAAAASDPGTIVHNFSRTEPRTEPAGGSHSGSSSKLQALFAHPLYNVPEEPPLLG
NOV7c SLARDSAAAASDPGTIVHNFSRTEPRTEPAGGSHSGSSSKLQALFAHPLYNVPEEPPLLG
NOV7d SLARDSAAAASDPGTIVHNFSRTEPRTEPAGGSHSGSSSKLQALFAHPLYNVPEEPPLLG
NOV7e SLARDSAAAASDPGTIVHNFSRTEPRTEPAGGSHSGSSSKLQALFAHPLYNVPEEPPLLG
NOV7a AEDSLLASQEALRYYRRKVARWNRRHKMYREQMNLTSLDPPLQLRLEASWVQFHLGINRH
NOV7b AEDSLLASQEALRYYRRKVARWNRRHKMYREQMNLTSLDPPLQLRLEASWVQFHLGINRH
NOV7c AEDSLLASQEALRYYRRKVARWNRRHKMYREQMNLTSLDPPLQLRLEASWVQFHLGINRH
NOV7d AEDSLLASQEALRYYRRKVARWNRRHKMYREQMNLTSLDPPLQLRLEASWVQFHLGINRH
NOV7e AEDSLLASQEALRYYRRKVARWNRRHKMYREQMNLTSLDPPLQLRLEASWVQFHLGINRH
NOV7a GLYSRSSPVVSKLLQDMRHFPTISADYSQDEKALLGACDCTQIVKPSGVHLKLVLRFSDF
NOV7b GLYSRSSPVVSKLLQDMRHFPTISADYSQDEKALLGACDCTQIVKPSGVHLKLVLRFSDF
NOV7c GLYSRSSPVVSKLLQDMRHFPTISADYSQDEKALLGACDCTQIVKPSGVHLKLVLRFSDF
NOV7d GLYSRSSPVVSKLLQDMRHFPTISADYSQDEKALLGACDCTQIVKPSGVHLKLVLRFSDF
NOV7e GLYSRSSPVVSKLLQDMRHFPTISADYSQDEKALLGACDCTQIVKPSGVHLKLVLRFSDF
NOV7a GKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEIAAFHLDRILDFRRVPPTVGRIVNVTKE
NOV7b GKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEIAAFHLDRILDFRRVPPTVGRIVNVTKE
NOV7c GKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEIAAFHLDRILDFRRVPPTVGRIVNVTKE
NOV7d GKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEIAAFHLDRILDFRRVPPTVGRIVNVTKE
NOV7e GKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEIAAFHLDRILDFRRVPPTVGRIVNVTKE
NOV7a ILEVTKNEILQSVFFASPVSNVCFFAKCPYMCKTEYAVCGNPHLLEGSLSAFLPSLNLAP
NOV7b ILEVTKNEILQSVFFVSPASNVCFFAKCPYMCKTEYAVCGNPHLLEGSLSAFLPSLNLAP
NOV7c ILEVTKNEILQSVFFVSPASNVCFFAKCPYMCKTEYAVCGNPHLLEGSLSAFLPSLNLAP
NOV7d ILEVTKNEILQSVFFVSPASNVCFFAKCPYMCKTEYAVCGKPHLLEGSLSAFLPSLNLAP
NOV7e ILEVTKNEILQSVFFVSPASNVCFFAKCPYMCKTEYAVCGNPHLLEGSLSAFLPSLNLAP
NOV7a RLSVPNPWIRSYTLAGKEEWEVNPLYCDTVKQIYPYNNSQRLLNVIDMAIFDFLIGNMDR
NOV7b RLSVPNPWIRSYTLAGKEEWEVNPLYCDTVKQIYPYNNSQRLLNVIDMAIFDFLIGNMDR
NOV7c RLSVPNPWIRSYTLAGKEEWEVNPLYCDTVKQIYPYNNSQRLLNVIDMAIFDFLIGNMDR
NOV7d RLSVPNPWIRSYTLAGKEEWEVNPLYCDTVKQIYPYNNSQRLLNVIDMAIFDFLIGNMDR
NOV7e RLSVPNPWIRSYTLAGKEEWEVNPLYCDTVKQIYPYNNSQRLLNVIDMAIFDFLIGNMDR
NOV7a HHYEMFTKFGDDGFLIHLDNARGFGRHSHDEISILSPLSQCCMIKKKTLLHLQLLAQADY
NOV7b HHYEMFTKFGDDGFLIHLDNARGFGRHSHDEISILSPLSQCCMIKKKTLLHLQLLAQADY
NOV7c HHYEMFTKFGDDGFLIHLDNARGFGRHSHDEISILSPLSQCCMIKKKTLLHLQLLAQADY
NOV7d HHYEMFTKFGDDGFLIHLDNARGFGRHSHDEISILSPLSQCCMIKKKTLLHLQLLAQADY
NOV7e HHYEMFTKFGDDGFLIHLDNARGFGRHSHDEISILSPLSQCCMIKKKTLLHLQLLAQADY
NOV7a RLSDVMRESLLEDQLSPVLTEPHLLALDRRLQTILRTVEGCIVVHGQQSVI
NOV7b RLSDVMRESLLEDQLSPVLTEPHLLALDRRLQTILRTVEGCIVAHGQQSVI
NOV7c RLSDVMRESLLEDQLSPVLTEPHLLALDRRLQTILRTVEGCIVVHGQQSVI
NOV7d RLSDVMRESLLEDQLSPVLTEPHLLALDRRLQTILRTVEGCIVVHGQQSVI
NOV7e RLSDVMRESLLEDQLSPVLTEPHLLALDRRLQTILRTVEGCIVVHGQQSVI
NOV7a (SEQ ID NO: 58)
NOV7b (SEQ ID NO: 60)
NOV7c (SEQ ID NO: 62)
NOV7d (SEQ ID NO: 64)
NOV7e (SEQ ID NO: 66)
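An alignment such as Table 7B can be summarized by counting identical columns between two aligned rows. The following is a minimal sketch, not the ClustalW procedure itself; the function name is hypothetical, and the two 22-residue fragments are short excerpts from the NOV7a and NOV7b protein sequences listed above, chosen because they span the positions where the two variants differ.

```python
def aligned_identity(row_a: str, row_b: str):
    """Return (matches, compared) for two rows of an existing alignment.

    Columns where both rows carry a gap ('-') are skipped; a gap opposite
    a residue counts as a compared, non-matching column.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    matches = compared = 0
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return matches, compared


# Excerpts only (22 residues each), not the full-length proteins.
nov7a_fragment = "ILQSVFFASPVSNVCFFAKCPY"
nov7b_fragment = "ILQSVFFVSPASNVCFFAKCPY"

matches, compared = aligned_identity(nov7a_fragment, nov7b_fragment)
percent_identity = 100.0 * matches / compared
```

For these excerpts, 20 of 22 columns are identical; the two mismatches are the A/V and V/A substitutions that distinguish NOV7a from the other variants in this region.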
Further analysis of the NOV7a protein yielded the following properties shown in Table 7C.
Table 7C. Protein Sequence Properties NOV7a
SignalP analysis: Cleavage site between residues 34 and 35
PSORT II analysis:
PSG: a new signal peptide prediction method
N-region: length 8; pos.chg 3; neg.chg 1
H-region: length 13; peak value 10.14
PSG score: 5.74
GvH: von Heijne's method for signal seq. recognition
GvH score (threshold: -2.1): 2.15
possible cleavage site: between 21 and 22
>>> Seems to have a cleavable signal peptide (1 to 21)
ALOM: Klein et al's method for TM region allocation
Init position for calculation: 22
Tentative number of TMS(s) for the threshold 0.5: 0
number of TMS(s) .. fixed
PERIPHERAL Likelihood = 0.53 (at 393)
ALOM score: 0.53 (number of TMSs: 0)
MTOP: Prediction of membrane topology (Hartmann et al.)
Center position for calculation: 10
Charge difference: -1.5 C( 1.5) - N( 3.0)
N >= C: N-terminal side will be inside
MITDISC: discrimination of mitochondrial targeting seq
R content: 3 Hyd Moment(75): 4.54 Hyd Moment(95): 2.30 G content: 2
D/E content: 2 S/T content: 2
Score: -5.32
Gavel: prediction of cleavage sites for mitochondrial preseq
R-2 motif at 59 GRA|SS
NUCDISC: discrimination of nuclear localization signals
pat4: RRHK (3) at 135
pat7: PGLRRDR (3) at 2
bipartite: none
content of basic residues: 11.7%
NLS Score: -0.03
NNCN: Reinhardt's method for Cytoplasmic/Nuclear discrimination
Prediction: cytoplasmic
Reliability: 55.5
Final Results (k = 9/23):
33.3 %: endoplasmic reticulum
33.3 %: extracellular, including cell wall
22.2 %: mitochondrial
11.1 %: vacuolar
A search of the NOV7a protein against the Geneseq database, a proprietary database that contains sequences published in patents and patent publications, yielded several homologous proteins shown in Table 7D.
Figure imgf000089_0001
In a BLAST search of public sequence databases, the NOV7a protein was found to have homology to the proteins shown in the BLASTP data in Table 7E.
Figure imgf000089_0002
Example B: Sequencing Methodology and Identification of NOVX Clones
1. GeneCalling™ Technology: A method of differential gene expression profiling between two or more samples (Nature Biotechnology 17:198, 1999) was used to identify NOVX genes. Briefly, cDNA was derived from various human samples of whole tissue, primary cells or tissue cultured primary cells or cell lines representing multiple tissue types, normal and diseased states, physiological states, and developmental states from different donors. Cells and cell lines may have been treated with biological or chemical agents that regulate gene expression, for example, growth factors, chemokines or steroids. The cDNA thus derived was then digested with up to as many as 120 pairs of restriction enzymes, and pairs of linker-adaptors specific for each pair of restriction enzymes were ligated to the appropriate end. The restriction digestion generates a mixture of unique cDNA gene fragments. Limited PCR amplification is performed with primers homologous to the linker-adaptor sequence, where one primer is biotinylated and the other is fluorescently labeled. The doubly labeled material is isolated and the fluorescently labeled single strand is resolved by capillary gel electrophoresis. A computer algorithm compares the electropherograms from an experimental and control group for each of the restriction digestions. This and additional sequence-derived information is used to predict the identity of each differentially expressed gene fragment using a variety of genetic databases. The identity of the gene fragment is confirmed by additional, gene-specific competitive PCR or by isolation and sequencing of the gene fragment.
2. SeqCalling™ Technology: The cDNA thus derived was then sequenced using CuraGen Corporation's proprietary SeqCalling technology. Sequence traces were evaluated manually and edited for corrections if appropriate. cDNA sequences from all samples were assembled together, sometimes including public human sequences, using bioinformatic programs to produce a consensus sequence for each assembly. Sequences were included as components for assembly when the extent of identity with another component was at least 95% over 50 bp.
Each assembly represents a gene or portion thereof and includes information on variants, such as splice forms, single nucleotide polymorphisms (SNPs), insertions, deletions and other sequence variations.
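The inclusion criterion stated above (at least 95% identity over 50 bp) can be illustrated with a small sliding-window check. This is only a sketch under simplifying assumptions: the two sequences are taken to be already aligned, ungapped and of equal length, and the function names are hypothetical. It does not represent the proprietary SeqCalling assembly software.

```python
def window_matches(a: str, b: str, start: int, window: int) -> int:
    """Count identical positions in one window of two pre-aligned sequences."""
    return sum(1 for x, y in zip(a[start:start + window],
                                 b[start:start + window]) if x == y)


def meets_inclusion_threshold(a: str, b: str,
                              min_percent: int = 95,
                              window: int = 50) -> bool:
    """True if any `window`-bp stretch shows at least `min_percent` identity.

    Assumption: `a` and `b` are ungapped, equal-length, already aligned.
    Integer arithmetic is used to avoid floating-point edge cases.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if len(a) < window:
        return False
    return any(
        window_matches(a, b, s, window) * 100 >= min_percent * window
        for s in range(len(a) - window + 1)
    )


# Synthetic 60-bp example (not real data).
reference = "ACGT" * 15
few_errors = "NNN" + reference[3:]  # 3 mismatches clustered at the 5' end
spread_errors = "".join(
    "N" if i % 10 == 0 else base for i, base in enumerate(reference)
)  # one mismatch every 10 bp
```

With these synthetic inputs, `few_errors` qualifies because a 50-bp window downstream of the clustered mismatches is fully identical, whereas `spread_errors` never exceeds 90% identity in any 50-bp window and is rejected.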
3. PathCalling™ Technology: The NOVX nucleic acid sequences are derived by laboratory screening of a cDNA library by the two-hybrid approach by methods previously described (Nature 403:623, 2000; U.S. Patents 6,057,101 and 6,083,693).
4. RACE: Techniques based on the polymerase chain reaction, such as rapid amplification of cDNA ends (RACE), were used to isolate or complete the predicted sequence of the cDNA of the invention. Usually multiple clones were sequenced from one or more human samples to derive the sequences for fragments. Various human tissue samples from different donors were used for the RACE reaction. The sequences derived from these procedures were included in the SeqCalling Assembly process described in preceding paragraphs.
5. Exon Linking: The NOVX target sequences identified in the present invention were subjected to the exon linking process to confirm the sequence. PCR primers were designed by starting at the most upstream sequence available, for the forward primer, and at the most downstream sequence available for the reverse primer. In each case, the sequence was examined, walking inward from the respective termini toward the coding sequence, until a suitable sequence that is either unique or highly selective was encountered, or, in the case of the reverse primer, until the stop codon was reached. Such primers were designed based on in silico predictions for the full length cDNA, part (one or more exons) of the DNA or protein sequence of the target sequence, or by translated homology of the predicted exons to closely related human sequences or sequences from other species. These primers were then employed in PCR amplification based on the following pool of human cDNAs: adrenal gland, bone marrow, brain - amygdala, brain - cerebellum, brain - hippocampus, brain - substantia nigra, brain - thalamus, brain - whole, fetal brain, fetal kidney, fetal liver, fetal lung, heart, kidney, lymphoma - Raji, mammary gland, pancreas, pituitary gland, placenta, prostate, salivary gland, skeletal muscle, small intestine, spinal cord, spleen, stomach, testis, thyroid, trachea, uterus. Usually the resulting amplicons were gel purified, cloned and sequenced to high redundancy. The PCR product derived from exon linking was cloned into the pCR2.1 vector from Invitrogen. The resulting bacterial clone has an insert covering the entire open reading frame cloned into the pCR2.1 vector. The resulting sequences from all clones were assembled with themselves, with other fragments in CuraGen Corporation's database and with public ESTs. Fragments and ESTs were included as components for an assembly when the extent of their identity with another component of the assembly was at least 95% over 50 bp.
In addition, sequence traces were evaluated manually and edited for corrections if appropriate. These procedures provide the sequence reported herein.
6. Physical Clone: Exons were predicted by homology and the intron/exon boundaries were determined using standard genetic rules. Exons were further selected and refined by means of similarity determination using multiple BLAST (for example, tBlastN, BlastX, and BlastN) searches, and, in some instances, GeneScan and Grail. Expressed sequences from both public and proprietary databases were also added when available to further define and complete the gene sequence. The DNA sequence was then manually corrected for apparent inconsistencies thereby obtaining the sequences encoding the full-length protein.
The PCR product derived by exon linking, covering the entire open reading frame, was cloned into the pCR2.1 vector from Invitrogen to provide clones used for expression and screening purposes.
Example C: Quantitative expression analysis of clones in various cells and tissues
The quantitative expression of various NOV genes was assessed using microtiter plates containing RNA samples from a variety of normal and pathology-derived cells, cell lines and tissues using real time quantitative PCR (RTQ-PCR) performed on an Applied Biosystems ABI PRISM® 7700 or an ABI PRISM® 7900 HT Sequence Detection System.
RNA integrity of all samples was determined by visual assessment of agarose gel electropherograms using 28S and 18S ribosomal RNA staining intensity ratio as a guide (2:1 to 2.5:1 28S:18S) and the absence of low molecular weight RNAs (degradation products). Control samples to detect genomic DNA contamination included RTQ-PCR reactions run in the absence of reverse transcriptase using probe and primer sets designed to amplify across the span of a single exon.
RNA samples were normalized in reference to nucleic acids encoding constitutively expressed genes (i.e., β-actin and GAPDH). Alternatively, non-normalized RNA samples were converted to single strand cDNA (sscDNA) using Superscript II (Invitrogen Corporation, Catalog No. 18064-147) and random hexamers according to the manufacturer's instructions. Reactions contained up to 10 μg of total RNA in a volume of 20 μl, or were scaled up to contain 50 μg of total RNA in a volume of 100 μl, and were incubated for 60 minutes at 42° C. sscDNA samples were then normalized in reference to nucleic acids as described above.
Probes and primers were designed according to Applied Biosystems Primer Express Software package (version I for Apple Computer's Macintosh Power PC) or a similar algorithm using the target sequence as input. Default reaction condition settings and the following parameters were set before selecting primers: 250 nM primer concentration; 58°-60° C primer melting temperature (Tm) range; 59° C primer optimal Tm; 2° C maximum primer difference (if probe does not have 5' G, probe Tm must be 10° C greater than primer Tm); and 75 bp to 100 bp amplicon size. The selected probes and primers were synthesized by Synthegen (Houston, TX). Probes were double purified by HPLC to remove uncoupled dye and evaluated by mass spectroscopy to verify coupling of reporter and quencher dyes to the 5' and 3' ends of the probe, respectively. Their final concentrations were: 900 nM forward and reverse primers, and 200 nM probe.
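The primer and probe selection parameters listed above can be expressed as a simple rule check. The sketch below is hypothetical and is not the Primer Express algorithm: it assumes melting temperatures have already been computed elsewhere, reads the probe rule literally (the 10° C margin is applied when the probe lacks a 5' G), compares the probe against the higher of the two primer Tm values, and treats "greater" as "at least 10° C greater". All class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class AssayDesign:
    forward_tm: float   # forward primer Tm, degrees C (precomputed)
    reverse_tm: float   # reverse primer Tm, degrees C (precomputed)
    probe_tm: float     # probe Tm, degrees C (precomputed)
    probe_seq: str      # probe sequence written 5' to 3'
    amplicon_bp: int    # amplicon length in base pairs


def design_violations(d: AssayDesign) -> list:
    """Return a list of human-readable rule violations (empty if none)."""
    problems = []
    for name, tm in (("forward", d.forward_tm), ("reverse", d.reverse_tm)):
        if not 58.0 <= tm <= 60.0:
            problems.append(f"{name} primer Tm outside 58-60 C")
    if abs(d.forward_tm - d.reverse_tm) > 2.0:
        problems.append("primer Tm difference exceeds 2 C")
    # Literal reading of the text: margin applies when there is no 5' G.
    if not d.probe_seq.upper().startswith("G"):
        if d.probe_tm < max(d.forward_tm, d.reverse_tm) + 10.0:
            problems.append("probe Tm less than 10 C above primer Tm")
    if not 75 <= d.amplicon_bp <= 100:
        problems.append("amplicon size outside 75-100 bp")
    return problems


# Illustrative values only; the probe sequence is a placeholder.
acceptable = AssayDesign(59.0, 59.5, 69.5, "ACGTACGTACGT", 80)
too_long = AssayDesign(59.0, 59.5, 69.5, "ACGTACGTACGT", 120)
```

Returning a list of violations rather than a single boolean makes it easier to see which parameter a candidate design fails.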
Normalized RNA was spotted in individual wells of a 96 or 384-well PCR plate (Applied Biosystems). PCR cocktails included a single gene-specific probe and primers set or two multiplexed probe and primers sets. PCR reactions were done using TaqMan® One-Step RT-PCR Master Mix (Applied Biosystems, Catalog No. 4313803) following manufacturer's instructions. Reverse transcription was performed at 48° C for 30 minutes followed by amplification/PCR cycles: 95° C 10 min, then 40 cycles at 95° C for 15 seconds, followed by 60° C for 1 minute. Results were recorded as CT values (cycle at which a given sample crosses a threshold level of fluorescence) and plotted using a log scale, with the difference in RNA concentration between a given sample and the sample with the lowest CT value being represented as 2 to the power of delta CT. The percent relative expression was the reciprocal of the RNA difference multiplied by 100. CT values below 28 indicate high expression, between 28 and 32 indicate moderate expression, between 32 and 35 indicate low expression and above 35 reflect levels of expression that were too low to be measured reliably.
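The conversion from CT values to percent relative expression described above can be sketched as follows. The function names and sample labels are hypothetical, and the handling of CT values falling exactly on 28, 32 or 35 is an assumption, since the text does not assign those boundaries to a category.

```python
def percent_relative_expression(ct_values: dict) -> dict:
    """Map sample name -> percent expression relative to the lowest-CT sample.

    The RNA difference is 2 ** delta_CT, where delta_CT is the sample CT
    minus the lowest CT on the plate; relative expression is the reciprocal
    of that difference multiplied by 100.
    """
    lowest_ct = min(ct_values.values())
    return {
        sample: 100.0 / (2.0 ** (ct - lowest_ct))
        for sample, ct in ct_values.items()
    }


def expression_category(ct: float) -> str:
    """Classify a CT value using the ranges given in the text.

    Boundary handling (assumption): 28 and 32 fall into the higher-CT
    category, and exactly 35 is treated as 'low'.
    """
    if ct < 28:
        return "high"
    if ct < 32:
        return "moderate"
    if ct <= 35:
        return "low"
    return "too low to measure reliably"


# Illustrative CT values, not measured data.
example = percent_relative_expression(
    {"sample_a": 25.0, "sample_b": 26.0, "sample_c": 28.0}
)
```

In this illustration the lowest-CT sample is set to 100%, a sample one cycle later to 50%, and a sample three cycles later to 12.5%.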
Normalized sscDNA was analyzed by RTQ-PCR using 1X TaqMan® Universal Master mix (Applied Biosystems; catalog No. 4324020), following the manufacturer's instructions. PCR amplification and analysis were done as described above.
Panels 1, 1.1, 1.2, and 1.3D
Panels 1, 1.1, 1.2 and 1.3D included 2 control wells (genomic DNA control and chemistry control) and 94 wells of cDNA samples from cultured cell lines and primary normal tissues. Cell lines were derived from carcinomas (ca) including: lung, small cell (s cell var), non small cell (non-s or non-sm); breast; melanoma; colon; prostate; glioma (glio), astrocytoma (astro) and neuroblastoma (neuro); squamous cell (squam); ovarian; liver; renal; gastric and pancreatic from the American Type Culture Collection (ATCC). Normal tissues were obtained from individual adults or fetuses and included: adult and fetal skeletal muscle, adult and fetal heart, adult and fetal kidney, adult and fetal liver, adult and fetal lung, brain, spleen, bone marrow, lymph node, pancreas, salivary gland, pituitary gland, adrenal gland, spinal cord, thymus, stomach, small intestine, colon, bladder, trachea, breast, ovary, uterus, placenta, prostate, testis and adipose. The following abbreviations are used in reporting the results: metastasis (met); pleural effusion (pl. eff or pl effusion) and * indicates established from metastasis.
General_screening_panel_v1.4, v1.5, v1.6 and v1.7
Panels 1.4, 1.5, 1.6 and 1.7 were as described for Panels 1, 1.1, 1.2 and 1.3D, above, except that normal tissue samples were pooled from 2 to 5 different adults or fetuses.
Panels 2D, 2.2, 2.3 and 2.4
Panels 2D, 2.2, 2.3 and 2.4 included 2 control wells and 94 wells containing RNA or cDNA from human surgical specimens procured through the National Cancer Institute's Cooperative Human Tissue Network (CHTN) or the National Disease Research Interchange (NDRI), Ardais (Lexington, MA) or Clinomics BioSciences (Frederick, MD). Tissues included human malignancies and in some cases matched adjacent normal tissue (NAT). Information regarding histopathological assessment of tumor differentiation grade as well as the clinical stage of the patient from which samples were obtained was generally available. Normal tissue RNA and cDNA samples were purchased from various commercial sources such as Clontech (Palo Alto, CA), Research Genetics and Invitrogen (Carlsbad, CA).
HASS Panel v1.0
The HASS Panel v1.0 included 93 cDNA samples and 2 controls. The samples comprised: 81 samples of cultured human cancer cell lines subjected to serum starvation, acidosis and anoxia according to established procedures for various lengths of time; 3 human primary cells; and 9 malignant brain cancers (4 medulloblastomas and 5 glioblastomas). Cancer cell lines (ATCC) were cultured using recommended conditions and included: breast, prostate, bladder, pancreatic and CNS. Primary human cells were obtained from Clonetics (Walkersville, MD). Malignant brain samples were gifts from the Henry Ford Cancer Center.
ARDAIS Panel v1.0 and v1.1
The ARDAIS Panel v1.0 and v1.1 included 2 controls and 22 test samples including: human lung adenocarcinomas, lung squamous cell carcinomas, and in some cases matched adjacent normal tissues (NAT) obtained from Ardais (Lexington, MA). Unmatched malignant and non-malignant RNA samples from lungs with gross histopathological assessment of tumor differentiation grade and stage and clinical state of the patient were obtained from Ardais.
ARDAIS Prostate v1.0
ARDAIS Prostate v1.0 panel included 2 controls and 68 test samples of human prostate malignancies and in some cases matched adjacent normal tissues (NAT) obtained from Ardais. RNA from unmatched malignant and non-malignant prostate samples with gross histopathological assessment of tumor differentiation grade and stage and clinical state of the patient were also obtained from Ardais.
ARDAIS Kidney v1.0
ARDAIS Kidney v1.0 panel included 2 control wells and 44 test samples of human renal cell carcinoma and in some cases matched adjacent normal tissue (NAT) obtained from Ardais. RNA from unmatched renal cell carcinoma and normal tissue with gross histopathological assessment of tumor differentiation grade and stage and clinical state of the patient were also obtained from Ardais.
ARDAIS Breast v1.0
ARDAIS Breast v1.0 panel included 2 control wells and 71 test samples of human breast malignancies and in some cases matched adjacent normal tissue (NAT) obtained from Ardais. RNA from unmatched malignant and non-malignant breast samples with gross histopathological assessment of tumor differentiation grade and stage and clinical state of the patient were also obtained from Ardais.
Panels 3D, 3.1 and 3.2
Panels 3D, 3.1, and 3.2 included two controls, 92 cDNA samples of cultured human cancer cell lines and 2 samples of human primary cerebellum. Cell lines (ATCC, National Cancer Institute (NCI), German tumor cell bank) were cultured as recommended and were derived from: squamous cell carcinoma of the tongue, melanoma, sarcoma, leukemia, lymphoma, and epidermoid, bladder, pancreas, kidney, breast, prostate, ovary, uterus, cervix, stomach, colon, lung and CNS carcinomas.
Panels 4D, 4R, and 4.1D
Panels 4D, 4R, and 4.1D included 2 control wells and 94 test samples of RNA (Panel 4R) or cDNA (Panels 4D and 4.1D) from human cell lines or tissues related to inflammatory conditions. Controls included total RNA from normal tissues such as colon, lung (Stratagene, La Jolla, CA), thymus and kidney (Clontech). Total RNA from cirrhotic and lupus kidney was obtained from BioChain Institute, Inc., (Hayward, CA). Crohn's intestinal and ulcerative colitis samples were obtained from the National Disease Research Interchange (NDRI, Philadelphia, PA). Cells purchased from Clonetics included: astrocytes, lung fibroblasts, dermal fibroblasts, coronary artery smooth muscle cells, small airway epithelium, bronchial epithelium, microvascular dermal endothelial cells, microvascular lung endothelial cells, human pulmonary aortic endothelial cells, and human umbilical vein endothelial cells. These primary cell types were activated by incubating with various cytokines (IL-1 beta ~1-5 ng/ml, TNF alpha ~5-10 ng/ml, IFN gamma ~20-50 ng/ml, IL-4 ~5-10 ng/ml, IL-9 ~5-10 ng/ml, IL-13 5-10 ng/ml) or combinations of cytokines as indicated. Starved endothelial cells were cultured in the basal media (Clonetics) with 0.1% serum. Mononuclear cells were prepared from blood donations using Ficoll. LAK cells were cultured in culture media [DMEM, 5% FCS (Hyclone, Logan, UT), 100 mM non essential amino acids (Gibco/Life Technologies, Rockville, MD), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10⁻⁵ M (Gibco), and 10 mM Hepes (Gibco)] and interleukin 2 for 4-6 days. Cells were activated with 10-20 ng/ml PMA and 1-2 μg/ml ionomycin, 5-10 ng/ml IL-12, 20-50 ng/ml IFN gamma or 5-10 ng/ml IL-18 for 6 hours. In some cases, mononuclear cells were cultured for 4-5 days in culture media with ~5 mg/ml PHA (phytohemagglutinin) or PWM (pokeweed mitogen; Sigma-Aldrich Corp., St. Louis, MO). Samples were taken at 24, 48 and 72 hours for RNA preparation.
MLR (mixed lymphocyte reaction) samples were obtained by taking blood from two donors, isolating the mononuclear cells using Ficoll and mixing them 1:1 at a final concentration of ~2 x 10⁶ cells/ml in culture media. The MLR samples were taken at various time points from 1-7 days for RNA preparation.
Monocytes were isolated from mononuclear cells using CD14 Miltenyi Beads, positive VS selection columns and a Vario Magnet (Miltenyi Biotec, Auburn, CA) according to the manufacturer's instructions. Monocytes were differentiated into dendritic cells by culturing in culture media with 50 ng/ml GMCSF and 5 ng/ml IL-4 for 5-7 days. Macrophages were prepared by culturing monocytes for 5-7 days in culture media with ~50 ng/ml 10% type AB Human Serum (Life Technologies, Rockville, MD) or MCSF (Macrophage colony stimulating factor; R&D, Minneapolis, MN). Monocytes, macrophages and dendritic cells were stimulated for 6 or 12-14 hours with 100 ng/ml lipopolysaccharide (LPS). Dendritic cells were also stimulated with 10 μg/ml anti-CD40 monoclonal antibody (Pharmingen, San Diego, CA) for 6 or 12-14 hours.
CD4+ lymphocytes, CD8+ lymphocytes and NK cells were also isolated from mononuclear cells using CD4, CD8 and CD56 Miltenyi beads, positive VS selection columns and a Vario Magnet (Miltenyi Biotec, Auburn, CA) according to the manufacturer's instructions. CD45+RA and CD45+RO CD4+ lymphocytes were isolated by depleting mononuclear cells of CD8+, CD56+, CD14+ and CD19+ cells using CD8, CD56, CD14 and CD19 Miltenyi beads and positive selection. CD45RO Miltenyi beads were then used to separate the CD45+RO CD4+ lymphocytes from CD45+RA CD4+ lymphocytes. CD45+RA CD4+, CD45+RO CD4+ and CD8+ lymphocytes were cultured in culture media at 10⁶ cells/ml in culture plates precoated overnight with 0.5 mg/ml anti-CD28 (Pharmingen, San Diego, CA) and 3 μg/ml anti-CD3 (OKT3, ATCC) in PBS. After 6 and 24 hours, the cells were harvested for RNA preparation. To prepare chronically activated CD8+ lymphocytes, isolated CD8+ lymphocytes were activated for 4 days on anti-CD28, anti-CD3 coated plates and then harvested and expanded in culture media with IL-2 (1 ng/ml). These CD8+ cells were activated again with plate bound anti-CD3 and anti-CD28 for 4 days and expanded as described above. RNA was isolated 6 and 24 hours after the second activation and after 4 days of the second expansion culture. Isolated NK cells were cultured in culture media with 1 ng/ml IL-2 for 4-6 days before RNA was prepared.
B cells were prepared from minced and sieved tonsil tissue (NDRI). Tonsil cells were pelleted and resupended at 106 cells/ml in culture media. Cells were activated using 5 μg/ml PWM (Sigma-Aldrich Corp., St. Louis, MO) or -10 μg/ml anti-CD40 (Pharmingen) and 5-10 ng/ml IL-4. Cells were harvested for RNA preparation after 24, 48 and 72 hours.
To prepare primary and secondary Th1/Th2 and Tr1 cells, umbilical cord blood CD4+ lymphocytes (Poietic Systems, German Town, MD) were cultured at 10^5-10^6 cells/ml in culture media with IL-2 (4 ng/ml) in 6-well Falcon plates (precoated overnight with 10 μg/ml anti-CD28 (Pharmingen) and 2 μg/ml anti-CD3 (OKT3; ATCC), then washed twice with PBS).
To stimulate Th1 phenotype differentiation, IL-12 (5 ng/ml) and anti-IL4 (1 μg/ml) were used; for Th2 phenotype differentiation, IL-4 (5 ng/ml) and anti-IFN gamma (1 μg/ml) were used; and for Tr1 phenotype differentiation, IL-10 (5 ng/ml) was used. After 4-5 days, the activated Th1, Th2 and Tr1 lymphocytes were washed once with DMEM and expanded for 4-7 days in culture media with IL-2 (1 ng/ml). Activated Th1, Th2 and Tr1 lymphocytes were re-stimulated for 5 days with anti-CD28/CD3 and cytokines as described above, with the addition of anti-CD95L (1 μg/ml) to prevent apoptosis. After 4-5 days, the Th1, Th2 and Tr1 lymphocytes were washed and expanded in culture media with IL-2 for 4-7 days. Activated Th1 and Th2 lymphocytes were maintained for a maximum of three cycles. RNA was prepared from primary and secondary Th1, Th2 and Tr1 cells 6 and 24 hours after the second and third activations with plate-bound anti-CD3 and anti-CD28 mAbs, and 4 days into the second and third expansion cultures.
Leukocyte cell lines Ramos, EOL-1 and KU-812 were obtained from the ATCC. EOL-1 cells were further differentiated by culturing in culture media at 5 x 10^5 cells/ml with 0.1 mM dbcAMP for 8 days, changing the media every 3 days and adjusting the cell concentration to 5 x 10^5 cells/ml. RNA was prepared from resting cells or cells activated with PMA (10 ng/ml) and ionomycin (1 μg/ml) for 6 and 14 hours. RNA was prepared from the resting CCD 1106 keratinocyte cell line (ATCC) or from cells activated with ~5 ng/ml TNF alpha and 1 ng/ml IL-1 beta. RNA was prepared from the resting NCI-H292 airway epithelial tumor cell line (ATCC) or from cells activated for 6 and 14 hours in culture media with 5 ng/ml IL-4, 5 ng/ml IL-9, 5 ng/ml IL-13, and 25 ng/ml IFN gamma.
RNA was prepared by lysing approximately 10^7 cells/ml using Trizol (Gibco BRL), then adding 1/10 volume of bromochloropropane (Molecular Research Corporation, Cincinnati, OH), vortexing, incubating for 10 minutes at room temperature and then spinning at 14,000 rpm in a Sorvall SS34 rotor. The aqueous phase was placed in a 15 ml Falcon tube, and an equal volume of isopropanol was added and left at -20° C overnight. The precipitated RNA was spun down at 9,000 rpm for 15 min and washed in 70% ethanol. The pellet was redissolved in 300 μl of RNAse-free water with 35 μl buffer (Promega, Madison, WI), 5 μl DTT, 7 μl RNAsin and 8 μl DNAse and incubated at 37° C for 30 minutes to remove contaminating genomic DNA, extracted once with phenol chloroform and re-precipitated with 1/10 volume of 3 M sodium acetate and 2 volumes of 100% ethanol. The RNA was spun down, placed in RNAse-free water and stored at -80° C.
AI_comprehensive panel_v1.0
Autoimmunity (AI) comprehensive panel v1.0 included two controls and 89 cDNA test samples isolated from male (M) and female (F) surgical and postmortem human tissues that were obtained from the Backus Hospital and Clinomics (Frederick, MD). Tissue samples included: normal, adjacent (Adj); matched normal adjacent (match control); joint tissues (synovial (Syn) fluid, synovium, bone and cartilage, osteoarthritis (OA), rheumatoid arthritis (RA)); psoriatic; ulcerative colitis colon; Crohn's disease colon; and emphysematous, asthmatic, allergic and chronic obstructive pulmonary disease (COPD) lung.
Pulmonary and General inflammation (PGI) panel v1.0
Pulmonary and General inflammation (PGI) panel v1.0 included two controls and 39 test samples isolated as surgical or postmortem samples. Tissue samples included: five normal lung samples obtained from Maryland Brain and Tissue Bank, University of Maryland (Baltimore, MD), International Bioresource Systems, IBS (Tucson, AZ), and Asterand (Detroit, MI); five normal adjacent intestine tissues (NAT) from Ardais; ulcerative colitis samples (UC) (Ardais); Crohn's disease colon (NDRI); emphysematous tissue samples (Ardais and Genomic Collaborative Inc., Cambridge, MA); asthmatic tissue from Maryland Brain and Tissue Bank, University of Maryland (Baltimore, MD) and Genomic Collaborative Inc. (Cambridge, MA); and fibrotic tissue (Ardais and Genomic Collaborative).
Cellular OA/RA Panel
Cellular OA/RA panel includes 2 control wells and 35 test samples comprised of cDNA generated from total RNA isolated from human cell lines or primary cells representative of the human joint and its inflammatory condition. Cell types included normal human osteoblasts (Nhost) from Clonetics (Cambrex, East Rutherford, NJ), human chondrosarcoma SW1353 cells (ATCC), human fibroblast-like synoviocytes from Cell Applications, Inc. (San Diego, CA) and the MH7A cell line (rheumatoid fibroblast-like synoviocytes transformed with SV40 T antigen) from Riken Cell Bank (Tsukuba Science City, Japan). These cell types were activated by incubating with various cytokines (IL-1 beta ~1-10 ng/ml, TNF alpha ~5-50 ng/ml, or prostaglandin E2 for Nhost cells) for 1, 6, 18 or 24 h. All these cells were starved for at least 5 h and cultured in their corresponding basal medium with ~0.1 to 1% FBS.
Minitissue OA/RA Panel
The OA/RA mini panel includes two control wells and 31 test samples comprised of cDNA generated from total RNA isolated from surgical and postmortem human tissues obtained from the University of Calgary (Alberta, Canada), NDRI (Philadelphia, PA), and Ardais Corporation (Lexington, MA). Joint tissue samples include synovium, bone and cartilage from osteoarthritic and rheumatoid arthritis patients undergoing reconstructive knee surgery, as well as normal synovium samples (RNA and tissue). Visceral normal tissues were pooled from 2-5 different adults and included adrenal gland, heart, kidney, brain, colon, lung, stomach, small intestine, skeletal muscle, and ovary.
AI.05 chondrosarcoma
AI.05 chondrosarcoma plates included SW1353 cells (ATCC) subjected to serum starvation and treated for 6 and 18 h with cytokines that are known to induce MMP (1, 3 and 13) synthesis (e.g. IL-1 beta). These treatments included: IL-1 beta (10 ng/ml), IL-1 beta + TNF-alpha (50 ng/ml), IL-1 beta + Oncostatin (50 ng/ml) and PMA (100 ng/ml). Supernatants were collected and analyzed for MMP 1, 3 and 13 production. RNA was prepared from these samples using standard procedures.
Panels 5D and 5I
Panels 5D and 5I included two controls and cDNAs isolated from human tissues, human pancreatic islet cells, cell lines, metabolic tissues obtained from patients enrolled in the Gestational Diabetes study (described below), and cells from different stages of adipocyte differentiation, including differentiated (AD), midway differentiated (AM), and undifferentiated (U; human mesenchymal stem cells). Gestational Diabetes study subjects were young (18-40 years), otherwise healthy women with and without gestational diabetes undergoing routine (elective) Caesarean section. Uterine wall smooth muscle (UT), visceral (Vis) adipose, skeletal muscle (SK), placenta (PI), greater omentum adipose (GO Adipose) and subcutaneous (SubQ) adipose samples (less than 1 cc) were collected, rinsed in sterile saline, blotted and flash frozen in liquid nitrogen. Patients included: Patient 2, an overweight diabetic Hispanic not on insulin; Patients 7-9, obese non-diabetic Caucasians with body mass index (BMI) greater than 30; Patient 10, an overweight diabetic Hispanic, on insulin; Patient 11, an overweight nondiabetic African American; and Patient 12, a diabetic Hispanic on insulin.
Differentiated adipocytes were obtained from induced donor progenitor cells (Clonetics). Differentiated human mesenchymal stem cells (HuMSCs) were prepared as described in Mark F. Pittenger, et al., Multilineage Potential of Adult Human Mesenchymal Stem Cells (see Science Apr 2 1999: 143-147). mRNA was isolated and sscDNA was produced from Trizol lysates or frozen pellets. Human cell lines (ATCC, NCI or German tumor cell bank) included: kidney proximal convoluted tubule, uterine smooth muscle cells, small intestine, liver HepG2 cancer cells, heart primary stromal cells and adrenal cortical adenoma cells. Cells were cultured, RNA extracted and sscDNA was produced using standard procedures.
Panel 5I also contains pancreatic islets (Diabetes Research Institute at the University of Miami School of Medicine).
Human Metabolic RTQ-PCR Panel
Human Metabolic RTQ-PCR Panel included two controls (genomic DNA control and chemistry control) and 211 cDNAs isolated from human tissues and cell lines relevant to metabolic diseases. This panel identifies genes that play a role in the etiology and pathogenesis of obesity and/or diabetes. Metabolic tissues including placenta (PI), uterine wall smooth muscle (Ut), visceral adipose, skeletal muscle (Sk) and subcutaneous (SubQ) adipose were obtained from the Gestational Diabetes study (described above). Included in the panel are: Patients 7 and 8, obese non-diabetic Caucasians; Patient 12, a diabetic Caucasian with unknown BMI, on insulin (treated); Patient 13, an overweight diabetic Caucasian, not on insulin (untreated); Patient 15, an obese, untreated, diabetic Caucasian; Patients 17 and 25, untreated diabetic Caucasians of normal weight; Patient 18, an obese, untreated, diabetic Hispanic; Patient 19, a non-diabetic Caucasian of normal weight; Patient 20, an overweight, treated diabetic Caucasian; Patients 21 and 23, overweight non-diabetic Caucasians; Patient 22, a treated diabetic Caucasian of normal weight; Patient 23, an overweight non-diabetic Caucasian; and Patients 26 and 27, obese, treated, diabetic Caucasians.
Total RNA was isolated from metabolic tissues including: hypothalamus, liver, pancreas, pancreatic islets, small intestine, psoas muscle, diaphragm muscle, visceral (Vis) adipose, subcutaneous (SubQ) adipose and greater omentum (Go) from 12 Type II diabetic (Diab) patients and 12 non-diabetic (Norm) subjects at autopsy. Control diabetic and non-diabetic subjects were matched where possible for: age; sex, male (M) or female (F); ethnicity, Caucasian (CC), Hispanic (HI), African American (AA) or Asian (AS); and BMI, 20-25 (Low BM), 26-30 (Med BM) or overweight (Overwt), BMI greater than 30 (Hi BMI) (obese). RNA was extracted and ss cDNA was produced from cell lines (ATCC) by standard methods.
CNS Panels
CNS Panels CNSD.01, CNS Neurodegeneration V1.0 and CNS Neurodegeneration V2.0 included two controls and 46 to 94 test cDNA samples isolated from postmortem human brain tissue obtained from the Harvard Brain Tissue Resource Center (McLean Hospital). Brains were removed from calvaria of donors between 4 and 24 hours after death, and frozen at -80° C in liquid nitrogen vapor.
Panel CNSD.01
Panel CNSD.01 included two specimens each from: Alzheimer's disease, Parkinson's disease, Huntington's disease, Progressive Supranuclear Palsy (PSP), Depression, and normal controls. Collected tissues included: cingulate gyrus (Cing Gyr), temporal pole (Temp Pole), globus pallidus (Glob palladus), substantia nigra (Sub Nigra), primary motor strip (Brodmann Area 4), parietal cortex (Brodmann Area 7), prefrontal cortex (Brodmann Area 9), and occipital cortex (Brodmann Area 17). Not all brain regions are represented in all cases.
Panel CNS Neurodegeneration V1.0
The CNS Neurodegeneration V1.0 panel included six Alzheimer's disease (AD) brains and eight normal brains, the latter having either no dementia and no Alzheimer's-like pathology (control) or no dementia but evidence of severe Alzheimer's-like pathology (Control Path), specifically senile plaque load rated as level 3 on a scale of 0-3, where 0 indicates no evidence of plaques and 3 indicates severe AD senile plaque load. Tissues collected included: hippocampus, temporal cortex (Brodmann Area 21), parietal cortex (Brodmann Area 7), occipital cortex (Brodmann Area 17), superior temporal cortex (Sup Temporal Ctx) and inferior temporal cortex (Inf Temporal Ctx).
Gene expression was analyzed after normalization using a scaling factor calculated by subtracting the Well mean (CT average for the specific tissue) from the Grand mean (average CT value for all wells across all runs). The scaled CT value is the result of the raw CT value plus the scaling factor.
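The scaling described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software used for the panels: the function name `scale_ct_values`, the well names and the CT values are all hypothetical, and the well mean is taken here as the average CT of one well across runs.

```python
from statistics import mean


def scale_ct_values(raw_ct):
    """Return scaled CT values: scaled CT = raw CT + (grand mean - well mean).

    raw_ct maps a well (tissue sample) name to its raw CT values across runs.
    """
    # Grand mean: average CT value for all wells across all runs.
    grand_mean = mean(ct for values in raw_ct.values() for ct in values)
    scaled = {}
    for well, values in raw_ct.items():
        # Well mean: CT average for this specific tissue sample.
        well_mean = mean(values)
        scaling_factor = grand_mean - well_mean
        scaled[well] = [ct + scaling_factor for ct in values]
    return scaled


# Hypothetical CT values, not data from the panels.
example = {"well_1": [28.0, 30.0], "well_2": [26.0, 28.0]}
scaled = scale_ct_values(example)
```

With these made-up inputs the grand mean is 28.0, so the first well is shifted down by 1 CT and the second up by 1 CT, which removes the well-to-well offset while keeping the differences between runs within each well.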
Panel CNS Neurodegeneration V2.0
The CNS Neurodegeneration V2.0 panel included sixteen cases of Alzheimer's disease (AD) and twenty-nine normal controls (no evidence of dementia prior to death), including fourteen controls (Control) with no dementia and no Alzheimer's-like pathology and fifteen controls with no dementia but evidence of severe Alzheimer's-like pathology (AH3), specifically senile plaque load rated as level 3 on a scale of 0-3, where 0 indicates no evidence of plaques and 3 indicates severe AD senile plaque load. Tissues from the temporal cortex (Brodmann Area 21) included the inferior and superior temporal cortex that was pooled from a given individual (Inf & Sup Temp Ctx Pool).
A. NOV1, CG110205-03 and CG110205-07: A DISINTEGRIN-LIKE AND METALLOPROTEASE (REPROLYSIN TYPE) WITH THROMBOSPONDIN.
Expression of gene NOV1 CG110205-03 was assessed using the primer-probe sets Ag2430, Ag7067 and Ag7512, described in Tables AA, AC and AD. Results of the RTQ-PCR runs are shown in Tables AE, AF and AJ. The expression of variant CG110205-07 was assessed using the primer-probe sets Ag2430 and Ag4413, described in Tables AA and AB. Results of the RTQ-PCR runs are shown in Table AE.
Table AA. Probe Name Ag2430
Figure imgf000100_0001
Table AB. Probe Name Ag4413
Figure imgf000100_0002
Figure imgf000100_0003
Table AD. Probe Name Ag7512
Figure imgf000100_0004
Table AE. CNS_neurodegeneration_v1.0
Column A - Rel. Exp.(%) Ag2430, Run 208712834; Column B - Rel. Exp.(%) Ag4413, Run 224505949
Tissue Name A B Tissue Name A B
AD 1 Hippo 1.6 [illegible] AH3 4624 8.5 6.3
AD 2 Hippo 22.8 16.8 AH3 4640 1.6 3.1
Figure imgf000101_0001
Table AF. General_screening_panel_v1.7
Figure imgf000101_0002
Figure imgf000102_0001
Table AG. PGI1.0
Figure imgf000102_0002
Figure imgf000103_0001
Figure imgf000104_0001
Table AI. Panel 4.1D
Figure imgf000104_0002
Figure imgf000105_0001
Table AJ. AI_comprehensive panel_v1.0
Figure imgf000105_0002
Figure imgf000106_0001
CNS_neurodegeneration_v1.0 Summary: Ag2430/Ag4413 Gene expression was upregulated in the temporal cortex of patients with Alzheimer disease (AD). Detection of this gene in postmortem brain tissue from patients thought to have AD is useful in supporting an AD diagnosis. Therapeutic modulation of this gene, expressed protein and/or use of antibodies or small molecule drugs targeting the gene or gene product are useful in the treatment of Alzheimer disease.
General_screening_panel_v1.7 Summary: Ag7512 Highest gene expression was detected in the cerebellum (CT=27), and moderate to low expression was detected in all CNS samples tested. The protein product of this gene is useful as a specific target for the treatment of CNS disorders that originate in this region, such as autism and the ataxias.
Moderate to low gene expression was detected in tissues with metabolic/endocrine function including adipose, pancreas, heart, fetal skeletal muscle and liver. Therapeutic modulation of this gene, expressed protein and/or use of antibodies or small molecule drugs targeting the gene or gene product is useful in the treatment of endocrine/metabolically related diseases, such as obesity, diabetes, hypercholesterolemia and hypertension.
Moderate gene expression was seen in cell lines derived from brain, colon, lung, renal and melanoma cancers; therefore, gene expression is useful as a marker of these cancers and for differentiating these cell types from normal counterparts. Therapeutic modulation of this gene, expressed protein and/or use of antibodies or small molecule drugs targeting the gene or gene product are useful in the treatment of these cancers.
PGI1.0 Summary: Ag4413 Expression of this gene was upregulated in inflamed pulmonary and colon tissues such as emphysematous lung tissue samples, ulcerative colitis and Crohn's disease colon samples. This gene is useful for detecting inflammation in colon and lung tissues and can therefore be used alongside diagnostic tools for diseases such as emphysema, ulcerative colitis, and Crohn's disease.
Panel 2D Summary: Ag2430 Upregulated gene expression was detected in lung, thyroid, gastric and ovarian cancer specimens as compared to expression in the corresponding normal adjacent tissues. This protein is homologous to members of the family of ADAMTS proteins that are characterized by disintegrin, metalloproteinase and thrombospondin domains. The metalloproteinase domain plays a role in cell invasion and metastasis, and the thrombospondin domain plays a role in angiogenesis (Clin Cancer Res, 7(11):3437, 2001). Based on the expression profile of this gene and the role played by ADAMTS proteins, this gene is involved in tumor angiogenesis. Based on its expression profile, this gene is useful in distinguishing between cancer and non-cancerous lung, thyroid, gastric, and ovarian tissues.
Panel 4.1D Summary: Ag4413 Gene expression was detected in endothelial cells, including microvascular dermal endothelial cells, microvascular lung endothelial cells, human pulmonary aortic endothelial cells and human umbilical vein endothelial cells. Endothelial cells are known to play important roles in inflammatory responses by altering the expression of surface proteins that are involved in the activation and recruitment of effector inflammatory cells.
AI_comprehensive panel_v1.0 Summary: Ag7067 The highest expression of this gene was detected in a colon sample from a Crohn's disease patient. This gene was expressed at a higher level in asthma lung samples than in normal lung samples. Therapeutic modulation of this gene, expressed protein and/or use of antibodies or small molecule drugs targeting the gene or gene product would be useful in the treatment of asthma.
B. NOV2, CG189936, Junctional Adhesion Molecule 3
Expression of gene CG189936-02 was assessed using the primer-probe set Ag7545, described in Table BA. Results of the RTQ-PCR runs are shown in Table BB.
Table BA. Probe Name Ag7545
Figure imgf000107_0001
Table BB. General_screening_panel_v1.7
Figure imgf000107_0002
Melanoma* Hs688(A).T 0.1 Colon ca. SW-948 11.6
Melanoma* Hs688(B).T 31.9 Colon ca. SW480 0.1
Melanoma (met) SK-MEL-5 13.5 Colon ca. (SW480 met) SW620 18.3
Testis 7.7 Colon ca. HT29 31.6
Prostate ca. (bone met) PC-3 1.6 Colon ca. HCT-116 28.5
Prostate ca. DU145 26.1 Colon cancer tissue 0.5
Prostate pool 7.3 Colon ca. SW1116 7.8
Uterus pool 1.7 Colon ca. Colo-205 8.2
Ovarian ca. OVCAR-3 9.0 Colon ca. SW-48 15.3
Ovarian ca. (ascites) SK-OV-3 0.7 Colon 10.3
Ovarian ca. OVCAR-4 40.6 Small Intestine 2.1
Ovarian ca. OVCAR-5 9.7 Fetal Heart 8.8
Ovarian ca. IGROV-1 29.5 Heart 2.4
Ovarian ca. OVCAR-8 30.6 Lymph Node Pool 3.8
Ovary 17.3 Lymph Node pool 2 32.3
Breast ca. MCF-7 17.4 Fetal Skeletal Muscle 7.7
Breast ca. MDA-MB-231 100.0 Skeletal Muscle pool 2.8
Breast ca. BT 549 9.4 Skeletal Muscle 12.5
Breast ca. T47D 21.3 Spleen 7.1
113452 mammary gland 6.5 Thymus 6.0
Trachea 17.1 CNS cancer (glio/astro) SF-268 13.1
Lung 20.7 CNS cancer (glio/astro) T98G 18.6
Fetal Lung 21.5 CNS cancer (neuro;met) SK-N-AS 0.8
Lung ca. NCI-N417 8.4 CNS cancer (astro) SF-539 67.4
Lung ca. LX-1 6.7 CNS cancer (astro) SNB-75 36.9
Lung ca. NCI-H146 26.1 CNS cancer (glio) SNB-19 19.5
Lung ca. SHP-77 29.3 CNS cancer (glio) SF-295 11.6
Lung ca. NCI-H23 21.2 Brain (Amygdala) 18.0
Lung ca. NCI-H460 12.5 Brain (Cerebellum) 34.2
Lung ca. HOP-62 46.7 Brain (Fetal) 31.4
Lung ca. NCI-H522 38.7 Brain (Hippocampus) 14.2
Lung ca. DMS-114 19.2 Cerebral Cortex pool 16.8
Liver 4.4 Brain (Substantia nigra) 11.8
Fetal Liver 3.9 Brain (Thalamus) 11.9
Kidney pool 23.2 Brain (Whole) 30.1
Fetal Kidney 13.6 Spinal Cord 6.4
Renal ca. 786-0 31.4 Adrenal Gland 14.4
Renal ca. A498 3.1 Pituitary Gland 11.6
Renal ca. ACHN 7.1 Salivary Gland 9.5
Renal ca. UO-31 18.0 Thyroid 28.5
Renal ca. TK-10 25.3 Pancreatic ca. PANC-1 8.2
Bladder 9.5 Pancreas pool 2.1
General_screening_panel_v1.7 Summary: Ag7545 High to moderate gene expression was seen in tissues with metabolic/endocrine functions including pancreas, adipose, adrenal gland, thyroid, pituitary gland, skeletal muscle, heart and liver. Therapeutic modulation of this gene, expressed protein and/or use of small molecule drugs targeting the gene or gene product are useful in the treatment of endocrine/metabolically related diseases, such as obesity, diabetes, hypercholesterolemia and hypertension.
Elevated gene expression was detected in cancer cell lines derived from breast, pancreas, and CNS cancer. Therefore, the expression of this gene can be used to differentiate cancerous and non-cancerous cells.
C. NOV4, CG190229-02 and CG190229-04: Dihydrolipoamide branched chain transacylase.
Expression of genes CG190229-02 and CG190229-04 was assessed using the primer-probe set Ag7519, described in Table CA. Results of the RTQ-PCR runs are shown in Table CB.
Table CA. Probe Name Ag7519
Figure imgf000109_0001
Table CB. General_screening_panel_v1.7
Figure imgf000109_0002
Figure imgf000110_0001
General_screening_panel_v1.7 Summary: Ag7519 High to moderate gene expression was seen in tissues with metabolic/endocrine functions including pancreas, adipose, adrenal gland, thyroid, pituitary gland, skeletal muscle, heart and liver. Therapeutic modulation of this gene, expressed protein and/or use of small molecule drugs targeting the gene or gene product are useful in the treatment of endocrine/metabolically related diseases, such as obesity and diabetes. Elevated gene expression was detected in ovarian cancer cell lines. Therefore, expression of this gene is useful in differentiating cancerous and non-cancerous ovarian cells.
D. NOV6, CG196732-01: NM_21797 like.
Expression of gene CG196732-01 was assessed using the primer-probe set Ag7937, described in Table DA. Results of the RTQ-PCR runs are shown in Table DB.
Table DA. Probe Name Ag7937
Figure imgf000110_0002
Table DB. AI_comprehensive panel_v1.0
Column A - Rel. Exp.(%) Ag7937, Run 317419930
Tissue Name A Tissue Name A
110967 COPD-F 0.0 112427 Match Control Psoriasis-F 4.7
110980 COPD-F 0.0 112418 Psoriasis-M 0.0
110968 COPD-M 0.0 112723 Match Control Psoriasis-M 0.0
110977 COPD-M 0.0 112419 Psoriasis-M 0.5
Figure imgf000111_0001
Table DC. PGI1.0
Column A - Rel. Exp.(%) Ag7937, Run 370758789
Tissue Name A Tissue Name A
162191 Normal Lung 1 (IBS) 3.0 162185 Emphysema Lung 12 (Ardais) 16.8
Figure imgf000112_0001
AI_comprehensive panel_v1.0 Summary: Ag7937 Moderate gene expression was detected in rheumatoid arthritis bone, cartilage, synovium and synovial fluid samples, while it was not detected in the corresponding normal tissues. Therefore, expression of this gene is useful in distinguishing tissues afflicted with rheumatoid arthritis from normal counterparts.
PGI1.0 Summary: Ag7937 This gene was upregulated in 11 out of 14 lung samples from patients with emphysema and fibrosis disease when compared with normal lung tissues. Therefore, expression of this gene is useful as a marker for lung emphysema and fibrosis. Therapeutic modulation of this gene, expressed protein and/or use of antibodies or small molecule drugs targeting the gene or gene product is useful in the treatment of lung emphysema and fibrosis.
E. NOV7, CG53147-02: CG53147-FLF/SalR.
Expression of gene CG53147-02 was assessed using the primer-probe set Ag7178, described in Table EA. Results of the RTQ-PCR runs are shown in Table EB.
Table EA. Probe Name Ag7178
Figure imgf000112_0002
Table EB. General_screening_panel_v1.7
Column A - Rel. Exp.(%) Ag7178, Run 318040030
Tissue Name A Tissue Name A
Adipose 25.0 Gastric ca. (liver met.) NCI-N87 0.3
HUVEC 0.8 Stomach 1.3
Melanoma* Hs688(A).T 0.0 Colon ca. SW-948 0.2
Melanoma* Hs688(B).T 2.8 Colon ca. SW480 0.1
Melanoma (met) SK-MEL-5 2.9 Colon ca. (SW480 met) SW620 1.3
Testis 22.5 Colon ca. HT29 0.8
Prostate ca. (bone met) PC-3 0.4 Colon ca. HCT-116 2.9
Prostate ca. DU145 0.4 Colon cancer tissue 0.4
Prostate pool 2.1 Colon ca. SW1116 0.4
Uterus pool 1.9 Colon ca. Colo-205 0.5
Ovarian ca. OVCAR-3 0.2 Colon ca. SW-48 0.3
Ovarian ca. (ascites) SK-OV-3 0.2 Colon 4.2
Ovarian ca. OVCAR-4 2.0 Small Intestine 6.8
Ovarian ca. OVCAR-5 4.9 Fetal Heart 4.7
Ovarian ca. IGROV-1 2.9 Heart 4.7
Ovarian ca. OVCAR-8 1.0 Lymph Node Pool 10.9
Ovary 21.8 Lymph Node pool 2 27.2
Breast ca. MCF-7 1.1 Fetal Skeletal Muscle 4.7
Breast ca. MDA-MB-231 3.1 Skeletal Muscle pool 1.0
Breast ca. BT 549 9.0 Skeletal Muscle 3.7
Breast ca. T47D 0.9 Spleen 7.0
113452 mammary gland 10.5 Thymus 6.7
Trachea 100.0 CNS cancer (glio/astro) SF-268 0.4
Lung 46.7 CNS cancer (glio/astro) T98G 0.7
Fetal Lung 88.9 CNS cancer (neuro;met) SK-N-AS 0.2
Lung ca. NCI-N417 0.1 CNS cancer (astro) SF-539 6.7
Lung ca. LX-1 0.2 CNS cancer (astro) SNB-75 10.5
Lung ca. NCI-H146 0.6 CNS cancer (glio) SNB-19 2.0
Lung ca. SHP-77 1.8 CNS cancer (glio) SF-295 0.7
Lung ca. NCI-H23 1.4 Brain (Amygdala) 1.5
Lung ca. NCI-H460 2.7 Brain (Cerebellum) 5.3
Lung ca. HOP-62 3.1 Brain (Fetal) 13.4
Lung ca. NCI-H522 2.3 Brain (Hippocampus) 1.5
Lung ca. DMS-114 0.6 Cerebral Cortex pool 2.2
Liver 24.8 Brain (Substantia nigra) 0.8
Fetal Liver 17.9 Brain (Thalamus) 1.8
Kidney pool 49.7 Brain (Whole) 12.4
Fetal Kidney 5.4 Spinal Cord 1.6
Renal ca. 786-0 26.6 Adrenal Gland 5.9
Renal ca. A498 0.8 Pituitary Gland 9.7
Renal ca. ACHN 0.5 Salivary Gland 79.6
Renal ca. UO-31 0.9 Thyroid 16.6
Renal ca. TK-10 1.4 Pancreatic ca. PANC-1 0.6
Bladder 7.5 Pancreas pool 0.9
General_screening_panel_v1.7 Summary: Ag7178 High to moderate levels of gene expression were detected in tissues with metabolic/endocrine functions including pancreas, adipose, adrenal gland, thyroid, pituitary gland, skeletal muscle, heart and liver. Therapeutic modulation of this gene, expressed protein and/or use of small molecule drugs targeting the gene or gene product are useful in the treatment of endocrine/metabolically related diseases, such as obesity and diabetes.
Gene expression was reduced in renal, lung and colon carcinoma cell lines. Therefore, detecting expression of this gene in biological samples is useful in differentiating between cancerous and non-cancerous cells.
Example D: Gene Expression analysis using CuraChip in human tissues from disease and from equivalent normal tissues
CuraGen has developed a proprietary gene microarray (CuraChip™ 1.2) for target identification. It provides a high-throughput means of global mRNA expression analysis of CuraGen's collection of cDNA sequences representing the Pharmaceutically Tractable Genome (PTG). This sequence set includes genes which can be developed into protein therapeutics, or used to develop antibody or small molecule therapeutics. CuraChip™ 1.2 contains ~11,000 oligos representing approximately 8,500 gene loci, including (but not restricted to) kinases, ion channels, G-protein coupled receptors (GPCRs), nuclear hormone receptors, proteases, transporters, metabolic enzymes, hormones, growth factors, chemokines, cytokines, complement and coagulation factors, and cell surface receptors.
The CuraChip™ cDNAs were represented as 30-mer oligodeoxyribonucleotides (oligos) on a glass microchip. Hybridization methods using the longer CuraChip™ oligos are more specific than methods using 25-mer oligos. CuraChip™ oligos were synthesized with a linker, purified to remove truncated oligos (which can influence hybridization strength and specificity), and spotted on a glass slide. Oligo-dT primers were used to generate cRNA probes for hybridization from samples of interest. A biotin-avidin conjugation system was used to detect hybridized probes with a fluorophore-labeled secondary antibody. Gene expression was analyzed using clustering and correlation bioinformatics tools such as Spotfire® (Spotfire, Inc, Somerville, MA) and statistical tools such as multivariate analysis (MVA).
Threshold for CuraChip™ data analysis
A number of control spots are present on CuraChip™ 1.2 for efficiency calculations and to provide alternative normalization methods. For example, CuraChip™ 1.2 contains a number of empty or negative control spots, as well as positive control spots containing a dilution series of oligos that detect the highly-expressed genes Ubiquitin and glyceraldehyde-3-phosphate dehydrogenase (GAPD). An analysis of spot signal level was performed using raw data from 67 hybridizations using all oligos. The maximum signal intensity for each oligo across all 67 hybridizations was determined, and the fold-over-background for this maximum signal was calculated (i.e. if the background reading is 20 and the raw spot intensity is 100, then the fold-over-background for that spot is 5x). The negative control or empty spots do occasionally "fire" or give a signal over the background level; however, they do not fire very strongly, with 77.1% of empty spots firing <3x over background and 91.7% <5x. The positive control spots (Ubiquitin and GAPD) always fired at >100x background. The experimental oligos (CuraOligos) fired over the entire range of intensities, with some at low fold-over-background intensities. Since the negative control spots do fire occasionally at low levels, we have set a suggested threshold for data analysis at >5x background.
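The fold-over-background calculation and the suggested cutoff can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, each hybridization is assumed to supply its own background reading, and the maximum is taken over the raw spot intensity.

```python
def fold_over_background(raw_intensity, background):
    """Fold-over-background for one spot: raw intensity divided by background."""
    if background <= 0:
        raise ValueError("background must be positive")
    return raw_intensity / background


def max_fold_over_background(readings):
    """Fold-over-background of the strongest signal for one oligo.

    readings is a list of (raw_intensity, background) pairs, one pair per
    hybridization (an assumed data layout, not one given in the text).
    """
    raw, background = max(readings, key=lambda pair: pair[0])
    return fold_over_background(raw, background)


def passes_threshold(readings, threshold=5.0):
    """True if the maximum signal is strictly greater than threshold x background."""
    return max_fold_over_background(readings) > threshold


# Worked case from the text: background 20, raw spot intensity 100.
worked_example = fold_over_background(100, 20)
```

A spot that reaches exactly five times background, as in the worked case, does not pass a strict "greater than" cutoff; whether the boundary value should be included is a choice the text leaves open.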
Expression analysis of NOV5, CG194245-03: Oligonucleotide (optg2_1202184, AATGTATGTGCCTATGACTGAGGACATCTA) (SEQ ID NO: 93) corresponding to CG194245-03 was used to determine specific gene expression on PTG Chip 1.2. Gene expression was detected in breast, lung, liver, pituitary gland, kidney, colon, pancreas, and hippocampus. Elevated expression was detected in ovarian and prostate cancer cell lines (Table DI). Detecting expression levels of this gene is useful in differentiating ovarian and prostate cancer cells from normal cells towards making a diagnosis. Therapeutic modulation of this gene, expressed protein and/or use of antibodies or small molecule drugs targeting the gene or gene product would be useful in the treatment of ovarian and prostate cancer.
Table DI: CG194245-03
G1C4D21B11-39_Alzheimer's disease B4951 13.47
G1C4D21B11-40_Alzheimer's disease B4953 30.27
G1C4D21B11-41_Alzheimer's disease B5018 26.36
G1C4D21B11-43_Alzheimer's disease B5019 0
G1C4D21B11-44_Alzheimer's disease B5086 13.69
G1C4D21B11-51_Alzheimer's disease B5096 3.19
G1C4D21B11-52_Alzheimer's disease B5098 0
G1C4D21B11-54_Alzheimer's disease B5129 10.31
G1C4D21B11-55_Alzheimer's disease B5210 72.91
G1C4D21B11-56_Control B4810 63.01
G1C4D21B11-57_Control B4825 33.04
G1C4D21B11-58_Control B4930 11.31
G1C4D21B11-59_Control B4932 60.38
G1C4D21B11-60_Control B5024 64.02
G1C4D21B11-61_Control B5113 16.94
G1C4D21B11-62_Control B5140 3.5
G1C4D21B11-63_Control B5190 29.25
G1C4D21B11-64_Control B5220 73.54
G1C4D21B11-65_Control B5245 75.3
G1C4D21B11-66_AH3 B3791 53.93
G1C4D21B11-67_AH3 B3855 68.42
G1C4D21B11-68_AH3 B3877 77.39
G1C4D21B11-69_AH3 B3893 31.99
G1C4D21B11-70_AH3 B3894 72.08
G1C4D21B11-71_AH3 B3949 17.04
G1C4D21B11-72_AH3 B4477 40.6
G1C4D21B11-73_AH3 B4540 48.39
G1C4D21B11-74_AH3 B4577 67.65
G1C4D21B11-75_AH3 B4639 18.86
G1C4E19B13-45_Schizophrenia hippocampus 683 276.88
G1C4E19B13-46_Depression hippocampus 487 0
G1C4E19B13-47_Depression hippocampus 600 39.88
G1C4E19B13-48_Normal hippocampus 2407a 35.1
G1C4E19B13-49_Normal hippocampus 1042 0
G1C4E19B13-50_Depression hippocampus 2767 60.08
G1C4E19B13-51_Depression hippocampus 567 1.04
G1C4E19B13-52_Control hippocampus 3175 160.17
G1C4E19B13-53_Depression hippocampus 3096 70.21
G1C4E19B13-54_Depression hippocampus 1491 210.29
G1C4E19B13-55_Depression hippocampus 2540 65.72
G1C4E19B13-56_Schizophrenia hippocampus 2798 0
G1C4E19B13-57_Control hippocampus 1973 177.66
G1C4E19B13-58_Normal hippocampus and amygdala 2601 71.46
G1C4E19B13-59_Schizophrenia hippocampus 2785 0
G1C4E19B13-60_Schizophrenia hippocampus 484 138.94
G1C4E19B13-61_Normal hippocampus 2556 81.93
G1C4E19B13-62_Depression hippocampus 1158 172.63
G1C4E19B13-63_Control hippocampus 552 135.87
G1C4E19B13-64_Schizophrenia hippocampus 1737 0
G1C4E19B13-65_Normal hippocampus 1239 162.72
G1C4E19B13-66_Normal hippocampus 1465 422.13
G1C4E19B13-67_Normal hippocampus 3080 0
G1C4E19B13-68_Normal hippocampus 738 439.81
G1C4E19B13-69_Schizophrenia hippocampus 2586 41.89
G1C4E19B13-70_Normal hippocampus 2551 239.93
G1C4E19B13-71_Depression hippocampus 588 94.66
G1C4E19B13-72_Depression hippocampus 529 259.11
G1C4E19B13-73_Depression hippocampus and dentate gyrus 16.35
G1C4E21B14-41_Schizophrenia amygdala 2586 71.78
G1C4E21B14-42_Normal substantia nigra 234 0
G1C4E21B14-43_Normal substantia nigra 1065 0
G1C4E21B14-44_Normal substantia nigra 3236 0
G1C4E21B14-45_Normal substantia nigra 2551 0
G1C4E21B14-46_Normal substantia nigra 1597 0
G1C4E21B14-47_Control thalamus 552 0
G1C4E21B14-48_Control thalamus 566 0
G1C4E21B14-49_Control thalamus 606 19.4
G1C4E21B14-50_Control thalamus 738 0
G1C4E21B14-51_Control thalamus 1065 37.68
G1C4E21B14-52_Control thalamus 1092 0
G1C4E21B14-53_Control thalamus 1597 0
G1C4E21B14-54_Control thalamus 2253 0
G1C4E21B14-55_Control thalamus 2551 0
G1C4E21B14-56_Depression thalamus 588 0
G1C4E21B14-57_Depression thalamus 600 0
G1C4E21B14-58_Depression thalamus 721 0
G1C4E21B14-59_Depression thalamus 728 11.67
G1C4E21B14-60_Depression thalamus 759 78.09
G1C4E21B14-61_Depression thalamus 881 0
G1C4E21B14-62_Schizophrenia thalamus 477 0
G1C4E21B14-63_Schizophrenia thalamus 532 61.94
G1C4E21B14-64_Schizophrenia thalamus 683 33.46
G1C4E21B14-65_Schizophrenia thalamus 544 4.06
G1C4E21B14-66_Schizophrenia thalamus 1671 0
G1C4E21B14-67_Schizophrenia thalamus 1737 0
G1C4E21B14-68_Schizophrenia thalamus 2464 4.92
G1C4E21B14-69_Schizophrenia thalamus 2586 23.07
G1C4E23B15-1_Depression amygdala 600 0
G1C4E23B15-10_Depression amygdala 759 83.99
G1C4E23B15-11_Depression anterior cingulate 759 49.6
G1C4E23B15-12_Control amygdala 552 92.36
G1C4E23B15-14_Control anterior cingulate 482 82.14
G1C4E23B15-15_Depression anterior cingulate 721 12.23
G1C4E23B15-16_Control amygdala 3175 123.73
G1C4E23B15-17_Depression anterior cingulate 600 5.23
G1C4E23B15-18_Depression anterior cingulate 588 63.36
G1C4E23B15-19_Control anterior cingulate 3175 150.86
G1C4E23B15-2_Control anterior cingulate 606 129.01
G1C4E23B15-20_Depression anterior cingulate 567 15.65
G1C4E23B15-21_Depression amygdala 588 96.35
G1C4E23B15-22_Control anterior cingulate 3080 53.07
G1C4E23B15-23_Control anterior cingulate 2601 74.18
G1C4E23B15-24_Control anterior cingulate 1042 106.02
G1C4E23B15-25_Control anterior cingulate 3236 60.95
G1C4E23B15-26_Control amygdala 1502 64.64
G1C4E23B15-27_Control anterior cingulate 807 22.04
G1C4E23B15-28_Control amygdala 1597 93.76
G1C4E23B15-29_Parkinson's substantia nigra 2842 0
G1C4E23B15-3_Parkinson's substantia nigra 2917 0
G1C4E23B15-4_Schizophrenia amygdala 544 0
G1C4E23B15-5_Schizophrenia amygdala 532 0
G1C4E23B15-7_Depression amygdala 2540 43.56
G1C4E23B15-8_Parkinson's substantia nigra 2899 0
G1C4E23B15-9_Depression anterior cingulate 881 NA
G1C4D21B11-13_Normal Lung 4 71.9
G1C4D21B11-14_Normal Lung 5 307
G1C4D21B11-19_Normal Lung 1 102.66
G1C4D21B11-24_Normal Lung 2 224.58
G1C4D21B11-30_Normal Lung 3 254.25
G1C4E23B15-52_SW1353 resting 1h 17.78
G1C4E23B15-53_SW1353 resting 6h 0.4
G1C4E23B15-54_SW1353 resting 16h 0
G1C4E23B15-55_SW1353 IL-1b (1 ng ) 1h 0
G1C4E23B15-56_SW1353 IL-1b (1 ng/) 6h 0
G1C4E23B15-57_SW1353 IL-1b (1 ng/) 16h 0
G1C4E23B15-58_SW1353 FGF20 (1 ug/) 1h 5.8
G1C4E23B15-59_SW1353 FGF20 (1 ug ) 16h 0
G1C4E23B15-61_SW1353 FGF20 (5 ug ) 1h 2.75
G1C4E23B15-62_SW1353 FGF20 (5 ug ) 6h 2.93
G1C4E23B15-63_SW1353 FGF20 (5 ug/) 16h 9.8
G1C4E23B15-64_SW1353 FGF20 (1 ug/) IL-1b (1 ng/) 6h 2.13
G1C4E23B15-65_SW1353 FGF20 (1 ug/) IL-1b (1 ng/) 16h 5.65
G1C4E23B15-66_SW1353 FGF20 (5 ug/) IL-1b (1 ng/) 1h 0
G1C4E23B15-67_SW1353 FGF20 (5 ug/) IL-1b (1 ng/) 6h 0
G1C4E23B15-69_THP-1 aCD40 (1 ug/) 1h 554.88
G1C4E23B15-70_THP-1 aCD40 (1 ug ) 6h 352.35
G1C4E23B15-71_THP-1 LPS (100 ng/) 1h 525.2
G1C4E23B15-72_THP-1 LPS (100 ng/) 6h 178.12
G1C4E23B15-73_CCD1070SK TNFa (5 ng/) 6h 0
G1C4E23B15-74_CCD1070SK TNFa (5 ng/) 24h 0
G1C4E23B15-75_CCD1070SK IL-1b (1 ng ) 24h 0
G1C4E23B15-76_THP-1 resting 577.23
G1C4E23B15-77_THP-1 aCD40 (1 ug ) 24h 84.38
G1C4E23B15-78_THP-1 LPS (100 ng/) 24h 69.91
G1C4E23B15-79_CCD1070SK IL-1b (1 ng/) 6h 0
G1C4F06B17-28_LC 18hr 0
G1C4F06B17-29_LC-IL-1 18hr 0
G1C4F06B17-34_Astrocyte_IL1B_1hr_a 0
G1C4F06B17-35_Astrocyte_IL1B_6hr_a 0
G1C4F06B17-36_Astrocyte_IL1B_24hr_a 0
G1C4F06B17-40_SHSY 5Y Undifferentiated 109.38
G1C4F06B17-41_SHSY 5Y Differentiated 76.2
G1C4F06B17-5_LC 0hr 0
G1C4F06B17-50_Normal Fetal Kidney 260
G1C4F06B17-52_Normal Liver 712.42
G1C4F06B17-53_Normal Fetal Liver 1197.15
G1C4F06B17-54_Normal Fetal Lung 89.55
G1C4F06B17-55_Normal Salivary Gland 0
G1C4F06B17-56_Normal Fetal Skeletal Muscle 0
G1C4F06B17-58_Normal Thyroid 69.41
G1C4F06B17-59_Normal Trachea 298.6
G1C4F06B17-6_LC-IL-1 0hr 0
G1C4F06B17-60_Heart pool 0
G1C4F06B17-61_Pituitary Pool 1018
G1C4F06B17-62_Spleen Pool 0
G1C4F06B17-63_Stomach Pool 0
G1C4F06B17-64_Testis Pool 196.06
G1C4F06B17-65_Thymus Pool 6.9
G1C4F06B17-66_Small Intestine- 5 donor pool 409.54
G1C4F06B17-67_Lymph node- 5 donor pool 60.88
G1C4F06B17-68_Kidney- 5 donor pool 0
G1C4I11B20-34_Jurkat Resting 69.84
G1C4I11B20-35_Jurkat CD3 (500ng/ml) 6hr A 48.48
G1C4I11B20-36_Jurkat CD3 (500ng/ml) 24hr A 46.83
G1C4I11B20-37_Jurkat CD3 (500ng/ml)+CD28(1 ug/ml) 6hr A 0
G1C4I11B20-38_Jurkat CD3 (500ng/ml)+CD28(1 ug/ml) 24hr A 58.53
G1C4I11B20-55_control (no treatment)_1 hr 5.2
G1C4I11B20-56_10ng/ml IL-1b_1 hr 1.91
G1C4I11B20-57_10ng/ml TNF-a_1 hr 0
G1C4I11B20-58_200uM BzATP_1 hr 0.2
G1C4I11B20-59_control (no treatment)_5 hr 2.27
G1C4I11B20-60_10ng/ml IL-1b_5 hr 6.27
G1C4I11B20-61_10ng/ml TNF-a_5 hr 5.45
G1C4I11B20-62_200uM BzATP_5 hr 5.87
G1C4I11B20-63_control (no treatment)_24 hr 0
G1C4I11B20-64_10ng/ml IL-1b_24 hr 4.97
G1C4I11B20-65_10ng/ml TNF-a_24 hr 4.25
G1C4I11B20-66_200uM BzATP_24 hr 4.27
G1C4I12B21-72_#689 Control Lung 119.06
G1C4I12B21-73_#812 Asthma Lung 96.23
G1C4I12B21-74_#1078 Control Lung 90.07
G1C4D21B11-01_Lung cancer(35C) 70.72
G1C4D21B11-02_Lung NAT(36A) 52.77
G1C4D21B11-03_Lung cancer(35E) 184.16
G1C4D21B11-04_Lung cancer(365) 21.69
G1C4D21B11-05_Lung cancer(368) 145.63
G1C4D21B11-06_Lung cancer(369) 178.71
G1C4D21B11-07_Lung cancer(36E) 45.13
G1C4D21B11-08_Lung NAT(36F) 76.4
G1C4D21B11-09_Lung cancer(370) 33.57
G1C4D21B11-10_Lung cancer(376) 12.94
G1C4D21B11-11_Lung cancer(378) 14.07
G1C4D21B11-12_Lung cancer(37A) 218.75
G1C4D21B11-13_Normal Lung 4 71.9
G1C4D21B11-14_Normal Lung 5 307
G1C4D21B11-16_5.Melanoma 436.7
G1C4D21B11-17_6.Melanoma 23.82
G1C4D21B11-18_Melanoma (19585) 41.92
G1C4D21B11-19_Normal Lung 1 102.66
G1C4D21B11-20_Lung cancer(372) 31.2
G1C4D21B11-21_Lung NAT(35D) 82.85
G1C4D21B11-22_Lung NAT(361) 38.1
G1C4D21B11-23_1.Melanoma 8.84
G1C4D21B11-24_Normal Lung 2 224.58
G1C4D21B11-25_Lung cancer(374) 576.22
G1C4D21B11-26_Lung cancer(36B) 219.96
G1C4D21B11-27_Lung cancer(362) 207.73
G1C4D21B11-28_Lung cancer(358) 229.46
G1C4D21B11-29_2.Melanoma 14.12
G1C4D21B11-30_Normal Lung 3 254.25
G1C4D21B11-31_Lung NAT(375) 172.85
G1C4D21B11-32_Lung cancer(36D) 75.79
G1C4D21B11-33_Lung NAT(363) 41.83
G1C4D21B11-34_Lung cancer(35A) 156.57
G1C4D21B11-35_4.Melanoma 16.91
G1C4E09B12-54_Prostate cancer(B8B) 619.99
G1C4E09B12-55_Prostate cancer(B88) 602.36
G1C4E09B12-56_Prostate NAT(B93) 27.59
G1C4E09B12-57_Prostate cancer(B8C) 194.19
G1C4E09B12-58_Prostate cancer(AD5) 247.18
G1C4E09B12-59_Prostate NAT(AD6) 83.68
G1C4E09B12-60_Prostate cancer(AD7) 149.18
G1C4E09B12-61_Prostate NAT(AD8) 215.48
G1C4E09B12-62_Prostate cancer(ADA) 349.04
G1C4E09B12-63_Prostate NAT(AD9) 112.02
G1C4E09B12-64_Prostate cancer(9E7) 581.27
G1C4E09B12-65_Prostate NAT(A0B) 132.12
G1C4E09B12-66_Prostate cancer(A0A) 59.32
G1C4E09B12-67_Prostate cancer(9E2) 1602.94
G1C4E09B12-68_Pancreatic cancer(9E4) 43.37
G1C4E09B12-69_Pancreatic cancer(9D8) 5.8
G1C4E09B12-70_Pancreatic cancer(9D4) 169.98
G1C4E09B12-71_Pancreatic cancer(9BE) 100.19
G1C4E09B12-73_Pancreatic NAT(ADB) 201.77
G1C4E09B12-74_Pancreatic NAT(ADC) 275.28
G1C4E09B12-76_Pancreatic NAT(ADD) 293.33
G1C4E09B12-77_Pancreatic NAT(AED) 113.89
G1C4E19B13-10_Colon NAT(8B6) 574.41
G1C4E19B13-12_Colon NAT(9F1) 633.27
G1C4E19B13-13_Colon cancer(9F2) 183.56
G1C4E19B13-14_Colon NAT(A1D) 191.94
G1C4E19B13-15_Colon cancer(9DB) 1157.25
G1C4E19B13-16_Colon NAT(A15) 437.74
G1C4E19B13-17_Colon cancer(A14) 592.15
G1C4E19B13-18_Colon NAT(ACB) 658.93
G1C4E19B13-19_Colon cancer(AC0) 1005.69
G1C4E19B13-2_Colon cancer(8A4) 191.48
G1C4E19B13-20_Colon NAT(ACD) 500.75
G1C4E19B13-21_Colon cancer(AC4) 593.18
G1C4E19B13-22_Colon NAT(AC2) 1479.25
G1C4E19B13-23_Colon cancer(AC1) 851.74
G1C4E19B13-24_Colon NAT(ACC) 1578.41
G1C4E19B13-25_Colon cancer(AC3) 876.84
G1C4E19B13-26_Breast cancer(9B7) 544.77
G1C4E19B13-27_Breast NAT(9CF) 29.04
G1C4E19B13-28_Breast cancer(9B6) 61.92
G1C4E19B13-29_Breast cancer(9C7) 27.41
G1C4E19B13-3_Colon cancer(8A6) 102.9
G1C4E19B13-30_Breast NAT(A11) 171.51
G1C4E19B13-31_Breast cancer(A1A) 82.27
G1C4E19B13-32_Breast cancer(9F3) 10.06
G1C4E19B13-33_Breast cancer(9B8) 6.53
G1C4E19B13-34_Breast NAT(9C4) 3956.69
G1C4E19B13-35_Breast cancer(9EF) 908.68
G1C4E19B13-36_Breast cancer(9F0) 65.08
G1C4E19B13-37_Breast cancer(9B4) 7.42
G1C4E19B13-38_Breast cancer(9EC) 30.7
G1C4E19B13-4_Colon cancer(8A7) 787.34
G1C4E19B13-44_Colon cancer(8B7) 619.77
G1C4E19B13-5_Colon cancer(8A9) 940.07
G1C4E19B13-6_Colon cancer(8AB) 221.07
G1C4E19B13-7_Colon cancer(8AC) 686.49
G1C4E19B13-8_Colon NAT(8AD) 482.37
G1C4E19B13-9_Colon cancer(8B5) 749.96
G1C4E21B14-1_Cervical cancer(B08) 927.08
G1C4E21B14-10_Brain cancer(9F8) 0
G1C4E21B14-11_Brain cancer(9C0) 0
G1C4E21B14-12_Brain cancer(9F7) 0
G1C4E21B14-13_Brain cancer(A00) 15.05
G1C4E21B14-14_Brain NAT(A01) 15.19
G1C4E21B14-15_Brain cancer(9DA) 0
G1C4E21B14-16_Brain cancer(9FE) 0
G1C4E21B14-17_Brain cancer(9C6) 0
G1C4E21B14-18_Brain cancer(9F6) 0
G1C4E21B14-2_Cervical NAT(AEB) 0
G1C4E21B14-21_Bladder NAT(23954) 0
G1C4E21B14-22_Urinary cancer(AF6) 660.8
G1C4E21B14-23_Urinary cancer(B0C) 0
G1C4E21B14-24_Urinary cancer(AE4) 9.74
G1C4E21B14-25_Urinary NAT(B20) 0
G1C4E21B14-26_Urinary cancer(AE6) 338.11
G1C4E21B14-27_Urinary NAT(B04) ?
G1C4E21B14-28_Urinary cancer(B07) 0
G1C4E21B14-29_Urinary NAT(AF8) 0
G1C4E21B14-3_Cervical cancer(AFF) 227.75
G1C4E21B14-30_Ovarian cancer(9D7) 3.9
G1C4E21B14-31_Urinary cancer(AF7) 0
G1C4E21B14-32_Ovarian cancer(9F5) 525.31
G1C4E21B14-33_Ovarian cancer(A05) 607.66
G1C4E21B14-34_Ovarian cancer(9BC) 0
G1C4E21B14-35_Ovarian cancer(9C2) 987.3
G1C4E21B14-36_Ovarian cancer(9D9) 0
G1C4E21B14-37_Ovarian NAT(AC7) 0
G1C4E21B14-38_Ovarian NAT(AC9) 0
G1C4E21B14-39_Ovarian NAT(ACA) 0
G1C4E21B14-4_Cervical NAT(B1E) 0
G1C4E21B14-40_Ovarian NAT(AC5) 0
G1C4E21B14-6_Cervical NAT(AFA) 111.34
G1C4E21B14-7_Cervical cancer(B1F) 129.66
G1C4E21B14-8_Cervical NAT(B1C) 0
G1C4E23B15-32_Breast cancer(D34) 162.15
G1C4E23B15-33_Breast cancer(D35) 0
G1C4E23B15-34_Breast cancer(D36) 16.74
G1C4E23B15-35_Breast cancer(D37) 0
G1C4E23B15-36_Breast cancer(D38) 22.58
G1C4E23B15-37_Breast cancer(D39) 67.06
G1C4E23B15-38_Breast cancer(D3A) 32.01
G1C4E23B15-39_Breast cancer(D3B) 442.37
G1C4E23B15-40_Breast cancer(D3C) 16.77
G1C4E23B15-41_Breast cancer(D3D) 30.74
G1C4E23B15-42_Breast cancer(D3E) 3.38
G1C4E23B15-43_Breast cancer(D3F) 72.91
G1C4E23B15-44_Breast cancer(D40) 94.13
G1C4E23B15-45_Breast cancer(D42) 175.21
G1C4E23B15-46_Breast cancer(D43) 18.5
G1C4E23B15-47_Breast cancer(D44) 281.42
G1C4E23B15-48_Breast cancer(D45) 626.87
G1C4E23B15-49_Breast cancer(D46) 389.85
G1C4E30B16-1_2.SK-ES 0
G1C4E30B16-10_40.HLaC-79 0
G1C4E30B16-11_43.H226 0
G1C4E30B16-12_45.HCT-116 159.69
G1C4E30B16-13_53.IGROV-1 374.36
G1C4E30B16-14_59.MX-1 66.85
G1C4E30B16-15_63.C33A 699.14
G1C4E30B16-16_65.Daudi 0
G1C4E30B16-17_71.MV522 463.82
G1C4E30B16-18_76.RWP-2 969.3
G1C4E30B16-19_77.BON 936.95
G1C4E30B16-2_6.MiaPaCa 0
G1C4E30B16-20_82.H82 1185.97
G1C4E30B16-21_86.H69 112.24
G1C4E30B16-22_95.Caki-2 63.95
G1C4E30B16-23_100.LNCaP 1748.11
G1C4E30B16-24 01.A549 233.93
G1C4E30B16-25_1. DU145 1729.21
G1C4E30B16-26_6. OVCAR-3 97.18
G1C4E30B16-27_11. HT-29 839.83
G1C4E30B16-28 3. DLD-2 7.97
G1C4E30B16-29_18. MCF-7 827.79
G1C4E30B16-3_9.H460 450.16
G1C4E30B16-4_15.SW620 322.32
G1C4E30B16-5_20.SK-OV-3 12.52
G1C4E30B16-6_23.MDA-231 0
G1C4E30B16-7_27.Caki-1 0
G1C4E30B16-8_31.PC-3 675.91
G1C4E30B16-9_35.LoVo 752.62
G1C4I11B20-10_Kidney NAT(10B1) 1228.13
G1C4I11B20-11_Kidney cancer(10B2) 116.67
G1C4I11B20-12_Kidney NAT(10B3) 54.8
G1C4I11B20-13_Kidney cancer(10B4) 433.72
G1C4I11B20-14_Kidney NAT(10B5) 1208.14
G1C4I11B20-15_Kidney cancer(10B6) 9.49
G1C4I11B20-16_Kidney NAT(10B7) 3050.54
G1C4I11B20-17_Kidney cancer(10BA) 0
G1C4I11B20-18_Kidney NAT(10BB) 3828.27
G1C4I11B20-19_Kidney cancer(10C0) 1189.64
G1C4I11B20-20_Kidney NAT(10C1) 2203.97
G1C4I11B20-21_Kidney cancer(10C4) 3513.9
G1C4I11B20-22_Kidney NAT(10C5) 156.35
G1C4I11B20-23_Kidney cancer(10A8) 328.39
G1C4I11B20-24_Kidney NAT(10A9) 1214.88
G1C4I11B20-25_Kidney cancer(10AA) 896.39
G1C4I11B20-4_Kidney NAT(10AB) 1785.18
G1C4I11B20-5_Kidney cancer(10AC) 16.2
G1C4I11B20-6_Kidney NAT(10AD) 392.88
G1C4I11B20-7_Kidney cancer(10AE) 216.89
G1C4I11B20-8_Kidney NAT(10AF) 132.72
G1C4I11B20-9_Kidney cancer(10B0) 584.63
G1C4I12B21-66_Ardais Lung 4 20.12
G1C4I12B21-67_Ardais Lung 6 20.51
G1C4I12B21-68_Ardais Lung 7 198.19
G1C4I12B21-69_Ardais Lung 10 79.71
G1C4I12B21-70_4169B1 normal lung 129.49
G1C4I12B21-71_4267B1 normal lung 58.84
G1C4I12B21-72_#689 Control Lung 119.06
G1C4I12B21-73_#812 Asthma Lung 96.23
G1C4I12B21-74_#1078 Control Lung 90.07
G1C4I17B22-10_Lymphoma(9BF) 0
G1C4I17B22-11_Lymphoma(9D2) 0
G1C4I17B22-12_Lymphoma(A04) 157.8
G1C4I17B22-13_Lymphoma(9DD) 7.85
G1C4I17B22-14_Lymphoma(F68) 0
G1C4I17B22-15_Lymphoma(F6A) 0
G1C4I17B22-16_Lymphoma(F6B) 0
G1C4I17B22-17_Lymphoma(F6C) 50
G1C4I17B22-18_Lymphoma(F6D) 0
G1C4I17B22-19_Lymphoma(F6E) 0
G1C4I17B22-20_Lymphoma(F6F) 0
G1C4I17B22-21_Lymphoma(F70) 0
G1C4I17B22-22_Lymphoma(F71) 0
G1C4I17B22-23_Lymphoma(F72) 0
G1C4I17B22-24_Lymphoma(F73) 0
G1C4I17B22-25_Lymphoma(F74) 0
G1C4I17B22-26_Lymphoma NAT(1002) 0
G1C4I17B22-28_Lymphoma NAT(1004) 0
G1C4I17B22-29_Lymphoma NAT(1005) 0
G1C4I17B22-30_Lymphoma NAT(1007) 0
G1C4I17B22-32_Lymphoma NAT(1003) 0
G1C4I17B22-4_Lymphoma(9E3) 0
G1C4I17B22-5_Lymphoma(9D0) 0
G1C4I17B22-6_Lymphoma(9E1) 51.56
G1C4I17B22-7_Lymphoma(A0D) 2.78
G1C4I17B22-8_Lymphoma(9B5) 0
G1C4I17B22-9_Lymphoma(9D3) 219.11
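The tumor versus normal-adjacent-tissue (NAT) comparison drawn from Table DI can be illustrated with a short Python sketch. It parses the ovarian and prostate records of the table (sample identifiers written without the stray spaces of the printed copy) and reports the mean value for each tissue and status. The record pattern, the function name `group_means`, and the use of a plain arithmetic mean are assumptions made for illustration only; the text does not state how expression levels were compared, and no statistical significance is implied.

```python
import re
from collections import defaultdict
from statistics import mean

# Ovarian and prostate records copied from Table DI (identifiers normalized).
RECORDS = """\
G1C4E09B12-54_Prostate cancer(B8B) 619.99
G1C4E09B12-55_Prostate cancer(B88) 602.36
G1C4E09B12-56_Prostate NAT(B93) 27.59
G1C4E09B12-57_Prostate cancer(B8C) 194.19
G1C4E09B12-58_Prostate cancer(AD5) 247.18
G1C4E09B12-59_Prostate NAT(AD6) 83.68
G1C4E09B12-60_Prostate cancer(AD7) 149.18
G1C4E09B12-61_Prostate NAT(AD8) 215.48
G1C4E09B12-62_Prostate cancer(ADA) 349.04
G1C4E09B12-63_Prostate NAT(AD9) 112.02
G1C4E09B12-64_Prostate cancer(9E7) 581.27
G1C4E09B12-65_Prostate NAT(A0B) 132.12
G1C4E09B12-66_Prostate cancer(A0A) 59.32
G1C4E09B12-67_Prostate cancer(9E2) 1602.94
G1C4E21B14-30_Ovarian cancer(9D7) 3.9
G1C4E21B14-32_Ovarian cancer(9F5) 525.31
G1C4E21B14-33_Ovarian cancer(A05) 607.66
G1C4E21B14-34_Ovarian cancer(9BC) 0
G1C4E21B14-35_Ovarian cancer(9C2) 987.3
G1C4E21B14-36_Ovarian cancer(9D9) 0
G1C4E21B14-37_Ovarian NAT(AC7) 0
G1C4E21B14-38_Ovarian NAT(AC9) 0
G1C4E21B14-39_Ovarian NAT(ACA) 0
G1C4E21B14-40_Ovarian NAT(AC5) 0
"""

# Assumed record layout: <sample id>_<tissue> <cancer|NAT>(<code>) <value>
LINE_RE = re.compile(
    r"^(?P<sample>[^_]+)_(?P<tissue>\w+) (?P<status>cancer|NAT)"
    r"\((?P<code>\w+)\)\s+(?P<value>\d+(?:\.\d+)?)$"
)


def group_means(text):
    """Return {(tissue, status): mean value} for every parsable record."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip records that do not follow the assumed layout
        key = (match["tissue"], match["status"])
        groups[key].append(float(match["value"]))
    return {key: mean(values) for key, values in groups.items()}


MEANS = group_means(RECORDS)

if __name__ == "__main__":
    for (tissue, status), value in sorted(MEANS.items()):
        print(f"{tissue:10s} {status:7s} mean = {value:.2f}")
```

A per-group mean is only a descriptive summary; individual tumor samples in the table range from 0 to well above the NAT values, so any diagnostic use would need per-sample evaluation.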
Example E: NOV2a, CG136984, T-cell Costimulation
T cells require two signals to become fully activated. Signal one is delivered by the TCR (and can be mimicked by mAb against the TCR or CD3 complex), whereas signal two can be delivered through other cell surface proteins/receptors. The act of stimulating through the second signal in the presence of signal one is termed costimulation. CG136984-Fc, a costimulator, consists of the mature extracellular domain of CG136984-02 (NOV2a, SEQ ID NO: 20), spanning amino acids 31 through 236 (SEQ ID NOs: 67 and 68 below), fused to an Ig domain (Molecular Biotechnology 21:259 (2002)).
>CG136984-02_mat_ecd, SEQ ID NO: 67
GCTGTAAATCTCAAATCCAGCAATCGAACCCCAGTGGTACAGGAATTTGAAAGTGTGGAACTGTCTTGCA TCATTACGGATTCGCAGACAAGTGACCCCAGGATCGAGTGGAAGAAAATTCAAGATGAACAAACCACATA TGTGTTTTTTGACAACAAAATTCAGGGAGACTTGGCGGGTCGTGCAGAAATACTGGGGAAGACATCCCTG AAGATCTGGAATGTGACACGGAGAGACTCAGCCCTTTATCGCTGTGAGGTCGTTGCTCGAAATGACCGCA AGGAAATTGATGAGATTGTGATCGAGTTAACTGTGCAAGTGAAGCCAGTGACCCCTGTCTGTAGAGTGCC GAAGGCTGTACCAGTAGGCAAGATGGCAACACTGCACTGCCAGGAGAGTGAGGGCCACCCCCGGCCTCAC TACAGCTGGTATCGCAATGATGTACCACTGCCCACGGATTCCAGAGCCAATCCCAGATTTCGCAATTCTT CTTTCCACTTAAACTCTGAAACAGGCACTTTGGTGTTCACTGCTGTTCACAAGGACGACTCTGGGCAGTA CTACTGCATTGCTTCCAATGACGCAGGCTCAGCCAGGTGTGAGGAGCAGGAGATGGAA
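The nucleotide sequence of SEQ ID NO: 67 above encodes the amino acid sequence listed as SEQ ID NO: 68. A minimal Python sketch for checking that correspondence by translating in frame 1 with the standard genetic code is given here. The function name `translate` and the table construction are illustrative choices and are not part of the disclosure; only the first 30 and last 9 nucleotides of SEQ ID NO: 67 are used in the example.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1; whitespace is ignored."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First 30 and last 9 nucleotides of SEQ ID NO: 67 as printed above.
SEQ67_START = "GCTGTAAATCTCAAATCCAGCAATCGAACC"
SEQ67_END = "GAGATGGAA"

if __name__ == "__main__":
    print(translate(SEQ67_START))  # N-terminal residues of the mature domain
    print(translate(SEQ67_END))    # C-terminal residues of the mature domain
```

Applied to all 618 nucleotides, the same function yields 206 residues, which matches the stated span of amino acids 31 through 236.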
>CG136984-02_mat_ecd, SEQ ID NO: 68
AVNLKSSNRTPVVQEFESVELSCIITDSQTSDPRIEWKKIQDEQTTYVFFDNKIQGDLAGRAEILGKTSL KIWNVTRRDSALYRCEVVARNDRKEIDEIVIELTVQVKPVTPVCRVPKAVPVGKMATLHCQESEGHPRPH YSWYRNDVPLPTDSRANPRFRNSSFHLNSETGTLVFTAVHKDDSGQYYCIASNDAGSARCEEQEME
T cells were isolated using RosetteSep. Briefly, whole blood was mixed with RosetteSep cocktail (50 ul cocktail per 1 ml of blood) and incubated for 20 min at room temperature. Samples were diluted with an equal volume of PBS and spun over Lymphoprep (20 min, brake off). The enriched T cell fraction was collected from the Lymphoprep:plasma interface. T cells were washed twice at 1300 rpm with PBS (without Mg and Ca) and re-suspended in RPMI 1640 with 10% FBS + additives (HEPES, 2-mercaptoethanol, glutamine, penicillin, and streptomycin) at 10^6 cells/ml. Falcon 96-well plates were pre-coated overnight at 4°C with anti-CD3 mAb (clone UCHT1, BD catalog # 55329) at 0.25 μg/ml in 100 μl/well, then washed and coated at 37°C for 4 h with CG136984-Fc at 100 ug/ml in 100 μl/well. Wells were washed with 200 ul PBS and T cells were added at 200 ul per well. After 3 days of culture, cytokine concentration was measured by ELISA. The data shown in Figure E1 indicate that CG136984 is involved in activation of T lymphocytes. In the presence of anti-CD3 mAb, CG136984 significantly increases production of IL-8 and TNF-alpha. These data suggest that CG136984 protein can be used to boost immune response against various cancers and viral infections. Monoclonal antibodies against CG136984 can be effective for treatment of diseases associated with increased levels of proinflammatory cytokines, including but not limited to COPD, psoriasis, osteoarthritis and rheumatoid arthritis. Such mAb should inhibit proinflammatory cytokine production by blocking the interaction between CG136984 and its putative receptor. In addition, the presence of an intracellular tail (~40 aa) in CG136984 suggests that it can deliver an intracellular signal. Thus agonistic mAb against CG136984 are useful for altering the function of any cell type expressing CG136984 on its surface.
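The per-well quantities implied by the plating protocol above can be worked out with a few lines of Python. This is a sketch under stated reading assumptions: the cell suspension density is taken as 10^6 cells/ml, the coating volume as 100 microliters per well, and the anti-CD3 concentration as 0.25 micrograms/ml. The function names are hypothetical and chosen for illustration only.

```python
def cells_per_well(density_per_ml, volume_ul):
    """Number of cells delivered to one well."""
    return density_per_ml * volume_ul / 1000.0


def mass_per_well_ug(conc_ug_per_ml, volume_ul):
    """Mass of coating reagent (in micrograms) added to one well."""
    return conc_ug_per_ml * volume_ul / 1000.0


def cocktail_volume_ul(blood_ml, ul_per_ml=50.0):
    """Volume of RosetteSep cocktail for a given volume of whole blood."""
    return blood_ml * ul_per_ml


# Assumed readings of the protocol values (see the lead-in above).
T_CELLS = cells_per_well(1e6, 200)             # cells in 200 ul at 10^6 cells/ml
ANTI_CD3_UG = mass_per_well_ug(0.25, 100)      # anti-CD3 mAb per well
CG136984_FC_UG = mass_per_well_ug(100.0, 100)  # CG136984-Fc per well

if __name__ == "__main__":
    print(f"T cells per well:     {T_CELLS:.0f}")
    print(f"anti-CD3 per well:    {ANTI_CD3_UG * 1000:.0f} ng")
    print(f"CG136984-Fc per well: {CG136984_FC_UG:.0f} ug")
    print(f"cocktail for 10 ml:   {cocktail_volume_ul(10):.0f} ul")
```

Under these readings each well receives 2x10^5 T cells, which agrees with the cell number given in the legend of Figure E1.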
Figure E1. CG136984-Fc augments IL-8, TNF-α and IFN-γ production by T cells. Purified human T cells (2x10^5 cells/well) were cultured in 96-well flat-bottom plates pre-coated with anti-CD3 mAb (250 pg/ml) and CG53449-Fc fusion protein (10 ng/ml). Concentrations of IL-8, TNF-alpha and IFN-gamma were measured at 72 h by ELISA.
Example F: Antibody recognizing NOV 6 CG196732
Immunogenic peptides were derived from the sequence of CG196732 by comparing SEQ ID NO: 46 to the two closest X-ray crystal structure sequences (human chitotriosidase, PDB code: 1LG1; Fusetti et al., 2002, and murine Ym1 lectin, PDB code: 1E9L; Sun et al., 2001) and considering flexibility, surface accessibility, and hydrophilicity. Several potential immunogenic peptides were identified: Peptide 1 QYRPGLGRFM (aa 33-42) SEQ ID NO: 94; Peptide 2 YPTDTGSN (aa 231-238) SEQ ID NO: 95; Peptide 3 QINKPRL (aa 172-178) SEQ ID NO: 96; Peptide 4 GSRGSPPQ (aa 143-160) SEQ ID NO: 97; and Peptide 5 LASSSDT (aa 273-279) SEQ ID NO: 98. Immunogens based on several of these peptides were synthesized and used to generate polyclonal antibody in rabbits.
Peptide 1: CQYRPGLGRFMPD (13 aa) SEQ ID NO: 99 (underlined sequence corresponds to amino acid residues 33-44 of CG196732-02, SEQ ID NO: 46).
Peptide 3: CEAFEQEAKQINKPRL (16 aa) SEQ ID NO: 100 (underlined sequence corresponds to amino acid residues 164-178 of CG196732-02, SEQ ID NO: 46).
Peptide 4: CGSRGSPPQDK (11 aa) SEQ ID NO: 101 (underlined sequence corresponds to amino acid residues 143-162 of CG196732-02, SEQ ID NO: 46).
The immunogenic peptide may further include additional amino acid residues from SEQ ID NO 46 from either N terminal or C terminal direction of the designated immunogenic sequence. Additional residues or domains may be added to stabilize conformation, increase immunogenicity or the like.
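The peptide selection described above considers hydrophilicity among other properties. As an illustration only, the following Python sketch scores the five candidate peptides with the Kyte-Doolittle hydropathy scale (more negative means more hydrophilic) and checks that each synthesized immunogen contains its candidate peptide and has the stated length. The choice of the Kyte-Doolittle scale and the function name `mean_hydropathy` are assumptions; the text does not say which scale or software was used.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Candidate peptides (SEQ ID NOs: 94-98) as listed in the text.
CANDIDATES = {
    1: "QYRPGLGRFM",
    2: "YPTDTGSN",
    3: "QINKPRL",
    4: "GSRGSPPQ",
    5: "LASSSDT",
}

# Synthesized immunogens (SEQ ID NOs: 99-101) with their stated lengths.
IMMUNOGENS = {
    1: ("CQYRPGLGRFMPD", 13),
    3: ("CEAFEQEAKQINKPRL", 16),
    4: ("CGSRGSPPQDK", 11),
}


def mean_hydropathy(peptide):
    """Average Kyte-Doolittle value over all residues of the peptide."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


SCORES = {num: mean_hydropathy(seq) for num, seq in CANDIDATES.items()}

if __name__ == "__main__":
    for num, seq in CANDIDATES.items():
        print(f"Peptide {num} {seq:12s} mean hydropathy {SCORES[num]:+.2f}")
    for num, (seq, length) in IMMUNOGENS.items():
        ok = CANDIDATES[num] in seq and len(seq) == length
        print(f"Immunogen {num} {seq:18s} consistent: {ok}")
```

All five candidates score below zero on this scale, consistent with a preference for hydrophilic, surface-exposed segments.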
Rabbits were immunized with the immunogen emulsified in complete Freund's adjuvant and injected subcutaneously, intraperitoneally or intramuscularly in an amount from 60-1000 micrograms. The immunized rabbits were then boosted 10 to 12 days later with additional immunogen emulsified in the selected adjuvant. Thereafter, for several weeks, the rabbits were boosted with additional immunization injections. Serum samples were periodically obtained from the rabbits by bleeding of the ear for testing in ELISA assays to detect circulating antibodies recognizing the immunogen. Antigen recognition by the polyclonal sera can also be confirmed by Western blot analysis.
Immunogenic peptide at a concentration of approximately 5 ug/ml in coating buffer (0.1 M carbonate, pH 9.5) was plated at 50 ul/well onto a 96-well high protein binding ELISA plate (Corning Costar #3690) and incubated overnight at 4°C. The plate was washed 5 times with 200-300 ul of 0.5% Tween-20 in PBS. The plate was blocked with 200 ul of assay diluent (Pharmingen #26411E) for at least 1 hr at room temperature. The plate was further washed 5 times as previously described. Polyclonal antibodies to be tested were diluted in assay diluent. 50 ul of each antibody dilution was added to the wells and incubated at room temperature for 2 hr. The plate was washed and 50 ul of secondary antibody (anti-rabbit conjugate) was added to each well and incubated for 1 hr at room temperature. The plate was washed and the assay was developed with 100 ul of TMB substrate solution/well (1:1 ratio of solution A+B, Pharmingen #2642KK). The reaction was stopped with 50 ul of sulfuric acid and the plate was read at 450 nm with a correction at 550 nm. The results represent a relative measurement of the quantity of peptide-specific antibody present in the serum samples. The titer of the serum was determined to be that dilution which gives a reading of approximately 0.1 O.D. above background.
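The titer definition above (the dilution giving a reading of approximately 0.1 O.D. above background) can be sketched in Python as follows. The readings in the example are hypothetical and are not data from this disclosure; the log-linear interpolation between the two bracketing dilutions and the function names are likewise assumptions made for illustration.

```python
import math


def corrected_od(a450, a550):
    """Reading at 450 nm with the 550 nm correction subtracted."""
    return a450 - a550


def endpoint_titer(dilutions, ods, background, cutoff=0.1):
    """Return the reciprocal dilution at which the corrected O.D. falls to
    background + cutoff, interpolating on a log10 dilution axis.
    Returns None when the readings never cross the threshold."""
    threshold = background + cutoff
    pairs = sorted(zip(dilutions, ods))  # ascending reciprocal dilution
    for (d_lo, od_lo), (d_hi, od_hi) in zip(pairs, pairs[1:]):
        if od_lo >= threshold > od_hi:
            frac = (od_lo - threshold) / (od_lo - od_hi)
            log_titer = math.log10(d_lo) + frac * (
                math.log10(d_hi) - math.log10(d_lo)
            )
            return 10 ** log_titer
    return None


# Hypothetical serum dilution series (reciprocal dilutions) and readings.
EXAMPLE_DILUTIONS = [100, 1000, 10000, 100000]
EXAMPLE_ODS = [1.8, 0.9, 0.4, 0.1]
EXAMPLE_TITER = endpoint_titer(EXAMPLE_DILUTIONS, EXAMPLE_ODS, background=0.05)

if __name__ == "__main__":
    print(f"Estimated titer: 1:{EXAMPLE_TITER:.0f}")
```

A simpler alternative is to report the highest tested dilution whose reading is still at or above the threshold; interpolation merely gives a value between the two bracketing dilutions.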
Table F1 : Rabbit antisera titers
Figure imgf000126_0001
Thus while having illustrated and described the preferred embodiments of the invention, it should be understood that this invention is capable of variation and modification, and therefore is not limited to the precise terms set forth, but includes such changes and alterations which may be made for adapting the invention to various usages and conditions. Thus, such variations and modifications are properly intended to be within the full range of equivalents, and therefore within the purview of the following claims.
Having thus described the invention and the manner and a process of making and using it in such full, clear, concise and exact terms so as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same: What is claimed is:

Claims

1. An isolated polypeptide comprising the mature form of an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34.
2. A composition comprising the polypeptide of claim 1 and a carrier.
3. A kit comprising, in one or more containers, the composition of claim 2.
4. A method for determining the presence or amount of the polypeptide of claim 1 in a sample, the method comprising:
(a) providing the sample;
(b) introducing the sample to an antibody that binds immunospecifically to the polypeptide; and
(c) determining the presence or amount of the antibody bound to the polypeptide, wherein the presence or amount of the antibody indicates the presence or amount of the polypeptide in the sample.
5. A method for determining the presence of or predisposition to a disease associated with altered levels of expression of the polypeptide of claim 1 in a first mammalian subject, the method comprising:
(a) measuring the level of expression of the polypeptide in a sample from the first mammalian subject; and
(b) comparing the expression of the polypeptide in the sample of step (a) to the expression of the polypeptide present in a control sample from a second mammalian subject known not to have, or not to be predisposed to the disease, wherein an alteration in the level of expression of the polypeptide in the first subject as compared to the control sample indicates the presence of or predisposition to the disease.
6. A method of identifying an agent that binds to the polypeptide of claim 1, the method comprising:
(a) introducing the polypeptide to the agent; and
(b) determining whether the agent binds to the polypeptide.
7. The method of claim 6 wherein the agent is a cellular receptor or a downstream effector.
8. A method for identifying a potential therapeutic agent for use in treatment of a pathology, wherein the pathology is related to aberrant expression or aberrant physiological interactions of the polypeptide of claim 1, the method comprising:
(a) providing a cell expressing the polypeptide of claim 1 and having a property or function ascribable to the polypeptide;
(b) contacting the cell with a composition comprising a candidate substance; and
(c) determining whether the substance alters the property or function ascribable to the polypeptide; whereby, if an alteration observed in the presence of the substance is not observed when the cell is contacted with a composition in the absence of the substance, the substance is identified as a potential therapeutic agent.
9. A method for screening for a modulator of activity of or of latency or predisposition to a pathology associated with the polypeptide of claim 1, the method comprising:
(a) administering a test compound to a test animal at increased risk for a pathology associated with the polypeptide of claim 1, wherein the test animal recombinantly expresses the polypeptide of claim 1;
(b) measuring the activity of the polypeptide in the test animal after administering the compound of step (a); and
(c) comparing the activity of the polypeptide in the test animal with the activity of the polypeptide in a control animal not administered the compound, wherein a change in the activity of the polypeptide in the test animal relative to the control animal indicates the test compound is a modulator of activity of or latency or predisposition to, a pathology associated with the polypeptide of claim 1.
10. The method of claim 9, wherein said test animal is a recombinant test animal that expresses the polypeptide as a transgene or expresses the transgene under the control of a promoter at an increased level relative to a wild-type test animal, and wherein the promoter is not the native gene promoter of the transgene.
11. An antibody that immunospecifically binds to the polypeptide of claim 1.
12. The antibody of claim 11, wherein the antibody is a human monoclonal antibody.
13. A method of producing the polypeptide of claim 1, the method comprising culturing a cell under conditions that lead to expression of the polypeptide, wherein said cell comprises a vector comprising an isolated nucleic acid molecule comprising a nucleic acid sequence selected from the group consisting of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34.
14. The method of claim 13 wherein the cell is chosen from the group comprising a bacterial cell, an insect cell, a yeast cell and a mammalian cell.
15. An isolated polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34.
16. A method of treating a pathological state in a mammal, the method comprising administering to the mammal a polypeptide in an amount that is sufficient to alleviate the pathological state, wherein the polypeptide comprises the amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34, or a biologically active fragment thereof.
17. An isolated nucleic acid molecule comprising a nucleic acid sequence selected from the group consisting of SEQ ID NO:2n-1, wherein n is an integer between 1 and 34.
18. An isolated nucleic acid molecule encoding the mature form of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 34.
19. A vector comprising the nucleic acid molecule of claim 18.
20. A cell comprising the vector of claim 19.
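The claims identify sequences by the index formulas SEQ ID NO:2n (polypeptides) and SEQ ID NO:2n-1 (nucleic acids), with n between 1 and 34. A short Python sketch that expands these formulas, assuming the range is inclusive at both ends, is given below for illustration; the variable and function names are hypothetical.

```python
N_MIN, N_MAX = 1, 34  # range of n recited in the claims, assumed inclusive

# SEQ ID NO:2n identifies polypeptides; SEQ ID NO:2n-1 identifies nucleic acids.
POLYPEPTIDE_IDS = [2 * n for n in range(N_MIN, N_MAX + 1)]
NUCLEIC_ACID_IDS = [2 * n - 1 for n in range(N_MIN, N_MAX + 1)]


def encoding_nucleic_acid(polypeptide_id):
    """SEQ ID NO of the nucleic acid paired with a polypeptide SEQ ID NO."""
    if polypeptide_id not in POLYPEPTIDE_IDS:
        raise ValueError(f"SEQ ID NO:{polypeptide_id} is not of the form 2n")
    return polypeptide_id - 1


if __name__ == "__main__":
    print("Polypeptides: ", POLYPEPTIDE_IDS[0], "...", POLYPEPTIDE_IDS[-1])
    print("Nucleic acids:", NUCLEIC_ACID_IDS[0], "...", NUCLEIC_ACID_IDS[-1])
    print("SEQ ID NO:68 pairs with SEQ ID NO:", encoding_nucleic_acid(68))
```

The expansion gives even SEQ ID NOs 2 through 68 for polypeptides and odd SEQ ID NOs 1 through 67 for nucleic acids, in line with the use of SEQ ID NOs: 67 and 68 for the CG136984 extracellular domain in Example E.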
PCT/US2003/031817 2001-10-05 2003-10-07 Therapeutic polypeptides, nucleic acids encoding same, and methods of use Ceased WO2004055158A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2003282764A AU2003282764A1 (en) 2002-10-07 2003-10-07 Therapeutic polypeptides, nucleic acids encoding same, and methods of use

Applications Claiming Priority (13)

Application Number Priority Date Filing Date Title
US09/972211 2001-10-05
US41666202P 2002-10-07 2002-10-07
US60/416,662 2002-10-07
US41880802P 2002-10-16 2002-10-16
US60/418,808 2002-10-16
US42096802P 2002-10-24 2002-10-24
US60/420,968 2002-10-24
US42115502P 2002-10-25 2002-10-25
US60/421,155 2002-10-25
US42169802P 2002-10-28 2002-10-28
US60/421,698 2002-10-28
US42379502P 2002-11-05 2002-11-05
US60/423,795 2002-11-05

Publications (2)

Publication Number Publication Date
WO2004055158A2 true WO2004055158A2 (en) 2004-07-01
WO2004055158A3 WO2004055158A3 (en) 2005-08-04

Family

ID=32601216

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2003/031817 Ceased WO2004055158A2 (en) 2001-10-05 2003-10-07 Therapeutic polypeptides, nucleic acids encoding same, and methods of use

Country Status (2)

Country Link
AU (1) AU2003282764A1 (en)
WO (1) WO2004055158A2 (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ATE368735T1 (en) * 2000-12-18 2007-08-15 Astellas Pharma Inc NEW AGGRECANASE
JP2003144154A (en) * 2001-11-14 2003-05-20 Mitsubishi Pharma Corp Novel ADAMTS family polypeptide and gene encoding the same
JP2005515779A (en) * 2002-01-31 2005-06-02 ワイス Aggrecanase molecule

Also Published As

Publication number Publication date
AU2003282764A1 (en) 2004-07-09
AU2003282764A8 (en) 2004-07-09
WO2004055158A3 (en) 2005-08-04

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP