WO2010005303A2 - New indicators of human longevity and biological ageing rate - Google Patents

Info

Publication number
WO2010005303A2
Authority
WO
WIPO (PCT)
Prior art keywords
longevity
nucleotide sequence
seq
substance
snps
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/NL2009/050409
Other languages
French (fr)
Other versions
WO2010005303A3 (en)
Inventor
Pieternella Slagboom
Rudolf Gerardus Johannes Westendorp
Jeanne Jacobine Houwing-Duistermaat
Bastiaan Theodoor Heijmans
Marian Beekman
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Leids Universitair Medisch Centrum LUMC
Publiekrechtelijke Rechtspersoon Academisch Ziekenhuis Leiden HODN
Original Assignee
Leids Universitair Medisch Centrum LUMC
Publiekrechtelijke Rechtspersoon Academisch Ziekenhuis Leiden HODN
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Leids Universitair Medisch Centrum LUMC, Publiekrechtelijke Rechtspersoon Academisch Ziekenhuis Leiden HODN

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6881 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for tissue or cell typing, e.g. human leukocyte antigen [HLA] probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C12Q2600/136 Screening for pharmacological compounds
    • C12Q2600/156 Polymorphic or mutational markers
    • C12Q2600/158 Expression markers

Definitions

  • the present invention relates to the field of molecular human genetics and epidemiology.
  • the invention relates to genetic and biochemical markers of longevity.
  • the GWAS was followed by replication analysis in additional cohorts using cross-sectional and prospective data.
  • One locus influenced the probability to survive into old age by affecting both the risk of cancer and cardiovascular mortality and was related to transcription of nearby genes.
  • hybridisation refers to the binding of two single stranded nucleic acids via complementary base pairing.
  • hybridizing specifically to refers to the binding, duplexing, or hybridizing of a nucleic acid molecule preferentially to a particular nucleotide sequence under stringent conditions.
  • stringent conditions refers to conditions under which a probe will hybridize preferentially to its target subsequence, and to a lesser extent to, or not at all to, other sequences in a mixed population (e.g., a cell lysate or DNA preparation from a tissue biopsy).
  • "stringent hybridization" and "stringent hybridization wash conditions" in the context of nucleic acid hybridization are sequence dependent, and are different under different environmental parameters.
  • Very stringent conditions are selected to be equal to the Tm for a particular probe.
  • An example of stringent hybridization conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on an array or on a filter in a Southern or northern blot is 42 °C using standard hybridization solutions (see, e.g., Sambrook and Russell (2001) Molecular Cloning: A Laboratory Manual (3rd ed.) Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor Press, NY, and detailed discussion, below), with the hybridization being carried out overnight.
  • An example of highly stringent wash conditions is 0.15 M NaCl at 72 °C for about 15 minutes.
  • An example of stringent wash conditions is a 0.2 x SSC wash at 65 °C for 15 minutes (see, e.g., Sambrook supra, for a description of SSC buffer). Often, a high stringency wash is preceded by a low stringency wash to remove background probe signal.
  • An example of a medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is 1 x SSC at 45 °C for 15 minutes.
  • An example of a low stringency wash for a duplex of, e.g., more than 100 nucleotides is 4 x to 6 x SSC at 40 °C for 15 minutes.
  • "nucleic acid" or "nucleic acid molecule" as used herein refers to a deoxyribonucleotide or ribonucleotide in either single- or double-stranded form.
  • the term encompasses nucleic acids containing known analogues of natural nucleotides which have similar or improved binding properties, for the purposes desired, as the reference nucleic acid.
  • the term also includes nucleic acids which are metabolized in a manner similar to naturally occurring nucleotides or at rates that are improved for the purposes desired.
  • the term also encompasses nucleic-acid-like structures with synthetic backbones.
  • DNA backbone analogues provided by the invention include phosphodiester, phosphorothioate, phosphorodithioate, methylphosphonate, phosphoramidate, alkyl phosphotriester, sulfamate, 3'-thioacetal, methylene(methylimino), 3'-N-carbamate, morpholino carbamate, and peptide nucleic acids (PNAs); see Oligonucleotides and Analogues, a Practical Approach, edited by F. Eckstein, IRL Press at Oxford University Press (1991); Antisense Strategies, Annals of the New York Academy of Sciences, Volume 600, Eds. Baserga and Denhardt (NYAS 1992); Milligan (1993) J. Med. Chem.
  • PNAs contain non-ionic backbones, such as N-(2-aminoethyl)glycine units. Phosphorothioate linkages are described in WO 97/03211; WO 96/39154; Mata (1997) Toxicol. Appl. Pharmacol. 144: 189-197. Other synthetic backbones encompassed by the term include methyl-phosphonate linkages or alternating methylphosphonate and phosphodiester linkages (Strauss-Soukup (1997) Biochemistry 36: 8692-8698), and benzylphosphonate linkages (Samstag (1996) Antisense Nucleic Acid Drug Dev 6: 153-156).
  • When a nucleic acid molecule or sequence of the invention is in single stranded form, the opposite, i.e. complementary, strand of the nucleic acid molecule or sequence is expressly included within the scope of the invention.
  • An isolated nucleic acid means an object species of the invention that is the predominant species present (i.e., on a molar basis it is more abundant than any other individual species in the composition).
  • an isolated nucleic acid comprises at least about 50, 80 or 90 percent (on a molar basis) of all macromolecular species present.
  • the object species is purified to essential homogeneity (contaminant species cannot be detected in the composition by conventional detection methods).
  • array refers to an arrangement, on a substrate surface, of multiple nucleic acid molecules of predetermined identity, of which preferably the sequences are known. Each nucleic acid molecule is immobilized to a "discrete spot" (i.e., a defined location or assigned position) on the substrate surface.
  • micro-array more specifically refers to an array that is miniaturized so as to require microscopic examination for visual evaluation.
  • the arrays used in the methods of the invention are preferably microarrays.
  • the nucleic acid array as used herein is a plurality of target elements, each target element comprising one or more nucleic acid molecules (probes) immobilized on one or more solid surfaces to which sample nucleic acids can be hybridized.
  • the nucleic acids of a probe can contain sequence(s) from specific genes or clones, e.g. from specific genomic regions described in Tables herein. Other probes may contain, for instance, reference sequences.
  • the probes of the arrays may be arranged on the solid surface at different densities. The probe densities will depend upon a number of factors, such as the nature of the label, the solid support, and the like. One of skill will recognize that each probe may comprise a mixture of nucleic acids of different lengths and sequences.
  • a probe may contain more than one copy of a cloned piece of DNA or RNA, and each copy may be broken into fragments of different lengths.
  • the length and complexity of the nucleic acid fixed onto the target element is not critical to the invention. One of skill can adjust these factors to provide optimum hybridization and signal production for a given hybridization procedure, and to provide the required resolution among different genes or genomic locations.
  • "probe" or "nucleic acid probe", as used herein, is defined to be one or more nucleic acid fragments whose specific hybridization to a sample can be detected.
  • the probe may be unlabelled or labelled as described below so that its binding to the target or sample can be detected.
  • the probe is produced from a source of nucleic acids from one or more particular (preselected) portions of a chromosome, e.g., one or more clones, an isolated whole chromosome or chromosome fragment, or a collection of polymerase chain reaction (PCR) amplification products.
  • the probes of the present invention are produced from nucleic acids found in the regions described herein.
  • the probe may also be isolated nucleic acids immobilized on a solid surface (e.g., nitrocellulose, glass, quartz, fused silica slides), as in an array.
  • the probe may be a member of an array of nucleic acids as described, for instance, in WO 96/17958.
  • Techniques capable of producing high density arrays can also be used for this purpose (see, e.g., Fodor (1991) Science 767-773; Johnston (1998) Curr. Biol. 8: R171-R174; Schummer (1997) Biotechniques 23: 1087-1092; Kern (1997) Biotechniques 23: 120-124; U.S. Pat. No. 5,143,854).
  • primer refers to a single-stranded oligonucleotide capable of acting as a point of initiation of template-directed DNA synthesis under appropriate conditions (i.e., in the presence of four different nucleoside triphosphates and an agent for polymerization, such as, DNA or RNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature.
  • the appropriate length of a primer depends on the intended use of the primer but typically ranges from 15 to 30 nucleotides. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template.
  • a primer need not reflect the exact sequence of the template but must be sufficiently complementary to hybridize with a template.
  • primer site refers to the area of the target DNA to which a primer hybridizes.
  • primer pair means a set of primers including a 5' upstream primer that hybridizes with the 5' end of the DNA sequence to be amplified and a 3' downstream primer that hybridizes with the complement of the 3' end of the sequence to be amplified.
  • nucleic acid molecules and nucleotide sequences of the present invention that are to be used for the detection and/or for quantification of polymorphisms, genetic markers, (differentially) expressed sequences, such as e.g. probes and primers, are chosen such that they are specific for sequence to be detected in the context of the human genome and/or human transcriptome.
  • nucleic acid molecules and nucleotide sequences will comprise a unique sequence (in the context of the human genome and/or human transcriptome) consisting of sufficient length to be specific (i.e. at least 10, 11, 12, 13, 14, 15, 16, 17, 18.
  • sample as used herein relates to a material or mixture of materials, containing one or more components of interest. Samples include, but are not limited to, samples obtained from an organism and may be directly obtained from a source (e.g., such as a blood sample, a biopsy or from a tumor) or indirectly obtained e.g., after culturing and/or one or more processing steps.
  • genomic refers to all nucleic acid sequences (coding and non-coding) and elements present in each cell type, preferably each somatic cell type, of a subject.
  • genome also applies to any naturally occurring or induced variation of these sequences that may be present in a mutant or disease variant of any cell type, including tumour cells.
  • "genomic DNA" and "genomic nucleic acid" are used herein interchangeably. They refer to nucleic acid isolated from a nucleus of one or more cells, and include nucleic acid derived from (i.e., isolated from, amplified from, cloned from as well as synthetic versions of) genomic DNA.
  • the human genome consists of approximately 3.0 x 10^9 base pairs of DNA organised into distinct chromosomes.
  • the genome of a normal diploid somatic human cell consists of 22 pairs of autosomes (chromosomes 1 to 22) and either chromosomes X and Y (males) or a pair of X chromosomes (females) for a total of 46 chromosomes.
  • a genome of a cancer cell may contain variable numbers of each chromosome in addition to deletions, rearrangements and amplification of any subchromosomal region or DNA sequence.
  • each nucleic acid probe immobilised to a discrete spot on an array has a sequence that is specific to (or characteristic of) a particular genomic region.
  • the ratio of intensity of two differentially labelled test and reference samples at a given spot on the array reflects the genome copy number ratio of the two samples at a particular genomic region.
  • Linkage describes the tendency of genes, alleles, loci or genetic markers to be inherited together as a result of their location on the same chromosome.
  • Linkage disequilibrium or allelic association means the preferential association of a particular allele or genetic marker with a specific allele or genetic marker at a nearby chromosomal location more frequently than expected by chance for any particular allele frequency in the population. For example, if locus X has alleles a and b, which occur equally frequently, and linked locus Y has alleles c and d, which occur equally frequently, one would expect the combination ac to occur with a frequency of 0.25. If ac occurs more frequently, then alleles a and c are in linkage disequilibrium.
  • Linkage disequilibrium may result from natural selection of certain combination of alleles or because an allele has been introduced into a population too recently to have reached equilibrium with linked alleles.
  • a marker in linkage disequilibrium can be particularly useful in detecting susceptibility to disease (or other phenotype) notwithstanding that the marker does not cause the disease.
  • a marker (X) that is not itself a causative element of a disease, but which is in linkage disequilibrium with a gene (including regulatory sequences) (Y) that is a causative element of a phenotype, can be detected to indicate susceptibility to the disease in circumstances in which the gene Y may not have been identified or may not be readily detectable.
  • Polymorphism refers to the occurrence of two or more genetically determined alternative sequences or alleles in a population.
  • a polymorphic marker or site is the locus at which divergence occurs. Preferred markers have at least two alleles, each occurring at frequency of greater than 1%, and more preferably greater than 10% or 20% of a selected population.
  • a polymorphic locus may be as small as one base pair.
  • Polymorphic markers include restriction fragment length polymorphisms, variable number of tandem repeats (VNTRs), hypervariable regions, minisatellites, dinucleotide repeats, trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, and insertion elements such as Alu.
  • the allelic form occurring most frequently in a selected population is sometimes referred to as the wildtype form. Diploid organisms may be homozygous or heterozygous for allelic forms.
  • a diallelic polymorphism has two forms.
  • a triallelic polymorphism has three forms.
  • a single nucleotide polymorphism or SNP occurs at a polymorphic site occupied by a single nucleotide, which is the site of variation between allelic sequences. The site is usually preceded by and followed by highly conserved sequences of the allele (e.g., sequences that vary in less than 1/100 or 1/1000 members of the populations). SNPs are most frequently diallelic.
  • a single nucleotide polymorphism usually arises due to substitution of one nucleotide for another at the polymorphic site.
  • a transition is the replacement of one purine by another purine or one pyrimidine by another pyrimidine.
  • a transversion is the replacement of a purine by a pyrimidine or vice versa.
  • Single nucleotide polymorphisms can also arise from a deletion of a nucleotide or an insertion of a nucleotide relative to a reference allele.
  • To identify common genetic variation associated with longevity we have performed a genome wide association scan in the Leiden Longevity study and followed the results up for replication in other cohorts of elderly subjects.
  • the present invention therefore relates to genetic markers of longevity.
  • genetic markers of longevity are preferably markers for mammalian longevity, more preferably human longevity, of which markers for longevity in the Caucasian population are most preferred.
  • the genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.1.
  • a nucleotide sequence that is in linkage disequilibrium with an SNP preferably is a polymorphic nucleotide sequence. Preferably a polymorphic nucleotide sequence is in linkage disequilibrium with the minor allele of an SNP.
  • a preferred genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.3.
  • the genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.2. Still more preferably the genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.4. Most preferably the genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.5.
  • the genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.6.
  • the nucleotide sequence comprises a polymorphism in the human population that is associated with longevity as listed in Table 2.1.1.
  • the polymorphism is a SNP selected from the group consisting of the SNPs listed in Table 2.1.3. More preferably, the polymorphism is a SNP selected from the group consisting of the SNPs listed in Table 2.1.2.
  • the polymorphism is a SNP selected from the group consisting of the SNPs listed in Table 2.1.4. Most preferably the polymorphism is a SNP selected from the group consisting of the SNPs listed in Table 2.1.5. In an alternative preferred embodiment the polymorphism is a SNP selected from the group consisting of the SNPs listed in Table 2.1.6.
  • the nucleic acid molecule comprises at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a nucleotide sequence that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence as defined in any one of SEQ ID NO.'s comprising an SNP listed in the above-mentioned Tables (see Table 3.1).
  • the nucleic acid molecule comprises at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a nucleotide sequence that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence as defined in any one of SEQ ID NO.'s: 1 - 94, more preferably the nucleic acid molecule comprises at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a nucleotide sequence that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence as defined in any one of SEQ ID NO.'s: 17, 18, 22, 23, 32, 39, 41, 44, 45, 56, 83 and 84.
  • a particularly preferred genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs rs16905070, rs7814049 and rs7013830.
  • SNPs rs16905070, rs7814049 and rs7013830 are located in a nucleotide sequence present at chromosome 8q24.22.
  • a nucleotide sequence that is in linkage disequilibrium with SNPs rs16905070, rs7814049 and rs7013830 is a nucleotide sequence comprised in a chromosomal fragment extending from at least 500 kb upstream of SNPs rs16905070, rs7814049 and rs7013830, to at least 450 kb downstream of SNPs rs16905070, rs7814049 and rs7013830.
  • a nucleotide sequence that is in linkage disequilibrium with SNPs rs16905070, rs7814049 and rs7013830 is a nucleotide sequence comprised in a chromosomal fragment extending from ST3GAL1 to ZFAT.
  • Another particularly preferred genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs rs4513644, rs4700233, rs854050 and rs4700231.
  • a nucleotide sequence that is in linkage disequilibrium with SNPs rs4513644, rs4700233, rs854050 and rs4700231 is a nucleotide sequence present at chromosome 5q11.2.
  • a nucleotide sequence that is in linkage disequilibrium with SNPs rs4513644, rs4700233, rs854050 and rs4700231 is a nucleotide sequence comprised in a chromosomal fragment extending from at least 400 kb upstream of SNPs rs4513644, rs4700233, rs854050 and rs4700231, to at least 650 kb downstream of SNPs rs4513644, rs4700233, rs854050 and rs4700231.
  • a nucleotide sequence that is in linkage disequilibrium with SNPs rs4513644, rs4700233, rs854050 and rs4700231 is a nucleotide sequence comprised in a chromosomal fragment extending from ACTBL2 to PLK2. If the nucleic acid molecule does not comprise the polymorphism (SNP) itself, it is preferred that the nucleic acid molecule comprises at least a nucleotide sequence in the proximity of the polymorphism (SNP), i.e.
  • nucleic acid molecule comprises at least a nucleotide sequence immediately adjacent to the polymorphism (SNP).
  • a nucleotide sequence that is in linkage disequilibrium with an SNP is herein defined as a nucleotide sequence showing an r² within 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, or 1.0 and/or a D' within 0.6, 0.7, 0.8, 0.9, or 1.0 with a SNP identified in Table 9, where r² and D' are measures indicating the extent of the LD between markers.
  • a nucleotide sequence that is in linkage disequilibrium with an SNP as herein defined is thus at least physically linked to the SNP and will preferably be present in the vicinity of the SNP, whereby the vicinity of the SNP is understood to mean within no more than 750, 500, 200, 100, 50, 20, 10 or 5 kb from the chromosomal location of the SNP.
  • a preferred nucleotide sequence that is in linkage disequilibrium with an SNP as herein defined and that is physically linked to the SNP is an expressed nucleotide sequence of a gene as listed in Table 2.1.2.
  • a more preferred nucleotide sequence that is in linkage disequilibrium with an SNP as herein defined and that is physically linked to the SNP is an expressed nucleotide sequence of a gene as listed in Table 2.1.5.
  • the genetic marker is a nucleic acid molecule comprising a nucleotide sequence that is differentially expressed between a population that expresses excess survival and a control population.
  • a population that expresses excess survival is herein defined as a population that has a lifespan of or above 85 years.
  • a nucleotide sequence that is differentially expressed between a population that expresses excess survival and a control population preferably is a nucleotide sequence that is differentially expressed between one or more of the cohorts of the Leiden Longevity Study and their corresponding control populations (samples) as described in the Examples herein.
  • a differentially expressed genetic marker for longevity may be useful in the methods of the present invention without knowledge of the polymorphism that underlies the difference in expression.
  • a preferred differentially expressed genetic marker for longevity is a nucleic acid molecule comprising at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a transcript (or complement thereof) that specifically hybridises to a probe selected from the group consisting of GE488443 (KALRN), GE83396 (C3orf26), GE4871 (NFIA), GE88135 (TCF4), GE785523 (METT5D1), GE624013 and GE535567 (MARCHIII), GE749029 and GE57513 (SNRPN).
  • Genbank accession no.'s, Unigene No.'s of expressed sequences identified by these probes are given in Tables 2.4.2 and 2.4.3.
  • nucleic acid molecule comprising at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a nucleotide sequence that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence with a Genbank accession no. as provided in Tables 2.4.2 and 2.4.3.
  • a particularly preferred differentially expressed genetic marker for longevity is a nucleic acid molecule comprising at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a transcript (or complement thereof) having at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence selected from the group consisting of SEQ ID NO: 120 (ST3GAL1), SEQ ID NO: 121 (ZFAT), SEQ ID NO: 122 (ACTBL2) and SEQ ID NO: 123 (PLK2).
  • a most preferred differentially expressed genetic marker for longevity is a nucleic acid molecule comprising at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a transcript (or complement thereof) having at least 80, 85, 90, 95, 98, 99% sequence identity with the nucleotide sequence of SEQ ID NO: 120 (ST3GAL1).
  • gene and SNP information as used herein and in the Tables is expressed by gene symbols, gene locations, polymorphism names and their identifications in public SNP database (dbSNP ID), and the genotypes using the standard expression used in molecular biology, and they are readily understood by one of ordinary skill in the art.
  • Gene symbol is the acronym or abbreviation corresponding to a given gene name. Genes and markers may have multiple symbols and names due to rediscovery or correlation to function following discovery.
  • a nucleotide sequence of the invention that is a genetic marker for longevity may be a nucleotide sequence of which a human allele is associated with genetic predisposition for deficiency in a health function, such as e.g.
  • risk factors for these diseases such as high serum cholesterol, triglycerides, blood pressure etc.) or other parameters of impaired functions of brain, heart, endocrine systems involved in metabolism, skeleton, muscles.
  • a portfolio of markers for use in the invention may contain any number of two or more and any type of marker as disclosed herein.
  • the markers in the portfolio are genetic markers as disclosed herein.
  • One embodiment of the invention relates to a portfolio comprising at least 2, 3, 4, 5, 6, 8, 10, 12, 15, 20 or 30 (isolated) nucleotide sequences or their complements, wherein each nucleotide sequence is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in any one of Tables 3.1 and 2.1.1 to 2.1.4.
  • the portfolio comprises at least 2, 3, 4, 5, 6, 8, 10, 12, 15, 20 or 30 (isolated) nucleotide sequences or their complements, wherein each nucleotide sequence is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.5 or 2.1.6.
  • the portfolio comprises at least two nucleotide sequences, which at least two nucleotide sequences are in linkage disequilibrium with at least two different SNPs listed in any one of Tables 3.1 and 2.1.1 to 2.1.4, whereby at least one nucleotide sequence is in linkage disequilibrium with a SNP selected from the group consisting of the SNPs rs16905070, rs7814049, rs7013830, rs4513644, rs4700233, rs854050 and rs4700231.
  • the polymorphic nucleotide sequences are in linkage disequilibrium with the minor allele of the SNPs.
  • the portfolio comprises at least two nucleotide sequences selected from the group consisting of: i) nucleotide sequences comprising at least 10 contiguous nucleotides from a transcript or complement thereof that specifically hybridises to a probe selected from the group consisting of probes having a nucleotide sequence of SEQ ID NO: 95 - 103; and, ii) nucleotide sequences that specifically hybridise to a transcript having at least 80% sequence identity with at least one nucleotide sequence selected from the group consisting of SEQ ID NO: 120, SEQ ID NO: 121, SEQ ID NO: 122, SEQ ID NO: 123 and their complements; whereby the portfolio comprises at least one nucleotide sequence selected from ii).
  • each nucleotide sequence in the portfolio is in linkage disequilibrium with a different SNP in the group.
  • one or more of the nucleotide sequences themselves comprise a polymorphism in the human population. This polymorphism may comprise an SNP as defined above.
  • the nucleotide sequences comprise at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a nucleotide sequence that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence as defined in any one of SEQ ID NO.'s comprising an SNP listed in the above-mentioned Tables (see Table 3.1).
  • the nucleotide sequences comprise at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a nucleotide sequence that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence as defined in any one of SEQ ID NO.'s: 1 - 94, more preferably the nucleotide sequences comprise at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a nucleotide sequence that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence as defined in any one of SEQ ID NO.'s: 17, 18, 22, 23, 32, 39, 41, 44, 45, 56, 83 and 84.
  • the nucleotide sequences in the portfolio are expressed nucleotide sequences.
  • the invention relates to the use of any of the markers for longevity as defined herein in any of the methods of the invention.
  • the invention relates to the use of the longevity markers of the invention in methods for determining genetic predisposition for longevity, in methods of screening for a substance that modulates the biological aging rate (and promotes/inhibits a healthy aging process), or a substance that is capable of modulation of longevity and/or life expectancy, and/or in methods for assessing the physiological age of (a sample from) a subject.
  • the invention in a fourth aspect relates to a method for determining a genetic predisposition for longevity. It is herein understood that methods for determining a genetic predisposition for longevity more generally include methods for determining a genetic predisposition for a certain life expectancy, including both a genetic predisposition against longevity as well as a genetic predisposition for longevity and all intermediate phenotypes.
  • the method for determining genetic predisposition for longevity of a person (or a subject) preferably is an ex vivo method, e.g. performed in vitro on a sample obtained from the subject.
  • the method preferably comprises determining the genotype of the subject with respect to one or more genetic markers of longevity as herein defined above.
  • the method comprises detecting the presence of a polymorphism that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in any one of Tables 2.1.1 to 2.1.4, more preferably a polymorphism that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.5 or 2.1.6, wherein the presence of the polymorphism is indicative of longevity.
  • the method comprises detecting the presence of a polymorphism selected from the group consisting of the SNPs listed in any one of Tables 2.1.1 to 2.1.4, more preferably a SNP selected from the group consisting of the SNPs listed in Table 2.1.5 or 2.1.6, wherein the presence of the polymorphism is indicative of longevity.
  • a genotype that is indicative of longevity is a carrier of an allele that is indicative of longevity i.e. a carrier of a protective allele.
  • This may be deduced from the odds ratios and the indications of major and minor alleles for each SNP as presented in Table 2.1.1.
  • an odds ratio (OR) greater than one indicates that a given minor allele from a polymorphism in Table 2.1.1 is the allele that is indicative of longevity;
  • odds ratios less than one indicate that a given minor allele is the allele that is indicative of mortality.
  • the allele that is indicative of longevity may also be referred to as the protective allele. For a given allele that is indicative of longevity (i.e.
  • the other alternative allele at the SNP position may be considered a mortality and/or disease risk allele.
  • the other alternative allele at the SNP position may be considered an allele that is indicative of longevity (i.e. the allele protective against mortality and disease).
  • the genotype "a carrier of the protective allele of a given SNP" is understood to mean a genotype that is homozygous or heterozygous for the protective allele of that SNP.
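The reading of odds ratios and carrier status described above can be illustrated with a minimal Python sketch. The function names, the allele counts in the usage note and the "neutral" label for an odds ratio of exactly one are assumptions made for illustration only; they are not taken from the Tables.

```python
def allele_odds_ratio(minor_cases, major_cases, minor_controls, major_controls):
    """Odds ratio of the minor allele in cases (long-lived) versus controls.

    Arguments are allele counts; all names are illustrative assumptions.
    """
    return (minor_cases * major_controls) / (major_cases * minor_controls)


def classify_minor_allele(odds_ratio):
    """OR > 1: minor allele indicative of longevity; OR < 1: of mortality."""
    if odds_ratio > 1:
        return "longevity"
    if odds_ratio < 1:
        return "mortality"
    return "neutral"  # assumption: the text does not address OR == 1


def is_carrier(genotype, protective_allele):
    """A carrier is homozygous or heterozygous for the protective allele."""
    return protective_allele in genotype
```

For hypothetical counts of 30 minor and 70 major alleles in cases against 20 minor and 80 major alleles in controls, the odds ratio is greater than one, so the minor allele would be read as the protective allele, and both a heterozygous and a homozygous genotype for that allele count as carriers.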
  • the present methods may be performed using any known biological or biochemical method in which genetic polymorphisms, such as SNPs, can be detected or visualized.
  • Such methods include, but are not limited to, DNA sequencing, allele specific PCR, PCR amplification followed by an allele/mutant specific restriction digestion, oligonucleotide ligation assays, primer hybridization and primer extension assays, optionally combined with or facilitated by microarray analysis.
  • Alternative methods for determining allelic variants and gene polymorphisms are readily available to the skilled person in the art of molecular diagnostics.
  • oligonucleotides capable of hybridizing to sequences in or flanking genes (e.g., polymorphic regions) involved in adenosine metabolism, and the use of these oligonucleotides for performing these methods.
  • Primers may be designed to amplify (e.g., by PCR) at least a fragment of a gene encoding an adenosine metabolism-associated enzyme.
  • a polymorphism may be present within the amplified sequence and may be detected by, for example, a restriction enzyme digestion or hybridization assay.
  • the polymorphism may also be located at the 3' end of the primer or oligonucleotide, thus providing means for an allele or polymorphism specific amplification, primer extension or oligonucleotide ligation reaction, optionally with a labelled nucleotide or oligonucleotide.
  • the label may be an enzyme (e.g., alkaline phosphatase, horseradish peroxidase), radiolabel (³²P, ³³P, ³H, ¹²⁵I, ³⁵S, etc.), a fluorescent label (Cy3, Cy5, GFP, EGFP, FITC, TRITC and the like) or a hapten/ligand (e.g., digoxigenin, biotin, HA, etc.).
  • the detection is carried out using oligonucleotides physically linked to a solid support, and may be performed in a microarray format.
  • a kit comprising one or more oligonucleotides capable of hybridizing to, or adjacent to, any of the polymorphic sites in any genetic markers as defined hereinabove.
  • the oligonucleotide(s) may be provided in solid form, in solution or attached on a solid carrier such as a DNA microarray.
  • the kit may provide detection means, containers comprising solutions and/or enzymes and a manual with instructions for use.
  • the method for determining genetic predisposition for longevity of a subject comprises determining the expression level of a nucleotide sequence that is in linkage disequilibrium with a SNP selected from the group consisting of the SNPs listed in Table 2.1.1.
  • Preferred examples of expressed nucleotide sequences that are in linkage disequilibrium with the SNPs listed in Table 2.1.1 are provided in Tables 2.1.2, 2.1.5, 2.1.7, 2.4.2 and 2.4.3.
  • the expression level is determined of a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.1 to 2.1.4, or yet more preferably in Table 2.1.5 or 2.1.6.
  • the expression level is determined of a transcript (or complement thereof) that specifically hybridises to a probe selected from the group consisting of GE488443 (KALRN), GE83396 (C3orf26), GE4871 (NFIA), GE88135 (TCF4), GE785523 (METT5D1), GE624013 and GE535567 (MARCHIII), GE749029 and GE57513 (SNRPN).
  • Genbank accession no.'s, Unigene no.'s of expressed sequences identified by these probes are given in Tables 2.4.2 and 2.4.3.
  • the expression level can be determined of a transcript that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence with a Genbank accession no. as provided in Tables 2.4.2 and 2.4.3.
  • the expression level is determined of a transcript (or complement thereof) having at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence selected from the group consisting of SEQ ID NO: 120 (ST3GAL1), SEQ ID NO: 121 (ZFAT), SEQ ID NO: 122 (ACTBL2) and SEQ ID NO: 123 (PLK2).
  • the expression level is determined of a transcript (or complement thereof) having at least 80, 85, 90, 95, 98, 99% sequence identity with the nucleotide sequence of SEQ ID NO: 120 (ST3GAL1).
  • Expressed nucleotide sequence (or nucleotide sequences) that are indicative of longevity are preferably those nucleotide sequences that are differentially expressed as up regulated or down regulated (in cells) in a sample from a population that expresses excess survival as compared to corresponding (cells) in a sample from a control population.
  • a modulated nucleotide sequence is (part of) a gene that is differentially expressed between one or more of the cohorts of the Leiden Longevity Study and their corresponding control populations (samples) as described in the Examples herein.
  • Up regulation and down regulation are relative terms meaning that a detectable difference (beyond the contribution of noise in the system used to measure it) is found in the amount of expression of the nucleotide sequences relative to a baseline.
  • a baseline preferably comes from a pool of subjects that do not express excess survival.
  • a pool of these subjects preferably contains 1, 3, 5, 10, 20, 30, 100, 400, 500, 600 or more subjects.
  • the expression level of a differentially expressed nucleotide sequence that is indicative of longevity is then considered either up regulated or down regulated relative to a baseline level using the same measurement method.
  • the assessment of the expression level of a nucleotide sequence in order to assess whether a gene/nucleotide sequence is modulated is preferably performed using classical molecular biology techniques to detect mRNA levels, such as (real time) reverse transcriptase PCR (whether quantitative or semi-quantitative), mRNA (micro)array analysis or Northern blot analysis, or other methods to detect RNA.
  • the expression level of a nucleotide sequence is determined indirectly by quantifying the amount of the polypeptide encoded by the gene/nucleotide sequence. Quantifying an amount of polypeptide may be carried out by any known technique. Preferably, an amount of polypeptide is quantified by Western blotting.
  • the quantification of a substrate of the corresponding polypeptides or of any compound known to be associated with the function of the corresponding polypeptides or the quantification of the function or activity of the corresponding polypeptide using a specific assay is encompassed within the scope of the prognosticating method of the invention.
  • the assessment of the expression level of a nucleotide sequence is carried out using (micro)arrays as defined herein.
  • the expression levels of a nucleotide sequence and/or amounts of a corresponding polypeptide are preferably measured in a sample from a subject.
  • the expression level (of a nucleotide sequence or polypeptide) is determined ex vivo in a sample obtained from a subject.
  • a sample may be liquid, semi-liquid, semi-solid or solid.
  • a preferred sample comprises 100 or more cells and/or a tissue from a subject to be tested taken in a biopsy.
  • a sample may comprise blood of a subject.
  • the skilled person knows how to isolate and optionally purify RNA and/or protein present in such a sample. In case of RNA, the skilled person may further amplify it using known techniques.
  • An increase (or upregulation) (which is synonymous with a higher expression level) or decrease (or downregulation) (which is synonymous with a lower expression level) of the expression level of a nucleotide sequence (or steady state level of the encoded polypeptide) is preferably defined as being a detectable change of the expression level of the nucleotide sequence (or steady state level of the encoded polypeptide or any detectable change in the biological activity of the polypeptide) using a method as defined earlier on as compared to the expression level of the corresponding nucleotide sequence (or steady state level of the corresponding encoded polypeptide) in a baseline.
  • an increase or decrease of a polypeptide activity is quantified using a specific assay for the polypeptide activity.
  • an increase of the expression level of a nucleotide sequence means an increase of at least 2, 3, 4, 5, 6, 7, 8, or 9% of the expression level of the nucleotide sequence using arrays. More preferably, an increase of the expression level of a nucleotide sequence means an increase of at least 10 or 11%, even more preferably at least 20%, at least 30%, at least 40%, at least 50%, at least 70%, at least 90%, at least 150% or more.
  • a decrease of the expression level of a nucleotide sequence means a decrease of at least 2, 3, 4, 5, 6, 7, 8, or 9% of the expression level of the nucleotide sequence using arrays. More preferably, a decrease of the expression level of a nucleotide sequence means a decrease of at least 10%, even more preferably at least 20%, at least 30%, at least 40%, at least 50%, at least 70%, at least 90%, at least 150% or more.
  • an increase of the expression level of a polypeptide means an increase of at least 2, 3, 4, 5, 6, 7, 8, or 9% of the expression level of the polypeptide using western blotting.
  • an increase of the expression level of a polypeptide means an increase of at least 10%, even more preferably at least 20%, at least 30%, at least 40%, at least 50%, at least 70%, at least 90%, at least 150% or more.
  • a decrease of the expression level of a polypeptide means a decrease of at least 2, 3, 4, 5, 6, 7, 8, or 9% of the expression level of the polypeptide using western blotting.
  • a decrease of the expression level of a polypeptide means a decrease of at least 10%, even more preferably at least 20%, at least 30%, at least 40%, at least 50%, at least 70%, at least 90%, at least 150% or more.
  • an increase of the polypeptide activity means an increase of at least 2,
  • an increase of the polypeptide activity means an increase of at least 10%, even more preferably at least 20%, at least 30%, at least 40%, at least 50%, at least 70%, at least 90%, at least 150% or more.
  • a decrease of the polypeptide activity means a decrease of at least 2, 3,
  • a decrease of the polypeptide activity means a decrease of at least 10%, even more preferably at least 20%, at least 30%, at least 40%, at least 50%, at least 70%, at least 90%, at least 150% or more.
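The percentage thresholds above amount to comparing a measured level against a baseline level obtained with the same measurement method. The following Python sketch shows one way to express that comparison; the function names and the default threshold of 10% are illustrative assumptions, and any of the listed percentages could be substituted.

```python
def percent_change(sample_level, baseline_level):
    """Relative change of a sample level versus the baseline, in percent."""
    if baseline_level <= 0:
        raise ValueError("baseline level must be positive")
    return 100.0 * (sample_level - baseline_level) / baseline_level


def regulation_call(sample_level, baseline_level, threshold_pct=10.0):
    """Label a measurement as up regulated, down regulated or unchanged.

    threshold_pct is an assumed default; the text lists several
    alternative thresholds (2% up to 150% or more).
    """
    change = percent_change(sample_level, baseline_level)
    if change >= threshold_pct:
        return "up"
    if change <= -threshold_pct:
        return "down"
    return "unchanged"
```

With a baseline of 1.0 (arbitrary units), a sample level of 1.25 would be called up regulated at the assumed 10% threshold, whereas a level of 1.05 would be called unchanged.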
  • a transcript or complement thereof
  • a probe selected from the group consisting of GE488443 (KALRN), GE83396 (C3orf26), GE4871 (NFIA), GE88135 (TCF4), GE785523 (METT5D1), GE624013 and GE535567 (MARCHIII), GE749029 and GE57513 (SNRPN) as a function of the age and sex of a subject is indicative of longevity.
  • up- or down-regulation can be determined of a transcript that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence with a Genbank accession no. as provided in Tables 2.4.2 and 2.4.3.
  • the up- or down-regulation of a transcript having at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence selected from the group consisting of SEQ ID NO: 120 (ST3GAL1), SEQ ID NO: 121 (ZFAT), SEQ ID NO: 122 (ACTBL2) and SEQ ID NO: 123 (PLK2), as a function of the age and sex of a subject is indicative of longevity.
  • the up- or down-regulation of a transcript (or complement thereof) having at least 80, 85, 90, 95, 98, 99% sequence identity with the nucleotide sequence of SEQ ID NO: 120 (ST3GAL1), as a function of the age and sex of a subject is indicative of longevity.
  • the invention in a fifth aspect relates to a method for assessing physiological age of a subject, preferably a human subject.
  • the rate of aging is very species specific, where a human may be aged at about 50 years; and a rodent at about 2 years.
  • a natural progressive decline in body systems starts in early adulthood, but in humans it becomes most evident several decades later.
  • One arbitrary way to define old age more precisely in humans is to say that it begins at conventional retirement age, around 60 to 65 years of age.
  • Another definition sets parameters for aging coincident with the loss of reproductive ability, which is around age 45, more usually around age 50 in humans, but will, however, vary with the individual. It has been found that individuals age at different rates, even within a species.
  • the method for assessing physiological age of a subject preferably is performed or measured in a sample from a subject. Thus, preferably the physiological age is assessed ex vivo in a sample obtained from a subject.
  • the method preferably comprises: a) determining expression information in a sample obtained from the subject, of one or more differentially expressed nucleotide sequences as herein defined above, preferably expression information of a portfolio of expressed nucleotide sequences as herein defined above; b) using the expression information to generate an age signature for the sample; and, c) comparing the age signature obtained in b) with a control age signature; wherein a statistically significant match with a positive control or a statistically significant difference from a negative control is indicative of age in the sample.
  • Methods for generating an age signature for a sample, for determining control age signatures and for determining statistically significant matches or differences with controls are described in detail in US20070161022, which is incorporated by reference herein.
  • the invention in a sixth aspect relates to a method for identification of a substance that modulates (or is capable of modulating) the biological aging rate in a subject, and/or that modulates (or is capable of modulating) longevity and/or life expectancy.
  • the method preferably comprises the steps of a) contacting the substance to a test cell or administering the substance to a test organism; b) determining in the test cell or in (a sample from) the test organism the expression level of one or more differentially expressed nucleotide sequences as herein defined above, preferably determining the expression level of a portfolio of expressed nucleotide sequences as herein defined above; c) comparing the expression level of the nucleotide sequence(s) with the expression level of the corresponding nucleotide sequence(s) in a test cell that is not contacted with the substance or in (a sample from) the test organism that is not contacted with the substance; and, d) identifying a substance that produces a difference in expression level of at least one of
  • a substance is identified as a substance that promotes longevity, extends lifespan and/or improves health, when the substance upregulates a nucleotide sequence that is upregulated in a population that expresses excess survival or when the substance downregulates a nucleotide sequence that is downregulated in a population that expresses excess survival.
  • the up- or down-regulation respectively in a population that expresses excess survival means that a nucleotide sequence is up- or down-regulated respectively in a population of cells or organisms that expresses excess survival as compared to a corresponding control population (e.g. one or more of the cohorts of the Leiden Longevity Study and their corresponding control populations as described in the Examples herein).
  • the expression level of the nucleotide sequence may be determined indirectly by quantifying the amount of polypeptide encoded by the nucleotide sequence.
  • the cells in the test cell population are preferably mammalian cells, preferably human cells.
  • the test cell population that is contacted with the substance and the test cell population that is not contacted with the substance are derived from one cell population, preferably from one cell line, more preferably from one cell.
  • the test organism may be a non-human mammal or a human volunteer.
  • the invention in a seventh aspect relates to a method of improving health of a subject.
  • the method preferably comprises: a) determining a genetic predisposition for longevity using a method as defined herein above; and, b) if the subject does not have the genetic predisposition for longevity, 1) providing, to the subject, a substance that modulates (or is capable of modulating) the biological aging rate identified in a method as described above; or 2) providing to the subject other medication or other dosages of medication on the basis of the biological age established by the methods of the invention as compared to the medication or dosage thereof normally provided to the subject.
  • the determination of the genetic predisposition for longevity in a) comprises determining in the test cell or in (a sample from) the test organism the expression level of one or more differentially expressed nucleotide sequences as herein defined above, more preferably determining the expression level of a portfolio of expressed nucleotide sequences as herein defined above; whereby in b) a substance is provided to the subject that modulates expression or activity of the (differentially) expressed nucleotide sequence(s) or its gene product(s).
  • the substance upregulates expression or activity of those (differentially) expressed nucleotide sequence(s) or its gene product(s) that are upregulated in a population that expresses excess survival, or the substance downregulates expression or activity of those (differentially) expressed nucleotide sequence(s) or its gene product(s) that are downregulated in a population that expresses excess survival.
  • LLC¹⁸ Longevity Study (LLS¹⁸), Leiden 85plus Study²⁰,²¹, Danish 1905 cohort²² and PROSPER²³ were published previously and are provided together with numbers and description of mortality data below. Numbers of cases and controls are summarized in
  • Inclusion criteria called for men and women between the ages of 70 and 82 years with a total plasma cholesterol of 155-350 mg/dL (4-9 mmol/L) and triglyceride levels < 200 mg/dL (6 mmol/L). Patients were excluded if they showed signs of cognitive decline, indicated by a score of 23 or less on the Mini Mental State Examination and a series of psychometric tests. The study population was distributed evenly with respect to existing vascular disease and qualifying risk factors. Patients were followed every 3 months for an average of 3.2 years. The primary composite endpoint, definite or suspected death from coronary heart disease (CHD), nonfatal myocardial infarction (MI), or fatal/nonfatal stroke, was measured at 3-year follow-up. A total of 604 subjects died during follow up. In statistical analysis adjustments for study cohort as well as use of Pravastatin were made.
  • 1.1.5 German cohort
  • the unrelated German study participants were drawn from population-based collections and comprised 1447 long-lived individuals of exceptional age (810 nonagenarians, 637 centenarians; age range: 95 - 110 years; mean age: 99 years) and 1104 younger control subjects (age range: 60-75 years; mean age: 67 years) (Nebel, A. et al., 2005, Proc. Natl. Acad. Sci. U. S. A 102, 7906-7909).
  • the gender ratio was about 75% females vs. 25% males.
  • the controls match the long-lived individuals in terms of ancestry, gender and geographical origin within the country, and genetic differences between the case-control samples are considered to be very low (Willcox, B. J. et al., 2008, Proc. Natl. Acad. Sci. U. S. A 105, 13987-13992).
  • the survival benefit of these families is marked by a 30 % excess survival observed in the proband, the parental and the offspring generation (Schoenmaker et al., 2006, supra).
  • the male offspring of the long-lived subjects have a lower prevalence of diabetes and cardiovascular disease as compared to their partners.
  • the offspring has lower serum glucose than their partners and beneficial lipoprotein profiles in the sense that on average offspring has significantly larger LDL particle sizes than partners, a feature even stronger represented in the nonagenarian siblings (Heijmans et al., 2006, PLoS Med. 2006 Dec; 3(12):e495).
  • 1.2 Genotyping
  • A flow chart of the approach used to identify SNPs associated with survival is provided in Figure 1 and a description of the population is provided in Table 1.1.
  • Genotyping for stage 1 of the GWAS was performed at Perlegen Sciences by applying the first generation genome-wide SNP array Affymetrix GeneChip Human Mapping 500K Array set comprising two arrays (262K + 238K) and for stage 2 by using an in-house Perlegen platform.
  • SNPs were selected for analysis when they had a MAF > 0.02, a successful call rate > 80% and P HWE ≥ 10⁻⁴. In total 357,162 SNPs were used for GWAS analysis with a mean genotype call rate of 95%. Genotype data were used to confirm sex and family relationships.
  • stage 1 genotypes due to population substructure was assessed by clustering analysis of IBS matrix using PLINK. One cluster was identified, indicating that the Leiden Longevity Study is a homogeneous population. The quantile-quantile plot showed that the p-value distribution of stage 1 conformed to a null distribution at all but the extreme tail (data not shown). The genomic inflation factor (λ), which measures over-dispersion of test statistics from association tests indicating population stratification, was 1.019.
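The genomic inflation factor mentioned above is conventionally computed as the median of the observed 1-degree-of-freedom chi-square association statistics divided by the median of the chi-square distribution with one degree of freedom under the null (approximately 0.4549). The sketch below assumes that conventional definition; the function name and the input values are hypothetical, and the text does not state which software produced the reported value of 1.019.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation_factor(chi2_stats):
    """Ratio of the observed median test statistic to its null expectation.

    Values close to 1 suggest little over-dispersion from stratification.
    """
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
```

A value near 1, such as the 1.019 reported for stage 1, indicates that the test statistics are barely inflated relative to the null distribution.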
  • Stage 2 SNPs were selected for analyses when the mean call rate > 95%, MAF > 0.02 and P HWE > 0.01. Genotyping quality control was performed using duplicate DNA samples within the LLS and SNP assays of the Sequenom MassARRAY platform to confirm genotyping accuracy of SNPs genotyped in stage 1 and 2 combined, for which 99.67% concordant results were obtained. Genotyping for replication studies was performed with MassARRAY iPLEX (Sequenom, San Diego). Four iPLEXes containing 96 out of 104 prioritised SNPs could be designed for replication studies, 81 of which were successfully typed in 100% of the samples at a genotype call rate of .
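The stage 2 selection criteria (call rate > 95%, MAF > 0.02 and P HWE > 0.01) can be sketched as a per-SNP filter on genotype counts. This is a minimal illustration under stated assumptions: the function names are hypothetical, the call rate is evaluated per SNP, and the Hardy-Weinberg p-value is obtained from a 1-degree-of-freedom chi-square goodness-of-fit test, which the text does not specify (an exact test may have been used instead).

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Frequency of the rarer allele, from counts of genotypes AA, AB, BB."""
    total = n_aa + n_ab + n_bb
    freq_a = (2 * n_aa + n_ab) / (2 * total)
    return min(freq_a, 1.0 - freq_a)


def hwe_p_value(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    total = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * total)
    q = 1.0 - p
    expected = (total * p * p, 2 * total * p * q, total * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_stage2_qc(n_aa, n_ab, n_bb, n_missing):
    """Apply the stage 2 thresholds quoted in the text to one SNP."""
    typed = n_aa + n_ab + n_bb
    call_rate = typed / (typed + n_missing)
    return (call_rate > 0.95
            and minor_allele_frequency(n_aa, n_ab, n_bb) > 0.02
            and hwe_p_value(n_aa, n_ab, n_bb) > 0.01)
```

A SNP with genotype counts in exact Hardy-Weinberg proportions and no missing calls passes, while a SNP with no heterozygotes at all, or with a call rate below the threshold, is excluded.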
  • In stage 1 of the GWAS, comparing genotype frequencies between cases and controls, we applied the Cochran-Armitage test for additive effects and Fisher's exact test for recessive effects.
  • SNPs with P-value below 0.01 for the trend were selected for the second stage.
  • Restricted space on the chip for stage 2 allowed us to select only a limited number of additional SNPs, for which reason SNPs associating in the recessive model at a p-value below 0.005 for Fisher's exact test were selected for genotyping in the second stage.
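The additive (trend) test used for stage 1 selection can be sketched as follows. This is the textbook Cochran-Armitage trend statistic with genotype weights 0, 1 and 2, written with the Python standard library only; the function name and the genotype counts in the usage note are hypothetical, and the sketch does not include the variance modification for related cases that is applied in the joint analysis.

```python
import math


def cochran_armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts.

    cases and controls are counts per genotype (0, 1 or 2 minor alleles).
    Returns the 1-df chi-square statistic and its p-value.
    """
    n_cases = sum(cases)
    n_controls = sum(controls)
    n_total = n_cases + n_controls
    columns = [r + s for r, s in zip(cases, controls)]

    weighted_cases = sum(w * r for w, r in zip(weights, cases))
    weighted_all = sum(w * n for w, n in zip(weights, columns))
    weighted_sq = sum(w * w * n for w, n in zip(weights, columns))

    trend = n_total * weighted_cases - n_cases * weighted_all
    denominator = n_cases * n_controls * (n_total * weighted_sq - weighted_all ** 2)
    if denominator == 0:
        raise ValueError("test undefined: no variation in genotype or status")

    chi2 = n_total * trend ** 2 / denominator
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For hypothetical counts of (10, 20, 30) in cases and (30, 20, 10) in controls, the p-value falls well below 0.01, so such a SNP would meet the trend criterion for selection into the second stage.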
  • For the joint analysis of stage 1 and 2 a variance-modified version of these score tests was used for additive and recessive effects. The relatedness between the highly aged sibling cases was taken into account when computing the variances of the scores²⁵. Unless otherwise stated, subsequent association analyses of the SNPs in additional cohorts were restricted to only the genetic model corresponding to the most significant result in the GWAS. Odds ratios were estimated and corresponding 95% confidence intervals were computed based on empirical standard errors. For meta-analyses a fixed effect approach was used. Scores and their variances were computed within each study and combined across the three studies to obtain a single meta-statistic. P-values below 5 × 10⁻⁸ were considered as genome wide significant²⁶. Heterogeneity between studies was assessed by estimating the between study variance using random effects meta analysis.
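One common way to combine per-study scores and variances under a fixed effect approach is to sum the scores and the variances across studies and refer the ratio to a standard normal distribution. The sketch below assumes that combination rule, which the text does not spell out; the function names and the numerical inputs are hypothetical.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # significance threshold quoted in the text


def fixed_effect_meta(scores, variances):
    """Combine per-study score statistics into one meta-statistic.

    Assumes Z = sum(scores) / sqrt(sum(variances)); returns Z and the
    two-sided p-value under a standard normal null.
    """
    if len(scores) != len(variances):
        raise ValueError("one variance is required per score")
    z = sum(scores) / math.sqrt(sum(variances))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def is_genome_wide_significant(p_value):
    return p_value < GENOME_WIDE_ALPHA
```

For three hypothetical studies with scores 3, 4 and 5 and variances 4, 4 and 8, the combined statistic is Z = 3, which is nominally significant but far from the genome-wide threshold.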
  • For testing association with survival of disease susceptibility alleles, the same methods as for the GWAS and meta-analysis were used. A linear regression model with the number of disease risk alleles as outcome and longevity status as covariate was fitted. For testing within the Leiden Longevity Study, empirical standard errors were used. To search for a set of single SNPs predicting survival, a cross-validation based model selection algorithm was applied²⁷. In addition to main effects the method also considers all pairwise interactions.
  • 1.4 RNA expression analysis
  • RNA yield was evaluated on the 2100 Bioanalyzer (Agilent Technologies) and the concentration was measured using a NanoDrop spectrophotometer (NanoDrop Technologies). Quality criteria included a 28S/18S ratio as measured by the Bioanalyzer of at least 1.2, and a total RNA yield of at least 3 µg.
  • the samples were hybridized on 54k CodeLink Human Whole Genome Bioarrays (GE Healthcare, currently of Applied Microarrays). Images were quantified with CodeLink Expression software (version 4.0). For the analysis of expression levels, 11 CodeLink Bioarray probes were selected on their correspondence to the ST3GAL1 (7 probes) or ZFAT locus (4 probes).
  • ST3GAL1 and one ZFAT probe were present in an exon of these genes, while the other probes corresponded to one or more UniGene Cluster or Expressed Sequence Tags (ESTs) within intronic regions.
  • 1.5 Selection for a set of disease susceptibility alleles
  • a GWAS was performed in two stages in subjects of the Leiden Longevity Study (LLS) and controls.
  • genotype data of 357,162 SNPs that passed quality control were analysed in a comparison of 417 unrelated probands from a nonagenarian sibling pair (94 years on average) and 470 controls (60 years on average, i.e. partners of offspring).
  • a flow chart of the consecutive analysis steps is depicted in Figure 1 and a description of the population samples investigated in the GWAS and subsequent replication studies is given in Table 1.1.
  • the association analysis for survival into old age is based only on the age difference between cases and controls, not on health status.
  • the nearest genes of the SNPs associating with p < 10⁻⁴ are depicted in Table 2.1.2.
  • a multiplex genotyping assay could be designed and successful typing was obtained in the replication cohorts. Association analysis was performed comparing 1236 cases of a mean age of 87 years from the Leiden 85 Plus Study and 1644 cases of a mean age of 93 years from the Danish 1905 Cohort and appropriate populations of younger controls (Table 1.1). Both are population based cohort studies from a genetic background similar to the LLS cohort³⁰.
  • the survival-associated SNPs at 8q24.22 cover a chromosome region of 72 kb that contains no known protein coding gene located within a 100 kb distance of the SNPs. Since long range cis-acting regulating variation has been documented 31 we tested whether the expression of the nearest genes to the chromosome 8q24.22 SNPs, including ZFAT, 391kb upstream of rsl6905070 and ST 3 GALl at 515 kb downstream of the top SNP, correlated to the variation. Gene expression micro-array data were available from 50 unrelated long-lived siblings from LLS families, 50 of their offspring and 50 controls 32 .
  • Seven probes on the microarray corresponded to the ST3GAL1 genome region and four probes to the ZFA T region.
  • The expression levels of these probes were compared between 60 carriers (nonagenarians, offspring and partners combined) and 90 non-carriers of any of the minor alleles of the three 8q24.22 SNPs.
  • The effect size and direction were similar in the sibling, offspring and partner groups separately, and comparable but non-significant increases were observed for the other five probes targeting this locus (Table 2.1.7), of which one is present in a known exon.
  • RNA expression analysis
  • 150 RNA samples belonging to 50 trios (50 unrelated nonagenarian siblings, 50 of their offspring, 50 partners of this offspring) were analysed using CodeLink microarrays. The characteristics of these samples are summarized in Table 2.4.1. A t-test analysis showed that none of the variables in Table 2.4.1 differ significantly between offspring and partners (which was one of the criteria for selecting the 50 trios), but that several parameters differ between sibs and offspring or partners (on which item the trios were not selected).
  • The gene expression data files were read into R using the 'Codelink' package (from Bioconductor). Normalisation was performed using the Cyclic Lowess method, resulting in a Spearman correlation between all samples of 0.97 and 0.98-0.99 between the technical replicates. In total 87.7% of the probes on the CodeLink array (53,423 in total) were expressed in at least 10% of the samples and 66% of the probes were expressed in at least 90% of the samples, indicating that a major part of the genes on the array are expressed in blood. We tested which transcripts were differentially expressed with age, comparing the nonagenarian sibs to partners of offspring. After correction for gender, 3127 transcripts were found.
  • The SNP is located on 8q24.22 and was identified in a meta-analysis of GWAS candidate SNPs in the Leiden Longevity Study (LLS) and two population-based cohort studies, consisting in total of 3700 highly aged subjects and 4153 middle-aged controls.
  • LLS Leiden Longevity Study
  • (P = 6.63 × 10⁻⁹).
  • The minor allele was associated with a 33% decreased probability of becoming long-lived.
  • Prospective analysis of survival data in an additional large cohort revealed that this allele was indeed associated from seventy years onwards with an increased mortality risk due to cancer and cardiovascular causes.
  • Pleiotropic effects of genetic variation have previously been reported, such as for variation at HNF1B influencing the risk of prostate cancer and diabetes 33.
  • The offspring of the nonagenarian siblings in the LLS have a decreased prevalence of myocardial infarction and hypertension 19 as compared to partners, whereas the somewhat older offspring of centenarians have a decreased prevalence of both cancer and cardiovascular disease 5.
  • The proximal chromosome band 8q24.21 has frequently been implicated in the risk of various types of cancer (prostate, ovary, breast, bladder 34); however, none of the 8q24.22 SNPs in our study are in LD with the cancer-associated SNPs on 8q24.21, 6 Mb centromeric of the 8q24.22 SNP. Consistent associations of genetic variation with survival in the general population have not frequently been described 9. This was further illustrated by applying the same cross-sectional study design to test the effect on survival into old age of 22 GWAS-identified susceptibility loci for coronary artery disease, cancer and type-2 diabetes.
  • GWAS-identified SNPs have a relatively low predictive value for disease risk and explain only a small fraction of the heritability of the traits involved 35. It would be expected that such alleles only marginally affect population-wide survival.
  • The subjects surviving into old age may carry protective genetic variation or be subject to environmental features counteracting the disease-promoting effect of disease susceptibility alleles.
  • Carriers of the minor alleles of any of the three 8q24.22 SNPs were found to have a 1.11-fold increased expression of mRNA probes covering the ST3GAL1 gene located 515 kb from rs16905070. Small long-range effects were also observed for variation in the 5p13 gene desert that correlated with expression of the PTGER4 gene at a 270 kb distance 31.
  • The ST3GAL1 gene encodes ST3 beta-galactoside alpha-2,3-sialyltransferase 1, a member of glycosyltransferase family 29.
  • This type II membrane protein catalyzes the transfer of sialic acid from CMP-sialic acid to galactose-containing substrates and sialic acid modulates immune interactions.
  • ST3Gal I modulates surface sialylated structures during the generation of dendritic cells from monocytes 36 .
  • Dendritic cells are antigen-presenting cells with high endocytic capacity that play a central role in immune regulation. The regulation of glycosylation has been implicated in immune responsiveness, multiple diseases and aging 37 .
  • Kenyon C. The plasticity of aging: insights from long-lived mutants. Cell 2005; 120(4):449-460.
  • apolipoprotein E gene is a "frailty gene," not a "longevity gene". Genet Epidemiol 2000; 19(3):202-210.
  • TCF7L2 Transcription factor 7-like 2
  • Raskin,L. et al. FGFR2 is a breast cancer susceptibility gene in Jewish and Arab Israeli populations. Cancer Epidemiol. Biomarkers Prev. 17, 1060-1065 (2008).
  • CAD coronary artery disease
  • T2D Type-2 diabetes
  • HF heart failure
  • BC breast cancer
  • PC prostate cancer
  • CC colon carcinoma
  • LC lung cancer
  • CVD cardiovascular disease.
  • SNPs selected for replication analysis associating at p < 6.43 × 10⁻⁴ with longevity in the GWAS analysis of LLS stage 1 and 2 combined. Chromosome position according to build 36. Position according to dbSNP build 129. MAF indicates minor allele frequency in all 953 Dutch controls. Major/minor refers to the allele with the highest or lowest frequency in controls. Ptrend and Precessive refer to the p value obtained in the additive or recessive model, respectively. OR indicates the odds ratio of the most significant model. ORs above 1 indicate an increased probability of becoming long-lived, based on the minor allele being overrepresented in the elderly as compared to young controls. ORs below 1 indicate the opposite. Supplementary Table 3.1 presents the sequences of the SNP identifiers of the prioritized SNPs.
  • CodeLink Bioarray probes were selected on basis of their correspondence to the ST3GAL1 or ZFAT locus.
  • One ST3GAL1 (GE57639) and one ZFAT (GE86200) probe were present in an exon of these genes, while the other probes corresponded to one (or more) Expressed Sequence Tags (ESTs) within intronic regions. Expression differences within these probes were analyzed using Linear regression in Stata/SE 8.0 between 60 carriers and 90 non-carriers of one or more of the 8q24.22 SNPs.
  • FC Fold Change
  • a M indicates that the risk allele contributes to metabolic disease (CAD or T2D); C indicates that the risk allele contributes to cancer.
  • The major allele (A) has been associated with risk for cancer, while the minor allele (G) has been associated with T2D.
  • c The major allele is indicated in bold.
  • Z) Logistic regression with long-lived/control status as outcome, the study as covariate and the SNP genotypes as independent variable (Stata/SE 8.0). Analyses were performed with robust standard errors to take into account family dependency in the Leiden Longevity Study.
  • Table 2.4.1 Medians of parameters measured in donors of the 150 RNA samples that were used in the microarray comparison study. *: p < 0.05 between long-lived sibs and offspring or partners. None of the parameters is significantly different between partners and offspring.
  • Lymphocytes E-9 cells/L* 2.01 2.03 1.19
  • Eosinophils E- 10 cells/L 1.63 1.84 1.56
  • Results from the microarray comparison of RNA from 150 subjects from the Leiden Longevity Study. Probes that are 1) differentially expressed between offspring of long-lived siblings and their partners and 2) reside in transcripts in which SNPs were associated with longevity in the Leiden Longevity Study with p-value < 0.05 (Combination expression/GWA).
  • GE Probe ID which is a unique identifier for the probe sequence in the CodeLink WEBB database. This is an internal GE Healthcare relational database that held all gene associated annotations and linked them to the specific codelink probe ID.

Abstract

The present invention relates to genetic and biochemical markers of longevity that were identified as common genetic variation associated with longevity in cohorts of elderly subjects. In particular, the genetic markers for longevity of the invention concern a collection of nucleic acid molecules comprising a nucleotide sequence that is in linkage disequilibrium with a SNP that is associated with longevity in cohorts of elderly subjects. In another embodiment of the invention the genetic markers for longevity are nucleic acid molecules comprising nucleotide sequences that are differentially expressed between a population that expresses excess survival and a control population. In a further aspect the invention relates to portfolios comprising subsets of the genetic markers for longevity. The genetic markers for longevity of the invention may be used in methods for determining genetic predisposition for longevity, methods of screening for a substance that modulates the biological aging rate or a substance that is capable of modulation of longevity and/or life expectancy, and/or methods for assessing physiological age.

Description

New indicators of human longevity and biological ageing rate
Field of the invention
The present invention relates to the field of molecular human genetics and epidemiology. In particular the invention relates to genetic and biochemical markers of longevity.
Background of the invention
Survival to high ages, often referred to as longevity, clusters in families1. Although twin studies show the genetic contribution to human lifespan variation to be low2 (25%), age-specific susceptibility to death (frailty) is more heritable3 (50%). Recent analyses of ~20,000 twins born in Nordic countries between 1870 and 1910 confirm this, and showed that genetic influences on lifespan become apparent only after the age of 60 years4. Families enriched for survival to high ages display a lower prevalence of diabetes, cardiovascular disease and cancer5,6 from middle age onwards. This suggests the existence of genetic loci that beneficially moderate the biological rate of aging7 and the susceptibility for various major diseases.
These findings provide a rationale for a search to identify loci affecting survival into old age. Genetic association studies comparing highly aged cases to young controls reveal loci at which genetic variants may contribute to a higher or lower probability to survive into old age. So far this approach was mainly applied to study candidate genes such as the orthologs of loci in insulin/IGF-1 signaling (IIS) pathways8,9 that emerged from lifespan extension studies in animal models10. Such studies demonstrated that protective 'longevity' genes indeed exist but do not affect mortality at all ages11. The human candidate longevity gene studies are dominated by contradictory results12. The more consistent evidence for association with longevity was found for the APOE13 and, more recently, the FOXO3A14,15 loci. Two genome-wide scans for longevity have been reported, including a linkage scan in sibships of exceptional longevity16 and a GWA 100K scan for longevity-associated traits in a community-based study, but neither of them revealed genome-wide significant associations17. GWAS studies have identified risk alleles for major diseases such as diabetes, cancer and cardiovascular disease. For most of the loci involved it is not clear yet to what extent they affect survival in the general population. It is an object of the present invention to identify loci regulating human lifespan. The applicants have therefore performed a GWAS comparing nonagenarian sibling pairs from the Leiden Longevity Study and younger controls. The propensity to survive into old age in subsequent generations of these families is illustrated by a 30% survival benefit18 as compared to the general population and an underrepresentation of type 2 diabetes and myocardial infarction at middle age19. The GWAS was followed by replication analysis in additional cohorts using cross-sectional and prospective data.
One locus influenced the probability to survive into old age by affecting both the risk of cancer and cardiovascular mortality and was related to transcription of nearby genes. For comparison we examined 22 known GWAS identified loci affecting susceptibility to mortality-associated diseases for their effect on survival into old age.
It is thus an object of the present invention to provide for the identification of genetic and biochemical markers of longevity and preferably thereby to find pathways that modulate the biological aging rate and promote a healthy aging process.
Description of the invention
Definitions
The term "hybridisation" refers to the binding of two single stranded nucleic acids via complementary base pairing. The terms "hybridizing specifically to", "specific hybridization", and "selectively hybridize to," as used herein refer to the binding, duplexing, or hybridizing of a nucleic acid molecule preferentially to a particular nucleotide sequence under stringent conditions. The term "stringent conditions" refers to conditions under which a probe will hybridize preferentially to its target subsequence, and to a lesser extent to, or not at all to, other sequences in a mixed population (e.g., a cell lysate or DNA preparation from a tissue biopsy). "Stringent hybridization" and "stringent hybridization wash conditions" in the context of nucleic acid hybridization (e.g., as in array, Southern or northern hybridizations) are sequence dependent, and are different under different environmental parameters. An extensive guide to the hybridization of nucleic acids is found in, e.g., Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology - Hybridization with Nucleic Acid Probes part I, Ch. 2, "Overview of principles of hybridization and the strategy of nucleic acid probe assays," Elsevier, N.Y. ("Tijssen"). Generally, highly stringent hybridization and wash conditions are selected to be about 5 °C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Very stringent conditions are selected to be equal to the Tm for a particular probe.
An example of stringent hybridization conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on an array or on a filter in a Southern or northern blot is 42 °C using standard hybridization solutions (see, e.g., Sambrook and Russell (2001) Molecular Cloning: A Laboratory Manual (3rd ed.) Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor Press, NY, and detailed discussion, below), with the hybridization being carried out overnight. An example of highly stringent wash conditions is 0.15 M NaCl at 72 °C for about 15 minutes. An example of stringent wash conditions is a 0.2 x SSC wash at 65 °C for 15 minutes (see, e.g., Sambrook supra, for a description of SSC buffer). Often, a high stringency wash is preceded by a low stringency wash to remove background probe signal. An example of a medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is 1 x SSC at 45 °C for 15 minutes. An example of a low stringency wash for a duplex of, e.g., more than 100 nucleotides, is 4 x to 6 x SSC at 40 °C for 15 minutes.
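As an illustration of the rule that highly stringent conditions lie about 5 °C below the Tm, the following minimal Python sketch estimates a Tm and derives such a temperature. The Wallace rule used here is a textbook approximation for short oligonucleotides (roughly 14 to 20 nt) and is an assumption of this sketch, not a method taken from the present description; the function names and the example probe are hypothetical.

```python
def wallace_tm(probe: str) -> float:
    """Rough melting temperature (deg C) of a short oligonucleotide.

    Assumption: Wallace rule, Tm = 2*(A+T) + 4*(G+C). This approximation
    is not part of the patent text and only suits short probes.
    """
    seq = probe.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("probe must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def stringent_temperature(probe: str, offset: float = 5.0) -> float:
    """Temperature `offset` deg C below the estimated Tm (5 deg C by default)."""
    return wallace_tm(probe) - offset


example_probe = "ACGTACGTACGTACGTAC"  # hypothetical 18-mer, for illustration only
print(wallace_tm(example_probe), stringent_temperature(example_probe))
```

For probes longer than about 20 nt, or for the duplexes of more than 100 residues mentioned above, a salt-adjusted or nearest-neighbour Tm estimate would be more appropriate than this simple rule.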
The term "nucleic acid" or "nucleic acid molecule" as used herein refers to a deoxyribonucleotide or ribonucleotide in either single- or double-stranded form. The term encompasses nucleic acids containing known analogues of natural nucleotides which have similar or improved binding properties, for the purposes desired, as the reference nucleic acid. The term also includes nucleic acids which are metabolized in a manner similar to naturally occurring nucleotides or at rates that are improved for the purposes desired. The term also encompasses nucleic-acid-like structures with synthetic backbones. DNA backbone analogues provided by the invention include phosphodiester, phosphorothioate, phosphorodithioate, methylphosphonate, phosphoramidate, alkyl phosphotriester, sulfamate, 3'-thioacetal, methylene(methylimino), 3'-N-carbamate, morpholino carbamate, and peptide nucleic acids (PNAs); see Oligonucleotides and Analogues, a Practical Approach, edited by F. Eckstein, IRL Press at Oxford University Press (1991); Antisense Strategies, Annals of the New York Academy of Sciences, Volume 600, Eds. Baserga and Denhardt (NYAS 1992); Milligan (1993) J. Med. Chem. 36:1923-1937; Antisense Research and Applications (1993, CRC Press). PNAs contain non-ionic backbones, such as N-(2- aminoethyl) glycine units. Phosphorothioate linkages are described in WO 97/03211; WO 96/39154; Mata (1997) Toxicol. Appl. Pharmacol. 144: 189-197. Other synthetic backbones encompassed by the term include methyl-phosphonate linkages or alternating methylphosphonate and phosphodiester linkages (Strauss-Soukup (1997) Biochemistry 36: 8692-8698), and benzylphosphonate linkages (Samstag (1996) Antisense Nucleic Acid Drug Dev 6: 153-156).
When a nucleic acid molecule or sequence of the invention is in single stranded form, the opposite, i.e. complementary strand of the nucleic acid molecule or sequence is expressly included within the scope of the invention. An isolated nucleic acid means an object species of the invention that is the predominant species present (i.e., on a molar basis it is more abundant than any other individual species in the composition). Preferably, an isolated nucleic acid comprises at least about 50, 80 or 90 percent (on a molar basis) of all macromolecular species present. Most preferably, the object species is purified to essential homogeneity (contaminant species cannot be detected in the composition by conventional detection methods).
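The complementary strand referred to above can be derived mechanically from a single-stranded DNA sequence. The short Python sketch below shows one way to do this; the function name is hypothetical and only the four standard DNA bases are handled.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    if set(seq.upper()) - set("ACGT"):
        raise ValueError("sequence may only contain A, C, G and T")
    # Complement every base, then reverse so the result reads 5' to 3'.
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))
```

Applying the function twice returns the original sequence, which is a convenient sanity check.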
The term "array", "micro-array", "nucleic acid array" and "biochip" are used herein interchangeably. They refer to an arrangement, on a substrate surface, of multiple nucleic acid molecules of predetermined identity, of which preferably the sequences are known. Each nucleic acid molecule is immobilized to a "discrete spot" (i.e., a defined location or assigned position) on the substrate surface. The term "micro-array" more specifically refers to an array that is miniaturized so as to require microscopic examination for visual evaluation. The arrays used in the methods of the invention are preferably microarrays. The nucleic acid array as used herein is a plurality of target elements, each target element comprising one or more nucleic acid molecules (probes) immobilized on one or more solid surfaces to which sample nucleic acids can be hybridized. The nucleic acids of a probe can contain sequence(s) from specific genes or clones, e.g. from specific genomic regions described in Tables herein. Other probes may contain, for instance, reference sequences. The probes of the arrays may be arranged on the solid surface at different densities. The probe densities will depend upon a number of factors, such as the nature of the label, the solid support, and the like. One of skill will recognize that each probe may comprise a mixture of nucleic acids of different lengths and sequences. Thus, for example, a probe may contain more than one copy of a cloned piece of DNA or RNA, and each copy may be broken into fragments of different lengths. The length and complexity of the nucleic acid fixed onto the target element is not critical to the invention. One of skill can adjust these factors to provide optimum hybridization and signal production for a given hybridization procedure, and to provide the required resolution among different genes or genomic locations.
The term "probe" or "nucleic acid probe", as used herein, is defined to be one or more nucleic acid fragments whose specific hybridization to a sample can be detected. The probe may be unlabelled or labelled as described below so that its binding to the target or sample can be detected. The probe is produced from a source of nucleic acids from one or more particular (preselected) portions of a chromosome, e.g., one or more clones, an isolated whole chromosome or chromosome fragment, or a collection of polymerase chain reaction (PCR) amplification products. The probes of the present invention are produced from nucleic acids found in the regions described herein.
The probe may also be isolated nucleic acids immobilized on a solid surface (e.g., nitrocellulose, glass, quartz, fused silica slides), as in an array. In some embodiments, the probe may be a member of an array of nucleic acids as described, for instance, in WO 96/17958. Techniques capable of producing high density arrays can also be used for this purpose (see, e.g., Fodor (1991) Science 767-773; Johnston (1998) Curr. Biol. 8: R171-R174; Schummer (1997) Biotechniques 23: 1087-1092; Kern (1997) Biotechniques 23: 120-124; U.S. Pat. No. 5,143,854). One of skill will recognize that the precise sequence of the particular probes described herein can be modified to a certain degree to produce probes that are "substantially identical" to the disclosed probes, but retain the ability to specifically bind to (i.e., hybridize specifically to) the same targets or samples as the probe from which they were derived (see discussion above). Such modifications are specifically covered by reference to the individual probes described herein.
The term primer refers to a single-stranded oligonucleotide capable of acting as a point of initiation of template-directed DNA synthesis under appropriate conditions (i.e., in the presence of four different nucleoside triphosphates and an agent for polymerization, such as, DNA or RNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature. The appropriate length of a primer depends on the intended use of the primer but typically ranges from 15 to 30 nucleotides. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not reflect the exact sequence of the template but must be sufficiently complementary to hybridize with a template. The term primer site refers to the area of the target DNA to which a primer hybridizes. The term primer pair means a set of primers including a 5' upstream primer that hybridizes with the 5' end of the DNA sequence to be amplified and a 3' downstream primer that hybridizes with the complement of the 3' end of the sequence to be amplified.
As will be understood by the skilled artisan, the nucleic acid molecules and nucleotide sequences of the present invention that are to be used for the detection and/or for quantification of polymorphisms, genetic markers, (differentially) expressed sequences, such as e.g. probes and primers, are chosen such that they are specific for the sequence to be detected in the context of the human genome and/or human transcriptome. This generally means that nucleic acid molecules and nucleotide sequences will comprise a unique sequence (in the context of the human genome and/or human transcriptome) of sufficient length to be specific (i.e. at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nt) and that they preferably will not contain sequences that occur elsewhere in the human genome and/or human transcriptome, such as e.g. repetitive sequences and homopolymer stretches (e.g. polyA tails).
The term "sample" as used herein relates to a material or mixture of materials, containing one or more components of interest. Samples include, but are not limited to, samples obtained from an organism and may be directly obtained from a source (e.g., such as a blood sample, a biopsy or from a tumor) or indirectly obtained e.g., after culturing and/or one or more processing steps.
The term "genome" refers to all nucleic acid sequences (coding and non-coding) and elements present in each cell type, preferably each somatic cell type, of a subject. The term genome also applies to any naturally occurring or induced variation of these sequences that may be present in a mutant or disease variant of any cell type, including tumour cells. The terms "genomic DNA" and "genomic nucleic acid" are used herein interchangeably. They refer to nucleic acid isolated from a nucleus of one or more cells, and include nucleic acid derived from (i.e., isolated from, amplified from, cloned from as well as synthetic versions of) genomic DNA.
For example, the human genome consists of approximately 3.0 × 10⁹ base pairs of DNA organised into distinct chromosomes. The genome of a normal diploid somatic human cell consists of 22 pairs of autosomes (chromosomes 1 to 22) and either chromosomes X and Y (males) or a pair of chromosome Xs (female) for a total of 46 chromosomes. A genome of a cancer cell may contain variable numbers of each chromosome in addition to deletions, rearrangements and amplification of any subchromosomal region or DNA sequence.
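The probe and primer criteria described above (a minimum length of 10 to 25 nt and the avoidance of homopolymer stretches such as polyA tails) lend themselves to a simple pre-screen. The Python sketch below is illustrative only: the homopolymer cutoff and the function names are assumptions, and the sketch does not test uniqueness within the human genome or transcriptome, which would require a sequence search against a reference.

```python
import re

MIN_LENGTH = 10      # lower bound of the 10-25 nt range given in the text
MAX_HOMOPOLYMER = 5  # assumption: illustrative cutoff, not from the text


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of one repeated base (0 for an empty string)."""
    runs = re.finditer(r"(.)\1*", seq.upper())
    return max((len(m.group(0)) for m in runs), default=0)


def passes_basic_checks(seq: str,
                        min_length: int = MIN_LENGTH,
                        max_run: int = MAX_HOMOPOLYMER) -> bool:
    """True if the candidate is long enough and has no long homopolymer run."""
    return len(seq) >= min_length and longest_homopolymer(seq) <= max_run


for candidate in ("ACGTACGTAC", "ACGTTTTTTA", "ACGT"):  # hypothetical candidates
    print(candidate, passes_basic_checks(candidate))
```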
As used herein, the term "genomic locus" or "genomic region" refer to a defined portion of a genome. Likewise the terms "chromosomal locus" and "chromosomal region" refer to a defined portion of a chromosome. For practical purposes the terms "genomic locus", "genomic region", "chromosomal region" and "chromosomal locus" are used interchangeably herein. In the methods of the invention, each nucleic acid probe immobilised to a discrete spot on an array has a sequence that is specific to (or characteristic of) a particular genomic region. In an array-based comparative genomic hybridisation experiment, the ratio of intensity of two differentially labelled test and reference samples at a given spot on the array reflects the genome copy number ratio of the two samples at a particular genomic region.
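The intensity ratio described above for array-based comparative genomic hybridisation is commonly reported on a log2 scale. The following Python sketch computes it per spot; the spot names and intensity values are hypothetical, and the interpretation in the comments assumes an idealised diploid reference without background signal or normalisation effects.

```python
import math


def log2_ratio(test_intensity: float, reference_intensity: float) -> float:
    """log2 of the test/reference intensity ratio at one array spot."""
    if test_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(test_intensity / reference_intensity)


# Hypothetical (test, reference) intensities for three spots.
spots = {
    "region_1": (1000.0, 1000.0),  # ratio 1.0: equal copy number
    "region_2": (1500.0, 1000.0),  # ratio 1.5: e.g. 3 copies vs 2 in an ideal case
    "region_3": (500.0, 1000.0),   # ratio 0.5: e.g. 1 copy vs 2 in an ideal case
}

for name, (test, ref) in spots.items():
    print(name, round(log2_ratio(test, ref), 3))
```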
Linkage describes the tendency of genes, alleles, loci or genetic markers to be inherited together as a result of their location on the same chromosome.
Linkage disequilibrium or allelic association means the preferential association of a particular allele or genetic marker with a specific allele or genetic marker at a nearby chromosomal location more frequently than expected by chance for any particular allele frequency in the population. For example, if locus X has alleles a and b, which occur equally frequently, and linked locus Y has alleles c and d, which occur equally frequently, one would expect the combination ac to occur with a frequency of 0.25. If ac occurs more frequently, then alleles a and c are in linkage disequilibrium. Linkage disequilibrium may result from natural selection of certain combinations of alleles or because an allele has been introduced into a population too recently to have reached equilibrium with linked alleles. A marker in linkage disequilibrium can be particularly useful in detecting susceptibility to disease (or other phenotype) notwithstanding that the marker does not cause the disease. For example, a marker (X) that is not itself a causative element of a disease, but which is in linkage disequilibrium with a gene (including regulatory sequences) (Y) that is a causative element of a phenotype, can be detected to indicate susceptibility to the disease in circumstances in which the gene Y may not have been identified or may not be readily detectable.
Polymorphism refers to the occurrence of two or more genetically determined alternative sequences or alleles in a population. A polymorphic marker or site is the locus at which divergence occurs. Preferred markers have at least two alleles, each occurring at frequency of greater than 1%, and more preferably greater than 10% or 20% of a selected population. A polymorphic locus may be as small as one base pair.
Polymorphic markers include restriction fragment length polymorphisms, variable number of tandem repeats (VNTR's), hypervariable regions, minisatellites, dinucleotide repeats, trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, and insertion elements such as Alu. The allelic form occurring most frequently in a selected population is sometimes referred to as the wildtype form. Diploid organisms may be homozygous or heterozygous for allelic forms. A diallelic polymorphism has two forms. A triallelic polymorphism has three forms.
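The linkage disequilibrium example given above for loci X and Y (alleles a and c each at frequency 0.5, expected haplotype frequency 0.25) can be made concrete with a short Python sketch. The statistics D, D' and r² are standard population-genetic measures that are not named in the present description, and the observed haplotype frequency of 0.40 is a hypothetical value chosen for illustration.

```python
def ld_statistics(p_a: float, p_c: float, p_ac: float) -> dict:
    """D, signed D' and r^2 for allele a (locus X) and allele c (locus Y).

    p_a, p_c: allele frequencies; p_ac: observed frequency of haplotype ac.
    """
    for p in (p_a, p_c):
        if not 0.0 < p < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ac - p_a * p_c  # deviation from the frequency expected by chance
    if d >= 0:
        d_max = min(p_a * (1 - p_c), (1 - p_a) * p_c)
    else:
        d_max = min(p_a * p_c, (1 - p_a) * (1 - p_c))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_c * (1 - p_c))
    return {"D": d, "D_prime": d_prime, "r2": r2}


# Haplotype ac observed at exactly the expected frequency: no disequilibrium.
print(ld_statistics(0.5, 0.5, 0.25))
# Hypothetical observation: ac is more frequent than expected.
print(ld_statistics(0.5, 0.5, 0.40))
```

With the equal allele frequencies of the example, an observed ac frequency of 0.25 gives D = 0, whereas any higher frequency gives D > 0, matching the definition in the text.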
A single nucleotide polymorphism or SNP occurs at a polymorphic site occupied by a single nucleotide, which is the site of variation between allelic sequences. The site is usually preceded by and followed by highly conserved sequences of the allele (e.g., sequences that vary in less than 1/100 or 1/1000 members of the populations). SNPs are most frequently diallelic. A single nucleotide polymorphism usually arises due to substitution of one nucleotide for another at the polymorphic site. A transition is the replacement of one purine by another purine or one pyrimidine by another pyrimidine. A transversion is the replacement of a purine by a pyrimidine or vice versa. Single nucleotide polymorphisms can also arise from a deletion of a nucleotide or an insertion of a nucleotide relative to a reference allele.
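The distinction between transitions and transversions defined above can be expressed as a small classifier. The Python sketch below is a minimal illustration; the function name is hypothetical and only single-base DNA substitutions are handled.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("alleles must each be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternative alleles are identical")
    # Purine <-> purine or pyrimidine <-> pyrimidine is a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


for ref, alt in (("A", "G"), ("C", "T"), ("A", "C")):
    print(ref, alt, classify_substitution(ref, alt))
```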
Detailed description of the invention
It is an object of the present invention to provide for the identification of genetic and biochemical markers of longevity and preferably thereby to find pathways that modulate the biological aging rate and inhibit or promote a healthy aging process. To identify common genetic variation associated with longevity, we have performed a genome wide association scan in the Leiden Longevity Study and followed the results up for replication in other cohorts of elderly subjects. In addition we have performed a genome wide linkage scan in the nonagenarian proband sibling pairs of the study to identify rare alleles contributing to longevity and in 27 larger sibships separately to identify private genetic variation associated with longevity (Martin, G. M. (1997) The Werner Mutation: does it lead to a "public" or "private" mechanism of aging? Mol Med. 3, 356-358).
In a first aspect the present invention therefore relates to genetic markers of longevity. In the context of the invention, genetic markers of longevity are preferably markers for mammalian longevity, more preferably human longevity, of which markers for longevity in the Caucasian population are most preferred.
In one embodiment the genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.1. A nucleotide sequence that is in linkage disequilibrium with an SNP preferably is a polymorphic nucleotide sequence. Preferably a polymorphic nucleotide sequence is in linkage disequilibrium with the minor allele of an SNP. A preferred genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.3. More preferably the genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.2. Still more preferably the genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.4. Most preferably the genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.5. In an alternative preferred embodiment the genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.6. Preferably in the nucleic acid molecule the nucleotide sequence comprises a polymorphism in the human population that is associated with longevity as listed in Table 2.1.1. Preferably, the polymorphism is a SNP selected from the group consisting of the SNPs listed in Table 2.1.3. More preferably, the polymorphism is a SNP selected from the group consisting of the SNPs listed in Table 2.1.2.
Still more preferably, the polymorphism is a SNP selected from the group consisting of the SNPs listed in Table 2.1.4. Most preferably the polymorphism is a SNP selected from the group consisting of the SNPs listed in Table 2.1.5. In an alternative preferred embodiment the polymorphism is a SNP selected from the group consisting of the SNPs listed in Table 2.1.6. In a particularly preferred embodiment the nucleic acid molecule comprises at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a nucleotide sequence that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence as defined in any one of SEQ ID NO.'s comprising an SNP listed in the above-mentioned Tables (see Table 3.1). In an alternative embodiment the nucleic acid molecule comprises at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a nucleotide sequence that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence as defined in any one of SEQ ID NO.'s: 1 - 94, more preferably the nucleic acid molecule comprises at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a nucleotide sequence that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence as defined in any one of SEQ ID NO.'s: 17, 18, 22, 23, 32, 39, 41, 44, 45, 56, 83 and 84.
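The two sequence criteria used above (a minimum run of contiguous nucleotides and a minimum percentage of sequence identity) can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, the example sequences are invented and are not any SEQ ID NO of the disclosure, and the identity calculation is an ungapped comparison of equal-length sequences, whereas identity is in practice normally determined with an alignment program.

```python
# Illustrative sketch only. Function names and example sequences are
# hypothetical; identity is computed without gaps, which is a simplification.

def shares_contiguous_run(candidate: str, reference: str, min_len: int = 10) -> bool:
    """True if candidate contains at least min_len contiguous nucleotides of reference."""
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    candidate, reference = candidate.upper(), reference.upper()
    # Collect every window of length min_len found in the reference.
    windows = {
        reference[i:i + min_len]
        for i in range(len(reference) - min_len + 1)
    }
    # Slide the same window over the candidate and look for a shared one.
    return any(
        candidate[i:i + min_len] in windows
        for i in range(len(candidate) - min_len + 1)
    )


def ungapped_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions between two equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)
```

With the invented reference `"ACGTACGTTAGCCGATAGGCTA"`, the candidate `"TTACGTTAGCCGATT"` shares a run of at least 10 contiguous nucleotides but not a run of 15.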
A particularly preferred genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs rs16905070, rs7814049 and rs7013830. A nucleotide sequence that is in linkage disequilibrium with SNPs rs16905070, rs7814049 and rs7013830 is a nucleotide sequence present at chromosome 8q24.22. Preferably, a nucleotide sequence that is in linkage disequilibrium with SNPs rs16905070, rs7814049 and rs7013830 is a nucleotide sequence comprised in a chromosomal fragment extending from at least 500 kb upstream of SNPs rs16905070, rs7814049 and rs7013830, to at least 450 kb downstream of SNPs rs16905070, rs7814049 and rs7013830. More preferably, a nucleotide sequence that is in linkage disequilibrium with SNPs rs16905070, rs7814049 and rs7013830 is a nucleotide sequence comprised in a chromosomal fragment extending from ST3GAL1 to ZFAT. Another particularly preferred genetic marker for longevity is a nucleic acid molecule comprising a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs rs4513644, rs4700233, rs854050 and rs4700231. A nucleotide sequence that is in linkage disequilibrium with SNPs rs4513644, rs4700233, rs854050 and rs4700231 is a nucleotide sequence present at chromosome 5q11.2. Preferably, a nucleotide sequence that is in linkage disequilibrium with SNPs rs4513644, rs4700233, rs854050 and rs4700231 is a nucleotide sequence comprised in a chromosomal fragment extending from at least 400 kb upstream of SNPs rs4513644, rs4700233, rs854050 and rs4700231, to at least 650 kb downstream of SNPs rs4513644, rs4700233, rs854050 and rs4700231. More preferably, a nucleotide sequence that is in linkage disequilibrium with SNPs rs4513644, rs4700233, rs854050 and rs4700231 is a nucleotide sequence comprised in a chromosomal fragment extending from ACTBL2 to PLK2.
If the nucleic acid molecule does not comprise the polymorphism (SNP) itself, it is preferred that the nucleic acid molecule comprises at least a nucleotide sequence in the proximity of the polymorphism (SNP), i.e. a nucleotide sequence that specifically identifies the polymorphism (SNP) and that is located less than 10, 5, 2 or 1 kb, or more preferably less than 500, 200, 100, 50, 20, 10 or 5 bp, from the polymorphism (SNP). Most preferably, the nucleic acid molecule comprises at least a nucleotide sequence immediately adjacent to the polymorphism (SNP).
It is herein understood that a nucleotide sequence that is in linkage disequilibrium with an SNP is herein defined as a nucleotide sequence showing an r² within 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, or 1.0 and/or a D' within 0.6, 0.7, 0.8, 0.9, or 1.0 with a SNP identified in Table 9, where r² and D' are measures indicating the extent of the LD between markers. A nucleotide sequence that is in linkage disequilibrium with an SNP as herein defined is thus at least physically linked to the SNP and will preferably be present in the vicinity of the SNP, whereby the vicinity of the SNP is understood to mean within no more than 750, 500, 200, 100, 50, 20, 10 or 5 kb from the chromosomal location of the SNP. A preferred nucleotide sequence that is in linkage disequilibrium with an SNP as herein defined and that is physically linked to the SNP is an expressed nucleotide sequence of a gene as listed in Table 2.1.2. A more preferred nucleotide sequence that is in linkage disequilibrium with an SNP as herein defined and that is physically linked to the SNP is an expressed nucleotide sequence of a gene as listed in Table 2.1.5.
In another embodiment the genetic marker is a nucleic acid molecule comprising a nucleotide sequence that is differentially expressed between a population that expresses excess survival and a control population. A population that expresses excess survival is herein defined as a population that has a lifespan of or above 85 years. A nucleotide sequence that is differentially expressed between a population that expresses excess survival and a control population preferably is a nucleotide sequence that is differentially expressed between one or more of the cohorts of the Leiden Longevity Study and their corresponding control populations (samples) as described in the Examples herein.
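The linkage disequilibrium measures r² and D' referred to above are not derived in the text. As a hedged illustration, the sketch below computes them from the standard population-genetics definitions for two diallelic loci, given the frequency of allele A at the first locus, allele B at the second locus and the AB haplotype. The function name and the example frequencies are assumptions and do not come from the Tables.

```python
# Illustrative sketch only: standard pairwise LD measures for two diallelic loci.
# Function name and example frequencies are assumptions, not data from the Tables.

def ld_measures(p_ab: float, p_a: float, p_b: float) -> tuple[float, float]:
    """Return (r2, |D'|) from haplotype frequency p_ab and allele frequencies p_a, p_b."""
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom <= 0:
        raise ValueError("both loci must be polymorphic (0 < frequency < 1)")
    r2 = d * d / denom
    # D' rescales D by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r2, d_prime
```

With `p_ab=0.2, p_a=0.2, p_b=0.5` the sketch returns an r² of 0.25 and a D' of 1.0, which shows why the two measures are given separate thresholds: D' can be maximal while r² remains modest when the allele frequencies of the two loci differ.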
A differentially expressed genetic marker for longevity may be useful in the methods of the present invention without knowledge of the polymorphism that underlies the difference in expression. A preferred differentially expressed genetic marker for longevity is a nucleic acid molecule comprising at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a transcript (or complement thereof) that specifically hybridises to a probe selected from the group consisting of GE488443 (KALRN), GE83396 (C3orf26), GE4871 (NFIA), GE88135 (TCF4), GE785523 (METT5D1), GE624013 and GE535567 (MARCHIII), GE749029 and GE57513 (SNRPN). Genbank accession no.'s and Unigene no.'s of expressed sequences identified by these probes are given in Tables 2.4.2 and 2.4.3. More preferably the nucleic acid molecule comprises at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a nucleotide sequence that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence with a Genbank accession no. as provided in Tables 2.4.2 and 2.4.3.
A particularly preferred differentially expressed genetic marker for longevity is a nucleic acid molecule comprising at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a transcript (or complement thereof) having at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence selected from the group consisting of SEQ ID NO: 120 (ST3GAL1), SEQ ID NO: 121 (ZFAT), SEQ ID NO: 122 (ACTBL2) and SEQ ID NO: 123 (PLK2). A most preferred differentially expressed genetic marker for longevity is a nucleic acid molecule comprising at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a transcript (or complement thereof) having at least 80, 85, 90, 95, 98, 99% sequence identity with the nucleotide sequence of SEQ ID NO: 120 (ST3GAL1).
It is noted that the gene and SNP information as used herein and in the Tables is expressed by gene symbols, gene locations, polymorphism names and their identifications in the public SNP database (dbSNP ID), and the genotypes using the standard expressions used in molecular biology, and they are readily understood by one of ordinary skill in the art. A gene symbol is the acronym or abbreviation corresponding to a given gene name. Genes and markers may have multiple symbols and names due to rediscovery or correlation to function following discovery.
A nucleotide sequence of the invention that is a genetic marker for longevity may be a nucleotide sequence of which a human allele is associated with genetic predisposition for deficiency in a health function, such as e.g. a predisposition for cardiovascular disease, cancer, diabetes, dementia, metabolic disease, bone or joint diseases or intermediate phenotypes, by which is meant: risk factors for these diseases (such as high serum cholesterol, triglycerides, blood pressure etc.) or other parameters of impaired functions of brain, heart, endocrine systems involved in metabolism, skeleton, muscles.
In a second aspect the invention pertains to a group of markers that is selected for use in the methods of the invention. These groups of markers are "portfolios". A portfolio of markers for use in the invention may contain any number of two or more and any type of marker as disclosed herein. Preferably the markers in the portfolio are genetic markers as disclosed herein. One embodiment of the invention relates to a portfolio comprising at least 2, 3, 4, 5, 6, 8, 10, 12, 15, 20 or 30 (isolated) nucleotide sequences or their complements, wherein each nucleotide sequence is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in any one of Tables 3.1 and 2.1.1 to 2.1.4.
More preferably the portfolio comprises at least 2, 3, 4, 5, 6, 8, 10, 12, 15, 20 or 30 (isolated) nucleotide sequences or their complements, wherein each nucleotide sequence is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.5 or 2.1.6.
In a preferred embodiment, the portfolio comprises at least two nucleotide sequences, which at least two nucleotide sequences are in linkage disequilibrium with at least two different SNPs listed in any one of Tables 3.1 and 2.1.1 to 2.1.4, whereby at least one nucleotide sequence is in linkage disequilibrium with a SNP selected from the group consisting of the SNPs rs16905070, rs7814049, rs7013830, rs4513644, rs4700233, rs854050 and rs4700231. Preferably, the polymorphic nucleotide sequences are in linkage disequilibrium with the minor allele of the SNPs.
In another preferred embodiment the portfolio comprises at least two nucleotide sequences selected from the group consisting of: i) nucleotide sequences comprising at least 10 contiguous nucleotides from a transcript or complement thereof that specifically hybridises to a probe selected from the group consisting of probes having a nucleotide sequence of SEQ ID NO: 95 - 103; and, ii) nucleotide sequences that specifically hybridise to a transcript having at least 80% sequence identity with at least one nucleotide sequence selected from the group consisting of SEQ ID NO: 120, SEQ ID NO: 121, SEQ ID NO: 122, SEQ ID NO: 123 and their complements; whereby the portfolio comprises at least one nucleotide sequence selected from ii). Preferably, each nucleotide sequence in the portfolio is in linkage disequilibrium with a different SNP in the group. In a preferred portfolio of the invention, one or more of the nucleotide sequences themselves comprise a polymorphism in the human population. This polymorphism may comprise an SNP as defined above. In a particularly preferred embodiment of the portfolio the nucleotide sequences comprise at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a nucleotide sequence that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence as defined in any one of SEQ ID NO.'s comprising an SNP listed in the above-mentioned Tables (see Table 3.1).
In an alternative embodiment of the portfolio the nucleotide sequences comprise at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a nucleotide sequence that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence as defined in any one of SEQ ID NO.'s: 1 - 94; more preferably the nucleotide sequences comprise at least 10, 12, 15, 20, 25 or 30 contiguous nucleotides from a nucleotide sequence that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence as defined in any one of SEQ ID NO.'s: 17, 18, 22, 23, 32, 39, 41, 44, 45, 56, 83 and 84. In another preferred embodiment of the portfolio the nucleotide sequences in the portfolio are expressed nucleotide sequences.
In a third aspect the invention relates to the use of any of the markers for longevity as defined herein in any of the methods of the invention. In particular the invention relates to the use of the longevity markers of the invention in methods for determining genetic predisposition for longevity, in methods of screening for a substance that modulates the biological aging rate (and promotes/inhibits a healthy aging process), or a substance that is capable of modulation of longevity and/or life expectancy, and/or in methods for assessing the physiological age of (a sample from) a subject.
In a fourth aspect the invention relates to a method for determining a genetic predisposition for longevity. It is herein understood that methods for determining a genetic predisposition for longevity more generally include methods for determining a genetic predisposition for a certain life expectancy, including both a genetic predisposition against longevity as well as a genetic predisposition for longevity and all intermediate phenotypes. The method for determining genetic predisposition for longevity of a person (or a subject) preferably is an ex vivo method, e.g. performed in vitro on a sample obtained from the subject. The method preferably comprises determining the genotype of the subject with respect to one or more genetic markers of longevity as herein defined above. Preferably the method comprises detecting the presence of a polymorphism that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in any one of Tables 2.1.1 to 2.1.4, more preferably a polymorphism that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Table 2.1.5 or 2.1.6, wherein the presence of the polymorphism is indicative of longevity. In a further preferred embodiment, the method comprises detecting the presence of a polymorphism selected from the group consisting of the SNPs listed in any one of Tables 2.1.1 to 2.1.4, more preferably a SNP selected from the group consisting of the SNPs listed in Table 2.1.5 or 2.1.6, wherein the presence of the polymorphism is indicative of longevity.
In one embodiment, a genotype that is indicative of longevity is a carrier of an allele that is indicative of longevity, i.e. a carrier of a protective allele. This may be deduced from the odds ratios and the indications of major and minor alleles for each SNP as presented in Table 2.1.1. Thus it is understood that an odds ratio (OR) greater than one indicates that a given minor allele from a polymorphism in Table 2.1.1 is the allele that is indicative of longevity, whereas an odds ratio less than one indicates that a given minor allele is the allele that is indicative of mortality. The allele that is indicative of longevity may also be referred to as the protective allele. For a given allele that is indicative of longevity (i.e. the protective allele), the other alternative allele at the SNP position may be considered a mortality and/or disease risk allele. For a given risk allele, the other alternative allele at the SNP position may be considered an allele that is indicative of longevity (i.e. the allele protective against mortality and disease). The genotype "a carrier of the protective allele of a given SNP" is understood to mean a genotype that is homozygous or heterozygous for the protective allele of that SNP.
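The rule set out above for reading the odds ratios can be expressed as a short sketch. The function names, the representation of a genotype as a pair of allele letters and the treatment of an odds ratio of exactly one (which the text does not address) are assumptions for illustration; the alleles and odds ratios in the usage note are invented and are not values from Table 2.1.1.

```python
# Illustrative sketch only: deduce the protective allele of a diallelic SNP
# from the odds ratio reported for its minor allele.
# Names, genotype encoding and the OR == 1 case are assumptions.
from typing import Optional, Tuple


def protective_allele(major: str, minor: str, odds_ratio: float) -> Optional[str]:
    """Return the allele indicative of longevity, or None if OR equals one."""
    if odds_ratio <= 0:
        raise ValueError("an odds ratio must be positive")
    if odds_ratio > 1:
        return minor  # minor allele is indicative of longevity
    if odds_ratio < 1:
        return major  # minor allele is indicative of mortality
    return None       # OR == 1: no direction can be deduced (assumption)


def is_carrier(genotype: Tuple[str, str], allele: str) -> bool:
    """True if the genotype is homozygous or heterozygous for the allele."""
    return allele in genotype
```

For an invented SNP with major allele G, minor allele A and an odds ratio of 1.3, the protective allele is A, so both the heterozygous genotype `("A", "G")` and the homozygous genotype `("A", "A")` count as carriers, while `("G", "G")` does not.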
The present methods may be performed using any known biological or biochemical method in which genetic polymorphisms, such as SNPs, can be detected or visualized. Such methods include, but are not limited to, DNA sequencing, allele specific PCR, PCR amplification followed by an allele/mutant specific restriction digestion, oligonucleotide ligation assays, primer hybridization and primer extension assays, optionally combined with or facilitated by microarray analysis. Alternative methods for determining allelic variants and gene polymorphisms are readily available to the skilled person in the art of molecular diagnostics. Another embodiment is oligonucleotides capable of hybridizing to sequences in or flanking genes (e.g., polymorphic regions) involved in adenosine metabolism, and the use of these oligonucleotides for performing these methods. Primers may be designed to amplify (e.g., by PCR) at least a fragment of a gene encoding an adenosine metabolism-associated enzyme. A polymorphism may be present within the amplified sequence and may be detected by, for example, a restriction enzyme digestion or hybridization assay. The polymorphism may also be located at the 3' end of the primer or oligonucleotide, thus providing means for an allele or polymorphism specific amplification, primer extension or oligonucleotide ligation reaction, optionally with a labelled nucleotide or oligonucleotide. The label may be an enzyme (e.g., alkaline phosphatase, horseradish peroxidase), a radiolabel (32P, 33P, 3H, 125I, 35S etc.), a fluorescent label (Cy3, Cy5, GFP, EGFP, FITC, TRITC and the like) or a hapten/ligand (e.g., digoxigenin, biotin, HA, etc.). In one embodiment, the detection is carried out using oligonucleotides physically linked to a solid support, and may be performed in a microarray format.
Another embodiment is a kit comprising one or more oligonucleotides capable of hybridizing to, or adjacent to, any of the polymorphic sites in any genetic markers as defined hereinabove. The oligonucleotide(s) may be provided in solid form, in solution or attached on a solid carrier such as a DNA microarray. In addition, the kit may provide detection means, containers comprising solutions and/or enzymes and a manual with instructions for use.
In a further preferred embodiment the method for determining genetic predisposition for longevity of a subject comprises determining the expression level of a nucleotide sequence that is in linkage disequilibrium with a SNP selected from the group consisting of the SNPs listed in Table 2.1.1. Preferred examples of expressed nucleotide sequences that are in linkage disequilibrium with the SNPs listed in Table 2.1.1 are provided in Tables 2.1.2, 2.1.5, 2.1.7, 2.4.2 and 2.4.3. More preferably the expression level is determined of a nucleotide sequence that is in linkage disequilibrium with an SNP selected from the group consisting of the SNPs listed in Tables 2.1.1 to 2.1.4, or yet more preferably in Table 2.1.5 or 2.1.6. Still more preferably, the expression level is determined of a transcript (or complement thereof) that specifically hybridises to a probe selected from the group consisting of GE488443 (KALRN), GE83396 (C3orf26), GE4871 (NFIA), GE88135 (TCF4), GE785523 (METT5D1), GE624013 and GE535567 (MARCHIII), GE749029 and GE57513 (SNRPN). Genbank accession no.'s and Unigene no.'s of expressed sequences identified by these probes are given in Tables 2.4.2 and 2.4.3. Likewise the expression level can be determined of a transcript that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence with a Genbank accession no. as provided in Tables 2.4.2 and 2.4.3. In a particularly preferred method, the expression level is determined of a transcript (or complement thereof) having at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence selected from the group consisting of SEQ ID NO: 120 (ST3GAL1), SEQ ID NO: 121 (ZFAT), SEQ ID NO: 122 (ACTBL2) and SEQ ID NO: 123 (PLK2). In a most preferred method, the expression level is determined of a transcript (or complement thereof) having at least 80, 85, 90, 95, 98, 99% sequence identity with the nucleotide sequence of SEQ ID NO: 120 (ST3GAL1).
It is understood herein that the expression level as a function of the age and sex of subjects is indicative of longevity. 'Differential expression' of a gene as indicated above means that the expression of the gene changes with increasing age and that this change differs between the aging offspring of nonagenarian siblings and their partners as controls, taking age and gender into account in the statistical analysis.
Expressed nucleotide sequence (or nucleotide sequences) that are indicative of longevity are preferably those nucleotide sequences that are differentially expressed as up regulated or down regulated (in cells) in a sample from a population that expresses excess survival as compared to corresponding (cells) in a sample from a control population. Preferably a modulated nucleotide sequence is (part of) a gene that is differentially expressed between one or more of the cohorts of the Leiden Longevity Study and their corresponding control populations (samples) as described in the Examples herein. Up regulation and down regulation are relative terms meaning that a detectable difference (beyond the contribution of noise in the system used to measure it) is found in the amount of expression of the nucleotide sequences relative to a baseline. In this case, a baseline preferably comes from a pool of subjects that do not express excess survival. A pool of these subjects preferably contains 1, 3, 5, 10, 20, 30, 100, 400, 500, 600 or more subjects. The expression level of a differentially expressed nucleotide sequence that is indicative of longevity is then considered either up regulated or down regulated relative to a baseline level using the same measurement method.
The assessment of the expression level of a nucleotide sequence in order to assess whether a gene/nucleotide sequence is modulated is preferably performed using classical molecular biology techniques to detect mRNA levels, such as (real time) reverse transcriptase PCR (whether quantitative or semi-quantitative), mRNA (micro)array analysis or Northern blot analysis, or other methods to detect RNA. Alternatively, according to another preferred embodiment, in the methods of the invention the expression level of a nucleotide sequence is determined indirectly by quantifying the amount of the polypeptide encoded by the gene/nucleotide sequence. Quantifying an amount of polypeptide may be carried out by any known technique. Preferably, an amount of polypeptide is quantified by Western blotting. The skilled person will understand that alternatively or in combination with the quantification of the identified nucleotide sequences and/or corresponding polypeptides, the quantification of a substrate of the corresponding polypeptides or of any compound known to be associated with the function of the corresponding polypeptides or the quantification of the function or activity of the corresponding polypeptide using a specific assay is encompassed within the scope of the prognosticating method of the invention. In a preferred embodiment, the assessment of the expression level of a nucleotide sequence is carried out using (micro)arrays as defined herein.
The expression levels of a nucleotide sequence and/or amounts of a corresponding polypeptide are preferably measured in a sample from a subject. Thus, preferably the expression level (of a nucleotide sequence or polypeptide) is determined ex vivo in a sample obtained from a subject. A sample may be liquid, semi-liquid, semi-solid or solid. A preferred sample comprises 100 or more cells and/or a tissue from a subject to be tested taken in a biopsy. Alternatively or in addition, a sample may comprise blood of a subject. The skilled person knows how to isolate and optionally purify RNA and/or protein present in such a sample. In case of RNA, the skilled person may further amplify it using known techniques.
An increase (or upregulation) (which is synonymous with a higher expression level) or decrease (or downregulation) (which is synonymous with a lower expression level) of the expression level of a nucleotide sequence (or steady state level of the encoded polypeptide) is preferably defined as being a detectable change of the expression level of the nucleotide sequence (or steady state level of the encoded polypeptide or any detectable change in the biological activity of the polypeptide) using a method as defined earlier on as compared to the expression level of the corresponding nucleotide sequence (or steady state level of the corresponding encoded polypeptide) in a baseline. According to a preferred embodiment, an increase or decrease of a polypeptide activity is quantified using a specific assay for the polypeptide activity.
Preferably, an increase of the expression level of a nucleotide sequence means an increase of at least 2, 3, 4, 5, 6, 7, 8, or 9% of the expression level of the nucleotide sequence using arrays. More preferably, an increase of the expression level of a nucleotide sequence means an increase of at least 10 or 11%, even more preferably at least 20%, at least 30%, at least 40%, at least 50%, at least 70%, at least 90%, at least 150% or more.
Preferably, a decrease of the expression level of a nucleotide sequence means a decrease of at least 2, 3, 4, 5, 6, 7, 8, or 9% of the expression level of the nucleotide sequence using arrays. More preferably, a decrease of the expression level of a nucleotide sequence means a decrease of at least 10%, even more preferably at least 20%, at least 30%, at least 40%, at least 50%, at least 70%, at least 90%, at least 150% or more.
Preferably, an increase of the expression level of a polypeptide means an increase of at least 2, 3, 4, 5, 6, 7, 8, or 9% of the expression level of the polypeptide using western blotting. More preferably, an increase of the expression level of a polypeptide means an increase of at least 10%, even more preferably at least 20%, at least 30%, at least 40%, at least 50%, at least 70%, at least 90%, at least 150% or more.
Preferably, a decrease of the expression level of a polypeptide means a decrease of at least 2, 3, 4, 5, 6, 7, 8, or 9% of the expression level of the polypeptide using western blotting. More preferably, a decrease of the expression level of a polypeptide means a decrease of at least 10%, even more preferably at least 20%, at least 30%, at least 40%, at least 50%, at least 70%, at least 90%, at least 150% or more.
Preferably, an increase of the polypeptide activity means an increase of at least 2, 3, 4, 5, 6, 7, 8, or 9% of the polypeptide activity using a suitable assay. More preferably, an increase of the polypeptide activity means an increase of at least 10%, even more preferably at least 20%, at least 30%, at least 40%, at least 50%, at least 70%, at least 90%, at least 150% or more.
Preferably, a decrease of the polypeptide activity means a decrease of at least 2, 3, 4, 5, 6, 7, 8, or 9% of the polypeptide activity using a suitable assay. More preferably, a decrease of the polypeptide activity means a decrease of at least 10%, even more preferably at least 20%, at least 30%, at least 40%, at least 50%, at least 70%, at least 90%, at least 150% or more.
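The percentage thresholds above can be applied to a measured level and a baseline pool as in the following sketch. It assumes, for illustration only, that the baseline is the arithmetic mean of the levels measured in the pool of control subjects and that the percentage change is expressed relative to that mean; the text does not prescribe how the baseline pool is summarised. The function names, the default threshold and the example values are likewise hypothetical.

```python
# Illustrative sketch only: compare a measured level with a baseline pool.
# Using the arithmetic mean of the pool as baseline is an assumption.
from statistics import mean
from typing import Sequence


def percent_change(sample_level: float, baseline_levels: Sequence[float]) -> float:
    """Percentage change of sample_level relative to the mean baseline level."""
    baseline = mean(baseline_levels)
    if baseline <= 0:
        raise ValueError("baseline level must be positive")
    return 100.0 * (sample_level - baseline) / baseline


def classify_regulation(
    sample_level: float,
    baseline_levels: Sequence[float],
    min_percent: float = 10.0,  # one of the thresholds listed in the text
) -> str:
    """Return 'up', 'down' or 'unchanged' for the chosen percentage threshold."""
    change = percent_change(sample_level, baseline_levels)
    if change >= min_percent:
        return "up"
    if change <= -min_percent:
        return "down"
    return "unchanged"
```

With an invented baseline pool of `[10.0, 10.0]`, a sample level of 12.0 is a 20% increase and is classified as up regulated at the 10% threshold, whereas a sample level of 9.5 is unchanged at the 10% threshold but down regulated at the 2% threshold.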
In a further preferred embodiment of the method for determining genetic predisposition for longevity of a subject, up- or down-regulation of a transcript (or complement thereof) that specifically hybridises to a probe selected from the group consisting of GE488443 (KALRN), GE83396 (C3orf26), GE4871 (NFIA), GE88135 (TCF4), GE785523 (METT5D1), GE624013 and GE535567 (MARCHIII), GE749029 and GE57513 (SNRPN) as a function of the age and sex of a subject is indicative of longevity. See Tables 2.4.2 and 2.4.3 for Genbank accession no.'s and Unigene no.'s of expressed sequences identified by these probes. Likewise the up- or down-regulation can be determined of a transcript that has at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence with a Genbank accession no. as provided in Tables 2.4.2 and 2.4.3. In a particularly preferred method, the up- or down-regulation of a transcript (or complement thereof) having at least 80, 85, 90, 95, 98, 99% sequence identity with a nucleotide sequence selected from the group consisting of SEQ ID NO: 120 (ST3GAL1), SEQ ID NO: 121 (ZFAT), SEQ ID NO: 122 (ACTBL2) and SEQ ID NO: 123 (PLK2), as a function of the age and sex of a subject, is indicative of longevity. In a most preferred method, the up- or down-regulation of a transcript (or complement thereof) having at least 80, 85, 90, 95, 98, 99% sequence identity with the nucleotide sequence of SEQ ID NO: 120 (ST3GAL1), as a function of the age and sex of a subject, is indicative of longevity.
In a fifth aspect the invention relates to a method for assessing physiological age of a subject, preferably a human subject. The rate of aging is highly species specific: a human may be aged at about 50 years and a rodent at about 2 years. In general terms, a natural progressive decline in body systems starts in early adulthood, but in humans it becomes most evident several decades later. One arbitrary way to define old age more precisely in humans is to say that it begins at conventional retirement age, at around 60 or around 65 years of age. Another definition sets parameters for aging coincident with the loss of reproductive ability, which occurs at around age 45, more usually at around 50 in humans, but will, however, vary with the individual. It has been found that individuals age at different rates, even within a species. Therefore chronological age may be at best imprecise and even misleading as to the extent of decline in function. It is therefore useful to use the methods of the present invention and to evaluate the physiological age of an individual, organ, tissue, cell, etc., rather than the chronological age. In addition to the patterns of gene expression reported herein, there are a number of indicia of physiological aging that are tissue specific.
The method for assessing physiological age of a subject is preferably performed or measured in a sample from a subject. Thus, preferably the physiological age is assessed ex vivo in a sample obtained from a subject. The method preferably comprises: a) determining expression information in a sample obtained from the subject, of one or more differentially expressed nucleotide sequences as herein defined above, preferably expression information of a portfolio of expressed nucleotide sequences as herein defined above; b) using the expression information to generate an age signature for the sample; and, c) comparing the age signature obtained in b) with a control age signature; wherein a statistically significant match with a positive control or a statistically significant difference from a negative control is indicative of age in the sample. Methods for generating an age signature for a sample, for determining control age signatures and for determining statistically significant matches or differences with controls are described in detail in US20070161022, which is incorporated by reference herein.
In a sixth aspect the invention relates to a method for identification of a substance that modulates (or is capable of modulating) the biological aging rate in a subject, and/or that modulates (or is capable of modulating) longevity and/or life expectancy. The method preferably comprises the steps of a) contacting the substance to a test cell or administering the substance to a test organism; b) determining in the test cell or in (a sample from) the test organism the expression level of one or more differentially expressed nucleotide sequences as herein defined above, preferably determining the expression level of a portfolio of expressed nucleotide sequences as herein defined above; c) comparing the expression level of the nucleotide sequence(s) with the expression level of the corresponding nucleotide sequence(s) in a test cell that is not contacted with the substance or in (a sample from) the test organism that is not contacted with the substance; and, d) identifying a substance that produces a difference in expression level of at least one of the nucleotide sequences, between the test cell or test organism that is contacted with the substance and the test cell or test organism that is not contacted with the substance. In a preferred method, a substance is identified as a substance that promotes longevity, extends lifespan and/or improves health, when the substance upregulates a nucleotide sequence that is upregulated in a population that expresses excess survival or when the substance downregulates a nucleotide sequence that is downregulated in a population that expresses excess survival. It is understood that up- or down-regulation, respectively, in a population that expresses excess survival means that a nucleotide sequence is up- or down-regulated, respectively, in a population of cells or organisms that expresses excess survival as compared to a corresponding control population (e.g. one or more of the cohorts of the Leiden Longevity Study and their corresponding control populations as described in the Examples herein). In these screening methods of the invention the expression level of the nucleotide sequence may be determined indirectly by quantifying the amount of polypeptide encoded by the nucleotide sequence. The cells in the test cell population are preferably mammalian cells, preferably human cells. Preferably in the method, the test cell population that is contacted with the substance and the test cell population that is not contacted with the substance are derived from one cell population, preferably from one cell line, more preferably from one cell. The test organism may be a non-human mammal or a human volunteer.
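The comparison in steps c) and d) of the screening method can be illustrated with a minimal Python sketch. It assumes that expression levels are available as log2 values per transcript; the function name, the transcript names and the 0.5 cut-off for calling a difference are hypothetical and do not come from the specification.

```python
def screen_substance(treated, untreated, survival_direction, min_abs_change=0.5):
    """Compare expression with and without the substance (steps c and d).

    treated / untreated: dict mapping transcript name -> log2 expression level.
    survival_direction: dict mapping transcript name -> +1 if the transcript is
        upregulated in the population with excess survival, -1 if downregulated.
    min_abs_change: hypothetical cut-off for calling a difference (assumption).
    """
    changed = {}
    for name in survival_direction:
        delta = treated[name] - untreated[name]
        if abs(delta) >= min_abs_change:
            changed[name] = delta
    # A change is concordant when it has the same sign as in the long-lived population.
    concordant = sorted(
        name for name, delta in changed.items() if delta * survival_direction[name] > 0
    )
    return {
        "modulates_expression": bool(changed),
        "longevity_concordant": concordant,
    }


# Hypothetical example data (not measurements from the study).
example = screen_substance(
    treated={"transcript_A": 8.9, "transcript_B": 7.0},
    untreated={"transcript_A": 8.0, "transcript_B": 7.1},
    survival_direction={"transcript_A": +1, "transcript_B": -1},
)
print(example)
```

In this sketch a substance counts as a candidate longevity-promoting substance as soon as one transcript changes in the concordant direction, mirroring the "at least one of the nucleotide sequences" wording of step d).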
In a seventh aspect the invention relates to a method of improving health of a subject. The method preferably comprises: a) determining a genetic predisposition for longevity using a method as defined herein above; and, b) if the subject does not have the genetic predisposition for longevity, 1) providing, to the subject, a substance that modulates (or is capable of modulating) the biological aging rate identified in a method as described above; or 2) providing to the subject other medication or other dosages of medication on the basis of the biological age established by the methods of the invention, as compared to the medication or dosage thereof normally provided to the subject. In a preferred method, the determination of the genetic predisposition for longevity in a) comprises determining in the test cell or in (a sample from) the test organism the expression level of one or more differentially expressed nucleotide sequences as herein defined above, more preferably determining the expression level of a portfolio of expressed nucleotide sequences as herein defined above; whereby in b) a substance is provided to the subject that modulates expression or activity of the (differentially) expressed nucleotide sequence(s) or its gene product(s). It is understood thereby that the substance upregulates expression or activity of those (differentially) expressed nucleotide sequence(s) or its gene product(s) that are upregulated in a population that expresses excess survival or that the substance downregulates expression or activity of those (differentially) expressed nucleotide sequence(s) or its gene product(s) that are downregulated in a population that expresses excess survival.
In this document and in its claims, the verb "to comprise" and its conjugations are used in their non-limiting sense to mean that items following the word are included, but items not specifically mentioned are not excluded. In addition, reference to an element by the indefinite article "a" or "an" does not exclude the possibility that more than one of the element is present, unless the context clearly requires that there be one and only one of the elements. The indefinite article "a" or "an" thus usually means "at least one". All patent and literature references cited in the present specification are hereby incorporated by reference in their entirety.
The following examples are offered for illustrative purposes only, and are not intended to limit the scope of the present invention in any way.
Description of the Figures
Figure 1 Flow chart of experimental work.
Examples
1. Materials and methods
1.1 Study populations
Descriptions of inclusion criteria and numbers of cases and controls for the Leiden Longevity Study (LLS; ref. 18), the Leiden 85 Plus Study (refs. 20, 21), the Danish 1905 cohort (ref. 22) and PROSPER (ref. 23) were published previously and are provided below, together with numbers and a description of mortality data. Numbers of cases and controls are summarized in Table 1.1.
1.1.1 Leiden Longevity Study
In the Leiden Longevity Study long-lived Caucasian siblings were recruited together with their offspring and the partners thereof. Families were recruited if at least two long-lived siblings were alive and fulfilled the age criterion of 89 years or older for males and 91 years or older for females, representing less than 0.5% of the Dutch population in 2001 (Schoenmaker, M. et al., 2006, Eur. J. Hum. Genet. 14, 79-84). In total 956 long-lived proband siblings were included with a mean age of 94 (89-104), 1750 offspring with a mean age of 61 (39-81) and 758 partners with a mean age of 60 (36-79). The remaining partners of offspring were not willing to participate. In stage 2 of the GWAS an additional control sample of 192 Leiden Blood Bank donors (31 years, 18-40) was added to the study.
1.1.2 Leiden 85 Plus Study
In the Leiden 85 Plus Study two prospective population-based cohorts of inhabitants of Leiden aged 85 years and older are combined (der Wiel, A. B. et al., 2002, J. Clin. Epidemiol. 55, 1119-1125; Weverling-Rijnsburger, A. W. et al., 1997, Lancet 350, 1119-1123). Between 1987 and 1989 (Cohort 1) 673 subjects aged 85 years and older were enrolled and followed up for survival for 17 years, during which 672 subjects (99.9%) died. Between 1997 and 1999, 563 subjects were enrolled in the month of their 85th birthday (Cohort 2) and followed up for survival for 10 years, during which 453 subjects (81%) died. Subjects were visited at their home and there were no exclusion criteria related to health. DNA was available from the combined cohorts for 1245 subjects aged 85 years and above. Control samples comprised 1203 unrelated participants of the NTR Biobank project collected from all over The Netherlands (Boomsma, D. I. et al., 2008, Twin Res. Hum. Genet. 11, 342-348).
1.1.3 Danish 1905 cohort and control samples
The participants in this study are from the Danish 1905 birth cohort, ascertained in 1998 (Nybo et al., 2003, J. Am. Geriatr. Soc. 51, 1365-1373) when they were aged 92-93. Of the 3,600 subjects from that cohort who were alive, 2,262 entered the study. Participants were subjected to a home-based interview on health and lifestyle parameters, physical and cognitive tests and collection of biological material. The current genetic study comprises a total of 1644 of these individuals. Survival was followed up until 1 March 2007. In the 8 years of follow-up, 94% of subjects (1,547 subjects) died. Control samples were 2007 twins (one twin from each pair) collected from all over Denmark.
1.1.4 PROSPER study
The Prospective Study of Pravastatin in the Elderly at Risk (PROSPER) is a double-blinded, randomized, controlled trial consisting of 5804 patients recruited at Cork, Ireland; Glasgow, Scotland; and Leiden, The Netherlands. Patients were randomized to either placebo (n = 2913) or 40 mg of pravastatin (n = 2891) (Shepherd, J. et al., 2002, Lancet 360, 1623-1630). Patients were recruited if they had either preexisting vascular disease (coronary, cerebral, or peripheral) or were at increased risk for vascular disease due to such factors as smoking, hypertension, or diabetes. Inclusion criteria called for men and women between the ages of 70 and 82 years with a total plasma cholesterol of 155-350 mg/dL (4-9 mmol/L) and triglyceride levels < 200 mg/dL (6 mmol/L). Patients were excluded if they showed signs of cognitive decline, indicated by a score of 23 or less on the Mini Mental State Examination and a series of psychometric tests. The study population was distributed evenly with respect to existing vascular disease and qualifying risk factors. Patients were followed every 3 months for an average of 3.2 years. The primary composite endpoint, definite or suspected death from coronary heart disease (CHD), nonfatal myocardial infarction (MI), or fatal/nonfatal stroke, was measured at 3-year follow-up. A total of 604 subjects died during follow-up. In the statistical analysis adjustments were made for study cohort as well as for use of pravastatin.
1.1.5 German cohort
The unrelated German study participants were drawn from population-based collections and comprised 1447 long-lived individuals of exceptional age (810 nonagenarians, 637 centenarians; age range: 95-110 years; mean age: 99 years) and 1104 younger control subjects (age range: 60-75 years; mean age: 67 years) (Nebel, A. et al., 2005, Proc. Natl. Acad. Sci. U. S. A. 102, 7906-7909). The gender ratio was about 75% females vs. 25% males. The controls match the long-lived individuals in terms of ancestry, gender and geographical origin within the country, and genetic differences between the case-control samples are considered to be very low (Willcox, B. J. et al., 2008, Proc. Natl. Acad. Sci. U. S. A. 105, 13987-13992).
1.1.6 CHARGE Consortium
The international Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium was convened to promote the discovery of new genes involved in multiple complex traits using genome-wide association analysis (Psaty, B. M. et al., 2009, Circ. Cardiovasc. Genet. 2, 73-80). The following CHARGE consortium cohorts contributed data to the present meta-analysis: the Age Gene/Environment Susceptibility Reykjavik Study (AGES-Reykjavik), the Cardiovascular Health Study (CHS), the Framingham Heart Study (FHS) and the Rotterdam Study (RS). Across the four CHARGE cohorts there were 1836 participants who achieved age 90 years or older, and 1955 participants from the same cohorts who died between ages 55 and 80 years served as a comparison group. When the SNP of interest was not genotyped directly in a study, the genotypes were imputed using standard SNP imputation software (BIMBAM, MaCH). The mean dosage of one of the alleles (a value between 0 and 2) was used as the predictor for imputed SNPs. The logarithm of the odds ratios of additive effects and the corresponding empirical standard errors for the four studies were estimated using logistic regression adjusting for sex and (for CHS) study site. The FHS used generalized estimating equations (GEE) to account for familial correlations.
1.1.7 Relevant phenotypic information of the Leiden Longevity Study cohort
The survival benefit of these families is marked by a 30% excess survival observed in the proband, the parental and the offspring generation (Schoenmaker et al., 2006, supra). The male offspring of the long-lived subjects have a lower prevalence of diabetes and cardiovascular disease as compared to their partners. The offspring have lower serum glucose than their partners and beneficial lipoprotein profiles, in the sense that on average offspring have significantly larger LDL particle sizes than partners, a feature even more strongly represented in the nonagenarian siblings (Heijmans et al., 2006, PLoS Med. 3(12):e495).
1.2 Genotyping
A flow chart of the approach used to identify SNPs associated with survival is provided in Figure 1 and a description of the populations is provided in Table 1.1. DNA from the Leiden Longevity Study was extracted from samples at baseline using conventional methods (ref. 24). Genotyping for stage 1 of the GWAS was performed at Perlegen Sciences by applying the first-generation genome-wide SNP array set Affymetrix GeneChip Human Mapping 500K, comprising two arrays (262K + 238K), and for stage 2 by using an in-house Perlegen platform. We discarded SNPs that did not meet the following criteria: a MAF > 0.02, a successful call rate > 80% and P_HWE ≥ 10^-4. In total 357,162 SNPs were used for GWAS analysis with a mean genotype call rate of 95%. Genotype data were used to confirm sex and family relationships. Latent clustering of stage 1 genotypes due to population substructure was assessed by clustering analysis of the IBS matrix using Plink. One cluster was identified, indicating that the Leiden Longevity Study is a homogeneous population. The quantile-quantile plot showed that the p-value distribution of stage 1 conformed to a null distribution at all but the extreme tail (data not shown). The genomic inflation factor (λ), which measures over-dispersion of test statistics from association tests indicating population stratification, was 1.019.
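The stage 1 quality-control thresholds and the genomic inflation factor can be illustrated with a short Python sketch. It is not the pipeline used in the study: the function names are hypothetical, the Hardy-Weinberg test is assumed to be a one-degree-of-freedom chi-square test (the specification does not state which test was used), and λ is computed with the common definition of the median observed chi-square statistic divided by the median of a chi-square distribution with one degree of freedom.

```python
import math
import statistics

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df


def hwe_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test for Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_stage1_qc(n_aa, n_ab, n_bb, n_missing,
                     maf_min=0.02, call_rate_min=0.80, hwe_p_min=1e-4):
    """Apply the stage 1 thresholds: MAF > 0.02, call rate > 80%, P_HWE >= 1e-4."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / (called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1.0 - freq_a)
    return (maf > maf_min
            and call_rate > call_rate_min
            and hwe_pvalue(n_aa, n_ab, n_bb) >= hwe_p_min)


def genomic_inflation(chi2_statistics):
    """Genomic inflation factor: median observed chi-square over the null median."""
    return statistics.median(chi2_statistics) / CHI2_1DF_MEDIAN
```

A value of λ close to 1, such as the 1.019 reported for stage 1, indicates little inflation of the test statistics.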
Stage 2 SNPs were selected for analysis when the mean call rate was > 95%, MAF > 0.02 and P_HWE > 0.01. Genotyping quality control was performed using duplicate DNA samples within the LLS and SNP assays of the Sequenom MassARRAY platform to confirm genotyping accuracy of SNPs genotyped in stages 1 and 2 combined, for which 99.67% concordant results were obtained. Genotyping for replication studies was performed with MassARRAY iPLEX (Sequenom, San Diego). Four iPLEX assays containing 96 out of 104 prioritised SNPs could be designed for replication studies, 81 of which were successfully typed in 100% of the samples at a genotype call rate of .
For investigations of disease susceptibility alleles, 30 SNPs were successfully genotyped using Sequenom iPLEX. The average genotype call rate for these SNPs was 96.9% and the average concordance rate was 99.7% among 128 duplicated control samples. All of the SNPs were in Hardy-Weinberg equilibrium (P>0.002) among the controls. Samples with genotypes for all 30 SNPs were included for analyses (LLS: 723 cases and 721 controls; Leiden 85 Plus Study: 979 cases and 1167 controls).
1.3 Statistical analyses
In the association analysis of stage 1 of the GWAS comparing genotype frequencies between cases and controls (Table 1.1) we applied the Cochran-Armitage test for additive effects and Fisher's exact test for recessive effects. For X-linked SNPs, the genotypes of the males were considered as homozygous genotypes. SNPs with a P-value below 0.01 for the trend were selected for the second stage. Restricted space on the chip for stage 2 allowed us to select only a limited number of additional SNPs, for which reason SNPs associating in the recessive model at a P-value below 0.005 for Fisher's exact test were selected for genotyping in the second stage.
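For readers unfamiliar with the trend test, the following Python sketch computes the standard Cochran-Armitage statistic for a 2 x 3 table of genotype counts with additive weights (0, 1, 2). The function name and the example counts are hypothetical, and the sketch assumes unrelated samples; it does not include the variance modification for related siblings used in the joint analysis.

```python
import math


def cochran_armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test for genotype counts ordered by allele dosage.

    cases / controls: counts of genotypes carrying 0, 1 and 2 copies of the allele.
    Returns (z, two_sided_p) using the normal approximation.
    """
    column_totals = [r + s for r, s in zip(cases, controls)]
    total_cases = sum(cases)
    total_controls = sum(controls)
    total = total_cases + total_controls

    sum_tr = sum(t * r for t, r in zip(weights, cases))
    sum_tn = sum(t * n for t, n in zip(weights, column_totals))
    sum_t2n = sum(t * t * n for t, n in zip(weights, column_totals))

    statistic = total * sum_tr - total_cases * sum_tn
    variance = (total_cases * total_controls / total) * (total * sum_t2n - sum_tn ** 2)
    if variance <= 0:
        return 0.0, 1.0
    z = statistic / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts, not data from the study.
z, p = cochran_armitage_trend(cases=(10, 20, 30), controls=(30, 20, 10))
selected_for_stage2 = p < 0.01  # stage 1 selection threshold for the trend test
```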
For the joint analysis of stages 1 and 2 a variance-modified version of these score tests was used for additive and recessive effects. The relatedness between the highly aged sibling cases was taken into account when computing the variances of the scores (ref. 25). Unless otherwise stated, subsequent association analyses of the SNPs in additional cohorts were restricted to the genetic model corresponding to the most significant result in the GWAS. Odds ratios were estimated and corresponding 95% confidence intervals were computed based on empirical standard errors. For meta-analyses a fixed-effect approach was used. Scores and their variances were computed within each study and combined across the three studies to obtain a single meta-statistic. P-values below 5 x 10^-8 were considered genome-wide significant (ref. 26). Heterogeneity between studies was assessed by estimating the between-study variance using random-effects meta-analysis.
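The fixed-effect combination of per-study scores can be sketched as follows. This is a minimal Python illustration assuming that each study contributes a score U and its variance V, that the studies are independent, and that the meta-statistic is the summed score divided by the square root of the summed variance; the function name and the example numbers are hypothetical. The one-step estimate of the log odds ratio (sum of U over sum of V) is a standard score-test approximation, not necessarily the estimator used in the study.

```python
import math

GENOME_WIDE_ALPHA = 5e-8


def combine_scores(scores, variances):
    """Fixed-effect meta-analysis of per-study score statistics.

    scores: per-study score statistics U_k.
    variances: per-study variances V_k of the scores.
    Returns (z, two_sided_p, approx_log_odds_ratio).
    """
    total_score = sum(scores)
    total_variance = sum(variances)
    z = total_score / math.sqrt(total_variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    approx_log_or = total_score / total_variance  # one-step approximation
    return z, p_value, approx_log_or


# Hypothetical scores and variances for three studies.
z, p, log_or = combine_scores(scores=(-10.0, -12.0, -14.0), variances=(16.0, 20.0, 13.0))
genome_wide_significant = p < GENOME_WIDE_ALPHA
```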
For survival analyses in PROSPER, Kaplan-Meier curves were plotted for the three genotypes and delayed entry was used because subjects entered the study at various ages. Hazard ratios and 95% confidence intervals were computed using a Cox proportional hazards model. For association analysis of carriers of the three SNPs at 8q24.22 with expression levels of probes on the microarray, Wald test statistics were computed, adjusted for hybridization date. For all analyses in the Leiden Longevity Study empirical standard errors were used to account for relatedness.
For testing association of disease susceptibility alleles with survival, the same methods as for the GWAS and meta-analysis were used. A linear regression model with the number of disease risk alleles as outcome and longevity status as covariate was fitted. For testing within the Leiden Longevity Study, empirical standard errors were used. To search for a set of single SNPs predicting survival, a cross-validation-based model selection algorithm was applied (ref. 27). In addition to main effects the method also considers all pairwise interactions.
1.4 RNA expression analysis
We tested whether carriers of the risk alleles of the three SNPs at 8q24.22 showed differential expression of nearby genes in an existing microarray dataset of whole blood RNA from 50 trios from the LLS cohort (a long-lived individual, one of their offspring and the partner thereof (controls)). The selection of these trios was based on equal blood cell counts in offspring and controls. From these 150 non-fasted individuals peripheral blood was harvested using PAXgene™ tubes (Qiagen). The tubes were frozen and kept at -20°C for ~3-5 years. After thawing at room temperature for at least 2 hours, total RNA was extracted from the approximately 2.5 ml of peripheral blood in each tube following the manufacturer's recommended protocol (PAXgene Blood RNA Kit Handbook, Qiagen). The quality of the total RNA was evaluated on the 2100 Bioanalyzer (Agilent Technologies) and the concentration was measured using a NanoDrop spectrophotometer (NanoDrop Technologies). Quality criteria included a 28S/18S ratio as measured by the Bioanalyzer of at least 1.2, and a total RNA yield of at least 3 μg. The samples were hybridized on 54k CodeLink Human Whole Genome Bioarrays (GE Healthcare, currently of Applied Microarrays). Images were quantified with CodeLink Expression software (version 4.0). For the analysis of expression levels, 11 CodeLink Bioarray probes were selected on their correspondence to the ST3GAL1 locus (7 probes) or the ZFAT locus (4 probes). One ST3GAL1 and one ZFAT probe were present in an exon of these genes, while the other probes corresponded to one or more UniGene Clusters or Expressed Sequence Tags (ESTs) within intronic regions.
1.5 Selection for a set of disease susceptibility alleles
We reviewed 183 GWAS that were published until August 2008 (http://www.genome.gov/26525384), 55 of which reported on associations with coronary artery disease, heart failure, cancers and type 2 diabetes. These 55 GWAS reported 22 disease-associated loci within at least two independent GWAS, harbouring 77 associated SNPs with P < 10^-4 (Table 1.5). To compile a set of disease risk alleles, for each locus the most replicated SNP was selected out of the 77 SNPs and, in case of an equal number of replications, the SNP with the lowest reported p-value was selected. For five loci, nine additional disease-associated SNPs were selected that were in low to moderate LD (r^2 < 0.80) with the most replicated SNP. This resulted in 31 SNPs being selected for analysis. Since one SNP failed in genotyping, 30 SNPs covering all 22 loci were tested here for association with survival into old age.
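The per-locus selection rule (most replicated SNP first, lowest reported p-value as tie-breaker) can be expressed in a few lines of Python. The record layout, the function name and the SNP and locus identifiers below are hypothetical placeholders, not entries from Table 1.5.

```python
def select_lead_snps(records):
    """Pick one SNP per locus: most replications, then lowest reported p-value.

    records: iterable of dicts with keys 'locus', 'snp', 'replications', 'p_value'.
    Returns a dict mapping locus -> selected SNP identifier.
    """
    best = {}
    for rec in records:
        # Sorting key: more replications first, then smaller p-value.
        key = (-rec["replications"], rec["p_value"])
        current = best.get(rec["locus"])
        if current is None or key < current[0]:
            best[rec["locus"]] = (key, rec["snp"])
    return {locus: snp for locus, (_, snp) in best.items()}


# Hypothetical placeholder records.
candidates = [
    {"locus": "locus_1", "snp": "snp_a", "replications": 3, "p_value": 1e-6},
    {"locus": "locus_1", "snp": "snp_b", "replications": 3, "p_value": 1e-9},
    {"locus": "locus_1", "snp": "snp_c", "replications": 2, "p_value": 1e-12},
    {"locus": "locus_2", "snp": "snp_d", "replications": 2, "p_value": 1e-5},
]
lead_snps = select_lead_snps(candidates)
```

Here `snp_b` is chosen for `locus_1` because it ties with `snp_a` on replications but has the lower p-value, while `snp_c` is passed over despite its smaller p-value because it was replicated less often.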
2. Results
2.1 GWAS and studies in additional cohorts
A GWAS was performed in two stages in subjects of the Leiden Longevity Study (LLS) and controls. In the first stage genotype data of 357,162 SNPs that passed quality control were analysed in a comparison of 417 unrelated probands from a nonagenarian sibling pair (94 years on average) and 470 controls (60 years on average, i.e. partners of offspring). A flow chart of the consecutive analysis steps is depicted in Figure 1 and a description of the population samples investigated in the GWAS and subsequent replication studies is given in Table 1.1. The association analysis for survival into old age is based only on the age difference between cases and controls, not on health status. Assuming an additive genetic model, 3983 autosomal SNPs and 85 X-linked SNPs were selected at P<0.01. We also tested for recessive effects on survival into old age, since these have been described for variation in a number of longevity-associated genes such as CETP and APOC3 (refs. 28, 29). Assuming a recessive model, 1206 SNPs were selected at P<0.005, an arbitrary criterion that was chosen because of the restricted number of SNPs that could be typed on the chip in the second stage. In total 4963 SNPs (including SNPs selected according to both criteria) were successfully genotyped in stage 2 in 403 nonagenarian siblings (i.e. the sibling of the proband in stage 1), and an additional control series of 401 unrelated middle-aged subjects (i.e. LLS controls and Leiden Blood Bank controls; ref. 8). Results indicated that SNPs at 1p13.2, 3q21.2, 6p22.2, 8q24.3 and 16p13.2 represent the strongest associations observed (minimal P values of 10^-6), not reaching genome-wide significance (5 x 10^-8; ref. 26). For analysis in replication studies we prioritized 104 SNPs that were the strongest candidates for association with survival into old age according to the joint analysis of stages 1 and 2 (P<6.43 x 10^-4, Table 2.1.1). Supplementary Table 3.1 presents the sequences of the SNP identifiers of the prioritized SNPs. 
The nearest genes of the SNPs associating with p<10^-4 are depicted in Table 2.1.2. For 81 of the prioritized SNPs a multiplex genotyping assay could be designed and successful typing was obtained in the replication cohorts. Association analysis was performed comparing 1236 cases with a mean age of 87 years from the Leiden 85 Plus Study and 1644 cases with a mean age of 93 years from the Danish 1905 Cohort with appropriate populations of younger controls (Table 1.1). Both are population-based cohort studies from a genetic background similar to the LLS cohort (ref. 30). Meta-analysis of the LLS, Leiden 85 Plus Study and Danish 1905 Cohort, comprising a total of 3700 highly aged cases and 4153 younger controls, was performed consistently applying the additive or recessive model depending on the primary association in the LLS study (Tables 2.1.3 and 2.1.4). The nearest genes of the SNPs associating with p<10^-2 in the meta-analysis are depicted in Table 2.1.5. A previous preliminary version of the meta-analysis is provided in Table 2.1.6.
For SNP rs16905070 on chromosome 8q24.22 a genome-wide significant association with survival into old age was observed in the meta-analysis (P = 6.63 x 10^-9). The association was additive and the minor allele (MAF) was underrepresented in the cases as compared to controls, hence associated with a decreased probability of carriers surviving into old age, corresponding to an odds ratio (OR) below unity (OR = 0.67, 95% CI 0.58-0.77). The association of rs16905070 with survival did not show any heterogeneity across the three studies (between-study variance of 0.01, P = 0.21). The GWAS revealed association for two additional SNPs at 8q24.22 (Table 2.1.1) that were in LD with rs16905070: rs7814049 (MAF = 0.09, D' = 0.99, r^2 = 0.68 in Dutch controls, P_trend GWAS = 1.46 x 10^-4) and rs7013830 (MAF = 0.21, D' = 0.91, r^2 = 0.21 in Dutch controls, P_recessive GWAS = 4.7 x 10^-4). Meta-analysis in LLS and the replication cohorts of the haplotype containing the minor allele of the three 8q24.22 SNPs provided a similar result to the analysis of rs16905070 alone.
To investigate whether rs16905070 at 8q24.22 influences survival by affecting a specific disease, we performed a prospective analysis for survival after age 70 in the PROSPER cohort (3 years of follow-up on average). Subjects in this study had either preexisting vascular disease or were at increased risk due to major risk factors. The minor allele of rs16905070 was associated with an increased mortality risk from all causes of death in the PROSPER Study (HR = 1.38, 95% CI 1.09-1.74, P = 0.007); this effect was similar for men and women and was mainly due to death from cancer (HR = 1.49, 95% CI 1.01-2.19, P = 0.046) and cardiovascular disease (HR = 1.31, 95% CI 0.93-1.85, P = 0.122).
The survival-associated SNPs at 8q24.22 (rs16905070, rs7814049 and rs7013830) cover a chromosome region of 72 kb, and no known protein-coding gene is located within a 100 kb distance of the SNPs. Since long-range cis-acting regulatory variation has been documented (ref. 31), we tested whether the expression of the nearest genes to the chromosome 8q24.22 SNPs, including ZFAT, 391 kb upstream of rs16905070, and ST3GAL1, 515 kb downstream of the top SNP, correlated with the variation. Gene expression microarray data were available from 50 unrelated long-lived siblings from LLS families, 50 of their offspring and 50 controls (ref. 32). Seven probes on the microarray corresponded to the ST3GAL1 genome region and four probes to the ZFAT region. The expression levels of these probes were compared between 60 carriers (nonagenarians, offspring and partners combined) and 90 non-carriers of any of the minor alleles of the three 8q24.22 SNPs. A significant 1.11-fold and 1.09-fold increase in expression was observed for two probes at the ST3GAL1 locus (P = 0.006 and P = 0.031, respectively). The effect size and direction were similar in the sibling, offspring and partner groups separately, and comparable but non-significant increases were observed for the other five probes targeting this locus (Table 2.1.7), one of which is present in a known exon. For the probes corresponding to the ZFAT locus no consistent differential expression was observed.
2.2 The impact of disease susceptibility alleles on survival
In order to estimate the relevance of our findings, we investigated whether GWAS-identified susceptibility loci for mortality-associated diseases generally reveal an influence on survival into old age in the study design we have used here. We selected a set of 22 GWAS-identified loci affecting susceptibility for coronary artery disease, cancer and type 2 diabetes by reviewing all GWAS published until August 2008 (for selection criteria, see materials and methods section, Table 1.5). 
In the same design that was used for the discovery of the 8q24.22 locus we investigated 30 SNPs representing 22 GWAS-identified disease susceptibility loci and tested for association with survival into old age, by comparing 723 nonagenarian siblings (mean age 93 years) from the LLS cohort representing familial long-lived cases and 721 unrelated younger controls (mean age 47 years). We also compared 1167 sporadic long-lived cases (mean age 88 years) from the population-based Leiden 85 Plus Study and 979 younger controls (mean age 41 years). A meta-analysis of the two cohorts to test for the effects of each SNP individually revealed seven SNPs that were associated with survival into old age at nominal significance (Table 2.2.1); however, when correcting for multiple testing no significant associations were observed (P>0.01).
The nonagenarian siblings on average carried 26.7 ± 0.19 disease risk alleles, which was virtually identical to the number carried by middle-aged controls (27.0 ± 0.12; P = 0.127), and the same held for the number carried by sporadic long-lived cases (26.8 ± 0.11) and young controls (26.8 ± 0.10; P = 0.847). The number of risk alleles was similar in highly aged and young subjects even in the tails of the distribution (data not shown), and also when the 19 alleles associated with metabolic disease or the 11 cancer-associated alleles were considered separately. To allow for the possibility that subsets of alleles were associated with longevity through a non-additive interaction, we applied spline regression, but no evidence was observed for such effects. Thus, in contrast to the findings we described for the 8q24.22 SNPs, we did not observe in our study design an association with old-age survival for GWAS-identified risk alleles for mortality-associated diseases.
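The risk-allele comparison can be illustrated with a small Python sketch. With a single binary covariate (longevity status), the least-squares slope of the linear regression equals the difference in mean risk-allele count between cases and controls, which is what the sketch computes. All names and numbers are hypothetical, and the sketch uses ordinary standard errors; it does not reproduce the empirical standard errors used to account for relatedness in the Leiden Longevity Study.

```python
import math
import statistics


def risk_allele_count(genotypes):
    """Sum the number of risk alleles (0, 1 or 2 per SNP) carried by one individual."""
    return sum(genotypes)


def mean_and_se(counts):
    """Mean risk-allele count and its ordinary standard error."""
    return statistics.mean(counts), statistics.stdev(counts) / math.sqrt(len(counts))


def ols_slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical risk-allele counts per individual (not study data).
case_counts = [25, 27, 26]
control_counts = [27, 28, 26, 27]

status = [1] * len(case_counts) + [0] * len(control_counts)  # 1 = long-lived case
counts = case_counts + control_counts
slope = ols_slope(status, counts)
mean_difference = statistics.mean(case_counts) - statistics.mean(control_counts)
```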
2.3 Analysis of a longevity locus at chromosome 5q11.2
Next, we investigated longevity loci at which the minor allele was overrepresented in long-lived subjects (OR above unity, Table 2.1.2). To test at what ages these loci may exert longevity-promoting effects, we re-analyzed the cross-sectional meta-analysis of the LLS, Danish and Leiden 85 Plus studies for subjects above 92 years of age (the youngest age at baseline in the Danish cohort) and for each gender separately (Table 2.3.1). For the SNPs at 5q11.2 an association was found in females (OR = 1.66, 95% CI 1.27-2.13, P = 4.00 x 10^-5 for rs4700233). Since the two SNPs (rs4700233 and rs4513644) at this locus are in high LD (D' = 0.99, r^2 = 0.98) and the results are almost identical, only rs4700233 is discussed further.
The prospective survival analysis of the Leiden 85 Plus Study and Danish 1905 Cohort revealed that homozygous female carriers of the minor allele of rs4700233 have a better survival above 85 years (HR = 0.58, 95% CI 0.38-0.89) but not above 95 years in the Danish 1905 Cohort. In accordance, a cross-sectional analysis of a large cohort of German women aged 95 to 110 years compared to gender-matched controls (see 1.1.5) showed absence of association for rs4700233 (OR = 1.01, P = 0.96). The relevance of the locus for survival up to 90 years, however, was once more demonstrated by the association of rs4700233 with survival in a comparison of 1836 nonagenarian cases and 1955 controls (55-80 years) from a combined and heterogeneous sample of 4 community-based cohort studies assembled in the 'CHARGE' Consortium (see 1.1.6). The minor allele was associated with longevity only when applying the additive model (OR = 1.15, 95% CI 1.02-1.30, P = 0.027). Together these data suggest that the beneficial effects of genetic variation at 5q11.2 are limited to an age range below 95 years. We further questioned whether rs4700233 could be associated with potential intermediate phenotypes of longevity. We measured serum lipids and circulating markers of the insulin-IGF1 signaling pathway (IIS) in 750 middle-aged partners of offspring from the LLS longevity families. We tested whether the 5q11.2 SNPs were associated with serum levels of LDL cholesterol, HDL cholesterol, non-fasting insulin, IGF1 and IGFBP3. Carriers of the rs4700233 minor allele had significantly lower levels of IGFBP3, by a factor of 0.24 per minor allele (P = 1.33 x 10^-4), and of IGF1, by a factor of 0.81 (P = 0.02). No associations were observed with serum levels of LDL or HDL cholesterol.
2.4 RNA expression analysis
150 RNA samples belonging to 50 trios (50 unrelated nonagenarian siblings, 50 of their offspring, 50 partners of these offspring) were analysed using CodeLink microarrays. The characteristics of these samples are summarized in Table 2.4.1. 
A t-test analysis showed that none of the variables in Table 2.4.1 differ significantly between offspring and partners (which was one of the criteria for selecting the 50 trios), but that several parameters differ between sibs and offspring or partners (a criterion on which the trios were not selected).
Twelve samples did not pass Quality Control criteria and had to be repeated. In the end there were 156 arrays that passed QC; 150 samples and 6 technical replicates.
The gene expression data files were read into R using the Bioconductor package 'Codelink'. Normalisation was performed using the Cyclic Lowess method, resulting in a Spearman correlation of 0.97 between all samples and of 0.98-0.99 between the technical replicates. In total 87.7% of the probes on the CodeLink array (53,423 in total) were expressed in at least 10% of the samples and 66% of the probes were expressed in at least 90% of the samples, indicating that a major part of the genes on the array is expressed in blood. We tested which transcripts were differentially expressed with age by comparing the nonagenarian sibs to partners of offspring. After correction for gender, 3127 differentially expressed transcripts were found. Of these, 130 transcripts showed differential expression when offspring were compared to partners (corrected for age and gender). The most prominent of these are depicted in Table 2.4.2. 'Differential expression' of genes as depicted in Tables 2.4.2 and 2.4.3 means that the expression of the gene changes with increasing age and that this change differs between the aging offspring of nonagenarian siblings and their partners as controls, taking age and gender into account in the statistical analysis.
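The detection percentages quoted above (probes expressed in at least 10% or at least 90% of the samples) can be illustrated with a minimal sketch. It assumes a probes-by-samples matrix of boolean detection calls; the function name `fraction_expressed` and the toy matrix are hypothetical and are not part of the 'Codelink' package or of the analysis described here.

```python
from typing import Sequence


def fraction_expressed(calls: Sequence[Sequence[bool]],
                       min_sample_fraction: float) -> float:
    """Return the fraction of probes detected in at least
    `min_sample_fraction` of the samples.

    calls[i][j] is True when probe i is flagged as expressed in sample j.
    """
    if not calls:
        raise ValueError("no probes supplied")
    n_pass = 0
    for probe_calls in calls:
        detected = sum(1 for call in probe_calls if call)
        if detected / len(probe_calls) >= min_sample_fraction:
            n_pass += 1
    return n_pass / len(calls)


# Hypothetical toy data: 4 probes measured in 10 samples.
toy_calls = [
    [True] * 10,            # detected in all samples
    [True] * 9 + [False],   # detected in 90% of samples
    [True] + [False] * 9,   # detected in 10% of samples
    [False] * 10,           # never detected
]

print(fraction_expressed(toy_calls, 0.10))  # 0.75
print(fraction_expressed(toy_calls, 0.90))  # 0.5
```

Applied to real detection calls, the two thresholds would correspond to the 87.7% and 66% figures reported in the text.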
2.4 Analysis of gene expression data
Using the R package 'limma' for linear models in microarray data, we analysed gene regions that were candidates resulting from the genome scanning experiments, and tested for probes in the areas of either linkage or association whether gene expression differed between sibs and partners and between offspring and partners.
3. Discussion
Here we present the outcomes of the first genome-wide significant association of a locus with survival into old age. The SNP is located on 8q24.22 and was identified in a meta-analysis of GWAS candidate SNPs in the Leiden Longevity Study (LLS) and two population-based cohort studies, consisting in total of 3700 highly aged subjects and 4153 middle-aged controls. A number of loci were associated with survival, of which only rs16905070 reached genome-wide significance in the meta-analysis (P = 6.63 x 10^-9). The minor allele was associated with a 33% decreased probability of becoming long-lived. Prospective analysis of survival data in an additional large cohort revealed that this allele was indeed associated, from seventy years onwards, with an increased mortality risk due to cancer and cardiovascular causes. Pleiotropic effects of genetic variation have previously been reported, such as for variation at HNF1B influencing the risk of prostate cancer and diabetes (33). The offspring of the nonagenarian siblings in the LLS have a decreased prevalence of myocardial infarction and hypertension (19) as compared to partners, whereas the somewhat older offspring of centenarians have a decreased prevalence of both cancer and cardiovascular disease (5). These data fit the observation that the propensity to become long-lived may involve loci with pleiotropic effects on survival.
The proximal chromosome band 8q24.21 has frequently been implicated in the risk of various types of cancer (prostate, ovary, breast, bladder) (34); however, none of the 8q24.22 SNPs in our study are in LD with the cancer-associated SNPs on 8q24.21, 6 Mb centromeric of the 8q24.22 SNP. Consistent associations of genetic variation with survival in the general population have not frequently been described (9). This was further illustrated by applying the same cross-sectional study design to test the effect on survival into old age of 22 GWAS-identified susceptibility loci for coronary artery disease, cancer and type-2 diabetes. Despite the fact that these diseases contribute to the majority of deaths in modern societies, the separate or cumulative effects of this set of risk alleles do not restrain people from surviving into old age. GWAS-identified SNPs have a relatively low predictive value for disease risk and explain only a small fraction of the heritability of the traits involved (35). It would be expected that such alleles only marginally affect population-wide survival. Alternatively, the subjects surviving into old age may carry protective genetic variation or be subject to environmental features counteracting the disease-promoting effect of disease susceptibility alleles.
Carriers of the minor alleles of any of the three 8q24.22 SNPs were found to have a 1.11-fold increased expression of mRNA probes covering the ST3GAL1 gene, located 515 kb from rs16905070. Small long-range effects were also observed for variation in the 5p13 gene desert that correlated with expression of the PTGER4 gene at a distance of 270 kb (31). The ST3GAL1 gene encodes ST3 beta-galactoside alpha-2,3-sialyltransferase 1, a member of glycosyltransferase family 29. This type II membrane protein catalyzes the transfer of sialic acid from CMP-sialic acid to galactose-containing substrates, and sialic acid modulates immune interactions. ST3Gal I modulates surface sialylated structures during the generation of dendritic cells from monocytes (36). Dendritic cells are antigen-presenting cells with high endocytic capacity that play a central role in immune regulation. The regulation of glycosylation has been implicated in immune responsiveness, multiple diseases and aging (37).
In conclusion, by performing both cross-sectional and prospective genetic association analyses in cohorts of elderly and highly aged subjects, we have identified a locus at chromosome 8q24.22 that may influence the probability of surviving into old age. Analysis of gene expression profiles in longevity family members and their partners indicated that glycosylation may be influenced by the genetic variation involved. Prominent risk alleles from previous GWAS for cancer and cardiovascular disease had less effect on the probability of surviving into old age. Our observations support the notion that an assessment of disease risk (even in carriers of high numbers of susceptibility alleles) also requires testing of variants that provide protection against disease risk (38).
References
1. Perls TT, Wilmoth J, Levenson R et al. Life-long sustained mortality advantage of siblings of centenarians. Proc Natl Acad Sci U S A 2002; 99(12): 8442-8447.
2. Skytthe A, Pedersen NL, Kaprio J et al. Longevity studies in GenomEUtwin. Twin Res 2003; 6(5):448-454.
3. Iachine IA, Holm NV, Harris JR et al. How heritable is individual susceptibility to death? The results of an analysis of survival data on Danish, Swedish and Finnish twins. Twin Res 1998; 1(4):196-205.
4. Hjelmborg JV, Iachine I, Skytthe A et al. Genetic influence on human lifespan and longevity. Hum Genet 2006; 119(3):312-321.
5. Terry DF, Wilcox MA, McCormick MA et al. Lower all-cause, cardiovascular, and cancer mortality in centenarians' offspring. J Am Geriatr Soc 2004; 52(12):2074-2076.
6. Adams ER, Nolan VG, Andersen SL, Perls TT, Terry DF. Centenarian Offspring: Start Healthier and Stay Healthier. J Am Geriatr Soc 2008; 56(11):2089-2092.
7. Schachter F, Cohen D, Kirkwood T. Prospects for the Genetics of Human Longevity. Hum Genet 1993; 91(6):519-526.
8. van Heemst D, Beekman M, Mooijaart SP et al. Reduced insulin/IGF-1 signalling and human longevity. Aging Cell 2005; 4(2):79-85.
9. Christensen K, Johnson TE, Vaupel JW. The quest for genetic determinants of human longevity: challenges and insights. Nat Rev Genet 2006; 7(6):436-448.
10. Kenyon C. The plasticity of aging: insights from long-lived mutants. Cell 2005; 120(4):449-460.
11. Johnson TE, de CE, Hegi de CS, Cypser J, Henderson S, Tedesco P. Relationship between increased longevity and stress resistance as assessed through gerontogene mutations in Caenorhabditis elegans. Exp Gerontol 2001; 36(10):1609-1617.
12. Heijmans BT, Westendorp RG, Slagboom PE. Common gene variants, mortality and extreme longevity in humans. Exp Gerontol 2000; 35(6-7):865-877.
13. Gerdes LU, Jeune B, Ranberg KA, Nybo H, Vaupel JW. Estimation of apolipoprotein E genotype-specific relative mortality risks from the distribution of genotypes in centenarians and middle-aged men: apolipoprotein E gene is a "frailty gene," not a "longevity gene". Genet Epidemiol 2000; 19(3):202-210.
14. Willcox BJ, Donlon TA, He Q et al. FOXO3A genotype is strongly associated with human longevity. Proc Natl Acad Sci U S A 2008; 105(37):13987-13992.
15. Flachsbart F, Caliebe A, Kleindorp R et al. Association of FOXO3A variation with human longevity confirmed in German centenarians. Proc Natl Acad Sci U S A 2009.
16. Puca AA, Daly MJ, Brewster SJ et al. A genome-wide scan for linkage to human exceptional longevity identifies a locus on chromosome 4. Proc Natl Acad Sci U S A 2001; 98(18):10505-10508.
17. Lunetta KL, D'Agostino RB, Karasik D et al. Genetic correlates of longevity and selected age-related phenotypes: a genome-wide association study in the Framingham Study. BMC Med Genet 2007; 8.
18. Schoenmaker M, de Craen AJ, de Meijer PH et al. Evidence of genetic enrichment for exceptional survival using a family approach: the Leiden Longevity Study. Eur J Hum Genet 2006; 14(1):79-84.
19. Westendorp RGJ, van Heemst D, Rozing MP et al. Nonagenarian siblings and their offspring display lower risk of mortality and morbidity than sporadic nonagenarians: The Leiden Longevity Study. J Am Geriatr Soc 2009; In Press.
20. Weverling-Rijnsburger AW, Blauw GJ, Lagaay AM, Knook DL, Meinders AE, Westendorp RG. Total cholesterol and risk of mortality in the oldest old. Lancet 1997; 350(9085):1119-1123.
21. der Wiel AB, van EE, de Craen AJ et al. A high response is not essential to prevent selection bias: results from the Leiden 85-plus study. J Clin Epidemiol 2002; 55(11):1119-1125.
22. Nybo H, Petersen HC, Gaist D et al. Predictors of mortality in 2,249 nonagenarians: the Danish 1905-Cohort Survey. J Am Geriatr Soc 2003; 51(10):1365-1373.
23. Shepherd J, Blauw GJ, Murphy MB et al. Pravastatin in elderly individuals at risk of vascular disease (PROSPER): a randomised controlled trial. Lancet 2002; 360(9346):1623-1630.
24. Beekman M, Blauw GJ, Houwing-Duistermaat JJ, Brandt BW, Westendorp RG, Slagboom PE. Chromosome 4q25, microsomal transfer protein gene, and human longevity: novel data and a meta-analysis of association studies. J Gerontol A Biol Sci Med Sci 2006; 61(4):355-362.
25. Slager SL, Schaid DJ. Evaluation of candidate genes in case-control studies: a statistical method to account for related subjects. Am J Hum Genet 2001; 68(6):1457-1462.
26. Pe'er I, Yelensky R, Altshuler D, Daly MJ. Estimation of the multiple testing burden for genomewide association studies of nearly all common variants. Genet Epidemiol 2008; 32(4):381-385.
27. Kooperberg C, Bose S, Stone CJ. Polychotomous Regression. J Am Stat Ass 1997; 92:117-127.
28. Barzilai N, Atzmon G, Schechter C et al. Unique lipoprotein phenotype and genotype associated with exceptional longevity. JAMA 2003; 290(15):2030-2040.
29. Atzmon G, Rincon M, Schechter CB et al. Lipoprotein genotype and conserved pathway for exceptional longevity in humans. PLoS Biol 2006; 4(4):e113.
30. Heath SC, Gut IG, Brennan P et al. Investigation of the fine structure of European populations with applications to disease association studies. Eur J Hum Genet 2008; 16(12):1413-1429.
31. Libioulle C, Louis E, Hansoul S et al. Novel Crohn disease locus identified by genome-wide association maps to a gene desert on 5p13.1 and modulates expression of PTGER4. PLoS Genet 2007; 3(4):e58.
32. Passtoors WM, Beekman M, Gunn D et al. Genomic studies in ageing research: the need to integrate genetic and gene expression approaches. J Intern Med 2008; 263(2):153-166.
33. Gudmundsson J, Sulem P, Steinthorsdottir V et al. Two variants on chromosome 17 confer prostate cancer risk, and the one in TCF2 protects against type 2 diabetes. Nat Genet 2007; 39(8):977-983.
34. Ioannidis JP, Thomas G, Daly MJ. Validating, augmenting and refining genome-wide association signals. Nat Rev Genet 2009; 10(5):318-329.
35. Goldstein DB. Common genetic variation and human traits. N Engl J Med 2009; 360(17):1696-1698.
36. Videira PA, Amado IF, Crespo HJ et al. Surface alpha 2-3- and alpha 2-6-sialylation of human monocytes and derived dendritic cells and its influence on endocytosis. Glycoconj J 2008; 25(3):259-268.
37. Vanhooren V, Liu XE, Franceschi C et al. N-glycan profiles as tools in diagnosis of hepatocellular carcinoma and prediction of healthy human ageing. Mech Ageing Dev 2009; 130(1-2):92-97.
38. Donnelly P. Progress and challenges in genome-wide association studies in humans. Nature 2008; 456(7223):728-731.
References for Table 1.5
1. Samani,N.J. et al. Genomewide association analysis of coronary artery disease. N. Engl. J. Med. 357, 443-453 (2007).
2. Willer,C.J. et al. Newly identified loci that influence lipid concentrations and risk of coronary artery disease. Nat. Genet. 40, 161-169 (2008).
3. Meigs,J.B. et al. Genome-wide association with diabetes-related traits in the Framingham Heart Study. BMC. Med. Genet. 8 Suppl 1, S16 (2007).
4. Larson,M.G. et al. Framingham Heart Study 100K project: genome-wide associations for cardiovascular disease outcomes. BMC. Med. Genet. 8 Suppl 1, S5 (2007).
5. Hayes,M.G. et al. Identification of type 2 diabetes genes in Mexican Americans through genome-wide association studies. Diabetes 56, 3033-3044 (2007).
6. Zeggini,E. et al. Meta-analysis of genome-wide association data and large-scale replication identifies additional susceptibility loci for type 2 diabetes. Nat. Genet. 40, 638-645 (2008).
7. Saxena,R. et al. Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science 316, 1331-1336 (2007).
8. Scott, L. J. et al. A genome-wide association study of type 2 diabetes in Finns detects multiple susceptibility variants. Science 316, 1341-1345 (2007).
9. Zeggini,E. et al. Replication of genome-wide association signals in UK samples reveals risk loci for type 2 diabetes. Science 316, 1336-1341 (2007).
10. The Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661-678 (2007).
11. Helgadottir,A. et al. The same sequence variant on 9p21 associates with myocardial infarction, abdominal aortic aneurysm and intracranial aneurysm. Nat. Genet. 40, 217-224 (2008).
12. Omori,S. et al. Association of CDKAL1, IGF2BP2, CDKN2A/B, HHEX, SLC30A8, and KCNJ11 with susceptibility to type 2 diabetes in a Japanese population. Diabetes 57, 791-795 (2008).
13. Cauchi,S. et al. Post genome-wide association studies of novel genes associated with type 2 diabetes show gene-gene interaction and high predictive value. PLoS. ONE. 3, e2031 (2008).
14. Sandhu,M.S. et al. Common variants in WFS1 confer risk of type 2 diabetes. Nat. Genet. 39, 951-953 (2007).
15. Franks,P.W. et al. Replication of the association between variants in WFS1 and risk of type 2 diabetes in European populations. Diabetologia 51, 458-463 (2008).
16. Murabito,J.M. et al. A genome-wide association study of breast and prostate cancer in the NHLBI's Framingham Heart Study. BMC. Med. Genet. 8 Suppl 1, S6 (2007).
17. Stancakova,A. et al. Single-nucleotide polymorphism rs7754840 of CDKAL1 is associated with impaired insulin secretion in nondiabetic offspring of type 2 diabetic subjects and in a large sample of men with normal glucose tolerance. J. Clin. Endocrinol. Metab 93, 1924-1930 (2008).
18. Steinthorsdottir,V. et al. A variant in CDKAL1 influences insulin response and risk of type 2 diabetes. Nat. Genet. 39, 770-775 (2007).
19. Sladek,R. et al. A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature 445, 881-885 (2007).
20. Easton,D.F. et al. Genome-wide association study identifies novel breast cancer susceptibility loci. Nature 447, 1087-1093 (2007).
21. Zanke,B.W. et al. Genome-wide association scan identifies a colorectal cancer susceptibility locus on chromosome 8q24. Nat. Genet. 39, 989-994 (2007).
22. Tomlinson,I. et al. A genome-wide association scan of tag SNPs identifies a susceptibility variant for colorectal cancer at 8q24.21. Nat. Genet. 39, 984-988 (2007).
23. Ghoussaini,M. et al. Multiple loci with different cancer specificities within the 8q24 gene desert. J. Natl. Cancer Inst. 100, 962-966 (2008).
24. Haiman,C.A. et al. A common genetic risk factor for colorectal and prostate cancer. Nat. Genet. 39, 954-956 (2007).
25. Yeager,M. et al. Genome-wide association study of prostate cancer identifies a second risk locus at 8q24. Nat. Genet. 39, 645-649 (2007).
26. Thomas, G. et al. Multiple loci identified in a genome-wide association study of prostate cancer. Nat. Genet. 40, 310-315 (2008).
27. Tomlinson,I.P. et al. A genome-wide association study identifies colorectal cancer susceptibility loci on chromosomes 10p14 and 8q23.3. Nat. Genet. 40, 623-630 (2008).
28. Tenesa,A. et al. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat. Genet. 40, 631-637 (2008).
29. Severi,G. et al. The common variant rs1447295 on chromosome 8q24 and prostate cancer risk: results from an Australian population-based case-control study. Cancer Epidemiol. Biomarkers Prev. 16, 610-612 (2007).
30. Cheng,I. et al. 8q24 and prostate cancer: association with advanced disease and meta-analysis. Eur. J. Hum. Genet. 16, 496-505 (2008).
31. Broadbent,H.M. et al. Susceptibility to coronary artery disease and diabetes is encoded by distinct, tightly linked SNPs in the ANRIL locus on chromosome 9p. Hum. Mol. Genet. 17, 806-814 (2008).
32. McPherson,R. et al. A common allele on chromosome 9 associated with coronary heart disease. Science 316, 1488-1491 (2007).
33. Helgadottir,A. et al. A common variant on chromosome 9p21 affects the risk of myocardial infarction. Science 316, 1491-1493 (2007).
34. Salonen,J.T. et al. Type 2 diabetes whole-genome association study in four populations: the DiaGen consortium. Am. J. Hum. Genet. 81, 338-345 (2007).
35. Lyssenko,V. et al. Mechanisms by which common variants in the TCF7L2 gene increase risk of type 2 diabetes. J. Clin. Invest 117, 2155-2163 (2007).
36. Palmer,N.D. et al. Association of TCF7L2 gene polymorphisms with reduced acute insulin response in Hispanic Americans. J. Clin. Endocrinol. Metab 93, 304-309 (2008).
37. Folsom,A.R. et al. Variation in TCF7L2 and increased risk of colon cancer: the Atherosclerosis Risk in Communities (ARIC) Study. Diabetes Care 31, 905-909 (2008).
38. Burwinkel,B. et al. Transcription factor 7-like 2 (TCF7L2) variant is associated with familial breast cancer risk: a case-control study. BMC. Cancer 6, 268 (2006).
39. Agalliu,I. et al. Evaluation of a variant in the transcription factor 7-like 2 (TCF7L2) gene and prostate cancer risk in a population-based study. Prostate 68, 740-747 (2008).
40. Hunter,D.J. et al. A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer. Nat. Genet. 39, 870-874 (2007).
41. Stacey,S.N. et al. Common variants on chromosome 5p12 confer susceptibility to estrogen receptor-positive breast cancer. Nat. Genet. 40, 703-706 (2008).
42. Raskin,L. et al. FGFR2 is a breast cancer susceptibility gene in Jewish and Arab Israeli populations. Cancer Epidemiol. Biomarkers Prev. 17, 1060-1065 (2008).
43. Amos, C. I. et al. Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nat. Genet. 40, 616-622 (2008).
44. Hung,R.J. et al. A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature 452, 633-637 (2008).
45. Thorgeirsson,T.E. et al. A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature 452, 638-642 (2008).
46. Stacey,S.N. et al. Common variants on chromosomes 2q35 and 16q12 confer susceptibility to estrogen receptor-positive breast cancer. Nat. Genet. 39, 865-869 (2007).
47. Garcia-Closas,M. et al. Heterogeneity of breast cancer associations with five susceptibility loci by clinical and pathological characteristics. PLoS. Genet. 4, e1000054 (2008).
48. Scuteri,A. et al. Genome-wide association scan shows genetic variants in the FTO gene are associated with obesity-related traits. PLoS. Genet. 3, e115 (2007).
49. Winckler,W. et al. Evaluation of common variants in the six known maturity-onset diabetes of the young (MODY) genes for association with type 2 diabetes. Diabetes 56, 685-693 (2007).
50. Gudmundsson,J. et al. Two variants on chromosome 17 confer prostate cancer risk, and the one in TCF2 protects against type 2 diabetes. Nat. Genet. 39, 977-983 (2007).
51. Levin,A.M. et al. Chromosome 17q12 variants contribute to risk of early-onset prostate cancer. Cancer Res. 68, 6492-6495 (2008).
52. Broderick,P. et al. A genome-wide association study shows that common alleles of SMAD7 influence colorectal cancer risk. Nat. Genet. 39, 1315-1317 (2007).
53. Gudmundsson,J. et al. Common sequence variants on 2p15 and Xp11.22 confer susceptibility to prostate cancer. Nat. Genet. 40, 281-283 (2008).
54. Eeles,R.A. et al. Multiple newly identified loci associated with prostate cancer susceptibility. Nat. Genet. 40, 316-321 (2008).
Figure imgf000045_0001
Table 1.5 Overview of loci associated with heart disease, cancer and type-2 diabetes in recent GWAS
Locus | Chromosome | Nearest genes | SNP(a) | Position | Risk/non-risk allele(c) | Associated disease | Odds ratio(d) | References
1 | 1p13.3 | CELSR2, PSRC1 | rs599839 | 109623689 | A/G | CAD | 1.39 |
 | | | rs646776 | 109620053 | T/C | CAD | surrogate for rs599839 (r2=0.94) |
2 | 2q32.3 | TMEFF2 | rs10497721 | 192622607 | A/C(b) | T2D | |
 | | | rs10497726 | 192759565 | C/A(b) | CAD | |
 | | | rs10497723 | 192817829 | G/A | T2D | 2.32 |
3 | 3p25.2 | PPARG | rs17036101 | 12252845 | G/A | T2D | 1.15 | 6
 | | | rs1801282 | 12368125 | C/G | T2D | 1.14 | 7-11
4 | 3q27.2 | IGF2BP2 | rs4402960 | 186994381 | T/G | T2D | 1.14 | 1-9,12
 | | | rs1470579 | 187011774 | C/A | T2D | 1.14 | 7,13
5 | 4p16.1 | WFS1 | rs10010131 | 6343816 | G/A | T2D | 1.15 | 14,13
 | | | rs6446482 | 6346594 | G/C | T2D | 1.15 | 14,15
 | | | rs734312 | 6354255 | A/G | T2D | 1.23/1.25(e) | 14,15
6 | 5q34 | BC011998 | rs10515869 | 163444804 | G/A(b) | HF | - | 4
 | | | rs6556756 | 163821858 | G/T(b) | BC | | 16
 | | | rs9314033 | 163822784 | C/A(b) | BC | - | 16
7 | 6p22.3 | CDKAL1 | rs10946398 | 20769013 | C/A | T2D | 1.12 |
 | | | rs7754840 | 20769229 | C/G | T2D | 1.12 | 7,8,17
 | | | rs7756992 | 20787688 | G/A | T2D | 1.20 | 12,18
 | | | rs9465871 | 20825234 | C/T | T2D | 1.18/2.17(e) | 10
8 | 6q25.1 | MTHFD1L | rs6922269 | 151294678 | A/G | CAD | 1.37 |
9 | 8q24.11 | SLC30A8 | rs13266634 | 118253964 | C/T | T2D | 1.12 |
10 | 8q24.21 | POU5F1, LOC727677 | rs7001069 | 128179828 | A/G(b) | PC | - | 16
 | | | rs10505483 | 128194377 | G/A(b) | PC | - | 16
 | | | rs13281615 | 128424800 | G/A | BC | 1.08 | 20
 | | | rs10505477 | 128476625 | G/A | CC, PC | 1.27/1.43 | 21-23
 | | | rs11985829 | 128478414 | C/T(b) | CC, PC | 1.08/1.22 | 24
 | | | rs10808556 | 128482329 | A/G | CC, PC | 1.26/1.31 | 23,24
 | | | rs6983267 | 128482487 | G/T | CC, PC | 1.25/1.20 | 22,24-27
 | | | rs7013278 | 128484074 | C/T(b) | CC | - | 24
 | | | rs10505474 | 128486686 | G/A(b) | CC | - | 24
 | | | rs2060776 | 128489299 | T/G(b) | CC | - | 24
 | | | rs10956369 | 128492999 | A/T(b) | CC | - | 24
 | | | rs7014346 | 128493974 | A/G | CC | 1.19 | 24,28
 | | | rs4871789 | 128497243 | A/G(b) | CC | - | 24
 | | | rs7842552 | 128500876 | G/A | CC | 1.15 | 28
 | | | rs1447295 | 128554220 | A/C | PC | 1.43/2.23(e) | 23,25,29,30
 | | | rs4242382 | 128586755 | A/G | PC | 1.66/2.22(e) | 26
 | | | rs7837688 | 128608542 | T/G | PC | 1.46/2.03 | 25
11 | 9p21.3 | CDKN2BAS | rs564398 | 22019547 | T/C | T2D, CAD | 1.12/1.21(e) | 9,31
 | | | rs10757274 | 22086055 | G/A | CAD | 1.18/1.29(e) | 32
 | | | rs1537371 | 22089568 | A/C(b) | CVD | - | 4
 | | | rs1556516 | 22090176 | C/G(b) | CVD | - | 4
 | | | rs10511701 | 22102599 | C/T(b) | CVD | - | 4
 | | | rs2383206 | 22105026 | G/A | CAD | 1.26/1.26(e) |
 | | | rs2383207 | 22105959 | G/A | CAD | 1.25 | 33
 | | | rs10757278 | 22114477 | G/A | CAD | 1.28 | 11,33
 | | | rs1333049 | 22115503 | C/G | CAD | 1.47/1.90(e) | 1,10
 | | | rs10811661 | 22124094 | T/C | T2D | 1.20 | 7-9,11,12
12 | 10q23.33 | HHEX | rs1111875 | 94452862 | C/T | T2D | 1.13 |
 | | | rs5015480 | 94455539 | C/T | T2D | 1.13 | 9
 | | | rs7923837 | 94471897 | A/G | T2D | 1.22/1.45(e) | 12,19
13 | 10q25.2 | TCF7L2 | rs7901695 | 114744078 | C/T | T2D | 1.37 | 9,34
 | | | rs4506565 | 114746031 | T/A | T2D | 1.36/1.88(e) | 10
 | | | rs7903146 | 114748339 | T/C | T2D, CC | 1.37/1.25-2.15(e) | 7,10,19,34-37
 | | | rs12255372 | 114798892 | T/G | T2D, BC, PC | 1.64/1.21-1.37/1.09-1.15(e) | 34-36,38,39
14 | 10q26.13 | FGFR2 | rs1219648 | 123336180 | G/A | BC | 1.32 | 40,41
 | | | rs2420946 | 123341314 | T/C | BC | 1.32 | 40,42
 | | | rs2981582 | 123342307 | A/G | BC | 1.26 | 20,42
15 | 11p15.1 | KCNJ11 | rs5215 | 17365206 | C/T | T2D | 1.14 | 9
 | | | rs5219 | 17366148 | T/C | T2D | 1.14 | 7,8,12
16 | 12q21.1 | TSPAN8 | rs1495377 | 69863368 | G/C | T2D | 1.28/1.51(e) | 10
 | | | rs7961581 | 69949369 | C/T | T2D | 1.09 | 6
17 | 15q25.1 | LOC123688, CHRNA3 | rs8034191 | 76593078 | T/C | LC | 1.30 | 43,44
 | | | rs1051730 | 76681394 | G/A | LC | 1.31 | 43-45
18 | 16q12.1 | TOX3 | rs8051542 | 51091668 | T/C | BC | 1.09 | 20
 | | | rs12443621 | 51105538 | G/A | BC | 1.11 | 20
 | | | rs3803662 | 51143842 | T/C | BC | 1.20 | 20,46,47
19 | 16q12.2 | FTO | rs8050136 | 52373776 | A/C | T2D | 1.17 | 8,9,48
 | | | rs9939609 | 52378028 | A/T | T2D | 1.34/1.55(e) | 10,48
20 | 17q12 | HNF1B | rs757210 | 33170628 | T/C | T2D | 1.12 | 49,50
 | | | rs4430796 | 33172153 | A/G | PC, T2D | 1.22/0.91(e) | 26,50
 | | | rs7501939 | 33175269 | C/T | PC | 1.19 | 50,51
 | | | rs3760511 | 33180426 | C/A | PC | 1.16 | 50
21 | 18q21.1 | SMAD7 | rs4939827 | 44707461 | T/C | CC | 1.18 | 27,28,5
 | | | rs12953717 | 44707927 | T/C | CC | 1.17 | 28,52
 | | | rs4464148 | 44713030 | C/T | CC | 1.15 | 52
22 | 23p11.22 | NUDT11 | rs5945572 | 51246423 | A/G | PC | 1.23 |
 | | | rs5945619 | 51258412 | T/C | PC | 1.19 | 54
(a) SNPs genotyped in the Leiden Longevity Study are denoted in bold. (b) Risk allele not found in literature; data retrieved from SnpPer (ChiP Bioinformatics), minor allele denoted first. (c) Major allele of genotyped SNPs underlined. (d) Odds ratio per allele. (e) Odds ratio in heterozygotes and homozygotes, respectively. Abbreviations: CAD: coronary artery disease, T2D: type-2 diabetes, HF: heart failure, BC: breast cancer, PC: prostate cancer, CC: colon carcinoma, LC: lung cancer, CVD: cardiovascular disease.
Figure imgf000050_0001
SNP | Chr. | Position | MAF (cases) | MAF (controls) | Minor allele | OR | 95% CI | P(trend) | P(recessive)
rs680822 | 3q21.2 | 125,885,055 | 0.225 | 0.301 | T | 0.67 | 0.56-0.8 | 6.41x10^-? | 1.86x10^-?
rs17308266 | 3q21.2 | 125,888,788 | 0.176 | 0.233 | A | 0.71 | 0.59-0.85 | 2.32x10^-? | 1.93x10^-?
rs10512957 | 3q22.2 | 136,756,778 | 0.153 | 0.209 | C | 0.68 | 0.56-0.84 | 1.79x10^-? | 3.56x10^-?
rs13435042 | 4p14 | 38,187,440 | 0.247 | 0.288 | G | 0.81 | 0.69-0.95 | 1.47x10^-? | 3.00x10^-?
rs17015471 | 4q22.1 | 90,492,255 | 0.054 | 0.025 | G | 2.17 | 1.47-3.19 | 1.18x10^-? | 3.07x10^-?
rs903595 | 4q22.1 | 90,651,485 | 0.288 | 0.232 | T | 1.33 | 1.12-1.57 | 9.74x10^-? | 1.14x10^-?
rs6857132 | 4q25 | 112,483,749 | 0.477 | 0.404 | C | 1.33 | 1.15-1.55 | 1.53x10^-? | 1.08x10^-?
rs17050228 | 4q26 | 120,092,636 | 0.186 | 0.245 | C | 0.70 | 0.58-0.85 | 2.12x10^-? | 1.57x10^-?
rs4862270 | 4q35.1 | 185,097,150 | 0.023 | 0.051 | C | 0.44 | 0.28-0.7 | 2.01x10^-? | 6.88x10^-?
rs25943 | 5p15.2 | 14,701,689 | 0.112 | 0.160 | C | 0.66 | 0.53-0.83 | 2.49x10^-? | 6.54x10^-?
rs2911762 | 5q11.2 | 51,724,359 | 0.518 | 0.448 | G | 1.32 | 1.13-1.53 | 2.79x10^-? | 1.07x10^-?
rs854050 | 5q11.2 | 57,141,824 | 0.245 | 0.213 | A | 1.19 | 1-1.42 | 4.84x10^-? | 3.04x10^-?
rs4700231 | 5q11.2 | 57,153,837 | 0.246 | 0.210 | T | 1.21 | 1.02-1.45 | 2.92x10^-? | 3.61x10^-?
rs4513644 | 5q11.2 | 57,168,902 | 0.223 | 0.189 | A | 1.23 | 1.02-1.47 | 3.02x10^-? | 5.40x10^-?
rs4700233 | 5q11.2 | 57,193,489 | 0.220 | 0.186 | C | 1.22 | 1.02-1.46 | 3.26x10^-? | 7.15x10^-?
rs6556883 | 5q15 | 95,177,841 | 0.205 | 0.152 | A | 1.44 | 1.18-1.76 | 2.76x10^-? | 2.56x10^-?
rs1015565 | 5q23.3 | 129,632,118 | 0.114 | 0.164 | G | 0.66 | 0.53-0.82 | 1.62x10^-? | 1.59x10^-?
rs17165505 | 5q23.3 | 129,684,141 | 0.119 | 0.170 | A | 0.66 | 0.53-0.82 | 1.70x10^-? | 2.68x10^-?
rs4912610 | 5q31.3 | 141,034,424 | 0.103 | 0.153 | C | 0.64 | 0.51-0.81 | 1.69x10^-? | 1.49x10^-?
rs2804916 | 6p23 | 13,689,823 | 0.291 | 0.336 | C | 0.81 | 0.69-0.95 | 1.35x10^-? | 5.67x10^-?
rs9379626 | 6p22.2 | 24,080,884 | 0.272 | 0.211 | A | 1.40 | 1.17-1.66 | 1.73x10^-? | 2.49x10^-?
rs6922905 | 6p22.2 | 24,089,922 | 0.256 | 0.195 | A | 1.42 | 1.19-1.7 | 1.27x10^-? | 3.68x10^-?
rs2274089 | 6p22.2 | 25,596,562 | 0.053 | 0.104 | T | 0.50 | 0.37-0.69 | 2.45x10^-? | 2.50x10^-?
rs2700695 | 7p14.3 | 33,574,187 | 0.190 | 0.156 | C | 1.25 | 1.03-1.52 | 2.18x10^-? | 6.18x10^-?
rs1001903 | 7p14.3 | 35,408,165 | 0.248 | 0.310 | A | 0.73 | 0.62-0.87 | 2.48x10^-? | 5.65x10^-?
rs321967 | 7q21.11 | 78,152,485 | 0.306 | 0.266 | G | 1.21 | 1.02-1.43 | 2.25x10^-? | 1.50x10^-?
Figure imgf000052_0001
rs4265646 | 12q12 | 40,256,146 | 0.143 | 0.095 | G | 1.55 | 1.24-1.94 | 1.57x10^-? | 7.64x10^-?
rs10219573 | 12q12 | 40,269,724 | 0.142 | 0.094 | C | 1.57 | 1.26-1.96 | 1.01x10^-? | 7.55x10^-?
rs12306172 | 12q13.13 | 52,825,272 | 0.215 | 0.173 | A | 1.30 | 1.08-1.57 | 6.07x10^-? | 2.96x10^-?
rs12580632 | 12q23.2 | 100,486,792 | 0.087 | 0.132 | C | 0.63 | 0.5-0.81 | 2.21x10^-? | 5.27x10^-?
rs1586393 | 12q24.32 | 128,221,462 | 0.421 | 0.495 | G | 0.75 | 0.64-0.86 | 1.16x10^-? | 1.28x10^-?
rs11147137 | 12q24.33 | 131,964,367 | 0.145 | 0.202 | T | 0.68 | 0.55-0.82 | 1.23x10^-? | 2.15x10^-?
rs4243013 | 13q21.2 | 58,697,727 | 0.301 | 0.367 | T | 0.74 | 0.63-0.87 | 2.52x10^-? | 9.05x10^-?
rs4886131 | 13q21.2 | 58,697,800 | 0.303 | 0.370 | G | 0.74 | 0.63-0.87 | 2.44x10^-? | 6.03x10^-?
rs1536626 | 13q21.2 | 58,703,537 | 0.304 | 0.370 | G | 0.75 | 0.64-0.87 | 2.91x10^-? | 9.77x10^-?
rs10459348 | 13q31.1 | 84,517,083 | 0.246 | 0.310 | T | 0.73 | 0.62-0.86 | 2.05x10^-? | 3.85x10^-?
rs16949097 | 15q26.2 | 92,747,291 | 0.026 | 0.058 | A | 0.44 | 0.28-0.68 | 3.82x10^-? | 6.82x10^-?
rs1397141 | 16p13.2 | 9,250,868 | 0.120 | 0.181 | A | 0.62 | 0.49-0.77 | 9.36x10^-? | 2.59x10^-?
rs11076193 | 16q13 | 56,097,198 | 0.383 | 0.349 | C | 1.16 | 0.99-1.36 | 6.77x10^-? | 2.23x10^-?
rs230966 | 17p12 | 15,148,660 | 0.291 | 0.247 | G | 1.25 | 1.05-1.48 | 9.28x10^-? | 3.04x10^-?
rs16953309 | 18p11.23 | 8,328,100 | 0.093 | 0.057 | C | 1.72 | 1.3-2.28 | 2.85x10^-? | 1.81x10^-?
rs598149 | 18p11.23 | 8,330,623 | 0.100 | 0.063 | A | 1.68 | 1.28-2.2 | 2.75x10^-? | 1.49x10^-?
rs9964359 | 18q21.1 | 43,881,631 | 0.449 | 0.382 | T | 1.31 | 1.13-1.53 | 4.12x10^-? | 1.77x10^-?
rs892019 | 19p13.13 | 13,135,431 | 0.529 | 0.453 | C | 1.37 | 1.17-1.6 | 1.10x10^-? | 7.78x10^-?
rs11700097 | 20p12.1 | 15,919,797 | 0.477 | 0.408 | C | 1.32 | 1.14-1.54 | 2.51x10^-? | 9.54x10^-?
rs6044398 | 20p12.1 | 16,757,810 | 0.382 | 0.442 | G | 0.78 | 0.67-0.9 | 1.21x10^-? | 2.17x10^-?
rs6095314 | 20q13.13 | 46,875,410 | 0.311 | 0.371 | C | 0.78 | 0.66-0.91 | 1.66x10^-? | 1.51x10^-?
rs6095325 | 20q13.13 | 46,887,009 | 0.303 | 0.359 | G | 0.78 | 0.67-0.92 | 2.26x10^-? | 3.48x10^-?
rs1204759 | 20q13.13 | 49,190,813 | 0.534 | 0.456 | T | 1.35 | 1.16-1.56 | 6.17x10^-? | 3.78x10^-?
rs713766 | 22q12.3 | 30,936,790 | 0.194 | 0.255 | G | 0.71 | 0.59-0.86 | 2.19x10^-? | 3.75x10^-?
rs5961501 | Xp22.33 | 3,656,485 | 0.439 | 0.375 | G | 1.21 | 1.06-1.37 | 3.86x10^-? | 8.54x10^-?
rs2362161 | Xq22.1 | 100,802,986 | 0.160 | 0.109 | T | 1.35 | 1.13-1.62 | 1.21x10^-? | 9.88x10^-?
rs5974679 | Xq26.3 | 136,270,381 | 0.228 | 0.163 | T | 1.33 | 1.13-1.56 | 3.63x10^-4 | 4.62x10^-?
rs17001408 | Xq26.3 | 136,271,325 | 0.236 | 0.168 | C | 1.35 | 1.15-1.59 | 1.94x10^-? | 3.67x10^-?
SNPs selected for replication analysis, associated with longevity at p < 6.43 x 10^-4 in the GWAS analysis of LLS stage 1 and 2 combined. Chromosome position according to build 36. Position according to dbSNP build 129. MAF indicates minor allele frequency; the control frequency refers to all 953 Dutch controls. Major/minor refers to the allele with the highest or lowest frequency in controls. Ptrend and Precessive refer to the p value obtained in the additive and the recessive model, respectively. OR indicates the odds ratio of the most significant model. ORs above 1 indicate an increased probability of becoming long-lived, based on the minor allele being overrepresented in the elderly as compared to young controls. ORs below 1 indicate the opposite. Supplementary Table 3.1 presents the sequences of the SNP identifiers of the prioritized SNPs.
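As a rough plausibility check on the table above, an allelic odds ratio can be approximated from the minor allele frequencies in cases and controls. The sketch below is an illustration only: the function name `allelic_odds_ratio` is hypothetical, and the ORs in the table come from the most significant of the additive or recessive models, so they need not equal this simple allelic estimate.

```python
def allelic_odds_ratio(maf_cases: float, maf_controls: float) -> float:
    """Odds of carrying the minor allele in cases divided by the same
    odds in controls, computed from allele frequencies alone."""
    for freq in (maf_cases, maf_controls):
        if not 0.0 < freq < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    odds_cases = maf_cases / (1.0 - maf_cases)
    odds_controls = maf_controls / (1.0 - maf_controls)
    return odds_cases / odds_controls


# rs680822: MAF 0.225 in cases and 0.301 in controls (values from the table).
or_rs680822 = allelic_odds_ratio(0.225, 0.301)
print(f"{or_rs680822:.2f}")  # 0.67
```

For rs680822 the allelic estimate agrees with the tabulated OR of 0.67; for other SNPs small deviations are to be expected because of the model used and the correction for relatedness in the study.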
Figure imgf000055_0001
Table 2.1.3 Additive association analysis with survival of prioritized SNPs in separate studies and by meta-analysis.
Column groups: Leiden Longevity Study (LLS) | Danish 1905 | Leiden 85-plus | Meta-analysis
SNP | Chr. | Position | LLS OR | LLS 95% CI | LLS P | Danish 1905 OR | Danish 1905 95% CI | Danish 1905 P | Leiden 85-plus OR | Leiden 85-plus 95% CI | Leiden 85-plus P | Meta OR | Meta 95% CI | Meta P
rs7511741 | 1p36.31 | 7,068,004 | 0.62 | 0.49-0.78 | 4.50x10^-? | 0.96 | 0.84-1.1 | 5.90x10^-? | 0.98 | 0.83-1.16 | 8.19x10^-? | 0.90 | 0.82-0.99 | 1.03x10^-?
rs1695945 | 1p32.3 | 55,822,604 | 0.70 | 0.58-0.84 | 9.91x10^-? | 0.99 | 0.89-1.11 | 9.21x10^-? | 0.97 | 0.85-1.11 | 6.35x10^-? | 0.93 | 0.86-1 | 1.81x10^-?
rs1695946 | 1p32.3 | 55,824,896 | 0.70 | 0.58-0.84 | 6.31x10^-? | 0.99 | 0.89-1.11 | 8.55x10^-1 | 0.96 | 0.84-1.1 | 5.86x10^-? | 0.92 | 0.85-1 | 1.17x10^-?
rs1869580 | 1p31.1 | 74,103,869 | 0.49 | 0.33-0.72 | 2.08x10^-? | 0.99 | 0.79-1.24 | 9.20x10^-1 | 0.89 | 0.68-1.16 | 3.88x10^-? | 0.85 | 0.72-0.99 | 1.40x10^-?
rs17033794 | 1p13.2 | 115,746,673 | 2.17 | 1.53-3.07 | 9.99x10^-? | 0.74 | 0.6-0.92 | 5.70x10^-? | 1.07 | 0.82-1.4 | 5.96x10^-? | 1.02 | 0.88-1.18 | 4.83x10^-?
rs17163877 | 1p13.2 | 115,752,306 | 2.12 | 1.5-3 | 1.91x10^-5 | 0.78 | 0.63-0.96 | 1.81x10^-? | 1.03 | 0.77-1.37 | 8.60x10^-? | 1.02 | 0.87-1.19 | 4.72x10^-?
rs12043501 | 1q23.3 | 163,673,914 | 0.67 | 0.53-0.84 | 1.94x10^-? | 1.01 | 0.89-1.14 | 9.32x10^-1 | 0.94 | 0.81-1.09 | 4.00x10^-? | 0.93 | 0.85-1.01 | 2.38x10^-?
rs16844995 | 1q23.3 | 163,688,929 | 0.70 | 0.57-0.86 | 2.54x10^-? | 0.96 | 0.85-1.07 | 4.26x10^-1 | 0.99 | 0.86-1.13 | 8.46x10^-? | 0.92 | 0.85-1 | 1.26x10^-?
rs16845022 | 1q23.3 | 163,690,220 | 0.70 | 0.57-0.85 | 2.03x10^-4 | 0.95 | 0.85-1.07 | 4.11x10^-1 | 1.01 | 0.88-1.16 | 8.90x10^-? | 0.93 | 0.85-1 | 1.85x10^-?
rs17760425 | 2q32.1 | 184,032,166 | 0.66 | 0.53-0.82 | 1.63x10^-? | 0.99 | 0.87-1.12 | 8.56x10^-1 | 0.97 | 0.83-1.13 | 6.89x10^-? | 0.92 | 0.84-1 | 2.47x10^-?
rs17760539 | 2q32.1 | 184,033,421 | 0.67 | 0.54-0.83 | 2.52x10^-? | 1.00 | 0.87-1.15 | 9.89x10^-1 | 0.90 | 0.77-1.05 | 1.85x10^-? | 0.89 | 0.81-0.98 | 7.34x10^-?
rs2551201 | 2q36.1 | 222,391,180 | 1.32 | 1.14-1.54 | 2.45x10^-? | 1.03 | 0.94-1.13 | 5.09x10^-1 | 0.94 | 0.83-1.05 | 2.59x10^-? | 1.05 | 0.98-1.12 | 8.39x10^-?
rs1530545 | 2q36.3 | 226,395,585 | 0.67 | 0.56-0.81 | 5.14x10^-? | 1.00 | 0.89-1.12 | 9.61x10^-1 | 1.08 | 0.93-1.25 | 3.11x10^-? | 0.95 | 0.87-1.03 | 1.02x10^-?
rs952753 | 2q37.3 | 240,881,726 | 0.59 | 0.45-0.77 | 1.16x10^-? | 0.94 | 0.78-1.12 | 4.73x10^-1 | 0.99 | 0.82-1.21 | 9.53x10^-? | 0.87 | 0.77-0.98 | 1.01x10^-?
rs12474682 | 2q37.3 | 240,891,998 | 0.56 | 0.43-0.74 | 1.64x10^-? | 0.99 | 0.83-1.18 | 9.26x10^-1 | 1.03 | 0.85-1.25 | 7.47x10^-? | 0.91 | 0.81-1.02 | 2.66x10^-?
rs1394156 | 3p26.1 | 6,199,872 | 0.76 | 0.65-0.88 | 2.75x10^-? | 0.97 | 0.89-1.07 | 5.50x10^-1 | 0.93 | 0.83-1.05 | 2.40x10^-? | 0.92 | 0.86-0.98 | 3.39x10^-?
rs8516 | 3p25.1 | 14,159,670 | 1.42 | 1.19-1.69 | 7.81x10^-? | 1.01 | 0.91-1.13 | 8.21x10^-1 | 0.90 | 0.79-1.03 | 1.34x10^-? | 1.04 | 0.97-1.13 | 1.53x10^-?
rs1718235 | 3q12.1 | 101,066,174 | 0.75 | 0.65-0.88 | 2.85x10^-? | 1.08 | 0.98-1.19 | 1.03x10^-1 | 1.04 | 0.93-1.16 | 4.96x10^-? | 1.00 | 0.94-1.07 | 6.94x10^-?
rs680822 | 3q21.2 | 125,885,055 | 0.67 | 0.56-0.8 | 6.41x10^-? | 0.98 | 0.89-1.09 | 7.24x10^-? | 1.00 | 0.88-1.14 | 9.83x10^-? | 0.92 | 0.86-0.99 | 1.23x10^-?
rs17308266 | 3q21.2 | 125,888,788 | 0.71 | 0.59-0.85 | 2.32x10^-? | 0.99 | 0.89-1.11 | 9.21x10^-1 | 0.95 | 0.83-1.09 | 4.63x10^-? | 0.92 | 0.85-1 | 1.85x10^-?
rs10512957 | 3q22.2 | 136,756,778 | 0.68 | 0.56-0.84 | 1.79x10^-? | 1.02 | 0.91-1.15 | 7.21x10^-1 | 0.96 | 0.84-1.1 | 5.50x10^-? | 0.94 | 0.86-1.01 | 5.04x10^-?
rs17015471 | 4q22.1 | 90,492,255 | 2.17 | 1.47-3.19 | 1.18x10^-? | 0.87 | 0.71-1.08 | 2.11x10^-1 | 0.94 | 0.71-1.25 | 6.92x10^-? | 1.04 | 0.89-1.21 | 4.35x10^-?
Figure imgf000057_0001
Figure imgf000058_0001
Figure imgf000059_0001
Figure imgf000060_0001
Figure imgf000061_0001
Table 2.1.7 Expression analysis of CodeLink probes corresponding to the ST3GAL1 or ZFAT locus
All 8q24.22 SNPs
CodeLink Probe ID Position (Start-End) FC 95% CI P
ST3GAL1
GE57639 134,541,345-134,541,364 1.03 0.95 - 1.11 4.41x10^-1
GE737738 134,576,628-134,576,657 1.06 0.99 - 1.14 1.04x10^-1
GE646251 134,579,290-134,579,319 1.05 0.98 - 1.13 1.54x10^-1
GE817971 134,617,092-134,617,121 1.09 0.96 - 1.25 1.95x10^-1
GE521413 134,642,738-134,642,767 1.09 1.01 - 1.18 3.10x10^-2
GE702947 134,644,436-134,644,465 1.11 1.03 - 1.20 6.00x10^-3
GE608909 134,648,550-134,648,579 1.05 0.96 - 1.15 3.05x10^-1
Figure imgf000062_0001
ZFAT
GE479359 135,561,672-135,561,701 1.06 1.00 - 1.13 5.90x10^-2
GE785819 135,573,555-135,573,584 1.00 0.94 - 1.07 9.82x10^-1
GE895449 135,657,141-135,657,170 0.91 0.82 - 1.02 1.08x10^-1
GE86200 135,665,328-135,665,357 0.92 0.86 - 0.99 2.60x10^-2
CodeLink Bioarray probes were selected on the basis of their correspondence to the ST3GAL1 or ZFAT locus. One ST3GAL1 probe (GE57639) and one ZFAT probe (GE86200) were located in an exon of these genes, while the other probes corresponded to one or more Expressed Sequence Tags (ESTs) within intronic regions. Expression differences for these probes between 60 carriers and 90 non-carriers of one or more of the 8q24.22 SNPs were analyzed using linear regression in Stata/SE 8.0. FC: fold change.
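The carrier versus non-carrier comparison described above can be illustrated with a minimal sketch that regresses expression on a 0/1 carrier indicator and converts the slope into a fold change. The log2 scale of the expression values, the function name, the normal-approximation confidence interval and the toy numbers are all assumptions made for illustration; the analysis in the table was run in Stata/SE 8.0 and its exact model specification is not reproduced here.

```python
import math

def fold_change_by_carrier(log2_expr, carrier):
    """Ordinary least squares of log2 expression on a 0/1 carrier indicator.

    Returns (fold_change, ci_low, ci_high); the interval uses a normal
    approximation (1.96 standard errors), which is an assumption.
    """
    n = len(log2_expr)
    mean_x = sum(carrier) / n
    mean_y = sum(log2_expr) / n
    sxx = sum((x - mean_x) ** 2 for x in carrier)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(carrier, log2_expr))
    slope = sxy / sxx                      # difference in mean log2 expression
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2
              for x, y in zip(carrier, log2_expr))
    se = math.sqrt(rss / (n - 2) / sxx)    # standard error of the slope
    return 2 ** slope, 2 ** (slope - 1.96 * se), 2 ** (slope + 1.96 * se)

# Hypothetical toy data: three carriers followed by four non-carriers.
expr = [8.1, 8.3, 8.2, 8.0, 8.1, 7.9, 8.0]
is_carrier = [1, 1, 1, 0, 0, 0, 0]
fc, low, high = fold_change_by_carrier(expr, is_carrier)
print(f"FC = {fc:.2f} (95% CI {low:.2f} - {high:.2f})")
```

With a binary predictor the slope equals the difference between the two group means, so an FC above 1 corresponds to higher expression in carriers.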
Table 2.2.1 Association of GWA-identified disease risk alleles with longevity
Risk Allele Frequency (LLS Controls, LLS 90+ Cases, L85plus Controls, L85plus Cases); Meta-analysis (Odds Ratio, 95% CI, P-value)
Chromosome SNP Position Disease profile(a) Risk/non-risk(c) LLS Controls LLS 90+ Cases L85plus Controls L85plus Cases Odds Ratio 95% CI P-value
1p13.3 rs646776 109620053 M T/C 0.786 0.737 0.778 0.777 0.88 0.79-0.99 0.035
2q32.3 rs10497721 192622607 M A/C 0.092 0.113 0.093 0.090 1.07 0.91-1.26 0.367
3p25.2 rs1801282 12368125 M C/G 0.884 0.883 0.892 0.873 0.89 0.76-1.03 0.131
3q27.2 rs4402960 186994381 M T/G 0.316 0.284 0.300 0.305 0.96 0.86-1.06 0.374
4p16.1 rs10010131 6343816 M G/A 0.586 0.562 0.590 0.600 0.98 0.89-1.08 0.742
5q34 rs10515869 163444804 - A/G 0.444 0.424 0.440 0.434 0.95 0.87-1.05 0.320
5q34 rs6556756 163821858 - T/G 0.089 0.112 0.116 0.096 0.97 0.83-1.13 0.722
6p22.3 rs7754840 20769229 M C/G 0.324 0.332 0.306 0.317 1.05 0.95-1.17 0.391
6q25.1 rs6922269 151294678 M A/G 0.255 0.307 0.247 0.250 1.12 1.00-1.25 0.033
8q24.11 rs13266634 118253964 M C/T 0.686 0.693 0.696 0.701 1.03 0.93-1.15 0.597
8q24.21 rs6983267 128482487 C G/T 0.528 0.526 0.525 0.516 0.98 0.87-1.08 0.606
8q24.21 rs7014346 128493974 C A/G 0.386 0.376 0.369 0.352 0.95 0.86-1.05 0.227
8q24.21 rs1447295 128554220 C A/C 0.142 0.110 0.116 0.122 0.91 0.79-1.06 0.235
9p21.3 rs564398 22019547 M T/C 0.572 0.559 0.596 0.563 0.90 0.82-1.00 0.037
9p21.3 rs10757278 22114477 M G/A 0.457 0.461 0.476 0.428 0.90 0.81-0.99 0.026
9p21.3 rs1333049 22115503 M C/G 0.542 0.543 0.527 0.570 1.11 1.00-1.22 0.030
9p21.3 rs10811661 22124094 M T/C 0.823 0.808 0.796 0.826 1.08 0.96-1.23 0.216
10q23.33 rs1111875 94452862 M C/T 0.593 0.594 0.598 0.605 1.02 0.93-1.13 0.680
10q25.2 rs7903146 114748339 M T/C 0.278 0.275 0.264 0.273 1.02 0.92-1.14 0.720
10q26.13 rs2420946 123341314 C T/C 0.392 0.359 0.396 0.391 0.93 0.84-1.03 0.167
11p15.1 rs5219 17366148 M T/C 0.378 0.344 0.370 0.359 0.91 0.83-1.01 0.080
12q21.1 rs1495377 69863368 M G/C 0.505 0.528 0.494 0.519 1.10 1.00-1.22 0.046
15q25.1 rs8034191 76593078 C T/C 0.682 0.682 0.662 0.683 1.06 0.95-1.18 0.270
16q12.1 rs8051542 51091668 C T/C 0.730 0.730 0.756 0.715 0.88 0.79-0.98 0.023
16q12.1 rs12443621 51105538 C G/A 0.425 0.439 0.421 0.446 1.09 0.99-1.20 0.083
16q12.2 rs8050136 52373776 M A/C 0.383 0.371 0.391 0.388 0.97 0.88-1.08 0.599
17q12 rs757210 33170628 M, C T/C 0.384 0.381 0.387 0.384 0.99 0.90-1.09 0.772
17q12 rs4430796 33172153 M, C(b) A/G 0.486 0.501 0.484 0.494 1.05 0.96-1.16 0.320
18q21.1 rs4939827 44707461 C T/C 0.494 0.528 0.491 0.494 1.06 0.97-1.17 0.202
Xp11.22 rs5945572 51246423 C A/G 0.354 0.375 0.341 0.366 0.99 0.91-1.08 0.852
(a) M indicates that the risk allele contributes to metabolic disease (CAD or T2D); C indicates that the risk allele contributes to cancer.
(b) The major allele (A) has been associated with risk for cancer, while the minor allele (G) has been associated with T2D.
(c) The major allele is indicated in bold.
Meta-analysis: logistic regression with long-lived/control status as outcome, the study as covariate and the SNP genotypes as independent variable (Stata/SE 8.0). Analyses were performed with robust standard errors to take into account family dependency in the Leiden Longevity Study.
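To relate the frequency columns to the odds-ratio columns, the sketch below computes a per-study allelic odds ratio from the case and control risk-allele frequencies, and an approximate two-sided P-value from a reported odds ratio and its 95% CI. Both helper functions are hypothetical and rest on simplifying assumptions (allele-level odds, a Wald test on the log scale). The table itself was obtained by logistic regression on genotypes with study as covariate and robust standard errors, so these values are only a rough consistency check and will not reproduce the meta-analysis column exactly.

```python
import math

def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds of carrying the risk allele in cases relative to controls."""
    odds_cases = freq_cases / (1 - freq_cases)
    odds_controls = freq_controls / (1 - freq_controls)
    return odds_cases / odds_controls

def p_from_or_ci(odds_ratio, ci_low, ci_high):
    """Approximate two-sided Wald P-value implied by an OR and its 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2))

# rs646776 (1p13.3), values taken from the first row of the table.
or_lls = allelic_odds_ratio(0.737, 0.786)   # LLS 90+ cases vs LLS controls
or_l85 = allelic_odds_ratio(0.777, 0.778)   # L85plus cases vs L85plus controls
p_meta = p_from_or_ci(0.88, 0.79, 0.99)     # from the meta-analysis columns
print(f"LLS OR = {or_lls:.2f}, L85plus OR = {or_l85:.2f}, approx. P = {p_meta:.3f}")
```

An odds ratio below 1 means the disease risk allele is less frequent among the long-lived cases than among the controls.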
Figure imgf000065_0001
Table 2.4.1 Medians of parameters measured in donors of the 150 RNA samples that were used in the microarray comparison study. *: p<0.05 between long-lived sibs and offspring or partners. None of the parameters is significantly different between partners and offspring.
Partners Offspring Sibs
Age (years)* 62.0 60.5 92.6
Nr of females (%) 26 (52%) 25 (50%) 24 (48%)
Glucose (mmol/L)* 5.9 5.6 6.7
Blood cell counts:
White blood cells (E-9 cells/L) 6.96 7.00 6.69
Hemoglobin (mmol/L)* 8.90 9.00 8.15
Hematocrit (L/L)* 0.43 0.43 0.41
Platelets (E-9/L)* 244.0 276.5 225.5
Neutrophils (E-9 cells/L) 4.02 4.15 4.36
Lymphocytes (E-9 cells/L)* 2.01 2.03 1.19
Monocytes (E-10 cells/L) 3.74 4.12 4.48
Eosinophils (E-10 cells/L) 1.63 1.84 1.56
Basophils (E-11 cells/L)* 5.23 4.94 3.95
Unknown cells (E-10 cells/L) 1.41 1.43 1.36
Table 2.4.2 Results from the microarray comparison of RNA from 150 subjects from the Leiden Longevity Study. Probes that 1) are differentially expressed between offspring of long-lived siblings and their partners and 2) reside in transcripts in which SNPs were associated with longevity in the Leiden Longevity Study with p-value <0.05 (Combination expression/GWA).
Figure imgf000067_0001
Table 2.4.3 Results from the microarray comparison of RNA from 150 subjects from the Leiden Longevity Study. Probes that 1) are differentially expressed between offspring of long-lived siblings and their partners and 2) reside in a linkage area resulting from affected sib-pair analysis in the large families of the Leiden Longevity Study with LOD score >2.0 (Table 4).
GE probe ID Gene SEQ ID NO: GenBank Acc. No. GenBank version UniGene Chromosome Combination
GE624013 MARCHIII 100 M85500 May 26 1992 Hs.132441 5q23.2 Expression/Linkage
GE535567 - 101 AI792600 Dec 13 1999 - 5q31.2 Expression/Linkage
GE749029 SNRPN 102 BU075221 Aug 27 2002 Hs.564847 15q11.2 Expression/Linkage
GE57513 SNRPN 103 NM_005678 NM_005678.3 Hs.564847 15q11.2 Expression/Linkage
GE probe ID: a unique identifier for the probe sequence in the CodeLink WEBB database, an internal GE Healthcare relational database that held all gene-associated annotations and linked them to the specific CodeLink probe ID. GE Healthcare Gene Expression Bioarrays (CodeLink): CodeLink Human Whole Genome Bioarrays (GE Healthcare) targeting approximately 7,000 transcripts and ESTs.
Supplementary Table 3.1 Sequences of SNP identifiers of prioritized SNPs. Positions in the sequences with letters other than A, G, C or T indicate the presence of other SNPs at those positions. In the sequence listing these SNPs are coded according to the IUPAC code: B=C, G or T; D=A, G or T; H=A, C or T; R=A or G; Y=C or T; K=G or T; M=A or C; S=G or C; W=A or T; N=A, C, T or G; V=A, C or G. Letters in lower case indicate sequences that are repeated in the genome. These sequences are preferably avoided in the design of nucleic acid molecules of the invention (probes and primers) for analysis of the SNPs and their genomic environment.
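The coding conventions described above can be sketched as a small helper that locates additional SNP positions (IUPAC ambiguity codes) and repeat-masked positions (lower-case letters) in a flanking sequence, and checks whether a candidate probe or primer window avoids both. The function names and the example sequence are hypothetical; only the code table itself is taken from the caption.

```python
# IUPAC ambiguity codes exactly as listed in the caption of Table 3.1.
IUPAC = {
    "B": "CGT", "D": "AGT", "H": "ACT", "R": "AG", "Y": "CT", "K": "GT",
    "M": "AC", "S": "GC", "W": "AT", "N": "ACTG", "V": "ACG",
}

def annotate_sequence(seq):
    """Return (ambiguous, repeats).

    ambiguous maps 0-based positions of IUPAC codes to the set of possible
    bases; repeats lists 0-based positions written in lower case.
    """
    ambiguous = {}
    repeats = []
    for i, base in enumerate(seq):
        if base.upper() in IUPAC:
            ambiguous[i] = set(IUPAC[base.upper()])
        if base.islower():
            repeats.append(i)
    return ambiguous, repeats

def is_clean_window(seq, start, length):
    """True if the window is full length and holds only upper-case A, C, G, T."""
    window = seq[start:start + length]
    return len(window) == length and all(b in "ACGT" for b in window)

# Hypothetical flanking sequence, not taken from the sequence listing.
example = "ACGTRacgtYACGT"
ambiguous, repeats = annotate_sequence(example)
print(sorted(ambiguous), repeats, is_clean_window(example, 10, 4))
```

A window that passes `is_clean_window` contains neither a neighbouring SNP nor repeated sequence, which matches the stated preference for probe and primer design.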
Figure imgf000069_0001
Figure imgf000070_0001
Figure imgf000071_0001
Figure imgf000072_0001
Figure imgf000073_0001
Figure imgf000074_0001
Figure imgf000075_0001
Figure imgf000076_0001
Figure imgf000077_0001
Figure imgf000078_0001
Figure imgf000079_0001
Figure imgf000080_0001
Figure imgf000081_0001
Figure imgf000082_0001
Figure imgf000083_0001
Figure imgf000084_0001

Claims

1. A portfolio comprising: a) at least two nucleotide sequences, wherein the nucleotide sequences are in linkage disequilibrium with at least two different SNPs listed in Table 3.1, and wherein at least one nucleotide sequence is in linkage disequilibrium with a SNP selected from the group consisting of the SNPs rs16905070, rs7814049, rs7013830, rs4513644, rs4700233, rs854050 and rs4700231; or, b) at least two nucleotide sequences selected from the group consisting of: i) nucleotide sequences comprising at least 10 contiguous nucleotides from a transcript or complement thereof that specifically hybridises to a probe selected from the group consisting of probes having a nucleotide sequence of SEQ ID NO: 95 - 103; and, ii) nucleotide sequences that specifically hybridise to a transcript having at least 80% sequence identity with at least one nucleotide sequence selected from the group consisting of SEQ ID NO: 120, SEQ ID NO: 121, SEQ ID NO: 122, SEQ ID NO: 123 and their complements; whereby the portfolio comprises at least one nucleotide sequence selected from ii).
2. A portfolio according to claim 1, wherein a nucleotide sequence that is in linkage disequilibrium with a SNP listed in Table 3.1 comprises at least 10 contiguous nucleotides from a nucleotide sequence as defined in any one of SEQ ID NOs: 1 - 94 and 104 - 119.
3. A portfolio according to claim 1 or 2, wherein the SNPs are SNPs listed in Table 2.1.5.
4. A portfolio according to any one of claims 1 - 3, wherein the nucleotide sequences are expressed sequences.
5. Use of a nucleic acid molecule comprising: a) a nucleotide sequence that is in linkage disequilibrium with a SNP selected from the group consisting of rs16905070, rs7814049, rs7013830, rs4513644, rs4700233, rs854050 and rs4700231; or, b) a nucleotide sequence that specifically hybridises to a transcript comprising a sequence that has at least 80% sequence identity with at least one sequence selected from the group consisting of SEQ ID NO: 120, SEQ ID NO: 121, SEQ ID NO: 122 and SEQ ID NO: 123; in a method for determining genetic predisposition for longevity, a method of screening for a substance that modulates the biological aging rate or a substance that is capable of modulation of longevity and/or life expectancy, and/or a method for assessing physiological age.
6. A use according to claim 5, wherein the nucleotide sequence is present in a chromosomal fragment extending from ST3GAL1 to ZFAT or in a chromosomal fragment extending from ACTBL2 to PLK2 and wherein the nucleotide sequence is a sequence that is unique in the human genome or transcriptome.
7. A use according to claim 6, wherein the nucleotide sequence comprises at least 10 contiguous nucleotides from a nucleotide sequence selected from SEQ ID NO's: 56, 55, 54, 120, 121, 36, 37, 34, 35, 122 and 123.
8. A method for determining a genetic predisposition for longevity, wherein the method comprises: a) detecting the presence of a polymorphism that is in linkage disequilibrium with a SNP selected from the group consisting of rs16905070, rs7814049, rs7013830, rs4513644, rs4700233, rs854050 and rs4700231, wherein the presence of the polymorphism is indicative of longevity; or, b) determining the expression level of a transcript having at least 80% sequence identity with a nucleotide sequence selected from the group consisting of SEQ ID NO: 120, SEQ ID NO: 121, SEQ ID NO: 122 and SEQ ID NO: 123, wherein the expression level is indicative of longevity.
9. A method according to claim 8, wherein the polymorphism is present in a chromosomal fragment extending from ST3GAL1 to ZFAT or in a chromosomal fragment extending from ACTBL2 to PLK2.
10. A method according to claim 8 or 9, wherein the polymorphism is a SNP selected from the group consisting of rs16905070, rs7814049, rs7013830, rs4513644, rs4700233, rs854050 and rs4700231.
11. A method according to claim 8, wherein an increase of at least 3% in expression level of a transcript having at least 80% sequence identity with the nucleotide sequence of SEQ ID NO: 120, compared to the average expression level in a pool of subjects that do not express excess survival, is indicative of longevity.
12. A method for assessing physiological age of a subject, wherein the method comprises: a) determining expression information in a sample obtained from the subject, of one or more expressed nucleotide sequences as present in a portfolio as defined in claim 4; b) using the expression information to generate an age signature for the sample; and, c) comparing the age signature obtained in b) with a control age signature; wherein a statistically significant match with a positive control or a statistically significant difference from a negative control is indicative of age in the sample.
13. A method for identification of a substance that modulates the biological aging rate in a subject, wherein the method comprises the steps of: a) contacting the substance to a test cell or administering the substance to a test organism; b) determining in the test cell or in a test organism the expression level of one or more expressed nucleotide sequences as present in a portfolio as defined in claim 4; c) comparing the expression level of the nucleotide sequence(s) with the expression level of the corresponding nucleotide sequence(s) in a test cell that is not contacted with the substance or in the test organism that is not contacted with the substance; and, d) identifying a substance that produces a difference in expression level of at least one of the nucleotide sequences, between the test cell or test organism that is contacted with the substance and the test cell or test organism that is not contacted with the substance.
14. A method according to claim 13, wherein the substance is identified as a substance that promotes longevity when the substance upregulates a nucleotide sequence that is upregulated in a population that expresses excess survival or when the substance downregulates a nucleotide sequence that is downregulated in a population that expresses excess survival.
15. A method of improving health of a subject, the method comprising: a) determining a genetic predisposition for longevity using a method as defined in any one of claims 8 - 11; and b) if the subject does not have the genetic predisposition for longevity, providing, to the subject, a substance that promotes longevity as identifiable in a method according to claim 13.
16. A method of improving health of a subject, the method comprising: a) determining a genetic predisposition for longevity by a method which comprises determining in (a sample from) the subject the expression level of one or more expressed nucleotide sequences as present in a portfolio as defined in claim 4; and wherein in b) a substance is provided to the subject that modulates expression or activity of the expressed nucleotide sequence(s) or its gene product(s).
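Two of the numeric criteria recited in the claims, the "at least 10 contiguous nucleotides" of claims 1, 2 and 7 and the "increase of at least 3% in expression level" of claim 11, can be illustrated with the following sketch. The function names, the exact-match comparison, the linear expression scale and the example values are assumptions chosen for illustration; the sketch shows one possible reading of these thresholds and is not a definition of claim scope.

```python
def shares_contiguous_run(candidate, reference, min_len=10):
    """True if candidate contains at least min_len contiguous nucleotides
    that occur, in the same order, somewhere in reference (exact match)."""
    candidate, reference = candidate.upper(), reference.upper()
    return any(
        reference[i:i + min_len] in candidate
        for i in range(len(reference) - min_len + 1)
    )

def exceeds_pool_average(level, pool_levels, min_increase=0.03):
    """True if level is at least min_increase (3% by default) above the
    average expression level of the comparison pool (linear scale assumed)."""
    pool_mean = sum(pool_levels) / len(pool_levels)
    return level >= pool_mean * (1 + min_increase)

# Hypothetical sequences and expression levels.
print(shares_contiguous_run("TTACGTACGTACGG", "ACGTACGTAC"))
print(exceeds_pool_average(1.04, [1.0, 1.0, 1.0]))
```

A reference shorter than `min_len` can never satisfy the first check, because no window of the required length exists.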
PCT/NL2009/050409 2008-07-07 2009-07-07 New indicators of human longevity and biological ageing rate Ceased WO2010005303A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP08159827 2008-07-07
EP08159827.8 2008-07-07

Publications (2)

Publication Number Publication Date
WO2010005303A2 WO2010005303A2 (en) 2010-01-14
WO2010005303A3 WO2010005303A3 (en) 2010-04-29

Family

ID=39938277

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/NL2009/050409 Ceased WO2010005303A2 (en) 2008-07-07 2009-07-07 New indicators of human longevity and biological ageing rate

Country Status (1)

Country Link
WO (1) WO2010005303A2 (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20010020422A (en) * 1997-04-29 2001-03-15 케니쓰 블럼, 인코포레이티드 Allelic polygene diagnosis of reward deficiency syndrome and treatment
US6673546B2 (en) * 2000-08-11 2004-01-06 The Children's Medical Center Corporation Genetic loci indicative of propensity for longevity and methods for identifying propensity for age-related disease
GB0224559D0 (en) * 2002-10-22 2002-11-27 Oxagen Ltd Test
ATE423218T1 (en) * 2002-12-04 2009-03-15 Elixir Pharmaceuticals Inc COMPONENTS OF THE AMPK PATH
WO2004085996A2 (en) * 2003-03-20 2004-10-07 Albert Einstein College Of Medicine Of Yeshiva University Biomarkers for longevity and disease and uses thereof
US7908090B2 (en) * 2005-11-30 2011-03-15 The Board Of Trustees Of The Leland Stanford Junior University Signatures for human aging
WO2007131345A1 (en) * 2006-05-12 2007-11-22 The Hospital For Sick Children Genetic risk factor in sod1 and sfrs15 in renal disease, diabetic cataract, cardiovascular disease and longevity

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09788216

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09788216

Country of ref document: EP

Kind code of ref document: A2