AU2016225923B2

AU2016225923B2 - Novel Methods of Constructing Libraries Comprising Displayed and/or Expressed Members of a Diverse Family of Peptides, Polypeptides or Proteins and the Novel Libraries

Info

Publication number: AU2016225923B2
Application number: AU2016225923A
Authority: AU
Inventors: Edward Hirsch Cohen; Rene Hoet; Hendricus R. J. M. Hoogenboom; Robert Charles Ladner; Horacio Gabriel Nastri; Kristin L. Rookey
Original assignee: Takeda Pharmaceutical Co Ltd
Current assignee: Takeda Pharmaceutical Co Ltd
Priority date: 2001-04-17
Filing date: 2016-09-09
Publication date: 2018-07-05
Anticipated expiration: 2022-04-17
Also published as: AU2016225923A1; AU2018241075A1; AU2018241075B2

Abstract

Methods useful in constructing libraries that collectively display and/or express members of diverse families of peptides, polypeptides or proteins and the libraries produced using those methods. Methods of screening those libraries and the peptides, polypeptides or proteins identified by such screens.

Description

The present invention relates to libraries of genetic packages that display and/or express a member of a diverse family of peptides, polypeptides or proteins and collectively display and/or express at least a portion of the diversity of the family. In an alternative embodiment, the invention relates to libraries that include a member of a diverse family of peptides, polypeptides or proteins and collectively comprise at least a portion of the diversity of the family. In a preferred embodiment, the displayed and/or expressed polypeptides are human Fabs.

2016225923 09 Sep 2016

More specifically, the invention is directed to the methods of cleaving single-stranded nucleic acids at chosen locations, the cleaved nucleic acids encoding, at least in part, the peptides, polypeptides or proteins displayed on the genetic packages of, and/or expressed in, the libraries of the invention.

In a preferred embodiment, the genetic packages are filamentous phage or phagemids or yeast.

The present invention further relates to vectors for displaying and/or expressing a diverse family of peptides, polypeptides or proteins.

The present invention further relates to methods of screening the libraries of the invention and to the peptides, polypeptides and proteins identified by such screening.

BACKGROUND OF THE INVENTION

It is now common practice in the art to prepare libraries of genetic packages that display, express or comprise a member of a diverse family of peptides, polypeptides or proteins and collectively display, express or comprise at least a portion of the diversity of the family. In many common libraries, the peptides, polypeptides or proteins are related to antibodies. Often, they are Fabs or single chain antibodies.

In general, the DNAs that encode members of the families to be displayed and/or expressed must be amplified before they are cloned and used to display and/or express the desired member. Such amplification typically makes use of forward and backward primers.


Such primers can be complementary to sequences native to the DNA to be amplified or complementary to oligonucleotides attached at the 5' or 3' ends of that DNA. Primers that are complementary to sequences native to the DNA to be amplified are disadvantaged in that they bias the members of the families to be displayed. Only those members that contain a sequence in the native DNA that is substantially complementary to the primer will be amplified. Those that do not will be absent from the family. For those members that are amplified, any diversity within the primer region will be suppressed.

For example, in European patent 368,684 B1, the primer that is used is at the 5' end of the V_H region of an antibody gene. It anneals to a sequence region in the native DNA that is said to be sufficiently well conserved within a single species. Such a primer will bias the members amplified to those having this conserved region. Any diversity within this region is extinguished.

It is generally accepted that human antibody genes arise through a process that involves a combinatorial selection of V and J or V, D, and J followed by somatic mutations. Although most diversity occurs in the Complementarity Determining Regions (CDRs), diversity also occurs in the more conserved Framework Regions (FRs) and at least some of this diversity confers or enhances specific binding to antigens (Ag). As a consequence, libraries should contain as much of the CDR and FR diversity as possible.

To clone the amplified DNAs for display on a genetic package and/or for expression of the peptides, polypeptides or proteins that they encode, the DNAs must be cleaved to produce appropriate ends for ligation to a vector. Such cleavage is generally effected using restriction endonuclease recognition sites carried on the primers. When the primers are at the 5' end of DNA produced from reverse transcription of RNA, such restriction leaves deleterious 5' untranslated regions in the amplified DNA. These regions interfere with expression of the cloned genes and thus the display of the peptides, polypeptides and proteins coded for by them.

SUMMARY OF THE INVENTION

In one aspect, the present invention advantageously provides novel methods for constructing libraries that display, express or comprise a member of a diverse family of peptides, polypeptides or proteins and collectively display, express or comprise at least a portion of the diversity of the family.

These methods are not biased toward DNAs that contain native sequences that are complementary to the primers used for amplification. They also enable any sequences that may be deleterious to expression to be removed from the amplified DNA before cloning and displaying and/or expressing.

In another aspect the present invention advantageously provides a method for cleaving single-stranded nucleic acid sequences at a desired location, the method comprising the steps of:

(i) contacting the nucleic acid with a single-stranded oligonucleotide, the oligonucleotide being functionally complementary to the nucleic acid in the region in which cleavage is desired and including a sequence that with its complement in the nucleic acid forms a restriction endonuclease recognition site that on restriction results in cleavage of the nucleic acid at the desired location; and

(ii) cleaving the nucleic acid solely at the recognition site formed by the complementation of the nucleic acid and the oligonucleotide;

the contacting and the cleaving steps being performed at a temperature sufficient to maintain the nucleic acid in substantially single-stranded form, the oligonucleotide being functionally complementary to the nucleic acid over a large enough region to allow the two strands to associate such that cleavage may occur at the chosen temperature and at the desired location, and the cleavage being carried out using a restriction endonuclease that is active at the chosen temperature.
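By way of illustration only, the geometry of this first method can be sketched in Python. The sketch assumes a hypothetical single-stranded sequence and a placeholder recognition sequence (`GGCC`); the function names, the 10-base flank, and the cut offset are assumptions made for the example and are not taken from the specification. Temperature and enzyme activity are not modeled.

```python
# Illustrative sketch: design a short oligonucleotide that anneals around a
# chosen recognition site on single-stranded DNA, then split the strand there.
# All names and sequences below are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_oligo(ssdna: str, site: str, flank: int = 10):
    """Return (site_start, oligo) for the single occurrence of `site`.

    The oligo is complementary to the site plus `flank` bases on either side,
    so that annealing makes the strand locally double-stranded at the site.
    """
    hits = [i for i in range(len(ssdna) - len(site) + 1)
            if ssdna.startswith(site, i)]
    if len(hits) != 1:
        raise ValueError(f"expected exactly one site, found {len(hits)}")
    start = hits[0]
    lo = max(0, start - flank)
    hi = min(len(ssdna), start + len(site) + flank)
    return start, reverse_complement(ssdna[lo:hi])


def cleave(ssdna: str, site_start: int, cut_offset: int):
    """Split the strand `cut_offset` bases after the start of the site."""
    cut = site_start + cut_offset
    return ssdna[:cut], ssdna[cut:]


# Hypothetical single-stranded DNA containing one placeholder site.
SSDNA = "ATGCATTACGTTAGGCCTAACGATCGTACGATTA"
SITE_START, OLIGO = design_oligo(SSDNA, "GGCC")
LEFT, RIGHT = cleave(SSDNA, SITE_START, cut_offset=2)
```

The uniqueness check reflects the requirement that cleavage occur at the desired location only; a real design would also have to consider annealing at the chosen temperature.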

In a further aspect, the present invention advantageously provides an alternative method for cleaving single-stranded nucleic acid sequences at a desired location, the method comprising the steps of:

(i) contacting the nucleic acid with a partially double-stranded oligonucleotide, the single-stranded region of the oligonucleotide being functionally complementary to the nucleic acid in the region in which cleavage is desired, and the double-stranded region of the oligonucleotide having a restriction endonuclease recognition site; and

(ii) cleaving the nucleic acid solely at the cleavage site formed by the complementation of the nucleic acid and the single-stranded region of the oligonucleotide.

In an alternative embodiment of this aspect of the invention, the restriction endonuclease recognition site is not initially located in the double-stranded part of the oligonucleotide. Instead, it is part of an amplification primer, which primer is complementary to the double-stranded region of the oligonucleotide. On amplification of the DNA-partially double-stranded oligonucleotide combination, the restriction endonuclease recognition site carried on the primer becomes part of the DNA. It can then be used to cleave the DNA.

Preferably, the restriction endonuclease recognition site is that of a Type II-S restriction endonuclease whose cleavage site is located at a known distance from its recognition site.
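The practical consequence of a cleavage site at a known distance from the recognition site can be sketched as follows. This is a minimal illustration, not part of the specification: FokI (recognition sequence GGATG, cutting 9 bases downstream on the strand carrying the site and 13 on the opposite strand) is used only as a familiar example of a Type II-S enzyme, and the sequence and function names are hypothetical. Only sites in the forward orientation are searched.

```python
# Illustrative sketch: locate the top-strand cut position of a Type II-S
# enzyme from the position of its recognition site. Names are hypothetical.
from typing import List, NamedTuple


class TypeIISEnzyme(NamedTuple):
    name: str
    site: str           # recognition sequence, written 5'->3'
    top_offset: int     # bases between the end of the site and the top-strand cut
    bottom_offset: int  # bases between the end of the site and the bottom-strand cut


# FokI is used here as an example enzyme only.
FOKI = TypeIISEnzyme("FokI", "GGATG", 9, 13)


def top_strand_cuts(seq: str, enzyme: TypeIISEnzyme) -> List[int]:
    """Return indices at which the top strand is cut (cut falls before seq[i])."""
    cuts = []
    start = seq.find(enzyme.site)
    while start != -1:
        cut = start + len(enzyme.site) + enzyme.top_offset
        if cut <= len(seq):
            cuts.append(cut)
        start = seq.find(enzyme.site, start + 1)
    return cuts


# Hypothetical sequence: the site sits at index 2, so the cut falls at index 16.
SEQ = "AAGGATGACGTACGTACCCCTTTT"
CUTS = top_strand_cuts(SEQ, FOKI)
```

Because the cut position depends only on where the recognition site is placed, the site can be carried on an oligonucleotide or primer while the cut falls within the target DNA.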

In another aspect, the present invention advantageously provides a method of capturing DNA molecules that comprise a member of a diverse family of DNAs and collectively comprise at least a portion of the diversity of the family.

These DNA molecules in single-stranded form have been cleaved by one of the methods of this invention. This method involves ligating the individual single-stranded DNA members of the family to a partially duplex DNA complex. The method comprises the steps of:

(i) contacting a single-stranded nucleic acid sequence that has been cleaved with a restriction endonuclease with a partially double-stranded oligonucleotide, the single-stranded region of the oligonucleotide being functionally complementary to the nucleic acid in the region that remains after cleavage, the double-stranded region of the oligonucleotide including any sequences necessary to return the sequences that remain after cleavage into proper reading frame for expression and containing a restriction endonuclease recognition site 5' of those sequences; and

(ii) cleaving the partially double-stranded oligonucleotide sequence solely at the restriction endonuclease cleavage site contained within the double-stranded region of the partially double-stranded oligonucleotide.

As before, in this aspect of the invention, the restriction endonuclease recognition site need not be located in the double-stranded portion of the oligonucleotide. Instead, it can be introduced on amplification with an amplification primer that is used to amplify the DNA-partially double-stranded oligonucleotide combination.

- 8 2016225923 12 Jun 2018

In another aspect, the present invention advantageously provides the preparation of libraries that display, express or comprise a diverse family of peptides, polypeptides or proteins and collectively display, express or comprise at least part of the diversity of the family, using the methods and DNAs described above.

In another aspect, the present invention advantageously provides methods of screening those libraries to identify useful peptides, polypeptides and proteins and of using those substances in human therapy.

In one aspect, the invention provides a method for cleaving a nucleic acid at a desired location, the method comprising the steps of:

(i) contacting a single-stranded nucleic acid with a single-stranded oligonucleotide, the single-stranded oligonucleotide being complementary to the single-stranded nucleic acid in the region in which cleavage is desired; wherein the single-stranded nucleic acid and the single-stranded oligonucleotide associate to form a locally double-stranded region of the single-stranded nucleic acid, wherein the locally double-stranded region comprises a restriction endonuclease recognition site; and

(ii) cleaving the nucleic acid at the restriction endonuclease recognition site, wherein the cleaving comprises contacting a restriction endonuclease to the locally double-stranded region, wherein the restriction endonuclease is specific for the restriction endonuclease recognition site;

the contacting and the cleaving steps being performed at a temperature wherein the single-stranded nucleic acid and the single-stranded oligonucleotide associate to form a locally double-stranded region of the single-stranded nucleic acid, wherein the remainder of the single-stranded nucleic acid is single-stranded, and wherein the restriction endonuclease is active at the temperature.


A definition of the specific embodiment of the invention claimed herein follows.

In a broad format, the present invention provides a library comprising a collection of nucleic acids, which collectively encodes a plurality of antibody heavy chains each comprising a heavy chain variable region containing, from its N-terminus to C-terminus, Framework Region 1 (FR1), Complementarity Determining Region 1 (CDR1), Framework Region 2 (FR2), Complementarity Determining Region 2 (CDR2), Framework Region 3 (FR3), Complementarity Determining Region 3 (CDR3), and Framework Region 4 (FR4), wherein:

(a) the CDR1 region comprises the amino acid sequence X1-Y-X2-M-X3, in which each of X1, X2, and X3 is independently selected from the group consisting of A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y;

(b) the CDR2 region comprises the amino acid sequence X4-I-X5-X6-S-G-G-X7-T-X8-Y-A-D-S-V-K-G, in which each of X4 and X5 is independently selected from the group consisting of Y, R, W, V, G, and S, X6 is selected from the group consisting of P and S, and each of X7 and X8 is independently selected from the group consisting of A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y; and

(c) the CDR3 region is captured from the CDR3 region of an immunoglobulin heavy chain variable gene from a B cell.
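As an aid to reading the CDR1 and CDR2 definitions in (a) and (b), the following Python sketch counts and enumerates the sequences each pattern permits. It assumes only what those two definitions state; the variable and function names are illustrative, and the captured CDR3 diversity of (c) is not modeled.

```python
# Illustrative sketch: theoretical diversity of the CDR1 and CDR2 patterns.
# Each pattern is a list of allowed residues per position.
from itertools import product
from math import prod

ANY19 = "ADEFGHIKLMNPQRSTVWY"   # the 19 residues listed for X1-X3, X7, X8 (no Cys)
SIX = "YRWVGS"                  # residues listed for X4 and X5

# CDR1: X1-Y-X2-M-X3
CDR1 = [ANY19, "Y", ANY19, "M", ANY19]

# CDR2: X4-I-X5-X6-S-G-G-X7-T-X8-Y-A-D-S-V-K-G
CDR2 = [SIX, "I", SIX, "PS", "S", "G", "G", ANY19, "T", ANY19,
        "Y", "A", "D", "S", "V", "K", "G"]


def diversity(template) -> int:
    """Number of distinct sequences a template allows."""
    return prod(len(options) for options in template)


def variants(template):
    """Lazily yield every sequence a template allows."""
    return ("".join(combo) for combo in product(*template))


CDR1_COUNT = diversity(CDR1)   # 19 * 19 * 19
CDR2_COUNT = diversity(CDR2)   # 6 * 6 * 2 * 19 * 19
```

Under these definitions CDR1 allows 6,859 sequences and CDR2 allows 25,992, so the synthetic portion alone allows their product before any CDR3 diversity is added.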

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic of various methods that may be employed to amplify VH genes without using primers specific for VH sequences.

FIG. 2 is a schematic of various methods that may be employed to amplify VL genes without using primers specific for VL sequences.

FIG. 3 is a schematic of RACE amplification of antibody heavy and light chains.

FIG. 4 depicts gel analysis of amplification products obtained after the primary PCR reaction from 4 different patient samples.


FIG. 5 depicts gel analysis of cleaved kappa DNA from Example 2.

FIG. 6 depicts gel analysis of extender-cleaved kappa DNA from Example 2.


FIG. 7 depicts gel analysis of the PCR product from the extender-kappa amplification from Example 2.

FIG. 8 depicts gel analysis of purified PCR product from the extender-kappa amplification from Example 2.

FIG. 9 depicts gel analysis of cleaved and ligated kappa light chains from Example 2.

FIG. 10 is a schematic of the design for CDR1 and CDR2 synthetic diversity.

FIG. 11 is a schematic of the cloning schedule for construction of the heavy chain repertoire.

FIG. 12 is a schematic of the cleavage and ligation of the antibody light chain.

FIG. 13 depicts gel analysis of cleaved and ligated lambda light chains from Example 4.

FIG. 14 is a schematic of the cleavage and ligation of the antibody heavy chain.

FIG. 15 depicts gel analysis of cleaved and ligated lambda light chains from Example 5.

FIG. 16 is a schematic of a phage display vector.

FIG. 17 is a schematic of a Fab cassette.

FIG. 18 is a schematic of a process for incorporating fixed FR1 residues in an antibody lambda sequence.

FIG. 19 is a schematic of a process for incorporating fixed FR1 residues in an antibody kappa sequence.

FIG. 20 is a schematic of a process for incorporating fixed FR1 residues in an antibody heavy chain sequence.


TERMS

In this application, the following terms and abbreviations are used:

Sense strand: The upper strand of ds DNA as usually written. In the sense strand, 5'-ATG-3' codes for Met.

Antisense strand: The lower strand of ds DNA as usually written. In the antisense strand, 3'-TAC-5' would correspond to a Met codon in the sense strand.

Forward primer: A forward primer is complementary to a part of the sense strand and primes for synthesis of a new antisense-strand molecule. Forward primer and lower-strand primer are equivalent.

Backward primer: A backward primer is complementary to a part of the antisense strand and primes for synthesis of a new sense-strand molecule. Backward primer and top-strand primer are equivalent.

Bases: Bases are specified either by their position in a vector or gene, or by their position within a gene by codon and base. For example, 89.1 is the first base of codon 89, and 89.2 is the second base of codon 89.

Sv: Streptavidin

Ap: Ampicillin

ap^R: A gene conferring ampicillin resistance.

RERS: Restriction endonuclease recognition site

RE: Restriction endonuclease; cleaves preferentially at RERS

URE: Universal restriction endonuclease

Functionally complementary: Two sequences are sufficiently complementary so as to anneal under the chosen conditions.

AA: Amino acid

PCR: Polymerase chain reaction

GLGs: Germline genes

Ab: Antibody, an immunoglobulin. The term also covers any protein having a binding domain which is homologous to an immunoglobulin binding domain. A few examples of antibodies within this definition are, inter alia, immunoglobulin isotypes and the Fab, F(ab')2, scFv, Fv, dAb and Fd fragments.

Fab: Two-chain molecule comprising an Ab light chain and part of a heavy chain.

scFv: A single-chain Ab comprising either VH::linker::VL or VL::linker::VH

w.t.: Wild type

HC: Heavy chain

LC: Light chain

VK: A variable domain of a kappa light chain.

VH: A variable domain of a heavy chain.

VL: A variable domain of a lambda light chain.
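The codon-and-base notation defined under "Bases" (for example, 89.1) and the sense/antisense convention can be made concrete with a short Python sketch. It assumes that codons are numbered from 1 at the start of the gene and that bases within a codon are numbered 1 to 3; the function names are illustrative only.

```python
# Illustrative sketch of the "codon.base" notation and of strand complementarity.
# Assumes codon 1 starts at base 1 of the gene (an assumption of this example).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def base_position(label: str) -> int:
    """Convert a label such as '89.1' to a 1-based base position in the gene."""
    codon_text, base_text = label.split(".")
    codon, base = int(codon_text), int(base_text)
    if codon < 1 or base not in (1, 2, 3):
        raise ValueError(f"invalid codon.base label: {label!r}")
    return 3 * (codon - 1) + base


def complement(seq: str) -> str:
    """Base-by-base complement, read in the same left-to-right order."""
    return seq.translate(COMPLEMENT)
```

For instance, `base_position("89.1")` gives 265, and `complement("ATG")` gives `TAC`, matching the 5'-ATG-3' and 3'-TAC-5' pairing described for the sense and antisense strands.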

In this application, when it is said that nucleic acids are cleaved "solely at the cleavage site" of a restriction endonuclease, it should be understood that minor cleavage may occur at random, e.g., at non-specific sites other than the specific cleavage site that is characteristic of the restriction endonuclease. The skilled worker will recognize that such non-specific, random cleavage is the usual occurrence. Accordingly, "solely at the cleavage site" of a restriction endonuclease means that cleavage occurs preferentially at the site characteristic of that endonuclease.

As used in this application and claims, the term "cleavage site formed by the complementation of the nucleic acid and the single-stranded region of the oligonucleotide" includes cleavage sites formed by the single-stranded portion of the partially double-stranded oligonucleotide duplexing with the single-stranded DNA, cleavage sites in the double-stranded portion of the partially double-stranded oligonucleotide, and cleavage sites introduced by the amplification primer used to amplify the single-stranded DNA-partially double-stranded oligonucleotide combination.

In the two methods of this invention for preparing single-stranded nucleic acid sequences, the first of those cleavage sites is preferred. In the methods of this invention for capturing diversity and cloning a family of diverse nucleic acid sequences, the latter two cleavage sites are preferred.


In this application, all references referred to are specifically incorporated by reference.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The nucleic acid sequences that are useful in the methods of this invention, i.e., those that encode at least in part the individual peptides, polypeptides and proteins displayed, or expressed in or comprising the libraries of this invention, may be native, synthetic or a combination thereof. They may be mRNA, DNA or cDNA. In the preferred embodiment, the nucleic acids encode antibodies. Most preferably, they encode Fabs.

The nucleic acids useful in this invention may be naturally diverse, synthetic diversity may be introduced into those naturally diverse members, or the diversity may be entirely synthetic. For example, synthetic diversity can be introduced into one or more CDRs of antibody genes. Preferably, it is introduced into CDR1 and CDR2 of immunoglobulins. Preferably, natural diversity is captured in the CDR3 regions of the immunoglobulin genes of this invention from B cells. Most preferably, the nucleic acids of this invention comprise a population of immunoglobulin genes that comprise synthetic diversity in at least one, and more preferably both, of the CDR1 and CDR2 and diversity in CDR3 captured from B cells.

Synthetic diversity may be created, for example, through the use of TRIM technology (U.S. 5,869,644). TRIM technology allows control over exactly which amino-acid types are allowed at variegated positions and in what proportions. In TRIM technology, codons to be diversified are synthesized using mixtures of trinucleotides. This allows any set of amino acid types to be included in any proportion.

Another alternative that may be used to generate diversified DNA is mixed oligonucleotide synthesis. With TRIM technology, one could allow Ala and Trp. With mixed oligonucleotide synthesis, a mixture that included Ala and Trp would also necessarily include Ser and Gly. The amino-acid types allowed at the variegated positions are picked with reference to the structure of antibodies, or other peptides, polypeptides or proteins of the family, the observed diversity in germline genes, the frequently observed somatic mutations, and the desired areas and types of variegation.
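The Ala/Trp statement can be checked with a small Python sketch based on the standard genetic code. It models mixed oligonucleotide synthesis as an independent mixture of bases at each codon position, which is an assumption of this example, and models trinucleotide-based synthesis as a mixture of whole codons. The choice of GCG for Ala and the function names are illustrative.

```python
# Illustrative sketch: why a per-position base mixture that covers Ala and Trp
# also yields Ser and Gly, whereas a mixture of whole codons does not.
from itertools import product

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

# Standard genetic code, indexed with bases in T, C, A, G order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def mixed_base_codons(wanted_codons):
    """Codons produced when each position is a mixture of the bases needed there."""
    mixes = [sorted({codon[pos] for codon in wanted_codons}) for pos in range(3)]
    return ["".join(combo) for combo in product(*mixes)]


def encoded_amino_acids(codons):
    """Sorted one-letter amino acids encoded by a collection of codons."""
    return sorted({CODON_TABLE[codon] for codon in codons})


WANTED = ["GCG", "TGG"]                                    # Ala, Trp
TRINUCLEOTIDE_RESULT = encoded_amino_acids(WANTED)         # whole-codon mixture
MIXED_BASE_RESULT = encoded_amino_acids(mixed_base_codons(WANTED))
```

Here the per-position mixture (G/T, C/G, G) produces the codons GCG, GGG, TCG and TGG, that is Ala, Gly, Ser and Trp, while the whole-codon mixture yields only Ala and Trp.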

In a preferred embodiment of this invention, the nucleic acid sequences for at least one CDR or other region of the peptides, polypeptides or proteins of the family are cDNAs produced by reverse transcription from mRNA. More preferably, the mRNAs are obtained from peripheral blood cells, bone marrow cells, spleen cells or lymph node cells (such as B-lymphocytes or plasma cells) that express members of naturally diverse sets of related genes. More preferably, the mRNAs encode a diverse family of antibodies. Most preferably, the mRNAs are obtained from patients suffering from at least one autoimmune disorder or cancer. Preferably, mRNAs from patients representing a high diversity of autoimmune diseases, such as systemic lupus erythematosus, systemic sclerosis, rheumatoid arthritis, antiphospholipid syndrome and vasculitis, are used.

In a preferred embodiment of this invention, the cDNAs are produced from the mRNAs using reverse transcription. In this preferred embodiment, the mRNAs are separated from the cell and degraded using standard methods, such that only the full length (i.e., capped) mRNAs remain. The cap is then removed and reverse transcription used to produce the cDNAs.

The reverse transcription of the first (antisense) strand can be done in any manner with any suitable primer. See, e.g., HJ de Haard et al., Journal of Biological Chemistry, 274(26):18218-30 (1999). In the preferred embodiment of this invention where the mRNAs encode antibodies, primers that are complementary to the constant regions of antibody genes may be used. Those primers are useful because they do not generate bias toward subclasses of antibodies. In another embodiment, poly-dT primers may be used (and may be preferred for the heavy-chain genes). Alternatively, sequences complementary to the primer may be attached to the termini of the antisense strand.

In one preferred embodiment of this invention, the reverse transcriptase primer may be biotinylated, thus allowing the cDNA product to be immobilized on streptavidin (Sv) beads. Immobilization can also be effected using a primer labeled at the 5' end with one of a) a free amine group, b) a thiol, c) a carboxylic acid, or d) another group not found in DNA that can react to form a strong bond to a known partner on an insoluble medium. If, for example, a free amine (preferably primary amine) is provided at the 5' end of a DNA primer, this amine can be reacted with carboxylic acid groups on a polymer bead using standard amide-forming chemistry. If such preferred immobilization is used during reverse transcription, the top strand RNA is degraded using well-known enzymes, such as a combination of RNAseH and RNAseA, either before or after immobilization.

The nucleic acid sequences useful in the methods of this invention are generally amplified before being used to display and/or express the peptides, polypeptides or proteins that they encode. Prior to amplification, the single-stranded DNAs may be cleaved using either of the methods described before. Alternatively, the single-stranded DNAs may be amplified and then cleaved using one of those methods.

Any of the well known methods for amplifying nucleic acid sequences may be used for such amplification. Methods that maximize, and do not bias, diversity are preferred. In a preferred embodiment of this invention where the nucleic acid sequences are derived from antibody genes, the present invention preferably utilizes primers in the constant regions of the heavy and light chain genes and primers to a synthetic sequence that is attached at the 5' end of the sense strand. Priming at such a synthetic sequence avoids the use of sequences within the variable regions of the antibody genes. Those variable region priming sites generate bias against V genes that are either of rare subclasses or that have been mutated at the priming sites. This bias is partly due to suppression of diversity within the primer region and partly due to lack of priming when many mutations are present in the region complementary to the primer. The methods disclosed in this invention have the advantage of not biasing the population of amplified antibody genes for particular V gene types.

The synthetic sequences may be attached to the 5' end of the DNA strand by various methods well known for ligating DNA sequences together. RT CapExtension is one preferred method.

In RT CapExtension (derived from Smart PCR™), a short overlap (5'-...GGG-3' in the upper-strand primer (USP-GGG) complements 3'-CCC...-5' in the lower strand) and reverse transcriptases are used so that the reverse complement of the upper-strand primer is attached to the lower strand.

FIGs. 1 and 2 show schematics for amplifying VH and VL genes using RT CapExtension. FIG. 1 shows a schematic of the amplification of VH genes. FIG. 1, Panel A shows a primer specific to the poly-dT region of the 3' UTR priming synthesis of the first, lower strand. Primers that bind in the constant region are also suitable. Panel B shows the lower strand extended at its 3' end by three Cs that are not complementary to the mRNA. Panel C shows the result of annealing a synthetic top-strand primer ending in three Gs that hybridize to the 3' terminal Cs and continuing the reverse transcription, which extends the lower strand by the reverse complement of the synthetic primer sequence. Panel D shows the result of PCR amplification using a 5' biotinylated synthetic top-strand primer that replicates the 5' end of the synthetic primer of panel C and a bottom-strand primer complementary to part of the constant domain. Panel E shows immobilized double-stranded (ds) cDNA obtained by using a 5'-biotinylated top-strand primer.

FIG. 2 shows a similar schematic for amplification of VL genes. FIG. 2, Panel A shows a primer specific to the constant region at or near the 3' end priming synthesis of the first, lower strand. Primers that bind in the poly-dT region are also suitable. Panel B shows the lower strand extended at its 3' end by three Cs that are not complementary to the mRNA. Panel C shows the result of annealing a synthetic top-strand primer ending in three Gs that hybridize to the 3' terminal Cs and continuing the reverse transcription, which extends the lower strand by the reverse complement of the synthetic primer sequence. Panel D shows the result of PCR amplification using a 5' biotinylated synthetic top-strand primer that replicates the 5' end of the synthetic primer of panel C and a bottom-strand primer complementary to part of the constant domain. The bottom-strand primer also contains a useful restriction endonuclease site, such as AscI. Panel E shows immobilized ds cDNA obtained by using a 5'-biotinylated top-strand primer.

In FIGs. 1 and 2, each V gene consists of a 5' untranslated region (UTR) and a secretion signal, followed by the variable region, followed by a constant region, followed by a 3' untranslated region (which typically ends in poly-A). An initial primer for reverse transcription may be complementary to the constant region or to the poly-A segment of the 3'-UTR. For human heavy-chain genes, a primer of 15 T residues is preferred. Reverse transcriptases attach several C residues to the 3' end of the newly synthesized DNA. RT CapExtension exploits this feature. The reverse transcription reaction is first run with only a lower-strand primer. After about 1 hour, a primer ending in GGG (USP-GGG) and more RTase are added. This causes the lower-strand cDNA to be extended by the reverse complement of the USP-GGG up to the final GGG. Using one primer identical to part of the attached synthetic sequence and a second primer complementary to a region of known sequence at the 3' end of the sense strand, all the V genes are amplified irrespective of their V gene subclass.
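The strand bookkeeping in RT CapExtension can be followed with a short Python sketch. It is a simplification under stated assumptions: the mRNA is written with DNA letters (T in place of U), first-strand synthesis is taken to copy the whole template, exactly three non-templated C residues are added, and the sequences and function names are hypothetical.

```python
# Illustrative sketch of RT CapExtension strand arithmetic (all strands 5'->3').
# Sequences and names are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def cap_extension(mrna_as_dna: str, usp: str, n_c: int = 3) -> str:
    """Return the lower (antisense) strand after extension on the USP-GGG primer.

    `usp` is the synthetic upper-strand primer sequence without its 3' GGG.
    """
    first_strand = reverse_complement(mrna_as_dna)  # copy of the mRNA template
    tailed = first_strand + "C" * n_c               # non-templated Cs added at the 3' end
    # The 3' GGG of USP-GGG pairs with the C tail; extension then copies the USP.
    return tailed + reverse_complement(usp)


MRNA = "ATGGCCAAG"   # hypothetical sense-strand sequence
USP = "TTCTAGAC"     # hypothetical synthetic upper-strand primer (GGG omitted)
LOWER = cap_extension(MRNA, USP)
UPPER = reverse_complement(LOWER)   # sense strand made in the first PCR cycle
```

The resulting sense strand begins with the synthetic primer sequence followed by GGG and then the original message, which is why a primer identical to the synthetic sequence can amplify every V gene regardless of subclass.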

In another preferred embodiment, synthetic sequences may be added by Rapid Amplification of cDNA Ends (RACE) (see Frohman, M.A., Dush, M.K., & Martin, G.R. (1988) Proc. Natl. Acad. Sci. USA 85: 8998-9002).

FIG. 3 shows a schematic of RACE amplification of antibody heavy and light chains. First, mRNA is selected by treating total or poly(A+) RNA with calf intestinal phosphatase (CIP) to remove the 5'-phosphate from all molecules that have them, such as ribosomal RNA, fragmented mRNA, tRNA and genomic DNA. Full length mRNA (containing a protective 7-methyl cap structure) is unaffected. The RNA is then treated with tobacco acid pyrophosphatase (TAP) to remove the cap structure from full length mRNAs, leaving a 5'-monophosphate group. Next, a synthetic RNA adaptor is ligated to the RNA population; only molecules which have a 5'-phosphate (uncapped, full length mRNAs) will accept the adaptor. Reverse transcriptase reactions using an oligo-dT primer, and nested PCR (using one adaptor primer (located in the 5' synthetic adaptor) and one primer for the gene), are then used to amplify the desired transcript.
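The selection logic of the CIP, TAP and adaptor-ligation steps can be summarized as a small state model in Python. The sketch tracks only the chemical state of each molecule's 5' end; the class and function names are illustrative assumptions, and no enzymatic efficiency is modeled.

```python
# Illustrative sketch: which RNA molecules accept the 5' adaptor in RACE.
# Only the state of the 5' end is modeled; names are hypothetical.
from dataclasses import dataclass, replace
from enum import Enum
from typing import List


class FivePrime(Enum):
    CAP = "7-methyl cap"
    PHOSPHATE = "5'-phosphate"
    HYDROXYL = "5'-hydroxyl"
    ADAPTOR = "RNA adaptor"


@dataclass(frozen=True)
class RNA:
    name: str
    end: FivePrime


def cip(molecule: RNA) -> RNA:
    """Phosphatase step: removes an exposed 5'-phosphate; capped ends are untouched."""
    if molecule.end is FivePrime.PHOSPHATE:
        return replace(molecule, end=FivePrime.HYDROXYL)
    return molecule


def tap(molecule: RNA) -> RNA:
    """Pyrophosphatase step: removes the cap, leaving a 5'-monophosphate."""
    if molecule.end is FivePrime.CAP:
        return replace(molecule, end=FivePrime.PHOSPHATE)
    return molecule


def ligate_adaptor(molecule: RNA) -> RNA:
    """Ligation step: only a 5'-phosphate end accepts the adaptor."""
    if molecule.end is FivePrime.PHOSPHATE:
        return replace(molecule, end=FivePrime.ADAPTOR)
    return molecule


def race_select(pool: List[RNA]) -> List[str]:
    """Names of molecules carrying the adaptor after CIP, then TAP, then ligation."""
    processed = [ligate_adaptor(tap(cip(m))) for m in pool]
    return [m.name for m in processed if m.end is FivePrime.ADAPTOR]


POOL = [
    RNA("full-length mRNA", FivePrime.CAP),
    RNA("fragmented mRNA", FivePrime.PHOSPHATE),
    RNA("ribosomal RNA", FivePrime.PHOSPHATE),
]
SELECTED = race_select(POOL)
```

The order of the steps matters: because the phosphatase acts before the cap is removed, only molecules that were capped at the outset carry a 5'-phosphate when the adaptor is offered.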

In a preferred embodiment of this invention, the upper strand or lower strand primer may also be biotinylated or labeled at the 5' end with one of a) a free amino group, b) a thiol, c) a carboxylic acid and d) another group not found in DNA that can react to form a strong bond to a known partner on an insoluble medium. These can then be used to immobilize the labeled strand after amplification. The immobilized DNA can be either single or double-stranded.

After amplification (using e.g., RT CapExtension or RACE), the DNAs of this invention are rendered single-stranded. For example, the strands can be separated by using a biotinylated primer, capturing the biotinylated product on streptavidin beads, denaturing the DNA, and washing away the complementary strand. Depending on which end of the captured DNA is wanted, one will choose to immobilize either the upper (sense) strand or the lower (antisense) strand.

To prepare the single-stranded amplified DNAs for cloning into genetic packages so as to effect display of, or for expression of, the peptides, polypeptides or proteins encoded, at least in part, by those DNAs, they must be manipulated to provide ends suitable for cloning and display and/or expression. In particular, any 5' untranslated regions and mammalian signal sequences must be removed and replaced, in frame, by a suitable signal sequence that functions in the display or expression host. Additionally, parts of the variable domains (in antibody genes) may be removed and replaced by synthetic segments containing synthetic diversity. The diversity of other gene families may likewise be expanded with synthetic diversity.

According to the methods of this invention, there are two ways to manipulate the single-stranded DNAs for display and/or expression. The first method comprises the steps of:

(i) contacting the nucleic acid with a single-stranded oligonucleotide, the oligonucleotide being functionally complementary to the nucleic acid in the region in which cleavage is desired and including a sequence that with its complement in the nucleic acid forms a restriction endonuclease recognition site that on restriction results in cleavage of the nucleic acid at the desired location; and

(ii) cleaving the nucleic acid solely at the recognition site formed by the complementation of the nucleic acid and the oligonucleotide.

In this first method, short oligonucleotides are annealed to the single-stranded DNA so that restriction endonuclease recognition sites formed within the now locally double-stranded regions of the DNA can be cleaved. In particular, a recognition site that occurs at the same position in a substantial fraction of the single-stranded DNAs is identified.

For antibody genes, this can be done using a catalog of germline sequences. See, e.g., http://www.mrc-cpe.cam.ac.uk/imt-doc/restricted/ok.html. Updates can be obtained from this site under the heading "Amino acid and nucleotide sequence alignments." For other families, similar comparisons exist and may be used to select appropriate regions for cleavage and to maintain diversity.

For example, Table 1 depicts the DNA sequences of the FR3 regions of the 51 known human VH germline genes. In this region, the genes contain restriction endonuclease recognition sites shown in Table 2. Restriction endonucleases that cleave a large fraction of germline genes at the same site are preferred over endonucleases that cut at a variety of sites. Furthermore, it is preferred that there be only one site for the restriction endonucleases within the region to which the short oligonucleotide binds on the single-stranded DNA, e.g., about 10 bases on either side of the restriction endonuclease recognition site.

An enzyme that cleaves downstream in FR3 is also more preferable because it captures fewer mutations in the framework. This may be advantageous in some cases. However, it is well known that framework mutations exist and can confer and enhance antibody binding. The present invention, by choice of an appropriate restriction site, allows all or part of FR3 diversity to be captured. Hence, the method also allows extensive diversity to be captured.

Finally, in the methods of this invention restriction endonucleases that are active between about 37°C and about 75°C are used. Preferably, restriction endonucleases that are active between about 45°C and about 75°C may be used. More preferably, enzymes that are active above 50°C, and most preferably active at about 55°C, are used. Such temperatures maintain the nucleic acid sequence to be cleaved in substantially single-stranded form.

Enzymes shown in Table 2 that cut many of the heavy chain FR3 germline genes at a single position include: MaeIII(24@4), Tsp45I(21@4), HphI(44@5), BsaJI(23@65), AluI(23@47), BlpI(21@48), DdeI(29@58), BglII(10@61), MslI(44@72), BsiEI(23@74), EaeI(23@74), EagI(23@74), HaeIII(25@75), Bst4CI(51@86), HpyCH4III(51@86), HinfI(38@2), MlyI(18@2), PleI(18@2), MnlI(31@67), HpyCH4V(21@44), BsmAI(16@11), BpmI(19@12), XmnI(12@30), and SacI(11@51). (The notation used means, for example, that BsmAI cuts 16 of the FR3 germline genes with a restriction endonuclease recognition site beginning at base 11 of FR3.)
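The count-at-position notation explained above can be reproduced with a short tally. The following is a minimal sketch in Python, not the procedure used to build Table 2: the helper names (`site_regex`, `tally_sites`, `notation`) and the toy sequences are assumptions made for illustration, and the toy sequences are not taken from Table 1.

```python
import re
from collections import Counter

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}

def site_regex(site):
    """Compile a recognition site; the lookahead lets overlapping sites be found."""
    body = "".join(IUPAC[base] for base in site.upper())
    return re.compile("(?=" + body + ")")

def tally_sites(site, sequences):
    """Map each 1-based start position to the number of sequences with a site there."""
    pattern = site_regex(site)
    counts = Counter()
    for seq in sequences:
        for match in pattern.finditer(seq.upper()):
            counts[match.start() + 1] += 1
    return counts

def notation(counts):
    """Format tallies as 'count@position', most frequent position first."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{n}@{pos}" for pos, n in ordered]

# Hypothetical toy sequences (illustration only).
toy = [
    "TTACTGTGCGAGA",
    "CTACTGTGCAAGA",
    "TTACCGTGCGAGA",
    "TTATTGTGCGAGA",
]
print(notation(tally_sites("ACNGT", toy)))  # ['3@3']
```

An enzyme whose tally is dominated by a single position is the kind of enzyme the text prefers, because most genes are then cut at the same place.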

For cleavage of human heavy chains in FR3, the preferred restriction endonucleases are: Bst4CI (or TaaI or HpyCH4III), BlpI, HpyCH4V, and MslI. Because ACNGT (the restriction endonuclease recognition site for Bst4CI, TaaI, and HpyCH4III) is found at a consistent site in all the human FR3 germline genes, one of those enzymes is the most preferred for capture of heavy chain CDR3 diversity. BlpI and HpyCH4V are complementary. BlpI cuts most members of the VH1 and VH4 families while HpyCH4V cuts most members of the VH3, VH5, VH6, and VH7 families. Neither enzyme cuts VH2s, but this is a very small family, containing only three members. Thus, these enzymes may also be used in preferred embodiments of the methods of this invention.

The restriction endonucleases HpyCH4III, Bst4CI, and TaaI all recognize 5'-ACnGT-3' and cut upper strand DNA after n and lower strand DNA before the base complementary to n. This is the most preferred restriction endonuclease recognition site for this method on human heavy chains because it is found in all germline genes. Furthermore, the restriction endonuclease recognition region (ACnGT) matches the second and third bases of a tyrosine codon (tay) and the following cysteine codon (tgy) as shown in Table 3. These codons are highly conserved, especially the cysteine in mature antibody genes.
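The cut geometry described above (upper strand cut after the n of ACnGT) can be sketched as follows. This is an illustrative helper only; the function name `cut_upper_strand` and the example sequence are assumptions, and the example is a made-up stretch ending in tyrosine and cysteine codons rather than a sequence from Table 3.

```python
import re

# 5'-ACnGT-3', where n is any base.
SITE = re.compile(r"AC[ACGT]GT")

def cut_upper_strand(seq):
    """Split an upper-strand sequence after the 'n' of the first ACnGT site.

    Returns (five_prime_fragment, three_prime_fragment), or None when the
    sequence carries no site.
    """
    match = SITE.search(seq.upper())
    if match is None:
        return None
    cut = match.start() + 3  # A, C and n lie on the 5' side of the cut
    return seq[:cut], seq[cut:]

# Hypothetical example: ... TAT TAC TGT GCG AGA (Tyr Tyr Cys ...).
example = "GCCGTGTATTACTGTGCGAGA"
print(cut_upper_strand(example))  # ('GCCGTGTATTACT', 'GTGCGAGA')
```

In this example the site spans the last two bases of TAC and all of TGT, which mirrors the codon relationship stated in the text.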

Table 4 E shows the distinct oligonucleotides of length 22 bases (except the last one, which is of length 20). Table 5 C shows the analysis of 1617 actual heavy chain antibody genes. Of these, 1511 have the site and match one of the candidate oligonucleotides to within 4 mismatches. Eight oligonucleotides account for most of the matches and are given in Table 4 F.1. The 8 oligonucleotides are very similar so that it is likely that satisfactory cleavage will be achieved with only one oligonucleotide (such as H43.77.97.1-02#1) by adjusting temperature, pH, salinity, and the like. One or two oligonucleotides may likewise suffice whenever the germline gene sequences differ very little and especially if they differ very little close to the restriction endonuclease recognition region to be cleaved. Table 5 D shows a repeat analysis of 1617 actual heavy chain antibody genes using only the 8 chosen oligonucleotides. This shows that 1463 of the sequences match at least one of the oligonucleotides to within 4 mismatches and have the site as expected.

Only 7 sequences have a second HpyCH4III restriction endonuclease recognition region in this region.

Another illustration of choosing an appropriate restriction endonuclease recognition site involves cleavage in FR1 of human heavy chains. Cleavage in FR1 allows capture of the entire CDR diversity of the heavy chain.

The germline genes for human heavy chain FR1 are shown in Table 6. Table 7 shows the restriction endonuclease recognition sites found in human germline gene FR1s. The preferred sites are BsgI(GTGCAG;39@4), BsoFI(GCngc;43@6,11@9,2@3,1@12), TseI(Gcwgc;43@6,11@9,2@3,1@12), MspA1I(CMGckg;46@7,2@1), PvuII(CAGctg;46@7,2@1), AluI(AGct;48@82@2), DdeI(Ctnag;22@52,9@48), HphI(tcacc;22@80), BssKI(Nccngg;35@39,2@40), BsaJI(Ccnngg;32@40,2@41), BstNI(CCwgg;33@40), ScrFI(CCngg;35@40,2@41), EcoO109I(RGgnccy;22@46,11@43), Sau96I(Ggncc;23@47,11@44), AvaII(Ggwcc;23@47,4@44), PpuMI(RGgwccy;22@46,4@43), BsmFI(gtccc;20@48), HinfI(Gantc;34@16,21@56,21@77), TfiI(21@77), MlyI(GAGTC;34@16), MlyI(gactc;21@56), and AlwNI(CAGnnnctg;22@68). The more preferred sites are MspA1I and PvuII. MspA1I and PvuII have 46 sites at 7-12 and 2 at 1-6. To avoid cleavage at both sites, oligonucleotides are used that do not fully cover the site at 1-6. Thus, the DNA will not be cleaved at that site. We have shown that DNA that extends 3, 4, or 5 bases beyond a PvuII-site can be cleaved efficiently.

Another illustration of choosing an appropriate restriction endonuclease recognition site involves cleavage in FR1 of human kappa light chains. Table 8 shows the human kappa FR1 germline genes and Table 9 shows restriction endonuclease recognition sites that are found in a substantial number of human kappa FR1 germline genes at consistent locations. Of the restriction endonuclease recognition sites listed, BsmAI and PflFI are the most preferred enzymes. BsmAI sites are found at base 18 in 35 of 40 germline genes. PflFI sites are found in 35 of 40 germline genes at base 12.

Another example of choosing an appropriate restriction endonuclease recognition site involves cleavage in FR1 of the human lambda light chain. Table 10 shows the 31 known human lambda FR1 germline gene sequences. Table 11 shows restriction endonuclease recognition sites found in human lambda FR1 germline genes. HinfI and DdeI are the most preferred restriction endonucleases for cutting human lambda chains in FR1.

After the appropriate site or sites for cleavage are chosen, one or more short oligonucleotides are prepared so as to functionally complement, alone or in combination, the chosen recognition site. The oligonucleotides also include sequences that flank the recognition site in the majority of the amplified genes. This flanking region allows the sequence to anneal to the single-stranded DNA sufficiently to allow cleavage by the restriction endonuclease specific for the site chosen.

The actual length and sequence of the oligonucleotide depends on the recognition site and the conditions to be used for contacting and cleavage. The length must be sufficient so that the oligonucleotide is functionally complementary to the single-stranded DNA over a large enough region to allow the two strands to associate such that cleavage may occur at the chosen temperature and at the desired location.

Typically, the oligonucleotides of this preferred method of the invention are about 17 to about 30 nucleotides in length. Below about 17 bases, annealing is too weak and above 30 bases there can be a loss of specificity. A preferred length is 18 to 24 bases.

Oligonucleotides of this length need not be identical complements of the germline genes. Rather, a few mismatches may be tolerated. Preferably, however, no more than 1-3 mismatches are allowed. Such mismatches do not adversely affect annealing of the oligonucleotide to the single-stranded DNA. Hence, the two DNAs are said to be functionally complementary.
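As a rough illustration of the length and mismatch criteria above, the following sketch tests whether an oligonucleotide would count as functionally complementary to a target region. It is a simplification under stated assumptions: the function names are hypothetical, the oligonucleotide and target region are taken to be the same length and already aligned, and real annealing depends on temperature and buffer conditions that this check ignores.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def mismatches(oligo, target_region):
    """Count mismatched positions when oligo anneals to target_region.

    Both sequences are written 5'->3' and must have equal length.
    """
    if len(oligo) != len(target_region):
        raise ValueError("oligo and target region must have equal length")
    perfect = reverse_complement(target_region)
    return sum(a != b for a, b in zip(oligo.upper(), perfect))

def is_functionally_complementary(oligo, target_region,
                                  max_mismatches=3, min_len=17, max_len=30):
    """Apply the length window (about 17-30) and mismatch limit (1-3) from the text."""
    if not (min_len <= len(oligo) <= max_len):
        return False
    return mismatches(oligo, target_region) <= max_mismatches

# Hypothetical 21-base target region (illustration only).
target = "GCCGTGTATTACTGTGCGAGA"
perfect = reverse_complement(target)
print(mismatches(perfect, target))                     # 0
print(is_functionally_complementary(perfect, target))  # True
```

The default limits are the values quoted in the text; tightening `max_mismatches` trades coverage of mutated genes for specificity.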

The second method to manipulate the single-stranded DNAs of this invention for display and/or expression comprises the steps of:

(i) contacting the nucleic acid with a partially double-stranded oligonucleotide, the single-stranded region of the oligonucleotide being functionally complementary to the nucleic acid in the region in which cleavage is desired, and the double-stranded region of the oligonucleotide having a restriction endonuclease recognition site; and

(ii) cleaving the nucleic acid solely at the cleavage site formed by the complementation of the nucleic acid and the single-stranded region of the oligonucleotide;

the contacting and the cleaving steps being performed at a temperature sufficient to maintain the nucleic acid in substantially single-stranded form, the oligonucleotide being functionally complementary to the nucleic acid over a large enough region to allow the two strands to associate such that cleavage may occur at the chosen temperature and at the desired location, and the cleavage being carried out using a restriction endonuclease that is active at the chosen temperature.

As explained above, the cleavage site may be formed by the single-stranded portion of the partially double-stranded oligonucleotide duplexing with the single-stranded DNA, the cleavage site may be carried in the double-stranded portion of the partially double-stranded oligonucleotide, or the cleavage site may be introduced by the amplification primer used to amplify the single-stranded DNA-partially double-stranded oligonucleotide combination. In this embodiment, the first is preferred. And, the restriction endonuclease recognition site may either be located in the double-stranded portion of the oligonucleotide or be introduced by the amplification primer, which is complementary to that double-stranded region, as used to amplify the combination.

Preferably, the restriction endonuclease site is that of a Type II-S restriction endonuclease, whose cleavage site is located at a known distance from its recognition site.
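The idea of a cleavage site at a known distance from the recognition site can be illustrated with a small calculation. This sketch is generic and makes two assumptions that are not stated in the text: the recognition sequence GGATG and the top-strand offset of 9 bases are the commonly published properties of FokI, and the function name `type_iis_cut_index` is hypothetical. It models only a single strand read 5' to 3', not the annealed adapter and target arrangement of the method.

```python
def type_iis_cut_index(seq, site="GGATG", offset=9):
    """Return the index at which the strand is cut, or None.

    The cut falls `offset` bases 3' of the end of the first recognition
    site. Defaults are the commonly published FokI values (an assumption
    of this sketch, not taken from the source text).
    """
    start = seq.upper().find(site)
    if start < 0:
        return None  # no recognition site present
    cut = start + len(site) + offset
    # The strand must extend at least as far as the cut position.
    return cut if cut <= len(seq) else None

# Hypothetical strand: 2 flanking bases, the site, 9 spacer bases, 3 more bases.
strand = "AA" + "GGATG" + "C" * 9 + "TTT"
cut = type_iis_cut_index(strand)
print(cut, strand[:cut], strand[cut:])
```

Because the offset is fixed, moving the recognition site by one base moves the cut by one base, which is what makes placement of the cut predictable.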

This second method, preferably, employs Universal Restriction Endonucleases (URE). UREs are partially double-stranded oligonucleotides. The single-stranded portion or overlap of the URE consists of a DNA adapter that is functionally complementary to the sequence to be cleaved in the single-stranded DNA. The double-stranded portion consists of a restriction endonuclease recognition site, preferably type II-S.

The URE method of this invention is specific and precise and can tolerate some (e.g., 1-3) mismatches in the complementary regions, i.e., it is functionally complementary to that region. Further, conditions under which the URE is used can be adjusted so that most of the genes that are amplified can be cut, reducing bias in the library produced from those genes.

The sequence of the single-stranded DNA adapter or overlap portion of the URE typically consists of about 14-22 bases. However, longer or shorter adapters may be used. The size depends on the ability of the adapter to associate with its functional complement in the single-stranded DNA and on the temperature used for contacting the URE and the single-stranded DNA and for cleaving the DNA with the restriction enzyme. The adapter must be functionally complementary to the single-stranded DNA over a large enough region to allow the two strands to associate such that the cleavage may occur at the chosen temperature and at the desired location. We prefer single-stranded or overlap portions of 14-17 bases in length, and more preferably 18-20 bases in length.

The site chosen for cleavage using the URE is preferably one that is substantially conserved in the family of amplified DNAs. As compared to the first cleavage method of this invention, these sites do not need to be endonuclease recognition sites. However, like the first method, the sites chosen can be synthetic rather than existing in the native DNA. Such sites may be chosen by reference to the sequences of known antibodies or other families of genes. For example, the sequences of many germline genes are reported at http://www.mrc-cpe.cam.ac.uk/imt-doc/restricted/ok.html. For example, one preferred site occurs near the end of FR3, from codon 89 through the second base of codon 93. CDR3 begins at codon 95.

The sequences of 79 human heavy-chain genes are also available at http://www.ncbi.nlm.nih.gov/entrez/nucleotide.html.

This site can be used to identify appropriate sequences for URE cleavage according to the methods of this invention. See, e.g., Table 12B.

Most preferably, one or more sequences are identified using these sites or other available sequence information. These sequences together are present in a substantial fraction of the amplified DNAs. For example, multiple sequences could be used to allow for known diversity in germline genes or for frequent somatic mutations. Synthetic degenerate sequences could also be used. Preferably, a sequence(s) that occurs in at least 65% of genes examined with no more than 2-3 mismatches is chosen.
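The selection criterion above (a set of sequences found in at least 65% of the genes examined, allowing a few mismatches) amounts to a coverage calculation. The sketch below is one simple way to express it, assuming the candidate region has already been extracted from each gene and aligned to the probe length. The function names and the toy data are hypothetical.

```python
def hamming(a, b):
    """Number of differing positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))

def coverage(probes, regions, max_mismatches=2):
    """Fraction of regions matched by at least one probe within the mismatch limit."""
    if not regions:
        raise ValueError("no regions supplied")
    hits = sum(
        1 for region in regions
        if any(hamming(probe, region) <= max_mismatches for probe in probes)
    )
    return hits / len(regions)

def meets_threshold(probes, regions, max_mismatches=2, min_fraction=0.65):
    """True when the probe set reaches the 65% figure given in the text."""
    return coverage(probes, regions, max_mismatches) >= min_fraction

# Hypothetical toy data: one probe, four aligned gene regions.
probes = ["ACGTACGT"]
regions = ["ACGTACGT", "ACGTACGA", "ACGAACGA", "TTTTACGT"]
print(coverage(probes, regions))         # 0.75
print(meets_threshold(probes, regions))  # True
```

Adding a second probe that matches the uncovered regions raises the coverage, which is the rationale the text gives for using multiple or degenerate sequences.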

URE single-stranded adapters or overlaps are then made to be complementary to the chosen regions. Conditions for using the UREs are determined empirically. These conditions should allow cleavage of DNA that contains the functionally complementary sequences with no more than 2 or 3 mismatches but that do not allow cleavage of DNA lacking such sequences.

As described above, the double-stranded portion of the URE includes an endonuclease recognition site, preferably a Type II-S recognition site. Any enzyme that is active at a temperature necessary to maintain the single-stranded DNA substantially in that form and to allow the single-stranded DNA adapter portion of the URE to anneal long enough to the single-stranded DNA to permit cleavage at the desired site may be used.

The preferred Type II-S enzymes for use in the URE methods of this invention provide asymmetrical cleavage of the single-stranded DNA. Among these are the enzymes listed in Table 13. The most preferred Type II-S enzyme is FokI.

When the preferred FokI-containing URE is used, several conditions are preferably used to effect cleavage:

1) Excess of the URE over target DNA should be present to activate the enzyme. URE present only in equimolar amounts to the target DNA would yield poor cleavage of ssDNA because the amount of active enzyme available would be limiting.

2) An activator may be used to activate part of the FokI enzyme to dimerize without causing cleavage. Examples of appropriate activators are shown in Table 14.

3) The cleavage reaction is performed at a temperature between 45°-75°C, preferably above 50°C and most preferably above 55°C.

The UREs used in the prior art contained a 14-base single-stranded segment, a 10-base stem (containing a FokI site), followed by the palindrome of the 10-base stem. While such UREs may be used in the methods of this invention, the preferred UREs of this invention also include a segment of three to eight bases (a loop) between the FokI restriction endonuclease recognition site containing segments. In the preferred embodiment, the stem (containing the FokI site) and its palindrome are also longer than 10 bases. Preferably, they are 10-14 bases in length. Examples of these "lollipop" URE adapters are shown in Table 15.

One example of using a URE to cleave a single-stranded DNA involves the FR3 region of human heavy chain. Table 16 shows an analysis of 840 full-length mature human heavy chains with the URE recognition sequences shown. The vast majority (718/840=0.85) will be recognized with 2 or fewer mismatches using five UREs (VHS881-1.1, VHS881-1.2, VHS881-2.1, VHS881-4.1, and VHS881-9.1). Each has a 20-base adaptor sequence to complement the germline gene, a ten-base stem segment containing a FokI site, a five-base loop, and the reverse complement of the first stem segment. Annealing those adapters, alone or in combination, to single-stranded antisense heavy chain DNA and treating with FokI in the presence of, e.g., the activator FOKIact, will lead to cleavage of the antisense strand at the position indicated.

Another example of using a URE(s) to cleave a single-stranded DNA involves the FR1 region of the human kappa light chains. Table 17 shows an analysis of 182 full-length human kappa chains for matching by the four 19-base probe sequences shown. Ninety-six percent of the sequences match one of the probes with 2 or fewer mismatches. The URE adapters shown in Table 17 are for cleavage of the sense strand of kappa chains. Thus, the adaptor sequences are the reverse complement of the germline gene sequences. The URE consists of a ten-base stem, a five-base loop, the reverse complement of the stem and the complementation sequence. The loop shown here is TTGTT, but other sequences could be used. Its function is to interrupt the palindrome of the stems so that formation of a lollipop monomer is favored over dimerization. Table 17 also shows where the sense strand is cleaved.
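The layout just described (stem, loop, reverse complement of the stem, then the complementation sequence) can be assembled programmatically. The sketch below is illustrative only: the function name `build_lollipop_ure`, the 10-base stem, and the 19-base adapter are assumptions of this example and have not been checked against Tables 15 or 17, and GGATG is assumed to be the FokI recognition sequence as commonly published. The TTGTT loop and the three-to-eight-base loop range come from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
FOKI_SITE = "GGATG"  # commonly published FokI recognition sequence (assumption)

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def build_lollipop_ure(adapter, stem, loop="TTGTT"):
    """Assemble stem + loop + reverse complement of stem + adapter (5'->3')."""
    stem, loop, adapter = stem.upper(), loop.upper(), adapter.upper()
    if not 3 <= len(loop) <= 8:
        raise ValueError("loop should be three to eight bases")
    stem_rc = reverse_complement(stem)
    # The folded stem is double-stranded, so the site may sit on either arm.
    if FOKI_SITE not in stem and FOKI_SITE not in stem_rc:
        raise ValueError("stem does not carry the recognition site")
    return stem + loop + stem_rc + adapter

# Hypothetical parts, for illustration only.
stem = "CACGGATGTG"
adapter = "A" * 19  # placeholder for a 19-base complementation sequence
ure = build_lollipop_ure(adapter, stem)
print(ure, len(ure))
```

Because the loop separates the two stem arms, the molecule can fold back on itself, which is the property the text relies on to favor the monomer over a dimer.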

Another example of using a URE to cleave a single-stranded DNA involves the human lambda light chain. Table 18 shows analysis of 128 human lambda light chains for matching the four 19-base probes shown. With three or fewer mismatches, 88 of 128 (69%) of the chains match one of the probes. Table 18 also shows URE adapters corresponding to these probes. Annealing these adapters to upper-strand ssDNA of lambda chains and treatment with FokI in the presence of FOKIact at a temperature at or above 45°C will lead to specific and precise cleavage of the chains.

The conditions under which the short oligonucleotide sequences of the first method and the UREs of the second method are contacted with the single-stranded DNAs may be empirically determined. The conditions must be such that the single-stranded DNA remains in substantially single-stranded form. More particularly, the conditions must be such that the single-stranded DNA does not form loops that may interfere with its association with the oligonucleotide sequence or the URE or that may themselves provide sites for cleavage by the chosen restriction endonuclease.

The effectiveness and specificity of short oligonucleotides (first method) and UREs (second method) can be adjusted by controlling the concentrations of the URE adapters/oligonucleotides and substrate DNA, the temperature, the pH, the concentration of metal ions, the ionic strength, the concentration of chaotropes (such as urea and formamide), the concentration of the restriction endonuclease (e.g., FokI), and the time of the digestion. These conditions can be optimized with synthetic oligonucleotides having: 1) target germline gene sequences, 2) mutated target gene sequences, or 3) somewhat related non-target sequences. The goal is to cleave most of the target sequences and minimal amounts of non-targets.

In accordance with this invention, the single-stranded DNA is maintained in substantially that form using a temperature between about 37°C and about 75°C. Preferably, a temperature between about 45°C and about 75°C is used. More preferably, a temperature between 50°C and 60°C, most preferably between 55°C and 60°C, is used. These temperatures are employed both when contacting the DNA with the oligonucleotide or URE and when cleaving the DNA using the methods of this invention.

The two cleavage methods of this invention have several advantages. The first method allows the individual members of the family of single-stranded DNAs to be cleaved preferentially at one substantially conserved endonuclease recognition site. The method also does not require an endonuclease recognition site to be built into the reverse transcription or amplification primers. Any native or synthetic site in the family can be used.

The second method has both of these advantages. In addition, the preferred URE method allows the single-stranded DNAs to be cleaved at positions where no endonuclease recognition site naturally occurs or has been synthetically constructed.


Most importantly, both cleavage methods permit the use of 5' and 3' primers so as to maximize diversity and then cleavage to remove unwanted or deleterious sequences before cloning, display and/or expression.

After cleavage of the amplified DNAs using one of the methods of this invention, the DNA is prepared for cloning, display and/or expression. This is done by using a partially duplexed synthetic DNA adapter, whose terminal sequence is based on the specific cleavage site at which the amplified DNA has been cleaved.

The synthetic DNA is designed such that, when it is ligated to the cleaved single-stranded DNA, it is in proper reading frame so that the desired peptide, polypeptide or protein can be displayed on the surface of the genetic package and/or expressed. Preferably, the double-stranded portion of the adapter comprises the sequence of several codons that encode the amino acid sequence characteristic of the family of peptides, polypeptides or proteins up to the cleavage site. For human heavy chains, the amino acids of the 3-23 framework are preferably used to provide the sequences required for expression of the cleaved DNA.

Preferably, the double-stranded portion of the adapter is about 12 to 100 bases in length. More preferably, about 20 to 100 bases are used. The double-stranded region of the adapter also preferably contains at least one endonuclease recognition site useful for cloning the DNA into a suitable display and/or expression vector (or a recipient vector used to archive the diversity). This endonuclease restriction site may be native to the germline gene sequences used to extend the DNA sequence. It may also be constructed using degenerate sequences to the native germline gene sequences. Or, it may be wholly synthetic.

The single-stranded portion of the adapter is complementary to the region of the cleavage in the single-stranded DNA. The overlap can be from about 2 bases up to about 15 bases. The longer the overlap, the more efficient the ligation is likely to be. A preferred length for the overlap is 7 to 10. This allows some mismatches in the region so that diversity in this region may be captured.

The single-stranded region or overlap of the partially duplexed adapter is advantageous because it allows DNA cleaved at the chosen site, but not other fragments, to be captured. Such fragments would contaminate the library with genes encoding sequences that will not fold into proper antibodies and are likely to be non-specifically sticky.

One illustration of the use of a partially duplexed adaptor in the methods of this invention involves ligating such adaptor to a human FR3 region that has been cleaved, as described above, at 5'-ACnGT-3' using HpyCH4III, Bst4CI or TaaI.

Table 4 F.2 shows the bottom strand of the double-stranded portion of the adaptor for ligation to the cleaved bottom-strand DNA. Since the HpyCH4III-site is so far to the right (as shown in Table 3), a sequence that includes the AflII-site as well as the XbaI-site can be added. This bottom strand portion of the partially-duplexed adaptor, H43.XAExt, incorporates both XbaI and AflII-sites. The top strand of the double-stranded portion of the adaptor has neither site (due to planned mismatches in the segments opposite the XbaI and AflII-sites of H43.XAExt), but will anneal very tightly to H43.XAExt. H43.AExt contains only the AflII-site and is to be used with the top strands H43.ABr1 and H43.ABr2 (which have intentional alterations to destroy the AflII-site).

After ligation, the desired, captured DNA can be PCR amplified again, if desired, using in the preferred embodiment a primer to the downstream constant region of the antibody gene and a primer to part of the double-stranded region of the adapter. The primers may also carry restriction endonuclease sites for use in cloning the amplified DNA.

After ligation, and perhaps amplification, of the partially double-stranded adapter to the single-stranded amplified DNA, the composite DNA is cleaved at chosen 5' and 3' endonuclease recognition sites.

The cleavage sites useful for cloning depend on the phage or phagemid or other vectors into which the cassette will be inserted and the available sites in the antibody genes. Table 19 provides restriction endonuclease data for 75 human light chains. Table 20 shows corresponding data for 79 human heavy chains. In each Table, the endonucleases are ordered by increasing frequency of cutting. In these Tables, Nch is the number of chains cut by the enzyme and Ns is the number of sites (some chains have more than one site).
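The two statistics defined above, Nch (chains cut) and Ns (total sites), and the ordering by increasing frequency of cutting, can be computed as in the following sketch. It is not the analysis behind Tables 19 and 20. The function names and toy chains are hypothetical, and the recognition sequences used (BstEII GGTNACC, NotI GCGGCCGC, XbaI TCTAGA) are the commonly published ones rather than values quoted in this passage. Only one strand is scanned, which is sufficient for these palindromic sites.

```python
import re

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}

def count_sites(site, seq):
    """Count occurrences (including overlapping ones) of a site in a sequence."""
    body = "".join(IUPAC[base] for base in site.upper())
    return len(re.findall("(?=" + body + ")", seq.upper()))

def cutting_table(enzymes, chains):
    """Return (name, Nch, Ns) rows ordered by increasing Nch, then by name."""
    rows = []
    for name, site in enzymes.items():
        per_chain = [count_sites(site, chain) for chain in chains]
        nch = sum(1 for n in per_chain if n > 0)  # chains cut at least once
        ns = sum(per_chain)                       # total number of sites
        rows.append((name, nch, ns))
    return sorted(rows, key=lambda row: (row[1], row[0]))

# Commonly published recognition sequences (assumption of this sketch).
enzymes = {"BstEII": "GGTNACC", "NotI": "GCGGCCGC", "XbaI": "TCTAGA"}

# Hypothetical toy chains, for illustration only.
chains = ["AAGGTCACCTT", "GGTGACCAAGGTTACC", "TCTAGAAAA", "ACGTACGT"]

for row in cutting_table(enzymes, chains):
    print(row)
```

Enzymes at the top of such a table cut few or no chains and are therefore candidates for cloning sites, since they are unlikely to cleave within the captured genes.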

From this analysis, SfiI, NotI, AflII, ApaLI, and AscI are very suitable. SfiI and NotI are preferably used in pCES1 to insert the heavy-chain display segment. ApaLI and AscI are preferably used in pCES1 to insert the light-chain display segment.

BstEII-sites occur in 97% of germ-line JH genes. In rearranged V genes, only 54/79 (68%) of heavy-chain genes contain a BstEII-site and 7/61 of these contain two sites. Thus, 47/79 (59%) contain a single BstEII-site. An alternative to using BstEII is to cleave via UREs at the end of JH and ligate to a synthetic oligonucleotide that encodes part of CH1.

One example of preparing a family of DNA sequences using the methods of this invention involves capturing human CDR3 diversity. As described above, mRNAs from various autoimmune patients are reverse transcribed into lower strand cDNA. After the top strand RNA is degraded, the lower strand is immobilized and a short oligonucleotide used to cleave the cDNA upstream of CDR3. A partially duplexed synthetic DNA adapter is then annealed to the DNA and the DNA is amplified using a primer to the adapter and a primer to the constant region (after FR4). The DNA is then cleaved using BstEII (in FR4) and a restriction endonuclease appropriate to the partially double-stranded adapter (e.g., XbaI and AflII (in FR3)). The DNA is then ligated into a synthetic VH skeleton such as 3-23.

One example of preparing a single-stranded DNA that was cleaved using the URE method involves the human kappa chain. The cleavage site in the sense strand of this chain is depicted in Table 17. The oligonucleotide kapextURE is annealed to the oligonucleotides (kaBR01UR, kaBR02UR, kaBR03UR, and kaBR04UR) to form a partially duplex DNA. This DNA is then ligated to the cleaved soluble kappa chains. The ligation product is then amplified using primers kapextUREPCR and CKForeAsc (which inserts an AscI site after the end of C kappa). This product is then cleaved with ApaLI and AscI and ligated to similarly cut recipient vector.

Another example involves the cleavage of lambda light chains, illustrated in Table 18. After cleavage, an extender (ON_LamEx133) and four bridge oligonucleotides (ON_LamB1-133, ON_LamB2-133, ON_LamB3-133, and ON_LamB4-133) are annealed to form a partially duplex DNA. That DNA is ligated to the cleaved lambda-chain sense strands. After ligation, the DNA is amplified with ON_Lam133PCR and a forward primer specific to the lambda constant domain, such as CL2ForeAsc or CL7ForeAsc (Table 130).

In human heavy chains, one can cleave almost all genes in FR4 (downstream, i.e., toward the 3' end of the sense strand, of CDR3) at a BstEII-site that occurs at a constant position in a very large fraction of human heavy-chain V genes. One then needs a site in FR3, if only CDR3 diversity is to be captured, in FR2, if CDR2 and CDR3 diversity is wanted, or in FR1, if all the CDR diversity is wanted. These sites are preferably inserted as part of the partially double-stranded adaptor.

The preferred process of this invention is to provide recipient vectors (e.g., for display and/or expression) having sites that allow cloning of either light or heavy chains. Such vectors are well known and widely used in the art. A preferred phage display vector in accordance with this invention is phage MALIA3. This displays in gene III. The sequence of the phage MALIA3 is shown in Table 21A (annotated) and Table 21B (condensed).

The DNA encoding the selected regions of the light or heavy chains can be transferred to the vectors using endonucleases that cut either light or heavy chains only very rarely. For example, light chains may be captured with ApaLI and AscI. Heavy-chain genes are preferably cloned into a recipient vector having SfiI, NcoI, XbaI, AflII, BstEII, ApaI, and NotI sites. The light chains are preferably moved into the library as ApaLI-AscI fragments. The heavy chains are preferably moved into the library as SfiI-NotI fragments.

Most preferably, the display is had on the surface of a derivative of M13 phage. The most preferred vector contains all the genes of M13, an antibiotic resistance gene, and the display cassette. The preferred vector is provided with restriction sites that allow introduction and excision of members of the diverse family of genes, as cassettes. The preferred vector is stable against rearrangement under the growth conditions used to amplify phage.

In another embodiment of this invention, the diversity captured by the methods of the present invention may be displayed and/or expressed in a phagemid vector (e.g., pCES1) that displays and/or expresses the peptide, polypeptide or protein. Such vectors may also be used to store the diversity for subsequent display and/or expression using other vectors or phage.

In another embodiment of this invention, the diversity captured by the methods of the present invention may be displayed and/or expressed in a yeast vector.

In another embodiment, the mode of display may be through a short linker to anchor domains, one possible anchor comprising the final portion of M13 III (IIIstump) and a second possible anchor being the full-length III mature protein.

The IIIstump fragment contains enough of M13 III to assemble into phage but not the domains involved in mediating infectivity. Because the w.t. III proteins are present, the phage is unlikely to delete the antibody genes, and phage that do delete these segments receive only a very small growth advantage.

For each of the anchor domains, the DNA encodes the w.t. AA sequence, but differs from the w.t. DNA sequence to a very high extent. This will greatly reduce the potential for homologous recombination between the anchor and the w.t. gene that is also present (see Example 6).

Most preferably, the present invention uses a complete phage carrying an antibiotic-resistance gene (such as an ampicillin-resistance gene) and the display cassette. Because the w.t. iii and possibly viii genes are present, the w.t. proteins are also present. The display cassette is transcribed from a regulatable promoter (e.g., P_LacZ). Use of a regulatable promoter allows control of the ratio of the fusion display gene to the corresponding w.t. coat protein. This ratio determines the average number of copies of the display fusion per phage (or phagemid) particle.

Another aspect of the invention is a method of displaying peptides, polypeptides or proteins (and particularly Fabs) on filamentous phage. In the most preferred embodiment this method displays Fabs and comprises:

a) obtaining a cassette capturing a diversity of segments of DNA encoding the elements:

P_reg::RBS1::SS1::VL::CL::stop::RBS2::SS2::VH::CH1::linker::anchor::stop::,

where P_reg is a regulatable promoter, RBS1 is a first ribosome binding site, SS1 is a signal sequence operable in the host strain, VL is a member of a diverse set of light-chain variable regions, CL is a light-chain constant region, stop is one or more stop codons, RBS2 is a second ribosome binding site, SS2 is a second signal sequence operable in the host strain, VH is a member of a diverse set of heavy-chain variable regions, CH1 is an antibody heavy-chain first constant domain, linker is a sequence of amino acids of one to about 50 residues, anchor is a protein that will assemble into the filamentous phage particle and stop is a second example of one or more stop codons; and

b) positioning that cassette within the phage genome to maximize the viability of the phage and to minimize the potential for deletion of the cassette or parts thereof.
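The element order in step (a) can be written down as data and checked mechanically. The sketch below is illustrative only: the function name is hypothetical, and treating "about 50" as a hard upper bound of 50 residues is an assumption made for the example.

```python
# Element order of the display cassette as listed in step (a).
CASSETTE_ORDER = [
    "P_reg", "RBS1", "SS1", "VL", "CL", "stop",
    "RBS2", "SS2", "VH", "CH1", "linker", "anchor", "stop",
]

def check_cassette(elements, linker_aa_len):
    """Return a list of problems; an empty list means the layout matches."""
    problems = []
    if list(elements) != CASSETTE_ORDER:
        problems.append("element order differs from the layout in step (a)")
    # Assumption: 'about 50' is treated as a hard limit of 50 residues.
    if not 1 <= linker_aa_len <= 50:
        problems.append("linker length outside one to about 50 residues")
    return problems

print(check_cassette(CASSETTE_ORDER, linker_aa_len=15))
```

A design with the light-chain and heavy-chain segments exchanged, or with a missing stop, would be reported as a deviation from the listed order.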

The DNA encoding the anchor protein in the above preferred cassette should be designed to encode the same (or a closely related) amino acid sequence as is found in one of the coat proteins of the phage, but with a distinct DNA sequence. This is to prevent unwanted homologous recombination with the w.t. gene.

In addition, the cassette should be placed in the intergenic region. The positioning and orientation of the display cassette can influence the behavior of the phage.

In one embodiment of the invention, a transcription terminator (e.g., Trp) may be placed after the second stop of the display cassette above. This will reduce interaction between the display cassette and other genes in the phage antibody display vector.

In another embodiment of the methods of this invention, the phage or phagemid can display and/or express proteins other than Fab, by replacing the Fab portions indicated above, with other protein genes.

Various hosts can be used in the display and/or expression aspect of this invention. Such hosts are well known in the art. In the preferred embodiment, where Fabs are being displayed and/or expressed, the preferred host should grow at 30°C and be RecA⁻ (to reduce unwanted genetic recombination) and EndA⁻ (to make recovery of RF DNA easier). It is also preferred that the host strain be easily transformed by electroporation.

XL1-Blue MRF' satisfies most of these preferences, but does not grow well at 30°C. XL1-Blue MRF' does grow slowly at 38°C and thus is an acceptable host. TG-1 is also an acceptable host although it is RecA⁺ and EndA⁺. XL1-Blue MRF' is more preferred for the intermediate host used to accumulate diversity prior to final construction of the library.

After display and/or expression, the libraries of this invention may be screened using well known and conventionally used techniques. The selected peptides, polypeptides or proteins may then be used to treat disease. Generally, the peptides, polypeptides or proteins for use in therapy or in pharmaceutical compositions are produced by isolating the DNA encoding the desired peptide, polypeptide or protein from the member of the library selected. That DNA is then used in conventional methods to produce the peptide, polypeptide or protein it encodes in appropriate host cells, preferably mammalian host cells, e.g., CHO cells. After isolation, the peptide, polypeptide or protein is used alone or with pharmaceutically acceptable compositions in therapy to treat disease.

EXAMPLES

Example 1: RACE amplification of heavy and light chain antibody repertoires from autoimmune patients.

Total RNA was isolated from individual blood samples (50 ml) of 11 patients using an RNAzol™ kit (CINNA/Biotecx), as described by the manufacturer. The patients were diagnosed as follows:

1. SLE and phospholipid syndrome

2. Limited systemic sclerosis

3. SLE and Sjogren syndrome

4. Limited systemic sclerosis

5. Rheumatoid arthritis with active vasculitis

6. Limited systemic sclerosis and Sjogren syndrome

7. Rheumatoid arthritis and (not active) vasculitis

8. SLE and Sjogren syndrome

9. SLE

10. SLE and (active) glomerulonephritis

11. Polyarthritis/Raynaud's phenomenon

From these 11 samples of total RNA, Poly-A+ RNA was isolated using the Promega PolyATtract® mRNA Isolation kit (Promega).

250 ng of each poly-A+ RNA sample was used to amplify antibody heavy and light chains with the GeneRacer™ kit (Invitrogen cat no. L1500-01). A schematic overview of the RACE procedure is shown in FIG. 3.

Using the general protocol of the GeneRacer™ kit, an RNA adaptor was ligated to the 5' end of all mRNAs. Next, a reverse transcriptase reaction was performed in the presence of an oligo(dT15) specific primer under the conditions described by the manufacturer in the GeneRacer™ kit.

1/5 of the cDNA from the reverse transcriptase reaction was used in a 20 ul PCR reaction. For amplification of the heavy chain IgM repertoire, a forward primer based on the CH1 chain of IgM [HuCmFOR] and a backward primer based on the ligated synthetic adaptor sequence [5'A] were used (see Table 22).

For amplification of the kappa and lambda light chains, a forward primer that contains the 3' coding-end of the cDNA [HuCkFor and HuCLFor2+HuCLFor7] and a backward primer based on the ligated synthetic adapter sequence [5'A] were used (see Table 22).

Specific amplification products after 30 cycles of primary PCR were obtained.

FIG. 4 shows the amplification products obtained after the primary PCR reaction from 4 different patient samples. 8 ul of primary PCR product from 4 different patients was analyzed on an agarose gel [labeled 1, 2, 3 and 4]. For the heavy chain, a product of approximately 950 nt is obtained, while for the kappa and lambda light chains the product is approximately 850 nt. M1-2 are molecular weight markers.

PCR products were also analyzed by DNA sequencing [10 clones from the lambda, kappa or heavy chain repertoires]. All sequenced antibody genes recovered contained the full coding sequence as well as the 5' leader sequence, and the V gene diversity was the expected diversity (compared to literature data).

ng of all samples from all 11 individual amplified samples were mixed for heavy, lambda light or kappa light chains and used in secondary PCR reactions.

In all secondary PCRs approximately 1 ng template DNA from the primary PCR mixture was used in multiple 50 ul PCR reactions [25 cycles].

For the heavy chain, a nested biotinylated forward primer [HuCm-Nested] was used, and a nested 5'end backward primer located in the synthetic adapter-sequence [5'NA] was used. The 5'end lower-strand of the heavy chain was biotinylated.

For the light chains, a 5'end biotinylated nested primer in the synthetic adapter was used [5'NA] in combination with a 3'end primer in the constant region of Ckappa and Clambda, extended with a sequence coding for the AscI restriction site [kappa: HuCkForAscI, Lambda: HuCL2-FOR-ASC + HuCL7-FOR-ASC]. [5'end Top strand DNA was biotinylated]. After gel-analysis the secondary PCR products were pooled and purified with Promega Wizard PCR cleanup.

Approximately 25 ug biotinylated heavy chain, lambda and kappa light chain DNA was isolated from the 11 patients.

Example 2: Capturing kappa chains with BsmAI.

A repertoire of human-kappa chain mRNAs was prepared using the RACE method of Example 1 from a collection of patients having various autoimmune diseases.

This Example followed the protocol of Example 1. Approximately 2 micrograms (ug) of human kappa-chain (Igkappa) gene RACE material with biotin attached to the 5'-end of the upper strand was immobilized as in Example 1 on 200 microliters (µL) of Seradyn magnetic beads.

The lower strand was removed by washing the DNA with 2 aliquots of 200 µL of 0.1 M NaOH (pH 13), for 3 minutes for the first aliquot followed by 30 seconds for the second aliquot. The beads were neutralized with 200 µL of 10 mM Tris (pH 7.5) 100 mM NaCl. The short oligonucleotides shown in Table 23 were added in 40 fold molar excess in 100 µL of NEB buffer 2 (50 mM NaCl, 10 mM Tris-HCl, 10 mM MgCl₂, 1 mM dithiothreitol pH 7.9) to the dry beads. The mixture was incubated at 95°C for 5 minutes then cooled down to 55°C over 30 minutes. Excess oligonucleotide was washed away with 2 washes of NEB buffer 3 (100 mM NaCl, 50 mM Tris-HCl, 10 mM MgCl₂, 1 mM dithiothreitol pH 7.9). Ten units of BsmAI (NEB) were added in NEB buffer 3 and incubated for 1 h at 55°C. The cleaved downstream DNA was collected and purified over a Qiagen PCR purification column (FIGs. 5 and 6).

FIG. 5 shows an analysis of digested kappa single-stranded DNA. Approximately 151.5 pmol of adapter was annealed to 3.79 pmol of immobilized kappa single-stranded DNA followed by digestion with 15 U of BsmAI. The supernatant containing the desired DNA was removed and analyzed by 5% polyacrylamide gel along with the remaining beads which contained uncleaved full length kappa DNA. 189 pmol of cleaved single-stranded DNA was purified for further analysis. Five percent of the original full length ssDNA remained on the beads.

FIG. 6 shows an analysis of the extender cleaved kappa ligation. 180 pmol of pre-annealed bridge/extender was ligated to 1.8 pmol of BsmAI digested single-stranded DNA. The ligated DNA was purified by Qiagen PCR purification column and analyzed on a 5% polyacrylamide gel. Results indicated that the ligation of extender to single-stranded DNA was 95% efficient.
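The molar excesses quoted for FIGs. 5 and 6 follow directly from the stated picomole amounts. A minimal sketch of that arithmetic (the function names are illustrative):

```python
def fold_molar_excess(oligo_pmol, template_pmol):
    """Molar excess of oligonucleotide over template, as a fold value."""
    return oligo_pmol / template_pmol

def oligo_needed_pmol(template_pmol, fold):
    """Picomoles of oligonucleotide needed for a target fold excess."""
    return template_pmol * fold

# FIG. 5: 151.5 pmol adapter annealed to 3.79 pmol immobilized kappa ssDNA.
print(round(fold_molar_excess(151.5, 3.79)))
# FIG. 6: 180 pmol bridge/extender ligated to 1.8 pmol digested ssDNA.
print(round(fold_molar_excess(180, 1.8)))
```

The first ratio rounds to the 40 fold excess stated for the short oligonucleotides, and the second to the 100 fold excess used for the adaptor ligation.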

A partially double-stranded adaptor was prepared using the oligonucleotide shown in Table 23.

The adaptor was added to the single-stranded DNA in 100 fold molar excess along with 1000 units of T4 DNA ligase and incubated overnight at 16°C. The excess oligonucleotide was removed with a Qiagen PCR purification column. The ligated material was amplified by PCR using the primers kapPCRtl and kapfor shown in Table 23 for 10 cycles with the program shown in Table 24.

The soluble PCR product was run on a gel and showed a band of approximately 700 n, as expected (FIGs. 7 and 8). The DNA was cleaved with enzymes ApaLI and AscI, gel purified, and ligated to similarly cleaved vector pCES1.

FIG. 7 shows an analysis of the PCR product from the extender-kappa amplification. Ligated extender-kappa single-stranded DNA was amplified with primers specific to the extender and to the constant region of the light chain. Two different template concentrations, 10 ng versus 50 ng, were used as template and 13 cycles were used to generate approximately 1.5 ug of dsDNA as shown by 0.8% agarose gel analysis.

FIG. 8 shows an analysis of the purified PCR product from the extender-kappa amplification. Approximately 5 ug of PCR amplified extender-kappa double-stranded DNA was run out on a 0.8% agarose gel, cut out, and extracted with a GFX gel purification column. By gel analysis, 3.5 ug of double-stranded DNA was prepared.

The assay for capturing kappa chains with BsmAI was repeated and produced similar results.

FIG. 9A shows the DNA after it was cleaved and collected and purified over a Qiagen PCR purification column.

FIG. 9B shows the partially double-stranded adaptor ligated to the single-stranded DNA. This ligated material was then amplified (FIG. 9C). The gel showed a band of approximately 700 n.

Table 25 shows the DNA sequence of a kappa light chain captured by this procedure. Table 26 shows a second sequence captured by this procedure. The closest bridge sequence was complementary to the sequence 5'-agccacc-3’, but the sequence captured reads 5'-Tgccacc-3’, showing that some mismatch in the overlapped region is tolerated.
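The single-base difference between the sequence targeted by the bridge and the sequence actually captured can be verified with a short comparison. This is a sketch only: the helper names are illustrative, and the bridge segment shown is simply the reverse complement of the 7-base target quoted above, not a sequence taken from Table 23.

```python
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence (case preserved)."""
    return seq.translate(_COMPLEMENT)[::-1]

def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a.lower(), b.lower()))

target = "agccacc"     # sequence the closest bridge is complementary to
captured = "Tgccacc"   # sequence actually captured
print(reverse_complement(target), mismatches(target, captured))
```

One mismatch in seven overlapped bases is consistent with the statement that some mismatch in the overlapped region is tolerated.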

Example 3: Construction of Synthetic CDR1 and CDR2 Diversity in V-3-23 VH Framework.

Synthetic diversity in Complementarity Determining Regions (CDR) 1 and 2 was created in the 3-23 VH framework in a two step process: first, a vector containing the 3-23 VH framework was constructed; and then, a synthetic CDR 1 and 2 was assembled and cloned into this vector.

For construction of the 3-23 VH framework, 8 oligonucleotides and two PCR primers that overlap (long oligonucleotides: TOPFR1A, BOTFR1B, BOTFR2, BOTFR3, F06, BOTFR4, ON-vgC1, and ON-vgC2; primers: SFPRMET and BOTPCRPRIM, shown in Table 27) were designed based on the GenBank sequence of the 3-23 VH framework region. The design incorporated at least one useful restriction site in each framework region, as shown in Table 27. In Table 27, the segments that were synthesized are shown as bold, the overlapping regions are underscored, and the PCR priming regions at each end are underscored.

A mixture of these 8 oligos was combined at a final concentration of 2.5uM in a 20ul PCR reaction. The PCR mixture contained 200uM dNTPs, 2.5mM MgCl₂, 0.02U Pfu Turbo™ DNA Polymerase, 1U Qiagen HotStart Taq DNA Polymerase, and 1X Qiagen PCR buffer. The PCR program consisted of 10 cycles of 94°C for 30s, 55°C for 30s, and 72°C for 30s.

The assembled 3-23 VH DNA sequence was then amplified, using 2.5ul of a 10-fold dilution from the initial PCR in a 100ul PCR reaction. The PCR reaction contained 200uM dNTPs, 2.5mM MgCl₂, 0.02U Pfu Turbo™ DNA Polymerase, 1U Qiagen HotStart Taq DNA Polymerase, 1X Qiagen PCR Buffer and 2 outside primers (SFPRMET and BOTPCRPRIM) at a concentration of 1uM. The PCR program consisted of 23 cycles at 94°C for 30s, 55°C for 30s, and 72°C for 60s. The 3-23 VH DNA sequence was digested and cloned into pCES1 (phagemid vector) using the SfiI and BstEII restriction endonuclease sites.

All restriction enzymes mentioned herein were supplied by New England BioLabs, Beverly, MA and used as per the manufacturer's instructions.

Stuffer sequences (shown in Table 28 and Table 29) were introduced into pCES1 to replace CDR1/CDR2 sequences (900 bases between BspEI and XbaI RE sites) and CDR3 sequences (358 bases between AflII and BstEII) prior to cloning the CDR1/CDR2 diversity. This new vector was termed pCES5 and its sequence is given in Table 29.

Having stuffers in place of the CDRs avoids the risk that a parental sequence would be over-represented in the library. The stuffer sequences are fragments from the penicillinase gene of E. coli. The CDR1-2 stuffer contains restriction sites for BglII, Bsu36I, BclI, XcmI, MluI, PvuII, HpaI, and HincII, the underscored sites being unique within the vector pCES5.

The stuffer that replaces CDR3 contains the unique restriction endonuclease site BsrII.

A schematic representation of the design for CDR1 and CDR2 synthetic diversity is shown in FIG. 10.

The design was based on the presence of mutations in DP47/3-23 and related germline genes. Diversity was designed to be introduced at the positions within CDR1 and CDR2 indicated by the numbers in FIG. 10. The diversity at each position was chosen to be one of the three following schemes: 1 = ADEFGHIKLMNPQRSTVWY; 2 = YRWVGS; 3 = PS, in which letters encode equimolar mixes of the indicated amino acids.
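The theoretical amino-acid diversity of such a design is the product of the number of residues allowed at each variegated position. The sketch below uses the three schemes exactly as given; the function name is illustrative, and the list of per-position scheme numbers passed to it is hypothetical, since the actual assignment is given only in FIG. 10.

```python
from math import prod

# The three variegation schemes given in the text.
SCHEMES = {
    1: "ADEFGHIKLMNPQRSTVWY",  # 19 amino acids (all except Cys)
    2: "YRWVGS",               # 6 amino acids
    3: "PS",                   # 2 amino acids
}

def theoretical_diversity(position_schemes):
    """Number of distinct amino-acid sequences for a list of scheme numbers,
    one entry per variegated position."""
    return prod(len(SCHEMES[s]) for s in position_schemes)

# Hypothetical example: one position of each scheme.
print(theoretical_diversity([1, 2, 3]))
```

Comparing this product with the number of independent transformants obtained indicates how completely a library samples the designed diversity.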

For the construction of the CDR1 and CDR2 diversity, 4 overlapping oligonucleotides (ON-vgC1, ON_Br12, ON_CD2Xba, and ON-vgC2, shown in Table 27 and Table 30) encoding CDR1/2, plus flanking regions, were designed. A mixture of these 4 oligos was combined at a final concentration of 2.5uM in a 40ul PCR reaction. Two of the 4 oligos contained variegated sequences positioned at the CDR1 and the CDR2. The PCR mixture contained 200uM dNTPs, 2.5U Pwo DNA Polymerase (Roche), and 1X Pwo PCR buffer with 2mM MgSO₄. The PCR program consisted of 10 cycles at 94°C for 30s, 60°C for 30s, and 72°C for 60s. This assembled CDR1/2 DNA sequence was amplified, using 2.5ul of the mixture in a 100ul PCR reaction. The PCR reaction contained 200uM dNTPs, 2.5U Pwo DNA Polymerase, 1X Pwo PCR Buffer with 2mM MgSO₄ and 2 outside primers at a concentration of 1uM. The PCR program consisted of 10 cycles at 94°C for 30s, 60°C for 30s, and 72°C for 60s. These variegated sequences were digested and cloned into the 3-23 VH framework in place of the CDR1/2 stuffer.

We obtained approximately 7 X 10⁷ independent transformants. CDR3 diversity either from donor populations or from synthetic DNA can be cloned into the vector containing synthetic CDR1 and CDR 2 diversity.

A schematic representation of this procedure is shown in FIG. 11. A sequence encoding the FR regions of the human V3-23 gene segment and CDR regions with synthetic diversity was made by oligonucleotide assembly and cloning via BspEI and XbaI sites into a vector that complements the FR1 and FR3 regions. Into this library of synthetic VH segments, the complementary VH-CDR3 sequence (top right) was cloned via XbaI and BstEII sites. The resulting cloned VH genes contain a combination of designed synthetic diversity and natural diversity (see FIG. 11).

Example 4: Cleavage and ligation of the lambda light chains with HinfI.

A schematic of the cleavage and ligation of antibody light chains is shown in FIGs. 12A and 12B.

Approximately 2 ug of biotinylated human Lambda DNA prepared as described in Example 1 was immobilized on 200 ul Seradyn magnetic beads. The lower strand was removed by incubation of the DNA with 200 ul of 0.1 M NaOH (pH=13) for 3 minutes; the supernatant was removed and an additional washing of 30 seconds with 200 ul of 0.1 M NaOH was performed. The supernatant was removed and the beads were neutralized with 200 ul of 10 mM Tris (pH=7.5), 100 mM NaCl. Two additional washes with 200 ul NEB buffer 2, containing 10 mM Tris (pH=7.9), 50 mM NaCl, 10 mM MgCl₂ and 1 mM dithiothreitol, were performed. After immobilization, the amount of ssDNA was estimated on a 5% PAGE-urea gel.

About 0.8 ug ssDNA was recovered and incubated in 100 ul NEB buffer 2 containing an 80-fold molar excess of an equimolar mix of ON_Lam1aB7, ON_Lam2aB7, ON_Lam31B7 and ON_Lam3rB7 [each oligo in 20 fold molar excess] (see Table 31).

The mixture was incubated at 95° C for 5 minutes and then slowly cooled down to 50° C over a period of 30 minutes. Excess of oligonucleotide was washed away with 2 washes of 200 ul of NEB buffer 2.

U/ug of Hinf I was added and incubated for 1 hour at 50° C. Beads were mixed every 10 minutes.

After incubation the sample was purified over a Qiagen PCR purification column and was subsequently analysed on a 5% PAGE-urea gel (see FIG. 13A, cleavage was more than 70% efficient).

A schematic of the ligation of the cleaved light chains is shown in FIG. 12B. A mix of bridge/extender pairs was prepared from the Brg/Ext oligos listed in Table 31 (total molar excess 100 fold) with 1000 U of T4 DNA Ligase (NEB) and incubated overnight at 16° C. After ligation of the DNA, the excess oligonucleotide was removed with a Qiagen PCR purification column and ligation was checked on a Urea-PAGE gel (see FIG. 13B; ligation was more than 95% efficient).

Multiple PCRs were performed containing 10 ng of the ligated material in a 50 ul PCR reaction using 25 pmol ON lamPlePCR and 25 pmol of an equimolar mix of Hu-CL2AscI/HuCL7AscI primer (see Example 1).

PCR was performed at 60° C for 15 cycles using Pfu polymerase. About 1 ug of dsDNA was recovered per PCR (see FIG. 13C) and cleaved with ApaLI and AscI for cloning the lambda light chains in pCES2.

Example 5: Capture of human heavy-chain CDR3 population.

A schematic of the cleavage and ligation of antibody light chains is shown in FIGs. 14A and 14B.

Approximately 3 ug of human heavy-chain (IgM) gene RACE material with biotin attached to the 5'-end of the lower strand was immobilized on 300 uL of Seradyn magnetic beads. The upper strand was removed by washing the DNA with 2 aliquots of 300 uL of 0.1 M NaOH (pH 13), for 3 minutes for the first aliquot followed by 30 seconds for the second aliquot. The beads were neutralized with 300 uL of 10 mM Tris (pH 7.5) 100 mM NaCl. The REdaptors (oligonucleotides used to make single-stranded DNA locally double-stranded) shown in Table 32 were added in 30 fold molar excess in 200 uL of NEB buffer 4 (50 mM Potassium Acetate, 20 mM Tris-Acetate, 10 mM Magnesium Acetate, 1 mM dithiothreitol pH 7.9) to the dry beads. The REdaptors were incubated with the single-stranded DNA at 80 °C for 5 minutes then cooled down to 55 °C over 30 minutes. Excess REdaptors were washed away with 2 washes of NEB buffer 4. Fifteen units of HpyCH4III (NEB) were added in NEB buffer 4 and incubated for 1 hour at 55 °C. The cleaved downstream DNA remaining on the beads was removed from the beads using a Qiagen Nucleotide removal column (see FIG. 15).

The Bridge/Extender pairs shown in Table 33 were added in 25-fold molar excess along with 1200 units of T4 DNA ligase and incubated overnight at 16 °C. Excess Bridge/Extender was removed with a Qiagen PCR purification column. The ligated material was amplified by PCR using primers H43.XAExtPCR2 and Hucumnest shown in Table 34 for 10 cycles with the program shown in Table 35.

The soluble PCR product was run on a gel and showed a band of approximately 500 n, as expected (see FIG. 15B). The DNA was cleaved with enzymes SfiI and NotI, gel purified, and ligated to similarly cleaved vector pCES1.

Example 6: Description of Phage Display Vector CJRA05, a member of the library built in vector DY3F7.

Table 36 contains an annotated DNA sequence of a member of the library, CJRA05 (see FIG. 16). Table 36 is to be read as follows: on each line, everything that follows an exclamation mark (!) is a comment. All occurrences of A, C, G, and T before the ! are the DNA sequence. Case is used only to show that certain bases constitute special features, such as restriction sites, ribosome binding sites, and the like, which are labeled below the DNA. CJRA05 is a derivative of phage DY3F7, obtained by cloning an ApaLI to NotI fragment into these sites in DY3F31. DY3F31 is like DY3F7 except that the light chain and heavy chain genes have been replaced by stuffer DNA that does not code for any antibody. DY3F7 contains an antibody that binds streptavidin, but did not come from the present library.

The phage genes start with gene ii and continue with genes x, v, vii, ix, viii, iii, vi, i, and iv. Gene iii has been slightly modified in that eight codons have been inserted between the signal sequence and the mature protein and the final amino acids of the signal sequence have been altered. This allows restriction enzyme recognition sites EagI and XbaI to be present. Following gene iv is the phage origin of replication (ori). After ori is bla, which confers resistance to ampicillin (ApR). The phage genes and bla are transcribed in the same sense.

After bla is the Fab cassette (illustrated in FIG. 17) comprising:

a) PlacZ promoter,
b) A first Ribosome Binding Site (RBS1),
c) The signal sequence from M13 iii,
d) An ApaLI RERS,
e) A light chain (a kappa L20::JK1 shortened by one codon at the V-J boundary in this case),
f) An AscI RERS,
g) A second Ribosome Binding Site (RBS2),
h) A signal sequence, preferably PelB, which contains,
i) An SfiI RERS,
j) A synthetic 3-23 V region with diversity in CDR1 and CDR2,
k) A captured CDR3,
l) A partially synthetic J region (FR4 after BstEII),
m) CH1,
n) A NotI RERS,
o) A His6 tag,
p) A cMyc tag,
q) An amber codon,
r) An anchor DNA that encodes the same amino-acid sequence as codons 273 to 424 of M13 iii (as shown in Table 37),
s) Two stop codons,
t) An AvrII RERS, and
u) A trp terminator.

The anchor (item r) encodes the same amino-acid sequence as do codons 273 to 424 of M13 iii, but the DNA is approximately as different as possible from the wild-type DNA sequence. In Table 36, the III' stump runs from base 8997 to base 9455. Below the DNA, as comments, are the differences with wild-type iii for the comparable codons, with !W.T at the ends of these lines. Note that Met and Trp have only a single codon and must be left as is. These AA types are rare. Ser codons can be changed at all three bases, while Leu and Arg codons can be changed at two.

In most cases, one base change can be introduced per codon. This has three advantages: 1) recombination with the wild-type gene carried elsewhere on the phage is less likely; 2) new restriction sites can be introduced, facilitating construction; and 3) sequencing primers that bind in only one of the two regions can be designed.
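The idea of encoding the same amino acids with DNA that differs as much as possible from wild type can be illustrated with a simple per-codon recoding. This is a sketch under stated assumptions, not the procedure used to design the anchor in Table 36: it greedily picks, for each codon, the synonymous codon with the most base differences, and ignores codon usage, restriction-site placement and other practical design constraints. The function names and the toy sequence are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def hamming(a, b):
    """Number of differing positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def translate(dna):
    """Translate an in-frame DNA sequence ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def recode_max_different(dna):
    """Replace each codon by the synonymous codon that differs from it at
    the most positions (ties broken alphabetically, for determinism)."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]
        out.append(max(synonyms, key=lambda c: (hamming(c, codon), c)))
    return "".join(out)

wt = "ATGTGGTCTCTGCGT"  # hypothetical toy sequence: Met Trp Ser Leu Arg
recoded = recode_max_different(wt)
print(recoded, translate(recoded) == translate(wt))
```

On the toy sequence, the Met and Trp codons are unchanged, the Ser codon changes at all three bases, and the Leu and Arg codons change at two, matching the codon properties noted above.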

The fragment of M13 III shown in CJRA05 is the preferred length for the anchor segment.

Alternative longer or shorter anchor segments defined by reference to whole mature III protein may also be utilized.

The sequence of M13 III consists of the following elements: Signal Sequence::Domain 1 (D1)::Linker 1 (L1)::Domain 2 (D2)::Linker 2 (L2)::Domain 3 (D3)::Transmembrane Segment (TM)::Intracellular anchor (IC) (see Table 38).

The pIII anchor (also known as trpIII) preferably consists of D2::L2::D3::TM::IC. Another embodiment of the pIII anchor consists of D2'::L2::D3::TM::IC (where D2' comprises the last 21 residues of D2 with the first 109 residues deleted). A further embodiment of the pIII anchor consists of D2'(C>S)::L2::D3::TM::IC (where D2'(C>S) is D2' with the single C converted to S), and yet another consists of D3::TM::IC.

Table 38 shows a gene fragment comprising the NotI site, His6 tag, cMyc tag, an amber codon, a recombinant enterokinase cleavage site, and the whole of mature M13 III protein. The DNA used to encode this sequence is intentionally very different from the DNA of wild-type gene iii, as shown by the lines denoted W.T. containing the w.t. bases where these differ from this gene. III is divided into domains denoted domain 1, linker 1, domain 2, linker 2, domain 3, transmembrane segment, and intracellular anchor.

Alternative preferred anchor segments (defined by reference to the sequence of Table 38) include:

codons 1-29 joined to codons 104-435, deleting domain 1 and retaining linker 1 to the end;

codons 1-38 joined to codons 104-435, deleting domain 1 and retaining the rEK cleavage site plus linker 1 to the end from III;

codons 1-29 joined to codons 236-435, deleting domain 1, linker 1, and most of domain 2 and retaining linker 2 to the end;

codons 1-38 joined to codons 236-435, deleting domain 1, linker 1, and most of domain 2 and retaining linker 2 to the end and the rEK cleavage site;

codons 1-29 joined to codons 236-435 and changing codon 240 to Ser (e.g., agc), deleting domain 1, linker 1, and most of domain 2 and retaining linker 2 to the end; and

codons 1-38 joined to codons 236-435 and changing codon 240 to Ser (e.g., agc), deleting domain 1, linker 1, and most of domain 2 and retaining linker 2 to the end and the rEK cleavage site.
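The alternative anchors are defined by joining 1-based, inclusive codon ranges of the Table 38 sequence, optionally after changing a single codon. A minimal sketch of that bookkeeping follows; the helper names are illustrative, and the placeholder sequence merely stands in for the 435-codon sequence of Table 38, which is not reproduced here.

```python
def set_codon(dna, number, codon):
    """Return dna with codon `number` (1-based) replaced by `codon`."""
    if len(codon) != 3 or not 1 <= number <= len(dna) // 3:
        raise ValueError("invalid codon or codon number")
    i = 3 * (number - 1)
    return dna[:i] + codon + dna[i + 3:]

def join_codon_ranges(dna, ranges):
    """Concatenate 1-based, inclusive codon ranges, e.g. [(1, 29), (104, 435)]."""
    n_codons = len(dna) // 3
    parts = []
    for first, last in ranges:
        if not 1 <= first <= last <= n_codons:
            raise ValueError(f"range {first}-{last} outside 1-{n_codons}")
        parts.append(dna[3 * (first - 1):3 * last])
    return "".join(parts)

# Placeholder for a 435-codon sequence (NOT the real Table 38 sequence).
placeholder = "GCT" * 435

# Codons 1-29 joined to codons 236-435, with codon 240 changed to Ser (agc).
# The codon change is applied first, using the original numbering.
anchor = join_codon_ranges(set_codon(placeholder, 240, "AGC"), [(1, 29), (236, 435)])
print(len(anchor) // 3)
```

Applying the codon change before joining keeps the numbering consistent with Table 38, so codon 240 ends up as the fifth codon of the retained 236-435 block.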

The constructs would most readily be made by methods similar to those of Wang and Wilkinson (Biotechniques 2001: 31(4)722-724), in which PCR is used to copy the vector except the part to be deleted, and matching restriction sites are introduced or retained at either end of the part to be kept. Table 39 shows the oligonucleotides to be used in deleting parts of the III anchor segment. The DNA shown in Table 38 has an NheI site before the DINDDRMA recombinant enterokinase cleavage site (rEKCS). If NheI is used in the deletion process with this DNA, the rEKCS site would be lost. This site could be quite useful in cleaving Fabs from the phage and might facilitate capture of very high-affinity antibodies. One could mutagenize this sequence so that the NheI site would follow the rEKCS site; an Ala Ser amino-acid sequence is already present. Alternatively, one could use SphI for the deletions. This would involve a slight change in amino acid sequence but would be of no consequence.

Example 7: Selection of antigen binders from an enriched library of human antibodies using phage vector DY3F31.

In this example the human antibody library used is described in de Haard et al. (Journal of Biological Chemistry, 274(26): 18218-30 (1999)). This library, consisting of a large non-immune human Fab phagemid library, was first enriched on antigen, either on streptavidin or on phenyl-oxazolone (phOx). The methods for this are well known in the art. Two preselected Fab libraries, the first one selected once on immobilized phOx-BSA (R1-ox) and the second one selected twice on streptavidin (R2-strep), were chosen for recloning.

These enriched repertoires of phage antibodies, in which only a very low percentage have binding activity to the antigen used in selection, were confirmed by screening clones in an ELISA for antigen binding. The selected Fab genes were transferred from the phagemid vector of this library to the DY3F31 vector via ApaLI-NotI restriction sites.

DNA from the DY3F31 phage vector was pretreated with ATP dependent DNAse to remove chromosomal DNA and then digested with ApaLI and NotI. An extra digestion with AscI was performed in between to prevent self-ligation of the vector. The ApaLI/NotI Fab fragment from the preselected libraries was subsequently ligated to the vector DNA and transformed into competent XL1-blue MRF' cells.

Libraries were made using vector:insert ratios of 1:2 for the phOx library and 1:3 for the STREP library, and using 100 ng ligated DNA per 50 µl of electroporation-competent cells (electroporation conditions: one shock of 1700 V, 1 hour recovery of cells in rich SOC medium, plating on ampicillin-containing agar plates).

This transformation resulted in a library size of 1.6 x 10⁶ for R1-ox in DY3F31 and 2.1 x 10⁶ for R2-strep in DY3F31. Sixteen colonies from each library were screened for insert, and all showed the correct size insert (±1400 bp) (for both libraries).

Phage was prepared from these Fab libraries as follows. A representative sample of the library was inoculated in medium with ampicillin and glucose, and at OD 0.5, the medium was exchanged for medium with ampicillin and 1 mM IPTG. After overnight growth at 37 °C, phage was harvested from the supernatant by PEG-NaCl precipitation. Phage was used for selection on antigen. R1-ox was selected on phOx-BSA coated by passive adsorption onto immunotubes and R2-strep on streptavidin coated paramagnetic beads (Dynal, Norway), in procedures described in de Haard et al. and Marks et al., Journal of Molecular Biology, 222(3): 581-97 (1991). Phage titers and enrichments are given in Table 40.


Clones from these selected libraries, dubbed R2-ox and R3-strep respectively, were screened for binding to their antigens in ELISA. 44 clones from each selection were picked randomly and screened as phage or soluble Fab for binding in ELISA. For the libraries in DY3F31, clones were first grown in 2TY-2% glucose-50 µg/ml AMP to an OD600 of approximately 0.5, and then grown overnight in 2TY-50 µg/ml AMP +/- 1 mM IPTG. Induction with IPTG may result in the production of both phage-Fab and soluble Fab. Therefore the (same) clones were also grown without IPTG. Table 41 shows the results of an ELISA screening of the resulting supernatant, either for the detection of phage particles with antigen binding (Anti-M13 HRP = anti-phage antibody) or for the detection of human Fabs, be it on phage or as soluble fragments. Human Fabs were detected either using the anti-myc antibody 9E10, which detects the myc-tag that every Fab carries at the C-terminal end of the heavy chain, followed by an HRP-labeled rabbit-anti-mouse serum (column 9E10/RAM-HRP), or with an anti-light chain reagent followed by an HRP-labeled goat-anti-rabbit antiserum (anti-CK/CL Gar-HRP).

The results show that in both cases antigen binders are identified in the library, whether detected as Fabs on phage or with the anti-Fab reagents (Table 41). IPTG induction yields an increase in the number of positives. It can also be seen that for the phOx clones the phage ELISA yields more positives than the soluble Fab ELISA, most likely due to the avid binding of phage. Twenty-four of the ELISA-positive clones were screened using PCR of the Fab insert from the vector, followed by digestion with BstNI. This yielded 17 different patterns for the phOx-binding Fabs in 23 samples that were correctly analyzed, and 6 out of 24 for the streptavidin-binding clones. Thus, the data from the selection and screening from this pre-enriched non-immune Fab library show that the DY3F31 vector is suitable for display and selection of Fab fragments, and provides both soluble Fab and Fab on phage for screening experiments after selection.

Example 8: Selection of Phage-antibody libraries on streptavidin magnetic beads.

The following example describes a selection in which one first depletes a sample of the library of binders to streptavidin and optionally of binders to a non-target (i.e., a molecule other than the target that one does not want the selected Fab to bind). It is hypothesized that one has a molecule, termed a competitive ligand, which binds the target and that an antibody which binds at the same site would be especially useful.

For this procedure, Streptavidin Magnetic Beads (Dynal) were blocked once with blocking solution (2% Marvel Milk, PBS (pH 7.4), 0.01% Tween-20 (2%MPBST)) for 60 minutes at room temperature and then washed five times with 2%MPBST. 450 µL of beads were blocked for each depletion and subsequent selection set.

Per selection, 6.25 µL of biotinylated depletion target (1 mg/mL stock in PBST) was added to 0.250 mL of washed, blocked beads (from step 1). The target was allowed to bind overnight, with tumbling, at 4°C. The next day, the beads were washed 5 times with PBST.


Per selection, 0.010 mL of biotinylated target antigen (1 mg/mL stock in PBST) was added to 0.100 mL of blocked and washed beads (from step 1).

The antigen was allowed to bind overnight, with tumbling, at 4 °C. The next day, the beads were washed 5 times with PBST.

In round 1, 2 x 10¹² up to 10¹³ plaque forming units (pfu) per selection were blocked against non-specific binding by adding to 0.500 mL of 2%MPBS (= 2%MPBST without Tween) for 1 hr at RT (tumble). In later rounds, 10¹¹ pfu per selection were blocked as done in round 1.

Each phage pool was incubated with 50 µL of depletion target beads (final wash supernatant removed just before use) on a Labquake rotator for 10 min at room temperature. After incubation, the phage supernatant was removed and incubated with another 50 µL of depletion target beads. This was repeated 3 more times using depletion target beads and twice using blocked streptavidin beads for a total of 7 rounds of depletion, so each phage pool required 350 µL of depletion beads.
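The bead volumes quoted in this example can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are not part of the original protocol, and it simply assumes that each of the seven depletion rounds consumes 50 µL of beads, as stated above.

```python
# Bead budget for one depletion and selection set, in microliters.
# Names are illustrative; the volumes are those quoted in the protocol text.
BEADS_PER_DEPLETION_ROUND_UL = 50

DEPLETION_TARGET_ROUNDS = 1 + 1 + 3   # first incubation, "another", then "3 more times"
STREPTAVIDIN_ONLY_ROUNDS = 2          # blocked streptavidin beads without target


def bead_budget_ul(target_beads_ul=100):
    """Return the bead volumes (in µL) needed for one depletion and selection set."""
    depletion_target = DEPLETION_TARGET_ROUNDS * BEADS_PER_DEPLETION_ROUND_UL
    streptavidin_only = STREPTAVIDIN_ONLY_ROUNDS * BEADS_PER_DEPLETION_ROUND_UL
    all_depletion = depletion_target + streptavidin_only
    return {
        "depletion_target_beads": depletion_target,
        "streptavidin_only_beads": streptavidin_only,
        "all_depletion_beads": all_depletion,
        "target_beads": target_beads_ul,
        "total": all_depletion + target_beads_ul,
    }


budget = bead_budget_ul()
print(budget)
```

Under these assumptions, the volume of depletion target beads agrees with the 0.250 mL of beads loaded with depletion target, and the total agrees with the volume of beads blocked per set.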

A small sample of each depleted library pool was taken for titering. Each library pool was added to 0.100 mL of target beads (final wash supernatant was removed just before use) and allowed to incubate for 2 hours at room temperature (tumble).

Beads were then washed as rapidly as possible (e.g., 3 minutes total) with 5 x 0.500 mL PBST and then 2x with PBS. Phage still bound to beads after the washing were eluted once with 0.250 mL of competitive ligand (~1 µM) in PBST for 1 hour at room temperature on a Labquake rotator. The eluate was removed, mixed with 0.500 mL Minimal A salts solution and saved. For a second selection, 0.500 mL 100 mM TEA was used for elution for 10 min at RT, then neutralized in a mix of 0.250 mL of 1 M Tris, pH 7.4 + 0.500 mL Min A salts.

After the first selection elution, the beads can be eluted again with 0.300 mL of non-biotinylated target (1 mg/mL) for 1 hr at RT on a Labquake rotator. Eluted phage are added to 0.450 mL Minimal A salts.

Three eluates (competitor from 1st selection, target from 1st selection and neutralized TEA elution from 2nd selection) were kept separate and a small aliquot was taken from each for titering. 0.500 mL Minimal A salts were added to the remaining bead aliquots after competitor and target elution and after TEA elution. A small aliquot from each was taken for titering.

Each elution and each set of eluted beads was mixed with 2X YT and an aliquot (e.g., 1 mL with 1 x 10¹⁰/mL) of XL1-Blue MRF' E. coli cells (or other F' cell line) which had been chilled on ice after having been grown to mid-logarithmic phase, starved and concentrated (see procedure below - Mid-log prep of XL1-Blue MRF' cells for infection).

After approximately 30 minutes at room temperature, the phage/cell mixtures were spread onto Bio-Assay Dishes (243 x 243 x 18 mm, Nalge Nunc) containing 2XYT, 1 mM IPTG agar. The plates were incubated overnight at 30°C. The next day, each amplified phage culture was harvested from its respective plate. The plate was flooded with 35 mL TBS or LB, and cells were scraped from the plate. The resuspended cells were transferred to a centrifuge bottle. An additional 20 mL TBS or LB was used to remove any remaining cells from the plate and was pooled with the cells in the centrifuge bottle. The cells were centrifuged out, and phage in the supernatant was recovered by PEG precipitation. Over the next day, the amplified phage preps were titered.

In the first round, two selections yielded five amplified eluates. These amplified eluates were panned for 2-3 additional rounds of selection using ~1 x 10¹² input phage/round. For each additional round, the depletion and target beads were prepared the night before the round was initiated.

For the elution steps in subsequent rounds, all elutions up to and including the elution step from which the amplified eluate came were done, and the earlier elutions were treated as washes. For the bead infection amplified phage, for example, the competitive ligand and target elutions were done and then tossed as washes (see below). Then the beads were used to infect E. coli. Two pools, therefore, yielded a total of 5 final elutions at the end of the selection.

1st selection set

A. Ligand amplified elution: elute w/ ligand for 1 hr, keep as elution

B. Target amplified elution: elute w/ ligand for 1 hr, toss as wash; elute w/ target for 1 hr, keep as elution

C. Bead infect. amp. elution: elute w/ ligand for 1 hr, toss as wash; elute w/ target for 1 hr, toss as wash; elute w/ cell infection, keep as elution

2nd selection set

A. TEA amplified elution: elute w/ TEA 10 min, keep as elution

B. Bead infect. amp. elution: elute w/ TEA 10 min, toss as wash; elute w/ cell infection, keep as elution
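The rule behind the two selection sets (repeat every elution step up to the one an amplified eluate came from, keep only the last) can be written down compactly. The following is a minimal sketch with hypothetical names; it only restates the scheme above and is not a description of any software used by the authors.

```python
# Ordered elution steps for each selection set, as listed in the scheme above.
SELECTION_SETS = {
    "1st selection set": ["ligand", "target", "cell infection"],
    "2nd selection set": ["TEA", "cell infection"],
}


def elution_plan(steps):
    """Map each kept step to its full sequence of (step, action) pairs.

    Steps that precede the kept step are performed but tossed as washes.
    """
    plans = {}
    for i, kept in enumerate(steps):
        washes = [(step, "toss as wash") for step in steps[:i]]
        plans[kept] = washes + [(kept, "keep as elution")]
    return plans


all_plans = {name: elution_plan(steps) for name, steps in SELECTION_SETS.items()}
total_kept = sum(len(plans) for plans in all_plans.values())

for name, plans in all_plans.items():
    print(name)
    for kept, actions in plans.items():
        print("  ", kept, "->", actions)
print("final elutions:", total_kept)
```

Counting one kept elution per plan reproduces the five final elutions obtained from the two pools.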

Mid-log prep of XL1-Blue MRF' cells for infection (based on Barbas et al. Phage Display manual procedure)

XL1-Blue MRF' cells were cultured in NZCYM (12.5 mg/mL tet) at 37°C and 250 rpm overnight. A 500 mL culture was started in a 2-liter flask by diluting cells 1/50 in NZCYM/tet (10 mL overnight culture added) and incubated at 37°C at 250 rpm until an OD600 of 0.45 (1.5-2 hrs) was reached. Shaking was reduced to 100 rpm for 10 min.

When the OD600 reached between 0.55 and 0.65, cells were transferred to 2 x 250 mL centrifuge bottles and centrifuged at 600 g for 15 min at 4°C. Supernatant was poured off. Residual liquid was removed with a pipette.

The pellets were gently resuspended (without pipetting up and down) in the original volume of 1 X Minimal A salts at room temperature. The resuspended cells were transferred back into the 2-liter flask and shaken at 100 rpm for 45 min at 37°C. This process was performed in order to starve the cells and restore pili. The cells were transferred to 2 x 250 mL centrifuge bottles and centrifuged as earlier.

The cells were gently resuspended in ice cold Minimal A salts (5 mL per 500 mL original culture). The cells were put on ice for use in infections as soon as possible.

The phage eluates were brought up to 7.5 mL with 2XYT medium and 2.5 mL of cells were added. Beads were brought up to 3 mL with 2XYT and 1 mL of cells was added. The mixtures were incubated at 37°C for 30 min. The cells were plated on 2XYT, 1 mM IPTG agar large NUNC plates and incubated for 18 hr at 30°C.
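The volumes in the cell preparation follow from simple ratios, which the sketch below makes explicit. The helper names are hypothetical and the calculation is only a convenience check on the numbers quoted in the procedure.

```python
def inoculum_volume_ml(culture_ml, dilution_factor):
    """Volume of overnight culture corresponding to a 1/dilution_factor dilution."""
    return culture_ml / dilution_factor


def concentration_factor(original_culture_ml, resuspension_ml):
    """Fold concentration when a pellet is resuspended in a smaller volume."""
    return original_culture_ml / resuspension_ml


def cell_fraction(sample_ml, cells_ml):
    """Fraction of the final infection mix made up of concentrated cells."""
    return cells_ml / (sample_ml + cells_ml)


inoculum = inoculum_volume_ml(500, 50)         # 1/50 dilution into a 500 mL culture
fold = concentration_factor(500, 5)            # 5 mL Minimal A salts per 500 mL culture
eluate_mix = cell_fraction(7.5, 2.5)           # eluate infections
bead_mix = cell_fraction(3.0, 1.0)             # bead infections

print(inoculum, fold, eluate_mix, bead_mix)
```

The eluate and bead infections use the same proportion of cells, so the two kinds of sample are infected under comparable conditions.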

Example 9: Incorporation of synthetic region in FR1/3 region.

Described below are examples of incorporating fixed residues in antibody sequences for light chain kappa and lambda genes, and for heavy chains. The experimental conditions and oligonucleotides used for the examples below have been described in previous examples (e.g., Examples 3 & 4).

The process for incorporating fixed FR1 residues in an antibody lambda sequence consists of 3 steps (see FIG. 18): (1) annealing of single-stranded DNA material encoding VL genes to a partially complementary oligonucleotide mix (indicated with Ext and Bridge), to anneal in this example to the region encoding residues 5-7 of the FR1 of the lambda genes (indicated with X..X; within the lambda genes the overlap may sometimes not be perfect); (2) ligation of this complex; (3) PCR of the ligated material with the indicated primer ('PCRpr') and for example one primer based within the VL gene. In this process the first few residues of all lambda genes will be encoded by the sequences present in the oligonucleotides (Ext., Bridge or PCRpr). After the PCR, the lambda genes can be cloned using the indicated restriction site for ApaLI.

The process for incorporating fixed FR1 residues in an antibody kappa sequence (FIG. 19) consists of 3 steps: (1) annealing of single-stranded DNA material encoding VK genes to a partially complementary oligonucleotide mix (indicated with Ext and Bri), to anneal in this example to the region encoding residues 8-10 of the FR1 of the kappa genes (indicated with X..X; within the kappa genes the overlap may sometimes not be perfect); (2) ligation of this complex; (3) PCR of the ligated material with the indicated primer ('PCRpr') and for example one primer based within the VK gene. In this process the first few (8) residues of all kappa genes will be encoded by the sequences present in the oligonucleotides (Ext., Bridge or PCRpr). After the PCR, the kappa genes can be cloned using the indicated restriction site for ApaLI.

The process of incorporating fixed FR3 residues in an antibody heavy chain sequence (FIG. 20) consists of 3 steps: (1) annealing of single-stranded DNA material encoding part of the VH genes (for example encoding FR3, CDR3 and FR4 regions) to a partially complementary oligonucleotide mix (indicated with Ext and Bridge), to anneal in this example to the region encoding residues 92-94 (within the FR3 region) of VH genes (indicated with X..X; within the VH genes the overlap may sometimes not be perfect); (2) ligation of this complex; (3) PCR of the ligated material with the indicated primer ('PCRpr') and for example one primer based within the VH gene (such as in the FR4 region).

In this process certain residues of all VH genes will be encoded by the sequences present in the oligonucleotides used here, in particular from PCRpr (for residues 70-73), or from Ext/Bridge oligonucleotides (residues 74-91). After the PCR, the partial VH genes can be cloned using the indicated restriction site for XbaI.
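To make the residue ranges for the heavy chain concrete, the sketch below lists the FR3 positions using the numbering shown in the header of Table 1 (66 to 95, with the insertion positions 82a, 82b and 82c) and assigns each position to the oligonucleotide that, according to the text, determines it. The segment labels and function names are illustrative assumptions, and positions outside the three ranges named in the text are deliberately left unassigned.

```python
# FR3 positions as numbered in the header of Table 1: 66..82, 82a-82c, 83..95.
FR3_POSITIONS = (
    [str(n) for n in range(66, 83)]
    + ["82a", "82b", "82c"]
    + [str(n) for n in range(83, 96)]
)

# Inclusive ranges quoted in the text; the dictionary keys are illustrative labels.
SEGMENTS = {
    "PCRpr": ("70", "73"),
    "Ext/Bridge": ("74", "91"),
    "annealing region": ("92", "94"),
}


def positions_in(segment):
    """Return all numbered positions covered by a segment, insertions included."""
    start, end = SEGMENTS[segment]
    i, j = FR3_POSITIONS.index(start), FR3_POSITIONS.index(end)
    return FR3_POSITIONS[i : j + 1]


def origin_of(position):
    """Name the segment that covers a position, if the text specifies one."""
    for name in SEGMENTS:
        if position in positions_in(name):
            return name
    return "not specified in the text"


for name in SEGMENTS:
    print(name, len(positions_in(name)), positions_in(name))
print(origin_of("82b"), "|", origin_of("66"))
```

Because the insertion positions 82a to 82c fall inside the 74-91 range, the Ext/Bridge oligonucleotides fix more codons than a plain subtraction of the end points would suggest.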

It will be understood that the foregoing is only illustrative of the principles of this invention and that various modifications can be made by those skilled in the art without departing from the scope and spirit of the invention.

The term comprise and variants of the term such as comprises or comprising are used herein to denote the inclusion of a stated integer or stated integers but not to exclude any other integer or any other integers, unless in the context or usage an exclusive interpretation of the term is required.

Any reference to publications cited in this specification is not an admission that the disclosures constitute common general knowledge in Australia.

Table 1: Human GLG FR3 sequences

! VH1
! 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80
agg gtc acc atg acc agg gac acg tcc atc agc aca gcc tac atg
! 81 82 82a 82b 82c 83 84 85 86 87 88 89 90 91 92
gag ctg agc agg ctg aga tct gac gac acg gcc gtg tat tac tgt
! 93 94 95
gcg aga ga ! 1-02# 1

aga gtc acc att acc agg gac aca tcc gcg agc aca gcc tac atg
gag ctg agc agc ctg aga tct gaa gac acg gct gtg tat tac tgt
gcg aga ga ! 1-03# 2

aga gtc acc atg acc agg aac acc tcc ata agc aca gcc tac atg
gag ctg agc agc ctg aga tct gag gac acg gcc gtg tat tac tgt
gcg aga gg ! 1-08# 3

aga gtc acc atg acc aca gac aca tcc acg agc aca gcc tac atg
gag ctg agg agc ctg aga tct gac gac acg gcc gtg tat tac tgt
gcg aga ga ! 1-18# 4

aga gtc acc atg acc gag gac aca tct aca gac aca gcc tac atg
gag ctg agc agc ctg aga tct gag gac acg gcc gtg tat tac tgt
gca aca ga ! 1-24# 5

aga gtc acc att acc agg gac agg tct atg agc aca gcc tac atg
gag ctg agc agc ctg aga tct gag gac aca gcc atg tat tac tgt
gca aga ta ! 1-45# 6

aga gtc acc atg acc agg gac acg tcc acg agc aca gtc tac atg
gag ctg agc agc ctg aga tct gag gac acg gcc gtg tat tac tgt
gcg aga ga ! 1-46# 7

aga gtc acc att acc agg gac atg tcc aca agc aca gcc tac atg
gag ctg agc agc ctg aga tcc gag gac acg gcc gtg tat tac tgt
gcg gca ga ! 1-58# 8

aga gtc acg att acc gcg gac gaa tcc acg agc aca gcc tac atg
gag ctg agc agc ctg aga tct gag gac acg gcc gtg tat tac tgt
gcg aga ga ! 1-69# 9

aga gtc acg att acc gcg gac aaa tcc acg agc aca gcc tac atg
gag ctg agc agc ctg aga tct gag gac acg gcc gtg tat tac tgt
gcg aga ga ! 1-e# 10

aga gtc acc ata acc gcg gac acg tct aca gac aca gcc tac atg
gag ctg agc agc ctg aga tct gag gac acg gcc gtg tat tac tgt
gca aca ga ! 1-f# 11


! VH2

agg aca gca

etc atg cac

acc acc aga

ate acc aag aac atg gac c! 2-05# 12

gac cct

acc tee gtg gac

aaa aca

aac gee

cag aca

gtg gtc

ett tgt

tat

tac

5

agg

etc

acc

ate tee aag

gac

acc

tee

aaa

age

cag

gtg

gtc

ett

acc

atg

acc

aac atg gac

cct

gtg

gac

aca

gee

aca

tat

tac

tgt

gca

egg

ata

c! 2-26# 13

agg

etc

acc

ate tee aag

gac

acc

tee

aaa

aac

cag

gtg

gtc

ett

aca

atg

acc

aac atg gac

cct

gtg

gac

aca

gee

aeg

tat

tac

tgt

10

gca

egg

ata

c! 2-70# 14

! VH3

cga

ttc

acc

ate tee aga

gac

aac

gee

aag

aac

tea

ctg

tat

ctg

caa

atg

aac

age ctg aga

gee

gag

gac

aeg

get

gtg

tat

tac

tgt

gcg

aga

ga !

! 3-07# 15

15

cga

ttc

acc

ate tee aga

gac

aac

gee

aag

aac

tee

ctg

tat

ctg

caa

atg

aac

agt ctg aga

get

gag

gac

aeg

gee .

ttg

tat

tac

tgt

gca

aaa

gat

a! 3-09#16

cga

ttc

acc

ate tee agg

gac

aac

gee

aag

aac

tea

ctg

tat

ctg

caa

atg

aac

age ctg aga

gee

gag

gac

aeg

gee

gtg

tat

tac

tgt

20

gcg

aga

ga !

! 3-11# 17

cga

ttc

acc

ate tee aga

gaa

aat

gee

aag

aac

tee

ttg

tat

ett

caa

atg

aac

age ctg aga

gee

ggg

gac

aeg

get

gtg

tat

tac

tgt

gca

aga

ga !

! 3-13# 18

aga

ttc

acc

ate tea aga

gat

tea

aaa

aac

aeg

ctg

tat

ctg

25

caa

atg

aac

age ctg aaa

acc

gag

gac

aca

gee

gtg

tat

tac

tgt

acc

aca

ga i

! 3-15# 19

cga

ttc

acc

ate tee aga

gac

aac

gee

aag

aac

tee

ctg

tat

ctg

caa

atg

aac

agt ctg aga

gee

gag

gac

aeg

gee

ttg

tat

cac

tgt

gcg

aga

ga !

! 3-20# 20

30

cga

ttc

acc

ate tee aga

gac

aac

gee

aag

aac

tea

ctg

tat

ctg

caa

atg

aac

age ctg aga

gee

gag

gac

aeg

get

gtg

tat

tac

tgt

gcg

aga

ga .

! 3-21# 21

egg

ttc

acc

ate tee aga

gac

aat

tee

aag

aac

aeg

ctg

tat

ctg

caa

atg

aac

age ctg aga

gee

gag

gac

aeg

gee

gta

tat

tac

tgt

35

gcg

aaa

ga ;

! 3-23# 22

cga

ttc

acc

ate tee aga

gac

aat

tee

aag

aac

aeg

ctg

tat

ctg

caa

atg

aac

age ctg aga

get

gag

gac

aeg

get

gtg

tat

tac

tgt

gcg

aaa

ga :

! 3-30# 23

cga

ttc

see

ate tee aga

gac

aat

tee

aag

aac

aeg

ctg

tat

ctg

40

caa

atg

aac

age ctg aga

get

gag

gac

aeg

get

gtg

tat

tac

tgt

gcg

aga

ga

! 3303# 24


cga

ttc

acc

ate tee aga

gac

aat

tee

aag

aac

aeg

ctg

tat

ctg

caa

atg

aac

age ctg aga

get

gag

gac

aeg

get

gtg

tat

tac

tgt

gcg

aaa

ga !

3305# 25

cga

ttc

acc

ate tee aga

gac

aat

tee

aag

aac

aeg

ctg

tat

ctg

5

caa

atg

aac

age ctg aga

gee

gag

gac

aeg

get

gtg

tat

tac

tgt

gcg

aga

ga !

3-33# 26

cga

ttc

acc

ate tee aga

gac

aac

age

aaa

aac

tee

ctg

tat

ctg

caa

atg

aac

agt ctg aga

act

gag

gac

acc

gee

ttg

tat

tac

tgt

gca

aaa

gat

a! 3-43#27

10

cga

ttc

acc

ate tee aga

gac

aat

gee

aag

aac

tea

ctg

tat

ctg

caa

atg

aac

age ctg aga

gac

gag

gac

aeg

get

gtg

tat

tac

tgt

gcg

aga

ga !

3-48# 28

aga

ttc

acc

ate tea aga

gat

ggt

tee

aaa

age

ate

gee

tat

ctg

caa

atg

aac

age ctg aaa

acc

gag

gac

aca

gee

gtg

tat

tac

tgt

15

act

aga

ga !

. 3-49# 29

cga

ttc

acc

ate tee aga

gac

aat

tee

aag

aac

aeg

ctg

tat

ett

caa

atg

aac

age ctg aga

gee

gag

gac

aeg

gee

gtg

tat

tac

tgt

gcg

aga

ga !

! 3-53# 30

aga

ttc

acc

ate tee aga

gac

aat

tee

aag

aac

aeg

ctg

tat

ett

20

caa

atg

ggc

age ctg aga

get

gag

gac

atg

get

gtg

tat

tac

tgt

gcg

aga

ga !

ί 3-64# 31

aga

ttc

acc

ate tee aga

gac

aat

tee

aag

aac

aeg

ctg

tat

ett

caa

atg

aac

age ctg aga

get

gag

gac

aeg

get

gtg

tat

tac

tgt

gcg

aga

ga !

! 3-66# 32

25

aga

ttc

acc

ate tea aga

gat

tea

aag

aac

tea

ctg

tat

ctg

caa

atg

aac

age ctg aaa

acc

gag

gac

aeg

gee

gtg

tat

tac

tgt

get

aga

ga !

! 3-72# 33

agg

ttc

acc

ate tee aga

gat

tea

aag

aac

aeg

gcg

tat

ctg

caa

atg

aac

age ctg aaa

acc

gag

gac

aeg

gee

gtg

tat

tac

tgt

30

act

aga

ca :

! 3-73# 34

cga

ttc

acc

ate tee aga

gac

aac

gee

aag

aac

aeg

ctg

tat

ctg

caa

atg

aac

agt ctg aga

gee

gag

gac

aeg

get

gtg

tat

tac

tgt

gca

aga

ga :

! 3-74# 35

aga

ttc

acc

ate tee aga

gac

aat

tee

aag

aac

aeg

ctg

cat

ett

35

caa

atg

aac

age ctg aga

get

gag

gac

aeg

get

gtg

tat

tac

tgt

aag

aaa

ga

! 3-d# 36

Ϊ VH4

cga

gtc

acc

ata tea gta

gac

aag

tee

aag

aac

cag

ttc

tee

ctg

aag

ctg

age

tet gtg acc

gee

gcg

gac

aeg

gee

gtg

tat

tac

tgt

40

gcg

aga

ga

! 4-04# 37

cga

gtc

acc

atg tea gta

gac

aeg

tee

aag

aac

cag

ttc

tee

ctg


aag gcg cga aag

ctg aga gtt ctg

age aa acc age

tet gtg acc gcc

gtg aeg gcg

gac tet gac

aeg aag aeg

gcc aac gee

gtg cag gtg

tat ttc tat

tac tee tac

tgt ctg tgt

! 4-28# 38

gac gcc

ata tea gta

tet gtg act

5

gcg

aga

ga

! 4301# 39

cga

gtc

acc

ata tea gta

gac

agg

tee

aag

aac

cag

ttc

tee

ctg

aag gcc

ctg aga

age ga

tet gtg acc ! 4302# 40

gcc

gcg

gac

aeg

gcc

gtg

tat

tac

tgt

cga

gtt

acc

ata tea gta

gac

aeg

tee

aag

aac

cag

ttc

tee

ctg

10

aag gcc

ctg aga

age ga

tet gtg act ! 4304# 41

gcc

gca

gac

aeg

gcc

gtg

tat

tac

tgt

cga

gtt

acc

ata tea gta

gac

aeg

tet

aag

aac

cag

ttc

tee

ctg

sag gcg

ctg aga

age ga

tet gtg act ! 4-31# 42

gcc

gcg

gac

aeg

gcc

gtg

tat

tac

tgt

15

cga

gtc

acc

ata tea gta

gac

aeg

tee

aag

aac

cag

ttc

tee

ctg

aag gcg

ctg aga

age ga

tet gtg acc ! 4-34# 43

gcc

gcg

gac

aeg

get

gtg

tat

tac

tgt

cga

gtc

acc

ata tee gta

gac

aeg

tee

aag

aac

cag

ttc

tee

ctg

20

aag gcg

ctg aga

age ca

tet gtg acc ! 4-39# 44

gcc

gca

gac

aeg

get

gtg

tat

tac

tgt

cga

gtc

acc

ata tea gta

gac

aeg

tee

aag

aac

cag

ttc

tee

ctg

aag gcg

ctg aga

age ga

tet gtg acc ! 4-59# 45

get

gcg

gac

aeg

gcc

gtg

tat

tac

tgt

cga

gtc

acc

ata tea gta

gac

aeg

tee

aag

aac

cag

ttc

tee

ctg

25

aag gcg

ctg aga

age ga

tet gtg acc ! 4-61# 46

get

gcg

gac

aeg

gcc

gtg

tat

tac

tgt

cga

gtc

acc

ata tea gta

gac

aeg

tee

aag

aac

cag

ttc

tee

ctg

30

aag gcg ! VH5

ctg aga

age ga

tet gtg acc ! 4-b# 47

gcc

gca

gac

aeg

gcc

gtg

tat

tac

tgt

cag

gtc

acc

ate tea gcc

gac

aag

tee

ate

age

acc

gcc

tac

ctg

cag gcg

tgg aga

age ca

age ctg aag ! 5-51# 48

gcc

teg

gac

acc

gcc

atg

tat

tac

tgt

cac

gtc

acc

ate tea get

gac

aag

tee

ate

age

act

gcc

tac

ctg

35

cag gcg ! VH6

tgg aga

age 1 1

age ctg aag 5-a# 49

gee

teg

gac

acc

gcc

atg

tat

tac

tgt

cga

ata

acc

ate aac cca

gac

aca

tee

aag

aac

cag

ttc

tee

ctg

40

cag gca

ctg aga

aac ga

tet gtg act ! 6-1# 50

ccc

gag

gac

aeg

get

gtg

tat

tac

tgt

! VH7


egg

ttt

gtc

ttc

tee

ttg

gac

acc

tet

gtc

age

aeg

gca

tat

ctg

cag

ate

tgc

age

eta

aag

get

gag

gac

act

gee

gtg

tat

tac

tgt

geg

aga

ga !

! 74,

.1#

51

Table 2: Enzymes that either cut 15 or more human GLGs or have 5+-base recognition in FR3

Typical entry:

REname Recognition #sites
GLGid#: base#    GLGid#: base#    GLGid#: base# .....

BstEII Ggtnacc 2
1: 3    48: 3
There are 2 hits at base# 3

MaeIII gtnac 36
1: 4    2: 4    3: 4    4: 4    5: 4    6: 4
7: 4    8: 4    9: 4    10: 4    11: 4    37: 4
37: 58    38: 4    38: 58    39: 4    39: 58    40: 4
40: 58    41: 4    41: 58    42: 4    42: 58    43: 4
43: 58    44: 4    44: 58    45: 4    45: 58    46: 4
46: 58    47: 4    47: 58    48: 4    49: 4    50: 58
There are 24 hits at base# 4

Tsp45I gtsac 33
1: 4    2: 4    3: 4    4: 4    5: 4    6: 4
7: 4    8: 4    9: 4    10: 4    11: 4    37: 4
37: 58    38: 4    38: 58    39: 58    40: 4    40: 58
41: 58    42: 58    43: 4    43: 58    44: 4    44: 58
45: 4    45: 58    46: 4    46: 58    47: 4    47: 58
48: 4    49: 4    50: 58
There are 21 hits at base# 4

HphI tcacc 45
1: 5    2: 5    3: 5    4: 5    5: 5    6: 5
7: 5    8: 5    11: 5    12: 5    12: 11    13: 5
14: 5    15: 5    16: 5    17: 5    18: 5    19: 5
20: 5    21: 5    22: 5    23: 5    24: 5    25: 5
26: 5    27: 5    28: 5    29: 5    30: 5    31: 5
32: 5    33: 5    34: 5    35: 5    36: 5    37: 5
38: 5    40: 5    43: 5    44: 5    45: 5    46: 5
47: 5    48: 5    49: 5
There are 44 hits at base# 5

NlaIII CATG 26
1: 9    1: 42    2: 42    3: 9    3: 42    4: 9
4: 42    5: 9    5: 42    6: 42    6: 78    7: 9
7: 42    8: 21    8: 42    9: 42    10: 42    11: 42
12: 57    13: 48    13: 57    14: 57    31: 72    38: 9
48: 78    49: 78
There are 11 hits at base# 42
There are 1 hits at base# 48 Could cause raggedness

BsaJI Ccnngg 37
1: 14    2: 14    5: 14    6: 14    7: 14    8: 14
8: 65    9: 14    10: 14    11: 14    12: 14    13: 14
14: 14    15: 65    17: 14    17: 65    18: 65    19: 65
20: 65    21: 65    22: 65    26: 65    29: 65    30: 65
33: 65    34: 65    35: 65    37: 65    38: 65    39: 65
40: 65    42: 65    43: 65    48: 65    49: 65    50: 65
51: 14
There are 23 hits at base# 65
There are 14 hits at base# 14

AluI AGct 42
1: 47    2: 47    3: 47    4: 47    5: 47    6: 47
7: 47    8: 47    9: 47    10: 47    11: 47    16: 63
23: 63    24: 63    25: 63    31: 63    32: 63    36: 63
37: 47    37: 52    38: 47    38: 52    39: 47    39: 52
40: 47    40: 52    41: 47    41: 52    42: 47    42: 52
43: 47    43: 52    44: 47    44: 52    45: 47    45: 52
46: 47    46: 52    47: 47    47: 52    49: 15    50: 47
There are 23 hits at base# 47
There are 11 hits at base# 52 Only 5 bases from 47

BlpI GCtnagc 21
1: 48    2: 48    3: 48    5: 48    6: 48    7: 48
8: 48    9: 48    10: 48    11: 48    37: 48    38: 48
39: 48    40: 48    41: 48    42: 48    43: 48    44: 48
45: 48    46: 48    47: 48
There are 21 hits at base# 48

MwoI GCNNNNNnngc 19
1: 48    2: 28    19: 36    22: 36    23: 36    24: 36
25: 36    26: 36    35: 36    37: 67    39: 67    40: 67
41: 67    42: 67    43: 67    44: 67    45: 67    46: 67
47: 67
There are 10 hits at base# 67
There are 7 hits at base# 36

DdeI Ctnag 71
1: 49    1: 58    2: 49    2: 58    3: 49    3: 58
3: 65    4: 49    4: 58    5: 49    5: 58    5: 65
6: 49    6: 58    6: 65    7: 49    7: 58    7: 65
8: 49    8: 58    9: 49    9: 58    9: 65    10: 49
10: 58    10: 65    11: 49    11: 58    11: 65    15: 58
16: 58    16: 65    17: 58    18: 58    20: 58    21: 58
22: 58    23: 58    23: 65    24: 58    24: 65    25: 58
25: 65    26: 58    27: 58    27: 65    28: 58    30: 58
31: 58    31: 65    32: 58    32: 65    35: 58    36: 58
36: 65    37: 49    38: 49    39: 26    39: 49    40: 49
41: 49    42: 26    42: 49    43: 49    44: 49    45: 49
46: 49    47: 49    48: 12    49: 12    51: 65
There are 29 hits at base# 58
There are 22 hits at base# 49 Only nine bases from 58
There are 16 hits at base# 65 Only seven bases from 58

BglII Agatct 11
1: 61    2: 61    3: 61    4: 61    5: 61    6: 61
7: 61    9: 61    10: 61    11: 61    51: 47
There are 10 hits at base# 61

BstYI Rgatcy 12
1: 61    2: 61    3: 61    4: 61    5: 61    6: 61
7: 61    8: 61    9: 61    10: 61    11: 61    51: 47
There are 11 hits at base# 61

Hpy188I TCNga 17
1: 64    2: 64    3: 64    4: 64    5: 64    6: 64
7: 64    8: 64    9: 64    10: 64    11: 64    16: 57
20: 57    27: 57    35: 57    48: 67    49: 67
There are 11 hits at base# 64
There are 4 hits at base# 57
There are 2 hits at base# 67 Could be ragged.

MslI CAYNNnnRTG 44
1: 72    2: 72    3: 72    4: 72    5: 72    6: 72
7: 72    8: 72    9: 72    10: 72    11: 72    15: 72
17: 72    18: 72    19: 72    21: 72    23: 72    24: 72
25: 72    26: 72    28: 72    29: 72    30: 72    31: 72
32: 72    33: 72    34: 72    35: 72    36: 72    37: 72
38: 72    39: 72    40: 72    41: 72    42: 72    43: 72
44: 72    45: 72    46: 72    47: 72    48: 72    49: 72
50: 72    51: 72
There are 44 hits at base# 72

BsiEI CGRYcg 23
1: 74    3: 74    4: 74    5: 74    7: 74    8: 74
9: 74    10: 74    11: 74    17: 74    22: 74    30: 74
33: 74    34: 74    37: 74    38: 74    39: 74    40: 74
41: 74    42: 74    45: 74    46: 74    47: 74
There are 23 hits at base# 74

EaeI Yggccr 23
1: 74    3: 74    4: 74    5: 74    7: 74    8: 74
9: 74    10: 74    11: 74    17: 74    22: 74    30: 74
33: 74    34: 74    37: 74    38: 74    39: 74    40: 74
41: 74    42: 74    45: 74    46: 74    47: 74
There are 23 hits at base# 74

EagI Cggccg 23
1: 74    3: 74    4: 74    5: 74    7: 74    8: 74
9: 74    10: 74    11: 74    17: 74    22: 74    30: 74
33: 74    34: 74    37: 74    38: 74    39: 74    40: 74
41: 74    42: 74    45: 74    46: 74    47: 74
There are 23 hits at base# 74

HaeIII GGcc 27
1: 75    3: 75    4: 75    5: 75    7: 75    8: 75
9: 75    10: 75    11: 75    16: 75    17: 75    20: 75
22: 75    30: 75    33: 75    34: 75    37: 75    38: 75
39: 75    40: 75    41: 75    42: 75    45: 75    46: 75
47: 75    48: 63    49: 63
There are 25 hits at base# 75

Bst4CI ACNgt 65°C 63 sites. There is a third isoschizomer
1: 86    2: 86    3: 86    4: 86    5: 86    6: 86
7: 34    7: 86    8: 86    9: 86    10: 86    11: 86
12: 86    13: 86    14: 86    15: 36    15: 86    16: 53
16: 86    17: 36    17: 86    18: 86    19: 86    20: 53
20: 86    21: 36    21: 86    22: 0    22: 86    23: 86
24: 86    25: 86    26: 86    27: 53    27: 86    28: 36
28: 86    29: 86    30: 86    31: 86    32: 86    33: 36
33: 86    34: 86    35: 53    35: 86    36: 86    37: 86
38: 86    39: 86    40: 86    41: 86    42: 86    43: 86
44: 86    45: 86    46: 86    47: 86    48: 86    49: 86
50: 86    51: 0    51: 86
There are 51 hits at base# 86 All the other sites are well

HpyCH4III ACNgt 63
1: 86    2: 86    3: 86    4: 86    5: 86    6: 86
7: 34    7: 86    8: 86    9: 86    10: 86    11: 86
12: 86    13: 86    14: 86    15: 36    15: 86    16: 53
16: 86    17: 36    17: 86    18: 86    19: 86    20: 53
20: 86    21: 36    21: 86    22: 0    22: 86    23: 86
24: 86    25: 86    26: 86    27: 53    27: 86    28: 36
28: 86    29: 86    30: 86    31: 86    32: 86    33: 36
33: 86    34: 86    35: 53    35: 86    36: 86    37: 86
38: 86    39: 86    40: 86    41: 86    42: 86    43: 86
44: 86    45: 86    46: 86    47: 86    48: 86    49: 86
50: 86    51: 0    51: 86
There are 51 hits at base# 86

HinfI Gantc 43
2: 2    3: 2    4: 2    5: 2    6: 2    7: 2
8: 2    9: 2    9: 22    10: 2    11: 2    15: 2
16: 2    17: 2    18: 2    19: 2    19: 22    20: 2
21: 2    23: 2    24: 2    25: 2    26: 2    27: 2
28: 2    29: 2    30: 2    31: 2    32: 2    33: 2
33: 22    34: 22    35: 2    36: 2    37: 2    38: 2
40: 2    43: 2    44: 2    45: 2    46: 2    47: 2
50: 60
There are 38 hits at base# 2

MlyI GAGTCNNNNNn 18
2: 2    3: 2    4: 2    5: 2    6: 2    7: 2
8: 2    9: 2    10: 2    11: 2    37: 2    38: 2
40: 2    43: 2    44: 2    45: 2    46: 2    47: 2
There are 18 hits at base# 2

PleI gagtc 18
2: 2    3: 2    4: 2    5: 2    6: 2    7: 2
8: 2    9: 2    10: 2    11: 2    37: 2    38: 2
40: 2    43: 2    44: 2    45: 2    46: 2    47: 2
There are 18 hits at base# 2

AciI Ccgc 24
2: 26    9: 14    10: 14    11: 14    27: 74    37: 62
37: 65    38: 62    39: 65    40: 62    40: 65    41: 65
42: 65    43: 62    43: 65    44: 62    44: 65    45: 62
46: 62    47: 62    47: 65    48: 35    48: 74    49: 74
There are 8 hits at base# 62
There are 8 hits at base# 65
There are 3 hits at base# 14
There are 3 hits at base# 74
There are 1 hits at base# 26
There are 1 hits at base# 35

Gcgg 11
8: 91    9: 16    10: 16    11: 16    37: 67    39: 67
40: 67    42: 67    43: 67    45: 67    46: 67
There are 7 hits at base# 67
There are 3 hits at base# 16
There are 1 hits at base# 91

BsiHKAI GWGCWc 20
2: 30    4: 30    6: 30    7: 30    9: 30    10: 30
12: 89    13: 89    14: 89    37: 51    38: 51    39: 51
40: 51    41: 51    42: 51    43: 51    44: 51    45: 51
46: 51    47: 51
There are 11 hits at base# 51

Bsp1286I GDGCHc 20
2: 30    4: 30    6: 30    7: 30    9: 30    10: 30
12: 89    13: 89    14: 89    37: 51    38: 51    39: 51
40: 51    41: 51    42: 51    43: 51    44: 51    45: 51
46: 51    47: 51
There are 11 hits at base# 51

HgiAI GWGCWc 20
2: 30    4: 30    6: 30    7: 30    9: 30    10: 30
12: 89    13: 89    14: 89    37: 51    38: 51    39: 51
40: 51    41: 51    42: 51    43: 51    44: 51    45: 51
46: 51    47: 51
There are 11 hits at base# 51

BsoFI GCngc 26
2: 53    3: 53    5: 53    6: 53    7: 53    8: 53
8: 91    9: 53    10: 53    11: 53    31: 53    36: 36
37: 64    39: 64    40: 64    41: 64    42: 64    43: 64
44: 64    45: 64    46: 64    47: 64    48: 53    49: 53
50: 45    51: 53
There are 13 hits at base# 53
There are 10 hits at base# 64

TseI Gcwgc 17
2: 53    3: 53    5: 53    6: 53    7: 53    8: 53
9: 53    10: 53    11: 53    31: 53    36: 36    45: 64
46: 64    48: 53    49: 53    50: 45    51: 53
There are 13 hits at base# 53

MnlI gagg 34
3: 67    3: 95    4: 51    5: 16    5: 67    6: 67
7: 67    8: 67    9: 67    10: 67    11: 67    15: 67
16: 67    17: 67    19: 67    20: 67    21: 67    22: 67
23: 67    24: 67    25: 67    26: 67    27: 67    28: 67
29: 67    30: 67    31: 67    32: 67    33: 67    34: 67
35: 67    36: 67    50: 67    51: 67
There are 31 hits at base# 67

HpyCH4V TGca 34
5: 90    6: 90    11: 90    12: 90    13: 90    14: 90
15: 44    16: 44    16: 90    17: 44    18: 90    19: 44
20: 44    21: 44    22: 44    23: 44    24: 44    25: 44
26: 44    27: 44    27: 90    28: 44    29: 44    33: 44
34: 44    35: 44    35: 90    36: 38    48: 44    49: 44
50: 44    50: 90    51: 44    51: 52
There are 21 hits at base# 44
There are 1 hits at base# 52

AccI GTmkac 13 5-base recognition
7: 37    11: 24    37: 16    38: 16    39: 16    40: 16
41: 16    42: 16    43: 16    44: 16    45: 16    46: 16
47: 16
There are 11 hits at base# 16

SacII CCGCgg 8 6-base recognition
9: 14    10: 14    11: 14    37: 65    39: 65    40: 65
42: 65    43: 65
There are 5 hits at base# 65
There are 3 hits at base# 14

TfiI Gawtc 24
9: 22    15: 2    16: 2    17: 2    18: 2    19: 2
19: 22    20: 2    21: 2    23: 2    24: 2    25: 2
26: 2    27: 2    28: 2    29: 2    30: 2    31: 2
32: 2    33: 2    33: 22    34: 22    35: 2    36: 2
There are 20 hits at base# 2

2016225923 09 Sep 2016

BsraAI Nnnnnngagac 19

	15: 11 16: 11	20:	11	21: 11	22:	11	23 :	11
	24: 11 25: 11	26:	11	27: 11	28:	11	28 :	56
	30: 11 31: 11	32:	11	35: 11	36:	11	44 :	87
5	48: 87
	There are 16 hits	at	base#	11
	Bpml ctccag			19
	15: 12 16: 12	17:	12	18: 12	20:	12	21:	12
10	22: 12 23: 12	24:	12	25: 12	26:	12	27:	12
	28: 12 30: 12	31:	12	32: 12	34 :	12	35:	12
	36: 12
	There are 19 hits	at	base#	12
15	XmnI GAANNnnttc			12
	37: 30 38: 30	39:	30	40: 30	41:	30	42 :	30
	43: 30 44: 30	45:	30	46: 30	47:	30	50 :	30
	There are 12 hits	at	base#	30
20	BsrI NCcagt			12
	37: 32 38: 32	39:	32	40: 32	41:	32	42:	32
	43: 32 44: 32	45:	32	46: 32	47:	32	50:	32
	There are 12 hits	at	base#	32
25	Banll GRGCYc			11
	37: 51 38: 51	39:	51	40: 51	41:	51	42:	51
	43: 51 44: 51	45:	51	46: 51	47:	51
	There are 11 hits	at	base#	51
30	EC1136I GAGctc			11
	37: 51 38: 51	39:	51	40: 51	41:	51	42:	51
	43: 51 44: 51	45:	51	46: 51	47 :	51
	There are 11 hits	at	base#	51
35	Sacl GAGCTc			11
	37: 51 38: 51	39:	51	40: 51	41:	51	42:	51
	43: 51 44: 51	45:	51	46: 51	47:	51

There are 11 hits at base# 51
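
The hit lists in this table pair a GLG number with the base at which a recognition site occurs. A minimal Python sketch of that kind of scan is given here. It assumes IUPAC ambiguity codes in the recognition sequence, 1-based start positions, and a scan of the written strand only; the function names are illustrative and do not come from the source, and the position convention used by the table itself is not confirmed.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "[ag]", "y": "[ct]", "s": "[cg]", "w": "[at]",
    "k": "[gt]", "m": "[ac]", "b": "[cgt]", "d": "[agt]",
    "h": "[act]", "v": "[acg]", "n": "[acgt]",
}


def site_regex(site):
    """Compile a recognition site such as 'GCtnagc' into a regex (case is ignored)."""
    body = "".join(IUPAC[ch] for ch in site.lower())
    # A lookahead is used so that overlapping occurrences are all reported.
    return re.compile("(?=(" + body + "))")


def find_sites(site, seq):
    """Return the 1-based start positions of the site on the strand as written."""
    return [m.start() + 1 for m in site_regex(site).finditer(seq.lower())]


def tabulate(site, seqs):
    """Map each 1-based position to the number of sequences that carry the site there."""
    counts = {}
    for seq in seqs:
        for pos in find_sites(site, seq):
            counts[pos] = counts.get(pos, 0) + 1
    return counts


if __name__ == "__main__":
    probes = [
        "acatggaGCTGAGCagcctgag",
        "acatggagctgaggagcctgag",
        "ccctgaaGCTGAGCtctgtgac",
    ]
    print(tabulate("GCtnagc", probes))
```

A non-palindromic site has to be searched a second time with its reverse complement if the opposite orientation is of interest.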


Table 3: Synthetic 3-23 FR3 of human heavy chains showing positions of possible cleavage sites
! Sites engineered into the synthetic gene are shown in upper case DNA
! with the RE name between vertical bars (as in | XbaI | ).
! RERSs frequently found in GLGs are shown below the synthetic sequence
! with the name to the right (as in gtn ac = MaeIII(24), indicating that
! 24 of the 51 GLGs contain the site).

! |---FR3---
!	89	90	: codon in synthetic 3-23
!	R	F
	cgc	ttc	6
! Allowed DNA	cgn	tty
!	|agr
!	ga	ntc = HinfI(38)
!	ga	gtc = PleI(18)
!	ga	wtc = TfiI(20)
!		gtn	ac = MaeIII(24)
!		gts	ac = Tsp45I(21)
!		tc	acc = HphI(44)

! --------FR3--------------------------------------------------
!	91 92 93 94 95 96 97 98 99 100 101 102 103 104 105
!	T  I  S  R  D  N  S  K  N  T   L   Y   L   Q   M
!	AluI(23)
!	c|tcc ag = BpmI(19)
!	g ctn agc = BlpI(21)
!	| g aan nnn ttc = XmnI(12)
!	| XbaI |	tg ca = HpyCH4V(21)

! ---FR3----------------------------------------------------->|
!	106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
!	N   S   L   R   A   E   D   T   A   V   Y   Y   C   A   K
!	|agy|ctn|agr|   |   |
!	AflII
!	cc nng g = BsaJI(23)
!	ac ngt = Bst4CI(51)
!	aga tct = BglII(10)
!	ac ngt = HpyCH4III(51)
!	Rga tcY = BstYI(11)
!	ac ngt = TaaI(51)
!	c ayn nnn rtg = MslI(44)
!	cg ryc g = BsiEI(23)
!	yg gcc r = EaeI(23)
!	cg gcc g = EagI(23)
!	|g gcc = HaeIII(25)
!	gag g = MnlI(31)|
!	| PstI |


Table 4: REdaptors, Extenders, and Bridges used for Cleavage and Capture of Human Heavy Chains in FR3.

A: HpyCH4V Probes of actual human HC genes
! HpyCH4V in FR3 of human HC, bases 35-56; only those with TGca site
TGca;10, RE recognition: tgca of length 4 is expected at 9
6-1	agttctccctgcagctgaactc
3-11,3-07,3-21,3-72,3-48	cactgtatctgcaaatgaacag
3-09,3-43,3-20	ccctgtatctgcaaatgaacag
5-51	ccgcctacctgcagtggagcag
3-15,3-30,3-30.5,3-30.3,3-74,3-23,3-33	cgctgtatctgcaaatgaacag
7-4.1	cggcatatctgcagatctgcag
3-73	cggcgtatctgcaaatgaacag
5-a	ctgcctacctgcagtggagcag
3-49	tcgcctatctgcaaatgaacag

B: HpyCH4V REdaptors, Extenders, and Bridges
B.1 REdaptors
! Cutting HC lower strand:
! TmKeller for 100 mM NaCl, zero formamide
! REdaptors for cleavage	TmW	TmK
(ON_HCFR36-1)	5'-agttctcccTGCAgctgaactc-3'	68.0	64.5
(ON_HCFR36-1A)	5'-ttctcccTGCAgctgaactc-3'	62.0	62.5
(ON_HCFR36-1B)	5'-ttctcccTGCAgctgaac-3'	56.0	59.9
(ON_HCFR33-15)	5'-cgctgtatcTGCAaatgaacag-3'	64.0	60.8
(ON_HCFR33-15A)	5'-ctgtatcTGCAaatgaacag-3'	56.0	56.3
(ON_HCFR33-15B)	5'-ctgtatcTGCAaatgaac-3'	50.0	53.1
(ON_HCFR33-11)	5'-cactgtatcTGCAaatgaacag-3'	62.0	58.9
(ON_HCFR35-51)	5'-ccgcctaccTGCAgtggagcag-3'	74.0	70.1
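
The first melting-temperature column of the REdaptor list agrees numerically with the Wallace rule (2 degrees per A or T, 4 degrees per G or C). That reading is an inference from the listed values and is not stated in the source. A short Python sketch under that assumption follows; `wallace_tm` is a hypothetical helper name, and the second (Keller) column is not reproduced because its formula is not given here.

```python
def wallace_tm(oligo):
    """Estimate Tm in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    s = oligo.lower()
    gc = s.count("g") + s.count("c")
    at = s.count("a") + s.count("t")
    if gc + at != len(s):
        raise ValueError("oligo contains characters other than a, c, g, t")
    return 4 * gc + 2 * at


if __name__ == "__main__":
    redaptors = {
        "ON_HCFR36-1": "agttctcccTGCAgctgaactc",
        "ON_HCFR36-1A": "ttctcccTGCAgctgaactc",
        "ON_HCFR33-15": "cgctgtatcTGCAaatgaacag",
        "ON_HCFR35-51": "ccgcctaccTGCAgtggagcag",
    }
    for name, seq in redaptors.items():
        print(name, wallace_tm(seq))
```

The rule ignores salt and formamide, so it is only a rough ranking of the candidate oligonucleotides.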
B.2 Segment of synthetic 3-23 gene into which captured CDR3 is to be cloned
!	XbaI...
!D323* cgCttcacTaag tcT aga gac aaC tcT aag aaT acT ctC taC
!	scab....
!	HpyCH4V
!	.. ..	AflII...
	Ttg caG atg aac agC TtA agG . . .
!

B.3 Extender and Bridges
! Extender (bottom strand):
(ON_HCHpyEx01) 5'-cAAgTAgAgAgTATTcTTAgAgTTgTcTcTAgAcTTAgTgAAgcg-3'
! ON_HCHpyEx01 is the reverse complement of
! 5'-cgCttcacTaag tcT aga gac aaC tcT aag aaT acT ctC taC Ttg -3'
! Bridges (top strand, 9-base overlap):
(ON_HCHpyBr016-1) 5'-cgCttcacTaag tcT aga gac aaC tcT aag aaT acT ctC taC Ttg CAgctgaac-3' (3'-term C is blocked)
! 3-15 et al. + 3-11
(ON_HCHpyBr023-15) 5'-cgCttcacTaag tcT aga gac aaC tcT aag aaT acT ctC taC Ttg CAaatgaac-3' (3'-term C is blocked)
! 5-51
(ON_HCHpyBr045-51) 5'-cgCttcacTaag tcT aga gac aaC tcT aag aaT acT ctC taC Ttg CAgtggagc-3' (3'-term C is blocked)
! PCR primer (top strand)
(ON_HCHpyPCR) 5'-cgCttcacTaag tcT aga gac-3'
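
The statement that the extender is the reverse complement of the top-strand segment can be checked mechanically. The Python sketch below does so; the helper name is illustrative, and the top-strand sequence is typed in as read here, which is an assumption wherever the scan is unclear.

```python
# Translation table for complementing DNA while preserving letter case.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string, ignoring embedded whitespace."""
    cleaned = "".join(seq.split())
    return cleaned.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    top = "cgCttcacTaag tcT aga gac aaC tcT aag aaT acT ctC taC Ttg"
    extender = "cAAgTAgAgAgTATTcTTAgAgTTgTcTcTAgAcTTAgTgAAgcg"
    # Case is used in the listing only for emphasis, so compare case-insensitively.
    print(reverse_complement(top).lower() == extender.lower())
```

The same check applies to the other extender and top-strand pairs listed in this table.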

C: BlpI Probes from human HC GLGs
1-58,1-03,1-08,1-69,1-24,1-45,1-46,1-f,1-e	acatggaGCTGAGCagcctgag
1-02	acatggaGCTGAGCaggctgag
1-18	acatggagctgaggagcctgag
5-51,5-a	acctgcagtggagcagcctgaa
3-15,3-73,3-49,3-72	atctgcaaatgaacagcctgaa
3303,3-33,3-07,3-11,3-30,3-21,3-23,3305,3-48	atctgcaaatgaacagcctgag
3-20,3-74,3-09,3-43	atctgcaaatgaacagtctgag
74.1	atctgcagatctgcagcctaaa
3-66,3-13,3-53,3-d	atcttcaaatgaacagcctgag
3-64	atcttcaaatgggcagcctgag
4301,4-28,4302,4-04,4304,4-31,4-34,4-39,4-59,4-61,4-b	ccctgaaGCTGAGCtctgtgac
6-1	ccctgcagctgaactctgtgac
2-70,2-05	tccttacaatgaccaacatgga
2-26	tccttaccatgaccaacatgga

D: BlpI REdaptors, Extenders, and Bridges
D.1 REdaptors
!	TmW	TmK
(BlpF3HC1-58)	5'-ac atg gaG CTG AGC agc ctg ag-3'	70	66
(BlpF3HC6-1)	5'-cc ctg aag ctg agc tct gtg ac-3'	70	66
! BlpF3HC6-1 matches 4-30.1, not 6-1.

D.2 Segment of synthetic 3-23 gene into which captured CDR3 is to be cloned
!	BlpI
!	XbaI...
!D323* cgCttcacTaag TCT AGA gac aaC tcT aag aaT acT ctC taC Ttg caG atg aac
!	AflII...
!	agC TTA AGG

D.3 Extender and Bridges
! Bridges
(BlpF3Br1) 5'-cgCttcacTcag tcT aga gaT aaC AGT aaA aaT acT TtG-
	taC Ttg caG Ctg a|GC agc ctg-3'
(BlpF3Br2) 5'-cgCttcacTcag tcT aga gaT aaC AGT aaA aaT acT TtG-
	taC Ttg caG Ctg a|gc tct gtg-3'
!	| lower strand is cut here
! Extender
(BlpF3Ext) 5'-TcAgcTgcAAgTAcAAAgTATTTTTAcTgTTATcTcTAgAcTgAgTgAAgcg-3'
! BlpF3Ext is the reverse complement of:
! 5'-cgCttcacTcag tcT aga gaT aaC AGT aaA aaT acT TtG taC Ttg caG Ctg a-3'
! (BlpF3PCR) 5'-cgCttcacTcag tcT aga gaT aaC-3'

E: HpyCH4III Distinct GLG sequences surrounding site, bases 77-98
1	102#1,118#4,146#7,169#9,1e#10,311#17,353#30,404#37,4301	ccgtgtattactgtgcgagaga
2	103#2,307#15,321#21,3303#24,333#26,348#28,364#31,366#32	ctgtgtattactgtgcgagaga
3	108#3	ccgtgtattactgtgcgagagg
4	124#5,1f#11	ccgtgtattactgtgcaacaga
5	145#6	ccatgtattactgtgcaagata
6	158#8	ccgtgtattactgtgcggcaga
7	205#12	ccacatattactgtgcacacag
8	226#13	ccacatattactgtgcacggat
9	270#14	ccacgtattactgtgcacggat
10	309#16,343#27	ccttgtattactgtgcaaaaga
11	313#18,374#35,61#50	ctgtgtattactgtgcaagaga
12	315#19	ccgtgtattactgtaccacaga
13	320#20	ccttgtatcactgtgcgagaga
14	323#22	ccgtatattactgtgcgaaaga
15	330#23,3305#25	ctgtgtattactgtgcgaaaga
16	349#29	ccgtgtattactgtactagaga
17	372#33	ccgtgtattactgtgctagaga
18	373#34	ccgtgtattactgtactagaca
19	3d#36	ctgtgtattactgtaagaaaga
20	428#38	ccgtgtattactgtgcgagaaa
21	4302#40,4304#41	ccgtgtattactgtgccagaga
22	439#44	ctgtgtattactgtgcgagaca
23	551#48	ccatgtattactgtgcgagaca
24	5a#49	ccatgtattactgtgcgaga

F: HpyCH4III REdaptors, Extenders, and Bridges
F.1 REdaptors
! ONs for cleavage of HC(lower) in FR3 (bases 77-97)
! For cleavage with HpyCH4III, Bst4CI, or TaaI
! cleavage is in lower chain before base 88.
!	77 788 888 888 889 999 999 9
!	78 901 234 567 890 123 456 7	TmW	TmK
(H43.77.97.1-02#1)	5'-cc gtg tat tAC TGT gcg aga g-3'	64	62.6
(H43.77.97.1-03#2)	5'-ct gtg tat tAC TGT gcg aga g-3'	62	60.6
(H43.77.97.108#3)	5'-cc gtg tat tAC TGT gcg aga g-3'	64	62.6
(H43.77.97.323#22)	5'-cc gta tat tac tgt gcg aaa g-3'	60	58.7
(H43.77.97.330#23)	5'-ct gtg tat tac tgt gcg aaa g-3'	60	58.7
(H43.77.97.439#44)	5'-ct gtg tat tac tgt gcg aga c-3'	62	60
(H43.77.97.551#48)	5'-cc atg tat tac tgt gcg aga c-3'	62	60
(H43.77.97.5a#49)	5'-cc atg tat tAC TGT gcg aga-3'	58	58

F.2 Extender and Bridges
! XbaI and AflII sites in bridges are bunged
(H43.XABr1) 5'-ggtgtagtga|TCT|AGt|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg|
	aac|agC|TTt|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat tgt gcg aga-3'
(H43.XABr2) 5'-ggtgtagtga|TCT|AGt|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg|
	aac|agC|TTt|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat tgt gcg aaa-3'
(H43.XAExt) 5'-ATAgTAgAcT gcAgTgTccT cAgcccTTAA gcTgTTcATc TgcAAgTAgAgAgTATTcTT AgAgTTgTcT cTAgATcAcT AcAcc-3'
! H43.XAExt is the reverse complement of
! 5'-ggtgtagtga|TCT|AGA|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg|
!	aac|agC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat -3'
(H43.XAPCR) 5'-ggtgtagtga|TCT|AGA|gac|aac-3'
! XbaI and AflII sites in bridges are bunged
(H43.ABr1) 5'-ggtgtagtga|aac|agC|TTt|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat tgt gcg aga-3'
(H43.ABr2) 5'-ggtgtagtga|aac|agC|TTt|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat tgt gcg aaa-3'
(H43.AExt) 5'-ATAgTAgAcTgcAgTgTccTcAgcccTTAAgcTgTTTcAcTAcAcc-3'
! (H43.AExt) is the reverse complement of
! 5'-ggtgtagtga|aac|agC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat -3'
(H43.APCR) 5'-ggtgtagtga|aac|

[Rotated table pages: the contents of the tables that precede Table 5D are not recoverable from the source text. The only legible fragments are the probe row "-09 ccctgtatcTGCAaatgaacag ccc.g.at.....aa.....ag" and the summary label "Seqs with both expected and unexpected."]

Table 5D: Analysis repeated using only 8 best REdaptors

Id	Ntot	0	1	2	3	4	5	6	7	8+		Name	Sequence
1	301	78	101	54	32	16	9	10	1	0	281	102#1	ccgtgtattactgtgcgagaga
2	493	69	155	125	73	37	14	11	3	6	459	103#2	ctgtgtattactgtgcgagaga
3	189	52	45	38	23	18	5	4	1	3	176	108#3	ccgtgtattactgtgcgagagg
4	127	29	23	28	24	10	6	5	2	0	114	323#22	ccgtatattactgtgcgaaaga
5	78	21	25	14	11	1	4	2	0	0	72	330#23	ctgtgtattactgtgcgaaaga
6	79	15	17	25	8	11	1				76	439#44	ctgtgtattactgtgcgagaca
7	43	14	15	5	5	3	0	1	0	0	42	551#48	ccatgtattactgtgcgagaca
8	307	26	63	72	51	38	24	14	13	6	250	5a#49	ccatgtattactgtgcgaga

1	102#1	ccgtgtattactgtgcgagaga	ccgtgtattactgtgcgagaga
2	103#2	ctgtgtattactgtgcgagaga	.t....................
3	108#3	ccgtgtattactgtgcgagagg	.....................g
4	323#22	ccgtatattactgtgcgaaaga	....a.............a...
5	330#23	ctgtgtattactgtgcgaaaga	.t................a...
6	439#44	ctgtgtattactgtgcgagaca	.t..................c.
7	551#48	ccatgtattactgtgcgagaca	..a.................c.
8	5a#49	ccatgtattactgtgcgagaAA	..a.................AA

Seqs with the expected RE site only.......1463 / 1617
Seqs with only an unexpected site......... 0
Seqs with both expected and unexpected.... 7
Seqs with no sites........................ 0
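
Table 5D bins sequences by their number of mismatches to a REdaptor and displays each REdaptor against the first one with dots at identical positions. The Python sketch below shows one way to produce both views. The function names are illustrative, the comparison is case-insensitive, and the handling of unequal lengths (comparison over the shared prefix when counting) is an assumption rather than the method used for the table.

```python
def mismatches(a, b):
    """Count differing positions over the shared length of two sequences."""
    return sum(1 for x, y in zip(a.lower(), b.lower()) if x != y)


def dot_diff(reference, other):
    """Render `other` against `reference`: '.' where equal, the base where different."""
    if len(reference) != len(other):
        raise ValueError("sequences must have equal length")
    return "".join(
        "." if r.lower() == o.lower() else o for r, o in zip(reference, other)
    )


def mismatch_histogram(redaptor, seqs, cap=8):
    """Return counts for 0, 1, ..., cap-1 mismatches, with a final bin for cap or more."""
    bins = [0] * (cap + 1)
    for s in seqs:
        bins[min(mismatches(redaptor, s), cap)] += 1
    return bins


if __name__ == "__main__":
    ref = "ccgtgtattactgtgcgagaga"   # 102#1
    print(dot_diff(ref, "ctgtgtattactgtgcgagaga"))   # 103#2
    print(dot_diff(ref, "ccgtatattactgtgcgaaaga"))   # 323#22
```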


Table 6: Human HC GLG FR1 Sequences

VH Exon - Nucleotide sequence alignment

VH1
1-02	CAG GTG CAG CTG GTG CAG TCT GGG GCT GAG GTG AAG AAG CCT GGG GCC TCA GTG AAG GTC TCC TGC AAG GCT TCT GGA TAC ACC TTC ACC
1-03	cag gtC cag ctT gtg cag tct ggg gct gag gtg aag aag cct ggg gcc tca gtg aag gtT tcc tgc aag gct tct gga tac acc ttc acT
1-08	cag gtg cag ctg gtg cag tct ggg gct gag gtg aag aag cct ggg gcc tca gtg aag gtc tcc tgc aag gct tct gga tac acc ttc acc
1-18	cag gtT cag ctg gtg cag tct ggA gct gag gtg aag aag cct ggg gcc tca gtg aag gtc tcc tgc aag gct tct ggT tac acc ttT acc
1-24	cag gtC cag ctg gtA cag tct ggg gct gag gtg aag aag cct ggg gcc tca gtg aag gtc tcc tgc aag gTt tcc gga tac acc Ctc acT
1-45	cag Atg cag ctg gtg cag tct ggg gct gag gtg aag aag Act ggg Tcc tca gtg aag gtT tcc tgc aag gct tcC gga tac acc ttc acc
1-46	cag gtg cag ctg gtg cag tct ggg gct gag gtg aag aag cct ggg gcc tca gtg aag gtT tcc tgc aag gcA tct gga tac acc ttc acc
1-58	caA Atg cag ctg gtg cag tct ggg Cct gag gtg aag aag cct ggg Acc tca gtg aag gtc tcc tgc aag gct tct gga tTc acc ttT acT
1-69	cag gtg cag ctg gtg cag tct ggg gct gag gtg aag aag cct ggg Tcc tcG gtg aag gtc tcc tgc aag gct tct gga GGc acc ttc aGc
1-e	cag gtg cag ctg gtg cag tct ggg gct gag gtg aag aag cct ggg Tcc tcG gtg aag gtc tcc tgc aag gct tct gga GGc acc ttc aGc
1-f	Gag gtC cag ctg gtA cag tct ggg gct gag gtg aag aag cct ggg gcT Aca gtg aaA Atc tcc tgc aag gTt tct gga tac acc ttc acc

VH2
2-05	CAG ATC ACC TTG AAG GAG TCT GGT CCT ACG CTG GTG AAA CCC ACA CAG ACC CTC ACG CTG ACC TGC ACC TTC TCT GGG TTC TCA CTC AGC
2-26	cag Gtc acc ttg aag gag tct ggt cct GTg ctg gtg aaa ccc aca Gag acc ctc acg ctg acc tgc acc Gtc tct ggg ttc tca ctc agc
2-70	cag Gtc acc ttg aag gag tct ggt cct Gcg ctg gtg aaa ccc aca cag acc ctc acA ctg acc tgc acc ttc tct ggg ttc tca ctc agc

VH3
3-07	GAG GTG CAG CTG GTG GAG TCT GGG GGA GGC TTG GTC CAG CCT GGG GGG TCC CTG AGA CTC TCC TGT GCA GCC TCT GGA TTC ACC TTT AGT
3-09	gaA gtg cag ctg gtg gag tct ggg gga ggc ttg gtA cag cct ggC Agg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc ttt GAt
3-11	Cag gtg cag ctg gtg gag tct ggg gga ggc ttg gtc Aag cct ggA ggg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc ttc agt
3-13	gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtA cag cct ggg ggg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc ttc agt
3-15	gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtA Aag cct ggg ggg tcc ctT aga ctc tcc tgt gca gcc tct gga ttc acT ttc agt
3-20	gag gtg cag ctg gtg gag tct ggg gga ggT Gtg gtA cGg cct ggg ggg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc ttt GAt
3-21	gag gtg cag ctg gtg gag tct ggg gga ggc Ctg gtc Aag cct ggg ggg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc ttc agt
3-23	gag gtg cag ctg Ttg gag tct ggg gga ggc ttg gtA cag cct ggg ggg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc ttt agC
3-30	Cag gtg cag ctg gtg gag tct ggg gga ggc Gtg gtc cag cct ggg Agg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc ttc agt
3-30.3	Cag gtg cag ctg gtg gag tct ggg gga ggc Gtg gtc cag cct ggg Agg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc ttc agt
3-30.5	Cag gtg cag ctg gtg gag tct ggg gga ggc Gtg gtc cag cct ggg Agg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc ttc agt
3-33	Cag gtg cag ctg gtg gag tct ggg gga ggc Gtg gtc cag cct ggg Agg tcc ctg aga ctc tcc tgt gca gcG tct gga ttc acc ttc agt
3-43	gaA gtg cag ctg gtg gag tct ggg gga gTc Gtg gtA cag cct ggg ggg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc ttt GAt
3-48	gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtA cag cct ggg ggg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc ttc agt
3-49	gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtA cag ccA ggg cgg tcc ctg aga ctc tcc tgt Aca gcT tct gga ttc acc ttt Ggt
3-53	gag gtg cag ctg gtg gag Act ggA gga ggc ttg Atc cag cct ggg ggg tcc ctg aga ctc tcc tgt gca gcc tct ggG ttc acc GtC agt
3-64	gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtc cag cct ggg ggg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc ttc agt
3-66	gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtc cag cct ggg ggg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc Gtc agt
3-72	gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtc cag cct ggA ggg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc ttc agt
3-73	gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtc cag cct ggg ggg tcc ctg aAa ctc tcc tgt gca gcc tct ggG ttc acc ttc agt
3-74	gag gtg cag ctg gtg gag tcC ggg gga ggc ttA gtT cag cct ggg ggg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc ttc agt
3-d	gag gtg cag ctg gtg gag tct cgg gga gTc ttg gtA cag cct ggg ggg tcc ctg aga ctc tcc tgt gca gcc tct gga ttc acc GtC agt

VH4
4-04	CAG GTG CAG CTG CAG GAG TCG GGC CCA GGA CTG GTG AAG CCT TCG GGG ACC CTG TCC CTC ACC TGC GCT GTC TCT GGT GGC TCC ATC AGC
4-28	cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcg gAC acc ctg tcc ctc acc tgc gct gtc tct ggt TAc tcc atc agc
4-30.1	cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcA CAg acc ctg tcc ctc acc tgc Act gtc tct ggt ggc tcc atc agc
4-30.2	cag Ctg cag ctg cag gag tcC ggc Tca gga ctg gtg aag cct tcA CAg acc ctg tcc ctc acc tgc gct gtc tct ggt ggc tcc atc agc
4-30.4	cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcA CAg acc ctg tcc ctc acc tgc Act gtc tct ggt ggc tcc atc agc
4-31	cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcA CAg acc ctg tcc ctc acc tgc Act gtc tct ggt ggc tcc atc agc
4-34	cag gtg cag ctA cag Cag tGg ggc Gca gga ctg Ttg aag cct tcg gAg acc ctg tcc ctc acc tgc gct gtc tAt ggt ggG tcc Ttc agT
4-39	cag ctg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcg gAg acc ctg tcc ctc acc tgc Act gtc tct ggt ggc tcc atc agc
4-59	cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcg gAg acc ctg tcc ctc acc tgc Act gtc tct ggt ggc tcc atc agT
4-61	cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcg gAg acc ctg tcc ctc acc tgc Act gtc tct ggt ggc tcc Gtc agc
4-b	cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcg gAg acc ctg tcc ctc acc tgc gct gtc tct ggt TAc tcc atc agc

VH5
5-51	GAG GTG CAG CTG GTG CAG TCT GGA GCA GAG GTG AAA AAG CCC GGG GAG TCT CTG AAG ATC TCC TGT AAG GGT TCT GGA TAC AGC TTT ACC
5-a	gaA gtg cag ctg gtg cag tct gga gca gag gtg aaa aag ccc ggg gag tct ctg aGg atc tcc tgt aag ggt tct gga tac agc ttt acc
VH6
6-1	CAG GTA CAG CTG CAG CAG TCA GGT CCA GGA CTG GTG AAG CCC TCG CAG ACC CTC TCA CTC ACC TGT GCC ATC TCC GGG GAC AGT GTC TCT
VH7
7-4.1	CAG GTG CAG CTG GTG CAA TCT GGG TCT GAG TTG AAG AAG CCT GGG GCC TCA GTG AAG GTT TCC TGC AAG GCT TCT GGA TAC ACC TTC ACT
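
A codon alignment such as Table 6 can be read at the protein level by translating each row. The Python sketch below does this with the standard genetic code and is offered as an illustration only: the helper names are hypothetical, and the two example rows (2-05 and 4-04) are typed in from the table as read here.

```python
# Standard genetic code, indexed by base order t, c, a, g at each codon position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(codons):
    """Translate whitespace-separated codons (any letter case) to one-letter amino acids."""
    return "".join(CODON_TABLE[codon.lower()] for codon in codons.split())


if __name__ == "__main__":
    vh2_05 = ("CAG ATC ACC TTG AAG GAG TCT GGT CCT ACG CTG GTG AAA CCC ACA "
              "CAG ACC CTC ACG CTG ACC TGC ACC TTC TCT GGG TTC TCA CTC AGC")
    print(translate(vh2_05))
```

Upper and lower case carry no meaning for translation; in the table they only mark differences from the first gene of each family.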


Table 7: RERS sites in Human HC GLG FR1s where there are at least 20 GLGs cut

BsgI GTGCAG	71 (cuts 16/14 bases to right)
1: 4	1: 13	2: 13	3: 4	3: 13	4: 13
6: 13	7: 4	7: 13	8: 13	9: 4	9: 13
10: 4	10: 13	15: 4	15: 65	16: 4	16: 65
17: 4	17: 65	18: 4	18: 65	19: 4	19: 65
20: 4	20: 65	21: 4	21: 65	22: 4	22: 65
23: 4	23: 65	24: 4	24: 65	25: 4	25: 65
26: 4	26: 65	27: 4	27: 65	28: 4	28: 65
29: 4	30: 4	30: 65	31: 4	31: 65	32: 4
32: 65	33: 4	33: 65	34: 4	34: 65	35: 4
35: 65	36: 4	36: 65	37: 4	38: 4	39: 4
41: 4	42: 4	43: 4	45: 4	46: 4	47: 4
48: 4	48: 13	49: 4	49: 13	51: 4
There are 39 hits at base# 4
There are 21 hits at base# 65
-"- ctgcac	9
12: 63	13: 63	14: 63	39: 63	41: 63	42: 63
44: 63	45: 63	46: 63
BbvI GCAGC	65
1: 6	3: 6	6: 6	7: 6	8: 6	9: 6
10: 6	15: 6	15: 67	16: 6	16: 67	17: 6
17: 67	18: 6	18: 67	19: 6	19: 67	20: 6
20: 67	21: 6	21: 67	22: 6	22: 67	23: 6
23: 67	24: 6	24: 67	25: 6	25: 67	26: 6
26: 67	27: 6	27: 67	28: 6	28: 67	29: 6
30: 6	30: 67	31: 6	31: 67	32: 6	32: 67
33: 6	33: 67	34: 6	34: 67	35: 6	35: 67
36: 6	36: 67	37: 6	38: 6	39: 6	40: 6
41: 6	42: 6	43: 6	44: 6	45: 6	46: 6
47: 6	48: 6	49: 6	50: 12	51: 6
There are 43 hits at base# 6	Bolded sites very near sites listed below
There are 21 hits at base# 67
gctgc	13
37: 9	38: 9	39: 9	40: 3	40: 9	41: 9
42: 9	44: 3	44: 9	45: 9	46: 9	47: 9
50: 9
There are 11 hits at base# 9

BsoFI GCngc
1: 6	3: 6	6: 6	7: 6	8: 6	9: 6
10: 6	15: 6	15: 67	16: 6	16: 67	17: 6
17: 67	18: 6	18: 67	19: 6	19: 67	20: 6
20: 67	21: 6	21: 67	22: 6	22: 67	23: 6
23: 67	24: 6	24: 67	25: 6	25: 67	26: 6
26: 67	27: 6	27: 67	28: 6	28: 67	29: 6
30: 6	30: 67	31: 6	31: 67	32: 6	32: 67
33: 6	33: 67	34: 6	34: 67	35: 6	35: 67
36: 6	36: 67	37: 6	37: 9	38: 6	38: 9
39: 6	39: 9	40: 3	40: 6	40: 9	41: 6
41: 9	42: 6	42: 9	43: 6	44: 3	44: 6
44: 9	45: 6	45: 9	46: 6	46: 9	47: 6
47: 9	48: 6	49: 6	50: 9	50: 12	51: 6
There are 43 hits at base# 6	These often occur together.
There are 11 hits at base# 9
There are 2 hits at base# 3
There are 21 hits at base# 67
TseI Gcwgc
1: 6	3: 6	6: 6	7: 6	8: 6	9: 6
10: 6	15: 6	15: 67	16: 6	16: 67	17: 6
17: 67	18: 6	18: 67	19: 6	19: 67	20: 6
20: 67	21: 6	21: 67	22: 6	22: 67	23: 6
23: 67	24: 6	24: 67	25: 6	25: 67	26: 6
26: 67	27: 6	27: 67	28: 6	28: 67	29: 6
30: 6	30: 67	31: 6	31: 67	32: 6	32: 67
33: 6	33: 67	34: 6	34: 67	35: 6	35: 67
36: 6	36: 67	37: 6	37: 9	38: 6	38: 9
39: 6	39: 9	40: 3	40: 6	40: 9	41: 6
41: 9	42: 6	42: 9	43: 6	44: 3	44: 6
44: 9	45: 6	45: 9	46: 6	46: 9	47: 6
47: 9	48: 6	49: 6	50: 9	50: 12	51: 6
There are 43 hits at base# 6	Often together.
There are 11 hits at base# 9
There are 2 hits at base# 3
There are 1 hits at base# 12
There are 21 hits at base# 67
MspA1I CMGckg	48
1: 7	3: 7	4: 7	5: 7	6: 7	7: 7
8: 7	9: 7	10: 7	11: 7	15: 7	16: 7
17: 7	18: 7	19: 7	20: 7	21: 7	22: 7
23: 7	24: 7	25: 7	26: 7	27: 7	28: 7
29: 7	30: 7	31: 7	32: 7	33: 7	34: 7
35: 7	36: 7	37: 7	38: 7	39: 7	40: 1
40: 7	41: 7	42: 7	44: 1	44: 7	45: 7
46: 7	47: 7	48: 7	49: 7	50: 7	51: 7
There are 46 hits at base# 7

PvuII CAGctg	48
1: 7	3: 7	4: 7	5: 7	6: 7	7: 7
8: 7	9: 7	10: 7	11: 7	15: 7	16: 7
17: 7	18: 7	19: 7	20: 7	21: 7	22: 7
23: 7	24: 7	25: 7	26: 7	27: 7	28: 7
29: 7	30: 7	31: 7	32: 7	33: 7	34: 7
35: 7	36: 7	37: 7	38: 7	39: 7	40: 1
40: 7	41: 7	42: 7	44: 1	44: 7	45: 7
46: 7	47: 7	48: 7	49: 7	50: 7	51: 7
There are 46 hits at base# 7
There are 2 hits at base# 1
AluI AGct	54
1: 8	2: 8	3: 8	4: 8	4: 24	5: 8
6: 8	7: 8	8: 8	9: 8	10: 8	11: 8
15: 8	16: 8	17: 8	18: 8	19: 8	20: 8
21: 8	22: 8	23: 8	24: 8	25: 8	26: 8
27: 8	28: 8	29: 8	29: 69	30: 8	31: 8
32: 8	33: 8	34: 8	35: 8	36: 8	37: 8
38: 8	39: 8	40: 2	40: 8	41: 8	42: 8
43: 8	44: 2	44: 8	45: 8	46: 8	47: 8
48: 8	48: 82	49: 8	49: 82	50: 8	51: 8
There are 48 hits at base# 8
There are 2 hits at base# 2
	Ddel Ctnag					48
5	1:-26	1:	48	2:	26	2:	48	3:	26	3:	48
	4: 26	4 :	48	5:	26	5:	48	6:	26	6:	48
	7: 26	7:	48	8:	26	8:	48	9:	26	10:	26
	11: 26	12:	85	13:	85	14:	85	15:	52	16:	52
	17: 52	18:	52	19:	52	20:	52	21:	52	22:	52
10	23: 52	24 :	52	25:	52	26:	52	27:	52	28:	52
	29: 52	30:	52	31:	52	32:	52	33:	52	35:	30
	35: 52	36:	52	40:	24	49:	52	51:	26	51:	48
	There are	22	hits	at	base#	52	52	and 48	never together
	There are	9	hits	at	base#	48
15	There are	12	hits	at	base#	26	26	and 24	never together
	HphI tcacc					42
	1: 86	3:	86	6:	86	7 :	86	8:	80	11:	86
	12: 5	13:	5	14 :	5	15:	80	16:	80	17:	80
20	18: 80	20:	80	21:	80	22:	80	23:	80	24 :	80
	25: 80	26:	80	27:	80	28:	80	29:	80	30:	80
	31: 80	32:	80	33:	80	34:	80	35:	80	36:	80
	37: 59	38:	59	39:	59	40:	59	41:	59	42:	59
	43: 59	44 :	59	45:	59	46:	59	47:	59	50:	59
25	There are	22	hits	at	base#	80	80	and 86 never together
	There are	. 5	i hits	. at	base#	86
	There are	12	hits	; at	base#	59
	BssKI Nccngg					50
30	1: 39	2:	39	3:	39	4 :	39	5:	39	7:	39
	8: 39	9:	39	10:	39	11:	39	15:	39	16:	39
	17: 39	18:	39	19:	39	20:	39	21:	29	21:	39
	22: 39	23:	39	24 :	39	25:	39	26:	39	27:	39
	28: 39	29:	39	30:	39	31:	39	32:	39	33:	39
35	34: 39	35:	19	35:	39	36:	39	37:	24	38:	24
	39: 24	41:	24	42:	24	44:	24	45:	24	46:	24
	47: 24	48:	39	48:	40	49:	39	49:	40	50:	24
	50: 73	51:	39

There are 35 hits at base# 39 39 and 40 together twice

	There are	2 hits at base# 40
	BsaJI Ccnngg					47
	1: 40	2	: 40	3:	40	4 :	40	5:	40	7 :	40
5	8: 40	9	: 40	9:	47	10:	40	10:	47	11:	40
	15: 40	18	: 40	19:	40	20:	40	21:	40	22:	40
	23: 40	24	: 40	25:	40	26:	40	27:	40	28:	40
	29: 40	30	: 40	31:	40	32:	40	34:	40	35:	20
	35: 40	36	: 40	37:	24	38:	24	39:	24	41:	24
10	42: 24	44	: 24	45:	24	46:	24	47:	24	48:	40
	48: 41	49	: 40	49:	41	50:	74	51:	40
	There	are	32 hits	at	base#	40	40	and 43	together tw
	There	are	2 hits	at	base#	41
	There	are	9 hits	at	base#	24
15	There	are	2 hits	at	base#	47
	BstNI CCwgg					44

PspGI ccwgg

ScrFI($M.Hpall) CCwgg

20	1: 8:	40 40	2 : 9:	40 40	3: 10:	40 40	4 : 11:	40 40	5: 15:	40 40	7: 40 16: 40
	17:	40	18 :	40	19:	40	20:	40	21:	30	21:	40
	22:	40	23:	40	24:	40	25:	40	26:	40	27:	40
	28:	40	29:	40	30:	40	31:	40	32:	40	33:	40
25	34:	40	35:	40	36:	40	37:	25	38:	25	39:	25
	41:	25	42:	25	44 :	25	45:	25	46:	25	47:	25
	50:	25	51:	40
	There are 33	ί hits	i at	base#	ί 40
30	ScrFI	CCngg					50
	1:	40	2:	40	3:	40	4 :	40	5:	40	7:	40
	8:	40	9:	40	10:	40	11:	40	15:	40	16:	40
	17:	40	18:	40	19:	40	20:	40	21:	30	21:	40
	22:	40	23:	40	24 :	40	25:	40	26:	40	27:	40
35	28:	40	29:	40	30:	40	31:	40	32:	40	33:	40
	34:	40	35:	20	35:	40	36:	40	37:	25	38:	25
	39:	25	41:	25	42:	25	44 :	25	45:	25	46:	25

47: 25 48: 40 48: 41 49: 40 49: 41

50: 74 51: 40

There are 35 hits at base# 40

There are 2 hits at base# 41

50: 25

EcoO109I	RGgnccy	3: 9:	43 43	4 : 10:	34
1: 7:	43 43	2 8	: 43 : 43	43 43	5 15	: 43 : 4 6	6: 43 16: 46
17:	46	18	: 46	19:	46	20:	46	21	: 46	22: 46
23:	46	24	: 46	25:	46	26:	46	27	: 46	28: 46
30:	46	31	: 46	32:	46	33:	46	34	: 4 6	35: 46
36:	46	37	: 46	43:	79	51:	43
There are	22 hits at	base#	46	46	and	43 never together

There are 11 hits at base# 43 NlalV GGNncc 71

1:	43	2:	43	3:	43	4 :	43	5:	43	6:	43
7:	43	8:	43	9:	43	9:	79	10:	43	10:	79
15:	46	15:	47	16:	47	17 :	46	17:	47	18 :	46
18:	47	19:	46	19:	47	20:	46	20:	47	21:	46
21:	47	22:	46	22:	47	23:	47	24:	47	25:	47
26:	47	27:	46	27:	47	28:	46	28:	47	29:	47
30:	46	30:	47	31:	46	31:	47	32:	46	32:	47
33:	46	33:	47	34 :	46	34 :	47	35:	46	35:	47
36:	46	36:	47	37:	21	37:	46	37:	47	37:	79
38:	21	39:	21	39:	79	40:	79	41:	21	41:	79
42:	21	42:	79	43:	79	44 :	21	44 :	79	45:	21
45:	79	46:	21	46:	79	47 :	21	51:	43
There ;	are 23 hits at	base# 47	46	& 47	often	together

There are	17	hits	at	base#	46		There	are	11	hits	at	base#	43
Sau9	51 Ggn	CC					70
1:	44	2:	3	2:	44	3:	44	4:	44	5:	3	5:	44	6:	44
7:	44	8:	22	8:	44	9:	44	10:	44	11:	3	12:	22	13:	22
14:	22	15:	33	15:	47	16:	47	17:	47	18:	47	19:	47	20:	47
21:	47	22:	47	23:	33	23:	47	24 :	33	24 :	47	25:	33	25:	47
26:	33	26:	47	27:	47	28:	47	29:	47	30:	47	31:	33	31:	47
32:	33	32:	47	33:	33	33:	47	34 :	33	34 :	47	35:	47	36:	47
37:	21	37:	22	37:	47	38:	21	38:	22	39:	21	39:	22	41:	21
41:	22	42:	21	42:	22	43:	80	44 :	21	44 :	22	45:	21	45:	22
46:	21	46:	22	47 :	21	47:	22	50:	22	51:	44

There are	23 11	hits hits	at base#
There	are	at	base#
There	are	14	hits	at	base#
There	are	9	hits	at	base#

These do not occur together 44

These do occur together.

BsmAI GTCTCNnnnn

	1:	58	3:	58	4: 58
	10:	58	13:	70	36: 18
	40:	70	41:	70	42: 70
10	47 :	70	48:	48	49: 48
	There ars	11	hits	at bas
	__ It _	Nnnnnngagac
	13:	40	15:	48	16: 48
15	21:	48	22:	48	23: 48
	27:	48	28:	48	29: 48
	32:	48	33:	48	35: 48
	45:	40	46:	40	47: 40

There are 20 hits at base#

Avail Ggwcc

Sau96I(SM.Haelll) Ggwcc

	2:	3	5:	3	6:	44
	11:	3	12:	22	13:	22
25	16:	47	17:	47	18:	47
	22:	47	23:	33	23:	47
	25:	47	26:	33	26:	47
	30:	47	31:	33	31:	47
	33:	47	34:	33	34 :	47
30	43:	80	50:	22
	There	are 23	hits	at	base#
	There	are 4	hits	at	base#
35	PpuMI 6:	RGgwccy 43 8 :	43	9:	43
	17 :	46	18:	46	19:	46
	23:	46	24 :	46	25:	46

5:	58	8:	58	9:	58
37:	70	38:	70	39:	70
44 :	70	45:	70	46:	70
50:	85
70
	27
17:	48	18 :	48	20:	48
24 :	48	25:	48	26:	48
30:	10	30:	48	31:	48
36:	48	43:	40	44 :	40
48
	44
	44
8:	44	9:	44	10:	44
14:	22	15:	33	15:	47
19:	47	20:	47	21:	47
24:	33	24 :	47	25:	33
27:	47	28:	47	29:	47
32:	33	32:	47	33:	33
35:	47	36:	47	37:	47
47	44	& 47	never	together
44
	27
10:	43	15:	46	16:	46
20:	46	21:	46	22:	46
26:	46	27:	46	28:	46

	30:	46	31:	46	32:	46	33: 46	34 :	46	35:	46
	36:	46	37:	46	43:	79
	There are	22	hits	at	base#	46 43	and 46	never oc	:cu:
	There are	4	hits	at	base#	43
5
	BsraFI	GGGAC				3
	8:	43	37:	46	50:	77
		gtccc				33
	15:	48	16:	48	17:	48	1: 0	1:	0	20 :	48
10	21:	48	22:	48	23:	48	24 : 48	25:	48	26:	48
	27:	48	28:	48	29:	48	30: 48	31:	48	32 :	48
	33:	48	34:	48	35:	48	36: 48	37 :	54	38 :	54
	39:	54	40:	54	41:	54	42 : 54	43:	54	44 :	54
	45:	54	46:	54	47:	54
15	There are 20	' hits	; at	base#	48
	There are 11	. hits	i at	base#	54
	Hinfl Gantc				80
	8:	77	12:	16	13:	16	14: 16	15:	16	15:	56
20	15:	77	16:	16	16:	56	16: 77	17:	16	17:	56
	17:	77	18:	16	18:	56	18: 77	19:	16	19:	56
	19:	77	20:	16	20:	56	20: 77	21:	16	21:	56
	21:	77	22:	16	22:	56	22: 77	23:	16	23:	56
	23:	77	24:	16	24:	56	24: 77	25:	16	25:	56
25	25:	77	26:	16	26:	56	26: 77	27:	16	27:	26
	27:	56	27:	77	28:	16	28: 56	28:	77	29:	16
	29:	56	29:	77	30:	56	31: 16	31:	56	31:	77
	32:	16	32:	56	32:	77	33: 16	33:	56	33:	77
	34:	16	35:	16	35:	56	35: 77	36:	16	36:	26
30	36:	56	36:	77	37:	16	38: 16	39:	16	40:	16
	41:	16	42:	16	44:	16	45: 16	46:	16	47:	16
	48:	46	49:	46
	There are 34 hits at	base# 16
35	Tfil	Gawtc				21
	8:	77	15:	77	16:	77	17: 77	18:	77	19:	77
	20:	77	21:	77	22:	77	23: 77	24:	77	25:	77
	26:	77	27:	77	28:	77	29: 77	31:	77	32:	77

33:	77	35:	77	36:	77
There are	21	hits	at	base#	77
Mlyl	GAGTC					38
12:	16	13:	16	14 :	16	15:	16	16:	16	17
18:	16	19:	16	20:	16	21:	16	22:	16	23
24 :	16	25 :	16	26:	16	27:	16	27:	26	28
29:	16	31:	16	32:	16	33:	16	34 :	16	35
36:	16	36:	26	37:	16	38:	16	39:	16	40
41:	16	42:	16	44 :	16	45:	16	46:	16	47
48 :	46	49:	46
There are	34	hits	. at	base#	16

GACTC 21

15:	56	16:	56	17 :	56	18:	56	19:	56	20:
21:	56	22:	56	23:	56	24 :	56	25:	56	26:
27:	56	28:	56	29:	56	30:	56	31:	56	32:
33:	56	35:	56	36:	56
There are	21	hits	at	base#	56
Plel	gagtc				38
12:	16	13:	16	14:	16	15:	16	16:	16	17:
18:	16	19:	16	20:	16	21:	16	22:	16	23:
24:	16	25:	16	26:	16	27:	16	27:	26	28:
29:	16	31:	16	32:	16	33:	16	34 :	16	35:
36:	16	36:	26	37:	16	38:	16	39:	16	40:
41:	16	42:	16	44 :	16	45:	16	46:	16	47:
48:	46	49:	46
There are 34	hits	at	base#	16

. »1 __	gactc						21
15:	56	16:	56	17:	56	18:	56	19:	56	20
21:	56	22:	56	23:	56	24 :	56	25:	56	26
27:	56	28:	56	29:	56	30:	56	31:	56	32
33:	56	35:	56	36:	56
There are	21	. hits	; at	base#	56

AlwNI CAGNNNctg 26

16: 68 17: 68 18: 68 19: 68 20:

15: 68

21:	68	22:	68	23:	68	24:	68	25:	68	26:	68
27:	68	28:	68	29:	68	30:	68	31:	68	32:	68
33:	68	34:	68	35:	68	36:	68	39:	46	40:	46
41:	46	42:	46
There .	are 22	hits	; at	base#	68

115 so ο

(Μ

Gh <υ oo

Ο

! 1

2

3

4

5

6

7

8

9

10

11

12

GAC

ATC

CAG

ATG

ACC

CAG

TCT

CCA

TCC

CTG

TCT

CO

! 13

14

15

16

17

18

19

20

21

22

23

(Μ OS

5

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGC

I

υη Λ\ΐ

GAC

ATC

CAG

ATG

ACC

CAG

TCT

CCA

TCC

CTG

TCT

04

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGC

1

so ι—Ι

GAC

ATC

CAG

ATG

ACC

CAG

TCT

CCA

TCC

CTG

TCT

ο

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGC

1

ΓΜ

10

GAC

ATC

CAG

ATG

ACC

CAG

TCT

CCA

TCC

CTG

TCT

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGC

1

GAC

ATC

CAG

ATG

ACC

CAG

TCT

CCA

TCC

CTG

TCT

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGC

1

GAC

ATC

CAG

ATG

ACC

CAG

TCT

CCA

TCC

CTG

TCT

15

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGC

I

AAC

ATC

CAG

ATG

ACC

CAG

TCT

CCA

TCT

GCC

ATG

TCT

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGT

1

GAC

ATC

CAG

ATG

ACC

CAG

TCT

CCA

TCC

TCA

CTG

TCT

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGT

t

20

GAC

ATC

CAG

ATG

ACC

CAG

TCT

CCA

TCC

TCA

CTG

TCT

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGT

t

GCC

ATC

CAG

TTG

ACC

CAG

TCT

CCA

TCC

CTG

TCT

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGC

1

GCC

ATC

CAG

TTG

ACC

CAG

TCT

CCA

TCC

CTG

TCT

25

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGC

1

GAC

ATC

CAG

ATG

ACC

CAG

TCT

CCA

TCT

TCC

GTG

TCT

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGT

I

GAC

ATC

CAG

ATG

ACC

CAG

TCT

CCA

TCT

GTG

TCT

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGT

t

30

GAC

ATC

CAG

TTG

ACC

CAG

TCT

CCA

TCC

TTC

CTG

TCT

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGC

1

GCC

ATC

CGG

ATG

ACC

CAG

TCT

CCA

TTC

TCC

CTG

TCT

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGC

1

GCC

ATC

CGG

ATG

ACC

CAG

TCT

CCA

TCC

TCA

TTC

TCT

35

GCA

TCT

ACA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGT

1

012

018

Α20

Α30

L14

LI

L15

L4

L18

L5

L19

L8

L23

L9

GTC GCA GCC GCA

ATC TCT ATC TCT

TGG ATG ACC ACA GGA GAC

CAG AGA CAG AGA

TCT GTC TCT GTC

CCA ACC CCA ACC

TCC ATC TCC ATC

TTA CTC

TCT

AGT TCC ACT

TGT CTG TGC

I TCT j

L2 4 Lll

CAG ATG

ACC GAC

GTA

GGA

5

GAC

ATC

CAG

ATG

ACC

CAG

TCT

CCT

TCC

ACC

CTG

TCT

GCA

TCT

GTA

GGA

GAC

AGA

GTC

ACC

ATC

ACT

TGC

i

L12

GAT

ATT

GTG

ATG

ACC

CAG

ACT

CCA

CTC

TCC

CTG

CCC

GTC

ACC

CCT

GGA

GAG

CCG

GCC

TCC

ATC

TCC

TGC

t

Oil

GAT

ATT

GTG

ATG

ACC

CAG

ACT

CCA

CTC

TCC

CTG

CCC

10

GTC

ACC

CCT

GGA

GAG

CCG

GCC

TCC

ATC

TCC

TGC

1

01

GAT

GTT

GTG

ATG

ACT

CAG

TCT

CCA

CTC

TCC

CTG

CCC

GTC

ACC

CTT

GGA

CAG

CCG

GCC

TCC

ATC

TCC

TGC

I

A17

GAT

GTT

GTG

ATG

ACT

CAG

TCT

CCA

CTC

TCC

CTG

CCC

GTC

ACC

CTT

GGA

CAG

CCG

GCC

TCC

ATC

TCC

TGC

1

Al

15

GAT

ATT

GTG

ATG

ACC

CAG

ACT

CCA

CTC

TCT

CTG

TCC

GTC

ACC

CCT

GGA

CAG

CCG

GCC

TCC

ATC

TCC

TGC

t

A18

GAT

ATT

GTG

ATG

ACC

CAG

ACT

CCA

CTC

TCT

CTG

TCC

GTC

ACC

CCT

GGA

CAG

CCG

GCC

TCC

ATC

TCC

TGC

f

A2

GAT

ATT

GTG

ATG

ACT

CAG

TCT

CCA

CTC

TCC

CTG

CCC

20

GTC

ACC

CCT

GGA

GAG

CCG

GCC

TCC

ATC

TCC

TGC

J

A19

GAT

ATT

GTG

ATG

ACT

CAG

TCT

CCA

CTC

TCC

CTG

CCC

GTC

ACC

CCT

GGA

GAG

CCG

GCC

TCC

ATC

TCC

TGC

f

A3

GAT

ATT

GTG

ATG

ACC

CAG

ACT

CCA

CTC

TCC

TCA

CCT

GTC

ACC

CTT

GGA

CAG

CCG

GCC

TCC

ATC

TCC

TGC

1

A23

25

GAA

ATT

GTG

TTG

ACG

CAG

TCT

CCA

GGC

ACC

CTG

TCT

TTG

TCT

CCA

GGG

GAA

AGA

GCC

ACC

CTC

TCC

TGC

1

A27

GAA

ATT

GTG

TTG

ACG

CAG

TCT

CCA

GCC

ACC

CTG

TCT

TTG

TCT

CCA

GGG

GAA

AGA

GCC

ACC

CTC

TCC

TGC

1

All

GAA

ATA

GTG

ATG

ACG

CAG

TCT

CCA

GCC

ACC

CTG

TCT

30

GTG

TCT

CCA

GGG

GAA

AGA

GCC

ACC

CTC

TCC

TGC

1

L2

GAA

ATA

GTG

ATG

ACG

CAG

TCT

CCA

GCC

ACC

CTG

TCT

GTG

TCT

CCA

GGG

GAA

AGA

GCC

ACC

CTC

TCC

TGC

1

L16

GAA

ATT

GTG

TTG

ACA

CAG

TCT

CCA

GCC

ACC

CTG

TCT

TTG

TCT

CCA

GGG

GAA

AGA

GCC

ACC

CTC

TCC

TGC

t

L6

35

GAA

ATT

GTG

TTG

ACA

CAG

TCT

CCA

GCC

ACC

CTG

TCT

TTG	TCT	CCA	GGG	GAA	AGA	GCC	ACC	CTC	TCC	TGC	! L20
GAA	ATT	GTA	ATG	ACA	CAG	TCT	CCA	GCC	ACC	CTG	TCT
TTG	TCT	CCA	GGG	GAA	AGA	GCC	ACC	CTC	TCC	TGC	! L25
GAC	ATC	GTG	ATG	ACC	CAG	TCT	CCA	GAC	TCC	CTG	GCT
GTG	TCT	CTG	GGC	GAG	AGG	GCC	ACC	ATC	AAC	TGC	! B3
GAA	ACG	ACA	CTC	ACG	CAG	TCT	CCA	GCA	TTC	ATG	TCA
GCG	ACT	CCA	GGA	GAC	AAA	GTC	AAC	ATC	TCC	TGC	! B2
GAA	ATT	GTG	CTG	ACT	CAG	TCT	CCA	GAC	TTT	CAG	TCT
GTG	ACT	CCA	AAG	GAG	AAA	GTC	ACC	ATC	ACC	TGC	! A26
GAA	ATT	GTG	CTG	ACT	CAG	TCT	CCA	GAC	TTT	CAG	TCT
GTG	ACT	CCA	AAG	GAG	AAA	GTC	ACC	ATC	ACC	TGC	! A10
GAT	GTT	GTG	ATG	ACA	CAG	TCT	CCA	GCT	TTC	CTC	TCT
GTG	ACT	CCA	GGG	GAG	AAA	GTC	ACC	ATC	ACC	TGC	! A14


X u hu. Cu > X $							•
Mnll	© s© CM m	© Ό m m	o © TT m		V m »n m						Λ © m ©s m
ε ui £C	©v m CM <n 00 CM rn	©v rn rH rn CO m <n	©s *n m· m 00 «η		o »n in m 00 m rn		to- Ό m co © rn		00 to- rn	CO co m	oo o m
w CC					in in m
£	CM CM m	CM r· m m	CM Tf· m		CM in m		©s © rn		CM to- rn	CM 00 m	CM C\ m
Λ _ V θ Λ U« 1					1
*v3 s					m © m m
	O Ό CM rn © CM rn i©	© © m <n © m <n J	c\ © Tt m © tC m m CM 3	£· 12 >	©\ © in rn © in m m CQ	>	©1 © V© rn © V© m CM co	>	Λ26 3701-3769	Λ10 3801-3869	A14 3901-3969


Ό

Ο

C •X3

C

Ο υ

Ο

U

Ο

U, «

Cu

CU .3

Ό

-Ο

Λ

Η

CM ιΛ X —, X *rt CU sO C. 5Γ ο Ξ 2 ϊ																1406	1506 j
CM Ο X X SO m X X χ 22 c- 2 τ* X χ		sO U“,	156	256 1	sO m m	456	sO in in	SO m so	756	9S8 ί	956	1056	1156	1256	Ό m m	1456	so «η m
Mac 111 Tsp45I same sites		lA m	155	255	in in en	455	m m in	655	755	855	m in Cs	1055	«η in	in ιΛ CM	m <n <n	m «η T}-	1555
V Λ ί		f*h in	’-J m	tn in CM	«η «η «η	<n iA	m m	eO in sO	753	853	*n in Cs	1053	1153	rn in CM	cO m m	m m	1553
c: ,c		en m	rn m	<n m CM	*n m «η	<n m	553	rO m sO	⁷⁵³ I	853	m m Cs	1053	«η in	<n m CM	rO m cn	•n ’t’	1553
Sfcl		'r	141	241	»n	441	541	641	741	841	941	1041	1141	•’S' CM	1341	1441	1541
SfaNl		r- <n	Γ- m	237	r-- <n *n	Γ-- <n	537	637	Γ- en r-	837	r- m Cs	1037	1137	1237	r- <n	1437	1537
	VKI	012 1-69	02 101-169	018 201-269	I 08 301-369	Λ20 401-469	Λ30 501-569	L14 601-669	Ll 701-769	I 1 L15 801-869	L4 901-969	j L18 1001-1069	L5 1101-1169	L19 1201-1269	Cs Ό en s <n 00 uj	L23 1401-1469	Cs sO m o m Cs Ί


CM Id 3 g. g- § I 2 S
CM Ό X X SO id X X £ 5S cu X X	1656	sO id r-	sO Ld co
Maelll Tsp451 same sites	id Id •o	id id r~	id id OO
V Λ . - Λ !	cd id SO	cd Id r-	cd Id co
G	cd id sO	cd Id r-	*d id co
t/i	SO	Tt r-~	Tf oo
SfaNI	t— cd o	r- cd r-	r- Cd oc
	L24 1601-1669	L11 1701-1769	IJ2 1801-1869

1952	2052	2152	2252	2352	2452	2552	2652
00 cd c\	00 O CM	2138	2238	2338	CO cd cm	2538	2638	2731* 2738*
1937	2037	2137	2237	2337	2437	2537	Γ- cd SO cm	2737
00	2018	2112	2212	2318	2418	2512	2612	2718
00 δ	2018 ................._J	2112	2212	00 cd CM	2418	2512	2612	2718


C\ so o o C\ δ	O1 2001-2069	A17 2101-2169	A1 2201-2269	Λ18 2301-2369	Λ2 2401-2469	A19 2501-2569	CA sO Ό CM O SO CM cd	c\ SO r- CM δ r- CM rd CM


03 o Uh CL IT. Η
u *2
£ » ·3 o S « « M cc U Z Z
rt rt X 1 X i V ’t _ S Λ I'M x x :		I
? £ <=> S. -5 X CH « X X x
c-> rt X X CH rt X X O rt* CH « x X x
	L9 1501-1569	L24 1601-1669	L11 1701-1769	L12 1801-1869


1954	2054	2154	2254	2354	2454	2554	rt uh Ό CH	2754
1951	2051	2151	Uh Cl CH	Uh ««h CH	2451	2551	uh Ό CH	2751
1944	2044					2544	2644
rt· o	2043			r'h rt r'h CH	2445	«-h Tt Uh CH	S© CH
1942	2042 j	2142	CH rt CH CH	CH rt «<Ί CH	2442	2542	2642	2742
011 1901-1969	O1 2001-2069	Λ17 2101-2169	Al 2201-2269	Λ18 2301-2369	A2 2401-2469	Λ19 2501-2569	A3 2601-2669	A23 2701-2769

2803	2903


rt QO CH © CH 00 CH	2920 2941
2822 2843	2943
rt 00 cn	Ch rt C\ CH
Λ27 2801-2869	A11 2901-2969


Table 10 Lambda FR1 GLG sequences

! VL1

CAG	TCT	GTG	CTG	ACT	CAG	CCA	CCC	TCG	GTG	TCT	GAA
GCC	CCC	AGG	CAG	AGG	GTC	ACC	ATC	TCC	TGT	! 1a
cag	tct	gtg	ctg	acG	cag	ccG	ccc	tcA	gtg	tct	gGG
gcc	ccA	Ggg	cag	agg	gtc	acc	atc	tcc	tgC	! 1e
cag	tct	gtg	ctg	act	cag	cca	ccc	tcA	gcg	tct	gGG
Acc	ccc	Ggg	cag	agg	gtc	acc	atc	tcT	tgt	! 1c
cag	tct	gtg	ctg	act	cag	cca	ccc	tcA	gcg	tct	gGG
Acc	ccc	Ggg	cag	agg	gtc	acc	atc	tcT	tgt	! 1g
cag	tct	gtg	Ttg	acG	cag	ccG	ccc	tcA	gtg	tct	gCG
gcc	ccA	GgA	cag	aAg	gtc	acc	atc	tcc	tgC	! 1b

! VL2

CAG	TCT	GCC	CTG	ACT	CAG	CCT	CCC	TCC	GCG	TCC	GGG
TCT	CCT	GGA	CAG	TCA	GTC	ACC	ATC	TCC	TGC	! 2c
cag	tct	gcc	ctg	act	cag	cct	cGc	tcA	gTg	tcc	ggg
tct	cct	gga	cag	tca	gtc	acc	atc	tcc	tgc	! 2e
cag	tct	gcc	ctg	act	cag	cct	Gcc	tcc	gTg	tcT	ggg
tct	cct	gga	cag	tcG	Atc	acc	atc	tcc	tgc	! 2a2
cag	tct	gcc	ctg	act	cag	cct	ccc	tcc	gTg	tcc	ggg
tct	cct	gga	cag	tca	gtc	acc	atc	tcc	tgc	! 2d
cag	tct	gcc	ctg	act	cag	cct	Gcc	tcc	gTg	tcT	ggg
tct	cct	gga	cag	tcG	Atc	acc	atc	tcc	tgc	! 2b2

! VL3

TCC	TAT	GAG	CTG	ACT	CAG	CCA	CCC	TCA	GTG	TCC	GTG
TCC	CCA	GGA	CAG	ACA	GCC	AGC	ATC	ACC	TGC	! 3r
tcc	tat	gag	ctg	act	cag	cca	cTc	tca	gtg	tcA	gtg
Gcc	cTG	gga	cag	acG	gcc	agG	atT	acc	tgT	! 3j
tcc	tat	gag	ctg	acA	cag	cca	ccc	tcG	gtg	tcA	gtg
tcc	cca	gga	caA	acG	gcc	agG	atc	acc	tgc	! 3p
tcc	tat	gag	ctg	acA	cag	cca	ccc	tcG	gtg	tcA	gtg
tcc	cTa	gga	cag	aTG	gcc	agG	atc	acc	tgc	! 3a
tcT	tct	gag	ctg	act	cag	GAC	ccT	GcT	gtg	tcT	gtg
Gcc	TTG	gga	cag	aca	gTc	agG	atc	acA	tgc	! 3l

.cc

tat

gTg

ctg

act

cag

cca

CCC

tea

gtg

tcA gtg

Gcc

cca

gga

Aag

aeG

gcc

agG

atT

acc

tgT

! 3h

tcc

tat

gag

ctg

acA

cag

cTa

CCC

teG

gtg

tcA gtg

tcc

cca

gga

cag

aca

gcc

agG

ate

acc

tgc

! 3e

tcc

tat

gag

ctg

aTG

cag

cca

CCC

teG

gtg

tcA gtg

tcc

cca

gga

cag

aeG

gcc

agG

ate

acc

tgc

! 3m

tcc

tat

gag

ctg

acA

cag

cca

Tcc

tea

gtg

tcA gtg

tCT

ccG

gga

cag

aca

gcc

agG

ate

acc

tgc

! V2-19

1

VL4

CTG

CCT

GTG

CTG

ACT

CAG

CCC

CCG

TCT

GCA

TCT GCC

TTG

CTG

GGA

GCC

TCG

ATC

AAG

CTC

ACC

TGC

! Ac

cAg

cct

gtg

ctg

act

caA

TcA

TeC

tet

geC

tet geT

tcc

ctg

gga

Tcc

teg

Gtc

aag

etc

acc

tgc

! 4a

cAg

eTt

gtg

ctg

act

caA

TeG

ccC

tet

geC

tet gcc

)

tcc

ctg

gga

gcc

teg

Gtc

aag

etc

acc

tgc

! 4b

! VL5

CAG

CCT

GTG

CTG

ACT

CAG

CCA

CCT

TCC

TCC GCA

TCT

CCT

GGA

GAA

TCC

GCC

AGA

CTC

ACC

TGC

! 5e

cag

Get

gtg

ctg

act

cag

ccG

Get

tcc

CTc

teT gca

)

tet

cct

gga

gCa

tcA

gcc

agT

etc

acc

tgc

! 5c

cag

cct

gtg

ctg

act

cag

cca

Tet

tcc

CAT

teT gca

tet

Tet

gga

gCa

tcA

gTc

aga

etc

acc

tgc

! 5b

! VL6

AAT

TTT

ATG

CTG

ACT

CAG

CCC

CAC

TCT

GTG

TCG GAG

3

! VL7

TCT

CCG

GGG

AAG

ACG

GTA

ACC

ATC

TCC

TGC

! 6a

CAG

ACT

GTG

ACT

CAG

GAG

CCC

TCA

CTG

ACT GTG

TCC

CCA

GGA

GGG

ACA

GTC

ACT

CTC

ACC

TGT

! 7a

cag

Get

gtg

act

cag

gag

CCC

tea

ctg

act gtg

)

VL8

tcc

cca

gga

ggg

aca

gtc

act

etc

acc

tgt

! 7b

CAG

ACT

GTG

ACC

CAG

GAG

CCA

TCG

TTC

TCA GTG

TCC

CCT

GGA

GGG

ACA

GTC

ACA

CTC

ACT

TGT

! 8a

! VL9

CAG	CCT	GTG	CTG	ACT	CAG	CCA	CCT	TCT	GCA	TCA	GCC
TCC	CTG	GGA	GCC	TCG	GTC	ACA	CTC	ACC	TGC	! 9a

! VL10

CAG	GCA	GGG	CTG	ACT	CAG	CCA	CCC	TCG	GTG	TCC	AAG
GGC	TTG	AGA	CAG	ACC	GCC	ACA	CTC	ACC	TGC	! 10a

Table 11 RERSs found in human lambda FR1 GLGs

! There are 31 lambda GLGs

MlyI	NnnnnnGACTC	25
1:	6	3:	6	4:	6	6:	6	7:	6	8:	6
9:	6	10:	6	11:	6	12:	6	15:	6	16:	6
20:	6	21:	6	22:	6	23:	6	23:	50	24:	6
25:	6	25:	50	26:	6	27:	6	28:	6	30:	6
31:	6
There are 23 hits at base# 6
	GAGTCNNNNNn	1
26:	34
MwoI	GCNNNNNnngc	20
1:	9	2:	9	3:	9	4:	9	11:	9	11:	56
12:	9	13:	9	14:	9	16:	9	17:	9	18:	9
19:	9	20:	9	23:	9	24:	9	25:	9	26:	9
30:	9	31:	9
There are 19 hits at base# 9
HinfI	Gantc	27
1:	12	3:	12	4:	12	6:	12	7:	12	8:	12
9:	12	10:	12	11:	12	12:	12	15:	12	16:	12
20:	12	21:	12	22:	12	23:	12	23:	46	23:	56
24:	12	25:	12	25:	56	26:	12	26:	34	27:	12
28:	12	30:	12	31:	12
There are 23 hits at base# 12
PleI	gactc	25
1:	12	3:	12	4:	12	6:	12	7:	12	8:	12
9:	12	10:	12	11:	12	12:	12	15:	12	16:	12
20:	12	21:	12	22:	12	23:	12	23:	56	24:	12
25:	12	25:	56	26:	12	27:	12	28:	12	30:	12
31:	12
There are 23 hits at base# 12
	gagtc
26:	34
DdeI	Ctnag	32
1:	14	2:	24	3:	14	3:	24	4:	14	4:	24
5:	24	6:	14	7:	14	7:	24	8:	14	9:	14
10:	14	11:	14	11:	24	12:	14	12:	24	15:	5
15:	14	16:	14	16:	24	19:	24	20:	14	23:	14
24:	14	25:	14	26:	14	27:	14	28:	14	29:	30
30:	14	31:	14
There are 21 hits at base# 14
BsaJI	Ccnngg	38
1:	23	1:	40	2:	39	2:	40	3:	39	3:	40
4:	39	4:	40	5:	39	11:	39	12:	38	12:	39
13:	23	13:	39	14:	23	14:	39	15:	38	16:	39
17:	23	17:	39	18:	23	18:	39	21:	38	21:	39
21:	47	22:	38	22:	39	22:	47	26:	40	27:	39
28:	39	29:	14	29:	39	30:	38	30:	39	30:	47
31:	23	31:	32
There are 17 hits at base# 39
There are 5 hits at base# 38
There are 5 hits at base# 40	Makes cleavage ragged.
MnlI	cctc	35
1:	23	2:	23	3:	23	4:	23	5:	23	6:	19
6:	23	7:	19	8:	23	9:	19	9:	23	10:	23
11:	23	13:	23	14:	23	16:	23	17:	23	18:	23
19:	23	20:	47	21:	23	21:	29	21:	47	22:	23
22:	29	22:	35	22:	47	23:	26	23:	29	24:	27
27:	23	28:	23	30:	35	30:	47	31:	23
There are 21 hits at base# 23
There are 3 hits at base# 19
There are 3 hits at base# 29
There are 1 hits at base# 26
There are 1 hits at base# 27	These could make cleavage ragged.
	gagg	7
1:	48	2:	48	3:	48	4:	48	27:	44	28:	44
29:	44
BssKI	Nccngg	39
1:	40	2:	39	3:	39	3:	40	4:	39	4:	40
5:	39	6:	31	6:	39	7:	31	7:	39	8:	39
9:	31	9:	39	10:	39	11:	39	12:	38	12:	52
13:	39	13:	52	14:	52	16:	39	16:	52	17:	39
17:	52	18:	39	18:	52	19:	39	19:	52	21:	38
22:	38	23:	39	24:	39	26:	39	27:	39	28:	39
29:	14	29:	39	30:	38
There are 21 hits at base# 39
There are 4 hits at base# 38
There are 3 hits at base# 31
There are 3 hits at base# 40	Ragged
BstNI	CCwgg	30
1:	41	2:	40	5:	40	6:	40	7:	40	8:	40
9:	40	10:	40	11:	40	12:	39	12:	53	13:	40
13:	53	14:	53	16:	40	16:	53	17:	40	17:	53
18:	40	18:	53	19:	53	21:	39	22:	39	23:	40
24:	40	27:	40	28:	40	29:	15	29:	40	30:	39
There are 17 hits at base# 40
There are 7 hits at base# 53
There are 4 hits at base# 39
There are 1 hits at base# 41	Ragged
PspGI	ccwgg	30
1:	41	2:	40	5:	40	6:	40	7:	40	8:	40
9:	40	10:	40	11:	40	12:	39	12:	53	13:	40
13:	53	14:	53	16:	40	16:	53	17:	40	17:	53
18:	40	18:	53	19:	53	21:	39	22:	39	23:	40
24:	40	27:	40	28:	40	29:	15	29:	40	30:	39
There are 17 hits at base# 40
There are 7 hits at base# 53
There are 4 hits at base# 39
There are 1 hits at base# 41
ScrFI	CCngg	39
1:	41	2:	40	3:	40	3:	41	4:	40	4:	41
5:	40	6:	32	6:	40	7:	32	7:	40	8:	40
9:	32	9:	40	10:	40	11:	40	12:	39	12:	53
13:	40	13:	53	14:	53	16:	40	16:	53	17:	40
17:	53	18:	40	18:	53	19:	40	19:	53	21:	39
22:	39	23:	40	24:	40	26:	40	27:	40	28:	40
29:	15	29:	40	30:	39
There are 21 hits at base# 40
There are 4 hits at base# 39
There are 3 hits at base# 41
MaeIII	gtnac	16
1:	52	2:	52	3:	52	4:	52	5:	52	6:	52
7:	52	9:	52	26:	52	27:	10	27:	52	28:	10
28:	52	29:	10	29:	52	30:	52
There are 13 hits at base# 52
Tsp45I	gtsac	15
1:	52	2:	52	3:	52	4:	52	5:	52	6:	52
7:	52	9:	52	27:	10	27:	52	28:	10	28:	52
29:	10	29:	52	30:	52
There are 12 hits at base# 52
HphI	tcacc	26
1:	53	2:	53	3:	53	4:	53	5:	53	6:	53
7:	53	8:	53	9:	53	10:	53	11:	59	13:	59
14:	59	17:	59	18:	59	19:	59	20:	59	21:	59
22:	59	23:	59	24:	59	25:	59	27:	59	28:	59
30:	59	31:	59
There are 16 hits at base# 59
There are 10 hits at base# 53
BspMI	ACCTGCNNNNn	14
11:	61	13:	61	14:	61	17:	61	18:	61	19:	61
20:	61	21:	61	22:	61	23:	61	24:	61	25:	61
30:	61	31:	61
There are 14 hits at base# 61	Goes into CDR1
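
The hit lists in Table 11 can be approximated with a short script. The following is only a minimal sketch, not the program used to build the table: it assumes that each recognition sequence is matched in the written orientation only, that lowercase and uppercase letters are equivalent, and that positions are 1-based counts from the first base of FR1. The function names `find_sites` and `tally_by_position` are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(recognition, sequence):
    """Return the 1-based start position of every match, overlaps included."""
    pattern = "".join(IUPAC[base] for base in recognition.upper())
    # The zero-width lookahead lets overlapping sites be reported as well.
    return [m.start() + 1
            for m in re.finditer(f"(?=({pattern}))", sequence.upper())]


def tally_by_position(recognition, sequences):
    """Count how many hits fall at each base position across all sequences."""
    counts = {}
    for seq in sequences:
        for pos in find_sites(recognition, seq):
            counts[pos] = counts.get(pos, 0) + 1
    return counts


# FR1 of lambda germline gene 1a, transcribed from Table 10.
VL1_1A = ("CAGTCTGTGCTGACTCAGCCACCCTCGGTGTCTGAA"
          "GCCCCCAGGCAGAGGGTCACCATCTCCTGT")

for enzyme, site in [("HinfI", "Gantc"), ("DdeI", "Ctnag"), ("HphI", "tcacc")]:
    print(enzyme, find_sites(site, VL1_1A))
```

Run on the first lambda sequence, the sketch reports the same positions that Table 11 lists for GLG 1 under HinfI, DdeI and HphI. Enzymes whose site is tabulated in both orientations (for example MlyI) would need one call per orientation.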

Table 12: Matches to URE FR3 adapters in 79 human HC.

A. List of heavy-chain genes sampled

AF008566	AF103367	HSA235674	HSU94417	S83240
AF035043	AF103368	HSA235673	HSU94418	SABVH369
AF103026	AF103369	HSA240559	HSU96389	SADEIGVH
AF103033	AF103370	HSCB201	HSU96391	SAH2IGVH
AF103061	AF103371	HSIGGVHC	HSU96392	SDA3IGVH
AF103072	AF103372	HSU44791	HSU96395	SIGVHTTD
AF103078	AF158381	HSU44793	HSZ93849	SUK4IGVH
AF103099	E05213	HSU82771	HSZ93850
AF103102	E05886	HSU82949	HSZ93851
AF103103	E05887	HSU82950	HSZ93853
AF103174	HSA235661	HSU82952	HSZ93855
AF103186	HSA235664	HSU82961	HSZ93857
AF103187	HSA235660	HSU86522	HSZ93860
AF103195	HSA235659	HSU86523	HSZ93863
AF103277	HSA235678	HSU92452	MCOMFRAA
AF103286	HSA235677	HSU94412	MCOMFRVA
AF103309	HSA235676	HSU94415	S82745
AF103343	HSA235675	HSU94416	S82764

Table 12B. Testing all distinct GLGs from bases 89.1 to 93.2 of the heavy variable domain

Id	Nb	0	1	2	3	4			SEQ ID NO:
1	38	15	11	10	0	2	Seq1	gtgtattactgtgc	25
2	19	7	6	4	2	0	Seq2	gtAtattactgtgc	26
3	1	0	0	1	0	0	Seq3	gtgtattactgtAA	27
4	7	1	5	1	0	0	Seq4	gtgtattactgtAc	28
5	0	0	0	0	0	0	Seq5	Ttgtattactgtgc	29
6	0	0	0	0	0	0	Seq6	TtgtatCactgtgc	30
7	3	1	0	1	1	0	Seq7	ACAtattactgtgc	31
8	2	0	2	0	0	0	Seq8	ACgtattactgtgc	32
9	9	2	2	4	1	0	Seq9	ATgtattactgtgc	33
Group		26	26	21	4	2
Cumulative		26	52	73	77	79

Table 12C Most important URE recognition seqs in FR3 Heavy

1	VHSzy1	GTGtattactgtgc	(ON SHC103)	(SEQ ID NO:25)
2	VHSzy2	GTAtattactgtgc	(ON* SHC323)	(SEQ ID NO:26)
3	VHSzy4	GTGtattactgtac	(ON* SHC349)	(SEQ ID NO:28)
4	VHSzy9	ATGtattactgtgc	(ON* SHC5a)	(SEQ ID NO:33)

Table 12D, testing 79 human HC V genes with four probes

Number of sequences.......... 79

Number of bases.............. 29143

		Number of mismatches
Id	Best	0	1	2	3	4	5
1	39	15	11	10	1	2	0	Seq1 gtgtattactgtgc	(SEQ ID NO:25)
2	22	7	6	5	3	0	1	Seq2 gtAtattactgtgc	(SEQ ID NO:26)
3	7	1	5	1	0	0	0	Seq4 gtgtattactgtAc	(SEQ ID NO:28)
4	11	2	4	4	1	0	0	Seq9 ATgtattactgtgc	(SEQ ID NO:33)
Group		25	26	20	5	2
Cumulative	25	51	71	76	78

One sequence has five mismatches with sequences 2, 4, and 9; it is scored as best for 2.

Id is the number of the adapter.

Best is the number of sequences for which the identified adapter was the best available.

The rest of the table shows how well the sequences match the adapters. For example, there are 10 sequences that match VHSzy1 (Id=1) with 2 mismatches and are worse for all other adapters. In this sample, 90% come within 2 bases of one of the four adapters.
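
The scoring summarized in Table 12D can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the procedure actually used: it assumes each heavy-chain sequence has already been reduced to the 14-base window aligned with the adapters, that a mismatch is a simple position-by-position difference, and that ties go to the adapter listed first (which agrees with the tied sequence above being scored for adapter 2). The names `mismatches`, `best_adapter` and `cumulative_fraction` are hypothetical.

```python
# The four FR3 recognition sequences listed in Table 12C.
ADAPTERS = {
    "VHSzy1": "GTGTATTACTGTGC",
    "VHSzy2": "GTATATTACTGTGC",
    "VHSzy4": "GTGTATTACTGTAC",
    "VHSzy9": "ATGTATTACTGTGC",
}


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def best_adapter(target):
    """Return (adapter name, mismatch count); ties go to the first adapter."""
    scored = ((name, mismatches(seq, target)) for name, seq in ADAPTERS.items())
    return min(scored, key=lambda pair: pair[1])


def cumulative_fraction(group_counts, total):
    """Turn per-mismatch counts into running fractions of the whole sample."""
    running, fractions = 0, []
    for count in group_counts:
        running += count
        fractions.append(running / total)
    return fractions


# Group row of Table 12D: sequences whose best adapter has 0..4 mismatches.
coverage = cumulative_fraction([25, 26, 20, 5, 2], 79)
print(f"within 2 mismatches: {coverage[2]:.1%}")
```

With the Group row of Table 12D, the third cumulative fraction is 71/79, which is the roughly 90% figure quoted above.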


Table 13

The following list of enzymes was taken from http://rebase.neb.com/cgi-bin/asymmlist

I have removed the enzymes that a) cut within the recognition sequence, b) cut on both sides of the recognition sequence, or c) have fewer than 2 bases between the recognition sequence and the closest cut site.

REBASE Enzymes 04/13/2001

Type II restriction enzymes with asymmetric recognition sequences:
Enzymes	Recognition Sequence	Isoschizomers	Suppliers
AarI	CACCTGCNNNN^NNNN_	-	y
AceIII	CAGCTCNNNNNNN^NNNN_	-	-
Bbr7I	GAAGACNNNNNNN^NNNN_	-	-
BbvI	GCAGCNNNNNNNN^NNNN_		y
BbvII	GAAGACNN^NNNN_
Bce83I	CTTGAGNNNNNNNNNNNNNN_NN^	-	-
BceAI	ACGGCNNNNNNNNNNNN^NN_	-	y
BcefI	ACGGCNNNNNNNNNNNN^N_	-	-
BciVI	GTATCCNNNNN_N^	BfuI	y
BfiI	ACTGGGNNNN_N^	BmrI	y
BinI	GGATCNNNN^N_
BscAI	GCATCNNNN^NN_	-	-
BseRI	GAGGAGNNNNNNNN_NN^	-	y
BsmFI	GGGACNNNNNNNNNN^NNNN_	BspLU11III	y
BspMI	ACCTGCNNNN^NNNN_	Acc36I	y
EciI	GGCGGANNNNNNNNN_NN^	-	y
Eco57I	CTGAAGNNNNNNNNNNNNNN_NN^	BspKT5I	y
FauI	CCCGCNNNN^NN_	BstFZ438I	y
FokI	GGATGNNNNNNNNN^NNNN_	BstPZ418I	y
GsuI	CTGGAGNNNNNNNNNNNNNN_NN^	-	y
HgaI	GACGCNNNNN^NNNNN_	-	y
HphI	GGTGANNNNNNN_N^	AsuHPI	y
MboII	GAAGANNNNNNN_N^	-	y
MlyI	GAGTCNNNNN^	SchI	y
MmeI	TCCRACNNNNNNNNNNNNNNNNNN_NN^	-
MnlI	CCTCNNNNNN_N^		y
PleI	GAGTCNNNN^N_	PpsI	y
RleAI	CCCACANNNNNNNNN_NNN^	-	-
SfaNI	GCATCNNNNN^NNNN_	BspST5I	y
SspD5I	GGTGANNNNNNNN^	-	-
Sth132I	CCCGNNNN^NNNN_	-	-
StsI	GGATGNNNNNNNNNN^NNNN_	-	-
TaqII	GACCGANNNNNNNNN_NN^, CACCCANNNNNNNNN_NN^	-
Tth111II	CAARCANNNNNNNNN_NN^	-	-
UbaPI	CGAACG	-	-

The notation is: ^ means cut the upper strand and _ means cut the lower strand. If the upper and lower strands are cut at the same place, then only ^ appears.
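
The cut-site notation and filter (c) above can be illustrated with a small parser. This is a sketch under assumptions, not part of the original listing: it assumes the markers are written as plain `^` and `_` characters in REBASE style, that the recognition sequence is everything before the trailing run of N, and that offsets are counted in bases downstream of the recognition sequence. The names `parse_cut_notation` and `far_enough` are hypothetical.

```python
def parse_cut_notation(site):
    """Split e.g. 'GGATGNNNNNNNNN^NNNN_' into (recognition, upper, lower).

    upper and lower are the numbers of bases between the end of the
    recognition sequence and the cut in the upper and lower strands.
    """
    bases, upper, lower = [], None, None
    for ch in site:
        if ch == "^":
            upper = len(bases)      # upper-strand cut falls after these bases
        elif ch == "_":
            lower = len(bases)      # lower-strand cut falls after these bases
        else:
            bases.append(ch)
    recognition = "".join(bases).rstrip("N")
    if upper is None:
        return recognition, None, None   # no cut position given (e.g. UbaPI)
    if lower is None:
        lower = upper                    # only ^ shown: both strands cut together
    return recognition, upper - len(recognition), lower - len(recognition)


def far_enough(site, min_gap=2):
    """Apply criterion (c): at least min_gap bases before the closest cut."""
    _, upper, lower = parse_cut_notation(site)
    return upper is not None and min(upper, lower) >= min_gap


print(parse_cut_notation("GGATGNNNNNNNNN^NNNN_"))  # FokI
print(far_enough("GAAGACNN^NNNN_"))                 # BbvII
```

For FokI the parser yields offsets of 9 and 13 bases, the usual GGATG(9/13) description of that enzyme. Criteria (a) and (b) are not modelled here because the strings in the table always place both cuts downstream of the recognition sequence.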


Table 15: Use of FokI as Universal Restriction Enzyme

FokI - for dsDNA, | represents sites of cleavage

5'-cacGGATGtg--nnnnnnn|nnnnnnn-3' (SEQ ID NO:15)
3'-gtgCCTACac--nnnnnnnnnnn|nnn-5' (SEQ ID NO:16)
   RECOGNITion of FokI

Case I
5'-...gtg|tatt-actgtgc..Substrate....-3' (SEQ ID NO:17)
   3'-cac-ataa|tgacacg-+
                       gtGTAGGcac\
                   5'- caCATCCgtg/ (SEQ ID NO:18)

Case II
           5'-...gtgtatt|agac-tgc..Substrate....-3' (SEQ ID NO:19)
           +-----cacataa-tctg|acg-5'
/gtgCCTACac
\cacGGATGtg-3' (SEQ ID NO:20)

Case III (Case I rotated 180 degrees)
/gtgCCTACac-5'
\cacGGATGtg-+
            gtgtctt|acag-tcc-3' Adapter (SEQ ID NO:21)
      3'-...cacagaa-tgtc|agg..substrate....-5' (SEQ ID NO:22)

Case IV (Case II rotated 180 degrees)
                                3'- gtGTAGGcac\ (SEQ ID NO:23)
                                +---caCATCCgtg/
             5'-gag|tctc-actgagc
Substrate 3'-...ctc-agag|tgactcg...-5' (SEQ ID NO:24)
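The geometry shown for FokI above (upper strand cut 9 bases, lower strand cut 13 bases beyond GGATG) can be checked with a short sketch. The helper names `revcomp` and `foki_cuts` are hypothetical, and only sites in the forward orientation on the given strand are considered.

```python
import re

def revcomp(seq: str) -> str:
    """Reverse complement, preserving upper/lower case."""
    table = str.maketrans("acgtnACGTN", "tgcanTGCAN")
    return seq.translate(table)[::-1]

def foki_cuts(top: str):
    """Yield (site_start, top_cut, bottom_cut) for each GGATG on the top strand.

    Cut positions are counted as the number of top-strand bases that lie
    5' of the cut; the bottom cut is expressed in the same coordinates.
    """
    for m in re.finditer("(?=GGATG)", top.upper()):
        end = m.start() + 5
        yield m.start(), end + 9, end + 13

# Upper strand of the dsDNA diagram: cacGGATGtg followed by 14 unspecified bases
duplex_top = "cacGGATGtg" + "n" * 14
cuts = list(foki_cuts(duplex_top))

# The two arms of the adapter hairpin should be reverse complements of each
# other, so that the folded stem recreates a double-stranded FokI site.
stem_ok = revcomp("caCATCCgtg") == "cacGGATGtg"
```

With these inputs the upper-strand cut falls after 17 bases (cacGGATGtg plus 7 n) and the lower-strand cut after 21, matching the positions of the | marks in the diagram.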

Improved FokI adapters

FokI - for dsDNA, | represents sites of cleavage

Case I
Stem 11, loop 5, stem 11, recognition 17

5'-...catgtg|tatt-actgtgc..Substrate....-3'
   3'-gtacac-ataa|tgacacg-+            T
                          gtGTAGGcacG   T
                      5'- caCATCCgtgc   C
                                      TT

Case II
Stem 10, loop 5, stem 10, recognition 18

               5'-...gtgtatt|agac-tgctgcc..Substrate....-3'
   T           +-----cacataa-tctg|acgacgg-5'
 T   gtgCCTACac
 C   cacGGATGtg-3'
   TT

Case III (Case I rotated 180 degrees)
Stem 11, loop 5, stem 11, recognition 20

   T
 T   TgtgCCTACac-5'
 G   AcacGGATGtg-+
   TT            gtgtctt|acag-tccattctg-3' Adapter
           3'-...cacagaa-tgtc|aggtaagac..substrate....-5'

Case IV (Case II rotated 180 degrees)

Stem 11, loop 4, stem 11, recognition 17 r^Ti

3’- gtGTAGGcacc T [—caCATCCgtgg T ! 0 5'-atcgag| tctc-actqagc LgO

Substrate 3’- ...tagctc-agag|tgactcg...-5'

BseRI
| sites of cleavage
5'-cacGAGGAGnnnnnnnnnn|nnnnn-3'
3'-gtgctcctcnnnnnnnn|nnnnnnn-5'
   RECOGNITion of BseRI

Stem 11, loop 5, stem 11, recognition 19

3' -.......gaacat I cg-ttaagccagta.....5 ’

JO _ΓΤ-Τ_Ί cttgta-gcIaattcggtcat-3*

C GCTGAGGAGTC—¹

T cgactcctcag-5' An adapter for BseRI to cleave the substrate above Lj_I


Synthetic 3-23 as in Table 206
|TCT|AGA|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg|
 XbaI...
|aac|agC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat 11-3'


(VL133-1c) 5'-cAcATccgTg TTgTT cAcggATgTg gAcccTcTgcccTggggcc-3'
[RC] 5'-ggccccagggcagagggtc cAcATccgTg AAcAA cAcggATgTg-3'
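The relationship between the VL133-1c oligonucleotide and its [RC] line can be checked mechanically. The sketch below is only an illustration (the helper `revcomp` is hypothetical); it assumes the [RC] line is the exact reverse complement of the oligonucleotide and that the first and third blocks form the hairpin stem.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string; spaces are ignored, case is kept."""
    table = str.maketrans("acgtACGT", "tgcaTGCA")
    return "".join(seq.split()).translate(table)[::-1]

adapter = "cAcATccgTg TTgTT cAcggATgTg gAcccTcTgcccTggggcc"
listed_rc = "ggccccagggcagagggtc cAcATccgTg AAcAA cAcggATgTg"

# Compare case-insensitively, since the two lines use different capitalization.
rc_matches = revcomp(adapter).lower() == listed_rc.replace(" ", "").lower()

# The first and third blocks should pair with each other to form the stem.
stem_pairs = revcomp("cAcATccgTg").lower() == "cAcggATgTg".lower()
```

Both checks evaluate to `True` for the sequences as written here, which is consistent with a stem of 10 bases on each side of a 5-base loop followed by a 19-base recognition arm.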


What happens in the top strand:
! | site of cleavage in the upper strand
(VL133-2a2*) 5'-g tct cct g|ga cag tcg atc
(VL133-3l*)  5'-g gcc ttg g|ga cag aca gtc
(VL133-2c*)  5'-g tct cct g|ga cag tca gtc
(VL133-1c*)  5'-g gcc cca g|gg cag agg gtc

The following Extenders and Bridges all encode the AA sequence of 2a2 for codons 1-15

!                                  1
(ON_LamEx133) 5'-ccTcTgAcTgAgT gcA cAg
!   2   3   4   5   6   7   8   9  10  11  12
  AGt gcT TtA acC caA ccG gcT AGT gtT AGC ggT
!  13  14  15
  tcC ccG g ! 2a2

!                                       1
(ON_LamB1-133) [RC] 5'-ccTcTgAcTgAgT gcA cAg
!   2   3   4   5   6   7   8   9  10  11  12
  AGt gcT TtA acC caA ccG gcT AGT gtT AGC ggT
!  13  14  15
  tcC ccG g ga cag tcg at-3' ! 2a2 N.B. the actual seq is the reverse complement of the one shown.

!                                       1
(ON_LamB2-133) [RC] 5'-ccTcTgAcTgAgT gcA cAg
!   2   3   4   5   6   7   8   9  10  11  12
  AGt gcT TtA acC caA ccG gcT AGT gtT AGC ggT
!  13  14  15
  tcC ccG g ga cag aca gt-3' ! 3l N.B. the actual seq is the reverse complement of the one shown.

!                                       1
(ON_LamB3-133) [RC] 5'-ccTcTgAcTgAgT gcA cAg
!   2   3   4   5   6   7   8   9  10  11  12
  AGt gcT TtA acC caA ccG gcT AGT gtT AGC ggT
!  13  14  15
  tcC ccG g ga cag tca gt-3' ! 2c N.B. the actual seq is the reverse complement of the one shown.

!                                       1
(ON_LamB4-133) [RC] 5'-ccTcTgAcTgAgT gcA cAg
!   2   3   4   5   6   7   8   9  10  11  12
  AGt gcT TtA acC caA ccG gcT AGT gtT AGC ggT
!  13  14  15
  tcC ccG g gg cag agg gt-3' ! 1c N.B. the actual seq is the reverse complement of the one shown.

(ON_Lam133PCR) 5'-ccTcTgAcTgAgT gcA cAg AGt gc-3'

Table 19: Cleavage of 75 human light chains.
Enzyme	Recognition*	Nch	Ns	Planned location of site
AfeI	AGCgct	0	0
AflII	Cttaag	0	0	HC FR3
AgeI	Accggt	0	0
AscI	GGcgcgcc	0	0	After LC
BglII	Agatct	0	0
BsiWI	Cgtacg	0	0
BspDI	ATcgat	0	0
BssHII	Gcgcgc	0	0
BstBI	TTcgaa	0	0
DraIII	CACNNNgtg	0	0
EagI	Cggccg	0	0
FseI	GGCCGGcc	0	0
FspI	TGCgca	0	0
HpaI	GTTaac	0	0
MfeI	Caattg	0	0	HC FR1
MluI	Acgcgt	0	0
NcoI	Ccatgg	0	0	Heavy chain signal
NheI	Gctagc	0	0	HC/anchor linker
NotI	GCggccgc	0	0	In linker after HC
NruI	TCGcga	0	0
PacI	TTAATtaa	0	0
PmeI	GTTTaaac	0	0
PmlI	CACgtg	0	0
PvuI	CGATcg	0	0
SacII	CCGCgg	0	0
SalI	Gtcgac	0	0
SfiI	GGCCNNNNnggcc	0	0	Heavy chain signal
SgfI	GCGATcgc	0	0
SnaBI	TACgta	0	0
StuI	AGGcct	0	0
XbaI	Tctaga	0	0	HC FR3
AatII	GACGTc	1	1
AclI	AAcgtt	1	1
AseI	ATtaat	1	1
BsmI	GAATGCN	1	1
BspEI	Tccgga	1	1	HC FR1
BstXI	CCANNNNNntgg	1	1	HC FR2
DrdI	GACNNNNnngtc	1	1
HindIII	Aagctt	1	1
PciI	Acatgt	1	1
SapI	gaagagc	1	1
ScaI	AGTact	1	1
SexAI	Accwggt	1	1
SpeI	Actagt	1	1
TliI	Ctcgag	1	1
XhoI	Ctcgag	1	1
BcgI	cgannnnnntgc	2	2
BlpI	GCtnagc	2	2
BssSI	Ctcgtg	2	2
BstAPI	GCANNNNntgc	2	2
EspI	GCtnagc	2	2
KasI	Ggcgcc	2	2
PflMI	CCANNNNntgg	2	2
XmnI	GAANNnnttc	2	2
ApaLI	Gtgcac	3	3	LC signal seq
NaeI	GCCggc	3	3
NgoMI	Gccggc	3	3
PvuII	CAGctg	3	3
RsrII	CGgwccg	3	3
BsrBI	GAGcgg	4	4
BsrDI	GCAATGNNn	4	4
BstZ17I	GTAtac	4	4
EcoRI	Gaattc	4	4
SphI	GCATGc	4	4
SspI	AATatt	4	4
AccI	GTmkac	5	5
BclI	Tgatca	5	5
BsmBI	Nnnnnngagacg	5	5
BsrGI	Tgtaca	5	5
DraI	TTTaaa	6	6
NdeI	CAtatg	6	6
SwaI	ATTTaaat	6	6
BamHI	Ggatcc	7	7
SacI	GAGCTc	7	7
BciVI	GTATCCNNNNNN	8	8
BsaBI	GATNNnnatc	8	8
NsiI	ATGCAt	8	8
Bsp120I	Gggccc	9	9
ApaI	GGGCCc	9	9
PspOOMI	Gggccc	9	9
BspHI	Tcatga	9	11
EcoRV	GATatc	9	9
AhdI	GACNNNnngtc	11	11
BbsI	GAAGAC	11	14
PsiI	TTAtaa	12	12
BsaI	GGTCTCNnnnn	13	15
XmaI	Cccggg	13	14
AvaI	Cycgrg	14	16
BglI	GCCNNNNnggc	14	17
AlwNI	CAGNNNctg	16	16
BspMI	ACCTGC	17	19
XcmI	CCANNNNNnnnntgg	17	26
BstEII	Ggtnacc	19	22
Sse8387I	CCTGCAgg	20	20
AvrII	Cctagg	22	22
HincII	GTYrac	22	22
BsgI	GTGCAG	27	29
MscI	TGGcca	30	34
BseRI	NNnnnnnnnnctcctc	32	35
Bsu36I	CCtnagg	35	37
PstI	CTGCAg	35	40
EciI	nnnnnnnnntccgcc	38	40
PpuMI	RGgwccy	41	50
StyI	Ccwwgg	44	73
EcoO109I	RGgnccy	46	70
Acc65I	Ggtacc	50	51
KpnI	GGTACc	50	51
BpmI	ctccag	53	82
AvaII	Ggwcc	71	124
* cleavage	occurs in the top	strand
that cut palindromic sequences,	the

CHI

CHI site.
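A survey of this kind can be sketched in a few lines. The example below is not the program used to produce the table; it assumes that Nch is the number of chains containing at least one site and Ns the total number of sites, and that sites are sought on both strands. The names `site_regex`, `revcomp` and `count_sites` are hypothetical, and the chains are toy sequences.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "M": "[AC]", "K": "[GT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def site_regex(recognition: str):
    """Compile a recognition sequence into an overlapping-match regex."""
    body = "".join(IUPAC[b] for b in recognition.upper())
    return re.compile("(?=" + body + ")")

def revcomp(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def count_sites(recognition: str, chains):
    """Return (Nch, Ns) under the assumptions stated in the lead-in."""
    pattern = site_regex(recognition)
    width = len(recognition)
    n_chains = n_sites = 0
    for chain in chains:
        top = chain.upper()
        hits = {m.start() for m in pattern.finditer(top)}
        # Sites on the other strand, mapped back to top-strand coordinates so
        # that a palindromic site is counted once rather than twice.
        hits |= {len(top) - m.start() - width
                 for m in pattern.finditer(revcomp(top))}
        if hits:
            n_chains += 1
            n_sites += len(hits)
    return n_chains, n_sites

chains = ["aaGGTACCtt", "ccccGGTACCaaGGTACC", "acgtacgt"]
kpni = count_sites("GGTACc", chains)
bbsi = count_sites("GAAGAC", ["ttGAAGACtt", "ttGTCTTCtt"])
avaii = count_sites("Ggwcc", ["ttGGACCtt"])
```

Lower-case letters in the recognition column only mark the cut position, so the sketch upper-cases the pattern before matching.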


Table 20: Cleavage of 79 human heavy chains

Enzyme	Recognition	Nch	Ns	Planned location of site
AfeI	AGCgct	0	0
AflII	Cttaag	0	0	HC FR3
AscI	GGcgcgcc	0	0	After LC
BsiWI	Cgtacg	0	0
BspDI	ATcgat	0	0
BssHII	Gcgcgc	0	0
FseI	GGCCGGcc	0	0
HpaI	GTTaac	0	0
NheI	Gctagc	0	0	HC Linker
NotI	GCggccgc	0	0	In linker, HC/anchor
NruI	TCGcga	0	0
NsiI	ATGCAt	0	0
PacI	TTAATtaa	0	0
PciI	Acatgt	0	0
PmeI	GTTTaaac	0	0
PvuI	CGATcg	0	0
RsrII	CGgwccg	0	0
SapI	gaagagc	0	0
SfiI	GGCCNNNNnggcc	0	0	HC signal seq
SgfI	GCGATcgc	0	0
SwaI	ATTTaaat	0	0
AclI	AAcgtt	1	1
AgeI	Accggt	1	1
AseI	ATtaat	1	1
AvrII	Cctagg	1	1
BsmI	GAATGCN	1	1
BsrBI	GAGcgg	1	1
BsrDI	GCAATGNNn	1	1
DraI	TTTaaa	1	1
FspI	TGCgca	1	1
HindIII	Aagctt	1	1
MfeI	Caattg	1	1	HC FR1
NaeI	GCCggc	1	1
NgoMI	Gccggc	1	1
SpeI	Actagt	1	1
Acc65I	Ggtacc	2	2
BstBI	TTcgaa	2	2
KpnI	GGTACc	2	2
MluI	Acgcgt	2	2
NcoI	Ccatgg	2	2	In HC signal seq
NdeI	CAtatg	2	2	HC FR4
PmlI	CACgtg	2	2
XcmI	CCANNNNNnnnntgg	2	2
BcgI	cgannnnnntgc	3	3
BclI	Tgatca	3	3
BglI	GCCNNNNnggc	3	3
BsaBI	GATNNnnatc	3	3
BsrGI	Tgtaca	3	3
SnaBI	TACgta	3	3
Sse8387I	CCTGCAgg	3	3
ApaLI	Gtgcac	4	4	LC Signal/FR1
BspHI	Tcatga	4	4
BssSI	Ctcgtg	4	4
PsiI	TTAtaa	4	5
SphI	GCATGc	4	4
AhdI	GACNNNnngtc	5	5
BspEI	Tccgga	5	5	HC FR1
MscI	TGGcca	5	5
SacI	GAGCTc	5	5
ScaI	AGTact	5	5
SexAI	Accwggt	5	6
SspI	AATatt	5	5
TliI	Ctcgag	5	5
XhoI	Ctcgag	5	5
BbsI	GAAGAC	7	8
BstAPI	GCANNNNntgc	7	8
BstZ17I	GTAtac	7	7
EcoRV	GATatc	7	7
EcoRI	Gaattc	8	8
BlpI	GCtnagc	9	9
Bsu36I	CCtnagg	9	9
DraIII	CACNNNgtg	9	9
EspI	GCtnagc	9	9
StuI	AGGcct	9	13
XbaI	Tctaga	9	9	HC FR3
Bsp120I	Gggccc	10	11	CH1
ApaI	GGGCCc	10	11
PspOOMI	Gggccc	10	11
BciVI	GTATCCNNNNNN	11	11
SalI	Gtcgac	11	12
DrdI	GACNNNNnngtc	12	12
KasI	Ggcgcc	12	12
XmaI	Cccggg	12	14
BglII	Agatct	14	14
HincII	GTYrac	16	18
BamHI	Ggatcc	17	17
PflMI	CCANNNNntgg	17	18
BsmBI	Nnnnnngagacg	18	21
BstXI	CCANNNNNntgg	18	19	HC FR2
XmnI	GAANNnnttc	18	18
SacII	CCGCgg	19	19
PstI	CTGCAg	20	24
PvuII	CAGctg	20	22
AvaI	Cycgrg	21	24
EagI	Cggccg	21	22
AatII	GACGTc	22	22
BspMI	ACCTGC	27	33
AccI	GTmkac	30	43
StyI	Ccwwgg	36	49
AlwNI	CAGNNNctg	38	44
BsaI	GGTCTCNnnnn	38	44
PpuMI	RGgwccy	43	46
BsgI	GTGCAG	44	54
BseRI	NNnnnnnnnnctcctc	48	60
EciI	nnnnnnnnntccgcc	52	57
BstEII	Ggtnacc	54	61	HC FR4, 47/79 have one
EcoO109I	RGgnccy	54	86
BpmI	ctccag	60	121
AvaII	Ggwcc	71	140

Table 21: MALIA3, annotated
! MALIA3 9532 bases
    1 aat gct act act att agt aga att gat gcc acc ttt tca gct cgc gcc
!     gene ii continued
   49 cca aat gaa aat ata gct aaa cag gtt att gac cat ttg cga aat gta
   97 tct aat ggt caa act aaa tct act cgt tcg cag aat tgg gaa tca act
  145 gtt aca tgg aat gaa act tcc aga cac cgt act tta gtt gca tat tta
  193 aaa cat gtt gag cta cag cac cag att cag caa tta agc tct aag cca
  241 tcc gca aaa atg acc tct tat caa aag gag caa tta aag gta ctc tct
  289 aat cct gac ctg ttg gag ttt gct tcc ggt ctg gtt cgc ttt gaa gct
  337 cga att aaa acg cga tat ttg aag tct ttc ggg ctt cct ctt aat ctt
  385 ttt gat gca atc cgc ttt gct tct gac tat aat agt cag ggt aaa gac
  433 ctg att ttt gat tta tgg tca ttc tcg ttt tct gaa ctg ttt aaa gca
  481 ttt gag ggg gat tca ATG aat att tat gac gat tcc gca gta ttg gac
!     RBS?...             Start gene x, ii continues
  529 gct atc cag tct aaa cat ttt act att acc ccc tct ggc aaa act tct
  577 ttt gca aaa gcc tct cgc tat ttt ggt ttt tat cgt cgt ctg gta aac
  625 gag ggt tat gat agt gtt gct ctt act atg cct cgt aat tcc ttt tgg
  673 cgt tat gta tct gca tta gtt gaa tgt ggt att cct aaa tct caa ctg
  721 atg aat ctt tct acc tgt aat aat gtt gtt ccg tta gtt cgt ttt att
  769 aac gta gat ttt tct tcc caa cgt cct gac tgg tat aat gag cca gtt
  817 ctt aaa atc gca TAA
!                     End X & II
  832 ggtaattca ca
!
!     M1              E5                  Q10                 T15
  843 ATG att aaa gtt gaa att aaa cca tct caa gcc caa ttt act act cgt
!     Start gene V
!
!     S17         S20                 P25                 E30
  891 tct ggt gtt tct cgt cag ggc aag cct tat tca ctg aat gag cag ctt
!
!             V35                 E40                 V45
  939 tgt tac gtt gat ttg ggt aat gaa tat ccg gtt ctt gtc aag att act
!
!         D50                 A55                 L60
  987 ctt gat gaa ggt cag cca gcc tat gcg cct ggt cTG TAC Acc gtt cat
!                                                  BsrGI.
!
!     L65                 V70                 S75                 R80
 1035 ctg tcc tct ttc aaa gtt ggt cag ttc ggt tcc ctt atg att gac cgt
!
!                     P85     K87 end of V
 1083 ctg cgc ctc gtt ccg gct aag TAA c
!
 1108 ATG gag cag gtc gcg gat ttc gac aca att tat cag gcg atg
!     Start gene VII
!
 1150 ata caa atc tcc gtt gta ctt tgt ttc gcg ctt ggt ata atc
!     VII and IX overlap.

1

S2

V3 L4

V5

S10

1192

get

ggg

ggt

caa

agA

TGA

gt

gtt tta

gtg

tat

tct

ttc

gee

tct ttc

1

End

VII

5

1

I start

IX

1

L13

W15

G20

T25

1242

tta

ggt

tgg

tgc

ett

cgt

agt

ggc att

aeg

tat

ttt

acc

cgt

tta at


 1293 act tcc tc
!     .... stop of IX, IX and VIII overlap by four bases
!
 1301 ATG aaa aag tct tta gtc ctc aaa gcc tct gta gcc gtt gct acc ctc
!     Start signal sequence of viii.
!
 1349 gtt ccg atg ctg tct ttc gct gct gag ggt gac gat CCC gca aaa gcg
!                                 mature VIII --->
 1397 gcc ttt aac tcc ctg caa gcc tca gcg acc gaa tat atc ggt tat gcg
 1445 tgg gcg atg gtt gtt gtc att
 1466 gtc ggc gca act atc ggt atc aag ctg ttt aag
 1499 aaa ttc acc tcg aaa gca ! 1515
!     -35
!
 1517 agc tga taaaccgat acaattaaag gctccttttg
!     ..... -10
!
 1552 gagccttttt ttttGGAGAt ttt ! S.D. underlined
!
!     <------ III signal sequence ------>
!          M   K   K   L   L   F   A   I   P   L   V
 1575 caac GTG aaa aaa tta tta ttc gca att cct tta gtt ! 1611
!
!     V   P   F   Y   S   H   S   A   Q
 1612 gtt cct ttc tat tct cac aGT gcA Cag tCT
!                              ApaLI...

I

1642

GTC

GTG

ACG

CAG

CCG

CCC

TCA

GTG

TCT

GGG

GCC

CCA

GGG

CAG

30

AGG

GTC

ACC

ATC

TCC

TGC

ACT

GGG

AGC

TCC

AAC

ATC

GGG

GCA

1

BstEII...

1729

GGT

TAT

GAT

GTA

CAC

TGG

TAC

CAG

CTT

CCA

GGA

ACA

GCC

CCC

AAA

1777

CTC

ATC

TAT

GGT

AAC

AGC

AAT

CGG

CCC

TCA

GGG

GTC

CCT

GAC

CGA

1825

TTC

TCT

GGC

TCC

AAG

TCT

GGC

ACC

TCA

GCC

TCC

CTG

GCC

ATC

ACT

35

1870

GGG

CTC

CAG

GCT

GAG

GAT

GAG

GCT

GAT

TAT

1900

TAC

TGC

CAG

TCC

TAT

GAC

AGC

CTG

AGT

1930

GGC

CTT

TAT

GTC

TTC

GGA

ACT

GGG

ACC

AAG

GTC

ACC

GTC

BstEII...

	1969	CTA GGT	CAG	CCC AAG	GCC AAC CCC ACT	GTC	ACT
40	2002	CTG TTC	CCG	CCC TCC	TCT GAG GAG CTC	CAA	GCC AAC AAG GCC ACA CTA
	2050	GTG TGT	CTG	ATC AGT	GAC TTC TAC CCG	GGA	GCT GTG ACA GTG GCC TGG
	2098	AAG GCA	GAT	AGC AGC	CCC GTC AAG GCG	GGA	GTG GAG ACC ACC ACA CCC
	2146	TCC AAA	CAA	AGC AAC	AAC AAG TAC GCG	GCC	AGC AGC TAT CTG AGC CTG
	2194	ACG CCT	GAG	CAG TGG	AAG TCC CAC AGA	AGC	TAC AGC TGC CAG GTC ACG
45	2242	CAT GAA	GGG	AGC ACC	GTG GAG AAG ACA	GTG	GCC CCT ACA GAA TGT TCA
	2290	TAA TAA	ACCG CCTCCACCGG GCGCGCCAAT TCTATTTCAA GGAGACAGTC ATA

Ascl.....

50

J 2343

M ATG

K AAA

y liar Y TAC

L CTA

L TTG

P CCT

T ACG

A GCA

A GCC

A GCT

G GGA

L TTG

L TTA

L CTC

1

16

17

18

19

20

21

22

1

A

Q

P

A

M

A

55

2388

geG

GCC

cag

ccG

GCC

atq

gcc

Sfil.............

NgoMI...(1/2)

Ncol.........


5	2409
		31	32
0		G	G
	2433	1 ggc	ggt
	t	----FRl-
	1	46	47
5	I	A	S
	2478	1 get	TCC
	! 1		Bsj
0	1	61	62
	1	Q	A
	2523	I CAa	get
		BstXI
5	1	.....CDR2
	1	76	77
	,	S	G
	2568	Itct\|ggt
0	1
	1	91	92
	1	T	I
	2613	1 act	ate
5	1
	I	---FR3—
	1	106	107
	1	N	S
0	2658 1	\| aac	agC 1
	ί 1		. .C
	1	121	122
5	J	D	Y
	2703 j t	1 gac	tat
0	i	136	137
	t	T	M

2748

FRl(DP47/V3-23)--------------23 24 25 26 27 28 29 30

EVQLLESG gaaIgttICAA!TTGIttaI gag ItctIggt|

I Mfel I

L

G

FR263

P

G — FRl—

35

V Q

37

G

S

L

R

L

S

C

A

F

T

F

->| . . . CDR1................ί---FR252

Ξ

S

Y

A

M

S

W

60

V R

BsiWII

IBstXI,

G

K

67

G L

E

70

W V —>|...CDR2. 71 72 73

SAI

S

G

S

T

Y

A

D

S

V

K

G

I —FR3—

90

R F

FR3-93 S

I TCT | AC I Xbal

R

D

N

S

K

100 101 102 103 104 105

->l

Aflll |

Pstl

----FR4EGTGYAFDIWGQG gaa1ggt|act IggtI tat|get IttcIgaCIATA|TGg|ggt|caa|ggt|

I Ndel I(1/4)

------_FR4---------->|

138 139 140 141 142 V T V S S

I act|atG|GTCI ACC IgtcItctIagt I BstEII |

From BstEII onwards, pV323 is same as pCESl, except as noted.

BstEII sites may occur in light chains; not likely to be unique in final vector.


2769

				143	144	145	146 147	148	149	150 151	152
				A	S	T	K G	P	S	V F	P
				gcc	tcc	acc	aaG GGC	CCa	teg	GTC TTC	ccc
							Bspl20I.		Bbsl. . . (2/2)
							ApaI..
153	154	155	156	157 158	159	160	161 162	163	164	165 166	167
L	A	P	Ξ	S K	S	T	S G	G	T	A A	L
ctg	gca	ccC	TCC	TCc aag	age	acc	tet ggg	ggc	aca	gcg gcc	ctg

BseRI...(2/2)

1

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

1

G

C

L

V

K

D

Y

F

P

E

P

V

T

V

S

15

2844

ggc

tgc

ctg

GTC

AAG

GAC

TAC

TTC

CCc

gaA

CCG

GTg

aeg

gtg

teg

I

Age I.

1

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

!

W

N

S

G

A

L

T

S

G

V

H

T

F

P

A

20

2889

tgg

aac

tea

GGC

GCC

ctg

acc

age

ggc

gtc

cac

acc

ttc

ccg

get

1

Kasl...

(1/4)

1

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

1

V

L

Q

S

G

L

Y

s

L

S

V

T

25

2934

gtc

eta

cag

tet

age

GGa

etc

tac

tee

etc

age

gta

gtg

acc

1

(Bsu36I

...)(knocked

out)

1

213

214

215

216

217

218

219

220

221

222

223

224

225

226

227

1

V

P

S

L

G

T

Q

T

Y

I

C

N

V

30

2979

gtg

ccC

tet

age

tTG

Ggc

acc

cag

acc

tac

ate

tgc

aac

gtg

1

(BstXI..

.....

. )N

B.

destruction

of BstXI

& Bpml

1

228

229

230

231

232

233

234

235

236

237

238

239

240

241

242

1

N

H

K

P

S

N

T

K

V

D

K

V

E

P

35

3024

aat

cac

aag

CCC

age

aac

acc

aag

gtg

gac

aag

aaa

gtt

gag

ccc

J

243

244

245

1

K

S

C

A

H

s

A

3069

aaa

tet

tgt

GCG

GCC

GCt

cat

cac

cat

cac

tet

get

40

I

Notl...

f

E

Q

K

L

I

S

E

D

L

N

G

A

3111

gaa

caa

aaa

etc

ate

tea

gaa

gag

gat

ctg

aat

ggt

gee

gca

45

1

D

I

N

D

R

M

A Σ

G

A

3153

GAT

ATC

aac

gat

cgt

atg

get AGC

ggc

gee

rEK cleavage site. EcoRV..

Nhel... Kasl...

Domain 1 -----------------------------------AETVESCLA

3183 get gaa act gtt gaa agt tgt tta gca

KPHTEISF 3210 aaa ccc cat aca gaa aat tea ttt

T N V W KDDKT 3234 aCT AAC GTC TGG AAA GAC GAC AAA Act


L

D

R

Y

A

N

Y

E

G

C

L

W N

A

T

G

V

3261

tta

gat

cgt

tac

get

aac

tat

gag

ggt

tgt

ctg

tgG AAT BsmI

GCt

aca

ggc

gtt

V

c

T

G

D

E

T

Q

C

Y

G T

W

V

P

I

3312

gta

gtt

tgt

act

ggt

GAC

GAA

ACT

CAG

TGT

TAC

GGT ACA

TGG

GTT

cct

att

G

L

A

I

P

E

N

3363

ggg

ett

get

ate

cct

gaa

aat

! LI linker

3384	E gag	G ggt	G ggt	G ggc	s tet	E gag	G ggt	G ggc	G ggt	s tet
	E	G	G	G	S	E	G	G	G	T
3414	gag	ggt	ggc	ggt	tet	gag	ggt	ggc	ggt	act

! Domain 2

3444

aaa

cct

gag

tac

ggt

gat

aca

cct

att

ccg

ggc

tat

act

tat

ate

aac

3495

cct

etc

gac

ggc

act

tat

ccg

cct

ggt

act

gag

caa

aac

ccc

get

aat

cct

3546

aat

cct

tet

ett

GAG GAG BseRI

tet

cag

cct

ett

aat

act

ttc

atg

ttt

cag

aat

3597

aat

agg

ttc

cga

aat

agg

cag

ggg

gca

tta

act

gtt

tat

aeg

ggc

act

3645

gtt

act

caa

ggc

act

gac

CCC

gtt

aaa

act

tat

tac

cag

tac

act

cct

3693

gta

tea

aaa

gee

atg

tat

gac

get

tac

tgg

aac

ggt

aaa

ttc

AGA

AlwNI

3741 3789

GAC TGc AlwNI tat caa

get ggc

ttc caa

cat teg

tet tet

ggc gac

ttt ctg

aat cct

gaa caa

gat cct

cca ttc gtt tgt gaa

cct gtc

aat

get

3834

ggc ggc

ggc

tet

start L2 ---

3846 ggt ggt ggt tet
3858	ggt	ggc	ggc	tet
3870	gag	ggt	ggt	ggc	tet	gag	ggt	ggc	ggt	tet
3900	gag	ggt	ggc	ggc	tet	gag	gga	ggc	ggt	tee
3930	ggt	ggt	ggc	tet	ggt	1	! end	1 L2

3945

S tee

G ggt

D gat

F ttt

D gat

Y tat

E gaa

K aag

M atg

A gca

N aac

A get

N aat

K aag

G ggg

A get

3993

M atg

T acc

E gaa

N aat

A gee

D gat

E gaa

N aac

A gcg

L eta

Q cag

s tet

D gac

A get

K aaa

G ggc

4041

K aaa

L ett

D gat

S tet

V gtc

A get

T act

D gat

Y tac

G ggt

A get

I ate

D gat

G ggt

F ttc

4089

I att

G ggt

D gac

V gtt

S tee

G ggc

L ett

A get

N aat

G ggt

N aat

G ggt

A get

T act

G ggt

D gat

4137

F ttt

A get

G ggc

s tet

N aat

s tee

Q caa

M atg

A get

Q caa

V gtc

G ggt

D gac

G ggt

D gat

N aat

4185

S tea

P cct

L tta

M atg

N aat

F ttc

R cgt

Q caa

Y tat

L tta

P cct

s tee

L etc

P cct

Q caa


I

1 4233 I

s teg

V gtt

E gaa

c tgt

R ege

P cct

F ttt

V gtc

F ttt

S age

A get

G ggt

K aaa

P cca

Y tat

E gaa

5

1

F

s

I

D

C

D

K

I

N

L

F

R

4281

ttt

tet

att

gat

tgt

gac

aaa

ata

aac

tta

ttc

cgt

1

End

Domain

3

1

G

V

F

A

F

L

Y

V

A

T

F

M

Y

V

F14

10

4317

ggt

gtc

ttt

geg

ttt

ett

tta

tat

gtt

gee

acc

ttt

atg

tat

gta

ttt

1 |

start transmembrane

segment

1

S

T

F

A

N

I

L

4365

tet

aeg

ttt

get

aac

ata

ctg

15

»

1

R

N

K

E

S

4386

cgt

aat

aag

gag

tet

TAA

! stop

of iii

Intracellular anchor.

20

1 ! Ml P2 4404 tc ATG cca ! Start VI 1

V gtt

L ett

L5 ttg

G ggt

I att

P ccg

L tta

L10 tta

L ttg

R cgt

F ttc

L etc

G15 ggt

4451

ttc

ett

ctg

gta

act

ttg

ttc

ggc

tat

ctg

ett

act

ttt

ett

aaa

aag

25

4499

ggc

ttc

ggt

aag

ata

get

att

get

att

tea

ttg

ttt

ett

get

ett

att

4547

att

ggg

ett

aac

tea

att

ett

gtg

ggt

tat

etc

tet

gat

att

age

get

4595

caa

tta

ccc

tet

gac

ttt

gtt

cag

ggt

gtt

cag

tta

att

etc

ccg

tet

4643

aat

geg

ett

ccc

tgt

ttt

tat

gtt

att

etc

tet

gta

aag

get

att

4691

ttc

att

ttt

gac

gtt

aaa

caa

aaa

ate

gtt

tet

tat

ttg

gat

tgg

gat

30

1

Ml A2

V3

F5

L10

G13

4739 aaa TAA t ATG get gtt tat ttt gta act ggc aaa tta ggc tet gga end VI Start gene I

35

J 1 4785 f

14 K aag

15 T aeg

16 L etc

17 V gtt

18 S age

19 V gtt

20 G ggt

21 K aag

22 I att

23 Q cag

24 D gat

25 K aaa

26 X att

27 V gta

28 A get

1

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

40

1

G

C

K

I

A

T

N

L

D

L

R

L

Q

N

L

4830 )

ggg

tgc

aaa

ata

gca

act

aat

ett

gat

tta

agg

ett

caa

aac

etc

1

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

1

P

Q

V

G

R

F

A

K

T

P

R

V

L

R

I

45

4875 1

ccg

caa

gtc

ggg

agg

ttc

get

aaa

aeg

cct

ege

gtt

ett

aga

ata

1

59

60

61

62

63

64

65

66

67

68

69

70

71

72

73

1

P

D

K

P

s

I

S

D

L

A

I

G

R

G

4920

ccg

gat

aag

cct

tet

ata

tet

gat

ttg

ett

get

att

ggg

ege

ggt

50

I

1

74

75

76

77

78

79

80

81

82

83

84

85

86

87

88

1

N

D

S

Y

D

E

N

K

N

G

L

V

L

D

4965 |

aat

gat

tee

tac

gat

gaa

aat

aaa

aac

ggc

ttg

ett

gtt

etc

gat

55

1

89

90

91

92

93

94

95

96

97

98

99

100

101

102

103

1

E

C

G

T

W

F

N

T

R

S

W

N

D

K

E

5010

gag

tgc

ggt

act

tgg

ttt

aat

acc

cgt

tet

tgg

aat

gat

aag

gaa

I


! 1 5055

104 R aga

105 Q cag

106 P ccg

107 I att

108 I att

109 D gat

110 W tgg

lll F ttt

112 L eta

113 H cat

114 A get

115 R cgt

116 K aaa

117 L tta

118 G gga

5

1

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

ι

W

D

I

F

L

V

Q

D

L

S

I

V

D

K

5100 j

tgg

gat

att

ttt

ctt

gtt

cag

gac

tta

tet

att

gtt

gat

aaa

1

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

0

1

Q

A

R

S

A

L

A

E

H

V

Y

C

R

5145 ι

cag

geg

cgt

tet

gca

tta

get

gaa

cat

gtt

tat

tgt

cgt

1

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

1

L

D

R

I

T

L

P

F

V

G

T

L

Y

S

L

.5

5190 |

ctg

gac

aga

att

act

tta

cct

ttt

gtc

ggt

act

tta

tat

tet

ctt

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

1

X

T

G

S

K

M

P

L

P

K

L

H

V

G

V

:o

5235 I

att

act

ggc

teg

aaa

atg

cct

ctg

cct

aaa

tta

cat

gtt

ggc

gtt

t

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

1

V

K

Y

G

D

S

Q

L

S

P

T

V

E

R

W

5280 |

gtt

aaa

tat

ggc

gat

tet

caa

tta

age

cct

act

gtt

gag

cgt

tgg

:5

1

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

1

L

Y

T

G

K

N

L

Y

N

A

Y

D

T

K

Q

5325 1

ctt

tat

act

ggt

aag

aat

ttg

tat

aac

gca

tat

gat

act

aaa

cag

I

209

210

211

212

213

214

215

216

217

218

219

220

221

222

223

10

J

A

F

S

s

N

Y

D

S

G

V

Y

S

Y

L

T

5370 1

get

ttt

tet

agt

aat

tat

gat

tee

ggt

gtt

tat

tet

tat

tta

aeg

1

224

225

226

227

228

229

230

231

232

233

234

235

236

237

238

f

P

Y

L

S

H

G

R

Y

F

K

P

L

N

L

G

55

5415 1

cct

tat

tta

tea

cac

ggt

egg

tat

ttc

aaa

cca

tta

aat

tta

ggt

1

239

240

241

242

243

244

245

246

247

248

249

250

251

252

253

!

Q

K

M

K

L

T

K

I

Y

L

K

F

S

R

10

5460 1

cag

aag

atg

aaa

tta

act

aaa

ata

tat

ttg

aaa

aag

ttt

tet

ege

1

254

255

256

257

258

259

260

261

262

263

264

265

266

267

268

1

V

L

C

L

A

I

G

F

A

S

A

F

T

Y

S

5505 1

gtt

Ctt

tgt

ctt

geg

att

gga

ttt

gca

tea

gca

ttt

aca

tat

agt

15

1

269

270

271

272

273

274

275

276

277

278

279

280

281

282

283

1

Y

I

T

Q

P

K

P

E

V

K

V

S

Q

5550 ι

tat

ata

acc

caa

cct

aag

ccg

gag

gtt

aaa

aag

gta

gtc

tet

cag

1

284

285

286

287

288

289

290

291

292

293

294

295

296

297

298

30

1

T

Y

D

F

D

K

F

T

I

D

S

Q

R

L

5595 |

acc

tat

gat

ttt

gat

aaa

ttc

act

att

gac

tet

cag

cgt

ctt

1

299

300

301

302

303

304

305

306

307

308

309

310

311

312

313

1

N

L

S

Y

R

Y

V

F

K

D

S

K

G

K

L

55

5640 1

aat

eta

age

tat

ege

tat

gtt

ttc

aag

gat

tet

aag

gga

aaa

TTA Pacl

I


314	315	316	317	318	319	320	321	322	323	324	325	326	327	328
I	N	S	D	D	L	Q	K	Q	G	Y	S	L	T	Y
5685 ATT Pacl	AAt	age	gac	gat	tta	cag	aag	caa	ggt	tat	tea	etc	aca	tat
329	330	331	332	333	334	335	336	337	338	339	340	341	342	343

ilDLCTVS IKKGNSNE iv Ml K

5730 att gat tta tgt act gtt tcc att aaa aaa ggt aat tea aAT Gaa

Start IV

344	345	346	347	348	349
i I	V	K	C	N	. End	of	I
iv L3	L	N5	V	17	N	F	V10
5775 att	gtt	aaa	tgt	aat	TAA T	TTT	GTT

IV continued.....

5800 ttc ttg atg ttt gtt tca tca tct tct ttt gct cag gta att gaa atg
5848 aat aat tcg cct ctg cgc gat ttt gta act tgg tat tca aag caa tca
5896 ggc gaa tcc gtt att gtt tct CCC gat gta aaa ggt act gtt act gta
5944 tat tca tct gac gtt aaa cct gaa aat cta cgc aat ttc ttt att tct
5992 gtt tta cgt gct aat aat ttt gat atg gtt ggt tca att cct tcc ata
6040 att cag aag tat aat cca aac aat cag gat tat att gat gaa ttg cca
6088 tca tct gat aat cag gaa tat gat gat aat tcc gct cct tct ggt ggt
6136 ttc ttt gtt ccg caa aat gat aat gtt act caa act ttt aaa att aat
6184 aac gtt cgg gca aag gat tta ata cga gtt gtc gaa ttg ttt gta aag
6232 tct aat act tct aaa tcc tca aat gta tta tct att gac ggc tct aat
6280 cta tta gtt gtt TCT gca cct aaa gat att tta gat aac ctt cct caa
!    ApaLI removed
6328 ttc ctt tct act gtt gat ttg cca act gac cag ata ttg att gag ggt
6376 ttg ata ttt gag gtt cag caa ggt gat gct tta gat ttt tca ttt gct
6424 gct ggc tct cag cgt ggc act gtt gca ggc ggt gtt aat act gac cgc
6472 ctc acc tct gtt tta tct tct gct ggt ggt tcg ttc ggt att ttt aat
6520 ggc gat gtt tta ggg cta tca gtt cgc gca tta aag act aat agc cat
6568 tca aaa ata ttg tct gtg cca cgt att ctt acg ctt tca ggt cag aag
6616 ggt tct atc tct gtT GGC CAg aat gtc cct ttt att act ggt cgt gtg
!    MscI_
6664 act ggt gaa tct gcc aat gta aat aat cca ttt cag acg att gag cgt
6712 caa aat gta ggt att tcc atg agc gtt ttt cct gtt gca atg gct ggc
6760 ggt aat att gtt ctg gat att acc agc aag gcc gat agt ttg agt tct
6808 tct act cag gca agt gat gtt att act aat caa aga agt att gct aca
6856 acg gtt aat ttg cgt gat gga cag act ctt tta ctc ggt ggc ctc act
6904 gat tat aaa aac act tct caa gat tct ggc gta ccg ttc ctg tct aaa
6952 atc cct tta atc ggc ctc ctg ttt agc tcc cgc tct gat tcc aac gag
7000 gaa agc acg tta tac gtg ctc gtc aaa gca acc ata gta cgc gcc ctg
7048 TAG cggcgcatt ! End IV

7060 aagcgcggcg ggtgtggtgg ttacgcgcag cgtgaccgct acacttgcca gcgccctagc
7120 gcccgctcct ttcgctttct tcccttcctt tctcgccacg ttcGCCGGCt ttccccgtca
!                                                    NgoMI_
7180 agctctaaat cgggggctcc ctttagggtt ccgatttagt gctttacggc acctcgaccc
7240 caaaaaactt gatttgggtg atggttCACG TAGTGggcca tcgccctgat agacggtttt
!                                 DraIII_
7300 tcgccctttG ACGTTGGAGT Ccacgttctt taatagtgga ctcttgttcc aaactggaac
!             DrdI
7360 aacactcaac cctatctcgg gctattcttt tgatttataa gggattttgc cgatttcgga
7420 accaccatca aacaggattt tcgcctgctg gggcaaacca gcgtggaccg cttgctgcaa
7480 ctctctcagg gccaggcggt gaagggcaat CAGCTGttgc cCGTCTCact ggtgaaaaga
!                                      PvuII.      BsmBI.
7540 aaaaccaccc tGGATCC AAGCTT
!                BamHI   HindIII (1/2)
! Insert carrying bla gene

7563 gcaggtg gcacttttcg gggaaatgtg cgcggaaccc
7600 ctatttgttt atttttctaa atacattcaa atatGTATCC gctcatgaga caataaccct ! BciVI
7660 gataaatgct tcaataatat tgaaaaAGGA AGAgt ! RBS.?...
! Start bla gene
7695 ATG agt att caa cat ttc cgt gtc gcc ctt att ccc ttt ttt gcg gca ttt
7746 tgc ctt cct gtt ttt gct cac cca gaa acg ctg gtg aaa gta aaa gat gct
7797 gaa gat cag ttg ggC gCA CGA Gtg ggt tac atc gaa ctg gat ctc aac agc ! BssSI...
! ApaLI removed
7848 ggt aag atc ctt gag agt ttt cgc ccc gaa gaa cgt ttt cca atg atg agc
7899 act ttt aaa gtt ctg cta tgt cat aca cta tta tcc cgt att gac gcc ggg
7950 caa gaG CAA CTC GGT CGc cgg gcg cgg tat tct cag aat gac ttg gtt gAG ! BcgI_ ScaI
8001 TAC Tca cca gtc aca gaa aag cat ctt acg gat ggc atg aca gta aga gaa ! ScaI_
8052 tta tgc agt gct gcc ata acc atg agt gat aac act gcg gcc aac tta ctt
8103 ctg aca aCG ATC Gga gga ccg aag gag cta acc gct ttt ttg cac aac atg ! PvuI_
8154 ggg gat cat gta act cgc ctt gat cgt tgg gaa ccg gag ctg aat gaa gcc
8205 ata cca aac gac gag cgt gac acc acg atg cct gta gca atg cca aca acg
8256 tTG CGC Aaa cta tta act ggc gaa cta ctt act cta gct tcc cgg caa caa ! FspI....
8307 tta ata gac tgg atg gag gcg gat aaa gtt gca gga cca ctt ctg cgc tcg
8358 GCC ctt ccG GCt ggc tgg ttt att gct gat aaa tct gga gcc ggt gag cgt ! BglI_
8409 gGG TCT Cgc ggt atc att gca gca ctg ggg cca gat ggt aag ccc tcc cgt ! BsaI_
8460 atc gta gtt atc tac acG ACg ggg aGT Cag gca act atg gat gaa cga aat ! AhdI_
8511 aga cag atc gct gag ata ggt gcc tca ctg att aag cat tgg TAA ctgt ! stop
8560 cagaccaagt ttactcatat atactttaga ttgatttaaa acttcatttt taatttaaaa
8620 ggatctaggt gaagatcctt tttgataatc tcatgaccaa aatcccttaa cgtgagtttt
8680 cgttccactg tacgtaagac cccc
8704 AAGCTT GTCGAC tgaa tggcgaatgg cgctttgcct ! HindIII SalI..
! (2/2) HincII
8740 ggtttccggc accagaagcg gtgccggaaa gctggctgga gtgcgatctt
8790 CCTGAGG ! Bsu36I_
8797 ccgat actgtcgtcg tcccctcaaa ctggcagatg
8832 cacggttacg atgcgcccat ctacaccaac gtaacctatc ccattacggt caatccgccg
8892 tttgttccca cggagaatcc gacgggttgt tactcgctca catttaatgt tgatgaaagc
8952 tggctacagg aaggccagac gcgaattatt tttgatggcg ttcctattgg ttaaaaaatg
9012 agctgattta acaaaaattt aacgcgaatt ttaacaaaat attaacgttt acaATTTAAA ! SwaI...
9072 Tatttgctta tacaatcttc ctgtttttgg ggcttttctg attatcaacc GGGGTAcat
! RBS?

9131 ATG att gac atg cta gtt tta cga tta ccg ttc atc gat tct ctt gtt tgc
!    Start gene II
9182 tcc aga ctc tca ggc aat gac ctg ata gcc ttt gtA GAT CTc tca aaa ata
!                                                  BglII
9233 gct acc ctc tcc ggc atg aat tta tca gct aga acg gtt gaa tat cat att
9284 gat ggt gat ttg act gtc tcc ggc ctt tct cac cct ttt gaa tct tta cct
9335 aca cat tac tca ggc att gca ttt aaa ata tat gag ggt tct aaa aat ttt
9386 tat cct tgc gtt gaa ata aag gct tct CCC gca aaa gta tta cag ggt cat
9437 aat gtt ttt ggt aca acc gat tta gct tta tgc tct gag gct tta ttg ctt
9488 aat ttt gct aat tct ttg cct tgc ctg tat gat tta ttg gat gtt
! 9532
! gene II continues
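The restriction sites marked in capitals in the annotated listing can be checked mechanically. The following is a minimal sketch, not part of the specification: the helper name `find_sites` and the choice of rows are illustrative assumptions, and the recognition sequences (DraIII CACNNNGTG, BamHI GGATCC) are the standard ones for the enzymes named in the annotation. It scans one printed row, on the strand as written, and reports positions on the numbering used in the listing.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(row, site, first_base=1):
    """Return the positions at which `site` starts in `row`.

    `row` is the sequence text of one line (spaces and case are ignored);
    `first_base` is the position label printed at the start of that line.
    Only the strand as written is searched.
    """
    seq = re.sub(r"[^A-Za-z]", "", row).upper()
    pattern = "".join(IUPAC[base] for base in site.upper())
    # A lookahead is used so that overlapping occurrences are also reported.
    return [m.start() + first_base for m in re.finditer(f"(?=({pattern}))", seq)]


# Row labelled 7240: the DraIII site is the capitalised CACG TAGTG.
row_7240 = "caaaaaactt gatttgggtg atggttCACG TAGTGggcca tcgccctgat agacggtttt"
print(find_sites(row_7240, "CACNNNGTG", first_base=7240))

# Row labelled 7540: the BamHI site follows the first eleven bases.
print(find_sites("aaaaccaccc tGGATCC", "GGATCC", first_base=7540))
```

Palindromic sites such as these are found by a single-strand search. A non-palindromic site (for example BsmBI, CGTCTC) would also need a search with the reverse complement of the recognition sequence to find occurrences on the other strand.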

Table 21B: Sequence of MALIA3, condensed

LOCUS MALIA3 9532 CIRCULAR

ORIGIN

   1 AATGCTACTA CTATTAGTAG AATTGATGCC ACCTTTTCAG CTCGCGCCCC AAATGAAAAT
  61 ATAGCTAAAC AGGTTATTGA CCATTTGCGA AATGTATCTA ATGGTCAAAC TAAATCTACT
 121 CGTTCGCAGA ATTGGGAATC AACTGTTACA TGGAATGAAA CTTCCAGACA CCGTACTTTA
 181 GTTGCATATT TAAAACATGT TGAGCTACAG CACCAGATTC AGCAATTAAG CTCTAAGCCA
 241 TCCGCAAAAA TGACCTCTTA TCAAAAGGAG CAATTAAAGG TACTCTCTAA TCCTGACCTG
 361 TCTTTCGGGC TTCCTCTTAA TCTTTTTGAT GCAATCCGCT TTGCTTCTGA CTATAATAGT
 421 CAGGGTAAAG ACCTGATTTT TGATTTATGG TCATTCTCGT TTTCTGAACT GTTTAAAGCA
 481 TTTGAGGGGG ATTCAATGAA TATTTATGAC GATTCCGCAG TATTGGACGC TATCCAGTCT
 541 AAACATTTTA CTATTACCCC CTCTGGCAAA ACTTCTTTTG CAAAAGCCTC TCGCTATTTT
 601 GGTTTTTATC GTCGTCTGGT AAACGAGGGT TATGATAGTG TTGCTCTTAC TATGCCTCGT
 661 AATTCCTTTT GGCGTTATGT ATCTGCATTA GTTGAATGTG GTATTCCTAA ATCTCAACTG
 721 ATGAATCTTT CTACCTGTAA TAATGTTGTT CCGTTAGTTC GTTTTATTAA CGTAGATTTT
 781 TCTTCCCAAC GTCCTGACTG GTATAATGAG CCAGTTCTTA AAATCGCATA AGGTAATTCA
 841 CAATGATTAA AGTTGAAATT AAACCATCTC AAGCCCAATT TACTACTCGT TCTGGTGTTT
 901 CTCGTCAGGG CAAGCCTTAT TCACTGAATG AGCAGCTTTG TTACGTTGAT TTGGGTAATG
 961 AATATCCGGT TCTTGTCAAG ATTACTCTTG ATGAAGGTCA GCCAGCCTAT GCGCCTGGTC
1021 TGTACACCGT TCATCTGTCC TCTTTCAAAG TTGGTCAGTT CGGTTCCCTT ATGATTGACC
1081 GTCTGCGCCT CGTTCCGGCT AAGTAACATG GAGCAGGTCG CGGATTTCGA CACAATTTAT
1141 CAGGCGATGA TACAAATCTC CGTTGTACTT TGTTTCGCGC TTGGTATAAT CGCTGGGGGT
1201 CAAAGATGAG TGTTTTAGTG TATTCTTTCG CCTCTTTCGT TTTAGGTTGG TGCCTTCGTA
1261 GTGGCATTAC GTATTTTACC CGTTTAATGG AAACTTCCTC ATGAAAAAGT CTTTAGTCCT
1321 CAAAGCCTCT GTAGCCGTTG CTACCCTCGT TCCGATGCTG TCTTTCGCTG CTGAGGGTGA
1381 CGATCCCGCA AAAGCGGCCT TTAACTCCCT GCAAGCCTCA GCGACCGAAT ATATCGGTTA
1441 TGCGTGGGCG ATGGTTGTTG TCATTGTCGG CGCAACTATC GGTATCAAGC TGTTTAAGAA
1501 ATTCACCTCG AAAGCAAGCT GATAAACCGA TACAATTAAA GGCTCCTTTT GGAGCCTTTT
1561 TTTTTGGAGA TTTTCAACGT GAAAAAATTA TTATTCGCAA TTCCTTTAGT TGTTCCTTTC
1621 TATTCTCACA GTGCACAGTC TGTCGTGACG CAGCCGCCCT CAGTGTCTGG GGCCCCAGGG
1681 CAGAGGGTCA CCATCTCCTG CACTGGGAGC AGCTCCAACA TCGGGGCAGG TTATGATGTA
1741 CACTGGTACC AGCAGCTTCC AGGAACAGCC CCCAAACTCC TCATCTATGG TAACAGCAAT
1801 CGGCCCTCAG GGGTCCCTGA CCGATTCTCT GGCTCCAAGT CTGGCACCTC AGCCTCCCTG
1861 GCCATCACTG GGCTCCAGGC TGAGGATGAG GCTGATTATT ACTGCCAGTC CTATGACAGC
1921 AGCCTGAGTG GCCTTTATGT CTTCGGAACT GGGACCAAGG TCACCGTCCT AGGTCAGCCC
1981 AAGGCCAACC CCACTGTCAC TCTGTTCCCG CCCTCCTCTG AGGAGCTCCA AGCCAACAAG
2041 GCCACACTAG TGTGTCTGAT CAGTGACTTC TACCCGGGAG CTGTGACAGT GGCCTGGAAG
2101 GCAGATAGCA GCCCCGTCAA GGCGGGAGTG GAGACCACCA CACCCTCCAA ACAAAGCAAC
2161 AACAAGTACG CGGCCAGCAG CTATCTGAGC CTGACGCCTG AGCAGTGGAA GTCCCACAGA
2221 AGCTACAGCT GCCAGGTCAC GCATGAAGGG AGCACCGTGG AGAAGACAGT GGCCCCTACA
2281 GAATGTTCAT AATAAACCGC CTCCACCGGG CGCGCCAATT CTATTTCAAG GAGACAGTCA
2341 TAATGAAATA CCTATTGCCT ACGGCAGCCG CTGGATTGTT ATTACTCGCG GCCCAGCCGG
2401 CCATGGCCGA AGTTCAATTG TTAGAGTCTG GTGGCGGTCT TGTTCAGCCT GGTGGTTCTT
2461 TACGTCTTTC TTGCGCTGCT TCCGGATTCA CTTTCTCTTC GTACGCTATG TCTTGGGTTC
2521 GCCAAGCTCC TGGTAAAGGT TTGGAGTGGG TTTCTGCTAT CTCTGGTTCT GGTGGCAGTA
2581 CTTACTATGC TGACTCCGTT AAAGGTCGCT TCACTATCTC TAGAGACAAC TCTAAGAATA
2641 CTCTCTACTT GCAGATGAAC AGCTTAAGGG CTGAGGACAC TGCAGTCTAC TATTGCGCTA
2701 AAGACTATGA AGGTACTGGT TATGCTTTCG ACATATGGGG TCAAGGTACT ATGGTCACCG
2761 TCTCTAGTGC CTCCACCAAG GGCCCATCGG TCTTCCCCCT GGCACCCTCC TCCAAGAGCA
2821 CCTCTGGGGG CACAGCGGCC CTGGGCTGCC TGGTCAAGGA CTACTTCCCC GAACCGGTGA
2881 CGGTGTCGTG GAACTCAGGC GCCCTGACCA GCGGCGTCCA CACCTTCCCG GCTGTCCTAC
2941 AGTCTAGCGG ACTCTACTCC CTCAGCAGCG TAGTGACCGT GCCCTCTTCT AGCTTGGGCA
3001 CCCAGACCTA CATCTGCAAC GTGAATCACA AGCCCAGCAA CACCAAGGTG GACAAGAAAG
3061 TTGAGCCCAA ATCTTGTGCG GCCGCTCATC ACCACCATCA TCACTCTGCT GAACAAAAAC
3121 TCATCTCAGA AGAGGATCTG AATGGTGCCG CAGATATCAA CGATGATCGT ATGGCTGGCG
3181 CCGCTGAAAC TGTTGAAAGT TGTTTAGCAA AACCCCATAC AGAAAATTCA TTTACTAACG
3241 TCTGGAAAGA CGACAAAACT TTAGATCGTT ACGCTAACTA TGAGGGTTGT CTGTGGAATG
3301 CTACAGGCGT TGTAGTTTGT ACTGGTGACG AAACTCAGTG TTACGGTACA TGGGTTCCTA
3361 TTGGGCTTGC TATCCCTGAA AATGAGGGTG GTGGCTCTGA GGGTGGCGGT TCTGAGGGTG
3421 GCGGTTCTGA GGGTGGCGGT ACTAAACCTC CTGAGTACGG TGATACACCT ATTCCGGGCT
3481 ATACTTATAT CAACCCTCTC GACGGCACTT ATCCGCCTGG TACTGAGCAA AACCCCGCTA
3541 ATCCTAATCC TTCTCTTGAG GAGTCTCAGC CTCTTAATAC TTTCATGTTT CAGAATAATA
3601 GGTTCCGAAA TAGGCAGGGG GCATTAACTG TTTATACGGG CACTGTTACT CAAGGCACTG
3661 ACCCCGTTAA AACTTATTAC CAGTACACTC CTGTATCATC AAAAGCCATG TATGACGCTT
3721 ACTGGAACGG TAAATTCAGA GACTGCGCTT TCCATTCTGG CTTTAATGAA GATCCATTCG
3781 TTTGTGAATA TCAAGGCCAA TCGTCTGACC TGCCTCAACC TCCTGTCAAT GCTGGCGGCG
3841 GCTCTGGTGG TGGTTCTGGT GGCGGCTCTG AGGGTGGTGG CTCTGAGGGT GGCGGTTCTG
3901 AGGGTGGCGG CTCTGAGGGA GGCGGTTCCG GTGGTGGCTC TGGTTCCGGT GATTTTGATT
3961 ATGAAAAGAT GGCAAACGCT AATAAGGGGG CTATGACCGA AAATGCCGAT GAAAACGCGC
4021 TACAGTCTGA CGCTAAAGGC AAACTTGATT CTGTCGCTAC TGATTACGGT GCTGCTATCG
4081 ATGGTTTCAT TGGTGACGTT TCCGGCCTTG CTAATGGTAA TGGTGCTACT GGTGATTTTG
4141 CTGGCTCTAA TTCCCAAATG GCTCAAGTCG GTGACGGTGA TAATTCACCT TTAATGAATA
4201 ATTTCCGTCA ATATTTACCT TCCCTCCCTC AATCGGTTGA ATGTCGCCCT TTTGTCTTTA
4261 GCGCTGGTAA ACCATATGAA TTTTCTATTG ATTGTGACAA AATAAACTTA TTCCGTGGTG
4321 TCTTTGCGTT TCTTTTATAT GTTGCCACCT TTATGTATGT ATTTTCTACG TTTGCTAACA
4381 TACTGCGTAA TAAGGAGTCT TAATCATGCC AGTTCTTTTG GGTATTCCGT TATTATTGCG
4441 TTTCCTCGGT TTCCTTCTGG TAACTTTGTT CGGCTATCTG CTTACTTTTC TTAAAAAGGG
4501 CTTCGGTAAG ATAGCTATTG CTATTTCATT GTTTCTTGCT CTTATTATTG GGCTTAACTC
4561 AATTCTTGTG GGTTATCTCT CTGATATTAG CGCTCAATTA CCCTCTGACT TTGTTCAGGG
4621 TGTTCAGTTA ATTCTCCCGT CTAATGCGCT TCCCTGTTTT TATGTTATTC TCTCTGTAAA
4681 GGCTGCTATT TTCATTTTTG ACGTTAAACA AAAAATCGTT TCTTATTTGG ATTGGGATAA
4741 ATAATATGGC TGTTTATTTT GTAACTGGCA AATTAGGCTC TGGAAAGACG CTCGTTAGCG
4801 TTGGTAAGAT TCAGGATAAA ATTGTAGCTG GGTGCAAAAT AGCAACTAAT CTTGATTTAA
4861 GGCTTCAAAA CCTCCCGCAA GTCGGGAGGT TCGCTAAAAC GCCTCGCGTT CTTAGAATAC
4921 CGGATAAGCC TTCTATATCT GATTTGCTTG CTATTGGGCG CGGTAATGAT TCCTACGATG
4981 AAAATAAAAA CGGCTTGCTT GTTCTCGATG AGTGCGGTAC TTGGTTTAAT ACCCGTTCTT
5041 GGAATGATAA GGAAAGACAG CCGATTATTG ATTGGTTTCT ACATGCTCGT AAATTAGGAT
5101 GGGATATTAT TTTTCTTGTT CAGGACTTAT CTATTGTTGA TAAACAGGCG CGTTCTGCAT
5161 TAGCTGAACA TGTTGTTTAT TGTCGTCGTC TGGACAGAAT TACTTTACCT TTTGTCGGTA
5221 CTTTATATTC TCTTATTACT GGCTCGAAAA TGCCTCTGCC TAAATTACAT GTTGGCGTTG
5281 TTAAATATGG CGATTCTCAA TTAAGCCCTA CTGTTGAGCG TTGGCTTTAT ACTGGTAAGA
5341 ATTTGTATAA CGCATATGAT ACTAAACAGG CTTTTTCTAG TAATTATGAT TCCGGTGTTT
5401 ATTCTTATTT AACGCCTTAT TTATCACACG GTCGGTATTT CAAACCATTA AATTTAGGTC
5461 AGAAGATGAA ATTAACTAAA ATATATTTGA AAAAGTTTTC TCGCGTTCTT TGTCTTGCGA
5521 TTGGATTTGC ATCAGCATTT ACATATAGTT ATATAACCCA ACCTAAGCCG GAGGTTAAAA
5581 AGGTAGTCTC TCAGACCTAT GATTTTGATA AATTCACTAT TGACTCTTCT CAGCGTCTTA
5641 ATCTAAGCTA TCGCTATGTT TTCAAGGATT CTAAGGGAAA ATTAATTAAT AGCGACGATT
5701 TACAGAAGCA AGGTTATTCA CTCACATATA TTGATTTATG TACTGTTTCC ATTAAAAAAG
5761 GTAATTCAAA TGAAATTGTT AAATGTAATT AATTTTGTTT TCTTGATGTT TGTTTCATCA
5821 TCTTCTTTTG CTCAGGTAAT TGAAATGAAT AATTCGCCTC TGCGCGATTT TGTAACTTGG
5881 TATTCAAAGC AATCAGGCGA ATCCGTTATT GTTTCTCCCG ATGTAAAAGG TACTGTTACT
5941 GTATATTCAT CTGACGTTAA ACCTGAAAAT CTACGCAATT TCTTTATTTC TGTTTTACGT
6001 GCTAATAATT TTGATATGGT TGGTTCAATT CCTTCCATAA TTCAGAAGTA TAATCCAAAC
6061 AATCAGGATT ATATTGATGA ATTGCCATCA TCTGATAATC AGGAATATGA TGATAATTCC
6121 GCTCCTTCTG GTGGTTTCTT TGTTCCGCAA AATGATAATG TTACTCAAAC TTTTAAAATT
6181 AATAACGTTC GGGCAAAGGA TTTAATACGA GTTGTCGAAT TGTTTGTAAA GTCTAATACT
6241 TCTAAATCCT CAAATGTATT ATCTATTGAC GGCTCTAATC TATTAGTTGT TTCTGCACCT
6301 AAAGATATTT TAGATAACCT TCCTCAATTC CTTTCTACTG TTGATTTGCC AACTGACCAG
6361 ATATTGATTG AGGGTTTGAT ATTTGAGGTT CAGCAAGGTG ATGCTTTAGA TTTTTCATTT
6421 GCTGCTGGCT CTCAGCGTGG CACTGTTGCA GGCGGTGTTA ATACTGACCG CCTCACCTCT
6481 GTTTTATCTT CTGCTGGTGG TTCGTTCGGT ATTTTTAATG GCGATGTTTT AGGGCTATCA
6541 GTTCGCGCAT TAAAGACTAA TAGCCATTCA AAAATATTGT CTGTGCCACG TATTCTTACG
6601 CTTTCAGGTC AGAAGGGTTC TATCTCTGTT GGCCAGAATG TCCCTTTTAT TACTGGTCGT
6661 GTGACTGGTG AATCTGCCAA TGTAAATAAT CCATTTCAGA CGATTGAGCG TCAAAATGTA
6721 GGTATTTCCA TGAGCGTTTT TCCTGTTGCA ATGGCTGGCG GTAATATTGT TCTGGATATT
6781 ACCAGCAAGG CCGATAGTTT GAGTTCTTCT ACTCAGGCAA GTGATGTTAT TACTAATCAA
6841 AGAAGTATTG CTACAACGGT TAATTTGCGT GATGGACAGA CTCTTTTACT CGGTGGCCTC
6901 ACTGATTATA AAAACACTTC TCAAGATTCT GGCGTACCGT TCCTGTCTAA AATCCCTTTA
6961 ATCGGCCTCC TGTTTAGCTC CCGCTCTGAT TCCAACGAGG AAAGCACGTT ATACGTGCTC
7021 GTCAAAGCAA CCATAGTACG CGCCCTGTAG CGGCGCATTA AGCGCGGCGG GTGTGGTGGT
7081 TACGCGCAGC GTGACCGCTA CACTTGCCAG CGCCCTAGCG CCCGCTCCTT TCGCTTTCTT
7141 CCCTTCCTTT CTCGCCACGT TCGCCGGCTT TCCCCGTCAA gctctaaatc gggggctccc
7201 TTTAGGGTTC CGATTTAGTG CTTTACGGCA CCTCGACCCC AAAAAACTTG ATTTGGGTGA
7261 TGGTTCACGT AGTGGGCCAT CGCCCTGATA GACGGTTTTT CGCCCTTTGA CGTTGGAGTC
7321 CACGTTCTTT AATAGTGGAC TCTTGTTCCA AACTGGAACA ACACTCAACC CTATCTCGGG
7381 CTATTCTTTT GATTTATAAG GGATTTTGCC GATTTCGGAA CCACCATCAA ACAGGATTTT
7441 CGCCTGCTGG GGCAAACCAG CGTGGACCGC TTGCTGCAAC TCTCTCAGGG CCAGGCGGTG
7501 AAGGGCAATC AGCTGTTGCC CGTCTCACTG GTGAAAAGAA AAACCACCCT GGATCCAAGC
7561 TTGCAGGTGG CACTTTTCGG GGAAATGTGC GCGGAACCCC TATTTGTTTA TTTTTCTAAA
7621 TACATTCAAA TATGTATCCG CTCATGAGAC AATAACCCTG ATAAATGCTT CAATAATATT
7681 GAAAAAGGAA gagtatgagt attcaacatt TCCGTGTCGC CCTTATTCCC TTTTTTGCGG
7741 CATTTTGCCT TCCTGTTTTT GCTCACCCAG AAACGCTGGT GAAAGTAAAA GATGCTGAAG
7801 ATCAGTTGGG CGCACGAGTG GGTTACATCG AACTGGATCT CAACAGCGGT AAGATCCTTG
7861 AGAGTTTTCG CCCCGAAGAA CGTTTTCCAA TGATGAGCAC TTTTAAAGTT CTGCTATGTC
7921 ATACACTATT ATCCCGTATT GACGCCGGGC AAGAGCAACT CGGTCGCCGG GCGCGGTATT
7981 CTCAGAATGA CTTGGTTGAG TACTCACCAG TCACAGAAAA GCATCTTACG GATGGCATGA
8041 CAGTAAGAGA ATTATGCAGT GCTGCCATAA CCATGAGTGA TAACACTGCG GCCAACTTAC
8101 TTCTGACAAC GATCGGAGGA CCGAAGGAGC TAACCGCTTT TTTGCACAAC ATGGGGGATC
8161 ATGTAACTCG CCTTGATCGT TGGGAACCGG AGCTGAATGA AGCCATACCA AACGACGAGC
8221 GTGACACCAC GATGCCTGTA GCAATGCCAA CAACGTTGCG CAAACTATTA ACTGGCGAAC
8281 TACTTACTCT AGCTTCCCGG CAACAATTAA TAGACTGGAT GGAGGCGGAT AAAGTTGCAG
8341 GACCACTTCT GCGCTCGGCC CTTCCGGCTG GCTGGTTTAT TGCTGATAAA TCTGGAGCCG
8401 GTGAGCGTGG GTCTCGCGGT ATCATTGCAG CACTGGGGCC AGATGGTAAG CCCTCCCGTA
8461 TCGTAGTTAT CTACACGACG GGGAGTCAGG CAACTATGGA TGAACGAAAT AGACAGATCG
8521 CTGAGATAGG TGCCTCACTG ATTAAGCATT GGTAACTGTC AGACCAAGTT TACTCATATA
8581 TACTTTAGAT TGATTTAAAA CTTCATTTTT AATTTAAAAG GATCTAGGTG AAGATCCTTT
8641 TTGATAATCT CATGACCAAA ATCCCTTAAC GTGAGTTTTC GTTCCACTGT ACGTAAGACC
8701 CCCAAGCTTG TCGACTGAAT GGCGAATGGC GCTTTGCCTG GTTTCCGGCA CCAGAAGCGG
8761 TGCCGGAAAG CTGGCTGGAG TGCGATCTTC CTGAGGCCGA TACTGTCGTC GTCCCCTCAA
8821 ACTGGCAGAT GCACGGTTAC GATGCGCCCA TCTACACCAA CGTAACCTAT CCCATTACGG
8881 TCAATCCGCC GTTTGTTCCC ACGGAGAATC CGACGGGTTG TTACTCGCTC ACATTTAATG
8941 TTGATGAAAG CTGGCTACAG GAAGGCCAGA CGCGAATTAT TTTTGATGGC GTTCCTATTG
9001 GTTAAAAAAT GAGCTGATTT AACAAAAATT TAACGCGAAT TTTAACAAAA TATTAACGTT
9061 TACAATTTAA ATATTTGCTT ATACAATCTT CCTGTTTTTG GGGCTTTTCT GATTATCAAC
9121 CGGGGTACAT ATGATTGACA TGCTAGTTTT ACGATTACCG TTCATCGATT CTCTTGTTTG
9181 CTCCAGACTC TCAGGCAATG ACCTGATAGC CTTTGTAGAT CTCTCAAAAA TAGCTACCCT
9241 CTCCGGCATG AATTTATCAG CTAGAACGGT TGAATATCAT ATTGATGGTG ATTTGACTGT
9301 CTCCGGCCTT TCTCACCCTT TTGAATCTTT ACCTACACAT TACTCAGGCA TTGCATTTAA
9361 AATATATGAG GGTTCTAAAA ATTTTTATCC TTGCGTTGAA ATAAAGGCTT CTCCCGCAAA
9421 AGTATTACAG GGTCATAATG TTTTTGGTAC AACCGATTTA GCTTTATGCT CTGAGGCTTT
9481 ATTGCTTAAT TTTGCTAATT CTTTGCCTTG CCTGTATGAT TTATTGGATG TT
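A condensed listing of this kind (a position label followed by blocks of ten bases, sixty bases per row) can be joined into one contiguous string while checking that each printed label agrees with the number of bases read so far. The sketch below is illustrative only: the function name `parse_origin` is hypothetical, and the two sample rows are the rows labelled 6781 and 6841 of Table 21B.

```python
def parse_origin(rows):
    """Join numbered sequence rows into one string.

    Each row is a position label followed by blocks of bases. A ValueError
    is raised when a label does not continue directly from the previous
    row, which exposes dropped or duplicated rows.
    Returns (position of the first base, joined sequence).
    """
    start = None
    expected = None
    pieces = []
    for row in rows:
        fields = row.split()
        if not fields:
            continue  # skip blank lines
        label = int(fields[0])
        chunk = "".join(fields[1:]).upper()
        if start is None:
            start = label
        elif label != expected:
            raise ValueError(f"row labelled {label}, expected {expected}")
        pieces.append(chunk)
        expected = label + len(chunk)
    return start, "".join(pieces)


sample = [
    "6781 ACCAGCAAGG CCGATAGTTT GAGTTCTTCT ACTCAGGCAA GTGATGTTAT TACTAATCAA",
    "6841 AGAAGTATTG CTACAACGGT TAATTTGCGT GATGGACAGA CTCTTTTACT CGGTGGCCTC",
]
first, sequence = parse_origin(sample)
print(first, len(sequence))
```

Applied to the whole table, the same check would be expected to end with the last base at position 9532, the length stated on the LOCUS line, provided no row is missing.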


Table 25: h3401-h2 captured via CJ with BsmAI

!   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
!   S   A   Q   D   I   Q   M   T   Q   S   P   A   T   L   S
  aGT GCA Caa gac atc cag atg acc cag tct cca gcc acc ctg tct
! ApaLI...                                  a gcc acc ! L25,L6,L20,L2,L16,A11
! Extender.................................Bridge...
!  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30
!   V   S   P   G   E   R   A   T   L   S   C   R   A   S   Q
  gtg tct cca ggg gaa agg gcc acc ctc tcc tgc agg gcc agt cag
!  31  32  33  34  35  36  37  38  39  40  41  42  43  44  45
!   S   V   S   N   N   L   A   W   Y   Q   Q   K   P   G   Q
  agt gtt agt aac aac tta gcc tgg tac cag cag aaa cct ggc cag
!  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60
!   V   P   R   L   L   I   Y   G   A   S   T   R   A   T   D
  gtt ccc agg ctc ctc atc tat ggt gca tcc acc agg gcc act gat
!  61  62  63  64  65  66  67  68  69  70  71  72  73  74  75
!   I   P   A   R   F   S   G   S   G   S   G   T   D   F   T
  atc cca gcc agg ttc agt ggc agt ggg tct ggg aca gac ttc act
!  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
!   L   T   I   S   R   L   E   P   E   D   F   A   V   Y   Y
  ctc acc atc agc aga ctg gag cct gaa gat ttt gca gtg tat tac
!  91  92  93  94  95  96  97  98  99 100 101 102 103 104 105
!   C   Q   R   Y   G   S   S   P   G   W   T   F   G   Q   G
  tgt cag cgg tat ggt agc tca ccg ggg tgg acg ttc ggc caa ggg
! 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
!   T   K   V   E   I   K   R   T   V   A   A   P   S   V   F
  acc aag gtg gaa atc aaa cga act gtg gct gca cca tct gtc ttc
! 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135
!   I   F   P   P   S   D   E   Q   L   K   S   G   T   A   S
  atc ttc ccg cca tct gat gag cag ttg aaa tct gga act gcc tct
! 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150
!   V   V   C   L   L   N   N   F   Y   P   R   E   A   K   V
  gtt gtg tgc ctg ctg aat aac ttc tat ccc aga gag gcc aaa gta
! 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165
!   Q   W   K   V   D   N   A   L   Q   S   G   N   S   Q   E
  cag tgg aag gtg gat aac gcc ctc caa tcg ggt aac tcc cag gag
! 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180
!   S   V   T   E   Q   D   S   K   D   S   T   Y   S   L   S
  agt gtc aca gag cag gac agc aag gac agc acc tac agc ctc agc
! 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195
!   S   T   L   T   L   S   K   A   D   Y   E   K   H   K   V
  agc acc ctg acg ctg agc aaa gca gac tac gag aaa cac aaa gtc
! 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210
!   Y   A   C   E   V   T   H   Q   G   L   S   S   P   V   T
  tac gcc tgc gaa gtc acc cat cag ggc ctg agc tcg cct gtc aca
! 211 212 213 214 215 216 217 218 219 220 221 222 223
!   K   S   F   N   K   G   E   C   K   G   E   F   A
  aag agc ttc aac aaa gga gag tgt aag ggc gaa ttc gc.....

Table 26: h3401-d8 KAPPA captured with CJ and BsmAI

!   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
!   S   A   Q   D   I   Q   M   T   Q   S   P   A   T   L   S
  aGT GCA Caa gac atc cag atg acc cag tct cct gcc acc ctg tct
! ApaLI...Extender.........................g gcc acc ! L25,L6,L20,L2,L16,A11
!                                         A GCC ACC CTG TCT ! L2
!  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30
!   V   S   P   G   E   R   A   T   L   S   C   R   A   S   Q
  gtg tct cca ggt gaa aga gcc acc ctc tcc tgc agg gcc agt cag
! GTG TCT CCA GGG GAA AGA GCC ACC CTC TCC TGC ! L2
!  31  32  33  34  35  36  37  38  39  40  41  42  43  44  45
!   N   L   L   S   N   L   A   W   Y   Q   Q   K   P   G   Q
  aat ctt ctc agc aac tta gcc tgg tac cag cag aaa cct ggc cag
!  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60
!   A   P   R   L   L   I   Y   G   A   S   T   G   A   I   G
  gct ccc agg ctc ctc atc tat ggt gct tcc acc ggg gcc att ggt
!  61  62  63  64  65  66  67  68  69  70  71  72  73  74  75
!   I   P   A   R   F   S   G   S   G   S   G   T   E   F   T
  atc cca gcc agg ttc agt ggc agt ggg tct ggg aca gag ttc act
!  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
!   L   T   I   S   S   L   Q   S   E   D   F   A   V   Y   F
  ctc acc atc agc agc ctg cag tct gaa gat ttt gca gtg tat ttc
!  91  92  93  94  95  96  97  98  99 100 101 102 103 104 105
!   C   Q   Q   Y   G   T   S   P   P   T   F   G   G   G   T
  tgt cag cag tat ggt acc tca ccg ccc act ttc ggc gga ggg acc
! 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
!   K   V   E   I   K   R   T   V   A   A   P   S   V   F   I
  aag gtg gag atc aaa cga act gtg gct gca cca tct gtc ttc atc
! 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135
!   F   P   P   S   D   E   Q   L   K   S   G   T   A   S   V
  ttc ccg cca tct gat gag cag ttg aaa tct gga act gcc tct gtt
! 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150
!   V   C   P   L   N   N   F   Y   P   R   E   A   K   V   Q
  gtg tgc ccg ctg aat aac ttc tat ccc aga gag gcc aaa gta cag
! 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165
!   W   K   V   D   N   A   L   Q   S   G   N   S   Q   E   S
  tgg aag gtg gat aac gcc ctc caa tcg ggt aac tcc cag gag agt
! 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180
!   V   T   E   Q   D   N   K   D   S   T   Y   S   L   S   S
  gtc aca gag cag gac aac aag gac agc acc tac agc ctc agc agc
! 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195
!   T   L   T   L   S   K   V   D   Y   E   K   H   E   V   Y
  acc ctg acg ctg agc aaa gta gac tac gag aaa cac gaa gtc tac
! 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210
!   A   C   E   V   T   H   Q   G   L   S   S   P   V   T   K
  gcc tgc gaa gtc acc cat cag ggc ctt agc tcg ccc gtc acg aag
! 211 212 213 214 215 216 217 218 219 220 221 222 223
!   S   F   N   R   G   E   C   K   K   E   F   V
  agc ttc aac agg gga gag tgt aag aaa gaa ttc gtt t
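The amino-acid line printed above each codon line can be checked by translating the codons with the standard genetic code. The following is a sketch under that assumption only; the function name `translate` is hypothetical. The example row is codons 31 to 45 of Table 25.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for first, second and third base.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(codon_line):
    """Translate a line of whitespace-separated codons to one-letter codes."""
    return "".join(CODON_TABLE[codon.upper()] for codon in codon_line.split())


# Codons 31-45 of Table 25 (h3401-h2).
row = "agt gtt agt aac aac tta gcc tgg tac cag cag aaa cct ggc cag"
print(translate(row))
```

A codon that is not three bases from A, C, G and T raises a KeyError, which makes character misreadings in a codon line easy to locate.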

Table 27: V3-23 VH framework with variegated codons shown

!                  17  18  19  20  21  22
!                   A   Q   P   A   M   A
5'-ctg tct gaa cG GCC cag ccG GCC atg gcc    29
3'-gac aga ctt gc cgg gtc ggc cgg tac cgg
!  Scab.........SfiI.............
!                       NgoMI...
!                              NcoI....

!  FR1(DP47/V3-23)-----------
!   23  24  25  26  27  28  29  30
!    E   V   Q   L   L   E   S   G
   gaa|gtt|CAA|TTG|tta|gag|tct|ggt|    53
   ctt|caa|gtt|aac|aat|ctc|aga|cca|
!         | MfeI  |

!  -------------FR1-------------------------------
!   31  32  33  34  35  36  37  38  39  40  41  42  43  44  45
!    G   G   L   V   Q   P   G   G   S   L   R   L   S   C   A
  |ggc|ggt|ctt|gtt|cag|cct|ggt|ggt|tct|tta|cgt|ctt|tct|tgc|gct|    98
  |ccg|cca|gaa|caa|gtc|gga|cca|cca|aga|aat|gca|gaa|aga|acg|cga|

! Sites to be varied--->       ***     ***     ***
!  --FR1----------->|...CDR1................|---FR2----
!   46  47  48  49  50  51  52  53  54  55  56  57  58  59  60
!    A   S   G   F   T   F   S   S   Y   A   M   S   W   V   R
  |gct|TCC|GGA|ttc|act|ttc|tct|tCG|TAC|Gct|atg|tct|tgg|gtt|cgC|   143
  |cga|agg|cct|aag|tga|aag|aga|agc|atg|cga|tac|aga|acc|caa|gcg|
!     | BspEI |                | BsiWI|                     |BstXI.

! Sites to be varied--->                       ***     *** ***
!  ------FR2------------------->|...CDR2.........
!   61  62  63  64  65  66  67  68  69  70  71  72  73  74  75
!    Q   A   P   G   K   G   L   E   W   V   S   A   I   S   G
  |CAa|gct|ccT|GGt|aaa|ggt|ttg|gag|tgg|gtt|tct|gct|atc|tct|ggt|   188
  |gtt|cga|gga|cca|ttt|cca|aac|ctc|acc|caa|aga|cga|tag|aga|cca|
!  ...BstXI    |

! Sites to be varied--->   ***     ***
!  .....CDR2............................................|---FR3---
!   76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
!    S   G   G   S   T   Y   Y   A   D   S   V   K   G   R   F
  |tct|ggt|ggc|agt|act|tac|tat|gct|gac|tcc|gtt|aaa|ggt|cgc|ttc|   233
  |aga|cca|ccg|tca|tga|atg|ata|cga|ctg|agg|caa|ttt|cca|gcg|aag|

!   91  92  93  94  95  96  97  98  99 100 101 102 103 104 105
!    T   I   S   R   D   N   S   K   N   T   L   Y   L   Q   M
  |act|atc|TCT|AGA|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg|   278
  |tga|tag|aga|tct|ctg|ttg|aga|ttc|tta|tga|gag|atg|aac|gtc|tac|
!         | XbaI  |

!  ---FR3------------------------------>|
!  106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
!    N   S   L   R   A   E   D   T   A   V   Y   Y   C   A   K
  |aac|agC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat|tgc|gct|aaa|   323
  |ttg|tcg|aat|tcc|cga|ctc|ctg|tga|cgt|cag|atg|ata|acg|cga|ttt|
!       | AflII |               | PstI  |

!  .......CDR3.................|---FR4-----------------
!  121 122 123 124 125 126 127 128 129 130 131 132 133 134 135
!    D   Y   E   G   T   G   Y   A   F   D   I   W   G   Q   G
  |gac|tat|gaa|ggt|act|ggt|tat|gct|ttc|gaC|ATA|TGg|ggt|caa|ggt|   368
  |ctg|ata|ctt|cca|tga|cca|ata|cga|aag|ctg|tat|acc|cca|gtt|cca|
!                                         | NdeI  |

!  ----------FR4-------->|
!  136 137 138 139 140 141 142
!    T   M   V   T   V   S   S
  |act|atG|GTC|ACC|gtc|tct|agt-   389
  |tga|tac|cag|tgg|cag|aga|tca|
!       | BstEII  |

!  143 144 145 146 147 148 149 150 151 152
!    A   S   T   K   G   P   S   V   F   P
   gcc tcc acc aaG GGC CCa tcg GTC TTC ccc-3'   419
   cgg agg tgg ttc ccg ggt agc cag aag ggg-5'
!               Bsp120I.       BbsI...(2/2)
!               ApaI

(SFPRMET) 5'-ctg tct gaa cG GCC cag ccG-3'
(TOPFR1A) 5'-ctg tct gaa cG GCC cag ccG GCC atg gcc gaa|gtt|CAA|TTG|tta|gag|tct|ggt||ggc|ggt|ctt|gtt|cag|cct|ggt|ggt|tct|tta-3'
(BOTFR1B) 3'-caa|gtc|gga|cca|cca|aga|aat|gca|gaa|aga|acg|cga||cga|agg|cct|aag|tga|aag-5' ! bottom strand
(BOTFR2) 3'-acc|caa|gcg||gtt|cga|gga|cca|ttt|cca|aac|ctc|acc|caa|aga|-5' ! bottom strand
(BOTFR3) 3'- a|cga|ctg|agg|caa|ttt|cca|gcg|aag||tga|tag|aga|tct|ctg|ttg|aga|ttc|tta|tga|gag|atg|aac|gtc|tac||ttg|tcg|aat|tcc|cga|ctc|ctg|tga-5'
(F06) 5'-gC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat|tgc|gct|aaa||gac|tat|gaa|ggt|act|ggt|tat|gct|ttc|gaC|ATA|TGg|ggt|c-3'
(BOTFR4) 3'-cga|aag|ctg|tat|acc|cca|gtt|cca||tga|tac|cag|tgg|cag|aga|tca| cgg agg tgg ttc ccg ggt agc cag aag ggg-5' ! bottom strand
(BOTPRCPRIM) 3'-gg ttc ccg ggt agc cag aag ggg-5'

! CDR1 diversity

(ON-vgC1) 5'-|gct|TCC|GGA|ttc|act|ttc|tct|<1>|TAC|<1>|atg|<1>| ! CDR1...................6859

5 J|gglgg^g£l£^algcilc|TlGg-3' t

! <1> stands for an equimolar mix of {ADEFGHIKLMNPQRSTVWY}; no C
! (this is not a sequence)

! CDR2 diversity

(ON-vgC2) 5'-ggt|ttg|gag|tgg|gtt|tct|<2>|atc|<2>|<3>| ! CDR2............
|tct|ggt|ggc|<1>|act|<1>|tat|gct|gac|tcc|gtt|aaa|gg-3' ! CDR2................................................

! <1> is an equimolar mixture of {ADEFGHIKLMNPQRSTVWY}; no C
! <2> is an equimolar mixture of {YRWVGS}; no ACDEFHIKLMNPQT
! <3> is an equimolar mixture of {PS}; no ACDEFGHIKLMNQRTVWY
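The amino-acid diversity encoded by each variegated oligonucleotide is the product of the sizes of the mixtures at its varied positions. The sketch below assumes only what Table 27 states: ON-vgC1 carries three <1> positions, and ON-vgC2 carries two <2>, one <3> and two <1> positions. The variable and function names are illustrative. For CDR1 the result reproduces the figure 6859 printed beside ON-vgC1; the CDR2 and combined figures are derived here and are not quoted from the table.

```python
from math import prod

# Allowed amino acids at each class of variegated position (Table 27).
MIXTURES = {
    "<1>": "ADEFGHIKLMNPQRSTVWY",  # 19 types, no C
    "<2>": "YRWVGS",               # 6 types
    "<3>": "PS",                   # 2 types
}

# Variegated positions of each oligonucleotide, in 5' to 3' order.
ON_VGC1 = ["<1>", "<1>", "<1>"]
ON_VGC2 = ["<2>", "<2>", "<3>", "<1>", "<1>"]


def diversity(positions):
    """Number of distinct amino-acid sequences encoded by the varied positions."""
    return prod(len(MIXTURES[p]) for p in positions)


cdr1 = diversity(ON_VGC1)  # 19 * 19 * 19
cdr2 = diversity(ON_VGC2)  # 6 * 6 * 2 * 19 * 19
print(cdr1, cdr2, cdr1 * cdr2)
```

These counts are at the amino-acid level and assume each mixture is equimolar, as stated; they say nothing about how many of the combinations are actually sampled in a library of a given size.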

!EcoO109I RGgnccy 3 7 2636 4208
BssSI Ctcgtg 1 12
!-- Cacgag 1 1703
BspHI Tcatga 3 43 148 1156
AatII GACGTc 1 65

!EspI GCtnagc 1 2580
!SgrAI CRccggyg 1 2648
!AgeI Accggt 2 2649 4302
!AscI GGcgcgcc 1 2689
!BssHII Gcgcgc 1 2690

!PflFI GACNnngtc 1 4308
!Tth111I GACNnngtc 1 4308
!KasI Ggcgcc 2 4327 5967
BstXI CCANNNNNntgg 1 4415
!NotI GCggccgc 1 4507


141 tcctttttga taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt
201 cagaccccgt agaaaagatc aaaggatctt cttgagatcc tttttttctg cgcgtaatct
261 gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg gatcaagagc
321 taccaactct ttttccgaag gtaactggct tcagcagagc gcagatacca aatactgtcc
381 ttctagtgta gccgtagtta ggccaccact tcaagaactc tgtagcaccg cctacatacc


2314 tct cac aGT GCA Cag gtc caa CTG CAG GTC GAC CTC GAG atc aaa
!           ApaLI......      PstI... XhoI...
!                       BspMI...
!                                   SalI...
!                                   AccI...(1/2)
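The annotations on the row numbered 2314 can be checked mechanically. The following is a minimal sketch, not part of the specification: it assumes the fragment reads as transcribed above (with the OCR-damaged codon read as `atc`), searches the top strand only, and uses a hypothetical helper named `find_sites`. Positions are counted from the first base of the fragment, so they are not the vector coordinates used in the tables.

```python
import re

# IUPAC nucleotide codes needed for the recognition sequences used here.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}


def find_sites(seq, site):
    """Return 1-based start positions of every top-strand match of `site` in `seq`."""
    # A lookahead lets overlapping matches be reported as well.
    pattern = "(?=" + "".join(IUPAC[base] for base in site.upper()) + ")"
    return [m.start() + 1 for m in re.finditer(pattern, seq.upper())]


# Fragment transcribed from the row numbered 2314 (assumed reading of the OCR text).
FRAGMENT = "tctcacaGTGCACaggtccaaCTGCAGGTCGACCTCGAGatcaaa"

SITES = {
    "ApaLI": "GTGCAC",
    "PstI": "CTGCAG",
    "SalI": "GTCGAC",
    "XhoI": "CTCGAG",
    "AccI": "GTMKAC",
}

hits = {name: find_sites(FRAGMENT, site) for name, site in SITES.items()}
print(hits)
```

SalI and AccI report the same start position because GTCGAC is one of the sequences covered by the degenerate AccI pattern GTMKAC.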


2629 acc cat cag ggc ctg agt tcA CCG GTg aca aag agc ttc aac agg
!                          AgeI....(1/2)


2813 |ggc|ggt|ctt|gtt|cag|cct|ggt|ggt|tct|tta|cgt|ctt|tct|tgc|gct|


4288 gac tac ttc ccc gaa ccg gtg acg gtg tcg tgg aac tca ggc gcc


4633 act aac gtc tgg aaa gac gac aaa act tta gat cgt tac gct aac


5038 ggc act gtt act caa ggc act gac ccc gtt aaa act tat tac cag


5398 gaa aat gcc gat gaa aac gcg cta cag tct gac gct aaa ggc aaa


5758 gta ttt tcg acg ttt gct aac ata ctg cgt aat aag gag tct taa


Table 30: Oligonucleotides used to clone CDR1/2 diversity

All sequences are 5' to 3'.

1) ON_CD1Bsp, 30 bases

AccTcAcTggcTTccggA TTcAcTTTcTcT

2) ON_Br12, 42 bases

AgAAAcccAcTccAAAcc TTTAccAggAgcTTggcg AAcccA

3) ON_CD2Xba, 51 bases

ggAAggcAgTgATcTAgA gATAgTgAAgcgAccTTT AAcggAgTcAgcATA

4) ON_BotXba, 23 bases

ggAAggcAgTgATcTAgA gATAg
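The stated lengths in Table 30 can be cross-checked against the printed sequences. The sketch below is illustrative only: it assumes the oligonucleotides read as transcribed from the table, and it assumes (from the "Bsp" and "Xba" suffixes in the names and the BspEI and XbaI entries of Table 36) that TCCGGA and TCTAGA are the sites of interest. The helper `site_positions` is hypothetical.

```python
# Oligonucleotides as transcribed from Table 30 (5' to 3'), with stated lengths.
OLIGOS = {
    "ON_CD1Bsp": ("AccTcAcTggcTTccggA" "TTcAcTTTcTcT", 30),
    "ON_Br12": ("AgAAAcccAcTccAAAcc" "TTTAccAggAgcTTggcg" "AAcccA", 42),
    "ON_CD2Xba": ("ggAAggcAgTgATcTAgA" "gATAgTgAAgcgAccTTT" "AAcggAgTcAgcATA", 51),
    "ON_BotXba": ("ggAAggcAgTgATcTAgA" "gATAg", 23),
}

# Assumed sites of interest (recognition sequences as listed in Table 36).
SITES = {"BspEI": "TCCGGA", "XbaI": "TCTAGA"}


def site_positions(seq, site):
    """Return 1-based start positions of exact matches of `site` in `seq`."""
    s = seq.upper()
    return [i + 1 for i in range(len(s) - len(site) + 1) if s[i:i + len(site)] == site]


report = {}
for name, (seq, stated_length) in OLIGOS.items():
    report[name] = {
        "length_ok": len(seq) == stated_length,
        "sites": {enzyme: site_positions(seq, site) for enzyme, site in SITES.items()},
    }

for name, entry in report.items():
    print(name, entry)
```

Under these assumptions each oligonucleotide matches its stated length, ON_CD1Bsp carries a single TCCGGA, and both Xba oligonucleotides carry a single TCTAGA in their shared 5' portion.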


Table 31: Bridge/Extender Oligonucleotides

ON_Lam1aB7(rc)      ........................GTGCTGACTCAGCCACCCTC.
ON_Lam2aB7(rc)      ........................GCCCTGACTCAGCCTGCCTC.
ON_Lam3lB7(rc)      ........................GAGCTGACTCAGG.ACCCTGC
ON_Lam3rB7(rc)      ........................GAGCTGACTCAGCCACCCTC.
ON_LamHf1cBrg(rc)   CCTCGACAGCGAAGTGCACAGAGCGTCTTGACTCAGCC.......
ON_LamHf1cExt       CCTCGACAGCGAAGTGCACAGAGCGTCTTG...............
ON_LamHf2b2Brg(rc)  CCTCGACAGCGAAGTGCACAGAGCGCTTTGACTCAGCC.......
ON_LamHf2b2Ext      CCTCGACAGCGAAGTGCACAGAGCGCTTTG...............
ON_LamHf2dBrg(rc)   CCTCGACAGCTAAGTGCACAGAGCGCTTTGACTCAGCC.......
ON_LamHf2dExt       CCTCGACAGCGAAGTGCACAGAGCGCTTTG...............
ON_LamHf3lBrg(rc)   CCTCGACAGCGAAGTGCACAGAGCGAATTGACTCAGCC.......
ON_LamHf3lExt       CCTCGACAGCGAAGTGCACAGAGCGAATTG...............
ON_LamHf3rBrg(rc)   CCTCGACAGCGAAGTGCACAGTACGAATTGACTCAGCC.......
ON_LamHf3rExt       CCTCGACAGCGAAGTGCACAGTACGAATTG...............
ON_lamPlePCR        CCTCGACAGCGAAGTGCACAG........................
Consensus
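The alignment in Table 31 suggests that each extender corresponds to the first 30 bases of the matching bridge (shown as its reverse complement), and that the PCR primer ON_lamPlePCR corresponds to the first 21 bases of every extender. The sketch below tests that reading. It assumes the sequences are as transcribed in the table; the dictionary keys and the helper `first_mismatch` are illustrative names, not identifiers from the specification.

```python
# Bridge oligonucleotides as printed (reverse-complement orientation) and extenders.
BRIDGES_RC = {
    "1c": "CCTCGACAGCGAAGTGCACAGAGCGTCTTGACTCAGCC",
    "2b2": "CCTCGACAGCGAAGTGCACAGAGCGCTTTGACTCAGCC",
    "2d": "CCTCGACAGCTAAGTGCACAGAGCGCTTTGACTCAGCC",
    "3l": "CCTCGACAGCGAAGTGCACAGAGCGAATTGACTCAGCC",
    "3r": "CCTCGACAGCGAAGTGCACAGTACGAATTGACTCAGCC",
}
EXTENDERS = {
    "1c": "CCTCGACAGCGAAGTGCACAGAGCGTCTTG",
    "2b2": "CCTCGACAGCGAAGTGCACAGAGCGCTTTG",
    "2d": "CCTCGACAGCGAAGTGCACAGAGCGCTTTG",
    "3l": "CCTCGACAGCGAAGTGCACAGAGCGAATTG",
    "3r": "CCTCGACAGCGAAGTGCACAGTACGAATTG",
}
PCR_PRIMER = "CCTCGACAGCGAAGTGCACAG"  # ON_lamPlePCR


def first_mismatch(a, b):
    """Return the 1-based position of the first difference over the shared length, or None."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i + 1
    return None


result = {
    family: {
        "extender_is_prefix_of_bridge": BRIDGES_RC[family].startswith(EXTENDERS[family]),
        "first_mismatch": first_mismatch(BRIDGES_RC[family], EXTENDERS[family]),
        "primer_is_prefix_of_extender": EXTENDERS[family].startswith(PCR_PRIMER),
    }
    for family in EXTENDERS
}

for family, entry in result.items():
    print(family, entry)
```

With the table as transcribed, the 2d pair differs at position 11 (T in the bridge, G in the extender). Whether that difference is in the original filing or arose during text extraction cannot be determined from this copy.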

Table 32: Oligonucleotides used to make ssDNA locally double-stranded

Adapters (8)

H43.77.97.1-02#1    5'-cc gtg tat tac tgt gcg aga g-3'
H43.77.97.1-03#2    5'-ct gtg tat tac tgt gcg aga g-3'
H43.77.97.323#22    5'-cc gtg tat tac tgt gcg aga g-3'
H43.77.97.330#23    5'-cg gtg tat tac tgt gcg aga g-3'
H43.77.97.439#44    5'-cg gtg tat tac tgt gcg aga g-3'
H43.77.97.551#48    5'-cc gtg tat tac tgt gcg aga g-3'


Table 33: Bridge/extender pairs

Bridges (2)

H43.XABr1

5'-ggtgtagtgaTCTAGtgacaactctaagaatactctctacttgcagatgaacagCTTtAGggctgaggacaCTGCAGtctactattgtgcgaga-3'

H43.XABr2

5'-ggtgtagtgaTCTAGtgacaactctaagaatactctctacttgcagatgaacagCTTtAGggctgaggacaCTGCAGtctactattgtgcgaaa-3'

Extender

H43.XAExt

5'-ATAgTAgAcTgcAgTgTccTcAgcccTTAAgcTgTTcATcTgcAAgTAgAgAgTATTcTTAgAgTTgTcTcTAgATcAcTAcAcc-3'
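The relationship between a bridge and the extender in Table 33 can be examined by comparing the bridge with the reverse complement of the extender. The following sketch assumes the sequences read as printed in the table once the wrapped lines are joined; the function and variable names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, preserving letter case."""
    return seq.translate(COMPLEMENT)[::-1]


# H43.XABr1 and H43.XAExt as printed in Table 33 (wrapped lines joined).
BRIDGE_1 = (
    "ggtgtagtgaTCTAGtgacaactctaagaatactctctacttgcagatgaacagCTTtAGgg"
    "ctgaggacaCTGCAGtctactattgtgcgaga"
)
EXTENDER = (
    "ATAgTAgAcTgcAgTgTccTcAgcccTTAAgcTgTTcATcTgcAAgTAgAgAgTATTcTTAg"
    "AgTTgTcTcTAgATcAcTAcAcc"
)

bridge = BRIDGE_1.upper()
ext_rc = reverse_complement(EXTENDER).upper()

# Compare over the region the two oligonucleotides share (zip stops at the shorter one).
mismatches = [i + 1 for i, (b, e) in enumerate(zip(bridge, ext_rc)) if b != e]
# Bases of the bridge that extend beyond the extender-complementary region.
overhang = bridge[len(ext_rc):]

print(len(bridge), len(ext_rc), mismatches, overhang)
```

On this reading the bridge pairs with the extender over 85 bases except at two positions, which fall inside the extender's TCTAGA and CTTAAG hexamers (TCTAGT and CTTTAG on the bridge), and the bridge carries nine additional 3' bases (TGTGCGAGA). The interpretation of those two mismatches as deliberate is an inference from the sequences, not a statement taken from the table.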


Table 34: PCR primers

Primers

H43.XAPCR2    gactgggTgTAgTgATcTAg
Hucmnest      cttttctttgttgccgttggggtg


Table 35: PCR program for amplification of heavy chain CDR3 DNA

95 degrees C    5 minutes
95 degrees C    20 seconds  )
60 degrees C    30 seconds  ) repeat 20x
72 degrees C    1 minute    )
72 degrees C    7 minutes
4 degrees C     hold

Reagents (100 ul reaction):

Template             5 ul ligation mix
10x PCR buffer       1x
Taq                  5 U
dNTPs                200 uM each
MgCl2                2 mM
H43.XAPCR2-biotin    400 nM
Hucmnest             200 nM
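The quantities in Table 35 translate into absolute amounts and a nominal run time by simple arithmetic. The sketch below is illustrative: it assumes that "repeat 20x" applies to the three steps of 20 seconds, 30 seconds and 1 minute, it ignores thermal ramp times and the final hold, and the function name `amount_pmol` is hypothetical.

```python
REACTION_UL = 100  # reaction volume from Table 35


def amount_pmol(conc_nM, volume_ul):
    """Convert a final concentration (nM) in a given volume (ul) to picomoles."""
    # nM x ul = 1e-9 mol/L x 1e-6 L = 1e-15 mol (fmol); divide by 1000 for pmol.
    return conc_nM * volume_ul / 1000.0


primers_nM = {"H43.XAPCR2-biotin": 400, "Hucmnest": 200}
primer_pmol = {name: amount_pmol(conc, REACTION_UL) for name, conc in primers_nM.items()}

# dNTPs at 200 uM each and MgCl2 at 2 mM, expressed in nanomoles per reaction.
dntp_nmol_each = amount_pmol(200_000, REACTION_UL) / 1000.0
mgcl2_nmol = amount_pmol(2_000_000, REACTION_UL) / 1000.0

# Nominal program length: initial denaturation, 20 cycles, final extension.
cycle_s = 20 + 30 + 60
total_s = 5 * 60 + 20 * cycle_s + 7 * 60

print(primer_pmol, dntp_nmol_each, mgcl2_nmol, total_s, round(total_s / 60, 1))
```

Under these assumptions the reaction contains 40 pmol of the biotinylated primer and 20 pmol of Hucmnest, and the program runs for a little under 49 minutes before the hold.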

! Table 36: Annotated sequence of CJR DY3F7(CJR-A05) 10251 bases
!
! Non-cutters
!
!BclI     Tgatca      BsiWI   Cgtacg     BssSI   Cacgag
!BstZ17I  GTAtac      BtrI    CACgtg     EcoRV   GATatc
!FseI     GGCCGGcc    HpaI    GTTaac     MluI    Acgcgt
!PmeI     GTTTaaac    PmlI    CACgtg     PpuMI   RGgwccy
!RsrII    CGgwccg     SapI    GCTCTTC    SexAI   Accwggt
!SgfI     GCGATcgc    SgrAI   CRccggyg   SphI    GCATGc
!StuI     AGGcct      XmaI    Cccggg
!
! cutters
!
! Enzymes that cut from 1 to 4 times and other features
!
!End of genes II and X                  829
!Start gene V                           843
!BsrGI    Tgtaca                  1    1021
!BspMI    Nnnnnnnnngcaggt         3    1104 5997 9183
!--       ACCTGCNNNNn             1    2281
!End of gene V                         1106
!Start gene VII                        1108
!BsaBI    GATNNnnatc              2    1149 3967
!Start gene IX                         1208
!End gene VII                          1211
!SnaBI    TACgta                  2    1268 7133
!BspHI    Tcatga                  3    1299 6085 7093
!Start gene VIII                       1301
!End gene IX                           1304
!End gene VIII                         1522
!Start gene III                        1578
!EagI     Cggccg                  2    1630 8905
!XbaI     Tctaga                  2    1643 8436
!KasI     Ggcgcc                  4    1650 8724 9039 9120
!BsmI     GAATGCN                 2    1769 9065
!BseRI    GAGGAGNNNNNNNNNN        2    2031 8516
!--       NNnnnnnnnnctcctc        2    7603 8623
!AlwNI    CAGNNNctg               3    2210 8072 8182
!BspDI    ATcgat                  2    2520 9883
!NdeI     CAtatg                  3    2716 3796 9847
!End gene III                          2846
!Start gene VI                         2848
!AfeI     AGCgct                  1    3032
!End gene VI                           3187
!Start gene I                          3189
!EarI     CTCTTCNnnn              2    4067 9274
!--       Nnnnngaagag             2    6126 8953
!PacI     TTAATtaa                1    4125
!Start gene IV                         4213
!End gene I                            4235
!BsmFI    Nnnnnnnnnnnnnnngtccc    2    5068 9515
!MscI     TGGcca                  3    5073 7597 9160
!PsiI     TTAtaa                  2    5349 5837
!End gene IV                           5493
!Start ori                             5494
!NgoMIV   Gccggc                  3    5606 8213 9315
!BanII    GRGCYc                  4    5636 8080 8606 8889
!DraIII   CACNNNgtg               1    5709
!DrdI     GACNNNNnngtc            1    5752
!AvaI     Cycgrg                  2    5818 7240


!PvuII    CAGctg                  1    5953
!BsmBI    CGTCTCNnnnn             3    5964 8585 9271
!End ori region                        5993
!BamHI    Ggatcc                  1    5994
!HindIII  Aagctt                  3    6000 7147 7384
!BciVI    GTATCCNNNNNN            1    6077
!Start bla                             6138
!Eco57I   CTGAAG                  2    6238 7716
!SpeI     Actagt                  1    6257
!BcgI     gcannnnnntcg            1    6398
!ScaI     AGTact                  1    6442
!PvuI     CGATcg                  1    6553
!FspI     TGCgca                  1    6700
!BglI     GCCNNNNnggc             3    6801 8208 8976
!BsaI     GGTCTCNnnnn             1    6853
!AhdI     GACNNNnngtc             1    6920
!Eam1105I GACNNNnngtc             1    6920
!End bla                               6998
!AccI     GTmkac                  2    7153 8048
!HincII   GTYrac                  1    7153
!SalI     Gtcgac                  1    7153
!XhoI     Ctcgag                  1    7240
!Start PlacZ region                    7246
!End PlacZ region                      7381
!PflMI    CCANNNNntgg             1    7382
!RBS1                                  7405
!start M13-iii signal seq for LC       7418
!ApaLI    Gtgcac                  1    7470
!end M13-iii signal seq                7471
!Start light chain kappa L20:JK1       7472
!PflFI    GACNnngtc               3    7489 8705 9099
!SbfI     CCTGCAgg                1    7542
!PstI     CTGCAg                  1    7543
!KpnI     GGTACc                  1    7581
!XcmI     CCANNNNNnnnntgg         2    7585 9215
!NsiI     ATGCAt                  2    7626 9503
!BsgI     ctgcac                  1    7809
!BbsI     gtcttc                  2    7820 8616
!BlpI     GCtnagc                 1    8017
!EspI     GCtnagc                 1    8017
!EcoO109I RGgnccy                 2    8073 8605
!Ecl136I  GAGctc                  1    8080
!SacI     GAGCTc                  1    8080
!End light chain                       8122
!AscI     GGcgcgcc                1    8126
!BssHII   Gcgcgc                  1    8127
!RBS2                                  8147
!SfiI     GGCCNNNNnggcc           1    8207
!NcoI     Ccatgg                  1    8218
!Start 3-23, FR1                       8226
!MfeI     Caattg                  1    8232
!BspEI    Tccgga                  1    8298
!Start CDR1                            8316
!Start FR2                             8331
!BstXI    CCANNNNNntgg            2    8339 8812
!EcoNI    CCTNNnnnagg             2    8346 8675
!Start FR3                             8373
!XbaI     Tctaga                  2    8436 1643
!AflII    Cttaag                  1    8480
!Start CDR3                            8520
!AatII    GACGTc                  1    8556


!Start FR4                             8562
!PshAI    GACNNnngtc              2    8573 9231
!BstEII   Ggtnacc                 1    8579
!Start CH1                             8595
!ApaI     GGGCCC                  1    8606
!Bsp120I  Gggccc                  1    8606
!PspOMI   Gggccc                  1    8606
!AgeI     Accggt                  1    8699
!Bsu36I   CCtnagg                 2    8770 9509
!End of CH1                            8903
!NotI     GCggccgc                1    8904
!Start His6 tag                        8913
!Start cMyc tag                        8931
!Amber codon                           8982
!NheI     Gctagc                  1    8985
!Start M13 III Domain 3                8997
!NruI     TCGcga                  1    9106
!BstBI    TTcgaa                  1    9197
!EcoRI    Gaattc                  1    9200
!XcmI     CCANNNNNnnnntgg         1    9215
!BstAPI   GCANNNNntgc             1    9337
!SacII    CCGCgg                  1    9365
!End IIIstump anchor                   9455
!AvrII    Cctagg                  1    9462
!trp terminator                        9470
!SwaI     ATTTaaat                1    9784
!Start gene II                         9850
!BglII    Agatct                  1    9936
!
!----------------------------------------------------------------------

!
   1 aat gct act att agt aga att gat gcc acc ttt tca gct cgc gcc
! gene ii continued
  49 cca aat gaa aat ata gct aaa cag gtt att gac cat ttg cga aat gta
  97 tct aat ggt caa act aaa tct act cgt tcg cag aat tgg gaa tca act
 145 gtt aTa tgg aat gaa act tcc aga cac cgt act tta gtt gca tat tta
 193 aaa cat gtt gag cta cag caT TaT att cag caa tta tct aag cca
 241 tcc gca aaa atg acc tct tat caa aag gag caa tta aag gta ctc tct
 289 aat cct gac ctg ttg gag ttt gct tcc ggt ctg gtt cgc ttt gaa gct
 337 cga att aaa acg cga tat ttg aag tct ttc ggg ctt cct ctt aat ctt
 385 ttt gat gca atc cgc ttt gct tct gac tat aat agt cag ggt aaa gac
 433 ctg att ttt gat tta tgg tca ttc tcg ttt tct gaa ctg ttt aaa gca
 481 ttt gag ggg gat tca ATG aat att tat gac gat tcc gca gta ttg gac
! Start gene x, ii continues
 529 gct atc cag tct aaa cat ttt act att acc ccc tct ggc aaa act tct
 577 ttt gca aaa gcc tct cgc tat ttt ggt ttt tat cgt cgt ctg gta aac
 625 gag ggt tat gat agt gtt gct ctt act atg cct cgt aat tcc ttt tgg
 673 cgt tat gta tct gca tta gtt gaa tgt ggt att cct aaa tct caa ctg
 721 atg aat ctt tct acc tgt aat gtt ccg tta gtt cgt ttt att
 769 aac gta gat ttt tct tcc caa cgt cct gac tgg tat aat gag cca gtt
 817 ctt aaa atc gca TAA
! End X & II
 832 ggtaattca ca

!
! M1 E5 Q10 T15
 843 ATG att aaa gtt gaa att aaa cca tct caa gcc caa ttt act cgt
! Start gene V
!
! S17 S20 P25 E30
 891 tct ggt gtt tct cgt cag ggc aag cct tat tca ctg aat gag cag ctt
!
! V35 E40 V45
 939 tgt tac gtt gat ttg ggt aat gaa tat ccg gtt ctt gtc aag att act
!
! D50 A55 L60
 987 ctt gat gaa ggt cag cca gcc tat gcg cct ggt cTG TAC Acc gtt cat
!                                                BsrGI...
!
! L65 V70 S75 R80
1035 ctg tcc tct ttc aaa gtt ggt cag ttc ggt tcc ctt atg att gac cgt
!
! P85 K87 end of V
1083 ctg cgc ctc gtt ccg gct aag TAA C

1108 ATG gag cag gtc gcg gat ttc gac aca att tat cag gcg atg
! Start gene VII
1150 ata caa atc tcc gtt gta ctt tgt ttc gcg ctt ggt ata atc
! VII and IX overlap.
! ..... S2 V3 L4 V5 S10
1192 gct ggg ggt caa agA TGA gt gtt tta gtg tat tct ttT gcc tct ttc
! End VII |start IX
! L13 W15 G20 T25 E29
1242 tta ggt tgg tgc ctt cgt agt ggc att acg tat ttt acc cgt tta atg gaa
1293 act tcc tc
! .. stop of IX, IX and VIII overlap by four bases
1301 ATG aaa aag tct tta gtc ctc aaa gcc tct gta gcc gtt gct acc ctc
! Start signal sequence of viii.
1349 gtt ccg atg ctg tct ttc gct gct gag ggt gac gat ccc gca aaa gcg
! mature VIII --->
1397 gcc ttt aac tcc ctg caa gcc tca gcg acc gaa tat atc ggt tat gcg
1445 tgg gcg atg gtt gtt gtc att
1466 gtc ggc gca act atc ggt atc aag ctg ttt aag
! bases 1499-1539 are probable promoter for iii
1499 aaa ttc acc tcg aaa gca ! 1515
! ........... -35 ..
!
1517 agc tga taaaccgat acaattaaag gctccttttg
! ..... -10
1552 gagccttttt ttt GGAGAt ttt
! S.D. uppercase, there may be 9 Ts

!          <------ III signal sequence ------>
!          M   K   K   L   L   F   A   I   P   L   V   P   F
1574 caac GTG aaa aaa tta tta ttc gca att cct tta gtt cct ttc ! 1620
!
!     Y   S   G   A   A   E   S   H   L   D   G   A
1620 tat tct ggc gCG GCC Gaa tca caT CTA GAc ggc gcc
!                 EagI....        XbaI....
!
!     A   E   T   V   E   S   C   L   A
1656 gct gaa act gtt gaa agt tgt tta gca
!
!     K   S   H   T   E   N   S   F   T   N   V   W   K   D   K   T
1683 aaA Tcc cat aca gaa aat tca ttt aCT AAC GTC TGG AAA GAC AAA ACt
!
!     L   D   R   Y   A   N   Y   E   G   C   L   W   N   A   T   G   V
1734 tta gat cgt tac gct aac tat gag ggc tgt ctg tgG AAT GCt aca ggc gtt
!                                                  BsmI....

! VVCTGDETQCYGTWVPI
1785 gta gtt tgt act ggt GAC GAA ACT CAG TGT TAC GGT ACA TGG GTT cct att
!
! G L A I P E N
1836 ggg ctt gct atc cct gaa aat
!
! L1 linker ------------------------------------
! E G G G S E G G G S
1857 gag ggt ggt ggc tct gag ggt ggc ggt tct
!
! E G G G S E G G G T
1887 gag ggt ggc ggt tct gag ggt ggc ggt act
!
! Domain 2 ------------------------------------
1917 aaa cct cct gag tac ggt gat aca cct att ccg ggc tat act tat atc aac
1968 cct ctc gac ggc act tat ccg cct ggt act gag caa aac ccc gct aat cct
2019 aat cct tct ctt GAG GAG tct cag cct ctt aat act ttc atg ttt cag aat
!                    BseRI..
2070 aat agg ttc cga aat agg cag ggg gca tta act gtt tat acg ggc act
2118 gtt act caa ggc act gac ccc gtt aaa act tat tac cag tac act cct
2166 gta tca tca aaa gcc atg tat gac gct tac tgg aac ggt aaa ttC AGA
!                                                               AlwNI
2214 GAC TGc gct ttc cat tct ggc ttt aat gaG gat TTa ttT gtt tgt gaa
!    AlwNI
2262 tat caa ggc caa tcg tct gac ctg cct caa cct cct gtc aat gct
!
2307 ggc ggc ggc tct ! start L2 ------------------------------------
2319 ggt ggt ggt tct
2331 ggt ggc ggc tct
2343 gag ggt ggt ggc tct gag gga ggc ggt tcc
2373 ggt ggt ggc tct ggt
! end L2
! Many published sequences of M13-derived phage have a longer linker
! than shown here by repeats of the EGGGS motif two more times.

! Domain 3 ------------------------------------
!     S   G   D   F   D   Y   E   K   M   A   N   A   N   K   G   A
2388 tcc ggt gat ttt gat tat gaa aag atg gca aac gct aat aag ggg gct
!     M   T   E   N   A   D   E   N   A   L   Q   S   D   A   K   G
2436 atg acc gaa aat gcc gat gaa aac gcg cta cag tct gac gct aaa ggc
!     K   L   D   S   V   A   T   D   Y   G   A   A   I   D   G   F
2484 aaa ctt gat tct gtc gct act gat tac ggt gct gct atc gat ggt ttc
!     I   G   D   V   S   G   L   A   N   G   N   G   A   T   G   D
2532 att ggt gac gtt tcc ggc ctt gct aat ggt aat ggt gct act ggt gat
!     F   A   G   S   N   S   Q   M   A   Q   V   G   D   G   D   N
2580 ttt gct ggc tct aat tcc caa atg gct caa gtc ggt gac ggt gat aat
!     S   P   L   M   N   N   F   R   Q   Y   L   P   S   L   P   Q
2628 tca cct tta atg aat aat ttc cgt caa tat tta cct tcc ctc cct caa
!     S   V   E   C   R   P   F   V   F   G   A   G   K   P   Y   E
2676 tcg gtt gaa tgt cgc cct ttt gtc ttt Ggc gct ggt aaa cca tat gaa
!     F   S   I   D   C   D   K   I   N   L   F   R
2724 ttt tct att gat tgt gac aaa ata aac tta ttc cgt
! End Domain 3

!     G   V   F   A   F   L   L   Y   V   A   T   F   M   Y   V   F140
2760 ggt gtc ttt gcg ttt ctt tta tat gtt gcc acc ttt atg tat gta ttt
!    start transmembrane segment
!
!     S   T   F   A   N   I   L
2808 tct acg ttt gct aac ata ctg
!
!     R   N   K   E   S
2829 cgt aat aag gag tct TAA
!    Intracellular anchor
! stop of iii

!        M1  P2  V   L   L5  G   I   P   L   L10 L   R   F   L   G15
2847 tc ATG cca gtt ctt ttg ggt att ccg tta tta ttg cgt ttc ctc ggt
!       Start VI
2894 ttc ctt ctg gta act ttg ttc ggc tat ctg ctt act ttt ctt aaa aag
2942 ggc ttc ggt aag ata gct att gct att tca ttg ttt ctt gct ctt att
2990 att ggg ctt aac tca att ctt gtg ggt tat ctc tct gat att agc gct
3038 caa tta ccc tct gac ttt gtt cag ggt gtt cag tta att ctc ccg tct
3086 aat gcg ctt ccc tgt ttt tat gtt att ctc tct gta aag gct att
3134 ttc att ttt gac gtt aaa caa aaa atc gtt tct tat ttg gat tgg gat
!
! M1 A2 V3 F5 L10 G13
3182 aaa TAA t ATG gct gtt tat ttt gta act ggc aaa tta ggc tct gga
! end VI Start gene I

!     K   T   L   V   S   V   G   K   I   Q   D   K   I   V   A
3228 aag acg ctc gtt agc gtt ggt aag att cag gat aaa att gta gct
!     G   C   K   I   A   T   N   L   D   L   R   L   Q   N   L
3273 ggg tgc aaa ata gca act aat ctt gat tta agg ctt caa aac ctc
!     P   Q   V   G   R   F   A   K   T   P   R   V   L   R   I
3318 ccg caa gtc ggg agg ttc gct aaa acg cct cgc gtt ctt aga ata
!     P   D   K   P   S   I   S   D   L   L   A   I   G   R   G
3363 ccg gat aag cct tct ata tct gat ttg ctt gct att ggg cgc ggt
!     N   D   S   Y   D   E   N   K   N   G   L   L   V   L   D
3408 aat gat tcc tac gat gaa aat aaa aac ggc ttg ctt gtt ctc gat
!     E   C   G   T   W   F   N   T   R   S   W   N   D   K   E
3453 gag tgc ggt act tgg ttt aat acc cgt tct tgg aat gat aag gaa
!     R   Q   P   I   D   W   F   L   H   A   R   K   L   G
3498 aga cag ccg att gat tgg ttt cta cat gct cgt aaa tta gga
!     W   D   I   F   L   V   Q   D   L   S   I   V   D   K
3543 tgg gat att ttt ctt gtt cag gac tta tct att gtt gat aaa
!     Q   A   R   S   A   L   A   E   H   V   Y   C   R
3588 cag gcg cgt tct gca tta gct gaa cat gtt tat tgt cgt
!     L   D   R   I   T   L   P   F   V   G   T   L   Y   S   L
3633 ctg gac aga att act tta cct ttt gtc ggt act tta tat tct ctt
!     I   T   G   S   K   M   P   L   P   K   L   H   V   G   V
3678 att act ggc tcg aaa atg cct ctg cct aaa tta cat gtt ggc gtt
!     V   K   Y   G   D   S   Q   L   S   P   T   V   E   R   W
3723 gtt aaa tat ggc gat tct caa tta agc cct act gtt gag cgt tgg
!     L   Y   T   G   K   N   L   Y   N   A   Y   D   T   K   Q
3768 ctt tat act ggt aag aat ttg tat aac gca tat gat act aaa cag
!     A   F   S   S   N   Y   D   S   G   V   Y   S   Y   L   T
3813 gct ttt tct agt aat tat gat tcc ggt gtt tat tct tat tta acg
!     P   Y   L   S   H   G   R   Y   F   K   P   L   N   L   G
3858 cct tat tta tca cac ggt cgg tat ttc aaa cca tta aat tta ggt
!     Q   K   M   K   L   T   K   I   Y   L   K   K   F   S   R
3903 cag aag atg aaa tta act aaa ata tat ttg aaa aag ttt tct cgc
!     V   L   C   L   A   I   G   F   A   S   A   F   T   Y   S
3948 gtt ctt tgt ctt gcg att gga ttt gca tca gca ttt aca tat agt
!     Y   I   T   Q   P   K   P   E   V   K   K   V   V   S   Q
3993 tat ata acc caa cct aag ccg gag gtt aaa aag gta gtc tct cag
!     T   Y   D   F   D   K   F   T   I   D   S   Q   R   L
4038 acc tat gat ttt gat aaa ttc act att gac tct cag cgt ctt
!     N   L   S   Y   R   Y   V   F   K   D   S   K   G   K   L
4083 aat cta agc tat cgc tat gtt ttc aag gat tct aag gga aaa TTA
!                                                             PacI
!
!     I   N   S   D   D   L   Q   K   Q   G   Y   S   L   T   Y
4128 ATT AAt agc gac gat tta cag aag caa ggt tat tca ctc aca tat
!    PacI
!
! i   I   D   L   C   T   V   S   I   K   K   G   N   S   N   E
! iv                                                      M1  K
4173 att gat tta tgt act gtt tcc att aaa aaa ggt aat tca aAT Gaa
!                                                         Start IV
!
! i   I V K C N . End of I
! iv  L3 L N5 V I7 N F V10
4218 att gtt aaa tgt aat TAA T TTT GTT
! IV continued.....

4243 ttc ttg atg ttt gtt tca tct ttt gct cag gta att gaa atg
4291 aat tcg cct ctg cgc gat ttt gta act tgg tat tca aag caa tca
4339 ggc gaa tcc gtt att gtt tct ccc gat gta aaa ggt act gtt act gta
4387 tat tca tct gac gtt aaa cct gaa aat cta cgc aat ttc ttt att tct
4435 gtt tta cgt gcA aat ttt gat atg gtA ggt tcT aAC cct tcc atT
4483 att cag aag tat aat cca aac aat cag gat tat att gat gaa ttg cca
4531 tca tct gat aat cag gaa tat gat aat tcc gct cct tct ggt
4579 ttc ttt gtt ccg caa aat gat aat gtt act caa act ttt aaa att aat
4627 aac gtt cgg gca aag gat tta ata cga gtt gtc gaa ttg ttt gta aag
4675 tct aat act tct aaa tcc tca aat gta tta tct att gac ggc tct aat
4723 cta tta gtt agt gcT cct aaa gat att tta gat aac ctt cct caa
4771 ttc ctt tcA act gtt gat ttg cca act gac cag ata ttg att gag ggt
4819 ttg ata ttt gag gtt cag caa ggt gat gct tta gat ttt tca ttt gct
4867 gct ggc tct cag cgt ggc act gtt gca ggc ggt gtt aat act gac cgc
4915 ctc acc tct gtt tta tct gct ggt tcg ttc ggt att ttt aat
4963 ggc gat gtt tta ggg cta tca gtt cgc gca tta aag act aat agc cat
5011 tca aaa ata ttg tct gtg cca cgt att ctt acg ctt tca ggt cag aag
5059 ggt tct atc tct gtT GGC CAg aat gtc cct ttt att act ggt cgt gtg
!                      MscI....


5107 act ggt gaa tct gcc aat gta aat cca ttt cag acg att gag cgt
5155 caa aat gta ggt att tcc atg agc gtt ttt cct gtt gca atg gct ggc
5203 ggt aat att gtt ctg gat att acc agc aag gcc gat agt ttg agt tct
5251 tct act cag gca agt gat gtt att act aat caa aga agt att gct aca
5299 acg gtt aat ttg cgt gat gga cag act ctt tta ctc ggt ggc ctc act
5347 gat tat aaa aac act tct caG gat tct ggc gta ccg ttc ctg tct aaa
5395 atc cct tta atc ggc ctc ctg ttt agc tcc cgc tct gat tcT aac gag
5443 gaa agc acg tta tac gtg ctc gtc aaa gca acc ata gta cgc gcc ctg

5491 TAG cggcgcatt End IV

5503 aagcgcggcg ggtgtggtgg ttacgcgcag cgtgaccgct acacttgcca gcgccctagc

5563 gcccgctcct ttcgctttct tcccttcctt tctcgccacg ttcGCCGGCt ttccccgtca

NgoMI.

5623 agctctaaat cgggggctcc ctttagggtt ccgatttagt gctttacggc acctcgaccc

5683 caaaaaactt gatttgggtg atggttCACG TAGTGggcca tcgccctgat agacggtttt

DraIII. . . .

5743 tcgccctttG ACGTTGGAGT Ccacgttctt taatagtgga ctcttgttcc aaactggaac DrdI..........

5803 aacactcaac cctatctcgg gctattcttt tgatttataa gggattttgc cgatttcgga

5863 accaccatca aacaggattt tcgcctgctg gggcaaacca gcgtggaccg cttgctgcaa

5923 ctctctcagg gccaggcggt gaagggcaat CAGCTGttgc cCGTCTCact ggtgaaaaga

PvuII. BsmBI.

5983 aaaaccaccc tGGATCC AAGCTT

BamHI HindIII (1/2)
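The enzyme labels printed under the numbered rows can be cross-checked by searching each row for the recognition sequence of the named enzyme. The following is a minimal sketch, assuming the commonly cited recognition sequences for PvuII (CAGCTG), BsmBI (CGTCTC), BamHI (GGATCC) and HindIII (AAGCTT); the helper `find_sites` is hypothetical, and it scans only the strand as written, so a non-palindromic site on the opposite strand would not be reported.

```python
# Recognition sequences (5'->3') for the enzymes labelled under rows 5923 and 5983.
SITES = {
    "PvuII": "CAGCTG",
    "BsmBI": "CGTCTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(numbered_row, sites=SITES):
    """Return (enzyme, coordinate) pairs for one numbered row of the listing.

    The first token of the row is the coordinate of its first base; the
    remaining tokens are sequence blocks. Only the written strand is scanned.
    """
    start, *blocks = numbered_row.split()
    seq = "".join(blocks).upper()
    hits = []
    for name, site in sites.items():
        pos = seq.find(site)
        while pos != -1:
            hits.append((name, int(start) + pos))
            pos = seq.find(site, pos + 1)
    return sorted(hits, key=lambda hit: hit[1])


row_5923 = "5923 ctctctcagg gccaggcggt gaagggcaat CAGCTGttgc cCGTCTCact ggtgaaaaga"
row_5983 = "5983 aaaaccaccc tGGATCC AAGCTT"

print(find_sites(row_5923))
print(find_sites(row_5983))
```

Each reported coordinate is that of the first base of the site, counted from the row number in the same way as the listing.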

Insert carrying bla gene

6006 gcaggtg gcacttttcg gggaaatgtg cgcggaaccc

6043 ctatttgttt atttttctaa atacattcaa atatGTATCC gctcatgaga caataaccct BciVI

6103 gataaatgct tcaataatat tgaaaaAGGA AGAgt

RBS.? . ..


Start bla gene

6138 ATG agt att caa cat ttc cgt gtc gcc ctt att CCC ttt gcg gca ttt
6189 tgc ctt cct gtt ttt gct cac cca gaa acg ctg gtg aaa gta aaa gat gct
6240 gaa gat cag ttg ggC gcA CTA GTg ggt tac atc gaa ctg gat ctc aac agc
!                        SpeI.... ApaLI & BssSI Removed
6291 ggt aag atc ctt gag agt ttt cgc ccc gaa gaa cgt ttt cca atg agc
6342 act ttt aaa gtt ctg cta tgt GGC GcG Gta tta tcc cgt att gac gcc ggg
6393 caa gaG CAA CTC GGT CGc cgC ATA cAC tat tct cag aat gac ttg gtt gAG
!          BcgI..                                                  ScaI
6444 TAC Tca cca gtc aca gaa aag cat ctt acg gat ggc atg aca gta aga gaa
!    ScaI.
6495 tta tgc agt gct gcc ata acc atg agt gat aac act gcg gcc aac tta ctt
6546 ctg aca aCG ATC Gga gga ccg aag gag cta acc gct ttt ttg cac aac atg
!             PvuI...
6597 ggg gat cat gta act cgc ctt gat cgt tgg gaa ccg gag ctg aat gaa gcc
6648 ata cca aac gac gag cgt gac acc acg atg cct gta gca atg Gca aca acg
6699 tTG CGC Aaa cta tta act ggc gaa cta ctt act cta gct tcc cgg caa caa
!     FspI. . . .
6750 tta ata gac tgg atg gag gcg gat aaa gtt gca gga cca ctt ctg cgc tcg
6801 GCC ctt ccG GCt ggc tgg ttt att gct gat aaa tct gga gcc ggt gag cgt
!    BglI
6852 gGG TCT Cgc ggt atc att gca ctg ggg cca gat ggt aag ccc tcc cgt
!     BsaI. . . .
6903 atc gta gtt atc tac acG ACg ggg aGT Cag gca act atg gat gaa cga aat
!                          AhdI..
6954 aga cag atc gct gag ata ggt gcc tca ctg att aag cat tgg TAA ctgt
!                                                            stop

7003 cagaccaagt ttactcatat atactttaga ttgatttaaa acttcatttt taatttaaaa

7063 ggatctaggt gaagatcctt tttgataatc tcatgaccaa aatcccttaa cgtgagtttt


7123 cgttccactg tacgtaagac cccc
7147 AAGCTT GTCGAC tgaa tggcgaatgg cgctttgcct
!    HindIII SalI..
!    (2/2)   HincII
7183 ggtttccggc accagaagcg gtgccggaaa gctggctgga gtgcgatctt

Start of Fab-display cassette, the Fab DSR-A05, selected for binding to a protein antigen.

7233 CCTGAcG CTCGAG
!    xBsu36I XhoI..

PlacZ promoter is in the following block

7246 cgcaacgc aattaatgtg agttagctca
7274 ctcattaggc accccaggct ttacacttta tgcttccggc tcgtatgttg
7324 tgtggaattg tgagcggata acaatttcac acaggaaaca gctatgacca
7374 tgattacgCC AagcttTGGa gccttttttt tggagatttt caac
!            PflMI.......
!               Hind3. (there are 3)

Gene iii signal sequence
!     1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
!     M   K   K   L   L   F   A   I   P   L   V   V   P   F   Y
7418 gtg aaa aaa tta tta ttc gca att cct tta gtt gtt cct ttc tat

!     16  17  18  Start light chain (L20:JK1)
!     S   H   S   A   Q   D   I   Q   M   T   Q   S   P   A
7463 tct cac aGT GCA Caa gac atc cag atg acc cag tct cca gcc
!            ApaLI...
! Sequence supplied by extender............
!     T   L   S   L
7505 acc ctg tct ttg

!     S   P   G   E   R   A   T   L   S   C   R   A   S   Q   G
7517 tct cca ggg gaa aga gcc acc ctc tcc tgc agg gcc agt cag Ggt

V

s

Y

L

A

w

Y

Q

K

P

G

Q

A

7562

gtt

age

tac

tta

gcc

tgg

tac

cag

aaa

cct

ggc

cag

get

P

R

L

I

Y

D

A

s

S

R

A

T

G

I

7607

ccc

agg

etc

ate

tat

gAt

gca

tcc

aAc

agg

gcc

act

ggc

ate

P

A

R

F

S

G

s

G

P

G

T

D

F

T

L

7652

cca

gCc

agg

ttc

agt

ggc

agt

ggg

Cct

ggg

aca

gac

ttc

act

etc

T

I

s

L

E

P

E

D

F

A

V

Y

C

7697

acc

ate

age

agC

ctA

gag

cct

gaa

gat

ttt

gca

gtT

tat

tac

tgt

Q

R

S

W

H

P

w

T

F

G

Q

G

T

R

7742

cag

CGt

aAc

tgg

cat

ccg

tgg

ACG

TTC

GGC

CAA

GGG

ACC

PAG

V

E

I

K

R

T

V

A

P

S

V

F

X

F

7787

gtg

gaa

ate

aaa

ega

act

gtg

gCT

GCA

Cca

tct

gtc

ttc

ate

ttc

Bsgl... .

P

s

D

E

Q

L

K

S

G

T

A

s

V

7832

ccg

cca

tct

gat

gag

cag

ttg

aaa

tct

gga

act

gee

tct

gtt

gtg

C

L

N

F

Y

P

R

E

A

K

V

Q

w

7877

tgc

ctg

aat

aac

ttc

tat

ccc

aga

gag

gcc

aaa

gta

cag

tgg


K

V

D

N

A

L

Q

S

G

N

S

Q

E

S

V

7922

aag

gtg

gat

aac

gcc

etc

caa teg

ggt

aac

tee

cag

gag

agt

gtc

T

E

R

D

S

K

D

S

T

Y

S

L

s

S

T

7967

aca

gag

egg

gac

age

aag

gac age

acc

tac

age

etc

age

acc

L

T

L

S

K

A

D

Y

E

K

H

K

V

Y

A

8012

ctg

acG

CTG

AGC

aaa

gca

gac tac

gag

aaa

cac

aaa

gtc

tac

gcc

! EspI.....

C

E

V

T H

Q

G

L

s

S P

V

T K

S

8057

tgc

gaa

gtc

acc cat

cag

ggc

ctG

AGC

TCg ccc

gtc

aca aag

age

ι Sacl....

!

! F N R G E C . .

8102 ttc aac agg gga gag tgt taa taa

I

8126 GGCGCG CCaattctat ttcaaGGAGA cagtcata ! Ascl..... RBS2.

PelB signal sequence------(22 codons)----->

! 1

1 M

2 K

3 Y

4 L

5 L

6 P

7 T

8 A

9 A

10 A

11 G

12 L

13 L

14 L

15 L

25

8160

atg

aaa

tac

eta

ttg

cct

aeg

gca

gee

get

gga

ttg

tta

etc

0-1—, >-+- vu

! PP1 -

!

. .. PelB

11 Γ KJ.

I

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

1

A

Q

P

A

M

A

E

V

Q

L

E

S

G

30

8205

geG

GCC

cag

ccG

GCC

atg

gee

gaa

gtt

ZMs

TTG

tta

gag

tet

ggt

1

Sfil..

Mfel

. . .

Ncol....

ι

J

31 32

33

34

35

36

37

38

39

40

41

42

43

44

45

35

1

G G

L

V

Q

P

G

S

L

R

L

S

C

A

8250

ggc ggt

Ctt

gtt

cag

cct

ggt

tet

tta

cgt

Ctt

tet

tgc

get

Γ*ΠΡ T -

FR2-

---->

euKJ.

J

46 47

48

49

50

51

52

53

54

55

56

57

58

59

60

40

1

A s

G

F

T

F

S

T

Y

E

M

R

W

V

R

8295

get TCC

GGA

ttc

act

ttc

tet

act

tac

gag

atg

cgt

tgg

gtt

cgC

J

BspEI..

BstXI

CDR2 — -

---->

45

1

61 62

63

64

65

66

67

68

69

70

71

72

73

74

75

1

Q A

P

G

K

G

L

E

W

V

S

Y

I

A

P

8340

CAa get

ccT

GGt

aaa

ggt

ttg

gag

tgg

gtt

tet

tat

ate

get

cct

50

! BstXI. f

FR3-

---->

1

76 77

78

79

80

81

82

83

84

85

86

87

88

89

90

1

S G

G

D

T

A

Y

A

D

s

V

K

G

R

F

8385

tet ggt

ggc

gat

act

get

tat

get

gac

tee

gtt

aaa

ggt

ege

ttc

55

1

91 92

93

94

95

96

97

98

99

100

101

102

103

104

105

1

T I

S

R

D

N

S

K

N

T

L

Y

L

Q

M

8430

act ate

TCT

AGA

aac

tet

aag

aat

act

etc

tac

ttCL

caq

atq

! Xbal...

! Supplied by extender !

FR3


106 107 108 109 110 111 112 113 114 115 116 117 118 119 120


8475

8520

8565

N SLRAEDT aac agC TTA AGg get gag gac act gca gtc tac tat tgt gcg agg

Aflll...

from extender--------------------------------->

CDR3----------121 122 123 124 L D G

125 126 127 128 129 130 131 132 133 134 RLDGYISYYYGMDV agg etc gat ggc tat att tcc tac tac tac ggt atg GAC GTC tgg Aatll..

140 141 142 143 144 145 QGTTVTV.S S caa ggg acc acG GTC ACC gtc tea age

BstEII...

FR4—> 135

W

136 137 138 139 G ggc

CHI of IgGl->

A

S

T

K

G

P

S

V

F

P

L

A

P

S

>0

8595

gcc

tcc

acc

aag

ggc

cca

teg

gtc

ttc

ccc

ctg

gca

CCC

tcc

K

S

T

s

G

T

A

L

G

c

L

V

K

8640

aag

age

acc

tet

ggg

ggc

aca

gcg

gee

ctg

ggc

tgc

ctg

gtc

aag

>5

D

Y

F

P

E

P

V

T

V

S

w

N

S

G

A

8685

gac

tac

ttc

ccc

gaa

ccg

gtg

aeg

gtg

teg

tgg

aac

tea

ggc

gee

L

T

s

G

V

H

T

F

P

A

V

L

Q

S

s

8730

ctg

acc

age

ggc

gtc

cac

acc

ttc

ccg

get

gtc

eta

cag

tee

TCA

30

Bsu36I

G

L

Y

s

L

s

V

T

V

P

s

S

8775

GGa

etc

tac

tee

etc

age

gta

gtg

acc

gtg

ccc

tcc

age

! Bsu36I.

...

35

L

G

T

Q

T

Y

I

C

N

V

N

H

K

P

s

8820

ttg

ggc

acc

cag

acc

tac

ate

tgc

aac

gtg

aat

cac

aag

ccc

age

N

T

K

V

D

K

V

E

P

K

s

c

A

30

8865

aac

acc

aag

gtg

gac

aag

aaa

gtt

gag

CCC

aaa

tet

tgt

GCG

GCC

Not I. . .

A

H

G

A

E

Q

K

L

I

8910

GCa

cat

cac

cat

cac

ggg

gee

gca

gaa

caa

aaa

etc

ate

35

! ..Notl.

H6

tag.

Myc-

-Tag

S

E

D

L

N

G

A

g

A

s

A

8955

tea

gaa

gag

gat

ctg

aat

ggg

gee

gca

tag

GCT

AGC

tet

get

Myc-

-Tag

. . .

Nhel...

Amber

III'stump

Domain 3 of III

S G D tcc g g Kasl.

t !W.T • · (2/4)

8997 agt ggc gac ttc gac tac gag aaa atg get aat gcc aac aaa GGC GCC

A K


9045 atG ACT GAG AAC GCT GAC GAG aat get ttg caa age gat gcc aag ggt


! c

a

t

C

t

a

c

g

c a

g

tet

c

t

a

c

! W.T

,

K

L

D

s

V

A

T

D

Y

G

A

I

D

G

F

5

9093

aag

tta

gac

age

gTC

GCG

Acc

gac

tat

GGC

GCC

gee

ATC

GAc

ggc

ttt

1

a

c t

t

tet

t

c

t

c

! W.T

1

Nrul..

-

KasI

...

(3/4)

1

I

G

D

V

S

G

L

A

N

G

N

G

A

T

G

D

10

9141

ate

ggc

gat

gtc

agt

ggt

tTG

GCC

Aac

ggc

aac

gga

gee

acc

gga

gac

1

t

c

t

tcc

c

c t

t

!W.T

1

Mscl..

. ;3/3)

1

F

A

G

S

N

s

Q

M

A

Q

V

G

D

G

D

N

15

9189

ttc

GCA

GGT

teG

AAT

TCt

cag

atg

geC

CAG

GTT

GGA

GAT

GGg

gac

aac

j

t

c

t

c

a

t

a

c

t

c

t

! W.T

I

BspMI..

(2/2)

XemI..

EcoRI...

20

9237

S agt tea

P ccg t

L ctt t a

M atg

N aac t

F ttt c

R aga c t

Q cag a

Y tac t

L ctt t a

P ccg t

s tet c

L ctt c

P ccg t

Q cag a !W.T

S

V

E

C

R

P

F

V

F

S

A

G

K

P

Y

E

25

9285

agt

gtc

gag

tgc

cgt

cca

ttc

gtt

ttc

tet

gee

ggc

aag

cct

tac

gag

teg

t

a

t

c

t

c

t

age

t

a

t

a 'W.T

F

S

I

D

C

D

K

I

N

L

F

R

9333

ttc

aGC

Ate

gac

TGC

gat

aag

ate

aat

Ctt

ttc

CGC

30

t

tet

t

c

a

c

t a

c

t

! W

.T.

BstAPI

SacII

. . .

End

Domain

3

G

V

F

A

F

L

Y

V

A

T

F

M

Y

V

F

35

9369

GGc

gtt

ttc

get

ttc

ttg

eta

tac

gtc

get

act

ttc

atg

tac

gtt

ttc

t

c

t

g

t

c t

t a

t

c

t

a

t !W.T

start transmembrane

segment

S

T

F

A

N

I

L

R

N

K

E

S

40

9417

aGC

ACT

TTC

GCC

AAT

ATT

TTA

Cgc aac aaa gaa age

tet

g

t

c

a

c g

t

g

g tet !

W.T.

Intracellular

anchor.

45

9453

tag

tga

tet

CCT

AGG

Avril. .

1 ! 50 ! I \|	9468 aag ccc gcc taa tga gcg ggc ttt ttt ttt ct ggt I Trp terminator 1
End	Fab cassette
I	9503	ATGCAT CCTGAGG ccgat actgtcgtcg tcccctcaaa ctggcagatg Nsil. . Bsu36I. (3/3)

9551 cacggttacg atgcgcccat ctacaccaac gtgacctatc ccattacggt caatccgccg

9611 tttgttccca cggagaatcc gacgggttgt tactcgctca catttaatgt tgatgaaagc

9671 tggctacagg aaggccagac gegaattatt tttgatggcg ttcctattgg ttaaaaaatg

9731 agctgattta acaaaaattt aaTgegaatt ttaacaaaat attaacgttt acaATTTAAA ! Swal...

9791 Tatttgetta tacaatcttc ctgtttttgg ggcttttctg attatcaacc GGGGTAcat

9850 ATG att gac atg eta gtt tta ega tta ccg ttc ate gat tet ctt gtt tgc


Start gene II

9901

tcc

aga

etc

tea

ggc

aat

gac

ctg

ata

gee

ttt

gtA GAT Bglll.

CTc

tea

aaa

ata

9952

get

acc

etc

tec

ggc

atT

aat

tta

tea

get

aga

aeg gtt

gaa

tat

cat

att

10003

gat

ggt

gat

ttg

act

gtc

tec

ggc

ett

tct

cac

cct ttt

gaa

tct

tta

cct

10054

aca

cat

tac

tea

ggc

att

gca

ttt

aaa

ata

tat

gag gg<-

tct

aaa

aat

ttt

10105

tat

cct

tgc

gtt

gaa

ata

aag

get

tct

ccc

gca

aaa gta

tta

cag

ggt

cat

10156

aat

gtt

ttt

ggt

aca

acc

gat

tta

get

tta

tgc

tct gag

get

tta

ttg

ett

10207

aat

ttt

get

aat

tct

ttg

cct

tgc

ctg

tat

gat

tta ttg

gat

gtt

1

gene

II ,

continues

------------------------ End of Table


Table 37: DNA seq of w.t. M13 gene iii

5	1 I 1579 1	1 2 fM K gtg aaa Signal	3 K aaa sequ	4 L tta ence	5 L tta	6 F ttc	7 A gca	8 I att	9 P cct	10 L tta

	1	16 17	18	19	20	21	22	23	24	25
	1	S H	S	A	E	T	V	E	S	C
10	1624	tet cac	tee	get	gaa	act	gtt	gaa	agt	tgt
	! Signal sequencer »	Domain 1
	r	31 32	33	34	35	36	37	38	39	40
		T E	N	S	F	T	N	V	W	K
15	1669	aca gaa	aat	tea	ttt	act	aac	gtc	tgg	aaa
	1	Domain 1
	1	46 47	48	49	50	51	52	53	54	55
	1	D R	Y	A	N	Y	E	G	c	L
20	1714	gat cgt	tac	get	aac	tat	gag	ggt	tgt	ctg

V

L

D

W

Domain 112

V

P

F

Y

A

K

P

H

D

K

T

L

N

A

T

G

BsmI.

62 63 64 65 66 67 68 69 70 71 72 73 74 75

VVVCTGDETQCYGTW

1759 gtt gta gtt tgt act ggt gac gaa act cag tgt tac ggt aca tgg Domain 1--------------------------------------------------76 77 78 79 80 81 82 83 84 85 86 87 88 89 90

VPIGLAIPENEGGGS

1804 gtt cct att ggg ctt get ate cct gaa aat gag ggt ggt ggc tet Domain 1------------------------------> Linker 1----------91 92 93 94 95 96 97 98 99 100

EGGGSEGGGS 1849 gag ggt ggc ggt tet gag ggt ggc ggt tet gag ggt ggc ggt act Linker 1-------------------------------------------------->

101 102 103 104 105 E G G G T

106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 KPPEYGDT PI PGYTY

1894 aaa cct cct gag tac ggt gat aca cct att ccg ggc tat act tat Domain 2--------------------------------------------------121 122 123 124 125 126 127 128 129 130 131 132 133 134 135

1939 ate aac cct etc gac ggc act taT CCG CCt ggt act gag caa aac Ecil....

Domain 2

N

Domain 2

138	139	140	141	142	143	144	145	146	147	148	149	150
N	P	N	P	S	L	E	E	S	Q	P	L	N
aat	cct	aat	cct	tet	Ctt	GAG	GAG	tet	cag	cct	ctt	aat
						BseRI..
153	154	155	156	157	158	159	160	161	162	163	164	165
M	F	Q	N	N	R	F	R	N	R	Q	G	A
atg	ttt	cag	aat	aat	agg	ttc	ega	aat	agg	cag	ggg	gca
168	169	170	171	172	173	174	175	176	177	178	179	180


L

T

V

Y

T

G

T

V

T

Q

G

T

D

P

V

2074

tta

act

gtt

tat

aeg

ggc

act

gtt

act

caa

ggc

act

gac

CCC

gtt

Domain

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

K

T

Y

Q

Y

T

P

V

S

s

K

A

M

Y

2119

aaa

act

tat

tac

cag

tac

act

cct

gta

tea

aaa

gee

atg

tat

Domain

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

D

A

Y

W

N

G

K

F

R

D

C

A

F

H

S

2164

gac

get

tac

tgg

aac

ggt

aaa

ttc

AGa

gaC

TGc

get

ttc

cat

tet

AlwNI.......

Domain 2

211

212

213

214

215

216

217

218

219

220

221

222

223

224

225

G

F

N

E

D

P

F

V

C

E

Y

Q

G

Q

S

2209 ggc

ttt

aat

gaG

GAT

CCa

ttc

gtt

tgt

gaa

tat

caa

ggc

caa

teg

! BamHI. . .

! Domain 2--------------------------------------------------ι ! 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 ! SDLPQPPVNAGGGSG

2254 tet gac ctg cct caa cct cct gtc aat get ggc ggc ggc tet ggt ! Domain 2------------------------------> Linker 2----------1 ! 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 ! GGSGGGSEGGGSEGG

2299 ggt ggt tet ggt ggc ggc tet gag ggt ggt ggc tet gag ggt ggc ! Linker 2--------------------------------------------------1 ! 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 ! GSEGGGSEGGGSGGG

2344 ggt tet gag ggt ggc ggc tet gag gga ggc ggt tcc ggt ggt ggc ! Linker 2--------------------------------------------------ι ! 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 ! SGSGDFDYEKMANAN

2389 tet ggt tec ggt gat ttt gat tat gaa aag atg gca aac get aat .’Linker 2> Domain 3------------------------------------------1 ! 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 ! KGAMTENADENALQS

2434 aag ggg get atg acc gaa aat gee gat gaa aac geg eta cag tet ! Domain 3--------------------------------------------------I ! 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 ! DAKGKLDSVATDYGA

2479 gac get aaa ggc aaa ett gat tet gtc get act gat tac ggt get ! Domain 3--------------------------------------------------ι ! 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 ! AIDGFIGDVSGLANG

2524 get ate gat ggt ttc att ggt gac gtt tcc ggc ett get aat ggt ! Domain 3--------------------------------------------------I ! 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 ! NGA TGDFAGSNSQMA

2569 aat ggt get act ggt gat ttt get ggc tet aat tcc caa atg get ! Domain 3--------------------------------------------------50


346 Q

347 V

348 G

349 D

350 G

351 D

352 N

353 S

354 P

355 L

356 M

357 N

358 N

359 F

360 R

2614

caa

gtc

ggt

gac

ggt

gat

aat

tea

cct

tta

atg

aat

ttc

cgt

Domain

361

362

363

364

365

366

367

368

369

370

371

372

373

374

375

Q

Y

L

P

S

L

P

Q

S

V

E

C

R

P

F

2659

caa

tat

tta

cct

tcc

etc

cct

caa

teg

gtt

gaa

tgt

ege

cct

ttt

Domain

376

377

378

379

380

381

382

383

384

385

386

387

388

389

390

V

F

S

A

G

K

P

Y

E

F

Γ

I

D

C

D

2704

gtc

ttt

age

get

ggt

aaa

cca

tat

gaa

ttt

tet

att

gat

tgt

gac

Domain

391

392

393

394

395

396

397

398

399

400

401

402

403

404

405

K

I

N

L

F

R

G

V

F

A

F

L

Y

V

2749

aaa

ata

aac

tta

ttc

cgt

ggt

gtc

ttt

gcg

ttt

ett

tta

tat

gtt

Domain

j

1 i anbiueiiLDi ans segnienL

406

407

408

409

410

411

412

413

414

415

416

417

418

419

420

A

T

F

M

Y

V

F

S

T

F

A

N

I

L

R

2794

gcc

acc

ttt

atg

tat

gta

ttt

tet

aeg

ttt

get

aac

ata

ctg

cgt

Transmembrane segment---

--->

ICA·

I ! 421 422 423 424 425 ! N K E S

2839 aat aag gag tet taa ! 2853 ! ICA-----------> ICA = intracellular anchor

I

End of Table


Table 38: Whole mature III anchor M13-III derived anchor with recoded DNA

5	1 t 1 1	1 A GCG Notl	2 A gcc	3 A gca
	1		4	5	6
0	1		H	H	H
		10	cat	cat	.cat
	1		18	19	20
	1		S	E	E
5		52	tea	gaa	gag

7	8	9	10	11	12
H	H	H	G	A	A
cac	cat	cac	ggg	gcc	gca
21	22	23	24	25	26
D	L	N	G	A	A
gat	ctg	aat	ggg	gcc	gca

13	14	15	16	17
E	Q	K	L	I
gaa	caa	aaa	etc	ate
27	28	29
	A	S
Tag	GCT Nhel	AGC

31

D I

33 34 35 36 N D D R M

38

A S

GAT ATC aac gat gat cqt atg get tet (ON_G37bot) [RC] 5'-c aac gat gat cqt atg gcG

EcoRV..

Enterokinase cleavage site.

T act

CAt Get gcc gag aca g-3'

Start mature III (recoded) 40 41 42 43

A Ε Τ V

118 IgcCIgaGIacA|gtC

Domain 1---->

t ί W.T.

130

E

S

C

A

K

P

I gaaITCC agt

L tgCICTGIGCCIAaGIccT t t a a a c

Mscl....

52 53 54

Η Τ Ε N caC|acT|gaGIaat t a a

56 57 58

S F Τ N

AGT|ttC|aCA|Aat| tea t t c !

W.T

! 59

60

61

62

63

64 65

66

67

68

69

70

71

72

73

V

W

K

D

K T

L

D

R

Y

A

N

Y

E

175

Igtg

TGGtaaG

gaT

aaG|acC

CtT

gAT

CGA

TaTIgcC

aaT

taC

gaA|

0

c

a

c

a t

t a

t

c

t

c

t

g

W.T

BspDI.

74

75

76

77

78

79 80

81

82

83

84

85

86

87

88

G

C

L

W

N

A T

G

V

C

T

G

D

5

220

1ggCltgC

ITtA

tgg

aat

IgcCIACC

GGC

GtC

gtT

Igtc

TGC

ACG|ggC

IgaTI

1

t

c g

t a

t

a

t

c

W.T

SgrAI

Bsgl.

t

89

90

91

92

93

94 95

96

97

98

99

100

101

102

103

0

1

E

T

Q

c

Y

G T

W

V

P

I

G

L

A

I

265

1 gaG

acA

I caA

tgC

taT

1ggC|ACG

TGg

gtGIccG

j atA

gGC

TTA

GCC

latAI

1

a

t

g

t

c

t a

t

g

c t

t

c

W.T

1 1

Pmll.

BlpI..

5

j

Domain

Linker

1

104

105

106

107

108

109 110

111

112

113

114

115

116

117

118

1

P

E

N

E

G

G G

s

E

G

S

E

G

310

1 ccG

gaG

| aaC

gaA|ggC

1ggC|ggT

AGC

gaA

ggClggT

ggC|AGC

gaA|ggC|

0

I 1

t

a

t

g

t

t c

tet

g

t

c

t

tet

g

t

W.T

1

Linker

1----

—>

Domain

2----

>


119

120

121 122

123

124

125

126

127

128

129

130

131

132

133

G

S E

G

T

K

P

E

Y

G

D

355

IggT

GGA

TCCIgaAl

ggA|ggT|ggA|acC

aaG

ccG

ccG|gaA|taT

ggClgaCI

c

t

t g

t

C

t

a

t

g

c

t

t !

W.T

5

BamHI..(2/2)

134

135

136 137

138

139

140

141

142

143

144

145

146

147

148

T

P

I P

G

Y

T

Y

I

N

P

L

D

G

T

400

1 acT

ccG|atA|CCT|GGT

taC|acC

taC

atT|aaT

ccG

TtA

gaT

ggA

acC |

L0

a

t

t g

c

t

c

t

c c

c

t !

W.T

SexAI....

I 1	149 150 151 152 153 154 155 156	157 158	159	160	161	162	163
1	YPPGTEQN	P	A	N	P	N	P	S
15	445 \|taCIccT\|ccG1ggC1acC\|gaA\|caG\|aaT	ccT\|gcC	aaC	ccG	aaC	ccA	AGCI
t	TGtttgac	c	t	t	t	t	t	tct	W.T

Hindlll...

1 I

164

165

166 167 168

169

170

171

172

173

174

175

176

177 178

20 !

L

E

esq

P

L

N

T

F

M

F

Q

N N

490

|TTA|gaA

gaA|AGC1caA

ccGITtA

aaC

acC

ttT

atg1ttC

caA

aaC1aaC|

t

c t

G

G tct g

t

c t

t

c

t

g

t t ! W.T

Hindlll.

25 ! 1 1 1

535

179 R ICgT a g

180 F ttT c

181 R AgG c a

182 N aaC | t

183 R CgT a g

184 Q caA g

185 G gGT g Hg:

186 A GCT a lAI .

187 L CtT t a

188 T acC t

189 V gTG t Bsi

190 Y TAC t :GI.

191 T AcT g

192 G ggA c

193 T acC| t !

W.T

30

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

V

T

Q

G

T

D

P

V

K

T

Y

Q

Y

T

580

Igtc

acC

caG

GGTI ACC

gaT|ccT

gtC

aaG|acC

taC

taT|caA

taTIacC|

t

a

c

t

c

t

a

t

c

g

c

t !

W.T

35

r

Kpnl

1

209

210

211

212

213

214

215

216

217

218

219

220

221

222

223

P

V

S

K

A

M

Y

D

A

Y

W

N

G

K

625

I ccG

gtC

TCG

AGtIaaG

gcT

atg

taC

gaT

gcCItaT

tgg

aaT

ggc

aaG |

40

1

t

a

tea

a

c

t

c

t

c

t

a Ϊ

W.T

Bsal.

ί 1 } 45 ! t

670

224 F 1 ttT I c

Xhol.. 225 226

227 C tgT| c

228 A gcC | t

229 F ttT | c

230 H caC | t

231 S AGC| tct

232 G ggTi c

233 234

235 E gaa 1 G

236 D gac I T

237 238

R CgT| A a

D gaT| C

F ttci t

N aaC 1 t

P CCtl a

F ttT| c !

W.T

1

239

240

241

242

243

244

245

246

247

248

249

250

251

252

253

50 !

V

C

E

Y

Q

G

Q

S

D

L

P

Q

P

715

IgtCitgCI

gaGI

taC |

caG I

ggTi

caG |

AGT |

AGC|

gaTITtA|

ccG |

caGIccAj

CCG I

t

a

t

a

c

a

teg

tct

c

c g

t

a

t

t !

W.T

1 I

Drdl.

Agel. . .

c. R 1

Domain 2-

Lj.ii Kci

!

254

255

256

257

258

259

260

261

262

263

264

265

266

267

268

1

V

N

A

G

S

G

S

G

S

760

1GTT|AAC

IgcG

IggT

IggTIggT

|AGC

IggC

IggA

1ggCIAGC1ggC

IggT

1ggTIAGC1

1

c

t

c

tct

t

tct

t

c

tct

! W.

60 !

Age I.

I

Hpal...


Hindi.

Linker 2----------------------------------------------> Domain 3—>

269 270 271 272 273 274 275 276 277 278 279 280 281 282 283

------------Domain 3------------------->

284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 SG DFDYEKMANANKG

850 | AGTIggC|gacIttcIgacItacI gag IaaaIatg|get|aat|gee|aac|aaa|GGC | tee tttttag aettgg! W.T.

KasI....

299 300 301 302 303 304 305 306 307 308 309 310 311 312 313

95 |GCCIatgI act I gag IaacI get IgacIgaGIAAT|GCA|ctg|caa|agt|gat IgCCI

KasI..

a c BsmI..

g tet t ! W StyX..

T.

314 315 316 317 318 319 320 321 322 323 324 325 326 327 328

940 |AAGIGGtIaagIttaIgacI age IgTC|GCc|AcaIgacI tat|ggT|GCt|gee I ate| act

Styl.

t tet PflFI...

T.

329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 DG FIGDVSGLANGNG

985 IgaclggcItttI ate Iggc|gat|gtc|agt|ggtIctg|get IaacIggcIaacIgga| t t c t t c ttcc cct t t t t!W

344 345 346 347 348 349 350 351 352 353 AT GDFAGSNS

1030 I gee IaccIggaIgacIttc|GCA|GGT|teG|AAT|TCt| ttttttct c ! W.T.

BstBI...

EcoRI...

BspMI..

354	355	356	357	358	359	360	361	362	363
Q	M	A	Q	V	G	D	G	D	N
1060 cag	atg	geC	CAG	GTT	GGA	GAT	GGg	gac	aac
a		t	a	c	t	c	t	t	t

XemI................

T.

1090

364 S agt tea

365 P ccg t

366 L ett t a

367 M atg

368 N aac t

369 N aac t

370 F ttt c

371 R aga c t

372 Q cag a

373 Y tac t

374 L ett t a

375 P ccg t

376 S tet c

377 L ett c

378 P ccg t

37 9 Q cag a !

! W.T
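The W.T. rows under the recoded codons list only the bases that differ from wild-type gene iii; the recoding is intended to change the DNA while leaving the encoded residues unchanged. As a hedged illustration, the sketch below compares the recoded codons for residues 364 to 379 with the corresponding wild-type codons (residues 353 to 368 of Table 37, written here with the bases read as a, c, g, t). The names `translate` and `count_changes` are hypothetical helpers for this example.

```python
# Standard genetic code, built from the conventional t/c/a/g ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(row):
    """Translate whitespace-separated codons into one-letter amino acids."""
    return "".join(CODON_TABLE[codon.lower()] for codon in row.split())


def count_changes(row_a, row_b):
    """Count base positions that differ between two equally long codon rows."""
    a = "".join(row_a.split()).lower()
    b = "".join(row_b.split()).lower()
    if len(a) != len(b):
        raise ValueError("rows must have the same length")
    return sum(x != y for x, y in zip(a, b))


wild_type = "tca cct tta atg aat aat ttc cgt caa tat tta cct tcc ctc cct caa"
recoded = "agt ccg ctt atg aac aac ttt aga cag tac ctt ccg tct ctt ccg cag"

# Both rows should encode the same peptide if every change is silent.
print(translate(wild_type) == translate(recoded))
print(count_changes(wild_type, recoded))
```

A recoding is silent exactly when the two translations are identical; the change count gives a rough measure of how far the recoded anchor has diverged from the wild-type gene at the DNA level.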

380

381

382

383

384

385

386

387

388

389

390

391

392

393

394

395

S

V

E

C

R

P

F

V

F

S

A

G

K

P

Y

E

1138

agt

gtc

gag

tgc

cgt

cca

ttc

gtt

ttc

tet

gee

ggc

aag

cct

tac

gag

teg

t

a

t

c

t

c

t

age

t

a

t

a

! W.T

Domain 3

1186

396	397	398	399
c*	S	I	D
t'. 3	aGC	Ate	gac
t	tet	t	t

400	401	402	403
C	D	K	I
TGC	gat	aag	ate
t	c	a	a

>

404	405	406	407
N	L	F	R
aat	ett	ttc	CGC
c	t a		t


	1		BstAPI
	1	trar	.smembrane se	gment---
	1	408	409	410	411	412	413	414
5	1	G	V	F	A	F	L	L
	1222	GGc	gtt	ttc	get	ttc	ttg	eta
	1	t	c	t	g	t	c t	t a
	1	424	425	426	427	428	429	430
10	1	S	T	F	A .	N	I	L
	1270	aGC	ACT	TTC	GCC	AAT	ATT	TTA
	1 1	tct	g	t	t	c	a	c g
15	1 1
	1306			tag	tga	tct	CCT	AGG
	1						Avril. .
	1321	aag	CCC	gcc	taa	tga	gcg	ggc
20	1	1	Trp	terminator

End Fab cassette

SacII.. .

>

415	416	417	418	419	420	421	422	423
Y	V	A	T	F	M	Y	V	F
tac	gtc	get	act	ttc	atg	tac	gtt	ttc
t	t	c	c	t		t	a	t

431 432 433 434 435 R N K E S

Cgc aac aaa gaa age t t g g tct ! w.t.

Intracellular anchor.

ttt ttt ttt ct ggt I

End of Table


Table 39: ONs to make deletions in III

! ONs for use with NheI
(ON_G29bot) 5'-c gTT gAT ATc gcT Agc cTA Tgc-3'
! this is the reverse complement of 5'-gca tag gct agc gat atc aac g-3'
!                                              NheI... scab.........
(ON_G104top) 5'-g|ata|ggc|tta|gcT|aGC|ccg|gag|aac|gaa|gg-3'
! Scab..........NheI... 104 105 106 107 108
(ON_G236top) 5'-c|ttt|cac|agc|ggt|ttc|GCT|AGC|gac|cct|ttt|gtc|tgc-3'
! NheI... 236 237 238 239 240
(ON_G236tCS) 5'-c|ttt|cac|agc|ggt|ttc|GCT|AGC|gac|cct|ttt|gtc|Agc|gag|tac|cag|ggt|c-3'
! NheI... 236 237 238 239 240

! ONs for use with SphI (G CAT Gc)
(ON_X37bot) 5'-gAc TgT cTc ggc Agc ATg cgc cAT Acg ATc ATc gTT g-3'
!                     N   D   D   R   M   A   H   A
(ON_X37bot)=[RC] 5'-c aac gat gat cgt atg gcG CAt Gct gcc gag aca gtc-3'
! SphI....Scab...........
(ON_X104top) 5'-g|gtG|ccg|ata|ggc|ttG|CAT|GCa|ccg|gag|aac|gaa|gg-3'
! Scab...............SphI.... 104 105 106 107 108
(ON_X236top) 5'-c|ttt|cac|agc|ggt|ttG|CaT|gCa|gac|cct|ttt|gtc|tgc-3'
! SphI.... 236 237 238 239 240
(ON_X236tCS) 5'-c|ttt|cac|agc|ggt|ttG|CaT|gCa|gac|cct|ttt|gtc|Agc|gag|tac|cag|ggt|c-3'
! NheI. 236 237 238 239 240
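Table 39 states that the bottom-strand oligonucleotides are reverse complements of the top-strand sequences shown beside them. That relationship can be confirmed mechanically; the sketch below is an illustration only, and `reverse_complement` is a hypothetical helper name. Sequences are entered in lower case with the spacing of the table.

```python
# Base-pairing map; case is preserved so marked (upper-case) bases stay marked.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(oligo):
    """Return the reverse complement of an oligo, ignoring whitespace."""
    bases = "".join(oligo.split())
    return bases.translate(COMPLEMENT)[::-1]


on_g29bot = "c gtt gat atc gct agc cta tgc"
on_g29_top = "gca tag gct agc gat atc aac g"

on_x37bot = "gac tgt ctc ggc agc atg cgc cat acg atc atc gtt g"
on_x37_rc = "c aac gat gat cgt atg gcg cat gct gcc gag aca gtc"

print(reverse_complement(on_g29_top) == "".join(on_g29bot.split()))
print(reverse_complement(on_x37bot) == "".join(on_x37_rc.split()))
```

The same check can be applied to any bottom-strand primer before ordering, since a single transposed base breaks the equality.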


Table 40: Phage titers and enrichments of selections with a DY3F31-based human Fab library

	Input (total cfu)	Output (total cfu)	Output/input ratio
R1-ox selected on phOx-BSA	4.5 × 10¹²	3.4 × 10⁵	7.5 × 10⁻⁸
R2-Strep selected on Strep-beads	9.2 × 10¹²	3 × 10⁸	3.3 × 10⁻⁵
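The output/input ratio in Table 40 is the recovered titer divided by the applied titer for each selection round. The short sketch below recomputes it from the tabulated input and output values; the dictionary layout and the function name `recovery` are assumptions made for illustration. Because the published titers are rounded to two significant figures, the recomputed ratios can be expected to agree with the table only to within rounding.

```python
# (input cfu, output cfu) as tabulated in Table 40.
selections = {
    "R1-ox selected on phOx-BSA": (4.5e12, 3.4e5),
    "R2-Strep selected on Strep-beads": (9.2e12, 3.0e8),
}


def recovery(input_cfu, output_cfu):
    """Fraction of input phage recovered after one round of selection."""
    return output_cfu / input_cfu


for name, (input_cfu, output_cfu) in selections.items():
    print(f"{name}: {recovery(input_cfu, output_cfu):.1e}")
```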


Table 41: Frequency of ELISA positives in DY3F31-based Fab libraries

	Anti-M13 HRP	9E10/RAM-HRP	Anti-CK/CL Gar-HRP
R2-ox (with IPTG induction)	18/44	10/44	10/44
R2-ox (without IPTG)	13/44	ND	ND
R3-strep (with IPTG)	39/44	38/44	36/44
R3-strep (without IPTG)	33/44	ND	ND
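The entries in Table 41 are counts of ELISA-positive clones out of 44 clones tested. Converting them to percentages makes the two selections easier to compare; the sketch below is illustrative, with `percent_positive` as a hypothetical helper, and entries marked ND are simply omitted.

```python
# Positives out of 44 clones, by detection reagent, as tabulated in Table 41.
TOTAL_CLONES = 44
positives = {
    "R2-ox (with IPTG induction)": {"Anti-M13": 18, "9E10/RAM": 10, "Anti-CK/CL": 10},
    "R2-ox (without IPTG)": {"Anti-M13": 13},
    "R3-strep (with IPTG)": {"Anti-M13": 39, "9E10/RAM": 38, "Anti-CK/CL": 36},
    "R3-strep (without IPTG)": {"Anti-M13": 33},
}


def percent_positive(count, total=TOTAL_CLONES):
    """Percentage of tested clones scored positive, to one decimal place."""
    return round(100 * count / total, 1)


for sample, by_reagent in positives.items():
    summary = ", ".join(
        f"{reagent}: {percent_positive(count)}%" for reagent, count in by_reagent.items()
    )
    print(f"{sample} -> {summary}")
```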

-221 2016225923 12 Jun 2018

Claims

1. A library comprising a collection of nucleic acids, which collectively encodes a plurality of antibody heavy chains each comprising a heavy chain variable region containing, from its N-terminus to C-terminus, Framework Region 1 (FR1),

Complementary Determining Region 1 (CDR1), Framework Region 2 (FR2), Complementary Determining Region 2 (CDR2), Framework

Region 3 (FR3), Complementary Determining Region 3 (CDR3), and Framework Region 4 (FR4), wherein:

(a) the CDR1 region comprises the amino acid sequence -X1-Y-X2-M-X3-, in which each of X1, X2, and X3 is independently selected from the group consisting of A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y;

G, and S, X6 is selected from the group consisting of P and S, and each of X7 and X8 is independently selected from the group consisting of A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y; and

(c) the CDR3 region is captured from the CDR3 region of an immunoglobulin heavy chain variable gene from a B cell.

2. The library of claim 1, wherein the library is a library of vectors or a library of genetic packages comprising the collection of nucleic acids.

3. The library of claim 2, wherein the library is a library of vectors, which are phage vectors or yeast vectors.


4. The library of claim 2, wherein the library is a library of genetic packages, which are M13 phage particles or yeast cells.

5. The library of any one of claims 1, 2, and 4, wherein the library is a library of genetic packages, which display the plurality of antibody heavy chains encoded by the collection of nucleic acids.

6. The library of any one of claims 1-5, wherein each of the plurality of antibody heavy chains further comprise an antibody heavy chain constant region or a portion thereof, which is linked to the C-terminus of the FR4.

7. The library of claim 6, wherein each of the plurality of antibody heavy chains further comprise a CH1 domain of an antibody heavy chain constant region, which is linked to the C-terminus of the FR4.

8. The library of any one of claims 1-7, wherein each of the plurality of antibody heavy chains is linked to an M13 pIII anchor segment, which does not mediate infection of phage particles.

9. The library of claim 8, wherein the M13 pIII anchor segment comprises the amino acid sequence of:

SGDFDYEKMA NANKGAMTEN ADENALQSDA KGKLDSVATD YGAAIDGFIG

DVSGLANGNG ATGDFAGSNS QMAQVGDGDN SPLMNNFRQY LPSLPQSVEC

RPFVFGAGKP YEFSIDCDKI NLFR.


10. The library of claim 8 or claim 9, wherein the collection of nucleic acids are present on phage vectors, which further encode wild-type M13 pIII.

11. The library of any one of claims 1-10, wherein the plurality of the antibody heavy chains collectively comprises a mixture of A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y at each of positions X1, X2, and X3 in CDR1.

12. The library of any one of claims 1-11, wherein the plurality of the antibody heavy chains collectively comprises a mixture of Y, R, W, V, G, and S at each of positions X4 and X5, a mixture of P and S at position X6, and a mixture of A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y at each of positions X7 and X8 in CDR2.

13. The library of any one of claims 1-12, wherein the plurality of the antibody heavy chains collectively comprises a plurality of CDR3 regions captured from B cells.

14. The library of claim 13, wherein the B cells are from a blood sample of an autoimmune patient.

15. The library of claim 14, wherein the autoimmune patient is diagnosed with a disorder selected from the group consisting of systemic lupus erythematosus, systemic sclerosis, rheumatoid arthritis, antiphospholipid syndrome and vasculitis.

16. The library of any one of claims 1-15, wherein the FR1, FR2, FR3, and FR4 regions are VH3-23 framework regions.


17. The library of any one of claims 1-16, wherein the library further comprises an additional collection of nucleic acids encoding a plurality of antibody light chains each comprising a light chain variable region.

18. The library of claim 17, wherein each of the antibody light chains further comprises a light chain constant region.

19. The library of claim 18, wherein the library is a library of genetic packages displaying a plurality of antibody Fab fragments comprising the plurality of heavy chains and the plurality of light chains.
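
The residue mixtures recited in claims 11 and 12 imply an upper bound on the number of distinct CDR1/CDR2 combinations. The following is a minimal sketch of that arithmetic, assuming every listed residue is allowed independently at every recited position; the names `BROAD_MIX`, `NARROW_MIX`, `X6_MIX` and `count_combinations` are illustrative and do not come from the specification.

```python
from math import prod

# Residue sets as recited in claims 11 and 12 (assumed independent per position).
BROAD_MIX = "ADEFGHIKLMNPQRSTVWY"   # positions X1, X2, X3, X7, X8
NARROW_MIX = "YRWVGS"               # positions X4, X5
X6_MIX = "PS"                       # position X6


def count_combinations(position_sets):
    """Multiply the number of allowed residues at each varied position."""
    return prod(len(set(residues)) for residues in position_sets)


cdr1_variants = count_combinations([BROAD_MIX] * 3)
cdr2_variants = count_combinations(
    [NARROW_MIX, NARROW_MIX, X6_MIX, BROAD_MIX, BROAD_MIX]
)
combined_variants = cdr1_variants * cdr2_variants

print(cdr1_variants, cdr2_variants, combined_variants)
```

This counts theoretical sequence combinations only; the number of clones actually present in a physical library would depend on factors the claims do not state.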

FIG. 1

Gel analysis of PCR product from extender-kappa amplification

Approx. 75 ng/5 µl -> 15 ng/µl

1 - 100 bp

2 - LDM

3 - 50 ng template

4 - 10 ng template

5 - ssDNA unligated

6 - negative control

7 - LDM

8 - 100 bp

FIG. 5

Gel purified PCR product from extender-kappa amplification

Concentration: ± 35 ng/µl

1 - LDM

2 - 1 µl purif.

FIG. 6

Gel analysis of digested k-ssDNA

1 µl digested ssDNA ≈ 8 ng ssDNA

Total volume of 50 µl = 400 ng ssDNA

400 ng ssDNA available for ligation of the bridge-extenders

1 - 100 bp

2 - LDM

3 - 1 µl ssDNA pure

4 - 4 µl beads after dig

5 - 8 µl beads after dig

6 - LDM

7 - 100 bp

FIG. 7

Gel analysis of extender - cleaved kappa ligation

20 ng/5 µl eluted material -> 4 ng/µl

1 - 100 bp

2 - LDM

3 - Ligation mix, 4 µl

4 - Unligated ssDNA

5 - LDM

FIG. 8


FIG. 10


FIG. 17


FIG. 19

FIG. 20

WO 02/083872

PCT/US02/12405

SEQUENCE LISTING

<110> LADNER, ROBERT C.; COHEN, EDWARD H.; NASTRI, HORACIO G.; ROOKEY, KRISTIN L.; HOET, RENE; HOOGENBOOM, HENDRICUS R. J. M.

<120> NOVEL METHODS OF CONSTRUCTING LIBRARIES COMPRISING DISPLAYED AND/OR EXPRESSED MEMBERS OF A DIVERSE FAMILY OF PEPTIDES, POLYPEPTIDES OR PROTEINS AND THE NOVEL LIBRARIES

<130> DYAX/002 CIP2

<140> 10/045,674

<141> 2001-10-25

<150> 06/198,069

<151> 2000-04-17

<150> 09/837,306

<151> 2001-04-17

<160> 635

<170> PatentIn Ver. 2.1

<210> 1 <211> 17 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 1 catgtgtatt actgtgc 17

<210> 2 <211> 44 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 2 cacatccgtg cttcttgcac ggatgtggca cagtaataca catg 44

<210> 3 <211> 18 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 3 gtgtattaga ctgctgcc 18

<210> 4 <211> 43 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 4 ggcagcagtc taatacacca catccgtgtt cttcacggat gtg 43

<210> 5 <211> 47 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 5 cacatccgtg tttgttacac ggatgtggtg tcttacagtc cattctg 47

<210> 6 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 6 cagaatggac tgtaagacac 20

<210> 7 <211> 43 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 7 atcgagtctc actgagccac atccgtggtt ttccacggat gtg 43

<210> 8 <211> 17 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 8 gctcagtgag actcgat 17

<210> 9 <211> 24 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (10)..(24) <223> A, T, C, G, other or unknown <400> 9 cacgaggagn nnnnnnnnnn nnnn 24

<210> 10 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 10 atgaccgaat tgctacaag 19

<210> 11 <211> 46 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 11 gactcctcag cttcttgctg aggagtcctt gtagcaattc ggtcat 46

<210> 12 <211> 6 <212> PRT <213> Artificial Sequence <220> <223> Description of Artificial Sequence: 6 His tag <400> 12
His His His His His His
1               5

<210> 13 <211> 10 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (6)..(10) <223> A, T, C, G, other or unknown <400> 13 gtctcnnnnn 10

<210> 14 <211> 11 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (1)..(6) <223> A, T, C, G, other or unknown <400> 14 nnnnnngaga c 11

<210> 15 <211> 24 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (11)..(24) <223> A, T, C, G, other or unknown <400> 15 cacggatgtg nnnnnnnnnn nnnn 24

<210> 16 <211> 24 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (1)..(14) <223> A, T, C, G, other or unknown <400> 16 nnnnnnnnnn nnnncacatc cgtg 24

<210> 17 <211> 14 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 17 gtgtattact gtgc 14

<210> 18 <211> 34 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 18 cacatccgtg cacggatgtg gcacagtaat acac 34
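
Read at face value, several of the oligonucleotides listed above are related by reverse complementarity: the 3' end of SEQ ID NO: 2 is the reverse complement of SEQ ID NO: 1, the 5' end of SEQ ID NO: 4 is the reverse complement of SEQ ID NO: 3, and both longer oligonucleotides carry a self-complementary ten-base stem. The sketch below checks these relationships by string comparison only. The helper name `reverse_complement` and the interpretation of the ten-base segment as a stem are assumptions drawn from the sequences themselves, not statements from this listing.

```python
# Lower-case DNA complement table (a, c, g, t only).
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lower-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def ungroup(listing_text: str) -> str:
    """Remove the ten-base grouping spaces used in the sequence listing."""
    return listing_text.replace(" ", "")


SEQ_1 = ungroup("catgtgtatt actgtgc")
SEQ_2 = ungroup("cacatccgtg cttcttgcac ggatgtggca cagtaataca catg")
SEQ_3 = ungroup("gtgtattaga ctgctgcc")
SEQ_4 = ungroup("ggcagcagtc taatacacca catccgtgtt cttcacggat gtg")

STEM = "cacatccgtg"  # ten-base segment shared by SEQ ID NOs: 2 and 4

seq2_ends_with_rc_of_seq1 = SEQ_2.endswith(reverse_complement(SEQ_1))
seq4_starts_with_rc_of_seq3 = SEQ_4.startswith(reverse_complement(SEQ_3))
stem_is_paired_in_seq2 = (
    SEQ_2.startswith(STEM) and reverse_complement(STEM) in SEQ_2[len(STEM):]
)
stem_is_paired_in_seq4 = (
    STEM in SEQ_4 and reverse_complement(STEM) in SEQ_4[SEQ_4.index(STEM) + len(STEM):]
)

print(seq2_ends_with_rc_of_seq1, seq4_starts_with_rc_of_seq3)
print(stem_is_paired_in_seq2, stem_is_paired_in_seq4)
```

Exact string matching is used deliberately; it says nothing about hybridization conditions or enzyme behavior.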

<210> 19 <211> 14 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 19 gtgtattaga ctgc 14

<210> 20 <211> 34 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 20 gcagtctaat acaccacatc cgtgcacgga tgtg 34

<210> 21 <211> 34 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 21 cacatccgtg cacggatgtg gtgtcttaca gtcc 34

<210> 22 <211> 14 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 22 ggactgtaag acac 14

<210> 23 <211> 34 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 23 gagtctcact gagccacatc cgtgcacgga tgtg 34

<210> 24 <211> 14 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 24 gctcagtgag actc 14

<210> 25 <211> 14 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 25 gtgtattact gtgc 14

<210> 26 <211> 14 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 26 gtatattact gtgc 14

<210> 27 <211> 14 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 27 gtgtattact gtaa 14

<210> 28 <211> 14 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 28 gtgtattact gtac 14

<210> 29 <211> 14 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 29 ttgtattact gtgc 14

<210> 30 <211> 14 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 30 ttgtatcact gtgc 14

<210> 31 <211> 14 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 31 acatattact gtgc 14

<210> 32 <211> 14 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 32 acgtattact gtgc 14

<210> 33 <211> 14 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 33 atgtattact gtgc 14

<210> 34 <211> 101 <212> DNA <213> Homo sapiens <400> 34 agggtcacca tgaccaggga ctgagatctg acgacacggc 101

<210> 35 <211> 98 <212> DNA <213> Homo sapiens <400> 35 agagtcacca ttaccaggga agatctgaag acacggctgt

<210> 36 <211> 98 <212> DNA <213> Homo sapiens <400> 36 agagtcacca tgaccaggaa agatctgagg acacggccgt

<210> 37 <211> 98 <212> DNA <213> Homo sapiens <400> 37 agagtcacca tgaccacaga cacatccacg agcacagcct acatggagct gaggagcctg 60 agatctgacg acacggccgt gtattactgt gcgagaga 98

<210> 38 <211> 98 <212> DNA <213> Homo sapiens <400> 38 agagtcacca tgaccgagga cacatctaca gacacagcct acatggagct gagcagcctg 60 agatctgagg acacggccgt gtattactgt gcaacaga 98

<210> 39 <211> 98 <212> DNA <213> Homo sapiens <400> 39 agagtcacca ttaccaggga caggtctatg agcacagcct acatggagct gagcagcctg 60 agatctgagg acacagccat gtattactgt gcaagata 98

<210> 40 <211> 98 <212> DNA <213> Homo sapiens <400> 40 agagtcacca tgaccaggga cacgtccacg agcacagtct acatggagct gagcagcctg 60 agatctgagg acacggccgt gtattactgt gcgagaga 98

<210> 41 <211> 98 <212> DNA <213> Homo sapiens <400> 41 agagtcacca ttaccaggga catgtccaca agcacagcct acatggagct gagcagcctg 60 agatccgagg acacggccgt gtattactgt gcggcaga 98

<210> 42 <211> 98 <212> DNA <213> Homo sapiens <400> 42 agagtcacga ttaccgcgga cgaatccacg agcacagcct acatggagct gagcagcctg 60 agatctgagg acacggccgt gtattactgt gcgagaga 98

<210> 43 <211> 98 <212> DNA <213> Homo sapiens <400> 43 agagtcacga ttaccgcgga caaatccacg agcacagcct acatggagct gagcagcctg 60 agatctgagg acacggccgt gtattactgt gcgagaga 98

<210> 44 <211> 98 <212> DNA <213> Homo sapiens <400> 44 agagtcacca taaccgcgga cacgtctaca gacacagcct acatggagct gagcagcctg 60 agatctgagg acacggccgt gtattactgt gcaacaga 98

<210> 45 <211> 100 <212> DNA <213> Homo sapiens <400> 45 aggctcacca tcaccaagga cacctccaaa aaccaggtgg tccttacaat gaccaacatg 60 gaccctgtgg acacagccac atattactgt gcacacagac 100

<210> 46 <211> 100 <212> DNA <213> Homo sapiens <400> 46 aggctcacca tctccaagga cacctccaaa agccaggtgg tccttaccat gaccaacatg 60 gaccctgtgg acacagccac atattactgt gcacggatac 100

<210> 47 <211> 100 <212> DNA <213> Homo sapiens <400> 47 aggctcacca tctccaagga cacctccaaa aaccaggtgg tccttacaat gaccaacatg 60 gaccctgtgg acacagccac gtattactgt gcacggatac 100

<210> 48 <211> 98 <212> DNA <213> Homo sapiens <400> 48 cgattcacca tctccagaga caacgccaag aactcactgt atctgcaaat gaacagcctg 60 agagccgagg acacggctgt gtattactgt gcgagaga 98

<210> 49 <211> 100 <212> DNA <213> Homo sapiens <400> 49 cgattcacca tctccagaga caacgccaag aactccctgt atctgcaaat gaacagtctg 60 agagctgagg acacggcctt gtattactgt gcaaaagata 100

<210> 50 <211> 98 <212> DNA <213> Homo sapiens <400> 50 cgattcacca tctccaggga caacgccaag aactcactgt atctgcaaat gaacagcctg 60 agagccgagg acacggccgt gtattactgt gcgagaga 98

<210> 51 <211> 98 <212> DNA <213> Homo sapiens <400> 51 cgattcacca tctccagaga aaatgccaag aactccttgt atcttcaaat gaacagcctg 60 agagccgggg acacggctgt gtattactgt gcaagaga 98

<210> 52 <211> 98 <212> DNA <213> Homo sapiens <400> 52 agattcacca tctcaagaga tgattcaaaa aacacgctgt atctgcaaat gaacagcctg 60 aaaaccgagg acacagccgt gtattactgt accacaga 98

<210> 53 <211> 98 <212> DNA <213> Homo sapiens <400> 53 cgattcacca tctccagaga caacgccaag aactccctgt atctgcaaat gaacagtctg 60 agagccgagg acacggcctt gtatcactgt gcgagaga 98

<210> 54 <211> 98 <212> DNA <213> Homo sapiens <400> 54 cgattcacca tctccagaga caacgccaag aactcactgt atctgcaaat gaacagcctg 60 agagccgagg acacggctgt gtattactgt gcgagaga 98

<210> 55 <211> 98 <212> DNA <213> Homo sapiens <400> 55 cggttcacca tctccagaga caattccaag aacacgctgt atctgcaaat gaacagcctg 60 agagccgagg acacggccgt atattactgt gcgaaaga 98

<210> 56 <211> 98 <212> DNA <213> Homo sapiens <400> 56 cgattcacca tctccagaga caattccaag aacacgctgt atctgcaaat gaacagcctg 60 agagctgagg acacggctgt gtattactgt gcgaaaga 98

<210> 57 <211> 98 <212> DNA <213> Homo sapiens <400> 57 cgattcacca tctccagaga caattccaag aacacgctgt atctgcaaat gaacagcctg 60 agagctgagg acacggctgt gtattactgt gcgagaga 98

<210> 58 <211> 98 <212> DNA <213> Homo sapiens <400> 58 cgattcacca tctccagaga caattccaag aacacgctgt atctgcaaat gaacagcctg 60 agagctgagg acacggctgt gtattactgt gcgaaaga 98

<210> 59 <211> 98 <212> DNA <213> Homo sapiens <400> 59 cgattcacca tctccagaga caattccaag aacacgctgt atctgcaaat gaacagcctg 60 agagccgagg acacggctgt gtattactgt gcgagaga 98

<210> 60 <211> 100 <212> DNA <213> Homo sapiens <400> 60 cgattcacca tctccagaga caacagcaaa aactccctgt atctgcaaat gaacagtctg 60 agaactgagg acaccgcctt gtattactgt gcaaaagata 100

<210> 61 <211> 98 <212> DNA <213> Homo sapiens <400> 61 cgattcacca tctccagaga caatgccaag aactcactgt atctgcaaat gaacagcctg 60 agagacgagg acacggctgt gtattactgt gcgagaga 98

<210> 62 <211> 98 <212> DNA <213> Homo sapiens <400> 62 agattcacca tctcaagaga tggttccaaa agcatcgcct atctgcaaat gaacagcctg 60 aaaaccgagg acacagccgt gtattactgt actagaga 98

<210> 63 <211> 98 <212> DNA <213> Homo sapiens <400> 63 cgattcacca tctccagaga caattccaag aacacgctgt atcttcaaat gaacagcctg 60 agagccgagg acacggccgt gtattactgt gcgagaga 98

<210> 64 <211> 98 <212> DNA <213> Homo sapiens <400> 64 agattcacca tctccagaga caattccaag aacacgctgt atcttcaaat gggcagcctg 60 agagctgagg acatggctgt gtattactgt gcgagaga 98

<210> 65 <211> 98 <212> DNA <213> Homo sapiens <400> 65 agattcacca tctccagaga caattccaag aacacgctgt atcttcaaat gaacagcctg 60 agagctgagg acacggctgt gtattactgt gcgagaga 98

<210> 66 <211> 98 <212> DNA <213> Homo sapiens <400> 66 agattcacca tctcaagaga tgattcaaag aactcactgt atctgcaaat gaacagcctg 60 aaaaccgagg acacggccgt gtattactgt gctagaga 98

<210> 67 <211> 98 <212> DNA <213> Homo sapiens <400> 67 aggttcacca tctccagaga tgattcaaag aacacggcgt atctgcaaat gaacagcctg 60 aaaaccgagg acacggccgt gtattactgt actagaca 98

<210> 68 <211> 98 <212> DNA <213> Homo sapiens <400> 68 cgattcacca tctccagaga caacgccaag aacacgctgt atctgcaaat gaacagtctg 60 agagccgagg acacggctgt gtattactgt gcaagaga 98

<210> 69 <211> 98 <212> DNA <213> Homo sapiens <400> 69 agattcacca tctccagaga caattccaag aacacgctgc atcttcaaat gaacagcctg 60 agagctgagg acacggctgt gtattactgt aagaaaga 98

<210> 70 <211> 98 <212> DNA <213> Homo sapiens <400> 70 cgagtcacca tatcagtaga caagtccaag aaccagttct ccctgaagct gagctctgtg 60 accgccgcgg acacggccgt gtattactgt gcgagaga 98

<210> 71 <211> 98 <212> DNA <213> Homo sapiens <400> 71 cgagtcacca tgtcagtaga cacgtccaag aaccagttct ccctgaagct gagctctgtg 60 accgccgtgg acacggccgt gtattactgt gcgagaaa 98

<210> 72 <211> 98 <212> DNA <213> Homo sapiens <400> 72 cgagttacca tatcagtaga cacgtctaag aaccagttct ccctgaagct gagctctgtg 60 actgccgcgg acacggccgt gtattactgt gcgagaga 98

<210> 73 <211> 98 <212> DNA <213> Homo sapiens <400> 73 cgagtcacca tatcagtaga caggtccaag aaccagttct ccctgaagct gagctctgtg 60 accgccgcgg acacggccgt gtattactgt gccagaga 98

<210> 74 <211> 98 <212> DNA <213> Homo sapiens <400> 74 cgagttacca tatcagtaga cacgtccaag aaccagttct ccctgaagct gagctctgtg 60 actgccgcag acacggccgt gtattactgt gccagaga 98

<210> 75 <211> 98 <212> DNA <213> Homo sapiens <400> 75 cgagttacca tatcagtaga cacgtctaag aaccagttct ccctgaagct gagctctgtg 60 actgccgcgg acacggccgt gtattactgt gcgagaga 98

<210> 76 <211> 98 <212> DNA <213> Homo sapiens <400> 76 cgagtcacca tatcagtaga cacgtccaag aaccagttct ccctgaagct gagctctgtg 60 accgccgcgg acacggctgt gtattactgt gcgagaga 98

<210> 77 <211> 98 <212> DNA <213> Homo sapiens <400> 77 cgagtcacca tatccgtaga cacgtccaag aaccagttct ccctgaagct gagctctgtg 60 accgccgcag acacggctgt gtattactgt gcgagaca 98

<210> 78 <211> 98 <212> DNA <213> Homo sapiens <400> 78 cgagtcacca tatcagtaga cacgtccaag aaccagttct ccctgaagct gagctctgtg 60 accgctgcgg acacggccgt gtattactgt gcgagaga 98

<210> 79 <211> 98 <212> DNA <213> Homo sapiens <400> 79 cgagtcacca tatcagtaga cacgtccaag aaccagttct ccctgaagct gagctctgtg 60 accgctgcgg acacggccgt gtattactgt gcgagaga 98

<210> 80 <211> 98 <212> DNA <213> Homo sapiens <400> 80 cgagtcacca tatcagtaga cacgtccaag aaccagttct ccctgaagct gagctctgtg 60 accgccgcag acacggccgt gtattactgt gcgagaga 98

<210> 81 <211> 98 <212> DNA <213> Homo sapiens <400> 81 caggtcacca tctcagccga caagtccatc agcaccgcct acctgcagtg gagcagcctg 60 aaggcctcgg acaccgccat gtattactgt gcgagaca 98

<210> 82 <211> 96 <212> DNA <213> Homo sapiens <400> 82 cacgtcacca tctcagctga caagtccatc agcactgcct acctgcagtg gagcagcctg 60 aaggcctcgg acaccgccat gtattactgt gcgaga 96

<210> 83 <211> 98 <212> DNA <213> Homo sapiens <400> 83 cgaataacca tcaacccaga cacatccaag aaccagttct ccctgcagct gaactctgtg 60 actcccgagg acacggctgt gtattactgt gcaagaga 98

<210> 84 <211> 98 <212> DNA <213> Homo sapiens <400> 84 cggtttgtct tctccttgga cacctctgtc agcacggcat atctgcagat ctgcagccta 60 aaggctgagg acactgccgt gtattactgt gcgagaga 98

<210> 85 <211> 11 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (3)..(9) <223> A, T, C, G, other or unknown <400> 85 gcnnnnnnng c 11

<210> 86 <211> 10 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (4)..(7) <223> A, T, C, G, other or unknown <400> 86 caynnnnrtg 10

<210> 87 <211> 11 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (6)..(11) <223> A, T, C, G, other or unknown <400> 87 gagtcnnnnn n 11

<210> 88 <211> 11 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (1)..(6) <223> A, T, C, G, other or unknown <400> 88 nnnnnngaga c 11

<210> 89 <211> 10 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (4)..(7) <223> A, T, C, G, other or unknown <400> 89 gaannnnttc 10

<210> 90 <211> 90 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic 3-23 FR3 nucleotide sequence

<220> <221> CDS <222> (1)..(90)

<220> <221> modified_base <222> (3) <223> A, T, C or G

<220> <221> modified_base <222> (9) <223> A, T, C or G

<220> <221> modified_base <222> (12) <223> A, T, C or G

<220> <221> modified_base <222> (21) <223> A, T, C or G

<220> <221> modified_base <222> (30) <223> A, T, C or G

<220> <221> modified_base <222> (36) <223> A, T, C or G

<220> <221> modified_base <222> (51) <223> A, T, C or G

<220> <221> modified_base <222> (57) <223> A, T, C or G

<220> <221> modified_base <222> (60) <223> A, T, C or G

<220> <221> modified_base <222> (69) <223> A, T, C or G

<220> <221> modified_base <222> (72) <223> A, T, C or G

<220> <221> modified_base <222> (75) <223> A, T, C or G

<220> <221> modified_base <222> (78) <223> A, T, C or G

<220> <221> modified_base <222> (87) <223> A, T, C or G

<400> 90
acn ath wsn mgn gay aay wsn aar aay acn ytn tay ttn car atg aay 48
Thr Ile Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn
1               5                   10                  15

wsn ttr mgn gcn gar gay acn gcn gtn tay tay tgy gcn aar 90
Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Lys
            20                  25                  30

<210> 91 <211> 30 <212> PRT <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic 3-23 FR3 protein sequence <400> 91
Thr Ile Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn
1               5                   10                  15

Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Lys
            20                  25                  30
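
SEQ ID NO: 90 is written with IUPAC ambiguity codes, so each codon stands for a set of concrete codons. The sketch below expands a few of those codons and looks them up in a deliberately partial codon table. It is an illustration under stated assumptions: `IUPAC`, `PARTIAL_CODON_TABLE`, `expand_codon` and `encoded_residues` are hypothetical names, and only the codons needed for the examples are tabulated.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (lower case, DNA alphabet).
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}

# Partial standard genetic code: only the codons used in the examples below.
PARTIAL_CODON_TABLE = {
    "aca": "Thr", "acc": "Thr", "acg": "Thr", "act": "Thr",
    "ata": "Ile", "atc": "Ile", "att": "Ile",
    "aac": "Asn", "aat": "Asn",
    "caa": "Gln", "cag": "Gln",
}


def expand_codon(codon: str) -> list[str]:
    """List every concrete codon represented by a degenerate codon."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in codon))]


def encoded_residues(codon: str) -> set[str]:
    """Residues encoded by a degenerate codon, per the partial table."""
    return {PARTIAL_CODON_TABLE[c] for c in expand_codon(codon)}


for degenerate in ("acn", "ath", "aay", "car"):
    print(degenerate, expand_codon(degenerate), encoded_residues(degenerate))

# A doubly degenerate codon such as "wsn" stands for many concrete codons.
print(len(expand_codon("wsn")))
```

Multiplying the expansion sizes over all 30 codons would give the number of concrete nucleotide sequences that the degenerate notation covers.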

<210> 92 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic probe <400> 92 agttctccct gcagctgaac tc

<210> 93 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 93 cactgtatct gcaaatgaac ag 22

<210> 94 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 94 ccctgtatct gcaaatgaac ag 22

<210> 95 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 95 ccgcctacct gcagtggagc ag 22

<210> 96 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 96 cgctgtatct gcaaatgaac ag 22

<210> 97 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 97 cggcatatct gcagatctgc ag 22

<210> 98 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 98 cggcgtatct gcaaatgaac ag 22

<210> 99 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 99 ctgcctacct gcagtggagc ag 22

<210> 100 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 100 tcgcctatct gcaaatgaac ag 22

<210> 101 <211> 63 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 101 cgcttcacta agtctagaga caactctaag aatactctct acttgcagat gaacagctta 60 agg 63
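
Several of the 22-base probes above occur verbatim inside the human FR3 sequences listed earlier. For example, SEQ ID NO: 93 is a substring of SEQ ID NO: 48, while SEQ ID NO: 94 is not. The sketch below shows that lookup, assuming plain exact matching; the function name `locate_probe` is illustrative, and how the probes are used experimentally is not taken from this listing.

```python
def ungroup(listing_text: str) -> str:
    """Remove the ten-base grouping spaces used in the sequence listing."""
    return listing_text.replace(" ", "")


SEQ_48 = ungroup(
    "cgattcacca tctccagaga caacgccaag aactcactgt atctgcaaat gaacagcctg "
    "agagccgagg acacggctgt gtattactgt gcgagaga"
)
SEQ_93 = ungroup("cactgtatct gcaaatgaac ag")
SEQ_94 = ungroup("ccctgtatct gcaaatgaac ag")


def locate_probe(target: str, probe: str) -> int:
    """Return the 0-based start of an exact probe match, or -1 if absent."""
    return target.find(probe)


offset_93 = locate_probe(SEQ_48, SEQ_93)
offset_94 = locate_probe(SEQ_48, SEQ_94)

print(offset_93, offset_94)
```

A single-base difference is enough to make the exact match fail, which is why SEQ ID NO: 94 is not found in SEQ ID NO: 48 even though it differs from SEQ ID NO: 93 at only one position.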

<210> 102 <211> 45 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 102 caagtagaga gtattcttag agttgtctct agacttagtg aagcg 45

<210> 103 <211> 54 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 103 cgcttcacta agtctagaga caactctaag aatactctct acttgcagct gaac 54

<210> 104 <211> 54 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 104 cgcttcacta agtctagaga caactctaag aatactctct acttgcaaat gaac 54

<210> 105 <211> 54 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 105 cgcttcacta agtctagaga caactctaag aatactctct acttgcagtg gagc 54

<210> 106 <211> 21 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Primer <400> 106 cgcttcacta agtctagaga c 21

<210> 107 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 107 acatggagct gagcagcctg ag 22

<210> 108 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 108 acatggagct gagcaggctg ag 22

<210> 109 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 109 acatggagct gaggagcctg ag 22

<210> 110 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 110 acctgcagtg gagcagcctg aa 22

<210> 111 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 111 atctgcaaat gaacagcctg aa 22

<210> 112 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 112 atctgcaaat gaacagcctg ag 22

<210> 113 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 113 atctgcaaat gaacagtctg ag 22

<210> 114 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 114 atctgcagat ctgcagccta aa 22

<210> 115 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 115 atcttcaaat gaacagcctg ag 22

<210> 116 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 116 atcttcaaat gggcagcctg ag 22

<210> 117 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 117 ccctgaagct gagctctgtg ac 22

<210> 118 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 118 ccctgcagct gaactctgtg ac 22

<210> 119 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 119 tccttacaat gaccaacatg ga 22

<210> 120 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 120 tccttaccat gaccaacatg ga 22

<210> 121 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 121 acatggagct gagcagcctg ag 22

<210> 122 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 122 ccctgaagct gagctctgtg ac 22

<210> 123 <211> 54 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 123 cgcttcacta agtctagaga caactctaag aatactctct acttgcagat gaac 54

<210> 124 <211> 60 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 124 cgcttcactc agtctagaga taacagtaaa aatactttgt acttgcagct gagcagcctg 60

<210> 125 <211> 60 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 125 cgcttcactc agtctagaga taacagtaaa aatactttgt acttgcagct gagctctgtg 60

<210> 126 <211> 52 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 126 tcagctgcaa gtacaaagta tttttactgt tatctctaga ctgagtgaag

<210> 127 <211> 24 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 127 cgcttcactc agtctagaga taac 24

<210> 128 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 128 ccgtgtatta ctgtgcgaga ga 22

<210> 129 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 129 ctgtgtatta ctgtgcgaga ga 22

<210> 130 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 130 ccgtgtatta ctgtgcgaga gg 22

<210> 131 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 131 ccgtgtatta ctgtgcaaca ga 22

<210> 132 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 132 ccatgtatta ctgtgcaaga ta 22

<210> 133 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 133 ccgtgtatta ctgtgcggca ga 22

<210> 134 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 134 ccacatatta ctgtgcacac ag 22

<210> 135 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 135 ccacatatta ctgtgcacgg at 22

<210> 136 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 136 ccacgtatta ctgtgcacgg at 22

<210> 137 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 137 ccttgtatta ctgtgcaaaa ga 22

<210> 138 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 138 ctgtgtatta ctgtgcaaga ga 22

<210> 139 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 139 ccgtgtatta ctgtaccaca ga 22

<210> 140 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 140 ccttgtatca ctgtgcgaga ga 22

<210> 141 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 141 ccgtatatta ctgtgcgaaa ga 22

WO 02/083872

PCT/US02/12405

2016225923 09 Sep 2016 <210> 142 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial oligonucleotide Sequence : Synthetic <400> 142 ctgtgtatta ctgtgcgaaa ga 22 <210> 143 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial oligonucleotide Sequence : Synthetic <400> 143 ccgtgtatta ctgtactaga ga 22

<210> 144 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 144 ccgtgtatta ctgtgctaga ga 22

<210> 145 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 145 ccgtgtatta ctgtactaga ca 22

<210> 146 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 146 ctgtgtatta ctgtaagaaa ga 22 <210> 147 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 147 ccgtgtatta ctgtgcgaga aa 22 <210> 148 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 148 ccgtgtatta ctgtgccaga ga 22 <210> 149 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 149 ctgtgtatta ctgtgcgaga ca 22 <210> 150 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 150 ccatgtatta ctgtgcgaga ca

<210> 151 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 151 ccatgtatta ctgtgcgaga

<210> 152 <211> 21 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 152 ccgtgtatta ctgtgcgaga g

<210> 153 <211> 21 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 153 ctgtgtatta ctgtgcgaga g

<210> 154 <211> 21 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 154 ccgtgtatta ctgtgcgaga g

<210> 155 <211> 21 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 155 ccgtatatta ctgtgcgaaa g

<210> 156 <211> 21 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 156 ctgtgtatta ctgtgcgaaa g

<210> 157 <211> 21 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 157 ctgtgtatta ctgtgcgaga c

<210> 158 <211> 21 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 158 ccatgtatta ctgtgcgaga c

<210> 159 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 159 ccatgtatta ctgtgcgaga 20

<210> 160 <211> 94 <212> DNA <213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 160 ggtgtagtga tctagtgaca actctaagaa tactctctac ttgcagatga acagctttag 60 ggctgaggac actgcagtct actattgtgc gaga 94 <210> 161 <211> 94 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 161 ggtgtagtga tctagtgaca actctaagaa tactctctac ttgcagatga acagctttag 60 ggctgaggac actgcagtct actattgtgc gaaa 94 <210> 162 <211> 85 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 162 atagtagact gcagtgtcct cagcccttaa gctgttcatc tgcaagtaga gagtattctt 60 agagttgtct ctagatcact acacc 85 <210> 163 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 163 ggtgtagtga tctagagaca ac

<210> 164 <211> 55 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 164 ggtgtagtga aacagcttta gggctgagga cactgcagtc tactattgtg cgaga 55 <210> 165 <211> 55 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 165 ggtgtagtga aacagcttta gggctgagga cactgcagtc tactattgtg cgaaa 55

<210> 166 <211> 46 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 166

atagtagact gcagtgtcct cagcccttaa gctgtttcac tacacc 46 <210> 167 <211> 46 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 167 ggtgtagtga aacagcttaa gggctgagga cactgcagtc tactat <210> 168 <211> 26 <212> DNA <213> Artificial Sequence

<220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 168 ggtgtagtga aacagcttaa gggctg 26

<210> 169 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe

<400> 169 agttctccct gcagctgaac tc 22 <210> 170 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic probe <400> 170 cactgtatct gcaaatgaac ag <210> 171 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic probe <400> 171 ccctgtatct gcaaatgaac ag <210> 172 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic probe

<400> 172 ccgcctacct gcagtggagc ag

<210> 173 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 173 cgctgtatct gcaaatgaac ag

<210> 174 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 174 cggcatatct gcagatctgc ag

<210> 175 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 175 cggcgtatct gcaaatgaac ag 22

<210> 176 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 176 ctgcctacct gcagtggagc ag 22

<210> 177 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 177 tcgcctatct gcaaatgaac ag 22

<210> 178 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 178 acatggagct gagcagcctg ag 22

<210> 179 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 179 acatggagct gagcaggctg ag 22

<210> 180 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 180 acatggagct gaggagcctg ag

<210> 181 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 181 acctgcagtg gagcagcctg aa

<210> 182 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 182 atctgcaaat gaacagcctg aa

<210> 183 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 183 atctgcaaat gaacagcctg ag

<210> 184 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 184 atctgcaaat gaacagtctg ag

<210> 185 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 185 atctgcagat ctgcagccta aa

<210> 186 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 186 atcttcaaat gaacagcctg ag

<210> 187 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 187 atcttcaaat gggcagcctg ag

<210> 188 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 188 ccctgaagct gagctctgtg ac

<210> 189 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 189 ccctgcagct gaactctgtg ac

<210> 190 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 190 tccttacaat gaccaacatg ga

<210> 191 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 191 tccttaccat gaccaacatg ga 22

<210> 192 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 192 ccgtgtatta ctgtgcgaga ga 22 <210> 193 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 193 ctgtgtatta ctgtgcgaga ga

<210> 194 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 194 ccgtgtatta ctgtgcgaga gg 22

<210> 195 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 195 ccgtgtatta ctgtgcaaca ga

<210> 196 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 196 ccatgtatta ctgtgcaaga ta

<210> 197 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 197 ccgtgtatta ctgtgcggca ga

<210> 198 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 198 ccacatatta ctgtgcacac ag

<210> 199 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 199 ccacatatta ctgtgcacgg at <210> 200 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 200 ccacgtatta ctgtgcacgg at 22 <210> 201 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 201 ccttgtatta ctgtgcaaaa ga 22 <210> 202 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 202 ctgtgtatta ctgtgcaaga ga 22 <210> 203 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 203 ccgtgtatta ctgtaccaca ga <210> 204 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 204 ccttgtatca ctgtgcgaga ga 22 <210> 205 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 205 ccgtatatta ctgtgcgaaa ga 22 <210> 206 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 206 ctgtgtatta ctgtgcgaaa ga 22

<210> 207 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 207 ccgtgtatta ctgtactaga ga 22

<210> 208 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 208 ccgtgtatta ctgtgctaga ga

<210> 209 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 209 ccgtgtatta ctgtactaga ca

<210> 210 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 210 ctgtgtatta ctgtaagaaa ga

<210> 211 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 211 ccgtgtatta ctgtgcgaga aa

<210> 212 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 212 ccgtgtatta ctgtgccaga ga

<210> 213 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 213 ctgtgtatta ctgtgcgaga ca

<210> 214 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 214 ccatgtatta ctgtgcgaga ca

<210> 215 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 215 ccatgtatta ctgtgcgaga aa 22 <210> 216 <211> 90 <212> DNA <213> Homo sapiens <400> 216 caggtgcagc tggtgcagtc tggggctgag gtgaagaagc ctggggcctc agtgaaggtc 60 tcctgcaagg cttctggata caccttcacc 90

<210> 217 <211> 90 <212> DNA <213> Homo sapiens

<400> 217 caggtccagc ttgtgcagtc tggggctgag gtgaagaagc ctggggcctc agtgaaggtt tcctgcaagg cttctggata caccttcact

<210> 218 <211> 90 <212> DNA <213> Homo sapiens <400> 218 caggtgcagc tggtgcagtc tggggctgag gtgaagaagc ctggggcctc agtgaaggtc tcctgcaagg cttctggata caccttcacc

<210> 219 <211> 90 <212> DNA <213> Homo sapiens <400> 219 caggttcagc tggtgcagtc tggagctgag gtgaagaagc ctggggcctc agtgaaggtc tcctgcaagg cttctggtta cacctttacc

<210> 220 <211> 90 <212> DNA <213> Homo sapiens <400> 220 caggtccagc tggtacagtc tggggctgag gtgaagaagc ctggggcctc agtgaaggtc tcctgcaagg tttccggata caccctcact

<210> 221 <211> 90 <212> DNA <213> Homo sapiens <400> 221 cagatgcagc tggtgcagtc tggggctgag gtgaagaaga ctgggtcctc agtgaaggtt tcctgcaagg cttccggata caccttcacc

<210> 222 <211> 90 <212> DNA <213> Homo sapiens <400> 222 caggtgcagc tggtgcagtc tggggctgag gtgaagaagc ctggggcctc agtgaaggtt tcctgcaagg catctggata caccttcacc

<210> 223 <211> 90 <212> DNA <213> Homo sapiens <400> 223 caaatgcagc tggtgcagtc tgggcctgag gtgaagaagc ctgggacctc agtgaaggtc tcctgcaagg cttctggatt cacctttact

<210> 224 <211> 90 <212> DNA <213> Homo sapiens <400> 224 caggtgcagc tggtgcagtc tggggctgag gtgaagaagc ctgggtcctc ggtgaaggtc tcctgcaagg cttctggagg caccttcagc

<210> 225 <211> 90 <212> DNA <213> Homo sapiens <400> 225 caggtgcagc tggtgcagtc tggggctgag gtgaagaagc ctgggtcctc ggtgaaggtc tcctgcaagg cttctggagg caccttcagc

<210> 226 <211> 90 <212> DNA <213> Homo sapiens <400> 226 gaggtccagc tggtacagtc tggggctgag gtgaagaagc ctggggctac agtgaaaatc tcctgcaagg tttctggata caccttcacc

<210> 227 <211> 90 <212> DNA <213> Homo sapiens <400> 227 cagatcacct tgaaggagtc tggtcctacg ctggtgaaac ccacacagac cctcacgctg acctgcacct tctctgggtt ctcactcagc

<210> 228 <211> 90 <212> DNA <213> Homo sapiens <400> 228 caggtcacct tgaaggagtc tggtcctgtg ctggtgaaac ccacagagac cctcacgctg acctgcaccg tctctgggtt ctcactcagc

<210> 229 <211> 90 <212> DNA <213> Homo sapiens <400> 229 caggtcacct tgaaggagtc tggtcctgcg ctggtgaaac ccacacagac cctcacactg 60 acctgcacct tctctgggtt ctcactcagc 90

<210> 230 <211> 90 <212> DNA <213> Homo sapiens <400> 230 gaggtgcagc tggtggagtc tgggggaggc ttggtccagc ctggggggtc cctgagactc 60 tcctgtgcag cctctggatt cacctttagt 90

<210> 231 <211> 90 <212> DNA <213> Homo sapiens <400> 231 gaagtgcagc tggtggagtc tgggggaggc ttggtacagc ctggcaggtc cctgagactc 60 tcctgtgcag cctctggatt cacctttgat 90

<210> 232 <211> 90 <212> DNA <213> Homo sapiens <400> 232 caggtgcagc tggtggagtc tgggggaggc ttggtcaagc ctggagggtc cctgagactc 60 tcctgtgcag cctctggatt caccttcagt 90

<210> 233 <211> 90 <212> DNA <213> Homo sapiens <400> 233 gaggtgcagc tggtggagtc tgggggaggc ttggtacagc ctggggggtc cctgagactc 60 tcctgtgcag cctctggatt caccttcagt 90

<210> 234 <211> 90 <212> DNA <213> Homo sapiens <400> 234 gaggtgcagc tggtggagtc tgggggaggc ttggtaaagc ctggggggtc ccttagactc 60 tcctgtgcag cctctggatt cactttcagt

<210> 235 <211> 90 <212> DNA <213> Homo sapiens <400> 235 gaggtgcagc tggtggagtc tgggggaggt gtggtacggc ctggggggtc cctgagactc tcctgtgcag cctctggatt cacctttgat <210> 236 <211> 90 <212> DNA <213> Homo sapiens <400> 236 gaggtgcagc tggtggagtc tgggggaggc ctggtcaagc ctggggggtc cctgagactc tcctgtgcag cctctggatt caccttcagt <210> 237 <211> 90 <212> DNA <213> Homo sapiens <400> 237 gaggtgcagc tgttggagtc tgggggaggc ttggtacagc ctggggggtc cctgagactc tcctgtgcag cctctggatt cacctttagc <210> 238 <211> 90 <212> DNA <213> Homo sapiens <400> 238 caggtgcagc tggtggagtc tgggggaggc gtggtccagc ctgggaggtc cctgagactc tcctgtgcag cctctggatt caccttcagt <210> 239 <211> 90 <212> DNA <213> Homo sapiens <400> 239 caggtgcagc tggtggagtc tgggggaggc gtggtccagc ctgggaggtc cctgagactc tcctgtgcag cctctggatt caccttcagt <210> 240 <211> 90 <212> DNA <213> Homo sapiens

<400> 240 caggtgcagc tggtggagtc tgggggaggc gtggtccagc ctgggaggtc cctgagactc 60 tcctgtgcag cctctggatt caccttcagt 90 <210> 241 <211> 90 <212> DNA <213> Homo sapiens <400> 241 caggtgcagc tggtggagtc tgggggaggc gtggtccagc ctgggaggtc cctgagactc 60 tcctgtgcag cgtctggatt caccttcagt 90 <210> 242 <211> 90 <212> DNA <213> Homo sapiens <400> 242 gaagtgcagc tggtggagtc tgggggagtc gtggtacagc ctggggggtc cctgagactc 60 tcctgtgcag cctctggatt cacctttgat 90 <210> 243 <211> 90 <212> DNA <213> Homo sapiens <400> 243 gaggtgcagc tggtggagtc tgggggaggc ttggtacagc ctggggggtc cctgagactc 60 tcctgtgcag cctctggatt caccttcagt 90 <210> 244 <211> 90 <212> DNA <213> Homo sapiens <400> 244 gaggtgcagc tggtggagtc tgggggaggc ttggtacagc cagggcggtc cctgagactc 60 tcctgtacag cttctggatt cacctttggt 90 <210> 245 <211> 90 <212> DNA <213> Homo sapiens <400> 245 gaggtgcagc tggtggagac tggaggaggc ttgatccagc ctggggggtc cctgagactc 60 tcctgtgcag cctctgggtt caccgtcagt 90

<210> 246 <211> 90 <212> DNA

<213> Homo sapiens <400> 246 gaggtgcagc tggtggagtc tgggggaggc ttggtccagc ctggggggtc cctgagactc tcctgtgcag cctctggatt caccttcagt

<210> 247 <211> 90 <212> DNA <213> Homo sapiens <400> 247 gaggtgcagc tggtggagtc tgggggaggc ttggtccagc ctggggggtc cctgagactc tcctgtgcag cctctggatt caccgtcagt

<210> 248 <211> 90 <212> DNA <213> Homo sapiens <400> 248 gaggtgcagc tggtggagtc tgggggaggc ttggtccagc ctggagggtc cctgagactc tcctgtgcag cctctggatt caccttcagt

<210> 249 <211> 90 <212> DNA <213> Homo sapiens <400> 249 gaggtgcagc tggtggagtc tgggggaggc ttggtccagc ctggggggtc cctgaaactc tcctgtgcag cctctgggtt caccttcagt

<210> 250 <211> 90 <212> DNA <213> Homo sapiens <400> 250 gaggtgcagc tggtggagtc cgggggaggc ttagttcagc ctggggggtc cctgagactc tcctgtgcag cctctggatt caccttcagt

<210> 251 <211> 90 <212> DNA <213> Homo sapiens <400> 251 gaggtgcagc tggtggagtc tcggggagtc ttggtacagc ctggggggtc cctgagactc tcctgtgcag cctctggatt caccgtcagt

<210> 252 <211> 90 <212> DNA <213> Homo sapiens <400> 252 caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcggggac cctgtccctc acctgcgctg tctctggtgg ctccatcagc

<210> 253 <211> 90 <212> DNA <213> Homo sapiens <400> 253 caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcggacac cctgtccctc acctgcgctg tctctggtta ctccatcagc

<210> 254 <211> 90 <212> DNA <213> Homo sapiens <400> 254 caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcacagac cctgtccctc 60 acctgcactg tctctggtgg ctccatcagc 90

<210> 255 <211> 90 <212> DNA <213> Homo sapiens <400> 255 cagctgcagc tgcaggagtc cggctcagga ctggtgaagc cttcacagac cctgtccctc 60 acctgcgctg tctctggtgg ctccatcagc 90

<210> 256 <211> 90 <212> DNA <213> Homo sapiens <400> 256 caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcacagac cctgtccctc 60 acctgcactg tctctggtgg ctccatcagc 90

<210> 257 <211> 90 <212> DNA <213> Homo sapiens <400> 257 caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcacagac cctgtccctc 60 acctgcactg tctctggtgg ctccatcagc 90

<210> 258 <211> 90 <212> DNA <213> Homo sapiens <400> 258 caggtgcagc tacagcagtg gggcgcagga ctgttgaagc cttcggagac cctgtccctc 60 acctgcgctg tctatggtgg gtccttcagt 90

<210> 259 <211> 90 <212> DNA <213> Homo sapiens <400> 259 cagctgcagc tgcaggagtc gggcccagga ctggtgaagc cttcggagac cctgtccctc 60 acctgcactg tctctggtgg ctccatcagc 90

<210> 260 <211> 90 <212> DNA <213> Homo sapiens <400> 260 caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcggagac cctgtccctc 60 acctgcactg tctctggtgg ctccatcagt 90

<210> 261 <211> 90 <212> DNA <213> Homo sapiens <400> 261 caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcggagac cctgtccctc 60 acctgcactg tctctggtgg ctccgtcagc 90

<210> 262 <211> 90 <212> DNA <213> Homo sapiens <400> 262 caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcggagac cctgtccctc 60 acctgcgctg tctctggtta ctccatcagc 90

<210> 263 <211> 90 <212> DNA <213> Homo sapiens <400> 263 gaggtgcagc tggtgcagtc tggagcagag gtgaaaaagc ccggggagtc tctgaagatc tcctgtaagg gttctggata cagctttacc

<210> 264 <211> 90 <212> DNA <213> Homo sapiens <400> 264 gaagtgcagc tggtgcagtc tggagcagag gtgaaaaagc ccggggagtc tctgaggatc tcctgtaagg gttctggata cagctttacc

<210> 265 <211> 90 <212> DNA <213> Homo sapiens <400> 265 caggtacagc tgcagcagtc aggtccagga ctggtgaagc cctcgcagac cctctcactc acctgtgcca tctccgggga cagtgtctct

<210> 266 <211> 90 <212> DNA <213> Homo sapiens <400> 266 caggtgcagc tggtgcaatc tgggtctgag ttgaagaagc ctggggcctc agtgaaggtt tcctgcaagg cttctggata caccttcact

<210> 267 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 267 ccgtgtatta ctgtgcgaga ga <210> 268 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 268 ctgtgtatta ctgtgcgaga ga

<210> 269 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 269 ccgtgtatta ctgtgcgaga gg

<210> 270 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 270 ccgtatatta ctgtgcgaaa ga

<210> 271 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 271 ctgtgtatta ctgtgcgaaa ga

<210> 272 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 272 ctgtgtatta ctgtgcgaga ca

<210> 273 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 273 ccatgtatta ctgtgcgaga ca 22 <210> 274 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 274 ccatgtatta ctgtgcgaga aa 22 <210> 275 <211> 69 <212> DNA <213> Homo sapiens <400> 275 gacatccaga tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 atcacttgc 69 <210> 276 <211> 69 <212> DNA <213> Homo sapiens <400> 276 gacatccaga tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 atcacttgc 69 <210> 277 <211> 69 <212> DNA <213> Homo sapiens <400> 277 gacatccaga tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 atcacttgc 69 <210> 278 <211> 69 <212> DNA <213> Homo sapiens

<400> 278 gacatccaga tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 atcacttgc 69

<210> 279 <211> 69 <212> DNA <213> Homo sapiens <400> 279 gacatccaga tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 atcacttgc 69

<210> 280 <211> 69 <212> DNA <213> Homo sapiens <400> 280 gacatccaga tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 atcacttgc 69

<210> 281 <211> 69 <212> DNA <213> Homo sapiens <400> 281 aacatccaga tgacccagtc tccatctgcc atgtctgcat ctgtaggaga cagagtcacc 60 atcacttgt 69

<210> 282 <211> 69 <212> DNA <213> Homo sapiens <400> 282 gacatccaga tgacccagtc tccatcctca ctgtctgcat ctgtaggaga cagagtcacc 60 atcacttgt 69

<210> 283 <211> 69 <212> DNA <213> Homo sapiens <400> 283 gacatccaga tgacccagtc tccatcctca ctgtctgcat ctgtaggaga cagagtcacc 60 atcacttgt 69

<210> 284 <211> 69 <212> DNA <213> Homo sapiens <400> 284 gccatccagt tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc atcacttgc

<210> 285 <211> 69 <212> DNA <213> Homo sapiens <400> 285 gccatccagt tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc atcacttgc

<210> 286 <211> 69 <212> DNA <213> Homo sapiens <400> 286 gacatccaga tgacccagtc tccatcttcc gtgtctgcat ctgtaggaga cagagtcacc atcacttgt

<210> 287 <211> 69 <212> DNA <213> Homo sapiens <400> 287 gacatccaga tgacccagtc tccatcttct gtgtctgcat ctgtaggaga cagagtcacc atcacttgt

<210> 288 <211> 69 <212> DNA <213> Homo sapiens <400> 288 gacatccagt tgacccagtc tccatccttc ctgtctgcat ctgtaggaga cagagtcacc atcacttgc

<210> 289 <211> 69 <212> DNA <213> Homo sapiens <400> 289 gccatccgga tgacccagtc tccattctcc ctgtctgcat ctgtaggaga cagagtcacc atcacttgc

<210> 290 <211> 69 <212> DNA <213> Homo sapiens <400> 290 gccatccgga tgacccagtc tccatcctca ttctctgcat ctacaggaga cagagtcacc 60 atcacttgt 69

<210> 291 <211> 69 <212> DNA <213> Homo sapiens <400> 291 gtcatctgga tgacccagtc tccatcctta ctctctgcat ctacaggaga cagagtcacc 60 atcagttgt 69

<210> 292 <211> 69 <212> DNA <213> Homo sapiens <400> 292 gccatccaga tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 atcacttgc 69

<210> 293 <211> 69 <212> DNA <213> Homo sapiens <400> 293 gacatccaga tgacccagtc tccttccacc ctgtctgcat ctgtaggaga cagagtcacc 60 atcacttgc 69

<210> 294 <211> 69 <212> DNA <213> Homo sapiens <400> 294 gatattgtga tgacccagac tccactctcc ctgcccgtca cccctggaga gccggcctcc 60 atctcctgc 69

<210> 295 <211> 69 <212> DNA <213> Homo sapiens <400> 295 gatattgtga tgacccagac tccactctcc ctgcccgtca cccctggaga gccggcctcc 60 atctcctgc 69

<210> 296 <211> 69 <212> DNA <213> Homo sapiens <400> 296 gatgttgtga tgactcagtc tccactctcc ctgcccgtca cccttggaca gccggcctcc 60 atctcctgc 69

<210> 297 <211> 69 <212> DNA <213> Homo sapiens <400> 297 gatgttgtga tgactcagtc tccactctcc ctgcccgtca cccttggaca gccggcctcc atctcctgc

<210> 298 <211> 69 <212> DNA <213> Homo sapiens <400> 298 gatattgtga tgacccagac tccactctct ctgtccgtca cccctggaca gccggcctcc atctcctgc

<210> 299 <211> 69 <212> DNA <213> Homo sapiens <400> 299 gatattgtga tgacccagac tccactctct ctgtccgtca cccctggaca gccggcctcc atctcctgc

<210> 300 <211> 69 <212> DNA <213> Homo sapiens <400> 300 gatattgtga tgactcagtc tccactctcc ctgcccgtca cccctggaga gccggcctcc atctcctgc

<210> 301 <211> 69 <212> DNA <213> Homo sapiens <400> 301 gatattgtga tgactcagtc tccactctcc ctgcccgtca cccctggaga gccggcctcc atctcctgc

<210> 302 <211> 69 <212> DNA <213> Homo sapiens <400> 302 gatattgtga tgacccagac tccactctcc tcacctgtca cccttggaca gccggcctcc atctcctgc

<210> 303 <211> 69 <212> DNA <213> Homo sapiens <400> 303 gaaattgtgt tgacgcagtc tccaggcacc ctgtctttgt ctccagggga aagagccacc ctctcctgc

<210> 304 <211> 69 <212> DNA <213> Homo sapiens <400> 304 gaaattgtgt tgacgcagtc tccagccacc ctgtctttgt ctccagggga aagagccacc ctctcctgc

<210> 305 <211> 69 <212> DNA <213> Homo sapiens <400> 305 gaaatagtga tgacgcagtc tccagccacc ctgtctgtgt ctccagggga aagagccacc ctctcctgc

<210> 306 <211> 69 <212> DNA <213> Homo sapiens <400> 306 gaaatagtga tgacgcagtc tccagccacc ctgtctgtgt ctccagggga aagagccacc ctctcctgc

<210> 307 <211> 69

<212> DNA <213> Homo sapiens <400> 307 gaaattgtgt tgacacagtc tccagccacc ctgtctttgt ctccagggga aagagccacc 60 ctctcctgc 69

<210> 308 <211> 69 <212> DNA <213> Homo sapiens <400> 308 gaaattgtgt tgacacagtc tccagccacc ctgtctttgt ctccagggga aagagccacc 60 ctctcctgc 69

<210> 309 <211> 69 <212> DNA <213> Homo sapiens <400> 309 gaaattgtaa tgacacagtc tccagccacc ctgtctttgt ctccagggga aagagccacc 60 ctctcctgc 69

<210> 310 <211> 69 <212> DNA <213> Homo sapiens <400> 310 gacatcgtga tgacccagtc tccagactcc ctggctgtgt ctctgggcga gagggccacc 60 atcaactgc 69

<210> 311 <211> 69 <212> DNA <213> Homo sapiens <400> 311 gaaacgacac tcacgcagtc tccagcattc atgtcagcga ctccaggaga caaagtcaac 60 atctcctgc 69

<210> 312 <211> 69 <212> DNA <213> Homo sapiens <400> 312 gaaattgtgc tgactcagtc tccagacttt cagtctgtga ctccaaagga gaaagtcacc 60 atcacctgc 69

<210> 313 <211> 69 <212> DNA <213> Homo sapiens <400> 313 gaaattgtgc tgactcagtc tccagacttt cagtctgtga ctccaaagga gaaagtcacc atcacctgc

<210> 314 <211> 69 <212> DNA <213> Homo sapiens <400> 314 gatgttgtga tgacacagtc tccagctttc ctctctgtga ctccagggga gaaagtcacc atcacctgc

<210> 315 <211> 66 <212> DNA <213> Homo sapiens <400> 315 cagtctgtgc tgactcagcc accctcggtg tctgaagccc ccaggcagag ggtcaccatc tcctgt

<210> 316 <211> 66 <212> DNA <213> Homo sapiens <400> 316 cagtctgtgc tgacgcagcc gccctcagtg tctggggccc cagggcagag ggtcaccatc tcctgc

<210> 317 <211> 66 <212> DNA <213> Homo sapiens <400> 317 cagtctgtgc tgactcagcc accctcagcg tctgggaccc ccgggcagag ggtcaccatc tcttgt

<210> 318 <211> 66 <212> DNA <213> Homo sapiens <400> 318 cagtctgtgc tgactcagcc accctcagcg tctgggaccc ccgggcagag ggtcaccatc tcttgt

<210> 319 <211> 66

<212> DNA <213> Homo <400> 319 cagtctgtgt tcctgc sapiens tgacgcagcc <210> 320 <211> 66 <212> DNA <213> Homo sapiens <400> 320 cagtctgccc tgactcagcc tcctgc <210> 321 <211> 66 <212> DNA <213> Homo sapiens <400> 321 cagtctgccc tgactcagcc tcctgc <210> 322 <211> 66 <212> DNA <213> Homo sapiens <400> 322 cagtctgccc tgactcagcc tcctgc <210> 323 <211> 66 <212> DNA <213> Homo sapiens <400> 323 cagtctgccc tgactcagcc tcctgc <210> 324 <211> 66 <212> DNA <213> Homo sapiens

<400> 324

cagtctgccc tgactcagcc tgcctccgtg tctgggtctc ctggacagtc gatcaccatc 60 tcctgc 66

<210> 325 <211> 66 <212> DNA <213> Homo sapiens <400> 325 tcctatgagc tgactcagcc accctcagtg tccgtgtccc caggacagac agccagcatc 60 acctgc 66

<210> 326 <211> 66 <212> DNA <213> Homo sapiens <400> 326 tcctatgagc tgactcagcc actctcagtg tcagtggccc tgggacagac ggccaggatt 60 acctgt 66

<210> 327 <211> 66 <212> DNA <213> Homo sapiens <400> 327 tcctatgagc tgacacagcc accctcggtg tcagtgtccc caggacaaac ggccaggatc 60 acctgc 66

<210> 328 <211> 66 <212> DNA <213> Homo sapiens <400> 328 tcctatgagc tgacacagcc accctcggtg tcagtgtccc taggacagat ggccaggatc 60 acctgc 66

<210> 329 <211> 66 <212> DNA <213> Homo sapiens <400> 329 tcttctgagc tgactcagga ccctgctgtg tctgtggcct tgggacagac agtcaggatc 60 acatgc 66

<210> 330 <211> 66 <212> DNA <213> Homo sapiens

<400> 330 tcctatgtgc tgactcagcc accctcagtg tcagtggccc caggaaagac ggccaggatt 60 acctgt 66

<210> 331 <211> 66 <212> DNA <213> Homo sapiens <400> 331 tcctatgagc tgacacagct accctcggtg tcagtgtccc caggacagac agccaggatc 60 acctgc 66

<210> 332 <211> 66 <212> DNA <213> Homo sapiens <400> 332 tcctatgagc tgatgcagcc accctcggtg tcagtgtccc caggacagac ggccaggatc 60 acctgc 66

<210> 333 <211> 66 <212> DNA <213> Homo sapiens <400> 333 tcctatgagc tgacacagcc atcctcagtg tcagtgtctc cgggacagac agccaggatc 60 acctgc 66

<210> 334 <211> 66 <212> DNA <213> Homo sapiens <400> 334 ctgcctgtgc tgactcagcc cccgtctgca tctgccttgc tgggagcctc gatcaagctc 60 acctgc 66

<210> 335 <211> 66 <212> DNA <213> Homo sapiens <400> 335 cagcctgtgc tgactcaatc atcctctgcc tctgcttccc tgggatcctc ggtcaagctc 60 acctgc 66

<210> 336 <211> 66

<212> DNA <213> Homo sapiens <400> 336 cagcttgtgc tgactcaatc gccctctgcc tctgcctccc tgggagcctc ggtcaagctc 60 acctgc 66

<210> 337 <211> 66 <212> DNA <213> Homo sapiens <400> 337 cagcctgtgc tgactcagcc accttcctcc tccgcatctc ctggagaatc cgccagactc 60 acctgc 66

<210> 338 <211> 66 <212> DNA <213> Homo sapiens <400> 338 caggctgtgc tgactcagcc ggcttccctc tctgcatctc ctggagcatc agccagtctc 60 acctgc 66

<210> 339 <211> 66 <212> DNA <213> Homo sapiens <400> 339 cagcctgtgc tgactcagcc atcttcccat tctgcatctt ctggagcatc agtcagactc 60 acctgc 66

<210> 340 <211> 66 <212> DNA <213> Homo sapiens <400> 340 aattttatgc tgactcagcc ccactctgtg tcggagtctc cggggaagac ggtaaccatc 60 tcctgc 66

<210> 341 <211> 66 <212> DNA <213> Homo sapiens <400> 341 cagactgtgg tgactcagga gccctcactg actgtgtccc caggagggac agtcactctc 60 acctgt 66

<210> 342 <211> 66 <212> DNA <213> Homo sapiens <400> 342 caggctgtgg tgactcagga gccctcactg actgtgtccc caggagggac agtcactctc acctgt <210> 343 <211> 66 <212> DNA <213> Homo sapiens <400> 343 cagactgtgg tgacccagga gccatcgttc tcagtgtccc ctggagggac agtcacactc acttgt <210> 344 <211> 66 <212> DNA <213> Homo sapiens <400> 344 cagcctgtgc tgactcagcc accttctgca tcagcctccc tgggagcctc ggtcacactc acctgc <210> 345 <211> 66 <212> DNA <213> Homo sapiens <400> 345 caggcagggc tgactcagcc accctcggtg tccaagggct tgagacagac cgccacactc acctgc <210> 346 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (1)..(6) <223> A, T, C, G, other or unknown <400> 346 nnnnnngact c

<210> 347 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (6) .. (11) <223> A, T, C, G, other or unknown <400> 347 gagtcnnnnn n 11 <210> 348 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (3)..(9) <223> A, T, C, G, other or unknown <400> 348 gcnnnnnnng c 11 <210> 349 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(11) <223> A, T, C, G, other or unknown <400> 349 acctgcnnnn n 11 <210> 350 <211> 25 <212> DNA <213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 350 cacatccgtg ttgttcacgg atgtg 25 <210> 351 <211> 88 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 351 aatagtagac tgcagtgtcc tcagccctta agctgttcat ctgcaagtag agagtattct 60 tagagttgtc tctagactta gtgaagcg 88 <210> 352 <211> 88 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 352 cgcttcacta agtctagaga caactctaag aatactctct acttgcagat gaacagctta 60 agggctgagg acactgcagt ctactatt 88 <210> 353 <211> 95 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 353 cgcttcacta agtctagaga caactctaag aatactctct acttgcagat gaacagctta 60 agggctgagg acactgcagt ctactattgt gcgag 95 <210> 354 <211> 95 <212> DNA <213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 354 cgcttcacta agtctagaga caactctaag aatactctct acttgcagat gaacagctta 60 agggctgagg acactgcagt ctactattgt acgag 95 <210> 355 <211> 24 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 355 cgcttcacta agtctagaga caac <210> 356 <211> 15 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (8)..(15) <223> A, T, C, G, other or unknown <400> 356 cacctgcnnn nnnnn <210> 357 <211> 17 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7) .. (17) <223> A, T, C, G, other or unknown <400> 357 cagctcnnnn nnnnnnn

<210> 358 <211> 17 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7) .. (17) <223> A, T, C, G, other or unknown <400> 358 gaagacnnnn nnnnnnn 17 <210> 359 <211> 17 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (6)..(17) <223> A, T, C, G, other or unknown <400> 359 gcagcnnnnn nnnnnnn 17 <210> 360 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(12) <223> A, T, C, G, other or unknown <400> 360 gaagacnnnn nn <210> 361 <211> 22 <212> DNA <213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7) .. (22) <223> A, T, C, G, other or unknown <400> 361 cttgagnnnn nnnnnnnnnn nn 22 <210> 362 <211> 19 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (6)..(19) <223> A, T, C, G, other or unknown <400> 362 acggcnnnnn nnnnnnnnn 19 <210> 363 <211> 18 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (6)..(18) <223> A, T, C, G, other or unknown <400> 363 acggcnnnnn nnnnnnnn 18 <210> 364 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified_base <222> (7)..(12) <223> A, T, C, G, other or unknown <400> 364 gtatccnnnn nn <210> 365 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(11) <223> A, T, C, G, other or unknown <400> 365 actgggnnnn n <210> 366 <211> 10 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (6)..(10) <223> A, T, C, G, other or unknown <400> 366 ggatcnnnnn <210> 367 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (6)..(11)

<223> A, T, C, G, other or unknown <400> 367 gcatcnnnnn n <210> 368 <211> 16 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(16) <223> A, T, C, G, other or unknown <400> 368 gaggagnnnn nnnnnn <210> 369 <211> 19 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (6)..(19) <223> A, T, C, G, other or unknown <400> 369 gggacnnnnn nnnnnnnnn <210> 370 <211> 14 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(14) <223> A, T, C, G, other or unknown <400> 370 acctgcnnnn nnnn

<210> 371 <211> 17 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(17) <223> A, T, C, G, other or unknown <400> 371 ggcggannnn nnnnnnn 17 <210> 372 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(22) <223> A, T, C, G, other or unknown <400> 372 ctgaagnnnn nnnnnnnnnn nn 22 <210> 373 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (6)..(11) <223> A, T, C, G, other or unknown <400> 373 cccgcnnnnn n 11 <210> 374 <211> 18

<212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (6) .. (18) <223> A, T, C, G, other or unknown <400> 374 ggatgnnnnn nnnnnnnn 18 <210> 375 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(22) <223> A, T, C, G, other or unknown <400> 375 ctggagnnnn nnnnnnnnnn nn 22 <210> 376 <211> 15 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (6) .. (15) <223> A, T, C, G, other or unknown <400> 376 gacgcnnnnn nnnnn 15 <210> 377 <211> 13 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (6)..(13) <223> A, T, C, G, other or unknown <400> 377 ggtgannnnn nnn <210> 378 <211> 13 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (6)..(13) <223> A, T, C, G, other or unknown <400> 378 gaagannnnn nnn <210> 379 <211> 10 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (6) .. (10) <223> A, T, C, G, other or unknown <400> 379 gagtcnnnnn <210> 380 <211> 26 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified_base <222> (7) .. (26) <223> A, T, C, G, other or unknown <400> 380 tccracnnnn nnnnnnnnnn nnnnnn <210> 381 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (5)..(11) <223> A, T, C, G, other or unknown <400> 381 cctcnnnnnn n <210> 382 <211> 10 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (6)..(10) <223> A, T, C, G, other or unknown <400> 382 gagtcnnnnn <210> 383 <211> 18 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(18) <223> A, T, C, G, other or unknown


<400> 383 cccacannnn nnnnnnnn <210> 384 <211> 14 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (6)..(14) <223> A, T, C, G, other or unknown <400> 384 gcatcnnnnn nnnn <210> 385 <211> 13 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (6)..(13) <223> A, T, C, G, other or unknown <400> 385 ggtgannnnn nnn <210> 386 <211> 12 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (5)..(12) <223> A, T, C, G, other or unknown <400> 386 cccgnnnnnn nn 12

<210> 387 <211> 19 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified_base <222> (6)..(19) <223> A, T, C, G, other or unknown <400> 387 ggatgnnnnn nnnnnnnnn 19 <210> 388 <211> 17 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(17) <223> A, T, C, G, other or unknown <400> 388 gaccgannnn nnnnnnn 17 <210> 389 <211> 17 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(17) <223> A, T, C, G, other or unknown <400> 389 cacccannnn nnnnnnn <210> 390 <211> 17 <212> DNA

<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(17) <223> A, T, C, G, other or unknown <400> 390 caarcannnn nnnnnnn 17 <210> 391 <211> 20 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic probe <400> 391 gctgtgtatt actgtgcgag <210> 392 <211> 20 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic probe <400> 392 gccgtgtatt actgtgcgag 20 <210> 393 <211> 20 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic probe <400> 393 gccgtatatt actgtgcgag 20 <210> 394 <211> 20 <212> DNA <213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic probe <400> 394 gccgtgtatt actgtacgag

<210> 395 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic probe <400> 395 gccatgtatt actgtgcgag

<210> 396 <211> 25 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 396 cacatccgtg ttgttcacgg atgtg

<210> 397 <211> 88 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 397 aatagtagac tgcagtgtcc tcagccctta agctgttcat ctgcaagtag agagtattct 60 tagagttgtc tctagactta gtgaagcg 88 <210> 398 <211> 95 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 398 cgcttcacta agtctagaga caactctaag aatactctct acttgcagat gaacagctta 60 agggctgagg acactgcagt ctactattgt gcgag 95

<210> 399 <211> 24 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 399 cgcttcacta agtctagaga caac <210> 400 <211> 44 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 400 cacatccgtg ttgttcacgg atgtgggagg atggagactg ggtc <210> 401 <211> 44 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 401 cacatccgtg ttgttcacgg atgtgggaga gtggagactg agtc <210> 402 <211> 44 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 402 cacatccgtg ttgttcacgg atgtgggtgc ctggagactg cgtc 44 <210> 403

<211> 44 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 403 cacatccgtg ttgttcacgg atgtgggtgg ctggagactg cgtc 44 <210> 404 <211> 34 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 404 cctctactct tgtcacagtg cacaagacat ccag 34 <210> 405 <211> 20 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 405 cctctactct tgtcacagtg 20 <210> 406 <211> 44 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 406 ggaggatgga ctggatgtct tgtgcactgt gacaagagta gagg 44

<210> 407 <211> 44 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic

oligonucleotide <400> 407 ggagagtgga ctggatgtct tgtgcactgt gacaagagta gagg 44 <210> 408 <211> 44 <212> DNA

<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 408 ggtgcctgga ctggatgtct tgtgcactgt gacaagagta gagg 44 <210> 409 <211> 44 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 409 ggtggctgga ctggatgtct tgtgcactgt gacaagagta gagg 44 <210> 410 <211> 44 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 410 cacatccgtg ttgttcacgg atgtggatcg actgtccagg agac 44 <210> 411 <211> 44 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 411 cacatccgtg ttgttcacgg atgtggactg tctgtcccaa ggcc 44


<210> 412 <211> 44 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 412 cacatccgtg ttgttcacgg atgtggactg actgtccagg agac <210> 413 <211> 44 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 413 cacatccgtg ttgttcacgg atgtggaccc tctgccctgg ggcc <210> 414 <211> 59 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 414 cctctgactg agtgcacaga gtgctttaac ccaaccggct agtgttagcg gttccccgg 59 <210> 415 <211> 69 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 415 cctctgactg agtgcacaga gtgctttaac ccaaccggct agtgttagcg gttccccggg 60 acagtcgat 69 <210> 416 <211> 69 <212> DNA <213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 416 cctctgactg agtgcacaga gtgctttaac ccaaccggct agtgttagcg gttccccggg 60 acagacagt 69 <210> 417 <211> 69 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 417 cctctgactg agtgcacaga gtgctttaac ccaaccggct agtgttagcg gttccccggg 60 acagtcagt 69 <210> 418 <211> 70 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 418 cctctgactg agtgcacaga gtgctttaac ccaaccggct agtgttagcg gtstccccgg 60 ggcagagggt 70 <210> 419 <211> 24 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 419 cctctgactg agtgcacaga gtgc 24 <210> 420 <211> 13 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide


<220> <221> modified_base <222> (5)..(9) <223> A, T, C, G, other or unknown <400> 420 ggccnnnnng gcc

<210> 421 <211> 15 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(12) <223> A, T, C, G, other or unknown <400> 421 ccannnnnnn nntgg <210> 422 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(9) <223> A, T, C, G, other or unknown <400> 422 cgannnnnnt gc <210> 423 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8)

<223> A, T, C, G, other or unknown <400> 423 gccnnnnngg c <210> 424 <211> 10 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(7) <223> A, T, C, G, other or unknown <400> 424 gatnnnnatc <210> 425 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 425 gacnnnnngt c <210> 426 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4) .. (8) <223> A, T, C, G, other or unknown <400> 426 gcannnnntg c

<210> 427 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7) .. (12) <223> A, T, C, G, other or unknown <400> 427 gtatccnnnn nn 12 <210> 428 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(9) <223> A, T, C, G, other or unknown <400> 428 gacnnnnnng tc 12 <210> 429 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 429 ccannnnntg g 11 <210> 430 <211> 12

<212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (1)..(6) <223> A, T, C, G, other or unknown <400> 430 nnnnnngaga cg 12 <210> 431 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(9) <223> A, T, C, G, other or unknown <400> 431 ccannnnnnt gg 12 <210> 432 <211> 10 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(7) <223> A, T, C, G, other or unknown <400> 432 gaannnnttc 10 <210> 433 <211> 11 <212> DNA <213> Artificial Sequence <220>


<221> modified_base <222> (7)..(11) <223> A, T, C, G, other or unknown <400> 433 ggtctcnnnn n <210> 434 <211> 16 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (1)..(10) <223> A, T, C, G, other or unknown <400> 434 nnnnnnnnnn ctcctc <210> 435 <211> 15 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (1)..(9) <223> A, T, C, G, other or unknown <400> 435 nnnnnnnnnt ccgcc 15 <210> 436 <211> 13 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (5)..(9) <223> A, T, C, G, other or unknown <400> 436 ggccnnnnng gcc <210> 437

<211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(9) <223> A, T, C, G, other or unknown <400> 437 ccannnnnnt gg <210> 438 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(9) <223> A, T, C, G, other or unknown <400> 438 gacnnnnnng tc <210> 439 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified_base <222> (4)..(9) <223> A, T, C, G, other or unknown


<400> 439 cgannnnnnt gc <210> 440 <211> 11 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic

oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 440 gcannnnntg c 11 <210> 441 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 441 ccannnnntg g 11 <210> 442 <211> 10 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(7) <223> A, T, C, G, other or unknown <400> 442 gaannnnttc

<210> 443 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (1)..(6) <223> A, T, C, G, other or unknown <400> 443 nnnnnngaga cg

<210> 444 <211> 12 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (7)..(12) <223> A, T, C, G, other or unknown <400> 444 gtatccnnnn nn

<210> 445 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 445 gacnnnnngt c <210> 446 <211> 11 <212> DNA <213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(11) <223> A, T, C, G, other or unknown

<400> 446 ggtctcnnnn n 11 <210> 447 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 447 gccnnnnngg c 11 <210> 448 <211> 15 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(12) <223> A, T, C, G, other or unknown <400> 448 ccannnnnnn nntgg 15 <210> 449 <211> 16 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified_base <222> (1)..(10) <223> A, T, C, G, other or unknown <400> 449 nnnnnnnnnn ctcctc 16 <210> 450 <211> 15 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (1)..(9) <223> A, T, C, G, other or unknown <400> 450 nnnnnnnnnt ccgcc 15 <210> 451 <211> 9532 <212> DNA <213> Unknown Organism <220>

<223> Description of Unknown Organism: MALIA3 nucleotide sequence <220>

<221> CDS <222> (1579)..(1638) <220>

<221> CDS <222> (2343)..(3443) <220>

<221> CDS <222> (3945)..(4400) <220>

<221> CDS <222> (4406)..(4450) <220>

<221> CDS <222> (4746)..(5789) <400> 451


103 aatgctacta ctattagtag aattgatgcc accttttcag ctcgcgcccc aaatgaaaat 60 atagctaaac aggttattga ccatttgcga aatgtatcta atggtcaaac taaatctact 120 cgttcgcaga attgggaatc aactgttaca tggaatgaaa cttccagaca ccgtacttta 180 gttgcatatt taaaacatgt tgagctacag caccagattc agcaattaag ctctaagcca 240 tccgcaaaaa tgacctctta tcaaaaggag caattaaagg tactctctaa tcctgacctg 300 ttggagtttg cttccggtct ggttcgcttt gaagctcgaa ttaaaacgcg atatttgaag 360 tctttcgggc ttcctcttaa tctttttgat gcaatccgct ttgcttctga ctataatagt 420 cagggtaaag acctgatttt tgatttatgg tcattctcgt tttctgaact gtttaaagca 480 tttgaggggg attcaatgaa tatttatgac gattccgcag tattggacgc tatccagtct 540 aaacatttta ctattacccc ctctggcaaa acttcttttg caaaagcctc tcgctatttt 600 ggtttttatc gtcgtctggt aaacgagggt tatgatagtg ttgctcttac tatgcctcgt 660 aattcctttt ggcgttatgt atctgcatta gttgaatgtg gtattcctaa atctcaactg 720 atgaatcttt ctacctgtaa taatgttgtt ccgttagttc gttttattaa cgtagatttt 780 tcttcccaac gtcctgactg gtataatgag ccagttctta aaatcgcata aggtaattca 840 caatgattaa agttgaaatt aaaccatctc aagcccaatt tactactcgt tctggtgttt 900 ctcgtcaggg caagccttat tcactgaatg agcagctttg ttacgttgat ttgggtaatg 960 aatatccggt tcttgtcaag attactcttg atgaaggtca gccagcctat gcgcctggtc 1020 tgtacaccgt tcatctgtcc tctttcaaag ttggtcagtt cggttccctt atgattgacc 1080 gtctgcgcct cgttccggct aagtaacatg gagcaggtcg cggatttcga cacaatttat 1140 caggcgatga tacaaatctc cgttgtactt tgtttcgcgc ttggtataat cgctgggggt 1200 caaagatgag tgttttagtg tattctttcg cctctttcgt tttaggttgg tgccttcgta 1260 gtggcattac gtattttacc cgtttaatgg aaacttcctc atgaaaaagt ctttagtcct 1320 caaagcctct gtagccgttg ctaccctcgt tccgatgctg tctttcgctg ctgagggtga 1380 cgatcccgca aaagcggcct ttaactccct gcaagcctca gcgaccgaat atatcggtta 1440 tgcgtgggcg atggttgttg tcattgtcgg cgcaactatc ggtatcaagc tgtttaagaa 1500 attcacctcg aaagcaagct gataaaccga tacaattaaa ggctcctttt ggagcctttt 1560 tttttggaga ttttcaac gtg aaa aaa tta tta ttc gca att cct tta gtt 1611

Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val 1 5 10
gtt cct ttc tat tct cac agt gca cag tctgtcgtga cgcagccgcc 1658 Val Pro Phe Tyr Ser His Ser Ala Gln


15 20 ctcagtgtct ggggccccag ggcagagggt caccatctcc tgcactggga gcagctccaa 1718 catcggggca ggttatgatg tacactggta ccagcagctt ccaggaacag cccccaaact 1778 cctcatctat ggtaacagca atcggccctc aggggtccct gaccgattct ctggctccaa 1838 gtctggcacc tcagcctccc tggccatcac tgggctccag gctgaggatg aggctgatta 1898 ttactgccag tcctatgaca gcagcctgag tggcctttat gtcttcggaa ctgggaccaa 1958 ggtcaccgtc ctaggtcagc ccaaggccaa ccccactgtc actctgttcc cgccctcctc 2018 tgaggagctc caagccaaca aggccacact agtgtgtctg atcagtgact tctacccggg 2078 agctgtgaca gtggcctgga aggcagatag cagccccgtc aaggcgggag tggagaccac 2138 cacaccctcc aaacaaagca acaacaagta cgcggccagc agctatctga gcctgacgcc 2198 tgagcagtgg aagtcccaca gaagctacag ctgccaggtc acgcatgaag ggagcaccgt 2258 ggagaagaca gtggccccta cagaatgttc ataataaacc gcctccaccg ggcgcgccaa 2318

ttctatttca aggagacagt cata atg aaa tac cta ttg cct acg gca gcc 2369 Met Lys Tyr Leu Leu Pro Thr Ala Ala

gct gga ttg tta tta ctc gcg gcc cag ccg gcc atg gcc gaa gtt caa 2417 Ala Gly Leu Leu Leu Leu Ala Ala Gln Pro Ala Met Ala Glu Val Gln 30 35 40 45
ttg tta gag tct ggt ggc ggt ctt gtt cag cct ggt ggt tct tta cgt 2465 Leu Leu Glu Ser Gly Gly Gly Leu Val Gln Pro Gly Gly Ser Leu Arg 50 55 60
ctt tct tgc gct gct tcc gga ttc act ttc tct tcg tac gct atg tct 2513 Leu Ser Cys Ala Ala Ser Gly Phe Thr Phe Ser Ser Tyr Ala Met Ser 65 70 75
tgg gtt cgc caa gct cct ggt aaa ggt ttg gag tgg gtt tct gct atc 2561 Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val Ser Ala Ile 80 85 90
tct ggt tct ggt ggc agt act tac tat gct gac tcc gtt aaa ggt cgc 2609 Ser Gly Ser Gly Gly Ser Thr Tyr Tyr Ala Asp Ser Val Lys Gly Arg 95 100 105
ttc act atc tct aga gac aac tct aag aat act ctc tac ttg cag atg 2657 Phe Thr Ile Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met 110 115 120 125
aac agc tta agg gct gag gac act gca gtc tac tat tgc gct aaa gac 2705 Asn Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Lys Asp 130 135 140
tat gaa ggt act ggt tat gct ttc gac ata tgg ggt caa ggt act atg 2753 Tyr Glu Gly Thr Gly Tyr Ala Phe Asp Ile Trp Gly Gln Gly Thr Met 145 150 155
gtc acc gtc tct agt gcc tcc acc aag ggc cca tcg gtc ttc ccc ctg 2801 Val Thr Val Ser Ser Ala Ser Thr Lys Gly Pro Ser Val Phe Pro Leu 160 165 170
gca ccc tcc tcc aag agc acc tct ggg ggc aca gcg gcc ctg ggc tgc 2849 Ala Pro Ser Ser Lys Ser Thr Ser Gly Gly Thr Ala Ala Leu Gly Cys 175 180 185
ctg gtc aag gac tac ttc ccc gaa ccg gtg acg gtg tcg tgg aac tca 2897 Leu Val Lys Asp Tyr Phe Pro Glu Pro Val Thr Val Ser Trp Asn Ser 190 195 200 205
ggc gcc ctg acc agc ggc gtc cac acc ttc ccg gct gtc cta cag tct 2945 Gly Ala Leu Thr Ser Gly Val His Thr Phe Pro Ala Val Leu Gln Ser 210 215 220
agc gga ctc tac tcc ctc agc agc gta gtg acc gtg ccc tct tct agc 2993 Ser Gly Leu Tyr Ser Leu Ser Ser Val Val Thr Val Pro Ser Ser Ser 225 230 235
ttg ggc acc cag acc tac atc tgc aac gtg aat cac aag ccc agc aac 3041 Leu Gly Thr Gln Thr Tyr Ile Cys Asn Val Asn His Lys Pro Ser Asn 240 245 250
acc aag gtg gac aag aaa gtt gag ccc aaa tct tgt gcg gcc gct cat 3089 Thr Lys Val Asp Lys Lys Val Glu Pro Lys Ser Cys Ala Ala Ala His 255 260 265
cac cac cat cat cac tct gct gaa caa aaa ctc atc tca gaa gag gat 3137 His His His His His Ser Ala Glu Gln Lys Leu Ile Ser Glu Glu Asp 270 275 280 285
ctg aat ggt gcc gca gat atc aac gat gat cgt atg gct ggc gcc gct 3185 Leu Asn Gly Ala Ala Asp Ile Asn Asp Asp Arg Met Ala Gly Ala Ala 290 295 300
gaa act gtt gaa agt tgt tta gca aaa ccc cat aca gaa aat tca ttt 3233 Glu Thr Val Glu Ser Cys Leu Ala Lys Pro His Thr Glu Asn Ser Phe 305 310 315
act aac gtc tgg aaa gac gac aaa act tta gat cgt tac gct aac tat 3281 Thr Asn Val Trp Lys Asp Asp Lys Thr Leu Asp Arg Tyr Ala Asn Tyr 320 325 330
gag ggt tgt ctg tgg aat gct aca ggc gtt gta gtt tgt act ggt gac 3329 Glu Gly Cys Leu Trp Asn Ala Thr Gly Val Val Val Cys Thr Gly Asp 335 340 345
gaa act cag tgt tac ggt aca tgg gtt cct att ggg ctt gct atc cct 3377 Glu Thr Gln Cys Tyr Gly Thr Trp Val Pro Ile Gly Leu Ala Ile Pro 350 355 360 365
gaa aat gag ggt ggt ggc tct gag ggt ggc ggt tct gag ggt ggc ggt 3425 Glu Asn Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly 370 375 380

WO 02/083872

PCT/US02/12405

tct gag ggt ggc ggt act aaacctcctg agtacggtga tacacctatt 3473

Ser Glu Gly Gly Gly Thr

385

ccgggctata cttatatcaa ccctctcgac ggcacttatc cgcctggtac tgagcaaaac 3533
cccgctaatc ctaatccttc tcttgaggag tctcagcctc ttaatacttt catgtttcag 3593
aataataggt tccgaaatag gcagggggca ttaactgttt atacgggcac tgttactcaa 3653
ggcactgacc ccgttaaaac ttattaccag tacactcctg tatcatcaaa agccatgtat 3713
gacgcttact ggaacggtaa attcagagac tgcgctttcc attctggctt taatgaagat 3773
ccattcgttt gtgaatatca aggccaatcg tctgacctgc ctcaacctcc tgtcaatgct 3833
ggcggcggct ctggtggtgg ttctggtggc ggctctgagg gtggtggctc tgagggtggc 3893
ggttctgagg gtggcggctc tgagggaggc ggttccggtg gtggctctgg t tcc ggt 3950

Ser Gly

gat ttt gat tat gaa aag atg gca aac gct aat aag ggg gct atg acc 3998
Asp Phe Asp Tyr Glu Lys Met Ala Asn Ala Asn Lys Gly Ala Met Thr
390 395 400 405

gaa aat gcc gat gaa aac gcg cta cag tct gac gct aaa ggc aaa ctt 4046
Glu Asn Ala Asp Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly Lys Leu
410 415 420

gat tct gtc gct act gat tac ggt gct gct atc gat ggt ttc att ggt 4094
Asp Ser Val Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe Ile Gly
425 430 435

gac gtt tcc ggc ctt gct aat ggt aat ggt gct act ggt gat ttt gct 4142
Asp Val Ser Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp Phe Ala
440 445 450

ggc tct aat tcc caa atg gct caa gtc ggt gac ggt gat aat tca cct 4190
Gly Ser Asn Ser Gln Met Ala Gln Val Gly Asp Gly Asp Asn Ser Pro
455 460 465

tta atg aat aat ttc cgt caa tat tta cct tcc ctc cct caa tcg gtt 4238
Leu Met Asn Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln Ser Val
470 475 480 485

gaa tgt cgc cct ttt gtc ttt agc gct ggt aaa cca tat gaa ttt tct 4286
Glu Cys Arg Pro Phe Val Phe Ser Ala Gly Lys Pro Tyr Glu Phe Ser
490 495 500

att gat tgt gac aaa ata aac tta ttc cgt ggt gtc ttt gcg ttt ctt 4334
Ile Asp Cys Asp Lys Ile Asn Leu Phe Arg Gly Val Phe Ala Phe Leu
505 510 515

tta tat gtt gcc acc ttt atg tat gta ttt tct acg ttt gct aac ata 4382
Leu Tyr Val Ala Thr Phe Met Tyr Val Phe Ser Thr Phe Ala Asn Ile
520 525 530

ctg cgt aat aag gag tct taatc atg cca gtt ctt ttg ggt att ccg tta 4432
Leu Arg Asn Lys Glu Ser Met Pro Val Leu Leu Gly Ile Pro Leu
535 540 545

tta ttg cgt ttc ctc ggt ttccttctgg taactttgtt cggctatctg 4480
Leu Leu Arg Phe Leu Gly
550

cttacttttc ttaaaaaggg cttcggtaag atagctattg ctatttcatt gtttcttgct 4540
cttattattg ggcttaactc aattcttgtg ggttatctct ctgatattag cgctcaatta 4600
ccctctgact ttgttcaggg tgttcagtta attctcccgt ctaatgcgct tccctgtttt 4660
tatgttattc tctctgtaaa ggctgctatt ttcatttttg acgttaaaca aaaaatcgtt 4720

tcttatttgg attgggataa ataat atg gct gtt tat ttt gta act ggc aaa 4772
Met Ala Val Tyr Phe Val Thr Gly Lys
555 560

tta ggc tct gga aag acg ctc gtt agc gtt ggt aag att cag gat aaa 4820
Leu Gly Ser Gly Lys Thr Leu Val Ser Val Gly Lys Ile Gln Asp Lys
565 570 575

att gta gct ggg tgc aaa ata gca act aat ctt gat tta agg ctt caa 4868
Ile Val Ala Gly Cys Lys Ile Ala Thr Asn Leu Asp Leu Arg Leu Gln
580 585 590 595

aac ctc ccg caa gtc ggg agg ttc gct aaa acg cct cgc gtt ctt aga 4916
Asn Leu Pro Gln Val Gly Arg Phe Ala Lys Thr Pro Arg Val Leu Arg
600 605 610

ata ccg gat aag cct tct ata tct gat ttg ctt gct att ggg cgc ggt 4964
Ile Pro Asp Lys Pro Ser Ile Ser Asp Leu Leu Ala Ile Gly Arg Gly
615 620 625

aat gat tcc tac gat gaa aat aaa aac ggc ttg ctt gtt ctc gat gag 5012
Asn Asp Ser Tyr Asp Glu Asn Lys Asn Gly Leu Leu Val Leu Asp Glu
630 635 640

tgc ggt act tgg ttt aat acc cgt tct tgg aat gat aag gaa aga cag 5060
Cys Gly Thr Trp Phe Asn Thr Arg Ser Trp Asn Asp Lys Glu Arg Gln
645 650 655

ccg att att gat tgg ttt cta cat gct cgt aaa tta gga tgg gat att 5108
Pro Ile Ile Asp Trp Phe Leu His Ala Arg Lys Leu Gly Trp Asp Ile
660 665 670 675

att ttt ctt gtt cag gac tta tct att gtt gat aaa cag gcg cgt tct 5156
Ile Phe Leu Val Gln Asp Leu Ser Ile Val Asp Lys Gln Ala Arg Ser
680 685 690

gca tta gct gaa cat gtt gtt tat tgt cgt cgt ctg gac aga att act 5204
Ala Leu Ala Glu His Val Val Tyr Cys Arg Arg Leu Asp Arg Ile Thr
695 700 705

tta cct ttt gtc ggt act tta tat tct ctt att act ggc tcg aaa atg 5252
Leu Pro Phe Val Gly Thr Leu Tyr Ser Leu Ile Thr Gly Ser Lys Met
710 715 720

cct ctg cct aaa tta cat gtt ggc gtt gtt aaa tat ggc gat tct caa 5300
Pro Leu Pro Lys Leu His Val Gly Val Val Lys Tyr Gly Asp Ser Gln
725 730 735

tta agc cct act gtt gag cgt tgg ctt tat act ggt aag aat ttg tat 5348
Leu Ser Pro Thr Val Glu Arg Trp Leu Tyr Thr Gly Lys Asn Leu Tyr
740 745 750 755

aac gca tat gat act aaa cag gct ttt tct agt aat tat gat tcc ggt 5396
Asn Ala Tyr Asp Thr Lys Gln Ala Phe Ser Ser Asn Tyr Asp Ser Gly
760 765 770

gtt tat tct tat tta acg cct tat tta tca cac ggt cgg tat ttc aaa 5444
Val Tyr Ser Tyr Leu Thr Pro Tyr Leu Ser His Gly Arg Tyr Phe Lys
775 780 785

cca tta aat tta ggt cag aag atg aaa tta act aaa ata tat ttg aaa 5492
Pro Leu Asn Leu Gly Gln Lys Met Lys Leu Thr Lys Ile Tyr Leu Lys
790 795 800

aag ttt tct cgc gtt ctt tgt ctt gcg att gga ttt gca tca gca ttt 5540
Lys Phe Ser Arg Val Leu Cys Leu Ala Ile Gly Phe Ala Ser Ala Phe
805 810 815

aca tat agt tat ata acc caa cct aag ccg gag gtt aaa aag gta gtc 5588
Thr Tyr Ser Tyr Ile Thr Gln Pro Lys Pro Glu Val Lys Lys Val Val
820 825 830 835

tct cag acc tat gat ttt gat aaa ttc act att gac tct tct cag cgt 5636
Ser Gln Thr Tyr Asp Phe Asp Lys Phe Thr Ile Asp Ser Ser Gln Arg
840 845 850

ctt aat cta agc tat cgc tat gtt ttc aag gat tct aag gga aaa tta 5684
Leu Asn Leu Ser Tyr Arg Tyr Val Phe Lys Asp Ser Lys Gly Lys Leu
855 860 865

att aat agc gac gat tta cag aag caa ggt tat tca ctc aca tat att 5732
Ile Asn Ser Asp Asp Leu Gln Lys Gln Gly Tyr Ser Leu Thr Tyr Ile
870 875 880

gat tta tgt act gtt tcc att aaa aaa ggt aat tca aat gaa att gtt 5780
Asp Leu Cys Thr Val Ser Ile Lys Lys Gly Asn Ser Asn Glu Ile Val
885 890 895

aaa tgt aat taattttgtt ttcttgatgt ttgtttcatc atcttctttt 5829
Lys Cys Asn
900

gctcaggtaa ttgaaatgaa taattcgcct ctgcgcgatt ttgtaacttg gtattcaaag 5889
caatcaggcg aatccgttat tgtttctccc gatgtaaaag gtactgttac tgtatattca 5949
tctgacgtta aacctgaaaa tctacgcaat ttctttattt ctgttttacg tgctaataat 6009
tttgatatgg ttggttcaat tccttccata attcagaagt ataatccaaa caatcaggat 6069
tatattgatg aattgccatc atctgataat caggaatatg atgataattc cgctccttct 6129

ggtggtttct ttgttccgca aaatgataat gttactcaaa cttttaaaat taataacgtt 6189 cgggcaaagg atttaatacg agttgtcgaa ttgtttgtaa agtctaatac ttctaaatcc 6249 tcaaatgtat tatctattga cggctctaat ctattagttg tttctgcacc taaagatatt 6309 ttagataacc ttcctcaatt cctttctact gttgatttgc caactgacca gatattgatt 6369 gagggtttga tatttgaggt tcagcaaggt gatgctttag atttttcatt tgctgctggc 6429 tctcagcgtg gcactgttgc aggcggtgtt aatactgacc gcctcacctc tgttttatct 6489 tctgctggtg gttcgttcgg tatttttaat ggcgatgttt tagggctatc agttcgcgca 6549 ttaaagacta atagccattc aaaaatattg tctgtgccac gtattcttac gctttcaggt 6609 cagaagggtt ctatctctgt tggccagaat gtccctttta ttactggtcg tgtgactggt 6669 gaatctgcca atgtaaataa tccatttcag acgattgagc gtcaaaatgt aggtatttcc 6729 atgagcgttt ttcctgttgc aatggctggc ggtaatattg ttctggatat taccagcaag 6789 gccgatagtt tgagttcttc tactcaggca agtgatgtta ttactaatca aagaagtatt 6849 gctacaacgg ttaatttgcg tgatggacag actcttttac tcggtggcct cactgattat 6909 aaaaacactt ctcaagattc tggcgtaccg ttcctgtcta aaatcccttt aatcggcctc 6969 ctgtttagct cccgctctga ttccaacgag gaaagcacgt tatacgtgct cgtcaaagca 7029 accatagtac gcgccctgta gcggcgcatt aagcgcggcg ggtgtggtgg ttacgcgcag 7089 cgtgaccgct acacttgcca gcgccctagc gcccgctcct ttcgctttct tcccttcctt 7149 tctcgccacg ttcgccggct ttccccgtca agctctaaat cgggggctcc ctttagggtt 7209 ccgatttagt gctttacggc acctcgaccc caaaaaactt gatttgggtg atggttcacg 7269 tagtgggcca tcgccctgat agacggtttt tcgccctttg acgttggagt ccacgttctt 7329 taatagtgga ctcttgttcc aaactggaac aacactcaac cctatctcgg gctattcttt 7389 tgatttataa gggattttgc cgatttcgga accaccatca aacaggattt tcgcctgctg 7449 gggcaaacca gcgtggaccg cttgctgcaa ctctctcagg gccaggcggt gaagggcaat 7509 cagctgttgc ccgtctcact ggtgaaaaga aaaaccaccc tggatccaag cttgcaggtg 7569 gcacttttcg gggaaatgtg cgcggaaccc ctatttgttt atttttctaa atacattcaa 7629 atatgtatcc gctcatgaga caataaccct gataaatgct tcaataatat tgaaaaagga 7689 agagtatgag tattcaacat ttccgtgtcg cccttattcc cttttttgcg gcattttgcc 7749 ttcctgtttt tgctcaccca gaaacgctgg tgaaagtaaa agatgctgaa gatcagttgg 7809 gcgcacgagt 
gggttacatc gaactggatc tcaacagcgg taagatcctt gagagttttc 7869

gccccgaaga acgttttcca atgatgagca cttttaaagt tctgctatgt catacactat 7929
tatcccgtat tgacgccggg caagagcaac tcggtcgccg ggcgcggtat tctcagaatg 7989
acttggttga gtactcacca gtcacagaaa agcatcttac ggatggcatg acagtaagag 8049
aattatgcag tgctgccata accatgagtg ataacactgc ggccaactta cttctgacaa 8109
cgatcggagg accgaaggag ctaaccgctt ttttgcacaa catgggggat catgtaactc 8169
gccttgatcg ttgggaaccg gagctgaatg aagccatacc aaacgacgag cgtgacacca 8229
cgatgcctgt agcaatgcca acaacgttgc gcaaactatt aactggcgaa ctacttactc 8289
tagcttcccg gcaacaatta atagactgga tggaggcgga taaagttgca ggaccacttc 8349
tgcgctcggc ccttccggct ggctggttta ttgctgataa atctggagcc ggtgagcgtg 8409
ggtctcgcgg tatcattgca gcactggggc cagatggtaa gccctcccgt atcgtagtta 8469
tctacacgac ggggagtcag gcaactatgg atgaacgaaa tagacagatc gctgagatag 8529
gtgcctcact gattaagcat tggtaactgt cagaccaagt ttactcatat atactttaga 8589
ttgatttaaa acttcatttt taatttaaaa ggatctaggt gaagatcctt tttgataatc 8649
tcatgaccaa aatcccttaa cgtgagtttt cgttccactg tacgtaagac ccccaagctt 8709
gtcgactgaa tggcgaatgg cgctttgcct ggtttccggc accagaagcg gtgccggaaa 8769
gctggctgga gtgcgatctt cctgaggccg atactgtcgt cgtcccctca aactggcaga 8829
tgcacggtta cgatgcgccc atctacacca acgtaaccta tcccattacg gtcaatccgc 8889
cgtttgttcc cacggagaat ccgacgggtt gttactcgct cacatttaat gttgatgaaa 8949
gctggctaca ggaaggccag acgcgaatta tttttgatgg cgttcctatt ggttaaaaaa 9009
tgagctgatt taacaaaaat ttaacgcgaa ttttaacaaa atattaacgt ttacaattta 9069
aatatttgct tatacaatct tcctgttttt ggggcttttc tgattatcaa ccggggtaca 9129
tatgattgac atgctagttt tacgattacc gttcatcgat tctcttgttt gctccagact 9189
ctcaggcaat gacctgatag cctttgtaga tctctcaaaa atagctaccc tctccggcat 9249
gaatttatca gctagaacgg ttgaatatca tattgatggt gatttgactg tctccggcct 9309
ttctcaccct tttgaatctt tacctacaca ttactcaggc attgcattta aaatatatga 9369
gggttctaaa aatttttatc cttgcgttga aataaaggct tctcccgcaa aagtattaca 9429
gggtcataat gtttttggta caaccgattt agctttatgc tctgaggctt tattgcttaa 9489
ttttgctaat tctttgcctt gcctgtatga tttattggat gtt 9532
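The number printed at the end of each nucleotide row is the position of the last base in that row, so consecutive counters can be used to check that no base was dropped or added in transcription. The following is a minimal sketch in Python of that check for the final row of the sequence above; the helper name `row_length` is illustrative only and is not part of the listing.

```python
def row_length(row: str) -> int:
    """Count the bases in one listing row, ignoring the spaces between blocks."""
    return sum(1 for ch in row if ch in "acgtn")

# Final row of the sequence and the counters printed around it.
final_row = "ttttgctaat tctttgcctt gcctgtatga tttattggat gtt"
previous_counter = 9489  # counter at the end of the preceding row
final_counter = 9532     # counter at the end of the final row

# The final counter should equal the previous counter plus the bases in this row.
assert previous_counter + row_length(final_row) == final_counter
```

The same comparison can be repeated row by row; a full row of six ten-base blocks should advance the counter by 60.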

<210> 452
<211> 20
<212> PRT
<213> Unknown Organism

<220>
<223> Description of Unknown Organism: MALIA3 peptide sequence

<400> 452
Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser
1 5 10 15

His Ser Ala Gln
20

<210> 453
<211> 367
<212> PRT
<213> Unknown Organism

<220>
<223> Description of Unknown Organism: MALIA3 protein sequence

<400> 453

Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala
1 5 10 15

Ala Gln Pro Ala Met Ala Glu Val Gln Leu Leu Glu Ser Gly Gly Gly
20 25 30

Leu Val Gln Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly
35 40 45

Phe Thr Phe Ser Ser Tyr Ala Met Ser Trp Val Arg Gln Ala Pro Gly
50 55 60

Lys Gly Leu Glu Trp Val Ser Ala Ile Ser Gly Ser Gly Gly Ser Thr
65 70 75 80

Tyr Tyr Ala Asp Ser Val Lys Gly Arg Phe Thr Ile Ser Arg Asp Asn
85 90 95

Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn Ser Leu Arg Ala Glu Asp
100 105 110

Thr Ala Val Tyr Tyr Cys Ala Lys Asp Tyr Glu Gly Thr Gly Tyr Ala
115 120 125

Phe Asp Ile Trp Gly Gln Gly Thr Met Val Thr Val Ser Ser Ala Ser
130 135 140

Thr Lys Gly Pro Ser Val Phe Pro Leu Ala Pro Ser Ser Lys Ser Thr
145 150 155 160

Ser Gly Gly Thr Ala Ala Leu Gly Cys Leu Val Lys Asp Tyr Phe Pro
165 170 175

Glu Pro Val Thr Val Ser Trp Asn Ser Gly Ala Leu Thr Ser Gly Val
180 185 190

His Thr Phe Pro Ala Val Leu Gln Ser Ser Gly Leu Tyr Ser Leu Ser
195 200 205

Ser Val Val Thr Val Pro Ser Ser Ser Leu Gly Thr Gln Thr Tyr Ile
210 215 220

Cys Asn Val Asn His Lys Pro Ser Asn Thr Lys Val Asp Lys Lys Val
225 230 235 240

Glu Pro Lys Ser Cys Ala Ala Ala His His His His His His Ser Ala
245 250 255

Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn Gly Ala Ala Asp Ile
260 265 270

Asn Asp Asp Arg Met Ala Gly Ala Ala Glu Thr Val Glu Ser Cys Leu
275 280 285

Ala Lys Pro His Thr Glu Asn Ser Phe Thr Asn Val Trp Lys Asp Asp
290 295 300

Lys Thr Leu Asp Arg Tyr Ala Asn Tyr Glu Gly Cys Leu Trp Asn Ala
305 310 315 320

Thr Gly Val Val Val Cys Thr Gly Asp Glu Thr Gln Cys Tyr Gly Thr
325 330 335

Trp Val Pro Ile Gly Leu Ala Ile Pro Glu Asn Glu Gly Gly Gly Ser
340 345 350

Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Thr
355 360 365

<210> 454
<211> 152
<212> PRT
<213> Unknown Organism

<220>

<223> Description of Unknown Organism: MALIA3 protein sequence <400> 454

Ser Gly Asp Phe Asp Tyr Glu Lys Met Ala Asn Ala Asn Lys Gly Ala
1 5 10 15

Met Thr Glu Asn Ala Asp Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly
20 25 30

Lys Leu Asp Ser Val Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe
35 40 45

Ile Gly Asp Val Ser Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp
50 55 60

Phe Ala Gly Ser Asn Ser Gln Met Ala Gln Val Gly Asp Gly Asp Asn
65 70 75 80

Ser Pro Leu Met Asn Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln
85 90 95

Ser Val Glu Cys Arg Pro Phe Val Phe Ser Ala Gly Lys Pro Tyr Glu
100 105 110

Phe Ser Ile Asp Cys Asp Lys Ile Asn Leu Phe Arg Gly Val Phe Ala
115 120 125

Phe Leu Leu Tyr Val Ala Thr Phe Met Tyr Val Phe Ser Thr Phe Ala
130 135 140

Asn Ile Leu Arg Asn Lys Glu Ser
145 150

<210> 455
<211> 15
<212> PRT
<213> Unknown Organism

<220>

<223> Description of Unknown Organism: MALIA3 peptide sequence <400> 455

Met Pro Val Leu Leu Gly Ile Pro Leu Leu Leu Arg Phe Leu Gly
1 5 10 15

<210> 456
<211> 348
<212> PRT
<213> Unknown Organism

<220>

<223> Description of Unknown Organism: MALIA3 protein sequence <400> 456

Met Ala Val Tyr Phe Val Thr Gly Lys Leu Gly Ser Gly Lys Thr Leu
1 5 10 15

Val Ser Val Gly Lys Ile Gln Asp Lys Ile Val Ala Gly Cys Lys Ile
20 25 30

Ala Thr Asn Leu Asp Leu Arg Leu Gln Asn Leu Pro Gln Val Gly Arg
35 40 45

Phe Ala Lys Thr Pro Arg Val Leu Arg Ile Pro Asp Lys Pro Ser Ile
50 55 60

Ser Asp Leu Leu Ala Ile Gly Arg Gly Asn Asp Ser Tyr Asp Glu Asn
65 70 75 80

Lys Asn Gly Leu Leu Val Leu Asp Glu Cys Gly Thr Trp Phe Asn Thr
85 90 95

Arg Ser Trp Asn Asp Lys Glu Arg Gln Pro Ile Ile Asp Trp Phe Leu
100 105 110

His Ala Arg Lys Leu Gly Trp Asp Ile Ile Phe Leu Val Gln Asp Leu
115 120 125

Ser Ile Val Asp Lys Gln Ala Arg Ser Ala Leu Ala Glu His Val Val
130 135 140

Tyr Cys Arg Arg Leu Asp Arg Ile Thr Leu Pro Phe Val Gly Thr Leu
145 150 155 160

Tyr Ser Leu Ile Thr Gly Ser Lys Met Pro Leu Pro Lys Leu His Val
165 170 175

Gly Val Val Lys Tyr Gly Asp Ser Gln Leu Ser Pro Thr Val Glu Arg
180 185 190

Trp Leu Tyr Thr Gly Lys Asn Leu Tyr Asn Ala Tyr Asp Thr Lys Gln
195 200 205

Ala Phe Ser Ser Asn Tyr Asp Ser Gly Val Tyr Ser Tyr Leu Thr Pro
210 215 220

Tyr Leu Ser His Gly Arg Tyr Phe Lys Pro Leu Asn Leu Gly Gln Lys
225 230 235 240

Met Lys Leu Thr Lys Ile Tyr Leu Lys Lys Phe Ser Arg Val Leu Cys
245 250 255

Leu Ala Ile Gly Phe Ala Ser Ala Phe Thr Tyr Ser Tyr Ile Thr Gln
260 265 270

Pro Lys Pro Glu Val Lys Lys Val Val Ser Gln Thr Tyr Asp Phe Asp
275 280 285

Lys Phe Thr Ile Asp Ser Ser Gln Arg Leu Asn Leu Ser Tyr Arg Tyr
290 295 300

Val Phe Lys Asp Ser Lys Gly Lys Leu Ile Asn Ser Asp Asp Leu Gln
305 310 315 320

Lys Gln Gly Tyr Ser Leu Thr Tyr Ile Asp Leu Cys Thr Val Ser Ile
325 330 335

Lys Lys Gly Asn Ser Asn Glu Ile Val Lys Cys Asn
340 345

<210> 457
<211> 24
<212> DNA

<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Primer

<400> 457
tggaagaggc acgttctttt cttt 24

<210> 458
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Primer

<400> 458
cttttctttg ttgccgttgg ggtg 24

<210> 459
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Primer

<400> 459
acactctccc ctgttgaagc tctt 24

<210> 460
<211> 51
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Primer

<400> 460
accgcctcca ccgggcgcgc cttattaaca ctctcccctg ttgaagctct t 51

<210> 461
<211> 23
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Primer

<400> 461
tgaacattct gtaggggcca ctg 23

<210> 462

<211> 23
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Primer

<400> 462
agagcattct gcaggggcca ctg 23

<210> 463
<211> 50
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Primer

<400> 463
accgcctcca ccgggcgcgc cttattatga acattctgta ggggccactg 50

<210> 464
<211> 50
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Primer

<400> 464
accgcctcca ccgggcgcgc cttattaaga gcattctgca ggggccactg 50

<210> 465
<211> 23
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Primer

<400> 465
cgactggagc acgaggacac tga 23

<210> 466
<211> 26
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Primer

<400> 466
ggacactgac atggactgaa ggagta 26

<210> 467
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 467
gggaggatgg agactgggtc 20

<210> 468
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 468
gggaagatgg agactgggtc 20

<210> 469
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 469
gggagagtgg agactgagtc 20

<210> 470
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 470
gggtgcctgg agactgcgtc 20

<210> 471
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 471
gggtggctgg agactgcgtc 20

<210> 472
<211> 50
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 472
gggaggatgg agactgggtc atctggatgt cttgtgcact gtgacagagg 50

<210> 473
<211> 50
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 473
gggaagatgg agactgggtc atctggatgt cttgtgcact gtgacagagg 50

<210> 474
<211> 50
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 474
gggagagtgg agactgggtc atctggatgt cttgtgcact gtgacagagg 50

<210> 475
<211> 50
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 475
gggtgcctgg agactgggtc atctggatgt cttgtgcact gtgacagagg 50

<210> 476
<211> 50
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 476
gggtggctgg agactgggtc atctggatgt cttgtgcact gtgacagagg 50

<210> 477
<211> 50
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 477
gggagtctgg agactgggtc atctggatgt cttgtgcact gtgacagagg 50

<210> 478
<211> 42
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 478
cctctgtcac agtgcacaag acatccagat gacccagtct cc 42

<210> 479
<211> 22
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Primer

<400> 479
cctctgtcac agtgcacaag ac 22

<210> 480
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Primer

<400> 480
acactctccc ctgttgaagc tctt 24

<210> 481
<211> 668
<212> DNA
<213> Homo sapiens

<220>

<221> CDS
<222> (1)..(668)

<400> 481
agt gca caa gac atc cag atg acc cag tct cca gcc acc ctg tct gtg 48
Ser Ala Gln Asp Ile Gln Met Thr Gln Ser Pro Ala Thr Leu Ser Val
1 5 10 15

tct cca ggg gaa agg gcc acc ctc tcc tgc agg gcc agt cag agt gtt 96
Ser Pro Gly Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Ser Val
20 25 30

agt aac aac tta gcc tgg tac cag cag aaa cct ggc cag gtt ccc agg 144
Ser Asn Asn Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Val Pro Arg
35 40 45

ctc ctc atc tat ggt gca tcc acc agg gcc act gat atc cca gcc agg 192
Leu Leu Ile Tyr Gly Ala Ser Thr Arg Ala Thr Asp Ile Pro Ala Arg
50 55 60

ttc agt ggc agt ggg tct ggg aca gac ttc act ctc acc atc agc aga 240
Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Arg
65 70 75 80

ctg gag cct gaa gat ttt gca gtg tat tac tgt cag cgg tat ggt agc 288
Leu Glu Pro Glu Asp Phe Ala Val Tyr Tyr Cys Gln Arg Tyr Gly Ser
85 90 95

tca ccg ggg tgg acg ttc ggc caa ggg acc aag gtg gaa atc aaa cga 336
Ser Pro Gly Trp Thr Phe Gly Gln Gly Thr Lys Val Glu Ile Lys Arg
100 105 110

act gtg gct gca cca tct gtc ttc atc ttc ccg cca tct gat gag cag 384
Thr Val Ala Ala Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu Gln
115 120 125

ttg aaa tct gga act gcc tct gtt gtg tgc ctg ctg aat aac ttc tat 432
Leu Lys Ser Gly Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe Tyr
130 135 140

ccc aga gag gcc aaa gta cag tgg aag gtg gat aac gcc ctc caa tcg 480
Pro Arg Glu Ala Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser
145 150 155 160

ggt aac tcc cag gag agt gtc aca gag cag gac agc aag gac agc acc 528
Gly Asn Ser Gln Glu Ser Val Thr Glu Gln Asp Ser Lys Asp Ser Thr
165 170 175

tac agc ctc agc agc acc ctg acg ctg agc aaa gca gac tac gag aaa 576
Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu Lys
180 185 190

cac aaa gtc tac gcc tgc gaa gtc acc cat cag ggc ctg agc tcg cct 624
His Lys Val Tyr Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser Pro
195 200 205

gtc aca aag agc ttc aac aaa gga gag tgt aag ggc gaa ttc gc 668
Val Thr Lys Ser Phe Asn Lys Gly Glu Cys Lys Gly Glu Phe Ala
210 215 220

<210> 482
<211> 223
<212> PRT
<213> Homo sapiens

<400> 482

Ser Ala Gln Asp Ile Gln Met Thr Gln Ser Pro Ala Thr Leu Ser Val
1 5 10 15

Ser Pro Gly Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Ser Val
20 25 30

Ser Asn Asn Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Val Pro Arg
35 40 45

Leu Leu Ile Tyr Gly Ala Ser Thr Arg Ala Thr Asp Ile Pro Ala Arg
50 55 60

Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Arg
65 70 75 80

Leu Glu Pro Glu Asp Phe Ala Val Tyr Tyr Cys Gln Arg Tyr Gly Ser
85 90 95

Ser Pro Gly Trp Thr Phe Gly Gln Gly Thr Lys Val Glu Ile Lys Arg
100 105 110

Thr Val Ala Ala Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu Gln
115 120 125

Leu Lys Ser Gly Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe Tyr
130 135 140

Pro Arg Glu Ala Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser
145 150 155 160

Gly Asn Ser Gln Glu Ser Val Thr Glu Gln Asp Ser Lys Asp Ser Thr
165 170 175

Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu Lys
180 185 190

His Lys Val Tyr Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser Pro
195 200 205

Val Thr Lys Ser Phe Asn Lys Gly Glu Cys Lys Gly Glu Phe Ala
210 215 220

<210> 483
<211> 13
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 483
agccaccctg tct 13

<210> 484
<211> 700
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> (1)..(699)

<400> 484

agt gca caa gac atc cag atg acc cag tct cct gcc acc ctg tct gtg 48
Ser Ala Gln Asp Ile Gln Met Thr Gln Ser Pro Ala Thr Leu Ser Val
1 5 10 15

tct cca ggt gaa aga gcc acc ctc tcc tgc agg gcc agt cag gtg tct 96
Ser Pro Gly Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Val Ser
20 25 30

cca ggg gaa aga gcc acc ctc tcc tgc aat ctt ctc agc aac tta gcc 144
Pro Gly Glu Arg Ala Thr Leu Ser Cys Asn Leu Leu Ser Asn Leu Ala
35 40 45

tgg tac cag cag aaa cct ggc cag gct ccc agg ctc ctc atc tat ggt 192
Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Leu Ile Tyr Gly
50 55 60

gct tcc acc ggg gcc att ggt atc cca gcc agg ttc agt ggc agt ggg 240
Ala Ser Thr Gly Ala Ile Gly Ile Pro Ala Arg Phe Ser Gly Ser Gly
65 70 75 80

tct ggg aca gag ttc act ctc acc atc agc agc ctg cag tct gaa gat 288
Ser Gly Thr Glu Phe Thr Leu Thr Ile Ser Ser Leu Gln Ser Glu Asp
85 90 95

ttt gca gtg tat ttc tgt cag cag tat ggt acc tca ccg ccc act ttc 336
Phe Ala Val Tyr Phe Cys Gln Gln Tyr Gly Thr Ser Pro Pro Thr Phe
100 105 110

ggc gga ggg acc aag gtg gag atc aaa cga act gtg gct gca cca tct 384
Gly Gly Gly Thr Lys Val Glu Ile Lys Arg Thr Val Ala Ala Pro Ser
115 120 125

gtc ttc atc ttc ccg cca tct gat gag cag ttg aaa tct gga act gcc 432
Val Phe Ile Phe Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly Thr Ala
130 135 140

tct gtt gtg tgc ccg ctg aat aac ttc tat ccc aga gag gcc aaa gta 480
Ser Val Val Cys Pro Leu Asn Asn Phe Tyr Pro Arg Glu Ala Lys Val
145 150 155 160

cag tgg aag gtg gat aac gcc ctc caa tcg ggt aac tcc cag gag agt 528
Gln Trp Lys Val Asp Asn Ala Leu Gln Ser Gly Asn Ser Gln Glu Ser
165 170 175

gtc aca gag cag gac aac aag gac agc acc tac agc ctc agc agc acc 576
Val Thr Glu Gln Asp Asn Lys Asp Ser Thr Tyr Ser Leu Ser Ser Thr
180 185 190

ctg acg ctg agc aaa gta gac tac gag aaa cac gaa gtc tac gcc tgc 624
Leu Thr Leu Ser Lys Val Asp Tyr Glu Lys His Glu Val Tyr Ala Cys
195 200 205

gaa gtc acc cat cag ggc ctt agc tcg ccc gtc acg aag agc ttc aac 672
Glu Val Thr His Gln Gly Leu Ser Ser Pro Val Thr Lys Ser Phe Asn
210 215 220

agg gga gag tgt aag aaa gaa ttc gtt t 700
Arg Gly Glu Cys Lys Lys Glu Phe Val
225 230

<210> 485
<211> 233
<212> PRT

<213> Homo sapiens

<400> 485
Ser Ala Gln Asp Ile Gln Met Thr Gln Ser Pro Ala Thr Leu Ser Val
1 5 10 15

Ser Pro Gly Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Val Ser
20 25 30

Pro Gly Glu Arg Ala Thr Leu Ser Cys Asn Leu Leu Ser Asn Leu Ala
35 40 45

Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Leu Ile Tyr Gly
50 55 60

Ala Ser Thr Gly Ala Ile Gly Ile Pro Ala Arg Phe Ser Gly Ser Gly
65 70 75 80

Ser Gly Thr Glu Phe Thr Leu Thr Ile Ser Ser Leu Gln Ser Glu Asp
85 90 95

Phe Ala Val Tyr Phe Cys Gln Gln Tyr Gly Thr Ser Pro Pro Thr Phe
100 105 110

Gly Gly Gly Thr Lys Val Glu Ile Lys Arg Thr Val Ala Ala Pro Ser
115 120 125

Val Phe Ile Phe Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly Thr Ala
130 135 140

Ser Val Val Cys Pro Leu Asn Asn Phe Tyr Pro Arg Glu Ala Lys Val
145 150 155 160

Gln Trp Lys Val Asp Asn Ala Leu Gln Ser Gly Asn Ser Gln Glu Ser
165 170 175

Val Thr Glu Gln Asp Asn Lys Asp Ser Thr Tyr Ser Leu Ser Ser Thr
180 185 190

Leu Thr Leu Ser Lys Val Asp Tyr Glu Lys His Glu Val Tyr Ala Cys
195 200 205

Glu Val Thr His Gln Gly Leu Ser Ser Pro Val Thr Lys Ser Phe Asn
210 215 220

Arg Gly Glu Cys Lys Lys Glu Phe Val
225 230

<210> 486
<211> 419
<212> DNA
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic 3-23 VH nucleotide sequence <220>

<221> CDS
<222> (12)..(419)

<400> 486
ctgtctgaac g gcc cag ccg gcc atg gcc gaa gtt caa ttg tta gag tct 50
Ala Gln Pro Ala Met Ala Glu Val Gln Leu Leu Glu Ser
1 5 10

ggt ggc ggt ctt gtt cag cct ggt ggt tct tta cgt ctt tct tgc gct 98
Gly Gly Gly Leu Val Gln Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala
15 20 25

gct tcc gga ttc act ttc tct tcg tac gct atg tct tgg gtt cgc caa 146
Ala Ser Gly Phe Thr Phe Ser Ser Tyr Ala Met Ser Trp Val Arg Gln
30 35 40 45

gct cct ggt aaa ggt ttg gag tgg gtt tct gct atc tct ggt tct ggt 194
Ala Pro Gly Lys Gly Leu Glu Trp Val Ser Ala Ile Ser Gly Ser Gly
50 55 60

ggc agt act tac tat gct gac tcc gtt aaa ggt cgc ttc act atc tct 242
Gly Ser Thr Tyr Tyr Ala Asp Ser Val Lys Gly Arg Phe Thr Ile Ser
65 70 75

aga gac aac tct aag aat act ctc tac ttg cag atg aac agc tta agg 290
Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn Ser Leu Arg
80 85 90

gct gag gac act gca gtc tac tat tgc gct aaa gac tat gaa ggt act 338
Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Lys Asp Tyr Glu Gly Thr
95 100 105

ggt tat gct ttc gac ata tgg ggt caa ggt act atg gtc acc gtc tct 386
Gly Tyr Ala Phe Asp Ile Trp Gly Gln Gly Thr Met Val Thr Val Ser
110 115 120 125

agt gcc tcc acc aag ggc cca tcg gtc ttc ccc 419
Ser Ala Ser Thr Lys Gly Pro Ser Val Phe Pro
130 135

<210> 487
<211> 136
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic 3-23 VH protein sequence

<400> 487
Ala Gln Pro Ala Met Ala Glu Val Gln Leu Leu Glu Ser Gly Gly Gly
1 5 10 15

Leu Val Gln Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly
20 25 30

Phe Thr Phe Ser Ser Tyr Ala Met Ser Trp Val Arg Gln Ala Pro Gly
35 40 45

Lys Gly Leu Glu Trp Val Ser Ala Ile Ser Gly Ser Gly Gly Ser Thr
50 55 60

Tyr Tyr Ala Asp Ser Val Lys Gly Arg Phe Thr Ile Ser Arg Asp Asn
65 70 75 80

Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn Ser Leu Arg Ala Glu Asp
85 90 95

Thr Ala Val Tyr Tyr Cys Ala Lys Asp Tyr Glu Gly Thr Gly Tyr Ala
100 105 110

Phe Asp Ile Trp Gly Gln Gly Thr Met Val Thr Val Ser Ser Ala Ser
115 120 125

Thr Lys Gly Pro Ser Val Phe Pro
130 135
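The protein sequence of SEQ ID NO: 487 is the translation of the coding region of SEQ ID NO: 486, which starts at nucleotide 12. As a minimal sketch, assuming the standard genetic code, the Python below translates the first thirteen codons of that coding region (nucleotides 12 to 50) so the result can be compared with the first thirteen residues listed above. The names `CODON_TABLE` and `translate` are illustrative and not part of the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order at each position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into one-letter amino acid codes."""
    dna = dna.replace(" ", "").lower()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# Nucleotides 12 to 50 of SEQ ID NO: 486.
cds_start = "gcc cag ccg gcc atg gcc gaa gtt caa ttg tta gag tct"
print(translate(cds_start))  # AQPAMAEVQLLES
```

In three-letter code the output reads Ala Gln Pro Ala Met Ala Glu Val Gln Leu Leu Glu Ser, which agrees with residues 1 to 13 of SEQ ID NO: 487.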

<210> 488
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Primer

<400> 488
ctgtctgaac ggcccagccg 20

<210> 489
<211> 83
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 489
ctgtctgaac ggcccagccg gccatggccg aagttcaatt gttagagtct ggtggcggtc 60
ttgttcagcc tggtggttct tta 83

<210> 490
<211> 54
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 490
gaaagtgaat ccggaagcag cgcaagaaag acgtaaagaa ccaccaggct gaac 54

<210> 491
<211> 42
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 491
agaaacccac tccaaacctt taccaggagc ttggcgaacc ca 42

<210> 492
<211> 94
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 492
agtgtcctca gcccttaagc tgttcatctg caagtagaga gtattcttag agttgtctct 60
agagatagtg aagcgacctt taacggagtc agca 94

<210> 493
<211> 81
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 493
gcttaagggc tgaggacact gcagtctact attgcgctaa agactatgaa ggtactggtt 60
atgctttcga catatggggt c 81

<210> 494
<211> 72
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 494
ggggaagacc gatgggccct tggtggaggc actagagacg gtgaccatag taccttgacc 60
tatgtcgaaa gc 72

<210> 495
<211> 23
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Primer

<400> 495
ggggaagacc gatgggccct tgg 23

<210> 496
<211> 56
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>
<221> modified_base
<222> (22)..(24)
<223> A, T, C, G, other or unknown

<220>
<221> modified_base
<222> (28)..(30)
<223> A, T, C, G, other or unknown

<220>
<221> modified_base
<222> (34)..(36)
<223> A, T, C, G, other or unknown

<220>
<223> nnn codes for any amino acid but Cys

<400> 496
gcttccggat tcactttctc tnnntacnnn atgnnntggg ttcgccaagc tcctgg 56

<210> 497
<211> 68
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>
<221> modified_base
<222> (19)..(21)
<223> A, T, C or G

<220>
<221> modified_base
<222> (25)..(30)
<223> A, T, C or G

<220>
<221> modified_base
<222> (40)..(42)
<223> A, T, C or G

<220>
<221> modified_base
<222> (46)..(48)
<223> A, T, C or G

<400> 497
ggtttggagt gggtttctnn natcnnnnnn tctggtggcn nnactnnnta tgctgactcc 60
gttaaagg 68

<210> 498

<211> 912 <212> DNA <213> Escherichia coli <400> 498 tccggagctt cagatctgtt tgcctttttg tggggtggtg cagatcgcgt tacggagatc 60 gaccgactgc ttgagcaaaa gccacgctta actgctgatc aggcatggga tgttattcgc 120 caaaccagtc gtcaggatct taacctgagg ctttttttac ctactctgca agcagcgaca 180 tctggtttga cacagagcga tccgcgtcgt cagttggtag aaacattaac acgttgggat 240 ggcatcaatt tgcttaatga tgatggtaaa acctggcagc agccaggctc tgccatcctg 300 aacgtttggc tgaccagtat gttgaagcgt accgtagtgg ctgccgtacc tatgccattt 360 gataagtggt acagcgccag tggctacgaa acaacccagg acggcccaac tggttcgctg 420 aatataagtg ttggagcaaa aattttgtat gaggcggtgc agggagacaa atcaccaatc 480 ccacaggcgg ttgatctgtt tgctgggaaa ccacagcagg aggttgtgtt ggctgcgctg 540 gaagatacct gggagactct ttccaaacgc tatggcaata atgtgagtaa ctggaaaaca 600 cctgcaatgg ccttaacgtt ccgggcaaat aatttctttg gtgtaccgca ggccgcagcg 660 gaagaaacgc gtcatcaggc ggagtatcaa aaccgtggaa cagaaaacga tatgattgtt 720 ttctcaccaa cgacaagcga tcgtcctgtg cttgcctggg atgtggtcgc acccggtcag 780 agtgggttta ttgctcccga tggaacagtt gataagcact atgaagatca gctgaaaatg 840 tacgaaaatt ttggccgtaa gtcgctctgg ttaacgaagc aggatgtgga ggcgcataag 900 gagtcgtcta ga 912 <210> 499 <211> 10 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (4)..(7) <223> A, T, C, G, other or unknown <400> 499 gatnnnnatc 10 <210> 500 <211> 20 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (1)..(15) <223> A, T, C, G, other or unknown <400> 500 nnnnnnnnnn nnnnngtccc 20
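Several of the short synthetic oligonucleotides in this listing (for example SEQ ID NOs 499 and 500) mix fixed bases with runs of `n`, which the `<223>` fields describe as any base. The following is a minimal sketch, not part of the original listing, of how such a pattern could be matched against a target sequence. The function names `site_to_regex` and `find_sites` are hypothetical, and treating `n` as exactly one of a, c, g or t is a simplifying assumption (the listing also allows "other or unknown").

```python
import re


def site_to_regex(site):
    """Compile a listing-style pattern (lowercase bases, 'n' = any base) into a regex.

    Assumption: 'n' is taken to mean exactly one of a, c, g, t.
    Whitespace (the 10-base grouping used in the listing) is ignored.
    """
    cleaned = "".join(site.lower().split())
    body = "".join("[acgt]" if base == "n" else re.escape(base) for base in cleaned)
    # A lookahead lets finditer report overlapping occurrences as well.
    return re.compile("(?=(" + body + "))")


def find_sites(site, target):
    """Return 0-based start positions of every occurrence of `site` in `target`."""
    cleaned_target = "".join(target.lower().split())
    return [m.start() for m in site_to_regex(site).finditer(cleaned_target)]


if __name__ == "__main__":
    # Pattern from SEQ ID NO: 499; the target string is made up for illustration.
    print(find_sites("gatnnnnatc", "aagatacgtatcgg"))
```

The same helper can be used to confirm that a pattern's length agrees with its `<211>` field once the grouping spaces are removed.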

<210> 501 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 501 gcannnnntg c 11

<210> 502 <211> 10 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(7) <223> A, T, C, G, other or unknown <400> 502 gacnnnngtc 10

<210> 503 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (1)..(7) <223> A, T, C, G, other or unknown <400> 503 nnnnnnngcg gg 12

<210> 504 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(12) <223> A, T, C, G, other or unknown <400> 504 gtatccnnnn nn 12 <210> 505 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(9) <223> A, T, C, G, other or unknown <400> 505 gcannnnnnt cg 12 <210> 506 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 506 gccnnnnngg c 11 <210> 507 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (7)..(11) <223> A, T, C, G, other or unknown <400> 507 ggtctcnnnn n 11

<210> 508 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(11) <223> A, T, C, G, other or unknown <400> 508 gacnnnnngt c 11

<210> 509 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 509 gacnnnnngt c 11

<210> 510 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(9) <223> A, T, C, G, other or unknown <400> 510 gacnnnnnng tc 12

<210> 511 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 511 ccannnnntg g 11 <210> 512 <211> 15 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (1)..(9) <223> A, T, C, G, other or unknown <400> 512 nnnnnnnnng caggt 15 <210> 513 <211> 11 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(11) <223> A, T, C, G, other or unknown <400> 513 acctgcnnnn n 11 <210> 514 <211> 13 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (5)..(9) <223> A, T, C, G, other or unknown <400> 514 ggccnnnnng gcc 13 <210> 515 <211> 15 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(12) <223> A, T, C, G, other or unknown <400> 515 ccannnnnnn nntgg 15

<210> 516 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (7)..(11) <223> A, T, C, G, other or unknown <400> 516 cgtctcnnnn n 11 <210> 517

<211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (1)..(6) <223> A, T, C, G, other or unknown <400> 517 nnnnnngaga cg 12 <210> 518 <211> 16 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (1)..(10) <223> A, T, C, G, other or unknown <400> 518 nnnnnnnnnn ctcctc 16 <210> 519 <211> 16 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(16) <223> A, T, C, G, other or unknown <400> 519 gaggagnnnn nnnnnn 16 <210> 520 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 520 cctnnnnnag g 11 <210> 521 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(9) <223> A, T, C, G, other or unknown <400> 521 ccannnnnnt gg 12 <210> 522 <211> 6680 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Vector pCES5 nucleotide sequence <220>

<221> CDS <222> (201)..(1058) <220>

<221> CDS <222> (2269)..(2682) <220>

<221> CDS <222> (2723)..(2866) <220>

<221> CDS <222> (3767)..(3850) <220>

<221> CDS

<222> (4198)..(5799) <400> 522 gacgaaaggg cctcgtgata cgcctatttt tataggttaa tgtcatgata ataatggttt 60 cttagacgtc aggtggcact tttcggggaa atgtgcgcgg aacccctatt tgtttatttt 120 tctaaataca ttcaaatatg tatccgctca tgagacaata accctgataa atgcttcaat 180

aatattgaaa aaggaagagt atg agt att caa cat ttc cgt gtc gcc ctt att 233
Met Ser Ile Gln His Phe Arg Val Ala Leu Ile
1 5 10

ccc ttt ttt gcg gca ttt tgc ctt cct gtt ttt gct cac cca gaa acg 281
Pro Phe Phe Ala Ala Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr
15 20 25

ctg gtg aaa gta aaa gat gct gaa gat cag ttg ggt gcc cga gtg ggt 329
Leu Val Lys Val Lys Asp Ala Glu Asp Gln Leu Gly Ala Arg Val Gly
30 35 40

tac atc gaa ctg gat ctc aac agc ggt aag atc ctt gag agt ttt cgc 377
Tyr Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg
45 50 55

ccc gaa gaa cgt ttt cca atg atg agc act ttt aaa gtt ctg cta tgt 425
Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys
60 65 70 75

ggc gcg gta tta tcc cgt att gac gcc ggg caa gag caa ctc ggt cgc 473
Gly Ala Val Leu Ser Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly Arg
80 85 90

cgc ata cac tat tct cag aat gac ttg gtt gag tac tca cca gtc aca 521
Arg Ile His Tyr Ser Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr
95 100 105

gaa aag cat ctt acg gat ggc atg aca gta aga gaa tta tgc agt gct 569
Glu Lys His Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala
110 115 120

gcc ata acc atg agt gat aac act gcg gcc aac tta ctt ctg aca acg 617
Ala Ile Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr
125 130 135

atc gga gga ccg aag gag cta acc gct ttt ttg cac aac atg ggg gat 665
Ile Gly Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp
140 145 150 155

cat gta act cgc ctt gat cgt tgg gaa ccg gag ctg aat gaa gcc ata 713
His Val Thr Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile
160 165 170

cca aac gac gag cgt gac acc acg atg cct gta gca atg gca aca acg 761
Pro Asn Asp Glu Arg Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr
175 180 185

ttg cgc aaa cta tta act ggc gaa cta ctt act cta gct tcc cgg caa 809

Leu Arg Lys Leu Leu Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln
190 195 200

caa tta ata gac tgg atg gag gcg gat aaa gtt gca gga cca ctt ctg 857
Gln Leu Ile Asp Trp Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu
205 210 215

cgc tcg gcc ctt ccg gct ggc tgg ttt att gct gat aaa tct gga gcc 905
Arg Ser Ala Leu Pro Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala
220 225 230 235

ggt gag cgt ggg tct cgc ggt atc att gca gca ctg ggg cca gat ggt 953
Gly Glu Arg Gly Ser Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly
240 245 250

aag ccc tcc cgt atc gta gtt atc tac acg acg ggg agt cag gca act 1001
Lys Pro Ser Arg Ile Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr
255 260 265

atg gat gaa cga aat aga cag atc gct gag ata ggt gcc tca ctg att 1049
Met Asp Glu Arg Asn Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile
270 275 280

aag cat tgg taactgtcag accaagttta ctcatatata ctttagattg 1098
Lys His Trp
285

atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt gataatctca 1158 tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc gtagaaaaga 1218 tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa 1278 aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact ctttttccga 1338 aggtaactgg cttcagcaga gcgcagatac caaatactgt ccttctagtg tagccgtagt 1398 taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg ctaatcctgt 1458 taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac tcaagacgat 1518 agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcata cagcccagct 1578 tggagcgaac gacctacacc gaactgagat acctacagcg tgagcattga gaaagcgcca 1638 cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag 1698 agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct gtcgggtttc 1758 gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg agcctatgga 1818 aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct tttgctcaca 1878 tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgcc tttgagtgag 1938 ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg 1998

aagagcgccc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat taatgcagct 2058

ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt aatgtgagtt 2118 agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt atgttgtgtg 2178 gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat tacgccaagc 2238

tttggagcct tttttttgga gattttcaac gtg aaa aaa tta tta ttc gca att 2292
Met Lys Lys Leu Leu Phe Ala Ile
290

cct tta gtt gtt cct ttc tat tct cac agt gca cag gtc caa ctg cag 2340
Pro Leu Val Val Pro Phe Tyr Ser His Ser Ala Gln Val Gln Leu Gln
295 300 305 310

gtc gac ctc gag atc aaa cgt gga act gtg gct gca cca tct gtc ttc 2388
Val Asp Leu Glu Ile Lys Arg Gly Thr Val Ala Ala Pro Ser Val Phe
315 320 325

atc ttc ccg cca tct gat gag cag ttg aaa tct gga act gcc tct gtt 2436
Ile Phe Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly Thr Ala Ser Val
330 335 340

gtg tgc ctg ctg aat aac ttc tat ccc aga gag gcc aaa gta cag tgg 2484
Val Cys Leu Leu Asn Asn Phe Tyr Pro Arg Glu Ala Lys Val Gln Trp
345 350 355

aag gtg gat aac gcc ctc caa tcg ggt aac tcc cag gag agt gtc aca 2532
Lys Val Asp Asn Ala Leu Gln Ser Gly Asn Ser Gln Glu Ser Val Thr
360 365 370

gag cag gac agc aag gac agc acc tac agc ctc agc agc acc ctg acg 2580
Glu Gln Asp Ser Lys Asp Ser Thr Tyr Ser Leu Ser Ser Thr Leu Thr
375 380 385 390

ctg agc aaa gca gac tac gag aaa cac aaa gtc tac gcc tgc gaa gtc 2628
Leu Ser Lys Ala Asp Tyr Glu Lys His Lys Val Tyr Ala Cys Glu Val
395 400 405

acc cat cag ggc ctg agt tca ccg gtg aca aag agc ttc aac agg gga 2676
Thr His Gln Gly Leu Ser Ser Pro Val Thr Lys Ser Phe Asn Arg Gly
410 415 420

gag tgt taataaggcg cgccaattct atttcaagga gacagtcata atg aaa tac 2731
Glu Cys Met Lys Tyr
425

cta ttg cct acg gca gcc gct gga ttg tta tta ctc gcg gcc cag ccg 2779
Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala Ala Gln Pro
430 435 440

gcc atg gcc gaa gtt caa ttg tta gag tct ggt ggc ggt ctt gtt cag 2827
Ala Met Ala Glu Val Gln Leu Leu Glu Ser Gly Gly Gly Leu Val Gln
445 450 455

cct ggt ggt tct tta cgt ctt tct tgc gct gct tcc gga gcttcagatc 2876
Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly
460 465 470

tgtttgcctt tttgtggggt ggtgcagatc gcgttacgga gatcgaccga ctgcttgagc 2936 aaaagccacg cttaactgct gatcaggcat gggatgttat tcgccaaacc agtcgtcagg 2996 atcttaacct gaggcttttt ttacctactc tgcaagcagc gacatctggt ttgacacaga 3056 gcgatccgcg tcgtcagttg gtagaaacat taacacgttg ggatggcatc aatttgctta 3116 atgatgatgg taaaacctgg cagcagccag gctctgccat cctgaacgtt tggctgacca 3176 gtatgttgaa gcgtaccgta gtggctgccg tacctatgcc atttgataag tggtacagcg 3236 ccagtggcta cgaaacaacc caggacggcc caactggttc gctgaatata agtgttggag 3296 caaaaatttt gtatgaggcg gtgcagggag acaaatcacc aatcccacag gcggttgatc 3356 tgtttgctgg gaaaccacag caggaggttg tgttggctgc gctggaagat acctgggaga 3416 ctctttccaa acgctatggc aataatgtga gtaactggaa aacacctgca atggccttaa 3476 cgttccgggc aaataatttc tttggtgtac cgcaggccgc agcggaagaa acgcgtcatc 3536 aggcggagta tcaaaaccgt ggaacagaaa acgatatgat tgttttctca ccaacgacaa 3596 gcgatcgtcc tgtgcttgcc tgggatgtgg tcgcacccgg tcagagtggg tttattgctc 3656 ccgatggaac agttgataag cactatgaag atcagctgaa aatgtacgaa aattttggcc 3716

gtaagtcgct ctggttaacg aagcaggatg tggaggcgca taaggagtcg tct aga 3772
Ser Arg

gac aac tct aag aat act ctc tac ttg cag atg aac agc tta agt ctg 3820
Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn Ser Leu Ser Leu
475 480 485 490

agc att cgg tcc ggg caa cat tct cca aac tgaccagacg acacaaacgg 3870
Ser Ile Arg Ser Gly Gln His Ser Pro Asn
495 500

cttacgctaa atcccgcgca tgggatggta aagaggtggc gtctttgctg gcctggactc 3930 atcagatgaa ggccaaaaat tggcaggagt ggacacagca ggcagcgaaa caagcactga 3990 ccatcaactg gtactatgct gatgtaaacg gcaatattgg ttatgttcat actggtgctt 4050 atccagatcg tcaatcaggc catgatccgc gattacccgt tcctggtacg ggaaaatggg 4110 actggaaagg gctattgcct tttgaaatga accctaaggt gtataacccc cagaagctag 4170

cctgcggctt cggtcaccgt ctcaagc gcc tcc acc aag ggc cca tcg gtc ttc 4224
Ala Ser Thr Lys Gly Pro Ser Val Phe
505

ccc ctg gca ccc tcc tcc aag agc acc tct ggg ggc aca gcg gcc ctg 4272
Pro Leu Ala Pro Ser Ser Lys Ser Thr Ser Gly Gly Thr Ala Ala Leu
510 515 520 525

ggc tgc ctg gtc aag gac tac ttc ccc gaa ccg gtg acg gtg tcg tgg 4320
Gly Cys Leu Val Lys Asp Tyr Phe Pro Glu Pro Val Thr Val Ser Trp
530 535 540

aac tca ggc gcc ctg acc agc ggc gtc cac acc ttc ccg gct gtc cta 4368
Asn Ser Gly Ala Leu Thr Ser Gly Val His Thr Phe Pro Ala Val Leu
545 550 555

cag tcc tca gga ctc tac tcc ctc agc agc gta gtg acc gtg ccc tcc 4416
Gln Ser Ser Gly Leu Tyr Ser Leu Ser Ser Val Val Thr Val Pro Ser
560 565 570

agc agc ttg ggc acc cag acc tac atc tgc aac gtg aat cac aag ccc 4464
Ser Ser Leu Gly Thr Gln Thr Tyr Ile Cys Asn Val Asn His Lys Pro
575 580 585

agc aac acc aag gtg gac aag aaa gtt gag ccc aaa tct tgt gcg gcc 4512
Ser Asn Thr Lys Val Asp Lys Lys Val Glu Pro Lys Ser Cys Ala Ala
590 595 600 605

gca cat cat cat cac cat cac ggg gcc gca gaa caa aaa ctc atc tca 4560
Ala His His His His His His Gly Ala Ala Glu Gln Lys Leu Ile Ser
610 615 620

gaa gag gat ctg aat ggg gcc gca tag act gtt gaa agt tgt tta gca 4608
Glu Glu Asp Leu Asn Gly Ala Ala Thr Val Glu Ser Cys Leu Ala
625 630 635

aaa cct cat aca gaa aat tca ttt act aac gtc tgg aaa gac gac aaa 4656
Lys Pro His Thr Glu Asn Ser Phe Thr Asn Val Trp Lys Asp Asp Lys
640 645 650

act tta gat cgt tac gct aac tat gag ggc tgt ctg tgg aat gct aca 4704
Thr Leu Asp Arg Tyr Ala Asn Tyr Glu Gly Cys Leu Trp Asn Ala Thr
655 660 665

ggc gtt gtg gtt tgt act ggt gac gaa act cag tgt tac ggt aca tgg 4752
Gly Val Val Val Cys Thr Gly Asp Glu Thr Gln Cys Tyr Gly Thr Trp
670 675 680

gtt cct att ggg ctt gct atc cct gaa aat gag ggt ggt ggc tct gag 4800
Val Pro Ile Gly Leu Ala Ile Pro Glu Asn Glu Gly Gly Gly Ser Glu
685 690 695 700

ggt ggc ggt tct gag ggt ggc ggt tct gag ggt ggc ggt act aaa cct 4848
Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Thr Lys Pro
705 710 715

cct gag tac ggt gat aca cct att ccg ggc tat act tat atc aac cct 4896
Pro Glu Tyr Gly Asp Thr Pro Ile Pro Gly Tyr Thr Tyr Ile Asn Pro
720 725 730

ctc gac ggc act tat ccg cct ggt act gag caa aac ccc gct aat cct 4944
Leu Asp Gly Thr Tyr Pro Pro Gly Thr Glu Gln Asn Pro Ala Asn Pro
735 740 745

aat cct tct ctt gag gag tct cag cct ctt aat act ttc atg ttt cag 4992
Asn Pro Ser Leu Glu Glu Ser Gln Pro Leu Asn Thr Phe Met Phe Gln

750 755 760

aat aat agg ttc cga aat agg cag ggt gca tta act gtt tat acg ggc 5040
Asn Asn Arg Phe Arg Asn Arg Gln Gly Ala Leu Thr Val Tyr Thr Gly
765 770 775 780

act gtt act caa ggc act gac ccc gtt aaa act tat tac cag tac act 5088
Thr Val Thr Gln Gly Thr Asp Pro Val Lys Thr Tyr Tyr Gln Tyr Thr
785 790 795

cct gta tca tca aaa gcc atg tat gac gct tac tgg aac ggt aaa ttc 5136
Pro Val Ser Ser Lys Ala Met Tyr Asp Ala Tyr Trp Asn Gly Lys Phe
800 805 810

aga gac tgc gct ttc cat tct ggc ttt aat gag gat cca ttc gtt tgt 5184
Arg Asp Cys Ala Phe His Ser Gly Phe Asn Glu Asp Pro Phe Val Cys
815 820 825

gaa tat caa ggc caa tcg tct gac ctg cct caa cct cct gtc aat gct 5232
Glu Tyr Gln Gly Gln Ser Ser Asp Leu Pro Gln Pro Pro Val Asn Ala
830 835 840

ggc ggc ggc tct ggt ggt ggt tct ggt ggc ggc tct gag ggt ggc ggc 5280
Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly Gly Ser Glu Gly Gly Gly
845 850 855 860

tct gag ggt ggc ggt tct gag ggt ggc ggc tct gag ggt ggc ggt tcc 5328
Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser
865 870 875

ggt ggc ggc tcc ggt tcc ggt gat ttt gat tat gaa aaa atg gca aac 5376
Gly Gly Gly Ser Gly Ser Gly Asp Phe Asp Tyr Glu Lys Met Ala Asn
880 885 890

gct aat aag ggg gct atg acc gaa aat gcc gat gaa aac gcg cta cag 5424
Ala Asn Lys Gly Ala Met Thr Glu Asn Ala Asp Glu Asn Ala Leu Gln
895 900 905

tct gac gct aaa ggc aaa ctt gat tct gtc gct act gat tac ggt gct 5472
Ser Asp Ala Lys Gly Lys Leu Asp Ser Val Ala Thr Asp Tyr Gly Ala
910 915 920

gct atc gat ggt ttc att ggt gac gtt tcc ggc ctt gct aat ggt aat 5520
Ala Ile Asp Gly Phe Ile Gly Asp Val Ser Gly Leu Ala Asn Gly Asn
925 930 935 940

ggt gct act ggt gat ttt gct ggc tct aat tcc caa atg gct caa gtc 5568
Gly Ala Thr Gly Asp Phe Ala Gly Ser Asn Ser Gln Met Ala Gln Val
945 950 955

ggt gac ggt gat aat tca cct tta atg aat aat ttc cgt caa tat tta 5616
Gly Asp Gly Asp Asn Ser Pro Leu Met Asn Asn Phe Arg Gln Tyr Leu
960 965 970

cct tct ttg cct cag tcg gtt gaa tgt cgc cct tat gtc ttt ggc gct 5664
Pro Ser Leu Pro Gln Ser Val Glu Cys Arg Pro Tyr Val Phe Gly Ala
975 980 985

ggt aaa cca tat gaa ttt tct att gat tgt gac aaa ata aac tta ttc 5712
Gly Lys Pro Tyr Glu Phe Ser Ile Asp Cys Asp Lys Ile Asn Leu Phe
990 995 1000

cgt ggt gtc ttt gcg ttt ctt tta tat gtt gcc acc ttt atg tat gta 5760
Arg Gly Val Phe Ala Phe Leu Leu Tyr Val Ala Thr Phe Met Tyr Val
1005 1010 1015 1020

ttt tcg acg ttt gct aac ata ctg cgt aat aag gag tct taataagaat 5809
Phe Ser Thr Phe Ala Asn Ile Leu Arg Asn Lys Glu Ser
1025 1030

tcactggccg tcgttttaca acgtcgtgac tgggaaaacc ctggcgttac ccaacttaat 5869 cgccttgcag cacatccccc tttcgccagc tggcgtaata gcgaagaggc ccgcaccgat 5929 cgcccttccc aacagttgcg cagcctgaat ggcgaatggc gcctgatgcg gtattttctc 5989 cttacgcatc tgtgcggtat ttcacaccgc atataaattg taaacgttaa tattttgtta 6049 aaattcgcgt taaatttttg ttaaatcagc tcatttttta accaataggc cgaaatcggc 6109 aaaatccctt ataaatcaaa agaatagccc gagatagggt tgagtgttgt tccagtttgg 6169 aacaagagtc cactattaaa gaacgtggac tccaacgtca aagggcgaaa aaccgtctat 6229 cagggcgatg gcccactacg tgaaccatca cccaaatcaa gttttttggg gtcgaggtgc 6289 cgtaaagcac taaatcggaa ccctaaaggg agcccccgat ttagagcttg acggggaaag 6349 ccggcgaacg tggcgagaaa ggaagggaag aaagcgaaag gagcgggcgc tagggcgctg 6409 gcaagtgtag cggtcacgct gcgcgtaacc accacacccg ccgcgcttaa tgcgccgcta 6469 cagggcgcgt actatggttg ctttgacggg tgcagtctca gtacaatctg ctctgatgcc 6529 gcatagttaa gccagccccg acacccgcca acacccgctg acgcgccctg acgggcttgt 6589 ctgctcccgg catccgctta cagacaagct gtgaccgtct ccgggagctg catgtgtcag 6649 aggttttcac cgtcatcacc gaaacgcgcg a 6680

<210> 523 <211> 286 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Vector pCES5 protein sequence <400> 523

Met Ser Ile Gln His Phe Arg Val Ala Leu Ile Pro Phe Phe Ala Ala
1 5 10 15

Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr Leu Val Lys Val Lys
20 25 30

Asp Ala Glu Asp Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp
35 40 45

Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg Pro Glu Glu Arg Phe
50 55 60

Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys Gly Ala Val Leu Ser
65 70 75 80

Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly Arg Arg Ile His Tyr Ser
85 90 95

Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr Glu Lys His Leu Thr
100 105 110

Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala Ala Ile Thr Met Ser
115 120 125

Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile Gly Gly Pro Lys
130 135 140

Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val Thr Arg Leu
145 150 155 160

Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg
165 170 175

Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr Leu Arg Lys Leu Leu
180 185 190

Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp
195 200 205

Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro
210 215 220

Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser
225 230 235 240

Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile
245 250 255

Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn
260 265 270

Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile Lys His Trp
275 280 285

<210> 524 <211> 138 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Vector pCES5 protein sequence <400> 524

Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser
1 5 10 15

His Ser Ala Gln Val Gln Leu Gln Val Asp Leu Glu Ile Lys Arg Gly
20 25 30

Thr Val Ala Ala Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu Gln
35 40 45

Leu Lys Ser Gly Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe Tyr
50 55 60

Pro Arg Glu Ala Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser
65 70 75 80

Gly Asn Ser Gln Glu Ser Val Thr Glu Gln Asp Ser Lys Asp Ser Thr
85 90 95

Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu Lys
100 105 110

His Lys Val Tyr Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser Pro
115 120 125

Val Thr Lys Ser Phe Asn Arg Gly Glu Cys
130 135

<210> 525 <211> 48 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Vector pCES5 protein sequence <400> 525

Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala
1 5 10 15

Ala Gln Pro Ala Met Ala Glu Val Gln Leu Leu Glu Ser Gly Gly Gly
20 25 30

Leu Val Gln Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly
35 40 45

<210> 526 <211> 28 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Vector pCES5 protein sequence <400> 526

Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn Ser Leu
1 5 10 15

Ser Leu Ser Ile Arg Ser Gly Gln His Ser Pro Asn
20 25

<210> 527 <211> 533 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Vector pCES5 protein sequence <400> 527

Ala Ser Thr Lys Gly Pro Ser Val Phe Pro Leu Ala Pro Ser Ser Lys
1 5 10 15

Ser Thr Ser Gly Gly Thr Ala Ala Leu Gly Cys Leu Val Lys Asp Tyr
20 25 30

Phe Pro Glu Pro Val Thr Val Ser Trp Asn Ser Gly Ala Leu Thr Ser
35 40 45

Gly Val His Thr Phe Pro Ala Val Leu Gln Ser Ser Gly Leu Tyr Ser
50 55 60

Leu Ser Ser Val Val Thr Val Pro Ser Ser Ser Leu Gly Thr Gln Thr
65 70 75 80

Tyr Ile Cys Asn Val Asn His Lys Pro Ser Asn Thr Lys Val Asp Lys
85 90 95

Lys Val Glu Pro Lys Ser Cys Ala Ala Ala His His His His His His
100 105 110

Gly Ala Ala Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn Gly Ala
115 120 125

Ala Thr Val Glu Ser Cys Leu Ala Lys Pro His Thr Glu Asn Ser Phe
130 135 140

Thr Asn Val Trp Lys Asp Asp Lys Thr Leu Asp Arg Tyr Ala Asn Tyr
145 150 155 160

Glu Gly Cys Leu Trp Asn Ala Thr Gly Val Val Val Cys Thr Gly Asp
165 170 175

Glu Thr Gln Cys Tyr Gly Thr Trp Val Pro Ile Gly Leu Ala Ile Pro
180 185 190

Glu Asn Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly
195 200 205

Ser Glu Gly Gly Gly Thr Lys Pro Pro Glu Tyr Gly Asp Thr Pro Ile
210 215 220

Pro Gly Tyr Thr Tyr Ile Asn Pro Leu Asp Gly Thr Tyr Pro Pro Gly
225 230 235 240

Thr Glu Gln Asn Pro Ala Asn Pro Asn Pro Ser Leu Glu Glu Ser Gln
245 250 255

Pro Leu Asn Thr Phe Met Phe Gln Asn Asn Arg Phe Arg Asn Arg Gln
260 265 270

Gly Ala Leu Thr Val Tyr Thr Gly Thr Val Thr Gln Gly Thr Asp Pro
275 280 285

Val Lys Thr Tyr Tyr Gln Tyr Thr Pro Val Ser Ser Lys Ala Met Tyr
290 295 300

Asp Ala Tyr Trp Asn Gly Lys Phe Arg Asp Cys Ala Phe His Ser Gly
305 310 315 320

Phe Asn Glu Asp Pro Phe Val Cys Glu Tyr Gln Gly Gln Ser Ser Asp
325 330 335

Leu Pro Gln Pro Pro Val Asn Ala Gly Gly Gly Ser Gly Gly Gly Ser
340 345 350

Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly
355 360 365

Gly Gly Ser Glu Gly Gly Gly Ser Gly Gly Gly Ser Gly Ser Gly Asp
370 375 380

Phe Asp Tyr Glu Lys Met Ala Asn Ala Asn Lys Gly Ala Met Thr Glu
385 390 395 400

Asn Ala Asp Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly Lys Leu Asp
405 410 415

Ser Val Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe Ile Gly Asp
420 425 430

Val Ser Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp Phe Ala Gly
435 440 445

Ser Asn Ser Gln Met Ala Gln Val Gly Asp Gly Asp Asn Ser Pro Leu
450 455 460

Met Asn Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln Ser Val Glu
465 470 475 480

Cys Arg Pro Tyr Val Phe Gly Ala Gly Lys Pro Tyr Glu Phe Ser Ile
485 490 495

Asp Cys Asp Lys Ile Asn Leu Phe Arg Gly Val Phe Ala Phe Leu Leu
500 505 510

Tyr Val Ala Thr Phe Met Tyr Val Phe Ser Thr Phe Ala Asn Ile Leu
515 520 525

Arg Asn Lys Glu Ser
530

<210> 528 <211> 30 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 528 acctcactgg cttccggatt cactttctct 30

<210> 529 <211> 42 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 529 agaaacccac tccaaacctt taccaggagc ttggcgaacc ca 42

<210> 530 <211> 51 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 530

ggaaggcagt gatctagaga tagtgaagcg acctttaacg gagtcagcat a 51 <210> 531 <211> 23 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 531 ggaaggcagt gatctagaga tag 23

<210> 532 <211> 20 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 532 gtgctgactc agccaccctc 20

<210> 533 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 533 gccctgactc agcctgcctc 20

<210> 534 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 534 gagctgactc aggaccctgc 20

<210> 535 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 535 gagctgactc agccaccctc 20

<210> 536 <211> 38 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 536 cctcgacagc gaagtgcaca gagcgtcttg actcagcc 38 <210> 537 <211> 30 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 537 cctcgacagc gaagtgcaca gagcgtcttg 30 <210> 538 <211> 38 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 538 cctcgacagc gaagtgcaca gagcgctttg actcagcc 38 <210> 539 <211> 30 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 539 cctcgacagc gaagtgcaca gagcgctttg 30 <210> 540 <211> 38 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 540 cctcgacagc taagtgcaca gagcgctttg actcagcc 38

<210> 541 <211> 30 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 541 cctcgacagc gaagtgcaca gagcgctttg 30

<210> 542 <211> 38 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 542 cctcgacagc gaagtgcaca gagcgaattg actcagcc 38

<210> 543 <211> 30 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 543 cctcgacagc gaagtgcaca gagcgaattg 30

<210> 544 <211> 38 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 544 cctcgacagc gaagtgcaca gtacgaattg actcagcc 38

<210> 545 <211> 30 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 545 cctcgacagc gaagtgcaca gtacgaattg 30 <210> 546 <211> 21 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 546 cctcgacagc gaagtgcaca g 21 <210> 547 <211> 21 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 547 ccgtgtatta ctgtgcgaga g 21 <210> 548 <211> 21 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 548 ctgtgtatta ctgtgcgaga g 21 <210> 549 <211> 21 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 549

ccgtatatta ctgtgcgaaa g 21 <210> 550 <211> 21 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 550 ctgtgtatta ctgtgcgaaa g 21 <210> 551 <211> 21 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 551 ctgtgtatta ctgtgcgaga c 21

<210> 552 <211> 21 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 552 ccatgtatta ctgtgcgaga c
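The length declared in field <211> can be checked against the residues listed under <400>. The following is a minimal sketch and not part of the listing itself: the helper names parse_record, residues and length_matches are hypothetical, and the sketch assumes the flattened one-line record layout used above, in which any trailing position count is purely numeric and can be discarded.

```python
import re


def parse_record(text):
    """Split one flattened record into {numeric identifier: value}.

    Repeated identifiers (for example <220>) simply overwrite each other,
    which is acceptable here because only <211> and <400> are used.
    """
    fields = {}
    for tag, value in re.findall(r"<(\d{3})>\s*([^<]*)", text):
        fields[tag] = value.strip()
    return fields


def residues(field_400):
    """Return the residues of a <400> field as one string.

    The first token is the SEQ ID number; numeric tokens after it are
    position counts and are dropped (assumption about the layout).
    """
    tokens = field_400.split()[1:]
    return "".join(tok for tok in tokens if tok.isalpha())


def length_matches(text):
    """True when the residue count equals the length declared in <211>."""
    fields = parse_record(text)
    return len(residues(fields["400"])) == int(fields["211"])


record = (
    "<210> 552 <211> 21 <212> DNA <213> Artificial Sequence <220> "
    "<223> Description of Artificial Sequence: Synthetic oligonucleotide "
    "<400> 552 ccatgtatta ctgtgcgaga c"
)
print(length_matches(record))
```

A check of this kind is one way to notice records whose sequence line was displaced or truncated during extraction.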

<210> 553 <211> 94 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 553 ggtgtagtga tctagtgaca actctaagaa tactctctac ttgcagatga acagctttag 60 ggctgaggac actgcagtct actattgtgc gaga 94

<210> 554 <211> 94 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 554 ggtgtagtga tctagtgaca actctaagaa tactctctac ttgcagatga acagctttag 60 ggctgaggac actgcagtct actattgtgc gaaa 94 <210> 555 <211> 85 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 555 atagtagact gcagtgtcct cagcccttaa gctgttcatc tgcaagtaga gagtattctt 60

agagttgtct ctagatcact acacc 85 <210> 556 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Primer <400> 556 gactgggtgt agtgatctag 20 <210> 557 <211> 24 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Primer <400> 557 cttttctttg ttgccgttgg ggtg 24 <210> 558 <211> 15 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (1)..(9) <223> A, T, C, G, other or unknown <400> 558 nnnnnnnnng caggt <210> 559 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(11) <223> A, T, C, G, other or unknown <400> 559 acctgcnnnn n <210> 560 <211> 10 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(7) <223> A, T, C, G, other or unknown <400> 560 gatnnnnatc <210> 561 <211> 16 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(16)

<223> A, T, C, G, other or unknown <400> 561 gaggagnnnn nnnnnn <210> 562 <211> 16 <212> DNA

<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (1)..(10) <223> A, T, C, G, other or unknown <400> 562 nnnnnnnnnn ctcctc <210> 563 <211> 10 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(10) <223> A, T, C, G, other or unknown <400> 563 ctcttcnnnn <210> 564 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (1)..(5) <223> A, T, C, G, other or unknown <400> 564 nnnnngaaga g

<210> 565 <211> 20 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (1)..(15) <223> A, T, C, G, other or unknown <400> 565 nnnnnnnnnn nnnnngtccc 20 <210> 566 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(9) <223> A, T, C, G, other or unknown <400> 566 gacnnnnnng tc 12 <210> 567 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(11) <223> A, T, C, G, other or unknown <400> 567 cgtctcnnnn n 11 <210> 568 <211> 12

<212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(12) <223> A, T, C, G, other or unknown <400> 568 gtatccnnnn nn 12 <210> 569 <211> 12 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(9) <223> A, T, C, G, other or unknown <400> 569 gcannnnnnt cg 12 <210> 570 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 570 gccnnnnngg c 11 <210> 571 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(11) <223> A, T, C, G, other or unknown <400> 571 ggtctcnnnn n 11 <210> 572 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 572 gacnnnnngt c 11 <210> 573 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 573 gacnnnnngt c 11 <210> 574 <211> 11 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 574 ccannnnntg g 11 <210> 575 <211> 15 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(12) <223> A, T, C, G, other or unknown <400> 575 ccannnnnnn nntgg 15

<210> 576 <211> 13 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (5)..(9) <223> A, T, C, G, other or unknown <400> 576 ggccnnnnng gcc 13 <210> 577 <211> 12 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (4)..(9) <223> A, T, C, G, other or unknown

<400> 577 ccannnnnnt gg <210> 578 <211> 11 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 578 cctnnnnnag g 11 <210> 579 <211> 10 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (4)..(7) <223> A, T, C, G, other or unknown <400> 579 gacnnnngtc 10 <210> 580 <211> 15 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(12) <223> A, T, C, G, other or unknown <400> 580 ccannnnnnn nntgg 15
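Entries such as SEQ ID NO: 579 and SEQ ID NO: 580 combine fixed bases with runs of n. As an illustration only, a pattern of this kind can be located in a target strand by turning it into a regular expression. The sketch below is not part of the listing: the names SYMBOLS and find_sites are hypothetical, and it assumes that n stands for any one of a, c, g or t, which is narrower than the "other or unknown" wording of field <223>.

```python
import re

# Assumption: n matches exactly one of the four standard bases.
SYMBOLS = {"a": "a", "c": "c", "g": "g", "t": "t", "n": "[acgt]"}


def find_sites(pattern, target):
    """Return 0-based start positions where pattern occurs in target.

    Spaces in the pattern (the ten-base grouping of the listing) are
    ignored. A lookahead is used so that overlapping sites are reported.
    """
    body = "".join(SYMBOLS[base] for base in pattern.replace(" ", "").lower())
    lookahead = re.compile("(?=" + body + ")")
    return [match.start() for match in lookahead.finditer(target.lower())]


# SEQ ID NO: 580 as listed, searched in a made-up target strand.
print(find_sites("ccannnnnnn nntgg", "aaccaacgtacgtatggtt"))
```

Only one strand is searched here; a fuller check would presumably also scan the reverse complement.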

<210> 581 <211> 11 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> modified_base <222> (4)..(8) <223> A, T, C, G, other or unknown <400> 581 gcannnnntg c 11 <210> 582 <211> 10251 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: CJRA05 nucleotide sequence <220>

<221> CDS <222> (1578)..(1916) <220>

<221> CDS <222> (2388)..(2843) <220>

<221> CDS <222> (2849)..(2893) <220>

<221> CDS <222> (3189)..(4232) <220>

<221> CDS <222> (7418)..(8119) <220>

<221> CDS <222> (8160)..(9452) <400> 582 aatgctacta ctattagtag aattgatgcc accttttcag ctcgcgcccc aaatgaaaat 60 atagctaaac aggttattga ccatttgcga aatgtatcta atggtcaaac taaatctact 120 cgttcgcaga attgggaatc aactgttata tggaatgaaa cttccagaca ccgtacttta 180

gttgcatatt taaaacatgt tgagctacag cattatattc agcaattaag ctctaagcca 240 tccgcaaaaa tgacctctta tcaaaaggag caattaaagg tactctctaa tcctgacctg 300 ttggagtttg cttccggtct ggttcgcttt gaagctcgaa ttaaaacgcg atatttgaag 360 tctttcgggc ttcctcttaa tctttttgat gcaatccgct ttgcttctga ctataatagt 420 cagggtaaag acctgatttt tgatttatgg tcattctcgt tttctgaact gtttaaagca 480 tttgaggggg attcaatgaa tatttatgac gattccgcag tattggacgc tatccagtct 540 aaacatttta ctattacccc ctctggcaaa acttcttttg caaaagcctc tcgctatttt 600 ggtttttatc gtcgtctggt aaacgagggt tatgatagtg ttgctcttac tatgcctcgt 660 aattcctttt ggcgttatgt atctgcatta gttgaatgtg gtattcctaa atctcaactg 720 atgaatcttt ctacctgtaa taatgttgtt ccgttagttc gttttattaa cgtagatttt 780 tcttcccaac gtcctgactg gtataatgag ccagttctta aaatcgcata aggtaattca 840 caatgattaa agttgaaatt aaaccatctc aagcccaatt tactactcgt tctggtgttt 900 ctcgtcaggg caagccttat tcactgaatg agcagctttg ttacgttgat ttgggtaatg 960 aatatccggt tcttgtcaag attactcttg atgaaggtca gccagcctat gcgcctggtc 1020 tgtacaccgt tcatctgtcc tctttcaaag ttggtcagtt cggttccctt atgattgacc 1080 gtctgcgcct cgttccggct aagtaacatg gagcaggtcg cggatttcga cacaatttat 1140 caggcgatga tacaaatctc cgttgtactt tgtttcgcgc ttggtataat cgctgggggt 1200 caaagatgag tgttttagtg tattcttttg cctctttcgt tttaggttgg tgccttcgta 1260 gtggcattac gtattttacc cgtttaatgg aaacttcctc atgaaaaagt ctttagtcct 1320 caaagcctct gtagccgttg ctaccctcgt tccgatgctg tctttcgctg ctgagggtga 1380 cgatcccgca aaagcggcct ttaactccct gcaagcctca gcgaccgaat atatcggtta 1440 tgcgtgggcg atggttgttg tcattgtcgg cgcaactatc ggtatcaagc tgtttaagaa 1500 attcacctcg aaagcaagct gataaaccga tacaattaaa ggctcctttt ggagcctttt 1560

ttttggagat tttcaac gtg aaa aaa tta tta ttc gca att cct tta gtt 1610
Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val
1 5 10

gtt cct ttc tat tct ggc gcg gcc gaa tca cat cta gac ggc gcc gct 1658
Val Pro Phe Tyr Ser Gly Ala Ala Glu Ser His Leu Asp Gly Ala Ala
15 20 25

gaa act gtt gaa agt tgt tta gca aaa tcc cat aca gaa aat tca ttt 1706
Glu Thr Val Glu Ser Cys Leu Ala Lys Ser His Thr Glu Asn Ser Phe
30 35 40

act aac gtc tgg aaa gac gac aaa act tta gat cgt tac gct aac tat 1754
Thr Asn Val Trp Lys Asp Asp Lys Thr Leu Asp Arg Tyr Ala Asn Tyr
45 50 55

gag ggc tgt ctg tgg aat gct aca ggc gtt gta gtt tgt act ggt gac 1802
Glu Gly Cys Leu Trp Asn Ala Thr Gly Val Val Val Cys Thr Gly Asp
60 65 70 75

gaa act cag tgt tac ggt aca tgg gtt cct att ggg ctt gct atc cct 1850
Glu Thr Gln Cys Tyr Gly Thr Trp Val Pro Ile Gly Leu Ala Ile Pro
80 85 90

gaa aat gag ggt ggt ggc tct gag ggt ggc ggt tct gag ggt ggc ggt 1898
Glu Asn Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly
95 100 105

tct gag ggt ggc ggt act aaacctcctg agtacggtga tacacctatt 1946
Ser Glu Gly Gly Gly Thr
110

ccgggctata cttatatcaa ccctctcgac ggcacttatc cgcctggtac tgagcaaaac 2006
cccgctaatc ctaatccttc tcttgaggag tctcagcctc ttaatacttt catgtttcag 2066
aataataggt tccgaaatag gcagggggca ttaactgttt atacgggcac tgttactcaa 2126
ggcactgacc ccgttaaaac ttattaccag tacactcctg tatcatcaaa agccatgtat 2186
gacgcttact ggaacggtaa attcagagac tgcgctttcc attctggctt taatgaggat 2246
ttatttgttt gtgaatatca aggccaatcg tctgacctgc ctcaacctcc tgtcaatgct 2306
ggcggcggct ctggtggtgg ttctggtggc ggctctgagg gtggtggctc tgagggaggc 2366

ggttccggtg gtggctctgg t tcc ggt gat ttt gat tat gaa aag atg gca 2417
Ser Gly Asp Phe Asp Tyr Glu Lys Met Ala
115 120

aac gct aat aag ggg gct atg acc gaa aat gcc gat gaa aac gcg cta 2465
Asn Ala Asn Lys Gly Ala Met Thr Glu Asn Ala Asp Glu Asn Ala Leu
125 130 135

cag tct gac gct aaa ggc aaa ctt gat tct gtc gct act gat tac ggt 2513
Gln Ser Asp Ala Lys Gly Lys Leu Asp Ser Val Ala Thr Asp Tyr Gly
140 145 150 155

gct gct atc gat ggt ttc att ggt gac gtt tcc ggc ctt gct aat ggt 2561
Ala Ala Ile Asp Gly Phe Ile Gly Asp Val Ser Gly Leu Ala Asn Gly
160 165 170

aat ggt gct act ggt gat ttt gct ggc tct aat tcc caa atg gct caa 2609
Asn Gly Ala Thr Gly Asp Phe Ala Gly Ser Asn Ser Gln Met Ala Gln
175 180 185

gtc ggt gac ggt gat aat tca cct tta atg aat aat ttc cgt caa tat 2657
Val Gly Asp Gly Asp Asn Ser Pro Leu Met Asn Asn Phe Arg Gln Tyr
190 195 200

tta cct tcc ctc cct caa tcg gtt gaa tgt cgc cct ttt gtc ttt ggc 2705
Leu Pro Ser Leu Pro Gln Ser Val Glu Cys Arg Pro Phe Val Phe Gly
205 210 215

gct ggt aaa cca tat gaa ttt tct att gat tgt gac aaa ata aac tta 2753
Ala Gly Lys Pro Tyr Glu Phe Ser Ile Asp Cys Asp Lys Ile Asn Leu
220 225 230 235

ttc cgt ggt gtc ttt gcg ttt ctt tta tat gtt gcc acc ttt atg tat 2801
Phe Arg Gly Val Phe Ala Phe Leu Leu Tyr Val Ala Thr Phe Met Tyr
240 245 250

gta ttt tct acg ttt gct aac ata ctg cgt aat aag gag tct taatc atg 2851
Val Phe Ser Thr Phe Ala Asn Ile Leu Arg Asn Lys Glu Ser Met
255 260 265

cca gtt ctt ttg ggt att ccg tta tta ttg cgt ttc ctc ggt 2893
Pro Val Leu Leu Gly Ile Pro Leu Leu Leu Arg Phe Leu Gly
270 275 280

ttccttctgg taactttgtt cggctatctg cttacttttc ttaaaaaggg cttcggtaag 2953
atagctattg ctatttcatt gtttcttgct cttattattg ggcttaactc aattcttgtg 3013
ggttatctct ctgatattag cgctcaatta ccctctgact ttgttcaggg tgttcagtta 3073
attctcccgt ctaatgcgct tccctgtttt tatgttattc tctctgtaaa ggctgctatt 3133
ttcatttttg acgttaaaca aaaaatcgtt tcttatttgg attgggataa ataat atg 3191

Met

gct gtt tat ttt gta act ggc aaa tta ggc tct gga aag acg ctc gtt 3239
Ala Val Tyr Phe Val Thr Gly Lys Leu Gly Ser Gly Lys Thr Leu Val
285 290 295

agc gtt ggt aag att cag gat aaa att gta gct ggg tgc aaa ata gca 3287
Ser Val Gly Lys Ile Gln Asp Lys Ile Val Ala Gly Cys Lys Ile Ala
300 305 310

act aat ctt gat tta agg ctt caa aac ctc ccg caa gtc ggg agg ttc 3335
Thr Asn Leu Asp Leu Arg Leu Gln Asn Leu Pro Gln Val Gly Arg Phe
315 320 325

gct aaa acg cct cgc gtt ctt aga ata ccg gat aag cct tct ata tct 3383
Ala Lys Thr Pro Arg Val Leu Arg Ile Pro Asp Lys Pro Ser Ile Ser
330 335 340 345

gat ttg ctt gct att ggg cgc ggt aat gat tcc tac gat gaa aat aaa 3431
Asp Leu Leu Ala Ile Gly Arg Gly Asn Asp Ser Tyr Asp Glu Asn Lys
350 355 360

aac ggc ttg ctt gtt ctc gat gag tgc ggt act tgg ttt aat acc cgt 3479
Asn Gly Leu Leu Val Leu Asp Glu Cys Gly Thr Trp Phe Asn Thr Arg
365 370 375

tct tgg aat gat aag gaa aga cag ccg att att gat tgg ttt cta cat 3527
Ser Trp Asn Asp Lys Glu Arg Gln Pro Ile Ile Asp Trp Phe Leu His
380 385 390

gct cgt aaa tta gga tgg gat att att ttt ctt gtt cag gac tta tct 3575
Ala Arg Lys Leu Gly Trp Asp Ile Ile Phe Leu Val Gln Asp Leu Ser
395 400 405

att gtt gat aaa cag gcg cgt tct gca tta gct gaa cat gtt gtt tat 3623
Ile Val Asp Lys Gln Ala Arg Ser Ala Leu Ala Glu His Val Val Tyr
410 415 420 425

tgt cgt cgt ctg gac aga att act tta cct ttt gtc ggt act tta tat 3671
Cys Arg Arg Leu Asp Arg Ile Thr Leu Pro Phe Val Gly Thr Leu Tyr
430 435 440

tct ctt att act ggc tcg aaa atg cct ctg cct aaa tta cat gtt ggc 3719
Ser Leu Ile Thr Gly Ser Lys Met Pro Leu Pro Lys Leu His Val Gly
445 450 455

gtt gtt aaa tat ggc gat tct caa tta agc cct act gtt gag cgt tgg 3767
Val Val Lys Tyr Gly Asp Ser Gln Leu Ser Pro Thr Val Glu Arg Trp
460 465 470

ctt tat act ggt aag aat ttg tat aac gca tat gat act aaa cag gct 3815
Leu Tyr Thr Gly Lys Asn Leu Tyr Asn Ala Tyr Asp Thr Lys Gln Ala
475 480 485

ttt tct agt aat tat gat tcc ggt gtt tat tct tat tta acg cct tat 3863
Phe Ser Ser Asn Tyr Asp Ser Gly Val Tyr Ser Tyr Leu Thr Pro Tyr
490 495 500 505

tta tca cac ggt cgg tat ttc aaa cca tta aat tta ggt cag aag atg 3911
Leu Ser His Gly Arg Tyr Phe Lys Pro Leu Asn Leu Gly Gln Lys Met
510 515 520

aaa tta act aaa ata tat ttg aaa aag ttt tct cgc gtt ctt tgt ctt 3959
Lys Leu Thr Lys Ile Tyr Leu Lys Lys Phe Ser Arg Val Leu Cys Leu
525 530 535

gcg att gga ttt gca tca gca ttt aca tat agt tat ata acc caa cct 4007
Ala Ile Gly Phe Ala Ser Ala Phe Thr Tyr Ser Tyr Ile Thr Gln Pro
540 545 550

aag ccg gag gtt aaa aag gta gtc tct cag acc tat gat ttt gat aaa 4055
Lys Pro Glu Val Lys Lys Val Val Ser Gln Thr Tyr Asp Phe Asp Lys
555 560 565

ttc act att gac tct tct cag cgt ctt aat cta agc tat cgc tat gtt 4103
Phe Thr Ile Asp Ser Ser Gln Arg Leu Asn Leu Ser Tyr Arg Tyr Val
570 575 580 585

ttc aag gat tct aag gga aaa tta att aat agc gac gat tta cag aag 4151
Phe Lys Asp Ser Lys Gly Lys Leu Ile Asn Ser Asp Asp Leu Gln Lys
590 595 600

caa ggt tat tca ctc aca tat att gat tta tgt act gtt tcc att aaa 4199
Gln Gly Tyr Ser Leu Thr Tyr Ile Asp Leu Cys Thr Val Ser Ile Lys
605 610 615

aaa ggt aat tca aat gaa att gtt aaa tgt aat taattttgtt ttcttgatgt 4252
Lys Gly Asn Ser Asn Glu Ile Val Lys Cys Asn
620 625

ttgtttcatc atcttctttt gctcaggtaa ttgaaatgaa taattcgcct ctgcgcgatt 4312 ttgtaacttg gtattcaaag caatcaggcg aatccgttat tgtttctccc gatgtaaaag 4372 gtactgttac tgtatattca tctgacgtta aacctgaaaa tctacgcaat ttctttattt 4432 ctgttttacg tgcaaataat tttgatatgg taggttctaa cccttccatt attcagaagt 4492 ataatccaaa caatcaggat tatattgatg aattgccatc atctgataat caggaatatg 4552 atgataattc cgctccttct ggtggtttct ttgttccgca aaatgataat gttactcaaa 4612 cttttaaaat taataacgtt cgggcaaagg atttaatacg agttgtcgaa ttgtttgtaa 4672 agtctaatac ttctaaatcc tcaaatgtat tatctattga cggctctaat ctattagttg 4732 ttagtgctcc taaagatatt ttagataacc ttcctcaatt cctttcaact gttgatttgc 4792 caactgacca gatattgatt gagggtttga tatttgaggt tcagcaaggt gatgctttag 4852 atttttcatt tgctgctggc tctcagcgtg gcactgttgc aggcggtgtt aatactgacc 4912 gcctcacctc tgttttatct tctgctggtg gttcgttcgg tatttttaat ggcgatgttt 4972 tagggctatc agttcgcgca ttaaagacta atagccattc aaaaatattg tctgtgccac 5032 gtattcttac gctttcaggt cagaagggtt ctatctctgt tggccagaat gtccctttta 5092 ttactggtcg tgtgactggt gaatctgcca atgtaaataa tccatttcag acgattgagc 5152 gtcaaaatgt aggtatttcc atgagcgttt ttcctgttgc aatggctggc ggtaatattg 5212 ttctggatat taccagcaag gccgatagtt tgagttcttc tactcaggca agtgatgtta 5272 ttactaatca aagaagtatt gctacaacgg ttaatttgcg tgatggacag actcttttac 5332 tcggtggcct cactgattat aaaaacactt ctcaggattc tggcgtaccg ttcctgtcta 5392 aaatcccttt aatcggcctc ctgtttagct cccgctctga ttctaacgag gaaagcacgt 5452 tatacgtgct cgtcaaagca accatagtac gcgccctgta gcggcgcatt aagcgcggcg 5512 ggtgtggtgg ttacgcgcag cgtgaccgct acacttgcca gcgccctagc gcccgctcct 5572 ttcgctttct tcccttcctt tctcgccacg ttcgccggct ttccccgtca agctctaaat 5632 cgggggctcc ctttagggtt ccgatttagt gctttacggc acctcgaccc caaaaaactt 5692 gatttgggtg atggttcacg tagtgggcca tcgccctgat agacggtttt tcgccctttg 5752 acgttggagt ccacgttctt taatagtgga ctcttgttcc aaactggaac aacactcaac 5812 cctatctcgg gctattcttt tgatttataa gggattttgc cgatttcgga accaccatca 5872 aacaggattt tcgcctgctg gggcaaacca gcgtggaccg cttgctgcaa ctctctcagg 5932

gccaggcggt gaagggcaat cagctgttgc ccgtctcact ggtgaaaaga aaaaccaccc 5992
tggatccaag cttgcaggtg gcacttttcg gggaaatgtg cgcggaaccc ctatttgttt 6052
atttttctaa atacattcaa atatgtatcc gctcatgaga caataaccct gataaatgct 6112
tcaataatat tgaaaaagga agagtatgag tattcaacat ttccgtgtcg cccttattcc 6172
cttttttgcg gcattttgcc ttcctgtttt tgctcaccca gaaacgctgg tgaaagtaaa 6232
agatgctgaa gatcagttgg gcgcactagt gggttacatc gaactggatc tcaacagcgg 6292
taagatcctt gagagttttc gccccgaaga acgttttcca atgatgagca cttttaaagt 6352
tctgctatgt ggcgcggtat tatcccgtat tgacgccggg caagagcaac tcggtcgccg 6412
catacactat tctcagaatg acttggttga gtactcacca gtcacagaaa agcatcttac 6472
ggatggcatg acagtaagag aattatgcag tgctgccata accatgagtg ataacactgc 6532
ggccaactta cttctgacaa cgatcggagg accgaaggag ctaaccgctt ttttgcacaa 6592
catgggggat catgtaactc gccttgatcg ttgggaaccg gagctgaatg aagccatacc 6652
aaacgacgag cgtgacacca cgatgcctgt agcaatggca acaacgttgc gcaaactatt 6712
aactggcgaa ctacttactc tagcttcccg gcaacaatta atagactgga tggaggcgga 6772
taaagttgca ggaccacttc tgcgctcggc ccttccggct ggctggttta ttgctgataa 6832
atctggagcc ggtgagcgtg ggtctcgcgg tatcattgca gcactggggc cagatggtaa 6892
gccctcccgt atcgtagtta tctacacgac ggggagtcag gcaactatgg atgaacgaaa 6952
tagacagatc gctgagatag gtgcctcact gattaagcat tggtaactgt cagaccaagt 7012
ttactcatat atactttaga ttgatttaaa acttcatttt taatttaaaa ggatctaggt 7072
gaagatcctt tttgataatc tcatgaccaa aatcccttaa cgtgagtttt cgttccactg 7132
tacgtaagac ccccaagctt gtcgactgaa tggcgaatgg cgctttgcct ggtttccggc 7192
accagaagcg gtgccggaaa gctggctgga gtgcgatctt cctgacgctc gagcgcaacg 7252
caattaatgt gagttagctc actcattagg caccccaggc tttacacttt atgcttccgg 7312
ctcgtatgtt gtgtggaatt gtgagcggat aacaatttca cacaggaaac agctatgacc 7372

atgattacgc caagctttgg agcctttttt ttggagattt tcaac gtg aaa aaa tta 7429
Met Lys Lys Leu
630

tta ttc gca att cct tta gtt gtt cct ttc tat tct cac agt gca caa 7477
Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser His Ser Ala Gln
635 640 645

gac atc cag atg acc cag tct cca gcc acc ctg tct ttg tct cca ggg 7525
Asp Ile Gln Met Thr Gln Ser Pro Ala Thr Leu Ser Leu Ser Pro Gly
650 655 660

gaa aga gcc acc ctc tcc tgc agg gcc agt cag ggt gtt agc agc tac 7573
Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Gly Val Ser Ser Tyr
665 670 675 680

tta gcc tgg tac cag cag aaa cct ggc cag gct ccc agg ctc ctc atc 7621
Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Leu Ile
685 690 695

tat gat gca tcc aac agg gcc act ggc atc cca gcc agg ttc agt ggc 7669
Tyr Asp Ala Ser Asn Arg Ala Thr Gly Ile Pro Ala Arg Phe Ser Gly
700 705 710

agt ggg cct ggg aca gac ttc act ctc acc atc agc agc cta gag cct 7717
Ser Gly Pro Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Glu Pro
715 720 725

gaa gat ttt gca gtt tat tac tgt cag cag cgt aac tgg cat ccg tgg 7765
Glu Asp Phe Ala Val Tyr Tyr Cys Gln Gln Arg Asn Trp His Pro Trp
730 735 740

acg ttc ggc caa ggg acc aag gtg gaa atc aaa cga act gtg gct gca 7813
Thr Phe Gly Gln Gly Thr Lys Val Glu Ile Lys Arg Thr Val Ala Ala
745 750 755 760

cca tct gtc ttc atc ttc ccg cca tct gat gag cag ttg aaa tct gga 7861
Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly
765 770 775

act gcc tct gtt gtg tgc ctg ctg aat aac ttc tat ccc aga gag gcc 7909
Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe Tyr Pro Arg Glu Ala
780 785 790

aaa gta cag tgg aag gtg gat aac gcc ctc caa tcg ggt aac tcc cag 7957
Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser Gly Asn Ser Gln
795 800 805

gag agt gtc aca gag cgg gac agc aag gac agc acc tac agc ctc agc 8005
Glu Ser Val Thr Glu Arg Asp Ser Lys Asp Ser Thr Tyr Ser Leu Ser
810 815 820

agc acc ctg acg ctg agc aaa gca gac tac gag aaa cac aaa gtc tac 8053
Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu Lys His Lys Val Tyr
825 830 835 840

gcc tgc gaa gtc acc cat cag ggc ctg agc tcg ccc gtc aca aag agc 8101
Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser Pro Val Thr Lys Ser
845 850 855

ttc aac agg gga gag tgt taataaggcg cgccaattct atttcaagga 8149
Phe Asn Arg Gly Glu Cys
860

gacagtcata atg aaa tac cta ttg cct acg gca gcc gct gga ttg tta 8198
Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu
865 870 875

tta ctc gcg gcc cag ccg gcc atg gcc gaa gtt caa ttg tta gag tct 8246
Leu Leu Ala Ala Gln Pro Ala Met Ala Glu Val Gln Leu Leu Glu Ser
880 885 890

ggt ggc ggt ctt gtt cag cct ggt ggt tct tta cgt ctt tct tgc gct 8294
Gly Gly Gly Leu Val Gln Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala
895 900 905

gct tcc gga ttc act ttc tct act tac gag atg cgt tgg gtt cgc caa 8342
Ala Ser Gly Phe Thr Phe Ser Thr Tyr Glu Met Arg Trp Val Arg Gln
910 915 920

gct cct ggt aaa ggt ttg gag tgg gtt tct tat atc gct cct tct ggt 8390
Ala Pro Gly Lys Gly Leu Glu Trp Val Ser Tyr Ile Ala Pro Ser Gly
925 930 935

ggc gat act gct tat gct gac tcc gtt aaa ggt cgc ttc act atc tct 8438
Gly Asp Thr Ala Tyr Ala Asp Ser Val Lys Gly Arg Phe Thr Ile Ser
940 945 950 955

aga gac aac tct aag aat act ctc tac ttg cag atg aac agc tta agg 8486
Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn Ser Leu Arg
960 965 970

gct gag gac act gca gtc tac tat tgt gcg agg agg ctc gat ggc tat 8534
Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Arg Arg Leu Asp Gly Tyr
975 980 985

att tcc tac tac tac ggt atg gac gtc tgg ggc caa ggg acc acg gtc 8582
Ile Ser Tyr Tyr Tyr Gly Met Asp Val Trp Gly Gln Gly Thr Thr Val
990 995 1000

acc gtc tca agc gcc tcc acc aag ggc cca tcg gtc ttc ccc ctg gca 8630
Thr Val Ser Ser Ala Ser Thr Lys Gly Pro Ser Val Phe Pro Leu Ala
1005 1010 1015

ccc tcc tcc aag agc acc tct ggg ggc aca gcg gcc ctg ggc tgc ctg 8678
Pro Ser Ser Lys Ser Thr Ser Gly Gly Thr Ala Ala Leu Gly Cys Leu
1020 1025 1030 1035

gtc aag gac tac ttc ccc gaa ccg gtg acg gtg tcg tgg aac tca ggc 8726
Val Lys Asp Tyr Phe Pro Glu Pro Val Thr Val Ser Trp Asn Ser Gly
1040 1045 1050

gcc ctg acc agc ggc gtc cac acc ttc ccg gct gtc cta cag tcc tca 8774
Ala Leu Thr Ser Gly Val His Thr Phe Pro Ala Val Leu Gln Ser Ser
1055 1060 1065

gga ctc tac tcc ctc agc agc gta gtg acc gtg ccc tcc agc agc ttg 8822
Gly Leu Tyr Ser Leu Ser Ser Val Val Thr Val Pro Ser Ser Ser Leu
1070 1075 1080

ggc acc cag acc tac atc tgc aac gtg aat cac aag ccc agc aac acc 8870
Gly Thr Gln Thr Tyr Ile Cys Asn Val Asn His Lys Pro Ser Asn Thr
1085 1090 1095

aag gtg gac aag aaa gtt gag ccc aaa tct tgt gcg gcc gca cat cat 8918
Lys Val Asp Lys Lys Val Glu Pro Lys Ser Cys Ala Ala Ala His His
1100 1105 1110 1115

cat cac cat cac ggg gcc gca gaa caa aaa ctc atc tca gaa gag gat 8966
His His His His Gly Ala Ala Glu Gln Lys Leu Ile Ser Glu Glu Asp
1120 1125 1130

ctg aat ggg gcc gca tag gct agc tct gct wsy ggy gay tty gay tay 9014
Leu Asn Gly Ala Ala Gln Ala Ser Ser Ala Ser Gly Asp Phe Asp Tyr
1135 1140 1145

gar aar atg gct aaw gcy aay aar ggs gcy atg acy gar aay gcy gay 9062
Glu Lys Met Ala Asn Ala Asn Lys Gly Ala Met Thr Glu Asn Ala Asp
1150 1155 1160

gar aay gck ytr car wsy gay gcy aar ggy aar ytw gay wsy gtc gck 9110
Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly Lys Leu Asp Ser Val Ala
1165 1170 1175

acy gay tay ggy gcy gcc atc gay ggy tty aty ggy gay gtc wsy ggy 9158
Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe Ile Gly Asp Val Ser Gly
1180 1185 1190 1195

ytk gcy aay ggy aay ggy gcy acy ggw gay tty gcw ggy tck aat tcy 9206
Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp Phe Ala Gly Ser Asn Ser
1200 1205 1210

car atg gcy car gty ggw gay ggk gay aay wsw cck ytw atg aay aay 9254
Gln Met Ala Gln Val Gly Asp Gly Asp Asn Ser Pro Leu Met Asn Asn
1215 1220 1225

tty mgw car tay ytw cck tcy cty cck car wsk gty gar tgy cgy ccw 9302
Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln Ser Val Glu Cys Arg Pro
1230 1235 1240

tty gty tty wsy gcy ggy aar ccw tay gar tty wsy aty gay tgy gay 9350
Phe Val Phe Ser Ala Gly Lys Pro Tyr Glu Phe Ser Ile Asp Cys Asp
1245 1250 1255

aar atm aay ytw tty cgy ggy gty tty gck tty ytk yta tay gty gcy 9398
Lys Ile Asn Leu Phe Arg Gly Val Phe Ala Phe Leu Leu Tyr Val Ala
1260 1265 1270 1275

acy tty atg tay gtw tty wsy ack tty gcy aay atw ytr cgy aay aar 9446
Thr Phe Met Tyr Val Phe Ser Thr Phe Ala Asn Ile Leu Arg Asn Lys
1280 1285 1290

gar wsy tagtgatctc ctaggaagcc cgcctaatga gcgggctttt tttttctggt 9502
Glu Ser

atgcatcctg aggccgatac tgtcgtcgtc ccctcaaact ggcagatgca cggttacgat 9562
gcgcccatct acaccaacgt gacctatccc attacggtca atccgccgtt tgttcccacg 9622
gagaatccga cgggttgtta ctcgctcaca tttaatgttg atgaaagctg gctacaggaa 9682
ggccagacgc gaattatttt tgatggcgtt cctattggtt aaaaaatgag ctgatttaac 9742

aaaaatttaa tgcgaatttt aacaaaatat taacgtttac aatttaaata tttgcttata 9802
caatcttcct gtttttgggg cttttctgat tatcaaccgg ggtacatatg attgacatgc 9862
tagttttacg attaccgttc atcgattctc ttgtttgctc cagactctca ggcaatgacc 9922
tgatagcctt tgtagatctc tcaaaaatag ctaccctctc cggcattaat ttatcagcta 9982
gaacggttga atatcatatt gatggtgatt tgactgtctc cggcctttct cacccttttg 10042
aatctttacc tacacattac tcaggcattg catttaaaat atatgagggt tctaaaaatt 10102
tttatccttg cgttgaaata aaggcttctc ccgcaaaagt attacagggt cataatgttt 10162
ttggtacaac cgatttagct ttatgctctg aggctttatt gcttaatttt gctaattctt 10222
tgccttgcct gtatgattta ttggatgtt 10251

<210> 583 <211> 113 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: CJRA05 protein sequence <400> 583

Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser
1 5 10 15

Gly Ala Ala Glu Ser His Leu Asp Gly Ala Ala Glu Thr Val Glu Ser
20 25 30

Cys Leu Ala Lys Ser His Thr Glu Asn Ser Phe Thr Asn Val Trp Lys
35 40 45

Asp Asp Lys Thr Leu Asp Arg Tyr Ala Asn Tyr Glu Gly Cys Leu Trp
50 55 60

Asn Ala Thr Gly Val Val Val Cys Thr Gly Asp Glu Thr Gln Cys Tyr
65 70 75 80

Gly Thr Trp Val Pro Ile Gly Leu Ala Ile Pro Glu Asn Glu Gly Gly
85 90 95

Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly
100 105 110

Thr

<210> 584 <211> 152 <212> PRT <213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: CJRA05 protein sequence <400> 584

Ser Gly Asp Phe Asp Tyr Glu Lys Met Ala Asn Ala Asn Lys Gly Ala
1 5 10 15

Met Thr Glu Asn Ala Asp Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly
20 25 30

Lys Leu Asp Ser Val Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe
35 40 45

Ile Gly Asp Val Ser Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp
50 55 60

Phe Ala Gly Ser Asn Ser Gln Met Ala Gln Val Gly Asp Gly Asp Asn
65 70 75 80

Ser Pro Leu Met Asn Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln
85 90 95

Ser Val Glu Cys Arg Pro Phe Val Phe Gly Ala Gly Lys Pro Tyr Glu
100 105 110

Phe Ser Ile Asp Cys Asp Lys Ile Asn Leu Phe Arg Gly Val Phe Ala
115 120 125

Phe Leu Leu Tyr Val Ala Thr Phe Met Tyr Val Phe Ser Thr Phe Ala
130 135 140

Asn Ile Leu Arg Asn Lys Glu Ser
145 150
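For comparison with other databases it can be convenient to convert the three-letter residues of a protein entry to one-letter codes. The sketch below is illustrative only and not part of the listing: the names THREE_TO_ONE and to_one_letter are hypothetical, the table covers only the twenty standard amino acids, and position numbers are assumed to be the only purely numeric tokens.

```python
# Standard three-letter to one-letter amino acid codes (20 residues only).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residue_text):
    """Convert space-separated three-letter residues to a one-letter string.

    Numeric tokens (position markers) are skipped. Any other token that is
    not a standard residue raises KeyError, so a misread token is reported
    instead of being silently dropped.
    """
    tokens = [tok for tok in residue_text.split() if tok.isalpha()]
    return "".join(THREE_TO_ONE[tok.capitalize()] for tok in tokens)


# First ten residues of SEQ ID NO: 584, with interleaved position markers.
print(to_one_letter("Ser Gly Asp Phe Asp 5 Tyr Glu Lys Met Ala 10"))
```

Raising on unknown tokens is a deliberate choice in this sketch, since character-recognition damage tends to produce tokens that are not valid residue names.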

<210> 585 <211> 15 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: CJRA05 peptide sequence <400> 585

Met Pro Val Leu Leu Gly Ile Pro Leu Leu Leu Arg Phe Leu Gly
1 5 10 15

<210> 586 <211> 348 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: CJRA05 protein sequence <400> 586

Met Ala Val Tyr Phe Val Thr Gly Lys Leu Gly Ser Gly Lys Thr Leu
1 5 10 15

Val Ser Val Gly Lys Ile Gln Asp Lys Ile Val Ala Gly Cys Lys Ile
20 25 30

Ala Thr Asn Leu Asp Leu Arg Leu Gln Asn Leu Pro Gln Val Gly Arg
35 40 45

Phe Ala Lys Thr Pro Arg Val Leu Arg Ile Pro Asp Lys Pro Ser Ile
50 55 60

Ser Asp Leu Leu Ala Ile Gly Arg Gly Asn Asp Ser Tyr Asp Glu Asn
65 70 75 80

Lys Asn Gly Leu Leu Val Leu Asp Glu Cys Gly Thr Trp Phe Asn Thr
85 90 95

Arg Ser Trp Asn Asp Lys Glu Arg Gln Pro Ile Ile Asp Trp Phe Leu
100 105 110

His Ala Arg Lys Leu Gly Trp Asp Ile Ile Phe Leu Val Gln Asp Leu
115 120 125

Ser Ile Val Asp Lys Gln Ala Arg Ser Ala Leu Ala Glu His Val Val
130 135 140

Tyr Cys Arg Arg Leu Asp Arg Ile Thr Leu Pro Phe Val Gly Thr Leu
145 150 155 160

Tyr Ser Leu Ile Thr Gly Ser Lys Met Pro Leu Pro Lys Leu His Val
165 170 175

Gly Val Val Lys Tyr Gly Asp Ser Gln Leu Ser Pro Thr Val Glu Arg
180 185 190

Trp Leu Tyr Thr Gly Lys Asn Leu Tyr Asn Ala Tyr Asp Thr Lys Gln
195 200 205

Ala Phe Ser Ser Asn Tyr Asp Ser Gly Val Tyr Ser Tyr Leu Thr Pro
210 215 220

Tyr Leu Ser His Gly Arg Tyr Phe Lys Pro Leu Asn Leu Gly Gln Lys
225 230 235 240

Met Lys Leu Thr Lys Ile Tyr Leu Lys Lys Phe Ser Arg Val Leu Cys
245 250 255

Leu Ala Ile Gly Phe Ala Ser Ala Phe Thr Tyr Ser Tyr Ile Thr Gln
260 265 270

Pro Lys Pro Glu Val Lys Lys Val Val Ser Gln Thr Tyr Asp Phe Asp
275 280 285

Lys Phe Thr Ile Asp Ser Ser Gln Arg Leu Asn Leu Ser Tyr Arg Tyr
290 295 300

Val Phe Lys Asp Ser Lys Gly Lys Leu Ile Asn Ser Asp Asp Leu Gln
305 310 315 320

Lys Gln Gly Tyr Ser Leu Thr Tyr Ile Asp Leu Cys Thr Val Ser Ile
325 330 335

Lys Lys Gly Asn Ser Asn Glu Ile Val Lys Cys Asn
340 345

<210> 587 <211> 234 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: CJRA05 protein sequence <400> 587

Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser 1 5 10 15

His Ser Ala Gln Asp Ile Gln Met Thr Gln Ser Pro Ala Thr Leu Ser 20 25 30

Leu Ser Pro Gly Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Gly 35 40 45

Val Ser Ser Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro 50 55 60

Arg Leu Leu Ile Tyr Asp Ala Ser Asn Arg Ala Thr Gly Ile Pro Ala

65 70 75 80

Arg Phe Ser Gly Ser Gly Pro Gly Thr Asp Phe Thr Leu Thr Ile Ser

85 90 95

Ser Leu Glu Pro Glu Asp Phe Ala Val Tyr Tyr Cys Gln Gln Arg Asn 100 105 110

Trp His Pro Trp Thr Phe Gly Gln Gly Thr Lys Val Glu Ile Lys Arg 115 120 125

Thr Val Ala Ala Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu Gln 130 135 140

Leu Lys Ser Gly Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe Tyr

145 150 155 160

Pro Arg Glu Ala Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser

165 170 175

Gly Asn Ser Gln Glu Ser Val Thr Glu Arg Asp Ser Lys Asp Ser Thr 180 185 190


Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu Lys 195 200 205

His Lys Val Tyr Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser Pro 210 215 220

Val Thr Lys Ser Phe Asn Arg Gly Glu Cys 225 230

<210> 588 <211> 431 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: CJRA05 protein sequence <400> 588

Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala 1 5 10 15

Ala Gln Pro Ala Met Ala Glu Val Gln Leu Leu Glu Ser Gly Gly Gly 20 25 30

Leu Val Gln Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly 35 40 45

Phe Thr Phe Ser Thr Tyr Glu Met Arg Trp Val Arg Gln Ala Pro Gly 50 55 60

Lys Gly Leu Glu Trp Val Ser Tyr Ile Ala Pro Ser Gly Gly Asp Thr 65 70 75 80

Ala Tyr Ala Asp Ser Val Lys Gly Arg Phe Thr Ile Ser Arg Asp Asn 85 90 95

Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn Ser Leu Arg Ala Glu Asp 100 105 110

Thr Ala Val Tyr Tyr Cys Ala Arg Arg Leu Asp Gly Tyr Ile Ser Tyr 115 120 125

Tyr Tyr Gly Met Asp Val Trp Gly Gln Gly Thr Thr Val Thr Val Ser 130 135 140

Ser Ala Ser Thr Lys Gly Pro Ser Val Phe Pro Leu Ala Pro Ser Ser 145 150 155 160

Lys Ser Thr Ser Gly Gly Thr Ala Ala Leu Gly Cys Leu Val Lys Asp 165 170 175

Tyr Phe Pro Glu Pro Val Thr Val Ser Trp Asn Ser Gly Ala Leu Thr 180 185 190

Ser Gly Val His Thr Phe Pro Ala Val Leu Gln Ser Ser Gly Leu Tyr 195 200 205


Ser Leu Ser Ser Val Val Thr Val Pro Ser Ser Ser Leu Gly Thr Gln 210 215 220

Thr Tyr Ile Cys Asn Val Asn His Lys Pro Ser Asn Thr Lys Val Asp 225 230 235 240

Lys Lys Val Glu Pro Lys Ser Cys Ala Ala Ala His His His His His 245 250 255

His Gly Ala Ala Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn Gly 260 265 270

Ala Ala Gln Ala Ser Ser Ala Ser Gly Asp Phe Asp Tyr Glu Lys Met 275 280 285

Ala Asn Ala Asn Lys Gly Ala Met Thr Glu Asn Ala Asp Glu Asn Ala 290 295 300

Leu Gln Ser Asp Ala Lys Gly Lys Leu Asp Ser Val Ala Thr Asp Tyr 305 310 315 320

Gly Ala Ala Ile Asp Gly Phe Ile Gly Asp Val Ser Gly Leu Ala Asn 325 330 335

Gly Asn Gly Ala Thr Gly Asp Phe Ala Gly Ser Asn Ser Gln Met Ala 340 345 350

Gln Val Gly Asp Gly Asp Asn Ser Pro Leu Met Asn Asn Phe Arg Gln 355 360 365

Tyr Leu Pro Ser Leu Pro Gln Ser Val Glu Cys Arg Pro Phe Val Phe 370 375 380

Ser Ala Gly Lys Pro Tyr Glu Phe Ser Ile Asp Cys Asp Lys Ile Asn 385 390 395 400

Leu Phe Arg Gly Val Phe Ala Phe Leu Leu Tyr Val Ala Thr Phe Met 405 410 415

Tyr Val Phe Ser Thr Phe Ala Asn Ile Leu Arg Asn Lys Glu Ser 420 425 430

<210> 589 <211> 5 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Illustrative peptide <400> 589

Glu Gly Gly Gly Ser 1 5

<210> 590 <211> 1275 <212> DNA <213> Unknown Organism <220>

<221> CDS <222> (1) .. (1272) <220>

<223> Description of Unknown Organism: M13 nucleotide sequence <400> 590

gtg aaa aaa tta tta ttc gca att cct tta gtt gtt cct ttc tat tct 48

Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser

1 5 10 15

cac tcc gct gaa act gtt gaa agt tgt tta gca aaa ccc cat aca gaa 96

His Ser Ala Glu Thr Val Glu Ser Cys Leu Ala Lys Pro His Thr Glu

20 25 30

aat tca ttt act aac gtc tgg aaa gac gac aaa act tta gat cgt tac 144

Asn Ser Phe Thr Asn Val Trp Lys Asp Asp Lys Thr Leu Asp Arg Tyr

35 40 45

gct aac tat gag ggt tgt ctg tgg aat gct aca ggc gtt gta gtt tgt 192

Ala Asn Tyr Glu Gly Cys Leu Trp Asn Ala Thr Gly Val Val Val Cys

50 55 60

act ggt gac gaa act cag tgt tac ggt aca tgg gtt cct att ggg ctt 240

Thr Gly Asp Glu Thr Gln Cys Tyr Gly Thr Trp Val Pro Ile Gly Leu

65 70 75 80

gct atc cct gaa aat gag ggt ggt ggc tct gag ggt ggc ggt tct gag 288

Ala Ile Pro Glu Asn Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu

85 90 95

ggt ggc ggt tct gag ggt ggc ggt act aaa cct cct gag tac ggt gat 336

Gly Gly Gly Ser Glu Gly Gly Gly Thr Lys Pro Pro Glu Tyr Gly Asp

100 105 110

aca cct att ccg ggc tat act tat atc aac cct ctc gac ggc act tat 384

Thr Pro Ile Pro Gly Tyr Thr Tyr Ile Asn Pro Leu Asp Gly Thr Tyr

115 120 125

ccg cct ggt act gag caa aac ccc gct aat cct aat cct tct ctt gag 432

Pro Pro Gly Thr Glu Gln Asn Pro Ala Asn Pro Asn Pro Ser Leu Glu

130 135 140

gag tct cag cct ctt aat act ttc atg ttt cag aat aat agg ttc cga 480

Glu Ser Gln Pro Leu Asn Thr Phe Met Phe Gln Asn Asn Arg Phe Arg

145 150 155 160

aat agg cag ggg gca tta act gtt tat acg ggc act gtt act caa ggc 528

Asn Arg Gln Gly Ala Leu Thr Val Tyr Thr Gly Thr Val Thr Gln Gly

165 170 175


act gac ccc gtt aaa act tat tac cag tac act cct gta tca tca aaa 576

Thr Asp Pro Val Lys Thr Tyr Tyr Gln Tyr Thr Pro Val Ser Ser Lys

180 185 190

gcc atg tat gac gct tac tgg aac ggt aaa ttc aga gac tgc gct ttc 624

Ala Met Tyr Asp Ala Tyr Trp Asn Gly Lys Phe Arg Asp Cys Ala Phe

195 200 205

cat tct ggc ttt aat gag gat cca ttc gtt tgt gaa tat caa ggc caa 672

His Ser Gly Phe Asn Glu Asp Pro Phe Val Cys Glu Tyr Gln Gly Gln

210 215 220

tcg tct gac ctg cct caa cct cct gtc aat gct ggc ggc ggc tct ggt 720

Ser Ser Asp Leu Pro Gln Pro Pro Val Asn Ala Gly Gly Gly Ser Gly

225 230 235 240

ggt ggt tct ggt ggc ggc tct gag ggt ggt ggc tct gag ggt ggc ggt 768

Gly Gly Ser Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly

245 250 255

tct gag ggt ggc ggc tct gag gga ggc ggt tcc ggt ggt ggc tct ggt 816

Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Gly Gly Gly Ser Gly

260 265 270

tcc ggt gat ttt gat tat gaa aag atg gca aac gct aat aag ggg gct 864

Ser Gly Asp Phe Asp Tyr Glu Lys Met Ala Asn Ala Asn Lys Gly Ala

275 280 285

atg acc gaa aat gcc gat gaa aac gcg cta cag tct gac gct aaa ggc 912

Met Thr Glu Asn Ala Asp Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly

290 295 300

aaa ctt gat tct gtc gct act gat tac ggt gct gct atc gat ggt ttc 960

Lys Leu Asp Ser Val Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe

305 310 315 320

att ggt gac gtt tcc ggc ctt gct aat ggt aat ggt gct act ggt gat 1008

Ile Gly Asp Val Ser Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp

325 330 335

ttt gct ggc tct aat tcc caa atg gct caa gtc ggt gac ggt gat aat 1056

Phe Ala Gly Ser Asn Ser Gln Met Ala Gln Val Gly Asp Gly Asp Asn

340 345 350

tca cct tta atg aat aat ttc cgt caa tat tta cct tcc ctc cct caa 1104

Ser Pro Leu Met Asn Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln

355 360 365

tcg gtt gaa tgt cgc cct ttt gtc ttt agc gct ggt aaa cca tat gaa 1152

Ser Val Glu Cys Arg Pro Phe Val Phe Ser Ala Gly Lys Pro Tyr Glu

370 375 380

ttt tct att gat tgt gac aaa ata aac tta ttc cgt ggt gtc ttt gcg 1200

Phe Ser Ile Asp Cys Asp Lys Ile Asn Leu Phe Arg Gly Val Phe Ala

385 390 395 400

ttt ctt tta tat gtt gcc acc ttt atg tat gta ttt tct acg ttt gct 1248

Phe Leu Leu Tyr Val Ala Thr Phe Met Tyr Val Phe Ser Thr Phe Ala

405 410 415

aac ata ctg cgt aat aag gag tct taa 1275

Asn Ile Leu Arg Asn Lys Glu Ser

420

<210> 591 <211> 424 <212> PRT <213> Unknown Organism <220>

<223> Description of Unknown Organism: M13 protein sequence <400> 591

Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser 1 5 10 15

His Ser Ala Glu Thr Val Glu Ser Cys Leu Ala Lys Pro His Thr Glu 20 25 30

Asn Ser Phe Thr Asn Val Trp Lys Asp Asp Lys Thr Leu Asp Arg Tyr 35 40 45

Ala Asn Tyr Glu Gly Cys Leu Trp Asn Ala Thr Gly Val Val Val Cys 50 55 60

Thr Gly Asp Glu Thr Gln Cys Tyr Gly Thr Trp Val Pro Ile Gly Leu 65 70 75 80

Ala Ile Pro Glu Asn Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu 85 90 95

Gly Gly Gly Ser Glu Gly Gly Gly Thr Lys Pro Pro Glu Tyr Gly Asp 100 105 110

Thr Pro Ile Pro Gly Tyr Thr Tyr Ile Asn Pro Leu Asp Gly Thr Tyr 115 120 125

Pro Pro Gly Thr Glu Gln Asn Pro Ala Asn Pro Asn Pro Ser Leu Glu 130 135 140

Glu Ser Gln Pro Leu Asn Thr Phe Met Phe Gln Asn Asn Arg Phe Arg 145 150 155 160

Asn Arg Gln Gly Ala Leu Thr Val Tyr Thr Gly Thr Val Thr Gln Gly 165 170 175

Thr Asp Pro Val Lys Thr Tyr Tyr Gln Tyr Thr Pro Val Ser Ser Lys 180 185 190

Ala Met Tyr Asp Ala Tyr Trp Asn Gly Lys Phe Arg Asp Cys Ala Phe 195 200 205

His Ser Gly Phe Asn Glu Asp Pro Phe Val Cys Glu Tyr Gln Gly Gln 210 215 220


Ser Ser Asp Leu Pro Gln Pro Pro Val Asn Ala Gly Gly Gly Ser Gly 225 230 235 240

Gly Gly Ser Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly 245 250 255

Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Gly Gly Gly Ser Gly 260 265 270

Ser Gly Asp Phe Asp Tyr Glu Lys Met Ala Asn Ala Asn Lys Gly Ala 275 280 285

Met Thr Glu Asn Ala Asp Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly 290 295 300

Lys Leu Asp Ser Val Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe 305 310 315 320

Ile Gly Asp Val Ser Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp 325 330 335

Phe Ala Gly Ser Asn Ser Gln Met Ala Gln Val Gly Asp Gly Asp Asn 340 345 350

Ser Pro Leu Met Asn Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln 355 360 365

Ser Val Glu Cys Arg Pro Phe Val Phe Ser Ala Gly Lys Pro Tyr Glu 370 375 380

Phe Ser Ile Asp Cys Asp Lys Ile Asn Leu Phe Arg Gly Val Phe Ala 385 390 395 400

Phe Leu Leu Tyr Val Ala Thr Phe Met Tyr Val Phe Ser Thr Phe Ala 405 410 415

Asn Ile Leu Arg Asn Lys Glu Ser 420

<210> 592 <211> 35 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 592 caacgatgat cgtatggcgc atgctgccga gacag

<210> 593 <211> 1355 <212> DNA <213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: M13-III nucleotide sequence <220>

<221> CDS <222> (1) .. (1305) <400> 593

gcg gcc gca cat cat cat cac cat cac ggg gcc gca gaa caa aaa ctc 48

Ala Ala Ala His His His His His His Gly Ala Ala Glu Gln Lys Leu

1 5 10 15

atc tca gaa gag gat ctg aat ggg gcc gca tag gct agc gat atc aac 96

Ile Ser Glu Glu Asp Leu Asn Gly Ala Ala Ala Ser Asp Ile Asn

20 25 30

gat gat cgt atg gct tct act gcy gar acw gty gaa wsy tgy ytr gcm 144

Asp Asp Arg Met Ala Ser Thr Ala Glu Thr Val Glu Ser Cys Leu Ala

35 40 45

aar ccy cay acw gar aat wsw tty acw aay gts tgg aar gay gay aar 192

Lys Pro His Thr Glu Asn Ser Phe Thr Asn Val Trp Lys Asp Asp Lys

50 55 60

acy ytw gat cgw tay gcy aay tay gar ggy tgy ytr tgg aat gcy acm 240

Thr Leu Asp Arg Tyr Ala Asn Tyr Glu Gly Cys Leu Trp Asn Ala Thr

65 70 75

ggc gty gtw gty tgy ack ggy gay gar acw car tgy tay ggy acr tgg 288

Gly Val Val Val Cys Thr Gly Asp Glu Thr Gln Cys Tyr Gly Thr Trp

80 85 90 95

gtk cck atw ggs ytw gcy atm cck gar aay gar ggy ggy ggy wsy gar 336

Val Pro Ile Gly Leu Ala Ile Pro Glu Asn Glu Gly Gly Gly Ser Glu

100 105 110

ggy ggy ggy wsy gar ggy ggy ggw tcy gar ggw ggy ggw acy aar cck 384

Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Thr Lys Pro

115 120 125

cck gar tay ggy gay acw cck atw cck ggy tay acy tay aty aay cck 432

Pro Glu Tyr Gly Asp Thr Pro Ile Pro Gly Tyr Thr Tyr Ile Asn Pro

130 135 140

ytm gay ggm acy tay cck cck ggy acy gar car aay ccy gcy aay cck 480

Leu Asp Gly Thr Tyr Pro Pro Gly Thr Glu Gln Asn Pro Ala Asn Pro

145 150 155

aay ccw wsy ytw gar gar wsy car cck ytw aay acy tty atg tty car 528

Asn Pro Ser Leu Glu Glu Ser Gln Pro Leu Asn Thr Phe Met Phe Gln

160 165 170 175

aay aay mgk tty mgr aay mgk car ggk gcw ytw acy gtk tay ack ggm 576

Asn Asn Arg Phe Arg Asn Arg Gln Gly Ala Leu Thr Val Tyr Thr Gly

180 185 190


acy gty acy car ggy acy gay ccy gty aar acy tay tay car tay acy 624

Thr Val Thr Gln Gly Thr Asp Pro Val Lys Thr Tyr Tyr Gln Tyr Thr

195 200 205

cck gtm tcr wsw aar gcy atg tay gay gcy tay tgg aay ggy aar tty 672

Pro Val Ser Ser Lys Ala Met Tyr Asp Ala Tyr Trp Asn Gly Lys Phe

210 215 220

mgw gay tgy gcy tty cay wsy ggy tty aay gar gay ccw tty gty tgy 720

Arg Asp Cys Ala Phe His Ser Gly Phe Asn Glu Asp Pro Phe Val Cys

225 230 235

gar tay car ggy car wsk wsy gay ytr cck car ccw cck gty aay gck 768

Glu Tyr Gln Gly Gln Ser Ser Asp Leu Pro Gln Pro Pro Val Asn Ala

240 245 250 255

ggy ggy ggy wsy ggy ggw ggy wsy ggy ggy ggy wsy gar ggy ggw ggy 816

Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly Gly Ser Glu Gly Gly Gly

260 265 270

wsy gar ggw ggy ggy wsy ggt ggy ggy wsy ggy wsy ggy gay tty gay 864

Ser Glu Gly Gly Gly Ser Gly Gly Gly Ser Gly Ser Gly Asp Phe Asp

275 280 285

tay gar aar atg gcw aay gcy aay aar ggs gcy atg acy gar aay gcy 912

Tyr Glu Lys Met Ala Asn Ala Asn Lys Gly Ala Met Thr Glu Asn Ala

290 295 300

gay gar aay gcr ctr car wst gay gcy aar ggy aar ytw gay wsy gtc 960

Asp Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly Lys Leu Asp Ser Val

305 310 315

gcy acw gay tay ggt gct gcy atc gay ggy tty aty ggy gay gty wsy 1008

Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe Ile Gly Asp Val Ser

320 325 330 335

ggy ctk gct aay ggy aay ggw gcy acy ggw gay tty gcw ggy tck aat 1056

Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp Phe Ala Gly Ser Asn

340 345 350

tcy car atg gcy car gty ggw gay ggk gay aay wsw cck ytw atg aay 1104

Ser Gln Met Ala Gln Val Gly Asp Gly Asp Asn Ser Pro Leu Met Asn

355 360 365

aay tty mgw car tay ytw cck tcy cty cck car wsk gty gar tgy cgy 1152

Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln Ser Val Glu Cys Arg

370 375 380

ccw tty gty tty wsy gcy ggy aar ccw tay gar tty wsy aty gay tgy 1200

Pro Phe Val Phe Ser Ala Gly Lys Pro Tyr Glu Phe Ser Ile Asp Cys

385 390 395

gay aar atm aay ytw ttc cgy ggy gty tty gck tty ytk yta tay gty 1248

Asp Lys Ile Asn Leu Phe Arg Gly Val Phe Ala Phe Leu Leu Tyr Val

400 405 410 415

gcy acy tty atg tay gtw tty wsy ack tty gcy aay atw ytr cgy aay 1296

Ala Thr Phe Met Tyr Val Phe Ser Thr Phe Ala Asn Ile Leu Arg Asn

420 425 430

aar gar wsy tagtgatctc ctaggaagcc cgcctaatga gcgggctttt 1345

Lys Glu Ser

tttttctggt 1355

<210> 594 <211> 434 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: M13-III protein sequence <400> 594

Ala Ala Ala His His His His His His Gly Ala Ala Glu Gln Lys Leu 1 5 10 15

Ile Ser Glu Glu Asp Leu Asn Gly Ala Ala Ala Ser Asp Ile Asn Asp 20 25 30

Asp Arg Met Ala Ser Thr Ala Glu Thr Val Glu Ser Cys Leu Ala Lys 35 40 45

Pro His Thr Glu Asn Ser Phe Thr Asn Val Trp Lys Asp Asp Lys Thr 50 55 60

Leu Asp Arg Tyr Ala Asn Tyr Glu Gly Cys Leu Trp Asn Ala Thr Gly

65 70 75 80

Val Val Val Cys Thr Gly Asp Glu Thr Gln Cys Tyr Gly Thr Trp Val

85 90 95

Pro Ile Gly Leu Ala Ile Pro Glu Asn Glu Gly Gly Gly Ser Glu Gly 100 105 110

Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Thr Lys Pro Pro 115 120 125

Glu Tyr Gly Asp Thr Pro Ile Pro Gly Tyr Thr Tyr Ile Asn Pro Leu 130 135 140

Asp Gly Thr Tyr Pro Pro Gly Thr Glu Gln Asn Pro Ala Asn Pro Asn

145 150 155 160

Pro Ser Leu Glu Glu Ser Gln Pro Leu Asn Thr Phe Met Phe Gln Asn

165 170 175

Asn Arg Phe Arg Asn Arg Gln Gly Ala Leu Thr Val Tyr Thr Gly Thr 180 185 190

Val Thr Gln Gly Thr Asp Pro Val Lys Thr Tyr Tyr Gln Tyr Thr Pro 195 200 205


Val Ser Ser Lys Ala Met Tyr Asp Ala Tyr Trp Asn Gly Lys Phe Arg 210 215 220

Asp Cys Ala Phe His Ser Gly Phe Asn Glu Asp Pro Phe Val Cys Glu 225 230 235 240

Tyr Gln Gly Gln Ser Ser Asp Leu Pro Gln Pro Pro Val Asn Ala Gly 245 250 255

Gly Gly Ser Gly Gly Gly Ser Gly Gly Gly Ser Glu Gly Gly Gly Ser 260 265 270

Glu Gly Gly Gly Ser Gly Gly Gly Ser Gly Ser Gly Asp Phe Asp Tyr 275 280 285

Glu Lys Met Ala Asn Ala Asn Lys Gly Ala Met Thr Glu Asn Ala Asp 290 295 300

Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly Lys Leu Asp Ser Val Ala 305 310 315 320

Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe Ile Gly Asp Val Ser Gly 325 330 335

Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp Phe Ala Gly Ser Asn Ser 340 345 350

Gln Met Ala Gln Val Gly Asp Gly Asp Asn Ser Pro Leu Met Asn Asn 355 360 365

Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln Ser Val Glu Cys Arg Pro 370 375 380

Phe Val Phe Ser Ala Gly Lys Pro Tyr Glu Phe Ser Ile Asp Cys Asp 385 390 395 400

Lys Ile Asn Leu Phe Arg Gly Val Phe Ala Phe Leu Leu Tyr Val Ala 405 410 415

Thr Phe Met Tyr Val Phe Ser Thr Phe Ala Asn Ile Leu Arg Asn Lys 420 425 430

Glu Ser

<210> 595 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 595 cgttgatatc gctagcctat gc

<210> 596 <211> 30 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 596 gataggctta gctagcccgg agaacgaagg <210> 597 <211> 37 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 597 ctttcacagc ggtttcgcta gcgacccttt tgtctgc <210> 598 <211> 50 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 598 ctttcacagc ggtttcgcta gcgacccttt tgtcagcgag taccagggtc <210> 599 <211> 37 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 599 gactgtctcg gcagcatgcg ccatacgatc atcgttg <210> 600 <211> 37 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> CDS <222> (2)..(25) <400> 600

c aac gat gat cgt atg gcg cat gct gccgagacag tc

Asn Asp Asp Arg Met Ala His Ala

1 5

<210> 601 <211> 8 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic peptide <400> 601

Asn Asp Asp Arg Met Ala His Ala 1 5

<210> 602 <211> 37 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 602 ctttcacagc ggtttgcatg cagacccttt tgtctgc

<210> 603 <211> 50 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 603 ctttcacagc ggtttgcatg cagacccttt tgtcagcgag taccagggtc

<210> 604 <211> 7 <212> PRT <213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Illustrative peptide <400> 604

Tyr Ala Asp Ser Val Lys Gly 1 5 <210> 605 <211> 21 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Primer <400> 605 cctcgacagc gaagtgcaca g <210> 606 <211> 38 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 606 ggctgagtca agacgctctg tgcacttcgc tgtcgagg <210> 607 <211> 7 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Illustrative peptide <400> 607

Gln Ser Ala Leu Thr Gln Pro 1 5 <210> 608 <211> 22 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Primer <400> 608 cctctgtcac agtgcacaag ac


<210> 609 <211> 42 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 609 cctctgtcac agtgcacaag acatccagat gacccagtct cc

<210> 610 <211> 50 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 610 gggaggatgg agactgggtc gtctggatgt cttgtgcact gtgacagagg

<210> 611 <211> 11 <212> PRT <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Illustrative peptide <400> 611 Gln Asp Ile Gln Met Thr Gln Ser Pro Ser Ser 1 5 10

<210> 612 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Primer <400> 612

gactgggtgt agtgatctag 20 <210> 613 <211> 28 <212> DNA <213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 613 ggtgtagtga tcttctagtg acaactct 28 <210> 614 <211> 6 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic peptide <400> 614

Val Ser Ser Arg Asp Asn 1 5 <210> 615 <211> 15 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220> <221> CDS <222> (1)..(15) <400> 615 tac tat tgt gcg aaa

Tyr Tyr Cys Ala Lys 1 5 <210> 616 <211> 5 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic peptide <400> 616

Tyr Tyr Cys Ala Lys 1 5 <210> 617 <211> 36

<212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 617 ggtgccgata ggcttgcatg caccggagaa cgaagg 36 <210> 618

<211> 95 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 618 cgcttcacta agtctagaga caactctaag aatactctct acttgcagat gaacagctta 60 agggctgagg acactgcagt ctactattgt acgag 95 <210> 619 <211> 10 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (4)..(7) <223> A, T, C, G, other or unknown <400> 619 gatnnnnatc 10 <210> 620 <211> 10 <212> PRT <213> Unknown Organism <220> <223> Description of Unknown Organism: MALIA3-derived peptide <400> 620

Met Lys Leu Leu Asn Val Ile Asn Phe Val 1 5 10 <210> 621

<211> 29 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: CJRA05-derived peptide

<400> 621

Met Ser Val Leu Val Tyr Ser Phe Ala Ser Phe Val Leu Gly 1 5 10

Leu Arg Ser Gly Ile Thr Tyr Phe Thr Arg Leu Met Glu 20 25

<210> 622 <211> 15 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Illustrative nucleotide sequence

<400> 622 tttttttttt ttttt

<210> 623 <211> 87 <212> PRT <213> Unknown Organism <220> <223> Description of Unknown Organism: MALIA3-derived peptide <400> 623

Met Ile Lys Val Glu Ile Lys Pro Ser Gln Ala Gln Phe Thr Thr Arg 1 5 10 15

Ser Gly Val Ser Arg Gln Gly Lys Pro Tyr Ser Leu Asn Glu Gln Leu 20 25 30

Cys Tyr Val Asp Leu Gly Asn Glu Tyr Pro Val Leu Val Lys Ile Thr 35 40 45

Leu Asp Glu Gly Gln Pro Ala Tyr Ala Pro Gly Leu Tyr Thr Val His 50 55 60

Leu Ser Ser Phe Lys Val Gly Gln Phe Gly Ser Leu Met Ile Asp Arg 65 70 75 80

Leu Arg Leu Val Pro Ala Lys 85


<210> 624 <211> 29 <212> PRT <213> Unknown Organism <220> <223> Description of Unknown Organism: MALIA3-derived peptide <400> 624

Met Ser Val Leu Val Tyr Ser Phe Ala Ser Phe Val Leu Gly 1 5 10

Leu Arg Ser Gly Ile Thr Tyr Phe Thr Arg Leu Met Glu 20 25

<210> 625 <211> 10 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <220>

<221> modified_base <222> (7)..(10) <223> A, T, C, G, other or unknown <400> 625 ctcttcnnnn

<210> 626 <211> 87 <212> PRT <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: CJRA05-derived peptide <400> 626

Met Ile Lys Val Glu Ile Lys Pro Ser Gln Ala Gln Phe Thr Thr Arg 1 5 10 15

Ser Gly Val Ser Arg Gln Gly Lys Pro Tyr Ser Leu Asn Glu Gln Leu 20 25 30

Cys Tyr Val Asp Leu Gly Asn Glu Tyr Pro Val Leu Val Lys Ile Thr 35 40 45

Leu Asp Glu Gly Gln Pro Ala Tyr Ala Pro Gly Leu Tyr Thr Val His 50 55 60

Leu Ser Ser Phe Lys Val Gly Gln Phe Gly Ser Leu Met Ile Asp Arg 65 70 75 80

Leu Arg Leu Val Pro Ala Lys 85

<210> 627 <211> 10 <212> PRT <213> Artificial Sequence <220> <223> Description of Artificial Sequence: CJRA05-derived peptide <400> 627

Met Lys Leu Leu Asn Val Ile Asn Phe Val 1 5 10

<210> 628 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 628 gacccagtct ccatcctcc

<210> 629 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 629 gactcagtct ccactctcc

<210> 630 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 630 gacgcagtct ccaggcacc


<210> 631 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 631 gacgcagtct ccagccacc

<210> 632 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 632 gtctcctgga cagtcgatc

<210> 633 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 633 ggccttggga cagacagtc

<210> 634 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 634 gtctcctgga cagtcagtc

<210> 635 <211> 19 <212> DNA <213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide <400> 635 ggccccaggg cagagggtc 19