AU1199000A - Shuffling of codon altered genes - Google Patents
Shuffling of codon altered genes
- Publication number
- AU1199000A (also written AU 11990/00 A)
- Authority
- AU
- Australia
- Prior art keywords
- nucleic acid
- codon
- nucleic acids
- codon altered
- library
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
- 108090000623 proteins and genes Proteins 0.000 title claims description 216
- 108020004705 Codon Proteins 0.000 title claims description 170
- 150000007523 nucleic acids Chemical class 0.000 claims description 369
- 102000039446 nucleic acids Human genes 0.000 claims description 313
- 108020004707 nucleic acids Proteins 0.000 claims description 313
- 238000000034 method Methods 0.000 claims description 162
- 102000004169 proteins and genes Human genes 0.000 claims description 97
- 108091034117 Oligonucleotide Proteins 0.000 claims description 81
- 241000700605 Viruses Species 0.000 claims description 76
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 61
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 59
- 229920001184 polypeptide Polymers 0.000 claims description 57
- 239000013598 vector Substances 0.000 claims description 50
- 238000012216 screening Methods 0.000 claims description 43
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 35
- 238000009396 hybridization Methods 0.000 claims description 27
- 108700010070 Codon Usage Proteins 0.000 claims description 24
- 230000004048 modification Effects 0.000 claims description 23
- 238000012986 modification Methods 0.000 claims description 23
- 238000000338 in vitro Methods 0.000 claims description 22
- 230000002238 attenuated effect Effects 0.000 claims description 19
- 238000006467 substitution reaction Methods 0.000 claims description 19
- 230000002209 hydrophobic effect Effects 0.000 claims description 18
- 241000701161 unidentified adenovirus Species 0.000 claims description 18
- 108010017080 Granulocyte Colony-Stimulating Factor Proteins 0.000 claims description 16
- 102000004269 Granulocyte Colony-Stimulating Factor Human genes 0.000 claims description 16
- 239000002773 nucleotide Substances 0.000 claims description 16
- 230000003612 virological effect Effects 0.000 claims description 14
- 239000000203 mixture Substances 0.000 claims description 13
- 125000003729 nucleotide group Chemical group 0.000 claims description 13
- 208000015181 infectious disease Diseases 0.000 claims description 12
- 241000702421 Dependoparvovirus Species 0.000 claims description 11
- 238000004519 manufacturing process Methods 0.000 claims description 9
- 230000010076 replication Effects 0.000 claims description 9
- 241001430294 unidentified retrovirus Species 0.000 claims description 9
- 239000013603 viral vector Substances 0.000 claims description 9
- 230000028993 immune response Effects 0.000 claims description 8
- 239000000463 material Substances 0.000 claims description 6
- 230000001177 retroviral effect Effects 0.000 claims description 5
- 102000004127 Cytokines Human genes 0.000 claims description 4
- 108090000695 Cytokines Proteins 0.000 claims description 4
- 241001529453 unidentified herpesvirus Species 0.000 claims description 4
- 241000124008 Mammalia Species 0.000 claims description 3
- 230000002829 reductive effect Effects 0.000 claims description 3
- 102000004160 Phosphoric Monoester Hydrolases Human genes 0.000 claims description 2
- 108090000608 Phosphoric Monoester Hydrolases Proteins 0.000 claims description 2
- 108010003533 Viral Envelope Proteins Proteins 0.000 claims 2
- 241000713666 Lentivirus Species 0.000 claims 1
- 102000045595 Phosphoprotein Phosphatases Human genes 0.000 claims 1
- 108700019535 Phosphoprotein Phosphatases Proteins 0.000 claims 1
- 230000003362 replicative effect Effects 0.000 claims 1
- 230000006798 recombination Effects 0.000 description 96
- 238000005215 recombination Methods 0.000 description 96
- 210000004027 cell Anatomy 0.000 description 91
- 235000018102 proteins Nutrition 0.000 description 85
- 230000035772 mutation Effects 0.000 description 59
- 108020004414 DNA Proteins 0.000 description 58
- 235000001014 amino acid Nutrition 0.000 description 39
- 229940024606 amino acid Drugs 0.000 description 39
- 150000001413 amino acids Chemical class 0.000 description 39
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 36
- 239000012634 fragment Substances 0.000 description 35
- 239000000758 substrate Substances 0.000 description 32
- 238000003752 polymerase chain reaction Methods 0.000 description 29
- 238000003556 assay Methods 0.000 description 27
- 230000000694 effects Effects 0.000 description 25
- 230000014509 gene expression Effects 0.000 description 23
- 241000725303 Human immunodeficiency virus Species 0.000 description 22
- 239000000523 sample Substances 0.000 description 21
- 241000713340 Human immunodeficiency virus 2 Species 0.000 description 19
- 230000001404 mediated effect Effects 0.000 description 19
- 239000000047 product Substances 0.000 description 19
- 229960005486 vaccine Drugs 0.000 description 19
- 238000004422 calculation algorithm Methods 0.000 description 18
- 230000006870 function Effects 0.000 description 18
- 238000004806 packaging method and process Methods 0.000 description 18
- 102000003951 Erythropoietin Human genes 0.000 description 17
- 108090000394 Erythropoietin Proteins 0.000 description 17
- 230000004075 alteration Effects 0.000 description 17
- 241000196324 Embryophyta Species 0.000 description 16
- 230000003321 amplification Effects 0.000 description 15
- 230000015572 biosynthetic process Effects 0.000 description 15
- 229940105423 erythropoietin Drugs 0.000 description 15
- 238000003199 nucleic acid amplification method Methods 0.000 description 15
- OXCMYAYHXIHQOA-UHFFFAOYSA-N potassium;[2-butyl-5-chloro-3-[[4-[2-(1,2,4-triaza-3-azanidacyclopenta-1,4-dien-5-yl)phenyl]phenyl]methyl]imidazol-4-yl]methanol Chemical compound [K+].CCCCC1=NC(Cl)=C(CO)N1CC1=CC=C(C=2C(=CC=CC=2)C2=N[N-]N=N2)C=C1 OXCMYAYHXIHQOA-UHFFFAOYSA-N 0.000 description 14
- 241000894007 species Species 0.000 description 14
- 238000003786 synthesis reaction Methods 0.000 description 14
- 230000027455 binding Effects 0.000 description 13
- 230000002068 genetic effect Effects 0.000 description 13
- 230000006872 improvement Effects 0.000 description 13
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 12
- 238000001727 in vivo Methods 0.000 description 12
- 238000002703 mutagenesis Methods 0.000 description 12
- 231100000350 mutagenesis Toxicity 0.000 description 12
- 102000040430 polynucleotide Human genes 0.000 description 12
- 108091033319 polynucleotide Proteins 0.000 description 12
- 239000002157 polynucleotide Substances 0.000 description 12
- 230000008569 process Effects 0.000 description 12
- 230000001976 improved effect Effects 0.000 description 11
- 239000013615 primer Substances 0.000 description 11
- 125000003275 alpha amino acid group Chemical group 0.000 description 10
- 238000013459 approach Methods 0.000 description 10
- 230000001580 bacterial effect Effects 0.000 description 10
- 238000006243 chemical reaction Methods 0.000 description 10
- 238000013461 design Methods 0.000 description 10
- -1 phosphoramidite triester Chemical class 0.000 description 10
- 238000012360 testing method Methods 0.000 description 10
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 9
- 208000030507 AIDS Diseases 0.000 description 9
- 230000008901 benefit Effects 0.000 description 9
- 210000000987 immune system Anatomy 0.000 description 9
- 108091008146 restriction endonucleases Proteins 0.000 description 9
- 241000713772 Human immunodeficiency virus 1 Species 0.000 description 8
- IQFYYKKMVGJFEH-XLPZGREQSA-N Thymidine Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 IQFYYKKMVGJFEH-XLPZGREQSA-N 0.000 description 8
- 210000004102 animal cell Anatomy 0.000 description 8
- 230000000295 complement effect Effects 0.000 description 8
- 108020004999 messenger RNA Proteins 0.000 description 8
- 230000000869 mutational effect Effects 0.000 description 8
- 239000013612 plasmid Substances 0.000 description 8
- 102000005962 receptors Human genes 0.000 description 8
- 108020003175 receptors Proteins 0.000 description 8
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 7
- 238000000137 annealing Methods 0.000 description 7
- 239000003153 chemical reaction reagent Substances 0.000 description 7
- 230000012010 growth Effects 0.000 description 7
- 238000000126 in silico method Methods 0.000 description 7
- 239000007788 liquid Substances 0.000 description 7
- 230000001105 regulatory effect Effects 0.000 description 7
- 150000003839 salts Chemical class 0.000 description 7
- 238000001228 spectrum Methods 0.000 description 7
- 102000053602 DNA Human genes 0.000 description 6
- 102000004190 Enzymes Human genes 0.000 description 6
- 108090000790 Enzymes Proteins 0.000 description 6
- 230000001413 cellular effect Effects 0.000 description 6
- 230000002950 deficient Effects 0.000 description 6
- 230000006735 deficit Effects 0.000 description 6
- 229940088598 enzyme Drugs 0.000 description 6
- 238000001415 gene therapy Methods 0.000 description 6
- 238000002741 site-directed mutagenesis Methods 0.000 description 6
- 208000011580 syndromic disease Diseases 0.000 description 6
- 230000001225 therapeutic effect Effects 0.000 description 6
- 238000013518 transcription Methods 0.000 description 6
- 230000035897 transcription Effects 0.000 description 6
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 5
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 5
- 239000004475 Arginine Substances 0.000 description 5
- 102100034349 Integrase Human genes 0.000 description 5
- 108090000581 Leukemia inhibitory factor Proteins 0.000 description 5
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 5
- 230000004913 activation Effects 0.000 description 5
- 238000007792 addition Methods 0.000 description 5
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 5
- 230000001186 cumulative effect Effects 0.000 description 5
- 238000010348 incorporation Methods 0.000 description 5
- 238000011534 incubation Methods 0.000 description 5
- 239000002609 medium Substances 0.000 description 5
- 239000003471 mutagenic agent Substances 0.000 description 5
- 239000007858 starting material Substances 0.000 description 5
- 230000002194 synthesizing effect Effects 0.000 description 5
- 241000894006 Bacteria Species 0.000 description 4
- DWRXFEITVBNRMK-UHFFFAOYSA-N Beta-D-1-Arabinofuranosylthymine Natural products O=C1NC(=O)C(C)=CN1C1C(O)C(O)C(CO)O1 DWRXFEITVBNRMK-UHFFFAOYSA-N 0.000 description 4
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 4
- 208000031886 HIV Infections Diseases 0.000 description 4
- 101000746367 Homo sapiens Granulocyte colony-stimulating factor Proteins 0.000 description 4
- 102100032352 Leukemia inhibitory factor Human genes 0.000 description 4
- 108090000364 Ligases Proteins 0.000 description 4
- 241001465754 Metazoa Species 0.000 description 4
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 4
- 125000000539 amino acid group Chemical group 0.000 description 4
- 238000004458 analytical method Methods 0.000 description 4
- IQFYYKKMVGJFEH-UHFFFAOYSA-N beta-L-thymidine Natural products O=C1NC(=O)C(C)=CN1C1OC(CO)C(O)C1 IQFYYKKMVGJFEH-UHFFFAOYSA-N 0.000 description 4
- 238000004113 cell culture Methods 0.000 description 4
- 238000010367 cloning Methods 0.000 description 4
- 239000002299 complementary DNA Substances 0.000 description 4
- 238000010276 construction Methods 0.000 description 4
- 230000007812 deficiency Effects 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 239000003623 enhancer Substances 0.000 description 4
- 230000002163 immunogen Effects 0.000 description 4
- 238000007745 plasma electrolytic oxidation reaction Methods 0.000 description 4
- 230000001681 protective effect Effects 0.000 description 4
- 238000012552 review Methods 0.000 description 4
- 229940104230 thymidine Drugs 0.000 description 4
- 108091026890 Coding region Proteins 0.000 description 3
- 102000007644 Colony-Stimulating Factors Human genes 0.000 description 3
- 108010071942 Colony-Stimulating Factors Proteins 0.000 description 3
- 101150002621 EPO gene Proteins 0.000 description 3
- 108010092408 Eosinophil Peroxidase Proteins 0.000 description 3
- 102100031939 Erythropoietin Human genes 0.000 description 3
- 241000233866 Fungi Species 0.000 description 3
- 108010051696 Growth Hormone Proteins 0.000 description 3
- 208000037357 HIV infectious disease Diseases 0.000 description 3
- 101000611183 Homo sapiens Tumor necrosis factor Proteins 0.000 description 3
- 108090000723 Insulin-Like Growth Factor I Proteins 0.000 description 3
- 102100021244 Integral membrane protein GPR180 Human genes 0.000 description 3
- 102100020880 Kit ligand Human genes 0.000 description 3
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 3
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophan Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 3
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 3
- 102000003960 Ligases Human genes 0.000 description 3
- 108060001084 Luciferase Proteins 0.000 description 3
- 239000005089 Luciferase Substances 0.000 description 3
- 238000012408 PCR amplification Methods 0.000 description 3
- 239000002202 Polyethylene glycol Substances 0.000 description 3
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 3
- 108020004682 Single-Stranded DNA Proteins 0.000 description 3
- 102100038803 Somatotropin Human genes 0.000 description 3
- 108090000787 Subtilisin Proteins 0.000 description 3
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 3
- 108060008683 Tumor Necrosis Factor Receptor Proteins 0.000 description 3
- 102100040247 Tumor necrosis factor Human genes 0.000 description 3
- 239000002253 acid Substances 0.000 description 3
- 239000012190 activator Substances 0.000 description 3
- 230000033228 biological regulation Effects 0.000 description 3
- 230000007423 decrease Effects 0.000 description 3
- 238000012217 deletion Methods 0.000 description 3
- 230000037430 deletion Effects 0.000 description 3
- 238000001514 detection method Methods 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 230000018109 developmental process Effects 0.000 description 3
- 241001493065 dsRNA viruses Species 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 3
- 230000002538 fungal effect Effects 0.000 description 3
- 208000033519 human immunodeficiency virus infectious disease Diseases 0.000 description 3
- 230000001771 impaired effect Effects 0.000 description 3
- 230000002757 inflammatory effect Effects 0.000 description 3
- 230000010354 integration Effects 0.000 description 3
- 239000003446 ligand Substances 0.000 description 3
- 229930182817 methionine Natural products 0.000 description 3
- 238000010369 molecular cloning Methods 0.000 description 3
- 231100000219 mutagenic Toxicity 0.000 description 3
- 230000003505 mutagenic effect Effects 0.000 description 3
- 238000002515 oligonucleotide synthesis Methods 0.000 description 3
- 239000002245 particle Substances 0.000 description 3
- 229920001223 polyethylene glycol Polymers 0.000 description 3
- 239000013608 rAAV vector Substances 0.000 description 3
- 101150079601 recA gene Proteins 0.000 description 3
- 238000011160 research Methods 0.000 description 3
- 230000004044 response Effects 0.000 description 3
- 230000002441 reversible effect Effects 0.000 description 3
- 238000002864 sequence alignment Methods 0.000 description 3
- 238000012163 sequencing technique Methods 0.000 description 3
- 210000001519 tissue Anatomy 0.000 description 3
- 231100000765 toxin Toxicity 0.000 description 3
- 239000003053 toxin Substances 0.000 description 3
- 108700012359 toxins Proteins 0.000 description 3
- 108091006106 transcriptional activators Proteins 0.000 description 3
- 230000002103 transcriptional effect Effects 0.000 description 3
- 238000013519 translation Methods 0.000 description 3
- 102000003298 tumor necrosis factor receptor Human genes 0.000 description 3
- 238000002255 vaccination Methods 0.000 description 3
- 238000011179 visual inspection Methods 0.000 description 3
- 102100034613 Annexin A2 Human genes 0.000 description 2
- 108090000668 Annexin A2 Proteins 0.000 description 2
- 108020005544 Antisense RNA Proteins 0.000 description 2
- 102100021943 C-C motif chemokine 2 Human genes 0.000 description 2
- 101710155857 C-C motif chemokine 2 Proteins 0.000 description 2
- 108010029697 CD40 Ligand Proteins 0.000 description 2
- 102100032937 CD40 ligand Human genes 0.000 description 2
- 241000282693 Cercopithecidae Species 0.000 description 2
- 102100022615 Cotranscriptional regulator FAM172A Human genes 0.000 description 2
- 102000012410 DNA Ligases Human genes 0.000 description 2
- 108010061982 DNA Ligases Proteins 0.000 description 2
- 102000052510 DNA-Binding Proteins Human genes 0.000 description 2
- 238000002965 ELISA Methods 0.000 description 2
- 101710091045 Envelope protein Proteins 0.000 description 2
- 108010074604 Epoetin Alfa Proteins 0.000 description 2
- 241000588724 Escherichia coli Species 0.000 description 2
- 101150021185 FGF gene Proteins 0.000 description 2
- 108090000385 Fibroblast growth factor 7 Proteins 0.000 description 2
- 241000713813 Gibbon ape leukemia virus Species 0.000 description 2
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 2
- 108010054017 Granulocyte Colony-Stimulating Factor Receptors Proteins 0.000 description 2
- 102100039622 Granulocyte colony-stimulating factor receptor Human genes 0.000 description 2
- 102100034221 Growth-regulated alpha protein Human genes 0.000 description 2
- 101000823488 Homo sapiens Cotranscriptional regulator FAM172A Proteins 0.000 description 2
- 101100333654 Homo sapiens EPO gene Proteins 0.000 description 2
- 101001069921 Homo sapiens Growth-regulated alpha protein Proteins 0.000 description 2
- 108010050904 Interferons Proteins 0.000 description 2
- 102000014150 Interferons Human genes 0.000 description 2
- 108010002352 Interleukin-1 Proteins 0.000 description 2
- 108010063738 Interleukins Proteins 0.000 description 2
- 102000015696 Interleukins Human genes 0.000 description 2
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 2
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 2
- 241000282553 Macaca Species 0.000 description 2
- 241000713869 Moloney murine leukemia virus Species 0.000 description 2
- 241000714177 Murine leukemia virus Species 0.000 description 2
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 2
- 108091005461 Nucleic proteins Proteins 0.000 description 2
- 102000043276 Oncogene Human genes 0.000 description 2
- 108700020796 Oncogene Proteins 0.000 description 2
- RJKFOVLPORLFTN-LEKSSAKUSA-N Progesterone Chemical compound C1CC2=CC(=O)CC[C@]2(C)[C@@H]2[C@@H]1[C@@H]1CC[C@H](C(=O)C)[C@@]1(C)CC2 RJKFOVLPORLFTN-LEKSSAKUSA-N 0.000 description 2
- 101710188315 Protein X Proteins 0.000 description 2
- 241000125945 Protoparvovirus Species 0.000 description 2
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 2
- 108020004511 Recombinant DNA Proteins 0.000 description 2
- 241001068263 Replication competent viruses Species 0.000 description 2
- 108700008625 Reporter Genes Proteins 0.000 description 2
- 102000013275 Somatomedins Human genes 0.000 description 2
- MUMGGOZAMZWBJJ-DYKIIFRCSA-N Testosterone Chemical compound O=C1CC[C@]2(C)[C@H]3CC[C@](C)([C@H](CC4)O)[C@@H]4[C@@H]3CCC2=C1 MUMGGOZAMZWBJJ-DYKIIFRCSA-N 0.000 description 2
- 108700019146 Transgenes Proteins 0.000 description 2
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 2
- 108010067390 Viral Proteins Proteins 0.000 description 2
- 238000009825 accumulation Methods 0.000 description 2
- 235000004279 alanine Nutrition 0.000 description 2
- 230000003190 augmentative effect Effects 0.000 description 2
- 210000003719 b-lymphocyte Anatomy 0.000 description 2
- 210000004369 blood Anatomy 0.000 description 2
- 239000008280 blood Substances 0.000 description 2
- BBBFJLBPOGFECG-VJVYQDLKSA-N calcitonin Chemical compound N([C@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CO)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N1[C@@H](CCC1)C(N)=O)C(C)C)C(=O)[C@@H]1CSSC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1 BBBFJLBPOGFECG-VJVYQDLKSA-N 0.000 description 2
- 210000000234 capsid Anatomy 0.000 description 2
- 238000012219 cassette mutagenesis Methods 0.000 description 2
- 102000006834 complement receptors Human genes 0.000 description 2
- 108010047295 complement receptors Proteins 0.000 description 2
- 239000003184 complementary RNA Substances 0.000 description 2
- 150000001875 compounds Chemical class 0.000 description 2
- 238000000205 computational method Methods 0.000 description 2
- 238000007796 conventional method Methods 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- 238000010790 dilution Methods 0.000 description 2
- 239000012895 dilution Substances 0.000 description 2
- 201000010099 disease Diseases 0.000 description 2
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 2
- 229940079593 drug Drugs 0.000 description 2
- 239000003814 drug Substances 0.000 description 2
- 230000007613 environmental effect Effects 0.000 description 2
- 108091008053 gene clusters Proteins 0.000 description 2
- 239000011521 glass Substances 0.000 description 2
- 239000003102 growth factor Substances 0.000 description 2
- 239000000122 growth hormone Substances 0.000 description 2
- 238000010438 heat treatment Methods 0.000 description 2
- 238000013537 high throughput screening Methods 0.000 description 2
- 230000002458 infectious effect Effects 0.000 description 2
- 230000005764 inhibitory process Effects 0.000 description 2
- 238000003780 insertion Methods 0.000 description 2
- 230000037431 insertion Effects 0.000 description 2
- NOESYZHRGYRDHS-UHFFFAOYSA-N insulin Chemical compound N1C(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(NC(=O)CN)C(C)CC)CSSCC(C(NC(CO)C(=O)NC(CC(C)C)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CCC(N)=O)C(=O)NC(CC(C)C)C(=O)NC(CCC(O)=O)C(=O)NC(CC(N)=O)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CSSCC(NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2C=CC(O)=CC=2)NC(=O)C(CC(C)C)NC(=O)C(C)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2NC=NC=2)NC(=O)C(CO)NC(=O)CNC2=O)C(=O)NCC(=O)NC(CCC(O)=O)C(=O)NC(CCCNC(N)=N)C(=O)NCC(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC(O)=CC=3)C(=O)NC(C(C)O)C(=O)N3C(CCC3)C(=O)NC(CCCCN)C(=O)NC(C)C(O)=O)C(=O)NC(CC(N)=O)C(O)=O)=O)NC(=O)C(C(C)CC)NC(=O)C(CO)NC(=O)C(C(C)O)NC(=O)C1CSSCC2NC(=O)C(CC(C)C)NC(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CC(N)=O)NC(=O)C(NC(=O)C(N)CC=1C=CC=CC=1)C(C)C)CC1=CN=CN1 NOESYZHRGYRDHS-UHFFFAOYSA-N 0.000 description 2
- 229940047124 interferons Drugs 0.000 description 2
- 229940047122 interleukins Drugs 0.000 description 2
- 238000011005 laboratory method Methods 0.000 description 2
- 238000007834 ligase chain reaction Methods 0.000 description 2
- 230000000670 limiting effect Effects 0.000 description 2
- 230000002101 lytic effect Effects 0.000 description 2
- 210000004962 mammalian cell Anatomy 0.000 description 2
- 239000011159 matrix material Substances 0.000 description 2
- 230000007246 mechanism Effects 0.000 description 2
- 210000001616 monocyte Anatomy 0.000 description 2
- 101150049514 mutL gene Proteins 0.000 description 2
- 210000000440 neutrophil Anatomy 0.000 description 2
- 239000002853 nucleic acid probe Substances 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 239000005022 packaging material Substances 0.000 description 2
- 238000004091 panning Methods 0.000 description 2
- 230000036961 partial effect Effects 0.000 description 2
- 230000001717 pathogenic effect Effects 0.000 description 2
- 229920000642 polymer Polymers 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 101150056906 recJ gene Proteins 0.000 description 2
- 238000001525 receptor binding assay Methods 0.000 description 2
- 238000003259 recombinant expression Methods 0.000 description 2
- 238000004153 renaturation Methods 0.000 description 2
- 201000005404 rubella Diseases 0.000 description 2
- 101150072534 sbcB gene Proteins 0.000 description 2
- 238000010187 selection method Methods 0.000 description 2
- 230000035945 sensitivity Effects 0.000 description 2
- 230000019491 signal transduction Effects 0.000 description 2
- 239000011780 sodium chloride Substances 0.000 description 2
- 229910001415 sodium ion Inorganic materials 0.000 description 2
- 239000000243 solution Substances 0.000 description 2
- 230000004936 stimulating effect Effects 0.000 description 2
- 239000000126 substance Substances 0.000 description 2
- 239000013589 supplement Substances 0.000 description 2
- 238000001890 transfection Methods 0.000 description 2
- 230000009261 transgenic effect Effects 0.000 description 2
- 101150115617 umuC gene Proteins 0.000 description 2
- 101150046028 umuD gene Proteins 0.000 description 2
- 210000002845 virion Anatomy 0.000 description 2
- 101150100239 vsr gene Proteins 0.000 description 2
- 210000005253 yeast cell Anatomy 0.000 description 2
- BNIFSVVAHBLNTN-XKKUQSFHSA-N (2s)-4-amino-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-1-[(2s)-4-amino-2-[[2-[[(2s)-2-[[(2s)-2-[[(2s)-1-[(2s)-6-amino-2-[[(2s)-2-[[(2s)-2-[[(2s,3r)-2-amino-3-hydroxybutanoyl]amino]-4-methylsulfanylbutanoyl]amino]-5-(diaminomethylideneamino)pentanoyl]amino]hexan Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(=O)N1[C@H](C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O)CCC1 BNIFSVVAHBLNTN-XKKUQSFHSA-N 0.000 description 1
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 1
- VGONTNSXDCQUGY-RRKCRQDMSA-N 2'-deoxyinosine Chemical group C1[C@H](O)[C@@H](CO)O[C@H]1N1C(N=CNC2=O)=C2N=C1 VGONTNSXDCQUGY-RRKCRQDMSA-N 0.000 description 1
- 239000013607 AAV vector Substances 0.000 description 1
- 241000186046 Actinomyces Species 0.000 description 1
- PQSUYGKTWSAVDQ-UHFFFAOYSA-N Aldosterone Natural products C1CC2C3CCC(C(=O)CO)C3(C=O)CC(O)C2C2(C)C1=CC(=O)CC2 PQSUYGKTWSAVDQ-UHFFFAOYSA-N 0.000 description 1
- PQSUYGKTWSAVDQ-ZVIOFETBSA-N Aldosterone Chemical compound C([C@@]1([C@@H](C(=O)CO)CC[C@H]1[C@@H]1CC2)C=O)[C@H](O)[C@@H]1[C@]1(C)C2=CC(=O)CC1 PQSUYGKTWSAVDQ-ZVIOFETBSA-N 0.000 description 1
- 108700028369 Alleles Proteins 0.000 description 1
- 108091093088 Amplicon Proteins 0.000 description 1
- 102400000068 Angiostatin Human genes 0.000 description 1
- 108010079709 Angiostatins Proteins 0.000 description 1
- 101710081722 Antitrypsin Proteins 0.000 description 1
- 102000007592 Apolipoproteins Human genes 0.000 description 1
- 108010071619 Apolipoproteins Proteins 0.000 description 1
- 108010083590 Apoproteins Proteins 0.000 description 1
- 102000006410 Apoproteins Human genes 0.000 description 1
- 101000716807 Arabidopsis thaliana Protein SCO1 homolog 1, mitochondrial Proteins 0.000 description 1
- 241000712891 Arenavirus Species 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 241000228212 Aspergillus Species 0.000 description 1
- 101800001288 Atrial natriuretic factor Proteins 0.000 description 1
- 102400001282 Atrial natriuretic peptide Human genes 0.000 description 1
- 101800001890 Atrial natriuretic peptide Proteins 0.000 description 1
- 241000972773 Aulopiformes Species 0.000 description 1
- 241000193830 Bacillus <bacterium> Species 0.000 description 1
- 108010016529 Bacillus amyloliquefaciens ribonuclease Proteins 0.000 description 1
- 235000014469 Bacillus subtilis Nutrition 0.000 description 1
- 241000606660 Bartonella Species 0.000 description 1
- 241000588807 Bordetella Species 0.000 description 1
- 241000589968 Borrelia Species 0.000 description 1
- 241000589562 Brucella Species 0.000 description 1
- 102100032367 C-C motif chemokine 5 Human genes 0.000 description 1
- 102100032366 C-C motif chemokine 7 Human genes 0.000 description 1
- 102100025248 C-X-C motif chemokine 10 Human genes 0.000 description 1
- 101710098275 C-X-C motif chemokine 10 Proteins 0.000 description 1
- 102100036150 C-X-C motif chemokine 5 Human genes 0.000 description 1
- 102100036153 C-X-C motif chemokine 6 Human genes 0.000 description 1
- 101710085504 C-X-C motif chemokine 6 Proteins 0.000 description 1
- 102100036170 C-X-C motif chemokine 9 Human genes 0.000 description 1
- 101710085500 C-X-C motif chemokine 9 Proteins 0.000 description 1
- 102000001902 CC Chemokines Human genes 0.000 description 1
- 108010040471 CC Chemokines Proteins 0.000 description 1
- 101150013553 CD40 gene Proteins 0.000 description 1
- 102100032912 CD44 antigen Human genes 0.000 description 1
- 108050006947 CXC Chemokine Proteins 0.000 description 1
- 102000019388 CXC chemokine Human genes 0.000 description 1
- 102000055006 Calcitonin Human genes 0.000 description 1
- 108060001064 Calcitonin Proteins 0.000 description 1
- 241000178270 Canarypox virus Species 0.000 description 1
- 241000222120 Candida <Saccharomycetales> Species 0.000 description 1
- 101710132601 Capsid protein Proteins 0.000 description 1
- 108050001186 Chaperonin Cpn60 Proteins 0.000 description 1
- 102000052603 Chaperonins Human genes 0.000 description 1
- 108010055166 Chemokine CCL5 Proteins 0.000 description 1
- 108010055124 Chemokine CCL7 Proteins 0.000 description 1
- 102000000012 Chemokine CCL8 Human genes 0.000 description 1
- 108010055204 Chemokine CCL8 Proteins 0.000 description 1
- 241000606161 Chlamydia Species 0.000 description 1
- 241000581364 Clinitrachus argentatus Species 0.000 description 1
- 241001112696 Clostridia Species 0.000 description 1
- 102100022641 Coagulation factor IX Human genes 0.000 description 1
- 102100023804 Coagulation factor VII Human genes 0.000 description 1
- 208000003322 Coinfection Diseases 0.000 description 1
- 102000008186 Collagen Human genes 0.000 description 1
- 108010035532 Collagen Proteins 0.000 description 1
- 229940124073 Complement inhibitor Drugs 0.000 description 1
- 108091035707 Consensus sequence Proteins 0.000 description 1
- 241000711573 Coronaviridae Species 0.000 description 1
- OMFXVFTZEKFJBZ-UHFFFAOYSA-N Corticosterone Natural products O=C1CCC2(C)C3C(O)CC(C)(C(CC4)C(=O)CO)C4C3CCC2=C1 OMFXVFTZEKFJBZ-UHFFFAOYSA-N 0.000 description 1
- 241000557626 Corvus corax Species 0.000 description 1
- 241000700626 Cowpox virus Species 0.000 description 1
- 241001445332 Coxiella <snail> Species 0.000 description 1
- 102000036364 Cullin Ring E3 Ligases Human genes 0.000 description 1
- 108091007045 Cullin Ring E3 Ligases Proteins 0.000 description 1
- 239000003155 DNA primer Substances 0.000 description 1
- 241000450599 DNA viruses Species 0.000 description 1
- 108700020911 DNA-Binding Proteins Proteins 0.000 description 1
- 101710116602 DNA-Binding protein G5P Proteins 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 1
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 1
- 102000016911 Deoxyribonucleases Human genes 0.000 description 1
- 108010053770 Deoxyribonucleases Proteins 0.000 description 1
- 238000012286 ELISA Assay Methods 0.000 description 1
- 101710117538 Endogenous retrovirus group FC1 Env polyprotein Proteins 0.000 description 1
- 101710167714 Endogenous retrovirus group K member 18 Env polyprotein Proteins 0.000 description 1
- 101710152279 Endogenous retrovirus group K member 21 Env polyprotein Proteins 0.000 description 1
- 101710197529 Endogenous retrovirus group K member 25 Env polyprotein Proteins 0.000 description 1
- 101710141424 Endogenous retrovirus group K member 6 Env polyprotein Proteins 0.000 description 1
- 101710159911 Endogenous retrovirus group K member 8 Env polyprotein Proteins 0.000 description 1
- 101710205628 Endogenous retrovirus group K member 9 Env polyprotein Proteins 0.000 description 1
- 102100031780 Endonuclease Human genes 0.000 description 1
- 108010042407 Endonucleases Proteins 0.000 description 1
- 241000224431 Entamoeba Species 0.000 description 1
- 101000925646 Enterobacteria phage T4 Endolysin Proteins 0.000 description 1
- 241000588921 Enterobacteriaceae Species 0.000 description 1
- 101710104662 Enterotoxin type C-3 Proteins 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- 102100030844 Exocyst complex component 1 Human genes 0.000 description 1
- 108010076282 Factor IX Proteins 0.000 description 1
- 108010023321 Factor VII Proteins 0.000 description 1
- 108010054218 Factor VIII Proteins 0.000 description 1
- 102000001690 Factor VIII Human genes 0.000 description 1
- 108010014173 Factor X Proteins 0.000 description 1
- 108010049003 Fibrinogen Proteins 0.000 description 1
- 102000008946 Fibrinogen Human genes 0.000 description 1
- 108050007372 Fibroblast Growth Factor Proteins 0.000 description 1
- 102000018233 Fibroblast Growth Factor Human genes 0.000 description 1
- 102000003972 Fibroblast growth factor 7 Human genes 0.000 description 1
- 108090000368 Fibroblast growth factor 8 Proteins 0.000 description 1
- 102100037362 Fibronectin Human genes 0.000 description 1
- 108010067306 Fibronectins Proteins 0.000 description 1
- 241000710831 Flavivirus Species 0.000 description 1
- 102100040837 Galactoside alpha-(1,2)-fucosyltransferase 2 Human genes 0.000 description 1
- 101710115997 Gamma-tubulin complex component 2 Proteins 0.000 description 1
- 241000224466 Giardia Species 0.000 description 1
- 108010017544 Glucosylceramidase Proteins 0.000 description 1
- 102000004547 Glucosylceramidase Human genes 0.000 description 1
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 102000006771 Gonadotropins Human genes 0.000 description 1
- 108010086677 Gonadotropins Proteins 0.000 description 1
- 206010018612 Gonorrhoea Diseases 0.000 description 1
- 241000606790 Haemophilus Species 0.000 description 1
- 108090000031 Hedgehog Proteins Proteins 0.000 description 1
- 102000003693 Hedgehog Proteins Human genes 0.000 description 1
- 241000589989 Helicobacter Species 0.000 description 1
- 102000001554 Hemoglobins Human genes 0.000 description 1
- 108010054147 Hemoglobins Proteins 0.000 description 1
- HTTJABKRGRZYRN-UHFFFAOYSA-N Heparin Chemical compound OC1C(NC(=O)C)C(O)OC(COS(O)(=O)=O)C1OC1C(OS(O)(=O)=O)C(O)C(OC2C(C(OS(O)(=O)=O)C(OC3C(C(O)C(O)C(O3)C(O)=O)OS(O)(=O)=O)C(CO)O2)NS(O)(=O)=O)C(C(O)=O)O1 HTTJABKRGRZYRN-UHFFFAOYSA-N 0.000 description 1
- 241000700721 Hepatitis B virus Species 0.000 description 1
- 102000003745 Hepatocyte Growth Factor Human genes 0.000 description 1
- 108090000100 Hepatocyte Growth Factor Proteins 0.000 description 1
- 241000238631 Hexapoda Species 0.000 description 1
- 102000007625 Hirudins Human genes 0.000 description 1
- 108010007267 Hirudins Proteins 0.000 description 1
- 241000282412 Homo Species 0.000 description 1
- 101000947186 Homo sapiens C-X-C motif chemokine 5 Proteins 0.000 description 1
- 101000868273 Homo sapiens CD44 antigen Proteins 0.000 description 1
- 101000987586 Homo sapiens Eosinophil peroxidase Proteins 0.000 description 1
- 101000893710 Homo sapiens Galactoside alpha-(1,2)-fucosyltransferase 2 Proteins 0.000 description 1
- 101000746364 Homo sapiens Granulocyte colony-stimulating factor receptor Proteins 0.000 description 1
- 101000973997 Homo sapiens Nucleosome assembly protein 1-like 4 Proteins 0.000 description 1
- 101000947178 Homo sapiens Platelet basic protein Proteins 0.000 description 1
- 101000582950 Homo sapiens Platelet factor 4 Proteins 0.000 description 1
- 101001076715 Homo sapiens RNA-binding protein 39 Proteins 0.000 description 1
- 101000617830 Homo sapiens Sterol O-acyltransferase 1 Proteins 0.000 description 1
- 101000652229 Homo sapiens Suppressor of cytokine signaling 7 Proteins 0.000 description 1
- 108010048209 Human Immunodeficiency Virus Proteins Proteins 0.000 description 1
- 102000008100 Human Serum Albumin Human genes 0.000 description 1
- 108091006905 Human Serum Albumin Proteins 0.000 description 1
- 241000598436 Human T-cell lymphotropic virus Species 0.000 description 1
- 241000701044 Human gammaherpesvirus 4 Species 0.000 description 1
- XQFRJNBWHJMXHO-RRKCRQDMSA-N IDUR Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C(I)=C1 XQFRJNBWHJMXHO-RRKCRQDMSA-N 0.000 description 1
- 108090001061 Insulin Proteins 0.000 description 1
- 102000004877 Insulin Human genes 0.000 description 1
- 102000004218 Insulin-Like Growth Factor I Human genes 0.000 description 1
- 108090001117 Insulin-Like Growth Factor II Proteins 0.000 description 1
- 102000048143 Insulin-Like Growth Factor II Human genes 0.000 description 1
- 101710203526 Integrase Proteins 0.000 description 1
- 102100022339 Integrin alpha-L Human genes 0.000 description 1
- 108010008212 Integrin alpha4beta1 Proteins 0.000 description 1
- 108010064593 Intercellular Adhesion Molecule-1 Proteins 0.000 description 1
- 102100037877 Intercellular adhesion molecule 1 Human genes 0.000 description 1
- 108010002350 Interleukin-2 Proteins 0.000 description 1
- 108090001007 Interleukin-8 Proteins 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- 101710177504 Kit ligand Proteins 0.000 description 1
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 1
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 1
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 1
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- 108010001831 LDL receptors Proteins 0.000 description 1
- 108010063045 Lactoferrin Proteins 0.000 description 1
- 102100032241 Lactotransferrin Human genes 0.000 description 1
- 241000589248 Legionella Species 0.000 description 1
- 208000007764 Legionnaires' Disease Diseases 0.000 description 1
- 241000222722 Leishmania <genus> Species 0.000 description 1
- 241000589902 Leptospira Species 0.000 description 1
- 102000004058 Leukemia inhibitory factor Human genes 0.000 description 1
- 102000004882 Lipase Human genes 0.000 description 1
- 108090001060 Lipase Proteins 0.000 description 1
- 239000004367 Lipase Substances 0.000 description 1
- 102100024640 Low-density lipoprotein receptor Human genes 0.000 description 1
- 208000016604 Lyme disease Diseases 0.000 description 1
- 108010064548 Lymphocyte Function-Associated Antigen-1 Proteins 0.000 description 1
- 102000004083 Lymphotoxin-alpha Human genes 0.000 description 1
- 108090000542 Lymphotoxin-alpha Proteins 0.000 description 1
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- 241000282560 Macaca mulatta Species 0.000 description 1
- 241000282561 Macaca nemestrina Species 0.000 description 1
- 201000005505 Measles Diseases 0.000 description 1
- 102000008109 Mixed Function Oxygenases Human genes 0.000 description 1
- 108010074633 Mixed Function Oxygenases Proteins 0.000 description 1
- 208000005647 Mumps Diseases 0.000 description 1
- 241001529936 Murinae Species 0.000 description 1
- 241000204031 Mycoplasma Species 0.000 description 1
- 241000588653 Neisseria Species 0.000 description 1
- 102000001839 Neurturin Human genes 0.000 description 1
- 108010015406 Neurturin Proteins 0.000 description 1
- 241000187654 Nocardia Species 0.000 description 1
- 238000000636 Northern blotting Methods 0.000 description 1
- 101710163270 Nuclease Proteins 0.000 description 1
- 102000004140 Oncostatin M Human genes 0.000 description 1
- 108090000630 Oncostatin M Proteins 0.000 description 1
- 241000283283 Orcinus orca Species 0.000 description 1
- 241000713112 Orthobunyavirus Species 0.000 description 1
- 241000702244 Orthoreovirus Species 0.000 description 1
- 101000989950 Otolemur crassicaudatus Hemoglobin subunit alpha-A Proteins 0.000 description 1
- 241000282579 Pan Species 0.000 description 1
- 241000282520 Papio Species 0.000 description 1
- 108090000445 Parathyroid hormone Proteins 0.000 description 1
- 102000003982 Parathyroid hormone Human genes 0.000 description 1
- 241000606860 Pasteurella Species 0.000 description 1
- 108091005804 Peptidases Proteins 0.000 description 1
- 102000035195 Peptidases Human genes 0.000 description 1
- 201000005702 Pertussis Diseases 0.000 description 1
- 241000709664 Picornaviridae Species 0.000 description 1
- 102100036154 Platelet basic protein Human genes 0.000 description 1
- 102100030304 Platelet factor 4 Human genes 0.000 description 1
- 208000000474 Poliomyelitis Diseases 0.000 description 1
- 241000288906 Primates Species 0.000 description 1
- 239000004365 Protease Substances 0.000 description 1
- 108010076504 Protein Sorting Signals Proteins 0.000 description 1
- 108010014608 Proto-Oncogene Proteins c-kit Proteins 0.000 description 1
- 102000016971 Proto-Oncogene Proteins c-kit Human genes 0.000 description 1
- 101710130181 Protochlorophyllide reductase A, chloroplastic Proteins 0.000 description 1
- 241000589516 Pseudomonas Species 0.000 description 1
- 108020005067 RNA Splice Sites Proteins 0.000 description 1
- 102000003743 Relaxin Human genes 0.000 description 1
- 108090000103 Relaxin Proteins 0.000 description 1
- 108091081062 Repeated sequence (DNA) Proteins 0.000 description 1
- 101710162453 Replication factor A Proteins 0.000 description 1
- 101710176758 Replication protein A 70 kDa DNA-binding subunit Proteins 0.000 description 1
- 108010083644 Ribonucleases Proteins 0.000 description 1
- 102000006382 Ribonucleases Human genes 0.000 description 1
- 108091028664 Ribonucleotide Proteins 0.000 description 1
- 241000606701 Rickettsia Species 0.000 description 1
- 102100023361 SAP domain-containing ribonucleoprotein Human genes 0.000 description 1
- 101710176276 SSB protein Proteins 0.000 description 1
- 101100355586 Schizosaccharomyces pombe (strain 972 / ATCC 24843) rhp51 gene Proteins 0.000 description 1
- 206010040070 Septic Shock Diseases 0.000 description 1
- 241000713311 Simian immunodeficiency virus Species 0.000 description 1
- 101710126859 Single-stranded DNA-binding protein Proteins 0.000 description 1
- 108010056088 Somatostatin Proteins 0.000 description 1
- 102000005157 Somatostatin Human genes 0.000 description 1
- 238000002105 Southern blotting Methods 0.000 description 1
- 241000589970 Spirochaetales Species 0.000 description 1
- 241000295644 Staphylococcaceae Species 0.000 description 1
- 101000882403 Staphylococcus aureus Enterotoxin type C-2 Proteins 0.000 description 1
- 101001057112 Staphylococcus aureus Enterotoxin type D Proteins 0.000 description 1
- 108091081024 Start codon Proteins 0.000 description 1
- 108010039445 Stem Cell Factor Proteins 0.000 description 1
- 102100021993 Sterol O-acyltransferase 1 Human genes 0.000 description 1
- 108010023197 Streptokinase Proteins 0.000 description 1
- 101000697584 Streptomyces lavendulae Streptothricin acetyltransferase Proteins 0.000 description 1
- 102100021669 Stromal cell-derived factor 1 Human genes 0.000 description 1
- 101710088580 Stromal cell-derived factor 1 Proteins 0.000 description 1
- 208000003028 Stuttering Diseases 0.000 description 1
- NINIDFKCEFEMDL-UHFFFAOYSA-N Sulfur Chemical compound [S] NINIDFKCEFEMDL-UHFFFAOYSA-N 0.000 description 1
- 102000019197 Superoxide Dismutase Human genes 0.000 description 1
- 108010012715 Superoxide dismutase Proteins 0.000 description 1
- 102100030529 Suppressor of cytokine signaling 7 Human genes 0.000 description 1
- 101710091286 Syncytin-1 Proteins 0.000 description 1
- 101710091284 Syncytin-2 Proteins 0.000 description 1
- 108700005078 Synthetic Genes Proteins 0.000 description 1
- 230000005867 T cell response Effects 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- 108010006785 Taq Polymerase Proteins 0.000 description 1
- 108010078233 Thymalfasin Proteins 0.000 description 1
- 102400000800 Thymosin alpha-1 Human genes 0.000 description 1
- 108090000373 Tissue Plasminogen Activator Proteins 0.000 description 1
- 102000003978 Tissue Plasminogen Activator Human genes 0.000 description 1
- 206010044248 Toxic shock syndrome Diseases 0.000 description 1
- 231100000650 Toxic shock syndrome Toxicity 0.000 description 1
- 101710120037 Toxin CcdB Proteins 0.000 description 1
- 108091023040 Transcription factor Proteins 0.000 description 1
- 102000040945 Transcription factor Human genes 0.000 description 1
- 101710184535 Transmembrane protein Proteins 0.000 description 1
- 101710141239 Transmembrane protein domain Proteins 0.000 description 1
- 241000589886 Treponema Species 0.000 description 1
- 241000224526 Trichomonas Species 0.000 description 1
- 101710090322 Truncated surface protein Proteins 0.000 description 1
- 101710110267 Truncated transmembrane protein Proteins 0.000 description 1
- 108060008682 Tumor Necrosis Factor Proteins 0.000 description 1
- 102000000852 Tumor Necrosis Factor-alpha Human genes 0.000 description 1
- 102100040245 Tumor necrosis factor receptor superfamily member 5 Human genes 0.000 description 1
- 108010018161 UlTma DNA polymerase Proteins 0.000 description 1
- 241000202898 Ureaplasma Species 0.000 description 1
- 108090000435 Urokinase-type plasminogen activator Proteins 0.000 description 1
- 102000003990 Urokinase-type plasminogen activator Human genes 0.000 description 1
- 241000700618 Vaccinia virus Species 0.000 description 1
- 206010046865 Vaccinia virus infection Diseases 0.000 description 1
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 1
- 108010000134 Vascular Cell Adhesion Molecule-1 Proteins 0.000 description 1
- 102100023543 Vascular cell adhesion protein 1 Human genes 0.000 description 1
- 241000251539 Vertebrata <Metazoa> Species 0.000 description 1
- 241000607598 Vibrio Species 0.000 description 1
- 241000607734 Yersinia <bacteria> Species 0.000 description 1
- 241000607479 Yersinia pestis Species 0.000 description 1
- 230000002378 acidificating effect Effects 0.000 description 1
- 150000007513 acids Chemical class 0.000 description 1
- 230000003213 activating effect Effects 0.000 description 1
- 238000005273 aeration Methods 0.000 description 1
- 229960002478 aldosterone Drugs 0.000 description 1
- 125000001931 aliphatic group Chemical group 0.000 description 1
- 208000007502 anemia Diseases 0.000 description 1
- 238000010171 animal model Methods 0.000 description 1
- 230000002587 anti-hemolytic effect Effects 0.000 description 1
- 230000036436 anti-hiv Effects 0.000 description 1
- 230000001475 anti-trypsic effect Effects 0.000 description 1
- 108010082685 antiarrhythmic peptide Proteins 0.000 description 1
- 230000000890 antigenic effect Effects 0.000 description 1
- 206010003246 arthritis Diseases 0.000 description 1
- 125000003118 aryl group Chemical group 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- FZCSTZYAHCUGEM-UHFFFAOYSA-N aspergillomarasmine B Natural products OC(=O)CNC(C(O)=O)CNC(C(O)=O)CC(O)=O FZCSTZYAHCUGEM-UHFFFAOYSA-N 0.000 description 1
- 101150036080 at gene Proteins 0.000 description 1
- 230000001746 atrial effect Effects 0.000 description 1
- 244000052616 bacterial pathogen Species 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 238000004166 bioassay Methods 0.000 description 1
- 230000004071 biological effect Effects 0.000 description 1
- 230000008827 biological function Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 210000000601 blood cell Anatomy 0.000 description 1
- 239000003633 blood substitute Substances 0.000 description 1
- 210000002798 bone marrow cell Anatomy 0.000 description 1
- 238000009395 breeding Methods 0.000 description 1
- 230000001488 breeding effect Effects 0.000 description 1
- 239000000872 buffer Substances 0.000 description 1
- 239000000337 buffer salt Substances 0.000 description 1
- 229960004015 calcitonin Drugs 0.000 description 1
- 229960003773 calcitonin (salmon synthetic) Drugs 0.000 description 1
- NSQLIUXCMFBZME-MPVJKSABSA-N carperitide Chemical compound C([C@H]1C(=O)NCC(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@H](C(NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CSSC[C@@H](C(=O)N1)NC(=O)[C@H](CO)NC(=O)[C@H](CO)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)=O)[C@@H](C)CC)C1=CC=CC=C1 NSQLIUXCMFBZME-MPVJKSABSA-N 0.000 description 1
- 238000000423 cell based assay Methods 0.000 description 1
- 238000010370 cell cloning Methods 0.000 description 1
- 239000006143 cell culture medium Substances 0.000 description 1
- 230000024245 cell differentiation Effects 0.000 description 1
- 230000010261 cell growth Effects 0.000 description 1
- 238000001516 cell proliferation assay Methods 0.000 description 1
- 230000009134 cell regulation Effects 0.000 description 1
- 210000003850 cellular structure Anatomy 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000003508 chemical denaturation Methods 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 239000013611 chromosomal DNA Substances 0.000 description 1
- 230000002759 chromosomal effect Effects 0.000 description 1
- 210000000349 chromosome Anatomy 0.000 description 1
- 238000003776 cleavage reaction Methods 0.000 description 1
- 229920001436 collagen Polymers 0.000 description 1
- 238000007398 colorimetric assay Methods 0.000 description 1
- 239000004074 complement inhibitor Substances 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 238000005094 computer simulation Methods 0.000 description 1
- 239000012141 concentrate Substances 0.000 description 1
- 238000011109 contamination Methods 0.000 description 1
- 238000001816 cooling Methods 0.000 description 1
- OMFXVFTZEKFJBZ-HJTSIMOOSA-N corticosterone Chemical compound O=C1CC[C@]2(C)[C@H]3[C@@H](O)C[C@](C)([C@H](CC4)C(=O)CO)[C@@H]4[C@@H]3CCC2=C1 OMFXVFTZEKFJBZ-HJTSIMOOSA-N 0.000 description 1
- 235000018417 cysteine Nutrition 0.000 description 1
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 1
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 description 1
- SUYVUBYJARFZHO-UHFFFAOYSA-N dATP Natural products C1=NC=2C(N)=NC=NC=2N1C1CC(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-UHFFFAOYSA-N 0.000 description 1
- RGWHQCVHVJXOKC-SHYZEUOFSA-J dCTP(4-) Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)C1 RGWHQCVHVJXOKC-SHYZEUOFSA-J 0.000 description 1
- HAAZLUGHYHWQIW-KVQBGUIXSA-N dGTP Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 HAAZLUGHYHWQIW-KVQBGUIXSA-N 0.000 description 1
- NHVNXKFIZYSCEB-XLPZGREQSA-N dTTP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 NHVNXKFIZYSCEB-XLPZGREQSA-N 0.000 description 1
- 238000013500 data storage Methods 0.000 description 1
- 230000007123 defense Effects 0.000 description 1
- 230000000593 degrading effect Effects 0.000 description 1
- 238000004925 denaturation Methods 0.000 description 1
- 230000036425 denaturation Effects 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 230000000368 destabilizing effect Effects 0.000 description 1
- 230000001627 detrimental effect Effects 0.000 description 1
- 238000002405 diagnostic procedure Methods 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 239000002934 diuretic Substances 0.000 description 1
- 241001492478 dsDNA viruses, no RNA stage Species 0.000 description 1
- 230000008030 elimination Effects 0.000 description 1
- 238000003379 elimination reaction Methods 0.000 description 1
- 231100000655 enterotoxin Toxicity 0.000 description 1
- 230000006862 enzymatic digestion Effects 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 238000001976 enzyme digestion Methods 0.000 description 1
- 229940089118 epogen Drugs 0.000 description 1
- 229940011871 estrogen Drugs 0.000 description 1
- 239000000262 estrogen Substances 0.000 description 1
- 231100000776 exotoxin Toxicity 0.000 description 1
- 239000002095 exotoxin Substances 0.000 description 1
- 239000013604 expression vector Substances 0.000 description 1
- 229960004222 factor ix Drugs 0.000 description 1
- 229940012413 factor vii Drugs 0.000 description 1
- 229960000301 factor viii Drugs 0.000 description 1
- 229940012426 factor x Drugs 0.000 description 1
- 229940012952 fibrinogen Drugs 0.000 description 1
- 229940126864 fibroblast growth factor Drugs 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 239000002622 gonadotropin Substances 0.000 description 1
- 239000001963 growth medium Substances 0.000 description 1
- 230000003394 haemopoietic effect Effects 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 229960002897 heparin Drugs 0.000 description 1
- 229920000669 heparin Polymers 0.000 description 1
- 238000012203 high throughput assay Methods 0.000 description 1
- 229940006607 hirudin Drugs 0.000 description 1
- WQPDUTSPKFMPDP-OUMQNGNKSA-N hirudin Chemical compound C([C@@H](C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC=1C=CC(OS(O)(=O)=O)=CC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC=1NC=NC=1)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H]1N(CCC1)C(=O)[C@H](CCCCN)NC(=O)[C@H]1N(CCC1)C(=O)[C@@H](NC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](NC(=O)[C@H]1NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(O)=O)NC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]2CSSC[C@@H](C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(=O)N[C@H](C(NCC(=O)N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N2)=O)CSSC1)C(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H]1NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)CNC(=O)[C@H](CO)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CC=2C=CC(O)=CC=2)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)C(C)C)[C@@H](C)O)CSSC1)C(C)C)[C@@H](C)O)[C@@H](C)O)C1=CC=CC=C1 WQPDUTSPKFMPDP-OUMQNGNKSA-N 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 229940088597 hormone Drugs 0.000 description 1
- 239000005556 hormone Substances 0.000 description 1
- 102000044890 human EPO Human genes 0.000 description 1
- 210000005260 human cell Anatomy 0.000 description 1
- 244000052637 human pathogen Species 0.000 description 1
- 125000001165 hydrophobic group Chemical group 0.000 description 1
- 238000011065 in-situ storage Methods 0.000 description 1
- 239000003262 industrial enzyme Substances 0.000 description 1
- 230000001524 infective effect Effects 0.000 description 1
- 206010022000 influenza Diseases 0.000 description 1
- 239000003112 inhibitor Substances 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 239000002054 inoculum Substances 0.000 description 1
- 238000007689 inspection Methods 0.000 description 1
- 229940125396 insulin Drugs 0.000 description 1
- 102000002467 interleukin receptors Human genes 0.000 description 1
- 108010093036 interleukin receptors Proteins 0.000 description 1
- 229960005431 ipriflavone Drugs 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 229960000310 isoleucine Drugs 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- 238000012804 iterative process Methods 0.000 description 1
- 210000003292 kidney cell Anatomy 0.000 description 1
- CSSYQJWUGATIHM-IKGCZBKSSA-N l-phenylalanyl-l-lysyl-l-cysteinyl-l-arginyl-l-arginyl-l-tryptophyl-l-glutaminyl-l-tryptophyl-l-arginyl-l-methionyl-l-lysyl-l-lysyl-l-leucylglycyl-l-alanyl-l-prolyl-l-seryl-l-isoleucyl-l-threonyl-l-cysteinyl-l-valyl-l-arginyl-l-arginyl-l-alanyl-l-phenylal Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 CSSYQJWUGATIHM-IKGCZBKSSA-N 0.000 description 1
- 229940078795 lactoferrin Drugs 0.000 description 1
- 235000021242 lactoferrin Nutrition 0.000 description 1
- 235000019421 lipase Nutrition 0.000 description 1
- 229940124590 live attenuated vaccine Drugs 0.000 description 1
- 229940023012 live-attenuated vaccine Drugs 0.000 description 1
- 230000004807 localization Effects 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 238000002844 melting Methods 0.000 description 1
- 230000008018 melting Effects 0.000 description 1
- 230000002906 microbiologic effect Effects 0.000 description 1
- 230000003278 mimic effect Effects 0.000 description 1
- 230000033607 mismatch repair Effects 0.000 description 1
- 239000003226 mitogen Substances 0.000 description 1
- 208000010805 mumps infectious disease Diseases 0.000 description 1
- 230000001452 natriuretic effect Effects 0.000 description 1
- 238000006386 neutralization reaction Methods 0.000 description 1
- 238000007899 nucleic acid hybridization Methods 0.000 description 1
- 238000001668 nucleic acid synthesis Methods 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 230000002188 osteogenic effect Effects 0.000 description 1
- 101150080184 p17 gene Proteins 0.000 description 1
- 239000000199 parathyroid hormone Substances 0.000 description 1
- 229960001319 parathyroid hormone Drugs 0.000 description 1
- 230000008506 pathogenesis Effects 0.000 description 1
- 231100000255 pathogenic effect Toxicity 0.000 description 1
- 230000007918 pathogenicity Effects 0.000 description 1
- 238000000059 patterning Methods 0.000 description 1
- 108010012038 peptide 78 Proteins 0.000 description 1
- 229940125863 peptide 78 Drugs 0.000 description 1
- 239000000813 peptide hormone Substances 0.000 description 1
- 230000002093 peripheral effect Effects 0.000 description 1
- 108010083127 phage repressor proteins Proteins 0.000 description 1
- 239000012071 phase Substances 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- 238000000596 photon cross correlation spectroscopy Methods 0.000 description 1
- 238000013492 plasmid preparation Methods 0.000 description 1
- 239000013600 plasmid vector Substances 0.000 description 1
- 102000005162 pleiotrophin Human genes 0.000 description 1
- 230000001402 polyadenylating effect Effects 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 230000001566 pro-viral effect Effects 0.000 description 1
- 229940029359 procrit Drugs 0.000 description 1
- 229960003387 progesterone Drugs 0.000 description 1
- 239000000186 progesterone Substances 0.000 description 1
- 230000002035 prolonged effect Effects 0.000 description 1
- 230000001915 proofreading effect Effects 0.000 description 1
- 230000001902 propagating effect Effects 0.000 description 1
- 230000004952 protein activity Effects 0.000 description 1
- 238000002818 protein evolution Methods 0.000 description 1
- 230000012743 protein tagging Effects 0.000 description 1
- 229940023143 protein vaccine Drugs 0.000 description 1
- 239000012264 purified product Substances 0.000 description 1
- 230000001698 pyrogenic effect Effects 0.000 description 1
- 230000000637 radiosensitizating effect Effects 0.000 description 1
- 238000002708 random mutagenesis Methods 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 230000008929 regeneration Effects 0.000 description 1
- 238000011069 regeneration method Methods 0.000 description 1
- 239000002336 ribonucleotide Substances 0.000 description 1
- 125000002652 ribonucleotide group Chemical group 0.000 description 1
- 235000019515 salmon Nutrition 0.000 description 1
- 108010068072 salmon calcitonin Proteins 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
- 230000007017 scission Effects 0.000 description 1
- 230000028327 secretion Effects 0.000 description 1
- 230000001568 sexual effect Effects 0.000 description 1
- 238000010008 shearing Methods 0.000 description 1
- 230000037432 silent mutation Effects 0.000 description 1
- 238000001542 size-exclusion chromatography Methods 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 239000007790 solid phase Substances 0.000 description 1
- NHXLMOGPVYXJNR-ATOGVRKGSA-N somatostatin Chemical compound C([C@H]1C(=O)N[C@H](C(N[C@@H](CO)C(=O)N[C@@H](CSSC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC=2C=CC=CC=2)C(=O)N[C@@H](CC=2C=CC=CC=2)C(=O)N[C@@H](CC=2C3=CC=CC=C3NC=2)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(=O)N1)[C@@H](C)O)NC(=O)CNC(=O)[C@H](C)N)C(O)=O)=O)[C@H](O)C)C1=CC=CC=C1 NHXLMOGPVYXJNR-ATOGVRKGSA-N 0.000 description 1
- 229960000553 somatostatin Drugs 0.000 description 1
- 230000010473 stable expression Effects 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 108020003113 steroid hormone receptors Proteins 0.000 description 1
- 102000005969 steroid hormone receptors Human genes 0.000 description 1
- 229960005202 streptokinase Drugs 0.000 description 1
- 229910052717 sulfur Inorganic materials 0.000 description 1
- 239000011593 sulfur Substances 0.000 description 1
- 231100000617 superantigen Toxicity 0.000 description 1
- 208000024891 symptom Diseases 0.000 description 1
- 238000001308 synthesis method Methods 0.000 description 1
- 238000010189 synthetic method Methods 0.000 description 1
- 230000008685 targeting Effects 0.000 description 1
- 229960003604 testosterone Drugs 0.000 description 1
- NZVYCXVTEHPMHE-ZSUJOUNUSA-N thymalfasin Chemical compound CC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NZVYCXVTEHPMHE-ZSUJOUNUSA-N 0.000 description 1
- 229960004231 thymalfasin Drugs 0.000 description 1
- 230000001550 time effect Effects 0.000 description 1
- 229960000187 tissue plasminogen activator Drugs 0.000 description 1
- 230000005030 transcription termination Effects 0.000 description 1
- 230000002463 transducing effect Effects 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 230000007704 transition Effects 0.000 description 1
- 239000001226 triphosphate Substances 0.000 description 1
- 235000011178 triphosphate Nutrition 0.000 description 1
- 125000002264 triphosphate group Chemical class [H]OP(=O)(O[H])OP(=O)(O[H])OP(=O)(O[H])O* 0.000 description 1
- 239000002753 trypsin inhibitor Substances 0.000 description 1
- 201000008827 tuberculosis Diseases 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 241001515965 unidentified phage Species 0.000 description 1
- 238000011144 upstream manufacturing Methods 0.000 description 1
- 229940035893 uracil Drugs 0.000 description 1
- 229960005356 urokinase Drugs 0.000 description 1
- 229940125575 vaccine candidate Drugs 0.000 description 1
- 208000007089 vaccinia Diseases 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- 230000035899 viability Effects 0.000 description 1
- 230000029812 viral genome replication Effects 0.000 description 1
- 210000000605 viral structure Anatomy 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
- C12N15/1027—Mutagenizing nucleic acids by DNA shuffling, e.g. RSR, STEP, RPR
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/005—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/475—Growth factors; Growth regulators
- C07K14/505—Erythropoietin [EPO]
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/52—Cytokines; Lymphokines; Interferons
- C07K14/53—Colony-stimulating factor [CSF]
- C07K14/535—Granulocyte CSF; Granulocyte-macrophage CSF
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K39/00—Medicinal preparations containing antigens or antibodies
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B01—PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
- B01J—CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
- B01J2219/00—Chemical, physical or physico-chemical processes in general; Their relevant apparatus
- B01J2219/00274—Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
- B01J2219/0068—Means for controlling the apparatus of the process
- B01J2219/00686—Automatic
- B01J2219/00689—Automatic using computers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2740/00—Reverse transcribing RNA viruses
- C12N2740/00011—Details
- C12N2740/10011—Retroviridae
- C12N2740/16011—Human Immunodeficiency Virus, HIV
- C12N2740/16111—Human Immunodeficiency Virus, HIV concerning HIV env
- C12N2740/16122—New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
Landscapes
- Health & Medical Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- Genetics & Genomics (AREA)
- Zoology (AREA)
- Engineering & Computer Science (AREA)
- Wood Science & Technology (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Biotechnology (AREA)
- Medicinal Chemistry (AREA)
- Gastroenterology & Hepatology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Microbiology (AREA)
- Toxicology (AREA)
- Crystallography & Structural Chemistry (AREA)
- Physics & Mathematics (AREA)
- Plant Pathology (AREA)
- Virology (AREA)
- Immunology (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Peptides Or Proteins (AREA)
- Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
Description
WO 00/18906 PCT/US99/22588
CROSS REFERENCE TO RELATED APPLICATIONS
This application is a non-provisional filing of USSN 60/102,362, "SHUFFLING OF CODON ALTERED GENES," Attorney Docket No. 02-028500, by Patten and Stemmer, filed 09/29/98, and 60/117,729, "SHUFFLING OF CODON ALTERED GENES," Attorney Docket No. 02-0285 10, by Patten and Stemmer, filed January 29, 1999. The application is also related to USSN 60/118,813 "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION," by Crameri et al., Attorney Docket Number 02-296, filed February 5, 1999; and USSN 60/141,049 "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION," by Crameri et al., Attorney Docket Number 02-296 1, filed June 24, 1999.
BACKGROUND
The genetic code is highly degenerate. Every DNA/RNA triplet (codon) encoding an amino acid can typically be altered, with the exception of ATG/AUG (coding for methionine) and TGG/UGG (coding for tryptophan), without altering the sequence of the protein encoded by the corresponding nucleic acid sequence. Roughly, on average (the distribution of amino acids varies from protein to protein), each coding triplet can be substituted about 3 different ways, since there are 61 codons encoding 20 amino acids (there are 3 additional triplets that serve as stop codons, for a total of 64 codons). This represents a possible sequence diversity of approximately 3^n possible sequences which encode a given protein, where n is the length of the protein in amino acids. As can easily be seen, for proteins of even modest length, the number of possible nucleic acids which can encode the protein exceeds the number of physical particles in the universe (estimated at about 10^80 particles). This tremendous potential coding sequence space for individual proteins has interesting evolutionary implications.
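The counting argument above can be checked with a short sketch. It assumes the standard genetic code (written here as a compact 64-letter string in T, C, A, G base order); the function names are illustrative and do not come from the application.

```python
from itertools import product
from math import ceil, log10, prod

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons that encode each amino acid (its degeneracy).
DEGENERACY = {}
for aa in CODON_TABLE.values():
    DEGENERACY[aa] = DEGENERACY.get(aa, 0) + 1


def count_coding_sequences(protein):
    """Exact number of distinct coding sequences for a one-letter protein string."""
    return prod(DEGENERACY[aa] for aa in protein)


def min_length_exceeding(exponent=80, codons_per_residue=3):
    """Smallest n for which codons_per_residue**n exceeds 10**exponent."""
    return ceil(exponent / log10(codons_per_residue))


if __name__ == "__main__":
    sense = sum(n for aa, n in DEGENERACY.items() if aa != "*")
    print(sense, "sense codons for 20 amino acids")
    print(count_coding_sequences("MLS"), "ways to encode Met-Leu-Ser")
    print(min_length_exceeding(), "residues suffice for 3^n to exceed 10^80")
```

Under the rough 3-codons-per-residue average used in the text, a protein of about 168 residues already has more than 10^80 possible coding sequences; the exact count for a real protein depends on its amino acid composition.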
For example, hypermutable viruses such as HIVs and other retroviruses typically stay one step ahead of the host immune system by accumulating non-random mutations based, in part, upon the particular codons used to encode recognition molecules, e.g., in the envelope portion of the virus. The mutations are non-random because viruses are selected for the ability to mutate to forms which are not quickly recognized by the host immune system. A consequence of this is that viruses are selected to have a non-random set of codons encoding, e.g., envelope proteins, allowing the viruses to shift forms rapidly by making, e.g., specific point mutations to generate specific alterations in protein structure.
Codon use is also non-random within species. By preferentially making a subset of all possible tRNAs, cells may conserve energy, and can optimize, or even regulate, the efficiency of cellular translation systems. This fact has long been recognized empirically, often allowing investigators initially to determine the reading frame of a given nucleic acid sequence simply by consideration of the codons resulting from different potential reading frames. One consequence of this "species codon bias" is that proteins within a species have a limited set of possible mutations that can arise as a consequence of, e.g., point mutation. This limits the possible evolution rate of proteins.
In addition to the diversity of nucleic acid coding sequences which encode any given protein, it is now clear that protein sequences are, themselves, quite degenerate. Often, many of the amino acid residues constituting a protein may be substituted for structurally similar amino acid units without significantly changing the tertiary structure of the protein. Thus, it may be difficult to determine which residues to modify in order to improve desirable properties of a protein.
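The point that codon choice limits which mutations a single base change can produce can be illustrated with a small sketch. It assumes the standard genetic code; the helper name point_mutation_spectrum is hypothetical and used only for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def point_mutation_spectrum(codon):
    """Amino acids (or '*') reachable from codon by exactly one base change.

    Synonymous changes are excluded, so the result is the set of residues
    that a single point mutation can introduce at this position.
    """
    original = CODON_TABLE[codon]
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] != original:
                reachable.add(CODON_TABLE[mutant])
    return reachable


if __name__ == "__main__":
    # Two serine codons from different codon families.
    for codon in ("TCT", "AGC"):
        print(codon, sorted(point_mutation_spectrum(codon)))
```

Both TCT and AGC encode serine, yet their one-step neighbors share only threonine and cysteine. Re-encoding a residue with a different synonymous codon therefore changes which substitutions point mutation can reach.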
For proteins which are commercially valuable, it would be desirable to be able to gain access to a mutational spectrum which is different than that of the native protein. The present invention provides this, and many other features, that will be apparent upon complete review of the following.
SUMMARY OF THE INVENTION
The present invention provides methods of accessing a completely different mutational spectrum for a selected protein than is available in the naturally occurring nucleic acid encoding the protein. This increases the type and rate of forced evolution for the selected protein, allowing for rapid improvement of any detectable characteristic of the protein.
In the methods, nucleic acids are synthesized with altered codon usage, and/or which encode one or several amino acid residue changes as compared to the selected protein, where the amino acid and codon usage changes can be conservative or non-conservative. The resulting codon/amino acid modified nucleic acid(s) are recombined using DNA shuffling techniques with either the native nucleic acid, or with each other (or both), typically using recursive shuffling methods. The nucleic acids or the encoded protein are then screened for a desirable property.
Thus, the invention provides methods of making codon altered nucleic acids. In the methods, a first nucleic acid sequence encoding a first polypeptide sequence is selected. A plurality of codon altered nucleic acid sequences, each of which encode the first polypeptide, or a modified form thereof, are then selected (e.g., a library of codon altered nucleic acids can be selected in a biological assay which recognizes library components or activities), and the plurality of codon altered nucleic acid sequences is recombined to produce a target codon altered nucleic acid encoding a second protein.
The target codon altered nucleic acid is then screened for a detectable functional or structural property, optionally including comparison to the properties of the first polypeptide. The goal of such screening is to identify a polypeptide that has a structural or functional property equivalent or superior to the first polypeptide. A nucleic acid encoding such a polypeptide can be used in essentially any procedure desired, including introducing the target codon altered nucleic acid into a cell, vector, virus, attenuated virus (e.g., as a component of a vaccine or immunogenic composition), transgenic organism, or the like.
Kits and compositions for practicing the methods are also provided, including one or more of: cell recombination mixtures and substrates (e.g., nucleic acids with altered codon usage), containers, instructional material for practicing the methods, or the like.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1 is a nucleic acid/amino acid sequence of a part of the monkey EPO gene, which is similar to the human EPO gene.
Figure 2 shows an example of a codon altered EPO nucleic acid sequence.
Figure 3 shows an alignment of naturally occurring EPOs.
Figure 4 is a schematic of the human EPO wobble sequence space.
Figure 5 is a schematic of Mammalian EPO Family-Wobble Sequence Space.
Figure 6 is a sequence alignment of G-CSF homologs, with species information.
Figure 7 is a sequence alignment of G-CSF homologs, with differences broken out.
Figure 8 is a sequence alignment showing the hydrophobic core residues of human G-CSF (blacked out).
Figure 9 is a schematic showing the shuffling strategy for G-CSF.
Figure 10 is a list of oligos used to make a codon altered alkaline phosphatase.
Figure 11 is a map of oligos used to make a codon altered alkaline phosphatase.
Figure 12 is a schematic of vaccination with evolution defective viruses.
Figure 13 is a schematic of different mutations that result from different codon types for ser, arg, and leu.
Figure 14 is a schematic of vaccination with evolution defective viruses.
Figure 15 is a schematic of vaccination with evolution defective viruses showing sophisticated versus non-sophisticated "mutant clouds."
Figure 16, panels A-C show results of single mutations of different codons for ser, arg, and leu.
Fig. 17 is a schematic of protein evolution with expanded mutation spectra.
Fig. 18, panels A-D show codon altered forms of Env.
Fig. 19 is a list of oligos in one application for synthesis of HIV Env.
DEFINITIONS
Unless clearly indicated to the contrary, the following definitions supplement definitions of terms known in the art.
As used herein, a "recombinant" nucleic acid is a nucleic acid produced by recombination between two or more nucleic acids, or any nucleic acid made by an in vitro or artificial process. The term "recombinant" when used with reference to a cell indicates that the cell comprises (and optionally replicates) a heterologous nucleic acid, or expresses a peptide or protein encoded by a heterologous nucleic acid. Recombinant cells can contain genes that are not found within the native (non-recombinant) wild-type form of the cell. Recombinant cells can also contain genes found in the native form of the cell where the genes are modified and re-introduced into the cell by artificial means. The term also encompasses cells that contain a nucleic acid endogenous to the cell that has been artificially modified without removing the nucleic acid from the cell; such modifications include those obtained by gene replacement, site-specific mutation, chimeraplasty, and related techniques.
A "codon altered" nucleic acid is a first nucleic acid that encodes a first polypeptide similar or identical to a naturally occurring polypeptide encoded by a naturally occurring nucleic acid, where the first nucleic acid utilizes a plurality of codons to encode the first polypeptide, which differ from the codons of the naturally occurring nucleic acid that encode the naturally occurring polypeptide.
A "nucleic acid sequence" refers to either a nucleic acid (e.g., RNA, DNA or modified form thereof, in isolated, recombinant or native form) or to a representation of the nucleic acid such as a sequence of letters indicating the primary structure (sequence) of the nucleic acid.
A "polypeptide sequence" refers to either a polypeptide (or modified form thereof, in isolated, recombinant or native form) or to a representation of the polypeptide such as a sequence of letters or other character string information indicating the primary structure (amino acid sequence) of the polypeptide.
A "modified form" of a reference polypeptide is a target polypeptide which has a similar, but not identical, sequence to the reference polypeptide. The sequence of the target polypeptide can differ from the reference polypeptide by conservative or non-conservative substitutions of the reference polypeptide sequence. As noted in more detail, supra, different nucleic acids encoding different target polypeptides having different non-conservative substitutions relative to the reference polypeptide can be recombined to produce a recombined nucleic acid encoding a target polypeptide more similar to the reference polypeptide.
A "plurality of forms" of a selected nucleic acid refers to a plurality of homologs of the nucleic acid.
The homologs can be from naturally occurring homologs (e.g., two or more homologous genes, or derivatives thereof) or by artificial synthesis of one or more nucleic acids having related sequences, or by modification of one or more nucleic acid to produce related nucleic acids. Nucleic acids are homologous when they are derived, naturally or artificially, from a common ancestor sequence. During natural evolution, this occurs when two or more descendent sequences diverge from a parent sequence over time, i.e., due to mutation and natural selection. Under artificial conditions, divergence occurs, e.g., in one of two ways. First, a given sequence can be artificially recombined with another sequence, as occurs, e.g., during typical cloning, to produce a descendent nucleic acid. Alternatively, a nucleic acid can be synthesized de novo, by synthesizing a nucleic acid which varies in sequence from a given parental nucleic acid sequence.
When there is no explicit knowledge about the ancestry of two nucleic acids, homology is typically inferred by sequence comparison between two sequences. Where two nucleic acid sequences show sequence similarity it is inferred that the two nucleic acids share a common ancestor. The precise level of sequence similarity required to establish homology varies in the art depending on a variety of factors. For purposes of this disclosure, two sequences are considered homologous where they share sufficient sequence identity to allow recombination to occur between two nucleic acid molecules, or when codon changes can be made which would result in two or more nucleic acids having the ability to recombine. Typically, nucleic acids require regions of close similarity spaced roughly the same distance apart to permit recombination to occur.
The terms "identical" or percent "identity," in the context of two or more nucleic acid or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using one of the sequence comparison algorithms described below (or other algorithms available to persons of skill) or by visual inspection.
The phrase "substantially identical," in the context of two nucleic acids or polypeptides refers to two or more sequences or subsequences that have at least about 40%, 50%, 60%, or preferably about 70% or 80% or more, or most preferably 90-95% nucleotide or amino acid residue identity, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection. Such "substantially identical" sequences are typically considered to be homologous. Preferably, the "substantial identity" exists over a region of the sequences that is at least about 50 residues in length, more preferably over a region of at least about 100 residues, and most preferably the sequences are substantially identical over at least about 150 residues, or over the full length of the two sequences to be compared.
For sequence comparison and homology determination, typically one sequence acts as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
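As a minimal sketch of the percent identity calculation described above, the function below scores two sequences that have already been aligned (gaps written as "-"). It makes one assumption not fixed by the text: the denominator is the number of alignment columns, excluding columns where both sequences have a gap. Alignment programs differ on this convention, so the function is illustrative only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned and therefore equal in length.
    Columns where both sequences carry a gap are ignored; a gap opposite a
    residue counts as a compared column that does not match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    print(percent_identity("ACGTACGT", "ACGTTCGT"))  # one mismatch in eight columns
    print(percent_identity("AC-T", "ACGT"))          # one gap column in four
```

A threshold such as the "at least about 70%" figure in the definition of "substantially identical" can then be applied to the returned value over the chosen comparison region.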
Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by visual inspection (see generally, Ausubel et al., infra).
One example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al., J. Mol. Biol. 215:403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score.
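The seed-and-extend idea described above can be sketched for nucleotide sequences as follows. This is a simplified illustration, not the BLAST implementation: it seeds only on exact word matches (no neighborhood threshold T), performs ungapped extension, and keeps the extent with the highest cumulative score. The function names and the values match=5 and mismatch=-4 are illustrative choices that satisfy M > 0 and N < 0.

```python
def find_word_hits(query, subject, w):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    hits = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            hits.append((i, j))
    return hits


def _best_extension(pairs, match, mismatch):
    """Walk residue pairs outward from a seed; return (best score gain, length at best)."""
    running = best = best_len = 0
    for k, (x, y) in enumerate(pairs, start=1):
        running += match if x == y else mismatch
        if running > best:
            best, best_len = running, k
    return best, best_len


def extend_hit(query, subject, i, j, w, match=5, mismatch=-4):
    """Ungapped extension of an exact word hit at query[i:i+w] / subject[j:j+w].

    Returns (score, query_start, query_end) for the highest scoring extent.
    """
    right = zip(query[i + w:], subject[j + w:])
    left = zip(reversed(query[:i]), reversed(subject[:j]))
    right_gain, right_len = _best_extension(right, match, mismatch)
    left_gain, left_len = _best_extension(left, match, mismatch)
    score = w * match + left_gain + right_gain
    return score, i - left_len, i + w + right_len


if __name__ == "__main__":
    q, s = "TTACGTAA", "GGACGTAC"
    for i, j in find_word_hits(q, s, 4):
        print((i, j), extend_hit(q, s, i, j, 4))
```

Production implementations stop each extension early once the running score drops too far below its maximum, rather than scanning to the end of the sequence as this sketch does.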
Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1989) Proc. Natl. Acad. Sci. USA 89:10915).
In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul (1993) Proc. Nat'l. Acad. Sci. USA 90:5873-5787). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.
Another indication that two nucleic acid sequences are substantially identical/homologous is that the two molecules hybridize to each other under stringent conditions.
The phrase "hybridizing specifically to," refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions, including when that sequence is present in a complex mixture (e.g., total cellular DNA or RNA). "Bind(s) substantially" refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target polynucleotide sequence.
"Stringent hybridization conditions" and "stringent hybridization wash conditions" in the context of nucleic acid hybridization experiments such as Southern and northern hybridizations are sequence dependent, and are different under different environmental parameters. Longer sequences and sequences with higher G:C content remain hybridized at higher temperatures (or at lower salt). An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology--Hybridization with Nucleic Acid Probes part I chapter 2 "Overview of principles of hybridization and the strategy of nucleic acid probe assays," Elsevier, New York. Generally, highly stringent hybridization and wash conditions are selected to be about 5 °C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. Typically, under "stringent conditions" a probe will hybridize to its target subsequence, but not to unrelated (non-homologous) sequences.
The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Very stringent conditions are selected to be equal to the Tm for a particular probe.
An example of stringent hybridization conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on a filter in a Southern or northern blot is 50% formamide with 1 mg of heparin at 42 °C, with the hybridization being carried out overnight. An example of highly stringent wash conditions is 0.15 M NaCl at 72 °C for about 15 minutes. An example of stringent wash conditions is a 0.2x SSC wash at 65 °C for 15 minutes (see, Sambrook, infra., for a description of SSC buffer). Often, a high stringency wash is preceded by a low stringency wash to remove background probe signal. An example medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is 1x SSC at 45 °C for 15 minutes. An example low stringency wash for a duplex of, e.g., more than 100 nucleotides, is 4-6x SSC at 40 °C for 15 minutes. For short probes (e.g., about 10 to 50 nucleotides), stringent conditions typically involve salt concentrations of less than about 1.0 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3, and the temperature is typically at least about 40 °C. Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide. In general, a signal to noise ratio of 2x (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization. If the signal to noise ratio is less than 2x binding of an unrelated probe (e.g., a nucleic acid encoding a non-homologous protein), the nucleic acids at issue do not hybridize under stringent conditions. Similarly, if the signal to noise ratio is less than 25% as high as that observed for a perfectly matched probe under stringent conditions, the nucleic acids do not "hybridize under stringent conditions" as that term is used herein.
This does not apply to highly stringent conditions, as the stringency can theoretically be increased until only a perfectly matched probe will hybridize.
In one example hybridization procedure, a target nucleic acid to be probed is blotted onto a filter by any conventional method. An unrelated nucleic acid such as a plasmid vector (assuming that the unrelated nucleic acid has no homology with the target nucleic acid) is also blotted, in approximately equal amounts, onto the filter. The filter is probed with a labeled probe complementary to the target nucleic acid. The experiment is repeated at gradually increasing stringency of hybridization and wash conditions until signal from the hybridization of the labeled probe to the complementary target is 10-100X as high as to the unrelated plasmid vector nucleic acid. Once these conditions are determined as described above, a test nucleic acid is probed under the same conditions as the target. If signal from the labeled probe is 25% as high or higher than the signal from binding of the probe to the target, the test nucleic acid "hybridizes under stringent conditions" to the probe. If the signal is less than 25% as high, the test nucleic acid does not hybridize under stringent conditions to the probe.
Nucleic acids which do not hybridize to each other under stringent conditions are still recognizable as variant forms of a nucleic acid when the polypeptides they encode are substantially identical. This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code. Such nucleic acids are not functionally equivalent, as described in detail herein, due to differences in mRNA folding, alterations of regulatory sequences and the like.
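The example hybridization procedure above reduces to two ratio checks on measured signals, which can be sketched as follows. The function names and the use of plain numeric signal values are assumptions for illustration; how the signals are measured is outside the sketch.

```python
def conditions_are_calibrated(target_signal, unrelated_signal, min_ratio=10.0):
    """True once probe signal on the target is at least 10X that on the
    unrelated plasmid vector (the text gives a range of 10-100X)."""
    if unrelated_signal <= 0:
        raise ValueError("unrelated signal must be positive")
    return target_signal / unrelated_signal >= min_ratio


def hybridizes_under_stringent_conditions(test_signal, target_signal, threshold=0.25):
    """True if the test nucleic acid gives a signal 25% as high or higher
    than the signal from binding of the probe to the target."""
    if target_signal <= 0:
        raise ValueError("target signal must be positive")
    return test_signal / target_signal >= threshold
```

The second check is only meaningful after the first has been satisfied, since the 25% threshold is defined relative to conditions already tuned on the target and the unrelated control.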
Another indication that two nucleic acid sequences or polypeptides are variant forms is that the polypeptide encoded by the first nucleic acid is immunologically cross-reactive with the polypeptide encoded by the second nucleic acid, as tested by polyclonal antisera generated to the first polypeptide. Thus, a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions.
"Conservatively modified variations" of a particular polynucleotide sequence are those polynucleotide variations that encode identical or essentially identical amino acid sequences, or, where the polynucleotide does not encode an amino acid sequence, are essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given polypeptide. For instance, the codons CGU, CGC, CGA, CGG, AGA, and AGG all encode the amino acid arginine. Thus, at every position where an arginine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are "silent variations," which are one species of "conservatively modified variations." Every polynucleotide sequence described herein which encodes a polypeptide also optionally describes every possible silent variation, except where otherwise noted. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the codon for methionine, and UGG, which is ordinarily the codon for tryptophan) can be modified to yield a peptide which is structurally identical.
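The notion of silent variation described above can be made concrete with a short sketch that enumerates every sequence encoding the same peptide. It assumes the standard genetic code and writes codons in the DNA alphabet (T in place of U); the names `CODON_TABLE`, `SYNONYMS`, `translate` and `silent_variants` are illustrative and not taken from the specification.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Map each amino acid (or stop, "*") to the list of codons that encode it.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate an in-frame DNA sequence, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def silent_variants(dna):
    """Yield every DNA sequence that encodes the same peptide as `dna`."""
    usable = len(dna) - len(dna) % 3
    choices = [SYNONYMS[CODON_TABLE[dna[i:i + 3]]] for i in range(0, usable, 3)]
    for combo in product(*choices):
        yield "".join(combo)
```

For a peptide containing methionine, arginine and tryptophan, only the arginine position contributes alternatives, so `silent_variants("ATGCGTTGG")` yields six sequences. The number of variants grows multiplicatively with length, so full enumeration is practical only for short sequences.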
Furthermore, one of skill will recognize that individual substitutions, deletions or additions which alter, add or delete a single amino acid or a small percentage of amino acids (typically less than 5%, more typically less than 1%) in an encoded sequence are "conservatively modified variations" where the alterations result in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. The following five groups each contain amino acids that are conservative substitutions for one another: Aliphatic: Glycine (G), Alanine (A), Valine (V), Leucine (L), Isoleucine (I); Aromatic: Phenylalanine (F), Tyrosine (Y), Tryptophan (W); Sulfur-containing: Methionine (M), Cysteine (C); Basic: Arginine (R), Lysine (K), Histidine (H); Acidic: Aspartic acid (D), Glutamic acid (E), Asparagine (N), Glutamine (Q). See also, Creighton (1984) Proteins, W.H. Freeman and Company. In addition, individual substitutions, deletions or additions which alter, add or delete a single amino acid or a small percentage of amino acids in an encoded sequence are also "conservatively modified variations." Sequences that differ by conservative variations are generally homologous.
The term "isolated", when applied to a nucleic acid or protein, denotes that the nucleic acid or protein is essentially free of other cellular or other components (e.g., library components) with which it is associated in the natural state.
The term "nucleic acid" refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides which have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides.
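The five conservative substitution groups listed above can be encoded directly as a lookup, as in the following sketch. The groups are transcribed from the text as given (residues such as serine, threonine and proline fall in no group); the function names and the treatment of identical residues as conservative are assumptions for illustration.

```python
# Groups transcribed from the text; residues absent from every group
# (e.g., S, T, P) are never counted as conservative replacements here.
CONSERVATIVE_GROUPS = {
    "aliphatic": set("GAVLI"),
    "aromatic": set("FYW"),
    "sulfur-containing": set("MC"),
    "basic": set("RKH"),
    "acidic": set("DENQ"),
}


def is_conservative_or_identical(a, b):
    """True if residues a and b are identical or share a substitution group."""
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS.values())


def fraction_nonconservative(protein_a, protein_b):
    """Fraction of aligned positions carrying a non-conservative change."""
    if not protein_a or len(protein_a) != len(protein_b):
        raise ValueError("sequences must be non-empty and of equal length")
    changes = sum(
        1 for a, b in zip(protein_a, protein_b)
        if not is_conservative_or_identical(a, b)
    )
    return changes / len(protein_a)
```

The sketch assumes the two sequences are already aligned without gaps; deletions and additions mentioned in the text are not modeled.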
Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al. (1991) Nucleic Acid Res. 19: 5081; Ohtsuka et al. (1985) J. Biol. Chem. 260: 2605-2608; Cassol et al. (1992); Rossolini et al. (1994) Mol. Cell. Probes 8: 91-98). The term nucleic acid is generic to the terms "gene", "DNA," "cDNA", "oligonucleotide," "RNA," "mRNA," and the like.
"Nucleic acid derived from a gene" refers to a nucleic acid for whose synthesis the gene, or a subsequence thereof, has ultimately served as a template. Thus, an mRNA, a cDNA reverse transcribed from an mRNA, an RNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc., are all derived from the gene and detection of such derived products is indicative of the presence and/or abundance of the original gene and/or gene transcript in a sample.
A nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic acid sequence. For instance, a promoter or enhancer is operably linked to a coding sequence if it increases the transcription of the coding sequence.
A "recombinant expression cassette" or simply an "expression cassette" is a nucleic acid construct, generated recombinantly or synthetically, with nucleic acid elements that are capable of effecting expression of a structural gene in hosts compatible with such sequences. Expression cassettes include at least promoters and optionally, transcription termination signals.
Typically, the recombinant expression cassette includes a nucleic acid to be transcribed (e.g., a nucleic acid encoding a desired polypeptide), and a promoter. Additional factors necessary or helpful in effecting expression may also be used as described herein. For example, an expression cassette can also include nucleotide sequences that encode a signal sequence that directs secretion of an expressed protein from the host cell. Transcription termination signals, enhancers, and other nucleic acid sequences that influence gene expression can also be included in an expression cassette.
DETAILED DISCUSSION OF THE INVENTION
In the present invention, the sequence diversity of substrates for DNA shuffling procedures is increased by using codon-altered nucleic acids as templates and/or by using templates that encode proteins with conservative or non-conservative amino acid modifications as compared to a selected wild-type protein.
These codon altered nucleic acids can be chemically synthesized (e.g., using standard artificial synthetic protocols, e.g., those typically used by commercial sources from which nucleic acids can be ordered), or can be made using any of a variety of methods herein or available to one of skill. For example, oligonucleotide fragments can be made which correspond to a codon altered nucleic acid which is desired using standard synthetic methods, followed by polymerase and/or ligase mediated oligonucleotide ligation/recombination protocols to generate full-length nucleic acids.
The combination of codon usage modifications and coding modifications can be extensive enough to reduce or, under stringent conditions, even eliminate the hybridization of the codon-altered nucleic acids to a nucleic acid which naturally encodes the selected protein. This dramatically alters the mutations which result from possible single nucleotide mutations, providing access to greater diversity for DNA shuffling protocols.
In addition, the recombination and selection of such nucleic acids during DNA shuffling procedures can result not only in access to a different set of possible mutations, but can also result in modified forms of transcriptional or translational regulation, alterations in nucleic acid localization, mRNA stability and the like. Furthermore, the modified hybridization properties of codon altered nucleic acids lead to alterations in the ability of the nucleic acids to hybridize with potential recombination partners, altering, and ultimately increasing, the available recombination diversity during shuffling.
Furthermore, "family shuffling" using codon-altered substrates even further increases the possible sequence diversity of the starting materials for recombination. As currently practiced, family shuffling methods involve shuffling nucleic acids encoding sequence variants of a given protein (e.g., species or allele homologs). In the present methods, this procedure is modified by generating codon-altered versions of the sequence variants to access additional molecular diversity during recombination. Additional diversity is achieved by conservatively and non-conservatively modifying the starting nucleic acids to encode non-naturally occurring sequence variants. Family shuffling can be performed even using homologs of relatively low identity. In such cases, codons may be changed in one or more of the family members to increase the level of identity between the members, thereby increasing their ability to recombine using the methods of this invention.
Gene shuffling and family shuffling provide two of the most powerful methods available for improving and "migrating" (gradually changing the type of reaction, substrate or activity of a selected protein such as an enzyme, or regulation or structure of an expressed component) the functions of proteins.
In family shuffling, homologous sequences, e.g., from different species, chromosomal positions, or due to synthetic alteration, are recombined. In gene shuffling, a single sequence is mutated or otherwise altered and then recombined.
The generation and screening of high quality shuffled libraries provides for DNA shuffling (or "directed evolution"). The availability of appropriate high-throughput analytical chemistry to screen the libraries permits integrated high-throughput shuffling and screening of the libraries to achieve a desired activity.
In one significant embodiment, oligonucleotides for constructing codon modified nucleic acids are designed in a computer ("in silico"). Predicted codon-modified recombinant nucleic acids can also be determined in silico, i.e., essentially as taught in Selifonov and Stemmer "METHODS FOR MAKING CHARACTER STRINGS, POLYNUCLEOTIDES & POLYPEPTIDES HAVING DESIRED CHARACTERISTICS" filed 02/05/1999, USSN 60/118854.
Furthermore, rather than generating codon-modified nucleic acids as substrates for recombination, families of nucleic acids can be recombined simply by appropriate selection of the relevant oligonucleotides which are used in gene reconstruction methods to produce recombinant nucleic acids, i.e., by using codon-modified nucleic acid oligonucleotides as discussed herein in conjunction with family oligonucleotide-mediated shuffling methods, e.g., as taught in Crameri et al. "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" filed February 5, 1999, USSN 60/118,813 and Crameri et al. "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" filed June 24, 1999, USSN 60/141,049. The technique can be used to recombine homologous or even non-homologous nucleic acid sequences; in the context of the present invention, oligonucleotides corresponding to families of codon-modified nucleic acids are shuffled.
The present invention provides significant advantages over previously used methods for optimization of genes.
For example, DNA shuffling of codon modified nucleic acids can result in optimization of a desirable property even in the absence of a detailed understanding of the mechanism by which the particular property is mediated. In addition, entirely new properties can be obtained upon shuffling of codon modified DNAs, i.e., shuffled DNAs can encode polypeptides or RNAs with properties entirely absent in the parental DNAs which are shuffled. Thus, by modifying the codon usage and/or encoded amino acids of the relevant gene or other nucleic acid, molecular diversity is accessed and sequences can be shuffled to obtain desired, including entirely new, properties. In general, sequence recombination can be achieved in many different formats and permutations of formats, as described in further detail below.
The targets for modification vary in different applications, as does the property sought to be acquired or improved. Examples of candidate targets for acquisition of a property or improvement in a property include genes that encode proteins which have enzymatic or therapeutic or other commercially useful activities. A more extensive listing is found supra; however, even this list is not intended to be limiting, as essentially any nucleic acid can be codon modified and shuffled, using one or more of the processes herein.
Shuffling methods use at least two variant forms of a starting target (the variant forms can be nucleic acids, or representations thereof, e.g., as character strings in a computer program). The variant forms of candidate codon-altered substrates can show substantial sequence or secondary structural similarity with each other, but they should also differ in at least one and preferably at least two positions.
The initial diversity between forms can be the result of natural variation, e.g., the different variant forms (homologs) are obtained from different individuals or strains of an organism, or constitute related sequences from the same organism (e.g., allelic variations), or constitute homologs from different organisms (interspecific variants), or constitute artificial homologs, e.g., codon-altered nucleic acids encoding the same or a similar protein. Any or all of these sequences can represent or include codon altered nucleic acids.
Initial diversity can also be induced, e.g., the variant forms can be generated by error-prone transcription, such as an error-prone PCR or use of a polymerase which lacks proof-reading activity (see, Liao (1990) Gene 88:107-111), of the first variant form, or by replication of the first form in a mutator strain (mutator host cells are discussed in further detail below, and are generally well known). The initial diversity between substrates is greatly augmented in subsequent steps of recombination for library generation.
A mutator strain can include any mutants in any organism impaired in the functions of mismatch repair. These include mutant gene products of mutS, mutT, mutH, mutL, uvrD, dcm, vsr, umuC, umuD, sbcB, recJ, etc. The impairment is achieved by genetic mutation, allelic replacement, selective inhibition by an added reagent such as a small compound or an expressed antisense RNA, or other techniques. Impairment can be of the genes noted, or of homologous genes in any organism.
The properties or characteristics that can be acquired or improved vary widely, and, of course, depend on the choice of substrate. At least two variant forms of a nucleic acid, e.g., which can confer a desired activity or which can be recombined to produce a desired activity, are recombined to produce a library of recombinant nucleic acids.
The library is then screened to identify at least one recombinant nucleic acid that is optimized for the particular property or properties of interest.
Often, improvements are achieved after one round of recombination and selection. However, recursive sequence recombination can be employed to achieve still further improvements in a desired property, or to bring about new (or "distinct") properties. Recursive sequence recombination entails successive cycles of recombination to generate molecular diversity. That is, one creates a family of nucleic acid molecules showing some sequence identity to each other but differing due to the presence of mutations. In any given cycle, recombination can occur in vivo or in vitro, intracellularly or extracellularly. Furthermore, diversity resulting from recombination can be augmented in any cycle by applying known methods of mutagenesis (e.g., error-prone PCR or cassette mutagenesis) to either the substrates or products for recombination. In general, however, a single cycle of DNA shuffling of codon-altered nucleic acids provides for generation of surprisingly effective nucleic acids. Accordingly, while recursive approaches to shuffling can be used, single cycle recombination is also preferred. Typically, 2, 3, 4, 5, or even 10 or more cycles of recombination can be performed, each cycle optionally comprising one or more selection steps.
A recombination cycle is usually followed by at least one cycle of screening or selection for molecules having a desired property or characteristic. If a recombination cycle is performed in vitro, the products of recombination, i.e., recombinant segments, are sometimes introduced into cells before the screening step. Recombinant segments can also be linked to an appropriate vector or other regulatory sequences before screening.
Alternatively, products of recombination generated in vitro are sometimes packaged in viruses (e.g., bacteriophage) before screening. If recombination is performed in vivo, recombination products can sometimes be screened in the cells in which recombination occurred. In other applications, recombinant segments are extracted from the cells, and optionally packaged as viruses, before screening.
The nature of screening or selection depends on what property or characteristic is to be acquired or the property or characteristic for which improvement is sought, and many examples are discussed below. It is not usually necessary to understand the molecular basis by which particular products of recombination (recombinant segments) have acquired new or improved properties or characteristics relative to the starting substrates. For example, a gene can have many component sequences, each having a different intended role (e.g., coding sequences, regulatory sequences, targeting sequences, stability-conferring sequences, subunit sequences and sequences affecting integration). Each of these component sequences can be varied and recombined simultaneously. Screening/selection can then be performed, for example, for recombinant segments that have increased ability to confer activity upon a cell without the need to attribute such improvement to any of the individual component sequences of the vector.
Depending on the particular screening protocol used for a desired property, initial round(s) of screening can sometimes be performed using bacterial cells due to high transfection efficiencies and ease of culture. However, bacterial expression is often not practical or desired, and yeast, fungal or other eukaryotic systems are also used for library expression and screening.
Similarly, other types of screening which are not amenable to screening in bacterial or simple eukaryotic library cells are performed in cells selected for use in an environment close to that of their intended use. Final rounds of screening can be performed in the precise cell type of intended use.
If further improvement in a property is desired, at least one and usually a collection of recombinant segments surviving a first round of screening/selection are subject to a further round of recombination. These recombinant segments can be recombined with each other or with exogenous segments representing the original substrates or further variants thereof. Again, recombination can proceed in vitro or in vivo. If the previous screening step identifies desired recombinant segments as components of cells, the components can be subjected to further recombination in vivo, or can be subjected to further recombination in vitro, or can be isolated before performing a round of in vitro recombination. Conversely, if the previous screening step identifies desired recombinant segments in naked form or as components of viruses, these segments can be introduced into cells to perform a round of in vivo recombination. The second round of recombination, irrespective of how it is performed, generates further recombinant segments which encompass additional diversity beyond that present in recombinant segments resulting from a previous round (or from multiple previous rounds, e.g., where the process is iteratively repeated).
The second round of recombination can be followed by a further round of screening/selection according to the principles discussed above for the first round. The stringency of screening/selection can be increased between rounds. Also, the nature of the screen and the property being screened for can vary between rounds if improvement in more than one property is desired or if acquiring more than one new property is desired.
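The cycle of recombination followed by screening/selection described above can be sketched in silico on character strings. This is a toy model under stated assumptions: recombination is reduced to a single random crossover between two equal-length parents, screening is a user-supplied scoring function, and all names (`crossover`, `shuffle_round`, `recursive_recombination`) and default sizes are hypothetical. It does not model fragment reassembly, mutagenesis, or any wet-lab step.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Recombine two equal-length sequences at one random internal point."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def shuffle_round(parents, library_size, rng):
    """Build a library from the parents plus random pairwise recombinants."""
    library = list(parents)  # parents are retained, so the best score never drops
    while len(library) < library_size:
        a, b = rng.sample(parents, 2)
        library.append(crossover(a, b, rng))
    return library


def recursive_recombination(parents, score, rounds=3, library_size=50, keep=5, seed=0):
    """Alternate recombination and selection; return the final selected pool.

    Requires at least two parents of equal length (two or more characters)
    and keep >= 2, so that each round has a pair to recombine.
    """
    rng = random.Random(seed)
    pool = list(parents)
    for _ in range(rounds):
        library = shuffle_round(pool, library_size, rng)
        # Screening/selection: keep the top-scoring members for the next round.
        pool = sorted(library, key=score, reverse=True)[:keep]
    return pool
```

Raising the stringency of selection between rounds, as the text suggests, would correspond to lowering `keep` or changing `score` from one round to the next.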
Additional rounds of recombination and screening can then be performed until the recombinant segments have sufficiently evolved to acquire the desired new or improved property or function.
The practice of this invention involves the construction of recombinant nucleic acids and the expression of genes in transfected host cells. Molecular cloning techniques to achieve these ends are known in the art. A wide variety of cloning and in vitro amplification methods suitable for the construction of recombinant nucleic acids such as expression vectors are well-known to persons of skill. General texts which describe molecular biological techniques useful herein, including mutagenesis, include Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152 Academic Press, Inc., San Diego, CA (Berger); Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989 ("Sambrook") and Current Protocols in Molecular Biology, F.M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (supplemented through 1998) ("Ausubel"). Methods of transducing cells, including plant and animal cells, with nucleic acids are generally available, as are methods of expressing proteins encoded by such nucleic acids. In addition to Berger, Ausubel and Sambrook, useful general references for culture of animal cells include Freshney (Culture of Animal Cells, a Manual of Basic Technique, third edition Wiley-Liss, New York (1994)) and the references cited therein, Humason (Animal Tissue Techniques, fourth edition W.H. Freeman and Company (1979)) and Ricciardelli, et al., In Vitro Cell Dev. Biol. 25:1016-1024 (1989). References for plant cell cloning, culture and regeneration include Payne et al. (1992) Plant Cell and Tissue Culture in Liquid Systems John Wiley & Sons, Inc.
New York, NY (Payne); and Gamborg and Phillips (eds) (1995) Plant Cell, Tissue and Organ Culture; Fundamental Methods Springer Lab Manual, Springer-Verlag (Berlin Heidelberg New York) (Gamborg). A variety of cell culture media are described in Atlas and Parks (eds) The Handbook of Microbiological Media (1993) CRC Press, Boca Raton, FL (Atlas). Additional information for plant cell culture is found in available commercial literature such as the Life Science Research Cell Culture Catalogue (1998) from Sigma-Aldrich, Inc (St Louis, MO) (Sigma-LSRCCC) and, e.g., the Plant Culture Catalogue and supplement (1997) also from Sigma-Aldrich, Inc (St Louis, MO) (Sigma-PCCS).
Examples of techniques sufficient to direct persons of skill through in vitro amplification methods, including the polymerase chain reaction (PCR), the ligase chain reaction (LCR), Qβ-replicase amplification and other RNA polymerase mediated techniques (e.g., NASBA) are found in Berger, Sambrook, and Ausubel, id., as well as in Mullis et al., (1987) U.S. Patent No. 4,683,202; PCR Protocols A Guide to Methods and Applications (Innis et al. eds) Academic Press Inc. San Diego, CA (1990) (Innis); Arnheim & Levinson (October 1, 1990) C&EN 36-47; The Journal Of NIH Research (1991) 3, 81-94; Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86, 1173; Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87, 1874; Lomell et al. (1989) J. Clin. Chem 35, 1826; Landegren et al., (1988) Science 241, 1077-1080; Van Brunt (1990) Biotechnology 8, 291-294; Wu and Wallace, (1989) Gene 4, 560; Barringer et al. (1990) Gene 89, 117, and Sooknanan and Malek (1995) Biotechnology 13: 563-564. Improved methods of cloning in vitro amplified nucleic acids are described in Wallace et al., U.S. Pat. No. 5,426,039. Improved methods of amplifying large nucleic acids by PCR are summarized in Cheng et al. (1994) Nature 369: 684-685 and the references therein, in which PCR amplicons of up to 40kb are generated.
One of skill will appreciate that essentially any RNA can be converted into a double stranded DNA suitable for restriction digestion, PCR expansion and sequencing using reverse transcriptase and a polymerase. See, Ausubel, Sambrook and Berger, all supra.
Oligonucleotides, e.g., for use in in vitro amplification/gene reconstruction methods, for use as gene probes, or as shuffling targets (e.g., synthetic genes or gene segments) are typically synthesized chemically according to the solid phase phosphoramidite triester method, e.g., as described by Beaucage and Caruthers (1981), Tetrahedron Letts., 22(20):1859-1862, e.g., using an automated synthesizer, e.g., as described in Needham-VanDevanter et al. (1984) Nucleic Acids Res., 12:6159-6168 or as is now practiced routinely in the art. Oligonucleotides can also be custom made and ordered from a variety of commercial sources known to persons of skill. Purification of oligonucleotides (e.g., using gel-purification methods) to improve the quality of synthesized oligonucleotides can be particularly desirable in the processes herein to improve the quality of nucleic acid synthesis protocols.
As noted, essentially any nucleic acid can be custom ordered from any of a variety of commercial sources, such as The Midland Certified Reagent Company ([email protected]), The Great American Gene Company (http://www.genco.com), ExpressGen Inc. (www.expressgen.com), Operon Technologies Inc. (Alameda, CA) and many others. Similarly, peptides and antibodies can be custom ordered from any of a variety of sources, such as PeptidoGenic ([email protected]), HTI Bio-products, inc. (http://www.htibio.com), BMA Biomedicals Ltd (U.K.), Bio'Synthesis, Inc., and many others.
CODON AND AMINO ACID ALTERED LIBRARIES
In the methods of the invention, libraries of codon altered nucleic acids can be made and recombined.
The codon altered nucleic acids can also include differences in encoded amino acid sequences, which can be either conservative or non-conservative in nature. The codon altered nucleic acids can be derived from a single parental amino acid sequence, or can be derived from a family of original sequences, e.g., natural or synthetic homologous variants of a given sequence. Libraries can exist, e.g., in pools or aliquots of cells, viral plaques, enzymatically synthesized pools or aliquots of nucleic acids, or chemically synthesized pools of nucleic acids. Methods of making libraries of nucleic acids are available and taught, e.g., in Berger, Sambrook and Ausubel, supra. In one embodiment, a library as used in the invention comprises at least 2 nucleic acid sequences. In additional embodiments, the libraries of this invention comprise at least 2, 5, 10, 100, 1000, or more nucleic acid sequences.
As applied to the invention, libraries are typically constructed with a high percentage of codons altered relative to an initial (e.g., wild type) nucleic acid. Codon usage divergence for each of the codon altered nucleic acids can be 50%, 75%, or even 90% or more as compared to the first nucleic acid. This eliminates hybridization to the parental nucleic acid (and thereby inhibits recombination with the parental nucleic acid, a desirable feature in certain embodiments discussed below).
In several embodiments of this invention, codons are modified in members of a gene family so as to increase the degree of identity between the members. In one such embodiment, the genes are homologous genes from different species. In such cases, the degree of nucleic acid identity may be lower than the degree of amino acid identity, at least in part, because of differences in codon usage between the species. In additional embodiments, the homologous genes represent different members of a gene family within a single species.
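Codon usage divergence figures such as the 50%, 75% or 90% mentioned above can be computed for a pair of in-frame sequences as the fraction of codon positions that differ. The sketch below takes that reading of "codon usage divergence", which is an interpretation rather than a definition given in the text; the function name is illustrative.

```python
def codon_divergence(reference, variant):
    """Fraction of codon positions at which two in-frame sequences differ."""
    if not reference or len(reference) != len(variant) or len(reference) % 3:
        raise ValueError("sequences must be non-empty, in frame, and of equal length")
    starts = range(0, len(reference), 3)
    changed = sum(reference[i:i + 3] != variant[i:i + 3] for i in starts)
    return changed / len(starts)
```

For instance, replacing the single arginine codon in `ATGCGTTGG` with `AGA` changes one codon in three.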
Such genes may encode functionally distinct members of a gene family that nevertheless share significant structural or functional similarity. In preferred embodiments, homologous genes are reverse translated into nucleic acid sequences, and the nucleic acid sequences are modified so as to increase the level of identity between them. Nucleic acids with the modified sequences can then be synthesized in vitro. In particularly preferred embodiments, the modified nucleic acid sequences are at least as identical to each other as the original amino acid sequences.

Additional sequence diversity is provided by generating nucleic acids with non-overlapping non-conservative substitutions in each of the codon altered nucleic acids as compared to the first nucleic acid. This provides for reversion to wild-type upon recombination, while optionally allowing for the incorporation of non-conservative changes to the sequence in the event that they produce a detectable improvement during screening.

Modification of the codons of one or more of the codon altered nucleic acids to provide one or more different hydrophobic core residues for an encoded polypeptide as compared to the first polypeptide is also provided. This modification of core amino acids provides minor differences in encoded proteins, while changing the mutational spectrum of the resulting nucleic acid, thereby increasing sequence diversity.

In addition, due to the constraints of the translational machinery for a given cell, codon usage may need to be altered when expressed sequences are shuttled between different organisms (e.g., animal cells, plant cells, bacterial cells, etc.) for optimal expression. This produces a nucleic acid which encodes the same protein, but which, after typical forms of point mutation, will access a different mutational diversity than the original form of the protein.
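The codon usage divergence figures mentioned above (50%, 75%, 90% or more) can be illustrated with a short calculation. The following is a minimal sketch only: it assumes the standard genetic code, two in-frame coding sequences of equal length, and a definition of divergence as the fraction of codon positions at which the codons differ. The function names and the example sequences are hypothetical and are not taken from the specification.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumption: standard code applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def split_codons(cds):
    """Split an in-frame coding sequence into a list of codons."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a coding sequence with the standard genetic code."""
    return "".join(CODON_TABLE[c] for c in split_codons(cds))


def codon_divergence(parent, variant):
    """Fraction of codon positions at which the variant uses a different codon."""
    p, v = split_codons(parent), split_codons(variant)
    if len(p) != len(v):
        raise ValueError("sequences must encode the same number of codons")
    return sum(a != b for a, b in zip(p, v)) / len(p)


# Hypothetical example: both sequences encode Met-Leu-Ser.
parent = "ATGCTGAGC"
variant = "ATGTTATCT"
print(translate(parent) == translate(variant))
print(round(codon_divergence(parent, variant), 3))
```

Because methionine and tryptophan each have a single codon, a purely synonymous variant cannot reach 100% divergence under this definition unless amino acid changes are also introduced.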
In one embodiment, phage libraries are made and recombined in mutator strains such as cells with mutant or impaired gene products of mutS, mutT, mutH, mutL, uvrD, dcm, vsr, umuC, umuD, sbcB, recJ, etc. The impairment is achieved by genetic mutation, allelic replacement, selective inhibition by an added reagent such as a small compound or an expressed antisense RNA, or other techniques. High multiplicity of infection (MOI) libraries are used to infect the cells to increase recombination frequency. Additional strategies for making phage libraries and/or for recombining DNA from donor and recipient cells are set forth in U.S. Pat. No. 5,521,077. Additional recombination strategies for recombining plasmids in yeast are set forth in WO 97/07205.

The library to be made can be an in vitro set of molecules, or present in cells, phage or the like. Virtual libraries of nucleic acids generated in silico are also a feature of the invention (see also, Selifonov and Stemmer, supra). Generally, the library is screened to identify at least one recombinant nucleic acid that exhibits distinct or improved activity compared to the parental nucleic acid or nucleic acids which are recombined. Additional details on making appropriate libraries are found below, e.g., in the section entitled "Formats for Sequence Recombination."

TARGETS FOR CODON MODIFICATION AND SHUFFLING

Essentially any nucleic acid can be codon altered and shuffled. No attempt is made herein to identify the hundreds of thousands of known nucleic acids. Common sequence repositories for known proteins include GenBank, EMBL, DDBJ and the NCBI. Other repositories can easily be identified by searching the internet.
One class of preferred targets for activation includes nucleic acids encoding therapeutic proteins such as erythropoietin (EPO), insulin, peptide hormones such as human growth hormone; growth factors and cytokines such as epithelial Neutrophil Activating Peptide-78, GROα/MGSA, GROβ, GROγ, MIP-1α, MIP-1β, MCP-1, epidermal growth factor, fibroblast growth factor, hepatocyte growth factor, insulin-like growth factor, the interferons, the interleukins, keratinocyte growth factor, leukemia inhibitory factor, oncostatin M, PD-ECSF, PDGF, pleiotropin, SCF, c-kit ligand, VEGF, G-CSF etc. Many of these proteins are commercially available (See, e.g., the Sigma BioSciences 1997 catalogue and price list), and the corresponding genes are well-known.

Another class of preferred targets are transcriptional and expression activators. Example transcriptional and expression activators include genes and proteins that modulate cell growth, differentiation, regulation, or the like. Expression and transcriptional activators are found in prokaryotes, viruses, and eukaryotes, including fungi, plants, and animals, including mammals, providing a wide range of therapeutic targets. It will be appreciated that expression and transcriptional activators regulate transcription by many mechanisms, e.g., by binding to receptors, stimulating a signal transduction cascade, regulating expression of transcription factors, binding to promoters and enhancers, binding to proteins that bind to promoters and enhancers, unwinding DNA, splicing pre-mRNA, polyadenylating RNA, and degrading RNA.
Expression activators include cytokines, inflammatory molecules, growth factors, their receptors, and oncogene products, e.g., interleukins (e.g., IL-1, IL-2, IL-8, etc.), interferons, FGF, IGF-I, IGF-II, PDGF, TNF, TGF-α, TGF-β, EGF, KGF, SCF/c-Kit, CD40L/CD40, VLA-4/VCAM-1, ICAM-1/LFA-1, and hyalurin/CD44; signal transduction molecules and corresponding oncogene products, e.g., Mos, Ras, Raf, and Met; and transcriptional activators and suppressors, e.g., p53, Tat, Fos, Myc, Jun, Myb, Rel, and steroid hormone receptors such as those for estrogen, progesterone, testosterone, aldosterone, the LDL receptor ligand and corticosterone.

Similarly, proteins from infectious organisms for possible vaccine applications, described in more detail below, including infectious fungi, e.g., Aspergillus, Candida species; bacteria, particularly E. coli, which serves as a model for pathogenic bacteria, as well as medically important bacteria such as Staphylococci (e.g., aureus), Streptococci (e.g., pneumoniae), Clostridia (e.g., perfringens), Neisseria (e.g., gonorrhoea), Enterobacteriaceae (e.g., coli), Helicobacter (e.g., pylori), Vibrio (e.g., cholerae), Campylobacter (e.g., jejuni), Pseudomonas (e.g., aeruginosa), Haemophilus (e.g., influenzae), Bordetella (e.g., pertussis), Mycoplasma (e.g., pneumoniae), Ureaplasma (e.g., urealyticum), Legionella (e.g., pneumophila), Spirochetes (e.g., Treponema, Leptospira, and Borrelia), Mycobacteria (e.g., tuberculosis, smegmatis), Actinomyces (e.g., israelii), Nocardia (e.g., asteroides), Chlamydia (e.g., trachomatis), Rickettsia, Coxiella, Ehrlichia, Rochalimaea, Brucella, Yersinia, Francisella, and Pasteurella; protozoa such as sporozoa (e.g., Plasmodia), rhizopods (e.g., Entamoeba) and flagellates (Trypanosoma, Leishmania, Trichomonas, Giardia, etc.); viruses such as (+) RNA viruses (examples include Poxviruses e.g., vaccinia; Picornaviruses, e.g.
polio; Togaviruses, e.g., rubella; Flaviviruses, e.g., HCV; and Coronaviruses), (-) RNA viruses (examples include Rhabdoviruses, e.g., VSV; Paramyxoviruses, e.g., RSV; Orthomyxoviruses, e.g., influenza; Bunyaviruses; and Arenaviruses), dsDNA viruses (Reoviruses, for example), RNA to DNA viruses, i.e., Retroviruses, e.g., especially HIV and HTLV, and certain DNA to RNA viruses such as Hepatitis B virus.

Other proteins relevant to non-medical uses, such as inhibitors of transcription or toxins of crop pests e.g., insects, fungi, weed plants, and the like, are also preferred targets for shuffling.

Industrially important enzymes such as monooxygenases, proteases, nucleases, and lipases are also preferred targets. As an example, subtilisin can be evolved by shuffling codon altered forms of the gene for subtilisin (von der Osten et al., J. Biotechnol. 28:55-68 (1993) provide a subtilisin coding nucleic acid). Proteins which aid in folding such as the chaperonins are also preferred.

Preferred known genes suitable for codon alteration and shuffling also include the following: Alpha-1 antitrypsin, Angiostatin, Antihemolytic factor, Apolipoprotein, Apoprotein, Atrial natriuretic factor, Atrial natriuretic polypeptide, Atrial peptides, C-X-C chemokines (e.g., T39765, NAP-2, ENA-78, Gro-a, Gro-b, Gro-c, IP-10, GCP-2, NAP-4, SDF-1, PF4, MIG), Calcitonin, CC chemokines (e.g., Monocyte chemoattractant protein-1, Monocyte chemoattractant protein-2, Monocyte chemoattractant protein-3, Monocyte inflammatory protein-1 alpha, Monocyte inflammatory protein-1 beta, RANTES, I309, R83915, R91733, HCC1, T58847, D31065, T64262), CD40 ligand, Collagen, Colony stimulating factor (CSF), Complement factor 5a, Complement inhibitor, Complement receptor 1, Factor IX, Factor VII, Factor VIII, Factor X, Fibrinogen, Fibronectin, Glucocerebrosidase, Gonadotropin, Hedgehog proteins (e.g., Sonic, Indian, Desert), Hemoglobin (for blood substitute; for radiosensitization), Hirudin, Human
serum albumin, Lactoferrin, Luciferase, Neurturin, Neutrophil inhibitory factor (NIF), Osteogenic protein, Parathyroid hormone, Protein A, Protein G, Relaxin, Renin, Salmon calcitonin, Salmon growth hormone, Soluble complement receptor I, Soluble I-CAM 1, Soluble interleukin receptors (IL-1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15), Soluble TNF receptor, Somatomedin, Somatostatin, Somatotropin, Streptokinase, Superantigens, i.e., Staphylococcal enterotoxins (SEA, SEB, SEC1, SEC2, SEC3, SED, SEE), Toxic shock syndrome toxin (TSST-1), Exfoliating toxins A and B, Pyrogenic exotoxins A, B, and C, and M. arthritides mitogen, Superoxide dismutase, Thymosin alpha 1, Tissue plasminogen activator, Tumor necrosis factor beta (TNF beta), Tumor necrosis factor receptor (TNFR), Tumor necrosis factor-alpha (TNF alpha) and Urokinase. Many other known coding nucleic acids, such as those in GenBank™, can be codon-altered and shuffled.

GENES WITH CODON USAGE REDESIGNED AND CHEMICALLY SYNTHESIZED AS STARTING MATERIALS FOR GENE FAMILY SHUFFLING--EXPANDING THE DIVERSITY OF DNA SHUFFLING.

Because the genetic coding preference among organisms ranges from quite similar to very different, homologous genes from different organisms can have significantly lower homology at the nucleic acid level than at the amino acid level. For example, genetic information for some bacterial species is high in GC content (up to 70%), while others have AT rich (>60%) codon usage. Thus, genes from different organisms may have, for example, 40-60% amino acid identity but only 25-35% nucleic acid identity. It is often desirable to increase such levels of nucleic acid identity so as to enhance the ability of the homologous sequences to recombine, thereby increasing the efficiency of family shuffling using the methods of this invention.
In other aspects, it is actually preferable to decrease the rate of recombination in a system, e.g., when using vectors it is sometimes desirable to decrease the rate of recombination between the vector and the host DNA, thereby increasing the safety of the vector.

The following examples address specific issues with regard to shuffling codon altered nucleic acids.

Altering codon usage to increase homology

In one aspect, protein sequences of gene family members are reverse translated back into DNA sequences, for example by using one of the preferable codon usage charts in any conventional DNA manipulation program (e.g. the Wisconsin Package™ SeqWeb, OMIGA, SeqApp, SeqPup, MacVector, DNA Strider, GeneWorks, etc.). The choice of codon usage is often determined by the host in which the genes will be expressed. After maximizing the percentage of DNA sequence identity, the genes are chemically synthesized, e.g., using a high throughput oligonucleotide synthesizer in, e.g. a 96-well format, optionally in conjunction with polymerase and/or ligase gene synthesis methods.

In general, the DNA sequence similarity after such treatment will be at least as high as the amino acid similarity, but can be at least about 10% to 15% higher than the amino acid identity (in contrast to the situation for naturally occurring genes, which are ordinarily less well conserved than encoded polypeptides), based on the random frequency of sequence identity for any given codon. In most cases, the minimal requirement for amino acid identity can be as low as about 35% while still retaining adequate nucleic acid homology for standard recombination methods (as discussed, supra, oligonucleotide-mediated recombination methods do not require high levels of similarity to achieve recombination). In some cases, however, the minimal amino acid identity can be even lower, e.g. if the conserved regions are clustered within the genes.
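The reverse translation step described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of any of the named programs: the single-codon-per-residue table `PREFERRED` is a hypothetical choice made for the example (it is not an actual codon usage chart for any host), the two peptides are invented, and the sequences are assumed to be pre-aligned without gaps.

```python
# Hypothetical "one preferred codon per amino acid" table (illustrative only).
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def reverse_translate(protein):
    """Encode every residue with the same preferred codon for all family members."""
    return "".join(PREFERRED[aa] for aa in protein.upper())


def identity(a, b):
    """Fraction of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Two invented family members that differ at one residue (L versus V).
member_1 = "MKLSR"
member_2 = "MKVSR"

dna_1 = reverse_translate(member_1)
dna_2 = reverse_translate(member_2)

print(identity(member_1, member_2))        # amino acid identity
print(round(identity(dna_1, dna_2), 3))    # nucleic acid identity after recoding
```

With a shared codon choice, identical residues always yield identical codons, and differing residues can still share one or two nucleotides, which is why the nucleic acid identity in this toy case is not lower than the amino acid identity.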
Example: Shuffling codon-modified EPO

The protein erythropoietin alpha, also known as EPO, Epogen, and Procrit, is a hematopoietic hormone, providing a variety of benefits to patients suffering from anemia (a common symptom of, e.g., AIDS). EPO is produced as a pharmaceutical, with sales of nearly 1 billion dollars world-wide. Accordingly, proteins with EPO-like activity (and preferably superior activity) are of substantial commercial interest.

Figure 1 shows the sequence of a part of the monkey EPO gene, which is similar to the human EPO gene. Figure 2 shows an example of a codon altered EPO nucleic acid (or "wobble" EPO gene). In general, transversions rather than transition mutations are made where possible. The purpose of this strategy is to maximally disrupt hybridization of the resulting gene with naturally occurring EPOs. Figure 3 shows an alignment of naturally occurring EPOs.

This strategy is further fine-tuned by applying standard rules of base pairing (e.g., elimination of G-C pairing and GC stacking) to maximize sequence disruption; in addition, conservative or non-conservative amino acid modifications can also be made (in some cases, where multiple codon-altered nucleic acids are shuffled, it is desirable to make codon altered nucleic acids with non-overlapping non-conservative substitutions to permit reversion to the wild-type amino acid during shuffling). The size of the sequence space for nucleic acids encoding EPO is large, at about 2.8 x 10^88 different sequences (there are about 10^80 particles in the universe; thus, it is physically impossible to make all of the possible sequences encoding EPO). As indicated schematically in Figure 4, if one only considers the maximally divergent wobble genes (those that use the alternative types of codons for leucine, arginine, and serine), there is still a sequence space of 10^3 sequences encoding EPO.
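The size of a synonymous sequence space, such as the figure quoted above for EPO, follows from multiplying the number of codons available at each residue. The sketch below shows that arithmetic for an arbitrary peptide; it assumes the standard genetic code, and the function name and the four-residue example are hypothetical. It does not reproduce the EPO figure, since the EPO sequence is given only in the Figures.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, built in TCAG order (assumption: standard code applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Number of synonymous codons per amino acid, e.g. 6 for Leu, 1 for Met.
DEGENERACY = Counter(CODON_TABLE.values())


def synonymous_sequence_count(protein):
    """Number of distinct nucleic acid sequences encoding the given protein."""
    return prod(DEGENERACY[aa] for aa in protein.upper())


# Invented four-residue example: Met (1) x Leu (6) x Ser (6) x Arg (6).
print(synonymous_sequence_count("MLSR"))
```

Because the count is a product over residues, it grows exponentially with protein length, which is why exhaustive synthesis of all encodings is impractical even for modestly sized proteins.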
The overall strategy is to synthesize a library of wobble genes, screen for expression and activity, and DNA shuffle desirable genes as desired (e.g., by recursive processes).

It is of interest to further evolve codon-altered nucleic acids. Shuffling with other homologous genes from nature, designed genes (incorporating libraries of designed sequence variation), and genes containing mutations of interest are strategies for evolving any gene of interest. However, the codon altered nucleic acid may not be easily shuffled with these genes because of the sequence differences; or they may be undesirable for other reasons (e.g., the naturally occurring sequences may be proprietary, or include proprietary elements). These difficulties can be avoided by synthesizing codon-altered homologous nucleic acids which encode desired amino acid variations (e.g., those found in homologous genes), but which have a codon-set close to the nucleic acid(s) to be recombined (thereby permitting, e.g., hybridization during recombination).

For example, after identifying homologues of interest (e.g., those shown in Figure 3 for EPO), codon-altered nucleic acids encoding the same proteins are synthesized with a similar codon selection. Standard family shuffling is then practiced with the codon altered nucleic acids. This is shown schematically for EPO in Figure 5.

EPO wobble variants are screened for expression and then receptor binding assays are conducted in an ELISA format, using human EPOr-Fc fusions. Following selection of binding variants, activity is measured as thymidine incorporation in UT7-EPO (a human bone marrow cell line) cell proliferation assays. Cells are treated for 2-3 days with various concentrations of EPO variants, after which time they are incubated in the presence of 3H-thymidine for 4 hours and incorporation of thymidine is measured. See also, Erickson-Miller et al. (1997) Blood 90:2421 (for the receptor binding assay), and Wen et al. (1994) J. Biol. Chem.
269:22839-22846 (for the thymidine incorporation assay).

Assays for selecting EPO can also be based, e.g., on the ability of EPO proteins to stimulate the growth of blood cells, e.g., in vitro or in vivo.

Example: Codon Shuffling G-CSF

Family shuffling can be used to breed diversity from genes into the libraries to be screened. Additionally, design heuristics such as randomization of hydrophobic core residues can be used to take advantage of the redundancy between primary structure and tertiary structure of proteins (i.e. many different primary structures encode proteins with very similar three dimensional structures).

Design heuristics are employed to create a sequence space of mutants that are predicted to be highly biased (relative to random mutagenesis) to encode proteins which preserve the original activity. Methods such as high throughput (HTP) screening and phage panning are used to identify members of the designed libraries that have the desired activity. DNA shuffling is used to breed this population of active clones in order to fine tune the mutants, thus allowing one to evolve variants with equivalent or superior function relative to the naturally occurring proteins.

Figures 6 and 7 show several mammalian homologues of G-CSF. Figure 8 shows the hydrophobic core residues of human G-CSF (blacked out). Figure 9 shows a strategy for evolving variants of human G-CSF that are highly divergent in sequence. First, three genes are synthesized (Genes 1, 2 and 3, Figure 8) which contain all of the mammalian homologue diversity of G-CSF. These genes are shuffled, phage panned against the G-CSF receptor, and HTP screened for biological function (receptor activation). Active clones are iteratively shuffled and screened if necessary to give evolved variants that rival or surpass the human gene in activity (on human cells).

Next, one evolves a variant that has a highly mutated hydrophobic core.
This is schematically illustrated in Figure 8, and the specific strategy for performing the biological screening is schematically illustrated in Figure 9. It is expected that the best mutants obtained after screening hydrophobic core randomized libraries may be less active than wild type human G-CSF because it is difficult to initially optimize activity in such a procedure. Family shuffling is used to obtain optimized variants. This is done by synthesizing genes which contain mammalian homologue diversity at all but the hydrophobic core positions; but they are synthesized in the context of an evolved, non-wild type hydrophobic core. Family shuffling is used to optimize around the new hydrophobic core.

This strategy works because there are functionally similar hydrophobic cores for wild type proteins that consist of largely different amino acids than the wild type protein. This understanding is supported by recent experiments in model systems. For example, 53% of randomized sequences for three residues in the hydrophobic core of lambda repressor are folded and biologically active (Lim and Sauer (1991) J. Mol. Biol. 219:359-376). In protein design by patterning of polar and non-polar amino acids, 24 residues in the hydrophobic core of a 4-helix bundle protein were randomized by Kamtekar et al. (1993) Science 262:132 1. Folded, alpha helical proteins were recovered from about 1% of the clones. Desjarlais and Handel (1995) Current Opinion in Biotechnology 6:460-466 showed a mutant of Rop, another 4-helix bundle protein, where four hydrophobic core residues have been randomized and active mutants have been obtained. Axe et al. (1996) PNAS 95:5590-5594 showed that randomizing 13 hydrophobic core residues in the enzyme barnase resulted in 23% of the clones in the library retaining biological activity. Gassener et al.
(1996) PNAS 93:12155-12158 describe a mutant of T4 lysozyme where 10 residues in the hydrophobic core are replaced with Met. This is taken as evidence that the hydrophobic core of this protein is very tolerant to substitution. Taken together, this experimental evidence on model systems shows that the hydrophobic cores of many proteins can be replaced with other hydrophobic residues that pack in a similar fashion to give an active protein. This degeneracy is exploited to evolve novel forms of natural and codon-altered genes.

A related approach is to search the protein databases for a protein that has a similar activity to a protein that one wishes to evolve. Denesyuk et al. (1996) J. Theor. Biol shows the results of such a search for G-CSF. LIF is a very similarly folded protein. One can use LIF as a 'scaffold' on which to place residues of G-CSF that are required for activity. Given LIF with a G-CSF "toupee," one would family shuffle the LIF scaffold so as to obtain a variant in which the toupee is displayed in a fully biologically active form.

Another approach is to use computational methods to create families of variants that are predicted to be functional. Dahiyat and Mayo (Science) recently described computer methods that are used to design proteins. Proteins are simulated on the computer, often with the aid of genetic algorithms, and a subset that are deemed 'fit' are actually synthesized and 'analyzed'. These computational methods are becoming increasingly powerful. They would be useful to, for example, predict a family of mutations on the surface of a protein that would not destroy function. DNA shuffling can be used to optimize active clones obtained by design. Taking the example of G-CSF, one could use computational methods in combination with all structure function data (for example alanine scan data for G-CSF reported recently by Reidhaar-Olsen in Biochemistry) to design a family of putatively functional variants.
One could, for example, design the family to have minimal DNA identity to the wild type gene given the design constraints. This library is synthesized, put through biological screens and/or selections (i.e. panning against the G-CSF receptor), and active variants are obtained. DNA shuffling is then used to evolve these active variants to have the desired level of function.

G-CSF proteins are displayed on phage and screened for binding to human G-CSF receptor in an ELISA format. Variants that bind receptor are selected in a high throughput screen for receptor activation. This cell based assay measures receptor activation via a reporter gene (such as luciferase) activated by a G-CSF responsive construct containing STAT binding elements. Cells (such as HepG2) are transformed with a G-CSF responsive reporter plasmid and treated with the codon shuffled G-CSF variant for 2.5 hours. Cells are then lysed and luciferase activity measured. See also, Tian et al (1998) Science 281:257-259.

Example: Codon Shuffling Alkaline Phosphatase

Alkaline phosphatase is a widely used reporter enzyme for ELISA assays, protein fusion assays, and in a secreted form as a reporter gene for mammalian cells. A more active form of the enzyme is desirable.

A codon altered form of alkaline phosphatase was generated by PCR assembly using the oligos set forth in Figure 10. A map of the oligos is set forth in Figure 11. The procedure used was essentially identical to that taught in Stemmer et al. (1994) Gene 164:49-57. In brief, the oligos were mixed 1:1 at a variety of dilutions and PCR assembled by performing e.g., 25-60 cycles of PCR at e.g., 94 °C (60 sec.), 94 °C (30 sec.), 50 °C (30 sec.), 72 °C (30 sec). Assembly of the BIAP gene was conducted in a circular format and gene fragments were purified. ~100,000 colonies were screened on LB/amp plates (~1/10 are wt plasmid). About 1/10 showed a bluer color than background. Plasmid DNA showed a correct insertion.
In general, petri-dish screening using the typical colorimetric assay for phosphatase activity can be used for screening. This has the advantage of being simple, high throughput, and semi quantitative. Microtiter plate screening, also preferred, is colorimetric and quantitative, although additional instrumentation can be required for implementation.

Example: Codon Shuffling to Reduce Competent Virus Production from Vectors and to Generate Attenuated Viruses as Immunogenic Compositions and Vaccines

Cells can be stably transduced with a number of viral vectors including those derived from retroviruses, pox viruses, adenoviruses (Ads), herpes viruses and parvoviruses. Common viral vectors include those derived from murine leukemia viruses (MuLV), gibbon ape leukemia viruses (GaLV), human immunodeficiency viruses (HIV), adenoviruses, adeno-associated viruses (AAVs), Epstein Barr viruses, canarypox viruses, cowpox viruses, and vaccinia viruses. Viral vectors based upon retroviruses, adeno-associated viruses, herpes viruses and adenoviruses are all used as gene therapy vectors for the introduction of therapeutic nucleic acids into the cells of an organism by ex vivo and in vivo methods.

When using viral vectors, packaging cells are commonly used to prepare virions used to transduce target cells. In these vectors, trans-active genes are rendered inactive and "rescued" by trans-complementation to provide a packaged vector. This form of trans-complementation is provided by co-infection of a packaging cell with a virus or vector which supplies functions missing from a particular gene therapy vector in trans, or by using a cell line (e.g., 293 cells) which has viral components integrated into the genome of the packaging cell.
For instance, cells transduced with HIV or murine retroviral proviral sequences which lack the nucleic acid packaging site produce retroviral trans active components, but do not specifically incorporate the retroviral nucleic acids into the capsids produced, and therefore produce little or no live virus.

If these transduced "packaging" cells are subsequently transduced with a vector nucleic acid which lacks coding sequences for retroviral trans active functions, but includes a packaging signal, the vector nucleic acid is packaged into an infective virion. A number of packaging cell lines useful for MoMLV-based vectors are known in the art, such as PA317 (ATCC CRL 9078), which expresses MoMLV core and envelope proteins; see, Miller et al. J. Virol. 65:2220-2224 (1991). Carrol et al. (1994) Journal of Virology 68(9):6047-6051 describe the construction of packaging cell lines for HIV viruses. Reciprocal complementation of defective HIV molecular clones is described, e.g., in Lori et al. (1992) Journal of Virology 66(9) 5553-5560.

Functions of viral replication not supplied by trans-complementation which are necessary for replication of the vector are present in the vector. In HIV, this typically includes, e.g., the TAR sequence, the sequences necessary for HIV packaging, the RRE sequence if the instability elements of the p17 gene of gag are included, and sequences encoding the polypurine tract. HIV sequences that contain these functions include a portion of the 5' long terminal repeat (LTR) and sequences downstream of the 5' LTR responsible for efficient packaging, i.e., through the major splice donor site ("MSD"), and the polypurine tract upstream of the 3' LTR through the U3R section of the 3' LTR. The packaging site (psi site or Ψ site) is partially located adjacent to the 5' LTR, primarily between the MSD site and the gag initiator codon (AUG) in the leader sequence. See, Garzino-Demo et al. (1995) Hum. Gene Ther.
6(2):177-184. For a general description of the structural elements of the HIV genome, see, Holmes et al. PCT/EP92/02787.

Another common vector is based upon adenovirus. Typically, vectors which include the adenovirus ITRs (Gingeras et al. (1982) J. Biol. Chem. 257:13475-13491) are packaged in, e.g., 293 cells, which provide many of the components necessary for vector packaging.

Adeno-associated viruses (AAVs) utilize helper viruses such as adenovirus or herpes virus to achieve productive infection. In the absence of helper virus functions, AAV integrates (site-specifically) into a host cell's genome, but the integrated AAV genome has no pathogenic effect. The integration step allows the AAV genome to remain genetically intact until the host is exposed to the appropriate environmental conditions (e.g., a lytic helper virus), whereupon it re-enters the lytic life-cycle. Samulski (1993) Current Opinion in Genetic and Development 3:74-80 and the references cited therein provide an overview of the AAV life cycle. For a general review of AAVs and of the adenovirus or herpes helper functions see, Berns and Bohensky (1987) Advanced in Virus Research, Academic Press., 32:243-306. The genome of AAV is described in Laughlin et al. (1983) Gene, 23:65-73. Expression of AAV is described in Beaton et al. (1989) J. Virol., 63:4450-4454. In general, the packaging sites for all parvoviruses, including B19 and AAV, are located in the viral ITRs. Recombinant AAV vectors (rAAV vectors) deliver foreign nucleic acids to a wide range of mammalian cells (Hermonat & Muzycka (1984) Proc Natl Acad Sci USA 81:6466-6470; Tratschin et al. (1985) Mol Cell Biol 5:3251-3260), integrate into the host chromosome (McLaughlin et al. (1988) J Virol 62:1963-1973), and show stable expression of the transgene in cell and animal models (Flotte et al. (1993) Proc Natl Acad Sci USA 90:10613-10617).
rAAV vectors are able to infect non-dividing cells (Podsakoff et al. (1994) J Virol 68:5656-66; Flotte et al. (1994) Am. J. Respir. Cell Mol. Biol. 11:517-521). Further advantages of rAAV vectors include the lack of an intrinsic strong promoter, thus avoiding possible activation of downstream cellular sequences, and the vector's naked icosahedral capsid structure, which renders the vectors stable and easy to concentrate by common laboratory techniques.

One problem with previously existing vector packaging strategies is that vectors to be packaged can recombine with nucleic acids providing packaging functions in trans, producing a replication-competent virus. This can be a problem both when vectors are produced for therapeutic applications (e.g., in gene therapy) and during production of encoded components in vitro.

The present invention provides a way of reducing or eliminating recombination between nucleic acids encoding trans-active components and vector nucleic acids encoding packaging sites. In particular, nucleic acid subsequences of a vector which are adjacent to modified or deleted elements provided in trans are codon modified to eliminate hybridization to wild-type sequences. Because these sequences do not hybridize, they cannot recombine with nucleic acids producing trans-active components. One additional advantage of this approach is that the vectors also cannot recombine with live viruses, e.g., in a human body which is infected with a virus that packages vector elements. As noted above, two types of gene therapy vectors are those based upon retroviruses (which can be packaged by, e.g., HIV-1) and adenoviruses (which can be packaged by adenovirus).

Alternatively, the nucleic acids encoding trans-active components can be codon modified so that they do not hybridize to wild-type sequences. This also prevents recombination with vectors having wild-type sequences, preventing formation of replication competent viruses.
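Codon modification to reduce hybridization to a wild-type sequence can be sketched as a per-codon choice of the most dissimilar synonymous codon. The following is an illustrative sketch only. It assumes the standard genetic code and a hypothetical scoring rule in which a transversion counts 2 and a transition counts 1, reflecting the stated preference for transversions in the EPO example; the weights, function names and example sequence are assumptions, and base-pairing or stacking refinements are not modeled.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumption: standard code applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}


def mismatch_score(a, b):
    """Score two codons: 0 per match, 1 per transition, 2 per transversion."""
    score = 0
    for x, y in zip(a, b):
        if x == y:
            continue
        same_class = (x in PURINES) == (y in PURINES)
        score += 1 if same_class else 2  # hypothetical weights
    return score


def most_divergent_synonym(codon):
    """Pick the synonymous codon with the highest mismatch score to `codon`."""
    aa = CODON_TABLE[codon]
    synonyms = sorted(c for c, a in CODON_TABLE.items() if a == aa)
    return max(synonyms, key=lambda c: mismatch_score(codon, c))


def translate(cds):
    """Translate an in-frame coding sequence with the standard genetic code."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def wobble(cds):
    """Recode every codon to its most divergent synonym, keeping the protein."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(most_divergent_synonym(cds[i:i + 3])
                   for i in range(0, len(cds), 3))


# Invented example encoding Met-Leu-Ser-Trp.
wild_type = "ATGCTGAGCTGG"
recoded = wobble(wild_type)
print(recoded, translate(recoded) == translate(wild_type))
```

Codons for methionine and tryptophan are returned unchanged because they have no synonyms, so the achievable mismatch density depends on the amino acid composition of the region being modified.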
After codon modification, vectors or trans-active nucleic acids can be shuffled as described supra, and screened for the ability to package nucleic acids, or to be packaged, as appropriate.

It will be appreciated that codon modification of viral sequences has an additional use as well. Codon alteration of viral sequences can result in attenuation of the virus, e.g., due to modification of regulatory sequences, alterations in mRNA secondary structure, inefficient translation due to rare codon use, and the like. Such "codon attenuated" viruses have a significant advantage over existing attenuated viruses (which are typically generated by serial passage in cells other than the normal host type for the virus). In particular, codon attenuated viruses can encode a wild-type set of proteins, making them ideal as immunogenic compositions to generate antibodies, or to use as vaccines. Viral proteins can also be used in various diagnostic assays. For example, the standard diagnostic test for HIV infection in current use tests for the presence of anti-HIV antibodies in blood by probing with viral proteins.

Example: Codon Usage Libraries to evolve Functional Variants with reduced Recombination With Natural Gene Sequences--Adenovirus

Adenovirus is a common vector used, e.g., for gene therapy. The virus is typically modified to make it replication deficient. This can be achieved, e.g., by deleting the E1 and E4 genes. The functions of E1 and E4 can be supplied by trans complementation when E1 and E4 deleted vectors are grown in the ubiquitous human embryonic kidney cell line 293, which has uncharacterized adenovirus fragments incorporated into its genome that supply the missing functions in trans. The replication defective adenoviral vectors recombine at a low, but clinically significant frequency, resulting in replication competent adenovirus contamination of vector preparations.
Because adenovirus has detrimental effects on health, this is a significant problem for application of adenovirus-based gene therapy vectors.

In the present invention, a codon usage library encompassing several hundred bases to several kilobases of sequence flanking the adenovirus E1 and E4 genes is made. The library is designed to enforce a high degree of divergence from the natural adenoviral consensus sequence, while at the same time incorporating a large degree of degeneracy in the codons to allow for a large space of sequence diversity to be searched. The design principle is to obtain mutants that encode the same or similar protein sequence, but with many mismatches to the wild-type E1 and E4 sequences found in the 293 genome. These mismatches strongly reduce the frequency of unwanted recombination with the trans complementary genes. Consequently, engineered adenoviral vectors, or adenovirus helper vectors which package adenoviral sequences which include packaging sequences (adenoviral or adeno-associated viral ITRs) in trans, have reduced levels of recombination. This provides for a lower rate of replication-competent adenovirus production, making culture and production of such vectors safer.

Evolution Impaired Viruses Created by Massive Codon Usage Alteration As a General Approach to Vaccines--HIV

HIV-1 and HIV-2 are genetically related, antigenically cross reactive, and share a common cellular receptor (CD4). See, Rosenburg and Fauci (1993) in Fundamental Immunology, Third Edition Paul (ed) Raven Press, Ltd., New York (Rosenburg and Fauci 1) and the references therein for an overview of HIV infection. HIV-1 infection is epidemic world wide, causing a variety of immune system-failure related phenomena commonly termed acquired immune deficiency syndrome (AIDS). HIV type 2 (HIV-2) has been isolated from both healthy individuals and patients with AIDS-like illnesses (Andreasson, et al. (1993) Aids 7, 989-93; Clavel, et al.
(1986) Nature, 324, 691-695; Gao, et al. (1992) Nature 358, 495-9; Harrison, et al. (1991) Journal of Acquired Immune Deficiency Syndromes 4, 1155-60; Kanki, et al. (1992) American Journal of Epidemiology 136, 895-907; Kanki, et al. (1991) Aids Clinical Review 1991, 17-38; Romieu, et al. (1990) Journal of Acquired Immune Deficiency Syndromes 3, 220-30; Naucler, et al. (1993) International Journal of STD and Aids 4, 217-21; Naucler, et al. (1991) Aids 5, 301-4). Although HIV-2 AIDS cases have been identified principally from West Africa, sporadic HIV-2 related AIDS cases have also been reported in the United States (O'Brien, et al. (1991) Aids 5, 85-8) and elsewhere. HIV-2 will likely become endemic in other regions over time, following routes of transmission similar to HIV-1 (Harrison, et al. (1991) Journal of Acquired Immune Deficiency Syndromes 4, 1155-60; Kanki, et al. (1992) American Journal of Epidemiology 136, 895-907; Romieu, et al. (1990) Journal of Acquired Immune Deficiency Syndromes 3, 220-30).

Epidemiological studies suggest that HIV-2 produces human disease with lesser penetrance than HIV-1, and exhibits a considerably longer period of clinical latency (at least 25 years, and possibly longer, as opposed to less than a decade for HIV-1; see, Kanki, et al. (1991) Aids Clinical Review 1991, 17-38; Romieu, et al. (1990) Journal of Acquired Immune Deficiency Syndromes 3, 220-30, and Travers et al. (1995) Science 268:1612-1615).

The ability of HIV virus populations to rapidly point mutate to avoid the immune response poses a special challenge for vaccine design. While the immune system has responded to viruses in a gradual and co-evolutionary manner, the present invention provides a general approach that provides for massively faster evolution to produce new vaccines to stimulate more effective immune responses.
For example, during the incubation period for HIV infection, which lasts for several years, low titers of HIV can result from high HIV replication rates in conjunction with efficient viral clearance by the immune system. In response to these selective forces, virus mutations are selected which reduce recognition and neutralization by the immune system's B and T-cell responses. See, Lukashov et al. (1995) J. Virol. 69:6911-6916. During the long incubation time, these mutations accumulate and eventually overwhelm the immune system's defenses. See, Ho et al. (1995) Nature.

Live attenuated vaccines, typically produced by prolonged growth of human viruses in animal cells, have proven useful as vaccines for several diseases, including mumps, rubella and measles. Attenuation involves the slow accumulation of many mutations throughout the viral genome during the course of adapting to growth in the animal cells. When used to vaccinate humans, the attenuated virus grows only weakly and elicits a complex immune response which the virus is unable to avoid.

The mutations in the attenuated virus could, in principle, revert in the same stepwise fashion that it underwent to grow in culture. The risk of reversion is highest in viruses with a high mutation rate such as HIV-1, which makes this strategy dangerous under current techniques for vaccine development. It is worth noting, however, that protective effects against HIV-1 are observed following infection with the related HIV-2 virus, which is much less pathogenic than HIV-1. Thus, protective effects against HIV can be achieved with live vaccines.

To reduce the risk of reversion, a large number of mutations need to accumulate in the virus. However, if too many mutations are present, the immune system in effect recognizes the attenuated virus, but not the virus against which a protective effect is sought.
As provided herein, immunogenic compositions such as vaccines are created which contain a large number of silent substitutions. In contrast to existing attenuated viruses, such viruses have native protein sequences and elicit essentially the same immune responses as the corresponding wild-type virus (typically one or a few additional disabling mutations can also be incorporated).

Codon alteration results in two effects that both increase the potential of the vaccine. First, like standard attenuated viruses, the growth of codon-altered viruses is attenuated, due to the effect of the codon alterations on translation, regulatory sequences, mRNA folding, packaging, and the like. For example, regulation of HIV-1 envelope expression has been observed as a result of codon usage. See, Haas et al. (1989) Current Biology 6(3):315-324.

Second, codon alteration results in impairment of virus evolution. As discussed above, modification of the codons alters the mutational escape spectrum of the virus, upsetting the evolutionary selection for specific codons.

The six-codon amino acids are the best targets for codon alteration. Serine, arginine and leucine each have one group of four codons, plus two codons in an unrelated group. See, Figure 12. Switching all of the serine codons from AGY to TCX, and vice versa, yields proteins with unaltered amino acid sequences. See also, Figure 13. However, these codon groups differ significantly in the spectrum of the amino acids that they yield upon point mutation. Of all possible point mutations of one codon for serine (TCA), 78% result in a different amino acid compared to point mutations obtained for the AGT codon for serine. See, Figure 13. A virus with hundreds of codon alterations is, statistically, in a very different mutational space, able to access a totally different mutation spectrum, or "cloud," compared to the wild-type virus.
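The serine codon switch and its effect on the point-mutation spectrum can be illustrated with a minimal Python sketch. The swap rule below (in particular, which AGY codon replaces TCA and TCG) and all function names are assumptions made for illustration, not taken from the specification. The sketch lists which amino acids each codon can reach by a single base change; it does not attempt to reproduce the 78% figure, whose counting convention is defined by Figure 13.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Hypothetical swap rule between the two serine codon groups (AGY <-> TCX).
SERINE_SWAP = {"AGT": "TCT", "AGC": "TCC",
               "TCT": "AGT", "TCC": "AGC", "TCA": "AGT", "TCG": "AGC"}


def translate(dna):
    """Translate an in-frame DNA string; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def swap_serine_codons(dna):
    """Replace every serine codon with one from the other serine codon group."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "".join(SERINE_SWAP.get(codon, codon) for codon in codons)


def point_mutation_spectrum(codon):
    """Amino acids (or stop) reachable by one base change, excluding synonymous ones."""
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                reachable.add(CODON_TABLE[codon[:pos] + base + codon[pos + 1:]])
    reachable.discard(CODON_TABLE[codon])
    return reachable


gene = "ATGTCAAGTCTGAGCTAA"          # toy sequence, not a viral gene
altered = swap_serine_codons(gene)
print(translate(gene), translate(altered))
print(sorted(point_mutation_spectrum("TCA")), sorted(point_mutation_spectrum("AGT")))
```

In this toy case the protein is unchanged while the nucleotide sequence differs at every serine codon, and the two serine codons share only one non-synonymous single-step outcome (threonine).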
The overall strategy for producing an evolution-defective virus is additionally set forth in Figures 14, 15 and 17. Figure 16, panels A-C show results of single mutations of different codons for ser, arg, and leu.

Point mutation is critical for viruses such as HIV-1 to stay ahead of the host immune system. The amino acid mutations that are required for virus escape are likely not random. Wild type codon usage has evolved to allow optimal immune system evasion. The wild type codon usage is likely to favor mutations that represent alterations that avoid the host immune system, without detrimentally affecting the protein(s) encoded. While complex, this natural pattern of amino acid sequence change of the natural virus in response to the host system is non-random and weakly predictable. See also, Seillier-Moiseiwitsch et al. (1994) Annu. Rev. Genet. 28:559-596.

Changing all of the codons for ser, arg and leu in the 875 aa envelope polyprotein of HIV-1 (e.g., strain MN) would affect 187 codons (22%) resulting in 561 mutations. See, e.g., Figure 18, panels A-D. If all of the HIV proteins were altered, the number of mutations would be more than three-fold higher. The construction of such codon modified viruses is simplified by recent advances in the synthesis of long DNA sequences, which enable the assembly of a plasmid of average size from 40 mer oligos in a single step with about 75% efficiency. See, Stemmer (1994) Nature 370:389-391. See also, Figure 19 for a list of oligos in one application for synthesis of HIV Env. While the synthesis of the envelope gene is sufficient, synthesis of the whole HIV genome from oligos can be performed by this method.

In practice, a preferred balance of attenuation and evolution impairment is obtained by DNA shuffling (e.g., Stemmer et al.
(1995) Gene 164:49-53), e.g., of the wild type and codon altered sequences, followed by selection of the resulting library of viruses that retain moderate growth despite many codon alterations.

While attenuation that can be obtained by this approach may be sufficient for obtaining a vaccine for most viruses, for HIV-1, the evolution impairment is more important, due to the high mutation rate of the virus. Live vaccines are used only if they elicit an immune response which is complex and strong enough to prevent infection of the wild-type virus. Live virus vaccines are typically more protective than single protein vaccines because it is harder to out-mutate T and B-cell responses to a larger number of epitopes. The weak growth of the live virus vaccine results in a larger antigenic dose and point mutation increases the complexity of the immune response. To evaluate vaccine competence, vaccine potential is evaluated in macaques (M. nemestrina) or chimpanzees using SIV variants that are known to cause AIDS. Sequence for an example SIV, SIVsmm, is found at GenBank Accession No. x14307. This virus is closely related to HIV-2. See, Hirsch (1989) Nature 339:389-392. In general, many complete sequences for HIVs, SIVs and many other viruses are found in well known sequence repositories, including GenBank, EMBL, DDBJ and the NCBI. Well characterized HIV clones include: HIV-1NL43, HIV-1SF2, HIV-1BRU and HIV-1MN. For an introduction to the genetic variability of HIV, see, Seillier-Moiseiwitsch et al. (1994) Annu. Rev. Genet. 28:559-96 and the references cited therein.

Several HIV-2 isolates, including three molecular clones of HIV-2 (HIV-2ROD, HIV-2SBL-ISY, and HIV-2uci), have also been reported to infect macaques (M. mulatta and M. nemestrina) or baboons (Franchini, et al. (1989) Proc. Natl. Acad. Sci. U.S.A. 86, 2433-2437; Barnett, et al. (1993) Journal of Virology 67, 1006-14; Boeri, et al.
(1992) Journal of Virology 66, 4546-50; Castro, et al. (1991) Virology 184, 219-26; Franchini, et al. (1990) Journal of Virology 64, 4462-7; Putkonen, et al. (1990) Aids 4, 783-9; Putkonen, et al. (1991) Nature 352, 436-8). As human pathogens capable of infection of small primates, HIV-2 molecular clones provide attractive models for studies of AIDS pathogenesis, and for drug and vaccine development against HIV-1 and HIV-2.

Recently, HIV-2 was suggested as a possible vaccine candidate against the more virulent HIV-1 due to its long asymptomatic latency period, and its ability to protect against infection by HIV-1 (see, Travers et al. (1995) Science 268:1612-1615 and related commentary by Cohen et al. (1995) Science 268:1566). In the nine-year study by Travers et al. (id) of West African prostitutes infected with HIV-2, it was determined that infection with HIV-2 caused a 70% reduction in infection by HIV-1. Thus, codon altered HIV-2 viruses can also be used as a live vaccine, against both HIV-2 and HIV-1. Furthermore, because the natural pathogenicity of HIV-2 is less than HIV-1, it is, in addition to HIV-1, a preferred virus for modification.

FORMATS FOR SEQUENCE RECOMBINATION

The methods of the invention entail performing recombination ("shuffling") and screening or selection to "evolve" individual genes, whole plasmids or viruses, multigene clusters, or even whole genomes (Stemmer (1995) Bio/Technology 13:549-553). Reiterative cycles of recombination and screening/selection can be performed to further evolve the nucleic acids of interest. Such techniques do not require the extensive analysis and computation required by conventional methods for polypeptide engineering. Shuffling allows the recombination of large numbers of mutations in a minimum number of selection cycles, in contrast to natural pair-wise recombination events (e.g., as occur during sexual replication).
Thus, the sequence recombination techniques described herein provide particular advantages in that they provide recombination between mutations in any or all of these, thereby providing a very fast way of exploring the manner in which different combinations of mutations can affect a desired result. In some instances, however, structural and/or functional information is available which, although not required for sequence recombination, provides opportunities for modification of the technique.

Exemplary formats and examples for sequence recombination, referred to, e.g., as "DNA shuffling," "fast forced evolution," or "molecular breeding," have been described by the present inventors and co-workers in the following patents and patent applications: US Patent No. 5,605,793; PCT Application WO 95/22625 (Serial No. PCT/US95/02126), filed February 17, 1995; US Serial No. 08/425,684, filed April 18, 1995; US Serial No. 08/621,430, filed March 25, 1996; PCT Application WO 97/20078 (Serial No. PCT/US96/05480), filed April 18, 1996; PCT Application WO 97/35966, filed March 20, 1997; US Serial No. 08/675,502, filed July 3, 1996; US Serial No. 08/721,824, filed September 27, 1996; PCT Application WO 98/13487, filed September 26, 1997; "Evolution of Whole Cells and Organisms by Recursive Sequence Recombination" Attorney Docket No. 018097-020720US filed July 15, 1998 by del Cardayre et al. (PCT/US99/15972, filed 07/15/1999); Stemmer, Science 270:1510 (1995); Stemmer et al., Gene 164:49-53 (1995); Stemmer, Bio/Technology 13:549-553 (1995); Stemmer, Proc. Natl. Acad. Sci. U.S.A. 91:10747-10751 (1994); Stemmer, Nature 370:389-391 (1994); Crameri et al., Nature Medicine 2(1):1-3 (1996); and Crameri et al., Nature Biotechnology 14:315-319 (1996), each of which is incorporated by reference in its entirety for all purposes.
The recombination procedure starts with at least two substrates that generally show substantial sequence identity to each other (e.g., at least about 30%, 50%, 70%, 80% or 90% or more sequence identity), but differ from each other at certain positions. For example, at least one codon altered nucleic acid is recombined with one or more additional nucleic acids (the additional nucleic acid can also be a codon altered nucleic acid) herein. The difference between nucleic acids to be recombined can be any type of mutation, for example, substitutions, insertions and deletions. Often, different segments differ from each other in about 5-20 positions. For recombination to generate increased diversity relative to the starting materials, the starting materials must differ from each other in at least two nucleotide positions. That is, if there are only two substrates, there should be at least two divergent positions. If there are three substrates, for example, one substrate can differ from the second at a single position, and the second can differ from the third at a different single position. The starting DNA segments can be natural variants of each other, for example, allelic or species variants. More typically, they will be codon altered nucleic acids derived from one or more homologous nucleic acid sequences. The segments can also be from nonallelic genes showing some degree of structural and usually functional relatedness (e.g., codon altered nucleic acids derived from different, but homologous, genes within a superfamily). The starting DNA segments can also be induced variants of each other. For example, one DNA segment can be produced by error-prone PCR replication of the other, or by substitution of a mutagenic cassette. Induced mutants can also be prepared by propagating one (or both) of the segments in a mutagenic strain.
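The requirement of at least two divergent positions can be checked with a small in silico sketch. The model below is an idealization introduced here for illustration: it assumes two equal-length parents and allows a crossover between any pair of divergent positions, so every mix of parental bases at those positions is reachable. The function names are hypothetical.

```python
from itertools import product


def chimeras(parent_a, parent_b):
    """All sequences obtainable by choosing, at each divergent position,
    the base of either parent (idealized, unrestricted crossover model)."""
    if len(parent_a) != len(parent_b):
        raise ValueError("this sketch assumes equal-length, pre-aligned parents")
    divergent = [i for i, (x, y) in enumerate(zip(parent_a, parent_b)) if x != y]
    results = set()
    for picks in product((False, True), repeat=len(divergent)):
        seq = list(parent_a)
        for pos, take_b in zip(divergent, picks):
            if take_b:
                seq[pos] = parent_b[pos]
        results.add("".join(seq))
    return results


def novel_recombinants(parent_a, parent_b):
    """Recombinants that differ from both starting materials."""
    return chimeras(parent_a, parent_b) - {parent_a, parent_b}


print(novel_recombinants("ACGT", "ACGA"))          # one divergent position
print(sorted(novel_recombinants("ACGT", "TCGA")))  # two divergent positions
```

With a single divergent position the only products are the two parents, so no new diversity arises; with two divergent positions, two sequences absent from the starting materials appear.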
In these situations, strictly speaking, the second DNA segment is not a single segment but a large family of related segments. The different segments forming the starting materials are often the same length or substantially the same length. However, this need not be the case; for example, one segment can be a subsequence of another. The segments can be present as part of larger molecules, such as vectors, or can be in isolated form.

The starting DNA segments are recombined by any of the sequence recombination formats provided herein to generate a diverse library of recombinant DNA segments. Such a library can vary widely in size from having fewer than 10 to more than 10^5, 10^9, 10^12, 10^15, 10^20 or even more members. In some embodiments, the starting segments and the recombinant libraries generated will include essentially full-length coding sequences and any essential regulatory sequences, such as a promoter and polyadenylation sequence, required for expression. In other embodiments, the recombinant DNA segments in the library can be inserted into a common vector providing sequences necessary for expression before performing screening/selection.

Use of Restriction Enzyme Sites to Recombine Mutations

In some situations it is advantageous to use restriction enzyme sites in nucleic acids to direct the recombination of mutations in a nucleic acid sequence of interest. These techniques are particularly preferred in the evolution of fragments that cannot readily be shuffled by other existing methods due to the presence of repeated DNA or other problematic primary sequence motifs. These situations also include recombination formats in which it is preferred to retain certain sequences unmutated. The use of restriction enzyme sites is also preferred for shuffling large fragments (typically greater than 10 kb), such as gene clusters that cannot be readily shuffled and "PCR-amplified" because of their size.
Although fragments up to 50 kb have been reported to be amplified by PCR (Barnes, Proc. Natl. Acad. Sci. U.S.A. 91:2216-2220 (1994)), it can be problematic for fragments over 10 kb, and thus alternative methods for shuffling in the range of 10-50 kb and beyond are preferred. Preferably, the restriction endonucleases used are of the Class II type (Sambrook, Ausubel and Berger, supra) and of these, preferably those which generate nonpalindromic sticky end overhangs such as AlwN I, Sfi I or BstX I. These enzymes generate nonpalindromic ends that allow for efficient ordered reassembly with DNA ligase. Typically, restriction enzyme (or endonuclease) sites are identified by conventional restriction enzyme mapping techniques (Sambrook, Ausubel, and Berger, supra), by analysis of sequence information for that gene, or by introduction of desired restriction sites into a nucleic acid sequence by synthesis (i.e., by incorporation of silent mutations). For example, one or more codon-altered nucleic acids can be recombined at restriction sites, e.g., with one or more nucleic acids of interest (including, e.g., a gene or gene cluster to be modified by recombination with the codon-altered nucleic acid).

The DNA substrate molecules to be digested can either be from in vivo replicated DNA, such as a plasmid preparation, or from synthetic or, e.g., PCR amplified nucleic acid fragments harboring the restriction enzyme recognition sites of interest, preferably near the ends of the fragment. Typically, at least two variants of a gene of interest, each having one or more mutations, and at least one of which incorporates codon modifications, are digested with at least one restriction enzyme determined to cut within the nucleic acid sequence of interest. The restriction fragments are then joined with DNA ligase to generate full length genes having shuffled regions. The number of regions shuffled will depend on the number of cuts within the nucleic acid sequence of interest.
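The digestion and ordered religation just described can be sketched in silico as follows. This is a simplified model under stated assumptions: the recognition site is a placeholder literal string rather than the real recognition sequence of any named enzyme, each cut is placed at the start of the site, and ordered reassembly (which nonpalindromic overhangs would enforce at the bench) is modeled simply by keeping fragments in their original positions. The function names are hypothetical.

```python
from itertools import product

SITE = "CAGTTTCTG"  # placeholder site, assumed for illustration only


def digest(seq, site=SITE):
    """Split seq before each internal occurrence of site, preserving fragment order."""
    fragments, start = [], 0
    cut = seq.find(site, 1)
    while cut != -1:
        fragments.append(seq[start:cut])
        start = cut
        cut = seq.find(site, cut + 1)
    fragments.append(seq[start:])
    return fragments


def shuffle_by_restriction(variants, site=SITE):
    """Religate fragments in their original order, drawing each region from any variant."""
    digests = [digest(v, site) for v in variants]
    if len({len(d) for d in digests}) != 1:
        raise ValueError("all variants must contain the same number of sites")
    # zip(*digests) groups the alternative fragments available for each region.
    return {"".join(choice) for choice in product(*zip(*digests))}


variant_1 = "AAAA" + SITE + "CCCC" + SITE + "GGGG"
variant_2 = "AATA" + SITE + "CCTC" + SITE + "GGTG"
library = shuffle_by_restriction([variant_1, variant_2])
print(len(digest(variant_1)), len(library))
```

Two cuts give three shuffled regions, and two variants that differ in every region give 2^3 = 8 full-length products, six of which are new combinations.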
The shuffled molecules can be introduced into cells as described above and screened or selected for a desired property as described herein. Nucleic acid can then be isolated from pools (libraries), or clones having desired properties and subjected to the same procedure until a desired degree of improvement is obtained.

In some embodiments, at least one DNA substrate molecule or fragment thereof is isolated and subjected to mutagenesis. In some embodiments, the pool or library of religated restriction fragments is subjected to mutagenesis or additional recombination protocols before the digestion-ligation process is repeated. "Mutagenesis" as used herein comprises such techniques known in the art as PCR mutagenesis, oligonucleotide-directed mutagenesis, site-directed mutagenesis, etc., and recursive sequence recombination by any of the techniques described herein.

Reassembly PCR

A further technique for recombining mutations in a nucleic acid sequence utilizes "reassembly PCR." This method can be used to assemble multiple segments that have been separately evolved into a full length nucleic acid template such as a gene. This technique is performed when a pool of advantageous mutants is known from previous work or has been identified by screening mutants that may have been created by any mutagenesis technique known in the art, such as PCR mutagenesis, cassette mutagenesis, doped oligo mutagenesis, chemical mutagenesis, or propagation of the DNA template in vivo in mutator strains. Boundaries defining segments of a nucleic acid sequence of interest preferably lie in intergenic regions, introns, or areas of a gene not likely to have mutations of interest. Preferably, oligonucleotide primers (oligos) are synthesized for PCR amplification of segments of the nucleic acid sequence of interest, such that the sequences of the oligonucleotides overlap the junctions of two segments.
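How segments that share junction sequence can be joined into a full-length product may be pictured with a short in silico sketch. This models only the sequence logic of merging on a shared overlap, not the PCR chemistry; the function names, the toy gene, and the default minimum overlap of 10 nucleotides are assumptions for illustration.

```python
def merge_by_overlap(left, right, min_overlap=10):
    """Join two segments on the longest suffix of left that equals a prefix of right."""
    longest = min(len(left), len(right))
    for k in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("segments do not share a sufficient junction overlap")


def reassemble(segments, min_overlap=10):
    """Merge ordered segments one junction at a time into a single sequence."""
    assembled = segments[0]
    for segment in segments[1:]:
        assembled = merge_by_overlap(assembled, segment, min_overlap)
    return assembled


# Toy gene split into three segments with 10-nucleotide junction overlaps.
gene = "ATGACCGTTAGCAAGGCTCTTGATCCGTACGAGTTCAAACGGTGCATCTAA"
segments = [gene[0:25], gene[15:40], gene[30:]]
print(reassemble(segments) == gene)
```

The sketch assumes the segments are supplied in order and that the junction sequences are unique; repeated sequence near a junction could produce a spurious longer match, which is one reason boundaries are best placed in unproblematic regions.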
The overlap region is typically about 10 to 100 nucleotides in length. Each of the segments is amplified with a set of such primers. The PCR products are then "reassembled" according to assembly protocols such as those discussed herein to assemble randomly fragmented genes. In brief, in an assembly protocol the PCR products are first purified away from the primers, by, for example, gel electrophoresis or size exclusion chromatography. Purified products are mixed together and subjected to about 1-10 cycles of denaturing, reannealing, and extension in the presence of polymerase and deoxynucleoside triphosphates (dNTPs) and appropriate buffer salts in the absence of additional primers ("self-priming"). Subsequent PCR with primers flanking the gene is used to amplify the yield of the fully reassembled and shuffled genes. In some embodiments, the resulting reassembled genes are subjected to mutagenesis before the process is repeated.

In the present invention, oligos such as PCR primers can include codon modifications as compared to a starting sequence. In addition, oligonucleotides can form the basis for PCR concatemerization reactions in which overlapping hybridized oligonucleotides are extended in one or more PCR amplification cycles. In this embodiment, a template nucleic acid is not required (although a template or fragments thereof can be added to the amplification mixture, which can aid in the eventual reassembly of a full-length gene). Further details regarding oligonucleotide gene reassembly methods are found, e.g., in Crameri et al. "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" filed February 5, 1999, USSN 60/118,813 and Crameri et al. "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" filed June 24, 1999, USSN 60/141,049.

In a further embodiment, the PCR primers for amplification of segments of a nucleic acid sequence of interest are used to introduce variation into the gene of interest as follows.
Mutations at sites of interest in a nucleic acid sequence are identified by screening or selection, by sequencing homologues of the nucleic acid sequence, and so on. Oligonucleotide PCR primers are synthesized which encode wild type or mutant information at sites of interest. These primers are then used in PCR mutagenesis to generate libraries of full length genes encoding permutations of wild type and mutant information at the designated positions. This technique is typically advantageous in cases where the screening or selection process is expensive, cumbersome, or impractical relative to the cost of sequencing the genes of mutants of interest and synthesizing mutagenic oligonucleotides.

Site Directed Mutagenesis (SDM) with Oligonucleotides Encoding Homologue Mutations Followed by Shuffling

In some embodiments of the invention, sequence information from one or more substrate sequences is added to a given "parental" sequence of interest, with subsequent recombination between rounds of screening or selection. Typically, this is done with site directed mutagenesis performed by techniques well known in the art (e.g., Berger, Ausubel and Sambrook, supra) with one substrate as a template and oligonucleotides encoding single or multiple mutations from other substrate sequences, e.g., homologous genes. After screening or selection for an improved phenotype of interest, the selected recombinant(s) can be further evolved using recursive techniques. After screening or selection, site-directed mutagenesis can be done again with another collection of oligonucleotides encoding homologue mutations, and the above process repeated until the desired properties are obtained.

When the difference between two homologues is one or more single point mutations in a codon, degenerate oligonucleotides can be used that encode the sequences in both homologues.
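A degenerate oligonucleotide covering two homologues can be written with the standard IUPAC ambiguity codes. The following sketch is illustrative only, with hypothetical function names and toy sequences: it derives the degenerate oligo from pre-aligned homologues and enumerates every sequence that oligo encodes.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
         "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT"}
BASES_TO_CODE = {frozenset(bases): code for code, bases in IUPAC.items()}


def degenerate_oligo(homologues):
    """Collapse equal-length, pre-aligned sequences into one IUPAC-coded oligo."""
    if len({len(h) for h in homologues}) != 1:
        raise ValueError("this sketch assumes equal-length, pre-aligned sequences")
    return "".join(BASES_TO_CODE[frozenset(column)] for column in zip(*homologues))


def expand(oligo):
    """Enumerate every concrete sequence encoded by a degenerate oligo."""
    return {"".join(bases) for bases in product(*(IUPAC[code] for code in oligo))}


homologue_a = "GCTAAAGAT"   # Ala-Lys-Asp
homologue_b = "GTTAAAGAA"   # Val-Lys-Glu
oligo = degenerate_oligo([homologue_a, homologue_b])
print(oligo, sorted(expand(oligo)))
```

Here two degenerate positions yield four sequences, the two parents plus the two mixed permutations. Note that when homologues differ at two positions within the same codon, the degenerate codon can also encode amino acids found in neither homologue.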
One oligonucleotide can include many such degenerate codons and still allow one to exhaustively search all permutations over that block of sequence.

When the homologue sequence space is very large, it can be advantageous to restrict the search to certain variants. Thus, for example, computer modeling tools (Lathrop et al. (1996) J. Mol. Biol., 255:641-665) can be used to model each homologue mutation onto the target protein and discard any mutations that are predicted to grossly disrupt structure and function. In silico genetic algorithm operations for generating and predicting mutational events are found in Selifonov and Stemmer "METHODS FOR MAKING CHARACTER STRINGS, POLYNUCLEOTIDES & POLYPEPTIDES HAVING DESIRED CHARACTERISTICS" filed 02/05/1999, USSN 60/118854.

Oligonucleotide and in silico shuffling formats

As mentioned above, at least two additional related formats are useful in the practice of the present invention. The first, referred to as "in silico" shuffling, utilizes computer algorithms to perform "virtual" shuffling using genetic operators in a computer. As applied to the present invention, codon altered gene sequence strings are recombined in a computer system and desirable products are made, e.g., by reassembly PCR or ligation of synthetic oligonucleotides. In silico shuffling is described in detail in Selifonov and Stemmer in "METHODS FOR MAKING CHARACTER STRINGS, POLYNUCLEOTIDES & POLYPEPTIDES HAVING DESIRED CHARACTERISTICS" filed 02/05/1999, USSN 60/118854. In brief, genetic operators (algorithms which represent given genetic events such as point mutations, recombination of two strands of homologous nucleic acids, etc.) are used to model recombinational or mutational events which can occur in one or more nucleic acids, e.g., by aligning nucleic acid sequence strings (using standard alignment software, or by manual inspection and alignment) and predicting recombinational outcomes.
The predicted recombinational outcomes are used to produce corresponding molecules, e.g., by oligonucleotide synthesis and reassembly PCR.

The second useful format is referred to as "oligonucleotide mediated shuffling," in which oligonucleotides corresponding to a family of related homologous nucleic acids (e.g., as applied to the present invention, codon modified synthetic homologous variants of a nucleic acid) are recombined to produce selectable nucleic acids. This format is described in detail in Crameri et al. "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" filed February 5, 1999, USSN 60/118,813 and Crameri et al. "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" filed June 24, 1999, USSN 60/141,049. In brief, selected oligonucleotides are synthesized, ligated and elongated, typically either in a polymerase or ligase-mediated elongation reaction. The technique can be used to recombine homologous or even non-homologous codon-altered nucleic acid sequences.

One advantage of oligonucleotide-mediated recombination is the ability to recombine homologous nucleic acids with low sequence similarity, or even non-homologous nucleic acids. In these low-homology oligonucleotide shuffling methods, one or more sets of fragmented nucleic acids (e.g., cleaved codon-modified oligonucleotides, or synthesized codon-modified oligonucleotides) are recombined, e.g., with a set of crossover family diversity oligonucleotides. Each of these crossover oligonucleotides has a plurality of sequence diversity domains corresponding to a plurality of sequence diversity domains from homologous or non-homologous nucleic acids with low sequence similarity. The fragmented oligonucleotides, which are derived by comparison to one or more homologous or non-homologous nucleic acids, can hybridize to one or more regions of the crossover oligos, facilitating recombination.
When recombining homologous nucleic acids, sets of overlapping family gene shuffling oligonucleotides (which are derived by comparison of homologous nucleic acids that include one or more codon-modified nucleic acids, followed by synthesis of corresponding oligonucleotides) are hybridized and elongated (e.g., by reassembly PCR or ligation), providing a population of recombined nucleic acids, which can be selected for a desired trait or property. The set of overlapping family gene shuffling oligonucleotides includes a plurality of oligonucleotide member types which have consensus region subsequences derived from a plurality of homologous target nucleic acids.
Typically, as applied to the present invention, family gene shuffling oligonucleotides which include one or more codon-altered nucleic acid(s) are provided by aligning homologous nucleic acid sequences to select conserved regions of sequence identity and regions of sequence diversity. A plurality of family gene shuffling oligonucleotides are synthesized (serially or in parallel) which correspond to at least one region of sequence diversity.
Sets of fragments, or subsets of fragments, used in oligonucleotide shuffling approaches can be provided by cleaving one or more homologous nucleic acids (e.g., with a DNase), or, more commonly, by synthesizing a set of oligonucleotides corresponding to a plurality of regions of at least one nucleic acid (typically oligonucleotides corresponding to a full-length nucleic acid are provided as members of a set of nucleic acid fragments). In the shuffling procedures herein, these cleavage fragments can be used in conjunction with family gene shuffling oligonucleotides, e.g., in one or more recombination reactions to produce recombinant codon-altered nucleic acid(s).
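The alignment step described above, selecting conserved regions of sequence identity and regions of sequence diversity, can be sketched as follows. This is an illustrative sketch only. It assumes the homologous strings are already aligned, ungapped and of equal length. The helper names, the toy sequences and the short flank length are hypothetical, and real family shuffling oligonucleotides would be considerably longer.

```python
from itertools import groupby


def classify_columns(aligned):
    """Mark each alignment column 'C' (conserved) or 'D' (diverse)."""
    return "".join("C" if len(set(col)) == 1 else "D" for col in zip(*aligned))


def regions(mask):
    """Collapse a column mask into (kind, start, end) half-open intervals."""
    out, pos = [], 0
    for kind, run in groupby(mask):
        length = len(list(run))
        out.append((kind, pos, pos + length))
        pos += length
    return out


def family_oligos(aligned, flank=2):
    """For each diversity region, collect every variant plus conserved flanks."""
    mask = classify_columns(aligned)
    oligos = set()
    for kind, start, end in regions(mask):
        if kind != "D":
            continue
        lo, hi = max(0, start - flank), min(len(mask), end + flank)
        oligos.update(seq[lo:hi] for seq in aligned)
    return sorted(oligos)


# Hypothetical aligned homologs, including codon-modified variants.
ALIGNED = ["ATGAAACTGGTT", "ATGAAGTTAGTC", "ATGAAACTAGTT"]

if __name__ == "__main__":
    print(classify_columns(ALIGNED))
    print(family_oligos(ALIGNED))
```

The flanking bases stand in for the consensus subsequences that let oligonucleotides from different family members hybridize to one another during reassembly.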
Additional oligonucleotide shuffling formats are found in co-filed application by Crameri et al., "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" (Attorney Docket Number 02-296-2US) and in co-filed application by Welch et al., "USE OF CODON VARIED OLIGONUCLEOTIDE SYNTHESIS FOR SYNTHETIC SHUFFLING" (Attorney Docket Number 02-1007). In particular, these applications provide for tri-nucleotide-based synthesis of degenerate oligonucleotides, thereby providing for codon substitution during oligonucleotide shuffling. In brief, this procedure utilizes tri-nucleotide phosphoramidite chemistry to synthesize oligos, rather than standard mono-nucleotide synthesis. Because codons are altered as a unit, the synthetic scheme of degenerate oligonucleotides is simplified.
Additional In Vitro DNA Shuffling Formats
In one embodiment for shuffling DNA sequences in vitro, the initial substrates for recombination are a pool of related sequences, e.g., different variant forms, as homologs from different individuals, strains, or species of an organism, or related sequences from the same organism, as allelic variations. The sequences can be DNA or RNA and can be of various lengths depending on the size of the gene or DNA fragment to be recombined or reassembled. Preferably the sequences are from 50 base pairs (bp) to 50 kilobases (kb).
The pool of related substrates is converted into overlapping fragments, e.g., from about 5 bp to 5 kb or more. Often, for example, the size of the fragments is from about 10 bp to 1000 bp, and sometimes the size of the DNA fragments is from about 100 bp to 500 bp. The conversion can be effected by a number of different methods, such as DNase I or RNase digestion, random shearing or partial restriction enzyme digestion, or by oligonucleotide synthesis as in the family oligonucleotide-mediated shuffling methods of Crameri et al., discussed supra.
For discussions of protocols for the isolation, manipulation, enzymatic digestion, and the like, of nucleic acids, see, for example, Sambrook et al. and Ausubel, both supra. The concentration of nucleic acid fragments of a particular length and sequence is often less than 0.1% or 1% by weight of the total nucleic acid. The number of different specific nucleic acid fragments in the mixture is usually at least about 2, 10, 100, 500 or 1,000 or more.
The mixed population of nucleic acid fragments is converted to at least partially single-stranded form using any of a variety of techniques, including, for example, heating, chemical denaturation, use of DNA binding proteins, and the like (in oligonucleotide mediated methods, this step can be omitted). Conversion can be effected by heating to about 80 °C to 100 °C, more preferably from 90 °C to 96 °C, to form single-stranded nucleic acid fragments and then reannealing. Conversion can also be effected by treatment with a single-stranded DNA binding protein (see Wold (1997) Annu. Rev. Biochem. 66:61-92) or recA protein (see, e.g., Kiianitsa (1997) Proc. Natl. Acad. Sci. USA 94:7837-7840). Single-stranded nucleic acid fragments having regions of sequence identity with other single-stranded nucleic acid fragments can then be reannealed by cooling to 20 °C to 75 °C, and preferably from 40 °C to 65 °C. Renaturation can be accelerated by the addition of polyethylene glycol (PEG), other volume-excluding reagents or salt. The salt concentration is preferably from 0 mM to 200 mM, more preferably from 10 mM to 100 mM. The salt may be KCl or NaCl. The concentration of PEG is preferably from 0% to 20%, more preferably from 5% to 10%. The fragments that reanneal can be from different substrates.
The annealed nucleic acid fragments are incubated in the presence of a nucleic acid polymerase, such as Taq or Klenow, and dNTPs (i.e., dATP, dCTP, dGTP and dTTP).
If regions of sequence identity are large, Taq polymerase can be used with an annealing temperature of between 45-65 °C. If the areas of identity are small, Klenow polymerase can be used with an annealing temperature of between 20-30 °C. The polymerase can be added to the random nucleic acid fragments prior to annealing, simultaneously with annealing or after annealing.
The process of denaturation, renaturation and incubation in the presence of polymerase or ligase of overlapping fragments to generate a collection of polynucleotides containing different permutations of fragments is sometimes referred to as shuffling of the nucleic acid in vitro. This cycle is repeated for a desired number of times. Preferably the cycle is repeated from 2 to 100 times, more preferably the sequence is repeated from 10 to 40 times. The resulting nucleic acids are a family of double-stranded polynucleotides of from about 50 bp to about 100 kb, preferably from 500 bp to 50 kb. The population represents variants of the starting substrates showing substantial sequence identity thereto but also diverging at several positions. The population has many more members than the starting substrates. The population of fragments resulting from shuffling is used to transform host cells, optionally after cloning into a vector.
In one embodiment utilizing in vitro shuffling, subsequences of recombination substrates can be generated by amplifying the full-length sequences under conditions which produce a substantial fraction, typically at least 20 percent or more, of incompletely extended amplification products. Another embodiment uses random primers to prime an entire template DNA to generate less than full length amplification products. The amplification products, including the incompletely extended amplification products, are denatured and subjected to at least one additional cycle of reannealing and amplification.
This variation, in which at least one cycle of reannealing and amplification provides a substantial fraction of incompletely extended products, is termed "stuttering." In the subsequent amplification round, the partially extended (less than full length) products reanneal to and prime extension on different sequence-related template species. In another embodiment, the conversion of substrates to fragments can be effected by partial PCR amplification of substrates.
In another embodiment, a mixture of fragments is spiked with one or more oligonucleotides. The oligonucleotides can be designed to include precharacterized mutations of a wildtype sequence (e.g., codon modification), or sites of natural variations between individuals or species. The oligonucleotides also typically include sufficient sequence or structural homology flanking such mutations or variations to allow annealing with the wildtype fragments. Annealing temperatures can be adjusted depending on the length of homology.
In a further embodiment, recombination occurs in at least one cycle by template switching, such as when a DNA fragment derived from one template primes on the homologous position of a related but different template. Template switching can be induced by addition of recA (see, Kiianitsa (1997) supra), rad51 (see, Namsaraev (1997) Mol. Cell. Biol. 17:5359-5368), rad55 (see, Clever (1997) EMBO J. 16:2535-2544), rad57 (see, Sung (1997) Genes Dev. 11:1111-1121) or other polymerases (e.g., viral polymerases, reverse transcriptase) to the amplification mixture. Template switching can also be increased by increasing the DNA template concentration.
Another embodiment utilizes at least one cycle of amplification, which can be conducted using a collection of overlapping single-stranded DNA fragments of related sequence, and different lengths. Fragments can be prepared using a single-stranded DNA phage, such as M13 (see, Wang (1997) Biochemistry 36:9486-9492).
Each fragment can hybridize to and prime polynucleotide chain extension of a second fragment from the collection, thus forming sequence-recombined polynucleotides. In a further variation, ssDNA fragments of variable length can be generated from a single primer by Pfu, Taq, Vent, Deep Vent, UlTma DNA polymerase or other DNA polymerases on a first DNA template (see, Cline (1996) Nucleic Acids Res. 24:3546-3551). The single-stranded DNA fragments are used as primers for a second, Kunkel-type template, consisting of a uracil-containing circular ssDNA. This results in multiple substitutions of the first template into the second. See, Levichkin (1995) Mol. Biology 29:572-577; Jung (1992) Gene 121:17-24.
In some embodiments of the invention, shuffled nucleic acids obtained by use of the recursive recombination methods of the invention are put into a cell and/or organism for screening. Shuffled genes can be introduced into, for example, bacterial cells, yeast cells, fungal cells, vertebrate cells, invertebrate cells or plant cells for initial screening. Bacillus species (such as B. subtilis) and E. coli are two examples of suitable bacterial cells into which one can insert and express shuffled genes which provide for convenient shuttling to other cell types (a variety of vectors for shuttling material between these bacterial cells and eukaryotic cells are available; see, Sambrook, Ausubel and Berger, all supra). The shuffled genes can be introduced into bacterial, fungal or yeast cells either by integration into the chromosomal DNA or as plasmids.
Bacterial, plant, animal and yeast systems are preferred in the present invention. For example, in one embodiment, shuffled genes can be introduced into plant or animal cells for production purposes (it will be appreciated that transgenic plants are, increasingly, an important source of industrial enzymes), or can be introduced into a plant or animal cell for therapeutic purposes.
Thus, a transgene of interest can be modified using the recursive sequence recombination methods of the invention in vitro and reinserted into the cell for in vivo/in situ selection for the new or improved property, in bacteria, eukaryotic cells, or whole eukaryotic organisms.
In Vivo DNA Shuffling Formats
In some embodiments of the invention, DNA substrate molecules, e.g., those comprising codon modifications relative to a wild-type sequence, are introduced into cells, where the cellular machinery directs their recombination. For example, a library of mutants is constructed and screened or selected for mutants with improved phenotypes by any of the techniques described herein. The DNA substrate molecules encoding the best candidates are recovered by any of the techniques described herein, then fragmented and used to transfect a plant host and screened or selected for improved function. If further improvement is desired, the DNA substrate molecules are recovered from the host cell, such as by PCR, and the process is repeated until a desired level of improvement is obtained. In some embodiments, the fragments are denatured and reannealed prior to transfection, coated with recombination stimulating proteins such as recA, or co-transfected with a selectable marker such as NeoR to allow the positive selection for cells receiving recombined versions of the gene of interest. Methods for in vivo shuffling are described in, for example, PCT application WO 98/13487 and WO 97/20078.
The efficiency of in vivo shuffling can be enhanced by increasing the copy number of a gene of interest in the host cells.
Whole Genome Shuffling
In one embodiment, the selection methods herein are utilized in a "whole genome shuffling" format.
An extensive guide to the many forms of whole genome shuffling is found in the pioneering application to the inventors and their co-workers entitled "Evolution of Whole Cells and Organisms by Recursive Sequence Recombination," PCT/US99/15972, by del Cardayre et al. Any codon-altered set of nucleic acids can be used to transform cells, which can then be shuffled in a whole genome format.
In brief, whole genome shuffling makes no presuppositions at all regarding what nucleic acids may confer a desired property. Instead, entire genomes (e.g., from a genomic library, or isolated from an organism) are shuffled in cells and selection protocols applied to the cells. These genomes can be spiked with any desired set of nucleic acids, including codon-modified nucleic acids.
Assays
The relevant assay for selection of a desired property of a codon-modified nucleic acid will depend on the application. Many assays which detect activity for proteins, receptors, ligands, cells and the like are known. Formats include binding to immobilized components, cell or organismal viability, production of reporter compositions, and the like.
In the high throughput assays of the invention, it is possible to screen up to several thousand different shuffled variants in a single day. In particular, each well of a microtiter plate can be used to run a separate assay, or, if concentration or incubation time effects are to be observed, every 5-10 wells can test a single variant. Thus, a single standard microtiter plate can assay about 100 (e.g., 96) reactions. If 1536 well plates are used, then a single plate can easily assay from about 100 to about 1500 different reactions. It is possible to assay several different plates per day; assay screens of up to about 6,000-20,000 different assays (i.e., involving different nucleic acids, encoded proteins, concentrations, etc.) are possible using the integrated systems of the invention.
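The plate arithmetic in the preceding passage can be made concrete with a short calculation. This is a sketch under stated assumptions: the function names are hypothetical, and the plate counts it returns follow only from the well counts and daily assay figures quoted in the text, not from any instrument specification.

```python
import math


def variants_per_plate(wells, wells_per_variant=1):
    """Number of distinct variants one plate can test."""
    return wells // wells_per_variant


def plates_needed(assays_per_day, wells):
    """Plates required to run a given number of single-well assays in a day."""
    return math.ceil(assays_per_day / wells)


if __name__ == "__main__":
    # One variant per well versus 8 wells per variant (within the 5-10 range).
    print(variants_per_plate(96), variants_per_plate(96, 8))
    # Plates per day implied by the quoted 6,000-20,000 assays.
    print(plates_needed(6000, 96), plates_needed(20000, 1536))
```

Using several wells per variant to follow concentration or incubation time effects divides the number of variants per plate accordingly, which is why higher density plates matter for large shuffled libraries.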
More recently, microfluidic approaches to reagent manipulation have been developed, e.g., by Caliper Technologies (Mountain View, CA).
In one aspect, library members, e.g., cells, viral plaques, spores or the like, are separated on solid media to produce individual colonies (or plaques). Using an automated colony picker (e.g., the Q-bot, Genetix, U.K.), colonies or plaques are identified, picked, and up to 10,000 different mutants inoculated into 96 well microtitre dishes containing two 3 mm glass balls/well. The Q-bot does not pick an entire colony but rather inserts a pin through the center of the colony and exits with a small sampling of cells (or mycelia) and spores (or viruses in plaque applications). The time the pin is in the colony, the number of dips to inoculate the culture medium, and the time the pin is in that medium each affect inoculum size, and each can be controlled and optimized. The uniform process of the Q-bot decreases human handling error and increases the rate of establishing cultures (roughly 10,000/4 hours). These cultures are then shaken in a temperature and humidity controlled incubator. The glass balls in the microtiter plates act to promote uniform aeration of cells and the dispersal of mycelial fragments, similar to the blades of a fermenter. Clones from cultures of interest can be cloned by limiting dilution. As also described supra, plaques or cells constituting libraries can also be screened directly for production of proteins, either by detecting hybridization, protein activity, protein binding to antibodies, or the like.
The ability to detect a subtle increase in the performance of a shuffled library member over that of a parent strain relies on the sensitivity of the assay. The chance of finding the organisms having an improvement is increased by the number of individual mutants that can be screened by the assay.
To increase the chances of identifying a pool of sufficient size, a prescreen that increases the number of mutants processed by, e.g., 10-fold can be used. The goal of the primary screen is to quickly identify mutants having equal or better product titres than the parent strain(s) and to move only these mutants forward to liquid cell culture for subsequent analysis.
A number of well known robotic systems have also been developed for solution phase chemistries useful in assay systems. These systems include automated workstations like the automated synthesis apparatus developed by Takeda Chemical Industries, LTD. (Osaka, Japan) and many robotic systems utilizing robotic arms (Zymate II, Zymark Corporation, Hopkinton, Mass.; Orca, Hewlett-Packard, Palo Alto, Calif.) which mimic the manual synthetic operations performed by a scientist. Any of the above devices are suitable for use with the present invention, e.g., for high-throughput screening of molecules encoded by codon-altered nucleic acids. The nature and implementation of modifications to these devices (if any) so that they can operate as discussed herein with reference to the integrated system will be apparent to persons skilled in the relevant art.
High throughput screening systems are commercially available (see, e.g., Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems typically automate entire procedures including all sample and reagent pipetting, liquid dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate for the assay. These configurable systems provide high throughput and rapid start up as well as a high degree of flexibility and customization.
The manufacturers of such systems provide detailed protocols for the various high throughput systems. Thus, for example, Zymark Corp.
provides technical bulletins describing screening systems for detecting the modulation of gene transcription, ligand binding, and the like. Microfluidic approaches to reagent manipulation have also been developed, e.g., by Caliper Technologies (Mountain View, CA).
Optical images viewed (and, optionally, recorded) by a camera or other recording device (e.g., a photodiode and data storage device) are optionally further processed in any of the embodiments herein, e.g., by digitizing the image and/or storing and analyzing the image on a computer. A variety of commercially available peripheral equipment and software is available for digitizing, storing and analyzing a digitized video or digitized optical image, e.g., using PC (Intel x86 or Pentium chip-compatible DOS™, OS2™, WINDOWS™, WINDOWS NT™ or WINDOWS95™ based machines), MACINTOSH™, or UNIX based (e.g., SUN™ work station) computers.
One conventional system carries light from the assay device to a cooled charge-coupled device (CCD) camera, in common use in the art. A CCD camera includes an array of picture elements (pixels). The light from the specimen is imaged on the CCD. Particular pixels corresponding to regions of the specimen (e.g., individual hybridization sites on an array of biological polymers) are sampled to obtain light intensity readings for each position. Multiple pixels are processed in parallel to increase speed. The apparatus and methods of the invention are easily used for viewing any sample, e.g., by fluorescent or dark field microscopic techniques.
Software elements for manipulating strings of characters which correspond to codon-modified nucleic acids can be used to direct synthesis of oligonucleotides relevant to shuffling of codon-modified nucleic acids. Integrated systems comprising these and other useful features, e.g., a digital computer with additional features such as high-throughput liquid control software, image analysis software, data interpretation software, a robotic liquid control armature for transferring solutions from a source to a destination operably linked to the digital computer, an input device (e.g., a computer keyboard) for entering data to the digital computer to control high throughput liquid transfer by the robotic liquid control armature, an image scanner for digitizing label signals from labeled assay components, or the like are a feature of the invention.
In one aspect, the invention provides an integrated system comprising a computer or computer readable medium comprising a database having at least two artificial homologous codon-altered nucleic acid sequence strings, and a user interface allowing a user to selectively view one or more sequence strings in the database.
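As a rough illustration of the software elements described above, the sketch below stores two artificial codon-altered sequence strings, lets a caller selectively view them, and splits a string into overlapping subsequences of the kind an oligonucleotide synthesizer could be directed to make. Everything here is hypothetical: the class and function names, the example strings, and the oligo length and overlap are assumptions, and only one strand is handled for brevity.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq):
    """Translate an in-frame DNA character string into a protein string."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


class SequenceDatabase:
    """Minimal in-memory store of codon-altered sequence strings."""

    def __init__(self):
        self._strings = {}

    def add(self, name, seq):
        if len(seq) % 3 or set(seq) - set(BASES):
            raise ValueError("expected an in-frame string over A, C, G, T")
        self._strings[name] = seq

    def view(self, *names):
        """Selectively return the named sequence strings."""
        return {name: self._strings[name] for name in names}

    def encoded_polypeptides(self):
        """Set of distinct polypeptides encoded by the stored strings."""
        return {translate(seq) for seq in self._strings.values()}


def overlapping_oligos(seq, length=8, overlap=4):
    """Split a string into overlapping subsequences covering its full length."""
    step = length - overlap
    return [seq[i:i + length] for i in range(0, len(seq) - overlap, step)]


if __name__ == "__main__":
    db = SequenceDatabase()
    db.add("variant_1", "ATGAAACTGGTT")  # hypothetical strings encoding MKLV
    db.add("variant_2", "ATGAAGTTAGTC")
    print(db.encoded_polypeptides())
    for name, seq in db.view("variant_1", "variant_2").items():
        print(name, overlapping_oligos(seq))
```

A dictionary is used here only for compactness; the text contemplates spreadsheet or database software behind a graphical user interface.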
As discussed throughout, there are a variety of sequence database programs for aligning and manipulating sequences. In addition, standard text manipulation software such as word processing software (e.g., Microsoft Word™ or Corel WordPerfect™) and database software (e.g., spreadsheet software such as Microsoft Excel™, Corel Quattro Pro™, or database programs such as Microsoft Access™ or Paradox™) can be used in conjunction with a user interface (e.g., a GUI in a standard operating system such as a Windows, Macintosh or LINUX system) to manipulate strings of characters. Specialized alignment programs such as BLAST can also be incorporated into the systems of the invention for alignment of codon-altered nucleic acids (or corresponding character strings).
In addition to the integrated system elements mentioned above, the integrated system can also include an automated oligonucleotide synthesizer operably linked to the computer or computer readable medium. Typically, the synthesizer is programmed to synthesize one or more oligonucleotides comprising one or more subsequences of one or more of the at least two artificial homologous codon-altered nucleic acids.
Modifications can be made to the method and materials as hereinbefore described without departing from the spirit or scope of the invention as claimed, and the invention can be put to a number of different uses, including:
The use of an integrated system to test shuffled codon-modified DNAs, including in an iterative process.
An assay, kit or system utilizing a use of any one of the selection strategies, materials, components, methods or substrates hereinbefore described. Kits will optionally additionally comprise instructions for performing methods or assays, packaging materials, one or more containers which contain assay, device or system components, or the like.
In an additional aspect, the present invention provides kits embodying the methods and apparatus herein.
Kits of the invention optionally comprise one or more of the following: (1) a shuffled codon-modified component as described herein; (2) instructions for practicing the methods described herein, and/or for operating the selection procedure herein; (3) one or more assay components; (4) a container for holding nucleic acids or enzymes, other nucleic acids, transgenic plants, animals, cells, or the like; (5) packaging materials; and (6) software fixed in a computer readable medium comprising sequences corresponding to one or more codon-altered nucleic acid character strings.
In a further aspect, the present invention provides for the use of any component or kit herein, for the practice of any method or assay herein, and/or for the use of any apparatus or kit to practice any assay or method herein.
While the foregoing invention has been described in some detail for purposes of clarity and understanding, it will be clear to one skilled in the art from a reading of this disclosure that various changes in form and detail can be made without departing from the true scope of the invention. For example, all the techniques and materials described above can be used in various combinations. All publications and patent documents cited in this application are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication or patent document were so individually denoted.
Claims (51)
1. A method of making codon altered nucleic acids, the method comprising: (i) providing a first nucleic acid sequence, which nucleic acid sequence encodes a first polypeptide sequence; (ii) providing a plurality of codon altered nucleic acid sequences, each of which encode the first polypeptide or a modified form thereof; and, (iii) recombining the plurality of codon-altered nucleic acid sequences to produce a target codon altered nucleic acid, which target codon altered nucleic acid encodes a second protein.
2. The method of claim 1, wherein at least one of the plurality of codon altered nucleic acid sequences does not hybridize to the first nucleic acid under stringent hybridization conditions.
3. The method of claim 1, further comprising shuffling a nucleic acid comprising a subsequence consisting of the first nucleic acid, or a substantially identical variant thereof, with one or more of the plurality of codon altered nucleic acids, or with the target codon altered nucleic acid.
4. The method of claim 1, the method further comprising the step of: (iv) screening the second protein for a structural or functional property.
5. The method of claim 1, the method further comprising the steps of: (iv) screening the second protein for a structural or functional property, and, (v) comparing the structural or functional property of the second protein to a structural or functional property of the first protein.
6. The method of claim 1, wherein the second polypeptide has a structural or functional property equivalent or superior to the first polypeptide.
7. The method of claim 1, wherein the first and second polypeptide are homologous.
8. The method of claim 1, wherein the plurality of codon altered nucleic acids comprise a library of codon altered nucleic acids.
9. The method of claim 1, wherein the plurality of codon altered nucleic acids comprise a library of codon altered conservatively modified nucleic acids.
10. A library of codon altered conservatively modified nucleic acids produced by the method of claim 9.
11. The method of claim 1, wherein the plurality of codon altered nucleic acids comprise a library of codon altered non-conservatively modified nucleic acids.
12. A library of codon altered conservatively modified nucleic acids produced by the method of claim 11.
13. The method of claim 1, wherein the plurality of codon altered nucleic acids is derived from a plurality of forms of the first nucleic acid.
14. The method of claim 1, wherein the plurality of codon altered nucleic acid sequences comprise at least three codon altered nucleic acids.
15. The method of claim 1, wherein the plurality of codon altered nucleic acid sequences comprise one or more of the following structural features: (a) codon usage divergence for each of the codon altered nucleic acids of 50% or more as compared to the first nucleic acid; (b) codon usage divergence for each of the codon altered nucleic acids of 75% or more as compared to the first nucleic acid; (c) codon usage divergence for each of the codon altered nucleic acids of 90% or more as compared to the first nucleic acid; (d) maximal codon usage divergence for each of the codon altered nucleic acids as compared to the first nucleic acid; (e) non-overlapping non-conservative substitutions in each of the codon altered nucleic acids as compared to the first nucleic acid; (f) a lack of high stringency hybridization between one or more of the codon altered nucleic acid and the first nucleic acid; and, (g) modification of the codons of one or more of the codon altered nucleic acids to provide one or more different hydrophobic core residue for an encoded polypeptide as compared to the first polypeptide.
16. The method of claim 1, wherein the percent identity between the second protein and the first protein is lower than the percent identity between two of the plurality of codon altered nucleic acids.
17. The method of claim 1, wherein the first nucleic acid encodes a protein selected from: EPO, G-CSF, a viral envelope protein, a cytokine, and a phosphatase.
18. The method of claim 1, wherein the first nucleic acid sequence or the codon altered nucleic acid sequences are isolated nucleic acids.
19. The target codon altered nucleic acid produced by the method of claim 1.
20. The method of claim 1, wherein the first nucleic acid sequence or the codon altered nucleic acid sequences are nucleic acids present in cells.
21. The cells produced by the method of claim 20.
22. The method of claim 1, wherein each of the codon altered nucleic acid sequences comprises at least two nucleotide differences when compared to the first nucleic acid.
23. The method of claim 1, further comprising introducing the target codon altered nucleic acid into a cell, or into a vector or virus.
24. The cell, vector or virus produced by the method of claim 23.
25. The method of claim 1, wherein the target codon altered nucleic acid is recombined with a portion of a viral genome to produce an attenuated virus.
26. The attenuated virus produced by the method of claim 25.
27. The method of claim 1, wherein the target codon altered nucleic acid is recombined with a portion of a viral genome to produce an attenuated virus, which attenuated virus produces an immune response upon infection by the virus in a mammal.
28. The attenuated virus produced by the method of claim 27.
29. The method of claim 1, wherein the target codon altered nucleic acid is recombined with a portion of a retroviral genome to produce an attenuated retrovirus, which attenuated retrovirus produces an immune response upon infection by the retrovirus in a mammal.
30. The attenuated retrovirus produced by the method of claim 29.
31. The method of claim 1, wherein the target codon altered nucleic acid is recombined with a portion of a viral genome to produce a viral vector.
32. The viral vector produced by the method of claim 31.
33. The method of claim 1, wherein the target codon altered nucleic acid is recombined with a portion of a viral genome to produce a viral vector, which vector requires trans complementation for replication, and which vector has a reduced rate of reversion to a replicative form as compared to a corresponding viral vector which lacks a subsequence corresponding to the target codon altered nucleic acid.
34. The viral vector produced by the method of claim 33.
35. The method of claim 33, wherein the vector comprises viral elements from one or more of: a lentivirus, an adenovirus, a herpes virus, and an adeno-associated virus.
36. A method of making a library of codon-altered nucleic acids, the method comprising: (i) selecting a first nucleic acid sequence, which nucleic acid sequence encodes a first polypeptide sequence; and, (ii) making a plurality of codon altered nucleic acid sequences, each of which encodes the first polypeptide or a modified form thereof, wherein the plurality of codon altered nucleic acids comprise the library.
37. A codon-altered library made by the method of claim 36.
38. The library of claim 37, wherein said library comprises at least 2 codon altered nucleic acids.
39. The library of claim 37, wherein said library comprises at least 5 codon altered nucleic acids.
40. The library of claim 37, wherein said library comprises at least 10 codon altered nucleic acids.
41. The library of claim 37, wherein said library comprises at least 100 codon altered nucleic acids.
42. A kit comprising the library of claim 25 and one or more of: a container and instructional materials providing method step instructions for recombining two or more members of the library.
43. The method of claim 41, further comprising recombining said plurality of codon altered nucleic acids to produce a shuffled codon-altered library.
44. The codon altered library made by the method of claim 43.
45. The method of claim 36, wherein said nucleic acids encode a protein selected from EPO, a cytokine, a phosphatase, and a viral envelope protein.
46. A composition comprising a plurality of codon altered nucleic acids, each of which encodes a first polypeptide or a modified form thereof.
47. A library of codon-altered nucleic acids, comprising a plurality of codon altered nucleic acids derived from a plurality of homologous nucleic acids.
48. The library of claim 47, wherein said plurality of codon altered nucleic acids recombine in vitro at an increased rate compared to said plurality of homologous nucleic acids.
49. The library of claim 47, wherein the level of identity among said plurality of codon-altered nucleic acids is at least as high as among a plurality of polypeptides encoded by said plurality of homologous nucleic acids.
50. An integrated system comprising a computer or computer readable medium comprising a database having at least two artificial homologous codon-altered nucleic acid sequence strings, and a user interface allowing a user to selectively view one or more sequence strings in the database.
51. The integrated system of claim 50, further comprising an automated oligonucleotide synthesizer operably linked to the computer or computer readable medium, which synthesizer is programmed to synthesize one or more oligonucleotides comprising one or more subsequences of one or more of the at least two artificial homologous codon-altered nucleic acids.
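Illustrative sketch (not part of the claims): the library-making steps of claim 36 can be pictured, under simplifying assumptions, as in silico synonymous codon substitution. The Python below assumes the standard genetic code and an in-frame coding sequence; the function names (`translate`, `codon_altered_variant`, `make_library`), the random sampling strategy, and the example sequence are hypothetical and are not taken from the specification. It covers only variants that encode the unmodified first polypeptide, applies the two-nucleotide-difference threshold of claim 22 as a filter, and does not model recombination (shuffling) or screening.

```python
import itertools
import random

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}

# Reverse map: amino acid (or stop, '*') -> all synonymous codons.
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_altered_variant(dna, rng):
    """Replace every codon with a randomly chosen synonymous codon."""
    return "".join(
        rng.choice(AA_TO_CODONS[CODON_TO_AA[dna[i:i + 3]]])
        for i in range(0, len(dna), 3)
    )


def make_library(dna, size, seed=0, min_differences=2, max_attempts=10000):
    """Collect distinct codon altered sequences encoding the same polypeptide.

    Variants with fewer than `min_differences` nucleotide changes relative to
    the first sequence are discarded (hypothetical filter echoing claim 22).
    """
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    rng = random.Random(seed)
    library = set()
    for _ in range(max_attempts):
        if len(library) >= size:
            break
        variant = codon_altered_variant(dna, rng)
        differences = sum(a != b for a, b in zip(dna, variant))
        if differences >= min_differences:
            library.add(variant)
    return sorted(library)


if __name__ == "__main__":
    first = "ATGGCTCTGAGCCGTTAA"  # hypothetical example: Met-Ala-Leu-Ser-Arg-stop
    for member in make_library(first, size=5):
        print(member, translate(member))
```

Each printed member differs from the first sequence at the nucleotide level but translates to the same polypeptide. A practical design would likely weight codon choice by host codon usage rather than sampling uniformly; that refinement is left out of this sketch.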
Applications Claiming Priority (9)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10236298P | 1998-09-29 | 1998-09-29 | |
US60102362 | 1998-09-29 | ||
US11772999P | 1999-01-29 | 1999-01-29 | |
US60117729 | 1999-01-29 | ||
US11881399P | 1999-02-05 | 1999-02-05 | |
US60118813 | 1999-02-05 | ||
US14104999P | 1999-06-24 | 1999-06-24 | |
US60141049 | 1999-06-24 | ||
PCT/US1999/022588 WO2000018906A2 (en) | 1998-09-29 | 1999-09-28 | Shuffling of codon altered genes |
Publications (1)
Publication Number | Publication Date |
---|---|
AU1199000A (en) | 2000-04-17 |
Family
ID=27493256
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
AU11990/00A (Abandoned; AU1199000A) | Shuffling of codon altered genes | 1998-09-29 | 1999-09-28 |
Country Status (7)
Country | Link |
---|---|
EP (1) | EP1117777A2 (en) |
JP (1) | JP2002537758A (en) |
KR (1) | KR20010085850A (en) |
AU (1) | AU1199000A (en) |
CA (1) | CA2331335A1 (en) |
IL (1) | IL140441A0 (en) |
WO (1) | WO2000018906A2 (en) |
Families Citing this family (78)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6759226B1 (en) | 2000-05-24 | 2004-07-06 | Third Wave Technologies, Inc. | Enzymes for the detection of specific nucleic acid sequences |
US7150982B2 (en) | 1991-09-09 | 2006-12-19 | Third Wave Technologies, Inc. | RNA detection assays |
US7045289B2 (en) | 1991-09-09 | 2006-05-16 | Third Wave Technologies, Inc. | Detection of RNA Sequences |
US5605793A (en) | 1994-02-17 | 1997-02-25 | Affymax Technologies N.V. | Methods for in vitro recombination |
WO1999020652A1 (en) * | 1997-10-23 | 1999-04-29 | Nippon Institute For Biological Science | Feline granulocyte colony-stimulating factor |
AU1124499A (en) | 1997-10-28 | 1999-05-17 | Maxygen, Inc. | Human papillomavirus vectors |
EP1030861A4 (en) | 1997-10-31 | 2001-09-05 | Maxygen Inc | Modification of virus tropism and host range by viral genome shuffling |
US6902918B1 (en) | 1998-05-21 | 2005-06-07 | California Institute Of Technology | Oxygenase enzymes and screening method |
IL140125A0 (en) | 1998-06-17 | 2002-02-10 | Maxygen Inc | Method for producing polynucleotides with desired properties |
US7033781B1 (en) | 1999-09-29 | 2006-04-25 | Diversa Corporation | Whole cell engineering by mutagenizing a substantial portion of a starting genome, combining mutations, and optionally repeating |
AU2788101A (en) * | 2000-01-11 | 2001-07-24 | Maxygen, Inc. | Integrated systems and methods for diversity generation and screening |
AU2001241522A1 (en) * | 2000-02-16 | 2001-08-27 | Sequel Genetics, Inc. | Methods and products for peptide based dna sequence identification and analysis |
WO2001068835A2 (en) * | 2000-03-13 | 2001-09-20 | Aptagen | Method for modifying a nucleic acid |
US7115403B1 (en) | 2000-05-16 | 2006-10-03 | The California Institute Of Technology | Directed evolution of galactose oxidase enzymes |
US20020045175A1 (en) * | 2000-05-23 | 2002-04-18 | Zhen-Gang Wang | Gene recombination and hybrid protein development |
AUPQ776100A0 (en) | 2000-05-26 | 2000-06-15 | Australian National University, The | Synthetic molecules and uses therefor |
CA2413022A1 (en) * | 2000-06-14 | 2001-12-20 | Diversa Corporation | Whole cell engineering by mutagenizing a substantial portion of a starting genome, combining mutations, and optionally repeating |
US20050209447A1 (en) * | 2000-07-25 | 2005-09-22 | Takashi Ito | Process for producing recombinant protein |
AU2001280968A1 (en) | 2000-07-31 | 2002-02-13 | Menzel, Rolf | Compositions and methods for directed gene assembly |
US20030073092A1 (en) * | 2000-11-10 | 2003-04-17 | Maranas Costas D. | Modeling framework for predicting the number, type, and distribution of crossovers in directed evolution experiments |
CA2434224C (en) | 2001-01-10 | 2013-04-02 | The Penn State Research Foundation | Method and system for modeling cellular metabolism |
WO2002083868A2 (en) | 2001-04-16 | 2002-10-24 | California Institute Of Technology | Peroxide-driven cytochrome p450 oxygenase variants |
WO2002092780A2 (en) * | 2001-05-17 | 2002-11-21 | Diversa Corporation | Novel antigen binding molecules for therapeutic, diagnostic, prophylactic, enzymatic, industrial, and agricultural applications, and methods for generating and screening thereof |
WO2003008563A2 (en) | 2001-07-20 | 2003-01-30 | California Institute Of Technology | Improved cytochrome p450 oxygenases |
DK1493027T3 (en) | 2002-03-01 | 2014-11-17 | Codexis Mayflower Holdings Llc | Methods, systems and software for identifying functional biomolecules |
WO2003078583A2 (en) | 2002-03-09 | 2003-09-25 | Maxygen, Inc. | Optimization of crossover points for directed evolution |
US9321832B2 (en) | 2002-06-28 | 2016-04-26 | Domantis Limited | Ligand |
AU2003256480B2 (en) | 2002-07-10 | 2008-03-06 | The Penn State Research Foundation | Method for determining gene knockout strategies |
US7826975B2 (en) | 2002-07-10 | 2010-11-02 | The Penn State Research Foundation | Method for redesign of microbial production systems |
WO2005003289A2 (en) | 2002-08-06 | 2005-01-13 | Verdia, Inc. | Ap1 amine oxidase variants |
MXPA05011585A (en) | 2003-04-29 | 2006-05-25 | Pioneer Hi Bred Int | Novel glyphosate-n-acetyltransferase (gat) genes. |
EP1639091B1 (en) | 2003-06-17 | 2012-12-05 | California University Of Technology | Regio- and enantioselective alkane hydroxylation with modified cytochrome p450 |
WO2005017116A2 (en) | 2003-08-11 | 2005-02-24 | California Institute Of Technology | Thermostable peroxide-driven cytochrome p450 oxygenase variants and methods of use |
US8715988B2 (en) | 2005-03-28 | 2014-05-06 | California Institute Of Technology | Alkane oxidation by modified hydroxylases |
US11214817B2 (en) | 2005-03-28 | 2022-01-04 | California Institute Of Technology | Alkane oxidation by modified hydroxylases |
WO2008016709A2 (en) | 2006-08-04 | 2008-02-07 | California Institute Of Technology | Methods and systems for selective fluorination of organic molecules |
US8252559B2 (en) | 2006-08-04 | 2012-08-28 | The California Institute Of Technology | Methods and systems for selective fluorination of organic molecules |
GB0617387D0 (en) * | 2006-09-04 | 2006-10-11 | Glaxo Group Ltd | Synthetic gene |
PL2139515T5 (en) | 2007-03-30 | 2024-04-08 | The Research Foundation Of The State University Of New York | Attenuated viruses useful for vaccines |
CA2723427C (en) | 2008-05-23 | 2018-01-23 | E. I. Du Pont De Nemours And Company | Novel dgat genes for increased seed storage lipid production and altered fatty acid profiles in oilseed plants |
US8383346B2 (en) | 2008-06-13 | 2013-02-26 | Codexis, Inc. | Combined automated parallel synthesis of polynucleotide variants |
EP2346995B1 (en) | 2008-09-26 | 2018-11-07 | Tocagen Inc. | Gene therapy vectors and cytosine deaminases |
US9187762B2 (en) | 2010-08-13 | 2015-11-17 | Pioneer Hi-Bred International, Inc. | Compositions and methods comprising sequences having hydroxyphenylpyruvate dioxygenase (HPPD) activity |
US9322007B2 (en) | 2011-07-22 | 2016-04-26 | The California Institute Of Technology | Stable fungal Cel6 enzyme variants |
KR101148191B1 (en) * | 2011-09-27 | 2012-05-23 | 김후정 | Erythropoietin-derived peprides and uses thereof |
RU2014148769A (en) | 2012-05-04 | 2016-06-27 | Е.И. Дюпон Де Немур Энд Компани | COMPOSITIONS AND METHODS INCLUDING SEQUENCES CHARACTERIZED BY MEGANUCLEASE ACTIVITY |
US9663532B2 (en) | 2012-10-29 | 2017-05-30 | University Of Rochester | Artemisinin derivatives, methods for their preparation and their use as antimalarial agents |
CN104955961B (en) * | 2012-12-11 | 2017-03-08 | 塞勒密斯株式会社 | Methods for Synthesizing Gene Libraries Using Codon Randomization and Mutagenesis |
US20140289906A1 (en) | 2013-03-14 | 2014-09-25 | Pioneer Hi-Bred International, Inc. | Compositions Having Dicamba Decarboxylase Activity and Methods of Use |
EP2970935A1 (en) | 2013-03-14 | 2016-01-20 | Pioneer Hi-Bred International, Inc. | Compositions having dicamba decarboxylase activity and methods of use |
CA2901316A1 (en) | 2013-03-15 | 2014-09-25 | Pioneer Hi-Bred International, Inc. | Phi-4 polypeptides and methods for their use |
CN106232820A (en) | 2013-08-16 | 2016-12-14 | 先锋国际良种公司 | Insecticidal protein and using method thereof |
EA031651B1 (en) | 2013-09-13 | 2019-02-28 | Пайонир Хай-Бред Интернэшнл, Инк. | Insecticidal proteins and methods for their use |
WO2015100277A2 (en) | 2013-12-23 | 2015-07-02 | University Of Rochester | Methods and compositions for ribosomal synthesis of macrocyclic peptides |
BR112016018103B1 (en) | 2014-02-07 | 2024-01-16 | E.I. Du Pont De Nemours And Company | POLYPEPTIDE AND ITS USE, POLYNUCLEOTIDE, COMPOSITION, FUSION PROTEIN, METHOD FOR CONTROLING A POPULATION, METHOD FOR INHIBITING GROWTH, METHOD FOR CONTROLING INFESTATION, METHOD FOR OBTAINING A PLANT OR PLANT CELL, CONSTRUCTION |
ES2806473T3 (en) | 2014-02-07 | 2021-02-17 | Pioneer Hi Bred Int | Insecticidal proteins and methods for their use |
CA2963558C (en) | 2014-10-16 | 2023-04-04 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
WO2016114973A1 (en) | 2015-01-15 | 2016-07-21 | Pioneer Hi Bred International, Inc | Insecticidal proteins and methods for their use |
EA038923B1 (en) | 2015-03-11 | 2021-11-10 | Пайонир Хай-Бред Интернэшнл, Инк. | Insecticidal dna construct and methods of use thereof |
RU2017144238A (en) | 2015-05-19 | 2019-06-19 | Пайонир Хай-Бред Интернэшнл, Инк. | INSECTICIDAL PROTEINS AND METHODS OF THEIR APPLICATION |
WO2017023486A1 (en) | 2015-08-06 | 2017-02-09 | Pioneer Hi-Bred International, Inc. | Plant derived insecticidal proteins and methods for their use |
US20180325119A1 (en) | 2015-12-18 | 2018-11-15 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
WO2017192560A1 (en) | 2016-05-04 | 2017-11-09 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
MX2018015906A (en) | 2016-07-01 | 2019-04-04 | Pioneer Hi Bred Int | Insecticidal proteins from plants and methods for their use. |
WO2018048869A1 (en) * | 2016-09-06 | 2018-03-15 | Bioventures, Llc | Compositions and methods for generating reversion free attenuated and/or replication incompetent vaccine vectors |
US11021716B2 (en) | 2016-11-01 | 2021-06-01 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
US11174295B2 (en) | 2016-12-14 | 2021-11-16 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
CA3046226A1 (en) | 2016-12-22 | 2018-06-28 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
CA3052794A1 (en) | 2017-02-08 | 2018-08-16 | Pioneer Hi-Bred International, Inc. | Insecticidal combinations of plant derived insecticidal proteins and methods for their use |
EP3363900A1 (en) * | 2017-02-21 | 2018-08-22 | ETH Zurich | Evolution-guided multiplexed dna assembly of dna parts, pathways and genomes |
RU2019140646A (en) | 2017-05-11 | 2021-06-11 | Пайонир Хай-Бред Интернэшнл, Инк. | INSECTICIDE PROTEINS AND METHODS OF THEIR APPLICATION |
WO2019040335A1 (en) | 2017-08-19 | 2019-02-28 | University Of Rochester | Micheliolide derivatives, methods for their preparation and their use as anticancer and antiinflammatory agents |
CA3092078A1 (en) | 2018-03-14 | 2019-09-19 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins from plants and methods for their use |
CN116410286A (en) | 2018-03-14 | 2023-07-11 | 先锋国际良种公司 | Insecticidal proteins from plants and methods of use thereof |
US20220356491A1 (en) * | 2019-06-21 | 2022-11-10 | Osaka University | Method for preparing artificial recombinant rna virus that stably holds foreign gene |
CA3163708A1 (en) | 2020-01-10 | 2021-07-15 | Yi Tang | Biosynthetic platform for the production of olivetolic acid and analogues of olivetolic acid |
EP4182466A2 (en) | 2020-07-14 | 2023-05-24 | Pioneer Hi-Bred International, Inc. | Insecticidal proteins and methods for their use |
EP4490202A1 (en) | 2022-03-11 | 2025-01-15 | University of Rochester | Cyclopeptibodies and uses thereof |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
IL99553A0 (en) * | 1990-09-28 | 1992-08-18 | Ixsys Inc | Compositions containing oligonucleotides linked to expression elements,a kit for the preparation of vectors useful for the expression of a diverse population of random peptides and methods utilizing the same |
US5837458A (en) * | 1994-02-17 | 1998-11-17 | Maxygen, Inc. | Methods and compositions for cellular and metabolic engineering |
WO1998013485A1 (en) * | 1996-09-27 | 1998-04-02 | Maxygen, Inc. | Methods for optimization of gene therapy by recursive sequence shuffling and selection |
-
1999
- 1999-09-28 AU AU11990/00A patent/AU1199000A/en not_active Abandoned
- 1999-09-28 CA CA002331335A patent/CA2331335A1/en not_active Abandoned
- 1999-09-28 JP JP2000572353A patent/JP2002537758A/en not_active Withdrawn
- 1999-09-28 KR KR1020017003873A patent/KR20010085850A/en not_active Application Discontinuation
- 1999-09-28 EP EP99969739A patent/EP1117777A2/en not_active Withdrawn
- 1999-09-28 IL IL14044199A patent/IL140441A0/en unknown
- 1999-09-28 WO PCT/US1999/022588 patent/WO2000018906A2/en not_active Application Discontinuation
Also Published As
Publication number | Publication date |
---|---|
KR20010085850A (en) | 2001-09-07 |
IL140441A0 (en) | 2002-02-10 |
CA2331335A1 (en) | 2000-04-06 |
WO2000018906A9 (en) | 2000-08-31 |
EP1117777A2 (en) | 2001-07-25 |
WO2000018906A3 (en) | 2000-10-26 |
WO2000018906A2 (en) | 2000-04-06 |
JP2002537758A (en) | 2002-11-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU1199000A (en) | | Shuffling of codon altered genes |
US6423542B1 (en) | | Oligonucleotide mediated nucleic acid recombination |
CA2320697C (en) | | Oligonucleotide mediated nucleic acid recombination |
US6436675B1 (en) | | Use of codon-varied oligonucleotide synthesis for synthetic shuffling |
US6368861B1 (en) | | Oligonucleotide mediated nucleic acid recombination |
US8058001B2 (en) | | Oligonucleotide mediated nucleic acid recombination |
US20060051795A1 (en) | | Oligonucleotide mediated nucleic acid recombination |
Soong et al. | | Molecular breeding of viruses |
US6413745B1 (en) | | Recombination of insertion modified nucleic acids |
EA020657B1 (en) | | Tailored multi-site combinatorial assembly |
US20030054390A1 (en) | | Oligonucleotide mediated nucleic acid recombination |
US20230357733A1 (en) | | Reverse Transcriptase and Methods of Use |
MXPA01003212A (en) | | Shuffling of codon altered genes |
DK2253704T3 (en) | | Oligonucleotide-mediated recombination nucleic acid |
KR20010042040A (en) | | Oligonucleotide mediated nucleic acid recombination |
MXPA00009027A (en) | | Oligonucleotide mediated nucleic acid recombination |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
MK5 | Application lapsed section 142(2)(e) - patent request and compl. specification not accepted |