Bibliography

Allex, C. F., Baldwin, S. F., Shavlik, J. W. and Blattner, F. R. (1996), Improving the Quality of Automatic DNA Sequence Assembly using Fluorescent Trace-Data Classifications. Intell. Systems Mol. Biol., 4, 3-14.

Allex, C. F., Baldwin, S. F., Shavlik, J. W. and Blattner, F. R. (1997), Increasing Consensus Accuracy in DNA Fragment Assemblies by Incorporating Fluorescent Trace Representations. Proceedings, Fifth International Conference on Intelligent Systems for Molecular Biology, AAAI Press, pp. 3-14.

Allison, L. (1993), A Fast Algorithm for the Optimal Alignment of Three Strings. Journal of Theoretical Biology, 164, 261-269.

Althaus, E., Caprara, A., Lenhof, H.-P. and Reinert, K. (2002), Multiple sequence alignment with arbitrary gap costs: Computing an optimal solution using polyhedral combinatorics. Bioinformatics, 18, S4-S16, Suppl. 2.

Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D. J. (1997), Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research, 25, 3389-3402.

Anderson, I. and Brass, A. (1998), Searching DNA databases for similarities to DNA sequences: when is a match significant? Bioinformatics, 14(4), 349-356.

Anson, E. L. and Myers, E. W. (1997), A Program for Refining DNA Sequence Multi-Alignments. Journal of Computational Biology, 4(3), 269-283.

Armen, C. and Stein, C. (1995), Short Superstrings and the Structure of Overlapping Strings. Journal of Computational Biology, 2(2), 307-332.

Arslan, A. N., Egecioglu, O. and Pevzner, P. A. (2001), A new approach to sequence comparison: normalized sequence alignment. Bioinformatics, 17(4), 327-337.

Asayama, M., Saito, K. and Kobayashi, Y. (1998), Translational attenuation of the Bacillus subtilis spo0B cistron by an RNA structure encompassing the initiation region. Nucleic Acids Research, 26(3), 824-830.

Baeza-Yates, R. A. and Gonnet, G. H. (1992), A New Approach to Text Searching. Commun. of the Assoc. for Comp. Mach., 35, 74-82.

Bailey, J. A., Yavor, A. M., Massa, H. F., Trask, B. J. and Eichler, E. E. (2001), Segmental Duplications: Organization and Impact Within the Current Human Genome Project Assembly. Genome Research, 11, 1005-1017.

Barker, G., Batley, J., O'Sullivan, H., Edwards, K. J. and Edwards, D. (2003), Redundancy based detection of sequence polymorphisms in expressed sequence tag data using autoSNP. Bioinformatics, 421-2.

Barton, G. J. (1993), An efficient algorithm to locate all locally optimal alignments between two sequences allowing for gaps. Computer Applications in the Biosciences, 9(6), 729-734.

Berno, A. J. (1996), A Graph Theoretic Approach to the Analysis of DNA Sequencing Data. Genome Research, 6, 80-91.

Bonfield, J. K., Rada, C. and Staden, R. (1998), Automated detection of point mutations using fluorescent sequence trace subtraction. Nucleic Acids Research, 26, 3404-3409.

Bonfield, J. K., Smith, K. F. and Staden, R. (1995a), The application of numerical estimates of base calling accuracy to DNA sequencing projects. Nucleic Acids Research, 23(8), 1406-1410.

Bonfield, J. K., Smith, K. F. and Staden, R. (1995b), A new DNA sequence assembly program. Nucleic Acids Research, 23(24), 4992-4999.

Bonfield, J. K. and Staden, R. (1996), Experiment files and their application during large-scale sequencing projects. DNA Sequence, 6, 109-117.

Boyer, R. S. and Moore, J. S. (1977), A Fast String Searching Algorithm. Commun. of the Assoc. for Comp. Mach., 20(10), 762-772.

Bray, N., Dubchak, I. and Pachter, L. (2003), AVID: A Global Alignment Program. Genome Research, 13, 97-102.

Bruce, A., Bray, D., Lewis, J., Raff, M., Roberts, K. and Watson, J. D. (1994), Molecular biology of the cell. Garland Publishing, 3rd edn.

Bucher, P. and Hofmann, K. (1996), A Sequence Similarity Search Algorithm Based on a Probabilistic Interpretation of an Alignment Scoring System. Intell. Systems Mol. Biol., 4, 44-51.

Camargo, A. A., Samaia, H. P. B., Dias-Neto, E., Simao, D. F., Migotto, I. A. et al. (2001), The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome. Proceedings of the National Academy of Sciences of the United States of America, 98(21), 12103-12108.

Chan, S. C., Wong, A. K. C. and Chiu, D. K. Y. (1992), A survey of multiple sequence comparison methods. Bulletin of Mathematical Biology, 54(4), 563-598.

Chao, K.-M., Hardison, R. C. and Miller, W. (1994), Recent Developments in Linear-Space Alignment Methods: A Survey. Journal of Computational Biology, 1(4), 271-291.

Chao, K.-M., Zhang, J., Ostell, J. and Miller, W. (1995), A local alignment tool for very long DNA sequences. Computer Applications in the Biosciences, 11(2), 147-153.

Chen, T. and Skiena, S. S. (2000), A case study in genome-level fragment assembly. Bioinformatics, 16(6), 494-500.

Cheung, J., Estivill, X., Khaja, R., MacDonald, J. R., Lau, K., Tsui, L.-C. and Scherer, S. W. (2003), Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence. Genome Biology, 4(4), R25.1-R25.10.

Chevreux, B., Pfisterer, T., Drescher, B., Driesel, A. J., Müller, W. E., Wetter, T. and Suhai, S. (2004), Using the miraEST Assembler for Reliable and Automated mRNA Transcript Assembly and SNP Detection in Sequenced ESTs. Genome Research, 14(6).

Chevreux, B., Pfisterer, T. and Suhai, S. (2000), Automatic Assembly and Editing of Genomic Sequences. Genomics and Proteomics - Functional and Computational Aspects, Kluwer Academic/Plenum Publishers, New York, chap. 5, pp. 51-65.

Chevreux, B., Wetter, T. and Suhai, S. (1999), Genome Sequence Assembly Using Trace Signals and Additional Sequence Information. GCB99 (1999), pp. 45-56.

Chou, H.-H. and Holmes, M. H. (2001), DNA sequence quality trimming and vector removal. Bioinformatics, 17(12), 1093-1104.

Dardel, F. (1985), A microcomputer program for comparison and alignment of DNA sequence gel readings. Computer Applications in the Biosciences, 1(3), 173-175.

Dear, S., Durbin, R., Hillier, L., Marth, G., Thierry-Mieg, J. and Mott, R. (1998), Sequence Assembly with CAFTOOLS. Genome Research, 8, 260-267.

Delcher, A. L., Phillippy, A., Carlton, J. and Salzberg, S. L. (2002), Fast algorithms for large-scale genome alignment and comparison. Nucleic Acids Research, 30(11), 2478-2483.

DOE (1992), DOE Human Genome Program: Primer on Molecular Genetics. Tech. rep., U.S. Department of Energy; Office of Energy Research; Office of Health and Environmental Research, Washington, DC 20585.

Durbin, R. and Dear, S. (1998), Base Qualities Help Sequencing Software. Genome Research, 8, 161-162.

Eichler, E. E. (2001), Segmental Duplications: What's Missing, Misassigned, and Misassembled - and Should We Care? Genome Research, 11, 653-656.

Engle, M. and Burks, C. (1993), Artificially generated data sets for testing DNA fragment assembly algorithms. Genomics, 16, 286-288.

Engle, M. and Burks, C. (1994), GenFrag 2.2: New features for more robust fragment assembly benchmarks. Computer Applications in the Biosciences, 10, 567-568.

Ewing, B. and Green, P. (1998), Base-Calling of Automated Sequencer Traces Using Phred. II. Error Probabilities. Genome Research, 8, 186-194.

Ewing, B., Hillier, L., Wendl, M. C. and Green, P. (1998), Base-Calling of Automated Sequencer Traces Using Phred. I. Accuracy Assessment. Genome Research, 8, 175-185.

Felsenstein, J., Sawyer, S. and Kochin, R. (1982), An efficient method for matching nucleic acid sequences. Nucleic Acids Research, 10, 133-139.

GCB99 (1999), Computer Science and Biology: Proceedings of the German Conference on Bioinformatics GCB '99, GBF-Braunschweig, Dep. of Bioinformatics.

Giegerich, R. (2000), A systematic approach to dynamic programming in bioinformatics. Bioinformatics, 16(8), 665-77.

Giegerich, R. and Wheeler, D. (1996), Pairwise Sequence Alignment. http://www.techfak.uni-bielefeld.de/bcd/Curric/PrwAli/prwali.html.

Giladi, E., Walker, M., Wang, J. and Volkmuth, W. (2002), SST: an algorithm for finding near-exact sequence matches in time proportional to the logarithm of the database size. Bioinformatics, 873-877.

Gordon, D., Abajian, C. and Green, P. (1998), Consed: A Graphical Tool for Sequence Finishing. Genome Research, 8, 195-202.

Gotoh, O. (1993), Optimal alignment between groups of sequences and its application to multiple sequence alignment. Computer Applications in the Biosciences, 9(3), 361-370.

Grice, J. A., Hughey, R. and Speck, D. (1997), Reduced Space Sequence Alignment. Computer Applications in the Biosciences, 13(1), 45-53.

Grillo, G., Attimonelli, M., Liuni, S. and Pesole, G. (1996), CLEANUP: a fast computer program for removing redundancies from nucleotide sequence databases. Computer Applications in the Biosciences, 12(1), 1-8.

Gronek, G. (1995a), Ähnlichkeiten gesucht: Fehlertoleranter Suchalgorithmus 'Shift-AND'. c't, 5, 294-301.

Gronek, G. (1995b), Optimal schnell: Schnelle Textsuche mit 'Optimal Mismatch'. c't, 3, 278-284.

Guan, X. and Uberbacher, E. C. (1996), Alignments of DNA and protein sequences containing frameshift errors. Computer Applications in the Biosciences, 12(1), 31-40.

Heber, S., Alekseyev, M., Sze, S., Tang, H. and Pevzner, P. (2002), Splicing graphs and EST assembly problem. Bioinformatics, S181-8, Suppl. 1.

Huang, X. (1994), On global sequence alignment. Computer Applications in the Biosciences, 10(3), 227-235.

Huang, X. (1996), An Improved Sequence Assembly Program. Genomics, 33, 21-31.

Huang, X. and Madan, A. (1999), CAP3: A DNA Sequence Assembly Program. Genome Research, 9, 868-877.

Idury, R. M. and Waterman, M. S. (1995), A New Algorithm for DNA Sequence Assembly. Journal of Computational Biology, 2(2), 291-306.

Jaffe, D. B., Butler, J., Gnerre, S., Mauceli, E., Lindblad-Toh, K., Mesirov, J. P., Zody, M. C. and Lander, E. S. (2003), Whole-Genome Sequence Assembly for Mammalian Genomes: Arachne 2. Genome Research, 13(1), 91-96.

Johnston, R. E., Mackenzie, J. J. and Dougherty, W. (1986), Assembly of overlapping DNA sequences by a program written in BASIC for 64K CP/M and MS-DOS IBM-compatible microcomputers. Nucleic Acids Research, 14(1), 517-527.

Katoh, K., Misawa, K., Kuma, K.-i. and Miyata, T. (2002), MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research, 30(14), 3059-3066.

Kececioglu, J. D. and Myers, E. W. (1992), Combinatorial algorithms for DNA sequence assembly. Tech. Rep. TR 92-37, University of California at Davis, University of Arizona.

Keith, J., Adams, P., Bryant, D., Kroese, D. P., Mitchelson, K., Cochran, D. and Lala, G. (2002), A simulated annealing algorithm for finding consensus sequences. Bioinformatics, 1494-1499.

Kent, J. W. (2002), BLAT - The BLAST-Like Alignment Tool. Genome Research, 12, 656-664.

Kleinjung, J., Douglas, N. and Heringa, J. (2002), Parallelized multiple alignment. Bioinformatics, 1270-1271.

Klug, W. S. and Cummings, M. R. (1996), Essentials of Genetics. Prentice Hall, 2nd edn.

Kumar, S. and Rzhetsky, A. (1996), Evolutionary relationships of eukaryotic kingdoms. Journal of Molecular Evolution, 42, 183-193.

Lario, A., González, A. and Dorado, G. (1997), Automated Laser-Induced Fluorescence DNA Sequencing: Equalizing Signal-to-Noise Ratios Significantly Enhances Overall Performance. Analytical Biochemistry, 247, 30-33.

Lassmann, T. and Sonnhammer, E. L. (2002), Quality assessment of multiple alignment programs. FEBS Letters, 529(1), 126-130.

Lawrence, C. B., Honda, S., Parrott, N. W., Flood, T. C., Ghu, L., Zhang, L., Jain, M., Larson, S. and Myers, E. W. (1994), The Genome Reconstruction Manager: A Software Environment for Supporting High-Throughput DNA Sequencing. Genomics, 23, 192-201.

Lee, C., Grasso, C. and Sharlow, M. F. (2002), Multiple sequence alignment using partial order graphs. Bioinformatics, 18(3), 452-464.

Lipshutz, R. J., Taverner, F., Henessy, K., Hartzell, G. and Davis, R. (1994), DNA Sequence Confidence Estimation. Genomics, 19, 417-424.

Ma, B., Tromp, J. and Li, M. (2002), PatternHunter: faster and more sensitive homology search. Bioinformatics, 18(3), 440-445.

Miller, W. (2001), Comparison of genomic DNA sequences: solved and unsolved problems. Bioinformatics, 17(5), 391-397.

Morgenstern, B., Dress, A. and Werner, T. (1996), Multiple DNA and protein sequence alignment based on segment-to-segment comparison. Proceedings of the National Academy of Sciences USA, 93, 12098-12103.

Morgenstern, B., Goel, S., Sczyrba, A. and Dress, A. (2003), AltAVisT: Comparing alternative multiple sequence alignments. Bioinformatics, 19(3), 425-426.

Müller, W. E. (2001), How was metazoan threshold crossed: the hypothetical Urmetazoa (part A). Comparative Biochemistry and Physiology, 129, 433-460.

Myers, E. W. (1991), An Overview of Sequence Comparison Algorithms in Molecular Biology. Tech. Rep. 29, Department of Computer Science; The University of Arizona, Tucson, Arizona 85721.

Myers, E. W. (1994), Advances in Sequence Assembly, Academic Press. pp. 231-238.

Myers, E. W. (1995), Toward Simplifying and Accurately Formulating Fragment Assembly. Journal of Computational Biology, 2(2), 275-290.

Myers, G. (1999), A Whole Genome Assembler for Drosophila. GCB99 (1999), p. 44.

Myers, G., Selznick, S., Zhang, Z. and Miller, W. (1996), Progressive Multiple Alignment with Constraints. Journal of Computational Biology, 3(4), 563-572.

Needleman, S. and Wunsch, C. (1970), A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 48(3), 443-453.

Nickerson, D. A., Taylor, S. L. and Rieder, M. J. (2000), Identifying Single Nucleotide Polymorphisms (SNPs) in Human Candidate Genes. Research Abstracts from the DOE Human Genome Program Contractor-Grantee Workshop VIII, February 27-March 2, Santa Fe, NM.

Ning, Z., Cox, A. J. and Mullikin, J. C. (2001), SSAHA: A Fast Search Method for Large DNA Databases. Genome Research, 11, 1725-1729.

Notredame, C. (2002), Recent progress in multiple sequence alignment: a survey. Pharmacogenomics, 3, 131-144.

Notredame, C. and Higgins, D. G. (1996), SAGA: sequence alignment by genetic algorithm. Nucleic Acids Research, 24(8), 1515-1524.

Notredame, C., Holm, L. and Higgins, D. G. (1998), COFFEE: an objective function for multiple sequence alignments. Bioinformatics, 14(5), 407-422.

Otu, H. H. and Sayood, K. (2003), A divide-and-conquer approach to fragment assembly. Bioinformatics, 19(1), 22-29.

Paracel (2002a), Paracel Filtering Package User Manual. Paracel Inc., 1055 E. Colorado Blvd; Pasadena, CA 91106.

Paracel (2002b), PGA: Paracel Genome Assembler User Manual. Paracel Inc., 1055 E. Colorado Blvd; Pasadena, CA 91106.

Paracel (2002c), PTA: Paracel TranscriptAssembler User Manual. Paracel Inc., 1055 E. Colorado Blvd; Pasadena, CA 91106.

Parsons, R., Forrest, S. and Burks, C. (1993), Genetic Algorithms for DNA Sequence Assembly. Hunter, L., Searls, D. B. and Shavlik, J. W. (eds.), Proc. of the 1st International Conference on Intelligent Systems for Molecular Biology, AAAI, Bethesda, MD, USA, pp. 310-318, ISBN 0-929280-47-4.

Pearson, W. R. (1995), Comparison of Methods for Searching Protein Sequence Databases. Protein Science, 4, 1145-1160.

Pearson, W. R. (1998), Empirical Statistical Estimates for Sequence Similarity Searches. Journal of Molecular Biology, 276, 71-84.

Peltola, H., Söderlund, H. and Ukkonen, E. (1984), SEQAID: a DNA sequence assembling program based on a mathematical model. Nucleic Acids Research, 12(1), 307-321.

Pevzner, P. A. and Tang, H. (2001), Fragment assembly with double-barreled data. Bioinformatics, 17, S225-S233, Suppl. 1.

Pevzner, P. A., Tang, H. and Waterman, M. (2001), An Eulerian path approach to DNA fragment assembly. Proceedings of the National Academy of Sciences USA, 98, 9748-9753.

Pfisterer, T. and Wetter, T. (1999), Computer Assisted Editing of Genomic Sequences - Why and How We Evaluated a Prototype, Springer-Verlag, Berlin Heidelberg New York. Lecture Notes in Artificial Intelligence; Subseries of Lecture Notes in Computer Science, pp. 201-209.

Prunella, N., Liuni, S., Attimonelli, M. and Pesole, G. (1993), FASTPAT: a fast and efficient algorithm for string searching in DNA sequences. Computer Applications in the Biosciences, 9(5), 541-545.

Rajasekaran, S., Jin, X. and Spouge, J. (2002), The efficient computation of position-specific match scores with the fast Fourier transform. Journal of Computational Biology, 9(1), 23-33.

Reinert, K., Stoye, J. and Will, T. (2000), An iterative method for faster sum-of-pairs multiple sequence alignment. Bioinformatics, 16(9), 808-814.

Richterich, P. (1998), Estimation of Errors in "Raw" DNA Sequences: A Validation Study (Letter). Genome Research, 8, 251-259.

Rosenblum, B., Lee, L., Spurgeon, S., Khan, S., Menchen, S., Heiner, C. and Chen, S. (1997), New dye-labeled terminators for improved DNA sequencing patterns. Nucleic Acids Research, 25(22), 4500-4504.

Sanders, J. Z., Petterson, A. A., Hughes, P. J., Connell, C. R., Raff, M., Menchen, S., Hood, L. E. and Teplow, D. B. (1991), Imaging as a tool for improving length and accuracy of sequence analysis in automated fluorescence-based DNA sequencing. Electrophoresis, 12, 3-11.

Sanger, F., Nicklen, S. and Coulson, A. (1977), DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences USA, 74, 5463-5467.

Schlosshauer, M. and Ohlsson, M. (2002), A novel approach to local reliability of sequence alignments. Bioinformatics, 18(6), 847-854.

Schuler, G. D. (1997), Pieces of the puzzle: expressed sequence tags and the catalog of human genes. J Mol Med, 75, 694-698.

Schuler, G. D. (1998), Sequence Alignment and Database Searching, Wiley-Liss, Inc. pp. 145-171, ISBN 0-471-19196-5.

Shpaer, E. G., Robinson, M., Yee, D., Candlin, J. D., Mines, R. and Hunkapiller, T. (1996), Sensitivity and Selectivity in Protein Similarity Searches: A Comparison of Smith-Waterman in Hardware to BLAST and FASTA. Genomics, 38(2), 179-191.

Smith, T. F. and Waterman, M. S. (1981), Identification of Common Molecular Subsequences. Journal of Molecular Biology, 147, 195-197.

Smith, T. F., Waterman, M. S. and Fitch, W. M. (1981), Comparative Biosequence Metrics. Journal of Molecular Evolution, 18, 38-46.

Staden, R. (1984), Computer methods to aid the determination and analysis of DNA sequences. Biochemical Society Transactions, 12(6), 1005-1008.

Staden, R. (1989), Methods for calculating the probabilities of finding patterns in sequences. Computer Applications in the Biosciences, 5(2), 89-96.

Staden, R. (1996), The Staden Sequence Analysis Package. Molecular Biotechnology, 5, 233-241.

Staden, R., Bonfield, J. and Beal, K. (1997), The New Staden Package Manual - Part 1. Medical Research Council, Laboratory of Molecular Biology.

Stoye, J. (1998), Multiple sequence alignment with the divide-and-conquer method. Gene / GC, 211, 45-56.

Sunday, D. M. (1990), A very fast substring search algorithm. Commun. of the Assoc. for Comp. Mach., 33(8), 132-142.

Tammi, M. T., Arner, E., Britton, T. and Andersson, B. (2002), Separation of nearly identical repeats in shotgun assemblies using defined nucleotide positions, DNPs. Bioinformatics, 18(3), 379-88.

Thompson, J. D., Plewniak, F. and Poch, O. (1999a), A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Research, 27(13), 2682-2690.

Thompson, J. D., Plewniak, F. and Poch, O. (1999b), A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Research, 27(13), 2682-2690.

Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., Gocayne, J. D., Amanatides, P., Ballew, R. M., Huson, D. H., Wortman, J. R. et al. (2001), The Sequence of the Human Genome. Science, 1304-1351.

Walther, D., Bartha, G. and Morris, M. (2001), Basecalling with LifeTrace. Genome Research, 11, 875-888.

Wang, J., Wong, G. K.-S., Wang, J., Ni, P., Han, Y., Huang, X., Zhang, J., Ye, C., Zhang, Y., Hu, J., Zhang, K., Xu, X., Cong, L., Lu, H., Ren, X., Ren, X., Dai, D., He, J., Tao, L., Passey, D. A., Yang, H., Yu, J. and Li, S. (2002), RePS: A Sequence Assembler That Masks Exact Repeats Identified from the Shotgun Data. Genome Research, 12, 824-831.

Wang, L. and Jiang, T. (1994), On the Complexity of Multiple Sequence Alignment. Journal of Computational Biology, 1(4), 337-348.

Wilbur, W. and Lipman, D. J. (1983), Rapid similarity searches of nucleic acid and protein data banks. Proceedings of the National Academy of Sciences USA, 80, 726-730.

Wu, S. and Manber, U. (1992a), Approximate Pattern Matching. Byte Magazine, 11, 281-292.

Wu, S. and Manber, U. (1992b), Fast Text Searching Allowing Errors. Commun. of the Assoc. for Comp. Mach., 35(10), 83-91.

Xu, Y., Mural, R. J. and Uberbacher, E. C. (1995), Correcting sequencing errors in DNA coding regions using dynamic programming approach. Computer Applications in the Biosciences, 11(2), 117-124.

Yu, Z., Li, T., Zhao, J. and Luo, J. (2002), PGAAS: a prokaryotic genome assembly assistant system. Bioinformatics, 18(5), 661-665.

Zhang, C. and Wong, A. K. (1997), A genetic algorithm for multiple molecular sequence alignment. Computer Applications in the Biosciences, 13(6), 565-581.



Bastien Chevreux 2006-05-11