This document gives a short introduction to the possibilities of backbone assemblies and how to use them most effectively.
What are backbones?
In MIRA terminology, backbones are sequences to which shotgun sequences are assembled.
What sequences can be used as backbones?
Any valid DNA/RNA sequence in CAF, FASTA, EXP or Genbank (gbk, gbf) format can be used. In genome assembly, backbone sequences can be, for example, partly finished results of previous assemblies or even completed genomes of a closely related strain or species. In EST assembly, validated full length cDNA sequences are often used as backbones. Note that when using CAF files, the structure of the backbones is kept, i.e., contigs in the CAF file remain contigs and include all the read sequences they consist of (single reads remain, of course, single reads).
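As a rough illustration of the accepted formats, the sketch below guesses the backbone format from a file name. It is not part of MIRA: the function name guess_backbone_format and the mapping table are hypothetical, and only gbk and gbf are named as extensions in the text above; the .caf, .fasta and .exp extensions are assumptions made for the example.

```python
from pathlib import Path

# Hypothetical lookup table, not taken from MIRA itself.
# Only "gbk" and "gbf" are named as extensions in the text;
# the other extensions are assumed for illustration.
BACKBONE_FORMATS = {
    ".caf": "CAF",
    ".fasta": "FASTA",
    ".exp": "EXP",
    ".gbk": "Genbank",
    ".gbf": "Genbank",
}


def guess_backbone_format(filename: str) -> str:
    """Return the backbone format name guessed from the file extension."""
    suffix = Path(filename).suffix.lower()
    if suffix not in BACKBONE_FORMATS:
        raise ValueError(f"unrecognised backbone file extension: {suffix!r}")
    return BACKBONE_FORMATS[suffix]
```

A call such as guess_backbone_format("related_strain.gbk") would report Genbank, while an unknown extension raises a ValueError instead of silently picking a format.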
What ancillary information is loaded together with backbone sequences?
Basically, everything which is present in the file type of the backbone sequences is loaded:
for FASTA, this is, of course, only the sequence itself.
for Genbank, this is the sequence plus all features of the sequence (genes, CDSs, RNAs etc.), which are translated into Staden package and MIRA compatible tags.
for EXP, everything is imported (sequence, quality values, clippings
Please note that no computational operation such as clipping is applied to loaded backbone sequences; they are used as is.
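To make the FASTA case concrete, here is a minimal sketch of a FASTA reader in Python. It is not MIRA's loader; the function name read_fasta and the sample record are hypothetical. It only shows that a FASTA backbone yields a name and a sequence, with no quality values, clippings or features, and that the sequence is kept exactly as written in the file.

```python
import io
from typing import Dict, TextIO


def read_fasta(handle: TextIO) -> Dict[str, str]:
    """Read FASTA records into a dict mapping sequence name to sequence.

    The sequence is stored unchanged: no clipping, masking or case changes.
    """
    sequences: Dict[str, str] = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header line closes the previous record, if there was one.
            if name is not None:
                sequences[name] = "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    # Store the last record of the file.
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


# Hypothetical backbone record, used only for demonstration.
example = io.StringIO(">backbone1 example sequence\nACGTNNAC\nGTTA\n")
backbones = read_fasta(example)
```

Here backbones holds a single entry for backbone1 whose value is the two sequence lines joined together, including the N characters.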