Usage

FastaParser allows parsing and writing FASTA files. It also provides classes that represent individual FASTA sequences and individual letter codes, which can be used independently of the parsing and writing classes. The classes provided by FastaParser are described in detail in the API Specification section of this documentation.

To use FastaParser in a project:

import fastaparser

Reading FASTA files

To read a FASTA file with FastaParser the file should first be opened for reading:

with open('fasta_file.fasta', 'r') as fasta:
    ...

or

fasta = open('fasta_file.fasta', 'r')

Then the opened file handle should be passed to the Reader class:

reader = fastaparser.Reader(fasta)

reader is an iterator over the FASTA sequences contained in the file. To parse them, loop over it:

for sequence in reader:
    sequence.id             # do something with the id
    sequence.description    # do something with the description
    sequence.sequence       # do something with the sequence of nucleotides/amino acids

By default, the reader parses the FASTA sequences into FastaSequence objects, which are feature-rich. This behaviour can be changed by instantiating the Reader class with a different parse_method parameter. The default value of parse_method is 'rich'; the alternative is 'quick':

reader = fastaparser.Reader(fasta, parse_method='quick')

This alternative method simply parses the header and the sequence of each FASTA sequence into two separate properties. It is faster than the default but lacks the features of FastaSequence objects.

for sequence in reader:
    sequence.header         # do something with the header (contains the '>')
    sequence.sequence       # do something with the sequence of nucleotides/amino acids
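In 'quick' mode the id and description are not separated, so code that needs them has to take the header apart itself. The following is a minimal sketch of one way to do that, assuming the header is a plain string in the conventional '>id description' layout. split_header is a hypothetical helper written for this example, not part of FastaParser.

```python
def split_header(header):
    """Split a FASTA header line into (id, description).

    Hypothetical helper, not part of FastaParser. Assumes the header is a
    plain string such as '>id123 This is a sequence'.
    """
    # Drop the leading '>' if present, then trim surrounding whitespace.
    text = header[1:] if header.startswith('>') else header
    text = text.strip()
    # The id is the first whitespace-delimited word; the rest is the description.
    parts = text.split(maxsplit=1)
    id_ = parts[0] if parts else ''
    description = parts[1] if len(parts) > 1 else ''
    return id_, description


print(split_header('>id123 This is a sequence'))  # ('id123', 'This is a sequence')
```

Inside the loop above, this would be called as split_header(sequence.header).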

Writing FASTA files

To write a FASTA file with FastaParser the file should first be opened for writing:

with open('fasta_file.fasta', 'w') as fasta:
    ...

or

fasta = open('fasta_file.fasta', 'w')

The file can also be opened in appending mode, to add FASTA sequences to an existing file:

with open('fasta_file.fasta', 'a') as fasta:
    ...
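The difference between the 'w' and 'a' modes is standard Python file behaviour: 'w' truncates an existing file, while 'a' keeps its contents and adds to the end. The sketch below illustrates this with a scratch file. The records are made up for the example and are written with plain write calls only to keep it self-contained; in practice the Writer class would do the writing.

```python
import os
import tempfile

# Create a scratch file that already holds one FASTA record.
path = os.path.join(tempfile.mkdtemp(), 'fasta_file.fasta')
with open(path, 'w') as fasta:
    fasta.write('>id123 first record\nACTG\n')

# 'a' keeps the existing record and places new text after it.
with open(path, 'a') as fasta:
    fasta.write('>id456 second record\nGGCC\n')

with open(path, 'r') as fasta:
    contents = fasta.read()

# Count the header lines to see how many records the file now holds.
print(contents.count('>'))
```

Opening the same file with 'w' in the second step would have discarded the first record.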

Then the opened file handle should be passed to the Writer class:

writer = fastaparser.Writer(fasta)

The Writer class has two methods:

  • writefasta
  • writefastas

The writefasta method writes a single FASTA sequence to the provided file. It takes one parameter, which can be:

  • either a FastaSequence object
  • or a tuple (header : str, sequence : str)

The writefastas method writes multiple FASTA sequences. It takes an iterable whose items are of the same types that writefasta accepts.

Both of the following examples write the same single FASTA sequence to the file, using the writefasta method:

fasta_sequence = fastaparser.FastaSequence(sequence='ACTG', id_='id123', description='This is a sequence')
writer.writefasta(fasta_sequence)

or

fasta_sequence = ('>id123 This is a sequence', 'ACTG')
writer.writefasta(fasta_sequence)

Using the writefastas method:

fasta_sequences = [
    fastaparser.FastaSequence(sequence='ACTG', id_='id123', description='This is a nucleotide sequence'),
    fastaparser.FastaSequence(sequence='ABCD', id_='id456', description='This is an amino acid sequence')
]
writer.writefastas(fasta_sequences)
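Since writefastas accepts the same kinds of items as writefasta, a list of (header, sequence) tuples should work as well. The sketch below builds such a list from separate id, description and sequence values. to_fasta_tuple is a hypothetical helper written for this example, not part of FastaParser, and it assumes the header must carry the leading '>' as in the tuple example above.

```python
def to_fasta_tuple(id_, description, sequence):
    """Build a (header, sequence) tuple; hypothetical helper, not part of FastaParser."""
    # As in the tuple example above, the header carries the leading '>'.
    # rstrip() removes the trailing space left by an empty description.
    header = f'>{id_} {description}'.rstrip()
    return header, sequence


records = [
    ('id123', 'This is a nucleotide sequence', 'ACTG'),
    ('id456', 'This is an amino acid sequence', 'ABCD'),
]
fasta_sequences = [to_fasta_tuple(*record) for record in records]

# With a Writer created as shown earlier:
# writer.writefastas(fasta_sequences)
```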