Usage
FastaParser allows parsing and writing FASTA files. It also provides classes that represent individual FASTA sequences and individual letter codes, which can be used independently of the parsing and writing classes. The classes provided by FastaParser are described in detail in the API Specification section of this documentation.
To use FastaParser in a project:
import fastaparser
Reading FASTA files
To read a FASTA file with FastaParser the file should first be opened for reading:
with open('fasta_file.fasta', 'r') as fasta:
    ...
or
fasta = open('fasta_file.fasta', 'r')
Then the opened file handle should be passed to the Reader class:
reader = fastaparser.Reader(fasta)
The reader object is an iterator over the FASTA sequence(s) contained in the file. Therefore, to parse the sequence(s):
for sequence in reader:
    sequence.id  # do something with the id
    sequence.description  # do something with the description
    sequence.sequence  # do something with the sequence of nucleotides/amino acids
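To make the relationship between a header line and the id and description fields concrete, here is a minimal standard-library sketch. It assumes the header has the form '>id description', as in the writing examples later in this section. The split_header function is hypothetical and is not part of FastaParser; the library's own parsing may handle more cases.

```python
def split_header(header):
    """Split a FASTA header line into (id, description).

    Hypothetical helper for illustration only; not FastaParser's API.
    """
    # Drop the leading '>' if present.
    text = header[1:] if header.startswith('>') else header
    # The id is everything up to the first whitespace; the rest is the description.
    parts = text.strip().split(maxsplit=1)
    seq_id = parts[0] if parts else ''
    description = parts[1] if len(parts) > 1 else ''
    return seq_id, description


print(split_header('>id123 This is a sequence'))
# ('id123', 'This is a sequence')
```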
By default the reader parses the FASTA sequences into FastaSequence objects, which are feature rich. This behaviour can be changed by instantiating the Reader class with a different parse_method parameter. The default value of parse_method is 'rich', the alternate value is 'quick':
reader = fastaparser.Reader(fasta, parse_method='quick')
This alternate method simply parses each FASTA record into two separate properties, the header and the sequence. It is faster than the default but lacks the richer features of FastaSequence objects.
for sequence in reader:
    sequence.header  # do something with the header (contains the '>')
    sequence.sequence  # do something with the sequence of nucleotides/amino acids
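As a rough picture of what header-and-sequence parsing involves, the sketch below does the same kind of split using only the standard library. It is not FastaParser's implementation: iter_fasta is a hypothetical name, and joining wrapped sequence lines into one string is an assumption about the desired result.

```python
import io


def iter_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text handle.

    Hypothetical helper for illustration only; not FastaParser's API.
    """
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith('>'):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, ''.join(chunks)
            header, chunks = line, []
        elif header is not None:
            # Sequence data may span several lines; collect them all.
            chunks.append(line)
    if header is not None:
        yield header, ''.join(chunks)


example = io.StringIO('>id123 This is a sequence\nACTG\nACTG\n>id456\nABCD\n')
records = list(iter_fasta(example))
print(records)
# [('>id123 This is a sequence', 'ACTGACTG'), ('>id456', 'ABCD')]
```

As in the 'quick' method described above, the header keeps its leading '>'.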
Writing FASTA files
To write a FASTA file with FastaParser the file should first be opened for writing:
with open('fasta_file.fasta', 'w') as fasta:
    ...
or
fasta = open('fasta_file.fasta', 'w')
The file can also be opened in appending mode, to add FASTA sequences to an existing file:
with open('fasta_file.fasta', 'a') as fasta:
    ...
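The difference between the two modes is standard Python file behaviour rather than anything specific to FastaParser: 'w' truncates an existing file, while 'a' keeps its contents and adds to the end. A small sketch using a temporary file and plain write calls (the record contents are made up for illustration):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'fasta_file.fasta')

    # 'w' creates the file (or empties it if it already exists).
    with open(path, 'w') as fasta:
        fasta.write('>id123 first\nACTG\n')

    # 'a' keeps the existing record and adds a second one after it.
    with open(path, 'a') as fasta:
        fasta.write('>id456 second\nABCD\n')
    with open(path) as fasta:
        after_append = fasta.read()

    # Opening with 'w' again discards both earlier records.
    with open(path, 'w') as fasta:
        fasta.write('>id789 third\nTTTT\n')
    with open(path) as fasta:
        after_rewrite = fasta.read()

print(after_append.count('>'))   # 2
print(after_rewrite.count('>'))  # 1
```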
Then the opened file handle should be passed to the Writer class:
writer = fastaparser.Writer(fasta)
The Writer class has 2 methods:
- writefasta
- writefastas
The writefasta method writes a single FASTA sequence to the provided file, and takes the following parameter:
- either a FastaSequence object
- or a tuple (header: str, sequence: str)
The writefastas method writes multiple FASTA sequences and takes an iterable of the same kinds of items (FastaSequence objects or (header, sequence) tuples).
The following examples would write the same single FASTA sequence to file, using the writefasta method:
# FastaSequence is accessed through the imported fastaparser package
fasta_sequence = fastaparser.FastaSequence(sequence='ACTG', id_='id123', description='This is a sequence')
writer.writefasta(fasta_sequence)
or
fasta_sequence = ('>id123 This is a sequence', 'ACTG')
writer.writefasta(fasta_sequence)
Using the writefastas method:
fasta_sequences = [
    fastaparser.FastaSequence(sequence='ACTG', id_='id123', description='This is a nucleotide sequence'),
    fastaparser.FastaSequence(sequence='ABCD', id_='id456', description='This is an aminoacid sequence')
]
writer.writefastas(fasta_sequences)
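As a rough illustration of what the (header, sequence) tuple form amounts to in the file, the sketch below writes such tuples to a text handle using only the standard library. The write_records function is hypothetical and not part of FastaParser. It assumes each sequence is written on a single line, so the library's actual output formatting (line wrapping, for example) may differ.

```python
import io


def write_records(handle, records):
    """Write (header, sequence) tuples to an open text handle.

    Hypothetical helper for illustration only; not FastaParser's API.
    """
    for header, sequence in records:
        # Make sure the header line starts with the '>' marker.
        if not header.startswith('>'):
            header = '>' + header
        handle.write(f'{header}\n{sequence}\n')


buffer = io.StringIO()
write_records(buffer, [
    ('>id123 This is a nucleotide sequence', 'ACTG'),
    ('>id456 This is an aminoacid sequence', 'ABCD'),
])
output = buffer.getvalue()
print(output)
```

An in-memory io.StringIO stands in for the opened file here; a handle returned by open works the same way.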