The fpseq package provides tools for working with fluorescent protein (FP) sequences and mutations, including a simple Python function for retrieving FP sequences from fpbase.org. It forms the basis of sequence analysis on FPbase, but it can also be used independently as a basic way to retrieve, compare, and mutate FP sequences using the same HGVS notation that is typically used in papers.
Source code on GitHub:
Example usage:
    In [1]: from fpseq import from_fpbase

    # retrieve sequence from FPbase.org
    In [2]: avGFP = from_fpbase('avgfp')

    In [3]: avGFP  # sequences from FPbase.org
    Out[3]:
    Protein
    ------------------------------------------------------
    MSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT
    GKLPVPWPTL VTTFSYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF
    KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV
    YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY
    LSTQSALSKD PNEKRDHMVL LEFVTAAGIT HGMDELYK

    In [4]: EGFP = from_fpbase('egfp')

    # calculate HGVS mutation string
    In [6]: avGFP.mutations_to(EGFP)
    Out[6]: <MutationSet: M1_S2insV/F64L/S65T/H231L>

    In [7]: mEGFP = from_fpbase('megfp')

    In [8]: EGFP.mutations_to(mEGFP)
    Out[8]: <MutationSet: A207K>

    # A207K does not match the literature, because of V1a...
    # use reference parameter to enforce
    # position numbering relative to avGFP
    In [9]: EGFP.mutations_to(mEGFP, reference=avGFP)
    Out[9]: <MutationSet: A206K>

    In [10]: mCherry = from_fpbase('mcherry')

    # attempt to apply the 'mCherry2' mutation
    # reported in Shen et al. (2017) throws an error
    # because the positions do not align with mCherry sequence
    In [11]: newseq = mCherry.mutate('K92N/K138C/K139R/S147T/N196D/T202L')
    ---------------------------------------------------------------------
    SequenceMismatch: Mutation K138C does not align with the parent seq: PSD>G<PVM...
    But a match was found 5 positions away: K97N/K143C/K144R/S152T/N201D/T207L

    # use correct_offset to apply a shift to the mutation set, if a match is found
    In [12]: newseq, offset = mCherry.mutate('K92N/K138C/K139R/S147T/N196D/T202L', correct_offset=True)
    UserWarning: An offset of 5 amino acids was detected between the sequence and the mutation set, and automatically corrected

    In [13]: newseq == from_fpbase('mcherry2')  # sequence equivalence checks
    Out[13]: True
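
The idea behind offset correction can be illustrated without fpseq at all. The following is a minimal sketch in plain Python, not fpseq's actual implementation: the helper names (`parse_substitutions`, `matches_at_offset`, `find_offset`, `apply_substitutions`) and the `max_shift` search window are hypothetical, and only simple substitutions such as `F64L` are handled (no insertions or deletions). It checks whether each wild-type residue sits at the stated position and, if not, looks for a single shift that makes every mutation line up. The test sequence is the avGFP sequence shown above with a valine inserted after M1, which shifts all later positions by one.

```python
import re

# Toy parser for simple substitutions such as "F64L/S65T".
# Insertions, deletions, and other HGVS forms are deliberately not handled.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitutions(mutation_string):
    """Split 'F64L/S65T' into [('F', 64, 'L'), ('S', 65, 'T')]."""
    parsed = []
    for token in mutation_string.split("/"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"unsupported mutation token: {token!r}")
        wild_type, position, replacement = match.groups()
        parsed.append((wild_type, int(position), replacement))
    return parsed


def matches_at_offset(sequence, substitutions, offset):
    """Return True if every wild-type residue is found at position + offset."""
    for wild_type, position, _ in substitutions:
        index = position + offset - 1  # 1-based position to 0-based index
        if index < 0 or index >= len(sequence) or sequence[index] != wild_type:
            return False
    return True


def find_offset(sequence, substitutions, max_shift=20):
    """Return the smallest-magnitude shift that aligns all substitutions, or None."""
    for offset in sorted(range(-max_shift, max_shift + 1), key=abs):
        if matches_at_offset(sequence, substitutions, offset):
            return offset
    return None


def apply_substitutions(sequence, mutation_string, correct_offset=False):
    """Apply the substitutions and return (new_sequence, offset_used)."""
    substitutions = parse_substitutions(mutation_string)
    offset = 0
    if not matches_at_offset(sequence, substitutions, 0):
        offset = find_offset(sequence, substitutions) if correct_offset else None
        if offset is None:
            raise ValueError("mutations do not align with the sequence")
    residues = list(sequence)
    for _, position, replacement in substitutions:
        residues[position + offset - 1] = replacement
    return "".join(residues), offset


# avGFP sequence as printed in the session above (spaces removed).
AVGFP = (
    "MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTT"
    "GKLPVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFF"
    "KDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNV"
    "YIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHY"
    "LSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK"
)

# Toy shifted sequence: avGFP with a valine inserted after M1.
shifted = AVGFP[0] + "V" + AVGFP[1:]

new_seq, offset = apply_substitutions(shifted, "F64L/S65T", correct_offset=True)
print(offset)  # 1
```

Searching shifts in order of increasing magnitude means the closest consistent alignment wins; requiring every mutation in the set to match at the same shift keeps the chance of a spurious match low for sets with several mutations.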