What is fpseq?
The fpseq package provides tools for working with fluorescent protein (FP) sequences and mutations, including a simple Python function for retrieving FP sequences from fpbase.org. It forms the basis of sequence analysis on FPbase, but it can also be used independently as a basic way to retrieve, compare, and mutate FP sequences using the same HGVS notation that is typically used in papers.
Source code on GitHub: https://github.com/tlambert03/fpseq (package for working with protein sequences and mutations, and FPbase API wrapper to retrieve fluorescent protein sequences)
Example usage:
In [1]: from fpseq import from_fpbase

# retrieve sequence from FPbase.org
In [2]: avGFP = from_fpbase('avgfp')

In [3]: avGFP # sequences from FPbase.org
Out[3]:
Protein
------------------------------------------------------
MSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT
GKLPVPWPTL VTTFSYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF
KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV
YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY
LSTQSALSKD PNEKRDHMVL LEFVTAAGIT HGMDELYK

In [4]: EGFP = from_fpbase('egfp')

# calculate HGVS mutation string
In [6]: avGFP.mutations_to(EGFP)
Out[6]: <MutationSet: M1_S2insV/F64L/S65T/H231L>

In [7]: mEGFP = from_fpbase('megfp')

In [8]: EGFP.mutations_to(mEGFP)
Out[8]: <MutationSet: A207K>

# A207K does not match the literature, because of V1a...
# use reference parameter to enforce
# position numbering relative to avGFP
In [9]: EGFP.mutations_to(mEGFP, reference=avGFP)
Out[9]: <MutationSet: A206K>

In [10]: mCherry = from_fpbase('mcherry')

# attempt to apply the ‘mCherry2’ mutation
# reported in Shen et al. (2017) throws an error
# because the positions do not align with mCherry sequence
In [11]: newseq = mCherry.mutate('K92N/K138C/K139R/S147T/N196D/T202L')
---------------------------------------------------------------------
SequenceMismatch: Mutation K138C does not align with the parent seq: PSD>G<PVM...
But a match was found 5 positions away: K97N/K143C/K144R/S152T/N201D/T207L

# use correct_offset to apply a shift to the mutation set, if a match is found
In [12]: newseq, offset = mCherry.mutate('K92N/K138C/K139R/S147T/N196D/T202L', correct_offset=True)
UserWarning: An offset of 5 amino acids was detected between the sequence and the mutation set, and automatically corrected

In [13]: newseq == from_fpbase('mcherry2') # sequence equivalence checks
Out[13]: True
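
To make the idea behind the offset correction in the session above more concrete, here is a minimal standard-library sketch. It is not fpseq's implementation: the function names (parse_substitutions, matches_parent, find_offset, apply_substitutions), the max_shift search window, and the toy sequence are all assumptions made for illustration. It handles only simple substitutions such as K92N, not insertions like M1_S2insV.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical helper types and names; this is NOT the fpseq API.
# One substitution such as "K92N": (parent residue, 1-based position, new residue).
Substitution = Tuple[str, int, str]

_SUB = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_substitutions(mutation_string: str) -> List[Substitution]:
    """Parse slash-separated simple substitutions, e.g. 'F64L/S65T'."""
    subs = []
    for token in mutation_string.split("/"):
        match = _SUB.fullmatch(token.strip())
        if match is None:
            raise ValueError(f"unsupported mutation token: {token!r}")
        before, position, after = match.groups()
        subs.append((before, int(position), after))
    return subs


def matches_parent(seq: str, subs: List[Substitution], offset: int = 0) -> bool:
    """True if every parent residue is found at (position + offset) in seq."""
    for before, position, _ in subs:
        index = position + offset - 1  # 1-based position -> 0-based index
        if not 0 <= index < len(seq) or seq[index] != before:
            return False
    return True


def find_offset(seq: str, subs: List[Substitution], max_shift: int = 20) -> Optional[int]:
    """Return the smallest-magnitude shift at which all substitutions align."""
    for shift in sorted(range(-max_shift, max_shift + 1), key=abs):
        if matches_parent(seq, subs, shift):
            return shift
    return None


def apply_substitutions(
    seq: str, mutation_string: str, correct_offset: bool = False
) -> Tuple[str, int]:
    """Apply substitutions to seq and return (new sequence, offset used)."""
    subs = parse_substitutions(mutation_string)
    offset = 0
    if not matches_parent(seq, subs):
        found = find_offset(seq, subs)
        if found is None:
            raise ValueError("mutations do not align with the parent sequence")
        if not correct_offset:
            raise ValueError(
                f"mutations do not align, but a match was found {found} positions away"
            )
        offset = found
    residues = list(seq)
    for _, position, after in subs:
        residues[position + offset - 1] = after
    return "".join(residues), offset


# Toy parent with two extra leading residues ("GS") relative to the numbering
# used by the mutation string, so the mutations align only after a shift of 2.
toy_parent = "GSMKTAYIAKQR"
new_seq, used_offset = apply_substitutions(toy_parent, "K2N/Y5F", correct_offset=True)
```

The sketch tries shifts in order of increasing magnitude so that the smallest consistent offset is chosen. A real implementation would also need to handle insertions, deletions, and ambiguous matches, which this one does not attempt.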