Glossary of Terms
Proteins are the main objects in the database, and each protein possesses the following attributes.
The name of the fluorescent protein. No two proteins in the database can have the same name.
An abbreviated, URL-friendly version of the name (must be unique in the database). All lowercase, with no spaces or periods. For instance:
- mRFP1.2 → mrfp12
- PA-CFP2 → pa-cfp2
Slugs are generated automatically from the name and appear in the URL of every protein page. For instance, the page for mNeonGreen is https://www.fpbase.org/protein/mneongreen/
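The slug rules above can be sketched in a few lines of Python. This is only an illustration that reproduces the examples given here; `make_slug` is a hypothetical helper, not the function FPbase uses, and the choice to turn whitespace into hyphens is an assumption.

```python
import re


def make_slug(name: str) -> str:
    """Hypothetical slug helper: lowercase, no periods, no spaces."""
    slug = name.strip().lower()
    slug = slug.replace(".", "")             # periods are dropped: mRFP1.2 -> mrfp12
    slug = re.sub(r"\s+", "-", slug)         # assumption: whitespace becomes a hyphen
    slug = re.sub(r"[^a-z0-9_-]", "", slug)  # drop anything else that is not URL-friendly
    return slug


for name in ("mRFP1.2", "PA-CFP2", "mNeonGreen"):
    print(name, "->", make_slug(name))
```

Because slugs must be unique, a real implementation would also need to check the generated value against existing entries.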
Amino acid sequence. No two proteins in the database can have the same amino acid sequence. Sequences are specified with single-letter amino acid codes, with no degenerate codes, and no stop codons. Sequences should start with M (Methionine) and should not contain His tags or other non-FP fusions.
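As a rough illustration of these sequence rules, the following sketch checks a candidate sequence before submission. The function name `check_sequence` and the His-tag heuristic (a run of six histidines) are assumptions made for this example; they do not describe the validation FPbase actually performs.

```python
# The 20 standard single-letter amino acid codes (no degenerate codes such as X, B, Z).
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def check_sequence(seq: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    seq = seq.strip().upper()
    if not seq:
        return ["sequence is empty"]
    problems = []
    bad = sorted(set(seq) - STANDARD_AA)  # catches '*', degenerate codes, digits, etc.
    if bad:
        problems.append("non-standard characters: " + "".join(bad))
    if not seq.startswith("M"):
        problems.append("does not start with M (methionine)")
    if "HHHHHH" in seq:  # heuristic only: six consecutive histidines
        problems.append("possible His tag")
    return problems


print(check_sequence("MVSKGEE"))   # no problems expected
print(check_sequence("VSKGEE*"))   # stop codon and missing start M
```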
The molecular weight is calculated automatically from the amino acid sequence.
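One common way to estimate molecular weight from a sequence is to sum the average residue masses and add one water molecule for the free termini. The sketch below follows that approach; it is an assumption about the general method, not a description of FPbase's own calculation, and it ignores any mass change caused by chromophore formation.

```python
# Average residue masses in daltons (amino acid minus one water).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one H2O for the N- and C-terminal groups


def molecular_weight_da(seq: str) -> float:
    """Approximate average molecular weight of a peptide chain in daltons."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in seq.upper()) + WATER


print(f"{molecular_weight_da('MVSKGEE') / 1000:.3f} kDa")
```

A character outside the 20 standard codes raises `KeyError`, which is consistent with the rule that sequences contain no degenerate codes or stop codons.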
Some "fluorescent" proteins (for instance, those derived from bacterial phytochrome photoreceptors) aren't independently/intrinsically fluorescent; rather, they autocatalytically bind and activate the fluorescence of an extrinsic chromophore (here, referred to as the "cofactor"). Often this is a small molecule, most commonly biliverdin (BV). This cofactor must be present in the environment in order to observe fluorescence. Biliverdin, for example, is a product of heme catabolism, and is therefore present to some extent in most mammalian systems.
The dimerization tendency of the fluorescent protein (monomer, dimer, etc.).
Many proteins are monomers at low concentrations but dimerize as the concentration increases. Furthermore, some proteins (such as TagRFP) were described as monomers in early publications and later found to be weak dimers. A single classification of oligomerization tendency is therefore rarely a complete characterization of the protein.
Proteins can have zero or more (usually light-induced) transitions between different fluorescent or non-fluorescent states. See Transitions below.
Each protein is classified into one of the following photo-switching types, based on the number of states and transitions that the protein displays:
- Multi-Photochromic: Usually greater than 2 states, exhibiting both conversion and switching (e.g. IrisFP).
- Multistate: Generic categorization for proteins with more than one state.
In reality, all fluorescent proteins have many potential states that they can be in, as they progress through maturation pathways or depending on their environment (pH, etc.). So classifying any protein as "basic" is an oversimplification. The intention here is to classify proteins for their intended purpose (or, if it becomes known after development that a particularly
The original organism from which the protein was cloned and evolved (e.g. Aequorea victoria for all proteins that were evolved from the original avGFP). In cases where the protein was computationally designed from a synthetic template, the NCBI synthetic construct is used as a parental "organism".
States represent a collection of attributes related to the fluorescent properties of the protein. A protein can have multiple states, and transitions between them.
The name of the state (e.g. "ON", "Green", "default", etc.). For single-state proteins, the name "default" is used.
The excitation maximum of the state in nanometers. This value is stored independently of any Spectra (and may be slightly different).
The emission maximum of the state in nanometers. This value is stored independently of any Spectra (and may be slightly different).
The difference (in wavelength) between the excitation maximum and the emission maximum is referred to as the Stokes shift.
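A minimal sketch of the Stokes shift calculation follows. The function names and the example wavelengths are illustrative only and are not taken from any database entry. The second function expresses the same shift in wavenumbers (cm⁻¹), which is sometimes preferred because it is proportional to the energy difference.

```python
def stokes_shift_nm(ex_max_nm: float, em_max_nm: float) -> float:
    """Stokes shift as a wavelength difference, in nanometers."""
    return em_max_nm - ex_max_nm


def stokes_shift_wavenumber(ex_max_nm: float, em_max_nm: float) -> float:
    """Stokes shift as an energy-proportional difference, in cm^-1."""
    # 1 cm = 1e7 nm, so a wavelength in nm converts to wavenumber as 1e7 / nm.
    return 1e7 / ex_max_nm - 1e7 / em_max_nm


ex_max, em_max = 500.0, 520.0  # illustrative values, not a real protein
print(stokes_shift_nm(ex_max, em_max), "nm")
print(round(stokes_shift_wavenumber(ex_max, em_max), 1), "cm^-1")
```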
The molar extinction coefficient (M⁻¹ cm⁻¹) of the state is a measure of how strongly the protein absorbs light at a given wavelength.
Quantum yield represents the ratio of photons emitted to photons absorbed. It is the likelihood that, once excited by a photon, the protein (state) will emit a photon.
Practical brightness of a fluorescent protein depends on additional protein characteristics such as folding and maturation efficiency and pKa. Molecular brightness may not reflect the actual brightness of a given protein in a biological experiment.
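Molecular brightness is commonly taken as the product of the extinction coefficient and the quantum yield, often divided by 1000 for readability. The sketch below assumes that convention; the function name and the input values are illustrative and do not describe a specific protein.

```python
def molecular_brightness(ext_coeff: float, quantum_yield: float) -> float:
    """Extinction coefficient (M^-1 cm^-1) times quantum yield, scaled by 1/1000.

    The 1/1000 scaling is an assumed convention for this example.
    """
    if not 0.0 <= quantum_yield <= 1.0:
        raise ValueError("quantum yield must lie between 0 and 1")
    return ext_coeff * quantum_yield / 1000.0


# Illustrative values only.
print(molecular_brightness(ext_coeff=50_000, quantum_yield=0.6))
```

As noted above, this number describes a single mature molecule and says nothing about folding, maturation efficiency, or pH effects in a real experiment.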
pKa is a measure of the acid sensitivity of a fluorescent protein. It is the pH at which fluorescence intensity drops to 50% of its maximum value.
Whereas pKa gives the pH at which fluorescence drops to 50% of its maximal value, the Hill coefficient describes the slope of the fluorescence-versus-pH relationship. Many papers do not report Hill coefficients, so this value is unrecorded for many proteins in the database.
pH robustness is often misjudged when only pKa values are considered. For instance, an FP with a low pKa and a low Hill coefficient has no pH range in which the fluorescence remains constant, so it is still pH-sensitive despite its low pKa. On the other hand, an FP with a low pKa and a high Hill coefficient is pH-insensitive, because its fluorescence plateaus at pH values above the pKa. Thus, pH-insensitive FPs are better identified based on these two parameters combined (see figure below). In addition, pH sensitivity has always been described in vitro, neglecting the effect of cytosolic components on pH quenching. How well in vitro pH sensitivity represents in vivo performance is unknown.
Figure from Botman et al. (2018), bioRxiv.
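The interplay between pKa and Hill coefficient can be illustrated with a standard Hill-type titration curve, F/Fmax = 1 / (1 + 10^(n·(pKa − pH))). This functional form is an assumption chosen for the example, since no equation is given here, and the parameter values are hypothetical.

```python
def relative_fluorescence(ph: float, pka: float, hill: float) -> float:
    """Fraction of maximal fluorescence at a given pH (Hill-type model, assumed)."""
    return 1.0 / (1.0 + 10.0 ** (hill * (pka - ph)))


pka = 5.0  # hypothetical pKa shared by both example proteins
for hill in (0.5, 2.0):
    curve = [round(relative_fluorescence(ph, pka, hill), 2) for ph in (5.0, 6.0, 7.0, 8.0)]
    print(f"Hill coefficient {hill}: {curve}")
```

Both curves pass through 0.5 at pH = pKa, but the curve with the higher Hill coefficient reaches its plateau much closer to the pKa, which is why two proteins with the same pKa can differ in pH robustness at physiological pH.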
Maturation is the time (in minutes) required for fluorescence to reach its half-maximal value, reflecting both protein folding and chromophore maturation.
While maturation is currently stored in FPbase as a single number (corresponding to the half-time of maturation), maturation does not always follow mono-exponential kinetics, making this value an oversimplification. For an excellent examination of the complexity of fluorescent protein maturation kinetics in E. coli, see Balleza et al. (2018).
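Under the simplifying assumption of mono-exponential kinetics (which, as noted above, does not always hold), a maturation half-time can be converted into an expected matured fraction over time. The function names and the 30-minute half-time are illustrative only.

```python
import math


def maturation_rate_per_min(t_half_min: float) -> float:
    """First-order rate constant implied by a half-time: k = ln(2) / t_half."""
    return math.log(2) / t_half_min


def matured_fraction(t_min: float, t_half_min: float) -> float:
    """Fraction of maximal fluorescence reached after t_min, mono-exponential model."""
    return 1.0 - math.exp(-maturation_rate_per_min(t_half_min) * t_min)


t_half = 30.0  # hypothetical half-time in minutes
for t in (0, 30, 60, 120):
    print(f"{t:>3} min: {matured_fraction(t, t_half):.3f}")
```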
The amount of time (in nanoseconds) after photon absorption that it takes the fluorophore to relax to the ground state is referred to as the fluorescence lifetime. When a population of fluorophores is excited, the lifetime is the time it takes for the number of excited molecules to decay to 1/e or 36.8% of the original population.
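The 1/e relationship follows from single-exponential decay of the excited population, N(t) = N0 · exp(−t/τ). A short sketch, with a hypothetical lifetime value:

```python
import math


def excited_fraction(t_ns: float, lifetime_ns: float) -> float:
    """Fraction of initially excited molecules still excited after t_ns."""
    return math.exp(-t_ns / lifetime_ns)


tau = 3.0  # hypothetical lifetime in nanoseconds
print(f"after one lifetime:  {excited_fraction(tau, tau):.3f}")
print(f"after two lifetimes: {excited_fraction(2 * tau, tau):.3f}")
```

After one lifetime the remaining fraction is 1/e regardless of the value of τ.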
Each fluorescent protein state can have many bleach measurements, performed across multiple references and imaging modalities.
This hidden database field is reserved for proteins that can have multiple states, not necessarily through photoactivation or switching, but through environmental factors such as pH or calcium.
Transition objects in the database capture the (currently only light-induced) transition between two states.
Organisms are stored in the database as an NCBI Taxonomy ID. All corresponding data (such as scientific name, genus, species, rank, etc.) are pulled from NCBI.
Measurements of photostability and photobleaching are tremendously error-prone, and depend heavily on the specifics of the experiment. We have chosen not to give a single "photostability" metric to each state, but rather to allow each state to have one or more bleach measurements, described below.
While FP oligomerization tendency is often measured in vitro using methods such as size-exclusion chromatography, these measurements do not always reflect the behavior of the FP in a cellular environment. The OSER assay is a commonly-used, biologically relevant assay of FP tendencies to oligomerize. In an OSER assay, the FP in question is fused to the cytoplasmic end of an endoplasmic reticulum (ER) signal anchor membrane protein (CytERM) and expressed in cells. Cells are scored based on the ability of CytERM to homo-oligomerize with proteins on opposing membranes and restructure the ER from a tubular network into organized smooth ER (OSER) whorl structures (Costantini et al. 2012).