Proteins are the main objects in the database, and each protein possesses the following attributes.
The name of the fluorescent protein. No two proteins in the database can have the same name.
An abbreviated, URL-friendly version of the name (must be unique in the database). All lowercase, with no spaces or periods. For instance:
mRFP1.2 → mrfp12
PA-CFP2 → pa-cfp2
Slugs are generated automatically from the name and appear in the URL of every protein page (for instance, mNeonGreen: https://www.fpbase.org/protein/mneongreen/).
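The slug rules above can be sketched in a few lines of Python. This is a minimal illustration, not the database's actual implementation: the function name `make_slug` is hypothetical, and the handling of spaces (converted to hyphens) and of other punctuation (dropped) is an assumption that goes beyond what is stated here.

```python
import re


def make_slug(name: str) -> str:
    """Build a lowercase, URL-friendly slug from a protein name (hypothetical helper)."""
    slug = name.strip().lower()
    # Assumption: runs of whitespace become a single hyphen.
    slug = re.sub(r"\s+", "-", slug)
    # Drop periods and anything else that is not a letter, digit, hyphen or underscore.
    slug = re.sub(r"[^a-z0-9_-]", "", slug)
    return slug


print(make_slug("mRFP1.2"))     # mrfp12
print(make_slug("PA-CFP2"))     # pa-cfp2
print(make_slug("mNeonGreen"))  # mneongreen
```

Because slugs must be unique, two names that collapse to the same slug would still need to be rejected or disambiguated by the database.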
Amino acid sequence. No two proteins in the database can have the same amino acid sequence. Sequences are specified with single-letter amino acid codes, with no degenerate codes, and no stop codons. Sequences should start with M (Methionine) and should not contain His tags or other non-FP fusions.
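As a rough sketch, the sequence rules above could be checked as follows. The function name `check_sequence` is hypothetical, and the His-tag test (a run of six histidines) is only a heuristic assumed for illustration; the source does not say how tags or fusions are detected.

```python
from typing import List

# The 20 standard single-letter amino acid codes (no B, Z, X, J, and no "*").
STANDARD_AA = frozenset("ACDEFGHIKLMNPQRSTVWY")


def check_sequence(seq: str) -> List[str]:
    """Return a list of problems found in a sequence (empty if none)."""
    problems = []
    seq = seq.strip().upper()
    invalid = sorted(set(seq) - STANDARD_AA)
    if invalid:
        problems.append("invalid characters: " + "".join(invalid))
    if not seq.startswith("M"):
        problems.append("does not start with M")
    # Heuristic (assumption): six consecutive histidines suggest a His tag.
    if "HHHHHH" in seq:
        problems.append("possible His tag")
    return problems


print(check_sequence("MVSKGEELFTG"))  # []
print(check_sequence("VSKGEX*"))
```

Uniqueness of the sequence across the database is a separate check that needs access to the stored records.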
The molecular weight is calculated automatically from the amino acid sequence.
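The source does not say which method is used for this calculation. One common approach, sketched below under that caveat, is to sum the average residue masses and add one water molecule for the free termini. The function name `molecular_weight` is hypothetical, and the result ignores any mass change from chromophore maturation.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886, "C": 103.1388,
    "E": 129.1155, "Q": 128.1307, "G": 57.0519, "H": 137.1411, "I": 113.1594,
    "L": 113.1594, "K": 128.1741, "M": 131.1926, "F": 147.1766, "P": 97.1167,
    "S": 87.0782, "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_MASS = 18.0153


def molecular_weight(seq: str) -> float:
    """Approximate average molecular weight (Da) of an unmodified polypeptide."""
    seq = seq.strip().upper()
    unknown = set(seq) - RESIDUE_MASS.keys()
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


# Sanity check with a dipeptide: glycylglycine is about 132.12 Da.
print(round(molecular_weight("GG"), 2))
```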
Some "fluorescent" proteins (for instance, those derived from bacterial phytochrome photoreceptors) aren't independently/intrinsically fluorescent; rather, they autocatalytically bind and activate the fluorescence of an extrinsic chromophore (here, referred to as the "cofactor"). Often this is a small molecule, most commonly biliverdin (BV). This cofactor must be present in the environment in order to observe fluorescence. Biliverdin, for example, is a product of heme catabolism, and is therefore present to some extent in most mammalian systems.
The dimerization tendency of the fluorescent protein (monomer, dimer, etc.).
The various states that the protein can be in (e.g. "red", "green", "off"). See States below.
Each protein is classified into one of the following photo-switching types, based on the number of states and transitions that the protein displays:
Basic: A single (constitutively) fluorescent state (e.g. EGFP).
Photoactivatable: Two states, one transition, from dark state to fluorescent state (e.g. PA-GFP).
Photoconvertible: Two states, one transition, from one fluorescent state to another (e.g. EosFP).
Photoswitchable: Two (reversible) transitions between two states (e.g. Dronpa).
Multi-Photochromic: Usually greater than 2 states, exhibiting both conversion and switching (e.g. IrisFP).
Timer: Transitions between 2 or more states over time (e.g. DsRed-Timer).
Multistate: Generic categorization for proteins with more than one state.
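To make the classification concrete, here is one possible way to derive a type from a protein's states and transitions. It is an illustrative sketch only: the function `classify_switch_type`, its inputs, and the decision rules are assumptions inferred from the list above, not the database's actual logic. Timer proteins are not handled, since their transitions occur over time rather than being encoded as a light-induced transition.

```python
from typing import Dict, List, Tuple


def classify_switch_type(is_dark: Dict[str, bool],
                         transitions: List[Tuple[str, str]]) -> str:
    """Guess a switching type from states and (from_state, to_state) transitions.

    is_dark maps each state name to True if that state is non-fluorescent.
    """
    if not is_dark:
        raise ValueError("a protein needs at least one state")
    n_states = len(is_dark)
    pairs = set(transitions)
    if n_states == 1:
        return "Basic"
    if n_states == 2 and len(pairs) == 1:
        (src, dst), = pairs
        if is_dark[src] and not is_dark[dst]:
            return "Photoactivatable"
        if not is_dark[src] and not is_dark[dst]:
            return "Photoconvertible"
    if n_states == 2 and len(pairs) == 2:
        # Reversible: every transition has a matching reverse transition.
        if all((dst, src) in pairs for src, dst in pairs):
            return "Photoswitchable"
    if n_states > 2 and pairs:
        return "Multi-Photochromic"
    return "Multistate"


# State names below are illustrative.
print(classify_switch_type({"default": False}, []))
print(classify_switch_type({"off": True, "on": False}, [("off", "on")]))
print(classify_switch_type({"on": False, "off": True}, [("on", "off"), ("off", "on")]))
```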
The original organism from which the protein was cloned and evolved (e.g. Aequorea victoria for all proteins that were evolved from the original avGFP). In cases where the protein was computationally designed from a synthetic template, the NCBI synthetic construct is used as a parental "organism".
States represent a collection of attributes related to the fluorescent properties of the protein. A protein can have multiple states, and transitions between them.
The name of the state (e.g. "ON", "Green", "default", etc.). For single-state proteins, the name "default" is used.
The excitation maximum of the state in nanometers. This value is stored independently of any Spectra (and may be slightly different).
The emission maximum of the state in nanometers. This value is stored independently of any Spectra (and may be slightly different).
The difference (in wavelength) between the excitation maximum and the emission maximum is referred to as the Stokes shift.
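A small sketch of how a state's excitation and emission maxima yield a Stokes shift is shown below. The `State` class and its field names are hypothetical and are not the database's schema; the numbers are illustrative values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class State:
    """Minimal stand-in for a state record (hypothetical field names)."""
    name: str
    ex_max: Optional[float] = None  # excitation maximum, nm
    em_max: Optional[float] = None  # emission maximum, nm

    @property
    def stokes_shift(self) -> Optional[float]:
        """Emission maximum minus excitation maximum, in nm."""
        if self.ex_max is None or self.em_max is None:
            return None  # e.g. a dark state with no emission maximum
        return self.em_max - self.ex_max


green = State("default", ex_max=488, em_max=507)
print(green.stokes_shift)  # 19

dark = State("off", ex_max=400)
print(dark.stokes_shift)   # None
```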
States can have different spectra for absorption, excitation, two-photon excitation, and emission.
The molar extinction coefficient (M⁻¹ cm⁻¹) of the state is a measure of how strongly the protein absorbs light at a given wavelength.
Quantum yield represents the ratio of photons emitted to photons absorbed. It is the likelihood that, once excited by a photon, the protein (state) will emit a photon.
pKa is a measure of the acid sensitivity of a fluorescent protein. It is the pH at which fluorescence intensity drops to 50% of its maximum value.
Whereas pKa gives the pH at which fluorescence drops to 50% of its maximal value, the Hill coefficient describes the slope of the fluorescence-versus-pH relationship. Many papers do not report Hill coefficients, so this value is unrecorded for many proteins in the database.
Maturation is the time (in minutes) required (due to protein folding and chromophore maturation) for fluorescence to reach its half-maximal value.
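The roles of pKa and the Hill coefficient can be illustrated with a standard Hill-type titration curve. The source does not give a functional form, so the model below is an assumption: a single transition in which fluorescence falls as the pH becomes more acidic. The function name `relative_fluorescence` is hypothetical.

```python
def relative_fluorescence(ph: float, pka: float, hill: float = 1.0) -> float:
    """Fraction of maximal fluorescence at a given pH (assumed Hill-type model)."""
    return 1.0 / (1.0 + 10.0 ** (hill * (pka - ph)))


pka = 6.0  # illustrative value
# At pH == pKa the fluorescence is half-maximal, whatever the Hill coefficient.
print(relative_fluorescence(6.0, pka, hill=1.0))  # 0.5
print(relative_fluorescence(6.0, pka, hill=2.0))  # 0.5

# A larger Hill coefficient gives a steeper transition around the pKa.
for hill in (1.0, 2.0):
    below = relative_fluorescence(5.0, pka, hill)
    above = relative_fluorescence(7.0, pka, hill)
    print(hill, round(below, 3), round(above, 3))
```

In this model, two proteins with the same pKa but different Hill coefficients are equally bright at the pKa, yet differ in how sharply they dim or brighten on either side of it.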
Fluorescence lifetime is the average time (in nanoseconds) that the fluorophore spends in the excited state after photon absorption before relaxing to the ground state. When a population of fluorophores is excited, the lifetime is the time it takes for the number of excited molecules to decay to 1/e (about 36.8%) of the original population.
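The 1/e figure follows from single-exponential decay of the excited population, N(t) = N0 · exp(−t/τ). The short sketch below assumes such mono-exponential decay; the function name `excited_fraction` and the lifetime value are illustrative.

```python
import math


def excited_fraction(t_ns: float, lifetime_ns: float) -> float:
    """Fraction of molecules still excited after t_ns, for lifetime tau in ns."""
    if lifetime_ns <= 0:
        raise ValueError("lifetime must be positive")
    return math.exp(-t_ns / lifetime_ns)


tau = 2.6  # illustrative lifetime in nanoseconds
print(round(excited_fraction(tau, tau), 3))      # 0.368, i.e. 1/e
print(round(excited_fraction(2 * tau, tau), 3))  # 0.135, i.e. 1/e**2
```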
This hidden database field is reserved for proteins that can have multiple states, not necessarily through photoactivation or switching, but through environmental factors such as pH or calcium.
Transition objects in the database capture the (currently only light-induced) transition between two states.
Organisms are stored in the database as an NCBI Taxonomy ID. All corresponding data (such as scientific name, genus, species, rank, etc.) are pulled from NCBI.
Measurements of photostability and photobleaching are tremendously error-prone, and depend heavily on the specifics of the experiment. We have chosen not to give a single "photostability" metric to each state, but rather to allow each state to have one or more bleach measurements, described below.
While FP oligomerization tendency is often measured in vitro using methods such as size-exclusion chromatography, these measurements do not always reflect the behavior of the FP in a cellular environment. The OSER assay is a commonly used, biologically relevant assay of FP tendencies to oligomerize. In an OSER assay, the FP in question is fused to the cytoplasmic end of an endoplasmic reticulum (ER) signal anchor membrane protein (CytERM) and expressed in cells. Cells are scored based on the ability of CytERM to homo-oligomerize with proteins on opposing membranes and restructure the ER from a tubular network into organized smooth ER (OSER) whorl structures (Costantini et al. 2012).