Database Schema
FPbase is a database designed specifically for fluorescent proteins. The goal is to come up with a database design that can categorize the majority of the many subtle properties that fluorescent proteins can possess. If you have suggestions for ways to extend the database model to incorporate additional properties, feel free to contact us.
The database schema determines the fields and information that can be stored for any given object in the database, and how the objects relate to each other. The schema changes periodically to accommodate new fields or to add features to the site. A graphical representation of the current database schema is shown below. For definitions of terms, see the glossary.
Items in bold code correspond to database objects:
Proteins are represented with a name, common aliases, amino acid sequence, external accession IDs (currently GenBank, UniProt, Protein Data Bank, and NCBI Identical Protein Groups), aggregation type, photochromicity/switch type, and required cofactors (e.g. biliverdin). Every protein has a primary reference (the one that introduced the protein), additional references (limited to those that further characterize the protein), and the NCBI taxonomy ID for the parental organism from which the protein was evolved (e.g. 6100 for Aequorea victoria). Protein lineages are stored recursively as parent-child relationships between two proteins, along with the mutation that generates the child sequence from the parent. Excerpts are snippets of text from references that convey key information about a protein that is otherwise difficult to capture within the current database schema; they appear on both the corresponding protein and reference pages.
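The recursive parent-child lineage described above can be sketched in a few lines of Python. This is only an illustration, not FPbase's actual model code: the `Protein` class, the helper names, the toy sequences, and the simple `S2T/G4A` substitution notation are all assumptions made for the example.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Protein:
    """Hypothetical stand-in for a protein record with a lineage link."""
    name: str
    seq: str
    parent: Optional["Protein"] = None  # lineage: link to the parent protein
    mutation: str = ""                  # substitutions relative to the parent


def apply_mutation(parent_seq: str, mutation: str) -> str:
    """Apply substitutions such as 'S2T/G4A' (1-based positions) to a sequence."""
    seq = list(parent_seq)
    for sub in filter(None, mutation.split("/")):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unsupported mutation syntax: {sub!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if pos < 1 or pos > len(seq) or seq[pos - 1] != old:
            raise ValueError(f"{sub!r} does not match the parent sequence")
        seq[pos - 1] = new
    return "".join(seq)


def derive_child(parent: Protein, name: str, mutation: str) -> Protein:
    """Create a child whose sequence is generated from the parent plus mutation."""
    return Protein(name, apply_mutation(parent.seq, mutation), parent, mutation)


def ancestry(protein: Protein) -> List[str]:
    """Walk the lineage recursively, returning names from root to protein."""
    if protein.parent is None:
        return [protein.name]
    return ancestry(protein.parent) + [protein.name]


# Toy sequences, not real fluorescent proteins.
root = Protein("toyFP", "MSKGEEL")
child = derive_child(root, "toyFP2", "S2T/G4A")
grandchild = derive_child(child, "toyFP3", "E5D")
print(grandchild.seq)        # MTKADEL
print(ancestry(grandchild))  # ['toyFP', 'toyFP2', 'toyFP3']
```

Storing only the parent link and the mutation keeps each lineage step small, and the full ancestry can still be recovered by following parents back to the root.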
Each protein can have one or more states that represent the protein in a certain intrinsic condition (e.g. pre/post photoactivation or photoconversion) or under a certain environmental condition (e.g. pH or calcium). States have typical fluorescence characteristics such as excitation maxima and emission maxima, extinction coefficient, quantum yield, pKa, fluorescence lifetime, and full spectra data (stored as an array of wavelength/value pairs, along with metadata describing, for instance, the pH or solvent under which the spectrum was measured). Transitions represent conversions between two states in response to some stimulus, such as irradiation with a certain wavelength. Bleach measurements capture information about the photostability of a given protein state under one set of experimental conditions (such as microscope modality, light source and filter spectra, illumination power, temperature, fusion protein, and cell type), along with the reference that made the measurement. OSER measurements are quantifications of the monomericity of the protein (Costantini et al. 2012), taken from a given reference. References are stored as DOIs, and corresponding metadata (such as title, authors, date, and journal) is pulled from Crossref. Every reference and author object has a dedicated page showing the corresponding proteins or references, respectively, attributed to that object.
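As a rough illustration of the state, spectrum, and transition objects described above, the following Python sketch stores a spectrum as wavelength/value pairs with measurement metadata and links two states through a transition. All class and field names here are hypothetical and the numbers are made-up toy values, not measurements from FPbase.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Spectrum:
    """Wavelength/value pairs plus metadata about the measurement."""
    data: List[Tuple[float, float]]  # (wavelength in nm, relative value)
    kind: str                        # e.g. "ex" or "em" (assumed labels)
    ph: Optional[float] = None
    solvent: Optional[str] = None

    def peak_wavelength(self) -> float:
        """Return the wavelength at which the value is largest."""
        wavelength, _ = max(self.data, key=lambda pair: pair[1])
        return wavelength


@dataclass
class State:
    """One intrinsic or environmental condition of a protein."""
    name: str
    ex_max: Optional[float] = None
    em_max: Optional[float] = None
    ext_coeff: Optional[float] = None
    qy: Optional[float] = None
    pka: Optional[float] = None
    lifetime: Optional[float] = None
    spectra: List[Spectrum] = field(default_factory=list)


@dataclass
class Transition:
    """Conversion from one state to another, triggered by light."""
    from_state: State
    to_state: State
    trigger_wavelength: float  # nm


# Toy values for a hypothetical photoactivatable protein.
em = Spectrum(data=[(500.0, 0.2), (510.0, 1.0), (520.0, 0.6)], kind="em", ph=7.4)
off = State(name="off")
on = State(name="on", em_max=em.peak_wavelength(), spectra=[em])
activation = Transition(from_state=off, to_state=on, trigger_wavelength=405.0)
print(on.em_max)  # 510.0
```

Keeping the full spectrum alongside the summary values means a maximum can be recomputed from the raw pairs whenever the data changes.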
When users register for an account at FPbase (which can be done either directly through FPbase, or using OAuth 2.0 authentication through Google or Twitter), a user object is created. Registered users can create protein collections and microscopes. Microscopes are stored as collections of optical configurations, each of which comprises a set of filters, a light source, and a camera; each of these components is associated with a spectral data object.
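The nesting of microscopes, optical configurations, and components described above might be modeled along the following lines. This is a minimal sketch under assumed names (`Component`, `OpticalConfig`, `Microscope`, and the example values are all hypothetical), not the schema's real definition.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Component:
    """Anything in the light path that carries its own spectral data."""
    name: str
    spectrum: List[Tuple[float, float]]  # (wavelength in nm, value)


class Filter(Component):
    pass


class Light(Component):
    pass


class Camera(Component):
    pass


@dataclass
class OpticalConfig:
    """A named combination of filters, a light source, and a camera."""
    name: str
    filters: List[Filter] = field(default_factory=list)
    light: Optional[Light] = None
    camera: Optional[Camera] = None

    def components(self) -> List[Component]:
        """Return every component in this configuration that has a spectrum."""
        parts: List[Component] = list(self.filters)
        if self.light is not None:
            parts.append(self.light)
        if self.camera is not None:
            parts.append(self.camera)
        return parts


@dataclass
class Microscope:
    """A user-owned collection of optical configurations."""
    name: str
    owner: str
    configs: List[OpticalConfig] = field(default_factory=list)


# Toy example with made-up spectra.
green = OpticalConfig(
    name="green channel",
    filters=[Filter("excitation filter", [(470.0, 0.9)]),
             Filter("emission filter", [(525.0, 0.95)])],
    light=Light("LED", [(470.0, 1.0)]),
    camera=Camera("sCMOS", [(525.0, 0.8)]),
)
scope = Microscope(name="example scope", owner="example_user", configs=[green])
print([part.name for part in scope.configs[0].components()])
```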