The central dogma of molecular biology describes the flow of genetic information within a biological system. This process is fundamental to understanding how proteins are made using genetic instructions in DNA. Understanding protein synthesis is key to developing therapies in medicine, enhancing crop yields in agriculture and creating new biotechnologies.
The central dogma
The synthesis of proteins is explained using the central dogma (“central dogma” just means it is a fundamental principle or concept). It tells us that DNA is copied into messenger RNA (mRNA) through a process called transcription and then mRNA is used to make proteins in a process called translation.
[transcript]
Flowchart of the central dogma
A three-step flowchart, with "DNA" on the left, "RNA" in the middle and "Protein" on the right. DNA is represented by a double helix. RNA is represented by a single strand. Protein is represented by a thick coiled line.
A solid arrow is shown pointing from "DNA" to "RNA", labelled "Transcription".
A dashed arrow is shown pointing from "RNA" back to "DNA", labelled "Reverse transcription".
A solid curved arrow is shown pointing from "DNA" back to itself, labelled "Replication".
A solid arrow is shown pointing from "RNA" to "Protein", labelled "Translation".
A dashed curved arrow is shown pointing from "RNA" back to itself, labelled "Replication".
[/transcript]
You will soon learn that this process is very complex, with a number of modifications made to the mRNA and the protein before the protein is ready for use by the body. Any errors in these processes could lead to a malfunctioning protein.
Transcription
Transcription (“trans” means “across” and “scribe” means “write”) is the process in which a specific segment of DNA is copied into mRNA. This mRNA strand contains the genetic blueprints for making a protein.
There are three steps to transcribing DNA, which occur in the nucleus of eukaryotic cells: initiation, elongation and termination.
During initiation, an enzyme called RNA polymerase binds to a region of the DNA known as the promoter region. This is the part of a gene that signals the DNA to unwind from its double helix structure, exposing the nucleotides so that they can be read by the enzyme.
[transcript]
Initiation step of transcription
A diagram showing the initiation of transcription. A segment of double-stranded DNA is shown, with part of it on the far left labelled "Promoter region". At the promoter region, a purple blob is attached. It is labelled "RNA polymerase".
This diagram shows the binding of the RNA polymerase enzyme to the promoter region of the DNA.
[/transcript]
Elongation is when the RNA polymerase moves along the segment of unwound DNA and adds nucleotides in a chain to build the mRNA strand. The mRNA strand is complementary to the DNA template strand, following the complementary base-pairing rules (the same pairing that underlies Chargaff’s rules for DNA), except that RNA uses uracil in place of thymine:
Adenine (A) in the DNA strand pairs with uracil (U) in the RNA.
Thymine (T) in the DNA strand pairs with adenine (A) in the RNA.
Cytosine (C) in the DNA strand pairs with guanine (G) in the RNA.
Guanine (G) in the DNA strand pairs with cytosine (C) in the RNA.
U and T are similar in structure. The presence of U instead of T indicates that the molecule is RNA, not DNA. Thymine helps keep DNA stable, while uracil is more useful in RNA as it requires less energy to produce.
[transcript]
Elongation step of transcription
A diagram showing the elongation stage of transcription. A segment of double-stranded DNA is shown. The RNA polymerase has attached to the bottom strand and moved along it, as shown by an arrow.
From the promoter region on the far left and extending to the right up until where the RNA polymerase is, the strands of DNA have separated from each other, or unwound.
Coming out from the RNA polymerase is a new strand of nucleotides labelled "mRNA". The nucleotide sequence matches the top strand (with U in place of T) and is complementary to the strand that the polymerase has moved along and read.
[/transcript]
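The base-pairing rules used during elongation can be sketched in a few lines of Python. This is only an illustration: the function name `transcribe` and the example template sequence are made up for this sketch, and it assumes the template strand is written in the 3' to 5' direction in which RNA polymerase reads it.

```python
# Pairing rules from the list above: DNA template base -> RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_3_to_5: str) -> str:
    """Build the mRNA (5'->3') from a DNA template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


template = "TACGGCATT"   # hypothetical template strand, written 3'->5'
mrna = transcribe(template)
print(mrna)              # AUGCCGUAA
```

The result is the same as the non-template (top) strand, ATGCCGTAA, with every T swapped for U.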
When the RNA polymerase reaches a stop signal (or termination sequence) in the gene, transcription ends. During termination, the RNA polymerase detaches from the DNA, and the mRNA strand is free to carry the blueprints for the protein to a ribosome in the cytosol or endoplasmic reticulum, where it can be translated into a protein.
[transcript]
Termination step of transcription
A diagram showing the termination of transcription. A segment of double-stranded DNA is shown, with a "Stop signal" labelled on the far right. Between the promoter region and stop signal, the DNA double helix has unwound.
The purple RNA polymerase enzyme has detached from the DNA, shown by an arrow.
Above the double-stranded DNA is a single strand labelled "pre-mRNA". Its sequence matches the top strand in the DNA.
[/transcript]
Watch this video to see transcription in action.
What you are about to see is DNA's most extraordinary secret — how a simple code is turned into flesh and blood. It begins with a bundle of factors assembling at the start of a gene. A gene is simply a length of DNA instructions stretching away to the left. The assembled factors trigger the first phase of the process, reading off the information that will be needed to make the protein. Everything is ready to roll: three, two, one, GO! The blue molecule racing along the DNA is reading the gene. It's unzipping the double helix, and copying one of the two strands. The yellow chain snaking out of the top is a copy of the genetic message and it's made of a close chemical cousin of DNA called RNA. The building blocks to make the RNA enter through an intake hole. They are matched to the DNA - letter by letter - to copy the As, Cs, Ts and Gs of the gene. The only difference is that in the RNA copy, the letter T is replaced with a closely related building block known as "U". You are watching this process - called transcription - in real time. It's happening right now in almost every cell in your body.
Post-transcriptional modifications
In eukaryotes, transcription forms pre-mRNA which needs to undergo additional modifications like 5’ capping, polyadenylation and splicing to prepare it for translation.
5’ capping – to improve the stability of the mRNA transcript and protect it from degradation, a cap can be added to the 5’ end via a phosphate linkage
polyadenylation – just like 5’ capping, polyadenylation helps to stabilise the mRNA transcript; it involves adding a poly-A tail consisting of about 200 adenine nucleotides
splicing – segments of genetic code called exons code for functional proteins, but mRNA transcribed from DNA also contains segments called introns, which do not code for protein; the introns are cut out of the pre-mRNA and the exons are joined together.
[transcript]
Post-transcriptional modifications
5 prime capping (top): The pre-mRNA strand has an extra yellow nucleotide added to the left end.
Polyadenylation (middle): The pre-mRNA strand has an additional string of red nucleotides added to the right end.
Splicing (bottom): Two sections of the pre-mRNA template are labelled introns. In the final mRNA, they are removed.
[/transcript]
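The three modifications can be mimicked on a string as a rough sketch. Everything here is an assumption made for illustration: the function name `process_pre_mrna`, the made-up sequences, the intron coordinates, and the text label `cap-` standing in for the 5’ cap. The order of the steps in the code is not meant to reflect their timing in the cell.

```python
def process_pre_mrna(pre_mrna, introns, tail_length=200):
    """Remove introns, then attach a symbolic 5' cap and a poly-A tail.

    introns: (start, end) index pairs with end exclusive; these are
    hypothetical coordinates supplied by the caller.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    spliced = "".join(exons)
    # "cap-" is only a placeholder label for the 5' cap.
    return "cap-" + spliced + "A" * tail_length


# Made-up sequences for illustration only.
exon1, intron, exon2 = "AUGGCU", "GUAAGUCCAG", "GGAUAA"
pre_mrna = exon1 + intron + exon2
intron_span = (len(exon1), len(exon1) + len(intron))
mature = process_pre_mrna(pre_mrna, [intron_span], tail_length=10)
print(mature)
```

A short tail of 10 is used in the example only to keep the printed string readable; the default of 200 follows the figure quoted above.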
Translation
Translation is the conversion of the genetic code from the mRNA into a specific sequence of amino acids, forming a polypeptide or protein. It occurs within ribosomes, which consist of two subunits. The small subunit binds the mRNA transcript and the large subunit binds transfer RNA (tRNA).
A three-dimensional model of a single ribosome with the two subunits shown in different colours.
Ribosome model
Small subunit (pink): decodes the genetic messages sent by the mRNA.
Large subunit (purple): the ribosome reads the messenger RNA (mRNA) and uses the information to string together amino acids into a protein.
tRNA molecules read the mRNA template in sets of three nucleotides, called codons, with each codon corresponding to a specific amino acid. They bring the appropriate amino acids to the ribosome so they can be added to the polypeptide chain.
The codon chart is shown.
[transcript]
The image is a codon chart, used to identify amino acids coded by mRNA sequences during translation.
Codon chart
It organises the codons by the three bases that make them up:
First base: displayed on the left, includes U (uracil), C (cytosine), A (adenine), and G (guanine)
Second base: Displayed along the top, with the same bases
Each cell within the chart represents a codon made up of a combination of bases, specifying a particular amino acid or a start/stop signal. Here's a breakdown:
Phenylalanine: UUU and UUC
Leucine: UUA, UUG, CUU, CUC, CUA and CUG
Isoleucine: AUU, AUC and AUA
Methionine: AUG, the start codon (indicated in green as START)
Valine: GUU, GUC, GUA, GUG
Serine: UCU, UCC, UCA and UCG
Proline: CCU, CCC, CCA and CCG
Threonine: ACU, ACC, ACA and ACG
Alanine: GCU, GCC, GCA and GCG
Tyrosine: UAU and UAC
Stop codons: UAA, UAG, UGA (indicated in red as STOP)
Histidine: CAU and CAC
Glutamine: CAA and CAG
Asparagine: AAU and AAC
Lysine: AAA and AAG
Aspartic acid: GAU and GAC
Glutamic acid: GAA and GAG
Cysteine: UGU and UGC
Tryptophan: UGG
Arginine: CGU, CGC, CGA and CGG
Serine: AGU and AGC
Arginine: AGA and AGG
Glycine: GGU, GGC, GGA and GGG
[/transcript]
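The chart above can also be written as a lookup table. The sketch below is one compact way to build it in Python, assuming the standard genetic code with one-letter amino acid symbols and `*` for a stop codon; the names `CODON_TABLE` and `codons_for` are made up for this example.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for all 64 codons, ordered UUU, UUC, UUA,
# UUG, UCU, ... GGG (third base changing fastest). "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base U
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(symbol):
    """List every codon that maps to the given one-letter symbol."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == symbol)


print(CODON_TABLE["AUG"])   # M (methionine, the start codon)
print(codons_for("*"))      # ['UAA', 'UAG', 'UGA']
print(codons_for("C"))      # ['UGC', 'UGU'] (cysteine)
```

Looking up several codons for the same symbol, such as the six codons for leucine (`L`), shows how more than one codon can specify the same amino acid.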
Like transcription, translation involves an initiation, elongation and termination step.
To initiate translation, a complex structure is formed. Proteins called initiation factors help the mRNA template bind with the small subunit of the ribosome and an initiator tRNA, which recognises the start codon, AUG. As well as acting as the initiator of translation, AUG corresponds to the amino acid methionine (Met).
Once the start codon is identified and the small subunit, initiator tRNA and mRNA complex is formed, the next step can occur.
[transcript]
Initiation step of protein translation
A diagram showing a segment of mRNA with a ribosome and transfer RNA attached. The small subunit of the ribosome is attached to the mRNA at the start codon. The large subunit is attached to an initiator tRNA molecule, which has methionine associated with it. The large subunit has three sites for tRNA to bind. The middle site is occupied by the initiator tRNA.
[/transcript]
The mRNA template is read in the 5’ to 3’ direction, and tRNA molecules add amino acids in the correct sequence as the ribosome moves along the mRNA template. Elongation mostly involves the large subunit of the ribosome.
Based on the next sense codon in the mRNA sequence (i.e. a codon that corresponds to an amino acid), the tRNA with the correct complementary anticodon binds to the large subunit at the aminoacyl site or A site. The tRNA is "charged" with its corresponding amino acid.
[transcript]
Elongation step of protein translation
A diagram showing a segment of mRNA with a ribosome and two tRNA molecules occupying the middle and right-hand sites. The methionine is shown attaching to the amino acid on the tRNA in the right-hand site, which is labelled "tRNA with anticodon". The anticodon matches with the codon, labelled "Sense codon", on the mRNA strand. The right-hand site is labelled the "A site". Freely floating to the right is another tRNA molecule with a different amino acid, labelled the "Charged tRNA".
[/transcript]
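Codon and anticodon matching can be sketched as a complement-and-reverse operation. This is a simplified illustration: the function name `anticodon_for` is made up, both sequences are assumed to be written 5’ to 3’, and the sketch uses strict base pairing only, ignoring the looser "wobble" pairing that real tRNAs allow at the third codon position.

```python
# RNA-RNA base pairing between codon and anticodon.
COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def anticodon_for(codon):
    """Return the anticodon (written 5'->3') that pairs with a codon (5'->3').

    The two strands pair in opposite directions, so the codon is
    reversed before each base is complemented.
    """
    return "".join(COMPLEMENT[base] for base in reversed(codon.upper()))


print(anticodon_for("AUG"))   # CAU
```

Read in the opposite direction (3’ to 5’), the same anticodon is UAC, which lines up base by base against AUG.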
The amino acid forms a peptide bond with the previous amino acid in the polypeptide chain, in a reaction assisted by an enzyme called peptidyl transferase. The tRNA, now holding the growing chain, moves to the peptidyl site or P site of the large subunit.
[transcript]
Elongation step of protein translation
A diagram showing a segment of mRNA with a ribosome and three tRNA molecules occupying the left-hand, middle and right-hand sites. The amino acid is shown attaching to the growing polypeptide chain on the tRNA in the right-hand site. Each tRNA has moved one site to the left from the previous diagram. The middle site is labelled the "P site".
[/transcript]
After its amino acid is incorporated into the chain, the tRNA no longer carries an amino acid and is described as "uncharged". It moves to the exit site or E site, where it dissociates (breaks away) from the ribosome. The released tRNA molecule can then be "recharged" with a free amino acid.
[transcript]
Elongation step of protein translation
A diagram showing a segment of mRNA with a ribosome and three tRNA molecules occupying the left-hand, middle and right-hand sites. The right-hand site contains a tRNA molecule with a blue amino acid. The middle site contains a tRNA molecule with the rest of the polypeptide chain. The left-hand site is labelled the E site. Each tRNA has moved one site to the left from the previous diagram. Freely floating to the left is the tRNA molecule that was in the E site in the previous diagram. It is labelled the "Uncharged tRNA".
[/transcript]
Each time the tRNAs shift along, the A site is freed so that the next charged tRNA molecule can bind and add its amino acid. This keeps the elongation process efficient.
Translation ends when the mRNA presents a nonsense codon or stop codon, which does not have a corresponding amino acid or tRNA molecule. The nonsense codons are UAA, UAG and UGA.
When a nonsense codon reaches the A site, proteins called release factors recognise it and prompt peptidyl transferase to add a water molecule to the carboxyl end of the final amino acid in the chain. The protein is released from the ribosome and translation is complete. At this stage, the protein has acquired its primary structure.
[transcript]
Termination step of translation
A diagram showing the termination of translation. The right-hand site contains a tRNA molecule with no amino acid. It binds to a segment of the mRNA called the "Stop codon". The middle site contains a tRNA molecule with the final polypeptide chain. Each tRNA has moved one site to the left from the previous diagram.
[/transcript]
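Putting initiation, elongation and termination together, translation can be sketched as a loop over codons. This is a simplification for illustration only: the function name `translate` and the example mRNA are made up, the start codon is taken to be the first AUG in the string, and the table assumes the standard genetic code with one-letter amino acid symbols and `*` for a stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
CODON_TABLE = dict(zip(
    ("".join(c) for c in product("UCAG", repeat=3)),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))


def translate(mrna):
    """Translate an mRNA (5'->3') into one-letter amino acid symbols."""
    mrna = mrna.upper()
    start = mrna.find("AUG")            # initiation: locate the start codon
    if start == -1:
        return ""                       # no start codon, nothing is made
    peptide = []
    # Elongation: read one codon (three bases) at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":           # termination: stop codon reached
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(translate("GGAUGCCGUGGUAAGC"))   # MPW
```

In the example, the codons read are AUG (Met), CCG (Pro), UGG (Trp) and then UAA, which stops translation before the trailing bases are reached.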
Watch this video to see translation in action.
When the RNA copy is complete, it snakes out into the outer part of the cell. Then in a dazzling display of choreography, all the components of a molecular machine lock together around the RNA to form a miniature factory called a ribosome. It translates the genetic information in the RNA into a string of amino acids that will become a protein. Special transfer molecules, the green triangles, bring each amino acid to the ribosome. The amino acids are the small red tips attached to the transfer molecules. There are different transfer molecules for each of the twenty amino acids. Each transfer molecule carries a three letter code that is matched with the RNA in the machine. Now we come to the heart of the process. Inside the ribosome, the RNA is pulled through like a tape. The code for each amino acid is read off, three letters at a time, and matched to three corresponding letters on the transfer molecules. When the right transfer molecule plugs in, the amino acid it carries is added to the growing protein chain. Again, you are watching this in real time. And after a few seconds the assembled protein starts to emerge from the ribosome. Ribosomes can make any kind of protein. It just depends what genetic message you feed in on the RNA. In this case, the end product is hemoglobin. The cells in our bone marrow churn out a hundred trillion molecules of it per second! And as a result, our muscles, brain and all the vital organs in our body receive the oxygen they need.
Post-translational modifications
Just like after transcription, modifications are often made after translation. Which ones are made depends on the specific role of the protein, its destination within or outside the cell, and the needs of the cell.
There are many modifications that can be made to alter the protein’s function, activity, stability or location. Common examples are:
phosphorylation – addition of a phosphate group to activate or deactivate enzymes and regulate cell signalling
glycosylation – addition of sugar molecules to help protein folding and stability and to regulate cell signalling
acetylation – addition of an acetyl group to affect gene expression
ubiquitination – addition of ubiquitin molecules to mark proteins for degradation
methylation – addition of a methyl group to affect gene expression and protein function.
These modifications occur in the Golgi apparatus and endoplasmic reticulum. Then, the protein is folded into its secondary, tertiary and, sometimes, quaternary structure.
Humans have over 100,000 different proteins. Together with the unique sequence of amino acids, post-translational modifications give proteins their specificity. This accounts for the diversity of proteins in the body.
Exercise
See how well you understand how proteins are synthesised with a quick quiz.
Data drill
Read the scenario and use the information provided to answer the questions in the quiz.
Dr Jean Nome wishes to investigate how differences in transcription affect the final protein production in cells.
She looks at five different genes and measures mRNA abundance (the number of copies of messenger RNA in each cell) and protein yield (the mass of protein in milligrams produced by each cell). Dr Nome measures the protein yield using colorimetry.
Colorimetry is used to measure the amount of protein in each sample. The more intense the colour, the higher the protein concentration. Image by Ajpolino via Wikimedia Commons, licensed under CC0