105 Viral Evolution, Morphology, and Classification

Learning Objectives

By the end of this section, you will be able to do the following:

  • Describe how viruses were first discovered and how they are detected
  • Discuss three hypotheses about how viruses evolved
  • Describe the general structure of a virus
  • Recognize the basic shapes of viruses
  • Understand past and emerging classification systems for viruses
  • Describe the basis for the Baltimore classification system

Viruses are diverse entities: They vary in structure, methods of replication, and the hosts they infect. Nearly all forms of life—from prokaryotic bacteria and archaeans, to eukaryotes such as plants, animals, and fungi—have viruses that infect them. While most biological diversity can be understood through evolutionary history (such as how species have adapted to changing environmental conditions and how different species are related to one another through common descent), much about virus origins and evolution remains unknown.

Discovery and Detection

Viruses were first discovered after the development of a porcelain filter—the Chamberland-Pasteur filter—that could remove all bacteria visible in the microscope from any liquid sample. In 1886, Adolph Meyer demonstrated that a disease of tobacco plants—tobacco mosaic disease—could be transferred from a diseased plant to a healthy one via liquid plant extracts. In 1892, Dmitri Ivanowski showed that this disease could be transmitted in this way even after the Chamberland-Pasteur filter had removed all viable bacteria from the extract. Still, it was many years before it was proved that these “filterable” infectious agents were not simply very small bacteria but were a new type of very small, disease-causing particle.

Most virions, or single virus particles, are very small, about 20 to 250 nanometers in diameter. However, some recently discovered viruses from amoebae range up to 1000 nm in diameter. With the exception of large virions, like the poxvirus and other large DNA viruses, viruses cannot be seen with a light microscope. It was not until the development of the electron microscope in the late 1930s that scientists got their first good view of the structure of the tobacco mosaic virus (TMV) ((Figure)), discussed above, and other viruses ((Figure)). The surface structure of virions can be observed by both scanning and transmission electron microscopy, whereas the internal structures of the virus can only be observed in images from a transmission electron microscope. The use of electron microscopy and other technologies has allowed for the discovery of many viruses of all types of living organisms.

Most virus particles are visible only by electron microscopy. In these transmission electron micrographs, (a) a virus is as dwarfed by the bacterial cell it infects, as (b) these E. coli cells are dwarfed by cultured colon cells. (credit a: modification of work by U.S. Dept. of Energy, Office of Science, LBL, PBD; credit b: modification of work by J.P. Nataro and S. Sears, unpub. data, CDC; scale-bar data from Matt Russell)

Micrograph a shows a virus with a hexagonal head that stands on thin, bent legs. The virus sits on the surface of a cell that is so large that only a small fraction of its surface is visible. Micrograph b shows small bacterial cells that are about the size of the organelles in the adjacent colon cells.

Evolution of Viruses

Although biologists have a significant amount of knowledge about how present-day viruses mutate and adapt, much less is known about how viruses originated in the first place. When exploring the evolutionary history of most organisms, scientists can look at fossil records and similar historic evidence. However, viruses do not fossilize, as far as we know, so researchers must extrapolate from investigations of how today’s viruses evolve and by using biochemical and genetic information to create speculative virus histories.

Most scholars agree that viruses don’t have a single common ancestor, nor is there a single reasonable hypothesis about virus origins. There are current evolutionary scenarios that may explain the origin of viruses. One such hypothesis, the “devolution” or the regressive hypothesis, suggests that viruses evolved from free-living cells, or from intracellular prokaryotic parasites. However, many components of how this process might have occurred remain a mystery. A second hypothesis, the escapist or the progressive hypothesis, suggests that viruses originated from RNA and DNA molecules that escaped from a host cell. A third hypothesis, the self-replicating hypothesis, suggests that viruses may have originated from self-replicating entities similar to transposons or other mobile genetic elements. In all cases, viruses are probably continuing to evolve along with the cells on which they rely on as hosts.

As technology advances, scientists may develop and refine additional hypotheses to explain the origins of viruses. The emerging field called virus molecular systematics attempts to do just that through comparisons of sequenced genetic material. These researchers hope one day to better understand the origin of viruses—a discovery that could lead to advances in the treatments for the ailments they produce.

Viral Morphology

Viruses are noncellular, meaning they are biological entities that do not have a cellular structure. They therefore lack most of the components of cells, such as organelles, ribosomes, and the plasma membrane. A virion consists of a nucleic acid core, an outer protein coating or capsid, and sometimes an outer envelope made of protein and phospholipid membranes derived from the host cell. Viruses may also contain additional proteins, such as enzymes, within the capsid or attached to the viral genome. The most obvious difference between members of different viral families is the variation in their morphology, which is quite diverse. An interesting feature of viral complexity is that the complexity of the host does not necessarily correlate with the complexity of the virion. In fact, some of the most complex virion structures are found in the bacteriophages—viruses that infect the simplest living organisms, bacteria.


Viruses come in many shapes and sizes, but these features are consistent for each viral family. As we have seen, all virions have a nucleic acid genome covered by a protective capsid. The proteins of the capsid are encoded in the viral genome, and are called capsomeres. Some viral capsids are simple helices or polyhedral “spheres,” whereas others are quite complex in structure ((Figure)).

Viral capsids can be (a) helical, (b) polyhedral, or (c) have a complex shape. (credit a “micrograph”: modification of work by USDA ARS; credit b “micrograph”: modification of work by U.S. Department of Energy)

Figure a is a helical virus which has a long linear structure. The outer proteins are small spheres arranged into a long, hollow tube. Inside the tube is the genetic material. Tobacco mosaic virus is an example of a helical virus. Figure b is an Icosehedral viruses have a polyhedron structure. The example shown is human rhinovirus which has a pentagon structure. Complex viruses have a more complex structure. The example is variola which has an ovoid structure.

In general, the capsids of viruses are classified into four groups: helical, icosahedral, enveloped, and head-and-tail. Helical capsids are long and cylindrical. Many plant viruses are helical, including TMV. Icosahedral viruses have shapes that are roughly spherical, such as those of poliovirus or herpesviruses. Enveloped viruses have membranes derived from the host cell that surrounds the capsids. Animal viruses, such as HIV, are frequently enveloped. Head-and-tail viruses infect bacteria and have a head that is similar to icosahedral viruses and a tail shaped like helical viruses.

Many viruses use some sort of glycoprotein to attach to their host cells via molecules on the cell called viral receptors. For these viruses, attachment is required for later penetration of the cell membrane; only after penetration takes place can the virus complete its replication inside the cell. The receptors that viruses use are molecules that are normally found on cell surfaces and have their own physiological functions. It appears that viruses have simply evolved to make use of these molecules for their own replication. For example, HIV uses the CD4 molecule on T lymphocytes as one of its receptors ((Figure)). CD4 is a type of molecule called a cell adhesion molecule, which functions to keep different types of immune cells in close proximity to each other during the generation of a T lymphocyte immune response.

A virus and its host receptor protein. The HIV virus binds the CD4 receptor on the surface of human cells. CD4 receptors help white blood cells to communicate with other cells of the immune system when producing an immune response. (credit: modification of work by NIAID, NIH)

In the illustration a viral receptor on the surface of an H I V virus is attaches to a co-receptor embedded in the plasma membrane. The co-receptor is either C C R 5 or C X C R 4.

One of the most complex virions known, the T4 bacteriophage (which infects the Escherichia coli) bacterium, has a tail structure that the virus uses to attach to host cells and a head structure that houses its DNA.

Adenovirus, a non-enveloped animal virus that causes respiratory illnesses in humans, uses glycoprotein spikes protruding from its capsomeres to attach to host cells. Non-enveloped viruses also include those that cause polio (poliovirus), plantar warts (papillomavirus), and hepatitis A (hepatitis A virus).

Enveloped virions, such as the influenza virus, consist of nucleic acid (RNA in the case of influenza) and capsid proteins surrounded by a phospholipid bilayer envelope that contains virus-encoded proteins. Glycoproteins embedded in the viral envelope are used to attach to host cells. Other envelope proteins are the matrix proteins that stabilize the envelope and often play a role in the assembly of progeny virions. Chicken pox, HIV, and mumps are other examples of diseases caused by viruses with envelopes. Because of the fragility of the envelope, non-enveloped viruses are more resistant to changes in temperature, pH, and some disinfectants than enveloped viruses.

Overall, the shape of the virion and the presence or absence of an envelope tell us little about what disease the virus may cause or what species it might infect, but they are still useful means to begin viral classification ((Figure)).

Visual Connection
Complex Viruses. Viruses can be either complex or relatively simple in shape. This figure shows three relatively complex virions: the bacteriophage T4, with its DNA-containing head group and tail fibers that attach to host cells; adenovirus, which uses spikes from its capsid to bind to host cells; and the influenza virus, which uses glycoproteins embedded in its envelope to bind to host cells. The influenza virus also has matrix proteins, internal to the envelope, which help stabilize the virion’s shape. (credit “bacteriophage, adenovirus”: modification of work by NCBI, NIH; credit “influenza virus”: modification of work by Dan Higgins, Centers for Disease Control and Prevention)

Illustration a shows bacteriophage T 4, which houses its D N A genome in a hexagonal head. A long, straight tail extends from the bottom of the head. Tail fibers attached to the base of the tail are bent, like spider legs. In b, adenovirus houses its D N A genome in a round capsid made from many small capsomere subunits. Glycoproteins extend from the capsomere, like pins from a pincushion. In c, the H I V retrovirus houses its R N A genome and a bullet-shaped capsid. A spherical viral envelope, lined with matrix proteins, surrounds the capsid. Two different varieties of glycoprotein spike are embedded in the envelope. Approximately 80 percent of the spikes are hemagglutinin. The remaining 20 percent or so of the glycoprotein spikes consist of neuraminidase.

Which of the following statements about virus structure is true?

  1. All viruses are encased in a viral membrane.
  2. The capsomere is made up of small protein subunits called capsids.
  3. DNA is the genetic material in all viruses.
  4. Glycoproteins help the virus attach to the host cell.


Types of Nucleic Acid

Unlike nearly all living organisms that use DNA as their genetic material, viruses may use either DNA or RNA. The virus core contains the genome—the total genetic content of the virus. Viral genomes tend to be small, containing only those genes that encode proteins which the virus cannot get from the host cell. This genetic material may be single- or double-stranded. It may also be linear or circular. While most viruses contain a single nucleic acid, others have genomes divided into several segments. The RNA genome of the influenza virus is segmented, which contributes to its variability and continuous evolution, and explains why it is difficult to develop a vaccine against it.

In DNA viruses, the viral DNA directs the host cell’s replication proteins to synthesize new copies of the viral genome and to transcribe and translate that genome into viral proteins. Human diseases caused by DNA viruses include chickenpox, hepatitis B, and adenoviruses. Sexually transmitted DNA viruses include the herpes virus and the human papilloma virus (HPV), which has been associated with cervical cancer and genital warts.

RNA viruses contain only RNA as their genetic material. To replicate their genomes in the host cell, the RNA viruses must encode their own enzymes that can replicate RNA into RNA or, in the retroviruses, into DNA. These RNA polymerase enzymes are more likely to make copying errors than DNA polymerases, and therefore often make mistakes during transcription. For this reason, mutations in RNA viruses occur more frequently than in DNA viruses. This causes them to change and adapt more rapidly to their host. Human diseases caused by RNA viruses include influenza, hepatitis C, measles, and rabies. The HIV virus, which is sexually transmitted, is an RNA retrovirus.

The Challenge of Virus Classification

Because most viruses probably evolved from different ancestors, the systematic methods that scientists have used to classify prokaryotic and eukaryotic cells are not very useful. If viruses represent “remnants” of different organisms, then even genomic or protein analysis is not useful. Why?, Because viruses have no common genomic sequence that they all share. For example, the 16S rRNA sequence so useful for constructing prokaryote phylogenies is of no use for a creature with no ribosomes! Biologists have used several classification systems in the past. Viruses were initially grouped by shared morphology. Later, groups of viruses were classified by the type of nucleic acid they contained, DNA or RNA, and whether their nucleic acid was single- or double-stranded. However, these earlier classification methods grouped viruses differently, because they were based on different sets of characters of the virus. The most commonly used classification method today is called the Baltimore classification scheme, and is based on how messenger RNA (mRNA) is generated in each particular type of virus.

Past Systems of Classification

Viruses contain only a few elements by which they can be classified: the viral genome, the type of capsid, and the envelope structure for the enveloped viruses. All of these elements have been used in the past for viral classification ((Figure) and (Figure)). Viral genomes may vary in the type of genetic material (DNA or RNA) and its organization (single- or double-stranded, linear or circular, and segmented or non-segmented). In some viruses, additional proteins needed for replication are associated directly with the genome or contained within the viral capsid.

Virus Classification by Genome Structure
Genome Structure Examples
  • RNA
  • DNA
  • Rabies virus, retroviruses
  • Herpesviruses, smallpox virus
  • Single-stranded
  • Double-stranded
  • Rabies virus, retroviruses
  • Herpesviruses, smallpox virus
  • Linear
  • Circular
  • Rabies virus, retroviruses, herpesviruses, smallpox virus
  • Papillomaviruses, many bacteriophages
  • Non-segmented: genome consists of a single segment of genetic material
  • Segmented: genome is divided into multiple segments
  • Parainfluenza viruses
  • Influenza viruses
Viruses can be classified according to their core genetic material and capsid design. (a) Rabies virus has a single-stranded RNA (ssRNA) core and an enveloped helical capsid, whereas (b) variola virus, the causative agent of smallpox, has a double-stranded DNA (dsDNA) core and a complex capsid. Rabies transmission occurs when saliva from an infected mammal enters a wound. The virus travels through neurons in the peripheral nervous system to the central nervous system, where it impairs brain function, and then travels to other tissues. The virus can infect any mammal, and most die within weeks of infection. Smallpox is a human virus transmitted by inhalation of the variola virus, localized in the skin, mouth, and throat, which causes a characteristic rash. Before its eradication in 1979, infection resulted in a 30 to 35 percent mortality rate. (credit “rabies diagram”: modification of work by CDC; “rabies micrograph”: modification of work by Dr. Fred Murphy, CDC; credit “small pox micrograph”: modification of work by Dr. Fred Murphy, Sylvia Whitfield, CDC; credit “smallpox photo”: modification of work by CDC; scale-bar data from Matt Russell)

Part a is an illustration of the rabies virus, which is bullet shaped. R N A is coiled inside a capsid, which is encased in a matrix protein lined viral envelope studded with glycoproteins. Part a below is a micrograph of a cluster of bullet shaped rabies viruses. Part b is a micrograph of variola virus, which has D N A encased in a bow shaped capsid. An oval matrix protein-lined envelope surrounds the capsid. Part b below shows irregular, bumpy lesions on the arms and legs of a person with smallpox.

Viruses can also be classified by the design of their capsids ((Figure) and (Figure)). Capsids are classified as naked icosahedral, enveloped icosahedral, enveloped helical, naked helical, and complex. The type of genetic material (DNA or RNA) and its structure (single- or double-stranded, linear or circular, and segmented or non-segmented) are used to classify the virus core structures ((Figure)).

Virus Classification by Capsid Structure
Capsid Classification Examples
Naked icosahedral Hepatitis A virus, polioviruses
Enveloped icosahedral Epstein-Barr virus, herpes simplex virus, rubella virus, yellow fever virus, HIV-1
Enveloped helical Influenza viruses, mumps virus, measles virus, rabies virus
Naked helical Tobacco mosaic virus
Complex with many proteins; some have combinations of icosahedral and helical capsid structures Herpesviruses, smallpox virus, hepatitis B virus, T4 bacteriophage
Transmission electron micrographs of various viruses show their capsid structures. The capsid of the (a) polio virus is naked icosahedral; (b) the Epstein-Barr virus capsid is enveloped icosahedral; (c) the mumps virus capsid is an enveloped helix; (d) the tobacco mosaic virus capsid is naked helical; and (e) the herpesvirus capsid is complex. (credit a: modification of work by Dr. Fred Murphy, Sylvia Whitfield; credit b: modification of work by Liza Gross; credit c: modification of work by Dr. F. A. Murphy, CDC; credit d: modification of work by USDA ARS; credit e: modification of work by Linda Stannard, Department of Medical Microbiology, University of Cape Town, South Africa, NASA; scale-bar data from Matt Russell)

Micrograph a shows icosahedral polioviruses arranged in a grid; micrograph b shows two Epstein-Barr viruses with icosahedral capsids encased in an oval membrane; micrograph c shows a mumps virus capsid encased in an irregular membrane; micrograph d shows rectangular tobacco mosaic virus capsids; and micrograph e shows a spherical herpesvirus envelope studded with glycoproteins.

Baltimore Classification

The most commonly and currently used system of virus classification was first developed by Nobel Prize-winning biologist David Baltimore in the early 1970s. In addition to the differences in morphology and genetics mentioned above, the Baltimore classification scheme groups viruses according to how the mRNA is produced during the replicative cycle of the virus.

Group I viruses contain double-stranded DNA (dsDNA) as their genome. Their mRNA is produced by transcription in much the same way as with cellular DNA, using the enzymes of the host cell.

Group II viruses have single-stranded DNA (ssDNA) as their genome. They convert their single-stranded genomes into a dsDNA intermediate before transcription to mRNA can occur.

Group III viruses use dsRNA as their genome. The strands separate, and one of them is used as a template for the generation of mRNA using the RNA-dependent RNA polymerase encoded by the virus.

Group IV viruses have ssRNA as their genome with a positive polarity, which means that the genomic RNA can serve directly as mRNA. Intermediates of dsRNA, called replicative intermediates, are made in the process of copying the genomic RNA. Multiple, full-length RNA strands of negative polarity (complementary to the positive-stranded genomic RNA) are formed from these intermediates, which may then serve as templates for the production of RNA with positive polarity, including both full-length genomic RNA and shorter viral mRNAs.

Group V viruses contain ssRNA genomes with a negative polarity, meaning that their sequence is complementary to the mRNA. As with Group IV viruses, dsRNA intermediates are used to make copies of the genome and produce mRNA. In this case, the negative-stranded genome can be converted directly to mRNA. Additionally, full-length positive RNA strands are made to serve as templates for the production of the negative-stranded genome.

Group VI viruses have diploid (two copies) ssRNA genomes that must be converted, using the enzyme reverse transcriptase, to dsDNA; the dsDNA is then transported to the nucleus of the host cell and inserted into the host genome. Then, mRNA can be produced by transcription of the viral DNA that was integrated into the host genome.

Group VII viruses have partial dsDNA genomes and make ssRNA intermediates that act as mRNA, but are also converted back into dsDNA genomes by reverse transcriptase, necessary for genome replication.

The characteristics of each group in the Baltimore classification are summarized in (Figure) with examples of each group.

Baltimore Classification
Group Characteristics Mode of mRNA Production Example
I Double-stranded DNA mRNA is transcribed directly from the DNA template Herpes simplex (herpesvirus)
II Single-stranded DNA DNA is converted to double-stranded form before RNA is transcribed Canine parvovirus (parvovirus)
III Double-stranded RNA mRNA is transcribed from the RNA genome Childhood gastroenteritis (rotavirus)
IV Single stranded RNA (+) Genome functions as mRNA Common cold (picornavirus)
V Single stranded RNA (-) mRNA is transcribed from the RNA genome Rabies (rhabdovirus)
VI Single stranded RNA viruses with reverse transcriptase Reverse transcriptase makes DNA from the RNA genome; DNA is then incorporated in the host genome; mRNA is transcribed from the incorporated DNA Human immunodeficiency virus (HIV)
VII Double stranded DNA viruses with reverse transcriptase The viral genome is double-stranded DNA, but viral DNA is replicated through an RNA intermediate; the RNA may serve directly as mRNA or as a template to make mRNA Hepatitis B virus (hepadnavirus)

Section Summary

Viruses are tiny, noncellular entities that usually can be seen only with an electron microscope. Their genomes contain either DNA or RNA—never both—and they replicate either by using the replication proteins of a host cell or by using proteins encoded in the viral genome. Viruses are diverse, infecting archaea, bacteria, fungi, plants, and animals. Viruses consist of a nucleic acid core surrounded by a protein capsid with or without an outer lipid envelope. The capsid shape, presence of an envelope, and core composition dictate some elements of the classification of viruses. The most commonly used classification method, the Baltimore classification, categorizes viruses based on how they produce their mRNA.

Visual Connection Questions

(Figure) Which of the following statements about virus structure is true?

  1. All viruses are encased in a viral membrane.
  2. The capsomere is made up of small protein subunits called capsids.
  3. DNA is the genetic material in all viruses.
  4. Glycoproteins help the virus attach to the host cell.

Review Questions

Which statement is true?

  1. A virion contains DNA and RNA.
  2. Viruses are acellular.
  3. Viruses replicate outside of the cell.
  4. Most viruses are easily visualized with a light microscope.


The viral ________ play(s) a role in attaching a virion to the host cell.

  1. core
  2. capsid
  3. envelope
  4. both b and c



  1. all have a round shape
  2. cannot have a long shape
  3. do not maintain any shape
  4. vary in shape


The observation that the bacteria genus Chlamydia contains species that can only survive as intracellular parasites supports which viral origin hypothesis?

  1. Progressive
  2. Regressive
  3. Self-replicating
  4. Virus-first


A scientist discovers a new virus with a linear, RNA genome surrounded by a helical capsid. The virus is most likely a member of which family based on structure classification?

  1. Rabies virus
  2. Herpesviruses
  3. Retroviruses
  4. Influenza viruses


Critical Thinking Questions

The first electron micrograph of a virus (tobacco mosaic virus) was produced in 1939. Before that time, how did scientists know that viruses existed if they could not see them? (Hint: Early scientists called viruses “filterable agents.”)

Viruses pass through filters that eliminated all bacteria which were visible in the light microscopes at the time. As the bacteria-free filtrate could still cause infections when given to a healthy organism, this observation demonstrated the existence of very small infectious agents. These agents were later shown to be unrelated to bacteria and were classified as viruses.

Varicella-zoster virus is a double-stranded DNA virus that causes chickenpox. How does its genome structure provide an evolutionary advantage over a single-stranded DNA virus?

Both viruses are made of DNA, but single-stranded DNA viruses lack the ability to create the double helix. Thus, double-stranded DNA viruses have a more stable genome due to the complimentary base pairing, increasing the lifespan of the virus’s genome.

Classify the Rabies virus (a rhabdovirus family member) and HIV-1 with both the Baltimore and genomic structure systems. Compare your results. What conclusions can be made about these two different methods?

Rabies virus is a (-) strand RNA virus that transcribes mRNAs from its genome (Group V).

HIV-1 is a single-stranded RNA retrovirus that uses reverse transcriptase to create a double-stranded DNA copy of its genome which is integrated into the host human’s genome prior to making mRNAs (Group VI).

The genome structure system classifies both viruses as single-stranded RNA viruses with linear genomes.

Baltimore classification sorts Rabies virus and HIV-1 into two different groups, indicating that the two viruses have very different life cycles. However, genome structure classification does not distinguish between the two viruses. This leaves out important information regarding virus function and survival.


lacking cells
protein coating of the viral core
protein subunit that makes up the capsid
lipid bilayer that encircles some viruses
group I virus
virus with a dsDNA genome
group II virus
virus with an ssDNA genome
group III virus
virus with a dsRNA genome
group IV virus
virus with an ssRNA genome with positive polarity
group V virus
virus with an ssRNA genome with negative polarity
group VI virus
virus with an ssRNA genome converted into dsDNA by reverse transcriptase
group VII virus
virus with a single-stranded mRNA converted into dsDNA for genome replication
matrix protein
envelope protein that stabilizes the envelope and often plays a role in the assembly of progeny virions
negative polarity
ssRNA viruses with genomes complementary to their mRNA
positive polarity
ssRNA virus with a genome that contains the same base sequences and codons found in their mRNA
replicative intermediate
dsRNA intermediate made in the process of copying genomic RNA
reverse transcriptase
enzyme found in Baltimore groups VI and VII that converts single-stranded RNA into double-stranded DNA
viral receptor
glycoprotein used to attach a virus to host cells via molecules on the cell
individual virus particle outside a host cell
virus core
contains the virus genome


Icon for the Creative Commons Attribution 4.0 International License

Biology 2e Copyright © 2018 by OpenStax Biology 2nd Edition is licensed under a Creative Commons Attribution 4.0 International License, except where otherwise noted.

Share This Book