HOMOLOGOUS RECOMBINATION
In the preceding parts of this chapter, we discussed the mechanisms that allow the DNA sequences in cells to be maintained from generation to generation with very little change. In this part, we further explore a group of repair mechanisms that depend on a process called homologous recombination. The key feature of homologous recombination (also known as general recombination) is an exchange of DNA strands between a pair of homologous duplex DNA sequences. Such a strand exchange between two regions of double helix that are very similar or identical in nucleotide sequence allows one stretch of duplex DNA to restore lost or damaged information on a second stretch of duplex DNA. Because the DNA sequence information that is used to correct the damage can come from a separate DNA molecule, homologous recombination can repair many types of DNA damage. It makes possible, for example, the accurate repair of double-strand breaks, as mentioned previously (see Figure 5–45B). As pointed out earlier, these double-strand breaks can result from reactive chemicals or radiation (for example, that from radon gas that accumulates in some old basements). But more frequently they arise from DNA replication accidents—when forks become stalled or broken independently of any such external cause. Homologous recombination accurately corrects these accidents, and, because they occur during nearly every round of DNA replication, this repair pathway is essential for every proliferating cell. Homologous recombination can also repair other types of DNA damage (for example, covalent cross-links between the two strands of a DNA double helix), being perhaps the most versatile DNA repair mechanism available to the cell; this probably explains why its mechanism and the proteins that carry it out have been conserved in virtually all cells on Earth.
We shall also see that homologous recombination plays an additional role in sexually reproducing organisms. During meiosis, a key step in gamete (sperm and egg) production, it catalyzes the orderly exchange of blocks of genetic information between corresponding (homologous) maternal and paternal chromosomes. This creates new combinations of DNA sequences in the chromosomes that are passed to offspring, giving the next generation unique characteristics upon which natural selection can act.
Homologous Recombination Has Common Features in All Cells
The current view of homologous recombination as a critical DNA repair mechanism in all cells developed slowly from its original discovery as a key component in the specialized process of meiosis in plants and animals. The subsequent recognition that homologous recombination also occurs in unicellular organisms made it readily amenable to molecular analyses. Thus, much of what we know about the biochemistry of genetic recombination was derived from studies of bacteria, especially of E. coli and its viruses, as well as from experiments with simple eukaryotes such as yeasts. For these organisms with short generation times and relatively small genomes, it was possible to isolate a large set of mutants with defects in their recombination processes. The protein altered in each mutant was then identified and, ultimately, studied biochemically. Very close relatives of these proteins were subsequently found in more complex eukaryotes including flies, mice, and humans, and it is now possible to directly analyze homologous recombination in these species as well. As a result, we now know that the fundamental processes that catalyze homologous recombination are common to all cells.
DNA Base-pairing Guides Homologous Recombination
The hallmark of homologous recombination is that it takes place only between DNA duplexes that have extensive regions of sequence similarity (homology). Not surprisingly, base-pairing underlies this requirement: before undergoing homologous recombination, two DNA helices will “sample” each other’s DNA sequence by testing the potential base-pairing between a single strand from one DNA duplex and a complementary single strand from the other. Recombination is initiated when a match is found; this match need not be perfect, but it must be very close for homologous recombination to succeed. As we shall see, the process is carefully controlled and guided by a group of specialized proteins.
Homologous Recombination Can Flawlessly Repair Double-Strand Breaks in DNA
Unlike the nonhomologous end joining discussed earlier, homologous recombination repairs double-strand breaks accurately, without any loss or alteration of nucleotides at the site of repair. For homologous recombination to do this repair job, the damaged DNA must first be brought into proximity with a homologous but undamaged DNA double helix, which can then serve as a template for repair. For this reason, homologous recombination often occurs after DNA replication, when the two daughter DNA molecules lie close together and one can serve as a template for repair of the other.
One of the simplest pathways through which homologous recombination can repair double-strand breaks is shown in Figure 5–47. In essence, the broken DNA duplex and the template duplex carry out a “strand dance” so that one of the damaged strands can use the complementary strand of the intact DNA duplex as a template for repair. Once the damaged and template DNA double helices are in proximity (as occurs, for example, after DNA replication), the ends of the broken DNA are chewed back, or “resected,” by specialized nucleases to produce overhanging, single-strand 3′ ends. The next step is strand exchange (also called strand invasion), during which one of the single-strand 3′ ends from the damaged DNA molecule searches the template duplex for homologous sequences through base-pairing. Once stable base-pairing is established (which completes the strand-exchange step), an accurate DNA polymerase extends the invading strand using the information provided by the undamaged template molecule, thus restoring one of the damaged DNA strands. The last steps—strand displacement, further repair synthesis, and ligation—restore the two original DNA double helices and complete the repair process, as illustrated.
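The logic of this pathway can be mimicked with a toy string model. The Python sketch below is only an illustration under strong simplifying assumptions: each duplex is written as a single top-strand string, the break has destroyed a few nucleotides, and resection, strand invasion, repair synthesis, and ligation are collapsed into simple string operations. The function name `repair_by_template` and the `min_homology` parameter are hypothetical and chosen only for this example.

```python
def repair_by_template(left, right, template, min_homology=15):
    """Toy model: restore the sequence lost between two broken ends.

    left, right -- top-strand sequences flanking the break (5'->3')
    template    -- intact homologous duplex, written as its top strand
    """
    # "Strand invasion": find the template by homology with the last
    # min_homology nucleotides of the left end (assumed to be unique).
    probe = left[-min_homology:]
    start = template.find(probe)
    if start == -1:
        raise ValueError("no homologous template found")
    synth_from = start + len(probe)

    # "Repair synthesis": copy from the template until the new strand
    # reaches sequence that matches the beginning of the right end.
    anchor = right[:min_homology]
    end = template.find(anchor, synth_from)
    if end == -1:
        raise ValueError("right end has no homology in the template")
    patch = template[synth_from:end]

    # "Ligation": join the restored stretch to the two original ends.
    return left + patch + right


# Example: a sister duplex restores six nucleotides lost at the break.
sister = "ATGCGTACCTGAAGTCCATGGCTTAACGGATCTTGACCAGTTCGAGGCTAAGCTTGGCAT"
repaired = repair_by_template(sister[:25], sister[31:], sister)
print(repaired == sister)
```

Because the missing nucleotides are copied from the intact duplex rather than guessed, the repaired sequence matches the template exactly, which is the property that distinguishes this pathway from nonhomologous end joining.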
Homologous recombination resembles other DNA repair reactions in that a DNA polymerase utilizes a pristine template to restore damaged DNA. However, instead of using the partner strand as a template, as occurs in most DNA repair pathways, homologous recombination makes use of a complementary strand from a separate DNA duplex. In the following sections, we discuss the steps of homologous recombination in more detail with an emphasis on the proteins that guide this remarkable process.
Specialized Processing of Double-Strand Breaks Commits Repair to Homologous Recombination
Once a double-strand break occurs, nonhomologous end joining and homologous recombination compete to repair the damage. But the specialized nuclease that resects DNA ends to begin homologous recombination becomes highly active during the S and G2 phases of the cell cycle (through its phosphorylation by cell-cycle-controlled kinases), and homologous recombination usually wins out at these times, allowing use of a newly replicated daughter DNA molecule as a template. The initiating nuclease (called the Mre11 complex in eukaryotes) chews back in the 5′→3′ direction, leaving protruding 3′ ends on either side of the break that can be as long as several thousand nucleotides. Single-strand binding protein (the same one used at replication forks) then coats the exposed single strands, protecting them from other nucleases in the cell and ensuring that they remain free of intramolecular base-pairing. The formation of these protruding ends prevents nonhomologous end joining from occurring, and it commits the repair pathway to homologous recombination.
Strand Exchange Is Directed by the RecA/Rad51 Protein
Of all the steps of homologous recombination, strand invasion is the most difficult to imagine. How does the invading single strand rapidly sample a DNA duplex for a complementary sequence? Once the homology is found, how is the structure stabilized? And how is the inherent stability of the template double helix overcome to allow tests for base-pairing during this process?
The answers to these questions came from biochemical and structural studies of the main protein that carries out this feat, called RecA in E. coli and Rad51 in virtually all eukaryotic organisms. A special group of accessory proteins loads a set of RecA/Rad51 monomers onto a protruding DNA single strand (such as that in Figure 5–47), forming a cooperatively bound filament that displaces the single-strand binding protein originally present. This orderly loading process produces a protein–DNA filament in which the DNA is held by RecA/Rad51 in an unusual conformation: groups of three consecutive nucleotides are positioned as though they were in a conventional DNA double helix, but, between adjacent triplets, the DNA backbone is untwisted and stretched out (Figure 5–48). This unusual protein–DNA structure then grasps a nearby duplex DNA molecule in a way that stretches it, destabilizing it and making it easy to pull the strands apart. The invading single strand then can sample the sequence of the duplex by conventional base-pairing to one of its strands. This sampling occurs in triplet nucleotide blocks, each of which is already in a “base-pair ready” conformation in the invading strand; only when a good triplet match occurs is the adjacent triplet sampled, and so on. In this way, mismatches very quickly cause dissociation, so that millions of possible pairings can be tested. Only an extended stretch of base-pairing (at least 15 nucleotides) can stabilize the invading strand, leading to the next steps in homologous recombination.
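The triplet-by-triplet search can be illustrated with a short sketch. The Python code below is a deliberately simplified model, not a description of the real kinetics: it assumes that every sampled triplet must pair perfectly (the actual process tolerates an occasional mismatch), it tests positions one after another rather than by random collision, and it represents the duplex only by the strand that the invader pairs with. The names `paired_length` and `find_homology` are hypothetical; the 15-nucleotide threshold is the figure given in the text.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def paired_length(invader, target, offset, block=3):
    """Count nucleotides paired when sampling in blocks of three.

    invader -- invading single strand, written 5'->3'
    target  -- complementary strand of the duplex, written 3'->5', so
               that index i of one strand faces index i of the other
    offset  -- position in target facing the first base of the invader
    """
    paired = 0
    for i in range(0, len(invader) - block + 1, block):
        facing = target[offset + i:offset + i + block]
        if len(facing) < block:
            break  # ran off the end of the duplex
        triplet = invader[i:i + block]
        if any(COMPLEMENT[x] != y for x, y in zip(triplet, facing)):
            break  # a mismatched triplet ends this attempt at once
        paired += block  # good triplet: move on to the adjacent one
    return paired


def find_homology(invader, target, min_paired=15):
    """Return the first offset with at least min_paired paired bases, else None."""
    for offset in range(len(target) - min_paired + 1):
        if paired_length(invader, target, offset) >= min_paired:
            return offset
    return None


# Example: an 18-nucleotide invading end searching a 31-nucleotide duplex.
top = "GGATCCTTAGACGTTACGGATCCATGCAAGT"
target = "".join(COMPLEMENT[b] for b in top)
print(find_homology(top[8:26], target))
```

The early `break` is the point of the sketch: most incorrect positions are rejected after a single triplet, so very little time is spent on each failed attempt.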
RecA/Rad51 is an ATPase, and the steps described above require that each monomer along the filament be in the ATP-bound state. However, the searching itself does not require ATP hydrolysis; instead, the process occurs by simple molecular collisions, allowing an enormous number of potential sequences to be rapidly sampled. Once stable base-pairing occurs and a strand-exchange reaction is completed, ATP hydrolysis is necessary to disassemble RecA/Rad51 from the complex of DNA molecules. At this point, repair DNA polymerases and DNA ligase, which we encountered earlier in this chapter, complete the repair process, as shown previously in Figure 5–47.
Homologous Recombination Can Rescue Broken and Stalled DNA Replication Forks
Although accurately repairing double-strand breaks is a crucial function of homologous recombination, it can also repair other types of damage. For example, some chemicals cross-link the two strands of DNA together by covalently joining nucleotides on opposite strands. A special set of enzymes unlinks the strands and cuts out the damaged bits on both strands. At this point, the damaged DNA has been converted to a double-strand break, which can be accurately repaired by homologous recombination, as discussed earlier. Similarly, proteins can become accidentally covalently linked to DNA, and these sites can also be converted by nucleases into double-strand breaks, allowing repair by homologous recombination. But perhaps the most important role of homologous recombination is in rescuing broken or stalled DNA replication forks. Many types of events can cause a replication fork to stop, and here we consider two examples. The first arises from an accidental single-strand gap in the parent DNA helix that lies just ahead of a replication fork. When the fork reaches this lesion, it falls apart, resulting in one broken and one intact daughter chromosome. Because this is a “one-sided” double-strand break, it cannot be repaired by nonhomologous end joining, and homologous recombination becomes crucial. The broken fork can be accurately repaired using the same basic reactions we discussed earlier for the repair of double-strand breaks (Figure 5–49). With slight modifications, the set of reactions just depicted can accurately repair many different types of DNA damage, provided that an undamaged duplex DNA template is available.
A different type of problem arises when a replication fork attempts to move through certain types of DNA damage that clog up the replication machinery, stalling the fork. Because such damaged DNA often ends up deeply buried in the core of the replication fork, it cannot be easily repaired. To resolve this problem, the replication machine “backs up” through a series of strand-exchange reactions similar to those we have discussed (Figure 5–50). This maneuver allows one newly synthesized DNA strand to act as a template for synthesis of the other new strand, thereby bypassing the damaged template and allowing replication to proceed.
DNA Repair by Homologous Recombination Entails Risks to the Cell
Although homologous recombination neatly solves the problem of accurately repairing double-strand breaks and other types of DNA damage, it sometimes “repairs” damage using the wrong bit of the genome as the template. For example, sometimes a broken human chromosome is repaired using the homolog from the other parent instead of the sister chromatid as the template. Because maternal and paternal chromosomes differ in DNA sequence at many positions along their lengths, this type of repair can convert the sequence of the repaired DNA from the maternal to the paternal sequence or vice versa. The result of this type of errant recombination is a loss of heterozygosity. It can have severe consequences if the homolog used for repair contains a deleterious mutation, because the recombination event destroys the “good” copy. Loss of heterozygosity, although it happens rarely, is nonetheless a critical step in the formation of many cancers (discussed in Chapter 20).
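The difference between the two choices of template can be made concrete with a toy example. In the Python sketch below, each chromosome is a short string, the two homologs differ at a single position, and repair is modeled simply as copying a stretch of the template into the broken chromosome. The sequences and the function names `repair_region` and `heterozygous_sites` are invented for illustration only.

```python
def repair_region(broken, template, start, end):
    """Toy repair: replace broken[start:end] with the template's sequence."""
    return broken[:start] + template[start:end] + broken[end:]


def heterozygous_sites(homolog_a, homolog_b):
    """Positions at which the two homologs carry different nucleotides."""
    return [i for i, (a, b) in enumerate(zip(homolog_a, homolog_b)) if a != b]


maternal = "ACGTTGCA"
paternal = "ACGATGCA"   # differs from the maternal copy at position 3
sister = maternal       # the sister chromatid is an identical copy

# Repair from the sister chromatid leaves the cell heterozygous.
print(heterozygous_sites(repair_region(maternal, sister, 2, 5), paternal))

# Repair from the paternal homolog erases the maternal allele.
print(heterozygous_sites(repair_region(maternal, paternal, 2, 5), paternal))
```

In the second case the list of heterozygous positions is empty: the cell now carries two copies of the paternal sequence across the repaired stretch, whichever allele that happens to be.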
Cells go to great lengths to minimize the risk of mishaps of these types; indeed, as we have seen, nearly every step of homologous recombination is carefully regulated. Recall that the first step (resection of the broken ends) is coordinated with the cell cycle: it occurs primarily in the S and G2 phases of the cell cycle, favoring the use of a daughter duplex (either as a partially replicated chromosome or a fully replicated sister chromatid) as a template for repair (see Figure 5–47). The close proximity of the two daughter chromosomes disfavors the use of other genome sequences in the repair process.
The loading of RecA/Rad51 onto the processed DNA ends and the subsequent strand-exchange reaction are also tightly controlled by the cell, and a host of accessory proteins is needed to regulate these steps. There are many such proteins, and exactly how all of them coordinate and control homologous recombination remains a mystery, although we do understand how a few of them work, as described below. We also know that the enzymes that catalyze recombinational repair are made at relatively high levels in eukaryotes and are dispersed throughout the nucleus in an inactive form. In response to DNA damage, they rapidly converge on the sites of DNA damage, become activated, and form “repair factories” where many lesions are apparently brought together and repaired (Figure 5–51). Formation of these factories probably results from many weak interactions between different repair proteins and between repair proteins and damaged DNA, producing the type of biomolecular condensates discussed in Chapter 3 (see Figure 3–77). The high local concentration of the appropriate proteins and their substrates within these condensates is thought to increase the speed and efficiency of the repair process.
In Chapter 20, we shall see that both too much and too little homologous recombination can lead to cancer in humans, the former through repair using the “wrong” template (as described above) and the latter through an increased mutation rate caused by inefficient DNA repair. Clearly, a delicate balance has evolved that keeps this process in check on undamaged DNA, while still allowing it to act efficiently and rapidly on DNA lesions as soon as they arise.
Not surprisingly, mutations in the components that carry out and regulate homologous recombination are responsible for several inherited forms of cancer. Two of these, the Brca1 and Brca2 proteins, were first discovered because mutations in their genes lead to a greatly increased frequency of breast cancer. Because these mutations cause inefficient repair by homologous recombination, accumulation of DNA damage can, in a small proportion of cells, give rise to a cancer. Brca1 regulates an early step in broken-end processing; without it, such ends are not processed correctly for homologous recombination and instead damaged molecules are shunted to the error-prone nonhomologous end-joining pathway (see Figure 5–45). After resection, Brca2 is needed to correctly load the Rad51 protein onto the protruding single-strand DNA ends in preparation for strand exchange.
Homologous Recombination Is Crucial for Meiosis
We have seen that homologous recombination can use a set of reactions—including broken-end resection, strand invasion, limited DNA synthesis, and ligation—to exchange DNA sequences between two double helices with the same nucleotide sequence and thereby repair damaged DNA. We now describe how homologous recombination is used to deliberately exchange material between two different chromosomes in order to generate DNA molecules that carry novel combinations of genes. This is a frequent and necessary part of meiosis, which occurs in sexually reproducing organisms such as fungi, plants, and animals.
In meiosis, homologous recombination is an integral part of the process that allows chromosomes to be parceled out to germ cells (sperm and eggs in animals). We discuss the process of meiosis in detail in Chapter 17; here we discuss how homologous recombination during meiosis produces chromosome crossing-over and gene conversion, resulting in hybrid chromosomes that contain genetic information from both the maternal and paternal homologs (Figure 5–52). These mechanisms, at their core, closely resemble those used to repair double-strand breaks.
Meiotic Recombination Begins with a Programmed Double-Strand Break
Homologous recombination in meiosis starts with a bold stroke: a specialized Spo11 protein complex breaks both strands of the DNA double helix in one of the recombining chromosomes (Figure 5–53). After catalyzing this reaction, the protein complex remains covalently bound to the broken DNA, much like the DNA topoisomerase we encountered earlier in this chapter (see Figure 5–22). Many of the subsequent recombination reactions closely resemble those already described for the repair of double-strand breaks; indeed, some of the same proteins are used for both processes. For example, the Mre11 complex, which we encountered earlier, chews back the DNA ends, removing the proteins along with the DNA and leaving the protruding 3′ single-strand ends needed for strand invasion.
However, several meiosis-specific proteins come into play and guide the reactions somewhat differently, resulting in the distinctive outcomes observed for meiosis. A key difference is that, in meiosis, recombination occurs preferentially between maternal and paternal chromosomal homologs (which are held closely together during meiosis), rather than between newly replicated, identical DNA duplexes as in double-strand break repair. In the sections that follow, we describe in more detail those aspects of homologous recombination that are especially important for meiosis.
Holliday Junctions Are Recognized by Enzymes That Drive Branch Migration
Of special importance in meiosis is an intermediate structure known as a Holliday junction, or cross-strand exchange, in which two homologous DNA helices that have paired are held together by the reciprocal exchange of two of the four strands present, one strand originating from each of the helices. This junction can be considered to contain two pairs of strands: one pair of crossing strands and one pair of noncrossing strands (Figure 5–54A). But by undergoing a series of rotational movements, it can isomerize to form an open, symmetrical structure in which both pairs of strands occupy equivalent positions (Figure 5–54B and D). A special set of recombination proteins that binds to this open isomer uses the energy of ATP hydrolysis to catalyze a reaction known as branch migration (Figure 5–55), which greatly expands the region of heteroduplex DNA that was initially created by a strand-exchange reaction (Figure 5–54B and C). In meiosis, heteroduplex regions often “migrate” thousands of nucleotides from the original site of the double-strand break. The step where this migration occurs is indicated in Figure 5–53. As shown in the figure, Holliday junctions are often produced in pairs, known as double Holliday junctions.
Homologous Recombination Produces Crossovers Between Maternal and Paternal Chromosomes During Meiosis
There are two basic outcomes of homologous recombination during meiosis, as shown previously in Figure 5–53 (Movie 5.8). In humans, approximately 90% of the double-strand breaks produced during meiosis are resolved as non-crossovers (right side of Figure 5–53). Here, the two original DNA duplexes separate from each other in a form unaltered except for a region of heteroduplex that formed near the site of the original double-strand break. As already noted, this set of reactions resembles that described earlier for the repair of double-strand breaks.
The other outcome is much more profound: a double Holliday junction is formed and is cleaved by specialized enzymes (blue arrows on the left side of Figure 5–53) to create a crossover. The two original portions of each chromosome upstream and downstream from the two Holliday junctions are thereby swapped, creating two chromosomes that are said to have “crossed over”—each containing a large number of both maternally inherited and paternally inherited genes.
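The net effect of a crossover on the arrangement of genes can be shown with a minimal sketch. Here each chromosome is written as a string of gene labels, with uppercase letters standing for maternally inherited alleles and lowercase letters for paternally inherited ones. This is purely a bookkeeping illustration: it ignores the Holliday junctions and the heteroduplex region, and the function name `crossover` is hypothetical.

```python
def crossover(maternal, paternal, point):
    """Swap everything downstream of `point` between two homologs."""
    recombinant_1 = maternal[:point] + paternal[point:]
    recombinant_2 = paternal[:point] + maternal[point:]
    return recombinant_1, recombinant_2


# Eight genes on each homolog; the exchange occurs after the third gene.
print(crossover("ABCDEFGH", "abcdefgh", 3))
```

Each product carries a block of maternal genes joined to a block of paternal genes, a combination that existed in neither parent chromosome.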
How does the cell decide which double-strand breaks to resolve as crossovers? The answer is not yet known, but we know the decision is not random. The relatively few crossovers that do form are distributed along chromosomes in such a way that a crossover in one position inhibits crossing-over in neighboring regions. Termed crossover control, this fascinating but poorly understood regulatory mechanism ensures the roughly even distribution of crossover points along chromosomes. It also ensures that each chromosome—no matter how small—undergoes at least one crossover event every meiosis. For many organisms, roughly two crossovers per chromosome occur during each meiosis, one on each arm. As discussed in detail in Chapter 17, these crossovers, in addition to producing novel DNA molecules, play an important mechanical role in the proper segregation of chromosomes during meiosis.
Whether a meiotic recombination event is resolved as a crossover or a non-crossover, the recombination machinery leaves behind a heteroduplex region where a strand with the DNA sequence of the paternal homolog is base-paired with a strand from the maternal homolog (Figure 5–56). These heteroduplex regions can tolerate a small percentage of mismatched base pairs, and because of branch migration, they often extend for thousands of nucleotide pairs. The many non-crossover events that occur in meiosis thereby produce scattered sites in the germ cells where short DNA sequences from one homolog have been pasted into the other homolog. Heteroduplex regions mark sites of potential gene conversion—where the four haploid chromosomes produced by meiosis contain three copies of a DNA sequence from one homolog and only one copy of this sequence from the other homolog, as explained next.
Homologous Recombination Often Results in Gene Conversion
In sexually reproducing organisms, it is a fundamental law of genetics that—aside from mitochondrial DNA, which is inherited only through the mother—each parent makes an equal genetic contribution to an offspring. One complete set of nuclear genes is inherited from the father and one complete set is inherited from the mother. Underlying this law is the accurate parceling out of chromosomes to the germ cells (eggs and sperm) that takes place during meiosis. Thus, when a diploid cell in a parent undergoes meiosis to produce four haploid germ cells, exactly half of the genes distributed among these four cells should be maternal (genes inherited from the mother of this parent) and the other half paternal (genes inherited from the father of this parent). In some organisms (fungi, for example), it is possible to recover and analyze all four of the haploid gametes produced from a single cell by meiosis. Studies in such organisms have revealed rare cases in which the parceling out of genes violates the standard genetic rules. Occasionally, for example, meiosis yields three copies of the maternal version of a gene and only one copy of the paternal version. Alternative versions of the same gene are called alleles, and it is the divergence from their expected distribution during meiosis that is known as gene conversion (Movie 5.8). Genetic studies show that only small sections of DNA typically undergo gene conversion, and in many cases only a part of a gene is changed. How is this possible?
We have seen that both crossovers and non-crossovers produce heteroduplex regions of DNA. If the two strands that make up a heteroduplex region do not have identical nucleotide sequences, mismatched base pairs are formed, and these are often repaired by the cell’s mismatch repair system (see Figure 5–20). However, unlike what happens after DNA replication, in meiosis the mismatch repair system randomly selects the strand to be used as a template, causing one allele to be lost and the other duplicated (Figure 5–57). Thus, gene conversion (the “conversion” of one allele to the other)—originally regarded as a mysterious deviation from the rules of genetics—can be seen as a straightforward consequence of the mechanisms of homologous recombination during meiosis.
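A small simulation shows how random strand choice produces the unequal ratios. The Python sketch below follows a single locus through one meiosis and makes several assumptions that are ours, not the text's: there are four chromatids, exactly one of them carries a heteroduplex (one strand with the maternal allele "M" paired with one strand carrying the paternal allele "P"), and mismatch repair chooses either strand with equal probability. The function names are hypothetical.

```python
import random
from collections import Counter


def resolve_heteroduplex(strands, rng):
    """Mismatch repair uses one strand, chosen at random, as the template."""
    return rng.choice(strands)


def meiosis_products(rng):
    """Alleles found in the four haploid products at a single locus."""
    # Each chromatid is a pair of strands. One maternal chromatid has
    # acquired a paternal strand through strand exchange.
    chromatids = [("M", "M"), ("M", "P"), ("P", "P"), ("P", "P")]
    return Counter(
        a if a == b else resolve_heteroduplex((a, b), rng)
        for a, b in chromatids
    )


rng = random.Random(0)
for _ in range(5):
    print(dict(meiosis_products(rng)))
```

Under these assumptions each meiosis yields either the expected two maternal and two paternal copies (the mismatch was repaired back to the original allele) or one maternal and three paternal copies (gene conversion). A heteroduplex on a paternal chromatid would give the mirror-image result of three maternal copies and one paternal copy.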
Summary
Homologous recombination describes a flexible set of reactions resulting in the exchange of DNA sequences between a pair of identical or nearly identical duplex DNA molecules. Of special importance is a strand-exchange step whereby a single strand from one DNA duplex invades a second duplex and base-pairs with one strand while displacing the other. This reaction, catalyzed by the RecA/Rad51 family of proteins, can only occur if the invading strand can form a short stretch of consecutive nucleotide pairs with one of the strands of the duplex. This requirement ensures that homologous recombination occurs only between identical or very similar DNA sequences.
When used as a DNA repair mechanism, homologous recombination usually occurs between a damaged DNA molecule and its recently duplicated sister molecule, with the undamaged duplex acting as a template to repair the damaged copy flawlessly. In meiosis, homologous recombination is initiated by deliberate, carefully regulated double-strand breaks and occurs preferentially between the homologous chromosomes rather than the newly replicated sister chromatids. The outcome can be either two chromosomes that have crossed over (that is, chromosomes in which the DNA on either side of the site of DNA pairing originates from two different homologs) or two non-crossover chromosomes. In the latter case, the two chromosomes that result are identical to the original two homologs, except for relatively minor DNA sequence changes at the site of recombination.
Glossary
- homologous recombination (general recombination): Genetic exchange between a pair of identical or very similar DNA sequences, often those located on two copies of the same chromosome. Provides an error-free mechanism for repairing DNA double-strand breaks.
- strand exchange: Reaction in which a single-strand 3′ end from one duplex DNA molecule penetrates another duplex and finds a homologous sequence through base-pairing. Also called strand invasion.
- RecA protein: Prototype for a ubiquitous class of DNA-binding proteins that catalyze synapsis of DNA strands during genetic recombination in bacteria; analogous to Rad51 protein in eukaryotes.
- Rad51 protein: Eukaryotic protein that catalyzes the pairing of homologous DNA strands during recombination and repair processes. Analogous to the RecA protein in E. coli and other bacteria.
- loss of heterozygosity: The result of errant homologous recombination that uses the homolog from the other parent instead of the sister chromatid as the template, converting the sequence of the repaired DNA to that of the other homolog.
- Holliday junction (cross-strand exchange): X-shaped structure formed in DNA molecules undergoing recombination, in which the two DNA molecules are held together by exchanging one of their two strands; also called a cross-strand exchange.
- allele: One of several alternative forms of a gene. In a diploid cell, each gene will typically have two alleles, occupying the same corresponding position (locus) on homologous chromosomes.
- gene conversion: Process by which DNA sequence information can be transferred from one DNA helix (which remains unchanged) to another DNA helix whose sequence is altered. It often accompanies general recombination events.