Enhanced features in DSSR for G-quadruplexes

Over the past couple of months, I’ve further enhanced the DSSR-derived structural features for G-quadruplexes (G4). One enhancement is the implementation of the single descriptor of intramolecular canonical G4 structures with three connecting loops, recently proposed by Dvorkin et al. The descriptor contains the number of stacked G-tetrads (layers) in the G4 stem, the type and relative direction of the loops linking G-tracts of the stem, and the groove widths associated with lateral loops. For example, PDB entry 2GKU (see the DSSR-enabled PyMOL schematic image below, Fig. 1A) has the following DSSR output.

List of 1 G4-stem
  Note: a G4-stem is defined as a G4-helix with backbone connectivity.
        Bulges are also allowed along each of the four strands.
  stem#1[#1] layers=3 INTRA-molecular loops=3 descriptor=3(-P-Lw-Ln) note=hybrid-1(3+1) UUDU anti-parallel
   1  glyco-bond=ss-s groove=-wn- mm(<>,outward)  area=14.24 rise=3.58 twist=16.8  nts=4 GGGG A.DG3,A.DG9,A.DG17,A.DG21
   2  glyco-bond=--s- groove=-wn- pm(>>,forward)  area=13.12 rise=3.71 twist=25.9  nts=4 GGGG A.DG4,A.DG10,A.DG16,A.DG22
   3  glyco-bond=--s- groove=-wn-                                                  nts=4 GGGG A.DG5,A.DG11,A.DG15,A.DG23
    strand#1  U DNA glyco-bond=s-- nts=3 GGG A.DG3,A.DG4,A.DG5
    strand#2  U DNA glyco-bond=s-- nts=3 GGG A.DG9,A.DG10,A.DG11
    strand#3  D DNA glyco-bond=-ss nts=3 GGG A.DG17,A.DG16,A.DG15
    strand#4  U DNA glyco-bond=s-- nts=3 GGG A.DG21,A.DG22,A.DG23
    loop#1 type=propeller strands=[#1,#2] nts=3 TTA A.DT6,A.DT7,A.DA8
    loop#2 type=lateral   strands=[#2,#3] nts=3 TTA A.DT12,A.DT13,A.DA14
    loop#3 type=lateral   strands=[#3,#4] nts=3 TTA A.DT18,A.DT19,A.DA20

The descriptor=3(-P-Lw-Ln) means that the G4 structure has three layers of G-tetrads, connected via three loops: the first is the Propeller loop in anti-clockwise (negative) direction, then the Lateral loop passing a wide groove anti-clockwise, and finally another Lateral loop passing a narrow groove, also anti-clockwise. The DSSR symbols follow those of Dvorkin et al. but with capital letters L, P, and D for lateral, propeller, and diagonal loops instead of lower case letters (l, p, d) to avoid using subscript for groove-width info. So the 2GKU descriptor 3(-P-Lw-Ln) from DSSR corresponds to 3(-p-lw-ln) of Dvorkin et al.
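
To make the notation concrete, here is a minimal Python sketch that splits a DSSR-style descriptor string into its parts. It is not part of DSSR: the function name parse_descriptor and the dictionary layout are hypothetical, and the reading of "+" as clockwise is an assumption made by symmetry with the anti-clockwise "-" described above. The sketch also accepts the extra symbol X that appears in the 6H1K descriptor later in this post, and its mapping to the diag-prop loop type is inferred from that example only.

```python
import re

# Loop letters as they appear in the DSSR descriptors quoted in this post.
# "X" -> "diag-prop" is inferred from the 6H1K example, not from DSSR docs.
LOOP_TYPES = {"L": "lateral", "P": "propeller", "D": "diagonal", "X": "diag-prop"}
# "-" is anti-clockwise per the text; "+" as clockwise is an assumption.
DIRECTIONS = {"-": "anti-clockwise", "+": "clockwise", "": None}
GROOVES = {"w": "wide", "n": "narrow", "": None}

DESCRIPTOR_RE = re.compile(r"^(\d+)\(([^()]*)\)$")
LOOP_RE = re.compile(r"([+-]?)([LPDX])([wn]?)")


def parse_descriptor(text):
    """Split a descriptor such as '3(-P-Lw-Ln)' into layers and loops."""
    match = DESCRIPTOR_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a G4 descriptor: {text!r}")
    layers, body = int(match.group(1)), match.group(2)

    loops = []
    pos = 0
    while pos < len(body):
        # Consume one loop token (optional sign, letter, optional groove).
        token = LOOP_RE.match(body, pos)
        if token is None:
            raise ValueError(f"unexpected text in descriptor: {body[pos:]!r}")
        sign, letter, groove = token.groups()
        loops.append({
            "type": LOOP_TYPES[letter],
            "direction": DIRECTIONS[sign],
            "groove": GROOVES[groove],
        })
        pos = token.end()
    return {"layers": layers, "loops": loops}


# Example with the 2GKU descriptor from the DSSR output above.
for loop in parse_descriptor("3(-P-Lw-Ln)")["loops"]:
    print(loop["type"], loop["direction"], loop["groove"])
```

For 2GKU this yields three layers and three loops (propeller, lateral over a wide groove, lateral over a narrow groove), all anti-clockwise, which matches the reading given above.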

The DSSR-enabled, PyMOL-rendered block image in Fig. 1A makes the three G-tetrad layers (square green blocks) immediately obvious. Other base identities and stacking interactions also become clear. For example, A24 (in red) stacks on the top G-tetrad, and the T1-A20 pair stacks on the bottom G-tetrad.

Two other PDB entries (2LOD and 2KOW) are illustrated in Fig. 1B and Fig. 1C. Their topologies differ from that of 2GKU (Fig. 1A), and DSSR characterizes all three consistently.

DSSR-enabled G4 analysis and representation
Figure 1. DSSR-enabled, PyMOL-rendered, block images of five G-quadruplexes. A in red, C in yellow, G (and G-tetrad) in green, and T in blue.

Another new G4-related feature in DSSR is the detection of V-shaped loops in noncanonical G4 structures, where one of the four G-G columns (strands) that link adjacent G-tetrads is broken. Two recent PDB examples with V-loops are shown in Fig. 1D (5ZEV) and Fig. 1E (6H1K). An excerpt of the DSSR output for PDB entry 6H1K is shown below.

List of 1 G4-helix
  Note: a G4-helix is defined by stacking interactions of G4-tetrads, regardless
        of backbone connectivity, and may contain more than one G4-stem.
  helix#1[1] stems=[#1] layers=3 INTRA-molecular
   1  glyco-bond=-sss groove=w--n mm(<>,outward)  area=12.76 rise=3.47 twist=18.2  nts=4 GGGG A.DG2,A.DG19,A.DG15,A.DG26
   2  glyco-bond=s--- groove=w--n pm(>>,forward)  area=12.84 rise=3.07 twist=33.4  nts=4 GGGG A.DG1,A.DG20,A.DG16,A.DG27
   3  glyco-bond=s--- groove=w--n                                                  nts=4 GGGG A.DG25,A.DG21,A.DG17,A.DG28
    strand#1 DNA glyco-bond=-ss nts=3 GGG A.DG2,A.DG1,A.DG25
    strand#2 DNA glyco-bond=s-- nts=3 GGG A.DG19,A.DG20,A.DG21
    strand#3 DNA glyco-bond=s-- nts=3 GGG A.DG15,A.DG16,A.DG17
    strand#4 DNA glyco-bond=s-- nts=3 GGG A.DG26,A.DG27,A.DG28

List of 1 G4-stem
  Note: a G4-stem is defined as a G4-helix with backbone connectivity.
        Bulges are also allowed along each of the four strands.
  stem#1[#1] layers=2 INTRA-molecular loops=3 descriptor=2(D+PX) note=UD3(1+3) UDDD anti-parallel
   1  glyco-bond=s--- groove=w--n mm(<>,outward)  area=12.76 rise=3.47 twist=18.2  nts=4 GGGG A.DG1,A.DG20,A.DG16,A.DG27
   2  glyco-bond=-sss groove=w--n                                                  nts=4 GGGG A.DG2,A.DG19,A.DG15,A.DG26
    strand#1  U DNA glyco-bond=s- nts=2 GG A.DG1,A.DG2
    strand#2  D DNA glyco-bond=-s nts=2 GG A.DG20,A.DG19
    strand#3  D DNA glyco-bond=-s nts=2 GG A.DG16,A.DG15
    strand#4  D DNA glyco-bond=-s nts=2 GG A.DG27,A.DG26
    loop#1 type=diagonal  strands=[#1,#3] nts=12 GAGGCGTGGCCT A.DG3,A.DA4,A.DG5,A.DG6,A.DC7,A.DG8,A.DT9,A.DG10,A.DG11,A.DC12,A.DC13,A.DT14
    loop#2 type=propeller strands=[#3,#2] nts=2 GC A.DG17,A.DC18
    loop#3 type=diag-prop strands=[#2,#4] nts=5 GACTG A.DG21,A.DA22,A.DC23,A.DT24,A.DG25

List of 2 non-stem G4 loops (INCLUDING the two terminal nts)
   1 type=lateral   helix=#1 nts=5 GACTG A.DG21,A.DA22,A.DC23,A.DT24,A.DG25
   2 type=V-shaped  helix=#1 nts=4 GGGG A.DG25,A.DG26,A.DG27,A.DG28

Note that here a new loop type (diag-prop) and a new topology description symbol (X) are introduced. In developing DSSR in general, and G4-related features in particular, I’ve always tried to follow conventions widely used by the community. Where inconsistencies exist, I pick the conventions that are in line with other parts of DSSR. For unique DSSR features lacking outside references, I came up with my own nomenclature. As DSSR becomes more widely used, it may help to standardize G4 nomenclature.
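
The loop lines in the 6H1K excerpt above follow a regular layout (type=..., then strands=... or helix=..., then nts=N, the sequence, and the nucleotide ids), so they are easy to post-process. The following Python sketch pulls them out of the text output. It relies only on the layout visible in the excerpt, which is an assumption about the format rather than a documented guarantee, and the names parse_loop_lines and LOOP_LINE_RE are hypothetical.

```python
import re

# Matches the loop lines shown in the 6H1K excerpt, e.g.
#   loop#2 type=propeller strands=[#3,#2] nts=2 GC A.DG17,A.DC18
#   2 type=V-shaped  helix=#1 nts=4 GGGG A.DG25,A.DG26,A.DG27,A.DG28
LOOP_LINE_RE = re.compile(
    r"type=(?P<type>\S+)\s+(?P<context>\S+)\s+nts=(?P<count>\d+)\s+"
    r"(?P<seq>\S+)\s+(?P<ids>\S+)"
)


def parse_loop_lines(report):
    """Return one dict per loop line found in a DSSR text excerpt."""
    loops = []
    for line in report.splitlines():
        match = LOOP_LINE_RE.search(line)
        if match is None:
            continue  # not a loop line (tetrad, strand, header, ...)
        ids = match.group("ids").split(",")
        # Sanity check: the nts= count should agree with the id list.
        if len(ids) != int(match.group("count")):
            raise ValueError(f"nts count mismatch in line: {line!r}")
        loops.append({
            "type": match.group("type"),
            "context": match.group("context"),  # strands=[...] or helix=#n
            "sequence": match.group("seq"),
            "nts": ids,
        })
    return loops


# Loop lines copied from the 6H1K output above.
EXCERPT = """\
    loop#1 type=diagonal  strands=[#1,#3] nts=12 GAGGCGTGGCCT A.DG3,A.DA4,A.DG5,A.DG6,A.DC7,A.DG8,A.DT9,A.DG10,A.DG11,A.DC12,A.DC13,A.DT14
    loop#2 type=propeller strands=[#3,#2] nts=2 GC A.DG17,A.DC18
    loop#3 type=diag-prop strands=[#2,#4] nts=5 GACTG A.DG21,A.DA22,A.DC23,A.DT24,A.DG25
   1 type=lateral   helix=#1 nts=5 GACTG A.DG21,A.DA22,A.DC23,A.DT24,A.DG25
   2 type=V-shaped  helix=#1 nts=4 GGGG A.DG25,A.DG26,A.DG27,A.DG28
"""

for loop in parse_loop_lines(EXCERPT):
    print(loop["type"], loop["context"], loop["sequence"])
```

Keeping the strands=/helix= field as an opaque context string avoids guessing at details of the output that the excerpt does not show.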





Thank you for printing this article from http://home.x3dna.org/. Please do not forget to visit back for more 3DNA-related information. — Xiang-Jun Lu