X3DNA-DSSR Homepage -- Nucleic Acid Structures

Cover images provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

See the 2020 paper titled "DSSR-enabled innovative schematics of 3D nucleic acid structures with PyMOL" in Nucleic Acids Research and the corresponding Supplemental PDF for details. Many thanks to Drs. Wilma Olson and Cathy Lawson for their help in the preparation of the illustrations.

Details on how to reproduce the cover images are available on the 3DNA Forum.

November 2025

Structure of the human minor spliceosome pre-B complex (PDB id: 8Y7E; Bai R, Yuan M, Zhang P, Luo T, Shi Y, Wan R. 2024. Structural basis of U12-type intron engagement by the fully assembled human minor spliceosome. Science 383: 1245–1252). The protein–RNA assembly reveals the mechanisms of recognition and recruitment of several small nuclear ribonucleoproteins (snRNPs) involved in the splicing of U12-type introns. The pre-mRNA is depicted by a red ribbon, and the U12 small nuclear RNA (snRNA) by a green ribbon, with bases and Watson-Crick base pairs represented as color-coded blocks: A/A-U in red, C/C-G in yellow, G/G-C in green, U/U-A in cyan; the proteins are shown as gold ribbons. Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

October 2025

Human tRNA splicing endonuclease (TSEN) complex bound to pre-tRNAArg (PDB id: 7UXA; Hayne CK, Butay KJ, Stewart ZD, Krahn JM, Perera L, Williams JG, Petrovitch RM, Deterding LJ, Matera AG, Borgnia MJ, Stanley RE. 2023. Structural basis for pre-tRNA recognition and processing by the human tRNA splicing endonuclease complex. Nat Struct Mol Biol 30: 824–833). Cryo-EM structure of the TSEN protein assembly with pre-tRNAArg provides insights into the recognition and splicing of an intron that must be removed from the pre-tRNA before translation. The pre-tRNAArg is depicted by a red ribbon, with bases and Watson-Crick base pairs represented as color-coded blocks: A/A-U in red, C/C-G in yellow, G/G-C in green, U/U-A in cyan; the TSEN subunits are shown as gold ribbons. Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

September 2025

Systemic RNA interference defective protein 1 (SID1) in complex with dsRNA (PDB id: 8XC1; Wang R, Cong Y, Qian D, Yan C, Gong D. 2024. Structural basis for double-stranded RNA recognition by SID1. Nucleic Acids Res 52: 6718–6727). The cryo-EM structure provides a major step towards understanding the mechanism of dsRNA recognition by SID1, involving extensive interactions between basic amino-acid residues and the sugar-phosphate backbone. The dsRNA chains are depicted by red, green, blue, and yellow ribbons, with bases and Watson-Crick base pairs represented as color-coded blocks and minor-groove edges colored white: A/A-U in red, C/C-G in yellow, G/G-C in green, U/U-A in cyan; SID1 is shown by a gold ribbon. Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

August 2025

Complex of arginyl-tRNA-protein transferase 1 (ATE1) with tRNA^Arg and a short peptide substrate (PDB id: 8UAU; Lan X, Huang W, Kim SB, Fu D, Abeywansha T, Lou J, Balamurugan U, Kwon YT, Ji CH, Taylor DJ, Zhang Y. 2024. Oligomerization and a distinct tRNA-binding loop are important regulators of human arginyl-transferase function. Nat Commun 15: 6350). The ATE1 homodimer dissociates upon binding the peptide and forms a loop that wraps around tRNA^Arg. The tRNA^Arg is depicted by a red ribbon, with bases and Watson–Crick base pairs represented as color-coded blocks: A/A-U in red, C/C-G in yellow, G/G-C in green, U/U-A in cyan; ATE1 is shown by a gold ribbon and the peptide by a white ribbon. Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

July 2025

Structure of endoribonuclease P (RNase P) in complex with pre-tRNA^His-Ser (PDB id: 8CBK; Meynier V, Hardwick SW, Catala M, Roske JJ, Oerum S, Chirgadze DY, Barraud P, Yue WW, Luisi BF, Tisné C. 2024. Structural basis for human mitochondrial tRNA maturation. Nat Commun 15: 4683). The structure reveals the first step of human mitochondrial tRNA maturation by RNase P, processing the 5′-leader of pre-tRNA. The RNA is depicted by a red ribbon, with bases and Watson-Crick base pairs represented as color-coded blocks: A/A-U in red, C/C-G in yellow, G/G-C in green, U/U-A in cyan; the protein assembly is shown by the gold ribbons. Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

June 2025

Structure of a group II intron ribonucleoprotein in the pre-ligation state (PDB id: 8T2R; Xu L, Liu T, Chung K, Pyle AM. 2023. Structural insights into intron catalysis and dynamics during splicing. Nature 624: 682–688). The pre-ligation complex of the Agathobacter rectalis group II intron reverse transcriptase/maturase with intron and 5′-exon RNAs makes it possible to construct a picture of the splicing active site. The intron is depicted by a green ribbon, with bases and Watson-Crick base pairs represented as color-coded blocks: A/A-U in red, C/C-G in yellow, G/G-C in green, U/U-A in cyan; the 5′-exon is shown by white spheres and the protein by a gold ribbon. Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

May 2025

Complex of terminal uridylyltransferase 7 (TUT7) with pre-miRNA and Lin28A (PDB id: 8OPT; Yi G, Ye M, Carrique L, El-Sagheer A, Brown T, Norbury CJ, Zhang P, Gilbert RJ. 2024. Structural basis for activity switching in polymerases determining the fate of let-7 pre-miRNAs. Nat Struct Mol Biol 31: 1426–1438). The RNA-binding pluripotency factor LIN28A invades and melts the RNA and affects the mechanism of action of the TUT7 enzyme. The RNA backbone is depicted by a red ribbon, with bases and Watson-Crick base pairs represented as color-coded blocks: A/A-U in red, C/C-G in yellow, G/G-C in green, U/U-A in cyan; TUT7 is represented by a gold ribbon and LIN28A by a white ribbon. Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

April 2025

Cryo-EM structure of the pre-B complex (PDB id: 8QP8; Zhang Z, Kumar V, Dybkov O, Will CL, Zhong J, Ludwig SE, Urlaub H, Kastner B, Stark H, Lührmann R. 2024. Structural insights into the cross-exon to cross-intron spliceosome switch. Nature 630: 1012–1019). The pre-B complex is thought to be critical in the regulation of splicing reactions. Its structure suggests how the cross-exon and cross-intron spliceosome assembly pathways converge. The U4, U5, and U6 snRNA backbones are depicted respectively by blue, green, and red ribbons, with bases and Watson-Crick base pairs shown as color-coded blocks: A/A-U in red, C/C-G in yellow, G/G-C in green, U/U-A in cyan; the proteins are represented by gold ribbons. Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

February 2025

Structure of the Hendra henipavirus (HeV) nucleoprotein (N) protein-RNA double-ring assembly (PDB id: 8C4H; Passchier TC, White JB, Maskell DP, Byrne MJ, Ranson NA, Edwards TA, Barr JN. 2024. The cryoEM structure of the Hendra henipavirus nucleoprotein reveals insights into paramyxoviral nucleocapsid architectures. Sci Rep 14: 14099). The HeV N protein adopts a bi-lobed fold, where the N- and C-terminal globular domains are bisected by an RNA binding cleft. Neighboring N proteins assemble laterally and completely encapsidate the viral genomic and antigenomic RNAs. The two RNAs are depicted by green and red ribbons. The U bases of the poly(U) model are shown as cyan blocks. Proteins are represented as semitransparent gold ribbons. Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

January 2025

Structure of the helicase and C-terminal domains of Dicer-related helicase-1 (DRH-1) bound to dsRNA (PDB id: 8T5S; Consalvo CD, Aderounmu AM, Donelick HM, Aruscavage PJ, Eckert DM, Shen PS, Bass BL. 2024. Caenorhabditis elegans Dicer acts with the RIG-I-like helicase DRH-1 and RDE-4 to cleave dsRNA. eLife 13: RP93979. Cryo-EM structures of Dicer-1 in complex with DRH-1, RNAi deficient-4 (RDE-4), and dsRNA provide mechanistic insights into how these three proteins cooperate in antiviral defense. The dsRNA backbone is depicted by green and red ribbons. The U-A pairs of the poly(A)·poly(U) model are shown as long rectangular cyan blocks, with minor-groove edges colored white. The ADP ligand is represented by a red block and the protein by a gold ribbon. Cover image provided by X3DNA-DSSR, an NIGMS National Resource for structural bioinformatics of nucleic acids (R24GM153869; skmatics.x3dna.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

Moreover, the following 30 [12(2021) + 12(2022) + 6(2023)] cover images of the RNA Journal were generated by the NAKB (nakb.org).

Cover image provided by the Nucleic Acid Database (NDB)/Nucleic Acid Knowledgebase (NAKB; nakb.org). Image generated using DSSR and PyMOL (Lu XJ. 2020. Nucleic Acids Res 48: e74).

3DNA/blocview-PyMOL images in covers of the RNA journal

I recently performed a quick survey of the cover images of the RNA journal in 2019. I was pleased to find that 9 out of the 12 cover images were provided by the Nucleic Acid Database where 3DNA/blockview and PyMOL were employed, as noted below:

The RNA backbone is displayed as a red ribbon; bases are shown as blocks with NDB coloring: A—red, C—yellow, G—green, U—cyan; geneticin ligands are shown in spacefill with element colors: C—white, N—blue, O—red. The image was generated using 3DNA/blocview and PyMol software.

Details of the 9 cover images are listed below:

January 2019 Rhodobacter sphaeroides Argonaute with guide RNA/target DNA duplex containing noncanonical A-G pair (PDB code: 6d9k)
April 2019 Group I self-splicing intron P4-P6 domain mutant U131A (PDB code: 6d8l)
May 2019 Crystal structure of T. thermophilus 50S ribosomal protein L1 in complex with helices H76, H77, and H78 of 23S RNA (PDB code: 5npm)
June 2019 Crystal structure of ykoY-mntP riboswitch chimera bound to cadmium (PDB code: 6cc3)
July 2019 G96A mutant of the PRPP riboswitch from T. mathranii bound to ppGpp (PDB code: 6ck4)
August 2019 Crystal structure of the metY SAM V riboswitch (PDB code: 6fz0)
October 2019 Crystal structure of protease factor Xa bound to RNA aptamer 11F7t and rivaroxaban (PDB code: 5vof)
November 2019 Drosophila melanogaster nucleosome remodeling complex (PDB code: 6f4g)
December 2019 Crystal structure of the Homo Sapiens cytoplasmic ribosomal decoding site in complex with Geneticin (PDB code: 5xz1)

Here is the composite figure of the 9 cover images.

Web API to 3DNA

I’ve created a web API to DSSR and SNAP, and fiber models. The overall help message is available via http://api.x3dna.org. Individually, each program is accessed as below.

Help message on `x3dna-dssr` (DSSR): http://api.x3dna.org/dssr/help

Usage with 'http' (HTTPie):
    http -f http://api.x3dna.org/dssr [options] url=|model@
    http http://api.x3dna.org/dssr/help   -- display this help message

Options:
    json=true-or-FALSE(default)    [e.g., json=true # JSON output]
    pair=true-or-FALSE(default)    [e.g., pair=1    # base-pair only]
    hbond=true-or-FALSE(default)   [e.g., hbond=t   # H-bonding info]
    more=true-or-FALSE(default)    [e.g., more=y    # further details]

Required parameter:
    url=URL-to-coordinate-file [e.g., url=https://files.rcsb.org/download/1ehz.pdb.gz]
    model@coordinate-file      [e.g., model@1ehz.cif]
    # Only one must be specified. 'url' precedes 'model' when both are specified.
    # The coordinate file must be in PDB or PDBx/mmCIF format, optionally gzipped.

Examples:
    http -f http://api.x3dna.org/dssr url=https://files.rcsb.org/download/1ehz.pdb.gz
    http -f http://api.x3dna.org/dssr model@1ehz.cif pair=1
    # with 'curl'
    curl http://api.x3dna.org/dssr -F 'url=https://files.rcsb.org/download/1ehz.pdb.gz'
    curl http://api.x3dna.org/dssr -F 'model=@1msy.pdb' -F 'pair=1'

Note:
    The web API has an upper limit on coordinate file size (gzipped): < 6 MB

Help message on `x3dna-snap` (SNAP): http://api.x3dna.org/snap/help

Usage with 'http' (HTTPie):
    http -f http://api.x3dna.org/snap [options] url=|model@
    http http://api.x3dna.org/snap/help   -- display this help message

Options:
    json=true-or-FALSE(default)    [e.g., json=true # JSON output]
    hbond=true-or-FALSE(default)   [e.g., hbond=t   # H-bonding info]

Required parameter:
    url=URL-to-coordinate-file [e.g., url=https://files.rcsb.org/download/1oct.pdb.gz]
    model@coordinate-file      [e.g., model@1oct.cif]
    # Only one must be specified. 'url' precedes 'model' when both are specified.
    # The coordinate file must be in PDB or PDBx/mmCIF format, optionally gzipped.

Examples:
    http -f http://api.x3dna.org/snap url=https://files.rcsb.org/download/1oct.pdb.gz
    http -f http://api.x3dna.org/snap model@1oct.cif json=1
    # with 'curl'
    curl http://api.x3dna.org/snap -F 'url=https://files.rcsb.org/download/1oct.pdb.gz'
    curl http://api.x3dna.org/snap -F 'model=@1oct.cif' -F 'json=1'

Note:
    The web API has an upper limit on coordinate file size (gzipped): < 6 MB

Help message on 56 fiber models: http://api.x3dna.org/fiber/help

Usage with 'http' (HTTPie):
    http http://api.x3dna.org/fiber/help    # display this help message
    http http://api.x3dna.org/fiber/list    # show a list of available fiber models (56 in total)
    http http://api.x3dna.org/fiber/str_id  # build model 'str_id' in the range of [1, 56]
    http http://api.x3dna.org/fiber/name    # generate a model with common names as shown below:
              A-DNA, B-dna, C_DNA, D-DNA, ZDNA, RNA, RNAduplex, PaulingTriplex, G4
              Case does not matter, and the separator can be '-' or '_' or omitted.
              So a-dna, A-dNA, a_DNA, or ADNA is valid for building an A-DNA model.

Options (via query strings, or form fields):
    seq=base-sequence # A, C, G, T, U for generic model
    repeat=number     # number of repeats of the sequence
    cif=1             # output file in mmCIF format

Examples with 'http' (HTTPie):
    http http://api.x3dna.org/fiber/1       # model no. 1 (i.e., calf thymus A-DNA model)
    http -f http://api.x3dna.org/fiber/1 seq=A3TTT repeat=2  # specific sequence, repeated twice
    http http://api.x3dna.org/fiber/rna     # single-stranded RNA model
    http http://api.x3dna.org/fiber/rna-ds  # double-stranded RNA model
    http http://api.x3dna.org/fiber/pauling # the triplex model of Pauling & Corey
    http http://api.x3dna.org/fiber/g4      # G-quadruplex model
    # with 'curl'
    curl http://api.x3dna.org/fiber/1
    curl http://api.x3dna.org/fiber/1 -d 'seq=A3TTT' -d 'repeat=2'
    curl http://api.x3dna.org/fiber/rna
    curl http://api.x3dna.org/fiber/rna-ds
    curl http://api.x3dna.org/fiber/pauling
    curl http://api.x3dna.org/fiber/g4

Note:
    The web API has two upper limits: repeats < 1,000, and nucleotides < 10,000.

Comment

DSSR-enhanced visualization of nucleic acid structures in PyMOL

The skmatic.x3dna.org website (see screenshot below) aims to showcase DSSR-enabled cartoon-block schematics of nucleic acid structures using PyMOL. It presents a simple interface to browse pre-calculated PDB entries with a set of default settings: long rectangular blocks for Watson-Crick base-pairs, square blocks for G-tetrads in G-quadruplexes, with minor-groove edges in black. Users can also specify an URL to a PDB- or mmCIF-formatted file or upload such an atomic coordinates file directly, and set several common options to customerize to the rendered image.

Moreover, a web API to DSSR-PyMOL schematics is available to allow for its easy integration into third-party tools.

Input a PDB id

Pre-calculated cartoon-block images together with summary information are available for PDB entries of nucleic-acid-containing structures. Note that gigantic structures like ribosomes that are only represented in mmCIF format are excluded from the resource. The base block images are most effective for small to medium-sized structures.

Here are a few examples:

1ehz, the crystal structure of yeast phenylalanine tRNA at 1.93-Å resolution
2lx1, the major conformation of the internal loop 5’GAGU/3’UGAG
2grb”, the crystal structure of an RNA quadruplex containing inosine-tetrad
4da3, the crystal structure of an intramolecular human telomeric DNA G-quadruplex 21-mer bound by the naphthalene diimide compound MM41
1oct, crystal structure of the Oct-1 POU domain bound to an octamer site
2hoj, the crystal structure of an E. coli thi-box riboswitch bound to thiamine pyrophosphate, manganese ions

Each entry is shown with images in six orthogonal perspectives: front, back, right, left, top, bottom. The ‘front’ image (upper-left in the panel) is oriented into the most-extended view with the DSSR --blocview option. The corresponding PyMOL session file and PDB coordinate file are available for download. One can also visualize the structure interactively via 3Dmol.js.

Sample PDB entries

Users can browse random samples of pre-calculated PDB entries. The number should be between 3 and 99, with a default of 12 entries (see below for an example). Simply click the ‘Submit’ button or the “Random samples (3 to 99)”: http://skmatic.x3dna.org/pdb_entry link to see results of randomly picked 12 PDB entries each time.

Specify a coordinate file

The atomic coordinate file must be in PDB or mmCIF format, and can be optionally gzipped (.gz). One can either specify an URL to or select a coordinate file. Several common options are available to allow for user customizations.

Web API help message

Usage with 'http' (HTTPie):
    http -f http://skmatic.x3dna.org/api [options] url=|model@
    http http://skmatic.x3dna.org/api/pdb/pdb_id  -- for a pre-calculated PDB entry
    http http://skmatic.x3dna.org/api/help        -- display this help message
Options:
    block_file=styles-in-free-text-format [e.g., block_file=wc-minor]
    block_color=nt-selection-and-color    [e.g., block_color='A:pink']
    block_depth=thickness-of-base-block   [e.g., block_depth=1.2]
    r3d_file=true-or-FALSE(default)       [e.g., r3d_file=true]
    raw_xyz=true-or-FALSE(default)        [e.g., raw_xyz=true]
Required parameter
    url=URL-to-coordinate-file [e.g., url=https://files.rcsb.org/download/1ehz.pdb.gz]
    model@coordinate-file      [e.g., model@1ehz.cif]
    # Only one must be specified. 'url' precedes 'model' when both are specified.
    # The coordinate file must be in PDB or PDBx/mmCIF format, optionally gzipped.
Examples
    http -f http://skmatic.x3dna.org/api block_file='wc-minor' model@1ehz.cif r3d_file=t
    http -f http://skmatic.x3dna.org/api url=https://files.rcsb.org/download/1ehz.pdb.gz -d -o 1ehz.png
    http http://skmatic.x3dna.org/api/pdb/1ehz -d -o 1ehz.png
    # with 'curl'
    curl http://skmatic.x3dna.org/api -F 'model=@1msy.pdb' -F 'block_file=wc-minor' -F 'r3d_file=1'
    curl http://skmatic.x3dna.org/api -F 'url=https://files.rcsb.org/download/1ehz.pdb.gz' -o 1ehz.png
    curl http://skmatic.x3dna.org/api/pdb/1ehz -o 1ehz.png

Sample images

Comment

SNAP and DSSR in DNAproDB

While reading DNAproDB: an expanded database and web-based tool for structural analysis of DNA–protein complexes, I noticed SNAP and DSSR being mentioned. The detailed citations are as below:

Information about individual nucleotide–residue interactions is also provided, such as hydrogen bonding, interaction geometry (based on SNAP (10)), buried solvent accessible surface areas and identification of the interacting residue and nucleotide moieties …

DNAproDB assigns a geometry for every nucleotide–residue interaction identified using SNAP, a component of the 3DNA program suite (10). The residues for which probabilities are shown are those with planar side chains so that a stacking conformation can be defined.

Base pairing and base stacking between nucleotides are identified using the program DSSR (20).

SNAP and DSSR are two (relatively) new programs in the 3DNA software suite. As the author, I am always glad to see them being cited explicitly in literature. The fact that SNAP and DSSR are cited together by DNAproDB, however, is especially significant. I am aware of the initial version of DNAproDB, but I definitely like the updated one better. This is what I recently wrote in response to a question on the 3DNA Forum:

Regarding DNA-protein interactions in general, you may want to have a look of DNAproDB from the Remo Rohs laboratory. A new paper has just been published in NAR, ‘DNAproDB: an expanded database and web-based tool for structural analysis of DNA–protein complexes’.

I’ve no doubt that SNAP and DSSR would be widely used in applications related to DNA/RNA structural bioinformatics. DSSR (to a lesser extent, SNAP) represents my view of what a scientific software tool should be.

Comment

ONZ classification of G-tetrads

Recently I read the article Topology-based classification of tetrads and quadruplex structures in Bioinformatics by Popenda et al. In this work, the authors proposed an ONZ classification scheme of G-tetrads in intramolecular G-quadruplexes (G4) as shown below (Fig. 2 in the publication):

I am glad to find that DSSR has been used as a component in their computational tool ElTetrado to automatically identify and classify tetrads and quadruplexes.

Structures from both sets were analysed using self-implemented programs along with DSSR software from the 3DNA suite (Lu et al. (2015)). From DSSR, we acquired the information about base pairs and stacking.

I like the ONZ classification scheme: it is simple in concept yet provides a new perspective for the topologies of G-tetrads in intramolecular G4 structures. So I implemented the idea in DSSR v1.9.8-2019oct16, with this feature available via the --g4-onz option. Note that ElTetrado, according to the authors, is applicable to ONZ classifications of general types of tetrads and quadruplexes. The DSSR implementation of ONZ classifications, on the other hand, is strictly limited to G-tetrads in intramolecular G4 structures.

The DSSR ONZ classification results match the ones reported in Figs. 1, 5, and 6 of the Popenda et al. paper. For example, for PDB entry 6H1K (Fig. 6), the relevant results with the --g4-onz option and without it are listed below:

# x3dna-dssr -i=6h1k.pdb --g4-onz
List of 3 G-tetrads
   1 glyco-bond=s--- groove=w--n planarity=0.149 type=planar Z- nts=4 GGGG A.DG1,A.DG20,A.DG16,A.DG27
   2 glyco-bond=-sss groove=w--n planarity=0.136 type=planar Z+ nts=4 GGGG A.DG2,A.DG19,A.DG15,A.DG26
   3 glyco-bond=--s- groove=-wn- planarity=0.307 type=other  O+ nts=4 GGGG A.DG17,A.DG21,A.DG25,A.DG28
# ---------------------------------------
# x3dna-dssr -i=6h1k.pdb 
#   without option --g4-onz
List of 3 G-tetrads
   1 glyco-bond=s--- groove=w--n planarity=0.149 type=planar nts=4 GGGG A.DG1,A.DG20,A.DG16,A.DG27
   2 glyco-bond=-sss groove=w--n planarity=0.136 type=planar nts=4 GGGG A.DG2,A.DG19,A.DG15,A.DG26
   3 glyco-bond=--s- groove=-wn- planarity=0.307 type=other  nts=4 GGGG A.DG17,A.DG21,A.DG25,A.DG28

With the --json option, the ONZ classification results are always available. An example is shown below for PDB entry 6H1K (Fig. 6):

# x3dna-dssr -i=6h1k.pdb --json | jq -c '.G4tetrads[] | [.nts_long, .topo_class]'
["A.DG1,A.DG20,A.DG16,A.DG27","Z-"]
["A.DG2,A.DG19,A.DG15,A.DG26","Z+"]
["A.DG17,A.DG21,A.DG25,A.DG28","O+"]

Comment

H-bonds reported by DSSR and SNAP

I recently read a short communication by Pavel Afonine, titled phenix.hbond: a new tool for annotation hydrogen bonds in the July 2019 issue of the Computational Crystallography Newsletter (CCN). It appears that every bioinformatics tool (e.g., PyMOL or Jmol) has its own implementation of an algorithm on calculating H-bonds, one of the fundamental stabilizing forces of proteins and DNA/RNA structures. So does 3DNA/DSSR, as noted in my 2014-04-11 blogpost Get hydrogen bonds with DSSR.

Both DSSR and SNAP have the --get-hbond option, and they use the same underlying algorithm. However, the default output from the two programs differs: DSSR reports the H-bonds within nucleic acids, and SNAP covers only those at the DNA/RNA-protein interface. Using the PDB entry 1oct as an example, Running DSSR on it with the --get-hbond option gives 33 H-bonds in the DNA duplex, while SNAP reports 38 H-bonds at the DNA-protein interface. By design, the default output caters for the most-common use case of each program.

Under the scene, however, there exist variations in the seemingly simple --get-hbond option. One can attach text ‘nucleic’ (or ‘nuc’, ‘nt’), as in --get-hbond-nucleic, to output H-bonds within nucleic acids. Similarly, --get-hbond-protein (or ‘amino’, ‘aa’) would output H-bonds within proteins. Not surprisingly, the --get-hbond-nt-aa option would list H-bonds in nucleic acids and proteins, including those at their interface. These variations apply to both DSSR and SNAP, even though some are redundant with the default.

Notably, in combination with --json, the --get-hbond option by default would output all H-bonds, as if --get-hbond-nt-aa has been set. For PDB entry 1oct, DSSR or SNAP would report 208 H-bonds. Moreover, the JSON output has a residue_pair field for each identified H-bond, with values like "nt:nt", "nt:aa", or "aa:aa". Using 1oct as an example,

# x3dna-dssr -i=1oct.pdb --get-hbond --json | jq '.hbonds[0]'
{
  "index": 1,
  "atom1_serNum": 34,
  "atom2_serNum": 608,
  "donAcc_type": "standard",
  "distance": 3.304,
  "atom1_id": "O6@A.DG202",
  "atom2_id": "N4@B.DC230",
  "atom_pair": "O:N",
  "residue_pair": "nt:nt"
}
# x3dna-dssr -i=1oct.pdb --get-hbond --json | jq '.hbonds[60]'
{
  "index": 61,
  "atom1_serNum": 462,
  "atom2_serNum": 1187,
  "donAcc_type": "standard",
  "distance": 3.692,
  "atom1_id": "O2@B.DT223",
  "atom2_id": "NH2@C.ARG102",
  "atom_pair": "O:N",
  "residue_pair": "nt:aa"
}
# x3dna-dssr -i=1oct.pdb --get-hbond --json | jq '.hbonds[100]'
{
  "index": 101,
  "atom1_serNum": 791,
  "atom2_serNum": 818,
  "donAcc_type": "standard",
  "distance": 2.871,
  "atom1_id": "N@C.THR26",
  "atom2_id": "OD2@C.ASP29",
  "atom_pair": "N:O",
  "residue_pair": "aa:aa"
}

In the above three cases, using SNAP instead of DSSR would give the same results.

Also, one can take advantage of the residue_pair value to filter H-bonds by type. For example, the following command would extract only H-bonds at the DNA-protein interface (38 occurrences, same as the number noted above):

x3dna-snap -i=1oct.pdb --get-hbond --json | jq '.hbonds[] | select(.residue_pair=="nt:aa")'

Back to the phenix.hbond tool, the author noted that:

Running phenix.hbond requires atomic model in PDB or mmCIF format with all hydrogen atoms added, as well as ligand restraint files if the model contains unknown to the library items.

While there is no particular reason why this should not work for all bio-macromolecules, currently phenix.hbond is only optimized and tested to work with proteins, which is the limitation that will be removed in future.

In contrast, the H-bond identification algorithm in DSSR/SNAP does not require hydrogen atoms. In fact, hydrogen atoms are simply ignored if they exist. As shown above, the H-bond method as implemented in DSSR/SNAP works for DNA, RNA, protein, or their complexes. This does not necessarily mean that the 3DNA way is superior to other similar tools. It just works well in my hand, and it may serve as a pragmatic choice for other users.

Comment

DSSR is used in RNAMake and 3dRNA 2.0

Recently I noticed two new citations to DSSR, an integrated software tool for dissecting the spatial structure of RNA. One is from the Yesselman et al. article Computational design of three-dimensional RNA structure and function in Nature Nanotechnology, and the other is from the Wang et al. article 3dRNA v2.0: An Updated Web Server for RNA 3D Structure Prediction in International Journal of Molecular Sciences.

Yesselman et al. has used DSSR in RNAMake for building the motif library. The relevant section is as follows:

We processed each RNA structure to extract every motif with Dissecting the Spatial Structure of RNA (DSSR)54 with the following command:

x3dna-dssr –i file.pdb –o file_dssr.out

We manually checked each extracted motif to confirm that it was the correct type, as DSSR sometimes classifies tertiary contacts as higher-order junctions and vice versa. For each motif collected from DSSR, we ran the X3DNA find_pair and analyze programs to determine the reference frame for the first and last base pair of each motif to allow for the alignment between motifs:

find_pair file.pdb 2> /dev/null stdout | analyze stdin >& /dev/null

It is worth noting the sentence that “DSSR sometimes classifies tertiary contacts as higher-order junctions and vice versa.” Presumably. the authors are referring to the inclusion of ‘isolated canonical pairs’ in junctions by default in DSSR. Overall, the default DSSR settings follow the most common practice in RNA literature. In the meantime, I am aware that the community may not agree on every detail. Thus DSSR provide many options (documented or otherwise) to cater for other potential use cases. See the Stems of junction structure have only one base pair and Junction definition threads on the 3DNA Forum for two examples. In the long run, DSSR is likely to help consolidate RNA nomenclature that can be applied in a pragmatic, consistent manner.

Note also that DSSR provides the reference frame of each identified base pair via the JSON option. Using 1ehz as an example, the following command provides detailed information about base pairs:

x3dna-dssr -i=1ehz.pdb --json --more | jq .pairs

In the 3dRNA 2.0 paper, DSSR is cited as below. This is the first time DSSR is integrated in the 3dRNA pipeline.

The predicted structures are built from the sequence and secondary structure, while the former is obtained from their native structures fetched from PDB (https://www.rcsb.org/), and the latter is calculated from DSSR (Dissecting the Spatial Structure of RNA) [39].

Comment

5CM and 5MC, two forms of 5-methylcytosine in the PDB

In the PDB, the ligand identifiers 5MC and 5CM all refer to 5-methylcytosine, but differ in the sugar moieties the base is attached to. Chemically, 5CM is 5-methyl-2’-deoxycytidine-5’-monophosphate as in DNA, and 5MC is 5-methylcytidine-5’-monophosphate. See the molecular images shown below.

The 5-methyl group is named C5A in 5CM and CM5 in 5MC, respectively, for non-obvious reasons other than conventions. For comparison, the methyl-group in thymine of DNA is named C7, as for example in PDB id 355d. It is worth noting that DSSR is able to handle all such variations in atom or residue names.

Comment

R wrapper to DSSR in VeriNA3d

I recently came across a Bioinformatics article VeriNA3d: an R package for nucleic acids data mining by Gallego et al. from IRB Barcelona. VeriNA3d can perform dataset analysis, single-structure analysis, and exploratory data analyses, with an emphasis on complex RNA structures. I am glad to see the DSSR is one of the third-party utilities that have been integrated into VeriNA3d, as shown below

VeriNA3d offers integration with third-party utilities such as the non-redundant lists of RNA structures (Leontis and Zirbel, 2012), the eRMSD suggested to compare RNA structures (Bottaro et al., 2014), a wrapper to the DSSR (Dissecting the Spatial Structure of RNA) software (Lu et al., 2015) and query functions to access the PDBe REST API (Velankar et al., 2016).

I browsed the GitLab repository and read through the supplemental documents. Clearly, VeriNA3d is a handy tool for the R community to perform RNA 3D structural analyses.

To DSSR users, Section “9 The dssr wrapper: getting the base pairs” of the supplemental PDF “VeriNA3d: introduction and use cases” is particularly relevant. The three paragraphs (with minor edits) are excerpted below:

The DSSR software (Dissecting the Spatial Structure of RNA) (Lu, Bussemaker, and Olson 2015) represents an invaluable resource to handle RNA structures. Some of the functions of veriNA3d overlap with the functionalities of DSSR, and both applications provide unique different features. We implement a wrapper to execute DSSR directly from R and get the best of both worlds in one place.

Note that installing veriNA3d does not automatically install DSSR, since we don’t redistribute third-party software. Before any user can use our wrapper, the dssr function, DSSR should be installed separately. To address this installation we redirect you to the DSSR manual, where anyone can find the specific instructions for their system. Once DSSR is installed and working in your computer, you will also be able to use it with our wrapper. If the DSSR executable (named x3dna-dssr) is in your path, dssr will find it automatically. If the wrapper does not find it, you can still use it specifying the absolute path to the executable with the argument exefile. Find more information running ?dssr.

One of the DSSR capabilities that users might be interested in is the detection and classification of base pairs. The following code shows a simple example. The output of the dssr wrapper is an object got from the json DSSR output. From R, json objects are parsed in the form of a tree of lists, with different types of information. Most of the interesting data is under the list models, sublist parameters, as shown herein.

I echo the authors’ policy of not redistributing third-party software with VeriNA3d. DSSR is under active development. Users should always visit the 3DNA Forum for downloading the latest version of DSSR, reporting bugs, and asking questions.

The R interface to DSSR (via JSON output) in VeriNA3d represents one of the intended use cases of DSSR’s many possible applications. No doubt DSSR is being increasingly integrated into other resources of RNA structural bioinformatics. Hopefully, more advanced DSSR features (than the detection and classification of base pairs) will also be widely appreciated in the future. Users would love DSSR better when they gain more experience in structural bioinformatics.

Comment

« Older · Newer »

Thank you for printing this article from http://home.x3dna.org/. Please do not forget to visit back for more 3DNA-related information. — Xiang-Jun Lu

X3DNA-DSSR: a resource for structural bioinformatics of nucleic acids(An NIGMS National Resource supported by NIH grant R24GM153869)

Help message on x3dna-dssr (DSSR): http://api.x3dna.org/dssr/help

Help message on x3dna-snap (SNAP): http://api.x3dna.org/snap/help

Help message on 56 fiber models: http://api.x3dna.org/fiber/help

Input a PDB id

Sample PDB entries

Specify a coordinate file

Web API help message

Sample images

X3DNA-DSSR: a resource for structural bioinformatics of nucleic acids
(An NIGMS National Resource supported by NIH grant R24GM153869)

Help message on `x3dna-dssr` (DSSR): http://api.x3dna.org/dssr/help

Help message on `x3dna-snap` (SNAP): http://api.x3dna.org/snap/help