Perl scripts are obsolete but still available

As of v2.1, I’ve switched from Perl to Ruby as the scripting language for 3DNA. Consequently, the Perl scripts in previous versions of 3DNA (v1.5 and v2.0) are now obsolete. I’ll only correct bugs in existing Perl scripts, but will not add any new features.

For reference, the scripts are still available in a separate directory, $X3DNA/perl_scripts, with the following contents:

OP_Mxyz*          dcmnfile*         nmr_strs*
README            del_ms*           pdb_frag*
block_atom*       expand_ids*       x3dna2charmm_pdb*
blocview.pl*      manalyze*         x3dna_r3d2png*
bp_mutation*      mstack2img*       x3dna_setup.pl*
cp_std*           nmr_ensemble*     x3dna_utils.pm

Among them, x3dna_setup.pl and blocview.pl have corresponding Ruby versions: x3dna_setup and blocview. Actually, the .pl file extension (for Perl) was added to avoid confusion with the new Ruby scripts.

Some of this functionality has been incorporated into the Ruby script x3dna_utils:

------------------------------------------------------------------------
A miscellaneous collection of 3DNA utilities
    Usage: x3dna_utils [-h|-v] sub-command [-h] [options]
    where sub-command must be one of: 
        block_atom -- generate a base block schematic representation
        cp_std -- select standard PDB datasets for analyze/rebuild
        dcmnfile -- remove fixed-name files generated with 3DNA
        x3dna_r3d2png -- convert .r3d to image with Raster3D or PyMOL
------------------------------------------------------------------------
  --version, -v:   Print version and exit
     --help, -h:   Show this message

Along the same lines, ensemble-related functionality (for NMR or molecular dynamics simulations) has been consolidated and extended in the new Ruby script x3dna_ensemble:

------------------------------------------------------------------------
Utilities for the analysis and visualization of an ensemble
    Usage: x3dna_ensemble [-h|-v] sub-command [-h] [options]
    where sub-command must be one of: 
        analyze -- analyze MODEL/ENDMDL delineated ensemble (NMR or MD)
        block_image -- generate a base block schematic image
        extract -- extract structural parameters after running 'analyze'
        reorient -- reorient models to a particular frame/orientation
------------------------------------------------------------------------
  --version, -v:   Print version and exit
     --help, -h:   Show this message

Conceivably, C programs in 3DNA can also be consolidated. For backward compatibility, however, all existing C programs will be kept, and refined as necessary, in the current 3DNA v2.x series. Starting with v3.x, I’ll completely re-organize 3DNA, incorporating my years of experience in programming languages and knowledge of macromolecular structures.

---

Specification of base pairs in 3DNA

In 3DNA, each base pair (bp) is specified by the identity of its two comprising nucleotides (nts), and their interactions. Some examples are shown below based on the PDB entry 1ehz (the crystal structure of yeast phenylalanine tRNA at 1.93 Å resolution), with the shorthand form on the right:

....>A:...1_:[..G]G-----C[..C]:..72_:A<....  G-C
....>A:...4_:[..G]G-*---U[..U]:..69_:A<....  G-U
....>A:...9_:[..A]A-**+-A[..A]:..23_:A<....  A+A
....>A:..15_:[..G]G-**+-C[..C]:..48_:A<....  G+C
....>A:..26_:[M2G]g-**--A[..A]:..44_:A<....  g-A

Specification of a nucleotide

The nt specification string consists of 6 fields and follows the pattern below, with the number of characters in each field inside the parentheses:

modelNum(4)>chainId(1):ntNum(4)insCode(1):[ntName(3)]baseName(1)

  1. modelNum(4) — the model number is up to 4 digits, right-justified, with each leading space replaced by a dot. If no model number is available, as is the case for 1ehz (and virtually all other x-ray crystal structures in the PDB), it is written as .... (4 dots).
  2. chainId(1) — the chain id is 1-char long, with space replaced by underscore.
  3. ntNum(4) — the nt residue number, handled as for the model number.
  4. insCode(1) — insertion code, handled as for the chain id.
  5. ntName(3) — the nt residue name is up to 3-char long, right-justified, with each leading space replaced by a dot.
  6. baseName(1) — the base name is 1-char long, mapped from ntName(3) following $X3DNA/config/baselist.dat. Note that modified nucleotides are put in lower case to distinguish them from the canonical ones — for example, M2G to g.

For the complementary base in a bp, the order of the 6 fields is reversed — see examples above. To see the full list of nts in a PDB data file, run: find_pair -s 1ehz.pdb stdout (here using 1ehz as an example).
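
As an illustration of the six-field layout, here is a minimal Python sketch (not part of 3DNA) that assembles a nt specification string. The function name nt_spec and its parameters are hypothetical. The base name is passed in directly rather than looked up in baselist.dat, and the mirrored layout used for the complementary base is inferred from the 1ehz examples above.

```python
def _pad(value, width):
    """Right-justify value in `width` characters, with leading spaces shown as dots."""
    text = str(value).strip()
    if len(text) > width:
        raise ValueError(f"{text!r} does not fit in {width} characters")
    return text.rjust(width, ".")


def nt_spec(nt_name, base_name, nt_num, chain_id=" ", ins_code=" ",
            model_num=None, complementary=False):
    """Build a nt specification string (hypothetical helper, not 3DNA code)."""
    model = _pad("" if model_num is None else model_num, 4)  # '....' if no model
    chain = "_" if chain_id == " " else chain_id             # space -> underscore
    ins = "_" if ins_code == " " else ins_code               # space -> underscore
    num = _pad(nt_num, 4)
    name = _pad(nt_name, 3)
    if complementary:
        # Mirrored layout, as seen on the right-hand side of the examples above.
        return f"{base_name}[{name}]:{num}{ins}:{chain}<{model}"
    return f"{model}>{chain}:{num}{ins}:[{name}]{base_name}"


print(nt_spec("G", "G", 1, "A"))
print(nt_spec("C", "C", 72, "A", complementary=True))
```

For instance, nt_spec("M2G", "g", 26, "A") gives the left-hand side of the g-A example listed above.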

Specification of a base pair

The pattern of a bp is M-xyz-N, where M and N are 1-char base names (as in aforesaid field #6), and the three characters xyz have the following meaning:

  • z — the sign of the dot product of the z-axes of the M and N base reference frames. It is positive (+) if the two z-axes point in similar directions, as in Hoogsteen or reverse Watson-Crick bps. Conversely, it is negative (-) when the two z-axes point in opposite directions, as in the canonical Watson-Crick and Wobble bps. See figure below:

Watson-Crick (M-N) vs Hoogsteen base pairs

  • y — it is - if M and N are in a so-called Watson-Crick geometry (the two y-axes of the M and N base reference frames are anti-parallel, so are the two z-axes, whilst the two x-axes are parallel), e.g., the G-U Wobble pair; otherwise, *.
  • x — it is - for Watson-Crick bps, otherwise, *.
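
The z symbol can be sketched in a few lines of Python, assuming the z-axis of each base reference frame is available as a 3-component vector. The function name z_symbol is hypothetical, and treating a dot product of exactly zero as - is an assumption, since the text does not cover that case.

```python
def z_symbol(z_m, z_n):
    """Return '+' or '-' from the dot product of two z-axes (illustrative only)."""
    dot = sum(a * b for a, b in zip(z_m, z_n))
    # Assumption: a dot product of exactly zero is reported as '-'.
    return "+" if dot > 0 else "-"


# Opposite z-axes, as in a canonical Watson-Crick pair
print(z_symbol((0.0, 0.0, 1.0), (0.0, 0.0, -1.0)))
# Similar z-axes, as in a Hoogsteen pair
print(z_symbol((0.0, 0.0, 1.0), (0.1, 0.0, 0.9)))
```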

By design, Watson-Crick bps would be of the pattern M-----N, Wobble bps M-*---N, and non-canonical bps M-**+-N or M-**--N. Thus by browsing through the 3DNA output, users can readily identify these three bp types.
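
The three patterns can be told apart programmatically. The following Python sketch is only an illustration of the M-xyz-N convention described here, not code from 3DNA; the function name classify_bp is hypothetical, and any pattern other than the Watson-Crick and Wobble ones is labeled non-canonical.

```python
def classify_bp(pattern):
    """Classify a 7-character M-xyz-N pattern and return (type, shorthand)."""
    if len(pattern) != 7 or pattern[1] != "-" or pattern[5] != "-":
        raise ValueError(f"not an M-xyz-N pattern: {pattern!r}")
    m, x, y, z, n = pattern[0], pattern[2], pattern[3], pattern[4], pattern[6]
    if x not in "-*" or y not in "-*" or z not in "+-":
        raise ValueError(f"unexpected xyz symbols in {pattern!r}")
    if (x, y, z) == ("-", "-", "-"):
        kind = "Watson-Crick"
    elif (x, y, z) == ("*", "-", "-"):
        kind = "Wobble"
    else:
        kind = "non-canonical"
    return kind, m + z + n  # shorthand form MzN


for bp in ("G-----C", "G-*---U", "A-**+-A", "g-**--A"):
    print(bp, classify_bp(bp))
```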

The shortened form is represented as MzN; following the notation above, it can be either M-N or M+N. The relative direction of the two z-axes critically affects the 3DNA-calculated bp (and step) parameters, as detailed in the 2003 3DNA NAR paper:

To calculate the six complementary base pair parameters of an M–N pair (Shear, Stretch, Stagger, Buckle, Propeller and Opening), where the two z‐axes run in opposite directions, the reference frame of the complementary base N is rotated about the x2‐axis by 180°, i.e. reversing the y2‐ and z2‐axes in Figure 2a. Under this convention, if the base pair is reckoned as an N–M pair, rather than an M–N pair, the x‐axis parameters (Shear and Buckle) reverse their signs. For an M+N pair, e.g. the Hoogsteen A+U in Figure 2b, the x2‐, y2‐ and z2‐axes do not change sign; thus all six parameters for an N+M pair are of opposite sign(s) from those for an M+N pair.
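
The sign rule in the quoted passage can be written down as a short Python sketch. It only restates that rule: the function name swap_pair_order and the dictionary layout are hypothetical, and the numbers in the example are made up for illustration.

```python
BP_PARAMS = ("shear", "stretch", "stagger", "buckle", "propeller", "opening")


def swap_pair_order(params, z_sign):
    """Parameters of the N-M (or N+M) pair, given those of the M-N (or M+N) pair."""
    if z_sign == "-":
        flipped = {"shear", "buckle"}   # only the x-axis parameters change sign
    elif z_sign == "+":
        flipped = set(BP_PARAMS)        # all six parameters change sign
    else:
        raise ValueError("z_sign must be '+' or '-'")
    return {name: (-value if name in flipped else value)
            for name, value in params.items()}


# Made-up values, for illustration only
m_n = {"shear": -0.5, "stretch": -0.25, "stagger": 0.5,
       "buckle": 4.0, "propeller": -8.0, "opening": 2.0}
print(swap_pair_order(m_n, "-"))
print(swap_pair_order(m_n, "+"))
```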

The M-N and M+N bp designation is unique to 3DNA. In combination with the corresponding 6 bp parameters (shear, stretch, stagger, buckle, propeller, and opening), 3DNA provides a rigorous description of all possible bps. This contrasts with, and complements, the conventional Saenger scheme and the 3-edge based Leontis/Westhof notation.

The 3DNA M-N vs M+N bp designation is base-centric and does not consider the sugar-phosphate backbone. The chi (χ) torsion angle, which characterizes the relative orientation of base and sugar, can be in either the anti or the syn conformation; thus similar backbones can accommodate either M-N or M+N.

---

Seeing is understanding as well as believing

As the old saying goes, a picture is worth a thousand words. To help you have a better idea of what 3DNA/DSSR is about, we’ve collected the following pictures; they serve to demonstrate selected features from 3DNA/DSSR’s versatile functionality.

Cartoon-block schematic representations generated with DSSR and PyMOL

  • yeast phenylalanine tRNA (1ehz) with base blocks
  • yeast phenylalanine tRNA (1ehz) with WC base-pair blocks
  • 1msy: with the minor groove edge (black) of the C-G pair that closes the GUAA tetraloop facing the viewer
  • 27-nt rRNA fragment with GUAA tetraloop (1msy) -- base blocks in outline

Schematic diagram of base-pair parameters

Schematic diagram of rigid body parameters

Influence of Slide and Roll on DNA helical conformation

Roll-introduced DNA bending

Global bending of DNA associated with selective B → A conformational transformation

Canonical fiber models of A-, B-, C- and Z-DNA

3DNA-generated view of a four-way DNA–RNA junction (1egk)

four-way DNA–RNA junction (1egk)

3DNA-detected pentaplets in the large ribosomal subunit (1jj2)

pentaplets in the large ribosomal subunit (1jj2)

3DNA enabled the discovery of the O2′(G)−O2P(U) H-bond which stabilizes the GpU dinucleotide platform

GpU dinucleotide platform stabilized by the O2′(G)−O2P(U) H-bond

Nucleic-acid-containing structures generated with w3DNA

Analysis of DNA with a B-Z junction (2acj, left) and detection of hydration patterns (right)

B/Z junction and hydration patterns

Schematic images auto-generated via blocview

  • 2f4u: complex of the bacterial ribosomal aminoacyl-tRNA site (A-site) with a designer antibiotic
  • 408d: drug recognition of A-T and T-A base pairs in the B-DNA minor groove
  • 9ant: complex of DNA with the Antennapedia homeodomain

---

Welcome

A video overview of DSSR

DSSR (Dissecting the Spatial Structure of RNA) is an integrated software tool for the analysis/annotation, model building, and schematic visualization of 3D nucleic acid structures (see the figures below and the video overview). It is built upon the well-known, tested, and trusted 3DNA suite of programs. DSSR has been made possible by the developer’s extensive user-support experience, detail-oriented software engineering skills, and expert domain knowledge accumulated over two decades. It streamlines tasks in RNA/DNA structural bioinformatics, and outperforms its ‘competitors’ by far in terms of functionality, usability, and support.

Wide citations. DSSR has been widely cited in scientific literature, including: (i) “Selective small-molecule inhibition of an RNA structural element” (Nature, 2015; Merck Research Laboratories), (ii) “The structure of the yeast mitochondrial ribosome” (Science, 2017), (iii) “RNA force field with accuracy comparable to state-of-the-art protein force fields” (PNAS, 2018; D. E. Shaw Research), (iv) “Predicting site-binding modes of ions and water to nucleic acids using molecular solvation theory” (JACS, 2019), (v) “RIC-seq for global in situ profiling of RNA-RNA spatial interactions” (Nature, 2020), and (vi) “DNA mismatches reveal conformational penalties in protein-DNA recognition” (Nature, 2020).

Broad integrations. To make DSSR as widely accessible as possible, I have initiated collaborations with the principal developers of Jmol and PyMOL. The DSSR-Jmol and DSSR-PyMOL integrations bring unparalleled search capabilities (e.g., ‘select junctions’ for all multi-branch loops) and innovative visualization styles into 3D nucleic acid structures. DSSR has also been adopted into numerous other structural bioinformatics resources, including: (i) URS, (ii) RiboSketch, (iii) RNApdbee, (iv) forgi, (v) RNAvista, (vi) VeriNA3d, (vii) RNAMake, (viii) ElTetrado, (ix) DNAproDB, (x) LocalSTAR3D, (xi) IPANEMAP, and (xii) RNANet.

Advanced features. DSSR may be licensed from Columbia University. DSSR Pro is the commercial version. It has more functionalities than DSSR basic (the free academic version), including: (i) homology modeling via in silico base mutations, a feature employed by Merck scientists, (ii) easy generation of regular helical models, including circular or super-helical DNA (see figures below), (iii) creation of customized structures with user-specified base sequences and rigid-body parameters, (iv) efficient processing of molecular dynamics (MD) trajectories, (v) detailed characterization of DNA-protein or RNA-protein spatial interactions, and (vi) template-based modeling of DNA-protein complexes (see figures below). DSSR Pro supersedes 3DNA. It integrates the disparate analysis and modeling programs of 3DNA under one umbrella, and offers new advanced features, through a convenient interface. For example, with the mutate module of DSSR Pro, one can automatically perform the following tasks: (i) mutate all bases to Us, (ii) mutate bases in hairpin loops to Gs, and (iii) mutate G–C Watson-Crick pairs to C–G, and A–U to U–A. Moreover, DSSR Pro includes an in-depth user manual and one-year technical support from the developer.

Quality control. DSSR is a solid software product that excels in RNA structural bioinformatics. It is written in strict ANSI C, as a single command-line program. It is self-contained, with zero runtime dependencies on third-party libraries. The binary executables for macOS, Linux, and Windows are just ~2MB. DSSR has been extensively tested using all nucleic-acid-containing structures in the PDB. It is also routinely checked with Valgrind to avoid memory leaks. DSSR requires no setup or configuration: it simply works.


Theoretical models of G-quadruplexes, created using DSSR Pro.



Template-based modeling of DNA-protein complexes using DSSR Pro.
Here are two chromatin-like models using PDB entry 4xzq as the template.



Circular DNA duplexes modeled using DSSR Pro.




DNA super helices modeled using DSSR Pro.



Innovative cartoon-block schematics enabled by the DSSR-PyMOL integration for six representative PDB entries. Watson-Crick pairs are shown as long blocks with minor-groove edges in black (A, B), G-tetrads represented as square blocks and the metal ion as a sphere (C), the ligand rendered as balls-and-sticks (D), and proteins depicted as purple cartoons (E, F). Color code for base blocks: A, red; C, yellow; G, green; T, blue; U, cyan; G-tetrad, green; WC-pairs, per base in the leading strand. Visit http://skmatic.x3dna.org.
Recommended in Faculty Opinions: “simple and effective”, “Good for Teaching”.
Employed by the NDB to create cover images of the RNA Journal.

---

Outside links

The following links point to tools that are relevant to 3DNA.

  • Curves+ — an updated version of the well-known Curves program that conforms to the standard base reference frame.
  • 3D-DART — 3DNA-Driven DNA Analysis and Rebuilding Tool. Another web-interface to commonly used 3DNA functionality.
  • do_x3dna — “do_x3dna has been developed for analysis of the DNA/RNA dynamics during the molecular dynamics simulations. It uses the 3DNA package to calculate several structural descriptors of DNA/RNA from the GROMACS MD trajectory. It executes 3DNA tools to calculate these descriptors and subsequently, extracts these output and saves into external output files as a function of time.”
  • SwS — a Solvation web Service for Nucleic Acids where 3DNA plays a role.
  • Raster3D — a set of tools for generating high-quality raster images of proteins or other molecules.
  • MolScript — a program for displaying molecular 3D structures, such as proteins, in both schematic and detailed representations.
  • Jmol — an open-source Java viewer for chemical structures in 3D with features for chemicals, crystals, materials, and biomolecules.
  • PyMOL — a user-sponsored molecular visualization system on an open-source foundation.
  • ImageMagick — a software suite to create, edit, compose, or convert bitmap images.
  • NDB — Nucleic Acid Database.
  • SBGrid — Excellent services for structural biology laboratories as well as software developers.

---

New features in 3DNA v2.1

The v2.1 release of 3DNA, currently in beta, contains many refinements of existing C programs, a complete migration from Perl scripts to Ruby, and the addition of several significant new programs. All known bugs in v2.0 have been fixed. Highlights include:

  • Added mutate_bases to perform in silico base mutations in nucleic-acid-containing structures (DNA, RNA, and their complexes with ligands and proteins). The program has two key and unique features: (1) the sugar-phosphate backbone conformation is untouched; (2) the base reference frame (position and orientation) is preserved, i.e., the mutated structure shares the same base-pair/step parameters as those of the native structure.
  • Added x3dna_ensemble, a Ruby script to automate the processing of an NMR structure ensemble or MD trajectories in MODEL/ENDMDL delineated PDB format. It has sub-commands analyze, extract, reorient, and block_image. To be added: a convert sub-command to transform Amber, Gromacs or CHARMM trajectories.
  • Enhanced find_pair with -c+ option for generating input to Curves+.
  • Expanded fiber with the -s option for generating single stranded structures; the -seq option for specifying base sequence directly on the command line; and the -r option for generating RNA structures (single or double stranded) of arbitrary ACGU sequences.
  • Updated the ‘baselist.dat’ file to incorporate all types of NDB/PDB nucleotides as of February 15, 2015; refined find_pair/analyze/mutate_bases etc. to automatically detect and assign modified bases.
  • Renamed Atomic_a.pdb to Atomic.a.pdb etc. for modified bases to account for the Mac OS X filesystem case-sensitivity issue; copied all Perl scripts to a new directory perl_scripts/.
  • 3DNA now generates PDB files that are compliant with PDB format v3.x, and also has an option to allow for three-letter nucleotide names, making the output directly compatible with PdbViewer and HADDOCK. An option is provided to convert 3DNA-generated rectangular base blocks from Alchemy to the more widely accepted MDL molfile format (accepted, e.g., by PyMOL).

---

Thank you for printing this article from http://home.x3dna.org/. Please do not forget to visit back for more 3DNA-related information. — Xiang-Jun Lu