4.1. Topology readers — MDAnalysis.topology

This submodule contains the topology readers. A topology file supplies the list of atoms in the system, their connectivity and possibly additional information such as B-factors, partial charges, etc. The details depend on the file format and not every topology file provides all (or even any) additional data. As a minimum, a topology file has to contain the names of atoms in the order of the coordinate file and their residue names and numbers.

The following table lists the currently supported topology formats.

Table of Supported Topology Formats
Name | Extension | Remarks
--- | --- | ---
CHARMM/XPLOR PSF | psf | reads either format; atoms, bonds, angles, and torsions/dihedrals information is all used; MDAnalysis.topology.PSFParser
CHARMM CARD [1] | crd | “CARD” coordinate output from CHARMM; deals with either standard or EXTended format; MDAnalysis.topology.CRDParser
Brookhaven [1] | pdb | a simplified PDB format (as used in MD simulations) is read by default; the full format can be read by supplying the permissive=False flag to MDAnalysis.Universe; MDAnalysis.topology.PrimitivePDBParser and MDAnalysis.topology.PDBParser
XPDB [1] | pdb | Extended PDB format (can use 5-digit residue numbers). To use, specify the format “XPDB” explicitly: Universe(..., topology_format="XPDB"). Module MDAnalysis.coordinates.PDB
PQR [1] | pqr | PDB-like but whitespace-separated files with charge and radius information; MDAnalysis.topology.PQRParser
PDBQT [1] | pdbqt | file format used by AutoDock with atom types t and partial charges q; MDAnalysis.topology.PDBQTParser
GROMOS96 [1] | gro | GROMOS96 coordinate file; MDAnalysis.topology.GROParser
AMBER | top, prmtop, parm7 | simple AMBER format reader (only supports a subset of flags); MDAnalysis.topology.TOPParser
DESRES [1] | dms | DESRES molecular structure reader (only supports the atom and bond records); MDAnalysis.topology.DMSParser
TPR [2] | tpr | Gromacs portable run input reader (limited experimental support for some of the more recent versions of the file format); MDAnalysis.topology.TPRParser
MOL2 [1] | mol2 | Tripos MOL2 molecular structure format; MDAnalysis.topology.MOL2Parser
LAMMPS [1] | data | LAMMPS data file parser; MDAnalysis.topology.LAMMPSParser
XYZ [1] | xyz | XYZ file parser. Reads only the labels from atoms and constructs minimal topology data; MDAnalysis.topology.XYZParser
GAMESS [1] | gms, log | GAMESS output parser. Reads only the atoms of the assembly section (atom, elems and coords) and constructs the topology; MDAnalysis.topology.GMSParser
DL_Poly [1] | config | DL_Poly CONFIG file. Reads only the atom names. If atoms are written out of order, the parser corrects the order; MDAnalysis.topology.DLPolyParser
DL_Poly [1] | history | DL_Poly HISTORY file. Reads only the atom names. If atoms are written out of order, the parser corrects the order; MDAnalysis.topology.DLPolyParser
Hoomd XML | xml | HOOMD XML topology file. Reads atom types, masses, and charges if possible. Also reads bonds, angles, and dihedrals; MDAnalysis.topology.HoomdXMLParser
[1] This format can also be used to provide coordinates, so a full Universe can be created by passing a file of this format as the sole argument to Universe: u = Universe(filename)
[2] The Gromacs TPR format contains coordinate information, but parsing coordinates from a TPR file is currently not implemented in TPRParser.
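
The relationship between file extension and parser in the table can be sketched as a plain lookup. This is an illustration only: PARSER_BY_EXTENSION and guess_parser_name() are hypothetical names invented here, not part of MDAnalysis, and the library's own format detection may work differently.

```python
import os

# Hypothetical lookup table transcribed from the "extension" and "remarks"
# columns of the table; it is NOT MDAnalysis's internal registry.
PARSER_BY_EXTENSION = {
    "psf": "PSFParser",
    "crd": "CRDParser",
    "pdb": "PrimitivePDBParser",  # default; PDBParser with permissive=False
    "pqr": "PQRParser",
    "pdbqt": "PDBQTParser",
    "gro": "GROParser",
    "top": "TOPParser",
    "prmtop": "TOPParser",
    "parm7": "TOPParser",
    "dms": "DMSParser",
    "tpr": "TPRParser",
    "mol2": "MOL2Parser",
    "data": "LAMMPSParser",
    "xyz": "XYZParser",
    "gms": "GMSParser",
    "log": "GMSParser",
    "config": "DLPolyParser",
    "history": "DLPolyParser",
    "xml": "HoomdXMLParser",
}


def guess_parser_name(filename):
    """Return the parser name the table lists for this file's extension."""
    # splitext keeps the leading dot, so strip it and ignore case
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    try:
        return PARSER_BY_EXTENSION[extension]
    except KeyError:
        raise ValueError("no topology parser listed for %r" % filename)
```

The XPDB format shares the pdb extension, which is why it cannot be found by such a lookup and has to be requested explicitly with topology_format="XPDB".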

4.1.1. Developer Notes

New in version 0.8.

Topology information consists of data that do not change over time, i.e. information that is the same for all time steps of a trajectory. This includes

  • identity of atoms (name, type, number, partial charge, ...) and to which residue and segment they belong; atoms are identified in MDAnalysis by their index, an integer number starting at 0 and incremented in the order of atoms found in the topology.
  • bonds (pairs of atoms)
  • angles (triplets of atoms)
  • dihedral angles (quadruplets of atoms) — proper and improper dihedrals should be treated separately

At the moment, only the identity of atoms is mandatory and at its most basic, the topology is simply a list of atoms to be associated with a list of coordinates.

The current implementation contains submodules for different topology file types. Each submodule must contain a function parse().

The function returns the basic MDAnalysis representation of the topology. At the moment, this is simply a dictionary with keys atoms, bonds, angles, dihedrals, impropers. The dictionary is stored as MDAnalysis.AtomGroup.Universe._topology.
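
A minimal sketch of such a parse() function follows. It assumes that parse() takes the topology file name as its only argument, reads a toy file format invented for this example ("ATOM name resname resid" and "BOND i j" lines), and uses a stand-in SketchAtom tuple instead of real Atom instances so that it runs without MDAnalysis. Only the returned dictionary keys are taken from the description above.

```python
from collections import namedtuple

# Stand-in for the MDAnalysis Atom class (hypothetical, for this sketch only);
# a real parser creates Atom instances instead.
SketchAtom = namedtuple("SketchAtom", "index name resname resid")


def parse(filename):
    """Read a toy topology file and return the topology dictionary.

    Line formats (invented for this sketch):
        ATOM <name> <resname> <resid>
        BOND <i> <j>
    """
    atoms, bonds = [], []
    with open(filename) as topfile:
        for line in topfile:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            if fields[0] == "ATOM":
                name, resname, resid = fields[1], fields[2], int(fields[3])
                # the atom index is its position in the file, starting at 0
                atoms.append(SketchAtom(len(atoms), name, resname, resid))
            elif fields[0] == "BOND":
                i, j = int(fields[1]), int(fields[2])
                # store only one permutation, lower atom number first
                bonds.append((min(i, j), max(i, j)))
    return {
        "atoms": atoms,
        "bonds": tuple(bonds),
        "angles": [],
        "dihedrals": [],
        "impropers": [],
    }
```

Keys for which the file provides no data are returned empty here; whether a real parser omits them or leaves them empty is not specified above.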

Warning

The internal dictionary representation is subject to change. User code should not access this dictionary directly. The information provided here is solely for developers who need to work with the existing parsers.

The format of the individual keys is the following (see PSFParser for a reference implementation):

4.1.1.1. atoms

The atoms are represented as a list of Atom instances. The parser needs to initialize the Atom objects with the data read from the topology file.

The order of atoms in the list must correspond to the sequence of atoms in the topology file. The atom’s index corresponds to its index in this list.

4.1.1.2. bonds

Bonds are represented as a tuple of tuples. Each inner tuple contains two atom numbers, which indicate the atoms between which the bond is formed. Only one of the two permutations is stored, typically the one with the lower atom number first.
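
One way to bring raw atom pairs into this form is sketched below. The helper name canonical_bonds() is hypothetical, and sorting the result is a choice made for this example rather than a documented requirement.

```python
def canonical_bonds(pairs):
    """Return bonds as a tuple of (low, high) tuples without duplicates."""
    # putting the lower atom number first makes (1, 0) and (0, 1) identical,
    # so the set keeps only one permutation of each bond
    unique = {(min(i, j), max(i, j)) for i, j in pairs}
    return tuple(sorted(unique))
```

For example, canonical_bonds([(1, 0), (0, 1), (2, 1)]) keeps a single entry for the bond between atoms 0 and 1.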

4.1.1.3. bondorder

Some bonds have additional information called order. When available, this is stored in a dictionary of the format {bondtuple: order}. This extra information is then passed to Bond initialisation in u._init_bonds().
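
The dictionary format can be illustrated as follows. The order values and the helper bonds_with_order() are made up for this sketch; how u._init_bonds() actually consumes the dictionary is not shown here.

```python
bonds = ((0, 1), (1, 2), (2, 3))

# Illustrative values only: keys are the same tuples that appear in bonds;
# a bond without order information is simply absent from the dictionary.
bondorder = {(0, 1): 2, (1, 2): 1}


def bonds_with_order(bonds, bondorder):
    """Pair each bond tuple with its order, or None when none is stored."""
    return [(bond, bondorder.get(bond)) for bond in bonds]
```

Because the keys are the bond tuples themselves, the lookup only succeeds when the dictionary uses the same permutation of atom numbers as the bonds entry.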

4.1.1.4. angles

Angles are represented by a list of tuples. Each tuple contains three atom numbers. The second of these numbers represents the apex of the angle.
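
To make the role of the apex concrete, the following sketch computes the angle for one such tuple from a list of Cartesian positions. The function angle_degrees() and the positions argument are assumptions of this example, not part of the parser interface.

```python
import math


def angle_degrees(positions, angle):
    """Angle at the apex (second atom) of an (i, j, k) tuple, in degrees."""
    i, j, k = angle
    # vectors pointing from the apex j to the two outer atoms
    a = [p - q for p, q in zip(positions[i], positions[j])]
    b = [p - q for p, q in zip(positions[k], positions[j])]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    # clamp to guard against rounding slightly outside [-1, 1]
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
```

Reordering the tuple so that a different atom is in the second position gives a different angle, which is why the order within each tuple matters.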

4.1.1.5. dihedrals

Proper dihedral angles are represented by a list of tuples. Each tuple contains four atom numbers. The angle of the torsion is defined by the angle between the planes formed by atoms 1, 2, and 3, and 2, 3, and 4.
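
The plane-based definition can be sketched with the common atan2 formulation, which builds the normals of the two planes from cross products. The function dihedral_degrees() and the positions argument are assumptions of this example, and the sign convention is that of this formula, not necessarily the one used elsewhere in MDAnalysis.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(positions, dihedral):
    """Signed angle between planes (1, 2, 3) and (2, 3, 4), in degrees."""
    i, j, k, l = dihedral
    b1 = _sub(positions[j], positions[i])
    b2 = _sub(positions[k], positions[j])
    b3 = _sub(positions[l], positions[k])
    n1 = _cross(b1, b2)  # normal of the plane through atoms 1, 2, 3
    n2 = _cross(b2, b3)  # normal of the plane through atoms 2, 3, 4
    # atan2 yields a signed angle and stays stable near 0 and 180 degrees
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

With this formula a planar cis arrangement gives 0 degrees and a planar trans arrangement gives a magnitude of 180 degrees.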

4.1.1.6. impropers

Improper dihedral angles are represented by a list of tuples. Each tuple contains four atom numbers. The angle of the improper torsion is again defined by the angle between the planes formed by atoms 1, 2, and 3, and 2, 3, and 4. Improper dihedrals differ from regular dihedrals in that the four atoms need not be sequentially bonded; instead, they are often all bonded to the second atom.
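
The difference in bonding pattern can be checked against the bonds entry, as in the sketch below. The function classify_quadruplet() is hypothetical and assumes that every bond is stored with the lower atom number first.

```python
def classify_quadruplet(bonds, quad):
    """Describe how the four atoms of a dihedral tuple are bonded."""
    bond_set = set(bonds)

    def bonded(a, b):
        # assumes bonds are stored with the lower atom number first
        return (min(a, b), max(a, b)) in bond_set

    i, j, k, l = quad
    if bonded(i, j) and bonded(j, k) and bonded(k, l):
        return "sequential"  # chain 1-2-3-4, as in a proper dihedral
    if bonded(j, i) and bonded(j, k) and bonded(j, l):
        return "centered on second atom"  # common pattern for impropers
    return "other"
```

Since the text only says that impropers are often centered on the second atom, the "other" branch is kept for quadruplets that match neither pattern.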