4.2.4. Calculating root mean square quantities — MDAnalysis.analysis.rms
¶
Author: | Oliver Beckstein, David L. Dotson, John Detlefs |
---|---|
Year: | 2016 |
Copyright: | GNU Public License v2 |
New in version 0.7.7.
Changed in version 0.11.0: Added RMSF
analysis.
Changed in version 0.16.0: Refactored RMSD to fit AnalysisBase API
The module contains code to analyze root mean square quantities such
as the coordinat root mean square distance (RMSD
) or the
per-residue root mean square fluctuations (RMSF
).
This module uses the fast QCP algorithm [Theobald2005] to calculate
the root mean square distance (RMSD) between two coordinate sets (as
implemented in
MDAnalysis.lib.qcprot.CalcRMSDRotationalMatrix()
).
When using this module in published work please cite [Theobald2005].
See also
MDAnalysis.analysis.align
- aligning structures based on RMSD
MDAnalysis.lib.qcprot
- implements the fast RMSD algorithm.
4.2.4.1. Example applications¶
4.2.4.1.1. Calculating RMSD for multiple domains¶
In this example we will globally fit a protein to a reference structure and investigate the relative movements of domains by computing the RMSD of the domains to the reference. The example is a DIMS trajectory of adenylate kinase, which samples a large closed-to-open transition. The protein consists of the CORE, LID, and NMP domain.
- superimpose on the closed structure (frame 0 of the trajectory), using backbone atoms
- calculate the backbone RMSD and RMSD for CORE, LID, NMP (backbone atoms)
The trajectory is included with the test data files. The data in
RMSD.rmsd
is plotted with matplotlib.pyplot.plot()
:
import MDAnalysis
from MDAnalysis.tests.datafiles import PSF,DCD,CRD
u = MDAnalysis.Universe(PSF,DCD)
ref = MDAnalysis.Universe(PSF,DCD) # reference closed AdK (1AKE) (with the default ref_frame=0)
#ref = MDAnalysis.Universe(PSF,CRD) # reference open AdK (4AKE)
import MDAnalysis.analysis.rms
R = MDAnalysis.analysis.rms.RMSD(u, ref,
select="backbone", # superimpose on whole backbone of the whole protein
groupselections=["backbone and (resid 1-29 or resid 60-121 or resid 160-214)", # CORE
"backbone and resid 122-159", # LID
"backbone and resid 30-59"], # NMP
filename="rmsd_all_CORE_LID_NMP.dat")
R.run()
R.save()
import matplotlib.pyplot as plt
rmsd = R.rmsd.T # transpose makes it easier for plotting
time = rmsd[1]
fig = plt.figure(figsize=(4,4))
ax = fig.add_subplot(111)
ax.plot(time, rmsd[2], 'k-', label="all")
ax.plot(time, rmsd[3], 'k--', label="CORE")
ax.plot(time, rmsd[4], 'r--', label="LID")
ax.plot(time, rmsd[5], 'b--', label="NMP")
ax.legend(loc="best")
ax.set_xlabel("time (ps)")
ax.set_ylabel(r"RMSD ($\AA$)")
fig.savefig("rmsd_all_CORE_LID_NMP_ref1AKE.pdf")
4.2.4.2. Functions¶
-
MDAnalysis.analysis.rms.
rmsd
(a, b, weights=None, center=False, superposition=False)[source]¶ Returns RMSD between two coordinate sets a and b.
a and b are arrays of the coordinates of N atoms of shape N*3 as generated by, e.g.,
MDAnalysis.core.groups.AtomGroup.coordinates()
.Note
If you use trajectory data from simulations performed under periodic boundary conditions then you must make your molecules whole before performing RMSD calculations so that the centers of mass of the mobile and reference structure are properly superimposed.
Parameters: - b (a,) – coordinates to align
- weights (array_like (optional)) – 1D array with weights, use to compute weighted average
- center (bool (optional)) – subtract center of geometry before calculation. With weights given compute weighted average as center.
- superposition (bool (optional)) – perform a rotational and translational superposition with the fast QCP algorithm [Theobald2005] before calculating the RMSD
Returns: rmsd – RMSD between a and b
Return type: Example
>>> u = Universe(PSF,DCD) >>> bb = u.select_atoms('backbone') >>> A = bb.positions.copy() # coordinates of first frame >>> u.trajectory[-1] # forward to last frame >>> B = bb.positions.copy() # coordinates of last frame >>> rmsd(A, B, center=True) 3.9482355416565049
4.2.4.3. Analysis classes¶
-
class
MDAnalysis.analysis.rms.
RMSD
(atomgroup, reference=None, select=’all’, groupselections=None, filename=’rmsd.dat’, mass_weighted=None, weights=None, tol_mass=0.1, ref_frame=0, **kwargs)[source]¶ Class to perform RMSD analysis on a trajectory.
The RMSD will be computed between select and reference for all frames in the trajectory belonging to atomgroup.
Note
If you use trajectory data from simulations performed under periodic boundary conditions then you must make your molecules whole before performing RMSD calculations so that the centers of mass of the mobile and reference structure are properly superimposed.
Run the analysis with
RMSD.run()
, which stores the results in the arrayRMSD.rmsd
.Parameters: - atomgroup (
AtomGroup
or) –Universe
MDAnalysis AtomGroup or Universe with an associated trajectory to be RMSD fit. - reference (
AtomGroup
or) –Universe
, optional Reference coordinates, ifNone
current frame of atomgroup is used - select (str / dict / tuple (optional)) –
The selection to operate on; can be one of:
- any valid selection string for
select_atoms()
that produces identical selections in mobile and reference; or - a dictionary
{'mobile':sel1, 'reference':sel2}
(theMDAnalysis.analysis.align.fasta2select()
function returns such a dictionary based on a ClustalW or STAMP sequence alignment); or - a tuple
(sel1, sel2)
When using 2. or 3. with sel1 and sel2 then these selections can also each be a list of selection strings (to generate a AtomGroup with defined atom order as described under Ordered selections).
- any valid selection string for
- groupselections (list (optional)) –
A list of selections as described for select Each selection describes additional RMSDs to be computed after the structures have be superpositioned according to select. The output contains one additional column for each selection. [
None
]Note
Experimental feature. Only limited error checking implemented.
- filename (str, optional) – write RSMD into file file
RMSD.save()
- mass_weighted (bool (deprecated)) – do a mass-weighted RMSD fit
- weights (str/array_like (optional)) – choose weights. If ‘str’ uses masses as weights
- tol_mass (float (optional)) – Reject match if the atomic masses for matched atoms differ by more than tol_mass
- ref_frame (int (optional)) – frame index to select frame from reference
Notes
The root mean square deviation of a group of \(N\) atoms relative to a reference structure as a function of time is calculated as
\[\rho(t) = \sqrt{\frac{1}{N} \sum_{i=1}^N \left(\mathbf{x}_i(t) - \mathbf{x}_i^{\text{ref}}\right)^2}\]This class uses Douglas Theobald’s fast QCP algorithm [Theobald2005] to calculate the RMSD (see
MDAnalysis.lib.qcprot
for implementation details).The class runs various checks on the input to ensure that the two atom groups can be compared. This includes a comparison of atom masses (i.e., only the positions of atoms of the same mass will be considered to be correct for comparison). If masses should not be checked, just set tol_mass to a large value such as 1000.
New in version 0.7.7.
Changed in version 0.8: groupselections added
Changed in version 0.16.0: Flexible weighting scheme with new weights keyword.
Deprecated since version 0.16.0: Instead of
mass_weighted=True
use newweights='mass'
; refactored to fit with AnalysisBase API-
rmsd
¶ Contains the time series of the RMSD as an N×3
numpy.ndarray
array with content[[frame, time (ps), RMSD (A)], [...], ...]
.
-
run
()¶ Perform the calculation
- atomgroup (
-
class
MDAnalysis.analysis.rms.
RMSF
(atomgroup, weights=None, **kwargs)[source]¶ Calculate RMSF of given atoms across a trajectory.
Parameters: - atomgroup (AtomGroup) – Atoms for which RMSF is calculated
- start (int (optional)) – starting frame, default None becomes 0.
- stop (int (optional)) – Frame index to stop analysis. Default: None becomes n_frames. Iteration stops before this frame number, which means that the trajectory would be read until the end.
- step (int (optional)) – step between frames, default None becomes 1.
- weights (str / array_like (optional)) – used weights. If
'mass'
use masses of atomgroup, itNone
use uniform weights. - verbose (bool (optional)) – if
False
, suppress all output
Note
No RMSD-superposition is performed; it is assumed that the user is providing a trajectory where the protein of interest has been structurally aligned to a reference structure.
Notes
The root mean square fluctuation of an atom \(i\) is computed as the time average
\[\rho_i = \sqrt{\left\langle (\mathbf{x}_i - \langle\mathbf{x}_i\rangle)^2 \right\rangle}\]This method implements an algorithm for computing sums of squares while avoiding overflows and underflows [Welford1962].
References
[Welford1962] B. P. Welford (1962). “Note on a Method for Calculating Corrected Sums of Squares and Products.” Technometrics 4(3):419-420. New in version 0.11.0.
Changed in version 0.16.0: Flexible weighting scheme with new weights keyword.
Deprecated since version 0.16.0: Instead of
mass_weighted=True
use newweights='mass'
; refactored to fit with AnalysisBase API; the keyword argument quiet is deprecated in favor of verbose.-
rmsf
¶ Results are stored in this N-length
numpy.ndarray
array, giving RMSFs for each of the given atoms.