1# -*- Mode: python; tab-width: 4; indent-tabs-mode:nil; coding:utf-8 -*-
2# vim: tabstop=4 expandtab shiftwidth=4 softtabstop=4
3#
4# MDAnalysis --- https://www.mdanalysis.org
5# Copyright (c) 2006-2017 The MDAnalysis Development Team and contributors
6# (see the file AUTHORS for the full list of names)
7#
8# Released under the GNU Public Licence, v2 or any higher version
9#
10# Please cite your use of MDAnalysis in published work:
11#
12# R. J. Gowers, M. Linke, J. Barnoud, T. J. E. Reddy, M. N. Melo, S. L. Seyler,
13# D. L. Dotson, J. Domanski, S. Buchoux, I. M. Kenney, and O. Beckstein.
14# MDAnalysis: A Python package for the rapid analysis of molecular dynamics
15# simulations. In S. Benthall and S. Rostrup editors, Proceedings of the 15th
16# Python in Science Conference, pages 102-109, Austin, TX, 2016. SciPy.
17#
18# N. Michaud-Agrawal, E. J. Denning, T. B. Woolf, and O. Beckstein.
19# MDAnalysis: A Toolkit for the Analysis of Molecular Dynamics Simulations.
20# J. Comput. Chem. 32 (2011), 2319--2327, doi:10.1002/jcc.21787
21#
22r"""Dihedral angles analysis --- :mod:`MDAnalysis.analysis.dihedrals`
23===========================================================================
24
25:Author: Henry Mull
26:Year: 2018
27:Copyright: GNU Public License v2
28
29.. versionadded:: 0.19.0
30
31This module contains classes for calculating dihedral angles for a given set of
32atoms or residues. This can be done for selected frames or whole trajectories.
33
34A list of time steps that contain angles of interest is generated and can be
35easily plotted if desired. For the :class:`~MDAnalysis.analysis.dihedrals.Ramachandran`
36and :class:`~MDAnalysis.analysis.dihedrals.Janin` classes, basic plots can be
37generated using the method :meth:`Ramachandran.plot()` or :meth:`Janin.plot()`.
38These plots are best used as references, but they also allow for user customization.
39
40
41See Also
42--------
43:func:`MDAnalysis.lib.distances.calc_dihedrals()`
44   function to calculate dihedral angles from atom positions
45
46
47Example applications
48--------------------
49
50General dihedral analysis
51~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
52
53The :class:`~MDAnalysis.analysis.dihedrals.Dihedral` class is useful for calculating
54angles for many dihedrals of interest. For example, we can find the phi angles
55for residues 5-10 of adenylate kinase (AdK). The trajectory is included within
56the test data files::
57
58   import MDAnalysis as mda
59   from MDAnalysisTests.datafiles import GRO, XTC
60   u = mda.Universe(GRO, XTC)
61
62   # selection of atomgroups
63   ags = [res.phi_selection() for res in u.residues[4:9]]
64
65   from MDAnalysis.analysis.dihedrals import Dihedral
66   R = Dihedral(ags).run()
67
68The angles can then be accessed with :attr:`Dihedral.angles`.
69
70Ramachandran analysis
71~~~~~~~~~~~~~~~~~~~~~
72
73The :class:`~MDAnalysis.analysis.dihedrals.Ramachandran` class allows for the
74quick calculation of phi and psi angles. Unlike the :class:`~MDanalysis.analysis.dihedrals.Dihedral`
75class which takes a list of `atomgroups`, this class only needs a list of
76residues or atoms from those residues. The previous example can repeated with::
77
78   u = mda.Universe(GRO, XTC)
79   r = u.select_atoms("resid 5-10")
80
81   R = Ramachandran(r).run()
82
83Then it can be plotted using the built-in plotting method :meth:`Ramachandran.plot()`::
84
85   import matplotlib.pyplot as plt
86   fig, ax = plt.subplots(figsize=plt.figaspect(1))
87   R.plot(ax=ax, color='k', marker='s')
88
89as shown in the example :ref:`Ramachandran plot figure <figure-ramachandran>`.
90
91.. _figure-ramachandran:
92
93.. figure:: /images/rama_demo_plot.png
94   :scale: 50 %
95   :alt: Ramachandran plot
96
97   Ramachandran plot for residues 5 to 10 of AdK, sampled from the AdK test
98   trajectory (XTC). The contours in the background are the "allowed region"
99   and the "marginally allowed" regions.
100
101The Janin class works in the same way, only needing a list of residues; see the
102:ref:`Janin plot figure <figure-janin>` as an example. To plot the data
103yourself, the angles can be accessed using :attr:`Ramachandran.angles` or
104:attr:`Janin.angles`.
105
106Reference plots can be added to the axes for both the Ramachandran and Janin
107classes using the kwarg ``ref=True``. The Ramachandran reference data
108(:data:`~MDAnalysis.analysis.data.filenames.Rama_ref`) and Janin reference data
109(:data:`~MDAnalysis.analysis.data.filenames.Janin_ref`) were made using data
110obtained from a large selection of 500 PDB files, and were analyzed using these
111classes. The allowed and marginally allowed regions of the Ramachandran reference
112plt have cutoffs set to include 90% and 99% of the data points, and the Janin
113reference plot has cutoffs for 90% and 98% of the data points. The list of PDB
114files used for the referece plots was taken from [Lovell2003]_ and information
115about general Janin regions was taken from [Janin1978]_.
116
117.. Note::
118   These classes are prone to errors if the topology contains duplicate or missing
119   atoms (e.g. atoms with `altloc` or incomplete residues). If the topology has as
120   an `altloc` attribute, you must specify only one `altloc` for the atoms with
121   more than one (``"protein and not altloc B"``).
122
123
124.. _figure-janin:
125
126.. figure:: /images/janin_demo_plot.png
127   :scale: 50 %
128   :alt: Janin plot
129
130   Janin plot for residues 5 to 10 of AdK, sampled from the AdK test trajectory
131   (XTC). The contours in the background are the "allowed region" and the
132   "marginally allowed" regions for all possible residues.
133
134
135Analysis Classes
136----------------
137
138.. autoclass:: Dihedral
139   :members:
140   :inherited-members:
141
142   .. attribute:: angles
143
144       Contains the time steps of the angles for each atomgroup in the list as
145       an ``n_frames×len(atomgroups)`` :class:`numpy.ndarray` with content
146       ``[[angle 1, angle 2, ...], [time step 2], ...]``.
147
148.. autoclass:: Ramachandran
149   :members:
150   :inherited-members:
151
152   .. attribute:: angles
153
154       Contains the time steps of the :math:`\phi` and :math:`\psi` angles for
155       each residue as an ``n_frames×n_residues×2`` :class:`numpy.ndarray` with
156       content ``[[[phi, psi], [residue 2], ...], [time step 2], ...]``.
157
158.. autoclass:: Janin
159   :members:
160   :inherited-members:
161
162   .. attribute:: angles
163
164       Contains the time steps of the :math:`\chi_1` and :math:`\chi_2` angles
165       for each residue as an ``n_frames×n_residues×2`` :class:`numpy.ndarray`
166       with content ``[[[chi1, chi2], [residue 2], ...], [time step 2], ...]``.
167
168References
169----------
170.. [Lovell2003] Simon C. Lovell, Ian W. Davis, W. Bryan Arendall III,
171   Paul I. W. de Bakker, J. Michael Word, Michael G. Prisant,
172   Jane S. Richardson, and David C. Richardson (2003). "Structure validation by
173   :math:`C_{\alpha}` geometry: :math:`\phi`, :math:`\psi`, and
174   :math:`C_{\beta}` deviation". *Proteins* 50(3): 437-450. doi:
175   `10.1002/prot.10286 <https://doi.org/10.1002/prot.10286>`_
176
177.. [Janin1978] Joël Janin, Shoshanna Wodak, Michael Levitt, and Bernard
178   Maigret. (1978). "Conformation of amino acid side-chains in
179   proteins". *Journal of Molecular Biology* 125(3): 357-386. doi:
180   `10.1016/0022-2836(78)90408-4 <https://doi.org/10.1016/0022-2836(78)90408-4>`_
181
182"""
183from __future__ import absolute_import
184from six.moves import zip, range
185
186import numpy as np
187import matplotlib.pyplot as plt
188
189import warnings
190
191import MDAnalysis as mda
192from MDAnalysis.analysis.base import AnalysisBase
193from MDAnalysis.lib.distances import calc_dihedrals
194from MDAnalysis.analysis.data.filenames import Rama_ref, Janin_ref
195
196
197class Dihedral(AnalysisBase):
198    """Calculate dihedral angles for specified atomgroups.
199
200    Dihedral angles will be calculated for each atomgroup that is given for
201    each step in the trajectory. Each :class:`~MDAnalysis.core.groups.AtomGroup`
202    must contain 4 atoms.
203
204    Note
205    ----
206    This class takes a list as an input and is most useful for a large
207    selection of atomgroups. If there is only one atomgroup of interest, then
208    it must be given as a list of one atomgroup.
209
210    """
211    def __init__(self, atomgroups, **kwargs):
212        """Parameters
213        ----------
214        atomgroups : list
215            a list of atomgroups for which the dihedral angles are calculated
216
217        Raises
218        ------
219        ValueError
220            If any atomgroups do not contain 4 atoms
221
222        """
223        super(Dihedral, self).__init__(atomgroups[0].universe.trajectory, **kwargs)
224        self.atomgroups = atomgroups
225
226        if any([len(ag) != 4 for ag in atomgroups]):
227            raise ValueError("All AtomGroups must contain 4 atoms")
228
229        self.ag1 = mda.AtomGroup([ag[0] for ag in atomgroups])
230        self.ag2 = mda.AtomGroup([ag[1] for ag in atomgroups])
231        self.ag3 = mda.AtomGroup([ag[2] for ag in atomgroups])
232        self.ag4 = mda.AtomGroup([ag[3] for ag in atomgroups])
233
234    def _prepare(self):
235        self.angles = []
236
237    def _single_frame(self):
238        angle = calc_dihedrals(self.ag1.positions, self.ag2.positions,
239                               self.ag3.positions, self.ag4.positions,
240                               box=self.ag1.dimensions)
241        self.angles.append(angle)
242
243    def _conclude(self):
244        self.angles = np.rad2deg(np.array(self.angles))
245
246class Ramachandran(AnalysisBase):
247    """Calculate :math:`\phi` and :math:`\psi` dihedral angles of selected residues.
248
249    :math:`\phi` and :math:`\psi` angles will be calculated for each residue
250    corresponding to `atomgroup` for each time step in the trajectory. A
251    :class:`~MDAnalysis.ResidueGroup` is generated from `atomgroup` which is
252    compared to the protein to determine if it is a legitimate selection.
253
254    Note
255    ----
256    If the residue selection is beyond the scope of the protein, then an error
257    will be raised. If the residue selection includes the first or last residue,
258    then a warning will be raised and they will be removed from the list of
259    residues, but the analysis will still run. If a :math:`\phi` or :math:`\psi`
260    selection cannot be made, that residue will be removed from the analysis.
261
262    """
263    def __init__(self, atomgroup, **kwargs):
264        """Parameters
265        ----------
266        atomgroup : AtomGroup or ResidueGroup
267            atoms for residues for which :math:`\phi` and :math:`\psi` are
268            calculated
269
270        Raises
271        ------
272        ValueError
273            If the selection of residues is not contained within the protein
274
275        """
276        super(Ramachandran, self).__init__(atomgroup.universe.trajectory, **kwargs)
277        self.atomgroup = atomgroup
278        residues = self.atomgroup.residues
279        protein = self.atomgroup.universe.select_atoms("protein").residues
280
281        if not residues.issubset(protein):
282            raise ValueError("Found atoms outside of protein. Only atoms "
283                             "inside of a 'protein' selection can be used to "
284                             "calculate dihedrals.")
285        elif not residues.isdisjoint(protein[[0, -1]]):
286            warnings.warn("Cannot determine phi and psi angles for the first "
287                          "or last residues")
288            residues = residues.difference(protein[[0, -1]])
289
290        phi_sel = [res.phi_selection() for res in residues]
291        psi_sel = [res.psi_selection() for res in residues]
292        # phi_selection() and psi_selection() currently can't handle topologies
293        # with an altloc attribute so this removes any residues that have either
294        # angle return none instead of a value
295        if any(sel is None for sel in phi_sel):
296            warnings.warn("Some residues in selection do not have phi selections")
297            remove = [i for i, sel in enumerate(phi_sel) if sel is None]
298            phi_sel = [sel for i, sel in enumerate(phi_sel) if i not in remove]
299            psi_sel = [sel for i, sel in enumerate(psi_sel) if i not in remove]
300        if any(sel is None for sel in psi_sel):
301            warnings.warn("Some residues in selection do not have psi selections")
302            remove = [i for i, sel in enumerate(psi_sel) if sel is None]
303            phi_sel = [sel for i, sel in enumerate(phi_sel) if i not in remove]
304            psi_sel = [sel for i, sel in enumerate(psi_sel) if i not in remove]
305        self.ag1 = mda.AtomGroup([atoms[0] for atoms in phi_sel])
306        self.ag2 = mda.AtomGroup([atoms[1] for atoms in phi_sel])
307        self.ag3 = mda.AtomGroup([atoms[2] for atoms in phi_sel])
308        self.ag4 = mda.AtomGroup([atoms[3] for atoms in phi_sel])
309        self.ag5 = mda.AtomGroup([atoms[3] for atoms in psi_sel])
310
311    def _prepare(self):
312        self.angles = []
313
314    def _single_frame(self):
315        phi_angles = calc_dihedrals(self.ag1.positions, self.ag2.positions,
316                                    self.ag3.positions, self.ag4.positions,
317                                    box=self.ag1.dimensions)
318        psi_angles = calc_dihedrals(self.ag2.positions, self.ag3.positions,
319                                    self.ag4.positions, self.ag5.positions,
320                                    box=self.ag1.dimensions)
321        phi_psi = [(phi, psi) for phi, psi in zip(phi_angles, psi_angles)]
322        self.angles.append(phi_psi)
323
324    def _conclude(self):
325        self.angles = np.rad2deg(np.array(self.angles))
326
327
328    def plot(self, ax=None, ref=False, **kwargs):
329        """Plots data into standard ramachandran plot. Each time step in
330        :attr:`Ramachandran.angles` is plotted onto the same graph.
331
332        Parameters
333        ----------
334        ax : :class:`matplotlib.axes.Axes`
335              If no `ax` is supplied or set to ``None`` then the plot will
336              be added to the current active axes.
337
338        ref : bool, optional
339              Adds a general Ramachandran plot which shows allowed and
340              marginally allowed regions
341
342        Returns
343        -------
344        ax : :class:`matplotlib.axes.Axes`
345             Axes with the plot, either `ax` or the current axes.
346
347        """
348        if ax is None:
349            ax = plt.gca()
350        ax.axis([-180,180,-180,180])
351        ax.axhline(0, color='k', lw=1)
352        ax.axvline(0, color='k', lw=1)
353        ax.set(xticks=range(-180, 181, 60), yticks=range(-180, 181, 60),
354               xlabel=r"$\phi$ (deg)", ylabel=r"$\psi$ (deg)")
355        if ref == True:
356            X, Y = np.meshgrid(np.arange(-180, 180, 4), np.arange(-180, 180, 4))
357            levels = [1, 17, 15000]
358            colors = ['#A1D4FF', '#35A1FF']
359            ax.contourf(X, Y, np.load(Rama_ref), levels=levels, colors=colors)
360        a = self.angles.reshape(np.prod(self.angles.shape[:2]), 2)
361        ax.scatter(a[:,0], a[:,1], **kwargs)
362        return ax
363
364class Janin(Ramachandran):
365    """Calculate :math:`\chi_1` and :math:`\chi_2` dihedral angles of selected residues.
366
367    :math:`\chi_1` and :math:`\chi_2` angles will be calculated for each residue
368    corresponding to `atomgroup` for each time step in the trajectory. A
369    :class:`~MDAnalysis.ResidueGroup` is generated from `atomgroup` which is
370    compared to the protein to determine if it is a legitimate selection.
371
372    Note
373    ----
374    If the residue selection is beyond the scope of the protein, then an error
375    will be raised. If the residue selection includes the residues ALA, CYS,
376    GLY, PRO, SER, THR, or VAL, then a warning will be raised and they will be
377    removed from the list of residues, but the analysis will still run. Some
378    topologies have altloc attribues which can add duplicate atoms to the
379    selection and must be removed.
380
381    """
382    def __init__(self, atomgroup, **kwargs):
383        """Parameters
384        ----------
385        atomgroup : AtomGroup or ResidueGroup
386            atoms for residues for which :math:`\chi_1` and :math:`\chi_2` are
387            calculated
388
389        Raises
390        ------
391        ValueError
392             If the selection of residues is not contained within the protein
393
394        ValueError
395             If not enough or too many atoms are found for a residue in the
396             selection, usually due to missing atoms or alternative locations
397
398        """
399        super(Ramachandran, self).__init__(atomgroup.universe.trajectory, **kwargs)
400        self.atomgroup = atomgroup
401        residues = atomgroup.residues
402        protein = atomgroup.universe.select_atoms("protein").residues
403        remove = residues.atoms.select_atoms("resname ALA CYS GLY PRO SER"
404                                             " THR VAL").residues
405
406        if not residues.issubset(protein):
407            raise ValueError("Found atoms outside of protein. Only atoms "
408                             "inside of a 'protein' selection can be used to "
409                             "calculate dihedrals.")
410        elif len(remove) != 0:
411            warnings.warn("All ALA, CYS, GLY, PRO, SER, THR, and VAL residues"
412                          " have been removed from the selection.")
413            residues = residues.difference(remove)
414
415        self.ag1 = residues.atoms.select_atoms("name N")
416        self.ag2 = residues.atoms.select_atoms("name CA")
417        self.ag3 = residues.atoms.select_atoms("name CB")
418        self.ag4 = residues.atoms.select_atoms("name CG CG1")
419        self.ag5 = residues.atoms.select_atoms("name CD CD1 OD1 ND1 SD")
420
421        # if there is an altloc attribute, too many atoms will be selected which
422        # must be removed before using the class, or the file is missing atoms
423        # for some residues which must also be removed
424        if any(len(self.ag1) != len(ag) for ag in [self.ag2, self.ag3,
425                                                   self.ag4, self.ag5]):
426            raise ValueError("Too many or too few atoms selected. Check for "
427                             "missing or duplicate atoms in topology.")
428
429    def _conclude(self):
430        self.angles = (np.rad2deg(np.array(self.angles)) + 360) % 360
431
432    def plot(self, ax=None, ref=False, **kwargs):
433        """Plots data into standard Janin plot. Each time step in
434        :attr:`Janin.angles` is plotted onto the same graph.
435
436        Parameters
437        ----------
438        ax : :class:`matplotlib.axes.Axes`
439              If no `ax` is supplied or set to ``None`` then the plot will
440              be added to the current active axes.
441
442        ref : bool, optional
443              Adds a general Janin plot which shows allowed and marginally
444              allowed regions
445
446        Returns
447        -------
448        ax : :class:`matplotlib.axes.Axes`
449             Axes with the plot, either `ax` or the current axes.
450
451        """
452        if ax is None:
453            ax = plt.gca()
454        ax.axis([0, 360, 0, 360])
455        ax.axhline(180, color='k', lw=1)
456        ax.axvline(180, color='k', lw=1)
457        ax.set(xticks=range(0, 361, 60), yticks=range(0, 361, 60),
458               xlabel=r"$\chi1$ (deg)", ylabel=r"$\chi2$ (deg)")
459        if ref == True:
460            X, Y = np.meshgrid(np.arange(0, 360, 6), np.arange(0, 360, 6))
461            levels = [1, 6, 600]
462            colors = ['#A1D4FF', '#35A1FF']
463            ax.contourf(X, Y, np.load(Janin_ref), levels=levels, colors=colors)
464        a = self.angles.reshape(np.prod(self.angles.shape[:2]), 2)
465        ax.scatter(a[:,0], a[:,1], **kwargs)
466        return ax
467