Show Hydrogens in PyMol Tutorial
Introduction
PyMol is a very powerful, general-purpose program for viewing, analysing and documenting
macromolecular structures, available on all commonly used computer platforms. Although
there are a few other comparable programs available (ChimeraX [Link] is the main
alternative), PyMol has excellent overall versatility, power and ease of use.
Myoglobin
Through most of this tutorial we will be looking at 1A6M, a 1.0 Ångstrom resolution
structure of sperm whale myoglobin. This is a modern, high-resolution refinement of the
first protein structure ever solved, in 1958, by Kendrew and colleagues at Cambridge.
The protein has an iron-containing heme group that reversibly binds molecular oxygen (O2);
this structure is discussed in introductory biochemistry courses. At Guelph this is the first
protein discussed in BIOC3560, so you are hopefully already broadly familiar with its
function and organization. This tutorial should give you some appreciation of the
additional possibilities of working directly with a structure, rather than relying on static
images in a textbook.
(Vojtechovsky, J., Chu, K., Berendzen, J., Sweet, R.M., Schlichting, I. Crystal Structures of
Myoglobin-Ligand Complexes at Near-Atomic Resolution. Biophys. J. v77 pp. 2153-2164, 1999)
Downloading and installing the program:
PyMol unfortunately now requires buying a subscription. For the purpose of use in the
course, we have obtained an educational license. The version of Pymol given below is
only for students enrolled in the course; please do not share the access information with
anybody not enrolled in MCB*3560. Please conform with the terms of the license we
have been granted.
Go to [Link]
Username: jun2021
Password: betabarrel
In general, in this course, we will be using Pymol v 3.10. Versions of PyMol are available
for Windows and Mac OSX (as well as for Linux). Choose the version appropriate to your
operating system.
You will need to download a license file, and then link it to PyMol when prompted the
first time you launch the program.
Note: you may be tempted to save time by simply reading this tutorial, rather than
going through each exercise. Do not do this, as this will not lead to an adequate
understanding of how to use the program, and you will be lost when you actually need
to use it.
Getting Oriented in Pymol
Launch PyMol through the start menu (Start->Programs->PyMol). You should be looking
at something like this (note – I am using OSX; windows likely looks a little different):
[Figure: the PyMol main window, with labels for the upper menu, main toolbar, viewer window,
Viewer GUI, below-GUI menu, information window and command line]
At the top of the screen is the main menu for Pymol. This menu controls loading and
saving files, as well as numerous settings that alter the display properties of objects. The
large viewer window is an interactive surface where molecules will be displayed. Below
this is a command line, where written commands can be input to control the program,
including functions that go beyond what is available in the point-and-click menus. To
the right of the viewer window is the Viewer GUI, which controls the appearance of specific
molecules. There are also two other menus, one above the viewer window and one below the
Viewer GUI, that give access to various other global settings.
Loading a pdb file
To find all structures that contain myoglobin, you can simply search the Protein Data Bank
(PDB) website with “myoglobin”.
However, there are many such structures (>2500). For this tutorial we want to use a
specific version of the structure - 1A6M - to ensure that we are looking at the same
thing. To access specifically this structure, enter “1A6M” into the search bar. You should
get a window which gives details about the pdb file, including an image of the structure:
At the top right corner of the browser window is a set of menus; under Download Files
hit the “PDB File (Text)” option; this will download the full pdb file to your hard drive;
the desktop is a convenient destination, but I would suggest creating a new directory
specifically to work on this tutorial.
File->Open in Pymol, followed by navigating to the pdb file saved on your hard drive, will
open it. This allows you to have more than one pdb file in the same session.
As a short cut, if you selected PyMol as your default viewer, you can open the pdb file by
double clicking the icon representing the pdb file (this will launch a new PyMol
session).
File->Get PDB… is another very useful way of loading molecules. This method gives you
additional options - it allows you to be certain that you are viewing the oligomeric state
(not true by default), as well as enabling loading electron density maps.
Finally, a quick option is to type “fetch 1A6M” in the command line; this will retrieve the
file from the pdb website for you.
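The two typed routes for loading can be sketched as follows. This is a minimal sketch, not part of the original tutorial: `load_command` is a hypothetical helper that only builds the command text (so it can be checked outside PyMol), and the file name `1a6m.pdb` is just an example. `fetch` and `load` themselves are PyMol commands; the printed lines are what you would type in the command line.

```python
def load_command(pdb_id, local_file=None):
    """Build the PyMol command that loads a structure.

    With local_file, use `load` on a file already on disk; otherwise use
    `fetch`, which downloads the entry from the PDB.
    """
    if local_file is not None:
        # "load <file>, <object name>" names the new object after the PDB ID.
        return f"load {local_file}, {pdb_id}"
    # type=pdb asks for the classic .pdb format.
    return f"fetch {pdb_id}, type=pdb"


print(load_command("1A6M"))
print(load_command("1A6M", local_file="1a6m.pdb"))
```

Either printed line can be pasted into the PyMol command line.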
Here you can switch between viewing, editing, lights and motion modes. We will
generally be working in viewing mode. There are also options for 2-button modes –
useful if you only have access to a trackpad.
Changing the orientation of the molecule is simply a matter of left-clicking and dragging
within the viewer pane. Try clicking and dragging with the left mouse button within the
viewer panel; the molecule should rotate. Left clicking near the edge of the viewing
panel and dragging should rotate the molecule in the plane of the screen.
Shifting the molecule sideways is accomplished by middle mouse clicking and dragging.
Zooming in and out is accomplished by right mouse clicking and dragging up (to zoom
out) or down (to zoom in). Alternately, under the main menu, Display->Zoom->… gives a
series of preset zoom levels.
Adjusting clipping planes. Clipping planes can be thought of as invisible walls which cut
off the viewable portion of the molecule in front of and behind the plane of the screen.
This is helpful if half the molecule is blocking your view of the residues you are trying to
see. Press
<shift> and right-mouse click, and drag within the viewing window. This option is
especially useful when you are zoomed in and looking at details and need to clear up
some of the clutter.
Clipping planes can also be adjusted by using the main Menu Display->Clipping-> and
selecting the desired spacing of clipping planes.
One of the more convenient features for new users in the actions menu is the set of “preset”
options. These are essentially preset combinations of representations, together with
changes to some global variables that alter how proteins are represented. Try
out the various options, e.g. Actions->preset->publication. Note: to ensure that you
restore hidden settings, it is a good idea to go back to the “default” view at the end.
The Show Menu: Left click the next box, labeled “S”. The Show menu should appear,
with a list of options. Each of these shows a different representation of the selected atoms.
Under “wire”, lines shows residues and ligands as thin lines, while nonbonded shows
ions and water molecules as small “+” signs. Under “licorice”, sticks shows residues and
ligands as thicker sticks, while nb_spheres shows ions and water molecules as small
spheres. For figures, you generally want to use sticks and nb_spheres. However, lines
and nonbonded representations are very useful as a starting point. You can show all
residues as lines, select the ones you want to show in the final figure, show these as
sticks, and then hide the lines representation. Other key representations include
cartoon and surface. Try each of the options under the “as” submenu. Note which ones
you have seen in textbooks, course notes or papers, and which seem more rarely used.
If you want to combine multiple representations at once, click the corresponding name
directly under the Show menu (rather than under “As”); try, for example, the cartoon
and line representations together.
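The lines-first workflow described above can also be written as typed commands. A minimal sketch, assuming a hypothetical helper `lines_then_sticks` that only assembles the command text; `show` and `hide` are PyMol commands, and the heme selection `resn HEM` is used purely as an example of “the residues you want in the final figure”.

```python
def lines_then_sticks(obj, keep):
    """Commands for: everything as lines, chosen residues as sticks, then drop the lines."""
    return [
        f"hide everything, {obj}",
        f"show lines, {obj}",                # thin lines for residues and ligands
        f"show nonbonded, {obj}",            # small crosses for waters and ions
        f"show sticks, {obj} and ({keep})",  # thicker sticks for the residues of interest
        f"hide lines, {obj}",
        f"hide nonbonded, {obj}",
        f"show cartoon, {obj}",              # `show` adds a representation to what is displayed
    ]


for line in lines_then_sticks("1A6M", "resn HEM"):
    print(line)
```

Because `show` adds to what is already displayed, the last line leaves you with cartoon plus sticks together, which is the typed equivalent of clicking names directly under the Show menu.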
The Hide Menu (under H) hides the corresponding representation from the display. For
example, the red “+” signs are the oxygen atoms of individual water molecules, and this
representation is called “nonbonded” in Pymol. To hide this representation, use
Hide->nonbonded.
The Label Menu labels atoms or residues; labeling a few amino acids is informative,
labeling the whole protein tends to create a mess; you might want to leave off
experimenting with this menu until you have learned how to deal with subsets of
residues. In addition, because figures in papers need more finesse than Pymol offers,
labels are better added in an external graphics program.
The Color menu: Under the C button is the color menu. The last nine options (reds,
greens etc.) access a set of submenus that allow you to colour all the atoms the same
colour, with about 100 choices. The first four options are more complex. Coloring “by
element” uses the red-for-oxygen, blue-for-nitrogen convention while allowing a choice
of colours for the carbon atoms. The first entry in this submenu is especially useful, as it
imposes the standard “atom colors” on the other elements while keeping carbon as whatever
it was previously; this allows you to color the whole molecule an unusual color, and then
impose atom colors on the non-carbon atoms.
The first three options under “by chain” colour each chain a separate colour (not very
useful here as myoglobin has only one chain). The color-> by chain-> chainbow option
will colour each chain in a gradation from blue to red from the N- to the C- terminus. Try
colouring a ribbon representation of the molecule by “chainbow” to get an overall sense
of how the protein is put together.
Color-> by ss colors by secondary structure; unfortunately this palette is very limited, but
you can manually set colours for each secondary structural element too.
Color-> spectrum ->rainbow is similar to chainbow, except that the gradation runs across the
whole object rather than restarting in each chain (so with two chains, chain A is blue to
green, chain B is green to red).
Color -> spectrum ->b-factors colors by how well ordered a crystal structure is (blue is
best ordered, red is least well ordered). Note that this is meaningless for an NMR
structure, and that in an AlphaFold structure the B-factor column holds pLDDT (a measure
of prediction confidence) instead.
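Several of the colour menu entries have typed counterparts. The sketch below is an assumption-laden illustration rather than a description of what the menus run internally: the dictionary keys are informal labels, `colour_command` is a hypothetical helper, and `blue_white_red` is just one palette that runs from blue (low B-factor) to red (high B-factor). `util.cbay`, `util.chainbow`, `util.cbss` and `spectrum` are PyMol commands.

```python
# Informal menu label -> command template; {sel} is replaced by the selection.
COLOUR_COMMANDS = {
    "by element, carbon yellow": "util.cbay {sel}",
    "chainbow": "util.chainbow {sel}",
    "by ss": "util.cbss {sel}",
    "b-factors": "spectrum b, blue_white_red, {sel}",
}


def colour_command(scheme, selection="all"):
    """Return the command line for one of the colouring schemes above."""
    return COLOUR_COMMANDS[scheme].format(sel=selection)


print(colour_command("chainbow", "1A6M"))
print(colour_command("b-factors", "1A6M"))
```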
You can at any time save your work in a format called “.pse”. Under the main menu,
File->save session; you will be prompted to specify a location and name. <control>S will
allow you to quickly save the latest version as you go along. To reload the saved file,
double-click on the icon representing the saved file; alternatively, start up Pymol, and
under file->open select the .pse file. The pse file captures the exact state of your
session, including all molecules, saved scenes, and settings.
Saving Scenes
PyMol allows you to effectively “bookmark” specific views by saving a series of “scenes”
– i.e. the exact view you are looking at - and then restore them at any time. These
scenes will also be saved in the pse file, meaning that you can save a single file with
multiple distinct views of your molecule. Scene->Save new scene will save the present
scene under the name 001. This scene will then show up on the left-hand side of the
screen as a small vignette.
Clicking this image will restore the viewer and all molecule specific
settings to this exact scene (unless you have done something
drastic, like delete atoms). Clicking the empty square box with a +
below the current scene will add a new scene. It can be a good
idea to save work in progress as a scene – this will allow you to go
back if you make a mistake.
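Scenes and sessions can also be saved from the command line. A minimal sketch, assuming a hypothetical helper `bookmark_commands` that only builds the text; `scene` and `save` are PyMol commands, and the session file name is an example.

```python
def bookmark_commands(scene_name, session_file):
    """Commands to store a scene, save the session, and return to the scene."""
    return [
        f"scene {scene_name}, store",   # bookmark the current view and representations
        f"save {session_file}",         # a .pse session file keeps the scenes too
        f"scene {scene_name}, recall",  # go back to the bookmarked scene later
    ]


for line in bookmark_commands("001", "myoglobin.pse"):
    print(line)
```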
Note – the scenes take up a lot of screen space. You can toggle the views on and off
using the scenes button in the upper menu:
Undo and redo: Pymol has an undo button, located in the upper menu (above the
viewer).
The button to the immediate right is redo. Note that there appear to be limits on which
actions you can undo; undo can also be automatically disabled if memory becomes
limited (which can happen if lots of molecules are loaded). In general, saving scenes (see
above) can be a more reliable way to get back to an earlier view.
Pymol .pse files representing structures discussed in the course will be made available
with scenes that illustrate salient points saved; students are encouraged to use these to
better understand the proteins presented in the lectures.
Exporting images
While pymol sessions are interactive, for most contexts you want to be able to export
static images compatible with other programs. The image presently displayed in the
view can be saved as a png file. In the menu bar above the viewer GUI, select the
Draw/Ray icon:
This opens an interactive menu where you can control the image size and resolution, and
have the option to use ray tracing.
Often you want to select a subset of all the residues, and manipulate them all in a
consistent way; for example, you might want to change all of the residues that are
involved in ligand binding to sticks representation, and color the carbon atoms yellow
instead of green. PyMol provides you with multiple ways of selecting and working with
subsets of atoms and residues.
Through the viewer: Right clicking and holding on an individual atom in the viewer will
call up a hierarchical drop-down menu that allows you to alter specific atom subsets.
Highlighting one of these choices gives a new submenu that includes color, show, hide,
preset, label etc. As these labels suggest, they give you access in a slightly rearranged
form, to many of the same options that you have under the main object menu to the
right of the viewer.
One key option here is “select”. A selection is a virtual object, by default called “sele” in
the Viewer GUI. You can make a selection through the drop down menu accessed after right
clicking, but simply left clicking on a visible atom in the Viewer will select that residue
(by default). Note that all currently selected atoms are highlighted by small pink
boxes. Clicking additional residues will add them to the selection. Left clicking an
already highlighted residue will remove it from the selection. Shift-left clicking and
dragging will draw a box; when you release, all residues that were in that box will be added
to the selection; shift-middle mouse clicking and dragging will produce a similar box that
removes atoms from the selection.
This selection is a virtual object that can be altered in standard ways using the
associated menus in the viewer GUI (e.g. changing the colour, representation), but these
changes will apply only to the selected residues. One can also copy these residues to a
new object.
Selections are not permanent. If you left-click the background, the selection will be
hidden from view. Left clicking an atom with the selection hidden will then reset the
selection. Note that if you need a persistent selection that is not overwritten, rename
the selection (actions->rename). This can be very useful if you want, say, one set of
catalytic residues and one set of binding residues to which you apply different changes,
and which you may want to revisit later.
Turning on the sequence view will open a new panel showing the sequences of all currently
visible objects at the top of the viewer:
Note that the colours for each residue reflect the colour in the structure (here currently
chainbow). Under the main menu Display->Sequence performs the same function;
Display->sequence mode gives access to different options for how the sequence is
displayed. The sequence view is fully interactive, and tied to the structural
representation in the viewer window. Left mouse clicking on an amino acid within the
sequence will select that residue. Left click drag will allow you to select a series of
consecutive residues. Middle clicking on an amino acid in the sequence will re-center
the view on that residue. Right mouse clicking a residue in the sequence will activate the
same menu as you get by right clicking on the residue in the Viewer. Selections from the
viewer and sequence are treated as equivalent and can be combined. Note that co-
factors and ligands are typically found at the end of each protein chain in the sequence
view (look for heme – it is called HEM).
The sequence view is especially useful if you are trying to follow a discussion in a paper;
for myoglobin, His93 is the fifth iron ligand, while His64 makes a hydrogen bond to the
bound oxygen atom. To highlight these residues, locate them in the sequence view and
right click, then navigate to show sticks. These residues should then show up in the
viewer.
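The same residues can be picked by number from the command line. A minimal sketch, assuming a hypothetical helper `residue_selection` that only builds the text and an arbitrary selection name `key_his`; `select` and `show` are PyMol commands, and `resi 64+93` is PyMol's syntax for a list of residue numbers.

```python
def residue_selection(name, residue_numbers, obj):
    """Commands to name a selection of residues and show it as sticks."""
    resi = "+".join(str(n) for n in residue_numbers)  # e.g. "64+93"
    return [
        f"select {name}, {obj} and resi {resi}",
        f"show sticks, {name}",
    ]


for line in residue_selection("key_his", [64, 93], "1A6M"):
    print(line)
```

Giving the selection its own name, rather than leaving it as (sele), means it will not be overwritten by your next click.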
While residues are the default “unit” of selection, you can also select the structure at
other levels of grouping – e.g. objects, chains or atoms. You can select a different
selection mode using the “Residues” option in the top menu:
In the appropriate mode, one can select the whole chain using a single click. Similarly,
sequence selections have modes. Sequence mode->chain identifiers in particular allows
one to select whole chains with a single click.
Altering selections using simple logical rules: While it is nice to be able to select
residues individually, often we want to be able to efficiently select and manipulate a set
of residues that have a logical relationship to each other. The actions menu gives you a
variety of tools under the “modify” option that allows you to alter your selections.
For example, suppose we want to represent only those residues that contact the ligand
as sticks. Through the sequence view, select the heme group (it’s called HEM, and is
located after the amino acids in the sequence). Then left-click
“action->modify->around->residues within 4 A”. This will change the selection to all
residues within 4 Å of the originally selected residue (i.e. within van der Waals contact
range). Then Show->sticks within (sele) will show sticks for these
nearby residues.
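To make clear what “residues within 4 Å” means, here is a small stand-alone sketch of the idea: a residue counts as a neighbour if any one of its atoms lies within the cutoff of any ligand atom. The coordinates and residue labels are made up for illustration (they are not taken from 1A6M), and `residues_within` is a hypothetical function, not PyMol's implementation. In PyMol itself the equivalent typed selection would be `select contacts, byres (resn HEM around 4)`.

```python
import math


def residues_within(atoms, ligand, cutoff=4.0):
    """atoms: list of (residue_label, (x, y, z)) in Angstroms.

    Returns the residues (other than the ligand) that have at least one
    atom within `cutoff` of any ligand atom.
    """
    ligand_xyz = [xyz for res, xyz in atoms if res == ligand]
    hits = set()
    for res, xyz in atoms:
        if res == ligand:
            continue  # the ligand itself is not part of the result
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_xyz):
            hits.add(res)  # one close atom is enough to include the whole residue
    return hits


# Made-up coordinates, for illustration only.
toy_atoms = [
    ("HEM", (0.0, 0.0, 0.0)),
    ("HEM", (1.5, 0.0, 0.0)),
    ("res_near", (0.0, 2.1, 0.0)),
    ("res_near", (0.0, 6.0, 0.0)),   # far atom of a residue that is still included
    ("res_edge", (5.0, 0.0, 0.0)),   # 3.5 A from the second heme atom
    ("res_far", (12.0, 0.0, 0.0)),
]
print(sorted(residues_within(toy_atoms, "HEM")))
```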
Similar to the “around” option is the “extend” option, which goes through bonds rather
than through space.
“Actions->modify->invert->invert within object” (or invert within chain) selects
everything but that which was originally selected (useful if you want to, say,
change everything but the ligand).
“Actions->modify->complete->” gives several options for completing the selection by, for
example, extending it to every residue where you have at least one atom
selected, or to complete chains.
“Actions->modify->include->” allows you to expand your selection to include the atoms
in another pre-existing selection or object.
“Actions->modify->restrict->” gives options that allow you to restrict the selection; this is
useful if you have more than one pdb file open, and you want your selection limited to
only one object.
“Actions->rename selection” allows you to rename your selection, allowing you to start
a new selection under the (sele) name.
“Actions->create object” allows you to create a new object from your selection.
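The modify options above behave like ordinary set operations on atoms. The sketch below uses plain Python sets with toy atom numbers (illustrative only, not from a real file) to show the logic; in PyMol's own selection language the corresponding operators are `not`, `or`, `and` and `byres`.

```python
# Toy atom IDs standing in for PyMol atoms (illustrative only).
object_atoms = {1, 2, 3, 4, 5, 6}   # every atom in the object
sele = {2, 3}                       # the current selection
other = {3, 4}                      # another named selection

# Which residue each toy atom belongs to.
residue_of = {1: "A", 2: "A", 3: "B", 4: "B", 5: "C", 6: "C"}

inverted = object_atoms - sele      # invert within object
included = sele | other             # include: add the atoms of another selection
restricted = sele & other           # restrict: keep only atoms also in the other selection


def complete_residues(selection):
    """Extend a selection to every atom of any residue it touches."""
    wanted = {residue_of[a] for a in selection}
    return {a for a in object_atoms if residue_of[a] in wanted}


print(sorted(inverted), sorted(included), sorted(restricted))
print(sorted(complete_residues({2})))
```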
Polar contacts and measurements: Pymol has an algorithm for checking “polar
contacts” – points where two polar atoms (N or O) approach each other closely and
might form a hydrogen bond. Use Actions->Find->polar contacts-> and choose the
appropriate subset of atoms. E.g. “Actions->Find->polar contacts->just intra main-chain
atoms” will find all candidate hydrogen bonds within the backbone. Note that these
“polar contacts” are not necessarily hydrogen bonds, just places where pairs of polar
atoms approach closely; Pymol does not look at the angles involved, or assess whether a
suitable donor and acceptor is present. Polar contacts more closely match real hydrogen
bonds if PyMol has hydrogen atoms to work with. However, many structures in the pdb,
including 1A6M, lack these atoms. To add hydrogen atoms, use
“Actions->hydrogens->add”. For more accurate H-bonds, you will want to add them
individually using the distance measure tool.
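Adding hydrogens and finding polar contacts can also be typed. A minimal sketch, assuming a hypothetical helper `polar_contact_commands` and an arbitrary object name `polar` for the dashes; `h_add`, `distance` (with `mode=2` for polar contacts) and `hide` are PyMol commands. This searches the whole object against itself, which is broader than the “just intra main-chain” menu entry.

```python
def polar_contact_commands(obj, name="polar"):
    """Commands to add hydrogens and draw polar contacts within one object."""
    return [
        f"h_add {obj}",                            # add hydrogens to the object
        f"distance {name}, {obj}, {obj}, mode=2",  # mode=2 draws polar contacts as dashes
        f"hide labels, {name}",                    # keep the dashes, drop the distance labels
    ]


for line in polar_contact_commands("1A6M"):
    print(line)
```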
Hydrogen bonds have fairly stringent geometry requirements. We can measure
geometric properties of the structure using the wizard->measurement option.
The little wand icon pops out the “wizard” menu, which is essentially a collection of
tools. A new menu will come up in the viewer GUI.
Once this is selected, if we left click on two atoms, a new measurement object will be
created, with a dashed line joining the two atoms, and with the distance displayed. Here
I used this tool to measure the distance between the heme iron and His93. Because
measures are objects in the menu, you can change their color or hide the label/dash.
These dashed lines are how most people show hydrogen bonds.
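To give a feel for what “geometry requirements” means, here is a stand-alone sketch that checks a distance and an angle from three coordinates. The thresholds (donor to acceptor at most 3.5 Å, donor-H-acceptor angle at least 120°) are a commonly quoted rule of thumb and are an assumption here; they are not taken from this tutorial and are not PyMol's internal criteria. The coordinates are made up.

```python
import math


def angle_deg(a, b, c):
    """Angle at point b (in degrees) formed by the points a-b-c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return math.degrees(math.acos(cos_theta))


def looks_like_hbond(donor, hydrogen, acceptor, max_dist=3.5, min_angle=120.0):
    """Assumed rule of thumb: short donor-acceptor distance and a fairly straight D-H...A angle."""
    close_enough = math.dist(donor, acceptor) <= max_dist
    straight_enough = angle_deg(donor, hydrogen, acceptor) >= min_angle
    return close_enough and straight_enough


# Made-up coordinates in Angstroms: a straight contact and a sharply bent one.
print(looks_like_hbond((0, 0, 0), (1, 0, 0), (2.9, 0, 0)))
print(looks_like_hbond((0, 0, 0), (1, 0, 0), (1.0, 2.0, 0)))
```

The second contact is close enough in distance but fails on the angle, which is exactly the kind of case that a distance-only “polar contact” search would still report.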
The other settings in this menu I tend to leave as defaults. You can try toggling them to
see what they do or look them up online.
The edit all option (red) opens up a window that allows all
of the hundreds of hidden settings in Pymol to be seen and
edited. These all affect the final appearance of objects, and
it is fun to play with these (use the pymol wiki as an intro to
these features, though many are not yet fully publicly
documented.) The menu short cuts change these settings
(e.g. the global surface transparency is controlled by the
value for “transparency”, which through this menu, can be
edited to values other than those preset in the drop down
menu). This menu can allow almost every aspect of how
molecules are represented in Pymol to be controlled.
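Every entry in the settings window can also be changed with the `set` command. A minimal sketch, assuming a hypothetical helper `set_command` that only builds the text; `set` is a PyMol command and `transparency` is the setting named above. The value 0.35 is just an example of a value that is not in the drop down menu.

```python
def set_command(setting, value, selection=None):
    """Build a PyMol `set` command; with a selection the setting applies only there."""
    parts = [f"set {setting}", str(value)]
    if selection is not None:
        parts.append(selection)
    return ", ".join(parts)


print(set_command("transparency", 0.35))          # global surface transparency
print(set_command("transparency", 0.6, "sele"))   # only for the selected atoms
```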
These options control the way cartoons are represented. Try toggling them to get a
sense of what they do. I will often enable fancy helices and highlight colours in
publication quality figures. Turning off flat sheets ensures that the Cartoon goes through
the C alpha position, so the cartoon connects with the side chains – useful if you are
showing some side chain elements with the cartoon. Cylindrical helices simplify the
cartoon, and are sometimes used for very large structures.
Making transparent representations: the solid object types within PyMol can be made
partially (or wholly) transparent; the menus to do this are under Setting->Transparency,
with Cartoon, Surface, Stick and Spheres being modifiable. The most common usage of
this is to overlay a transparent surface over a cartoon/stick representation to give a
sense of the shape and volume of the protein while still showing other details. Often a
uniformly coloured transparent white surface (using a duplicate object) overlain on
coloured cartoon/stick representations is used.
Take 1A6M, and create a duplicate object (Actions->duplicate object). Colour this new
object white and show “as surface”. Colour 1A6M by secondary structure, show as
cartoon; show the heme group as sticks, and colour by element with carbon yellow.
Setting->transparency->surface->0.4.
This should give you something like the figure below.
You can also avoid replicating objects by using setting->surface->color to white. Surface
colour defaults to atom colours.
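The duplicate-object recipe above can be written as a short list of typed commands. A minimal sketch, assuming a hypothetical helper `transparent_surface_commands` and a made-up name (`<object>_surf`) for the duplicate; `create`, `color`, `as`, `show`, `set`, `util.cbss` and `util.cbay` are PyMol commands. The secondary structure colours chosen by `util.cbss` may differ from those the menu uses.

```python
def transparent_surface_commands(obj, ligand="resn HEM", transparency=0.4):
    """Commands for a white transparent surface over a coloured cartoon with ligand sticks."""
    copy = f"{obj}_surf"  # hypothetical name for the duplicate object
    return [
        f"create {copy}, {obj}",               # duplicate the object
        f"color white, {copy}",
        f"as surface, {copy}",                 # show the duplicate only as a surface
        f"util.cbss {obj}",                    # colour the original by secondary structure
        f"as cartoon, {obj}",
        f"show sticks, {obj} and {ligand}",    # heme as sticks
        f"util.cbay {obj} and {ligand}",       # by element, carbons yellow
        f"set transparency, {transparency}",   # surface transparency
    ]


for line in transparent_surface_commands("1A6M"):
    print(line)
```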
Selective transparency: a useful variation on this is to use the command line to modify
transparency:
set transparency, 0.6, sele
This makes only the selected atoms transparent, and can be used to create a
transparent window in an otherwise opaque surface. Setting->transparency->angle
dependent makes surfaces less transparent at acute angles.
The rendering submenu allows shadows and other aspects of the rendering to be
changed:
Note that shadows only show up when you ray-trace. Shadows add visual drama, but
also add visual clutter, and can obscure scientifically important details. I tend to turn
shadows off for making figures.
Note: ambient occlusion darkens deep cavities, and is very useful for highlighting the
topography of surfaces. E.g.:
Much cleaner. In general, hide main chain atoms if you are showing the cartoon and
these atoms do not directly contribute to the interactions of interest.
Electrostatic surfaces: these are surface representations of the molecule that are
colored red to blue from the most negatively charged to the most positively charged region.
Pymol’s algorithm ignores the dielectric constants of materials, so the results are only
qualitatively useful. Actions->generate->electrostatic surface. Try this with the
myoglobin structure. You should get something like this.
Electrostatic surfaces can be made transparent if desired.
A more accurate calculation is provided by the Adaptive Poisson-Boltzmann Solver
(APBS). This algorithm is available in the plugin menu. In the Plugins menu, select the
“APBS Electrostatics” plugin. Select 1A6M, then run with default settings.
Now you have an object downloaded that contains a second “state”, which is the other
half of the molecule. In order to make this half accessible, we need to split the single
object into its two component states. To do this we need Action->state->split.
This gives us two new objects, each of which contains two of the four chains within the
hemoglobin structure.
Ball and stick
Ball and stick is an alternative way to show sticks. To activate, go Setting->line and
stick->Ball and Stick.
To edit ball and stick, scroll down to the entry stick_ball. Highlight the “off” in the
second column and change it to “on”. Then, in the next entry, change “stick_ball_ratio”
to 1.5. The sticks should now show up as ball and sticks.
We will need to recombine these two new objects into one, but before doing so, we
need to ensure that atoms within the object can be distinguished (otherwise they will
tend to overwrite one another in the computer memory). An easy way to do this is to
give objects a different “segid” value. Note if you had more states, you would need to
assign each its own segid. This requires the command line:
This adds a segment identifying tag to copy #2 of the structure. Now we can recombine
these two dimers back into a tetramer. Drag a box over them to select them. Then use
sele->modify->complete->objects to ensure you have all atoms. Finally, use copy to
object->new to make a new object that behaves as a single object. Note that the surface
is now continuous.
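One possible typed version of the split, tag and recombine steps is sketched below. This is an illustration under stated assumptions, not the tutorial's own command: the object name `hemoglobin` is a placeholder, the `_0001`/`_0002` suffixes assume the default naming used by `split_states`, and the segid value `B` and the new object name `combined` are arbitrary. `split_states`, `alter` and `create` are PyMol commands, and `segi` is the segment identifier property.

```python
def recombine_commands(obj, new_name="combined"):
    """Commands to split a two-state object, tag the second copy, and merge both into one object."""
    # Assumed default names given to the two states by split_states.
    first, second = f"{obj}_0001", f"{obj}_0002"
    return [
        f"split_states {obj}",
        f'alter {second}, segi="B"',                # give copy #2 its own segment identifier
        f"create {new_name}, {first} or {second}",  # one object containing both copies
    ]


for line in recombine_commands("hemoglobin"):
    print(line)
```

With more than two states, each extra copy would need its own `alter` line with a different segid.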
Showing subsets of residues as a surface: Suppose we wanted to show how one A chain
of oxy-hemoglobin, shown as a cartoon, interacts with the rest of the molecule, shown
as a surface. This seems straightforward enough. You start by showing everything in the
combined oligomer as a cartoon, select the residues that correspond to the bits that we
want to be a surface, and do show->surface. Set surface colour to white. This results in
something like this:
This representation is not what we wanted – instead of showing a surface around all of
the selected atoms, we instead see an ugly hole where the alpha chain interacts with
the rest of the molecule. This is because Pymol thinks surface only exists where the
protein meets the outside environment; any residues that interact with other residues
are by definition not part of the surface, even if we are currently not showing those atoms.
To get a surface around the rest of the object, we need to create a new object that
includes everything we want to include in the surface, and nothing else. Select the
relevant residues (the easiest way to do this is through the sequence view, with chains
shown), go sele->action->copy to object, and then show the
surface around this new object. This gives:
An example of using the command line to do something not easily done through menus,
colouring carbon atoms by residue type. Cut and paste this into the command line
window:
The Pymol wiki has details on how to use the command line interface (along with lots of
extensions and scripts) for those who are interested.
Type help command_name in the command line to get an explanation, and details of
the syntax. E.g. typing “help show” in the command line gets a summary of how the
“show” command works in the command line. Most commands are documented in the
Pymol wiki.
Running scripts:
Preset sets of instructions can be saved as scripts, and run to modify objects within
pymol. Two simple scripts have been uploaded to Courselink (color_by_hydrophobicity
and color_by_residue_type). Download these scripts and execute them by file->run and
then navigating to the relevant file.
Tips for publication quality figures:
Use a white background.
Make sure you have a clear idea what information you want to convey with a figure
before you make it. Then show the necessary details. Be careful not to clutter up the
image with excessive details (e.g. detailed representations of residues you never
discuss, showing backbone atoms when only side chains are relevant). Use colour to
delineate details. Pay attention to the colour palette – e.g. use complementary colours,
all bright colours, or all pastels, etc.
Make surfaces from an object with hydrogens added. Hydrogens contribute to the bulk
of the surface; surfaces rendered without hydrogens added are misleading.
Add labels for residues in an external program, not in pymol (as you can’t control where
the labels end up). Word or PowerPoint will work, but the control is not great, and these
programs cannot output publication quality images. Inkscape ([Link]) is a
powerful vector graphics program, similar to Adobe Illustrator, that is available for free.
Do not resample images if you need to resize; rather redo the image at the desired
number of pixels (you saved the scene, right?). Especially never, ever, ever, ever
“squish” the figure horizontally or vertically to fit. This distorts the image so that it is no
longer an accurate representation of the structure.
Pay attention to the resolution of the image. Publications generally want images at
300 dpi; anything less will look blocky. You need to get this resolution when making the
image; resizing the image afterwards will produce artifacts such as blurred lines, and will
give a less than ideal figure.
Suppose you want a figure at 3 inches (76 mm, standard 1 column figure) at 300 dpi.
Then you should be ray tracing an image of at least 900 pixels width (and ideally
considerably more than this, in case you later decide to resize). This may end up being
more pixels than many computer monitors can display, so you will not be able to drag the
window large enough. After composing your image and framing it (placing the molecule to
remove any excess white space), make the png image at a sufficient dpi.
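The pixel arithmetic is simply size in inches multiplied by dpi. The sketch below works it out and builds a matching `png` command; `pixels_needed` and `png_command` are hypothetical helpers and the file name is an example, while `png` with its `width`, `dpi` and `ray` arguments is a PyMol command. Because the size is given explicitly, the image does not need to fit on your monitor.

```python
import math


def pixels_needed(size_inches, dpi=300):
    """Number of pixels needed to print `size_inches` at `dpi`, rounded up."""
    return math.ceil(size_inches * dpi)


def png_command(filename, width_inches, dpi=300):
    """Build a PyMol `png` command for a ray-traced image of the given printed width."""
    width_px = pixels_needed(width_inches, dpi)
    # ray=1 ray-traces the image at the requested size before saving it.
    return f"png {filename}, width={width_px}, dpi={dpi}, ray=1"


print(pixels_needed(3))                  # one-column figure, 3 inches at 300 dpi
print(png_command("myoglobin.png", 3))
```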
Always ray trace any image that will be shared with others.
The Pymol wiki has details on every function of pymol. The gallery has instructions to make
some very cool looking figures. The plugin gallery has a wide variety of additional
functions.
Movies
Pymol can also make movies. Essentially you can save a series of scenes as states, and
pymol will interpolate between them. You can also add rotations or rocking at individual
scenes. The file is produced when you save a movie file (file->export->movie as…MPEG).
Ray tracing a movie takes a while - I would recommend trying to just draw as a first pass
until you are happy with it.