Chris Moth asked why ChimeraX does not use Biopython. Biopython is a great package for non-interactive calculations on molecular structures, but ChimeraX is all about interactive analysis, where speed is essential to usability.
I timed opening the 2-million-atom mmCIF file for PDB entry 3j3q and getting its atom coordinates. ChimeraX was about 16 times faster reading the mmCIF file, used roughly a third of the memory, made a list of atoms 100 times faster, and got coordinates more than 200 times faster.
|software||read mmCIF||memory used||atom list||coordinates|
|Biopython 1.78||131 sec||6.2 GB||0.4 sec||9.5 sec|
|ChimeraX 1.1||8 sec||1.7 GB||0.004 sec||0.04 sec|
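As a quick check on the ratios quoted above, here is a minimal sketch that recomputes them from the table values. The dictionary keys and the `ratios` helper are names made up for this illustration; only the numbers come from the table.

```python
# Values copied from the table above (seconds, except memory in GB).
biopython = {
    "read mmCIF": 131,
    "memory used": 6.2,
    "atom list": 0.4,
    "coordinates": 9.5,
}
chimerax = {
    "read mmCIF": 8,
    "memory used": 1.7,
    "atom list": 0.004,
    "coordinates": 0.04,
}

def ratios(slow, fast):
    """Return slow/fast for every measurement present in both tables."""
    return {name: slow[name] / fast[name] for name in slow}

speedups = ratios(biopython, chimerax)
for name, factor in speedups.items():
    print(f"{name}: Biopython / ChimeraX = {factor:.1f}")
```

These are single measurements on one machine, so the ratios are best read as rough orders of magnitude rather than precise figures.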
Here is the command I used to time ChimeraX opening the structure:
time open 3j3q
and the code I used to time accessing atoms and coordinates from the ChimeraX Python shell (menu Tools / General / Shell):
# First open model, assumed to be the 3j3q structure opened above
s = session.models[0]
from time import time
t0 = time() ; atoms = s.atoms ; t1 = time() ; print('atoms', t1-t0)
t0 = time() ; xyz = atoms.coords ; t1 = time() ; print('coords', t1-t0)
And here is the code I used to time Biopython:
from Bio import PDB
parser = PDB.MMCIFParser()
from time import time
# Read the mmCIF file
t0 = time() ; s = parser.get_structure('3j3q', '/Users/goddard/Downloads/ChimeraX/PDB/3j3q.cif') ; t1 = time() ; print('read mmcif', t1-t0)
# Make a list of all atoms
t0 = time() ; a = tuple(s.get_atoms()) ; t1 = time() ; print('atoms', t1-t0)
# Get the coordinates of every atom
t0 = time() ; c = [a0.get_vector() for a0 in a] ; t1 = time() ; print('coords', t1-t0)
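The repeated `t0 = time() ; ... ; t1 = time()` pattern could also be wrapped in a small helper. The sketch below is one possible way to do that and is not the code used for the timings above: `timed` and `results` are hypothetical names, and it uses `time.perf_counter` in place of `time.time`. The workload is a stand-in so the example runs without ChimeraX or Biopython.

```python
from contextlib import contextmanager
from time import perf_counter

@contextmanager
def timed(label, results=None):
    """Print the wall-clock time of the enclosed block and optionally record it."""
    t0 = perf_counter()
    try:
        yield
    finally:
        elapsed = perf_counter() - t0
        if results is not None:
            results[label] = elapsed
        print(label, elapsed)

# Stand-in workload; replace the body with the call being timed.
results = {}
with timed('build list', results):
    values = [i * i for i in range(100000)]
```

In the ChimeraX shell the same helper would wrap a statement such as `atoms = s.atoms`, and in the Biopython script a statement such as `a = tuple(s.get_atoms())`.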
Tom Goddard, September 15, 2020