Monday, September 22, 2014

Replacing new line with something else using /sed/

sed ':a;N;$!ba;s/\n/ /g'

From here http://stackoverflow.com/questions/1251999/sed-how-can-i-replace-a-newline-n

Sunday, June 8, 2014

Uncurated notes on Gromacs GPU installation

This file is in /scratch [flute]

GPU installation on Spear was successful with this cmake

cmake .. -DGMX_GPU=ON -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda -DGMX_BUILD_OWN_FFTW=ON

[Performance]
12102 atoms CG

Using 2 MPI threads
Using 6 OpenMP threads per tMPI thread
Detecting CPU-specific acceleration.
Present hardware specification:
Vendor: AuthenticAMD
Brand:  AMD Opteron(tm) Processor 4184
Family: 16  Model:  8  Stepping:  1
Features: apic clfsh cmov cx8 cx16 htt lahf_lm misalignsse mmx msr nonstop_tsc pdpe1gb popcnt pse rdtscp sse2 sse3 sse4a
Acceleration most likely to fit this hardware: SSE2
Acceleration selected at GROMACS compile time: SSE2


2 GPUs detected:
  #0: NVIDIA Tesla M2050, compute cap.: 2.0, ECC: yes, stat: compatible
  #1: NVIDIA Tesla M2050, compute cap.: 2.0, ECC: yes, stat: compatible

2 GPUs auto-selected for this run.
Mapping of GPUs to the 2 PP ranks in this node: #0, #1

Cut-off's:   NS: 1.441   Coulomb: 1.2   LJ: 1.2
               Core t (s)   Wall t (s)        (%)
       Time:    48384.100     4035.993     1198.8
                         1h07:15
                 (ns/day)    (hour/ns)
Performance:     2140.737        0.011 <----------- font="">
Finished mdrun on node 0 Thu Mar 20 18:40:36 2014

------------ [CPU Edison install] ----------------
git clone git://git.gromacs.org/gromacs.git
module purge
module load intel
module load cmake
module load openmpi-ccm
module load fftw
mkdir 1049pm-mar24-build
export CCDIR=/opt/intel/composer_xe_2013_sp1.0.080/bin/intel64/
export MPICCDIR=/usr/common/usg/openmpi-ccm/1.6.5/intel/bin/
export FFTW_LOCATION=/opt/fftw/3.3.0.4/sandybridge/
export CXX=mpicxx
export CC=mpicc
mkdir ~/gromacs/1049pm-mar24-build/
cmake ../   -DFFTW3F_INCLUDE_DIR=$FFTW_LOCATION/include       -DFFTW3F_LIBRARIES=$FFTW_LOCATION/lib/libfftw3f.a       -DCMAKE_INSTALL_PREFIX=/global/homes/p/pavang/gromacs/1049pm-mar24-build/       -DGMX_X11=OFF       -DCMAKE_CXX_COMPILER=${MPICCDIR}/mpicxx       -DCMAKE_C_COMPILER=${MPICCDIR}/mpicc       -DGMX_MPI=ON       -DGMX_PREFER_STATIC_LIBS=ON -DFFTWF_LIBRARY=$FFTW_LOCATION/lib/libfftw3f.a -DFFTWF_INCLUDE_DIR=$FFTW_LOCATION/include

make
make install
source /global/homes/p/pavang/gromacs/1049pm-mar24-build/bin/GMXRC.bash 
mdrun_mpi -v -deffnm  em

[Performance]
38315 atoms Lysozyme in water

On 1 MPI rank, each using 32 OpenMP threads
               Core t (s)   Wall t (s)        (%)
       Time:     7596.371      252.435     3009.2
                 (ns/day)    (hour/ns)
Performance:       11.802        2.034 <-------------- font="">

Friday, March 28, 2014

Crystal Symmetry Operations

Pymol has a neat command to generate neighbor copies of proteins downloaded from PDB using their unit cell parameters.
 symexp sym = 401D, all, 3 

http://chemistry.osu.edu/~foster.281/biochem766/download/PDB/xray_tut_766.html

Saturday, March 1, 2014

Handling topologies of non-standard molecules in Gromacs

I have a file containing a protein chain and a non-standard ligand in a file [centered.pdb]. To obtain .top file
create a directory in the workspace and call it forcefield.ff.

export GMXLIB=$GMXLIB:$PWD/

The contents of that directory are:


ls -1 forcefield.ff/
aminoacids.c.tdb
aminoacids.hdb
aminoacids.n.tdb
aminoacids.r2b
aminoacids.rtp
aminoacids.vsd
atomtypes.atp
b12rtp
elements.dat
ffbonded.itp
ff_dum.itp
ffG53a6.itp
ffnonbonded.itp
forcefield.doc
forcefield.itp
gromos53a6.ff
ions.itp
old
org-aminoacidsrtp
spce.itp
spc.itp
tip3p.itp
tip4p.itp
watermodels.dat

The file aminoacids.itp has these lines added at the end:

[ Ligand ]
[ atoms ]
C27     C       0.38000 0 ; from        GLN     residue
O28     O       -0.38000 0 ; from       GLN     residue
N29     NT      -0.83000 0 ; from       GLN     residue
..

[ bonds ]
; ai aj gromos type
C26     C27 gb_27 ; from        GLN     fragment 0
C27     O28 gb_5 ; from GLN
C27     N29 gb_9 ; from GLN
...

[ dihedrals ]
; ai aj ak al gromos type
C3R     O2      P       O3 gd_20 ; from ADE     phosphate definition
C3R     O2      P       O3 gd_27 ; from ADE     phosphate definition
C2P     O3      P       O2 gd_20 ; from ADE     phosphate definition...





Then...

 pdb2gmx_sp -f centered.pdb  -o centered.pg.gro
...


Select the Force Field:
From current directory:
 1: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)
From '/usr/common/usg/gromacs/4.6.1-sp/share/gromacs/top':
 2: AMBER03 protein, nucleic AMBER94 (Duan et al.,.. 24, 1999-2012, 2003)
 3: AMBER94 force field (Cornell et al., JACS 117, 5179-5197, 1995)
...

16: [DEPRECATED] Encad all-atom force field, using full solvent charges
17: [DEPRECATED] Encad all-atom force field, using scaled-down vacuum charges
18: [DEPRECATED] Gromacs force field (see manual)
19: [DEPRECATED] Gromacs force field with hydrogens for NMR

Select 1 which is the forcefield contained in the directory forcefield.ff. This should run w/o errors and give a topol.top file. W/o editing topol.top running this code will give the following error;

grompp_sp -f min.mdp -c centered.pg.gro -p topol.top -o em.tpr
...

-------------------------------------------------------
Program grompp_sp, VERSION 4.6.1
Source code file: /global/homes/j/jdeslip/Builds/Hopper/gromacs-4.6.1/src/kernel/topio.c, line: 734

Fatal error:
Syntax error - File ffnonbonded.itp, line 1
Last line read:
'[ atomtypes ]'
Invalid order for directive atomtypes
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------

So, hide the first line and add the default gromos53a6.ff

;#include "./forcefield.ff/forcefield.itp"
#include "gromos53a6.ff/forcefield.itp"

That should let grompp run smoothly.


Sunday, February 9, 2014

Remove white background in multiple images

convert file-with-white-bg.png -transparent white file-with-transparent-bg.png

Saturday, February 1, 2014

Ubuntu apt-get 'package has no installation candidate' error

I was trying to install a package using apt-get and realized that apt-get wouldn't install that or any other package. I found various suggestions to fix this problem:

Problem

$ sudo apt-get install scons
Reading package lists... Done
Building dependency tree      
Reading state information... Done
Package scons is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source

E: Package 'scons' has no installation candidate

Solution:

$ sudo apt-get update
$ sudo apt-get install build-essential
$ sudo dpkg --configure -a
$ sudo apt-get -f install

None of those worked in my case. I then went to Ubuntu Software Center>Edit> Software Sources and made these changes.


After that the following commands did the job
$ sudo apt-get update
$ sudo apt-get install build-essential
$ sudo apt-get install scons

Saturday, January 25, 2014

Changing desktop background and display themes

Accomplishing that is trivial in any OS but it is a good idea to do it once in a while to break the monotony and keep things fresh.

Tuesday, January 21, 2014

Calculating Rg using VMD

I am working on a system with 60 copies of a molecule which has 1287 atoms in it. The script below loads the trajectory and calculates the radius of gyration of each copy.

mol new system.gro

for {set num 0} {$num < 201} {incr num} {
mol addfile cen_md_$num.xtc waitfor all
}

for {set i 1} {$i <61} {incr i} {
set start [expr 1 + 1287 * ($i-1) ]
set end [expr 1287 * $i]
puts "$start $end"
set sel [atomselect top "serial $start to $end"]
set filename [open "rg-dextran-number-$i.txt" w]
puts "Number of frames is [molinfo top get numframes]";
for {set frame 0} {$frame < [molinfo top get numframes]} {incr frame} {
animate goto $frame
$sel update
puts $filename [format "%8.3f" [measure rgyr $sel]]
}
close $filename
}

quit

Weighted Histogram Analysis Method - WHAM

WHAM is used extensively to calculate the free energy of a system along a properly chosen reaction coordinate in molecular dynamics simulations or monte carlo simulations. To be able to use WHAM you need to perform multiple simulations that cover the entire reaction coordinate to sample possible configurations. The choice of reaction coordinate constitutes a length discussion and is a topic of ongoing debate. So I encourage the reader to look it up elsewhere. After you have chosen a reaction coordinate and performed Umbrella Sampling or Free Energy Perturbation calculations, histograms corresponding to every window need to be extracted.

For the simple case of dragging a methane molecule from water into a lipid bilayer and out into water again, z axis is a good reaction coordinate if the lipid bilayer is on the x-y plane. With this set up, the z axis is split into discrete windows and the center of mass (COM) of the methane molecule is constrained (usually harmonic) to the center of every window. The length of windows and force constant of the constraint are chose such that the distribution of the z coordinate of the COM of methane overlaps with the distribution of the adjacent windows. This is a necessary prerequisite for using WHAM. Following the figure below, for a 60A long reaction coordinate a choice of 121 windows (each 0.5A and including both ends) will be plenty to ensure overlap of z coordinate distributions.

Collect z coordinate distributions and put them 121 files labeled window-001 to window-121.
Create a new file input.in that contains the following lines;

./window-001 -30.0 xx
./window-002 -29.5 xx
...
./window-120 +29.5 xx
./window-121 +30.0 xx

Here the first column is the file name, second is the mean value of that window and the third column needs to be filled with an integer but is irrelevant and not used by Alan Grossfield's implmentation of WHAM. Then run the following command

./wham 1 11 10 0.001 300 0 input.in freefile

where the inputs are

./wham hist.min hist.max num.bins tol Temp numpad infile outfile

Figure Source: http://www.utdallas.edu/~son051000/comp/FreeE.pdf
























The output contains a plot for the potential of mean force (PMF). Pay particular attention to units since they depend on your input. WHAM's documentation can be found here [http://membrane.urmc.rochester.edu/sites/default/files/wham/doc.html].

Thursday, January 16, 2014

Ubuntu grub rescue

Following are my notes while I successfully recovered Ubuntu 12.04-LTS from grub recovery mode.


I am using a Macbook Pro and trying to create a Ubuntu Live USB.

Create a Live USB using instructions from here.
http://www.ubuntu.com/download/desktop/create-a-usb-stick-on-mac-osx

This tells you how to work through grub rescue.
http://askubuntu.com/questions/197833/recovering-from-grub-rescue-crash

This video offers additional help
http://www.youtube.com/watch?v=8lod8sRb_6I

On a side note, if you ever accidentally delete /boot directory simply type the line below - just like this guy. Of course you need the find the appropriate linux-image-x.x.xx-xx-server and you do need to catch this blunder before rebooting.

sudo apt-get install --reinstall linux-image-2.6.32-42-server

Sorting by columns in R

Sorting columns in R is a breeze. If d is a matrix with 2 columns and you want to sort d by column 1, use the command order as below.


Thursday, January 2, 2014

DNA simulation set-up with AMBER

I have been working on the structure and dynamics of single and double stranded DNA for some time now and below is a write up of an easy way to setup the simulations using the AMBER suite of software. Along with primary executables sander, pmemd, tleap and ptraj, AMBER comes with a number of other useful tools. One such tool is the nucleic acid builder (or nab). To create a 10mer poly-Adenosine poly-Thymidine, put the following text in a file called nuc.nab;

molecule m;
m = fd_helix( "abdna", "aaaaaaaaaa", "dna" );
putpdb( "nuc.pdb", m, "-wwpdb");

Then run these in the terminal

nab nuc.nab
./a.out

You should get a nuc.pdb file which contains the poly A-T double stranded DNA. Everything so far is available in more detail at http://ambermd.org/tutorials/basic/tutorial1/section2.htm

To get a single stranded poly-A DNA simply delete the lines corresponding to poly-T. To create stem-loop type configurations commonly found in RNA, do the following. Say you want the ssDNA

AAATAGAGCTAGTGAGGATTT

with the first 4 bp [AAAT] paired with the last 4 [TTTA or ATTT if you want to read from above]. For this I simply generate a double stranded DNA like so...

AAATAGAGCTAGTGAGGATTT
TTTATCTCGATCACTCCTAAA

Then delete the bases shown in red from the PDB file.

AAATAGAGCTAGTGAGGATTT
TTTATCTCGATCACTCCTAAA

Now you should have

AAATAGAGCTAGTGAGG
TTTA

This is the same as AAATAGAGCTAGTGAGGATTT but with the ends paired up. For the stem-loop conformation you should be able to use the parameter and topology files generated for the fully stretched single stranded DNA. You will have a long bond between  AAATAGAGCTAGTGAGG--and--ATTT. But this can be easily fixed with a quick minimization.