Wednesday, June 20, 2012

Subsetting in R

Following are some example of subsetting in R I have used recently.


> head(crystal)
    V1     V2      V3      V4
1 1ACJ PHE120 290.980 289.101
2 1ACJ PHE153 307.673 283.814
3 1ACJ PHE155 296.153  89.152
4 1ACJ PHE186 311.606 292.186
5 1ACJ PHE187 298.920 270.613
6 1ACJ PHE197  66.678 323.351

>subset(crystal, crystal$V2=="PHE330" & grepl("^2A",crystal$V1))

http://www.ats.ucla.edu/stat/r/faq/subset_R.htm
http://stackoverflow.com/questions/2125231/subsetting-in-r-using-or-condition-with-strings

Monday, June 11, 2012

Charmm2lammps

Charmm2lammps.pl is a useful little code which takes in forcefield [top/par], psf and pdb files in Charmm format and converts them to Lammps data and sample input file. This is extremely useful for running simulations in the much better written MD code, LAMMPS. While trying to simulate a polymer chain I came across a very interesting feature in charmm2lammps; it really expects the force constant in the forcefield parameter file to be non-zero. This works just fine when you never want to specify angle/dihedral parameters to free chains. But when you want to control the parameters in between runs, it is a good idea to define all the angle and dihedrals and then decide to turn them off/on in LAMMPS by setting the force constant to 0/nonzero.

Sunday, June 10, 2012

Compiling C programs on Ubuntu

On Ubuntu, it turns out, this does not work
gcc -lm code.c
but this does
gcc code.c -lm

Thursday, May 10, 2012

Running job on clusters continuously

Here is a way to run jobs on cluster in a continuous fashion. This example addresses NAMD jobs but should also work for others.

Tuesday, May 8, 2012

Labeling figures

http://www.imagemagick.org/Usage/annotating/
This website offers a  nice way to add labels to images that "convert" can handle. I thought it was a nice idea to have my name on figures I generate.

Friday, April 27, 2012

Public folder on Dropbox

http://dl.dropbox.com/u/65895350/

Thursday, April 26, 2012

Extracting coordinates of protein from a simulation

set dir [pwd]
mol new     ${dir}/ache.psf       type {psf}
mol addfile ${dir}/ache.pdb       type {pdb} waitfor all
mol addfile ${dir}/acheminnpt.dcd type {dcd} waitfor all

set molnum 0
set skip 1
set backframe0 [atomselect $molnum backbone frame 0]

for {set i 0} {$i < [molinfo $molnum get numframes]} {incr i $skip} {
set pro  [atomselect $molnum protein  frame $i]

# Change 'protein' to "resid N to Z" to suit your needs.
set back [atomselect $molnum backbone frame $i]
set trans_mat [measure fit $back $backframe0]
$pro move $trans_mat
set k [expr $i/$skip]
$pro writedcd ${dir}/dcd/project-pdb-$k.dcd
}

Now generate a protein-only psf file. At this stage you should have n number of dcd files. To join them all in to one dcd file,

mol new    ${dir}/ache_protein.psf
set pro_frames [expr [molinfo $molnum get numframes]/$skip ]
for {set j 0} {$j < $pro_frames} {incr j 1} {
mol addfile ${dir}/dcd/project-pdb-$j.dcd
}
set top [molinfo top]
animate write dcd {${dir}/dcd/project_protein.dcd} beg 0 end [expr $pro_frames - 1] skip 1 [molinfo top]

The last step is where the individual files are stitched together.

Monday, April 23, 2012

Installing Gimp on Ubuntu 11.1

Last month I installed Gimp on Ubuntu 11.1 which dropped Gimp since they deemed it too complex for new linux converts. Since this was last month I don't remember all the steps I took but I did copy the commands to a file and they might come handy in future. Here they are...

   69  sudo add-apt-repository ppa:matthaeus123/mrw-gimp-svn
   71  sudo apt-get install gimp gimp-data gimp-plugin-registry gimp-data-extras
   73  sudo add-apt-repository ppa:lucid-bleed/ppa
   74  sudo add-apt-repository ppa:matthaeus123/mrw-gimp-svn
   79  mkdir gimp
   80  cd gimp/
   81  wget ftp://ftp.gimp.org/pub/gimp/v2.7/gimp-2.7.2.tar.bz2
   82  tar xjf gimp-2.7.2.tar.bz2
   83  export PATH=/opt/gimp-2.7/bin:$PATH
   84  export PKG_CONFIG_PATH=/opt/gimp-2.7/lib/pkgconfig
   85  export LD_LIBRARY_PATH=/opt/gimp-2.7/lib
   86  cd gimp-2.7.2/
   87  ./configure —prefix=/opt/gimp-2.7
   89  ./configure
   90  cd ..
   92  sudo apt-get install git
   93  git clone git://git.gnome.org/babl
   94  cd gimp-2.7.2/
   95  ./configure --prefix=/opt/gimp-2.7
   96  cd ..
   97  ./autogen.sh --prefix=/opt/gimp-2.7
   99  cd babl/
  100  ./autogen.sh --prefix=/opt/gimp-2.7
  101  make -j5
  102  cd ../gimp-2.7.2/
  103  ./configure --prefix=/opt/gimp-2.7
  104  cd ../babl/
  105  cd ..
  106  git clone git://git.gnome.org/babl
  107  ./autogen.sh --prefix=/opt/gimp-2.7
  108  sudo apt-get install libtool
  109  ./autogen.sh --prefix=/opt/gimp-2.7
  110  cd babl/
  111  ./autogen.sh --prefix=/opt/gimp-2.7
  112  make -j5
  113  sudo make install
  114  cd ../gimp-2.7.2/
  115  ./configure --prefix=/opt/gimp-2.7
  116  cd ..
  117  wget ftp://ftp.gimp.org/pub/gegl/0.1/gegl-0.1.6.tar.bz2
  118  tar -xjf gegl-0.1.6.tar.bz2
  119  sudo apt-get install libjpeg62-dev libopenexr-dev librsvg2-dev
  120  cd gegl-0.1.6/
  121  ./configure --prefix=/opt/gimp-2.7
  122  sudo make install
  123  cd ..
  124  ls
  125  cd gimp-2.7.2/
  126  ./configure --prefix=/opt/gimp-2.7
  127  sudo apt-get install libtiff4-dev python-dev python-gtk2-dev
  128  ./configure --prefix=/opt/gimp-2.7
  129  which gimp
  130  sudo apt-get install libexif-dev liblcms1-dev
  131  ./configure --prefix=/opt/gimp-2.7
  132  make -j5
  133  make install -j5
  134  sudo make install -j5
  135  ls /opt/gimp-2.7/bin/gimp-2.7
  136  /opt/gimp-2.7/bin/gimp-2.7
  137  history > ~/gimpinstallcommands.dat

Making movies with VMD/PyMol


These two scripts help greatly in making and handling movies. I need scripts instead of the GUI because most of my data is on machines on the other coast and it gets rather cumbersome to pull a gigantic trajectory file just to see a quick snapshot of the system. I could make PDB (or better dcd) files of parts of trajectory I am interested in, but I will need to make movies at some point.

view_change_render.tcl
trajectory_movie.tcl
http://www.myofilament.org/documents/howto/modeling/VMD_Tcl.htm

PyMol uses get_view and "set view" commands to save and retrieve views of molecules. VMD handles things a little differently. It needs 4 parameters per view. They are [rotate/center/scale/global]_matrix. Saving and retrieving are done by processes below. They have been taken from view_change_render.tcl
I added the last part get_vp to see what the view point actually looks like (similar to get_view in PyMol).

proc save_vp {view_num} {
  global viewpoints
  if [info exists viewpoints($view_num)] {unset viewpoints($view_num)}
  # get the current matricies
  foreach mol [molinfo list] {
    set viewpoints($view_num,$mol,0) [molinfo $mol get rotate_matrix]
    set viewpoints($view_num,$mol,1) [molinfo $mol get center_matrix]
    set viewpoints($view_num,$mol,2) [molinfo $mol get scale_matrix]
    set viewpoints($view_num,$mol,3) [molinfo $mol get global_matrix]
  }
} 
  
 
 proc retrieve_vp {view_num} {
  global viewpoints
  foreach mol [molinfo list] {
    if [info exists viewpoints($view_num,$mol,0)] {
      molinfo $mol set rotate_matrix   $viewpoints($view_num,$mol,0)
      molinfo $mol set center_matrix   $viewpoints($view_num,$mol,1)
      molinfo $mol set scale_matrix   $viewpoints($view_num,$mol,2)
      molinfo $mol set global_matrix   $viewpoints($view_num,$mol,3)
    } else {
      puts "View $view_num was not saved"}
  }
}
 
 proc get_vp {view_num} {
  global viewpoints
  foreach mol [molinfo list] {
    if [info exists viewpoints($view_num,$mol,0)] {
puts "global viewpoints"
puts "molinfo $mol set rotate_matrix     \"$viewpoints($view_num,$mol,0)\""
puts "molinfo $mol set center_matrix     \"$viewpoints($view_num,$mol,1)\""
puts "molinfo $mol set scale_matrix      \"$viewpoints($view_num,$mol,2)\""
puts "molinfo $mol set global_matrix     \"$viewpoints($view_num,$mol,3)\""
    } else {
      puts "View $view_num was not saved"}
  }
}
 
 
Once you save this output (as a text file) it becomes very easy to retrieve the view even 
after closing down VMD. Simply copy and paste the output of get_vp.

Friday, April 20, 2012

Loops in Tcl/Tk

If loop

if { test1 } {
body1 
} elseif { test2 } {
body2 
} else {
bodyn
}


For loop

for { set i 1 } { $i <= 10 } { incr i } { # This will append all multiples of all numbers for the number n # into a variable called table set table "$table $i x $n = [expr $i \* $n]\n" }
 
 #Make the list
set list [list "Item 1" {Item 2} Last]
set text ""
foreach item $list {
 #Append the every item to $text
 set text "$text\n$item"
}
Foreach loop

#Make the list 
set list [list "Item 1" {Item 2} Last] 
set text "" 

foreach item $list { 
#Append the every item to $text 
 set text "$text\n$item"
 }


Thursday, March 29, 2012

Find non-zero size files in Linux

http://www.unix.com/unix-dummies-questions-answers/25514-how-find-files-not-empty.html

You can use the find command to find non-zero length files:

find path -type f ! -size 0

In most UNIX implementations, the -size expression can also be used to search for file sizes of exactly N bytes (-size Nc), greater-than N bytes (-size +Nc), and less-than N bytes (-size -Nc).

The confusing thing is that the numeric following -size is, by default, 512-byte blocks not a byte count. The numeric must be followed by a 'c' for that. The following command will find files less than 2KB:

find . -type f -size -2048c -print

Conversely, for files greater than 2KB:

find . -type f -size +2048c -print

One note, the 512-byte block count does not directly translate into bytes. It's a long story. You can display a file's block usage with, under Solaris, ls -s. Stick to byte counts.

Oh, yes, one more thing, you can search for a specific range by using multiple -size expressions as long as they can all be satisfied by a single file. For example, locate files larger than 2KB and less than 8KB (inclusive):

find . -type f -size +2047c -size -8193c

Making VMD detect improperly named atoms

mol reanalyze does the job.
More here; http://www.ks.uiuc.edu/Research/vmd/mailing_list/vmd-l/19657.html

Wednesday, March 28, 2012

List of webpages that have come handy

http://ubuntugeeknerd.blogspot.com/2010/01/you-have-not-chosen-to-trust-thawte.html

For a better solution, try this:
sudo ln -s /etc/ssl/certs/Thawte_Premium_Server_CA.pem /usr/lib/ICAClient/keystore/cacerts/

Useful when you get this error;
" Citrix SSL Error 61: You have not chosen to trust “Thawte Server CA”".

Tuesday, March 27, 2012

NAMD sample scripts

http://www.ks.uiuc.edu/Research/namd/2.7/ug/node64.html

This file demonstrates the analysis of a DCD trajectory file using NAMD. The file pair.pdb contains the definition of pair interaction groups; NAMD will compute the interaction energy and force between these groups for each frame in the DCD file. It is assumed that coordinate frames were written every 1000 timesteps. See Sec. 12.1 for more about pair interaction calculations.

# initial config
coordinates     alanin.pdb
temperature     0

# output params
outputname      /tmp/alanin-analyze
binaryoutput    no

# integrator params
timestep        1.0

# force field params
structure       alanin.psf
parameters      alanin.params
exclude         scaled1-4
1-4scaling      1.0
switching       on
switchdist      8.0
cutoff          12.0
pairlistdist    13.5
stepspercycle   20
 
# Atoms in group 1 have a 1 in the B column; group 2 has a 2.
pairInteraction  on
pairInteractionFile pair.pdb
pairInteractionCol B
pairInteractionGroup1 1
pairInteractionGroup2 2

# First frame saved was frame 1000.
set ts 1000

coorfile open dcd /tmp/alanin.dcd

# Read all frames until nonzero is returned.
while { ![coorfile read] } {
  # Set firstTimestep so our energy output has the correct TS.
  firstTimestep $ts
  # Compute energies and forces, but don't try to move the atoms.
  run 0
  incr ts 1000
}
coorfile close

Monday, March 5, 2012

Drawing arrows in Gimp

Gimp (Gnu image manipulation program) is an open source image editor/manipulator that I use regularly to edit images and add text. I often need to add arrows to images and for some strange reason Gimp does not have an arrow button by default. But thanks to
http://registry.gimp.org/node/20269 there is now an easy way to accomplish that. For my purposes, I found these values ideal;
I could have learned a lot by simply reading the asterisk which says that negative values indicate fractions. For example a value of -2 indicates that the length of wings is 1/2 of the length of arrow. A value of -8 for brush thickness indicates that the brush is 1/8th of the length of arrow. For very long arrows one should select very low negative values (-100 or -200) to get reasonable looking arrows.

Wednesday, February 29, 2012

PLINK

PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.

The focus of PLINK is purely on analysis of genotype/phenotype data, so there is no support for steps prior to this (e.g. study design and planning, generating genotype or CNV calls from raw data). Through integration with gPLINK and Haploview, there is some support for the subsequent visualization, annotation and storage of results.

PLINK (one syllable) is being developed by Shaun Purcell at the Center for Human Genetic Research (CHGR), Massachusetts General Hospital (MGH), and the Broad Institute of Harvard & MIT, with the support of others.


Basic usage for SNP lookup function
plink --lookup rs1475515
which outputs to the LOG file the following information

Wednesday, February 22, 2012

Bookmarks

My favorite online bookmarking tool is delicious.com. It is a pretty neat, no-nonsense website currently run by Chad Hurley and Steve Chen. I typically assign a tag to a saved link and looking at what other people have saved under that tag can be quite educational. Here's how to do it:
delicious.com/tag/name-of-tag

Monday, February 20, 2012

Solvating molecules with VMD

To perform MD simulations with NAMD with Charmm ff, I typically use Charmm to build the initial psf/pdb of the protein and then use VMD to solvate it. Now VMD does not like any psf file generated by Charmm. It likes to its own version of psf file. For this one has to first read in a pdb file and read out a psf file. For this the following script is helpful:


package require psfgen
topology ./top_all27_prot_na.inp
segment PROT { pdb xp5helixonXaxis.out.pdb }
coordpdb xp5helixonXaxis.out.pdb PROT
guesscoord
writepsf vmd.psf

package require solvate
solvate vmd.psf xp5helixonXaxis.out.pdb  -o solvate  -s WT -t 15

Useful resource:
http://www.ks.uiuc.edu/~villa/dna/

Thursday, February 16, 2012

Fitting Protein Structures in VMD

# This script can be used to first load a reference structure and
# fit the second structure/trajectory to the reference.

mol new reference.pdb

mol new enzfibsol.psf
mol addfile md2.dcd type dcd first 0 last -1 step 1 waitfor all
mol addfile md3.dcd type dcd first 0 last -1 step 1 waitfor all
mol addfile md4.dcd type dcd first 0 last -1 step 1 waitfor all
mol addfile md5.dcd type dcd first 0 last -1 step 1 waitfor all

set backbone [atomselect 0 backbone]

for {set i 0} {$i < [molinfo 1 get numframes]} {incr i 1} {

set bk [atomselect 1 backbone frame $i]
set all [atomselect 1 all frame $i]
$all move [measure fit $bk $backbone]

}

Wednesday, February 15, 2012

Image manipulation in Matlab

a = double(imread('figures_from_microscope/500nm_particles_1200.tif'));
colormap('gray'), imagesc(a);

b = bpass(a,0,5,20);
#    bpass(img,lnoise, lobject, threshold)
colormap('gray'), image(b);

pk = pkfnd(b,25,3);
#     pkfnd(im, th, sz)

fid = fopen('exp.txt', 'w');
fprintf(fid, '%6.2f %12.8f\n', pk);
fclose(fid);

pk= pkfnd(b(200:400,200:400), 30,10);

# Below is a way to display multiple figures
figure(1); colormap('gray'), imagesc(a);
figure(2); colormap('gray'), image(b);
figure(3); colormap('gray'), image(b(200:400,200,400));
figure(4); colormap('gray'), image(a(200:400,200:400));

# This will put dots of 'r'ed on the particles of interest.
plot(pk(:,1),pk(:,2),'r.')
Looking at a small section of an image and checking for accuracy of the image manipulation is necessary. This is important since we make many assumptions on what particles to look at and what to ignore [this is what values in bpass and pkfnd do].

If var is a variable you are using in Matlab, it can be saved into a file using:
save('filename.txt', 'var', '-ascii')

These are the relevant links:
http://physics.georgetown.edu/matlab/tutorial.html
http://www.physics.emory.edu/~weeks/idl/tracking.html
http://www.physics.emory.edu/~weeks/idl/index.html

Friday, February 10, 2012

Getting header names in R and playing with csv files


names(myTable)

a<-read.csv("file.csv", header=T, sep=",")
names(a)
motility<-subset(a, Putative.orf.function=="Motility & Attachment")
write.csv(motolity, file="motility.csv")
 
Edit Feb 15th 2012
grep("Motility", a$Putative.orf.function)
gives the list of rows whose Putative.orf.function column contains the string "Motility".
Using this, one can get the entire row information.
df<-read.csv("file.csv", header=T, sep=",")
names(df)
 > names(df)
 [1] "Strain.Name"           "ID"                    "Location"             
 [4] "PA.ORF"                "Gene"                  "Gene.Abbrev."         
 [7] "Putative.orf.function" "Position.in.ORF"       "Frame"                
[10] "Genome.Position"       "Transposon.Direction"  "Transposon"           
[13] "F.Primer.Name"         "F.Primer.Seq"          "F.Primer.Position"    
[16] "F.Primer.Tm"           "Insert...F"            "R.Primer.Name"        
[19] "R.Primer.Seq"          "R.Primer.Position"     "R.Primer.Tm"          
[22] "R...Insert"            "WT.PCR.Len"            "Confirmed."           
[25] "How.confirmed."        "Notes"                
> df$Strain.Name[grep("Attachment",df$Putative.orf.function)]
  [1] PW10299 PW10300 PW1728  PW1729  PW1730  PW1731  PW1749  PW1750  PW1751 
 [10] PW1752  PW1753  PW1754  PW1755  PW1756  PW1757  PW2791  PW2792  PW2794 
 [19] PW2795  PW2941  PW2942  PW2943  PW2944  PW2945  PW2946  PW2947  PW2948 
...
[118] PW8681  PW8682  PW9465  PW9466  PW9467  PW9468  PW9469  PW9470  PW9471 
[127] PW9472  PW9473  PW9474 

Friday, January 27, 2012

Useful commands in Pymol


load file.pdb
create protein, all and segid PROT
create water, all and resname WAT
create res40, protein and resid 40
# To set display to orthoscopic
cmd.set('''orthoscopic''','''1''',quiet=0)
# To reset to default view
reset
hide everything, all
hide everything, protein
show sticks, protein
show surface, protein
set transparency=0.5
bg_color white
# To save images
ray 1200,1200
save file.png
distance (selection1), (selection2)
distance name = (selection1), (selection1) [,cutoff [,mode] ]
fit (selection), (target-selection)
# Colors everything by chain id
util.cbc

Resources:
http://pymol.sourceforge.net/newman/ref/S1000comref.html
http://www.pymolwiki.org/index.php/TOPTOC
http://www.pymolwiki.org
http://www.pymol.org/support
http://www.pymolwiki.org/images/7/77/PymolRef.pdf
http://people.mbi.ucla.edu/sawaya/tutorials/Graphics/pymol.html

The last link is particularly useful in the post-beginner stage of using Pymol.


 
load xxx.pdb
hide everything, all
create protein, all and not name h*
show sticks, protein
distance resi 22 and name CD and chain A, resi 22 and name CD and chain B
distance resi 22 and name CD and chain B, resi 22 and name CD and chain C
distance resi 22 and name CD and chain C, resi 22 and name CD and chain D
distance resi 22 and name CD and chain D, resi 22 and name CD and chain E
distance resi 28 and name NZ and chain D, resi 23 and name OD1 and chain C
distance resi 28 and name NZ and chain D, resi 23 and name OD2 and chain C
distance resi 28 and name NZ and chain C, resi 23 and name OD1 and chain B
distance resi 28 and name NZ and chain C, resi 23 and name OD2 and chain B
distance resi 28 and name NZ and chain B, resi 23 and name OD1 and chain A
distance resi 28 and name NZ and chain B, resi 23 and name OD2 and chain A
hide labels, all

March 2nd 2012

create protein, all and not resn DT+DG+NAG+HOH+OG6
create rna, all and resn DG+DT
create nag, all and resn NAG
create og6, all and resn OG6
hide everything, all
show cartoon, rna
show lines, rna
create closeprotein, (rna around 10) and not resn DG+DT
show sticks, closeprotein
distance /rna//D/DT`13/O4, /closeprotein//H/ARG`75/NH1
distance /rna//D/DT`13/O4, /closeprotein//H/ARG`75/NH2
distance /rna//D/DG`2/OP2, /closeprotein//H/ARG`77A/NH1
distance /rna//D/DG`2/OP2, /closeprotein//H/ARG`77A/NH2


March 5th, 2012

 # simple example: label residue 22's atoms with their names
label i. 22, name
 
# Label residue #44's alpha carbon with it's residue name, number and B-factor.
label n. CA and i. 44, "(%s, %s, %s") % (resn, resi, b)


This link has a lot of info on labeling. http://pymolwiki.org/index.php/Label

Monday, January 23, 2012

Printers

Here are a few commands to work with printers on Suse Linux.

# To check the status of all printers
lpstat -a
# To check the status of a printer
lpstat -p printer
# To set a default printer
lpoptions -d printer-name
lpq
# To print a file
lp -d printer file
# To remove a print job
lprm -Pprinter job-id


sudo -u pkc  /usr/sbin/cupsenable HP_LaserJet_6MP

Friday, January 20, 2012

To merge pdf files in Mac

Here's a neat little trick to combine pdf files on a Mac
http://macintoshhowto.com/leopard/how-to-merge-pdf-files-with-preview-in-leopard.html

Wednesday, January 18, 2012

VMD script to pick a box within a box


set cellxmin [lindex [lindex [measure minmax $all] 0] 0]
set cellxmax [lindex [lindex [measure minmax $all] 1] 0]
set cellymin [lindex [lindex [measure minmax $all] 0] 1]
set cellymax [lindex [lindex [measure minmax $all] 1] 1]
set cellzmin [lindex [lindex [measure minmax $all] 0] 2]
set cellzmax [lindex [lindex [measure minmax $all] 1] 2]

for {set i 0} {$i < 100} {incr i} {
# This lets you pick a random box of size 10x10x10 in the simulation box
set xmin [expr $cellxmin + ( ($cellxmax-$cellxmin-$boxsize) * rand())]
set ymin [expr $cellymin + ( ($cellymax-$cellymin-$boxsize) * rand())]
set zmin [expr $cellzmin + ( ($cellzmax-$cellzmin-$boxsize) * rand())]
set xmax [expr $xmin + $boxsize]
set ymax [expr $ymin + $boxsize]
set zmax [expr $zmin + $boxsize]

set selection [atomselect top "(x>$xmin and x<$xmax and y>$ymin and y<$ymax and z>$zmin and z<$zmax)"]
# This gives the number of atom in that selection
set n [$selection num];
#This gives the number of non-bonded interactions in the 10x10x10 box selected.
puts $file [expr $n * ($n -1)/2 ]
}

Thursday, January 12, 2012

Useful perl code

#!/usr/bin/perl

$infile  = "./test.pdb";
$outa = "./test.pdb.A";
$outb = "./test.pdb.B";
open (FH,$infile) || die;
open (OUTA, ">$outa") || die;
open (OUTB, ">$outb") || die;

while ($buf = )
{
    $x = substr($buf,16,1);
    if ($x eq 'A') { print OUTA $buf; }
    elsif ($x eq 'B') { print OUTB $buf; }
    else { print OUTA $buf; print OUTB $buf; }
}

close FH;
close OUTA;
close OUTB;


# This code reads in a file and if the 17th column in each line matches A/B, that line is put in OUTA/OUTB.This is useful in separating multiple coordinates for particular residues in 3D coordinate file of a protein. Sometimes studying all possible conformations is important especially if it involves an aromatic ring flip or flips in residues with amine ends.

Thursday, January 5, 2012

Delete lines matching a string on VIM

:g/pattern/d
:v/pattern/d

Tuesday, December 13, 2011

Go to col on vim/vi


You can quickly jump to a specified column in Vim by simply typing the column number followed by a pipe.


80| # go to the 80th column

Thanks to DailyVim

Tuesday, December 6, 2011

Min-Max on R plots

d <- density(rnorm(1000))
plot(d)
d$x[which.max(d$y)]
abline(v=d$x[which.max(d$y)])

http://r.789695.n4.nabble.com/Find-x-value-of-density-plots-td4165926.html

Monday, November 28, 2011

Great source for 'sed' examples

http://sed.sourceforge.net/sed1line.txt

Bash commands in R

x<-as.matrix(read.table(pipe("awk -f cut.awk F120x1")))

Wednesday, November 23, 2011

NAMD inputs from VMD

set all [atomselect top all]
measure minmax $all
set cellxmin [lindex [lindex [measure minmax $all] 0] 0]
set cellxmax [lindex [lindex [measure minmax $all] 1] 0]
set cellymin [lindex [lindex [measure minmax $all] 0] 1]
set cellymax [lindex [lindex [measure minmax $all] 1] 1]
set cellzmin [lindex [lindex [measure minmax $all] 0] 2]
set cellzmax [lindex [lindex [measure minmax $all] 1] 2]

puts "cellOrigin              [format "%7.3f %7.3f %7.3f" [lindex [measure center $all] 0] [lindex [measure center $all] 1] [lindex [measure center $all] 2]]"
puts "cellBasisVector1        [format "%7.3f %7.3f %7.3f"      [vecsub $cellxmax $cellxmin] 0 0]"
puts "cellBasisVector2        [format "%7.3f %7.3f %7.3f"  0   [vecsub $cellymax $cellymin] 0]"
puts "cellBasisVector3        [format "%7.3f %7.3f %7.3f"  0 0 [vecsub $cellzmax $cellzmin] ]"



Pretty primitive, but it works.

Thursday, November 17, 2011

Common VMD Tcl/Tk commands

mol modselect 0 0 resname BGLC and noh
mol modselect 0 1 resname BGLC and noh
mol modselect 0 2 resname BGLC and noh
mol modselect 0 3 resname BGLC and noh

mol smoothrep 0 0 5
mol smoothrep 1 0 5
mol smoothrep 2 0 5
mol smoothrep 3 0 5


mol modstyle 0 0 Licorice 0.300000 10.000000 10.000000

mol addrep 0
mol modselect 1 0 protein
mol modstyle  1 0 Licorice 0.300000 10.000000 10.000000
mol modstyle  1 0 vdw
mol smoothrep 1 1 5

mol off 0
mol off 1
mol off 2
mol off 3

mol showrep 0 0 0
mol showrep 0 0 1

color Display Background white

Tuesday, November 8, 2011

Calculating phi/psi values in VMD

First source the following script:
--- start ---
proc all_dihed_angle { a1 a2 a3 a4 } {
  # Delete all existing Dihdral labels so that the one we add has index 0.
  label delete Dihedrals

  # Use the top molecule
  set molid [molinfo top]

  # Add the dihedral monitor
  label add Dihedrals $molid/$a1 $molid/$a2 $molid/$a3 $molid/$a4

  # Get all values
  return [label graph Dihedrals 0]
}
--- end ---

Then run this script:

mol new enzfibsol.psf
mol addfile md7.dcd first 0 last 10 waitfor all
set phifile [open "cellulose_dihed_phi.tcl" w]
set psifile [open "cellulose_dihed_psi.tcl" w]

set oxygen [atomselect top "resname BGLC and resid 436 and name O1"]
set ox [$oxygen get index]
set h1 [expr {$ox-1}]
set c1 [expr {$ox-2}]
set c4 [expr {$ox-9}]
set h4 [expr {$ox-8}]

puts $phifile [all_dihed_angle $h1 $c1 $ox $c4]
puts $psifile [all_dihed_angle        $c1 $ox $c4 $h4]

close $phifile
close $psifile
quit

Thursday, September 22, 2011

Resize images on a console

sips --resampleWidth 2400 IMG_0062.JPG --out test.jpg
(double dash) resampleWidth and (double dash) out

This Mac command does a fantastic job of resizing large images in to more manageable file sizes.

Edit: A width of 2400 is too large. Use 1600 or lower instead.

Tuesday, September 13, 2011

Wildcards in Tcl/Tk

http://phaseit.net/claird/comp.lang.tcl/fmm.html

Very useful link on wildcards in Tcl Tk. This line solved my problem;
mol new [glob fitted*.pdb rev*.pdb]

A little background would help greatly. I had a set of directories each of which had either a fittedxxx.pdb or revxxx.pdb file. I wanted a single tcl script that would go into these directories and read the fitted/rev file and perform calculations. Since the name of the file in each directory was different, I needed a way to select whatever was available. glob does exactly that. 

Wednesday, September 7, 2011

Install packages in R

R CMD INSTALL mypkg -l /my/own/R-packages/

install.packages("mypkg", lib="/my/own/R-packages/")

Friday, September 2, 2011

ssh-keygen

To reset IP and known_host keys, use the following command:

ssh-keygen -R moldyn.ornl.gov -f /home/pkc/.ssh/known_hosts
RSA host key for moldyn.ornl.gov has changed and you have requested strict checking.


Monday, August 29, 2011

That #$%@ beep

1. xset b off
2. set b off
3. Turn off beep on Gmixer (or whatever audio controller you run)

These commands turn off the annoying beep that accompanies [tab], [auto-fill], and errors in the linux environment.


Thursday, August 18, 2011

png figures in a loop in R

This sample script creates 100 png figures with the iteration number included in the name.

for(i in 1:100) {
a<-rnorm(100)
name<-paste("test",i,".png",sep="")
png(name)
plot(a/max(a),type="l")
dev.off()
}

Friday, May 27, 2011

Running multiple serial jobs on clusters

This script describes a way to run multiple serial jobs on clusters.

#PBS -N serial.analysis
#PBS -j oe
#PBS -l walltime=4:00:00,nodes=22:ppn=16,gres=widow2

cd $PBS_O_WORKDIR
date
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PBS_O_WORKDIR

sort $PBS_NODEFILE | uniq > ./TEMP.NODEFILE

n=`wc -l ./TEMP.NODEFILE |awk '{print $1}'`
for((i=1;i<=$n;i++)); do j=`expr $i - 1`; sed -n ''$i','$i'p' ./TEMP.NODEFILE > host.$j; done

$MPI -np 1 --hostfile host.0 $NAMD ./w1_md3_resid_100.namd > ./w1_md3_resid_100.namd.out &
$MPI -np 1 --hostfile host.0 $NAMD ./w1_md3_resid_101.namd > ./w1_md3_resid_101.namd.out &
....
$MPI -np 1 --hostfile host.0 $NAMD ./w1_md3_resid_174.namd > ./w1_md3_resid_174.namd.out &
$MPI -np 1 --hostfile host.1 $NAMD ./w1_md3_resid_175.namd > ./w1_md3_resid_175.namd.out &
....
$MPI -np 1 --hostfile host.1 $NAMD ./w1_md3_resid_246.namd > ./w1_md3_resid_246.namd.out &
$MPI -np 1 --hostfile host.2 $NAMD ./w1_md3_resid_247.namd > ./w1_md3_resid_247.namd.out &
$MPI -np 1 --hostfile host.2 $NAMD ./w1_md3_resid_251.namd > ./w1_md3_resid_251.namd.out &
..

and so on.

Friday, May 6, 2011

Conditional plots in R

This link says it all.

 Al - 19 Jan 2007 15:33 GMT
Hi!
Given a 1000x2 matrix x, i.e. 1000 2d-points. How can I directly (!)
plot/visualize only those which fulfill a certain condition, say
x[,1]>0 & x[,2]>0?
Thanks,
Al
Marc Schwartz - 19 Jan 2007 21:24 GMT
> Hi!
> Given a 1000x2 matrix x, i.e. 1000 2d-points. How can I directly (!)
> plot/visualize only those which fulfill a certain condition, say
> x[,1]>0 & x[,2]>0?
> Thanks,
> Al

First, R specific programming queries are best posted to the r-help
list. More information at:

http://www.r-project.org/

under the Mailing Lists link.

Second, the easiest way to do this is to subset the data while using
plot.formula(), which is a plot method that will be dispatched if the
first argument to plot() is a formula, such as 'y ~ x'. See ?plot and
?plot.formula for more information.

As an example, let's create a data frame with 2 columns and 1000 rows,
each column comprised of 1000 random normal deviates:

DF <- data.frame(x = rnorm(1000), y = rnorm(1000))
> str(DF)
'data.frame': 1000 obs. of 2 variables:
$ x: num -0.0265 0.1554 -0.1050 -0.9697 -0.3430 ...
$ y: num 1.386 -1.356 -1.170 0.426 0.204 ...

Now, let's plot the data meeting the criteria you indicated above:

plot(y ~ x, data = DF, subset = (x > 0) & (y > 0))

This says to generate a scatter plot, using 'DF' as the input data set,
and specifying the logical subset where x and y are both > 0. Modify
the subset argument as you require.

Note that in this usage, the 'data' argument must be a data frame. So
you could coerce your matrix (if needed) by using:

DF <- as.data.frame(matrix)

HTH,

Marc Schwartz

And this is the way (ex.) to get the length of the array that satisfies the conditions.
sum((x > 0) & (y > 0), na.rm=TRUE)

Wednesday, May 4, 2011

Custom replacements in VI

Thanks to Steve for this pearl of VI wisdom.
For search and replace operation in VI, parenthesis remembers the item. This can be used for custom replacements later. Try the command below on a file with this data:

29 5 2010
30 6 2011

:1,$s/[0-9]* \([0-9]*\) \([0-9]*\)/\2 \1/
VI can remember 9 such fields.

Monday, April 11, 2011

Potential Energy Scans with NWChem

While performing PES on dihedrals of relatively large molecules using NWchem, one can get stuck during the minimization process. Possible ways to overcome those problems are:
1. Try a smaller step size
2. Use the "tight" function in the driver
3. Redo everything with a smaller molecule that can mimic the dihedral in question

Tuesday, March 1, 2011

Useful "vi" tips

1. If you want to merge rows ( for exp. row 1 to 20 ) do:
:1,20 j
2. To replace string with a new line
:%s/ TI/^MTI
To get ^M use following
Hold down the Ctrl, and press v m
DO NOT USE Shift+6 and Shift+M
3. To move a string to a new line w/o hitting Enter, do this;
:s/string/\r&/g

Friday, February 25, 2011

Flush VMD

While running VMD analysis scripts over large number of trajectories, it is a good idea to free up memory after every step like so:
for {set j 1} {$j < 101} {incr j}
{
mol addfile /.$j.dcd first 0 last -1 waitfor all
set ca [atomselect top "name CA"]
set cb [atomselect top "name CB"]
....
$ca delete
$cb delete
animate delete all
}
The last three lines will make life much better.

Wednesday, January 19, 2011

Using Bit Bucket

Following my friend's recommendation I decided to check out this site called Bit Bucket which is a free code sharing/hosting website where one can essentially store their code and any changes they make to that file. Following are the steps to get things up and running:
1. Install Mercurial (if you don't already have it)
2. Open an account with https://bitbucket.org
3. Create a repository on bitbucket.org. Let's call it bit_repo.
4. Create a /home/user/.hgrc file with the following lines on your desktop.
[ui]
; editor used to enter commit logs, etc. Most text editors will work.
editor = vi
username = firstname lastname
5. cd in to a directory on your desktop which contains codes that you want to backup.
5.1. Type> hg clone https://
bitbucket_user_id@bitbucket.org/bitbucket_user_id/bit_repo
destination directory: bit_repo
http authorization required
...
Then cd
bit_repo
6. To add a specific file to Bitbucket, type
hg add file1.c
hg commit -m "I commit to add this file to bitbucket" [The msg in quotes can by anything]
hg push https://bitbucket_user_id@bitbucket.org/bitbucket_user_id/bit_repo
7. To remove a file type
hg remove file2.c
hg commit -m "I commit to remove this file"
hg push https://bitbucket_user_id@bitbucket.org/bitbucket_user_id/bit_repo
8. To copy files/repositories from bitbucket, type
hg clone https://bitbucket_user_id@bitbucket.org/bitbucket_user_id/bit_repo
9. Here is a Bitbucket 101 http://confluence.atlassian.com/display/BITBUCKET/Bitbucket+101
10. To keep things safe I would just softlink files to their sources on my desktop.

Friday, August 6, 2010

Little victory

I have a linux box running on openSUSE 11.1 in my office. Firefox on the machine was not configured to open pdf files in the browser and after years of downloading files just to view them I found a solution. All I needed to do was to put this plugin called nppdf.so in the folder ~/.mozilla/plugins/. Now I couldn't find a single working link that would let me download the plugin. Frustrated, I did a simple "find / -name nppdf.so" on my terminal and found it somewhere in the Realplayer folder. Copied it and voila! The sweetest part was that the whole process took ~2mins. Geee...

Sunday, March 21, 2010

VPN on Ubuntu 9.1

I wanted to write about how awesome Ubuntu9.1 was when it came to accepting my ancient Netgear wireless card from 2004 (w/o a single keystroke), installing Skype and Logitech Quickcam Pro 9000 (no installation required), but the VPN installation has caused enough heartburn to prevent me from appreciating the previous breezes. Here go my whinings:

1. I downloaded vpnclient-linux-4.8.01.0640-k9.tar.gz
2. Patched with this file vpnclient-linux-4.8.01-64bit.patch (from http://projects.tuxx-home.at/ciscovpn/patches/)
3. Ran 'bash vpn_install' (went w/o incident)
4. /etc/init.d/vpnclient_init start --> This died with the following error:
Starting /opt/cisco-vpnclient/bin/vpnclient: insmod: error inserting '/lib/modules/2.6.31-14-generic/CiscoVPN/cisco_ipsec.ko': -1 Invalid module format
Failed (insmod)
5. I also tried repatching using vpnclient-linux.2.6.31.diff since this guy seems to have had success with it.

And for the life of me I can't find a working solution online. I am temporarily giving up on it with the following doubts lingering:
1. The "64bit" in vpnclient-linux-4.8.01-64bit.patch makes me very suspicious. But this is the only Cisco vpn client my lab distributes for Linux users. Any and all useful inputs are welcome!

Edit (22nd March):

I fixed the problem by installing vpnc
>sudo apt-get install network-manager-vpnc
and pcf2vpnc
>wget http://svn.unix-ag.uni-kl.de/vpnc/trunk/pcf2vpnc
>./pcf2vpnc xxx.pcf > xxx.conf
>sudo vpnc xxx.conf
Viola! Life's good again. Thanks to this site and this one for clarifying the difference between a .pcf and a .conf file.

Thursday, March 11, 2010

Making figures using VMD


I have lately been playing around with VMD to make pictures. After having used their "screenshot" feature to make pictures all this time I decided to try out rendering images using Tachyon and boy! what a difference. A simple example is attached. I'll let you guess which of the two was made using the Tachyon ray tracer. More info here.

Tuesday, February 23, 2010

You know you are a Latex addict when...

You write -- for a dash(-) while taking notes on a piece of paper with a pen.

Friday, February 12, 2010

Plot layout in R

I recently needed to make a figure with two plots side-by-side. Plot 1 was to have a width 3 times that of plot 2. The layout feature in R seemed to be the answer. But, due to either lack of decent tutorials online or just plain bad luck in finding one, I found it rather difficult to understand how this layout function works. The usual help command help(layout) wasn't terribly helpful either. Here go the set of commands that really nailed it for me. Run them on your terminal to follow the rest of the blog.

nf <- layout(matrix(c(1,2),1,2),widths=c(1,1)); layout.show(nf)

Pretty straight forward isn't it? Now try,

nf <- layout(matrix(c(1,2),1,2),widths=c(3,1)); layout.show(nf)

Basically the matrix(c(1,2),1,2) is what contains the layout information. c(1,2) is simply the label of the plot in question. The 1,2 after that tells you the number if rows(1) and columns(2) in the layout. Now a little more complicated example,

nf <- layout(matrix(c(1,3,4,2,5,6,7,8),2,4),widths=c(1,1,1,5)); layout.show(nf)

Hope this helps.

Thursday, February 11, 2010

Colors in R

Hello World,
R is a pretty wicked statistical analysis tool with great resources for plotting data. I use it on a regular basis and I'll use this blog to document notable finds along the way. This will hopefully help me find answers to previously encountered problems by taking a peek at the blog rather than fishing through my sea of bookmarks. Here is a short comment on colors in R.

colors() gives a list of all the available colors in R. It looks like this:
[1] "white" "aliceblue" "antiquewhite"
[4] "antiquewhite1" "antiquewhite2" "antiquewhite3"
[7] "antiquewhite4" "aquamarine" "aquamarine1"
[10] "aquamarine2" "aquamarine3" "aquamarine4"
.
.
[646] "wheat" "wheat1" "wheat2"
[649] "wheat3" "wheat4" "whitesmoke"
[652] "yellow" "yellow1" "yellow2"
[655] "yellow3" "yellow4" "yellowgreen"

The way to select a particular color is:
plot(x,y,col=color()[646])
or
plot(x,y,col="wheat")

Ciao!