Running applications over many hosts
Available workstations
Multiple machines
Using SGE to submit jobs
We have 4 dual-processor AMD Opteron blades (servers) and 6 IBM workstations, each with dual AMD Opteron processors. In total there are 20 CPUs to which jobs can be submitted.
Two of the blades form a primary/secondary server pair (server01 and server02). The secondary mirrors the primary and is updated every second. If for any reason the primary server crashes, the secondary takes over (hopefully you will not notice a thing, unless you were running a job on the server at the time).
The other two blades are used purely as computational powerhouses, mainly because of space restrictions in the Linux room. These should be used if you want to submit large jobs.
Finally, we have 6 IBM workstations; two of these will have dual monitors to help with visualization for NMR analysis. Again, you can use these if you wish to spread a big job over many nodes.
We insist that all long jobs are submitted via the SGE job queuing system.
We use NFS, which allows you to access your home directory and applications from any machine on the cluster.
When you are given a username, your home directory will be
/nfs/home/username
Software available to everyone is stored at /nfs/usr/
In fact, your home directory and the software applications are both located on the server. This can cause a major speed problem for programs that use a lot of I/O, e.g. nmrPipe. To run these programs it is best to have the data stored locally on your workstation (/usr/local/scratch); for details on this see below.
Disk space
The NFS system has 160 GB of space.
In addition, each workstation has 80 GB of local space. At present, however, the system is set up to delete any data that has been on the local disk for more than 15 days, so if you do something locally (e.g. processing), make sure you copy it over to your home directory. I'm not sure this is something I want to continue, but we shall see how it goes.
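As a concrete sketch of the copy-back step, here is the idea with throwaway demo paths standing in for /usr/local/scratch and /nfs/home/username (the directory and file names below are purely illustrative):

```shell
# Copy locally-processed data back to the NFS home directory before the
# 15-day scratch cleanup deletes it. All paths here are demo paths, not
# the real scratch area.
mkdir -p /tmp/demo_scratch/hnco /tmp/demo_home
echo "processed spectrum" > /tmp/demo_scratch/hnco/test001.ft2
cp -r /tmp/demo_scratch/hnco /tmp/demo_home/
ls /tmp/demo_home/hnco
```

In practice you would copy from /usr/local/scratch to your home directory with cp -r or rsync.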
Running applications over many hosts
Most processes we use require logging in to other machines via ssh. It is important to set up your ssh configuration so that you can log in automatically without having to type a password. To do this, perform the following steps:
cd ~/.ssh
ssh-keygen -t dsa
cp id_dsa.pub authorized_keys
Now, to ensure that you can only log on from within the cluster, you need to edit the authorized_keys file. At the very beginning of the key line you need to insert:
from="*.iric.udem"
so it should have something like
from="*.iric.udem" ssh-dss AAAA
Finally, ssh to all the machines. You will be asked for the password the first time; thereafter it should log in automatically.
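If you prefer to script the from= edit described above, here is a minimal sketch using sed on a throwaway copy of the file (the ssh-dss key material below is a truncated placeholder, not a real key):

```shell
# Prepend the from= host restriction to the key line, working on a
# throwaway copy of authorized_keys in /tmp.
printf 'ssh-dss AAAAB3NzaC1kc3M user@station01\n' > /tmp/authorized_keys_demo
sed -i 's/^ssh-dss/from="*.iric.udem" ssh-dss/' /tmp/authorized_keys_demo
cat /tmp/authorized_keys_demo
```

Run the same sed command against ~/.ssh/authorized_keys once you are happy with the result.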
The names of the machines are listed below.
In my view the cluster (multiple machines) should be used for the following applications:
You can also try nmrPipe, but this is really slow over the network (check my stats on this below) because of the excessive I/O. Since NFS copies data from server01 to server02 every second, and nmrPipe writes and reads data a lot quicker than that, you can see the problem.
My advice is to run nmrPipe with the data located locally in the /scratch directory.
The workstations and servers/pnodes are named as follows:
server01
server02
pnode01
pnode02
station01
station02
station03
station04
station05
station06
To set up ARIA 1.1.2, add the following to your .bashrc:
export PYTHONPATH=/nfs/home/osbornem/local/aria1.1.2:$PYTHONPATH
alias aria='/usr/bin/python2 /nfs/home/osbornem/local/aria1.1.2/Aria/RunAria.py'
and to your .cshrc:
setenv PYTHONPATH /nfs/home/osbornem/local/aria1.1.2:$PYTHONPATH
alias aria '/usr/bin/python /nfs/home/osbornem/local/aria1.1.2/Aria/RunAria.py'
To set up ARIA 1.2, add the following to your .bashrc:
export PYTHONPATH=/nfs/home/osbornem/local/aria_1.2:$PYTHONPATH
alias aria='/usr/bin/python2 /nfs/home/osbornem/local/aria_1.2/Aria/RunAria.py'
and your .cshrc:
setenv PYTHONPATH /nfs/home/osbornem/local/aria_1.2:$PYTHONPATH
alias aria '/usr/bin/python /nfs/home/osbornem/local/aria_1.2/Aria/RunAria.py'
To run on multiple machines alter the run.cns script as follows:
{===>} queue_1="csh";
{===>} cns_exe_1="/nfs/home/osbornem/local/aria1.1.2/cns_solve_1.1/linux_opteron32_gnu/bin/cns";
{===>} cpunumber_1=1;
{===>} queue_2="ssh station04 csh";
{===>} cns_exe_2="/nfs/home/osbornem/local/aria1.1.2/cns_solve_1.1/linux_opteron32_gnu/bin/cns";
{===>} cpunumber_2=2;
{===>} queue_3="ssh station03 csh";
{===>} cns_exe_3="/nfs/home/osbornem/local/aria1.1.2/cns_solve_1.1/linux_opteron32_gnu/bin/cns";
{===>} cpunumber_3=2;
However, note that this is not necessary if you run ARIA via SGE. I have scripts that will do this automatically for you, so PLEASE USE THE SGE SYSTEM. Users not doing this run the risk of their processes being stopped.
See here for info on SGE and ARIA.
A queue entry of just csh runs on the current workstation; cpunumber is the number of CPUs to use on that station.
Note that the maximum number of processors is 16, but since you calculate 20 structures, using 10 CPUs will take the same time as using 16.
Alternatively you can edit the appropriate forms for data conversion (top) or for the run.cns (see bottom) using forms on our server:
OR from the server at the pasteur institute:
| Number of CPUs | Time | Increase in performance |
|---|---|---|
| 1 | 9 hours 4 minutes | N/A |
| 10 | 1 hour 15 minutes | ~9-fold |
| 16 | 1 hour 13 minutes | ~9-fold |
These timings are based on calculating 20 structures per iteration (8 iterations in total), with ARIA parallelizing over the structures within each iteration.
Why no performance gain from 10 to 16 CPUs? Because the next iteration cannot start until all 20 structures have been computed. Thus if we could use 20 CPUs, the time would probably be halved; alternatively, calculating 16 structures per iteration would give a similar halving.
BOTTOM LINE:
For maximum performance use only 10 CPUs (assuming no one else is using them).
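The 10-versus-16 CPU result follows from simple batch arithmetic: each iteration must finish all 20 structures, so the wall time scales with ceil(20/ncpus). A quick shell check of the batch counts:

```shell
# Why 16 CPUs is no faster than 10 here: with 20 structures per iteration,
# wall time is proportional to the number of batches, ceil(20 / ncpus),
# because the next iteration waits for the slowest batch to finish.
for ncpus in 1 10 16 20; do
    batches=$(( (20 + ncpus - 1) / ncpus ))
    echo "ncpus=$ncpus -> $batches batch(es) per iteration"
done
```

Both 10 and 16 CPUs give 2 batches per iteration, which is why they take the same wall time; only 20 CPUs would drop to a single batch.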
Running Xplor on multiple machines:
xplor -parallel -machines XXX.file -o oz.out < xplor.inp
where XXX.file contains the names of the hosts,
e.g.
station02
station03
station04
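A machines file is just one hostname per line, so it can be written in a single command; a sketch (the /tmp path and the three-host list are illustrative):

```shell
# Build a machines file for parallel Xplor: one hostname per line.
# The filename and host selection here are just examples.
printf '%s\n' station02 station03 station04 > /tmp/machines.list
cat /tmp/machines.list
```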
If you have old Xplor scripts you wish to use in parallel, then add the following syntax (remember the total number of structures should be greater than or equal to the number of machines specified):
1) Place these lines at the beginning of the script:
!total number of structures to calculate:
eval ($numStructs = 100)
!random seed:
eval ($randomSeed = 785)
! get parallel info
cpyth "from os import environ as env"
cpyth "xplor.command('eval ($proc_num=%s)' % env['XPLOR_PROCESS'])"
cpyth "xplor.command('eval ($num_procs=%s)' % env['XPLOR_NUM_PROCESSES'])"
eval ($firstStruct = ($proc_num * $numStructs) / $num_procs)
eval ($lastStruct = (($proc_num+1) * $numStructs) / $num_procs)
2) Then, the structure loop should be modified to:
eval ($count = $firstStruct)
while ($count < $lastStruct) loop structure
eval ($seed = $randomSeed+$count)
set seed $seed end
.
.
evaluate ($file = "1gb1_" + encode($count) + ".pdb")
write coor output= $file end
eval ($count = $count + 1)
end loop structure
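To see what the firstStruct/lastStruct arithmetic above hands to each process, here is the same integer formula evaluated in shell for 100 structures over 4 hypothetical processes (XPLOR_PROCESS is 0-based):

```shell
# Evaluate the per-process structure ranges used by the Xplor fragment above:
# firstStruct = proc*numStructs/num_procs,
# lastStruct  = (proc+1)*numStructs/num_procs.
numStructs=100
num_procs=4
for proc_num in 0 1 2 3; do
    first=$(( proc_num * numStructs / num_procs ))
    last=$(( (proc_num + 1) * numStructs / num_procs ))
    echo "process $proc_num: firstStruct=$first lastStruct=$last"
done
```

The ranges tile the full 0..numStructs span, so every structure is computed exactly once across the processes.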
Within the Python interface, parallelization is automatic if one uses the StructureLoop class from the simulationTools module.
Benchmarks:
Running over 6 CPUs.
Invoking the following command to calculate 50 structures:
/nfs/usr/xplor-nih-2.10/bin/xplor -parallel -machines ./allhosts.list -o xplor_par < anneal1_par.inp
| Number of CPUs | Time | Increased performance |
|---|---|---|
| 1 | ~17 hours | |
| 6 | ~3 hours | ~6-fold |
NMRPipe
If you have a 2D or a 3D (without too much LP) then it is probably best to store the data locally on /scratch and run on a single CPU!
However, for a 3D or 4D that would normally take a long time, you can run over many machines.
1. First you need a file called pipe.net describing the CPUs, e.g.:
station01.iric.udem 10
station01.iric.udem 10
station02.iric.udem 10
station02.iric.udem 10
station03.iric.udem 10
station03.iric.udem 10
station04.iric.udem 10
station04.iric.udem 10
station06.iric.udem 10
station06.iric.udem 10
station05.iric.udem 10
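Since the entries are so regular, a pipe.net like the one above can be generated with a short loop; this sketch (written to /tmp) assumes two entries per dual-CPU workstation, whereas the list above happens to give station05 only one:

```shell
# Generate pipe.net: two entries per dual-CPU workstation, each with the
# relative weight of 10 used above. The /tmp path is for illustration.
for h in station01 station02 station03 station04 station05 station06; do
    printf '%s.iric.udem 10\n%s.iric.udem 10\n' "$h" "$h"
done > /tmp/pipe.net
wc -l < /tmp/pipe.net
```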
2. Now split your processing script into separate files, one for each transform.
Here is an example of a double LP using parallel processing; the lines added for the parallel version are the waitFor commands and the -par/-cpu flags.
This is the FT of the first and second dimensions, with no LP:
#!/bin/csh
# parallel1.com NMRPipe Data Processing Script
echo 'Transforming 3-dimensional NMR data!'
echo 'Processing F3/F1 dimensions first'
hostname
waitFor -par pipe.net -init -ext part1 -cpu $1 -verb
xyz2pipe -in ./data/test%03d.fid -x -verb -par pipe.net -cpu $1 \
| nmrPipe -fn SOL -mode 2 -fl 64 -fs 2 -poly \
| nmrPipe -fn CBF -last 12 \
| nmrPipe -fn SP -off 0.5 -end 0.98 -pow 2 -size 512 -c 0.5 \
| nmrPipe -fn ZF -size 1024 \
| nmrPipe -fn FT \
| nmrPipe -fn PS -p0 -66 -p1 0 -di \
| nmrPipe -fn TP \
| nmrPipe -fn SP -off 0.5 -end 0.95 -pow 2 -size 60 -c 0.5 \
| nmrPipe -fn ZF -size 256 \
| nmrPipe -fn FT \
| nmrPipe -fn PS -p0 0 -p1 0 -di \
| nmrPipe -fn TP \
| pipe2xyz -out lp/shit%03d.ft2 -x
waitFor -par pipe.net -ext part1 -cpu $1 -verb
Now FT the third dimension with LP:
#!/bin/csh
# parallel2.com NMRPipe Data Processing Script
hostname
waitFor -par pipe.net -init -ext part1 -cpu $1 -verb
echo 'Processing F2'
xyz2pipe -in lp/shit%03d.ft2 -z -verb -par pipe.net -cpu $1 \
| nmrPipe -fn LP -fb -ord 10 -x1 2 -xn 28 -pred 32 -fix -fixMode 1 -after \
| nmrPipe -fn SP -off 0.5 -end 0.98 -pow 2 -size 60 -c 0.5 \
| nmrPipe -fn ZF -size 128 \
| nmrPipe -fn FT \
| nmrPipe -fn PS -p0 0 -p1 0 -di \
| pipe2xyz -out lp/shit%03d.ft3 -z
waitFor -par pipe.net -ext part1 -cpu $1 -verb
Now do the inverse FT on the 2nd dimension and the LP:
#!/bin/csh
# parallel3.com NMRPipe Data Processing Script
echo 'Processing F3/F1 dimensions first'
hostname
waitFor -par pipe.net -init -ext part1 -cpu $1 -verb
echo 'Processing F2'
xyz2pipe -in lp/shit%03d.ft2 -z -verb -par pipe.net -cpu $1 \
| nmrPipe -fn LP -fb -ord 10 -x1 2 -xn 28 -pred 32 -fix -fixMode 1 -after \
| nmrPipe -fn SP -off 0.5 -end 0.98 -pow 2 -size 60 -c 0.5 \
| nmrPipe -fn ZF -size 128 \
| nmrPipe -fn FT \
| nmrPipe -fn PS -p0 0 -p1 0 -di \
| pipe2xyz -out lp/shit%03d.ft3 -z
waitFor -par pipe.net -ext part1 -cpu $1 -verb
Due to I/O time, I find the first stage (the direct-dimension FT) is best done on one machine; only the LP steps are worth running over many CPUs.
To run these three files I make a csh script such as the one below:
nmrShell -rsh ssh -par pipe.net -sh parallel1.com -path /nfs/home/osbornem/shitter > shit
nmrShell -rsh ssh -par pipe.net -sh parallel2.com -path /nfs/home/osbornem/shitter > shit1
nmrShell -rsh ssh -par pipe.net -sh parallel3.com -path /nfs/home/osbornem/shitter > shit2
./parallel4.com > shit3
echo "COMPLETED"
You can download example scripts here.
Note the use of nmrShell to launch the parallel processing. The final script, parallel4.com, is run on a single processor.
Usage of nmrShell is given below:
nmrShell -rsh ssh -par pipe.net -sh nmrproc.com -path /FULLPATHNAME -log
e.g if your script is in
/nfs/home/osbornem/nmrpipe
nmrShell -rsh ssh -par pipe.net -sh nmrproc.com -path /nfs/home/osbornem/nmrpipe -log
Some pointers
Launch the processing scripts with nmrShell, using the command shown above.
Do not use any of the verbose options in NMRPipe (-v, -verb), as displaying text over NFS will really slow things down.
Your script will be identical to the single-CPU version, except that you add -par pipe.net -cpu $1 to the line containing the input (xyz2pipe).
Benchmarks:
Running over 12 CPUs.
Running a double LP of an HCCH-TOCSY using the scripts shown above:
| Number of CPUs | Time | Increased performance |
|---|---|---|
| 1 | ~35 minutes | |
| 12 | ~8.5 minutes | ~4-fold |
CNS
CNS cannot (to my knowledge, anyway) be parallelized. I guess you can submit a number of input files with differing starting trajectories. Nonetheless, CNS has been optimized (32-bit) by Savoir-faire Linux, and it is pretty fast.
Check out some benchmark data here, and graphs that compare our system with others.
Benchmarks for test suite programs (time in seconds)

| Test suite program | SGI 600 MHz | | 1.8 GHz Athlon | 2.8 GHz Xeon |
|---|---|---|---|---|
| general/alternate.inp | 0.4 | 0.06 | 0.1 | 0.1 |
| general/bfactor_plot.inp | 0.4 | 0.14 | 0.4 | 0.3 |
| general/buried_surface.inp | 5.5 | 2.34 | 4.9 | 4.4 |
| general/cis_peptide.inp | 30.3 | 11.48 | 64.7 | 55.6 |
| general/cns_to_o.inp | 1.0 | 0.38 | 0.7 | 0.7 |
| general/contact.inp | 105.6 | 39.17 | 219.6 | 189.4 |
| general/delete_atoms.inp | 0.3 | 0.05 | 0.1 | 0.1 |
| general/delta_phipsi.inp | 2.7 | 0.94 | 4.9 | 4.0 |
| general/difference_distance.inp | 25.3 | 8 | 27.0 | 22.2 |
| general/fractional_transform.inp | 0.7 | 0.28 | 0.7 | 0.6 |
| general/generate.inp | 2.5 | 0.89 | 3.9 | 3.3 |
| general/generate_easy.inp | 42.8 | 15.61 | 84.3 | 71.3 |
| general/generate_seq.inp | 1.0 | 0.3 | 0.9 | 0.8 |
| general/get_ncs_matrices.inp | 0.6 | 0.22 | 0.4 | 0.3 |
| general/hydrogen_bonds.inp | 20.4 | 7.63 | 39.2 | 33.1 |
| general/merge_structures.inp | 1.5 | 0.42 | 1.4 | 1.1 |
| general/model_anneal.inp | 38.4 | 17.69 | 29.5 | 24.2 |
| general/model_mask.inp | 1.0 | 0.41 | 0.8 | 0.8 |
| general/model_minimize.inp | 8.5 | 3.84 | 6.6 | 5.4 |
| general/model_rigid.inp | 2.0 | 0.86 | 1.8 | 1.4 |
| general/molecule_extent.inp | 0.7 | 0.27 | 1.1 | 0.9 |
| general/mtf_to_psf.inp | 0.1 | 0.01 | 0.0 | 0.0 |
| general/neighbours.inp | 10.6 | 3.9 | 20.1 | 17.5 |
| general/psf_to_mtf.inp | 0.3 | 0.03 | 0.1 | 0.0 |
| general/ramachandran.inp | 2.3 | 0.82 | 4.5 | 3.7 |
| general/realspace_transform.inp | 1.3 | 0.51 | 1.1 | 1.0 |
| general/rename_segid.inp | 2.1 | 0.32 | 0.7 | 0.6 |
| general/rms_fit.inp | 0.1 | 0.04 | 0.1 | 0.0 |
| general/rmsd.inp | 0.5 | 0.15 | 0.5 | 0.4 |
| general/shift_molecules.inp | 92.1 | 34.38 | 162.0 | 139.7 |
| general/split_structure.inp | 7.2 | 1.98 | 2.8 | 2.4 |
| general/surface_plot.inp | 0.7 | 0.25 | 0.8 | 0.6 |
| nmr_calc/accept.inp | 411.0 | 181.60 | 275.6 | 381.7 |
| nmr_calc/anneal.inp | 3972.3 | 1520.820 | 2563.7 | 2216.4 |
| nmr_calc/anneal_1.inp | 1881.7 | 635.28 | 1208.4 | 982.9 |
| nmr_calc/anneal_2.inp | 2765.2 | 1032.07 | 2004.5 | 1482.3 |
| nmr_calc/anneal_3.inp | 1110.3 | 450.29 | 870.0 | 686.0 |
| nmr_calc/anneal_cv.inp | 10758.3 | 3911.98 | 7117.1 | 5429.7 |
| nmr_calc/dg_sa.inp | 1886.7 | 627.41 | 1259.0 | 947.5 |
| nmr_calc/dg_sa_1.inp | 334.7 | 171.42 | 680.2 | 333.0 |
| nmr_calc/ensemble.inp | 1289.0 | 457.7 | 917.6 | 818.8 |
| nmr_calc/ensemble_1.inp | 860.2 | 267.75 | 548.5 | 550.2 |
| nmr_calc/ensemble_cv.inp | 3827.9 | 1365.79 | 2783.8 | 2475.1 |
| nmr_calc/ensemble_cv_1.inp | 8660.1 | 2513.48 | 5309.9 | 5512.9 |
| nmr_calc/generate_extended.inp | 115.8 | 19.01 | 37.6 | 49.3 |
| nmr_calc/generate_extended_1.inp | 71.4 | 21.68 | 25.1 | 20.5 |
| nmr_calc/pmrefine.inp | 875.9 | 441.89 | 770.0 | 809.2 |
| nmr_calc/pmrefine_1.inp | 598.3 | 285.54 | 444.6 | 474.9 |
| nmr_calc/rmsd_pm.inp | 8.2 | 2.990 | 12.2 | 10.6 |
| nmr_calc/rmsd_pm_1.inp | 8.5 | 3.17 | 11.7 | 10.3 |
| xtal_mr/cross_rotation.inp | 1737.6 | 871.04 | 1434.2 | 1643.0 |
| xtal_mr/cross_rotation_1.inp | 541.0 | 175.13 | 544.3 | 393.1 |
| xtal_mr/self_rotation.inp | 65.5 | 21.2 | 67.2 | 50.6 |
| xtal_mr/translation.inp | 224.8 | 106.31 | 219.9 | 234.6 |
| xtal_mr/translation_1.inp | 971.4 | 493.4 | 868.2 | 983.3 |
| xtal_mr/translation_2.inp | 202.7 | 94.23 | 206.0 | 212.2 |
| xtal_patterson/heavy_search.inp | 413.3 | | 283.6 | 364.3 |
| xtal_patterson/patterson_map.inp | 17.6 | | 18.7 | 16.7 |
| xtal_patterson/patterson_map_1.inp | 20.5 | | 20.7 | 18.5 |
| xtal_patterson/patterson_refine.inp | 22.7 | | 20.7 | 20.6 |
| xtal_patterson/predict_patterson.inp | 1.7 | | 1.7 | 1.6 |
| xtal_phase/cns_to_sdb.inp | 0.1 | | 0.0 | 0.0 |
| xtal_phase/delete_sites.inp | 0.1 | | 0.1 | 0.1 |
| xtal_phase/density_modify.inp | 36.2 | | 29.6 | 26.8 |
| xtal_phase/density_modify_1.inp | 430.4 | | 319.4 | 250.2 |
| xtal_phase/flip_sites.inp | 0.7 | | 0.8 | 0.7 |
| xtal_phase/generate_sdb.inp | 0.1 | | 0.1 | 0.1 |
| xtal_phase/ir_phase.inp | 97.1 | | 68.2 | 81.8 |
| xtal_phase/ir_phase_1.inp | 334.2 | | 375.6 | 290.1 |
| xtal_phase/ir_phase_2.inp | 175.9 | | 120.6 | 151.8 |
| xtal_phase/ir_phase_3.inp | 312.6 | | 365.1 | 281.4 |
| xtal_phase/mad_bijvoet_ave.inp | 15.8 | | 16.7 | 15.0 |
| xtal_phase/mad_phase.inp | 2331.5 | | 1305.0 | 1814.2 |
| xtal_phase/mad_phase_1.inp | 227.0 | | 155.6 | 199.7 |
| xtal_phase/mad_phase_2.inp | 207.7 | | 145.2 | 182.8 |
| xtal_phase/optimize_ncsop.inp | 380.5 | | 243.6 | 176.0 |
| xtal_phase/pdb_to_sdb.inp | 0.1 | | 0.1 | 0.1 |
| xtal_phase/sdb_manipulate.inp | 0.5 | | 0.6 | 0.5 |
| xtal_phase/sdb_split.inp | 0.4 | | 0.5 | 0.3 |
| xtal_phase/sdb_to_pdb.inp | 0.2 | | 0.2 | 0.2 |
| xtal_phase/sdb_to_sdb.inp | 0.2 | | 0.2 | 0.2 |
| xtal_phase/shift_sites.inp | 3.6 | | 4.0 | 3.5 |
| xtal_phase/solvent_mask.inp | 6.8 | | 5.6 | 4.8 |
| xtal_phase/solvent_mask_1.inp | 6.8 | | 6.0 | 5.0 |
| xtal_refine/anneal.inp | 177.6 | | 155.3 | 160.1 |
| xtal_refine/anneal_1.inp | 102.0 | | 102.4 | 80.5 |
| xtal_refine/anneal_2.inp | 27.6 | | 25.5 | 24.1 |
| xtal_refine/bdomain.inp | 10.3 | | 8.6 | 9.9 |
| xtal_refine/bgroup.inp | 11.3 | | 9.3 | 10.7 |
| xtal_refine/bindividual.inp | 11.1 | | 9.2 | 10.5 |
| xtal_refine/composite_omit_map.inp | 1543.5 | | 1423.7 | 1430.4 |
| xtal_refine/fo-fo_map.inp | 5.8 | | 5.2 | 4.8 |
| xtal_refine/fp_fdp_group.inp | 4.4 | | 3.8 | 4.1 |
| xtal_refine/map_cover.inp | 2.6 | | 2.6 | 2.3 |
| xtal_refine/minimize.inp | 60.0 | | 49.1 | 58.8 |
| xtal_refine/model_map.inp | 4.7 | | 4.0 | 3.4 |
| xtal_refine/model_map_1.inp | 4.4 | | 3.9 | 3.3 |
| xtal_refine/model_stats.inp | 18.6 | | 24.3 | 21.9 |
| xtal_refine/ncs_average_map.inp | 245.3 | | 218.7 | 199.0 |
| xtal_refine/ncs_average_map_1.inp | 131.7 | | 118.0 | 100.8 |
| xtal_refine/optimize_average.inp | 2.1 | | 2.0 | 1.7 |
| xtal_refine/optimize_rweight.inp | 43.0 | | 34.4 | 41.0 |
| xtal_refine/optimize_wa.inp | 641.6 | | 552.4 | 594.9 |
| xtal_refine/qgroup.inp | 4.5 | | 3.9 | 4.1 |
| xtal_refine/qindividual.inp | 10.6 | | 8.7 | 10.0 |
| xtal_refine/refine.inp | 52.5 | | 50.4 | 46.3 |
| xtal_refine/rigid.inp | 8.2 | | 6.9 | 7.8 |
| xtal_refine/sa_omit_map.inp | 49.8 | | 44.3 | 43.7 |
| xtal_refine/shift_solvent.inp | 11.8 | | 13.2 | 11.0 |
| xtal_refine/water_delete.inp | 5.8 | | 4.5 | 4.0 |
| xtal_refine/water_pick.inp | 49.7 | | 37.0 | 43.1 |
| xtal_twin/anneal_twin.inp | 3252.3 | | 2855.5 | 2414.2 |
| xtal_twin/bdomain_twin.inp | 148.7 | | 130.8 | 112.1 |
| xtal_twin/bgroup_twin.inp | 152.7 | | 133.6 | 116.9 |
| xtal_twin/bindividual_twin.inp | 153.0 | | 136.0 | 117.6 |
| xtal_twin/detect_twinning.inp | 10.2 | | 6.8 | 5.8 |
| xtal_twin/detwin_partial.inp | 9.3 | | 6.3 | 5.6 |
| xtal_twin/detwin_perfect.inp | 20.7 | | 16.5 | 14.0 |
| xtal_twin/make_cv_twin.inp | 4.5 | | 3.8 | 3.4 |
| xtal_twin/minimize_twin.inp | 921.6 | | 822.9 | 703.3 |
| xtal_twin/model_map_twin.inp | 57.9 | | 41.1 | 32.8 |
| xtal_twin/model_stats_twin.inp | 109.0 | | 134.6 | 119.0 |
| xtal_twin/rigid_twin.inp | 109.0 | | 97.1 | 84.2 |
| xtal_twin/twin_fraction.inp | 16.7 | | 13.9 | 12.0 |
| xtal_twin/water_pick_twin.inp | 626.6 | | 580.4 | 483.3 |
| xtal_util/analyse.inp | 53.2 | | 40.9 | 35.0 |
| xtal_util/analyse_1.inp | 15.4 | | 12.2 | 10.7 |
| xtal_util/average_friedels.inp | 12.7 | | 12.9 | 11.7 |
| xtal_util/average_map.inp | 9.5 | | 6.7 | 5.1 |
| xtal_util/combine.inp | 5.5 | | 5.8 | 5.2 |
| xtal_util/flip_friedels.inp | 17.4 | | 17.3 | 15.5 |
| xtal_util/fourier_map.inp | 28.7 | | 30.9 | 27.9 |
| xtal_util/fourier_map_1.inp | 38.3 | | 36.5 | 34.1 |
| xtal_util/hlcoeff_blur.inp | 2.1 | | 2.1 | 2.0 |
| xtal_util/make_cv.inp | 1.3 | | 1.3 | 1.2 |
| xtal_util/make_hlcoeff.inp | 2.1 | | 2.1 | 2.0 |
| xtal_util/manipulate.inp | 13.6 | | 13.3 | 11.9 |
| xtal_util/manipulate_1.inp | 15.7 | | 14.1 | 12.7 |
| xtal_util/mask_map.inp | 3.6 | | 2.8 | 2.3 |
| xtal_util/matthews_coef.inp | 0.5 | | 0.5 | 0.5 |
| xtal_util/merge.inp | 23.0 | | 21.6 | 19.2 |
| xtal_util/merge_1.inp | 23.0 | | 21.7 | 19.2 |
| xtal_util/merge_2.inp | 13.5 | | 13.5 | 12.0 |
| xtal_util/model_fcalc.inp | 1.1 | | 0.9 | 0.8 |
| xtal_util/model_phase.inp | 2.6 | | 2.3 | 2.0 |
| xtal_util/scale.inp | 25.8 | | 21.9 | 21.0 |
| xtal_util/transform_map.inp | 5.9 | | 4.0 | 3.0 |
CYANA
After a lot of messing around this finally works.
To run on multiple processors use the mpirun command, e.g.:
mpirun -np 16 cyana CALC.cya
You can specify which nodes to use with the n syntax (mpirun n0,1,2,3,7 or mpirun n1-5, etc.).
Soon I will try to get this working via SGE.
Cyana Benchmarks
Running an automated structure calculation (i.e. no NOEs are known beforehand; they are automatically assigned by CYANA):
| Number of CPUs | Time | Increased performance |
|---|---|---|
| 1 | 273 minutes | |
| 14 | ~25 minutes | ~10-fold |
HADDOCK
To set up HADDOCK, source the script haddock_configure at /nfs/home/osbornem/local/haddock/haddock_configure.
e.g. type the following at your Linux prompt:
source /nfs/home/osbornem/local/haddock/haddock_configure
Then type haddock1.3 in the directory containing the new.html file; this is enough to set up your run directories.
To run the run.cns script on multiple machines, USE THE SGE SETUP detailed here, which will essentially perform the following changes to your run.cns script. Failure to use SGE may result in your calculation being stopped prematurely:
{===>} queue_1="csh";
{===>} cns_exe_1="/nfs/home/osbornem/local/aria_1.2/cns_solve_1.1/linux_opteron32_gnu/bin/cns";
{===>} cpunumber_1=1;
{===>} queue_2="ssh station04 csh";
{===>} cns_exe_2="/nfs/home/osbornem/local/aria_1.2/cns_solve_1.1/linux_opteron32_gnu/bin/cns";
{===>} cpunumber_2=2;
Note that for HADDOCK you can run up to 14 CPUs at a time.
Benchmarks
Running an example script producing 1000 structures in it0, 200 in it1 and 200 water-refined structures.
| Number of CPUs | Time | Increased performance |
|---|---|---|
| 1 | 5 days 4 hours (128 hours) | |
| 12 | 11 hrs 29 minutes | ~11-fold |