Linux Clusters

  1. Information

  2. Architecture

  3. Running applications over many hosts

    1. Generating ssh keys

    2. Available workstations

    3. ARIA

        1. Version 1.1 set up

        2. Version 1.2 set up

        3. Multiple Machines

        4. Benchmarks

    4. XPLOR

        1. Multiple Machines

        2. Benchmarks

    5. CNS

        1. Multiple machines

        2. Benchmarks

    6. CYANA

        1. Multiple Machines

        2. Benchmarks

    7. HADDOCK

      1. Multiple machines

      2. Benchmarks

    8. NMRPIPE

      1. Multiple Machines

      2. Benchmarks

    9. Using SGE to submit jobs


Information

We have 4 dual AMD Opteron processor blades (servers) and 6 IBM workstations, each running dual AMD Opteron processors. In total there are 20 CPUs to which jobs can be submitted.

The two servers consist of a primary (server01) and a secondary (server02). The secondary constantly mirrors the primary (it is updated every second). If for any reason the primary server crashes, the secondary server takes over (hopefully you will not notice a thing, unless you were running a job on the server).

The other two blades are simply used as computational powerhouses; this is mainly due to space restrictions in the Linux room. These should be used if you want to submit large jobs.

Finally, we have 6 IBM workstations; two of these will have dual monitors to help with visualization for NMR analysis. Again, you can use these if you wish to distribute a big job over many nodes.

We insist that all long jobs are submitted via the SGE job queuing system.
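For example, a long calculation can be wrapped in a small script and handed to SGE with qsub. A minimal sketch (the script, directory and command names below are only placeholders):

#!/bin/csh
# longjob.csh - illustrative wrapper around a long calculation
cd /nfs/home/username/myproject
./my_long_calculation > calculation.log

Submit it and check on it with:

qsub -cwd longjob.csh
qstat -u username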

Architecture

We use NFS, which allows you to access your home directory and applications from any machine on the cluster.

When you are given a username, your home directory will be

/nfs/home/username

Software available to everyone is stored under /nfs/usr/


In fact both your home directory and the software applications are located on the server. This can cause a major speed problem for programmes that use a lot of I/O, e.g. nmrPipe. To run these programmes it is best to have the data stored locally on your workstation (/usr/local/scratch); for details on this see below.


Disk space

The NFS system has 160 GB of space.

In addition, each workstation has 80 GB of local space. At present, however, the system is set up to delete any data that has been on the local disk for more than 15 days, so if you do something locally (e.g. processing), make sure you copy it over to your home directory. I'm not sure this is something I want to continue, but we shall see how it goes.
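As a rough sketch of the local-scratch workflow (the project name is just an example, and the scratch area is assumed to be the /usr/local/scratch directory mentioned above):

# copy the raw data from your NFS home directory to the local scratch disk
mkdir -p /usr/local/scratch/$USER/myproject
cp -r /nfs/home/$USER/myproject/fid /usr/local/scratch/$USER/myproject/
cd /usr/local/scratch/$USER/myproject

# ... run the I/O-heavy processing here ...

# copy the results back before the 15-day purge removes them
cp -r /usr/local/scratch/$USER/myproject /nfs/home/$USER/myproject_processed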


Running applications over many hosts

Using ssh keys

Most processes we use require logging in to other machines via ssh. It is important to set up your ssh configuration so that you can log in automatically without having to type a password. To do this, perform the following steps:

cd ~/.ssh

ssh-keygen -t dsa

cp id_dsa.pub authorized_keys

Now, to ensure that you can only log on from within the cluster, you need to edit the authorized_keys file. At the very beginning you need to insert:

from="*.iric.udem"

so it should look something like

from="*.iric.udem" ssh-dss AAAA

Finally, ssh to all the machines. You will be asked for the password the first time; thereafter it should log in automatically.
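A quick way to do this is to loop over the machine names (listed below) and run a trivial command on each; a minimal sketch in csh:

#!/bin/csh
# log in to each machine once so the host key is accepted and password-less login is confirmed
foreach host (server01 server02 pnode01 pnode02 station01 station02 station03 station04 station05 station06)
    ssh $host hostname
end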



The names of the machines are listed below.

In my view the cluster (multiple machines) should be used for the following applications:

XPLOR
ARIA


You can also try NMRPipe, but this is really slow over the network (check my stats on this below) because of the excessive I/O access. Since NFS copies data from server01 to server02 every second, and nmrPipe writes and reads data a lot quicker than that, you can see the problem.

My advice is to run nmrPipe with the data located locally in the /scratch directory.



Workstations and servers

The workstations and servers/pnodes are named as follows:

server01
server02
pnode01
pnode02
station01
station02
station03
station04
station05
station06

ARIA

ARIA1.1.2

To set up, add the following commands to your .bashrc:

export PYTHONPATH=/nfs/home/osbornem/local/aria1.1.2:$PYTHONPATH

alias aria='/usr/bin/python2 /nfs/home/osbornem/local/aria1.1.2/Aria/RunAria.py'

and to your .cshrc:

setenv PYTHONPATH /nfs/home/osbornem/local/aria1.1.2:$PYTHONPATH

alias aria '/usr/bin/python /nfs/home/osbornem/local/aria1.1.2/Aria/RunAria.py'



ARIA1.2

To set up add the following to your .bashrc

export PYTHONPATH=/nfs/home/osbornem/local/aria_1.2:$PYTHONPATH

alias aria='/usr/bin/python2 /nfs/home/osbornem/local/aria_1.2/Aria/RunAria.py'


and your .cshrc:

setenv PYTHONPATH /nfs/home/osbornem/local/aria_1.2:$PYTHONPATH

alias aria '/usr/bin/python /nfs/home/osbornem/local/aria_1.2/Aria/RunAria.py'



To run on multiple machines alter the run.cns script as follows:

{===>} queue_1="csh";

{===>} cns_exe_1="/nfs/home/osbornem/local/aria1.1.2/cns_solve_1.1/linux_opteron32_gnu/bin/cns";

{===>} cpunumber_1=1;


{===>} queue_2="ssh station04 csh";

{===>} cns_exe_2="/nfs/home/osbornem/local/aria1.1.2/cns_solve_1.1/linux_opteron32_gnu/bin/cns";

{===>} cpunumber_2=2;


{===>} queue_3="ssh station03 csh";

{===>} cns_exe_3="/nfs/home/osbornem/local/aria1.1.2/cns_solve_1.1/linux_opteron32_gnu/bin/cns";

{===>} cpunumber_3=2;


However, note that this is not necessary if you run ARIA via SGE. I have scripts that will do this automatically for you, so PLEASE USE THE SGE SYSTEM. Users not doing this run the risk of their process being stopped.

See here for info on SGE and ARIA.

A queue entry of just "csh" runs on the current workstation; the cpunumber entry is the number of CPUs to use on that station.

Note that the maximum number of processors is 16, but as you calculate 20 structures, using 10 CPUs will take the same time as using 16.


Alternatively, you can edit the appropriate forms for data conversion (top) or for run.cns (see bottom) using the forms on our server:

aria1.1

aria1.2


OR from the server at the Pasteur Institute:

Aria1.1

Aria1.2





Some Benchmarks:

Number of CPUs | Time | Increase in performance
1 | 9 hours 4 minutes | N/A
10 | 1 hour 15 minutes | ~ 9-fold
16 | 1 hour 13 minutes | ~ 9-fold


These are based on calculating 20 structures per iteration (8 iterations in total), since ARIA parallelizes over the structures within an iteration.

Why is there no performance gain from 10 to 16 CPUs? Because the next iteration cannot continue until all 20 structures have been computed. Thus if we could use 20 CPUs the time would probably be halved; alternatively, calculating 16 structures per iteration would give a similar halving of the time.

BOTTOM LINE:

For maximum performance use only 10 CPUs (assuming no-one else is using them).



XPLOR

Running Xplor on multiple machines:


xplor -parallel -machines XX.file -o oz.out < xplor.inp



where XX.file contains the names of the hosts,

e.g.
station02
station03
station04
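Putting this together, a minimal session might look like the following (the hosts file and input script names are only placeholders):

cat > allhosts.list << EOF
station02
station03
station04
EOF

xplor -parallel -machines allhosts.list -o anneal.out < anneal.inp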

If you have old XPLOR scripts you wish to use in parallel then add the following syntax (remember the total number of structures should be greater than or equal to the number of machines specified):

1) Place these lines at the beginning of the script:

!total number of structures to calculate:

eval ($numStructs = 100)

!random seed:

eval ($randomSeed = 785)

! get parallel info

cpyth "from os import environ as env"

cpyth "xplor.command('eval ($proc_num=%s)' % env['XPLOR_PROCESS'])"

cpyth "xplor.command('eval ($num_procs=%s)' % env['XPLOR_NUM_PROCESSES'])"

eval ($firstStruct = ($proc_num * $numStructs) / $num_procs)

eval ($lastStruct = (($proc_num+1) * $numStructs) / $num_procs)


2) Then, the structure loop should be modified to:


eval ($count = $firstStruct)

while ($count <= $lastStruct) loop structure

eval ($seed = $randomSeed+$count)

set seed $seed end

.

.

evaluate ($file = "1gb1_" + encode($count) + ".pdb")

write coor output= $file end

eval ($count = $count + 1)

end loop structure



Within the Python interface, parallelization is automatic if one uses the StructureLoop class from the simulationTools module.



Benchmarks:

Running over 6 CPUs


Invoking the following command to calculate 50 structures:

/nfs/usr/xplor-nih-2.10/bin/xplor -parallel -machines ./allhosts.list -o xplor_par < anneal1_par.inp


Number of CPUs | Time | Increase in performance
1 | ~ 17 hours | -
6 | ~ 3 hours | ~ 6-fold


NMRPipe


If you have a 2D, or a 3D without too much linear prediction (LP), then it is probably best to store the data locally on /scratch and run on a single CPU.

However, for a 3D or 4D that would normally take a long time, you can run over many machines.

1. First you need to have a file called pipe.net describing the CPUs to use, e.g.:

station01.iric.udem 10

station01.iric.udem 10

station02.iric.udem 10

station02.iric.udem 10

station03.iric.udem 10

station03.iric.udem 10

station04.iric.udem 10

station04.iric.udem 10

station06.iric.udem 10

station06.iric.udem 10

station05.iric.udem 10



  2. Now split your processing script into separate files for each transform:

Here is an example of a double-LP scheme using parallel processing. NB: the parallel-specific parts are the waitFor commands and the "-par pipe.net -cpu $1" options.

This is the FT of the first and second dimensions, with no LP:

#!/bin/csh
# parallel1.com NMRPipe Data Processing Script


echo 'Transforming 3-dimensional NMR data!'

echo 'Processing F3/F1 dimensions first'


hostname

waitFor -par pipe.net -init -ext part1 -cpu $1 -verb



xyz2pipe -in ./data/test%03d.fid -x -verb -par pipe.net -cpu $1 \

| nmrPipe -fn SOL -mode 2 -fl 64 -fs 2 -poly \

| nmrPipe -fn CBF -last 12 \

| nmrPipe -fn SP -off 0.5 -end 0.98 -pow 2 -size 512 -c 0.5 \

| nmrPipe -fn ZF -size 1024 \

| nmrPipe -fn FT \

| nmrPipe -fn PS -p0 -66 -p1 0 -di \

| nmrPipe -fn TP \

| nmrPipe -fn SP -off 0.5 -end 0.95 -pow 2 -size 60 -c 0.5 \

| nmrPipe -fn ZF -size 256 \

| nmrPipe -fn FT \

| nmrPipe -fn PS -p0 0 -p1 0 -di \

| nmrPipe -fn TP \

| pipe2xyz -out lp/shit%03d.ft2 -x

waitFor -par pipe.net -ext part1 -cpu $1 -verb

Now FT the third dimension with LP:

#!/bin/csh
# parallel2.com NMRPipe Data Processing Script

hostname

waitFor -par pipe.net -init -ext part1 -cpu $1 -verb



echo 'Processing F2'

xyz2pipe -in lp/shit%03d.ft2 -z -verb -par pipe.net -cpu $1 \

| nmrPipe -fn LP -fb -ord 10 -x1 2 -xn 28 -pred 32 -fix -fixMode 1 -after \

| nmrPipe -fn SP -off 0.5 -end 0.98 -pow 2 -size 60 -c 0.5 \

| nmrPipe -fn ZF -size 128 \

| nmrPipe -fn FT \

| nmrPipe -fn PS -p0 0 -p1 0 -di \

| pipe2xyz -out lp/shit%03d.ft3 -z

waitFor -par pipe.net -ext part1 -cpu $1 -verb


Now do an inverse FT and LP on the second dimension:

#!/bin/csh
# parallel3.com NMRPipe Data Processing Script


echo 'Processing F3/F1 dimensions first'


hostname

waitFor -par pipe.net -init -ext part1 -cpu $1 -verb


echo 'Processing F2'

xyz2pipe -in lp/shit%03d.ft2 -z -verb -par pipe.net -cpu $1 \

| nmrPipe -fn LP -fb -ord 10 -x1 2 -xn 28 -pred 32 -fix -fixMode 1 -after \

| nmrPipe -fn SP -off 0.5 -end 0.98 -pow 2 -size 60 -c 0.5 \

| nmrPipe -fn ZF -size 128 \

| nmrPipe -fn FT \

| nmrPipe -fn PS -p0 0 -p1 0 -di \

| pipe2xyz -out lp/shit%03d.ft3 -z

waitFor -par pipe.net -ext part1 -cpu $1 -verb



Due to the I/O time, I find the first stage (the direct-dimension FT) is best done on one machine; only the LP steps are worth spreading over many CPUs.

To run these three files I make a csh script such as the one below:

#!/bin/csh
nmrShell -rsh ssh -par pipe.net -sh parallel1.com -path /nfs/home/osbornem/shitter > shit

nmrShell -rsh ssh -par pipe.net -sh parallel2.com -path /nfs/home/osbornem/shitter > shit1

nmrShell -rsh ssh -par pipe.net -sh parallel3.com -path /nfs/home/osbornem/shitter > shit2

./parallel4.com > shit3

echo "COMPLETED"





YOU CAN DOWNLOAD EXAMPLE SCRIPTS here

Note the use of nmrShell to launch the parallel processing. The final script, parallel4.com, is run on a single processor.

Usage of nmrShell is given below

nmrShell -rsh ssh -par pipe.net -sh nmrproc.com -path /FULLPATHNAME -log
e.g. if your script is in /nfs/home/osbornem/nmrpipe:

nmrShell -rsh ssh -par pipe.net -sh nmrproc.com -path /nfs/home/osbornem/nmrpipe -log


Some pointers

The best way to run these scripts is with nmrShell, using the command shown above.

Do not use any of the verbose options in NMRPipe (-v, -verb), as displaying text over NFS really slows things down.

Your script is identical to the single-CPU version, except that the line containing the input gains the "-par pipe.net -cpu $1" options (and the script is bracketed by waitFor commands, as in the examples above).
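For example, a serial transform line such as

xyz2pipe -in ./data/test%03d.fid -x \

becomes, in the parallel version of the script:

xyz2pipe -in ./data/test%03d.fid -x -par pipe.net -cpu $1 \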

Benchmarks:

Running a double LP of an HCCH-TOCSY over multiple CPUs using the scripts shown above.


Number of CPUs | Time | Increase in performance
1 | ~ 35 minutes | -
12 | ~ 8.5 minutes | ~ 4-fold


CNS

CNS cannot (to my knowledge anyway) be parallelized. I guess you can submit a number of input files with differing starting trajectories. Nonetheless, CNS has been optimised (32-bit) by Savoir-faire Linux, and it is pretty fast.

Check out some benchmark data and graphs here, comparing our system with others.

Benchmarks for test suite programmes (time in seconds)

Test suite program | SGI 600 MHz R14000 IP30 | AMD Opteron (32-bit, optimised by Savoir-faire Linux) | 1.8 GHz Athlon (Biowulf cluster) | 2.8 GHz Xeon (Biowulf cluster)

(a dash indicates that no time was listed for that test)

general/alternate.inp | 0.4 | 0.06 | 0.1 | 0.1
general/bfactor_plot.inp | 0.4 | 0.14 | 0.4 | 0.3
general/buried_surface.inp | 5.5 | 2.34 | 4.9 | 4.4
general/cis_peptide.inp | 30.3 | 11.48 | 64.7 | 55.6
general/cns_to_o.inp | 1.0 | 0.38 | 0.7 | 0.7
general/contact.inp | 105.6 | 39.17 | 219.6 | 189.4
general/delete_atoms.inp | 0.3 | 0.05 | 0.1 | 0.1
general/delta_phipsi.inp | 2.7 | 0.94 | 4.9 | 4.0
general/difference_distance.inp | 25.3 | 8 | 27.0 | 22.2
general/fractional_transform.inp | 0.7 | 0.28 | 0.7 | 0.6
general/generate.inp | 2.5 | 0.89 | 3.9 | 3.3
general/generate_easy.inp | 42.8 | 15.61 | 84.3 | 71.3
general/generate_seq.inp | 1.0 | 0.3 | 0.9 | 0.8
general/get_ncs_matrices.inp | 0.6 | 0.22 | 0.4 | 0.3
general/hydrogen_bonds.inp | 20.4 | 7.63 | 39.2 | 33.1
general/merge_structures.inp | 1.5 | 0.42 | 1.4 | 1.1
general/model_anneal.inp | 38.4 | 17.69 | 29.5 | 24.2
general/model_mask.inp | 1.0 | 0.41 | 0.8 | 0.8
general/model_minimize.inp | 8.5 | 3.84 | 6.6 | 5.4
general/model_rigid.inp | 2.0 | 0.86 | 1.8 | 1.4
general/molecule_extent.inp | 0.7 | 0.27 | 1.1 | 0.9
general/mtf_to_psf.inp | 0.1 | 0.01 | 0.0 | 0.0
general/neighbours.inp | 10.6 | 3.9 | 20.1 | 17.5
general/psf_to_mtf.inp | 0.3 | 0.03 | 0.1 | 0.0
general/ramachandran.inp | 2.3 | 0.82 | 4.5 | 3.7
general/realspace_transform.inp | 1.3 | 0.51 | 1.1 | 1.0
general/rename_segid.inp | 2.1 | 0.32 | 0.7 | 0.6
general/rms_fit.inp | 0.1 | 0.04 | 0.1 | 0.0
general/rmsd.inp | 0.5 | 0.15 | 0.5 | 0.4
general/shift_molecules.inp | 92.1 | 34.38 | 162.0 | 139.7
general/split_structure.inp | 7.2 | 1.98 | 2.8 | 2.4
general/surface_plot.inp | 0.7 | 0.25 | 0.8 | 0.6

nmr_calc/accept.inp | 411.0 | 181.60 | 275.6 | 381.7
nmr_calc/anneal.inp | 3972.3 | 1520.820 | 2563.7 | 2216.4
nmr_calc/anneal_1.inp | 1881.7 | 635.28 | 1208.4 | 982.9
nmr_calc/anneal_2.inp | 2765.2 | 1032.07 | 2004.5 | 1482.3
nmr_calc/anneal_3.inp | 1110.3 | 450.29 | 870.0 | 686.0
nmr_calc/anneal_cv.inp | 10758.3 | 3911.98 | 7117.1 | 5429.7
nmr_calc/dg_sa.inp | 1886.7 | 627.41 | 1259.0 | 947.5
nmr_calc/dg_sa_1.inp | 334.7 | 171.42 | 680.2 | 333.0
nmr_calc/ensemble.inp | 1289.0 | 457.7 | 917.6 | 818.8
nmr_calc/ensemble_1.inp | 860.2 | 267.75 | 548.5 | 550.2
nmr_calc/ensemble_cv.inp | 3827.9 | 1365.79 | 2783.8 | 2475.1
nmr_calc/ensemble_cv_1.inp | 8660.1 | 2513.48 | 5309.9 | 5512.9
nmr_calc/generate_extended.inp | 115.8 | 19.01 | 37.6 | 49.3
nmr_calc/generate_extended_1.inp | 71.4 | 21.68 | 25.1 | 20.5
nmr_calc/pmrefine.inp | 875.9 | 441.89 | 770.0 | 809.2
nmr_calc/pmrefine_1.inp | 598.3 | 285.54 | 444.6 | 474.9
nmr_calc/rmsd_pm.inp | 8.2 | 2.990 | 12.2 | 10.6
nmr_calc/rmsd_pm_1.inp | 8.5 | 3.17 | 11.7 | 10.3

xtal_mr/cross_rotation.inp | 1737.6 | 871.04 | 1434.2 | 1643.0
xtal_mr/cross_rotation_1.inp | 541.0 | 175.13 | 544.3 | 393.1
xtal_mr/self_rotation.inp | 65.5 | 21.2 | 67.2 | 50.6
xtal_mr/translation.inp | 224.8 | 106.31 | 219.9 | 234.6
xtal_mr/translation_1.inp | 971.4 | 493.4 | 868.2 | 983.3
xtal_mr/translation_2.inp | 202.7 | 94.23 | 206.0 | 212.2

xtal_patterson/heavy_search.inp | 413.3 | - | 283.6 | 364.3
xtal_patterson/patterson_map.inp | 17.6 | - | 18.7 | 16.7
xtal_patterson/patterson_map_1.inp | 20.5 | - | 20.7 | 18.5
xtal_patterson/patterson_refine.inp | 22.7 | - | 20.7 | 20.6
xtal_patterson/predict_patterson.inp | 1.7 | - | 1.7 | 1.6

xtal_phase/cns_to_sdb.inp | 0.1 | - | 0.0 | 0.0
xtal_phase/delete_sites.inp | 0.1 | - | 0.1 | 0.1
xtal_phase/density_modify.inp | 36.2 | - | 29.6 | 26.8
xtal_phase/density_modify_1.inp | 430.4 | - | 319.4 | 250.2
xtal_phase/flip_sites.inp | 0.7 | - | 0.8 | 0.7
xtal_phase/generate_sdb.inp | 0.1 | - | 0.1 | 0.1
xtal_phase/ir_phase.inp | 97.1 | - | 68.2 | 81.8
xtal_phase/ir_phase_1.inp | 334.2 | - | 375.6 | 290.1
xtal_phase/ir_phase_2.inp | 175.9 | - | 120.6 | 151.8
xtal_phase/ir_phase_3.inp | 312.6 | - | 365.1 | 281.4
xtal_phase/mad_bijvoet_ave.inp | 15.8 | - | 16.7 | 15.0
xtal_phase/mad_phase.inp | 2331.5 | - | 1305.0 | 1814.2
xtal_phase/mad_phase_1.inp | 227.0 | - | 155.6 | 199.7
xtal_phase/mad_phase_2.inp | 207.7 | - | 145.2 | 182.8
xtal_phase/optimize_ncsop.inp | 380.5 | - | 243.6 | 176.0
xtal_phase/pdb_to_sdb.inp | 0.1 | - | 0.1 | 0.1
xtal_phase/sdb_manipulate.inp | 0.5 | - | 0.6 | 0.5
xtal_phase/sdb_split.inp | 0.4 | - | 0.5 | 0.3
xtal_phase/sdb_to_pdb.inp | 0.2 | - | 0.2 | 0.2
xtal_phase/sdb_to_sdb.inp | 0.2 | - | 0.2 | 0.2
xtal_phase/shift_sites.inp | 3.6 | - | 4.0 | 3.5
xtal_phase/solvent_mask.inp | 6.8 | - | 5.6 | 4.8
xtal_phase/solvent_mask_1.inp | 6.8 | - | 6.0 | 5.0

xtal_refine/anneal.inp | 177.6 | - | 155.3 | 160.1
xtal_refine/anneal_1.inp | 102.0 | - | 102.4 | 80.5
xtal_refine/anneal_2.inp | 27.6 | - | 25.5 | 24.1
xtal_refine/bdomain.inp | 10.3 | - | 8.6 | 9.9
xtal_refine/bgroup.inp | 11.3 | - | 9.3 | 10.7
xtal_refine/bindividual.inp | 11.1 | - | 9.2 | 10.5
xtal_refine/composite_omit_map.inp | 1543.5 | - | 1423.7 | 1430.4
xtal_refine/fo-fo_map.inp | 5.8 | - | 5.2 | 4.8
xtal_refine/fp_fdp_group.inp | 4.4 | - | 3.8 | 4.1
xtal_refine/map_cover.inp | 2.6 | - | 2.6 | 2.3
xtal_refine/minimize.inp | 60.0 | - | 49.1 | 58.8
xtal_refine/model_map.inp | 4.7 | - | 4.0 | 3.4
xtal_refine/model_map_1.inp | 4.4 | - | 3.9 | 3.3
xtal_refine/model_stats.inp | 18.6 | - | 24.3 | 21.9
xtal_refine/ncs_average_map.inp | 245.3 | - | 218.7 | 199.0
xtal_refine/ncs_average_map_1.inp | 131.7 | - | 118.0 | 100.8
xtal_refine/optimize_average.inp | 2.1 | - | 2.0 | 1.7
xtal_refine/optimize_rweight.inp | 43.0 | - | 34.4 | 41.0
xtal_refine/optimize_wa.inp | 641.6 | - | 552.4 | 594.9
xtal_refine/qgroup.inp | 4.5 | - | 3.9 | 4.1
xtal_refine/qindividual.inp | 10.6 | - | 8.7 | 10.0
xtal_refine/refine.inp | 52.5 | - | 50.4 | 46.3
xtal_refine/rigid.inp | 8.2 | - | 6.9 | 7.8
xtal_refine/sa_omit_map.inp | 49.8 | - | 44.3 | 43.7
xtal_refine/shift_solvent.inp | 11.8 | - | 13.2 | 11.0
xtal_refine/water_delete.inp | 5.8 | - | 4.5 | 4.0
xtal_refine/water_pick.inp | 49.7 | - | 37.0 | 43.1

xtal_twin/anneal_twin.inp | 3252.3 | - | 2855.5 | 2414.2
xtal_twin/bdomain_twin.inp | 148.7 | - | 130.8 | 112.1
xtal_twin/bgroup_twin.inp | 152.7 | - | 133.6 | 116.9
xtal_twin/bindividual_twin.inp | 153.0 | - | 136.0 | 117.6
xtal_twin/detect_twinning.inp | 10.2 | - | 6.8 | 5.8
xtal_twin/detwin_partial.inp | 9.3 | - | 6.3 | 5.6
xtal_twin/detwin_perfect.inp | 20.7 | - | 16.5 | 14.0
xtal_twin/make_cv_twin.inp | 4.5 | - | 3.8 | 3.4
xtal_twin/minimize_twin.inp | 921.6 | - | 822.9 | 703.3
xtal_twin/model_map_twin.inp | 57.9 | - | 41.1 | 32.8
xtal_twin/model_stats_twin.inp | 109.0 | - | 134.6 | 119.0
xtal_twin/rigid_twin.inp | 109.0 | - | 97.1 | 84.2
xtal_twin/twin_fraction.inp | 16.7 | - | 13.9 | 12.0
xtal_twin/water_pick_twin.inp | 626.6 | - | 580.4 | 483.3

xtal_util/analyse.inp | 53.2 | - | 40.9 | 35.0
xtal_util/analyse_1.inp | 15.4 | - | 12.2 | 10.7
xtal_util/average_friedels.inp | 12.7 | - | 12.9 | 11.7
xtal_util/average_map.inp | 9.5 | - | 6.7 | 5.1
xtal_util/combine.inp | 5.5 | - | 5.8 | 5.2
xtal_util/flip_friedels.inp | 17.4 | - | 17.3 | 15.5
xtal_util/fourier_map.inp | 28.7 | - | 30.9 | 27.9
xtal_util/fourier_map_1.inp | 38.3 | - | 36.5 | 34.1
xtal_util/hlcoeff_blur.inp | 2.1 | - | 2.1 | 2.0
xtal_util/make_cv.inp | 1.3 | - | 1.3 | 1.2
xtal_util/make_hlcoeff.inp | 2.1 | - | 2.1 | 2.0
xtal_util/manipulate.inp | 13.6 | - | 13.3 | 11.9
xtal_util/manipulate_1.inp | 15.7 | - | 14.1 | 12.7
xtal_util/mask_map.inp | 3.6 | - | 2.8 | 2.3
xtal_util/matthews_coef.inp | 0.5 | - | 0.5 | 0.5
xtal_util/merge.inp | 23.0 | - | 21.6 | 19.2
xtal_util/merge_1.inp | 23.0 | - | 21.7 | 19.2
xtal_util/merge_2.inp | 13.5 | - | 13.5 | 12.0
xtal_util/model_fcalc.inp | 1.1 | - | 0.9 | 0.8
xtal_util/model_phase.inp | 2.6 | - | 2.3 | 2.0
xtal_util/scale.inp | 25.8 | - | 21.9 | 21.0
xtal_util/transform_map.inp | 5.9 | - | 4.0 | 3.0




CYANA

After a lot of messing around this finally works.

To run on multiple processors use the mpirun command, e.g.:

mpirun -np 16 cyana CALC.cya

You can specify which nodes to use with the n syntax (e.g. mpirun n0,1,2,3,7 or mpirun n1-5, etc.).

Soon I will try to get this working via SGE.

Cyana Benchmarks

Running an automated structure calculation (i.e. no NOEs are pre-assigned; they are automatically assigned by CYANA).

Number of CPUs | Time | Increase in performance
1 | 273 minutes | -
14 | ~ 25 minutes | ~ 10-fold






HADDOCK

To set up HADDOCK you can source the script haddock_configure at /nfs/home/osbornem/local/haddock/haddock_configure.

e.g. type the following at your Linux prompt:

source /nfs/home/osbornem/local/haddock/haddock_configure

Then type haddock1.3 in the directory containing the new.html file; this is enough to set up your run directories.
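In practice the set-up boils down to something like the following (the run directory name is only an example):

source /nfs/home/osbornem/local/haddock/haddock_configure
cd ~/my_docking_run    # the directory containing your edited new.html
haddock1.3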

To run the run.cns script on multiple machines USE THE SGE SETUP detailed here, which will essentially make the following changes to your run.cns script. Failure to use SGE may result in your calculation being stopped prematurely.

{===>} queue_1="csh";

{===>} cns_exe_1="/nfs/home/osbornem/local/aria_1.2/cns_solve_1.1/linux_opteron32_gnu/bin/cns";

{===>} cpunumber_1=1;


{===>} queue_2="ssh station04 csh";

{===>} cns_exe_2="/nfs/home/osbornem/local/aria_1.2/cns_solve_1.1/linux_opteron32_gnu/bin/cns";

{===>} cpunumber_2=2;



Note that for HADDOCK you can run on up to 14 CPUs at a time.

Benchmarks

Running an example script producing 1000 structures in it0, 200 in it1 and 200 water-refined structures.

Number of CPUs | Time | Increase in performance
1 | 5 days 4 hours (128 hours) | -
12 | 11 hours 29 minutes | ~ 11-fold