
1. Introduction

A lot of XSEDE users request allocations on SDSC Trestles and Gordon because we are one of two XSEDE sites (the other being PSC, with Blacklight) with a Gaussian license that permits XSEDE users to use the software. I’ve found that many start-up allocations for Gaussian involve users who have never used a batch system, a remote resource, or even a command line. In the interest of providing a very quick crash course for such users, here are my notes on making the jump from Gaussian on a PC or workstation to Gaussian on an SDSC XSEDE resource.

This guide assumes the reader has never used a batch system, an XSEDE resource, or the Linux command line. Since Trestles gets most of the new Gaussian users on XSEDE, I will assume the reader is using that system. Instructions for using Gordon are quite similar.

2. Logging in

The XSEDE User Portal has a guide to getting started, and it covers all the options about which most users will want to know. All those options can be confusing at first, so for the sake of keeping it as simple as possible, I’ll lay out every step.

  1. Go to the XSEDE User Portal
  2. Log in with your XSEDE username and password. If you do not have an XSEDE User Portal account, you will have to create one and then get your project PI (your supervisor) to add that account to your group’s project.
  3. Under the “MY XSEDE” tab, click the “Accounts” option.
  4. Scroll down to Trestles and click the link under the “Login Name” column. This will take you to the GSI-SSH Terminal Java applet which will take a while to load, then dump you at a black screen with white text.

This black screen should look something like this:

Rocks 5.4 (Maverick)
Profile built 10:09 13-Jan-2012

Kickstarted 10:17 13-Jan-2012
trestles Login Node
---------------------------------------------------------------------------
Welcome to the SDSC Trestles Appro Cluster

Trestles User Guide: http://www.sdsc.edu/us/resources/trestles
Questions: email help@xsede.org 
---------------------------------------------------------------------------

[username@trestles-login2 ~]$ 

This is the Linux terminal, and the last line is your prompt which lists your username, your current machine (trestles-login1 or trestles-login2), your current directory (~ is an abbreviation for your home directory), and a dollar sign ($) which means you are logged in as a regular (not administrative) user.

Typographic conventions hold that commands you are supposed to type at the command prompt (also called “the shell”) be preceded by a $ to represent the shell prompt. So, if this guide says to issue the following command:

$ pwd

you don’t actually type the dollar sign. It’s just there to tell you to type the pwd command in the Linux shell. I will also forgo the black background in my samples below. You know what your terminal looks like.

3. Getting Permission and Loading Gaussian

Because Gaussian requires a license to use, new login accounts must be given permission to run Gaussian before they can actually use it. Chances are you will need to request this permission by sending an email to help@xsede.org. Once your request is processed, it may take a few hours for the changes to take effect. If you want to check to see if you can run Gaussian, you can use the groups command:

$ groups
rut100 gaussian

If you do not see gaussian listed in the output of this command, you will not be able to run Gaussian!
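
The same check can be scripted so you do not have to read the group list by eye. This is only a minimal sketch: it assumes the license group is literally named gaussian, as in the sample output above, and in_group is an illustrative helper name, not a command provided on Trestles.

```shell
#!/bin/bash
# in_group GROUP [LIST]: succeed if GROUP appears as a whole word in LIST.
# LIST defaults to the current user's groups as reported by `id -nG`.
# (in_group is a hypothetical helper written for this guide.)
in_group() {
    local wanted="$1"
    local list="${2:-$(id -nG)}"
    local g
    for g in $list; do
        if [ "$g" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

if in_group gaussian; then
    echo "gaussian group found: you can run Gaussian"
else
    echo "gaussian group missing: email help@xsede.org to request access"
fi
```

Matching whole words matters here, because a plain substring search would also accept an unrelated group whose name merely contains "gaussian".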

Once your account has been enabled for Gaussian, you can load its associated module with this command:

$ module load gaussian

This will give you access to Gaussian commands like formchk, unfchk, and of course g09. However, do not skip ahead and just start running g09! If you do, you will make a lot of other users upset and you will get a sternly worded e-mail from me or one of my colleagues.

4. Gaussian Job Setup

At this point I assume you have a Gaussian job you want to run, and it consists of the following files on your personal computer:

  • input.com - your Gaussian input file
  • molecule.chk - a Gaussian checkpoint file containing the data for the molecule you want to simulate. If you are starting from scratch, the coordinates of your nuclei will be in your .com file and you will not have this checkpoint file.

4.1. Creating a job directory

The first thing you want to do is create a directory in which you want all this simulation’s data to reside. Do

$ mkdir job1

to create a directory called job1. To then go into that directory,

$ cd job1

4.2. Transferring files to the cluster

Now you need to transfer your Gaussian input files from your computer to the cluster. The easiest way to do that is using a program like WinSCP (Windows) or FileZilla (Windows or Mac) that allows you to drag-and-drop files from your personal computer to any XSEDE resource. Connecting to the cluster will show your local files in the left pane and your home directory on the cluster in the right pane. Double-click the job1 folder in the right pane, then drag-and-drop your Gaussian job files into it.

Back in your terminal session, you should be able to type the ls command and see the files you just uploaded.

$ ls
input.com  molecule.chk

4.3. Setting up the queue script

Up until now, the steps have been very generic and can be used by any user to get started on Trestles. However, to actually run jobs on Trestles, Gordon, or any other XSEDE supercomputer, you will have to interact with the batch system, which is really what distinguishes using a shared supercomputer from using your personal computer.

At SDSC we use the Torque Resource Manager, which provides a number of commands (e.g., qsub, qstat, qdel, and qalter), and running your simulation through the batch system requires a queue script to “glue” together the inner workings of Torque and Gaussian.

You can name these queue scripts whatever you want, but I like to give them all the extension .qsub. So, you will have to create a file called g09job.qsub using a command-line text editor. The nano editor is perhaps the easiest to use. Issue this command:

$ nano g09job.qsub

to create and edit a file called g09job.qsub. You will see a screen like this:

  GNU nano 1.3.12             File: g09job.qsub                                 







^G Get Help  ^O WriteOut  ^R Read File ^Y Prev Page ^K Cut Text  ^C Cur Pos
^X Exit      ^J Justify   ^W Where Is  ^V Next Page ^U UnCut Text^T To Spell

Some common nano commands are shown at the bottom: ctrl+x exits, ctrl+w searches, etc. You will need to paste the following lines into this new g09job.qsub file:

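#!/bin/bash
# Torque directives: queue, cores, and maximum run time
#PBS -q shared
#PBS -l nodes=1:ppn=16
#PBS -l walltime=02:30:00

# Make the module command available in the batch shell, then load Gaussian
. /etc/profile.d/modules.sh
module load gaussian

# Run from the directory where qsub was issued
cd "$PBS_O_WORKDIR"
# Point Gaussian's temporary files at a per-job scratch directory
export GAUSS_SCRDIR=/scratch/${USER}/${PBS_JOBID}
g09 < input.com > output.txt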

Now exit nano (ctrl+x). When asked “Save modified buffer (ANSWERING "No" WILL DESTROY CHANGES) ?”, press y and then Enter to save your changes. This is the absolute bare minimum queue script you will need to run a Gaussian job, and for now, there are only two important lines. The first one is

#PBS -l nodes=1:ppn=16

which tells Torque that your job will require one node and sixteen CPU cores on that node. You will then have to modify your Gaussian input file, input.com, to actually use these sixteen cores. Open up that input.com file in nano and make sure the following Link 0 commands (the lines beginning with %) are present above the route section (the line beginning with #):

%chk=molecule.chk
%nproc=16
%mem=31GB
#p m062x/6-31+g(d) td=(Root=1,NStates=1) Freq NoSymm

The %nproc option tells Gaussian to use 16 cores, which must be the same as what your queue script requests. The %mem option specifies how much memory Gaussian can use. On Trestles, a maximum of 2 GB is available per core, so a sixteen-core job can have at most 32 GB. It is good practice not to request this absolute maximum, since the operating system and other system programs on the node also need some memory, which is why the example above asks for 31 GB.
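
If you change the number of cores, the %mem value should change with it. The arithmetic can be sketched in the shell as follows. The 2 GB per core figure comes from the text above; the 1 GB of headroom is my assumption chosen to reproduce the 31 GB used in this example, not an official recommendation.

```shell
#!/bin/bash
# Values from this guide: 16 cores requested, 2 GB of memory per core.
cores=16
gb_per_core=2
# Assumption: leave 1 GB for the operating system and system programs.
headroom_gb=1

mem_gb=$(( cores * gb_per_core - headroom_gb ))

# Print the two Link 0 lines that belong at the top of input.com.
echo "%nproc=${cores}"
echo "%mem=${mem_gb}GB"
```

Setting cores=8 in this sketch would give %mem=15GB, which you would pair with ppn=8 in the queue script.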

Everything else in our Gaussian input file can remain unchanged.

The second important line of our g09job.qsub queue script is

#PBS -l walltime=02:30:00

which says that your job needs up to two and a half hours to complete. If you know your job takes less time you can change that to, say,

#PBS -l walltime=00:15:00

for fifteen minutes.
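
Because the core count has to agree between the queue script and the Gaussian input file, it can be worth checking both before you submit. The following is a rough sketch rather than an SDSC-provided tool: check_cores is a hypothetical helper name, and it assumes the ppn= and %nproc lines are written the way they appear in this guide.

```shell
#!/bin/bash
# check_cores QSUB_FILE COM_FILE
# Compare ppn in the queue script with %nproc in the Gaussian input file.
# (check_cores is a hypothetical helper written for this guide.)
check_cores() {
    local ppn nproc
    # First ppn=N found on a #PBS line of the queue script.
    ppn=$(grep '^#PBS' "$1" | grep -o 'ppn=[0-9]*' | head -n 1 | cut -d= -f2)
    # First %nproc... line of the input file (Link 0 is case-insensitive).
    nproc=$(grep -i '^%nproc' "$2" | head -n 1 | cut -d= -f2 | tr -d '[:space:]')

    if [ -n "$ppn" ] && [ "$ppn" = "$nproc" ]; then
        echo "OK: both request $ppn cores"
    else
        echo "MISMATCH: ppn=${ppn:-unset} but %nproc=${nproc:-unset}"
        return 1
    fi
}

# Usage, from inside your job directory:
#   check_cores g09job.qsub input.com
```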

5. Running Gaussian

Even with your input file set up, you cannot run Gaussian directly. Unlike a workstation where you can just use the g09 command, Trestles (and all modern supercomputers) requires you to submit your job to a batch system that schedules and launches everyone’s jobs in a fair manner.

You will have to submit jobs to Trestles using the qsub command and a job submission script which contains more Linux terminal commands to be executed on one of the compute nodes. We’ve created g09job.qsub above, so first type ls to ensure that input.com, molecule.chk (if you had a checkpoint file you transferred from your personal computer), and g09job.qsub are in your job directory.

Once you’ve got all your files together, actually running your Gaussian simulation is the simple matter of using the qsub command:

$ qsub g09job.qsub

The job may sit in the queue for a while, and you can check its status by typing qstat -u username. The second-to-last column (labeled “S”) is the job state. Q means it’s in the queue, R means it is running, and C means the job has finished.
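
If you have trouble remembering the state letters, a small helper can translate them. This is only an illustrative sketch: describe_state is a name I made up, and it covers just the three states described above.

```shell
#!/bin/bash
# describe_state LETTER: translate the "S" column of the job listing
# into plain words. (describe_state is a hypothetical helper.)
describe_state() {
    case "$1" in
        Q) echo "queued" ;;
        R) echo "running" ;;
        C) echo "completed" ;;
        *) echo "other state: $1" ;;
    esac
}

# On the cluster, the state letter is the second-to-last field of each job
# line, so it could be pulled out of the listing with something like:
#   qstat -u "$USER" | awk '$1 ~ /^[0-9]/ {print $1, $(NF-1)}'
describe_state Q   # prints "queued"
```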

Once your job finishes, you should have a new file called output.txt which you can view using cat, edit using nano, and download to your computer using the XSEDE File Manager.
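
Before downloading anything, it is worth confirming that the calculation actually ran to completion. Gaussian ends a successful run with a line containing "Normal termination", so a quick check might look like the sketch below. The helper name finished_ok is hypothetical, and looking at only the last five lines of the file is my own choice.

```shell
#!/bin/bash
# finished_ok FILE: succeed if the tail of a Gaussian output file reports
# normal termination. (finished_ok is a hypothetical helper.)
finished_ok() {
    tail -n 5 "$1" | grep -q "Normal termination"
}

# Only run the check if the job has produced an output file.
if [ -f output.txt ]; then
    if finished_ok output.txt; then
        echo "output.txt: job finished normally"
    else
        echo "output.txt: no normal termination found, check the end of the file"
    fi
fi
```

If the check fails, the last few lines of output.txt (for example via tail -n 20 output.txt) usually say where Gaussian stopped.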

You can find a few more Gaussian submit scripts in the GitHub repository for Trestles or the GitHub repository for Gordon.