Frequently Asked Questions
Table of Contents
- 1. Logging In
- How do I change my password?
- I forgot my PIN. What do I do?
- After entering my password and my PIN, I am still unable to get a ticket. What else can be wrong?
- Why did the computer ask for my Passcode twice, and then fail to log me in?
- I want to install the latest Kerberos software. Where do I get it?
- How do I register my CAC with pIE?
- How can I view my account usage and remaining allocation?
- How do I login to an ARL DSRC machine using Kerberos on a PC?
- I get the following error when getting a ticket: "kinit - Preauthentication failed while getting initial credentials". How do I solve this?
- What do I do when I get: "Can't send request (send_to_kdc)"?
- How do I run ktelnet when I receive the message: "Kerberos! KDC can't fulfill requested option Kerberos V5: error getting forwarded credentials."?
- Why am I getting the error message "clock skew too great" when requesting a Kerberos ticket using the Windows client?
- What firewall settings should I have set locally to access the ARL DSRC?
- My login node response time is sluggish? How do I find a less busy node?
- What local shell variables are automatically defined in my login environment?
- How do I log into an ARL DSRC machine using my YubiKey?
- 2. Machine Configuration
- What is the configuration of your SGI Altix ICE Cluster (Harold)?
- What is the configuration of your IBM iDataPlex (Pershing)?
- 3. Manuals
- Where can I find the Harold User Guide?
- Where can I find the Pershing User Guide?
- Where can I find the Utility Server User Guide?
- Where can I find the PBS User Guide?
- Where can I find the Modules User Guide?
- 4. SSH
- What do I do when I receive a "Host Key Verification Error"?
- 5. Job Failures
- Why is my job not finding my input file and crashing?
- Why am I getting a "Segmentation Violation" error in my job?
- 6. COTS Software
- I am trying to request a piece of software that is not available but which has a price for downloading and licensing. What is the process for software of this kind?
- What software is available on each system at ARL DSRC?
- Where can I find an example script for running a COTS package?
- How can modules help me?
- How do I transfer my data files between the scratch and archive file systems for a batch job if the archive file system is not accessible from the compute nodes?
- How can I transfer my ARL data to another DSRC?
- I can't see the stdout/stderr from my job until after the job completes. Is there any way to check on this output during the run?
- How can I check on the license usage status of COTS packages?
- How do I check the status of the PBS batch queues?
- 7. Compilers and MPI Suites
- What compilers are available on the Linux Clusters?
- What MPI Suites are available on the Linux Clusters?
- 8. Miscellaneous
- How do I remove characters caused by a FTP session from a PC to a Unix System?
- Where should I send back my broken or unwanted SecurID card?
- How can I reach the helpdesk?
- How do I change permission to a file or a directory?
- What is wrong with my backspace key?
- What shells are available?
- How do I use the Advanced Reservation Server to reserve nodes for future jobs?
- Where can I find instructions for downloading and using the Utility Server SRD software?
1. Logging In
Q. How do I change my password?
A.Use the Kerberos kpasswd command on your local host or on HPC servers. The UNIX command passwd is not valid on systems using Kerberos for secure login. Windows users may also use the "Change Password" button when using KRB5/kinit.
Q. I forgot my PIN. What do I do?
A. Call the CCAC Helpdesk (877-222-2039). They will clear your PIN and set it to the first 5 digits of the next token generated by your SecurID card.
Q. After entering my password and my PIN, I am still unable to get a ticket. What else can be wrong?
A.Several things could cause your password and PIN to stop working. These things include a card that is out of sync, system outages, your card not working, or multiple incorrect login attempts. The quickest way to resolve the problem is to contact the CCAC helpdesk so the card can be checked.
Q. Why did the computer ask for my Passcode twice, and then fail to log me in?
A. This is a symptom of repeated passcode mistakes while trying to log in. After two attempts, the system enters "Next Token Mode." This means that you must enter your regular passcode AND a verification passcode to gain access. When you get a second passcode prompt, allow the SecurID card to cycle to the next random number, and enter that number at the second passcode prompt. For the second passcode, enter the number directly off the display; the second passcode will fail if you enter your PIN a second time.
Many users get frustrated trying to get out of Next Token Mode. The CCAC Helpdesk can immediately clear the Next Token Mode, so if you fail to get past it several times in succession, just call and we will assist you.
Q. I want to install the latest Kerberos software. Where do I get it?
A.The location of the official DoD HPCMP Kerberos software is at http://www.hpcmo.hpc.mil/security/kerberos/. Read and click "OK" on the Government notice and consent banner. Then click the "Software" link on the left side of the page. DO NOT attempt to PKI or Kerberos login to the site.
For detailed download, installation, configuration, and usage instructions for Kerberos, please see the KERBEROS/PKINIT/hToken/SecurID section of the CCAC Documentation page.
Q. How do I register my CAC with pIE?
A. Perform the following steps to register your CAC with pIE:
- Go to https://ieapp.erdc.hpc.mil
- Log in with your Kerberos principal, Kerberos password, and SecurID passcode
- Once logged in, expand the User Information Environment tree and click "Register My Common Access Card (CAC)"
- Click "OK", "approve", and "register" when prompted.
It usually takes a day for someone to input the CAC information into the Kerberos KDC. You may or may not receive a notice after this has been done.
- When obtaining a ticket, in Windows>Kinit or Krb5, don't enter a password in the password box. Just click "login" and you'll be prompted for your CAC PIN. In Linux/Mac, issue the command kshell, then pkinit, and enter your CAC PIN when prompted.
Q. How can I view my account usage and remaining allocation?
A.The command "show_usage" is available on all computational systems and reports allocations and usage. This output is displayed at login, but users can run the command: /usr/local/bin/show_usage at any time to see the output.
Q. How do I login to an ARL DSRC machine using Kerberos on a PC?
A.Run KRB5.EXE
- Enter your HPCMP userid
- Enter your HPCMP Kerberos password
- Enter HPCMP.HPC.MIL (in all caps) as the realm
- Enter your PIN on your SecurID card to generate a numeric passcode, and then enter the passcode in the "challenge" pop-up box. If you were successful, you will get a green Kerberos ticket that is valid for 4 hours. By default, KRB5 shows the ticket as valid for 10 hours, which is incorrect. To change this setting, open KRB5, click "File", then "Options...", change "Ticket lifetime" from 600 to 240 (minutes), and click "OK".
- Open PuTTY and enter your connection information. Replace the username with your userid, and replace Harold-l4 with whichever login node you would like (Harold-l1 through Harold-l7). Then name the connection (for instance, Harold) under "Saved Sessions" and click "Save". From then on, double-click the saved session name to connect to that machine.
Q. I get the following error when getting a ticket: "kinit - Preauthentication failed while getting initial credentials". How do I solve this?
A.For example:
smith@blue> kinit smith@HPCMP.HPC.MIL
Password for smith@HPCMP.HPC.MIL:
Passcode:
kinit: Preauthentication failed while getting initial credentials
There are three possible causes:
- A bad password.
- A bad passcode. This could be due to an incorrect PIN or a malfunctioning SecurID card.
- The time on your machine has drifted, and your Kerberos configuration does not set kdc_timesync = 1 and ccache_type = 4. If this is the case, check your configuration file and adjust your system clock.
Try to kinit a couple times to make sure you have entered the "correct" password and passcode. If you are still having the same problem, please send an email to help@ccac.hpc.mil.
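The configuration check in the third cause can be scripted. The following is a minimal sketch, not an official tool: the function name check_krb5_timesync is hypothetical, and it assumes the two settings appear on their own lines in the file you pass to it (krb5.conf or krb5.ini).

```shell
# Report which of the two settings named above are absent from a Kerberos
# configuration file. The function name is illustrative only.
# Usage: check_krb5_timesync /path/to/krb5.conf
check_krb5_timesync() {
    conf="$1"
    missing=""
    for setting in "kdc_timesync = 1" "ccache_type = 4"; do
        key=${setting%% =*}      # text before " ="
        value=${setting##*= }    # text after "= "
        # Accept any amount of whitespace around the equals sign.
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*${value}[[:space:]]*$" "$conf"; then
            missing="$missing $key"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}
```

If the function reports a missing setting, add it to your configuration file and correct your system clock before running kinit again.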
Q. What do I do when I get: "Can't send request (send_to_kdc)"?
- Your system can't get the correct address for the Kerberos server of your principal's realm. If you are a new Kerberos user, check your Kerberos configuration file (krb5.ini or krb5.conf), in particular the default_realm setting under [libdefaults] and the entries under [realms] and [domain_realm].
- The /etc/services file is missing the Kerberos information. Add these lines to your /etc/services (in Windows NT it is located in WinNT4)
#Kerberos Services
klogin          543/tcp         # Kerberos authenticated rlogin
kshell          544/tcp cmd     # and remote shell
eklogin         2105/tcp        # Kerberos encrypted rlogin
kerberos        88/udp kdc      # Kerberos authentication--udp
kerberos        88/tcp kdc      # Kerberos authentication--tcp
kerberos-sec    750/udp         # Kerberos authentication--udp
kerberos-sec    750/tcp         # Kerberos authentication--tcp
kerberos_master 751/udp         # Kerberos authentication
kerberos_master 751/tcp         # Kerberos authentication
kerberos_adm    752/tcp         # Kerberos 5 admin/changepw
passwd_server   752/udp         # Kerberos passwd server
kpop            1109/tcp        # Pop with Kerberos
kftp            765/tcp         # Kerberized ftp
krb_prop        754/tcp         # Kerberos slave propagation
securid         1024/udp        # SecurID
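To see which of the service names above are absent before editing the file, a short check such as the following may help. It is a sketch under the assumption that each entry starts at the beginning of a line; the function name missing_kerberos_services is hypothetical.

```shell
# List the Kerberos-related service names from the entries above that a
# services file lacks. Defaults to /etc/services when no file is given.
missing_kerberos_services() {
    services_file="${1:-/etc/services}"
    for name in klogin kshell eklogin kerberos kerberos-sec kerberos_master \
                kerberos_adm passwd_server kpop kftp krb_prop securid; do
        # A service entry starts with the name followed by whitespace.
        grep -Eq "^${name}[[:space:]]" "$services_file" || echo "$name"
    done
}
```

Running missing_kerberos_services with no arguments prints one missing name per line and prints nothing when every entry is present.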
Q. How do I run ktelnet when I receive the message: "Kerberos! KDC can't fulfill requested option Kerberos V5: error getting forwarded credentials."?
A. The kinit did not give you a "forwardable" ticket. You can verify this with the "klist -f" command, which lists each ticket's flags; a forwardable ticket shows the flag "F". On Windows, check the forwardable option in the kinit dialog box.
To avoid the problem, run kinit with the "-f" flag or run ktelnet with the "--noforward" flag. On Windows, you can run ktelnet with the Forward credential box unchecked.
Q. Why am I getting the error message "clock skew too great" when requesting a Kerberos ticket using the Windows client?
A.This indicates that the clock on your computer has the wrong time.
"Clock skew" is the range of time allowed for a server to accept Kerberos authenticators from a client. In order for Kerberos authentication to work, your Windows client and the Kerberos server's time need to be within 5 minutes of each other. If they are too far off you will receive the "clock skew too great" error message and you will not be able to get a Kerberos ticket.
To resolve this issue you must manually set the clock on your system to the correct time.
To avoid this problem in the future there are several free time synchronization programs available to use under Windows. Here are a couple of easy to use programs for keeping the time current on your Windows system:
- Atomic Clock Sync: http://www.worldtimeserver.com/atomic-clock/
- Atomic Time Synchronizer: http://www.lmhsoft.com/timesync/
They both have install/uninstallers and require no knowledge of NTP (Network Time Protocol).
If you are running Windows XP, Vista or Windows 7 you will need to have administrator privileges in order to set your time, either manually or using an automated program.
For administrators of Windows XP, Vista, or Windows 7 systems, there is an NT port of the Unix NTP code. This might be useful if your site already has NTP servers set up and you want to sync your Domain controller to them (and have your clients sync to the Domain controller using Microsoft utilities): http://www.five-ten-sg.com/util/ntp4172.zip.
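On Linux or Mac clients, the 5-minute rule described above can be expressed as a simple comparison of two timestamps. This is a minimal sketch: the function name within_clock_skew is hypothetical, and how you obtain a trusted reference time (for example from your site's time server) is left as an assumption.

```shell
# Succeeds when two epoch timestamps (in seconds) are within the Kerberos
# tolerance described above (5 minutes = 300 seconds by default).
# Arguments: local time, reference time, optional maximum skew in seconds.
within_clock_skew() {
    local_epoch="$1"
    reference_epoch="$2"
    max_skew="${3:-300}"
    diff=$((local_epoch - reference_epoch))
    if [ "$diff" -lt 0 ]; then
        diff=$((0 - diff))
    fi
    [ "$diff" -le "$max_skew" ]
}
```

For example, within_clock_skew "$(date +%s)" "$reference" succeeds when the local clock is close enough to the reference for a ticket request to be accepted.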
Q. What firewall settings should I have set locally to access the ARL DSRC?
A.If there is a Firewall between your machine using Kerberos and the ARL DSRC and you are unable to connect, provide the following information to your firewall administrator:
Different Kerberos clients sometimes contact different ports for the same services. Kerberos servers know how to respond to the various clients. Random client ports usually run from 1024 to 65535, but some ssh clients use privileged ports 1023, 1022, 1021, ... for each successive simultaneous ssh connection.
A site should open all the ports listed below:
| Service | TCP/UDP | Server Port | Client Port |
|---|---|---|---|
| kinit/krb5.exe | tcp | 88 | random |
| kinit/krb5.exe | udp | 88 | 88 |
| kpasswd | tcp | 749 | random |
| kdc-a.afrl.hpc.mil | tcp | 749 | random |
| kshell/rcp/rsh | tcp | 544 | random |
| kshell/rcp/rsh | tcp | 1023,1022,... | random |
| encrypted rlogin -x | tcp | 2105 | random |
A site should also open the ports for Kerberos-enabled telnet and kftp. These use the standard telnet and ftp ports:
| Service | TCP/UDP | Server Port | Client Port |
|---|---|---|---|
| telnet | tcp&udp | 23 | random |
| ftp data | tcp | random | random |
The ARL DSRC hosts accept ssh from ssh clients that know how to use Kerberos credentials (Unix and Windows versions are available on the DoD HPCMP Kerberos Web site, http://www.hpcmo.hpc.mil/security/kerberos/). SSH should then be used to tunnel X11 sessions securely as well as for regular ssh connections.
A site should also open this port:
| Service | TCP/UDP | Server Port | Client Port |
|---|---|---|---|
| ssh | tcp | 22 | random |
DoD networks may block standard X11 ports now or in the near future. These ports generally start at 6000 and work up for additional X displays. One DoD suggestion is to block ports often used by X11 from 6000 tcp/udp to 6063 tcp/udp. This would have an adverse effect on most tcp/udp protocols. All tcp/udp protocols choose random ports from the range 1024-65535. When they get a failure, they often increment the port number by 1. So a valid process unrelated to X11 that happens to choose random port number 6000 could have to retry 64 times before getting an unblocked port number.
If the site permits X11, and X11 tunneled via SSH is not available, open the following ports (note, here the server is on the machine where the X11 display is running - at the user-site rather than at the ARL DSRC):
| Service | TCP/UDP | Server Port | Client Port |
|---|---|---|---|
| X11 | tcp | 6000 | random |
| X11 | tcp | 6001 | random |
| X11 | tcp | 6002 | random |
| X11 | tcp | 6003 | random |
(Allow the most common X11 ports. Permit numbers greater than 6003 if firewall/filter logs show valid traffic getting blocked.)
These ports may be used by some Kerberos clients. Only open these if filter/firewall logs show that valid traffic to these ports is getting blocked:
| Service | TCP/UDP | Server Port | Client Port |
|---|---|---|---|
| kftp | tcp | 765 | random |
| ftp-data | tcp | 20 | random |
The following port is only needed if your site has a client which talks directly to a SecurID server. Kerberos clients talk to the Kerberos server port even if they require SecurID as part of the exchange. Unblocking the SecurID port is only necessary for unusual access where SecurID is used without also using Kerberos. All ARL DSRC system access requires Kerberos. Only open this port if filter/firewall logs show that valid traffic to it is getting blocked:
| Service | TCP/UDP | Server Port | Client Port |
|---|---|---|---|
| securid | udp | 1024 | random |
Example filter configuration for a Cisco router. These filters would be installed at a user site and applied to traffic coming inbound.
Without comments, the configuration looks like this:
access-list 101 permit tcp any eq 88 any
access-list 101 permit udp any eq 88 any
access-list 101 permit tcp any eq 749 any established
access-list 101 permit tcp any eq 544 any established
access-list 101 permit tcp 140.31.0.0 0.0.63.255 range 1015 1023 any
access-list 101 permit tcp any eq 2105 any established
access-list 101 permit tcp 140.31.0.0 0.0.63.255 eq 23 any established
access-list 101 permit tcp 140.31.0.0 0.0.63.255 eq 21 any established
access-list 101 permit tcp 140.31.0.0 0.0.63.255 any established
access-list 101 permit tcp any eq 22 any established
Q. My login node response time is sluggish? How do I find a less busy node?
A.You can use the node_use command to determine the current least busy node available. The command format is:
node_use [-a]
Without the "-a" option it provides the memory usage and load average of your current node. With the "-a" option it provides the same information for all the login nodes.
EX:
harold-l2> node_use -a
Node Name  Total (Kb)  Used (Kb)   Free (Kb)   Pct. Free   Load Avg.
========== ========== ========== ========== ========== ============
l1           32960700   22224712   10735988     32.57%         1.12
l2           32960700   24089808    8870892     26.91%         0.08
l3           32960700   28266604    4694096     14.24%         0.14
l4           32960700   31600604    1360096      4.13%         0.16
l5           32960700   24008968    8951732     27.16%         0.14
l6           32960700   31905324    1055376      3.20%         6.80
l7           32960700   12836428   20124272     61.06%         0.13
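Picking a node from this listing can be automated. The sketch below assumes that "least busy" means the lowest load average and that data rows have the six columns shown in the example; the function name least_busy_node is hypothetical. You may also want to weigh free memory, which this sketch ignores.

```shell
# Print the name of the login node with the lowest load average,
# reading "node_use -a" style output on standard input.
least_busy_node() {
    awk '
        # Data rows have six fields and a numeric total-memory column;
        # the header and separator lines do not match and are skipped.
        NF == 6 && $2 ~ /^[0-9]+$/ {
            if (best == "" || $6 + 0 < min) { best = $1; min = $6 + 0 }
        }
        END { if (best != "") print best }
    '
}
```

A typical invocation would be: node_use -a | least_busy_node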
Q. What local shell variables are automatically defined in my login environment?
A. The Baseline Configuration Team has defined which environment variables are to be set at each center (ARL included). The list can be found at the following Web address: http://www.ccac.hpc.mil/consolidated/bc/policies.php?choice=environment
Q. How do I log into an ARL DSRC machine using my YubiKey?
A. Please review http://www.ccac.hpc.mil/documentation/YubiKey.pdf.
2. Machine Configuration
Q. What is the configuration of your SGI Altix ICE Cluster (Harold)?
A.A thorough configuration summary for Harold is available in the System Configuration section of the Harold User Guide.
Q. What is the configuration of your IBM iDataPlex (Pershing)?
A.A thorough configuration summary for Pershing is available in the System Configuration section of the Pershing User Guide.
3. Manuals
Q. Where can I find the Harold User Guide?
A.On the ARL DSRC Web site at http://www.arl.hpc.mil/docs/haroldUserGuide.html
Q. Where can I find the Pershing User Guide?
A.On the ARL DSRC Web site at http://www.arl.hpc.mil/docs/pershingUserGuide.html
Q. Where can I find the Utility Server User Guide?
A.On the CCAC Web site at http://www.ccac.hpc.mil/documentation/heue/USUserGuide.pdf
Q. Where can I find the PBS User Guide?
A.On the ARL DSRC Web site at http://www.arl.hpc.mil/docs/pbsUserGuide.html
Q. Where can I find the Modules User Guide?
A.On the ARL DSRC Web site at http://www.arl.hpc.mil/docs/modulesUserGuide.html
4. SSH
Q. What do I do when I receive a "Host Key Verification Error"?
A. There are several possible causes for this problem:
- The known_hosts file in your home directory has been corrupted. To correct this, execute the following commands on Harold:
rm -R ${HOME}/.ssh
exit
(log back into Harold from your desktop)
- You are using the wrong version of ssh. There are two versions available on Harold, /usr/bin/ssh and /usr/brl/bin/ssh. Your default version should be /usr/bin/ssh. You can determine which is your default version by executing "which ssh". If /usr/brl/bin/ssh is your current default version, then you can change it by adding the following line to your .cshrc or .profile file:
.cshrc:
setenv PATH /usr/bin:$PATH
.profile:
PATH="/usr/bin:$PATH"; export PATH
- The access to your home directory (/usr/people/username) is too open. For security reasons, if "group" or "world" have write access to your home directory then ssh will not work. Remove the group/world write access from your home directory to correct this problem.
- The access to your .ssh directory (/usr/people/username/.ssh) is too open. For security reasons, if "group" or "world" have write access to this directory then ssh will not work. Remove the group/world write access from this directory to correct this problem.
- The compute node you are trying to access is not yet in your known_hosts file. This is only a problem when running batch jobs. To avoid this problem, add the following to your run script before you invoke the parallel executable:
Harold - csh, tcsh shells
#====================================================================
foreach host (`cat $PBS_NODEFILE`)
    echo "Working on $host ...."
    /usr/bin/ssh -o StrictHostKeyChecking=no $host pwd
end
#====================================================================
Harold - sh, ksh, bash shells
#====================================================================
host=""
for new_host in `cat $PBS_NODEFILE`
do
    if [ "$new_host" != "$host" ]
    then
        host=$new_host
        echo "Working on $host ...."
        /usr/bin/ssh -o StrictHostKeyChecking=no $host pwd
    fi
done
#====================================================================
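The two permission-related causes above can be checked and corrected with a few lines of shell. This is a sketch, not a site-provided tool: both function names are hypothetical, and the check relies on the standard mode string printed by "ls -ld".

```shell
# Succeeds when neither "group" nor "world" has write access to a directory.
not_group_world_writable() {
    # In a mode string such as "drwxr-xr-x", character 6 is the group
    # write bit and character 9 is the world write bit.
    perms=$(ls -ld "$1" | cut -c1-10)
    case "$perms" in
        ?????w????|????????w?) return 1 ;;
        *) return 0 ;;
    esac
}

# Remove group/world write access, but only where it is present.
tighten_for_ssh() {
    for dir in "$@"; do
        not_group_world_writable "$dir" || chmod go-w "$dir"
    done
}
```

To apply it to the two directories named above, run: tighten_for_ssh "$HOME" "$HOME/.ssh"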
5. Job Failures
Q. Why is my job not finding my input file and crashing?
A.Since the compute nodes of Harold are not able to access your files in /home or /archive, you must pre-stage (manually copy) your input files to your /usr/people area, space permitting, or your /usr/var/tmp area before submitting your jobs. Job scripts will need to be modified to pick up input files from /usr/people or /usr/var/tmp.
Q. Why am I getting a "Segmentation Violation" error in my job?
A. This error generally means that your application has exceeded the stack space limit for the shell in which the job script is running. By default, this limit is set to a relatively small amount of memory. To correct this problem for csh and tcsh scripts, place the "unlimit" command in the .cshrc file in your home directory. For sh and ksh scripts, place the "ulimit" command in the .profile file in your home directory. For bash scripts, place the "ulimit" command in the .bashrc file in your home directory.
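For sh, ksh, and bash, one way to write the "ulimit" line is sketched below. It assumes the stack limit is the one your job is hitting and raises the soft limit only as far as the hard limit permits; the function name raise_stack_limit is hypothetical. For csh and tcsh, the comparable line would be "unlimit stacksize".

```shell
# For .profile or .bashrc: raise the soft stack limit as far as the hard
# limit allows (on many systems the hard limit is "unlimited").
raise_stack_limit() {
    ulimit -S -s "$(ulimit -H -s)"
}

raise_stack_limit
ulimit -S -s    # prints the resulting soft limit (kilobytes or "unlimited")
```

Because the soft limit never exceeds the hard limit, this form does not fail on systems where an administrator has capped the stack size.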
Q. Why am I getting a "PBS: job killed: mem NNNNNNNNkb exceeded limit NNNNNNNkb" error in my job?
A.This error means your job is exceeding the maximum available user memory on one or more of the nodes being used by your job. For Harold, this value is 18 GBytes. For parallel jobs the first option for correcting this problem is to run on more nodes while using fewer processes on each node. For example:
On Harold, change the PBS option
"select=8:ncpus=8:mpiprocs=8" to
"select=16:ncpus=8:mpiprocs=4".
If the problem persists on Harold even while using 1 process per node, then you will need to redefine your problem to use a smaller memory footprint.
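The arithmetic behind this change is that the total number of MPI processes stays fixed while the processes per node drop, so each process gets a larger share of the node's memory. The sketch below computes the select statement for a given total; the function name pbs_select is hypothetical, and the colon-separated chunk syntax is that of PBS Professional, which you should confirm against the PBS User Guide.

```shell
# Build a PBS select statement for a fixed total number of MPI processes.
# Arguments: total MPI processes, MPI processes per node,
#            cores per node (defaults to 8, as on Harold).
pbs_select() {
    total="$1"
    per_node="$2"
    ncpus="${3:-8}"
    # Round up so that every process is assigned to a node.
    chunks=$(( (total + per_node - 1) / per_node ))
    echo "select=${chunks}:ncpus=${ncpus}:mpiprocs=${per_node}"
}

pbs_select 64 8    # prints select=8:ncpus=8:mpiprocs=8
pbs_select 64 4    # prints select=16:ncpus=8:mpiprocs=4
```

With 18 GBytes available per node, going from 8 to 4 processes per node doubles the memory available to each process, from about 2.25 GBytes to 4.5 GBytes.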
6. COTS Software
Q. I am trying to request a piece of software that is not available but which has a price for downloading and licensing. What is the process for software of this kind?
A.For all COTS software that we do not have on our systems which will require a purchase, please submit a software request form at: https://reservation.hpc.mil/index-sw_request.html
Q. What software is available on each system at ARL DSRC?
A.Please refer to the software listing at: http://www.arl.hpc.mil/software. Also, performing a "module avail" on the console will display a listing of all COTS applications.
Q. Where can I find an example script for running a COTS package?
A.The $SAMPLES_HOME environment variable points to the directory containing the Sample Code Repository. Execute an "ls $SAMPLES_HOME" to see all the sample scripts available for that system. The actual sample scripts are contained within the subdirectories listed. There is also an index file in the main directory explaining the contents of each subdirectory.
Q. How can modules help me?
A. Modules software is recommended for convenience in accessing ARL DSRC COTS software and is available on all our systems here. All new users should have modules already initialized. If you do not see any modules or you get a "module: Command not found." error, you must establish the modules software. To do this, copy the commands, depending on your login shell, from the /usr/cta/modules/samples/ directory into your shell's startup file (.cshrc or .profile and/or .bashrc). Once you have done this, and sourced your .cshrc (or .profile), you can use the module command "module avail" to see what software is available:
harold-l2> module avail
------------------------ /usr/cta/modules/3.1.6/COTS ------------------------
Xpatch4.7.22-2      cobalt4.2      gaussian            overflow-2.1n
abaqus              cobalt5.0      gaussview           overflow-2.1n_dp
abaqus6.10-1        comsol4.1      gaussview5.0        overflow-2.1w
abaqus6.9-3         cseinit        gaussview5.0.9      overflow-2.1w_dp
abaqus6.9-EF1       cseinit-devel  gridgen15.11        pointwise16.01
accelrys            cth            gridgen15.12        pointwise16.02
accelrys4.4.1       cth_9.0_so     gridgen15.13        pointwise16.03
accelrys5           cth_9.0_soi    gridpro/4.5         starccm4.06
adf                 cth_9.0_soix   gridpro/5.1         starccm5.02
adf_2008.01         cth_v9.0       gridpro/5.1_static  starccm5.04
adf_2009.01         cubit11.1      gridpro/latest      swtestsuite
adf_2009.01.r23065  discover       icemcfd11           tecplot360_2009r2
adf_2010.02         discover4.4    icemcfd121          tecplot360_2010r1
ale3d               discover5.0    ls-dyna             totalview
ale3d4.12.3         ensight82      ls-dyna971_R4.2.1   totalview8.7
amber               ensight90      ls-dyna971_R5       truegrid2.3.3
amber10             ensight91      ls-prepost          truegrid2.3.4
ansa13.0.4          ensight92      maple14             unsupported
ansys110            fluent         matlab7.10.0        visit/1.12.0
ansys120            fluent12.1     matlab7.11.0        visit/1.12.1
ansys121            fluent6.3.26   matlab7.9.0         visit/1.12.2
autodyn12           g03_e01        mcnpx2.7a           visit/2.0.0
cart3d              g09_a01        mesodyn             visit/2.0.2
cfd++10.1           g09_b01        mesodyn5            visit/2.1.0
cfd++8.1            gamess         mesotek             visit/latest
cfd++8.1.2          gamess_jan09   namd2.7b3
cfx11.0             gasp5.0        overflow
------------------------ /usr/cta/modules/3.1.6/devel -----------------------
compiler/gcc4.1     compiler/intel11.1.latest  mpi/openmpi-1.4
compiler/gcc4.4     compiler/jdk1.6            mpi/sgi_mpi-1.24
compiler/gcc4.5     compiler/jdk1.6-64         mpi/sgi_mpi-1.25
compiler/intel11.0  mpi/intelmpi-3.2           mpi/sgi_mpi-1.26
compiler/intel11.1  mpi/intelmpi-4.0.0
--------------------- /usr/cta/modules/3.1.6/modulefiles --------------------
Master  Master-test  Masterpm  dot  modules  pbs
And you can use the command "module list" to see what modules you currently have loaded:
harold-l2> module list
Currently Loaded Modulefiles:
1) modules 2) pbs 3) Master
Use the command "module load" to load a new module (i.e., select a certain package of software):
harold-l2> module load gaussian
harold-l2> module list
Currently Loaded Modulefiles:
1) modules 2) pbs 3) Master 4) gaussian
A thorough discussion of modules, module commands, and their usage is available in the Modules User Guide. Additional information is also found in each HPC System User Guide about the specific modules that are found on each system.
Q. How do I transfer my data files between the scratch and archive file systems for a batch job if the archive file system is not accessible from the compute nodes?
A. A special PBS queue, called the transfer queue, has been set up on Harold for this purpose. This queue allows serial jobs to run up to 24 hours on a login node, which has access to both the scratch area and the /home and /archive directories. Jobs may be submitted directly or within computational job scripts after the computation part of the job has completed. An example of how to use the transfer queue can be found in $SAMPLES_HOME/transfer_queue_Example.
Q. How can I transfer my ARL data to another DSRC?
A.The mpscp command is provided at all DSRCs to facilitate transferring data files between DSRC sites. The mpscp command provides a parallel multi-streaming capability to significantly decrease transfer times for very large files. Below are examples of how to use this command.
- Transferring data from Harold to Mana at MHPCC using 4 parallel streams:
mpscp -w 4 /usr/var/tmp/petitgd/output_data.tgz mlogin1.mhpcc.hpc.mil:/scratch/petitgd/output_data.tgz
- Transferring data from Mana at MHPCC to Harold using 2 parallel streams:
mpscp -w 2 mlogin1.mhpcc.hpc.mil:/scratch/petitgd/output_data.tgz /archive/navy/petitgd/output_data.tgz
Q. I can't see the stdout/stderr from my job until after the job completes. Is there any way to check on this output during the run?
A.The qpeek command provides this capability. Its usage is as follows:
qpeek JOB_ID
Q. How can I check on the license usage status of COTS packages?
A.The check_license command provides this capability. Below is a description of its use:
For any non-SLB application, the command "check_license application-name" displays the application's license features and the number of unused licenses for each. For example, the command:
check_license starccm
produces the following output:
Available starccm Licenses are as follows:
ccmpdomains: 100 ccmpsuite: 5 stardesign: 6
Since Abaqus, Cobalt, Fluent and MATLAB are part of Software License Buffer (SLB), the invocation of check_license for an SLB application will display not only the number of unused licenses but also jobs that are waiting for the license to become available as well as jobs that have scheduled future reservations for the application. For example, the command
check_license abaqus
will produce the following output:
Available abaqus Licenses are as follows:
abaqus:50 ams:1 aqua:210 cae:6 cfd:210 cosim_acusolve:1 cosim_direct:1 cse:1 design:210 euler_lagrange:1 explicit:144 foundation:210 multiphysics:1 parallel:16234 standard:162 viewer:1
Pending Jobs for abaqus ready to start are as follows:
ID           Host         # Tokens  Releasing (Hr:Min)  Runtime (Hr:Min)
---------    -----------  --------  ------------------  ----------------
205561       hawk-12      16        0:25                17:42
195579.o2    n0021        14        0:13                71:30
205563       hawk-12      16        0:25                17:42
Pending Jobs for abaqus with future start times are as follows:
ID           Host         # Tokens  Starting (Hr:Min)   Runtime (Hr:Min)
---------    -----------  --------  ------------------  ----------------
abaqus_7557  HAWK         256       19:32               168:00
Q. How do I check the status of the PBS batch queues?
A.The show_queues command provides this capability.
harold-l2 > show_queues
QUEUE INFORMATION for harold:
                Maximum    Maximum  Jobs     Jobs     Cores    Cores    Queue
Queue Name      Wall Time  Cores    Running  Pending  Running  Pending  Running
---------------------------------------------------------------------------------
background      24:00:00   n/a      0        0        0        0        Y
standard-long   200:00:00  n/a      0        0        0        0        Y
challenge       168:00:00  n/a      5        26       1728     2192     Y
cots            96:00:00   n/a      1        0        128      0        Y
debug           01:00:00   n/a      1        0        8        0        Y
high            96:00:00   n/a      0        0        0        0        Y
interactive     12:00:00   n/a      0        0        0        0        Y
prepost         24:00:00   1        0        0        0        0        Y
staff           368:00:00  n/a      0        0        0        0        Y
standard        96:00:00   n/a      44       280      2650     30386    Y
test-hw         96:00:00   n/a      0        0        0        0        Y
transfer        24:00:00   n/a      0        0        0        0        Y
urgent          96:00:00   n/a      0        0        0        0        Y
R309308         72:00:00   640      0        0        0        0        N
NODE USAGE INFORMATION:
Node Type  Cores Available  Cores Running  Cores Free
ALTIX      10744            5914           4830
Batch nodes with ALL cores available on them: 0
Reservation nodes with ALL cores available on them: 105
System Node Configuration:
Node-Type     Nodes      Cores     Physical-Memory  Available-Memory
              Available  Per Node  Per Node         Per Node
LOGIN         8          8         48Gb             30Gb
Compute       1344       8         24Gb             18Gb
Batch*        1024       8         24Gb             18Gb
Reservation*  320        8         24Gb             18Gb
* These are part of the total Compute pool of nodes.
7. Compilers and MPI Suites
Q. What compilers are available on the Linux Clusters?
A.Please type a "module avail" on the system's console and look under the modules listed under /usr/cta/modules/3.1.6/devel. If you get "module: Command not found", please see the question How can modules help me? to determine how to correct this. Detailed compilation, code optimization and compiler settings can be found in the HPC User Guides on the documentation page: http://www.arl.hpc.mil/docs.
Q. What MPI Suites are available on the Linux Clusters?
A. Please type "module avail" on the system's console and, among the modules listed under /usr/cta/modules/3.1.6/devel, look for the module names that have MPI as part of the name. Most clusters have at least two MPI versions available. If you get "module: Command not found", please see the question "How can modules help me?" to determine how to correct this. Detailed compilation, code optimization and compiler settings can be found in the HPC User Guides on the documentation page: http://www.arl.hpc.mil/docs.
8. Miscellaneous
Q. How do I remove characters caused by a FTP session from a PC to a Unix System?
A. Transferring a text file in binary mode, or a tar file, from a PC to a Unix system with FTP can leave the file with many "^M" characters (carriage returns) and "^Z" characters, which in turn can cause problems when compiling that file.
To remove these "^M"s and "^Z"s, use the dos2unix command as follows:
dos2unix file_name
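If dos2unix happens to be unavailable on a system, the standard tr utility can do the same clean-up. This is a sketch of an alternative, not a replacement for the recommended command; the function name strip_dos_chars is hypothetical.

```shell
# Strip carriage returns (^M, octal 015) and DOS end-of-file markers
# (^Z, octal 032) from a file, writing the cleaned text to standard output.
strip_dos_chars() {
    tr -d '\015\032' < "$1"
}
```

Unlike dos2unix, this does not modify the file in place, so redirect the output to a new file: strip_dos_chars file_name > file_name.unix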
Q. Where should I send back my broken or unwanted SecurID card?
A.Please mail it back to the following address:
AFRL/RCM
ATTN: Penny Semons - CCAC Accounts Center
2435 5th Street/Room 123
Wright-Patterson AFB, OH 45433-7802
Q. How can I reach the helpdesk?
A.Complete contact information is available on the Contact Us page.
Please Note: All unclassified Kerberos and systems support questions should be directed to the HPCMP Consolidated Customer Assistance Center (CCAC). For all other inquiries, you may contact the ARL DSRC Helpdesk.
Q. How do I change permission to a file or a directory?
A. There are three basic modes to files and directories:
- (r)eadable - Value: 4
- (w)ritable - Value: 2
- e(x)ecutable - Value: 1
Additionally, each of these modes can be applied to the
- (u)ser
- (g)roup
- (o)thers
The user means you, the person who owns the file or directory.
The group refers to the group associated with the file or directory. To find out which groups you belong to, type groups at the Unix prompt.
The modes follow a hierarchy of user, group, and then others. Using this we can assign three numerical values.
For example: to make a file readable to all, executable to the group and writable to the user, just add the permission values.
- user = readable + writable = 4 + 2 = 6
- group = readable + executable = 4 + 1 = 5
- other = readable = 4
So to change the permission, type:
% chmod 654 filename
The word "other" means everyone else (aka world). For security reasons, we advise against opening your files up to others.
If you need to change permission to an entire directory and its files and its subdirectories, you may use the "-R" option. If you need more information you may review the man pages: man chmod.
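The addition described above can be written out as a small helper. This is an illustrative sketch: the function names perm_digit and chmod_mode are hypothetical, and each argument is a string made up of the letters r, w, and x.

```shell
# Convert a permission string such as "rw" or "rx" into its octal digit
# by adding 4 for readable, 2 for writable, and 1 for executable.
perm_digit() {
    digit=0
    case "$1" in *r*) digit=$((digit + 4)) ;; esac
    case "$1" in *w*) digit=$((digit + 2)) ;; esac
    case "$1" in *x*) digit=$((digit + 1)) ;; esac
    echo "$digit"
}

# Combine user, group, and other permission strings into a chmod mode.
chmod_mode() {
    echo "$(perm_digit "$1")$(perm_digit "$2")$(perm_digit "$3")"
}

chmod_mode rw rx r    # prints 654, the mode used in the example above
```

The result can be passed straight to chmod, for example: chmod "$(chmod_mode rw rx r)" filename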
Q. What is wrong with my backspace key?
A. Systems differ in which control character they map to the backspace key on your keyboard. The most common way to work around this problem is to put the line "stty erase ^h" in your .cshrc or .profile; in some cases "stty erase ^?" is needed instead. If this does not work, please contact the Helpdesk.
Q. What shells are available?
A.The Bourne shell (sh), Korn shell (ksh), C-shell (csh), T-shell (tcsh) and Bash-shell (bash) are all available as default shells.
The shell establishes your user environment and functions as both a command interpreter and a programming language. Your default shell is specified in the /etc/passwd file.
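To look up the shell recorded for an account, you can read the seventh colon-separated field of its entry in a passwd-format file. The sketch below assumes the account has an entry in the local file; on systems that keep accounts in a directory service this may not hold. The function name login_shell is hypothetical.

```shell
# Print the login shell recorded for a user in a passwd-format file.
# Arguments: user name, optional file (defaults to /etc/passwd).
login_shell() {
    user="$1"
    passwd_file="${2:-/etc/passwd}"
    # Field 1 is the user name; field 7 is the login shell.
    awk -F: -v u="$user" '$1 == u { print $7 }' "$passwd_file"
}
```

For your own account, run: login_shell "$(id -un)"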
All shells read the /etc/profile file at start up to set system wide environmental variables. Next your system reads individual user environment files depending on your default shell.
For Bourne shell and Korn shell logins, the shell executes /etc/profile and $HOME/.profile, if it exists.
For C shell and T shell logins, the shell executes /etc/cshrc, $HOME/.cshrc, and $HOME/.login.
For Bash shell logins, the shell executes /etc/profile, $HOME/.profile, if it exists, and the $HOME/.bashrc, if it exists.
The default /etc/profile and /etc/cshrc files print /etc/motd and check for mail.
| Script | Description |
|---|---|
| $HOME/.cshrc | initial commands for each csh |
| $HOME/.login | user's login commands for csh |
| $HOME/.profile | user's login commands for sh and ksh |
| $HOME/.bashrc | user's login commands for bash |
.kshrc, .bashrc, and .cshrc are read at each invocation of a new shell.
Q. How do I use the Advanced Reservation Server to reserve nodes for future jobs?
A.Information on accessing the Advance Reservation Service (ARS) may be found on the Advance Reservations page. An ARS User Guide is also available once you are logged in to the system.
Q. Where can I find instructions for downloading and using the Utility Server SRD software?
A. On the CCAC Web site, at http://www.ccac.hpc.mil/documentation/heue/SRDInstructions.pdf

