First and foremost: We are using Mercer!
https://wikis.nyu.edu/display/NYUHPC/Clusters+-+Mercer
Further, for prototyping and coding, please use CPUs: your local machine, the
crunchies, etc. If you just want to set up some Torch code and write the general
reading/processing of data, start with a small chunk of that data and run it on your
local machine or the crunchies. Then, once everything is working, ship your data and
program to the cluster and start running intensive jobs. I will share the raw
data for each assignment in the folder /scratch/courses/DSGA1008 (subfolders
A1, A2, etc.).
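For example, shipping your program and a small data sample to the cluster might look
like this (the login hostname and the /scratch/NetID path are placeholders; substitute
your NetID and the login node listed on the HPC wiki):

    # copy code and a data sample from your local machine to your scratch space
    rsync -avz my_project/  NetID@<login-node>:/scratch/NetID/my_project/
    scp -r data_sample/  NetID@<login-node>:/scratch/NetID/data_sample/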
The course folder /scratch/courses/DSGA1008 also contains a bin/ directory with Torch
(executable: th) plus the common packages installed. If any package is missing, please
post it on Piazza and I'll look into it. I'd recommend adding /scratch/courses/DSGA1008/bin
to your PATH variable (perhaps in your .bashrc).
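For instance, you could append a line like this to your ~/.bashrc on the cluster:

    # put the course-provided Torch (th) and packages on your PATH
    export PATH=/scratch/courses/DSGA1008/bin:$PATH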
Note that the crunchies don't come with Torch installed, and probably neither does
your local computer. See the section on Torch + Utilities for links to the source and other
helpful packages. CIMS requires your installs on the crunchies to be local (i.e., user-level,
in your own home directory).
HPC wiki
https://wikis.nyu.edu/display/NYUHPC/Tutorials
https://wikis.nyu.edu/display/NYUHPC/Submitting+a+job+with+qsub
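As a rough sketch of a batch script (the resource values below are only examples; the
exact options Mercer expects are described on the qsub page above):

    #!/bin/bash
    #PBS -N my_torch_job              # job name
    #PBS -l nodes=1:ppn=1             # one node, one core
    #PBS -l walltime=04:00:00         # wall-clock limit
    #PBS -l mem=4GB                   # memory request
    #PBS -j oe                        # merge stdout and stderr

    cd $PBS_O_WORKDIR                 # directory the job was submitted from
    export PATH=/scratch/courses/DSGA1008/bin:$PATH
    th my_script.lua

Submit it with: qsub my_job.pbs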
Requesting access
https://wikis.nyu.edu/display/NYUHPC/High+Performance+Computing+at+NYU
https://iiq.home.nyu.edu/identityiq/SPSSO
GPUs
https://wikis.nyu.edu/display/NYUHPC/FAQ
- see “How do I use GPUs?”
- Note the environment variables (CUDA_VISIBLE_DEVICES and CUDA_DEVICES) that
tell you the IDs of the GPUs you have been given on that server; you must use
them in your bash script to address the right GPUs (see the sketch after this list)
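A minimal sketch of how a job script can pick up the assigned GPU (CUDA itself reads
CUDA_VISIBLE_DEVICES; which variable the scheduler sets on Mercer is described in the
FAQ above):

    # inside your job script: print the GPU ids the scheduler assigned to you
    echo "GPUs assigned to this job: $CUDA_VISIBLE_DEVICES"
    # CUDA exposes only those devices, renumbered from 0, so in Torch
    # cutorch.setDevice(1) refers to the first GPU you were given
    th my_gpu_script.lua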
HPC email (one HPC member is also a TA on Piazza - so ask there first)
hpc@nyu.edu - Central ITS facility
Access permissions / Sharing data with teammates
There's only one group: users - so use chmod to make your data world-readable
For finer-grained access use file ACLs (FACL, i.e. setfacl)
https://wikis.nyu.edu/display/NYUHPC/High+Performance+Computing+at+NYU
- See “Managing data: Storage, collaboration and moving data around”
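For example (the paths and NetIDs below are placeholders):

    # make a directory world-readable (and traversable) for everyone
    chmod -R o+rX /scratch/NetID/shared_dir

    # or grant read access to a single teammate only, via a file ACL
    setfacl -R -m u:teammate_netid:rX /scratch/NetID/shared_dir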
X-forwarding, ssh tunneling
- You may want to forward to your local X server to display windows, or
maybe you even have an iTorch notebook session running. To push all that
traffic through HPC you will need to set up your SSH tunnels properly.
Tunneling - https://wikis.nyu.edu/display/NYUHPC/SSH+tunneling+overview
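As an illustration (hostnames and ports are placeholders; the overview page above
describes the actual login/bastion hosts to tunnel through):

    # forward local port 8888 to an iTorch notebook listening on the cluster
    ssh -L 8888:localhost:8888 NetID@<login-node>
    # then browse to http://localhost:8888 on your local machine

    # for X-forwarding (remote windows displayed locally) add -X or -Y
    ssh -X NetID@<login-node>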
Hints
If Torch refuses to execute, you may be on a compute node with a Tesla M2070
(as you can see at http://en.wikipedia.org/wiki/CUDA, that kind of GPU has
compute capability 2.0, which is older than what Torch's CUDA packages are
built for). So make sure you specifically request a Titan.
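One way to see which card your job actually landed on, plus a hypothetical resource
request (the ':titan' node property is a guess - confirm the real syntax on the GPU
FAQ / wiki pages above):

    # print the GPUs on the node you were given
    nvidia-smi -L

    # hypothetical: explicitly request a Titan when submitting
    qsub -l nodes=1:ppn=1:gpus=1:titan my_job.pbs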