Some general information on how to get started running jobs on the cluster.

First and foremost: We are using Mercer!
For prototyping and coding, please use CPUs: your local machine, the crunchies, etc. If you just want to set up some Torch code and write the general reading/processing of data, start with a small chunk of that data and run it on your local machine or the crunchies. Once everything is working, ship your data and program to the cluster and start running intensive jobs.

I will share the raw data for each assignment in the folder /scratch/courses/DSGA1008 (subfolders A1, A2, etc.). This folder also contains a bin/ directory with Torch (executable: th) and the common packages installed. If a package is missing, please post on Piazza and I'll look into it. I recommend adding /scratch/courses/DSGA1008/bin to your PATH variable (perhaps even in your .bashrc).
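One way to add the course binaries to your PATH, as a minimal sketch assuming a bash-compatible shell. The directory is the one named above; the variable name COURSE_BIN and the guard against duplicate entries are optional additions for illustration, not something the course requires.

```shell
# Course-provided binaries (path taken from the text above)
COURSE_BIN="/scratch/courses/DSGA1008/bin"

# Prepend to PATH only if it is not already there, so that sourcing
# these lines repeatedly does not keep growing PATH.
case ":$PATH:" in
  *":$COURSE_BIN:"*) ;;
  *) export PATH="$COURSE_BIN:$PATH" ;;
esac

# Show which Torch executable the shell will pick up, if any
command -v th || echo "th not found on PATH (are you on the cluster?)"
```

To make the change permanent, the same lines could go into your ~/.bashrc.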

Note that the crunchies don't come with Torch installed, and your local computer probably doesn't either. See the section on Torch + Utilities for links to the source and other helpful packages. CIMS requires that your installs on the crunchies be local.

HPC wiki

Requesting access

- see “How do I use GPUs?”
- Note the mention of the environment variables that give the ids of the GPUs you have access to on the server (CUDA_VISIBLE_DEVICES and CUDA_DEVICES). You must use them in your bash script to identify those GPUs.
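A minimal sketch of reading the assigned GPU ids inside a bash job script. It assumes CUDA_VISIBLE_DEVICES holds a comma-separated list of ids, which is the format CUDA defines for that variable; the helper name first_gpu_id and the messages are hypothetical.

```shell
# Hypothetical helper: print the first id from a comma-separated list
first_gpu_id() {
  printf '%s\n' "${1%%,*}"
}

# On a GPU node the variable is expected to be set for you; default to
# empty so the script also runs where it is unset.
gpu_list="${CUDA_VISIBLE_DEVICES:-}"

if [ -z "$gpu_list" ]; then
  echo "No GPU ids found in CUDA_VISIBLE_DEVICES"
else
  echo "Assigned GPU ids: $gpu_list"
  echo "First GPU id: $(first_gpu_id "$gpu_list")"
fi
```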

HPC email (one HPC member is also a TA on Piazza - so ask there first)
hpc@nyu.edu - Central ITS facility

Access permissions / Sharing data with teammates
There's only one group, users, so use chmod to make your data world-readable.
For finer-grained access, use file access control lists (FACLs).
- See “Managing data: Storage, collaboration and moving data around”
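A small sketch of making a directory world-readable with chmod. The directory here is a throwaway one created with mktemp purely for illustration; on the cluster you would point share_dir at your own data. The setfacl line is left as a comment because it needs a real user name (abc123 is a placeholder) and a filesystem with ACL support.

```shell
# Throwaway directory standing in for your shared data (illustration only)
share_dir="$(mktemp -d)"
echo "sample" > "$share_dir/data.txt"

# a+r: everyone may read. X: add execute only on directories (and on files
# that are already executable), which is what allows entering a directory.
chmod -R a+rX "$share_dir"

# Inspect the resulting permissions
ls -ld "$share_dir" "$share_dir/data.txt"

# Finer-grained alternative: grant a single user access through an ACL.
# setfacl -R -m u:abc123:rX "$share_dir"
```

Keep in mind that every parent directory on the path also needs execute permission for your teammates, otherwise they cannot reach the shared directory.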

X-forwarding, ssh tunneling
- You may want to forward windows to your local X server for display, or you may even have an iTorch notebook session running. To push all that traffic through HPC, you will need to set up your SSH tunnels properly.
Tunneling - https://wikis.nyu.edu/display/NYUHPC/SSH+tunneling+overview
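For orientation, a sketch of what the two kinds of forwarding look like with standard OpenSSH options. The user name, host name, and port below are placeholders, not real course values; the wiki page above describes the actual hosts and any intermediate hops. The commands are only printed here so you can inspect them before running them yourself.

```shell
# Placeholders (assumptions for illustration; replace with your own values)
hpc_user="netid"
hpc_host="hpc.example.edu"
port=8888   # e.g. the port a notebook server listens on

# -X enables X11 forwarding so remote windows display on your local X server
x_cmd="ssh -X ${hpc_user}@${hpc_host}"

# -L forwards local port $port to localhost:$port as seen from the remote
# host; -N opens the tunnel without running a remote command
tunnel_cmd="ssh -N -L ${port}:localhost:${port} ${hpc_user}@${hpc_host}"

echo "$x_cmd"
echo "$tunnel_cmd"
```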

If Torch refuses to execute, you may be on a compute node that binds a Tesla M2070. As you can see at http://en.wikipedia.org/wiki/CUDA, that kind of GPU has compute capability 2.0. Compute capability identifies the hardware feature set a GPU supports (it is not a CUDA software version), and Torch builds on software that expects a newer GPU than that. So make sure you specifically request a Titan.
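To check which GPU a node actually gives you, something like the following may help. It is a sketch that assumes nvidia-smi is available on the compute node; the helper name is_m2070 is hypothetical, and the check is a plain substring match on the reported GPU name.

```shell
# Hypothetical helper: succeeds if the GPU name mentions the Tesla M2070
is_m2070() {
  case "$1" in
    *M2070*) return 0 ;;
    *) return 1 ;;
  esac
}

if command -v nvidia-smi >/dev/null 2>&1; then
  # Name of the first GPU reported by the driver
  gpu_name="$(nvidia-smi --query-gpu=name --format=csv,noheader | head -n 1)"
  if is_m2070 "$gpu_name"; then
    echo "This node has a $gpu_name; request a Titan instead."
  else
    echo "GPU found: $gpu_name"
  fi
else
  echo "nvidia-smi not found; this is probably not a GPU node."
fi
```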

/srv/www/cilvr/htdocs/data/pages/deeplearning2015/hpc_guide.txt · Last modified: 2015/01/21 16:19 by cp