[OOD-users] common NFS for ood data shared with slurm workers

Anderson, Richard O - (ric) ric at email.arizona.edu
Fri Jan 18 10:40:45 EST 2019


We use the University's common authentication for all our HPC nodes, so extending that to the ondemand VM was relatively easy for us via Shibboleth.  That method means we have common User IDs and Group IDs on all systems, which makes NFS easy.  As far as I can tell, OOD implicitly depends on the compute nodes having access to the same directory paths as the OOD node.
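
A minimal sketch of how the common-ID assumption could be checked when the nodes do not share an authentication source: compare passwd-format entries (for example, the output of `getent passwd`) collected from the OOD node and from a compute node. The function names and the sample entries below are hypothetical and only illustrate the idea.

```python
def parse_passwd(text):
    """Map user name -> (uid, gid) from passwd-format text (name:pw:uid:gid:...)."""
    ids = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 4:
            continue  # skip malformed entries
        ids[fields[0]] = (int(fields[2]), int(fields[3]))
    return ids


def id_mismatches(ood_passwd, worker_passwd):
    """Return users present on both nodes whose UID or GID differ."""
    ood = parse_passwd(ood_passwd)
    worker = parse_passwd(worker_passwd)
    return {
        user: (ood[user], worker[user])
        for user in sorted(ood.keys() & worker.keys())
        if ood[user] != worker[user]
    }


# Hypothetical sample data, for illustration only.
OOD_NODE = """\
alice:x:1001:1001::/home/alice:/bin/bash
bob:x:1002:1002::/home/bob:/bin/bash
centos:x:1000:1000::/home/centos:/bin/bash
"""
WORKER_NODE = """\
alice:x:1001:1001::/home/alice:/bin/bash
bob:x:1005:1002::/home/bob:/bin/bash
ubuntu:x:1000:1000::/home/ubuntu:/bin/bash
"""

if __name__ == "__main__":
    for user, (ood_ids, worker_ids) in id_mismatches(OOD_NODE, WORKER_NODE).items():
        print(f"{user}: OOD node {ood_ids}, worker {worker_ids}")
```

Users that exist on only one node (such as `centos` and `ubuntu` in the sample) are not reported; only accounts that exist on both nodes with different numeric IDs would cause ownership confusion on a shared NFS export.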

I wouldn't worry about .bashrc or .bash_profile.  The user should not be able to get a command line shell on the OOD node.  If the user opens a shell window via OOD, it's on a cluster node.
Ric
--
Ric Anderson | Systems Administrator
Research And Discovery Tech | HPC Large Systems Support
XSEDE Campus Champion
ric at email.arizona.edu | (V): +1-520-626-1642



From: "edijh403 at tutanota.com" <edijh403 at tutanota.com>
Date: Friday, January 18, 2019 at 6:26 AM
To: "Anderson, Richard O - (ric)" <ric at email.arizona.edu>
Cc: Ohio Super Computing On Demand Users List <ood-users at lists.osc.edu>
Subject: Re: [OOD-users] common NFS for ood data shared with slurm workers

Ok, thanks, Ric.

For the slurm master and the worker nodes it makes sense to have an NFS mounted at their /home
directories because these nodes are very similar (in my case they have at least the same users and the
same OS (Ubuntu)).

However, I'm hesitating to also share the ood node's /home because that node uses another OS
(CentOS, because OOD is not yet available as a Debian package) and there are different users on it
(no 'ubuntu' user but a 'centos' user instead). After all, I don't want the NFS to hide /home/centos.

So I could mount at /home/ood instead. But then who gives me the guarantee that e.g. .bash_profile,
.bashrc, etc. will work on both Ubuntu and CentOS?

So should I mount at /home/ood/ondemand where OOD actually puts its data?

Thanks.

Jan 17, 2019, 6:02 PM by ric at email.arizona.edu:
We use a common NFS mount for /home and several other file systems that the compute nodes can access, since users may have files in any or all of those that they need to edit.
Ric
--
Ric Anderson | Systems Administrator
Research And Discovery Tech | HPC Large Systems Support
XSEDE Campus Champion
ric at email.arizona.edu | (V): +1-520-626-1642



From: OOD-users <ood-users-bounces+ric=email.arizona.edu at lists.osc.edu> on behalf of Ohio Super Computing On Demand Users List <ood-users at lists.osc.edu>
Reply-To: "edijh403 at tutanota.com" <edijh403 at tutanota.com>, Ohio Super Computing On Demand Users List <ood-users at lists.osc.edu>
Date: Thursday, January 17, 2019 at 9:59 AM
To: Ohio Super Computing On Demand Users List <ood-users at lists.osc.edu>
Subject: [OOD-users] common NFS for ood data shared with slurm workers

Hi all,

when trying to launch a slurm job from within the ood dashboard, i get, in slurmd.log:

[14.batch] error: Could not open stdout file /home/ood/ondemand/data/sys/myjobs/projects/default/4/slurm-14.out: No such file or directory
[14.batch] error: IO setup failed: No such file or directory

similarly, when trying to launch a jupyter notebook, i get:

[39.batch] error: Could not open stdout file /home/ood/ondemand/data/sys/dashboard/batch_connect/dev/jupyter/output/380b6eec-6d71-4a83-8a5e-20398831668a/output.log: No such file or directory
[39.batch] error: IO setup failed: No such file or directory

and that's because that path only exists on the ood node but not on a slurm worker node.
to have this path exist on the ood node and all slurm worker nodes
i'd suggest to use a common NFS they all mount. is that the recommended way to go
or what would you suggest?
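
A minimal sketch for working out which directory would need to be visible on every node: extract the missing paths from the slurmd error lines quoted above and compute their common parent. The helper names are hypothetical, and the regular expression is only matched against the two error lines shown in this message, so other slurmd versions may word the error differently.

```python
import posixpath
import re

# Captures the path from "Could not open stdout file <path>: No such file or directory".
STDOUT_ERROR = re.compile(
    r"error: Could not open stdout file (?P<path>/\S+): No such file or directory"
)

# The two error lines from slurmd.log quoted in this message.
SLURMD_LOG = """\
[14.batch] error: Could not open stdout file /home/ood/ondemand/data/sys/myjobs/projects/default/4/slurm-14.out: No such file or directory
[14.batch] error: IO setup failed: No such file or directory
[39.batch] error: Could not open stdout file /home/ood/ondemand/data/sys/dashboard/batch_connect/dev/jupyter/output/380b6eec-6d71-4a83-8a5e-20398831668a/output.log: No such file or directory
[39.batch] error: IO setup failed: No such file or directory
"""


def missing_paths(log_text):
    """Return every stdout path that slurmd reported as missing."""
    return [m.group("path") for m in STDOUT_ERROR.finditer(log_text)]


def common_parent(paths):
    """Return the deepest directory shared by all paths, or None if there are none."""
    if not paths:
        return None
    return posixpath.commonpath(paths)


if __name__ == "__main__":
    paths = missing_paths(SLURMD_LOG)
    print(common_parent(paths))
```

For the two log lines above, the common parent is a directory under /home/ood/ondemand, which suggests that any shared mount has to cover at least that subtree at the same path on the ood node and on every worker.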

thanks in advance.
