[OpenAFS] Return of the getcwd bug?
Neil Brown
2018-10-11 10:08:23 UTC
Hi,

Today I've been bitten twice by what appears to be the getcwd bug. I've
not had this problem in a long time. I thought that had been resolved? Is
anyone else seeing this?

This was on both a 1.8.2 and a 1.8.0 client. My home volume is on a 1.6.23
server.

Some details below.

Cheers,

Neil

(~)> uname -srv
Linux 3.10.0-862.6.3.el7.x86_64 #1 SMP Tue Jun 26 12:13:22 CDT 2018
(~)> cd /afs/inf.ed.ac.uk/user/n/neilb
(~)> getcwd
You should see the cwd here - No such file or directory
(~)> cd tmp
(tmp)> getcwd
You should see the cwd here - /afs/inf.ed.ac.uk/user/n/neilb/tmp
(tmp)> rpm -q openafs-client
openafs-client-1.8.0-1.el7.x86_64
(tmp)> fs version
openafs 1.8.0

On the 1.8.2 machine:
(~)> uname -srv
Linux 3.10.0-862.14.4.el7.x86_64 #1 SMP Tue Sep 25 14:32:52 CDT 2018
(~)> pwd
/afs/inf.ed.ac.uk/user/n/neilb
(~)> getcwd
You should see the cwd here - No such file or directory
(~)> cd tmp
(tmp)> getcwd
You should see the cwd here - /afs/inf.ed.ac.uk/user/n/neilb/tmp
(tmp)> rpm -q openafs
openafs-1.8.2-1.el7.x86_64
(tmp)> fs version
openafs 1.8.2

My "getcwd" script basically does
perl -e 'use Cwd; print getcwd()'
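
For anyone who wants to reproduce the check without Perl, the following is a minimal Python sketch of the same test, assuming Python 3 is available on the client. The helper name `describe_cwd` is made up for illustration and is not part of the original script. It also reports the errno, which on Linux is ENOENT ("No such file or directory") when the kernel considers the current directory deleted.

```python
import errno
import os


def describe_cwd():
    """Return the current directory, or a description of why it cannot be read."""
    try:
        return os.getcwd()
    except OSError as exc:
        # On Linux, getcwd(2) fails with ENOENT when the kernel believes the
        # current directory has been unlinked.
        name = errno.errorcode.get(exc.errno, "unknown")
        return f"getcwd failed: {name} ({exc.strerror})"


if __name__ == "__main__":
    print("You should see the cwd here -", describe_cwd())
```

Run from the affected directory, this should print either the path or the failure reason, much like the output shown in the sessions above.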

And the server with my home volume:

: uname -srv
Linux 3.10.0-862.14.4.el7.x86_64 #1 SMP Tue Sep 25 14:32:52 CDT 2018
: bos version
openafs 1.6.23
: rpm -q openafs
openafs-1.6.23-1.el7.x86_64
--
Neil Brown - Computing Officer - Appleton Tower 7.12a | Neil.Brown @ ed. ac.uk
School of Informatics, University of Edinburgh | Tel: +44 131 6504422

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
Neil Brown
2018-11-02 11:55:42 UTC
Post by Neil Brown
Today I've been bitten twice by what appears to be the getcwd bug. I've not
had this problem in a long time. I thought that had been resolved? Is anyone
else seeing this?
This was on both a 1.8.2 and a 1.8.0 client. My home volume is on a 1.6.23
server.
A follow-up to my original post. My home volume is now on a 1.8.2 server, to
match the client. I didn't expect that to make a difference, and it hasn't.
This morning (having rebooted on Monday to clear the last occurrence), the
getcwd problems returned. Same issues as before, but I've just noticed this:

***@jingz(~)> pwd
/afs/inf.ed.ac.uk/user/n/neilb
***@jingz(~)> ls -l /proc/self/cwd
lrwxrwxrwx 1 neilb people 0 Nov 2 11:50 /proc/self/cwd -> /afs/inf.ed.ac.uk/user/n/neilb (deleted)

Note the "(deleted)". But if I cd down one level in my home dir:

***@jingz(tmp)> ls -l /proc/self/cwd
lrwxrwxrwx 1 neilb people 0 Nov 2 11:49 /proc/self/cwd -> /afs/inf.ed.ac.uk/user/n/neilb/tmp/
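
The same check can be scripted. Below is a minimal Python sketch, assuming a Linux client with /proc mounted; the function names `cwd_link_state` and `current_cwd_state` are hypothetical. It reads the /proc/self/cwd link and flags the " (deleted)" suffix that the kernel appends when it considers the directory unlinked.

```python
import os

DELETED_SUFFIX = " (deleted)"


def cwd_link_state(target):
    """Split a /proc/<pid>/cwd link target into (path, looks_deleted).

    Caveat: a directory whose real name ends in " (deleted)" would be
    misclassified; this is only a quick diagnostic.
    """
    if target.endswith(DELETED_SUFFIX):
        return target[: -len(DELETED_SUFFIX)], True
    return target, False


def current_cwd_state():
    """Read this process's cwd link (Linux only) and classify it."""
    return cwd_link_state(os.readlink("/proc/self/cwd"))


if __name__ == "__main__" and os.path.lexists("/proc/self/cwd"):
    path, looks_deleted = current_cwd_state()
    print(path, "(kernel reports it as deleted)" if looks_deleted else "(ok)")
```

Running it from the home directory and then from a subdirectory should show the same difference as the two `ls -l` outputs above.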

If it really is just us seeing this, I wonder whether it may be related
to our use of automount (autofs). We use automount for various mappings,
but one maps /autofs/nethome/USERNAME to the corresponding /afs/ path,
and /home/ is a symlink to /autofs/nethome/. When the bug strikes,
I can't access /home/neilb or /autofs/nethome/neilb:

***@jingz(~)> ls /autofs/nethome/neilb
ls: cannot access /autofs/nethome/neilb: No such file or directory

I have references to /home/neilb in various dot files. I'm going to remove
those and see if things improve.

Neil
Jonathan Billings
2018-11-02 12:22:18 UTC
Post by Neil Brown
If it really is just us that's seeing this. I wonder if it may be related
to our use of automount (autofs). We use automount for various mappings,
but one is to map /autofs/nethome/USERNAME -> to the corresponding /afs/
path and /home/ is a symlink to /autofs/nethome/. When the bug strikes
then I can't access /home/neilb or /autofs/nethome/neilb
I've been using autofs to create bind mounts into AFS (because we use an
even more complicated path in AFS that can't be templated), and I've been
seeing a similar bug. The mount() syscall fails even though I can see
the AFS home directory via ls and other tools.
--
Jonathan Billings <***@umich.edu>
College of Engineering - CAEN - Unix and Linux Support
Benjamin Kaduk
2018-11-03 14:44:10 UTC
Post by Neil Brown
If it really is just us that's seeing this. I wonder if it may be related
to our use of automount (autofs). We use automount for various mappings,
but one is to map /autofs/nethome/USERNAME -> to the corresponding /afs/
path and /home/ is a symlink to /autofs/nethome/. When the bug strikes
then I can't access /home/neilb or /autofs/nethome/neilb
Automount seems likely to engage the annoying interactions required by the
Linux kernel VFS's insistence on a single canonical path for a given
dentry, yes.

-Ben
Jonathan Billings
2018-11-05 13:24:54 UTC
Post by Benjamin Kaduk
Automount seems likely to engage the annoying interactions required by the
linux kernel VFS's insistence on a single canonical path for a given
dentry, yes.
Any suggestions for providing a uniform path for AFS homedirs that can be
used with software like sssd's homedir_template for users like ours, who
have complicated paths for their AFS home? Unfortunately, I have little
control over how volumes are mounted in the cell I use.

The only other thing I can think of is to instead create a symlink in
/home, but that might cause issues.
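
One visible side effect of the symlink approach can be shown with a short Python sketch. It uses a temporary directory as a stand-in for both /home and the AFS path, so every path in it is hypothetical. After entering a directory through a symlink, getcwd() reports the physical target rather than the symlink, so software that compares the cwd with the configured home directory would see a different string.

```python
import os
import tempfile


def cwd_after_entering_symlink():
    """Enter a directory through a symlink and return (link, cwd, target)."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.realpath(tmp)
        # Stand-in for a deep AFS home directory path.
        target = os.path.join(base, "afs", "deep", "path", "user")
        # Stand-in for a /home/USERNAME symlink.
        link = os.path.join(base, "home_user")
        os.makedirs(target)
        os.symlink(target, link)
        previous = os.getcwd()
        try:
            os.chdir(link)
            cwd = os.getcwd()  # physical path, not the symlink
        finally:
            os.chdir(previous)
        return link, cwd, target


if __name__ == "__main__":
    link, cwd, target = cwd_after_entering_symlink()
    print("entered via:", link)
    print("getcwd says:", cwd)
```

Interactive shells usually track a logical $PWD, so `pwd` may still show the symlink path even though getcwd() does not.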