[PATCH 1/2] test/py: Use loop mounts if guestmount fails in filesystem tests

If guestmount isn't available on the system, filesystem test setup falls back to using loop mounts to prepare its disk images. If guestmount is available but fails to work, the tests are immediately skipped. Instead of giving up on a guestmount failure, try using loop mounts as an attempt to keep tests running.
Also stop checking if guestmount is in PATH, as trying to run a missing guestmount can now follow the same failure codepath and fall back to loop mounts anyway.
Signed-off-by: Alper Nebi Yasak alpernebiyasak@gmail.com ---
test/py/tests/test_fs/conftest.py | 31 +++++++++++++++---------------- 1 file changed, 15 insertions(+), 16 deletions(-)
diff --git a/test/py/tests/test_fs/conftest.py b/test/py/tests/test_fs/conftest.py index 410a675b9714..e3c461635f8e 100644 --- a/test/py/tests/test_fs/conftest.py +++ b/test/py/tests/test_fs/conftest.py @@ -205,24 +205,23 @@ def mount_fs(fs_type, device, mount_point): """ global fuse_mounted
- fuse_mounted = False try: - if tool_is_in_path('guestmount'): - fuse_mounted = True - check_call('guestmount -a %s -m /dev/sda %s' - % (device, mount_point), shell=True) - else: - mount_opt = 'loop,rw' - if re.match('fat', fs_type): - mount_opt += ',umask=0000' - - check_call('sudo mount -o %s %s %s' - % (mount_opt, device, mount_point), shell=True) - - # may not be effective for some file systems - check_call('sudo chmod a+rw %s' % mount_point, shell=True) + check_call('guestmount -a %s -m /dev/sda %s' + % (device, mount_point), shell=True) + fuse_mounted = True + return except CalledProcessError: - raise + fuse_mounted = False + + mount_opt = 'loop,rw' + if re.match('fat', fs_type): + mount_opt += ',umask=0000' + + check_call('sudo mount -o %s %s %s' + % (mount_opt, device, mount_point), shell=True) + + # may not be effective for some file systems + check_call('sudo chmod a+rw %s' % mount_point, shell=True)
def umount_fs(mount_point): """Unmount a volume.

Some filesystem tests are failing when their image is prepared with guestmount, but succeeding if loop mounts are used instead. The reason seems to be a race condition the guestmount(1) manual page explains:
When guestunmount(1)/fusermount(1) exits, guestmount may still be running and cleaning up the mountpoint. The disk image will not be fully finalized.
This means that scripts like the following have a nasty race condition:
guestmount -a disk.img -i /mnt # copy things into /mnt guestunmount /mnt # immediately try to use 'disk.img' ** UNSAFE **
The solution is to use the --pid-file option to write the guestmount PID to a file, then after guestunmount spin waiting for this PID to exit.
The Python standard library has an os.waitpid() function for waiting a child to terminate, but it cannot wait on non-child processes. Implement a utility function that can do this by polling the process repeatedly for a given duration, optionally killing the process if it won't terminate on its own. Apply the suggested solution with this utility function, which makes the failing tests succeed again.
Signed-off-by: Alper Nebi Yasak alpernebiyasak@gmail.com ---
test/py/tests/test_fs/conftest.py | 13 ++++++++++- test/py/u_boot_utils.py | 36 +++++++++++++++++++++++++++++++ 2 files changed, 48 insertions(+), 1 deletion(-)
diff --git a/test/py/tests/test_fs/conftest.py b/test/py/tests/test_fs/conftest.py index e3c461635f8e..6b1ff05a8143 100644 --- a/test/py/tests/test_fs/conftest.py +++ b/test/py/tests/test_fs/conftest.py @@ -8,6 +8,7 @@ import re from subprocess import call, check_call, check_output, CalledProcessError from fstest_defs import * +import u_boot_utils as util
supported_fs_basic = ['fat16', 'fat32', 'ext4'] supported_fs_ext = ['fat16', 'fat32'] @@ -206,7 +207,7 @@ def mount_fs(fs_type, device, mount_point): global fuse_mounted
try: - check_call('guestmount -a %s -m /dev/sda %s' + check_call('guestmount --pid-file guestmount.pid -a %s -m /dev/sda %s' % (device, mount_point), shell=True) fuse_mounted = True return @@ -235,6 +236,16 @@ def umount_fs(mount_point): if fuse_mounted: call('sync') call('guestunmount %s' % mount_point, shell=True) + + try: + with open("guestmount.pid", "r") as pidfile: + pid = int(pidfile.read()) + util.waitpid(pid, kill=True) + os.remove("guestmount.pid") + + except FileNotFoundError: + pass + else: call('sudo umount %s' % mount_point, shell=True)
diff --git a/test/py/u_boot_utils.py b/test/py/u_boot_utils.py index 939d82eec12a..e816c7fbb6a3 100644 --- a/test/py/u_boot_utils.py +++ b/test/py/u_boot_utils.py @@ -8,6 +8,7 @@ import os import os.path import pytest +import signal import sys import time import re @@ -339,3 +340,38 @@ def crc32(u_boot_console, address, count): assert m, 'CRC32 operation failed.'
return m.group(1) + +def waitpid(pid, timeout=60, kill=False): + """Wait a process to terminate by its PID + + This is an alternative to a os.waitpid(pid, 0) call that works on + processes that aren't children of the python process. + + Args: + pid: PID of a running process. + timeout: Time in seconds to wait. + kill: Whether to forcibly kill the process after timeout. + + Returns: + True, if the process ended on its own. + False, if the process was killed by this function. + + Raises: + TimeoutError, if the process is still running after timeout. + """ + try: + for _ in range(timeout): + os.kill(pid, 0) + time.sleep(1) + + if kill: + os.kill(pid, signal.SIGKILL) + return False + + except ProcessLookupError: + return True + + raise TimeoutError( + "Process with PID {} did not terminate after {} seconds." + .format(pid, timeout) + )

On Fri, 4 Jun 2021 at 13:05, Alper Nebi Yasak alpernebiyasak@gmail.com wrote:
Some filesystem tests are failing when their image is prepared with guestmount, but succeeding if loop mounts are used instead. The reason seems to be a race condition the guestmount(1) manual page explains:
When guestunmount(1)/fusermount(1) exits, guestmount may still be running and cleaning up the mountpoint. The disk image will not be fully finalized. This means that scripts like the following have a nasty race condition: guestmount -a disk.img -i /mnt # copy things into /mnt guestunmount /mnt # immediately try to use 'disk.img' ** UNSAFE ** The solution is to use the --pid-file option to write the guestmount PID to a file, then after guestunmount spin waiting for this PID to exit.
The Python standard library has an os.waitpid() function for waiting a child to terminate, but it cannot wait on non-child processes. Implement a utility function that can do this by polling the process repeatedly for a given duration, optionally killing the process if it won't terminate on its own. Apply the suggested solution with this utility function, which makes the failing tests succeed again.
Signed-off-by: Alper Nebi Yasak alpernebiyasak@gmail.com
test/py/tests/test_fs/conftest.py | 13 ++++++++++- test/py/u_boot_utils.py | 36 +++++++++++++++++++++++++++++++ 2 files changed, 48 insertions(+), 1 deletion(-)
Reviewed-by: Simon Glass sjg@chromium.org
diff --git a/test/py/tests/test_fs/conftest.py b/test/py/tests/test_fs/conftest.py index e3c461635f8e..6b1ff05a8143 100644 --- a/test/py/tests/test_fs/conftest.py +++ b/test/py/tests/test_fs/conftest.py @@ -8,6 +8,7 @@ import re from subprocess import call, check_call, check_output, CalledProcessError from fstest_defs import * +import u_boot_utils as util
supported_fs_basic = ['fat16', 'fat32', 'ext4'] supported_fs_ext = ['fat16', 'fat32'] @@ -206,7 +207,7 @@ def mount_fs(fs_type, device, mount_point): global fuse_mounted
try:
check_call('guestmount -a %s -m /dev/sda %s'
check_call('guestmount --pid-file guestmount.pid -a %s -m /dev/sda %s' % (device, mount_point), shell=True) fuse_mounted = True return
@@ -235,6 +236,16 @@ def umount_fs(mount_point): if fuse_mounted: call('sync')
should you remove guestmount.pid first in case it is there from a previous crash?
call('guestunmount %s' % mount_point, shell=True)
try:
with open("guestmount.pid", "r") as pidfile:
pid = int(pidfile.read())
util.waitpid(pid, kill=True)
os.remove("guestmount.pid")
except FileNotFoundError:
pass
- else: call('sudo umount %s' % mount_point, shell=True)
diff --git a/test/py/u_boot_utils.py b/test/py/u_boot_utils.py index 939d82eec12a..e816c7fbb6a3 100644 --- a/test/py/u_boot_utils.py +++ b/test/py/u_boot_utils.py @@ -8,6 +8,7 @@ import os import os.path import pytest +import signal import sys import time import re @@ -339,3 +340,38 @@ def crc32(u_boot_console, address, count): assert m, 'CRC32 operation failed.'
return m.group(1)
+def waitpid(pid, timeout=60, kill=False):
- """Wait a process to terminate by its PID
- This is an alternative to a os.waitpid(pid, 0) call that works on
- processes that aren't children of the python process.
- Args:
pid: PID of a running process.
timeout: Time in seconds to wait.
kill: Whether to forcibly kill the process after timeout.
- Returns:
True, if the process ended on its own.
False, if the process was killed by this function.
- Raises:
TimeoutError, if the process is still running after timeout.
- """
- try:
for _ in range(timeout):
os.kill(pid, 0)
time.sleep(1)
if kill:
os.kill(pid, signal.SIGKILL)
return False
- except ProcessLookupError:
return True
- raise TimeoutError(
"Process with PID {} did not terminate after {} seconds."
.format(pid, timeout)
- )
-- 2.32.0.rc2

On 26/06/2021 21:29, Simon Glass wrote:
On Fri, 4 Jun 2021 at 13:05, Alper Nebi Yasak alpernebiyasak@gmail.com wrote:
Some filesystem tests are failing when their image is prepared with guestmount, but succeeding if loop mounts are used instead. The reason seems to be a race condition the guestmount(1) manual page explains:
When guestunmount(1)/fusermount(1) exits, guestmount may still be running and cleaning up the mountpoint. The disk image will not be fully finalized. This means that scripts like the following have a nasty race condition: guestmount -a disk.img -i /mnt # copy things into /mnt guestunmount /mnt # immediately try to use 'disk.img' ** UNSAFE ** The solution is to use the --pid-file option to write the guestmount PID to a file, then after guestunmount spin waiting for this PID to exit.
The Python standard library has an os.waitpid() function for waiting a child to terminate, but it cannot wait on non-child processes. Implement a utility function that can do this by polling the process repeatedly for a given duration, optionally killing the process if it won't terminate on its own. Apply the suggested solution with this utility function, which makes the failing tests succeed again.
Signed-off-by: Alper Nebi Yasak alpernebiyasak@gmail.com
test/py/tests/test_fs/conftest.py | 13 ++++++++++- test/py/u_boot_utils.py | 36 +++++++++++++++++++++++++++++++ 2 files changed, 48 insertions(+), 1 deletion(-)
Reviewed-by: Simon Glass sjg@chromium.org
diff --git a/test/py/tests/test_fs/conftest.py b/test/py/tests/test_fs/conftest.py index e3c461635f8e..6b1ff05a8143 100644 --- a/test/py/tests/test_fs/conftest.py +++ b/test/py/tests/test_fs/conftest.py @@ -8,6 +8,7 @@ import re from subprocess import call, check_call, check_output, CalledProcessError from fstest_defs import * +import u_boot_utils as util
supported_fs_basic = ['fat16', 'fat32', 'ext4'] supported_fs_ext = ['fat16', 'fat32'] @@ -206,7 +207,7 @@ def mount_fs(fs_type, device, mount_point): global fuse_mounted
try:
check_call('guestmount -a %s -m /dev/sda %s'
check_call('guestmount --pid-file guestmount.pid -a %s -m /dev/sda %s' % (device, mount_point), shell=True) fuse_mounted = True return
@@ -235,6 +236,16 @@ def umount_fs(mount_point): if fuse_mounted: call('sync')
should you remove guestmount.pid first in case it is there from a previous crash?
Guestmount overwrites the pid file without complaining, and the worker deletes the file on its own when it terminates after the guestunmount call.
I think it's OK to ignore an existing pid file if the mount point is clean. If not clean, that's the actual problem and both guestmount and mount will fail, causing the tests to be skipped. We could clean mount points automatically before attempting to mount.
call('guestunmount %s' % mount_point, shell=True)
try:
with open("guestmount.pid", "r") as pidfile:
pid = int(pidfile.read())
util.waitpid(pid, kill=True)
os.remove("guestmount.pid")
except FileNotFoundError:
pass
- else: call('sudo umount %s' % mount_point, shell=True)
diff --git a/test/py/u_boot_utils.py b/test/py/u_boot_utils.py index 939d82eec12a..e816c7fbb6a3 100644 --- a/test/py/u_boot_utils.py +++ b/test/py/u_boot_utils.py @@ -8,6 +8,7 @@ import os import os.path import pytest +import signal import sys import time import re @@ -339,3 +340,38 @@ def crc32(u_boot_console, address, count): assert m, 'CRC32 operation failed.'
return m.group(1)
+def waitpid(pid, timeout=60, kill=False):
- """Wait a process to terminate by its PID
- This is an alternative to a os.waitpid(pid, 0) call that works on
- processes that aren't children of the python process.
- Args:
pid: PID of a running process.
timeout: Time in seconds to wait.
kill: Whether to forcibly kill the process after timeout.
- Returns:
True, if the process ended on its own.
False, if the process was killed by this function.
- Raises:
TimeoutError, if the process is still running after timeout.
- """
- try:
for _ in range(timeout):
os.kill(pid, 0)
time.sleep(1)
if kill:
os.kill(pid, signal.SIGKILL)
return False
- except ProcessLookupError:
return True
- raise TimeoutError(
"Process with PID {} did not terminate after {} seconds."
.format(pid, timeout)
- )
-- 2.32.0.rc2

On Fri, Jun 04, 2021 at 10:04:46PM +0300, Alper Nebi Yasak wrote:
Some filesystem tests are failing when their image is prepared with guestmount, but succeeding if loop mounts are used instead. The reason seems to be a race condition the guestmount(1) manual page explains:
When guestunmount(1)/fusermount(1) exits, guestmount may still be running and cleaning up the mountpoint. The disk image will not be fully finalized. This means that scripts like the following have a nasty race condition: guestmount -a disk.img -i /mnt # copy things into /mnt guestunmount /mnt # immediately try to use 'disk.img' ** UNSAFE ** The solution is to use the --pid-file option to write the guestmount PID to a file, then after guestunmount spin waiting for this PID to exit.
The Python standard library has an os.waitpid() function for waiting a child to terminate, but it cannot wait on non-child processes. Implement a utility function that can do this by polling the process repeatedly for a given duration, optionally killing the process if it won't terminate on its own. Apply the suggested solution with this utility function, which makes the failing tests succeed again.
Signed-off-by: Alper Nebi Yasak alpernebiyasak@gmail.com Reviewed-by: Simon Glass sjg@chromium.org
Applied to u-boot/master, thanks!

On Fri, 4 Jun 2021 at 13:05, Alper Nebi Yasak alpernebiyasak@gmail.com wrote:
If guestmount isn't available on the system, filesystem test setup falls back to using loop mounts to prepare its disk images. If guestmount is available but fails to work, the tests are immediately skipped. Instead of giving up on a guestmount failure, try using loop mounts as an attempt to keep tests running.
Also stop checking if guestmount is in PATH, as trying to run a missing guestmount can now follow the same failure codepath and fall back to loop mounts anyway.
Signed-off-by: Alper Nebi Yasak alpernebiyasak@gmail.com
test/py/tests/test_fs/conftest.py | 31 +++++++++++++++---------------- 1 file changed, 15 insertions(+), 16 deletions(-)
https://source.denx.de/u-boot/custodians/u-boot-dm/-/jobs/282788
thoughts below
diff --git a/test/py/tests/test_fs/conftest.py b/test/py/tests/test_fs/conftest.py index 410a675b9714..e3c461635f8e 100644 --- a/test/py/tests/test_fs/conftest.py +++ b/test/py/tests/test_fs/conftest.py @@ -205,24 +205,23 @@ def mount_fs(fs_type, device, mount_point): """ global fuse_mounted
- fuse_mounted = False try:
if tool_is_in_path('guestmount'):
fuse_mounted = True
check_call('guestmount -a %s -m /dev/sda %s'
% (device, mount_point), shell=True)
else:
mount_opt = 'loop,rw'
if re.match('fat', fs_type):
mount_opt += ',umask=0000'
check_call('sudo mount -o %s %s %s'
% (mount_opt, device, mount_point), shell=True)
# may not be effective for some file systems
check_call('sudo chmod a+rw %s' % mount_point, shell=True)
check_call('guestmount -a %s -m /dev/sda %s'
% (device, mount_point), shell=True)
fuse_mounted = True
except CalledProcessError:return
raise
fuse_mounted = False
- mount_opt = 'loop,rw'
- if re.match('fat', fs_type):
This seems ambiguous. I think you should use:
if 'fat' in fs_type:
or
if fs_type.startswith('fat'):
or
if fs_type == 'fat':
so it is clear what you mean.
mount_opt += ',umask=0000'
- check_call('sudo mount -o %s %s %s'
% (mount_opt, device, mount_point), shell=True)
- # may not be effective for some file systems
- check_call('sudo chmod a+rw %s' % mount_point, shell=True)
def umount_fs(mount_point): """Unmount a volume. -- 2.32.0.rc2
Regards, Simon

On 26/06/2021 21:29, Simon Glass wrote:
On Fri, 4 Jun 2021 at 13:05, Alper Nebi Yasak alpernebiyasak@gmail.com wrote:
If guestmount isn't available on the system, filesystem test setup falls back to using loop mounts to prepare its disk images. If guestmount is available but fails to work, the tests are immediately skipped. Instead of giving up on a guestmount failure, try using loop mounts as an attempt to keep tests running.
Also stop checking if guestmount is in PATH, as trying to run a missing guestmount can now follow the same failure codepath and fall back to loop mounts anyway.
Signed-off-by: Alper Nebi Yasak alpernebiyasak@gmail.com
test/py/tests/test_fs/conftest.py | 31 +++++++++++++++---------------- 1 file changed, 15 insertions(+), 16 deletions(-)
https://source.denx.de/u-boot/custodians/u-boot-dm/-/jobs/282788
thoughts below
That CI run doesn't have this patch/series applied.
diff --git a/test/py/tests/test_fs/conftest.py b/test/py/tests/test_fs/conftest.py index 410a675b9714..e3c461635f8e 100644 --- a/test/py/tests/test_fs/conftest.py +++ b/test/py/tests/test_fs/conftest.py @@ -205,24 +205,23 @@ def mount_fs(fs_type, device, mount_point): """ global fuse_mounted
- fuse_mounted = False try:
if tool_is_in_path('guestmount'):
fuse_mounted = True
check_call('guestmount -a %s -m /dev/sda %s'
% (device, mount_point), shell=True)
else:
mount_opt = 'loop,rw'
if re.match('fat', fs_type):
mount_opt += ',umask=0000'
check_call('sudo mount -o %s %s %s'
% (mount_opt, device, mount_point), shell=True)
# may not be effective for some file systems
check_call('sudo chmod a+rw %s' % mount_point, shell=True)
check_call('guestmount -a %s -m /dev/sda %s'
% (device, mount_point), shell=True)
fuse_mounted = True
except CalledProcessError:return
raise
fuse_mounted = False
- mount_opt = 'loop,rw'
- if re.match('fat', fs_type):
This seems ambiguous. I think you should use:
if 'fat' in fs_type:
or
if fs_type.startswith('fat'):
or
if fs_type == 'fat':
so it is clear what you mean.
This just unindents existing code. I can send v2 with startswith('fat') though.
(I think I should rewrite the tests with guestfish and remove these mount functions instead; but I'm busy with other things for a while.)
mount_opt += ',umask=0000'
- check_call('sudo mount -o %s %s %s'
% (mount_opt, device, mount_point), shell=True)
- # may not be effective for some file systems
- check_call('sudo chmod a+rw %s' % mount_point, shell=True)
def umount_fs(mount_point): """Unmount a volume. -- 2.32.0.rc2
Regards, Simon

On Fri, Jun 04, 2021 at 10:04:45PM +0300, Alper Nebi Yasak wrote:
If guestmount isn't available on the system, filesystem test setup falls back to using loop mounts to prepare its disk images. If guestmount is available but fails to work, the tests are immediately skipped. Instead of giving up on a guestmount failure, try using loop mounts as an attempt to keep tests running.
Also stop checking if guestmount is in PATH, as trying to run a missing guestmount can now follow the same failure codepath and fall back to loop mounts anyway.
Signed-off-by: Alper Nebi Yasak alpernebiyasak@gmail.com
Applied to u-boot/master, thanks!
participants (3)
-
Alper Nebi Yasak
-
Simon Glass
-
Tom Rini