[PATCH V2] spi: Update speed/mode on change

The spi_get_bus_and_cs() may be called on the same bus and chipselect with different frequency or mode. This is valid usecase, but the code fails to notify the controller of such a configuration change. Call spi_set_speed_mode() in case bus frequency or bus mode changed to let the controller update the configuration.
The problem can easily be triggered using the sspi command: => sspi 0:0@1000 => sspi 0:0@2000 Without this patch, both transfers happen at 1000 Hz. With this patch, the later transfer happens correctly at 2000 Hz.
Signed-off-by: Marek Vasut marex@denx.de Cc: Jagan Teki jagan@amarulasolutions.com Cc: Patrick Delaunay patrick.delaunay@st.com --- V2: - Use bus_data->speed for the check - Release the bus on error --- drivers/spi/spi-uclass.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/spi/spi-uclass.c b/drivers/spi/spi-uclass.c index 7155d4aebd6..5617f6645ee 100644 --- a/drivers/spi/spi-uclass.c +++ b/drivers/spi/spi-uclass.c @@ -405,12 +405,22 @@ int spi_get_bus_and_cs(int busnum, int cs, int speed, int mode, goto err; }
+ /* In case bus frequency or mode changed, update it. */ + if ((speed && bus_data->speed && bus_data->speed != speed) || + (plat->mode != mode)) { + ret = spi_set_speed_mode(bus, speed, mode); + if (ret) + goto err_speed_mode; + } + *busp = bus; *devp = slave; log_debug("%s: bus=%p, slave=%p\n", __func__, bus, *devp);
return 0;
+err_speed_mode: + spi_release_bus(slave); err: log_debug("%s: Error path, created=%d, device '%s'\n", __func__, created, dev->name);

On 2/26/21 3:21 PM, Marek Vasut wrote:
The spi_get_bus_and_cs() may be called on the same bus and chipselect with different frequency or mode. This is valid usecase, but the code fails to notify the controller of such a configuration change. Call spi_set_speed_mode() in case bus frequency or bus mode changed to let the controller update the configuration.
The problem can easily be triggered using the sspi command: => sspi 0:0@1000 => sspi 0:0@2000 Without this patch, both transfers happen at 1000 Hz. With this patch, the later transfer happens correctly at 2000 Hz.
Signed-off-by: Marek Vasut marex@denx.de Cc: Jagan Teki jagan@amarulasolutions.com Cc: Patrick Delaunay patrick.delaunay@st.com
This bugfix is still missing from upstream.

On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote:
The spi_get_bus_and_cs() may be called on the same bus and chipselect with different frequency or mode. This is valid usecase, but the code fails to notify the controller of such a configuration change. Call spi_set_speed_mode() in case bus frequency or bus mode changed to let the controller update the configuration.
The problem can easily be triggered using the sspi command: => sspi 0:0@1000 => sspi 0:0@2000 Without this patch, both transfers happen at 1000 Hz. With this patch, the later transfer happens correctly at 2000 Hz.
Signed-off-by: Marek Vasut marex@denx.de Cc: Jagan Teki jagan@amarulasolutions.com Cc: Patrick Delaunay patrick.delaunay@st.com
So, very reliably I can make: https://source.denx.de/u-boot/u-boot/-/jobs/245517 happen locally as well building with clang. It's not obvious to me why the test now fails however.

On 3/28/21 11:30 PM, Tom Rini wrote:
On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote:
The spi_get_bus_and_cs() may be called on the same bus and chipselect with different frequency or mode. This is valid usecase, but the code fails to notify the controller of such a configuration change. Call spi_set_speed_mode() in case bus frequency or bus mode changed to let the controller update the configuration.
The problem can easily be triggered using the sspi command: => sspi 0:0@1000 => sspi 0:0@2000 Without this patch, both transfers happen at 1000 Hz. With this patch, the later transfer happens correctly at 2000 Hz.
Signed-off-by: Marek Vasut marex@denx.de Cc: Jagan Teki jagan@amarulasolutions.com Cc: Patrick Delaunay patrick.delaunay@st.com
So, very reliably I can make: https://source.denx.de/u-boot/u-boot/-/jobs/245517 happen locally as well building with clang. It's not obvious to me why the test now fails however.
Can you please be more specific / clear ? I have no idea what those 300 lines of cryptic output mean, nor what are you trying to say by the above, sorry.

On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote:
On 3/28/21 11:30 PM, Tom Rini wrote:
On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote:
The spi_get_bus_and_cs() may be called on the same bus and chipselect with different frequency or mode. This is valid usecase, but the code fails to notify the controller of such a configuration change. Call spi_set_speed_mode() in case bus frequency or bus mode changed to let the controller update the configuration.
The problem can easily be triggered using the sspi command: => sspi 0:0@1000 => sspi 0:0@2000 Without this patch, both transfers happen at 1000 Hz. With this patch, the later transfer happens correctly at 2000 Hz.
Signed-off-by: Marek Vasut marex@denx.de Cc: Jagan Teki jagan@amarulasolutions.com Cc: Patrick Delaunay patrick.delaunay@st.com
So, very reliably I can make: https://source.denx.de/u-boot/u-boot/-/jobs/245517 happen locally as well building with clang. It's not obvious to me why the test now fails however.
Can you please be more specific / clear ? I have no idea what those 300 lines of cryptic output mean, nor what are you trying to say by the above, sorry.
If you build with clang, for sandbox, and run the tests, U-Boot crashes in the unit tests that you start with "ut dm".

On 3/29/21 3:59 AM, Tom Rini wrote:
On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote:
On 3/28/21 11:30 PM, Tom Rini wrote:
On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote:
The spi_get_bus_and_cs() may be called on the same bus and chipselect with different frequency or mode. This is valid usecase, but the code fails to notify the controller of such a configuration change. Call spi_set_speed_mode() in case bus frequency or bus mode changed to let the controller update the configuration.
The problem can easily be triggered using the sspi command: => sspi 0:0@1000 => sspi 0:0@2000 Without this patch, both transfers happen at 1000 Hz. With this patch, the later transfer happens correctly at 2000 Hz.
Signed-off-by: Marek Vasut marex@denx.de Cc: Jagan Teki jagan@amarulasolutions.com Cc: Patrick Delaunay patrick.delaunay@st.com
So, very reliably I can make: https://source.denx.de/u-boot/u-boot/-/jobs/245517 happen locally as well building with clang. It's not obvious to me why the test now fails however.
Can you please be more specific / clear ? I have no idea what those 300 lines of cryptic output mean, nor what are you trying to say by the above, sorry.
If you build with clang, for sandbox, and run the tests, U-Boot crashes in the unit tests that you start with "ut dm".
And that is related to this patch somehow ? How ?

On 3/29/21 4:09 AM, Marek Vasut wrote:
On 3/29/21 3:59 AM, Tom Rini wrote:
On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote:
On 3/28/21 11:30 PM, Tom Rini wrote:
On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote:
The spi_get_bus_and_cs() may be called on the same bus and chipselect with different frequency or mode. This is valid usecase, but the code fails to notify the controller of such a configuration change. Call spi_set_speed_mode() in case bus frequency or bus mode changed to let the controller update the configuration.
The problem can easily be triggered using the sspi command: => sspi 0:0@1000 => sspi 0:0@2000 Without this patch, both transfers happen at 1000 Hz. With this patch, the later transfer happens correctly at 2000 Hz.
Signed-off-by: Marek Vasut marex@denx.de Cc: Jagan Teki jagan@amarulasolutions.com Cc: Patrick Delaunay patrick.delaunay@st.com
So, very reliably I can make: https://source.denx.de/u-boot/u-boot/-/jobs/245517 happen locally as well building with clang. It's not obvious to me why the test now fails however.
Can you please be more specific / clear ? I have no idea what those 300 lines of cryptic output mean, nor what are you trying to say by the above, sorry.
If you build with clang, for sandbox, and run the tests, U-Boot crashes in the unit tests that you start with "ut dm".
And that is related to this patch somehow ? How ?
... after further discussion off-list to get a better test case description ...
It seems the problem is in sound_beep() and it is unrelated to this patch, as the same problem happens with / without this patch being applied, on:
a7d3ac61482 ("Merge branch '2021-03-28-assorted-bugfixes'")
$ clang --version Debian clang version 11.0.1-2 ... $ make CC=clang HOSTCC=clang sandbox_defconfig $ make CC=clang HOSTCC=clang $ gdb --args ./u-boot -d arch/sandbox/dts/test.dtb => ut dm ... Program received signal SIGSEGV, Segmentation fault. dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 1623 if (!(inuse_bit_at_offset(next, nextsz))) /* consolidate forward */ (gdb) bt #0 dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 #1 0x00000000004759e3 in sound_beep (dev=0x15c05580, msecs=<optimized out>, frequency_hz=100) at drivers/sound/sound-uclass.c:118 #2 0x000000000055e083 in dm_test_sound (uts=0x652400 <global_dm_test_state>) at test/dm/sound.c:28 #3 0x0000000000514c4d in dm_do_test (uts=0x652400 <global_dm_test_state>, test=test@entry=0x621520 <_u_boot_list_2_dm_test_2_dm_test_sound>, of_live=<optimized out>) at test/dm/test-main.c:107 #4 0x00000000005147a4 in dm_test_main (test_name=0x0) at test/dm/test-main.c:189 #5 0x00000000004420c8 in cmd_call (cmdtp=0x61ebf8 <_u_boot_list_2_cmd_2_ut>, flag=<optimized out>, argc=2, argv=0x15bd63d0, repeatable=0x7fffffffda34) at common/command.c:580 #6 cmd_process (flag=<optimized out>, flag@entry=0, argc=<optimized out>, argv=0x15bd63d0, repeatable=0x63b97c <flag_repeat>, ticks=ticks@entry=0x0) at common/command.c:635 #7 0x000000000042c574 in run_pipe_real (pi=0x15bd6300) at common/cli_hush.c:1676 #8 run_list_real (pi=<optimized out>, pi@entry=0x15bd6300) at common/cli_hush.c:1873 #9 0x000000000042b415 in run_list (pi=0x15bd6300) at common/cli_hush.c:2022 #10 parse_stream_outer (inp=inp@entry=0x7fffffffdbe8, flag=<optimized out>, flag@entry=2) at common/cli_hush.c:3206 #11 0x000000000042b57a in parse_file_outer () at common/cli_hush.c:3289 #12 0x0000000000441517 in cli_loop () at common/cli.c:230 #13 0x000000000042a0ad in main_loop () at common/main.c:66 #14 0x000000000042d57e in run_main_loop () at common/board_r.c:588 #15 0x000000000042d32e in initcall_run_list (init_sequence=<optimized out>) at /u-boot/include/initcall.h:46 #16 board_init_r (new_gd=<optimized out>, dest_addr=dest_addr@entry=0) at common/board_r.c:830 #17 0x0000000000403a85 in main (argc=3, argv=0x7fffffffdfa8) at arch/sandbox/cpu/start.c:498

On 3/29/21 4:40 AM, Marek Vasut wrote:
On 3/29/21 4:09 AM, Marek Vasut wrote:
On 3/29/21 3:59 AM, Tom Rini wrote:
On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote:
On 3/28/21 11:30 PM, Tom Rini wrote:
On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote:
The spi_get_bus_and_cs() may be called on the same bus and chipselect with different frequency or mode. This is valid usecase, but the code fails to notify the controller of such a configuration change. Call spi_set_speed_mode() in case bus frequency or bus mode changed to let the controller update the configuration.
The problem can easily be triggered using the sspi command: => sspi 0:0@1000 => sspi 0:0@2000 Without this patch, both transfers happen at 1000 Hz. With this patch, the later transfer happens correctly at 2000 Hz.
Signed-off-by: Marek Vasut marex@denx.de Cc: Jagan Teki jagan@amarulasolutions.com Cc: Patrick Delaunay patrick.delaunay@st.com
So, very reliably I can make: https://source.denx.de/u-boot/u-boot/-/jobs/245517 happen locally as well building with clang. It's not obvious to me why the test now fails however.
Can you please be more specific / clear ? I have no idea what those 300 lines of cryptic output mean, nor what are you trying to say by the above, sorry.
If you build with clang, for sandbox, and run the tests, U-Boot crashes in the unit tests that you start with "ut dm".
And that is related to this patch somehow ? How ?
... after further discussion off-list to get a better test case description ...
It seems the problem is in sound_beep() and it is unrelated to this patch, as the same problem happens with / without this patch being applied, on:
a7d3ac61482 ("Merge branch '2021-03-28-assorted-bugfixes'")
$ clang --version Debian clang version 11.0.1-2 ... $ make CC=clang HOSTCC=clang sandbox_defconfig $ make CC=clang HOSTCC=clang $ gdb --args ./u-boot -d arch/sandbox/dts/test.dtb => ut dm ... Program received signal SIGSEGV, Segmentation fault. dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 1623 if (!(inuse_bit_at_offset(next, nextsz))) /* consolidate forward */ (gdb) bt #0 dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 #1 0x00000000004759e3 in sound_beep (dev=0x15c05580, msecs=<optimized out>, frequency_hz=100) at drivers/sound/sound-uclass.c:118
Tom, any further comments ? It seems your claim that this bugfix is causing problems with clang is not correct, so I think it should still be applied to v2021.04 , unless you can provide more details about the clang problem ?

On Wed, Mar 31, 2021 at 09:18:28PM +0200, Marek Vasut wrote:
On 3/29/21 4:40 AM, Marek Vasut wrote:
On 3/29/21 4:09 AM, Marek Vasut wrote:
On 3/29/21 3:59 AM, Tom Rini wrote:
On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote:
On 3/28/21 11:30 PM, Tom Rini wrote:
On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote:
> The spi_get_bus_and_cs() may be called on the same bus and chipselect > with different frequency or mode. This is valid usecase, but the code > fails to notify the controller of such a configuration change. Call > spi_set_speed_mode() in case bus frequency or bus mode changed to let > the controller update the configuration. > > The problem can easily be triggered using the sspi command: > => sspi 0:0@1000 > => sspi 0:0@2000 > Without this patch, both transfers happen at 1000 > Hz. With this patch, > the later transfer happens correctly at 2000 Hz. > > Signed-off-by: Marek Vasut marex@denx.de > Cc: Jagan Teki jagan@amarulasolutions.com > Cc: Patrick Delaunay patrick.delaunay@st.com
So, very reliably I can make: https://source.denx.de/u-boot/u-boot/-/jobs/245517 happen locally as well building with clang. It's not obvious to me why the test now fails however.
Can you please be more specific / clear ? I have no idea what those 300 lines of cryptic output mean, nor what are you trying to say by the above, sorry.
If you build with clang, for sandbox, and run the tests, U-Boot crashes in the unit tests that you start with "ut dm".
And that is related to this patch somehow ? How ?
... after further discussion off-list to get a better test case description ...
It seems the problem is in sound_beep() and it is unrelated to this patch, as the same problem happens with / without this patch being applied, on:
a7d3ac61482 ("Merge branch '2021-03-28-assorted-bugfixes'")
$ clang --version Debian clang version 11.0.1-2 ... $ make CC=clang HOSTCC=clang sandbox_defconfig $ make CC=clang HOSTCC=clang $ gdb --args ./u-boot -d arch/sandbox/dts/test.dtb => ut dm ... Program received signal SIGSEGV, Segmentation fault. dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 1623 if (!(inuse_bit_at_offset(next, nextsz))) /* consolidate forward */ (gdb) bt #0 dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 #1 0x00000000004759e3 in sound_beep (dev=0x15c05580, msecs=<optimized out>, frequency_hz=100) at drivers/sound/sound-uclass.c:118
Tom, any further comments ? It seems your claim that this bugfix is causing problems with clang is not correct, so I think it should still be applied to v2021.04 , unless you can provide more details about the clang problem ?
There's at least a few funny things going on. Yes, apparently outside of CI building sandbox with clang and running the tests manually crashes in sound, before it gets to the SPI tests. Except when it crashes on the SPI tests instead, which I did get to happen once. So there's some sort of use-after-free or similar bug around here somewhere, similar to how we can't use the -fstack-protector patch as it exposes other real problems that need to be fixed first.
So, I'll see if with a new tag and a few other changes having been made we once again get to the point where your patch doesn't trigger this issue. But since it's also not a regression fix, if no one can figure out what to do about fixing what clang shows us, then it can wait for v2021.07.

On Wed, Mar 31, 2021 at 03:26:27PM -0400, Tom Rini wrote:
On Wed, Mar 31, 2021 at 09:18:28PM +0200, Marek Vasut wrote:
On 3/29/21 4:40 AM, Marek Vasut wrote:
On 3/29/21 4:09 AM, Marek Vasut wrote:
On 3/29/21 3:59 AM, Tom Rini wrote:
On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote:
On 3/28/21 11:30 PM, Tom Rini wrote: > On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote: > > > The spi_get_bus_and_cs() may be called on the same bus and chipselect > > with different frequency or mode. This is valid usecase, but the code > > fails to notify the controller of such a configuration change. Call > > spi_set_speed_mode() in case bus frequency or bus mode changed to let > > the controller update the configuration. > > > > The problem can easily be triggered using the sspi command: > > => sspi 0:0@1000 > > => sspi 0:0@2000 > > Without this patch, both transfers happen at 1000 > > Hz. With this patch, > > the later transfer happens correctly at 2000 Hz. > > > > Signed-off-by: Marek Vasut marex@denx.de > > Cc: Jagan Teki jagan@amarulasolutions.com > > Cc: Patrick Delaunay patrick.delaunay@st.com > > So, very reliably I can make: > https://source.denx.de/u-boot/u-boot/-/jobs/245517 > happen locally as well building with clang. It's not > obvious to me why > the test now fails however.
Can you please be more specific / clear ? I have no idea what those 300 lines of cryptic output mean, nor what are you trying to say by the above, sorry.
If you build with clang, for sandbox, and run the tests, U-Boot crashes in the unit tests that you start with "ut dm".
And that is related to this patch somehow ? How ?
... after further discussion off-list to get a better test case description ...
It seems the problem is in sound_beep() and it is unrelated to this patch, as the same problem happens with / without this patch being applied, on:
a7d3ac61482 ("Merge branch '2021-03-28-assorted-bugfixes'")
$ clang --version Debian clang version 11.0.1-2 ... $ make CC=clang HOSTCC=clang sandbox_defconfig $ make CC=clang HOSTCC=clang $ gdb --args ./u-boot -d arch/sandbox/dts/test.dtb => ut dm ... Program received signal SIGSEGV, Segmentation fault. dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 1623 if (!(inuse_bit_at_offset(next, nextsz))) /* consolidate forward */ (gdb) bt #0 dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 #1 0x00000000004759e3 in sound_beep (dev=0x15c05580, msecs=<optimized out>, frequency_hz=100) at drivers/sound/sound-uclass.c:118
Tom, any further comments ? It seems your claim that this bugfix is causing problems with clang is not correct, so I think it should still be applied to v2021.04 , unless you can provide more details about the clang problem ?
There's at least a few funny things going on. Yes, apparently outside of CI building sandbox with clang and running the tests manually crashes in sound, before it gets to the SPI tests. Except when it crashes on the SPI tests instead, which I did get to happen once. So there's some sort of use-after-free or similar bug around here somewhere, similar to how we can't use the -fstack-protector patch as it exposes other real problems that need to be fixed first.
So, I'll see if with a new tag and a few other changes having been made we once again get to the point where your patch doesn't trigger this issue. But since it's also not a regression fix, if no one can figure out what to do about fixing what clang shows us, then it can wait for v2021.07.
Nope, still gets things to blow up: https://source.denx.de/u-boot/u-boot/-/jobs/246848
and yes, it would be nice if when the reason a pytest fails is that U-Boot crashed, it would say something more clear than "OSError: [Errno 5] Input/output error" and you have to know that means U-Boot died.

On 3/31/21 9:40 PM, Tom Rini wrote:
On Wed, Mar 31, 2021 at 03:26:27PM -0400, Tom Rini wrote:
On Wed, Mar 31, 2021 at 09:18:28PM +0200, Marek Vasut wrote:
On 3/29/21 4:40 AM, Marek Vasut wrote:
On 3/29/21 4:09 AM, Marek Vasut wrote:
On 3/29/21 3:59 AM, Tom Rini wrote:
On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote: > On 3/28/21 11:30 PM, Tom Rini wrote: >> On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote: >> >>> The spi_get_bus_and_cs() may be called on the same bus and chipselect >>> with different frequency or mode. This is valid usecase, but the code >>> fails to notify the controller of such a configuration change. Call >>> spi_set_speed_mode() in case bus frequency or bus mode changed to let >>> the controller update the configuration. >>> >>> The problem can easily be triggered using the sspi command: >>> => sspi 0:0@1000 >>> => sspi 0:0@2000 >>> Without this patch, both transfers happen at 1000 >>> Hz. With this patch, >>> the later transfer happens correctly at 2000 Hz. >>> >>> Signed-off-by: Marek Vasut marex@denx.de >>> Cc: Jagan Teki jagan@amarulasolutions.com >>> Cc: Patrick Delaunay patrick.delaunay@st.com >> >> So, very reliably I can make: >> https://source.denx.de/u-boot/u-boot/-/jobs/245517 >> happen locally as well building with clang. It's not >> obvious to me why >> the test now fails however. > > Can you please be more specific / clear ? I have no idea what those 300 > lines of cryptic output mean, nor what are you trying to say > by the above, > sorry.
If you build with clang, for sandbox, and run the tests, U-Boot crashes in the unit tests that you start with "ut dm".
And that is related to this patch somehow ? How ?
... after further discussion off-list to get a better test case description ...
It seems the problem is in sound_beep() and it is unrelated to this patch, as the same problem happens with / without this patch being applied, on:
a7d3ac61482 ("Merge branch '2021-03-28-assorted-bugfixes'")
$ clang --version Debian clang version 11.0.1-2 ... $ make CC=clang HOSTCC=clang sandbox_defconfig $ make CC=clang HOSTCC=clang $ gdb --args ./u-boot -d arch/sandbox/dts/test.dtb => ut dm ... Program received signal SIGSEGV, Segmentation fault. dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 1623 if (!(inuse_bit_at_offset(next, nextsz))) /* consolidate forward */ (gdb) bt #0 dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 #1 0x00000000004759e3 in sound_beep (dev=0x15c05580, msecs=<optimized out>, frequency_hz=100) at drivers/sound/sound-uclass.c:118
Tom, any further comments ? It seems your claim that this bugfix is causing problems with clang is not correct, so I think it should still be applied to v2021.04 , unless you can provide more details about the clang problem ?
There's at least a few funny things going on. Yes, apparently outside of CI building sandbox with clang and running the tests manually crashes in sound, before it gets to the SPI tests. Except when it crashes on the SPI tests instead, which I did get to happen once. So there's some sort of use-after-free or similar bug around here somewhere, similar to how we can't use the -fstack-protector patch as it exposes other real problems that need to be fixed first.
So, I'll see if with a new tag and a few other changes having been made we once again get to the point where your patch doesn't trigger this issue. But since it's also not a regression fix, if no one can figure out what to do about fixing what clang shows us, then it can wait for v2021.07.
Nope, still gets things to blow up: https://source.denx.de/u-boot/u-boot/-/jobs/246848
and yes, it would be nice if when the reason a pytest fails is that U-Boot crashed, it would say something more clear than "OSError: [Errno 5] Input/output error" and you have to know that means U-Boot died.
Can you please provide a detailed test case how to reproduce this ? So far I don't have one, the only test description I got is "install random version of clang, run ut dm and it fails", but that kind of imprecise test description is really not useful. Please provide more specific instructions.

On Wed, Mar 31, 2021 at 10:05:28PM +0200, Marek Vasut wrote:
On 3/31/21 9:40 PM, Tom Rini wrote:
On Wed, Mar 31, 2021 at 03:26:27PM -0400, Tom Rini wrote:
On Wed, Mar 31, 2021 at 09:18:28PM +0200, Marek Vasut wrote:
On 3/29/21 4:40 AM, Marek Vasut wrote:
On 3/29/21 4:09 AM, Marek Vasut wrote:
On 3/29/21 3:59 AM, Tom Rini wrote: > On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote: > > On 3/28/21 11:30 PM, Tom Rini wrote: > > > On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote: > > > > > > > The spi_get_bus_and_cs() may be called on the same bus and chipselect > > > > with different frequency or mode. This is valid usecase, but the code > > > > fails to notify the controller of such a configuration change. Call > > > > spi_set_speed_mode() in case bus frequency or bus mode changed to let > > > > the controller update the configuration. > > > > > > > > The problem can easily be triggered using the sspi command: > > > > => sspi 0:0@1000 > > > > => sspi 0:0@2000 > > > > Without this patch, both transfers happen at 1000 > > > > Hz. With this patch, > > > > the later transfer happens correctly at 2000 Hz. > > > > > > > > Signed-off-by: Marek Vasut marex@denx.de > > > > Cc: Jagan Teki jagan@amarulasolutions.com > > > > Cc: Patrick Delaunay patrick.delaunay@st.com > > > > > > So, very reliably I can make: > > > https://source.denx.de/u-boot/u-boot/-/jobs/245517 > > > happen locally as well building with clang. It's not > > > obvious to me why > > > the test now fails however. > > > > Can you please be more specific / clear ? I have no idea what those 300 > > lines of cryptic output mean, nor what are you trying to say > > by the above, > > sorry. > > If you build with clang, for sandbox, and run the tests, U-Boot crashes > in the unit tests that you start with "ut dm".
And that is related to this patch somehow ? How ?
... after further discussion off-list to get a better test case description ...
It seems the problem is in sound_beep() and it is unrelated to this patch, as the same problem happens with / without this patch being applied, on:
a7d3ac61482 ("Merge branch '2021-03-28-assorted-bugfixes'")
$ clang --version Debian clang version 11.0.1-2 ... $ make CC=clang HOSTCC=clang sandbox_defconfig $ make CC=clang HOSTCC=clang $ gdb --args ./u-boot -d arch/sandbox/dts/test.dtb => ut dm ... Program received signal SIGSEGV, Segmentation fault. dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 1623 if (!(inuse_bit_at_offset(next, nextsz))) /* consolidate forward */ (gdb) bt #0 dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 #1 0x00000000004759e3 in sound_beep (dev=0x15c05580, msecs=<optimized out>, frequency_hz=100) at drivers/sound/sound-uclass.c:118
Tom, any further comments ? It seems your claim that this bugfix is causing problems with clang is not correct, so I think it should still be applied to v2021.04 , unless you can provide more details about the clang problem ?
There's at least a few funny things going on. Yes, apparently outside of CI building sandbox with clang and running the tests manually crashes in sound, before it gets to the SPI tests. Except when it crashes on the SPI tests instead, which I did get to happen once. So there's some sort of use-after-free or similar bug around here somewhere, similar to how we can't use the -fstack-protector patch as it exposes other real problems that need to be fixed first.
So, I'll see if with a new tag and a few other changes having been made we once again get to the point where your patch doesn't trigger this issue. But since it's also not a regression fix, if no one can figure out what to do about fixing what clang shows us, then it can wait for v2021.07.
Nope, still gets things to blow up: https://source.denx.de/u-boot/u-boot/-/jobs/246848
and yes, it would be nice if when the reason a pytest fails is that U-Boot crashed, it would say something more clear than "OSError: [Errno 5] Input/output error" and you have to know that means U-Boot died.
Can you please provide a detailed test case how to reproduce this ? So far I don't have one, the only test description I got is "install random version of clang, run ut dm and it fails", but that kind of imprecise test description is really not useful. Please provide more specific instructions.
Yes, follow what CI does. It's done in a container, so you can have the exact same runtime and the tests are in .azure-pipelines.yml and .gitlab-ci.yml or if you look at https://source.denx.de/u-boot/u-boot/-/jobs/246848/raw you can see the commands in there.

+Heinrich Schuchardt
On Thu, 1 Apr 2021 at 10:45, Tom Rini trini@konsulko.com wrote:
On Wed, Mar 31, 2021 at 10:05:28PM +0200, Marek Vasut wrote:
On 3/31/21 9:40 PM, Tom Rini wrote:
On Wed, Mar 31, 2021 at 03:26:27PM -0400, Tom Rini wrote:
On Wed, Mar 31, 2021 at 09:18:28PM +0200, Marek Vasut wrote:
On 3/29/21 4:40 AM, Marek Vasut wrote:
On 3/29/21 4:09 AM, Marek Vasut wrote: > On 3/29/21 3:59 AM, Tom Rini wrote: > > On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote: > > > On 3/28/21 11:30 PM, Tom Rini wrote: > > > > On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote: > > > > > > > > > The spi_get_bus_and_cs() may be called on the same bus and chipselect > > > > > with different frequency or mode. This is valid usecase, but the code > > > > > fails to notify the controller of such a configuration change. Call > > > > > spi_set_speed_mode() in case bus frequency or bus mode changed to let > > > > > the controller update the configuration. > > > > > > > > > > The problem can easily be triggered using the sspi command: > > > > > => sspi 0:0@1000 > > > > > => sspi 0:0@2000 > > > > > Without this patch, both transfers happen at 1000 > > > > > Hz. With this patch, > > > > > the later transfer happens correctly at 2000 Hz. > > > > > > > > > > Signed-off-by: Marek Vasut marex@denx.de > > > > > Cc: Jagan Teki jagan@amarulasolutions.com > > > > > Cc: Patrick Delaunay patrick.delaunay@st.com > > > > > > > > So, very reliably I can make: > > > > https://source.denx.de/u-boot/u-boot/-/jobs/245517 > > > > happen locally as well building with clang. It's not > > > > obvious to me why > > > > the test now fails however. > > > > > > Can you please be more specific / clear ? I have no idea what those 300 > > > lines of cryptic output mean, nor what are you trying to say > > > by the above, > > > sorry. > > > > If you build with clang, for sandbox, and run the tests, U-Boot crashes > > in the unit tests that you start with "ut dm". > > And that is related to this patch somehow ? How ?
... after further discussion off-list to get a better test case description ...
It seems the problem is in sound_beep() and it is unrelated to this patch, as the same problem happens with / without this patch being applied, on:
a7d3ac61482 ("Merge branch '2021-03-28-assorted-bugfixes'")
$ clang --version Debian clang version 11.0.1-2 ... $ make CC=clang HOSTCC=clang sandbox_defconfig $ make CC=clang HOSTCC=clang $ gdb --args ./u-boot -d arch/sandbox/dts/test.dtb => ut dm ... Program received signal SIGSEGV, Segmentation fault. dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 1623 if (!(inuse_bit_at_offset(next, nextsz))) /* consolidate forward */ (gdb) bt #0 dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 #1 0x00000000004759e3 in sound_beep (dev=0x15c05580, msecs=<optimized out>, frequency_hz=100) at drivers/sound/sound-uclass.c:118
Tom, any further comments ? It seems your claim that this bugfix is causing problems with clang is not correct, so I think it should still be applied to v2021.04 , unless you can provide more details about the clang problem ?
There's at least a few funny things going on. Yes, apparently outside of CI building sandbox with clang and running the tests manually crashes in sound, before it gets to the SPI tests. Except when it crashes on the SPI tests instead, which I did get to happen once. So there's some sort of use-after-free or similar bug around here somewhere, similar to how we can't use the -fstack-protector patch as it exposes other real problems that need to be fixed first.
So, I'll see if with a new tag and a few other changes having been made we once again get to the point where your patch doesn't trigger this issue. But since it's also not a regression fix, if no one can figure out what to do about fixing what clang shows us, then it can wait for v2021.07.
Nope, still gets things to blow up: https://source.denx.de/u-boot/u-boot/-/jobs/246848
and yes, it would be nice if when the reason a pytest fails is that U-Boot crashed, it would say something more clear than "OSError: [Errno 5] Input/output error" and you have to know that means U-Boot died.
Can you please provide a detailed test case how to reproduce this ? So far I don't have one, the only test description I got is "install random version of clang, run ut dm and it fails", but that kind of imprecise test description is really not useful. Please provide more specific instructions.
Yes, follow what CI does. It's done in a container, so you can have the exact same runtime and the tests are in .azure-pipelines.yml and .gitlab-ci.yml or if you look at https://source.denx.de/u-boot/u-boot/-/jobs/246848/raw you can see the commands in there.
This does not seem good. I wonder if this patch might have introduced some instability?
b308d9fd18f sandbox: Avoid using malloc() for system state
I have not seen this issue but did see a strange problem on sandbox when I tried to malloc() more space that was available. I haven't got back to it yet.
Regards, Simon

On Thu, Apr 01, 2021 at 11:19:27AM +1300, Simon Glass wrote:
+Heinrich Schuchardt
On Thu, 1 Apr 2021 at 10:45, Tom Rini trini@konsulko.com wrote:
On Wed, Mar 31, 2021 at 10:05:28PM +0200, Marek Vasut wrote:
On 3/31/21 9:40 PM, Tom Rini wrote:
On Wed, Mar 31, 2021 at 03:26:27PM -0400, Tom Rini wrote:
On Wed, Mar 31, 2021 at 09:18:28PM +0200, Marek Vasut wrote:
On 3/29/21 4:40 AM, Marek Vasut wrote: > On 3/29/21 4:09 AM, Marek Vasut wrote: > > On 3/29/21 3:59 AM, Tom Rini wrote: > > > On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote: > > > > On 3/28/21 11:30 PM, Tom Rini wrote: > > > > > On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote: > > > > > > > > > > > The spi_get_bus_and_cs() may be called on the same bus and chipselect > > > > > > with different frequency or mode. This is valid usecase, but the code > > > > > > fails to notify the controller of such a configuration change. Call > > > > > > spi_set_speed_mode() in case bus frequency or bus mode changed to let > > > > > > the controller update the configuration. > > > > > > > > > > > > The problem can easily be triggered using the sspi command: > > > > > > => sspi 0:0@1000 > > > > > > => sspi 0:0@2000 > > > > > > Without this patch, both transfers happen at 1000 > > > > > > Hz. With this patch, > > > > > > the later transfer happens correctly at 2000 Hz. > > > > > > > > > > > > Signed-off-by: Marek Vasut marex@denx.de > > > > > > Cc: Jagan Teki jagan@amarulasolutions.com > > > > > > Cc: Patrick Delaunay patrick.delaunay@st.com > > > > > > > > > > So, very reliably I can make: > > > > > https://source.denx.de/u-boot/u-boot/-/jobs/245517 > > > > > happen locally as well building with clang. It's not > > > > > obvious to me why > > > > > the test now fails however. > > > > > > > > Can you please be more specific / clear ? I have no idea what those 300 > > > > lines of cryptic output mean, nor what are you trying to say > > > > by the above, > > > > sorry. > > > > > > If you build with clang, for sandbox, and run the tests, U-Boot crashes > > > in the unit tests that you start with "ut dm". > > > > And that is related to this patch somehow ? How ? > > ... after further discussion off-list to get a better test case > description ... > > It seems the problem is in sound_beep() and it is unrelated to this > patch, as the same problem happens with / without this patch being > applied, on: > > a7d3ac61482 ("Merge branch '2021-03-28-assorted-bugfixes'") > > $ clang --version > Debian clang version 11.0.1-2 > ... > $ make CC=clang HOSTCC=clang sandbox_defconfig > $ make CC=clang HOSTCC=clang > $ gdb --args ./u-boot -d arch/sandbox/dts/test.dtb > => ut dm > ... > Program received signal SIGSEGV, Segmentation fault. > dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at > common/dlmalloc.c:1623 > 1623 if (!(inuse_bit_at_offset(next, nextsz))) /* consolidate > forward */ > (gdb) bt > #0 dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at > common/dlmalloc.c:1623 > #1 0x00000000004759e3 in sound_beep (dev=0x15c05580, msecs=<optimized > out>, frequency_hz=100) > at drivers/sound/sound-uclass.c:118
Tom, any further comments ? It seems your claim that this bugfix is causing problems with clang is not correct, so I think it should still be applied to v2021.04 , unless you can provide more details about the clang problem ?
There's at least a few funny things going on. Yes, apparently outside of CI building sandbox with clang and running the tests manually crashes in sound, before it gets to the SPI tests. Except when it crashes on the SPI tests instead, which I did get to happen once. So there's some sort of use-after-free or similar bug around here somewhere, similar to how we can't use the -fstack-protector patch as it exposes other real problems that need to be fixed first.
So, I'll see if with a new tag and a few other changes having been made we once again get to the point where your patch doesn't trigger this issue. But since it's also not a regression fix, if no one can figure out what to do about fixing what clang shows us, then it can wait for v2021.07.
Nope, still gets things to blow up: https://source.denx.de/u-boot/u-boot/-/jobs/246848
and yes, it would be nice if when the reason a pytest fails is that U-Boot crashed, it would say something more clear than "OSError: [Errno 5] Input/output error" and you have to know that means U-Boot died.
Can you please provide a detailed test case how to reproduce this ? So far I don't have one, the only test description I got is "install random version of clang, run ut dm and it fails", but that kind of imprecise test description is really not useful. Please provide more specific instructions.
Yes, follow what CI does. It's done in a container, so you can have the exact same runtime and the tests are in .azure-pipelines.yml and .gitlab-ci.yml or if you look at https://source.denx.de/u-boot/u-boot/-/jobs/246848/raw you can see the commands in there.
This does not seem good. I wonder if this patch might have introduced some instability?
b308d9fd18f sandbox: Avoid using malloc() for system state
The problem is normally seen on master, where this patch is not yet. But on next it fails the same way.

On 3/31/21 11:45 PM, Tom Rini wrote:
On Wed, Mar 31, 2021 at 10:05:28PM +0200, Marek Vasut wrote:
On 3/31/21 9:40 PM, Tom Rini wrote:
On Wed, Mar 31, 2021 at 03:26:27PM -0400, Tom Rini wrote:
On Wed, Mar 31, 2021 at 09:18:28PM +0200, Marek Vasut wrote:
On 3/29/21 4:40 AM, Marek Vasut wrote:
On 3/29/21 4:09 AM, Marek Vasut wrote: > On 3/29/21 3:59 AM, Tom Rini wrote: >> On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote: >>> On 3/28/21 11:30 PM, Tom Rini wrote: >>>> On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote: >>>> >>>>> The spi_get_bus_and_cs() may be called on the same bus and chipselect >>>>> with different frequency or mode. This is valid usecase, but the code >>>>> fails to notify the controller of such a configuration change. Call >>>>> spi_set_speed_mode() in case bus frequency or bus mode changed to let >>>>> the controller update the configuration. >>>>> >>>>> The problem can easily be triggered using the sspi command: >>>>> => sspi 0:0@1000 >>>>> => sspi 0:0@2000 >>>>> Without this patch, both transfers happen at 1000 >>>>> Hz. With this patch, >>>>> the later transfer happens correctly at 2000 Hz. >>>>> >>>>> Signed-off-by: Marek Vasut marex@denx.de >>>>> Cc: Jagan Teki jagan@amarulasolutions.com >>>>> Cc: Patrick Delaunay patrick.delaunay@st.com >>>> >>>> So, very reliably I can make: >>>> https://source.denx.de/u-boot/u-boot/-/jobs/245517 >>>> happen locally as well building with clang. It's not >>>> obvious to me why >>>> the test now fails however. >>> >>> Can you please be more specific / clear ? I have no idea what those 300 >>> lines of cryptic output mean, nor what are you trying to say >>> by the above, >>> sorry. >> >> If you build with clang, for sandbox, and run the tests, U-Boot crashes >> in the unit tests that you start with "ut dm". > > And that is related to this patch somehow ? How ?
... after further discussion off-list to get a better test case description ...
It seems the problem is in sound_beep() and it is unrelated to this patch, as the same problem happens with / without this patch being applied, on:
a7d3ac61482 ("Merge branch '2021-03-28-assorted-bugfixes'")
$ clang --version Debian clang version 11.0.1-2 ... $ make CC=clang HOSTCC=clang sandbox_defconfig $ make CC=clang HOSTCC=clang $ gdb --args ./u-boot -d arch/sandbox/dts/test.dtb => ut dm ... Program received signal SIGSEGV, Segmentation fault. dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 1623 if (!(inuse_bit_at_offset(next, nextsz))) /* consolidate forward */ (gdb) bt #0 dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at common/dlmalloc.c:1623 #1 0x00000000004759e3 in sound_beep (dev=0x15c05580, msecs=<optimized out>, frequency_hz=100) at drivers/sound/sound-uclass.c:118
Tom, any further comments ? It seems your claim that this bugfix is causing problems with clang is not correct, so I think it should still be applied to v2021.04 , unless you can provide more details about the clang problem ?
There's at least a few funny things going on. Yes, apparently outside of CI building sandbox with clang and running the tests manually crashes in sound, before it gets to the SPI tests. Except when it crashes on the SPI tests instead, which I did get to happen once. So there's some sort of use-after-free or similar bug around here somewhere, similar to how we can't use the -fstack-protector patch as it exposes other real problems that need to be fixed first.
So, I'll see if with a new tag and a few other changes having been made we once again get to the point where your patch doesn't trigger this issue. But since it's also not a regression fix, if no one can figure out what to do about fixing what clang shows us, then it can wait for v2021.07.
Nope, still gets things to blow up: https://source.denx.de/u-boot/u-boot/-/jobs/246848
and yes, it would be nice if when the reason a pytest fails is that U-Boot crashed, it would say something more clear than "OSError: [Errno 5] Input/output error" and you have to know that means U-Boot died.
Can you please provide a detailed test case how to reproduce this ? So far I don't have one, the only test description I got is "install random version of clang, run ut dm and it fails", but that kind of imprecise test description is really not useful. Please provide more specific instructions.
Yes, follow what CI does. It's done in a container, so you can have the exact same runtime and the tests are in .azure-pipelines.yml and .gitlab-ci.yml or if you look at https://source.denx.de/u-boot/u-boot/-/jobs/246848/raw you can see the commands in there.
Can you please provide me the exact commands to run to reproduce this problem ? I just spent a lot of time trying to find this one single command in a wall of text, "ut dm spi_find", which I think is the one failing for you ? It seems to work for me with and without the patch:
$ ./u-boot -d arch/sandbox/dts/test.dtb sandbox: starting...
U-Boot 2021.04-rc5-00005-g583b6858733 (Apr 01 2021 - 00:34:09 +0200)
Model: sandbox DRAM: 128 MiB WDT: Started with servicing (60s timeout) MMC: mmc2: 2 (SD), mmc1: 1 (SD), mmc0: 0 (SD) In: serial Out: vidconsole Err: vidconsole Model: sandbox SCSI: Net: eth0: eth@10002000, eth5: eth@10003000, eth3: sbe5, eth6: eth@10004000 Hit any key to stop autoboot: 0
Without patch => ut dm spi_find Test: dm_test_spi_find: spi.c test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Test: dm_test_spi_find: spi.c (flat tree) test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Failures: 2
With patch => ut dm spi_find Test: dm_test_spi_find: spi.c test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Test: dm_test_spi_find: spi.c (flat tree) test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Failures: 2

Hi Marek,
On Thu, 1 Apr 2021 at 11:39, Marek Vasut marex@denx.de wrote:
On 3/31/21 11:45 PM, Tom Rini wrote:
On Wed, Mar 31, 2021 at 10:05:28PM +0200, Marek Vasut wrote:
On 3/31/21 9:40 PM, Tom Rini wrote:
On Wed, Mar 31, 2021 at 03:26:27PM -0400, Tom Rini wrote:
On Wed, Mar 31, 2021 at 09:18:28PM +0200, Marek Vasut wrote:
On 3/29/21 4:40 AM, Marek Vasut wrote: > On 3/29/21 4:09 AM, Marek Vasut wrote: >> On 3/29/21 3:59 AM, Tom Rini wrote: >>> On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote: >>>> On 3/28/21 11:30 PM, Tom Rini wrote: >>>>> On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote: >>>>> >>>>>> The spi_get_bus_and_cs() may be called on the same bus and chipselect >>>>>> with different frequency or mode. This is valid usecase, but the code >>>>>> fails to notify the controller of such a configuration change. Call >>>>>> spi_set_speed_mode() in case bus frequency or bus mode changed to let >>>>>> the controller update the configuration. >>>>>> >>>>>> The problem can easily be triggered using the sspi command: >>>>>> => sspi 0:0@1000 >>>>>> => sspi 0:0@2000 >>>>>> Without this patch, both transfers happen at 1000 >>>>>> Hz. With this patch, >>>>>> the later transfer happens correctly at 2000 Hz. >>>>>> >>>>>> Signed-off-by: Marek Vasut marex@denx.de >>>>>> Cc: Jagan Teki jagan@amarulasolutions.com >>>>>> Cc: Patrick Delaunay patrick.delaunay@st.com >>>>> >>>>> So, very reliably I can make: >>>>> https://source.denx.de/u-boot/u-boot/-/jobs/245517 >>>>> happen locally as well building with clang. It's not >>>>> obvious to me why >>>>> the test now fails however. >>>> >>>> Can you please be more specific / clear ? I have no idea what those 300 >>>> lines of cryptic output mean, nor what are you trying to say >>>> by the above, >>>> sorry. >>> >>> If you build with clang, for sandbox, and run the tests, U-Boot crashes >>> in the unit tests that you start with "ut dm". >> >> And that is related to this patch somehow ? How ? > > ... after further discussion off-list to get a better test case > description ... > > It seems the problem is in sound_beep() and it is unrelated to this > patch, as the same problem happens with / without this patch being > applied, on: > > a7d3ac61482 ("Merge branch '2021-03-28-assorted-bugfixes'") > > $ clang --version > Debian clang version 11.0.1-2 > ... > $ make CC=clang HOSTCC=clang sandbox_defconfig > $ make CC=clang HOSTCC=clang > $ gdb --args ./u-boot -d arch/sandbox/dts/test.dtb > => ut dm > ... > Program received signal SIGSEGV, Segmentation fault. > dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at > common/dlmalloc.c:1623 > 1623 if (!(inuse_bit_at_offset(next, nextsz))) /* consolidate > forward */ > (gdb) bt > #0 dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at > common/dlmalloc.c:1623 > #1 0x00000000004759e3 in sound_beep (dev=0x15c05580, msecs=<optimized > out>, frequency_hz=100) > at drivers/sound/sound-uclass.c:118
Tom, any further comments ? It seems your claim that this bugfix is causing problems with clang is not correct, so I think it should still be applied to v2021.04 , unless you can provide more details about the clang problem ?
There's at least a few funny things going on. Yes, apparently outside of CI building sandbox with clang and running the tests manually crashes in sound, before it gets to the SPI tests. Except when it crashes on the SPI tests instead, which I did get to happen once. So there's some sort of use-after-free or similar bug around here somewhere, similar to how we can't use the -fstack-protector patch as it exposes other real problems that need to be fixed first.
So, I'll see if with a new tag and a few other changes having been made we once again get to the point where your patch doesn't trigger this issue. But since it's also not a regression fix, if no one can figure out what to do about fixing what clang shows us, then it can wait for v2021.07.
Nope, still gets things to blow up: https://source.denx.de/u-boot/u-boot/-/jobs/246848
and yes, it would be nice if when the reason a pytest fails is that U-Boot crashed, it would say something more clear than "OSError: [Errno 5] Input/output error" and you have to know that means U-Boot died.
Can you please provide a detailed test case how to reproduce this ? So far I don't have one, the only test description I got is "install random version of clang, run ut dm and it fails", but that kind of imprecise test description is really not useful. Please provide more specific instructions.
Yes, follow what CI does. It's done in a container, so you can have the exact same runtime and the tests are in .azure-pipelines.yml and .gitlab-ci.yml or if you look at https://source.denx.de/u-boot/u-boot/-/jobs/246848/raw you can see the commands in there.
Can you please provide me the exact commands to run to reproduce this problem ? I just spent a lot of time trying to find this one single command in a wall of text, "ut dm spi_find", which I think is the one failing for you ? It seems to work for me with and without the patch:
$ ./u-boot -d arch/sandbox/dts/test.dtb sandbox: starting...
U-Boot 2021.04-rc5-00005-g583b6858733 (Apr 01 2021 - 00:34:09 +0200)
Model: sandbox DRAM: 128 MiB WDT: Started with servicing (60s timeout) MMC: mmc2: 2 (SD), mmc1: 1 (SD), mmc0: 0 (SD) In: serial Out: vidconsole Err: vidconsole Model: sandbox SCSI: Net: eth0: eth@10002000, eth5: eth@10003000, eth3: sbe5, eth6: eth@10004000 Hit any key to stop autoboot: 0
Without patch => ut dm spi_find Test: dm_test_spi_find: spi.c test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Test: dm_test_spi_find: spi.c (flat tree) test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Failures: 2
With patch => ut dm spi_find Test: dm_test_spi_find: spi.c test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Test: dm_test_spi_find: spi.c (flat tree) test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Failures: 2
The Python test sets up a SPI-flash file (spi.bin) that is needed by the test.
I use this alias:
# Run a pytest on sandbox # $1: Name of test to run (optional, else run all) function pyt { test/py/test.py -B sandbox --build-dir /tmp/b/sandbox ${1:+"-k $1"} }
then 'pyt spi_find'
Once you've done that once you can do what you had before:
./u-boot -T
Regards, Simon

On 4/1/21 12:50 AM, Simon Glass wrote:
Hi Marek,
On Thu, 1 Apr 2021 at 11:39, Marek Vasut marex@denx.de wrote:
On 3/31/21 11:45 PM, Tom Rini wrote:
On Wed, Mar 31, 2021 at 10:05:28PM +0200, Marek Vasut wrote:
On 3/31/21 9:40 PM, Tom Rini wrote:
On Wed, Mar 31, 2021 at 03:26:27PM -0400, Tom Rini wrote:
On Wed, Mar 31, 2021 at 09:18:28PM +0200, Marek Vasut wrote: > On 3/29/21 4:40 AM, Marek Vasut wrote: >> On 3/29/21 4:09 AM, Marek Vasut wrote: >>> On 3/29/21 3:59 AM, Tom Rini wrote: >>>> On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote: >>>>> On 3/28/21 11:30 PM, Tom Rini wrote: >>>>>> On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote: >>>>>> >>>>>>> The spi_get_bus_and_cs() may be called on the same bus and chipselect >>>>>>> with different frequency or mode. This is valid usecase, but the code >>>>>>> fails to notify the controller of such a configuration change. Call >>>>>>> spi_set_speed_mode() in case bus frequency or bus mode changed to let >>>>>>> the controller update the configuration. >>>>>>> >>>>>>> The problem can easily be triggered using the sspi command: >>>>>>> => sspi 0:0@1000 >>>>>>> => sspi 0:0@2000 >>>>>>> Without this patch, both transfers happen at 1000 >>>>>>> Hz. With this patch, >>>>>>> the later transfer happens correctly at 2000 Hz. >>>>>>> >>>>>>> Signed-off-by: Marek Vasut marex@denx.de >>>>>>> Cc: Jagan Teki jagan@amarulasolutions.com >>>>>>> Cc: Patrick Delaunay patrick.delaunay@st.com >>>>>> >>>>>> So, very reliably I can make: >>>>>> https://source.denx.de/u-boot/u-boot/-/jobs/245517 >>>>>> happen locally as well building with clang. It's not >>>>>> obvious to me why >>>>>> the test now fails however. >>>>> >>>>> Can you please be more specific / clear ? I have no idea what those 300 >>>>> lines of cryptic output mean, nor what are you trying to say >>>>> by the above, >>>>> sorry. >>>> >>>> If you build with clang, for sandbox, and run the tests, U-Boot crashes >>>> in the unit tests that you start with "ut dm". >>> >>> And that is related to this patch somehow ? How ? >> >> ... after further discussion off-list to get a better test case >> description ... >> >> It seems the problem is in sound_beep() and it is unrelated to this >> patch, as the same problem happens with / without this patch being >> applied, on: >> >> a7d3ac61482 ("Merge branch '2021-03-28-assorted-bugfixes'") >> >> $ clang --version >> Debian clang version 11.0.1-2 >> ... >> $ make CC=clang HOSTCC=clang sandbox_defconfig >> $ make CC=clang HOSTCC=clang >> $ gdb --args ./u-boot -d arch/sandbox/dts/test.dtb >> => ut dm >> ... >> Program received signal SIGSEGV, Segmentation fault. >> dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at >> common/dlmalloc.c:1623 >> 1623 if (!(inuse_bit_at_offset(next, nextsz))) /* consolidate >> forward */ >> (gdb) bt >> #0 dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at >> common/dlmalloc.c:1623 >> #1 0x00000000004759e3 in sound_beep (dev=0x15c05580, msecs=<optimized >> out>, frequency_hz=100) >> at drivers/sound/sound-uclass.c:118 > > Tom, any further comments ? It seems your claim that this bugfix is causing > problems with clang is not correct, so I think it should still be applied to > v2021.04 , unless you can provide more details about the clang problem ?
There's at least a few funny things going on. Yes, apparently outside of CI building sandbox with clang and running the tests manually crashes in sound, before it gets to the SPI tests. Except when it crashes on the SPI tests instead, which I did get to happen once. So there's some sort of use-after-free or similar bug around here somewhere, similar to how we can't use the -fstack-protector patch as it exposes other real problems that need to be fixed first.
So, I'll see if with a new tag and a few other changes having been made we once again get to the point where your patch doesn't trigger this issue. But since it's also not a regression fix, if no one can figure out what to do about fixing what clang shows us, then it can wait for v2021.07.
Nope, still gets things to blow up: https://source.denx.de/u-boot/u-boot/-/jobs/246848
and yes, it would be nice if when the reason a pytest fails is that U-Boot crashed, it would say something more clear than "OSError: [Errno 5] Input/output error" and you have to know that means U-Boot died.
Can you please provide a detailed test case how to reproduce this ? So far I don't have one, the only test description I got is "install random version of clang, run ut dm and it fails", but that kind of imprecise test description is really not useful. Please provide more specific instructions.
Yes, follow what CI does. It's done in a container, so you can have the exact same runtime and the tests are in .azure-pipelines.yml and .gitlab-ci.yml or if you look at https://source.denx.de/u-boot/u-boot/-/jobs/246848/raw you can see the commands in there.
Can you please provide me the exact commands to run to reproduce this problem ? I just spent a lot of time trying to find this one single command in a wall of text, "ut dm spi_find", which I think is the one failing for you ? It seems to work for me with and without the patch:
$ ./u-boot -d arch/sandbox/dts/test.dtb sandbox: starting...
U-Boot 2021.04-rc5-00005-g583b6858733 (Apr 01 2021 - 00:34:09 +0200)
Model: sandbox DRAM: 128 MiB WDT: Started with servicing (60s timeout) MMC: mmc2: 2 (SD), mmc1: 1 (SD), mmc0: 0 (SD) In: serial Out: vidconsole Err: vidconsole Model: sandbox SCSI: Net: eth0: eth@10002000, eth5: eth@10003000, eth3: sbe5, eth6: eth@10004000 Hit any key to stop autoboot: 0
Without patch => ut dm spi_find Test: dm_test_spi_find: spi.c test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Test: dm_test_spi_find: spi.c (flat tree) test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Failures: 2
With patch => ut dm spi_find Test: dm_test_spi_find: spi.c test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Test: dm_test_spi_find: spi.c (flat tree) test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Failures: 2
The Python test sets up a SPI-flash file (spi.bin) that is needed by the test.
I use this alias:
# Run a pytest on sandbox # $1: Name of test to run (optional, else run all) function pyt { test/py/test.py -B sandbox --build-dir /tmp/b/sandbox ${1:+"-k $1"} }
then 'pyt spi_find'
Once you've done that once you can do what you had before:
./u-boot -T
Do you also have a minimal test which does not depend on pytest at all ? A testcase for a bug should be minimal and as simple as possible, it shouldn't depend on all kinds of variables which are only adding complexity and confusion.

Hi Marek,
On Thu, 1 Apr 2021 at 12:19, Marek Vasut marex@denx.de wrote:
On 4/1/21 12:50 AM, Simon Glass wrote:
Hi Marek,
On Thu, 1 Apr 2021 at 11:39, Marek Vasut marex@denx.de wrote:
On 3/31/21 11:45 PM, Tom Rini wrote:
On Wed, Mar 31, 2021 at 10:05:28PM +0200, Marek Vasut wrote:
On 3/31/21 9:40 PM, Tom Rini wrote:
On Wed, Mar 31, 2021 at 03:26:27PM -0400, Tom Rini wrote: > On Wed, Mar 31, 2021 at 09:18:28PM +0200, Marek Vasut wrote: >> On 3/29/21 4:40 AM, Marek Vasut wrote: >>> On 3/29/21 4:09 AM, Marek Vasut wrote: >>>> On 3/29/21 3:59 AM, Tom Rini wrote: >>>>> On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote: >>>>>> On 3/28/21 11:30 PM, Tom Rini wrote: >>>>>>> On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote: >>>>>>> >>>>>>>> The spi_get_bus_and_cs() may be called on the same bus and chipselect >>>>>>>> with different frequency or mode. This is valid usecase, but the code >>>>>>>> fails to notify the controller of such a configuration change. Call >>>>>>>> spi_set_speed_mode() in case bus frequency or bus mode changed to let >>>>>>>> the controller update the configuration. >>>>>>>> >>>>>>>> The problem can easily be triggered using the sspi command: >>>>>>>> => sspi 0:0@1000 >>>>>>>> => sspi 0:0@2000 >>>>>>>> Without this patch, both transfers happen at 1000 >>>>>>>> Hz. With this patch, >>>>>>>> the later transfer happens correctly at 2000 Hz. >>>>>>>> >>>>>>>> Signed-off-by: Marek Vasut marex@denx.de >>>>>>>> Cc: Jagan Teki jagan@amarulasolutions.com >>>>>>>> Cc: Patrick Delaunay patrick.delaunay@st.com >>>>>>> >>>>>>> So, very reliably I can make: >>>>>>> https://source.denx.de/u-boot/u-boot/-/jobs/245517 >>>>>>> happen locally as well building with clang. It's not >>>>>>> obvious to me why >>>>>>> the test now fails however. >>>>>> >>>>>> Can you please be more specific / clear ? I have no idea what those 300 >>>>>> lines of cryptic output mean, nor what are you trying to say >>>>>> by the above, >>>>>> sorry. >>>>> >>>>> If you build with clang, for sandbox, and run the tests, U-Boot crashes >>>>> in the unit tests that you start with "ut dm". >>>> >>>> And that is related to this patch somehow ? How ? >>> >>> ... after further discussion off-list to get a better test case >>> description ... >>> >>> It seems the problem is in sound_beep() and it is unrelated to this >>> patch, as the same problem happens with / without this patch being >>> applied, on: >>> >>> a7d3ac61482 ("Merge branch '2021-03-28-assorted-bugfixes'") >>> >>> $ clang --version >>> Debian clang version 11.0.1-2 >>> ... >>> $ make CC=clang HOSTCC=clang sandbox_defconfig >>> $ make CC=clang HOSTCC=clang >>> $ gdb --args ./u-boot -d arch/sandbox/dts/test.dtb >>> => ut dm >>> ... >>> Program received signal SIGSEGV, Segmentation fault. >>> dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at >>> common/dlmalloc.c:1623 >>> 1623 if (!(inuse_bit_at_offset(next, nextsz))) /* consolidate >>> forward */ >>> (gdb) bt >>> #0 dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at >>> common/dlmalloc.c:1623 >>> #1 0x00000000004759e3 in sound_beep (dev=0x15c05580, msecs=<optimized >>> out>, frequency_hz=100) >>> at drivers/sound/sound-uclass.c:118 >> >> Tom, any further comments ? It seems your claim that this bugfix is causing >> problems with clang is not correct, so I think it should still be applied to >> v2021.04 , unless you can provide more details about the clang problem ? > > There's at least a few funny things going on. Yes, apparently outside > of CI building sandbox with clang and running the tests manually crashes > in sound, before it gets to the SPI tests. Except when it crashes on > the SPI tests instead, which I did get to happen once. So there's some > sort of use-after-free or similar bug around here somewhere, similar to > how we can't use the -fstack-protector patch as it exposes other real > problems that need to be fixed first. > > So, I'll see if with a new tag and a few other changes having been made > we once again get to the point where your patch doesn't trigger this > issue. But since it's also not a regression fix, if no one can figure > out what to do about fixing what clang shows us, then it can wait for > v2021.07.
Nope, still gets things to blow up: https://source.denx.de/u-boot/u-boot/-/jobs/246848
and yes, it would be nice if when the reason a pytest fails is that U-Boot crashed, it would say something more clear than "OSError: [Errno 5] Input/output error" and you have to know that means U-Boot died.
Can you please provide a detailed test case how to reproduce this ? So far I don't have one, the only test description I got is "install random version of clang, run ut dm and it fails", but that kind of imprecise test description is really not useful. Please provide more specific instructions.
Yes, follow what CI does. It's done in a container, so you can have the exact same runtime and the tests are in .azure-pipelines.yml and .gitlab-ci.yml or if you look at https://source.denx.de/u-boot/u-boot/-/jobs/246848/raw you can see the commands in there.
Can you please provide me the exact commands to run to reproduce this problem ? I just spent a lot of time trying to find this one single command in a wall of text, "ut dm spi_find", which I think is the one failing for you ? It seems to work for me with and without the patch:
$ ./u-boot -d arch/sandbox/dts/test.dtb sandbox: starting...
U-Boot 2021.04-rc5-00005-g583b6858733 (Apr 01 2021 - 00:34:09 +0200)
Model: sandbox DRAM: 128 MiB WDT: Started with servicing (60s timeout) MMC: mmc2: 2 (SD), mmc1: 1 (SD), mmc0: 0 (SD) In: serial Out: vidconsole Err: vidconsole Model: sandbox SCSI: Net: eth0: eth@10002000, eth5: eth@10003000, eth3: sbe5, eth6: eth@10004000 Hit any key to stop autoboot: 0
Without patch => ut dm spi_find Test: dm_test_spi_find: spi.c test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Test: dm_test_spi_find: spi.c (flat tree) test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Failures: 2
With patch => ut dm spi_find Test: dm_test_spi_find: spi.c test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Test: dm_test_spi_find: spi.c (flat tree) test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Failures: 2
The Python test sets up a SPI-flash file (spi.bin) that is needed by the test.
I use this alias:
# Run a pytest on sandbox # $1: Name of test to run (optional, else run all) function pyt { test/py/test.py -B sandbox --build-dir /tmp/b/sandbox ${1:+"-k $1"} }
then 'pyt spi_find'
Once you've done that once you can do what you had before:
./u-boot -T
Do you also have a minimal test which does not depend on pytest at all ? A testcase for a bug should be minimal and as simple as possible, it shouldn't depend on all kinds of variables which are only adding complexity and confusion.
There is quite a bit more documentation languishing in the -next branch, e.g.:
https://github.com/u-boot/u-boot/blob/next/doc/develop/tests_sandbox.rst
The C SPI tests could perhaps be updated to automatically create the spi.bin file, but several of them share it, so it would need to be a shared function.
I certainly only used pytest when needed, hence the substantial effort I made to document how to run sandbox tests directly, but I should point out that pytest is the overall framework that we use.
Regards, Simon

On 4/1/21 1:41 AM, Simon Glass wrote:
Hi Marek,
On Thu, 1 Apr 2021 at 12:19, Marek Vasut marex@denx.de wrote:
On 4/1/21 12:50 AM, Simon Glass wrote:
Hi Marek,
On Thu, 1 Apr 2021 at 11:39, Marek Vasut marex@denx.de wrote:
On 3/31/21 11:45 PM, Tom Rini wrote:
On Wed, Mar 31, 2021 at 10:05:28PM +0200, Marek Vasut wrote:
On 3/31/21 9:40 PM, Tom Rini wrote: > On Wed, Mar 31, 2021 at 03:26:27PM -0400, Tom Rini wrote: >> On Wed, Mar 31, 2021 at 09:18:28PM +0200, Marek Vasut wrote: >>> On 3/29/21 4:40 AM, Marek Vasut wrote: >>>> On 3/29/21 4:09 AM, Marek Vasut wrote: >>>>> On 3/29/21 3:59 AM, Tom Rini wrote: >>>>>> On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote: >>>>>>> On 3/28/21 11:30 PM, Tom Rini wrote: >>>>>>>> On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote: >>>>>>>> >>>>>>>>> The spi_get_bus_and_cs() may be called on the same bus and chipselect >>>>>>>>> with different frequency or mode. This is valid usecase, but the code >>>>>>>>> fails to notify the controller of such a configuration change. Call >>>>>>>>> spi_set_speed_mode() in case bus frequency or bus mode changed to let >>>>>>>>> the controller update the configuration. >>>>>>>>> >>>>>>>>> The problem can easily be triggered using the sspi command: >>>>>>>>> => sspi 0:0@1000 >>>>>>>>> => sspi 0:0@2000 >>>>>>>>> Without this patch, both transfers happen at 1000 >>>>>>>>> Hz. With this patch, >>>>>>>>> the later transfer happens correctly at 2000 Hz. >>>>>>>>> >>>>>>>>> Signed-off-by: Marek Vasut marex@denx.de >>>>>>>>> Cc: Jagan Teki jagan@amarulasolutions.com >>>>>>>>> Cc: Patrick Delaunay patrick.delaunay@st.com >>>>>>>> >>>>>>>> So, very reliably I can make: >>>>>>>> https://source.denx.de/u-boot/u-boot/-/jobs/245517 >>>>>>>> happen locally as well building with clang. It's not >>>>>>>> obvious to me why >>>>>>>> the test now fails however. >>>>>>> >>>>>>> Can you please be more specific / clear ? I have no idea what those 300 >>>>>>> lines of cryptic output mean, nor what are you trying to say >>>>>>> by the above, >>>>>>> sorry. >>>>>> >>>>>> If you build with clang, for sandbox, and run the tests, U-Boot crashes >>>>>> in the unit tests that you start with "ut dm". >>>>> >>>>> And that is related to this patch somehow ? How ? >>>> >>>> ... after further discussion off-list to get a better test case >>>> description ... >>>> >>>> It seems the problem is in sound_beep() and it is unrelated to this >>>> patch, as the same problem happens with / without this patch being >>>> applied, on: >>>> >>>> a7d3ac61482 ("Merge branch '2021-03-28-assorted-bugfixes'") >>>> >>>> $ clang --version >>>> Debian clang version 11.0.1-2 >>>> ... >>>> $ make CC=clang HOSTCC=clang sandbox_defconfig >>>> $ make CC=clang HOSTCC=clang >>>> $ gdb --args ./u-boot -d arch/sandbox/dts/test.dtb >>>> => ut dm >>>> ... >>>> Program received signal SIGSEGV, Segmentation fault. >>>> dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at >>>> common/dlmalloc.c:1623 >>>> 1623 if (!(inuse_bit_at_offset(next, nextsz))) /* consolidate >>>> forward */ >>>> (gdb) bt >>>> #0 dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at >>>> common/dlmalloc.c:1623 >>>> #1 0x00000000004759e3 in sound_beep (dev=0x15c05580, msecs=<optimized >>>> out>, frequency_hz=100) >>>> at drivers/sound/sound-uclass.c:118 >>> >>> Tom, any further comments ? It seems your claim that this bugfix is causing >>> problems with clang is not correct, so I think it should still be applied to >>> v2021.04 , unless you can provide more details about the clang problem ? >> >> There's at least a few funny things going on. Yes, apparently outside >> of CI building sandbox with clang and running the tests manually crashes >> in sound, before it gets to the SPI tests. Except when it crashes on >> the SPI tests instead, which I did get to happen once. So there's some >> sort of use-after-free or similar bug around here somewhere, similar to >> how we can't use the -fstack-protector patch as it exposes other real >> problems that need to be fixed first. >> >> So, I'll see if with a new tag and a few other changes having been made >> we once again get to the point where your patch doesn't trigger this >> issue. But since it's also not a regression fix, if no one can figure >> out what to do about fixing what clang shows us, then it can wait for >> v2021.07. > > Nope, still gets things to blow up: > https://source.denx.de/u-boot/u-boot/-/jobs/246848 > > and yes, it would be nice if when the reason a pytest fails is that > U-Boot crashed, it would say something more clear than "OSError: [Errno > 5] Input/output error" and you have to know that means U-Boot died.
Can you please provide a detailed test case how to reproduce this ? So far I don't have one, the only test description I got is "install random version of clang, run ut dm and it fails", but that kind of imprecise test description is really not useful. Please provide more specific instructions.
Yes, follow what CI does. It's done in a container, so you can have the exact same runtime and the tests are in .azure-pipelines.yml and .gitlab-ci.yml or if you look at https://source.denx.de/u-boot/u-boot/-/jobs/246848/raw you can see the commands in there.
Can you please provide me the exact commands to run to reproduce this problem ? I just spent a lot of time trying to find this one single command in a wall of text, "ut dm spi_find", which I think is the one failing for you ? It seems to work for me with and without the patch:
$ ./u-boot -d arch/sandbox/dts/test.dtb sandbox: starting...
U-Boot 2021.04-rc5-00005-g583b6858733 (Apr 01 2021 - 00:34:09 +0200)
Model: sandbox DRAM: 128 MiB WDT: Started with servicing (60s timeout) MMC: mmc2: 2 (SD), mmc1: 1 (SD), mmc0: 0 (SD) In: serial Out: vidconsole Err: vidconsole Model: sandbox SCSI: Net: eth0: eth@10002000, eth5: eth@10003000, eth3: sbe5, eth6: eth@10004000 Hit any key to stop autoboot: 0
Without patch => ut dm spi_find Test: dm_test_spi_find: spi.c test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Test: dm_test_spi_find: spi.c (flat tree) test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Failures: 2
With patch => ut dm spi_find Test: dm_test_spi_find: spi.c test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Test: dm_test_spi_find: spi.c (flat tree) test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Failures: 2
The Python test sets up a SPI-flash file (spi.bin) that is needed by the test.
I use this alias:
# Run a pytest on sandbox # $1: Name of test to run (optional, else run all) function pyt { test/py/test.py -B sandbox --build-dir /tmp/b/sandbox ${1:+"-k $1"} }
then 'pyt spi_find'
Once you've done that once you can do what you had before:
./u-boot -T
Do you also have a minimal test which does not depend on pytest at all ? A testcase for a bug should be minimal and as simple as possible, it shouldn't depend on all kinds of variables which are only adding complexity and confusion.
There is quite a bit more documentation languishing in the -next branch, e.g.:
https://github.com/u-boot/u-boot/blob/next/doc/develop/tests_sandbox.rst
The C SPI tests could perhaps be updated to automatically create the spi.bin file, but several of them share it, so it would need to be a shared function.
I certainly only used pytest when needed, hence the substantial effort I made to document how to run sandbox tests directly, but I should point out that pytest is the overall framework that we use.
Surely for the sake of trivial test , spi.bin file can be generated manually using e.g. dd ?

On Thu, Apr 01, 2021 at 01:47:09AM +0200, Marek Vasut wrote:
On 4/1/21 1:41 AM, Simon Glass wrote:
Hi Marek,
On Thu, 1 Apr 2021 at 12:19, Marek Vasut marex@denx.de wrote:
On 4/1/21 12:50 AM, Simon Glass wrote:
Hi Marek,
On Thu, 1 Apr 2021 at 11:39, Marek Vasut marex@denx.de wrote:
On 3/31/21 11:45 PM, Tom Rini wrote:
On Wed, Mar 31, 2021 at 10:05:28PM +0200, Marek Vasut wrote: > On 3/31/21 9:40 PM, Tom Rini wrote: > > On Wed, Mar 31, 2021 at 03:26:27PM -0400, Tom Rini wrote: > > > On Wed, Mar 31, 2021 at 09:18:28PM +0200, Marek Vasut wrote: > > > > On 3/29/21 4:40 AM, Marek Vasut wrote: > > > > > On 3/29/21 4:09 AM, Marek Vasut wrote: > > > > > > On 3/29/21 3:59 AM, Tom Rini wrote: > > > > > > > On Mon, Mar 29, 2021 at 03:32:51AM +0200, Marek Vasut wrote: > > > > > > > > On 3/28/21 11:30 PM, Tom Rini wrote: > > > > > > > > > On Fri, Feb 26, 2021 at 03:21:24PM +0100, Marek Vasut wrote: > > > > > > > > > > > > > > > > > > > The spi_get_bus_and_cs() may be called on the same bus and chipselect > > > > > > > > > > with different frequency or mode. This is valid usecase, but the code > > > > > > > > > > fails to notify the controller of such a configuration change. Call > > > > > > > > > > spi_set_speed_mode() in case bus frequency or bus mode changed to let > > > > > > > > > > the controller update the configuration. > > > > > > > > > > > > > > > > > > > > The problem can easily be triggered using the sspi command: > > > > > > > > > > => sspi 0:0@1000 > > > > > > > > > > => sspi 0:0@2000 > > > > > > > > > > Without this patch, both transfers happen at 1000 > > > > > > > > > > Hz. With this patch, > > > > > > > > > > the later transfer happens correctly at 2000 Hz. > > > > > > > > > > > > > > > > > > > > Signed-off-by: Marek Vasut marex@denx.de > > > > > > > > > > Cc: Jagan Teki jagan@amarulasolutions.com > > > > > > > > > > Cc: Patrick Delaunay patrick.delaunay@st.com > > > > > > > > > > > > > > > > > > So, very reliably I can make: > > > > > > > > > https://source.denx.de/u-boot/u-boot/-/jobs/245517 > > > > > > > > > happen locally as well building with clang. It's not > > > > > > > > > obvious to me why > > > > > > > > > the test now fails however. > > > > > > > > > > > > > > > > Can you please be more specific / clear ? I have no idea what those 300 > > > > > > > > lines of cryptic output mean, nor what are you trying to say > > > > > > > > by the above, > > > > > > > > sorry. > > > > > > > > > > > > > > If you build with clang, for sandbox, and run the tests, U-Boot crashes > > > > > > > in the unit tests that you start with "ut dm". > > > > > > > > > > > > And that is related to this patch somehow ? How ? > > > > > > > > > > ... after further discussion off-list to get a better test case > > > > > description ... > > > > > > > > > > It seems the problem is in sound_beep() and it is unrelated to this > > > > > patch, as the same problem happens with / without this patch being > > > > > applied, on: > > > > > > > > > > a7d3ac61482 ("Merge branch '2021-03-28-assorted-bugfixes'") > > > > > > > > > > $ clang --version > > > > > Debian clang version 11.0.1-2 > > > > > ... > > > > > $ make CC=clang HOSTCC=clang sandbox_defconfig > > > > > $ make CC=clang HOSTCC=clang > > > > > $ gdb --args ./u-boot -d arch/sandbox/dts/test.dtb > > > > > => ut dm > > > > > ... > > > > > Program received signal SIGSEGV, Segmentation fault. > > > > > dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at > > > > > common/dlmalloc.c:1623 > > > > > 1623 if (!(inuse_bit_at_offset(next, nextsz))) /* consolidate > > > > > forward */ > > > > > (gdb) bt > > > > > #0 dlfree (mem=<optimized out>, mem@entry=0x15c19c50) at > > > > > common/dlmalloc.c:1623 > > > > > #1 0x00000000004759e3 in sound_beep (dev=0x15c05580, msecs=<optimized > > > > > out>, frequency_hz=100) > > > > > at drivers/sound/sound-uclass.c:118 > > > > > > > > Tom, any further comments ? It seems your claim that this bugfix is causing > > > > problems with clang is not correct, so I think it should still be applied to > > > > v2021.04 , unless you can provide more details about the clang problem ? > > > > > > There's at least a few funny things going on. Yes, apparently outside > > > of CI building sandbox with clang and running the tests manually crashes > > > in sound, before it gets to the SPI tests. Except when it crashes on > > > the SPI tests instead, which I did get to happen once. So there's some > > > sort of use-after-free or similar bug around here somewhere, similar to > > > how we can't use the -fstack-protector patch as it exposes other real > > > problems that need to be fixed first. > > > > > > So, I'll see if with a new tag and a few other changes having been made > > > we once again get to the point where your patch doesn't trigger this > > > issue. But since it's also not a regression fix, if no one can figure > > > out what to do about fixing what clang shows us, then it can wait for > > > v2021.07. > > > > Nope, still gets things to blow up: > > https://source.denx.de/u-boot/u-boot/-/jobs/246848 > > > > and yes, it would be nice if when the reason a pytest fails is that > > U-Boot crashed, it would say something more clear than "OSError: [Errno > > 5] Input/output error" and you have to know that means U-Boot died. > > Can you please provide a detailed test case how to reproduce this ? > So far I don't have one, the only test description I got is "install random > version of clang, run ut dm and it fails", but that kind of imprecise test > description is really not useful. Please provide more specific instructions.
Yes, follow what CI does. It's done in a container, so you can have the exact same runtime and the tests are in .azure-pipelines.yml and .gitlab-ci.yml or if you look at https://source.denx.de/u-boot/u-boot/-/jobs/246848/raw you can see the commands in there.
Can you please provide me the exact commands to run to reproduce this problem ? I just spent a lot of time trying to find this one single command in a wall of text, "ut dm spi_find", which I think is the one failing for you ? It seems to work for me with and without the patch:
$ ./u-boot -d arch/sandbox/dts/test.dtb sandbox: starting...
U-Boot 2021.04-rc5-00005-g583b6858733 (Apr 01 2021 - 00:34:09 +0200)
Model: sandbox DRAM: 128 MiB WDT: Started with servicing (60s timeout) MMC: mmc2: 2 (SD), mmc1: 1 (SD), mmc0: 0 (SD) In: serial Out: vidconsole Err: vidconsole Model: sandbox SCSI: Net: eth0: eth@10002000, eth5: eth@10003000, eth3: sbe5, eth6: eth@10004000 Hit any key to stop autoboot: 0
Without patch => ut dm spi_find Test: dm_test_spi_find: spi.c test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Test: dm_test_spi_find: spi.c (flat tree) test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Failures: 2
With patch => ut dm spi_find Test: dm_test_spi_find: spi.c test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Test: dm_test_spi_find: spi.c (flat tree) test/dm/spi.c:71, dm_test_spi_find(): 0 == spi_get_bus_and_cs(busnum, cs, speed, mode, "jedec_spi_nor", "name", &bus, &slave): Expected 0x0 (0), got 0xfffffffb (-5) Failures: 2
The Python test sets up a SPI-flash file (spi.bin) that is needed by the test.
I use this alias:
# Run a pytest on sandbox # $1: Name of test to run (optional, else run all) function pyt { test/py/test.py -B sandbox --build-dir /tmp/b/sandbox ${1:+"-k $1"} }
then 'pyt spi_find'
Once you've done that once you can do what you had before:
./u-boot -T
Do you also have a minimal test which does not depend on pytest at all ? A testcase for a bug should be minimal and as simple as possible, it shouldn't depend on all kinds of variables which are only adding complexity and confusion.
There is quite a bit more documentation languishing in the -next branch, e.g.:
https://github.com/u-boot/u-boot/blob/next/doc/develop/tests_sandbox.rst
The C SPI tests could perhaps be updated to automatically create the spi.bin file, but several of them share it, so it would need to be a shared function.
I certainly only used pytest when needed, hence the substantial effort I made to document how to run sandbox tests directly, but I should point out that pytest is the overall framework that we use.
Surely for the sake of trivial test , spi.bin file can be generated manually using e.g. dd ?
But part of the problem, quite literally, is that the existing tests for SPI functionality are failing. Given that it seems "this change directly is wrong" is unlikely but "this change exposes some more general problem within any of {sandbox,unit tests,unit test framework,spi}" is likely, there is not yet a more minimal test for the problem because one has not been found. The easy reproducer is "run the tests like CI does". It further seems likely to me that it's a problem in either sandbox or unit test framework as you and I both made that sound test fail sometime. And note that doc/develop/py_testing.rst tells how to hook pytest and GDB up.
participants (3)
-
Marek Vasut
-
Simon Glass
-
Tom Rini