[PATCH 0/4] some serial rx buffer patches

Some small improvements to the serial rx buffer feature.
CI seems happy: https://github.com/u-boot/u-boot/pull/674
Rasmus Villemoes (4): serial: fix circular rx buffer edge case serial: do not overwrite not-consumed characters in rx buffer serial: add build-time sanity check of CONFIG_SERIAL_RX_BUFFER_SIZE serial: embed the rx buffer in struct serial_dev_priv
drivers/serial/serial-uclass.c | 23 ++++++++++++----------- include/serial.h | 8 +++++--- 2 files changed, 17 insertions(+), 14 deletions(-)

The current implementation of the circular rx buffer falls into a common trap with circular buffers: It keeps the head/tail indices reduced modulo the buffer size. The problem with that is that it makes it impossible to distinguish "buffer full" from "buffer empty", because in both situations one has head==tail.
This can easily be demonstrated: Build sandbox with RX_BUFFER enabled, set the RX_BUFFER_SIZE to 32, and try pasting the string
01234567890123456789012345678901
Nothing seems to happen, but in reality, all characters have been read and put into the buffer, but then tstc ends up believing nothing is in the buffer anyway because upriv->rd_ptr == upriv->wr_ptr.
A better approach is to let the indices be free-running, and only reduce them modulo the buffer size when accessing the array. Then "empty" is head-tail==0 and "full" is head-tail==size. This does rely on the buffer size being a power-of-two and the free-running indices simply wrapping around to 0 when incremented beyond the maximal positive value.
Incidentally, that change from signed to unsigned int also improves code generation quite a bit: In C, (signed int)%(signed int) is defined to have the sign of the dividend (so (-35) % 32 is -3, not 29), and hence despite the modulus being a power-of-two, x % 32 does not actually compile to the same as a simple x & 31 - on x86 with -Os, it seems that gcc ends up emitting an idiv instruction, which is quite expensive.
Signed-off-by: Rasmus Villemoes ravi@prevas.dk --- drivers/serial/serial-uclass.c | 10 ++++++---- include/serial.h | 4 ++-- 2 files changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/serial/serial-uclass.c b/drivers/serial/serial-uclass.c index 84f02f7ac76..05fe9645bee 100644 --- a/drivers/serial/serial-uclass.c +++ b/drivers/serial/serial-uclass.c @@ -328,11 +328,12 @@ static int __serial_tstc(struct udevice *dev) static int _serial_tstc(struct udevice *dev) { struct serial_dev_priv *upriv = dev_get_uclass_priv(dev); + uint wr;
/* Read all available chars into the RX buffer */ while (__serial_tstc(dev)) { - upriv->buf[upriv->wr_ptr++] = __serial_getc(dev); - upriv->wr_ptr %= CONFIG_SERIAL_RX_BUFFER_SIZE; + wr = upriv->wr_ptr++ % CONFIG_SERIAL_RX_BUFFER_SIZE; + upriv->buf[wr] = __serial_getc(dev); }
return upriv->rd_ptr != upriv->wr_ptr ? 1 : 0; @@ -342,12 +343,13 @@ static int _serial_getc(struct udevice *dev) { struct serial_dev_priv *upriv = dev_get_uclass_priv(dev); char val; + uint rd;
if (upriv->rd_ptr == upriv->wr_ptr) return __serial_getc(dev);
- val = upriv->buf[upriv->rd_ptr++]; - upriv->rd_ptr %= CONFIG_SERIAL_RX_BUFFER_SIZE; + rd = upriv->rd_ptr++ % CONFIG_SERIAL_RX_BUFFER_SIZE; + val = upriv->buf[rd];
return val; } diff --git a/include/serial.h b/include/serial.h index d129dc3253c..14563239b7d 100644 --- a/include/serial.h +++ b/include/serial.h @@ -299,8 +299,8 @@ struct serial_dev_priv { struct stdio_dev *sdev;
char *buf; - int rd_ptr; - int wr_ptr; + uint rd_ptr; + uint wr_ptr; };
/* Access the serial operations for a device */

On Thu, 3 Oct 2024 at 08:10, Rasmus Villemoes ravi@prevas.dk wrote:
The current implementation of the circular rx buffer falls into a common trap with circular buffers: It keeps the head/tail indices reduced modulo the buffer size. The problem with that is that it makes it impossible to distinguish "buffer full" from "buffer empty", because in both situations one has head==tail.
This can easily be demonstrated: Build sandbox with RX_BUFFER enabled, set the RX_BUFFER_SIZE to 32, and try pasting the string
01234567890123456789012345678901
Nothing seems to happen, but in reality, all characters have been read and put into the buffer, but then tstc ends up believing nothing is in the buffer anyway because upriv->rd_ptr == upriv->wr_ptr.
A better approach is to let the indices be free-running, and only reduce them modulo the buffer size when accessing the array. Then "empty" is head-tail==0 and "full" is head-tail==size. This does rely on the buffer size being a power-of-two and the free-running indices simply wrapping around to 0 when incremented beyond the maximal positive value.
Incidentally, that change from signed to unsigned int also improves code generation quite a bit: In C, (signed int)%(signed int) is defined to have the sign of the dividend (so (-35) % 32 is -3, not 29), and hence despite the modulus being a power-of-two, x % 32 does not actually compile to the same as a simple x & 31 - on x86 with -Os, it seems that gcc ends up emitting an idiv instruction, which is quite expensive.
Signed-off-by: Rasmus Villemoes ravi@prevas.dk
drivers/serial/serial-uclass.c | 10 ++++++---- include/serial.h | 4 ++-- 2 files changed, 8 insertions(+), 6 deletions(-)
Reviewed-by: Simon Glass sjg@chromium.org
Perhaps we should use membuff, like in other cases, since it has some tests?

Simon Glass sjg@chromium.org writes:
On Thu, 3 Oct 2024 at 08:10, Rasmus Villemoes ravi@prevas.dk wrote:
drivers/serial/serial-uclass.c | 10 ++++++---- include/serial.h | 4 ++-- 2 files changed, 8 insertions(+), 6 deletions(-)
Reviewed-by: Simon Glass sjg@chromium.org
Perhaps we should use membuff, like in other cases, since it has some tests?
I didn't know about that till just now. But no, it seems to suffer from the same basic defect of losing one slot, and while it may handle it correctly, I don't like to introduce more uses of that model, and open-coding the single putc/getc that we need here is really not that hard.
Also, which tests? I don't see membuff mentioned anywhere under test/, but perhaps it's implicitly through the console recording testing?
I just had a quick peek in the implementation, and membuff_makecontig() seems to be buggy: the second memcpy() must also be a memmove() (suppose size==32, head==30, tail==10; then the memcpy() does a 10-byte overlap...). It has no users and never has had, so it doesn't matter too much.
Rasmus

Hi Rasmus,
On Wed, 9 Oct 2024 at 05:03, Rasmus Villemoes ravi@prevas.dk wrote:
Simon Glass sjg@chromium.org writes:
On Thu, 3 Oct 2024 at 08:10, Rasmus Villemoes ravi@prevas.dk wrote:
drivers/serial/serial-uclass.c | 10 ++++++---- include/serial.h | 4 ++-- 2 files changed, 8 insertions(+), 6 deletions(-)
Reviewed-by: Simon Glass sjg@chromium.org
Perhaps we should use membuff, like in other cases, since it has some tests?
I didn't know about that till just now. But no, it seems to suffer from the same basic defect of losing one slot, and while it may handle it correctly, I don't like to introduce more uses of that model, and open-coding the single putc/getc that we need here is really not that hard.
I will send a series which adds the 'full' flag in as an option. When I originally sent membuff I didn't include it.
Also, which tests? I don't see membuff mentioned anywhere under test/, but perhaps it's implicitly through the console recording testing?
Ooop, no tests. Will include with the series.
I just had a quick peek in the implementation, and membuff_makecontig() seems to be buggy: the second memcpy() must also be a memmove() (suppose size==32, head==30, tail==10; then the memcpy() does a 10-byte overlap...). It has no users and never has had, so it doesn't matter too much.
Nor does it have any tests, even with my series, unfortunately. Let's see if anyone finds a use for it. Or if you are keen you could add some.
Anyway: Reviewed-by: Simon Glass sjg@chromium.org
Regards, SImon

Before the previous patch, pasting a string of length x > CONFIG_SERIAL_RX_BUFFER_SIZE results in getting the last (x%CONFIG_SERIAL_RX_BUFFER_SIZE) characters from that string.
With the previous patch, one instead gets the last CONFIG_SERIAL_RX_BUFFER_SIZE characters repeatedly until the ->rd_ptr catches up.
Both behaviours are counter-intuitive, and happen because the code that checks for a character available from the hardware does not account for whether there is actually room in the software buffer to receive it. Fix that by adding such accounting. This also brings the software buffering more in line with how most hardware FIFOs behave (first received characters are kept, overflowing characters are dropped).
Signed-off-by: Rasmus Villemoes ravi@prevas.dk --- drivers/serial/serial-uclass.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/serial/serial-uclass.c b/drivers/serial/serial-uclass.c index 05fe9645bee..28d7a202afc 100644 --- a/drivers/serial/serial-uclass.c +++ b/drivers/serial/serial-uclass.c @@ -328,10 +328,11 @@ static int __serial_tstc(struct udevice *dev) static int _serial_tstc(struct udevice *dev) { struct serial_dev_priv *upriv = dev_get_uclass_priv(dev); - uint wr; + uint wr, avail;
- /* Read all available chars into the RX buffer */ - while (__serial_tstc(dev)) { + /* Read all available chars into the RX buffer while there's room */ + avail = CONFIG_SERIAL_RX_BUFFER_SIZE - (upriv->wr_ptr - upriv->rd_ptr); + while (avail-- && __serial_tstc(dev)) { wr = upriv->wr_ptr++ % CONFIG_SERIAL_RX_BUFFER_SIZE; upriv->buf[wr] = __serial_getc(dev); }

On Thu, 3 Oct 2024 at 08:10, Rasmus Villemoes ravi@prevas.dk wrote:
Before the previous patch, pasting a string of length x > CONFIG_SERIAL_RX_BUFFER_SIZE results in getting the last (x%CONFIG_SERIAL_RX_BUFFER_SIZE) characters from that string.
With the previous patch, one instead gets the last CONFIG_SERIAL_RX_BUFFER_SIZE characters repeatedly until the ->rd_ptr catches up.
Both behaviours are counter-intuitive, and happen because the code that checks for a character available from the hardware does not account for whether there is actually room in the software buffer to receive it. Fix that by adding such accounting. This also brings the software buffering more in line with how most hardware FIFOs behave (first received characters are kept, overflowing characters are dropped).
Signed-off-by: Rasmus Villemoes ravi@prevas.dk
drivers/serial/serial-uclass.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
Reviewed-by: Simon Glass sjg@chromium.org
diff --git a/drivers/serial/serial-uclass.c b/drivers/serial/serial-uclass.c index 05fe9645bee..28d7a202afc 100644 --- a/drivers/serial/serial-uclass.c +++ b/drivers/serial/serial-uclass.c @@ -328,10 +328,11 @@ static int __serial_tstc(struct udevice *dev) static int _serial_tstc(struct udevice *dev) { struct serial_dev_priv *upriv = dev_get_uclass_priv(dev);
uint wr;
uint wr, avail;
/* Read all available chars into the RX buffer */
while (__serial_tstc(dev)) {
/* Read all available chars into the RX buffer while there's room */
avail = CONFIG_SERIAL_RX_BUFFER_SIZE - (upriv->wr_ptr - upriv->rd_ptr);
while (avail-- && __serial_tstc(dev)) { wr = upriv->wr_ptr++ % CONFIG_SERIAL_RX_BUFFER_SIZE; upriv->buf[wr] = __serial_getc(dev); }
-- 2.46.2

The help text says it must be a power of 2, and the implementation does rely on that. Enforce it.
A violation gives a wall of text, but the last few lines should be reasonably obvious:
drivers/serial/serial-uclass.c:334:9: note: in expansion of macro ‘BUILD_BUG_ON_NOT_POWER_OF_2’ 334 | BUILD_BUG_ON_NOT_POWER_OF_2(CONFIG_SERIAL_RX_BUFFER_SIZE);
Signed-off-by: Rasmus Villemoes ravi@prevas.dk --- drivers/serial/serial-uclass.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/serial/serial-uclass.c b/drivers/serial/serial-uclass.c index 28d7a202afc..484f0f7d3e8 100644 --- a/drivers/serial/serial-uclass.c +++ b/drivers/serial/serial-uclass.c @@ -18,6 +18,7 @@ #include <dm/lists.h> #include <dm/device-internal.h> #include <dm/of_access.h> +#include <linux/build_bug.h> #include <linux/delay.h>
DECLARE_GLOBAL_DATA_PTR; @@ -330,6 +331,8 @@ static int _serial_tstc(struct udevice *dev) struct serial_dev_priv *upriv = dev_get_uclass_priv(dev); uint wr, avail;
+ BUILD_BUG_ON_NOT_POWER_OF_2(CONFIG_SERIAL_RX_BUFFER_SIZE); + /* Read all available chars into the RX buffer while there's room */ avail = CONFIG_SERIAL_RX_BUFFER_SIZE - (upriv->wr_ptr - upriv->rd_ptr); while (avail-- && __serial_tstc(dev)) {

On Thu, 3 Oct 2024 at 08:11, Rasmus Villemoes ravi@prevas.dk wrote:
The help text says it must be a power of 2, and the implementation does rely on that. Enforce it.
A violation gives a wall of text, but the last few lines should be reasonably obvious:
drivers/serial/serial-uclass.c:334:9: note: in expansion of macro ‘BUILD_BUG_ON_NOT_POWER_OF_2’ 334 | BUILD_BUG_ON_NOT_POWER_OF_2(CONFIG_SERIAL_RX_BUFFER_SIZE);
Signed-off-by: Rasmus Villemoes ravi@prevas.dk
drivers/serial/serial-uclass.c | 3 +++ 1 file changed, 3 insertions(+)
Reviewed-by: Simon Glass sjg@chromium.org
diff --git a/drivers/serial/serial-uclass.c b/drivers/serial/serial-uclass.c index 28d7a202afc..484f0f7d3e8 100644 --- a/drivers/serial/serial-uclass.c +++ b/drivers/serial/serial-uclass.c @@ -18,6 +18,7 @@ #include <dm/lists.h> #include <dm/device-internal.h> #include <dm/of_access.h> +#include <linux/build_bug.h> #include <linux/delay.h>
DECLARE_GLOBAL_DATA_PTR; @@ -330,6 +331,8 @@ static int _serial_tstc(struct udevice *dev) struct serial_dev_priv *upriv = dev_get_uclass_priv(dev); uint wr, avail;
BUILD_BUG_ON_NOT_POWER_OF_2(CONFIG_SERIAL_RX_BUFFER_SIZE);
/* Read all available chars into the RX buffer while there's room */ avail = CONFIG_SERIAL_RX_BUFFER_SIZE - (upriv->wr_ptr - upriv->rd_ptr); while (avail-- && __serial_tstc(dev)) {
-- 2.46.2

The initialization of upriv->buf doesn't check for a NULL return. But there's actually no point in doing a separate, unconditional malloc() in post_probe; we can just make serial_dev_priv contain the rx buffer itself, and let the (larger) allocation be handled by the driver core when it allocates the ->per_device_auto. The total run-time memory used is mostly the same, we reduce the code size a little, and as a bonus, struct serial_dev_priv does not contain the unused members when !SERIAL_RX_BUFFER.
Signed-off-by: Rasmus Villemoes ravi@prevas.dk --- drivers/serial/serial-uclass.c | 5 ----- include/serial.h | 4 +++- 2 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/serial/serial-uclass.c b/drivers/serial/serial-uclass.c index 484f0f7d3e8..d737e25223d 100644 --- a/drivers/serial/serial-uclass.c +++ b/drivers/serial/serial-uclass.c @@ -588,11 +588,6 @@ static int serial_post_probe(struct udevice *dev) sdev.getc = serial_stub_getc; sdev.tstc = serial_stub_tstc;
-#if CONFIG_IS_ENABLED(SERIAL_RX_BUFFER) - /* Allocate the RX buffer */ - upriv->buf = malloc(CONFIG_SERIAL_RX_BUFFER_SIZE); -#endif - stdio_register_dev(&sdev, &upriv->sdev); #endif return 0; diff --git a/include/serial.h b/include/serial.h index 14563239b7d..eabc49f820f 100644 --- a/include/serial.h +++ b/include/serial.h @@ -298,9 +298,11 @@ struct dm_serial_ops { struct serial_dev_priv { struct stdio_dev *sdev;
- char *buf; +#if CONFIG_IS_ENABLED(SERIAL_RX_BUFFER) + char buf[CONFIG_SERIAL_RX_BUFFER_SIZE]; uint rd_ptr; uint wr_ptr; +#endif };
/* Access the serial operations for a device */

On Thu, 3 Oct 2024 at 08:11, Rasmus Villemoes ravi@prevas.dk wrote:
The initialization of upriv->buf doesn't check for a NULL return. But there's actually no point in doing a separate, unconditional malloc() in post_probe; we can just make serial_dev_priv contain the rx buffer itself, and let the (larger) allocation be handled by the driver core when it allocates the ->per_device_auto. The total run-time memory used is mostly the same, we reduce the code size a little, and as a bonus, struct serial_dev_priv does not contain the unused members when !SERIAL_RX_BUFFER.
Signed-off-by: Rasmus Villemoes ravi@prevas.dk
drivers/serial/serial-uclass.c | 5 ----- include/serial.h | 4 +++- 2 files changed, 3 insertions(+), 6 deletions(-)
Reviewed-by: Simon Glass sjg@chromium.org
diff --git a/drivers/serial/serial-uclass.c b/drivers/serial/serial-uclass.c index 484f0f7d3e8..d737e25223d 100644 --- a/drivers/serial/serial-uclass.c +++ b/drivers/serial/serial-uclass.c @@ -588,11 +588,6 @@ static int serial_post_probe(struct udevice *dev) sdev.getc = serial_stub_getc; sdev.tstc = serial_stub_tstc;
-#if CONFIG_IS_ENABLED(SERIAL_RX_BUFFER)
/* Allocate the RX buffer */
upriv->buf = malloc(CONFIG_SERIAL_RX_BUFFER_SIZE);
-#endif
stdio_register_dev(&sdev, &upriv->sdev);
#endif return 0; diff --git a/include/serial.h b/include/serial.h index 14563239b7d..eabc49f820f 100644 --- a/include/serial.h +++ b/include/serial.h @@ -298,9 +298,11 @@ struct dm_serial_ops { struct serial_dev_priv { struct stdio_dev *sdev;
char *buf;
+#if CONFIG_IS_ENABLED(SERIAL_RX_BUFFER)
char buf[CONFIG_SERIAL_RX_BUFFER_SIZE]; uint rd_ptr; uint wr_ptr;
+#endif };
/* Access the serial operations for a device */
2.46.2

On Thu, 03 Oct 2024 16:10:25 +0200, Rasmus Villemoes wrote:
Some small improvements to the serial rx buffer feature.
CI seems happy: https://github.com/u-boot/u-boot/pull/674
Rasmus Villemoes (4): serial: fix circular rx buffer edge case serial: do not overwrite not-consumed characters in rx buffer serial: add build-time sanity check of CONFIG_SERIAL_RX_BUFFER_SIZE serial: embed the rx buffer in struct serial_dev_priv
[...]
Applied to u-boot/master, thanks!
participants (3)
-
Rasmus Villemoes
-
Simon Glass
-
Tom Rini