[PATCH v4 0/5] video: Improve syncing performance with cyclic

Now that U-Boot has a background-processing feature, it is possible to reduce the amount of 'foreground' syncing of the display. At present this happens quite often.
Foreground syncing blocks all other processing, sometimes for 10ms or more. When pasting commands into U-Boot over the UART, this typically result in characters being dropped. For example, on rpi_4 it isn't possible to paste in more than 35 characters before things fail. This makes updating the environment or entering long commands very painful over the console, since text must be pasted in chunks, or the vidconsole device must be dropped from stdout.
This series introduces background syncing, enabled by default for boards which use video. The sync rates for foreground and background are configurable.
With this series it is possible to paste in any amount of text to the command line. Some sandbox-specific workarounds can now be removed and sandbox video (./u-boot -Dl) is significantly more responsive.
This obviously increases code size, since it enables a subsystem not normally used by default. However it only applies to boards which have VIDEO enabled, which are presumably less worried about memory space since the video code is fairly large.
Also it is possible to disable CMD_CYCLIC and reduce the growth to:
aarch64: (for 1/1 boards) all +1081.0 rodata +65.0 text +1016.0 arm: (for 1/1 boards) all +945.0 rodata +65.0 text +880.0
Without that, the increase doubles.
It is of course possible to disable CYCLIC and still use VIDEO but this reverts to the current behaviour
Changes in v4: - Imply SPL_CYCLIC if WATCHDOG is enabled
Changes in v3: - Drop inclusion of kconfig.h in cyclic.h - Adapt to new cyclic API
Changes in v2: - Add an SPL_CYCLIC symbol - Add a lot more explanation about the header files - Expand help for CONFIG_VIDEO - Fix 'groth' and 'work-around' typos in cover letter
Simon Glass (5): cyclic: Add a symbol for SPL video: Move last_sync to private data video: Use cyclic to handle video sync sandbox: Increase cyclic CPU-time limit sandbox: Drop video-sync in serial driver
common/Kconfig | 9 ++++++ common/Makefile | 2 +- drivers/serial/sandbox.c | 2 -- drivers/video/Kconfig | 35 +++++++++++++++++++++ drivers/video/video-uclass.c | 52 +++++++++++++++++++++++++------ drivers/watchdog/Kconfig | 2 ++ include/asm-generic/global_data.h | 2 +- include/cyclic.h | 5 +-- include/video.h | 2 ++ 9 files changed, 96 insertions(+), 15 deletions(-)

The cyclic subsystem is currently enabled either in all build phases or none. For tools this should not be enabled, but since lib/shc256.c and other files include watchdog.h in the host build, we must make sure that it is not enabled there.
Add an SPL symbol so that there is more control of this.
Add an include into cyclic.h so that tools can include this file.
Signed-off-by: Simon Glass sjg@chromium.org Reviewed-by: Devarsh Thakkar devarsht@ti.com Reviewed-by: Stefan Roese sr@denx.de ---
Changes in v4: - Imply SPL_CYCLIC if WATCHDOG is enabled
Changes in v3: - Drop inclusion of kconfig.h in cyclic.h
Changes in v2: - Add an SPL_CYCLIC symbol - Add a lot more explanation about the header files
common/Kconfig | 8 ++++++++ common/Makefile | 2 +- drivers/watchdog/Kconfig | 2 ++ include/asm-generic/global_data.h | 2 +- include/cyclic.h | 5 +++-- 5 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/common/Kconfig b/common/Kconfig index 4bb9f08977a..87b0ec3ea8f 100644 --- a/common/Kconfig +++ b/common/Kconfig @@ -626,6 +626,14 @@ config CYCLIC
if CYCLIC
+config SPL_CYCLIC + bool "General-purpose cyclic execution mechanism (SPL)" + help + This enables a general-purpose cyclic execution infrastructure in SPL, + to allow "small" (run-time wise) functions to be executed at + a specified frequency. Things like LED blinking or watchdog + triggering are examples for such tasks. + config CYCLIC_MAX_CPU_TIME_US int "Sets the max allowed time for a cyclic function in us" default 5000 diff --git a/common/Makefile b/common/Makefile index e9835473420..d871113cbb9 100644 --- a/common/Makefile +++ b/common/Makefile @@ -79,7 +79,7 @@ obj-$(CONFIG_CROS_EC) += cros_ec.o obj-y += dlmalloc.o obj-$(CONFIG_$(SPL_TPL_)SYS_MALLOC_F) += malloc_simple.o
-obj-$(CONFIG_CYCLIC) += cyclic.o +obj-$(CONFIG_$(SPL_TPL_)CYCLIC) += cyclic.o obj-$(CONFIG_$(SPL_TPL_)EVENT) += event.o
obj-$(CONFIG_$(SPL_TPL_)HASH) += hash.o diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig index 8318fd77a32..0c3e9913318 100644 --- a/drivers/watchdog/Kconfig +++ b/drivers/watchdog/Kconfig @@ -4,6 +4,7 @@ config WATCHDOG bool "Enable U-Boot watchdog reset" depends on !HW_WATCHDOG select CYCLIC + imply SPL_CYCLIC if SPL help This option enables U-Boot watchdog support where U-Boot is using watchdog_reset function to service watchdog device in U-Boot. Enable @@ -408,6 +409,7 @@ config WDT_ARM_SMC config SPL_WDT bool "Enable driver model for watchdog timer drivers in SPL" depends on SPL_DM + select SPL_CYCLIC if CYCLIC help Enable driver model for watchdog timer in SPL. This is similar to CONFIG_WDT in U-Boot. diff --git a/include/asm-generic/global_data.h b/include/asm-generic/global_data.h index aa336d63e3a..27aa75e7036 100644 --- a/include/asm-generic/global_data.h +++ b/include/asm-generic/global_data.h @@ -481,7 +481,7 @@ struct global_data { */ struct event_state event_state; #endif -#ifdef CONFIG_CYCLIC +#if CONFIG_IS_ENABLED(CYCLIC) /** * @cyclic_list: list of registered cyclic functions */ diff --git a/include/cyclic.h b/include/cyclic.h index 2c3d383c5ef..cd95b691d48 100644 --- a/include/cyclic.h +++ b/include/cyclic.h @@ -46,7 +46,8 @@ struct cyclic_info { /** Function type for cyclic functions */ typedef void (*cyclic_func_t)(struct cyclic_info *c);
-#if defined(CONFIG_CYCLIC) +#if CONFIG_IS_ENABLED(CYCLIC) + /** * cyclic_register - Register a new cyclic function * @@ -123,6 +124,6 @@ static inline int cyclic_unregister_all(void) { return 0; } -#endif +#endif /* CYCLIC */
#endif

Rather than using a static variable, use the video device's private data to remember when the last video sync was completed. This allows each display to have its own sync and avoids using static data in SPL.
Signed-off-by: Simon Glass sjg@chromium.org ---
(no changes since v1)
drivers/video/video-uclass.c | 10 +++------- include/video.h | 2 ++ 2 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/video/video-uclass.c b/drivers/video/video-uclass.c index ff1382f4a43..a95b5f199dc 100644 --- a/drivers/video/video-uclass.c +++ b/drivers/video/video-uclass.c @@ -349,6 +349,7 @@ void video_set_default_colors(struct udevice *dev, bool invert) /* Flush video activity to the caches */ int video_sync(struct udevice *vid, bool force) { + struct video_priv *priv = dev_get_uclass_priv(vid); struct video_ops *ops = video_get_ops(vid); int ret;
@@ -364,20 +365,15 @@ int video_sync(struct udevice *vid, bool force) * out whether it exists? For now, ARM is safe. */ #if defined(CONFIG_ARM) && !CONFIG_IS_ENABLED(SYS_DCACHE_OFF) - struct video_priv *priv = dev_get_uclass_priv(vid); - if (priv->flush_dcache) { flush_dcache_range((ulong)priv->fb, ALIGN((ulong)priv->fb + priv->fb_size, CONFIG_SYS_CACHELINE_SIZE)); } #elif defined(CONFIG_VIDEO_SANDBOX_SDL) - struct video_priv *priv = dev_get_uclass_priv(vid); - static ulong last_sync; - - if (force || get_timer(last_sync) > 100) { + if (force || get_timer(priv->last_sync) > 100) { sandbox_sdl_sync(priv->fb); - last_sync = get_timer(0); + priv->last_sync = get_timer(0); } #endif return 0; diff --git a/include/video.h b/include/video.h index 4d8df9baaad..4013a949983 100644 --- a/include/video.h +++ b/include/video.h @@ -97,6 +97,7 @@ enum video_format { * the LCD is updated * @fg_col_idx: Foreground color code (bit 3 = bold, bit 0-2 = color) * @bg_col_idx: Background color code (bit 3 = bold, bit 0-2 = color) + * @last_sync: Monotonic time of last video sync */ struct video_priv { /* Things set up by the driver: */ @@ -121,6 +122,7 @@ struct video_priv { bool flush_dcache; u8 fg_col_idx; u8 bg_col_idx; + ulong last_sync; };
/**

At present U-Boot flushes the cache after every character written to ths display. This makes the command-line slower, to the point that pasting in long strings can fail.
Add a cyclic function to sync the display every 10ms. Enable this by default.
Allow much longer times for sandbox, since the SDL display is quite slow.
Avoid size growth if the feature is disabled by making the new init and destroy functions dependent on CYCLIC being enabled.
Signed-off-by: Simon Glass sjg@chromium.org ---
(no changes since v3)
Changes in v3: - Adapt to new cyclic API
Changes in v2: - Expand help for CONFIG_VIDEO
drivers/video/Kconfig | 35 +++++++++++++++++++++++++++ drivers/video/video-uclass.c | 46 ++++++++++++++++++++++++++++++++---- 2 files changed, 77 insertions(+), 4 deletions(-)
diff --git a/drivers/video/Kconfig b/drivers/video/Kconfig index 7808ae7919e..6e79694fd19 100644 --- a/drivers/video/Kconfig +++ b/drivers/video/Kconfig @@ -7,6 +7,7 @@ menu "Graphics support" config VIDEO bool "Enable driver model support for LCD/video" depends on DM + imply CYCLIC help This enables driver model for LCD and video devices. These support a bitmap display of various sizes and depths which can be drawn on @@ -14,6 +15,11 @@ config VIDEO option compiles in the video uclass and routes all LCD/video access through this.
+ If CYCLIC is enabled (which it is by default), the cyclic subsystem + is used to flush pending output to the display periodically, rather + than this happening with every chunk of output. This allows for more + efficient operation and faster display output. + if VIDEO
config VIDEO_FONT_4X6 @@ -232,6 +238,35 @@ config NO_FB_CLEAR loads takes over the screen. This, for example, can be used to keep splash image on screen until grub graphical boot menu starts.
+config VIDEO_SYNC_MS + int "Video-sync period in milliseconds for foreground processing" + default 300 if SANDBOX + default 100 + help + This sets the requested, maximum time before a video sync will take + place, in milliseconds. Note that the time between video syncs + may be longer than this, since syncs only happen when the video system + is used, e.g. by outputting a character to the console. + + It may also be shorter, since the video uclass will automatically + force a sync in certain situations. + + Many video-output systems require a sync operation before any output + is visible. This may flush the CPU cache or perhaps copy the + display contents to a hardware framebuffer. Without this, change to + the video may never be displayed. + +config VIDEO_SYNC_CYCLIC_MS + int "Video-sync period in milliseconds for cyclic processing" + depends on CYCLIC + default 100 if SANDBOX + default 10 + help + This sets the frequency of cyclic video syncs. The cyclic system is + used to ensure that when U-Boot is idle, it syncs the video. This + improves the responsiveness of the command line to new characters + being entered. + config PANEL bool "Enable panel uclass support" default y diff --git a/drivers/video/video-uclass.c b/drivers/video/video-uclass.c index a95b5f199dc..a5aa8dd5295 100644 --- a/drivers/video/video-uclass.c +++ b/drivers/video/video-uclass.c @@ -8,6 +8,7 @@ #include <bloblist.h> #include <console.h> #include <cpu_func.h> +#include <cyclic.h> #include <dm.h> #include <log.h> #include <malloc.h> @@ -52,6 +53,8 @@ */ DECLARE_GLOBAL_DATA_PTR;
+struct cyclic_info; + /** * struct video_uc_priv - Information for the video uclass * @@ -60,9 +63,12 @@ DECLARE_GLOBAL_DATA_PTR; * available address to use for a device's framebuffer. It starts at * gd->video_top and works downwards, running out of space when it hits * gd->video_bottom. + * @cyc: handle for cyclic-execution function, or NULL if none */ struct video_uc_priv { ulong video_ptr; + bool cyc_active; + struct cyclic_info cyc; };
/** struct vid_rgb - Describes a video colour */ @@ -359,6 +365,10 @@ int video_sync(struct udevice *vid, bool force) return ret; }
+ if (CONFIG_IS_ENABLED(CYCLIC) && !force && + get_timer(priv->last_sync) < CONFIG_VIDEO_SYNC_MS) + return 0; + /* * flush_dcache_range() is declared in common.h but it seems that some * architectures do not actually implement it. Is there a way to find @@ -371,11 +381,10 @@ int video_sync(struct udevice *vid, bool force) CONFIG_SYS_CACHELINE_SIZE)); } #elif defined(CONFIG_VIDEO_SANDBOX_SDL) - if (force || get_timer(priv->last_sync) > 100) { - sandbox_sdl_sync(priv->fb); - priv->last_sync = get_timer(0); - } + sandbox_sdl_sync(priv->fb); #endif + priv->last_sync = get_timer(0); + return 0; }
@@ -524,10 +533,16 @@ int video_default_font_height(struct udevice *dev) return vc_priv->y_charsize; }
+static void video_idle(struct cyclic_info *cyc) +{ + video_sync_all(); +} + /* Set up the display ready for use */ static int video_post_probe(struct udevice *dev) { struct video_uc_plat *plat = dev_get_uclass_plat(dev); + struct video_uc_priv *uc_priv = uclass_get_priv(dev->uclass); struct video_priv *priv = dev_get_uclass_priv(dev); char name[30], drv[15], *str; const char *drv_name = drv; @@ -618,6 +633,16 @@ static int video_post_probe(struct udevice *dev) } }
+ /* register cyclic as soon as the first video device is probed */ + if (CONFIG_IS_ENABLED(CYCLIC) && (gd->flags && GD_FLG_RELOC) && + !uc_priv->cyc_active) { + uint ms = CONFIG_IF_ENABLED_INT(CYCLIC, VIDEO_SYNC_CYCLIC_MS); + + cyclic_register(&uc_priv->cyc, video_idle, ms * 1000, + "video_init"); + uc_priv->cyc_active = true; + } + return 0; };
@@ -657,6 +682,18 @@ static int video_post_bind(struct udevice *dev) return 0; }
+__maybe_unused static int video_destroy(struct uclass *uc) +{ + struct video_uc_priv *uc_priv = uclass_get_priv(uc); + + if (uc_priv->cyc_active) { + cyclic_unregister(&uc_priv->cyc); + uc_priv->cyc_active = false; + } + + return 0; +} + UCLASS_DRIVER(video) = { .id = UCLASS_VIDEO, .name = "video", @@ -666,4 +703,5 @@ UCLASS_DRIVER(video) = { .priv_auto = sizeof(struct video_uc_priv), .per_device_auto = sizeof(struct video_priv), .per_device_plat_auto = sizeof(struct video_uc_plat), + CONFIG_IS_ENABLED(CYCLIC, (.destroy = video_destroy, )) };

Now that sandbox is using cyclic for video, it trips the 1us time limit. Updating the sandbox display often takes 20ms or more.
Increase the limit to 100ms to avoid a warning.
Signed-off-by: Simon Glass sjg@chromium.org ---
(no changes since v1)
common/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/common/Kconfig b/common/Kconfig index 87b0ec3ea8f..83c81edac20 100644 --- a/common/Kconfig +++ b/common/Kconfig @@ -636,6 +636,7 @@ config SPL_CYCLIC
config CYCLIC_MAX_CPU_TIME_US int "Sets the max allowed time for a cyclic function in us" + default 100000 if SANDBOX # sandbox video is quite slow default 5000 help The max allowed time for a cyclic function in us. If a functions

With sandbox, when U-Boot is waiting for input it syncs the video display, since presumably the user has finished typing.
Now that cyclic is used for video syncing, we can drop this. Cyclic will automatically call the video_idle() function when idle.
Signed-off-by: Simon Glass sjg@chromium.org ---
(no changes since v2)
Changes in v2: - Fix 'groth' and 'work-around' typos in cover letter
drivers/serial/sandbox.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/serial/sandbox.c b/drivers/serial/sandbox.c index ec0068e33d3..77a1558db68 100644 --- a/drivers/serial/sandbox.c +++ b/drivers/serial/sandbox.c @@ -138,8 +138,6 @@ static int sandbox_serial_pending(struct udevice *dev, bool input) return 0;
os_usleep(100); - if (IS_ENABLED(CONFIG_VIDEO) && !IS_ENABLED(CONFIG_SPL_BUILD)) - video_sync_all(); avail = membuff_putraw(&priv->buf, 100, false, &data); if (!avail) return 1; /* buffer full */

Hi Simon,
On Wed, 31 Jul 2024 08:44:07 -0600 Simon Glass sjg@chromium.org wrote: ...
Simon Glass (5): cyclic: Add a symbol for SPL video: Move last_sync to private data video: Use cyclic to handle video sync sandbox: Increase cyclic CPU-time limit sandbox: Drop video-sync in serial driver
common/Kconfig | 9 ++++++ common/Makefile | 2 +- drivers/serial/sandbox.c | 2 -- drivers/video/Kconfig | 35 +++++++++++++++++++++ drivers/video/video-uclass.c | 52 +++++++++++++++++++++++++------ drivers/watchdog/Kconfig | 2 ++ include/asm-generic/global_data.h | 2 +- include/cyclic.h | 5 +-- include/video.h | 2 ++ 9 files changed, 96 insertions(+), 15 deletions(-)
Applied to u-boot-video, thanks!
-- Anatolij
participants (2)
-
Anatolij Gustschin
-
Simon Glass