[U-Boot] [PATCH] tools: mkimage can read input on /dev/stdin

Hi,
I would like to give /dev/stdin to the flag -d of mkimage. The only thing that prevent doing it is the function copy_file of mkimage.c, which: - calls stat(2) on the file to get the input file size - calls mmap(2) with this size as length
When the file is a pipe, its size is set to 0 and mmap(2) fails.
This patch replaces the use of mmap(2) with read(2). If accepted, I could give a look to accept /dev/stdout as output file (which is currently also required to be a file).
From e107bdc73ee7b2159956cfc753328f9f03c058e8 Mon Sep 17 00:00:00 2001
From: Julien Castets castets.j@gmail.com Date: Fri, 26 Sep 2014 11:28:49 +0200 Subject: [PATCH] tools: mkimage can read input on /dev/stdin
Use a sequential read(2) instead of a mmap(2) in copy_file.
Signed-off-by: Julien Castets castets.j@gmail.com --- tools/mkimage.c | 64 +++++++++++++++++++++++++++---------------------------- 1 file changed, 31 insertions(+), 33 deletions(-)
diff --git a/tools/mkimage.c b/tools/mkimage.c index c70408c..bb35110 100644 --- a/tools/mkimage.c +++ b/tools/mkimage.c @@ -522,14 +522,14 @@ static void copy_file (int ifd, const char *datafile, int pad) { int dfd; - struct stat sbuf; - unsigned char *ptr; int tail; int zero = 0; uint8_t zeros[4096]; - int offset = 0; - int size; struct image_type_params *tparams = mkimage_get_type (params.type); + unsigned char buf[4096]; + ssize_t nbytes; + ssize_t i; + ssize_t size = 0;
if (pad >= sizeof(zeros)) { fprintf(stderr, "%s: Can't pad to %d\n", @@ -549,63 +549,62 @@ copy_file (int ifd, const char *datafile, int pad) exit (EXIT_FAILURE); }
- if (fstat(dfd, &sbuf) < 0) { - fprintf (stderr, "%s: Can't stat %s: %s\n", - params.cmdname, datafile, strerror(errno)); - exit (EXIT_FAILURE); - } - - ptr = mmap(0, sbuf.st_size, PROT_READ, MAP_SHARED, dfd, 0); - if (ptr == MAP_FAILED) { - fprintf (stderr, "%s: Can't read %s: %s\n", - params.cmdname, datafile, strerror(errno)); - exit (EXIT_FAILURE); - } - if (params.xflag) { - unsigned char *p = NULL; /* * XIP: do not append the image_header_t at the * beginning of the file, but consume the space * reserved for it. */ + nbytes = read(dfd, buf, tparams->header_size); + if (nbytes == -1) { + fprintf (stderr, "%s: Can't read XIP header of %s: %s\n", + params.cmdname, datafile, strerror(errno)); + exit (EXIT_FAILURE); + }
- if ((unsigned)sbuf.st_size < tparams->header_size) { + if (nbytes < tparams->header_size) { fprintf (stderr, "%s: Bad size: "%s" is too small for XIP\n", params.cmdname, datafile); exit (EXIT_FAILURE); }
- for (p = ptr; p < ptr + tparams->header_size; p++) { - if ( *p != 0xff ) { + for (i = 0; i < nbytes; ++i) { + if (buf[i] != 0xff) { fprintf (stderr, "%s: Bad file: "%s" has invalid buffer for XIP\n", params.cmdname, datafile); exit (EXIT_FAILURE); } } + }
- offset = tparams->header_size; + while ((nbytes = read(dfd, buf, sizeof(buf))) > 0) { + if (write(ifd, buf, nbytes) != nbytes) { + fprintf (stderr, "%s: Write error on %s: %s\n", + params.cmdname, params.imagefile, strerror(errno)); + exit (EXIT_FAILURE); + } + size += nbytes; }
- size = sbuf.st_size - offset; - if (write(ifd, ptr + offset, size) != size) { - fprintf (stderr, "%s: Write error on %s: %s\n", - params.cmdname, params.imagefile, strerror(errno)); - exit (EXIT_FAILURE); + if (nbytes == -1) { + fprintf (stderr, "%s: Read error on %s: %s\n", + params.cmdname, params.imagefile, strerror(errno)); + exit (EXIT_FAILURE); }
tail = size % 4; if ((pad == 1) && (tail != 0)) { - - if (write(ifd, (char *)&zero, 4-tail) != 4-tail) { + if (write(ifd, (char *)&zero, 4 - tail) != 4 - tail) { fprintf (stderr, "%s: Write error on %s: %s\n", - params.cmdname, params.imagefile, - strerror(errno)); + params.cmdname, params.imagefile, + strerror(errno)); exit (EXIT_FAILURE); } - } else if (pad > 1) { + } + + else if (pad > 1) { if (write(ifd, (char *)&zeros, pad) != pad) { fprintf(stderr, "%s: Write error on %s: %s\n", params.cmdname, params.imagefile, @@ -614,7 +613,6 @@ copy_file (int ifd, const char *datafile, int pad) } }
- (void) munmap((void *)ptr, sbuf.st_size); (void) close (dfd); }

Dear Julien,
In message CADF714agpUedt_o3iXcGBRBc5CaV6TBUn=1oGEqJj3h2b+f-Vg@mail.gmail.com you wrote:
I would like to give /dev/stdin to the flag -d of mkimage. The only
What would be the benefit of doing so? Do you have an example for a practical use case where this makes sense?
This patch replaces the use of mmap(2) with read(2). If accepted, I could give a look to accept /dev/stdout as output file (which is currently also required to be a file).
What is the size and performance impact of the suggested change for typical use cases?
Best regards,
Wolfgang Denk

On Sat, Sep 27, 2014 at 2:54 PM, Wolfgang Denk wd@denx.de wrote:
What would be the benefit of doing so? Do you have an example for a practical use case where this makes sense?
In my case, I have a TFTP server that dyncamically generates uboot bootfiles when a specific file is requested. The input template file being generated, I need to create a temporary file to store it before calling mkimage. Except the mmap, there's no technical restriction for mkimage to be able to read on a pipe.
What is the size and performance impact of the suggested change for typical use cases?
None. The behaviour is exactly the same.

Dear Julien,
In message CADF714bJ6-WvjBSd1u8UA-zT+H0TUO5vqO3_-6Q34yUjvbkVxw@mail.gmail.com you wrote:
On Sat, Sep 27, 2014 at 2:54 PM, Wolfgang Denk wd@denx.de wrote:
What would be the benefit of doing so? Do you have an example for a practical use case where this makes sense?
In my case, I have a TFTP server that dyncamically generates uboot bootfiles when a specific file is requested. The input template file being generated, I need to create a temporary file to store it before calling mkimage. Except the mmap, there's no technical restriction for mkimage to be able to read on a pipe.
Sorry, but I don't understand this. Where are the image(s) coming from, then? Who or what is feeding the pipe?
What is the size and performance impact of the suggested change for typical use cases?
None. The behaviour is exactly the same.
I don't believe you. Sizes rare certainly not identical, and neither is the performance. Did you do any real measurements?
Best regards,
Wolfgang Denk

On Sat, Sep 27, 2014 at 3:25 PM, Wolfgang Denk wd@denx.de wrote:
Sorry, but I don't understand this. Where are the image(s) coming from, then? Who or what is feeding the pipe?
Sorry, I wrote quickly.
In my situation: - I have a Python implementation of a TFTP server - when the file named "uboot.bootscript" is requested, a template is rendered. Currently, this template is stored in a temporary file - A new process is created to call mkimage, with the temporary file as input and another temporary file as output (mkimage -A arm -O linux -a 0 -e 0 -T script -C none -n 'comment' -d tmp_input tmp_output) - Finally, the output file content is returned to the client
Instead of creating two temporary file, two pipes could be used (-d /dev/stdin /dev/stdout), which, in my case, would be more efficient (even if I don't really have performance issues).
What is the size and performance impact of the suggested change for typical use cases?
None. The behaviour is exactly the same.
I don't believe you. Sizes rare certainly not identical, and neither is the performance. Did you do any real measurements?
mmap is useful when you need to make random access in a file, or to optimize memory: when a file is mmapped, the kernel only loads the parts of the file that are accessed. In the case of mkimage, the file is read sequentially (meaning all of it will be retrieved from the disk).

Dear Julien,
In message CADF714bpHv6x8wUxs=7H-st3-ThPCeBWbZvO_-uoQcdWS6WqJQ@mail.gmail.com you wrote:
I don't believe you. Sizes rare certainly not identical, and neither is the performance. Did you do any real measurements?
mmap is useful when you need to make random access in a file, or to optimize memory: when a file is mmapped, the kernel only loads the parts of the file that are accessed. In the case of mkimage, the file is read sequentially (meaning all of it will be retrieved from the disk).
OK, so you don't have any real data, but make a very specific statement: "exactly the same."
Sorry, but this is not an answer I'm going to buy.
Best regards,
Wolfgang Denk

On Sep 27, 2014 8:24 PM, "Wolfgang Denk" > OK, so you don't have any real data, but make a very specific
statement: "exactly the same."
Sorry, but this is not an answer I'm going to buy.
I'm not sure to understand what you mean. In both cases, the file is copied.
What is bothering you?

Dear Julien,
In message CADF714bLG21jobjTnbUbWqsyj3xbzL+Fb+WBfcrq8YH4-ugm5Q@mail.gmail.com you wrote:
I'm not sure to understand what you mean. In both cases, the file is copied.
What is bothering you?
I asked:
| What is the size and performance impact of the suggested change for | typical use cases?
Can you please provide values for the size of the binary and the execution time?
It's not really critical, but I'd like to understand the impact of your changes. You use case is pretty exotic, so it seems a valid question to me to try to understand what the extended functionality costs.
Best regards,
Wolfgang Denk

On Saturday, September 27, 2014 at 11:56:55 PM, Wolfgang Denk wrote:
Hello Wolfgang,
Dear Julien,
In message <CADF714bLG21jobjTnbUbWqsyj3xbzL+Fb+WBfcrq8YH4-
ugm5Q@mail.gmail.com> you wrote:
I'm not sure to understand what you mean. In both cases, the file is copied.
What is bothering you?
I asked: | What is the size and performance impact of the suggested change for | typical use cases?
Can you please provide values for the size of the binary and the execution time?
It's not really critical, but I'd like to understand the impact of your changes. You use case is pretty exotic, so it seems a valid question to me to try to understand what the extended functionality costs.
Won't it be better to focus on the overall concept first and dig in the finer details later ?
I think right now, the question is -- do we want to support stdin as a source of payload for mkimage or not at all?
Best regards, Marek Vasut

Dear Marek,
In message 201409280001.26383.marex@denx.de you wrote:
Can you please provide values for the size of the binary and the execution time?
It's not really critical, but I'd like to understand the impact of your changes. You use case is pretty exotic, so it seems a valid question to me to try to understand what the extended functionality costs.
Won't it be better to focus on the overall concept first and dig in the finer details later ?
I think right now, the question is -- do we want to support stdin as a source of payload for mkimage or not at all?
The general approach to new features in U-Boot is: 1) is it useful at least to some? and 2) does it not hurt others?
Re 1), I think the use case is pretty exostic, but apparently there is at least one user for that.
Re 2), we need some numbers. Plain mmap() on a regular file is supposed to be the fastest possible I/O method in a Unix OS, so we should understand how much a change costs, or if it makes sense to provide different implementations depending on input type (read() for stdin vs. mmap() for regular files). Or if the differences are so small that this is all not worth the time we spend here.
Best regards,
Wolfgang Denk

Thanks for your comments,
On Sat, Sep 27, 2014 at 11:56 PM, Wolfgang Denk wd@denx.de wrote:
Can you please provide values for the size of the binary and the execution time?
Without the patch, with mmap: $> dd if=/dev/zero of=test bs=1M count=10 $> time ./mkimage -A arm -O linux -a 0 -e 0 -T script -C none -n 'test' -d test test_out ... real 0m0.168s user 0m0.027s sys 0m0.023s
With the patch, with read: $> dd if=/dev/zero of=test bs=1M count=10 $> time ./mkimage -A arm -O linux -a 0 -e 0 -T script -C none -n 'test' -d test test_out ... real 0m0.160s user 0m0.025s sys 0m0.016s
In both cases, the binary mkimage size is 144k bytes (147333 with mmap, 147421 with read). Compiled with make sandbox_defconfig and make_tools.
It's not really critical, but I'd like to understand the impact of your changes. You use case is pretty exotic, so it seems a valid question to me to try to understand what the extended functionality costs.
I understand the use case, in its globality, is pretty exotic. However, I don't really get why giving /dev/stdin as input is.
Regards,

Dear Julien,
In message CADF714btdiJT=ccVRBuJk7xUJCgapA-ued=E1J1C0v-Bxa2h=A@mail.gmail.com you wrote:
Without the patch, with mmap:
Thanks for the numbers. So these indeed make no real difference.
I understand the use case, in its globality, is pretty exotic. However, I don't really get why giving /dev/stdin as input is.
The case where mkimage is taking a single input file is quickly becoming a rare corner case.
The recommended way to build U-Boot images is (and has been for years, even though marketing for this has been poor) to build FIT images. In this case you typically have several inout files, which are not even listed on the mkimage command line, but referenced in the fit-image.its file you use. OK, in this case you could feed the fit-image.its through stdin. But there are other file arguments - like when using signed images.
Even if you use the (deprecated) legacy image format, the case of using a single input file is not the generic one. mkimage syntax is:
mkimage ... -d data_file[:data_file...] ...
and allows to provide several input files - pretty often you need the kernel image and the DT blob. It is also not uncommon to have a ramdisk image included.
So if we add support to read from stdin instead from a file where we pass the file name as an argument, we should probably do this in a consistent way. It would be a frustrating experience to the end user to learn that he can use stdin here but not there - so we would probably have to rework all these use cases? And how should we implement this - would a file name "-" mean stdin (1), or should we simply pass "/dev/stdin" as file argument (2)?
With (1), we need to change more code, while (2) could probably be pretty transparent.
You see, even such a simple change like your suggestion needs some deeper thought if you want to keep the design consistent. This is why I am asking about your use case, and how it would fit into other situations.
Best regards,
Wolfgang Denk

On Sun, Sep 28, 2014 at 8:49 AM, Wolfgang Denk wd@denx.de wrote:
The case where mkimage is taking a single input file is quickly becoming a rare corner case.
The recommended way to build U-Boot images is (and has been for years, even though marketing for this has been poor) to build FIT images. In this case you typically have several inout files, which are not even listed on the mkimage command line, but referenced in the fit-image.its file you use. OK, in this case you could feed the fit-image.its through stdin. But there are other file arguments - like when using signed images.
Even if you use the (deprecated) legacy image format, the case of using a single input file is not the generic one. mkimage syntax is:
mkimage ... -d data_file[:data_file...] ...
and allows to provide several input files - pretty often you need the kernel image and the DT blob. It is also not uncommon to have a ramdisk image included.
Thank you so much for you explanations. To tell the truth, I don't even know what a FIT image is, and even if I saw the usage accepted several files as input, I didn't search to understand why.
So if we add support to read from stdin instead from a file where we pass the file name as an argument, we should probably do this in a consistent way. It would be a frustrating experience to the end user to learn that he can use stdin here but not there - so we would probably have to rework all these use cases? And how should we implement this - would a file name "-" mean stdin (1), or should we simply pass "/dev/stdin" as file argument (2)?
With (1), we need to change more code, while (2) could probably be pretty transparent.
If I understand well, your proposition for (1) would be to use mmap(2) for everything, but use read(2) for the special case "-".
I'm not sure it is a good idea. The standard input can be handled like any other file. And note the input could also be a named pipe, that you won't be able to mmap. With the patch proposed, it would work just fine.
Also, in the case you're having several files as input, they will be consumed one after the other. So if the input is "-d /dev/stdin:/dev/stdin:/dev/stdin", you can give the three files through stdin.
You see, even such a simple change like your suggestion needs some deeper thought if you want to keep the design consistent. This is why I am asking about your use case, and how it would fit into other situations.
Indeed!

Dear Julien,
In message CADF714Y4ufr4SuoF83tgyJxyhx1gdKYKw9E6JWtYhHfM3NxzUg@mail.gmail.com you wrote:
So if we add support to read from stdin instead from a file where we pass the file name as an argument, we should probably do this in a consistent way. It would be a frustrating experience to the end user to learn that he can use stdin here but not there - so we would probably have to rework all these use cases? And how should we implement this - would a file name "-" mean stdin (1), or should we simply pass "/dev/stdin" as file argument (2)?
With (1), we need to change more code, while (2) could probably be pretty transparent.
If I understand well, your proposition for (1) would be to use mmap(2) for everything, but use read(2) for the special case "-".
I did not mean to suggest this. I probably makes more sense to use the same code everywhere.
I'm not sure it is a good idea. The standard input can be handled like any other file. And note the input could also be a named pipe, that you won't be able to mmap. With the patch proposed, it would work just fine.
But the patch would only be a part of the implementation. I think we should see it all together to be able to compare approaches.
Also, in the case you're having several files as input, they will be consumed one after the other. So if the input is "-d /dev/stdin:/dev/stdin:/dev/stdin", you can give the three files through stdin.
Ouch. That would be error prone as hell. Not all things that can be done should be done ;-)
Best regards,
Wolfgang Denk
participants (3)
-
Julien Castets
-
Marek Vasut
-
Wolfgang Denk