[U-Boot] [PATCH] patman: encode CC list to UTF-8

This change encodes the CC list to UTF-8 to avoid failures on maintainer-addresses that include non-ASCII characters (observed on Debian 7.11 with Python 2.7.3).

Without this, I get the following failure:

  Traceback (most recent call last):
    File "tools/patman/patman", line 159, in <module>
      options.add_maintainers)
    File "[snip]/u-boot/tools/patman/series.py", line 234, in MakeCcFile
      print(commit.patch, ', '.join(set(list)), file=fd)
  UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 81: ordinal not in range(128)

from Heiko's email address:

  [..., u'"Heiko St\xfcbner" heiko@sntech.de', ...]

With this change applied, this encodes to:

  "=?UTF-8?q?Heiko=20St=C3=BCbner?= heiko@sntech.de"

Signed-off-by: Philipp Tomsich philipp.tomsich@theobroma-systems.com
---
 tools/patman/series.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/patman/series.py b/tools/patman/series.py
index c1b8652..134a381 100644
--- a/tools/patman/series.py
+++ b/tools/patman/series.py
@@ -119,7 +119,7 @@ class Series(dict):
                     email = col.Color(col.YELLOW, "<alias '%s' not found>"
                             % tag)
                 if email:
-                    print(' Cc: ', email)
+                    print(' Cc: ', email.encode('utf-8'))
         print
         for item in to_set:
             print('To:\t ', item)
@@ -230,7 +230,7 @@ class Series(dict):
             if add_maintainers:
                 list += get_maintainer.GetMaintainer(commit.patch)
             all_ccs += list
-            print(commit.patch, ', '.join(set(list)), file=fd)
+            print(commit.patch, ', '.join(set(list)).encode('utf-8'), file=fd)
             self._generated_cc[commit.patch] = list
 
         if cover_fname:
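
The failure and the fix described in the commit message can be reproduced in isolation. The following is a minimal sketch, assuming Python 3 syntax (the report itself was on Python 2.7.3, where the implicit ASCII encoding happens inside print). The helper name `encode_or_none` is hypothetical and not part of patman.

```python
# Sketch only: Python 3 syntax; the original report was on Python 2.7.3.
from email.header import decode_header, make_header

# The address from the report, containing the non-ASCII character u'\xfc'.
address = '"Heiko St\xfcbner" heiko@sntech.de'


def encode_or_none(text, codec):
    """Hypothetical helper: encode text, or return None if codec cannot."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


ascii_result = encode_or_none(address, 'ascii')  # ASCII has no u'\xfc'
utf8_result = encode_or_none(address, 'utf-8')   # UTF-8 can represent it

# The encoded-word quoted in the commit message decodes back to the name.
decoded_name = str(make_header(decode_header('=?UTF-8?q?Heiko=20St=C3=BCbner?=')))

# ascii() keeps the output printable even on an ASCII-only terminal.
print(ascii(ascii_result), ascii(utf8_result), ascii(decoded_name))
```

The explicit `.encode('utf-8')` sidesteps the implicit ASCII conversion; the `=C3=BC` in the encoded-word is the same two UTF-8 bytes written in the quoted-printable form used for mail headers.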

+Tom

On 19 April 2017 at 07:24, Philipp Tomsich philipp.tomsich@theobroma-systems.com wrote:
> This change encodes the CC list to UTF-8 to avoid failures on maintainer-addresses that include non-ASCII characters (observed on Debian 7.11 with Python 2.7.3).
> [...]

Reviewed-by: Simon Glass sjg@chromium.org

On Sat, Apr 22, 2017 at 05:53:36PM -0600, Simon Glass wrote:
> +Tom
>
> On 19 April 2017 at 07:24, Philipp Tomsich philipp.tomsich@theobroma-systems.com wrote:
> > This change encodes the CC list to UTF-8 to avoid failures on maintainer-addresses that include non-ASCII characters (observed on Debian 7.11 with Python 2.7.3).
> > [...]
>
> Reviewed-by: Simon Glass sjg@chromium.org

Please put this in a PR for me, along with any other critical fixes to the various python tools we have, thanks!

And also, do we need to perhaps whack something at a higher level, and more consistently, about unicode? This is, I gather, doing UTF-8 right. In buildman we have a few patches to just translate to latin-1 instead. We should do the same thing I think, and perhaps there's a higher level up in the code where we need to do it too? I don't know..
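
To make the UTF-8 versus latin-1 comparison above concrete, here is a small sketch of which characters each codec can represent. It does not reproduce the buildman patches mentioned; the helper `can_encode` and the second sample name are hypothetical and chosen only for illustration.

```python
def can_encode(text, codec):
    """Hypothetical helper: True if every character of text fits in codec."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


# u'\xfc' (from the thread) exists in Latin-1; u'\u0141' (a hypothetical
# second name, not from the thread) does not.
in_latin1 = 'Heiko St\xfcbner'
not_in_latin1 = '\u0141ukasz'

results = {
    name: {codec: can_encode(name, codec) for codec in ('ascii', 'latin-1', 'utf-8')}
    for name in (in_latin1, not_in_latin1)
}

# errors='replace' avoids the exception but silently loses the character.
lossy = not_in_latin1.encode('latin-1', errors='replace')
```

Under these assumptions, latin-1 handles the name from the report but not every name a maintainer list might contain, while UTF-8 covers both.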

Hi Tom,

On 25 April 2017 at 11:12, Tom Rini trini@konsulko.com wrote:
> On Sat, Apr 22, 2017 at 05:53:36PM -0600, Simon Glass wrote:
> > +Tom
> >
> > On 19 April 2017 at 07:24, Philipp Tomsich philipp.tomsich@theobroma-systems.com wrote:
> > > This change encodes the CC list to UTF-8 to avoid failures on maintainer-addresses that include non-ASCII characters (observed on Debian 7.11 with Python 2.7.3).
> > > [...]
> >
> > Reviewed-by: Simon Glass sjg@chromium.org
>
> Please put this in a PR for me, along with any other critical fixes to the various python tools we have, thanks!
>
> And also, do we need to perhaps whack something at a higher level, and more consistently, about unicode? This is, I gather, doing UTF-8 right. In buildman we have a few patches to just translate to latin-1 instead. We should do the same thing I think, and perhaps there's a higher level up in the code where we need to do it too? I don't know..

Actually I don't think we are quite there yet. This really needs a test with all the different places strings can come from, to make sure patman does the right thing.

Regards,
Simon
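
A test of the kind asked for above might look roughly like the following sketch. It is not patman's test suite: the helper `to_text`, the class `CcSourceTest`, and the list of input sources are all assumptions made for illustration, and the code uses Python 3 `unittest`.

```python
import unittest


def to_text(value, encoding='utf-8'):
    """Hypothetical helper: return value as text, decoding bytes if needed."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


class CcSourceTest(unittest.TestCase):
    # Each entry stands in for one place a Cc string might come from.
    # The source labels are assumptions for illustration only.
    SAMPLES = [
        ('text with a non-ASCII character', '"Heiko St\xfcbner" heiko@sntech.de'),
        ('UTF-8 bytes, e.g. from a subprocess', b'"Heiko St\xc3\xbcbner" heiko@sntech.de'),
        ('plain ASCII text', 'sjg@chromium.org'),
        ('plain ASCII bytes', b'trini@konsulko.com'),
    ]

    def test_every_source_becomes_text(self):
        for label, raw in self.SAMPLES:
            with self.subTest(source=label):
                self.assertIsInstance(to_text(raw), str)

    def test_result_can_be_written_as_utf8(self):
        for label, raw in self.SAMPLES:
            with self.subTest(source=label):
                # Must not raise, whatever form the input had.
                to_text(raw).encode('utf-8')

    def test_text_and_bytes_forms_agree(self):
        self.assertEqual(to_text(self.SAMPLES[0][1]), to_text(self.SAMPLES[1][1]))


def run_tests():
    """Run the sketch's tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CcSourceTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` runs the three test methods. Extending `SAMPLES` with one entry per real input path would be the natural next step.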

Hi Simon,

On 25 Apr 2017, at 22:31, Simon Glass sjg@chromium.org wrote:
> Hi Tom,
>
> On 25 April 2017 at 11:12, Tom Rini trini@konsulko.com wrote:
> > On Sat, Apr 22, 2017 at 05:53:36PM -0600, Simon Glass wrote:
> > > [...]
> > > Reviewed-by: Simon Glass sjg@chromium.org
> >
> > Please put this in a PR for me, along with any other critical fixes to the various python tools we have, thanks!
> >
> > And also, do we need to perhaps whack something at a higher level, and more consistently, about unicode? This is, I gather, doing UTF-8 right. In buildman we have a few patches to just translate to latin-1 instead. We should do the same thing I think, and perhaps there's a higher level up in the code where we need to do it too? I don't know..
>
> Actually I don't think we are quite there yet. This really needs a test with all the different places strings can come from, to make sure patman does the right thing.

On the topic of ‘different places strings can come from’, here’s another change from my WIP tree that fixes some other UTF-8 issues in patman and may point you towards another trouble spot:

@@ -229,14 +229,16 @@ class Series(dict):
                                                raise_on_error=raise_on_error)
             if add_maintainers:
                 list += get_maintainer.GetMaintainer(commit.patch)
+            list = [s.encode('utf-8') for s in list]
             all_ccs += list
-            print(commit.patch, ', '.join(set(list)).encode('utf-8'), file=fd)
+            print(commit.patch, ', '.join(set(list)), file=fd)
             self._generated_cc[commit.patch] = list
 
         if cover_fname:
             cover_cc = gitutil.BuildEmailList(self.get('cover_cc', ''))
-            cc_list = ', '.join([x.decode('utf-8') for x in set(cover_cc + all_ccs)])
-            print(cover_fname, cc_list.encode('utf-8'), file=fd)
+            cover_cc = [s.encode('utf-8') for s in cover_cc]
+            cc_list = ', '.join([x for x in set(cover_cc + all_ccs)])
+            print(cover_fname, cc_list, file=fd)
 
         fd.close()
         return fname

Regards,
Philipp.
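
The idea in the WIP diff above, encoding each entry before joining rather than encoding the joined string, can be sketched on its own. This is an illustration in Python 3, not the patman code: `make_cc_line` is a hypothetical helper, and the sorting step is added here only to make the output deterministic (the diff above joins a plain `set`).

```python
def make_cc_line(patch_name, cc_list, encoding='utf-8'):
    """Hypothetical helper: build one 'patch cc1, cc2' line as bytes."""
    encoded = []
    for item in cc_list:
        # Entries may arrive as text or as already-encoded bytes.
        if isinstance(item, str):
            item = item.encode(encoding)
        encoded.append(item)
    # De-duplicate after encoding so text and bytes forms of the same
    # address collapse into one entry; sort for stable output.
    unique = sorted(set(encoded))
    return patch_name.encode(encoding) + b' ' + b', '.join(unique)


line = make_cc_line(
    '0001-example.patch',  # hypothetical patch file name
    [
        '"Heiko St\xfcbner" heiko@sntech.de',
        b'"Heiko St\xc3\xbcbner" heiko@sntech.de',
        'sjg@chromium.org',
    ],
)
```

Normalising every entry to one type before the join avoids mixing encoded and unencoded strings, which is the trouble spot the diff points at.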