
On Tue, Jul 12, 2022 at 04:58:46AM -0600, Simon Glass wrote:
On Thu, 7 Jul 2022 at 13:22, Tom Rini <trini@konsulko.com> wrote:
Given the sometimes oddly formatted data that can come through when removing code, we need to be as flexible as possible when handling it. Set our encoding to unicode_escape and if we still run into a problem, it's likely going to be OK to ignore it.
Signed-off-by: Tom Rini <trini@konsulko.com>
I've emailed this to Jonathan Corbet as well, as he's the upstream for the project, and this does work for me. But I'm not a Python guru by any means. Without it, trying to run the stats for v2022.04..v2022.07-rc6 blows up in places.
 logparser.py | 1 +
 1 file changed, 1 insertion(+)
Reviewed-by: Simon Glass <sjg@chromium.org>
BTW I have found that using binary is helpful in many places, then convert to UTF-8 when displaying things.
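A minimal sketch of that binary-first approach, for illustration only: the helper names `read_lines_binary` and `display` are hypothetical and not part of gitdm. Lines are kept as bytes while being processed and decoded as UTF-8 only at display time, with undecodable bytes replaced rather than raising.

```python
import io


def read_lines_binary(stream):
    """Yield raw lines (bytes) from a binary stream without decoding them."""
    for raw in stream:
        yield raw


def display(raw):
    """Decode only for display; invalid bytes become U+FFFD instead of raising."""
    return raw.decode('utf-8', errors='replace').rstrip('\n')


# Stand-in for binary input such as sys.stdin.buffer: one valid UTF-8 line
# and one line containing a byte (0xff) that is never valid in UTF-8.
sample = io.BytesIO(b'Author: J\xc3\xb6rg\n-\tbad \xff byte\n')
shown = [display(line) for line in read_lines_binary(sample)]
print(shown)
```

In a real script the same loop could run over `sys.stdin.buffer`, so a stray byte in a removed file never aborts the parse.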
diff --git a/logparser.py b/logparser.py
index efbc72f868eb..d5906e97689d 100644
--- a/logparser.py
+++ b/logparser.py
@@ -37,6 +37,7 @@ class LogPatchSplitter:
         self.fd = fd
         self.buffer = None
         self.patch = []
+        sys.stdin.reconfigure(encoding='unicode_escape', errors='ignore')
 
     def __iter__(self):
         return self
So, I followed up with Jonathan, but hadn't yet done so on the list. unicode_escape works, but then the results don't read right. It turned out utf-8 was the right encoding, but the first time I tried testing it, I had some other problem locally.
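To illustrate why the results "don't read right", here is a small standalone sketch (not taken from gitdm, and not the final patch, which isn't shown in this thread). The unicode_escape codec treats each non-escape byte as a Latin-1 character, so UTF-8 multi-byte sequences come out as mojibake, while utf-8 with errors='ignore' decodes valid text correctly and just drops invalid bytes.

```python
# A name containing a non-ASCII character, encoded the way git log emits it.
raw = 'J\u00f6rg'.encode('utf-8')  # b'J\xc3\xb6rg'

# unicode_escape maps each byte to the code point of the same value,
# so the two UTF-8 bytes of the umlaut become two separate characters.
as_escape = raw.decode('unicode_escape', errors='ignore')

# utf-8 reassembles the two bytes into the single intended character.
as_utf8 = raw.decode('utf-8', errors='ignore')

# With errors='ignore', a byte that is invalid in UTF-8 is silently dropped.
cleaned = b'bad \xff byte'.decode('utf-8', errors='ignore')

print(repr(as_escape), repr(as_utf8), repr(cleaned))
```

Presumably the equivalent change in the patch would be passing encoding='utf-8' to the same sys.stdin.reconfigure() call, but that is an assumption based on the message above.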