[PATCH v1 0/4] Port gen_compile_commands.py from Linux to U-Boot

Hello U-Boot community,
I'm submitting a patch series that ports the gen_compile_commands.py script from the Linux kernel's sources to U-Boot. This script, originally located in scripts/clang-tools/gen_compile_commands.py, enables the generation of compile_commands.json files for improved code navigation and analysis. The series consists of four patches: the initial script import and the necessary modifications for U-Boot compatibility.
Your feedback on these contributions would be greatly appreciated.
Best regards,
Joao Marcos Costa (4):
  scripts: Port Linux's gen_compile_commands.py to U-Boot
  scripts/gen_compile_commands.py: adapt _LINE_PATTERN
  scripts/gen_compile_commands.py: fix docstring
  scripts/gen_compile_commands.py: add acknowledgments

 scripts/gen_compile_commands.py | 229 ++++++++++++++++++++++++++++++++
 1 file changed, 229 insertions(+)
 create mode 100755 scripts/gen_compile_commands.py

This script generates a database of compiler flags, namely compile_commands.json. It is quite useful for text editors that use clangd LSP (e.g. Vim, Neovim).
It was ported from Linux's sources:
- tag: v6.4
- revision 6995e2de6891c724bfeb2db33d7b87775f913ad1
Modifications for U-Boot compatibility will be added in a follow-up commit.
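
For readers unfamiliar with the format, here is a minimal sketch of the kind of entry the script appends to compile_commands.json. The three keys mirror what process_line() returns in the patch; the helper name make_entry, the paths, and the compiler flags are made-up placeholders, not output from a real U-Boot build, and the file-existence check is left out for brevity.

```python
import json
import os


def make_entry(root_directory, command_prefix, file_path):
    """Build one compilation-database entry (simplified: no existence check)."""
    abs_path = os.path.abspath(os.path.join(root_directory, file_path))
    return {
        'directory': root_directory,
        'file': abs_path,
        'command': command_prefix + file_path,
    }


# Hypothetical values, for illustration only.
entry = make_entry('/work/u-boot',
                   'gcc -Iinclude -c -o common/main.o ',
                   'common/main.c')

# The script dumps a list of such entries with the same json.dump options.
print(json.dumps([entry], indent=2, sort_keys=True))
```

clangd reads this list to learn which flags to use for each source file.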
Signed-off-by: Joao Marcos Costa <jmcosta944@gmail.com>
---
 scripts/gen_compile_commands.py | 228 ++++++++++++++++++++++++++++++++
 1 file changed, 228 insertions(+)
 create mode 100755 scripts/gen_compile_commands.py
diff --git a/scripts/gen_compile_commands.py b/scripts/gen_compile_commands.py
new file mode 100755
index 0000000000..15ba56527a
--- /dev/null
+++ b/scripts/gen_compile_commands.py
@@ -0,0 +1,228 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) Google LLC, 2018
+#
+# Author: Tom Roeder <tmroeder@google.com>
+#
+"""A tool for generating compile_commands.json in the Linux kernel."""
+
+import argparse
+import json
+import logging
+import os
+import re
+import subprocess
+import sys
+
+_DEFAULT_OUTPUT = 'compile_commands.json'
+_DEFAULT_LOG_LEVEL = 'WARNING'
+
+_FILENAME_PATTERN = r'^\..*\.cmd$'
+_LINE_PATTERN = r'^savedcmd_[^ ]*\.o := (.* )([^ ]*\.c) *(;|$)'
+_VALID_LOG_LEVELS = ['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL']
+# The tools/ directory adopts a different build system, and produces .cmd
+# files in a different format. Do not support it.
+_EXCLUDE_DIRS = ['.git', 'Documentation', 'include', 'tools']
+
+def parse_arguments():
+    """Sets up and parses command-line arguments.
+
+    Returns:
+        log_level: A logging level to filter log output.
+        directory: The work directory where the objects were built.
+        ar: Command used for parsing .a archives.
+        output: Where to write the compile-commands JSON file.
+        paths: The list of files/directories to handle to find .cmd files.
+    """
+    usage = 'Creates a compile_commands.json database from kernel .cmd files'
+    parser = argparse.ArgumentParser(description=usage)
+
+    directory_help = ('specify the output directory used for the kernel build '
+                      '(defaults to the working directory)')
+    parser.add_argument('-d', '--directory', type=str, default='.',
+                        help=directory_help)
+
+    output_help = ('path to the output command database (defaults to ' +
+                   _DEFAULT_OUTPUT + ')')
+    parser.add_argument('-o', '--output', type=str, default=_DEFAULT_OUTPUT,
+                        help=output_help)
+
+    log_level_help = ('the level of log messages to produce (defaults to ' +
+                      _DEFAULT_LOG_LEVEL + ')')
+    parser.add_argument('--log_level', choices=_VALID_LOG_LEVELS,
+                        default=_DEFAULT_LOG_LEVEL, help=log_level_help)
+
+    ar_help = 'command used for parsing .a archives'
+    parser.add_argument('-a', '--ar', type=str, default='llvm-ar', help=ar_help)
+
+    paths_help = ('directories to search or files to parse '
+                  '(files should be *.o, *.a, or modules.order). '
+                  'If nothing is specified, the current directory is searched')
+    parser.add_argument('paths', type=str, nargs='*', help=paths_help)
+
+    args = parser.parse_args()
+
+    return (args.log_level,
+            os.path.abspath(args.directory),
+            args.output,
+            args.ar,
+            args.paths if len(args.paths) > 0 else [args.directory])
+
+
+def cmdfiles_in_dir(directory):
+    """Generate the iterator of .cmd files found under the directory.
+
+    Walk under the given directory, and yield every .cmd file found.
+
+    Args:
+        directory: The directory to search for .cmd files.
+
+    Yields:
+        The path to a .cmd file.
+    """
+
+    filename_matcher = re.compile(_FILENAME_PATTERN)
+    exclude_dirs = [ os.path.join(directory, d) for d in _EXCLUDE_DIRS ]
+
+    for dirpath, dirnames, filenames in os.walk(directory, topdown=True):
+        # Prune unwanted directories.
+        if dirpath in exclude_dirs:
+            dirnames[:] = []
+            continue
+
+        for filename in filenames:
+            if filename_matcher.match(filename):
+                yield os.path.join(dirpath, filename)
+
+
+def to_cmdfile(path):
+    """Return the path of .cmd file used for the given build artifact
+
+    Args:
+        Path: file path
+
+    Returns:
+        The path to .cmd file
+    """
+    dir, base = os.path.split(path)
+    return os.path.join(dir, '.' + base + '.cmd')
+
+
+def cmdfiles_for_a(archive, ar):
+    """Generate the iterator of .cmd files associated with the archive.
+
+    Parse the given archive, and yield every .cmd file used to build it.
+
+    Args:
+        archive: The archive to parse
+
+    Yields:
+        The path to every .cmd file found
+    """
+    for obj in subprocess.check_output([ar, '-t', archive]).decode().split():
+        yield to_cmdfile(obj)
+
+
+def cmdfiles_for_modorder(modorder):
+    """Generate the iterator of .cmd files associated with the modules.order.
+
+    Parse the given modules.order, and yield every .cmd file used to build the
+    contained modules.
+
+    Args:
+        modorder: The modules.order file to parse
+
+    Yields:
+        The path to every .cmd file found
+    """
+    with open(modorder) as f:
+        for line in f:
+            obj = line.rstrip()
+            base, ext = os.path.splitext(obj)
+            if ext != '.o':
+                sys.exit('{}: module path must end with .o'.format(obj))
+            mod = base + '.mod'
+            # Read from *.mod, to get a list of objects that compose the module.
+            with open(mod) as m:
+                for mod_line in m:
+                    yield to_cmdfile(mod_line.rstrip())
+
+
+def process_line(root_directory, command_prefix, file_path):
+    """Extracts information from a .cmd line and creates an entry from it.
+
+    Args:
+        root_directory: The directory that was searched for .cmd files. Usually
+            used directly in the "directory" entry in compile_commands.json.
+        command_prefix: The extracted command line, up to the last element.
+        file_path: The .c file from the end of the extracted command.
+            Usually relative to root_directory, but sometimes absolute.
+
+    Returns:
+        An entry to append to compile_commands.
+
+    Raises:
+        ValueError: Could not find the extracted file based on file_path and
+            root_directory or file_directory.
+    """
+    # The .cmd files are intended to be included directly by Make, so they
+    # escape the pound sign '#', either as '\#' or '$(pound)' (depending on the
+    # kernel version). The compile_commands.json file is not interepreted
+    # by Make, so this code replaces the escaped version with '#'.
+    prefix = command_prefix.replace('\#', '#').replace('$(pound)', '#')
+
+    # Use os.path.abspath() to normalize the path resolving '.' and '..' .
+    abs_path = os.path.abspath(os.path.join(root_directory, file_path))
+    if not os.path.exists(abs_path):
+        raise ValueError('File %s not found' % abs_path)
+    return {
+        'directory': root_directory,
+        'file': abs_path,
+        'command': prefix + file_path,
+    }
+
+
+def main():
+    """Walks through the directory and finds and parses .cmd files."""
+    log_level, directory, output, ar, paths = parse_arguments()
+
+    level = getattr(logging, log_level)
+    logging.basicConfig(format='%(levelname)s: %(message)s', level=level)
+
+    line_matcher = re.compile(_LINE_PATTERN)
+
+    compile_commands = []
+
+    for path in paths:
+        # If 'path' is a directory, handle all .cmd files under it.
+        # Otherwise, handle .cmd files associated with the file.
+        # built-in objects are linked via vmlinux.a
+        # Modules are listed in modules.order.
+        if os.path.isdir(path):
+            cmdfiles = cmdfiles_in_dir(path)
+        elif path.endswith('.a'):
+            cmdfiles = cmdfiles_for_a(path, ar)
+        elif path.endswith('modules.order'):
+            cmdfiles = cmdfiles_for_modorder(path)
+        else:
+            sys.exit('{}: unknown file type'.format(path))
+
+        for cmdfile in cmdfiles:
+            with open(cmdfile, 'rt') as f:
+                result = line_matcher.match(f.readline())
+                if result:
+                    try:
+                        entry = process_line(directory, result.group(1),
+                                             result.group(2))
+                        compile_commands.append(entry)
+                    except ValueError as err:
+                        logging.info('Could not add line from %s: %s',
+                                     cmdfile, err)
+
+    with open(output, 'wt') as f:
+        json.dump(compile_commands, f, indent=2, sort_keys=True)
+
+
+if __name__ == '__main__':
+    main()
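
Two small helpers in the script above are easy to check in isolation: the mapping from a build artifact to its .cmd file, and the un-escaping of Make's pound sign. The sketch below reproduces both, assuming the escapes are '\#' and '$(pound)' as in the upstream Linux script. The function name unescape_pound and the sample path and flags are invented for illustration.

```python
import os


def to_cmdfile(path):
    """Return the hidden .cmd file that sits next to a build artifact."""
    directory, base = os.path.split(path)
    return os.path.join(directory, '.' + base + '.cmd')


def unescape_pound(command_prefix):
    """Replace Make-escaped pound signs with a literal '#'."""
    return command_prefix.replace(r'\#', '#').replace('$(pound)', '#')


# Hypothetical inputs, not taken from a real build tree.
print(to_cmdfile('drivers/serial/ns16550.o'))
print(unescape_pound(r'gcc -DTAG=\#1 -DALT=$(pound)2 '))
```

The raw-string literal r'\#' avoids relying on Python passing an unrecognized escape sequence through unchanged.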

In U-Boot, the regular expression defined by _LINE_PATTERN needs to be adapted: replace the 'savedcmd' prefix with 'cmd'.
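
As a rough illustration of why the prefix matters, the sketch below runs both patterns against a made-up first line of a .cmd file written in the cmd_ style this patch targets. The sample line is hypothetical (real lines carry many more flags), and the backslash-escaped dots follow the upstream Linux script.

```python
import re

# Upstream Linux v6.4 pattern and the U-Boot variant from this patch.
LINUX_PATTERN = re.compile(r'^savedcmd_[^ ]*\.o := (.* )([^ ]*\.c) *(;|$)')
UBOOT_PATTERN = re.compile(r'^cmd_[^ ]*\.o := (.* )([^ ]*\.c) *(;|$)')

# Hypothetical first line of a .cmd file.
sample = 'cmd_common/main.o := gcc -Iinclude -c -o common/main.o common/main.c\n'

linux_match = LINUX_PATTERN.match(sample)
uboot_match = UBOOT_PATTERN.match(sample)

print('Linux pattern matches:', linux_match is not None)
print('U-Boot pattern matches:', uboot_match is not None)
if uboot_match:
    print('command prefix:', repr(uboot_match.group(1)))
    print('source file:', uboot_match.group(2))
```

Group 1 is the command up to and including the last space, and group 2 is the trailing .c file, which is how main() feeds process_line().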
Signed-off-by: Joao Marcos Costa <jmcosta944@gmail.com>
---
 scripts/gen_compile_commands.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/gen_compile_commands.py b/scripts/gen_compile_commands.py
index 15ba56527a..0227522959 100755
--- a/scripts/gen_compile_commands.py
+++ b/scripts/gen_compile_commands.py
@@ -19,7 +19,7 @@ _DEFAULT_OUTPUT = 'compile_commands.json'
 _DEFAULT_LOG_LEVEL = 'WARNING'
 
 _FILENAME_PATTERN = r'^\..*\.cmd$'
-_LINE_PATTERN = r'^savedcmd_[^ ]*\.o := (.* )([^ ]*\.c) *(;|$)'
+_LINE_PATTERN = r'^cmd_[^ ]*\.o := (.* )([^ ]*\.c) *(;|$)'
 _VALID_LOG_LEVELS = ['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL']
 # The tools/ directory adopts a different build system, and produces .cmd
 # files in a different format. Do not support it.

The tool now lives in U-Boot. Replace "the Linux kernel" with "U-Boot" so that the docstring is accurate.
Signed-off-by: Joao Marcos Costa <jmcosta944@gmail.com>
---
 scripts/gen_compile_commands.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/gen_compile_commands.py b/scripts/gen_compile_commands.py
index 0227522959..63d036a773 100755
--- a/scripts/gen_compile_commands.py
+++ b/scripts/gen_compile_commands.py
@@ -5,7 +5,7 @@
 #
 # Author: Tom Roeder <tmroeder@google.com>
 #
-"""A tool for generating compile_commands.json in the Linux kernel."""
+"""A tool for generating compile_commands.json in U-Boot."""
 
 import argparse
 import json

Add acknowledgments for porting and modifying the script. Of course, the license, author, and copyright notice remain the same as in the original script.
Signed-off-by: Joao Marcos Costa <jmcosta944@gmail.com>
---
 scripts/gen_compile_commands.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/gen_compile_commands.py b/scripts/gen_compile_commands.py
index 63d036a773..1a9c49b34a 100755
--- a/scripts/gen_compile_commands.py
+++ b/scripts/gen_compile_commands.py
@@ -4,6 +4,7 @@
 # Copyright (C) Google LLC, 2018
 #
 # Author: Tom Roeder <tmroeder@google.com>
+# Ported and modified for U-Boot by Joao Marcos Costa <jmcosta944@gmail.com>
 #
 """A tool for generating compile_commands.json in U-Boot."""
 

Hi Joao,

On Sun, 20 Aug 2023 at 13:04, Joao Marcos Costa <jmcosta944@gmail.com> wrote:
>
> Hello U-Boot community,
>
> I'm submitting a patch series that ports the gen_compile_commands.py script from the Linux kernel's sources to U-Boot. This script, originally located in scripts/clang-tools/gen_compile_commands.py, enables the generation of compile_commands.json files for improved code navigation and analysis. The series consists of four patches: the initial script import and the necessary modifications for U-Boot compatibility.
>
> Your feedback on these contributions would be greatly appreciated.
>
> Best regards,
>
> Joao Marcos Costa (4):
>   scripts: Port Linux's gen_compile_commands.py to U-Boot
>   scripts/gen_compile_commands.py: adapt _LINE_PATTERN
>   scripts/gen_compile_commands.py: fix docstring
>   scripts/gen_compile_commands.py: add acknowledgments
>
>  scripts/gen_compile_commands.py | 229 ++++++++++++++++++++++++++++++++
>  1 file changed, 229 insertions(+)
>  create mode 100755 scripts/gen_compile_commands.py

Can you also please bring over the documentation for this feature?

Regards,
Simon

Hello Simon,

On Mon, Aug 21, 2023 at 21:13, Simon Glass <sjg@chromium.org> wrote:
>
> Hi Joao,
>
> Can you also please bring over the documentation for this feature?

Actually, I couldn't find any documentation per se (e.g. in linux/Documentation) besides what is already documented in the actual code, as in the help message, docstrings, and comments.
Would you have any suggestions?

On Thu, Aug 31, 2023 at 09:21:19PM +0200, João Marcos Costa wrote:
> Hello Simon,
>
> On Mon, Aug 21, 2023 at 21:13, Simon Glass <sjg@chromium.org> wrote:
> >
> > Hi Joao,
> >
> > Can you also please bring over the documentation for this feature?
>
> Actually, I couldn't find any documentation per se (e.g. in linux/Documentation) besides what is already documented in the actual code, as in the help message, docstrings, and comments.
>
> Would you have any suggestions?

Can you write up a little something? It'd probably be good to have it documented in the kernel as well, so ideally something that can also be contributed upstream (we try to be good neighbors).

Hello Joao,
On Sun, 2023-08-20 at 21:04 +0200, Joao Marcos Costa wrote:
> Hello U-Boot community,
>
> I'm submitting a patch series that ports the gen_compile_commands.py script from the Linux kernel's sources to U-Boot. This script, originally located in scripts/clang-tools/gen_compile_commands.py, enables the generation of compile_commands.json files for improved code navigation and analysis. The series consists of four patches: the initial script import and the necessary modifications for U-Boot compatibility.
>
> Your feedback on these contributions would be greatly appreciated.
>
> Best regards,
>
> Joao Marcos Costa (4):
>   scripts: Port Linux's gen_compile_commands.py to U-Boot
>   scripts/gen_compile_commands.py: adapt _LINE_PATTERN
>   scripts/gen_compile_commands.py: fix docstring
>   scripts/gen_compile_commands.py: add acknowledgments

Can you also add a patch to add the compile_commands.json to .gitignore, please?

Yannic

>
>  scripts/gen_compile_commands.py | 229 ++++++++++++++++++++++++++++++++
>  1 file changed, 229 insertions(+)
>  create mode 100755 scripts/gen_compile_commands.py

Hello Yannic,

On Mon, Aug 28, 2023 at 10:09, Yannic Moog <Y.Moog@phytec.de> wrote:
> Can you also add a patch to add the compile_commands.json to .gitignore, please?
>
> Yannic

Absolutely. I will add such a patch in the v2 series. Thanks for the suggestion.

participants (5)
- Joao Marcos Costa
- João Marcos Costa
- Simon Glass
- Tom Rini
- Yannic Moog