[PATCH v4 0/8] Port gen_compile_commands.py from Linux to U-Boot

Hello U-Boot community,
I'm submitting a patch series that ports the gen_compile_commands.py script from the Linux kernel's sources to U-Boot. This script, originally located in scripts/clang-tools/gen_compile_commands.py, generates a compile_commands.json file for improved code navigation and analysis. The series consists of the initial script import, the necessary modifications for U-Boot compatibility, and finally some documentation.
Your feedback on these contributions would be greatly appreciated.
Best regards,
Changes in v4:
- Replace "Options" section in the doc by "Usage", as it is simply a reference to the script's usage message
- Replace 'kernel' by 'U-Boot' in the usage message
- Replace label by :doc: link
- Instead of adding a chapter about integration with IDEs into doc/build/tools.rst, add a new file (i.e., doc/develop/ide_integration.rst)
- Replace the doc's heading (gen_compile_commands) by 'Create build database for IDEs'
- Add a section listing some of the compatible IDEs and how to set them up

Changes in v3:
- Add documentation to index and fix syntax issues
- Add reference to documentation in doc/build/tools

Changes in v2:
- Add compile_commands.json to gitignore
- Add documentation
Joao Marcos Costa (8):
  scripts: Port Linux's gen_compile_commands.py to U-Boot
  scripts/gen_compile_commands.py: adapt _LINE_PATTERN
  scripts/gen_compile_commands.py: fix docstring
  scripts/gen_compile_commands.py: add acknowledgments
  .gitignore: add compile_commands.json
  doc: add documentation for gen_compile_commands.py
  doc: add ide_integration.rst to doc/develop
  scripts/gen_compile_commands: fix usage message
 .gitignore                         |   3 +
 doc/build/gen_compile_commands.rst |  84 +++++++++++
 doc/build/index.rst                |   1 +
 doc/develop/ide_integration.rst    |  13 ++
 doc/develop/index.rst              |   1 +
 scripts/gen_compile_commands.py    | 230 +++++++++++++++++++++++++++++
 6 files changed, 332 insertions(+)
 create mode 100644 doc/build/gen_compile_commands.rst
 create mode 100644 doc/develop/ide_integration.rst
 create mode 100755 scripts/gen_compile_commands.py

This script generates a database of compiler flags, namely compile_commands.json. It is quite useful for text editors that use clangd LSP (e.g. Vim, Neovim).
It was ported from Linux's sources:
- tag: v6.4
- revision 6995e2de6891c724bfeb2db33d7b87775f913ad1
Modifications for U-Boot compatibility will be added in a follow-up commit.
Signed-off-by: Joao Marcos Costa <jmcosta944@gmail.com>
---
 scripts/gen_compile_commands.py | 228 ++++++++++++++++++++++++++++++++
 1 file changed, 228 insertions(+)
 create mode 100755 scripts/gen_compile_commands.py
diff --git a/scripts/gen_compile_commands.py b/scripts/gen_compile_commands.py
new file mode 100755
index 0000000000..15ba56527a
--- /dev/null
+++ b/scripts/gen_compile_commands.py
@@ -0,0 +1,228 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) Google LLC, 2018
+#
+# Author: Tom Roeder <tmroeder@google.com>
+#
+"""A tool for generating compile_commands.json in the Linux kernel."""
+
+import argparse
+import json
+import logging
+import os
+import re
+import subprocess
+import sys
+
+_DEFAULT_OUTPUT = 'compile_commands.json'
+_DEFAULT_LOG_LEVEL = 'WARNING'
+
+_FILENAME_PATTERN = r'^\..*\.cmd$'
+_LINE_PATTERN = r'^savedcmd_[^ ]*\.o := (.* )([^ ]*\.c) *(;|$)'
+_VALID_LOG_LEVELS = ['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL']
+# The tools/ directory adopts a different build system, and produces .cmd
+# files in a different format. Do not support it.
+_EXCLUDE_DIRS = ['.git', 'Documentation', 'include', 'tools']
+
+def parse_arguments():
+    """Sets up and parses command-line arguments.
+
+    Returns:
+        log_level: A logging level to filter log output.
+        directory: The work directory where the objects were built.
+        ar: Command used for parsing .a archives.
+        output: Where to write the compile-commands JSON file.
+        paths: The list of files/directories to handle to find .cmd files.
+    """
+    usage = 'Creates a compile_commands.json database from kernel .cmd files'
+    parser = argparse.ArgumentParser(description=usage)
+
+    directory_help = ('specify the output directory used for the kernel build '
+                      '(defaults to the working directory)')
+    parser.add_argument('-d', '--directory', type=str, default='.',
+                        help=directory_help)
+
+    output_help = ('path to the output command database (defaults to ' +
+                   _DEFAULT_OUTPUT + ')')
+    parser.add_argument('-o', '--output', type=str, default=_DEFAULT_OUTPUT,
+                        help=output_help)
+
+    log_level_help = ('the level of log messages to produce (defaults to ' +
+                      _DEFAULT_LOG_LEVEL + ')')
+    parser.add_argument('--log_level', choices=_VALID_LOG_LEVELS,
+                        default=_DEFAULT_LOG_LEVEL, help=log_level_help)
+
+    ar_help = 'command used for parsing .a archives'
+    parser.add_argument('-a', '--ar', type=str, default='llvm-ar', help=ar_help)
+
+    paths_help = ('directories to search or files to parse '
+                  '(files should be *.o, *.a, or modules.order). '
+                  'If nothing is specified, the current directory is searched')
+    parser.add_argument('paths', type=str, nargs='*', help=paths_help)
+
+    args = parser.parse_args()
+
+    return (args.log_level,
+            os.path.abspath(args.directory),
+            args.output,
+            args.ar,
+            args.paths if len(args.paths) > 0 else [args.directory])
+
+
+def cmdfiles_in_dir(directory):
+    """Generate the iterator of .cmd files found under the directory.
+
+    Walk under the given directory, and yield every .cmd file found.
+
+    Args:
+        directory: The directory to search for .cmd files.
+
+    Yields:
+        The path to a .cmd file.
+    """
+
+    filename_matcher = re.compile(_FILENAME_PATTERN)
+    exclude_dirs = [ os.path.join(directory, d) for d in _EXCLUDE_DIRS ]
+
+    for dirpath, dirnames, filenames in os.walk(directory, topdown=True):
+        # Prune unwanted directories.
+        if dirpath in exclude_dirs:
+            dirnames[:] = []
+            continue
+
+        for filename in filenames:
+            if filename_matcher.match(filename):
+                yield os.path.join(dirpath, filename)
+
+
+def to_cmdfile(path):
+    """Return the path of .cmd file used for the given build artifact
+
+    Args:
+        Path: file path
+
+    Returns:
+        The path to .cmd file
+    """
+    dir, base = os.path.split(path)
+    return os.path.join(dir, '.' + base + '.cmd')
+
+
+def cmdfiles_for_a(archive, ar):
+    """Generate the iterator of .cmd files associated with the archive.
+
+    Parse the given archive, and yield every .cmd file used to build it.
+
+    Args:
+        archive: The archive to parse
+
+    Yields:
+        The path to every .cmd file found
+    """
+    for obj in subprocess.check_output([ar, '-t', archive]).decode().split():
+        yield to_cmdfile(obj)
+
+
+def cmdfiles_for_modorder(modorder):
+    """Generate the iterator of .cmd files associated with the modules.order.
+
+    Parse the given modules.order, and yield every .cmd file used to build the
+    contained modules.
+
+    Args:
+        modorder: The modules.order file to parse
+
+    Yields:
+        The path to every .cmd file found
+    """
+    with open(modorder) as f:
+        for line in f:
+            obj = line.rstrip()
+            base, ext = os.path.splitext(obj)
+            if ext != '.o':
+                sys.exit('{}: module path must end with .o'.format(obj))
+            mod = base + '.mod'
+            # Read from *.mod, to get a list of objects that compose the module.
+            with open(mod) as m:
+                for mod_line in m:
+                    yield to_cmdfile(mod_line.rstrip())
+
+
+def process_line(root_directory, command_prefix, file_path):
+    """Extracts information from a .cmd line and creates an entry from it.
+
+    Args:
+        root_directory: The directory that was searched for .cmd files. Usually
+            used directly in the "directory" entry in compile_commands.json.
+        command_prefix: The extracted command line, up to the last element.
+        file_path: The .c file from the end of the extracted command.
+            Usually relative to root_directory, but sometimes absolute.
+
+    Returns:
+        An entry to append to compile_commands.
+
+    Raises:
+        ValueError: Could not find the extracted file based on file_path and
+            root_directory or file_directory.
+    """
+    # The .cmd files are intended to be included directly by Make, so they
+    # escape the pound sign '#', either as '\#' or '$(pound)' (depending on the
+    # kernel version). The compile_commands.json file is not interpreted
+    # by Make, so this code replaces the escaped version with '#'.
+    prefix = command_prefix.replace('\#', '#').replace('$(pound)', '#')
+
+    # Use os.path.abspath() to normalize the path resolving '.' and '..' .
+    abs_path = os.path.abspath(os.path.join(root_directory, file_path))
+    if not os.path.exists(abs_path):
+        raise ValueError('File %s not found' % abs_path)
+    return {
+        'directory': root_directory,
+        'file': abs_path,
+        'command': prefix + file_path,
+    }
+
+
+def main():
+    """Walks through the directory and finds and parses .cmd files."""
+    log_level, directory, output, ar, paths = parse_arguments()
+
+    level = getattr(logging, log_level)
+    logging.basicConfig(format='%(levelname)s: %(message)s', level=level)
+
+    line_matcher = re.compile(_LINE_PATTERN)
+
+    compile_commands = []
+
+    for path in paths:
+        # If 'path' is a directory, handle all .cmd files under it.
+        # Otherwise, handle .cmd files associated with the file.
+        # built-in objects are linked via vmlinux.a
+        # Modules are listed in modules.order.
+        if os.path.isdir(path):
+            cmdfiles = cmdfiles_in_dir(path)
+        elif path.endswith('.a'):
+            cmdfiles = cmdfiles_for_a(path, ar)
+        elif path.endswith('modules.order'):
+            cmdfiles = cmdfiles_for_modorder(path)
+        else:
+            sys.exit('{}: unknown file type'.format(path))
+
+        for cmdfile in cmdfiles:
+            with open(cmdfile, 'rt') as f:
+                result = line_matcher.match(f.readline())
+                if result:
+                    try:
+                        entry = process_line(directory, result.group(1),
+                                             result.group(2))
+                        compile_commands.append(entry)
+                    except ValueError as err:
+                        logging.info('Could not add line from %s: %s',
+                                     cmdfile, err)
+
+    with open(output, 'wt') as f:
+        json.dump(compile_commands, f, indent=2, sort_keys=True)
+
+
+if __name__ == '__main__':
+    main()
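As a reading aid for process_line() in the patch above, here is a minimal sketch of the pound-sign handling it describes. The helper name unescape_pound and the sample command fragments are hypothetical and not part of the series; only the two replace() calls mirror the script.

```python
def unescape_pound(command_prefix):
    """Turn Make-escaped pound signs back into a literal '#'."""
    # .cmd files are meant to be included by Make, so '#' appears escaped,
    # either with a backslash or as $(pound); the JSON database wants '#'.
    return command_prefix.replace('\\#', '#').replace('$(pound)', '#')


# Hypothetical command fragments, one per escaping style.
print(unescape_pound('gcc -DMSG=\\#x '))       # backslash-escaped form
print(unescape_pound('gcc -DMSG=$(pound)x '))  # $(pound) form
```

Both calls should print the same unescaped fragment, which is the form that ends up in the "command" field of each database entry.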

In U-Boot's context, the regular expression defined by _LINE_PATTERN needs to be adapted: U-Boot's .cmd files use the 'cmd_' prefix rather than 'savedcmd_'. Replace 'savedcmd' with 'cmd'.
Signed-off-by: Joao Marcos Costa <jmcosta944@gmail.com>
---
 scripts/gen_compile_commands.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/gen_compile_commands.py b/scripts/gen_compile_commands.py
index 15ba56527a..0227522959 100755
--- a/scripts/gen_compile_commands.py
+++ b/scripts/gen_compile_commands.py
@@ -19,7 +19,7 @@ _DEFAULT_OUTPUT = 'compile_commands.json'
 _DEFAULT_LOG_LEVEL = 'WARNING'

 _FILENAME_PATTERN = r'^\..*\.cmd$'
-_LINE_PATTERN = r'^savedcmd_[^ ]*\.o := (.* )([^ ]*\.c) *(;|$)'
+_LINE_PATTERN = r'^cmd_[^ ]*\.o := (.* )([^ ]*\.c) *(;|$)'
 _VALID_LOG_LEVELS = ['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL']
 # The tools/ directory adopts a different build system, and produces .cmd
 # files in a different format. Do not support it.
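To illustrate the effect of the pattern change in this patch, the sketch below applies the adapted expression to a sample line. The sample is a hypothetical, shortened first line of a .cmd file (real ones carry many more flags), and the regex is written with the backslash escapes used in the script.

```python
import re

# Pattern as adapted for U-Boot in this patch ('cmd_' instead of 'savedcmd_').
_LINE_PATTERN = r'^cmd_[^ ]*\.o := (.* )([^ ]*\.c) *(;|$)'

# Hypothetical first line of a .cmd file, shortened for illustration.
sample = ('cmd_arch/x86/cpu/lapic.o := gcc -nostdinc -c '
          '-o arch/x86/cpu/lapic.o arch/x86/cpu/lapic.c')

match = re.match(_LINE_PATTERN, sample)
command_prefix = match.group(1)  # everything up to the source file
file_path = match.group(2)       # the trailing .c file

print(command_prefix)
print(file_path)

# A line carrying the Linux-style 'savedcmd_' prefix is not matched.
linux_style = re.match(_LINE_PATTERN, 'saved' + sample)
print(linux_style)
```

The two groups correspond to the command_prefix and file_path arguments that main() passes to process_line().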

The tool now lives in U-Boot. Replace "the Linux kernel" with "U-Boot" to keep the docstring coherent.
Signed-off-by: Joao Marcos Costa <jmcosta944@gmail.com>
---
 scripts/gen_compile_commands.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/gen_compile_commands.py b/scripts/gen_compile_commands.py
index 0227522959..63d036a773 100755
--- a/scripts/gen_compile_commands.py
+++ b/scripts/gen_compile_commands.py
@@ -5,7 +5,7 @@
 #
 # Author: Tom Roeder <tmroeder@google.com>
 #
-"""A tool for generating compile_commands.json in the Linux kernel."""
+"""A tool for generating compile_commands.json in U-Boot."""

 import argparse
 import json

Add acknowledgments for porting and modifying the script. Of course, the license, author, and copyright notice remain the same as in the original script.
Signed-off-by: Joao Marcos Costa <jmcosta944@gmail.com>
---
 scripts/gen_compile_commands.py | 1 +
 1 file changed, 1 insertion(+)
diff --git a/scripts/gen_compile_commands.py b/scripts/gen_compile_commands.py
index 63d036a773..1a9c49b34a 100755
--- a/scripts/gen_compile_commands.py
+++ b/scripts/gen_compile_commands.py
@@ -4,6 +4,7 @@
 # Copyright (C) Google LLC, 2018
 #
 # Author: Tom Roeder <tmroeder@google.com>
+# Ported and modified for U-Boot by Joao Marcos Costa <jmcosta944@gmail.com>
 #
 """A tool for generating compile_commands.json in U-Boot."""

Add Clang's compilation database file (i.e. compile_commands.json) to .gitignore, at the root of the repository.
Signed-off-by: Joao Marcos Costa <jmcosta944@gmail.com>
---
 .gitignore | 3 +++
 1 file changed, 3 insertions(+)
diff --git a/.gitignore b/.gitignore
index 002f95de4f..261a1d6754 100644
--- a/.gitignore
+++ b/.gitignore
@@ -109,3 +109,6 @@ __pycache__

 # moveconfig database
 /moveconfig.db
+
+# Clang's compilation database file
+/compile_commands.json

This documentation briefly explains what a compilation database is, and how to use the script to generate one.
This is not a port, as there was no original documentation in the Linux sources.
Reference the documentation in the script's header and in the doc/build index.
Signed-off-by: Joao Marcos Costa <jmcosta944@gmail.com>
---
 doc/build/gen_compile_commands.rst | 83 ++++++++++++++++++++++++++++++
 doc/build/index.rst                |  1 +
 scripts/gen_compile_commands.py    |  1 +
 3 files changed, 85 insertions(+)
 create mode 100644 doc/build/gen_compile_commands.rst
diff --git a/doc/build/gen_compile_commands.rst b/doc/build/gen_compile_commands.rst
new file mode 100644
index 0000000000..50305cec4a
--- /dev/null
+++ b/doc/build/gen_compile_commands.rst
@@ -0,0 +1,83 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+Create build database for IDEs
+==============================
+
+gen_compile_commands (scripts/gen_compile_commands.py) is a script used to
+generate a compilation database (compile_commands.json). This database consists
+of an array of "command objects" describing how each translation unit was
+compiled.
+
+Example::
+
+    {
+        "command": "gcc -Wp,-MD,arch/x86/cpu/.lapic.o.d -nostdinc -isystem (...)",
+        "directory": "/home/jmcosta/u-boot",
+        "file": "/home/jmcosta/u-boot/arch/x86/cpu/lapic.c"
+    }
+
+Such information comes from parsing the respective .cmd file of each translation
+unit. In the previous example, that would be `arch/x86/cpu/.lapic.o.cmd`.
+
+For more details on the database format, please refer to the official
+documentation at https://clang.llvm.org/docs/JSONCompilationDatabase.html.
+
+The compilation database is quite useful for text editors (and IDEs) that use
+Clangd LSP. It allows jumping to definitions and declarations. Since it relies
+on parsing .cmd files, one needs to have a target (e.g. configs/xxx_defconfig)
+built before running the script.
+
+Example::
+
+    make sandbox_defconfig
+    make
+    ./scripts/gen_compile_commands.py
+
+Beware that depending on the changes you made to the project's source code, you
+may need to run the script again (presuming you recompiled your target, of
+course) to have an up-to-date database.
+
+The database will be in the root of the repository. No further modifications are
+needed for it to be usable by the LSP, unless you set a name for the database
+other than its default one (compile_commands.json).
+
+Compatible IDEs
+===============
+
+Several popular integrated development environments (IDEs) support the use
+of JSON compilation databases for C/C++ development, making it easier to
+manage build configurations and code analysis. Some of these IDEs include:
+
+1. **Visual Studio Code (VS Code)**: IntelliSense in VS Code can be set up to
+   use compile_commands.json by following the instructions in
+   https://code.visualstudio.com/docs/cpp/faq-cpp#_how-do-i-get-intellisense-to....
+
+2. **CLion**: JetBrains' CLion IDE supports JSON compilation databases out
+   of the box. You can configure your project to use a compile_commands.json
+   file via the project settings. Details on setting up CLion with JSON
+   compilation databases can be found at
+   https://www.jetbrains.com/help/clion/compilation-database.html.
+
+3. **Qt Creator**: Qt Creator, a popular IDE for Qt development, also
+   supports compile_commands.json for C/C++ projects. Instructions on how to
+   use this feature can be found at
+   https://doc.qt.io/qtcreator/creator-clang-codemodel.html#using-compilation-d....
+
+4. **Eclipse CDT**: Eclipse's C/C++ Development Tools (CDT) can be
+   configured to use JSON compilation databases for better project management.
+   You can find guidance on setting up JSON compilation database support at the
+   wiki: https://wiki.eclipse.org/CDT/User/NewIn910#Build.
+
+For Vim, Neovim, and Emacs, if you are using Clangd as your LSP, placing the
+compile_commands.json in the root of the repository should suffice to enable
+code navigation.
+
+Usage
+=====
+
+For further details on the script's options, please refer to its help message,
+as in the example below.
+
+Help::
+
+    ./scripts/gen_compile_commands.py --help
diff --git a/doc/build/index.rst b/doc/build/index.rst
index 64e66491bd..7a4507b574 100644
--- a/doc/build/index.rst
+++ b/doc/build/index.rst
@@ -14,3 +14,4 @@ Build U-Boot
    tools
    buildman
    documentation
+   gen_compile_commands
diff --git a/scripts/gen_compile_commands.py b/scripts/gen_compile_commands.py
index 1a9c49b34a..aa52e88e18 100755
--- a/scripts/gen_compile_commands.py
+++ b/scripts/gen_compile_commands.py
@@ -5,6 +5,7 @@
 #
 # Author: Tom Roeder <tmroeder@google.com>
 # Ported and modified for U-Boot by Joao Marcos Costa <jmcosta944@gmail.com>
+# Briefly documented at doc/build/gen_compile_commands.rst
 #
 """A tool for generating compile_commands.json in U-Boot."""
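For readers who want to inspect the generated database by hand, the following is a minimal sketch of reading it from Python. It assumes the array-of-objects layout described in the documentation above; the helper index_by_file and the sample entry (with a shortened command) are hypothetical and only for illustration. In practice the text would come from reading compile_commands.json at the root of the repository.

```python
import json

# Hypothetical database content, modeled on the example entry in the doc.
database_text = json.dumps([
    {
        "command": "gcc -nostdinc -c -o arch/x86/cpu/lapic.o arch/x86/cpu/lapic.c",
        "directory": "/home/jmcosta/u-boot",
        "file": "/home/jmcosta/u-boot/arch/x86/cpu/lapic.c",
    },
])


def index_by_file(text):
    """Map each translation unit's absolute path to its compile command."""
    return {entry["file"]: entry["command"] for entry in json.loads(text)}


commands = index_by_file(database_text)
print(commands["/home/jmcosta/u-boot/arch/x86/cpu/lapic.c"])
```

Clangd performs an equivalent lookup on its own, so this is only useful for checking that a given source file made it into the database.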

Add 'Integration with IDEs' chapter.
For now, this chapter is mostly a reference to the documentation of gen_compile_commands, in doc/build, but it can be used in the future as a guide for other IDE-friendly features.
Signed-off-by: Joao Marcos Costa <jmcosta944@gmail.com>
---
 doc/develop/ide_integration.rst | 12 ++++++++++++
 doc/develop/index.rst           |  1 +
 2 files changed, 13 insertions(+)
 create mode 100644 doc/develop/ide_integration.rst
diff --git a/doc/develop/ide_integration.rst b/doc/develop/ide_integration.rst
new file mode 100644
index 0000000000..455e09959c
--- /dev/null
+++ b/doc/develop/ide_integration.rst
@@ -0,0 +1,12 @@
+Integration with IDEs
+=====================
+
+IDEs and text editors (e.g., VSCode, Emacs, Vim, Neovim) typically offer
+plugins to enhance the development experience, such as Clangd LSP. These
+plugins provide features like code navigation (i.e., jumping to definitions
+and declarations), code completion, and code formatting.
+
+U-Boot provides a script (i.e., scripts/gen_compile_commands.py) that
+generates a compilation database to be utilized by Clangd LSP for code
+navigation. For detailed usage instructions, please refer to the script's
+documentation: :doc:`../build/gen_compile_commands`.
diff --git a/doc/develop/index.rst b/doc/develop/index.rst
index 5b230d0321..272bdef84d 100644
--- a/doc/develop/index.rst
+++ b/doc/develop/index.rst
@@ -19,6 +19,7 @@ General
    security
    sending_patches
    system_configuration
+   ide_integration

 Implementation
 --------------

Replace mentions of 'kernel' with 'U-Boot' in the usage message to avoid confusion.
Signed-off-by: Joao Marcos Costa <jmcosta944@gmail.com>
---
 scripts/gen_compile_commands.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/scripts/gen_compile_commands.py b/scripts/gen_compile_commands.py
index aa52e88e18..cdca85e6b0 100755
--- a/scripts/gen_compile_commands.py
+++ b/scripts/gen_compile_commands.py
@@ -37,10 +37,10 @@ def parse_arguments():
         output: Where to write the compile-commands JSON file.
         paths: The list of files/directories to handle to find .cmd files.
     """
-    usage = 'Creates a compile_commands.json database from kernel .cmd files'
+    usage = 'Creates a compile_commands.json database from U-Boot .cmd files'
     parser = argparse.ArgumentParser(description=usage)

-    directory_help = ('specify the output directory used for the kernel build '
+    directory_help = ('specify the output directory used for the U-Boot build '
                       '(defaults to the working directory)')
     parser.add_argument('-d', '--directory', type=str, default='.',
                         help=directory_help)

> Hello U-Boot community,
> I'm submitting a patch series that ports the gen_compile_commands.py script from the Linux kernel's sources to U-Boot. This script, originally located in scripts/clang-tools/gen_compile_commands.py, enables the generation of compile_commands.json file for improved code navigation and analysis. The series consists of the initial script import, the necessary modifications for U-Boot compatibility, and finally some documentation.

Tested-by: Joao Paulo Goncalves <joao.goncalves@toradex.com>

On Sun, Oct 01, 2023 at 12:00:28PM +0200, Joao Marcos Costa wrote:
> Hello U-Boot community,
> I'm submitting a patch series that ports the gen_compile_commands.py script from the Linux kernel's sources to U-Boot. This script, originally located in scripts/clang-tools/gen_compile_commands.py, enables the generation of compile_commands.json file for improved code navigation and analysis. The series consists of the initial script import, the necessary modifications for U-Boot compatibility, and finally some documentation.
> Your feedback on these contributions would be greatly appreciated.
For the series, applied to u-boot/master, thanks!
participants (3): Joao Marcos Costa, Joao Paulo Goncalves, Tom Rini