[RFC] Enhancing Debuginfod Cache Location and Naming
Objectives
- Minimize the number of artifacts downloaded.
- Ensure text editors and IDEs can access the downloaded source files.
- Facilitate the sharing of debug information.
Summary
Currently, tools like llvm-debuginfod-find store cache artifacts in ${cache_folder}/llvm-debuginfod/client, so a lookup only ever searches that folder. The problem is that this location and its file naming do not match the layout the elfutils debuginfod client uses to store artifacts.
Each objective corresponds to a specific scenario.
Scenario 1
- A program crashes on the desktop, prompting gnome-abrt or the KDE crash handler to generate a back-trace using either gdb or elfutils-libdebuginfod. gdb retrieves the debug information, and the crash is logged.
- If you wish to analyse the coredump using lldb, you will need to redownload all previously fetched artifacts.
Scenario 2
- You utilise llvm-debuginfod-find or lldb (once it is supported) to obtain a source file.
- The source file is provided with only a hash as its name, which may prevent editors from offering syntax highlighting.
- If you rename the file, you have to repeat the process for every source file you fetch.
Scenario 3
- Debuginfod offers an environment variable, DEBUGINFOD_CACHE_PATH, for storing cache files.
- Even if llvm-debuginfod utilizes this environment variable, it will not be able to access the cache of existing artifacts and will be required to redownload them.
It is crucial to ensure compatibility with the current implementation, as cache artifacts can grow significantly in size over time.
Full disclosure: we talked about this a bit in person at the Cambridge meetup.
I see 3 problem statements here but not what you’d like to do about them. Which is ok, if you are looking for ideas for that, but do you have any plans already?
(or you clicked “submit” on Discourse too quickly and all that is coming in an edit)
My immediate reaction is that fixing Scenario 1 and 2 would be significant quality of life improvements.
Scenario 3…
I don’t have much experience using debuginfod, but is this something distros set up for users, is it in their initial bashrc?
Or is debuginfod still something opt-in? I know a lot of distros have their own ways to get debug info instead, like extra packages.
This is the opposite of what I would think. If it did use the standard setting, it would find the existing cache. What detail am I missing there?
Perhaps “it” refers to lldb here, so you mean llvm-debuginfod would do the right thing but lldb is not aligned with it?
(which might be interesting for systems that use llvm tools as their system tools e.g. FreeBSD)
Also, the other way of standardising folder locations on Unix is the XDG Base Directory Specification. Is there any overlap with that? We have moved things to use XDG paths in the past.
Maybe debuginfod’s documentation would have those details if they exist. It would be some “if env var use it else here if it exists else whatever the debugger wants to use”.
Do you know what the reason is for this? I suppose they are hashed because it papers over a lot of duplicate file name issues, and it’s a convenient way to address into the cache.
And btw, if someone wants to try setting up debuginfod to reproduce some of these issues, what’s a good starting point? I presume I could run a local server.
Sorry about that, I am not used to the Discourse interface.
The man page for debuginfod variables link
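If I read the elfutils debuginfod man page correctly, the cache location is resolved roughly as sketched below. This is an illustration, not the client's actual code: debuginfod_cache_dir is a hypothetical helper name, and the order shown (DEBUGINFOD_CACHE_PATH first, then a legacy $HOME/.debuginfod_client_cache if one already exists, then the XDG cache directory) is my understanding of that documentation.

```shell
# Hypothetical helper (not part of elfutils or LLVM): print the directory an
# elfutils-style debuginfod client would be expected to use as its cache.
debuginfod_cache_dir() {
    home="${HOME:-/}"
    if [ -n "${DEBUGINFOD_CACHE_PATH:-}" ]; then
        # 1. An explicit setting always wins.
        printf '%s\n' "$DEBUGINFOD_CACHE_PATH"
    elif [ -e "$home/.debuginfod_client_cache" ]; then
        # 2. A legacy (pre-XDG) cache is reused if it already exists.
        printf '%s\n' "$home/.debuginfod_client_cache"
    elif [ -n "${XDG_CACHE_HOME:-}" ]; then
        # 3. XDG Base Directory location.
        printf '%s\n' "$XDG_CACHE_HOME/debuginfod_client"
    else
        # 4. XDG default when XDG_CACHE_HOME is unset.
        printf '%s\n' "$home/.cache/debuginfod_client"
    fi
}

# Print the cache directory for the current environment.
debuginfod_cache_dir
```

A tool that resolved its cache directory this way would at least look in the same place as gdb and debuginfod-find; the file names inside that directory are a separate question.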
I see 3 problem statements here but not what you’d like to do about them.
I am not sure if this is the intended behaviour or not. The plan is to match the cache file paths used by elfutils-debuginfod.
I don’t have much experience using debuginfod, but is this something distros set up for users, is it in their initial bashrc?
It is what the user can set in their bashrc file. I am not aware of any distro setting it by default.
This is the opposite of what I would think. If it did use the standard setting, it would find the existing cache. What detail am I missing there?
The naming convention for the cache artifacts is different, so it will download them again.
Example:
$ export DEBUGINFOD_CACHE_PATH=/tmp
$ echo $DEBUGINFOD_CACHE_PATH
/tmp
$ debuginfod-find debuginfo f83d43b9b4b0ed5c2bd0a1613bf33e08ee054c93
/tmp/f83d43b9b4b0ed5c2bd0a1613bf33e08ee054c93/debuginfo
$ llvm-debuginfod-find --debuginfo f83d43b9b4b0ed5c2bd0a1613bf33e08ee054c93
/tmp/llvmcache-4936913793558767509
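To make the mismatch concrete: a client that wanted to reuse the artifact above would only need to look for <cache>/<build-id>/debuginfo before downloading. A minimal sketch, assuming the layout shown in the debuginfod-find output; find_cached_debuginfo is a hypothetical helper, not an existing LLVM or elfutils command.

```shell
# Hypothetical helper: look for a debuginfo artifact stored in the layout that
# debuginfod-find printed above (<cache>/<build-id>/debuginfo).
# Prints the path and returns 0 on a cache hit; returns 1 on a miss.
find_cached_debuginfo() {
    cache_dir="$1"
    build_id="$2"
    candidate="$cache_dir/$build_id/debuginfo"
    if [ -f "$candidate" ]; then
        printf '%s\n' "$candidate"
        return 0
    fi
    return 1
}

# Usage idea: only go to the network when the shared cache has no copy.
# find_cached_debuginfo /tmp f83d43b9b4b0ed5c2bd0a1613bf33e08ee054c93 \
#     || llvm-debuginfod-find --debuginfo f83d43b9b4b0ed5c2bd0a1613bf33e08ee054c93
```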
Do you know what the reason is for this? I suppose they are hashed because it papers over a lot of duplicate file name issues, and it’s a convenient way to address into the cache.
Sources fetched with debuginfod-find end with the language extension, so editors can infer the file's language; sources fetched with llvm-debuginfod-find do not.
~ → llvm-debuginfod-find --source /usr/src/debug/glibc-2.37-4.fc38.x86_64/locale/setlocale.c 245240a31888ad5c11bbc55b18e02d87388f59a9
/home/da-viper/.cache/llvm-debuginfod/client/llvmcache-16099381850555133692 [0.04s]
Ξ ~ → debuginfod-find source 245240a31888ad5c11bbc55b18e02d87388f59a9 /usr/src/debug/glibc-2.37-4.fc38.x86_64/locale/setlocale.c
/home/da-viper/.cache/debuginfod_client/245240a31888ad5c11bbc55b18e02d87388f59a9/source-445708bd-#usr#src#debug#glibc-2.37-4.fc38.x86_64#locale#setlocale.c
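The elfutils name keeps the original path with / replaced by #, which is why the .c extension survives. A rough sketch of just that escaping step follows; escape_source_path is a hypothetical helper, and the source-445708bd- prefix seen above is deliberately not reproduced because how it is derived is not covered in this thread.

```shell
# Hypothetical helper: turn a source path into a single file name by replacing
# every '/' with '#', as in the debuginfod-find output above. The trailing
# extension (.c, .cpp, ...) is left untouched, so editors can still use it.
escape_source_path() {
    printf '%s\n' "$1" | tr '/' '#'
}

escape_source_path /usr/src/debug/glibc-2.37-4.fc38.x86_64/locale/setlocale.c
# prints: #usr#src#debug#glibc-2.37-4.fc38.x86_64#locale#setlocale.c
```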
Jlalond May 6, 2025, 12:18am 5
Do we have any context on why this happens? It seems strange for us not to include the file extension if it offers such a benefit.
It uses the cache key for storing the artifacts, see.