-
Notifications
You must be signed in to change notification settings - Fork 13.5k
lld looks for libraries given via relative paths at wrong place (relative to system paths) #67779
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lld looks for libraries given via relative paths at wrong place (relative to system paths) #67779
Comments
Is this the COFF or ELF backend? are you producing code for Linux or Windows? |
I am producing code for Windows. The "compatible" library that should be used is COFF. The "incompatible" that is used by accident is ELF. Per offline communication @mstorsjo knows almost surely why this is happening, I will let him comment. |
I think I know as well considering it's the COFF backend. I can have a look unless Martin already have a patch. |
Here's another even smaller testcase of the same issue. It's a bit more contrieved (one probably wouldn't be doing it like this), but it showcases the issue much clearer:
void myLibFunc(void) { }
void myLibFunc(void);
void mainCRTStartup() {
myLibFunc();
} To run: $ clang -target x86_64-linux-gnu -c lib.c -o lib-linux.o
$ clang -target aarch64-windows-msvc -c lib.c -o lib-win.o
$ clang -target aarch64-windows-msvc -c main.c -o main-win.o
$ mkdir -p lib-linux
$ rm -f lib-linux/mylib.a mylib.a
$ llvm-ar rcs lib-linux/mylib.a lib-linux.o
$ llvm-ar rcs mylib.a lib-win.o
$ lld-link main-win.o -libpath:lib-linux mylib.a -out:main-win.exe -subsystem:console
# Success; mylib.a resolves to the file in the local directory, not in -libpath
$ cp lib-linux/mylib.a $(dirname $(which lld-link))/../lib
$ lld-link main-win.o -libpath:lib-linux mylib.a -out:main-win.exe -subsystem:console
lld-link: error: unknown file type: lib-linux.o
# Failure; mylib.a is found in $(dirname $(which lld-link))/../lib The issue can be showed with the new
So the core of the issue, is that the toolchain directories,
Suggested change 1 I think that the empty path, i.e. resolving files based on the current directory, always should be first in the search path, before the toolchain files. Before the recent changes, this was always first, before anything grabbed from the windows SDK or from the I think we could change Currently we have this: // Construct search path list.
searchPaths.emplace_back("");
for (auto *arg : args.filtered(OPT_libpath))
searchPaths.push_back(arg->getValue());
detectWinSysRoot(args);
if (!args.hasArg(OPT_lldignoreenv) && !args.hasArg(OPT_winsysroot))
addLibSearchPaths();
addClangLibSearchPaths(argsArr[0]); But if we really want to have this priority order, if we'd change
Suggested change 2 Secondly, b6c2f10 changed so that we do resolve relative paths relative to the libpaths too. This was done with the intent of allowing Suggested change 3 For mingw contexts, we probably shouldn't be doing any of this logic for auto-adding search paths. In mingw contexts, the linker is always invoked (by the compiler driver) with suitable I made a patch to do this some months ago, https://reviews.llvm.org/D144084, but when applied, this struck an issue where some compiler-rt test expects to invoke LLD in an MSVC environment, but with I just haven't had a real demand for going forward with this patch so I haven't quite had time, but I guess this is a good case for it. |
Hello!
I also ran into this bug a few days ago and have been investigated it a bit. I was planning on submitting a bug report today but you beat me to it. You did a better job than I would have writing it up so its good that you did :) I see in the release notes for 17.0.1: In my case, my CI toolchain caught the bug in 17.0.1. I am building liblzma.a and disabling various features and testing that the build and tests are successful. The problem is that on MSYS2 CLANG64 environment liblzma.a is installed in the system libraries (/clang64/lib/). So when building the xz.exe command line tool, it links with the system liblzma.a instead of the one in the local directory. This causes an obvious problem if trying to build and distribute a new version of a program that depends on a library built locally. So it makes sense why this issue did not appear until 17.0.1. The expected behavior should be to check the local directory first, then the toolchain directories. I believe this problem is only with the COFF backend since I could not reproduce it on my Ubuntu machine after upgrading to the 17.0.1 release. It possible I set up my test wrong so don't believe me 100%. |
I think that's very reasonable and I would think that change would be easy and good. |
…e search paths In b6c2f10, we allowed looking up relative paths within the search directories. This was done with the intention to allow resolving paths like <triple>/clang_rt.builtins.lib within a base search directory. However we shouldn't try to look up e.g. a path like ../../../../lib/libLLVMSupport.a in the search paths. In some cases, this could actually lead to accidental matches, for a file other than the one that was intended. This should fix one aspect of llvm#67779.
I forgot to mention here, that this issue actually reproduces quite easily - if building LLVM with the CMake Makefile generator, as opposed to Ninja. With Ninja, all the build commands are executed in the same directory, the base of the build directory. However, with the makefile generator, it uses recursive makefiles, thus it enters subdirectories and executes commands there. That's why there are relative path references such as |
… path Before af744f0, the first entry among the search paths was the empty string, indicating searching in (or starting from) the current directory. After af744f0, the toolchain/clang specific lib directories were added at the head of the search path. This would cause lookups of literal file names or relative paths to match paths in the toolchain, if there are coincidental files with similar names there, even if they would be find in the current directory as well. Change addClangLibSearchPaths to append to the list like all other operations on searchPaths - but move the invocation of the function to the right place in the sequence. This fixes llvm#67779.
… path (#67857) Before af744f0, the first entry among the search paths was the empty string, indicating searching in (or starting from) the current directory. After af744f0, the toolchain/clang specific lib directories were added at the head of the search path. This would cause lookups of literal file names or relative paths to match paths in the toolchain, if there are coincidental files with similar names there, even if they would be find in the current directory as well. Change addClangLibSearchPaths to append to the list like all other operations on searchPaths - but move the invocation of the function to the right place in the sequence. This fixes #67779.
Reopening for reusing the PR for backport. The main fix should have landed now (which should be enough for avoiding the issue), I'll request backporting that. We'll see about backporting the others once they're landed. (Also, @tru, I know we're slated to release 17.0.2 tomorrow - if you feel this is too soon to be backported before that, feel free to defer it until after the release to let it cook in main for a few days. Although I believe these changes should be fairly risk free.) |
Failed to cherry-pick: f906fd5 https://github.com/llvm/llvm-project/actions/runs/6379152727 Please manually backport the fix and push it to your github fork. Once this is done, please add a comment like this:
|
Ah, apparently a reformatting in a5e280b affects the cherrypicking. While the reformatting isn't very small and localized, I think it's easiest to cherrypick it to the release branch (it applies cleanly); that also reduces the risk of conflicts for other future backports. |
/branch llvm/llvm-project-release-prs/issue67779 |
/branch llvm/llvm-project-release-prs/issue67779 |
/pull-request llvm/llvm-project-release-prs#719 |
… path (#67857) Before af744f0, the first entry among the search paths was the empty string, indicating searching in (or starting from) the current directory. After af744f0, the toolchain/clang specific lib directories were added at the head of the search path. This would cause lookups of literal file names or relative paths to match paths in the toolchain, if there are coincidental files with similar names there, even if they would be find in the current directory as well. Change addClangLibSearchPaths to append to the list like all other operations on searchPaths - but move the invocation of the function to the right place in the sequence. This fixes #67779. (cherry picked from commit f906fd5)
This contains a few bugfixes for mingw use cases: - LLD in 17.0.1 and the earlier 17.x release candidates looked up libraries (including libraries named as a relative path) preferring to search in toolchain relative paths over first testing from the current directory. In some cases, this could cause cryptic build failures; see llvm/llvm-project#67779. - Options that only are relevant for linking, like -mwindows, used to be accepted (but produce a warning) if used in a compilation-only job. In the earlier 17.x releases/RCs, such warnings caused an error instead; see llvm/llvm-project#64464 and #371.
FYI, this patch just landed in 241c290. |
@llvm/issue-subscribers-lld-coff Author: Tomas Kalibera (kalibera)
When libraries are passed to the linker as relative paths (e.g. "../../../../lib/libLLVMSupport.a"), they are sometimes looked for relative to the system installation path (e.g. "/something/usr/x86_64-pc-linux-gnu/lib/clang/17/lib/../../../../lib/libLLVMSupport.a"). Instead, they should be looked up always relative to the current working directory.
I've ran into this when cross-compiling a native LLVM compiler (to run on Windows/aarch64 and produce code for Windows/aarch64), using a cross-compiler running on Linux/x86_64. Linking failed due to errors like "ld.lld: error: unknown file type: CommandLine.cpp.o". The correct "libLLVMSupport.a" library has been available relatively to the current working directory in "../../../../lib/libLLVMSupport.a", but the linker instead used an incompatible version (for Linux/x86_64) which happened to be located in "/something/usr/x86_64-pc-linux-gnu/lib/clang/17/lib/../../../../lib/libLLVMSupport.a". When I deleted that incompatible library from that location, the linking passed as it should. I'm attaching a narrowed-down example based on this problem. On my system, with that "incompatible" library in place,
does
and outputs
With the "incompatible" library not in place, the output includes only complaints about undefined symbols, as it should, because I've narrowed down the example. The archive includes also outputs from strace, which make it clear that lld opens the "incompatible" library at the wrong place:
The archive also includes the "incompatible" version of the library one could place in the corresponding place to reproduce the problem. @mstorsjo |
When libraries are passed to the linker as relative paths (e.g. "../../../../lib/libLLVMSupport.a"), they are sometimes looked for relative to the system installation path (e.g. "/something/usr/x86_64-pc-linux-gnu/lib/clang/17/lib/../../../../lib/libLLVMSupport.a"). Instead, they should be looked up always relative to the current working directory.
I've ran into this when cross-compiling a native LLVM compiler (to run on Windows/aarch64 and produce code for Windows/aarch64), using a cross-compiler running on Linux/x86_64. Linking failed due to errors like "ld.lld: error: unknown file type: CommandLine.cpp.o". The correct "libLLVMSupport.a" library has been available relatively to the current working directory in "../../../../lib/libLLVMSupport.a", but the linker instead used an incompatible version (for Linux/x86_64) which happened to be located in "/something/usr/x86_64-pc-linux-gnu/lib/clang/17/lib/../../../../lib/libLLVMSupport.a". When I deleted that incompatible library from that location, the linking passed as it should.
I'm attaching a narrowed-down example based on this problem. On my system, with that "incompatible" library in place,
does
and outputs
With the "incompatible" library not in place, the output includes only complaints about undefined symbols, as it should, because I've narrowed down the example. The archive includes also outputs from strace, which make it clear that lld opens the "incompatible" library at the wrong place:
The archive also includes the "incompatible" version of the library one could place in the corresponding place to reproduce the problem.
test.zip
@mstorsjo
@tru @rnk @zmodem
The text was updated successfully, but these errors were encountered: