Core dumped after loading llama.cpp built with 'cmake' / building with 'make' works fine / CUBLAS enabled #1982

Closed
3 tasks done
m-from-space opened this issue Jun 24, 2023 · 8 comments

@m-from-space

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).

Expected Behavior

I was expecting llama.cpp to work whether it is built with make or with cmake, with CUBLAS enabled in both cases.

Current Behavior

Building with either make or cmake with CUBLAS enabled works. However, the main binary crashes on launch when it was built with cmake.

Environment and Context

Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
Ubuntu 20.04
Nvidia RTX 3060 12 GB
64 GB RAM
CUDA 12.1
GNU Make 4.2.1
Python 3.10.10
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
cmake version 3.26.4

Failure Information (for bugs)

Illegal instruction (core dumped)

This also affects installing llama-cpp-python using pip (with FORCE_CMAKE=1): trying to use it, or even just importing the module in Python, crashes the same way. See the issue I created earlier here: abetlen/llama-cpp-python#412
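
One way to confirm which instruction the binary trips over is to run it under gdb (a sketch, assuming gdb is installed and the cmake build steps below have been run from the repository root):

# run under gdb; once SIGILL stops execution, print the faulting instruction
gdb -batch -ex run -ex 'x/i $pc' ./build/bin/main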

Steps to Reproduce

Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build
cd build
cmake .. -DLLAMA_CUBLAS=ON
cmake --build . --config Release
(building...)
cd bin
./main
Illegal instruction (core dumped)

What is working:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make LLAMA_CUBLAS=1
(building...)
./main
main: build = 733 (fdd1860)
main: seed  = 1687610614
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3060
@slaren
Member

slaren commented Jun 24, 2023

The make build detects your CPU features; with cmake you have to configure them yourself. It builds for AVX2 by default, and since your CPU doesn't support it, you need to disable it.
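
A quick way to check which of those extensions the CPU actually reports (a sketch, assuming Linux, where /proc/cpuinfo lists the feature flags):

# print only the relevant feature flags from the first CPU entry
grep -m1 '^flags' /proc/cpuinfo | grep -o -w -e avx -e avx2 -e f16c -e fma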

@m-from-space
Author

The make build detects your CPU features; with cmake you have to configure them yourself. It builds for AVX2 by default, and since your CPU doesn't support it, you need to disable it.

Could you provide me with details on how to disable AVX2 when configuring cmake? This would also be helpful for others. Thanks a bunch! :)

@slaren
Member

slaren commented Jun 24, 2023

Add -DLLAMA_AVX2=OFF when configuring. More generally, check all the options and make sure that only the ones supported by your system are enabled.
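
A quick way to list all of those options and their current values (a sketch; run from the build directory after it has been configured once):

# list cached CMake options, filtered to the llama.cpp feature toggles
cmake -L .. | grep LLAMA_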

@m-from-space
Author

You're the fastest responder I've ever encountered. :)

Unfortunately this didn't completely fix the issue, but when starting main, it's at least printing two lines before dumping again.

main: build = 733 (fdd1860)
main: seed  = 1687612078
Illegal instruction (core dumped)

What else does my CPU not support?

@slaren
Member

slaren commented Jun 24, 2023

It's probably F16C.
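
If so, turning it off alongside AVX2 when reconfiguring would look roughly like this (a sketch; wiping the CMake cache first avoids picking up stale settings):

# from the build directory
rm -f CMakeCache.txt
cmake .. -DLLAMA_CUBLAS=ON -DLLAMA_AVX2=OFF -DLLAMA_F16C=OFF
cmake --build . --config Release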

@m-from-space
Author

Unfortunately this doesn't change anything. It's really quite an annoying problem. Is there a way to figure out which CPU architecture flags "make" is using?

@hungerf3

Does setting -DLLAMA_NATIVE=on help? That should pass -march=native which is the same thing that the makefile does.
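
For reference, one way to see exactly which extensions -march=native enables on a given machine (a sketch using gcc's target option dump; the output format varies by gcc version):

# show the relevant -m<feature> switches and whether they are enabled
gcc -march=native -Q --help=target | grep -E 'avx|fma|f16c'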

@m-from-space
Author

Does setting -DLLAMA_NATIVE=on help? That should pass -march=native which is the same thing that the makefile does.

I tried that and had thought about it earlier too, but it also wasn't the solution.

But I finally got it, thanks to you! :) The last missing piece was "FMA".

So the following commands build llama.cpp with cmake successfully with CUBLAS enabled (repeating the instructions from the main project page for clarity):

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build
cd build
cmake .. -DLLAMA_CUBLAS=ON -DLLAMA_AVX2=OFF -DLLAMA_F16C=OFF -DLLAMA_FMA=OFF
cmake --build . --config Release

This now also allows building it successfully for llama-cpp-python / text-generation-web-ui (for people who have issues there):

CMAKE_ARGS="-DLLAMA_CUBLAS=on -DLLAMA_AVX2=OFF -DLLAMA_F16C=OFF -DLLAMA_FMA=OFF" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir
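
A quick sanity check afterwards, since the original crash also appeared when just importing the module (a sketch):

# if this prints "ok" instead of "Illegal instruction", the rebuilt wheel is usable
python3 -c "import llama_cpp; print('ok')"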
