-
Notifications
You must be signed in to change notification settings - Fork 11.8k
Update python verions #13574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Update python verions #13574
Conversation
Updates the allowed Python version to include 3.10 up to, but not including, 3.14. Sets the minimum torch version to 2.5 in multiple dependency files.
Updates torchvision dependency to a more recent version. This ensures compatibility with the current torch version and leverages the latest features and bug fixes.
Increases the minimum required NumPy version to 2.1 across multiple project dependencies. This ensures compatibility with newer features and improvements in the NumPy library. Updates also include a minor patch to Torch version.
Were those checks already not passing? I can look into them/ changes required |
Thank you. I am not really sure why I am low-coding. I don't understand much. |
…der_to_gguf.py Fix torch 2.5.1 / numpy 2.x compatibility in convert_image_encoder_to_gguf.py - Updated Tensor-to-array conversions to use `np.asarray(..., dtype=...)` per NumPy 2.x migration rules (avoids copy error on float16). - Used explicit typing and `cast(...)` to guide Pyright/Pylance under torch 2.5.1: - Annotated `model` as PreTrainedModel. - Re-cast `model.vision_model` to `CLIPVisionTransformer` to safely access `.encoder.layers`. - Replaced slice assignment with `__init__` to reset ModuleList contents. - Verified compatibility by converting `openai/clip-vit-base-patch32` using `--clip-model-is-openclip`.
Corrects type annotations for numpy arrays to allow for a broader range of numpy dtypes and resolves type checking errors. Removes NumPy DTypeLike type hint Updates type hints for NumPy compatibility by removing DTypeLike. Ensures alignment with NumPy's typing system, preventing potential type-related issues.
Updates type hints related to numpy to use `np.dtype[Any]` instead of `DTypeLike` for better compatibility and clarity. This resolves potential issues with type checking and ensures accurate type annotations for numpy-related operations within the `gguf` library.
after researching the typing issues introduced in numpy 2.2
…ying class (such as np.float32) looking at microsoft/pyright#9051, they declined to fix it themselves, and suggested instead that the used must add a # pyright: ignore or # type: ignore directive to suppress this error. Numpy is working to resolve them: numpy/numpy#28076 and has already done so with npfloat64 (which I can verify in our errors) -- see numpy/numpy#27957 .
Updates the pyright configuration to use Python 3.10. This ensures compatibility with newer language features and libraries. It also excludes the 'tools/legacy' and 'tests' directories from analysis.
This reverts commit eb6499c.
I made a mistake in rolling back the changes to tools/mtmd (to protect the legacy files), and now the git history of my branch will blame this pr in the history of changes to some files which it should not. for this reason, I am going to recreate this pr. I am sorry for the error. |
The commits are squashed when merging to master anyway, so if the files are not changed in But if you end up fixing the history of the branch, you can also force-push that to the branch of this PR and GitHub will only show the commits belonging to the new tip of the branch. |
oh great :) git diff --name-only $(git merge-base master HEAD) gguf-py/gguf/gguf_reader.py |
Updates the pyright configuration to exclude the legacy models directory from type checking.
Temporarily disables `reportInvalidTypeForm` in pyright configuration due to ongoing transition to the new type system in numpy 2.2.x.
pyrightconfig.json
Outdated
"pythonPlatform": "All", | ||
"reportInvalidTypeForm": false, // TODO: remove once numpy 2.2.x resolves the transition to their new type system |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"reportInvalidTypeForm": false, // TODO: remove once numpy 2.2.x resolves the transition to their new type system |
I don't think this is necessary with the pythonVersion
as 3.10
; I did not see this error kind in the checks of the previous commit. See https://github.com/ggml-org/llama.cpp/actions/runs/15123751323/job/42511726247, which mostly seems to be argument type issues where DTypeLike
was replaced with np.dtype[Any]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it actually is not working, as the comment at the top in each file was. I am kind of surprised by this but I see:
https://github.com/ggml-org/llama.cpp/actions/runs/15123815358/job/42511921578?pr=13574
The link you referenced was after reverting the top-of-file comment in gguf-reader.
(compare to https://github.com/ggml-org/llama.cpp/actions/runs/15122601593 when we had comments at the top of each file and testing in 3.8)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a compromise. In both NumPy 1.x and 2.2.x, np.uint32 and np.dtype(np.uint32) are functionally equivalent when used as parameters in array creation or type specification. So, the current commit rolls back the pyright hint in both locations. there are scripts that might still benefit from this, like convert_hf_to_gguf.py, but they have other issues as well and maybe deserve their own pr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think setting reportInvalidTypeForm
to false
is not necessary (since the version change to 3.10
in pyrightconfig.json
) and will hide future problems.
In both NumPy 1.x and 2.2.x, np.uint32 and np.dtype(np.uint32) are functionally equivalent when used as parameters in array creation or type specification
Yes, since both are DTypeLike
. Why not simply directly use (I see this was changed in a5c74dd)DTypeLike
in the type annotations as before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they were removed because on Friday, I believe numpy 2.2.5 was current, and DtypeLike was raising errors in pyright. Specifically, I was seeing:
Error: Variable not allowed in type expression (reportInvalidTypeForm)
/home/runner/work/llama.cpp/llama.cpp/gguf-py/gguf/quants.py:29:96 - error: Variable not allowed in type expression (reportInvalidTypeForm)
Error: Variable not allowed in type expression (reportInvalidTypeForm)
/home/runner/work/llama.cpp/llama.cpp/gguf-py/gguf/quants.py:83:36 - error: Variable not allowed in type expression (reportInvalidTypeForm)
Error: Variable not allowed in type expression (reportInvalidTypeForm)
in the file before I had touched it at all. Today, numpy 2.2.6 is current, and pyright does not warn me on those lines (at least locally, waiting for the type check to complete to verify). I have removed the changes I had previously made to this type in the relevant files
edit: The check completed, I think we are good. this is the only remaining error:
/home/runner/work/llama.cpp/llama.cpp/examples/convert_legacy_llama.py:1311:37 - error: Cannot access attribute "newbyteorder" for class "uint32"
Attribute "newbyteorder" is unknown (reportAttributeAccessIssue)
Error: Cannot access attribute "newbyteorder" for class "uint32"
Attribute "newbyteorder" is unknown (reportAttributeAccessIssue)
1 error, 0 warnings, 12 informations
Error: 1 error
And, I am wondering, would you like me to add it to the exclude list in pyrightconfig, since it is marked as legacy? (elsewhere in this discussion, with @ngxson, a preference was to not change certain legacy scripts)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And, I am wondering, would you like me to add it to the exclude list in pyrightconfig, since it is marked as legacy?
Without ignoring the file, it's still a relatively simple fix:
diff --git a/examples/convert_legacy_llama.py b/examples/convert_legacy_llama.py
index c4ec5c524..710cca96b 100755
--- a/examples/convert_legacy_llama.py
+++ b/examples/convert_legacy_llama.py
@@ -1308,7 +1308,7 @@ def do_dump_model(model_plus: ModelPlus) -> None:
def main(args_in: list[str] | None = None) -> None:
output_choices = ["f32", "f16"]
- if np.uint32(1) == np.uint32(1).newbyteorder("<"):
+ if np.uint32(1) == np.uint32(1).view(np.dtype(np.uint32).newbyteorder("<")):
# We currently only support Q8_0 output on little endian systems.
output_choices.append("q8_0")
parser = argparse.ArgumentParser(description="Convert a LLaMA model to a GGML compatible file")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, thank you!
that works for now :) Starting with python 3.13, Unpickler changes:
The persistent_load method overrides the Unpickler class in an incompatible way: the base method takes 1 positional parameter, but the override takes 2. (Pylance reportIncompatibleMethodOverride)
That's not effecting our tests though :)
edit: funnily enough, chatgpt tells me the C signature never changed: The one‑parameter signature you see in type stubs is a typing artefact that confused static checkers; the C implementation still forwards the persistent‑ID to your override. It suggests (and I just find this hilarious -- totally happy to ignore it if you are):
A pragmatic way to silence the Pyright/Pylance warning is simply:
class MyUnpickler(pickle.Unpickler):
def persistent_load(self, pid: Any) -> Any: # type: ignore[override]
...
If you do want perfect type‑checker harmony you can gate the signature on the Python version, but it’s rarely worth the added noise.
Replaces uses of bare NumPy type hints (e.g., `np.uint32`) with `np.dtype(np.uint32)` in the GGUF reader. This aligns with changes in NumPy 2.0+ and ensures forward compatibility. Also applies this change to the quantize/dequantize functions in the quants module.
DTypeLike was removed because pyright was choking on it against numpy 2.2.1 Now numpy at 2.2.6 seems to have changes needed for pyright to not have issues with it. additionally, DTypeLike is accurate in a way that np.dtype[Any] was not.
Updates the byteorder check in the conversion script to be more robust. Removes the "tests" directory from the pyright exclude list.
currently, and for quite some time actually, we have been in a weird situation where no python install is possible without using flags, at least with standard setups such as uv. here's an example:
The numpy thing was an issue at least straight through torch 2.5 (torch 2.5.1 switched to numpy 2.x). Torch never published their own numpy 1.26.4, switching at 1.26.3 to 2.0, and uv stops with the first valid provider and its already on torch, so uv gives up when it finds the provider doesn't have the version it needs.
this same install works starting at 3.10, (numpy aside). But starting at python 3.13, we get a problem with torch:
BTW, PyTorch (>= 2.3) officially dropped support for Python 3.9 as of June 22, 2024, following the NEP 29 deprecation schedule. So it really makes sense to bump this project up to 3.10.
Can we just update Pytorch?
I see 3 files that use Pytorch in the project.
I asked gemini to help look for issues. It notes:
Okay, I have analyzed the PyTorch usage in the three provided scripts:
./convert_hf_to_gguf.py
,./convert_lora_to_gguf.py
, and./tools/mtmd/legacy-models/minicpmv-convert-image-encoder-to-gguf.py
.Here's the verification for each file regarding compatibility with PyTorch 2.3 or greater:
1.
./convert_hf_to_gguf.py
import torch
from torch import Tensor
(for type hinting)torch.load(str(self.dir_model / part_name), map_location="cpu", mmap=True, weights_only=True)
:safetensors
format.weights_only=True
. This is the recommended security practice and aligns with PyTorch's direction. It means that even when PyTorch 2.6+ changes the default ofweights_only
toTrue
, this script's behavior will remain consistent and safe..dtype
(e.g.,first_tensor.dtype == torch.float16
,data_torch.dtype
).to(torch.float32)
(and other type conversions like totorch.float16
,torch.bfloat16
).numpy()
.shape
.reshape(...)
,torch.cat(...)
,torch.stack(...)
,torch.permute(...)
,torch.transpose(...)
,torch.div_(...)
(in-place),torch.view(...)
(used in various model-specificmodify_tensors
methods).torch.arange(...)
,torch.tensor(...)
,torch.exp(...)
,torch.equal(...)
,torch.norm(...)
,torch.from_numpy(...)
(viatorch.frombuffer
inLazyTorchTensor
),torch.split(...)
.LazyTorchTensor
Class:torch.empty
for meta tensor creation andtorch.frombuffer
for creating tensors from remote data.__torch_function__
to wrap PyTorch functions, which is a stable PyTorch feature.torch.float64
,torch.float32
,torch.bfloat16
,torch.float16
,torch.int64
, etc.) are all standard.torch.inference_mode()
:./convert_hf_to_gguf.py
:weights_only=True
intorch.load
is key. All other PyTorch operations are standard and stable.2.
./convert_lora_to_gguf.py
import torch
from torch import Tensor
(for type hinting)LazyTorchTensor
andModelBase
from./convert_hf_to_gguf.py
, inheriting their PyTorch usage patterns.torch.load(input_model, map_location="cpu", weights_only=True)
:adapter_model.safetensors
is not found, loadingadapter_model.bin
.weights_only=True
.weights_only
defaults.LoraTorchTensor
Class:__getitem__
,reshape
,view
,permute
,transpose
,to
,__torch_function__
fortorch.stack
,torch.cat
) perform operations on the underlying PyTorch tensors using standard PyTorch tensor methods. For example,self._lora_A.reshape(...)
.lora_a.T
(tensor transpose property).ModelBase
:LoraModel
class inherits fromModelBase
(defined inconvert_hf_to_gguf.py
). It reuses the tensor processing logic, which, as established above, is compatible.LazyTorchTensor.from_eager(tensor)
.torch.inference_mode()
:./convert_lora_to_gguf.py
:torch.load
usage and the compatible base class fromconvert_hf_to_gguf.py
.3.
./tools/mtmd/legacy-models/minicpmv-convert-image-encoder-to-gguf.py
import torch
import torch.nn.functional as F
from torch import nn
SiglipVisionEmbeddings
,SiglipAttention
,Idefics2VisionTransformer
):torch.nn.Module
orPreTrainedModel
(which itself is annn.Module
).nn.Conv2d
,nn.Embedding
,nn.Linear
,nn.LayerNorm
,nn.ModuleList
.ACT2FN
or directly likeF.pad
).nn.init.normal_
,nn.init.zeros_
,_trunc_normal_
,variance_scaling_
) use basic tensor operations liketensor.uniform_
,tensor.erfinv_()
,tensor.mul_()
,tensor.add_()
,tensor.clamp_()
,tensor.normal_()
.torch.nn
and associated tensor manipulations.tensor.erfinv_
is a standard operation.torch.load(os.path.join(dir_model, "minicpmv.clip"))
:Idefics2VisionTransformer
(orSiglipVisionTransformer
).weights_only
is NOT specified.weights_only
isFalse
. This means the script will load the file as is.model.load_state_dict(...)
. This function expects a dictionary mapping tensor names totorch.Tensor
objects (a state dictionary).minicpmv.clip
is a pickled file containing only a state dictionary (which is highly likely given its use withload_state_dict
), then:weights_only=False
(default in 2.3-2.5) works fine.weights_only=True
would also work fine and would be safer.weights_only=True
, this line of code will continue to work correctly ifminicpmv.clip
is indeed just a state dictionary of tensors.weights_only=True
to this call ifminicpmv.clip
is confirmed to be a safe state dictionary, for future-proofing and best practice.state_dict = model.state_dict()
data.squeeze().numpy()
v.chunk(3, dim=0)[0]
,v.transpose(-1, -2).contiguous()
,v.unsqueeze(0)
(in_replace_name_resampler
and_replace_name
).torch.from_numpy(...)
(used implicitly whentorch.from_numpy(get_2d_sincos_pos_embed(...))
is called).torch.no_grad()
: Used as a context manager../tools/mtmd/legacy-models/minicpmv-convert-image-encoder-to-gguf.py
:torch.load
call without an explicitweights_only
parameter is not a breaking issue for PyTorch 2.3-2.5 for loading state dictionaries. It will also likely work with PyTorch 2.6+ if the file is a standard state dictionary. However, for best practice and to eliminate any ambiguity, addingweights_only=True
to this specifictorch.load
call is advisable, assumingminicpmv.clip
contains only a state dictionary.Overall Conclusion:
All three scripts (
./convert_hf_to_gguf.py
,./convert_lora_to_gguf.py
, and./tools/mtmd/legacy-models/minicpmv-convert-image-encoder-to-gguf.py
) are compatible with upgrading PyTorch to 2.3 or greater.torch.load
calls in the first two scripts explicitly useweights_only=True
, which is ideal.torch.load
call in the third script does not specifyweights_only
, but given its context (loading a state dictionary), this is not a breaking issue for PyTorch 2.3-2.5 and is likely to remain compatible with PyTorch 2.6+ for this specific use case. Addingweights_only=True
to it is a good recommendation for robustness.So, I created a script to check general availability of torch at python version: https://gist.github.com/robbiemu/c224d6daf2f05d918ac8cf759db33288
uv run main.py --toml $GH/llama.cpp/pyproject.toml --requirements $GH/llama.cpp/requirements.txt --targets cpu,cuda
Building dynamic compatibility grid — this may take a few seconds…
Stable variants discovered: cpu, cu100, cu101, cu102, cu110, cu111, cu113, cu115, cu116, cu117, cu118, cu121, cu124, cu126, cu128, cu92, rocm3, rocm4, rocm5, rocm6, xpu
Nightly variants discovered: cpu, cu118, cu126, cu128, rocm6, xpu
Python \ Torch | 2.2 | 2.3 | 2.4 | 2.5 | 2.6 | 2.7 | 2.8
---------------+-----+-----+-----+-----+-----+-----+----
3.9 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | !
3.10 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | !
3.11 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | !
3.12 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | !
3.13 | X | X | X | ✓ | ✓ | ✓ | !
To support all python versions from 3.10 forward, we will need to upgrade torch to 2.5.
Taken together, we should upgrade to python >=3.10, torch 2.5.1, and numpy >2.0.
So, I wrote a small script to crawl the repo for "import numpy" and "from numpy" and collect all the numpy members used in the files, and asked gemini to check them for safety in upgrading to numpy 2.x, and verified our usage was compatible:
Now, let's verify these against potential breaking changes, primarily focusing on the NumPy 1.x to NumPy 2.0 transition, as this is the most significant recent shift. You're aiming for
numpy>=2.0.0
to go withtorch>=2.5.0
and Python 3.10+.Verification of NumPy Members for NumPy 2.0 Compatibility:
Core Array Object and Creation:
ndarray
,array
,asarray
,empty
,ones
,zeros
,arange
,frombuffer
,fromfile
,fromiter
: These are fundamental and remain. Their core behavior is stable.memmap
: Stable.Data Types (dtypes):
dtype
(the object itself and as a parameter): Stable.float16
,float32
,float64
: Stable. These are concrete types.int8
,int16
,int32
,int64
: Stable.uint8
,uint16
,uint32
,uint64
: Stable.bool_
: Stable (np.bool_
is the canonical boolean type).complex64
: Stable.generic
(likelynp.generic
for type checking): Stable.signedinteger
(likelynp.signedinteger
for type checking): Stable.DTypeLike
,NDArray
(fromnumpy.typing
): These are for type hinting and are designed to work with NumPy 2.0.np.int
,np.float
,np.bool
. Your list shows specific types (np.int32
,np.float64
, etc.) ornp.bool_
, which are the correct, non-deprecated forms. So, this common pitfall is avoided.Array Manipulations:
concatenate
,stack
,split
,array_split
,hsplit
: Stable.reshape
(implied, asndarray
objects have this method, though not explicitly listed asnp.reshape
): Stable.moveaxis
,take_along_axis
: Stable.pad
: Stable.packbits
,unpackbits
: Stable.clip
: Stable.triu
: Stable.Mathematical Operations & Functions:
cos
,sin
,exp
,sqrt
,floor
,trunc
,sign
: Stable.sum
,prod
,argmax
,count_nonzero
: Stable.dot
,einsum
: Stable.linalg
(e.g.,np.linalg.norm
): Thenp.linalg
submodule is stable.all
,allclose
,array_equal
: Stable.fmax
,fmin
: Stable.nan
(np.nan
): Stable.fft
(likelynp.fft
submodule): Stable.hanning
(likelynp.hanning
for window functions): Stable.Utilities & Submodules:
random
(e.g.,np.random.random
): Thenp.random
module and its functions are stable.lib
(e.g.,np.lib.stride_tricks
): This submodule contains utilities. While some very internal parts could change, common uses are generally stable.np.lib.pad
is nownp.pad
. Your list showspad
, suggesting the directnp.pad
is likely used, which is good.errstate
: Stable (context manager for floating point errors).set_printoptions
: Stable.meshgrid
: Stable.where
: Stable.Overall Assessment for NumPy >= 2.0.0:
Based on the list of members your script identified:
np.int
,np.float
aliases) does not appear to be an issue here, as the list shows specific dtypes (e.g.,np.float32
,np.int64
).Conclusion:
Given the specific NumPy members used by the
llama.cpp
Python scripts, an upgrade tonumpy>=2.0.0
is highly likely to be safe and successful.The identified usages are predominantly core NumPy functionalities that have remained stable or had very clear migration paths that an active project would typically handle. The benefits of moving to NumPy 2.0 (better Python 3.13+ support, API consistency, potential performance improvements) outweigh the minimal risks for this particular set of functions.
This confirms the earlier recommendation to set your NumPy requirement to
numpy>=2.0.0
when usingtorch>=2.5.0
and targeting Python 3.10+.After updating the requirements (for torch 2.5.1 and torchvision) and pyproject.toml, and installing like:
and we can verify that numpy v2 would always be acceptable:
and torch 2.5.1 was compiled on numpy 2.x when numpy 2.1 was current, per this nvidia thread, so >=2.1 makes sense.
This finally uncovers another dependency issue stemming from Pytorch's own published version of requests (they have literally only one):
and this is easy, because they are both minor versions is a very stable library, and the version numbers are actually close. we can reasonably set requests>=2.28.1. with that, we get at the easy install you'd expect:
And if we want the latest, we can use
uv pip install --upgrade -r requirements/requirements-all.txt --resolution=highest --index-strategy unsafe-best-match