Commit 47bdee4

Molmo Requirements (#17026)
Signed-off-by: Eyshika Agarwal <[email protected]>
Signed-off-by: eyshika <[email protected]>
1 parent 49f1894 commit 47bdee4

File tree

2 files changed: +24 -0 lines changed

docs/source/models/supported_models.md

Lines changed: 4 additions & 0 deletions
@@ -1111,6 +1111,10 @@ This limitation exists because the model's mixed attention pattern (bidirectiona
 To use `TIGER-Lab/Mantis-8B-siglip-llama3`, you have to pass `--hf_overrides '{"architectures": ["MantisForConditionalGeneration"]}'` when running vLLM.
 :::

+:::{warning}
+For improved output quality of `AllenAI/Molmo-7B-D-0924` (especially in object localization tasks), we recommend using the pinned dependency versions listed in <gh-file:requirements/molmo.txt> (including `vllm==0.7.0`). These versions match the environment that achieved consistent results on both A10 and L40 GPUs.
+:::
+
 :::{note}
 The official `openbmb/MiniCPM-V-2` doesn't work yet, so we need to use a fork (`HwwwH/MiniCPM-V-2`) for now.
 For more details, please see: <gh-pr:4087#issuecomment-2250397630>
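
As context for the warning added above: the pinned environment is intended for offline inference with Molmo. Below is a minimal sketch of such a run, assuming the pinned environment from <gh-file:requirements/molmo.txt> (`vllm==0.7.0`, `pillow`), a hypothetical local image `example.jpg`, and a plain-text prompt; the exact prompt template Molmo expects is not specified in this commit, so treat this as an illustration rather than documented usage.

```python
# Minimal sketch, not the documented usage. Assumptions: the pinned environment
# from requirements/molmo.txt (vllm==0.7.0, pillow), a hypothetical local image
# "example.jpg", and a plain-text prompt (the prompt template Molmo expects is
# not specified in this commit).
from PIL import Image
from vllm import LLM, SamplingParams

llm = LLM(
    model="allenai/Molmo-7B-D-0924",  # checkpoint named in the warning above
    trust_remote_code=True,           # Molmo ships custom modeling code on the Hub
    dtype="float16",                  # float16 path can use the pinned flash-attn
)

image = Image.open("example.jpg")  # hypothetical test image
outputs = llm.generate(
    {
        "prompt": "Point to the dog in the image.",  # object-localization-style query
        "multi_modal_data": {"image": image},
    },
    SamplingParams(temperature=0.0, max_tokens=256),
)
print(outputs[0].outputs[0].text)
```

Point-style queries like the one above are the object localization use case the warning calls out.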

requirements/molmo.txt

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+# Core vLLM-compatible dependencies for the Molmo accuracy setup (tested on L40)
+torch==2.5.1
+torchvision==0.20.1
+transformers==4.48.1
+tokenizers==0.21.0
+tiktoken==0.7.0
+vllm==0.7.0
+
+# Optional but recommended for improved performance and stability
+triton==3.1.0
+xformers==0.0.28.post3
+uvloop==0.21.0
+protobuf==5.29.3
+openai==1.60.2
+opencv-python-headless==4.11.0.86
+pillow==10.4.0
+
+# FlashAttention (used for float16 only)
+flash-attn>=2.5.6  # Not used in the float32 path; pinned here for documentation
+
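
Since the value of this file is its exact pins, it can help to verify that an environment created with `pip install -r requirements/molmo.txt` actually matches them. A small self-contained check, assuming only the package names listed above:

```python
# Verify the core pins from requirements/molmo.txt against the active environment.
# (A sketch; extend PINS to cover the optional pins as needed.)
from importlib.metadata import PackageNotFoundError, version

PINS = {
    "torch": "2.5.1",
    "torchvision": "0.20.1",
    "transformers": "4.48.1",
    "tokenizers": "0.21.0",
    "tiktoken": "0.7.0",
    "vllm": "0.7.0",
}

for pkg, pinned in PINS.items():
    try:
        installed = version(pkg)
    except PackageNotFoundError:
        print(f"{pkg}: NOT INSTALLED (want {pinned})")
        continue
    status = "OK" if installed == pinned else f"MISMATCH (want {pinned})"
    print(f"{pkg}=={installed}: {status}")
```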
