Mllama(single + dual) + InternVL(single) + Llava (single) #267
Conversation
1. Mllama single QPC support added 2. Simplified generate inputs for single and dual QPC --------- Signed-off-by: Amit Raj <[email protected]> Co-authored-by: asmigosw <[email protected]> Signed-off-by: Amit Raj <[email protected]>
Added support for Llava model single QPC Signed-off-by: Amit Raj <[email protected]>
Force-pushed from b9c0bc1 to b3a5d22 (Compare)
class QEFFTransformersBase(QEFFBaseModel):
Why is this entire part showing up as a diff? Is it possible to clean up the PR?
Because we moved the QEFFAutoModelForCausalLM class from the top of the file to the bottom, since we wanted to support InternVL via it. As we discussed, should I change it to use QEFFAutoModelForImageTextToText?
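For context, here is a minimal usage sketch of the unified entry point, assuming the class lands as QEFFAutoModelForImageTextToText with a kv_offload flag choosing between the dual- and single-QPC paths; the class name, flag, and example checkpoint are taken from this discussion or assumed, not verified against the final API.

# Sketch only: QEFFAutoModelForImageTextToText and kv_offload are assumed
# from this PR's discussion; the final API may differ.
from QEfficient import QEFFAutoModelForImageTextToText

# kv_offload=True is assumed to select the dual-QPC path (separate vision
# and language QPCs); kv_offload=False the single-QPC path.
model = QEFFAutoModelForImageTextToText.from_pretrained(
    "llava-hf/llava-1.5-7b-hf",  # example checkpoint, for illustration only
    kv_offload=False,
)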
Force-pushed from 24fad68 to 81cea10 (Compare)
Can we please add a description for the newly added argument, model?
Force-pushed from 2de6982 to f5efdae (Compare)
Force-pushed from f5efdae to ad594d7 (Compare)
config = AutoConfig.from_pretrained(pretrained_model_name_or_path, trust_remote_code=True)
config._attn_implementation = "eager"
config.vision_config.use_flash_attn = "false"
model = cls._hf_auto_class.from_pretrained(pretrained_model_name_or_path, config, *args, **kwargs)
Why are we explicitly fetching the config and directly calling from_pretrained() on AutoModelForImageTextToText? We could have called super().from_pretrained(), which would have handled this internally.
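A minimal sketch of the refactor being suggested here, assuming the base class's from_pretrained() already applies the common attn_implementation and low_cpu_mem_usage handling; class and attribute names follow the diff above, and this is illustrative rather than the PR's actual code.

from transformers import AutoConfig

class QEFFAutoModelForImageTextToText(QEFFTransformersBase):
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *args, **kwargs):
        # Keep only the model-specific vision tweak in the derived class...
        config = AutoConfig.from_pretrained(pretrained_model_name_or_path, trust_remote_code=True)
        config.vision_config.use_flash_attn = "false"
        # ...and pass the config through so the base class can layer the
        # common kwarg normalization on top.
        return super().from_pretrained(pretrained_model_name_or_path, *args, config=config, **kwargs)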
logger.warning("Updating low_cpu_mem_usage=False")

kwargs.update({"attn_implementation": "eager", "low_cpu_mem_usage": False})
model = cls._hf_auto_class.from_pretrained(pretrained_model_name_or_path, *args, **kwargs)
Instead of making these changes in the derived class's from_pretrained(), shouldn't we make them in the base class and call that from here? It might be better from a design perspective.
We think we can remove the QEFFTransformersBase class, but we will need a bigger discussion to decide this.
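For reference, a sketch of the base-class centralization being proposed, assuming QEFFTransformersBase keeps the _hf_auto_class hook shown in the diff above and that QEFFBaseModel comes from the source tree; illustrative only, not the PR's final design.

import logging

logger = logging.getLogger(__name__)

class QEFFTransformersBase(QEFFBaseModel):
    _hf_auto_class = None  # each derived auto class sets its HF counterpart

    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *args, **kwargs):
        # Normalize the kwargs every QEfficient auto class needs, so derived
        # classes only add their model-specific steps before calling super().
        if kwargs.get("attn_implementation", None) not in {None, "eager"}:
            logger.warning("Updating attn_implementation='eager'")
        if kwargs.get("low_cpu_mem_usage", None):
            logger.warning("Updating low_cpu_mem_usage=False")
        kwargs.update({"attn_implementation": "eager", "low_cpu_mem_usage": False})
        model = cls._hf_auto_class.from_pretrained(pretrained_model_name_or_path, *args, **kwargs)
        return cls(model)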
pretrained_model_name_or_path,
*args,
**kwargs,
):
    if kwargs.get("attn_implementation", None) not in {None, "eager"}:
Why are we not calling super().from_pretrained() here? It already has this piece of code.
Because we need to change config options in this method that are not needed for the other auto classes.
Adding generalized infrastructure to support VLMs with Dual/single QPC approaches --------- Signed-off-by: Amit Raj <[email protected]> Signed-off-by: Rishin Raj <[email protected]> Signed-off-by: Onkar Chougule <[email protected]> Co-authored-by: Rishin Raj <[email protected]> Co-authored-by: Amit Raj <[email protected]> Co-authored-by: Amit Raj <[email protected]> Co-authored-by: asmigosw <[email protected]> Signed-off-by: Hem Agnihotri <[email protected]>