yxf
74c937acc4
fix: Enhance multi-GPU scheduling tests and improve attention heads validation messages
5 months ago
yxf
775fc18d5d
fix: Exclude vision_config.num_attention_heads from the num_attention_heads validation check.
5 months ago
gitlawr
b24cb7466d
fix: update broken tests
5 months ago
thxCode
152e57f8a7
test(npu-smi): refine example and test case
...
Signed-off-by: thxCode <thxcode0824@gmail.com>
6 months ago
yxf
58af72bf08
fix: remove redundant formatting in schedule_cycle prompt messages; optimize compatibility check text.
6 months ago
thxCode
8cb1a38013
fix(selector/gguf): incorrect partial offloading in uma
...
Signed-off-by: thxCode <thxcode0824@gmail.com>
6 months ago
yxf
f16718c776
fix: correct compatibility check messages
6 months ago
gitlawr
e1a4a3fca2
chore: update transformers
6 months ago
yxf
013fc3d510
feat: skip select_multi_workers by distributed_inference_across_workers in MindIE
6 months ago
gitlawr
a3cb2445c5
fix: capitalize dataclass
6 months ago
gitlawr
5a816afe2b
refactor: migrate subordinate_workers
6 months ago
yxf
9647cd4237
feat: Update compatibility check messages for multi-worker/multi-GPU scenarios and refactor vllm_selector code structure
6 months ago
yxf
6812d39148
feat: enhance mindie and vllm scheduling messages
6 months ago
yxf
6c27bb8945
feat: clarifying compatibility check messages
6 months ago
gitlawr
ac42c6b21d
fix: flexible vendor validation
6 months ago
gitlawr
3f8d45ded1
fix: update trust-remote-code value error message
6 months ago
gitlawr
6714afa2ca
refactor: simplify test fixtures
6 months ago
thxCode
4e1d6d900c
fix(mindie): invalid world size in manual selection
...
Signed-off-by: thxCode <thxcode0824@gmail.com>
6 months ago
thxCode
c9f44f33aa
test(mindie): simplify fixture import
...
reuse workers information from one definition
Signed-off-by: thxCode <thxcode0824@gmail.com>
6 months ago
thxCode
777fd3d0cb
refactor(mindie): refine parallelism params
...
Signed-off-by: thxCode <thxcode0824@gmail.com>
6 months ago
thxCode
3c5c9feb6c
feat(util): introduce attribute path operator
...
Signed-off-by: thxCode <thxcode0824@gmail.com>
6 months ago
thxCode
a9fb44a43d
fix(mindie): resource fit selection
...
- fix mismatched world size selection
- fix no IP address provided devices selection
- fix wrong comparison in world size during multi-worker selection
- fix unavailable workers/devices selection
Signed-off-by: thxCode <thxcode0824@gmail.com>
6 months ago
thxCode
56bf30f199
refactor(mindie): introduce candidate selectors
...
Signed-off-by: thxCode <thxcode0824@gmail.com>
7 months ago
thxCode
383baac2e0
chore(selectors): support import from package
...
Signed-off-by: thxCode <thxcode0824@gmail.com>
7 months ago
thxCode
549750b739
refactor(detectors): collect device_index/device_chip_index
...
- collect device_index/device_chip_index for the device that owns two
chips in one card.
- align test cases.
Signed-off-by: thxCode <thxcode0824@gmail.com>
8 months ago
thxCode
14b70d472d
test(detectors): npu-smi adjust
...
do unit testing in the critical path,
focus on singular values and branches.
Signed-off-by: thxCode <thxcode0824@gmail.com>
8 months ago
gitlawr
d852b409f7
fix: min gguf unit test
8 months ago
gitlawr
807d7990b6
fix: update hub tests
8 months ago
gitlawr
ef0eb68c0a
chore: update catalog test
8 months ago
gitlawr
f69f51715e
fix: update modelscope catalog
9 months ago
gitlawr
ab89712999
feat: unit tests for command.py
9 months ago
gitlawr
291ff064f0
fix: update size test
9 months ago
gitlawr
0a5facdb40
fix: update non-LLM vllm claim
9 months ago
gitlawr
69c6fac56f
feat: validation for dist vllm limit per worker
9 months ago
gitlawr
5f312e9ed3
feat: add model evaluations
9 months ago
gitlawr
160fefa8de
refactor: rename claim dataclass
9 months ago
gitlawr
ffedf70089
refactor: move gpu filtering from worker filter to gguf resource selector
9 months ago
gitlawr
25cb8792ef
feat: model file management
10 months ago
gitlawr
d613118a38
feat: support distributed vLLM
10 months ago
michelia
1fe0d081e3
refactor: add advise message when pending
10 months ago
gitlawr
79b4e559c9
fix: infer max model len
10 months ago
michelia
048a462273
fix: setting ngl 0 doesn't work
10 months ago
michelia
704b413421
fix: only select main and rpc that can offload at least one layer to gpu
10 months ago
michelia
cafaf9294f
feat: support scheduling with customized ngl
10 months ago
michelia
8c130b51c4
fix: GPUs with insufficient vram selected by automatic scheduling
10 months ago
michelia
8b60bdbee4
refactor: update tests for scheduler
10 months ago
michelia
0534773b8c
refactor: enhance scheduler generate candidate combinations
11 months ago
michelia
6c846a3bef
chore: rename test fixture files
11 months ago
michelia
8b48d6daa9
feat: support hygon dcu
11 months ago
michelia
b6b6e23b87
fix: use the tensor-split generated from worker allocatable resources
11 months ago