Pull requests: nod-ai/shark-ai
Pull requests list
[sharktank] in device creation add arg to specify tensor parallelism
#1554
opened Jun 2, 2025 by
sogartar
Bump IREE requirement pins to 3.5.0rc20250602
#1553
opened Jun 2, 2025 by
shark-pr-automator
bot
[shortfin] Implement run_in_executor for PyWorkerEventLoop
#1551
opened Jun 1, 2025 by
vinayakdsci
[sharktank] Fix pipeline parallelism perplexity regression for toy deepseek
#1545
opened May 30, 2025 by
archana-ramalingam
•
Draft
Add Conv1/3DLayer and LayerNorm with normalized_shape for Wan2.1 model
#1541
opened May 30, 2025 by
AmosLewis
[sharktank] Fix which prefill logits eager perplexity calculations are using
#1540
opened May 30, 2025 by
Alex-Vasile
1 task
Remove argmax and topk functions from PagedLlmModelV1
#1536
opened May 29, 2025 by
stbaione
[mlir_kernel] Use mlir_kernel for injecting Wave kernels
#1530
opened May 29, 2025 by
aviator19941
Cache Storage to avoid allocate pinned memory for every prefill step and every decode step
#1516
opened May 27, 2025 by
dezhiAmd
Thread pool token selection
enhancement
#1515
opened May 27, 2025 by
stbaione
Handle more datatypes gracefully in the dump_gguf tool
#1508
opened May 22, 2025 by
KyleHerndon
[Shortfin][LLM] Add initial support for disaggregated invocations
#1463
opened May 16, 2025 by
vinayakdsci
•
Draft