[V3 proposal] Improved defaults for quantization and device selection #960

Open
@xenova

Feature request

Currently, Transformers.js V3 defaults to CPU (WASM) instead of GPU (WebGPU) because WebGPU support is missing or unstable in some browsers (specifically Firefox and Safari, and Chrome on Ubuntu). However, this gives a poor user experience because performance is left on the table. As browser support for WebGPU increases (currently ~70%), this will matter more, since more users will see slow inference even though better settings are available to them.

A better approach would be to use device: "auto" instead of device: null by default, which would select (1) the quantization and (2) the device based on the following:

  1. Browser support (e.g., whether WebGPU is enabled)
  2. Device capabilities (OS, mobile vs. desktop, fp16 support)
  3. Model architecture/type (BERT models are more likely to succeed than encoder-decoder models) - some models have ops which are not supported in WebGPU.
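
To make the three criteria above concrete, here is a minimal sketch of what resolving device: "auto" could look like. It is not the Transformers.js implementation: `resolveAuto`, `detectCapabilities`, `Capabilities`, and the `webgpuDenylist` parameter are hypothetical names, and the fallback choices (q8 on WASM, fp16 on WebGPU when available, fp32 otherwise) are assumptions for illustration. OS and mobile-vs-desktop heuristics are left out. The only browser APIs used are `navigator.gpu.requestAdapter()` and the adapter's `"shader-f16"` feature flag.

```typescript
// Hypothetical types for illustration; not part of the Transformers.js API.
type Device = "webgpu" | "wasm";
type Dtype = "fp32" | "fp16" | "q8";

interface Capabilities {
  webgpuAvailable: boolean; // a WebGPU adapter could be obtained
  fp16Supported: boolean;   // the adapter reports the "shader-f16" feature
}

// Pure decision function: browser support, device capabilities, model type.
// `webgpuDenylist` holds model types assumed to use ops unsupported on WebGPU;
// which model types belong there is left open in this sketch.
function resolveAuto(
  caps: Capabilities,
  modelType: string,
  webgpuDenylist: ReadonlySet<string> = new Set<string>(),
): { device: Device; dtype: Dtype } {
  if (!caps.webgpuAvailable || webgpuDenylist.has(modelType)) {
    // Assumed fallback: CPU (WASM) with 8-bit quantized weights.
    return { device: "wasm", dtype: "q8" };
  }
  // On WebGPU, use half precision only when the adapter supports it.
  return { device: "webgpu", dtype: caps.fp16Supported ? "fp16" : "fp32" };
}

// Best-effort capability probe. Outside a WebGPU-capable browser,
// `navigator.gpu` is undefined and everything resolves to false.
async function detectCapabilities(): Promise<Capabilities> {
  const nav = (globalThis as any).navigator;
  let adapter: any = null;
  try {
    adapter = nav?.gpu ? await nav.gpu.requestAdapter() : null;
  } catch {
    adapter = null; // treat any failure as "WebGPU not available"
  }
  return {
    webgpuAvailable: adapter != null,
    fp16Supported: adapter?.features?.has("shader-f16") ?? false,
  };
}
```

The resolved pair could then be passed through the existing `device` and `dtype` options when creating a pipeline. Keeping the decision in a pure function separate from the probe makes the policy easy to unit-test and to adjust as browser support changes.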

Motivation

Improve user experience and performance with better defaults.

Your contribution

Will work with @FL33TW00D on this


Labels: enhancement
