Question
Recently, I've been using transformers.js extensively to load transformer models, and kudos to the team for this wonderful library!
Specifically, I've been experimenting with version 2.15.0 of transformers.js.
Although the model runs on the WebAssembly backend, I've noticed some slowness during inference. To address this, I tried WebGPU inference using the v3 branch. However, the inference time did not meet my expectations.
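
For context, here is a minimal sketch of the kind of WebGPU invocation I mean, assuming the v3 branch's `device` option on `pipeline` (the task, model name, and input are placeholders, not my exact setup):

```js
import { pipeline } from '@xenova/transformers'; // v3 branch build

// Assumption: the v3 branch selects the backend via the `device` option;
// v2.15.0 only runs on the WASM backend.
const extractor = await pipeline(
  'feature-extraction',            // placeholder task
  'Xenova/all-MiniLM-L6-v2',       // placeholder model
  { device: 'webgpu' }
);

const output = await extractor('An example sentence to embed.', {
  pooling: 'mean',
  normalize: true,
});
console.log(output);
```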
Is it possible for WebGPU to significantly accelerate the inference time?