[Doc] Explain the effect of length in Wav2Vec2Model #1889
Comments
Hi @hihunjin. When batching multiple audios with different durations, the resulting Tensor has padding for the shorter audios. Say I create a batch from a 1-second audio and a 0.8-second audio, both single-channel and sampled at 16 kHz. The resulting batch Tensor will be in shape of … By providing the length Tensor, …
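To make the batching scenario concrete, here is a minimal sketch of how such a padded batch and its length Tensor could be built. It is an illustration, not code from the thread: the variable names are made up, random noise stands in for real audio, and `pad_sequence` is just one convenient way to zero-pad.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Two mono waveforms at 16 kHz (random noise as a stand-in for real audio).
wav_a = torch.randn(16_000)  # 1.0 second
wav_b = torch.randn(12_800)  # 0.8 second

# Zero-pad the shorter waveform so both fit into one (batch, time) Tensor.
waveforms = pad_sequence([wav_a, wav_b], batch_first=True)

# Number of valid (non-padded) samples for each item in the batch.
lengths = torch.tensor([wav_a.shape[0], wav_b.shape[0]])

print(waveforms.shape)  # torch.Size([2, 16000])
print(lengths)          # tensor([16000, 12800])
```

So `lengths` counts valid samples per batch item; it is not a sample rate. The pair would then be passed as `model(waveforms, lengths)`, so the model can tell real audio from the trailing zeros.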
Length computation in convolution layer: audio/torchaudio/models/wav2vec2/components.py, lines 62 to 65 in e885204
Mask computation in Transformer layer: audio/torchaudio/models/wav2vec2/components.py, lines 442 to 449 in e885204
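The two steps referenced here can be sketched in plain Python. This is a simplified illustration, not the code in `components.py`: the helper names are hypothetical, and the (kernel size, stride) pairs are an assumption based on the commonly used wav2vec 2.0 base feature extractor.

```python
# Assumed (kernel_size, stride) per convolution layer; not taken from the thread.
CONV_LAYERS = [(10, 5)] + [(3, 2)] * 4 + [(2, 2)] * 2


def conv_output_length(length, kernel_size, stride):
    """Valid output frames of an unpadded 1-D convolution, clamped at zero."""
    return max(0, (length - kernel_size) // stride + 1)


def feature_lengths(num_samples):
    """Propagate a valid-sample count through every convolution layer."""
    length = num_samples
    for kernel_size, stride in CONV_LAYERS:
        length = conv_output_length(length, kernel_size, stride)
    return length


def padding_mask(lengths):
    """True marks frames that come from padding and should be ignored."""
    max_len = max(lengths)
    return [[t >= n for t in range(max_len)] for n in lengths]


frames = [feature_lengths(n) for n in (16000, 12800)]
mask = padding_mask(frames)

print(frames)        # [49, 39]
print(sum(mask[1]))  # 10
```

Under these assumptions, the 1-second audio yields 49 feature frames and the 0.8-second audio yields 39, so the last 10 frames of the second item are padding that the attention layers should mask out.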
Thanks a lot. I appreciate it.
Glad to help.
🚀 The feature
More specific explanation in the docs.
Here, I need a more detailed explanation of length. Is it a sample rate?
Motivation, pitch
It's hard to understand what this argument is and how it affects the model.
Alternatives
No response
Additional context
No response