     MPS: _scaled_dot_product_attention_math_mps
   tags: nondeterministic_seeded

-- func: _scaled_dot_product_flash_attention(Tensor query, Tensor key, Tensor value, Tensor? attn_bias=None, float dropout_p=0.0, bool is_causal=False, bool return_debug_mask=False, *, float? scale=None) -> (Tensor output, Tensor logsumexp, Tensor cum_seq_q, Tensor cum_seq_k, SymInt max_q, SymInt max_k, Tensor philox_seed, Tensor philox_offset, Tensor debug_attn_mask)
+- func: _scaled_dot_product_flash_attention(Tensor query, Tensor key, Tensor value, Tensor? attn_bias, float dropout_p=0.0, bool is_causal=False, bool return_debug_mask=False, *, float? scale=None) -> (Tensor output, Tensor logsumexp, Tensor cum_seq_q, Tensor cum_seq_k, SymInt max_q, SymInt max_k, Tensor philox_seed, Tensor philox_offset, Tensor debug_attn_mask)
   dispatch:
     CUDA: _scaled_dot_product_flash_attention_cuda
     NestedTensorCUDA: _scaled_dot_product_flash_attention_nestedtensor_cuda
[...]
     CompositeExplicitAutograd: _scaled_dot_product_fused_attention_overrideable
   tags: nondeterministic_seeded

-- func: _scaled_dot_product_flash_attention_backward(Tensor grad_out, Tensor query, Tensor key, Tensor value, Tensor? attn_bias=None, Tensor out, Tensor logsumexp, Tensor cum_seq_q, Tensor cum_seq_k, SymInt max_q, SymInt max_k, float dropout_p, bool is_causal, Tensor philox_seed, Tensor philox_offset, *, float? scale=None) -> (Tensor grad_query, Tensor grad_key, Tensor grad_value)
+- func: _scaled_dot_product_flash_attention_backward(Tensor grad_out, Tensor query, Tensor key, Tensor value, Tensor? attn_bias, Tensor out, Tensor logsumexp, Tensor cum_seq_q, Tensor cum_seq_k, SymInt max_q, SymInt max_k, float dropout_p, bool is_causal, Tensor philox_seed, Tensor philox_offset, *, float? scale=None) -> (Tensor grad_query, Tensor grad_key, Tensor grad_value)
   device_check: NoCheck
   variants: function
   dispatch:
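In the native_functions.yaml schema language, changing "Tensor? attn_bias=None" to "Tensor? attn_bias" keeps the argument nullable but drops its default, so every caller of the patched ops must now pass the bias explicitly (as None when there is no bias). A minimal toy sketch of that distinction, assuming the torch.library Python API available in recent PyTorch releases; the op mylib::toy_attn_bias is made up for illustration and is not part of this change:

    import torch

    # Same nullable-but-required pattern as the schemas above: "Tensor? attn_bias"
    # has no "=None", so callers must always supply it (possibly as None).
    torch.library.define(
        "mylib::toy_attn_bias", "(Tensor q, Tensor? attn_bias) -> Tensor"
    )

    @torch.library.impl("mylib::toy_attn_bias", "cpu")
    def toy_attn_bias(q, attn_bias):
        # The kernel sees a plain Python None when no bias is given.
        return q if attn_bias is None else q + attn_bias

    q = torch.randn(2, 3)
    torch.ops.mylib.toy_attn_bias(q, None)   # OK: bias passed explicitly as None
    # torch.ops.mylib.toy_attn_bias(q)       # fails: attn_bias has no default value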
@@ -14915,13 +14915,13 @@
     CUDA: _scaled_dot_product_cudnn_attention_backward_cuda
   tags: nondeterministic_seeded

-- func: _flash_attention_forward(Tensor query, Tensor key, Tensor value, Tensor? attn_bias=None, Tensor? cum_seq_q, Tensor? cum_seq_k, SymInt max_q, SymInt max_k, float dropout_p, bool is_causal, bool return_debug_mask, *, float? scale=None, SymInt? window_size_left=None, SymInt? window_size_right=None, Tensor? seqused_k=None, Tensor? alibi_slopes=None) -> (Tensor output, Tensor softmax_logsumexp, Tensor philox_seed, Tensor philox_offset, Tensor debug_attn_mask)
+- func: _flash_attention_forward(Tensor query, Tensor key, Tensor value, Tensor? attn_bias, Tensor? cum_seq_q, Tensor? cum_seq_k, SymInt max_q, SymInt max_k, float dropout_p, bool is_causal, bool return_debug_mask, *, float? scale=None, SymInt? window_size_left=None, SymInt? window_size_right=None, Tensor? seqused_k=None, Tensor? alibi_slopes=None) -> (Tensor output, Tensor softmax_logsumexp, Tensor philox_seed, Tensor philox_offset, Tensor debug_attn_mask)
   variants: function
   dispatch:
     CUDA: _flash_attention_forward
   tags: nondeterministic_seeded

-- func: _flash_attention_backward(Tensor grad_out, Tensor query, Tensor key, Tensor value, Tensor? attn_bias=None, Tensor out, Tensor logsumexp, Tensor cum_seq_q, Tensor cum_seq_k, SymInt max_q, SymInt max_k, float dropout_p, bool is_causal, Tensor philox_seed, Tensor philox_offset, *, float? scale=None, SymInt? window_size_left=None, SymInt? window_size_right=None) -> (Tensor, Tensor, Tensor)
+- func: _flash_attention_backward(Tensor grad_out, Tensor query, Tensor key, Tensor value, Tensor? attn_bias, Tensor out, Tensor logsumexp, Tensor cum_seq_q, Tensor cum_seq_k, SymInt max_q, SymInt max_k, float dropout_p, bool is_causal, Tensor philox_seed, Tensor philox_offset, *, float? scale=None, SymInt? window_size_left=None, SymInt? window_size_right=None) -> (Tensor, Tensor, Tensor)
   device_check: NoCheck
   variants: function
   dispatch:
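Caller-side, all four schema changes have the same effect: attn_bias must be supplied explicitly whenever these private ops are invoked directly through the dispatcher. A rough sketch of such a call, assuming a CUDA build in which the patched _scaled_dot_product_flash_attention schema above is the one actually registered (the argument order follows this diff, not necessarily the schema shipped in stock PyTorch):

    import torch

    # Shapes: (batch, num_heads, seq_len, head_dim); the flash attention CUDA
    # kernel expects half-precision inputs on a CUDA device.
    q = torch.randn(2, 8, 128, 64, device="cuda", dtype=torch.float16)
    k = torch.randn(2, 8, 128, 64, device="cuda", dtype=torch.float16)
    v = torch.randn(2, 8, 128, 64, device="cuda", dtype=torch.float16)

    # Per the patched schema, attn_bias is the fourth positional argument and
    # no longer has a default, so it is passed explicitly (None = no bias).
    outputs = torch.ops.aten._scaled_dot_product_flash_attention(
        q, k, v,
        None,        # attn_bias
        0.0,         # dropout_p
        False,       # is_causal
        False,       # return_debug_mask
        scale=None,  # keyword-only
    )
    output, logsumexp = outputs[0], outputs[1]  # followed by cum_seq/max/philox/debug outputs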