
RuntimeError: 0 <= device.index() && device.index() < static_cast<c10::DeviceIndex>(device_ready_queues_.size()) INTERNAL ASSERT FAILED at "/build/pytorch/torch/csrc/autograd/engine.cpp":1418 #571

Open

SoldierWz

Describe the bug

When this problem first occurred, I tried disabling the CPU core and was then able to run, but the results were very poor: accuracy dropped sharply and training took much longer. I reported that in issue #565. When I restored the CPU core, the error above appeared.
Here is the part of the code where the problem occurs.
device = 'xpu'
for train_idx, test_idx in kf.split(X_tensor):
    X_train, X_test = X_tensor[train_idx], X_tensor[test_idx]
    y_train, y_test = y_tensor[train_idx], y_tensor[test_idx]

    train_dataset = CustomDataset(X_train, y_train)
    train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

    model = MLP(X_train.shape[1])
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)

    model = model.to("xpu")
    criterion = criterion.to("xpu")
    model, optimizer = ipex.optimize(model, optimizer=optimizer)
    for epoch in range(1000):
        model.train()
        for features, labels in train_loader:
            features, labels = features.to(device), labels.to(device)
            optimizer.zero_grad()
            outputs = model(features)
            loss = criterion(outputs, labels)
            loss.backward()  # <-- the RuntimeError is raised here
            optimizer.step()

Versions

wget https://github.com/raw/intel/intel-extension-for-pytorch/master/scripts/collect_env.py

For security purposes, please check the contents of collect_env.py before running it.

python collect_env.py

Activity

jgong5 commented on Mar 26, 2024

May I know what you mean by "disable CPU core"? It sounds like no GPU was found according to the error message. But we should report more meaningful error messages. cc @gujinghui
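
A quick way to confirm whether the XPU device is visible to PyTorch at all (which the error message suggests it is not) is to query the xpu runtime right after the imports. A minimal sketch, assuming an XPU-enabled build of intel_extension_for_pytorch that exposes the torch.xpu namespace:

import torch
import intel_extension_for_pytorch as ipex  # registers the 'xpu' backend with torch

# These calls assume an XPU build of intel_extension_for_pytorch; if no GPU is
# visible to the runtime, device_count() should report 0, and moving tensors to
# "xpu" or running backward() on them can fail with internal errors like the one above.
print("xpu available:", torch.xpu.is_available())
print("xpu device count:", torch.xpu.device_count())
if torch.xpu.is_available():
    print("device name:", torch.xpu.get_device_name(0))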

SoldierWz (Author) commented on Mar 27, 2024

> May I know what you mean by "disable CPU core"? It sounds like no GPU was found according to the error message. But we should report more meaningful error messages. cc @gujinghui

I edited the GRUB configuration file and changed
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
to
GRUB_CMDLINE_LINUX_DEFAULT="nohz=off"
There is another line, GRUB_CMDLINE_LINUX="i915.enable_hangcheck=0", which I did not change.
After editing it like this, the GPU can be used.
But I just tried again and a new problem occurred; the error is reported below.

ImportError Traceback (most recent call last)
Cell In[2], line 8
6 import modin.pandas as pd
7 import numpy as np
----> 8 import torch
9 import intel_extension_for_pytorch as ipex
10 import torch.nn as nn

File ~/mambaforge/envs/pytorch-arc/lib/python3.11/site-packages/torch/__init__.py:235
233 if USE_GLOBAL_DEPS:
234 _load_global_deps()
--> 235 from torch._C import * # noqa: F403
237 # Appease the type checker; ordinarily this binding is inserted by the
238 # torch._C module initialization code in C
239 if TYPE_CHECKING:

ImportError: /home/wangzhen/mambaforge/envs/pytorch-arc/lib/python3.11/site-packages/torch/lib/libtorch_cpu.so: undefined symbol: iJIT_NotifyEvent

amontse commented on Jul 11, 2024

I have an observation: import intel_extension_for_pytorch for xpu immediately after import torch and before any invocation of torch methods, even if you are not going to do anything with xpu yet.

I encountered the same INTERNAL ASSERTION at loss.backward() if I did the following:

import torch
# INTERNAL ASSERTION at loss.backward() if the import below is not invoked
# import intel_extension_for_pytorch as ipex
…
data = …
model = …
loss_fn = …
optimizer = …
# training using cpu
for epoch in range(epochs):
    …
    loss.backward()
    …

# repeat the work above but on xpu
import intel_extension_for_pytorch as ipex
data = data.to('xpu')
model = model.to('xpu')
loss_fn = loss_fn.to('xpu')
model, optimizer = ipex.optimize(model, optimizer=optimizer)
# training using xpu
for epoch in range(epochs):
    …
    # INTERNAL ASSERTION at loss.backward() if import intel_extension_for_pytorch is not invoked immediately after import torch
    loss.backward()
    …

The problem is solved if the commented-out import statement near the top of the above code is uncommented.
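
To make the suggested ordering concrete, here is a minimal sketch of the workaround described above, assuming an XPU-enabled build of intel_extension_for_pytorch; the toy model and data are placeholders, not the original code:

# Import intel_extension_for_pytorch immediately after torch and before any
# torch call, even though the CPU phase runs first and xpu is used only later.
import torch
import intel_extension_for_pytorch as ipex

import torch.nn as nn
import torch.optim as optim

# Hypothetical toy model and data, only to keep the sketch self-contained.
model = nn.Linear(8, 2)
data = torch.randn(32, 8)
target = torch.randint(0, 2, (32,))
loss_fn = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

# Training pass on CPU first.
loss = loss_fn(model(data), target)
loss.backward()
optimizer.step()

# Then the same work on xpu; per the observation above, importing ipex before
# any torch work is what avoids the INTERNAL ASSERT in loss.backward().
if torch.xpu.is_available():
    data, target = data.to('xpu'), target.to('xpu')
    model = model.to('xpu')
    model, optimizer = ipex.optimize(model, optimizer=optimizer)
    optimizer.zero_grad()
    loss = loss_fn(model(data), target)
    loss.backward()
    optimizer.step()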


Metadata

Labels

ARC GPU

Participants

@jgong5 @gujinghui @amontse @ZhaoqiongZ @SoldierWz