🐛 Describe the bug
Build https://github.com/pytorch/pytorch/actions/runs/13724713611/job/38388317099?pr=148740 is stuck at the "Calculate docker image" step, which checks whether the image already exists:
+ [[ 1741362495 -lt 1741364292 ]]
+ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-focal-cuda12.6-cudnn9-py3-gcc11:c097a94c03da3be2f692f9ff22e3963e933633cf
denied: User: arn:aws:sts::391835788720:assumed-role/ghci-lf-github-action-runners-runner-role/i-0e98877505f067739 is not authorized to perform: ecr:BatchGetImage on resource: arn:aws:ecr:us-east-1:308535385114:repository/pytorch/pytorch-linux-focal-cuda12.6-cudnn9-py3-gcc11 because no resource-based policy allows the ecr:BatchGetImage action
+ '[' false == true ']'
+ sleep 300
++ date +%s
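This logic was added by pytorch/test-infra#6013, but it does not appear to work right now: the runner role is denied ecr:BatchGetImage on the repository, so the existence check fails and the step keeps sleeping for 300 seconds at a time. This looks like some sort of security restriction. (Though all runners should have read access to ECR, shouldn't they?)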
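One way the polling could avoid hanging on a permissions error is sketched below. This is not the code from pytorch/test-infra#6013; it is a rough reconstruction of the loop shape visible in the trace above (compare `date +%s` to a deadline, run `docker manifest inspect`, `sleep 300`), with one assumed change: fail fast when the registry answers "denied". The function names `classify_manifest_error` and `wait_for_image`, the return codes, and the matched error substrings are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify the stderr text of `docker manifest inspect`.
# Prints "denied" for authorization failures and "missing" for anything else.
# The matched substrings are taken from the log in this issue; other
# registries may word their errors differently.
classify_manifest_error() {
  case "$1" in
    *"denied:"*|*"not authorized"*) echo "denied" ;;
    *) echo "missing" ;;
  esac
}

# Hypothetical polling loop: wait until IMAGE exists or DEADLINE (epoch
# seconds) passes. Returns 0 if the image was found, 1 on timeout, and 2 as
# soon as the registry reports an authorization error.
wait_for_image() {
  local image="$1" deadline="$2" interval="${3:-300}" err
  while [ "$(date +%s)" -lt "$deadline" ]; do
    # Capture stderr only; the manifest JSON on stdout is discarded.
    if err="$(docker manifest inspect "$image" 2>&1 >/dev/null)"; then
      return 0
    fi
    if [ "$(classify_manifest_error "$err")" = "denied" ]; then
      echo "access denied while inspecting $image: $err" >&2
      return 2
    fi
    sleep "$interval"
  done
  return 1
}
```

With something like this, the job in this report would presumably have exited right after the first `denied:` response instead of sleeping until the deadline. Whether failing the job is the right reaction (as opposed to falling back to building the image) is a separate design decision.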
Versions
CI
cc @seemethere @pytorch/pytorch-dev-infra