This repository was archived by the owner on Nov 16, 2023. It is now read-only.

Having trouble trying NNI with FrameworkController on on-premise k8s #73

@juniroc

I ran into a problem when trying NNI with FrameworkController on an on-premise k8s cluster.

I followed this document:
https://nni.readthedocs.io/en/stable/TrainingService/FrameworkControllerMode.html

  • Create the ServiceAccount and ClusterRoleBinding
kubectl create serviceaccount frameworkcontroller --namespace default
kubectl create clusterrolebinding frameworkcontroller \
  --clusterrole=cluster-admin \
  --user=system:serviceaccount:default:frameworkcontroller
  • Then create the StatefulSet from
    framework-config.yml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: frameworkcontroller
  namespace: default
spec:
  serviceName: frameworkcontroller
  selector:
    matchLabels:
      app: frameworkcontroller
  replicas: 1
  template:
    metadata:
      labels:
        app: frameworkcontroller
    spec:
      # Using the ServiceAccount with granted permission
      # if the k8s cluster enforces authorization.
      serviceAccountName: frameworkcontroller
      containers:
      - name: frameworkcontroller
        image: frameworkcontroller/frameworkcontroller
        # Using k8s inClusterConfig, so usually, no need to specify
        # KUBE_APISERVER_ADDRESS or KUBECONFIG
        # env:
        # - name: KUBE_APISERVER_ADDRESS
        #   value: {http[s]://host:port}
        # - name: KUBECONFIG
        #   value: ~/.kube/config
kubectl apply -f framework-config.yml
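
As a sanity check on the setup above, a small sketch can confirm that the user named in the ClusterRoleBinding refers to the same ServiceAccount the StatefulSet runs as. Kubernetes names ServiceAccount users as system:serviceaccount:<namespace>:<name>. The helper parse_service_account_user is a hypothetical name introduced only for this illustration; the three values are copied from the commands and manifest above.

```python
from typing import Tuple


def parse_service_account_user(user: str) -> Tuple[str, str]:
    """Split 'system:serviceaccount:<namespace>:<name>' into (namespace, name)."""
    prefix = "system:serviceaccount:"
    if not user.startswith(prefix):
        raise ValueError(f"not a ServiceAccount user name: {user!r}")
    namespace, sep, name = user[len(prefix):].partition(":")
    if not sep or not namespace or not name:
        raise ValueError(f"malformed ServiceAccount user name: {user!r}")
    return namespace, name


# Value passed to --user in the clusterrolebinding command above.
binding_user = "system:serviceaccount:default:frameworkcontroller"

# Values taken from metadata.namespace and spec.template.spec.serviceAccountName
# in framework-config.yml above.
manifest_namespace = "default"
manifest_service_account = "frameworkcontroller"

matches = parse_service_account_user(binding_user) == (
    manifest_namespace,
    manifest_service_account,
)
print(matches)  # True: the binding and the StatefulSet refer to the same account
```

With the values from this post the two agree, so the binding at least targets the account the controller pod uses.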

When I ran kubectl get po, I got:
[image]

Then I checked the logs with kubectl logs frameworkcontroller-0:

[image]

The log says:

updateRemoteFrameworkStatus: 
Failed: Framework.frameworkcontroller.microsoft.com "nniexpqfwzydmkenvfpwjw" is invalid:
spec.taskRoles.task.podGracefulDeletionTimeoutSec: Invalid value: "null":
spec.taskRoles.task.podGracefulDeletionTimeoutSec in body must be of type integer: "null"

What should I do to solve this problem?
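
Read literally, the message says that the field podGracefulDeletionTimeoutSec reached the API server as null, while the Framework schema expects an integer there. As a minimal sketch of how to spot such fields before an object is submitted, the snippet below walks a nested object and lists every path whose value is null. The function find_null_fields and the trimmed framework dict are hypothetical; the dict is shaped only to mirror the path printed in the log, not the real Framework layout.

```python
from typing import Any, Iterator


def find_null_fields(node: Any, path: str = "") -> Iterator[str]:
    """Yield the dotted path of every field whose value is None (JSON/YAML null)."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            if value is None:
                yield child
            else:
                yield from find_null_fields(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            child = f"{path}[{index}]"
            if item is None:
                yield child
            else:
                yield from find_null_fields(item, child)


# Hypothetical, heavily trimmed object. Only the field name and its path
# come from the log; taskNumber is a placeholder for the other fields.
framework = {
    "spec": {
        "taskRoles": {
            "task": {
                "taskNumber": 1,
                "podGracefulDeletionTimeoutSec": None,
            }
        }
    }
}

null_paths = list(find_null_fields(framework))
print(null_paths)  # ['spec.taskRoles.task.podGracefulDeletionTimeoutSec']
```

Whether the null should be replaced by an integer or the key left out entirely depends on how the installed FrameworkController schema treats the field, which this post does not show.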
