Delete all cortex-created AWS resources when deleting a cluster #2161

Merged: 14 commits, May 10, 2021
233 changes: 163 additions & 70 deletions cli/cmd/cluster.go

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions dev/minimum_aws_policy.json
@@ -76,6 +76,7 @@
  "iam:ListInstanceProfiles",
  "logs:CreateLogGroup",
  "logs:PutLogEvents",
+ "logs:DeleteLogGroup",
  "iam:CreateOpenIDConnectProvider",
  "iam:GetOpenIDConnectProvider",
  "iam:GetRolePolicy"
12 changes: 6 additions & 6 deletions docs/clients/cli.md
@@ -164,12 +164,12 @@ Usage:
   cortex cluster down [flags]

 Flags:
-  -c, --config string   path to a cluster configuration file
-  -n, --name string     name of the cluster
-  -r, --region string   aws region of the cluster
-  -y, --yes             skip prompts
-      --keep-volumes    keep cortex provisioned persistent volumes
-  -h, --help            help for down
+  -c, --config string        path to a cluster configuration file
+  -n, --name string          name of the cluster
+  -r, --region string        aws region of the cluster
+  -y, --yes                  skip prompts
+      --keep-aws-resources   skip deletion of resources that cortex provisioned on aws (bucket contents, ebs volumes, log group)
+  -h, --help                 help for down
 ```

 ## cluster export
1 change: 1 addition & 0 deletions docs/clusters/management/auth.md
@@ -141,6 +141,7 @@ Replace the following placeholders with their respective values in the policy te
  "iam:ListInstanceProfiles",
  "logs:CreateLogGroup",
  "logs:PutLogEvents",
+ "logs:DeleteLogGroup",
  "iam:CreateOpenIDConnectProvider",
  "iam:GetOpenIDConnectProvider",
  "iam:GetRolePolicy"
24 changes: 4 additions & 20 deletions docs/clusters/management/delete.md
@@ -4,34 +4,18 @@
 cortex cluster down
 ```

-## Delete metadata and log groups
+## Bucket Contents

-Since you may wish to have access to your data after spinning down your cluster, Cortex's bucket, log groups, and
-Prometheus volume are not automatically deleted when running `cortex cluster down`.
-
-To delete them:
-
-```bash
-# identify the name of your cortex S3 bucket
-aws s3 ls
-
-# delete the S3 bucket
-aws s3 rb --force s3://<bucket>
-
-# delete the log group (replace <cluster_name> with the name of your cluster, default: cortex)
-aws logs describe-log-groups --log-group-name-prefix=<cluster_name> --query logGroups[*].[logGroupName] --output text | xargs -I {} aws logs delete-log-group --log-group-name {}
-```
+When a Cortex cluster is created, an S3 bucket is created for its internal use. When running `cortex cluster down`, a lifecycle rule is applied to the bucket such that its entire contents are removed within the next 24 hours. You can safely delete the bucket at any time after `cortex cluster down` has finished running.

 ## Delete Certificates

 If you've configured a custom domain for your APIs, you can remove the SSL Certificate and Hosted Zone for the domain by
 following these [instructions](../networking/custom-domain.md#cleanup).

-## Keep Cortex Volumes
+## Keep Cortex Resources

-The volumes used by Cortex's Prometheus and Grafana instances are deleted by default on a cluster down operation.
-If you want to keep the metrics and dashboards volumes for any reason,
-you can pass the `--keep-volumes` flag to the `cortex cluster down` command.
+The contents of Cortex's S3 bucket, the EBS volumes (used by Cortex's Prometheus and Grafana instances), and the log group are deleted by default when running `cortex cluster down`. If you want to keep these resources, you can pass the `--keep-aws-resources` flag to the `cortex cluster down` command.

 ## Troubleshooting

11 changes: 11 additions & 0 deletions pkg/lib/aws/cloudwatch.go
@@ -105,6 +105,17 @@ func (c *Client) CreateLogGroup(logGroup string, tags map[string]string) error {
 	return nil
 }

+func (c *Client) DeleteLogGroup(logGroup string) error {
+	_, err := c.CloudWatchLogs().DeleteLogGroup(&cloudwatchlogs.DeleteLogGroupInput{
+		LogGroupName: aws.String(logGroup),
+	})
+	if err != nil {
+		return errors.Wrap(err, "log group "+logGroup)
+	}
+
+	return nil
+}
+
 func (c *Client) TagLogGroup(logGroup string, tagMap map[string]string) error {
 	tags := map[string]*string{}
 	for key, value := range tagMap {
17 changes: 17 additions & 0 deletions pkg/lib/aws/iam.go
@@ -209,3 +209,20 @@ func (c *Client) DeletePolicy(policyARN string) error {
 	}
 	return nil
 }
+
+func (c *Client) GetPolicyOrNil(policyARN string) (*iam.Policy, error) {
+	policyOutput, err := c.IAM().GetPolicy(&iam.GetPolicyInput{
+		PolicyArn: aws.String(policyARN),
+	})
+	if err != nil {
+		if IsErrCode(err, iam.ErrCodeNoSuchEntityException) {
+			return nil, nil
+		}
+		return nil, errors.WithStack(err)
+	}
+
+	if policyOutput != nil {
+		return policyOutput.Policy, nil
+	}
+	return nil, nil
+}
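
`GetPolicyOrNil` above treats "no such entity" as a normal outcome rather than a failure, which lets a caller skip deleting a policy that is already gone. A minimal stdlib-only sketch of the same pattern follows. Everything in it is hypothetical and for illustration only: the `errNotFound` sentinel, the `policy` type, and the in-memory `lookup` stand in for the AWS SDK and are not part of this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel standing in for a provider's
// "no such entity" error code.
var errNotFound = errors.New("not found")

// policy is a hypothetical stand-in for an SDK resource type.
type policy struct {
	ARN string
}

// lookup simulates a remote "get" call backed by an in-memory map.
func lookup(store map[string]policy, arn string) (*policy, error) {
	p, ok := store[arn]
	if !ok {
		return nil, fmt.Errorf("policy %s: %w", arn, errNotFound)
	}
	return &p, nil
}

// getPolicyOrNil returns (nil, nil) when the policy does not exist and
// passes any other error through unchanged.
func getPolicyOrNil(store map[string]policy, arn string) (*policy, error) {
	p, err := lookup(store, arn)
	if err != nil {
		if errors.Is(err, errNotFound) {
			return nil, nil
		}
		return nil, err
	}
	return p, nil
}

func main() {
	store := map[string]policy{"arn:example:a": {ARN: "arn:example:a"}}

	for _, arn := range []string{"arn:example:a", "arn:example:b"} {
		p, err := getPolicyOrNil(store, arn)
		switch {
		case err != nil:
			fmt.Println("lookup failed:", err)
		case p == nil:
			fmt.Println(arn, "does not exist, nothing to delete")
		default:
			fmt.Println(arn, "exists")
		}
	}
}
```

With this shape the caller has to check both return values: a nil error no longer guarantees a non-nil result.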
18 changes: 18 additions & 0 deletions pkg/lib/errors/errors.go
@@ -17,6 +17,7 @@ limitations under the License.
 package errors

 import (
+	"fmt"
 	"strings"

 	s "github.com/cortexlabs/cortex/pkg/lib/strings"
@@ -37,3 +38,20 @@ func ErrorUnexpected(msgs ...interface{}) error {
 		Message: strings.Join(strs, ": "),
 	})
 }
+
+func ListOfErrors(errKind string, shouldPrint bool, errors ...error) error {
+	var errorsContents string
+	for i, err := range errors {
+		if err != nil {
+			errorsContents += fmt.Sprintf("error #%d: %s\n", i+1, err.Error())
+		}
+	}
+	if errorsContents == "" {
+		return nil
+	}
+	return WithStack(&Error{
+		Kind:     errKind,
+		Message:  errorsContents,
+		NoPrint:  !shouldPrint,
+	})
+}
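
`ListOfErrors` numbers each error by its argument position (`i+1`), so the numbering has gaps whenever some of the arguments are nil. The stdlib-only sketch below reproduces that behaviour. The `cluster.go` diff is not rendered on this page, so how the PR actually calls the helper is unknown. The three cleanup steps here are hypothetical, and `listOfErrors` is a simplified stand-in that returns a plain error instead of the package's `Error` type.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// listOfErrors mirrors the numbering of ListOfErrors: positions are
// 1-based, and nil entries are skipped but still counted.
func listOfErrors(errs ...error) error {
	var b strings.Builder
	for i, err := range errs {
		if err != nil {
			fmt.Fprintf(&b, "error #%d: %s\n", i+1, err.Error())
		}
	}
	if b.Len() == 0 {
		return nil
	}
	return errors.New(b.String())
}

// Hypothetical cleanup steps; the second one succeeds.
func deleteBucketContents() error { return errors.New("bucket: access denied") }
func deleteVolumes() error        { return nil }
func deleteLogGroup() error       { return errors.New("log group: throttled") }

func main() {
	// Every step runs before any failure is reported.
	err := listOfErrors(deleteBucketContents(), deleteVolumes(), deleteLogGroup())
	if err != nil {
		fmt.Print(err)
	}
	// prints:
	// error #1: bucket: access denied
	// error #3: log group: throttled
}
```

Because the arguments are evaluated before the call, each step runs even when an earlier one has failed, which suits best-effort teardown where one failed deletion should not block the rest.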