-
Notifications
You must be signed in to change notification settings - Fork 0
[SL-30] Updates to docs site #20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
04237d0
to
a098274
Compare
Minor tweaks for readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed all of the files except the AWS ones.
helm repo update | ||
``` | ||
|
||
5. Install MDAI dependencies via Helm chart |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused about this step. It doesn't give the command and is followed by a section on AWS and other stuff. Then we get step 6.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if you got an answer for this yet, but it is a little confusing.
- The red box on the screenshot is the command. What is being said is you can set up with or without self observability or AWS however we recommend you do.
- For the sake of the quick start, it may be good for us to put a note about if you do not set up AWS, you will see an error on the mdai-s3-log-reader, like it does on the 0.8 branch of the mdai-labs, but the the core of mdai will still work even with the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The manual install from 0.8 on mdai-labs may help as well.
|
||
6. Verify that the cluster's pods are running. | ||
``` | ||
kubectl get pods -n mdai |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I did this step, I got a couple of pods that couldn't be created:
mdai-s3-logs-reader-84b4659569-kvl7r 0/1 CreateContainerConfigError 0 28m
mdai-s3-logs-reader-9dc9c8764-dhzg2 0/1 CreateContainerConfigError 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually directly related to my comment above. Without setting up AWS credentials, you will receive this error, but everything should still work.
|
||
1. Apply the configuration to the hub resource. | ||
``` | ||
kubectl apply -f ./mdai/hub/0.8/hub_guaranteed_working.yaml -n mdai |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a warning that I've not seen before: "Warning: ObserverResource observer-collector does not define a replica count" -- should we say something about ignoring that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would ignore this for now. I will try getting you some info on this.
|
||
#### Install MDAI | ||
|
||
***Note**: Make sure you can access `values.yaml` your working directory. You have have to clone the `mdai-helm-chart` repo.* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove double word "have have"
|
||
>[!WARNING] | ||
> | ||
>Without this capability, you will not have access to our built-in, self-instrumentation that ensures visibility and accuracy of MDAI operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove comma "...built-in self-instrumentation"
@@ -54,10 +53,10 @@ An OpenTelemetry collector is a component that receives and forwards telemetry d | |||
|
|||
We'll use Fluentd to capture the synthetic log streams you created, and forward them to the collector. | |||
|
|||
1. From the [MDAI Example Config repo](https://github.com/DecisiveAI/configs/blob/main/synthetics/loggen_fluent_config.yaml), copy the `loggen_fluent_config.yaml` into your working directory. | |||
1. Your working directory should still be pointed at [MDAI Labs repo](https://github.com/DecisiveAI/mdai-labs/). Run the following command: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again remove the "working directory" advice. The fewer words, the lower the cognitive stress.
README.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tweaked the tweaks.
|
||
## About Self-monitoring | ||
|
||
The MDAI Smart Telemetry Hub contains complex infrastructure. To maintain and monitor operational excellence, we have included an opt-in capability to create an understanding internal metrics, audit history of change events, and log streams affording traceability across our services and their events. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"create an understanding of internal metrics" ?
|
||
The `mdai-helm-chart` installed `mdai-operator` and `mdai-gateway` expect a destination to send their logs to, but this chart does not manage deploying the logs destination for those services. | ||
|
||
The `mdai-operator` has the ability to manage an opinionated collector, via compatible configurations, called the `mdai-collector` (sometimes referred to as the `hub-monitor`). The `mdai-collector` receives from this fixed list of services and sends the logs to a compatible destination. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we mention what "this fixed list of services" is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little concerned about users being able to understand what this means for them in a quick start. I mentioned in another comment that this seems to be adding complexity a little early, without enough context.
@@ -33,14 +33,13 @@ But not all logs are created equal. Now that we're generating log data, let's fi | |||
|
|||
MDAI is monitoring services by a service identifier (log attribute), `mdai_service`, and a tolerance threshold over a rolling time window. When a given service surpasses that threshold, we'd like to drop non-critical data such as log lines with a level below WARN. | |||
|
|||
To do that, let's add a managed filter to `otel_config.yaml`, the collector's configuration file. Open the file for editing and look for the configuration block that looks like this: | |||
To do that, let's add a managed filter to `otel/0.8/otel_guaranteed_working.yaml`, the collector's configuration file. Open the file for editing and look for the configuration block that looks like this: | |||
|
|||
``` | |||
# filter/service_list: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the otel/0.8/otel_guaranteed_working.yaml
I'm pulling from mdai_labs right now all of that is already uncommented. So is the next line below that's supposed to be commented out. So that file needs to be updated as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And so when I apply this config, nothing seems to change?
> kubectl apply -f otel/0.8/otel_guaranteed_working.yaml --namespace mdai
opentelemetrycollector.opentelemetry.io/gateway unchanged
So I'm not sure the filter is actually doing anything. We used to have a filter that specified filtering out INFO level logs.
> | ||
> 🛑 **Warning** | ||
> | ||
> We do not recommend using this setup for a development or greater environment -- this is truly for the quickstart. If you're in AW EKS, we recommend [IRSA](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html) for a similar effect. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"If you're in AW EKS ..." → "AWS"?
|
||
##### Create / Use `.env` file to store Access Keys | ||
|
||
If you don't already, create a `.env` file. Then, update the `.env` file to have your credentials. It should look similar to the following: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"don't" → "haven't"
|
||
>[!NOTE] | ||
> | ||
>To stop logs from sending to s3, you will need to delete the MdaiCollector Custom Resource |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"the MdaiCollector Custom Resource" → "the MDAI Collector Custom Resource (CR)"
Looks like it's called "MDAI Collector" below; also below we start using "CR" so would be nice to show that.
@textractor Was trying to go through each of your comments on this PR, but maybe it would be better we work together on it some time this week because I'm a bit confused now and seemingly a lot has happened since your first comment. Sent you a DM, but adding here for visibility. |
Updating docs site