From 8f24c35ff93f8a3814833be612f96f82df9d4d10 Mon Sep 17 00:00:00 2001 From: Jacob Floyd Date: Mon, 6 Sep 2021 11:23:17 -0500 Subject: [PATCH 1/8] k8s_ha: Refactor intro paragraph and drop helm 2 reference --- docs/source/install/k8s_ha.rst | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/docs/source/install/k8s_ha.rst b/docs/source/install/k8s_ha.rst index 7c72cf59..3b40481e 100644 --- a/docs/source/install/k8s_ha.rst +++ b/docs/source/install/k8s_ha.rst @@ -4,11 +4,13 @@ This document provides an installation blueprint for a Highly Available StackStorm cluster based on `Kubernetes `__, a container orchestration platform at planet scale. -The cluster deploys a minimum of 2 replicas for each component of StackStorm microservices for redundancy and reliability. It -also configures backends like MongoDB HA Replicaset, RabbitMQ HA and Redis Sentinel cluster that st2 relies on for database, -communication bus, and distributed coordination respectively. That raises a fleet of more than ``30`` pods total. +A StackStorm HA cluster consists of 2 replicas for most StackStorm microservices for redundancy and reliability. +The cluster must also have access to backend services like MongoDB HA Replicaset, RabbitMQ HA and a Redis Sentinel cluster +that st2 relies on for database, communication bus, and distributed coordination respectively. These services are +included in the default StackStorm HA cluster, but StackStorm can also use services provisioned separately. +By default, the StackStorm HA cluster consists of a fleet of more than ``30`` pods. -The source code for K8s resource templates is available as a GitHub repo: +The source code for K8s resource templates (part of our Helm chart) is available as a GitHub repo: `StackStorm/stackstorm-ha `_. .. 
warning:: @@ -23,13 +25,13 @@ The source code for K8s resource templates is available as a GitHub repo: Requirements ------------ * `Kubernetes `__ cluster -* `Helm `__, the K8s package manager and `Tiller `_ +* `Helm `__ 3, the K8s package manager (Helm 2 is not supported) * Enough computing resources for production use, respecting :doc:`/install/system_requirements` Usage ----- This document assumes some basic knowledge of Kubernetes and Helm. -Please refer to `K8s `__ and `Helm `__ +Please refer to `K8s `__ and `Helm `__ documentation if you find any difficulties using these tools. However, here are some minimal instructions to get started. From 6c706a4e4f0a8e51a5b61ae7bde9dc07a0b59144 Mon Sep 17 00:00:00 2001 From: Jacob Floyd Date: Mon, 6 Sep 2021 12:21:45 -0500 Subject: [PATCH 2/8] k8s_ha: Refactor configurable values list - Note packs.volumes vs packs.images - Add more settings to the list of configurable values. - List auto-generated secrets and update the secrets warning. --- docs/source/install/k8s_ha.rst | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/docs/source/install/k8s_ha.rst b/docs/source/install/k8s_ha.rst index 3b40481e..106572b3 100644 --- a/docs/source/install/k8s_ha.rst +++ b/docs/source/install/k8s_ha.rst @@ -54,16 +54,17 @@ or ``st2`` CLI client: .. figure :: /_static/images/helm-chart-notes.png :align: center +.. todo:: Update this screenshot. It is out of date. The installation uses some unsafe defaults which we recommend you change for production use via Helm ``values.yaml``. Helm Values ___________ Helm package ``stackstorm-ha`` comes with default settings (see `values.yaml `_). -Fine-tune them to achieve desired configuration for the StackStorm HA K8s cluster. +Fine-tune them to achieve desired configuration for your StackStorm HA K8s cluster. .. note:: - Keep custom values you want to override in a separated yaml file so they won't get lost. 
+ Keep custom values you want to override in a separate yaml file so they won't get lost. Example: ``helm install -f custom_values.yaml`` or ``helm upgrade -f custom_values.yaml`` You can configure: @@ -73,13 +74,22 @@ You can configure: - st2.conf settings - RBAC roles, assignments and mappings (enterprise only for StackStorm v3.2 and before, open source for StackStorm v3.4 and later) -- custom st2 packs and its configs +- custom st2 packs (in persistent volumes or via custom docker images) and their configs - SSH private key -- K8s resources and settings to control pod/deployment placement -- Mongo, RabbitMQ clusters +- K8s resources, annotations, and settings to control pod/deployment placement +- Image tag and repository settings to select the ST2 version or use customized/private component images +- DNS and Ingress configuration +- Miscellaneous other ST2 cluster customizations +- Mongo, RabbitMQ, and Redis clusters + +If not defined, these values are auto-generated on install and preserved across upgrades: + +- SSH private key +- st2 auth secrets (i.e. the password for the st2admin user) .. warning:: - It's highly recommended to set your own secrets as the file contains unsafe defaults like SSH keys, StackStorm access credentials and MongoDB/RabbitMQ passwords! + It's highly recommended to set your own secrets to replace the unsafe defaults for the MongoDB and RabbitMQ subcharts! + If you disable the subcharts, make sure to secure the services and add the relevant secrets to st2.conf. Upgrading _________ From 968cf245c4a7d54b1c5bee6f201e4489c2a18a3d Mon Sep 17 00:00:00 2001 From: Jacob Floyd Date: Mon, 6 Sep 2021 19:50:44 -0500 Subject: [PATCH 3/8] k8s_ha: Reword a few sections for clarity. Also fix the channel. We use #k8s for the helm chart, not #docker. 
--- docs/source/install/k8s_ha.rst | 10 +++++----- docs/source/reference/ha.rst | 9 +++++---- 2 files changed, 10 insertions(+), 9 deletions(-) diff --git a/docs/source/install/k8s_ha.rst b/docs/source/install/k8s_ha.rst index 106572b3..bceac847 100644 --- a/docs/source/install/k8s_ha.rst +++ b/docs/source/install/k8s_ha.rst @@ -197,7 +197,7 @@ st2web ______ st2web is a StackStorm Web UI admin dashboard. By default, st2web K8s config includes a Pod Deployment and a Service. ``2`` replicas (configurable) of st2web serve the web app and proxy requests to st2auth, st2api, st2stream. -By default, st2web uses HTTP instead of HTTPS. We recommend you rely on ``LoadBalancer`` or ``Ingress`` to add HTTPS layer on top of it. +By default, st2web uses HTTP instead of HTTPS. We recommend you rely on ``LoadBalancer`` (a ``Service`` type) or ``Ingress`` to add HTTPS layer on top of it. .. note:: By default, st2web is a NodePort Service and is not exposed to the public net. @@ -221,7 +221,7 @@ if you are planning a high-volume environment. st2stream _________ -StackStorm st2stream - exposes a server-sent event stream, used by the clients like WebUI and ChatOps to receive updates from the st2stream server. +The StackStorm ``st2stream`` service exposes a server-sent event stream, used by the clients like WebUI and ChatOps to receive updates from the st2stream server. Similar to st2auth and st2api, st2stream K8s configuration includes Pod Deployment with ``2`` replicas for HA (can be increased in ``values.yaml``) and ClusterIP Service listening on port ``9102``. @@ -275,8 +275,8 @@ st2actionrunner _______________ Stackstorm workers that actually execute actions. ``5`` replicas for K8s Deployment are configured by default to increase StackStorm ability to execute actions without excessive queuing. -Relies on ``redis`` for coordination. This is likely the first thing to lift if you have a lot of actions -to execute per time period in your StackStorm cluster. 
+Relies on ``redis`` for coordination. The ``st2actionrunner`` replicas count is likely the first thing to increase if you have +a lot of actions to execute per time period in your StackStorm cluster. st2scheduler ____________ @@ -324,7 +324,7 @@ Feedback Needed! ---------------- As this deployment method new and beta is in progress, we ask you to try it and provide your feedback via bug reports, ideas, feature or pull requests in `StackStorm/stackstorm-ha `_, -and ecourage discussions in `Slack `_ ``#docker`` channel or write us an email. +and ecourage discussions in `Slack `_ ``#k8s`` channel or write us an email. .. only:: community diff --git a/docs/source/reference/ha.rst b/docs/source/reference/ha.rst index ba0f338d..06d3c8a0 100644 --- a/docs/source/reference/ha.rst +++ b/docs/source/reference/ha.rst @@ -18,7 +18,7 @@ a reference to layer on some HA deployment-specific details. .. note:: - A reproducible blueprint of StackStorm HA cluster is available as a code based on Docker and Kubernetes, see :doc:`/install/k8s_ha`. + A reproducible blueprint of StackStorm HA cluster is available as a helm chart, which is based on Docker and Kubernetes. See :doc:`/install/k8s_ha`. Components @@ -122,9 +122,10 @@ You have to have exactly one active ``st2timersengine`` process running to sched Having more than one active ``st2timersengine`` will result in duplicate timer events and therefore duplicate rule evaluations leading to duplicate workflows or actions. -In HA deployments, external monitoring needs to setup and a new ``st2timersengine`` process needs -to be spun up to address failover. Losing the ``st2timersengine`` will mean no timer events will be -injected into |st2| and therefore no timer rules would be evaluated. +To address failover in HA deployments, use external monitoring of the ``st2timersengine`` process to ensure +one process is running, and to trigger spinning up a new ``st2timersengine`` process if it fails. 
+Losing the ``st2timersengine`` will mean no timer events will be injected into |st2| and therefore +no timer rules would be evaluated. st2workflowengine ^^^^^^^^^^^^^^^^^ From 993c5d763c6c440c6ec04d722967e62eb8c51fe4 Mon Sep 17 00:00:00 2001 From: Jacob Floyd Date: Wed, 8 Sep 2021 23:39:56 -0500 Subject: [PATCH 4/8] k8s_ha: Copy notes about deploying mongo/rabbitmq from chart readme --- docs/source/install/k8s_ha.rst | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/docs/source/install/k8s_ha.rst b/docs/source/install/k8s_ha.rst index bceac847..990f454a 100644 --- a/docs/source/install/k8s_ha.rst +++ b/docs/source/install/k8s_ha.rst @@ -306,6 +306,14 @@ By default ``3`` nodes (1 primary and 2 secondaries) of MongoDB are deployed via For more advanced MongoDB configuration, refer to official `mongodb-replicaset `_ Helm chart settings, which might be fine-tuned via ``values.yaml``. +The deployment of MongoDB to the k8s cluster can be disabled by setting the mongodb-ha.enabled key in values.yaml to false. + +.. note:: + Stackstorm relies heavily on connections to a MongoDB instance. If the in-cluster deployment of MongoDB is disabled, + a connection to an external instance of MongoDB must be configured. The st2.config key in values.yaml provides a way + to configure stackstorm. + See `Configure MongoDB `_ for configuration details. + `RabbitMQ HA Cluster `_ ______________________________________________________________________________________ RabbitMQ is a message bus StackStorm relies on for inter-process communication and load distribution. @@ -314,6 +322,14 @@ By default ``3`` nodes of RabbitMQ are deployed via K8s StatefulSet. For more advanced RabbitMQ configuration, please refer to official `rabbitmq-ha `_ Helm chart repository, - all settings could be overridden via ``values.yaml``. +The deployment of RabbitMQ to the k8s cluster can be disabled by setting the rabbitmq-ha.enabled key in values.yaml to false. + +.. 
note:: + Stackstorm relies heavily on connections to a RabbitMQ instance. If the in-cluster deployment of RabbitMQ is disabled, + a connection to an external instance of RabbitMQ must be configured. The st2.config key in values.yaml provides a way + to configure stackstorm. + See `Configure RabbitMQ `_ for configuration details. + redis _____ StackStorm employs redis as a distributed coordination backend, required for st2 cluster components to work properly in an HA scenario. From dace581ddb0cfae120317de1e40ea4a6757738c5 Mon Sep 17 00:00:00 2001 From: Jacob Floyd Date: Wed, 8 Sep 2021 23:40:52 -0500 Subject: [PATCH 5/8] k8s_ha: begin documenting st2packs vs volumes methods --- docs/source/install/k8s_ha.rst | 51 ++++++++++++++++++++++++++++------ 1 file changed, 42 insertions(+), 9 deletions(-) diff --git a/docs/source/install/k8s_ha.rst b/docs/source/install/k8s_ha.rst index 990f454a..52564a25 100644 --- a/docs/source/install/k8s_ha.rst +++ b/docs/source/install/k8s_ha.rst @@ -133,16 +133,34 @@ Grab all logs only for stackstorm backend services, excluding st2web and DB/MQ/r Custom st2 packs ---------------- -To follow the stateless model, shipping custom st2 packs is now part of the deployment process. -It means that ``st2 pack install`` won't work in a distributed environment and you have to bundle all the -required packs into a Docker image that you can codify, version, package and distribute in a repeatable way. -The responsibility of this Docker image is to hold pack content and their virtualenvs. -So the custom st2 pack docker image you have to build is essentially a couple of read-only directories that -are shared with the corresponding st2 services in the cluster. - -For your convenience, we created a new ``st2-pack-install `` utility +There are two ways to install st2 packs in the k8s cluster. + +1. The ``st2packs`` method is the default. This method will work for practically all clusters, but ``st2 pack install`` does not work. 
The packs are injected via ``st2packs`` images instead. + +2. The other method defines shared/writable ``volumes``. This method allows ``st2 pack install`` to work, but requires a persistent storage backend to be available in the cluster. This chart will not configure a storage backend for you. + +.. note:: + In general, we recommend using only one of these methods. See the NOTE under Method 2 below about how both methods can be used together with care. + +Method 1: st2packs images (the default) +_______________________________________ + +This method strives to follow the stateless model, so shipping custom st2 packs is part of the deployment process. +Without persistent storage (i.e. without state), packs and their virtualenvs need to be installed in each pod. +``st2 pack install`` does not work in this distributed model because it assumes that nodes have a shared filesystem +(Method 2, below, uses a shared filesystem), so that only one node needs to download the pack files or set up the +virtualenv and all other nodes will see those files right away. + +In order to achieve this stateless model, you have to bundle all the required packs (and their virtualenvs) +into one or more Docker images that you can codify, version, package and distribute in a repeatable way. +The responsibility of these Docker images is to hold pack content and their virtualenvs. +Effectively, the st2packs Docker image(s) you have to build are a couple of read-only directories that +are shared with the corresponding st2 services in the cluster. When a new st2actionrunner +pod starts up, those directories get copied into the pod. + +For your convenience, we created an ``st2-pack-install `` utility +and included it in a container `stackstorm/st2packs `_ +that will help to install custom packs during the Docker build process without relying on live MongoDB and RabbitMQ connections. 
For more detailed instructions see `StackStorm/st2packs-dockerfiles `_ on how to build your custom `st2packs` image. @@ -151,6 +169,21 @@ Please refer to `StackStorm/stackstorm-ha#install-custom-st2-packs-in-the-cluste Helm chart repository with more information about how to reference custom st2pack Docker image in Helm values, providing packs configs, using private Docker registry and more. +Method 2: Shared Volumes +________________________ + +This method uses shared volumes to enable ``st2 pack install``. This sacrifices the stateless infrastructure model. +When multiple teams are involved in managing StackStorm, however, that tradeoff can be vital. +For example, if one team maintains StackStorm itself as shared infrastructure, but another +team handles installing packs and creating auto-remediation or other workflows, it may be necessary to separate +the StackStorm deployment process from StackStorm pack install process. + +Stateful infrastructure is generally more complex than stateless infrastructure. In our case, relying on +shared volumes requires cluster-specific storage setup and configuration. That storage setup varies +widely. Several attempts to include storage setup in our helm chart were not flexible enough +to handle that variation. As such, you must configure your storage solution before using the chart. +Then, just include your volume definitions in values. + .. note:: There is an alternative approach, - sharing pack content via read-write-many NFS (Network File System) as :doc:`/reference/ha` recommends. As beta is in progress and both methods have their pros and cons, we'd like to hear your feedback and which way would work better for you. 
From 1aa8dbd1c2aa1a9ea8553be01841910455cf077b Mon Sep 17 00:00:00 2001 From: Jacob Floyd Date: Wed, 8 Sep 2021 23:58:05 -0500 Subject: [PATCH 6/8] k8s_ha: reword description of using Shared Volumes --- docs/source/install/k8s_ha.rst | 25 ++++++++++--------------- 1 file changed, 10 insertions(+), 15 deletions(-) diff --git a/docs/source/install/k8s_ha.rst b/docs/source/install/k8s_ha.rst index 52564a25..359572b8 100644 --- a/docs/source/install/k8s_ha.rst +++ b/docs/source/install/k8s_ha.rst @@ -172,21 +172,16 @@ using private Docker registry and more. Method 2: Shared Volumes ________________________ -This method uses shared volumes to enable ``st2 pack install``. This sacrifices the stateless infrastructure model. -When multiple teams are involved in managing StackStorm, however, that tradeoff can be vital. -For example, if one team maintains StackStorm itself as shared infrastructure, but another -team handles installing packs and creating auto-remediation or other workflows, it may be necessary to separate -the StackStorm deployment process from StackStorm pack install process. - -Stateful infrastructure is generally more complex than stateless infrastructure. In our case, relying on -shared volumes requires cluster-specific storage setup and configuration. That storage setup varies -widely. Several attempts to include storage setup in our helm chart were not flexible enough -to handle that variation. As such, you must configure your storage solution before using the chart. -Then, just include your volume definitions in values. - -.. note:: - There is an alternative approach, - sharing pack content via read-write-many NFS (Network File System) as :doc:`/reference/ha` recommends. - As beta is in progress and both methods have their pros and cons, we'd like to hear your feedback and which way would work better for you. +Pack content can also be shared via ReadWriteMany volumes such as NFS (Network File System) as :doc:`/reference/ha` recommends. 
+Using shared volumes sacrifices the stateless infrastructure model, but enables normal pack management features +such as ``st2 pack install``. + +Relying on shared volumes requires cluster-specific storage setup and configuration. As that storage setup varies +widely, managing that storage is out-of-scope for this helm chart. For example, before you can install this chart to use NFS, +you would have to create the NFS exports, and you might need ``PersistentVolume`` and ``PersistentVolumeClaim`` k8s objects. +Then, you add some volume definitions to your ``values.yaml``, and install or upgrade StackStorm with Helm. +Not every cluster uses NFS or PV/PVCs to manage the storage, so the chart treats your volume definitions as opaque data, +merely including your volume definitions in the appropriate place in various ``Deployment`` and ``Job`` k8s objects. Ingress ------- From 13d5b82cdd9f9105850ede3be86d016faa008179 Mon Sep 17 00:00:00 2001 From: Jacob Floyd Date: Thu, 9 Sep 2021 00:06:10 -0500 Subject: [PATCH 7/8] k8s_ha: add more notes about Storage Volumes --- docs/source/install/k8s_ha.rst | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/docs/source/install/k8s_ha.rst b/docs/source/install/k8s_ha.rst index 359572b8..5ae926c9 100644 --- a/docs/source/install/k8s_ha.rst +++ b/docs/source/install/k8s_ha.rst @@ -183,6 +183,15 @@ Then, you add some volume definitions to your ``values.yaml``, and install or up Not every cluster uses NFS or PV/PVCs to manage the storage, so the chart treats your volume definitions as opaque data, merely including your volume definitions in the appropriate place in various ``Deployment`` and ``Job`` k8s objects. +.. note:: + With care, ``st2packs`` images can be used with ``volumes``. Just make sure to keep the ``st2packs`` images up-to-date + with any changes made via ``st2 pack install``. 
If a pack is installed via an ``st2packs`` image and then it gets updated + with ``st2 pack install``, a subsequent ``helm upgrade`` will revert back to the version in the ``st2packs`` image. + +Please refer to `StackStorm/stackstorm-ha#install-custom-st2-packs-in-the-cluster `_ +Helm chart repository with more information about how to pass custom volume definitions for ``packs``, ``virtualenvs`` +and pack ``configs`` in Helm values. + Ingress ------- From 2557a6fe17948fcbb1fb3667a1294bd87962dcb8 Mon Sep 17 00:00:00 2001 From: Jacob Floyd Date: Thu, 9 Sep 2021 12:26:29 -0500 Subject: [PATCH 8/8] k8s_ha: edits based on feedback --- docs/source/install/k8s_ha.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/source/install/k8s_ha.rst b/docs/source/install/k8s_ha.rst index 5ae926c9..003ad597 100644 --- a/docs/source/install/k8s_ha.rst +++ b/docs/source/install/k8s_ha.rst @@ -64,7 +64,7 @@ Helm package ``stackstorm-ha`` comes with default settings (see `values.yaml `_, -and ecourage discussions in `Slack `_ ``#k8s`` channel or write us an email. +and encourage discussions in `Slack `_ ``#k8s`` channel. .. only:: community