Commit b9d5e44

docs: workflows (#1274)
* docs: workflows * wip * wip * docs * docs * wip * wip * wip * fix: typo in javadoc
1 parent cad0e1c commit b9d5e44

File tree

4 files changed

+288
-10
lines changed

docs/_data/sidebar.yml

+2
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,8 @@
1111
url: /docs/features
1212
- title: Dependent Resource Feature
1313
url: /docs/dependent-resources
14+
- title: Workflows
15+
url: /docs/workflows
1416
- title: Patterns and Best Practices
1517
url: /docs/patterns-best-practices
1618
- title: FAQ

docs/documentation/dependent-resources.md

+9-9
Original file line numberDiff line numberDiff line change
@@ -137,16 +137,15 @@ See the full source code [here](https://github.com/java-operator-sdk/java-operat
137137

138138
## Managed Dependent Resources
139139

140-
As mentioned previously, one goal of this implementation is to make it possible to semi-declaratively create and wire
140+
As mentioned previously, one goal of this implementation is to make it possible to declaratively create and wire
141141
dependent resources. You can annotate your reconciler with
142142
`@Dependent` annotations that specify which `DependentResource` implementation it depends upon. JOSDK will take the
143143
appropriate steps to wire everything together and call your
144144
`DependentResource` implementations `reconcile` method before your primary resource is reconciled. This makes sense in
145145
most use cases where the logic associated with the primary resource is usually limited to status handling based on the
146146
state of the secondary resources and the resources are not dependent on each other.
147147

148-
Note that all dependents will be reconciled in order. If an exception happens in one or more reconciliations, the
149-
followup resources will be reconciled.
148+
See [Workflows](https://javaoperatorsdk.io/docs/workflows) for how and in what order the resources are reconciled.
150149

151150
This behavior and automated handling is referred to as "managed" because the `DependentResource` instances
152151
are managed by JOSDK.
@@ -186,15 +185,16 @@ sample [here](https://github.com/java-operator-sdk/java-operator-sdk/blob/main/s
186185

187186
## Standalone Dependent Resources
188187

189-
To use dependent resources in more complex workflows, when there are some resources needs to be created only in certain
190-
conditions the standalone mode is available or the dependent resources are not independent of each other.
191-
For example if calling an API needs to happen if a service is already up and running
192-
(think configuring a running DB instance).
188+
If only a subset of the resources should be managed by dependent resources, use standalone mode.
193189
In practice this means that the developer is responsible for initializing, managing, and
194-
calling `reconcile` method. However, this gives possibility for developers to fully customize the workflow for
190+
calling the `reconcile` method. However, this lets developers fully customize the process for
195191
reconciliation. Use standalone dependent resources for cases when managed does not fit.
196192

197-
The sample is similar to one above it just performs additional checks, and conditionally creates an `Ingress`:
193+
Note that [Workflows](https://javaoperatorsdk.io/docs/workflows) also support standalone mode using
194+
standalone resources.
195+
196+
The sample is similar to the one above; it just performs additional checks and conditionally creates an `Ingress`:
197+
(Note that this conditional creation is now also possible with Workflows.)
198198

199199
```java
200200

docs/documentation/workflows.md

+276
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,276 @@
1+
---
2+
title: Workflows
3+
description: Reference Documentation for Workflows
4+
layout: docs
5+
permalink: /docs/workflows
6+
---
7+
8+
## Overview
9+
10+
Kubernetes (k8s) does not have a notion of one resource "depending on" another k8s resource,
11+
in terms of the order in which a set of resources should be reconciled. However, Kubernetes operators are also used to manage
12+
external (non k8s) resources. Typically, when an operator manages a service, after the service is first deployed
13+
some additional API calls are required to configure it. In this case the configuration step depends
14+
on the service and related resources, in other words the configuration needs to be reconciled after the service is
15+
up and running.
16+
17+
The intention behind workflows is to make it easy to describe more complex, almost arbitrary scenarios in a declarative
18+
way. While [dependent resources](https://javaoperatorsdk.io/docs/dependent-resources) describe the logic of how a single
19+
resource should be reconciled, workflows describe the process of how a set of target resources should be reconciled.
20+
21+
Workflows are defined as a set of [dependent resources](https://javaoperatorsdk.io/docs/dependent-resources) (DR)
22+
and dependencies between them, along with some conditions that mainly help define optional resources and
23+
pre- and post-conditions to describe expected states of a resource at a certain point in the workflow.
24+
25+
## Elements of Workflow
26+
27+
- **Dependent resource** (DR) - a resource that is managed in the reconcile logic.
28+
- **Depends-on relation** - if a DR `B` depends on another DR `A`, then `B` will be reconciled after `A`.
29+
- **Reconcile precondition** - a condition that needs to be fulfilled before the DR is reconciled. This also allows
30+
defining optional resources, which for example are only created if a flag in a custom resource `.spec` has some
31+
specific value.
32+
- **Ready postcondition** - checks if a resource could be considered "ready", typically if pods of a deployment are up
33+
and running.
34+
- **Delete postcondition** - during the cleanup phase it can be used to check whether the resource is successfully deleted,
35+
so that the next resource, on which the deleted resource depends, can be deleted as the next step.
36+
37+
## Defining Workflows
38+
39+
Similarly to dependent resources, there are two ways to define workflows: in a managed or in a standalone manner.
40+
41+
### Managed
42+
43+
Annotations can be used to declaratively define a workflow for the reconciler. In this case the workflow is executed
44+
before the `reconcile` method is called. The result of the reconciliation is accessed through the `context` object.
45+
46+
The following hypothetical sample showcases all the elements. There are two resources, a Deployment and
47+
a ConfigMap, where the ConfigMap depends on the Deployment. The Deployment has a ready condition, so the ConfigMap is only
48+
reconciled after the Deployment and only if the Deployment is ready (see ready-postcondition). The ConfigMap has a reconcile
49+
precondition attached, therefore it is only reconciled if that condition holds. In addition, it has a delete-postcondition,
50+
thus it is only considered deleted if that condition holds.
51+
52+
```java
53+
@ControllerConfiguration(dependents = {
54+
@Dependent(name = DEPLOYMENT_NAME, type = DeploymentDependentResource.class,
55+
readyPostcondition = DeploymentReadyCondition.class),
56+
@Dependent(type = ConfigMapDependentResource.class,
57+
reconcilePrecondition = ConfigMapReconcileCondition.class,
58+
deletePostcondition = ConfigMapDeletePostCondition.class,
59+
dependsOn = DEPLOYMENT_NAME)
60+
})
61+
public class SampleWorkflowReconciler implements Reconciler<WorkflowAllFeatureCustomResource>,
62+
Cleaner<WorkflowAllFeatureCustomResource> {
63+
64+
public static final String DEPLOYMENT_NAME = "deployment";
65+
66+
@Override
67+
public UpdateControl<WorkflowAllFeatureCustomResource> reconcile(
68+
WorkflowAllFeatureCustomResource resource,
69+
Context<WorkflowAllFeatureCustomResource> context) {
70+
71+
resource.getStatus()
72+
.setReady(
73+
context.managedDependentResourceContext() // accessing workflow reconciliation results
74+
.getWorkflowReconcileResult().orElseThrow()
75+
.allDependentResourcesReady());
76+
return UpdateControl.patchStatus(resource);
77+
}
78+
79+
@Override
80+
public DeleteControl cleanup(WorkflowAllFeatureCustomResource resource,
81+
Context<WorkflowAllFeatureCustomResource> context) {
82+
// omitted code
83+
84+
return DeleteControl.defaultDelete();
85+
}
86+
}
87+
88+
```
89+
90+
### Standalone
91+
92+
In this mode the workflow is built manually using [standalone dependent resources](https://javaoperatorsdk.io/docs/dependent-resources#standalone-dependent-resources).
93+
The workflow is created using a builder that is explicitly called in the reconciler (from the web page sample):
94+
95+
```java
96+
@ControllerConfiguration(
97+
labelSelector = WebPageDependentsWorkflowReconciler.DEPENDENT_RESOURCE_LABEL_SELECTOR)
98+
public class WebPageDependentsWorkflowReconciler
99+
implements Reconciler<WebPage>, ErrorStatusHandler<WebPage>, EventSourceInitializer<WebPage> {
100+
101+
public static final String DEPENDENT_RESOURCE_LABEL_SELECTOR = "!low-level";
102+
private static final Logger log =
103+
LoggerFactory.getLogger(WebPageDependentsWorkflowReconciler.class);
104+
105+
private KubernetesDependentResource<ConfigMap, WebPage> configMapDR;
106+
private KubernetesDependentResource<Deployment, WebPage> deploymentDR;
107+
private KubernetesDependentResource<Service, WebPage> serviceDR;
108+
private KubernetesDependentResource<Ingress, WebPage> ingressDR;
109+
110+
private Workflow<WebPage> workflow;
111+
112+
public WebPageDependentsWorkflowReconciler(KubernetesClient kubernetesClient) {
113+
initDependentResources(kubernetesClient);
114+
workflow = new WorkflowBuilder<WebPage>()
115+
.addDependent(configMapDR).build()
116+
.addDependent(deploymentDR).build()
117+
.addDependent(serviceDR).build()
118+
.addDependent(ingressDR).withReconcileCondition(new IngressCondition()).build()
119+
.build();
120+
}
121+
122+
@Override
123+
public Map<String, EventSource> prepareEventSources(EventSourceContext<WebPage> context) {
124+
return EventSourceInitializer.nameEventSources(
125+
configMapDR.initEventSource(context),
126+
deploymentDR.initEventSource(context),
127+
serviceDR.initEventSource(context),
128+
ingressDR.initEventSource(context));
129+
}
130+
131+
@Override
132+
public UpdateControl<WebPage> reconcile(WebPage webPage, Context<WebPage> context) {
133+
134+
var result = workflow.reconcile(webPage, context);
135+
136+
webPage.setStatus(createStatus(result));
137+
return UpdateControl.patchStatus(webPage);
138+
}
139+
// omitted code
140+
}
141+
142+
```
143+
144+
## Workflow Execution
145+
146+
This section describes in detail how a workflow is executed, how the ordering is determined, and how conditions and
147+
errors affect the behavior. The workflow execution, as its API also denotes, can be divided into two parts:
148+
reconciliation and cleanup. [Cleanup](https://javaoperatorsdk.io/docs/features#the-reconcile-and-cleanup) is
149+
executed if a resource is marked for deletion.
150+
151+
152+
## Common Principles
153+
154+
- **As complete as possible execution** - when a workflow is reconciled, it tries to reconcile as many resources as
155+
possible. Thus, if an error happens or a ready condition is not met for a resource, all the other independent resources
156+
will still be reconciled. This is the opposite of a fail-fast approach. The assumption is that in this way the
157+
overall desired state is eventually achieved faster than with a fail-fast approach.
158+
- **Concurrent reconciliation of independent resources** - resources that do not depend on each other are processed
159+
concurrently. The level of concurrency is customizable and can be set to one if required. By default, workflows use
160+
the executor service from [ConfigurationService](https://github.com/java-operator-sdk/java-operator-sdk/blob/6f2a252952d3a91f6b0c3c38e5e6cc28f7c0f7b3/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ConfigurationService.java#L120-L120)
161+
162+
## Reconciliation
163+
164+
This section describes how a workflow is reconciled. First the rules are defined, then they are illustrated with samples:
165+
166+
### Rules
167+
168+
1. A DR is reconciled if it does not depend on another DR, or if ALL the DRs it depends on are ready. If it
169+
has a reconcile-precondition, that condition must be met too. (Here "ready" means that the DR was successfully
170+
reconciled, without any error, and that its ready condition, if it has one, is met.)
171+
2. If the reconcile-precondition of a DR is not met, the DR is deleted. Any DRs that depend on it
172+
are deleted first; this applies recursively. That means that DRs are always deleted in the reverse of the order
173+
in which they are reconciled.
174+
3. Delete is called on a dependent resource if, as described in point 2, it (possibly transitively) depends on a DR whose
175+
reconcile-precondition is not met, and either no DRs depend on it, or the DRs that depend on it have already been
176+
successfully deleted (within the current execution). "Delete is called" means that the dependent resource is checked
177+
for whether it implements the `Deleter` interface; if it implements `Deleter` but does not implement the `GarbageCollected`
178+
interface, the `Deleter.delete` method is called. If a DR does not implement the `Deleter` interface, it is considered deleted
179+
automatically. "Successfully deleted" means that the DR is deleted and, if a delete-postcondition is present, that condition is met.
180+
181+
### Samples
182+
183+
Notation: the arrows depict the reconciliation order, which is the reverse direction of the depends-on relation:
184+
`1 --> 2` means `DR 2` depends on `DR 1`.
185+
186+
#### Reconcile Sample
187+
188+
<div class="mermaid" markdown="0">
189+
190+
stateDiagram-v2
191+
1 --> 2
192+
1 --> 3
193+
2 --> 4
194+
3 --> 4
195+
196+
</div>
197+
198+
- In this workflow the nodes are reconciled as follows: DR `1` is reconciled first.
199+
After that, DRs `2` and `3` are reconciled concurrently; once both have finished their reconciliation, DR `4` is reconciled too.
200+
- If, for example, `2` had a ready condition that evaluated as "not met", `4` would not be reconciled.
201+
However, `1`, `2` and `3` would be reconciled.
202+
- If `1` had a ready condition that is not met, none of `2`, `3` or `4` would be reconciled.
203+
- If an error occurred during the reconciliation of `2`, `4` would not be reconciled, but `3` would be
204+
(and `1` too, of course).
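
The scenarios above can be replayed with a small, library-free model of rule 1. This is only a sketch under stated assumptions: the class `ReconcileOrderSketch` and its methods are hypothetical names, it is not JOSDK code, it runs sequentially instead of concurrently, and it ignores reconcile-preconditions. A DR is attempted once all DRs it depends on are ready, and it becomes ready only if it neither fails nor has an unmet ready condition.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the reconcile rule; not part of JOSDK.
class ReconcileOrderSketch {

  // dependsOn maps every DR to the DRs it depends on.
  // notReady: DRs whose ready condition is not met; failing: DRs whose reconciliation throws.
  // Returns the DRs on which reconcile was attempted, in order.
  static List<Integer> reconciled(Map<Integer, List<Integer>> dependsOn,
      Set<Integer> notReady, Set<Integer> failing) {
    List<Integer> attempted = new ArrayList<>();
    Set<Integer> ready = new LinkedHashSet<>();
    boolean progress = true;
    while (progress) {
      progress = false;
      for (Map.Entry<Integer, List<Integer>> entry : dependsOn.entrySet()) {
        Integer dr = entry.getKey();
        // Skip DRs already attempted or with a dependency that is not ready.
        if (attempted.contains(dr) || !ready.containsAll(entry.getValue())) {
          continue;
        }
        attempted.add(dr);
        progress = true;
        // "Ready" means: reconciled without error and ready condition met.
        if (!failing.contains(dr) && !notReady.contains(dr)) {
          ready.add(dr);
        }
      }
    }
    return attempted;
  }

  // The diamond from the sample: 1 --> 2, 1 --> 3, 2 --> 4, 3 --> 4.
  static Map<Integer, List<Integer>> diamond() {
    Map<Integer, List<Integer>> graph = new LinkedHashMap<>();
    graph.put(1, List.of());
    graph.put(2, List.of(1));
    graph.put(3, List.of(1));
    graph.put(4, List.of(2, 3));
    return graph;
  }

  public static void main(String[] args) {
    System.out.println(reconciled(diamond(), Set.of(), Set.of()));  // [1, 2, 3, 4]
    System.out.println(reconciled(diamond(), Set.of(2), Set.of())); // [1, 2, 3]
    System.out.println(reconciled(diamond(), Set.of(1), Set.of())); // [1]
    System.out.println(reconciled(diamond(), Set.of(), Set.of(2))); // [1, 2, 3]
  }
}
```

Note that an unmet ready condition and an error have the same effect on ordering in this model: the DR itself is still attempted, but nothing that depends on it is.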
205+
206+
#### Sample with Reconcile Precondition
207+
208+
<div class="mermaid" markdown="0">
209+
210+
stateDiagram-v2
211+
1 --> 2
212+
1 --> 3
213+
3 --> 4
214+
3 --> 5
215+
216+
</div>
217+
218+
- Consider the case where `3` has a reconcile-precondition that is not met. In that case DR `1` and `2` would be
219+
reconciled. However, DR `3`, `4` and `5` would be deleted in the following way: DR `4` and `5` would be deleted concurrently.
220+
DR `3` would be deleted only if `4` and `5` are deleted successfully, that is, no error happened during deletion and all
221+
delete-postconditions are met.
222+
- If the delete-postcondition for `5` were not met, `3` would not be deleted; `4` would be.
223+
- Similarly, if there were an error for `5`, `3` would not be deleted; `4` would be.
224+
225+
## Cleanup
226+
227+
Cleanup works the same way as the deletion of resources during reconciliation when a reconcile-precondition is not met,
228+
except that it applies to the whole workflow.
229+
230+
The rule is relatively simple:
231+
232+
Delete is called on a DR if there is no DR that depends on it, or if the DRs that depend on it have
233+
already been deleted successfully (within this execution of the workflow). "Successfully deleted" means that the DR is deleted and,
234+
if a delete-postcondition is present, that condition is met. "Delete is called" means that the dependent resource is checked for whether it
235+
implements the `Deleter` interface; if it implements `Deleter` but does not implement the `GarbageCollected` interface, the `Deleter.delete`
236+
method is called. If a DR does not implement the `Deleter` interface, it is considered deleted automatically.
237+
238+
### Sample
239+
240+
<div class="mermaid" markdown="0">
241+
242+
stateDiagram-v2
243+
1 --> 2
244+
1 --> 3
245+
2 --> 4
246+
3 --> 4
247+
248+
</div>
249+
250+
- The DRs are deleted in the following order: `4` is deleted first, then `2` and `3` are deleted concurrently, and after both
251+
have succeeded, `1` is deleted.
252+
- If the delete-postcondition were not met for `2`, DR `1` would not be deleted. DR `4` and `3` would be deleted.
253+
- If deleting `2` resulted in an error, DR `1` would not be deleted. DR `4` and `3` would be deleted.
254+
- If deleting `4` resulted in an error, no other DR would be deleted.
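
The cleanup rule can be modeled in the same spirit. Again this is a hedged sketch rather than JOSDK code: `CleanupOrderSketch` and its methods are hypothetical names, execution is sequential, and a failed delete and an unmet delete-postcondition are folded into one `unsuccessful` set. Delete is called on a DR once every DR that depends on it has been deleted successfully.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the cleanup rule; not part of JOSDK.
class CleanupOrderSketch {

  // dependsOn maps every DR to the DRs it depends on (every DR must appear as a key).
  // unsuccessful: DRs whose delete fails or whose delete-postcondition is not met.
  // Returns the DRs on which delete was called, in order.
  static List<Integer> deleteCalledOn(Map<Integer, List<Integer>> dependsOn,
      Set<Integer> unsuccessful) {
    // Invert the relation: dependents.get(x) lists the DRs that depend on x.
    Map<Integer, List<Integer>> dependents = new LinkedHashMap<>();
    for (Integer dr : dependsOn.keySet()) {
      dependents.put(dr, new ArrayList<>());
    }
    for (Map.Entry<Integer, List<Integer>> entry : dependsOn.entrySet()) {
      for (Integer parent : entry.getValue()) {
        dependents.get(parent).add(entry.getKey());
      }
    }
    List<Integer> called = new ArrayList<>();
    Set<Integer> deleted = new HashSet<>();
    boolean progress = true;
    while (progress) {
      progress = false;
      for (Integer dr : dependents.keySet()) {
        // Wait until all DRs depending on this one are successfully deleted.
        if (called.contains(dr) || !deleted.containsAll(dependents.get(dr))) {
          continue;
        }
        called.add(dr);
        progress = true;
        if (!unsuccessful.contains(dr)) {
          deleted.add(dr);
        }
      }
    }
    return called;
  }

  // The diamond from the sample: 1 --> 2, 1 --> 3, 2 --> 4, 3 --> 4.
  static Map<Integer, List<Integer>> diamond() {
    Map<Integer, List<Integer>> graph = new LinkedHashMap<>();
    graph.put(1, List.of());
    graph.put(2, List.of(1));
    graph.put(3, List.of(1));
    graph.put(4, List.of(2, 3));
    return graph;
  }

  public static void main(String[] args) {
    System.out.println(deleteCalledOn(diamond(), Set.of()));  // [4, 2, 3, 1]
    System.out.println(deleteCalledOn(diamond(), Set.of(2))); // [4, 2, 3]
    System.out.println(deleteCalledOn(diamond(), Set.of(4))); // [4]
  }
}
```

In the second call delete is still invoked on `2`, but because it does not count as successfully deleted, `1` is never reached.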
255+
256+
## Error Handling
257+
258+
As mentioned before, if an error happens during a reconciliation, the reconciliation of other dependent resources will
259+
still happen. Since multiple DRs might fail in a single execution, the workflow throws an
260+
[`AggregatedOperatorException`](https://github.com/java-operator-sdk/java-operator-sdk/blob/86e5121d56ed4ecb3644f2bc8327166f4f7add72/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/AggregatedOperatorException.java)
261+
that contains all the related exceptions.
262+
263+
The exceptions can be handled by [`ErrorStatusHandler`](https://github.com/java-operator-sdk/java-operator-sdk/blob/86e5121d56ed4ecb3644f2bc8327166f4f7add72/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/AggregatedOperatorException.java)
264+
265+
## Notes and Caveats
266+
267+
- Delete is almost always called on every resource during cleanup. However, it might be the case that a resource
268+
was already deleted in a previous run, or was never created. This should not be a problem, since dependent resources
269+
usually cache the state of the resource and are therefore already aware that the resource does not exist, so they basically do nothing
270+
when delete is called on a resource that does not exist.
271+
- If a resource has owner references, it will be automatically deleted by the Kubernetes garbage collector when
272+
the owner resource is marked for deletion. This might not be desirable. To make sure that deletion is handled by the
273+
workflow, don't use a garbage-collected Kubernetes dependent resource; use, for example, [`CRUDNoGCKubernetesDependentResource`](https://github.com/java-operator-sdk/java-operator-sdk/blob/86e5121d56ed4ecb3644f2bc8327166f4f7add72/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/processing/dependent/kubernetes/CRUDNoGCKubernetesDependentResource.java).
274+
- After a workflow is executed, no state is persisted regarding the workflow execution. On every reconciliation
275+
all the resources are reconciled again; in other words, the whole workflow is evaluated again.
276+

operator-framework-core/src/main/java/io/javaoperatorsdk/operator/Operator.java

+1-1
Original file line numberDiff line numberDiff line change
@@ -68,7 +68,7 @@ public Operator(KubernetesClient kubernetesClient, ConfigurationService configur
6868
ConfigurationServiceProvider.set(configurationService);
6969
}
7070

71-
/** Adds a shutdown hook that automatically calls {@link #stop()} ()} when the app shuts down. */
71+
/** Adds a shutdown hook that automatically calls {@link #stop()} when the app shuts down. */
7272
public void installShutdownHook() {
7373
Runtime.getRuntime().addShutdownHook(new Thread(this::stop));
7474
}
