Commit cd6beb7

Add user facing documentation
1 parent e9b8505 commit cd6beb7

3 files changed: 271 additions and 0 deletions

docs/book/src/SUMMARY.md

Lines changed: 1 addition & 0 deletions
@@ -39,6 +39,7 @@
 - [Implementing Runtime Extensions](./tasks/experimental-features/runtime-sdk/implement-extensions.md)
 - [Implementing Lifecycle Hook Extensions](./tasks/experimental-features/runtime-sdk/implement-lifecycle-hooks.md)
 - [Implementing Topology Mutation Hook Extensions](./tasks/experimental-features/runtime-sdk/implement-topology-mutation-hook.md)
+- [Implementing In-Place Update Hooks Extensions](./tasks/experimental-features/runtime-sdk/implement-in-place-update-hooks.md)
 - [Deploying Runtime Extensions](./tasks/experimental-features/runtime-sdk/deploy-runtime-extension.md)
 - [Ignition Bootstrap configuration](./tasks/experimental-features/ignition.md)
 - [Running multiple providers](./tasks/multiple-providers.md)

docs/book/src/tasks/experimental-features/runtime-sdk/implement-in-place-update-hooks.md

Lines changed: 269 additions & 0 deletions
@@ -0,0 +1,269 @@
# Implementing in-place update hooks

<aside class="note warning">

<h1>Caution</h1>

Please note Runtime SDK is an advanced feature. If implemented incorrectly, a failing Runtime Extension can severely impact the Cluster API runtime.

</aside>

## Introduction

The proposal for [in-place updates in Cluster API](https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/20240807-in-place-updates.md)
introduced extensions that allow users to apply changes to existing machines without deleting them and creating new ones.

Notably, the Cluster API user experience remains the same as today, regardless of whether the in-place update feature is enabled
or not; e.g. in order to trigger a MachineDeployment rollout, you still have to rotate a template.

Users should care ONLY about the desired state (as of today).

Cluster API is responsible for choosing the best strategy to achieve the desired state, and with the introduction of
update extensions, Cluster API is expanding the set of tools it can use to achieve the desired state.

If external update extensions cannot cover the totality of the desired changes, Cluster API will fall back to its default,
immutable rollouts.

Cluster API will also be responsible for determining which Machine/MachineSet should be updated, as well as for handling rollout
options like MaxSurge/MaxUnavailable. In this regard:

- Machines updating in-place are considered not available, because in-place updates are always considered potentially disruptive.
  - For control plane machines, if maxSurge is one, a new machine must be created first; then, as soon as there is
    “buffer” for in-place, the in-place update can proceed.
    - KCP will not use in-place updates if it detects that they can impact the health of the control plane.
  - For worker machines, if maxUnavailable is zero, a new machine must be created first; then, as soon as there
    is “buffer” for in-place, the in-place update can proceed.
- When in-place is possible, the system should try to in-place update as many machines as possible.
  In practice, this means that maxSurge might not be fully used (it is used only to scale up by one if maxUnavailable=0).
- No in-place updates are performed for worker machines when using the on-delete rollout strategy.

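The availability rules above can be read as a simple budget. The following is a minimal sketch under stated assumptions, not Cluster API's actual rollout algorithm: the function `in_place_budget` and its parameters are hypothetical, and it only models the worker case in which a Machine updating in-place counts as unavailable.

```python
def in_place_budget(replicas: int, available: int, max_unavailable: int) -> int:
    """Hypothetical helper: how many Machines could start updating in-place now.

    Assumes a Machine updating in-place counts as unavailable, so the number of
    available Machines must never drop below replicas - max_unavailable.
    """
    min_available = replicas - max_unavailable
    return max(0, available - min_available)


# With maxUnavailable=0 and all 3 replicas available there is no buffer yet.
print(in_place_budget(replicas=3, available=3, max_unavailable=0))  # 0
# After surging by one new Machine, one existing Machine can update in-place.
print(in_place_budget(replicas=3, available=4, max_unavailable=0))  # 1
# With maxUnavailable=1, in-place can start without creating a new Machine.
print(in_place_budget(replicas=3, available=3, max_unavailable=1))  # 1
```

In this simplified model, the "buffer" mentioned above is whatever availability remains once the minimum number of available Machines is accounted for.
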
<!-- TOC -->
* [Implementing in-place update hooks](#implementing-in-place-update-hooks)
  * [Introduction](#introduction)
  * [Guidelines](#guidelines)
  * [Definitions](#definitions)
    * [CanUpdateMachine](#canupdatemachine)
    * [CanUpdateMachineSet](#canupdatemachineset)
    * [UpdateMachine](#updatemachine)
<!-- TOC -->

## Guidelines
51+
52+
All guidelines defined in [Implementing Runtime Extensions](implement-extensions.md#guidelines) apply to the
53+
implementation of Runtime Extensions for upgrade plan hooks as well.
54+
55+
In summary, Runtime Extensions are components that should be designed, written and deployed with great caution given
56+
that they can affect the proper functioning of the Cluster API runtime. A poorly implemented Runtime Extension could
57+
potentially block upgrade transitions from happening.
58+
59+
Following recommendations are especially relevant:
60+
61+
* [Timeouts](implement-extensions.md#timeouts)
62+
* [Idempotence](implement-extensions.md#idempotence)
63+
* [Deterministic result](implement-extensions.md#deterministic-result)
64+
* [Error messages](implement-extensions.md#error-messages)
65+
* [Error management](implement-extensions.md#error-management)
66+
* [Avoid dependencies](implement-extensions.md#avoid-dependencies)
67+
## Definitions

For additional details about the OpenAPI spec of the in-place update hooks, please download the [`runtime-sdk-openapi.yaml`]({{#releaselink repo:"https://github.com/kubernetes-sigs/cluster-api" gomodule:"sigs.k8s.io/cluster-api" asset:"runtime-sdk-openapi.yaml" version:"1.11.x"}})
file and then open it in the [Swagger UI](https://editor.swagger.io/).

### CanUpdateMachine

This hook is called by KCP when performing the "can update in-place" check for a control plane machine.

Example request

```yaml
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
kind: CanUpdateMachineRequest
settings: <Runtime Extension settings>
current:
  machine:
    apiVersion: cluster.x-k8s.io/v1beta2
    kind: Machine
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
  infrastructureMachine:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: VSphereMachine
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
  bootstrapConfig:
    apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
    kind: KubeadmConfig
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
desired:
  machine:
    ...
  infrastructureMachine:
    ...
  bootstrapConfig:
    ...
```

Note:
- All the objects will have the latest API version known by Cluster API.
- Only spec is provided; status fields are not included.
- When more than one extension is supported, the current state will already include the changes that other Runtime Extensions can handle in-place.

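An extension typically starts by working out which spec fields differ between `current` and `desired`. The following is a minimal sketch, assuming the request has already been decoded into plain dictionaries; the helper `changed_paths` and the abbreviated payload are hypothetical and not part of the Runtime SDK.

```python
from typing import Any


def changed_paths(current: Any, desired: Any, prefix: str = "") -> list[str]:
    """Return JSON Pointer paths (RFC 6901) where desired differs from current."""
    if isinstance(current, dict) and isinstance(desired, dict):
        paths: list[str] = []
        for key in sorted(set(current) | set(desired)):
            # Escape "~" and "/" in keys as required by RFC 6901.
            token = str(key).replace("~", "~0").replace("/", "~1")
            if key not in current or key not in desired:
                paths.append(f"{prefix}/{token}")
            else:
                paths.extend(changed_paths(current[key], desired[key], f"{prefix}/{token}"))
        return paths
    # Scalars and lists are compared as a whole.
    return [] if current == desired else [prefix]


# Hypothetical, abbreviated payload; real requests carry full objects.
request = {
    "current": {"machine": {"spec": {"version": "v1.33.0", "clusterName": "test-cluster"}}},
    "desired": {"machine": {"spec": {"version": "v1.34.0", "clusterName": "test-cluster"}}},
}
print(changed_paths(request["current"]["machine"], request["desired"]["machine"]))
# ['/spec/version']
```

The extension can then compare these paths with the set of changes it knows how to apply in-place.
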
Example Response

```yaml
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
kind: CanUpdateMachineResponse
status: Success # or Failure
message: "error message if status == Failure"
machinePatch:
  patchType: JSONPatch
  patch: <JSON-patch>
infrastructureMachinePatch:
  ...
bootstrapConfigPatch:
  ...
```

Note:
- Extensions should return per-object patches to be applied to the current objects to indicate which changes they can handle in-place.
- Only fields in the Machine/InfraMachine/BootstrapConfig spec have to be covered by patches.
- Patches must be in JSONPatch or JSONMergePatch format.

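To illustrate the per-object patches described above, here is a minimal sketch of an extension that can only change the Kubernetes version in-place. `SUPPORTED_PATHS`, `lookup` and `build_machine_patch` are hypothetical names, the sketch assumes every supported field is present in both objects, and the patch is kept as a Python list; how the patch is serialized on the wire is defined by the OpenAPI spec and is not shown here.

```python
import json

# Hypothetical: spec fields this extension knows how to change in-place.
SUPPORTED_PATHS = {"/spec/version"}


def lookup(doc, pointer):
    """Resolve a simple JSON Pointer (no escaped characters) in nested dicts."""
    for token in pointer.lstrip("/").split("/"):
        doc = doc[token]
    return doc


def build_machine_patch(current, desired):
    """Build JSONPatch (RFC 6902) operations for the supported fields that differ."""
    ops = []
    for path in sorted(SUPPORTED_PATHS):
        new_value = lookup(desired, path)
        if lookup(current, path) != new_value:
            ops.append({"op": "replace", "path": path, "value": new_value})
    return ops


current = {"spec": {"version": "v1.33.0", "failureDomain": "zone-a"}}
desired = {"spec": {"version": "v1.34.0", "failureDomain": "zone-b"}}

response = {
    "apiVersion": "hooks.runtime.cluster.x-k8s.io/v1alpha1",
    "kind": "CanUpdateMachineResponse",
    "status": "Success",
    # Only the version change is claimed; failureDomain is left uncovered.
    "machinePatch": {"patchType": "JSONPatch", "patch": build_machine_patch(current, desired)},
}
print(json.dumps(response["machinePatch"]["patch"]))
# [{"op": "replace", "path": "/spec/version", "value": "v1.34.0"}]
```

Because the `failureDomain` change is not covered by any patch in this example, the desired state is only partially covered.
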
### CanUpdateMachineSet

This hook is called by the MachineDeployment controller when performing the "can update in-place" check for all the Machines controlled by
a MachineSet.

Example request

```yaml
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
kind: CanUpdateMachineSetRequest
settings: <Runtime Extension settings>
current:
  machineSet:
    apiVersion: cluster.x-k8s.io/v1beta2
    kind: MachineSet
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
  infrastructureMachineTemplate:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: VSphereMachineTemplate
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
  bootstrapConfigTemplate:
    apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
    kind: KubeadmConfigTemplate
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
desired:
  machineSet:
    ...
  infrastructureMachineTemplate:
    ...
  bootstrapConfigTemplate:
    ...
```

Note:
- All the objects will have the latest API version known by Cluster API.
- Only spec is provided; status fields are not included.
- When more than one extension is supported, the current state will already include the changes that other Runtime Extensions can handle in-place.

Example Response

```yaml
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
kind: CanUpdateMachineSetResponse
status: Success # or Failure
message: "error message if status == Failure"
machineSetPatch:
  patchType: JSONPatch
  patch: <JSON-patch>
infrastructureMachineTemplatePatch:
  ...
bootstrapConfigTemplatePatch:
  ...
```

Note:
- Extensions should return per-object patches to be applied to the current objects to indicate which changes they can handle in-place.
- Only fields in the MachineSet/InfraMachineTemplate/BootstrapConfigTemplate spec have to be covered by patches.
- Patches must be in JSONPatch or JSONMergePatch format.

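Patches may also use the JSONMergePatch format. The sketch below applies a JSON Merge Patch (RFC 7386) to the current object and then checks whether the result matches the desired object, which mirrors the idea that uncovered changes lead to a fallback to a regular rollout. It is an illustration only: `apply_merge_patch` and `fully_covered` are hypothetical helpers, not Cluster API's implementation.

```python
import copy


def apply_merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7386) and return the patched document."""
    if not isinstance(patch, dict):
        return copy.deepcopy(patch)
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null removes the key
        else:
            result[key] = apply_merge_patch(result.get(key), value)
    return result


def fully_covered(current, desired, patches):
    """True if applying every extension's patch to current yields desired."""
    state = current
    for patch in patches:
        state = apply_merge_patch(state, patch)
    return state == desired


current = {"spec": {"template": {"spec": {"version": "v1.33.0"}}, "replicas": 3}}
desired = {"spec": {"template": {"spec": {"version": "v1.34.0"}}, "replicas": 3}}
patch = {"spec": {"template": {"spec": {"version": "v1.34.0"}}}}

print(fully_covered(current, desired, [patch]))  # True: in-place is possible
print(fully_covered(current, desired, []))       # False: fall back to a rollout
```
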
### UpdateMachine

This hook is called by the Machine controller when performing the in-place updates for a Machine.

Example request

```yaml
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
kind: UpdateMachineRequest
settings: <Runtime Extension settings>
desired:
  machine:
    apiVersion: cluster.x-k8s.io/v1beta2
    kind: Machine
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
  infrastructureMachine:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: VSphereMachine
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
  bootstrapConfig:
    apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
    kind: KubeadmConfig
    metadata:
      name: test-cluster
      namespace: test-ns
    spec:
      ...
```

Note:
- Only desired is provided (the external updater extension should know the current state of the Machine).
- Only spec is provided; status fields are not included.

Example Response

```yaml
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
kind: UpdateMachineResponse
status: Success # or Failure
message: "error message if status == Failure"
retryAfterSeconds: 10
```

Note:
- The status of the update operation is determined by the CommonRetryResponse fields:
  - Status=Success + RetryAfterSeconds > 0: update is in progress
  - Status=Success + RetryAfterSeconds = 0: update completed successfully
  - Status=Failure: update failed
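
The three outcomes listed above can be expressed as a small decision function. This is a sketch only: `update_phase` is a hypothetical helper, and treating any status other than `Success` as a failure is an assumption made here for safety.

```python
def update_phase(response: dict) -> str:
    """Map the CommonRetryResponse fields to the three outcomes listed above."""
    if response.get("status") != "Success":
        return "failed"
    if response.get("retryAfterSeconds", 0) > 0:
        return "in progress"
    return "completed"


print(update_phase({"status": "Success", "retryAfterSeconds": 10}))  # in progress
print(update_phase({"status": "Success", "retryAfterSeconds": 0}))   # completed
print(update_phase({"status": "Failure", "message": "disk full"}))   # failed
```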

docs/book/src/tasks/experimental-features/runtime-sdk/index.md

Lines changed: 1 addition & 0 deletions
@@ -31,5 +31,6 @@ Additional documentation:
 * [Implementing Runtime Extensions](./implement-extensions.md)
 * [Implementing Lifecycle Hook Extensions](./implement-lifecycle-hooks.md)
 * [Implementing Topology Mutation Hook Extensions](./implement-topology-mutation-hook.md)
+* [Implementing In-Place Update Hooks Extensions](./implement-in-place-update-hooks.md)
 * For Cluster operators:
 * [Deploying Runtime Extensions](./deploy-runtime-extension.md)
