Kubernetes-native system managing the full lifecycle of conformant Kubernetes clusters as a service on Alicloud, AWS, Azure, GCP, OpenStack, EquinixMetal, vSphere, MetalStack, and Kubevirt with minimal TCO.
Gardener extension provider that implements the DNSRecord resource for Cloudflare
What this PR does / why we need it:
The expiration of bootstrap tokens is hardcoded in the machine controller deployment (it depends on the provider, but is mostly approx. 20 minutes).
There is an option to override this default per worker group with machineCreationTimeout.
This option is considered during machine creation, but not for the bootstrap token.
This means that if a machine takes more than 20 minutes to be created, the bootstrap token has already expired and the required configs cannot be fetched from the cluster.
This PR makes the bootstrap token respect the worker's machineCreationTimeout option.
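For illustration only, the gist of the change can be sketched like this. tokenTTL is a hypothetical helper (not the actual diff), the parameter type is illustrative, and the 20 minute default is taken from the description above:

package bootstraptoken

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// defaultTTL mirrors the previously hardcoded expiration of roughly 20 minutes.
const defaultTTL = 20 * time.Minute

// tokenTTL returns the bootstrap token lifetime for a worker pool. If the pool
// defines a machineCreationTimeout longer than the default, the token stays
// valid at least that long, so slowly created machines can still bootstrap.
func tokenTTL(machineCreationTimeout *metav1.Duration) time.Duration {
	if machineCreationTimeout != nil && machineCreationTimeout.Duration > defaultTTL {
		return machineCreationTimeout.Duration
	}
	return defaultTTL
}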
Which issue(s) this PR fixes: Fixes #772
Special notes for your reviewer:
Tested with the Equinix provider.
Release note:
Fix a bug in the bootstrap token creation that caused nodes to be unable to join the cluster due to an expired bootstrap token.
How to categorize this issue?
/kind bug /priority 3
What happened:
My machines almost always take more than 20 minutes to be created at the cloud provider.
Therefore the workers are configured with a specific machineCreationTimeout of 50m.
When a node comes up after approx. 30 minutes, it is unable to join the cluster due to an expired bootstrap token.
What you expected to happen:
The bootstrap token for a node does not expire before the configured spec.worker[].machineController.machineCreationTimeout.
How to reproduce it (as minimally and precisely as possible):
Configure a machineCreationTimeout of more than 20m, for example as sketched below.
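A sketch of a worker pool with an increased machine creation timeout; the machineControllerManagerSettings field path is an assumption based on recent Gardener Shoot specs (the issue refers to it as spec.worker[].machineController.machineCreationTimeout) and may differ between versions:

spec:
  provider:
    workers:
    - name: worker-pool-1   # hypothetical pool name
      machineControllerManagerSettings:
        machineCreationTimeout: 50m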
Anything else we need to know?:
Environment:
Kubernetes version (use kubectl version): 1.22.17

The cloud controller crashes in github.com/equinix/cloud-provider-equinix-metal/metal.getMetalConfig due to a nil pointer exception:
I0127 09:05:49.125494 1 serving.go:348] Generated self-signed cert in-memory
W0127 09:05:55.522563 1 authentication.go:423] failed to read in-cluster kubeconfig for delegated authentication: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
W0127 09:05:55.522602 1 authorization.go:226] failed to read in-cluster kubeconfig for delegated authorization: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
W0127 09:05:55.522616 1 authorization.go:194] No authorization-kubeconfig provided, so SubjectAccessReview of authorization tokens won't work.
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x4b0cd4]
goroutine 1 [running]:
io.ReadAll({0x0, 0x0})
/usr/local/go/src/io/io.go:661 +0xd4
io/ioutil.ReadAll(...)
/usr/local/go/src/io/ioutil/ioutil.go:27
github.com/equinix/cloud-provider-equinix-metal/metal.getMetalConfig({_, _})
/workspace/metal/config.go:100 +0xbd
github.com/equinix/cloud-provider-equinix-metal/metal.init.0.func1({0x0?, 0x0?})
/workspace/metal/cloud.go:51 +0x5d
k8s.io/cloud-provider.GetCloudProvider({0x7ffecbf0a79e?, 0x8?}, {0x0, 0x0})
/go/pkg/mod/k8s.io/cloud-provider@v0.25.0/plugins.go:86 +0xf9
k8s.io/cloud-provider.InitCloudProvider({0x7ffecbf0a79e, 0xc}, {0x0?, 0x8?})
/go/pkg/mod/k8s.io/cloud-provider@v0.25.0/plugins.go:164 +0x2c5
main.cloudInitializer(0xc00011c820)
/workspace/main.go:57 +0x4c
k8s.io/cloud-provider/app.NewCloudControllerManagerCommand.func1(0xc000004a00?, {0xc000145590?, 0x5?, 0x5?})
/go/pkg/mod/k8s.io/cloud-provider@v0.25.0/app/controllermanager.go:88 +0x1e9
github.com/spf13/cobra.(*Command).execute(0xc000004a00, {0xc00011e130, 0x5, 0x5})
/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:856 +0x67c
github.com/spf13/cobra.(*Command).ExecuteC(0xc000004a00)
/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:974 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:902
main.main()
/workspace/main.go:48 +0x2cc
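For reference, the io.ReadAll({0x0, 0x0}) frame indicates that a nil io.Reader reached ReadAll; calling it on a nil reader panics exactly like this. A minimal, standalone illustration of that failure mode (unrelated to the actual provider code):

package main

import (
	"fmt"
	"io"
)

func main() {
	// A nil io.Reader interface, e.g. when no cloud config stream was provided.
	var config io.Reader

	// ReadAll invokes config.Read on the nil interface, which panics with
	// "invalid memory address or nil pointer dereference", matching the
	// io.ReadAll({0x0, 0x0}) frame in the trace above.
	data, err := io.ReadAll(config)
	fmt.Println(len(data), err)
}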
Am I missing a critical config?
My deployment looks like:
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: cloud-controller-manager
name: cloud-controller-manager
spec:
replicas: 1
selector:
matchLabels:
app: cloud-controller-manager
template:
metadata:
annotations:
checksum/secret-cloudprovider: 90174e466bb73b0d9c3090e0fe4466ee0ba63cd4f2146562cabd422f96fde3f2
labels:
app: cloud-controller-manager
gardener.cloud/role: controlplane
networking.gardener.cloud/from-prometheus: allowed
networking.gardener.cloud/to-dns: allowed
networking.gardener.cloud/to-public-networks: allowed
networking.gardener.cloud/to-shoot-apiserver: allowed
spec:
automountServiceAccountToken: false
containers:
- command:
- ./cloud-provider-equinix-metal
- --cloud-provider=equinixmetal
- --leader-elect=false
- --allow-untagged-cloud=true
- --authentication-skip-lookup=true
- --kubeconfig=/var/run/secrets/gardener.cloud/shoot/generic-kubeconfig/kubeconfig
env:
- name: METAL_API_KEY
valueFrom:
secretKeyRef:
key: apiToken
name: cloudprovider
- name: METAL_PROJECT_ID
valueFrom:
secretKeyRef:
key: projectID
name: cloudprovider
- name: METAL_METRO_NAME
value: fr
- name: METAL_LOAD_BALANCER
value: metallb:///
image: docker.io/equinix/cloud-provider-equinix-metal:v3.5.0
imagePullPolicy: IfNotPresent
name: cloud-provider-equinix-metal
ports:
- containerPort: 10253
name: metrics
protocol: TCP
resources:
limits:
cpu: 500m
memory: 512Mi
requests:
cpu: 100m
memory: 64Mi
volumeMounts:
- mountPath: /var/run/secrets/gardener.cloud/shoot/generic-kubeconfig
name: kubeconfig
readOnly: true
tolerations:
- key: CriticalAddonsOnly
operator: Exists
volumes:
- name: kubeconfig
projected:
defaultMode: 420
sources:
- secret:
items:
- key: kubeconfig
path: kubeconfig
name: generic-token-kubeconfig
optional: false
- secret:
items:
- key: token
path: token
name: shoot-access-cloud-controller-manager
optional: false
(#9) Add gitignore for logs and cache
Makes it easier to manually synchronize workspaces. Logs and cache should not be shared between workspaces.
Add support for macos in alias file
Update configs add terminator and gardener config
add codesphere helper script
Add woff font support
Add better gzip options
Update nginx wordpress config to support routes
Increase max file upload size for wordpress plugins
Remove testgrid link for k8s versions < 1.15
Removed testgrid links to outdated versions and added a disclaimer
Merge pull request #105 from schrodit/patch-2
Remove testgrid links to tests with kubernetes version < 1.15
Adjust version disclaimer message
Merge pull request #106 from schrodit/adjust-disclaimer
Adjust version disclaimer message
Allow to configure subnet which a router should be attached to
Co-authored-by: Raphael Vogel raphael.vogel@sap.com
Co-authored-by: Martin Weindel martin.weindel@sap.com
Merge pull request #92 from gardener/fip-subnet-infra
Allow to configure subnet which a router should be attached to
Remove NoExecute toleration from CCM
Merge pull request #108 from stoyanr/remove-noexecute-toleration
Remove NoExecute toleration from CCM
Add missing CRD under example/
Signed-off-by: ialidzhikov i.alidjikov@gmail.com
Merge pull request #110 from ialidzhikov/nit/missing-crd
Nit: Add missing CRD under example/
add rolling update strategy to deployment
Adapt to terraform v0.12 language changes
Co-authored-by: afritzler andreas.fritzler@sap.com
Signed-off-by: ialidzhikov i.alidjikov@gmail.com
Merge pull request #111 from mandelsoft/master
add rolling update strategy to deployment
Merge pull request #112 from ialidzhikov/cleanup/tf-syntax
Adapt to terraform v0.12 language changes
Delete running terraformer Pods before deleting infrastructure
Merge pull request #113 from tim-ebert/fix/ensure-cleanup
Delete running terraformer Pods before deleting infrastructure
Adapt the extension to Loki
Upgrade github.com/gardener/terraformer from v1.2.0 to v1.3.0
Skip storageclass cleanup for hibernated shoots
Merge pull request #116 from gardener/ci-vixfyqbwm
[ci:component:github.com/gardener/terraformer:v1.2.0->v1.3.0]
Increase max body size to 500M
Increase php max_execution_time and max_input_vars to recommended values