sayboras
Repos
56
Followers
85
Following
54

eBPF-based Networking, Security, and Observability

13637
1856

Fast linters Runner for Go

11547
1089

The official CLI for Amazon EKS

4228
1163

Run Kubernetes locally

25191
4054

My Personal website tammach.dev

0
0

Kubernetes IN Docker - local clusters for testing Kubernetes

10800
1241

Events

Make fsnotify event more readable.

Signed-off-by: yanggang gang.yang@daocloud.io

datapath: remove unused ENCRYPT_NODE macro

It's safe to remove this unused macro.

Signed-off-by: Julian Wiedmann jwi@isovalent.com

bpf: nodeport: reduce scope of macaddr variables

The macaddr variables are only needed when updating the neighbour map.

Signed-off-by: Julian Wiedmann jwi@isovalent.com

bpf: nodeport: fine-tune path for delivery to local backend

When delivering a packet to its selected backend, we already have a check for whether the backend is local. Also use this path when deciding whether the packet should be passed up to the stack.

Signed-off-by: Julian Wiedmann jwi@isovalent.com

bpf: lb: remove direction argument in lb*_extract_key()

It's always CT_EGRESS.

Signed-off-by: Julian Wiedmann jwi@isovalent.com

build(deps): bump google.golang.org/grpc from 1.50.1 to 1.51.0

Bumps google.golang.org/grpc from 1.50.1 to 1.51.0.


updated-dependencies:

  • dependency-name: google.golang.org/grpc
    dependency-type: direct:production
    update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] support@github.com

.clomonitor: Update CLOMonitor checks exemptions

Add dangerous workflow, signed releases and token permissions checks to CLOMonitor exemptions.

Signed-off-by: Sandipan Panda samparksandipan@gmail.com

ingestion/gateway-api: Map backend weight to model

This commit makes sure the backend weight value is propagated to the internal model.

Relates: 58c8aff11062f944e9f3a18569c647c64edd1bc9

Reported-by: Nico Vibert nicolas.vibert@isovalent.com Signed-off-by: Tam Mach tam.mach@cilium.io
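
As a rough illustration of what propagating a backend weight into a model can look like, here is a minimal Go sketch. The BackendRef and Backend types and the toModel function are hypothetical stand-ins, not Cilium's actual model; the fallback of 1 for an unset weight follows the default documented for Gateway API backend references and is an assumption about how the model should treat that case.

```go
package main

import "fmt"

// BackendRef is a hypothetical, trimmed-down backend reference.
// Weight is optional, so it is modelled as a pointer.
type BackendRef struct {
	Name   string
	Weight *int32
}

// Backend is a hypothetical internal model entry.
type Backend struct {
	Name   string
	Weight int32
}

// toModel copies each reference into the model and carries the weight along.
// An unset weight falls back to 1 (assumed default).
func toModel(refs []BackendRef) []Backend {
	out := make([]Backend, 0, len(refs))
	for _, r := range refs {
		w := int32(1)
		if r.Weight != nil {
			w = *r.Weight
		}
		out = append(out, Backend{Name: r.Name, Weight: w})
	}
	return out
}

func main() {
	w := int32(90)
	fmt.Println(toModel([]BackendRef{{Name: "v1", Weight: &w}, {Name: "v2"}}))
}
```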

Created at 10 hours ago
delete branch
sayboras delete branch tam/secrets-permission-gateway
Created at 10 hours ago
delete branch
sayboras delete branch tam/service-weight
Created at 10 hours ago

build(deps): bump golang.org/x/tools from 0.2.0 to 0.3.0

Bumps golang.org/x/tools from 0.2.0 to 0.3.0.


updated-dependencies:

  • dependency-name: golang.org/x/tools
    dependency-type: direct:production
    update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] support@github.com

helm: Add secret permission for agent

This commit makes sure that the cilium agent has the required secret permission if Gateway API (but not Ingress) is enabled. The original commit 759f7161a925b4e837338bd5c667c1abd8e59452 added the same logic for the operator, but missed the agent part.

The end goal is to have Ingress and Gateway API as independent features, so that users can enable only what they need. Without this change, Gateway API works only if ingressController.enabled is set and the default secret namespace is used (e.g. cilium-secrets).

Relates: 759f7161a925b4e837338bd5c667c1abd8e59452 Signed-off-by: Tam Mach tam.mach@cilium.io

Created at 11 hours ago
issue comment
ingestion/gateway-api: Map backend weight to model

These changes are only for Gateway API, so full CI is not required. The conformance test passes, in addition to manual verification. Marking this ready to merge.

Created at 11 hours ago
issue comment
helm/gateway-api: Add secret permission for agent

/test-1.24-4.19

Created at 23 hours ago
issue comment
ingestion/gateway-api: Map backend weight to model

Full CI is not required, the changes are covered by Gateway API conformance test.

Created at 23 hours ago
issue comment
helm/gateway-api: Add secret permission for agent

/test

Created at 1 day ago

helm: Add secret permission for agent

This commit makes sure that the cilium agent has the required secret permission if Gateway API (but not Ingress) is enabled. The original commit 759f7161a925b4e837338bd5c667c1abd8e59452 added the same logic for the operator, but missed the agent part.

The end goal is to have Ingress and Gateway API as independent features, so that users can enable only what they need. Without this change, Gateway API works only if ingressController.enabled is set and the default secret namespace is used (e.g. cilium-secrets).

Relates: 759f7161a925b4e837338bd5c667c1abd8e59452 Signed-off-by: Tam Mach tam.mach@cilium.io

Created at 1 day ago
issue comment
helm/gateway-api: Add secret permission for agent

/test

Created at 1 day ago

pkg/k8s: fallback on retrieving CiliumNode from kube-apiserver

Retrieving objects from caches can be useful to avoid needless requests to kube-apiserver. In the unlikely event that the object doesn't exist in the local cache, Cilium can try to retrieve it from kube-apiserver directly. In this particular case, with CiliumNode, Cilium hits a fatal error because it is unable to retrieve the CiliumNode from the cache due to subsystem initialization issues, so we fall back on retrieving the object directly from kube-apiserver.

In this case, the subsystem initialization issue happened because the CiliumNode watcher is blocked on its event handler by the egressGatewayManager [1], which is in turn blocked by the initialization of the identity allocator [2]. Unfortunately, the identity allocator is only initialized at a later stage, which prevents the CiliumNode cache from being populated with all of its nodes.

[1] https://github.com/cilium/cilium/blob/933bdcbec9319b0148b12688f720fbaaf55e0dba/pkg/k8s/watchers/cilium_node.go#L56
[2] https://github.com/cilium/cilium/blob/933bdcbec9319b0148b12688f720fbaaf55e0dba/pkg/egressgateway/manager.go#L83

Fixes: 69e4c6974891 ("k8s: optimize API calls made to kube-apiserver") Signed-off-by: André Martins andre@cilium.io
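
The cache-then-apiserver pattern described above can be sketched in Go roughly as follows. The getter type, errNotFound, and getWithFallback are hypothetical names used only for illustration; the real code works with Kubernetes clients and listers rather than plain functions.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound signals a cache miss in this sketch.
var errNotFound = errors.New("not found")

// getter is a hypothetical lookup function keyed by node name.
type getter func(name string) (string, error)

// getWithFallback tries the local cache first and only asks the
// (more expensive) API server when the cache has no entry.
func getWithFallback(cache, apiServer getter, name string) (string, error) {
	obj, err := cache(name)
	if err == nil {
		return obj, nil
	}
	if !errors.Is(err, errNotFound) {
		// Any other error is returned as-is instead of being masked.
		return "", err
	}
	return apiServer(name)
}

func main() {
	emptyCache := func(string) (string, error) { return "", errNotFound }
	api := func(name string) (string, error) { return "CiliumNode/" + name, nil }
	fmt.Println(getWithFallback(emptyCache, api, "node-1"))
}
```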

operator: fix CEP GC

When a CEP was converted to an internal CEP structure, the UID field was not copied, causing the delete requests for CEPs to have their UID precondition set to empty. When kube-apiserver received such a delete request, it didn't delete the CEP because the empty UID didn't match the existing UID.

Fixes: 6f7bf6c51f7a ("Prevent CiliumEndpoint removal by non-owning agent")

Reported-by: Bruno Custódio bruno@isovalent.com Signed-off-by: André Martins andre@cilium.io
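
A toy model of a UID-guarded delete may help show why the missing copy mattered. The store type and deleteWithUID below are hypothetical and only mimic the behavior described in the commit message; they are not the Kubernetes API.

```go
package main

import (
	"errors"
	"fmt"
)

// store maps object names to UIDs, a toy stand-in for the API server.
type store map[string]string

var (
	errMissing  = errors.New("not found")
	errConflict = errors.New("precondition failed: UID mismatch")
)

// deleteWithUID removes name only if the caller's UID matches the stored
// one, mimicking a delete request that carries a UID precondition.
func (s store) deleteWithUID(name, uid string) error {
	current, ok := s[name]
	if !ok {
		return errMissing
	}
	if current != uid {
		return errConflict
	}
	delete(s, name)
	return nil
}

func main() {
	s := store{"cep-a": "1234"}
	fmt.Println(s.deleteWithUID("cep-a", ""))     // UID was not copied: rejected
	fmt.Println(s.deleteWithUID("cep-a", "1234")) // UID copied: deleted
}
```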

docker: Do not specify syntax

Not specifying the syntax starts builds faster, but relies on the default syntax being recent enough. This is currently the case, so remove the syntax references.

Signed-off-by: Jarno Rajahalme jarno@isovalent.com

images: update cilium-{runtime,builder}

Signed-off-by: Jarno Rajahalme jarno@isovalent.com

bugtool: Fix URL to blog.ralch.com

Signed-off-by: yanggang gang.yang@daocloud.io Signed-off-by: Joe Stringer joe@cilium.io

docs: fix deployment resource type output

Since k8s removed support for the extensions/v1beta1 API version after 1.16, we should update the docs to the latest stable version.

Signed-off-by: cleverhu shouping.hu@daocloud.io

pkg/k8s: do not read k8s node annotations if they are not written

When there is an annotation in the k8s node object, the annotation io.cilium.network.ipv4-cilium-host is used as the CiliumInternal IP address of the CiliumNode object in [1]. Whenever Cilium updates any state in the CiliumNode, it retrieves all IP addresses from the k8s node, including the ones from annotations, and appends the local node's IP addresses, including the correct internal / router IP address, in [2]. Since this is a list, the annotation's IP address is always used first and all other Cilium agents will wrongly use it for any operation.

[1] https://github.com/cilium/cilium/blob/927bd8c26904ff92e42c61cec6d00ea8ac062c05/pkg/nodediscovery/nodediscovery.go#L453-L459
[2] https://github.com/cilium/cilium/blob/927bd8c26904ff92e42c61cec6d00ea8ac062c05/pkg/nodediscovery/nodediscovery.go#L474-L489

Fixes: 73d6cae2c906 ("install: default AnnotateK8sNode to false") Signed-off-by: André Martins andre@cilium.io
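
The ordering problem can be illustrated with a small Go sketch. The address type, firstOfKind, and the IP values are all hypothetical; the point is only that a consumer taking the first match from a merged list sees the annotation-derived entry rather than the one appended later.

```go
package main

import "fmt"

// address is a hypothetical typed node address.
type address struct {
	kind string
	ip   string
}

// firstOfKind returns the first address of the given kind, the way a
// consumer that takes the first match from a list would.
func firstOfKind(addrs []address, kind string) (string, bool) {
	for _, a := range addrs {
		if a.kind == kind {
			return a.ip, true
		}
	}
	return "", false
}

func main() {
	fromAnnotation := []address{{"CiliumInternalIP", "10.0.0.1"}} // stale value (made up)
	local := []address{{"CiliumInternalIP", "10.0.1.1"}}          // correct value (made up)
	merged := append(fromAnnotation, local...)
	// The stale entry wins because it comes first in the merged list.
	fmt.Println(firstOfKind(merged, "CiliumInternalIP"))
}
```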

pkg/nodediscovery: do not use Node annotations when mutating CiliumNode

When using CiliumNode, the agent's source of truth should be the agent itself and not k8s node annotations. Thus we will not use the annotations for the CiliumInternalIP address when generating a CiliumNode from the k8s Node resource.

Signed-off-by: André Martins andre@cilium.io

test: Fail on router IP mismatch warnings

We try to restore the router IP both from the filesystem (first) and from Kubernetes objects (as a fallback). If the two IP addresses don't match, we emit a warning.

There is no good reason for this to happen in CI so we should fail the test if that warning ever shows up. Doing so would have prevented the flake fixed by the previous commit.

Signed-off-by: Paul Chaignon paul@cilium.io

.github: Explicitly set build-commits job runner image version

github: Install libtinfo5 for clang in build-commits CI job

Signed-off-by: Chance Zibolski chance.zibolski@gmail.com

docs: Update Cilium Sphinx RTD Theme reference

This updates Documentation/requirements.txt to reference a new commit hash on the theme's v1.0 branch. This will trigger an RTD build.

Signed-off-by: Stacy Kim stacy.kim@ucla.edu

gha: Pin ubuntu-20.04 for conformance-test-ipv6

This commit avoids Ubuntu version drift on the runner until the proper version upgrade is done.

Signed-off-by: Tam Mach tam.mach@cilium.io

.github: fix bpf-checks on ubuntu-latest runner

Take the same approach as in 5f7aa03fcc7b (".github: Explicitly set build-commits job runner image version").

Signed-off-by: Julian Wiedmann jwi@isovalent.com

relay: Add Go runtime metrics and process metrics

Currently the agent has a GoCollector and a ProcessCollector but relay does not. This updates relay for consistency and enhanced debuggability.

Signed-off-by: Chance Zibolski chance.zibolski@gmail.com

daemon/cmd: Fix error handling for getting proxy port

The error check handling should be done immediately after the GetProxyPort() call, in order to error out as soon as possible.

This unchecked error can cascade into code that integrates with the Agent and cause behavior that is potentially difficult to track down.

Signed-off-by: Chris Tarazi chris@isovalent.com
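
The general pattern, checking the error directly after the call that produced it, looks roughly like this in Go. Both getProxyPort and redirectAddr are hypothetical stand-ins written for this sketch; they do not reproduce the signature of the real GetProxyPort().

```go
package main

import (
	"errors"
	"fmt"
)

// getProxyPort is a hypothetical stand-in for a lookup that may fail.
func getProxyPort(name string) (uint16, error) {
	if name == "" {
		return 0, errors.New("proxy name is empty")
	}
	return 10001, nil
}

// redirectAddr checks the error right after the call, so a zero port
// never leaks into the address that is built afterwards.
func redirectAddr(name string) (string, error) {
	port, err := getProxyPort(name)
	if err != nil {
		return "", fmt.Errorf("getting proxy port: %w", err)
	}
	return fmt.Sprintf("127.0.0.1:%d", port), nil
}

func main() {
	fmt.Println(redirectAddr("dns"))
	fmt.Println(redirectAddr(""))
}
```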

build(deps): bump go.etcd.io/etcd/client/pkg/v3 from 3.5.5 to 3.5.6

Bumps go.etcd.io/etcd/client/pkg/v3 from 3.5.5 to 3.5.6.


updated-dependencies:

  • dependency-name: go.etcd.io/etcd/client/pkg/v3
    dependency-type: direct:production
    update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] support@github.com

build(deps): bump go.etcd.io/etcd/api/v3 from 3.5.5 to 3.5.6

Bumps go.etcd.io/etcd/api/v3 from 3.5.5 to 3.5.6.


updated-dependencies:

  • dependency-name: go.etcd.io/etcd/api/v3
    dependency-type: direct:production
    update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] support@github.com

docs: add instructions to build the base images from external forks

When opening a PR to update the base images from external forks, the bot does not have necessary permissions to push the changes into the fork. For those cases the developer should amend the commit locally and push the changes themselves.

Fixes: c5a778723a43 ("add auto-commit capability to build base images GH workflow") Signed-off-by: André Martins andre@cilium.io

Revert "relay: Add Go runtime metrics and process metrics"

This reverts commit f0fa683870e1030707ed01b4d4b23b57b2d5c6a8. It appears to introduce a double-initialization of metrics, causing relay initialization failures.

Signed-off-by: Joe Stringer joe@cilium.io
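
The revert message only says that metrics appear to be initialized twice. As a generic illustration of why that tends to fail, here is a toy registry in Go that rejects a second registration under the same name. The registry type and collector name are hypothetical; this is not the relay code or the Prometheus client.

```go
package main

import (
	"errors"
	"fmt"
)

// registry is a toy collector registry that rejects duplicate names.
type registry map[string]bool

var errAlreadyRegistered = errors.New("collector already registered")

// register records name once and fails on any repeat.
func (r registry) register(name string) error {
	if r[name] {
		return errAlreadyRegistered
	}
	r[name] = true
	return nil
}

func main() {
	r := registry{}
	fmt.Println(r.register("go_collector")) // first registration succeeds
	fmt.Println(r.register("go_collector")) // second registration fails
}
```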

daemon/policy: Reduce overhead of policy deletion

This reduces the overhead of deleting policies, since it will now only loop through the policies in the repository once instead of twice.

We originally found this when some of our clusters started having networking problems where legitimate traffic was randomly dropped on pod startup. After a while, we tracked it down to the main cilium event loop having a bad time, and due to CPU contention, it was unable to keep up with the creation and deletions of policies in the cluster.

We grabbed a pprof, and realized that the biggest users of CPU time were "(*Daemon) policyAdd" and "(*Daemon) policyDelete". Overall, we would have expected them to be ~equally costly, and when looking at why, we saw that "(*Daemon) policyDelete" was effectively spending double the amount of CPU time, and that it was calling both "(*Repository) SearchRLocked" and "(*Repository) DeleteByLabelsLocked" for every policy delete; and that they were both ~equally expensive.

After some more investigation, we realised that we could omit the call to "(*Repository) SearchRLocked".

Signed-off-by: Odin Ugedal ougedal@palantir.com Signed-off-by: Odin Ugedal odin@uged.al
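
The idea of folding the search into the delete can be sketched in Go as a single filtering pass that also reports how many entries were removed. The rule type and deleteByLabels are hypothetical and much simpler than the real policy repository.

```go
package main

import "fmt"

// rule is a hypothetical policy rule identified by its labels.
type rule struct{ labels map[string]string }

// matches reports whether the rule carries every label in sel.
func (r rule) matches(sel map[string]string) bool {
	for k, v := range sel {
		if r.labels[k] != v {
			return false
		}
	}
	return true
}

// deleteByLabels filters the slice in one pass and returns the kept rules
// plus the number deleted, so no separate search pass is needed.
func deleteByLabels(rules []rule, sel map[string]string) ([]rule, int) {
	kept := make([]rule, 0, len(rules))
	for _, r := range rules {
		if !r.matches(sel) {
			kept = append(kept, r)
		}
	}
	return kept, len(rules) - len(kept)
}

func main() {
	rules := []rule{
		{labels: map[string]string{"name": "a"}},
		{labels: map[string]string{"name": "b"}},
	}
	kept, deleted := deleteByLabels(rules, map[string]string{"name": "a"})
	fmt.Println(len(kept), deleted)
}
```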

Created at 1 day ago
issue comment
helm/gateway-api: Add secret permission for agent

Minor changes to address the review comments; please find the diff below.

diff --git a/pkg/k8s/watchers/secret.go b/pkg/k8s/watchers/secret.go
index ac6bd7b152..de52a85d4f 100644
--- a/pkg/k8s/watchers/secret.go
+++ b/pkg/k8s/watchers/secret.go
@@ -46,9 +46,7 @@ func (k *K8sWatcher) tlsSecretInit(slimClient slimclientset.Interface, namespace
 			cache.ResourceEventHandlerFuncs{
 				AddFunc: func(obj interface{}) {
 					var valid, equal bool
-					defer func() {
-						k.K8sEventReceived(apiGroup, metricSecret, resources.MetricCreate, valid, equal)
-					}()
+					defer k.K8sEventReceived(apiGroup, metricSecret, resources.MetricCreate, valid, equal)
 					if k8sSecret := k8s.ObjToV1Secret(obj); k8sSecret != nil {
 						valid = true
 						err := k.addK8sSecretV1(k8sSecret)
@@ -174,7 +172,7 @@ func uniq(arr []string) []string {
 		keys[entry] = true
 	}
 
-	var list []string
+	list := make([]string, 0, len(keys))
 	for k := range keys {
 		list = append(list, k)
 	}
diff --git a/pkg/option/config.go b/pkg/option/config.go
index 3adc55de97..e83d4ae883 100644
--- a/pkg/option/config.go
+++ b/pkg/option/config.go
@@ -2558,7 +2558,7 @@ func (c *DaemonConfig) K8sIngressControllerEnabled() bool {
 	return c.EnableIngressController
 }
 
-// K8sGatewayAPIEnabled returns true if Gateway API feature is enabled in   Cilium
+// K8sGatewayAPIEnabled returns true if Gateway API feature is enabled in Cilium
 func (c *DaemonConfig) K8sGatewayAPIEnabled() bool {
 	return c.EnableGatewayAPI
 }
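
One Go detail is worth keeping in mind when reading the first hunk: the arguments of a deferred call are evaluated when the defer statement executes, while a deferred closure reads its variables when the surrounding function returns. The sketch below uses a hypothetical record function in place of the metrics call to show the difference; whether it matters for the change above depends on how valid and equal are used there.

```go
package main

import "fmt"

// record stands in for a metrics call; it appends the observed value.
func record(log *[]bool, valid bool) { *log = append(*log, valid) }

// direct defers the call itself: valid is evaluated at the defer statement.
func direct(log *[]bool) {
	var valid bool
	defer record(log, valid)
	valid = true
}

// wrapped defers a closure: valid is read when the function returns.
func wrapped(log *[]bool) {
	var valid bool
	defer func() { record(log, valid) }()
	valid = true
}

func main() {
	var log []bool
	direct(&log)
	wrapped(&log)
	fmt.Println(log) // [false true]
}
```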
Created at 1 day ago

ingestion/gateway-api: Map backend weight to model

This commit makes sure the backend weight value is propagated to the internal model.

Relates: 58c8aff11062f944e9f3a18569c647c64edd1bc9

Reported-by: Nico Vibert nicolas.vibert@isovalent.com Signed-off-by: Tam Mach tam.mach@cilium.io

Created at 1 day ago
pull request opened
ingestion/gateway-api: Map backend weight to model

This commit makes sure the backend weight value is propagated to the internal model.

Relates: 58c8aff11062f944e9f3a18569c647c64edd1bc9 Signed-off-by: Tam Mach tam.mach@cilium.io

Created at 1 day ago
create branch
sayboras create branch tam/service-weight
Created at 1 day ago

daemon/policy: Reduce overhead of policy deletion

This reduces the overhead of deleting policies, since it will now only loop through the policies in the repository once instead of twice.

We originally found this when some of our clusters started having networking problems where legitimate traffic was randomly dropped on pod startup. After a while, we tracked it down to the main cilium event loop having a bad time, and due to CPU contention, it was unable to keep up with the creation and deletions of policies in the cluster.

We grabbed a pprof, and realized that the biggest users of CPU time were "(*Daemon) policyAdd" and "(*Daemon) policyDelete". Overall, we would have expected them to be ~equally costly, and when looking at why, we saw that "(*Daemon) policyDelete" was effectively spending double the amount of CPU time, and that it was calling both "(*Repository) SearchRLocked" and "(*Repository) DeleteByLabelsLocked" for every policy delete; and that they were both ~equally expensive.

After some more investigation, we realised that we could omit the call to "(*Repository) SearchRLocked".

Signed-off-by: Odin Ugedal ougedal@palantir.com Signed-off-by: Odin Ugedal odin@uged.al

maps/ipcache: add key.Prefix

Same as Key.IPNet, but returns a netip.Prefix instead of a *net.IPNet. This will be used in a subsequent commit.

Signed-off-by: Tobias Klauser tobias@cilium.io

daemon: convert Daemon.restoredCIDRs to netip.Prefix

This avoids conversions to/from net.IPNet when populating and accessing the restored CIDRs.

Signed-off-by: Tobias Klauser tobias@cilium.io

ip: remove unused IPNetToPrefix

This helper function is now unused, remove it.

Signed-off-by: Tobias Klauser tobias@cilium.io

ip: remove deprecated and unused GetCIDRPrefixesFromIPs

Last remaining use was removed in commit bbcadc43758b ("treewide: Switch policy CIDR handling to netip").

Signed-off-by: Tobias Klauser tobias@cilium.io

operator: Fix bucket width for CEP histogram to the documented values

In the CiliumEndpointSliceDensity histogram, the buckets configuration option was unset, so defaults were used (they have values from 0 to 10, as seen here: https://pkg.go.dev/github.com/prometheus/client_golang/prometheus#pkg-variables). This change makes the width of the buckets 10, as documented.

BUG=254474623

Signed-off-by: Aleksander Mistewicz amistewicz@google.com
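
To make "bucket width of 10" concrete, here is a dependency-free Go sketch that computes evenly spaced bucket upper bounds. It mirrors the shape of prometheus.LinearBuckets(start, width, count) from client_golang but is written by hand; the start value and bucket count used in main are assumptions, since the commit message only states the width.

```go
package main

import "fmt"

// linearBuckets returns count upper bounds starting at start and spaced
// width apart.
func linearBuckets(start, width float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start + float64(i)*width
	}
	return buckets
}

func main() {
	// Start and count are illustrative only.
	fmt.Println(linearBuckets(10, 10, 5)) // [10 20 30 40 50]
}
```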

operator: Adjust buckets for CEP queue delay histogram

In CiliumEndpointSliceQueueDelay histogram buckets configuration option was unset, so defaults were used (https://pkg.go.dev/github.com/prometheus/client_golang/prometheus#pkg-variables). This change doubles the number of buckets and increases the end of the last bucket to 1 hour as values larger than this can be observed in large clusters.

Signed-off-by: Aleksander Mistewicz amistewicz@google.com

operator: Use hand picked bucket values

It gives nicer looking values than the computed version.

Signed-off-by: Aleksander Mistewicz amistewicz@google.com

Created at 4 days ago

gha: Pin ubuntu-20.04 for conformance-test-ipv6

This commit avoids Ubuntu version drift on the runner until the proper version upgrade is done.

Signed-off-by: Tam Mach tam.mach@cilium.io

gha: Bump k8s version in kind conformance tests

This commit just bumps the k8s version from 1.19 to 1.25.x in both conformance tests (i.e. ipv4 and ipv6). There is no point running the tests on an EOL k8s version (e.g. 1.19).

This was suggested in the Cilium Slack by Timo/Nicolas/Andre.

Signed-off-by: Tam Mach tam.mach@cilium.io

Created at 5 days ago
delete branch
sayboras delete branch tam/fix-smoketest-ipv6
Created at 5 days ago
started
Created at 5 days ago
pull request opened
gha: Bump k8s version in kind conformance tests

This commit just bumps the k8s version from 1.19 to 1.25.x in both conformance tests (i.e. ipv4 and ipv6). There is no point running the tests on an EOL k8s version (e.g. 1.19).

This was suggested in the Cilium Slack by Timo/Nicolas/Andre.

Signed-off-by: Tam Mach tam.mach@cilium.io

Created at 5 days ago
create branch
sayboras create branch tam/bump-conformance-version
Created at 5 days ago

gha: Pin ubuntu-20.04 for conformance-test-ipv6

This commit avoids Ubuntu version drift on the runner until the proper version upgrade is done.

Signed-off-by: Tam Mach tam.mach@cilium.io

Created at 5 days ago

gha: Pin ubuntu-20.04 for conformance-test-ipv6

Signed-off-by: Tam Mach tam.mach@cilium.io

Created at 5 days ago

gha: Pin ubuntu-20.04 for conformance-test-ipv6

Signed-off-by: Tam Mach tam.mach@cilium.io

Created at 5 days ago