Javascript client

Describe the bug

Hey, after upgrading the client library in our project https://github.com/snyk/kubernetes-monitor to version 0.17.0, we started getting reports from some of our AKS customers that the Kubernetes API server is issuing HTTP 429s and does not let the app recover.

Our customers provided logs of our app, and I can see that shortly after start-up (~15 secs to ~2 mins), AKS starts to heavily rate limit every informer connection. Even if we retry starting the informer, the rate-limiting continues. This triggers the informer.on(ERROR) handlers. It seems to happen after the informer has been set up and a couple of UPDATE events arrive in our app.

Interestingly, this issue does not occur with version 0.15.1 of the client. Something recent must have changed that causes the API server to heavily rate limit the client. Customers are stuck at the moment, and we need help getting our app past the version that uses the 0.15.1 client.

Do you have any idea what could be causing this? Can I provide any additional information to help narrow this down?

For context, here's all that I know:

  • Our app opens 8 namespaced informers to the Kubernetes API (roughly as sketched below)
  • The customer deploys the app in 3 namespaces; each deployment watches only its own namespace (so 24 connections in total)
  • Each namespace has around 50 workloads
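
For reference, each of these informers is set up roughly like this (a simplified sketch; the resource kind and namespace are illustrative, and the real app registers handlers for all event types):

const k8s = require('@kubernetes/client-node');

const kc = new k8s.KubeConfig();
kc.loadFromDefault();
const k8sApi = kc.makeApiClient(k8s.CoreV1Api);

const namespace = 'snyk-monitor'; // illustrative namespace
const listPods = () => k8sApi.listNamespacedPod(namespace);
const informer = k8s.makeInformer(kc, `/api/v1/namespaces/${namespace}/pods`, listPods);

informer.on('update', (obj) => console.log(`Updated: ${obj.metadata.name}`));
informer.on('error', (err) => {
  // This is the handler that keeps firing once the API server starts returning 429s.
  console.error(err);
});
informer.start();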

EDIT:

We are starting to get more and more bug reports, not only on AKS. It looks like it specifically occurs when watching resources in a single namespace. Based on netstat output, there are thousands of connections to the Kubernetes API:

~ $ netstat -n | grep "172.20.0.1:443" | grep -e "FIN_WAIT2" -e "CLOSE_WAIT" | wc -l
2922
~ $ netstat -n | grep "172.20.0.1:443" | grep -e "FIN_WAIT2" -e "CLOSE_WAIT" | wc -l
2757
~ $ netstat -n | grep "172.20.0.1:443" | grep -e "FIN_WAIT2" -e "CLOSE_WAIT" | wc -l
3692
~ $ netstat -n | grep "172.20.0.1:443" | grep -e "FIN_WAIT2" -e "CLOSE_WAIT" | wc -l
3387
~ $ netstat -n | grep "172.20.0.1:443" | grep -e "FIN_WAIT2" -e "CLOSE_WAIT" | wc -l
4885

Client Version

0.17.0

Server Version

1.23.12

Environment (please complete the following information):

  • OS: Linux
  • NodeJS 16
  • Cloud runtime: AKS

Describe the bug
The result of using the labelSelector and limit arguments together is incorrect

Client Version
e.g. 0.18.0

Server Version
e.g. 1.24

To Reproduce
kc.makeApiClient(k8s.CustomObjectsApi).listClusterCustomObject()
If you use the labelSelector and limit arguments together, the result is incorrect. For example, I expect three data items to be returned, but only one data item is returned.

Expected behavior
I expect this to return the correct result

Example Code
Code snippet for what you are doing:

kc.makeApiClient(k8s.CustomObjectsApi).listClusterCustomObject(
    meta.group,
    meta.version,
    meta.plural,
    undefined,
    undefined,
    _continue,
    undefined,
    labelSelector,
    limit,
);

Environment (please complete the following information):

  • OS: [e.g. Windows, Linux]
  • NodeJS Version: 16.18.1
  • Cloud runtime [e.g. Azure Functions, Lambda]


I noticed that the release-1.x branch has quite a few unused imports. I might make another issue to clean up the other lint issues and maybe add a commit hook so we don't fall into this trap in the future.

src/azure_auth_test.ts
src/config.ts
src/gcp_auth_test.ts
src/metrics.ts
src/top.ts
src/util_test.ts
src/web-socket-handler_test.ts

Describe the bug
exec_auth caches the token even after the gcloud ADC token has been invalidated using gcloud auth revoke.
Our product, Cloud Code, integrates with this k8s Node.js client, and we use gcloud ADC for access to GCP services. Our tool allows the user to sign out and sign back in, so we need a way to clear the exec_auth cache, since gke_gcloud_auth_plugin (which is an exec plugin) is a common auth method for GKE clusters.

Client Version
0.16.3

Server Version
n/a

To Reproduce

  1. use gcloud auth revoke
  2. re-sign in using gcloud auth application-default login
  3. The token inside exec_auth is still the old token, despite gcloud having generated a new one

Expected behavior
A way for the application to clear the exec_auth cache on demand.

Example Code
ExecAuth.clearCache() plus access to the ExecAuth instance from the KubeConfig object, or
KubeConfig.clearTokenCache()

We can also send you a PR if you're OK with the API proposal above. Thanks!
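
A rough sketch of what that surface could look like (hypothetical; the method names simply follow the proposal above):

// Hypothetical API surface only (not existing code).
declare class ExecAuth {
  // Drop any cached exec-plugin credentials so the next request re-runs the plugin.
  clearCache(): void;
}

declare class KubeConfig {
  // Convenience variant that clears the cache of the underlying ExecAuth authenticator.
  clearTokenCache(): void;
}

// Usage in our tool after the user signs out / revokes gcloud ADC:
// kubeConfig.clearTokenCache();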

Environment (please complete the following information):

  • OS: MacOSX, Linux and Windows
  • NodeJS Version: 18.7
  • Cloud runtime: GCP


Hi There,

I was checking the documentation and found that there are different ways to connect to K8S. Can you tell me how we can connect to a K8S cluster on EKS that was configured via the eksctl command and aws configure?

I can see a ~/.kube/config file, but what values do I need to connect to different K8S clusters created using this same method? My end goal is to have a web application where users can put in their creds and connect to K8S.
Below is a sample of my ~/.kube/config file; my PC has aws configure done. The server won't have this config, so what kind of creds should I ask from the user?

Sample of my kube config

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: A_LONG_STRING_HERE
    server: https://XXXXXXXXXXXXX.gr7.ap-south-1.eks.amazonaws.com
  name: xxxxxxx.ap-south-1.eksctl.io
contexts:
- context:
    cluster: xxxxxx.ap-south-1.eksctl.io
    user: iam-root-account@xxxxx.ap-south-1.eksctl.io
  name: iam-root-account@xxxxx.ap-south-1.eksctl.io
- context:
    cluster: xxxx.ap-south-1.eksctl.io
    user: user@xxxxx.ap-south-1.eksctl.io
  name: user@xxxxx.ap-south-1.eksctl.io
current-context: iam-root-account@xxxxx.ap-south-1.eksctl.io
kind: Config
preferences: {}
users:
- name: iam-root-account@xxxxxx.ap-south-1.eksctl.io
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - token
      - -i
      - xxxxxxx
      command: aws-iam-authenticator
      env:
      - name: AWS_STS_REGIONAL_ENDPOINTS
        value: regional
      - name: AWS_DEFAULT_REGION
        value: ap-south-1
      provideClusterInfo: false
- name: <NAME>@xxxxx.ap-south-1.eksctl.io
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - token
      - -i
      - xxxxxxxx
      command: aws-iam-authenticator
      env:
      - name: AWS_STS_REGIONAL_ENDPOINTS
        value: regional
      - name: AWS_DEFAULT_REGION
        value: ap-south-1
      provideClusterInfo: false
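
For reference, loading a kubeconfig like the one above with this client looks roughly like the sketch below (the context name is taken from the sample, the path is illustrative; note that the exec entry means aws-iam-authenticator, and AWS credentials for it, must be available wherever the code runs):

const k8s = require('@kubernetes/client-node');

const kc = new k8s.KubeConfig();
kc.loadFromFile('/path/to/.kube/config'); // or kc.loadFromString(configContents)
kc.setCurrentContext('iam-root-account@xxxxx.ap-south-1.eksctl.io');

const coreApi = kc.makeApiClient(k8s.CoreV1Api);
coreApi.listNamespacedPod('default').then((res) => console.log(res.body.items.length));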

How do I get all available API resources, including sub-resources?

Equivalent to kubectl api-resources.
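
For context, a sketch of one way this might be approached with this client, assuming the generated getAPIResources() methods on each group client (sub-resources such as pods/log appear as extra entries in the returned list):

const k8s = require('@kubernetes/client-node');

const kc = new k8s.KubeConfig();
kc.loadFromDefault();

async function listCoreResources() {
  // Core ("v1") group; other group clients (AppsV1Api, BatchV1Api, ...) expose their own getAPIResources().
  const coreApi = kc.makeApiClient(k8s.CoreV1Api);
  const res = await coreApi.getAPIResources();
  res.body.resources.forEach((r) => console.log(`v1 ${r.name}`));
}

listCoreResources();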

Describe the bug
When using the library and bundling it with esbuild (in this case for use within AWS Lambda), an error is thrown.

{
  "errorType": "Runtime.ImportModuleError",
  "errorMessage": "Error: Cannot find module './src/cat'\nRequire stack:\n- /var/task/index.js\n- /var/runtime/index.mjs",
  "trace": [
    "Runtime.ImportModuleError: Error: Cannot find module './src/cat'",
    "Require stack:",
    "- /var/task/index.js",
    "- /var/runtime/index.mjs",
    "    at _loadUserApp (file:///var/runtime/index.mjs:1000:17)",
    "    at async UserFunction.js.module.exports.load (file:///var/runtime/index.mjs:1035:21)",
    "    at async start (file:///var/runtime/index.mjs:1200:23)",
    "    at async file:///var/runtime/index.mjs:1206:1"
  ]
}

This is caused not by this library itself, but by its direct dependency shelljs.
The reason I'm reporting it here is that it is not considered a bug there.

As shelljs only seems to be used in very few places in this library, I'm wondering if it would be an option to remove the dependency and reimplement those few places in a different way.

https://github.com/kubernetes-client/javascript/search?q=shelljs

Client Version
0.18.0

Server Version
e.g. 1.24.0

To Reproduce
Steps to reproduce the behavior:

  • Bundle any project that depends on @kubernetes/client-node with esbuild (or other bundlers)

Expected behavior
No error is thrown and the library can be used.

Example Code

import * as k8s from '@kubernetes/client-node';
const kc = new k8s.KubeConfig();
kc.loadFromDefault();

Environment (please complete the following information):

  • OS: macOS 12.6
  • NodeJS Version: 18
  • Cloud runtime: Lambda

Additional context
If you use AWS CDK NodejsFunction you can use the following bundling settings as a workaround

bundling: {
    forceDockerBundling: true,
    minify: false,
    keepNames: true,
    nodeModules: ['shelljs'],
}

Dear Maintainer,

I am writing to express my interest in getting involved in your community and contributing to your project. I am a passionate developer with a strong interest in open source software and I believe that your project has the potential to make a real impact.

I would like to know more about how I can get involved and start contributing to your project. Are there any specific areas that you are currently looking for help in? Do you have any resources or guides that can help me get started as a contributor?

I am excited about the opportunity to work with your team and contribute to this important project. Please let me know if there is anything I can do to get started.

Thank you for considering my request.

I want to start Kubernetes jobs on a GKE cluster from a Google Cloud Function (Firebase)

I've created a Kubernetes config file using `kubectl config view --flatten -o json`

and loaded it

const k8s = require('@kubernetes/client-node');
const kc = new k8s.KubeConfig();
kc.loadFromString(config)

This works perfectly locally, but the problem is that when running on Cloud Functions the token can't be refreshed, so calls fail after a while.

My kubeconfig file contains

          "user": {
              "auth-provider": {
                  "name": "gcp",
                  "config": {
                      "access-token": "redacted-secret-token",
                      "cmd-args": "config config-helper --format=json",
                      "cmd-path": "/usr/lib/google-cloud-sdk/bin/gcloud",
                      "expiry": "2022-10-20T16:25:25Z",
                      "expiry-key": "{.credential.token_expiry}",
                      "token-key": "{.credential.access_token}"
                  }
              }

I'm guessing the command path points to the gcloud SDK, which is used to get a new token when the current one expires. This works locally, but on Cloud Functions it doesn't, as there is no /usr/lib/google-cloud-sdk/bin/gcloud.

Is there a better way to authenticate?

Describe the bug

When trying to use the patchNamespacedSecret method on the release-1.x branch, the following error is raised:

Error: None of the given media types are supported: application/json-patch+json, application/merge-patch+json, application/strategic-merge-patch+json, application/apply-patch+yaml
    at ObjectSerializer.getPreferredMediaType (ObjectSerializer.js:1760:1)
    at CoreV1ApiRequestFactory.patchNamespacedSecret (CoreV1Api.js:8653:1)
    at ObservableCoreV1Api.patchNamespacedSecret (ObservableAPI.js:9211:1)
    at ObjectCoreV1Api.patchNamespacedSecret (ObjectParamAPI.js:2661:1)
   //...

Client Version
release-1.x

Server Version
1.24.3

To Reproduce
Sample code triggering the error:

const coreV1 = new CoreV1Api(config);
coreV1.patchNamespacedSecret({
    namespace: "default",
    name: "my-secret",
    body: { "tls.crt": "newValue" },
});

Also, kind of related to this issue: I think we are missing a way to instruct patchNamespacedSecret which patch strategy to use. For example, in the snippet above I expect it to use application/merge-patch+json, but right now there is no way to provide the patch strategy.
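
To make that second point concrete, this is the kind of thing it would be nice to be able to write (hypothetical; the contentType option does not exist today as far as I can tell):

const coreV1 = new CoreV1Api(config);
await coreV1.patchNamespacedSecret({
  namespace: "default",
  name: "my-secret",
  body: { "tls.crt": "newValue" },
  // desired (hypothetical): let the caller pick the patch strategy / media type
  // contentType: "application/merge-patch+json",
});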

As stated here, request is fully deprecated. The reasons are described in this issue.
I think it is time to choose an alternative, considering this Kubernetes client is used in production applications.

These are the current alternatives.

It would be better if we added this line: "Prior to creating a new issue, please ensure that a similar issue has not already been raised by searching existing issues." to the "Reporting an issue" section of contributing.md.

Describe the bug
I am developing an ElectronJS application to display the status of Kubernetes objects (I know such apps already exist).
I have problems with the Informer object.
Basically, each view of the app displays & watches K8s objects, depending on the namespace/context.
So, when the user navigates to a new view, I should kill the current Informer and then start a new one, watching new object kinds, possibly in another K8s context.

On "server side" logs, I do not see a bunch of Informers sending events.
However, on the client side (the view), the more I navigate back and forth to the same view, it looks like events are accumulating over and over.

Client Version
0.17.1

Server Version
Client Version: v1.24.4
Server Version: v1.24.4+k3s1

To Reproduce
Here is a snippet:

NOTE: not very elegant, but I reference the Informer in a global object to keep track of it.
I confirm that when quitting the view, the Informer throws an exception (caught in informer.on('error')) with ECONNRESET.

  // Object to track state of pods informer
  // Informer should be stopped when we quit the view
  let informerPods = {
    informer: '',
    created: false,
  }
  /*
  Get all pods from the namespace.
  When the client calls this IPC channel, trigger the creation of an Informer & start it.
  If the Informer already exists: stop it.
   ==> this IPC channel can be called:
          - at view creation (vuejs "mount") -> create the informer because it does not exist
          - when quitting the view (vuejs "destroy") --> stop the informer because it already exists
  */
  ipcMain.on('pods-from-namespace', (event, contextname) => {
    if (informerPods.created)  {
      console.log("Pods informer already exist : killing");
      informerPods.informer.stop();
      informerPods = {
        informer: '',
        created: false,
      }
    } else {
      try {
        console.log("Creating Pods informer"); 
        const kc = new k8s.KubeConfig();
        kc.loadFromDefault();

        // IMPORTANT : allow to take into account the context change !
        kc.setCurrentContext(contextname);
        const context = kc.getContextObject(contextname);
        console.log("Active context : " + context.name);
        const k8sApi = kc.makeApiClient(k8s.CoreV1Api);
        
        const listPods = () => k8sApi.listNamespacedPod(context.namespace);
        const informer = k8s.makeInformer(kc, '/api/v1/namespaces/default/pods', listPods);
     
        // update global var to track the Informer
        informerPods = {
          informer: informer,
          created: true,
        }
        
        // ----- Informer events -----
        informer.on('add', (obj) => {
          console.log(`Added: ${obj.metadata.name}`);
          // send message to the view:
          win.webContents.send('pods-from-namespace', "add", obj);
        });
        informer.on('delete', (obj) => {
          console.log(`Deleted: ${obj.metadata.name}`);
          // send message to the view:
          win.webContents.send('pods-from-namespace', "delete", obj);
        });
        informer.on('error', (err) => {
          console.log("Informer ERROR");
          console.error(err);
          informer.stop();
        });
        informer.start();
      } catch(err) {
        console.error("BING");
        console.error(err);
        informer.off();
        informer.stop();
      }
    }
  });

Expected behavior
This is a screenshot of a view watching Pods through Informer. Fresh view just after starting the app :

podview1

So far, it's good. Note: there are indeed 7 pods.

However, after navigating back & forth to this view several times, events seem to accumulate... but finally it renders as expected.
There are lots of multiples of 7, which shows the accumulation: 7, 14, 21, 28, etc.

podview2

Environment (please complete the following information):

  • OS: macOS & Windows
  • NodeJS Version: v16.19.0
  • Cloud runtime : Azure AKS

App stack

  • ElectronJS + Vuejs2 + Vuetify

Related Issues

These 2 issues may be related:
#606
#604

Describe the bug
The exec.env handling adds all the environment variables passed in through the KUBECONFIG to the global process.env.

For our usage, it introduces a race condition depending on when the call to the k8s API client occurs.

e.g. the KUBECONFIG users block below changes the entire nodejs process env:

users:
- name: arn:aws:eks:us-east-2:702880000000:cluster/our-cluster
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - --region
      - ap-southeast-2
      - eks
      - get-token
      - --cluster-name
      - our-cluster
      command: aws
      env:
        - name: teddy
          value: testing

It's not really necessary to do this; instead, we can pass a copy of process.env, with the env entries from the KUBECONFIG appended, to the child_process.
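
A minimal sketch of the proposed behavior (the function name is illustrative, not the library's actual code): build a per-invocation environment for the exec plugin instead of mutating the global process.env.

const child_process = require('child_process');

function runExecPlugin(exec) {
  // Copy the current environment so the caller's process.env is left untouched.
  const env = { ...process.env };
  (exec.env || []).forEach((entry) => {
    env[entry.name] = entry.value;
  });
  return child_process.spawnSync(exec.command, exec.args || [], { env, encoding: 'utf8' });
}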

Client Version
e.g. 0.18.0

Server Version
e.g. 1.23.0

To Reproduce
Steps to reproduce the behavior:

Run the code with the env specified in the KUBECONFIG:

...
const k8sApi = kc.makeApiClient(k8s.CoreV1Api);
console.log(`how come ${process.env.teddy} is not set?`)
const nodes = await k8sApi.listNode();
console.log(`how come ${process.env.teddy} is set?`)

Expected behavior
The entire process.env should not be mutated (unknowingly, depending on what is stored in the env of the KUBECONFIG).

Environment (please complete the following information):

  • OS: Debian Linux
  • NodeJS Version 16/18

Describe the bug
The informer stops following updates after about an hour, without any error or connect event. After an informer restart it works fine again, so as a workaround I've added an interval that restarts the informer, but this is not a graceful solution.

Client Version
e.g. 0.17.1

Server Version
e.g. 1.22.15

To Reproduce

  1. Follow the example informer to create a new informer.
  2. Wait 65 minutes.
  3. Create/delete/update a pod and check the logs.

Expected behavior
The informer should watch pods continuously (reconnecting) or at least throw some error in case of interruption/losing the connection with the server.

Example Code
example informer
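
And here is a minimal sketch of the interval-restart workaround mentioned above (the 30-minute interval is arbitrary; this is a band-aid, not a fix):

const k8s = require('@kubernetes/client-node');

const kc = new k8s.KubeConfig();
kc.loadFromDefault();
const k8sApi = kc.makeApiClient(k8s.CoreV1Api);

const listPods = () => k8sApi.listNamespacedPod('default');
const informer = k8s.makeInformer(kc, '/api/v1/namespaces/default/pods', listPods);
informer.on('update', (obj) => console.log(`Updated: ${obj.metadata.name}`));
informer.start();

// Periodically restart the informer so a silently dropped watch connection gets re-established.
setInterval(async () => {
  await informer.stop();
  await informer.start();
}, 30 * 60 * 1000);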

Environment (please complete the following information):
Docker image (node-12):

  • OS: Debian GNU/Linux 9 (stretch)
  • NodeJS Version: v12.22.12

Additional context

It is a known issue that some people have worked around by using a pull rather than a push method for receiving updates about changes. Another workaround is to restart the watcher every n minutes. As @brendandburns pointed out in my earlier PR #576, the csharp k8s client suffers from the same problem: kubernetes-client/csharp#533.

My experience shows that it happens when the connection is idle for a long time. The connection is dropped without being closed, so the client keeps waiting for events without receiving any. I have seen it in Azure and Google Cloud with managed k8s services.

The C# issue suggests that it happens because keepalives are not enabled on the underlying connection. And indeed, I found that this is the case for the JS k8s client too. It could be fixed by adding the keep-alive option to the "request" options, if there weren't a bug in the request library; I have created a new ticket for it: request/request#3367. The request library has been deprecated and cannot be fixed, but I was able to work around the bug in the watcher's code. So with my fix, the connections are kept alive. My experience shows that every three minutes a TCP ACK is exchanged between client and server. I would like the keep-alive to happen more often, to detect dead watcher connections in a more timely fashion, but it does not seem possible to tweak the keep-alive interval for the connection in Node.js: nodejs/node-v0.x-archive#4109.

The fix I have does not seem to solve the problem in all cases. That might be because a keep-alive of 3 minutes is not sufficient in all cases. I will test the fix more thoroughly and update the ticket with the results of the testing.
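
For illustration only, this is roughly what enabling keep-alive looks like in terms of the "request"/Node.js agent options (as noted above, the request bug means the real fix has to work around it inside the watcher code rather than relying on these options):

const https = require('https');

// Options of this shape are what the watcher would pass to the (deprecated) request library.
const watchRequestOptions = {
  agentClass: https.Agent,
  agentOptions: {
    keepAlive: true,       // keep the socket open and enable TCP keep-alive probes
    keepAliveMsecs: 30000, // initial delay before the first keep-alive probe
  },
};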

Describe the bug
After upgrading the SDK to version 0.17.1, I cannot apply a CRD using the apply example apply-example.ts

Client Version
0.17.1

Server Version
n/a

To Reproduce
Download the example file and try to apply some custom CRD.

Expected behavior
Works as usual.

Example Code
Same as the apply example file.

Environment (please complete the following information):
n/a

Additional context
Error output:

./apply-example.ts:32:47
Type error: Argument of type 'KubernetesObject' is not assignable to parameter of type 'KubernetesObjectHeader<KubernetesObject>'.
  Type 'KubernetesObject' is not assignable to type '{ metadata: { name: string; namespace: string; }; }'.
    Types of property 'metadata' are incompatible.
      Type 'V1ObjectMeta | undefined' is not assignable to type '{ name: string; namespace: string; }'.
        Type 'undefined' is not assignable to type '{ name: string; namespace: string; }'.

The timestamps have incorrect TypeScript types.
For example
https://github.com/kubernetes-client/javascript/blob/master/src/gen/model/v1ObjectMeta.ts#L32

creationTimestamp is of type Date | undefined, but the documentation says it's an RFC3339-formatted date, i.e. a string.

Since this is generated code, I'm not sure where to fix this. Either the type has to change to string | undefined, or the string has to be parsed and passed on as a Date object.

A prototype system I'm working with has a controller implemented using the Java client and one using this JavaScript client.

Both serve a similar purpose: they watch some ConfigMaps and apply changes to some service in response to the events they receive.

In the Java case there is the concept of a reconciler, and in the JavaScript case the controller is based on the informer example.

The main interesting difference is that the Java case has a clear way to handle errors, specifically errors that occur when using the event information within the controller to update some other service. If an error occurs with the Java client, the reconcile function changes its return value, and the system will then call it again later to have another go at handling the same events.

In the JavaScript case we can't see any support for handling this particular type of error. Is it possible? Or is that not supported?
(Searching across kubernetes-client, this type of support was only obvious with the Java client.)

Some more detail:

If there were code like this, how would an error be handled such that the system would retry this add event later?

informer.on('add', (obj: k8s.V1Pod) => {
    console.log(`Added: ${obj.metadata!.name}`);
    // Call out to some system here that might fail
});

In Java we get to indicate with the reconcile return value that the event should get retried again in the future:

public Result reconcile(Request request) {
    try {
        // do something that might throw an error
    } catch (Exception e) {
        LOG.error("Exception occurred: {}", e.getMessage(), e);
        return new Result(true);
    }
    return new Result(false);
}
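
For comparison, on the JavaScript side something similar can be hand-rolled around the informer callback. This is only a sketch, assuming there is no built-in requeue mechanism; withRetry, the delay, and the attempt limit are all illustrative, and informer is assumed to be set up as in the snippet above:

// Retry an event handler with a delay (hand-rolled; not a feature of this client).
function withRetry<T>(handler: (obj: T) => void | Promise<void>, delayMs = 5000, maxAttempts = 5) {
  return function handle(obj: T, attempt = 1): void {
    Promise.resolve(handler(obj)).catch((err) => {
      if (attempt < maxAttempts) {
        // Schedule another attempt at handling the same event later.
        setTimeout(() => handle(obj, attempt + 1), delayMs);
      } else {
        console.error(`giving up after ${attempt} attempts`, err);
      }
    });
  };
}

informer.on('add', withRetry((obj: k8s.V1Pod) => {
  console.log(`Added: ${obj.metadata!.name}`);
  // Call out to some system here that might fail
}));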

Describe the bug
When using the general KubernetesObjectApi, the objects are not returned as typed objects as defined in the API definitions (https://github.com/kubernetes-client/javascript/tree/master/src/gen/api); rather, the plain JSON value is returned.

Client Version
e.g. 0.16.3

Server Version
e.g. 1.21.0

To Reproduce

see code snippet

Expected behavior
I expect that known objects are returned with their proper types.

Example Code

const k8s = require('@kubernetes/client-node');

const kc = new k8s.KubeConfig();
kc.loadFromDefault();

const k8sApi = kc.makeApiClient(k8s.CoreV1Api);
const client = kc.makeApiClient(k8s.KubernetesObjectApi);

client.list('v1', 'Pod', 'default').then(res => {
    const p = res.body.items[0];
    console.log(typeof p.metadata.creationTimestamp); // "string"
})

k8sApi.listNamespacedPod('default').then((res) => {
    const p = res.body.items[0];
    console.log(typeof p.metadata.creationTimestamp); // "object" --> a Date
});

Environment (please complete the following information):

  • OS: [e.g. Linux]
  • NodeJS Version [eg. 16]

Enhancement Proposal

I guess that the current behavior is intended to also support unknown objects.
Also, one can currently work around the described issue by using the ObjectSerializer like this:

const k8s = require('@kubernetes/client-node');
const res = await client.read({apiVersion: 'v1', kind: 'Pod', metadata: {name: 'a', namespace: 'default'}});
const pod = res.body; // type: KubernetesObject
pod.metadata.creationTimestamp // type "string"

// deserialize turns the plain JSON into a typed V1Pod (string timestamps become Date objects)
const p = ObjectSerializer.deserialize(pod, 'V1Pod');

But I think there could be a good opportunity to simplify the way the generic client can be used and to improve the typing.
Maybe something like this:

class KubernetesObjectApi {
    public read<T extends KubernetesObject>(spec: T, ...): { res: http.IncomingMessage, body: T } {
        // do the stuff that is currently done
        return this.requestPromise(localVarRequestOptions, this.getKnownType(spec));
    }

    private getKnownType(spec: KubernetesObject): string {
        // get type from spec + V1APIResource
    }
}

const spec: V1Pod = {apiVersion: 'v1', 'kind': 'Pod', metadata: {'name': 'a', 'namespace': 'default'}};
const res = await client.read(spec);
const pod = res.body; // type V1Pod
pod.metadata.creationTimestamp // type "Date"

Further, this example might be enhanced so that even custom types could be added to the known types, so that the ObjectSerializer can also serialize CRDs.