jmdobry
Repos
12
Followers
459

Give your data the treatment it deserves with a framework-agnostic, datastore-agnostic JavaScript ORM built for ease of use and peace of mind. Works in Node.js and in the Browser. Main Site: http://js-data.io, API Reference Docs: http://api.js-data.io/js-data

1610
135

Angular wrapper for js-data

983
78

angular-cache is a very useful replacement for the Angular 1 $cacheFactory.

1398
153

Google's officially supported Node.js client library for accessing Google APIs. Support for authorization and authentication with OAuth 2.0, API Keys and JWT (Service Tokens) is included.

10277
1775

Node.js samples for Google Cloud Platform products.

2493
1786

A local emulator for deploying, running, and debugging Google Cloud Functions.

829
111

Events

Created at 1 month ago
Metrics on GitHub Access

The maintner and samplr workers both make a significant number of calls to GitHub, and measuring and observing the application's access is critical to long-term success.

Metrics that need to be tracked at a minimum:

  1. API Calls by Deployment/Repository to GitHub
  2. Which API Methods are being called most frequently, both per-repository and as a whole
  3. Token Rate Limit Quota
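
As a rough illustration of the first two metrics, the sketch below tallies calls per repository and per API method in memory. The `CallCounter` type and its method names are hypothetical and do not come from maintner or samplr; a real deployment would more likely export these counts to a metrics system, and the token quota metric is not covered here.

```go
package main

import (
	"fmt"
	"sync"
)

// callKey identifies one counter: a repository plus the GitHub API method called for it.
type callKey struct {
	Repo   string
	Method string
}

// CallCounter is a hypothetical in-memory tally of GitHub API calls.
type CallCounter struct {
	mu     sync.Mutex
	counts map[callKey]int
}

// NewCallCounter returns an empty counter.
func NewCallCounter() *CallCounter {
	return &CallCounter{counts: make(map[callKey]int)}
}

// Record notes one call to method made on behalf of repo.
func (c *CallCounter) Record(repo, method string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[callKey{Repo: repo, Method: method}]++
}

// ByRepo returns the total calls made for one repository across all methods.
func (c *CallCounter) ByRepo(repo string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	total := 0
	for k, n := range c.counts {
		if k.Repo == repo {
			total += n
		}
	}
	return total
}

// ByMethod returns the total calls made to one method across all repositories.
func (c *CallCounter) ByMethod(method string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	total := 0
	for k, n := range c.counts {
		if k.Method == method {
			total += n
		}
	}
	return total
}

func main() {
	c := NewCallCounter()
	// Repository and method names below are placeholders.
	c.Record("example-org/repo-a", "Issues.List")
	c.Record("example-org/repo-a", "Repositories.Get")
	c.Record("example-org/repo-b", "Issues.List")
	fmt.Println("repo-a calls:", c.ByRepo("example-org/repo-a"))
	fmt.Println("Issues.List calls:", c.ByMethod("Issues.List"))
}
```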
Created at 2 months ago
infra: Add a State Layer to samplr

Engineering Protip: Make state someone else's problem. -James Ward

It is probably time to introduce some sort of state (read that as "database") to our solution.

Reasons to introduce a database:

  • samplrd is the only "source of truth" for both reads and writes for Snippets. Should the pod go down (for whatever reason), requests for snippets will result in either errors or 0 snippets returned.
  • Calculating snippets is time consuming, especially for larger repositories, which introduces unnecessary lag at startup.
  • Proxying requests to individual pods requires:
    • Logic in the reverse proxy to query multiple orgs/repos
    • Individual worker pods to maintain pagination state for all incoming queries, which could overload their resources

Database Type

Relational:

There are some good arguments for using a Relational database:

  • Relatively familiar for most developers.
  • Huge swathes of documentation
  • Google Cloud offers a hosted MySQL solution (helps reduce ops cost)
  • Rigidly Structured Data

Graph DB

There are two main arguments for using a Graph DB:

  • Exposing an API on top of this data via GraphQL would mimic the behaviour of the v4 GitHub API (which our users are presumably familiar with, or are interested in migrating to)
  • The data stored is a directed acyclic graph, which Graph DBs are great at storing.
    • In fact, it comes to mind that samplr could be rearchitected to be an updater of commits on a GraphDB, with a periodic job (or trigger) which calculates and updates snippets from within the DB itself, and a GraphQL API on top of the database.
Created at 2 months ago
maintner: Issue Comment Pagination

Some Issues tracked in maintner have so many comments that it is impossible to return them in a single RPC. To that end, we need to implement a new "GetComments" rpc for maintner that supports pagination.
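
One common way to paginate such an rpc is a page size plus an opaque page token. The sketch below shows only that slicing logic, assuming the token is a stringified offset; the `Comment` type, the `pageComments` function, and the default page size are illustrative and are not taken from maintner's proto.

```go
package main

import (
	"fmt"
	"strconv"
)

// Comment is a simplified stand-in for an issue comment.
type Comment struct {
	ID   int
	Body string
}

// pageComments returns up to pageSize comments starting at the offset encoded
// in pageToken, plus the token for the next page ("" when no pages remain).
func pageComments(all []Comment, pageSize int, pageToken string) ([]Comment, string, error) {
	start := 0
	if pageToken != "" {
		n, err := strconv.Atoi(pageToken)
		if err != nil || n < 0 || n > len(all) {
			return nil, "", fmt.Errorf("invalid page token %q", pageToken)
		}
		start = n
	}
	if pageSize <= 0 {
		pageSize = 100 // assumed default, not from the source
	}
	end := start + pageSize
	if end >= len(all) {
		return all[start:], "", nil
	}
	return all[start:end], strconv.Itoa(end), nil
}

func main() {
	comments := make([]Comment, 5)
	for i := range comments {
		comments[i] = Comment{ID: i + 1, Body: fmt.Sprintf("comment %d", i+1)}
	}
	// Walk every page until the server returns an empty next-page token.
	token := ""
	for {
		page, next, err := pageComments(comments, 2, token)
		if err != nil {
			panic(err)
		}
		fmt.Printf("page with %d comments, next token %q\n", len(page), next)
		if next == "" {
			break
		}
		token = next
	}
}
```

An offset token is the simplest choice but can skip or repeat items if comments are added between requests; a token based on the last comment ID seen would avoid that.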

Created at 2 months ago
infra: Makefile Updates

The Makefiles are getting better but need improvement:

  1. Make variable names consistent
  2. Allow for importing of an "overrides" environment file to help speed up local development (export-ing variables in a shell doesn't scale well)
  3. Update the variables substituted in our kubernetes manifests to match them with Makefiles
  4. Add more targets for building specific parts of the stack.
Created at 2 months ago
core: GitHub API Key Status Service

As the number of keys in the cluster increases, observability into which keys are available and what their capacity is will be... key. This necessitates the creation of an application to monitor these activities.

Note: This will be dependent on the work done in #14

The application should:

  1. Discover the set of GitHub tokens available to be vended to workers using a provided Label.
  2. Call the GitHub API to get the current usage of the token
  3. For each Token, store the status
  4. Spin up a gRPC service to provide this information including the Token Name (combination of Secret Name and File Name)
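
Steps 2 and 3 could be sketched as below, assuming the usage comes from GitHub's REST rate limit endpoint, whose JSON response reports `limit`, `remaining`, and `reset` under `resources.core`. The `TokenStatus` type, the `parseTokenStatus` function, and the "secret/file" name format are assumptions made for illustration; token discovery and the gRPC service are left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// TokenStatus is a hypothetical record of one token's quota.
type TokenStatus struct {
	Name      string // secret name and file name joined with "/" (assumed format)
	Limit     int
	Remaining int
	Reset     time.Time
}

// rateLimitBody mirrors only the "resources.core" part of the rate limit response.
type rateLimitBody struct {
	Resources struct {
		Core struct {
			Limit     int   `json:"limit"`
			Remaining int   `json:"remaining"`
			Reset     int64 `json:"reset"` // Unix seconds
		} `json:"core"`
	} `json:"resources"`
}

// parseTokenStatus turns a rate limit response body into a TokenStatus.
func parseTokenStatus(secretName, fileName string, body []byte) (TokenStatus, error) {
	var rl rateLimitBody
	if err := json.Unmarshal(body, &rl); err != nil {
		return TokenStatus{}, fmt.Errorf("decoding rate limit response: %w", err)
	}
	core := rl.Resources.Core
	return TokenStatus{
		Name:      secretName + "/" + fileName,
		Limit:     core.Limit,
		Remaining: core.Remaining,
		Reset:     time.Unix(core.Reset, 0),
	}, nil
}

func main() {
	// A trimmed, made-up response body; real responses carry more fields.
	body := []byte(`{"resources":{"core":{"limit":5000,"remaining":4200,"reset":1700000000}}}`)
	st, err := parseTokenStatus("example-secret", "example-file", body)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: %d of %d remaining\n", st.Name, st.Remaining, st.Limit)
}
```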
Created at 2 months ago
sprvsr: Auto Update periodically

Currently, the sprvsr module only updates the list of tracked repositories when told to do so via an HTTP request.

This should be moved to a periodic check every N minutes, with the added feature of being able to be told to sync "now".

Created at 2 months ago
sprvsr: Move away from self hosting

The sprvsr module has been useful for working across both maintner and samplr, but because the module currently defines its own service and exposes an API, it is not as reusable or as portable as it needs to be moving forward.

Migrate the sprvsr module to a 'corpus like' model which is stateful and updates itself on an interval, removing the api component from it entirely.

Created at 2 months ago
samplr-rtr: implement ListRepositories rpc

Currently, samplr-rtr does not implement the ListRepositories rpc specified in the proto.

To do so we need to:

  1. Create a service account the samplr-rtr pods can run as so that they may have view permissions on services
  2. For performance reasons, run a goroutine that every X minutes queries the Kubernetes server and populates a list of TrackedRepositories based on the labels (this should be guarded by a RWMutex). Have the ListRepositories method read from that list.

Future work:

samplr-rtr should expose an internal-only gRPC service to signal the router to update its list of available repositories. This rpc is called from samplr-sprvsr when it updates the set of repositories.

Created at 2 months ago
infra: mTLS throughout

Currently within the cluster, our services are communicating over the wire without any encryption or security. While this is fine, adding mTLS (possibly via Istio) would add an extra layer of security to the cluster.

Created at 2 months ago
samplr: Status API

The samplr service should expose a gRPC service to describe its status. Things the service should report:

  1. List of Repositories the application is tracking
  2. For each repository: when was the last time it was updated, and when it will check again
Created at 2 months ago
infra: Health Checks via gRPC

As a part of the gRPC unification, the health checks across all services/deployments need to be done via the gRPC health probe, and each application should register the gRPC health service.

Created at 2 months ago
Deprecate and move the v0 API

The v0 API needs to be removed and the current v1 API needs to be renamed to v1beta1.

Created at 2 months ago
samplr: support branches other than "master"

Currently samplr only supports tracking snippets on a branch named master.

There are plenty of repositories that do not use that naming convention which would benefit from snippet tracking.

To that end, I propose we add an additional GitHub API call when first tracking a GitHub repository, which queries the GitHub API for the "default_branch" of a given repository and stores it in the watchedRepository struct. This allows for repository owners to

Considerations:

  • This increases our GitHub API usage by O(nRepositories)
  • If a repository owner changes the default branch after the initial update, samplr will have stale data and require a refresh.
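
As a sketch of the proposed lookup, the code below reads `default_branch` from the JSON body of a GitHub REST "get a repository" response. The issue names the watchedRepository struct, but the fields shown here are assumed for illustration, and the HTTP request itself is omitted.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// watchedRepository is named in the issue; these fields are assumptions.
type watchedRepository struct {
	Owner         string
	Name          string
	DefaultBranch string
}

// defaultBranchFrom extracts "default_branch" from a repository response body.
func defaultBranchFrom(body []byte) (string, error) {
	var repo struct {
		DefaultBranch string `json:"default_branch"`
	}
	if err := json.Unmarshal(body, &repo); err != nil {
		return "", fmt.Errorf("decoding repository response: %w", err)
	}
	if repo.DefaultBranch == "" {
		return "", errors.New("response has no default_branch")
	}
	return repo.DefaultBranch, nil
}

func main() {
	// A trimmed, made-up response body; real responses carry many more fields.
	body := []byte(`{"name":"example-repo","default_branch":"main"}`)
	branch, err := defaultBranchFrom(body)
	if err != nil {
		panic(err)
	}
	repo := watchedRepository{Owner: "example-org", Name: "example-repo", DefaultBranch: branch}
	fmt.Printf("%s/%s tracks %s\n", repo.Owner, repo.Name, repo.DefaultBranch)
}
```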
Created at 2 months ago
jmdobry delete branch drghs-worker-comments-api
Created at 2 months ago
feat(maintner): add gRPC call to list Comments

Adds new RPC to ListGitHubComments (and associated message contracts).

Allows for filtering and pagination

Fixes #42

Created at 2 months ago
jmdobry delete branch tf
Created at 2 months ago
feat(core): add Terraform support

Adds Terraform config files to automate GCP project and resource creation.

This will (hopefully) obviate the need for awkward Makefile scripts and allow for continuous delivery and project collaboration.

Created at 2 months ago
jmdobry delete branch clean-sweep
Created at 2 months ago
feat(maintner-swpr): add error checking on not-found repositories

In the GraphQL API for GitHub, it is possible to encounter repositories that are considered "not existing" as we don't have api access to them (they could be private).

As opposed to erroring out and finishing the sweep job, simply log a warning and continue.

Created at 2 months ago
jmdobry delete branch fix-mghp
Created at 2 months ago
fix(magic-github-proxy): update Dockerfile and Makefile
Created at 2 months ago
jmdobry delete branch scale-mtr-rtr
Created at 2 months ago
feat(maintner): scale maintner-rtr to 5 and define anti-affinity

Scales number of replicas to 5 and adds a pod anti affinity policy that requires pods for maintner-rtr to be scheduled on different Nodes in the pool. This is to help reduce single point of failure modes in the pods where either a single pod is unhealthy and cannot serve or where a single Node is unhealthy and cannot serve.

Created at 2 months ago
feat(maintner): add slobudget to issue api

Requires #119

Created at 2 months ago
jmdobry delete branch behind
Created at 2 months ago
feat(samplr): include new script to find behind repositories
Created at 2 months ago
jmdobry delete branch issue-tombstoner
Created at 2 months ago
feat(maintner): add issue-tombstoner cli

Add utility to individually mark issues as tombstoned

Created at 2 months ago