Continuous profiling for analysis of CPU and memory usage, down to the line number and throughout time. Saving infrastructure cost, improving performance, and increasing reliability.
eBPF based always-on profiler auto-discovering targets in Kubernetes and systemd, zero code changes or restarts needed!
@brancz Hm yeah. Optimally the autocompletion would be able to understand the string context it's in (double-quoted string vs. backtick-quoted) and escape any inserted label values accordingly, if necessary.
Yeah agreed, that's option 2 that I suggested. tl;dr: the parser is behaving as expected, but the query builder/tooling is not quite.
The issue is really just that the table output is really dumb and doesn't do any escaping (besides automatic HTML safety stuff):
It's not just the table though, when you autocomplete and select the suggested value I think the behavior is unexpected. Quick screen recording to demonstrate what I mean:
cc @roidelapluie @juliusv
I don’t think we’ve established yet whether the parser is faulty or the tooling around query completion is behaving inconsistently.
We use the PromQL parser in the Parca project. We noticed something interesting when label values contained the `\x2d` value. Ultimately we noticed that the PromQL parser interprets this not as the literal string, but as a hexadecimal escape sequence, which in this case decodes to the dash character (`-`).
A resulting inconsistency can be triggered using the following config:
```yaml
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
        labels:
          faulty_label: \x2d
```
Then query Prometheus, for example using the
If we then use the UI to copy the label matcher, insert it into the query, and run it, however, we get an empty result.
Only if we explicitly escape the label value do we get the result we were looking for with the previous query:
I expected one of the following two behaviors, so that Prometheus is consistent with itself:
Described in the first section.
Darwin 21.6.0 arm64
```console
$ ./prometheus --version
prometheus, version 2.40.3 (branch: HEAD, revision: 84e95d8cbc51b89f1a69b25dd239cae2a44cb6c1)
  build user:       root@692b092db288
  build date:       20221124-09:15:54
  go version:       go1.19.3
  platform:         darwin/arm64
```
### Prometheus configuration file

```yaml
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
        labels:
          faulty_label: \x2d
```
Re-deploy Parca Docs after release
Merge pull request #8 from parca-dev/vercel_deploy_hook
Re-deploy Parca Docs after release
Secret already added.
Is this a draft because it's incomplete? I'd be more than happy to merge partial updates, getting things out as we get them done.
Is there a particular reason we're doing this in the query-range API? I personally find this an odd placement and usage of Arrow; typically Arrow is the format used within a query engine for zero-copy computations. Have we considered doing this for remote read instead? That seems like the more natural place for it to me, and streaming would also work very well there, as we already have machinery available from the proto-streaming variant.
I'm especially impressed that this didn't need any pooling of byte buffers, just the replacement 🤯
Just deployed this on the demo cluster, and I'm happy to report that this got us a healthy ~20% decrease in CPU usage, nice job, and thank you very much!
I know next to nothing about .NET, so I can only go by docs or other web searches. From the docs, I understand that ahead-of-time compilation to a regular self-contained Linux ELF binary (which should work just fine with Parca Agent if debuginfos are available) is only supported since .NET 7, which appears to have only been released on November 8th, 2022, so it appears to me it's rather the bleeding edge than an old feature.
If so, could you run `date` on all machines and share the output?
Could this potentially be a time synchronization issue? We've certainly seen issues with time synchronization before. Are you running the Parca Agent and Parca server on a different machine than the one you're using your browser on?
cc @stevej @olix0r @siggy
Also for reference, I have tried to build a container image with the `debug=1` flag (which includes sufficient debuginfos to get the above things working), and here are some stats:
```
[+] Building 332.9s (21/21) FINISHED
localhost/linkerd/proxy  HEAD.4e172043  a57cf56c46e4   4 minutes ago   92.7MB

[+] Building 377.6s (21/21) FINISHED
localhost/linkerd/proxy  HEAD.4e172043  8566dec1a8f5  20 minutes ago    978MB
```
This means a ~13% increase in build time and roughly a 10-11x increase in artifact size.
I want to clarify why I selected "maybe" on whether I'd like to work on this. I'd be happy to be a supporting function, but since this is primarily build pipeline work and I very rarely interact with the rust ecosystem, I find it hard to believe that I'd be of much use. I'd be more than happy to do anything I can though!
I'm trying to profile the linkerd2-proxy, among everything else in my infrastructure, to understand where, on a global level, all CPU resources are being spent. The linkerd2-proxy, while efficient, is meant to be deployed in practically every pod in a Kubernetes cluster, so while an individual container may not use a lot of resources, it would be great to understand what that looks like globally.
To do this, debuginfos need to be available somehow, and there are a variety of ways to achieve that. My preference would be to produce debuginfos as part of the build process of the linkerd2-proxy, when creating the release tarball. Using the recently stabilized `split-debuginfo` configuration option, the production binary would be created and the debuginfos would be written to a separate file. These debuginfos can then be distributed separately, ideally through a debuginfod server, from where profilers and debuggers can automatically find them using the binary's Build ID (which is identical in the "production" binary and the debuginfo file; that's how they're linked).
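For illustration, a minimal sketch of what this could look like in the proxy's Cargo profile. The profile name and the `"packed"` value are assumptions, not a statement about linkerd2-proxy's actual build setup; `split-debuginfo` was stabilized on Linux in Rust 1.65:

```toml
# Hypothetical Cargo.toml fragment: generate full DWARF, but split it out
# of the shipped binary into a separate debuginfo artifact.
[profile.release]
debug = true                  # emit debuginfo at all
split-debuginfo = "packed"    # write it to a separate file instead of the binary
```

The resulting debuginfo artifact carries the same Build ID as the release binary, which is what lets debuginfod serve it on demand.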
I think distributing the binary and the debuginfos this way is the best option, as there is only ever a single binary being produced per release, removing the possibility that the debug build is in any way different from a non-debug build.
There could be other ways, such as having a container image that contains a binary with the debuginfos included. This would work as well, but it would require users to explicitly change their infrastructure to make it debuggable, which is not ideal.
Users would primarily interact with it indirectly, as it enables profilers such as Parca to discover the debuginfos automatically without the user having to do anything. The same goes for regular Linux `perf`, as well as debuggers such as
The docs page does say:

> Limited diagnostic support (debugging and profiling).
To me this looks like there are just no debuginfos in this binary to symbolize. What does `nm <binary>` output?
At first sight I would say either frame pointers are present or DWARF unwinding is working, as the memory addresses look roughly like they could be correct, and the stacks are deeper than one frame. (Not an explanation for Maxim or the rest of the maintainers, but for anyone reading who's unfamiliar: if unwinding is broken in some way, we typically see a very, very high memory address and a maximum stack depth of 1-2.)
Perhaps this is what you meant, but just in case it isn't: we did recently add an example of what to do to get a .NET application working today: https://github.com/parca-dev/parca-demo/pull/18
We're still trying to figure out why ahead-of-time compiled .NET binaries aren't working (they should work, to our knowledge; it might be something small): https://github.com/parca-dev/parca-demo/pull/23
The ultimate goal is to make what needs to be enabled in #18 unnecessary, but we haven't started that work yet, so in the meantime that is the best way to use Parca with .NET.
I have a feeling this is not at all CORS-related, but rather the same problem as reported in https://github.com/parca-dev/parca-agent/issues/1060