Terraform provider that manages Nix builds and NixOS machines.
A high-availability distributed filesystem built on FoundationDB and FUSE.
A horizontally scalable object store based on the CRUSH placement algorithm.
Interesting that this was OpenBSD as well; maybe I can do my fuzzing in a VM there.
The permission-error patch is a potential cause. I will make a new release a priority so people don't need to rely on custom patches.
There is also a chance something else specific to OpenBSD is going on. I will introduce more fuzzing to try to reproduce it on my end.
Add pipelined request parallelism to the Python API.
This change uses gRPC futures to issue many concurrent requests to the server in parallel.
The streamset API automatically makes use of this feature where it can, as do the new streamset.insert, streamset.delete, and streamset.obliterate functions.
This change also introduces conn.batch_query and conn.batch_create, which let users create, and query the existence of, many streams in parallel.
Overall these changes make single-threaded stream creation roughly 100x-300x faster, while also improving parallel queries.
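Under the hood this is the standard gRPC Python futures idiom: issue every request without blocking, then gather the replies, so total latency approaches one round trip rather than one per call. A minimal sketch of the pattern; stub, stub.Insert, and make_request are illustrative placeholders, not this library's actual names:

def pipelined_calls(stub, make_request, payloads):
    # Fire off all RPCs immediately; .future() on a gRPC unary-unary
    # callable returns without waiting for the reply.
    futures = [stub.Insert.future(make_request(p)) for p in payloads]
    # Block on each future in turn; the server processes them concurrently.
    return [f.result() for f in futures]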
Thank you so much for the report - your setup sounds fine.
A few questions:
Add more batching functions.
I would also try swapping the stderr and stdout - I think it's slightly counterintuitive - it's essentially equivalent to a call to dup2, IIRC.
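For concreteness, a minimal sketch of what that fd-level swap looks like in Python (my illustration, not code from this project):

import os

# Swap fds 1 and 2 so writes to stdout land where stderr pointed and
# vice versa - the same dance a shell does for `3>&1 1>&2 2>&3`.
saved_stdout = os.dup(1)   # keep a copy of the original stdout
os.dup2(2, 1)              # fd 1 now refers to what stderr pointed at
os.dup2(saved_stdout, 2)   # fd 2 now refers to the original stdout
os.close(saved_stdout)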
I thought this should work; it's potentially a bug if it doesn't.
Change from batch to pipelined operation.
I'm not an expert in timely dataflow, but I noticed that once EOF is hit, 'epoch_started' is never updated again, which means 'advance' will always be true:
https://github.com/bytewax/bytewax/blob/9e13300aff81857a1131c8fa6e504b4949e1aeda/src/execution/epoch/periodic_epoch.rs#L144
Reproduction:
import time
from bytewax.dataflow import Dataflow
from bytewax.inputs import ManualInputConfig
from bytewax.outputs import StdOutputConfig
from bytewax.execution import run_main, spawn_cluster

def input_builder(worker_index, worker_count, resume_state):
    # Only worker 0 produces input; the other workers return immediately
    # and have no work for the rest of the run.
    if worker_index != 0:
        return
    i = 0
    while True:
        time.sleep(0.001)
        i += 1
        yield None, i

flow = Dataflow()
flow.input("input", ManualInputConfig(input_builder))
flow.capture(StdOutputConfig())

if __name__ == '__main__':
    spawn_cluster(
        flow,
        proc_count=4,
    )
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
695475 ac 20 0 654192 24928 14208 S 100.0 0.2 0:57.42 python
695476 ac 20 0 654192 24936 14212 S 100.0 0.2 0:57.42 python
695477 ac 20 0 654192 25064 14152 S 100.0 0.2 0:57.45 python
When using spawn_cluster with more workers than 'distribute' is able to allocate work to, the workers that finish early immediately jump to 100% CPU usage until the job finishes.
Python version (python -V): 3.10
Bytewax version: 0.15.1
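A generic illustration of the fix for this class of busy loop - sleeping (or parking the thread) when no work arrives, rather than re-checking immediately. This is my sketch of the pattern, not bytewax's actual scheduler code:

import queue
import time

def worker_loop(q):
    # Drain the queue, yielding the CPU briefly while idle so a worker
    # with nothing to do does not spin at 100%.
    while True:
        try:
            item = q.get_nowait()
        except queue.Empty:
            time.sleep(0.001)  # park briefly instead of busy-polling
            continue
        if item is None:       # sentinel: no more work is coming
            return
        print(item)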
Curious if there has been any progress on this - it makes it a bit hard to judge the true CPU usage of some test dataflows I am benchmarking.
Temporarily add benchmark script.
So to clarify - this does the build in Docker?
Really quite embarrassed at how long this has been taking me - just lots of stuff to do, really.