lmcd
Repos
49
Followers
40
Following
21

Events

issue comment
StreamMap slow?

Thanks for the technical explanation! I'm glad a lot of thought went into this.

Diving a bit deeper, I'm seeing a lot of the work is actually coming from debugOnly calls - so yeah seems like a false alarm. Notably, setting _backingCheck. Now I'm wondering why _UInt24 is used, but that's a rabbit hole I should probably avoid for today!

Created at 1 month ago

Swift nightly has improved alloc counts. (#314)

This patch updates the alloc counter to enforce the new numbers.

Deliver outbound stream errors to channels. (#315)

Motivation:

When we hit a stream error on an outbound frame, we don't currently distinguish between them and channel errors. This makes it impossible for anyone to tell them apart. It also makes it impossible for the multiplexer to deliver these errors into the relevant stream channels that triggered them, making it essentially certain they'll be lost.

Better diagnostics would be good, so we are providing them.

Modifications:

  • Add new StreamError type that carries a stream ID and an underlying error.
  • Emit this StreamError type for outbound stream errors.
  • Deliver this StreamError type into the relevant child channels when possible.

Result:

Better diagnostics for stream errors.

Add Transfer-Encoding chunked to HTTP1 requests and responses. (#316)

  • Add Transfer-Encoding chunked to HTTP1 requests and responses.

  • Apply suggestions from code review

Co-authored-by: Cory Benfield lukasa@apple.com

  • Code review II

  • Fixed deprecated tests

Co-authored-by: Cory Benfield lukasa@apple.com

Forward all reads before stream channel inactive (#317)

Motivation

In the AsyncHTTPClient the read event is not always forwarded right away. We have seen instances, in which we see a HTTPClientError.remoteConnectionClosed error, on requests that finish normally. https://github.com/swift-server/async-http-client/issues/488

On deeper inspection, I noticed: If there is no unsatisfied read event, when a stream is closed, the pending reads are not forwarded. This can lead to response bytes being ignored on successful requests. NIOHTTP2 should behave as NIO and force forward all pending reads on channelInactive.

Changes

  • Forward all pending reads on channelInactive even if no read event has hit the channel

Result

All requests will receive the complete request body.

Protect against HTT2ChannelHandler reentrancy (#319)

In rare cases the HTTP2ChannelHandlers method unbufferAndFlushAutomaticFrames and the HTTP2StreaMultiplexers newConnectionWindowSize method were recursively calling themselves. This was causing a stack overflow. To protect our selves from this stack overflow I added a variable to protect ourselves against reentrancy inside the unbufferAndFlushAutomaticFrames method

Only flush when necessary in the HTTP2ChannelHandler (#320)

Motivation:

The wroteFrame flag in HTTP2ChannelHandler indicates whether a frame has been written. This flag is checked before flushing in order to avoid unnecessary flushes. Unfortunately this is only ever set to true.

Modifications:

  • Add a flushIfNecessary to check and toggle wroteFrame and flush if necessary
  • Add a test

Result:

  • Few unnecessary flushes
  • Resolves https://github.com/apple/swift-nio-http2/issues/245

Update nightly allocation limits (#321)

Motivation:

Recent nightly builds have introduced small allocation regressions.

Modifications:

Update allocation limits for nightly Swift builds.

Result:

CI passes.

Update doc generation script (#322)

Motivation:

The docs haven't been generated for a while.

See also: https://github.com/apple/swift-nio/pull/2013

Modifications:

  • Pass --spm flag to sourcekitten
  • Update source location of sourcekitten
  • Fix link to NIO docs

Result:

Docs build.

Add 5.6 nightly CI (#324)

  • Add 5.6 nightly CI

The 5.6 nightly images are available, let's use them.

  • Update alloc limits

Merge pull request from GHSA-ccw9-q5h2-8c2w

  • Validate HEADERS frame length accounts for priority data

Motivation:

When parsing a HEADERS frame payload which includes priority data the expected length of the frame is not validated to ensure that it is large enough to account for the stream data. This can lead to the expected payload length being negative.

Modifications:

  • Validate that when stream priority data is present that the frame length is at least that size
  • Add fuzz testing failure case

Result:

The frame decoder will throw a protocol error when parsing a HEADERS frame if the length of the frame is less than the number of bytes required for stream priority data and the priority flag is set.

  • Add additional tests

  • Fix weirdo formatting

Merge pull request from GHSA-w3f6-pc54-gfw7

  • Refactor HPACK integer decoding

Motivation:

The HPACK integer decoding used a number of unchecked operations which can trap on some inputs. Since it is network reachable code, it should throw errors if the operations overflow.

Modifications:

  • Add a failure case to the fuzz testing
  • Refactor the integer decoding to check for overflow on the arithmetic
  • This throws a new 'HPACKErrors.UnrepresentableInteger' error on overflow
  • Add a missing bounds check
  • Remove an unnecessary and incorrect path
  • Remove the default argument from the function driving the decoding, the default was not valid and would cause an assertion to fail if used
  • Return the decoded value as an Int rather than a UInt
  • More tests

Result:

Integer decoding is safer.

  • Use unchecked shifting

  • Use truncatingIfNeeded

  • make error internal

Merge pull request from GHSA-pgfx-g6rc-8cjv

Motivation:

We don't currently support ALTSVC and ORIGIN frames. At the moment when receiving or attempting to write these frames we trap. Trapping when receiving an unsupported frame can lead to DoS attacks. Moreover these frames are non critical so can safely be ignored.

Modifications:

  • Forward inbound and outbound ALTSVC and ORIGIN frames to the connection state machine
  • The connection state machine now ignores inbound frames of this type
  • The connection state machine traps on outbound frames of this type (as per the current behaviour)
  • Document this behaviour on HTTP2Frame.FramePayload
  • Tests and fuzz testing failure case

Result:

We don't trap when receiving unsupported frames.

Make UnrepresentableInteger public (#325)

Motivation:

A new 'UnrepresentableInteger' error was added in GHSA-w3f6-pc54-gfw7. This was added as internal so that the advisory could be published as a patch release rather than a minor.

This error should be public so that callers may hold an instance of it.

Modifications:

  • Make UnrepresentableInteger public

Result:

UnrepresentableInteger is part of the public API.

Update minimum dependency version in README (#326)

Remove a retain/release (#327)

Motivation

Unnecessary ARC traffic hurts performance

Modifications

Restructure the decodeHeaders function to avoid taking a copy of the bytebuffer but instead to reset the reader index on error.

Result

One less retain/release on header block decode.

Use unchecked math in encodeInteger (#328)

Motivation

Checked math is great when you aren't sure what the constraints are, but in some places we happen to know that all the arithmetic operations occur in a defined domain and range. In those cases, we can improve the codegen by adopting unchecked math.

Modifications

Widely adopt unchecked math in encodeInteger.

Result

Better codegen for encodeInteger.

Merge pull request from GHSA-q36x-r5x4-h4q6

Motivation

The HTTP2FrameDecoder is a complex object that was written early in the development of swift-nio-http2. Its logical flow is complex, and it hasn't been meaningfully rewritten in quite some time, so it's difficult to work with and understand.

Annoyingly, some bugs have crept in over the years. Because of the structure of the code it can be quite difficult to understand how the parser actually works, and fixing a given issue can be difficult.

This patch aims to produce a substantial change to the HTTP2FrameDecoder to make it easier to understand and maintain in the long run.

Modifications

This patch provides a complete rewrite of HTTP2FrameDecoder. It doesn't do this by having a ground-up rewrite: instead, it's more like a renovation, with the general scaffolding kept. The rewrite was performed incrementally, keeping the existing test suite passing and writing new tests when necessary.

The following major changes were made:

  1. New states and edges were added to the state machine to handle padding.

    Prior to this change, padding was handled as part of frame payload decoding. This is not totally unreasonable, but it dispersed padding management widely and made it easy to have bugs slip in. This patch replaces this with a smaller set of locations.

    Padding is now handled in two distinct ways. For HEADERS and PUSH_PROMISE frames, trailing padding is still stripped as part of frame payload decode, but it's done so generically, and the padding bytes are never exposed to the individual frame parser. For DATA, there is a new state to handle trailing padding removal, which simplifies the highly complex logic around synthesised data frames.

    For all frames, the leading padding byte is handled by a new dedicated state which is used unconditionally, instead of attempting to opportunistically strip it. This simplifies the code flow.

    As a side benefit, this change means we can now accurately report the padding used on HEADERS and PUSH_PROMISE frames, even when they are part of a CONTINUATION sequence.

  2. The synthesised DATA frame logic has been substantially reworked.

    With the removal of the padding logic from the state, we now know that so long as we have either got a byte of data to emit or the DATA frame is zero length, we will always emit a frame. This has made it simpler to understand the control flow when synthesising DATA frames.

  3. The monolithic state switch has been refactored into per-state methods.

    This helps manage the amount of state that each method can see, as well as to logically split them up. In addition, it allows us to recast state transformations as (fairly) pure functions.

    Additionally, this allowed the larger methods to be refactored with smaller helpers that are more obviously correct.

  4. The frame payload parsers have been rewritten.

    The main goal here was to remove preflight length checks and unsafe code. The preflight length checks cause trouble when they disagree with the parsing code, so we now rely on the parsing code being correct with regard to length.

    Relatedly, we previously had two separate places where we communicated length: a frame header length and a ByteBuffer length. This was unnecessary duplication of information, so we instead use a ByteBuffer slice to manage the length. This ensures that we cannot over-parse a message.

    Finally, in places that used unsafe code or separate integer reads, we have refactored to stop using that unsafe code and to use combined integer reads.

  5. Extraneous serialization code has been extracted.

    The HTTP2FrameEncoder was unnecessarily in this file, which took a large file and made it larger. I moved this out.

Result

The resulting parser is clearer and safer. Complex logic has been broken out into smaller methods with less access to global data. The code should be generally clearer.

Update alloc limits and clean up soundness (#331)

  • Lower allocation limits

Our recent fixes reduced our allocation count, which is good.

  • fix code soundness check

motivation: code should be sound!

changes:

  • update one case of inappropriate langauge
  • update header years check regex

(cherry picked from commit a89e8ae4e0928026dfbab356dc8bf1ca33e1bc95)

Co-authored-by: tom doron tomer@apple.com

ci update (#329)

  • ci update

motivation: 5.6 is out

changes:

  • use release version of 5.6

  • add docker setup for 5.7 (using nightly for now)

  • Update to new alloc counts

Co-authored-by: Cory Benfield lukasa@apple.com

Clients can't push! (#332)

Motivation:

Nowhere in the PUSH_PROMISE code do we explicitly check whether or not we're a client or a server, and we don't provide special handling for SETTINGS_ENABLE_PUSH contingent on that fact. As a result, we do technically allow clients to push. This eventually gets caught, but not until we've created an idle stream, which we shouldn't do.

Modifications:

  • Forbid clients from pushing.

Result:

Clients won't be allowed to push.

Co-authored-by: George Barnett gbarnett@apple.com

Created at 1 month ago
issue comment
StreamMap slow?

Thanks for filing this: I think the debug builds are vastly overestimating the cost of the StreamMap. CircularBuffer is an abstraction over Array, and so it provides extra layers of performance cost that are not optimised out in debug builds. In release builds, however, it vanishes.

I've noticed release builds being way faster too, so maybe all of this is a non-issue. I'll check when I have time 👍

Created at 1 month ago
issue comment
StreamMap slow?

Just reading the header comment in StreamMap (which I should have done before posting this issue, sorry) - it notes that a tradeoff was made resulting in slower lookups, but it assumes number of streams is low. I think if you're totally blasting a server with many gigs/s in uploads over a fast connection, which is our use case, stream IDs per connection could actually wind up being quite substantial.

It also dismisses dictionaries, due to the computational load of computing hashes. But couldn't we just make HTTP2StreamID Hashable - and simply provide the underlying Int32 as the hashValue

Created at 1 month ago
closed issue
StreamMap slow?

This is a WIP issue as I plan to investigate further.

I'm working on a swift-nio project that has an emphasis on large file transfers. Getting a stream of bytes out of the server is very fast, but uploading a stream is around 3x slower - even when the incoming buffers aren't handled/stored at all.

I've profiled in instruments and see a lot of traffic (around 15%) in StreamMap that's just trying to pick a struct out of a collection based on a stream ID. On first glance, it seems to me that we can probably do away with CircularBuffer, in favour of something more simple. I think the fact that stream IDs start at zero, and increment from there (but may arrive out of order), potentially opens up some optimisation opportunities.

Note that this performance bottleneck was noticed in debug builds. I'll report back with release build findings.

Created at 1 month ago
issue comment
StreamMap slow?

Whoops, posted on wrong repo. See https://github.com/apple/swift-nio-http2/issues/353

Created at 1 month ago
opened issue
StreamMap slow?

This is a WIP issue as I plan to investigate further.

I'm working on a swift-nio project that has an emphasis on large file transfers. Getting a stream of bytes out of the server is very fast, but uploading a stream is around 3x slower - even when the incoming buffers aren't handled/stored at all.

I've profiled in instruments and see a lot of traffic (around 15%) in StreamMap that's just trying to pick a struct out of a collection based on a stream ID. On first glance, it seems to me that we can probably do away with CircularBuffer, in favour of something more simple. I think the fact that stream IDs start at zero, and increment from there (but may arrive out of order), potentially opens up some optimisation opportunities.

Note that this performance bottleneck was noticed in debug builds. I'll report back with release build findings.

Created at 1 month ago
opened issue
StreamMap slow?

This is a WIP issue as I plan to investigate further.

I'm working on a swift-nio project that has an emphasis on large file transfers. Getting a stream of bytes out of the server is very fast, but uploading a stream is around 3x slower - even when the incoming buffers aren't handled/stored at all.

I've profiled in instruments and see a lot of traffic (around 15%) in StreamMap that's just trying to pick a struct out of a collection based on a stream ID. On first glance, it seems to me that we can probably do away with CircularBuffer, in favour of something more simple. I think the fact that stream IDs start at zero, and increment from there (but may arrive out of order), potentially opens up some optimisation opportunities.

Note that this performance bottleneck was noticed in debug builds. I'll report back with release build findings.

Created at 1 month ago
issue comment
How about http3?

Appreciate the quick, clear response. Sad that server-side swift isn't yet getting any investment in these modern standards, but fair.

Created at 1 month ago
issue comment
How about http3?

@Lukasa - is there any update on what the thinking is on this internally? QUIC is long standardised now and supported in all browsers, with implementations in all major languages (except Swift). As someone with deeper insights/understanding on all of this, do you think borrowing some of the work done with Netty/QUIC would be a worthwhile or wasted effort?

Is this on the official nio roadmap, or should those of us wanting this migrate away and try our luck elsewhere? Thanks!

Created at 1 month ago