Perf issue: High longtail response times with KeepAlive=true when running with(or without) workers

**Describe the bug**

When using `ab` with `-k` or using `hey` (a go based a/b network testing library) I see very high long tail response times that is an order of magnitude of what I would expect.

AB without keepalive:

```
$  ab -c 10 -n 200 127.0.0.1:3000/
...
Percentage of the requests served within a certain time (ms)
  50%    420
  66%    421
  75%    422
  80%    422
  90%    423
  95%    424
  98%    426
  99%    428
 100%    432 (longest request)
```

Longest request took 432ms

With keepalive:

```
$  ab -c 10 -n 200 -k 127.0.0.1:3000/
...
Percentage of the requests served within a certain time (ms)
  50%    205
  66%    205
  75%    206
  80%    206
  90%   1852
  95%   2258
  98%   2273
  99%   2450
 100%   2477 (longest request)
```

Longest request took 2,477ms (or 5.7x longer than without keepalive). 

**Puma config:**

```
$ cat Gemfile
source "https://rubygems.org"

gem "puma"
gem "rack"
⛄️ 3.3.1 🚀 /private/tmp/e3f9bf2d7f5e861b6de95d7e48a1b5b0
$ cat config.ru
require 'rack'

response = ['200', {'Content-Type' => 'text/html'}, ['Hello, World!']]
app = Proc.new do |env|
  sleep 0.2
  response
end

run app
$ bundle exec puma -t5 -p 3000
Puma starting in single mode...
* Puma version: 6.4.2 (ruby 3.3.1-p55) ("The Eagle of Durango")
*  Min threads: 5
*  Max threads: 5
*  Environment: development
*          PID: 7434
* Listening on http://0.0.0.0:3000
Use Ctrl-C to stop
```

**To Reproduce**

- Use the above example app that sleeps for 200ms
- Boot with `bundle exec puma -t5 -p 3000`
- Hit with AB and keepalive enabled or `hey`

```
$ hey -c 10 -n 200 http://localhost:3000/

Summary:
  Total:	8.3877 secs
  Slowest:	2.4628 secs
  Fastest:	0.2005 secs
  Average:	0.3612 secs
  Requests/sec:	23.8445

  Total data:	2600 bytes
  Size/request:	13 bytes

Response time histogram:
  0.201 [1]	|
  0.427 [184]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.653 [0]	|
  0.879 [0]	|
  1.105 [0]	|
  1.332 [0]	|
  1.558 [0]	|
  1.784 [0]	|
  2.010 [0]	|
  2.237 [0]	|
  2.463 [15]	|■■■
```

## Extra observations

- Enabling workers makes the problem go away
- Disabling keepalive makes the problem go away

In `server.rb` the codepaths diverge https://github.com/puma/puma/blob/796d8c6af139a22e29e57e156c05c66ed3082634/lib/puma/server.rb#L467-L469 and that could be an okay place to look. Also within `prepare_response()` in `lib/puma/request.rb`. The return value of true indicates keepalive and false indicates no keepalive.

Edit (extra): you can also use existing middleware in the puma directory `puma -t5 -p3000 test/rackup/sleep.ru` and hit the `sleep0.2` endpoint to achieve roughly the same behavior.

Edit (extra): From within the puma directory you can run:

```
$ bundle exec ruby -Ilib bin/puma -t5  -p 3000 test/rackup/sleep.ru
```


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Perf issue: High longtail response times with KeepAlive=true when running with(or without) workers #3487

Extra observations

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

	case handle_request(client, requests + 1)
	when false
	break

Perf issue: High longtail response times with KeepAlive=true when running with(or without) workers #3487

Description

Extra observations

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions