sbitmap hang
theory
one invariant
one sbq_wait left in sbq_wait_state
there should be at least wake_cnt - complete_cnt + wake_batch inflight bits
sleepers are added in an interleaved way
different wakeups may wake up just one sleeper in the same wq
a sleeper is added only if all tags are allocated
adding sleeper
three steps
get tag failed
add sleeper
retry get tag
final retry
final retry can be from normal wakeup code path
double wakeup
allocation should succeed if any tag is available
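The three steps above can be sketched as a toy, single-threaded model. This is not kernel code: `TagPool`, `alloc_or_sleep` and the depth of 2 are hypothetical names and values chosen for illustration. It only shows why the sleeper is registered before the retry: a tag freed between the two attempts is then either picked up by the retry or results in a wakeup.

```python
from collections import deque


class TagPool:
    """Toy model of a tag pool with a single wait queue (illustrative only)."""

    def __init__(self, depth):
        self.free = set(range(depth))
        self.sleepers = deque()

    def get_tag(self):
        """Return a free tag, or None when every tag is allocated."""
        return self.free.pop() if self.free else None

    def put_tag(self, tag):
        """Release a tag back to the pool."""
        self.free.add(tag)

    def alloc_or_sleep(self, who):
        # Step 1: get tag failed?
        tag = self.get_tag()
        if tag is not None:
            return tag
        # Step 2: add sleeper before retrying, so a completion that lands
        # between the two attempts is not missed.
        self.sleepers.append(who)
        # Step 3: retry get tag; on success, leave the wait queue again.
        tag = self.get_tag()
        if tag is not None:
            self.sleepers.remove(who)
            return tag
        return None  # the caller would now sleep until woken


pool = TagPool(depth=2)
a = pool.alloc_or_sleep("A")
b = pool.alloc_or_sleep("B")
c = pool.alloc_or_sleep("C")  # pool exhausted, so C stays queued as a sleeper
```

In this model a sleeper is only ever left on the queue when both attempts found no free tag, which matches the note that allocation should succeed if any tag is available.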
the un-completed allocation
when to be added to sleeper?
what is sbq's state when it is added
what is the last wakeup?
gap is increased to 7 from 6
what is the last wakeup batch?
make other sleepers get tags except for this one
wakeup nothing
likely?
no
reports after fix merged
nvme: 5 queues, depth 64, 5 fio jobs, each fio has 8 threads
sbitmap resize run
cpu: 40 cores
completion - wakeup = 7
one completion less
one fio job hang
the last waiter
observation
all bits are cleared
one ws is still active
added to this sbq
wakeup
complete_cnt = wake_cnt + 6
two times
so two extra wakeups are lost
possible reasons
how to investigate?
all bits are released
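A minimal sketch of why a gap of 6 or 7 between completions and wakeups can leave the last waiter stuck. It assumes that wakeups are issued only in whole batches, once complete_cnt - wake_cnt reaches wake_batch, and that wake_batch is 8; neither is stated in these notes. `WakeCounters` is a hypothetical name.

```python
class WakeCounters:
    """Toy model of batched wakeup accounting (illustrative only)."""

    def __init__(self, wake_batch):
        self.wake_batch = wake_batch  # assumed value, see the note above
        self.complete_cnt = 0
        self.wake_cnt = 0

    def complete(self, nr=1):
        """Record nr completions and return how many wakeups are issued."""
        self.complete_cnt += nr
        if self.complete_cnt - self.wake_cnt < self.wake_batch:
            return 0  # batch not full yet: nobody is woken
        self.wake_cnt += self.wake_batch
        return self.wake_batch


counters = WakeCounters(wake_batch=8)
issued = [counters.complete() for _ in range(7)]
gap = counters.complete_cnt - counters.wake_cnt
```

Under these assumptions, seven completions issue no wakeup at all. If no request is left in flight at that point, no further completion arrives to fill the batch, and a sleeper that is still queued is never woken.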
io jobs
libaio/direct write
60 io jobs
command line?
ext4 FS
devices
nvme
3 queues
depth: 64
hw depth is 1023
nr_cpu_ids: 40
scheduler: none
another nvme
2 queues
others are same
possible reasons
two completion batches wake up just one wq
two sleepers are involved
interleaved
each wake_up_nr() returns > 0
investigation
bpftrace
tracepoint
sbitmap_queue_wake_up
sbitmap_prepare_to_wait
sbitmap_finish_wait
trace what?
the last wakeup
sbitmap_queue_wake_up
the last wakeup batch
sbq status when the sleeper is added to ws
how many sleepers are woken up in the last wakeup
is it possible?
warn when a sleeper is added while there are free bits
sbq stat?
observe sbq allocation statistics
counting ->ws_active
min/max/avg
on sbitmap_prepare_to_wait
counting how many sleepers are removed in each wakeup
bits utilization
100% most of time
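The min/max/avg counting of ->ws_active could be prototyped offline before writing the tracing script. The sketch below is a plain running-statistics accumulator; `RunningStats` and the sample values are hypothetical, standing in for the values that would be read on each sbitmap_prepare_to_wait.

```python
class RunningStats:
    """Running min/max/avg over a stream of integer samples."""

    def __init__(self):
        self.count = 0
        self.total = 0
        self.min = None
        self.max = None

    def add(self, value):
        """Fold one sample into the statistics."""
        self.count += 1
        self.total += value
        self.min = value if self.min is None else min(self.min, value)
        self.max = value if self.max is None else max(self.max, value)

    @property
    def avg(self):
        """Mean of all samples, or None before the first sample."""
        return self.total / self.count if self.count else None


stats = RunningStats()
for ws_active in (1, 3, 2):  # made-up samples, not measured data
    stats.add(ws_active)
```

The same accumulator could be reused for the number of sleepers removed per wakeup.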