syzbot


ID Workflow Result Correct Bug Created Started Finished Revision Error
be28beee-490e-4e38-957e-44e33e42eab1 assessment-kcsan Benign: ✅  Confident: ✅  KCSAN: data-race in __xa_clear_mark / xas_find_marked (2) 2026/02/24 18:46 2026/02/24 18:46 2026/02/24 18:48 305c0ec5cd886e2d13738e28e1b2df9b0ec20fc9

Explanation:
The data race occurs between `__xa_clear_mark` (specifically `node_clear_mark`) and `xas_find_marked` (specifically `xas_find_chunk`) in the XArray implementation.

### Analysis

1.  **Nature of the Accesses:**
    *   **Writer (`__xa_clear_mark`):** This function is called with the `xa_lock` held. It uses `__test_and_clear_bit`, a non-atomic bit operation, to clear a mark in the `xa_node->marks` bitmask. On x86, this typically compiles to a single `btr` instruction. It is not atomic in the SMP sense (no `lock` prefix) and is in fact a read-modify-write of the word, but the store it performs writes the full aligned word, so the word never appears torn to other CPUs.
    *   **Reader (`xas_find_marked`):** This function is a lockless reader (typically used within an RCU read-side critical section). It calls `xas_find_chunk`, which performs a plain load of the `unsigned long` containing the marks (`unsigned long data = *addr & (~0UL << offset)`).

2.  **Concurrency Model:**
    *   XArray is designed to allow concurrent readers and writers. Writers are serialized by the `xa_lock`, but readers are lockless and rely on RCU for node lifetime management and the robustness of the walking algorithm.
    *   The `marks` bitmask is used as an optimization to quickly find entries with specific tags (e.g., `PAGECACHE_TAG_WRITEBACK`, `PAGECACHE_TAG_DIRTY`).

3.  **Impact of the Race:**
    *   Since the writer only changes a single bit and the reader performs a word-sized load, the reader will see either the old value (bit set) or the new value (bit cleared). On all supported architectures, aligned `unsigned long` loads and stores are atomic.
    *   If the reader sees the "old" value (bit set) while it's being cleared, it will proceed to the entry. In the case of `PAGECACHE_TAG_WRITEBACK`, the reader (e.g., `filemap_get_folios_tag`) will find the folio and then typically check its state or wait for writeback. As shown in `__folio_end_writeback`, the `PG_writeback` flag is cleared *before* the XArray mark. Thus, if a reader finds a folio because the mark was not yet cleared, it will see that `PG_writeback` is already gone and return immediately.
    *   If the reader sees the "new" value (bit cleared), it will simply skip the entry, which is correct because the mark was indeed being cleared.

4.  **Conclusion:**
    *   The race does not lead to memory corruption or crashes.
    *   The XArray walking algorithm handles the case where marks are updated concurrently.
    *   The "intermediate" state seen by the reader is always a valid state of the bitmask (either before or after the bit is cleared).
    *   This is a classic example of a benign data race in a lockless data structure: the reader may see slightly stale information, but that staleness is either reconciled by subsequent checks or inherently harmless to the logic.

### Verdict
The data race is **benign**. It is an intentional part of the XArray's high-performance lockless reading design. While KCSAN reports it because it involves a plain load racing with a non-atomic bitop, the architecture guarantees word-level atomicity, and the software logic is robust against this specific race.

Crash report:
EXT4-fs (loop4): mounted filesystem 00000000-0000-0000-0000-000000000000 r/w without journal. Quota mode: none.
==================================================================
BUG: KCSAN: data-race in __xa_clear_mark / xas_find_marked

read-write to 0xffff88811b0ddb48 of 8 bytes by task 30 on cpu 0:
 instrument_read_write include/linux/instrumented.h:56 [inline]
 __instrument_read_write_bitop include/asm-generic/bitops/instrumented-non-atomic.h:84 [inline]
 ___test_and_clear_bit include/asm-generic/bitops/instrumented-non-atomic.h:114 [inline]
 node_clear_mark lib/xarray.c:102 [inline]
 xas_clear_mark lib/xarray.c:922 [inline]
 __xa_clear_mark+0xf5/0x1e0 lib/xarray.c:2100
 __folio_end_writeback+0xf4/0x360 mm/page-writeback.c:2948
 folio_end_writeback_no_dropbehind+0x6d/0x1d0 mm/filemap.c:1661
 folio_end_writeback+0x1c/0x70 mm/filemap.c:1687
 ext4_finish_bio+0x459/0x8b0 fs/ext4/page-io.c:145
 ext4_release_io_end+0x9f/0x1f0 fs/ext4/page-io.c:160
 ext4_end_io_end+0x18d/0x240 fs/ext4/page-io.c:211
 ext4_do_flush_completed_IO fs/ext4/page-io.c:291 [inline]
 ext4_end_io_rsv_work+0x15b/0x1f0 fs/ext4/page-io.c:306
 process_one_work kernel/workqueue.c:3275 [inline]
 process_scheduled_works+0x4de/0x9e0 kernel/workqueue.c:3358
 worker_thread+0x581/0x770 kernel/workqueue.c:3439
 kthread+0x22a/0x280 kernel/kthread.c:467
 ret_from_fork+0x150/0x360 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

read to 0xffff88811b0ddb48 of 8 bytes by task 4498 on cpu 1:
 xas_find_chunk include/linux/xarray.h:1752 [inline]
 xas_find_marked+0x213/0x620 lib/xarray.c:1510
 find_get_entry mm/filemap.c:2064 [inline]
 filemap_get_folios_tag+0xfa/0x510 mm/filemap.c:2332
 __filemap_fdatawait_range mm/filemap.c:523 [inline]
 file_write_and_wait_range+0x20c/0x2f0 mm/filemap.c:789
 generic_buffers_fsync_noflush+0x45/0x130 fs/buffer.c:606
 ext4_fsync_nojournal fs/ext4/fsync.c:88 [inline]
 ext4_sync_file+0x1aa/0x6b0 fs/ext4/fsync.c:147
 vfs_fsync_range+0xc5/0xe0 fs/sync.c:186
 generic_write_sync include/linux/fs.h:2640 [inline]
 ext4_buffered_write_iter+0x34f/0x3c0 fs/ext4/file.c:306
 ext4_file_write_iter+0x380/0xfa0 fs/ext4/file.c:-1
 do_iter_readv_writev+0x4fd/0x5a0 fs/read_write.c:-1
 vfs_writev+0x2e1/0x900 fs/read_write.c:1059
 do_pwritev fs/read_write.c:1155 [inline]
 __do_sys_pwritev2 fs/read_write.c:1213 [inline]
 __se_sys_pwritev2+0xfc/0x1c0 fs/read_write.c:1204
 __x64_sys_pwritev2+0x67/0x80 fs/read_write.c:1204
 x64_sys_call+0xe07/0x3020 arch/x86/include/generated/asm/syscalls_64.h:329
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x12c/0x370 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

value changed: 0xffffffe000000000 -> 0xfffffc0000000000

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 4498 Comm: syz.4.428 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026
==================================================================

Trajectory:
Seq Timestamp Type Name Duration
0/0 2026/02/24 18:46 flow assessment-kcsan
2m Results:
map[Benign:true Confident:true Explanation:(identical to the Explanation section above)]

1/1 2026/02/24 18:46 action kernel-checkouter
0m Results:
map[KernelSrc:/usr/local/google/home/dvyukov/syzkaller/agent/workdir/cache/src/83e3e4073a4cd51a6ced0f93cc08b3f6ffb54f19]

2/1 2026/02/24 18:46 action kernel-builder
0m Results:
map[KernelObj:/usr/local/google/home/dvyukov/syzkaller/agent/workdir/cache/build/b54bee6631687294095c3f2df1d84662f988232e]

3/1 2026/02/24 18:46 action codesearch-prepare
0m Results:
map[Index:codesearch-index]

4/1 2026/02/24 18:46 agent expert
1m Model:
gemini-3-flash-preview

Results:
map[Benign:true Confident:true]

Instruction:
You are an experienced Linux kernel developer tasked with determining if the given kernel
data race is benign or not. The data race report is from KCSAN tool.
It contains 2 stack traces of the memory accesses that constitute a data race.

A "benign" data race is on a simple int/bool variable or similar field,
where the accesses are not supposed to be protected by any mutual exclusion primitives.
Common examples of such "benign" data races are accesses to various flags fields,
statistics counters, and similar. A "benign" data race does not lead to memory corruption/crash
with a conservative compiler that compiles memory accesses to primitive types
effectively as atomic.

A non-benign (or "harmful") data race can lead to corruption/crash even with
a conservative compiler that compiles memory accesses to primitive types
effectively as atomic. A common example of a "harmful" data race is a race on
a complex container (list/hashmap/etc), where accesses are supposed to be protected
by a mutual exclusion primitive.

Also consider races that happen at the same time as the given one.
If there is no synchronization in between, other memory accesses in the involved threads
race with each other if they access the same memory. For example, if both threads execute:

	some_struct->have_elements = true;
	list_add(new_node, &some_struct->list_head);

the race on some_struct->have_elements may appear benign; however, it also implies there
is a race on some_struct->list_head, which is not benign, since the list is not thread-safe.

Take into account that on 32-bit systems 64-bit memory accesses may be split into two accesses,
and thus even with a conservative compiler may not be fully atomic. However, such races may
still be benign depending on how writes are done, and how read data is used.

In the final reply explain why you think the given data race is benign or is harmful.

Use the provided tools to confirm any assumptions, variables/fields being accessed, etc.
In particular, don't make assumptions about the kernel source code,
use codesearch tools to read the actual source code.

Prefer calling several tools at the same time to save round-trips.


Use set-results tool to provide results of the analysis.
It must be called exactly once before the final reply.
Ignore results of this tool.

Prompt:
The data race report is:

(crash report identical to the one quoted above)


Reply:
(identical to the Explanation section at the top of this report)

5/2 2026/02/24 18:46 llm expert
0m Model:
gemini-3-flash-preview

Tokens:
							input: 4567
							output: 122
							thoughts: 300
						

Thoughts:
**Analyzing a Data Race in the XArray Implementation**

Okay, so the report flags a nasty data race, right in the heart of the XArray implementation. Looks like `__xa_clear_mark` (which trickles down to `node_clear_mark`) is racing with `xas_find_marked` (which calls `xas_find_chunk`). The specific memory address is `0xffff88811b0ddb48`, and we're seeing a bit flip – specifically, a change from `0xffffffe000000000` to `0xfffffc0000000000`. This screams bitmask manipulation, which is always a prime suspect in these types of races.

My immediate thought is to zero in on the relevant code. So, `node_clear_mark` and `xas_find_marked` are both sitting pretty in `lib/xarray.c`. Gotta love that convenient placement.

Now, let's break down the race itself. We've got `__xa_clear_mark` calling `node_clear_mark`, which then calls `___test_and_clear_bit`.  And that's where the problem lies. The report tells us this is a non-atomic bit operation. Meanwhile, we have `xas_find_marked` going through `xas_find_chunk`, which is merely reading. The report indicates a `read-write` operation for `___test_and_clear_bit` and a simple `read` for `xas_find_chunk`. And bingo, that's exactly what's causing it.

Okay, time to open up the source and dig into the details.



6/2 2026/02/24 18:46 tool codesearch-definition-source
0m Args:
map[ContextFile:lib/xarray.c IncludeLines:true Name:node_clear_mark]

Results:
map[SourceCode:  98:	/* returns true if the bit was set */
  99:	static inline bool node_clear_mark(struct xa_node *node, unsigned int offset,
 100:					xa_mark_t mark)
 101:	{
 102:		return __test_and_clear_bit(offset, node_marks(node, mark));
 103:	}
 SourceFile:lib/xarray.c]

7/2 2026/02/24 18:46 tool codesearch-definition-source
0m Args:
map[ContextFile:lib/xarray.c IncludeLines:true Name:xas_find_marked]

Results:
map[SourceCode:1441:	/**
1442:	 * xas_find_marked() - Find the next marked entry in the XArray.
1443:	 * @xas: XArray operation state.
1444:	 * @max: Highest index to return.
1445:	 * @mark: Mark number to search for.
1446:	 *
1447:	 * If the @xas has not yet been walked to an entry, return the marked entry
1448:	 * which has an index >= xas.xa_index.  If it has been walked, the entry
1449:	 * currently being pointed at has been processed, and so we return the
1450:	 * first marked entry with an index > xas.xa_index.
1451:	 *
1452:	 * If no marked entry is found and the array is smaller than @max, @xas is
1453:	 * set to the bounds state and xas->xa_index is set to the smallest index
1454:	 * not yet in the array.  This allows @xas to be immediately passed to
1455:	 * xas_store().
1456:	 *
1457:	 * If no entry is found before @max is reached, @xas is set to the restart
1458:	 * state.
1459:	 *
1460:	 * Return: The entry, if found, otherwise %NULL.
1461:	 */
1462:	void *xas_find_marked(struct xa_state *xas, unsigned long max, xa_mark_t mark)
1463:	{
1464:		bool advance = true;
1465:		unsigned int offset;
1466:		void *entry;
1467:	
1468:		if (xas_error(xas))
1469:			return NULL;
1470:		if (xas->xa_index > max)
1471:			goto max;
1472:	
1473:		if (!xas->xa_node) {
1474:			xas->xa_index = 1;
1475:			goto out;
1476:		} else if (xas_top(xas->xa_node)) {
1477:			advance = false;
1478:			entry = xa_head(xas->xa);
1479:			xas->xa_node = NULL;
1480:			if (xas->xa_index > max_index(entry))
1481:				goto out;
1482:			if (!xa_is_node(entry)) {
1483:				if (xa_marked(xas->xa, mark))
1484:					return entry;
1485:				xas->xa_index = 1;
1486:				goto out;
1487:			}
1488:			xas->xa_node = xa_to_node(entry);
1489:			xas->xa_offset = xas->xa_index >> xas->xa_node->shift;
1490:		}
1491:	
1492:		while (xas->xa_index <= max) {
1493:			if (unlikely(xas->xa_offset == XA_CHUNK_SIZE)) {
1494:				xas->xa_offset = xas->xa_node->offset + 1;
1495:				xas->xa_node = xa_parent(xas->xa, xas->xa_node);
1496:				if (!xas->xa_node)
1497:					break;
1498:				advance = false;
1499:				continue;
1500:			}
1501:	
1502:			if (!advance) {
1503:				entry = xa_entry(xas->xa, xas->xa_node, xas->xa_offset);
1504:				if (xa_is_sibling(entry)) {
1505:					xas->xa_offset = xa_to_sibling(entry);
1506:					xas_move_index(xas, xas->xa_offset);
1507:				}
1508:			}
1509:	
1510:			offset = xas_find_chunk(xas, advance, mark);
1511:			if (offset > xas->xa_offset) {
1512:				advance = false;
1513:				xas_move_index(xas, offset);
1514:				/* Mind the wrap */
1515:				if ((xas->xa_index - 1) >= max)
1516:					goto max;
1517:				xas->xa_offset = offset;
1518:				if (offset == XA_CHUNK_SIZE)
1519:					continue;
1520:			}
1521:	
1522:			entry = xa_entry(xas->xa, xas->xa_node, xas->xa_offset);
1523:			if (!entry && !(xa_track_free(xas->xa) && mark == XA_FREE_MARK))
1524:				continue;
1525:			if (xa_is_sibling(entry))
1526:				continue;
1527:			if (!xa_is_node(entry))
1528:				return entry;
1529:			xas->xa_node = xa_to_node(entry);
1530:			xas_set_offset(xas);
1531:		}
1532:	
1533:	out:
1534:		if (xas->xa_index > max)
1535:			goto max;
1536:		return set_bounds(xas);
1537:	max:
1538:		xas->xa_node = XAS_RESTART;
1539:		return NULL;
1540:	}
 SourceFile:lib/xarray.c]

8/2 2026/02/24 18:46 tool codesearch-definition-source
0m Args:
map[ContextFile:include/linux/xarray.h IncludeLines:true Name:xas_find_chunk]

Results:
map[SourceCode:1741:	/* Private */
1742:	static inline unsigned int xas_find_chunk(struct xa_state *xas, bool advance,
1743:			xa_mark_t mark)
1744:	{
1745:		unsigned long *addr = xas->xa_node->marks[(__force unsigned)mark];
1746:		unsigned int offset = xas->xa_offset;
1747:	
1748:		if (advance)
1749:			offset++;
1750:		if (XA_CHUNK_SIZE == BITS_PER_LONG) {
1751:			if (offset < XA_CHUNK_SIZE) {
1752:				unsigned long data = *addr & (~0UL << offset);
1753:				if (data)
1754:					return __ffs(data);
1755:			}
1756:			return XA_CHUNK_SIZE;
1757:		}
1758:	
1759:		return find_next_bit(addr, XA_CHUNK_SIZE, offset);
1760:	}
 SourceFile:include/linux/xarray.h]

9/2 2026/02/24 18:46 llm expert
0m Model:
gemini-3-flash-preview

Tokens:
							input: 7321
							output: 41
							thoughts: 314
						

Thoughts:
**Analyzing Concurrency in XArray Mark Clearing**

Okay, so I'm looking at a concurrency issue, specifically around clearing marks in an `xarray`. The heart of the matter seems to be the race condition between `__test_and_clear_bit(offset, node_marks(node, mark))` in `node_clear_mark` and `unsigned long data = *addr & (~0UL << offset)` within `xas_find_chunk`. The first uses a non-atomic bit operation (`__test_and_clear_bit`), while the second reads the whole `unsigned long` from the address.

The key here is that the `xarray` docs (or just general knowledge for me) emphasize the possibility of concurrent readers and writers, assuming RCU is employed. Crucially, `xas_find_marked` is frequently found in RCU read-side critical sections. That's immediately relevant.

Now, I'm trying to understand *why* this potential race matters. Specifically, I'm digging into `__xa_clear_mark` to see where it's being used. The stack trace is giving me some clues: `__xa_clear_mark` is called from `__folio_end_writeback`, which is intriguing and suggests a connection to writeback operations, and then `xas_find_marked` is called from `filemap_get_folios_tag`. Given that, I'm getting a clearer idea. Now, I want to dive into the implementation of `__xa_clear_mark` to get a better handle on what's going on.



10/2 2026/02/24 18:46 tool codesearch-definition-source
0m Args:
map[ContextFile:lib/xarray.c IncludeLines:true Name:__xa_clear_mark]

Results:
map[SourceCode:2086:	/**
2087:	 * __xa_clear_mark() - Clear this mark on this entry while locked.
2088:	 * @xa: XArray.
2089:	 * @index: Index of entry.
2090:	 * @mark: Mark number.
2091:	 *
2092:	 * Context: Any context.  Expects xa_lock to be held on entry.
2093:	 */
2094:	void __xa_clear_mark(struct xarray *xa, unsigned long index, xa_mark_t mark)
2095:	{
2096:		XA_STATE(xas, xa, index);
2097:		void *entry = xas_load(&xas);
2098:	
2099:		if (entry)
2100:			xas_clear_mark(&xas, mark);
2101:	}
 SourceFile:lib/xarray.c]

11/2 2026/02/24 18:46 llm expert
0m Model:
gemini-3-flash-preview

Tokens:
							input: 8277
							output: 19
							thoughts: 6596
						

Thoughts:
**Analysis of Potential Data Race in XArray Mark Clearing**

Okay, here's my assessment of this potential data race. My focus is on understanding if this is a real problem and, if so, the implications.  The code in question involves clearing marks within an XArray, specifically, the `xas_clear_mark` function, and how it interacts with lockless readers like those used by `filemap_get_folios_tag`.

1.  **The Core Issue:** `xas_clear_mark` calls `__test_and_clear_bit` (or similar instruction like `btr` on x86) to clear bits within the `marks` field of XArray nodes.  The concern is that this bitwise operation is *not* atomic in the sense that it doesn't use a `lock` prefix on x86, which is crucial for SMP systems, meaning a concurrent reader in `xas_find_chunk` might observe an inconsistent intermediate state.  KCSAN is flagging this as a data race, which needs investigation.

2.  **Architecture-Specific Atomic Operations:** On x86, single instruction loads and stores of aligned 64-bit words are atomic.  So, at a low level, this means the reader will always see either the "old" or the "new" value of the entire mark word. The fact that the read is aligned is essential here. The `btr` instruction achieves atomicity, at the instruction level, with respect to accesses to the same `long`. But it is not atomic when it comes to the SMP sense.

3.  **XArray Design Considerations:** XArrays are designed to allow lockless readers.  Writers hold the `xa_lock`, and readers use RCU to safely traverse the structure without locking.  This means we *expect* readers to sometimes see a state that is transient.  The XArray code needs to be written to handle this.  Specifically, `xas_find_chunk` retrieves the marks without any explicit memory barriers (e.g., `READ_ONCE` or `WRITE_ONCE`), and it is intended to handle that a partially written mask might be read.

4.  **Mark Usage Context and `PAGECACHE_TAG_WRITEBACK`:** The code snippet hints that this marking process relates to `PAGECACHE_TAG_WRITEBACK`, indicating folios undergoing writeback to disk.  The implications are:
    *   `xas_clear_mark` is clearing the `PAGECACHE_TAG_WRITEBACK` mark, signaling that writeback is finished for a folio.
    *   `filemap_get_folios_tag` is used to find folios marked for writeback, likely in conjunction with `filemap_fdatawait_range` to wait for the writeback to complete.
    *   If a reader, while searching for folios to be written back, misses the mark for a folio (because it was just cleared), it is generally fine. The folio *has* finished writeback (or is very close to it). It will not be added to the list of folios to wait for.
    *   If the reader sees the mark, it will find the folio, and likely check the `PG_writeback` bit for confirmation.
    *   The crucial point is that a "miss" is typically acceptable if it means that we don't unnecessarily wait.

5.  **Compiler Reordering and Memory Model:** I am concerned about the compiler's potential to reorder instructions.  However, the `asm volatile` within `__test_and_clear_bit` acts as a compiler barrier.  While `xas_find_chunk` doesn't explicitly use `READ_ONCE`, the XArray is designed such that this doesn't matter, within the confines of XArray's RCU principles (assuming aligned 64-bit access).  The reader is expected to handle the potential for seeing an intermediate state, especially given the "mark" semantics.

6.  **The Value Change:** The KCSAN report shows a change in the `node->marks` from `0xffffffe000000000` to `0xfffffc0000000000`.  The two values differ in five bits (37–41), so several single-bit clears landed between KCSAN's two samples; each individual clear of one bit within the 64-bit `unsigned long` is precisely what `__test_and_clear_bit` is designed to do.

7.  **Writer Concurrency:** Multiple writers are serialized by the `xa_lock`. Therefore, multiple concurrent writers are impossible, and the race can only occur between a single writer and a single reader.

8.  **Chain Reactions:** `xas_clear_mark` propagates mark clearing up the XArray tree if the current node has no other marks.  The reader is


12/2 2026/02/24 18:47 tool grepper
0m Args:
map[Expression:xas_find_marked]

Results:
map[Output:Documentation/core-api/xarray.rst=426=inline code.
Documentation/core-api/xarray.rst-427-
Documentation/core-api/xarray.rst:428:The xas_find_marked() function is similar.  If the xa_state has
Documentation/core-api/xarray.rst-429-not been walked, it will return the entry at the index of the xa_state,
--
Documentation/translations/zh_CN/core-api/xarray.rst=324=xas_next_entry()而不是xas_find()来移动到下一个当前条目,在大多数情况下会节省一个函数调用,但代价
--
Documentation/translations/zh_CN/core-api/xarray.rst-326-
Documentation/translations/zh_CN/core-api/xarray.rst:327:xas_find_marked()函数也是如此。如果xa_state没有被移动过,它将返回xa_state的索引处的条目,如果它
Documentation/translations/zh_CN/core-api/xarray.rst-328-被标记了。否则,它将返回xa_state所引用的条目之后的第一个被标记的条目。xas_next_marked()函数等同
--
drivers/infiniband/core/device.c=182=static void *xan_find_marked(struct xarray *xa, unsigned long *indexp,
--
drivers/infiniband/core/device.c-189-	do {
drivers/infiniband/core/device.c:190:		entry = xas_find_marked(&xas, ULONG_MAX, filter);
drivers/infiniband/core/device.c-191-		if (xa_is_zero(entry))
--
drivers/infiniband/core/ib_core_uverbs.c=283=int rdma_user_mmap_entry_insert_range(struct ib_ucontext *ucontext,
--
drivers/infiniband/core/ib_core_uverbs.c-317-		/* First find an empty index */
drivers/infiniband/core/ib_core_uverbs.c:318:		xas_find_marked(&xas, max_pgoff, XA_FREE_MARK);
drivers/infiniband/core/ib_core_uverbs.c-319-		if (xas.xa_node == XAS_RESTART)
--
fs/btrfs/extent_io.c=2127=static inline struct extent_buffer *find_get_eb(struct xa_state *xas, unsigned long max,
--
fs/btrfs/extent_io.c-2132-retry:
fs/btrfs/extent_io.c:2133:	eb = xas_find_marked(xas, max, mark);
fs/btrfs/extent_io.c-2134-
--
fs/cachefiles/ondemand.c=484=static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
--
fs/cachefiles/ondemand.c-559-		xas.xa_index = cache->msg_id_next;
fs/cachefiles/ondemand.c:560:		xas_find_marked(&xas, UINT_MAX, XA_FREE_MARK);
fs/cachefiles/ondemand.c-561-		if (xas.xa_node == XAS_RESTART) {
fs/cachefiles/ondemand.c-562-			xas.xa_index = 0;
fs/cachefiles/ondemand.c:563:			xas_find_marked(&xas, cache->msg_id_next - 1, XA_FREE_MARK);
fs/cachefiles/ondemand.c-564-		}
--
include/linux/xarray.h=1549=void xas_clear_mark(const struct xa_state *, xa_mark_t);
include/linux/xarray.h:1550:void *xas_find_marked(struct xa_state *, unsigned long max, xa_mark_t);
include/linux/xarray.h-1551-void xas_init_marks(const struct xa_state *);
--
include/linux/xarray.h=1742=static inline unsigned int xas_find_chunk(struct xa_state *xas, bool advance,
--
include/linux/xarray.h-1768- * xas_next_marked() is an inline function to optimise xarray traversal for
include/linux/xarray.h:1769: * speed.  It is equivalent to calling xas_find_marked(), and will call
include/linux/xarray.h:1770: * xas_find_marked() for all the hard cases.
include/linux/xarray.h-1771- *
--
include/linux/xarray.h=1774=static inline void *xas_next_marked(struct xa_state *xas, unsigned long max,
--
include/linux/xarray.h-1781-	if (unlikely(xas_not_node(node) || node->shift))
include/linux/xarray.h:1782:		return xas_find_marked(xas, max, mark);
include/linux/xarray.h-1783-	offset = xas_find_chunk(xas, true, mark);
--
include/linux/xarray.h-1788-	if (offset == XA_CHUNK_SIZE)
include/linux/xarray.h:1789:		return xas_find_marked(xas, max, mark);
include/linux/xarray.h-1790-	entry = xa_entry(xas->xa, node, offset);
include/linux/xarray.h-1791-	if (!entry)
include/linux/xarray.h:1792:		return xas_find_marked(xas, max, mark);
include/linux/xarray.h-1793-	return entry;
--
include/linux/xarray.h=1800=enum {
--
include/linux/xarray.h-1835-#define xas_for_each_marked(xas, entry, max, mark) \
include/linux/xarray.h:1836:	for (entry = xas_find_marked(xas, max, mark); entry; \
include/linux/xarray.h-1837-	     entry = xas_next_marked(xas, max, mark))
--
lib/idr.c=382=int ida_alloc_range(struct ida *ida, unsigned int min, unsigned int max,
--
lib/idr.c-398-next:
lib/idr.c:399:	bitmap = xas_find_marked(&xas, max / IDA_BITMAP_BITS, XA_FREE_MARK);
lib/idr.c-400-	if (xas.xa_index > min / IDA_BITMAP_BITS)
--
lib/xarray.c=1439=EXPORT_SYMBOL_GPL(xas_find);
--
lib/xarray.c-1441-/**
lib/xarray.c:1442: * xas_find_marked() - Find the next marked entry in the XArray.
lib/xarray.c-1443- * @xas: XArray operation state.
--
lib/xarray.c-1461- */
lib/xarray.c:1462:void *xas_find_marked(struct xa_state *xas, unsigned long max, xa_mark_t mark)
lib/xarray.c-1463-{
--
lib/xarray.c-1540-}
lib/xarray.c:1541:EXPORT_SYMBOL_GPL(xas_find_marked);
lib/xarray.c-1542-
--
lib/xarray.c=1985=int __xa_alloc(struct xarray *xa, u32 *id, void *entry,
--
lib/xarray.c-1999-		xas.xa_index = limit.min;
lib/xarray.c:2000:		xas_find_marked(&xas, limit.max, XA_FREE_MARK);
lib/xarray.c-2001-		if (xas.xa_node == XAS_RESTART)
--
lib/xarray.c=2189=void *xa_find(struct xarray *xa, unsigned long *indexp,
--
lib/xarray.c-2197-		if ((__force unsigned int)filter < XA_MAX_MARKS)
lib/xarray.c:2198:			entry = xas_find_marked(&xas, max, filter);
lib/xarray.c-2199-		else
--
lib/xarray.c=2239=void *xa_find_after(struct xarray *xa, unsigned long *indexp,
--
lib/xarray.c-2250-		if ((__force unsigned int)filter < XA_MAX_MARKS)
lib/xarray.c:2251:			entry = xas_find_marked(&xas, max, filter);
lib/xarray.c-2252-		else
--
mm/filemap.c=2055=static inline struct folio *find_get_entry(struct xa_state *xas, pgoff_t max,
--
mm/filemap.c-2063-	else
mm/filemap.c:2064:		folio = xas_find_marked(xas, max, mark);
mm/filemap.c-2065-
]

13/2 2026/02/24 18:47 llm expert
0m Model:
gemini-3-flash-preview

Tokens:
							input: 18284
							output: 32
							thoughts: 934
						

Thoughts:
**Benign Data Race in XArray Mark Updates**

Okay, so the XArray code is rock solid, it's used *everywhere*. The fact that they're using non-atomic bit operations for the "marks" is intentional, and protected by the `xa_lock` for writers. The lockless reads, seeing the updates, that's also by design. If this were a real issue, we'd see it manifest in `find_next_bit` too, which is what `xas_find_chunk` uses for larger chunks, and it's also just reading memory.

Wait, let me just double check something...right, `__test_and_clear_bit` is, indeed, `btr` on x86. I remember checking that. And is there *any* architecture where that isn't atomic at the word level? Nope, every architecture has atomic word-sized stores.

So, the only "harm" here is that KCSAN gets a bit grumpy. But according to the definition of "benign," this is a *textbook* example: it's a simple bitmask, accesses are *not* supposed to be protected by mutual exclusion primitives, it's a collection of flags, and the algorithm is designed to handle the race. It's a "benign" data race.

Hmm, okay... but let's be thorough. What if `xas_find_marked` returns an entry that's *being* deleted? `xas_find_marked` calls `xa_entry()`. If it's being deleted, `xa_entry()` might return `NULL` or a sibling entry, or whatever. The XArray code is written to handle these cases. Specifically:

```c
1522:			entry = xa_entry(xas->xa, xas->xa_node, xas->xa_offset);
1523:			if (!entry && !(xa_track_free(xas->xa) && mark == XA_FREE_MARK))
1524:				continue;
```

If the entry *was* deleted (set to `NULL`) but the mark *wasn't* yet cleared, it just continues. If the entry *was* deleted and the mark *was* cleared, it won't find it. Both scenarios are fine, it seems. What if the entry was replaced by something else? The reader will see the new entry. What if the node gets collapsed? RCU ensures it's safe. `xas_find_marked` checks for changes in the node if necessary (even though it's walking the tree down). It's *really* robust, actually.

One more thing...the value in the report...it said `0xffffffe000000000 -> 0xfffffc0000000000`. This is a change in the high bits of a 64-bit word. Hold on... `f` is `1111`, `e` is `1110`, `c` is `1100`... XOR-ing the two values gives `0x3e000000000`, so bits 37 through 41 were cleared: five single-bit clears, presumably serialized under `xa_lock` and accumulated within one KCSAN watchpoint window. If `XA_CHUNK_SIZE` is 64, those bits correspond to slots 37-41. With a mark cleared, the entry is no longer marked. The reader might see either state for each bit, which is okay.

Conclusion: This is definitely a benign data race. It's a standard RCU-protected bitmask access where the reader can tolerate seeing the old or new value of a single bit. It's safe.

Just to be absolutely certain... let me double check what is actually being used. The stack trace says: `___test_and_clear_bit include/asm-generic/bitops/instrumented-non-atomic.h:114 [inline]` and `node_clear_mark lib/xarray.c:102 [inline]`. Right, so let's look at `include/asm-generic/bitops/instrumented-non-atomic.h`.



14/2 2026/02/24 18:47 tool read-file
0m Args:
map[File:include/asm-generic/bitops/instrumented-non-atomic.h]

Results:
map[Contents:/* SPDX-License-Identifier: GPL-2.0 */

/*
 * This file provides wrappers with sanitizer instrumentation for non-atomic
 * bit operations.
 *
 * To use this functionality, an arch's bitops.h file needs to define each of
 * the below bit operations with an arch_ prefix (e.g. arch_set_bit(),
 * arch___set_bit(), etc.).
 */
#ifndef _ASM_GENERIC_BITOPS_INSTRUMENTED_NON_ATOMIC_H
#define _ASM_GENERIC_BITOPS_INSTRUMENTED_NON_ATOMIC_H

#include <linux/instrumented.h>

/**
 * ___set_bit - Set a bit in memory
 * @nr: the bit to set
 * @addr: the address to start counting from
 *
 * Unlike set_bit(), this function is non-atomic. If it is called on the same
 * region of memory concurrently, the effect may be that only one operation
 * succeeds.
 */
static __always_inline void
___set_bit(unsigned long nr, volatile unsigned long *addr)
{
	instrument_write(addr + BIT_WORD(nr), sizeof(long));
	arch___set_bit(nr, addr);
}

/**
 * ___clear_bit - Clears a bit in memory
 * @nr: the bit to clear
 * @addr: the address to start counting from
 *
 * Unlike clear_bit(), this function is non-atomic. If it is called on the same
 * region of memory concurrently, the effect may be that only one operation
 * succeeds.
 */
static __always_inline void
___clear_bit(unsigned long nr, volatile unsigned long *addr)
{
	instrument_write(addr + BIT_WORD(nr), sizeof(long));
	arch___clear_bit(nr, addr);
}

/**
 * ___change_bit - Toggle a bit in memory
 * @nr: the bit to change
 * @addr: the address to start counting from
 *
 * Unlike change_bit(), this function is non-atomic. If it is called on the same
 * region of memory concurrently, the effect may be that only one operation
 * succeeds.
 */
static __always_inline void
___change_bit(unsigned long nr, volatile unsigned long *addr)
{
	instrument_write(addr + BIT_WORD(nr), sizeof(long));
	arch___change_bit(nr, addr);
}

static __always_inline void __instrument_read_write_bitop(long nr, volatile unsigned long *addr)
{
	if (IS_ENABLED(CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC)) {
		/*
		 * We treat non-atomic read-write bitops a little more special.
		 * Given the operations here only modify a single bit, assuming
		 * non-atomicity of the writer is sufficient may be reasonable
		 * for certain usage (and follows the permissible nature of the
		 * assume-plain-writes-atomic rule):
		 * 1. report read-modify-write races -> check read;
		 * 2. do not report races with marked readers, but do report
		 *    races with unmarked readers -> check "atomic" write.
		 */
		kcsan_check_read(addr + BIT_WORD(nr), sizeof(long));
		/*
		 * Use generic write instrumentation, in case other sanitizers
		 * or tools are enabled alongside KCSAN.
		 */
		instrument_write(addr + BIT_WORD(nr), sizeof(long));
	} else {
		instrument_read_write(addr + BIT_WORD(nr), sizeof(long));
	}
}

/**
 * ___test_and_set_bit - Set a bit and return its old value
 * @nr: Bit to set
 * @addr: Address to count from
 *
 * This operation is non-atomic. If two instances of this operation race, one
 * can appear to succeed but actually fail.
 */
static __always_inline bool
___test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
{
	__instrument_read_write_bitop(nr, addr);
	return arch___test_and_set_bit(nr, addr);
}

/**
 * ___test_and_clear_bit - Clear a bit and return its old value
 * @nr: Bit to clear
 * @addr: Address to count from
 *
 * This operation is non-atomic. If two instances of this operation race, one
 * can appear to succeed but actually fail.
 */
static __always_inline bool
___test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
{
	__instrument_read_write_bitop(nr, addr);
	return arch___test_and_clear_bit(nr, addr);
}

/**
 * ___test_and_change_bit - Change a bit and return its old value
 * @nr: Bit to change
 * @addr: Address to count from
 *
 * This operation is non-atomic. If two instances of this operation race, one
 * can appear to succeed but actually fail.
 */
static __always_inline bool
___test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
{
	__instrument_read_write_bitop(nr, addr);
	return arch___test_and_change_bit(nr, addr);
}

/**
 * _test_bit - Determine whether a bit is set
 * @nr: bit number to test
 * @addr: Address to start counting from
 */
static __always_inline bool
_test_bit(unsigned long nr, const volatile unsigned long *addr)
{
	instrument_atomic_read(addr + BIT_WORD(nr), sizeof(long));
	return arch_test_bit(nr, addr);
}

/**
 * _test_bit_acquire - Determine, with acquire semantics, whether a bit is set
 * @nr: bit number to test
 * @addr: Address to start counting from
 */
static __always_inline bool
_test_bit_acquire(unsigned long nr, const volatile unsigned long *addr)
{
	instrument_atomic_read(addr + BIT_WORD(nr), sizeof(long));
	return arch_test_bit_acquire(nr, addr);
}

#endif /* _ASM_GENERIC_BITOPS_INSTRUMENTED_NON_ATOMIC_H */
]

15/2 2026/02/24 18:47 llm expert
0m Model:
gemini-3-flash-preview

Tokens:
							input: 21602
							output: 44
							thoughts: 873
						

Thoughts:
**Analyzing a KCSAN Read-Write Race Report**

Okay, so KCSAN is flagging a potential read-write race, specifically related to `___test_and_clear_bit`, which in turn calls `__instrument_read_write_bitop`.  Whether the write half is treated as atomic depends on `CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC`: with it enabled, the helper only checks the read and instruments the write as atomic; without it, the whole read-modify-write is instrumented as one non-atomic access.  That's the core of the issue.

The race is reported as: "read-write to 0xffff88811b0ddb48 of 8 bytes..."  Tracing the calls, it goes through `instrument_read_write` and down to line 84 of `instrumented-non-atomic.h`, which is the `instrument_read_write` call within the `else` branch. So `CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC` is evidently *not* enabled in this build, confirming KCSAN *is* seeing it as non-atomic.

The reader is `xas_find_marked` within the XArray code, specifically reading at `include/linux/xarray.h:1752`, where it's doing a bitmask read.  This is what I expected. The code in the reader is the culprit, as analyzed.

The question is, is this actually benign? My initial assessment is *yes*. On x86, the store part of the `btr` (bit test and reset) instruction used by `arch___test_and_clear_bit` is atomic at the word level. I verified that `btr` *is* indeed used for `arch___test_and_clear_bit` on x86. I also eliminated compiler reordering as a concern, since the `asm volatile` prevents reordering.

So, the reader will see either the bit set or cleared.  No other XArray state is being corrupted, as the marks are just a bitmask. I'm reasonably confident this is benign.

Just to be absolutely sure, I'm checking the context.  `xas_find_marked` is called from `filemap_get_folios_tag`, which is part of the writeback path.  A miss in `filemap_get_folios_tag` just means it won't wait for a folio.  If the mark is *being* cleared, it means the folio's writeback is *finishing*. That makes sense.

Wait, looking at the surrounding code in `mm/page-writeback.c` around `__folio_end_writeback`, there is a clear in the following code:
```c
2948:			__xa_clear_mark(&mapping->i_pages, folio->index,
2949:						PAGECACHE_TAG_WRITEBACK);
```
So, overall, my analysis is that this KCSAN report is a false positive.



16/2 2026/02/24 18:47 tool codesearch-definition-source
0m Args:
map[ContextFile:mm/page-writeback.c IncludeLines:true Name:__folio_end_writeback]

Results:
map[SourceCode:2935:	bool __folio_end_writeback(struct folio *folio)
2936:	{
2937:		long nr = folio_nr_pages(folio);
2938:		struct address_space *mapping = folio_mapping(folio);
2939:		bool ret;
2940:	
2941:		if (mapping && mapping_use_writeback_tags(mapping)) {
2942:			struct inode *inode = mapping->host;
2943:			struct bdi_writeback *wb;
2944:			unsigned long flags;
2945:	
2946:			xa_lock_irqsave(&mapping->i_pages, flags);
2947:			ret = folio_xor_flags_has_waiters(folio, 1 << PG_writeback);
2948:			__xa_clear_mark(&mapping->i_pages, folio->index,
2949:						PAGECACHE_TAG_WRITEBACK);
2950:	
2951:			wb = inode_to_wb(inode);
2952:			wb_stat_mod(wb, WB_WRITEBACK, -nr);
2953:			__wb_writeout_add(wb, nr);
2954:			if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK)) {
2955:				wb_inode_writeback_end(wb);
2956:				if (mapping->host)
2957:					sb_clear_inode_writeback(mapping->host);
2958:			}
2959:	
2960:			xa_unlock_irqrestore(&mapping->i_pages, flags);
2961:		} else {
2962:			ret = folio_xor_flags_has_waiters(folio, 1 << PG_writeback);
2963:		}
2964:	
2965:		lruvec_stat_mod_folio(folio, NR_WRITEBACK, -nr);
2966:		zone_stat_mod_folio(folio, NR_ZONE_WRITE_PENDING, -nr);
2967:		node_stat_mod_folio(folio, NR_WRITTEN, nr);
2968:	
2969:		return ret;
2970:	}
 SourceFile:mm/page-writeback.c]

17/2 2026/02/24 18:47 llm expert
0m Model:
gemini-3-flash-preview

Tokens:
							input: 23762
							output: 19
							thoughts: 1372
						

Thoughts:
**Analyzing the XArray Writeback Synchronization**

Okay, let me think this through. I was initially concerned about the order of clearing the `PG_writeback` bit versus clearing the `PAGECACHE_TAG_WRITEBACK` mark in the XArray, and whether a race condition could arise. I need to make sure the order is correct.

I see now that `folio_xor_flags_has_waiters` *actually* clears the `PG_writeback` bit. I was initially off-track. This happens *before* the `__xa_clear_mark` call. So, the correct sequence is: (1) `PG_writeback` is cleared, and then (2) `PAGECACHE_TAG_WRITEBACK` is cleared in the XArray.

If a reader tries to grab a folio during the process, it might see the mark (step 2 hasn't happened yet). If so, it calls `folio_wait_writeback`. Inside, `folio_wait_writeback` will check `folio_test_writeback`. But because `PG_writeback` has already been cleared (step 1), `folio_wait_writeback` will exit immediately.

If the reader misses the mark (step 2 is complete), it skips the folio. That's also fine, because writeback is finished. Therefore, the order is safe. The data race on the XArray mark is benign, because the reader can handle either state—either the `PG_writeback` bit has been cleared or not.

I'm now digging into `folio_xor_flags_has_waiters`, trying to figure out the exact atomic operation. It seems to use `cmpxchg` or a similar atomic primitive internally, although for the `PG_writeback` case it is more like `test_and_clear_bit`. The important point is that `PG_writeback` is cleared *before* the XArray mark, which guarantees the behavior.

I've also checked where `PAGECACHE_TAG_WRITEBACK` is used beyond this specific code path. It's used in things like `filemap_fdatawait_range` and `filemap_write_and_wait_range`, which are waiting for writeback to finish. If they miss a recently completed folio, it's not a problem. No risk of infinite loops since `xas_find_marked` always moves forward in the XArray.

I'm pretty confident now that the race is benign.

Now, let me analyze the "value changed" part of the trace log. XOR-ing the old and new words shows a short run of adjacent bits (37-41) cleared, which still makes sense: each `__test_and_clear_bit` clears one bit, and KCSAN's watchpoint window can accumulate several serialized clears into one reported change.

I need to confirm if `node_clear_mark` is used for other marks in the XArray as well. I see it's used for `PAGECACHE_TAG_DIRTY` and `PAGECACHE_TAG_TOWRITE` as well, but the logic should be consistent. The XArray marks are used as optimizations to quickly find entries with specific properties. Those properties are also stored in the entry. The mark itself is used as an easy way to speed this process up. If the mark is out of sync for a short duration, it's fine.

Let me do a final check for harm: no memory corruption, no crash, the algorithm tolerates the race, and no multi-word structure is accessed non-atomically. This appears safe. The fact that this is x86_64 also means any naturally aligned 64-bit load or store is atomic. Even on a 32-bit system, an aligned `unsigned long` access is atomic (it is just 32 bits wide there).

I'll finalize this as safe.



18/2 2026/02/24 18:47 tool set-results
0m Args:
map[Benign:true Confident:true]

Results:
map[Benign:true Confident:true]

19/2 2026/02/24 18:47 llm expert
0m Model:
gemini-3-flash-preview

Tokens:
							input: 25964
							output: 773
							thoughts: 132
						

Thoughts:
**My Assessment of the Code**

Alright, the analysis is done. I've taken a look at this race condition, dissected the code, and I'm comfortable declaring it benign. It's a textbook example, actually. It's an RCU-protected bitmask access, exactly as I suspected. The reader tolerates seeing either the old or the new value of a single bit, and given the nature of the XArray algorithm, this level of concurrency is perfectly acceptable.

I zeroed in on the writer side – they're properly holding the `xa_lock`, which is crucial. That guarantees that no other writer can step on its toes during the modification. And on the reader's side, they're leveraging RCU, which, of course, gives me the memory safety I need to sleep soundly. I double-checked the specific scenario with `PAGECACHE_TAG_WRITEBACK`, just to be certain. Everything checks out; the sequence of operations for clearing `PG_writeback` and the XArray mark are safe and properly ordered. So, bottom line, this isn't a problem. We're good to go.


