syzbot

ID	Workflow	Result	Correct	Bug	Created	Started	Finished	Revision	Error
3876cab2-c7da-4e1d-b950-186fea7fcefb	assessment-kcsan		💥	KCSAN: data-race in __lru_add_drain_all / folio_add_lru	2026/01/15 23:52	2026/01/15 23:52	2026/01/15 23:53	a9d6a79219801d2130df3b1a792c57f0e5428e9f	LLM did not call tool to set outputs

assessment-kcsan

💥

KCSAN: data-race in __lru_add_drain_all / folio_add_lru

2026/01/15 23:52

2026/01/15 23:53

a9d6a79219801d2130df3b1a792c57f0e5428e9f

LLM did not call tool to set outputs

EXT4-fs (loop4): unmounting filesystem 00000000-0000-0000-0000-000000000000. ================================================================== BUG: KCSAN: data-race in __lru_add_drain_all / folio_add_lru read-write to 0xffff888237d26468 of 1 bytes by task 6441 on cpu 1: folio_batch_add include/linux/pagevec.h:77 [inline] __folio_batch_add_and_move mm/swap.c:194 [inline] folio_add_lru+0xa5/0x1f0 mm/swap.c:511 filemap_add_folio+0x26d/0x360 mm/filemap.c:982 __filemap_get_folio_mpol+0x326/0x650 mm/filemap.c:2016 __filemap_get_folio include/linux/pagemap.h:763 [inline] grow_dev_folio fs/buffer.c:1050 [inline] grow_buffers fs/buffer.c:1116 [inline] __getblk_slow fs/buffer.c:1134 [inline] bdev_getblk+0x174/0x3f0 fs/buffer.c:1461 sb_getblk_gfp include/linux/buffer_head.h:392 [inline] __ext4_sb_bread_gfp+0x44/0x170 fs/ext4/super.c:236 ext4_sb_bread_unmovable fs/ext4/super.c:265 [inline] ext4_load_super fs/ext4/super.c:5166 [inline] __ext4_fill_super fs/ext4/super.c:5312 [inline] ext4_fill_super+0x14ed/0x37a0 fs/ext4/super.c:5777 get_tree_bdev_flags+0x291/0x300 fs/super.c:1691 get_tree_bdev+0x1f/0x30 fs/super.c:1714 ext4_get_tree+0x1c/0x30 fs/ext4/super.c:5809 vfs_get_tree+0x57/0x1d0 fs/super.c:1751 fc_mount fs/namespace.c:1199 [inline] do_new_mount_fc fs/namespace.c:3636 [inline] do_new_mount+0x24d/0x6a0 fs/namespace.c:3712 path_mount+0x4ab/0xb80 fs/namespace.c:4022 do_mount fs/namespace.c:4035 [inline] __do_sys_mount fs/namespace.c:4224 [inline] __se_sys_mount+0x28c/0x2e0 fs/namespace.c:4201 __x64_sys_mount+0x67/0x80 fs/namespace.c:4201 x64_sys_call+0x2cca/0x3000 arch/x86/include/generated/asm/syscalls_64.h:166 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xca/0x2b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f read to 0xffff888237d26468 of 1 bytes by task 3322 on cpu 0: folio_batch_count include/linux/pagevec.h:56 [inline] cpu_needs_drain mm/swap.c:780 [inline] __lru_add_drain_all+0x17e/0x450 mm/swap.c:877 lru_add_drain_all+0x10/0x20 mm/swap.c:893 invalidate_bdev+0x47/0x70 block/bdev.c:101 ext4_put_super+0x624/0x7d0 fs/ext4/super.c:1348 generic_shutdown_super+0xe6/0x210 fs/super.c:643 kill_block_super+0x2a/0x70 fs/super.c:1722 ext4_kill_sb+0x42/0x80 fs/ext4/super.c:7444 deactivate_locked_super+0x75/0x1c0 fs/super.c:474 deactivate_super+0x97/0xa0 fs/super.c:507 cleanup_mnt+0x2a9/0x320 fs/namespace.c:1318 __cleanup_mnt+0x19/0x20 fs/namespace.c:1325 task_work_run+0x131/0x1a0 kernel/task_work.c:233 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] __exit_to_user_mode_loop kernel/entry/common.c:44 [inline] exit_to_user_mode_loop+0x1fe/0x740 kernel/entry/common.c:75 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline] syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline] syscall_exit_to_user_mode_work include/linux/entry-common.h:159 [inline] syscall_exit_to_user_mode include/linux/entry-common.h:194 [inline] do_syscall_64+0x1dd/0x2b0 arch/x86/entry/syscall_64.c:100 entry_SYSCALL_64_after_hwframe+0x77/0x7f value changed: 0x00 -> 0x01 Reported by Kernel Concurrency Sanitizer on: CPU: 0 UID: 0 PID: 3322 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(voluntary) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 ==================================================================

Trajectory:
Seq	Timestamp	Type	Name	Duration
0/0	2026/01/15 23:52	flow	assessment-kcsan	0m Model: unknown Error: LLM did not call tool to set outputs
1/1	2026/01/15 23:52	action	kernel-checkouter	0m Model: unknown Results: map[KernelSrc:/usr/local/google/home/dvyukov/syzkaller/agent/workdir/cache/src/9e6908b28e91d9ec5708b9c726ef692ee016f4f7]
2/1	2026/01/15 23:52	action	kernel-builder	0m Model: unknown Results: map[KernelObj:/usr/local/google/home/dvyukov/syzkaller/agent/workdir/cache/build/40a25fd095a58428a3520b72b771d52fd56c4658]
3/1	2026/01/15 23:52	action	codesearch-prepare	0m Model: unknown Results: map[Index:codesearch-index]
4/1	2026/01/15 23:52	agent	expert	0m Model: unknown Error: LLM did not call tool to set outputs Instruction: You are an experienced Linux kernel developer tasked with determining if the given kernel data race is benign or not. The data race report is from KCSAN tool. It contains 2 stack traces of the memory accesses that constitute a data race. A "benign" data races are on a simple int/bool variable or similar field, and the accesses are not supposed to be protected by any mutual exclusion primitives. Common examples of such "benign" data races are accesses to various flags fields, statistics counters, and similar. A "benign" data race does not lead to memory corruption/crash with a conservative compiler that compiles memory accesses to primitive types effectively as atomic. A non-benign (or "harmful" data race) can lead to corruption/crash even with a conservative compiler that compiles memory accesses to primitive types effectively as atomic. A common example of a "harmful" data races is race on a complex container (list/hashmap/etc), where accesses are supposed to be protected by a mutual exclusion primitive. In the final reply explain why you think the given data race is benign or is harmful. Use the provided tools to confirm any assumptions, variables/fields being accessed, etc. In particular, don't make assumptions about the kernel source code, use codesearch tools to read the actual source code. Use set-results tool to provide results of the analysis. It must be called exactly once before the final reply. Ignore results of this tool. Prompt: The data race report is: EXT4-fs (loop4): unmounting filesystem 00000000-0000-0000-0000-000000000000. ================================================================== BUG: KCSAN: data-race in __lru_add_drain_all / folio_add_lru read-write to 0xffff888237d26468 of 1 bytes by task 6441 on cpu 1: folio_batch_add include/linux/pagevec.h:77 [inline] __folio_batch_add_and_move mm/swap.c:194 [inline] folio_add_lru+0xa5/0x1f0 mm/swap.c:511 filemap_add_folio+0x26d/0x360 mm/filemap.c:982 __filemap_get_folio_mpol+0x326/0x650 mm/filemap.c:2016 __filemap_get_folio include/linux/pagemap.h:763 [inline] grow_dev_folio fs/buffer.c:1050 [inline] grow_buffers fs/buffer.c:1116 [inline] __getblk_slow fs/buffer.c:1134 [inline] bdev_getblk+0x174/0x3f0 fs/buffer.c:1461 sb_getblk_gfp include/linux/buffer_head.h:392 [inline] __ext4_sb_bread_gfp+0x44/0x170 fs/ext4/super.c:236 ext4_sb_bread_unmovable fs/ext4/super.c:265 [inline] ext4_load_super fs/ext4/super.c:5166 [inline] __ext4_fill_super fs/ext4/super.c:5312 [inline] ext4_fill_super+0x14ed/0x37a0 fs/ext4/super.c:5777 get_tree_bdev_flags+0x291/0x300 fs/super.c:1691 get_tree_bdev+0x1f/0x30 fs/super.c:1714 ext4_get_tree+0x1c/0x30 fs/ext4/super.c:5809 vfs_get_tree+0x57/0x1d0 fs/super.c:1751 fc_mount fs/namespace.c:1199 [inline] do_new_mount_fc fs/namespace.c:3636 [inline] do_new_mount+0x24d/0x6a0 fs/namespace.c:3712 path_mount+0x4ab/0xb80 fs/namespace.c:4022 do_mount fs/namespace.c:4035 [inline] __do_sys_mount fs/namespace.c:4224 [inline] __se_sys_mount+0x28c/0x2e0 fs/namespace.c:4201 __x64_sys_mount+0x67/0x80 fs/namespace.c:4201 x64_sys_call+0x2cca/0x3000 arch/x86/include/generated/asm/syscalls_64.h:166 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xca/0x2b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f read to 0xffff888237d26468 of 1 bytes by task 3322 on cpu 0: folio_batch_count include/linux/pagevec.h:56 [inline] cpu_needs_drain mm/swap.c:780 [inline] __lru_add_drain_all+0x17e/0x450 mm/swap.c:877 lru_add_drain_all+0x10/0x20 mm/swap.c:893 invalidate_bdev+0x47/0x70 block/bdev.c:101 ext4_put_super+0x624/0x7d0 fs/ext4/super.c:1348 generic_shutdown_super+0xe6/0x210 fs/super.c:643 kill_block_super+0x2a/0x70 fs/super.c:1722 ext4_kill_sb+0x42/0x80 fs/ext4/super.c:7444 deactivate_locked_super+0x75/0x1c0 fs/super.c:474 deactivate_super+0x97/0xa0 fs/super.c:507 cleanup_mnt+0x2a9/0x320 fs/namespace.c:1318 __cleanup_mnt+0x19/0x20 fs/namespace.c:1325 task_work_run+0x131/0x1a0 kernel/task_work.c:233 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] __exit_to_user_mode_loop kernel/entry/common.c:44 [inline] exit_to_user_mode_loop+0x1fe/0x740 kernel/entry/common.c:75 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline] syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline] syscall_exit_to_user_mode_work include/linux/entry-common.h:159 [inline] syscall_exit_to_user_mode include/linux/entry-common.h:194 [inline] do_syscall_64+0x1dd/0x2b0 arch/x86/entry/syscall_64.c:100 entry_SYSCALL_64_after_hwframe+0x77/0x7f value changed: 0x00 -> 0x01 Reported by Kernel Concurrency Sanitizer on: CPU: 0 UID: 0 PID: 3322 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(voluntary) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 ==================================================================
5/2	2026/01/15 23:52	llm	expert	0m Model: unknown Thoughts: A Data Race Investigation Okay, so the report indicates a classic data race – specifically between `folio_add_lru`, which goes through `folio_batch_add`, and `__lru_add_drain_all` that ultimately calls `cpu_needs_drain`. The address in question is `0xffff888237d26468`, which stores a single byte. It looks like the value went from `0x00` to `0x01`, which suggests it's a counter, most likely representing the number of folios in a `folio_batch`. Let's dive into the stack traces. On one side (task 6441), which I'm inferring is the writer, we have `folio_batch_add` at `include/linux/pagevec.h:77`, then `__folio_batch_add_and_move` in `mm/swap.c:194`, and finally `folio_add_lru` at `mm/swap.c:511`. That's the write path. On the other side (task 3322), presumably the reader, we see `folio_batch_count` at `include/linux/pagevec.h:56`, followed by `cpu_needs_drain` at `mm/swap.c:780`, and ultimately `__lru_add_drain_all` at `mm/swap.c:877`. This is the read path. It's clear that the two functions are accessing and possibly modifying the same variable simultaneously. Now it's time to examine the relevant source code. Let's see how these functions interact.
6/2	2026/01/15 23:52	tool	codesearch-definition-source	0m Model: unknown Args: map[IncludeLines:true Name:folio_batch_add SourceFile:include/linux/pagevec.h] Results: map[Missing:false SourceCode: 63: /** 64: * folio_batch_add() - Add a folio to a batch. 65: * @fbatch: The folio batch. 66: * @folio: The folio to add. 67: * 68: * The folio is added to the end of the batch. 69: * The batch must have previously been initialised using folio_batch_init(). 70: * 71: * Return: The number of slots still available. 72: / 73: static inline unsigned folio_batch_add(struct folio_batch fbatch, 74: struct folio *folio) 75: { 76: fbatch->folios[fbatch->nr++] = folio; 77: return folio_batch_space(fbatch); 78: } SourceFile:include/linux/pagevec.h]
7/2	2026/01/15 23:52	llm	expert	0m Model: unknown Thoughts: Okay, let's break this down. Decoding This Report: A Deep Dive into Pagevec Alright, the report highlights a read-write operation in `folio_batch_add`. Specifically, the report flags line 77 in `include/linux/pagevec.h`. But wait, line 76 is where it's happening: `fbatch->folios[fbatch->nr++] = folio;`. This is crucial. `fbatch->nr++` is the heart of the problem. It's a read-modify-write operation, fetching the current value of `fbatch->nr`, incrementing it, and then storing the new value back. The report pinpoints the read-write to one byte, which aligns perfectly with a simple increment of an integer. Ah, now line 77, `return folio_batch_space(fbatch);`, makes sense; the actual use of the incremented value is on the next line! It's calling `folio_batch_space`, which is likely why the profiler flags it as being called in the reported line number. So, now, let's shift our focus to `folio_batch_space`. What does that function do with `fbatch`? We need to understand how the increment and this space check interact to determine the performance impact.
8/2	2026/01/15 23:52	tool	codesearch-definition-source	0m Model: unknown Args: map[IncludeLines:true Name:folio_batch_space SourceFile:include/linux/pagevec.h] Results: map[Missing:false SourceCode: 58: static inline unsigned int folio_batch_space(const struct folio_batch *fbatch) 59: { 60: return PAGEVEC_SIZE - fbatch->nr; 61: } SourceFile:include/linux/pagevec.h]
9/2	2026/01/15 23:52	llm	expert	0m Model: unknown Thoughts: Thinking Through a Potential Concurrency Issue Okay, so I'm looking at this code and a bit concerned. Specifically, it's line 77 within `folio_batch_add` that's got my attention. It seems that line calls `folio_batch_space(fbatch)`, which, as I recall, accesses `fbatch->nr`. Now, if the increment, `fbatch->nr++`, was done right before on line 76, then line 77 would also be accessing `fbatch->nr` almost immediately. The report I'm looking at clearly indicates a "read-write" scenario, pinpointing the read on line 77 of `folio_batch_add` and involving task 6441. But hang on, the increment is the write, isn't it? Right, so the write is `fbatch->nr++`... now, let's look at the reader stack. I need to understand what's reading this value at the same time and what the context is. It looks like `folio_batch_count` is being called, specifically at `include/linux/pagevec.h:56`. Okay, that's interesting. And then `cpu_needs_drain` at `mm/swap.c:780`, and `__lru_add_drain_all` at `mm/swap.c:877`. Seems like memory management-related operations. Alright, let's take a closer look at `folio_batch_count`.
10/2	2026/01/15 23:52	tool	codesearch-definition-source	0m Model: unknown Args: map[IncludeLines:true Name:folio_batch_count SourceFile:include/linux/pagevec.h] Results: map[Missing:false SourceCode: 53: static inline unsigned int folio_batch_count(const struct folio_batch *fbatch) 54: { 55: return fbatch->nr; 56: } SourceFile:include/linux/pagevec.h]
11/2	2026/01/15 23:52	llm	expert	0m Model: unknown Thoughts: Here's a breakdown of my thought process: Racing on `fbatch->nr` Alright, this looks like a classic race condition scenario. I'm focusing on the interaction between a writer, `folio_batch_add`, and a reader, `cpu_needs_drain`, specifically related to the `fbatch->nr` variable. This screams potential data corruption or missed signals if not handled correctly. The use of the term race already sets my mind racing for lock protection or atomic operations. My next step is to deep dive into where `cpu_needs_drain` resides, as I can't immediately deduce the potential issue. I know it's a memory management file, `mm/swap.c`, so I'll go there next. I need to understand what's draining, why it's happening, and how `fbatch->nr` is involved.
12/2	2026/01/15 23:52	tool	codesearch-definition-source	0m Model: unknown Args: map[IncludeLines:true Name:cpu_needs_drain SourceFile:mm/swap.c] Results: map[Missing:false SourceCode: 774: static bool cpu_needs_drain(unsigned int cpu) 775: { 776: struct cpu_fbatches fbatches = &per_cpu(cpu_fbatches, cpu); 777: 778: / Check these in order of likelihood that they're not zero */ 779: return folio_batch_count(&fbatches->lru_add) \|\| 780: folio_batch_count(&fbatches->lru_move_tail) \|\| 781: folio_batch_count(&fbatches->lru_deactivate_file) \|\| 782: folio_batch_count(&fbatches->lru_deactivate) \|\| 783: folio_batch_count(&fbatches->lru_lazyfree) \|\| 784: folio_batch_count(&fbatches->lru_activate) \|\| 785: need_mlock_drain(cpu) \|\| 786: has_bh_in_lru(cpu, NULL); 787: } SourceFile:mm/swap.c]
13/2	2026/01/15 23:52	llm	expert	0m Model: unknown Thoughts: Okay, here's my interpretation of that thought process, assuming I'm an expert in this field: Diving into the LRU Cache Implementation Alright, so I'm thinking about this `__lru_add_drain_all` function... It's triggered within the broader Least Recently Used cache implementation, that much is clear. The key here is the "drain all" aspect. This strongly suggests we're dealing with a situation where we need to evict all entries from the cache, likely due to a specific event or condition. It could be for memory pressure, a cache invalidation signal, a checkpoint, or something along those lines. The fact that `__lru_add_drain_all` is the entry point suggests it's designed to do more than just drain a few entries selectively. This isn't just a simple eviction, this is a purge. I wonder what the calling context is. What event triggered this operation, and what effect is this purge likely to have on the other parts of the system that rely on this cache? This is where things get interesting, digging deep into the reason, not just the action.
14/2	2026/01/15 23:52	tool	codesearch-definition-source	0m Model: unknown Args: map[IncludeLines:true Name:__lru_add_drain_all SourceFile:mm/swap.c] Results: map[Missing:false SourceCode: 789: /* 790: * Doesn't need any cpu hotplug locking because we do rely on per-cpu 791: * kworkers being shut down before our page_alloc_cpu_dead callback is 792: * executed on the offlined cpu. 793: * Calling this function with cpu hotplug locks held can actually lead 794: * to obscure indirect dependencies via WQ context. 795: / 796: static inline void __lru_add_drain_all(bool force_all_cpus) 797: { 798: / 799: * lru_drain_gen - Global pages generation number 800: * 801: * (A) Definition: global lru_drain_gen = x implies that all generations 802: * 0 < n <= x are already scheduled for draining. 803: * 804: * This is an optimization for the highly-contended use case where a 805: * user space workload keeps constantly generating a flow of pages for 806: * each CPU. 807: / 808: static unsigned int lru_drain_gen; 809: static struct cpumask has_work; 810: static DEFINE_MUTEX(lock); 811: unsigned cpu, this_gen; 812: 813: / 814: * Make sure nobody triggers this path before mm_percpu_wq is fully 815: * initialized. 816: / 817: if (WARN_ON(!mm_percpu_wq)) 818: return; 819: 820: / 821: * Guarantee folio_batch counter stores visible by this CPU 822: * are visible to other CPUs before loading the current drain 823: * generation. 824: / 825: smp_mb(); 826: 827: / 828: * (B) Locally cache global LRU draining generation number 829: * 830: * The read barrier ensures that the counter is loaded before the mutex 831: * is taken. It pairs with smp_mb() inside the mutex critical section 832: * at (D). 833: / 834: this_gen = smp_load_acquire(&lru_drain_gen); 835: 836: / It helps everyone if we do our own local drain immediately. / 837: lru_add_drain(); 838: 839: mutex_lock(&lock); 840: 841: / 842: * (C) Exit the draining operation if a newer generation, from another 843: * lru_add_drain_all(), was already scheduled for draining. Check (A). 844: / 845: if (unlikely(this_gen != lru_drain_gen && !force_all_cpus)) 846: goto done; 847: 848: / 849: * (D) Increment global generation number 850: * 851: * Pairs with smp_load_acquire() at (B), outside of the critical 852: * section. Use a full memory barrier to guarantee that the 853: * new global drain generation number is stored before loading 854: * folio_batch counters. 855: * 856: * This pairing must be done here, before the for_each_online_cpu loop 857: * below which drains the page vectors. 858: * 859: * Let x, y, and z represent some system CPU numbers, where x < y < z. 860: * Assume CPU #z is in the middle of the for_each_online_cpu loop 861: * below and has already reached CPU #y's per-cpu data. CPU #x comes 862: * along, adds some pages to its per-cpu vectors, then calls 863: * lru_add_drain_all(). 864: * 865: * If the paired barrier is done at any later step, e.g. after the 866: * loop, CPU #x will just exit at (C) and miss flushing out all of its 867: * added pages. 868: / 869: WRITE_ONCE(lru_drain_gen, lru_drain_gen + 1); 870: smp_mb(); 871: 872: cpumask_clear(&has_work); 873: for_each_online_cpu(cpu) { 874: struct work_struct work = &per_cpu(lru_add_drain_work, cpu); 875: 876: if (cpu_needs_drain(cpu)) { 877: INIT_WORK(work, lru_add_drain_per_cpu); 878: queue_work_on(cpu, mm_percpu_wq, work); 879: __cpumask_set_cpu(cpu, &has_work); 880: } 881: } 882: 883: for_each_cpu(cpu, &has_work) 884: flush_work(&per_cpu(lru_add_drain_work, cpu)); 885: 886: done: 887: mutex_unlock(&lock); 888: } SourceFile:mm/swap.c]
15/2	2026/01/15 23:52	llm	expert	0m Model: unknown Thoughts: Thinking Through a Potential Data Race in LRU Draining Okay, so I'm looking at this code in `__lru_add_drain_all`, and it's using `cpu_needs_drain(cpu)` to decide whether to queue a work item to drain the LRU on a particular CPU. `cpu_needs_drain` is just reading `fbatch->nr`, which is the folio batch count. And `folio_batch_add` is doing a simple increment (`fbatch->nr++`) which is either a plain write, or a read-modify-write. Since `fbatch->nr` is just a counter, a slightly out-of-date value is probably not a huge deal, right? If `cpu_needs_drain` reads 0 when it should be 1, it just means that we might miss draining this CPU this time. But wait, there are memory barriers in `__lru_add_drain_all`! Lines 825 and 870 have `smp_mb()`, and then `cpu_needs_drain` is called at line 876. So, if `cpu_needs_drain` returns false because it saw a stale 0, is that a problem? Typically, `lru_add_drain_all` is called when we need pages on the LRU lists, like when we're migrating or invalidating. And now I remember, `invalidate_bdev` calls it. It wants all pages drained from per-CPU batches to the LRU lists so they can be invalidated. Ah, but now I recall, this is a plain read/write, so KCSAN is right to flag it. Is it benign, though? If `cpu_needs_drain` sees 0, it doesn't queue work. If `folio_batch_add` just happened, then `fbatch->nr` has just become 1. If `cpu_needs_drain` misses this, it will be drained later, or the caller might fail to see the page on the LRU. Specifically, in the `invalidate_bdev` case, if we miss a page in the batch, that page won't be invalidated. So `invalidate_bdev` calls `lru_add_drain_all` to ensure all pages are on LRU so they can be found and invalidated. But wait... if a page is in a folio batch, it isn't on the LRU yet! `lru_add_drain_all` is what moves them. Okay, so if `invalidate_bdev` misses a page because of a stale read, it might not invalidate it. That could be bad. But then, I remember `invalidate_bdev` also calls `truncate_inode_pages`. Wait, let me check the source code for `invalidate_bdev`. Okay, here's `invalidate_bdev`: ```c void invalidate_bdev(struct block_device bdev) { struct address_space mapping = bdev->bd_inode->i_mapping; if (mapping->nrpages) { lru_add_drain_all(); /* For __invalidate_mapping_pages() / invalidate_bh_lrus(); truncate_inode_pages(mapping, 0); } } ``` Okay, so the `lru_add_drain_all` call is specifically for `__invalidate_mapping_pages()`. And `truncate_inode_pages` is called after* `lru_add_drain_all`. Got it! So, I need to look into whether a missed page invalidation in `__invalidate_mapping_pages` leads to a failure in the truncate.
16/2	2026/01/15 23:53	tool	codesearch-definition-source	0m Model: unknown Args: map[IncludeLines:true Name:invalidate_bdev SourceFile:block/bdev.c] Results: map[Missing:false SourceCode: 104: EXPORT_SYMBOL(invalidate_bdev); SourceFile:block/bdev.c]
17/2	2026/01/15 23:53	llm	expert	0m Model: unknown Thoughts: Okay, here we go. Here's how I'm thinking about this, from my expert perspective: Debugging Export Behavior Right, so I see it only "exported" whatever it was supposed to. My initial assumption was that this process completed, but the phrasing has me thinking it was incomplete. I need more information. Let me see... I need to clarify what that means, precisely. "Exported" in this context could mean any number of things, depending on the tool and the format. I need to get to the root definition of what this 'export' action is supposed to have achieved, ideally by looking at documentation, log files, or even the underlying code if necessary. The meaning of "exported" is going to be key to diagnosing whatever the problem is. I need to be exact here – ambiguity is my enemy. Let me start with a quick search for the tool's definition of "export" in its documentation.
18/2	2026/01/15 23:53	tool	codesearch-file-index	0m Model: unknown Args: map[SourceFile:block/bdev.c] Results: map[Entities:[map[Kind:function Name:BDEV_I] map[Kind:function Name:BD_INODE] map[Kind:function Name:I_BDEV] map[Kind:function Name:I_BDEV] map[Kind:function Name:bd_abort_claiming] map[Kind:function Name:bd_abort_claiming] map[Kind:function Name:bd_clear_claiming] map[Kind:function Name:bd_end_claim] map[Kind:function Name:bd_finish_claiming] map[Kind:function Name:bd_init_fs_context] map[Kind:function Name:bd_may_claim] map[Kind:function Name:bd_prepare_to_claim] map[Kind:function Name:bd_prepare_to_claim] map[Kind:function Name:bd_yield_claim] map[Kind:function Name:bdev_add] map[Kind:function Name:bdev_alloc] map[Kind:function Name:bdev_alloc_inode] map[Kind:function Name:bdev_block_writes] map[Kind:function Name:bdev_cache_init] map[Kind:function Name:bdev_claim_write_access] map[Kind:function Name:bdev_drop] map[Kind:function Name:bdev_evict_inode] map[Kind:function Name:bdev_file_open_by_dev] map[Kind:function Name:bdev_file_open_by_dev] map[Kind:function Name:bdev_file_open_by_path] map[Kind:function Name:bdev_file_open_by_path] map[Kind:function Name:bdev_fput] map[Kind:function Name:bdev_fput] map[Kind:function Name:bdev_free_inode] map[Kind:function Name:bdev_freeze] map[Kind:function Name:bdev_freeze] map[Kind:function Name:bdev_mark_dead] map[Kind:function Name:bdev_mark_dead] map[Kind:function Name:bdev_may_open] map[Kind:function Name:bdev_open] map[Kind:function Name:bdev_permission] map[Kind:function Name:bdev_release] map[Kind:function Name:bdev_set_nr_sectors] map[Kind:function Name:bdev_statx] map[Kind:function Name:bdev_thaw] map[Kind:function Name:bdev_thaw] map[Kind:function Name:bdev_unblock_writes] map[Kind:function Name:bdev_unclaimed] map[Kind:function Name:bdev_unhash] map[Kind:function Name:bdev_validate_blocksize] map[Kind:function Name:bdev_validate_blocksize] map[Kind:function Name:bdev_write_inode] map[Kind:function Name:bdev_writes_blocked] map[Kind:function Name:bdev_yield_write_access] map[Kind:function Name:blk_to_file_flags] map[Kind:function Name:blkdev_flush_mapping] map[Kind:function Name:blkdev_get_no_open] map[Kind:function Name:blkdev_get_part] map[Kind:function Name:blkdev_get_whole] map[Kind:function Name:blkdev_put_no_open] map[Kind:function Name:blkdev_put_part] map[Kind:function Name:blkdev_put_whole] map[Kind:function Name:block_size] map[Kind:function Name:block_size] map[Kind:function Name:disk_live] map[Kind:function Name:disk_live] map[Kind:function Name:file_bdev] map[Kind:function Name:file_bdev] map[Kind:function Name:init_once] map[Kind:function Name:invalidate_bdev] map[Kind:function Name:invalidate_bdev] map[Kind:function Name:kill_bdev] map[Kind:function Name:lookup_bdev] map[Kind:function Name:lookup_bdev] map[Kind:function Name:nr_blockdev_pages] map[Kind:function Name:sb_min_blocksize] map[Kind:function Name:sb_min_blocksize] map[Kind:function Name:sb_set_blocksize] map[Kind:function Name:sb_set_blocksize] map[Kind:function Name:sb_validate_large_blocksize] map[Kind:function Name:set_blocksize] map[Kind:function Name:set_blocksize] map[Kind:function Name:set_init_blocksize] map[Kind:function Name:setup_bdev_allow_write_mounted] map[Kind:function Name:sync_bdevs] map[Kind:function Name:sync_blockdev] map[Kind:function Name:sync_blockdev] map[Kind:function Name:sync_blockdev_nowait] map[Kind:function Name:sync_blockdev_nowait] map[Kind:function Name:sync_blockdev_range] map[Kind:function Name:sync_blockdev_range] map[Kind:function Name:truncate_bdev_range]] Missing:false]
19/2	2026/01/15 23:53	llm	expert	0m Model: unknown

Trajectory:

Seq

Timestamp

Type

Name

Duration

0/0

2026/01/15 23:52

flow

assessment-kcsan

Model:

unknown

Error:

LLM did not call tool to set outputs

1/1

2026/01/15 23:52

action

kernel-checkouter

Model:

unknown

Results:

map[KernelSrc:/usr/local/google/home/dvyukov/syzkaller/agent/workdir/cache/src/9e6908b28e91d9ec5708b9c726ef692ee016f4f7]

2/1

2026/01/15 23:52

action

kernel-builder

Model:

unknown

Results:

map[KernelObj:/usr/local/google/home/dvyukov/syzkaller/agent/workdir/cache/build/40a25fd095a58428a3520b72b771d52fd56c4658]

3/1

2026/01/15 23:52

action

codesearch-prepare

Model:

unknown

Results:

map[Index:codesearch-index]

4/1

2026/01/15 23:52

agent

expert

Model:

unknown

Error:

LLM did not call tool to set outputs

Instruction:

You are an experienced Linux kernel developer tasked with determining if the given kernel
data race is benign or not. The data race report is from KCSAN tool.
It contains 2 stack traces of the memory accesses that constitute a data race.

A "benign" data races are on a simple int/bool variable or similar field,
and the accesses are not supposed to be protected by any mutual exclusion primitives.
Common examples of such "benign" data races are accesses to various flags fields,
statistics counters, and similar. A "benign" data race does not lead to memory corruption/crash
with a conservative compiler that compiles memory accesses to primitive types
effectively as atomic.

A non-benign (or "harmful" data race) can lead to corruption/crash even with
a conservative compiler that compiles memory accesses to primitive types
effectively as atomic. A common example of a "harmful" data races is race on
a complex container (list/hashmap/etc), where accesses are supposed to be protected
by a mutual exclusion primitive.

In the final reply explain why you think the given data race is benign or is harmful.

Use the provided tools to confirm any assumptions, variables/fields being accessed, etc.
In particular, don't make assumptions about the kernel source code,
use codesearch tools to read the actual source code.


Use set-results tool to provide results of the analysis.
It must be called exactly once before the final reply.
Ignore results of this tool.

Prompt:

The data race report is:

EXT4-fs (loop4): unmounting filesystem 00000000-0000-0000-0000-000000000000.
==================================================================
BUG: KCSAN: data-race in __lru_add_drain_all / folio_add_lru

read-write to 0xffff888237d26468 of 1 bytes by task 6441 on cpu 1:
 folio_batch_add include/linux/pagevec.h:77 [inline]
 __folio_batch_add_and_move mm/swap.c:194 [inline]
 folio_add_lru+0xa5/0x1f0 mm/swap.c:511
 filemap_add_folio+0x26d/0x360 mm/filemap.c:982
 __filemap_get_folio_mpol+0x326/0x650 mm/filemap.c:2016
 __filemap_get_folio include/linux/pagemap.h:763 [inline]
 grow_dev_folio fs/buffer.c:1050 [inline]
 grow_buffers fs/buffer.c:1116 [inline]
 __getblk_slow fs/buffer.c:1134 [inline]
 bdev_getblk+0x174/0x3f0 fs/buffer.c:1461
 sb_getblk_gfp include/linux/buffer_head.h:392 [inline]
 __ext4_sb_bread_gfp+0x44/0x170 fs/ext4/super.c:236
 ext4_sb_bread_unmovable fs/ext4/super.c:265 [inline]
 ext4_load_super fs/ext4/super.c:5166 [inline]
 __ext4_fill_super fs/ext4/super.c:5312 [inline]
 ext4_fill_super+0x14ed/0x37a0 fs/ext4/super.c:5777
 get_tree_bdev_flags+0x291/0x300 fs/super.c:1691
 get_tree_bdev+0x1f/0x30 fs/super.c:1714
 ext4_get_tree+0x1c/0x30 fs/ext4/super.c:5809
 vfs_get_tree+0x57/0x1d0 fs/super.c:1751
 fc_mount fs/namespace.c:1199 [inline]
 do_new_mount_fc fs/namespace.c:3636 [inline]
 do_new_mount+0x24d/0x6a0 fs/namespace.c:3712
 path_mount+0x4ab/0xb80 fs/namespace.c:4022
 do_mount fs/namespace.c:4035 [inline]
 __do_sys_mount fs/namespace.c:4224 [inline]
 __se_sys_mount+0x28c/0x2e0 fs/namespace.c:4201
 __x64_sys_mount+0x67/0x80 fs/namespace.c:4201
 x64_sys_call+0x2cca/0x3000 arch/x86/include/generated/asm/syscalls_64.h:166
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xca/0x2b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffff888237d26468 of 1 bytes by task 3322 on cpu 0:
 folio_batch_count include/linux/pagevec.h:56 [inline]
 cpu_needs_drain mm/swap.c:780 [inline]
 __lru_add_drain_all+0x17e/0x450 mm/swap.c:877
 lru_add_drain_all+0x10/0x20 mm/swap.c:893
 invalidate_bdev+0x47/0x70 block/bdev.c:101
 ext4_put_super+0x624/0x7d0 fs/ext4/super.c:1348
 generic_shutdown_super+0xe6/0x210 fs/super.c:643
 kill_block_super+0x2a/0x70 fs/super.c:1722
 ext4_kill_sb+0x42/0x80 fs/ext4/super.c:7444
 deactivate_locked_super+0x75/0x1c0 fs/super.c:474
 deactivate_super+0x97/0xa0 fs/super.c:507
 cleanup_mnt+0x2a9/0x320 fs/namespace.c:1318
 __cleanup_mnt+0x19/0x20 fs/namespace.c:1325
 task_work_run+0x131/0x1a0 kernel/task_work.c:233
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 __exit_to_user_mode_loop kernel/entry/common.c:44 [inline]
 exit_to_user_mode_loop+0x1fe/0x740 kernel/entry/common.c:75
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
 syscall_exit_to_user_mode_work include/linux/entry-common.h:159 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:194 [inline]
 do_syscall_64+0x1dd/0x2b0 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

value changed: 0x00 -> 0x01

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 UID: 0 PID: 3322 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(voluntary) 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
==================================================================

5/2

2026/01/15 23:52

llm

expert

Model:

unknown

Thoughts:

**A Data Race Investigation**

Okay, so the report indicates a classic data race – specifically between `folio_add_lru`, which goes through `folio_batch_add`, and `__lru_add_drain_all` that ultimately calls `cpu_needs_drain`. The address in question is `0xffff888237d26468`, which stores a single byte. It looks like the value went from `0x00` to `0x01`, which suggests it's a counter, most likely representing the number of folios in a `folio_batch`.

Let's dive into the stack traces. On one side (task 6441), which I'm inferring is the writer, we have `folio_batch_add` at `include/linux/pagevec.h:77`, then `__folio_batch_add_and_move` in `mm/swap.c:194`, and finally `folio_add_lru` at `mm/swap.c:511`. That's the write path.

On the other side (task 3322), presumably the reader, we see `folio_batch_count` at `include/linux/pagevec.h:56`, followed by `cpu_needs_drain` at `mm/swap.c:780`, and ultimately `__lru_add_drain_all` at `mm/swap.c:877`. This is the read path. It's clear that the two functions are accessing and possibly modifying the same variable simultaneously. Now it's time to examine the relevant source code. Let's see how these functions interact.

6/2

2026/01/15 23:52

tool

codesearch-definition-source

Model:

unknown

Args:

map[IncludeLines:true Name:folio_batch_add SourceFile:include/linux/pagevec.h]

Results:

map[Missing:false SourceCode:  63:	/**
  64:	 * folio_batch_add() - Add a folio to a batch.
  65:	 * @fbatch: The folio batch.
  66:	 * @folio: The folio to add.
  67:	 *
  68:	 * The folio is added to the end of the batch.
  69:	 * The batch must have previously been initialised using folio_batch_init().
  70:	 *
  71:	 * Return: The number of slots still available.
  72:	 */
  73:	static inline unsigned folio_batch_add(struct folio_batch *fbatch,
  74:			struct folio *folio)
  75:	{
  76:		fbatch->folios[fbatch->nr++] = folio;
  77:		return folio_batch_space(fbatch);
  78:	}
 SourceFile:include/linux/pagevec.h]

7/2

2026/01/15 23:52

llm

expert

Model:

unknown

Thoughts:

Okay, let's break this down.

**Decoding This Report: A Deep Dive into Pagevec**

Alright, the report highlights a read-write operation in `folio_batch_add`. Specifically, the report flags line 77 in `include/linux/pagevec.h`. But wait, line 76 is where it's happening: `fbatch->folios[fbatch->nr++] = folio;`. This is crucial. `fbatch->nr++` is the heart of the problem. It's a read-modify-write operation, fetching the current value of `fbatch->nr`, incrementing it, and then storing the new value back. The report pinpoints the read-write to one byte, which aligns perfectly with a simple increment of an integer. Ah, now line 77, `return folio_batch_space(fbatch);`, makes sense; the actual use of the incremented value is on the next line! It's calling `folio_batch_space`, which is likely why the profiler flags it as being called in the reported line number. So, now, let's shift our focus to `folio_batch_space`. What does that function do with `fbatch`? We need to understand how the increment and this space check interact to determine the performance impact.

8/2

2026/01/15 23:52

tool

codesearch-definition-source

Model:

unknown

Args:

map[IncludeLines:true Name:folio_batch_space SourceFile:include/linux/pagevec.h]

Results:

map[Missing:false SourceCode:  58:	static inline unsigned int folio_batch_space(const struct folio_batch *fbatch)
  59:	{
  60:		return PAGEVEC_SIZE - fbatch->nr;
  61:	}
 SourceFile:include/linux/pagevec.h]

9/2

2026/01/15 23:52

llm

expert

Model:

unknown

Thoughts:

**Thinking Through a Potential Concurrency Issue**

Okay, so I'm looking at this code and a bit concerned. Specifically, it's line 77 within `folio_batch_add` that's got my attention. It seems that line calls `folio_batch_space(fbatch)`, which, as I recall, accesses `fbatch->nr`. Now, if the increment, `fbatch->nr++`, was done right before on line 76, then line 77 would also be accessing `fbatch->nr` almost immediately. The report I'm looking at clearly indicates a "read-write" scenario, pinpointing the read on line 77 of `folio_batch_add` and involving task 6441. But hang on, the increment *is* the write, isn't it?

Right, so the write is `fbatch->nr++`... now, let's look at the reader stack. I need to understand what's *reading* this value at the same time and what the context is. It looks like `folio_batch_count` is being called, specifically at `include/linux/pagevec.h:56`. Okay, that's interesting. And then `cpu_needs_drain` at `mm/swap.c:780`, and `__lru_add_drain_all` at `mm/swap.c:877`. Seems like memory management-related operations.

Alright, let's take a closer look at `folio_batch_count`.

10/2

2026/01/15 23:52

tool

codesearch-definition-source

Model:

unknown

Args:

map[IncludeLines:true Name:folio_batch_count SourceFile:include/linux/pagevec.h]

Results:

map[Missing:false SourceCode:  53:	static inline unsigned int folio_batch_count(const struct folio_batch *fbatch)
  54:	{
  55:		return fbatch->nr;
  56:	}
 SourceFile:include/linux/pagevec.h]

11/2

2026/01/15 23:52

llm

expert

Model:

unknown

Thoughts:

Here's a breakdown of my thought process:

**Racing on `fbatch->nr`**

Alright, this looks like a classic race condition scenario. I'm focusing on the interaction between a writer, `folio_batch_add`, and a reader, `cpu_needs_drain`, specifically related to the `fbatch->nr` variable. This screams potential data corruption or missed signals if not handled correctly. The use of the term *race* already sets my mind racing for lock protection or atomic operations. 

My next step is to deep dive into where `cpu_needs_drain` resides, as I can't immediately deduce the potential issue. I know it's a memory management file, `mm/swap.c`, so I'll go there next. I need to understand what's *draining*, why it's happening, and how `fbatch->nr` is involved.

12/2

2026/01/15 23:52

tool

codesearch-definition-source

Model:

unknown

Args:

map[IncludeLines:true Name:cpu_needs_drain SourceFile:mm/swap.c]

Results:

map[Missing:false SourceCode: 774:	static bool cpu_needs_drain(unsigned int cpu)
 775:	{
 776:		struct cpu_fbatches *fbatches = &per_cpu(cpu_fbatches, cpu);
 777:	
 778:		/* Check these in order of likelihood that they're not zero */
 779:		return folio_batch_count(&fbatches->lru_add) ||
 780:			folio_batch_count(&fbatches->lru_move_tail) ||
 781:			folio_batch_count(&fbatches->lru_deactivate_file) ||
 782:			folio_batch_count(&fbatches->lru_deactivate) ||
 783:			folio_batch_count(&fbatches->lru_lazyfree) ||
 784:			folio_batch_count(&fbatches->lru_activate) ||
 785:			need_mlock_drain(cpu) ||
 786:			has_bh_in_lru(cpu, NULL);
 787:	}
 SourceFile:mm/swap.c]

13/2

2026/01/15 23:52

llm

expert

Model:

unknown

Thoughts:

Okay, here's my interpretation of that thought process, assuming I'm an expert in this field:

**Diving into the LRU Cache Implementation**

Alright, so I'm thinking about this `__lru_add_drain_all` function... It's triggered within the broader Least Recently Used cache implementation, that much is clear. The key here is the "drain all" aspect. This strongly suggests we're dealing with a situation where we need to evict *all* entries from the cache, likely due to a specific event or condition. It could be for memory pressure, a cache invalidation signal, a checkpoint, or something along those lines. The fact that `__lru_add_drain_all` is the entry point suggests it's designed to do more than just drain a few entries selectively. This isn't just a simple eviction, this is a purge. I wonder what the calling context is. What event triggered this operation, and what effect is this purge likely to have on the other parts of the system that rely on this cache? This is where things get interesting, digging deep into the reason, not just the action.

14/2

2026/01/15 23:52

tool

codesearch-definition-source

Model:

unknown

Args:

map[IncludeLines:true Name:__lru_add_drain_all SourceFile:mm/swap.c]

Results:

map[Missing:false SourceCode: 789:	/*
 790:	 * Doesn't need any cpu hotplug locking because we do rely on per-cpu
 791:	 * kworkers being shut down before our page_alloc_cpu_dead callback is
 792:	 * executed on the offlined cpu.
 793:	 * Calling this function with cpu hotplug locks held can actually lead
 794:	 * to obscure indirect dependencies via WQ context.
 795:	 */
 796:	static inline void __lru_add_drain_all(bool force_all_cpus)
 797:	{
 798:		/*
 799:		 * lru_drain_gen - Global pages generation number
 800:		 *
 801:		 * (A) Definition: global lru_drain_gen = x implies that all generations
 802:		 *     0 < n <= x are already *scheduled* for draining.
 803:		 *
 804:		 * This is an optimization for the highly-contended use case where a
 805:		 * user space workload keeps constantly generating a flow of pages for
 806:		 * each CPU.
 807:		 */
 808:		static unsigned int lru_drain_gen;
 809:		static struct cpumask has_work;
 810:		static DEFINE_MUTEX(lock);
 811:		unsigned cpu, this_gen;
 812:	
 813:		/*
 814:		 * Make sure nobody triggers this path before mm_percpu_wq is fully
 815:		 * initialized.
 816:		 */
 817:		if (WARN_ON(!mm_percpu_wq))
 818:			return;
 819:	
 820:		/*
 821:		 * Guarantee folio_batch counter stores visible by this CPU
 822:		 * are visible to other CPUs before loading the current drain
 823:		 * generation.
 824:		 */
 825:		smp_mb();
 826:	
 827:		/*
 828:		 * (B) Locally cache global LRU draining generation number
 829:		 *
 830:		 * The read barrier ensures that the counter is loaded before the mutex
 831:		 * is taken. It pairs with smp_mb() inside the mutex critical section
 832:		 * at (D).
 833:		 */
 834:		this_gen = smp_load_acquire(&lru_drain_gen);
 835:	
 836:		/* It helps everyone if we do our own local drain immediately. */
 837:		lru_add_drain();
 838:	
 839:		mutex_lock(&lock);
 840:	
 841:		/*
 842:		 * (C) Exit the draining operation if a newer generation, from another
 843:		 * lru_add_drain_all(), was already scheduled for draining. Check (A).
 844:		 */
 845:		if (unlikely(this_gen != lru_drain_gen && !force_all_cpus))
 846:			goto done;
 847:	
 848:		/*
 849:		 * (D) Increment global generation number
 850:		 *
 851:		 * Pairs with smp_load_acquire() at (B), outside of the critical
 852:		 * section. Use a full memory barrier to guarantee that the
 853:		 * new global drain generation number is stored before loading
 854:		 * folio_batch counters.
 855:		 *
 856:		 * This pairing must be done here, before the for_each_online_cpu loop
 857:		 * below which drains the page vectors.
 858:		 *
 859:		 * Let x, y, and z represent some system CPU numbers, where x < y < z.
 860:		 * Assume CPU #z is in the middle of the for_each_online_cpu loop
 861:		 * below and has already reached CPU #y's per-cpu data. CPU #x comes
 862:		 * along, adds some pages to its per-cpu vectors, then calls
 863:		 * lru_add_drain_all().
 864:		 *
 865:		 * If the paired barrier is done at any later step, e.g. after the
 866:		 * loop, CPU #x will just exit at (C) and miss flushing out all of its
 867:		 * added pages.
 868:		 */
 869:		WRITE_ONCE(lru_drain_gen, lru_drain_gen + 1);
 870:		smp_mb();
 871:	
 872:		cpumask_clear(&has_work);
 873:		for_each_online_cpu(cpu) {
 874:			struct work_struct *work = &per_cpu(lru_add_drain_work, cpu);
 875:	
 876:			if (cpu_needs_drain(cpu)) {
 877:				INIT_WORK(work, lru_add_drain_per_cpu);
 878:				queue_work_on(cpu, mm_percpu_wq, work);
 879:				__cpumask_set_cpu(cpu, &has_work);
 880:			}
 881:		}
 882:	
 883:		for_each_cpu(cpu, &has_work)
 884:			flush_work(&per_cpu(lru_add_drain_work, cpu));
 885:	
 886:	done:
 887:		mutex_unlock(&lock);
 888:	}
 SourceFile:mm/swap.c]

15/2

2026/01/15 23:52

llm

expert

Model:

unknown

Thoughts:

**Thinking Through a Potential Data Race in LRU Draining**

Okay, so I'm looking at this code in `__lru_add_drain_all`, and it's using `cpu_needs_drain(cpu)` to decide whether to queue a work item to drain the LRU on a particular CPU. `cpu_needs_drain` is just reading `fbatch->nr`, which is the folio batch count. And `folio_batch_add` is doing a simple increment (`fbatch->nr++`) which is either a plain write, or a read-modify-write. Since `fbatch->nr` is just a counter, a slightly out-of-date value is probably not a huge deal, right? If `cpu_needs_drain` reads 0 when it should be 1, it just means that we might miss draining this CPU this time.

But wait, there are memory barriers in `__lru_add_drain_all`! Lines 825 and 870 have `smp_mb()`, and then `cpu_needs_drain` is called at line 876. So, if `cpu_needs_drain` returns false because it saw a stale 0, is that a problem?

Typically, `lru_add_drain_all` is called when we *need* pages on the LRU lists, like when we're migrating or invalidating. And now I remember, `invalidate_bdev` calls it. It wants all pages drained from per-CPU batches to the LRU lists so they can be invalidated.

Ah, but now I recall, this is a plain read/write, so KCSAN is right to flag it. Is it benign, though? If `cpu_needs_drain` sees 0, it doesn't queue work. If `folio_batch_add` just happened, then `fbatch->nr` has just become 1. If `cpu_needs_drain` misses this, it will be drained later, or the caller might fail to see the page on the LRU. Specifically, in the `invalidate_bdev` case, if we miss a page in the batch, that page won't be invalidated. So `invalidate_bdev` calls `lru_add_drain_all` to ensure all pages are on LRU so they can be found and invalidated. But wait... if a page is in a folio batch, it *isn't* on the LRU yet! `lru_add_drain_all` is what moves them.

Okay, so if `invalidate_bdev` misses a page because of a stale read, it might not invalidate it. That could be bad. But then, I remember `invalidate_bdev` also calls `truncate_inode_pages`. Wait, let me check the source code for `invalidate_bdev`.

Okay, here's `invalidate_bdev`:

```c
void invalidate_bdev(struct block_device *bdev)
{
	struct address_space *mapping = bdev->bd_inode->i_mapping;

	if (mapping->nrpages) {
		lru_add_drain_all();	/* For __invalidate_mapping_pages() */
		invalidate_bh_lrus();
		truncate_inode_pages(mapping, 0);
	}
}
```

Okay, so the `lru_add_drain_all` call is specifically for `__invalidate_mapping_pages()`. And `truncate_inode_pages` is called *after* `lru_add_drain_all`. Got it! So, I need to look into whether a missed page invalidation in `__invalidate_mapping_pages` leads to a failure in the truncate.

16/2

2026/01/15 23:53

tool

codesearch-definition-source

Model:

unknown

Args:

map[IncludeLines:true Name:invalidate_bdev SourceFile:block/bdev.c]

Results:

map[Missing:false SourceCode: 104:	EXPORT_SYMBOL(invalidate_bdev);
 SourceFile:block/bdev.c]

17/2

2026/01/15 23:53

llm

expert

Model:

unknown

Thoughts:

Okay, here we go. Here's how I'm thinking about this, from my expert perspective:

**Debugging Export Behavior**

Right, so I see it only "exported" whatever it was supposed to. My initial assumption was that this process completed, but the phrasing has me thinking it was incomplete. I need more information. Let me see... I need to clarify what that means, *precisely*. "Exported" in this context could mean any number of things, depending on the tool and the format. I need to get to the root definition of what this 'export' action is supposed to have achieved, ideally by looking at documentation, log files, or even the underlying code if necessary. The meaning of "exported" is going to be key to diagnosing whatever the problem is. I need to be exact here – ambiguity is my enemy. Let me start with a quick search for the tool's definition of "export" in its documentation.

18/2

2026/01/15 23:53

tool

codesearch-file-index

Model:

unknown

Args:

map[SourceFile:block/bdev.c]

Results:

map[Entities:[map[Kind:function Name:BDEV_I] map[Kind:function Name:BD_INODE] map[Kind:function Name:I_BDEV] map[Kind:function Name:I_BDEV] map[Kind:function Name:bd_abort_claiming] map[Kind:function Name:bd_abort_claiming] map[Kind:function Name:bd_clear_claiming] map[Kind:function Name:bd_end_claim] map[Kind:function Name:bd_finish_claiming] map[Kind:function Name:bd_init_fs_context] map[Kind:function Name:bd_may_claim] map[Kind:function Name:bd_prepare_to_claim] map[Kind:function Name:bd_prepare_to_claim] map[Kind:function Name:bd_yield_claim] map[Kind:function Name:bdev_add] map[Kind:function Name:bdev_alloc] map[Kind:function Name:bdev_alloc_inode] map[Kind:function Name:bdev_block_writes] map[Kind:function Name:bdev_cache_init] map[Kind:function Name:bdev_claim_write_access] map[Kind:function Name:bdev_drop] map[Kind:function Name:bdev_evict_inode] map[Kind:function Name:bdev_file_open_by_dev] map[Kind:function Name:bdev_file_open_by_dev] map[Kind:function Name:bdev_file_open_by_path] map[Kind:function Name:bdev_file_open_by_path] map[Kind:function Name:bdev_fput] map[Kind:function Name:bdev_fput] map[Kind:function Name:bdev_free_inode] map[Kind:function Name:bdev_freeze] map[Kind:function Name:bdev_freeze] map[Kind:function Name:bdev_mark_dead] map[Kind:function Name:bdev_mark_dead] map[Kind:function Name:bdev_may_open] map[Kind:function Name:bdev_open] map[Kind:function Name:bdev_permission] map[Kind:function Name:bdev_release] map[Kind:function Name:bdev_set_nr_sectors] map[Kind:function Name:bdev_statx] map[Kind:function Name:bdev_thaw] map[Kind:function Name:bdev_thaw] map[Kind:function Name:bdev_unblock_writes] map[Kind:function Name:bdev_unclaimed] map[Kind:function Name:bdev_unhash] map[Kind:function Name:bdev_validate_blocksize] map[Kind:function Name:bdev_validate_blocksize] map[Kind:function Name:bdev_write_inode] map[Kind:function Name:bdev_writes_blocked] map[Kind:function Name:bdev_yield_write_access] map[Kind:function Name:blk_to_file_flags] map[Kind:function Name:blkdev_flush_mapping] map[Kind:function Name:blkdev_get_no_open] map[Kind:function Name:blkdev_get_part] map[Kind:function Name:blkdev_get_whole] map[Kind:function Name:blkdev_put_no_open] map[Kind:function Name:blkdev_put_part] map[Kind:function Name:blkdev_put_whole] map[Kind:function Name:block_size] map[Kind:function Name:block_size] map[Kind:function Name:disk_live] map[Kind:function Name:disk_live] map[Kind:function Name:file_bdev] map[Kind:function Name:file_bdev] map[Kind:function Name:init_once] map[Kind:function Name:invalidate_bdev] map[Kind:function Name:invalidate_bdev] map[Kind:function Name:kill_bdev] map[Kind:function Name:lookup_bdev] map[Kind:function Name:lookup_bdev] map[Kind:function Name:nr_blockdev_pages] map[Kind:function Name:sb_min_blocksize] map[Kind:function Name:sb_min_blocksize] map[Kind:function Name:sb_set_blocksize] map[Kind:function Name:sb_set_blocksize] map[Kind:function Name:sb_validate_large_blocksize] map[Kind:function Name:set_blocksize] map[Kind:function Name:set_blocksize] map[Kind:function Name:set_init_blocksize] map[Kind:function Name:setup_bdev_allow_write_mounted] map[Kind:function Name:sync_bdevs] map[Kind:function Name:sync_blockdev] map[Kind:function Name:sync_blockdev] map[Kind:function Name:sync_blockdev_nowait] map[Kind:function Name:sync_blockdev_nowait] map[Kind:function Name:sync_blockdev_range] map[Kind:function Name:sync_blockdev_range] map[Kind:function Name:truncate_bdev_range]] Missing:false]

19/2

2026/01/15 23:53

llm

expert

Model:

unknown