Skip to content

Commit 0bad47c

Browse files
chuckleverJ. Bruce Fields
authored and
J. Bruce Fields
committed
svcrdma: Preserve CB send buffer across retransmits
During each NFSv4 callback Call, an RDMA Send completion frees the page that contains the RPC Call message. If the upper layer determines that a retransmit is necessary, this is too soon. One possible symptom: after a GARBAGE_ARGS response an NFSv4.1 callback request, the following BUG fires on the NFS server: kernel: BUG: Bad page state in process kworker/0:2H pfn:7d3ce2 kernel: page:ffffea001f4f3880 count:-2 mapcount:0 mapping: (null) index:0x0 kernel: flags: 0x2fffff80000000() kernel: raw: 002fffff80000000 0000000000000000 0000000000000000 fffffffeffffffff kernel: raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000 kernel: page dumped because: nonzero _refcount kernel: Modules linked in: cts rpcsec_gss_krb5 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue rpcrdm a ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pc lmul crc32_pclmul ghash_clmulni_intel pcbc iTCO_wdt iTCO_vendor_support aesni_intel crypto_simd glue_helper cryptd pcspkr lpc_ich i2c_i801 mei_me mf d_core mei raid0 sg wmi ioatdma ipmi_si ipmi_devintf ipmi_msghandler shpchp acpi_power_meter acpi_pad nfsd nfs_acl lockd auth_rpcgss grace sunrpc ip_tables xfs libcrc32c mlx4_en mlx4_ib mlx5_ib ib_core sd_mod sr_mod cdrom ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci crc32c_intel libahci drm mlx5_core igb libata mlx4_core dca i2c_algo_bit i2c_core nvme kernel: ptp nvme_core pps_core dm_mirror dm_region_hash dm_log dm_mod dax kernel: CPU: 0 PID: 11495 Comm: kworker/0:2H Not tainted 4.14.0-rc3-00001-g577ce48 #811 kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015 kernel: Workqueue: ib-comp-wq ib_cq_poll_work [ib_core] kernel: Call Trace: kernel: dump_stack+0x62/0x80 kernel: bad_page+0xfe/0x11a kernel: free_pages_check_bad+0x76/0x78 kernel: free_pcppages_bulk+0x364/0x441 kernel: ? ttwu_do_activate.isra.61+0x71/0x78 kernel: free_hot_cold_page+0x1c5/0x202 kernel: __put_page+0x2c/0x36 kernel: svc_rdma_put_context+0xd9/0xe4 [rpcrdma] kernel: svc_rdma_wc_send+0x50/0x98 [rpcrdma] This issue exists all the way back to v4.5, but refactoring and code re-organization prevents this simple patch from applying to kernels older than v4.12. The fix is the same, however, if someone needs to backport it. Reported-by: Ben Coddington <[email protected]> BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=314 Fixes: 5d252f9 ('svcrdma: Add class for RDMA backwards ... ') Cc: [email protected] # v4.12 Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
1 parent 256a89f commit 0bad47c

File tree

1 file changed

+5
-1
lines changed

1 file changed

+5
-1
lines changed

net/sunrpc/xprtrdma/svc_rdma_backchannel.c

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -132,6 +132,10 @@ static int svc_rdma_bc_sendto(struct svcxprt_rdma *rdma,
132132
if (ret)
133133
goto out_err;
134134

135+
/* Bump page refcnt so Send completion doesn't release
136+
* the rq_buffer before all retransmits are complete.
137+
*/
138+
get_page(virt_to_page(rqst->rq_buffer));
135139
ret = svc_rdma_post_send_wr(rdma, ctxt, 1, 0);
136140
if (ret)
137141
goto out_unmap;
@@ -164,7 +168,6 @@ xprt_rdma_bc_allocate(struct rpc_task *task)
164168
return -EINVAL;
165169
}
166170

167-
/* svc_rdma_sendto releases this page */
168171
page = alloc_page(RPCRDMA_DEF_GFP);
169172
if (!page)
170173
return -ENOMEM;
@@ -183,6 +186,7 @@ xprt_rdma_bc_free(struct rpc_task *task)
183186
{
184187
struct rpc_rqst *rqst = task->tk_rqstp;
185188

189+
put_page(virt_to_page(rqst->rq_buffer));
186190
kfree(rqst->rq_rbuffer);
187191
}
188192

0 commit comments

Comments
 (0)