commit 4fccc2503536a564a4ba31a1d50439854201659f Author: Greg Kroah-Hartman Date: Wed Feb 19 19:51:59 2020 +0100 Linux 4.19.105 commit e39cc4b09437479331e25d9f934ae5c8bdc65afc Author: Sean Christopherson Date: Fri Feb 7 09:37:42 2020 -0800 KVM: x86/mmu: Fix struct guest_walker arrays for 5-level paging [ Upstream commit f6ab0107a4942dbf9a5cf0cca3f37e184870a360 ] Define PT_MAX_FULL_LEVELS as PT64_ROOT_MAX_LEVEL, i.e. 5, to fix shadow paging for 5-level guest page tables. PT_MAX_FULL_LEVELS is used to size the arrays that track guest pages table information, i.e. using a "max levels" of 4 causes KVM to access garbage beyond the end of an array when querying state for level 5 entries. E.g. FNAME(gpte_changed) will read garbage and most likely return %true for a level 5 entry, soft-hanging the guest because FNAME(fetch) will restart the guest instead of creating SPTEs because it thinks the guest PTE has changed. Note, KVM doesn't yet support 5-level nested EPT, so PT_MAX_FULL_LEVELS gets to stay "4" for the PTTYPE_EPT case. Fixes: 855feb673640 ("KVM: MMU: Add 5 level EPT & Shadow page table support.") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 2a3cf3553ead8bab53f4f9bc6484e81a5ae97500 Author: zhangyi (F) Date: Tue Feb 18 18:59:30 2020 +0800 jbd2: do not clear the BH_Mapped flag when forgetting a metadata buffer [ Upstream commit c96dceeabf765d0b1b1f29c3bf50a5c01315b820 ] Commit 904cdbd41d74 ("jbd2: clear dirty flag when revoking a buffer from an older transaction") set the BH_Freed flag when forgetting a metadata buffer which belongs to the committing transaction, it indicate the committing process clear dirty bits when it is done with the buffer. But it also clear the BH_Mapped flag at the same time, which may trigger below NULL pointer oops when block_size < PAGE_SIZE. rmdir 1 kjournald2 mkdir 2 jbd2_journal_commit_transaction commit transaction N jbd2_journal_forget set_buffer_freed(bh1) jbd2_journal_commit_transaction commit transaction N+1 ... clear_buffer_mapped(bh1) ext4_getblk(bh2 ummapped) ... grow_dev_page init_page_buffers bh1->b_private=NULL bh2->b_private=NULL jbd2_journal_put_journal_head(jh1) __journal_remove_journal_head(hb1) jh1 is NULL and trigger oops *) Dir entry block bh1 and bh2 belongs to one page, and the bh2 has already been unmapped. For the metadata buffer we forgetting, we should always keep the mapped flag and clear the dirty flags is enough, so this patch pick out the these buffers and keep their BH_Mapped flag. Link: https://lore.kernel.org/r/20200213063821.30455-3-yi.zhang@huawei.com Fixes: 904cdbd41d74 ("jbd2: clear dirty flag when revoking a buffer from an older transaction") Reviewed-by: Jan Kara Signed-off-by: zhangyi (F) Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Sasha Levin commit 056c7c22fcda495f38dc7fa08451e65dc9ec461d Author: zhangyi (F) Date: Tue Feb 18 18:59:29 2020 +0800 jbd2: move the clearing of b_modified flag to the journal_unmap_buffer() [ Upstream commit 6a66a7ded12baa6ebbb2e3e82f8cb91382814839 ] There is no need to delay the clearing of b_modified flag to the transaction committing time when unmapping the journalled buffer, so just move it to the journal_unmap_buffer(). Link: https://lore.kernel.org/r/20200213063821.30455-2-yi.zhang@huawei.com Reviewed-by: Jan Kara Signed-off-by: zhangyi (F) Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Sasha Levin commit 32865d65c4d232033f73e75bd9a003233df2b066 Author: Olga Kornievskaia Date: Wed Feb 12 17:32:12 2020 -0500 NFSv4.1 make cachethis=no for writes commit cd1b659d8ce7697ee9799b64f887528315b9097b upstream. Turning caching off for writes on the server should improve performance. Fixes: fba83f34119a ("NFS: Pass "privileged" value to nfs4_init_sequence()") Signed-off-by: Olga Kornievskaia Reviewed-by: Trond Myklebust Signed-off-by: Anna Schumaker Signed-off-by: Greg Kroah-Hartman commit aa90c2cbbefe5241307c823e1eec25c1d85668bc Author: Mike Jones Date: Tue Jan 28 10:59:59 2020 -0700 hwmon: (pmbus/ltc2978) Fix PMBus polling of MFR_COMMON definitions. commit cf2b012c90e74e85d8aea7d67e48868069cfee0c upstream. Change 21537dc driver PMBus polling of MFR_COMMON from bits 5/4 to bits 6/5. This fixs a LTC297X family bug where polling always returns not busy even when the part is busy. This fixes a LTC388X and LTM467X bug where polling used PEND and NOT_IN_TRANS, and BUSY was not polled, which can lead to NACKing of commands. LTC388X and LTM467X modules now poll BUSY and PEND, increasing reliability by eliminating NACKing of commands. Signed-off-by: Mike Jones Link: https://lore.kernel.org/r/1580234400-2829-2-git-send-email-michael-a1.jones@analog.com Fixes: e04d1ce9bbb49 ("hwmon: (ltc2978) Add polling for chips requiring it") Signed-off-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman commit 6f1e32c53e19a52d2a03c98cb4489af446469be9 Author: Kan Liang Date: Tue Jan 21 11:01:25 2020 -0800 perf/x86/intel: Fix inaccurate period in context switch for auto-reload commit f861854e1b435b27197417f6f90d87188003cb24 upstream. Perf doesn't take the left period into account when auto-reload is enabled with fixed period sampling mode in context switch. Here is the MSR trace of the perf command as below. (The MSR trace is simplified from a ftrace log.) #perf record -e cycles:p -c 2000000 -- ./triad_loop //The MSR trace of task schedule out //perf disable all counters, disable PEBS, disable GP counter 0, //read GP counter 0, and re-enable all counters. //The counter 0 stops at 0xfffffff82840 write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value 0 write_msr: MSR_IA32_PEBS_ENABLE(3f1), value 0 write_msr: MSR_P6_EVNTSEL0(186), value 40003003c rdpmc: 0, value fffffff82840 write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value f000000ff //The MSR trace of the same task schedule in again //perf disable all counters, enable and set GP counter 0, //enable PEBS, and re-enable all counters. //0xffffffe17b80 (-2000000) is written to GP counter 0. write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value 0 write_msr: MSR_IA32_PMC0(4c1), value ffffffe17b80 write_msr: MSR_P6_EVNTSEL0(186), value 40043003c write_msr: MSR_IA32_PEBS_ENABLE(3f1), value 1 write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value f000000ff When the same task schedule in again, the counter should starts from previous left. However, it starts from the fixed period -2000000 again. A special variant of intel_pmu_save_and_restart() is used for auto-reload, which doesn't update the hwc->period_left. When the monitored task schedules in again, perf doesn't know the left period. The fixed period is used, which is inaccurate. With auto-reload, the counter always has a negative counter value. So the left period is -value. Update the period_left in intel_pmu_save_and_restart_reload(). With the patch: //The MSR trace of task schedule out write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value 0 write_msr: MSR_IA32_PEBS_ENABLE(3f1), value 0 write_msr: MSR_P6_EVNTSEL0(186), value 40003003c rdpmc: 0, value ffffffe25cbc write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value f000000ff //The MSR trace of the same task schedule in again write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value 0 write_msr: MSR_IA32_PMC0(4c1), value ffffffe25cbc write_msr: MSR_P6_EVNTSEL0(186), value 40043003c write_msr: MSR_IA32_PEBS_ENABLE(3f1), value 1 write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value f000000ff Fixes: d31fc13fdcb2 ("perf/x86/intel: Fix event update for auto-reload") Signed-off-by: Kan Liang Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Link: https://lkml.kernel.org/r/20200121190125.3389-1-kan.liang@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit fce14b5b2f86b54d3d1a254888d5c3a7e5116071 Author: Nathan Chancellor Date: Sat Feb 8 07:08:59 2020 -0700 s390/time: Fix clk type in get_tod_clock commit 0f8a206df7c920150d2aa45574fba0ab7ff6be4f upstream. Clang warns: In file included from ../arch/s390/boot/startup.c:3: In file included from ../include/linux/elf.h:5: In file included from ../arch/s390/include/asm/elf.h:132: In file included from ../include/linux/compat.h:10: In file included from ../include/linux/time.h:74: In file included from ../include/linux/time32.h:13: In file included from ../include/linux/timex.h:65: ../arch/s390/include/asm/timex.h:160:20: warning: passing 'unsigned char [16]' to parameter of type 'char *' converts between pointers to integer types with different sign [-Wpointer-sign] get_tod_clock_ext(clk); ^~~ ../arch/s390/include/asm/timex.h:149:44: note: passing argument to parameter 'clk' here static inline void get_tod_clock_ext(char *clk) ^ Change clk's type to just be char so that it matches what happens in get_tod_clock_ext. Fixes: 57b28f66316d ("[S390] s390_hypfs: Add new attributes") Link: https://github.com/ClangBuiltLinux/linux/issues/861 Link: http://lkml.kernel.org/r/20200208140858.47970-1-natechancellor@gmail.com Reviewed-by: Nick Desaulniers Signed-off-by: Nathan Chancellor Signed-off-by: Vasily Gorbik Signed-off-by: Greg Kroah-Hartman commit 5595f492779d36c4b01e31478b19964dc9b2edff Author: Leon Romanovsky Date: Wed Feb 12 10:06:51 2020 +0200 RDMA/core: Fix protection fault in get_pkey_idx_qp_list commit 1dd017882e01d2fcd9c5dbbf1eb376211111c393 upstream. We don't need to set pkey as valid in case that user set only one of pkey index or port number, otherwise it will be resulted in NULL pointer dereference while accessing to uninitialized pkey list. The following crash from Syzkaller revealed it. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN PTI CPU: 1 PID: 14753 Comm: syz-executor.2 Not tainted 5.5.0-rc5 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 RIP: 0010:get_pkey_idx_qp_list+0x161/0x2d0 Code: 01 00 00 49 8b 5e 20 4c 39 e3 0f 84 b9 00 00 00 e8 e4 42 6e fe 48 8d 7b 10 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 01 0f 8e d0 00 00 00 48 8d 7d 04 48 b8 RSP: 0018:ffffc9000bc6f950 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff82c8bdec RDX: 0000000000000002 RSI: ffffc900030a8000 RDI: 0000000000000010 RBP: ffff888112c8ce80 R08: 0000000000000004 R09: fffff5200178df1f R10: 0000000000000001 R11: fffff5200178df1f R12: ffff888115dc4430 R13: ffff888115da8498 R14: ffff888115dc4410 R15: ffff888115da8000 FS: 00007f20777de700(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2f721000 CR3: 00000001173ca002 CR4: 0000000000360ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: port_pkey_list_insert+0xd7/0x7c0 ib_security_modify_qp+0x6fa/0xfc0 _ib_modify_qp+0x8c4/0xbf0 modify_qp+0x10da/0x16d0 ib_uverbs_modify_qp+0x9a/0x100 ib_uverbs_write+0xaa5/0xdf0 __vfs_write+0x7c/0x100 vfs_write+0x168/0x4a0 ksys_write+0xc8/0x200 do_syscall_64+0x9c/0x390 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: d291f1a65232 ("IB/core: Enforce PKey security on QPs") Link: https://lore.kernel.org/r/20200212080651.GB679970@unreal Signed-off-by: Maor Gottlieb Signed-off-by: Leon Romanovsky Message-Id: <20200212080651.GB679970@unreal> Signed-off-by: Greg Kroah-Hartman commit 5fb35764d69601c649f8a935ea680acdddec3679 Author: Zhu Yanjun Date: Wed Feb 12 09:26:33 2020 +0200 RDMA/rxe: Fix soft lockup problem due to using tasklets in softirq commit 8ac0e6641c7ca14833a2a8c6f13d8e0a435e535c upstream. When run stress tests with RXE, the following Call Traces often occur watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [swapper/2:0] ... Call Trace: create_object+0x3f/0x3b0 kmem_cache_alloc_node_trace+0x129/0x2d0 __kmalloc_reserve.isra.52+0x2e/0x80 __alloc_skb+0x83/0x270 rxe_init_packet+0x99/0x150 [rdma_rxe] rxe_requester+0x34e/0x11a0 [rdma_rxe] rxe_do_task+0x85/0xf0 [rdma_rxe] tasklet_action_common.isra.21+0xeb/0x100 __do_softirq+0xd0/0x298 irq_exit+0xc5/0xd0 smp_apic_timer_interrupt+0x68/0x120 apic_timer_interrupt+0xf/0x20 ... The root cause is that tasklet is actually a softirq. In a tasklet handler, another softirq handler is triggered. Usually these softirq handlers run on the same cpu core. So this will cause "soft lockup Bug". Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200212072635.682689-8-leon@kernel.org Signed-off-by: Zhu Yanjun Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit b817c10bff218b46f01bd4b3f07c423d9c2034a0 Author: Kamal Heib Date: Wed Feb 5 13:05:30 2020 +0200 RDMA/hfi1: Fix memory leak in _dev_comp_vect_mappings_create commit 8a4f300b978edbbaa73ef9eca660e45eb9f13873 upstream. Make sure to free the allocated cpumask_var_t's to avoid the following reported memory leak by kmemleak: $ cat /sys/kernel/debug/kmemleak unreferenced object 0xffff8897f812d6a8 (size 8): comm "kworker/1:1", pid 347, jiffies 4294751400 (age 101.703s) hex dump (first 8 bytes): 00 00 00 00 00 00 00 00 ........ backtrace: [<00000000bff49664>] alloc_cpumask_var_node+0x4c/0xb0 [<0000000075d3ca81>] hfi1_comp_vectors_set_up+0x20f/0x800 [hfi1] [<0000000098d420df>] hfi1_init_dd+0x3311/0x4960 [hfi1] [<0000000071be7e52>] init_one+0x25e/0xf10 [hfi1] [<000000005483d4c2>] local_pci_probe+0xd4/0x180 [<000000007c3cbc6e>] work_for_cpu_fn+0x51/0xa0 [<000000001d626905>] process_one_work+0x8f0/0x17b0 [<000000007e569e7e>] worker_thread+0x536/0xb50 [<00000000fd39a4a5>] kthread+0x30c/0x3d0 [<0000000056f2edb3>] ret_from_fork+0x3a/0x50 Fixes: 5d18ee67d4c1 ("IB/{hfi1, rdmavt, qib}: Implement CQ completion vector support") Link: https://lore.kernel.org/r/20200205110530.12129-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib Reviewed-by: Dennis Dalessandro Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 11c74276df2b5998ee8a11b5be9e8e0b96aa8217 Author: Avihai Horon Date: Sun Jan 26 19:15:00 2020 +0200 RDMA/core: Fix invalid memory access in spec_filter_size commit a72f4ac1d778f7bde93dfee69bfc23377ec3d74f upstream. Add a check that the size specified in the flow spec header doesn't cause an overflow when calculating the filter size, and thus prevent access to invalid memory. The following crash from syzkaller revealed it. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN PTI CPU: 1 PID: 17834 Comm: syz-executor.3 Not tainted 5.5.0-rc5 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 RIP: 0010:memchr_inv+0xd3/0x330 Code: 89 f9 89 f5 83 e1 07 0f 85 f9 00 00 00 49 89 d5 49 c1 ed 03 45 85 ed 74 6f 48 89 d9 48 b8 00 00 00 00 00 fc ff df 48 c1 e9 03 <80> 3c 01 00 0f 85 0d 02 00 00 44 0f b6 e5 48 b8 01 01 01 01 01 01 RSP: 0018:ffffc9000a13fa50 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 7fff88810de9d820 RCX: 0ffff11021bd3b04 RDX: 000000000000fff8 RSI: 0000000000000000 RDI: 7fff88810de9d820 RBP: 0000000000000000 R08: ffff888110d69018 R09: 0000000000000009 R10: 0000000000000001 R11: ffffed10236267cc R12: 0000000000000004 R13: 0000000000001fff R14: ffff88810de9d820 R15: 0000000000000040 FS: 00007f9ee0e51700(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000115ea0006 CR4: 0000000000360ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: spec_filter_size.part.16+0x34/0x50 ib_uverbs_kern_spec_to_ib_spec_filter+0x691/0x770 ib_uverbs_ex_create_flow+0x9ea/0x1b40 ib_uverbs_write+0xaa5/0xdf0 __vfs_write+0x7c/0x100 vfs_write+0x168/0x4a0 ksys_write+0xc8/0x200 do_syscall_64+0x9c/0x390 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x465b49 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f9ee0e50c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000465b49 RDX: 00000000000003a0 RSI: 00000000200007c0 RDI: 0000000000000004 RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9ee0e516bc R13: 00000000004ca2da R14: 000000000070deb8 R15: 00000000ffffffff Modules linked in: Dumping ftrace buffer: (ftrace buffer empty) Fixes: 94e03f11ad1f ("IB/uverbs: Add support for flow tag") Link: https://lore.kernel.org/r/20200126171500.4623-1-leon@kernel.org Signed-off-by: Avihai Horon Reviewed-by: Maor Gottlieb Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 7697672ccbbe028881cbb4a5542c8187505b68c9 Author: Kaike Wan Date: Mon Feb 10 08:10:40 2020 -0500 IB/rdmavt: Reset all QPs when the device is shut down commit f92e48718889b3d49cee41853402aa88cac84a6b upstream. When the hfi1 device is shut down during a system reboot, it is possible that some QPs might have not not freed by ULPs. More requests could be post sent and a lingering timer could be triggered to schedule more packet sends, leading to a crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000102 IP: [ffffffff810a65f2] __queue_work+0x32/0x3c0 PGD 0 Oops: 0000 1 SMP Modules linked in: nvmet_rdma(OE) nvmet(OE) nvme(OE) dm_round_robin nvme_rdma(OE) nvme_fabrics(OE) nvme_core(OE) pal_raw(POE) pal_pmt(POE) pal_cache(POE) pal_pile(POE) pal(POE) pal_compatible(OE) rpcrdma sunrpc ib_isert iscsi_target_mod target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support mxm_wmi ipmi_ssif pcspkr ses enclosure joydev scsi_transport_sas i2c_i801 sg mei_me lpc_ich mei ioatdma shpchp ipmi_si ipmi_devintf ipmi_msghandler wmi acpi_power_meter acpi_pad dm_multipath hangcheck_timer ip_tables ext4 mbcache jbd2 mlx4_en sd_mod crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm mlx4_core crct10dif_pclmul crct10dif_common hfi1(OE) igb crc32c_intel rdmavt(OE) ahci ib_core libahci libata ptp megaraid_sas pps_core dca i2c_algo_bit i2c_core devlink dm_mirror dm_region_hash dm_log dm_mod CPU: 23 PID: 0 Comm: swapper/23 Tainted: P OE ------------ 3.10.0-693.el7.x86_64 #1 Hardware name: Intel Corporation S2600CWR/S2600CWR, BIOS SE5C610.86B.01.01.0028.121720182203 12/17/2018 task: ffff8808f4ec4f10 ti: ffff8808f4ed8000 task.ti: ffff8808f4ed8000 RIP: 0010:[ffffffff810a65f2] [ffffffff810a65f2] __queue_work+0x32/0x3c0 RSP: 0018:ffff88105df43d48 EFLAGS: 00010046 RAX: 0000000000000086 RBX: 0000000000000086 RCX: 0000000000000000 RDX: ffff880f74e758b0 RSI: 0000000000000000 RDI: 000000000000001f RBP: ffff88105df43d80 R08: ffff8808f3c583c8 R09: ffff8808f3c58000 R10: 0000000000000002 R11: ffff88105df43da8 R12: ffff880f74e758b0 R13: 000000000000001f R14: 0000000000000000 R15: ffff88105a300000 FS: 0000000000000000(0000) GS:ffff88105df40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000102 CR3: 00000000019f2000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff88105b6dd708 0000001f00000286 0000000000000086 ffff88105a300000 ffff880f74e75800 0000000000000000 ffff88105a300000 ffff88105df43d98 ffffffff810a6b85 ffff88105a301e80 ffff88105df43dc8 ffffffffc0224cde Call Trace: IRQ [ffffffff810a6b85] queue_work_on+0x45/0x50 [ffffffffc0224cde] _hfi1_schedule_send+0x6e/0xc0 [hfi1] [ffffffffc0170570] ? get_map_page+0x60/0x60 [rdmavt] [ffffffffc0224d62] hfi1_schedule_send+0x32/0x70 [hfi1] [ffffffffc0170644] rvt_rc_timeout+0xd4/0x120 [rdmavt] [ffffffffc0170570] ? get_map_page+0x60/0x60 [rdmavt] [ffffffff81097316] call_timer_fn+0x36/0x110 [ffffffffc0170570] ? get_map_page+0x60/0x60 [rdmavt] [ffffffff8109982d] run_timer_softirq+0x22d/0x310 [ffffffff81090b3f] __do_softirq+0xef/0x280 [ffffffff816b6a5c] call_softirq+0x1c/0x30 [ffffffff8102d3c5] do_softirq+0x65/0xa0 [ffffffff81090ec5] irq_exit+0x105/0x110 [ffffffff816b76c2] smp_apic_timer_interrupt+0x42/0x50 [ffffffff816b5c1d] apic_timer_interrupt+0x6d/0x80 EOI [ffffffff81527a02] ? cpuidle_enter_state+0x52/0xc0 [ffffffff81527b48] cpuidle_idle_call+0xd8/0x210 [ffffffff81034fee] arch_cpu_idle+0xe/0x30 [ffffffff810e7bca] cpu_startup_entry+0x14a/0x1c0 [ffffffff81051af6] start_secondary+0x1b6/0x230 Code: 89 e5 41 57 41 56 49 89 f6 41 55 41 89 fd 41 54 49 89 d4 53 48 83 ec 10 89 7d d4 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 02 00 00 41 f6 86 02 01 00 00 01 0f 85 58 02 00 00 49 c7 c7 28 19 01 00 RIP [ffffffff810a65f2] __queue_work+0x32/0x3c0 RSP ffff88105df43d48 CR2: 0000000000000102 The solution is to reset the QPs before the device resources are freed. This reset will change the QP state to prevent post sends and delete timers to prevent callbacks. Fixes: 0acb0cc7ecc1 ("IB/rdmavt: Initialize and teardown of qpn table") Link: https://lore.kernel.org/r/20200210131040.87408.38161.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn Signed-off-by: Kaike Wan Signed-off-by: Dennis Dalessandro Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 63e58567e644dad9d8ccdf5646dc3972729a6104 Author: Mike Marciniszyn Date: Mon Feb 10 08:10:33 2020 -0500 IB/hfi1: Close window for pq and request coliding commit be8638344c70bf492963ace206a9896606b6922d upstream. Cleaning up a pq can result in the following warning and panic: WARNING: CPU: 52 PID: 77418 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0 list_del corruption, ffff88cb2c6ac068->next is LIST_POISON1 (dead000000000100) Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit] CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G OE ------------ 3.10.0-957.38.3.el7.x86_64 #1 Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019 Call Trace: [] dump_stack+0x19/0x1b [] __warn+0xd8/0x100 [] warn_slowpath_fmt+0x5f/0x80 [] __list_del_entry+0x63/0xd0 [] list_del+0xd/0x30 [] kmem_cache_destroy+0x50/0x110 [] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1] [] hfi1_file_close+0x70/0x1e0 [hfi1] [] __fput+0xec/0x260 [] ____fput+0xe/0x10 [] task_work_run+0xbb/0xe0 [] do_notify_resume+0xa5/0xc0 [] int_signal+0x12/0x17 BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: [] kmem_cache_close+0x7e/0x300 PGD 2cdab19067 PUD 2f7bfdb067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit] CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.38.3.el7.x86_64 #1 Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019 task: ffff88cc26db9040 ti: ffff88b5393a8000 task.ti: ffff88b5393a8000 RIP: 0010:[] [] kmem_cache_close+0x7e/0x300 RSP: 0018:ffff88b5393abd60 EFLAGS: 00010287 RAX: 0000000000000000 RBX: ffff88cb2c6ac000 RCX: 0000000000000003 RDX: 0000000000000400 RSI: 0000000000000400 RDI: ffffffff9095b800 RBP: ffff88b5393abdb0 R08: ffffffff9095b808 R09: ffffffff8ff77c19 R10: ffff88b73ce1f160 R11: ffffddecddde9800 R12: ffff88cb2c6ac000 R13: 000000000000000c R14: ffff88cf3fdca780 R15: 0000000000000000 FS: 00002aaaaab52500(0000) GS:ffff88b73ce00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 0000002d27664000 CR4: 00000000007607e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: [] __kmem_cache_shutdown+0x14/0x80 [] kmem_cache_destroy+0x58/0x110 [] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1] [] hfi1_file_close+0x70/0x1e0 [hfi1] [] __fput+0xec/0x260 [] ____fput+0xe/0x10 [] task_work_run+0xbb/0xe0 [] do_notify_resume+0xa5/0xc0 [] int_signal+0x12/0x17 Code: 00 00 ba 00 04 00 00 0f 4f c2 3d 00 04 00 00 89 45 bc 0f 84 e7 01 00 00 48 63 45 bc 49 8d 04 c4 48 89 45 b0 48 8b 80 c8 00 00 00 <48> 8b 78 10 48 89 45 c0 48 83 c0 10 48 89 45 d0 48 8b 17 48 39 RIP [] kmem_cache_close+0x7e/0x300 RSP CR2: 0000000000000010 The panic is the result of slab entries being freed during the destruction of the pq slab. The code attempts to quiesce the pq, but looking for n_req == 0 doesn't account for new requests. Fix the issue by using SRCU to get a pq pointer and adjust the pq free logic to NULL the fd pq pointer prior to the quiesce. Fixes: e87473bc1b6c ("IB/hfi1: Only set fd pointer when base context is completely initialized") Link: https://lore.kernel.org/r/20200210131033.87408.81174.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan Signed-off-by: Mike Marciniszyn Signed-off-by: Dennis Dalessandro Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 910b13999566a5e7fb9724756ce3d03706f32400 Author: Kaike Wan Date: Mon Feb 10 08:10:26 2020 -0500 IB/hfi1: Acquire lock to release TID entries when user file is closed commit a70ed0f2e6262e723ae8d70accb984ba309eacc2 upstream. Each user context is allocated a certain number of RcvArray (TID) entries and these entries are managed through TID groups. These groups are put into one of three lists in each user context: tid_group_list, tid_used_list, and tid_full_list, depending on the number of used TID entries within each group. When TID packets are expected, one or more TID groups will be allocated. After the packets are received, the TID groups will be freed. Since multiple user threads may access the TID groups simultaneously, a mutex exp_mutex is used to synchronize the access. However, when the user file is closed, it tries to release all TID groups without acquiring the mutex first, which risks a race condition with another thread that may be releasing its TID groups, leading to data corruption. This patch addresses the issue by acquiring the mutex first before releasing the TID groups when the file is closed. Fixes: 3abb33ac6521 ("staging/hfi1: Add TID cache receive init and free funcs") Link: https://lore.kernel.org/r/20200210131026.87408.86853.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn Signed-off-by: Kaike Wan Signed-off-by: Dennis Dalessandro Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit e517ef194900d47f552d5ea1cb5d43b06642b47a Author: Yi Zhang Date: Fri Feb 14 18:48:02 2020 +0800 nvme: fix the parameter order for nvme_get_log in nvme_get_fw_slot_info commit f25372ffc3f6c2684b57fb718219137e6ee2b64c upstream. nvme fw-activate operation will get bellow warning log, fix it by update the parameter order [ 113.231513] nvme nvme0: Get FW SLOT INFO log error Fixes: 0e98719b0e4b ("nvme: simplify the API for getting log pages") Reported-by: Sujith Pandel Reviewed-by: David Milburn Signed-off-by: Yi Zhang Signed-off-by: Keith Busch Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit a4fc3b99c1940fc591a635ff09e8b58472207a70 Author: Kim Phillips Date: Tue Jan 21 11:12:31 2020 -0600 perf/x86/amd: Add missing L2 misses event spec to AMD Family 17h's event map commit 25d387287cf0330abf2aad761ce6eee67326a355 upstream. Commit 3fe3331bb285 ("perf/x86/amd: Add event map for AMD Family 17h"), claimed L2 misses were unsupported, due to them not being found in its referenced documentation, whose link has now moved [1]. That old documentation listed PMCx064 unit mask bit 3 as: "LsRdBlkC: LS Read Block C S L X Change to X Miss." and bit 0 as: "IcFillMiss: IC Fill Miss" We now have new public documentation [2] with improved descriptions, that clearly indicate what events those unit mask bits represent: Bit 3 now clearly states: "LsRdBlkC: Data Cache Req Miss in L2 (all types)" and bit 0 is: "IcFillMiss: Instruction Cache Req Miss in L2." So we can now add support for L2 misses in perf's genericised events as PMCx064 with both the above unit masks. [1] The commit's original documentation reference, "Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors", originally available here: https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf is now available here: https://developer.amd.com/wordpress/media/2017/11/54945_PPR_Family_17h_Models_00h-0Fh.pdf [2] "Processor Programming Reference (PPR) for Family 17h Model 31h, Revision B0 Processors", available here: https://developer.amd.com/wp-content/resources/55803_0.54-PUB.pdf Fixes: 3fe3331bb285 ("perf/x86/amd: Add event map for AMD Family 17h") Reported-by: Babu Moger Signed-off-by: Kim Phillips Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Tested-by: Babu Moger Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200121171232.28839-1-kim.phillips@amd.com Signed-off-by: Greg Kroah-Hartman commit 740d876bd9565857a695ce7c05efda4eba5bc585 Author: Sean Christopherson Date: Fri Feb 7 09:37:41 2020 -0800 KVM: nVMX: Use correct root level for nested EPT shadow page tables commit 148d735eb55d32848c3379e460ce365f2c1cbe4b upstream. Hardcode the EPT page-walk level for L2 to be 4 levels, as KVM's MMU currently also hardcodes the page walk level for nested EPT to be 4 levels. The L2 guest is all but guaranteed to soft hang on its first instruction when L1 is using EPT, as KVM will construct 4-level page tables and then tell hardware to use 5-level page tables. Fixes: 855feb673640 ("KVM: MMU: Add 5 level EPT & Shadow page table support.") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 7a89674c2e8758dcc809d680719055e4f5662b92 Author: Will Deacon Date: Thu Feb 6 10:42:58 2020 +0000 arm64: ssbs: Fix context-switch when SSBS is present on all CPUs commit fca3d33d8ad61eb53eca3ee4cac476d1e31b9008 upstream. When all CPUs in the system implement the SSBS extension, the SSBS field in PSTATE is the definitive indication of the mitigation state. Further, when the CPUs implement the SSBS manipulation instructions (advertised to userspace via an HWCAP), EL0 can toggle the SSBS field directly and so we cannot rely on any shadow state such as TIF_SSBD at all. Avoid forcing the SSBS field in context-switch on such a system, and simply rely on the PSTATE register instead. Cc: Cc: Catalin Marinas Cc: Srinivas Ramana Fixes: cbdf8a189a66 ("arm64: Force SSBS on context switch") Reviewed-by: Marc Zyngier Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit 04b2cbc1a91d8ef85ac0b280a7fc3e30c505afd6 Author: Krzysztof Kozlowski Date: Thu Jan 30 20:55:24 2020 +0100 ARM: npcm: Bring back GPIOLIB support commit e383e871ab54f073c2a798a9e0bde7f1d0528de8 upstream. The CONFIG_ARCH_REQUIRE_GPIOLIB is gone since commit 65053e1a7743 ("gpio: delete ARCH_[WANTS_OPTIONAL|REQUIRE]_GPIOLIB") and all platforms should explicitly select GPIOLIB to have it. Link: https://lore.kernel.org/r/20200130195525.4525-1-krzk@kernel.org Cc: Fixes: 65053e1a7743 ("gpio: delete ARCH_[WANTS_OPTIONAL|REQUIRE]_GPIOLIB") Signed-off-by: Krzysztof Kozlowski Signed-off-by: Olof Johansson Signed-off-by: Greg Kroah-Hartman commit a3eccdff2ce26a09a2ee4bb21ea453c5e4af303b Author: David Sterba Date: Wed Feb 5 17:12:28 2020 +0100 btrfs: log message when rw remount is attempted with unclean tree-log commit 10a3a3edc5b89a8cd095bc63495fb1e0f42047d9 upstream. A remount to a read-write filesystem is not safe when there's tree-log to be replayed. Files that could be opened until now might be affected by the changes in the tree-log. A regular mount is needed to replay the log so the filesystem presents the consistent view with the pending changes included. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Anand Jain Reviewed-by: Johannes Thumshirn Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 2a902b48a096efc20c1c6b9a0eb5ad48dd061297 Author: David Sterba Date: Wed Feb 5 17:12:16 2020 +0100 btrfs: print message when tree-log replay starts commit e8294f2f6aa6208ed0923aa6d70cea3be178309a upstream. There's no logged information about tree-log replay although this is something that points to previous unclean unmount. Other filesystems report that as well. Suggested-by: Chris Murphy CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Anand Jain Reviewed-by: Johannes Thumshirn Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 67d9c9e4201c4e1e38612abf2f50ad2a75f50f60 Author: Wenwen Wang Date: Sat Feb 1 20:38:38 2020 +0000 btrfs: ref-verify: fix memory leaks commit f311ade3a7adf31658ed882aaab9f9879fdccef7 upstream. In btrfs_ref_tree_mod(), 'ref' and 'ra' are allocated through kzalloc() and kmalloc(), respectively. In the following code, if an error occurs, the execution will be redirected to 'out' or 'out_unlock' and the function will be exited. However, on some of the paths, 'ref' and 'ra' are not deallocated, leading to memory leaks. For example, if 'action' is BTRFS_ADD_DELAYED_EXTENT, add_block_entry() will be invoked. If the return value indicates an error, the execution will be redirected to 'out'. But, 'ref' is not deallocated on this path, causing a memory leak. To fix the above issues, deallocate both 'ref' and 'ra' before exiting from the function when an error is encountered. CC: stable@vger.kernel.org # 4.15+ Signed-off-by: Wenwen Wang Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 4a4257c75cfa586819f54c9b3870cae5359e108f Author: Filipe Manana Date: Fri Jan 31 14:06:07 2020 +0000 Btrfs: fix race between using extent maps and merging them commit ac05ca913e9f3871126d61da275bfe8516ff01ca upstream. We have a few cases where we allow an extent map that is in an extent map tree to be merged with other extents in the tree. Such cases include the unpinning of an extent after the respective ordered extent completed or after logging an extent during a fast fsync. This can lead to subtle and dangerous problems because when doing the merge some other task might be using the same extent map and as consequence see an inconsistent state of the extent map - for example sees the new length but has seen the old start offset. With luck this triggers a BUG_ON(), and not some silent bug, such as the following one in __do_readpage(): $ cat -n fs/btrfs/extent_io.c 3061 static int __do_readpage(struct extent_io_tree *tree, 3062 struct page *page, (...) 3127 em = __get_extent_map(inode, page, pg_offset, cur, 3128 end - cur + 1, get_extent, em_cached); 3129 if (IS_ERR_OR_NULL(em)) { 3130 SetPageError(page); 3131 unlock_extent(tree, cur, end); 3132 break; 3133 } 3134 extent_offset = cur - em->start; 3135 BUG_ON(extent_map_end(em) <= cur); (...) Consider the following example scenario, where we end up hitting the BUG_ON() in __do_readpage(). We have an inode with a size of 8KiB and 2 extent maps: extent A: file offset 0, length 4KiB, disk_bytenr = X, persisted on disk by a previous transaction extent B: file offset 4KiB, length 4KiB, disk_bytenr = X + 4KiB, not yet persisted but writeback started for it already. The extent map is pinned since there's writeback and an ordered extent in progress, so it can not be merged with extent map A yet The following sequence of steps leads to the BUG_ON(): 1) The ordered extent for extent B completes, the respective page gets its writeback bit cleared and the extent map is unpinned, at that point it is not yet merged with extent map A because it's in the list of modified extents; 2) Due to memory pressure, or some other reason, the MM subsystem releases the page corresponding to extent B - btrfs_releasepage() is called and returns 1, meaning the page can be released as it's not dirty, not under writeback anymore and the extent range is not locked in the inode's iotree. However the extent map is not released, either because we are not in a context that allows memory allocations to block or because the inode's size is smaller than 16MiB - in this case our inode has a size of 8KiB; 3) Task B needs to read extent B and ends up __do_readpage() through the btrfs_readpage() callback. At __do_readpage() it gets a reference to extent map B; 4) Task A, doing a fast fsync, calls clear_em_loggin() against extent map B while holding the write lock on the inode's extent map tree - this results in try_merge_map() being called and since it's possible to merge extent map B with extent map A now (the extent map B was removed from the list of modified extents), the merging begins - it sets extent map B's start offset to 0 (was 4KiB), but before it increments the map's length to 8KiB (4kb + 4KiB), task A is at: BUG_ON(extent_map_end(em) <= cur); The call to extent_map_end() sees the extent map has a start of 0 and a length still at 4KiB, so it returns 4KiB and 'cur' is 4KiB, so the BUG_ON() is triggered. So it's dangerous to modify an extent map that is in the tree, because some other task might have got a reference to it before and still using it, and needs to see a consistent map while using it. Generally this is very rare since most paths that lookup and use extent maps also have the file range locked in the inode's iotree. The fsync path is pretty much the only exception where we don't do it to avoid serialization with concurrent reads. Fix this by not allowing an extent map do be merged if if it's being used by tasks other then the one attempting to merge the extent map (when the reference count of the extent map is greater than 2). Reported-by: ryusuke1925 Reported-by: Koki Mitani Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206211 CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit c48bf2fcad547900c34845d517d64ee7cdc77b2c Author: Theodore Ts'o Date: Fri Feb 14 18:11:19 2020 -0500 ext4: improve explanation of a mount failure caused by a misconfigured kernel commit d65d87a07476aa17df2dcb3ad18c22c154315bec upstream. If CONFIG_QFMT_V2 is not enabled, but CONFIG_QUOTA is enabled, when a user tries to mount a file system with the quota or project quota enabled, the kernel will emit a very confusing messsage: EXT4-fs warning (device vdc): ext4_enable_quotas:5914: Failed to enable quota tracking (type=0, err=-3). Please run e2fsck to fix. EXT4-fs (vdc): mount failed We will now report an explanatory message indicating which kernel configuration options have to be enabled, to avoid customer/sysadmin confusion. Link: https://lore.kernel.org/r/20200215012738.565735-1-tytso@mit.edu Google-Bug-Id: 149093531 Fixes: 7c319d328505b778 ("ext4: make quota as first class supported feature") Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit a5c03b93e7b5f2080cc574ac65312f0433758158 Author: Shijie Luo Date: Mon Feb 10 20:17:52 2020 -0500 ext4: add cond_resched() to ext4_protect_reserved_inode commit af133ade9a40794a37104ecbcc2827c0ea373a3c upstream. When journal size is set too big by "mkfs.ext4 -J size=", or when we mount a crafted image to make journal inode->i_size too big, the loop, "while (i < num)", holds cpu too long. This could cause soft lockup. [ 529.357541] Call trace: [ 529.357551] dump_backtrace+0x0/0x198 [ 529.357555] show_stack+0x24/0x30 [ 529.357562] dump_stack+0xa4/0xcc [ 529.357568] watchdog_timer_fn+0x300/0x3e8 [ 529.357574] __hrtimer_run_queues+0x114/0x358 [ 529.357576] hrtimer_interrupt+0x104/0x2d8 [ 529.357580] arch_timer_handler_virt+0x38/0x58 [ 529.357584] handle_percpu_devid_irq+0x90/0x248 [ 529.357588] generic_handle_irq+0x34/0x50 [ 529.357590] __handle_domain_irq+0x68/0xc0 [ 529.357593] gic_handle_irq+0x6c/0x150 [ 529.357595] el1_irq+0xb8/0x140 [ 529.357599] __ll_sc_atomic_add_return_acquire+0x14/0x20 [ 529.357668] ext4_map_blocks+0x64/0x5c0 [ext4] [ 529.357693] ext4_setup_system_zone+0x330/0x458 [ext4] [ 529.357717] ext4_fill_super+0x2170/0x2ba8 [ext4] [ 529.357722] mount_bdev+0x1a8/0x1e8 [ 529.357746] ext4_mount+0x44/0x58 [ext4] [ 529.357748] mount_fs+0x50/0x170 [ 529.357752] vfs_kern_mount.part.9+0x54/0x188 [ 529.357755] do_mount+0x5ac/0xd78 [ 529.357758] ksys_mount+0x9c/0x118 [ 529.357760] __arm64_sys_mount+0x28/0x38 [ 529.357764] el0_svc_common+0x78/0x130 [ 529.357766] el0_svc_handler+0x38/0x78 [ 529.357769] el0_svc+0x8/0xc [ 541.356516] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [mount:18674] Link: https://lore.kernel.org/r/20200211011752.29242-1-luoshijie1@huawei.com Reviewed-by: Jan Kara Signed-off-by: Shijie Luo Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit bda71c14e115dbdff20136930ac289fed9ef3767 Author: Jan Kara Date: Mon Feb 10 15:43:16 2020 +0100 ext4: fix checksum errors with indexed dirs commit 48a34311953d921235f4d7bbd2111690d2e469cf upstream. DIR_INDEX has been introduced as a compat ext4 feature. That means that even kernels / tools that don't understand the feature may modify the filesystem. This works because for kernels not understanding indexed dir format, internal htree nodes appear just as empty directory entries. Index dir aware kernels then check the htree structure is still consistent before using the data. This all worked reasonably well until metadata checksums were introduced. The problem is that these effectively made DIR_INDEX only ro-compatible because internal htree nodes store checksums in a different place than normal directory blocks. Thus any modification ignorant to DIR_INDEX (or just clearing EXT4_INDEX_FL from the inode) will effectively cause checksum mismatch and trigger kernel errors. So we have to be more careful when dealing with indexed directories on filesystems with checksumming enabled. 1) We just disallow loading any directory inodes with EXT4_INDEX_FL when DIR_INDEX is not enabled. This is harsh but it should be very rare (it means someone disabled DIR_INDEX on existing filesystem and didn't run e2fsck), e2fsck can fix the problem, and we don't want to answer the difficult question: "Should we rather corrupt the directory more or should we ignore that DIR_INDEX feature is not set?" 2) When we find out htree structure is corrupted (but the filesystem and the directory should in support htrees), we continue just ignoring htree information for reading but we refuse to add new entries to the directory to avoid corrupting it more. Link: https://lore.kernel.org/r/20200210144316.22081-1-jack@suse.cz Fixes: dbe89444042a ("ext4: Calculate and verify checksums for htree nodes") Reviewed-by: Andreas Dilger Signed-off-by: Jan Kara Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 5ad597ec14679f8c616f4cfc097568d11b44016f Author: Theodore Ts'o Date: Thu Feb 6 17:35:01 2020 -0500 ext4: fix support for inode sizes > 1024 bytes commit 4f97a68192bd33b9963b400759cef0ca5963af00 upstream. A recent commit, 9803387c55f7 ("ext4: validate the debug_want_extra_isize mount option at parse time"), moved mount-time checks around. One of those changes moved the inode size check before the blocksize variable was set to the blocksize of the file system. After 9803387c55f7 was set to the minimum allowable blocksize, which in practice on most systems would be 1024 bytes. This cuased file systems with inode sizes larger than 1024 bytes to be rejected with a message: EXT4-fs (sdXX): unsupported inode size: 4096 Fixes: 9803387c55f7 ("ext4: validate the debug_want_extra_isize mount option at parse time") Link: https://lore.kernel.org/r/20200206225252.GA3673@mit.edu Reported-by: Herbert Poetzl Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit ca0d17078b15108f65a09b8e54cb1a5d42a60f72 Author: Andreas Dilger Date: Sun Jan 26 15:03:34 2020 -0700 ext4: don't assume that mmp_nodename/bdevname have NUL commit 14c9ca0583eee8df285d68a0e6ec71053efd2228 upstream. Don't assume that the mmp_nodename and mmp_bdevname strings are NUL terminated, since they are filled in by snprintf(), which is not guaranteed to do so. Link: https://lore.kernel.org/r/1580076215-1048-1-git-send-email-adilger@dilger.ca Signed-off-by: Andreas Dilger Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 8e57f6a66102c186c01b96feb2c93543bdb60766 Author: Alexander Tsoy Date: Thu Feb 13 02:54:50 2020 +0300 ALSA: usb-audio: Add clock validity quirk for Denon MC7000/MCX8000 commit 9f35a31283775e6f6af73fb2c95c686a4c0acac7 upstream. It should be safe to ignore clock validity check result if the following conditions are met: - only one single sample rate is supported; - the terminal is directly connected to the clock source; - the clock type is internal. This is to deal with some Denon DJ controllers that always reports that clock is invalid. Tested-by: Tobias Oszlanyi Signed-off-by: Alexander Tsoy Cc: Link: https://lore.kernel.org/r/20200212235450.697348-1-alexander@tsoy.me Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 59ed2b7a186a52eddf01f3f4e137d5469c803c1a Author: Saurav Girepunje Date: Tue Oct 29 23:22:00 2019 +0530 ALSA: usb-audio: sound: usb: usb true/false for bool return type commit 1d4961d9eb1aaa498dfb44779b7e4b95d79112d0 upstream. Use true/false for bool type return in uac_clock_source_is_valid(). Signed-off-by: Saurav Girepunje Link: https://lore.kernel.org/r/20191029175200.GA7320@saurav Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit e074c64a27b52b5f97460816c622e6fc74656b52 Author: Suzuki K Poulose Date: Fri Feb 14 16:57:34 2020 +0000 arm64: nofpsmid: Handle TIF_FOREIGN_FPSTATE flag cleanly commit 52f73c383b2418f2d31b798e765ae7d596c35021 upstream We detect the absence of FP/SIMD after an incapable CPU is brought up, and by then we have kernel threads running already with TIF_FOREIGN_FPSTATE set which could be set for early userspace applications (e.g, modprobe triggered from initramfs) and init. This could cause the applications to loop forever in do_nofity_resume() as we never clear the TIF flag, once we now know that we don't support FP. Fix this by making sure that we clear the TIF_FOREIGN_FPSTATE flag for tasks which may have them set, as we would have done in the normal case, but avoiding touching the hardware state (since we don't support any). Also to make sure we handle the cases seemlessly we categorise the helper functions to two : 1) Helpers for common core code, which calls into take appropriate actions without knowing the current FPSIMD state of the CPU/task. e.g fpsimd_restore_current_state(), fpsimd_flush_task_state(), fpsimd_save_and_flush_cpu_state(). We bail out early for these functions, taking any appropriate actions (e.g, clearing the TIF flag) where necessary to hide the handling from core code. 2) Helpers used when the presence of FP/SIMD is apparent. i.e, save/restore the FP/SIMD register state, modify the CPU/task FP/SIMD state. e.g, fpsimd_save(), task_fpsimd_load() - save/restore task FP/SIMD registers fpsimd_bind_task_to_cpu() \ - Update the "state" metadata for CPU/task. fpsimd_bind_state_to_cpu() / fpsimd_update_current_state() - Update the fp/simd state for the current task from memory. These must not be called in the absence of FP/SIMD. Put in a WARNING to make sure they are not invoked in the absence of FP/SIMD. KVM also uses the TIF_FOREIGN_FPSTATE flag to manage the FP/SIMD state on the CPU. However, without FP/SIMD support we trap all accesses and inject undefined instruction. Thus we should never "load" guest state. Add a sanity check to make sure this is valid. Cc: stable@vger.kernel.org # v4.19 Cc: Will Deacon Cc: Mark Rutland Reviewed-by: Ard Biesheuvel Reviewed-by: Catalin Marinas Acked-by: Marc Zyngier Signed-off-by: Suzuki K Poulose Signed-off-by: Will Deacon Signed-off-by: Sasha Levin commit b7230b62fc07902de0108d763b325f29eae3ead4 Author: Suzuki K Poulose Date: Fri Feb 14 16:57:33 2020 +0000 arm64: cpufeature: Set the FP/SIMD compat HWCAP bits properly commit 7559950aef1ab8792c50797c6c5c7c5150a02460 upstream We set the compat_elf_hwcap bits unconditionally on arm64 to include the VFP and NEON support. However, the FP/SIMD unit is optional on Arm v8 and thus could be missing. We already handle this properly in the kernel, but still advertise to the COMPAT applications that the VFP is available. Fix this to make sure we only advertise when we really have them. Cc: stable@vger.kernel.org # v4.19 Cc: Will Deacon Cc: Mark Rutland Reviewed-by: Ard Biesheuvel Reviewed-by: Catalin Marinas Signed-off-by: Suzuki K Poulose Signed-off-by: Sasha Levin commit 9c8cd851a5e7b87d169bc6aac0cb1a3e7c78358a Author: Arvind Sankar Date: Tue Feb 11 11:22:35 2020 -0500 ALSA: usb-audio: Apply sample rate quirk for Audioengine D1 commit 93f9d1a4ac5930654c17412e3911b46ece73755a upstream. The Audioengine D1 (0x2912:0x30c8) does support reading the sample rate, but it returns the rate in byte-reversed order. When setting sampling rate, the driver produces these warning messages: [168840.944226] usb 3-2.2: current rate 4500480 is different from the runtime rate 44100 [168854.930414] usb 3-2.2: current rate 8436480 is different from the runtime rate 48000 [168905.185825] usb 3-2.1.2: current rate 30465 is different from the runtime rate 96000 As can be seen from the hexadecimal conversion, the current rate read back is byte-reversed from the rate that was set. 44100 == 0x00ac44, 4500480 == 0x44ac00 48000 == 0x00bb80, 8436480 == 0x80bb00 96000 == 0x017700, 30465 == 0x007701 Rather than implementing a new quirk to reverse the order, just skip checking the rate to avoid spamming the log. Signed-off-by: Arvind Sankar Cc: Link: https://lore.kernel.org/r/20200211162235.1639889-1-nivedita@alum.mit.edu Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 0bef6d5c9ec2526855ad7e88694813cfdaa5327d Author: Takashi Iwai Date: Wed Feb 12 09:10:47 2020 +0100 ALSA: hda/realtek - Fix silent output on MSI-GL73 commit 7dafba3762d6c0083ded00a48f8c1a158bc86717 upstream. MSI-GL73 laptop with ALC1220 codec requires a similar workaround for Clevo laptops to enforce the DAC/mixer connection path. Set up a quirk entry for that. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=204159 Cc: Link: https://lore.kernel.org/r/20200212081047.27727-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit f09e9a45d12fca8e5dde018784edf70ef1c7d2aa Author: Takashi Iwai Date: Tue Feb 11 17:05:21 2020 +0100 ALSA: usb-audio: Fix UAC2/3 effect unit parsing commit d75a170fd848f037a1e28893ad10be7a4c51f8a6 upstream. We've got a regression report about M-Audio Fast Track C400 device, and the git bisection resulted in the commit e0ccdef92653 ("ALSA: usb-audio: Clean up check_input_term()"). This commit was about the rewrite of the input terminal parser, and it's not too obvious from the change what really broke. The answer is: it's the interpretation of UAC2/3 effect units. In the original code, UAC2 effect unit is as if through UAC1 processing unit because both UAC1 PU and UAC2/3 EU share the same number (0x07). The old code went through a complex switch-case fallthrough, finally bailing out in the middle: if (protocol == UAC_VERSION_2 && hdr[2] == UAC2_EFFECT_UNIT) { /* UAC2/UAC1 unit IDs overlap here in an * uncompatible way. Ignore this unit for now. */ return 0; } ... and this special handling was missing in the new code; the new code treats UAC2/3 effect unit as if it were equivalent with the processing unit. Actually, the old code was too confusing. The effect unit has an incompatible unit description with the processing unit, so we shouldn't have dealt with EU in the same way. This patch addresses the regression by changing the effect unit handling to the own parser function. The own parser function makes the clear distinct with PU, so it improves the readability, too. The EU parser just sets the type and the id like the old kernels. Once when the proper effect unit support is added, we can revisit this parser function, but for now, let's keep this simple setup as is. Fixes: e0ccdef92653 ("ALSA: usb-audio: Clean up check_input_term()") Cc: BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206147 Link: https://lore.kernel.org/r/20200211160521.31990-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit a2827b00d72c28c93d6f9bcdd4d7c778fdee14ab Author: Benjamin Tissoires Date: Thu Feb 13 17:07:47 2020 -0800 Input: synaptics - remove the LEN0049 dmi id from topbuttonpad list commit 5179a9dfa9440c1781816e2c9a183d1d2512dc61 upstream. The Yoga 11e is using LEN0049, but it doesn't have a trackstick. Thus, there is no need to create a software top buttons row. However, it seems that the device works under SMBus, so keep it as part of the smbus_pnp_ids. Signed-off-by: Benjamin Tissoires Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200115013023.9710-1-benjamin.tissoires@redhat.com Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman commit be21aa415d3280e723391ff5c02e08e3ad4d99c2 Author: Gaurav Agrawal Date: Thu Feb 13 17:06:10 2020 -0800 Input: synaptics - enable SMBus on ThinkPad L470 commit b8a3d819f872e0a3a0a6db0dbbcd48071042fb98 upstream. Add touchpad LEN2044 to the list, as it is capable of working with psmouse.synaptics_intertouch=1 Signed-off-by: Gaurav Agrawal Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/CADdtggVzVJq5gGNmFhKSz2MBwjTpdN5YVOdr4D3Hkkv=KZRc9g@mail.gmail.com Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman commit b436680bab27c6aec3112141483548a36f2590a9 Author: Lyude Paul Date: Thu Feb 13 16:59:15 2020 -0800 Input: synaptics - switch T470s to RMI4 by default commit bf502391353b928e63096127e5fd8482080203f5 upstream. This supports RMI4 and everything seems to work, including the touchpad buttons. So, let's enable this by default. Signed-off-by: Lyude Paul Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200204194322.112638-1-lyude@redhat.com Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman